Tags: tools / gmail / imap

libgmail is a Python library for accessing Gmail over IMAP. It's something I have been using for many years now. I wrote it initially to learn IMAP, but I quickly found many uses for it. Here are some quick notes for anyone working with IMAP.

Python's imaplib is good. It abstracts many aspects of the IMAP protocol. However, to do certain things, you will need to write more code on top of it.

Basics

You connect to an IMAP server using the IMAP4 or IMAP4_SSL class. I use IMAP4_SSL to connect to Gmail's IMAP server.

self.conn = imaplib.IMAP4_SSL("imap.gmail.com", 993)

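It's as easy as that. Disconnecting is just as simple. Note that close() only closes the currently selected mailbox, while logout() ends the session:

self.conn.close()   # closes the selected mailbox (only valid after select())
self.conn.logout()  # ends the session and shuts the connection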
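
A session usually also needs a login step between connecting and disconnecting. Here is a minimal sketch of the whole lifecycle as a context manager. The name imap_session and its factory parameter are my own inventions for illustration (the factory exists only so the sketch can be exercised without a network), and Gmail may require an app password or OAuth2 rather than the plain account password.

```python
import imaplib
from contextlib import contextmanager


@contextmanager
def imap_session(user, password, host="imap.gmail.com", port=993,
                 factory=imaplib.IMAP4_SSL):
    """Connect, log in, hand the connection to the caller, then log out.

    `factory` is a hypothetical hook: any callable that accepts (host, port)
    and returns an object with login() and logout() methods.
    """
    conn = factory(host, port)
    try:
        conn.login(user, password)
        yield conn
    finally:
        # logout() ends the session even if login() or the caller's code failed
        conn.logout()
```

Used as `with imap_session(user, password) as conn: ...`, the connection is always logged out when the block exits.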

Listing Mailboxes

You can ask the IMAP server for a list of all available mailboxes:

mailboxes = self.conn.list()
# mailboxes is a tuple.
if mailboxes[0] != 'OK':
    self.logger.error('Could not get list of mailboxes')
    return []
# mailboxes[-1] is the list of the mailbox names

Each mailbox name in the list has a few fields. These fields show various properties of the mailbox and the mailbox name itself. You will need to use regular expressions to parse the mailbox list into a useful datatype - a dict() in libgmail's case. For more code, see the get_mailboxes() method in the Gmail class.
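
To give an idea of what that parsing might look like, here is a minimal sketch. It is not libgmail's get_mailboxes(); the function name parse_list_line and the dict keys are assumptions of mine, and it assumes each entry looks like `(flags) "delimiter" name`, which is the shape Gmail typically returns.

```python
import re

# (flags) "delimiter" name   e.g.  (\HasNoChildren) "/" "INBOX"
LIST_PATTERN = re.compile(
    r'\((?P<flags>[^)]*)\) (?P<delimiter>"[^"]*"|NIL) (?P<name>.*)'
)


def parse_list_line(line):
    """Parse one LIST response entry into a dict, or return None if it does not match."""
    if isinstance(line, bytes):
        # Non-ASCII mailbox names use modified UTF-7; they are left undecoded here
        line = line.decode("utf-8")
    match = LIST_PATTERN.match(line)
    if match is None:
        return None
    delimiter = match.group("delimiter")
    delimiter = None if delimiter == "NIL" else delimiter.strip('"')
    name = match.group("name").strip()
    if len(name) >= 2 and name.startswith('"') and name.endswith('"'):
        name = name[1:-1]  # drop the surrounding quotes
    return {
        "flags": match.group("flags").split(),
        "delimiter": delimiter,
        "name": name,
    }
```

Running it over every entry in mailboxes[-1] gives a list of dicts that is much easier to filter than the raw response.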

Selecting Mailboxes

Before you do any operation, you will first need to select a mailbox to work on. The following code shows you how to do this.

self.logger.debug('Selecting mailbox : %s'%(mailbox))
# select() returns a tuple: (status, [message count])
response, msg_count = self.conn.select(mailbox, readonly=True)
if response != 'OK':
    self.logger.error('Could not select mailbox > %s'%(mailbox))
    return None
self.logger.debug('Selection : %s, Number of Mails : %s'%(response, int(msg_count[0])))

You can choose to open the mailbox in read-only mode, as shown in the code. This is the safe choice: the server will not change any message state, so fetching an email does not mark it as read. Once you have successfully selected a mailbox, you can start to query it.
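
One thing that tends to trip people up is mailbox names containing spaces, such as "[Gmail]/All Mail": imaplib sends the name as given, so it usually has to be quoted by the caller. Below is a small sketch of a helper that does this and returns the message count. The names quote_mailbox and select_readonly are hypothetical, and conn is assumed to be an already logged-in imaplib connection.

```python
def quote_mailbox(name):
    """Wrap a mailbox name in double quotes, escaping backslashes and quotes."""
    escaped = name.replace("\\", "\\\\").replace('"', '\\"')
    return '"%s"' % escaped


def select_readonly(conn, mailbox):
    """Select `mailbox` read-only; return its message count, or None on failure."""
    response, data = conn.select(quote_mailbox(mailbox), readonly=True)
    if response != "OK":
        return None
    # data[0] holds the number of messages in the mailbox, e.g. b'42'
    return int(data[0])
```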

Searching Emails

Searching is probably the most distinguishing feature of IMAP. Building search queries, or 'search criteria', is straightforward once you get the format correct.

Here are a few examples:

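# All emails from Apr 1st up to (but not including) Apr 5, in 2014
(SINCE "01-Apr-2014" BEFORE "05-Apr-2014")

# All emails between Apr 1st and Apr 5, in 2014, with attachments (Gmail search syntax)
has:attachment after:2014/04/01 before:2014/04/05

Gmail-style queries like the second one are passed to search() through Gmail's X-GM-RAW extension:

# search() returns a tuple: (status, data)
# Note: with X-GM-RAW, search_criteria typically needs to be sent as one
# quoted string, e.g. '"has:attachment after:2014/04/01"'
result, data = self.conn.search(None, 'X-GM-RAW', search_criteria)
if result != 'OK' or not data or not data[0]:
    self.logger.debug('No data returned from search')
    return None

# data[0] is a space-separated string of the ids of the emails that matched your search
data = data[0].split()
self.logger.debug('Found %d mail(s)'%( len(data)))
# create a string with the ids, separated by comma
nums = b','.join(data).decode('ascii')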
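
The standard date-range criteria can also be built from date objects instead of being typed by hand. This is a minimal sketch; the function names imap_date and date_range_criteria are my own. It spells out the month names rather than using strftime('%b'), because that depends on the locale while IMAP expects English abbreviations.

```python
from datetime import date

MONTHS = ("Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec")


def imap_date(d):
    """Format a date the way IMAP SEARCH expects it, e.g. 01-Apr-2014."""
    return "%02d-%s-%04d" % (d.day, MONTHS[d.month - 1], d.year)


def date_range_criteria(since, before):
    """Criteria for emails dated on or after `since` and strictly before `before`."""
    return '(SINCE "%s" BEFORE "%s")' % (imap_date(since), imap_date(before))
```

For example, date_range_criteria(date(2014, 4, 1), date(2014, 4, 5)) produces the first criteria string shown above. A string like this goes to search() as a plain criterion, without the X-GM-RAW key.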

Fetching Emails

Once you have the list of email ids as a comma-separated string, you can then fetch the emails.

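self.logger.debug('Fetching messages now ..')
response, data = self.conn.fetch(nums, '(RFC822)')
self.logger.debug('Fetch response: %s'%(response))
for msg in data:
    # each email arrives as an (envelope, raw email) tuple; skip the ')' separators in between
    if isinstance(msg, tuple):
        email__ = msg[1]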
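
The raw bytes from fetch() still need to be parsed before you can read headers or attachments. Here is a small sketch that turns a fetch response into parsed messages using the standard email module. The helper name messages_from_fetch is hypothetical, and it assumes the response has the usual shape of (envelope, raw email) tuples interleaved with closing b')' items.

```python
import email


def messages_from_fetch(data):
    """Parse the (envelope, raw_bytes) tuples of a fetch() response into Message objects."""
    messages = []
    for item in data:
        # Non-tuple items such as b')' only close a response and carry no email
        if isinstance(item, tuple) and len(item) == 2:
            messages.append(email.message_from_bytes(item[1]))
    return messages
```

Each returned object supports header lookups such as msg['Subject'] as well as msg.walk().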

Attachments

Each email might have multiple attachments. To extract them, you will need to parse the raw email with the email module and then 'walk' its parts.

for part in msg.walk():
    # many parts have no Content-Disposition header, so default to ''
    disposition = part.get('Content-Disposition', '')
    if disposition.startswith('attach') or 'filename' in disposition:
        filename = part.get_filename()
        data = part.get_payload(decode=True)

You should add more code to check other Content-Disposition values and Content-Types, but that's the easiest way to extract attachments.
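
As one possible variation, the walk can be wrapped in a function that collects every attachment. This is only a sketch: extract_attachments is a name I made up, it relies on get_content_disposition() (available in Python 3.5 and later), and it treats any part that has a filename as an attachment, which may be too generous for inline images.

```python
def extract_attachments(msg):
    """Return a list of (filename, payload_bytes) for the attachments in a parsed email."""
    attachments = []
    for part in msg.walk():
        if part.is_multipart():
            continue  # containers hold other parts, not data
        disposition = part.get_content_disposition()  # 'attachment', 'inline' or None
        filename = part.get_filename()
        if disposition == "attachment" or filename:
            # decode=True undoes base64 / quoted-printable transfer encoding
            attachments.append((filename, part.get_payload(decode=True)))
    return attachments
```

The payloads come back as bytes, so they can be written to disk directly with a file opened in 'wb' mode.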
