How do I use poplib to download mails as email.Message instances (from the email module) in Python?
I am writing a program that analyzes all emails for specific information and stores parts of each message in a database. I can download the entire mail as text; however, walking through the text searching for attachments is difficult.
The idea is to parse the messages for information.
Use the FeedParser class in the email.feedparser module to construct an email.Message object from the messages read from the server with poplib.
Specifically:
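import poplib
import email.parser  # importing the submodule makes email.parser.FeedParser available
pop = poplib.POP3( "server..." )
# [establish connection, authenticate, ...]
raw = pop.retr( 1 )  # (response, list of lines as bytes, octet count)
pop.quit()
parser = email.parser.FeedParser()
for line in raw[1]:
    # retr() strips the line endings, so add one back before feeding each line
    parser.feed( str( line+b'\n', 'us-ascii' ) )
message = parser.close()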
This shorter version doesn't deal with character-set issues the way Adrien Plisson's answer does:
import poplib
import email
pop = poplib.POP3( "server..." )
# [establish connection, authenticate, ...]
raw = pop.retr( 1 )
pop.quit()
# In Python 3, retr() returns the lines as bytes, so join and parse them as bytes
message = email.message_from_bytes(b'\n'.join(raw[1]))
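Since the question is about finding attachments, here is a minimal sketch of what can be done once the lines are parsed, assuming Python 3.6+ and `policy.default` (which makes the parser return an `EmailMessage` with `iter_attachments()`). The helper name `list_attachments` and the sample message are made up for illustration; the sample is built locally so nothing here needs a POP3 server.

```python
from email import message_from_bytes, policy
from email.message import EmailMessage


def list_attachments(raw_lines):
    """Parse the line list returned by POP3.retr() and return (filename, size) pairs."""
    # retr() returns the lines without terminators, so join them back together.
    msg = message_from_bytes(b"\r\n".join(raw_lines), policy=policy.default)
    return [
        (part.get_filename(), len(part.get_payload(decode=True)))
        for part in msg.iter_attachments()
    ]


# Hypothetical sample message standing in for what pop.retr(1)[1] would contain.
sample = EmailMessage()
sample["Subject"] = "Report"
sample.set_content("See attachment.")
sample.add_attachment(
    b"a,b\n1,2\n",
    maintype="application",
    subtype="octet-stream",
    filename="data.csv",
)
raw_lines = sample.as_bytes().splitlines()

attachments = list_attachments(raw_lines)
print(attachments)
```

With a real connection, `raw_lines` would simply be `pop.retr(n)[1]`.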
Related
So I know how to look through the inbox (or any other folder) and find emails to reply to. However in my case, I have a .msg email file from which I extract the MessageID, and I'm looking to use win32com module to reply to that specific email.
Basically I'm looking for something like this:
from extract_msg import Message
msg = Message("message.msg")
outlook = win32com.client.Dispatch('outlook.application')
mail = outlook.CreateItem(0x0)
mail.To = "; ".join(to)
mail.Subject = subject
mail.Body = body
mail.InReplyTo = msg.messageId
I understand that something similar is doable using the smtplib module using:
message['In-Reply-To'] = msg.messageId
but I cannot get smtplib to work with Outlook, so I'm using win32com.
The PR_IN_REPLY_TO_ID property should be set to the PR_INTERNET_MESSAGE_ID property value. Make sure that such a value exists in Outlook messages. You can get the value in Outlook in the following way (the sample is in C#, but the Outlook object model is the same for all programming languages):
string PR_INTERNET_MESSAGE_ID = "http://schemas.microsoft.com/mapi/proptag/0x1035001F";
Microsoft.Office.Interop.Outlook.PropertyAccessor pal = mailItem.PropertyAccessor;
string Internet_Message_Id = pal.GetProperty(PR_INTERNET_MESSAGE_ID).ToString();
You need to set the PR_IN_REPLY_TO_ID MAPI property (DASL name "http://schemas.microsoft.com/mapi/proptag/0x1042001F") using MailItem.PropertyAccessor.SetProperty.
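In Python, the same call might look like the sketch below. It assumes `mail_item` is an Outlook MailItem obtained through win32com; the helper name `set_in_reply_to` is hypothetical, and only the DASL name is taken from the answer above. The Outlook-specific lines are left as comments because they need Windows with Outlook installed.

```python
# DASL name of the PR_IN_REPLY_TO_ID MAPI property, as given in the answer above.
PR_IN_REPLY_TO_ID = "http://schemas.microsoft.com/mapi/proptag/0x1042001F"


def set_in_reply_to(mail_item, message_id):
    """Set the In-Reply-To MAPI property on an Outlook MailItem-like object."""
    mail_item.PropertyAccessor.SetProperty(PR_IN_REPLY_TO_ID, message_id)


# Usage on a machine with Outlook (not executed here):
#   import win32com.client
#   from extract_msg import Message
#   msg = Message("message.msg")
#   outlook = win32com.client.Dispatch("outlook.application")
#   mail = outlook.CreateItem(0x0)
#   set_in_reply_to(mail, msg.messageId)
```

Keeping the property write in a small function makes it possible to exercise the logic with a stand-in object before trying it against a real Outlook profile.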
I am trying to write a Python script to send an email that uses HTML formatting and involves a lot of non-breaking spaces. However, when I run it, some of the &nbsp; strings are interrupted by spaces that occur every 171 characters, as can be seen in this example:
#!/usr/bin/env python
import smtplib
import socket
from email.mime.text import MIMEText
emails = ["my@email.com"]
sender = "test@{0}".format(socket.gethostname())
message = "<html><head></head><body>"
for i in range(20):
    message += "&nbsp;" * 50
    message += "<br/>"
message += "</body></html>"
message = MIMEText(message, "html")
message["Subject"] = "Test"
message["From"] = sender
message["To"] = ", ".join(emails)
mailer = smtplib.SMTP("localhost")
mailer.sendmail(sender, emails, message.as_string())
mailer.quit()
The example should produce a blank email that consists of only spaces, but it ends up looking something like this:
&nbsp ;
&nb sp;
& nbsp;
&nbs p;
&n bsp;
Edit: In case it is important, I am running Ubuntu 15.04 with Postfix for the smtp client, and using python2.6.
I can replicate this in a way but my line breaks come every 999 characters. RFC 821 says maximum length of a line is 1000 characters including the line break so that's probably why.
This post gives a different way to send an HTML email in Python, and I believe the MIME type "multipart/alternative" is the correct way.
Sending HTML email using Python
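If the 1000-character line limit is indeed the cause, one possible workaround is to let the email package use a content transfer encoding that wraps the body into short lines. The sketch below assumes Python 3 (the question uses Python 2.6, where behaviour may differ): passing an explicit "utf-8" charset makes MIMEText base64-encode the payload. The variable names are illustrative only.

```python
from email.mime.text import MIMEText

# Same shape as the message in the question: one very long line of &nbsp; entities.
html = "<html><head></head><body>"
for _ in range(20):
    html += "&nbsp;" * 50 + "<br/>"
html += "</body></html>"

# With an explicit utf-8 charset, MIMEText base64-encodes the body,
# and the encoded payload is split into short lines.
message = MIMEText(html, "html", "utf-8")

longest_line = max(len(line) for line in message.as_string().splitlines())
round_trip = message.get_payload(decode=True).decode("utf-8")

print(message["Content-Transfer-Encoding"], longest_line)
```

The receiving client decodes the payload back into the original HTML, so the entities are no longer exposed to line splitting in transit.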
I'm the developer of yagmail, a package that tries to make it easy to send emails.
You can use the following code:
import yagmail
yag = yagmail.SMTP('me@gmail.com', 'mypassword')
message = ""
for i in range(20):
    message += "&nbsp;" * 50
    message += "<br/>"
yag.send(contents = message)
Note that by default it will send an HTML message, and that it also automatically adds the alternative part for non-HTML browsers.
Also, note that omitting the subject will leave an empty subject, and without a to argument it will send the message to yourself.
Furthermore, note that if you set yagmail up correctly, you can just log in using yagmail.SMTP(), without having to keep the username and password in the script (while still being secure). Omitting the password will prompt for it with getpass.
Adding an attachment is as simple as pointing to a local file, e.g.:
yag.send(contents = [message, 'previously a lot of whitespace', '/local/path/file.zip'])
Awesome, isn't it? Thanks for allowing me to show a nice use case for yagmail :)
If you have any feature requests, issues or ideas, please let me know on GitHub.
I am trying to use flufl.bounce to scan emails downloaded with poplib and detect bounced e-mail addresses. So far, what I'm getting is a lot of empty sets. Here is some sample code:
import getpass, poplib, email
from flufl.bounce import scan_message
user = 'redacted@redacted.com'
mail = poplib.POP3_SSL('redacted.redacted.com', '995')
mail.user(user)
mail.pass_('redacted')
num_messages = len(mail.list()[1])
for i in range(num_messages):
    for msg in mail.retr(i+1)[1]:
        msg = email.message_from_string(msg)
        bounce = scan_message(msg)
        print bounce
mail.quit()
And print bounce is giving me an empty set:
set([])
There are various types of bounce messages in this mailbox, and I can even select one with mail.retr that I know is a bounce message, but when I feed it into scan_message, I still get an empty set back. What am I doing wrong? The flufl.bounce docs don't seem to be very helpful here.
OK, I figured it out. mail.retr(i+1)[1] is a list of the lines of the e-mail, so each msg in my loop was a single line rather than a whole message. Instead of iterating through mail.retr(i+1)[1], I had to join it together with \n before feeding it into email.message_from_string(), which gave me a proper message that scan_message could use. Here's the working code:
import getpass, poplib, email
from flufl.bounce import scan_message
user = 'redacted@redacted.com'
mail = poplib.POP3_SSL('mail.redacted.com', '995')
mail.user(user)
mail.pass_('redacted')
num_messages = len(mail.list()[1])
for i in range(num_messages):
    x = mail.retr(i+1)[1]
    msg = email.message_from_string("\n".join(x))
    bounce = scan_message(msg)
    print bounce
mail.quit()
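The code above is Python 2. Under Python 3, poplib returns the lines as bytes, so the same fix could be sketched as follows. The helper name `message_from_retr` and the fake response are assumptions made for illustration; the resulting message is what would be handed to scan_message.

```python
import email


def message_from_retr(response):
    """Build one Message from the (status, lines, octets) tuple returned by POP3.retr()."""
    _status, lines, _octets = response
    # Join all lines first; parsing each line on its own yields a separate
    # near-empty message per line instead of one complete message.
    return email.message_from_bytes(b"\n".join(lines))


# Hypothetical response standing in for mail.retr(i + 1).
fake_response = (b"+OK", [b"Subject: Delivery failure", b"", b"User unknown"], 42)
msg = message_from_retr(fake_response)
print(msg["Subject"])
```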
I am displaying new email with IMAP, and everything looks fine, except for one message subject shows as:
=?utf-8?Q?Subject?=
How can I fix it?
In MIME terminology, those encoded chunks are called encoded-words. You can decode them like this:
import email.header
text, encoding = email.header.decode_header('=?utf-8?Q?Subject?=')[0]
Check out the docs for email.header for more details.
This is a MIME encoded-word. You can parse it with email.header:
import email.header
def decode_mime_words(s):
    return u''.join(
        word.decode(encoding or 'utf8') if isinstance(word, bytes) else word
        for word, encoding in email.header.decode_header(s))
print(decode_mime_words(u'=?utf-8?Q?Subject=c3=a4?=X=?utf-8?Q?=c3=bc?='))
The text is encoded as a MIME encoded-word. This is a mechanism defined in RFC2047 for encoding headers that contain non-ASCII text such that the encoded output contains only ASCII characters.
In Python 3.3+, the parsing classes and functions in email.parser automatically decode "encoded words" in headers if their policy argument is set to policy.default
>>> import email
>>> from email import policy
>>> msg = email.message_from_file(open('message.txt'), policy=policy.default)
>>> msg['from']
'Pepé Le Pew <pepe@example.com>'
The parsing classes and functions are:
email.parser.BytesParser
email.parser.Parser
email.message_from_bytes
email.message_from_binary_file
email.message_from_string
email.message_from_file
Confusingly, up to at least Python 3.10, the default policy for these parsing functions is not policy.default, but policy.compat32, which does not decode "encoded words".
>>> msg = email.message_from_file(open('message.txt'))
>>> msg['from']
'=?utf-8?q?Pep=C3=A9?= Le Pew <pepe@example.com>'
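The difference between the two policies can be reproduced without a message.txt file. This is a small self-contained sketch, assuming Python 3.3+; the raw message string is made up to match the header shown above.

```python
import email
from email import policy

# Hypothetical raw message with an RFC 2047 encoded-word in the From header.
raw = "From: =?utf-8?q?Pep=C3=A9?= Le Pew <pepe@example.com>\nSubject: test\n\nbody\n"

# policy.default decodes encoded-words when the header is accessed.
decoded = str(email.message_from_string(raw, policy=policy.default)["from"])

# Without a policy argument the parser uses policy.compat32 and leaves the header raw.
undecoded = str(email.message_from_string(raw)["from"])

print(decoded)
print(undecoded)
```

For data fetched over IMAP, which arrives as bytes, `email.message_from_bytes` accepts the same `policy` argument.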
Try Imbox
imaplib is a very low-level library and returns results that are hard to work with.
Installation
pip install imbox
Usage
from imbox import Imbox
with Imbox('imap.gmail.com',
        username='username',
        password='password',
        ssl=True,
        ssl_context=None,
        starttls=False) as imbox:
    all_inbox_messages = imbox.messages()
    for uid, message in all_inbox_messages:
        message.subject
In Python 3, decoding this to an approximated string is as easy as:
from email.header import decode_header, make_header
decoded = str(make_header(decode_header("=?utf-8?Q?Subject?=")))
See the documentation of decode_header and make_header.
High level IMAP lib may be useful here: imap_tools
from imap_tools import MailBox, AND
# get list of email subjects from INBOX folder
with MailBox('imap.mail.com').login('test@mail.com', 'pwd', 'INBOX') as mailbox:
    subjects = [msg.subject for msg in mailbox.fetch()]
Parsed email message attributes
Query builder for searching emails
Actions with emails: copy, delete, flag, move, seen
Actions with folders: list, set, get, create, exists, rename, delete, status
No dependencies
I have a Python script that has to fetch unseen messages, process them, and mark them as seen (or read).
I do this after logging in:
typ, data = self.server.imap_server.search(None, '(UNSEEN)')
for num in data[0].split():
    print "Mensage " + str(num) + " mark"
    self.server.imap_server.store(num, '+FLAGS', '(SEEN)')
The first problem is that, the search returns ALL messages, and not only the UNSEEN.
The second problem is that messages are not marked as SEEN.
Can anybody give me a hand with this?
Thanks!
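import imaplib
obj = imaplib.IMAP4_SSL('imap.gmail.com', 993)
obj.login('user', 'password')
obj.select('Inbox')  # select the inbox
typ, data = obj.search(None, 'UnSeen')
# search() returns space-separated message numbers (bytes in Python 3);
# STORE takes a comma-separated message set
obj.store(data[0].decode().replace(' ', ','), '+FLAGS', r'\Seen')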
I think the flag names need to start with a backslash, eg: \SEEN
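Putting the two answers together, a per-message version for Python 3 might look like this sketch. It assumes `conn` is an imaplib connection that is already logged in and has a mailbox selected in read-write mode; the helper name `mark_unseen_as_seen` is hypothetical.

```python
def mark_unseen_as_seen(conn):
    """Flag every unseen message in the selected mailbox as \\Seen; return their numbers."""
    typ, data = conn.search(None, "UNSEEN")
    if typ != "OK":
        return []
    nums = data[0].split()  # message numbers as bytes, e.g. [b"1", b"4"]
    for num in nums:
        # System flags start with a backslash, hence "\\Seen" rather than "SEEN".
        conn.store(num, "+FLAGS", "\\Seen")
    return nums


# Usage with a real server (not executed here):
#   import imaplib
#   conn = imaplib.IMAP4_SSL("imap.gmail.com", 993)
#   conn.login("user", "password")
#   conn.select("INBOX")
#   mark_unseen_as_seen(conn)
```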
I am not very familiar with imaplib, but I have done this successfully with the imapclient module:
import imapclient, pyzmail, html2text
from backports import ssl
context = ssl.SSLContext(ssl.PROTOCOL_TLSv1_2)
iobj = imapclient.IMAPClient('outlook.office365.com', ssl=True, ssl_context=context)
iobj.login(uname, pwd)  # provide your username and password
iobj.select_folder('INBOX', readonly=False)  # Selecting Inbox; it must not be read-only if flags are changed later.
unread = iobj.search('UNSEEN')  # Selecting unread messages; you can add more search criteria here ('FROM', 'SINCE', etc.).
print('There are: ', len(unread), ' UNREAD emails')
for i in unread:
    mail = iobj.fetch(i, ['BODY[]'])  # Fetching the full raw message here.
    mcontent = pyzmail.PyzMessage.factory(mail[i][b'BODY[]'])  # Parsing the raw message into a PyzMessage.
    subject = mcontent.get_subject()  # You might not need this
    receiver_name, receiver_email = mcontent.get_address('from')
    mail_body = html2text.html2text(mcontent.html_part.get_payload().decode(mcontent.html_part.charset))  # The HTML part converted to plain text.
Let's say I want to go through the unread emails, reply to the sender and mark each email as read. I'd call the SMTP functions from here to compose and send a reply.
import smtplib
smtpobj = smtplib.SMTP('smtp.office365.com', 587)
smtpobj.starttls()
smtpobj.login(uname, pwd)  # Your username and password go here.
sub = 'Subject: ' + str(subject) + '\n\n'  # Subject of your reply
msg = "Thanks for your email! You're qualified for the next round"  # Some random reply :(
fullmsg = sub + msg
smtpobj.sendmail(uname, receiver_email, fullmsg)  # This sends the reply to the original sender.
iobj.set_flags(i, ['\\Seen', '\\Answered'])  # This marks the email as read and adds the answered flag
iobj.append('Sent Items', fullmsg)  # This puts a copy of your reply in your Sent Items.
iobj.logout()
smtpobj.quit()
I hope this helps