Goal: save the HTML of an email so that the saved .html file looks exactly as it does in the mailbox.
Explanation: I am using Python and IMAP to download email and get the HTML content with .get_content_payload(text/HTML), but when I save it and open it, it doesn't look like the mail content shown in the mailbox.
I tried writing an HTML file from mail_html, but it doesn't work well: the CSS is missing, so the whole HTML email looks terrible. I want it to download along with the other page assets. This is where I need your help.
import email
import imaplib

mail = imaplib.IMAP4_SSL(SERVER)
mail.login(EMAIL, PASSWORD)
mail.select(f'{TARGET}')
status, data = mail.search(None, 'ALL')
mail_ids = []
for block in data:
    mail_ids += block.split()
for mail_id in mail_ids:
    status, data = mail.fetch(mail_id, '(RFC822)')
    print("BODY ENDS")
    for response_part in data:
        if isinstance(response_part, tuple):
            message = email.message_from_bytes(response_part[1])
            mail_from = message['from']
            mail_subject = message['subject']
            # if it is multi-part message separate first
            if message.is_multipart():
                mail_content = ''
                mail_html = ' '
                for part in message.get_payload():
                    # Get all parts of the message
                    if part.get_content_type() == 'text/plain':
                        mail_content += part.get_payload()
                    # check the part (not the whole message) and accumulate with +=
                    if part.get_content_type() == "text/html":
                        mail_html += part.get_payload()
Try my lib: https://github.com/ikvk/imap_tools
from imap_tools import MailBox
# get list of email bodies from INBOX folder
with MailBox('imap.mail.com').login('test#mail.com', 'pwd', 'INBOX') as mailbox:
    bodies = [msg.html or msg.text for msg in mailbox.fetch()]
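If the message carries its images as related parts (referenced in the HTML as cid:...), a saved file can only show them once those references are rewritten. Below is a minimal sketch using only the standard library's email package. It assumes raw_bytes is the RFC822 payload returned by mail.fetch, and the function name html_with_inline_images is hypothetical. Assets that the HTML links by http(s) URL are left untouched.

```python
import base64
from email import message_from_bytes, policy


def html_with_inline_images(raw_bytes: bytes) -> str:
    """Return the HTML body with cid: image references replaced by data URIs."""
    msg = message_from_bytes(raw_bytes, policy=policy.default)

    # Pick the text/html body, even when it is nested in alternative/related parts.
    html_part = msg.get_body(preferencelist=("html",))
    if html_part is None:
        return ""
    html = html_part.get_content()

    # Embed every image part that has a Content-ID into the HTML itself.
    for part in msg.walk():
        cid = part.get("Content-ID")
        if cid is None or part.get_content_maintype() != "image":
            continue
        data = part.get_payload(decode=True)
        encoded = base64.b64encode(data).decode("ascii")
        uri = "data:{};base64,{}".format(part.get_content_type(), encoded)
        html = html.replace("cid:" + str(cid).strip().strip("<>"), uri)
    return html
```

The returned string can then be written to a .html file with open(..., "w", encoding="utf-8").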
Related
I'm attempting to read an email and its attachments in code. I can read the email using Python's extract_msg module, but not the attachment content. Printing the attachments variable shows a list of objects rather than the content. The code is below. Please share your thoughts on this.
import extract_msg
import glob
import re
f = glob.glob('Time Off -DAYS.msg')
for filename in f:
    msg = extract_msg.Message(filename)
    msg_from = msg.sender
    msg_date = msg.date
    msg_subj = msg.subject
    msg_message = msg.body
    attachments = msg.attachments
    msg_to = msg.to
    print("To:-",msg_to)
    print("From:-",msg_from)
    print ("Date:-",msg.date)
    print(attachments)
attachments data: Email has 2 attachments
[<extract_msg.attachment.Attachment object at 0x0000024444C9FEF0>, <extract_msg.attachment.Attachment object at 0x0000024444E39C50>]
Several libraries can be used to work with email data, for example smtplib, imaplib and pywin32.
I use pywin32 here, together with Outlook, to read emails and download attachments.
First, install and import the module:
pip install pywin32
import win32com.client
Second, establish a connection:
outlook = win32com.client.Dispatch("Outlook.Application").GetNamespace("MAPI")
Third, read the email:
# Access the emails in the inbox (6 is Outlook's olFolderInbox constant)
inbox = outlook.GetDefaultFolder(6)
messages = inbox.Items
# get the first email
message = messages.GetFirst()
# get the last email
#message = messages.GetLast()
# loop through the emails in the inbox; GetNext() returns None after the last item
while message is not None:
    try:
        print(message.subject) # get the subject of the email
    except Exception:
        # skip items whose properties cannot be read
        pass
    # if you used messages.GetFirst() earlier
    message = messages.GetNext()
    # if you used messages.GetLast() earlier
    #message = messages.GetPrevious()
The above example shows how to print the subject of all of the emails in the Inbox.
Below are some of the common properties:
message.subject
message.senton # return the date & time email sent
message.senton.date()
message.senton.time()
message.sender
message.SenderEmailAddress
message.Attachments # return all attachments in the email
Download an email attachment:
attachments = message.Attachments
# return the first item in attachments (Outlook collections are 1-based)
attachment = attachments.Item(1)
# the name of attachment file
attachment_name = str(attachment).lower()
# path is the folder to save into; it must be defined beforehand
attachment.SaveAsFile(path + '\\' + attachment_name)
I hope that works fine for you
for more information check this link and this link
I am trying to fetch all the attachments of email messages and make a list of those attachments for that particular mail and save that list in a JSON file.
I have been instructed to use imaplib only.
This is the function that I am using to extract the mail data, but part.get_filename() only returns one attachment even when I have sent multiple attachments.
The output I want is the list of attachments like [attach1.xlss, attach2.xml, attch.csv].
Again, I can only use imaplib library.
I also don't want to have to download any attachment, so please don't share that code. I tried several websites but couldn't find anything that I could use.
def get_body_and_attachments(msg):
    email_body = None
    filename = None
    html_part = None
    # if the email message is multipart
    if msg.is_multipart():
        # iterate over email parts
        for part in msg.walk():
            # extract content type of email
            content_type = part.get_content_type()
            content_disposition = str(part.get("Content-Disposition"))
            try:
                # get the email body
                body = part.get_payload(decode=True).decode()
            except:
                pass
            if content_type == "text/plain" and "attachment" not in content_disposition:
                # print text/plain emails and skip attachments
                email_body = body
            elif "attachment" in content_disposition:
                # download attachment
                print(part.get_filename(), "helloooo")
                filename = part.get_filename()
                filename = filename
    else:
        # extract content type of email
        content_type = msg.get_content_type()
        # get the email body
        body = msg.get_payload(decode=True).decode()
        if content_type == "text/plain":
            email_body = body
        if content_type == "text/html":
            html_part = body
    return email_body, filename, html_part
It was easy; I just had to do this.
import re
# getting filenames
filenames = mailbox.uid('fetch', num, '(BODYSTRUCTURE)')[1][0]
filenames = re.findall(r'\("name".*?\)', str(filenames))
filenames = [match.split('" "')[1][:-2] for match in filenames]
Explanation: mailbox.uid fetches the BODYSTRUCTURE of the message with a particular uid (num) and returns a byte string describing the structure of that message, including the attachment names but not the attachment contents.
I then use re.findall to find all the attachment names, clean up the matches, and save them as a list.
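For comparison, the function in the question keeps only one name because filename is overwritten on every pass of the loop. Here is a minimal sketch that collects the names into a list instead, using only the standard library. It assumes raw_bytes is a fully fetched RFC822 message, so unlike the BODYSTRUCTURE approach the attachment data is transferred, though nothing is saved to disk. The function name list_attachment_names is hypothetical.

```python
from email import message_from_bytes, policy


def list_attachment_names(raw_bytes: bytes) -> list:
    """Return the filenames of all parts marked as attachments."""
    msg = message_from_bytes(raw_bytes, policy=policy.default)
    names = []
    for part in msg.walk():
        filename = part.get_filename()
        # Append instead of overwriting, so every attachment is kept.
        if filename and part.get_content_disposition() == "attachment":
            names.append(filename)
    return names
```

The resulting list can be passed to json.dump to write it to a JSON file.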
I was working on a project where I used IMAP to delete all messages from a particular sender.
import imaplib
import email
from email.header import decode_header
import webbrowser
import os
# account credentials
username = "my email"
password = "my pass"
imap = imaplib.IMAP4_SSL("imap.gmail.com")
#imap is commonly used with gmail, however there are variants that are able to interface with outlook
imap.login(username, password)
status, messages = imap.select("INBOX")
N = 6
messages = int(messages[0])
for i in range(messages, messages-N, -1):
    # fetch the email message by ID
    res, msg = imap.fetch(str(i), "(RFC822)")
    for response in msg:
        if isinstance(response, tuple):
            # parse a bytes email into a message object
            msg = email.message_from_bytes(response[1])
            # decode the email subject
            subject = decode_header(msg["Subject"])[0][0]
            if isinstance(subject, bytes):
                # if it's a bytes, decode to str
                subject = subject.decode()
            # email sender
            from_ = msg.get("From")
            print("Subject:", subject)
            print("From:", from_)
            if "Unwanted sender" in from_:
                print("Delete this")
            # if the email message is multipart
            if msg.is_multipart():
                # iterate over email parts
                for part in msg.walk():
                    # extract content type of email
                    content_type = part.get_content_type()
                    content_disposition = str(part.get("Content-Disposition"))
                    try:
                        # get the email body
                        body = part.get_payload(decode=True).decode()
                    except:
                        pass
                    if content_type == "text/plain" and "attachment" not in content_disposition:
                        # print text/plain emails and skip attachments
                        print(body)
                        print("=" * 100)
            else:
                # extract content type of email
                content_type = msg.get_content_type()
                # get the email body
                body = msg.get_payload(decode=True).decode()
                if content_type == "text/plain":
                    # print only text email parts
                    print(body)
imap.close()
imap.logout()
This code works perfectly fine: it prints the words "Delete this" under any message from the unwanted sender. What I want is to actually delete those messages. Is there a function I could define or call (one that's already built into the IMAP library) that can solve my problem?
Thanks in advance.
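One possible approach with imaplib alone is to flag the matching messages as \Deleted and then expunge the mailbox. The sketch below assumes imap is a logged-in connection with a mailbox already selected in read-write mode; the function name delete_messages_from is hypothetical. Some servers may handle the \Deleted flag differently depending on account settings, so try it on a test folder first.

```python
def delete_messages_from(imap, sender: str) -> int:
    """Flag every message whose From header matches sender, then expunge.

    imap is expected to behave like imaplib.IMAP4 with a mailbox selected.
    Returns the number of messages that were flagged.
    """
    # Server-side search; the sender is quoted because it may contain spaces.
    # A sender containing a double quote is not handled by this sketch.
    status, data = imap.search(None, "FROM", '"{}"'.format(sender))
    if status != "OK" or not data or not data[0]:
        return 0
    message_ids = data[0].split()
    for message_id in message_ids:
        # Mark the message for deletion.
        imap.store(message_id, "+FLAGS", "\\Deleted")
    # Permanently remove all messages flagged as deleted.
    imap.expunge()
    return len(message_ids)
```

In the script above this would be called as delete_messages_from(imap, "Unwanted sender") after imap.select("INBOX") and before imap.close().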
I am using the IMAP library in Python to read an email inbox, which works fine, and I download all my attachments successfully. But when a .eml file arrives as an attachment I get an error. Please help me download an .eml file that comes as an attachment.
In order to download an attachment such as a .png from an email, the payload needs to be decoded using part.get_payload(decode=True). However, from the documentation:
If the message is a multipart and the decode flag is True, then None is returned.
The error you are seeing occurs because a .eml attachment is treated as a multipart message. At the top level is a message/rfc822 part, which holds the whole attached email. Beneath it are single-part messages such as text/html, which hold the email's text and so on.
To download this text into an .html or .txt file you need to .walk() through the parts of the .eml file, just as you do on the original email to reach the .eml attachment.
Here is a snippet of my code:
if msg.is_multipart():
    for part in msg.walk():
        # extract content type of email
        content_type = part.get_content_type()
        content_disposition = str(part.get("Content-Disposition"))
        if "attachment" in content_disposition:
            if content_type == "message/rfc822":
                # walk through the .eml attachment parts:
                for eml_part in part.walk():
                    # find the content type of each part:
                    content_type = eml_part.get_content_type()
                    if content_type == "text/html": # this type is not multipart
                        body = eml_part.get_payload(decode=True).decode() # get_payload() can be decoded
                        # can do what you need with the decoded body.
                        # in this case extract text and save to .txt or .html
            else: .....
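If the goal is to save the attached email itself as a .eml file rather than only its text, the nested message can be serialized back to bytes. This is a minimal sketch with the standard library, assuming raw_bytes is the full RFC822 message fetched over IMAP; the function name extract_eml_attachments is hypothetical.

```python
from email import message_from_bytes, policy


def extract_eml_attachments(raw_bytes: bytes) -> list:
    """Return each attached message (message/rfc822 part) as raw .eml bytes."""
    msg = message_from_bytes(raw_bytes, policy=policy.default)
    attached = []
    for part in msg.walk():
        if part.get_content_type() == "message/rfc822":
            # The payload of a message/rfc822 part is a list holding one
            # message object, so get_payload(decode=True) would return None.
            inner = part.get_payload(0)
            attached.append(inner.as_bytes())
    return attached
```

Each returned item can be written to disk with open(name, "wb"). Because walk() also descends into the attached message, emails nested inside it are returned as separate items too.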
Maybe you need to use EML Parser?
You can find the manual for eml-parser here.
You can use it:
def _read(self):
    """Reads all emails and get attachments.

    Returns:
        Attachments.
    """
    self.mail.list()
    self.mail.select(self.select)
    # keep the search result so the uids can be read from it
    self.result, self.data = self.mail.uid('search', None, 'ALL')
    self.uids = self.data[0].split()
    self.content_length = len(self.uids)
    self.attachments = []
    for uid in self.uids:
        self.result, self.email_data = self.mail.uid(
            'fetch', uid, '(RFC822)')
        self.raw_email = self.email_data[0][1]
        self.raw_email_string = self.raw_email.decode('utf-8')
        self.parsed_email = email.message_from_bytes(self.raw_email)
        for part in self.parsed_email.walk():
            if part.get_content_maintype() == 'multipart':
                continue
            if part.get_content_type() not in ['text/html', 'text/plain']:
                self.attachments.append({
                    'name':
                    part.get_filename(),
                    'content_type':
                    part.get_content_type(),
                    'bytes':
                    part.get_payload(decode=True)
                })
    self.result = {'attachments': self.attachments}
    return self.result
Try to use my high level imap lib:
https://github.com/ikvk/imap_tools
from imap_tools import MailBox, MailMessage
# get .eml files attached to email messages from INBOX
with MailBox('imap.mail.com').login('test#mail.com', 'password', 'INBOX') as mailbox:
    for message in mailbox.fetch():
        for att in message.attachments:
            if '.eml' in att.filename:
                print(att.filename, len(att.payload))
Also you can parse .eml in place - see lib examples:
https://github.com/ikvk/imap_tools/blob/master/examples/parse_eml_attachments.py
I am trying to send an email containing a list of URLs along with a message introducing them, for example:
BADURL = ['abc.123.com','xyz.456.com','rtf.892.com']
Required output:
Following are BAD URLs
abc.123.com
xyz.456.com
rtf.892.com
I wrote the following code, but the message arrives as the email body and the URLs as an attachment. I don't want to send the URLs as an attachment; I just want them listed in the email body. Here is my code:
import smtplib
from email.mime import multipart, text
from email.utils import formatdate

message = multipart.MIMEMultipart('mixed')
message['Subject'] = 'Policy.txt file update'
message['From'] = sender
message['To'] = ','.join(destination)
message['Date'] = formatdate(localtime=True)
message.attach(text.MIMEText('Following are BAD URLs'))
message.attach(text.MIMEText('\n'.join(y),'plain'))
print('sending message')
print (message.as_string())
try:
    z = smtplib.SMTP('localhost')
    z.sendmail(sender, destination, message.as_string())
    z.quit()
except(smtplib.SMTPException, IOError) as e:
    z.quit()
    print(str(e))
Try replacing the corresponding lines of your code with these:
message = multipart.MIMEMultipart('alternative')
message.attach(text.MIMEText('\n'.join(y),'html'))
and when you include the URLs, wrap them in tags; say, put
<a href="http://abc.123.com">some title here</a>
in your message.
This should work.
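Another option is to avoid a second part altogether. Many mail clients show each extra text part of a multipart/mixed message as an attachment, so putting the heading and the URLs into one plain-text body keeps them together. This is a minimal sketch using the standard library's EmailMessage; the function name build_url_report and the example addresses are hypothetical.

```python
from email.message import EmailMessage


def build_url_report(sender: str, recipients: list, urls: list) -> EmailMessage:
    """Build a single-part plain-text message listing the URLs in the body."""
    message = EmailMessage()
    message["Subject"] = "Policy.txt file update"
    message["From"] = sender
    message["To"] = ", ".join(recipients)
    # Heading and URLs go into ONE text body, one URL per line.
    body = "Following are BAD URLs\n" + "\n".join(urls)
    message.set_content(body)
    return message


BADURL = ["abc.123.com", "xyz.456.com", "rtf.892.com"]
report = build_url_report("sender@example.com", ["dest@example.com"], BADURL)
print(report.get_content())
```

The message can then be sent with smtplib.SMTP('localhost').send_message(report) in place of the sendmail call.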