How to download the complete HTML of emails with Python using IMAP

Goal: get the HTML of an email, saved as an .html file, so that it looks exactly as it does in the mailbox.
Explanation: I am using Python and IMAP to download emails and extract the HTML content with .get_content_payload(text/HTML), but when I save it and open it, it doesn't look like the message shown in the mailbox.
I tried writing an HTML file from mail_html, but it doesn't work well: the CSS is missing and the whole HTML email looks terrible. I want to download the HTML along with the other page assets. This is where I need your help.
import email
import imaplib

mail = imaplib.IMAP4_SSL(SERVER)
mail.login(EMAIL, PASSWORD)
mail.select(TARGET)
status, data = mail.search(None, 'ALL')
mail_ids = []
for block in data:
    mail_ids += block.split()
for mail_id in mail_ids:
    status, data = mail.fetch(mail_id, '(RFC822)')
    print("BODY ENDS")
    for response_part in data:
        if isinstance(response_part, tuple):
            message = email.message_from_bytes(response_part[1])
            mail_from = message['from']
            mail_subject = message['subject']
            # if it is a multipart message, separate the parts first
            if message.is_multipart():
                mail_content = ''
                mail_html = ''
                for part in message.get_payload():
                    # collect the plain-text and HTML parts of the message
                    if part.get_content_type() == 'text/plain':
                        mail_content += part.get_payload()
                    # check the part, not the whole message, and append with +=
                    if part.get_content_type() == 'text/html':
                        mail_html += part.get_payload()

Try my lib: https://github.com/ikvk/imap_tools
from imap_tools import MailBox
# get list of email bodies from INBOX folder
with MailBox('imap.mail.com').login('test@mail.com', 'pwd', 'INBOX') as mailbox:
    bodies = [msg.html or msg.text for msg in mailbox.fetch()]
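One possible reason a saved HTML body looks different from the mailbox view is that HTML emails often reference embedded images through cid: URLs, which only resolve inside the original MIME message. The following is a minimal sketch, using only the standard library, of extracting the HTML body and inlining those images as data URIs. The function name html_with_inline_images, the placeholder image bytes, and the sample message are illustrative assumptions; stylesheets or images hosted on remote servers would still have to be fetched separately.

```python
import base64
import email
from email import policy
from email.message import EmailMessage


def html_with_inline_images(msg):
    """Return the HTML body with cid: image references replaced by data URIs."""
    html_part = msg.get_body(preferencelist=("html",))
    if html_part is None:
        return None
    html = html_part.get_content()
    for part in msg.walk():
        cid = part.get("Content-ID")
        if cid and part.get_content_maintype() == "image":
            data = base64.b64encode(part.get_payload(decode=True)).decode("ascii")
            uri = f"data:{part.get_content_type()};base64,{data}"
            html = html.replace("cid:" + cid.strip("<>"), uri)
    return html


# Placeholder bytes standing in for a real image (assumption for the demo).
PNG_BYTES = b"\x89PNG\r\n\x1a\n"

# Build a sample message locally so the sketch runs without a mail server.
sample = EmailMessage()
sample["Subject"] = "Demo"
sample.set_content("Plain text fallback")
sample.add_alternative(
    '<html><body><img src="cid:logo@example.com"></body></html>',
    subtype="html",
)
sample.get_payload()[1].add_related(
    PNG_BYTES, maintype="image", subtype="png", cid="<logo@example.com>"
)

# Round-trip through bytes, as if the message had come from mail.fetch().
parsed = email.message_from_bytes(sample.as_bytes(), policy=policy.default)
page = html_with_inline_images(parsed)
```

With a real mailbox, the bytes in response_part[1] would take the place of sample.as_bytes(), and page could then be written to an .html file.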

Related

Read attachments data from email using python

I'm trying to read an email and its attachments with code. I can read the email using Python's extract_msg module, but not the attachment content. When I print the attachments variable, it shows a list of objects and not the content. The code is provided below. Please share your thoughts on this.
import extract_msg
import glob
import re

f = glob.glob('Time Off -DAYS.msg')
for filename in f:
    msg = extract_msg.Message(filename)
    msg_from = msg.sender
    msg_date = msg.date
    msg_subj = msg.subject
    msg_message = msg.body
    attachments = msg.attachments
    msg_to = msg.to
    print("To:-", msg_to)
    print("From:-", msg_from)
    print("Date:-", msg.date)
    print(attachments)
Attachments data (the email has 2 attachments):
[<extract_msg.attachment.Attachment object at 0x0000024444C9FEF0>, <extract_msg.attachment.Attachment object at 0x0000024444E39C50>]
Several libraries can be used to work with email in Python: smtplib, imaplib, pywin32.
Here I use pywin32 with Outlook to read and download emails and attachments.
First, install and import the module:
pip install pywin32
import win32com.client
Second, establish a connection:
outlook = win32com.client.Dispatch("Outlook.Application").GetNamespace("MAPI")
# 6 is the Outlook constant for the default Inbox folder (olFolderInbox)
inbox = outlook.GetDefaultFolder(6)
Third, read the email:
# access the emails in the inbox
messages = inbox.Items
# get the first email
message = messages.GetFirst()
# get the last email
# message = messages.GetLast()
# loop through the emails in the inbox; GetNext() returns None at the end
while message is not None:
    try:
        print(message.subject)  # get the subject of the email
    except Exception:
        # skip items that do not expose a subject
        pass
    # if you used messages.GetFirst() earlier
    message = messages.GetNext()
    # if you used messages.GetLast() earlier
    # message = messages.GetPrevious()
The above example shows how to print the subject of all of the emails in the Inbox.
Below are some of the common properties:
message.subject
message.senton # return the date & time email sent
message.senton.date()
message.senton.time()
message.sender
message.SenderEmailAddress
message.Attachments # return all attachments in the email
Download an email attachment:
attachments = message.Attachments
# return the first item in attachments (the collection is 1-based)
attachment = attachments.Item(1)
# the name of the attachment file
attachment_name = str(attachment).lower()
# path is the target directory, assumed to be defined elsewhere
attachment.SaveAsFile(path + '\\' + attachment_name)
I hope that works fine for you

How to get name of all email attachments of a particular mail using imaplib, python?

I am trying to collect the names of all the attachments of an email message, build a list of them for that particular mail, and save that list in a JSON file.
I have been instructed to use imaplib only.
This is the function that I am using to extract the mail's data, but part.get_filename() only returns one attachment even if I have sent multiple attachments.
The output I want is the list of attachments, like [attach1.xlss, attach2.xml, attch.csv].
Again, I can only use the imaplib library.
I also don't want to download any attachments, so please don't share code for that. I tried several websites but couldn't find anything that I could use.
def get_body_and_attachments(msg):
    email_body = None
    filename = None
    html_part = None
    # if the email message is multipart
    if msg.is_multipart():
        # iterate over email parts
        for part in msg.walk():
            # extract content type of email
            content_type = part.get_content_type()
            content_disposition = str(part.get("Content-Disposition"))
            try:
                # get the email body
                body = part.get_payload(decode=True).decode()
            except:
                pass
            if content_type == "text/plain" and "attachment" not in content_disposition:
                # print text/plain emails and skip attachments
                email_body = body
            elif "attachment" in content_disposition:
                # download attachment
                print(part.get_filename(), "helloooo")
                filename = part.get_filename()
                filename = filename
    else:
        # extract content type of email
        content_type = msg.get_content_type()
        # get the email body
        body = msg.get_payload(decode=True).decode()
        if content_type == "text/plain":
            email_body = body
    if content_type == "text/html":
        html_part = body
    return email_body, filename, html_part
It was easy; I just had to do this.
import re
# getting filenames
filenames = mailbox.uid('fetch', num, '(BODYSTRUCTURE)')[1][0]
filenames = re.findall(r'\("name".*?\)', str(filenames))
filenames = [name.split('" "')[1][:-2] for name in filenames]
Explanation: mailbox.uid fetches the BODYSTRUCTURE of the message with a particular UID (num) and returns a byte string describing the structure of that message, without downloading the attachments themselves.
I then use re.findall to find all the attachment names, clean up each match, and save the results as a list.
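For comparison, the function in the question returns a single name because filename is overwritten on every loop iteration. A minimal sketch of collecting every name into a list follows, assuming the full message has already been fetched and parsed (unlike the BODYSTRUCTURE approach, this does transfer the whole message). The helper name attachment_names and the sample message are illustrative assumptions.

```python
import email
from email import policy
from email.message import EmailMessage


def attachment_names(msg):
    """Return the filenames of all attachment parts in a parsed message."""
    names = []
    for part in msg.walk():
        # Appending to a list keeps every name instead of overwriting one variable.
        if part.get_content_disposition() == "attachment":
            filename = part.get_filename()
            if filename:
                names.append(filename)
    return names


# Build a sample message locally so the sketch runs without a mail server.
sample = EmailMessage()
sample["Subject"] = "Report"
sample.set_content("Files attached.")
sample.add_attachment(
    b"a,b\n1,2\n", maintype="text", subtype="csv", filename="attch.csv"
)
sample.add_attachment(
    b"<root/>", maintype="application", subtype="xml", filename="attach2.xml"
)

# Round-trip through bytes, as if the message had come from an IMAP fetch.
parsed = email.message_from_bytes(sample.as_bytes(), policy=policy.default)
names = attachment_names(parsed)
```

The resulting list can be passed straight to json.dump to produce the JSON file the question asks for.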

Is there a way to delete an email in gmail with IMAP based off of the sender?

I was working on a project where I used IMAP to delete all messages from a particular sender.
import email
import imaplib
from email.header import decode_header
import webbrowser
import os

# account credentials
username = "my email"
password = "my pass"
imap = imaplib.IMAP4_SSL("imap.gmail.com")
# imap is commonly used with gmail, however there are variants that are able to interface with outlook
imap.login(username, password)
status, messages = imap.select("INBOX")
N = 6
messages = int(messages[0])
for i in range(messages, messages-N, -1):
    # fetch the email message by ID
    res, msg = imap.fetch(str(i), "(RFC822)")
    for response in msg:
        if isinstance(response, tuple):
            # parse a bytes email into a message object
            msg = email.message_from_bytes(response[1])
            # decode the email subject
            subject = decode_header(msg["Subject"])[0][0]
            if isinstance(subject, bytes):
                # if it's a bytes, decode to str
                subject = subject.decode()
            # email sender
            from_ = msg.get("From")
            print("Subject:", subject)
            print("From:", from_)
            if "Unwanted sender" in from_:
                print("Delete this")
            # if the email message is multipart
            if msg.is_multipart():
                # iterate over email parts
                for part in msg.walk():
                    # extract content type of email
                    content_type = part.get_content_type()
                    content_disposition = str(part.get("Content-Disposition"))
                    try:
                        # get the email body
                        body = part.get_payload(decode=True).decode()
                    except:
                        pass
                    if content_type == "text/plain" and "attachment" not in content_disposition:
                        # print text/plain emails and skip attachments
                        print(body)
                        print("=" * 100)
            else:
                # extract content type of email
                content_type = msg.get_content_type()
                # get the email body
                body = msg.get_payload(decode=True).decode()
                if content_type == "text/plain":
                    # print only text email parts
                    print(body)
imap.close()
imap.logout()
This code works perfectly fine, and it prints the words "Delete this" under any message from the unwanted sender. Is there a function I could define or call (one that's already built into the IMAP library) that would actually delete those messages?
Thanks in advance.
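One possible approach, sketched here under assumptions, is to let the server filter by sender with the IMAP FROM search key, flag each match as \Deleted with store, and then call expunge. The helper name delete_from_sender and the example address are hypothetical, and imap is assumed to be an already logged-in imaplib connection. Gmail may archive or trash flagged messages rather than remove them outright, depending on the account's IMAP settings.

```python
import imaplib


def delete_from_sender(imap, sender, mailbox="INBOX"):
    """Flag every message from `sender` as deleted, then expunge.

    `imap` is assumed to be an already logged-in imaplib.IMAP4 connection.
    Returns the number of messages flagged.
    """
    imap.select(mailbox)
    # Let the server do the filtering with the IMAP FROM search key.
    status, data = imap.search(None, "FROM", f'"{sender}"')
    if status != "OK" or not data or not data[0]:
        return 0
    ids = data[0].split()
    for num in ids:
        imap.store(num, "+FLAGS", "\\Deleted")
    # Permanently remove the flagged messages from the selected mailbox.
    imap.expunge()
    return len(ids)


# Hypothetical usage (needs real credentials, so it is left commented out):
# imap = imaplib.IMAP4_SSL("imap.gmail.com")
# imap.login(username, password)
# print(delete_from_sender(imap, "unwanted@example.com"))
# imap.close()
# imap.logout()
```

Searching on the server avoids downloading every message just to inspect its From header.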

EML file as attachment is not downloading using IMAP in Python?

I am using the IMAP library in Python to read an email inbox, which is working fine, and I am downloading all my attachments successfully. However, when a .eml file arrives as an attachment, I get an error. Please help me download an .eml file that arrives as an attachment.
To download an attachment such as a .png from an email, the payload needs to be decoded using part.get_payload(decode=True). However, from the documentation:
If the message is a multipart and the decode flag is True, then None is returned.
The error you are seeing occurs because an attached .eml file is treated as a multipart message. At the top level is a message/rfc822 part, which holds the whole attached email. Beneath it are single-part messages, such as text/html, which hold the email's text and so on.
To save this text to an .html or .txt file, you need to .walk() through the parts of the .eml attachment, just as you already do on the original email to find the .eml attachment.
Here is a snippet of my code:
if msg.is_multipart():
    for part in msg.walk():
        # extract content type of email
        content_type = part.get_content_type()
        content_disposition = str(part.get("Content-Disposition"))
        if "attachment" in content_disposition:
            if content_type == "message/rfc822":
                # walk through the .eml attachment parts:
                for eml_part in part.walk():
                    # find the content type of each part:
                    content_type = eml_part.get_content_type()
                    if content_type == "text/html":  # this type is not multipart
                        body = eml_part.get_payload(decode=True).decode()  # get_payload() can be decoded
                        # can do what you need with the decoded body.
                        # in this case extract text and save to .txt or .html
            else: .....
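If the goal is to save the attached message itself as an .eml file rather than only its text, one option is to serialize the embedded message back to bytes. The sketch below uses only the standard library; the helper name extract_eml_attachments, the fallback filename, and the sample messages are illustrative assumptions.

```python
import email
from email import policy
from email.message import EmailMessage


def extract_eml_attachments(msg):
    """Return (filename, raw_bytes) for every message/rfc822 part in msg."""
    found = []
    for part in msg.walk():
        if part.get_content_type() != "message/rfc822":
            continue
        # The payload of a message/rfc822 part is a one-element list holding
        # the embedded message, so get_payload(decode=True) would return None.
        embedded = part.get_payload(0)
        name = part.get_filename() or "attached.eml"
        found.append((name, embedded.as_bytes()))
    return found


# Build a small sample message locally so the sketch runs without a server.
inner = EmailMessage()
inner["Subject"] = "Inner message"
inner["From"] = "a@example.com"
inner["To"] = "b@example.com"
inner.set_content("Hello from the attached message.")

outer = EmailMessage()
outer["Subject"] = "Outer message"
outer.set_content("See the attached .eml file.")
outer.add_attachment(inner, filename="inner.eml")

# Round-trip through bytes, as if the message had come from an IMAP fetch.
parsed = email.message_from_bytes(outer.as_bytes(), policy=policy.default)
eml_files = extract_eml_attachments(parsed)
```

Each raw_bytes value can be written to disk in binary mode to obtain a standalone .eml file.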
Maybe you need to use EML Parser?
You can find the manual for eml-parser here.
You can use it:
def _read(self):
    """Reads all emails and gets attachments.

    Returns:
        Attachments.
    """
    self.mail.list()
    self.mail.select(self.select)
    # keep the search result; the UIDs are read from self.data below
    self.result, self.data = self.mail.uid('search', None, 'ALL')
    self.uids = self.data[0].split()
    self.content_length = len(self.uids)
    self.attachments = []
    for uid in self.uids:
        self.result, self.email_data = self.mail.uid(
            'fetch', uid, '(RFC822)')
        self.raw_email = self.email_data[0][1]
        self.raw_email_string = self.raw_email.decode('utf-8')
        self.parsed_email = email.message_from_bytes(self.raw_email)
        for part in self.parsed_email.walk():
            # skip container parts; only leaf parts carry content
            if part.get_content_maintype() == 'multipart':
                continue
            # treat everything that is not a text body as an attachment
            if part.get_content_type() not in ['text/html', 'text/plain']:
                self.attachments.append({
                    'name': part.get_filename(),
                    'content_type': part.get_content_type(),
                    'bytes': part.get_payload(decode=True)
                })
    self.result = {'attachments': self.attachments}
    return self.result
Try using my high-level IMAP lib:
https://github.com/ikvk/imap_tools
from imap_tools import MailBox, MailMessage
# get .eml files attached to email messages from INBOX
with MailBox('imap.mail.com').login('test@mail.com', 'password', 'INBOX') as mailbox:
    for message in mailbox.fetch():
        for att in message.attachments:
            if '.eml' in att.filename:
                print(att.filename, len(att.payload))
You can also parse the .eml in place; see the lib examples:
https://github.com/ikvk/imap_tools/blob/master/examples/parse_eml_attachments.py

send email in Python and list items as email content not as attachment

I am trying to send an email containing a list of URLs, along with a message stating that these are the URLs. For example:
BADURL = ['abc.123.com','xyz.456.com','rtf.892.com']
Required output:
Following are BAD URLs
abc.123.com
xyz.456.com
rtf.892.com
I wrote the following code, but the message arrives as the email body and the URLs arrive as an attachment. I don't want to send the URLs as an attachment; I just want them listed in the email. Here is my code:
import smtplib
from email.mime import multipart, text
from email.utils import formatdate

# y is the list of URLs (BADURL above); sender and destination are defined elsewhere
message = multipart.MIMEMultipart('mixed')
message['Subject'] = 'Policy.txt file update'
message['From'] = sender
message['To'] = ','.join(destination)
message['Date'] = formatdate(localtime=True)
message.attach(text.MIMEText('Following are BAD URLs'))
message.attach(text.MIMEText('\n'.join(y), 'plain'))
print('sending message')
print(message.as_string())
try:
    z = smtplib.SMTP('localhost')
    z.sendmail(sender, destination, message.as_string())
    z.quit()
except (smtplib.SMTPException, IOError) as e:
    z.quit()
    print(str(e))
Try replacing the corresponding parts of your code with these:
message = multipart.MIMEMultipart('alternative')
message.attach(text.MIMEText('\n'.join(y), 'html'))
When you include the URLs, wrap them in HTML tags; for example, put
<a href="URL">some title here</a>
in your message.
This should work.
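Another option, sketched here as an assumption rather than as part of the answer above, is to avoid a multipart message altogether and put the heading and the URLs into a single plain-text body, so there is no second part for a mail client to display as an attachment. The helper name build_message and the example addresses are hypothetical.

```python
from email.message import EmailMessage

BADURL = ['abc.123.com', 'xyz.456.com', 'rtf.892.com']


def build_message(sender, recipients, urls):
    """Build a single-part plain-text message listing the URLs in the body."""
    msg = EmailMessage()
    msg['Subject'] = 'Policy.txt file update'
    msg['From'] = sender
    msg['To'] = ', '.join(recipients)
    # One text body: the heading line followed by one URL per line.
    body = 'Following are BAD URLs\n' + '\n'.join(urls)
    msg.set_content(body)
    return msg


message = build_message('sender@example.com', ['admin@example.com'], BADURL)

# Hypothetical sending step (needs a reachable SMTP server, so left commented out):
# import smtplib
# with smtplib.SMTP('localhost') as smtp:
#     smtp.send_message(message)
```

Because the message has exactly one text/plain part, the list appears inline in the body.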
