Python read last 10 emails from Outlook - python

I can read my last email from my Outlook and send all the results according to each line's content.
However, I am unable to find the way to read my last 10 emails to be added to the fileCollect.txt file.
Any ideas how I could do this? Here is my current code:
import win32com.client
import csv
outlook = win32com.client.Dispatch("Outlook.Application").GetNamespace("MAPI")
inbox = outlook.GetDefaultFolder(6) # "6" refers to the index of a folder - in this case,
# the inbox. You can change that number to reference
# any other folder
messages = inbox.Items
message = messages.GetLast()
fileCollect = open("fileCollect.txt",'a')
delimiter = "¿"
fileCollect.write( str(message.Sender) + delimiter + str(message.Subject)+ delimiter + str(message.Body) )
fileCollect.close()
csvfile = open("csvfile.csv",'a')
with open("fileCollect.txt","r") as outfile:
for line in outfile:
if line.find("test") != -1:
csvfile.write(line)
csvfile.close()

The Items collection will not be sorted in any particular order until you actually sort it by calling Items.Sort. The VB script below sorts the collection by ReceivedTime in the descending order:
set messages = inbox.Items
messages.Sort("ReceivedTime", True)
set message = messages.GetFirst()
while not (message Is Nothing)
MsgBox message.Subject
set message = messages.GetNext()
wend

You can get the last 10 messages by specifying a negative index:
last_10_messages = messages[-10:]
This will return an array from messages[-10], which is the 10th to the last message, to the last message in the messages array.

use len(inbox.Items) to get the length of the inbox.
use inbox.Items.Item(i) to get i-th email in the inbox.
Ref:
https://learn.microsoft.com/en-us/office/vba/api/outlook.items.item

Related

Unable to loop correctly

I am working on assignment to extract emails from the mailbox.
Below are my codes, I am referencing from this case and combine with some other research online:
import win32com.client
import pandas as pd
import os
outlook = win32com.client.Dispatch("Outlook.Aplication").GetNamespace("MAPI")
inbox = outlook.GetDefaultFolder(6).Folders["Testmails"]
condition = pd.read_excel(r"C:\Users\Asus\Desktop\Python\Condition.xlsx", sheet_name = 'endword')
emails = condition.iloc[:,1].tolist()
done = outlook.GetDefaultFolder(6).Folders["Testmails"].Folders["Done"]
Item = inbox.Items.GetFirst()
add = Item.SenderEmailAddress
for attachment in Item.Attachments:
if any([add.endswith(m) for m in condition]) and Item.Attachments.Count > 0:
print(attachment.FileName)
dir = "C:\\Users\\Asus\\Desktop\\Python\\Output\\"
fname = attachment.FileName
outpath = os.path.join(dir, fname)
attachment.SaveAsFile(outpath)
Item.Move(done)
The code above is running, but it only saves the first email attachment, and the other email that matches the condition is not saving.
The condition file is like below, if is gmail to save in file A. But I am not sure if we can do by vlookup in loops.
mail end Directory
0 gmail.com "C:\\Users\\Asus\\Desktop\\Output\\A\\"
1 outlook.com "C:\\Users\\Asus\\Desktop\\Output\\A\\"
2 microsoft.com "C:\\Users\\Asus\\Desktop\\Output\\B\\"
Thanks for all the gurus who is helping much. I have edited the codes above but now is facing other issues on looping.
Fix Application on Dispatch("Outlook.Aplication") should be double p
On filter add single quotation mark round 'emails'
Example
Filter = "[SenderEmailAddress] = 'emails'"
for loop, you are using i but then you have print(attachment.FileName) / attachment.SaveAsFile
use i for all - print(i.FileName) / i.SaveAsFile or attachment
import win32com.client
Outlook = win32com.client.Dispatch("Outlook.Application")
olNs = Outlook.GetNamespace("MAPI")
Inbox = olNs.GetDefaultFolder(6)
Filter = "[SenderEmailAddress] = '0m3r#email.com'"
Items = Inbox.Items.Restrict(Filter)
Item = Items.GetFirst()
if Item.Attachments.Count > 0:
for attachment in Item.Attachments:
print(Item.Attachments.Count)
print(attachment.FileName)
attachment.SaveAsFile(r"C:\path\to\my\folder\Attachment.xlsx")
The 'NoneType' object has no attribute 'Attachments' error means that you're trying to get attachments from something that is None.
You're getting attachments in only one place:
for i in Item.Attachments:
...
so we can conclude that the Item here is None.
By looking at Microsoft's documentation we can see that the method...
Returns Nothing if no first object exists, for example, if there are no objects in the collection
Therefore, I'd imagine there's an empty collection, or no emails matching your filter
To handle this you could use an if statement
if Item is not None:
for i in Item.Attachments:
...
else:
pass # Do something here if there's nothing matching your filter

How to make a time restriction in outlook using python?

I am making a program that:
opens outlook
find emails per subject
extract some date from emails (code and number)
fills these data in excel file in.
Standard email looks like this:
Subject: Test1
Hi,
You got a new answer from user Alex.
Code: alex123fj
Number1: 0611111111
Number2: 1020
Number3: 3032
I encounter 2 main problems in the process.
Firstly, I do not get how to make time restriction for emails in outlook. For example, if I want to read emails only from yesterday.
Secondly, all codes and numbers from email I save in lists. But every item gets this ["alex123fj/r"] in place from this ["alex123fj"]
I would appreciate any help or advice, that is my first ever program in Python.
Here is my code:
import win32com.client
import re
outlook = win32com.client.Dispatch("Outlook.Application").GetNamespace("MAPI")
inbox = outlook.Folders('myemail#....').Folders('Inbox')
messages = inbox.Items
def get_code(messages):
codes_lijst = []
for message in messages:
subject = message.subject
if subject == "Test1":
body = message.body
matches = re.finditer("Code:\s(.*)$", body, re.MULTILINE)
for match in matches:
codes_lijst.append(match.group(1))
return codes_lijst
def get_number(messages):
numbers_lijst = []
for message in messages:
subject = message.subject
if subject == "Test1":
body = message.body
matches = re.finditer("Number:\s(.*)$", body, re.MULTILINE)
for match in matches:
numbers_lijst.append(match.group(1))
return numbers_lijst
code = get_code(messages)
number = get_number(messages)
print(code)
print(number)
Firstly, never loop through all items in a folder. Use Items.Find/FindNext or Items.Restrict with a restriction on ConversationTopic (e.g. [ConversationTopic] = 'Test1').
To create a date/time restriction, add a range restriction ([ReceivedTime] > 'some value') and [ReceivedTime] < 'other value'

Python: Downloading Outlook Attachments and Change Filename on Save

the following code works below, however i am trying to add a parameter that changes the filename on the SaveAsFile method to the iteration of (a) the message that i am on.
As an Example the Current output is
Returned mail see transcript for details
Returned mail see transcript for details
The Desired output is
Returned mail see transcript for details1
Returned mail see transcript for details2
Returned mail see transcript for details3
Currently this code just overwrites the same save file in my folder, however i need to accomplish saving that same file from different messages to a new file name.
Code Below:
import win32com.client
import os
outlook = win32com.client.Dispatch("Outlook.Application").GetNamespace("MAPI")
inbox = outlook.GetDefaultFolder(6) # "6" refers to the index of a folder - in this case the inbox. You can change that number to reference
messages = inbox.Items
message = messages.GetFirst()
subject = message.Subject
i = 0
#
get_path = r'S:\Corporate Shared\Contracting Shared\DATA_PROJECTS\James\Email Extraction\Undeliverable Items'
for m in messages:
i = i + 1 #numeration
a = str(i) #Creates i as a string
if m.Subject == ("Returned mail: see transcript for details"):
#print(message)
attachments = message.Attachments
num_attach = len([x for x in attachments])
for x in range(1, num_attach + 1):
attachment = attachments.Item(x)
attachment.SaveASFile(os.path.join(get_path,attachment.FileName))
print(attachment)
#print(a)
message = messages.GetNext()
else:
message = messages.GetNext()
Instead of using attachment.FileName in the call to os.path.join, store attachment.FileName in a variable, then replace the last "." with "_" + x + "."

How to check all emails that came in within a time period?

I have the following method get_email() that basically every 20 seconds, gets the latest email and performs a series of other methods on it.
def get_email():
import win32com.client
import os
import time
import datetime as dt
date_time = time.strftime('%m-%d-%Y')
outlook = win32com.client.Dispatch("Outlook.Application").GetNameSpace("MAPI")
inbox = outlook.GetDefaultFolder(6)
messages = inbox.Items
message = messages.GetFirst() # any time calling GetFirst(), you can get GetNext()....
email_subject = message.subject
email_sender = message.SenderEmailAddress
attachments = message.Attachments
body_content = message.body
print ('From: ' + email_sender)
print ('Subject: ' + email_subject)
if attachments.Count > 0:
print (str(attachments.Count) + ' attachments found.')
for i in range(attachments.Count):
email_attachment = attachments.Item(i+1)
report_name = date_time + '_' + email_attachment.FileName
print('Pushing attachment - ' + report_name + ' - to check_correct_email() function.')
if check_correct_email(email_attachment, email_subject, report_name) == True:
save_incoming_report(email_attachment, report_name, get_report_directory(email_subject))
else:
print('Not the attachment we are looking for.')
# add error logging here
break
else: #***********add error logging here**************
print('No attachment found.')
My main question is:
Is there a way I can iterate over every email using the GetNext() function per se every hour instead of getting the latest one every 20 seconds (which is definitely not as efficient as searching through all emails)?
Given that there are two functions: GetFirst() and GetNext() how would I properly have it save the latest checked, and then go through all the ones that have yet to be checked?
Do you think it would be easier to potentially set up a different folder in Outlook where I can push all of these reports to, and then iterate through them on a time basis? The only problem here is, if an incoming report is auto-generated and the time interval between the email is less than 20 seconds, or even 1 second.
Any help at all is appreciated!
You can use the Restrict function to restrict your messages variable to emails sent within the past hour, and iterate over each of those. Restrict takes the full list of items from your inbox and gives you a list of the ones that meet specific criteria, such as having been received in a specified time range. (The MSDN documentation linked above lists some other potential properties you could Restrict by.)
If you run this every hour, you can Restrict your inbox to the messages you received in the past hour (which, presumably, are the ones that still need to be searched) and iterate over those.
Here's an example of restricting to emails received in the past hour (or minute):
import win32com.client
import os
import time
import datetime as dt
# this is set to the current time
date_time = dt.datetime.now()
# this is set to one hour ago
lastHourDateTime = dt.datetime.now() - dt.timedelta(hours = 1)
#This is set to one minute ago; you can change timedelta's argument to whatever you want it to be
lastMinuteDateTime = dt.datetime.now() - dt.timedelta(minutes = 1)
outlook = win32com.client.Dispatch("Outlook.Application").GetNameSpace("MAPI")
inbox = outlook.GetDefaultFolder(6)
# retrieve all emails in the inbox, then sort them from most recently received to oldest (False will give you the reverse). Not strictly necessary, but good to know if order matters for your search
messages = inbox.Items
messages.Sort("[ReceivedTime]", True)
# restrict to messages from the past hour based on ReceivedTime using the dates defined above.
# lastHourMessages will contain only emails with a ReceivedTime later than an hour ago
# The way the datetime is formatted DOES matter; You can't add seconds here.
lastHourMessages = messages.Restrict("[ReceivedTime] >= '" +lastHourDateTime.strftime('%m/%d/%Y %H:%M %p')+"'")
lastMinuteMessages = messages.Restrict("[ReceivedTime] >= '" +lastMinuteDateTime.strftime('%m/%d/%Y %H:%M %p')+"'")
print "Current time: "+date_time.strftime('%m/%d/%Y %H:%M %p')
print "Messages from the past hour:"
for message in lastHourMessages:
print message.subject
print message.ReceivedTime
print "Messages from the past minute:"
for message in lastMinuteMessages:
print message.subject
print message.ReceivedTime
# GetFirst/GetNext will also work, since the restricted message list is just a shortened version of your full inbox.
print "Using GetFirst/GetNext"
message = lastHourMessages.GetFirst()
while message:
print message.subject
print message.ReceivedTime
message = lastHourMessages.GetNext()
You seem to have it running every 20 seconds, so presumably you could run it at a different interval. If you can't run it reliably at a regular interval (which would then be specified in the timedelta, e.g. hours=1), you could save the ReceivedTime of the most recent email checked, and use it to Restrict your search. (In that case, the saved ReceivedTime would replace lastHourDateTime, and the Restrict would retrieve every email sent after the last one checked.)
I hope this helps!
I had a similar question and worked through the above solution. Including another general use example in case other folks find it easier:
import win32com.client
import os
import datetime as dt
outlook = win32com.client.Dispatch("Outlook.Application").GetNamespace("MAPI")
# setup range for outlook to search emails (so we don't go through the entire inbox)
lastWeekDateTime = dt.datetime.now() - dt.timedelta(days = 7)
lastWeekDateTime = lastWeekDateTime.strftime('%m/%d/%Y %H:%M %p') #<-- This format compatible with "Restrict"
# Select main Inbox
inbox = outlook.GetDefaultFolder(6)
messages = inbox.Items
# Only search emails in the last week:
messages = messages.Restrict("[ReceivedTime] >= '" + lastWeekDateTime +"'")
print(message.subject)
# Rest of code...
The below solution gives past 30 minutes unread folder mails from outlook and the count of the mails within the past 30 minutes
import win32com.client
import os
import time
import datetime as dt
# this is set to the current time
date_time = dt.datetime.now()
#This is set to one minute ago; you can change timedelta's argument to whatever you want it to be
last30MinuteDateTime = dt.datetime.now() - dt.timedelta(minutes = 30 )
outlook = win32com.client.Dispatch("Outlook.Application").GetNameSpace("MAPI")
inbox = outlook.GetDefaultFolder(6)
# retrieve all emails in the inbox, then sort them from most recently received to oldest (False will give you the reverse). Not strictly necessary, but good to know if order matters for your search
messages = inbox.Items.Restrict("[Unread]=true")
messages.Sort("[ReceivedTime]", True)
last30MinuteMessages = messages.Restrict("[ReceivedTime] >= '" +last30MinuteDateTime.strftime('%m/%d/%Y %H:%M %p')+"'")
print "Current time: "+date_time.strftime('%m/%d/%Y %H:%M %p')
print "Messages from the past 30 minute:"
c=0
for message in last30MinuteMessages:
print message.subject
print message.ReceivedTime
c=c+1;
print "The count of meesgaes unread from past 30 minutes ==",c
Using import datetime, this is what I came up with:
count = 0
msg = messages[len(messages) - count - 1]
while msg.SentOn.strftime("%d-%m-%y") == datetime.datetime.today().strftime("%d-%m-%y"):
msg = messages[len(messages) - count - 1]
count += 1
# Rest of the code

Getting n most recent emails using IMAP and Python

I'm looking to return the n (most likely 10) most recent emails from an email accounts inbox using IMAP.
So far I've cobbled together:
import imaplib
from email.parser import HeaderParser
M = imaplib.IMAP4_SSL('my.server')
user = 'username'
password = 'password'
M.login(user, password)
M.search(None, 'ALL')
for i in range (1,10):
data = M.fetch(i, '(BODY[HEADER])')
header_data = data[1][0][1]
parser = HeaderParser()
msg = parser.parsestr(header_data)
print msg['subject']
This is returning email headers fine, but it seems to be a semi-random collection of emails that it gets, not the 10 most recent.
If it helps, I'm connecting to an Exchange 2010 server. Other approaches also welcome, IMAP just seemed the most appropriate given that I only wanted to read the emails not send any.
The sort command is available, but it is not guaranteed to be supported by the IMAP server. For example, Gmail does not support the SORT command.
To try the sort command, you would replace:
M.search(None, 'ALL')
with
M.sort(search_critera, 'UTF-8', 'ALL')
Then search_criteria would be a string like:
search_criteria = 'DATE' #Ascending, most recent email last
search_criteria = 'REVERSE DATE' #Descending, most recent email first
search_criteria = '[REVERSE] sort-key' #format for sorting
According to RFC5256 these are valid sort-key's:
"ARRIVAL" / "CC" / "DATE" / "FROM" / "SIZE" / "SUBJECT" / "TO"
Notes:
1. charset is required, try US-ASCII or UTF-8 all others are not required to be supported by the IMAP server
2. search critera is also required. The ALL command is a valid one, but there are many. See more at http://www.networksorcery.com/enp/rfc/rfc3501.txt
The world of IMAP is wild and crazy. Good luck
This is the code to get the emailFrom, emailSubject, emailDate, emailContent etc..
import imaplib, email, os
user = "your#email.com"
password = "pass"
imap_url = "imap.gmail.com"
connection = imaplib.IMAP4_SSL(imap_url)
connection.login(user, password)
result, data = connection.uid('search', None, "ALL")
if result == 'OK':
for num in data[0].split():
result, data = connection.uid('fetch', num, '(RFC822)')
if result == 'OK':
email_message = email.message_from_bytes(data[0][1])
print('From:' + email_message['From'])
print('To:' + email_message['To'])
print('Date:' + email_message['Date'])
print('Subject:' + str(email_message['Subject']))
print('Content:' + str(email_message.get_payload()[0]))
connection.close()
connection.logout()
# get recent one email
from imap_tools import MailBox
with MailBox('imap.mail.com').login('test#mail.com', 'password', 'INBOX') as mailbox:
for msg in mailbox.fetch(limit=1, reverse=True):
print(msg.date_str, msg.subject)
https://github.com/ikvk/imap_tools
this is work for me~
import imaplib
from email.parser import HeaderParser
M = imaplib.IMAP4_SSL('my.server')
user = 'username'
password = 'password'
M.login(user, password)
(retcode, messages) =M.search(None, 'ALL')
news_mail = get_mostnew_email(messages)
for i in news_mail :
data = M.fetch(i, '(BODY[HEADER])')
header_data = data[1][0][1]
parser = HeaderParser()
msg = parser.parsestr(header_data)
print msg['subject']
and this is get the newer email function :
def get_mostnew_email(messages):
"""
Getting in most recent emails using IMAP and Python
:param messages:
:return:
"""
ids = messages[0] # data is a list.
id_list = ids.split() # ids is a space separated string
#latest_ten_email_id = id_list # get all
latest_ten_email_id = id_list[-10:] # get the latest 10
keys = map(int, latest_ten_email_id)
news_keys = sorted(keys, reverse=True)
str_keys = [str(e) for e in news_keys]
return str_keys
Workaround for Gmail. Since the The IMAP.sort('DATE','UTF-8','ALL') does not work for gmail ,we can insert the values and date into a list and sort the list in reverse order of date. Can check for the first n-mails using a counter. This method will take a few minutes longer if there are hundreds of mails.
M.login(user,password)
rv,data= M.search(None,'ALL')
if rv=='OK':
msg_list=[]
for num in date[0].split():
rv,data=M.fetch(num,'(RFC822)')
if rv=='OK':
msg_object={}
msg_object_copy={}
msg=email.message_from_bytes(data[0][1])
msg_date=""
for val in msg['Date'].split(' '):
if(len(val)==1):
val="0"+val
# to pad the single date with 0
msg_date=msg_date+val+" "
msg_date=msg_date[:-1]
# to remove the last space
msg_object['date']= datetime.datetime.strptime(msg_date,"%a, %d %b %Y %H:%M:%S %z")
# to convert string to date time object for sorting the list
msg_object['msg']=msg
msg_object_copy=msg_object.copy()
msg_list.append(msg_object_copy)
msg_list.sort(reverse=True,key=lambda r:r['date'])
# sorts by datetime so latest mails are parsed first
count=0
for msg_obj in msg_list:
count=count+1
if count==n:
break
msg=msg_obj['msg']
# do things with the message
To get the latest mail:
This will return all the mail numbers contained inside the 2nd return value which is a list containing a bytes object:
imap.search(None, "ALL")[1][0]
This will split the bytes object of which the last element can be taken by accessing the negative index:
imap.search(None, "ALL")[1][0].split()[-1]
You may use the mail number to access the corresponding mail.

Categories

Resources