How to make a time restriction in outlook using python? - python

I am making a program that:
opens outlook
find emails per subject
extract some date from emails (code and number)
fills these data in excel file in.
Standard email looks like this:
Subject: Test1
Hi,
You got a new answer from user Alex.
Code: alex123fj
Number1: 0611111111
Number2: 1020
Number3: 3032
I encounter 2 main problems in the process.
Firstly, I do not get how to make time restriction for emails in outlook. For example, if I want to read emails only from yesterday.
Secondly, all codes and numbers from email I save in lists. But every item gets this ["alex123fj/r"] in place from this ["alex123fj"]
I would appreciate any help or advice, that is my first ever program in Python.
Here is my code:
import win32com.client
import re
outlook = win32com.client.Dispatch("Outlook.Application").GetNamespace("MAPI")
inbox = outlook.Folders('myemail#....').Folders('Inbox')
messages = inbox.Items
def get_code(messages):
codes_lijst = []
for message in messages:
subject = message.subject
if subject == "Test1":
body = message.body
matches = re.finditer("Code:\s(.*)$", body, re.MULTILINE)
for match in matches:
codes_lijst.append(match.group(1))
return codes_lijst
def get_number(messages):
numbers_lijst = []
for message in messages:
subject = message.subject
if subject == "Test1":
body = message.body
matches = re.finditer("Number:\s(.*)$", body, re.MULTILINE)
for match in matches:
numbers_lijst.append(match.group(1))
return numbers_lijst
code = get_code(messages)
number = get_number(messages)
print(code)
print(number)

Firstly, never loop through all items in a folder. Use Items.Find/FindNext or Items.Restrict with a restriction on ConversationTopic (e.g. [ConversationTopic] = 'Test1').
To create a date/time restriction, add a range restriction ([ReceivedTime] > 'some value') and [ReceivedTime] < 'other value'

Related

Unable to loop correctly

I am working on assignment to extract emails from the mailbox.
Below are my codes, I am referencing from this case and combine with some other research online:
import win32com.client
import pandas as pd
import os
outlook = win32com.client.Dispatch("Outlook.Aplication").GetNamespace("MAPI")
inbox = outlook.GetDefaultFolder(6).Folders["Testmails"]
condition = pd.read_excel(r"C:\Users\Asus\Desktop\Python\Condition.xlsx", sheet_name = 'endword')
emails = condition.iloc[:,1].tolist()
done = outlook.GetDefaultFolder(6).Folders["Testmails"].Folders["Done"]
Item = inbox.Items.GetFirst()
add = Item.SenderEmailAddress
for attachment in Item.Attachments:
if any([add.endswith(m) for m in condition]) and Item.Attachments.Count > 0:
print(attachment.FileName)
dir = "C:\\Users\\Asus\\Desktop\\Python\\Output\\"
fname = attachment.FileName
outpath = os.path.join(dir, fname)
attachment.SaveAsFile(outpath)
Item.Move(done)
The code above is running, but it only saves the first email attachment, and the other email that matches the condition is not saving.
The condition file is like below, if is gmail to save in file A. But I am not sure if we can do by vlookup in loops.
mail end Directory
0 gmail.com "C:\\Users\\Asus\\Desktop\\Output\\A\\"
1 outlook.com "C:\\Users\\Asus\\Desktop\\Output\\A\\"
2 microsoft.com "C:\\Users\\Asus\\Desktop\\Output\\B\\"
Thanks for all the gurus who is helping much. I have edited the codes above but now is facing other issues on looping.
Fix Application on Dispatch("Outlook.Aplication") should be double p
On filter add single quotation mark round 'emails'
Example
Filter = "[SenderEmailAddress] = 'emails'"
for loop, you are using i but then you have print(attachment.FileName) / attachment.SaveAsFile
use i for all - print(i.FileName) / i.SaveAsFile or attachment
import win32com.client
Outlook = win32com.client.Dispatch("Outlook.Application")
olNs = Outlook.GetNamespace("MAPI")
Inbox = olNs.GetDefaultFolder(6)
Filter = "[SenderEmailAddress] = '0m3r#email.com'"
Items = Inbox.Items.Restrict(Filter)
Item = Items.GetFirst()
if Item.Attachments.Count > 0:
for attachment in Item.Attachments:
print(Item.Attachments.Count)
print(attachment.FileName)
attachment.SaveAsFile(r"C:\path\to\my\folder\Attachment.xlsx")
The 'NoneType' object has no attribute 'Attachments' error means that you're trying to get attachments from something that is None.
You're getting attachments in only one place:
for i in Item.Attachments:
...
so we can conclude that the Item here is None.
By looking at Microsoft's documentation we can see that the method...
Returns Nothing if no first object exists, for example, if there are no objects in the collection
Therefore, I'd imagine there's an empty collection, or no emails matching your filter
To handle this you could use an if statement
if Item is not None:
for i in Item.Attachments:
...
else:
pass # Do something here if there's nothing matching your filter

Python adding outlook color categories to specific emails with a loop

I am trying to add color categories to existing emails in a given outlook folder based on criterias such as email object and/or sender email address.
import win32com.client as client
import win32com
import pandas as pd
outlook = client.Dispatch("Outlook.Application").GetNamespace('MAPI')
main_account = outlook.Folders.Item(1)
second_account = outlook.Folders.Items(3)
df = pd.read_excel (r'C:\Python\test.xls')
df_outlook_folder = df['Outlook_folder'].tolist()
df_mail_object = df['Mail_object'].tolist()
out_iter_folder = main_account.Folders['Inbox'].Folders['TEST']
fixed_item_count = out_iter_folder.Items.Count
item_count = out_iter_folder.Items.Count
if yout_iter_folder.Items.Count > 0:
for i in reversed(range(0,item_count)):
message = out_iter_folder.Items[i]
for y,z in zip(df_mail_object,df_outlook_folder):
try:
if y in message.Subject:
message.Move(second_account.Folders['Inbox'].Folders['TESTED'].Folders[z]
except:
pass
item_count = out_iter_folder.Items.Count
print('Nb mails sorted:',fixed_item_count - item_count)
the code above enables me to move emails based on the mail object but I am not able so far to add a feature to also change the outlook color categories
I spent time on the following doc without success so far https://learn.microsoft.com/en-us/office/vba/api/outlook.categories
Categories is a delimited string of category names that have been assigned to an Outlook item.
mail.Categories='Red category'
mail.Save()
You may find the update categories in emails using python thread helpful.

How to scrape a link from a multipart email in python

I have a program which logs on to a specified gmail account and gets all the emails in a selected inbox that were sent from an email that you input at runtime.
I would like to be able to grab all the links from each email and append them to a list so that i can then filter out the ones i don't need before outputting them to another file. I was using a regex to do this which requires me to convert the payload to a string. The problem is that the regex i am using doesn't work for findall(), it only works when i use search() (I am not too familiar with regexes). I was wondering if there was a better way to extract all links from an email that doesn't involve me messing around with regexes?
My code currently looks like this:
print(f'[{Mail.timestamp}] Scanning inbox')
sys.stdout.write(Style.RESET)
self.search_mail_status, self.amount_matching_criteria = self.login_session.search(Mail.CHARSET,search_criteria)
if self.amount_matching_criteria == 0 or self.amount_matching_criteria == '0':
print(f'[{Mail.timestamp}] No mails from that email address could be found...')
Mail.enter_to_continue()
import main
main.main_wrapper()
else:
pattern = '(?P<url>https?://[^\s]+)'
prog = re.compile(pattern)
self.amount_matching_criteria = self.amount_matching_criteria[0]
self.amount_matching_criteria_str = str(self.amount_matching_criteria)
num_mails = re.search(r"\d.+",self.amount_matching_criteria_str)
num_mails = ((num_mails.group())[:-1]).split(' ')
sys.stdout.write(Style.GREEN)
print(f'[{Mail.timestamp}] Status code of {self.search_mail_status}')
sys.stdout.write(Style.RESET)
sys.stdout.write(Style.YELLOW)
print(f'[{Mail.timestamp}] Found {len(num_mails)} emails')
sys.stdout.write(Style.RESET)
num_mails = self.amount_matching_criteria.split()
for message_num in num_mails:
individual_response_code, individual_response_data = self.login_session.fetch(message_num, '(RFC822)')
message = email.message_from_bytes(individual_response_data[0][1])
if message.is_multipart():
print('multipart')
multipart_payload = message.get_payload()
for sub_message in multipart_payload:
string_payload = str(sub_message.get_payload())
print(prog.search(string_payload).group("url"))
Ended up using this for loop with a recursive function and a regex to get the links, i then removed all links without a the substring that you can input earlier on in the program before appending to a set
for message_num in self.amount_matching_criteria.split():
counter += 1
_, self.individual_response_data = self.login_session.fetch(message_num, '(RFC822)')
self.raw = email.message_from_bytes(self.individual_response_data[0][1])
raw = self.raw
self.scraped_email_value = email.message_from_bytes(Mail.scrape_email(raw))
self.scraped_email_value = str(self.scraped_email_value)
self.returned_links = prog.findall(self.scraped_email_value)
for i in self.returned_links:
if self.substring_filter in i:
self.link_set.add(i)
self.timestamp = time.strftime('%H:%M:%S')
print(f'[{self.timestamp}] Links scraped: [{counter}/{len(num_mails)}]')
The function used:
def scrape_email(raw):
if raw.is_multipart():
return Mail.scrape_email(raw.get_payload(0))
else:
return raw.get_payload(None,True)

Python read last 10 emails from Outlook

I can read my last email from my Outlook and send all the results according to each line's content.
However, I am unable to find the way to read my last 10 emails to be added to the fileCollect.txt file.
Any ideas how I could do this? Here is my current code:
import win32com.client
import csv
outlook = win32com.client.Dispatch("Outlook.Application").GetNamespace("MAPI")
inbox = outlook.GetDefaultFolder(6) # "6" refers to the index of a folder - in this case,
# the inbox. You can change that number to reference
# any other folder
messages = inbox.Items
message = messages.GetLast()
fileCollect = open("fileCollect.txt",'a')
delimiter = "¿"
fileCollect.write( str(message.Sender) + delimiter + str(message.Subject)+ delimiter + str(message.Body) )
fileCollect.close()
csvfile = open("csvfile.csv",'a')
with open("fileCollect.txt","r") as outfile:
for line in outfile:
if line.find("test") != -1:
csvfile.write(line)
csvfile.close()
The Items collection will not be sorted in any particular order until you actually sort it by calling Items.Sort. The VB script below sorts the collection by ReceivedTime in the descending order:
set messages = inbox.Items
messages.Sort("ReceivedTime", True)
set message = messages.GetFirst()
while not (message Is Nothing)
MsgBox message.Subject
set message = messages.GetNext()
wend
You can get the last 10 messages by specifying a negative index:
last_10_messages = messages[-10:]
This will return an array from messages[-10], which is the 10th to the last message, to the last message in the messages array.
use len(inbox.Items) to get the length of the inbox.
use inbox.Items.Item(i) to get i-th email in the inbox.
Ref:
https://learn.microsoft.com/en-us/office/vba/api/outlook.items.item

Getting n most recent emails using IMAP and Python

I'm looking to return the n (most likely 10) most recent emails from an email accounts inbox using IMAP.
So far I've cobbled together:
import imaplib
from email.parser import HeaderParser
M = imaplib.IMAP4_SSL('my.server')
user = 'username'
password = 'password'
M.login(user, password)
M.search(None, 'ALL')
for i in range (1,10):
data = M.fetch(i, '(BODY[HEADER])')
header_data = data[1][0][1]
parser = HeaderParser()
msg = parser.parsestr(header_data)
print msg['subject']
This is returning email headers fine, but it seems to be a semi-random collection of emails that it gets, not the 10 most recent.
If it helps, I'm connecting to an Exchange 2010 server. Other approaches also welcome, IMAP just seemed the most appropriate given that I only wanted to read the emails not send any.
The sort command is available, but it is not guaranteed to be supported by the IMAP server. For example, Gmail does not support the SORT command.
To try the sort command, you would replace:
M.search(None, 'ALL')
with
M.sort(search_critera, 'UTF-8', 'ALL')
Then search_criteria would be a string like:
search_criteria = 'DATE' #Ascending, most recent email last
search_criteria = 'REVERSE DATE' #Descending, most recent email first
search_criteria = '[REVERSE] sort-key' #format for sorting
According to RFC5256 these are valid sort-key's:
"ARRIVAL" / "CC" / "DATE" / "FROM" / "SIZE" / "SUBJECT" / "TO"
Notes:
1. charset is required, try US-ASCII or UTF-8 all others are not required to be supported by the IMAP server
2. search critera is also required. The ALL command is a valid one, but there are many. See more at http://www.networksorcery.com/enp/rfc/rfc3501.txt
The world of IMAP is wild and crazy. Good luck
This is the code to get the emailFrom, emailSubject, emailDate, emailContent etc..
import imaplib, email, os
user = "your#email.com"
password = "pass"
imap_url = "imap.gmail.com"
connection = imaplib.IMAP4_SSL(imap_url)
connection.login(user, password)
result, data = connection.uid('search', None, "ALL")
if result == 'OK':
for num in data[0].split():
result, data = connection.uid('fetch', num, '(RFC822)')
if result == 'OK':
email_message = email.message_from_bytes(data[0][1])
print('From:' + email_message['From'])
print('To:' + email_message['To'])
print('Date:' + email_message['Date'])
print('Subject:' + str(email_message['Subject']))
print('Content:' + str(email_message.get_payload()[0]))
connection.close()
connection.logout()
# get recent one email
from imap_tools import MailBox
with MailBox('imap.mail.com').login('test#mail.com', 'password', 'INBOX') as mailbox:
for msg in mailbox.fetch(limit=1, reverse=True):
print(msg.date_str, msg.subject)
https://github.com/ikvk/imap_tools
this is work for me~
import imaplib
from email.parser import HeaderParser
M = imaplib.IMAP4_SSL('my.server')
user = 'username'
password = 'password'
M.login(user, password)
(retcode, messages) =M.search(None, 'ALL')
news_mail = get_mostnew_email(messages)
for i in news_mail :
data = M.fetch(i, '(BODY[HEADER])')
header_data = data[1][0][1]
parser = HeaderParser()
msg = parser.parsestr(header_data)
print msg['subject']
and this is get the newer email function :
def get_mostnew_email(messages):
"""
Getting in most recent emails using IMAP and Python
:param messages:
:return:
"""
ids = messages[0] # data is a list.
id_list = ids.split() # ids is a space separated string
#latest_ten_email_id = id_list # get all
latest_ten_email_id = id_list[-10:] # get the latest 10
keys = map(int, latest_ten_email_id)
news_keys = sorted(keys, reverse=True)
str_keys = [str(e) for e in news_keys]
return str_keys
Workaround for Gmail. Since the The IMAP.sort('DATE','UTF-8','ALL') does not work for gmail ,we can insert the values and date into a list and sort the list in reverse order of date. Can check for the first n-mails using a counter. This method will take a few minutes longer if there are hundreds of mails.
M.login(user,password)
rv,data= M.search(None,'ALL')
if rv=='OK':
msg_list=[]
for num in date[0].split():
rv,data=M.fetch(num,'(RFC822)')
if rv=='OK':
msg_object={}
msg_object_copy={}
msg=email.message_from_bytes(data[0][1])
msg_date=""
for val in msg['Date'].split(' '):
if(len(val)==1):
val="0"+val
# to pad the single date with 0
msg_date=msg_date+val+" "
msg_date=msg_date[:-1]
# to remove the last space
msg_object['date']= datetime.datetime.strptime(msg_date,"%a, %d %b %Y %H:%M:%S %z")
# to convert string to date time object for sorting the list
msg_object['msg']=msg
msg_object_copy=msg_object.copy()
msg_list.append(msg_object_copy)
msg_list.sort(reverse=True,key=lambda r:r['date'])
# sorts by datetime so latest mails are parsed first
count=0
for msg_obj in msg_list:
count=count+1
if count==n:
break
msg=msg_obj['msg']
# do things with the message
To get the latest mail:
This will return all the mail numbers contained inside the 2nd return value which is a list containing a bytes object:
imap.search(None, "ALL")[1][0]
This will split the bytes object of which the last element can be taken by accessing the negative index:
imap.search(None, "ALL")[1][0].split()[-1]
You may use the mail number to access the corresponding mail.

Categories

Resources