Pretty new to Python. My goal is to download only email attachments from certain senders of .xls and .docx filetypes to a specified folder. I have the sender conditions working but can't get the program to filter to the specific filetypes I want. The code below downloads all attachments from the listed senders including image signatures (not desired.) The downloaded attachments contain data that will be further used in a df. I'd like to keep it within win32com since I have other working email scraping programs that use it. I appreciate any suggestions.
Partially working code:
import win32com.client
# The name must match the one used on the next line (Python is case-sensitive).
outlook = win32com.client.Dispatch("Outlook.Application").GetNamespace("MAPI")
inbox = outlook.GetDefaultFolder(6)
Items = inbox.Items
Item = Items.GetFirst()
def saveAttachments(email:object):
    for attachedFile in email.Attachments:
        try:
            filename = attachedFile.FileName
            attachedFile.SaveAsFile("C:\\Outputfolder"+filename)
        except Exception as e:
            print(e)
for mailItem in inbox.Items:
    if mailItem.SenderName == "John Smith" or mailItem.SenderName == "Mike Miller":
        saveAttachments(mailItem)
Firstly, don't loop through all items in a folder - use Items.Find/FindNext or Items.Restrict with a query on the SenderName property - see https://learn.microsoft.com/en-us/office/vba/api/outlook.items.restrict
As for the attachment, an image attachment is not any different from any other attachment. You can check the file extension or the size. You can also read the PR_ATTACH_CONTENT_ID property (DASL name http://schemas.microsoft.com/mapi/proptag/0x3712001F) using Attachment.PropertyAccessor.GetProperty and check if it is used in an img tag in the MailItem.HTMLBody property.
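A rough Python sketch of the Restrict suggestion for the two senders from the question. The helper names build_sender_filter and items_from_senders are made up for illustration, and the quote-doubling escape is my understanding of the Jet filter syntax, so treat it as an assumption and check it against the Items.Restrict documentation:

```python
def build_sender_filter(sender_names):
    """Build a Jet-syntax filter string that matches any of the given sender names."""
    clauses = []
    for name in sender_names:
        # Assumption: a single quote inside a quoted value is escaped by doubling it.
        escaped = name.replace("'", "''")
        clauses.append("[SenderName] = '" + escaped + "'")
    return " OR ".join(clauses)


def items_from_senders(folder, sender_names):
    """Return only the items of `folder` (an Outlook folder object) sent by those senders."""
    return folder.Items.Restrict(build_sender_filter(sender_names))


# Usage with the inbox object from the question:
# for mailItem in items_from_senders(inbox, ["John Smith", "Mike Miller"]):
#     saveAttachments(mailItem)
```

This way Outlook does the filtering and the script only touches the matching messages.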
Currently you save all attached files on the disk:
for attachedFile in email.Attachments:
    try:
        filename = attachedFile.FileName
        attachedFile.SaveAsFile("C:\\Outputfolder"+filename)
    except Exception as e:
        print(e)
only email attachments from certain senders of .xls and .docx filetypes to a specified folder.
The Attachment.FileName property returns a string representing the file name of the attachment. So, parsing the filename by extracting the file extension will help you to filter files that should be saved on the disk.
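In Python the extension check could look roughly like this. It is only a sketch: the function names and the output_folder parameter are hypothetical, and the allowed extensions are simply the two named in the question:

```python
import os

# The two file types the question asks for; extend the set as needed.
ALLOWED_EXTENSIONS = {".xls", ".docx"}


def has_allowed_extension(filename, allowed=ALLOWED_EXTENSIONS):
    """Return True if the file name ends with one of the allowed extensions."""
    extension = os.path.splitext(filename)[1].lower()
    return extension in allowed


def save_matching_attachments(mail_item, output_folder):
    """Save only attachments with an allowed extension; return the saved paths."""
    saved = []
    for attachment in mail_item.Attachments:
        filename = attachment.FileName
        if has_allowed_extension(filename):
            target = os.path.join(output_folder, filename)
            attachment.SaveAsFile(target)
            saved.append(target)
    return saved
```

The comparison is case-insensitive, so REPORT.XLS is kept, while .xlsx files are skipped unless ".xlsx" is added to the set.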
Also you may be interested in avoiding hidden attachments used for inline images in the message body. Here is an example code in VBA (the Outlook object model is common for all programming languages, I am not familiar with Python) that counts the visible attachments:
Sub ShowVisibleAttachmentCount()
    Const PR_ATTACH_CONTENT_ID As String = "http://schemas.microsoft.com/mapi/proptag/0x3712001F"
    Const PR_ATTACHMENT_HIDDEN As String = "http://schemas.microsoft.com/mapi/proptag/0x7FFE000B"
    Dim m As MailItem
    Dim a As Attachment
    Dim pa As PropertyAccessor
    Dim c As Integer
    Dim cid As String
    Dim body As String
    c = 0
    Set m = Application.ActiveInspector.CurrentItem
    body = m.HTMLBody
    For Each a In m.Attachments
        Set pa = a.PropertyAccessor
        cid = pa.GetProperty(PR_ATTACH_CONTENT_ID)
        If Len(cid) > 0 Then
            If InStr(body, cid) Then
            Else
                'In case that PR_ATTACHMENT_HIDDEN does not exists,
                'an error will occur. We simply ignore this error and
                'treat it as false.
                On Error Resume Next
                If Not pa.GetProperty(PR_ATTACHMENT_HIDDEN) Then
                    c = c + 1
                End If
                On Error GoTo 0
            End If
        Else
            c = c + 1
        End If
    Next a
    MsgBox c
End Sub
Also you may check whether the message body (see the HTMLBody property of Outlook items) contains the PR_ATTACH_CONTENT_ID property value. If not, the attachment can be visible to users if the PR_ATTACHMENT_HIDDEN property is not set explicitly.
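For the Python side, the same visibility check might be sketched as below. The function name is_visible_attachment is made up; it only relies on Attachment.PropertyAccessor.GetProperty and the two DASL names from the VBA code. Treating a failed GetProperty call as "property not set" is an assumption on my part, mirroring the On Error Resume Next in the VBA:

```python
PR_ATTACH_CONTENT_ID = "http://schemas.microsoft.com/mapi/proptag/0x3712001F"
PR_ATTACHMENT_HIDDEN = "http://schemas.microsoft.com/mapi/proptag/0x7FFE000B"


def is_visible_attachment(attachment, html_body):
    """Return False for attachments that look like inline images or are flagged hidden."""
    accessor = attachment.PropertyAccessor
    try:
        content_id = accessor.GetProperty(PR_ATTACH_CONTENT_ID)
    except Exception:
        # Assumption: a failure here means the attachment has no content id.
        content_id = ""
    if not content_id:
        return True
    if content_id in html_body:
        # The body references this content id, so it is an embedded image.
        return False
    try:
        hidden = accessor.GetProperty(PR_ATTACHMENT_HIDDEN)
    except Exception:
        # Same convention as the VBA code: a missing property counts as False.
        hidden = False
    return not hidden


# Usage with a MailItem obtained through win32com:
# visible = [a for a in mailItem.Attachments
#            if is_visible_attachment(a, mailItem.HTMLBody)]
```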
Also you may find the Sending Outlook Email with embedded image using VBS thread helpful.
Related
I'm trying to write a Python script code wherein I’ll send email notifications to my team members on a daily basis.
There are two excel sheets, let's say abc.xlsx and def.xlsx.
I already have a script that updates these files and saves them. (These files abc and def are deleted and recreated with the same name but with updated information.)
Now my goal is to attach the file abc as an attachment in the mail and add the contents of def.xlsx in the email body.
I’m trying to achieve this:
Hello All,
Please find the pending lists here as follows:
///The info from def.xlsx sheet comes here///
Thanks and regards!
/// my outlook signature///
Here is my code:
import win32com.client as win32
import pandas as pd
# reading a file, which needs to be on mail body
df1 = pd.read_excel('def.xlsx')
html_table = df1.to_html(index=False)
outlook = win32.gencache.EnsureDispatch('Outlook.Application')
mail = outlook.CreateItem(0)
mail.To = 'mail#me.com'
mail.CC = 'mail#me.com'
mail.Subject = 'Test mail'
# path to signature should be User\AppData\Roaming\Microsoft\Signatures\signature.htm
pathToIMage = r'path_to_my_signature'
attachment = mail.Attachments.Add(pathToIMage)
attachment.PropertyAccessor.SetProperty("http://schemas.microsoft.com/mapi/proptag/0x3712001F", "MyId1")
# modify the mail body as per need
mail.Attachments.Add(Source="C:\..abc.xlsx")
body = "<p>Hi All, Please find the updates pending updates below:" + html_table + " <br>Thanks and regards <p> <figure><img src=""cid:MyId1""</figure>"
mail.HTMLBody = (body)
mail.Send()
Example:
This type of output I'm expecting
Challenges:
My signature shows up as a broken image with an "x" in it in the test email.
My Excel sheet, which has to be on the body, won't have the same format.
I’ve copied all the codes from Stack overflow only. I did some of my research, but I'm not getting the expected output.
First, you may try setting the BodyFormat property before setting up the HTMLBody property.
Second, to get the signature added to the message body you need to call the Display method before setting up the HTMLBody property.
Third, the <figure> element is not supported in Outlook because Word is used as an email editor and applies its own business rules to message bodies.
Fourth, the HTMLBody property returns or sets a string which represents the message body; it is expected to be a complete, well-formed HTML document. Try to build a well-formed HTML document first and then assign it to the property.
If you need to preserve formatting from Excel you may copy the table to the clipboard and then paste it using the Word object model.
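Putting the first, second and fourth points together, one possible Python sketch is shown here. The helper names are hypothetical, and inserting the new content right after the opening body tag of whatever HTML Outlook generated on Display is my own choice, not something the Outlook documentation prescribes:

```python
import re

OL_FORMAT_HTML = 2  # numeric value of olFormatHTML


def insert_after_body_tag(existing_html, fragment):
    """Insert `fragment` right after the opening <body> tag of `existing_html`."""
    match = re.search(r"<body[^>]*>", existing_html, flags=re.IGNORECASE)
    if match is None:
        # No <body> tag found: wrap both parts in a minimal well-formed document.
        return "<html><body>" + fragment + existing_html + "</body></html>"
    return existing_html[:match.end()] + fragment + existing_html[match.end():]


def fill_body_keeping_signature(mail, fragment):
    """Set the HTML body of `mail` while keeping what Outlook put there on Display."""
    mail.BodyFormat = OL_FORMAT_HTML
    mail.Display()  # per the second point, the signature is added on Display
    mail.HTMLBody = insert_after_body_tag(mail.HTMLBody, fragment)


# Usage with the objects from the question:
# fill_body_keeping_signature(mail, "<p>Hello All,</p>" + html_table)
```

Because the existing document is kept and only extended, the result stays one HTML document instead of two concatenated ones.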
Be aware that the Outlook object model supports three main ways of customizing the message body:
The Body property returns or sets a string representing the clear-text body of the Outlook item.
The HTMLBody property of the MailItem class returns or sets a string representing the HTML body of the specified item. Setting the HTMLBody property will always update the Body property immediately. For example:
Sub CreateHTMLMail()
    'Creates a new e-mail item and modifies its properties.
    Dim objMail As Outlook.MailItem
    'Create e-mail item
    Set objMail = Application.CreateItem(olMailItem)
    With objMail
        'Set body format to HTML
        .BodyFormat = olFormatHTML
        .HTMLBody = "<HTML><BODY>Enter the message text here. </BODY></HTML>"
        .Display
    End With
End Sub
The Word object model can be used for dealing with message bodies. See Chapter 17: Working with Item Bodies for more information.
Note, the MailItem.BodyFormat property allows you to programmatically change the editor that is used for the body of an item.
I modified it. I'm still working on Challenge 2. I'll just go through the documentation that has been recommended and will share my final script.
import win32com.client as win32
import pandas as pd
import os
import codecs
df1 = pd.read_excel('mail_body.xlsx')
html_table = df1.to_html(index=False)
# below is the coding logic for signature
sig_files_path = 'AppData\\Roaming\\Microsoft\\Signatures\\' + 'signature_file_name' + '_files\\'
sig_html_path = 'AppData\\Roaming\\Microsoft\\Signatures\\' + 'signature_file_name' + '.htm'
signature_path = os.path.join((os.environ['USERPROFILE']), sig_files_path)
html_doc = os.path.join((os.environ['USERPROFILE']), sig_html_path)
html_doc = html_doc.replace('\\\\', '\\')
html_file = codecs.open(html_doc, 'r', 'utf-8', errors='ignore')
signature_code = html_file.read()
signature_code = signature_code.replace(('signature_file_name' + '_files/'), signature_path)
html_file.close()
outlook = win32.gencache.EnsureDispatch('Outlook.Application')
mail = outlook.CreateItem(0)
mail.To = 'mail#me.com'
mail.CC = 'mail#me.com'
mail.Subject = 'TEST EMAIL'
mail.Attachments.Add(Source=r"C:\..abc.xlsx")
# modify the mail body as per need
mail.BodyFormat = 2
body = "<p>Hi All, Please find the updates pending updates below:" + html_table + " <br>Thanks and regards <br><br>"
mail.Display()
mail.HTMLBody = body + signature_code
mail.Send()
SOLVED ! :)
I used the following to move my mail from somewhere in my inbox into online archives with some important help mentioned below :
import win32com.client
import os
import sys
outlook = win32com.client.Dispatch('outlook.application')
mapi = outlook.GetNamespace("MAPI")
src = mapi.GetDefaultFolder(6).Folders["tobemoved"]
target = mapi.Folders["Online Archive - XXX"].Folders['Archive']
messages = src.Items
# Walk backwards so that moving an item does not shift the ones still to be visited.
# The collection is 1-based, so the range has to include index 1.
i = range(messages.Count, 0, -1)
for x in i:
    print(x)
    messages(x).Move(target)
I have an additional folder called
'Online-Archive-Same email address as "inbox" email '
that I currently can't locate. I tried to use this link to figure out its enumeration, but no luck.
As I must free up some disk space, I'd appreciate any help.
P.S.
I tried the conventional way, but with Outlook struggling with connection issues and 22k emails to be archived, Outlook just gave up on me :) Feel free to suggest anything that can resolve this issue.
You can access the Office 365 Online Archive folders like this:
Replace the example email with the exact email address you see in outlook.
import win32com.client
import win32com
app = win32com.client.gencache.EnsureDispatch("Outlook.Application")
outlook = app.GetNamespace("MAPI")
outlook_folder = outlook.Folders['Online Archive - Example#email.com'].Folders['Inbox']
item_count = outlook_folder.Items.Count
print(item_count)
180923
On the low (Extended MAPI) level (C++ or Delphi only), Online Archive is just another delegate Exchange mailbox. The only way to distinguish an archive mailbox from yet another delegate mailbox owned by some Exchange user is by reading the PR_PROFILE_ALTERNATE_STORE_TYPE property in the archive store profile section: retrieve the store entry id (PR_ENTRYID), then find the matching row in the session stores table (IMAPISession::GetMsgStoresTable). For the matching row (use IMAPISession::CompareEntryIDs), retrieve the PR_PROVIDER_UID property. Use its value to call IMAPISession::OpenProfileSection. Read the PR_PROFILE_ALTERNATE_STORE_TYPE property from the IProfSect object and check if its value is "Archive" (unlike the store name, it is not localized).
If Extended MAPI in C++ or Delphi is not an option, you can either
Try to find a matching store in the Namespace.Stores collection with the name starting with "Online Archive - " and the SMTP address of the user. Since that prefix is locale specific, that is not something I would use in production code.
Use Redemption (I am its author) - it exposes RDOExchangeMailboxStore.IsArchive property. If the archive store is not already opened in Outlook, you can also use RDOSession.GetArchiveMailbox. In VB script:
set rSession = CreateObject("Redemption.RDOSession")
rSession.MAPIOBJECT = Application.Session.MAPIOBJECT
userAddress = rSession.CurrentUser.SMTPAddress
set store = GetOpenArchiveMailboxForUser(userAddress)
if not store is Nothing Then
    MsgBox "Found archive store for " & userAddress
Else
    MsgBox "Could not find archive store for " & userAddress
End If

function GetOpenArchiveMailboxForUser(SmtpAddress)
    set GetOpenArchiveMailboxForUser = Nothing
    for each store in rSession.Stores
        if TypeName(store) = "RDOExchangeMailboxStore" Then
            Debug.Print store.Name & " - " & store.Owner.SMTPAddress & " - " & store.IsArchive
            if store.IsArchive and LCase(store.Owner.SMTPAddress) = LCase(SmtpAddress) Then
                set GetOpenArchiveMailboxForUser = store
                exit for
            End If
        End If
    next
end function
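For completeness, the first option (matching a store by name in Namespace.Stores) could be sketched in Python as follows. The function name find_archive_store is made up, and the "Online Archive - " prefix is the locale-specific assumption mentioned above, so this is not something to rely on in production code:

```python
def find_archive_store(stores, smtp_address, prefix="Online Archive - "):
    """Return the first store whose display name starts with the prefix and
    contains the given SMTP address, or None if no such store is open."""
    wanted_address = smtp_address.lower()
    for store in stores:
        name = store.DisplayName
        if name.startswith(prefix) and wanted_address in name.lower():
            return store
    return None


# Usage with win32com (replace the address with the one shown in Outlook):
# namespace = win32com.client.Dispatch("Outlook.Application").GetNamespace("MAPI")
# archive = find_archive_store(namespace.Stores, "<your address>")
# if archive is not None:
#     root = archive.GetRootFolder()
```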
I have a subfolder in Outlook. My objective is to go through all unread emails or the ones I received today in that folder and download all existing attachments in those emails on my desktop. So far, I've the following code:
def saveattachments(messages,today,path):
    for message in messages:
        if message.Unread or message.Senton.date() == today:
            attachments = message.Attachments
            attachment = attachments.Item(1)
            for attachment in message.Attachments:
                attachment.SaveAsFile(os.path.join(path, str(attachment)))
                if message.Unread:
                    message.Unread = False
                break
def main():
    path = '\\Desktop\Test Python Save Attachments Outlook'
    today = datetime.today().date()
    outlook = win32com.client.Dispatch("Outlook.Application").GetNamespace("MAPI")
    inbox = outlook.GetDefaultFolder(6)
    folder = inbox
    folderMessages = folder.Items
    messages = folderMessages
    saveattachments(messages,today,path)
    print ("Downloading Files successful.")
if __name__=="__main__":
    main()
The problem with the above code is that it downloads only one attachment from the email at the time. Also, it seems that it does favor PDF documents over Excel files, as it always first saves the former ones. Any ideas or suggestions on how the code might be corrected accordingly? Many thanks in advance!
You should never loop through all items in a folder - it is like a SELECT query without a WHERE clause. Inefficient to put it mildly.
Use Items.Restrict or Items.Find/FindNext with a query on Unread and SentOn property being in the range. You can also add a condition on the PR_HASATTACH MAPI property (DASL name "http://schemas.microsoft.com/mapi/proptag/0x0E1B000B")
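In Python, the restriction for "unread or sent today" might be built like this. The helper names are hypothetical, and the date string assumes a US-style locale (Outlook parses date strings in filters according to the system's regional settings), so adjust the format if yours differs:

```python
from datetime import date


def build_unread_or_today_filter(today):
    """Build a Jet-syntax filter: unread items, or items sent since midnight today."""
    # Assumption: US-style "mm/dd/yyyy hh:mm AM" is understood on this machine.
    stamp = "%02d/%02d/%04d 12:00 AM" % (today.month, today.day, today.year)
    return "[UnRead] = True OR [SentOn] >= '" + stamp + "'"


def unread_or_todays_items(folder, today=None):
    """Return the restricted Items collection of an Outlook folder object."""
    if today is None:
        today = date.today()
    return folder.Items.Restrict(build_unread_or_today_filter(today))


# Usage with the inbox object from the question:
# for message in unread_or_todays_items(inbox):
#     ...
```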
To make sure that all attached files are saved correctly you need to be sure that a unique name is passed to the SaveAsFile method. For example, the following code doesn't check whether such file already exists in the target folder:
for attachment in message.Attachments:
attachment.SaveAsFile(os.path.join(path, str(attachment)))
I'd suggest using the FileName property of the Attachment class and also add a unique ID to the filename. Note, you need to also make sure that only allowed symbols are used for the filename. See What characters are forbidden in Windows and Linux directory names? for more information.
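A small Python sketch of that idea: the function name unique_safe_path is hypothetical, the character class covers the characters Windows rejects in file names, and the short random suffix from uuid4 is just one way to get a unique ID:

```python
import os
import re
import uuid

# Characters that are not allowed in Windows file names, plus control characters.
FORBIDDEN_CHARS = re.compile(r'[<>:"/\\|?*\x00-\x1f]')


def unique_safe_path(folder, filename):
    """Return a path in `folder` with forbidden characters replaced and a unique suffix."""
    cleaned = FORBIDDEN_CHARS.sub("_", filename).strip(" .") or "attachment"
    stem, extension = os.path.splitext(cleaned)
    unique_id = uuid.uuid4().hex[:8]
    return os.path.join(folder, stem + "_" + unique_id + extension)


# Usage inside the loop from the question:
# for attachment in message.Attachments:
#     attachment.SaveAsFile(unique_safe_path(path, attachment.FileName))
```

Two attachments with the same FileName then end up in two different files instead of overwriting each other.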
My objective is to go through all unread emails or the ones I received today in that folder
As Dmitry noted, there is no need to iterate over all items in the folder. Instead, you need to find out only items that correspond to your conditions and only then iterate over them and save attached files.
To find all unread items from the Inbox folder you can use the following code (C#; I am not familiar with Python syntax, but the Outlook object model is common to all kinds of applications):
using System.Text;
using System.Diagnostics;
using System.Runtime.InteropServices; // needed for Marshal.ReleaseComObject
// ...
private void RestrictUnreadItems(Outlook.MAPIFolder folder)
{
    string restrictCriteria = "[UnRead] = true";
    StringBuilder strBuilder = null;
    Outlook.Items folderItems = null;
    Outlook.Items resultItems = null;
    Outlook._MailItem mail = null;
    int counter = default(int);
    object item = null;
    try
    {
        strBuilder = new StringBuilder();
        folderItems = folder.Items;
        resultItems = folderItems.Restrict(restrictCriteria);
        item = resultItems.GetFirst();
        while (item != null)
        {
            if (item is Outlook._MailItem)
            {
                counter++;
                mail = item as Outlook._MailItem;
                strBuilder.AppendLine("#" + counter.ToString() +
                    "\tSubject: " + mail.Subject);
            }
            Marshal.ReleaseComObject(item);
            item = resultItems.GetNext();
        }
        if (strBuilder.Length > 0)
            Debug.WriteLine(strBuilder.ToString());
        else
            Debug.WriteLine("There is no match in the "
                + folder.Name + " folder.");
    }
    catch (Exception ex)
    {
        System.Windows.Forms.MessageBox.Show(ex.Message);
    }
}
The Find/FindNext or Restrict methods of the Items class can be used for that. Read more about them in the following articles:
How To: Use Find and FindNext methods to retrieve Outlook mail items from a folder (C#, VB.NET)
How To: Use Restrict method to retrieve Outlook mail items from a folder
How To: Get unread Outlook e-mail items from the Inbox folder
I'm using Outlook 2010 - and have my main mailbox: name#company.com
I have also added another mailbox to my profile: mb data proc
Both appear as top level folders within Outlook:
name#company.com
-Inbox
-Sent Items
-Deleted Items
mb data proc
-Inbox
-Sent Items
-Deleted Items
I cannot create a different profile for the additional mailbox. It has been added in the same profile.
How do I get a reference to the Inbox in the "mb data proc" mailbox?
This is the same problem as described here Get reference to additional Inbox but this in VBS.
How to do in python?
outlook = win32com.client.Dispatch("Outlook.Application").GetNamespace("MAPI")
folder=outlook.Folders("mb data proc")
msg=folder.Items
msgs=msg.GetLast()
print msgs
I tried this but I get this error:
folder=outlook.Folders("mb data proc")
AttributeError: _Folders instance has no __call__ method
I had a similar doubt and as I understand it the solution stated here is for Python 2.7
I will try to make it understandable regarding how to operate it using Python 3.+ versions.
import win32com.client
outlook = win32com.client.Dispatch("Outlook.Application").GetNamespace("MAPI")
folder = outlook.Folders.Item("Mailbox Name")
inbox = folder.Folders.Item("Inbox")
msg = inbox.Items
msgs = msg.GetLast()
print (msgs)
print (msgs.Subject)
Since _Folders is not callable, you need to use the Folders.Item() method in Python 3+ to reference your mailbox.
Hope that was helpful. Thanks!
Here's a simple solution. I think the only part you missed was getting to the "Inbox" folder inside of "mb data proc".
outlook = win32com.client.Dispatch("Outlook.Application").GetNamespace("MAPI")
folder = outlook.Folders("mb data proc")
inbox = folder.Folders("Inbox")
msg = inbox.Items
msgs = msg.GetLast()
print msgs
I was trying to access Additional Mail Boxes and read the Inbox from these Shared folders
>>> import win32com.client
>>> outlook = win32com.client.Dispatch("Outlook.Application").GetNamespace("MAPI").Folders
>>> folder = outlook(1)
>>> inbox = folder.Folders("Inbox")
>>> message = inbox.Items
>>> messages = message.GetLast()
>>> body_content = messages.body
>>> print (body_content)
If you're looking for other mailboxes or separate PST files you have access to in Outlook, try using the Store / Stores MAPI objects.
import win32com.client
for stor in win32com.client.Dispatch("Outlook.Application").Session.Stores:
    print(stor.DisplayName)
PS: .Session returns the same reference as .GetNamespace("MAPI")
for reference https://learn.microsoft.com/en-us/office/vba/api/overview/outlook
Thank you for your posts!
Here is a function I pulled together based on your Input, to read out the available Folders:
This is my first post, so I hope I copied the code in properly:
def check_shared(namespace, recip=None):
    """Print the default folders that can be opened.

    namespace: the MAPI namespace, e.g.
        outlook = Dispatch("Outlook.Application").GetNamespace("MAPI")
    recip: optional address of a shared account as a string, e.g. "shared#shared.com".
        If omitted, the folders of the standard mailbox are printed.
    """
    if recip is None:
        for i in range(1, 100):
            try:
                inbox = namespace.GetDefaultFolder(i)
                print("%i %s" % (i, inbox))
            except Exception:
                # No default folder exists for this index.
                continue
    else:
        print('The folders from the following shared account will be printed: ' + recip)
        # CreateRecipient belongs to the namespace that was passed in.
        tmpRecipient = namespace.CreateRecipient(recip)
        for i in range(1, 100):
            try:
                inbox = namespace.GetSharedDefaultFolder(tmpRecipient, i)
                print("%i %s" % (i, inbox))
            except Exception:
                # This folder type is not shared or does not exist.
                continue
    print("Done")
Firstly, you can use Namespace.GetSharedDefaultFolder method.
Secondly, the line
folder=outlook.Folders("mb data proc")
needs to be
folder=outlook.Folders.Item("mb data proc")
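A minimal sketch of the GetSharedDefaultFolder route, assuming the additional mailbox can be resolved as a recipient by its name or address. The helper name get_shared_inbox is made up; CreateRecipient, Resolve and GetSharedDefaultFolder are the Outlook object model calls involved:

```python
OL_FOLDER_INBOX = 6  # numeric value of olFolderInbox


def get_shared_inbox(namespace, mailbox_name):
    """Return the Inbox of another mailbox, given the MAPI namespace object."""
    recipient = namespace.CreateRecipient(mailbox_name)
    # Resolve() returns False when the name cannot be matched to an address entry.
    if not recipient.Resolve():
        raise LookupError("Could not resolve recipient: " + mailbox_name)
    return namespace.GetSharedDefaultFolder(recipient, OL_FOLDER_INBOX)


# Usage:
# outlook = win32com.client.Dispatch("Outlook.Application").GetNamespace("MAPI")
# inbox = get_shared_inbox(outlook, "mb data proc")
# print(inbox.Items.GetLast().Subject)
```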
In python w/ Outlook 2007, using win32com and/or active_directory, how can I get a reference to a sub-folder so that I may move a MailItem to this sub-folder?
I have an inbox structure like:
Inbox
|
+-- test
|
`-- todo
I can access the inbox folder like:
import win32com.client
import active_directory
session = win32com.client.gencache.EnsureDispatch("MAPI.session")
win32com.client.gencache.EnsureDispatch("Outlook.Application")
outlook = win32com.client.Dispatch("Outlook.Application")
mapi = outlook.GetNamespace('MAPI')
inbox = mapi.GetDefaultFolder(win32com.client.constants.olFolderInbox)
print '\n'.join(dir(inbox))
But when I try to get subdirectory test per Microsoft's example the inbox object doesn't have the Folders interface or any way to get a subdirectory.
How can I get a Folder object which points to test subdir?
I realize this is an old question but I've been using the win32com package recently and found the documentation troublesome to say the least...My hope is that someone, someday can be saved the turmoil I experienced trying to wrap my head around MSDN's explanation
Here's an example of a Python script to traverse through Outlook folders, accessing e-mails where I please.
Disclaimer
I shifted around the code and took out some sensitive info so if you're trying to copy and paste it and have it run, good luck.
import win32com
import win32com.client
import string
import os
# the findFolder function takes the folder you're looking for as folderName,
# and tries to find it with the MAPIFolder object searchIn
def findFolder(folderName,searchIn):
    try:
        lowerAccount = searchIn.Folders
        for x in lowerAccount:
            if x.Name == folderName:
                print('found it %s' % x.Name)
                objective = x
                return objective
        return None
    except Exception as error:
        print("Looks like we had an issue accessing the searchIn object")
        print(error)
        return None
def main():
    outlook=win32com.client.Dispatch("Outlook.Application")
    ons = outlook.GetNamespace("MAPI")
    #this is the initial object you're accessing, IE if you want to access
    #the account the Inbox belongs to
    one = '<your account name here>#<your domain>.com'
    #Retrieves a MAPIFolder object for your account
    #Object functions and properties defined by MSDN at
    #https://msdn.microsoft.com/en-us/library/microsoft.office.interop.outlook.mapifolder_members(v=office.14).aspx
    Folder1 = findFolder(one,ons)
    #Now pass your MAPIFolder object to the same function along with the folder you're searching for
    Folder2 = findFolder('Inbox',Folder1)
    #Rinse and repeat until you have an object for the folder you're interested in
    Folder3 = findFolder('<your inbox subfolder>',Folder2)
    #This call returns a list of mailItem objects referring to all of the mailitems(messages) in the specified MAPIFolder
    messages = Folder3.Items
    #Iterate through the messages contained within our subfolder
    for xx in messages:
        try:
            #Treat xx as a singular object, you can print the body, sender, cc's and pretty much every aspect of an e-mail
            #In my case I was writing the body to .txt files to parse...
            print(xx.Subject,xx.Sender,xx.Body)
            #Using move you can move e-mails around programmatically, make sure to pass it a
            #MAPIFolder object as the destination, use findFolder() to get the object
            xx.Move(Folder3)
        except Exception as err:
            print("Error accessing mailItem")
            print(err)
if __name__ == "__main__":
    main()
PS Hope this doesn't do more harm than good.
Something that did work for me was iterating over the folder names. ( When I posted this question, I couldn't figure out the folder names ).
import win32com.client
import active_directory
session = win32com.client.gencache.EnsureDispatch("MAPI.session")
win32com.client.gencache.EnsureDispatch("Outlook.Application")
outlook = win32com.client.Dispatch("Outlook.Application")
mapi = outlook.GetNamespace('MAPI')
inbox = mapi.GetDefaultFolder(win32com.client.constants.olFolderInbox)
fldr_iterator = inbox.Folders
desired_folder = None
while 1:
    f = fldr_iterator.GetNext()
    if not f: break
    if f.Name == 'test':
        print 'found "test" dir'
        desired_folder = f
        break
print desired_folder.Name
This works for me to move a mail item into a "test" subdirectory (simplified by getting rid of gencache stuff):
import win32com.client
olFolderInbox = 6
olMailItem = 0
outlook = win32com.client.Dispatch("Outlook.Application")
mapi = outlook.GetNamespace('MAPI')
inbox = mapi.GetDefaultFolder(olFolderInbox)
item = outlook.CreateItem(olMailItem)
item.Subject = "test"
test_folder = inbox.Folders("test")
item.Move(test_folder)
So the below code will grab the "Last" item in the SOURCE folder and then move it to DEST folder. Sorry the code is a bit blunt I removed all additional features such as reading and saving the mail.
import win32com.client
inbox = win32com.client.gencache.EnsureDispatch("Outlook.Application").GetNamespace("MAPI")
source = inbox.GetDefaultFolder(6).Folders["FOLDER_NAME_SOURCE"]
dest = inbox.GetDefaultFolder(6).Folders["FOLDER_NAME_DEST"]
def moveMail(message):
    print("moving mail to done folder")
    message.Move(dest)
    print("MOVED")
def getMail():
    message = source.Items.GetLast()
    moveMail(message)
getMail()