I am using Microsoft's CDO (Collaboration Data Objects) to programmatically read mail from an Outlook mailbox and save embedded image attachments. I'm trying to do this from Python using the Win32 extensions, but samples in any language that uses CDO would be helpful.
So far, I am here...
The following Python code will read the last email in my mailbox, print the names of the attachments, and print the message body:
from win32com.client import Dispatch
session = Dispatch('MAPI.session')
session.Logon('','',0,1,0,0,'exchange.foo.com\nbar');
inbox = session.Inbox
message = inbox.Messages.Item(inbox.Messages.Count)
for attachment in message.Attachments:
print attachment
print message.Text
session.Logoff()
However, the attachment names are things like: "zesjvqeqcb_chart_0". Inside the email source, I see image source links like this:
<IMG src="cid:zesjvqeqcb_chart_0">
So, is it possible to use this CID URL (or anything else) to extract the actual image and save it locally?
Difference in versions of OS/Outlook/CDO is what might be the source of confusion, so here are the steps to get it working on WinXP/Outlook 2007/CDO 1.21:
install CDO 1.21
install win32com.client
goto C:\Python25\Lib\site-packages\win32com\client\ directory run the following:
python makepy.py
from the list select "Microsoft CDO 1.21 Library (1.21)", click ok
C:\Python25\Lib\site-packages\win32com\client>python makepy.py
Generating to C:\Python25\lib\site-packages\win32com\gen_py\3FA7DEA7-6438-101B-ACC1-00AA00423326x0x1x33.py
Building definitions from type library...
Generating...
Importing module
Examining file 3FA7DEA7-6438-101B-ACC1-00AA00423326x0x1x33.py that's just been generated, will give you an idea of what classes, methods, properties and constants are available.
Now that we are done with the boring steps, here is the fun part:
import win32com.client
from win32com.client import Dispatch
session = Dispatch('MAPI.session')
session.Logon ('Outlook') # this is profile name
inbox = session.Inbox
messages = session.Inbox.Messages
message = inbox.Messages.GetFirst()
if(message):
attachments = message.Attachments
for i in range(attachments.Count):
attachment = attachments.Item(i + 1) # yep, indexes are 1 based
filename = "c:\\tmpfile" + str(i)
attachment.WriteToFile(FileName=filename)
session.Logoff()
Same general approach will also work if you have older version of CDO (CDO for win2k)
Related
SOLVED ! :)
I used the following to move my mail from somewhere in my inbox into online archives with some important help mentioned below :
import win32com
import os
import sys
outlook = win32com.client.Dispatch('outlook.application')
mapi = outlook.GetNamespace("MAPI")
src = mapi.GetDefaultFolder(6).Folders["tobemoved"]
target = mapi.Folders["Online Archive - XXX"].Folders['Archive']
messages = src.Items
i = range(messages.count, 1, -1)
for x in i:
print(x)
messages(x).Move(target)
`
I have additional folder called
'Online-Archive-Same email address as "inbox" email '
that i currently can't locate it tried to use this link to figure out the enumeration of it . but no luck ..
as i must free up some disk space ill appreciate any help given.
P.S
tried the conventional way - with outlook struggling with connection issues and 22k email to be moved to be archived outlook just giving up on me :) feel free to advise anything that can resolve this issue.
You can access the Office 365 Online Archive folders like this:
Replace the example email with the exact email address you see in outlook.
import win32com.client
import win32com
app = win32com.client.gencache.EnsureDispatch("Outlook.Application")
outlook = app.GetNamespace("MAPI")
outlook_folder = outlook.Folders['Online Archive - Example#email.com'].Folders['Inbox']
item_count = outlook_folder.Items.Count
print(item_count)
180923
On the low (Extended MAPI) level (C++ or Delphi only), Online Archive is just another delegate Exchange mailbox. The only way to distinguish an archive mailbox from yet another delegate mailbox owned by some Exchange user is by reading PR_PROFILE_ALTERNATE_STORE_TYPE property in the archive store profile section - retrieve the store entry id (PR_ENTRYID), then find the matching row in the session stores table (IMAPISession::GetMsgStoresTable). For the matching row (use IMAPISession::CompareEntryIDs), retrieve PR_PROVIDER_UID property. Use its value to call IMAPISession.OpenProfileSection. Read PR_PROFILE_ALTERNATE_STORE_TYPE property from the IProfSect object and check if its value is "Archive" (unlike the store name, is not localized).
If Extended MAPI in C++ or Delphi is not an option, you can either
Try to find a matching store in the Namespace.Stores collection with the name starting with "Online Archive - " and the SMTP address of the user. Since that prefix is locale specific, that is not something I would use in production code.
Use Redemption (I am its author) - it exposes RDOExchangeMailboxStore.IsArchive property. If the archive store is not already opened in Outlook, you can also use RDOSession.GetArchiveMailbox. In VB script:
set rSession = CreateObject("Redemption.RDOSession")
rSession.MAPIOBJECT = Application.Session.MAPIOBJECT
userAddress = rSession.CurrentUser.SMTPAddress
set store = GetOpenArchiveMailboxForUser(userAddress)
if not store is Nothing Then
MsgBox "Found archive store for " & userAddress
Else
MsgBox "Could not find archive store for " & userAddress
End If
function GetOpenArchiveMailboxForUser(SmtpAddress)
set GetOpenArchiveMailboxForUser = Nothing
for each store in rSession.Stores
if TypeName(store) = "RDOExchangeMailboxStore" Then
Debug.Print store.Name & " - " & store.Owner.SMTPAddress & " - " & store.IsArchive
if store.IsArchive and LCase(store.Owner.SMTPAddress) = LCase(SmtpAddress) Then
set GetOpenArchiveMailboxForUser = store
exit for
End If
End If
next
end function
Pretty new to Python. My goal is to download only email attachments from certain senders of .xls and .docx filetypes to a specified folder. I have the sender conditions working but can't get the program to filter to the specific filetypes I want. The code below downloads all attachments from the listed senders including image signatures (not desired.) The downloaded attachments contain data that will be further used in a df. I'd like to keep it within win32com since I have other working email scraping programs that use it. I appreciate any suggestions.
Partially working code:
import win32com.client
Outlook = win32com.client.Dispatch("Outlook.Application").GetNamespace("MAPI")
inbox = outlook.GetDefaultFolder(6)
Items = inbox.Items
Item = Items.GetFirst()
def saveAttachments(email:object):
for attachedFile in email.Attachments:
try:
filename = attachedFile.FileName
attachedFile.SaveAsFile("C:\\Outputfolder"+filename)
except Exception as e:
print(e)
for mailItem in inbox.Items:
if mailItem.SenderName == "John Smith" or mailItem.SenderName == "Mike Miller":
saveAttachments(mailItem)
Firstly, don't loop through all item in a folder - use Items.Find/FindNext or Items.Restrict with a query on the SenderName property - see https://learn.microsoft.com/en-us/office/vba/api/outlook.items.restrict
As for the attachment, a image attachment is not any different from any other attachment. You can check the file extension or the size. You can also read the PR_ATTACH_CONTENT_ID property (DASL name http://schemas.microsoft.com/mapi/proptag/0x3712001F) using Attachment.PropertyAccessor.GetProperty and check if it is used in an img tag in the MailItem.HTMLBody property.
Currently you save all attached files on the disk:
for attachedFile in email.Attachments:
try:
filename = attachedFile.FileName
attachedFile.SaveAsFile("C:\\Outputfolder"+filename)
except Exception as e:
print(e)
only email attachments from certain senders of .xls and .docx filetypes to a specified folder.
The Attachment.FileName property returns a string representing the file name of the attachment. So, parsing the filename by extracting the file extension will help you to filter files that should be saved on the disk.
Also you may be interested in avoiding hidden attachments used for inline images in the message body. Here is an example code in VBA (the Outlook object model is common for all programming languages, I am not familiar with Python) that counts the visible attachments:
Sub ShowVisibleAttachmentCount()
Const PR_ATTACH_CONTENT_ID As String = "http://schemas.microsoft.com/mapi/proptag/0x3712001F"
Const PR_ATTACHMENT_HIDDEN As String = "http://schemas.microsoft.com/mapi/proptag/0x7FFE000B"
Dim m As MailItem
Dim a As Attachment
Dim pa As PropertyAccessor
Dim c As Integer
Dim cid as String
Dim body As String
c = 0
Set m = Application.ActiveInspector.CurrentItem
body = m.HTMLBody
For Each a In m.Attachments
Set pa = a.PropertyAccessor
cid = pa.GetProperty(PR_ATTACH_CONTENT_ID)
If Len(cid) > 0 Then
If InStr(body, cid) Then
Else
'In case that PR_ATTACHMENT_HIDDEN does not exists,
'an error will occur. We simply ignore this error and
'treat it as false.
On Error Resume Next
If Not pa.GetProperty(PR_ATTACHMENT_HIDDEN) Then
c = c + 1
End If
On Error GoTo 0
End If
Else
c = c + 1
End If
Next a
MsgBox c
End Sub
Also you may check whether the message body (see the HTMLBody property of Outlook items) contains the PR_ATTACH_CONTENT_ID property value. If not, the attached can be visible to users if the PR_ATTACHMENT_HIDDEN property is not set explicitly.
Also you may find the Sending Outlook Email with embedded image using VBS thread helpful.
I am working on a two stages Digital Forensics project, and on stage one, I need to extract all the messages stored on several outlook's PST/OST files, and save them as MSG files in a folder hierarchy like pstFilename\inbox, draft, sent... for each PST file in the sample.
For stage two, now completed, I am using python (3.x) and the Win32Com module to traverse all subfolder inside the target folder, search and hash every MSG file, parse a number of MSG properties and finally, create a CSV report. I found plenty of documentation and code samples to parse a MSG file using python and the Win32Com module, but not so much on how to parse a single PST file other than the PST file associated to Outlook's user profile on the local computer.
I am looking for a way to open a PST file using the win32Com module, traverse all folders in it, and export/save every message as a MSG file to the corresponding pstfilename_folder\subfolder.
There is a very straightforward method to access MSG files:
import win32com.client
outlook = win32com.client.Dispatch("Outlook.Application").GetNamespace("MAPI")
msg = outlook.OpenSharedItem(r"/test_files/test.msg")
print(msg.SenderName)
print(msg.SenderEmailAddress)
print(msg.SentOn)
print(msg.To)
print(msg.CC)
print(msg.BCC)
print(msg.Subject)
print(msg.Body)
count_attachments = msg.Attachments.Count
if count_attachments > 0:
for item in range(count_attachments):
print(msg.Attachments.Item(item + 1).Filename)
del outlook, msg
Is there any equivalent method to access and manipulate a PST file using the win32com module?
I found this link: https://learn.microsoft.com/en-us/dotnet/api/microsoft.office.interop.outlook.store?view=outlook-pia
but I not sure how to use it in python...
This is something that I want to do for my own application. I was able to piece together a solution from these sources:
https://gist.github.com/attibalazs/d4c0f9a1d21a0b24ff375690fbb9f9a7
https://github.com/matthewproctor/OutlookAttachmentExtractor
https://learn.microsoft.com/en-us/office/vba/api/outlook.namespace
My solution doesn't save the .msg files as you request in your question, but unless you have a secondary use for outputting the files this solution should save you a step.
import win32com.client
def find_pst_folder(OutlookObj, pst_filepath) :
for Store in OutlookObj.Stores :
if Store.IsDataFileStore and Store.FilePath == pst_filepath :
return Store.GetRootFolder()
return None
def enumerate_folders(FolderObj) :
for ChildFolder in FolderObj.Folders :
enumerate_folders(ChildFolder)
iterate_messages(FolderObj)
def iterate_messages(FolderObj) :
for item in FolderObj.Items :
print("***************************************")
print(item.SenderName)
print(item.SenderEmailAddress)
print(item.SentOn)
print(item.To)
print(item.CC)
print(item.BCC)
print(item.Subject)
count_attachments = item.Attachments.Count
if count_attachments > 0 :
for att in range(count_attachments) :
print(item.Attachments.Item(att + 1).Filename)
Outlook = win32com.client.Dispatch("Outlook.Application").GetNamespace("MAPI")
pst = r"C:\Users\Joe\Your\PST\Path\example.pst"
Outlook.AddStore(pst)
PSTFolderObj = find_pst_folder(Outlook,pst)
try :
enumerate_folders(PSTFolderObj)
except Exception as exc :
print(exc)
finally :
Outlook.RemoveStore(PSTFolderObj)
I use it in my work MSG PY module from independent soft and it turns out great for now.
This is Microsoft Outlook .msg file module for Python.
The module allows you to easy create/read/parse/convert Outlook .msg files.
For example:
from independentsoft.msg import Message
appointment = Message("e:\\appointment.msg")
print("subject: " + str(appointment.subject))
print("start_time: " + str(appointment.appointment_start_time))
print("end_time: " + str(appointment.appointment_end_time))
print("location: " + str(appointment.location))
print("is_reminder_set: " + str(appointment.is_reminder_set))
print("sender_name: " + str(appointment.sender_name))
print("sender_email_address: " + str(appointment.sender_email_address))
print("display_to: " + str(appointment.display_to))
print("display_cc: " + str(appointment.display_cc))
print("body: " + str(appointment.body))
I would like to create an hyperlink in the body of a task created through win32com.
This is my code so far:
outlook = win32com.client.Dispatch("Outlook.Application")
outlook_task_item = 3
recipient = "my_email#site.com"
task = outlook.CreateItem(outlook_task_item)
task.Subject = "hello world"
task.Body = "please update the file here"
task.DueDate = dt.datetime.today()
task.ReminderTime = dt.datetime.today()
task.ReminderSet = True
task.Save()
I have tried to set the property task.HTMLBody but I get the error:
AttributeError: Property 'CreateItem.HTMLBody' can not be set.
I have also tried
task.Body = "Here is the <a href='http://www.python.org'>link</a> I need"
but I am not getting a proper hyperlink.
However if I create a task front end in Outlook, I am able to add hyperlinks.
You can also try:
task.HTMLBody = "Here is the <a href='http://www.python.org'>link</a> I need"
this will overwrite data in 'task.Body' to the HTML format provides in 'task.HTMLBody'
so whichever (Body or HTMLBody) is last will be taken as the Body of the mail.
Tasks do not support HTML. Instead, you have to provide RTF.
You can investigate -- but not set -- the RTF of a given task through task.RTFBody (and task.RTFBody.obj to get a convenient view of it). To use RTF in the body of a task, simply use the task.Body property; setting this to a byte array containing RTF will automatically use that RTF in the body. Concretely, to get the body you want, you could let
task.Body = rb'{\rtf1{Here is the }{\field{\*\fldinst { HYPERLINK "https://www.python.org" }}{\fldrslt {link}}}{ I need}}'
I have windows 10 environment with Python 2.7, win32com package 219 is installed.
I was able to run below code which runs a macro in excel and generate a pie chart that will get attached(also get embedded in email body) to email and sent.
This program was working fine, earlier, however after some windows update, the same is giving AttributeError: olEmbeddeditem, i have imported win32com.client and its constant.
Want the embedded image in the email body, so replacing olEmbeddeditem with olByValue, etc. will not help, i think, though i have tried, which also didn't worked.
I have also done reinstallation of win32com package of python, however problem persist.
Earlier working code does not included "from win32com.client import constants", however since it was not working, have thought of adding this line, but this too didn't helped.
Any help would be appreciated.
import sys
import os
import win32com.client
import codecs
from win32com.client import constants
sys.stdout = codecs.getwriter("iso-8859-1")(sys.stdout, 'xmlcharrefreplace')
outlook = win32com.client.Dispatch("Outlook.Application").GetNamespace("MAPI")
inbox = outlook.GetDefaultFolder(6)
all_inbox = inbox.Items
folders = inbox.Folders
olMailItem = 0x0
obj = win32com.client.Dispatch("Outlook.Application")
xlApp = win32com.client.Dispatch("Excel.Application")
ExcelWorkBook = xlApp.Workbooks.Open('C:\Users\xxx\Desktop\data.xlsm')
xlSheet1 = ExcelWorkBook.Sheets("Sheet1")
xlApp.Application.Run("data.xlsm!Macro1")
chart1 = xlSheet1.ChartObjects(1)
chart1.Chart.Export("C:\Users\xxx\Desktop\photo.gif", "GIF", False)
xlApp.Workbooks(1).Close(SaveChanges=0)
xlApp.Application.Quit()
newMail = obj.CreateItem(olMailItem)
newMail.Subject = "Presentation of Automation"
attachment = newMail.Attachments.Add("C:\Users\xxx\Desktop\photo.gif", win32com.client.constants.olEmbeddeditem, 0, "photo")
imageCid = "photo.gif"
attachment.PropertyAccessor.SetProperty("http://schemas.microsoft.com/mapi/proptag/0x3712001E", imageCid)
newMail.HTMLBody = "<body>Dear Sir,Madam,<br>Please find the requested details.<br><br><p><img src=\"cid:{0}\"></body>".format(imageCid)
newMail.To = x
attachment1 = "C:\Users\xxx\Desktop\photo.gif"
newMail.Attachments.Add(attachment1)
newMail.Send()
os.remove("C:\Users\xxx\Desktop\photo.gif")
msg.UnRead = False
The root cause of the issue was not a Windows update as suspected, however it was because of a group email in the Inbox which was giving the error. After deleting that group mail or moving to different folder than Inbox the issue got resolved. Still not sure about the reason why it was giving the error and what is the way out going forward to ensure that such emails does not end up into a traceback.
The main reason for this attribute error is because your COM-server has shifted from late-binding (dynamic) to early binding (static).
Delete the gen_py folder in Temp which will revert the Dispatch to dynamic from static and your code should work fine.
instead of using
attachment = newMail.Attachments.Add("C:\Users\xxx\Desktop\photo.gif", win32com.client.constants.olEmbeddeditem, 0, "photo")
you can do
attachment = newMail.Attachments.Add("C:\Users\xxx\Desktop\photo.gif", 0x5, 0, "photo")