Saving Outlook Message Files (.msg) including attachments in Python? - python

I need to save Outlook mails, including their attachments, as .msg files in Python. I currently use win32com.client and call message.SaveAs(path + name), which gives me a nice .msg file, but it does not include the attachments (if there are any). The attached files are visible through message.Attachments.Count and message.Attachments, but how can I create a single .msg file with the attachments included, the way it works when messages are exported straight from Outlook?

how can I create a .msg-file with the attachments included to store as one file which works when messages are exported straight from Outlook?
The Outlook object model doesn't provide anything for that. Probably the best you can do is save the attached files alongside your mail items (.msg). Use the Attachment.SaveAsFile method, which saves the attachment to the specified path.
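A minimal sketch of that approach, assuming `message` is a MailItem obtained through win32com. The function name, the `target_dir` and `name` parameters, and the file naming scheme are hypothetical; only SaveAs, Attachments, Count, Item, FileName and SaveAsFile come from the Outlook object model.

```python
import os

OL_MSG = 3  # OlSaveAsType.olMSG


def save_message_with_attachments(message, target_dir, name):
    """Save a MailItem as .msg and write each attachment next to it.

    Returns the list of paths that were written, .msg file first.
    """
    os.makedirs(target_dir, exist_ok=True)
    msg_path = os.path.join(target_dir, name + ".msg")
    message.SaveAs(msg_path, OL_MSG)

    saved = [msg_path]
    attachments = message.Attachments
    # Outlook collections are 1-based.
    for index in range(1, attachments.Count + 1):
        attachment = attachments.Item(index)
        # Hypothetical naming scheme: prefix the attachment with the mail name.
        att_path = os.path.join(target_dir, f"{name}_{attachment.FileName}")
        attachment.SaveAsFile(att_path)
        saved.append(att_path)
    return saved
```

Prefixing each attachment with the mail's name keeps files from different mails apart when they share one folder.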

Related

Using temporary files and folders in Web2py app

I am relatively new to web development and very new to using Web2py. The application I am currently working on is intended to take in a CSV upload from a user, then generate a PDF file based on the contents of the CSV, then allow the user to download that PDF. As part of this process I need to generate and access several intermediate files that are specific to each individual user (these files would be images, other pdfs, and some text files). I don't need to store these files in a database since they can be deleted after the session ends, but I am not sure the best way or place to store these files and keep them separate based on each session. I thought that maybe the subfolders in the sessions folder would make sense, but I do not know how to dynamically get the path to the correct folder for the current session. Any suggestions pointing me in the right direction are appreciated!
I was getting the error "TypeError: expected string or Unicode object, NoneType found", and I ended up storing only a link in the session to the uploaded document in the db (or, in your case, perhaps to the upload folder). I would store the file as a normal upload so processing can proceed, and then clear out the values and the file if it is not 'approved'.
If the information is not confidential, in similar circumstances I write the temporary files directly under /tmp.
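One way to keep such temporary files separate per session is the standard library's tempfile module. This is only a sketch: the function names and the `session_id` argument are hypothetical, and how the returned path is remembered (for example, in the session) is left to the application.

```python
import shutil
import tempfile


def make_session_dir(session_id):
    """Create a private temporary directory for one session and return its path."""
    # mkdtemp creates a uniquely named directory readable only by the current user.
    return tempfile.mkdtemp(prefix=f"session_{session_id}_")


def remove_session_dir(path):
    """Delete the directory and everything in it once the session is over."""
    shutil.rmtree(path, ignore_errors=True)
```

Storing the returned path in the session lets later requests find the intermediate images, PDFs and text files again, and a single rmtree call cleans them all up.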

Save Outlook attachment to memory instead of disk

I have a couple hundred daily Excel attachments in email that I want to pull appropriate data from and save into a database. I want to avoid saving each attachment to disk only to re-open from disk to read, since I'll never need the files saved to disk ever again. For this project, sure, I could just do it and delete them, but there ought to be a better way.
Here's what I'm doing so far
from win32com.client import Dispatch
outlook = Dispatch("Outlook.Application").GetNamespace("MAPI")
folder = outlook.Folders[blah].Folders[blahblah]
for item in folder.Items:
    for att in item.Attachments:
        att.SaveAsFile(???)  # This is where I need something cool, like stream or bytes or something that I don't understand
        # do something with the file, either read with pandas or openpyxl
If I can get around even doing the save and have pandas / openpyxl read it without saving, that would be great, but neither of them can read the att directly.
The Outlook object model won't let you do that: Attachment.SaveAsFile only accepts a valid file name.
On the Extended MAPI level (C++ or Delphi only), the one and only way to access attachment data (Extended MAPI does not know anything about files) is to open the PR_ATTACH_DATA_BIN MAPI property as an IStream interface: IAttach::OpenProperty(PR_ATTACH_DATA_BIN, IID_IStream, ...). You can then retrieve the data directly from the IStream interface.
If using Redemption (any language, I am its author) is an option, it exposes the RDOAttachment.AsStream / AsArray / AsText properties, which give access to the raw attachment data without saving it as a file first.
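If neither Extended MAPI nor Redemption is available, a workaround within the object model's limits is to let SaveAsFile write to a throwaway temporary file and hand the bytes on in memory. The sketch below does not avoid the disk write; it only hides it and cleans up afterwards. The function name is hypothetical, and `attachment` is assumed to be an Outlook Attachment object.

```python
import io
import os
import tempfile


def attachment_to_bytesio(attachment):
    """Return the attachment's content as an in-memory io.BytesIO buffer."""
    # SaveAsFile needs a real path, so write into a throwaway directory first.
    with tempfile.TemporaryDirectory() as tmp_dir:
        tmp_path = os.path.join(tmp_dir, "attachment.bin")
        attachment.SaveAsFile(tmp_path)
        with open(tmp_path, "rb") as handle:
            data = handle.read()
    # The temporary directory and the file in it are gone at this point.
    return io.BytesIO(data)
```

The returned buffer is a file-like object, so it can be passed to readers that accept one, such as pandas.read_excel or openpyxl.load_workbook.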

How to store attachment file in an email in a database?

I used this code to save attached .xlsx files from a specific email address in Outlook, but now I would like to store these files in a SQL Server database rather than in a folder on my laptop. Do you have any idea how to store these files directly in a database? Many thanks.
import os
outputDir = r"C:\Users\CMhalla\Desktop\Hellmann_attachment"
i = 0
for m in messages:
    if m.SenderEmailAddress == 'adress#outlook.com':
        body_content = m.Body
        for attachment in m.Attachments:
            i = i + 1
            attachment.SaveAsFile(os.path.join(outputDir, attachment.FileName + str(i) + '.xlsx'))
The Outlook object model doesn't provide any property or method for saving attachments to databases directly. You need to save the file to disk first and then add it to the database in any convenient way.
However, you may be interested in reading the byte array of the attached item in Outlook. In that case you can write the byte array directly to the database without going through the file system, which can slow things down. The PR_ATTACH_DATA_BIN property contains binary attachment data typically accessed through the Object Linking and Embedding (OLE) IStream interface. This property holds the attachment when the value of the PR_ATTACH_METHOD property is ATTACH_BY_VALUE, which is the usual attachment method and the only one required to be supported.
The Outlook object model cannot retrieve large binary or string MAPI properties using PropertyAccessor.GetProperty. On the low level (Extended MAPI), the IMAPIProp::GetProps() method does not work for large PT_STRING8 / PT_UNICODE / PT_BINARY properties. They must be opened as IStream in the following way: IMAPIProp::OpenProperty(PR_ATTACH_DATA_BIN, IID_IStream, ...). See "PropertyAccessor.GetProperty(PR_ATTACH_DATA_BIN) fails for outlook attachment" for more information.
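The save-to-disk-then-insert route can be sketched as follows. To stay self-contained, the example uses the standard library's sqlite3 as a stand-in for SQL Server; with SQL Server the same idea would go through a driver such as pyodbc and a VARBINARY(MAX) column. The table name, column names and function names are hypothetical.

```python
import os
import sqlite3
import tempfile


def open_store(path=":memory:"):
    """Open the database and make sure the (hypothetical) attachments table exists."""
    connection = sqlite3.connect(path)
    connection.execute(
        "CREATE TABLE IF NOT EXISTS attachments ("
        "id INTEGER PRIMARY KEY, file_name TEXT NOT NULL, content BLOB NOT NULL)"
    )
    return connection


def store_attachment(connection, attachment):
    """Save one Outlook attachment to a temp file, then insert its bytes as a BLOB."""
    with tempfile.TemporaryDirectory() as tmp_dir:
        tmp_path = os.path.join(tmp_dir, "attachment.bin")
        attachment.SaveAsFile(tmp_path)
        with open(tmp_path, "rb") as handle:
            data = handle.read()
    # Parameter binding passes the bytes through without any string formatting.
    connection.execute(
        "INSERT INTO attachments (file_name, content) VALUES (?, ?)",
        (attachment.FileName, data),
    )
    connection.commit()
```

The temporary file is deleted as soon as its bytes are read, so nothing is left behind in a folder on disk.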
You can use Microsoft Power Automate to save the attachment in the drive and then upload the file to the Python environment.

How to get a specific folder with GetDefaultFolder and delete unneeded folders that it has created

I was trying to figure out how to access my folders with a Python program (see this SO answer.) When I ran this:
import win32com.client
outlook = win32com.client.Dispatch("Outlook.Application")
namespace = outlook.GetNamespace("MAPI")
for i in range(50):
    try:
        print(i, namespace.GetDefaultFolder(i).Name)
    except Exception:
        pass
The above program revealed or created some folders that I cannot figure out how to delete, such as:
Reminders
the file so that changes to the file will be reflected in your item.
RSS Subscriptions
In addition to being unable to delete these folders, I still haven't actually found the folders I'm looking for programmatically. In Outlook, I have folders that I have created that are at the same level as Inbox, Sent Items, etc... but I don't know how to access the parent folder of these.
My folder structure:
▼ My email address
Inbox
Drafts
Sent Items
...
Folder I want to find
...
the file so that changes to the file will be reflected in your item.
Reminders
RSS Subscriptions
Search Folders
GetDefaultFolder's argument is an enumeration. You can either use a numeric value that's courteously given in the doc,
or, as per Accessing enumaration constants in Excel COM using Python and win32com, access it via the symbolic value:
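import win32com.client
# EnsureDispatch is only needed once per machine; after that, a regular Dispatch will do
o = win32com.client.gencache.EnsureDispatch("Outlook.Application")
from win32com.client import constants
# GetDefaultFolder is a method of the MAPI namespace, not of the Application object
o.GetNamespace("MAPI").GetDefaultFolder(constants.olFolderContacts)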
As you could see, accessing a default folder that didn't yet exist creates it. See e.g. How to Hide or Delete Outlook's Default Folders on how to deal with them.
Pass GetDefaultFolder a specific value from the OlDefaultFolders enumeration instead of iterating over all possible values.
You can't delete IPM folders such as Inbox, Outbox, and so on using the Outlook object model.
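For the folders that sit at the same level as Inbox, one possible approach is to ask for the Inbox by its enumeration value and then walk up to its parent. This is a sketch, not something confirmed by the answers above: the function name is hypothetical, and `namespace` is assumed to be the object returned by GetNamespace("MAPI").

```python
OL_FOLDER_INBOX = 6  # OlDefaultFolders.olFolderInbox


def find_top_level_folder(namespace, folder_name):
    """Return the folder named `folder_name` that sits next to Inbox, or None."""
    inbox = namespace.GetDefaultFolder(OL_FOLDER_INBOX)
    # The Inbox's parent is the root of the mailbox ("My email address" above).
    root = inbox.Parent
    for folder in root.Folders:
        if folder.Name == folder_name:
            return folder
    return None
```

Because only the Inbox is requested, no additional default folders get created along the way.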

Reading attributes of .msg file

I am trying to read a .msg file to get the sender, recipients, and title.
I'm making this script for my workplace where I'm only allowed to install default python libraries so I want to use the email module to do this.
On the python website I found some examples of using the email module. https://docs.python.org/3/library/email.examples.html
Near the end of the page it talks about getting the sender, subject and recipient. I've tried using this code like this:
# Import the email modules we'll need
from email import policy
from email.parser import BytesParser
with open('test_email.msg', 'rb') as fp:
    msg = BytesParser(policy=policy.default).parse(fp)
# Now the header items can be accessed as a dictionary, and any non-ASCII will
# be converted to unicode:
print('To:', msg['to'])
print('From:', msg['from'])
print('Subject:', msg['subject'])
This results in an output:
To: None
From: None
Subject: None
I checked the file test_email.msg, it is a valid email.
When I add a line of code
print(msg)
I get an output of a garbled email the same as if I opened the .msg file in notepad.
Can anybody suggest why the email module isn't finding the sender/recipient/subject correctly?
You are apparently attempting to read some sort of proprietary binary format. The Python email library does not support this; it only handles traditional (basically text) RFC822 / RFC5322 format.
To read Microsoft's OLE formats, you will need a third-party module, and some patience, voodoo, and luck.
Also, for the record, there is no unambiguous definition of .msg. Outlook uses this file extension for its files, but it is also used for files in other formats, including traditional RFC822 files.
(The second link attempts to link to the MS-OXMSG spec on MSDN; but Microsoft have in the past regarded URLs as some sort of depletable resource which runs out when you use it, so the link will probably stop working if enough people click on it.)
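Using only the standard library, you can at least tell the two kinds of .msg file apart before parsing. The sketch below relies on the fact that OLE compound files start with the 8-byte signature D0 CF 11 E0 A1 B1 1A E1; the function names are hypothetical, and the check says nothing about files in yet other formats.

```python
from email import policy
from email.parser import BytesParser

# Signature at the start of every OLE compound file (the container Outlook uses).
OLE_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")


def looks_like_ole_file(path):
    """Return True if the file starts with the OLE compound file signature."""
    with open(path, "rb") as handle:
        return handle.read(len(OLE_MAGIC)) == OLE_MAGIC


def read_headers(path):
    """Parse To/From/Subject from an RFC822-style file; refuse OLE files."""
    if looks_like_ole_file(path):
        raise ValueError(
            "OLE compound file (Outlook .msg); the email module cannot parse it"
        )
    with open(path, "rb") as handle:
        msg = BytesParser(policy=policy.default).parse(handle, headersonly=True)
    return {"to": msg["to"], "from": msg["from"], "subject": msg["subject"]}
```

Failing loudly on an OLE file is more helpful than the silent None values the parser returns when it is fed binary data.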
