Using ExtractMsg in a loop? - python

I am trying to write a script that will extract details from Outlook .msg files and append them to a .csv file. ExtractMsg (https://github.com/mattgwwalker/msg-extractor) will process the messages one at a time at the command line with 'python ExtractMsg.py message', but I can't work out how to use this to loop through all the messages in the directory.
I have tried:
import ExtractMsg
import glob
for message in glob.glob('*.msg'):
    print 'Reading', message
    ExtractMsg(message)
This gives "'module' object is not callable". I have tried to look at the ExtractMsg module but the structure of it is beyond me at the moment. How can I make the module callable?

ExtractMsg(message)
You are trying to call a module object, which is exactly what the error message is telling you.
Perhaps you need to use the ExtractMsg.Message class instead:
msg = ExtractMsg.Message(message)
You will find a usage example at the very bottom of this file:
https://github.com/mattgwwalker/msg-extractor/blob/master/ExtractMsg.py

Thanks all - the following sorted it:
import ExtractMsg
import glob
for message in glob.glob('*.msg'):
    print 'Reading', message
    msg = ExtractMsg.Message(message)
    body = msg._getStringStream('__substg1.0_1000')
    sender = msg._getStringStream('__substg1.0_0C1F')
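The original goal was to append the extracted details to a .csv file. A minimal sketch of that step, assuming a hypothetical read_fields callable (not part of ExtractMsg) that takes a file path and returns a (sender, body) pair, for example by wrapping the ExtractMsg.Message calls above:

```python
import csv
import glob
import os


def append_message_rows(csv_path, pattern, read_fields):
    """Append one CSV row per file matching pattern; return the number of rows written.

    read_fields is a hypothetical callable: path -> (sender, body).
    """
    paths = sorted(glob.glob(pattern))
    # Mode "a" appends, so rows from earlier runs are kept.
    with open(csv_path, "a", newline="") as handle:
        writer = csv.writer(handle)
        for path in paths:
            sender, body = read_fields(path)
            writer.writerow([os.path.basename(path), sender, body])
    return len(paths)
```

The csv module quotes commas and line breaks inside the body, which manual string concatenation would not handle. This sketch is written for Python 3; the newline="" argument to open() is not available in Python 2.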

Related

Getting the name of a folder using Exchangelib

I have looked at:
How to get the parent folder name of Message with Exchangelib python
But have been unable to make this work using the following debugging code:
for item in docdead.all().order_by('-datetime_received')[:3000]:  # look at the 3,000 most recently received emails
    if item.datetime_received < ews_bfr:  # if the mail is older than the custom date (in EWS format), apply the rule
        print(item.subject)
        print(item.datetime_received)
        print(item.sender.email_address)
        print(item.sender.name)
        print(item.body)
        print(SingleFolderQuerySet(
            account=account,
            folder=account.root
        ).get(id=item.parent_folder_id.id))
        for attachment in item.attachments:
            print(attachment.name)
I get:
ValueError: EWS does not support filtering on field 'id'
I am sure it's a simple error, but I would appreciate any help.
If you're just querying one folder, then parent_folder_id will always point to that folder.
If you're querying multiple folders at a time, here's the general way to look up a folder name by ID:
from exchangelib.folders import FolderId, SingleFolderQuerySet
folder_name = SingleFolderQuerySet(
    account=account,
    folder=FolderId(id=item.parent_folder_id.id),
).resolve().name
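When looping over thousands of items, resolving the folder for every item probably repeats the same lookup many times. A minimal sketch of caching names by folder ID, assuming a hypothetical resolve_name callable that wraps the SingleFolderQuerySet(...).resolve().name call shown above:

```python
def make_cached_lookup(resolve_name):
    """Wrap resolve_name (hypothetical: folder ID -> folder name) with a dict cache."""
    cache = {}

    def lookup(folder_id):
        # Only call the underlying resolver the first time an ID is seen.
        if folder_id not in cache:
            cache[folder_id] = resolve_name(folder_id)
        return cache[folder_id]

    return lookup
```

In the loop, lookup(item.parent_folder_id.id) would then call the resolver only once per distinct folder.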

How to recall previous output of python script

I am working on a Python script using Telethon that runs when the channel/group receives a new message.
I am looking at the message id to decide whether to run my script.
I am a beginner in Python, so with what knowledge I have,
I am using the following code.
prev_msgid = 0
latest_msgid = message.id
if latest_msgid > prev_msgid:
    print('latest message')
    prev_msgid = message.id
else:
    print('old message')
But every time I run this code, the previous message id resets to 0.
I need a way for prev_msgid to be automatically updated to the latest message id when I run this code multiple times.
Thank you.
As @Quba said, you need a way to store the data persistently.
Pickle is the quickest solution for you. It can save a Python object to a file:
import pickle
from os import path
prev_msgid = 0
# check if saved
if path.exists("prev_msgid"):
    # load
    with open("prev_msgid", 'rb') as f:
        prev_msgid = pickle.load(f)
prev_msgid += 1
# save
with open("prev_msgid", 'wb') as f:
    pickle.dump(prev_msgid, f)
print(prev_msgid)
Every time you run the script, it will add one to prev_msgid. Note that it creates a file named "prev_msgid".
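The same idea can be applied to the comparison in the question. The sketch below swaps pickle for a small JSON file, which is a substitution rather than what the answer above uses; the function names and the "last_id" key are made up for illustration:

```python
import json
import os


def load_last_id(state_path):
    """Return the stored message id, or 0 if nothing has been saved yet."""
    if not os.path.exists(state_path):
        return 0
    with open(state_path) as handle:
        return json.load(handle)["last_id"]


def record_if_new(state_path, message_id):
    """Store message_id and return True only if it is newer than the saved id."""
    if message_id <= load_last_id(state_path):
        return False
    with open(state_path, "w") as handle:
        json.dump({"last_id": message_id}, handle)
    return True
```

A handler could then call record_if_new("state.json", message.id) and run the rest of the script only when it returns True.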

After windows update getting this error AttributeError: olEmbeddeditem

I have a Windows 10 environment with Python 2.7, and win32com package 219 is installed.
I was able to run the code below, which runs a macro in Excel and generates a pie chart that gets attached to an email (and also embedded in the email body) and sent.
This program was working fine earlier; however, after a Windows update it gives AttributeError: olEmbeddeditem. I have imported win32com.client and its constants.
I want the image embedded in the email body, so I don't think replacing olEmbeddeditem with olByValue, etc. will help; I tried it anyway, and it didn't work.
I have also reinstalled the win32com package, but the problem persists.
The earlier working code did not include "from win32com.client import constants"; since it was not working, I added this line, but that didn't help either.
Any help would be appreciated.
import sys
import os
import win32com.client
import codecs
from win32com.client import constants
sys.stdout = codecs.getwriter("iso-8859-1")(sys.stdout, 'xmlcharrefreplace')
outlook = win32com.client.Dispatch("Outlook.Application").GetNamespace("MAPI")
inbox = outlook.GetDefaultFolder(6)
all_inbox = inbox.Items
folders = inbox.Folders
olMailItem = 0x0
obj = win32com.client.Dispatch("Outlook.Application")
xlApp = win32com.client.Dispatch("Excel.Application")
ExcelWorkBook = xlApp.Workbooks.Open('C:\Users\xxx\Desktop\data.xlsm')
xlSheet1 = ExcelWorkBook.Sheets("Sheet1")
xlApp.Application.Run("data.xlsm!Macro1")
chart1 = xlSheet1.ChartObjects(1)
chart1.Chart.Export("C:\Users\xxx\Desktop\photo.gif", "GIF", False)
xlApp.Workbooks(1).Close(SaveChanges=0)
xlApp.Application.Quit()
newMail = obj.CreateItem(olMailItem)
newMail.Subject = "Presentation of Automation"
attachment = newMail.Attachments.Add("C:\Users\xxx\Desktop\photo.gif", win32com.client.constants.olEmbeddeditem, 0, "photo")
imageCid = "photo.gif"
attachment.PropertyAccessor.SetProperty("http://schemas.microsoft.com/mapi/proptag/0x3712001E", imageCid)
newMail.HTMLBody = "<body>Dear Sir,Madam,<br>Please find the requested details.<br><br><p><img src=\"cid:{0}\"></body>".format(imageCid)
newMail.To = x
attachment1 = "C:\Users\xxx\Desktop\photo.gif"
newMail.Attachments.Add(attachment1)
newMail.Send()
os.remove("C:\Users\xxx\Desktop\photo.gif")
msg.UnRead = False
The root cause of the issue was not a Windows update as suspected; it was a group email in the Inbox that was triggering the error. After deleting that group mail or moving it to a folder other than the Inbox, the issue was resolved. I am still not sure why it caused the error, or how to ensure going forward that such emails do not end in a traceback.
The main reason for this attribute error is that your COM server has shifted from late binding (dynamic) to early binding (static).
Delete the gen_py folder in Temp, which will revert Dispatch from static to dynamic, and your code should work fine.
Instead of using
attachment = newMail.Attachments.Add("C:\Users\xxx\Desktop\photo.gif", win32com.client.constants.olEmbeddeditem, 0, "photo")
you can pass the numeric value of olEmbeddeditem (0x5) directly:
attachment = newMail.Attachments.Add("C:\Users\xxx\Desktop\photo.gif", 0x5, 0, "photo")
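If you would rather keep the named constant where it is available, one option is to fall back to the number only when the lookup fails. This is a sketch, assuming the constants object raises AttributeError for a missing name, as the traceback in the question suggests. The helper name is made up, and it takes the constants object as an argument so that it does not depend on win32com being importable:

```python
OL_EMBEDDED_ITEM = 5  # numeric value of Outlook's olEmbeddeditem attachment type


def resolve_constant(constants, name, fallback):
    """Return getattr(constants, name), or fallback if the name is missing."""
    try:
        return getattr(constants, name)
    except AttributeError:
        return fallback
```

In the script above this would be called as resolve_constant(win32com.client.constants, "olEmbeddeditem", OL_EMBEDDED_ITEM) and the result passed to Attachments.Add.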

Reading saved email file with “.msg” extension, in local disk

How can I read an email file (an email saved to a local drive, with a “.msg” extension)?
I tried these two lines and they don't work.
msg = open('Departure HOUSTON EXPRESS Port NORFOLK.msg', 'r')
print msg.read()
I searched the web for an answer, which gave the below code:
import email
def read_MSG(file):
    email_File = open(file)
    messagedic = email.Message(email_File)
    content_type = messagedic["plain/text"]
    FROM = messagedic["From"]
    TO = messagedic.getaddr("To")
    sujet = messagedic["Subject"]
    email_File.close()
    return content_type, FROM, TO, sujet
myMSG= read_MSG(r"c:\\myemail.msg")
print myMSG
However it gives an error:
Traceback (most recent call last):
  File "C:\Python27\G.py", line 19, in <module>
    myMSG= read_MSG(r"c:\\myemail.msg")
  File "C:\Python27\G.py", line 10, in read_MSG
    messagedic = email.Message(email_File)
TypeError: 'LazyImporter' object is not callable
Some responses on the Internet say it would be better to convert the .msg to .eml before parsing, but I am not really sure how.
What would be the best way to read a .msg file?
The code you have now looks to be completely unworkable for what you're trying to accomplish. You need to parse Outlook ".msg" files, which can be done in Python but not using the email module. But if you can use ".eml" files as you mentioned, it will be easier because the email module can read those.
To read .eml files, see email.message_from_file().
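A minimal sketch of reading an .eml file with the standard email package in Python 3. It uses email.message_from_binary_file (the bytes counterpart of message_from_file) together with policy.default, which is my choice here rather than something the answer specifies; the read_eml name is made up:

```python
import email
from email import policy


def read_eml(path):
    """Return (from, to, subject, plain-text body) for the .eml file at path."""
    with open(path, "rb") as handle:
        message = email.message_from_binary_file(handle, policy=policy.default)
    # get_body() picks the best matching part; it returns None if there is no plain-text part.
    body_part = message.get_body(preferencelist=("plain",))
    body = body_part.get_content() if body_part is not None else ""
    return message["From"], message["To"], message["Subject"], body
```

This only works for .eml (RFC 822 style) files. An Outlook .msg file is a different binary format and will not parse this way.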
In case someone else comes across this like me, almost a decade after the original question:
After trying some different solutions offered here and elsewhere on the internet, I found that the easiest for me was to use extract-msg, which you can install with pip. The readme documentation is limited, but the docstrings in the actual library are quite comprehensive.
In my case, I needed to read a .msg on disk and specifically save its attachments to disk. Here is some sample code to show how easy this is with extract-msg:
import extract_msg
msg = extract_msg.openMsg('c:/some_folder/some_mail.msg')
sender = msg.sender
subject = msg.subject
body = msg.body
time_received = msg.receivedTime # datetime
attachment_filenames = []
for att in msg.attachments:
    att.save(customPath='c:/saved_attachments/')
    attachment_filenames.append(att.name)

Extracting Embedded Images From Outlook Email

I am using Microsoft's CDO (Collaboration Data Objects) to programmatically read mail from an Outlook mailbox and save embedded image attachments. I'm trying to do this from Python using the Win32 extensions, but samples in any language that uses CDO would be helpful.
So far, I am here...
The following Python code will read the last email in my mailbox, print the names of the attachments, and print the message body:
from win32com.client import Dispatch
session = Dispatch('MAPI.session')
session.Logon('','',0,1,0,0,'exchange.foo.com\nbar')
inbox = session.Inbox
message = inbox.Messages.Item(inbox.Messages.Count)
for attachment in message.Attachments:
    print attachment
print message.Text
session.Logoff()
However, the attachment names are things like: "zesjvqeqcb_chart_0". Inside the email source, I see image source links like this:
<IMG src="cid:zesjvqeqcb_chart_0">
So, is it possible to use this CID URL (or anything else) to extract the actual image and save it locally?
Differences in versions of OS/Outlook/CDO might be the source of confusion, so here are the steps to get it working on WinXP/Outlook 2007/CDO 1.21:
1. Install CDO 1.21.
2. Install win32com.client.
3. Go to the C:\Python25\Lib\site-packages\win32com\client\ directory and run the following:
   python makepy.py
4. From the list, select "Microsoft CDO 1.21 Library (1.21)" and click OK.
C:\Python25\Lib\site-packages\win32com\client>python makepy.py
Generating to C:\Python25\lib\site-packages\win32com\gen_py\3FA7DEA7-6438-101B-ACC1-00AA00423326x0x1x33.py
Building definitions from type library...
Generating...
Importing module
Examining the file 3FA7DEA7-6438-101B-ACC1-00AA00423326x0x1x33.py that has just been generated will give you an idea of what classes, methods, properties and constants are available.
Now that we are done with the boring steps, here is the fun part:
import win32com.client
from win32com.client import Dispatch
session = Dispatch('MAPI.session')
session.Logon('Outlook')  # this is the profile name
inbox = session.Inbox
messages = session.Inbox.Messages
message = inbox.Messages.GetFirst()
if message:
    attachments = message.Attachments
    for i in range(attachments.Count):
        attachment = attachments.Item(i + 1)  # yep, indexes are 1 based
        filename = "c:\\tmpfile" + str(i)
        attachment.WriteToFile(FileName=filename)
session.Logoff()
The same general approach will also work if you have an older version of CDO (CDO for win2k).
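The question notes that the attachment names match the cid: values in the HTML source. Assuming that holds for your messages, a small helper (the name is made up) can pull those IDs out of the body so that only the embedded images are saved:

```python
import re

# Matches src="cid:..." or src='cid:...' inside an HTML body, case-insensitively.
CID_SRC = re.compile(r'src\s*=\s*["\']cid:([^"\']+)["\']', re.IGNORECASE)


def embedded_image_ids(html):
    """Return the content IDs referenced by cid: links, in document order."""
    return CID_SRC.findall(html)
```

The returned IDs can then be compared with each attachment's name inside the loop above before calling WriteToFile.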
