I am completely new to Python and have been given the task of writing a program that connects to my Microsoft Outlook mailbox and goes through all the emails. If the subject contains a certain word, the email's time and subject will be saved in variables, and the message body will be parsed and the relevant information stored in variables as well. This information will then be stored in an external server/database. The program also needs to monitor the mailbox for new emails and repeat the same subject check and follow-up action for each one.
I have written exactly the same kind of program in C# before using the Interop library, but now need to do this in Python. I can figure out the nitty-gritty details by reading the module documentation later on, but from a high-level perspective, which modules should I use? I have been doing my research, and some modules that have been mentioned include email, procmail and imaplib, but what do the Python veterans here recommend for the kind of project I am undertaking?
Thanks in advance for any help you might be able to provide!
At one company where I worked we had one mailbox for suggestions of websites with 'adult' material and one mailbox for spam mail that should be blocked.
When I started there I was put in 'charge' of this 'gracious' job.
When I checked, there were something like 2000 unread mails with sites to block and 4000 spam mails to block too.
Of course that is a job to be automated, and I looked for a good solution.
What I did:
[1] Used Python's IMAP module to connect to the Exchange server
[2] Used BeautifulSoup (Python) to parse the href values inside the email
[3] After that, sent an email 'thanking' the user for their collaboration (very important)
Three days later my boss thanked me for the great effort I was putting into answering all the e-mails, and said that we had received compliments, because NOW we were answering the customers. (Not me, the script.)
OK, now let's make a plan:
Check the Python imap module [1], and afterwards take a tutorial on using IMAP4 over SSL [4]
Decide what is best for YOUR problem: download the emails (POP3), or search and browse them on the server (IMAP).
CHECK beforehand that you can connect using IMAP4 or POP3. Exchange is buggy in this area, so please check this bug report too [3]
OK, once you are sure you can connect using IMAP4 or POP3, fetch one message and parse it with BeautifulSoup or lxml. (In my case I looked for href and 'mailto:'.)
Write a nice message using the email's 'From:' field to make it personal
PROFIT
[1] google it imap python
[2] google it BeautifulSoup python
[3] http://support.microsoft.com/kb/296387
[4] http://yuji.wordpress.com/2011/06/22/python-imaplib-imap-example-with-gmail/
Sorry, but I had to give Google search terms instead of URLs because of my low score.
I hope this answer gives you some good pointers towards your solution.
Of course you can make it more hax0r by using lxml and sending the data to a DB.
But once you can connect and start manipulating messages, you can do anything :)
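The fetch-and-parse steps of the plan above could be sketched roughly as follows. This is only an illustration under stated assumptions: it uses the standard library's html.parser in place of BeautifulSoup, the helper names (HrefCollector, extract_hrefs, fetch_unseen) are made up for this example, and the host, credentials and the "UNSEEN" search criterion are placeholders you would adapt to your server.

```python
import email
import imaplib
from email import policy
from html.parser import HTMLParser


class HrefCollector(HTMLParser):
    """Collects the href value of every <a> tag fed to it."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def extract_hrefs(raw_message):
    """Return (sender, hrefs) for one raw RFC 822 message given as bytes."""
    msg = email.message_from_bytes(raw_message, policy=policy.default)
    collector = HrefCollector()
    for part in msg.walk():
        # Only HTML parts are scanned; plain-text parts are ignored here.
        if part.get_content_type() == "text/html":
            collector.feed(part.get_content())
    return msg["From"], collector.hrefs


def fetch_unseen(host, user, password, mailbox="INBOX"):
    """Yield the raw bytes of every unseen message (hypothetical helper)."""
    conn = imaplib.IMAP4_SSL(host)
    try:
        conn.login(user, password)
        conn.select(mailbox)
        typ, data = conn.search(None, "UNSEEN")
        for num in data[0].split():
            typ, msg_data = conn.fetch(num, "(RFC822)")
            yield msg_data[0][1]
    finally:
        conn.logout()
```

The sender returned by extract_hrefs is what you would use to address the personal 'thank you' reply.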
The scenario
In our scenario we have:
a manager mailbox (manager#xxx.yyy)
many resource mailboxes (resource_a#xxx.yyy, resource_b#xxx.yyy...)
Resource mailboxes are setup so that their meeting requests are forwarded to the manager mailbox. The manager will accept or decline on behalf of the resource.
Our Python script connects to the manager account using exchangelib, gets the meeting requests, and is supposed to accept or decline each one depending on resource-specific rules.
The problem
Our problem is that we can't find a way to know which resource a MeetingRequest relates to.
What we have tried so far
The to_recipients field's value is manager#xxx.yyy so it doesn't help.
The author and sender fields' values are the mailbox which has created the original meeting so it doesn't help either.
We can't rely upon required_attendees or optional_attendees for 2 reasons:
There are always people's email addresses among the attendees in addition to the resource, and our script can't tell a resource email address apart from the others.
There can be more than one resource for the same meeting. In such a case there will be a meeting request for each resource, each with all the resources in the attendees.
According to the MS docs, MeetingRequest should have a ReceivedRepresenting field, which seems to be exactly what we need. Unfortunately it is not present in the exchangelib MeetingRequest object, although it is in the XML response from EWS when getting the meeting request (we can see it by enabling exchangelib debug logging).
<t:ReceivedRepresenting>
  <t:Mailbox>
    <t:Name>Resource A</t:Name>
    <t:EmailAddress>resource_a#xxx.yyy</t:EmailAddress>
    <t:RoutingType>SMTP</t:RoutingType>
    <t:MailboxType>Mailbox</t:MailboxType>
  </t:Mailbox>
</t:ReceivedRepresenting>
Any idea on how to solve this?
ReceivedRepresenting is not mentioned in the EWS documentation on MeetingRequest
Does your MeetingRequest XML element contain such a child element? If it does, then please open an issue at ecederstrand/exchangelib and I'll have the field implemented.
UPDATE:
I've opened a pull request against the EWS docs to have this field added, and committed a change to support the field in exchangelib. It's released in v4.7.2.
I am attempting to retrieve a list of all email addresses in an enterprise domain from the company's Exchange server using Python. I am guessing that I would need to use some kind of Admin API to retrieve said information. I am looking at PyExchange and some others but I am unable to find specific areas where I can get started.
It would be great to get some advice on where I need to start with this.
PS - One of the options that I am also exploring is to use the subprocess module to perform PowerShell commands. I am not sure if that is a right/wrong approach.
Another option is to use LDAP to access Active Directory and query that information from the AD objects directly; see http://www.reddit.com/r/Python/comments/1ap51a/python_and_active_directory/
For example, the primary SMTP address is held in the Mail property and the proxy addresses are held in the ProxyAddresses property.
Cheers
Glen
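Once the directory entries have been retrieved, pulling the addresses out of those two properties might look like the sketch below. It assumes each entry has already been turned into a plain dict with a 'mail' string and a 'proxyAddresses' list (how you get there depends on the LDAP library you pick), and smtp_addresses is a hypothetical helper name. Proxy address values carry a type prefix such as "SMTP:" (primary) or "smtp:" (alias); other prefixes such as "X500:" are skipped.

```python
def smtp_addresses(entry):
    """Return the SMTP addresses of one directory entry, primary first.

    `entry` is assumed to be a dict like
    {"mail": "...", "proxyAddresses": ["SMTP:...", "smtp:...", ...]}.
    """
    addresses = []
    seen = set()

    primary = entry.get("mail")
    if primary:
        addresses.append(primary)
        seen.add(primary.lower())

    for proxy in entry.get("proxyAddresses", []):
        prefix, _, address = proxy.partition(":")
        # Keep only SMTP-type entries and drop duplicates of the primary.
        if prefix.lower() == "smtp" and address and address.lower() not in seen:
            addresses.append(address)
            seen.add(address.lower())
    return addresses
```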
There are many more emails in my All mailbox than in the Important and Sent mailboxes. I want to remove all the mails that are not in the Important or Sent mailbox.
I cannot do it with the following steps:
1) Delete all the emails in the All mailbox (when I delete all the emails in the All mailbox, all the emails in the Important and Sent mailboxes are deleted at the same time)
2) and then copy the emails back from the Important and Sent mailboxes.
How can I write code to accomplish this?
The problem can be put another way:
How can I make a copy of the emails in my Gmail mailbox "[Gmail]/&kc 2JgQ-" in the local directory g:\mygmail ?
There are 5 emails in my Gmail inbox. I save all of them in g:\mygmails, named 0th.myemail, 1th.myemail, 2th.myemail, 3th.myemail and 4th.myemail, with the following code. Now how can I read them with Thunderbird or some other email software? I don't want to write my own code to read them.
import email, imaplib
att_path = "g:\\mygmails\\"
user = "xxxx"
password = "yyyy"
con = imaplib.IMAP4_SSL('imap.gmail.com')
con.login(user, password)
con.select('INBOX')
resp, items = con.search(None, "ALL")
items = items[0].split()
for id, num in enumerate(items):
    resp, data = con.fetch(num, "(RFC822)")
    data = data[0][1]
    fp = open(att_path + str(id) + "th" + ".myemail", 'wb')
    fp.write(data)
    fp.close()
After doing some digging around on Google, I found a GitHub repository that provides a module for doing just this. It is not very well documented, but the source code is very easy to read, so that isn't a significant loss.
In terms of using this module, you can load in each email with the specified labels and mark them for being saved, then go through all the emails and delete the ones that have not been marked.
I don't currently see a natural way to mark the emails on the remote server, so you may have to implement something where you record the emails as strings and store them in a set.
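One way to sketch that set-based marking is to key each message on its Message-ID header. This assumes every message is available as raw RFC 822 bytes and actually carries a Message-ID; message_id and select_for_deletion are hypothetical helper names and are not part of the module mentioned above.

```python
import email


def message_id(raw_message):
    """Return the Message-ID header of a raw message (bytes), or None."""
    return email.message_from_bytes(raw_message).get("Message-ID")


def select_for_deletion(all_mail, keep_mail):
    """Return the messages in all_mail whose Message-ID is not in keep_mail."""
    keep_ids = {message_id(raw) for raw in keep_mail}
    return [raw for raw in all_mail if message_id(raw) not in keep_ids]
```

Here keep_mail would be the combined contents of the Important and Sent mailboxes, and the returned list is what remains to be deleted from All.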
If you still have any questions, just post a comment on this answer and I can elaborate.
For example, if you wanted to copy the entries of a particular mailbox into a Python data structure, you could do so like this:
# Global Variables
username, password, mailboxname = '', '', '[Gmail]/&kc 2JgQ-'
# Set up
import gmail
g = gmail.Gmail()
g.login(username, password)
# Actual code.
emails = []
for email in g.mailbox(mailboxname).mail():
    emails.append(email.fetch())
# Tear down.
g.logout()
So assuming that you adjust the global variables accordingly, you now have a Python list (in the variable emails) of all the emails in mailboxname for the Gmail account username. Once you have this, you can easily do something like saving them to one or more files.
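Saving to files could be sketched as below. It assumes each message is available as raw RFC 822 bytes (for instance the data[0][1] value that imaplib's fetch returns in the question's code) and writes one .eml file per message, a format that most desktop mail clients, Thunderbird included, can open directly. The helper name save_as_eml and the file naming scheme are made up for this example.

```python
import os


def save_as_eml(raw_messages, directory):
    """Write each raw message (bytes) to its own .eml file and return the paths."""
    os.makedirs(directory, exist_ok=True)
    paths = []
    for index, raw in enumerate(raw_messages):
        path = os.path.join(directory, "%04d.eml" % index)
        # Binary mode keeps the message bytes exactly as the server sent them.
        with open(path, "wb") as handle:
            handle.write(raw)
        paths.append(path)
    return paths
```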
If you like Windows PowerShell, I have a solution that can be reused with little effort and customized for your needs. You can set up a mail user agent that uses the Web Access API to automate this task. In my examples, good old PowerShell (as we already know, the task automation and configuration management framework from Microsoft), with its headless IE capabilities (which let it work as a daemon and communicate with us only if preconditions are true), is able to support all this.
To be more precise, if you have to log in and use firewall Web Access APIs, the implementation is almost the same. So we kill two birds with one stone: every morning you'll be behind the wall and know your mail content. Here you can see a sample solution.
I'm attempting to write a Python function to send an email to a list of users, using the default installed mail client. I want to open the email client, and give the user the opportunity to edit the list of users or the email body.
I did some searching, and according to here:
http://www.sightspecific.com/~mosh/WWW_FAQ/multrec.html
It's apparently against the RFC spec to put multiple comma-delimited recipients in a mailto link. However, that's the way everybody else seems to be doing it. What exactly is the modern stance on this?
Anyhow, I found the following two sites:
http://2ality.blogspot.com/2009/02/generate-emails-with-mailto-urls-and.html
http://www.megasolutions.net/python/invoke-users-standard-mail-client-64348.aspx
which seem to suggest solutions using urllib.parse (urllib.parse.quote for me) and webbrowser.open.
I tried the sample code from the first link (2ality.blogspot.com), and that worked fine, and opened my default mail client. However, when I try to use the code in my own module, it seems to open up my default browser, for some weird reason. No funny text in the address bar, it just opens up the browser.
The email_incorrect_phone_numbers() function is in the Employees class, which contains a dictionary (employee_dict) of Employee objects, which themselves have a number of employee attributes (sn, givenName, mail etc.). Full code is actually here (Python - Converting CSV to Objects - Code Design)
from urllib.parse import quote
import webbrowser
....
def email_incorrect_phone_numbers(self):
    email_list = []
    for employee in self.employee_dict.values():
        if not PhoneNumberFormats.standard_format.search(employee.telephoneNumber):
            print(employee.telephoneNumber, employee.sn, employee.givenName, employee.mail)
            email_list.append(employee.mail)
    recipients = ', '.join(email_list)
    webbrowser.open("mailto:%s?subject=%s&body=%s" %
                    (recipients, quote("testing"), quote('testing'))
                    )
Any suggestions?
Cheers,
Victor
Well, since you asked for suggestions: forget about the mailto: scheme and webbrowser, and write a small SMTP client using Python's smtplib module. It's standard, fully supported on all systems, and there's an example included in the documentation which you can practically just copy-and-paste pieces out of.
Of course, if you're using smtplib you will have to ask the user for the details of an SMTP server to use (hostname and port, and probably a login/password). That is admittedly inconvenient, so I can see why you'd want to delegate to existing programs on the system to handle the email. Problem is, there's no system-independent way to do that. Even the webbrowser module doesn't work everywhere; some people use systems on which the module isn't able to detect the default (or any) browser, and even when it can, what happens when you provide a mailto: link is entirely up to the browser.
If you don't want to or can't use SMTP, your best bet might be to write a custom module that is able to detect and open the default email client on as many different systems as possible - basically what the webbrowser module does, except for email clients instead of browsers. In that case it's up to you to identify what kinds of mail clients your users have installed and make sure you support them. If you're thorough enough, you could probably publish your module on PyPI (Python package index) and perhaps even get it included in a future version of the Python standard library - I'm sure there are plenty of people who would appreciate something like that.
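For the smtplib route, a minimal sketch might look like this. The function names are made up for illustration, and the port, the use of STARTTLS and the optional login are assumptions about your SMTP server rather than anything given in the question.

```python
import smtplib
from email.message import EmailMessage


def build_message(sender, recipients, subject, body):
    """Build a plain-text message addressed to a list of recipients."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = ", ".join(recipients)
    msg["Subject"] = subject
    msg.set_content(body)
    return msg


def send_message(msg, host, port=587, username=None, password=None):
    """Send msg through an SMTP server (assumed here to support STARTTLS)."""
    with smtplib.SMTP(host, port) as server:
        server.starttls()
        if username:
            server.login(username, password)
        server.send_message(msg)
```

Unlike the mailto approach, this sends the message directly, so any editing of the recipient list or body has to happen in your own program before send_message is called.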
As is often the case in Python, somebody's already done most of the hard work. Check out this recipe.
In the following line, there shouldn’t be a space after the comma.
recipients = ', '.join(email_list)
Furthermore, Outlook needs semicolons, not commas. Apart from that, mailto never gave me grief.
The general tip is to test mailto URLs manually in the browser first and to debug URLs by printing them out and entering them manually.
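Putting those points together, a small helper for building and printing the URL before opening it might look like this. It is only a sketch: build_mailto is a hypothetical name, and the separator parameter exists so you can switch to ";" for Outlook.

```python
from urllib.parse import quote


def build_mailto(recipients, subject, body, separator=","):
    """Build a mailto URL with no space after the recipient separator."""
    to = separator.join(quote(r, safe="@") for r in recipients)
    return "mailto:%s?subject=%s&body=%s" % (to, quote(subject), quote(body))


url = build_mailto(["a@example.com", "b@example.com"], "hello world", "testing")
print(url)  # inspect the URL, or paste it into the browser, before using it
```

Once the printed URL looks right, it can be handed to webbrowser.open(url).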
I use Gmail and an application that notifies me when I've received a new email, showing its title in a tooltip (GmailNotifier with Miranda-IM). Most of the emails I receive are ones I don't want to read, and it's annoying having to log in to Gmail on a slow connection just to delete such an email. I believe the plugin is closed source.
I've been (unsuccessfully) trying to write a script that will log in and delete the 'top' email (the one most recently received). However, this is not as easy as I thought it would be.
I first tried using imaplib, but discovered that it doesn't contain any of the methods I hoped it would. It's a bit like the dbapi spec, containing only minimal functionality in case the IMAP spec is changed. I then tried reading the IMAP RFC (rfc3501). Halfway through it, I realized I didn't want to write an entire mail client, so I decided to try using POP3 instead.
poplib is also minimal but seemingly has what I need. However pop3 doesn't appear to sort the messages in any order I'm familiar with. I have to either call top() or retr() on every single email to read the headers if I want to see the date received.
I could probably iterate through every single message header, searching for the most recent date, but that's ugly. I want to avoid parsing my entire mailbox if possible. I also don't want to 'pop' the mailbox and download any other messages.
It's been 6 hours now and I feel no closer to a solution than when I started. Am I overlooking something simple? Is there another library I could try? (I found a 'chilkat' one, but it's bloated to hell, and I was hoping to do this with the standard library)
import poplib
#connect to server
mailserver = poplib.POP3_SSL('pop.gmail.com')
mailserver.user('recent:YOURUSERNAME') #use 'recent mode'
mailserver.pass_('YOURPASSWORD') #consider not storing in plaintext!
#newest email has the highest message number
numMessages = len(mailserver.list()[1])
#confirm this is the right one, can comment these out later
newestEmail = mailserver.retr(numMessages)
print(newestEmail)
#most servers will not delete until you quit
mailserver.dele(numMessages)
mailserver.quit()
I worked with poplib recently, writing a very primitive email client. I tested this with my email server (not Gmail) on some test emails and it seemed to work correctly. I would send yourself a few dummy emails to test it out first.
Caveats:
Make sure you are using 'recent mode': http://mail.google.com/support/bin/answer.py?answer=47948
Make sure your Gmail account has POP3 enabled: Gmail > Settings > Forwarding and POP/IMAP > "Enable POP for all mail"
Hope this helps, it should be enough to get you going!
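If your server turns out not to number messages in arrival order, the header-scanning fallback mentioned in the question could be sketched as follows. It assumes you have already collected the header lines of each message with poplib's top(n, 0), whose second return element is a list of byte strings; newest_message_number is a hypothetical helper name.

```python
import email
from email.utils import parsedate_to_datetime


def newest_message_number(header_lines_by_number):
    """Return the message number whose Date header is the most recent.

    `header_lines_by_number` maps a POP3 message number to the list of
    header lines (bytes) for that message. Messages without a Date header
    are skipped. All dates are assumed to carry an explicit UTC offset so
    that they can be compared with each other.
    """
    newest_number, newest_date = None, None
    for number, lines in header_lines_by_number.items():
        msg = email.message_from_bytes(b"\r\n".join(lines))
        if msg["Date"] is None:
            continue
        date = parsedate_to_datetime(msg["Date"])
        if newest_date is None or date > newest_date:
            newest_number, newest_date = number, date
    return newest_number
```

This still reads every header once, so it only makes sense when the highest message number cannot be trusted to be the newest email.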