Python IMAP search using a subject encoded with utf-8 - python

This question is related to question Python IMAP search using a subject encoded with iso-8859-1, but the reply given there is not working for me.
I am doing the following IMAP search in python:
typ, data = self.M.search("utf-8", "(SUBJECT %s)" % u"réception".encode("utf-8"))
And I get the following exception:
...
typ, data = self.M.search("utf-8", "(SUBJECT %s)" % u"réception".encode("utf-8"))
File "/usr/local/python/2.7.2/lib/python2.7/imaplib.py", line 625, in search
typ, dat = self._simple_command(name, 'CHARSET', charset, *criteria)
File "/usr/local/python/2.7.2/lib/python2.7/imaplib.py", line 1070, in _simple_command
return self._command_complete(name, self._command(name, *args))
File "/usr/local/python/2.7.2/lib/python2.7/imaplib.py", line 905, in _command_complete
raise self.error('%s command error: %s %s' % (name, typ, data))
error: SEARCH command error: BAD ['Could not parse command']
Why is that? How can I solve this problem?

import imaplib
import getpass
email = "XXXXXXX@gmail.com"
sock = imaplib.IMAP4_SSL("imap.gmail.com", 993)
sock.login(email, getpass.getpass())
# select the correct mailbox...
sock.select()
# turn on debugging if you like
sock.debug = 4
then:
# use the undocumented IMAP4.literal attribute
sock.literal = "réception"
sock.uid('SEARCH', 'CHARSET', 'UTF-8', 'SUBJECT')

u"réception" will need to be wrapped in quotes: u'"réception"', as imaplib will not quote the string for you in the list.
Update: I could not get Gmail's IMAP implementation to accept even a quoted string, and had to use IMAP literal syntax. I'm not sure if this is a limitation of my encoding using socat or a limitation of Gmail.
a UID SEARCH CHARSET utf-8 SUBJECT "réception"
a BAD Could not parse command
a UID SEARCH CHARSET utf-8 SUBJECT {10}
+ go ahead
réception
* SEARCH
a OK SEARCH completed (Success)
Unfortunately, imaplib does not provide a documented way to force the use of an IMAP literal.
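The {10} in the transcript above is the size of the search term in bytes once it is encoded as UTF-8, not its length in characters. A minimal sketch of how that prefix could be computed (the helper name imap_literal is made up for illustration and is not part of imaplib):

```python
def imap_literal(text, charset="utf-8"):
    """Return the '{N}' size prefix and the encoded payload for an IMAP literal."""
    payload = text.encode(charset)
    # N counts octets, so "é" contributes 2 to the size in UTF-8.
    prefix = "{%d}" % len(payload)
    return prefix, payload


# "r\u00e9ception" is "réception": 9 characters, 10 bytes in UTF-8.
prefix, payload = imap_literal("r\u00e9ception")
print(prefix, payload)
```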

The external library https://github.com/ikvk/imap_tools supports searching with encoded data:
from imap_tools import MailBox, A
# get the emails whose subject contains "réception" from the INBOX folder
with MailBox('imap.mail.com').login('test@mail.com', 'pwd') as mailbox:
    for msg in mailbox.fetch(A(subject='réception'), charset='utf8'):
        print(msg.subject)

This one works for me:
# use the undocumented IMAP4.literal attribute
sock.literal = u"réception".encode('utf-8')
sock.uid('SEARCH', 'CHARSET', 'UTF-8', 'SUBJECT')
Thanks, Lee!
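On Python 3 the same trick can be wrapped in a small helper. This is a sketch under two assumptions: that the connection object behaves like imaplib.IMAP4 (whose undocumented literal attribute is sent as an IMAP literal with the next command), and that search_subject_utf8 is just an illustrative name. The stand-in class below only records the call so that the snippet runs without a server:

```python
def search_subject_utf8(conn, subject):
    """Run UID SEARCH for a non-ASCII subject by sending it as an IMAP literal."""
    # imaplib sends conn.literal (bytes on Python 3) as a literal
    # appended to the next command.
    conn.literal = subject.encode("utf-8")
    return conn.uid("SEARCH", "CHARSET", "UTF-8", "SUBJECT")


class RecordingConnection:
    """Hypothetical stand-in for imaplib.IMAP4_SSL, used only for this demo."""

    literal = None

    def uid(self, command, *args):
        # Remember what would have been sent instead of talking to a server.
        self.last_call = (command, args, self.literal)
        return "OK", [b""]


conn = RecordingConnection()
typ, data = search_subject_utf8(conn, "r\u00e9ception")
print(typ, conn.last_call)
```

With a real connection, conn would be the logged-in imaplib.IMAP4_SSL object after select().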

Related

Why am I receiving a TypeError when trying to use imapclient?

I am new to Python, and I am attempting to write code to get into my Gmail inbox, following along with a Udemy course. I have successfully installed imapclient, and importing imapclient does not return any error, so I think I am okay on that.
When I enter my next line of code, conn = imapclient.IMAPClient('imap.gmail.com', ssl=True), it gives me a TypeError:
Traceback (most recent call last):
File "<pyshell#15>", line 1, in <module>
conn = imapclient.IMAPClient('imap.gmail.com', ssl=True)
File "C:\Users\china\AppData\Local\Programs\Python\Python39\lib\site-packages\imapclient\imapclient.py", line 254, in __init__
self._imap = self._create_IMAP4()
File "C:\Users\china\AppData\Local\Programs\Python\Python39\lib\site-packages\imapclient\imapclient.py", line 288, in _create_IMAP4
return tls.IMAP4_TLS(self.host, self.port, self.ssl_context,
File "C:\Users\china\AppData\Local\Programs\Python\Python39\lib\site-packages\imapclient\tls.py", line 44, in __init__
imaplib.IMAP4.__init__(self, host, port)
File "C:\Users\china\AppData\Local\Programs\Python\Python39\lib\imaplib.py", line 202, in __init__
self.open(host, port, timeout)
TypeError: open() takes 3 positional arguments but 4 were given
I know that ssl=True and 'imap.gmail.com' are both required arguments, so I have not tried taking either out. I only have two lines of code, and I am following the exact code from the Udemy course, so I am not sure how to resolve this myself.
I had exactly the same problem.
The problem is that imapclient is only officially supported up to Python 3.7, so you could try using imapclient with Python 3.7.
Or you could use the imaplib module. It is a bit more complicated, but I tried using imaplib together with the pyzmail module (to make the text more readable):
import imaplib
import pyzmail
server = imaplib.IMAP4_SSL('IMAP_SERVER')
server.login('email@example.com', 'password')
server.select("INBOX")
typ, data = server.search(None, "ALL")
for num in data[0].split():
    typ, data = server.fetch(num, '(RFC822)')
    message = pyzmail.PyzMessage.factory(data[0][1])
    print('THIS IS MESSAGE NO. ' + str(num))
    if message.text_part is not None:
        print(message.text_part.get_payload().decode(message.text_part.charset))
    elif message.html_part is not None:
        print(message.html_part.get_payload().decode(message.html_part.charset))
server.close()
server.logout()
I'm not an expert at Python, so using pyzmail might not be the optimal solution, but it worked for me using Python 3.9.
For IMAP_SERVER you have to insert the right IMAP server domain name (like imap.gmail.com for Gmail accounts),
and you have to enter your real email address and password (or get the password with input()).
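If installing pyzmail is not an option, the standard library's email package can extract the text as well. A minimal sketch, assuming Python 3.6 or later; extract_text is an illustrative helper, and the raw bytes below stand in for the data[0][1] value returned by fetch:

```python
import email
from email import policy


def extract_text(raw_bytes):
    """Return the plain-text body (or the HTML body as a fallback) of a raw message."""
    msg = email.message_from_bytes(raw_bytes, policy=policy.default)
    body = msg.get_body(preferencelist=("plain", "html"))
    # get_content() decodes the payload using the part's declared charset.
    return body.get_content() if body is not None else ""


# Made-up sample message; a real one would come from server.fetch().
raw = (
    b"Subject: test\r\n"
    b"Content-Type: text/plain; charset=utf-8\r\n"
    b"Content-Transfer-Encoding: 8bit\r\n"
    b"\r\n"
    b"r\xc3\xa9ception\r\n"
)
text = extract_text(raw)
print(text.strip())
```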
I got the same issue. And it has been fixed in IMAPClient 2.2.0.
https://imapclient.readthedocs.io/en/2.2.0/releases.html
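The traceback is consistent with a signature mismatch: imaplib on Python 3.9 calls self.open(host, port, timeout), while the subclass in the older IMAPClient release apparently still overrides open() without a timeout parameter. The toy classes below (all names hypothetical, no real IMAP code involved) reproduce that kind of failure:

```python
class Base:
    """Plays the role of the newer imaplib.IMAP4, which passes a timeout to open()."""

    def __init__(self, host, port, timeout=None):
        self.open(host, port, timeout)

    def open(self, host, port, timeout=None):
        self.target = (host, port, timeout)


class OldSubclass(Base):
    """Plays the role of a subclass written before the timeout argument existed."""

    def open(self, host, port):
        self.target = (host, port)


try:
    OldSubclass("imap.example.com", 993)
except TypeError as exc:
    message = str(exc)
    print(message)
```

Upgrading the library, as suggested above, replaces the outdated override.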

Python imaplib deleting multiple emails gmail

My code looks like this:
import imaplib
import email
obj = imaplib.IMAP4_SSL('imap.gmail.com','993')
obj.login('user','pass')
obj.select('inbox')
delete = []
for i in range(1, 10):
    typ, msg_data = obj.fetch(str(i), '(RFC822)')
    print i
    x = i
    for response_part in msg_data:
        if isinstance(response_part, tuple):
            msg = email.message_from_string(response_part[1])
            for header in [ 'subject', 'to', 'from', 'Received' ]:
                print '%-8s: %s' % (header.upper(), msg[header])
                if header == 'from' and '<sender's email address>' in msg[header]:
                    delete.append(x)
string = str(delete[0])
for xx in delete:
    if xx != delete[0]:
        print xx
        string = string + ', '+ str(xx)
print string
obj.select('inbox')
obj.uid('STORE', string , '+FLAGS', '(\Deleted)')
obj.expunge()
obj.close()
obj.logout()
The error I get is:
Traceback (most recent call last):
File "del_email.py", line 31, in <module>
obj.uid('STORE', string , '+FLAGS', '(\Deleted)')
File "C:\Tools\Python(x86)\Python27\lib\imaplib.py", line 773, in uid
typ, dat = self._simple_command(name, command, *args)
File "C:\Tools\Python(x86)\Python27\lib\imaplib.py", line 1088, in _simple_command
return self._command_complete(name, self._command(name, *args))
File "C:\Tools\Python(x86)\Python27\lib\imaplib.py", line 918, in _command_complete
raise self.error('%s command error: %s %s' % (name, typ, data))
imaplib.error: UID command error: BAD ['Could not parse command']
I am looking for a way to delete multiple emails at once using imaplib or another module, and for the simplest example to work from. The example I used is the last answer to the question Using python imaplib to "delete" an email from Gmail?, but it's not working correctly. I can, however, get the first example there to work, deleting one email each time the script is run. I'd rather delete several at once than run the script several thousand times. My main goal is to delete multiple emails through imaplib; any workarounds, other working modules, or examples would be appreciated.
You might find this a bit easier using IMAPClient, as it takes care of many of the low-level protocol details for you.
Using IMAPClient your code would look something like:
from imapclient import IMAPClient
import email
obj = IMAPClient('imap.gmail.com', ssl=True)
obj.login('user','pass')
obj.select_folder('inbox')
delete = []
msg_ids = obj.search(('NOT', 'DELETED'))
for msg_id in msg_ids:
    msg_data = obj.fetch(msg_id, ('RFC822',))
    msg = email.message_from_string(msg_data[msg_id]['RFC822'])
    for header in [ 'subject', 'to', 'from', 'Received' ]:
        print '%-8s: %s' % (header.upper(), msg[header])
        if header == 'from' and '<senders email address>' in msg[header]:
            # remember this message for deletion
            delete.append(msg_id)
obj.delete_messages(delete)
obj.expunge()
obj.close_folder()
obj.logout()
This could be made more efficient by fetching multiple messages in a single fetch() call rather than fetching them one at a time but I've left that out for clarity.
If you just want to filter by the sender's address, you can get the IMAP server to do the filtering for you. This avoids the need to download the message bodies and makes the process a whole lot faster.
This would look like:
from imapclient import IMAPClient
obj = IMAPClient('imap.gmail.com', ssl=True)
obj.login('user','pass')
obj.select_folder('inbox')
msg_ids = obj.search(('NOT', 'DELETED', 'FROM', '<senders email address>'))
obj.delete_messages(msg_ids)
obj.expunge()
obj.close_folder()
obj.logout()
Disclaimer: I'm the author and maintainer of IMAPClient.
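As for the BAD response in the question, one likely culprit is the message set handed to STORE: the script joins the numbers with ', ', whereas an IMAP message set is a comma-separated list without spaces. The script also collects message sequence numbers but sends them with UID STORE, which expects UIDs. A small sketch of building the set (the helper name is made up):

```python
def build_message_set(ids):
    """Join message numbers into an IMAP message set such as '1,2,5'."""
    # No spaces are allowed between the entries of a message set.
    return ",".join(str(i) for i in ids)


message_set = build_message_set([1, 2, 5])
print(message_set)
```

With plain imaplib and sequence numbers, this string would be passed to obj.store(message_set, '+FLAGS', '(\\Deleted)') rather than to obj.uid('STORE', ...).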
Initial post:
SyntaxError: '<sender's email address>'
# did you mean:
"<sender's email address>"

In a python try/except how can i pass data back to the calling script

I have a Python script that gets executed from a PHP script.
This script sends an email and is several orders of magnitude faster than PHP with swiftmailer, but I have a problem. I don't really understand much about Python or character encoding, and sometimes Python's MIMEText library raises an encoding error. I googled and found a solution, but I still want to have the PHP script as a fallback in case the Python script fails.
The problem is that in the except block I seem unable to pass the data back to PHP before the script terminates. Any ideas on what I'm doing wrong here? You can see in the except block that I set result = {type: 'failed'} and then attempt to JSON-encode it, but the data doesn't get back to the PHP script.
import sys, json, os, smtplib, socket, logging
LOG_FILE='/tmp/mandrill.out'
logging.basicConfig(filename=LOG_FILE,level=logging.DEBUG)
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText
from email.header import Header
from email.utils import formataddr
try:
    data = json.loads(sys.argv[1])
    msg = MIMEMultipart('alternative')
    msg['Subject'] = data['subject']
    msg['From'] = formataddr((str(Header(u'Business Name', 'utf-8')), data['from']))
    msg['To'] = data['to']
    html = data['body']
    email = MIMEText(html, 'html','utf-8')
    username = 'username'
    password = 'password'
    msg.attach(email)
    s = smtplib.SMTP('smtp.example.com', 587, socket.getfqdn(), 3)
    s.login(username, password)
    s.sendmail(msg['From'], msg['To'], msg.as_string())
    s.quit()
    result = {'type': 'success'}
    print json.dumps(result)
except:
    result = {'type': 'error'}
    logging.exception('Got exception on main handler')
    raise
    return json.dumps(result)
EDIT TO INCLUDE TRACEBACK
ERROR:root:Got exception on main handler
Traceback (most recent call last):
File "/usr/share/nginx/mandrill/mandrill.py", line 16, in <module>
email = MIMEText(html, 'html')
File "/usr/lib/python2.7/email/mime/text.py", line 30, in __init__
self.set_payload(_text, _charset)
File "/usr/lib/python2.7/email/message.py", line 226, in set_payload
self.set_charset(charset)
File "/usr/lib/python2.7/email/message.py", line 262, in set_charset
self._payload = self._payload.encode(charset.output_charset)
UnicodeEncodeError: 'ascii' codec can't encode character u'\u2019' in position 9580: ordinal not in range(128)
You can't use return to return data to the PHP calling script, you'll need to print it as you are doing in the successful case.
Also, calling raise will propagate the exception up to the main part of your script, thus preventing execution of the return json.dumps(result) (which should be a print). You don't need to raise the exception again (your script will terminate anyway).
You might also (or instead) want to set a return code from the python script via sys.exit(), then your PHP script can check the return code to detect errors. The convention is to set a return code of 0 for success and anything else for failure.
Also, you are using a bare except clause, which will catch all exceptions. What is the actual exception that you are seeing? Can you update your question with the full traceback?
The raise causes the Python script to terminate, so it never reaches the return.
The next problem is that return can only be used inside Python to return a value to the calling function. You probably want to print the JSON string instead (but this somewhat depends on how you are calling Python from PHP, which you have not documented here).
The traceback indicates that the line
email = MIMEText(html, 'html','utf-8')
is not what actually ran; the code which caused the traceback lacks the 'utf-8' parameter.
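To illustrate the difference the third argument makes: with an explicit charset, MIMEText encodes the body itself, so a character such as the right single quotation mark (U+2019) from the traceback never reaches the ASCII codec. A sketch for Python 3 with made-up HTML (Python 3 is more forgiving here than the 2.7 install in the traceback, so this only demonstrates the explicit-charset form):

```python
from email.mime.text import MIMEText

html = "<p>It\u2019s working</p>"  # contains U+2019, as in the traceback
part = MIMEText(html, "html", "utf-8")

# The declared charset travels with the part, and the body round-trips intact.
print(part.get_content_charset())
decoded = part.get_payload(decode=True).decode("utf-8")
print(decoded == html)
```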
You are killing the program with sys.exit(1).
If you want to return data back to something, I would exchange the sys.exit(1) line with
return json.dumps(result)
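Putting the advice about print and sys.exit() together: print the JSON in both branches and signal failure through the exit status. A minimal Python 3 sketch; send_mail is a hypothetical placeholder for the SMTP code in the question, and the shape of the error payload is an assumption:

```python
import json
import sys


def send_mail(data):
    """Hypothetical placeholder for the real SMTP logic from the question."""
    if "to" not in data:
        raise ValueError("missing recipient")


def main(argv):
    """Print a JSON result for the caller and return a process exit code."""
    try:
        data = json.loads(argv[1])
        send_mail(data)
    except Exception as exc:
        # Report the failure on stdout so the PHP side can parse it.
        print(json.dumps({"type": "error", "message": str(exc)}))
        return 1
    print(json.dumps({"type": "success"}))
    return 0


# In the real script the last line would be: sys.exit(main(sys.argv))
exit_code = main(["mailer.py", '{"to": "someone@example.com"}'])
```

The PHP caller can then read the printed JSON and, independently, check for a non-zero exit status before falling back to its own mailer.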

Python IMAP: =?utf-8?Q? in subject string

I am displaying new email with IMAP, and everything looks fine, except for one message subject shows as:
=?utf-8?Q?Subject?=
How can I fix it?
In MIME terminology, those encoded chunks are called encoded-words. You can decode them like this:
import email.header
text, encoding = email.header.decode_header('=?utf-8?Q?Subject?=')[0]
Check out the docs for email.header for more details.
This is a MIME encoded-word. You can parse it with email.header:
import email.header
def decode_mime_words(s):
    return u''.join(
        word.decode(encoding or 'utf8') if isinstance(word, bytes) else word
        for word, encoding in email.header.decode_header(s))
print(decode_mime_words(u'=?utf-8?Q?Subject=c3=a4?=X=?utf-8?Q?=c3=bc?='))
The text is encoded as a MIME encoded-word. This is a mechanism defined in RFC 2047 for encoding headers that contain non-ASCII text, so that the encoded output contains only ASCII characters.
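A quick round trip shows the mechanism in both directions. This is a sketch for Python 3 using email.header from the standard library; the sample subject is made up:

```python
from email.header import Header, decode_header, make_header

subject = "R\u00e9ception"  # "Réception"

# Encoding yields an ASCII-only encoded-word of the form =?utf-8?...?=
encoded = Header(subject, "utf-8").encode()
print(encoded)

# Decoding reverses it and gives back the original text.
decoded = str(make_header(decode_header(encoded)))
print(decoded)
```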
In Python 3.3+, the parsing classes and functions in email.parser automatically decode "encoded words" in headers if their policy argument is set to policy.default:
>>> import email
>>> from email import policy
>>> msg = email.message_from_file(open('message.txt'), policy=policy.default)
>>> msg['from']
'Pepé Le Pew <pepe@example.com>'
The parsing classes and functions are:
email.parser.BytesParser
email.parser.Parser
email.message_from_bytes
email.message_from_binary_file
email.message_from_string
email.message_from_file
Confusingly, up to at least Python 3.10, the default policy for these parsing functions is not policy.default, but policy.compat32, which does not decode "encoded words".
>>> msg = email.message_from_file(open('message.txt'))
>>> msg['from']
'=?utf-8?q?Pep=C3=A9?= Le Pew <pepe@example.com>'
Try Imbox, because imaplib is a very low-level library and returns results which are hard to work with.
Installation
pip install imbox
Usage
from imbox import Imbox
with Imbox('imap.gmail.com',
        username='username',
        password='password',
        ssl=True,
        ssl_context=None,
        starttls=False) as imbox:
    all_inbox_messages = imbox.messages()
    for uid, message in all_inbox_messages:
        message.subject
In Python 3, decoding this to an approximated string is as easy as:
from email.header import decode_header, make_header
decoded = str(make_header(decode_header("=?utf-8?Q?Subject?=")))
See the documentation of decode_header and make_header.
High level IMAP lib may be useful here: imap_tools
from imap_tools import MailBox, AND
# get list of email subjects from INBOX folder
with MailBox('imap.mail.com').login('test@mail.com', 'pwd', 'INBOX') as mailbox:
    subjects = [msg.subject for msg in mailbox.fetch()]
Parsed email message attributes
Query builder for searching emails
Actions with emails: copy, delete, flag, move, seen
Actions with folders: list, set, get, create, exists, rename, delete, status
No dependencies

Python 3.0 smtplib

I have a very simple piece of code that I used in previous versions of Python without issues (version 2.5 and prior). Now with 3.0, the following code gives the error "argument 1 must be string or buffer, not str" on the login line.
import smtplib
smtpserver = 'mail.somedomain.com'
AUTHREQUIRED = 1 # if you need to use SMTP AUTH set to 1
smtpuser = 'admin@somedomain.com' # for SMTP AUTH, set SMTP username here
smtppass = 'somepassword' # for SMTP AUTH, set SMTP password here
msg = "Some message to send"
RECIPIENTS = ['admin@somedomain.com']
SENDER = 'someone@someotherdomain.net'
session = smtplib.SMTP(smtpserver)
if AUTHREQUIRED:
    session.login(smtpuser, smtppass)
smtpresult = session.sendmail(SENDER, RECIPIENTS, msg)
Google shows there are some issues with that error not being clear, but I still can't figure out what I need to try to make it work. Suggestions included defining the username as b"username", but that doesn't seem to work either.
UPDATE: just noticed from a look at the bug tracker there's a suggested fix also:
Edit smtplib.py and replace the existing encode_plain() definition with this:
def encode_plain(user, password):
    s = "\0%s\0%s" % (user, password)
    return encode_base64(s.encode('ascii'), eol='')
Tested here on my installation and it works properly.
Traceback (most recent call last):
File "smtptest.py", line 18, in <module>
session.login(smtpuser, smtppass)
File "c:\Python30\lib\smtplib.py", line 580, in login
AUTH_PLAIN + " " + encode_plain(user, password))
File "c:\Python30\lib\smtplib.py", line 545, in encode_plain
return encode_base64("\0%s\0%s" % (user, password))
File "c:\Python30\lib\email\base64mime.py", line 96, in body_encode
enc = b2a_base64(s[i:i + max_unencoded]).decode("ascii")
TypeError: b2a_base64() argument 1 must be bytes or buffer, not str
Your code is correct. This is a bug in smtplib or in the base64mime.py.
You can track the issue here:
http://bugs.python.org/issue5259
Hopefully the devs will post a patch soon.
As a variation on Jay's answer, rather than edit smtplib.py you could "monkey patch" it at run time.
Put this somewhere in your code:
def encode_plain(user, password):
    s = "\0%s\0%s" % (user, password)
    return encode_base64(s.encode('ascii'), eol='')
import smtplib
encode_plain.func_globals = vars(smtplib)
smtplib.encode_plain = encode_plain
This is kind of ugly but useful if you want to deploy your code onto other systems without making changes to their python libraries.
This issue has been addressed in Python 3.1. Get the update at http://www.python.org/download/releases/3.1/
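For reference, the str-versus-bytes issue behind the traceback can be reproduced with the standard base64 module alone. A sketch for Python 3; it mirrors the idea of the suggested fix but is not the actual smtplib code:

```python
import base64


def encode_plain(user, password):
    """Build the Base64 string for SMTP AUTH PLAIN from text credentials."""
    raw = "\0%s\0%s" % (user, password)
    # Base64 operates on bytes in Python 3, so encode the str first.
    return base64.b64encode(raw.encode("ascii")).decode("ascii")


token = encode_plain("user", "pass")
print(token)

# Passing the str directly fails, much like the b2a_base64() call in the traceback.
try:
    base64.b64encode("\0user\0pass")
    raised = False
except TypeError:
    raised = True
```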
