HTML formatting issues with python smtplib and Outlook 2010

I am generating HTML files using elementtree.ElementTree.dump on an Element. The files look OK in all browsers, and the underlying markup looks fine (no unclosed brackets or anything).
When I send an email to Outlook 2010 via smtplib, I see weird formatting issues. They are 100% repeatable, so the cause must be systematic. Here is an example:
<table b="" order="1">
That is from the source of an HTML email I sent myself. It is correctly written as:
<table border="1">
within the original source code.
If I write an HTML email in Outlook using the original HTML as the source, it formats correctly. (New email -> attach HTML file -> insert as text)
Is the issue going to be Outlook or Python? The function I used for reading the HTML file and sending it is below.
def email_Report(mailOptions):
    reportName = time.strftime("%Y%m%d.%H%M") + ".html"
    ElementTree(mailOptions['report']).write("/home/%s/%s" %(mailOptions['username'],reportName))
    #Set sender and receiver to the user building the report.
    mailaddr = '%s@acme.com' %(mailOptions['username'])
    #Access the report file. Added binary in case we ever use code on Windows
    filename = "/home/%s/%s" % (mailOptions['username'], reportName)
    open_file = open(filename, 'rb')
    emsg = MIMEText(open_file.read(), 'html')
    open_file.close()
    emsg['Subject'] = "Report for %s generated by %s %s" % (mailOptions['zone'], mailOptions['username'], time.strftime("%d%m%Y-%H%M"))
    emsg['To'] = mailaddr
    emsg['From'] = mailaddr
    #Hostname can be a parameter to SMTP method if localhost isn't listening
    sc = smtplib.SMTP()
    sc.connect()
    sc.sendmail(mailaddr, mailaddr, emsg.as_string())
    sc.close()
    return
The HTML is extremely simple. No CSS, no title or head tags etc. Just html->body->table->tr->th->(newrow)->td->td etc. Could I have overlooked something like encoding/escaping? Do I have to use mime multipart? I am using Python 2.4.3 and can't use any module that didn't come stock.

Are you sure you're not running into the 990-character line limit for mail servers, as described in "workaround for the 990 character limitation for email mailservers"?
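One way to sidestep the line limit without reflowing the HTML yourself is to let the email package apply a content transfer encoding. The sketch below is written for Python 3, not the Python 2.4 from the question, and relies on MIMEText base64-encoding the body when an explicit utf-8 charset is passed, which wraps the encoded text into short lines. build_html_message and the example.com addresses are made-up names for illustration only.

```python
from email.mime.text import MIMEText


def build_html_message(html, sender, recipient, subject):
    """Build an HTML message whose body is base64-encoded (hypothetical helper)."""
    # Passing an explicit utf-8 charset makes MIMEText base64-encode the
    # payload, so the serialized body is wrapped into short lines.
    msg = MIMEText(html, "html", "utf-8")
    msg["Subject"] = subject
    msg["From"] = sender
    msg["To"] = recipient
    return msg


# One very long line of markup, similar to what ElementTree writes out.
html = '<table border="1">' + "<tr><td>cell</td></tr>" * 250 + "</table>"
message = build_html_message(html, "user@example.com", "user@example.com", "Report")

# Length of the longest line in the message as it would be handed to smtplib.
longest = max(len(line) for line in message.as_string().splitlines())
print(longest)
```

The receiving client decodes the body back to the original markup, so the HTML itself never has to be broken up.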

Related

Python - IBM Watson Language Translator v3 - uploading content of a file and downloading the result

I'm trying to use the Python SDK for IBM Watson Language Translator v3, testing the beta functionality of translating actual documents. Below is my code:-
from ibm_watson import LanguageTranslatorV3
from ibm_cloud_sdk_core.authenticators import IAMAuthenticator
API = "1234567890abcdefg"
GATEWAY = 'https://gateway-lon.watsonplatform.net/language-translator/api'
document_list = []
"""The below authenticates to the IBM Watson service and initiates an instance"""
authenticator = IAMAuthenticator(API)
language_translator = LanguageTranslatorV3(
    version='2018-05-01',
    authenticator=authenticator
)
language_translator.set_service_url(GATEWAY)
submission = language_translator.translate_document(file="myfile.txt", filename="myfile.txt", file_content_type='text/plain', model_id=None, source='en', target='es', document_id=None)
document_list.append(submission.result['document_id'])
while len(document_list) > 0:
    for document in document_list:
        document_status = language_translator.get_document_status(document)
        if document_status.result['status'] == "available":
            translated_document = language_translator.get_translated_document(document)
            document_list.remove(document)
            language_translator.delete_document(document)
A few questions on this:-
When I check the content of 'translated_document', it doesn't actually contain any content. It contains the headers and the HTTP status of the response, but no actual translated content.
I decided to use cURL to download my uploaded document. Instead of the actual content of the .txt file that was uploaded for translation, the downloaded translated file appears to contain the file name itself (myfile.txt).
Researching this and looking at the IBM Watson GitHub repository, it appears that I may have to read the content of 'myfile.txt' into a variable and then pass this variable as 'file={my_variable}' when submitting the translation. But doesn't this defeat the purpose of being able to submit actual documents for translation? How is this different from the conventional service offered?
Can anybody advise me as to what I'm doing wrong? I've tried multiple approaches (for example, writing the value of 'translated_content' to a file), but I can't seem to grab the translated content, nor can I seem to upload the content of the file to the service. Instead, I appear to submit only the filename.
Thanks all

The file parameter of translate_document is supposed to be the actual content to be translated. I realize that's not clear from the documentation, but that's how the service works. So try passing the actual content you want translated in the file parameter.

Logging in to website to access data using Python

I have a subscription to the site https://www.naturalgasintel.com/ for daily feeds of data that show up on their site directly as .txt files; their user login page being https://www.naturalgasintel.com/user/login/
For example a file for today's feed is given by the link https://naturalgasintel.com/ext/resources/Data-Feed/Daily-GPI/2019/01/20190104td.txt and shows up on the site like the picture below:
What I'd like to do is to log in using my user_email and user_password and scrape this data in the form of an Excel file.
When I use Twill to try and 'point' me to the data by first logging me into the site I use this code:
from email.mime.text import MIMEText
from subprocess import Popen, PIPE
import twill
from twill.commands import *
year= NOW[0:4]
month=NOW[5:7]
day=NOW[8:10]
date=(year+month+day)
path = "https://naturalgasintel.com/ext/resources/Data-Feed/Daily-GPI/"
end = "td.txt"
go("http://www.naturalgasintel.com/user/login")
fv("2", "user[email]", user_email)
fv("2", "user[password]", user_password)
fv("2", "commit", "Login")
datafilelocation = path + year + "/" + month + "/" + date + end
go(datafilelocation)
However, logging in from the user login page sends me to this referrer link when I go to the data's location.
https://www.naturalgasintel.com/user/login?referer=%2Fext%2Fresources%2FData-Feed%2FDaily-GPI%2F2019%2F01%2F20190104td.txt
Rather than:
https://naturalgasintel.com/ext/resources/Data-Feed/Daily-GPI/2019/01/20190104td.txt
I've tried using modules like requests as well to log in from the site and then access this data but whatever method I use sends me to the HTML source rather than the .txt data location itself.
I've posted my complete walk-through with the Python 2.7 module Twill which I attached a bounty to here:
Using Twill to grab .txt from login page Python
What would the best solution to being able to access these password protected files be?

If you have a compatible version of Firefox for this, then get the plugin javascript 0.0.1 by Chee and add the following to run on the page:
document.getElementById('user_email').value = "E-What";
document.getElementById('user_password').value = " ABC Password ";
Change the email and password as you like. The page will load, and after that the script will fill in your username and password.
There are other ways to do this entirely by yourself with your own stand-alone process. If you do it this way, you do not have to download other people's programs and learn them (beyond this little thing).
I would have up voted this question.

unnecessary exclamation marks(!)'s in HTML code

I am emailing the content of a text file "gerrit.txt" (http://pastie.org/8289257) in Outlook using the code below.
However, after the email is sent, when I look at the source of the email in Outlook (http://pastie.org/8289379), I see unnecessary
exclamation marks (!) in the code, which mess up the output. Can anyone explain why this happens and how to avoid it?
from email.mime.text import MIMEText
from smtplib import SMTP
def email (body,subject):
    msg = MIMEText("%s" % body, 'html')
    msg['Content-Type'] = "text/html; charset=UTF8"
    msg['Subject'] = subject
    s = SMTP('localhost',25)
    s.sendmail('userid@company.com', ['userid2@company.com'],msg=msg.as_string())
def main ():
    # open gerrit.txt and read the content into body
    with open('gerrit.txt', 'r') as f:
        body = f.read()
    subject = "test email"
    email(body,subject)
    print "Done"
if __name__ == '__main__':
    main()

Some info available here: http://bugs.python.org/issue6327
Note that mailservers have a 990-character limit on each line
contained within an email message. If an email message is sent that
contains lines longer than 990-characters, those lines will be
subdivided by additional line ending characters, which can cause
corruption in the email message, particularly for HTML content. To
prevent this from occurring, add your own line-ending characters at
appropriate locations within the email message to ensure that no lines
are longer than 990 characters.
I think you must split your HTML into several shorter lines. You can use the textwrap.wrap function for this.
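A minimal sketch of that idea, written for Python 3 and using textwrap.fill (which joins the result of textwrap.wrap with newlines). The 900-character width is an arbitrary safety margin I picked, and wrap_html is a made-up helper name. Note the assumptions: textwrap can only break where the markup already contains whitespace, so a run of more than 900 characters without any space would stay on one line, and whitespace-sensitive content such as <pre> blocks would be altered.

```python
import textwrap

MAX_LINE = 900  # assumed safety margin below the 990-character limit


def wrap_html(html, width=MAX_LINE):
    """Re-flow HTML so that lines are at most `width` characters long."""
    # break_long_words=False and break_on_hyphens=False stop textwrap from
    # splitting inside a tag name or attribute value; line breaks are only
    # placed where the markup already has whitespace.
    return textwrap.fill(
        html,
        width=width,
        break_long_words=False,
        break_on_hyphens=False,
    )


# A single long line of markup that contains spaces inside the cells.
html = "<table>" + "<tr><td>cell value</td></tr>" * 200 + "</table>"
wrapped = wrap_html(html)
print(max(len(line) for line in wrapped.splitlines()))
```

Since a line break between words is treated like a space when HTML is rendered, the table should look the same in the mail client.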

Adding a '\n' in my HTML string, some 20 characters before the point where the "!" was appearing, solved my problem.

I also faced the same issue. It happens because Outlook doesn't handle lines longer than 990 characters; it starts showing the issues below.
Nested tables
Color change of column heading
Unwanted ! marks
Here is a solution.
If you are adding a single line, you can insert a line break into it:
line[:40] + "\r\n" + line[40:]
If you are building a table, you can do the same in a loop:
"<td>" + line[j][:40] + "\r\n" + line[j][40:] + "</td>"

In my case the html is being constructed outside of the python script and is passed in as an argument. I added line breaks after each html tag within the python script which resolved my issue:
import re
result_html = re.sub(">", ">\n", html_body)

TurboMail not adding Content-ID when embedding images

My bad. Postmark does not support inline images apparently. Solved by changing smtp-mail provider.
I'm trying to send e-mails with TurboMail using pylons.
Everything works fine, except for using embedded images in html-content. It seems that the Content-ID header for each image is being lost somewhere along the way.
This is my code:
def sendMail(to,subject,html_content,plain_content,images):
    from turbomail import Message as Mail
    mail = Mail(to=to,subject=subject)
    mail.plain = plain_content
    mail.rich = html_content
    for cid,path in images.iteritems():
        mail.embed(path,cid)
    mail.send()
In my tests the html content is:
<html>
<header/>
<body>
<h1>Send images using TurboMail</h1>
<img src="cid:img0" />
</body>
</html>
And the images dict:
{"img0":"path/to/img0"}

When you pass in both a filename and a cid, TurboMail ignores the cid and uses the basename of the file instead. I suspect your filenames have extensions and your cids do not:
{"img0":"path/to/img0.png"}
If so, the images are embedded with a cid of img0.png instead.
You could pass in an open image file instead; TurboMail will then not ignore the name:
def sendMail(to,subject,html_content,plain_content,images):
    from turbomail import Message as Mail
    mail = Mail(to=to,subject=subject)
    mail.plain = plain_content
    mail.rich = html_content
    for cid,path in images.iteritems():
        mail.embed(open(path, 'rb'), cid)
    mail.send()
I'd use marrow.mailer instead; it's the new name for the same package, but the .embed method has been made a little saner in its handling of embedded images and cids.
Note: an earlier revision of this answer had marrow and TurboMail confused, referring to the marrow .embed signature instead.
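For comparison, this is roughly how the Content-ID header ends up on an image part when the message is built by hand with the standard library's email package. It is only a Python 3 sketch, not TurboMail's or marrow.mailer's API. build_related_message and the placeholder image bytes are made up for illustration, and the image subtype is passed explicitly as "png" rather than detected from the data.

```python
from email.mime.image import MIMEImage
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText


def build_related_message(html, images, subtype="png"):
    """Attach inline images; `images` maps a cid to raw image bytes."""
    root = MIMEMultipart("related")
    root.attach(MIMEText(html, "html", "utf-8"))
    for cid, data in images.items():
        part = MIMEImage(data, _subtype=subtype)
        # The HTML refers to the image as src="cid:<cid>", so the header
        # value is the same cid wrapped in angle brackets.
        part.add_header("Content-ID", "<%s>" % cid)
        part.add_header("Content-Disposition", "inline", filename=cid)
        root.attach(part)
    return root


html = '<html><body><img src="cid:img0" /></body></html>'
fake_png = b"\x89PNG\r\n\x1a\n" + b"\x00" * 8  # placeholder bytes, not a real image
message = build_related_message(html, {"img0": fake_png})
print(message.get_payload()[1]["Content-ID"])
```

Printing the header of the generated part like this is also a quick way to check what any mail library actually wrote, before blaming the SMTP provider.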

Apparently, Postmarkapp does not support inline images.

How do I find mime-type in Python [duplicate]

This question already has answers here:
How to find the mime type of a file in python?
(18 answers)
Closed 1 year ago.
I am trying out some CGI scripting in Python. If a request comes to my Python script, how do I find the MIME type of the requested file?
UPDATE: Adding more info
Some images for now, e.g. address/image.jpg. But if I understand correctly, the MIME type is part of the headers that the web browser sends, right? So how do I read these headers?

You have two options. If you're lucky, the client can determine the mimetype of the file and include it in the form post. Usually this is the value of an input element whose name is "filetype" or something similar.
Otherwise you can guess the mimetype from the file extension on the server. This is somewhat dependent on how up-to-date the mimetypes module is. Note that you can add types or override types in the module. Then you use the "guess_type" function, which infers the mimetype from the filename's extension.
import mimetypes
mimetypes.add_type('video/webm','.webm')
...
mimetypes.guess_type(filename)
UPDATE: If I remember correctly, you can get the client's interpretation of the mimetype from the "Content-Type" header. A lot of the time this turns out to be 'application/octet-stream', which is almost useless.
So assuming you're using the cgi module, and you're uploading files with the usual multipart form, the browser is going to guess the mimetype for you. It seems to do a decent job of it, and it gets passed through to the form.type parameter. So you can do something like this:
import cgi
form = cgi.FieldStorage()
files_types = {}
if form.type == 'multipart/form-data':
    for part in form.keys():
        files_types[form[part].filename] = form[part].type
else:
    files_types[form.filename] = form.type

Assuming you are submitting files with an HTML form like this:
<form enctype="multipart/form-data" method="post">
<input name="userfile" type="file" />
<input type="submit" value="Send File" />
</form>
You can get the file type with this in a Python CGI script:
form = cgi.FieldStorage()
filetype = form['userfile'].type
Take into account that the submitted file type is determined by the client's web browser. It is not detected by the Python CGI library or the HTTP server. You may use the Python mimetypes module to guess MIME types from file extensions (e.g. mimetypes.guess_type(form['userfile'].filename); note that .name holds the form field name, not the uploaded file's name). You could also use the UNIX file command for the highest reliability, although this is usually too costly.
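A small sketch of the extension-based guess, in Python 3. guess_mime and the fallback to application/octet-stream are my own choices for illustration. mimetypes.guess_type only looks at the file name, never at the file's content, and it returns a (type, encoding) pair where the type is None for unknown extensions.

```python
import mimetypes


def guess_mime(filename, default="application/octet-stream"):
    """Guess a MIME type from the file extension alone (hypothetical helper)."""
    mime, _encoding = mimetypes.guess_type(filename)
    # guess_type returns None as the type when the extension is unknown.
    return mime or default


for name in ("image.png", "notes.txt", "upload.zzqqx"):
    print(name, guess_mime(name))
```

Because the guess depends only on the name, a renamed file will be misreported, which is why checking the content (for example with the file command) is more reliable.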

You can install the filemime package with pip install filemime and use it as follows. It finds the file type and MIME type for all types of files.
from filemime import filemime
fileObj = filemime()
mime = fileObj.load_file("Kodi_Kodi.mp3",mimeType=True)
print(f"mime-type: {mime}")
output:
mime-type: 'audio/mpeg'
