Python-Parse email Body and truncate MIME headers - python

I have an email body which looks somewhat like .
Now I want to remove all the header from it and just have the conversation email text. How can I do it in python?
I tried email.parser module but that isn't giving me the result which I want.
Please find the below code for more information.
import email
a="""--c66f5985-233d-4e89-b598-6398b60cbe00
Content-Type: multipart/alternative;
differences="Content-Type";
boundary="d5eff9f8-76b3-4320-adfb-1e51add8fa8f"
--d5eff9f8-76b3-4320-adfb-1e51add8fa8f
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: quoted-printable
THis is a demo email body
Thanks And Regards,
Ana
"""
b = email.message_from_string(a)
if b.is_multipart():
for payload in b.get_payload():
# if payload.is_multipart(): ...
print (payload.get_payload())
else:
print (b.get_payload())

import imaplib,email
hst = "your.host.adresse.com"
usr = "login"
pwd = "password"
imap = imaplib.IMAP4(hst)
try:
imap.login(usr, pwd)
except Exception as e:
raise IOError(e)
try:
imap.select("Inbox") # Tell Imap where to go
result, data = imap.uid('search', None, "ALL")
latest = data[0].split()[-1]
result, data = imap.uid('fetch', latest, '(RFC822)')
a = data[0][1] # This contains the Mail Data
except Exception as e:
raise IOError(e)
b = email.message_from_string(a)
if b.is_multipart():
for payload in b.get_payload():
b = (payload.get_payload())
else:
b = (b.get_payload())
print b
This removes all the stuff from the mail you don't want in the final text. I've tested this with your code. You didn't show how you import the mail (your a) so i guess that's where you get the decoding problem from.
If you have any trouble with HTML Mails:
from bs4 import BeautifulSoup
soup = BeautifulSoup(b, 'html.parser')
soup = soup.get_text()
print soup
That should do the job for now, but I'd advise you to change the default python parser to lxml or html5lib.

Related

Sending a pandas Dataframe using smtplib

I've seen a lot of threads here about this topic, however, none regarding this specific question.
I am sending a email with a pandas dataframe (df) as an html using pandas built in df.to_html() method. The email sends successfully. However, the df is displayed in the email as html, not in the desired table format. Can anyone offer assistance on how to ensure the df is displayed as a table, not in html in the email? The code is below:
import requests
import pandas as pd
import smtplib
MY_LAT =
MY_LNG =
API_KEY = ""
parameters = {
"lat": MY_LAT,
'lon': MY_LNG,
'exclude': "",
"appid": API_KEY
}
df = pd.read_csv("OWM.csv")
response = requests.get("https://api.openweathermap.org/data/2.5/onecall", params=parameters)
response.raise_for_status()
data = response.json()
consolidated_weather_12hour = []
for i in range(0, 12):
consolidated_weather_12hour.append((data['hourly'][i]['weather'][0]['id']))
hour7_forecast = []
for hours in consolidated_weather_12hour:
weather_id = df[df.weather_id == hours]
weather_description = weather_id['description']
for desc in weather_description.iteritems():
hour7_forecast.append(desc[1])
times = ['7AM', '8AM', '9AM', '10AM', '11AM', '12PM', '1PM', '2PM', '3PM', '4PM', '5PM', '6PM']
col_header = ["Description of Expected Weather"]
weather_df = pd.DataFrame(data=hour7_forecast, index=times, columns=col_header)
my_email = ""
password = ""
html_df = weather_df.to_html()
with smtplib.SMTP("smtp.gmail.com", 587) as connection:
connection.starttls() # Makes connection secure
connection.login(user=my_email, password=password)
connection.sendmail(from_addr=my_email, to_addrs="",
msg=f"Subject: 12 Hour Forecast Sterp"
"""\
<html>
<head></head>"
<body>
{0}
<body>
</html>
""".format(html_df))
just use df.to_html() to convert it into an html table that you can include in your html email
then when you send the mail you must set the mimetype to html
smtp = smtplib.SMTP("...")
msg = MIMEMultipart('alternative')
msg['Subject'] = subject_line
msg['From'] = from_addr
msg['To'] = ','.join(to_addrs)
# Create the body of the message (a plain-text and an HTML version).
part1 = MIMEText(plaintext, 'plain')
part2 = MIMEText(html, 'html')
smtp.sendmail(from_addr, to_addrs, msg.as_string())
you can use the library html2text to convert your html to markdown for clients that do not support html content (not many these days) if you do not feel like writing the plaintext on your own
as an aside... using jinja when you are working with html tends to simplify things...

How to extract an exact value from a response in Json?

After a Post action on a certain Link I get the following answer
{"data":{"loginWithEmail":{"__typename":"LoginResponse","me":{"__typename":"User","username":"davishelenekb","displayname":"davishelenekb","avatar":"https://image.sitecdn.com/avatar/default11.png","partnerStatus":"NONE","role":"None","myChatBadges":[],"private":{"__typename":"UserPrivateInfo","accessToken":"eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJ1c2VybmFtasdasdasdaslzaGVsZW5la2IiLCJkaXNwbGF5bmFtZSI6ImRhdmlzaGVsZW5la2IiLCJhdmF0YXIiOiJodHRwczovL2ltYWdlLmRsaXZlY2RuLmNvbS9hdmF0YXIvZGVmYXVsdDExLnBuZyIsInBhcnRuZXJfc3RhdHVzX3N0cmluZyI6Ik5PTkUiLCJpZCI6IiIsImxpZCI6MCwidHlwZSI6ImVtYWlsIiwicm9sZSI6Ik5vbmUiLCJvYXV0aF9hcHBpZCI6IiIsImV4cCI6MTYwOTE4NDQwNyadasdasdaNTkyNDA3LCJpc3MiOiJETGl2ZSJ9.cQXJFUEo7r4bQa2FPHvKAvjisEF1VKldhFdxOcZ3YTk","email":"email","emailVerified":true,"bttAddress":{"__typename":"MyBTTAddress","senderAddress":null}},"subCashbacked":true},"accessToken":"eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJ1c2VybmFtZSI6ImRhdmlzaGVsZW5la2IiLCJkaXNwbGF5bmFtZSI6ImRhdmlzaGVsZW5la2IiLCJhdmF0YXIiOiJodHRwczovL2ltYWdlLmRsaXZlY2RuLmNvbS9hdmF0YXIvZGVmYasdasdyIsInBhcnRuZXJfc3RhdHVzX3N0cmluZyI6Ik5PTkUiLCJpasdasdlwZSI6ImVtYWlsIiwicm9sZSI6Ik5vbmUiLCJvYXV0aF9hcHBpZCI6IiIsImV4cCI6MTYwOTE4NDQasd221DA3LCJpc3MiOiJETGl2ZSJ9.cQXJFUEo7r4bQa2FPHvKAvjisEF1VKldhFdxOcZ3YTk","twofactorToken":null,"err":null}}}
I just want to extract the key that is in
"accessToken":"KEY",
How can I do this?
My Code
import requests
import json
from fake_useragent import UserAgent
#Set Modules
ua = UserAgent()
url = 'site'
#Read TXT
accounts = 'accounts\\accounts.txt'
with open(accounts) as line:
login = line.readline()
line = login.split(",")
cnt = 1
email = line[0]
password = line[1]
#login
head = {
'.......': '.........',
}
data = {
..........
}
test = requests.post(url, json.dumps(data), headers=head)
if test.status_code == 200:
print('Loged!')
print(test.text)
else:
print('Error') ```
You can take the text of the response, parse it as JSON, and then access the "accessToken" property:
test = requests.post(url, json.dumps(data), headers=head)
if test.status_code == 200:
parsed = json.loads(test.text)
key = parsed['data']['loginWithEmail']['accessToken']
print(key)
Side note:
This snippet assumes that the format of the returned JSON is well known and no error occurs. In a real-world scenario, you may want to add a few validations to it.
You can achieve what you need like this:
response = json.loads(test.text)
print(response["data"]["loginWithEmail"]["me"]["private"]["accessToken"])

Python IMAPLIB HTML body parsing

so basically i have this script that runs continuously and when a new email arrives in the inbox with specific text in the subject, it grabs information from the email. I have only managed to get it to pull the subject from the email but I cant get it do get the body of the email no matter what I try, I believe the email body is in HTML so i attempted to use BeautifulSoup to parse the body but that doesnt work at all. Please help!!! :( Here is what i have so far:
import email
import imaplib
from bs4 import BeautifulSoup
import time
import sys
username = 'xxx.xxx#xxx.xx'
password = 'xxxxxx'
mail = imaplib.IMAP4_SSL('imap-mail.outlook.com')
(retcode, capabilities) = mail.login(username, password)
mail.list()
n=0
while True:
mail.select('inbox')
(retcode, messages) = mail.search(None, 'UNSEEN', '(SUBJECT "xxxxxxx-
")', '(FROM "xx.xx#xxxx.xx")')
if retcode == 'OK':
for num in messages[0].split():
n=n+1
print('Processing Email ' + str(n))
typ, data = mail.fetch(num, '(RFC822)')
for response_part in data:
if isinstance(response_part, tuple):
original = email.message_from_bytes(response_part[1])
print("Subject: " + original['Subject'])
typ, data = mail.store(num,'+FLAGS','\\Seen')
time.sleep(120)
Comment: The "body" returned by imap.fetch are usually bytes, not a string, which throws an exception
Change to:
msg = email.message_from_bytes(body)
Question: I cant get it do get the body of the email
For example:
import email, imaplib
username = 'xxx.xxx#xxx.xx'
password = 'xxxxxx'
imap = imaplib.IMAP4_SSL('imap-mail.outlook.com')
imap.login(username, password)
imap.select("inbox")
resp, items = imap.search(None, "(UNSEEN)")
for n, num in enumerate(items[0].split(), 1):
resp, data = imap.fetch(num, '(RFC822)')
body = data[0][1]
msg = email.message_from_string(body)
content = msg.get_payload(decode=True)
print("Message content[{}]:{}".format(n, content))

Getting the xml soap response instead of raw data

I am trying to get the soap response and read few tags and then put the key and values inside a dictionary.
Better would be if I could use the response generated directly and preform regd operations to it.
But since I was not able to do that, I tried storing the response in an xml file and then using that for operations.
My problem is that the response generated is in a raw form. How to resolve this.
Example: <medical:totEeCnt val="2" />
<medical:totMbrCnt val="2" />
<medical:totDepCnt val="0" />
def soapTest():
request = """<soapenv:Envelope.......
auth = HTTPBasicAuth('', '')
headers = {'content-type': 'application/soap+xml', 'SOAPAction': "", 'Host': 'bfx-b2b....com'}
url = "https://bfx-b2b....com/B2BWEB/services/IProductPort"
response = requests.post(url, data=request, headers=headers, auth=auth, verify=True)
# Open local file
fd = os.open('planRates.xml', os.O_RDWR|os.O_CREAT)
# Convert response object into string
response_str = str(response.content)
# Write response to the file
os.write(fd, response_str)
# Close the file
os.close(fd)
tree = ET.parse('planRates.xml')
root = tree.getroot()
dict = {}
print root
for plan in root.findall('.//{http://services.b2b.../types/rates/dental}dentPln'): # type: object
plan_id = plan.get('cd')
print plan
print plan_id
for rtGroup in plan.findall('.//{http://services.b2b....com/types/rates/dental}censRtGrp'):
#print rtGroup
for amt in rtGroup.findall('.//{http://services.b2b....com/types/rates/dental}totAnnPrem'):
# print amt
print amt.get('val')
amount = amt.get('val')
dict[plan_id] = amount
print dict
Update-:
I did few things, what I am not able to understand is that ,
using this, the operations further are working,
tree = ET.parse('data/planRates.xml')
root = tree.getroot()
dict = {}
print tree
print root
for plan in root.findall(..
output -
<xml.etree.ElementTree.ElementTree object at 0x100d7b910>
<Element '{http://schemas.xmlsoap.org/soap/envelope/}Envelope' at 0x101500450>
But after using this ,it is not working
tree = ET.fromstring(response.text)
print tree
for plan in tree.findall(..
output-:
<Element '{http://schemas.xmlsoap.org/soap/envelope/}Envelope' at 0x10d624910>
Basically I am using the same object only .
Supposing you get a response that you want as proper xml object:
rt = resp.text.encode('utf-8')
# printed rt is something like:
'<soap:Envelope xmlns:soap="http://...">
<soap:Envelope
<AirShoppingRS xmlns="http://www.iata.org/IATA/EDIST" Version="16.1">
<Document>...</soap:Envelope</soap:Envelope>'
# striping soapEnv
startTag = '<AirShoppingRS '
endTag = '</AirShoppingRS>'
trimmed = rt[rt.find(startTag): rt.find(endTag) + len(endTag)]
# parsing
from lxml import etree as et
root = et.fromstring(trimmed)
With this root element you can use find method, xpath or whatever you prefer.
Obviously you need to change the start and endtags to extract the correct element from the response but you get the idea, right?

how to post sessionid using python 3

I'm a noob and I need to use the sessionid to post other commands like search.do, Im using Python 3.5 but Im not sure the best way to get and post it.
here is how I posted the request.
import urllib.parse
url = 'https://myapi.application.com/dmapi/login.do'
values = {'account' : 'MYACCOUNT', 'username': 'admin', 'password': 'pas1234', 'appid':'12346'}
data = urllib.parse.urlencode(values)
data = data.encode('utf-8') # data should be bytes
req = urllib.request.Request(url, data)
resp = urllib.request.urlopen(req)
respData = resp.read()
print(respData)
printing gets this result.
b'errorcode=0\r\nsessionid=ef9a9cbd-e063-4be2-9301-9de59891304c\r\n'
I need to use the sessionid in subsequent request. Whats the best way to go about this.
In fact the response in composed of lines (in bytes) one of which contains the session id. You could simply read and parse what you get:
resp = urllib.request.urlopen(req)
errorcode = None
sessionid = None
for line in resp.read():
line = line.strip() # remove end of line
if line.startswith(b'errorcode'):
errorcode = line.split(b'=')[1]
if line.startswith(b'sessionid'):
sessionid = line.split(b'=')[1]
One idea is to split by sessionid= and extract the last item:
>>> respData.split("sessionid=")[-1].strip()
'ef9a9cbd-e063-4be2-9301-9de59891304c'
Another, is to use a regular expression:
>>> import re
>>>
>>> re.search(r"sessionid=([A-Za-z0-9-]+)", respData).group(1)
'ef9a9cbd-e063-4be2-9301-9de59891304c'

Categories

Resources