I am using Python to read the Enron email dataset. I have the emails in text files. I would like to read the text files and extract only the "Body" section of each email. I am not concerned about any other FROM, TO, BCC, attachments, DATE, etc. I only want the BODY section and would like to store it in a list. I tried to use the get_payload() function, but it still prints everything. How do I skip the other content and use only the Body section?
import email.parser
from email.parser import Parser
# Code to extract a particular section from raw emails.
parser = Parser()
text1 = open("path of the file", "r").read()
msg = email.message_from_string(text1)
email = parser.parsestr(text1)
if msg.is_multipart():
    for payload in msg.get_payload():
        print(payload.get_payload())
else:
    print(msg.get_payload())
One file may contain multiple emails. Sample emails.
docID: 1
segmentNumber: 0
Body: I just checked with Carolyn on your invoicing for the conference. She
verified the 85K was processed.
##########################################################
docID: 2
segmentNumber: 0
Body: null
##########################################################
docID: 3
segmentNumber: 0
Body: In regard to the costs for the GAM conference, Karen told me the $ 6,695.97
figure was inclusive of all the items for the conference. However, after
speaking with Shweta, I found out this is not the case. The CDs are not
included in this figure.
The CD cost will be $2,011.50 + the cost of postage/handling (which is
currently being tabulated).
##########################################################
docID: 3
segmentNumber: 1
Body:
This is the original quote for this project and it did not include the
postage. As soon as I have the details from the vendor, I'll forward those to
you.
Please call me if you have any questions.
Assuming all your files have the format shown in your example, this might work:
email_body_list = [ email.split('Body:')[-1] for email in file_content.split('##########################################################')]
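If some records deviate a little from the sample (a `Body: null` placeholder, a body that starts on the next line, a separator of a different length), a more defensive variant may help. This is only a sketch: the name `extract_bodies`, the `skip_null` flag, and the rule that a separator is any line made up solely of ten or more `#` characters are my assumptions, not something the dataset guarantees.

```python
import re

# Assumption: a separator is any line consisting only of '#' characters (10 or more).
SEPARATOR_RE = re.compile(r"^#{10,}[ \t]*$", flags=re.MULTILINE)


def extract_bodies(file_content, skip_null=True):
    """Return the text after 'Body:' for every record in file_content."""
    bodies = []
    for record in SEPARATOR_RE.split(file_content):
        # partition() splits at the first 'Body:'; marker is '' when it is absent.
        _, marker, body = record.partition("Body:")
        if not marker:
            continue
        body = body.strip()
        # Optionally drop empty bodies and the literal 'null' placeholder.
        if skip_null and body in ("", "null"):
            continue
        bodies.append(body)
    return bodies
```

Reading a file with `open(path).read()` and passing the result to `extract_bodies` should give a list of body strings. Note that this splits at the first `Body:` in each record, whereas the one-liner keeps only the text after the last one; which is right depends on whether a body can itself contain the text `Body:`.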
I'm trying to write a Python script that sends email notifications to my team members on a daily basis.
There are two Excel files, let's say abc.xlsx and def.xlsx.
I already have a script that updates these files and saves them. (These files abc and def are deleted and recreated with the same name but with updated information.)
Now my goal is to attach the file abc as an attachment in the mail and add the contents of def.xlsx in the email body.
I’m trying to achieve this:
Hello All,
Please find the pending lists here as follows:
///The info from def.xlsx sheet comes here///
Thanks and regards!
/// my outlook signature///
Here is my code:
import win32com.client as win32
import pandas as pd
# reading a file, which needs to be on mail body
df1 = pd.read_excel('def.xlsx')
html_table = df1.to_html(index=False)
outlook = win32.gencache.EnsureDispatch('Outlook.Application')
mail = outlook.CreateItem(0)
mail.To = 'mail#me.com'
mail.CC = 'mail#me.com'
mail.Subject = 'Test mail'
# path to signature should be User\AppData\Roaming\Microsoft\Signatures\signature.htm
pathToIMage = r'path_to_my_signature'
attachment = mail.Attachments.Add(pathToIMage)
attachment.PropertyAccessor.SetProperty("http://schemas.microsoft.com/mapi/proptag/0x3712001F", "MyId1")
# modify the mail body as per need
mail.Attachments.Add(Source="C:\..abc.xlsx")
body = "<p>Hi All, Please find the updates pending updates below:" + html_table + " <br>Thanks and regards <p> <figure><img src=""cid:MyId1""</figure>"
mail.HTMLBody = (body)
mail.Send()
Challenges:
In the test email, my signature shows up as a broken image with an "x" in it.
My Excel sheet, which has to appear in the body, does not keep its formatting.
I copied all of the code from Stack Overflow. I did some research of my own, but I'm not getting the expected output.
First, you may try setting the BodyFormat property before setting up the HTMLBody property.
Second, to get the signature added to the message body you need to call the Display method before setting up the HTMLBody property.
Third, the <figure> element is not supported in Outlook because Word is used as an email editor and applies its own business rules to message bodies.
Fourth, the HTMLBody property returns or sets a string which represents the message body, it is expected to get or set a full-fledged well-formed HTML document. Try to set up a well-formed HTML document and then set up a property.
If you need to preserve formatting from Excel you may copy the table to the clipboard and then paste it using the Word object model.
Be aware that the Outlook object model supports three main ways of customizing the message body:
The Body property returns or sets a string representing the clear-text body of the Outlook item.
The HTMLBody property of the MailItem class returns or sets a string representing the HTML body of the specified item. Setting the HTMLBody property will always update the Body property immediately. For example:
Sub CreateHTMLMail()
    'Creates a new e-mail item and modifies its properties.
    Dim objMail As Outlook.MailItem
    'Create e-mail item
    Set objMail = Application.CreateItem(olMailItem)
    With objMail
        'Set body format to HTML
        .BodyFormat = olFormatHTML
        .HTMLBody = "<HTML><BODY>Enter the message text here. </BODY></HTML>"
        .Display
    End With
End Sub
The Word object model can be used for dealing with message bodies. See Chapter 17: Working with Item Bodies for more information.
Note, the MailItem.BodyFormat property allows you to programmatically change the editor that is used for the body of an item.
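The fourth point above (HTMLBody expects a complete, well-formed HTML document) could be sketched in Python roughly as follows. The helper name `build_html_document` and its parameters are hypothetical; `table_html` is assumed to be markup that is already valid, for example the string the question gets from `df1.to_html(index=False)`.

```python
import html


def build_html_document(greeting, table_html, closing):
    """Return one complete HTML document suitable for assigning to HTMLBody.

    greeting and closing are plain text and get escaped; table_html is
    assumed to be valid HTML already and is inserted unchanged.
    """
    return (
        "<html><body>"
        f"<p>{html.escape(greeting)}</p>"
        f"{table_html}"
        f"<p>{html.escape(closing)}</p>"
        "</body></html>"
    )
```

The result is a single string with exactly one `<html>` and one `<body>` element, so every tag opened in the wrapper is also closed. Whether Outlook then renders the table the way Excel does is a separate matter that this sketch does not address.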
I modified it. I'm still working on Challenge 2. I'll just go through the documentation that has been recommended and will share my final script.
import win32com.client as win32
import pandas as pd
import os
import codecs
df1 = pd.read_excel('mail_body.xlsx')
html_table = df1.to_html(index=False)
# below is the coding logic for signature
sig_files_path = 'AppData\\Roaming\\Microsoft\\Signatures\\' + 'signature_file_name' + '_files\\'
sig_html_path = 'AppData\\Roaming\\Microsoft\\Signatures\\' + 'signature_file_name' + '.htm'
signature_path = os.path.join((os.environ['USERPROFILE']), sig_files_path)
html_doc = os.path.join((os.environ['USERPROFILE']), sig_html_path)
html_doc = html_doc.replace('\\\\', '\\')
html_file = codecs.open(html_doc, 'r', 'utf-8', errors='ignore')
signature_code = html_file.read()
signature_code = signature_code.replace(('signature_file_name' + '_files/'), signature_path)
html_file.close()
outlook = win32.gencache.EnsureDispatch('Outlook.Application')
mail = outlook.CreateItem(0)
mail.To = 'mail#me.com'
mail.CC = 'mail#me.com'
mail.Subject = 'TEST EMAIL'
mail.Attachments.Add(Source=r"C:\..abc.xlsx")
# modify the mail body as per need
mail.BodyFormat = 2
body = "<p>Hi All, Please find the updates pending updates below:" + html_table + " <br>Thanks and regards <br><br>"
mail.Display()
mail.HTMLBody = body + signature_code
mail.Send()
I am reading a .txt file in Python and want the mail body to match the contents of that text file.
It works, but hyperlinks are not rendered in my Outlook email; they show up only as plain text.
Below is the code:
Mail_Content = open("MailBody.txt","r")
Read_Content = Mail_Content.read()
In the text file, I pass the content for the hyperlink like this:
linkname,'html'
Please help me out; I have been trying to fix this for the last two days.
Firstly, you really need to show the code that sets the message body. Secondly, make sure you set the MailItem.HTMLBody rather than the plaintext MailItem.Body.
Make sure the BodyFormat property is set correctly before setting the HTMLBody property in the code. For example, here is a VBA sample which shows how to set it up properly:
Sub CreateHTMLMail()
    'Creates a new email item and modifies its properties.
    Dim objMail As MailItem
    'Create mail item
    Set objMail = Application.CreateItem(olMailItem)
    With objMail
        'Set body format to HTML
        .BodyFormat = olFormatHTML
        .HTMLBody = "<HTML><H2>The body of this message will appear in HTML.</H2><BODY>Type the message text here. </BODY></HTML>"
        .Display
    End With
End Sub
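Since the question is in Python, the same order of operations as the VBA sample might look like the sketch below. The helper names `read_body` and `apply_html_body` are made up for illustration; the value 2 is the numeric value of `olFormatHTML`. The commented usage assumes Windows with Outlook and pywin32 installed, and that MailBody.txt contains HTML markup rather than plain text.

```python
OL_FORMAT_HTML = 2  # numeric value of Outlook's olFormatHTML constant


def read_body(path, encoding="utf-8"):
    """Read the whole message body from a text file."""
    with open(path, "r", encoding=encoding) as handle:
        return handle.read()


def apply_html_body(mail_item, html_text):
    """Set the body format first, then the HTML body, as in the VBA sample."""
    mail_item.BodyFormat = OL_FORMAT_HTML
    mail_item.HTMLBody = html_text
    return mail_item


# Intended use on Windows with Outlook and pywin32 installed (not run here):
#   import win32com.client
#   mail = win32com.client.Dispatch("Outlook.Application").CreateItem(0)
#   apply_html_body(mail, read_body("MailBody.txt"))
#   mail.Display()
```

If the text were assigned to `Body` instead of `HTMLBody`, any `<a href=...>` markup in the file would be expected to show up as literal text, which matches the symptom described in the question.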
I am trying to use the GitLab Projects API to edit the MR templates of multiple projects. The problem is that it only sends the first line of the Markdown template.
While messing around with the script, I tried converting the template to HTML and found that the whole template was sent once it was converted.
I am probably missing something super simple, but for the life of me I can't figure out why it sends the entire template as HTML but only the first line as native Markdown.
I have been searching for a solution for a bit now, so I apologize if my Google-fu missed an obvious answer here.
Here is the script...
#! /usr/bin/env python3
import argparse
import requests
gitlab_addr = "https://gitlab.com/api/v4"
# Insert your project IDs into the array below.
project_IDs = [xxxx, yyyy, zzzz]
# Insert your MR template info below.
with open('/.gitlab/merge_request_templates/DefaultMRTemplate.md', 'r') as file:
    MR_template = file.read()
#print(MR_template)
def getArgs():
    parser = argparse.ArgumentParser(
        description='This tool updates the default template for a single '
                    'or multiple program\'s MRs. \n\nYou will need to edit '
                    'the script to input your MR template and projects IDs.'
                    '\nYou will also need to pass in your API Token via '
                    ' command line.\n\nYou want to see "200 OK" on the '
                    ' command line as confirmation.',
        formatter_class=argparse.RawTextHelpFormatter)
    parser.add_argument("token", type=str,
                        help="API Token. Create one at User Settings / Access Tokens")
    return parser.parse_args()
def ChangeTemplate():
    token = getArgs().token
    headers = {"PRIVATE-TOKEN": token, }
    for x in project_IDs:
        addr = f"{gitlab_addr}/projects/{x}/?merge_requests_template={MR_template}"
        response = requests.put(addr, headers=headers)
        # You want to see "200 OK" on the command line.
        print(response.status_code, response.reason)
def main():
    ChangeTemplate()
if __name__ == '__main__':
    main()
Here is a sample template...
See guidance here: https://example.com/Gitlab+MR+Guide
## Description
%% Put a description here %%
%% Add an issue link here %%
## Tests
%% Include test listing here %%
## Checklists
**Author Checklist**
- [ ] A: Did you fill out the description, add an issue link (in title or desc) and fill out the test section?
- [ ] A: Add a peer to the MR
**Assignee 1 Checklist:**
- [ ] P: Verify the description field is filled out, issue link is included (in title or desc) and the test section is filled out
- [ ] P: Add a code owner to the MR
**Assignee 2 (Code Owner) Checklist:**
- [ ] O: Verify the description field is filled out, issue link is included (in title or desc) and the test section is filled out
- [ ] O: Verify unit test coverage is at least 40% line coverage with a goal of 90%
problem output...
See guidance here: https://example.com/Gitlab MR Guide
Your data needs to be properly encoded in the request. Trying to format the literal contents of the file into the query string won't work here.
Use the data keyword argument to requests.put, which will pass the data in the request body (or use params to set query params). requests will handle the proper encoding of the data.
addr = f"{gitlab_addr}/projects/{x}/"
payload = {'merge_requests_template': MR_template}
response = requests.put(addr, headers=headers, data=payload)
# or params=payload to use query string
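To see why the raw template gets cut off, it may help to look at how a URL parser reads the unencoded string. The sketch below uses only the standard library's `urllib.parse`; it illustrates the general URL rules, not what requests or GitLab do internally. The shortened `template` and the project ID `1` are placeholders.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Shortened stand-in for the template; the project ID 1 is a placeholder.
template = "See guidance here: https://example.com/Gitlab+MR+Guide\n## Description\n"

# What the original script builds: raw text pasted into the query string.
raw_url = f"https://gitlab.com/api/v4/projects/1/?merge_requests_template={template}"
parts = urlsplit(raw_url)
# The first '#' starts the URL fragment, so the query stops just before it,
# and an unencoded '+' in a query string is decoded as a space.
received = parse_qs(parts.query)["merge_requests_template"][0]

# With proper encoding, '#', '+' and newlines are percent-encoded.
encoded = urlencode({"merge_requests_template": template})
round_trip = parse_qs(encoded)["merge_requests_template"][0]

print(repr(received))          # only the first line survives
print(encoded)                 # the percent-encoded form
print(round_trip == template)  # the encoded form decodes back to the original
```

The `##` of the first Markdown heading is where the query ends, which would explain why only the first line arrives and why the `+` signs in the link turn into spaces. Passing the payload through `data=` or `params=` lets requests do this encoding for you.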
I am trying to send e-mails to a list of contacts, along with a blind copy (BCC) to myself, using Yagmail and Python. I couldn't find any examples in the Yagmail documentation that described how to do this. I know it's possible, but I keep getting an error with my current code.
Can anyone help me resolve this?
Note: This code works until I add "bcc" as a method-parameter.
The Code:
yag = yagmail.SMTP(
user={real_sender:alias_sender}, password="xxxxxx", host='smtp.xxxxxx.com', port='587',
smtp_starttls=True, smtp_ssl=None, smtp_set_debuglevel=0, smtp_skip_login=False,
encoding='utf-8', oauth2_file=None, soft_email_validation=True)
to = all_receivers ### list of contacts 1
bcc = all_receivers_bcc ### list of contacts 2
subject = 'SUBJECT HERE'
contents = 'HTML CONTENT HERE'
yag.send(to, bcc, subject, contents) ### FAILS HERE WHEN THE "bcc" is added
You need to tell Python which parameter each argument belongs to. If you don't, the arguments must be passed in exactly the order the method expects. Try this:
yag.send(to=all_receivers, bcc=all_receivers_bcc , subject='SUBJECT HERE', contents='HTML CONTENT HERE')
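The difference between the two call styles can be sketched with a stand-in function. This `send` is hypothetical and only reports how its arguments were bound; its parameter order (`to`, `subject`, `contents`, `attachments`, `cc`, `bcc`) is my assumption about how yagmail orders them, so check the signature of your installed version.

```python
def send(to=None, subject=None, contents=None, attachments=None, cc=None, bcc=None):
    """Hypothetical stand-in: report which parameter each argument was bound to."""
    return {
        "to": to,
        "subject": subject,
        "contents": contents,
        "attachments": attachments,
        "cc": cc,
        "bcc": bcc,
    }


to = ["first@example.com"]
bcc = ["second@example.com"]

# Positional call as in the question: each value lands in the next slot.
positional = send(to, bcc, "SUBJECT HERE", "HTML CONTENT HERE")

# Keyword call as in the answer: each value lands where it is named.
keyword = send(to=to, bcc=bcc, subject="SUBJECT HERE", contents="HTML CONTENT HERE")

print(positional)
print(keyword)
```

In the positional call the BCC list ends up as the subject, the subject as the contents, and the contents as the attachments, while `bcc` stays `None`. With a parameter order like this one, that mix-up would account for the failure described in the question.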
I think this code will work, please test:
This example passes the contact lists and the other values as keyword arguments.
yag = yagmail.SMTP(
    user={real_sender:alias_sender}, password="xxxxxx", host='smtp.xxxxxx.com', port='587',
    smtp_starttls=True, smtp_ssl=None, smtp_set_debuglevel=0, smtp_skip_login=False,
    encoding='utf-8', oauth2_file=None, soft_email_validation=True)
all_receivers = ['aContact1#gmail.com', 'aContact2#gmail.com', 'aContact3#gmail.com']  # contacts list
all_receivers_bcc = ['bbcContact1#gmail.com', 'bbcContact2#gmail.com', 'bbcContact3#gmail.com']  # BCC contacts list
subject = 'SUBJECT HERE'
contents = 'HTML CONTENT HERE'
yag.send(to=all_receivers, subject=subject, contents=contents, bcc=all_receivers_bcc)
I would like to create an hyperlink in the body of a task created through win32com.
This is my code so far:
import datetime as dt
import win32com.client
outlook = win32com.client.Dispatch("Outlook.Application")
outlook_task_item = 3
recipient = "my_email#site.com"
task = outlook.CreateItem(outlook_task_item)
task.Subject = "hello world"
task.Body = "please update the file here"
task.DueDate = dt.datetime.today()
task.ReminderTime = dt.datetime.today()
task.ReminderSet = True
task.Save()
I have tried to set the property task.HTMLBody but I get the error:
AttributeError: Property 'CreateItem.HTMLBody' can not be set.
I have also tried
task.Body = "Here is the <a href='http://www.python.org'>link</a> I need"
but I am not getting a proper hyperlink.
However, if I create a task through the Outlook front end, I am able to add hyperlinks.
You can also try:
task.HTMLBody = "Here is the <a href='http://www.python.org'>link</a> I need"
This overwrites the data in 'task.Body' with the HTML provided in 'task.HTMLBody',
so whichever of the two (Body or HTMLBody) is set last is taken as the body of the mail.
Tasks do not support HTML. Instead, you have to provide RTF.
You can investigate -- but not set -- the RTF of a given task through task.RTFBody (and task.RTFBody.obj to get a convenient view of it). To use RTF in the body of a task, simply use the task.Body property; setting this to a byte array containing RTF will automatically use that RTF in the body. Concretely, to get the body you want, you could let
task.Body = rb'{\rtf1{Here is the }{\field{\*\fldinst { HYPERLINK "https://www.python.org" }}{\fldrslt {link}}}{ I need}}'
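If the link text or URL varies, the RTF bytes can be assembled by a small helper instead of being typed by hand. This is a sketch: `rtf_escape` and `rtf_hyperlink_body` are names I made up, and it assumes ASCII-only text and a URL without double quotes.

```python
def rtf_escape(text):
    """Escape the three characters that have special meaning in RTF."""
    return text.replace("\\", "\\\\").replace("{", "\\{").replace("}", "\\}")


def rtf_hyperlink_body(before, url, link_text, after):
    """Build RTF bytes of the form: before + hyperlink(link_text -> url) + after.

    Assumptions: all text is ASCII and url contains no double quote.
    """
    parts = [
        r"{\rtf1",
        "{" + rtf_escape(before) + "}",
        r"{\field{\*\fldinst { HYPERLINK " + '"' + url + '"' + " }}",
        r"{\fldrslt {" + rtf_escape(link_text) + "}}}",
        "{" + rtf_escape(after) + "}",
        "}",
    ]
    # encode() raises UnicodeEncodeError if any non-ASCII character slipped in.
    return "".join(parts).encode("ascii")
```

With that, `task.Body = rtf_hyperlink_body("Here is the ", "https://www.python.org", "link", " I need")` should produce the same bytes as the literal above.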