XML file as input - python

I have the following line of code: xml = BytesIO("<A><B>some text</B></A>") for the file named test.xml.
But I would like to have something like xml = "/home/user1/test.xml"
How can I use the file location instread of having to put the file content?

Exactly like you have. lxml.etree.parse() accepts a string filename and will read the file for you.

The following code will read in the contents of the file into a string, and pass it into the class instantiator for BytesIO
xml = BytesIO(open("/home/user1/test.xml").read())

xml = open('/home/user1/test.xml', 'rb').read()

Related

How to increase version number of a xml file after each change in the file using ETree

I'm trying to manipulate a xml file. I use a loop and for each iteration I want the version number of the xml file to be increased. For manipulating the xml file I using ETree. Here is what I have tried so far:
def main():
import xml.etree.ElementTree as ET
import os
version = "0"
while os.path.exists(f"/Users/tt/sumoTracefcdfile_{version}.xml"):
#use parse() function to load and parse an xml file
fileDirect="/Users/tt/sumoTracefcdfile_{version}.xml"
version=int(version)
version+=1
doc = ET.parse(fileDirect)
.....
#at the end after adding some data to xml file, I do the following to write the changes into the xml file:
save_path_file = "/Users/tt/sumoTracefcdfile_{version}.xml"
b_xml = ET.tostring(valeurs)
with open(save_path_file, "wb") as f:
f.write(b_xml)
However I get the following error for the line 'doc = ET.parse(fileDirect)':
FileNotFoundError: [Errno 2] No such file or directory:
'/Users/tt/sumoTracefcdfile_{version}.xml'
It looks like you wanted to use f-strings and forgot the "f" in 2 lines.
Changing fileDirect="/Users/tt/sumoTracefcdfile_{version}.xml" to fileDirect = f"/Users/tt/sumoTracefcdfile_{version}.xml" and save_path_file = "/Users/tt/sumoTracefcdfile_{version}.xml" to save_path_file = f"/Users/tt/sumoTracefcdfile_{version}.xml" might solve your issues.

Python O365 send email with HTML file

I'm using O365 for Python.
Sending an email and building the body my using the setBodyHTML() function. However at the present I need to write the actual HTML code inside the function. I don't want to do that. I want to just have python look at an HTML file I saved somewhere and send an email using that file as the body. Is that possible? Or am I confined to copy/pasting my HTML into that function? I'm using office365 for business. Thanks.
In other words instead of this: msg.setBodyHTML("<h3>Hello</h3>") I want to be able to do this: msg.setBodyHTML("C:\somemsg.html")
I guess you can assign the file content to a variable first, i.e.:
file = open('C:/somemsg.html', 'r')
content = file.read()
file.close()
msg.setBodyHTML(content)
You can do this via a simple reading of that file into a string, which you then can pass to the setBodyHTML function.
Here's a quick function example that will do the trick:
def load_html_from_file(path):
contents = ""
with open(path, 'r') as f:
contents = f.read()
return contents
Later, you can do something along the lines of
msg.setBodyHTML(load_html_from_file("C:\somemsg.html"))
or
html_contents = load_html_from_file("C:\somemsg.html")
msg.setBodyHTML(html_contents)

POST XML file with requests

I'm getting:
<error>You have an error in your XML syntax...
when I run this python script I just wrote (I'm a newbie)
import requests
xml = """xxx.xml"""
headers = {'Content-Type':'text/xml'}
r = requests.post('https://example.com/serverxml.asp', data=xml)
print (r.content);
Here is the content of the xxx.xml
<xml>
<API>4.0</API>
<action>login</action>
<password>xxxx</password>
<license_number>xxxxx</license_number>
<username>xxx#xyz.com</username>
<training>1</training>
</xml>
I know that the xml is valid because I use the same xml for a perl script and the contents are being printed back.
Any help will greatly appreciated as I am very new to python.
You want to give the XML data from a file to requests.post. But, this function will not open a file for you. It expects you to pass a file object to it, not a file name. You need to open the file before you call requests.post.
Try this:
import requests
# Set the name of the XML file.
xml_file = "xxx.xml"
headers = {'Content-Type':'text/xml'}
# Open the XML file.
with open(xml_file) as xml:
# Give the object representing the XML file to requests.post.
r = requests.post('https://example.com/serverxml.asp', data=xml, headers=headers)
print (r.content);

Python - html2text write to file

I have text files which contain html tags which I want to remove using html2text with Python:
import html2text
html = open("textFileWithHtml.txt").read()
print html2text.html2text(html)
My question is how can I write the output to a .txt file ? (I want to create the new text file without the html elements -- the file does not previously exist)
You need to open another file for writing.
import html2text
html = open("textFileWithHtml.txt")
f = html.read()
w = open("out.txt", "w")
w.write(html2text.html2text(f).encode('utf-8'))
html.close()
w.close()
You should open a file and write to it.
import html2text
# Open your file
with open("textFileWithHtml.txt", 'r') as f_html:
html = f_html.read()
# Open a file and write to it
with open('your_file.txt', 'w') as f:
f.write(html2text.html2text(html).encode('utf-8'))
It is good practice to use the with keyword when dealing with file objects.
And it is more pythonic too.
See more information for files reading / writing files : https://docs.python.org/2/tutorial/inputoutput.html#reading-and-writing-files
Edit
If you have issues with encoding, try using .encode('utf-8'). I've added it in my code snipped. Look for python unicode if you have issues regarding this (https://docs.python.org/2/howto/unicode.html)

Fetch URLs from a Text File Using Python

I have a list of URLs in a text file that I would like to fetch using urllib2. I know you can use urllib2.urlopen(the_url) to read the content from the url, but how can I make it so Python reads those URLs line by line as they are in the text file and prints the results?
Thanks.
If the file just consists of one URL per line, you can just do:
with open(filename, "r") as f:
for line in f:
url = urllib2.urlopen(line)
...
Is there something about the way the file is formatted that makes this more complicated?

Categories

Resources