Download xml-file and save it to a text file - python

I'm very new to programming and have a problem. I need to create a Python function that uses the external requests module to download an XML file and then saves the text of the response to a text file.
So far I've tried this:
import requests
def downloading_xml():
    r = requests.get('https://www.w3schools.com/xml/simplexsl.xml')
    print(r.text)
But I don't get it quite right. I think my main problem is the last part: I don't know how to save the text of the response to a text file. Any ideas? Thanks in advance!

Here you go. If you want to know more about Python file operations, see the Python I/O documentation.
import requests
def downloading_xml():
    r = requests.get('https://www.w3schools.com/xml/simplexsl.xml')
    print(r.text)
    # The with block closes the file automatically, so no explicit close() is needed.
    with open("filename.txt", "w+") as f:
        f.write(r.text)
Now call the function
downloading_xml()

To save to a text file you can do something like this:
textfile=open("anyname.xml",'w')
textfile.write(r.text)
textfile.close()
You may need to include the path to the file as well.
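As a minimal sketch of the point about including the path: the helper below writes already-downloaded text into a chosen directory. The name `save_text` and its parameters are made up for illustration; in the question's code the text would come from `r.text`.

```python
from pathlib import Path


def save_text(text, directory, filename):
    """Write text to directory/filename and return the full path."""
    target_dir = Path(directory)
    # Create the directory (and any missing parents) if it does not exist yet.
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / filename
    # An explicit encoding avoids depending on the platform default.
    target.write_text(text, encoding="utf-8")
    return target
```

A call such as `save_text(r.text, "downloads", "simplexsl.xml")` would then place the file under a `downloads` folder relative to the current working directory (the folder name is only an example).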

Related

Program running but file not opening using open() function

I am just learning about file manipulation and created a simple 'text.txt' file. This file and my source code ('files.py') share the same parent directory ('files/'). I am trying to open the file:
import os
helloFile = open('./text.txt')
However, when I run this at the command line no error is thrown but no file is opened. It seems like it is running with no problem but not doing what I want it to do. Anyone know why?
You need to call read() on the file: open() only creates a file object that points at the file, and you have not performed any operation on it yet.
I also recommend using
with open('./text.txt') as f:
    helloFile = f.read()
which closes the file for you automatically.
Otherwise you need to close the file manually, like this:
f = open('./text.txt')
helloFile = f.read()
f.close()
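To make the difference between opening and reading concrete, here is a small sketch. It creates its own throwaway file instead of relying on `text.txt`, so the temporary path and the sample contents are assumptions made for the example.

```python
import os
import tempfile

# Create a small throwaway file so the example does not depend on text.txt.
fd, path = tempfile.mkstemp(suffix=".txt")
with os.fdopen(fd, "w", encoding="utf-8") as tmp:
    tmp.write("hello\nworld\n")

handle = open(path, encoding="utf-8")  # only opens the file; nothing is read or shown
contents = handle.read()               # this is the step that actually fetches the text
handle.close()

with open(path, encoding="utf-8") as f:
    same_contents = f.read()
# The with block has ended here, so f is already closed.

print(contents)
os.remove(path)
```

Both approaches read the same text; the `with` form simply removes the need to remember `close()`.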

Checking for an entry in text file before adding

I have some fairly simple code I thought would work, but it is not behaving as it should. All I'm doing is reading a text file for a URL; if the URL is not in the text file, we add it:
code:
def verify_links_working(self, url):
    if url not in open("links/register.txt").read():
        with open("links/register.txt", "a+") as file:
            file.write("%s\n" % str(url).strip())
            file.close()
It looks fairly straightforward, but it still adds duplicate lines. Is there something I have missed? Any help is appreciated.
You can read and write through the same file handle, and use strip to clean both the incoming url and the urls read from the text file. Since self is not used, I removed it and added staticmethod (assuming you are using the function as a method on a class):
@staticmethod
def verify_links_working(url):
    url_clean = url.strip()
    with open('links/register.txt', 'r+') as file:
        if url_clean not in {url.strip() for url in file}:
            file.write(f'{url_clean}\n')
Even better, you can pass the path to register.txt as an argument instead of hard-coding it in your function. That way your function will be more generic.
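A possible version with the path passed in as an argument might look like the sketch below. The function name `add_link_if_missing` and the boolean return value are illustrative additions, and it opens the file with "a+" rather than "r+" so that a missing register file is created instead of raising an error.

```python
def add_link_if_missing(url, register_path):
    """Append url to the register file unless it is already listed.

    Returns True if the url was added, False if it was already present.
    """
    url_clean = url.strip()
    # "a+" creates the file if needed, and writes always go to the end.
    with open(register_path, "a+", encoding="utf-8") as register:
        register.seek(0)  # "a+" starts positioned at the end, so rewind before reading
        known = {line.strip() for line in register}
        if url_clean in known:
            return False
        register.write(f"{url_clean}\n")
        return True
```

Comparing whole stripped lines, rather than searching the full file text for a substring, means a URL only counts as present when it matches an entire entry.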

Trying to download data from URL with CSV File

I'm slightly new to Python and have a question as to why the following code doesn't produce any output in the csv file. The code is as follows:
import csv
import urllib2
url = 'http://www.rba.gov.au/statistics/tables/csv/f17-yields.csv'
response = urllib2.urlopen(url)
cr = csv.reader(response)
for row in cr:
    with open("AusCentralbank.csv", "wb") as f:
        writer = csv.writer(f)
        writer.writerows(row)
Cheers.
Edit:
Brien and Albert solved the initial issue I had. However, I now have one further question. The CSV file listed above is on "http://www.rba.gov.au/statistics/tables/#interest-rates" under Zero-coupon "Interest Rates - Analytical Series - 2009 to Current - F17" (the F17 Yields CSV). When I download it, I see that it has 5 workbooks, and I only want the data in the 5th workbook. Is there a way I could do this? Cheers.
I could only test my code using Python 3. However, the only difference should be urllib2, so I am using urllib.request to open the desired URL.
The variable html is of type bytes and can be written to a file in binary mode. Additionally, your source is already a CSV file, so there should be no need to convert it:
#!/usr/bin/env python3
# coding: utf-8
import urllib.request  # a bare "import urllib" does not reliably expose urllib.request
url = 'http://www.rba.gov.au/statistics/tables/csv/f17-yields.csv'
response = urllib.request.urlopen(url)
html = response.read()
with open('output.csv', 'wb') as f:
    f.write(html)
It is probably because of your opening mode.
According to the documentation:
'w' for only writing (an existing file with the same name will be erased)
You should use append ('a') mode to append to the end of the file.
'a' opens the file for appending; any data written to the file is automatically added to the end.
Also, since the file you are trying to download is already a CSV file, you don't need to convert it.
@albert had a great answer. I've gone ahead and converted it to the equivalent Python 2.x code. You were doing a bit too much work in your original program; since the file was already a csv you didn't need to do any special work to turn it into a csv.
import urllib2
url = 'http://www.rba.gov.au/statistics/tables/csv/f17-yields.csv'
response = urllib2.urlopen(url)
html = response.read()
with open('AusCentralbank.csv', 'wb') as f:
    f.write(html)
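If the rows do need to pass through the csv module (for example to filter them), the loop in the question has two likely problems: the output file is reopened in write mode for every row, which erases what was written before, and writerows is handed a single row, so each field string is treated as a row of characters. A Python 3 sketch of the corrected pattern follows; the sample text and the temporary output path are stand-ins made up for the example, not the real RBA data.

```python
import csv
import io
import os
import tempfile

# Stand-in for the downloaded text; the real data would come from the response.
sample = "date,yield\n2015-01-01,2.5\n2015-01-02,2.6\n"

out_path = os.path.join(tempfile.mkdtemp(), "copy.csv")

# Open the output once, before the loop, so earlier rows are not overwritten.
# newline="" is what the csv module documentation asks for on files it writes.
with open(out_path, "w", newline="", encoding="utf-8") as f:
    writer = csv.writer(f)
    for row in csv.reader(io.StringIO(sample)):
        writer.writerow(row)  # writerow takes one row; writerows takes a list of rows
```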

unexpected line breaks in python after writing to csv

I have code that updates CSVs from a server. It gets data using:
a = urllib.urlopen(url)
data = a.read().strip()
Then I append the data to the CSV with
f = open(filename+".csv", "ab")
f.write(ndata)
f.close()
The problem is that, at random, a line in the CSV gets written like this (or gets a line break somewhere in the CSV):
2,,,,,
015-04-21 13:00:00,18,998,50,31,2293
instead of its usual form:
2015-04-21 13:00:00,6,1007,29,25,2394
2015-04-21 13:00:00,7,1004,47,26,2522
I tried printing my data in the shell after the program ran, and the broken CSV entry appears normal there.
Hope you guys can help me out. Thanks.
Running Python 2.7.9 on Win 8.1.
What operations are performed on your "ndata" variable?
You should use the csv module to manage CSV files: https://docs.python.org/2/library/csv.html
Edit after comment:
If you do not want to use the csv module I linked to, then instead of
a = urllib.urlopen(url)
data = a.read().strip()
ndata = data.split('\n')
f.write('\n'.join(ndata[1:]))
you should do this:
a = urllib.urlopen(url)
f.writelines(a.readlines()[1:])
I don't see any reason for the random unwanted "\n" if you are sure that your incoming data is correct. Do you handle very long lines?
I recommend using the csv module to read your input: if your input is correct, you can be sure the content you write is valid CSV.
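A rough sketch of the csv-module approach is shown below. It is written for Python 3 rather than the 2.7 used in the question, and the function name `append_rows` and its parameters are invented for illustration. It does not diagnose the stray line breaks; it only shows rows being parsed and written as whole records.

```python
import csv
import io


def append_rows(csv_text, out_path):
    """Append the data rows of csv_text (minus its header line) to out_path.

    Returns the number of rows appended.
    """
    rows = list(csv.reader(io.StringIO(csv_text)))[1:]  # drop the header row
    # newline="" lets the csv module control line endings itself; without it
    # an extra "\r" is added on platforms that use "\r\n" line endings.
    with open(out_path, "a", newline="", encoding="utf-8") as f:
        csv.writer(f).writerows(rows)
    return len(rows)
```

Because each record is written by `csv.writer`, every appended row ends with exactly one line terminator regardless of how the downloaded text ended.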

How do I write to a website/HTML Document using python 3?

I'm using Python 3 and I was able to read the code from my HTML document, but I was unable to write to it. How would I go about this? I'll show you what I mean:
import urllib.request
locator=urllib.request.urlopen("file:///E:/Programming/Calculator.html", "r")
transfer=locator.read()
print("\n\n",transfer, "\n")
locator.close()
locator=urllib.request.urlopen("file:///E:/Programming/Calculator.html","w+")
locator.write("<p> Hello this site has been slightly changed</p>")
locator.close()
locator=urllib.request.urlopen("file:///E:/Programming/Calculator.html","r")
new=locator.read()
print(new)
locator.close()
So I'm able to read from it, but I can't write to it or change any of its code. Why is this?
Also, I tried to read from an actual website URL using the exact same code as above, replacing the URL and removing the write call. The interpreter came up with an error, and I wasn't able to read from the site. How can I read from a website too?
Note: I'm just learning. I'm not actually gonna do anything illegal, I just want to become more knowledgeable about this kind of stuff.
Also, if I change write to append() it still produces an error.
import urllib.request
locator=urllib.request.urlopen("file:///E:/Programming/Calculator.html", "r")
transfer=locator.read()
print("\n\n",transfer, "\n")
locator.close()
locator=urllib.request.urlopen("file:///E:/Programming/Calculator.html", "w+")
with open("file:///E:/Programming/Calculator.html") as f:
    f.write('something')
locator.close()
The above is a piece of code suggested by another member, but instead of writing to the URL it came up with an error saying:
Traceback (most recent call last):
  File "C:\Users\KENNY\Desktop\Python\practice.py", line 10, in <module>
    with open("file:///E:/Programming/Calculator.html") as f:
OSError: [Errno 22] Invalid argument: 'file:///E:/Programming/Calculator.html'
Ignore the spacing, it's just the way I pasted it. All the code should be aligned up to the with open part, where the f.write call is indented.
urllib.request.urlopen returns a file-like object.
These objects expose a read method, but do not allow you to write.
I like to think of urllib as requesting the file the way your browser would if you typed the resource into the address bar. You can't write to it. You request it and it is served to you.
If you want to open the file for writing, you can use the built-in open function with an ordinary file path rather than a file:// URL:
with open('E:/Programming/Calculator.html', 'w+') as f:
    f.write('something')
The above example uses a with statement, which closes the file automatically when the code exits the with block.
It is similar to
f = open('E:/Programming/Calculator.html', 'w+')
f.write('something')
f.close()
@Lattyware posted a great tutorial on it, and many more can be found online. The PEP outlines what it is for.
It seems like you might be confusing urlopen and the open command.
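The split between the two can be sketched as follows: urlopen reads through a file:// URL, while open writes through an ordinary filesystem path. The temporary file below is a stand-in for Calculator.html, and the variable names are made up for the example.

```python
import tempfile
import urllib.request
from pathlib import Path

# Throwaway HTML file standing in for Calculator.html.
page = Path(tempfile.mkdtemp()) / "page.html"
page.write_text("<p>original</p>\n", encoding="utf-8")

# Reading through a file:// URL works, and the result is bytes.
with urllib.request.urlopen(page.as_uri()) as response:
    before = response.read().decode("utf-8")

# Writing goes through open() with an ordinary filesystem path, not a URL.
with open(page, "a", encoding="utf-8") as f:
    f.write("<p>Hello this site has been slightly changed</p>\n")

after = page.read_text(encoding="utf-8")
print(after)
```

Mode "a" appends to the existing page; the "w+" used in the answer above would instead replace the file's contents.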
