Bad Zip File error when rewriting code in Python 3.4 - python

I am trying to rewrite code previously written for Python 2.7 into Python 3.4. I get the error zipfile.BadZipFile: File is not a zip file in the line zipfile = ZipFile(StringIO(zipdata)) in the code below.
import csv
try:
    from StringIO import StringIO
except ImportError:
    from io import StringIO
import pandas as pd
import os
from zipfile import ZipFile
from pprint import pprint, pformat
import urllib.request
import urllib.parse
try:
    import urllib.request as urllib2
except ImportError:
    import urllib2

my_url = 'http://www.bankofcanada.ca/stats/results/csv'
data = urllib.parse.urlencode({"lookupPage": "lookup_yield_curve.php",
                               "startRange": "1986-01-01",
                               "searchRange": "all"})
# request = urllib2.Request(my_url, data)
# result = urllib2.urlopen(request)
binary_data = data.encode('utf-8')
req = urllib.request.Request(my_url, binary_data)
result = urllib.request.urlopen(req)
zipdata = result.read().decode("utf-8", errors="ignore")
zipfile = ZipFile(StringIO(zipdata))
df = pd.read_csv(zipfile.open(zipfile.namelist()[0]))
df = pd.melt(df, id_vars=['Date'])
df.rename(columns={'variable': 'Maturity'}, inplace=True)
Thank You

You shouldn't be decoding the data you get back in the result. The data is the bytes of the zip file itself, not bytes that encode a unicode string. I think your confusion arises because Python 2 makes no such distinction, but in Python 3 you need a BytesIO, not a StringIO.
So that part of your code should read:
from io import BytesIO

zipdata = result.read()
zipfile = ZipFile(BytesIO(zipdata))
df = pd.read_csv(zipfile.open(zipfile.namelist()[0]))
The data you are getting back is not utf-8 encoded, so you can't decode it that way. You would have found that out more easily if you hadn't specified errors="ignore", which is seldom a good idea ...
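For completeness, here is a minimal end-to-end sketch with that fix applied. It assumes the Bank of Canada endpoint from the question still answers this form POST with a zip archive containing a single CSV:

import urllib.parse
import urllib.request
from io import BytesIO
from zipfile import ZipFile

import pandas as pd

my_url = 'http://www.bankofcanada.ca/stats/results/csv'
data = urllib.parse.urlencode({"lookupPage": "lookup_yield_curve.php",
                               "startRange": "1986-01-01",
                               "searchRange": "all"})

req = urllib.request.Request(my_url, data.encode('utf-8'))
result = urllib.request.urlopen(req)

# Keep the payload as bytes: a zip archive is binary data, not text.
zipdata = result.read()
zf = ZipFile(BytesIO(zipdata))  # 'zf' avoids shadowing the zipfile module
df = pd.read_csv(zf.open(zf.namelist()[0]))
df = pd.melt(df, id_vars=['Date'])
df.rename(columns={'variable': 'Maturity'}, inplace=True)

Decoding only happens later, when pandas reads the CSV member out of the archive.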

Related

Getting code from a .txt on a website and pasting it in a tempfile PYTHON

I was trying to make a script that gets a .txt file from a website and pastes the code into a temporary Python file that can be executed, but it's not working. Here is the code:
from urllib.request import urlopen as urlopen
import os
import subprocess
import os
import tempfile
filename = urlopen("https://randomsiteeeee.000webhostapp.com/script.txt")
temp = open(filename)
temp.close()
# Clean up the temporary file yourself
os.remove(filename)
temp = tempfile.TemporaryFile()
temp.close()
If you know a fix for this, please let me know. The error is:
File "test.py", line 9, in <module>
temp = open(filename)
TypeError: expected str, bytes or os.PathLike object, not HTTPResponse
I tried everything, such as making a request to the URL and pasting the response, but that didn't work either, and neither did the code I pasted here.
As I said, I expected it to get the code from the .txt on the website and turn it into a temporary executable Python script.
You are missing a read:
from urllib.request import urlopen as urlopen
import os
import subprocess
import os
import tempfile
filename = urlopen("https://randomsiteeeee.000webhostapp.com/script.txt").read() # <-- here
temp = open(filename)
temp.close()
# Clean up the temporary file yourself
os.remove(filename)
temp = tempfile.TemporaryFile()
temp.close()
But if the script.txt contains the script and not the filename, you need to create a temporary file and write the content:
from urllib.request import urlopen as urlopen
import os
import subprocess
import os
import tempfile
content = urlopen("https://randomsiteeeee.000webhostapp.com/script.txt").read()
with tempfile.TemporaryFile() as fp:
    name = fp.name
    fp.write(content)
If you want to execute the code you fetch from the URL, you may also use exec or eval instead of writing a new script file.
eval and exec are EVIL; they should only be used if you trust the input 100% and there is no other way!
EDIT: How do I use exec?
Using exec, you could do something like this (I use requests instead of urllib here; if you prefer urllib, see the sketch right after this example):
import requests
exec(requests.get("https://randomsiteeeee.000webhostapp.com/script.txt").text)
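If you prefer to stay with urllib, an equivalent sketch (assuming the server delivers UTF-8 text, and again only for code you fully trust) would be:

import urllib.request

code = urllib.request.urlopen("https://randomsiteeeee.000webhostapp.com/script.txt").read()
exec(code.decode("utf-8"))  # decode the downloaded bytes before handing them to exec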
You're trying to open a file that is named "the content of a website".
filename = "path/to/my/output/file.txt"
httpresponse = urlopen("https://randomsiteeeee.000webhostapp.com/script.txt").read()
temp = open(filename)
temp.write(httpresponse)
temp.close()
That is probably closer to what you intended.
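Putting the pieces together, here is a sketch of what the question seems to be aiming for: download the script, write it to a named temporary file, run it, and clean up. The URL is the hypothetical one from the question, and this should only ever be done with code you fully trust:

import os
import subprocess
import sys
import tempfile
from urllib.request import urlopen

content = urlopen("https://randomsiteeeee.000webhostapp.com/script.txt").read()

# delete=False so the child process can reopen the file (also works on Windows)
with tempfile.NamedTemporaryFile(suffix=".py", delete=False) as fp:
    fp.write(content)
    path = fp.name

try:
    subprocess.check_call([sys.executable, path])  # run the downloaded script
finally:
    os.remove(path)  # clean up the temporary file yourself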

Python: Reading fortran file from url

I would like to do the following in Python 3: read in a FortranFile, but from a URL rather than from a local file. The reason is that for my concrete example there are a lot of files, and I want to avoid having to download them all first.
I have managed to
a) read in a simple .txt file from a URL
import urllib
from urllib.request import urlopen

url = 'http://www.deus-consortium.org/deus-library/filelist/deus_file_list_501.txt'
data = urllib.request.urlopen(url)
i = 0
for line in data:  # file objects are iterable
    print(i, line)
    i += 1
# alternative: data.read()
b) read in a local FortranFile (a binary, little-endian, unformatted Fortran file) like this:
The file is from: http://www.deus-consortium.org/deus-library/efiler1/Babel_le/boxlen648_n2048_lcdmw7/post/fof/output_00090/fof_boxlen648_n2048_lcdmw7_masst_00000
import numpy as np
from scipy.io import FortranFile

filename = '../../Downloads/fof_boxlen648_n2048_rpcdmw7_masst_00000'
ff = FortranFile(filename, 'r')
nhalos = ff.read_ints(dtype=np.int32)[0]
print('number of halos in file', nhalos)
Is there any way to avoid the download and read FortranFiles directly from the URL? I tried
import urllib
from urllib.request import urlopen
url='http://www.deus-consortium.org/deus-library/efiler1/Babel_le/boxlen648_n2048_lcdmw7/cube_00090/fof_boxlen648_n2048_lcdmw7_cube_00000'
pathname = urllib.request.urlopen(url)
ff = FortranFile(pathname, 'r')
ff.read_ints()
gives "OSError: obtaining file position failed". pathname.read() doesn't work either because it's a fortran file.
Any ideas? Thanks in advance!
Maybe you can use the tempfile module to download and read the data?
For example:
import urllib
import tempfile
from scipy.io import FortranFile
from urllib.request import urlopen

url = 'http://www.deus-consortium.org/deus-library/efiler1/Babel_le/boxlen648_n2048_lcdmw7/cube_00090/fof_boxlen648_n2048_lcdmw7_cube_00000'
with tempfile.TemporaryFile() as fp:
    fp.write(urllib.request.urlopen(url).read())
    fp.seek(0)
    ff = FortranFile(fp, 'r')
    info = ff.read_ints()
    print(info)
Prints:
[12808737]
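If you want to skip the temporary file entirely, io.BytesIO should also work, since scipy's FortranFile accepts an open file object and BytesIO is seekable. A sketch under that assumption:

import io
import urllib.request
from scipy.io import FortranFile

url = 'http://www.deus-consortium.org/deus-library/efiler1/Babel_le/boxlen648_n2048_lcdmw7/cube_00090/fof_boxlen648_n2048_lcdmw7_cube_00000'
buf = io.BytesIO(urllib.request.urlopen(url).read())  # whole file held in memory
ff = FortranFile(buf, 'r')
print(ff.read_ints())

This still transfers the whole file; it just never touches the disk.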

TypeError when trying to convert Python 2.7 code to Python 3.4 code

I am having issues converting the code below, which was written for Python 2.7, to code compatible with Python 3.4. I get the error TypeError: can't concat bytes to str in the line outfile.write(decompressedFile.read()). So I replaced the line with outfile.write(decompressedFile.read().decode("utf-8", errors="ignore")), but this resulted in the same error.
import os
import gzip
try:
    from StringIO import StringIO
except ImportError:
    from io import StringIO
import pandas as pd
import urllib.request

baseURL = "http://ec.europa.eu/eurostat/estat-navtree-portlet-prod/BulkDownloadListing?file="
filename = "data/irt_euryld_d.tsv.gz"
outFilePath = filename.split('/')[1][:-3]
response = urllib.request.urlopen(baseURL + filename)
compressedFile = StringIO()
compressedFile.write(response.read().decode("utf-8", errors="ignore"))
compressedFile.seek(0)
decompressedFile = gzip.GzipFile(fileobj=compressedFile, mode='rb')
with open(outFilePath, 'w') as outfile:
    outfile.write(decompressedFile.read())  # Error
The problem is that GzipFile needs to wrap a bytes-oriented file object, but you're passing a StringIO, which is text-oriented. Use io.BytesIO instead:
from io import BytesIO  # works even in 2.x

# snip

response = urllib.request.urlopen(baseURL + filename)
compressedFile = BytesIO()  # change this
compressedFile.write(response.read())  # and this
compressedFile.seek(0)
decompressedFile = gzip.GzipFile(fileobj=compressedFile, mode='rb')
with open(outFilePath, 'w') as outfile:
    outfile.write(decompressedFile.read().decode("utf-8", errors="ignore"))  # change this too
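If you don't need the intermediate in-memory file object at all, gzip.decompress (available since Python 3.2) gives a slightly shorter route. This sketch reuses the question's URL pieces and its errors="ignore" choice:

import gzip
import urllib.request

baseURL = "http://ec.europa.eu/eurostat/estat-navtree-portlet-prod/BulkDownloadListing?file="
filename = "data/irt_euryld_d.tsv.gz"
outFilePath = filename.split('/')[1][:-3]

raw = urllib.request.urlopen(baseURL + filename).read()
text = gzip.decompress(raw).decode("utf-8", errors="ignore")  # decompress first, then decode

with open(outFilePath, 'w') as outfile:
    outfile.write(text)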

TypeError when using urllib.request in place of urllib2

I am trying to convert code that was previously written for Python 2.7 to code that will work in Python 3.4. The code is below and I had to change urllib2.urlopen() to urllib.request.urlopen(). However, this change resulted in the error TypeError: string argument expected, got 'bytes' in the line compressedFile.write(response.read()).
import os
import urllib2
import gzip
try:
    from StringIO import StringIO
except ImportError:
    from io import StringIO
import pandas as pd

baseURL = "http://ec.europa.eu/eurostat/estat-navtree-portlet-prod/BulkDownloadListing?file="
filename = "data/irt_euryld_d.tsv.gz"
outFilePath = filename.split('/')[1][:-3]
response = urllib2.urlopen(baseURL + filename)  # Changed this to urllib.request.urlopen()
compressedFile = StringIO()
compressedFile.write(response.read())
You should decode the bytes before passing them to the write function:
compressedFile = StringIO()
compressedFile.write(response.read().decode("utf-8"))
Also see the docs. "utf-8" may be omitted because it's the default, but explicit is better than implicit ;-)
Append a call to decode() to decode the bytes into a str.
compressedFile.write(response.read().decode())

Reading this type of Json with Python 3 Urllib

My JSON URL returns this:
{years=["2014","2015","2016"]}
How can I get these strings from the URL with Python 3? I know this method, but Python 3 has no urllib2 module.
import urllib2
import json
response = urllib2.urlopen('http://127.0.0.1/years.php')
data = json.load(response)
print (data)
ImportError: No module named 'urllib2'
Try changing the import to urllib.request and use urllib.request.urlopen instead. For the reasoning, please refer to this SO answer:
import urllib.request
import json

response = urllib.request.urlopen('http://127.0.0.1/years.php')
data = json.load(response)
print(data)
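One caveat: on Python 3 versions before 3.6, json.load will complain that it got bytes rather than str, because HTTPResponse.read() returns bytes. Decoding explicitly avoids that; a sketch assuming the endpoint returns UTF-8 JSON:

import json
import urllib.request

response = urllib.request.urlopen('http://127.0.0.1/years.php')
data = json.loads(response.read().decode('utf-8'))  # decode bytes, then parse
print(data)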
