When I hit a POST API, it returns zip file content as output (which is in Unicode form), and I want to save that content to a zip file locally.
How can I do that?
Trials:
Try 1:
# variable data contains the API response (i.e. data = response.text)
f = open('test.zip', 'wb')
f.write(data.encode('utf8'))
f.close()
The above code creates a zip file, but the file is corrupted.
Try 2:
with zipfile.ZipFile('spam.zip', 'w') as myzip:
    myzip.write(data.decode("utf8"))
The above code gives me an error: UnicodeEncodeError: 'ascii' codec can't encode character u'\ufffd' in position 97: ordinal not in range(128)
Can anyone help me resolve this?
I found the answer to the above problem. Maybe someone will want the same in the future, so I am writing an answer to my own question.
Using response.content instead of response.text resolved my problem.
import requests

response = requests.request("POST", <<url>>, data=<<payload>>, headers=<<headers>>, verify=False)
data = response.content

# data is bytes, so the file must be opened in binary mode ('wb'), not text mode
f = open('test.zip', 'wb')
f.write(data)
f.close()
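If the archive is large, a minimal sketch of the same idea with a context manager and streaming, assuming the same <<url>>, <<payload>> and <<headers>> values as above:
import requests

response = requests.request("POST", <<url>>, data=<<payload>>, headers=<<headers>>, verify=False, stream=True)
response.raise_for_status()
with open('test.zip', 'wb') as f:
    # iter_content writes the archive in chunks instead of holding it all in memory
    for chunk in response.iter_content(chunk_size=8192):
        f.write(chunk)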
Related
I am doing a POST request to an API that returns an Excel file.
When I try the process without Python, in Postman, it works just fine: I see the garbled output, but if I click on Save response and Save to a file, it saves the file as an .xlsx file that I can open just fine.
When I try to do the same in Python, I can also print the (garbled) response, but I do not manage to save the file as something that I can open.
First part of code (runs without issue):
import requests

for i in range(1, 3):
    url = "myurl"
    payload = {}
    headers = {}
    response = requests.request("POST", url, headers=headers, data=payload)
And now for the crucial part of the code.
If I do A:
with open('C:\\Users\\mypath\\exportdata.xlsx', "w") as o:
    o.write(response.text)
    print(response.text)
...then I get this error when I run the code:
File "C:\Users\Username\AppData\Local\Programs\Python\Python310\lib\encodings\cp1252.py", line 19, in encode
return codecs.charmap_encode(input,self.errors,encoding_table)[0]
UnicodeEncodeError: 'charmap' codec can't encode characters in position 10-11: character maps to <undefined>
If I do B:
with open('C:\\Users\\mypath\\exportdata.xlsx', "w", encoding="utf-8") as o:
    o.write(response.text)
    print(response.text)
...then the code runs without error, but I get an extension/format error in Excel when I open the file.
How do I save the excel file with python so that I can open and view it correctly after?
This is not a standard text/CSV-to-Excel conversion issue; you can see from the garbled output that all the XML hallmarks of an Excel file are there.
Excel isn't Text. Excel is binary. Try response.content:
with open(filename, "wb") as o:
    o.write(response.content)
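Putting that together with the loop from the question, a small sketch; the per-iteration filename is an assumption (the question writes every response to the same path, which would overwrite it each time):
import requests

for i in range(1, 3):
    url = "myurl"
    payload = {}
    headers = {}
    response = requests.request("POST", url, headers=headers, data=payload)
    response.raise_for_status()
    # write the raw bytes of the workbook; the numbered filename is assumed here
    with open(f"C:\\Users\\mypath\\exportdata_{i}.xlsx", "wb") as o:
        o.write(response.content)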
I am reading a zip file from a URL. Inside the zip file, there is an HTML file. After I read the file, everything works fine, but when I print the text I run into a Unicode problem. Python version: 3.8
import requests
from zipfile import ZipFile
from io import BytesIO
from bs4 import BeautifulSoup
from lxml import html

content = requests.get("www.url.com")
zf = ZipFile(BytesIO(content.content))
file_name = zf.namelist()[0]
file = zf.open(file_name)
soup = BeautifulSoup(file.read(), 'html.parser', from_encoding='utf-8', exclude_encodings='utf-8')
for product in soup.find_all('tr'):
    product = product.find_all('td')
    if len(product) < 2: continue
    print(product[1].text)
I already tried to open the file and print the text with .decode('utf-8'), and I got the following error:
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe8 in position 0: invalid continuation byte
I added from_encoding and exclude_encodings to BeautifulSoup, but nothing changed and I didn't get an error.
Expected prints:
ÇEŞİTLİ MADDELER TOPLAMI
Tarçın
Fidanı
What I am getting:
ÇEÞÝTLÝ MADDELER TOPLAMI
Tarçýn
Fidaný
I looked at the file, and the encoding is not utf-8 but iso-8859-9.
Change the encoding and everything will be fine:
soup = BeautifulSoup(file.read(), 'html.parser', from_encoding='iso-8859-9')
This will output: ÇEŞİTLİ MADDELER TOPLAMI
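The same fix works if you decode the raw bytes yourself before handing them to BeautifulSoup; a small sketch reusing zf and file_name from the question:
# Decode the bytes explicitly as iso-8859-9 (Turkish) and pass a str to BeautifulSoup
raw = zf.open(file_name).read()
soup = BeautifulSoup(raw.decode('iso-8859-9'), 'html.parser')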
So I'm trying to get a CSV file with requests and save it to my project:
import requests
import pandas as pd
import csv

def get_and_save_countries():
    url = 'https://www.trackcorona.live/api/countries'
    r = requests.get(url)
    data = r.json()
    data = data["data"]
    with open("corona/dash_apps/finished_apps/apicountries.csv", "w", newline="") as f:
        title = "location,country_code,latitude,longitude,confirmed,dead,recovered,updated".split(",")
        cw = csv.DictWriter(f, title, delimiter=',', quotechar='|', quoting=csv.QUOTE_MINIMAL)
        cw.writeheader()
        cw.writerows(data)
I've managed that but when I try this:
get_data.get_and_save_countries()
df = pd.read_csv("corona\\dash_apps\\finished_apps\\apicountries.csv")
I get this error:
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe9 in position 1: invalid continuation byte
And I have no idea why. Any help is welcome. Thanks.
Try:
with open("corona/dash_apps/finished_apps/apicountries.csv","w",newline="", encoding ='utf-8') as f:
to explicitly specify the encoding with encoding='utf-8'
When you write to a file, the default encoding is locale.getpreferredencoding(False). On Windows that is usually not UTF-8 and even on Linux the terminal could be configured other than UTF-8. Pandas is defaulting to utf-8, so specify encoding='utf8' as another parameter to open.
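A quick way to see the default your platform falls back to when open() gets no encoding argument:
import locale

# On Windows this typically prints 'cp1252' rather than 'utf-8', which is why
# the bytes written by csv and the utf-8 read done by pandas disagree.
print(locale.getpreferredencoding(False))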
I'm having a problem using non-ASCII characters in a file I'm trying to send as an attachment using requests.
The exception is raised in the httplib module, in the _send_output function.
Here is my code:
response = requests.post(url="https://api.mailgun.net/v2/%s/messages" % utils.config.mailDomain,
                         auth=("api", utils.config.mailApiKey),
                         data={
                             "from": me,
                             "to": recepients,
                             "subject": subject,
                             "html" if html else "text": message
                         },
                         files=[('attachment', open(f)) for f in attachments] if attachments and len(attachments) else [])
The problem is with the files parameter, which contains non-ASCII data (Hebrew).
The exception is:
UnicodeDecodeError: 'ascii' codec can't decode byte 0xd0 in position 673: ordinal not in range(128)
The open() function has an encoding parameter, used like f = open('t.txt', encoding='utf-8'), which accepts a variety of encodings as outlined in the docs. Find out what encoding scheme your data uses (probably UTF-8) and see if opening with that encoding works.
Don't use the encoding parameter to open the files because you want to open them as binary data. The calls to open should look like open(f, 'rb'). The documentation for requests only shows examples like this purposefully and even documents this behaviour.
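A short sketch of the corrected files argument, reusing the attachments list from the question:
# Open each attachment as raw bytes; requests builds the multipart body from the
# bytes itself, so no text decoding happens on the way out.
files = [('attachment', open(f, 'rb')) for f in attachments] if attachments else []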
I'm having a problem using decode in Python. I'm trying to fetch an IMDB page (example address: http://www.imdb.com/title/tt2216240/):
import urllib.request

address = "http://www.imdb.com/title/tt2216240/"
req = urllib.request.Request(address)
response = urllib.request.urlopen(req)
page = response.read().decode('utf-8', 'ignore')
with open('film.html', 'w') as f:
    print(page, file=f)
I get an error:
UnicodeEncodeError: 'charmap' codec can't encode character '\xe6' in position 4132: character maps to <undefined>
Try to explicitly specify utf-8 file encoding:
with open('film.html', 'w', encoding='utf-8') as f:
    print(page, file=f)
Did you already try the requests library?
Anyway, it makes this simpler:
# samplerequest.py
import requests

address = "http://www.imdb.com/title/tt2216240/"
req = requests.get(address)
print(req.text)
print(req.encoding)
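To then save the page to film.html as in the question, write the decoded text with an explicit encoding; a small sketch combining both answers, assuming req from above:
with open('film.html', 'w', encoding='utf-8') as f:
    f.write(req.text)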