I'm trying to serve a PDF file with Django 1.7, and this is basically the code that "should" work. It does work if I change the content_type to 'text' and download a .tex file with it, but when I try it with a binary file, I get:
UnicodeDecodeError at /path/to/file/filename.pdf
'utf-8' codec can't decode byte 0xd0 in position 10: invalid continuation byte
def download(request, file_name):
    file = open('path/to/file/{}'.format(file_name), 'r')
    response = HttpResponse(file, content_type='application/pdf')
    response['Content-Disposition'] = "attachment; filename={}".format(file_name)
    return response
So basically, if I understand correctly, it's trying to serve the file as a UTF-8 encoded text file, instead of a binary file. I've tried to change the content_type to 'application/octet-stream' with similar results. What am I missing?
Try opening the file using binary mode:
file = open('path/to/file/{}'.format(file_name), 'rb')
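The difference between the two modes can be reproduced without Django. This is a minimal sketch, assuming Python 3 and a throwaway temporary file; the payload bytes and the variable names are illustrative and not taken from the question.

```python
import os
import tempfile

# Illustrative payload: a PDF-style header followed by bytes that are not valid UTF-8.
payload = b"%PDF-1.4\n%\xd0\xd4\xc5\xd8\n"

# Write the payload to a temporary file (a hypothetical stand-in for the real PDF).
fd, path = tempfile.mkstemp(suffix=".pdf")
with os.fdopen(fd, "wb") as tmp:
    tmp.write(payload)

# Text mode decodes the bytes while reading, which fails on non-UTF-8 data.
try:
    with open(path, "r", encoding="utf-8") as f:
        f.read()
    text_mode_failed = False
except UnicodeDecodeError:
    text_mode_failed = True

# Binary mode returns the raw bytes untouched.
with open(path, "rb") as f:
    binary_content = f.read()

os.remove(path)
```

Text mode only works for the .tex file because its content happens to be valid text; a PDF needs the bytes passed through unchanged.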
Related
I'm trying to return the contents of an image file via a Python Connexion application generated from an OpenAPI v2 spec file using swagger-codegen and the python-flask language setting. In my controller module, I simply do the following:
def file_contents_get(file_id):
    file = app.datastore.get_instance().get_file(file_id)
    with open(file.path, "rb") as f:
        return f.read()
However, this results in the following error:
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xff in position 0: invalid start byte
What is the proper way to return a file's contents? Note that I don't want the file as an attachment but rather inline.
I've read some posts about this problem, but most of them didn't help in my case. I'm trying to save an encoded PDF in a zip file (I'm using the DocRaptor API for the PDF generation, which returns the encoded PDF).
def toZip(request, ...):
    ...
    response = docraptor_api_call()  # api call to generate pdf (encoded pdf)
    with open('creation.pdf', 'wb') as f:
        f.write(response)
    # decode pdf
    with open(f.name, 'rb') as pdf:
        # this will download the pdf to the user
        # doc = HttpResponse(pdf.read(), content_type='application/pdf')
        # doc['Content-Disposition'] = "attachment; filename=filename.pdf"
        # return doc
        zip_io = io.BytesIO()
        # create zipFile
        zf = zipfile.ZipFile(zip_io, mode='w')
        # write PDF in ZIP ?
        save_zf = zf.write(pdf.read())
        # save zip to FileField
        zip = ZipStore.objects.create(zip=save_zf)
When I run the code above, I get this error:
UnicodeEncodeError: 'charmap' codec can't encode character '\u2019' in position 43: character maps to <undefined>
I don't really understand what I'm doing wrong or how I should fix it. Any suggestions?
You've got an error in the way you're calling zf.write. You should be using:
# ZipFile.write takes the path of a file to add, not the bytes to be written.
# In writestr, the first argument (f.name here) is the name of the file inside
# the zip archive. So if I passed in "foo.txt", "1", I'd get a file named
# `foo.txt` after decompressing, and its contents would be 1
zf.writestr(f.name, pdf.read())
Neither write nor writestr returns a value, so you'll need to change this: zip = ZipStore.objects.create(zip=save_zf) probably to:
zip = ZipStore.objects.create(zip=zip_io)
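The writestr approach can be checked in isolation. This is a minimal sketch, assuming the PDF bytes are already in memory; `pdf_bytes` and the archive member name are placeholders, and the Django model is left out. The archive is closed before the buffer is read, because `ZipFile` only writes the central directory when it is closed.

```python
import io
import zipfile

# Placeholder for the bytes returned by the PDF API.
pdf_bytes = b"%PDF-1.4 example content"

zip_io = io.BytesIO()
# The context manager closes the archive, which finalizes the ZIP structure.
with zipfile.ZipFile(zip_io, mode="w") as zf:
    # writestr(name_in_archive, data) stores the bytes directly.
    zf.writestr("creation.pdf", pdf_bytes)

zip_bytes = zip_io.getvalue()

# Read the archive back to confirm the round trip.
with zipfile.ZipFile(io.BytesIO(zip_bytes)) as check:
    names = check.namelist()
    restored = check.read("creation.pdf")
```

To store the result in a Django FileField, wrapping `zip_bytes` in `django.core.files.base.ContentFile` is one common approach; that step is an assumption about the model and is not shown here.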
Okay, I've been stuck for hours on something that should have taken only a few minutes of work.
I have the following code which pulls a gzipped CSV file from a datastore:
from ftplib import FTP_TLS
import gzip
import csv
ftps = FTP_TLS('waws-prod.net')
ftps.login(user='foo', passwd='bar')
resp = ftps.retrbinary('RETR data/WFSIV0606201701.700.csv.gz', gzip.open('WFSIV0606201701.700.csv.gz', 'wb').write)
The file appears in the working directory, and I can even open it with my Mac decompression tool, which decompresses the original CSV perfectly.
However, if I try to decompress this file using the gzip library, I can't get a UTF-8 encoded string to parse:
f=gzip.GzipFile('WFSIV0606201701.700.csv.gz', 'rb')
s = f.read()
I get what appears to be UTF-8 bytestrings; however, the UTF-8 decoder can't parse the string.
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x8b in position 1: invalid start byte
BUT! If I download directly from the SFTP server using FileZilla and run the gzip.GzipFile code above, it reads it perfectly. Something must be wrong with my downloader/reader, but I haven't a clue what it could be.
resp = ftps.retrbinary('RETR data/WFSIV0606201701.700.csv.gz', gzip.open('WFSIV0606201701.700.csv.gz', 'wb').write)
This line downloads a compressed file, and then compresses it again when writing it to disk.
Replace gzip.open(...).write with open(...).write to write the compressed file directly.
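The double compression can be reproduced in memory, without an FTP server. This is a minimal sketch with made-up CSV content; `server_payload` stands in for the bytes that retrbinary hands to the callback. It also explains the error message: after one round of decompression the data is still a gzip stream, whose second byte is 0x8b.

```python
import gzip
import io

original_csv = b"id,name\n1,example\n"

# What the server sends: an already-compressed .gz payload.
server_payload = gzip.compress(original_csv)

# Writing through a gzip writer compresses the payload a second time.
double_buf = io.BytesIO()
with gzip.GzipFile(fileobj=double_buf, mode="wb") as gz:
    gz.write(server_payload)

# Decompressing once yields the inner .gz bytes, not the CSV.
once = gzip.decompress(double_buf.getvalue())

# Writing the payload as-is (a plain binary write) keeps a single gzip layer.
plain_buf = io.BytesIO()
plain_buf.write(server_payload)
csv_text = gzip.decompress(plain_buf.getvalue()).decode("utf-8")
```

With ftplib, the plain write corresponds to passing the write method of a file opened with open(..., 'wb'), ideally inside a with block so the file is closed once the transfer finishes.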
I am trying to output my model as a CSV file. It works fine with a small amount of data in the model, but it is very slow with large data. Secondly, there is an error when outputting the model as CSV. The logic I am using is:
def some_view(request):
    # Create the HttpResponse object with the appropriate CSV header.
    response = HttpResponse(content_type='text/csv')
    response['Content-Disposition'] = 'attachment; filename="news.csv"'
    writer = csv.writer(response)
    news_obj = News.objects.using('cms').all()
    for item in news_obj:
        #writer.writerow([item.newsText])
        writer.writerow([item.userId.name])
    return response
and the error I am facing is:
UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-6: ordinal not in range(128)
It further says:
The string that could not be encoded/decoded was: عبدالله الحذ
Replace the line
writer.writerow([item.userId.name])
with:
writer.writerow([item.userId.name.encode('utf-8')])
Before saving a unicode string to a file, you must encode it with some encoding. Most systems use UTF-8 by default, so it's a safe choice.
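The .encode('utf-8') call applies to Python 2, where csv.writer works with byte strings. On Python 3 the writer expects str, and the encoding is chosen when the text is turned into bytes. This is a minimal sketch, assuming Python 3, with an in-memory buffer standing in for the HttpResponse and illustrative names in place of the model data.

```python
import csv
import io

# Illustrative values standing in for item.userId.name.
names = ["عبدالله", "Alice"]

# newline="" leaves the csv module in charge of line endings.
buffer = io.StringIO(newline="")
writer = csv.writer(buffer)
for name in names:
    # Pass str objects; no manual encode() call on Python 3.
    writer.writerow([name])

# Encode once, when the text becomes bytes.
csv_bytes = buffer.getvalue().encode("utf-8")

# Decode again to confirm nothing was lost.
round_trip = csv_bytes.decode("utf-8").splitlines()
```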
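The error shows that the content written to the CSV file is being encoded as ASCII. Another option is to encode the text yourself and drop whatever ASCII cannot represent:
>>> u'aあä'.encode('ascii', 'ignore')
'a'
You can avoid the error by ignoring the non-ASCII characters, though this silently discards them (a name written entirely in Arabic would come out empty):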
writer.writerow([item.userId.name.encode('ascii', 'ignore')])
On Amazon SES, I have a rule to save incoming emails to S3 buckets. Amazon saves these in MIME format.
These emails have a .txt attachment that is shown in the MIME file as content-type=text/plain, Content-Disposition=attachment ... .txt, and Content-Transfer-Encoding=quoted-printable or base64.
I am able to parse it fine using Python.
I have a problem decoding the content of the .txt file attachment when it is compressed (i.e., content-type: application/zip), as if the encoding weren't base64.
My code:
import base64
s = unicode(base64.b64decode(attachment_content), "utf-8")
throws the error:
Traceback (most recent call last):
File "<input>", line 796, in <module>
UnicodeDecodeError: 'utf8' codec can't decode byte 0xcf in position 10: invalid continuation byte
Below are the first few lines of the "base64" string in attachment_content, which by the way has length 53683 plus "==" at the end, and I thought that the length of a base64 string should be a multiple of 4 (?).
So maybe the decoding is failing because the compression is changing attachment_content and I need some other operation before or after decoding it? I really have no idea.
UEsDBBQAAAAIAM9Ah0otgkpwx5oAADMTAgAJAAAAX2NoYXQudHh0tL3bjiRJkiX23sD+g0U3iOxu
REWGu8c1l2Ag8lKd0V2ZWajM3kLuC6Hubu5uFeZm3nYJL6+n4T4Ry8EOdwCSMyQXBRBLgMQ+7CP5
QPBj5gdYn0CRI6JqFxWv7hlyszursiJV1G6qonI5cmQyeT6dPp9cnCaT6Yvp5Yvz6xfJe7cp8P/k
1SbL8xfJu0OSvUvr2q3TOnFVWjxrknWZFeuk2VRlu978s19MRvNMrHneOv51SOZlGUtMLYnfp0nd
...
I have also tried using "latin-1", but I get gibberish.
The problem was that, after base64 decoding, I was dealing with a zipped file, with content like "PK \x03 \x04 \X3C \Xa \x0c ...", and I needed to unzip it before converting it to a UTF-8 unicode string.
This code worked for me:
import email
# Parse results from email
received_email = email.message_from_string(email_text)
for part in received_email.walk():
    c_type = part.get_content_type()
    c_enco = part.get('Content-Transfer-Encoding')
    attachment_content = part.get_payload()
    if c_enco == 'base64':
        import base64
        decoded_file = base64.b64decode(attachment_content)
        print("File decoded from base64")
        if c_type == "application/zip":
            from cStringIO import StringIO
            import zipfile
            zfp = zipfile.ZipFile(StringIO(decoded_file), "r")
            unzipped_list = zfp.open(zfp.namelist()[0]).readlines()
            decoded_file = "".join(unzipped_list)
            print('And un-zipped')
        result = unicode(decoded_file, "utf-8")
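The code above is written for Python 2 (unicode, cStringIO). The same three steps, base64-decode, unzip, then decode as UTF-8, might look like this on Python 3. This is a minimal sketch: the attachment is built in memory for illustration, and the member name and text are placeholders rather than data from a real email.

```python
import base64
import io
import zipfile

# Build an illustrative attachment: a zip with one text file, base64-encoded.
raw_zip = io.BytesIO()
with zipfile.ZipFile(raw_zip, "w") as zf:
    zf.writestr("_chat.txt", "héllo from the attachment\n".encode("utf-8"))
attachment_content = base64.b64encode(raw_zip.getvalue()).decode("ascii")

# Step 1: undo the transfer encoding. The result is zip data, not text.
decoded_file = base64.b64decode(attachment_content)

# Step 2: unzip from memory and read the first member.
with zipfile.ZipFile(io.BytesIO(decoded_file)) as zfp:
    first_name = zfp.namelist()[0]
    unzipped = zfp.read(first_name)

# Step 3: only now decode the bytes as UTF-8 text.
result = unzipped.decode("utf-8")
```

Decoding `decoded_file` directly as UTF-8 fails for the reason given in the answer: it starts with the zip signature `PK\x03\x04` and contains compressed binary data.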