Django 1.7: serve a pdf -file (UnicodeDecodeError)

Django 1.7: serve a pdf -file (UnicodeDecodeError) - python

I'm trying to serve a PDF file with django 1.7, and this is basically the code that "should" work... it certainly works if I change the content_type to 'text' and download a .tex file with it, but when I try it with a binary file, I get "UnicodeDecodeError at /path/to/file/filename.pdf
'utf-8' codec can't decode byte 0xd0 in position 10: invalid continuation byte"
def download(request, file_name):
file = open('path/to/file/{}'.format(file_name), 'r')
response = HttpResponse(file, content_type='application/pdf')
response['Content-Disposition'] = "attachment; filename={}".format(file_name)
return response
So basically, if I understand correctly, it's trying to serve the file as a UTF-8 encoded text file, instead of a binary file. I've tried to change the content_type to 'application/octet-stream' with similar results. What am I missing?

Try opening the file using binary mode:
file = open('path/to/file/{}'.format(file_name), 'rb')

Related

How to return file contents from controller?

I'm trying to return the contents of an image file via a Python Connexion application generated from an OpenAPI v2 spec file using swagger-codegen and the python-flask language setting. In my controller module, I simply do the following:
def file_contents_get(file_id):
file = app.datastore.get_instance().get_file(file_id)
with open(file.path, "rb") as f:
return f.read()
However, this results in the following error:
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xff in position 0: invalid start byte
What is the proper way to return a file's contents? Note that I don't want the file as an attachment but rather inline.

How to save an encoded pdf in a zip file using django?

I've read some posts about this problem, but most of them didn't help my case, I'm trying to save an encoded pdf in a zip file (I'm using Docraptor API for the pdf generation, which return the encoded pdf).
def toZip(request, ...):
...
response = docraptor_api_call() #api call to generate pdf (encoded pdf)
with open('creation.pdf', 'wb') as f:
f.write(response)
#decode pdf
with open(f.name, 'rb') as pdf:
# this will download the pdf to the user
# doc = HttpResponse(pdf.read(), content_type='application/pdf')
# doc['Content-Disposition'] = "attachment; filename=filename.pdf"
# return doc
zip_io = io.BytesIO()
# create zipFile
zf = zipfile.ZipFile(zip_io, mode='w')
# write PDF in ZIP ?
save_zf = zf.write(pdf.read())
# save zip to FileField
zip = ZipStore.objects.create(zip=save_zf)
While trying the code on top I get this error :
UnicodeEncodeError: 'charmap' codec can't encode character '\u2019' in position 43: character maps to
I'm don't really get what am I doing wrong and how I should fix it, any suggestion ?

You've got an error in the way you're calling zf.write. You should be using:
# ZipFile.write would take the the file to write, not bytes to be written.
# f.name is the name of the file in the zip archive. So if I passed
# in "foo.txt", "1", I'd get a file named `foo.txt` after decompressing, and its
# contents would be 1
zf.writestr(f.name, pdf.read())
This method does not appear to return something, so you'll need to change this: zip = ZipStore.objects.create(zip=save_zf) probably to:
zip = ZipStore.objects.create(zip=zip_io)

How to decompress a GZIP file pulled from SFTP in Python3 the same way Mac OS's gunzip does it?

Okay, I've been stuck on this one for hours which should have only taken a few minutes of work.
I have the following code which pulls a gzipped CSV file from a datastore:
from ftplib import FTP_TLS
import gzip
import csv
ftps = FTP_TLS('waws-prod.net')
ftps.login(user='foo', passwd='bar')
resp = ftps.retrbinary('RETR data/WFSIV0606201701.700.csv.gz', gzip.open('WFSIV0606201701.700.csv.gz', 'wb').write)
The file appears in the pwd, and I can even open my Mac Decompression tool, and the original CSV is decompressed perfectly.
However, if I try to decompress this file in using the gzip Library, i can't get a UTF8 encoded string to parse:
f=gzip.GzipFile('WFSIV0606201701.700.csv.gz', 'rb')
s = f.read()
I get what appears to be UTF8 bytestrings, however utf8 decoder can't parse the string.
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x8b in position 1: invalid start byte
BUT! If i download directly from the SFTP server using FileZilla, and i do run the gzip.GzipFile code above, it reads it perfectly. Something must be wrong with my downloader/reader but i haven't a clue as to what could be wrong.

resp = ftps.retrbinary('RETR data/WFSIV0606201701.700.csv.gz', gzip.open('WFSIV0606201701.700.csv.gz', 'wb').write)
This line downloads a compressed file, and then compresses it again when writing it to disk.
Replace gzip.open(...).write with open(...).write to write the compressed file directly.

Errror in outputting CSV with Django?

I am trying to output my model as a CSV file.It is working fine with small data in model and it is very slow with large data.And secondly there are some error in outputting a model as CSV.My logic which I am using is:
def some_view(request):
# Create the HttpResponse object with the appropriate CSV header.
response = HttpResponse(content_type='text/csv')
response['Content-Disposition'] = 'attachment; filename="news.csv"'
writer = csv.writer(response)
news_obj = News.objects.using('cms').all()
for item in news_obj:
#writer.writerow([item.newsText])
writer.writerow([item.userId.name])
return response
and the error which I am facing is:
UnicodeEncodeError :--
'ascii' codec can't encode characters in position 0-6: ordinal not in
range(128)
and further it says:-
The string that could not be encoded/decoded was: عبدالله الحذ

Replace line
writer.writerow([item.userId.name])
with:
writer.writerow([item.userId.name.encode('utf-8')])
Before saving unicode string to a file you must encode it in some encoding. Most system use utf-8 by default, so it's a safe choice.

From the error, The write content of csv file is like ASCII character. So decode the character.
>>>u'aあä'.encode('ascii', 'ignore')
'a'
Can fix this error from ignoring the ASCII character:
writer.writerow([item.userId.name.encode('ascii', 'ignore')])

Parse text file from content-type=application/zip and base64 encoding in AWS SES

On amazon SES, I have a rule to save incoming emails to S3 buckets. Amazon saves these in MIME format.
These emails have a .txt in attachment that will be shown in the MIME file as content-type=text/plain, Content-Disposition=attachment ... .txt, and Content-Transfer-Encoding=quoted-printable or bases64.
I am able to parse it fine using python.
I have a problem decoding the content of the .txt file attachment when this is compressed (i.e., content-type: applcation/zip), as if the encoding wasn't base64.
My code:
import base64
s = unicode(base64.b64decode(attachment_content), "utf-8")
throws the error:
Traceback (most recent call last):
File "<input>", line 796, in <module>
UnicodeDecodeError: 'utf8' codec can't decode byte 0xcf in position 10: invalid continuation byte
Below are the first few lines of the "base64" string in attachment_content, which btw has length 53683 + "==" at the end, and I thought that the length of a base64 should be a multiple of 4 (??).
So maybe the decoding is failing because the compression is changing attachment_content and I need some other operation before/after decoding it? I have really no idea..
UEsDBBQAAAAIAM9Ah0otgkpwx5oAADMTAgAJAAAAX2NoYXQudHh0tL3bjiRJkiX23sD+g0U3iOxu
REWGu8c1l2Ag8lKd0V2ZWajM3kLuC6Hubu5uFeZm3nYJL6+n4T4Ry8EOdwCSMyQXBRBLgMQ+7CP5
QPBj5gdYn0CRI6JqFxWv7hlyszursiJV1G6qonI5cmQyeT6dPp9cnCaT6Yvp5Yvz6xfJe7cp8P/k
1SbL8xfJu0OSvUvr2q3TOnFVWjxrknWZFeuk2VRlu978s19MRvNMrHneOv51SOZlGUtMLYnfp0nd
...
I have also tried used "latin-1", but get gibberish.

The problem was that, after conversion, I was dealing with a zipped file in format, like "PK \x03 \x04 \X3C \Xa \x0c ...", and I needed to unzip it before transforming it to UTF-8 unicode.
This code worked for me:
import email
# Parse results from email
received_email = email.message_from_string(email_text)
for part in received_email.walk():
c_type = part.get_content_type()
c_enco = part.get('Content-Transfer-Encoding')
attachment_content = part.get_payload()
if c_enco == 'base64':
import base64
decoded_file = base64.b64decode(attachment_content)
print("File decoded from base64")
if c_type == "application/zip":
from cStringIO import StringIO
import zipfile
zfp = zipfile.ZipFile(StringIO(decoded_file), "r")
unzipped_list = zfp.open(zfp.namelist()[0]).readlines()
decoded_file = "".join(unzipped_list)
print('And un-zipped')
result = unicode(decoded_file, "utf-8")

Develop Reference

Python is a programming language that lets you work quickly and integrate systems more effectively.

Django 1.7: serve a pdf -file (UnicodeDecodeError) - python

Try opening the file using binary mode: file = open('path/to/file/{}'.format(file_name), 'rb')

Related

How to return file contents from controller?

How to save an encoded pdf in a zip file using django?

How to decompress a GZIP file pulled from SFTP in Python3 the same way Mac OS's gunzip does it?

Errror in outputting CSV with Django?

Parse text file from content-type=application/zip and base64 encoding in AWS SES

Categories

Resources