Confused about making a CSV file into a ZIP file in django - python

I have a view that takes data from my site and then makes it into a zip compressed csv file. Here is my working code sans zip:
def backup_to_csv(request):
    response = HttpResponse(mimetype='text/csv')
    response['Content-Disposition'] = 'attachment; filename=backup.csv'
    writer = csv.writer(response, dialect='excel')
    #code for writing csv file go here...
    return response
and it works great. Now I want that file to be compressed before it gets sent out. This is where I get stuck.
def backup_to_csv(request):
    output = StringIO.StringIO() ## temp output file
    writer = csv.writer(output, dialect='excel')
    #code for writing csv file go here...
    response = HttpResponse(mimetype='application/zip')
    response['Content-Disposition'] = 'attachment; filename=backup.csv.zip'
    z = zipfile.ZipFile(response,'w') ## write zip to response
    z.writestr("filename.csv", output) ## write csv file to zip
    return response
But that's not it, and I have no idea how to do this.

OK I got it. Here is my new function:
def backup_to_csv(request):
    output = StringIO.StringIO() ## temp output file
    writer = csv.writer(output, dialect='excel')
    #code for writing csv file go here...
    response = HttpResponse(mimetype='application/zip')
    response['Content-Disposition'] = 'attachment; filename=backup.csv.zip'
    z = zipfile.ZipFile(response,'w') ## write zip to response
    z.writestr("filename.csv", output.getvalue()) ## write csv file to zip
    z.close() ## finish the archive (writes the zip central directory)
    return response

Note how, in the working case, you return response... and in the NON-working case you return z, which is NOT an HttpResponse of course (while it should be!).
So: use your csv_writer NOT on response but on a temporary file; zip the temporary file; and write THAT zipped bytestream into the response!

zipfile.ZipFile(response,'w')
doesn't seem to work in Python 2.7.9. The response is a django.HttpResponse object (which is said to be file-like), but it raises the error "HttpResponse object does not have an attribute 'seek'". When the same code is run in Python 2.7.0 or 2.7.6 (I haven't tested other versions) it works. So you'd better test it with Python 2.7.9 and see if you get the same behaviour.
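One way to sidestep the missing seek is to build the archive in an in-memory buffer and hand the finished bytes to the response. The following is a minimal sketch, assuming Python 3 and only the standard library; build_zipped_csv and its rows argument are hypothetical names, and the Django part is shown only as a comment because it is not exercised here.

```python
import csv
import io
import zipfile


def build_zipped_csv(rows, inner_name="backup.csv"):
    """Return the bytes of a ZIP archive holding one CSV file built from rows."""
    # Write the CSV into a text buffer first (csv works on text in Python 3).
    csv_buffer = io.StringIO(newline="")
    csv.writer(csv_buffer, dialect="excel").writerows(rows)

    # BytesIO is seekable, which is what ZipFile needs.
    zip_buffer = io.BytesIO()
    with zipfile.ZipFile(zip_buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        archive.writestr(inner_name, csv_buffer.getvalue())
    # Leaving the with-block closes the archive and writes its central directory.
    return zip_buffer.getvalue()


# In a Django view (assumption, not run here):
#   response = HttpResponse(build_zipped_csv(rows), content_type="application/zip")
#   response["Content-Disposition"] = "attachment; filename=backup.csv.zip"
```

Because the response only ever receives complete bytes, it does not matter whether the response object itself supports seek.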

Related

In Memory CSV download file name

I am using csv writer to create a csv in memory and returning it as a response using django.
My code looks like this.
response = HttpResponse(content_type='text/csv')
writer = csv.writer(response)
writer.writerow(['rizwan','mumtaz'])
......
return response
The code works fine, but every time I get 'download.csv'. How can I change the name 'download.csv' to somethingelse.csv?
response['Content-Disposition'] = 'attachment; filename="somefilename.csv"'
https://docs.djangoproject.com/en/stable/howto/outputting-csv/

How to generate a file without saving it to disk in python?

I'm using Python 2.7 and Django 1.7.
I have a method in my admin interface that generates some kind of a csv file.
def generate_csv(args):
    ...
    #some code that generates a dictionary to be written as csv
    ....
    # this creates a directory and returns its filepath
    dirname = create_csv_dir('stock')
    csvpath = os.path.join(dirname, 'mycsv_file.csv')
    fieldnames = [#some field names]
    # this function creates the csv file in the directory shown by the csvpath
    newcsv(data, csvheader, csvpath, fieldnames)
    # this automatically starts a download from that directory
    return HttpResponseRedirect('/media/csv/stock/%s' % csvfile)
All in all I create a csv file, save it somewhere on the disk, and then pass its URL to the user for download.
I was wondering whether all this can be done without writing to disk. I googled around a bit, and maybe a Content-Disposition attachment might help me, but I got a bit lost in the documentation.
Anyway if there's an easier way of doing this I'd love to know.
Thanks to @Ragora, who pointed me in the right direction.
I rewrote the newcsv method:
import StringIO  # Python 2 module, so that StringIO.StringIO() below resolves
import csv

def newcsv(data, csvheader, fieldnames):
    """
    Create a new csv file that represents generated data.
    """
    new_csvfile = StringIO.StringIO()
    wr = csv.writer(new_csvfile, quoting=csv.QUOTE_ALL)
    wr.writerow(csvheader)
    wr = csv.DictWriter(new_csvfile, fieldnames = fieldnames)
    for key in data.keys():
        wr.writerow(data[key])
    return new_csvfile
and in the admin:
csvfile = newcsv(data, csvheader, fieldnames)
response = HttpResponse(csvfile.getvalue(), content_type='text/csv')
response['Content-Disposition'] = 'attachment; filename=stock.csv'
return response
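On Python 3 the same idea can be sketched with io.StringIO and a single DictWriter. This is only a sketch under assumptions: build_csv_text is a hypothetical name, data is assumed to be a dict of row dicts as in the answer above, and the header row is taken from fieldnames instead of a separate csvheader.

```python
import csv
import io


def build_csv_text(data, fieldnames):
    """Render a dict of row dicts as CSV text without touching the disk."""
    buffer = io.StringIO(newline="")
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, quoting=csv.QUOTE_ALL)
    writer.writeheader()
    for key in data:
        writer.writerow(data[key])
    return buffer.getvalue()
```

The returned string can be passed to HttpResponse in the same way as csvfile.getvalue() above.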
If it annoys you that you are saving a file to disk, just send the response with the application/octet-stream content type and a Content-Disposition header, then delete the file from disk.
If this header (Content-Disposition) is used in a response with the application/octet-stream content-type, the implied suggestion is that the user agent should not display the response, but directly enter a 'save response as...' dialog.

Python: generate xlsx in memory and stream file download?

For example, the following code creates the xlsx file first and then streams it as a download, but I'm wondering whether it is possible to send the xlsx data as it is being created. Imagine a very large xlsx file needs to be generated: the user has to wait until it is finished before receiving the download. What I'd like is to start the download in the user's browser and then send the data over as it is being generated. It seems trivial with a .csv file but not so with an xlsx file.
try:
    import cStringIO as StringIO
except ImportError:
    import StringIO

from django.http import HttpResponse
from xlsxwriter.workbook import Workbook

def your_view(request):
    # your view logic here
    # create a workbook in memory
    output = StringIO.StringIO()
    book = Workbook(output)
    sheet = book.add_worksheet('test')
    sheet.write(0, 0, 'Hello, world!')
    book.close()
    # construct response
    output.seek(0)
    response = HttpResponse(output.read(), mimetype="application/vnd.openxmlformats-officedocument.spreadsheetml.sheet")
    response['Content-Disposition'] = "attachment; filename=test.xlsx"
    return response
Are you able to write tempfiles to disk while generating the XLSX?
If you are able to use tempfile you won't be memory bound, which is nice, but the download will still only start when the XLSX writer is done assembling the document.
If you can't write tempfiles, you'll have to follow this example http://xlsxwriter.readthedocs.org/en/latest/example_http_server.html and your code is unfortunately completely memory bound.
Streaming CSV is very easy, on the other hand. Here is code we use to stream any iterator of rows in a CSV response:
import csv
import io

def csv_generator(data_generator):
    csvfile = io.BytesIO()
    csvwriter = csv.writer(csvfile)

    def read_and_flush():
        csvfile.seek(0)
        data = csvfile.read()
        csvfile.seek(0)
        csvfile.truncate()
        return data

    for row in data_generator:
        csvwriter.writerow(row)
        yield read_and_flush()

def csv_stream_response(response, iterator, file_name="xxxx.csv"):
    response.content_type = 'text/csv'
    response.content_disposition = 'attachment;filename="' + file_name + '"'
    response.charset = 'utf8'
    response.content_encoding = 'utf8'
    response.app_iter = csv_generator(iterator)
    return response
xlsx format is a zip file that contains several individual files, so you can't create it on the fly and send it out as it is being created.
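The CSV generator above writes through io.BytesIO, which only works with the Python 2 csv module. A comparable sketch for Python 3 is given below, assuming the csv module should write text that is then encoded to UTF-8; iter_csv_chunks is a hypothetical name.

```python
import csv
import io


def iter_csv_chunks(rows):
    """Yield one encoded CSV line per input row, reusing a single buffer."""
    buffer = io.StringIO(newline="")
    writer = csv.writer(buffer)
    for row in rows:
        writer.writerow(row)
        chunk = buffer.getvalue()
        # Reset the buffer so memory use stays bounded by one row.
        buffer.seek(0)
        buffer.truncate()
        yield chunk.encode("utf-8")
```

In Django, a generator like this could be handed to StreamingHttpResponse with content_type='text/csv'; that wiring is not shown or tested here.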

Serving Excel(xlsx) file to the user for download in Django(Python)

I'm trying to create and serve Excel files using Django. I have a jar file which takes parameters and produces an Excel file accordingly, and it works with no problem. But when I try to get the produced file and serve it to the user for download, the file comes out broken: it has a size of 0 KB. This is the code I'm using for Excel generation and serving.
def generateExcel(request,id):
    if os.path.exists('./%s_Report.xlsx' % id):
        excel = open("%s_Report.xlsx" % id, "r")
        output = StringIO.StringIO(excel.read())
        out_content = output.getvalue()
        output.close()
        response = HttpResponse(out_content,content_type='application/vnd.openxmlformats-officedocument.spreadsheetml.sheet')
        response['Content-Disposition'] = 'attachment; filename=%s_Report.xlsx' % id
        return response
    else:
        args = ['ServerExcel.jar', id]
        result = jarWrapper(*args) # this creates the excel file with no problem
        if result:
            excel = open("%s_Report.xlsx" % id, "r")
            output = StringIO.StringIO(excel.read())
            out_content = output.getvalue()
            output.close()
            response = HttpResponse(out_content,content_type='application/vnd.openxmlformats-officedocument.spreadsheetml.sheet')
            response['Content-Disposition'] = 'attachment; filename=%s_Report.xlsx' % id
            return response
        else:
            return HttpResponse(json.dumps({"no":"excel","no one": "cries"}))
I have searched for possible solutions and also tried to use File Wrapper, but the result did not change. I assume I have a problem with reading the xlsx file into the StringIO object, but I don't have any idea how to fix it.
Why on earth are you passing your file's content to a StringIO just to assign StringIO.getvalue() to a local variable? What's wrong with assigning file.read() to your variable directly?
def generateExcel(request,id):
    path = './%s_Report.xlsx' % id # this should live elsewhere, definitely
    if os.path.exists(path):
        with open(path, "r") as excel:
            data = excel.read()
        response = HttpResponse(data,content_type='application/vnd.openxmlformats-officedocument.spreadsheetml.sheet')
        response['Content-Disposition'] = 'attachment; filename=%s_Report.xlsx' % id
        return response
    else:
        # quite some duplication to fix down there
Now you may want to check whether you actually had any content in your file: the fact that the file exists doesn't mean it has anything in it. Remember that you're in a concurrent context, so you can have one thread or process trying to read the file while another (i.e. another request) is trying to write it.
In addition to what Bruno says, you probably need to open the file in binary mode:
excel = open("%s_Report.xlsx" % id, "rb")
You can use this library to create excel sheets on the fly.
http://xlsxwriter.readthedocs.io/
For more information see this page. Thanks to @alexcxe.
XlsxWriter object save as http response to create download in Django
My answer is:
def generateExcel(request,id):
    if os.path.exists('./%s_Report.xlsx' % id):
        with open('./%s_Report.xlsx' % id, "rb") as file:
            response = HttpResponse(file.read(),content_type='application/vnd.openxmlformats-officedocument.spreadsheetml.sheet')
        response['Content-Disposition'] = 'attachment; filename=%s_Report.xlsx' % id
        return response
    else:
        # quite some duplication to fix down there
Why use "rb"? Because the HttpResponse constructor takes (self, content=b'', *args, **kwargs), so we should open the file with "rb" and call .read() to get bytes.
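The two points raised in the answers (open in binary mode, and check that the file actually has content) can be combined in a small helper. This is a sketch under assumptions: read_report_bytes is a hypothetical name, and treating an empty file as "not ready" is a design choice made here, not something the answers specify.

```python
import os


def read_report_bytes(path):
    """Return the file's bytes, or None if it is missing or still empty."""
    if not os.path.isfile(path) or os.path.getsize(path) == 0:
        return None
    # "rb" avoids newline translation and text decoding of the binary xlsx data.
    with open(path, "rb") as report:
        return report.read()
```

A view could return an error response when the helper gives None instead of serving a 0-byte download. Note that a size check alone does not rule out a file that another process is still writing.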

Django httpresponse stripping CR

When I open the text file attachment generated by the code below, the HTTP response always seems to strip out the CR from each line. The users of this file will be using Notepad, so I need CR/LF on each line.
the_file = tempfile.TemporaryFile(mode='w+b')
<procedure call to generate lines of text in "the_file">
the_file.seek(0)
filestring = the_file.read()
response = HttpResponse(filestring,
                        mimetype="text/plain")
response['Content-Length'] = the_file.tell()
response['Content-Disposition'] = 'attachment; filename="4cos_example.txt"'
return response
If I use this method, I get CR/LF in my files but I'd like to avoid having to write the file to disk at all, so it doesn't seem to be a good solution:
the_file = open('myfile.txt','w+')
<procedure call to generate lines of text in "the_file">
the_file.close()
the_file = open('myfile.txt','rb')
filestring = the_file.read()
response = HttpResponse(filestring,
                        mimetype="text/plain")
response['Content-Length'] = the_file.tell()
response['Content-Disposition'] = 'attachment; filename="4cos_example.txt"'
return response
I feel like the solution should be obvious, but I can't close a tempfile and re-open it in binary mode (preserving the CR/LF). Heck, I'm not even sure I'm in the right ballpark regarding how to do this correctly :) Nonetheless, I'd like to pass this data as an attachment to the user after the configuration is assembled and have it display correctly in Notepad. Is tempfile the wrong solution here, or is there a mechanism in tempfile that will solve this issue for me without having to use file IO on disk?
Instead of using TemporaryFile, just use HttpResponse:
response = HttpResponse('', content_type='text/plain')
response['Content-Disposition'] = 'attachment; filename="4cos_example.txt"'
response.write('first line\r\n')
response.write('second line\r\n')
return response
FYI, if this is a very large response, you can also use StreamingHttpResponse. But only do that if required, since headers like Content-Length will not be able to be added automatically.
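If the lines are already available in memory, the payload can also be assembled as bytes up front, so that no text-mode newline translation is involved and the length is known. A minimal sketch, assuming Python 3 and UTF-8 output; render_crlf is a hypothetical name.

```python
def render_crlf(lines):
    """Join lines with CR/LF endings and return the encoded bytes."""
    text = "".join(line + "\r\n" for line in lines)
    return text.encode("utf-8")


payload = render_crlf(["first line", "second line"])
# len(payload) is the value a Content-Length header would carry.
```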
