How to generate a file without saving it to disk in python?

How to generate a file without saving it to disk in python? - python

I'm using Python 2.7 and Django 1.7.
I have a method in my admin interface that generates some kind of a csv file.
def generate_csv(args):
...
#some code that generates a dictionary to be written as csv
....
# this creates a directory and returns its filepath
dirname = create_csv_dir('stock')
csvpath = os.path.join(dirname, 'mycsv_file.csv')
fieldnames = [#some field names]
# this function creates the csv file in the directory shown by the csvpath
newcsv(data, csvheader, csvpath, fieldnames)
# this automatically starts a download from that directory
return HttpResponseRedirect('/media/csv/stock/%s' % csvfile)
All in all I create a csv file, save it somewhere on the disk, and then pass its URL to the user for download.
I was thinking if all this can be done without writing to disc. I googled around a bit and maybe content disposition attachment might help me, but I got lost in documentation a bit.
Anyway if there's an easier way of doing this I'd love to know.

Thanks to #Ragora, you pointed me towards the right direction.
I rewrote the newcsv method:
from io import StringIO
import csv
def newcsv(data, csvheader, fieldnames):
"""
Create a new csv file that represents generated data.
"""
new_csvfile = StringIO.StringIO()
wr = csv.writer(new_csvfile, quoting=csv.QUOTE_ALL)
wr.writerow(csvheader)
wr = csv.DictWriter(new_csvfile, fieldnames = fieldnames)
for key in data.keys():
wr.writerow(data[key])
return new_csvfile
and in the admin:
csvfile = newcsv(data, csvheader, fieldnames)
response = HttpResponse(csvfile.getvalue(), content_type='text/csv')
response['Content-Disposition'] = 'attachment; filename=stock.csv'
return response

If it annoys you that you are saving a file to disk, just add the application/octet-stream content-type to the Content-Disposition header then delete the file from disk.
If this header (Content-Disposition) is used in a response with the application/octet- stream content-type, the implied suggestion is that the user agent should not display the response, but directly enter a `save response as...' dialog.

Related

Django download a BinaryField as a CSV file

I am working with a legacy project and we need to implement a Django Admin that helps download a csv report that was stored as a BinaryField.
The model is something like this:
class MyModel(models.Model):
csv_report = models.BinaryField(blank=True,null=True)
Everything seems to being stored as expected but I have no clue how to decode the field back to a csv file for later use.
I am using something like these (as an admin action on MyModelAdmin class)
class MyModelAdmin(admin.ModelAdmin):
...
...
actions = ["download_file",]
def download_file(self, request,queryset):
# just getting one for testing
contents = queryset[0].csv_report
encoded_data = base64.b64encode(contents).decode()
with open("report.csv", "wb") as binary_file:
# Write bytes to file
decoded_image_data = base64.decodebytes(encoded_data)
binary_file.write(decoded_image_data)
response = HttpResponse(encoded_data)
response['Content-Disposition'] = 'attachment; filename=report.csv'
return response
download_file.short_description = "file"
But all I download is a scrambled csv file. I don't seem to understand if it is a problem of the format I am using to decode (.decode('utf-8') does nothing either )
PD:
I know it is a bad practice to use BinaryField for this. But requirements are requirements. Nothing to do about it.
EDIT:
As #TimRoberts pointed out, encoding and then decoding is REALLY silly :$. I've changed the method like so:
def download_file(self, request,queryset):
# print(self,request)
contents = queryset[0].csv_report
# print(type(contents))
encoded_data = base64.b64decode(contents)
with open("my_file.csv", "wb") as binary_file:
binary_file.write(encoded_data)
response = HttpResponse(encoded_data)
response['Content-Disposition'] = 'attachment; filename=blob.csv'
return response
download_file.short_description = "file"
Still I am getting a csv file with something like this:

A big fat case of the old RTFM: I was getting carried away by the all base64.. Obviously I didn't have any idea of what I was doing.
After tampering with the shell and reading the docs, I just changed my method to:
def download_file(self, request,queryset):
**contents = bytes(queryset[0].csv_report)**
response = HttpResponse(contents)
response['ContentDisposition']='attachment;filename=report.csv'
return response
Note that I was scrambling the data on purpose by doing the encoded_data = base64.b64decode(contents) stuff. I just needed to apply bytes on my BinaryField and voilá

Python: generate xlsx in memory and stream file download?

for example the following code creates the xlsx file first and then streams it as a download but I'm wondering if it is possible to send the xlsx data as it is being created. For example, imagine if a very large xlsx file needs to be generated, the user has to wait until it is finished and then receive the download, what I'd like is to start the xlsx file download in the user browser, and then send over the data as it is being generated. It seems trivial with a .csv file but not so with an xlsx file.
try:
import cStringIO as StringIO
except ImportError:
import StringIO
from django.http import HttpResponse
from xlsxwriter.workbook import Workbook
def your_view(request):
# your view logic here
# create a workbook in memory
output = StringIO.StringIO()
book = Workbook(output)
sheet = book.add_worksheet('test')
sheet.write(0, 0, 'Hello, world!')
book.close()
# construct response
output.seek(0)
response = HttpResponse(output.read(), mimetype="application/vnd.openxmlformats-officedocument.spreadsheetml.sheet")
response['Content-Disposition'] = "attachment; filename=test.xlsx"
return response

Are you able to write tempfiles to disk while generating the XLSX?
If you are able to use tempfile you won't be memory bound, which is nice, but the download will still only start when the XLSX writer is done assembling the document.
If you can't write tempfiles, you'll have to follow this example http://xlsxwriter.readthedocs.org/en/latest/example_http_server.html and your code is unfortunately completely memory bound.
Streaming CSV is very easy, on the other hand. Here is code we use to stream any iterator of rows in a CSV response:
import csv
import io
def csv_generator(data_generator):
csvfile = io.BytesIO()
csvwriter = csv.writer(csvfile)
def read_and_flush():
csvfile.seek(0)
data = csvfile.read()
csvfile.seek(0)
csvfile.truncate()
return data
for row in data_generator:
csvwriter.writerow(row)
yield read_and_flush()
def csv_stream_response(response, iterator, file_name="xxxx.csv"):
response.content_type = 'text/csv'
response.content_disposition = 'attachment;filename="' + file_name + '"'
response.charset = 'utf8'
response.content_encoding = 'utf8'
response.app_iter = csv_generator(iterator)
return response

xlsx format is a zip file that contains several individual files, so you can't create it on the fly and send it out as it is being created.

Serving Excel(xlsx) file to the user for download in Django(Python)

I'm trying create and serve excel files using Django. I have a jar file which gets parameters and produces an excel file according to parameters and it works with no problem. But when i'm trying to get the produced file and serve it to the user for download the file comes out broken. It has 0kb size. This is the code piece I'm using for excel generation and serving.
def generateExcel(request,id):
if os.path.exists('./%s_Report.xlsx' % id):
excel = open("%s_Report.xlsx" % id, "r")
output = StringIO.StringIO(excel.read())
out_content = output.getvalue()
output.close()
response = HttpResponse(out_content,content_type='application/vnd.openxmlformats-officedocument.spreadsheetml.sheet')
response['Content-Disposition'] = 'attachment; filename=%s_Report.xlsx' % id
return response
else:
args = ['ServerExcel.jar', id]
result = jarWrapper(*args) # this creates the excel file with no problem
if result:
excel = open("%s_Report.xlsx" % id, "r")
output = StringIO.StringIO(excel.read())
out_content = output.getvalue()
output.close()
response = HttpResponse(out_content,content_type='application/vnd.openxmlformats-officedocument.spreadsheetml.sheet')
response['Content-Disposition'] = 'attachment; filename=%s_Report.xlsx' % id
return response
else:
return HttpResponse(json.dumps({"no":"excel","no one": "cries"}))
I have searched for possible solutions and tried to use File Wrapper also but the result did not changed. I assume i have problem with reading the xlsx file into StringIO object. But dont have any idea about how to fix it

Why on earth are you passing your file's content to a StringIO just to assign StringIO.get_value() to a local variable ? What's wrong with assigning file.read() to your variable directly ?
def generateExcel(request,id):
path = './%s_Report.xlsx' % id # this should live elsewhere, definitely
if os.path.exists(path):
with open(path, "r") as excel:
data = excel.read()
response = HttpResponse(data,content_type='application/vnd.openxmlformats-officedocument.spreadsheetml.sheet')
response['Content-Disposition'] = 'attachment; filename=%s_Report.xlsx' % id
return response
else:
# quite some duplication to fix down there
Now you may want to check weither you actually had any content in your file - the fact that the file exists doesn't mean it has anything in it. Remember that you're in a concurrent context, you can have one thread or process trying to read the file while another (=>another request) is trying to write it.

In addition to what Bruno says, you probably need to open the file in binary mode:
excel = open("%s_Report.xlsx" % id, "rb")

You can use this library to create excel sheets on the fly.
http://xlsxwriter.readthedocs.io/
For more information see this page. Thanks to #alexcxe
XlsxWriter object save as http response to create download in Django

my answer is:
def generateExcel(request,id):
if os.path.exists('./%s_Report.xlsx' % id):
with open('./%s_Report.xlsx' % id, "rb") as file:
response = HttpResponse(file.read(),content_type='application/vnd.openxmlformats-officedocument.spreadsheetml.sheet')
response['Content-Disposition'] = 'attachment; filename=%s_Report.xlsx' % id
return response
else:
# quite some duplication to fix down there
why using "rb"? because HttpResponse class init parameters is (self, content=b'', *args, **kwargs), so we should using "rb" and using .read() to get the bytes.

Sending multiple .CSV files to .ZIP without storing to disk in Python

I'm working on a reporting application for my Django powered website. I want to run several reports and have each report generate a .csv file in memory that can be downloaded in batch as a .zip. I would like to do this without storing any files to disk. So far, to generate a single .csv file, I am following the common operation:
mem_file = StringIO.StringIO()
writer = csv.writer(mem_file)
writer.writerow(["My content", my_value])
mem_file.seek(0)
response = HttpResponse(mem_file, content_type='text/csv')
response['Content-Disposition'] = 'attachment; filename=my_file.csv'
This works fine, but only for a single, unzipped .csv. If I had, for example, a list of .csv files created with a StringIO stream:
firstFile = StringIO.StringIO()
# write some data to the file
secondFile = StringIO.StringIO()
# write some data to the file
thirdFile = StringIO.StringIO()
# write some data to the file
myFiles = [firstFile, secondFile, thirdFile]
How could I return a compressed file that contains all objects in myFiles and can be properly unzipped to reveal three .csv files?

zipfile is a standard library module that does exactly what you're looking for. For your use-case, the meat and potatoes is a method called "writestr" that takes a name of a file and the data contained within it that you'd like to zip.
In the code below, I've used a sequential naming scheme for the files when they're unzipped, but this can be switched to whatever you'd like.
import zipfile
import StringIO
zipped_file = StringIO.StringIO()
with zipfile.ZipFile(zipped_file, 'w') as zip:
for i, file in enumerate(files):
file.seek(0)
zip.writestr("{}.csv".format(i), file.read())
zipped_file.seek(0)
If you want to future-proof your code (hint hint Python 3 hint hint), you might want to switch over to using io.BytesIO instead of StringIO, since Python 3 is all about the bytes. Another bonus is that explicit seeks are not necessary with io.BytesIO before reads (I haven't tested this behavior with Django's HttpResponse, so I've left that final seek in there just in case).
import io
import zipfile
zipped_file = io.BytesIO()
with zipfile.ZipFile(zipped_file, 'w') as f:
for i, file in enumerate(files):
f.writestr("{}.csv".format(i), file.getvalue())
zipped_file.seek(0)

The stdlib comes with the module zipfile, and the main class, ZipFile, accepts a file or file-like object:
from zipfile import ZipFile
temp_file = StringIO.StringIO()
zipped = ZipFile(temp_file, 'w')
# create temp csv_files = [(name1, data1), (name2, data2), ... ]
for name, data in csv_files:
data.seek(0)
zipped.writestr(name, data.read())
zipped.close()
temp_file.seek(0)
# etc. etc.
I'm not a user of StringIO so I may have the seek and read out of place, but hopefully you get the idea.

def zipFiles(files):
outfile = StringIO() # io.BytesIO() for python 3
with zipfile.ZipFile(outfile, 'w') as zf:
for n, f in enumarate(files):
zf.writestr("{}.csv".format(n), f.getvalue())
return outfile.getvalue()
zipped_file = zip_files(myfiles)
response = HttpResponse(zipped_file, content_type='application/octet-stream')
response['Content-Disposition'] = 'attachment; filename=my_file.zip'
StringIO has getvalue method which return the entire contents. You can compress the zipfile
by zipfile.ZipFile(outfile, 'w', zipfile.ZIP_DEFLATED). Default value of compression is ZIP_STORED which will create zip file without compressing.

Confused about making a CSV file into a ZIP file in django

I have a view that takes data from my site and then makes it into a zip compressed csv file. Here is my working code sans zip:
def backup_to_csv(request):
response = HttpResponse(mimetype='text/csv')
response['Content-Disposition'] = 'attachment; filename=backup.csv'
writer = csv.writer(response, dialect='excel')
#code for writing csv file go here...
return response
and it works great. Now I want that file to be compressed before it gets sent out. This is where I get stuck.
def backup_to_csv(request):
output = StringIO.StringIO() ## temp output file
writer = csv.writer(output, dialect='excel')
#code for writing csv file go here...
response = HttpResponse(mimetype='application/zip')
response['Content-Disposition'] = 'attachment; filename=backup.csv.zip'
z = zipfile.ZipFile(response,'w') ## write zip to response
z.writestr("filename.csv", output) ## write csv file to zip
return response
But thats not it and I have no idea how to do this.

OK I got it. Here is my new function:
def backup_to_csv(request):
output = StringIO.StringIO() ## temp output file
writer = csv.writer(output, dialect='excel')
#code for writing csv file go here...
response = HttpResponse(mimetype='application/zip')
response['Content-Disposition'] = 'attachment; filename=backup.csv.zip'
z = zipfile.ZipFile(response,'w') ## write zip to response
z.writestr("filename.csv", output.getvalue()) ## write csv file to zip
return response

Note how, in the working case, you return response... and in the NON-working case you return z, which is NOT an HttpResponse of course (while it should be!).
So: use your csv_writer NOT on response but on a temporary file; zip the temporary file; and write THAT zipped bytestream into the response!

zipfile.ZipFile(response,'w')
doesn't seem to work in python 2.7.9. The response is a django.HttpResponse object (which is said to be file-like) but it gives an error "HttpResponse object does not have an attribute 'seek'. When the same code is run in python 2.7.0 or 2.7.6 (I haven't tested it in other versions) it is OK... So you'd better test it with python 2.7.9 and see if you get the same behaviour.

Develop Reference

Python is a programming language that lets you work quickly and integrate systems more effectively.

How to generate a file without saving it to disk in python? - python

Related

Django download a BinaryField as a CSV file

Python: generate xlsx in memory and stream file download?

Serving Excel(xlsx) file to the user for download in Django(Python)

Sending multiple .CSV files to .ZIP without storing to disk in Python

Confused about making a CSV file into a ZIP file in django

Categories

Resources