I have an app that takes in some information, performs some calculations using pandas, and turns the final pandas data frame into a CSV that is then downloaded using the Flask app. How do I download multiple CSVs within one view? It seems that I can only return a single response at a time.
An example snippet:
def serve_csv(dataframe, filename):
    buffer = StringIO.StringIO()
    dataframe.to_csv(buffer, encoding='utf-8', index=False)
    buffer.seek(0)
    return send_file(buffer,
                     attachment_filename=filename,
                     mimetype='text/csv')
def make_calculation(arg1, arg2):
    '''Does some calculations.
    input: arg1 - string, arg2 - string
    returns: a pandas data frame'''
@app.route('/test_app', methods=['GET', 'POST'])
def test_app():
    form = Form1()
    if form.validate_on_submit():
        calculated_dataframe = make_calculation(str(form.input_1.data), str(form.input_2.data))
        return serve_csv(calculated_dataframe, 'Your_final_output.csv')
    return render_template('test_app.html', form=form)
So let's say in that example above that make_calculation returned two pandas data frames. How would I print both of them to a CSV?
This is all the code you need, using a zip file. It returns a single zip archive containing all of your files.
In my program, everything I want to zip is in an output folder, so I just use os.walk and add each file to the archive with write. You need to close the zip file before returning it; if you don't, an empty file will be returned.
import zipfile
import os
from flask import send_file

@app.route('/download_all')
def download_all():
    zipf = zipfile.ZipFile('Name.zip', 'w', zipfile.ZIP_DEFLATED)
    for root, dirs, files in os.walk('output/'):
        for file in files:
            zipf.write('output/' + file)
    zipf.close()
    return send_file('Name.zip',
                     mimetype='zip',
                     attachment_filename='Name.zip',
                     as_attachment=True)
In the HTML I simply link to the route:
<a href="/download_all">DOWNLOAD ALL</a>
I hope this helped somebody. :)
You could return a MIME Multipart response, a zip file, or a TAR ball (please note the linked RFC is somewhat out of date, but is easier to quickly get up to speed with because it's in HTML; the official one is here).
If you choose to do a MIME multipart response, a good starting point might be to look at the MultipartEncoder and MultipartDecoder in requests toolbelt; you may be able to use them directly, or at least subclass/compose using those to get your desired behavior. Zip files and TAR balls can be implemented using standard library modules.
An alternative would be to design your API so that it returns JSON, and use a header (or an XML element or JSON field) to indicate that additional CSVs can be obtained with another request, or something similar.
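For illustration, here is a rough sketch of that JSON-based alternative. The /calculate and /csv/<name> routes and the result_files field are assumptions of mine, not part of the original question:

from flask import Flask, jsonify, send_file, url_for

app = Flask(__name__)

@app.route('/calculate', methods=['POST'])
def calculate():
    # ... run the calculation and write each dataframe to its own CSV on disk ...
    filenames = ['output_part1.csv', 'output_part2.csv']  # hypothetical file names
    # Return JSON telling the client where each CSV can be fetched separately
    return jsonify(result_files=[url_for('serve_single_csv', name=n) for n in filenames])

@app.route('/csv/<name>')
def serve_single_csv(name):
    # Each CSV is then downloaded with its own request
    return send_file(name, mimetype='text/csv', as_attachment=True)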
Building on @desfido's answer above, here is some example code that does not involve using zip, and instead downloads two different files:
from requests_toolbelt import MultipartEncoder
from flask import Response, render_template

def make_calculation(arg1, arg2):
    '''Does some calculations.
    input: arg1 - string, arg2 - string
    puts results in two different dataframes
    and stores them in two different files,
    returns the names of those two files'''
    return filename1, filename2

@app.route('/test_app', methods=['GET', 'POST'])
def test_app():
    form = Form1()
    if form.validate_on_submit():
        f1, f2 = make_calculation(str(form.input_1.data), str(form.input_2.data))
        m = MultipartEncoder({
            'field1': (f1, open(f1, 'rb'), 'text/plain'),
            'field2': (f2, open(f2, 'rb'), 'text/plain')
        })
        return Response(m.to_string(), mimetype=m.content_type)
    return render_template('test_app.html', form=form)
You may also try this, using the zipfile module:
import zipfile
from os.path import basename
from flask import send_file

UPLOAD_PATH = <upload_location>
base_files = ["file1.csv", "file2.csv"]

with zipfile.ZipFile(UPLOAD_PATH + 'Test.zip', 'w') as zipF:
    for file in base_files:
        zipF.write(UPLOAD_PATH + file, basename(UPLOAD_PATH + file), compress_type=zipfile.ZIP_DEFLATED)

return send_file(UPLOAD_PATH + 'Test.zip', mimetype='zip', attachment_filename='Test.zip', as_attachment=True)
Related
How can I use the Flask test_client to upload multiple files to one API endpoint?
I'm trying to use the Flask test_client to upload multiple files to a web service that accepts multiple files to combine them into one large file.
My controller looks like this:
@app.route("/combine/file", methods=["POST"])
@flask_login.login_required
def combine_files():
    user = flask_login.current_user
    combined_file_name = request.form.get("file_name")

    # Store file locally
    file_infos = []
    for file_data in request.files.getlist('file[]'):
        # Get the content of the file
        file_temp_path = "/tmp/{}-request.csv".format(file_id)
        file_data.save(file_temp_path)

        # Create a namedtuple with information about the file
        FileInfo = namedtuple("FileInfo", ["id", "name", "path"])
        file_infos.append(
            FileInfo(
                id=file_id,
                name=file_data.filename,
                path=file_temp_path
            )
        )
    ...
My test code looks like this:
def test_combine_file(get_project_files):
    project = get_project_files["project"]

    r = web_client.post(
        "/combine/file",
        content_type='multipart/form-data',
        buffered=True,
        follow_redirects=True,
        data={
            "project_id": project.project_id,
            "file_name": "API Test Combined File",
            "file": [
                (open("data/CC-Th0-MolsPerCell.csv", "rb"), "CC-Th0-MolsPerCell.csv"),
                (open("data/CC-Th1-MolsPerCell.csv", "rb"), "CC-Th1-MolsPerCell.csv")
            ]})

    response_data = json.loads(r.data)
    assert "status" in response_data
    assert response_data["status"] == "OK"
However, I can't get the test_client to actually upload both files. With more than one file specified, the file_data is empty when the API code loops. I have tried my own ImmutableDict with two "file" entries, a list of file tuples, a tuple of file tuples, anything I could think of.
What is the API to specify multiple files for upload in the Flask test_client? I can't find this anywhere on the web! :(
The test client takes a list of file objects (as returned by open()), so this is the testing utility I use:
def multi_file_upload(test_client, src_file_paths, dest_folder):
    files = []
    try:
        files = [open(fpath, 'rb') for fpath in src_file_paths]
        return test_client.post('/api/upload/', data={
            'files': files,
            'dest': dest_folder
        })
    finally:
        for fp in files:
            fp.close()
I think if you drop your tuples (but keep the open() calls), your code might work.
You should just send a data object with your files named however you want:
test_client.post('/api/upload',
                 data={'title': 'upload sample',
                       'file1': (io.BytesIO(b'get something'), 'file1'),
                       'file2': (io.BytesIO(b'forthright'), 'file2')},
                 content_type='multipart/form-data')
Another way of doing this, if you want to explicitly name your file uploads (my use case was two CSVs, but it could be anything) with test_client, is like this:
resp = test_client.post(
    '/data_upload_api',  # flask route
    file_upload_one=[open(FILE_PATH, 'rb')],
    file_upload_two=[open(FILE_PATH_2, 'rb')]
)
Using this syntax, these files would be accessible as:
request.files['file_upload_one'] # etc.
I am currently working on a small web interface which allows different users to upload files, convert the files they have uploaded, and download the converted files. The details of the conversion are not important for my question.
I am currently using flask-uploads to manage the uploaded files, and I am storing them in the file system. Once a user uploads and converts a file, there are all sorts of pretty buttons to delete the file, so that the uploads folder doesn't fill up.
I don't think this is ideal. What I really want is for the files to be deleted right after they are downloaded. I would settle for the files being deleted when the session ends.
I've spent some time trying to figure out how to do this, but I have yet to succeed. It doesn't seem like an uncommon problem, so I figure there must be some solution out there that I am missing. Does anyone have a solution?
There are several ways to do this.
send_file and then immediately delete (Linux only)
Flask has an after_this_request decorator which could work for this use case:
@app.route('/files/<filename>/download')
def download_file(filename):
    file_path = derive_filepath_from_filename(filename)
    file_handle = open(file_path, 'r')

    @after_this_request
    def remove_file(response):
        try:
            os.remove(file_path)
            file_handle.close()
        except Exception as error:
            app.logger.error("Error removing or closing downloaded file handle", error)
        return response

    return send_file(file_handle)
The issue is that this will only work on Linux (which lets the file be read even after deletion if there is still an open file pointer to it). It also won't always work (I've heard reports that sometimes send_file won't wind up making the kernel call before the file is already unlinked by Flask). It doesn't tie up the Python process to send the file though.
Stream file, then delete
Ideally though you'd have the file cleaned up after you know the OS has streamed it to the client. You can do this by streaming the file back through Python by creating a generator that streams the file and then closes it, like is suggested in this answer:
def download_file(filename):
    file_path = derive_filepath_from_filename(filename)
    file_handle = open(file_path, 'r')

    # This *replaces* the `remove_file` + @after_this_request code above
    def stream_and_remove_file():
        yield from file_handle
        file_handle.close()
        os.remove(file_path)

    return current_app.response_class(
        stream_and_remove_file(),
        headers={'Content-Disposition': 'attachment; filename="{}"'.format(filename)}
    )
This approach is nice because it is cross-platform. It isn't a silver bullet however, because it ties up the Python web process until the entire file has been streamed to the client.
Clean up on a timer
Run another process on a timer (using cron, perhaps), or use an in-process scheduler like APScheduler, and clean up files that have been on disk in the temporary location beyond your timeout (e.g. half an hour, one week, thirty days, or after they've been marked "downloaded" in your RDBMS).
This is the most robust way, but requires additional complexity (cron, in-process scheduler, work queue, etc.)
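For instance, a minimal sketch of the in-process scheduler approach using APScheduler; the temporary directory, the thirty-minute cutoff, and the ten-minute interval are assumptions for illustration:

import os
import time
from apscheduler.schedulers.background import BackgroundScheduler

TMP_DIR = '/tmp/generated-files'   # hypothetical temp location
MAX_AGE_SECONDS = 30 * 60          # files older than this get removed

def cleanup_stale_files():
    now = time.time()
    for name in os.listdir(TMP_DIR):
        path = os.path.join(TMP_DIR, name)
        # Delete anything that has been sitting on disk longer than the cutoff
        if os.path.isfile(path) and now - os.path.getmtime(path) > MAX_AGE_SECONDS:
            os.remove(path)

scheduler = BackgroundScheduler()
scheduler.add_job(cleanup_stale_files, 'interval', minutes=10)
scheduler.start()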
You can also store the file's data in memory, delete it, then serve what you have in memory.
For example, if you were serving a PDF:
import io
import os
import io
import os

@app.route('/download')
def download_file():
    file_path = get_path_to_your_file()

    return_data = io.BytesIO()
    with open(file_path, 'rb') as fo:
        return_data.write(fo.read())
    # (after writing, cursor will be at last byte, so move it to start)
    return_data.seek(0)

    os.remove(file_path)

    return send_file(return_data, mimetype='application/pdf',
                     attachment_filename='download_filename.pdf')
(above I'm just assuming it's PDF, but you can get the mimetype programmatically if you need)
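For instance, a small sketch of guessing the mimetype from the filename instead of hard-coding 'application/pdf'; the helper name is my own, not from the answer above:

import mimetypes

def guess_download_mimetype(file_path):
    # Fall back to a generic binary type when the extension isn't recognised
    mimetype, _ = mimetypes.guess_type(file_path)
    return mimetype or 'application/octet-stream'

You would then pass mimetype=guess_download_mimetype(file_path) to send_file.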
Flask has an after_request decorator which could work in this case:
@app.route('/', methods=['POST'])
def upload_file():
    uploaded_file = request.files['file']
    file_path = secure_filename(uploaded_file.filename)
    uploaded_file.save(file_path)

    @app.after_request
    def delete(response):
        os.remove(file_path)
        return response

    return send_file(file_path, as_attachment=True)
Building on @Garrett's comment, the better approach is not to block send_file while removing the file. IMHO, it is better to remove the file in the background; something like the following works better:
import io
import os

from flask import send_file
from multiprocessing import Process

@app.route('/download')
def download_file():
    file_path = get_path_to_your_file()

    return_data = io.BytesIO()
    with open(file_path, 'rb') as fo:
        return_data.write(fo.read())
    return_data.seek(0)

    background_remove(file_path)

    return send_file(return_data, mimetype='application/pdf',
                     attachment_filename='download_filename.pdf')


def background_remove(path):
    # Run the removal in a separate process so the response isn't blocked
    task = Process(target=rm, args=(path,))
    task.start()


def rm(path):
    os.remove(path)
Possible Duplicate:
Serving dynamically generated ZIP archives in Django
(Feel free to point me to any potential duplicates if I have missed them)
I have looked at this snippet:
http://djangosnippets.org/snippets/365/
and this answer:
but I wonder how I can tweak them to suit my need: I want multiple files to be zipped and the archive available as a download via a link (or dynamically generated via a view). I am new to Python and Django so I don't know how to go about it.
Thanks in advance!
I've posted this on the duplicate question which Willy linked to, but since questions with a bounty cannot be closed as a duplicate, might as well copy it here too:
import os
import zipfile
import StringIO

from django.http import HttpResponse


def getfiles(request):
    # Files (local path) to put in the .zip
    # FIXME: Change this (get paths from DB etc)
    filenames = ["/tmp/file1.txt", "/tmp/file2.txt"]

    # Folder name in ZIP archive which contains the above files
    # E.g [thearchive.zip]/somefiles/file2.txt
    # FIXME: Set this to something better
    zip_subdir = "somefiles"
    zip_filename = "%s.zip" % zip_subdir

    # Open StringIO to grab in-memory ZIP contents
    s = StringIO.StringIO()

    # The zip compressor
    zf = zipfile.ZipFile(s, "w")

    for fpath in filenames:
        # Calculate path for file in zip
        fdir, fname = os.path.split(fpath)
        zip_path = os.path.join(zip_subdir, fname)

        # Add file, at correct path
        zf.write(fpath, zip_path)

    # Must close zip for all contents to be written
    zf.close()

    # Grab ZIP file from in-memory, make response with correct MIME-type
    resp = HttpResponse(s.getvalue(), mimetype="application/x-zip-compressed")
    # ..and correct content-disposition
    resp['Content-Disposition'] = 'attachment; filename=%s' % zip_filename

    return resp
So, as I understand it, your problem is not how to generate this file dynamically, but how to create a link for people to download it...
What I suggest is the following:
0) Create a model for your file. If you want to generate it dynamically, don't rely on the FileField alone; keep just the info you need for generating the file:
class ZipStored(models.Model):
    zip = FileField(upload_to="/choose/a/path/")
1) Create and store your zip. This step is important: you create your zip in memory and then wrap it so it can be assigned to the FileField:
def create_my_zip(request, [...]):
    [...]
    # This is an in-memory file
    file_like = StringIO.StringIO()

    # Create your zip, do all your stuff
    zf = zipfile.ZipFile(file_like, mode='w')
    [...]
    # Your zip is saved in this "file"
    zf.close()
    file_like.seek(0)

    # To store it we can use an InMemoryUploadedFile
    inMemory = InMemoryUploadedFile(file_like, None, "my_zip_%s" % filename, 'application/zip', file_like.len, None)
    zip = ZipStored(zip=inMemory)
    # Your zip will be stored!
    zip.save()

    # Notify the user the zip was created or whatever
    [...]
2) Create a URL, for example one that captures a number matching the id; you could also use a SlugField:
url(r'^get_my_zip/(\d+)$', "zippyApp.views.get_zip")
3) Now the view. This view will return the file matching the id passed in the URL; you can also use a slug, sending the text instead of the id, and filter the get() by your SlugField.
def get_zip(request, id):
    myzip = ZipStored.objects.get(pk=id)
    filename = myzip.zip.name.split('/')[-1]
    # You got the zip! Now, return it!
    response = HttpResponse(myzip.zip, content_type='application/zip')
    response['Content-Disposition'] = 'attachment; filename=%s' % filename
    return response
XLRD is installed and tested:
>>> import xlrd
>>> workbook = xlrd.open_workbook('Sample.xls')
When I read the file through an HTML form like the one below, I'm able to access all the values.
xls_file = request.params['xls_file']
print xls_file.filename, xls_file.type
I'm using Pylons module, request comes from: from pylons import request, tmpl_context as c
My questions:
Is xls_file read through request.params an object?
How can I read xls_file and make it work with xlrd?
Update:
The xls_file is uploaded to the web server, but the xlrd library expects a filename instead of an open file object. How can I make the uploaded file work with xlrd? (Thanks to Martijn Pieters; I was unable to formulate the question clearly.)
xlrd does support providing data directly without a file path; just use the file_contents argument:
xlrd.open_workbook(file_contents=fileobj.read())
From the documentation:
file_contents – A string or an mmap.mmap object or some other behave-alike object. If file_contents is supplied, filename will not be used, except (possibly) in messages.
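Applied to the Pylons upload in the question, a minimal sketch might look like this. It assumes the uploaded field is a cgi.FieldStorage-style object with a .file attribute, as the .filename/.type access above suggests:

import xlrd
from pylons import request

def parse_uploaded_xls():
    xls_file = request.params['xls_file']
    # Read the raw bytes of the upload and hand them to xlrd directly
    workbook = xlrd.open_workbook(file_contents=xls_file.file.read())
    sheet = workbook.sheet_by_index(0)
    return sheet.nrows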
What I encountered is not exactly the same as the question, but I think it is similar enough that I can give some hints.
I am using Django REST framework's request instead of the Pylons request.
If I write simple code like the following:
@api_view(['POST'])
@renderer_classes([JSONRenderer])
def upload_files(request):
    file_obj = request.FILES['file']
    from xlrd import open_workbook
    wb = open_workbook(file_contents=file_obj.read())
    result = {"code": "0", "message": "success", "data": {}}
    return Response(status=200, data=result)
Here we can read it using open_workbook(file_contents=file_obj.read()), as mentioned in the previous answers.
But if you write the code in the following way:
from rest_framework.views import APIView
from rest_framework.parsers import MultiPartParser

class FileUploadView(APIView):
    parser_classes = (MultiPartParser,)

    def put(self, request, filename, format=None):
        file_obj = request.FILES.get('file')
        from xlrd import open_workbook
        wb = open_workbook(file_contents=file_obj.read())
        # do some stuff with uploaded file
        return Response(status=204)
you must pay attention to the use of MultiPartParser instead of FileUploadParser; using FileUploadParser will raise a BOF error.
So it seems the behaviour also depends on how you write the API.
For me this code works (Python 3):
xlrd.open_workbook(file_contents=fileobj.content)
You could try something like...
import xlrd

def newopen(fileobject, modes):
    return fileobject

oldopen = __builtins__.open
__builtins__.open = newopen
InputWorkBook = xlrd.open_workbook(fileobject)
__builtins__.open = oldopen
You may have to wrap the fileobject in StringIO if it isn't already a file handle.
How to serve users a dynamically generated ZIP archive in Django?
I'm making a site, where users can choose any combination of available books and download them as ZIP archive. I'm worried that generating such archives for each request would slow my server down to a crawl. I have also heard that Django doesn't currently have a good solution for serving dynamically generated files.
The solution is as follows.
Use the Python zipfile module to create the zip archive, but pass a StringIO object as the file (the ZipFile constructor requires a file-like object). Add the files you want to compress. Then, in your Django application, return the content of the StringIO object in an HttpResponse with the mimetype set to application/x-zip-compressed (or at least application/octet-stream). If you want, you can set the content-disposition header, but this should not really be required.
But beware, creating zip archives on each request is a bad idea and may kill your server (not counting timeouts if the archives are large). A performance-wise approach is to cache the generated output somewhere in the filesystem and regenerate it only if the source files have changed. An even better idea is to prepare the archives in advance (e.g. by a cron job) and have your web server serve them as usual static files.
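A minimal sketch of that caching idea; the cache directory and the decision to key the rebuild on the newest source mtime are my assumptions, not part of the original answer:

import os
import zipfile

CACHE_DIR = '/var/cache/book-archives'  # hypothetical cache location

def get_or_build_archive(book_paths, archive_name):
    archive_path = os.path.join(CACHE_DIR, archive_name)
    newest_source = max(os.path.getmtime(p) for p in book_paths)

    # Rebuild only if the archive is missing or older than any source file
    if not os.path.exists(archive_path) or os.path.getmtime(archive_path) < newest_source:
        with zipfile.ZipFile(archive_path, 'w', zipfile.ZIP_DEFLATED) as zf:
            for path in book_paths:
                zf.write(path, os.path.basename(path))

    return archive_path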
Here's a Django view to do this:
import os
import zipfile
import StringIO

from django.http import HttpResponse


def getfiles(request):
    # Files (local path) to put in the .zip
    # FIXME: Change this (get paths from DB etc)
    filenames = ["/tmp/file1.txt", "/tmp/file2.txt"]

    # Folder name in ZIP archive which contains the above files
    # E.g [thearchive.zip]/somefiles/file2.txt
    # FIXME: Set this to something better
    zip_subdir = "somefiles"
    zip_filename = "%s.zip" % zip_subdir

    # Open StringIO to grab in-memory ZIP contents
    s = StringIO.StringIO()

    # The zip compressor
    zf = zipfile.ZipFile(s, "w")

    for fpath in filenames:
        # Calculate path for file in zip
        fdir, fname = os.path.split(fpath)
        zip_path = os.path.join(zip_subdir, fname)

        # Add file, at correct path
        zf.write(fpath, zip_path)

    # Must close zip for all contents to be written
    zf.close()

    # Grab ZIP file from in-memory, make response with correct MIME-type
    resp = HttpResponse(s.getvalue(), mimetype="application/x-zip-compressed")
    # ..and correct content-disposition
    resp['Content-Disposition'] = 'attachment; filename=%s' % zip_filename

    return resp
Many answers here suggest using a StringIO or BytesIO buffer. However, this is not needed, as HttpResponse is already a file-like object:
response = HttpResponse(content_type='application/zip')
zip_file = zipfile.ZipFile(response, 'w')
for filename in filenames:
    zip_file.write(filename)
response['Content-Disposition'] = 'attachment; filename={}'.format(zipfile_name)
return response
Note that you should not call zip_file.close() as the open "file" is response and we definitely don't want to close it.
I used Django 2.0 and Python 3.6.
import zipfile
import os
from io import BytesIO

def download_zip_file(request):
    filelist = ["path/to/file-11.txt", "path/to/file-22.txt"]

    byte_data = BytesIO()
    zip_file = zipfile.ZipFile(byte_data, "w")

    for file in filelist:
        filename = os.path.basename(os.path.normpath(file))
        zip_file.write(file, filename)

    zip_file.close()

    response = HttpResponse(byte_data.getvalue(), content_type='application/zip')
    response['Content-Disposition'] = 'attachment; filename=files.zip'

    # Print list of files in zip_file
    zip_file.printdir()

    return response
For Python 3 I use io.BytesIO to achieve this, since the StringIO module is no longer available there. Hope it helps.
import io
import zipfile

from django.http import HttpResponse

def my_downloadable_zip(request):
    zip_io = io.BytesIO()
    with zipfile.ZipFile(zip_io, mode='w', compression=zipfile.ZIP_DEFLATED) as backup_zip:
        # You can also iterate over a list of filename locations here
        backup_zip.write('file_name_loc_to_zip')

    response = HttpResponse(zip_io.getvalue(), content_type='application/x-zip-compressed')
    response['Content-Disposition'] = 'attachment; filename=%s' % 'your_zipfilename' + ".zip"
    response['Content-Length'] = zip_io.tell()
    return response
Django doesn't directly handle the generation of dynamic content (specifically Zip files). That work would be done by Python's standard library. You can take a look at how to dynamically create a Zip file in Python here.
If you're worried about it slowing down your server you can cache the requests if you expect to have many of the same requests. You can use Django's cache framework to help you with that.
Overall, zipping files can be CPU intensive but Django shouldn't be any slower than another Python web framework.
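As a minimal sketch of that caching idea with Django's cache framework; the cache key scheme, the one-hour timeout, and the build_zip_bytes helper are assumptions of mine:

from django.core.cache import cache
from django.http import HttpResponse

def download_books(request, book_ids):
    cache_key = 'book-zip-' + '-'.join(sorted(book_ids))
    zip_bytes = cache.get(cache_key)

    if zip_bytes is None:
        zip_bytes = build_zip_bytes(book_ids)          # hypothetical helper that returns the archive as bytes
        cache.set(cache_key, zip_bytes, timeout=3600)  # keep the archive cached for an hour

    response = HttpResponse(zip_bytes, content_type='application/zip')
    response['Content-Disposition'] = 'attachment; filename=books.zip'
    return response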
Shameless plug: you can use django-zipview for the same purpose.
After a pip install django-zipview:
from zipview.views import BaseZipView

from reviews import Review


class CommentsArchiveView(BaseZipView):
    """Download at once all comments for a review."""

    def get_files(self):
        document_key = self.kwargs.get('document_key')
        reviews = Review.objects \
            .filter(document__document_key=document_key) \
            .exclude(comments__isnull=True)

        return [review.comments.file for review in reviews if review.comments.name]
I suggest using a separate model for storing those temp zip files. You can create the zip on the fly, save it to the model's FileField, and finally send the URL to the user (a minimal sketch follows the list of advantages below).
Advantages:
Serving static zip files with django media mechanism (like usual uploads).
Ability to clean up stale zip files by running a regular cron script (which can use a date field from the zip file model).
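A minimal sketch of this approach; the model fields, the build_zip_bytes helper, and the one-day cutoff are my own assumptions, not a prescribed implementation:

from datetime import timedelta

from django.core.files.base import ContentFile
from django.db import models
from django.utils import timezone


class TempZip(models.Model):
    file = models.FileField(upload_to='temp_zips/')
    created = models.DateTimeField(auto_now_add=True)


def create_zip_for_user(book_ids):
    # build_zip_bytes is a hypothetical helper returning the archive as bytes
    temp_zip = TempZip()
    temp_zip.file.save('books.zip', ContentFile(build_zip_bytes(book_ids)))
    temp_zip.save()
    return temp_zip.file.url  # hand this URL to the user


def cleanup_stale_zips():
    # Run this from a cron job or management command
    cutoff = timezone.now() - timedelta(days=1)
    for temp_zip in TempZip.objects.filter(created__lt=cutoff):
        temp_zip.file.delete(save=False)
        temp_zip.delete()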
A lot of contributions were made to the topic already, but since I came across this thread when I first researched this problem, I thought I'd add my own two cents.
Integrating your own zip creation is probably not as robust and optimized as web-server-level solutions. At the same time, we're using Nginx and it doesn't come with a module out of the box.
You can, however, compile Nginx with the mod_zip module (see here for a docker image with the latest stable Nginx version, and an alpine base making it smaller than the default Nginx image). This adds the zip stream capabilities.
Then Django just needs to serve a list of files to zip, all done!
It is a little more reusable to use a library for this file list response, and django-zip-stream offers just that.
Sadly it never really worked for me, so I started a fork with fixes and improvements.
You can use it in a few lines:
def download_view(request, name=""):
    from django_zip_stream.responses import FolderZipResponse

    path = settings.STATIC_ROOT
    path = os.path.join(path, name)

    return FolderZipResponse(path)
You need a way to have Nginx serve all files that you want to archive, but that's it.
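For reference, if you wanted to build the mod_zip file list yourself instead of using the library, a rough sketch of the upstream Django response might look like this. The /protected/ internal location and the flat archive layout are assumptions; Nginx must be configured to serve those paths:

import os

from django.http import HttpResponse

def download_archive(request, file_paths):
    # Each line tells mod_zip: "<crc32> <size> <nginx location> <name in archive>"
    # "-" skips the CRC so nothing has to be computed up front.
    lines = []
    for path in file_paths:
        size = os.path.getsize(path)
        name = os.path.basename(path)
        lines.append('- {} /protected/{} {}'.format(size, name, name))

    response = HttpResponse('\n'.join(lines), content_type='text/plain')
    # This header is what triggers mod_zip to stream the listed files as a zip
    response['X-Archive-Files'] = 'zip'
    response['Content-Disposition'] = 'attachment; filename=archive.zip'
    return response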
Can't you just write a link to a "zip server" or whatnot? Why does the zip archive itself need to be served from Django? A 90's era CGI script to generate a zip and spit it to stdout is really all that's required here, at least as far as I can see.