Is it possible to create PDF/XLS documents as temporary files? I need them so I can send them with Flask afterwards. To create the PDF and XLS files I use the reportlab and xlsxwriter packages, respectively. When I save a document using their methods, I get the "Python temporary file permission denied" error. When I close the file using the tempfile methods instead, the files end up corrupted. Is there any way to overcome this, or any other suitable solution?
EDIT:
Some code snippets:
import tempfile
import xlsxwriter
from flask import Flask, after_this_request, send_file

app = Flask(__name__)

@app.route('/some_url', methods=['POST'])
def create_doc_function():
    @after_this_request
    def cleanup(response):
        temp.close()
        return response

    temp = tempfile.TemporaryFile()
    book = xlsxwriter.Workbook(temp.name)
    # some actions here ...
    book.close()  # raises "Python temporary file permission denied" error.
                  # If missed, Excel book is gonna be corrupted,
                  # i.e. blank, which makes sense
    return send_file(temp, as_attachment=True,
                     attachment_filename='my_document_name.xls')
Similar story with pdf files.
Use tempfile.mkstemp() which will create a standard temp file on disk which will persist until removed:
import tempfile
import os
handle, filepath = tempfile.mkstemp()
f = os.fdopen(handle) # convert raw handle to file object
...
EDIT
tempfile.TemporaryFile() will be destroyed as soon as it's closed, which is why your code above is failing.
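To make the mkstemp() lifecycle concrete, here is a minimal sketch using only the standard library. The helper name write_temp_document and the placeholder bytes are assumptions for illustration; in the real view the bytes would come from xlsxwriter or reportlab writing to the returned path.

```python
import os
import tempfile


def write_temp_document(data: bytes, suffix: str = ".xlsx") -> str:
    """Write data to a new temp file and return its path (hypothetical helper)."""
    handle, filepath = tempfile.mkstemp(suffix=suffix)
    # os.fdopen wraps the raw descriptor; closing the file object closes the handle.
    with os.fdopen(handle, "wb") as f:
        f.write(data)
    return filepath


path = write_temp_document(b"placeholder bytes, not a real workbook")
try:
    # The file is closed but still on disk, so another library can reopen it by name.
    with open(path, "rb") as f:
        content = f.read()
finally:
    os.remove(path)  # mkstemp never deletes the file for you
```

The point of the design is that the file has a real name and no open handle while another library works on it, so cleanup becomes an explicit os.remove() call.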
You can use NamedTemporaryFile and have it deleted by a context manager (or the atexit module). It may do the dirty job for you.
Example 1:
import os
from tempfile import NamedTemporaryFile

# define class, because everyone loves objects
class FileHandler():

    def __init__(self):
        '''
        Let's create temporary file in constructor
        Notice that there is no param (delete=True is not necessary)
        '''
        self.file = NamedTemporaryFile()

    # write something funny into file...or do whatever you need
    def write_into(self, btext):
        self.file.write(btext)

    def __enter__(self):
        '''
        Define simple but mandatory __enter__ function - context manager will require it.
        Just return the instance, nothing more is requested.
        '''
        return self

    def __exit__(self, exc_type, exc_val, exc_tb):
        '''
        Also define mandatory __exit__ method which is called at the end.
        NamedTemporaryFile is deleted as soon as is closed (function checks it before and after close())
        '''
        print('Calling __exit__:')
        print(f'File exists = {os.path.exists(self.file.name)}')
        self.file.close()
        print(f'File exists = {os.path.exists(self.file.name)}')

# use context manager 'with' to create new instance and do something
with FileHandler() as fh:
    fh.write_into(b'Hi happy developer!')

print(f'\nIn this point {fh.file.name} does not exist (exists = {os.path.exists(fh.file.name)})')
Output:
Calling __exit__:
File exists = True
File exists = False
In this point D:\users\fll2cj\AppData\Local\Temp\tmpyv37sp58 does not exist (exists = False)
Or you can use the atexit module, which calls a registered function when the program (cmd) exits.
Example 2:
import os, atexit
from tempfile import NamedTemporaryFile

class FileHandler():

    def __init__(self):
        self.file = NamedTemporaryFile()
        # register function called when quit
        atexit.register(self._cleanup)

    def write_into(self, btext):
        self.file.write(btext)

    def _cleanup(self):
        # because self.file has been created without delete=False, closing the file causes its deletion
        self.file.close()

# create new instance and do whatever you need
fh = FileHandler()
fh.write_into(b'Hi happy developer!')
# now the file still exists, but when program quits, _cleanup() is called and file closed and automatically deleted.
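NamedTemporaryFile is itself a context manager, so for simple cases the wrapper class may not be needed. A minimal sketch, assuming the default delete=True; the variable names are only for illustration:

```python
import os
from tempfile import NamedTemporaryFile

with NamedTemporaryFile() as tmp:
    tmp.write(b'Hi happy developer!')
    tmp.flush()                      # push buffered bytes to disk
    name = tmp.name
    existed_inside = os.path.exists(name)

# Leaving the block closed the file, and delete=True (the default) removed it.
exists_after = os.path.exists(name)
```

A wrapper class like the ones above is still useful when the file has to outlive a single block, for example when it is created in one method and cleaned up in another.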
I have a remote storage project: when a user requests a file, the Django server retrieves it, stores it locally as a temporary file (for some processing), and then serves it to the user with mod_xsendfile. I want the temp file to be deleted after it has been served to the user.
As I understand the documentation, setting NamedTemporaryFile's delete argument to False should lead to deletion of the file once all references to it are gone. But after the user is served the temp file, it doesn't get deleted. If I set delete=True, the download fails with "The requested URL /ServeSegment/Test.jpg/ was not found on this server."
Here is a view to list the user files:
def file_profile(request):
    obj = MainFile.objects.filter(owner=request.user)
    context = {'title': 'welcome',
               'obj': obj
               }
    return render(request, 'ServeSegments.html', context=context)
This is the view which retrieves, stores temporarily and serve the requested file:
def ServeSegment(request, segmentID):
    if request.method == 'GET':
        url = 'http://192.168.43.7:8000/foo/'+str(segmentID)
        r = requests.get(url, stream=True)
        if r.status_code == 200:
            with tempfile.NamedTemporaryFile(dir=
                    '/tmp/Files', mode='w+b') as f:
                for chunk in r.iter_content(1024):
                    f.write(chunk)
            response = HttpResponse()
            response['Content-Disposition'] = 'attachment; segmentID={0}'.format(f.name)
            response['X-Sendfile'] = "{0}".format(f.name)
            return response
        else:
            return HttpResponse(str(segmentID))
I guess it would work as I want if I could return the response inside the with statement, after the last chunk has been written, but I found no non-hackish way to determine whether we are in the last loop iteration.
What should I do to serve the temp file and have it deleted right after?
Adding a generalized answer (based on Cyrbil's) that avoids using signals by doing the cleanup in a finally block.
While the directory entry is deleted by os.remove on the way out, the underlying file remains open until FileResponse closes it. You can check this by inspecting response._closable_objects[0].fileno() in the finally block with pdb, and checking open files with lsof in another terminal while it's paused.
It looks like it's important that you're on a Unix system if you're going to use this solution (see os.remove docs)
https://docs.python.org/3/library/os.html#os.remove
import os
import tempfile
from django.http import FileResponse

def my_view(request):
    try:
        tmp = tempfile.NamedTemporaryFile(delete=False)
        with open(tmp.name, 'w') as fi:
            # write to your tempfile, mode may vary
            pass
        response = FileResponse(open(tmp.name, 'rb'))
        return response
    finally:
        os.remove(tmp.name)
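The Unix behaviour this answer relies on can be checked without Django. The sketch below is an illustration under the assumption of a POSIX system (on Windows, os.remove on an open file typically fails); the plain open() call stands in for the handle that FileResponse would hold.

```python
import os
import tempfile

tmp = tempfile.NamedTemporaryFile(delete=False)
tmp.write(b"payload")
tmp.close()

reader = open(tmp.name, "rb")   # stands in for the handle FileResponse would hold
os.remove(tmp.name)             # directory entry is gone from here on

name_still_listed = os.path.exists(tmp.name)
data = reader.read()            # on POSIX the open handle still reaches the data
reader.close()                  # last reference dropped; the OS frees the storage
```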
By default, a file created by tempfile.NamedTemporaryFile is deleted as soon as its file handle is closed; in your case, that happens when you exit the with statement. The delete=False argument prevents this behavior and leaves the deletion up to the application. You can delete the file after it has been sent by registering a signal handler that unlinks the file once the response is sent.
Your example does nothing with the file, so you might want to stream the content directly with StreamingHttpResponse or FileResponse. But since you said you "stores the file locally (for some processing)", I would suggest considering doing the processing without creating any temporary file and working only with streams.
Disposable files
The solution to the question is to not use with for the NamedTemporaryFile, and to handle exceptions yourself. Currently your file is deleted before it is read. At the end, return:
f.seek(0)
return FileResponse(f, as_attachment=True, filename=f.name)
The temporary file will be closed when the read is complete and therefore deleted.
Non-disposable files
For those who stumble across this and do not have an automatically disposable file handle:
From the other answers, signals seemed to be a reasonable solution; however, passing data to the handler required altering protected members, and I was unsure how well supported that would be in the future. I also found that whp's solution did not work in the current version of Django. The most future-proof version I could come up with was monkey patching the file object so that the file is deleted on close. Django closes the file handles at the end of sending the file, and I can't see that changing.
import os
import tempfile
from django.http import FileResponse

def my_view(request):
    tmp = tempfile.NamedTemporaryFile(delete=False)
    try:
        # write file tmp (remember to close if re-opening)
        # after write close the file (if not closed)
        stream_file = open(tmp.name, 'rb')

        # monkey patch the file
        original_close = stream_file.close

        def new_close():
            original_close()
            os.remove(tmp.name)

        stream_file.close = new_close

        # return the result
        return FileResponse(stream_file, as_attachment=True, filename='out.txt')
    except Exception:
        os.remove(tmp.name)
        raise
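One alternative to patching close on the instance is a small subclass that deletes its own file when closed. This is only a sketch with the standard library: SelfDeletingFile is a hypothetical name, and whether FileResponse handles such an object exactly like the patched one has not been tested here.

```python
import io
import os
import tempfile


class SelfDeletingFile(io.FileIO):
    """Hypothetical helper: a binary file that removes itself from disk on close."""

    def close(self):
        super().close()
        try:
            os.remove(self.name)
        except FileNotFoundError:
            pass  # close() can run more than once; the file is already gone


tmp = tempfile.NamedTemporaryFile(delete=False)
tmp.write(b"report body")
tmp.close()

stream_file = SelfDeletingFile(tmp.name, "rb")
body = stream_file.read()
stream_file.close()
```

Keeping the deletion inside close() means that whoever closes the stream last also triggers the cleanup, which is the same idea as the monkey patch.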
Is there an option I can pass to open() that will cause an IOError when trying to write to a nonexistent file? I am using Python to read and write block devices via symlinks, and if the link is missing I want to raise an error rather than create a regular file. I know I could add a check to see if the file exists and manually raise the error, but would prefer to use something built-in if it exists.
Current code looks like this:
device = open(device_path, 'wb', 0)
device.write(data)
device.close()
Yes.
open(path, 'r+b')
Specifying "r" means the file must exist and you can read it.
Specifying "+" means you can also write; the position starts at the beginning of the file, so writes overwrite the existing content.
https://docs.python.org/3/library/functions.html?#open
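A minimal sketch of that behaviour, using a throwaway directory instead of a real block device. The function name write_existing and the path no_such_device are assumptions for illustration only.

```python
import os
import tempfile


def write_existing(path, data):
    """Write data at the start of an existing file; never create a new one."""
    # 'r+b' opens for reading and writing and raises FileNotFoundError
    # (a subclass of OSError, which IOError aliases) when the path does not exist.
    with open(path, 'r+b', 0) as device:
        device.write(data)


workdir = tempfile.mkdtemp()
missing = os.path.join(workdir, 'no_such_device')
try:
    write_existing(missing, b'data')
    raised = False
except FileNotFoundError:
    raised = True
created = os.path.exists(missing)   # stays False: nothing was created
os.rmdir(workdir)
```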
Use os.path.islink() or os.path.isfile() to check if the file exists.
Doing the check each time is a nuisance, but you can always wrap open():
import os

def open_if_exists(*args, **kwargs):
    if not os.path.exists(args[0]):
        raise IOError('{:s} does not exist.'.format(args[0]))
    f = open(*args, **kwargs)
    return f

f = open_if_exists(r'file_does_not_exist.txt', 'w+')
This is just quick and dirty, so it doesn't allow for usage as: with open_if_exists(...).
Update
The lack of a context manager was bothering me, so here goes:
import os
from contextlib import contextmanager

@contextmanager
def open_if_exists(*args, **kwargs):
    if not os.path.exists(args[0]):
        raise IOError('{:s} does not exist.'.format(args[0]))
    f = open(*args, **kwargs)
    try:
        yield f
    finally:
        f.close()

with open_if_exists(r'file_does_not_exist.txt', 'w+') as f:
    print('foo', file=f)
I am afraid you can't perform the check of file existence and raise error using the open() function.
Below is the signature of open() in Python 2, where name is the file name, mode is the access mode, and buffering indicates whether buffering is performed while accessing the file.
open(name[, mode[, buffering]])
Instead, you can check if the file exists or not.
>>> import os
>>> os.path.isfile(file_name)
This returns True if the path exists and is a regular file, and False otherwise, so use it to test for a file specifically.
To test for the existence of either a file or a directory, you can use:
>>> os.path.exists(file_path)
I have the following view code that attempts to "stream" a zipfile to the client for download:
import os
import zipfile
import tempfile
from pyramid.response import FileIter

def zipper(request):
    _temp_path = request.registry.settings['_temp']
    tmpfile = tempfile.NamedTemporaryFile('w', dir=_temp_path, delete=True)

    tmpfile_path = tmpfile.name

    ## creating zipfile and adding files
    z = zipfile.ZipFile(tmpfile_path, "w")
    z.write('somefile1.txt')
    z.write('somefile2.txt')
    z.close()

    ## renaming the zipfile
    new_zip_path = _temp_path + '/somefilegroup.zip'
    os.rename(tmpfile_path, new_zip_path)

    ## re-opening the zipfile with new name
    z = zipfile.ZipFile(new_zip_path, 'r')
    response = FileIter(z.fp)

    return response
However, this is the Response I get in the browser:
Could not convert return value of the view callable function newsite.static.zipper into a response object. The value returned was .
I suppose I am not using FileIter correctly.
UPDATE:
Since updating with Michael Merickel's suggestions, the FileIter function is working correctly. However, still lingering is a MIME type error that appears on the client (browser):
Resource interpreted as Document but transferred with MIME type application/zip: "http://newsite.local:6543/zipper?data=%7B%22ids%22%3A%5B6%2C7%5D%7D"
To better illustrate the issue, I have included a tiny .py and .pt file on Github: https://github.com/thapar/zipper-fix
FileIter is not a response object, just like your error message says. It is an iterable that can be used for the response body, that's it. Also the ZipFile can accept a file object, which is more useful here than a file path. Let's try writing into the tmpfile, then rewinding that file pointer back to the start, and using it to write out without doing any fancy renaming.
import os
import zipfile
import tempfile
from pyramid.response import FileIter

def zipper(request):
    _temp_path = request.registry.settings['_temp']
    fp = tempfile.NamedTemporaryFile('w+b', dir=_temp_path, delete=True)

    ## creating zipfile and adding files
    z = zipfile.ZipFile(fp, "w")
    z.write('somefile1.txt')
    z.write('somefile2.txt')
    z.close()

    # rewind fp back to start of the file
    fp.seek(0)

    response = request.response
    response.content_type = 'application/zip'
    response.app_iter = FileIter(fp)
    return response
I changed the mode on NamedTemporaryFile to 'w+b' as per the docs to allow the file to be written to and read from.
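The write-then-rewind step can be tried outside Pyramid. The sketch below is an illustration only: build_zip is a hypothetical helper, it uses tempfile.TemporaryFile rather than a named file in a configured directory, and it uses writestr() so that no input files are needed on disk.

```python
import tempfile
import zipfile


def build_zip(members):
    """Return a rewound temp file holding a zip of {name: bytes} (hypothetical helper)."""
    fp = tempfile.TemporaryFile()          # opened in 'w+b' mode by default
    with zipfile.ZipFile(fp, 'w') as z:    # closing z writes the central directory
        for name, data in members.items():
            z.writestr(name, data)
    fp.seek(0)                             # rewind so the reader starts at byte 0
    return fp


fp = build_zip({'somefile1.txt': b'one', 'somefile2.txt': b'two'})
magic = fp.read(4)                         # local file header signature
fp.seek(0)
with zipfile.ZipFile(fp) as z:
    names = z.namelist()
fp.close()
```

Closing the ZipFile does not close a file object that was passed in, which is what lets the same handle be rewound and handed on to the response.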
The current Pyramid version has two convenience classes for this use case: FileResponse and FileIter. The snippet below serves a static file. I ran this code, and the downloaded file is named "download", like the view name. To change the file name and more, set the Content-Disposition header or have a look at the arguments of pyramid.response.Response.
from pyramid.response import FileResponse
from pyramid.view import view_config

@view_config(name="download")
def zipper(request):
    path = 'path_to_file'
    return FileResponse(path, request)  # passing request is required
docs:
http://docs.pylonsproject.org/projects/pyramid/en/latest/api/response.html#
hint: extract the Zip logic from the view if possible
I am wondering whether there is a way to upload a zip file to django web server and put the zip's files into django database WITHOUT accessing the actual file system in the process (e.g. extracting the files in the zip into a tmp dir and then load them)
Django provides a function to convert python File to Django File, so if there is a way to convert ZipExtFile to python File, it should be fine.
thanks for help!
Django model:
from django.db import models

class Foo(models.Model):
    file = models.FileField(upload_to='somewhere')
Usage:
from zipfile import ZipFile
from django.core.exceptions import ValidationError
from django.core.files import File
from io import BytesIO

z = ZipFile('zipFile')
istream = z.open('subfile')
ostream = BytesIO(istream.read())
tmp = Foo(file=File(ostream))
try:
    tmp.full_clean()
except ValidationError as e:
    print(e)
Output:
{'file': [u'This field cannot be blank.']}
[SOLUTION] Solution using an ugly hack:
As correctly pointed out by Don Quest, file-like classes such as StringIO or BytesIO should represent the data as a virtual file. However, the constructor of Django's File only accepts the built-in file type and nothing else, although the file-like classes would have done the job as well. The hack is to set the attributes of the Django File manually:
buf = bytearray(OPENED_ZIP_OBJECT.read(FILE_NAME))
tmp_file = BytesIO(buf)
dummy_file = File(tmp_file) # this line actually fails
dummy_file.name = SOME_RANDOM_NAME
dummy_file.size = len(buf)
dummy_file.file = tmp_file
# dummy file is now valid
Please keep commenting if you have a better solution (except for custom storage)
There's an easier way to do this:
import zipfile
from django.core.files.base import ContentFile

uploaded_zip = zipfile.ZipFile(uploaded_file, 'r')  # ZipFile
for filename in uploaded_zip.namelist():
    with uploaded_zip.open(filename) as f:  # ZipExtFile
        my_django_file = ContentFile(f.read())
Using this, you can convert a file that was uploaded to memory directly to a django file. For a more complete example, let's say you wanted to upload a series of image files inside of a zip to the file system:
# some_app/models.py
class Photo(models.Model):
    image = models.ImageField(upload_to='some/upload/path')
    ...

# Upload code
from some_app.models import Photo

for filename in uploaded_zip.namelist():
    with uploaded_zip.open(filename) as f:  # ZipExtFile
        new_photo = Photo()
        new_photo.image.save(filename, ContentFile(f.read()), save=True)
Without knowing too much about Django, I can tell you to take a look at the "io" package.
You could do something like:
from zipfile import ZipFile
from io import BytesIO

zname, zipextfile = 'zipcontainer.zip', 'file_in_archive'
istream = ZipFile(zname).open(zipextfile)
ostream = BytesIO(istream.read())  # read() returns bytes, so use BytesIO rather than StringIO
And then do whatever you would like to do with your "virtual" ostream Stream/File.
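As a self-contained illustration of the same idea, the sketch below builds a small archive in memory and then reads every member into its own buffer, so nothing touches the file system. The member names and contents are made up for the example.

```python
from io import BytesIO
from zipfile import ZipFile

# Build a small archive in memory so the sketch needs no file on disk.
archive_bytes = BytesIO()
with ZipFile(archive_bytes, 'w') as z:
    z.writestr('a.txt', b'alpha')
    z.writestr('b.txt', b'beta')
archive_bytes.seek(0)

# Read every member into its own in-memory, seekable buffer.
members = {}
with ZipFile(archive_bytes) as z:
    for name in z.namelist():
        with z.open(name) as istream:      # ZipExtFile
            members[name] = BytesIO(istream.read())
```

Each value in members is a seekable file-like object, which a ZipExtFile read directly from the archive may not be.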
I've used the following Django file class to avoid the need to read the ZipExtFile into another data structure (StringIO or BytesIO), while properly implementing what Django needs in order to save the file directly.
from django.core.files.base import File
from django.core.files.storage import DefaultStorage

class DjangoZipExtFile(File):
    def __init__(self, zipextfile, zipinfo):
        self.file = zipextfile
        self.zipinfo = zipinfo
        self.mode = 'r'
        self.name = zipinfo.filename
        self._size = zipinfo.file_size

    def seek(self, position):
        if position != 0:
            # this will raise an unsupported operation
            return self.file.seek(position)
        # TODO if we have already done a read, reopen file

# archive is an already opened ZipFile, path is the name of a member in it
zipextfile = archive.open(path, 'r')
zipinfo = archive.getinfo(path)
djangofile = DjangoZipExtFile(zipextfile, zipinfo)
storage = DefaultStorage()
result = storage.save(djangofile.name, djangofile)
Based on the with statement
The context manager’s __exit__() is loaded for later use.
The context manager’s __enter__() method is invoked.
I have seen with used together with zipfile.
Question:
I have checked the source code of zipfile located here:
/usr/lib/python2.6/zipfile.py
Where are the __enter__ and __exit__ functions defined?
Thank you
zipfile.ZipFile is not a context manager in 2.6; this was added in 2.7.
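On a Python version where ZipFile lacks __enter__ and __exit__, one option from the standard library is contextlib.closing, which wraps any object that has a close() method. A minimal sketch; the archive path and member name are made up for the example:

```python
import os
import tempfile
import zipfile
from contextlib import closing

workdir = tempfile.mkdtemp()
zip_path = os.path.join(workdir, 'new.zip')

# closing() calls tempzip.close() on exit, even if the body raises.
with closing(zipfile.ZipFile(zip_path, 'w')) as tempzip:
    tempzip.writestr('hello.txt', 'hi')

with closing(zipfile.ZipFile(zip_path, 'r')) as z:
    names = z.namelist()
```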
I've added this as another answer because it is generally not an answer to initial question. However, it can help to fix your problem.
import zipfile

class MyZipFile(zipfile.ZipFile):  # Create class based on zipfile.ZipFile
    def __init__(self, file, mode='r'):  # Initial part of our module
        zipfile.ZipFile.__init__(self, file, mode)  # Create ZipFile object

    def __enter__(self):  # On entering...
        return self  # Return object created in __init__ part

    def __exit__(self, exc_type, exc_val, exc_tb):  # On exiting...
        self.close()  # Use close method of zipfile.ZipFile
Usage:
with MyZipFile('new.zip', 'w') as tempzip:  # Use context manager of MyZipFile
    tempzip.write('sbdtools.py')  # Write file to our archive
If you type
help(MyZipFile)
you can see all methods of the original zipfile.ZipFile and your own methods: __init__, __enter__ and __exit__. You can add other functions of your own if you want.
Good luck!
Example of creating a class using object class:
import zipfile

class ZipExtractor(object):  # Create class that can only extract zip files
    def __init__(self, path):  # Initial part
        self.Path = path  # To make path available to all class
        self.ZipObject = None  # Stays None if the archive cannot be opened
        try:
            with open(self.Path, 'rb') as temp:  # To check whether file exists
                self.Header = temp.read(4)  # Read first 4 bytes
        except IOError:
            print('File doesn\'t exist')  # Catch error and print it
        else:  # If file can be opened
            if self.Header != b'\x50\x4B\x03\x04':
                print('Your file is not a zip archive!')
            else:
                self.ZipObject = zipfile.ZipFile(self.Path, 'r')

    def __enter__(self):  # On entering...
        return self  # Return object created in __init__ part

    def __exit__(self, exc_type, exc_val, exc_tb):  # On exiting...
        self.close()  # Use close method of our class

    def SuperExtract(self, member=None, path=None):
        '''Used to extract files from zip archive. If arg 'member'
        was not set, extract all files. If path was set, extract file(s)
        to selected folder.'''
        print('Extracting ZIP archive %s' % self.Path)  # Print path of zip
        print('Archive has header %s' % self.Header)  # Print header of zip
        if member is None:
            self.ZipObject.extractall(path)  # Extract all if member was not set
        else:
            self.ZipObject.extract(member, path)  # Else extract selected file

    def close(self):  # To close our file
        if self.ZipObject is not None:
            self.ZipObject.close()
Usage:
with ZipExtractor('/path/to/zip') as zfile:
    zfile.SuperExtract('file')  # Extract file to current dir
    zfile.SuperExtract(None, path='/your/folder')  # Extract all to selected dir

# another way
zfile = ZipExtractor('/path/to/zip')
zfile.SuperExtract('file')
zfile.close()  # Don't forget that line to clear memory
If you run 'help(ZipExtractor)', you will see five methods:
__init__, __enter__, __exit__, close, SuperExtract
I hope I've helped you. I didn't test it, so you might have to improve it.
cat-plus-plus is right. But if you want, you can write your own class to add "missed" features. All you need to do is to add two functions in your class (which is based on zipfile):
def __enter__(self):
    return self

def __exit__(self, exc_type, exc_val, exc_tb):
    self.close()
That should be enough, AFAIR.