I am wondering whether there is a way to upload a zip file to django web server and put the zip's files into django database WITHOUT accessing the actual file system in the process (e.g. extracting the files in the zip into a tmp dir and then load them)
Django provides a function to convert python File to Django File, so if there is a way to convert ZipExtFile to python File, it should be fine.
thanks for help!
Django model:
from django.db import models
class Foo:
file = models.FileField(upload_to='somewhere')
Usage:
from zipfile import ZipFile
from django.core.exceptions import ValidationError
from django.core.files import File
from io import BytesIO
z = ZipFile('zipFile')
istream = z.open('subfile')
ostream = BytesIO(istream.read())
tmp = Foo(file=File(ostream))
try:
tmp.full_clean()
except Validation, e:
print e
Output:
{'file': [u'This field cannot be blank.']}
[SOLUTION] Solution using an ugly hack:
As correctly pointed out by Don Quest, file-like classes such as StringIO or BytesIO should represent the data as a virtual file. However, Django File's constructor only accepts the build-in file type and nothing else, although the file-like classes would have done the job as well. The hack is to set the variables in Django::File manually:
buf = bytesarray(OPENED_ZIP_OBJECT.read(FILE_NAME))
tmp_file = BytesIO(buf)
dummy_file = File(tmp_file) # this line actually fails
dummy_file.name = SOME_RANDOM_NAME
dummy_file.size = len(buf)
dummy_file.file = tmp_file
# dummy file is now valid
Please keep commenting if you have a better solution (except for custom storage)
There's an easier way to do this:
from django.core.files.base import ContentFile
uploaded_zip = zipfile.ZipFile(uploaded_file, 'r') # ZipFile
for filename in uploaded_zip.namelist():
with uploaded_zip.open(filename) as f: # ZipExtFile
my_django_file = ContentFile(f.read())
Using this, you can convert a file that was uploaded to memory directly to a django file. For a more complete example, let's say you wanted to upload a series of image files inside of a zip to the file system:
# some_app/models.py
class Photo(models.Model):
image = models.ImageField(upload_to='some/upload/path')
...
# Upload code
from some_app.models import Photo
for filename in uploaded_zip.namelist():
with uploaded_zip.open(filename) as f: # ZipExtFile
new_photo = Photo()
new_photo.image.save(filename, ContentFile(f.read(), save=True)
Without knowing to much about Django, i can tell you to take a look at the "io" package.
You could do something like:
from zipfile import ZipFile
from io import StringIO
zname,zipextfile = 'zipcontainer.zip', 'file_in_archive'
istream = ZipFile(zname).open(zipextfile)
ostream = StringIO(istream.read())
And then do whatever you would like to do with your "virtual" ostream Stream/File.
I've used the following django file class to avoid the need to read ZipExtFile into a another datastructure (StingIO or BytesIO) while properly impelementing what Django needs in order to save the file directly.
from django.core.files.base import File
class DjangoZipExtFile(File):
def __init__(self, zipextfile, zipinfo):
self.file = zipextfile
self.zipinfo = zipinfo
self.mode = 'r'
self.name = zipinfo.filename
self._size = zipinfo.file_size
def seek(self, position):
if position != 0:
#this will raise an unsupported operation
return self.file.seek(position)
#TODO if we have already done a read, reopen file
zipextfile = archive.open(path, 'r')
zipinfo = archive.getinfo(path)
djangofile = DjangoZipExtFile(zipextfile, zipinfo)
storage = DefaultStorage()
result = storage.save(djangofile.name, djangofile)
Related
I have service in my Django project's app, that upload images, and I need to convert all images to webp to optimize further work with these files on the frontend side.
Draft of _convert_to_webp method:
# imports
from pathlib import Path
from django.core.files import temp as tempfile
from django.core.files.uploadedfile import InMemoryUploadedFile
from PIL import Image
# some service class
...
def _convert_to_webp(self, f_object: InMemoryUploadedFile):
new_file_name = str(Path(f_object._name).with_suffix('.webp'))
temp_file = tempfile.NamedTemporaryFile(suffix='.temp.webp')
# FIXME: on other OS may cause FileNotFoundError
with open(temp_file 'wb') as f:
for line in f_object.file.readlines():
... # will it works good?
new_file = ...
new_f_object = InMemoryUploadedFile(
new_file,
f_object.field_name,
new_file_name,
f_object.content_type,
f_object.size,
f_object.charset,
f_object.content_type_extra
)
return new_file_name, new_f_object
...
f_object is InMemoryUploadedFile instance from POST request body (Django automatically create it).
My idea is to create a temporary file, write data from f_object.file.readlines() to it, open this file with PIL.Image.open and save with format="webp". Is this idea a good one or there is another way to make file converting?
I found a pretty clean way to do this using the django-resized package.
After pip installing, I just needed to swap out the imageField for a ResizedImageField
img = ResizedImageField(force_format="WEBP", quality=75, upload_to="post_imgs/")
All image uploads are automatically converted to .webp!
The solution was pretty simple. PIL.Image can be opened using file instance, so I just opened it using f_object.file and then saved it in BytesIO instance with optimization and compression.
Correctly working code:
# imports
from pathlib import Path
from django.core.files.uploadedfile import InMemoryUploadedFile
from PIL import Image
# some service class
...
def _convert_to_webp(self, f_object: InMemoryUploadedFile):
suffix = Path(f_object._name).suffix
if suffix == ".webp":
return f_object._name, f_object
new_file_name = str(Path(f_object._name).with_suffix('.webp'))
image = Image.open(f_object.file)
thumb_io = io.BytesIO()
image.save(thumb_io, 'webp', optimize=True, quality=95)
new_f_object = InMemoryUploadedFile(
thumb_io,
f_object.field_name,
new_file_name,
f_object.content_type,
f_object.size,
f_object.charset,
f_object.content_type_extra
)
return new_file_name, new_f_object
95% was chosen as balanced parameter. There was very bad quality with quality=80 or quality=90.
In my django app, I have a management command which I want to use to process some data and create a model instance as well as file in filesystem and save its path to above instance's file field.
My management command:
import datetime
import timy
from django.core.management.base import BaseCommand
import tempfile
from core.models import Entry
from np_web.models import Dump
import calendar
import csv
class Command(BaseCommand):
help = 'Create dump and file for current day'
def handle(self, *args, **options):
with timy.Timer() as timer:
today = datetime.date.today()
dump_title = '{}.{}.{}.csv'.format(today.day, today.month, today.year)
entries = Entry.objects.all()
dump = Dump.objects.create(all_entries=entries.count())
dump.save()
print('Dump created with uuid:', dump.uuid)
print('Now create data file for dump')
with tempfile.NamedTemporaryFile() as temp_csv:
writer = csv.writer(temp_csv)
writer.writerow([
'Column1',
'Column2',
])
for entry in entries:
writer.writerow([
entry.first_name,
entry.last_name
])
dump.file.save(dump_title, temp_csv)
My Dump model:
class Dump(BaseModel):
created = models.DateField(auto_now_add=True)
all_entries = models.PositiveIntegerField()
file = models.FileField(verbose_name=_('Attachment'), upload_to=dump_file_upload_path, max_length=2048,
storage=attachment_upload_storage, null=True, blank=True)
Anyway, it doesn't work. It is throwing an error:
TypeError: a bytes-like object is required, not 'str'
I am also not sure if using temporary file is a best solution out there.
A few comments to hopefully direct you to a solution.
For starters, "If delete is true (the default), the file is deleted as soon as it is closed.", and assuming that the spacing you have is correct, you are closing the file prior to attempting to save the file. This would result in the file being empty.
My suggestion would be to simply create the file like normal, and delete it afterwards. A simple approach (though there may be a better) would be to create the file, save it, and then remove the original (temp) copy.
In addition, when saving the file, you want to save it using Django's File wrapper.
So you could do something like the following:
from django.core.files import File
import os
with open(temp, 'rb') as f:
doc_file = File(f)
dump.file.save("filename", doc_file, True)
dump_file.save()
try:
os.remove(temp)
except Expection:
print('Unable to remove temp file')
I have the following view code that attempts to "stream" a zipfile to the client for download:
import os
import zipfile
import tempfile
from pyramid.response import FileIter
def zipper(request):
_temp_path = request.registry.settings['_temp']
tmpfile = tempfile.NamedTemporaryFile('w', dir=_temp_path, delete=True)
tmpfile_path = tmpfile.name
## creating zipfile and adding files
z = zipfile.ZipFile(tmpfile_path, "w")
z.write('somefile1.txt')
z.write('somefile2.txt')
z.close()
## renaming the zipfile
new_zip_path = _temp_path + '/somefilegroup.zip'
os.rename(tmpfile_path, new_zip_path)
## re-opening the zipfile with new name
z = zipfile.ZipFile(new_zip_path, 'r')
response = FileIter(z.fp)
return response
However, this is the Response I get in the browser:
Could not convert return value of the view callable function newsite.static.zipper into a response object. The value returned was .
I suppose I am not using FileIter correctly.
UPDATE:
Since updating with Michael Merickel's suggestions, the FileIter function is working correctly. However, still lingering is a MIME type error that appears on the client (browser):
Resource interpreted as Document but transferred with MIME type application/zip: "http://newsite.local:6543/zipper?data=%7B%22ids%22%3A%5B6%2C7%5D%7D"
To better illustrate the issue, I have included a tiny .py and .pt file on Github: https://github.com/thapar/zipper-fix
FileIter is not a response object, just like your error message says. It is an iterable that can be used for the response body, that's it. Also the ZipFile can accept a file object, which is more useful here than a file path. Let's try writing into the tmpfile, then rewinding that file pointer back to the start, and using it to write out without doing any fancy renaming.
import os
import zipfile
import tempfile
from pyramid.response import FileIter
def zipper(request):
_temp_path = request.registry.settings['_temp']
fp = tempfile.NamedTemporaryFile('w+b', dir=_temp_path, delete=True)
## creating zipfile and adding files
z = zipfile.ZipFile(fp, "w")
z.write('somefile1.txt')
z.write('somefile2.txt')
z.close()
# rewind fp back to start of the file
fp.seek(0)
response = request.response
response.content_type = 'application/zip'
response.app_iter = FileIter(fp)
return response
I changed the mode on NamedTemporaryFile to 'w+b' as per the docs to allow the file to be written to and read from.
current Pyramid version has 2 convenience classes for this use case- FileResponse, FileIter. The snippet below will serve a static file. I ran this code - the downloaded file is named "download" like the view name. To change the file name and more set the Content-Disposition header or have a look at the arguments of pyramid.response.Response.
from pyramid.response import FileResponse
#view_config(name="download")
def zipper(request):
path = 'path_to_file'
return FileResponse(path, request) #passing request is required
docs:
http://docs.pylonsproject.org/projects/pyramid/en/latest/api/response.html#
hint: extract the Zip logic from the view if possible
I have a this model...
class MyModel(models.Model):
...
file = models.FileField(upload_to='files/',null=True, blank=True)
...
when i upload a file, example file name is docfile.doc. when i change the file or i rewrite it and upload again docfile.doc the file will become docfile_1.doc and the old docfile.doc is still exist.
i am doing the uploading and saving data in django-admin
my question is, how can i remove the old docfile.doc if i upload the new docfile.doc and the file name is still docfile.doc?
can anyone help me in my case? thanks in advance
i try this one :
def content_file_name(instance, filename):
print instance
print filename
file = os.path.exists(filename)
print file
if file:
os.remove(filename)
return "file/"+str(filename)
class MyModel(models.Model):
...
file = models.FileField(upload_to=content_file_name,null=True, blank=True)
...
but nothing happend, when i upload docfile.doc again, it will become docfile_1.doc and the old docfile.doc still exist.
i got it... i use this
def content_file_name(instance, filename):
print instance
print filename
file = os.path.exists("media/file/"+str(filename))
print file
if file:
os.remove("media/file/"+str(filename))
return "file/"+str(filename)
I don't know exactly how to do it, but i think these links can help you:
Here you can find the two options that a FileField accept. The one that i think will interest you the most is FileField.storage. You can pass a storage object in that parameter.
It says:
FileField.storage: Optional. A storage object, which handles the storage and retrieval of your files.
Then, if you read this you would see that you can write your own storage object. Here is some explanation on how to do it. I think that you could just override the _save method in order to accomplish what you want to do (i.e: if the file already exists, remove it before saving the new copy.)
But be careful! I don't know which is the source of the files you are going to store. Maybe, your app is going to recieve lots of files with the same name, although they are all different. In this case, you would want to use a callable as the FileField.upload_to parameter, so that determine a unique filename for each file your site recieve.
I hope this helps you!
You could also have a look here: ImageField overwrite image file with same name
Define your own storage and overwrite its get available_name method.
The next code solves your problem. You override pre_save method where image is actually saved to storage. Please, rename functions for your project. Use newly created image field ImageFieldWithPermantName with your upload_to function (content_file_name).
If the code is too complicated you could simplify it. I use the code to do more complex operations for uploading images: I create thumbnails on-the-fly in custom _save_image function. So, you can simplify it.
from PIL import Image
import StringIO
from django.db.models import ImageField
from django.db.models.fields.files import FileField
from dargent.settings import MEDIA_ROOT
import os
class ImageFieldWithPermanentName( ImageField ):
def pre_save( self, model_instance, add ):
file = super( FileField, self ).pre_save(model_instance, add)
if file and not file._committed:
if callable( self.upload_to ):
path = self.upload_to( model_instance, "" )
else:
path = self.upload_to
file.name = path # here we set the same name to a file
path = os.path.join( MEDIA_ROOT, path )
chunks = _get_chunks( file.chunks() )
_save_image( chunks, path )
return file
def _get_chunks( chunks ):
chunks_ = ""
for chunk in chunks:
chunks_ += chunk
return chunks_
def _get_image( chunks ):
chunks_ = ""
for chunk in chunks:
chunks_ += chunk
virt_file = StringIO.StringIO( chunks_ )
image = Image.open( virt_file )
return image
def _save_image( chunks, out_file_path ):
image = _get_image( chunks )
image.save( out_file_path, "JPEG", quality = 100 )
I was trying to assign a file from my disk to the FileField, but I have this error:
AttributeError: 'str' object has no attribute 'open'
My python code:
pdfImage = FileSaver()
pdfImage.myfile.save('new', open('mytest.pdf').read())
and my models.py
class FileSaver(models.Model):
myfile = models.FileField(upload_to="files/")
class Meta:
managed=False
Thank you in advance for your help
Django uses it's own file type (with a sightly enhanced functionality). Anyway Django's file type works like a decorator, so you can simply wrap it around existing file objects to meet the needs of the Django API.
from django.core.files import File
local_file = open('mytest.pdf')
djangofile = File(local_file)
pdfImage.myfile.save('new', djangofile)
local_file.close()
You can of course decorate the file on the fly by writing the following (one line less):
pdfImage.myfile.save('new', File(local_file))
If you don't want to open the file, you can also move the file to the media folder and directly set myfile.name with the relative path to MEDIA_ROOT :
import os
os.rename('mytest.pdf', '/media/files/mytest.pdf')
pdfImage = FileSaver()
pdfImage.myfile.name = '/files/mytest.pdf'
pdfImage.save()
if you are getting error like:
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x89 in position...
then you have to open the file in binary mode: open("mytest.pdf", "rb")
full example:
from django.core.files import File
pdfImage = FileSaver()
pdfImage.myfile.save('new.pdf', File(open('mytest.pdf','rb')))
If you want to upload a local file in django, you can do this
from django.core.files import File
file = open(filepath, 'rb')
file_to_upload = File(file, name=f'{your_desired_name}')
file.close()
Now you can pass the file to django rest serializer to upload or whatever your use-case is. If passing this file to serializer, be sure to close it after it has been passed to serializer.
Note: The filepath here can be django's temporary file or any stored file.