I would like to store a large dataset generated in Python in a Django model. My idea was to pickle the data to a string and upload it to a FileField on my model. My Django model is:
# models.py
from django.db import models

class Data(models.Model):
    label = models.CharField(max_length=30)
    file = models.FileField(upload_to="data")
In my Python program I would like to do the following:
import random, pickle
data_entry = Data(label="somedata")
somedata = [random.random() for i in range(10000)]
# Next line does NOT work
#data_entry.file.save(filename, pickle.dumps(somedata))
How should I modify the last line to store somedata in file preserving the paths defined with upload_to parameter?
Based on the answers to this question, I came up with the following solution:
from django.core.files.base import ContentFile
import pickle
content = pickle.dumps(somedata)
fid = ContentFile(content)
data_entry.file.save(filename, fid)
fid.close()
All of it is done on the server side, and users are NOT allowed to upload pickles. I tested it and it works fine, but I am open to any suggestions.
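A minimal standard-library sketch of the difference between the commented-out line and the working version: pickle.dumps returns a plain bytes object with no read() method, while the field's save() expects a file-like object. Here io.BytesIO is only a stand-in for ContentFile (an assumption, so the snippet runs without Django); it is not what Django itself uses.

```python
import io
import pickle
import random

somedata = [random.random() for _ in range(10000)]

# pickle.dumps returns raw bytes, not a file-like object.
content = pickle.dumps(somedata)
has_read_method = hasattr(content, "read")

# Wrapping the bytes gives an object with read(), which is what a
# storage backend needs. io.BytesIO stands in for ContentFile here.
fid = io.BytesIO(content)
restored = pickle.load(fid)
fid.close()

print(has_read_method, restored == somedata)
```

The round trip also confirms that the wrapped content unpickles to the original list.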
In your database, the file attribute is just a path to the file. So, since you are not doing an actual upload, you need to store the file on disk and then save the path in the database.
# Pickle data is binary, so the file must be opened in binary mode
with open(filename, 'wb') as f:
    pickle.dump(somedata, f)
data_entry.file = filename
data_entry.save()
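The "write the file yourself, then store its path" idea can be sketched without Django as follows. The helper name dump_to_media and the temporary directory standing in for MEDIA_ROOT are hypothetical; the point is the binary-mode write and the fact that only the relative path would go into the database.

```python
import os
import pickle
import tempfile


def dump_to_media(obj, media_root, relative_path):
    """Pickle obj under media_root and return the relative path to store."""
    full_path = os.path.join(media_root, relative_path)
    os.makedirs(os.path.dirname(full_path), exist_ok=True)
    with open(full_path, "wb") as f:  # binary mode is required for pickle
        pickle.dump(obj, f)
    return relative_path


media_root = tempfile.mkdtemp()  # hypothetical stand-in for MEDIA_ROOT
stored = dump_to_media([1.5, 2.5], media_root, os.path.join("data", "somedata.pkl"))

# Later, the stored relative path is enough to find and load the data again.
with open(os.path.join(media_root, stored), "rb") as f:
    loaded = pickle.load(f)
```

In the Django case, the returned relative path is what would be assigned to data_entry.file before calling data_entry.save().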
Might you not be better off storing your data in a text field? It's not a file upload, after all.
I've never done this, but based on reading a bit of the relevant code, I'd start by looking into creating an instance of django.core.files.base.ContentFile and assigning that as the value of the field.
NOTE: See other answers and comments below - old info and broken links removed (can't delete a once-accepted answer).
Marty Alchin has a section on this in chapter 3 of Pro Django, review here.
I recently learned that the PDF files and images I uploaded to my Heroku website were removed whenever I updated the website. Due to this, I have been trying to store my PDFs in my MongoDB database using Mongoengine (with Flask and Python), and then retrieving them and storing them in the static folder (I was able to successfully do this with my images), with no luck.
Below is the relevant code for my Mongoengine class:
class Article(Document):
    uploaded_content = FileField()  # Field for storing PDF
    uploaded_content_name = StringField()  # File name for PDF
The relevant code for my Flask route that is trying to store the PDF:
data = Article()
if request.files['uploaded-article']:
    data.uploaded_content = request.files['uploaded-article']
    # uploaded_content_name given random name below, and stored in
    # database
And then here is my code that tries to retrieve the PDF from mongoengine, and save it to my blog folder:
articles = Article.objects()
for art in articles:
    path = os.path.join(app.config['BLOG_FOLDER'], art.uploaded_content_name)
    if not os.path.isfile(path):
        f = open(art.uploaded_content.read(), 'wb')  # This line gives the error
        f.save(os.path.join(app.config['BLOG_FOLDER'] + art.uploaded_content_name), "PDF")
The line that gives me the error is when I try to open the PDF file I stored in my database. I have tried many different ways and have gotten various errors, but one I get is:
No such file or directory: b''. I can confirm that if I read() the database object, it's just an empty byte string.
I have also tried changing my Flask route to the code below, which stores the opened PDF from Flask's request object. This gave me the error ValueError: embedded null byte when I tried to open it, although the read() method at least returned a really long byte string.
data = Article()
if request.files['uploaded-article']:
    # store the PDF in the blog folder
    article_pdf = request.files['uploaded-article']
    article_pdf.save(os.path.join(app.config['BLOG_FOLDER'], article_pdf_filename))
    # Open the PDF just stored in the blog folder
    with open(os.path.join(app.config['BLOG_FOLDER'], article_pdf_filename), 'rb') as f:
        # Store the opened PDF in the database
        data.uploaded_content.put(f)
        f.close()
    # uploaded_content_name given random name below, and stored in
    # database
Another random thing I tried was trying to open the PDF file using the BytesIO data structure, but it resulted in the same error above of an embedded null byte.
Are there any suggestions for how I can properly store and retrieve my PDF from my mongoengine database? My apologies for the complexity of my question - however, if needed I can add more details. If there are any alternative ways of storing my PDFs so they do not get lost on Heroku, I would take that as a valid solution as well.
As a reference for the future, it looks like this was not working because I did not set the content type correctly when putting the pdf in. My original code when saving the PDF to the data.uploaded_content field was:
data.uploaded_content.put(f)
However, I needed to define the mimetype correctly:
data.uploaded_content.put(f, content_type='application/pdf')
With this change it then worked, and I was able to successfully store the PDF in mongoengine. As far as storing the PDF to a folder after it was successfully uploaded, I used the following code:
if art.uploaded_content_name:
    extension = art.uploaded_content_name.rsplit('.', 1)[1].lower()
    path = os.path.join(app.config['BLOG_FOLDER'], art.uploaded_content_name)
    if not os.path.isfile(path):
        pdf = art.uploaded_content.read()
        with open(os.path.join(app.config['BLOG_FOLDER'], art.uploaded_content_name), 'wb') as f:
            f.write(pdf)
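The retrieval step above boils down to "open a path, then write the bytes", which can be sketched with the standard library alone. The helper name save_bytes_if_missing and the placeholder bytes are hypothetical; the second half shows that open() treats its first argument as a path, so handing it file content (as in the question) fails before anything is written.

```python
import os
import tempfile


def save_bytes_if_missing(folder, filename, data):
    """Write data to folder/filename unless it already exists.

    Returns True if a file was written, False if it was already there.
    """
    path = os.path.join(folder, filename)
    if os.path.isfile(path):
        return False
    with open(path, "wb") as f:
        f.write(data)
    return True


blog_folder = tempfile.mkdtemp()  # stand-in for app.config['BLOG_FOLDER']
pdf_bytes = b"%PDF-1.4\n\x00 placeholder bytes, not a real PDF\n"

first = save_bytes_if_missing(blog_folder, "article.pdf", pdf_bytes)
second = save_bytes_if_missing(blog_folder, "article.pdf", pdf_bytes)

# open() interprets its first argument as a path, so passing the
# content itself raises instead of creating a file.
path_error = None
try:
    open(pdf_bytes, "wb")
except (ValueError, OSError) as exc:
    path_error = exc
```

Here first is True and second is False, because the second call finds the file already on disk.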
I have created a function that creates an excel file using xlwt. I was able to download it as file but I want it to be saved to the database first and what I did does not work.
Here's what I did so far.
import xlwt
from django.http import HttpResponse

response = HttpResponse(content_type='application/ms-excel')
response['Content-Disposition'] = 'attachment; filename="excel-file.xls"'
wb = xlwt.Workbook()
# some excel file generating code here
wb.save(response)
return response
After generating the Excel file, I tried to save it to the database, but this does not work.
(Code below)
# file = models.FileField
Reports.objects.create(
    file=wb
)
I have also tried saving it to a stream first but saving it like this also does not work.
(Code below)
f = io.StringIO()
wb.save(f)
# file = models.FileField
Reports.objects.create(
    file=f
)
OK, I think I understand what you are asking now.
Unfortunately, the FileField class is designed to save an (uploaded) file in the file system, not the database. This is probably a good thing. Databases are not designed to be used as file systems. But either way, if you use FileField the database will only contain the file name, not the file content.
So my first suggestion is to use FileField and FileSystemStorage as intended. There is some example code in https://docs.djangoproject.com/en/3.0/topics/files/#file-storage that shows how to save some stuff into the default file storage.
Alternatively, if you really want to store your Excel object in the database, then you need to change the FileField to a BinaryField. Then you can do something like this:
from io import BytesIO

buffer = BytesIO()
wb.save(buffer)
# excel = models.BinaryField()
Reports.objects.create(excel=buffer.getvalue(), ...)
(There are probably more efficient ways to do this.)
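To illustrate why the in-memory buffer needs to be a BytesIO rather than the StringIO tried in the question, here is a small sketch. The function write_binary_document is a hypothetical stand-in for wb.save(stream); it uses zipfile only because that is a standard-library writer that emits binary data, under the assumption that the workbook writer does the same.

```python
import io
import zipfile


def write_binary_document(stream):
    """Stand-in for wb.save(stream): writes binary data to a file-like object."""
    with zipfile.ZipFile(stream, "w") as archive:
        archive.writestr("sheet.txt", "a,b\n1,2\n")


# A bytes buffer accepts binary output; getvalue() returns the raw bytes
# that could be assigned to a BinaryField.
buffer = io.BytesIO()
write_binary_document(buffer)
payload = buffer.getvalue()

# A text buffer rejects bytes outright.
try:
    io.StringIO().write(b"binary")
    text_stream_ok = True
except TypeError:
    text_stream_ok = False
```

Note the spelling getvalue(), all lowercase, on the buffer object.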
In order to become more familiar with django, I decided to build a website which is gonna let a user upload a csv file, which is then gonna be converted to excel and the user will be able to download it.
In order to achieve that I created a modelform with one model FileField called csv_file as shown below:
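# models.py
from django.db import models

class CSVUpload(models.Model):
    csv_file = models.FileField(upload_to="csvupload/")

    def __str__(self):
        # __str__ must return a string, so use the stored file name
        return self.csv_file.name

# forms.py
from django import forms

class CsvForm(forms.ModelForm):
    class Meta:
        model = CSVUpload
        fields = ('csv_file', )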
and the corresponding view is:
from django.conf import settings
from django.shortcuts import render, redirect
import pandas as pd
import os

from .forms import CsvForm  # assumes forms.py sits next to views.py

# Converts an uploaded file from CSV to Excel
def convertcsv2excel(filename):
    # Pass path components separately instead of concatenating "\csvupload"
    fpath = os.path.join(settings.MEDIA_ROOT, "csvupload", filename)
    df = pd.read_csv(fpath)
    newfilename = str(filename) + ".xlsx"
    newpathfile = os.path.join(settings.MEDIA_ROOT, newfilename)
    df.to_excel(newpathfile, index=False)
    return newfilename

def csvtoexcel(request):
    if request.method == 'POST':
        form = CsvForm(request.POST, request.FILES)
        if form.is_valid():
            form.save()
            print(form.cleaned_data['csv_file'].name)
            convertcsv2excel(form.cleaned_data['csv_file'].name)
            return redirect('exceltocsv')
    else:
        form = CsvForm()
    return render(request, 'xmlconverter/csvtoexcel.html', {'form': form})
Right now, as you can see, I am using Pandas to convert the CSV file to Excel inside the views.py file. My question is: is there a better way to do it (for instance in the form or model module) in order to make the Excel file more effectively downloadable?
I appreciate any help you can provide!
First, I want to point out that your example demonstrates an arbitrary file upload vulnerability. Pandas does not validate the format of the file for you, so as an attacker, I can simply upload something like malware.php.csv to your conversion script, and any malicious code I include will remain intact. Since you aren't validating that this file's contents are, in fact, in CSV format, then you are giving users a means to directly upload a file with an arbitrary extension and possibly execute code on your website. Since you are rendering the xlsx format on the webpage the way you are, there's a good chance someone could abuse this. If this is just your own personal experiment to help yourself get familiar, that's one thing, but I strongly recommend against deploying this in production. What you are doing here is very dangerous.
As for your more immediate problem, I'm not personally familiar with Django, but this looks very similar to this question: Having Django serve downloadable files
In your case, you do not want to actually save the file's contents to your server but rather you want to process the file contents and return it in the body of the response. The django smartfile module looks to be exactly what you want: https://github.com/smartfile/django-transfer
This provides components for Apache, Nginx, and lighttpd and should allow you to serve files in the response immediately following a request to upload/convert the file. I should emphasize that you need to be very careful about where you save these files, validate their contents, ensure end-users cannot browse to or execute these files under the web server context, and make sure they are deleted immediately after the response and file are successfully sent.
Someone more familiar with Django can feel free to correct me or provide a usable code example, but this kind of functionality, in my experience, is how you introduce code execution into your site. It's usually a bad idea.
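As a rough illustration of the "validate their contents" point, here is a minimal standard-library sketch of a content check that could run before any conversion. The function name looks_like_csv, the size limit, and the rule that every row must have the same number of columns (and more than one) are all assumptions chosen for the example; passing this check does not by itself make an upload safe.

```python
import csv
import io

MAX_UPLOAD_BYTES = 1_000_000  # hypothetical size limit


def looks_like_csv(raw, max_bytes=MAX_UPLOAD_BYTES):
    """Return True if raw bytes parse as a rectangular, multi-column CSV."""
    if not raw or len(raw) > max_bytes:
        return False
    if b"\x00" in raw:  # binary content is not plausible CSV
        return False
    try:
        text = raw.decode("utf-8")
    except UnicodeDecodeError:
        return False
    try:
        rows = list(csv.reader(io.StringIO(text, newline="")))
    except csv.Error:
        return False
    rows = [row for row in rows if row]  # ignore blank lines
    if not rows:
        return False
    width = len(rows[0])
    return width > 1 and all(len(row) == width for row in rows)


print(looks_like_csv(b"a,b\n1,2\n"))
print(looks_like_csv(b"<?php echo 1; ?>\n"))
```

A check like this rejects the malware.php.csv style of payload only when it does not also happen to parse as rectangular CSV, so it complements, rather than replaces, storing uploads where the web server will never execute them.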
This is my model.
class Product(models.Model):
    id = models.AutoField(max_length=10, primary_key=True)
    name = models.CharField(max_length=60)
    summary = models.TextField(max_length=200)
    image = models.FileField(upload_to='product_images', blank=True)
I know how to import data from csv in django model. But I also have image field here.
Can I upload files to models from the method of adding data to models from csv? How?
I am using this method for importing them: http://mitchfournier.com/2011/10/11/how-to-import-a-csv-or-tsv-file-into-a-django-model/
I have images stored in a folder in my system.
FileField is just a reference to a file. It does not put a file into the database as is sometimes mistakenly believed.
Assuming that you have already written the code for reading through the CSV and fetching the location of the file, all you need to do is follow the example given in the documentation for FileField.save():
This method takes a filename and file contents and passes them to the
storage class for the field, then associates the stored file with the
model field. If you want to manually associate file data with
FileField instances on your model, the save() method is used to
persist that file data.
Takes two required arguments: name which is the name of the file, and
content which is an object containing the file’s contents.
Something like this:
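from django.core.files import File
p = Product()
# Open the image in binary mode and wrap it in Django's File class
p.image.save('name from csv', File(open('path from csv', 'rb')))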
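The CSV side of this can be sketched with the standard library. The column names (name, summary, image), the helper iter_product_rows, and the sample files are assumptions made up for the example; the Django call appears only as a comment because it needs a configured project.

```python
import csv
import io
import os
import tempfile


def iter_product_rows(csv_text, image_dir):
    """Yield (name, summary, image_path); image_path is None if the file is missing."""
    for row in csv.DictReader(io.StringIO(csv_text, newline="")):
        # basename() keeps the lookup inside image_dir
        candidate = os.path.join(image_dir, os.path.basename(row["image"]))
        image_path = candidate if os.path.isfile(candidate) else None
        yield row["name"], row["summary"], image_path


# Hypothetical sample data: one image exists on disk, one does not.
image_dir = tempfile.mkdtemp()
with open(os.path.join(image_dir, "lamp.jpg"), "wb") as f:
    f.write(b"\xff\xd8\xff placeholder, not a real image")

csv_text = "name,summary,image\nLamp,A desk lamp,lamp.jpg\nChair,A chair,chair.jpg\n"
rows = list(iter_product_rows(csv_text, image_dir))

# In the Django import loop, each row with an image_path could then do
# something like:
#     with open(image_path, "rb") as fh:
#         product.image.save(os.path.basename(image_path), File(fh))
```

Rows whose image is missing come back with None, so the import can skip the save() call for them, which fits the blank=True on the field.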
I solved it by saving the name of the image in the csv and just uploading the name in the field like this:
...
product.image = "product_images" + namefromcsv
Then I uploaded all the images to the media folder, inside product_images.
And it worked very well.
I tried uploading the image with product.image.save() option with File class but it didn't work for me.
How can I create an XML file in Django that contains the objects from a queryset?
def my_serialize(request):
    from django.core import serializers
    data = serializers.serialize('xml', Student.objects.filter(Q(name__startswith='A')),
                                 fields=('name','dob'))
    from django.core.files import File
    f = open('content.xml', 'w')
    myfile = File(f)
    myfile.write(data)
    myfile.close()
After I call the above function, my content file remains empty; no data gets written to it.
from django.core import serializers
data = serializers.serialize("xml", SomeModel.objects.all())
Take a look at the Serialization documentation.
I don't use version 1.3, so I'm not sure how it works, but could it be that the file is not actually being opened? Could adding myfile.open('w') work? Of course, this may be handled in the File class's init function. Anyway, try giving it a shot and commenting your results.
Update:
BTW, I came up with the idea from looking at the docs. Maybe they'll be able to help too.
http://docs.djangoproject.com/en/1.3/ref/files/file/
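For comparison, here is a minimal sketch of writing serialized XML to disk with plain Python file handling. The students_to_xml helper and the sample record are hypothetical stand-ins for the output of serializers.serialize; the parts that carry over are the with block, which guarantees the data is flushed and the file closed, and the explicit path. A relative name such as 'content.xml' is resolved against the process's working directory, so it is worth ruling out that the file being inspected is a different one.

```python
import os
import tempfile
import xml.etree.ElementTree as ET


def students_to_xml(students):
    """Build a small XML document from a list of dicts (stand-in for serialize)."""
    root = ET.Element("students")
    for student in students:
        node = ET.SubElement(root, "student")
        ET.SubElement(node, "name").text = student["name"]
        ET.SubElement(node, "dob").text = student["dob"]
    return ET.tostring(root, encoding="unicode")


data = students_to_xml([{"name": "Alice", "dob": "2001-04-12"}])

# Use an explicit path so there is no doubt about where the file lands.
path = os.path.join(tempfile.mkdtemp(), "content.xml")
with open(path, "w", encoding="utf-8") as f:
    f.write(data)  # flushed and closed when the block exits

with open(path, encoding="utf-8") as f:
    written = f.read()
```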