Where to store user uploaded files in Django - python

I'm building a Django app where users will upload a CSV file. Each row in the CSV file will then be added to the DB (after validating the data). The file can then be discarded.
From the Django documentation, I'm using this to save the file.
def handle_uploaded_file(f):
    with open('some/file/name.txt', 'wb+') as destination:
        for chunk in f.chunks():
            destination.write(chunk)
There are a lot of warnings on the Django site about handling user-uploaded files, so my question is: where should these files be saved? My guess is that it doesn't matter, but I want to be sure.
At the moment I plan to create a variable called UPLOADS_URL in my settings file. All uploaded files will then be stored at that location.

Indeed, it does not matter much where you put these files, especially if you discard them afterwards. The only important thing is to put them somewhere your web server does not serve to clients. For example, if your static files and/or your app code are stored in the /var/www directory, do NOT put the uploaded files there.
There are existing Django settings called MEDIA_ROOT and MEDIA_URL that define the default place to store uploaded files. MEDIA_ROOT is the filesystem path where the files are stored; MEDIA_URL is the URL path that an HTTP client may use to retrieve the uploaded files (so in your case it is not needed).
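Because the file is discarded once its rows are imported, one option is to never give it a permanent home at all. The following is a minimal sketch using only the standard library, assuming the upload exposes its data as an iterable of byte chunks (as f.chunks() does in the question). The name save_to_temp is illustrative and not part of Django.

```python
import os
import tempfile


def save_to_temp(chunks):
    """Write byte chunks to a private temp file and return its path."""
    # mkstemp creates the file with owner-only permissions in the system
    # temp directory, which a web server does not normally serve.
    fd, path = tempfile.mkstemp(suffix=".csv")
    with os.fdopen(fd, "wb") as destination:
        for chunk in chunks:
            destination.write(chunk)
    return path


# Usage: the caller removes the file once the rows have been processed.
path = save_to_temp([b"name,age\n", b"alice,30\n"])
try:
    with open(path, "rb") as f:
        data = f.read()
finally:
    os.remove(path)
```

In a view, the list of chunks would be replaced by f.chunks(), and the processing step would sit inside the try block.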
What you need to be really careful about is the content of the CSV files. As you are going to store the rows in the DB and reuse them later, you should validate the content carefully before storing it. For example, if you are storing a string that you will later display on the site, you want to make sure it doesn't contain any JavaScript...
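As a rough illustration of that validation step, the sketch below checks each row before it would be stored and escapes a string at display time. The column names name and age are hypothetical, chosen only for the example; the real rules depend on your data. Note that Django templates escape variables by default, so the explicit html.escape call matters mostly when you build HTML by hand.

```python
import csv
import html
import io


def parse_rows(text):
    """Return (valid_rows, bad_line_numbers) for CSV text with 'name' and 'age' columns."""
    valid, errors = [], []
    reader = csv.DictReader(io.StringIO(text))
    # start=2 because line 1 of the file is the header row
    for line_no, row in enumerate(reader, start=2):
        name = (row.get("name") or "").strip()
        age = (row.get("age") or "").strip()
        if not name or not age.isdecimal():
            errors.append(line_no)
            continue
        valid.append({"name": name, "age": int(age)})
    return valid, errors


rows, bad_lines = parse_rows("name,age\nalice,30\n<script>x</script>,abc\n")

# Escape when rendering, so stored markup is displayed as text.
safe_name = html.escape("<script>x</script>")
```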

Related

Using temporary files and folders in Web2py app

I am relatively new to web development and very new to using Web2py. The application I am currently working on is intended to take in a CSV upload from a user, then generate a PDF file based on the contents of the CSV, then allow the user to download that PDF. As part of this process I need to generate and access several intermediate files that are specific to each individual user (these files would be images, other pdfs, and some text files). I don't need to store these files in a database since they can be deleted after the session ends, but I am not sure the best way or place to store these files and keep them separate based on each session. I thought that maybe the subfolders in the sessions folder would make sense, but I do not know how to dynamically get the path to the correct folder for the current session. Any suggestions pointing me in the right direction are appreciated!
I ran into the error "TypeError: expected string or Unicode object, NoneType found" and ended up storing in the session only a link to the uploaded document (in the db, or perhaps in the upload folder in your case). I would save the file to the upload folder so that processing proceeds normally, and then clear out the values and the file if it is not 'approved'.
In similar circumstances, if the information is not confidential, I write the temporary files directly under /tmp.
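One way to keep such temporary files separate per session is to give each session its own scratch directory. This is only a sketch with the standard library: the function name make_session_dir and the id "abc123" are placeholders, and the session id is assumed to be a simple token without path separators. The returned path could then be kept in the session so that later requests find the same directory.

```python
import os
import shutil
import tempfile


def make_session_dir(session_id):
    """Create a private scratch directory for one session and return its path."""
    # The prefix only makes the directory easier to identify; mkdtemp adds
    # a random suffix and creates it with owner-only permissions.
    return tempfile.mkdtemp(prefix="session_%s_" % session_id)


workdir = make_session_dir("abc123")  # "abc123" is a placeholder id
with open(os.path.join(workdir, "intermediate.txt"), "w") as f:
    f.write("scratch data")
created = os.path.isdir(workdir)

# Remove the directory and everything in it once the session is done.
shutil.rmtree(workdir)
```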

Accessing app-specific server-side generated files in django template

I have a django app (my_app) that based on the user query:
1. creates a file from the db
2. runs a program on the file from step-1 and gets an output file
3. generates a json file from the step-2 output file
4. renders a D3 visualization from a django template using the data from the json file from step-3
I need the program to run on the server side and the json file to be generated server-side as well.
Because the json files are query-specific, I thought it's not a good idea to keep these files in the /static/ folder and thought of keeping the files (even if temporarily) in e.g. /myapp/output_files/ folder.
The problem is that there is no url pattern corresponding to /myapp/output_files/my_file.json and I get a "Page not found (404)" error if I try to open the generated file and it obviously doesn't load in the javascript code in the template.
Is there a better way to design the system?
If the design is ok, how can I access a json file in the app's folder from the django template? Do I need something in the urls.py?
P.S. Everything works fine if I change the json location to /static/ or its subfolder.
Just add the location to your STATICFILES_DIRS setting.
However, you probably need to build a view function that returns the json based on some parameter in the url. Static files are meant to stay static...
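The lookup such a view could delegate to might look like the sketch below. It is framework-independent so that it runs on its own; OUTPUT_DIR, load_result and the id "q42" are made-up names for illustration. In a Django view, the returned dict could be wrapped in django.http.JsonResponse, with the id taken from the URL pattern.

```python
import json
import os
import re
import tempfile

# Hypothetical location for generated files, outside /static/.
OUTPUT_DIR = tempfile.mkdtemp()


def load_result(query_id, output_dir=OUTPUT_DIR):
    """Return the parsed JSON for query_id, or None if it is unknown."""
    # Accept only simple identifiers so the value cannot point elsewhere on disk.
    if not re.fullmatch(r"[A-Za-z0-9_-]+", query_id):
        return None
    path = os.path.join(output_dir, query_id + ".json")
    if not os.path.isfile(path):
        return None
    with open(path) as f:
        return json.load(f)


# Simulate a file produced by step 3, then look it up by its id.
with open(os.path.join(OUTPUT_DIR, "q42.json"), "w") as f:
    json.dump({"nodes": [1, 2, 3]}, f)

found = load_result("q42")
missing = load_result("../settings")
```

A None result would translate to a 404 response in the view.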

Django server side caching of uploaded file

I am running a server in django and I want to serve a file. I am storing the file under '/upload/directory/filename' and I return it using
from django.shortcuts import redirect
file_path = '/upload/directory/filename'
return redirect(file_path)
However, the file appears to be cached at the first version that was placed locally and is never updated. Even if I remove the file, it is still served. I checked that if I change the path to 'upload/directory_2/filename' then I correctly get my new file. What is going on, and how can I fix this?
This is happening locally and I am making a direct server request (hence there is no possibility of browser caching or anything else).
Additional information:
I understand that maybe I should be using static files, although for instance this answer suggests that it is quite debatable for files that I am uploading myself.
When I say "I want to serve files with django" I just mean that I have associated a file path with a particular entity in my database (using models.FileField), and based on what the user requests I want to return this file. I doubt this is a clear-cut case for using static files.
There are many workarounds to my issue, like generating unique filenames every time I want to "clear my cache" or explicitly opening the file:
with open(absolute_file_path, 'rb') as file:
    response = HttpResponse(file.read(), content_type='application/octet-stream')
My question was about understanding why the particular piece of code above does what it does, i.e. leads to data caching, and how to prevent this.
If you must do this using Django itself, I would suggest skipping the redirect and setting up your app according to these directions:
https://docs.djangoproject.com/en/1.11/howto/static-files/#serving-files-uploaded-by-a-user-during-development
Make your MEDIA_URL something like /media/, or /upload/ if you want it to match your current case. MEDIA_ROOT could point to os.path.join(BASE_DIR, 'upload'), with your field declared as FileField(upload_to='directory').
You must use Django's static files facility to serve static files. On your local machine, this is set up in settings to point to a static folder in your app; you can then refer to the file using the static function.
All is explained here:
https://docs.djangoproject.com/en/1.11/howto/static-files/
In your code, you get the URL of the file like this:
from django.templatetags.static import static
url_to_file = static('some_app/path/to_file')
On a production machine, static files are served by a web server or a dedicated service such as AWS S3, not by Django! For this reason, the static facility is a must.
To avoid the cache, in case you have this problem, have a look at the never_cache decorator: https://docs.djangoproject.com/en/1.11/topics/http/decorators/#caching
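Django has a special response class, FileResponse, that makes it easy to serve a file.
from django.http import FileResponse, HttpResponseNotFound

def file_download(request):
    # NOTE: opening a client-supplied path as-is is unsafe; restrict it to a known directory.
    file_name = request.GET.get('file_name', '')
    try:
        temp = open(file_name, 'rb')
    except IOError:
        return HttpResponseNotFound("Not found")
    # FileResponse streams the file and closes it when the response is done.
    return FileResponse(temp, content_type='text/plain')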
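Since the view above opens whatever path the client sends, it is worth confining the name to the upload directory first. The helper below is a sketch of one way to do that with the standard library; resolve_under and the directory /srv/uploads are illustrative names, not Django APIs.

```python
import os


def resolve_under(base_dir, requested_name):
    """Return the absolute path of requested_name inside base_dir, or None."""
    base = os.path.realpath(base_dir)
    candidate = os.path.realpath(os.path.join(base, requested_name))
    # commonpath equals base only when candidate lies inside base.
    if os.path.commonpath([base, candidate]) != base:
        return None
    return candidate


inside = resolve_under("/srv/uploads", "report.txt")
outside = resolve_under("/srv/uploads", "../../etc/passwd")
absolute = resolve_under("/srv/uploads", "/etc/passwd")
```

A view would call this before open() and return a 404 when the result is None.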

DJANGO: How to get list of filenames in views.py on server? [duplicate]

I am able to access the static file in question via a direct url (localhost:8000/static/maps/foo.txt), so I guess I have it all working well. But I can't do the following: I want to open that text file in views.py. I'm working on a simple web browser adventure game and I wanted to store maps in static/maps and load them using f=open('/static/maps/' + mapname + '.txt', 'r'). I get an IOError: no such file or directory. I really don't understand it, because the directory clearly exists when I request it through the url.
Can it be done somehow?
You need to use the place they are stored on disk, which is probably in settings.STATIC_ROOT or settings.STATICFILES_DIRS, not the place they are being served by the web app.
Note however that if you are modifying these files programmatically, they aren't (by definition) static files. You'd be better off using the MEDIA_ROOT location. Also note that Django has helpers to do this sort of thing - see the documentation on Managing files.
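To make the difference between the URL and the path on disk concrete, here is a small sketch. The function load_map is hypothetical, and the temporary directory stands in for wherever the maps really live, for example a maps folder under settings.STATIC_ROOT or under one of the STATICFILES_DIRS entries.

```python
import os
import tempfile


def load_map(maps_dir, mapname):
    """Read a map file from its location on disk, not from its URL."""
    # '/static/maps/foo.txt' is a URL path; open() needs a filesystem path.
    path = os.path.join(maps_dir, mapname + ".txt")
    with open(path, "r") as f:
        return f.read()


# Stand-in for something like os.path.join(settings.STATIC_ROOT, "maps").
maps_dir = tempfile.mkdtemp()
with open(os.path.join(maps_dir, "foo.txt"), "w") as f:
    f.write("#####\n#..@#\n#####\n")

level = load_map(maps_dir, "foo")
```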

Django: create csv file and load it in view using Javascript

I'm developing a web application which creates visualizations of some data.
The data is taken from third parties, using their APIs, and imported in my database. The importation will be done sporadically, therefore my database will be pretty static.
The visualizations will be dynamically created in JavaScript, using d3.
When thinking about how to pass (and format) the data from the server to the client I thought I could export it to a .csv file and then load it from javascript (d3 has a builtin csv parser).
This way the csv file doubles as a caching system: it will be regenerated (and therefore the database queried) only if it is older than, say, a week.
My question is: where and how should I save the generated csv file? STATIC_ROOT, MEDIA_ROOT, another hardlinked directory?
Also, do you think the csv system is a good idea?
Sorry if the questions may seem useless, I literally picked up both django and d3 less than a week ago.
You can place the file in STATIC_ROOT; that would be a suitable location.
Two thoughts on the side:
Did you think about locking the csv file (with a mutex, for example) while it is being written? Or is it not a problem if a client gets half a CSV file when the request comes in at an unlucky moment?
CSV is not the standard way to transfer a data series to a JS client. I would probably write a JSON array to the file.
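Both points can be sketched together: instead of locking, the file can be written under a temporary name and renamed into place, so a client never reads a half-written file. This is one possible approach, not the only one; the helper names, the one-week constant and the sample data are assumptions made for the example.

```python
import json
import os
import tempfile
import time

MAX_AGE_SECONDS = 7 * 24 * 60 * 60  # "say, a week"


def is_stale(path, max_age=MAX_AGE_SECONDS):
    """True if the file is missing or older than max_age seconds."""
    try:
        return time.time() - os.path.getmtime(path) > max_age
    except OSError:
        return True


def write_json_atomically(path, data):
    """Write data to a temp file in the same directory, then rename it over path."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as f:
            json.dump(data, f)
        # On POSIX, os.replace is an atomic rename, so readers see either
        # the old complete file or the new complete file.
        os.replace(tmp_path, path)
    except BaseException:
        os.remove(tmp_path)
        raise


cache_path = os.path.join(tempfile.mkdtemp(), "series.json")
stale_before = is_stale(cache_path)
if stale_before:
    write_json_atomically(cache_path, [{"x": 1, "y": 2}])  # placeholder for the DB query
stale_after = is_stale(cache_path)
```

Note that mkstemp creates the file readable only by its owner, so an os.chmod call may be needed if a separate web server process has to read it.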
In Django, we usually store static files (files used by the website to render content, such as CSS and JS) under STATIC_ROOT. Files under MEDIA_ROOT are user-uploaded files, typically media such as images and videos, which Django leaves to the web server to serve. I would store the visualization data file under a data directory within my app (which goes under the main Django project directory). This article is a good resource on structuring your Django project.
As for using a CSV file as the data file that drives the visualization, I would prefer exporting your data as JSON, since JavaScript can consume it directly as native data structures. Also, I would assume decoding JSON in JavaScript would be faster than parsing CSV, although that depends on other parameters such as the size and structure of the data in the file.
