Why isn't Django serving staticfiles in production?

I am wondering why Django does not serve static files in production, when DEBUG = False.
STATICFILES_DIRS
We specify STATICFILES_DIRS to tell Django where to look for static files that are not tied to a particular app.
STATIC_ROOT
We specify STATIC_ROOT to tell Django where to store the files once we run python manage.py collectstatic, so every static file is stored in the path specified in STATIC_ROOT.
Assume that we set STATIC_ROOT = "staticfiles/".
This means that once we run the collectstatic command, all the files inside the STATICFILES_DIRS paths (along with each app's own static files) are copied into "staticfiles/".
STATIC_URL
Finally, we specify STATIC_URL as the URL prefix under which static files are served; for example, the URL we see in an HTML <link> tag is built from the STATIC_URL value.
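As a rough illustration of how the three settings fit together, a settings fragment might look like the sketch below. The project root /srv/myproject and the directory names "assets" and "staticfiles" are assumptions made up for this example; a real settings.py usually derives BASE_DIR from __file__.

```python
from pathlib import Path

# Hypothetical project root, hard-coded only to keep the sketch self-contained.
BASE_DIR = Path("/srv/myproject")

# Extra source directories that Django searches for static files.
STATICFILES_DIRS = [BASE_DIR / "assets"]

# Destination that `python manage.py collectstatic` copies everything into.
STATIC_ROOT = BASE_DIR / "staticfiles"

# URL prefix used when building links to static files in templates.
STATIC_URL = "/static/"
```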
When we upload our project to the server, we upload the entire project, every single file. Why can't Django serve static files itself when running on the server?
As I just said, we upload the entire folder, so the files we uploaded are there (and the static files too!).
QUESTIONS
I am just wondering: why do we have to configure static file serving on the server in production, when Django could do everything for us as it has always done on localhost?
Isn't loading the files from another storage much slower than loading them from the project's main folder?

I am just wondering: why do we have to configure static file serving on the server in production, when Django could do everything for us as it has always done on localhost?
Because it is likely inefficient and insecure. Each time a request is made, it passes through all the middleware, then the view produces a response that passes back through the middleware to the client. If you request the same file a second time, there will likely be no caching, so the whole process is repeated. A web server like Nginx or Apache will probably cache the result. With a CDN, the browser contacts the nearest server and thus gets these resources more efficiently.
Another problem is security. If a request specifies a path to a file that is not supposed to be served, the web server should prevent the browser from accessing that file. Some attackers, for example, try to access the application's source files to look for vulnerabilities. This should not be possible. A web server like Apache or Nginx will likely have more advanced security mechanisms in place for this.
If you really want to, you can use WhiteNoise to let Django serve static files in production. This Django application has been optimized for security and efficiency, although it is hard to tell whether it reaches the same level as an Apache or Nginx server. Note that WhiteNoise's documentation says it is not suitable for user-uploaded media files.
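For reference, enabling WhiteNoise is mostly a settings change. The fragment below is a minimal sketch, assuming the whitenoise package is installed and a Django version that supports the STORAGES setting (4.2 or later); the middleware list is shortened for illustration and a real project will have more entries.

```python
# settings.py fragment (sketch); assumes `pip install whitenoise`.
MIDDLEWARE = [
    "django.middleware.security.SecurityMiddleware",
    # WhiteNoise's documentation places it directly after SecurityMiddleware.
    "whitenoise.middleware.WhiteNoiseMiddleware",
    "django.contrib.sessions.middleware.SessionMiddleware",
    "django.middleware.common.CommonMiddleware",
]

# Optional: serve compressed files with hashed names for long-lived caching.
STORAGES = {
    "default": {"BACKEND": "django.core.files.storage.FileSystemStorage"},
    "staticfiles": {
        "BACKEND": "whitenoise.storage.CompressedManifestStaticFilesStorage",
    },
}
```

You still run python manage.py collectstatic as usual; WhiteNoise then serves the collected files from STATIC_ROOT.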
Isn't loading the files from another storage much slower than loading them from the project's main folder?
It is not the web server that contacts the other storage: the browser does that. So instead of your web server, the browser may contact a CDN. This can be slightly less efficient, since a web browser usually reuses the open connection to the server for further requests, but often the browser has already contacted that CDN, for example for JavaScript files. Furthermore, CDNs are optimized to deliver content as efficiently as possible: the browser will usually contact a server close to the client, and usually there is also load balancing and redundancy in place to make it less likely that the resource can no longer be served.

Related

Share media between multiple django(VMs) servers

We have deployed a django server (nginx/gunicorn/django) but to scale the server there are multiple instances of same django application running.
Here is the diagram (architecture):
Each blue rectangle is a Virtual Machine.
HAProxy sends all requests to example.com/admin to Server 3. Other requests are divided between Server 1 and Server 2 (load balancing).
Old Problem:
Each machine has a media folder, and when the admin uploads something, the uploaded media is only on Server 3 (normal users can't upload anything).
We solved this by sending all requests to example.com/media/* to Server 3, and nginx on Server 3 serves all static files and media.
Problem right now
We are also using sorl-thumbnail.
When a request comes in for example.com/, sorl-thumbnail tries to access the media file, but it doesn't exist on this machine because it's on Server 3.
So now all requests to that machine (Server 1 or 2) get a 404 for that media file.
One solution that comes to mind is to make a shared partition between all 3 machines and use it as the media folder.
Another solution is to sync all media folders after each upload, but this has a problem: we have almost 2000 requests per second, and sometimes the sync might not be fast enough, so sorl-thumbnail creates a database record for an empty file and a 404 happens.
Thanks in advance and sorry for long question.

You should use an object store to save and serve your user-uploaded files. django-storages makes the implementation really simple.
If you don’t want to use cloud-based AWS S3 or an equivalent, you can host your own on-prem S3-compatible object store with MinIO.
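A settings fragment for this could look roughly like the sketch below. It assumes django-storages 1.14 or later with its S3 backend and a Django version that supports STORAGES; the bucket name, the MinIO host, and the environment variable names are all hypothetical, and the option names should be checked against the django-storages documentation for your version.

```python
import os

STORAGES = {
    # User uploads (media) go to the object store instead of local disk,
    # so every application VM sees the same files.
    "default": {
        "BACKEND": "storages.backends.s3.S3Storage",
        "OPTIONS": {
            "bucket_name": "example-media",  # hypothetical bucket
            # Only needed for a self-hosted S3-compatible store such as MinIO.
            "endpoint_url": "http://minio.internal:9000",  # hypothetical host
            "access_key": os.environ.get("S3_ACCESS_KEY", ""),
            "secret_key": os.environ.get("S3_SECRET_KEY", ""),
        },
    },
    # Static files keep using Django's default local storage here.
    "staticfiles": {
        "BACKEND": "django.contrib.staticfiles.storage.StaticFilesStorage",
    },
}
```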
With your current setup I don’t see any easy fix if the number of VMs is dynamic depending on load.
If you have deployment automation, then maybe try rsync so that each VM takes care of syncing files with the other VMs.

Question: What was the problem?
We got 404s on the other machines because normal requests (requests asking for a template) would get a 404 Not Found for the thumbnail media.
The real problem was with the sorl-thumbnail template tags.
Here is what we ended up doing:
In models that needed a thumbnail, we added functions to create that specific thumbnail.
Then, using a post-save signal on the admin machine, we called all those functions to make sure all the thumbnails were created after save and the sorl-thumbnail table was filled.
Now, in templates, instead of calling the sorl-thumbnail template tags, we call a function on the model.

Deployment of django app - Scaling, Static Files, Servers

I am working on deploying my Django app on a Linux server. I have run into various problems and would like someone to clarify them for me. I searched for days, but the answers I found are either too broad or drift off topic.
1) According to the Django docs, it is inefficient to serve static files locally. Does that mean that all the static files, including HTML, CSS, and JS files, should be hosted on another server?
2) I have an AWS S3 bucket hosting all my media files. If the answer above is yes, can I use that same bucket (or create a new one) to host my static files? If so, is that efficient?
3) According to my searches, for Django to scale horizontally it should be a stateless app. Does that mean I also have to host my database somewhere other than my own Linux server?

1) It is completely fine to host your static files on the same server as your Django application; however, to serve those files you should use a web server such as Nginx or Apache. Django was not designed to serve static data in a production environment. Nginx and Apache, on the other hand, do a great job of it.
2) You can definitely host your static and media files inside an S3 bucket. This will scale a lot better than hosting them on a single server, as they're provided by a separate entity, meaning that no matter how many application servers you're running behind a load balancer, all of them will be requesting static files from the same source. To make it more efficient you can even configure AWS CloudFront, which is Amazon's CDN (content delivery network).
3) Ideally your database should be hosted on a separate server. Databases are heavy on resources, so hosting your database on the same server as your application may lead to slowdowns and sometimes outright crashes. Scaling horizontally, you'd be connecting a lot of application servers to a single database instance, effectively increasing the load on that server.
All of the points above are relative to your use case and resources. If the application you are running doesn't deal with heavy traffic - say a few hundred hits a day - and your server has an adequate amount of resources (RAM, CPU, storage) it's acceptable to run everything off a single box.
However, if you're planning to accept tens of thousands of connections every day, it's better to separate the responsibilities for optimum scalability. Not only does it make your application more efficient and responsive, but it also makes your life easier in the long run when you need to scale further (database clustering, nearline content delivery, etc.).
TL;DR: you can run everything off a single server if it's beefy enough but in the long run it'll make your life harder.
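For point 2, one possible shape of the settings is sketched below, keeping media and static files in the same bucket under different prefixes and serving static files through CloudFront. It assumes django-storages 1.14 or later; the bucket name and the CloudFront domain are made up, credentials are assumed to come from the environment or an IAM role, and the option names are worth double-checking against the django-storages documentation.

```python
STORAGES = {
    # Media uploads live under the "media/" prefix of the bucket.
    "default": {
        "BACKEND": "storages.backends.s3.S3Storage",
        "OPTIONS": {"bucket_name": "example-bucket", "location": "media"},
    },
    # collectstatic uploads static files under the "static/" prefix.
    "staticfiles": {
        "BACKEND": "storages.backends.s3.S3Storage",
        "OPTIONS": {
            "bucket_name": "example-bucket",
            "location": "static",
            # Hypothetical CloudFront distribution in front of the bucket.
            "custom_domain": "d1234example.cloudfront.net",
        },
    },
}
```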

How to load local static files if CDN fails? (django)

I am using Django to develop my site and I am trying to optimize it for speed, so I want to use a CDN for Bootstrap, and if it fails I want to use the copy from my server. I have seen
How to load local files if CDN is not working
but that does it in JavaScript, which doesn't solve my problem. I want to know
how to check whether the CDN is working from Django and, if not, serve the static files from my server?

Do not try to do this on the server side. CDN services are built to be reliable: they are geographically distributed, fault-tolerant, and use the best practices available.
You can't find out if the CDN servers work for your user by pinging them from your Django application. Your user is located elsewhere and might have very different network conditions, e.g. be using a mobile network connection from a different country, and have a network provider that experiences outages.
You could, indeed, ping the CDN servers, which would probably resolve into your Django application getting one CDN load balancer address and trying to see if that works for you or not, and falling back to others if the CDN source is down. Then you would probably have to check, for every resource you have, that is, every JavaScript and CSS file, whether it is available, and load a local backup if not. On the server side. This is very slow and error-prone. Networks can fail for a googolplex of different reasons.
The proper way to go about this is to
Only use local servers for serving those static files, distribute the load with application servers that each have their own versioned copies of your static files. If your application server works, it should have the copies available as well;
Do the checks on the client-side, because server side queries will slow your server down to a halt if it is not close to your CDN network, and you generally do not wish to depend on any external resources on the server side;
Or, as I would recommend, set up your own CDN which serves your local files from a proxied URL or subdomain. Read more below.
Ideally, if you wish to use a reliable CDN source, you would set up a CDN server with redundancy on the same infrastructure you use to host your files in.
Say your site is located at www.example.com, which is your Django application server address. You would set up a cdn.example.com domain backed by a CDN service, for example CloudFront or similar, that proxies requests to www.example.com/static/ and mirrors your static files as a CDN, taking the load off your application server. You can then simply configure your Django application to use the http://cdn.example.com/static address for serving static files. There are multiple different services for providing a CDN for your application; CloudFront is just one option. This will get your static, CDNable files near to your user.
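On the Django side this comes down to the STATIC_URL setting. The sketch below shows the setting and, as a rough stand-in for what the static template tag produces with the default storage, a hypothetical helper named static_url that joins the prefix and a file path.

```python
from urllib.parse import urljoin

# settings.py: point static URLs at the CDN subdomain from the example above.
STATIC_URL = "http://cdn.example.com/static/"


def static_url(path: str) -> str:
    """Hypothetical helper: join the STATIC_URL prefix with a relative path."""
    return urljoin(STATIC_URL, path.lstrip("/"))


print(static_url("css/site.css"))
# prints http://cdn.example.com/static/css/site.css
```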
Ideally, your application servers and CDN servers are hosted on the same, redundant infrastructure, and you can claim that if one part of your infrastructure works the others will as well, or otherwise your service provider is violating your SLA. You do NOT wish to use broken infrastructure and drive away your customers, or use hacks that will eventually break in production.

I don't know that there would be a good way of doing this, but this is the method I would use if the people who paid me really wanted me to make this work.
You could set up custom URL tags in a separate pluggable app and have them ping your CDN target, and then if that fails, serve a local URI. Admittedly, pinging the CDN target doesn't mean it will actually serve the file, so a more robust way would be to attempt to GET the file from the CDN provider and, if successful, send the remote URI, and if it fails, send the local URI. This would double the traffic of your static files for every request.
This also requires you to set up static file serving just like you would if you planned to serve everything from that server. I wouldn't recommend any of this. I would recommend doing what @ceejayoz says and just using a reliable CDN. Their whole purpose in life is to save you from doing any of this.
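Purely to make the idea concrete (and with the same caveat that it is not recommended), the core of such a tag could be sketched as below. The function names are hypothetical, the check uses a HEAD request rather than a full GET to limit traffic, and a real template tag would wrap choose_static_url and would need caching to avoid a network call on every render.

```python
import urllib.request
from typing import Callable


def cdn_is_reachable(url: str, timeout: float = 1.0) -> bool:
    """Return True if a HEAD request to `url` succeeds within `timeout`."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return 200 <= response.status < 300
    except OSError:  # URLError, HTTPError and socket timeouts are OSErrors
        return False


def choose_static_url(
    path: str,
    cdn_base: str,
    local_base: str,
    is_reachable: Callable[[str], bool] = cdn_is_reachable,
) -> str:
    """Return the CDN URL if the check passes, else the local URL."""
    cdn_url = cdn_base.rstrip("/") + "/" + path.lstrip("/")
    if is_reachable(cdn_url):
        return cdn_url
    return local_base.rstrip("/") + "/" + path.lstrip("/")
```

Note that the check runs from the server, so it says nothing about whether the CDN is reachable from the user's network.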

This is achievable, but the setup might be a little tedious.
Basically you are trying to do failover between the CDN and your origin server. If the CDN fails, the request fails over to your origin server. One option is to use DNS-level failover, with the primary record pointing to the CDN CNAME and the backup pointing to your origin server hostname.
You also include a health check in your DNS setup for both the CDN and the origin server. Once the health check fails for the CDN, DNS should fail over to your origin server and serve static files from there automatically.

Serving static files through MongoDB's GridFS

I'm fairly new to Django and I'm trying to deploy a small hobby project on OpenShift. I know that there are certain conventions that recommend against letting Django serve static and media files because it's inefficient. I've also noticed that Django refuses to serve media files when DEBUG is turned off. That's why I'm looking at better ways to serve this content.
Django's documentation endorses CDNs like Amazon S3 as one of the best ways to serve static files, but as a hobbyist I'd rather stick to freemium solutions for now. I found out that MongoDB, another technology I'm fairly new to, provides GridFS as a storage backend. I can get free MongoDB storage through MongoLab, so this is looking interesting to me.
Would such a construction work in practice or is this crazy talk? If this is feasible, which changes would I need to make to my OpenShift environment and Django settings to let GridFS serve static content? I've seen alternative setups where people use CloudFlare's free CDN to serve their static content, but then I wouldn't be able to upload/access media files from my local development environment.

To make a long story short: it is a bad idea.
Here is why: in order to serve static files, you would first need to process the request and get the data from GridFS, which actually scatters each file into 255 kB chunks that would have to be collected (how many depends on the size, of course), and only then could the file be returned.
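To get a feel for the numbers, the arithmetic can be sketched as follows, assuming GridFS's default chunk size of 255 kB (255 * 1024 bytes); the file sizes are arbitrary examples.

```python
import math

GRIDFS_DEFAULT_CHUNK_SIZE = 255 * 1024  # bytes, GridFS's default chunk size


def chunks_needed(file_size: int, chunk_size: int = GRIDFS_DEFAULT_CHUNK_SIZE) -> int:
    """Number of chunk documents a file of `file_size` bytes is split into."""
    return math.ceil(file_size / chunk_size)


for size in (50_000, 1_000_000, 5_000_000):
    print(f"{size} bytes -> {chunks_needed(size)} chunk(s)")
```

A small stylesheet fits in one chunk, but every chunk of a larger file has to be read and reassembled before the response can be sent.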
What I tend to do is to use Varnish to cache the static files served by the application, be it Django or a Servlet container. This works like this:
All requests are sent to Varnish, which either serves the requested resource from cache or hands the request to a backend.
Varnish uses your Django app as a backend. I usually have Django run behind an additional lighttpd, though.
When a static file is returned by Django, Varnish puts this file into an in-memory caching area. I usually have some 100M allocated for this, sometimes even less. The size of your static files should be well known to you. I tend to do a simple caching configuration like "cache all files ending with .css, .js, .png".
All subsequent requests for this static resource will be served by Varnish now, from memory, via the sendfile system call, not even hitting the backend.
All in all: this way, load is taken from your application, latency is reduced, the resources are delivered lightning fast, and it is easy to set up without any programming effort.
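The flow above can be mimicked in a few lines of Python. This is only a toy stand-in to show the logic, not Varnish configuration (which is written in VCL); the class name and the backend callable are made up for the illustration.

```python
from typing import Callable, Dict

# The "cache all files ending with .css, .js, .png" rule from above.
CACHEABLE_SUFFIXES = (".css", ".js", ".png")


class TinyStaticCache:
    """Toy in-memory cache sitting in front of a backend callable."""

    def __init__(self, backend: Callable[[str], bytes]) -> None:
        self.backend = backend
        self.store: Dict[str, bytes] = {}
        self.backend_hits = 0

    def get(self, path: str) -> bytes:
        # Serve from memory when possible, without touching the backend.
        if path in self.store:
            return self.store[path]
        self.backend_hits += 1
        body = self.backend(path)
        # Only static-looking paths are kept; dynamic pages always pass through.
        if path.endswith(CACHEABLE_SUFFIXES):
            self.store[path] = body
        return body
```

With this in place, repeated requests for /static/site.css reach the backend once, while every request for a dynamic page still does.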
Edit: as for an OpenShift environment: simply leave it as is. Serving static files from MongoDB simply does not make sense.

How to serve django templates to javascript

I am using Django on the server side and obviously JavaScript on the client side. Now I want to use the plate template engine on the client.
What's the best way to serve Django templates to the client? We thought of some ways of doing that.
Create a view that serves the raw templates.
probably not the best method
Copy the needed templates to the static folder.
this could be done with a custom static files finder
the browser is able to cache the templates
Provide the templates using a template tag which puts the raw template into a JavaScript variable.
templates received this way cannot be cached separately
Is there a Django app out there that makes this easier?
The reason I need the templates on the client is that I want to use the same templates on the server and the client side. When the page is first loaded, the full template is rendered on the server; when navigating through the application, only the needed data gets loaded and the page change is done using pushState.

If you need to be able to have A) dynamically generated plate templates, or B) dynamically created plate templates (e.g., entered into the DB via the admin, etc.), you'll want to go with option 1 (not a bad thing: Django is made for serving text content, so as long as you need it to be dynamic, there's no problem doing it). Option 3 is a bad choice, because it means that a browser can't cache the static resource (if it's output into each page), unless you need different plate templates for each page, of course.
If you don't need A or B from above, I'd just stick the templates in your static dir, as you mentioned (e.g., collectstatic or simply add them to your repo, if they're a part of your app).
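If you would rather copy the templates than keep them in the static directory by hand, a small build step run before collectstatic is one way to do it. The sketch below is a hypothetical helper, not part of Django; the function name and the "templates" subfolder are assumptions.

```python
import shutil
from pathlib import Path
from typing import Iterable, List


def export_templates(
    template_dir: Path, static_dir: Path, names: Iterable[str]
) -> List[Path]:
    """Copy the named template files into <static_dir>/templates/."""
    copied = []
    for name in names:
        target = static_dir / "templates" / name
        # Create intermediate folders so nested template names work too.
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copyfile(template_dir / name, target)
        copied.append(target)
    return copied
```

The copied files are then picked up like any other static file, so the browser can cache them separately from the pages.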
Regarding an app that makes this easy: you could look at Django Chunks (output a static chunk into a place in the page, like `{% chunk "header-snippet" %}`), but I don't think you need that.
