I have a Flask (Python) application running on Passenger which works absolutely fine when I test it and I'm the only user.
But as soon as I attempt to have several concurrent connections, each client waits forever for a response. I have tried it with 50 concurrent users, which seems like it should easily be supported.
The app is very simple, reading and writing to a SQLite database once or twice. (Concurrent access of SQLite by this small number of users is not a problem.)
What am I missing?
The Passenger docs make the following suggestion:
Passenger also supports the magic file 'tmp/always_restart.txt'. If
this file exists, Passenger will restart your application after every
request. This way you do not have to invoke the restart command often.
Activate this mechanism by creating the file:
$ mkdir -p tmp
$ touch tmp/always_restart.txt
This is great for development because it means you only need to save your Python files for the latest version of the app to be available to clients.
But it's terrible for production, because every client request restarts the Python app. That is a significant overhead for the server, so users are likely to time out before receiving a response.
Delete the file tmp/always_restart.txt and your concurrency limits will shoot up.
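For example, from the application root:
$ rm tmp/always_restart.txt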
Related
I want to create a web form that stays on forever on a single computer. Users can come to the computer, fill out the form, and submit it. After submitting, it will record the responses in an Excel file and send emails. The next user can then come along and fill out a new form automatically. I was planning on using Flask for this task since it is simple to create, but since I am not doing this on a production server, I will just have it running locally, in development mode, on a single computer.
I have never seen anyone do something like this with Flask, so I was wondering if my idea is possible, or if I should avoid it. I am also new to web development, so I was wondering what problems there could be with keeping a Flask application running 24/7 on a local development computer.
Thanks
There is nothing wrong with doing this in principle; however, it is likely not the best solution in terms of time-to-reward payoff.
First, to answer your question: yes, this could easily be done, even by a beginner. Completing it in a few hours with minimal Python and HTML experience is entirely realistic. Your app could crash in the background for many reasons (running out of disk space, memory errors, etc.), but most likely you will be fine.
As for actually building it, it is all possible. There are libraries you can use to write the results to an Excel file, or you can simply append to a CSV (which is what I would recommend). Creating and sending an email is similarly straightforward, though again, a solution outside Python would be easier.
If you are not set on Flask/Python, you could check out Google Forms; but if you are set on Python, or want to use this as a learning experience, it can definitely be done.
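As a rough sketch of the CSV approach (the form fields, file names, and SMTP details below are all hypothetical placeholders you would replace with your own):

import csv
import smtplib
from email.message import EmailMessage
from flask import Flask, request

app = Flask(__name__)

@app.route("/", methods=["GET", "POST"])
def form():
    if request.method == "POST":
        # Append the submitted fields to a CSV file.
        with open("responses.csv", "a", newline="") as f:
            csv.writer(f).writerow([request.form["name"], request.form["answer"]])
        # Notify someone by email (SMTP host and addresses are placeholders).
        msg = EmailMessage()
        msg["Subject"] = "New form submission"
        msg["From"] = "forms@example.com"
        msg["To"] = "you@example.com"
        msg.set_content("A new response was recorded.")
        with smtplib.SMTP("smtp.example.com") as server:
            server.send_message(msg)
        return "Thanks! The next person can now fill out a new form."
    return '<form method="post"><input name="name"><input name="answer"><input type="submit"></form>'

if __name__ == "__main__":
    app.run(host="127.0.0.1", port=5000)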
Your idea is possible and while there are many ways to do this kind of thing, what you are suggesting is not necessarily to be avoided.
All apps that run on a computer over a long period of time start a process and keep it going until closed. That is essentially what you are doing.
Having done this myself (and still currently doing it) at my business, I can say that it works great.
The only caveat is that, to ensure the app is always available, you need to have the process monitored by some tool that restarts it if it ever dies, for whatever reason.
On Linux, supervisor is a great tool for doing that. On Windows, you could register the app as a service. But you could also just create an easy way to restart it, so the user can bring it back up themselves if it is down when they need it.
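For reference, a minimal supervisor program entry might look like the following; the program name and paths are placeholders:

[program:formapp]
command=/usr/bin/python3 /opt/formapp/app.py
directory=/opt/formapp
autostart=true
autorestart=true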
Yes, this could be done. It's very similar to the applications that run on the servers in data centers.
To keep the application running forever, or to restart it after your system reboots, you'll need a service manager similar to systemd on Unix. On Windows, you could use NSSM (the Non-Sucking Service Manager) or Service Control (sc.exe) to monitor your application and restart it if it crashes. The service will also have to be enabled on startup.
Other than this, you could use Waitress to serve your Flask application. Waitress is a WSGI server with which you can easily configure the number of threads, so it can serve multiple users at the same time.
In a production environment, it's always suggested to use a production-grade WSGI server like Gunicorn or Waitress rather than the built-in development server.
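A minimal sketch of serving an existing Flask app with Waitress (the module and attribute names are placeholders for your own):

from waitress import serve
from myapp import app  # your Flask application object

# Listen on all interfaces with a pool of 8 worker threads.
serve(app, host="0.0.0.0", port=8080, threads=8)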
I am hoping to gain a basic understanding of scheduled task processes and why things like Celery are recommended for Flask.
My situation is a web-based tool which generates spreadsheets based on user input. I save those spreadsheets to a temp directory, and when the user clicks the "download" button, I use Flask's "send_from_directory" function to serve the file as an attachment. I need a background service to run every 15 minutes or so to clear the temp directory of all files older than 15 minutes.
My initial plan was a basic Python script running in a while True loop, but I did some research into what people normally do, and everything recommends Celery or other task managers. I looked into Celery and found that I would also need to learn about Redis, and apparently host Redis in a Unix environment. This is a lot of trouble for a script that just deletes files every 15 minutes.
I'm developing my Flask app locally in Windows with the built-in development server and deploying to a virtual machine on company intranet with IIS. I'm learning as I go, so please explain why this much machinery is needed to regularly call a script that simply deletes things. It seems like a vast overcomplication, but as I said, I'm trying to learn as I go so I want to do/learn it correctly.
Thanks!
You wouldn't use Celery or Redis for this. A cron job would be perfectly appropriate.
Celery is for jobs that need to run asynchronously, but in response to events in the main server processes. For example, if a sign-up form requires sending an email notification, that would be scheduled and run via Celery so as not to block the main web response.
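For the cleanup itself, a short script plus a scheduler entry is enough. A sketch, with the temp directory path as a placeholder:

import time
from pathlib import Path

TEMP_DIR = Path("/srv/myapp/tmp")  # placeholder path
MAX_AGE = 15 * 60  # 15 minutes, in seconds

now = time.time()
for f in TEMP_DIR.iterdir():
    if f.is_file() and now - f.stat().st_mtime > MAX_AGE:
        f.unlink()

On a Unix host you would schedule it with a crontab line such as */15 * * * * /usr/bin/python3 /path/to/cleanup.py; since you are deploying on Windows with IIS, Task Scheduler plays the same role.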
We are developing a B2B application with Django. For each client, we launch a new virtual server machine and a database, so each client has a separate installation of our application. (We do this because, by the nature of our application, one client may require heavy use of resources at certain times, and we do not want one client's state to affect the others.)
Each of these installations is bound to a central repository. When we update the application code and push to the master branch, all installations detect this, pull the latest version of the code, and restart the application.
If, on the other hand, we update the database schema, we currently need to run the migrations manually by connecting to each DB instance one by one (the settings.py file reads the database settings from an external file which is not in the repo; we add this file manually upon installation).
Can we automate this process? i.e. given a list of databases, is it possible to run migrations on these databases with a single command?
If we update the application code, when we push to the master branch,
all installations detect this, pull the latest version of the code and
restart the application.
I assume that you have some sort of automation to pull the code and restart the web server. You can just add the migration step to this automation process. Each server's settings.py would read the database details from the external file and run the migration for you.
So the flow should be something like this (a sketch follows the list):
Pull the code
Migrate
Collect Static
Restart the web server
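A minimal sketch of that flow as a script each installation could run when it detects a push (the service name is a placeholder):

import subprocess

def run(cmd):
    # Run one deployment step, aborting the deploy if it fails.
    subprocess.run(cmd, check=True)

run(["git", "pull", "origin", "master"])
run(["python", "manage.py", "migrate", "--noinput"])
run(["python", "manage.py", "collectstatic", "--noinput"])
run(["sudo", "systemctl", "restart", "myapp"])  # placeholder service name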
First, I'd look (very hard) for a way to launch a script on the client side that does what masnun suggests. Really hard.
Second, if that does not work, then I'd try the following:
Configure all the client databases in the DATABASES settings variable on your local machine
Make sure you can connect to all the client databases, this may need some fiddling
Then run the "manage.py migrate" command with the extra flag --database=mydatabase (where "mydatabase" is the alias provided in the configuration), once for EACH client database
I have not tried this, but I don't see why it wouldn't work ...
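As a sketch of that loop, run as a standalone script (the settings module name is a placeholder; inside a custom management command you could drop the setup boilerplate):

import os
import django

os.environ.setdefault("DJANGO_SETTINGS_MODULE", "myproject.settings")
django.setup()

from django.conf import settings
from django.core.management import call_command

# Run migrations against every configured database alias in turn.
for alias in settings.DATABASES:
    call_command("migrate", database=alias, interactive=False)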
I noticed that ./manage.py runserver automatically reloads when my views.py file is changed. How does the underlying code that drives it work?
Automatic Django Server Restart:
Django polls the modification timestamps of every loaded file once per second. If any of them has changed, it restarts the server.
However, adding a new file does not trigger a restart, so in that scenario you will have to restart the server yourself.
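Conceptually, the polling loop looks something like the simplified sketch below. The real implementation lives in django.utils.autoreload and runs as a parent/child process pair; this only illustrates the idea:

import os
import sys
import time

def watched_files():
    # Every imported module with a source file on disk is watched.
    return [m.__file__ for m in list(sys.modules.values())
            if getattr(m, "__file__", None)]

mtimes = {}
while True:
    for path in watched_files():
        try:
            mtime = os.stat(path).st_mtime
        except OSError:
            continue  # the file vanished between polls
        if path in mtimes and mtime != mtimes[path]:
            # Exit with a special code; the parent process restarts the child.
            sys.exit(3)
        mtimes[path] = mtime
    time.sleep(1)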
Exception: If you are using Linux and install pyinotify, kernel signals will be used to autoreload the server.
As per the Django docs:
If you are using Linux and install pyinotify, kernel signals will be
used to autoreload the server (rather than polling file modification
timestamps each second). This offers better scaling to large projects,
reduction in response time to code modification, more robust change
detection, and battery usage reduction.
System Checks performed while restarting the server:
System check framework is used to perform checks on Django projects.
The system check framework is a set of static checks for validating
Django projects. It detects common problems and provides hints for how
to fix them.
When you start the server, and each time you change Python code while the server is running, the system check framework checks your entire Django project for some common errors. If any errors are found, they are printed to standard output.
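You can also run the same checks on demand, without starting the server:
$ python manage.py check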
So I have a Django app that occasionally sends a task to Celery for asynchronous execution. I've found that as I work on my code in development, the Django development server knows how to automatically detect when code has changed and then restart the server so I can see my changes. However, the RabbitMQ/Celery section of my app doesn't pick up on these sorts of changes in development. If I change code that will later be run in a Celery task, Celery will still keep running the old version of the code. The only way I can get it to pick up on the change is to:
stop the Celery worker
stop RabbitMQ
reset RabbitMQ
start RabbitMQ
add the user to RabbitMQ that my Django app is configured to use
set appropriate permissions for this user
restart the Celery worker
This seems like a far more drastic approach than I should have to take, however. Is there a more lightweight approach I can use?
I've found that as I work on my code in development, the Django
development server knows how to automatically detect when code has
changed and then restart the server so I can see my changes. However,
the RabbitMQ/Celery section of my app doesn't pick up on these sorts
of changes in development.
What you've described here is exactly correct and expected. Keep in mind that Python will use a module cache, so you WILL need to restart the Python interpreter before you can use the new code.
The question is "Why doesn't Celery pick up the new version", but this is how most libraries will work. The Django development server, however, is an exception. It has special code that helps it automatically reload Python code as necessary. It basically restarts the web server without you needing to restart the web server.
Note that when you run Django in production, you probably WILL have to restart/reload your server (since you won't be using the development server in production, and most production servers don't try to take on the hassle of implementing a problematic feature of detecting file changes and auto-reloading the server).
Finally, you shouldn't need to restart RabbitMQ. You should only have to restart the Celery worker to use the new version of the Python code. You might have to clear the queue if the new version of the code is changing the data in the message, however. For example, the Celery worker might be receiving version 1 of the message when it is expecting to receive version 2.
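In development, one lightweight option is to let a file watcher restart just the worker for you. For example, with the watchdog package installed (the project name here is a placeholder, and flags vary slightly between watchdog versions):
$ watchmedo auto-restart -- celery -A proj worker -l INFO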