How can I disable threading in Flask? - python

I have the following bit of code:
import sqlite3
from flask import Flask
app = Flask(__name__)
db = sqlite3.connect('/etc/db.sqlite')
@app.route('/')
def handle():
    # run a query and return a response
if __name__ == '__main__':
    app.run('0.0.0.0', 8080, debug=True)
However, when I try to perform some operations on the database object in the request handler, I get the following exception from sqlite3 because it is not a thread-safe library and the query is run from a different thread that Flask spawns, and not from the main thread:
sqlite3.ProgrammingError: SQLite objects created in a thread can only be used in that same thread. The object was created in thread id 139886422697792 and this is thread id 139886332843776
I am aware that the "proper" way to do this is to have a function to create an instance of the sqlite3.Connection object and store it in the Flask g global, as outlined here: http://flask.pocoo.org/docs/1.0/patterns/sqlite3/. However, when running this application in production, I use gunicorn -w 4 -b 0.0.0.0:8080 app:app, and there it works fine, because the threads are spawned at the beginning in this case.
While the Flask g global method works in all cases, I would really like to avoid the overhead of creating and destroying sqlite3.Connection objects with every request. So, I would like to disable threading in Flask so that the above code can run without causing issues.
However, even when I change the last line of the above code to app.run(..., threaded=False), I am unable to avoid this error. It seems that Flask still spawns a thread for handling requests.
So, how can I disable threading with Flask?
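The thread check behind this error can be reproduced without Flask. The following is a minimal sketch using only the standard library; the helper name use_from_other_thread and the in-memory database are assumptions made for illustration. It shows that a default connection rejects use from another thread, while a connection opened with check_same_thread=False does not. With the check disabled, serialising writes (for example with a lock) becomes the caller's responsibility.

```python
import sqlite3
import threading


def use_from_other_thread(conn):
    """Run a query on `conn` from a new thread; return the exception or None."""
    outcome = []

    def worker():
        try:
            conn.execute("SELECT 1").fetchone()
            outcome.append(None)
        except sqlite3.ProgrammingError as exc:
            outcome.append(exc)

    t = threading.Thread(target=worker)
    t.start()
    t.join()
    return outcome[0]


# Default connection: only usable from the thread that created it.
strict = sqlite3.connect(":memory:")
strict_error = use_from_other_thread(strict)

# check_same_thread=False turns the check off for this connection.
shared = sqlite3.connect(":memory:", check_same_thread=False)
shared_error = use_from_other_thread(shared)

print(type(strict_error).__name__, shared_error)
```

This only removes the check; it does not make concurrent writes through one connection safe by itself.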

Don't use the sqlite3 module directly in Flask. Use Flask-SQLAlchemy instead.
I had a lot of trouble setting up SQLite databases without it. Once I made the switch, it was much easier. You can connect to several types of SQL database, too.
Flask-SQLAlchemy:
http://flask-sqlalchemy.pocoo.org/2.3/
Really the best Flask guide out there:
https://blog.miguelgrinberg.com/post/the-flask-mega-tutorial-part-iv-database

Use scoped_session to avoid this:
session = scoped_session(sessionmaker(bind=engine))()
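scoped_session hands each thread its own session. For a raw sqlite3 connection, the same idea can be sketched with threading.local, so that each thread opens its connection once and reuses it across requests instead of creating one per request. This is a sketch under assumptions: get_conn and the default ":memory:" path are hypothetical names chosen for illustration, and with ":memory:" every thread gets its own separate database.

```python
import sqlite3
import threading

_local = threading.local()  # attributes set here are visible only to the setting thread


def get_conn(path=":memory:"):
    """Return the calling thread's connection, opening it on first use."""
    conn = getattr(_local, "conn", None)
    if conn is None:
        conn = sqlite3.connect(path)
        _local.conn = conn
    return conn


# Repeated calls in one thread reuse the same connection object.
first = get_conn()
second = get_conn()
print(first is second)
```

Because each connection is only ever used by the thread that created it, the sqlite3 same-thread check is never triggered.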

Related

Why child threads cannot access the current_user variable in flask_login?

I am writing a Flask application and I am trying to insert a multi-threaded implementation for certain server related features. I noticed this weird behavior so I wanted to understand why is it happening and how to solve it. I have the following code:
from flask import Blueprint
from flask_login import current_user, login_required
import threading
posts = Blueprint('posts', __name__)
@posts.route("/foo")
@login_required
def foo():
    print(current_user)
    thread = threading.Thread(target=goo)
    thread.start()
    thread.join()
    return
def goo():
    print(current_user)
    # ...
The request thread correctly prints the current_user, while the child thread prints None.
User('Username1', 'email1@email.com', 'Username1-ProfilePic.jpg')
None
Why is this happening? How can I obtain the current_user in the child thread as well? I tried passing it as an argument to goo, but I still get the same behavior.
I found this post but I can't understand how to ensure the context is not changing in this situation, so I tried providing a simpler example.
A partially working workaround
I also tried passing, as a parameter, a freshly loaded User object populated with the data from current_user:
def foo():
    # ...
    user = User.query.filter_by(username=current_user.username).first_or_404()
    thread = threading.Thread(target=goo, args=[user])
    # ...
def goo(user):
    print(user)
    # ...
And it correctly prints the information of the current user. But since inside goo I am also performing database operations I get the following error:
RuntimeError: No application found. Either work inside a view function
or push an application context. See
http://flask-sqlalchemy.pocoo.org/contexts/.
So as I suspected I assume it's a problem of context.
I tried also inserting this inside goo as suggested by the error:
def goo():
    from myapp import create_app
    app = create_app()
    app.app_context().push()
    # ... database access
But I still get the same errors and if I try to print current_user I get None.
How can I pass the old context to the new thread? Or should I create a new one?
This is because Flask uses thread-local variables to store this state for each request's thread. That simplifies many cases, but it makes multiple threads hard to use. See https://flask.palletsprojects.com/en/1.1.x/design/#thread-local.
If you want to use multiple threads to handle a single request, Flask might not be the best choice. You can always interact with Flask exclusively on the initial thread if you want and then forward anything you need on other threads back and forth yourself through a shared object of some kind. For database access on secondary threads, you can use a thread-safe database library with multiple threads as long as Flask isn't involved in its usage.
In summary, treat Flask as single threaded. Any extra threads shouldn't interact directly with Flask to avoid problems. You can also consider either not using threads at all and run everything sequentially or trying e.g. Tornado and asyncio for easier concurrency with coroutines depending on the needs.
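A likely reason that passing current_user as an argument did not help is that current_user is a proxy. The proxy is resolved in whichever thread touches it, not in the thread that passed it. The sketch below imitates this with the standard library only: UserProxy and _ctx are hypothetical stand-ins, not Flask or flask_login code. Werkzeug's LocalProxy does expose a _get_current_object() method, which is what the stand-in mirrors.

```python
import threading

_ctx = threading.local()  # stand-in for the per-request context


class UserProxy:
    """Hypothetical proxy: looks the user up in the thread that touches it."""

    def _get_current_object(self):
        return getattr(_ctx, "user", None)

    def __repr__(self):
        return repr(self._get_current_object())


current_user = UserProxy()


def describe_in_thread(value):
    """Return repr(value) as evaluated inside a freshly started thread."""
    out = []
    t = threading.Thread(target=lambda: out.append(repr(value)))
    t.start()
    t.join()
    return out[0]


_ctx.user = "Username1"  # what the request thread sees

# Passing the proxy: the lookup happens in the child thread, which has no user.
via_proxy = describe_in_thread(current_user)
# Passing the resolved object: the lookup happens here, before the thread starts.
via_object = describe_in_thread(current_user._get_current_object())

print(via_proxy, via_object)
```

In this sketch the first call yields the string None and the second yields the user, which matches the advice above to hand plain values to secondary threads yourself.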
Your server serves multiple users, each of which is handled in its own thread.
flask_login was not designed for extra threads inside a request; that's why the child thread prints None.
I suggest using the database to pass values between users, and running an additional Docker container if you need a separate process.
That is because current_user is implemented as a context-local proxy:
https://github.com/maxcountryman/flask-login/blob/main/flask_login/utils.py#L26
Read:
https://werkzeug.palletsprojects.com/en/1.0.x/local/#module-werkzeug.local

Where should DB connection pool be initialized in Flask?

I would like to use psycopg2 (directly, without SQLAlchemy). Also, I would prefer using a connection pool to avoid initializing database connections on every request, as opposed to what (I think?) official docs recommend.
However, Flask app context has approximately the same lifetime as request context, which is the lifetime of the request, so defining the pool there would not make sense. The only cross-request place I found is in a global variable on module level, which seems to work, but I'm worried if this is safe?
In other words, where is the correct place to initialize a DB connection pool in a Flask application so that it is used across requests?
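One common arrangement is to create the pool once at module import time and borrow a connection per request. The following is a minimal sketch of that shape, not psycopg2 code: SimplePool and handle_request are hypothetical names, and sqlite3 stands in for psycopg2 so the example is self-contained. psycopg2 ships its own pools in psycopg2.pool (for example ThreadedConnectionPool), which would take the place of SimplePool.

```python
import queue
import sqlite3
from contextlib import contextmanager


class SimplePool:
    """Hypothetical fixed-size pool; sqlite3 stands in for psycopg2 here."""

    def __init__(self, size, connect):
        self._idle = queue.Queue(maxsize=size)
        for _ in range(size):
            self._idle.put(connect())

    @contextmanager
    def connection(self):
        conn = self._idle.get()  # blocks while every connection is borrowed
        try:
            yield conn
        finally:
            self._idle.put(conn)  # hand it back even if the caller raised


# Created once at import time and shared by every request in this process.
pool = SimplePool(2, lambda: sqlite3.connect(":memory:", check_same_thread=False))


def handle_request():
    """Borrow a connection for the duration of one request."""
    with pool.connection() as conn:
        return conn.execute("SELECT 1 + 1").fetchone()[0]


print(handle_request())
```

A module-level pool lives per process, so a server started with several worker processes ends up with one pool per worker.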

Triggering connection pools with sqlalchemy in flask

I am using Flask + SQLAlchemy (DB is Postgres) for my server, and am wondering how connection pooling happens. I know that it is enabled by default with a pool size of 5, but I don't know if my code triggers it.
Assuming I use the default Flask-SQLAlchemy bridge:
db = SQLAlchemy(app)
And then use that object to place database calls like
db.session.query(......)
How does flask-sqlalchemy manage the connection pool behind the scene? Does it grab a new session every time I access db.session? When is this object returned to the pool (assuming I don't store it in a local variable)?
What is the correct pattern to write code to maximize concurrency + performance? If I access the DB multiple times in one serial method, is it a good idea to use db.session every time?
I was unable to find documentation on this matter, so I don't know what is happening behind the scenes (the code works, but will it scale?)
Thanks!
You can use event registration - http://docs.sqlalchemy.org/en/latest/core/event.html#event-registration
There are many different event types that can be monitored, checkout, checkin, connect etc... - http://docs.sqlalchemy.org/en/latest/core/events.html
Here is a basic example from the docs that prints a message when a new connection is established.
from sqlalchemy.event import listen
from sqlalchemy.pool import Pool
def my_on_connect(dbapi_con, connection_record):
    print("New DBAPI connection:", dbapi_con)
listen(Pool, 'connect', my_on_connect)

Enable multithreading of my web app using python Bottle framework

I have a web app written with the Bottle framework. It has a global dict, somedict, that is accessed by multiple HTTP queries.
After some research, I found that the Bottle framework only supports running my app with one thread in one process. (I'm not convinced this is true; perhaps migrating to another framework such as Flask would be a good idea.)
1. To enable multi-threading, I found a WSGI solution, but it does not let multiple processes (one thread per process) access a global variable like somedict, because each process re-initializes it whenever a query is handled. How can I handle this issue?
2. Is there any solution other than WSGI that lets this app serve multiple HTTP queries at once?
from bottle import request, route
import threading
somedict = {}
somedict_lock = threading.Lock()
@route("/read")
def read():
    with somedict_lock:
        return somedict
@route("/write", method="POST")
def write():
    with somedict_lock:
        somedict[request.forms.get("key1")] = request.forms.get("value1")
        somedict[request.forms.get("key2")] = request.forms.get("value2")
It's best to serve a WSGI app via a server like gunicorn or waitress, which will handle your concurrency needs, but almost no matter what you do for concurrency your global queue in memory will not work the way you want it to. You need to use an external memory store like memcached, redis, etc. Static data is one thing, but mutable state should never be shared between web app processes. That's contrary to Python web server idioms and the typical execution model of Python web apps.
I'm not saying it's literally impossible to do in Python, but it's not the way Python solves this problem.
You can also process incoming requests asynchronously. Celery currently seems well suited to running asynchronous tasks; read up on how it does this.

What's the recommended scoped_session usage pattern in a multithreaded sqlalchemy webapp?

I'm writing an application with python and sqlalchemy-0.7. It starts by initializing the sqlalchemy orm (using declarative) and then it starts a multithreaded web server - I'm currently using web.py for rapid prototyping but that could change in the future. I will also add other "threads" for scheduled jobs and so on, probably using other python threads.
From SA documentation I understand I have to use scoped_session() to get a thread-local session, so my web.py app should end up looking something like:
import web
from myapp.model import Session # scoped_session(sessionmaker(bind=engine))
from myapp.model import This, That, AndSoOn
urls = blah...
app = web.application(urls, globals())
class index:
    def GET(self):
        s = Session()
        # get stuff done
        Session.remove()  # remove() belongs to the scoped_session registry, not to the session it returns
        return(stuff)
class foo:
    def GET(self):
        s = Session()
        # get stuff done
        Session.remove()
        return(stuff)
Is that the Right Way to handle the session?
As far as I understand, I should get a scoped_session at every method since it'll give me a thread local session that I could not obtain beforehand (like at the module level).
Also, I should call .remove() or .commit() or something like them at every method end, otherwise the session will still contain Persistent objects and I would not be able to query/access the same objects in other threads?
If that pattern is the correct one, it could probably be made better by writing it only once, maybe using a decorator? Such a decorator could get the session, invoke the method and then make sure to dispose the session properly. How would that pass the session to the decorated function?
Yes, this is the right way.
Example:
The Flask microframework with Flask-sqlalchemy extension does what you described. It also does .remove() automatically at the end of each HTTP request ("view" functions), so the session is released by the current thread. Calling just .commit() is not sufficient, you should use .remove().
When not using Flask views, I usually use a "with" statement:
from contextlib import contextmanager
@contextmanager
def get_db_session():
    try:
        yield session
    finally:
        session.remove()
with get_db_session() as session:
    # do something with session
You can create a similar decorator.
The engine behind a scoped session keeps a pool of DBMS connections, so this approach is faster than opening and closing a connection on each HTTP request. It also works nicely with greenlets (gevent or eventlet).
You don't need a scoped session if you create a new session for each request and each request is handled by a single thread.
You have to call s.commit() to make pending objects persistent, i.e. to save changes to the database.
You may also want to close the session by calling s.close().
