I'm using Flask to build my web application, and I'd like to register a global resource that represents a connection to a remote service which lasts longer than a request (in this case, the connection is a SOAP connection which can be valid for up to 30 days).
Another example could be a database like MongoDB which handles connection-pooling in the driver, and would perform badly if you created a new connection on each request.
Neither the Application Context, nor the Request Context seem appropriate for this task.
The question "Pass another object to the main flask application" suggests that we store such resources on the app.config dictionary.
If it MUST coincide with the instantiation of your app, then you should subclass Flask. This doesn't accomplish much if all you're doing is attaching a resource to the object, since creating the app is already a multi-step process. In truth, you probably don't need to do this unless the application has to use the resource during instantiation.
class MyApp(Flask):
    def __init__(self, *args, **kwargs):
        super(MyApp, self).__init__(*args, **kwargs)  # note: super(MyApp, ...), not super(Flask, ...)
        self.some_resource = SomeResource()

app = MyApp(__name__)
app.some_resource.do_something()
Unless you have some specific use case, you are probably better off writing a wrapper class, turning it into a Flask extension, creating it at the module level, and storing it on app.extensions.
class MyExtension(object):
    def __init__(self, app=None):
        self.app = app
        if app is not None:
            self.init_app(app)

    def init_app(self, app):
        app.extensions['my_extension'] = SomeResource()

app = Flask(__name__)
my_extension = MyExtension(app)
Then you can choose whether to give each app its own resource (as above), or to use a shared resource that always resolves against the current app:
from flask import _request_ctx_stack

try:
    from flask import _app_ctx_stack
except ImportError:  # older Flask versions only have the request context stack
    _app_ctx_stack = None

stack = _app_ctx_stack or _request_ctx_stack

class SomeResource(object):
    def do_something(self):
        ctx = stack.top   # the active application (or request) context
        app = ctx.app     # the Flask app handling the current request
        # ... use `app` to look up per-app state, e.g. app.extensions['my_extension']

my_resource = SomeResource()
my_resource.do_something()
I don't think you want to store this on the application context, because "it starts when the Flask object is instantiated, and it implicitly ends when the first request comes in".
Instead, you want your resource created at the module level. Then you can attach it to the application as an extension, or yes, even in the config, though it would be more consistent and logical to make a quick extension out of it.
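For completeness, here is a minimal sketch of the config variant mentioned above (SomeResource is a placeholder for your long-lived connection object, and the 'SOME_RESOURCE' key is arbitrary):

from flask import Flask, current_app

app = Flask(__name__)
app.config['SOME_RESOURCE'] = SomeResource()  # created once at module level, lives for the process

@app.route('/use-it')
def use_it():
    resource = current_app.config['SOME_RESOURCE']
    return resource.do_something()

It works, but an extension keeps the wiring (init_app, teardown, per-app state) in one place, which is why the extension route reads more consistently.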
This is really a Python problem. The reason Flask and other web frameworks make this difficult is that there is no well-defined memory model for concurrency. Suppose, for example, that your web server serves requests in separate processes (e.g. Gunicorn). Those processes are forked off the main process, and therefore copy your "global" variable. If you run a server on Windows, then due to the lack of fork it will instead spawn a new process which re-imports all modules, producing a clean version of your global state that does not remember any changes made to it in the main process before the workers were spawned, or not finding it at all if you declared it behind a name guard.
There are only two ways to consistently do this "right". One is to build a web app specifically to hold your global state, restricted to a single thread, and have your other web app fetch state from it.
The second is to use Jython plus threads. Since Jython runs in a JVM, it has a well-defined memory model with shared state, so you can use all the Java EE tricks like container-managed concurrency.
Related
I am writing a Flask application and I am trying to add a multi-threaded implementation for certain server-related features. I noticed this weird behavior, so I wanted to understand why it is happening and how to solve it. I have the following code:
import threading

from flask import Blueprint
from flask_login import current_user, login_required

posts = Blueprint('posts', __name__)

@posts.route("/foo")
@login_required
def foo():
    print(current_user)
    thread = threading.Thread(target=goo)
    thread.start()
    thread.join()
    return ""

def goo():
    print(current_user)
    # ...
The main thread correctly prints the current_user, while the child thread prints None.
User('Username1', 'email1@email.com', 'Username1-ProfilePic.jpg')
None
Why is this happening? How can I obtain the current_user in the child thread as well? I tried passing it as an argument to goo, but I still get the same behavior.
I found this post but I can't understand how to ensure the context is not changing in this situation, so I tried providing a simpler example.
A partially working workaround
I also tried passing, as a parameter, a newly created User object populated with the data from current_user:
def foo():
    # ...
    user = User.query.filter_by(username=current_user.username).first_or_404()
    thread = threading.Thread(target=goo, args=[user])
    # ...

def goo(user):
    print(user)
    # ...
This correctly prints the current user's information. But since I also perform database operations inside goo, I get the following error:
RuntimeError: No application found. Either work inside a view function
or push an application context. See
http://flask-sqlalchemy.pocoo.org/contexts/.
So, as I suspected, it seems to be a context problem.
I also tried inserting this inside goo, as suggested by the error:
def goo():
    from myapp import create_app
    app = create_app()
    app.app_context().push()
    # ... database access
But I still get the same error, and if I try to print current_user I get None.
How can I pass the old context to the new thread? Or should I create a new one?
This is because Flask uses thread-local variables to store this for each request's thread. That simplifies things in many cases, but makes it hard to use multiple threads. See https://flask.palletsprojects.com/en/1.1.x/design/#thread-local.
If you want to use multiple threads to handle a single request, Flask might not be the best choice. You can always interact with Flask exclusively on the initial thread if you want and then forward anything you need on other threads back and forth yourself through a shared object of some kind. For database access on secondary threads, you can use a thread-safe database library with multiple threads as long as Flask isn't involved in its usage.
In summary, treat Flask as single-threaded: any extra threads shouldn't interact directly with Flask. Depending on your needs, you could also avoid threads entirely and run everything sequentially, or try e.g. Tornado or asyncio for easier concurrency with coroutines.
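As a hedged sketch of that approach, reusing the question's posts/foo/goo names: grab the real application object and copy plain data out of current_user on the request thread, then push an application context inside the worker for any database work.

import threading

from flask import current_app
from flask_login import current_user, login_required

@posts.route("/foo")
@login_required
def foo():
    app = current_app._get_current_object()  # the real app, not the context-local proxy
    username = current_user.username          # plain data, safe to hand to another thread

    def goo():
        # An application context lets extensions such as Flask-SQLAlchemy work here,
        # but request-bound proxies like current_user still must not be touched.
        with app.app_context():
            print(username)
            # ... database access ...

    thread = threading.Thread(target=goo)
    thread.start()
    thread.join()
    return ""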
Your server serves multiple users, each handled in its own thread.
flask_login was not designed for extra threading inside a request handler, which is why the child thread prints None.
I suggest using the database to pass variables between users, and running an additional Docker container if you need a separate process.
That is because current_user is implemented as a context-local proxy:
https://github.com/maxcountryman/flask-login/blob/main/flask_login/utils.py#L26
Read:
https://werkzeug.palletsprojects.com/en/1.0.x/local/#module-werkzeug.local
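One practical consequence of that proxy implementation, as a small sketch against the question's code: resolve the proxy to the underlying object on the request thread before handing it to the worker, because the proxy can only be resolved while the request context is active.

import threading

from flask_login import current_user

def foo():
    real_user = current_user._get_current_object()  # the plain User instance behind the proxy
    thread = threading.Thread(target=goo, args=[real_user])
    # ...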
I would like to use psycopg2 (directly, without SQLAlchemy). Also, I would prefer using a connection pool to avoid initializing database connections on every request, as opposed to what (I think?) official docs recommend.
However, the Flask app context has approximately the same lifetime as the request context, i.e. the lifetime of a request, so defining the pool there would not make sense. The only cross-request place I found is a module-level global variable, which seems to work, but I'm worried about whether this is safe.
In other words, where is the correct place to initialize a DB connection pool in a Flask application so that it is used across requests?
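A non-authoritative sketch of the module-level approach (connection details and handler names are placeholders): build a ThreadedConnectionPool once at import time and check a connection out and back in per request.

from flask import Flask, g
from psycopg2.pool import ThreadedConnectionPool

app = Flask(__name__)

# Module-level pool: created once per process and shared across requests.
pool = ThreadedConnectionPool(minconn=1, maxconn=10, dsn="dbname=mydb user=me")

@app.before_request
def acquire_db_conn():
    g.db_conn = pool.getconn()

@app.teardown_request
def release_db_conn(exc):
    conn = g.pop("db_conn", None)
    if conn is not None:
        pool.putconn(conn)

Under a pre-forking server such as Gunicorn, each worker process ends up with its own pool (as long as the app isn't preloaded before forking), which is usually what you want.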
I have written a Python script on a Raspberry Pi that controls a large and somewhat complex piece of hardware. Currently the user interface is a Python console: I run the program and enter commands from the console with input("> ").
Now I want to add a web interface to this program, and I'm trying to figure out the right way to do that. I've written a basic web UI in Flask, but I haven't found a good way to connect the Flask interface with the main script. Because the server controls a single piece of hardware, the main object (handling the hardware control) is only instantiated once. In addition, there is significant hardware configuration that must happen each time the script is run (each time the main object is created).
Flask seems to create a new thread for each client session, and only persists variables across that session. How can I link Flask events (i.e., a web user presses a button) to a method in the main object?
Am I using the wrong tool for the job? What is the right tool? What is the correct way to do this? Alternatively, what is a jank way to do this?
EDIT: The linked questions are similar, and led me down the path to the answer below, but they really didn't answer the question, so much as point to some code that solved that specific example. I think the below answer is far more helpful. (Of course I'm a bit biased; I wrote it)
I found a solution, so I'll share it here. (I guess I can't answer my own questions?)
The Flask Web App system (and all WSGI web apps?) relies on the principle that the web app can be executed in a totally fresh environment, without needing to be passed objects on creation. I guess this makes sense for Web Apps, but it adds a ton of annoying complexity to Web UIs, that is, web interfaces purpose-built to interface with a single instance of a larger program. As a hardware/solutions engineer, I tend to need a Web UI to control a single piece of hardware. If there is something better than Flask for this purpose please leave a comment. But Flask is nice, and I now know it's possible to use it for this purpose.
Flask Web Apps are not designed to do the "heavy lifting" themselves. They maintain very limited state (i.e. every request triggers a fresh context), and as mentioned can't be passed references to other objects. In a proper Web App a full stack developer would connect Flask to a database server and other such WebDev-y systems. For controlling hardware, our goal is to trigger the execution of some arbitrary methods in another python process (quite possibly a separately executed process).
In Python a convenient way to achieve this is the multiprocessing.managers module. Managers are a tool that allows you to easily construct proxies, which link objects across processes. If you have an object bar = Bar(), you can produce an <AutoProxy[get_bar]> proxy that allows you to manipulate the original bar object from far away. Far away can be a child process, or another computer across the internet. Here's an example.
server.py:
import multiprocessing.managers as m
import logging

logger = logging.getLogger()

class Bar:  # First we set up a dummy class for this example.
    def __init__(self):
        self.text = ""

    def read(self):
        return str(self.text)

    def write(self, string):
        self.text = str(string)
        logger.error("Wrote!")
        logger.error(string)

bar = Bar()  # On the server side we create an instance: this is the object we want to share with other processes.

m.BaseManager.register('get_bar', callable=lambda: bar)  # Then we register a 'get' function in the manager,
# to retrieve our object from afar. The lambda: bar is just shorthand for a function that returns the bar object.

manager = m.BaseManager(address=('', 50000), authkey=b'abc')  # Then we set up the server on port 50000, with a password.
server = manager.get_server()
server.serve_forever()  # Then we start the server!
client.py:
import multiprocessing.managers as m
import logging

logger = logging.getLogger()

m.BaseManager.register('get_bar')  # We register this so that the Manager knows it's a valid method
manager = m.BaseManager(address=('', 50000), authkey=b'abc')  # Then we set up the server connection
manager.connect()  # and connect!
bar = manager.get_bar()  # Now we can use our 'get' method to retrieve a Proxy of the object.
So there are a few interesting things to note here. First, we can get_bar() from as many clients as we want, and they'll all point back to the same bar. Second, we can call methods in bar, read() and write(), from a client, without having the Bar class on hand. Pretty neat.
So how to use this? If you have a console-enabled program, first split it into two parts, the console, and the functionality it controls. Have that functionality boxed up in a handful of objects that together comprise the instance of the application. Modify server.py above to instantiate those objects, and host them out with get methods in a Manager. Then tweak your console interface to connect like client.py and use the proxies instead of the actual objects. Finally, set up a flask app that also connects to the server and provides a web interface.
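To make that last step concrete, here is a rough sketch (the file name, route, and form field are made up) of a Flask app that connects to the manager from server.py and drives the shared object from a view:

# webui.py
import multiprocessing.managers as m
from flask import Flask, request

m.BaseManager.register('get_bar')
manager = m.BaseManager(address=('', 50000), authkey=b'abc')
manager.connect()
bar = manager.get_bar()  # proxy to the single Bar instance living in server.py

app = Flask(__name__)

@app.route("/bar", methods=["GET", "POST"])
def bar_view():
    if request.method == "POST":
        bar.write(request.form.get("text", ""))  # executed in the server process
    return bar.read()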
Now you're a Bona-Fide Web Developer! RIP.
I have a WSGI application which can potentially run in different Python web server environments (CherryPy, uWSGI, or Gunicorn), but not greenlet-based ones. The application handles different paths for some REST APIs and user interfaces. During any HTTP call, the app needs to know the context of the call, since the implementation methods share code between API calls and UI calls, and a lot of logic spread across many modules should react differently depending on that context. The simple and straightforward way is to pass a parameter down to the implementation code, e.g. ApiCall(caller=client_service_id) or UserCall(caller=user_id), but it's a pain to propagate this parameter through all the possible modules. Is it a good solution to just set the context on the thread object like this?
import threading

def set_context(ctx):
    threading.current_thread().ctx = ctx

def get_context():
    return threading.current_thread().ctx
So, call set_context somewhere at the beginning of the HTTP call handler, where we can construct the context object from the environment data, and then just use get_context() in any part of the code that must react depending on the context. What are the best practices to achieve this? Thank you!
It looks like this is what e.g. MongoEngine does. The goal is to have model files be able to access the db without having to explicitly pass around the context.
Pyramid has nothing to do with it. The global needs to handle whatever mechanism the WSGI server is using to serve your application.
For instance, most servers use a separate thread per request, so your global variable needs to be thread-safe. gunicorn with gevent workers serves requests using greenlets, which is a different mechanism. A lot of engines/ORMs support a thread-local connection. This allows you to access your connection as if it were a global variable, but it is a different variable in each thread. You just have to make sure to close the connection when the request is complete, to avoid that connection spilling over into the next request in the same thread. This can be done easily using a Pyramid tween or several other patterns illustrated in the cookbook.
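As a rough illustration of that thread-local pattern (make_connection is a hypothetical factory standing in for whatever your engine/ORM provides):

import threading

_local = threading.local()

def get_connection():
    # Each worker thread lazily creates and caches its own connection.
    conn = getattr(_local, "conn", None)
    if conn is None:
        conn = make_connection()  # hypothetical: replace with your engine's connect call
        _local.conn = conn
    return conn

def close_connection():
    # Call at the end of each request (e.g. from a tween or WSGI middleware)
    # so nothing spills over into the next request served by the same thread.
    conn = getattr(_local, "conn", None)
    if conn is not None:
        conn.close()
        _local.conn = None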
A lot of engines/orms support a threadlocal connection. This will allow you to access your connection as if it were a global variable, but it is a different variable in each thread. You just have to make sure to close the connection when the request is complete to avoid that connection spilling over into the next request in the same thread. This can be done easily using a Pyramid tween or several other patterns illustrated in the cookbook.