Setting an attribute on current_thread as a state holder - Python

I have a WSGI application which can potentially run in different Python web server environments (CherryPy, uWSGI or Gunicorn), but not greenlet-based ones. The application handles different paths for some REST APIs and user interfaces. During any HTTP call the app needs to know the context of the call, since the implementation logic methods are shared between API calls and UI calls, and a fair amount of logic spread across many modules should react differently depending on that context. The simple and straightforward way is to pass a parameter to the implementation code, e.g. ApiCall(caller=client_service_id) or UserCall(caller=user_id), but it's a pain to propagate this parameter to all the possible modules. Is it a good solution to just set the context on the thread object like this?
import threading

def set_context(ctx):
    threading.current_thread().ctx = ctx

def get_context():
    return threading.current_thread().ctx
So we call set_context somewhere at the beginning of the HTTP call handler, where we can construct the context object from the environment data, and then just use get_context() in any part of the code that must react depending on the context. What are the best practices to achieve this? Thank you!
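For reference, the standard library provides threading.local() for exactly this kind of per-thread state, which avoids attaching ad-hoc attributes to the thread object. A minimal sketch, assuming a plain threaded WSGI server (the handler name here is hypothetical):

import threading

_request_ctx = threading.local()

def set_context(ctx):
    _request_ctx.value = ctx

def get_context():
    # Raises AttributeError if no context was set on this thread,
    # which surfaces a missing set_context() call early.
    return _request_ctx.value

def handle_request(environ):  # hypothetical entry point of the HTTP handler
    set_context({'caller': environ.get('REMOTE_USER')})
    # ... dispatch into the shared implementation code, which may
    # call get_context() from any module ...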

Related

Connect Flask server with many threads to single server-side thread [duplicate]

This question already has answers here:
Store large data or a service connection per Flask session (1 answer)
Are global variables thread-safe in Flask? How do I share data between requests? (4 answers)
Closed 4 years ago.
I have written a Python script on a Raspberry Pi that controls a large and somewhat complex piece of hardware. Currently the user interface is a Python console: I run the program, and commands can then be entered from the console with input("> ").
Now I want to add a web interface to this program, and I'm trying to figure out the right way to do that. I've written a basic web UI in Flask, but I haven't found a good way to connect the Flask interface with the main script. Because the server controls a single piece of hardware, the main object (handling the hardware control) is only instantiated once. In addition, there is significant hardware configuration that must happen each time the script is run (each time the main object is created).
Flask seems to create a new thread for each client session, and only persists variables across that session. How can I link Flask events (i.e., a web user pressing a button) to a method in the main object?
Am I using the wrong tool for the job? What is the right tool? What is the correct way to do this? Alternatively, what is a jank way to do this?
EDIT: The linked questions are similar, and led me down the path to the answer below, but they really didn't answer the question so much as point to some code that solved that specific example. I think the answer below is far more helpful. (Of course I'm a bit biased; I wrote it.)
I found a solution, so I'll share it here. (I guess I can't answer my own questions?)
The Flask web app system (and all WSGI web apps?) relies on the principle that the web app can be executed in a totally fresh environment, without needing to be passed objects on creation. I guess this makes sense for web apps, but it adds a ton of annoying complexity to web UIs, that is, web interfaces purpose-built to interface with a single instance of a larger program. As a hardware/solutions engineer, I tend to need a web UI to control a single piece of hardware. If there is something better than Flask for this purpose please leave a comment. But Flask is nice, and I now know it's possible to use for this purpose.
Flask web apps are not designed to do the "heavy lifting" themselves. They maintain very limited state (i.e. every request triggers a fresh context), and as mentioned they can't be passed references to other objects. In a proper web app a full-stack developer would connect Flask to a database server and other such WebDev-y systems. For controlling hardware, our goal is to trigger the execution of some arbitrary methods in another Python process (quite possibly a separately executed process).
In Python a convenient way to achieve this is the multiprocessing.managers module. Managers are a tool that allows you to easily construct proxies, which link objects across processes. If you have an object bar = Bar(), you can produce an <AutoProxy[get_bar]> proxy that allows you to manipulate the original bar object from far away. Far away can be in a child process, or on another computer across the internet. Here's an example.
server.py:
import multiprocessing.managers as m
import logging
logger = logging.getLogger()
class Bar:  # First we set up a dummy class for this example.
    def __init__(self):
        self.text = ""

    def read(self):
        return str(self.text)

    def write(self, string):
        self.text = str(string)
        logger.error("Wrote!")
        logger.error(string)

bar = Bar()  # On the server side we create an instance: this is the object we want to share with other processes.
m.BaseManager.register('get_bar', callable=lambda: bar)  # Then we register a 'get' function in the manager,
# to retrieve our object from afar. The lambda: bar is just shorthand for a function that returns the bar object.
manager = m.BaseManager(address=('', 50000), authkey=b'abc')  # Then we set up the server on port 50000, with a password.
server = manager.get_server()
server.serve_forever()  # Then we start the server!
client.py:
import multiprocessing.managers as m
import logging
logger = logging.getLogger()
m.BaseManager.register('get_bar')  # We register this so that the Manager knows it's a valid method.
manager = m.BaseManager(address=('', 50000), authkey=b'abc')  # Then we set up the server connection
manager.connect()  # and connect!
bar = manager.get_bar()  # Now we can use our 'get' method to retrieve a Proxy of the object.
There are a few interesting things to note here. First, we can get_bar() from as many clients as we want, and they'll all point back to the same bar. Second, we can call methods on bar, read() and write(), from a client, without having the Bar class on hand. Pretty neat.
So how do you use this? If you have a console-enabled program, first split it into two parts: the console, and the functionality it controls. Have that functionality boxed up in a handful of objects that together comprise the instance of the application. Modify server.py above to instantiate those objects, and host them out with get methods in a Manager. Then tweak your console interface to connect like client.py and use the proxies instead of the actual objects. Finally, set up a Flask app that also connects to the server and provides a web interface.
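A minimal sketch of that last Flask step, reusing the connection pattern from client.py (the file name and route layout are hypothetical):

flask_ui.py:
import multiprocessing.managers as m
from flask import Flask, request

m.BaseManager.register('get_bar')
manager = m.BaseManager(address=('', 50000), authkey=b'abc')
manager.connect()
bar = manager.get_bar()  # Proxy to the single Bar instance living in server.py.

app = Flask(__name__)

@app.route('/', methods=['GET'])
def show():
    return bar.read()  # The call is forwarded to the server process.

@app.route('/', methods=['POST'])
def update():
    bar.write(request.form['text'])
    return 'OK'

if __name__ == '__main__':
    app.run(port=8000)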
Now you're a Bona-Fide Web Developer! RIP.

Connecting to MongoDb (or any other db server) with Pyramid

What is the difference between connecting to the MongoDB server with the following two lines in the models.py module and then importing models inside views.py:
from pymongo import MongoClient
db = MongoClient()['name']
versus adding db to request as described here or here?
I just started playing around with Pyramid and MongoDB; I used the first approach and it works well. Then I found out that people use the second approach.
Am I doing something wrong?
There's nothing wrong with what you're doing, but it's less future-proof if your app becomes complex. The pattern you're using is what is sometimes called "using a module as a singleton". The first time your module is imported, the code runs, creating a module-level object that can be used by any other code that imports from this module. There's nothing wrong with this; it's a normal Python pattern and is the reason you don't see much in the way of singleton boilerplate in Python land.
However, in a complex app, it can become useful to control exactly when something happens, regardless of who imports what when. When we create the client at config time as per the docs example, you know that it gets created while the config (server startup) block is running, as opposed to whenever any code imports your module, and you know from then on that it's available through your registry, which is accessible everywhere in a Pyramid app through the request object. This is the normal Pyramid best practice: set up all your one-time, shared-across-requests machinery in the server startup code where you create your configurator, and (probably) attach it to the configurator or its registry.
This is the same reason we hook things into request lifecycle callbacks: it allows us to know where and when some piece of per-request code executes, and to make sure that a clean-up helper always fires at the end of the request lifecycle. So for DB access, we create the shared machinery at config startup, and at the beginning of a request we create the per-connection code, cleaning up at the end of the request. For an SQL db, this would mean starting the transaction, and then committing or rolling back at the end.
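Concretely, the config-time pattern might look something like this (a sketch; the settings keys and database name are hypothetical):

from pyramid.config import Configurator
from pymongo import MongoClient

def main(global_config, **settings):
    config = Configurator(settings=settings)
    # One client for the whole process, created once at server startup
    # and stashed on the registry.
    config.registry.mongo_client = MongoClient(settings['mongo.uri'])

    def mongo_db(request):
        # Per-request accessor. PyMongo pools connections itself, so no
        # per-request cleanup is needed for this particular driver.
        return request.registry.mongo_client[settings['mongo.db_name']]

    config.add_request_method(mongo_db, 'db', reify=True)
    return config.make_wsgi_app()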
So it might not matter at all for your app right now, but it's good practice for growing code bases. Most of Pyramid's design decisions were made for complex code situations.

Self-invocation in Python bottle

Using Python's bottle module, I would like to process a request internally, without invoking a call from outside.
Suppose I have the following minimal bottle application running at localhost:8080 and I would like to invoke foo from inside bar. One way to do this is:
from bottle import *
import requests

@get('/foo')
def foo():
    return 'foo'

@get('/bar')
def bar():
    return requests.get('http://localhost:8080/foo').text

app = default_app()
run(app, port=8080)
Now what I would like to do is get rid of the HTTP call using requests. I would simply love to use something like:
@get('/bar')
def bar():
    return bottle.process_internally('/foo', 'GET')
For me, this would have two big advantages:
Only a single worker is required (the worker is blocked while processing a request, so using requests.get leads to a deadlock if only one worker is running).
No overhead caused by the HTTP protocol.
My true motivation is that I wish to process batches of request URLs given as JSON arrays. Very inefficient, yet very fast to implement.
Is that somehow possible?
You may want to consider the "thin controllers" (and skinny everything) paradigm.
With this concept, all your code logic lives somewhere other than the controller (perhaps in your service or model classes).
If you have the bare minimum amount of logic in your controllers, then your foo and bar routes can call the same functions in your models/services, and you won't need to resort to your routes calling each other.
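A minimal sketch of that refactor (the helper name is hypothetical):

from bottle import get, default_app, run

def compute_foo():
    # Shared logic lives outside the routes, so any route can call it directly.
    return 'foo'

@get('/foo')
def foo():
    return compute_foo()

@get('/bar')
def bar():
    # No HTTP round trip and no second worker needed.
    return compute_foo()

run(default_app(), port=8080)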
There are some frameworks that have support for internal redirect (Ruby's Sinatra is one), but I've always considered these a hacky workaround for not writing the code in a flexible way.

Controlling a Twisted Server from Django

I'm trying to build a Twisted/Django mashup that will let me control various client connections managed by a Twisted server via Django's admin interface. Meaning, I want to be able to log in to Django's admin and see which protocols are currently in use, any details specific to each connection (e.g. if the server is connected to Freenode via IRC, it should list all the channels it is currently in), and allow me to disconnect or connect new clients by modifying or creating database records.
What would be the best way to do this? There are lots of posts out there about combining Django with Twisted, but I haven't found any prior art for doing quite what I've outlined. All the Twisted examples I've seen use hardcoded connection parameters, which makes it difficult for me to imagine how I would dynamically run reactor.connectTCP(...) or loseConnection(...) when signalled by a record in the database.
My strategy is to create a custom ClientFactory that simply polls the Django-managed database every N seconds for any commands, and to modify/create/delete connections as appropriate, reflecting the new status in the database when complete.
Does this seem feasible? Is there a better approach? Does anyone know of any existing projects that implement similar functionality?
Polling the database is lame, but unfortunately, databases rarely have good tools (and certainly there are no database-portable tools) for monitoring changes. So your approach might be okay.
However, if your app is in Django and you're not supporting random changes to the database from other (non-Django) clients, and your WSGI container is Twisted, then you can do this very simply with reactor.callFromThread(reactor.connectTCP, ...).
I've been working on yet another way of combining Django and Twisted. Feel free to give it a try: https://github.com/kowalski/featdjango.
The way it works is slightly different from the others. It starts a Twisted application and an HTTP site. The requests sent to Django are processed inside a special thread pool. What makes it special is that these threads can wait on a Deferred, which makes it easy to combine synchronous Django application code with asynchronous Twisted code.
The reason I came up with a structure like this is that my application needs to perform a lot of HTTP requests from inside the Django views. Instead of performing them one by one, I can delegate all of them at once to "the main application thread", which runs Twisted, and wait for them. The similarity to your problem is that I also have an asynchronous component which is a singleton, and I access it from Django views.
So this is, for example, how you would initiate the Twisted component and later get a reference to it from the view.
import threading
from django.conf import settings

_initiate_lock = threading.Lock()

def get_component():
    global _initiate_lock
    if not hasattr(settings, 'YOUR_CLIENT'):
        _initiate_lock.acquire()
        try:
            # Another thread might have done our job while we
            # were waiting for the lock.
            if not hasattr(settings, 'YOUR_CLIENT'):
                client = YourComponent(**whatever)
                threading.current_thread().wait_for_deferred(
                    client.initiate)
                settings.YOUR_CLIENT = client
        finally:
            _initiate_lock.release()
    return settings.YOUR_CLIENT
The code above initiates my client and calls the initiate method on it. This method is asynchronous and returns a Deferred. I do all the necessary setup in there. The Django thread will wait for it to finish before proceeding to the next line.
This is how I do it, because I only access the component from the request handler. You would probably want to initiate your component at startup instead, to call listenTCP/listenSSL. Then your Django request handlers could get the data about the connections just by accessing some public methods on your client. These methods could even return a Deferred, in which case you should use .wait_for_deferred() to call them.
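For instance, a Django view using that accessor might look like this (a sketch; list_connections is a hypothetical method on the component):

import threading
from django.http import JsonResponse

def connections_view(request):
    client = get_component()  # the accessor defined above
    # list_connections is hypothetical; if it returns a Deferred,
    # featdjango's thread pool lets this thread block on it.
    info = threading.current_thread().wait_for_deferred(
        client.list_connections)
    return JsonResponse({'connections': info})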

In Pyramid, is it safe to have a python global variable that stores the db connection?

It looks like this is what e.g. MongoEngine does. The goal is to let model files access the db without having to explicitly pass the context around.
Pyramid has nothing to do with it. The global needs to handle whatever mechanism the WSGI server is using to serve your application.
For instance, most servers use a separate thread per request, so your global variable needs to be thread-safe. Under gunicorn with gevent workers, requests are served using greenlets, which is a different mechanic.
A lot of engines/ORMs support a thread-local connection. This will allow you to access your connection as if it were a global variable, but it is actually a different variable in each thread. You just have to make sure to close the connection when the request is complete, to avoid it spilling over into the next request handled by the same thread. This can be done easily using a Pyramid tween, or via several other patterns illustrated in the cookbook.
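A minimal sketch of the tween approach (the connection helper here is hypothetical; substitute your engine's connect call), closing the thread-local connection at the end of each request:

import threading

_local = threading.local()

def get_connection():
    # Hypothetical helper: lazily create one connection per thread.
    if not hasattr(_local, 'conn'):
        _local.conn = make_connection()  # stand-in for your engine's connect call
    return _local.conn

def cleanup_tween_factory(handler, registry):
    def cleanup_tween(request):
        try:
            return handler(request)
        finally:
            # Close the thread-local connection so it never leaks
            # into the next request served by this thread.
            if hasattr(_local, 'conn'):
                _local.conn.close()
                del _local.conn
    return cleanup_tween

# registered once at config time:
# config.add_tween('myapp.cleanup_tween_factory')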
