I am a PHP programmer learning Python whenever I get a chance.
I read that Python web applications stay active between requests, meaning that data stays in memory and is available between requests, right?
I am wondering how that works.
In PHP we place a cookie with a unique token and save data in sessions.
Sessions are arrays, saved on disk or database.
Between requests, the session functions restore the correct session array based on the cookie with the unique token. That means each browser gets its own unique session, and the session has a preset expiration time. If the user is inactive and the expiration is triggered, the session gets purged, and a new session has to be created when the user comes back.
My understanding is Python doesn't need this, because the application stays active between requests.
Doesn't each request get a unique thread in Python?
How does it distinguish between requests, and how does it know who the requester is?
Is there a handling method to separate vars between users and application?
Let's say I have a dict saved: is this dict globally available across all requests from any browser, or only to that one browser?
When and how does the memory get cleared? If everything stays in memory, what if the app is running for a couple of years without a restart? There must be some kind of expiration setting or memory handling.
One commenter says it depends on the web app. So I am using Bottle.py to learn.
I would assume the answer depends on which web application framework you are using within Python. Some of them have session-management pieces that track a user across requests. But if you just had a basic port listener that responded with HTTP, you would have to build any cookie support or session management yourself.
The other big difference is that in PHP, you have a module installed on the server that the actual HTTP server delegates to in order to generate a response. PHP doesn't handle the routing or the actual serving of the responses. Whereas Python can actually be both the server and the resource for generating the response. It depends on how Python is installed/accessed on the machine where the server is running. So in that sense you can do whatever you want within a Python web application.
If you are interested, you should look at some of the available Python web frameworks.
Edit: I see you mentioned bottle.py, and out of the box, it does not provide authentication and session management because it's a micro framework for fast prototyping and not necessarily suitable for a large scale application (although not impossible, just a lot of work).
Yes and no. If you check out this question, you get an idea of how it could work for a Django application.
However, the way you state it, it will not work. Defining a dict in one request without passing it somewhere in order to make it accessible in following requests will obviously not make it available in further requests. So yes, you have options to do this, but it's not the case out of the box!
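For illustration, here is a minimal sketch using bottle.py (which you mention above); the route and variable names are made up. A module-level dict lives as long as the process, while a dict created inside the handler is gone as soon as the function returns:

import bottle

app = bottle.Bottle()

hits = {}  # module-level (global) dict: kept in memory and shared by every request from every browser

@app.route('/hit/<name>')
def hit(name):
    temp = {'when': 'now'}  # local dict: recreated on each request, gone when the function returns
    hits[name] = hits.get(name, 0) + 1
    return "%s: %d hits so far\n" % (name, hits[name])

# bottle.run(app, host='localhost', port=8080)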
I was able to persist an object in Python between requests before, using Twisted's web server. I have not verified for myself that it persists across browsers, but I have a feeling it does. Here's a code snippet from the documentation:
Twisted includes an event-driven web server. Here's a sample web application; notice how the resource object persists in memory, rather than being recreated on each request:
from twisted.web import server, resource
from twisted.internet import reactor

class HelloResource(resource.Resource):
    isLeaf = True
    numberRequests = 0

    def render_GET(self, request):
        self.numberRequests += 1
        request.setHeader("content-type", "text/plain")
        return "I am request #" + str(self.numberRequests) + "\n"

reactor.listenTCP(8080, server.Site(HelloResource()))
reactor.run()
First of all, you should understand the difference between local and global variables in Python, and also how thread-local storage works.
This is a (very) short explanation:
Global variables are those declared at module scope; they are shared by all threads and live as long as the process is running, unless explicitly removed.
Local variables are those declared inside a function and instantiated for each call of that function. They are deleted when the function returns, unless they are still referenced somewhere else.
Thread-local storage enables defining global variables that are specific to the current thread. They live as long as the current thread is running, unless explicitly removed.
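A tiny, self-contained sketch of the difference (the names are made up):

import threading

counter = 0                    # global: one value shared by the whole process
local = threading.local()      # thread-local: each thread gets its own attributes

def handle(name):
    global counter
    counter += 1               # every thread increments the same global
    local.user = name          # but each thread sees only its own local.user
    print(threading.current_thread().name, counter, local.user)

for n in ('alice', 'bob'):
    threading.Thread(target=handle, args=(n,)).start()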
And now I'll try to answer your original questions (the answers are specific to bottle.py, but the same approach is common across Python web servers).
Doesn't each request get a unique thread in Python?
Each concurrent request will have a separate thread; future requests might reuse previous threads.
How does it distinguish between requests, and how does it know who the requester is?
bottle.py uses thread-local storage to access the current request.
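For example (a minimal sketch; the route is made up), bottle exposes request as what looks like a global, but it is a thread-local proxy to the request the current thread is handling:

from bottle import Bottle, request

app = Bottle()

@app.route('/whoami')
def whoami():
    # request looks global, but each worker thread sees only the
    # request it is currently handling
    return "Hello %s\n" % request.environ.get('REMOTE_ADDR', 'unknown')

# from bottle import run; run(app, host='localhost', port=8080)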
Is there a handling method to separate vars between users and application?
Sounds like you are looking for a session. If so, there is no standard way of doing it, because different implementations have different advantages and disadvantages. For example, this is a bottle.py middleware for sessions.
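As an illustration, here is a minimal sketch using the Beaker session middleware (one common choice; the option values and routes are placeholders):

from bottle import Bottle, request
from beaker.middleware import SessionMiddleware

app = Bottle()

session_opts = {
    'session.type': 'file',            # store session data on disk, much like PHP's default
    'session.data_dir': './sessions',
    'session.timeout': 1800,           # purge after 30 minutes of inactivity
    'session.auto': True,              # save the session automatically at the end of the request
}

@app.route('/count')
def count():
    session = request.environ['beaker.session']  # Beaker places the per-browser session here
    session['hits'] = session.get('hits', 0) + 1
    return "You have visited %d times\n" % session['hits']

wrapped_app = SessionMiddleware(app, session_opts)
# run the wrapped app, e.g.: bottle.run(app=wrapped_app, host='localhost', port=8080)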
Let's say I have a dict saved: is this dict globally available across all requests from any browser, or only to that one browser? When and how does the memory get cleared?
If everything stays in memory, what if the app is running for a couple of years without a restart? There must be some kind of expiration setting or memory handling?
Exactly, there must be an expiration setting. Since you are using a custom dict, you need a timer that checks each entry in the dict for expiration.
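A rough sketch of what that could look like (the names and the 30-minute TTL are just placeholders):

import time
import threading

sessions = {}        # token -> (data, last_access_time)
SESSION_TTL = 1800   # expire entries after 30 minutes of inactivity

def purge_expired():
    now = time.time()
    for token in list(sessions):
        data, last_seen = sessions[token]
        if now - last_seen > SESSION_TTL:
            del sessions[token]    # the user gets a fresh session next time
    threading.Timer(60, purge_expired).start()   # check again in a minute

# purge_expired()   # start the expiration loop once, at application startup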
Related
I'm having some issues with the pyghmi Python library, which is used for sending IPMI commands with Python scripts. My goal is to implement an HTTP API to send IPMI commands through HTTP requests.
I am already able to create a Session and send a few commands with the library, but if the Session remains idle for 30 seconds, it logs itself out.
When the Session is logged out, I can't create a new one: I get a "Session is logged out" error, or a deadlock.
What can I do if I want a server that is always up and creates a Session when it receives requests, if I can't create a new Session when the previous one is logged out?
What I've tried:
from pyghmi.ipmi import command

ipmi = command.Command(ip, user, passwd)
res = ipmi.get_power()
print(res)

# wait 30 seconds
res2 = ipmi.get_power()  # "Session logged out" error

ipmi2 = command.Command(ip, user, passwd)  # deadlock if wait < 30 seconds, else no error
res3 = ipmi2.get_power()  # "Session logged out" error
# Impossible to create a new command.Command() Session; every command will give the "logged out" error
The other problem is that I can't use the asynchronous way of giving an "onlogon callback" function in the command.Command() call, because I need the callback's return value in the caller, and that's not possible with this sort of thread behavior.
Edit: I already tried some examples provided here, but they are always one-shot scripts, whereas I'm looking for something that can stay up forever.
So I finally achieved a sort of solution. I emailed Pyghmi's main contributor and he said that this lib was not suited for a multi- and reusable-Session implementation (there is currently an open issue, "Session reuse", on the Pyghmi repository).
First "solution": use processes
My goal was to create an HTTP API. To avoid the Session timeout issue, I create a new Process (not Thread) for every new request. That works fine, but I did not keep this solution because it is too heavy and socket-consuming. It seems that by creating processes, the memory used by Pyghmi is not shared between processes (that's the point of processes), so every Session use is a creation rather than a reuse.
Second "solution" : use Confluent
Confluent is a tool developed by Lenovo that allows controlling hardware via HTTP. It uses a sort of patched version of Pyghmi as a backend for IPMI calls. Confluent documentation here.
Once installed and configured on a server, Confluent worked well to control IPMI devices via HTTP. I packaged it in a Docker image along with an ipmi_simulator for testing purposes: confluent dockerized.
The solution today is to run Command.eventloop() after creating the connection. It is documented in ipmi/command.py, which has a very trivial Housekeeper class that, in the current version 1.5.53, is actually just a renamed Thread class with no additional features. It merely runs the eventloop.
The implementation looks like this. One of those housekeeping tasks is sending keepalive messages, which are enabled by default and can be controlled by supplying the keepalive argument at Command instantiation:
class Housekeeper(threading.Thread):
    """A Maintenance thread for housekeeping

    Long lived use of pyghmi may warrant some recurring asynchronous behavior.
    This stock thread provides a simple minimal context for these housekeeping
    tasks to run in. To use, do 'pyghmi.ipmi.command.Maintenance().start()'
    and from that point forward, pyghmi should execute any needed ongoing
    tasks automatically as needed. This is an alternative to calling
    wait_for_rsp or eventloop in a thread of the callers design.
    """

    def run(self):
        Command.eventloop()
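Putting that together, a usage sketch might look like this (the connection parameters are placeholders, and details may vary between pyghmi versions):

from pyghmi.ipmi import command

ipmi = command.Command('10.0.0.1', 'admin', 'secret', keepalive=True)

# Start the housekeeping thread once; it runs Command.eventloop() in the
# background, so keepalive messages keep the session from timing out.
command.Housekeeper().start()

print(ipmi.get_power())
# ... even 30+ seconds later the same session should still be usable:
print(ipmi.get_power())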
This question already has answers here:
Store large data or a service connection per Flask session
(1 answer)
Are global variables thread-safe in Flask? How do I share data between requests?
(4 answers)
Closed 4 years ago.
I have written a Python script on a Raspberry Pi that controls a large and somewhat complex piece of hardware. Currently the user interface is a Python console: I run the Python program and can enter commands from the console with input("> ").
Now I want to add a web interface to this program, and I'm trying to figure out the right way to do that. I've written a basic web UI in Flask, but I haven't found a good way to connect the Flask interface with the main script. Because the server controls a single piece of hardware, the main object (handling the hardware control) is only instantiated once. In addition, there is significant hardware configuration that must happen each time the script is run (each time the main object is created).
Flask seems to create a new thread for each client session, and only persists variables across that session. How can I link Flask events (i.e., a web user presses a button) to a method in the main object?
Am I using the wrong tool for the job? What is the right tool? What is the correct way to do this? Alternatively, what is a jank way to do this?
EDIT: The linked questions are similar, and led me down the path to the answer below, but they really didn't answer the question, so much as point to some code that solved that specific example. I think the below answer is far more helpful. (Of course I'm a bit biased; I wrote it)
I found a solution, so I'll share it here. (I guess I can't answer my own questions?)
The Flask Web App system (and all WSGI web apps?) relies on the principle that the web app can be executed in a totally fresh environment, without needing to be passed objects on creation. I guess this makes sense for Web Apps, but it adds a ton of annoying complexity to Web UIs, that is, web interfaces purpose-built to interface with a single instance of a larger program. As a hardware/solutions engineer, I tend to need a Web UI to control a single piece of hardware. If there is something better than Flask for this purpose, please leave a comment. But Flask is nice, and I now know it's possible to use for this purpose.
Flask Web Apps are not designed to do the "heavy lifting" themselves. They maintain very limited state (i.e. every request triggers a fresh context), and as mentioned can't be passed references to other objects. In a proper Web App a full stack developer would connect Flask to a database server and other such WebDev-y systems. For controlling hardware, our goal is to trigger the execution of some arbitrary methods in another python process (quite possibly a separately executed process).
In Python, a convenient way to achieve this execution is the multiprocessing.managers module. Managers are a tool that allows you to easily construct Proxies, which link objects across processes. If you have an object bar = Bar(), you can produce a <AutoProxy[get_bar]> proxy that allows you to manipulate the original bar object from far away. Far away can be in a child process, or on another computer across the internet. Here's an example.
server.py:
import multiprocessing.managers as m
import logging

logger = logging.getLogger()

class Bar:  # First we set up a dummy class for this example.
    def __init__(self):
        self.text = ""

    def read(self):
        return str(self.text)

    def write(self, string):
        self.text = str(string)
        logger.error("Wrote!")
        logger.error(string)

bar = Bar()  # On the server side we create an instance: this is the object we want to share with other processes.

m.BaseManager.register('get_bar', callable=lambda: bar)  # Then we register a 'get' function in the manager,
# to retrieve our object from afar. The lambda: bar is just shorthand for a function that returns the bar object.

manager = m.BaseManager(address=('', 50000), authkey=b'abc')  # Then we set up the server on port 50000, with a password.
server = manager.get_server()
server.serve_forever()  # Then we start the server!
client.py:
import multiprocessing.managers as m
import logging
logger = logging.getLogger()
m.BaseManager.register('get_bar') #We register this so that the Manager knows it's a valid method
manager = m.BaseManager(address=('', 50000), authkey=b'abc') #Then we setup the server connection
manager.connect() #and connect!
bar = manager.get_bar() # now we can use our 'get' method to retrieve a Proxy of the object.
So there are a few interesting things to note here. First, we can get_bar() from as many clients as we want, and they'll all point back to the same bar. Second, we can call methods in bar, read() and write(), from a client, without having the Bar class on hand. Pretty neat.
So how to use this? If you have a console-enabled program, first split it into two parts, the console, and the functionality it controls. Have that functionality boxed up in a handful of objects that together comprise the instance of the application. Modify server.py above to instantiate those objects, and host them out with get methods in a Manager. Then tweak your console interface to connect like client.py and use the proxies instead of the actual objects. Finally, set up a flask app that also connects to the server and provides a web interface.
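For example, the Flask side could be little more than another client of the manager (a sketch; the route and form field names are made up):

import multiprocessing.managers as m
from flask import Flask, request

m.BaseManager.register('get_bar')
manager = m.BaseManager(address=('', 50000), authkey=b'abc')
manager.connect()
bar = manager.get_bar()          # proxy to the single Bar instance living in server.py

app = Flask(__name__)

@app.route('/text', methods=['GET', 'POST'])
def text():
    if request.method == 'POST':
        bar.write(request.form.get('text', ''))   # calls write() on the remote object
    return bar.read()

# app.run(port=5000)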
Now you're a Bona-Fide Web Developer! RIP.
I'm trying to build a Twisted/Django mashup that will let me control various client connections managed by a Twisted server via Django's admin interface. Meaning, I want to be able to login to Django's admin and see what protocols are currently in use, any details specific to each connection (e.g. if the server is connected to freenode via IRC, it should list all the channels currently connected to), and allow me to disconnect or connect new clients by modifying or creating database records.
What would be the best way to do this? There are lots of posts out there about combining Django with Twisted, but I haven't found any prior art for doing quite what I've outlined. All the Twisted examples I've seen use hardcoded connection parameters, which makes it difficult for me to imagine how I would dynamically run reactor.connectTCP(...) or loseConnection(...) when signalled by a record in the database.
My strategy is to create a custom ClientFactory that simply polls the Django-managed database every N seconds for any commands, and to modify/create/delete connections as appropriate, reflecting the new status in the database when complete.
Does this seem feasible? Is there a better approach? Does anyone know of any existing projects that implement similar functionality?
Polling the database is lame, but unfortunately, databases rarely have good tools (and certainly there are no database-portable tools) for monitoring changes. So your approach might be okay.
However, if your app is in Django and you're not supporting random changes to the database from other (non-Django) clients, and your WSGI container is Twisted, then you can do this very simply by doing callFromThread(connectTCP, ...).
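A sketch of what that might look like from a Django view, assuming the Django app runs inside the same process as the Twisted reactor (the factory class and connection parameters are made up):

from django.http import HttpResponse
from twisted.internet import reactor

from myapp.protocols import IRCClientFactory   # hypothetical ClientFactory subclass

def connect_client(request):
    factory = IRCClientFactory()
    # View code runs in a worker thread, so hand the work to the reactor
    # thread instead of touching the reactor directly.
    reactor.callFromThread(reactor.connectTCP, 'irc.freenode.net', 6667, factory)
    return HttpResponse("connection requested")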
I've been working on yet another way of combining Django and Twisted. Feel free to give it a try: https://github.com/kowalski/featdjango.
The way it works is slightly different from the others. It starts a Twisted application and HTTP site. The requests sent to Django are processed inside a special thread pool. What makes it special is that these threads can wait on a Deferred, which makes it easy to combine synchronous Django application code with asynchronous Twisted code.
The reason I came up with a structure like this is that my application needs to perform a lot of HTTP requests from inside the Django views. Instead of performing them one by one, I can delegate all of them at once to "the main application thread", which runs Twisted, and wait for them. The similarity to your problem is that I also have an asynchronous component which is a singleton, and I access it from Django views.
This is, for example, how you would initiate the Twisted component and later get a reference to it from the view:
import threading
from django.conf import settings

_initiate_lock = threading.Lock()

def get_component():
    global _initiate_lock
    if not hasattr(settings, 'YOUR_CLIENT'):
        _initiate_lock.acquire()
        try:
            # another thread might have done our job while we
            # were waiting for the lock
            if not hasattr(settings, 'YOUR_CLIENT'):
                client = YourComponent(**whatever)
                threading.current_thread().wait_for_deferred(
                    client.initiate)
                settings.YOUR_CLIENT = client
        finally:
            _initiate_lock.release()
    return settings.YOUR_CLIENT
The code above initiates my client and calls the initiate method on it. This method is asynchronous and returns a Deferred. I do all the necessary setup in there. The Django thread will wait for it to finish before proceeding to the next line.
This is how I do it, because I only access it from the request handler. You would probably want to initiate your component at startup, to call listenTCP|SSL. Then your Django request handlers could get the data about the connections just by accessing some public methods on your client. These methods could even return a Deferred, in which case you should use .wait_for_deferred() to call them.
It looks like this is what e.g. MongoEngine does. The goal is to have model files be able to access the db without having to explicitly pass around the context.
Pyramid has nothing to do with it. The global needs to handle whatever mechanism the WSGI server is using to serve your application.
For instance, most servers use a separate thread per request, so your global variable needs to be thread-safe. gunicorn and gevent serve requests using greenlets, which is a different mechanism.
A lot of engines/ORMs support a thread-local connection. This allows you to access your connection as if it were a global variable, but it is a different variable in each thread. You just have to make sure to close the connection when the request is complete, to avoid that connection spilling over into the next request in the same thread. This can be done easily using a Pyramid tween or several other patterns illustrated in the cookbook.
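A bare-bones sketch of the thread-local connection pattern (create_connection is a stand-in for whatever connect call your engine provides):

import threading

_local = threading.local()          # one connection slot per thread

def get_connection():
    # lazily create a connection for the current thread
    if not hasattr(_local, 'conn'):
        _local.conn = create_connection()   # placeholder for your engine's connect call
    return _local.conn

def close_connection():
    # call this when the request ends, e.g. from a Pyramid tween, so the
    # connection does not leak into the next request on this thread
    conn = getattr(_local, 'conn', None)
    if conn is not None:
        conn.close()
        del _local.conn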
I am developing a web app with Flask, Python, SQLAlchemy and PostgreSQL.
My question here is about concurrency handling in this app.
How I wrote the app:
Take the example of adding a user to the database. I post the form and a view is called. I process all the form data, then call add_user(*arg), which uses SQLAlchemy code to insert the user into the database and returns on successful execution, and I return the response from the view.
What I assumed:
OK, I assumed that my web server (which I have not decided on yet) will spawn either a thread or a process if two users try to sign up at the same time, and will handle all the concurrency requirements.
Do I need to write threaded code here? By threaded code I mean acquiring a lock before writing and releasing it after the write.
I am pretty new to web development and multithreading/multiprocessing programming, and would like some guidance on how to write a web app that handles concurrency well.
Is writing concurrency handling from the start the right approach, or should this concern only come up once a large number of concurrent users are using the web app? Even if it should be done later, I would like some pointers about it.
Basically, I have no idea about the concurrency part of web app development. If you can point to resources where I can learn more about it, that would be really helpful.
Flask will execute each request in a separate thread or even in separate processes. The number of threads and processes to spawn is determined by the WSGI server (for example, Apache with mod_wsgi).
If you use SQLAlchemy ScopedSessions, the session is perfectly thread-safe. You must not share ORM-controlled objects across threads (but in the large majority of cases, you won't let your objects live longer than a request anyway so this is usually not a concern).
In other words, as long as you don't intend to share state between requests other than through the database or cookies, you don't need to worry about concurrency issues. You don't need to create a lock for writing to the database.
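For reference, a minimal sketch of a ScopedSession setup (SQLAlchemy 1.4+ style imports; the model, database URL, and teardown hook are placeholders):

from sqlalchemy import create_engine, Column, Integer, String
from sqlalchemy.orm import sessionmaker, scoped_session, declarative_base

Base = declarative_base()

class User(Base):
    __tablename__ = 'users'
    id = Column(Integer, primary_key=True)
    name = Column(String)

engine = create_engine('postgresql://user:secret@localhost/mydb')   # placeholder URL
Base.metadata.create_all(engine)
Session = scoped_session(sessionmaker(bind=engine))

def add_user(name):
    session = Session()               # each thread gets its own session, no locking needed
    session.add(User(name=name))
    session.commit()

# In Flask, remove the session when the request/app context ends:
# @app.teardown_appcontext
# def cleanup(exc=None):
#     Session.remove()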
If you create your own long-lived objects within your application, which you most likely don't need to do, and if those objects communicate or share state with request handling code, then you must take appropriate precautions to avoid synchronization issues (race conditions, deadlocks, use of libraries that are not thread-safe, etc.)