Twisted: Creating a ThreadPool and then daemonizing leads to uninformative hangs - python

I am developing a networked application in Twisted, part of which consists of a web interface written in Django.
I wish to use Twisted's WSGI server to host the web interface, and I've written a working "tap" plugin to allow me to use twistd.
When running the server with the -n flag (don't daemonize) everything works fine, but when this flag is removed the server doesn't respond to requests at all, and there are no messages logged (though the server is still running).
There is a bug on Twisted's Trac which seems to describe the problem exactly, and my plugin happens to be based on the code referenced in the ticket.
Unfortunately, the issue hasn't been fixed - and it was raised almost a year ago.
I have attempted to create a ThreadPoolService class, which extends Service and starts a given ThreadPool when startService is called:
from twisted.application import service

class ThreadPoolService(service.Service):
    def __init__(self, pool):
        self.pool = pool

    def startService(self):
        super(ThreadPoolService, self).startService()
        self.pool.start()

    def stopService(self):
        super(ThreadPoolService, self).stopService()
        self.pool.stop()
However, Twisted doesn't seem to be calling the startService method at all. I think the problem is that with a "tap" plugin, the ServiceMaker can only return one service to be started - and any others belonging to the same application aren't started. Obviously, I am returning a TCPServer service which contains the WSGI root.
At this point, I've hit a bit of a brick wall. Does anyone have any ideas as to how I can work around this issue?

Return a MultiService from your ServiceMaker; one that includes your ThreadPoolService as well as your main application service. The API for assembling such a thing is pretty straightforward:
from twisted.application.internet import TCPServer
from twisted.application.service import MultiService

multi = MultiService()
mine = TCPServer(...)  # your existing application service
threads = ThreadPoolService(pool)  # the service from your question, wrapping your thread pool
mine.setServiceParent(multi)
threads.setServiceParent(multi)
return multi
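For concreteness, here is a minimal sketch of how a tap plugin's makeService might assemble these pieces; the port number and the wsgi_application object are placeholders, not taken from your original plugin:

from twisted.application import internet, service
from twisted.internet import reactor
from twisted.python.threadpool import ThreadPool
from twisted.web import server
from twisted.web.wsgi import WSGIResource

def makeService(options):
    multi = service.MultiService()

    # Start and stop the WSGI thread pool with the service hierarchy.
    pool = ThreadPool()
    ThreadPoolService(pool).setServiceParent(multi)

    # wsgi_application is a placeholder for your Django WSGI handler.
    root = WSGIResource(reactor, pool, wsgi_application)
    internet.TCPServer(8080, server.Site(root)).setServiceParent(multi)

    return multi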
Given that you've already found the ticket for dealing with this confusing issue within Twisted, I look forward to seeing your patch :).

Related

Connect Flask server with many threads to single server-side thread [duplicate]

This question already has answers here:
Store large data or a service connection per Flask session
Are global variables thread-safe in Flask? How do I share data between requests?
I have written a Python script on a Raspberry Pi that controls a large and somewhat complex piece of hardware. Currently the user interface is a Python console: I run the Python program, and commands can be entered from the console with input("> ").
Now I want to add a web interface to this program, and I'm trying to figure out the right way to do that. I've written a basic web UI in Flask, but I haven't found a good way to connect the Flask interface with the main script. Because the server controls a single piece of hardware, the main object (handling the hardware control) is only instantiated once. In addition, there is significant hardware configuration that must happen each time the script is run (each time the main object is created).
Flask seems to create a new thread for each client session, and only persists variables across that session. How can I link Flask events (i.e., a web user pressing a button) to a method in the main object?
Am I using the wrong tool for the job? What is the right tool? What is the correct way to do this? Alternatively, what is a jank way to do this?
EDIT: The linked questions are similar, and led me down the path to the answer below, but they didn't really answer the question so much as point to some code that solved that specific example. I think the answer below is far more helpful. (Of course I'm a bit biased; I wrote it.)
I found a solution, so I'll share it here. (I guess I can't answer my own questions?)
The Flask web app system (and all WSGI web apps?) relies on the principle that the web app can be executed in a totally fresh environment, without needing to be passed objects on creation. I guess this makes sense for web apps, but it adds a ton of annoying complexity to web UIs, that is, web interfaces purpose-built to interface with a single instance of a larger program. As a hardware/solutions engineer, I tend to need a web UI to control a single piece of hardware. If there is something better than Flask for this purpose please leave a comment. But Flask is nice, and I now know it's possible to use for this purpose.
Flask Web Apps are not designed to do the "heavy lifting" themselves. They maintain very limited state (i.e. every request triggers a fresh context), and as mentioned can't be passed references to other objects. In a proper Web App a full stack developer would connect Flask to a database server and other such WebDev-y systems. For controlling hardware, our goal is to trigger the execution of some arbitrary methods in another python process (quite possibly a separately executed process).
In Python a convenient way to achieve this execution is the multiprocessing.managers module. Managers are a tool that allows you to easily construct proxies, which link objects across processes. If you have an object bar = Bar(), you can produce an <AutoProxy[get_bar]> proxy that allows you to manipulate the original bar object from far away. "Far away" can be in a child process, or on another computer across the internet. Here's an example.
server.py:
import multiprocessing.managers as m
import logging

logger = logging.getLogger()

class Bar:  # First we set up a dummy class for this example.
    def __init__(self):
        self.text = ""

    def read(self):
        return str(self.text)

    def write(self, string):
        self.text = str(string)
        logger.error("Wrote!")
        logger.error(string)

bar = Bar()  # On the server side we create an instance: this is the object we want to share with other processes.
m.BaseManager.register('get_bar', callable=lambda: bar)  # Then we register a 'get' function in the manager,
# to retrieve our object from afar. The lambda: bar is just shorthand for a function that returns the bar object.
manager = m.BaseManager(address=('', 50000), authkey=b'abc')  # Then we set up the server on port 50000, with a password.
server = manager.get_server()
server.serve_forever()  # Then we start the server!
client.py:
import multiprocessing.managers as m
import logging

logger = logging.getLogger()

m.BaseManager.register('get_bar')  # We register this so that the Manager knows it's a valid method
manager = m.BaseManager(address=('', 50000), authkey=b'abc')  # Then we set up the server connection
manager.connect()  # and connect!
bar = manager.get_bar()  # Now we can use our 'get' method to retrieve a proxy of the object.
So there are a few interesting things to note here. First, we can get_bar() from as many clients as we want, and they'll all point back to the same bar. Second, we can call methods in bar, read() and write(), from a client, without having the Bar class on hand. Pretty neat.
So how to use this? If you have a console-enabled program, first split it into two parts, the console, and the functionality it controls. Have that functionality boxed up in a handful of objects that together comprise the instance of the application. Modify server.py above to instantiate those objects, and host them out with get methods in a Manager. Then tweak your console interface to connect like client.py and use the proxies instead of the actual objects. Finally, set up a flask app that also connects to the server and provides a web interface.
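To make that last step concrete, here is a minimal sketch of the Flask side, assuming the server.py above is already running; the routes are illustrative, not part of the original answer:

import multiprocessing.managers as m
from flask import Flask, request

m.BaseManager.register('get_bar')
manager = m.BaseManager(address=('', 50000), authkey=b'abc')
manager.connect()
bar = manager.get_bar()  # proxy to the single Bar instance in the server process

app = Flask(__name__)

@app.route('/read')
def read():
    return bar.read()

@app.route('/write', methods=['POST'])
def write():
    bar.write(request.form['text'])
    return 'OK'

if __name__ == '__main__':
    app.run()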
Now you're a bona fide web developer! RIP.

Can I have Python code continue executing after I call Flask app.run?

I have just started with Python, although I have been programming in other languages over the past 30 years. I wanted to keep my first application simple, so I started out with a little home automation project hosted on a Raspberry Pi.
I got my code to work fine (controlling a valve, reading a flow sensor and showing some data on a display), but when I wanted to add some web interactivity it came to a sudden halt.
Most articles I have found on the subject suggest to use the Flask framework to compose dynamic web pages. I have tried, and understood, the basics of Flask, but I just can't get around the issue that Flask is blocking once I call the "app.run" function. The rest of my python code waits for Flask to return, which never happens. I.e. no more water flow measurement, valve motor steering or display updating.
So, my basic question would be: What tool should I use in order to serve a simple dynamic web page (with very low load, like 1 request / week), in parallel to my applications main tasks (GPIO/Pulse counting)? All this in the resource constrained environment of a Raspberry Pi (3).
If you still suggest Flask (because it seems very close to target), how should I arrange my code to keep handling the real-world events, such as mentioned above?
(This last part might be tough answering without seeing the actual code, but maybe it's possible answering it in a "generic" way? Or pointing to existing examples that I might have missed while searching.)
You're on the right track with multithreading. If your monitoring code runs in a loop, you could define a function like
def monitoring_loop():
    while True:
        pass  # do the monitoring
Then, before you call app.run(), start a thread that runs that function:
import threading
from wherever import monitoring_loop

monitoring_thread = threading.Thread(target=monitoring_loop)
monitoring_thread.start()
# app.run() and whatever else you want to do
Don't join the thread - you want it to keep running in parallel to your Flask app. If you joined it, it would block the main execution thread until it finished, which would be never, since it's running a while True loop.
To communicate between the monitoring thread and the rest of the program, you could use a queue to pass messages in a thread-safe way between them.
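For example, a minimal sketch of the queue idea, with a timestamp standing in for a real sensor reading:

import queue
import threading
import time

readings = queue.Queue()

def monitoring_loop():
    while True:
        readings.put(time.time())  # stand-in for a real sensor reading
        time.sleep(1)

threading.Thread(target=monitoring_loop, daemon=True).start()

def latest_reading():
    # Drain the queue without blocking; callable from a Flask handler.
    value = None
    while not readings.empty():
        value = readings.get_nowait()
    return value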
The way I would probably handle this is to split your program into two distinct separately running programs.
One program handles the GPIO monitoring and communication, and the other program is your small Flask server. Since they run as separate processes, they won't block each other.
You can have the two processes communicate through a small database. The GPIO interface can periodically record flow measurements or other relevant data to a table in the database. It can also monitor another table in the database that might serve as a queue for requests.
Your Flask instance can query that same database to get the current statistics to return to the user, and can submit entries to the requests queue based on user input. (If the GPIO process updates that requests queue with the current status, the Flask process can report that back out.)
And as far as what kind of database to use on a little Raspberry Pi, consider sqlite3, a very small, lightweight, file-based database that is well supported in Python's standard library. (It doesn't require running a full "database server" process.)
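A minimal sketch of that arrangement, with an invented schema (the answer doesn't specify one); the two halves would live in the two separate programs:

import sqlite3
import time

conn = sqlite3.connect('monitor.db')
conn.execute('CREATE TABLE IF NOT EXISTS measurements (ts REAL, flow REAL)')
conn.execute('CREATE TABLE IF NOT EXISTS requests '
             '(id INTEGER PRIMARY KEY, command TEXT, done INTEGER DEFAULT 0)')
conn.commit()

# In the GPIO program: periodically record a reading.
conn.execute('INSERT INTO measurements VALUES (?, ?)', (time.time(), 1.23))
conn.commit()

# In the Flask program: report the latest reading and queue a command.
latest = conn.execute('SELECT flow FROM measurements ORDER BY ts DESC LIMIT 1').fetchone()
conn.execute('INSERT INTO requests (command) VALUES (?)', ('open_valve',))
conn.commit()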
Good luck with your project, it sounds like fun!
Hi, I was trying the connection with dronekit_sitl and I got the same issue: after 30 seconds the connection was closed. To get around that, there are two solutions:
Use the before_request decorator: you define a method that will handle the connection before each request.
Use the before_first_request decorator: in this case the connection will be made once, when the first request is called, and then you can handle the object in the other routes using a global variable.
For more information: https://pythonise.com/series/learning-flask/python-before-after-request

Run code on first Django start

I have a Django application that displays a webpage with data from a model, based on the primary key passed in the URL. This all works fine, and the Django component is working perfectly for the most part.
My question, though (and I have tried multiple methods, such as using an AppConfig), is how I can make it so that when the Django server boots up, code is called that creates a separate thread, which then monitors an external source, logging valid data from that source as a model in the database.
I have the threading code written, along with the section that creates the model and saves it in the database. My issue, though, is that if I try to use an AppConfig to create the thread which would then handle the code, I get a django.core.exceptions.AppRegistryNotReady: Apps aren't loaded yet. error and the server does not boot up.
Where would be appropriate to place the code? Is my approach incorrect to the matter?
Trying to use threading to get around blocking processes like web servers is an exercise in pain. I've done it before and it's fragile and often yields unpredictable results.
A much easier idea is to create a separate worker that runs in a totally different process that you start separately. It would have the same database access and could even use your Django models. This is how hosts like Heroku approach this problem. It comes with the added benefit of being able to be tested separately and doesn't need to run at all while you're working on your main Django application.
These days, with a multitude of virtualization options like Vagrant and containerization options like Docker, running parallel processes and workers is trivial. In the wild they may literally be running on separate servers with your database on yet another server. As was mentioned in the comments, starting a worker process could easily be delegated to a separate Django management command. This, in turn, can be fairly easily turned into separate worker processes by gunicorn on your web server.
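As a sketch, such a management command might look like this, assuming it lives at yourapp/management/commands/runworker.py; the polling body is a stand-in:

import time

from django.core.management.base import BaseCommand

class Command(BaseCommand):
    help = 'Monitor the external source and record valid data'

    def handle(self, *args, **options):
        while True:
            # Poll the external source and save results via your models;
            # this stand-in just sleeps between polls.
            time.sleep(5)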

Controlling a Twisted Server from Django

I'm trying to build a Twisted/Django mashup that will let me control various client connections managed by a Twisted server via Django's admin interface. Meaning, I want to be able to login to Django's admin and see what protocols are currently in use, any details specific to each connection (e.g. if the server is connected to freenode via IRC, it should list all the channels currently connected to), and allow me to disconnect or connect new clients by modifying or creating database records.
What would be the best way to do this? There are lots of posts out there about combining Django with Twisted, but I haven't found any prior art for doing quite what I've outlined. All the Twisted examples I've seen use hardcoded connection parameters, which makes it difficult for me to imagine how I would dynamically run reactor.connectTCP(...) or loseConnection(...) when signalled by a record in the database.
My strategy is to create a custom ClientFactory that simply polls the Django-managed database every N seconds for any commands, and modifies/creates/deletes connections as appropriate, reflecting the new status in the database when complete.
Does this seem feasible? Is there a better approach? Does anyone know of any existing projects that implement similar functionality?
Polling the database is lame, but unfortunately, databases rarely have good tools (and certainly there are no database-portable tools) for monitoring changes. So your approach might be okay.
However, if your app is in Django and you're not supporting random changes to the database from other (non-Django) clients, and your WSGI container is Twisted, then you can do this very simply by doing callFromThread(connectTCP, ...).
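A minimal sketch of that call pattern, with a bare ClientFactory standing in for your real protocol factory:

from twisted.internet import protocol, reactor

def connect_from_view(host, port):
    # callFromThread schedules the call on the reactor thread, so this is
    # safe to invoke from a Django view running in the WSGI thread pool.
    factory = protocol.ClientFactory()  # stand-in for your real factory
    reactor.callFromThread(reactor.connectTCP, host, port, factory)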
I've been working on yet another way of combining Django and Twisted. Feel free to give it a try: https://github.com/kowalski/featdjango.
The way it works is slightly different from the others. It starts a Twisted application and HTTP site. The requests made to Django are processed inside a special thread pool. What makes it special is that these threads can wait on a Deferred, which makes it easy to combine synchronous Django application code with asynchronous Twisted code.
The reason I came up with a structure like this is that my application needs to perform a lot of HTTP requests from inside the Django views. Instead of performing them one by one, I can delegate all of them at once to "the main application thread" which runs Twisted, and wait for them. The similarity to your problem is that I also have an asynchronous component which is a singleton, and I access it from Django views.
This is, for example, how you would initiate the Twisted component and later get a reference to it from the view.
import threading

from django.conf import settings

_initiate_lock = threading.Lock()

def get_component():
    global _initiate_lock
    if not hasattr(settings, 'YOUR_CLIENT'):
        _initiate_lock.acquire()
        try:
            # Another thread might have done our job while we
            # were waiting for the lock.
            if not hasattr(settings, 'YOUR_CLIENT'):
                client = YourComponent(**whatever)
                threading.current_thread().wait_for_deferred(
                    client.initiate)
                settings.YOUR_CLIENT = client
        finally:
            _initiate_lock.release()
    return settings.YOUR_CLIENT
The code above initiates my client and calls the initiate method on it. This method is asynchronous and returns a Deferred. I do all the necessary setup in there. The Django thread will wait for it to finish before proceeding to the next line.
This is how I do it, because I only access it from the request handler. You probably would want to initiate your component at startup, to call listenTCP|SSL. Then your Django request handlers could get the data about the connections just by accessing some public methods on your client. These methods could even return a Deferred, in which case you should use .wait_for_deferred() to call them.

Communicating with a running python daemon

I wrote a small Python application that runs as a daemon. It utilizes threading and queues.
I'm looking for general approaches to altering this application so that I can communicate with it while it's running. Mostly I'd like to be able to monitor its health.
In a nutshell, I'd like to be able to do something like this:
python application.py start # launches the daemon
Later, I'd like to be able to come along and do something like:
python application.py check_queue_size # return info from the daemonized process
To be clear, I don't have any problem implementing the Django-inspired syntax. What I don't have any idea how to do is to send signals to the daemonized process (start), or how to write the daemon to handle and respond to such signals.
Like I said above, I'm looking for general approaches. The only one I can see right now is telling the daemon to constantly log everything that might be needed to a file, but I hope there's a less messy way to go about it.
UPDATE: Wow, a lot of great answers. Thanks so much. I think I'll look at both Pyro and the web.py/Werkzeug approaches, since Twisted is a little more than I want to bite off at this point. The next conceptual challenge, I suppose, is how to go about talking to my worker threads without hanging them up.
Thanks again.
Yet another approach: use Pyro (Python remoting objects).
Pyro basically allows you to publish Python object instances as services that can be called remotely. I have used Pyro for the exact purpose you describe, and I found it to work very well.
By default, a Pyro server daemon accepts connections from everywhere. To limit this, either use a connection validator (see documentation), or supply host='127.0.0.1' to the Daemon constructor to only listen for local connections.
Example code taken from the Pyro documentation:
Server
import Pyro.core

class JokeGen(Pyro.core.ObjBase):
    def __init__(self):
        Pyro.core.ObjBase.__init__(self)

    def joke(self, name):
        return "Sorry " + name + ", I don't know any jokes."

Pyro.core.initServer()
daemon = Pyro.core.Daemon()  # pass host='127.0.0.1' to accept only local connections
uri = daemon.connect(JokeGen(), "jokegen")

print "The daemon runs on port:", daemon.port
print "The object's uri is:", uri

daemon.requestLoop()
Client
import Pyro.core

# You have to change the URI below to match your own host/port.
jokes = Pyro.core.getProxyForURI("PYROLOC://localhost:7766/jokegen")

print jokes.joke("Irmen")
Another similar project is RPyC. I have not tried RPyC.
What about having it run an HTTP server? It seems crazy, but running a simple web server for administering your server requires just a few lines using web.py.
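For instance, a minimal sketch of a status endpoint in web.py; the queue is a stand-in for the daemon's real state:

import queue

import web

work_queue = queue.Queue()  # stand-in for the daemon's real work queue

urls = ('/status', 'Status')

class Status:
    def GET(self):
        return "queue size: %d" % work_queue.qsize()

app = web.application(urls, globals())

if __name__ == '__main__':
    app.run()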
You can also consider creating a unix pipe.
Use werkzeug and make your daemon include an HTTP-based WSGI server.
Your daemon has a collection of small WSGI apps to respond with status information.
Your client simply uses urllib2 to make POST or GET requests to localhost:somePort. Your client and server must agree on the port number (and the URLs).
This is very simple to implement and very scalable. Adding new commands is a trivial exercise.
Note that your daemon does not have to respond in HTML (that's often simple, though). Our daemons respond to the WSGI-requests with JSON-encoded status objects.
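A minimal sketch of that idea with werkzeug; the status payload is illustrative:

import json

from werkzeug.serving import run_simple
from werkzeug.wrappers import Request, Response

@Request.application
def status_app(request):
    status = {'queue_size': 0, 'healthy': True}  # stand-in status object
    return Response(json.dumps(status), mimetype='application/json')

if __name__ == '__main__':
    run_simple('localhost', 8625, status_app)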
I would use twisted with a named pipe or just open up a socket. Take a look at the echo server and client examples. You would need to modify the echo server to check for some string passed by the client and then respond with whatever requested info.
Because of Python's threading issues, you are going to have trouble responding to information requests while simultaneously continuing to do whatever the daemon is meant to do anyway. Asynchronous techniques or forking other processes are your only real options.
# your server
from twisted.web import xmlrpc, server
from twisted.internet import reactor

class MyServer(xmlrpc.XMLRPC):
    def xmlrpc_monitor(self, params):
        return server_related_info  # replace with your daemon's status data

if __name__ == '__main__':
    r = MyServer()
    reactor.listenTCP(8080, server.Site(r))
    reactor.run()
A client can be written using xmlrpclib.
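As a sketch (the original answer linked to example code), a matching client might look like this; xmlrpclib is xmlrpc.client in Python 3:

import xmlrpclib

proxy = xmlrpclib.ServerProxy('http://localhost:8080/')
print(proxy.monitor('status'))  # calls xmlrpc_monitor(params='status') on the server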
Assuming you're under *nix, you can send signals to a running program with kill from a shell (and analogs in many other environments). To handle them from within python check out the signal module.
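A minimal sketch of handling such a signal inside the daemon, assuming SIGUSR1 is chosen to trigger a health report:

import signal

def report_health(signum, frame):
    # Log or print whatever health information the daemon tracks.
    print('daemon is alive')

signal.signal(signal.SIGUSR1, report_health)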
You could do this with Pyro (http://pythonhosted.org/Pyro4/), the Python Remote Objects library. It lets you remotely access Python objects. It's easy to implement, has low overhead, and isn't as invasive as Twisted.
You can do this using multiprocessing managers (https://docs.python.org/3/library/multiprocessing.html#managers):
Managers provide a way to create data which can be shared between different processes, including sharing over a network between processes running on different machines. A manager object controls a server process which manages shared objects. Other processes can access the shared objects by using proxies.
Example server:
from multiprocessing.managers import BaseManager

class RemoteOperations:
    def add(self, a, b):
        print('adding in server process!')
        return a + b

    def multiply(self, a, b):
        print('multiplying in server process!')
        return a * b

class RemoteManager(BaseManager):
    pass

RemoteManager.register('RemoteOperations', RemoteOperations)
manager = RemoteManager(address=('', 12345), authkey=b'secret')
manager.get_server().serve_forever()
Example client:
from multiprocessing.managers import BaseManager

class RemoteManager(BaseManager):
    pass

RemoteManager.register('RemoteOperations')
manager = RemoteManager(address=('localhost', 12345), authkey=b'secret')
manager.connect()

remoteops = manager.RemoteOperations()
print(remoteops.add(2, 3))
print(remoteops.multiply(2, 3))
