I wrote a small Python application that runs as a daemon. It utilizes threading and queues.
I'm looking for general approaches to altering this application so that I can communicate with it while it's running. Mostly I'd like to be able to monitor its health.
In a nutshell, I'd like to be able to do something like this:
python application.py start # launches the daemon
Later, I'd like to be able to come along and do something like:
python application.py check_queue_size # return info from the daemonized process
To be clear, I don't have any problem implementing the Django-inspired syntax. What I don't have any idea how to do is to send signals to the daemonized process (start), or how to write the daemon to handle and respond to such signals.
Like I said above, I'm looking for general approaches. The only one I can see right now is telling the daemon constantly log everything that might be needed to a file, but I hope there's a less messy way to go about it.
UPDATE: Wow, a lot of great answers. Thanks so much. I think I'll look at both Pyro and the web.py/Werkzeug approaches, since Twisted is a little more than I want to bite off at this point. The next conceptual challenge, I suppose, is how to go about talking to my worker threads without hanging them up.
Thanks again.
Yet another approach: use Pyro (Python remoting objects).
Pyro basically allows you to publish Python object instances as services that can be called remotely. I have used Pyro for the exact purpose you describe, and I found it to work very well.
By default, a Pyro server daemon accepts connections from everywhere. To limit this, either use a connection validator (see documentation), or supply host='127.0.0.1' to the Daemon constructor to only listen for local connections.
Example code taken from the Pyro documentation:
Server
import Pyro.core
class JokeGen(Pyro.core.ObjBase):
def __init__(self):
Pyro.core.ObjBase.__init__(self)
def joke(self, name):
return "Sorry "+name+", I don't know any jokes."
Pyro.core.initServer()
daemon=Pyro.core.Daemon()
uri=daemon.connect(JokeGen(),"jokegen")
print "The daemon runs on port:",daemon.port
print "The object's uri is:",uri
daemon.requestLoop()
Client
import Pyro.core
# you have to change the URI below to match your own host/port.
jokes = Pyro.core.getProxyForURI("PYROLOC://localhost:7766/jokegen")
print jokes.joke("Irmen")
Another similar project is RPyC. I have not tried RPyC.
What about having it run an http server?
It seems crazy but running a simple web server for administrating your
server requires just a few lines using web.py
You can also consider creating a unix pipe.
Use werkzeug and make your daemon include an HTTP-based WSGI server.
Your daemon has a collection of small WSGI apps to respond with status information.
Your client simply uses urllib2 to make POST or GET requests to localhost:somePort. Your client and server must agree on the port number (and the URL's).
This is very simple to implement and very scalable. Adding new commands is a trivial exercise.
Note that your daemon does not have to respond in HTML (that's often simple, though). Our daemons respond to the WSGI-requests with JSON-encoded status objects.
I would use twisted with a named pipe or just open up a socket. Take a look at the echo server and client examples. You would need to modify the echo server to check for some string passed by the client and then respond with whatever requested info.
Because of Python's threading issues you are going to have trouble responding to information requests while simultaneously continuing to do whatever the daemon is meant to do anyways. Asynchronous techniques or forking another processes are your only real option.
# your server
from twisted.web import xmlrpc, server
from twisted.internet import reactor
class MyServer(xmlrpc.XMLRPC):
def xmlrpc_monitor(self, params):
return server_related_info
if __name__ == '__main__':
r = MyServer()
reactor.listenTCP(8080, Server.Site(r))
reactor.run()
client can be written using xmlrpclib, check example code here.
Assuming you're under *nix, you can send signals to a running program with kill from a shell (and analogs in many other environments). To handle them from within python check out the signal module.
You could associate it with Pyro (http://pythonhosted.org/Pyro4/) the Python Remote Object. It lets you remotely access python objects. It's easily to implement, has low overhead, and isn't as invasive as Twisted.
You can do this using multiprocessing managers (https://docs.python.org/3/library/multiprocessing.html#managers):
Managers provide a way to create data which can be shared between different processes, including sharing over a network between processes running on different machines. A manager object controls a server process which manages shared objects. Other processes can access the shared objects by using proxies.
Example server:
from multiprocessing.managers import BaseManager
class RemoteOperations:
def add(self, a, b):
print('adding in server process!')
return a + b
def multiply(self, a, b):
print('multiplying in server process!')
return a * b
class RemoteManager(BaseManager):
pass
RemoteManager.register('RemoteOperations', RemoteOperations)
manager = RemoteManager(address=('', 12345), authkey=b'secret')
manager.get_server().serve_forever()
Example client:
from multiprocessing.managers import BaseManager
class RemoteManager(BaseManager):
pass
RemoteManager.register('RemoteOperations')
manager = RemoteManager(address=('localhost', 12345), authkey=b'secret')
manager.connect()
remoteops = manager.RemoteOperations()
print(remoteops.add(2, 3))
print(remoteops.multiply(2, 3))
Related
This question already has answers here:
Store large data or a service connection per Flask session
(1 answer)
Are global variables thread-safe in Flask? How do I share data between requests?
(4 answers)
Closed 4 years ago.
I'm have written a python script on a Raspberry Pi that controls a large and somewhat complex piece of hardware. Currently the use interface is a python console. I run the python program, and the can enter commands from the console with input("> ").
Now I want to add a web interface to this program, and I'm trying to figure out the right way to do that. I've written a basic web UI in flask, but I haven't found a good way to connect the flask interface with the main script. Because the server controls a single piece of hardware, the main object (handling the hardware control) is only instantiated once. In addition, there is significant hardware configuration that must happen each time the script is run (each time the main object is created.)
Flask seems to create a new thread for each client session, and only persist variable across that session. How can I link flask events (i.e, web user presses a button) to a method in the main object?
Am I using the wrong tool for the job? What is the right tool? What is the correct way to do this? Alternatively, what is a jank way to do this?
EDIT: The linked questions are similar, and led me down the path to the answer below, but they really didn't answer the question, so much as point to some code that solved that specific example. I think the below answer is far more helpful. (Of course I'm a bit biased; I wrote it)
I found a solution, so I'll share it here. (I guess I can't answer my own questions?)
The Flask Web App system (and all WSGI web apps?) rely on the principle that the web app can be executed in a totally fresh environment, without needing to be passed objects on creation. I guess this makes sense for Web Apps, but it adds a ton of annoying complexity to Web UIs, that is, web interfaces purpose built to interface a single instance of a larger program. As a hardware/solutions engineer, I tend to need a Web UI to control a single piece of hardware. If there is something better than flask for this purpose please leave a comment. But flask is nice, and I now know its possible to use for this purpose.
Flask Web Apps are not designed to do the "heavy lifting" themselves. They maintain very limited state (i.e. every request triggers a fresh context), and as mentioned can't be passed references to other objects. In a proper Web App a full stack developer would connect Flask to a database server and other such WebDev-y systems. For controlling hardware, our goal is to trigger the execution of some arbitrary methods in another python process (quite possibly a separately executed process).
In python a convenient way to achieve this execution is the multiprocessing.managers module. Managers are a tool that allows you to easily construct Proxies, which link objects across processes. If you have an object bar = Bar() you can produce a <AutoProxy[get_bar]> proxy, that allows you to manipulate the original bar object from far away. Far away can be in a child process, on another computer across the internet. Here's an example.
server.py:
import multiprocessing.managers as m
import logging
logger = logging.getLogger()
class Bar: #First we setup a dummy class for this example.
def __init__(self):
self.text = ""
def read(self):
return str(self.text)
def write(self, string):
self.text = str(string)
logger.error("Wrote!")
logger.error(string)
bar = Bar() #On the server side we create an instance: this is the object we want to share to other processes.
m.BaseManager.register('get_bar', callable=lambda:bar) #then we register a 'get' function in the manager,
# to retrieve our object from afar. the lambda:bar is just shorthand for a function that returns the bar object.
manager = m.BaseManager(address=('', 50000), authkey=b'abc') #Then we setup the server on port 50000, with a password.
server = manager.get_server()
server.serve_forever() #Then we start the server!
client.py:
import multiprocessing.managers as m
import logging
logger = logging.getLogger()
m.BaseManager.register('get_bar') #We register this so that the Manager knows it's a valid method
manager = m.BaseManager(address=('', 50000), authkey=b'abc') #Then we setup the server connection
manager.connect() #and connect!
bar = manager.get_bar() # now we can use our 'get' method to retrieve a Proxy of the object.
So there are a few interesting things to note here. First, we can get_bar() from as many clients as we want, and they'll all point back to the same bar. Second, we can call methods in bar, read() and write(), from a client, without having the Bar class on hand. Pretty neat.
So how to use this? If you have a console-enabled program, first split it into two parts, the console, and the functionality it controls. Have that functionality boxed up in a handful of objects that together comprise the instance of the application. Modify server.py above to instantiate those objects, and host them out with get methods in a Manager. Then tweak your console interface to connect like client.py and use the proxies instead of the actual objects. Finally, set up a flask app that also connects to the server and provides a web interface.
Now you're a Bona-Fide Web Developer! RIP.
I'm writing application that uses python Twisted API (namely WebSocketClientProtocol, WebSocketClientFactory, ReconnectiongClientFactory. I want to wrap client factory into reader with following interface
class Reader:
def start(self):
pass
def stop(self):
pass
Start function will be used to open connection (i.e. connect on ws api and start reading data), while stop will stop such connection.
My issue is that if I use reactor.run() inside start, connection starts and everything is OK, but my code never goes pass that line (looks like blocking call to me) and I cannot execute subsequent lines (include .stop in my tests).
I have tried using variants such as reactor.callFromThread(reactor.run) and reactor.callFromThread(reactor.stop) or even excplicity calling Thread(target=...) but none seems to work (they usually don't build protocol or open connection at all).
Any help or guidelines on how to implement Reader.start and Reader.stop are welcome.
If you put reactor.run inside Reader.start then Reader will be a difficult component to use alongside other code. Your difficulties are just the first symptom of this.
Calling reactor.run and reactor.stop are the job of code responsible for managing the lifetime of your application. Put those calls somewhere separate from your WebSocket application code. For example:
r = Reader()
r.start()
reactor.run()
Or better yet, implement a twist(d) plugin and let twist(d) manage the reactor for you.
My Flask application will receive a request, do some processing, and then make a request to a slow external endpoint that takes 5 seconds to respond. It looks like running Gunicorn with Gevent will allow it to handle many of these slow requests at the same time. How can I modify the example below so that the view is non-blocking?
import requests
#app.route('/do', methods = ['POST'])
def do():
result = requests.get('slow api')
return result.content
gunicorn server:app -k gevent -w 4
If you're deploying your Flask application with gunicorn, it is already non-blocking. If a client is waiting on a response from one of your views, another client can make a request to the same view without a problem. There will be multiple workers to process multiple requests concurrently. No need to change your code for this to work. This also goes for pretty much every Flask deployment option.
First a bit of background, A blocking socket is the default kind of socket, once you start reading your app or thread does not regain control until data is actually read, or you are disconnected. This is how python-requests, operates by default. There is a spin off called grequests which provides non blocking reads.
The major mechanical difference is that send, recv, connect and accept
can return without having done anything. You have (of course) a number
of choices. You can check return code and error codes and generally
drive yourself crazy. If you don’t believe me, try it sometime
Source: https://docs.python.org/2/howto/sockets.html
It also goes on to say:
There’s no question that the fastest sockets code uses non-blocking
sockets and select to multiplex them. You can put together something
that will saturate a LAN connection without putting any strain on the
CPU. The trouble is that an app written this way can’t do much of
anything else - it needs to be ready to shuffle bytes around at all
times.
Assuming that your app is actually supposed to do something more than
that, threading is the optimal solution
But do you want to add a whole lot of complexity to your view by having it spawn it's own threads. Particularly when gunicorn as async workers?
The asynchronous workers available are based on Greenlets (via
Eventlet and Gevent). Greenlets are an implementation of cooperative
multi-threading for Python. In general, an application should be able
to make use of these worker classes with no changes.
and
Some examples of behavior requiring asynchronous workers: Applications
making long blocking calls (Ie, external web services)
So to cut a long story short, don't change anything! Just let it be. If you are making any changes at all, let it be to introduce caching. Consider using Cache-control an extension recommended by python-requests developers.
You can use grequests. It allows other greenlets to run while the request is made. It is compatible with the requests library and returns a requests.Response object. The usage is as follows:
import grequests
#app.route('/do', methods = ['POST'])
def do():
result = grequests.map([grequests.get('slow api')])
return result[0].content
Edit: I've added a test and saw that the time didn't improve with grequests since gunicorn's gevent worker already performs monkey-patching when it is initialized: https://github.com/benoitc/gunicorn/blob/master/gunicorn/workers/ggevent.py#L65
TL;DR: I have a beautifully crafted, continuously running piece of Python code controlling and reading out a physics experiment. Now I want to add an HTTP API.
I have written a module which controls the hardware using USB. I can script several types of autonomously operating experiments, but I'd like to control my running experiment over the internet. I like the idea of an HTTP API, and have implemented a proof-of-concept using Flask's development server.
The experiment runs as a single process claiming the USB connection and periodically (every 16 ms) all data is read out. This process can write hardware settings and commands, and reads data and command responses.
I have a few problems choosing the 'correct' way to communicate with this process. It works if the HTTP server only has a single worker. Then, I can use python's multiprocessing.Pipe for communication. Using more-or-less low-level sockets (or things like zeromq) should work, even for request/response, but I have to implement some sort of protocol: send {'cmd': 'set_voltage', 'value': 900} instead of calling hardware.set_voltage(800) (which I can use in the stand-alone scripts). I can use some sort of RPC, but as far as I know they all (SimpleXMLRPCServer, Pyro) use some sort of event loop for the 'server', in this case the process running the experiment, to process requests. But I can't have an event loop waiting for incoming requests; it should be reading out my hardware! I googled around quite a bit, but however I try to rephrase my question, I end up with Celery as the answer, which mostly fires off one job after another, but isn't really about communicating with a long-running process.
I'm confused. I can get this to work, but I fear I'll be reinventing a few wheels. I just want to launch my app in the terminal, open a web browser from anywhere, and monitor and control my experiment.
Update: The following code is a basic example of using the module:
from pysparc.muonlab.muonlab_ii import MuonlabII
muonlab = MuonlabII()
muonlab.select_lifetime_measurement()
muonlab.set_pmt1_voltage(900)
muonlab.set_pmt1_threshold(500)
lifetimes = []
while True:
data = muonlab.read_lifetime_data()
if data:
print "Muon decays detected with lifetimes", data
lifetimes.extend(data)
The module lives at https://github.com/HiSPARC/pysparc/tree/master/pysparc/muonlab.
My current implementation of the HTTP API lives at https://github.com/HiSPARC/pysparc/blob/master/bin/muonlab_with_http_api.
I'm pretty happy with the module (with lots of tests) but the HTTP API runs using Flask's single-threaded development server (which the documentation and the internet tells me is a bad idea) and passes dictionaries through a Pipe as some sort of IPC. I'd love to be able to do something like this in the above script:
while True:
data = muonlab.read_lifetime_data()
if data:
print "Muon decays detected with lifetimes", data
lifetimes.extend(data)
process_remote_requests()
where process_remote_requests is a fairly short function to call the muonlab instance or return data. Then, in my Flask views, I'd have something like:
muonlab = RemoteMuonlab()
#app.route('/pmt1_voltage', methods=['GET', 'PUT'])
def get_data():
if request.method == 'PUT':
voltage = request.form['voltage']
muonlab.set_pmt1_voltage(voltage)
else:
voltage = muonlab.get_pmt1_voltage()
return jsonify(voltage=voltage)
Getting the measurement data from the app is perhaps less of a problem, since I could store that in SQLite or something else that handles concurrent access.
But... you do have an IO loop; it runs every 16ms.
You can use BaseHTTPServer.HTTPServer in such a case; just set the timeout attribute to something small. bascially...
class XmlRPCApi:
def do_something(self):
print "doing something"
server = SimpleXMLRPCServer(("localhost", 8000))
server.register_instance(XMLRpcAPI())
server.timeout = 0
while True:
sleep(0.016)
do_normal_thing()
x.handle_request()
Edit: python has a built in server, also built on BaseHTTPServer, capable of serving a flask app. since flask.Flask() happens to be a wsgi compliant application, your process_remote_requests() should look like this:
import wsgiref.simple_server
remote_server = wsgire.simple_server('localhost', 8000, app)
# app here is just your Flask() application!
# as before, set timeout to zero so that you can go right back
# to your event loop if there are no requests to handle
remote_server.timeout = 0
def process_remote_requests():
remote_server.handle_request()
This works well enough if you have only short running requests; but if you need to handle requests that may possibly take longer than your event loop's normal polling interval, or if you need to handle more requests than you have polls per unit of time, then you can't use this approach, exactly.
You don't necessarily need to fork off another process, though, You can potentially get by using a pool of workers in another thread. roughly:
import threading
import wsgiref.simple_server
remote_server = wsgire.simple_server('localhost', 8000, app)
POOL_SIZE = 10 # or some other value.
pool = [threading.Thread(target=remote_server.serve_forever) for dummy in xrange(POOL_SIZE)]
for thread in pool:
thread.daemon = True
thread.start()
while True:
pass # normal experiment processing here; don't handle requests in this thread.
However; this approach has one major shortcoming, you now have to deal with concurrency! It's not safe to manipulate your program state as freely as you could with the above loop, since you might be, concurrently manipulating that same state in the main thread (or another http server thread). It's up to you to know when this is valid, wrapping each resource with some sort of mutex lock or whatever is appropriate.
I am developing a networked application in Twisted, part of which consists of a web interface written in Django.
I wish to use Twisted's WSGI server to host the web interface, and I've written a working "tap" plugin to allow me to use twistd.
When running the server with the -n flag (don't daemonize) everything works fine, but when this flag is removed the server doesn't respond to requests at all, and there are no messages logged (though the server is still running).
There is a bug on Twisted's Trac which seems to describe the problem exactly, and my plugin happens to be based on the code referenced in the ticket.
Unfortunately, the issue hasn't been fixed - and it was raised almost a year ago.
I have attempted to create a ThreadPoolService class, which extends Service and starts a given ThreadPool when startService is called:
class ThreadPoolService(service.Service):
def __init__(self, pool):
self.pool = pool
def startService(self):
super(ThreadPoolService, self).startService()
self.pool.start()
def stopService(self):
super(ThreadPoolService, self).stopService()
self.pool.stop()
However, Twisted doesn't seem to be calling the startService method at all. I think the problem is that with a "tap" plugin, the ServiceMaker can only return one service to be started - and any others belonging to the same application aren't started. Obviously, I am returning a TCPServer service which contains the WSGI root.
At this point, I've hit somewhat of a bit of a brick wall. Does anyone have any ideas as to how I can work round this issue?
Return a MultiService from your ServiceMaker; one that includes your ThreadPoolService as well as your main application service. The API for assembling such a thing is pretty straightforward:
multi = MultiService()
mine = TCPServer(...) # your existing application service
threads = ThreadPoolService()
mine.setServiceParent(multi)
threads.setServiceParent(multi)
return multi
Given that you've already found the ticket for dealing with this confusing issue within Twisted, I look forward to seeing your patch :).