Restructuring program to use asyncio - python

Currently I have a game that does networking synchronously using the socket module.
It is structured like this:
Server:

    while True:
        add_new_clients()
        process_game_state()
        for client in clients:
            send_data(client)
            get_data_from(client)

Client:

    connect_to_server()
    while True:
        get_data_from_server()
        process_game_state()
        draw_to_screen()
        send_input_to_server()
I want to replace the network code with some that uses a higher level module than socket, e.g. asyncio or gevent. However, I don't know how to do this.
All the examples I have seen are structured like this:
class Server:
    def handle_client(self, connection):
        while True:
            input = get_input(connection)
            output = process(input)
            send(connection, output)
and then handle_client being called in parallel, using threads or something, for each client that joins.
This works fine if the clients can be handled separately. However, I still want to keep a game-loop type structure, where processing happens in one place - I don't want to have to check collisions etc. separately for each client. How would I do this?

I assume that you understand how to create a server using a protocol and how the asynchronous paradigm works.
All you need is to break your while-loop down into handlers.
Let's look at the server case and the client case:
Server case
A client (server-side)
You need to create what we call a protocol; it will be used to create the server and will serve as a pattern where each instance = a client:
class ClientProtocol(asyncio.Protocol):
    def connection_made(self, transport):
        # Here we have a new player; the transport represents the socket.
        self.transport = transport

    def data_received(self, data):
        packet = decode_packet(data)  # some function for reading packets
        if packet.opcode == CMSG_MOVE:  # opcode is an operation code.
            self.player.move(packet[0])  # packet[0] is the first "real" piece of data.
            self.transport.write(b"OK YOUR MOVE IS ACCEPTED")  # send back a confirmation (must be bytes).
OK, now you have an idea of how you can handle your clients.
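For reference, here is a minimal sketch of wiring such a protocol into an actual server, using the callback-style (pre-async/await) asyncio API; the host and port are placeholders:

import asyncio

loop = asyncio.get_event_loop()
# each accepted connection gets its own ClientProtocol instance
server = loop.run_until_complete(loop.create_server(ClientProtocol, '0.0.0.0', 8888))
loop.run_forever()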
Game state
After that, you need to process your game state every X ms:
def processGameState():
    # some code...
    eventLoop.call_later(0.1, processGameState)  # every 100 ms, processGameState is called again.
At some point you will call processGameState in your initialization, and it will tell the event loop to call processGameState again 100 ms later (this may not be the ideal way to do it, but it's one idea among others).
As for sending new data to clients, you just need to keep a list of ClientProtocol instances and write to each one's transport with a simple loop, as shown in the sketch below.
The get_data_from call is obviously removed, as we now receive all our data asynchronously in the data_received method of ClientProtocol.
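A rough sketch of that broadcasting idea, extending the protocol from above (the clients list and the broadcast() helper are names I made up for this sketch, not part of asyncio):

clients = []  # all currently connected ClientProtocol instances

class ClientProtocol(asyncio.Protocol):
    def connection_made(self, transport):
        self.transport = transport
        clients.append(self)      # register the new player

    def connection_lost(self, exc):
        clients.remove(self)      # forget disconnected players

def broadcast(state_packet):
    # call this from processGameState() once the new state has been computed
    for client in clients:
        client.transport.write(state_packet)  # state_packet must be bytes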
This is a sketch of how you can refactor your synchronous code into asynchronous code. You may also want to add authentication and some other things. If it's your first time with the asynchronous paradigm, I suggest you try it with Twisted rather than asyncio: Twisted is likely to be better documented and explained everywhere than asyncio (but asyncio is quite similar to Twisted, so you can switch back at any time).
Client case
It's pretty much the same here.
But you may need to pay attention to how you draw and how you manage your input. You may ultimately need one thread to run the input handlers and another thread to draw to the screen at a constant framerate.
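A minimal client-side sketch of the networking part in the same style (ServerConnection, decode_packet and apply_server_update are placeholder names of mine):

class ServerConnection(asyncio.Protocol):
    def connection_made(self, transport):
        self.transport = transport

    def data_received(self, data):
        apply_server_update(decode_packet(data))  # merge the server's update into the local game state

def send_input(connection, input_packet):
    connection.transport.write(input_packet)      # called from your input-handling code

loop = asyncio.get_event_loop()
transport, protocol = loop.run_until_complete(
    loop.create_connection(ServerConnection, 'localhost', 8888))
loop.run_forever()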
Conclusion
Thinking asynchronously is pretty difficult at first.
But it's worth the effort.
Note that even my approach may not be the best or the most suitable for games; it's just how I would do it. Please take your time to test your code and profile it.
Make sure you don't mix synchronous and asynchronous code in the same function without proper handling (deferToThread or similar helpers); that would destroy your game's performance.

Related

(py)zmq/PUB: Is it possible to call connect() then send() immediately and not lose the message?

With this code, I always lose the message :
def publish(frontend_url, message):
    context = zmq.Context()
    socket = context.socket(zmq.PUB)
    socket.connect(frontend_url)
    socket.send(message)
However, if I introduce a short sleep(), I can get the message :
def publish(frontend_url, message):
    context = zmq.Context()
    socket = context.socket(zmq.PUB)
    socket.connect(frontend_url)
    time.sleep(0.1)  # wait for the connection to be established
    socket.send(message)
Is there a way to ensure the message will be delivered without sleeping between the calls to connect() and send() ?
I'm afraid I can't predict the sleep duration (network latencies, etc.)
UPDATE:
Context : I'd like to publish data updates from a Flask REST application to a message broker (eg. on resource creation/update/deletion).
Currently, the message broker is drafted using the 0mq FORWARDER device
I understand 0mq is designed to abstract the TCP sockets and message passing complexities.
In a context where connections are long-lived, I could use it.
However, when running my Flask app in an app container like gunicorn or uwsgi, I have N worker processes and I can't expect the connection nor the process to be long-lived.
As I understand the issue, I should use a real message broker (like RabbitMQ) and use a synchronous client to publish the messages there.
You can't do this exactly, but there may be other solutions that would solve your problem.
Why are you using PUB/SUB sockets? The nature of pub/sub is more suited to long-running sockets, and typically you will bind() on the PUB socket and connect on the SUB socket. What you're doing here, spinning up a socket to send one message, presumably to a "server" of some sort, doesn't really fit the PUB/SUB paradigm very well.
If you instead choose some variation of REQ or DEALER to REP or ROUTER, then things might go smoother for you. A REQ socket will hold a message until its pair is ready to receive it. If you don't particularly care about the response from the "server", then you can just discard it.
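As a rough sketch of the REQ variant (this assumes a REP or ROUTER socket on the other end that answers every request):

import zmq

context = zmq.Context()

def publish(frontend_url, message):
    socket = context.socket(zmq.REQ)
    socket.connect(frontend_url)
    socket.send(message)   # REQ queues the message until the peer connection is ready
    socket.recv()          # wait for (and discard) the acknowledgement
    socket.close()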
Is there any particular reason you aren't just leaving the socket open, instead of building a whole new context and socket, and re-connecting each time you want to send a message? I can think of some limited scenarios where this might be the preferred behavior, but generally it's a better idea to just leave the socket up. If you wanted to stick with PUB/SUB, then just spin the socket up at the start of your app, sleep some safe period of time that covers any reasonable latency scenario, and then start sending your messages without worrying about re-connecting every time. If you'll leave this socket up for long periods of time without any new messages you'll probably want to use heart-beating to make sure the connection stays open.
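If you stay with PUB/SUB, the idea is simply to pay the connection cost once at startup instead of on every call (a sketch; the one-second settle time is an arbitrary guess, not a guarantee):

import time
import zmq

context = zmq.Context()
pub_socket = context.socket(zmq.PUB)
pub_socket.connect(frontend_url)  # once, at application startup
time.sleep(1.0)                   # crude allowance for the subscription handshake

def publish(message):
    pub_socket.send(message)      # no per-call connect, so no per-call race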
From the ZMQ Guide:
There is one more important thing to know about PUB-SUB sockets: you do not know precisely when a subscriber starts to get messages. Even if you start a subscriber, wait a while, and then start the publisher, the subscriber will always miss the first messages that the publisher sends. This is because as the subscriber connects to the publisher (something that takes a small but non-zero time), the publisher may already be sending messages out.
Many posts here start with:
"I used .PUB/.SUB and it did not the job I wanted it to do ... Anyone here, do help me make it work like I think it shall work out of the box."
This approach does not work in the real world, less so in distributed-systems design, and least of all in systems where near-real-time scheduling and/or tight resource management is simply unavoidable.
Inter-process / inter-platform messaging is not "just another" simple-line-of-code (SLOC).
# A sample demo-code snippet          # Issues the demo-code has left to be resolved
#------------------------------------ #------------------------------------------------
def publish( frontend_url, message ): # what is a benefit of a per-call OneStopPUBLISH function?
    context = zmq.Context()           # .Context() has to be .Terminate()-d (!)
    socket = context.socket(zmq.PUB)  # is this indeed "a disposable" for each call?
    socket.connect(frontend_url)      # what transport-class is used for .connect()/.bind()?
    time.sleep(0.1)                   # wait for the connection to be established
    socket.send(message)              # ^ has no control over low-level "connection" handshaking
Anybody may draft a few one-liners and put a decent effort ( own or community outsourced ) to make it finally work ( at least somehow ).
However this is a field of vast capabilities and as such requires a bit of reshaping one's mind to allow its potential to become unlocked and fully utilised.
Sketching a need for a good solution but with wrong grounds or mis-understood SLOC-s ( be it copy/paste-d or not ) typically does not yield anything reasonable for the near, the less for the farther future.
Messaging simply introduces a new paradigm -- a new Macro-COSMOS -- of building automation in wider scale - surprisingly, your (deterministic) code becomes a member of a more complex set of Finite State Automata ( FSA ), that - not so surprisingly, as we intend to do some "MESSAGING" - speak among each other.
For that, there needs to be some [local-resource-management], some "outer" [transport], some "formal behaviour model etiquette" ( not to shout one over another ) [communication-primitive].
This is typically in-built into ZeroMQ, nanomsg and other libraries.
However, there are two important things that remain hidden.
The micro-cosmos of how the things work internally ( many, if not all, attempts to tweak this, instead of making one's best to make proper use of it, are typically waste of time )
The macro-cosmos of how to orchestrate a non-trivial herd of otherwise trivial elements [communication-primitives] into a ROBUST, SCALEABLE messaging ARCHITECTURE, that co-operates across process/localhost/network boundaries and that meets the overall design needs.
Failure to understand the distance between these two worlds typically causes a poor use of the greatest strengths we have received pre-cooked in the messaging libraries.
Simply the best thing to do is to forget the one-liner tweaking approaches. It is not productive.
Understanding the global view first, allows you to harness the powers that will work best for your good to meet your goals.
Why is it so complex?
( diagram courtesy nanomsg.org )
Any non-trivial system is complex. Both in TimeDOMAIN and in ResourcesDOMAIN. The more, if one strives to create a stable, smart, High-performance, Low-latency, transport-class-agnostic Universal Communication Framework.
The good news is, this has been already elaborated and in-built into the micro-cosmos architecture.
The bad news is, this does not solve your needs right from the box ( except a case of some really trivial ones ).
Here we come with the macro-COSMOS design.
It is your responsibility to design a higher-space algorithm, how to make many isolated FSA-primitives converse and find an agreement in accord with the evolving many-to-many conversation. Yes. The library gives you "just" primitive building blocks (very powerful, out of doubt). But it is your responsibility to make the "outer-space" work for your needs.
And this can and typically is complex.
Well, if that would be trivial, then it most probably would have been already included "inside" the library, wouldn't it?
Where to go next?
Perhaps the best next step one may take, IMHO, is a step towards a bit more global view. It may and will sound complicated for the first few things one tries to code with ZeroMQ, but at least jump to page 265 of Pieter Hintjens' book, Code Connected, Volume 1, if not reading it step-by-step from the start.
One can start to realise the way, how it is possible to start "programming" the macro-COSMOS of FSA-primitives, so as to form a higher-order-FSA-of-FSAs, that can and will solve all the ProblemDOMAIN specific issues.
First take a look at the Fig. 60 Republishing Updates and Fig. 62 HA Clone Server pair, and only after that go back to the roots, elements and details.

Building an HTTP API for continuously running python process

TL;DR: I have a beautifully crafted, continuously running piece of Python code controlling and reading out a physics experiment. Now I want to add an HTTP API.
I have written a module which controls the hardware using USB. I can script several types of autonomously operating experiments, but I'd like to control my running experiment over the internet. I like the idea of an HTTP API, and have implemented a proof-of-concept using Flask's development server.
The experiment runs as a single process claiming the USB connection and periodically (every 16 ms) all data is read out. This process can write hardware settings and commands, and reads data and command responses.
I have a few problems choosing the 'correct' way to communicate with this process. It works if the HTTP server only has a single worker. Then, I can use python's multiprocessing.Pipe for communication. Using more-or-less low-level sockets (or things like zeromq) should work, even for request/response, but I have to implement some sort of protocol: send {'cmd': 'set_voltage', 'value': 900} instead of calling hardware.set_voltage(800) (which I can use in the stand-alone scripts). I can use some sort of RPC, but as far as I know they all (SimpleXMLRPCServer, Pyro) use some sort of event loop for the 'server', in this case the process running the experiment, to process requests. But I can't have an event loop waiting for incoming requests; it should be reading out my hardware! I googled around quite a bit, but however I try to rephrase my question, I end up with Celery as the answer, which mostly fires off one job after another, but isn't really about communicating with a long-running process.
I'm confused. I can get this to work, but I fear I'll be reinventing a few wheels. I just want to launch my app in the terminal, open a web browser from anywhere, and monitor and control my experiment.
Update: The following code is a basic example of using the module:
from pysparc.muonlab.muonlab_ii import MuonlabII

muonlab = MuonlabII()
muonlab.select_lifetime_measurement()
muonlab.set_pmt1_voltage(900)
muonlab.set_pmt1_threshold(500)

lifetimes = []
while True:
    data = muonlab.read_lifetime_data()
    if data:
        print "Muon decays detected with lifetimes", data
        lifetimes.extend(data)
The module lives at https://github.com/HiSPARC/pysparc/tree/master/pysparc/muonlab.
My current implementation of the HTTP API lives at https://github.com/HiSPARC/pysparc/blob/master/bin/muonlab_with_http_api.
I'm pretty happy with the module (with lots of tests) but the HTTP API runs using Flask's single-threaded development server (which the documentation and the internet tells me is a bad idea) and passes dictionaries through a Pipe as some sort of IPC. I'd love to be able to do something like this in the above script:
while True:
    data = muonlab.read_lifetime_data()
    if data:
        print "Muon decays detected with lifetimes", data
        lifetimes.extend(data)
    process_remote_requests()
where process_remote_requests is a fairly short function to call the muonlab instance or return data. Then, in my Flask views, I'd have something like:
muonlab = RemoteMuonlab()

@app.route('/pmt1_voltage', methods=['GET', 'PUT'])
def get_data():
    if request.method == 'PUT':
        voltage = request.form['voltage']
        muonlab.set_pmt1_voltage(voltage)
    else:
        voltage = muonlab.get_pmt1_voltage()
    return jsonify(voltage=voltage)
Getting the measurement data from the app is perhaps less of a problem, since I could store that in SQLite or something else that handles concurrent access.
But... you do have an IO loop; it runs every 16ms.
You can use BaseHTTPServer.HTTPServer in such a case; just set the timeout attribute to something small. Basically:
from time import sleep
from SimpleXMLRPCServer import SimpleXMLRPCServer

class XmlRPCApi:
    def do_something(self):
        print "doing something"

server = SimpleXMLRPCServer(("localhost", 8000))
server.register_instance(XmlRPCApi())
server.timeout = 0  # handle_request() returns immediately when no request is waiting

while True:
    sleep(0.016)
    do_normal_thing()         # the normal experiment work for this tick
    server.handle_request()   # handle at most one pending request, then return
Edit: Python has a built-in server, also built on BaseHTTPServer, capable of serving a Flask app. Since flask.Flask() happens to be a WSGI-compliant application, your process_remote_requests() could look like this:
import wsgiref.simple_server

# app here is just your Flask() application!
remote_server = wsgiref.simple_server.make_server('localhost', 8000, app)

# as before, set timeout to zero so that you can go right back
# to your event loop if there are no requests to handle
remote_server.timeout = 0

def process_remote_requests():
    remote_server.handle_request()
This works well enough if you have only short running requests; but if you need to handle requests that may possibly take longer than your event loop's normal polling interval, or if you need to handle more requests than you have polls per unit of time, then you can't use this approach, exactly.
You don't necessarily need to fork off another process, though. You can potentially get by with a pool of workers in other threads. Roughly:
import threading
import wsgiref.simple_server

remote_server = wsgiref.simple_server.make_server('localhost', 8000, app)

POOL_SIZE = 10  # or some other value.
pool = [threading.Thread(target=remote_server.serve_forever) for dummy in xrange(POOL_SIZE)]
for thread in pool:
    thread.daemon = True
    thread.start()

while True:
    pass  # normal experiment processing here; don't handle requests in this thread.
However, this approach has one major shortcoming: you now have to deal with concurrency! It's not safe to manipulate your program state as freely as you could with the above loop, since you might be concurrently manipulating that same state in the main thread (or another HTTP server thread). It's up to you to know when this is valid, wrapping each resource with some sort of mutex lock or whatever is appropriate.
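As a minimal sketch of that wrapping (the hardware_lock name is mine; muonlab and voltage come from the question):

import threading

hardware_lock = threading.Lock()

# in the experiment loop (main thread)
with hardware_lock:
    data = muonlab.read_lifetime_data()

# in a Flask view (HTTP server thread)
with hardware_lock:
    muonlab.set_pmt1_voltage(voltage)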

Redirect a method call to something with a file descriptor - asyncore

We have a network client based on asyncore, with the user's network connection embodied in a Dispatcher. The goal is for a user working from an interactive terminal to be able to enter network request commands which would go out to a server and eventually come back with an answer. The client is written to be asynchronous so that the user can start several requests on different servers all at once, collecting the results as they become available.
How can we allow the user to type in commands while we're going around a select loop? If we hit the select() call registered only as readable, then we'll sit there until we read data or timeout. During this (possibly very long) time user input will be ignored. If we instead always register as writable, we get a hot loop.
One bad solution is as follows. We run the select loop in its own thread and have the user inject input into a thread safe write Queue by invoking a method we define on our Dispatcher. Something like
# a method on our Dispatcher subclass, myConnection
def enqueue(self, someData):
    self.lock.acquire()
    self.queue.put(someData)
    self.lock.release()
We then only register as writable if the Queue is not empty:

def writable(self):
    return not self.queue.empty()
We would then specify a timeout for select() that's short on human scales but long for a computer. This way if we're in the select call registered only for reading when the user puts in new data, the loop will eventually run around again and find out that there's new data to write. This is a bad solution though because we might want to use this code for servers' client connections as well, in which case we don't want the dead time you get waiting for select() to time out. Again, I realize this is a bad solution.
It seems like the correct solution would be to bring the user input in through a file descriptor so that we can detect new input while sitting in the select call registered only as readable. Is there a way to do this?
NOTE: This is an attempt to simplify the question posted here
stdin is selectable. Put stdin into your dispatcher.
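For example, asyncore.file_dispatcher can wrap the stdin file descriptor (on Unix), so typed commands wake up the same select() loop; the enqueue() call below refers to the method sketched in the question:

import asyncore
import sys

class StdinDispatcher(asyncore.file_dispatcher):
    def __init__(self, connection):
        asyncore.file_dispatcher.__init__(self, sys.stdin)
        self.connection = connection   # the existing network dispatcher

    def writable(self):
        return False                   # we only ever read from stdin

    def handle_read(self):
        command = self.recv(4096)
        if command:
            self.connection.enqueue(command)  # hand the typed command to the network side

# setup (sketch):
# conn = MyConnection(...)   # your existing asyncore.dispatcher
# StdinDispatcher(conn)
# asyncore.loop()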
Also, I recommend Twisted for future development of any event-driven software. It is much more featureful than asyncore, has better documentation, a bigger community, performs better, etc.

Making an asynchronous call synchronous in tornado

I have the following problem.
I work on a tornado based application server. Most of the code can be synchronous and the web interface does not really use any of the asynchronous facilities of Tornado.
I now have to interface to an (asynchronous) legacy backend for which I use the tornado.iostream interface to send commands. Responses to these commands are sent asynchronously, together with other periodic information, such as status updates.
The code is wrapped in a common interface that is also used for other backends.
What I want to achieve is the following:
# this is executed on initialization
self.stream.read_until_close(self.close, self.read_from_backend)

# this is called whenever data arrives on the input stream
def read_from_backend(self, data):
    if data in self.pending:
        # it means we got a response to a request we sent out
        del self.pending[data]
    else:
        # do something else
        pass

# this sends a request to the backend
def send_to_backend(self, data):
    self.pending[data] = True
    while data in self.pending:
        # of course this does not work
        time.sleep(1)
    return
Of course this does not work, as time.sleep(1) will not allow read_from_backend() to run any further.
How do I solve this? I want the send_to_backend() to return only when the response is received. Is there a way I can yield control to read_from_backend without yet returning from the method?
Please note that it is difficult to do this in the web layer using @asynchronous and @gen.engine, because that would require a full rewrite of all the requests in our web layer. Is there a way I can implement the same design pattern somewhere else?
I think a good idea may be to look into using gevent. By monkey-patching and using a simple decorator I wrote, you can very easily get nice asynchronous views which are written in a synchronous manner (blocking style).
You can reuse most of the code from a previous answer of mine.
Though you may not want to use gevent for various reasons (such as not wanting it as a dependency):
Assuming that you've monkey-patched your global process with:
from gevent import monkey; monkey.patch_all()
The above patches threads, sockets, sleep ... so they go through gevent's hub (the hub is to gevent what the ioloop is to tornado).
Once patched, and by using the @gasync decorator from my previous answer, your view could simply be:
class MyHandler(tornado.web.RequestHandler):
    @gasync
    def get(self):
        # Parse the input data in some fashion
        data = get_data_from_request()

        # This could be anything using python sockets, urllib ...
        backend_response = send_data_to_backend(data)

        # Write data to HTTP client
        self.write(backend_response)

        # You have to finish the response yourself since it's asynchronous
        self.finish()
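The @gasync decorator itself lives in the linked answer; roughly, it has to mark the handler as asynchronous and run its body in a greenlet. A sketch of the shape it might take (my guess, not the author's exact code):

import functools
import gevent
import tornado.web

def gasync(method):
    @tornado.web.asynchronous
    @functools.wraps(method)
    def wrapper(self, *args, **kwargs):
        # run the (now cooperative) blocking handler body in a greenlet,
        # so the IOLoop keeps serving other requests in the meantime
        gevent.spawn(method, self, *args, **kwargs)
    return wrapper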
I find that gevent's simplicity and "elegance" far outweighs any advantage you would have writing async code with tornado's ioloop.
In my case I had to use legacy code written in a synchronous fashion, so basically gevent was a life saver; all I had to do was monkey-patch and write that decorator, and I could use all that legacy code without any modifications.
I hope this helps.

Python BaseHTTPServer: How to get it to stop?

According to the source, BaseServer.shutdown() must be called from a different thread than the one the server is running on.
However I am trying to shut down the server with a specific value provided to the server in a web request.
The request handler obviously runs in this thread so it will still deadlock after I have done this:
httpd = BaseHTTPServer.HTTPServer(('', 80), MyHandler)
print("Starting server in thread")
threading.Thread(target=httpd.serve_forever).start()
How can I accomplish what I want? Must I set up a socket or pipe or something (please show me how to do this, if it is the solution), where the main thread can block and wait on the child thread to send a message, at which point it will be able to call shutdown()?
I am currently able to achieve some "kind of works" behavior by calling "httpd.socket.close()" from the request handler. This generates an [Errno 9] Bad file descriptor error and seems to terminate the process on Windows.
But this is clearly not the right way to go about this.
Update ca. Aug. 2013 I have eloped with node.js to fill the need for robust async I/O, but plenty of side projects (e.g. linear algebra research, various command line tool frontends) keep me coming back to python. Looking back on this question, BaseHTTPServer probably has little practical value comparatively to other options (like various microframeworks that Phil mentions).
1. from the BaseHTTPServer docs
To create a server that doesn’t run forever, but until some condition is fulfilled:
def run_while_true(server_class=BaseHTTPServer.HTTPServer,
                   handler_class=BaseHTTPServer.BaseHTTPRequestHandler):
    """
    This assumes that keep_running() is a function of no arguments which
    is tested initially and after each request. If its return value
    is true, the server continues.
    """
    server_address = ('', 8000)
    httpd = server_class(server_address, handler_class)
    while keep_running():
        httpd.handle_request()
Allow some url call to set the condition to terminate, use whatever strikes your fancy there.
Edit: keep_running is any function you choose, could be as simple as:
def keep_running():
    global some_global_var_that_my_request_controller_will_set
    return some_global_var_that_my_request_controller_will_set
You probably want something smarter and without ridiculously long variable names.
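For example, a handler that flips the flag when a particular URL is hit could look like this (a sketch; the names and the /shutdown path are arbitrary):

keep_running_flag = True

def keep_running():
    return keep_running_flag

class MyHandler(BaseHTTPServer.BaseHTTPRequestHandler):
    def do_GET(self):
        global keep_running_flag
        if self.path == '/shutdown':
            keep_running_flag = False   # the run_while_true() loop exits before the next request
        self.send_response(200)
        self.end_headers()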
2. BaseHTTPServer is usually lower-level than you want to go. There are plenty of micro-frameworks that might suit your needs.
threading.Event is useful for signalling other threads. e.g.,
please_die = threading.Event()

# in handler
please_die.set()

# in main thread
please_die.wait()
httpd.shutdown()
You might use a Queue if you want to send data between threads.
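Putting those pieces together, a minimal sketch (the handler and the /shutdown path are made up):

import threading
import BaseHTTPServer

please_die = threading.Event()

class MyHandler(BaseHTTPServer.BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path == '/shutdown':
            please_die.set()            # signal the main thread
        self.send_response(200)
        self.end_headers()

httpd = BaseHTTPServer.HTTPServer(('', 80), MyHandler)
threading.Thread(target=httpd.serve_forever).start()

please_die.wait()    # main thread blocks here until the request arrives
httpd.shutdown()     # safe: serve_forever() is running in the other thread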
