I have a Python application, more precisely a network application, that can't go down. This means I can't kill the PID, since it talks with other servers and clients and so on... many € per minute of downtime; you know, the usual 24/7 system.
Anyway, in my hobby projects I also work a lot with WSGI frameworks, and I noticed that I have the same problem even during off-peak hours.
Imagine a normal server using TCP/UDP (put your favourite WSGI/SIP/Classified Information Server/etc. here).
Now you perform a git pull on the remote server and the new Python files land on the server (these files will of course ONLY affect the data processing and not the actual sockets, so there is no need to re-create the sockets or touch the network part in any way).
I don't usually use file monitors, since I prefer to use a signal to wake up the internal app updater.
Now imagine the following code:
    from mysuper.app import handler

    while True:
        data = socket.recv()
        if data:
            socket.send(handler(data))
Let's imagine that handler is an app with DB connections, cache connections, etc.
What is the best way to update the handler?
Is it safe to call reload(handler)?
Will this break DB connections?
Will DB connections survive this restart?
Will current transactions be lost?
Will this create anti-matter?
What best-practice patterns do you usually use, if there are any?
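It's safe to call reload(), with one caveat: reload() takes a module object rather than a function, so you'd reload mysuper.app itself and then look handler up on the reloaded module. A name bound with "from mysuper.app import handler" keeps pointing at the old function.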
Depends where you initialize your connections. If you make the connections inside handler(), then yes, they'll be garbage collected when the handler() object falls out of scope. But you wouldn't be connecting inside your main loop, would you? I'd highly recommend something like:
    dbconnection = connect(...)
    while True:
        ...
        socket.send(handler(data, dbconnection))
if for no other reason than that you won't be making an expensive connection inside a tight loop.
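A minimal sketch of the signal-driven reload described in the question, combined with the "connection lives outside the handler" advice above. It assumes Python 3 (where reload lives in importlib), a Unix-style SIGHUP, and a module that exposes handler(data, dbconnection); the names request_reload, maybe_reload and serve are made up for illustration.

```python
import importlib
import signal

reload_requested = False


def request_reload(signum, frame):
    """Signal handler: only set a flag; the main loop does the real work."""
    global reload_requested
    reload_requested = True


def install_reload_signal():
    """Wire SIGHUP to the flag (SIGHUP does not exist on Windows)."""
    if hasattr(signal, "SIGHUP"):
        signal.signal(signal.SIGHUP, request_reload)


def maybe_reload(module):
    """Reload `module` if a reload was requested; return the module to use."""
    global reload_requested
    if not reload_requested:
        return module
    reload_requested = False
    try:
        return importlib.reload(module)
    except Exception as exc:
        # Keep serving with whatever is currently loaded rather than crash.
        print("reload failed, keeping current code:", exc)
        return module


def serve(sock, app_module, dbconnection):
    """Main loop: sockets and DB connection are created once, outside it."""
    while True:
        app_module = maybe_reload(app_module)
        data = sock.recv(4096)
        if data:
            # Look handler up on the module each time, so that a reload
            # is picked up without re-importing the name.
            sock.send(app_module.handler(data, dbconnection))
```

Because dbconnection is owned by the loop and not by the reloaded module, reloading the code does not by itself close it. Anything the module creates at import time (module-level connections, caches) is created again on reload, so that state is better kept out of the module.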
That said, I'd recommend going with an entirely different architecture. Make a listener process that does basically nothing more than listen for UDP datagrams, send them to a message queue like RabbitMQ, then wait for the reply message to send the results back to the client. Then write your actual servers that get their requests from the message queue, process them, and send a reply message back.
If you want to upgrade the UDP server, launch the new instance listening on another port. Update your firewall rules to redirect incoming traffic to the new port. Reload the rules. Kill the old process. Voila: seamless cutover.
The real win is from uncoupling your backend. Since multiple processes can listen for the same messages from your frontend "proxy" service, you can run several in parallel - on different machines, if you want to. To upgrade the backend, start a new instance then kill the old one so that there's no time when at least one instance isn't running.
To scale your proxy, have multiple instances running on different ports or different hosts, and configure your firewall to randomly redirect incoming datagrams to one of the proxies.
To scale your backend, run more instances.
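To make the shape of that decoupling concrete, here is a small in-process sketch. It is not RabbitMQ: two queue.Queue objects stand in for the broker, threads stand in for the backend processes, and the names request_q, reply_q and start_worker are hypothetical. It only illustrates why a backend can be swapped while the frontend keeps running.

```python
import queue
import threading

request_q = queue.Queue()  # stand-in for the broker's request queue
reply_q = queue.Queue()    # stand-in for the broker's reply queue


def worker(name, handler, stop):
    """Backend: take requests off the queue, process them, publish replies."""
    while not stop.is_set():
        try:
            request_id, payload = request_q.get(timeout=0.1)
        except queue.Empty:
            continue  # nothing to do; check the stop flag again
        reply_q.put((request_id, name, handler(payload)))


def start_worker(name, handler, stop):
    """Start one backend instance; `stop` is its private shutdown flag."""
    thread = threading.Thread(target=worker, args=(name, handler, stop),
                              daemon=True)
    thread.start()
    return thread


def upgrade(old_thread, old_stop, name, handler):
    """Start the new backend first, then retire the old one."""
    new_stop = threading.Event()
    new_thread = start_worker(name, handler, new_stop)
    old_stop.set()
    old_thread.join()
    return new_thread, new_stop
```

The frontend only ever touches request_q and reply_q, so it never notices which backend answered. With a real broker the same idea holds across processes and machines.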
I have started a private project with Django and Channels to build a web-based UI to control the music player daemon (mpd) on a Raspberry Pi. I know that there are other projects like Volumio or moode audio etc. that do the same out of the box, but my intention is to learn something new!
Up to now I have managed to set up an nginx server on the Pi that communicates with my devices, like a mobile phone or PC. In the background nginx communicates with a uWSGI server for HTTP requests to Django and a Daphne server as ASGI for WebSocket connections to Django Channels. There is also a Redis server installed as a backend, because the channel layer needs it. So, on a client request a simple HTML page is served as the UI and a WebSocket connection is established.
In parallel I have a separate script as an mpd handler, which is wrapped in a while loop to keep it alive and which does all the work with mpd using the Python module python-mpd2.
The mpd handler should get its commands (play, stop, etc.) via WebSocket from the clients/consumers and react to them. At the same time, it should send the timeline of the song while a song is playing, let's say once every second, also via WebSocket. I managed to send data regularly to all connected clients/consumers with async_to_sync(channel_layer.group_send) from outside, but I couldn't find a way to pass data/commands coming from the clients via WebSocket to my separately running mpd handler script.
I read in the docs for Django Channels that it is not recommended to use while loops in the consumers because this will block all communication. That's right; I have tried this already. Then I tried to receive messages with async_to_sync(channel_layer.receive)('channel_name') in the mpd handler, with a direct connection to a consumer. But this command blocks my mpd handler, although I use async_to_sync.
So, my question:
Is it possible to pass messages from Django Channels to outside scripts using Channels' own methods? Do you have any suggestions for solving this, maybe with other methods or workarounds? I am looking for a reliable solution.
I have given the issue some thought and have some ideas, but I don't know if they will lead to a solution:
Polling:
The clients frequently send messages and requests via WebSocket to control mpd and update the UI. In this case no handler would be needed. (I don't know if this method will generate too much traffic on the WebSocket and make it slow. Also, the connection to mpd would have to be established and closed again frequently. I don't know if this would be robust.)
Database:
Create a database that both the consumers and the mpd handler have access to. The consumers write the incoming messages to the database and the mpd handler reads them out and does the job. (Here I don't know if there will be problems when the consumers and the mpd handler try to access the DB at the same time.)
Using Queues with multiprocessing module:
Consumers pass the messages via a queue to the mpd handler. (I don't know if this is possible.)
Catching up the messages in redis:
The mpd handler polls Redis frequently to pick up the messages. I read that when the layers are used in the usual way, only the group and channel names are listed in Redis. Messages are passed via Redis when the consumers are started as workers. (That would mean that all my consumers must start as background workers, but how?)
I hope you may have a solution to my question. You may realise from my ideas and the question marks involved that I am not an IT expert. As I wrote at the beginning, I have a different engineering background and am a newbie in this, but very interested to learn something new! So please be patient with me when I don't understand everything immediately.
I hope to read your answers soon and thank you in advance.
Best regards.
Since nobody answered my question, I tried out a few possible options.
I changed the binding of mpd from a fixed IP to a socket connection and created an mpd_Handler class with some methods like connect to mpd, disconnect, play, pause, etc.
This class is imported in Django's consumers.py and views.py. Whenever a web client connects to Django or sends a new command (like play, skip, etc.), the mpd_Handler performs the command and responds with the current state of mpd, such as the current song metadata.
A second mpd handler, which runs outside of Django as a separate script, regularly monitors the mpd state to detect any changes. When something changes in mpd (e.g., the song of a web radio stream or the duration of the song), this handler informs all clients connected to the Django consumer group with async_to_sync(channel_layer.group_send) so that the clients can update their UI.
At the moment it works, and I hope this is a good solution and helps others who have the same problem. Other suggestions are still welcome!
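For readers who want to reproduce the second handler, the polling part might look roughly like this. It is a sketch under assumptions, not my actual script: get_state and notify are hypothetical callables. In the real setup get_state would wrap a python-mpd2 status query and notify would wrap async_to_sync(channel_layer.group_send) with the group name and message.

```python
import time


def watch(get_state, notify, interval=1.0, max_polls=None):
    """Poll get_state() and call notify(state) only when the state changed.

    max_polls=None means run forever; a number is handy for testing.
    """
    last = object()  # sentinel that never compares equal to a real state
    polls = 0
    while max_polls is None or polls < max_polls:
        state = get_state()
        if state != last:
            notify(state)  # e.g. push the new state to all connected clients
            last = state
        polls += 1
        time.sleep(interval)
```

Sending only on change keeps the WebSocket traffic low; the interval decides how quickly a change in mpd shows up in the UI.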
Best regards.
Short version of my question:
How do I design a single Python script that can listen and respond to inputs received via HTTP or a serial port, and also initiate communications via these channels on its own? My problem is that I don't understand how to design a single script that both (i) uses a web framework to listen on some port for HTTP inputs, and (ii) also does other work that's independent of incoming HTTP requests.
Long version:
I want to use Python to design a system that does the following:
Listens to a serial port for occasional reports. Specifically, I have a network of JeeNode sensors (wireless Arduino-compatible modules) that talk to a central JeeLink, which connects to my computer via USB and talks to my Python script via pySerial.
Listens to a web URL for occasional inputs. Specifically, users send commands to the system via SMS to a Twilio number. Twilio intercepts the SMS messages and posts them to a URL I designate, and I use the Bottle micro web-framework to listen for new HTTP requests.
Responds to both types (serial and HTTP) of inputs. For example, if a user texts the command "Sleep", I want to (i) tell the sensors to go to sleep via the serial port -> JeeLink (which will then forward the command onto the remotes); and (ii) reply to the sender -- and maybe other users -- that the command has been received and is being executed.
Occasionally initiates its own communications to users (via HTTP -> Twilio -> SMS) or remote sensors (via serial -> JeeLink) without any precipitating input event. Two examples: (1) I want to report out to users or remote sensors every N minutes even if I haven't received any new inputs. (2) I want to tell users when remotes have actually entered Sleep mode. Because the remotes are battery-powered, they spend most of the time in an inaccessible low-power mode. They can only receive new commands from the JeeLink when they initiate a wireless "check-in" every 5 min. So while technically remotes go to sleep (or wake up, etc.) in response to a user command, commands and responses are effectively independent.
My problem is that all the usage examples of web frameworks I've seen seem to assume that all precipitating events occur via HTTP requests. I can create a Bottle object and use decorators to bind code to that object, which gets executed whenever it sees an HTTP request that matches some specified URL path. But I don't know how to do that while simultaneously doing other work that's independent of HTTP events, for example, listening to the serial port.
After struggling a lot, the potential solutions I'm considering now are:
Splitting the functionality into separate scripts. A.py listens for text messages via HTTP and writes the relevant information to some database; B.py continuously reads the database for new records and reacts accordingly, as well as listening to the serial monitor and doing other work. This seems like it would work fine, but it feels inelegant, and I suspect there's a simpler solution I'm unaware of.
Maybe the answer is related to Python decorators? I use various decorators to specify the URL paths that, when a matching HTTP request comes in, execute the code bound to the decorator. So I'm guessing that maybe there's a way to specify some other kind of decorator that, rather than listening for HTTP requests, gets executed when my "main" Python code tells it to? But I don't know enough about decorators to know if this is true.
It seems like you are trying to write an asynchronous application to manage your network of nodes via HTTP. You want to respond to incoming communications on multiple channels as they occur, you want to initiate communications on a schedule, on multiple channels, and you want those two forms of communication to interact. All of these communications are with an outside world that is slow, so it behooves you not to block if you don't need to.
It will probably be easiest to maintain your system if you organize your code into several Python modules, split by their area of concern - serial interface code, HTTP interface code, common processing code-paths, etc. Weave those components together in a central control module, which imports your libraries, and knows how to start and stop cleanly. Then you can test the serial interface independent of the web interface, and potentially reuse some of those Python modules in other projects.
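One way to picture the central control module is a single event queue that every input source feeds. The sketch below uses asyncio from the standard library and makes several assumptions: external_input is a stand-in for the real serial and HTTP listeners (pySerial and Bottle are not used here), and all function names are hypothetical.

```python
import asyncio


async def periodic(events, interval, count):
    """Self-initiated work: emit a timer event every `interval` seconds."""
    for _ in range(count):
        await asyncio.sleep(interval)
        await events.put(("timer", "report"))


async def external_input(events, source, messages):
    """Stand-in for a serial or HTTP listener pushing what it receives."""
    for msg in messages:
        await asyncio.sleep(0)  # yield to the loop, as real I/O would
        await events.put((source, msg))


async def controller(events, handled, expected):
    """Central control: the one place that reacts to every kind of event."""
    while len(handled) < expected:
        source, msg = await events.get()
        # Real code would dispatch here: forward "Sleep" to the serial
        # side, send an SMS reply, and so on.
        handled.append((source, msg))


async def main():
    events = asyncio.Queue()
    handled = []
    await asyncio.gather(
        periodic(events, 0.01, 2),
        external_input(events, "serial", ["checkin"]),
        external_input(events, "http", ["Sleep"]),
        controller(events, handled, expected=4),
    )
    return handled
```

Running asyncio.run(main()) returns the four events in whatever order they arrived. The point of the design is that the HTTP side, the serial side and the scheduler never call each other directly; they only put events on the queue, so each can be tested alone.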
The end result I am trying to achieve is to allow a server to assign specific tasks to a client when it makes its connection. A simplified version would be like this:
Client connects to Server
Server tells Client to run some network task
Client receives task and fires up another process to complete task
Client tells Server it has started
Server tells Client it has another task to do (and so on...)
A couple of notes
There would be a cap on how many tasks a client can do
The client would need to be able to monitor the task/process (running? died?)
It would be nice if the client could receive data back from the process to send to the server if needed
At first, I was going to try threading, but I have heard Python doesn't do threading correctly (is that right/wrong?)
Then I thought of firing off a system call from Python and recording the PID, then sending certain signals to it for status, stop, etc. (SIGUSR1, SIGUSR2, SIGINT). But I'm not sure that will work, because I don't know if I can capture data from another process. If I can, I don't have a clue how that would be accomplished (stdout or a socket file?).
What would you guys suggest as far as the best way to handle this?
Use spawnProcess to spawn a subprocess. If you're using Twisted already, then this should integrate pretty seamlessly into your existing protocol logic.
Use Celery, a Python distributed task queue. It probably does everything you want or can be made to do everything you want, and it will also handle a ton of edge cases you might not have considered yet (what happens to existing jobs if the server crashes, etc.)
You can communicate with Celery from your other software using a messaging queue like RabbitMQ; see the Celery tutorials for details on this.
It will probably be most convenient to use a database such as MySQL or PostgreSQL to store information about tasks and their results, but you may be able to engineer a solution that doesn't use a database if you prefer.
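If neither Twisted nor Celery is an option, the client-side part of the question (start a process, cap the number of tasks, check whether it is running or has died, read its output) can be sketched with the standard library's subprocess module. This is a different technique from the two suggested above, and TaskRunner and its method names are hypothetical.

```python
import subprocess
import sys


class TaskRunner:
    """Start tasks as child processes, with a cap, status and output."""

    def __init__(self, max_tasks=2):
        self.max_tasks = max_tasks
        self.tasks = {}  # task_id -> subprocess.Popen

    def start(self, task_id, argv):
        running = [p for p in self.tasks.values() if p.poll() is None]
        if len(running) >= self.max_tasks:
            raise RuntimeError("task cap reached")
        proc = subprocess.Popen(argv, stdout=subprocess.PIPE,
                                stderr=subprocess.STDOUT, text=True)
        self.tasks[task_id] = proc
        return proc.pid

    def status(self, task_id):
        # poll() returns None while the child is alive, else its exit code.
        code = self.tasks[task_id].poll()
        return "running" if code is None else f"exited({code})"

    def stop(self, task_id, timeout=10):
        proc = self.tasks[task_id]
        proc.terminate()
        proc.wait(timeout=timeout)

    def collect(self, task_id, timeout=None):
        # communicate() waits for the child and returns everything it printed.
        out, _ = self.tasks[task_id].communicate(timeout=timeout)
        return out
```

For example, runner.start("t1", [sys.executable, "-c", "print('done')"]) launches a task and runner.collect("t1") returns what it wrote to stdout, which answers the "stdout or a socket file?" part: a pipe on stdout is enough for simple results.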
A typical situation with a server/web application is that the application needs to be shut down and restarted to implement an upgrade.
What are the possible/common schemes (and available software) to avoid losing data that clients sent to the server during the short time the application was gone?
An example scheme that could work is: For a simple web server where the client connects to port 80, rather than the client connecting directly to the web server application, there could be a simple application in between that listens to port 80 and seamlessly forwards/returns data to/from the "Actual" web server application (on some other port). When the web server needs to be shut down and restarted, the relay app could detect this and buffer all incoming data until the webserver comes back to life. This way there is always an application listening to port 80 and data is never lost (within buffer-size and time reason, of course). Does such a simple intermediate buffer-on-recipient-unavailable piece of software exist already?
I'm mostly interested in solutions for a single application instance and not one where there are multiple instances (in which case a clever rolling update scheme could be used), but in the interests of having a full answer set, any response would be great!
To avoid this, have multiple application servers behind a load balancer. Before bringing one down, ensure the load balancer is not sending it new clients. Bring it down, traffic will go to the other application servers, and when it comes back up traffic will begin getting sent to it again.
If you have only one application server, simply 'buffering' network traffic is a poor solution. When the server comes back up, it has none of the TCP state information anymore and the old incoming connections have nowhere to go anyway.
I want to implement a lightweight Message Queue proxy. Its job is to receive messages from a web application (PHP) and send them to the Message Queue server asynchronously. The reason for this proxy is that the MQ isn't always available and is sometimes lagging, or even down, but I want to make sure the messages are delivered, and the web application returns immediately.
So, PHP would send the message to the MQ proxy running on the same host. That proxy would save the messages to SQLite for persistence, in case of crashes. At the same time it would send the messages from SQLite to the MQ in batches when the connection is available, and delete them from SQLite.
Now, the way I understand, there are these components in this service:
1. message listener (listens to the messages from PHP and writes them to an Incoming Queue)
2. DB flusher (reads messages from the Incoming Queue and saves them to a database; due to SQLite single-threadedness)
3. MQ connection handler (keeps the connection to the MQ server online by reconnecting)
4. message sender (collects messages from the SQLite db and sends them to the MQ server, then removes them from the db)
I was thinking of using Twisted for #1 (TCPServer), but I'm having problems integrating it with the other points, which aren't event-driven. Intuition tells me that each of these points should run in a separate thread, because all are IO-bound and independent of each other, but I could easily put them in a single thread. Even so, I couldn't find any good and clear (to me) examples of how to implement this worker thread alongside Twisted's main loop.
The example I've started with is chatserver.py, which uses service.Application and internet.TCPServer objects. If I start my own thread prior to creating the TCPServer service, it runs a few times, but then it stops and never runs again. I'm not sure why this is happening, but it's probably because I'm not using threads with Twisted correctly.
Any suggestions on how to implement a separate worker thread and keep Twisted? Do you have any alternative architectures in mind?
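Independent of Twisted and of threads, the persistence part of this design (save to SQLite, send in batches, delete only after a successful send) might be sketched as follows. The table name outbox, the function names, and the send_batch callable are all hypothetical; send_batch stands in for whatever the MQ client offers and is assumed to raise ConnectionError when the MQ is unreachable.

```python
import sqlite3


def open_store(path):
    """Open (or create) the SQLite file that backs the outgoing queue."""
    db = sqlite3.connect(path)
    db.execute("CREATE TABLE IF NOT EXISTS outbox "
               "(id INTEGER PRIMARY KEY AUTOINCREMENT, body TEXT NOT NULL)")
    db.commit()
    return db


def enqueue(db, body):
    """Persist one message before acknowledging it to the web application."""
    with db:  # commits on success, rolls back on error
        db.execute("INSERT INTO outbox (body) VALUES (?)", (body,))


def flush(db, send_batch, batch_size=100):
    """Send the oldest messages as one batch; delete them only on success."""
    rows = db.execute("SELECT id, body FROM outbox ORDER BY id LIMIT ?",
                      (batch_size,)).fetchall()
    if not rows:
        return 0
    try:
        send_batch([body for _, body in rows])
    except ConnectionError:
        return 0  # MQ unavailable: rows stay in place for the next attempt
    with db:
        db.executemany("DELETE FROM outbox WHERE id = ?",
                       [(row_id,) for row_id, _ in rows])
    return len(rows)
```

This gives at-least-once delivery: a crash between sending and deleting means the batch is sent again on restart. By default a sqlite3 connection may only be used from the thread that created it, so enqueue and flush would either share one thread or each open their own connection.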
You're basically considering writing an ad-hoc extension to your messaging server, the job of which is to provide whatever reliability guarantees you've asked of it.
Instead, perhaps you should take the hardware where you were planning to run this new proxy and run another MQ node on it. The new node should take care of persisting and relaying messages that you deliver to it while the other nodes are overloaded or offline.
Maybe it's not the best bang for your buck to use a separate thread in Twisted to get around a blocking call, but sometimes the least evil solution is the best. Here's a link that shows you how to integrate threading into Twisted:
http://twistedmatrix.com/documents/10.1.0/core/howto/threading.html
Sometimes in a pinch easy-to-implement is faster than hours/days of research which may all turn out to be for nought.
A neat solution to this problem would be to use the key-value store Redis. It's a high-speed persistent data store with plenty of clients; it has a PHP and a Python client (if you want to use a timed/batch process to process messages). It saves you creating a database and also deals with your persistence stories. It runs fine on Cygwin/Windows and POSIX environments.
PHP Redis client is here.
Python client is here.
Both have a very clean and simple API. Redis also offers a publish/subscribe mechanism, should you need it, although it sounds like it would be of limited value if you're publishing to an inconsistent queue.