Executing a command on a remote server with decoupling, redundancy, and asynchronous

Executing a command on a remote server with decoupling, redundancy, and asynchronous - python

I have a few servers that require executing commands on other servers. For example a Bitbucket Server post receive hook executing a git pull on another server. Another example is the CI server pulling a new docker image and restarting an instance on another server.
I would normally use ssh for this, creating a user/group specifically for the job with limited permission.
A few downsides with ssh:
Synchronous ssh call means a git push will have to wait until complete.
If a host is not contactable for whatever reason, the ssh command will fail.
Maintaining keys, users, and sudoers permissions can become unwieldy.
Few possibilities:
Find an open source out of the box solution (I have tried with no luck so far)
Set up an REST API on each server that accepts calls with some type of authentication, e.g. POST https://server/git/pull/?apikey=a1b2c3
Set up Python/Celery to execute tasks on a different queue for each host. This means a celery worker on each server that can execute commands and possibly a service that accepts REST API calls, converting them to Celery tasks.
Is there a nice solution to this problem?

Defining the problem
You want to be able to trigger a remote task without waiting for it to complete.
This can be achieved in any number of ways, including with SSH. You can execute a remote command without waiting for it to complete by closing or redirecting all I/O streams, e.g. like this:
ssh user#host "/usr/bin/foobar </dev/null >/dev/null 2>&1"
You want to be able to defer the task if the host is currently unavailable.
This requires a queuing/retry system of some kind. You will also need to decide whether the target hosts will be querying for messages ("pull") or whether messages will be sent to the target hosts from elsewhere ("push").
You want to simplify access control as much as possible.
There's no way to completely avoid this issue. One solution would be to put most of the authentication logic in a centralized task server. This splits the problem into two parts: configuring access rights in the task server, and configuring authentication between the task server and the target hosts.
Example solutions
Hosts attempt to start tasks over SSH using method above for asynchrony. If host is unavailable, task is written to local file. Cron job periodically retries sending failed tasks. Access control via SSH keys.
Hosts add tasks by writing commands to files on an SFTP server. Cron job on target hosts periodically checks for new commands and executes them if found. Access control managed via SSH keys on the SFTP server.
Hosts post tasks to REST API which adds them to queue. Celery daemon on each target host consumes from queue and executes tasks. Access managed primarily by credentials sent to the task queuing server.
Hosts post tasks to API which adds tasks to queue. Task consumer nodes pull tasks off the queue and send requests to API on target hosts. Authentication managed by cryptographic signature of sender appended to request, verified by task server on target host.
You can also look into tools that do some or all of the required functions out of the box. For example, some Google searching came up with Rundeck which seems to have some job scheduling capabilities and a REST API. You should also consider whether you can leverage any existing automated deployment or management tools already present in your system.
Conclusions
Ultimately, there's no single right answer to this question. It really depends on your particular needs. Ask yourself: How much time and effort do you want to spend creating this system? What about maintenance? How reliable does it need to be? How much does it need to scale? And so on, ad infinitum...

Related

How to connect server from client in Heroku using IP address

I am developing one application using heroku, but struggling with one issue.
In this application, I have 2 dynos (one is for server, and the other is for client).
Since I want to get some data from server, my client needs to know IP address of the server(dyno).
Now I am trying to use Fixie and QuotaGuard Static,
They tell me an IP address, but I can not connect to the server using these IP address.
Could you tell me how to fix it?

You want to have two dynos communicate directly over a socket connection. Unfortunately, you can't easily do that; that runs counter to the ethos of Heroku and 12-factor application design (http://12factor.net), which specifies that processes should be isolated from each other, and that communication be via "network attached services". That second point may seem like a nuance, but it affects how the dynos discover the other services (via injected environment variables).
There are many reasons for this constraint, not the least of which is the fact that "dynos", as a unit of compute, may be scaled, migrated to different physical servers, etc., many times over an application's lifecycle. Trying to connect to a socket on a dyno reliably would actually get pretty complicated (selecting the right one if multiple are running, renegotiating connections after scaling/migration events, etc.). Remember - even if you are never going to call heroku ps:scale client=2, Heroku doesn't know that and, as a platform, it is designed to assume that you will.
The solution is to use an intermediate service like Redis to facilitate the inter-process communication via a framework like Python RQ or similar.
Alternatively, treat the two dynos as separate applications - then you can connect from one to the other via HTTP using the publicly available DNS entry for that application. Note - in that case, it would still be possible to share a database if that's required.
Hope that helps.

Server alarm bell

I would to implement a simple functionality on my server:
I would to know when a my process that it is running on the server is active or no.
If this process is not running I would to send an e-mail to administrator.
To implement this functionality I'm thinking to a cron-job scheduled on x hours.
How can I implement this cron-job? I used python server-side.
Is this the better solution?

I'm assuming that your server is *nix, as you're mentioning cronjobs.
You can find bash scripts doing exactly this functionality, if you google it. Here's a bash script which checks for specific services and sends emails if a service is down:
http://bash.cyberciti.biz/monitoring/simple-process-checker-script/
You just need to tweak the _chklist, _sendemail and _email variables to suit your needs

Best way for client to fire off separate process without blocking client/server communication

The end result I am trying to achieve is allow a server to assign specific tasks to a client when it makes it's connection. A simplified version would be like this
Client connects to Server
Server tells Client to run some network task
Client receives task and fires up another process to complete task
Client tells Server it has started
Server tells Client it has another task to do (and so on...)
A couple of notes
There would be a cap on how many tasks a client can do
The client would need to be able to monitor the task/process (running? died?)
It would be nice if the client could receive data back from the process to send to the server if needed
At first, I was going to try threading, but I have heard python doesn't do threading correctly (is that right/wrong?)
Then it was thought to fire of a system call from python and record the PID. Then send certain signals to it for status, stop, (SIGUSR1, SIGUSR2, SIGINT). But not sure if that will work, because I don't know if I can capture data from another process. If you can, I don't have a clue how that would be accomplished. (stdout or a socket file?)
What would you guys suggest as far as the best way to handle this?

Use spawnProcess to spawn a subprocess. If you're using Twisted already, then this should integrate pretty seamlessly into your existing protocol logic.

Use Celery, a Python distributed task queue. It probably does everything you want or can be made to do everything you want, and it will also handle a ton of edge cases you might not have considered yet (what happens to existing jobs if the server crashes, etc.)
You can communicate with Celery from your other software using a messaging queue like RabbitMQ; see the Celery tutorials for details on this.
It will probably be most convenient to use a database such as MySQL or PostgreSQL to store information about tasks and their results, but you may be able to engineer a solution that doesn't use a database if you prefer.

How-To - Update Live Running Python Application

I have a python application , to be more precise a Network Application that can't go down this means i can't kill the PID since it actually talks with other servers and clients and so on ... many € per minute of downtime , you know the usual 24/7 system.
Anyway in my hobby projects i also work a lot with WSGI frameworks and i noticed that i have the same problem even during off-peak hours.
Anyway imagine a normal server using TCP/UDP ( put here your favourite WSGI/SIP/Classified Information Server/etc).
Now you perform a git pull in the remote server and there goes the new python files into the server (these files will of course ONLY affect the data processing and not the actual sockets so there is no need to re-raise the sockets or touch in any way the network part).
I don't usually use File monitors since i prefer to use SIGNAL to wakeup the internal app updater.
Now imagine the following code
from mysuper.app import handler
while True:
data = socket.recv()
if data:
socket.send(handler(data))
Lets imagine that handler is a APP with DB connections, cache connections , etc.
What is the best way to update the handler.
Is it safe to call reload(handler) ?
Will this break DB connections ?
Will DB Connections survive to this restart ?
Will current transactions be lost ?
Will this create anti-matter ?
What is the best-pratice patterns that you guys usually use if there are any ?

It's safe to call reload(handler).
Depends where you initialize your connections. If you make the connections inside handler(), then yes, they'll be garbage collected when the handler() object falls out of scope. But you wouldn't be connecting inside your main loop, would you? I'd highly recommend something like:
dbconnection = connect(...)
while True:
...
socket.send(handler(data, dbconnection))
if for no other reason than that you won't be making an expensive connection inside a tight loop.
That said, I'd recommend going with an entirely different architecture. Make a listener process that does basically nothing more than listen for UDP datagrams, sends them to a messaging queue like RabbitMQ, then waits for the reply message to send the results back to the client. Then write your actual servers that get their requests from the messaging queue, process them, and send a reply message back.
If you want to upgrade the UDP server, launch the new instance listening on another port. Update your firewall rules to redirect incoming traffic to the new port. Reload the rules. Kill the old process. Voila: seamless cutover.
The real win is from uncoupling your backend. Since multiple processes can listen for the same messages from your frontend "proxy" service, you can run several in parallel - on different machines, if you want to. To upgrade the backend, start a new instance then kill the old one so that there's no time when at least one instance isn't running.
To scale your proxy, have multiple instances running on different ports or different hosts, and configure your firewall to randomly redirect incoming datagrams to one of the proxies.
To scale your backend, run more instances.

Message queue proxy in Python + Twisted

I want to implement a lightweight Message Queue proxy. It's job is to receive messages from a web application (PHP) and send them to the Message Queue server asynchronously. The reason for this proxy is that the MQ isn't always avaliable and is sometimes lagging, or even down, but I want to make sure the messages are delivered, and the web application returns immediately.
So, PHP would send the message to the MQ proxy running on the same host. That proxy would save the messages to SQLite for persistence, in case of crashes. At the same time it would send the messages from SQLite to the MQ in batches when the connection is available, and delete them from SQLite.
Now, the way I understand, there are these components in this service:
message listener (listens to the messages from PHP and writes them to a Incoming Queue)
DB flusher (reads messages from the Incoming Queue and saves them to a database; due to SQLite single-threadedness)
MQ connection handler (keeps the connection to the MQ server online by reconnecting)
message sender (collects messages from SQlite db and sends them to the MQ server, then removes them from db)
I was thinking of using Twisted for #1 (TCPServer), but I'm having problem with integrating it with other points, which aren't event-driven. Intuition tells me that each of these points should be running in a separate thread, because all are IO-bound and independent of each other, but I could easily put them in a single thread. Even though, I couldn't find any good and clear (to me) examples on how to implement this worker thread aside of Twisted's main loop.
The example I've started with is the chatserver.py, which uses service.Application and internet.TCPServer objects. If I start my own thread prior to creating TCPServer service, it runs a few times, but the it stops and never runs again. I'm not sure, why this is happening, but it's probably because I don't use threads with Twisted correctly.
Any suggestions on how to implement a separate worker thread and keep Twisted? Do you have any alternative architectures in mind?

You're basically considering writing an ad-hoc extension to your messaging server, the job of which it is to provide whatever reliability guarantees you've asked of it.
Instead, perhaps you should take the hardware where you were planning to run this new proxy and run another MQ node on it. The new node should take care of persisting and relaying messages that you deliver to it while the other nodes are overloaded or offline.

Maybe it's not the best bang for your buck to use a separate thread in Twisted to get around a blocking call, but sometimes the least evil solution is the best. Here's a link that shows you how to integrate threading into Twisted:
http://twistedmatrix.com/documents/10.1.0/core/howto/threading.html
Sometimes in a pinch easy-to-implement is faster than hours/days of research which may all turn out to be for nought.

A neat solution to this problem would be to use the Key Value store Redis. Its a high speed persistent data store, with plenty of clients - it has a php and a python client (if you want to use a timed/batch process to process messages - it saves you creating a database, and also deals with your persistence stories. It runs fine on Cywin/Windows + posix environments.
PHP Redis client is here.
Python client is here.
Both have a very clean and simple API. Redis also offers a publish/subscribe mechanism, should you need it, although it sounds like it would be of limited value if you're publishing to an inconsistent queue.

Develop Reference

Python is a programming language that lets you work quickly and integrate systems more effectively.