Creating and interacting with background tasks in Python

I'm trying to build a system that would manage a small database and populate it with data from the web.
I would like to have this process run in the background, but still have some way of interacting with it.
How can I go about this in Python?
I would like to know how to do this in two cases:
from within the same Python script: something like daemon = task.start() followed by daemon.get_info() or daemon.do_something()
from the shell (via another program I could write): myclient get_info or myclient do_something
Could someone give me some key concepts to go look into?
edit: I just read this blog post; is socket programming (as shown in his last example) the best way to go about this?

So in the end I landed on some terminology that I was missing.
The core concept seems to be inter-process communication (IPC).
On Unix variants the two easiest ways of implementing this are:
Named pipes (one-way communication)
Sockets (two-way communication)
A Python script that uses one of these can spawn another thread which repeatedly reads from the pipe and passes messages back to the main thread through a queue.
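As a rough illustration of that layout (the FIFO path and command handling are placeholders, not from the original post), a background thread reads lines from a named pipe and hands them to the main loop through a queue; this sketch is Unix-only because of os.mkfifo:

import os
import queue
import threading

FIFO_PATH = "/tmp/mytask.fifo"  # assumed location; pick one for your app
commands = queue.Queue()

def pipe_reader():
    if not os.path.exists(FIFO_PATH):
        os.mkfifo(FIFO_PATH)
    while True:
        # open() blocks until a writer connects; each line becomes one command
        with open(FIFO_PATH, "r") as fifo:
            for line in fifo:
                commands.put(line.strip())

threading.Thread(target=pipe_reader, daemon=True).start()

while True:
    # main loop: do the real work, checking for commands between steps
    try:
        cmd = commands.get(timeout=1.0)
        print("received command:", cmd)
    except queue.Empty:
        pass  # no command this cycle; carry on with the background work

A shell client could then be as simple as echo get_info > /tmp/mytask.fifo.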

Related

Calling Python code from Twisted

First of all, I should say that this might be more of a design question than a question about code itself.
I have a network with one server and multiple clients (written in Twisted because I need its asynchronous, non-blocking features); each server-client pair just sends and receives messages.
However, at some point I want one client to run a Python file when it receives a certain message. That client should keep listening and talking to the server, and I should also be able to stop that file if needed, so my first thought is to start a thread for that Python file and forget about it.
In the end it should go like this: the server sends a message to ClientA; ClientA's dataReceived function interprets the message and decides to run that Python file (which I don't know how long will take and which may contain blocking calls); when that Python file finishes running, the result should be sent to ClientB.
So, the questions are:
Would starting a thread for that Python file in ClientA be a good idea?
As I want to send the result of that Python file to ClientB, can I have another reactor loop inside that Python file?
In any case I would highly appreciate any kind of advice, as neither Python nor Twisted is my specialty and these ideas may not be the best ones.
Thanks!
At first reading, I thought you were implying Twisted isn't Python. If you are thinking that, keep the following in mind:
Twisted is a Python framework, i.e. it is Python. Specifically, it's about getting the most out of a single process/thread/core by allowing the programmer to hand-tune the scheduling/order of operations in their own code (which is nearly the opposite of the typical use of threads).
While you can interact with threads in Twisted, it's quite tricky to do without ruining Twisted's efficiency. (For a longer description of threads vs. events see this SO answer: https://stackoverflow.com/a/23876498/3334178 )
If you really want to move your new Python work away from your Twisted Python (i.e. get this work running on a different core), then I would look at spawning it off as a process; see Glyph's answer in this SO question: https://stackoverflow.com/a/5720492/3334178 for good libraries to get that done.
Processes give the proper separation to let your Twisted apps run without meaningful slowdown, and you should find that all your starting/stopping/pausing/killing needs can be met.
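For example, a minimal sketch of that approach (the child script name worker.py is a placeholder), using twisted.internet.utils.getProcessOutput to run the file as a child process and hand the result back as a Deferred without blocking the reactor:

import sys
from twisted.internet import reactor
from twisted.internet.utils import getProcessOutput

def on_result(output):
    print("child finished with output:", output)  # bytes from the child's stdout

def on_error(failure):
    print("child failed:", failure)

# run worker.py with the same interpreter, entirely outside the reactor's thread
d = getProcessOutput(sys.executable, ["worker.py"])
d.addCallbacks(on_result, on_error)
d.addBoth(lambda _: reactor.stop())
reactor.run()

In a real Twisted app the reactor is already running, so you would only keep the getProcessOutput call and the callbacks.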
Answering your questions specifically:
Would starting a thread for that Python file in ClientA be a good idea?
I would say no: it's generally not a good idea, and in your specific case you should look at using processes instead.
Can I have another reactor loop inside that Python file?
Strictly speaking, no, you can't have multiple reactors - but what Twisted can do is concurrently manage hundreds or thousands of separate tasks, all in the same reactor, and that will do what you need. I.e. run all your different async tasks in one reactor; that is what Twisted is built for.
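As a small sketch of what "many tasks, one reactor" looks like (the task names and delays are made up), using deferLater to schedule several independent jobs on the shared reactor:

from twisted.internet import defer, reactor, task

def work(name, seconds):
    # deferLater schedules the call on the shared reactor without blocking it
    return task.deferLater(reactor, seconds, lambda: "%s done" % name)

def report(results):
    for r in results:
        print(r)
    reactor.stop()

# three concurrent tasks, all managed by the same reactor
d = defer.gatherResults([work("a", 1), work("b", 2), work("c", 3)])
d.addCallback(report)
reactor.run()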
BTW, I always recommend the following tutorial for Twisted: the krondo Twisted Introduction, http://krondo.com/?page_id=1327 . It's long, but if you get through it this kind of work will become very clear.
All the best!

What's the best way to implement a Python TCP client?

I need to write a Python script which performs several tasks:
read commands from the console and send them to a server over TCP/IP
receive the server's response, process it and print output to the console.
What is the best way to create such a script? Do I have to create a separate thread to listen for the server response while interacting with the user in the main thread? Are there any good examples?
Asking for the best way or for code examples is rather off topic, but this is too long to be a comment.
There are three general ways to build these terminal-emulator-like applications:
multiple processes - the way the good old Unix cu worked, with a fork
multiple threads - a variant of the above using lightweight threads instead of processes
using the select system call for multiplexed I/O.
Generally, the first two methods are considered more straightforward to code, with one thread (or process) handling the upward communication while the other handles the downward one. The third, while trickier to code, is generally considered more efficient.
As Python supports multithreading, multiprocessing and the select call, you can choose any of these methods, with a slight preference for multithreading over multiprocessing because threads are lighter than processes and I cannot see a reason to use processes here.
The following is just my opinion.
Unless you are writing a model to be rewritten later in a lower-level language, I assume that performance is not the key issue, and my advice would be to use threads here.
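A bare-bones sketch of the threaded variant (host and port are placeholders): one thread prints whatever the server sends, while the main thread forwards console input to the server.

import socket
import threading

def reader(sock):
    # background thread: print server responses as they arrive
    while True:
        data = sock.recv(4096)
        if not data:
            print("server closed the connection")
            break
        print(data.decode(), end="")

sock = socket.create_connection(("example.com", 9000))  # placeholder address
threading.Thread(target=reader, args=(sock,), daemon=True).start()

try:
    while True:
        line = input()            # main thread: read commands from the console
        sock.sendall((line + "\n").encode())
except (EOFError, KeyboardInterrupt):
    sock.close()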

Named Event in Python

In Python, is there a cross-platform way of creating something similar to a Windows named Event in one process, and setting it from another process to signal something to the first one?
My specific problem is that I need to create a process that, on startup, will check if any other instances of itself are running, and if so, signal them to quit. With the Windows API I would use CreateEvent with the lpName parameter, and SetEvent.
I've spent about a day now searching for a good answer to this, and here is what I have come up with so far:
It is possible to use signals to indicate to the process that some change needs to take place; however, in the more complex legacy codebase I am dealing with, it causes the process to crash. Signaling interrupts various I/O operations and the like, per the Python signal docs. You can implement a signal handler with signal.SIGUSR1:
import signal

def signal_handler(signum, stack):
    print('Signal %d received' % signum)

signal.signal(signal.SIGUSR1, signal_handler)
This handler can be triggered on Linux et al. with:
$ kill -s SIGUSR1 $pid
I am presently leaning towards the kazoo Python ZooKeeper library. It requires standing up ZooKeeper as infrastructure.
I have an additional need for toggling configuration values in my case. However, ZooKeeper supports a number of interprocess communication tools that will serve your needs.
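For what it's worth, a sketch of the kazoo route (the ZooKeeper address and znode path are assumptions): one instance watches a znode, and another instance sets it to ask the first to quit.

from kazoo.client import KazooClient

zk = KazooClient(hosts="127.0.0.1:2181")  # assumed ZooKeeper location
zk.start()
zk.ensure_path("/myapp/quit")

@zk.DataWatch("/myapp/quit")
def on_change(data, stat):
    # called whenever the znode's data changes
    if data == b"1":
        print("another instance asked us to quit")

# a second instance would signal the first with:
#     zk.set("/myapp/quit", b"1")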
UPDATE:
I finally settled on a named pipe (FIFO), reading it inside a thread with readline:
import os

if not os.path.exists(fifo_name):
    os.mkfifo(fifo_name)

while True:
    with open(fifo_name, 'r') as config_fifo:
        line = config_fifo.readline()[:-1]
        print(line)
I used tempfile.gettempdir() to find a good location to place the FIFO in the file system. The code needs quite a bit of refinement, since I did not care to parse the passed content, while you might. Also, if you are planning on having more than one consumer of the event, it will be delivered to only one consumer, since a FIFO behaves like a queue.
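The writer side is then trivial; as a sketch (the FIFO name here is an assumption, yours would match whatever you created with os.mkfifo), the second instance just opens the pipe and writes a line, which the reader's readline() picks up:

import os
import tempfile

fifo_name = os.path.join(tempfile.gettempdir(), "my_app.fifo")  # assumed name

# opening a FIFO for writing blocks until a reader has it open
with open(fifo_name, "w") as config_fifo:
    config_fifo.write("quit\n")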
It seems to me that this is not so much a question of whether this is possible in Python as of whether such a cross-platform approach exists: if one does, then even if no direct Python binding exists, one can always make system calls using subprocess.call() and the like.
As for whether it's possible, I can't profess to be much of an expert, but a bit of searching has turned up these discussions, which might prove helpful to you.

Running as many instances of a program as possible

I'm trying to implement some code to import users' data from another service via the service's API. The way I'm going to set it up, all the request jobs will be kept in a queue which my simple importer program will draw from. Handling one task at a time won't come anywhere close to maxing out any of the computer's resources, so I'm wondering what the standard way is to structure a program to run multiple "jobs" at once. Should I be looking into threading, or possibly a program that pulls the jobs from the queue and launches instances of the importer program? Thanks for the help.
EDIT: What I have right now is in Python although I'm open to rewriting it in another language if need be.
Use a producer-consumer queue with as many consumer threads as you need to optimize resource usage on the host (sorry - that's very vague advice, but the "right number" is problem-dependent).
If requests are lightweight, you may well only need one producer thread to handle them.
Launching multiple processes could work too - the best choice depends on your requirements. Do you need the producer to know whether the operation worked, or is it fire-and-forget? Do you need retry logic in the event of failure? How do you keep count of concurrent consumers in this model? And so on.
For Python, take a look at this.
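As a rough sketch of that producer-consumer layout using only the standard library (the worker count and job contents are placeholders):

import queue
import threading

NUM_CONSUMERS = 4                 # tune to the host's resources
jobs = queue.Queue()

def consumer():
    while True:
        job = jobs.get()
        if job is None:           # sentinel: no more work
            break
        print("importing data for", job)   # stand-in for the real API call
        jobs.task_done()

workers = [threading.Thread(target=consumer) for _ in range(NUM_CONSUMERS)]
for w in workers:
    w.start()

# producer: push request jobs onto the queue
for user_id in range(20):
    jobs.put(user_id)

jobs.join()                       # wait until every job has been processed
for _ in workers:
    jobs.put(None)                # tell the consumers to stop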

Python Job Service Daemon?

What packages should I look at for writing a Python daemon and processing jobs? Also, what do I need to do to make a Python daemon?
I'm pretty happy with beanstalkd, which has client libraries available in various languages:
Daemon:
http://kr.github.com/beanstalkd/
Python client library:
http://code.google.com/p/pybeanstalk/
Your question is a bit ambiguous, but I'm assuming you mean you would like to write a Python daemon that will process jobs that get thrown into a queue. If not, please say as much. :-)
I've heard a lot of great things about redis. The folks at GitHub built resque as a job-processing daemon for Ruby. If you're language-flexible, you could just use that, but if you're not, you could emulate it in as much or as little depth as you like, using redis as your queue system. Depending on how pluggable and extensible you need it to be, this could be a really simple thing to implement.
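For instance, a minimal sketch of the "use redis as your queue" idea with the redis-py client (the queue name and connection details are assumptions): producers LPUSH jobs, the daemon BRPOPs them.

import json
import redis

r = redis.Redis(host="localhost", port=6379)   # assumed redis location

# producer side: enqueue a job
r.lpush("jobs", json.dumps({"task": "import_user", "user_id": 42}))

# consumer/daemon side: block until a job is available, then process it
_key, raw = r.brpop("jobs")
job = json.loads(raw)
print("processing", job)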
Another option I ran across after some more googling is redqueue. It looks like it might already implement most of a job queue.
If you're using Django, you may wish to consider the Celery project. It's a job queue system based on RabbitMQ, which is yet another queuing server with excellent reviews.
As far as creating a daemon in Python goes, there are a number of options. You can look at this page on ActiveState, which is a good start. Better yet, you can use python-daemon to do it all for you. But if you use one of the above options or beanstalkd as recommended by mczepiel, you probably won't have to make your process run as a daemon.
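If you do go the python-daemon route, a sketch (the worker loop is a placeholder) looks roughly like this; DaemonContext detaches the process, closes file descriptors and so on for you:

import time
import daemon   # the python-daemon package

def run_worker():
    while True:
        # pull a job from your queue and process it here
        time.sleep(5)

with daemon.DaemonContext():
    run_worker()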
I recently (this week) implemented a queue in RabbitMQ with a Python daemon extracting the information and storing it in a database (using the Django ORM). The daemon has an intermediate buffer, so it waits a little and writes to the database in batches instead of writing every time a small message arrives.
I made the integration with the queue using the little flopsy module, which is easy to set up. The only problem I had was setting a timeout for waiting on a message, as the module has no clear way of doing that. After a while playing with the interactive shell and a few dir() calls, I managed to get to the socket object and set the timeout.
I also considered Celery, but it seems to be more focused on using RabbitMQ internally to let you launch tasks (periodically or asynchronously) rather than on using a queue to communicate with other systems. In our case, the queue can be fed both by Python systems and by Ruby ones.
Once I had completed the process, I made some adjustments to allow running it as a daemon (mostly storing the standard output to a file to allow easy logging) and then created a bash script that launches a start-stop-daemon command. I followed more or less this schema.
I discovered python-daemon just about one day too late, so after the work was done it made no sense to revisit it, but it may make more sense for a pure Python project.
