Is celery the appropriate tech for running long-running processes I simply need to start/stop?

My python program isn't the sort of thing you'd create an init script for. It's simply a long-running process which needs to run until I tell it to shut down.
I run multiple instances of the program, each with different cmd-line args. Just FYI, the program acts like a Physics Tutor who chats with my users, and each instance represents a different Physics problem.
My Django app communicates with these processes using Redis pub/sub
I'd like to improve how I start/stop and manage these processes from Django views. What I don't know is if Celery is the proper technology to do this for me. A lot of the celery docs make it sound like it's for running short-lived asynchronous tasks, such as their 'add()' example task.
Currently my views are doing some awful 'spawn' stuff to start the processes, and I'm keeping track of which processes are running in a completely ad-hoc way utilizing a Redis hash.
My program actually only daemonizes if I pass it a -d argument, which I suppose I wouldn't pass if using celery. Without that option it writes to stdout/stderr.
All I really need is:
A way to start/stop my processes
information on whether start/stop operation succeeded
information on which of my processes are running
What I don't want is:
multiple instances of a process with the same configuration running
need to replace the way I communicate with Django (Redis pub/sub)
Does celery sound like the proper tech for me to use for my process management?

Maybe you can utilize supervisor for this. It is good at running and monitoring long running processes and has an XML-RPC interface.
You can view an example of what I did here (example output here).
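As a rough sketch of what driving supervisor over XML-RPC from a Django view could look like: the wrapper below covers the three needs in the question (start/stop, success feedback, list of running processes). The class name, the program names and the URL `http://localhost:9001/RPC2` are assumptions for illustration; the RPC methods used (`supervisor.startProcess`, `supervisor.stopProcess`, `supervisor.getAllProcessInfo`) are part of supervisor's XML-RPC API. It also assumes each tutor configuration is defined as its own program in the supervisor config, so supervisor itself refuses to start the same one twice.

```python
from xmlrpc.client import Fault, ServerProxy


class TutorProcessManager:
    """Thin wrapper around supervisor's XML-RPC API (hypothetical helper)."""

    def __init__(self, proxy):
        # `proxy` is expected to be an xmlrpc.client.ServerProxy pointed at
        # supervisor's RPC endpoint; see connect() below.
        self.rpc = proxy.supervisor

    def start(self, name):
        """Start a program; return True on success, False on an RPC fault."""
        try:
            return bool(self.rpc.startProcess(name))
        except Fault:
            # Raised for e.g. an unknown name or an already running process.
            return False

    def stop(self, name):
        """Stop a program; return True on success, False on an RPC fault."""
        try:
            return bool(self.rpc.stopProcess(name))
        except Fault:
            return False

    def running(self):
        """Return the sorted names of all processes in the RUNNING state."""
        return sorted(
            info["name"]
            for info in self.rpc.getAllProcessInfo()
            if info["statename"] == "RUNNING"
        )


def connect(url="http://localhost:9001/RPC2"):
    # Assumed URL: supervisor's inet_http_server listening on port 9001.
    return TutorProcessManager(ServerProxy(url))
```

A view would then call something like `connect().start("tutor_problem_1")` and check the boolean result. Communication with the processes over Redis pub/sub stays untouched.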


Making a zmq server run forever in Django?

I'm trying to figure out the best way to keep a zeroMQ listener running forever in my django app.
I'm setting up a zmq server app in my Django project that acts as internal API to other applications in our network (no need to go through http/requests stuff since these apps are internal). I want the zmq listener inside of my django project to always be alive.
I want the zmq listener in my Django project so I have access to all of the projects models (for querying) and other django context things.
I'm currently thinking:
Set up a Django management command that will run the listener and keep it alive forever (aka infinite loop inside the zmq listener code) or
use a celery worker to always keep the zmq listener alive? But I'm not exactly sure how to get a celery worker to restart a task only if it's not running. All the celery docs are about frequency/delayed running. Or maybe I should let celery purge the task at a given interval & restart it anyway.
Any tips, advice on performance implications or alternate approaches?
Setting up a management command is a fine way to do this, especially if you're running on your own hardware.
If you're running in a cloud, where a machine may disappear along with your process, then the latter is a better option. This is how I've done it:
Set up a periodic task that runs every N seconds (you need celerybeat running somewhere)
When the task spawns, it first checks a shared network resource (redis, zookeeper, or a db), to see if another process has an active/valid lease. If one exists, abort.
If there's no valid lease, obtain your lease (beware of concurrency here!), and start your infinite loop, making sure you periodically renew the lease.
Add instrumentation so that you know which worker is running the process and on which machine.
Start celery workers on multiple boxes, consuming from the same queue your periodic task is designated for.
The second solution is more complex and harder to get right, so if you can, run a single instance and consider using something like supervisord to ensure the process gets restarted if it faults for some reason.
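The lease steps above can be sketched roughly as follows. This is an illustration under assumptions, not the answerer's actual code: `client` is assumed to be a redis-py client (`redis.Redis`) or any object with compatible `set`, `get`, `expire` and `delete` methods, and the `Lease` class, `run_singleton` function and key name are hypothetical. Acquisition relies on Redis `SET` with `NX` and `EX`, which is atomic. The renewal shown here is not atomic and is one of the concurrency spots the answer warns about.

```python
import uuid


class Lease:
    """A lease on a shared key, held by at most one process at a time."""

    def __init__(self, client, key, ttl_seconds=30):
        self.client = client
        self.key = key
        self.ttl = ttl_seconds
        self.token = uuid.uuid4().hex  # identifies this particular holder

    def acquire(self):
        # SET key token NX EX ttl: create only if absent, with an expiry,
        # so a crashed holder's lease disappears on its own.
        return bool(self.client.set(self.key, self.token, nx=True, ex=self.ttl))

    def _held_by_me(self):
        value = self.client.get(self.key)
        if isinstance(value, bytes):  # redis-py returns bytes by default
            value = value.decode()
        return value == self.token

    def renew(self):
        # Not atomic: the lease could expire between the check and the
        # expire call. A stricter version would do both in one Lua script.
        if not self._held_by_me():
            return False
        return bool(self.client.expire(self.key, self.ttl))

    def release(self):
        if self._held_by_me():
            self.client.delete(self.key)


def run_singleton(lease, do_work, should_stop):
    """Run do_work() in a loop, but only while this process holds the lease."""
    if not lease.acquire():
        return False  # another process has a valid lease: abort
    try:
        while not should_stop():
            do_work()
            if not lease.renew():
                break  # lease lost; stop rather than risk running twice
    finally:
        lease.release()
    return True
```

The periodic Celery task would build a `Lease` and call `run_singleton`, with `do_work` handling one iteration of the listener loop. The `ttl_seconds` value needs to be comfortably longer than one iteration, otherwise the lease expires between renewals.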

Python Celery task to restart celery worker

In celery, is there a simple way to create a (series of) task(s) that I could use to automagically restart a worker?
The goal is to have my deployment automagically restart all the child celery workers every time it gets a new source from github. So I could then send out a restartWorkers() task to my management celery instance on that machine that would kill (actually stopwait) all the celery worker processes on that machine, and restart them with the new modules.
The plan is for each machine to have:
Management node [Queues: Management, machine-specific] - Responsible for managing the rest of the workers on the machine, bringing up new nodes and killing old ones as necessary
Worker nodes [Queues: git revision specific, worker specific, machine specific] - Actually responsible for doing the work.
It looks like the code I need is somewhere in dist_packages/celery/bin/, but the source is rather opaque for starting workers, and I can't tell how it's supposed to work or where it's actually starting the nodes. (It looks like shutdown_nodes is the correct code to be calling for killing the processes, and I'm slowly debugging my way through it to figure out what my arguments should be)
Is there a function/functions restart_nodes(self, nodes) somewhere that I could call or am I going to be running shell scripts from within python?
Also, is there a simpler way to reload the source into Python than killing and restarting the processes? If I knew that reloading the module actually worked, I'd just do that instead of the indirection with management nodes. (Experiments say that it doesn't: changes to functions do not percolate until I restart the process.)
I can now shutdown, thanks to broadcast(Thank you mihael. If I had more rep, I'd upvote). Any way to broadcast a restart? There's pool_restart, but that doesn't kill the node, which means that it won't update the source.
I've been looking into some of the behind the scenes source in celery.bin.celeryd:WorkerCommand().run(), but there's some weird stuff going on before and after the run call, so I can't just call that function and be done because it crashes. It just makes 0 sense to call a shell command from a python script to run another python script, and I can't believe that I'm the first one to want to do this.
You can try to use broadcast functionality of Celery.
Here you can see some good examples:
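For instance, a shutdown broadcast could look roughly like the sketch below. `app.control.broadcast` and the `shutdown` remote control command are part of Celery; the function name and the worker hostname in the usage note are assumptions for illustration. Note that this only stops workers. Bringing them back up with the new source would still be the job of whatever launches them, for example a process supervisor configured to restart exited workers, which is an assumption and not something the question describes.

```python
def shutdown_workers(app, hostnames=None):
    """Ask Celery workers to shut down via a remote control broadcast.

    `app` is assumed to be a celery.Celery application instance.
    `hostnames` is an optional list of worker node names; with None the
    command goes to every worker listening on the broker.
    """
    app.control.broadcast("shutdown", destination=hostnames)
```

Called as `shutdown_workers(app, ["celery@worker1"])`, this targets a single node, so a management task could stop only the workers on its own machine.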

Asynchronous background processes with web2py

I need to handle a large (time and memory-consuming) process asynchronously in a web2py application, called from inside a controller method.
My specific use case is to call a process via stdlib.subprocess and wait for it to exit without blocking the web server, but I am open to alternative methods.
Hands-on examples would be a plus.
3rd party library recommendations are welcome.
CRON scheduling is not required/wanted.
Assuming you'll need to start multiple, possibly simultaneous, instances of the background task, the solution is a task queue. I've heard good things about Celery and RabbitMQ, if you're looking for 3rd-party options, and web2py includes its own task queue system that might be sufficient for your needs.
With either tool, you'll define a function that encapsulates the operation you want the background process to perform. Then bring the task queue workers online. The web2py manual and forums indicate this can be done with an @reboot statement in the web2py cron system, which is triggered whenever the web server starts. There are probably other ways to start the workers if this is unsatisfactory.
In your controller you'll insert a task into the task queue, passing any necessary parameters as inputs to the function (the background function will not run in the same environment as the controller, so it won't have access to the session, DB, etc. unless you explicitly pass the appropriate values into the task function).
Now, to get the output of the background operation to the user. When you insert a task into the task queue, you should get back a unique ID for the task. You would then implement controller logic (either something that expects an AJAX call, or a page that keeps refreshing until the task completes) that calls the task queue's API to check the status of the specified task. If the task's status is "finished", return the data to the user. If not, keep waiting.
Maybe review the book section on running tasks in the background. You can use the new scheduler or create a homemade queue (email example). There's also a web2py-celery plugin, though I'm not sure what state that is in.
This is more difficult than one might expect. Note the deadlock warnings in the stdlib.subprocess documentation. It's easy if you don't mind blocking---use Popen.communicate. To work around the blocking, you can manage the process using stdlib.subprocess from a thread.
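A minimal sketch of the thread workaround, assuming a callback is an acceptable way to deliver the result (the function name `run_in_background` and the callback signature are hypothetical). `Popen.communicate` still blocks, but it blocks the helper thread instead of the request handler, and it drains both pipes so the deadlock mentioned in the documentation does not occur.

```python
import subprocess
import threading


def run_in_background(cmd, on_done):
    """Run cmd in a child process without blocking the calling thread.

    on_done(returncode, stdout, stderr) is called from the helper thread
    once the process exits.
    """

    def worker():
        proc = subprocess.Popen(
            cmd,
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
            text=True,
        )
        # communicate() reads both pipes to EOF and then waits for exit.
        out, err = proc.communicate()
        on_done(proc.returncode, out, err)

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    return thread
```

In a web application the callback would typically store the result somewhere a later request can find it (a database row or cache entry), since the original request has already returned by then.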
My favorite way to deal with subprocesses is to use Twisted's spawnProcess. But, it is not easy to get Twisted to play nicely with other frameworks.

How to create a thread-safe singleton in python

I would like to hold running threads in my Django application. Since I cannot do so in the model or in the session, I thought of holding them in a singleton. I've been checking this out for a while and haven't really found a good how-to for this.
Does anyone know how to create a thread-safe singleton in python?
More specifically, what I want to do is implement some kind of "anytime algorithm", i.e. when a user presses a button, a response is returned and a new computation begins (a new thread). I want this thread to run until the user presses the button again, and then my app will return the best solution it managed to find. To do that, I need to save the thread object somewhere. I thought of storing it in the session, which apparently I cannot do.
The bottom line is - i have a FAT computation i want to do on the server side, in different threads, while the user is using my site.
Unless you have a very good reason - you should execute the long running threads in a different process altogether, and use Celery to execute them:
Celery is an open source asynchronous task queue/job queue based on distributed message passing. It is focused on real-time operation, but supports scheduling as well.
The execution units, called tasks, are executed concurrently on one or more worker nodes using multiprocessing, Eventlet or gevent. Tasks can execute asynchronously (in the background) or synchronously (wait until ready).
Celery guide for djangonauts:
For singletons and sharing data between tasks/threads, again, unless you have a good reason, you should use the db layer (aka, models) with some caution regarding db locks and refreshing stale data.
Update: regarding your use case, define a Computation model, with a status field. When a user starts a computation, an instance is created, and a task will start to run. The task will monitor the status field (check db once in a while). When a user clicks the button again, a view will change the status to user requested to stop, causing the task to terminate.
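The task side of this could look roughly like the sketch below. It is framework-free on purpose: `stop_requested` stands in for "re-read the status field from the database", and the function name, parameters and `check_every` interval are all hypothetical. The loop keeps the best candidate seen so far and only consults the stop flag every few iterations, so the database is checked once in a while and not on every step.

```python
def anytime_search(score, candidates, stop_requested, check_every=100):
    """Return the best (candidate, score) found before a stop was requested.

    score:          function mapping a candidate to a number (higher is better)
    candidates:     iterable of candidate solutions, possibly endless
    stop_requested: zero-argument callable; in a Django task this would
                    re-query the Computation row and test its status field
    check_every:    how many candidates to evaluate between stop checks
    """
    best, best_score = None, float("-inf")
    for i, candidate in enumerate(candidates):
        s = score(candidate)
        if s > best_score:
            best, best_score = candidate, s
        # Poll the stop flag only occasionally to limit database traffic.
        if (i + 1) % check_every == 0 and stop_requested():
            break
    return best, best_score
```

When the loop exits, the task would write `best` back to the same Computation row so the view handling the second button press can return it.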
If you want asynchronous code in a web application then you're taking the wrong approach. You should run background tasks with a specialist task queue like Celery:
The biggest problem you have is web server architecture. Unless you go against the recommended Django web server configuration and use a worker thread MPM, you will have no way to track your thread handles between requests as each request typically occupies its own process. This is how Apache normally works:
In light of your edit I think you might learn more by creating a custom solution that does this:
Maintains start/stop state in the database
Create a new program that runs as a daemon
Periodically check the start/stop state and begin or end work from here
There's no need for multithreading here unless you need to create a new process for each user. If so, things get more complicated and using Celery will make your life much easier.

Python Job Service Daemon?

What packages should I look at for writing a python daemon and processing jobs? Also, what do I need to do for a python daemon?
I'm pretty happy with beanstalkd, which has client libraries available in various languages:
Python client library:
Your question is a bit ambiguous, but I'm assuming you mean you would like to write a python daemon that will process jobs that get thrown in a queue. If not, please say as much. :-)
I've heard a lot of great things about redis. The folks at github built resque as a job processing daemon for Ruby. If you're language flexible, you could just use that, but if you're not, you could emulate it in as much or as little depth as you like making use of redis as your queue system. Depending on how pluggable and extensible you need it to be, this could be a really simple thing to implement.
Another option I ran across after some more googling is redqueue. It looks like it might already implement most of a job queue.
If you're using django, you may wish to consider the Celery project. It's a job queue system based on RabbitMQ which is yet another queuing server with excellent reviews.
As far as creating a daemon in python, there are a number of options. You can look at this page on activestate, which is a good start. Better yet, you can use python-daemon to do it all for you. But if you use one of the above options or beanstalkd as recommended by mczepiel, you probably won't have to make your process run as a daemon.
I have recently (this week) implemented a queue in RabbitMQ with a python daemon extracting the information and storing it in a database (using Django ORM). The daemon has an intermediate buffer, so it waits a little and writes to the database in batches, instead of writing each time a little message arrives.
I've made the integration with the queue using this little flopsy module, which is easy to set up. The only problem I had was setting a timeout for waiting for a message, as the module has no clear way of doing that. After a while playing with the interactive shell and making a few dir() calls, I managed to get to the socket object and set the timeout.
I also considered Celery, but it seems to be more focused on using RabbitMQ internally to let you launch tasks (periodically or asynchronously) than on using a queue to communicate with other systems. In our case, the queue can be fed by both Python systems and Ruby ones.
Once I'd completed the process, I made some adjustments to allow running it as a daemon (mostly storing the standard output in a file to allow easy logging) and then created a bash script that launches a start-stop-daemon command. I've followed more or less this schema
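The batching idea can be sketched like this; it is an illustration of the technique, not the daemon's actual code. The `BatchBuffer` class, its parameters and the thresholds are hypothetical. `flush` is whatever writes a list of items in one go; with the Django ORM that could be a function wrapping `bulk_create`, which is an assumption about how the writes are done.

```python
import time


class BatchBuffer:
    """Collect items and hand them to `flush` in batches."""

    def __init__(self, flush, max_items=100, max_age=5.0, clock=time.monotonic):
        self.flush_fn = flush        # called with a list of buffered items
        self.max_items = max_items   # flush when this many items are waiting
        self.max_age = max_age       # or when the oldest item is this old (s)
        self.clock = clock
        self.items = []
        self.first_added = None

    def add(self, item):
        if not self.items:
            self.first_added = self.clock()
        self.items.append(item)
        self.flush_if_due()

    def flush_if_due(self):
        """Flush if the buffer is full or its oldest item is too old."""
        if not self.items:
            return
        too_many = len(self.items) >= self.max_items
        too_old = self.clock() - self.first_added >= self.max_age
        if too_many or too_old:
            self.flush()

    def flush(self):
        if self.items:
            batch, self.items = self.items, []
            self.flush_fn(batch)
```

The daemon's receive loop would call `add` for each message and `flush_if_due` whenever a receive times out, so a quiet queue does not leave items sitting in memory.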
I discovered python-daemon just about one day late, so after the work is done it makes no sense revisiting it, but maybe it makes more sense for a Python project.


