My site is written in Django. I need to run some tasks in the background of the container (I'm using EC2).
Recently I researched Celery, but it requires Redis or another queue server to run. That rules Celery out for me, because I'm not allowed to install anything else.
Question: can I set up Celery standalone? If yes, how do I do this? If no, is there an alternative that can be installed standalone?
The answer is - no, you cannot use Celery without a broker (Redis, RabbitMQ, or any other from the list of supported brokers).
I am not aware of a service that does both (queue management AND an execution environment for your tasks). The best services follow the UNIX paradigm - "do one thing, and do it right". The service you describe would have to do two different, non-trivial things, which is probably why such a service does not exist (at least not in the Python world).
I have prototyped a system using Python on Linux. I am now designing the architecture to move to a web-based system. I will use Django to serve public and private admin pages. I also need a service running which will periodically run scripts, connect to the internet, and allow API messaging with an admin user. Thus there will be three components: web server, api_service, and database.
1) What is the best mechanism for deploying a Python api_service on the VM? My background is mainly C++/C#, and I would usually have deployed a C#-written service on the same VM as the web server and used some sort of TCP messaging wrapper for the API. My admin API code will be ad hoc Python scripts run from my machine to execute functionality in this service.
2) All my database code is written against an interface that presently uses flat files. Any database suggestions? PostgreSQL, MongoDB, ...?
Many thanks in advance for helpful suggestions. I am an ex-windows/C++/C# developer who now absolutely loves Python/Cython and needs a little help please ...
Right, I'm answering my own question. I've done a fair bit of research since posting.
2) PostgreSQL seems a good choice. There seem to be no damning warnings against using it, and there is plenty of searchable help. I am therefore writing concrete PostgreSQL classes that implement my serialization interfaces.
1) Rather than implement my own service in python that sits on a remote machine, I am going to use Celery. RabbitMQ will act as the distributed TCP message wrapper. I can put required functionality in python scripts on the VM that Celery can find and execute as tasks. I can run these Celery tasks in 3 ways. i) A web request through Django can queue a task. ii) I can manually queue a remote Celery task from my machine by running a python script. iii) I can use Celery Beat to schedule tasks periodically. This fits my needs perfectly as I have a handful of daily/periodic tasks that can be scheduled plus a few rare maintenance tasks that I can fire off from my machine.
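As a rough sketch (the app name, broker URL, and task body below are placeholders, not my actual code), a minimal tasks module covering all three cases might look like this:

# tasks.py - minimal sketch; names and broker URL are placeholders
from celery import Celery
from celery.schedules import crontab

app = Celery("api_service", broker="amqp://guest@localhost//")

@app.task
def refresh_data():
    # the actual periodic/maintenance work goes here
    pass

# i) and ii): queue the task from a Django view or from an ad hoc script
#     refresh_data.delay()

# iii): let Celery Beat run it every day at 02:00
app.conf.beat_schedule = {
    "nightly-refresh": {
        "task": "tasks.refresh_data",
        "schedule": crontab(hour=2, minute=0),
    },
}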
To summarize then, where before I would have created a windows service that handled both incoming TCP commands and scheduled behaviour, I can use RabbitMQ, Celery, Celery Beat and python scripts that sit on the VM.
Hope this helps anybody with a similar 'how to get started' problem .....
I need to run some tasks in the background of a web app (checking code out, etc.) without blocking the views.
The twist on the typical queue/Celery scenario is that I have to ensure the tasks run to completion, surviving even a web app crash or restart, whatever their final result.
I was thinking about recording the parameters for a multiprocessing.Pool in a database and restarting all incomplete tasks at web app restart. It's doable, but I'm wondering whether there's a simpler or more cost-effective approach.
UPDATE: Why not Celery itself? Well, I have used Celery in some projects and it's really a great solution, but for this task it's on the big side: it requires a separate server, communication, etc., while all I need is to spawn a few processes/threads, do some work in them (git clone ..., svn co ...) and check whether they succeeded or failed. Another issue is that I need the solution to be as small as possible, since it has to follow elaborate corporate guidelines and procedures, and the administrative and bureaucratic overhead I'd have to go through to get Celery on board is something I'd prefer to avoid if I can.
I would suggest you use Celery.
Celery does not require its own server; you can have a worker running on the same machine. You can also have a "poor man's queue" using an SQL database instead of a "real" queue/messaging server such as RabbitMQ - this setup would look very much like what you're describing, only with a separate process doing the long-running tasks.
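As a sketch of that "poor man's queue" idea (the connection strings below are placeholders, and the SQL transport is only really suitable for low traffic), you can point Celery at an SQL database via Kombu's SQLAlchemy transport:

# celery_app.py - sketch: SQL database as broker and result store
from celery import Celery

app = Celery(
    "myproject",
    broker="sqla+postgresql://user:password@localhost/mydb",   # queue lives in the DB
    backend="db+postgresql://user:password@localhost/mydb",    # task results in the DB too
)

@app.task
def long_running_job(repo_url):
    # the git clone / svn co work happens in the worker process, not the web process
    pass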
The problem with starting long-running tasks from the webserver process is that in the production environment the web "workers" are normally managed by the webserver - multiple workers can be spawned or killed at any time. The viability of your approach would highly depend on the web server you're using and its configuration. Also, with multiple workers each trying to do a task you may have some concurrency issues.
Apart from Celery, another option is to look at uWSGI's spooler subsystem, especially if you're already using uWSGI.
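If you already run uWSGI, the spooler works roughly like this (a sketch assuming uWSGI is started with a --spooler directory; the task and argument names are made up):

# spool_tasks.py - imported inside the uWSGI-hosted app
from uwsgidecorators import spool

@spool
def checkout_code(arguments):
    # arguments is a dict of the values passed to .spool() below
    # do the git clone / svn co here
    ...

# from a view: enqueue the job; it is persisted as a file in the spool
# directory until a spooler process picks it up, so it survives restarts
#     checkout_code.spool(repo="https://example.com/repo.git")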
I am working on a web application that uses a permanent object, MyService. Using a web interface, I dynamically update its state and monitor its behavior. Now I would like to periodically call one of its methods. I was thinking of using Celery's PeriodicTask but ran into some scope issues. It seems I need to run three different processes:
python manage.py runserver
python manage.py celery worker
python manage.py celerybeat
The problem is that even if I ensure that MyService is a singleton that can be safely used by more than one thread, Celery creates its own fresh copy of the object. Is there a way I could share this object between the Django server and the Celery main process? I tried to find a way to start Celery from within a Django script, but so far with no success. I would appreciate any help.
If you need to share something between multiple processes, or maybe even multiple machines (e.g. your workers could run on a separate machine), the best (and probably easiest) practice for sharing information is to use an external service.
In the simplest case you could use Django's DB, but if that turns out not to be suitable, for example because you have a heavy write load, you can use something like Redis or Memcached (which you can also talk to via Django's caching API). These will let you handle a big write load, and you can use Redis as a queue for Celery as well.
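As a small sketch of the cache approach (the key name and the state dict are made up), the web process and the worker can exchange state like this:

# works the same in the Django process and in the Celery worker
from django.core.cache import cache

# web process: persist the current state of MyService after each update
cache.set("myservice:state", {"enabled": True, "poll_interval": 60}, timeout=None)

# Celery worker (separate process): read the shared state before acting on it
state = cache.get("myservice:state") or {}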
I'm looking for a relatively simple and lightweight way to set up primitive DB maintenance tasks for a Django-based web site. Celery seems like overkill to me.
In my mind it's now looking like writing a custom Django management command and putting it in cron. Could someone suggest a better method?
django-extensions has a jobs-scheduling function that would work well for DB maintenance tasks. You still would rely on cron entries to actually run them though.
But then again, just doing a management command from cron is perfectly reasonable.
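For reference, a bare-bones management command plus a cron entry would look something like this (the app name, command name, and schedule are just examples):

# myapp/management/commands/db_maintenance.py
from django.core.management.base import BaseCommand

class Command(BaseCommand):
    help = "Runs periodic DB maintenance"

    def handle(self, *args, **options):
        # vacuuming, pruning old rows, etc. goes here
        self.stdout.write("maintenance done")

# crontab entry - run it every night at 03:00:
#   0 3 * * * /path/to/venv/bin/python /path/to/project/manage.py db_maintenance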
Django Chronograph is a Django app with a very nice admin interface for managing cron jobs and setting up multiple tasks. This way you don't have to go and fiddle with your server's cron file; the interface/app manages it efficiently for you.
You can also do it the Django way by writing Custom Management Commands as also mentioned here.
Something I've been interested in is running a certain set of actions at regular time intervals. Obviously, this is a task for cron, right?
Unfortunately, the Internet seems to be in a bit of disagreement there.
Let me elaborate a little about my setup. First, my development environment is in Windows, while my production environment is hosted on Webfaction (Linux). There is no real cron on Windows, right? Also, I use Django! And what's suggested for Django?
Celery, of course! Unfortunately, setting up Celery has been more or less a literal nightmare for me - please see Error message 'No handlers could be found for logger “multiprocessing”' using Celery. And this is only ONE of the problems I've had with Celery. Others include a socket error that I seem to be the only person ever to have run into.
Don't get me wrong, Celery seems REALLY cool. Unfortunately, there seems to be a lack of support, and some odd limitations built into its preferred backend, RabbitMQ. Unfortunately, no matter how cool a program is, if it doesn't work, well, it doesn't work!
That's where I hope all of you can come in. I'd like to know about cron or a cron-equivalent, which can be set up similarly (preferably identically) in both a Windows and a Linux environment.
(I've been struggling with Celery for about two weeks now and unfortunately I think it's time to toss in the towel and give up on it, at least for now.)
I had the same problem, and held off trying to solve it with Celery (too complicated) or cron (external to the application), and ended up finding Advanced Python Scheduler. I've only just started using it, but it seems reasonably mature and stable, has decent documentation, and accepts a number of scheduling formats (e.g. cron style).
From the documentation, running a function at a specific interval:
from apscheduler.scheduler import Scheduler  # APScheduler 2.x API

sched = Scheduler()
sched.start()  # runs the scheduler in a background thread, so this does not block

def hello_world():
    print "hello world"

# call hello_world() every 10 seconds
sched.add_interval_job(hello_world, seconds=10)
This is non-blocking, and I run something pretty identical by simply importing the module from my urls.py. Hope this helps.
A simple, non-Celery way to approach things would be to create custom django-admin commands to perform your asynchronous or scheduled tasks.
Then, on Windows, you use the at command to schedule these tasks. On Linux, you use cron.
I'd also strongly recommend ditching Windows if you can for a development environment. Your life will be so much better on Linux or even Mac OSX. Re-purpose a spare or old machine with Ubuntu for example, or run Ubuntu in a VM on your Windows box.
https://github.com/andybak/django-cron
It is triggered by a single cron task, but all the scheduling and configuration is done in Python.
Django Chronograph is a great alternative. You only need to set up one cron entry, then do everything in the Django admin. You can schedule tasks/commands from Django management.