How to get celery to prioritize tasks? - python

I am trying to prioritize certain tasks using celery (v5.0.0) but it seems I am missing something fundamental. According to the documentation, task priority should be available for RabbitMQ. However, whenever I try to add the relevant lines to the configuration file, task execution stops working. To illustrate my problem, I will first outline my setup without the prioritization.
My setup consists of a configuration file celeryconfig.py
from kombu import Exchange, Queue
broker_url = 'amqp://user:password@some-rabbit//'
result_backend = 'rpc://'
Then I have a file for task definition, prio.py, that contains three tasks intended for execution timing:
import time
from celery import Celery
app = Celery('tasks')
app.config_from_object('celeryconfig')
@app.task
def get_time():
    t0 = time.time()
    return t0

@app.task
def wait(t0):
    time.sleep(0.2)
    return t0

@app.task
def time_delta(t0):
    time.sleep(0.2)
    t1 = time.time()
    return t1 - t0
And finally, the code to start the tasks (in start_tasks.py). Here, I have created two identical paths (getting time t0, wait two times, get time t1, and calculate t1-t0) that I later want to prioritize in order to let the first one finish before the second starts.
from celery import chain, group, Celery
from prio import get_time, time_delta, wait
tasks = chain(get_time.s(),
              group([
                  chain(wait.s(), wait.s(), time_delta.s()),
                  chain(wait.s(), wait.s(), time_delta.s())
              ]))
res = tasks.apply_async()
print(res.get())
This will output
>>> [1.0178837776184082, 1.2237505912780762]
I interpret these results as meaning that the tasks of the first and the second path run alternately, which results in the small difference between the execution times. In the end, I would like to achieve a result like [0.6, 1.2].
Now, to introduce prioritization, I have modified the celeryconfig.py according to the documentation:
from kombu import Exchange, Queue
broker_url = 'amqp://user:password@some-rabbit//'
result_backend = 'rpc://'
task_queues = [
    Queue('tasks', Exchange('tasks'), routing_key='tasks', queue_arguments={'x-max-priority': 10}),
]
task_queue_max_priority = 10
task_default_priority = 5
With this change, however, the tasks do not seem to be executed at all, and I have no clue what to change in order to make it work. I have already tried to pass the queue name to apply_async (res = tasks.apply_async(queue='tasks')) but that did not solve the problem. Any hints are welcome!
Edit:
Since I still cannot get it to work, I tried to make the setup cleaner and clearer using Docker containers.
I created a Docker image from the following minimal Dockerfile:
FROM python:latest
RUN pip install --quiet celery
The image is then built with docker build --tag minimalcelery . (the trailing dot is the build context).
I start RabbitMQ with:
docker run --network celery_network --interactive --tty --hostname my-rabbit --name some-rabbit --env RABBITMQ_DEFAULT_USER=user --env RABBITMQ_DEFAULT_PASS=password --rm rabbitmq
The client is started with:
docker run --name "client" --interactive --tty --rm --volume %cd%:/home/work --hostname localhost --network celery_network --env PYTHONPATH=/home/work/ minimalcelery /bin/bash -c "cd /home/work/ && python start_tasks.py"
The worker is started with:
docker run --name "worker" --interactive --tty --rm --volume %cd%:/home/work --hostname localhost --network celery_network --env PYTHONPATH=/home/work/ minimalcelery /bin/bash -c "cd /home/work/ && celery --app=prio worker"
prio.py was modified according to @ItayB's answer:
import time

from celery import Celery

app = Celery('tasks', backend='rpc://', broker='amqp://user:password@some-rabbit//')
app.config_from_object('celeryconfig')

@app.task
def get_time(**kwargs):
    t0 = time.time()
    return t0

@app.task
def wait(t0, **kwargs):
    time.sleep(0.2)
    return t0

@app.task
def time_delta(t0, **kwargs):
    time.sleep(0.2)
    t1 = time.time()
    return t1 - t0
With this setup and the "unprioritized" config, I get a result similar to the one I previously obtained: [0.6443462371826172, 0.6746957302093506].
However, as before, if I use the additional queueing/prioritization settings, I don't get any reply. These are the files I used:
celeryconfig.py:
from kombu import Exchange, Queue

task_queues = [
    Queue('tasks', Exchange('tasks'), routing_key='tasks', queue_arguments={'x-max-priority': 10}),
]
task_queue_max_priority = 10
task_default_priority = 5
start_tasks.py:
from celery import Celery, chain, group
from prio import get_time, time_delta, wait
tasks = chain(get_time.s(), group([chain(wait.s(), wait.s(), time_delta.s()), chain(wait.s(), wait.s(), time_delta.s())]))
print("about to start the tasks.")
res = tasks.apply_async(queue='tasks', routing_key='tasks', priority=5)
print("waiting...")
print(res.get())
Also, I tried to modify the command by which the worker is started:
docker run --name "worker" --interactive --tty --rm --volume %cd%:/home/work --hostname localhost --network celery_network --env PYTHONPATH=/home/work/
minimalcelery /bin/bash -c "cd /home/work/ && celery --app=prio worker --hostname=worker.tasks@%h --queues=tasks"
I am aware that this setup does not prioritize the "paths" differently. Currently, I just want to get an answer from the worker.

It seems like you're almost there. It's been a while since I did this, but I'll try (based on some snippets I have):
change your celery tasks to support kwargs:
def get_time(**kwargs):
pass the priority when you set the signature:
get_time.s(kwargs={"priority": 3}) # set value below x-max-priority
I'm not sure if it's a must but I've also defined the task as immutable:
get_time.s(immutable=True, kwargs={"priority": 3})
set the worker_prefetch_multiplier to 1. The default is 4, which means that four tasks are prefetched together, so I don't think there will be any prioritization between them (AFAIR).
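Putting these suggestions together, here is a rough sketch of a combined config I would try (the queue name tasks and the credentials are taken from the question; it assumes the worker is started with --queues=tasks):

from kombu import Exchange, Queue

broker_url = 'amqp://user:password@some-rabbit//'
result_backend = 'rpc://'

# Declare the priority-enabled queue and make it the default,
# so the producer and the worker agree on where messages go.
task_queues = [
    Queue('tasks', Exchange('tasks'), routing_key='tasks',
          queue_arguments={'x-max-priority': 10}),
]
task_default_queue = 'tasks'
task_default_exchange = 'tasks'
task_default_routing_key = 'tasks'
task_default_priority = 5

# Prefetch only one task at a time, otherwise several tasks are reserved
# together and the broker has no chance to reorder them by priority.
worker_prefetch_multiplier = 1

The priority can then also be set per call, e.g. tasks.apply_async(queue='tasks', routing_key='tasks', priority=9).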
Good luck!

Related

Celery worker doesn't launch from Python

We have Python 3.6.1 set up with Django, Celery, and RabbitMQ on Ubuntu 14.04. Right now, I'm using the Django debug server (for development; Apache isn't working). My current problem is that the Celery workers get launched from Python and immediately die; the processes show as defunct. If I use the same command in a terminal window, the worker gets created and picks up the task if there is one waiting in the queue.
Here's the command:
celery worker --app=myapp --loglevel=info --concurrency=1 --maxtasksperchild=20 -n celery_1 -Q celery
The same behavior occurs for whichever queues are being set up.
In the terminal, we see the output myapp.settings - INFO - Loading... followed by output that describes the queue and lists the tasks. When running from Python, the last thing we see is the Loading...
In the code, we do have a check to be sure we are not running the celery command as root.
These are the Celery settings from our settings.py file:
CELERY_ACCEPT_CONTENT = ['json','pickle']
CELERY_TASK_SERIALIZER = 'pickle'
CELERY_RESULT_SERIALIZER = 'json'
CELERY_IMPORTS = ('api.tasks',)
CELERYD_PREFETCH_MULTIPLIER = 1
CELERYD_CONCURRENCY = 1
BROKER_POOL_LIMIT = 120 # Note: I tried this set to None but it didn't seem to make any difference
CELERYD_LOG_COLOR = False
CELERY_LOG_FORMAT = '%(asctime)s - %(processName)s - %(levelname)s - %(message)s'
CELERYD_HIJACK_ROOT_LOGGER = False
STATIC_URL = '/static/'
STATIC_ROOT = os.path.join(psconf.BASE_DIR, 'myapp_static/')
BROKER_URL = psconf.MQ_URI
CELERY_RESULT_BACKEND = 'rpc'
CELERY_RESULT_PERSISTENT = True
CELERY_ROUTES = {}
for entry in os.scandir(psconf.PLUGIN_PATH):
    if not entry.is_dir() or entry.name == '__pycache__':
        continue
    plugin_dir = entry.name
    settings_file = f'{plugin_dir}.settings'
    try:
        plugin_tasks = importlib.import_module(settings_file)
        queue_name = plugin_tasks.QUEUENAME
    except ModuleNotFoundError as e:
        logging.warning(e)
    except AttributeError:
        logging.debug(f'The plugin {plugin_dir} will use the general worker queue.')
    else:
        CELERY_ROUTES[f'{plugin_dir}.tasks.run'] = {'queue': queue_name}
        logging.debug(f'The plugin {plugin_dir} will use the {queue_name} queue.')
Here is the part that kicks off the worker:
class CeleryWorker(BackgroundProcess):
    def __init__(self, n, q):
        self.name = n
        self.worker_queue = q
        cmd = f'celery worker --app=myapp --loglevel=info --concurrency=1 --maxtasksperchild=20 -n {self.name} -Q {self.worker_queue}'
        super().__init__(cmd, cwd=str(psconf.BASE_DIR))

class BackgroundProcess(subprocess.Popen):
    def __init__(self, args, **kwargs):
        super().__init__(args, shell=True, stdout=subprocess.PIPE, stderr=subprocess.PIPE, universal_newlines=True, **kwargs)
Any suggestions as to how to get this working from Python are appreciated. I'm new to Rabbitmq/Celery.
Just in case someone else needs this... It turns out that the problem was that the shell script which kicks off this whole app is now being launched with sudo, and even though I thought I was checking so that we wouldn't launch the celery worker with sudo, I'd missed something and we were trying to launch as root. That is a no-no. I'm now explicitly using 'sudo -u ' and the workers are starting properly.
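For reference, a minimal sketch of the kind of root check described above (the actual check in the project isn't shown, so this is an assumption; POSIX only):

import os
import sys

# Refuse to start the worker when running as root (effective UID 0).
if os.geteuid() == 0:
    sys.exit('Refusing to launch the celery worker as root; run it as an unprivileged user instead.')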

celery one broker multiple queues and workers

I have a Python file called tasks.py in which I define four individual tasks. I would like to configure Celery to use four queues, because each queue would have a different number of workers assigned. I read that I should use the route_task property, but I tried several options without success.
I was following this doc: celery route_tasks docs.
My goal is to run four workers, one for each task, and not mix tasks from different workers in different queues. Is this possible? Is it a good approach?
If I am doing something wrong, I would be happy to change my code to make it work.
Here is my config so far
tasks.py
from celery import Celery
from kombu import Queue

app = Celery('tasks', broker='pyamqp://guest@localhost//')
app.conf.task_default_queue = 'default'
app.conf.task_queues = (
    Queue('queueA', routing_key='tasks.task_1'),
    Queue('queueB', routing_key='tasks.task_2'),
    Queue('queueC', routing_key='tasks.task_3'),
    Queue('queueD', routing_key='tasks.task_4')
)

@app.task
def task_1():
    print("Task of level 1")

@app.task
def task_2():
    print("Task of level 2")

@app.task
def task_3():
    print("Task of level 3")

@app.task
def task_4():
    print("Task of level 4")
Run Celery with one worker for each queue:
celery -A tasks worker --loglevel=debug -Q queueA --logfile=celery-A.log -n W1&
celery -A tasks worker --loglevel=debug -Q queueB --logfile=celery-B.log -n W2&
celery -A tasks worker --loglevel=debug -Q queueC --logfile=celery-C.log -n W3&
celery -A tasks worker --loglevel=debug -Q queueD --logfile=celery-D.log -n W4&
There is no need to get into complex routing for submitting tasks into different queues. Define your tasks as usual.
from celery import Celery

app = Celery('tasks', broker='pyamqp://guest@localhost//')

@app.task
def task_1():
    print("Task of level 1")

@app.task
def task_2():
    print("Task of level 2")
Now, when queuing the tasks, put each task in the proper queue. Here is an example of how to do it.
In [12]: from tasks import *
In [14]: result = task_1.apply_async(queue='queueA')
In [15]: result = task_2.apply_async(queue='queueB')
This will put task_1 in the queue named queueA and task_2 in queueB.
Now you can just start your workers to consume them.
celery -A tasks worker --loglevel=debug -Q queueA --logfile=celery-A.log -n W1&
celery -A tasks worker --loglevel=debug -Q queueB --logfile=celery-B.log -n W2&
Note: task and message are used interchangeably in this answer. A task is basically a payload that the producer sends to RabbitMQ.
You can either follow the approach suggested by Chillar, or you can define the task_routes configuration to route messages to the appropriate queue. That way you don't need to specify a queue name every time you call apply_async.
Example: Route task1 to QueueA and route task2 to QueueB
app = Celery('my_app')
app.conf.update(
    task_routes={
        'task1': {'queue': 'QueueA'},
        'task2': {'queue': 'QueueB'}
    }
)
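To make the effect concrete, here is a small self-contained sketch (the module and task names are hypothetical; only the pattern matters): once task_routes is configured, callers no longer pass a queue explicitly.

from celery import Celery

app = Celery('my_app', broker='pyamqp://guest@localhost//')

# Route by fully qualified task name; apply_async/delay then need no queue argument.
app.conf.task_routes = {
    'my_app.task1': {'queue': 'QueueA'},
    'my_app.task2': {'queue': 'QueueB'},
}

@app.task(name='my_app.task1')
def task1():
    return 'ran task1'

@app.task(name='my_app.task2')
def task2():
    return 'ran task2'

if __name__ == '__main__':
    task1.delay()    # ends up in QueueA via the route
    task2.delay()    # ends up in QueueB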
Sending a task to multiple queues is a bit tricky. You will have to declare an exchange and then route your task with the appropriate routing_key. You can get more information about exchange types here. Let's go with direct for the purpose of illustration.
Create Exchange
from kombu import Exchange, Queue, binding
exchange_for_queueA_and_B = Exchange('exchange_for_queueA_and_B', type='direct')
Create bindings on Queues to that exchange
app.conf.update(
    task_queues=(
        Queue('QueueA', [
            binding(exchange_for_queueA_and_B, routing_key='queue_a_and_b')
        ]),
        Queue('QueueB', [
            binding(exchange_for_queueA_and_B, routing_key='queue_a_and_b')
        ])
    )
)
Define the task_route to send task1 to the exchange
app.conf.update(
    task_routes={
        'task1': {'exchange': 'exchange_for_queueA_and_B', 'routing_key': 'queue_a_and_b'}
    }
)
You can also pass the exchange and routing_key options to apply_async, as Chillar suggested in the answer above.
After that, you can define your workers on the same machine or on different machines to consume from those queues.
celery -A my_app worker -n consume_from_QueueA_and_QueueB -Q QueueA,QueueB
celery -A my_app worker -n consume_from_QueueA_only -Q QueueA

Starting celery worker from multiprocessing

I'm new to celery. All of the examples I've seen start a celery worker from the command line. e.g:
$ celery -A proj worker -l info
I'm starting a project on elastic beanstalk and thought it would be nice to have the worker be a subprocess of my web app. I tried using multiprocessing and it seems to work. I'm wondering if this is a good idea, or if there might be some disadvantages.
import celery
import multiprocessing

class WorkerProcess(multiprocessing.Process):
    def __init__(self):
        super().__init__(name='celery_worker_process')

    def run(self):
        argv = [
            'worker',
            '--loglevel=WARNING',
            '--hostname=local',
        ]
        app.worker_main(argv)

def start_celery():
    global worker_process
    worker_process = WorkerProcess()
    worker_process.start()

def stop_celery():
    global worker_process
    if worker_process:
        worker_process.terminate()
        worker_process = None

worker_name = 'celery@local'
worker_process = None

app = celery.Celery()
app.config_from_object('celery_app.celeryconfig')
Seems like a good option, definitely not the only option but a good one :)
One thing you might want to look into (you might already be doing this), is linking the autoscaling to the size of your Celery queue. So you only scale up when the queue is growing.
Effectively, Celery does something similar internally of course, so there's not a lot of difference. The only snag I can think of is the handling of external resources (database connections, for example); that might be a problem, but it is completely dependent on what you are doing with Celery.
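As a rough sketch of how queue size could feed an autoscaling decision (this assumes the RabbitMQ management plugin is enabled on its default port 15672 with default guest credentials; adjust the vhost and queue name to your setup):

import requests

def queue_depth(queue_name, host='localhost', user='guest', password='guest', vhost='%2F'):
    # The management API reports ready + unacknowledged messages in the 'messages' field.
    url = f'http://{host}:15672/api/queues/{vhost}/{queue_name}'
    response = requests.get(url, auth=(user, password))
    response.raise_for_status()
    return response.json().get('messages', 0)

# Example: only add instances when the backlog grows past a threshold.
if queue_depth('celery') > 100:
    print('queue is backing up, consider scaling up')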
If anyone is interested, I did get this working on Elastic Beanstalk with a pre-configured AMI server running Python 3.4. I had a lot of problems with the Docker based server running Debian Jessie. Something to do with port remapping, maybe. Docker is kind of a black box, and I've found it very hard to work with and debug. Fortunately, the good folks at AWS just added a non-docker Python 3.4 option on April 8, 2015.
I did a lot of searching to get this deployed and working. I saw lots of questions without answers. So here's my very simple deployed python 3.4/flask/celery process.
Celery you can just pip install. You'll need to install rabbitmq from a configuration file with a config command or container_command. I'm using a script in my uploaded project zip, so a container_command is necessary to use the script (regular eb config command takes place before the project is installed).
[yourapproot]/.ebextensions/05_install_rabbitmq.config:
container_commands:
01RunScript:
command: bash ./init_scripts/app_setup.sh
[yourapproot]/init_scripts/app_setup.sh:
#!/usr/bin/env bash
# Download and install Erlang
yum install erlang
# Download the latest RabbitMQ package using wget:
wget http://www.rabbitmq.com/releases/rabbitmq-server/v3.5.1/rabbitmq-server-3.5.1-1.noarch.rpm
# Install rabbit
rpm --import http://www.rabbitmq.com/rabbitmq-signing-key-public.asc
yum -y install rabbitmq-server-3.5.1-1.noarch.rpm
# Start server
/sbin/service rabbitmq-server start
I'm building a Flask app, so I start up the workers before the first request:
@app.before_first_request
def before_first_request():
    task_mgr.start_celery()
The task_mgr module creates the celery app object (which I call celery, since the Flask app object is app). The -Ofair option is pretty key here for a simple task manager; there's all kinds of strange behavior with task prefetch. Should this maybe be the default?
task_mgr/task_mgr.py:
import celery as celery_module
import multiprocessing

class WorkerProcess(multiprocessing.Process):
    def __init__(self):
        super().__init__(name='celery_worker_process')

    def run(self):
        argv = [
            'worker',
            '--loglevel=WARNING',
            '--hostname=local',
            '-Ofair',
        ]
        celery.worker_main(argv)

def start_celery():
    global worker_process
    multiprocessing.set_start_method('fork')  # 'spawn' seems to work also
    worker_process = WorkerProcess()
    worker_process.start()

def stop_celery():
    global worker_process
    if worker_process:
        worker_process.terminate()
        worker_process = None

worker_name = 'celery@local'
worker_process = None

celery = celery_module.Celery()
celery.config_from_object('task_mgr.celery_config')
My config is pretty simple so far:
task_mgr/celery_config.py:
BROKER_URL = 'amqp://'
CELERY_RESULT_BACKEND = 'amqp://'
CELERY_ACCEPT_CONTENT = ['json']
CELERY_TASK_SERIALIZER = 'json' # 'pickle' warning: can't use datetime in json
CELERY_RESULT_SERIALIZER = 'json' # 'pickle' warning: can't use datetime in json
CELERY_TASK_RESULT_EXPIRES = 18000 # Results hang around for 5 hours
CELERYD_CONCURRENCY = 4
Then you can put tasks wherever you need them:
from task_mgr.task_mgr import celery
import time
@celery.task(bind=True)
def error_task(self):
    self.update_state(state='RUNNING')
    time.sleep(10)
    raise KeyError('im an error')

@celery.task(bind=True)
def long_task(self):
    self.update_state(state='RUNNING')
    time.sleep(20)
    return 'long task finished'

@celery.task(bind=True)
def task_with_status(self, wait):
    self.update_state(state='RUNNING')
    for i in range(5):
        time.sleep(wait)
        self.update_state(
            state='PROGRESS',
            meta={
                'current': i + 1,
                'total': 5,
                'status': 'progress',
                'host': self.request.hostname,
            }
        )
        time.sleep(wait)
    return 'finished with wait = ' + str(wait)
I also keep a task queue to hold the async results so I can monitor the tasks:
task_queue = []

def queue_task(task, *args):
    async_result = task.apply_async(args)
    task_queue.append(
        {
            'task_name': task.__name__,
            'task_args': args,
            'async_result': async_result
        }
    )
    return async_result

def get_tasks_info():
    tasks = []
    for task in task_queue:
        task_name = task['task_name']
        task_args = task['task_args']
        async_result = task['async_result']
        task_id = async_result.id
        task_state = async_result.state
        task_result_info = async_result.info
        task_result = async_result.result
        tasks.append(
            {
                'task_name': task_name,
                'task_args': task_args,
                'task_id': task_id,
                'task_state': task_state,
                'task_result.info': task_result_info,
                'task_result': task_result,
            }
        )
    return tasks
And of course, start the tasks where you need to:
from webapp.app import app
from flask import url_for, render_template, redirect
from webapp import tasks
from task_mgr import task_mgr

@app.route('/start_all_tasks')
def start_all_tasks():
    task_mgr.queue_task(tasks.long_task)
    task_mgr.queue_task(tasks.error_task)
    for i in range(1, 9):
        task_mgr.queue_task(tasks.task_with_status, i * 2)
    return redirect(url_for('task_status'))

@app.route('/task_status')
def task_status():
    current_tasks = task_mgr.get_tasks_info()
    return render_template(
        'parse/task_status.html',
        tasks=current_tasks
    )
And that's about it. Let me know if you need any help, though my celery knowledge is still fairly limited.

Django/Celery multiple queues on localhost - routing not working

I followed the Celery docs to define two queues on my dev machine.
My celery settings:
CELERY_ALWAYS_EAGER = True
CELERY_TASK_RESULT_EXPIRES = 60 # 1 mins
CELERYD_CONCURRENCY = 2
CELERYD_MAX_TASKS_PER_CHILD = 4
CELERYD_PREFETCH_MULTIPLIER = 1
CELERY_CREATE_MISSING_QUEUES = True
CELERY_QUEUES = (
    Queue('default', Exchange('default'), routing_key='default'),
    Queue('feeds', Exchange('feeds'), routing_key='arena.social.tasks.#'),
)
CELERY_ROUTES = {
    'arena.social.tasks.Update': {
        'queue': 'fs_feeds',
    },
}
I opened two terminal windows in the virtualenv of my project and ran the following commands:
terminal_1$ celery -A arena worker -Q default -B -l debug --purge -n deafult_worker
terminal_2$ celery -A arena worker -Q feeds -B -l debug --purge -n feeds_worker
What I get is that all tasks are being processed by both workers.
My goal is to have one queue process only the one task defined in CELERY_ROUTES and the default queue process all other tasks.
I also followed this SO question; rabbitmqctl list_queues returns celery 0, and running rabbitmqctl list_bindings returns exchange celery queue celery [] twice. Restarting the RabbitMQ server didn't change anything.
OK, so I figured it out. Following is my whole setup, the settings, and how to run Celery, for those who might be wondering about the same thing as my question.
Settings
CELERY_TIMEZONE = TIME_ZONE
CELERY_ACCEPT_CONTENT = ['json', 'pickle']
CELERYD_CONCURRENCY = 2
CELERYD_MAX_TASKS_PER_CHILD = 4
CELERYD_PREFETCH_MULTIPLIER = 1
# celery queues setup
CELERY_DEFAULT_QUEUE = 'default'
CELERY_DEFAULT_EXCHANGE_TYPE = 'topic'
CELERY_DEFAULT_ROUTING_KEY = 'default'
CELERY_QUEUES = (
    Queue('default', Exchange('default'), routing_key='default'),
    Queue('feeds', Exchange('feeds'), routing_key='long_tasks'),
)
CELERY_ROUTES = {
    'arena.social.tasks.Update': {
        'queue': 'feeds',
        'routing_key': 'long_tasks',
    },
}
How to run celery?
terminal - tab 1:
celery -A proj worker -Q default -l debug -n default_worker
This will start the first worker, which consumes tasks from the default queue. NOTE: -n default_worker is not a must for the first worker, but it is a must if you have any other Celery instances up and running. Setting -n worker_name is the same as --hostname=default@%h.
terminal - tab 2:
celery -A proj worker -Q feeds -l debug -n feeds_worker
This will start the second worker, which consumes tasks from the feeds queue. Note the -n feeds_worker; if you are running with -l debug (log level = debug), you will see that both workers are syncing between them.
terminal - tab 3:
celery -A proj beat -l debug
This will start beat, executing tasks according to the schedule in your CELERYBEAT_SCHEDULE.
I didn't have to change the task, or the CELERYBEAT_SCHEDULE.
For example, this is how my CELERYBEAT_SCHEDULE looks for the task that should go to the feeds queue:
CELERYBEAT_SCHEDULE = {
    ...
    'update_feeds': {
        'task': 'arena.social.tasks.Update',
        'schedule': crontab(minute='*/6'),
    },
    ...
}
As you can see, there is no need to add 'options': {'routing_key': 'long_tasks'} or to specify which queue the task should go to. Also, if you were wondering why Update is capitalized, it's because it's a custom task, which is defined as a subclass of celery.Task.
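For illustration, a minimal sketch of such a class-based task (the module path mirrors the one above; the body is an assumption):

from celery import Celery, Task

app = Celery('arena')

class Update(Task):
    # Explicit name so routing by 'arena.social.tasks.Update' keeps working.
    name = 'arena.social.tasks.Update'

    def run(self, *args, **kwargs):
        # refresh the feeds here
        return 'feeds updated'

# Since Celery 4, Task subclasses are no longer auto-registered.
app.register_task(Update())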
Update Celery 5.0+
Celery made a couple of changes since version 5; here is an updated setup for routing tasks.
How to create the queues?
Celery can create the queues automatically. This works perfectly for simple cases, where the default Celery routing values are fine.
Set task_create_missing_queues=True or, if you're using Django settings and namespacing all Celery configs under the CELERY_ prefix, CELERY_TASK_CREATE_MISSING_QUEUES=True. Note that it is on by default.
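As a small sketch of the two equivalent spellings (the broker URL here is a placeholder):

from celery import Celery

app = Celery('proj', broker='pyamqp://guest@localhost//')
app.conf.task_create_missing_queues = True   # directly on the app (already the default)

# Or, in a Django settings.py with the CELERY_ namespace:
# CELERY_TASK_CREATE_MISSING_QUEUES = True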
Automatic scheduled task routing
After configuring the Celery app:
celery_app.conf.beat_schedule = {
    "some_scheduled_task": {
        "task": "module.path.some_task",
        "schedule": crontab(minute="*/10"),
        "options": {"queue": "queue1"}
    }
}
Automatic task routing
The Celery app still has to be configured first, and then:
app.conf.task_routes = {
    "module.path.task2": {"queue": "queue2"},
}
Manual routing of tasks
In case you want to route tasks dynamically, specify the queue when sending the task:
from module import task

def do_work():
    # do some work and launch the task
    task.apply_async(args=(arg1, arg2), queue="queue3")
More details on routing can be found here:
https://docs.celeryproject.org/en/stable/userguide/routing.html
And regarding calling tasks here:
https://docs.celeryproject.org/en/stable/userguide/calling.html
In addition to the accepted answer, if anyone comes here and still wonders why their settings aren't working (as I did just moments ago), here's why: the Celery documentation doesn't list the setting names properly.
For Celery 5.0.5, the settings CELERY_DEFAULT_QUEUE, CELERY_QUEUES, and CELERY_ROUTES should be named CELERY_TASK_DEFAULT_QUEUE, CELERY_TASK_QUEUES, and CELERY_TASK_ROUTES instead. These are the settings I've tested, but my guess is that the same rule applies to the exchange and routing key settings as well.
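A sketch of a Django settings.py fragment with the corrected names, reusing the queues and route from the accepted answer above:

from kombu import Exchange, Queue

CELERY_TASK_DEFAULT_QUEUE = 'default'
CELERY_TASK_QUEUES = (
    Queue('default', Exchange('default'), routing_key='default'),
    Queue('feeds', Exchange('feeds'), routing_key='long_tasks'),
)
CELERY_TASK_ROUTES = {
    'arena.social.tasks.Update': {'queue': 'feeds', 'routing_key': 'long_tasks'},
}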

Running multiple instances of celery on the same server

I want to run two instances of celery on the same machine. One is for an 'A' version of my application, the other is for the 'B' version.
I have two instances, which I start like this:
(env1)/home/me/firstapp$ celery -A app.tasks worker --config celeryconfig
(env2)/home/me/secondapp$ celery -A app.tasks worker -n Carrot --config celeryconfig
In tasks.py in each application, I create a celery instance like this:
celery = Celery('tasks', backend='amqp', broker='amqp://guest@127.0.0.1:5672//')

@celery.task
def run_a_task():
    do_stuff()
In env2's tasks.py, how can I specify that I want to use the second Celery instance from secondapp (named Carrot), rather than the first one from firstapp? I suspect I need to change something in the Celery constructor on the first line, but I don't know what to add.
I solved this by using a virtual host for celery.
Once the rabbitmq server is running I issue these commands:
rabbitmqctl add_user user password
rabbitmqctl add_vhost app2
rabbitmqctl set_permissions -p app2 user ".*" ".*" ".*"
Then I start celery with:
celery -A tasks worker --broker=amqp://user:password#localhost/app2
With my task, I initialize the celery object like this:
celery = Celery('tasks', backend='amqp', broker='amqp://user:password@localhost:5672/app2')
It looks like you're using AMQP, so I would recommend solving this using different exchanges.
I recommend reading this blog post about the AMQP structure: http://blogs.digitar.com/jjww/?s=rabbits&x=0&y=0
For Celery specific information, have a look here: http://docs.celeryproject.org/en/latest/userguide/routing.html
A short version of what you could do is to give each app a different default queue:
from kombu import Exchange, Queue
CELERY_DEFAULT_QUEUE = 'app1'
CELERY_QUEUES = (
    Queue('app1', Exchange('app1'), routing_key='app1'),
)
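The second app would then get its own default queue under the same scheme (a sketch; the names mirror the app2 vhost used above):

from kombu import Exchange, Queue

CELERY_DEFAULT_QUEUE = 'app2'
CELERY_QUEUES = (
    Queue('app2', Exchange('app2'), routing_key='app2'),
)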
