Start celery beat from a python program without using command line arguments - python

I have implemented a Python command which starts a Celery worker:
#click.command("tasks", help="run this command to start celery task queue")
def tasks():
"""
Runs the celery task queue
"""
from celery.bin import worker
try:
worker = worker.worker(app=app.config.get("task_app"))
worker.run(app="my_app.task_queue", loglevel="info", uid=os.environ["uid"])
except Exception as e:
raise e
I need to create a similar command which starts celery beat, and I tried the following approach:
#click.command("beat", help="run this command to start beat")
def beat():
try:
p = app.config.get("task_app") # gets the current
command = CeleryCommand()
lst = ["celery", "beat"]
command.execute_from_commandline(lst)
except Exception as e:
raise e
The --app parameter will not work with this approach. Is there a way to programmatically run the equivalent of celery -A proj beat without passing anything from the command line?

To start celery beat programmatically:
app.Beat(loglevel='debug').run()
Alternatively:
from celery.apps.beat import Beat
b = Beat(app=your_celery_app, loglevel='debug')
b.run()
For the available keyword arguments, see the celery.beat documentation.
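As an illustration, a minimal sketch of how the click command from the question could start beat directly; the app.config.get("task_app") lookup follows the question's worker command and is an assumption about where the Celery app instance is stored:
@click.command("beat", help="run this command to start celery beat")
def beat():
    # Hypothetical: the Celery app instance is kept in the app config,
    # as in the question's "tasks" command.
    celery_app = app.config.get("task_app")
    # Start the beat scheduler in-process; no command-line parsing involved.
    celery_app.Beat(loglevel="info").run()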

Related

Celery task not detecting error from subprocess

I have the following task, which runs a Python script using Popen (I already tried check_output and call with the same result):
@app.task
def TaskName(filename):
    try:
        proc = subprocess.Popen(['python', filename], stdout=subprocess.PIPE,
                                stderr=subprocess.PIPE)
        taskoutp, taskerror = proc.communicate()
        print('Output: ', taskoutp)
        print('Error: ', taskerror)
    except:
        raise Exception('Task error: ', taskerror)
If I generate an error in the subprocess, it is displayed in the worker cmd window as if it were normal output/print, and the task remains indefinitely with the status 'STARTED', even when I manually close the window opened by the Popen subprocess.
What can I do so the tasks not only prints the error but actually stops and changes its status to FAILURE?
If you want to store the results of your task, you could use the result_backend or CELERY_RESULT_BACKEND setting, depending on the version of Celery you're using.
This setting can be used to acknowledge a task only after it is processed:
task_acks_late
Default: Disabled.
Late ack means the task messages will be acknowledged after the task has been executed, not just before (the default behavior).
The complete list of configuration options can be found here: https://docs.celeryproject.org/en/stable/userguide/configuration.html
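As a minimal sketch, assuming a recent Celery version with the lowercase setting names (the Redis URLs below are placeholders, not part of the original answer):
from celery import Celery

app = Celery('my_app', broker='redis://localhost:6379/0')

# Store task results so state changes (STARTED, FAILURE, ...) are recorded.
app.conf.result_backend = 'redis://localhost:6379/1'

# Acknowledge messages only after the task has finished executing.
app.conf.task_acks_late = True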

Celery worker doesn't launch from Python

We have Python 3.6.1 set up with Django, Celery, and RabbitMQ on Ubuntu 14.04. Right now, I'm using the Django debug server (for dev; Apache isn't working). My current problem is that the celery workers get launched from Python and immediately die -- the processes show up as defunct. If I use the same command in a terminal window, the worker gets created and picks up a task if there is one waiting in the queue.
Here's the command:
celery worker --app=myapp --loglevel=info --concurrency=1 --maxtasksperchild=20 -n celery_1 -Q celery
The same functionality occurs for whichever queues are being set up.
In the terminal, we see the output myapp.settings - INFO - Loading... followed by output that describes the queue and lists the tasks. When running from Python, the last thing we see is the Loading...
In the code, we do have a check to be sure we are not running the celery command as root.
These are the Celery settings from our settings.py file:
CELERY_ACCEPT_CONTENT = ['json','pickle']
CELERY_TASK_SERIALIZER = 'pickle'
CELERY_RESULT_SERIALIZER = 'json'
CELERY_IMPORTS = ('api.tasks',)
CELERYD_PREFETCH_MULTIPLIER = 1
CELERYD_CONCURRENCY = 1
BROKER_POOL_LIMIT = 120 # Note: I tried this set to None but it didn't seem to make any difference
CELERYD_LOG_COLOR = False
CELERY_LOG_FORMAT = '%(asctime)s - %(processName)s - %(levelname)s - %(message)s'
CELERYD_HIJACK_ROOT_LOGGER = False
STATIC_URL = '/static/'
STATIC_ROOT = os.path.join(psconf.BASE_DIR, 'myapp_static/')
BROKER_URL = psconf.MQ_URI
CELERY_RESULT_BACKEND = 'rpc'
CELERY_RESULT_PERSISTENT = True
CELERY_ROUTES = {}
for entry in os.scandir(psconf.PLUGIN_PATH):
    if not entry.is_dir() or entry.name == '__pycache__':
        continue
    plugin_dir = entry.name
    settings_file = f'{plugin_dir}.settings'
    try:
        plugin_tasks = importlib.import_module(settings_file)
        queue_name = plugin_tasks.QUEUENAME
    except ModuleNotFoundError as e:
        logging.warning(e)
    except AttributeError:
        logging.debug(f'The plugin {plugin_dir} will use the general worker queue.')
    else:
        CELERY_ROUTES[f'{plugin_dir}.tasks.run'] = {'queue': queue_name}
        logging.debug(f'The plugin {plugin_dir} will use the {queue_name} queue.')
Here is the part that kicks off the worker:
class CeleryWorker(BackgroundProcess):
    def __init__(self, n, q):
        self.name = n
        self.worker_queue = q
        cmd = f'celery worker --app=myapp --loglevel=info --concurrency=1 --maxtasksperchild=20 -n {self.name} -Q {self.worker_queue}'
        super().__init__(cmd, cwd=str(psconf.BASE_DIR))

class BackgroundProcess(subprocess.Popen):
    def __init__(self, args, **kwargs):
        super().__init__(args, shell=True, stdout=subprocess.PIPE, stderr=subprocess.PIPE, universal_newlines=True, **kwargs)
Any suggestions as to how to get this working from Python are appreciated. I'm new to Rabbitmq/Celery.
Just in case someone else needs this... It turns out the problem was that the shell script which kicks off this whole app is now being launched with sudo, and even though I thought I was checking so we wouldn't launch the celery worker with sudo, I'd missed something and we were trying to launch as root. That is a no-no. I'm now explicitly using 'sudo -u ' and the workers are starting properly.
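For reference, a minimal sketch of the kind of root check described above, run before spawning the worker subprocess; the drop to a named user via sudo -u is an assumption about how the launch script is structured:
import os
import subprocess

def launch_worker(cmd, run_as_user=None):
    # Never start the Celery worker as root.
    if os.geteuid() == 0:
        if run_as_user is None:
            raise RuntimeError('Refusing to start a Celery worker as root.')
        # Hypothetical: re-run the command as an unprivileged user instead.
        cmd = f'sudo -u {run_as_user} {cmd}'
    return subprocess.Popen(cmd, shell=True)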

Why does this Celery "hello world" loop forever?

Consider the code:
from celery import Celery, group
from time import time
app = Celery('tasks', broker='redis:///0', backend='redis:///1', task_ignore_result=False)
@app.task
def test_task(i):
    print('hi')
    return i

x = test_task.delay(3)
print(x.get())
I run it by calling python script.py, but I'm getting no results. Why?
You don't get any results because you've asked your celery app to execute a task without starting a worker process to do the work of executing it. The process you did start is blocked on the call to get().
First things first, when using celery it is critical that you do not have tasks executed when a module is imported, so let's put your task execution inside a main() function, and put it in a file called celery_test.py.
from celery import Celery, group
from time import time
app = Celery('tasks', broker='redis:///0', backend='redis:///1', task_ignore_result=False)
@app.task
def test_task(i):
    print('hi')
    return i

def main():
    x = test_task.delay(3)
    print(x.get())

if __name__ == '__main__':
    main()
Now let's start a pool of celery workers to execute tasks for this app. You can do this by opening a new terminal and executing the following.
celery worker -A celery_test --loglevel=INFO
The -A flag refers to the module where celery will find an application to add workers to. You should see some output in the terminal to indicate that the celery worker is running and ready for tasks to process.
Now, try executing your script again with python celery_test.py. You should see hi show up in the worker's log output, and the value 3 returned in the script that called get().
Be warned, if you've been playing with celery without running a worker, it probably has lots of tasks waiting in your broker to execute. The first time you start up the worker pool, you'll see them all execute in parallel until the broker runs out of tasks.
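If that happens, one way to clear out the backlog (not part of the original answer, just the standard purge command) is:
celery -A celery_test purge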

Can a celery worker do more than one thing at a time?

If I have a celery task such as the following:
@celery.task(name='tasks.webrequest')
def webrequest(*args):
    try:
        webrequest = requests.get('{0}'.format(args[0]), auth=(args[1], args[2]), verify=False, timeout=240)
    except Exception as e:
        print e
        webrequest = 'cant talk to server'
    return webrequest
and a celery worker with only one core, so only one worker thread. Is there a way to have that worker perform two or more of these tasks at once, and how?
Currently I am executing the worker like so:
celery -A app.celery worker -l DEBUG
When I call it with the concurrency flag (thanks Ale), it allows me to have more threads than CPUs.
celery -A app.celery worker -c 30 -l DEBUG
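If you prefer to keep that out of the command line, a minimal sketch of the equivalent app setting; the app definition below is a hypothetical stand-in for the app.celery module from the question:
from celery import Celery

# Hypothetical stand-in for the app defined in app/celery.py.
celery = Celery('app')

# Equivalent to passing -c 30 on the command line.
celery.conf.worker_concurrency = 30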

Detect whether Celery is Available/Running

I'm using Celery to manage asynchronous tasks. Occasionally, however, the celery process goes down which causes none of the tasks to get executed. I would like to be able to check the status of celery and make sure everything is working fine, and if I detect any problems display an error message to the user. From the Celery Worker documentation it looks like I might be able to use ping or inspect for this, but ping feels hacky and it's not clear exactly how inspect is meant to be used (if inspect().registered() is empty?).
Any guidance on this would be appreciated. Basically what I'm looking for is a method like so:
def celery_is_alive():
    from celery.task.control import inspect
    return bool(inspect().registered())  # is this right??
EDIT: It doesn't even look like registered() is available on celery 2.3.3 (even though the 2.1 docs list it). Maybe ping is the right answer.
EDIT: Ping also doesn't appear to do what I thought it would do, so still not sure the answer here.
Here's the code I've been using. celery.task.control.Inspect.stats() returns a dict containing lots of details about the currently available workers, None if there are no workers running, or raises an IOError if it can't connect to the message broker. I'm using RabbitMQ - it's possible that other messaging systems might behave slightly differently. This worked in Celery 2.3.x and 2.4.x; I'm not sure how far back it goes.
def get_celery_worker_status():
    ERROR_KEY = "ERROR"
    try:
        from celery.task.control import inspect
        insp = inspect()
        d = insp.stats()
        if not d:
            d = { ERROR_KEY: 'No running Celery workers were found.' }
    except IOError as e:
        from errno import errorcode
        msg = "Error connecting to the backend: " + str(e)
        if len(e.args) > 0 and errorcode.get(e.args[0]) == 'ECONNREFUSED':
            msg += ' Check that the RabbitMQ server is running.'
        d = { ERROR_KEY: msg }
    except ImportError as e:
        d = { ERROR_KEY: str(e) }
    return d
From the documentation of celery 4.2:
from your_celery_app import app

def get_celery_worker_status():
    i = app.control.inspect()
    availability = i.ping()
    stats = i.stats()
    registered_tasks = i.registered()
    active_tasks = i.active()
    scheduled_tasks = i.scheduled()
    result = {
        'availability': availability,
        'stats': stats,
        'registered_tasks': registered_tasks,
        'active_tasks': active_tasks,
        'scheduled_tasks': scheduled_tasks,
    }
    return result
Of course, you could/should improve the code with error handling...
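For example, a minimal sketch (an assumption, not part of the answer above) of wrapping those inspect calls so that a broker outage is reported rather than raised:
from your_celery_app import app  # same hypothetical import as above

def get_celery_worker_status_safe():
    try:
        i = app.control.inspect()
        # Only a couple of the calls shown above, for brevity.
        return {'availability': i.ping(), 'stats': i.stats()}
    except Exception as exc:
        # Broker unreachable, timeouts, etc.
        return {'error': str(exc)}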
To check the same using the command line, in case celery is running as a daemon:
Activate the virtualenv and go to the dir where the 'app' is.
Now run: celery -A [app_name] status
It will show whether celery is up or not, plus the number of nodes online.
Source:
http://michal.karzynski.pl/blog/2014/05/18/setting-up-an-asynchronous-task-queue-for-django-using-celery-redis/
The following worked for me:
import socket
from kombu import Connection
celery_broker_url = "amqp://localhost"
try:
    conn = Connection(celery_broker_url)
    conn.ensure_connection(max_retries=3)
except socket.error:
    raise RuntimeError("Failed to connect to RabbitMQ instance at {}".format(celery_broker_url))
One method to test if any worker is responding is to send out a 'ping' broadcast and return with a successful result on the first response.
from .celery import app # the celery 'app' created in your project
def is_celery_working():
    result = app.control.broadcast('ping', reply=True, limit=1)
    return bool(result)  # True if at least one result
This broadcasts a 'ping' and will wait up to one second for responses. As soon as the first response comes in, it will return a result. If you want a False result faster, you can add a timeout argument to reduce how long it waits before giving up.
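For instance, a sketch with a shorter wait; the 0.5 second default is just an illustrative choice:
def is_celery_working(timeout=0.5):
    # Wait at most `timeout` seconds for the first worker to answer the ping.
    result = app.control.broadcast('ping', reply=True, limit=1, timeout=timeout)
    return bool(result)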
I found an elegant solution:
from .celery import app
try:
    app.broker_connection().ensure_connection(max_retries=3)
except Exception as ex:
    raise RuntimeError("Failed to connect to celery broker, {}".format(str(ex)))
You can use the ping method to check whether any worker (or a specific worker) is alive or not: https://docs.celeryproject.org/en/latest/_modules/celery/app/control.html#Control.ping
celery_app.control.ping()
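A short sketch of interpreting the return value (celery_app is your Celery application instance; the worker name in the comment is illustrative):
replies = celery_app.control.ping(timeout=1.0)
# `replies` is a list of {worker_name: {'ok': 'pong'}} mappings, e.g.
# [{'celery@host1': {'ok': 'pong'}}]. An empty list means no worker answered.
workers_alive = bool(replies)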
You can test in your terminal by running the following command.
celery -A proj_name worker -l INFO
You can then review the worker output each time your celery tasks run.
The script below worked for me.
# Import the celery app from the project
from application_package import app as celery_app

def get_celery_worker_status():
    insp = celery_app.control.inspect()
    nodes = insp.stats()
    if not nodes:
        raise Exception("celery is not running.")
    logger.error("celery workers are: {}".format(nodes))
    return nodes
Run celery status to get the status.
When celery is running,
(venv) ubuntu@server1:~/project-dir$ celery status
-> celery@server1: OK
1 node online.
When no celery worker is running, you get the below information displayed in the terminal.
(venv) ubuntu@server1:~/project-dir$ celery status
Error: No nodes replied within time constraint
