I am using Celery in Python. I have the following task:
import requests
from time import sleep

@app.task
def data():
    while 1:
        response = requests.get(url, timeout=300).json()
        db.collectionName.insert_many(response)
        sleep(10000)
This task gets data from a web server and saves it to MongoDB in a loop.
I have called it with the following code:
data.delay()
It works fine, but I want to kill it programmatically. I tried data.AsyncResult(task_id).revoke(),
but it does not work.
How can I kill a running task in Celery?
Why are you doing data.AsyncResult? You can try doing something like this.
from celery.task.control import revoke
revoke(task_id, terminate=True)
Also, further details can be found here:
http://docs.celeryproject.org/en/latest/userguide/workers.html#revoke-revoking-tasks
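For example, a minimal sketch, assuming app is your Celery application instance and that you keep the id returned by delay():

result = data.delay()
task_id = result.id                           # store this somewhere you can read it back from

# later, from any process that can reach the broker:
app.control.revoke(task_id, terminate=True)   # plain revoke only skips tasks that have not
                                              # started yet; terminate=True also signals the
                                              # worker process currently running the task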
You can try:
res = data.delay()
res.revoke()
but I don't understand the point of using Celery in your scenario. How does Celery help you if you're doing a while 1 loop?
Consider breaking it down into a small task that performs a single HTTP request, and add celery beat to call it every 10 seconds.
You can stop beat whenever you like.
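For illustration, a rough sketch of that approach (Celery 4+ style configuration; fetch_once, the schedule name, and the placeholder URL/database are mine, not from the question):

import requests
from celery import Celery
from pymongo import MongoClient

app = Celery('myapp', broker='amqp://')        # broker URL is a placeholder

url = 'https://example.com/data'               # placeholder for the real endpoint
db = MongoClient()['mydb']                     # placeholder for the real MongoDB database

@app.task
def fetch_once():
    # one request per run instead of an infinite loop inside a task
    response = requests.get(url, timeout=300).json()
    db.collectionName.insert_many(response)

app.conf.beat_schedule = {                     # called CELERYBEAT_SCHEDULE on Celery 3.x
    'fetch-every-10-seconds': {
        'task': 'tasks.fetch_once',            # adjust to the module the task lives in
        'schedule': 10.0,                      # seconds
    },
}

Run it with a worker plus the scheduler (for example celery -A tasks worker -B), and stop beat when you want the polling to end.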
You can use the following code:
revoke(task_id, terminate=True)
Related
I'm building a django app where I use a camera to capture images, analyze them, store metadata and results of the analysis in a database, and finally present the data to users.
I'm considering using Celery to handle the background process of capturing images and then processing them:
from time import sleep
from celery import Celery

app = Celery('myapp')

@app.task
def capture_and_process_images(camera):
    while True:
        image = camera.get_image()
        process_image(image)
        sleep(5000)   # sleep() takes seconds

@app.task
def process_image(image):
    # do some calculations
    # django orm calls
    # etc...
    pass
The first task will run perpetually, while the second should take ~20 seconds, so there will be multiple images being processed at once.
I haven't found any examples online of using Celery in this way, so I'm not sure if this is bad practice or not.
Can/should Celery be used to handle perpetually running tasks?
Thank you.
Running perpetual tasks in Celery is done in practice. Take a look at daemonization, which essentially runs a permanent task without user interaction, so I wouldn't say there is anything wrong with running it permanently in your case.
Having a Celery task run infinitely does not seem like a good idea to me.
If you are going to capture images at set intervals, I would suggest using a cron-like script that gets an image every 5 seconds and launches a Celery task to process it.
Note also that it is a best practice to avoid synchronous subtasks in Celery; see the docs for more details.
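A rough sketch of what that restructuring might look like, using celery beat as the "cron-like script" (the helper save_frame_to_disk, the module paths, and the broker URL are placeholders of mine):

from celery import Celery

app = Celery('myapp', broker='amqp://')        # broker URL is a placeholder

def save_frame_to_disk():
    # placeholder: grab a frame from your camera, write it to disk,
    # and return the file path
    return '/tmp/frame.jpg'

@app.task
def process_image(image_path):
    pass  # the ~20 s analysis and Django ORM work from the question

@app.task
def capture_image():
    # keep this task tiny: grab one frame and hand it off asynchronously
    # (camera/image objects usually aren't serializable, so pass a path or bytes)
    image_path = save_frame_to_disk()
    process_image.delay(image_path)            # asynchronous hand-off, not a synchronous call

app.conf.beat_schedule = {                     # called CELERYBEAT_SCHEDULE on Celery 3.x
    'capture-every-5-seconds': {
        'task': 'myapp.tasks.capture_image',   # adjust to your module layout
        'schedule': 5.0,                       # seconds
    },
}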
Do you think it is possible to use asyncio to run a task every n seconds in Django, so that the main process isn't blocked?
Something, for example, that would print to the console every 5 minutes, like:
import asyncio
from random import randint

async def do_stuff(something, howmany):
    for i in range(howmany):
        print('We are doing {}'.format(something))
        await asyncio.sleep(randint(0, 5))

if __name__ == '__main__':
    loop = asyncio.get_event_loop()
    work = [
        asyncio.ensure_future(do_stuff('something', 5)),
    ]
    loop.run_until_complete(asyncio.gather(*work))
It seems that Django stops working while the loop is running. Even if this can be made to work in development, how would it behave when the site goes live on something like Apache or Gunicorn?
While it would be possible to achieve this with a lot of hard work, it is much simpler to use the time-honoured practice of using a cron job. See this: Using django for CLI tool
Nowadays a more popular approach (among Django devs, at any rate) is to use Celery, specifically celery beat:
celery beat is a scheduler; It kicks off tasks at regular intervals,
that are then executed by available worker nodes in the cluster.
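For illustration, a minimal celery beat setup along those lines might look like this (Celery 4+ style; the module paths and broker URL are assumptions):

from celery import Celery

app = Celery('proj', broker='amqp://')         # broker URL is a placeholder

@app.task
def do_stuff():
    print('We are doing something')

app.conf.beat_schedule = {                     # called CELERYBEAT_SCHEDULE on Celery 3.x
    'do-stuff-every-5-minutes': {
        'task': 'proj.do_stuff',               # dotted path depends on where the task is defined
        'schedule': 300.0,                     # seconds
    },
}

Start a worker together with the scheduler (for example celery -A proj worker -B, or a separate celery -A proj beat process), and Django itself is never blocked; the work happens in the worker.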
I need an explanation for scheduled tasks.
I need to run a task automatically at the end of every day, like cron.
I tried the schedule package in my project:
import pprint
import schedule
import time

def job():
    pprint.pprint("I'm working...")

schedule.every(10).minutes.do(job)

while True:
    schedule.run_pending()
    time.sleep(1)
When I add the above code to my project, the site keeps loading continuously.
Question: I need to create a task that runs automatically in the background, without user knowledge and without running any command. Is that possible?
I am new to Python and Django.
Please suggest any idea for this task.
If it is not overkill, I recommend Celery.
It has "Celerybeat", which is like "cron".
Actually, I think this is exactly what you need.
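For example, a rough beat entry that fires at the end of every day (the task path is a placeholder, and app is your Celery instance):

from celery.schedules import crontab

app.conf.beat_schedule = {                         # called CELERYBEAT_SCHEDULE on Celery 3.x
    'end-of-day-job': {
        'task': 'myapp.tasks.daily_job',           # placeholder dotted path to your task
        'schedule': crontab(hour=23, minute=59),   # every day at 23:59
    },
}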
Usually you create a management command (https://docs.djangoproject.com/en/dev/howto/custom-management-commands/) and run it from a cron job.
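A rough sketch of that approach, with placeholder names and paths:

# myapp/management/commands/daily_job.py  (placeholder path)
from django.core.management.base import BaseCommand

class Command(BaseCommand):
    help = 'End-of-day processing'

    def handle(self, *args, **options):
        # put the daily work here
        self.stdout.write('done')

A crontab entry like the following (paths are placeholders) then runs it at 23:59 every day:

59 23 * * * /path/to/venv/bin/python /path/to/project/manage.py daily_job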
I have a Celery project with a RabbitMQ backend that relies heavily on inspecting scheduled tasks. I found that the following code returns nothing most of the time (even though there are, of course, scheduled tasks):
i = app.control.inspect()
scheduled = i.scheduled()
if scheduled:
    pass  # do something
This code also runs from one of the tasks, but I don't think that matters; I got the same result from the interactive Python command line (with some exceptions, see below).
At the same time, the celery -A <proj> inspect scheduled command never fails. Also, I noticed that when called from the interactive Python command line for the first time, the call never fails either. Most of the subsequent i.scheduled() calls return nothing.
Does i.scheduled() guarantee a result only when called for the first time?
If so, why, and how then can I inspect scheduled tasks from a task? Run a dedicated worker and restart it after every task? That seems like overkill for such a trivial task.
Please explain how to use this feature the right way.
This is caused by some weird issue inside the Celery app. To call the Inspect object's methods repeatedly, you have to create a new Celery app instance each time.
Here is a small snippet which can help you:
from celery import Celery

def inspect(method):
    app = Celery('app', broker='amqp://')
    return getattr(app.control.inspect(), method)()

print(inspect('scheduled'))
print(inspect('active'))
print inspect('active')
I have a web app written in Flask that is currently running on IIS on Windows (don't ask...).
I'm using Celery to handle some asynchronous processing (accessing a slow database and generating a report).
However, when trying to set up some behavior for error handling, I came across this in the docs:
"Time limits do not currently work on Windows and other platforms that do not support the SIGUSR1 signal."
Since the DB can get really slow, I would really like to be able to specify a timeout behavior for my tasks, and have them retry later when the DB might be under less load. Given that the app, for various reasons, has to be served from Windows, is there any workaround for this?
Thanks so much for your help.
If you really need to set a task timeout, you can use a child process to achieve it. The code is as follows:
import json
from multiprocessing import Process
from celery import current_app
from celery.exceptions import SoftTimeLimitExceeded

soft_time_limit = 60

@current_app.task(bind=True, name="task_name")
def task_worker(self, *args, **kwargs):
    def on_failure():
        pass

    # run the real work in a child process so we can enforce our own time limit
    worker = Process(target=do_working, args=args, kwargs=kwargs, name='worker')
    worker.daemon = True
    worker.start()
    worker.join(soft_time_limit)          # wait at most soft_time_limit seconds
    if worker.is_alive():                 # still running: kill it and signal the timeout
        worker.terminate()
        raise SoftTimeLimitExceeded
    return json.dumps(dict(message="ok"))

def do_working(*args, **kwargs):
    pass  # do something
It doesn't look like there is any built-in workaround for this in Celery. Could you perhaps code this into your task directly? In other words, in your Python code, start a timer when you begin the task; if the task takes too long to complete, raise an exception and resubmit the job to the queue.
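A hedged sketch of that idea, combining it with the child-process trick from the previous answer and Celery's retry mechanism (generate_report, the broker URL, and the timeouts are placeholders, not from the question):

from multiprocessing import Process
from celery import Celery

app = Celery('reports', broker='amqp://')      # broker URL is a placeholder

def generate_report(report_id):
    pass  # the slow database work goes here

@app.task(bind=True, max_retries=3)
def report_task(self, report_id):
    worker = Process(target=generate_report, args=(report_id,))
    worker.start()
    worker.join(timeout=600)                   # our own "time limit", in seconds
    if worker.is_alive():                      # still running after the deadline
        worker.terminate()
        # resubmit the job and try again later, when the DB may be under less load
        raise self.retry(countdown=1800)

Depending on your Celery version you may also need to run the worker with the solo (or threads) pool on Windows, and the child-process target has to be importable at module level for multiprocessing to work there.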