I have used this tutorial to set up Celery on my Flask application, but I keep getting the following error:
File "C:\Users\user\AppData\Local\Programs\Python\Python38\lib\site-packages\celery\app\base.py", line 141, in data
return self.callback()
celery.exceptions.ImproperlyConfigured:
Cannot mix new setting names with old setting names, please
rename the following settings to use the old format:
include -> CELERY_INCLUDE
Or change all of the settings to use the new format :)
What am I doing wrong? The code I used is basically the same as in the tutorial:
__init__.py
app = Flask(__name__)
app.config.from_object(Config)
app.config['TESTING'] = True
db = SQLAlchemy(app)
migrate = Migrate(app, db)
def make_celery(app):
celery = Celery(
app.import_name,
backend=app.config['CELERY_RESULT_BACKEND'],
broker=app.config['CELERY_BROKER_URL']
)
celery.conf.update(app.config)
class ContextTask(celery.Task):
def __call__(self, *args, **kwargs):
with app.app_context():
return self.run(*args, **kwargs)
celery.Task = ContextTask
return celery
app.config.update(
CELERY_BROKER_URL='redis://localhost:6379',
CELERY_RESULT_BACKEND='redis://localhost:6379'
)
celery = make_celery(app)
The code you are using uses the old setting names. Change the lines
app.config.update(
CELERY_BROKER_URL='redis://localhost:6379',
CELERY_RESULT_BACKEND='redis://localhost:6379'
)
to
app.config.update(
broker_url='redis://localhost:6379',
result_backend='redis://localhost:6379'
)
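For completeness, here is a minimal sketch of the whole factory with every Celery setting in the new lowercase format (same Redis URLs as in the question); treat it as a sketch rather than a drop-in replacement:

from celery import Celery
from flask import Flask

app = Flask(__name__)
app.config.update(
    broker_url='redis://localhost:6379',
    result_backend='redis://localhost:6379',
)

def make_celery(app):
    # Read the new-style keys straight from the Flask config
    celery = Celery(
        app.import_name,
        backend=app.config['result_backend'],
        broker=app.config['broker_url'],
    )
    celery.conf.update(app.config)

    class ContextTask(celery.Task):
        def __call__(self, *args, **kwargs):
            # Run every task inside the Flask application context
            with app.app_context():
                return self.run(*args, **kwargs)

    celery.Task = ContextTask
    return celery

celery = make_celery(app)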
A similar Celery/Flask implementation was working fine on Ubuntu with Python 3.8, but gave the error above on Windows 10 (Python 3.9.0 and Celery 4.4.6 (cliffs)).
It finally worked for me after adding -P solo to the celery worker command (ref: https://github.com/celery/celery/issues/3759):
$ celery worker -A proj -Q yourqueuename -P solo --loglevel=INFO
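If the solo pool is too limiting, another pool that is often suggested for Windows is gevent (or eventlet); a hedged sketch of the commands, assuming you install the extra package first:

$ pip install gevent
$ celery worker -A proj -Q yourqueuename -P gevent --loglevel=INFO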
I would like to add the following Celery settings to my Django app:
worker_send_task_events = False
task_ignore_result = True
task_acks_late = True
worker_prefetch_multiplier = 10
In my celery.py, I have:
import os
from celery import Celery
os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'server.settings')
app = Celery('server')
app.config_from_object('django.conf:settings', namespace='CELERY')
app.autodiscover_tasks()
And my tasks.py
from celery import shared_task

@shared_task
def some_task():
pass
Celery is executed using the following command:
celery -A server worker -Ofair --without-gossip --without-mingle --without-heartbeat
I have added them directly to Django's settings.py, but I am not sure whether Celery actually picked those settings up. Is there another way to add them, or has someone had a similar experience?
I am using
celery==5.2.1
Django==3.2.5
You need to use upper-case naming, as described here.
Example:
Use CELERYD_SEND_EVENTS = False instead of worker_send_task_events = False
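Alternatively, because your celery.py calls config_from_object('django.conf:settings', namespace='CELERY'), the new lowercase names should also be picked up if you write them upper-cased with the CELERY_ prefix in settings.py; a sketch of that variant:

# Django settings.py: with namespace='CELERY', Celery strips the prefix
# and lower-cases the rest, so these map to the new-style setting names.
CELERY_WORKER_SEND_TASK_EVENTS = False
CELERY_TASK_IGNORE_RESULT = True
CELERY_TASK_ACKS_LATE = True
CELERY_WORKER_PREFETCH_MULTIPLIER = 10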
I have been trying to get Celery task results routed to another process by persisting the results to a queue, so that another process can pick them up from that queue. I have configured Celery with CELERY_RESULT_BACKEND = 'rpc', but the value returned by the Python function is still not persisted to a queue.
Not sure if any other configuration or code change is required. Please help.
Here is the code example:
celery.py
from __future__ import absolute_import
from celery import Celery
app = Celery('proj',
broker='amqp://',
backend='rpc://',
include=['proj.tasks'])
# Optional configuration, see the application user guide.
app.conf.update(
CELERY_RESULT_BACKEND = 'rpc',
CELERY_RESULT_PERSISTENT = True,
CELERY_TASK_SERIALIZER = 'json',
CELERY_RESULT_SERIALIZER = 'json'
)
if __name__ == '__main__':
app.start()
tasks.py
from proj.celery import app
@app.task
def add(x, y):
return x + y
Running Celery as
celery worker --app=proj -l info --pool=eventlet -c 4
Solved by using Pika (a Python implementation of the AMQP 0-9-1 protocol, https://pika.readthedocs.org) to post results back to the celeryresults channel.
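For anyone trying the same thing, a rough sketch of what that Pika publish can look like (the celeryresults queue name and the JSON payload are just assumptions for illustration):

import json

import pika

def publish_result(result, queue='celeryresults'):
    # Open a blocking connection to the local RabbitMQ broker
    connection = pika.BlockingConnection(pika.ConnectionParameters('localhost'))
    channel = connection.channel()
    channel.queue_declare(queue=queue, durable=True)
    # Send the task's return value as JSON so the other process can decode it
    channel.basic_publish(exchange='', routing_key=queue, body=json.dumps(result))
    connection.close()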
I'm new to Celery. All of the examples I've seen start a celery worker from the command line, e.g.:
$ celery -A proj worker -l info
I'm starting a project on elastic beanstalk and thought it would be nice to have the worker be a subprocess of my web app. I tried using multiprocessing and it seems to work. I'm wondering if this is a good idea, or if there might be some disadvantages.
import celery
import multiprocessing
class WorkerProcess(multiprocessing.Process):
def __init__(self):
super().__init__(name='celery_worker_process')
def run(self):
argv = [
'worker',
'--loglevel=WARNING',
'--hostname=local',
]
app.worker_main(argv)
def start_celery():
global worker_process
worker_process = WorkerProcess()
worker_process.start()
def stop_celery():
global worker_process
if worker_process:
worker_process.terminate()
worker_process = None
worker_name = 'celery@local'
worker_process = None
app = celery.Celery()
app.config_from_object('celery_app.celeryconfig')
Seems like a good option, definitely not the only option but a good one :)
One thing you might want to look into (you might already be doing this), is linking the autoscaling to the size of your Celery queue. So you only scale up when the queue is growing.
Effectively, Celery does something similar internally of course, so there's not a lot of difference. The only snag I can think of is the handling of external resources (database connections, for example); that might be a problem, but it depends entirely on what you are doing with Celery.
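For the queue-based autoscaling idea, a hedged sketch of reading the message count from the RabbitMQ management API (host, credentials, vhost and queue name below are placeholders):

import requests

def queue_depth(host='localhost', vhost='%2F', queue='celery'):
    # The RabbitMQ management plugin serves queue stats over HTTP on port 15672
    url = 'http://{}:15672/api/queues/{}/{}'.format(host, vhost, queue)
    response = requests.get(url, auth=('guest', 'guest'))
    response.raise_for_status()
    # 'messages' counts ready + unacknowledged messages in the queue
    return response.json()['messages']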
If anyone is interested, I did get this working on Elastic Beanstalk with a pre-configured AMI server running Python 3.4. I had a lot of problems with the Docker based server running Debian Jessie. Something to do with port remapping, maybe. Docker is kind of a black box, and I've found it very hard to work with and debug. Fortunately, the good folks at AWS just added a non-docker Python 3.4 option on April 8, 2015.
I did a lot of searching to get this deployed and working. I saw lots of questions without answers. So here's my very simple deployed python 3.4/flask/celery process.
Celery you can just pip install. You'll need to install RabbitMQ from a configuration file with a config command or container_command. I'm using a script in my uploaded project zip, so a container_command is necessary to run the script (the regular eb config command runs before the project is installed).
[yourapproot]/.ebextensions/05_install_rabbitmq.config:
container_commands:
01RunScript:
command: bash ./init_scripts/app_setup.sh
[yourapproot]/init_scripts/app_setup.sh:
#!/usr/bin/env bash
# Download and install Erlang
yum -y install erlang
# Download the latest RabbitMQ package using wget:
wget http://www.rabbitmq.com/releases/rabbitmq-server/v3.5.1/rabbitmq-server-3.5.1-1.noarch.rpm
# Install rabbit
rpm --import http://www.rabbitmq.com/rabbitmq-signing-key-public.asc
yum -y install rabbitmq-server-3.5.1-1.noarch.rpm
# Start server
/sbin/service rabbitmq-server start
I'm doing a flask app, so I startup the workers before the first request:
@app.before_first_request
def before_first_request():
task_mgr.start_celery()
The task_mgr creates the celery app object (which I call celery, since the flask app object is app). The -Ofair is pretty key here for a simple task manager; there's all kinds of strange behavior with task prefetch otherwise. Maybe this should be the default?
task_mgr/task_mgr.py:
import celery as celery_module
import multiprocessing
class WorkerProcess(multiprocessing.Process):
def __init__(self):
super().__init__(name='celery_worker_process')
def run(self):
argv = [
'worker',
'--loglevel=WARNING',
'--hostname=local',
'-Ofair',
]
celery.worker_main(argv)
def start_celery():
global worker_process
multiprocessing.set_start_method('fork') # 'spawn' seems to work also
worker_process = WorkerProcess()
worker_process.start()
def stop_celery():
global worker_process
if worker_process:
worker_process.terminate()
worker_process = None
worker_name = 'celery@local'
worker_process = None
celery = celery_module.Celery()
celery.config_from_object('task_mgr.celery_config')
My config is pretty simple so far:
task_mgr/celery_config.py:
BROKER_URL = 'amqp://'
CELERY_RESULT_BACKEND = 'amqp://'
CELERY_ACCEPT_CONTENT = ['json']
CELERY_TASK_SERIALIZER = 'json' # 'pickle' warning: can't use datetime in json
CELERY_RESULT_SERIALIZER = 'json' # 'pickle' warning: can't use datetime in json
CELERY_TASK_RESULT_EXPIRES = 18000 # Results hang around for 5 hours
CELERYD_CONCURRENCY = 4
Then you can put tasks wherever you need them:
from task_mgr.task_mgr import celery
import time
@celery.task(bind=True)
def error_task(self):
self.update_state(state='RUNNING')
time.sleep(10)
raise KeyError('im an error')
@celery.task(bind=True)
def long_task(self):
self.update_state(state='RUNNING')
time.sleep(20)
return 'long task finished'
@celery.task(bind=True)
def task_with_status(self, wait):
self.update_state(state='RUNNING')
for i in range(5):
time.sleep(wait)
self.update_state(
state='PROGRESS',
meta={
'current': i + 1,
'total': 5,
'status': 'progress',
'host': self.request.hostname,
}
)
time.sleep(wait)
return 'finished with wait = ' + str(wait)
I also keep a task queue to hold the async results so I can monitor the tasks:
task_queue = []
def queue_task(task, *args):
async_result = task.apply_async(args)
task_queue.append(
{
'task_name':task.__name__,
'task_args':args,
'async_result':async_result
}
)
return async_result
def get_tasks_info():
tasks = []
for task in task_queue:
task_name = task['task_name']
task_args = task['task_args']
async_result = task['async_result']
task_id = async_result.id
task_state = async_result.state
task_result_info = async_result.info
task_result = async_result.result
tasks.append(
{
'task_name': task_name,
'task_args': task_args,
'task_id': task_id,
'task_state': task_state,
'task_result.info': task_result_info,
'task_result': task_result,
}
)
return tasks
And of course, start the tasks where you need to:
from webapp.app import app
from flask import url_for, render_template, redirect
from webapp import tasks
from task_mgr import task_mgr
@app.route('/start_all_tasks')
def start_all_tasks():
task_mgr.queue_task(tasks.long_task)
task_mgr.queue_task(tasks.error_task)
for i in range(1, 9):
task_mgr.queue_task(tasks.task_with_status, i * 2)
return redirect(url_for('task_status'))
@app.route('/task_status')
def task_status():
current_tasks = task_mgr.get_tasks_info()
return render_template(
'parse/task_status.html',
tasks=current_tasks
)
And that's about it. Let me know if you need any help, though my celery knowledge is still fairly limited.
I have a flask app that roughly looks like this:
import json
from flask import Flask, request

app = Flask(__name__)
@app.route('/', methods=['POST'])
def foo():
data = json.loads(request.data)
# do some stuff
return "OK"
In addition, I would like to run a function every ten seconds from that script. I don't want to use sleep for that. I have the following Celery script as well:
from celery import Celery
from datetime import timedelta
celery = Celery('__name__')
CELERYBEAT_SCHEDULE = {
'add-every-30-seconds': {
'task': 'tasks.add',
'schedule': timedelta(seconds=10)
},
}
@celery.task(name='tasks.add')
def hello():
app.logger.info('run my function')
The script works fine, but the logger.info is not executed. What am I missing?
Do you have Celery worker and Celery beat running? Scheduled tasks are handled by beat, which queues the task mentioned when appropriate. Worker then actually crunches the numbers and executes your task.
celery worker --app myproject --loglevel=info
celery beat --app myproject
Your task however looks like it's calling the Flask app's logger. When using the worker, you probably don't have the Flask application around (since it's in another process). Try using a normal Python logger for the demo task.
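For example, either the standard logging module or Celery's own task logger works inside the worker without the Flask app; a sketch using the latter, assuming the celery instance from the question:

from celery.utils.log import get_task_logger

logger = get_task_logger(__name__)

@celery.task(name='tasks.add')
def hello():
    # Logged by the worker process; no Flask application context needed
    logger.info('run my function')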
Well, celery beat can be embedded in a regular celery worker as well, with the -B parameter in your command.
celery worker --app myproject --loglevel=info -B
This is only recommended for the development environment. For production, you should run beat and the celery workers separately, as the documentation mentions. Otherwise, your periodic task will run more than once.
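One more thing worth checking: the CELERYBEAT_SCHEDULE dict in your script is only a module-level variable and is never attached to the Celery app, so beat has nothing to schedule. A one-line sketch of attaching it, using the same names as in your script:

# Register the schedule on the app so beat actually reads it
celery.conf.update(CELERYBEAT_SCHEDULE=CELERYBEAT_SCHEDULE)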
A Celery task by default runs outside of the Flask app context and thus won't have access to the Flask app instance. However, it's very easy to create the Flask app context while running a task by using the app_context method of the Flask app object.
from celery import Celery
from flask import Flask

app = Flask(__name__)
celery = Celery(app.name)
@celery.task
def task():
with app.app_context():
app.logger.info('running my task')
This article by Miguel Grinberg is a very good place to get a primer on the basics of using Celery in a Flask application.
First, install Redis on the machine and check that it is running.
Install the Python dependencies listed below:
celery
redis
flask
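For example, a single pip command covers these (assuming the standard PyPI package names):

pip install celery redis flask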
Folder structure:
project
app
__init__.py
task.py
main.py
task.py:
from celery import Celery
from celery.schedules import crontab
from app import app
from app.scrap import product_data
from celery.utils.log import get_task_logger
logger = get_task_logger(__name__)
def make_celery(app):
#Celery configuration
app.config['CELERY_BROKER_URL'] = 'redis://127.0.0.1:6379'
app.config['CELERY_RESULT_BACKEND'] = 'db+postgresql://user:password@172.17.0.3:5432/mydatabase'
app.config['CELERY_RESULT_EXTENDED']=True
app.config['CELERYBEAT_SCHEDULE'] = {
# Executes every minute
'periodic_task-every-minute': {
'task': 'periodic_task',
'schedule': crontab(minute="*")
}
}
celery = Celery(app.import_name, broker=app.config['CELERY_BROKER_URL'])
celery.conf.update(app.config)
TaskBase = celery.Task
class ContextTask(TaskBase):
abstract = True
def __call__(self, *args, **kwargs):
with app.app_context():
return TaskBase.__call__(self, *args, **kwargs)
celery.Task = ContextTask
return celery
celery = make_celery(app)
#celery.task(name="periodic_task",bind=True)
def testing(self):
file1 = open("../myfile.txt", "a")
# writing newline character
file1.write("\n")
file1.write("Today")
print("Running")
self.request.task_name = "state"
logger.info("Hello! from periodic task")
return "Done"
__init__.py:
from flask import Flask, Blueprint,request
from flask_restx import Api,Resource,fields
from flask_sqlalchemy import SQLAlchemy
import redis
from rq import Queue
app = Flask(__name__)
app.config['SECRET_KEY']='7c09ebc8801a0ce8fb82b3d2ec51b4db'
app.config['SQLALCHEMY_DATABASE_URI']='sqlite:///site.db'
db=SQLAlchemy(app)
Commands to run Celery beat and the worker:
celery -A app.task.celery beat
celery -A app.task.celery worker --loglevel=info
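For local development, the two can usually be combined into a single process with the embedded-beat flag (not recommended for production, for the reasons mentioned earlier):

celery -A app.task.celery worker --beat --loglevel=info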
Using flask-script's add_option method, I'm trying to pass the name of a config file into my create_app() so I can configure from_pyfile() -- Flask Instance Folders.
I used this gist to get me started.
manage.py
from flask_script import Manager

from fbone import create_app
app = create_app()
manager = Manager(app)
manager.add_option('-c', '--config', dest='config', required=False)
app.py
def create_app(config=None, app_name=None, blueprints=None):
"""Create a Flask app."""
print config
This is just a snippet of my create_app function, but I'm starting the app like this:
$ python manage.py -c config/heroku.cfg runserver
None
/env/lib/python2.7/site-packages/flask_script/__init__.py:153: UserWarning: Options will be ignored.
* Running on http://127.0.0.1:5000/
* Restarting with reloader
None
As you can see, instead of printing config/heroku.cfg, it prints None.
I think this is because of the UserWarning from flask-script, but I can't figure out why that's happening.
It turns out you are creating the Flask object by calling create_app() (with the parens).
If you do
app = create_app
or
Manager(create_app)
then you should be able to use
add_option()
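So a manage.py along these lines should work; this is just a sketch of the pattern (flask-script passes the parsed option to the factory as the config keyword argument):

from flask_script import Manager

from fbone import create_app

# Pass the factory itself (no parens) so add_option values reach create_app()
manager = Manager(create_app)
manager.add_option('-c', '--config', dest='config', required=False)

if __name__ == '__main__':
    manager.run()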