I'm running celery and celery flower with redis as a broker. Everything boots up correctly, the worker can find jobs from redis, and the celery worker completes the jobs successfully.
The issue I'm having is the Broker tab in the celery flower web UI doesn't show any of the information from Redis. I know the Redis url is correct, because it's the same URL that celeryd is using. I also know that the celery queue has information in it, because I can manually confirm that via redis-cli.
I'm wondering if celery flower is trying to monitor a different queue in the Broker tab? I don't see any settings in the flower documentation to override or confirm. I'm happy to provide additional information upon request, but I'm not certain what is relevant.
Turns out I needed to start Celery Flower with both the broker and broker_api command line arguments:
celery flower --broker=redis://localhost:6379/0 --broker_api=redis://localhost:6379/0
Hope this helps someone else.
Here is an example for AMQP:
/usr/bin/celery -A app_name --broker=amqp://user:pw@host//vhost --broker_api=http://user:pw@host:host_port/api flower
The broker_api is the RabbitMQ management web UI endpoint with /api appended. Make sure the management plugin is enabled:
rabbitmq-plugins enable rabbitmq_management
This is what helped me: http://flower.readthedocs.org/en/latest/config.html?highlight=broker_api#broker-api
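As a quick sanity check (assuming the default guest credentials and the standard management port 15672), you can query the management API directly; if this returns JSON, the same URL and credentials should work as Flower's broker_api:
curl -u guest:guest http://localhost:15672/api/overview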
Faced the same issue with RabbitMQ. Here is how I got it working:
rabbitmq:
  image: rabbitmq:3-management
flower:
  image: mher/flower
  ports:
    - 5555:5555
  command:
    - "celery"
    - "--broker=amqp://guest@rabbitmq:5672//"
    - "flower"
    - "--broker_api=http://guest:guest@rabbitmq:15672/api//"
  depends_on:
    - rabbitmq
The Broker and other tabs will then show up.
I have a setup where Airflow is running in Kubernetes (EKS) and a remote worker is running in docker-compose in a VM behind a firewall in a different location.
Problem
The Airflow webserver in EKS is getting a 403 Forbidden error when trying to fetch logs from the remote worker.
Build Version
Airflow - 2.2.2
OS - Linux - Ubuntu 20.04 LTS
Kubernetes - 1.22 (EKS)
Redis (Celery Broker) - Service Port exposed on 6379
PostgreSQL (Celery Backend) - Service Port exposed on 5432
Airflow ENV config setup
AIRFLOW__API__AUTH_BACKEND: airflow.api.auth.backend.basic_auth
AIRFLOW__CELERY__BROKER_URL: redis://<username>:<password>@redis-master.airflow-dev.svc.cluster.local:6379/0
AIRFLOW__CELERY__RESULT_BACKEND: >-
db+postgresql://<username>:<password>@db-postgresql.airflow-dev.svc.cluster.local/<db>
AIRFLOW__CLI__ENDPOINT_URL: http://{hostname}:8080
AIRFLOW__CORE__DAGS_ARE_PAUSED_AT_CREATION: 'true'
AIRFLOW__CORE__EXECUTOR: CeleryExecutor
AIRFLOW__CORE__FERNET_KEY: <fernet_key>
AIRFLOW__CORE__HOSTNAME_CALLABLE: socket.getfqdn
AIRFLOW__CORE__LOAD_EXAMPLES: 'false'
AIRFLOW__CORE__SQL_ALCHEMY_CONN: >-
postgresql+psycopg2://<username>:<password>@db-postgresql.airflow-dev.svc.cluster.local/<db>
AIRFLOW__LOGGING__BASE_LOG_FOLDER: /opt/airflow/logs
AIRFLOW__LOGGING__WORKER_LOG_SERVER_PORT: '8793'
AIRFLOW__WEBSERVER__BASE_URL: http://{hostname}:8080
AIRFLOW__WEBSERVER__SECRET_KEY: <secret_key>
_AIRFLOW_DB_UPGRADE: 'true'
_AIRFLOW_WWW_USER_CREATE: 'true'
_AIRFLOW_WWW_USER_PASSWORD: <password-webserver>
_AIRFLOW_WWW_USER_USERNAME: <username-webserver>
Airflow is using CeleryExecutor
Setup Test
Network reachability by ping - OK
Celery broker reachability from both EKS and the remote worker - OK
Celery backend reachability from both EKS and the remote worker - OK
Firewall port exposed for the remote worker Gunicorn API - OK
curl -v telnet://:8793 test - OK (Connected)
Airflow Flower recognizes both the Kubernetes workers and the remote worker - OK
All the ENV values on the webserver, workers (EKS, remote) and scheduler are identical
A queue is set up so the DAG runs on exactly that particular worker
Time on the Docker container, the VM and EKS is UTC. There is a slight 5 to 8 second difference between the Docker container and the pod in EKS
Ran a webserver on the remote VM as well, which can pick up and show the logs
Description
Airflow is able to execute the DAG on the remote worker, and the logs can be seen on the remote worker itself. I have tried all combinations of settings but still keep getting 403.
Another test was a plain curl with webserver auth.
This curl was done both from EKS and from the remote server that hosts docker-compose. The results are the same on all servers.
curl --user <username-webserver> -vvv http://<remote-worker>:8793/logs/?<rest-of-the-log-url>
Getting 403 Forbidden
I might have misconfigured it, but I doubt that is the case.
Any tips on what I am missing here? Many thanks in advance.
https://github.com/apache/airflow/discussions/26624#discussioncomment-3715688
Following the discussion I had with the Airflow community on GitHub (linked above), I synced the servers to use NTP; EKS and the remote worker had a 135-second time drift.
Then I worked on the auth.
I rebuilt the curl auth from this file on the 2.2 branch: https://github.com/apache/airflow/blob/main/airflow/utils/log/file_task_handler.py
I later realized that the auth doesn't like special characters in the secret key, and on top of that there was the NTP time drift of 135 seconds (2 min 15 s), which also factored in and caused confusion.
I would recommend that people facing this problem avoid special characters in the secret key. This is just a recommendation from an Airflow user; I wouldn't say it is the only solution, but it is what helped me.
The special characters combined with the NTP drift made the issue confusing to debug; resolving NTP should be the first step, then the auth.
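For intuition on why both the secret key and the clock matter: the worker's log server only accepts requests carrying a token signed with the shared webserver secret key, and the token is only accepted within a short time window, so a key mismatch or a large clock drift both surface as 403. The following is only an illustrative sketch using itsdangerous, not Airflow's actual implementation; SECRET_KEY and GRACE_SECONDS are made-up names:

# Illustrative sketch only -- not Airflow's real code. Shows why a shared secret
# plus a timestamp check rejects requests when keys differ or clocks drift too far.
from itsdangerous import TimestampSigner, BadSignature, SignatureExpired

SECRET_KEY = "same-value-on-webserver-and-worker"  # hypothetical shared secret
GRACE_SECONDS = 30                                 # hypothetical freshness window

def make_token(log_path: str) -> bytes:
    # "Webserver" side: sign the requested log path with the shared secret.
    return TimestampSigner(SECRET_KEY).sign(log_path)

def check_token(token: bytes) -> str:
    # "Worker" side: verify the signature AND that the timestamp is recent enough.
    try:
        return TimestampSigner(SECRET_KEY).unsign(token, max_age=GRACE_SECONDS).decode()
    except SignatureExpired:   # clock drift larger than the window -> rejected
        raise PermissionError("token too old - check NTP / clock drift")
    except BadSignature:       # secret key differs between the two sides -> rejected
        raise PermissionError("bad signature - check that the secret key matches")

token = make_token("dag_id/task_id/2022-01-01/1.log")
print(check_token(token))      # prints the log path when key and clocks agree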
I am having issues when running my Python Flask application from a Docker pull (remote pull).
In my app I use RabbitMQ as the message broker and Celery as the task scheduler. It works as expected when running locally, but when I put my application on Docker and pull it from a remote system, the app itself runs fine while Celery and RabbitMQ are not running with it, so all tasks (called with method.delay()) hang forever and the HTTP request is never processed.
I need help putting my Python Flask application into Docker, as my application has asynchronous tasks to be processed with Celery. I am not sure how to modify docker-compose.yml to include a Celery service.
Thanks in advance.
I think you need to link the celery container with rabbitmq.
From https://docs.docker.com/compose/compose-file/#links
Link to containers in another service. Either specify both the service name and a link alias (SERVICE:ALIAS), or just the service name.
links:
  - rabbitmq
Or:
links:
  - rabbitmq:rabbitmq
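To the original question about adding a Celery service: a minimal docker-compose sketch could look like the one below. The service names, the flask-app image, and the app.celery module path are assumptions you would adapt to your own project:

version: "3"
services:
  rabbitmq:
    image: rabbitmq:3-management
    ports:
      - "5672:5672"
      - "15672:15672"
  web:
    image: flask-app                 # hypothetical image built from your Flask project
    ports:
      - "5000:5000"
    environment:
      - CELERY_BROKER_URL=amqp://guest:guest@rabbitmq:5672//
    depends_on:
      - rabbitmq
  worker:
    image: flask-app                 # same image, started with the celery worker command
    command: celery -A app.celery worker --loglevel=info    # app.celery is assumed
    environment:
      - CELERY_BROKER_URL=amqp://guest:guest@rabbitmq:5672//
    depends_on:
      - rabbitmq

With this layout the Flask container and the worker container share the same image; only the command differs, and both reach RabbitMQ via the rabbitmq service name on the compose network.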
We're using celery eta tasks to schedule tasks FAR (like months) in the future.
We are now using the RabbitMQ backend because the Mongo backend lost such tasks on a worker restart.
Tasks with the RabbitMQ backend do seem to be persistent across Celery and RabbitMQ restarts, BUT revoke messages seem to be lost on RabbitMQ restarts.
I guess that if revoke messages are lost, the eta tasks that should have been killed will execute anyway.
This may be helpful from the documentation (Persistent Revokes):
The list of revoked tasks is in-memory so if all workers restart the list of revoked ids will also vanish. If you want to preserve this list between restarts you need to specify a file for these to be stored in by using the --statedb argument to celery worker:
$ celery -A proj worker -l info --statedb=/var/run/celery/worker.state
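For completeness, a revoke is issued against a task id, and a worker started with --statedb writes revoked ids to that file so they survive restarts. A small sketch, assuming a project called proj and a placeholder task id:

# Sketch: revoke a long-eta task by id. The broker URL and task id are placeholders.
from celery import Celery

app = Celery('proj', broker='amqp://guest@localhost//')

# 'some-task-id' would be the AsyncResult id returned when the eta task was queued.
app.control.revoke('some-task-id')
# Workers started with --statedb=/var/run/celery/worker.state persist the revoked
# ids to that file, so the revoke still applies after a worker restart.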
I started to use celery flower for task monitoring and it is working like a charm. I have one concern though: how can I "reload" info about monitored tasks after a flower restart?
I use redis as a broker, and I need to be able to check on tasks even in case of an unexpected restart of the service (or server).
Thanks in advance
I found it out.
It is a matter of setting the persistent flag in the command running celery flower.
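For example, something along these lines (using Flower's --persistent flag, optionally with --db to choose where the state file lives; the path below is just an illustration):
celery flower --broker=redis://localhost:6379/0 --persistent=True --db=/var/lib/flower/flower.db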
How can I use two different celery projects which consume messages from a single RabbitMQ installation?
Generally, these scripts work fine if I use a separate RabbitMQ for each of them. But on the production machine, I need to share the same RabbitMQ backend between them.
Note: Due to some constraints, I cannot merge the new project into the existing one, so they will remain two different projects.
RabbitMQ has the ability to create virtual message brokers called virtual hosts or vhosts. Each one is essentially a mini-RabbitMQ server with its own queues. This lets you safely use one RabbitMQ server for multiple applications.
The rabbitmqctl add_vhost command creates a vhost.
By default Celery uses the default / vhost:
celery worker --broker=amqp://guest@localhost//
But you can use any custom vhost:
celery worker --broker=amqp://guest@localhost/myvhost
Examples:
rabbitmqctl add_vhost new_host
rabbitmqctl add_vhost /another_host
celery worker --broker=amqp://guest@localhost/new_host
celery worker --broker=amqp://guest@localhost//another_host
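One general RabbitMQ note (not specific to Celery): the user also needs permissions on the new vhost, otherwise the broker will refuse the connection. For example, for the guest user:
rabbitmqctl set_permissions -p new_host guest ".*" ".*" ".*"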