airflow connection_closed_abruptly - python

I have a problem with airflow (v1.9.0.dev0+apache.incubating).
Everything looks fine until the scheduler picks up a job, at which point it crashes with this log:
[2017-03-15 15:54:18,075] {jobs.py:1329} INFO - Waiting up to 5s for processes to exit...
Traceback (most recent call last):
  File "/usr/local/bin/airflow", line 4, in <module>
    __import__('pkg_resources').run_script('airflow==1.9.0.dev0+apache.incubating', 'airflow')
  File "/usr/local/lib/python2.7/dist-packages/pkg_resources/__init__.py", line 738, in run_script
    self.require(requires)[0].run_script(script_name, ns)
  File "/usr/local/lib/python2.7/dist-packages/pkg_resources/__init__.py", line 1499, in run_script
    exec(code, namespace, namespace)
  File "/usr/local/lib/python2.7/dist-packages/airflow-1.9.0.dev0+apache.incubating-py2.7.egg/EGG-INFO/scripts/airflow", line 28, in <module>
    args.func(args)
  File "/usr/local/lib/python2.7/dist-packages/airflow-1.9.0.dev0+apache.incubating-py2.7.egg/airflow/bin/cli.py", line 839, in scheduler
    job.run()
  File "/usr/local/lib/python2.7/dist-packages/airflow-1.9.0.dev0+apache.incubating-py2.7.egg/airflow/jobs.py", line 200, in run
    self._execute()
  File "/usr/local/lib/python2.7/dist-packages/airflow-1.9.0.dev0+apache.incubating-py2.7.egg/airflow/jobs.py", line 1309, in _execute
    self._execute_helper(processor_manager)
  File "/usr/local/lib/python2.7/dist-packages/airflow-1.9.0.dev0+apache.incubating-py2.7.egg/airflow/jobs.py", line 1441, in _execute_helper
    self.executor.heartbeat()
  File "/usr/local/lib/python2.7/dist-packages/airflow-1.9.0.dev0+apache.incubating-py2.7.egg/airflow/executors/base_executor.py", line 132, in heartbeat
    self.sync()
  File "/usr/local/lib/python2.7/dist-packages/airflow-1.9.0.dev0+apache.incubating-py2.7.egg/airflow/executors/celery_executor.py", line 88, in sync
    state = async.state
  File "/usr/local/lib/python2.7/dist-packages/celery/result.py", line 431, in state
    return self._get_task_meta()['status']
  File "/usr/local/lib/python2.7/dist-packages/celery/result.py", line 370, in _get_task_meta
    return self._maybe_set_cache(self.backend.get_task_meta(self.id))
  File "/usr/local/lib/python2.7/dist-packages/celery/backends/amqp.py", line 156, in get_task_meta
    binding.declare()
  File "/usr/local/lib/python2.7/dist-packages/kombu/entity.py", line 604, in declare
    self._create_exchange(nowait=nowait, channel=channel)
  File "/usr/local/lib/python2.7/dist-packages/kombu/entity.py", line 611, in _create_exchange
    self.exchange.declare(nowait=nowait, channel=channel)
  File "/usr/local/lib/python2.7/dist-packages/kombu/entity.py", line 185, in declare
    nowait=nowait, passive=passive,
  File "/usr/local/lib/python2.7/dist-packages/amqp/channel.py", line 630, in exchange_declare
    wait=None if nowait else spec.Exchange.DeclareOk,
  File "/usr/local/lib/python2.7/dist-packages/amqp/abstract_channel.py", line 64, in send_method
    conn.frame_writer(1, self.channel_id, sig, args, content)
  File "/usr/local/lib/python2.7/dist-packages/amqp/method_framing.py", line 174, in write_frame
    write(view[:offset])
  File "/usr/local/lib/python2.7/dist-packages/amqp/transport.py", line 269, in write
    self._write(s)
  File "/usr/lib/python2.7/socket.py", line 224, in meth
    return getattr(self._sock,name)(*args)
socket.error: [Errno 104] Connection reset by peer
In the RabbitMQ log I see:
=INFO REPORT==== 15-Mar-2017::15:54:17 ===
accepting AMQP connection <0.492.0> (127.0.0.1:42970 -> 127.0.0.1:5672)
=WARNING REPORT==== 15-Mar-2017::15:54:18 ===
closing AMQP connection <0.492.0> (127.0.0.1:42970 -> 127.0.0.1:5672):
connection_closed_abruptly
This looks like the client is doing something odd when it closes the connection.
Before 1.9.0 I used 1.7.1.3, which reported the same problem (https://issues.apache.org/jira/browse/AIRFLOW-342).
Has anybody fixed this in some way?
Any idea where to start looking?

Based on the logs, the connection close seems to have been initiated by RabbitMQ. You may want to run tcpdump to confirm this.
It might be something related to security settings (username/password, or vhost permissions). Did you do the following?
Configure RabbitMQ: create the user and grant privileges:
rabbitmqctl add_user rabbitmq_user_name rabbitmq_password
rabbitmqctl add_vhost rabbitmq_virtual_host_name
rabbitmqctl set_user_tags rabbitmq_user_name rabbitmq_tag_name
rabbitmqctl set_permissions -p rabbitmq_virtual_host_name rabbitmq_user_name ".*" ".*" ".*"
Taken from https://stlong0521.github.io/20161023%20-%20Airflow.html
If that's not the case, it could be, for example, that RabbitMQ is configured to use TLS while airflow/celery is not, or something along those lines - try a protocol analyzer such as Wireshark (formerly Ethereal), which understands the AMQP protocol used by RabbitMQ - see for example https://www.rabbitmq.com/amqp-wireshark.html.
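To quickly rule out credential/vhost problems, here is a minimal connectivity check from Python - a sketch only, assuming the same broker URL that airflow/celery uses (the user, password and vhost names below are placeholders matching the rabbitmqctl commands above; substitute the values from your airflow.cfg):
from kombu import Connection

# placeholder credentials/vhost - use the broker URL from your airflow.cfg
broker_url = "amqp://rabbitmq_user_name:rabbitmq_password@localhost:5672/rabbitmq_virtual_host_name"

with Connection(broker_url, connect_timeout=5) as conn:
    conn.connect()  # raises immediately if auth, vhost or TLS settings are wrong
    print("Connected to broker:", conn.as_uri())
If this fails with an authentication or socket error, the problem is in the broker configuration rather than in Airflow itself.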
Hope this helps.

Related

Using apscheduler with asyncpg db as Job Store: Error MissingGreenlet

I'm trying to use APScheduler with a PostgreSQL db via an asyncpg connection. I thought it would work, because asyncpg supports SQLAlchemy (ref). But it isn't working, and to make it worse, I don't understand the error message, so I don't even have a guess what to google for.
import asyncio
from apscheduler.schedulers.asyncio import AsyncIOScheduler
from apscheduler.jobstores.sqlalchemy import SQLAlchemyJobStore

def simple_job():
    print('This was an easy job!')

scheduler = AsyncIOScheduler()
jobstore = SQLAlchemyJobStore(url='postgresql+asyncpg://user:password@localhost:5432/public')
scheduler.add_jobstore(jobstore)

# schedule a simple job
scheduler.add_job(simple_job, 'cron', second='15', id='heartbeat',
                  coalesce=True, misfire_grace_time=5, replace_existing=True)
scheduler.start()
Versions:
python 3.7
APScheduler==3.7.0
asyncpg==0.22.0
SQLAlchemy==1.4.3
Error Message and traceback:
Traceback (most recent call last):
File "C:/Users/d/PycharmProjects/teamutils/utils/automation.py", line 320, in <module>
scheduler.start()
File "C:\Users\d\PycharmProjects\teamutils\venv\lib\site-packages\apscheduler\schedulers\asyncio.py", line 45, in start
super(AsyncIOScheduler, self).start(paused)
File "C:\Users\d\PycharmProjects\teamutils\venv\lib\site-packages\apscheduler\schedulers\base.py", line 163, in start
store.start(self, alias)
File "C:\Users\d\PycharmProjects\teamutils\venv\lib\site-packages\apscheduler\jobstores\sqlalchemy.py", line 68, in start
self.jobs_t.create(self.engine, True)
File "C:\Users\d\PycharmProjects\teamutils\venv\lib\site-packages\sqlalchemy\sql\schema.py", line 940, in create
bind._run_ddl_visitor(ddl.SchemaGenerator, self, checkfirst=checkfirst)
File "C:\Users\d\PycharmProjects\teamutils\venv\lib\site-packages\sqlalchemy\engine\base.py", line 2979, in _run_ddl_visitor
with self.begin() as conn:
File "C:\Users\d\PycharmProjects\teamutils\venv\lib\site-packages\sqlalchemy\engine\base.py", line 2895, in begin
conn = self.connect(close_with_result=close_with_result)
File "C:\Users\d\PycharmProjects\teamutils\venv\lib\site-packages\sqlalchemy\engine\base.py", line 3067, in connect
return self._connection_cls(self, close_with_result=close_with_result)
File "C:\Users\d\PycharmProjects\teamutils\venv\lib\site-packages\sqlalchemy\engine\base.py", line 91, in __init__
else engine.raw_connection()
File "C:\Users\d\PycharmProjects\teamutils\venv\lib\site-packages\sqlalchemy\engine\base.py", line 3146, in raw_connection
return self._wrap_pool_connect(self.pool.connect, _connection)
File "C:\Users\d\PycharmProjects\teamutils\venv\lib\site-packages\sqlalchemy\engine\base.py", line 3113, in _wrap_pool_connect
return fn()
File "C:\Users\d\PycharmProjects\teamutils\venv\lib\site-packages\sqlalchemy\pool\base.py", line 301, in connect
return _ConnectionFairy._checkout(self)
File "C:\Users\d\PycharmProjects\teamutils\venv\lib\site-packages\sqlalchemy\pool\base.py", line 755, in _checkout
fairy = _ConnectionRecord.checkout(pool)
File "C:\Users\d\PycharmProjects\teamutils\venv\lib\site-packages\sqlalchemy\pool\base.py", line 419, in checkout
rec = pool._do_get()
File "C:\Users\d\PycharmProjects\teamutils\venv\lib\site-packages\sqlalchemy\pool\impl.py", line 145, in _do_get
self._dec_overflow()
File "C:\Users\d\PycharmProjects\teamutils\venv\lib\site-packages\sqlalchemy\util\langhelpers.py", line 72, in __exit__
with_traceback=exc_tb,
File "C:\Users\d\PycharmProjects\teamutils\venv\lib\site-packages\sqlalchemy\util\compat.py", line 198, in raise_
raise exception
File "C:\Users\d\PycharmProjects\teamutils\venv\lib\site-packages\sqlalchemy\pool\impl.py", line 142, in _do_get
return self._create_connection()
File "C:\Users\d\PycharmProjects\teamutils\venv\lib\site-packages\sqlalchemy\pool\base.py", line 247, in _create_connection
return _ConnectionRecord(self)
File "C:\Users\d\PycharmProjects\teamutils\venv\lib\site-packages\sqlalchemy\pool\base.py", line 362, in __init__
self.__connect(first_connect_check=True)
File "C:\Users\d\PycharmProjects\teamutils\venv\lib\site-packages\sqlalchemy\pool\base.py", line 605, in __connect
pool.logger.debug("Error on connect(): %s", e)
File "C:\Users\d\PycharmProjects\teamutils\venv\lib\site-packages\sqlalchemy\util\langhelpers.py", line 72, in __exit__
with_traceback=exc_tb,
File "C:\Users\d\PycharmProjects\teamutils\venv\lib\site-packages\sqlalchemy\util\compat.py", line 198, in raise_
raise exception
File "C:\Users\d\PycharmProjects\teamutils\venv\lib\site-packages\sqlalchemy\pool\base.py", line 599, in __connect
connection = pool._invoke_creator(self)
File "C:\Users\d\PycharmProjects\teamutils\venv\lib\site-packages\sqlalchemy\engine\create.py", line 578, in connect
return dialect.connect(*cargs, **cparams)
File "C:\Users\d\PycharmProjects\teamutils\venv\lib\site-packages\sqlalchemy\engine\default.py", line 548, in connect
return self.dbapi.connect(*cargs, **cparams)
File "C:\Users\d\PycharmProjects\teamutils\venv\lib\site-packages\sqlalchemy\dialects\postgresql\asyncpg.py", line 744, in connect
await_only(self.asyncpg.connect(*arg, **kw)),
File "C:\Users\d\PycharmProjects\teamutils\venv\lib\site-packages\sqlalchemy\util\_concurrency_py3k.py", line 48, in await_only
"greenlet_spawn has not been called; can't call await_() here. "
sqlalchemy.exc.MissingGreenlet: greenlet_spawn has not been called; can't call await_() here. Was IO attempted in an unexpected place? (Background on this error at: http://sqlalche.me/e/14/xd2s)
sys:1: RuntimeWarning: coroutine 'connect' was never awaited
I looked up the provided link, but I couldn't make sense of it. So it would be nice if somebody could tell me what is going on, so I can search for a solution on my own (a solution would be okay too, of course).
Sorry for this "open" question, but my understanding is so limited that I don't know what to ask for.
I think the problem is in APScheduler.
What is happening is that scheduler.start() attempts to create the job table in your database. But since your database URL is specified with +asyncpg, and there is no async coroutine running (i.e. no async def) when APScheduler tries to create the table, you get the "coroutine 'connect' was never awaited" error.
After reading the APScheduler code, I think "integrates with asyncio" is a little misleading - specifically, the scheduler can run on asyncio, but the job store itself has no provision for an asyncio database connection.
You can get it working by removing +asyncpg from the connection URL used with APScheduler.
Note it would still be possible to use async db calls within job functions with a separate asyncpg connection.
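A minimal sketch of that change, reusing the code from the question (same placeholder credentials; dropping +asyncpg lets SQLAlchemy fall back to a synchronous driver, typically psycopg2, for the job store, while your jobs can still open their own asyncpg connections):
import asyncio
from apscheduler.schedulers.asyncio import AsyncIOScheduler
from apscheduler.jobstores.sqlalchemy import SQLAlchemyJobStore

def simple_job():
    print('This was an easy job!')

scheduler = AsyncIOScheduler()
# plain "postgresql://" (no +asyncpg), so the job store uses a sync driver
jobstore = SQLAlchemyJobStore(url='postgresql://user:password@localhost:5432/public')
scheduler.add_jobstore(jobstore)
scheduler.add_job(simple_job, 'cron', second='15', id='heartbeat',
                  coalesce=True, misfire_grace_time=5, replace_existing=True)
scheduler.start()
asyncio.get_event_loop().run_forever()  # keep the asyncio loop alive so jobs actually fire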

Airflow [Errno 104] Connection reset by peer

I am trying to run tasks through the command 'airflow scheduler', and it produced this error after I tried to run one of the DAGs.
Traceback (most recent call last):
File "/usr/local/bin/airflow", line 28, in <module>
args.func(args)
File "/usr/local/lib/python3.5/dist-packages/airflow/bin/cli.py", line 839, in scheduler
job.run()
File "/usr/local/lib/python3.5/dist-packages/airflow/jobs.py", line 200, in run
self._execute()
File "/usr/local/lib/python3.5/dist-packages/airflow/jobs.py", line 1309, in _execute
self._execute_helper(processor_manager)
File "/usr/local/lib/python3.5/dist-packages/airflow/jobs.py", line 1441, in _execute_helper
self.executor.heartbeat()
File "/usr/local/lib/python3.5/dist-packages/airflow/executors/base_executor.py", line 132, in heartbeat
self.sync()
File "/usr/local/lib/python3.5/dist-packages/airflow/executors/celery_executor.py", line 88, in sync
state = async.state
File "/home/userName/.local/lib/python3.5/site-packages/celery/result.py", line 436, in state
return self._get_task_meta()['status']
File "/home/userName/.local/lib/python3.5/site-packages/celery/result.py", line 375, in _get_task_meta
return self._maybe_set_cache(self.backend.get_task_meta(self.id))
File "/home/userName/.local/lib/python3.5/site-packages/celery/backends/amqp.py", line 156, in get_task_meta
binding.declare()
File "/home/userName/.local/lib/python3.5/site-packages/kombu/entity.py", line 605, in declare
self._create_queue(nowait=nowait, channel=channel)
File "/home/userName/.local/lib/python3.5/site-packages/kombu/entity.py", line 614, in _create_queue
self.queue_declare(nowait=nowait, passive=False, channel=channel)
File "/home/userName/.local/lib/python3.5/site-packages/kombu/entity.py", line 649, in queue_declare
nowait=nowait,
File "/home/userName/.local/lib/python3.5/site-packages/amqp/channel.py", line 1147, in queue_declare
nowait, arguments),
File "/home/userName/.local/lib/python3.5/site-packages/amqp/abstract_channel.py", line 50, in send_method
conn.frame_writer(1, self.channel_id, sig, args, content)
File "/home/userName/.local/lib/python3.5/site-packages/amqp/method_framing.py", line 166, in write_frame
write(view[:offset])
File "/home/userName/.local/lib/python3.5/site-packages/amqp/transport.py", line 258, in write
self._write(s)
**ConnectionResetError: [Errno 104] Connection reset by peer**
I am using Python 3.5, Airflow 1.8, Celery 4.1.0, and RabbitMQ 3.5.7 as the message broker.
It looks like the problem is on the RabbitMQ side, but I cannot figure out the reason.
The reported error appears to be a known issue that was fixed in Airflow 1.10.0.
Had the same issue.
Your DAG makes many API calls to a server, and your Airflow scheduler has a limit to how much it can handle at once. There isn't a specific number of simultaneous requests to abide by; you have to find the number that works for your Airflow environment by trial and error. The error usually occurs when your DAG has many tasks running alongside each other simultaneously.
In my experience this issue is not resolved by the upgrades claimed in other answers; I was getting the error even on the latest release.
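As an illustration, one way to cap how many of those tasks run at the same time is to lower the DAG-level concurrency (a sketch with hypothetical DAG id and values; tune them by trial and error as described above):
from datetime import datetime
from airflow import DAG

dag = DAG(
    dag_id="api_heavy_dag",           # hypothetical DAG id
    start_date=datetime(2018, 1, 1),
    schedule_interval="@daily",
    concurrency=4,                    # at most 4 task instances of this DAG run at once
    max_active_runs=1,                # only one DAG run active at a time
)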

Celery Beat process crashes after nslookup failure

I'm using Celery 3.1.19 with scheduled tasks. I start the process like so:
celery beat --app=my_app.celery.app:app --pidfile=/usr/local/celerybeat.pid --schedule=/usr/local/celerybeat-schedule -l INFO
I've had a couple occurrences where the celery process terminates after an nslookup failure. This causes future scheduled tasks to not run. Eventually I notice and restart celery beat.
As far as I can tell the hostname it's trying to look up is my RabbitMQ host. The nslookup failures are temporary. The hostname is correct and evidently there was a blip in name resolution. Ideally that would not crash the process; instead it would retry until the hostname lookup succeeded.
Questions:
Is this expected behavior?
Is there a common way to ensure that the scheduler keeps running?
Do people have a system to watch the process and restart if it crashes?
Stack trace:
Message Error: Couldn't apply scheduled task ping: Error opening socket: hostname lookup failed
File "/usr/local/bin/celery", line 11, in <module>
sys.exit(main())
File "/usr/local/lib/python2.7/dist-packages/celery/__main__.py", line 30, in main
main()
File "/usr/local/lib/python2.7/dist-packages/celery/bin/celery.py", line 81, in main
cmd.execute_from_commandline(argv)
File "/usr/local/lib/python2.7/dist-packages/celery/bin/celery.py", line 770, in execute_from_commandline
super(CeleryCommand, self).execute_from_commandline(argv)))
File "/usr/local/lib/python2.7/dist-packages/celery/bin/base.py", line 311, in execute_from_commandline
return self.handle_argv(self.prog_name, argv[1:])
File "/usr/local/lib/python2.7/dist-packages/celery/bin/celery.py", line 762, in handle_argv
return self.execute(command, argv)
File "/usr/local/lib/python2.7/dist-packages/celery/bin/celery.py", line 694, in execute
).run_from_argv(self.prog_name, argv[1:], command=argv[0])
File "/usr/local/lib/python2.7/dist-packages/celery/bin/base.py", line 315, in run_from_argv
sys.argv if argv is None else argv, command)
File "/usr/local/lib/python2.7/dist-packages/celery/bin/base.py", line 377, in handle_argv
return self(*args, **options)
File "/usr/local/lib/python2.7/dist-packages/celery/bin/base.py", line 274, in __call__
ret = self.run(*args, **kwargs)
File "/usr/local/lib/python2.7/dist-packages/celery/bin/beat.py", line 79, in run
return beat().run()
File "/usr/local/lib/python2.7/dist-packages/celery/apps/beat.py", line 83, in run
self.start_scheduler()
File "/usr/local/lib/python2.7/dist-packages/celery/apps/beat.py", line 112, in start_scheduler
beat.start()
File "/usr/local/lib/python2.7/dist-packages/celery/beat.py", line 473, in start
File "/usr/local/lib/python2.7/dist-packages/celery/beat.py", line 221, in tick

CKAN won't start

On my FreeBSD 10, I managed to install PostgreSQL, Apache Solr and CKAN, but when I run paster serve /etc/ckan/default/development.ini I get the error messages below (I know there is an issue with CKAN because on my ip:5000 I get a "connection failed" error).
Here are the error messages:
$ paster serve /etc/ckan/default/production.ini
Traceback (most recent call last):
File "/usr/lib/ckan/default/bin/paster", line 9, in <module>
load_entry_point('PasteScript==1.7.5', 'console_scripts', 'paster')()
File "/usr/lib/ckan/default/lib/python2.7/site-packages/paste/script/command.py", line 104, in run
invoke(command, command_name, options, args[1:])
File "/usr/lib/ckan/default/lib/python2.7/site-packages/paste/script/command.py", line 143, in invoke
exit_code = runner.run(args)
File "/usr/lib/ckan/default/lib/python2.7/site-packages/paste/script/command.py", line 238, in run
result = self.command()
File "/usr/lib/ckan/default/lib/python2.7/site-packages/paste/script/serve.py", line 284, in command
relative_to=base, global_conf=vars)
File "/usr/lib/ckan/default/lib/python2.7/site-packages/paste/script/serve.py", line 321, in loadapp
**kw)
File "/usr/lib/ckan/default/lib/python2.7/site-packages/paste/deploy/loadwsgi.py", line 247, in loadapp
return loadobj(APP, uri, name=name, **kw)
File "/usr/lib/ckan/default/lib/python2.7/site-packages/paste/deploy/loadwsgi.py", line 272, in loadobj
return context.create()
File "/usr/lib/ckan/default/lib/python2.7/site-packages/paste/deploy/loadwsgi.py", line 710, in create
return self.object_type.invoke(self)
File "/usr/lib/ckan/default/lib/python2.7/site-packages/paste/deploy/loadwsgi.py", line 146, in invoke
return fix_call(context.object, context.global_conf, **context.local_conf)
File "/usr/lib/ckan/default/lib/python2.7/site-packages/paste/deploy/util.py", line 56, in fix_call
val = callable(*args, **kw)
File "/root/ckan/lib/default/src/ckan/ckan/config/middleware.py", line 58, in make_app
load_environment(conf, app_conf)
File "/root/ckan/lib/default/src/ckan/ckan/config/environment.py", line 232, in load_environment
p.load_all(config)
File "/root/ckan/lib/default/src/ckan/ckan/plugins/core.py", line 134, in load_all
load(*plugins)
File "/root/ckan/lib/default/src/ckan/ckan/plugins/core.py", line 170, in load
plugins_update()
File "/root/ckan/lib/default/src/ckan/ckan/plugins/core.py", line 116, in plugins_update
environment.update_config()
File "/root/ckan/lib/default/src/ckan/ckan/config/environment.py", line 357, in update_config
plugin.configure(config)
File "/root/ckan/lib/default/src/ckan/ckanext/datastore/plugin.py", line 115, in configure
self._check_urls_and_permissions()
File "/root/ckan/lib/default/src/ckan/ckanext/datastore/plugin.py", line 159, in _check_urls_and_permissions
self._log_or_raise('The read-only user has write privileges.')
File "/root/ckan/lib/default/src/ckan/ckanext/datastore/plugin.py", line 140, in _log_or_raise
raise DatastoreException(message)
ckanext.datastore.plugin.DatastoreException: The read-only user has write privileges.
What should I do?
Your read-only PostgreSQL user for the datastore database is not set up correctly: the user has write privileges. The relevant part of the docs is: http://docs.ckan.org/en/latest/maintaining/datastore.html#set-permissions
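If you want to verify this yourself, here is a small sketch (the database name, user name, password and host are placeholders - use the values from ckan.datastore.read_url in your CKAN config); a correctly configured read-only user should NOT be able to create a table:
import psycopg2

# placeholder connection details - substitute those from ckan.datastore.read_url
conn = psycopg2.connect(dbname="datastore_default", user="datastore_default",
                        password="pass", host="localhost")
try:
    cur = conn.cursor()
    cur.execute("CREATE TABLE _write_check (id integer)")  # should be rejected
    print("User CAN create tables - permissions are too broad")
except psycopg2.ProgrammingError:
    print("Write was rejected - read-only permissions look correct")
finally:
    conn.rollback()  # never keep the test table
    conn.close()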
I faced this problem too. I deleted the user "datastore_default" from Postgres (for deleting a Postgres user, please refer to: Postgres Delete user) and then recreated the "datastore_default" user as described in the CKAN documentation (please refer to: Create User datastore_default for CKAN).

Celery creating a new connection for each task

I'm using Celery with Redis to run some background tasks, but each time a task is called, it creates a new connection to Redis. I'm on Heroku and my Redis to Go plan allows for 10 connections. I'm quickly hitting that limit and getting a "max number of clients reached" error.
How can I ensure that Celery queues the tasks on a single connection rather than opening a new one each time?
EDIT - including the full traceback
File "/app/.heroku/venv/lib/python2.7/site-packages/django/core/handlers/base.py", line 111, in get_response
response = callback(request, *callback_args, **callback_kwargs)
File "/app/.heroku/venv/lib/python2.7/site-packages/newrelic-1.4.0.137/newrelic/api/object_wrapper.py", line 166, in __call__
self._nr_instance, args, kwargs)
File "/app/.heroku/venv/lib/python2.7/site-packages/newrelic-1.4.0.137/newrelic/hooks/framework_django.py", line 447, in wrapper
return wrapped(*args, **kwargs)
File "/app/.heroku/venv/lib/python2.7/site-packages/django/views/decorators/csrf.py", line 77, in wrapped_view
return view_func(*args, **kwargs)
File "/app/feedback/views.py", line 264, in zencoder_webhook_handler
tasks.process_zencoder_notification.delay(webhook)
File "/app/.heroku/venv/lib/python2.7/site-packages/celery/app/task.py", line 343, in delay
return self.apply_async(args, kwargs)
File "/app/.heroku/venv/lib/python2.7/site-packages/celery/app/task.py", line 458, in apply_async
with app.producer_or_acquire(producer) as P:
File "/usr/local/lib/python2.7/contextlib.py", line 17, in __enter__
return self.gen.next()
File "/app/.heroku/venv/lib/python2.7/site-packages/celery/app/base.py", line 247, in producer_or_acquire
with self.amqp.producer_pool.acquire(block=True) as producer:
File "/app/.heroku/venv/lib/python2.7/site-packages/kombu/connection.py", line 705, in acquire
R = self.prepare(R)
File "/app/.heroku/venv/lib/python2.7/site-packages/kombu/pools.py", line 54, in prepare
p = p()
File "/app/.heroku/venv/lib/python2.7/site-packages/kombu/pools.py", line 45, in <lambda>
return lambda: self.create_producer()
File "/app/.heroku/venv/lib/python2.7/site-packages/kombu/pools.py", line 42, in create_producer
return self.Producer(self._acquire_connection())
File "/app/.heroku/venv/lib/python2.7/site-packages/celery/app/amqp.py", line 160, in __init__
super(TaskProducer, self).__init__(channel, exchange, *args, **kwargs)
File "/app/.heroku/venv/lib/python2.7/site-packages/kombu/messaging.py", line 83, in __init__
self.revive(self.channel)
File "/app/.heroku/venv/lib/python2.7/site-packages/kombu/messaging.py", line 174, in revive
channel = self.channel = maybe_channel(channel)
File "/app/.heroku/venv/lib/python2.7/site-packages/kombu/connection.py", line 879, in maybe_channel
return channel.default_channel
File "/app/.heroku/venv/lib/python2.7/site-packages/kombu/connection.py", line 617, in default_channel
self.connection
File "/app/.heroku/venv/lib/python2.7/site-packages/kombu/connection.py", line 610, in connection
self._connection = self._establish_connection()
File "/app/.heroku/venv/lib/python2.7/site-packages/kombu/connection.py", line 569, in _establish_connection
conn = self.transport.establish_connection()
File "/app/.heroku/venv/lib/python2.7/site-packages/kombu/transport/virtual/__init__.py", line 722, in establish_connection
self._avail_channels.append(self.create_channel(self))
File "/app/.heroku/venv/lib/python2.7/site-packages/kombu/transport/virtual/__init__.py", line 705, in create_channel
channel = self.Channel(connection)
File "/app/.heroku/venv/lib/python2.7/site-packages/kombu/transport/redis.py", line 271, in __init__
self.client.info()
File "/app/.heroku/venv/lib/python2.7/site-packages/newrelic-1.4.0.137/newrelic/api/object_wrapper.py", line 166, in __call__
self._nr_instance, args, kwargs)
File "/app/.heroku/venv/lib/python2.7/site-packages/newrelic-1.4.0.137/newrelic/api/function_trace.py", line 81, in literal_wrapper
return wrapped(*args, **kwargs)
File "/app/.heroku/venv/lib/python2.7/site-packages/redis/client.py", line 344, in info
return self.execute_command('INFO')
File "/app/.heroku/venv/lib/python2.7/site-packages/kombu/transport/redis.py", line 536, in execute_command
conn.send_command(*args)
File "/app/.heroku/venv/lib/python2.7/site-packages/redis/connection.py", line 273, in send_command
self.send_packed_command(self.pack_command(*args))
File "/app/.heroku/venv/lib/python2.7/site-packages/redis/connection.py", line 256, in send_packed_command
self.connect()
File "/app/.heroku/venv/lib/python2.7/site-packages/newrelic-1.4.0.137/newrelic/api/object_wrapper.py", line 166, in __call__
self._nr_instance, args, kwargs)
File "/app/.heroku/venv/lib/python2.7/site-packages/newrelic-1.4.0.137/newrelic/api/function_trace.py", line 81, in literal_wrapper
return wrapped(*args, **kwargs)
File "/app/.heroku/venv/lib/python2.7/site-packages/redis/connection.py", line 207, in connect
self.on_connect()
File "/app/.heroku/venv/lib/python2.7/site-packages/redis/connection.py", line 233, in on_connect
if self.read_response() != 'OK':
File "/app/.heroku/venv/lib/python2.7/site-packages/redis/connection.py", line 283, in read_response
raise response
ResponseError: max number of clients reached
I ran into the same problem on Heroku with CloudAMQP. I do not know why, but I had no luck when assigning low integers to the BROKER_POOL_LIMIT setting.
Ultimately, I found that by setting BROKER_POOL_LIMIT=None or BROKER_POOL_LIMIT=0 my issue was mitigated. According to the Celery docs, this disables the connection pool. So far, this has not been a noticeable issue for me, however I'm not sure if it might be for you.
Link to relevant info: http://celery.readthedocs.org/en/latest/configuration.html#broker-pool-limit
I wish I was using Redis, because there is a specific option to limit the number of connections: CELERY_REDIS_MAX_CONNECTIONS.
http://docs.celeryproject.org/en/3.0/configuration.html#celery-redis-max-connections (for 3.0)
http://docs.celeryproject.org/en/latest/configuration.html#celery-redis-max-connections (for 3.1)
http://docs.celeryproject.org/en/master/configuration.html#celery-redis-max-connections (for dev)
The MongoDB backend has a similar setting.
Given these backend settings, I have no idea what BROKER_POOL_LIMIT actually does. Hopefully CELERY_REDIS_MAX_CONNECTIONS solves your problem.
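For reference, a hedged sketch of how these would look in a celeryconfig.py or Django settings module (old-style setting names matching the Celery 3.x versions discussed here; the value 8 is just an example chosen to stay under the 10-connection Redis To Go limit):
BROKER_POOL_LIMIT = None            # disable kombu's broker connection pool entirely
# or: BROKER_POOL_LIMIT = 0

# If Redis is also the result backend, cap its connection pool separately:
CELERY_REDIS_MAX_CONNECTIONS = 8    # example value; keep it below your plan's limit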
I'm one of those folks using CloudAMQP, and the AMQP backend does not have its own connection limit parameter.
Try these settings:
CELERY_IGNORE_RESULT = True
CELERY_STORE_ERRORS_EVEN_IF_IGNORED = True
I had a similar issue involving the number of connections and Celery. It wasn't on Heroku, though, and it was Mongo rather than Redis.
I initiated the connection outside of the task function definition at the task module level. At least for Mongo this allowed the tasks to share the connection.
Hope that helps.
https://github.com/instituteofdesign/wander/blob/master/wander/tasks.py
mongoengine.connect('stored_messages')

@celery.task(default_retry_delay=61)
def pull(settings, google_settings, user, folder, messageid):
    '''
    Pulls a message from zimbra and stores it in Mongo
    '''
    try:
        imap = imap_connect(settings, user)
        imap.select(folder, True)
        .......
