Framing Errors in Celery 3.0.1 - python

I recently upgraded to Celery 3.0.1 from 2.3.0, and all the tasks run fine. Unfortunately, I'm getting a "Framing Error" exception pretty frequently. I'm also running supervisor to restart the processes, but since these are never really killed, supervisor has no way of knowing that Celery needs to be restarted. Has anyone seen this before?
[2012-07-13 18:53:59,004: ERROR/MainProcess] Unrecoverable error: Exception('Framing Error, received 0x00 while expecting 0xce',)
Traceback (most recent call last):
File "/usr/local/lib/python2.7/dist-packages/celery/worker/__init__.py", line 350, in start
component.start()
File "/usr/local/lib/python2.7/dist-packages/celery/worker/consumer.py", line 360, in start
self.consume_messages()
File "/usr/local/lib/python2.7/dist-packages/celery/worker/consumer.py", line 445, in consume_messages
drain_nowait()
File "/usr/local/lib/python2.7/dist-packages/kombu/connection.py", line 175, in drain_nowait
self.drain_events(timeout=0)
File "/usr/local/lib/python2.7/dist-packages/kombu/connection.py", line 171, in drain_events
return self.transport.drain_events(self.connection, **kwargs)
File "/usr/local/lib/python2.7/dist-packages/kombu/transport/amqplib.py", line 262, in drain_events
return connection.drain_events(**kwargs)
File "/usr/local/lib/python2.7/dist-packages/kombu/transport/amqplib.py", line 97, in drain_events
chanmap, None, timeout=timeout)
File "/usr/local/lib/python2.7/dist-packages/kombu/transport/amqplib.py", line 155, in _wait_multiple
channel, method_sig, args, content = read_timeout(timeout)
File "/usr/local/lib/python2.7/dist-packages/kombu/transport/amqplib.py", line 129, in read_timeout
return self.method_reader.read_method()
File "/usr/local/lib/python2.7/dist-packages/amqplib/client_0_8/method_framing.py", line 221, in read_method
raise m
Exception: Framing Error, received 0x00 while expecting 0xce

I am not sure why this actually happens, but switching from amqplib to librabbitmq resolved the problem for me.
I didn't change anything in the configuration, I just ran:
pip uninstall amqplib
pip install librabbitmq
and restarted the Celery workers.
Got this idea from https://github.com/celery/celery/issues/922
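
As a rough way to confirm which client library a worker environment can import after the swap, a check along these lines may help. It is only a sketch written for Python 3: `pick_amqp_client` is a hypothetical helper name, not part of Celery or kombu, and it only reports whether a module is importable. It does not prove which transport kombu selects at runtime.

```python
import importlib.util


def pick_amqp_client(candidates=("librabbitmq", "amqplib")):
    """Return the first importable module name from candidates, or None."""
    for name in candidates:
        # find_spec returns None when a top-level module cannot be found.
        if importlib.util.find_spec(name) is not None:
            return name
    return None


if __name__ == "__main__":
    print("Importable AMQP client:", pick_amqp_client())
```

Running it inside the same virtualenv or system Python that the workers use is what makes the result meaningful.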

Related

Django server won't run

I just tried to start a Django project on Win7 (x64), but I ran into the following issue:
$ python manage.py runserver
Performing system checks...
System check identified no issues (0 silenced).
March 24, 2018 - 14:24:08
Django version 1.11.3, using settings 'superlists.settings'
Starting development server at http://127.0.0.1:8000/
Quit the server with CTRL-BREAK.
Unhandled exception in thread started by <function check_errors.<locals>.wrapper at 0x035BD978>
Traceback (most recent call last):
File "C:\Users\alesya\.virtualenvs\superlists\lib\site-packages\django\utils\autoreload.py", line 227, in wrapper
fn(*args, **kwargs)
File "C:\Users\alesya\.virtualenvs\superlists\lib\site-packages\django\core\management\commands\runserver.py", line 149, in inner_run
ipv6=self.use_ipv6, threading=threading, server_cls=self.server_cls)
File "C:\Users\alesya\.virtualenvs\superlists\lib\site-packages\django\core\servers\basehttp.py", line 164, in run
httpd = httpd_cls(server_address, WSGIRequestHandler, ipv6=ipv6)
File "C:\Users\alesya\.virtualenvs\superlists\lib\site-packages\django\core\servers\basehttp.py", line 74, in __init__
super(WSGIServer, self).__init__(*args, **kwargs)
File "c:\users\alesya\appdata\local\programs\python\python36-32\Lib\socketserver.py", line 453, in __init__
self.server_bind()
File "c:\users\alesya\appdata\local\programs\python\python36-32\Lib\wsgiref\simple_server.py", line 50, in server_bind
HTTPServer.server_bind(self)
File "c:\users\alesya\appdata\local\programs\python\python36-32\Lib\http\server.py", line 138, in server_bind
self.server_name = socket.getfqdn(host)
File "c:\users\alesya\appdata\local\programs\python\python36-32\Lib\socket.py", line 673, in getfqdn
hostname, aliases, ipaddrs = gethostbyaddr(name)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xbb in position 14: invalid start byte
My computer has an ASCII name, so I don't even understand what is happening.
I did the same things on another Win7 machine and everything was OK.
Maybe someone can help with this?
UPD. My problem was caused by a modified 'hosts' file: it contained a lot of disabled addresses.
Thanks all for the answers.
Use Python 3. If you use Python 2.x, many characters, such as accented letters, cause abnormal crashes.
Try this:
a.encode('utf-8').strip()
where "a" is the string with the non-ASCII character.
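
Since the update in the question traces the failure to a modified hosts file, one way to look for suspicious entries is to scan that file for bytes outside the ASCII range. The following is a minimal sketch, assuming Python 3 and the default Windows hosts location; `non_ascii_lines` is a hypothetical helper name, and whether such bytes are what `gethostbyaddr` failed to decode in this case is an assumption.

```python
import os

# Default hosts file location on Windows (assumed; adjust if the system differs).
HOSTS_PATH = r"C:\Windows\System32\drivers\etc\hosts"


def non_ascii_lines(raw):
    """Return (line_number, line_bytes) for each line that has a byte >= 0x80."""
    hits = []
    for number, line in enumerate(raw.splitlines(), start=1):
        # Iterating over bytes yields integers on Python 3.
        if any(byte >= 0x80 for byte in line):
            hits.append((number, line))
    return hits


if __name__ == "__main__":
    if os.path.exists(HOSTS_PATH):
        with open(HOSTS_PATH, "rb") as handle:
            for number, line in non_ascii_lines(handle.read()):
                print(number, line)
```

Reading the file in binary mode avoids hitting the same decoding error while looking for its cause.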

django manage.py runserver fails to run

I'm new to Django. I started a project inside a virtualenv, and whenever I try to run runserver I get this message:
Performing system checks...
System check identified no issues (0 silenced).
March 01, 2018 - 13:22:34
Django version 2.0.2, using settings 'PollApp.settings'
Starting development server at http://127.0.0.1:8000/
Quit the server with CTRL-BREAK.
Unhandled exception in thread started by <function check_errors.<locals>.wrapper at 0x03BBF7C8>
Traceback (most recent call last):
File "C:\Users\Sebastian\Desktop\Desarrollo\HelloWorld1\lib\site-packages\django\utils\autoreload.py", line 225, in wrapper
fn(*args, **kwargs)
File "C:\Users\Sebastian\Desktop\Desarrollo\HelloWorld1\lib\site-packages\django\core\management\commands\runserver.py", line 143, in inner_run
ipv6=self.use_ipv6, threading=threading, server_cls=self.server_cls)
File "C:\Users\Sebastian\Desktop\Desarrollo\HelloWorld1\lib\site-packages\django\core\servers\basehttp.py", line 163, in run
httpd = httpd_cls(server_address, WSGIRequestHandler, ipv6=ipv6)
File "C:\Users\Sebastian\Desktop\Desarrollo\HelloWorld1\lib\site-packages\django\core\servers\basehttp.py", line 66, in __init__
super().__init__(*args, **kwargs)
File "c:\users\sebastian\appdata\local\programs\python\python36-32\Lib\socketserver.py", line 453, in __init__
self.server_bind()
File "c:\users\sebastian\appdata\local\programs\python\python36-32\Lib\wsgiref\simple_server.py", line 50, in server_bind
HTTPServer.server_bind(self)
File "c:\users\sebastian\appdata\local\programs\python\python36-32\Lib\http\server.py", line 138, in server_bind
self.server_name = socket.getfqdn(host)
File "c:\users\sebastian\appdata\local\programs\python\python36-32\Lib\socket.py", line 673, in getfqdn
hostname, aliases, ipaddrs = gethostbyaddr(name)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe1 in position 7: invalid continuation byte
I don't really know where to start solving this error. I have a WAMP server installed; should I check whether I'm using port 8000?
Change your computer name so it only has valid ASCII characters.
According to the known issues below, your best solution is to change your host name.
I don't think WAMP is causing this, unless both are running on the same port!
https://code.djangoproject.com/ticket/19357
https://bugs.python.org/issue26227
Error while running Django app
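
Before renaming the machine, it may be worth confirming that the host name really contains non-ASCII characters. A minimal sketch, assuming Python 3; `is_ascii` is a hypothetical helper name, and the sketch assumes `socket.gethostname()` itself succeeds on the affected machine:

```python
import socket


def is_ascii(text):
    """Return True if text contains only ASCII characters."""
    try:
        text.encode("ascii")
    except UnicodeEncodeError:
        return False
    return True


if __name__ == "__main__":
    name = socket.gethostname()
    verdict = "is ASCII-only" if is_ascii(name) else "contains non-ASCII characters"
    print(name, verdict)
```

If the name turns out to be ASCII-only, the non-ASCII bytes are probably coming from somewhere else in name resolution, for example the hosts file.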

Celery upgrade (3.1->4.1) - Connection reset by peer

We have been working with Celery for the past year, with ~15 workers, each one configured with a concurrency between 1 and 4.
Recently we upgraded Celery from v3.1 to v4.1.
Now we are seeing the following error in the logs of every worker. Any ideas what could cause it?
2017-08-21 18:33:19,780 94794 ERROR Control command error: error(104, 'Connection reset by peer') [file: pidbox.py, line: 46]
Traceback (most recent call last):
File "/srv/dy/venv/lib/python2.7/site-packages/celery/worker/pidbox.py", line 42, in on_message
self.node.handle_message(body, message)
File "/srv/dy/venv/lib/python2.7/site-packages/kombu/pidbox.py", line 129, in handle_message
return self.dispatch(**body)
File "/srv/dy/venv/lib/python2.7/site-packages/kombu/pidbox.py", line 112, in dispatch
ticket=ticket)
File "/srv/dy/venv/lib/python2.7/site-packages/kombu/pidbox.py", line 135, in reply
serializer=self.mailbox.serializer)
File "/srv/dy/venv/lib/python2.7/site-packages/kombu/pidbox.py", line 265, in _publish_reply
**opts
File "/srv/dy/venv/lib/python2.7/site-packages/kombu/messaging.py", line 181, in publish
exchange_name, declare,
File "/srv/dy/venv/lib/python2.7/site-packages/kombu/messaging.py", line 203, in _publish
mandatory=mandatory, immediate=immediate,
File "/srv/dy/venv/lib/python2.7/site-packages/amqp/channel.py", line 1748, in _basic_publish
(0, exchange, routing_key, mandatory, immediate), msg
File "/srv/dy/venv/lib/python2.7/site-packages/amqp/abstract_channel.py", line 64, in send_method
conn.frame_writer(1, self.channel_id, sig, args, content)
File "/srv/dy/venv/lib/python2.7/site-packages/amqp/method_framing.py", line 178, in write_frame
write(view[:offset])
File "/srv/dy/venv/lib/python2.7/site-packages/amqp/transport.py", line 272, in write
self._write(s)
File "/usr/lib64/python2.7/socket.py", line 224, in meth
return getattr(self._sock,name)(*args)
error: [Errno 104] Connection reset by peer
BTW, our tasks have this form:
@app.task(name='EXAMPLE_TASK',
          bind=True,
          base=ConnectionHolderTask)
def example_task(self, arg1, arg2, **kwargs):
    # task code
We are also having massive issues with Celery... I spend 20% of my time just dancing around weird idle-hang/crash issues with our workers. Sigh.
We had a similar case that was caused by high concurrency combined with a high worker_prefetch_multiplier. As it turns out, fetching thousands of tasks is a good way to frack the connection.
If that's not the case, try disabling the broker pool by setting broker_pool_limit to None.
Just some quick ideas that might (hopefully) help :-)
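
The two suggestions above could be expressed as a configuration module along these lines. This is a sketch only: the module name `celeryconfig.py` and the concrete values are illustrative assumptions, using the lowercase setting names introduced in Celery 4.

```python
# celeryconfig.py (hypothetical module name), Celery 4.x lowercase setting names.

# Worker processes per worker instance; the question mentions values between 1 and 4.
worker_concurrency = 4

# How many messages each worker process may reserve ahead of time.
# 1 keeps prefetching minimal; this value is an illustrative choice.
worker_prefetch_multiplier = 1

# None disables the broker connection pool, as suggested in the answer.
broker_pool_limit = None
```

A module like this can be loaded with `app.config_from_object('celeryconfig')`. Lowering the prefetch multiplier trades some throughput on short tasks for fewer reserved messages per connection.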

403 Forbidden when connecting to S3 bucket in AWS Cloud using Toil

I am a newbie to Toil and AWS, trying to run the HelloWorld.py example from the Toil documentation. I have already successfully installed Toil and the related Python packages on my local Mac laptop and have set up my account at AWS. I have created a small leader/worker cluster:
$ cgcloud create-cluster toil -s 2 -t m3.large
and started it:
$ cgcloud ssh toil-leader
This changed my screen prompt to:
mesosbox@ip-172-31-25-135:~$
Then, from another window on my Mac, I started the Toil HelloWorld example with the command:
$ python2.7 HelloWorld.py --batchSystem=mesos --mesosMaster=mesos-master:5050 aws:us-west-2:my-aws-jobstore
And I got the following output:
Apples-Air 2017-06-02 19:30:53,524 MainThread INFO toil.lib.bioio: Root logger is at level 'INFO', 'toil' logger at level 'INFO'.
Apples-Air 2017-06-02 19:30:53,524 MainThread INFO toil.lib.bioio: Root logger is at level 'INFO', 'toil' logger at level 'INFO'.
Apples-Air 2017-06-02 19:30:54,852 MainThread WARNING toil.jobStores.aws.jobStore: Exception during panic
Traceback (most recent call last):
File "/usr/local/lib/python2.7/site-packages/toil/jobStores/aws/jobStore.py", line 209, in initialize
self.destroy()
File "/usr/local/lib/python2.7/site-packages/toil/jobStores/aws/jobStore.py", line 1334, in destroy
self._bind(create=False, block=False)
File "/usr/local/lib/python2.7/site-packages/toil/jobStores/aws/jobStore.py", line 241, in _bind
versioning=True)
File "/usr/local/lib/python2.7/site-packages/toil/jobStores/aws/jobStore.py", line 721, in _bindBucket
bucket = self.s3.get_bucket(bucket_name, validate=True)
File "/usr/local/lib/python2.7/site-packages/boto/s3/connection.py", line 502, in get_bucket
return self.head_bucket(bucket_name, headers=headers)
File "/usr/local/lib/python2.7/site-packages/boto/s3/connection.py", line 535, in head_bucket
raise err
S3ResponseError: S3ResponseError: 403 Forbidden
Traceback (most recent call last):
File "helloWorld.py", line 22, in <module>
print(Job.Runner.startToil(j, options)) #Prints Hello, world!, ….
File "/usr/local/lib/python2.7/site-packages/toil/job.py", line 740, in startToil
with Toil(options) as toil:
File "/usr/local/lib/python2.7/site-packages/toil/common.py", line 614, in __enter__
jobStore.initialize(config)
File "/usr/local/lib/python2.7/site-packages/toil/jobStores/aws/jobStore.py", line 209, in initialize
self.destroy()
File "/usr/local/lib/python2.7/site-packages/toil/jobStores/aws/jobStore.py", line 206, in initialize
self._bind(create=True)
File "/usr/local/lib/python2.7/site-packages/toil/jobStores/aws/jobStore.py", line 241, in _bind
versioning=True)
File "/usr/local/lib/python2.7/site-packages/toil/jobStores/aws/jobStore.py", line 721, in _bindBucket
bucket = self.s3.get_bucket(bucket_name, validate=True)
File "/usr/local/lib/python2.7/site-packages/boto/s3/connection.py", line 502, in get_bucket
return self.head_bucket(bucket_name, headers=headers)
File "/usr/local/lib/python2.7/site-packages/boto/s3/connection.py", line 535, in head_bucket
raise err
boto.exception.S3ResponseError: S3ResponseError: 403 Forbidden
Please help.
Thanks.
---John
I realize that this answer is a little late. One problem I notice is with the mesosMaster argument.
Instead, your command should have looked like
python2.7 HelloWorld.py --batchSystem=mesos --mesosMaster=172.31.25.135:5050 aws:us-west-2:my-aws-jobstore
Notice that I replaced mesos-master with the actual IP address from
mesosbox@ip-172-31-25-135:~$
Hopefully in the future one will not need to pass this argument at all; however, this is not yet implemented as of 26 July 2017.
Also, for further problems with Toil you will probably have better luck posting a new issue to the Toil GitHub page.
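
The substitution described in this answer, turning the `ip-172-31-25-135` part of the leader's prompt into an address for `--mesosMaster`, can be sketched as a small helper. `mesos_master_from_hostname` is a hypothetical name, not part of Toil or cgcloud, and the default port 5050 is taken from the command in the question.

```python
import re


def mesos_master_from_hostname(hostname, port=5050):
    """Convert an 'ip-a-b-c-d' style host name into an 'a.b.c.d:port' string."""
    match = re.search(r"ip-(\d{1,3})-(\d{1,3})-(\d{1,3})-(\d{1,3})", hostname)
    if match is None:
        raise ValueError("no ip-a-b-c-d host name found in %r" % hostname)
    return "%s:%d" % (".".join(match.groups()), port)


if __name__ == "__main__":
    # Host name part of the prompt shown on the leader.
    print(mesos_master_from_hostname("ip-172-31-25-135"))  # 172.31.25.135:5050
```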

Celery kombu fails after self.connections.acquire

After my Celery service has been running for 7-10 days, I receive this exception out of nowhere, and it causes my tasks not to be processed. A restart of Celery fixes the problem.
INTERNAL ERROR: RuntimeError('Acquire on closed pool',)
Traceback (most recent call last):
File "/usr/lib/python2.7/dist-packages/celery/app/trace.py", line 253, in trace_task
I, R, state, retval = on_error(task_request, exc, uuid)
File "/usr/lib/python2.7/dist-packages/celery/app/trace.py", line 201, in on_error
R = I.handle_error_state(task, eager=eager)
File "/usr/lib/python2.7/dist-packages/celery/app/trace.py", line 85, in handle_error_state
}[self.state](task, store_errors=store_errors)
File "/usr/lib/python2.7/dist-packages/celery/app/trace.py", line 118, in handle_failure
req.id, exc, einfo.traceback, request=req,
File "/usr/lib/python2.7/dist-packages/celery/backends/base.py", line 121, in mark_as_failure
traceback=traceback, request=request)
File "/usr/lib/python2.7/dist-packages/celery/backends/amqp.py", line 124, in store_result
with self.app.amqp.producer_pool.acquire(block=True) as producer:
File "/usr/lib/python2.7/dist-packages/kombu/connection.py", line 868, in acquire
R = self.prepare(R)
File "/usr/lib/python2.7/dist-packages/kombu/pools.py", line 63, in prepare
conn = self._acquire_connection()
File "/usr/lib/python2.7/dist-packages/kombu/pools.py", line 38, in _acquire_connection
return self.connections.acquire(block=True)
File "/usr/lib/python2.7/dist-packages/kombu/connection.py", line 859, in acquire
raise RuntimeError('Acquire on closed pool')
RuntimeError: Acquire on closed pool
Software versions
software -> celery:3.1.20 (Cipater) kombu:3.0.35 py:2.7.6
billiard:3.3.0.22 py-amqp:1.4.9
platform -> system:Linux arch:64bit, ELF imp:CPython
loader -> celery.loaders.default.Loader
settings -> transport:amqp results:amqp
CELERY_ACCEPT_CONTENT: ['json', 'pickle', 'yaml']
CELERY_ENABLE_UTC: True
CELERY_IGNORE_RESULT: False
CELERY_IMPORTS:
('catalogue.app.voice.cluster.deploy_cluster',
'catalogue.app.common.install_uc',
'hypervisor.app.deploy_esx',
'hypervisor.app.vm_operations',
'tools.deploy_tools')
CELERYD_CHDIR: '/usr/local/src/imbue/application/app'
CELERY_TASK_RESULT_EXPIRES: 18000
CELERY_RESULT_PERSISTENT: True
CELERY_TIMEZONE: 'US/Eastern'
BROKER_URL: 'amqp://******:********@rabbitmq:5672//'
CELERY_RESULT_BACKEND: 'amqp'
Only workaround now is to restart.
Ubuntu 14.04 2 GB RAM/2 CPU/40 GB HDD
This looks like a bug in Celery. Asksol fixed this a few days back.
You can install Celery from source and try it. If it is still causing problems, please create a new issue on GitHub.
