Hi, I'm running Apache Airflow 2.2.3 and recently started getting an error where the scheduler keeps re-launching, throwing the warning: "DagFileProcessorManager (PID=xxxx) exited with exit code 1 - re-launching"
The Error:
[2022-11-15 14:19:45,922] {manager.py:318} WARNING - DagFileProcessorManager (PID=9176) exited with exit code 1 - re-launching
[2022-11-15 14:19:45,943] {manager.py:163} INFO - Launched DagFileProcessorManager with pid: 9182
[2022-11-15 14:19:45,963] {settings.py:52} INFO - Configured default timezone Timezone('UTC')
Process ForkProcess-6051:
Traceback (most recent call last):
File "/usr/lib/python3.8/multiprocessing/process.py", line 315, in _bootstrap
self.run()
File "/usr/lib/python3.8/multiprocessing/process.py", line 108, in run
self._target(*self._args, **self._kwargs)
File "/usr/local/lib/python3.8/dist-packages/airflow/dag_processing/manager.py", line 287, in _run_processor_manager
processor_manager.start()
File "/usr/local/lib/python3.8/dist-packages/airflow/dag_processing/manager.py", line 520, in start
return self._run_parsing_loop()
File "/usr/local/lib/python3.8/dist-packages/airflow/dag_processing/manager.py", line 530, in _run_parsing_loop
self._refresh_dag_dir()
File "/usr/local/lib/python3.8/dist-packages/airflow/dag_processing/manager.py", line 683, in _refresh_dag_dir
[
File "/usr/local/lib/python3.8/dist-packages/airflow/dag_processing/manager.py", line 686, in <listcomp>
if might_contain_dag(info.filename, True, z)
File "/usr/local/lib/python3.8/dist-packages/airflow/utils/file.py", line 221, in might_contain_dag
with zip_file.open(file_path) as current_file:
File "/usr/lib/python3.8/zipfile.py", line 1535, in open
raise BadZipFile("Bad magic number for file header")
zipfile.BadZipFile: Bad magic number for file header
This error prevents the Airflow UI from loading new DAGs or picking up modifications to existing DAG code.
At first I thought it might be related to the Python code I keep in an "includes" folder inside the dags folder, so I added that folder to my .airflowignore file to skip those files, but the scheduler keeps re-launching.
The only way I've managed to get new DAGs or DAG modifications loaded is by running the "airflow db init" command, but this is not optimal.
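In case it's relevant, here is the small script I've been using to look for corrupt archives in the dags folder. It is only a rough diagnostic sketch, not Airflow's own code, and the DAGS_FOLDER path is just my setup - it tries to open anything that looks like a zip the same way the traceback does:

import os
import zipfile

DAGS_FOLDER = "/opt/airflow/dags"  # assumption: replace with your dags folder

for root, _dirs, files in os.walk(DAGS_FOLDER):
    for name in files:
        path = os.path.join(root, name)
        if not zipfile.is_zipfile(path):
            continue
        try:
            with zipfile.ZipFile(path) as zf:
                bad = zf.testzip()  # name of the first corrupt member, or None
                if bad is not None:
                    print("corrupt member %r in %s" % (bad, path))
        except zipfile.BadZipFile as exc:
            print("cannot read %s: %s" % (path, exc))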
Please help!
Related
When working with luigi on Windows 10, the following error is sometimes thrown:
Traceback (most recent call last):
File "D:\Users\myuser\PycharmProjects\project\venv\lib\site-packages\luigi\worker.py", line 192, in run
new_deps = self._run_get_new_deps()
File "D:\Users\myuser\PycharmProjects\project\venv\lib\site-packages\luigi\worker.py", line 130, in _run_get_new_deps
task_gen = self.task.run()
File "project.py", line 15000, in run
data_frame = pd.read_excel(self.input()["tables"].open(),segment)
File "D:\Users\myuser\PycharmProjects\project\venv\lib\site-packages\pandas\io\excel.py", line 191, in read_excel
io = ExcelFile(io, engine=engine)
File "D:\Users\myuser\PycharmProjects\project\venv\lib\site-packages\pandas\io\excel.py", line 247, in __init__
self.book = xlrd.open_workbook(file_contents=data)
File "D:\Users\myuser\PycharmProjects\project\venv\lib\site-packages\xlrd\__init__.py", line 115, in open_workbook
zf = zipfile.ZipFile(timemachine.BYTES_IO(file_contents))
File "C:\Python27\lib\zipfile.py", line 793, in __init__
self._RealGetContents()
File "C:\Python27\lib\zipfile.py", line 862, in _RealGetContents
raise BadZipfile("Bad magic number for central directory")
BadZipfile: Bad magic number for central directory
This error seems to only happen on Windows, and it happens more frequently when the same file has to be accessed more than once within a task, or is used by different tasks as a Target, even when there are no parallel workers running. The code runs without errors on Linux.
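For reference, a minimal self-contained version of the pattern that triggers it looks roughly like this (task, file, and sheet names are made up; it assumes tables.xlsx exists next to the script):

import luigi
import pandas as pd

class DownloadTables(luigi.ExternalTask):
    def output(self):
        # the Excel file is produced elsewhere; default-format target
        return luigi.LocalTarget("tables.xlsx")

class ParseSegment(luigi.Task):
    segment = luigi.Parameter(default="Sheet1")

    def requires(self):
        return {"tables": DownloadTables()}

    def output(self):
        return luigi.LocalTarget("segment_%s.csv" % self.segment)

    def run(self):
        # same call as in the traceback: read_excel on the opened target handle
        with self.input()["tables"].open() as handle:
            data_frame = pd.read_excel(handle, self.segment)
        with self.output().open("w") as out:
            data_frame.to_csv(out, index=False)

if __name__ == "__main__":
    luigi.build([ParseSegment()], local_scheduler=True)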
Has anyone else experienced this behavior?
I'm trying to create pandas DataFrames from Excel file sheets, but sometimes I get a BadZipFile error instead.
The Airflow scheduler crashes when I trigger a DAG manually from the dashboard.
executor = DaskExecutor
Airflow version = 1.10.7
sql_alchemy_conn = postgresql://airflow:airflow@localhost:5432/airflow
Python version = 3.6
The logs on crash are:
[2020-08-20 07:01:49,288] {scheduler_job.py:1148} INFO - Sending ('hello_world', 'dummy_task', datetime.datetime(2020, 8, 20, 1, 31, 47, 20630, tzinfo=<TimezoneInfo [UTC, GMT, +00:00:00, STD]>), 1) to executor with priority 2 and queue default
[2020-08-20 07:01:49,288] {base_executor.py:58} INFO - Adding to queue: ['airflow', 'run', 'hello_world', 'dummy_task', '2020-08-20T01:31:47.020630+00:00', '--local', '--pool', 'default_pool', '-sd', '/workflows/dags/helloWorld.py']
/mypython/lib/python3.6/site-packages/airflow/executors/dask_executor.py:63: UserWarning: DaskExecutor does not support queues. All tasks will be run in the same cluster
'DaskExecutor does not support queues. '
distributed.protocol.pickle - INFO - Failed to serialize <function DaskExecutor.execute_async.<locals>.airflow_run at 0x12057a9d8>. Exception: Cell is empty
[2020-08-20 07:01:49,292] {scheduler_job.py:1361} ERROR - Exception when executing execute_helper
Traceback (most recent call last):
File "/mypython/lib/python3.6/site-packages/distributed/worker.py", line 843, in dumps_function
result = cache[func]
KeyError: <function DaskExecutor.execute_async.<locals>.airflow_run at 0x12057a9d8>
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/mypython/lib/python3.6/site-packages/distributed/protocol/pickle.py", line 38, in dumps
result = pickle.dumps(x, protocol=pickle.HIGHEST_PROTOCOL)
AttributeError: Can't pickle local object 'DaskExecutor.execute_async.<locals>.airflow_run'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/mypython/lib/python3.6/site-packages/airflow/jobs/scheduler_job.py", line 1359, in _execute
self._execute_helper()
File "/mypython/lib/python3.6/site-packages/airflow/jobs/scheduler_job.py", line 1420, in _execute_helper
if not self._validate_and_run_task_instances(simple_dag_bag=simple_dag_bag):
File "/mypython/lib/python3.6/site-packages/airflow/jobs/scheduler_job.py", line 1482, in _validate_and_run_task_instances
self.executor.heartbeat()
File "/mypython/lib/python3.6/site-packages/airflow/executors/base_executor.py", line 130, in heartbeat
self.trigger_tasks(open_slots)
File "/mypython/lib/python3.6/site-packages/airflow/executors/base_executor.py", line 154, in trigger_tasks
executor_config=simple_ti.executor_config)
File "/mypython/lib/python3.6/site-packages/airflow/executors/dask_executor.py", line 70, in execute_async
future = self.client.submit(airflow_run, pure=False)
File "/mypython/lib/python3.6/site-packages/distributed/client.py", line 1279, in submit
actors=actor)
File "/mypython/lib/python3.6/site-packages/distributed/client.py", line 2249, in _graph_to_futures
'tasks': valmap(dumps_task, dsk3),
File "/mypython/lib/python3.6/site-packages/toolz/dicttoolz.py", line 83, in valmap
rv.update(zip(iterkeys(d), map(func, itervalues(d))))
File "/mypython/lib/python3.6/site-packages/distributed/worker.py", line 881, in dumps_task
return {'function': dumps_function(task[0]),
File "/mypython/lib/python3.6/site-packages/distributed/worker.py", line 845, in dumps_function
result = pickle.dumps(func)
File "/mypython/lib/python3.6/site-packages/distributed/protocol/pickle.py", line 51, in dumps
return cloudpickle.dumps(x, protocol=pickle.HIGHEST_PROTOCOL)
File "/mypython/lib/python3.6/site-packages/cloudpickle/cloudpickle_fast.py", line 101, in dumps
cp.dump(obj)
File "/mypython/lib/python3.6/site-packages/cloudpickle/cloudpickle_fast.py", line 540, in dump
return Pickler.dump(self, obj)
File "/usr/local/opt/python@3.6/Frameworks/Python.framework/Versions/3.6/lib/python3.6/pickle.py", line 409, in dump
self.save(obj)
File "/usr/local/opt/python@3.6/Frameworks/Python.framework/Versions/3.6/lib/python3.6/pickle.py", line 476, in save
f(self, obj) # Call unbound method with explicit self
File "/mypython/lib/python3.6/site-packages/cloudpickle/cloudpickle_fast.py", line 722, in save_function
*self._dynamic_function_reduce(obj), obj=obj
File "/mypython/lib/python3.6/site-packages/cloudpickle/cloudpickle_fast.py", line 659, in _save_reduce_pickle5
dictitems=dictitems, obj=obj
File "/usr/local/opt/python@3.6/Frameworks/Python.framework/Versions/3.6/lib/python3.6/pickle.py", line 610, in save_reduce
save(args)
File "/usr/local/opt/python@3.6/Frameworks/Python.framework/Versions/3.6/lib/python3.6/pickle.py", line 476, in save
f(self, obj) # Call unbound method with explicit self
File "/usr/local/opt/python@3.6/Frameworks/Python.framework/Versions/3.6/lib/python3.6/pickle.py", line 751, in save_tuple
save(element)
File "/usr/local/opt/python@3.6/Frameworks/Python.framework/Versions/3.6/lib/python3.6/pickle.py", line 476, in save
f(self, obj) # Call unbound method with explicit self
File "/usr/local/opt/python@3.6/Frameworks/Python.framework/Versions/3.6/lib/python3.6/pickle.py", line 736, in save_tuple
save(element)
File "/usr/local/opt/python@3.6/Frameworks/Python.framework/Versions/3.6/lib/python3.6/pickle.py", line 476, in save
f(self, obj) # Call unbound method with explicit self
File "/mypython/lib/python3.6/site-packages/dill/_dill.py", line 1146, in save_cell
f = obj.cell_contents
ValueError: Cell is empty
[2020-08-20 07:01:49,302] {helpers.py:322} INFO - Sending Signals.SIGTERM to GPID 11451
[2020-08-20 07:01:49,303] {dag_processing.py:804} INFO - Exiting gracefully upon receiving signal 15
[2020-08-20 07:01:49,310] {dag_processing.py:1379} INFO - Waiting up to 5 seconds for processes to exit...
[2020-08-20 07:01:49,318] {helpers.py:288} INFO - Process psutil.Process(pid=11451, status='terminated') (11451) terminated with exit code 0
[2020-08-20 07:01:49,319] {helpers.py:288} INFO - Process psutil.Process(pid=11600, status='terminated') (11600) terminated with exit code None
[2020-08-20 07:01:49,319] {scheduler_job.py:1364} INFO - Exited execute loop
I am running it on macOS Catalina, in case that helps isolate the error.
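For what it's worth, the ValueError itself can be reproduced in plain Python; the snippet below (the function names are made up) just builds the kind of "empty cell" that the dill frame at the bottom of the traceback chokes on:

def outer():
    def inner():
        return x      # 'x' is a free variable of inner...
    if False:
        x = 1         # ...and a local of outer that is never actually assigned
    return inner      # so inner carries an empty closure cell

f = outer()
cell = f.__closure__[0]
print(cell)           # <cell at 0x...: empty>
try:
    cell.cell_contents
except ValueError as exc:
    print(exc)        # "Cell is empty", same as in dill's save_cell above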
I believe this issue is possibly what you are experiencing.
Looking at that ticket, it appears to still be open; a fix has been made, but it has not yet made it into an official release.
This pull request contains the fix for the issue linked above - you could try building your Airflow stack locally from there, and see if it resolves the issue for you.
This started happening with newer versions of downstream Dask dependencies. Pinning the versions fixes the issue.
pip uninstall cloudpickle distributed
pip install cloudpickle==1.4.1 distributed==2.17.0
These were the problematic versions:
cloudpickle==1.6.0
distributed==2.26.0
I run Airflow 1.10.10 in Docker and use the same image for Dask 2.13.0.
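To double-check which versions actually ended up inside the image, a trivial sanity check is enough:

import cloudpickle
import distributed

# both packages expose a version string; expect 1.4.1 and 2.17.0 after pinning
print("cloudpickle", cloudpickle.__version__)
print("distributed", distributed.__version__)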
I'm having problems attaching to gevent-based processes with PyDev's debugger under PyDev 3.9.2.
It works fine in PyDev 3.9.1 and LiClipse 3.9.1, just not with 3.9.2, which is the latest PyDev right now (6 Feb 2015).
It also works properly when the code is run directly from PyDev's debugger rather than attached to externally.
Note that it doesn't depend on whether there are any breakpoints set (enabled or not) - the mere fact of attaching to a process suffices for the exception to be raised.
Here is a sample module to reproduce the behaviour, along with two exceptions - one from PyDev and the other from the gevent code's point of view.
Can anyone please shed any light on it? Thanks a lot.
from gevent.monkey import patch_all
patch_all()

import threading
import gevent

def myfunc():
    t = threading.current_thread()
    print(t.name)

while True:
    gevent.spawn(myfunc)
    gevent.sleep(1)
Debug Server at port: 5678
Traceback (most recent call last):
File "/opt/slow/01/data/install/pydev/eclipse/plugins/org.python.pydev_3.9.2.201502050007/pysrc/pydevd_attach_to_process/attach_script.py", line 16, in attach
patch_multiprocessing=False,
File "/opt/slow/01/data/install/pydev/eclipse/plugins/org.python.pydev_3.9.2.201502050007/pysrc/pydevd.py", line 1828, in settrace
patch_multiprocessing,
File "/opt/slow/01/data/install/pydev/eclipse/plugins/org.python.pydev_3.9.2.201502050007/pysrc/pydevd.py", line 1920, in _locked_settrace
CheckOutputThread(debugger).start()
File "/opt/slow/01/data/install/pydev/eclipse/plugins/org.python.pydev_3.9.2.201502050007/pysrc/pydevd.py", line 261, in start
thread.start()
File "/usr/lib/python2.7/threading.py", line 750, in start
self.__started.wait()
File "/usr/lib/python2.7/threading.py", line 620, in wait
self.__cond.wait(timeout)
File "/usr/lib/python2.7/threading.py", line 339, in wait
waiter.acquire()
File "_semaphore.pyx",...
Traceback (most recent call last):
File "/opt/slow/01/data/install/pydev/eclipse/plugins/org.python.pydev_3.9.2.201502050007/pysrc/pydevd_attach_to_process/attach_script.py", line 16, in attach
patch_multiprocessing=False,
File "/opt/slow/01/data/install/pydev/eclipse/plugins/org.python.pydev_3.9.2.201502050007/pysrc/pydevd.py", line 1828, in settrace
patch_multiprocessing,
File "/opt/slow/01/data/install/pydev/eclipse/plugins/org.python.pydev_3.9.2.201502050007/pysrc/pydevd.py", line 1920, in _locked_settrace
CheckOutputThread(debugger).start()
File "/opt/slow/01/data/install/pydev/eclipse/plugins/org.python.pydev_3.9.2.201502050007/pysrc/pydevd.py", line 261, in start
thread.start()
File "/usr/lib/python2.7/threading.py", line 750, in start
self.__started.wait()
File "/usr/lib/python2.7/threading.py", line 620, in wait
self.__cond.wait(timeout)
File "/usr/lib/python2.7/threading.py", line 339, in wait
waiter.acquire()
File "_semaphore.pyx", line 112, in gevent._semaphore.Semaphore.acquire (gevent/gevent._semaphore.c:3004)
File "/home/dsuch/projects/pydev-plugin/pydev-plugin/local/lib/python2.7/site-packages/gevent/hub.py", line 330, in switch
switch_out()
File "/home/dsuch/projects/pydev-plugin/pydev-plugin/local/lib/python2.7/site-packages/gevent/hub.py", line 334, in switch_out
raise AssertionError('Impossible to call blocking function in the event loop callback')
AssertionError: Impossible to call blocking function in the event loop callback
My scraper runs fine for about an hour. After a while I start seeing these errors:
2014-01-16 21:26:06+0100 [-] Unhandled Error
Traceback (most recent call last):
File "/home/scraper/.fakeroot/lib/python2.7/site-packages/Scrapy-0.20.2-py2.7.egg/scrapy/crawler.py", line 93, in start
self.start_reactor()
File "/home/scraper/.fakeroot/lib/python2.7/site-packages/Scrapy-0.20.2-py2.7.egg/scrapy/crawler.py", line 130, in start_reactor
reactor.run(installSignalHandlers=False) # blocking call
File "/home/scraper/.fakeroot/lib/python2.7/site-packages/twisted/internet/base.py", line 1192, in run
self.mainLoop()
File "/home/scraper/.fakeroot/lib/python2.7/site-packages/twisted/internet/base.py", line 1201, in mainLoop
self.runUntilCurrent()
--- <exception caught here> ---
File "/home/scraper/.fakeroot/lib/python2.7/site-packages/twisted/internet/base.py", line 824, in runUntilCurrent
call.func(*call.args, **call.kw)
File "/home/scraper/.fakeroot/lib/python2.7/site-packages/Scrapy-0.20.2-py2.7.egg/scrapy/utils/reactor.py", line 41, in __call__
return self._func(*self._a, **self._kw)
File "/home/scraper/.fakeroot/lib/python2.7/site-packages/Scrapy-0.20.2-py2.7.egg/scrapy/core/engine.py", line 106, in _next_request
if not self._next_request_from_scheduler(spider):
File "/home/scraper/.fakeroot/lib/python2.7/site-packages/Scrapy-0.20.2-py2.7.egg/scrapy/core/engine.py", line 132, in _next_request_from_scheduler
request = slot.scheduler.next_request()
File "/home/scraper/.fakeroot/lib/python2.7/site-packages/Scrapy-0.20.2-py2.7.egg/scrapy/core/scheduler.py", line 64, in next_request
request = self._dqpop()
File "/home/scraper/.fakeroot/lib/python2.7/site-packages/Scrapy-0.20.2-py2.7.egg/scrapy/core/scheduler.py", line 94, in _dqpop
d = self.dqs.pop()
File "/home/scraper/.fakeroot/lib/python2.7/site-packages/queuelib/pqueue.py", line 43, in pop
m = q.pop()
File "/home/scraper/.fakeroot/lib/python2.7/site-packages/Scrapy-0.20.2-py2.7.egg/scrapy/squeue.py", line 18, in pop
s = super(SerializableQueue, self).pop()
File "/home/scraper/.fakeroot/lib/python2.7/site-packages/queuelib/queue.py", line 157, in pop
self.f.seek(-size-self.SIZE_SIZE, os.SEEK_END)
exceptions.IOError: [Errno 22] Invalid argument
What could possibly be causing this? My version is 0.20.2. Once I get this error, Scrapy stops doing anything. Even if I stop and run it again (using a JOBDIR directory), it still gives me these errors. I have to delete the job directory and start over to get rid of them.
Try this (a scripted version of these steps is sketched after the list):
1. Ensure that you're running the latest Scrapy version (current: 0.24)
2. Inside the resumed job folder, back up the requests.seen file
3. Once it is backed up, remove the Scrapy job folder
4. Start the crawl with the JOBDIR= option again
5. Stop the crawl
6. Replace the newly created requests.seen with the one you backed up
7. Start the crawl again
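A sketch of the same steps as a script, in case that's easier to follow (the spider name and JOBDIR path are placeholders):

import os
import shutil
import subprocess

JOBDIR = "crawls/myspider-1"   # placeholder job directory
SPIDER = "myspider"            # placeholder spider name
BACKUP = "requests.seen.bak"

def crawl():
    # run the spider with the job directory; stop it manually when needed
    subprocess.call(["scrapy", "crawl", SPIDER, "-s", "JOBDIR=" + JOBDIR])

# steps 2-3: back up requests.seen, then remove the whole job folder
shutil.copy(os.path.join(JOBDIR, "requests.seen"), BACKUP)
shutil.rmtree(JOBDIR)

# steps 4-5: start the crawl so a fresh job state is created, then stop it
crawl()

# step 6: put the backed-up requests.seen over the freshly created one
shutil.copy(BACKUP, os.path.join(JOBDIR, "requests.seen"))

# step 7: resume the crawl
crawl()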
I'm having trouble running tasks. I run ./manage celeryd -B -l info, and it correctly loads all tasks into the registry.
The error happens when any of the tasks run - the task starts, does its thing, and then I get:
[ERROR/MainProcess] Thread 'ResultHandler' crashed: ValueError('Octet out of range 0..2**64-1',)
Traceback (most recent call last):
File "/Users/jzelez/Sites/my_virtual_env/lib/python2.7/site-packages/celery/concurrency/processes/pool.py", line 221, in run
return self.body()
File "/Users/jzelez/Sites/my_virtual_env/lib/python2.7/site-packages/celery/concurrency/processes/pool.py", line 458, in body
on_state_change(task)
File "/Users/jzelez/Sites/my_virtual_env/lib/python2.7/site-packages/celery/concurrency/processes/pool.py", line 436, in on_state_change
state_handlers[state](*args)
File "/Users/jzelez/Sites/my_virtual_env/lib/python2.7/site-packages/celery/concurrency/processes/pool.py", line 413, in on_ack
cache[job]._ack(i, time_accepted, pid)
File "/Users/jzelez/Sites/my_virtual_env/lib/python2.7/site-packages/celery/concurrency/processes/pool.py", line 1016, in _ack
self._accept_callback(pid, time_accepted)
File "/Users/jzelez/Sites/my_virtual_env/lib/python2.7/site-packages/celery/worker/job.py", line 424, in on_accepted
self.acknowledge()
File "/Users/jzelez/Sites/my_virtual_env/lib/python2.7/site-packages/celery/worker/job.py", line 516, in acknowledge
self.on_ack()
File "/Users/jzelez/Sites/my_virtual_env/lib/python2.7/site-packages/celery/worker/consumer.py", line 405, in ack
message.ack()
File "/Users/jzelez/Sites/my_virtual_env/lib/python2.7/site-packages/kombu-2.1.0-py2.7.egg/kombu/transport/base.py", line 98, in ack
self.channel.basic_ack(self.delivery_tag)
File "/Users/jzelez/Sites/my_virtual_env/lib/python2.7/site-packages/amqplib-1.0.2-py2.7.egg/amqplib/client_0_8/channel.py", line 1740, in basic_ack
args.write_longlong(delivery_tag)
File "/Users/jzelez/Sites/my_virtual_env/lib/python2.7/site-packages/amqplib-1.0.2-py2.7.egg/amqplib/client_0_8/serialization.py", line 325, in write_longlong
raise ValueError('Octet out of range 0..2**64-1')
ValueError: Octet out of range 0..2**64-1
I must also note that this worked on my previous Lion install, and even if I create a blank virtualenv with some test code, the same error appears as soon as a task runs.
This happens with Python 2.7.2 and 2.6.4.
Django==1.3.1
amqplib==1.0.2
celery==2.4.6
django-celery==2.4.2
It appears there is some bug with the Homebrew-installed Python. I've now switched to the native Lion one (2.7.1) and it works.
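If anyone else hits this, the quickest way I found to check which interpreter a virtualenv is actually built on is:

import sys

# sys.version includes the build details, which differ between the Homebrew
# build and the Apple-supplied interpreter
print(sys.executable)
print(sys.version)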