I receive following output:
Traceback (most recent call last):
File "/home/ec2-user/env/lib64/python3.7/site-packages/redis/connection.py", line 1192, in get_connection
raise ConnectionError('Connection has data')
redis.exceptions.ConnectionError: Connection has data
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/ec2-user/env/lib64/python3.7/site-packages/eventlet/hubs/hub.py", line 457, in fire_timers
timer()
File "/home/ec2-user/env/lib64/python3.7/site-packages/eventlet/hubs/timer.py", line 58, in __call__
cb(*args, **kw)
File "/home/ec2-user/env/lib64/python3.7/site-packages/eventlet/greenthread.py", line 214, in main
result = function(*args, **kwargs)
File "crawler.py", line 53, in fetch_listing
url = dequeue_url()
File "/home/ec2-user/WebCrawler/helpers.py", line 109, in dequeue_url
return redis.spop("listing_url_queue")
File "/home/ec2-user/env/lib64/python3.7/site-packages/redis/client.py", line 2255, in spop
return self.execute_command('SPOP', name, *args)
File "/home/ec2-user/env/lib64/python3.7/site-packages/redis/client.py", line 875, in execute_command
conn = self.connection or pool.get_connection(command_name, **options)
File "/home/ec2-user/env/lib64/python3.7/site-packages/redis/connection.py", line 1197, in get_connection
raise ConnectionError('Connection not ready')
redis.exceptions.ConnectionError: Connection not ready
I couldn't find any issue related to this particular error. I emptied/flushed all redis databases, so there should be no data there. I assume it has something to do with eventlet and patching. But even when I put following code right at the beginning of the file, the error appears.
import eventlet
eventlet.monkey_path()
What does this error mean?
Finally, I came up with the answer to my problem.
When connecting to redis with python, I specified the database with the number 0.
redis = redis.Redis(host=example.com, port=6379, db=0)
After changing the dabase to number 1 it worked.
redis = redis.Redis(host=example.com, port=6379, db=1)
Another way is to set protected_mode to no in etc\redis\redis.conf. Recommended when running redis locally.
Related
Inserting via debezium connector to mysql database brought up via docker container.
Trying to query and it is working fine until some number of hours. But, after that, same query is throwing below exception.
export JAVA_HOME=/tmp/tests/artifacts/java-17/jdk-17; export PATH=$PATH:/tmp/tests/artifacts/java-17/jdk-17/bin; docker exec -i mysql_be1e6a mysql --user=demo --password=demo -D demo -e "select count(k) from test_cdc_f0bf84 where uuid = 'd1e5cd6d-8f7a-457c-b2ea-880c2be52f69'"
2023-01-02 16:27:43,812:ERROR: failed to execute query MySQL rows count by uuid:
Traceback (most recent call last):
File "/home/ubuntu/workspace/stress_tests/run_test_with_universe/src/env/lib/python3.11/site-packages/paramiko/channel.py", line 699, in recv
out = self.in_buffer.read(nbytes, self.timeout)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/ubuntu/workspace/stress_tests/run_test_with_universe/src/env/lib/python3.11/site-packages/paramiko/buffered_pipe.py", line 164, in read
raise PipeTimeout()
paramiko.buffered_pipe.PipeTimeout
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/ubuntu/workspace/stress_tests/run_test_with_universe/src/suites/cdc/abstract.py", line 667, in try_query
res = query_function()
^^^^^^^^^^^^^^^^
File "/home/ubuntu/workspace/stress_tests/run_test_with_universe/src/suites/cdc/test_cdc.py", line 635, in <lambda>
query = lambda: self.mysql_query(
^^^^^^^^^^^^^^^^^
File "/home/ubuntu/workspace/stress_tests/run_test_with_universe/src/suites/cdc/abstract.py", line 544, in mysql_query
result = self.ssh.exec_on_host(host, [
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/ubuntu/workspace/stress_tests/run_test_with_universe/src/main/connection.py", line 335, in exec_on_host
return self._exec_on_host(host, commands, fetch, timeout=timeout, limit_output=limit_output)[host]
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/ubuntu/workspace/stress_tests/run_test_with_universe/src/main/connection.py", line 321, in _exec_on_host
res = list(out)
^^^^^^^^^
File "/home/ubuntu/workspace/stress_tests/run_test_with_universe/src/env/lib/python3.11/site-packages/paramiko/file.py", line 125, in __next__
line = self.readline()
^^^^^^^^^^^^^^^
File "/home/ubuntu/workspace/stress_tests/run_test_with_universe/src/env/lib/python3.11/site-packages/paramiko/file.py", line 291, in readline
new_data = self._read(n)
^^^^^^^^^^^^^
File "/home/ubuntu/workspace/stress_tests/run_test_with_universe/src/env/lib/python3.11/site-packages/paramiko/channel.py", line 1361, in _read
return self.channel.recv(size)
^^^^^^^^^^^^^^^^^^^^^^^
File "/home/ubuntu/workspace/stress_tests/run_test_with_universe/src/env/lib/python3.11/site-packages/paramiko/channel.py", line 701, in recv
raise socket.timeout()
TimeoutError
After some time, logged manually to machine and tried to read, it still reads fine. Not sure, what does this issue mean.
As explained, tried querying from database via python. Expected it will return count of rows, which it was happening until certain time, but after that, it threw timeout error and socket error.
Trying to query and it is working fine until some number of hours. But, after that, same query is throwing below exception.
The default value for interactive_timeout and wait_timeout is 28880 seconds (8 hours). you can disable this behavior by setting this system variable to zero in your MySQL config.
source: Configuring session timeouts
A MySQL server is initialized in Flask (with connexion) on service startup.
service.app.datastore = DatastoreMySQL(service.config)
class DatastoreMySQL(Datastore):
def __init__(self, config):
...
self.connection_pool = pooling.MySQLConnectionPool(
database=self.database,
host=self.hostname,
username=self.username,
password=self.password,
pool_name="pool_name",
pool_size=self.pool_size,
autocommit=True
)
def exec_query(self, query, params=None):
try:
connection = self.connection_pool.get_connection()
connection.ping(reconnect=True)
with closing(connection.cursor(dictionary=True, buffered=True)) as cursor:
if params:
cursor.execute(query, params)
else:
cursor.execute(query)
finally:
connection.close()
The view functions use the database by passing the DB reference from current_app.
def new():
do_something_in_db(current_app.datastore, request.get_json())
def do_something_in_db(db, data):
db.create_new_item(data)
...
However, a background process (run with APScheduler) must also run do_something_in_db(), but when passed a datastore reference an mysql.connector.errors.OperationalError error is thrown.
My understanding is that this error comes from two sources:
The server timed out and closed the connection. However, in this service the exec_query() function obtains a connection and executes right away, so there should be no reason that it times out. The monitor is also initialized at service startup with a datastore reference, but I am not sure how that can time out given that a new connection is created each time exec_query() is called.
The server dropped an incorrect or too large packet. However, there are no packets here - the process is run by a local background scheduler.
The error in full:
Job "Monitor.monitor_running_queries" raised an exception
Traceback (most recent call last):
File "/usr/local/lib/python3.6/site-packages/mysql/connector/connection_cext.py", line 509, in cmd_query
raw_as_string=raw_as_string)
_mysql_connector.MySQLInterfaceError: MySQL server has gone away
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/opt/k8s-service/lib/datastore/datastore_mysql.py", line 88, in exec_query
cursor.execute(query, params)
File "/usr/local/lib/python3.6/site-packages/mysql/connector/cursor_cext.py", line 276, in execute
raw_as_string=self._raw_as_string)
File "/usr/local/lib/python3.6/site-packages/mysql/connector/connection_cext.py", line 512, in cmd_query
sqlstate=exc.sqlstate)
mysql.connector.errors.DatabaseError: 2006 (HY000): MySQL server has gone away
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/local/lib/python3.6/site-packages/apscheduler/executors/base.py", line 125, in run_job
retval = job.func(*job.args, **job.kwargs)
File "/opt/k8s-service/lib/background.py", line 60, in monitor_running_queries
self.handle_process_state(query.id, datastore, hive)
File "/opt/k8s-service/lib/background.py", line 66, in handle_process_state
query = datastore.get_item(query_id)
File "/opt/k8s-service/lib/datastore/datastore.py", line 48, in get_item
return_results=True)
File "/opt/k8s-service/lib/datastore/datastore.py", line 97, in exec_query
connection.close()
File "/usr/local/lib/python3.6/site-packages/mysql/connector/pooling.py", line 131, in close
cnx.reset_session()
File "/usr/local/lib/python3.6/site-packages/mysql/connector/connection_cext.py", line 768, in reset_session
raise errors.OperationalError("MySQL Connection not available.")
mysql.connector.errors.OperationalError: MySQL Connection not available.
This is similar to dask read_csv timeout on Amazon s3 with big files, but that didn't actually resolve my question.
import s3fs
fs = s3fs.S3FileSystem()
fs.connect_timeout = 18000
fs.read_timeout = 18000 # five hours
fs.download('s3://bucket/big_file','local_path_to_file')
The error I then get is
Traceback (most recent call last):
File "/Users/christopherturnbull/PointTopic/PointTopic/lib/python3.9/site-packages/aiobotocore/response.py", line 50, in read
chunk = await self.__wrapped__.read(amt if amt is not None else -1)
File "/Users/christopherturnbull/PointTopic/PointTopic/lib/python3.9/site-packages/aiohttp/streams.py", line 380, in read
await self._wait("read")
File "/Users/christopherturnbull/PointTopic/PointTopic/lib/python3.9/site-packages/aiohttp/streams.py", line 306, in _wait
await waiter
aiohttp.client_exceptions.ServerTimeoutError: Timeout on reading data from socket
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "<input>", line 1, in <module>
File "/Users/christopherturnbull/PointTopic/PointTopic/lib/python3.9/site-packages/fsspec/spec.py", line 1113, in download
return self.get(rpath, lpath, recursive=recursive, **kwargs)
File "/Users/christopherturnbull/PointTopic/PointTopic/lib/python3.9/site-packages/fsspec/asyn.py", line 281, in get
return sync(self.loop, self._get, rpaths, lpaths)
File "/Users/christopherturnbull/PointTopic/PointTopic/lib/python3.9/site-packages/fsspec/asyn.py", line 71, in sync
raise exc.with_traceback(tb)
File "/Users/christopherturnbull/PointTopic/PointTopic/lib/python3.9/site-packages/fsspec/asyn.py", line 55, in f
result[0] = await future
File "/Users/christopherturnbull/PointTopic/PointTopic/lib/python3.9/site-packages/fsspec/asyn.py", line 266, in _get
return await asyncio.gather(
File "/Users/christopherturnbull/PointTopic/PointTopic/lib/python3.9/site-packages/s3fs/core.py", line 701, in _get_file
chunk = await body.read(2**16)
File "/Users/christopherturnbull/PointTopic/PointTopic/lib/python3.9/site-packages/aiobotocore/response.py", line 52, in read
raise AioReadTimeoutError(endpoint_url=self.__wrapped__.url,
aiobotocore.response.AioReadTimeoutError: Read timeout on endpoint URL: "https://ptpiskiss.s3.eu-west-1.amazonaws.com/REBUILD%20FOR%20TIME%20SERIES/v30a%20sept%202019.accdb"
Which is strange, because I thought I was setting the appropriate timeouts on the worker copy of the class. It's solely due to my bad internet connection, but is there something I need to do on my s3 end to assist here?
I am trying to use Azure Service Bus as the broker for my celery app.
I have patched the solution by referring to various sources.
The goal is to use Azure Service Bus as the broker and PostgresSQL as the backend.
I created an Azure Service Bus and copied the credentials for the RootManageSharedAccessKey to the celery app.
Following is the task.py
from time import sleep
from celery import Celery
from kombu.utils.url import safequote
SAS_policy = safequote("RootManageSharedAccessKey") #SAS Policy
SAS_key = safequote("1234222zUY28tRUtp+A2YoHmDYcABCD") #Primary key from the previous SS
namespace = safequote("bluenode-dev")
app = Celery('tasks', backend='db+postgresql://afsan.gujarati:admin#localhost/local_dev',
broker=f'azureservicebus://{SAS_policy}:{SAS_key}=#{namespace}')
#app.task
def divide(x, y):
sleep(30)
return x/y
When I try to run the Celery app using the following command:
celery -A tasks worker --loglevel=INFO
I get the following error
[2020-10-09 14:00:32,035: CRITICAL/MainProcess] Unrecoverable error: AzureHttpError('Unauthorized\n<Error><Code>401</Code><Detail>claim is empty or token is invalid. TrackingId:295f7c76-770e-40cc-8489-e0eb56248b09_G5S1, SystemTracker:bluenode-dev.servicebus.windows.net:$Resources/Queues, Timestamp:2020-10-09T20:00:31</Detail></Error>')
Traceback (most recent call last):
File "/Users/afsan.gujarati/.pyenv/versions/3.8.1/envs/celery-servicebus/lib/python3.8/site-packages/kombu/transport/virtual/base.py", line 918, in create_channel
return self._avail_channels.pop()
IndexError: pop from empty list
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/Users/afsan.gujarati/.pyenv/versions/3.8.1/envs/celery-servicebus/lib/python3.8/site-packages/azure/servicebus/control_client/servicebusservice.py", line 1225, in _perform_request
resp = self._filter(request)
File "/Users/afsan.gujarati/.pyenv/versions/3.8.1/envs/celery-servicebus/lib/python3.8/site-packages/azure/servicebus/control_client/_http/httpclient.py", line 211, in perform_request
raise HTTPError(status, message, respheaders, respbody)
azure.servicebus.control_client._http.HTTPError: Unauthorized
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/Users/afsan.gujarati/.pyenv/versions/3.8.1/envs/celery-servicebus/lib/python3.8/site-packages/celery/worker/worker.py", line 203, in start
self.blueprint.start(self)
File "/Users/afsan.gujarati/.pyenv/versions/3.8.1/envs/celery-servicebus/lib/python3.8/site-packages/celery/bootsteps.py", line 116, in start
step.start(parent)
File "/Users/afsan.gujarati/.pyenv/versions/3.8.1/envs/celery-servicebus/lib/python3.8/site-packages/celery/bootsteps.py", line 365, in start
return self.obj.start()
File "/Users/afsan.gujarati/.pyenv/versions/3.8.1/envs/celery-servicebus/lib/python3.8/site-packages/celery/worker/consumer/consumer.py", line 311, in start
blueprint.start(self)
File "/Users/afsan.gujarati/.pyenv/versions/3.8.1/envs/celery-servicebus/lib/python3.8/site-packages/celery/bootsteps.py", line 116, in start
step.start(parent)
File "/Users/afsan.gujarati/.pyenv/versions/3.8.1/envs/celery-servicebus/lib/python3.8/site-packages/celery/worker/consumer/connection.py", line 21, in start
c.connection = c.connect()
File "/Users/afsan.gujarati/.pyenv/versions/3.8.1/envs/celery-servicebus/lib/python3.8/site-packages/celery/worker/consumer/consumer.py", line 398, in connect
conn = self.connection_for_read(heartbeat=self.amqheartbeat)
File "/Users/afsan.gujarati/.pyenv/versions/3.8.1/envs/celery-servicebus/lib/python3.8/site-packages/celery/worker/consumer/consumer.py", line 404, in connection_for_read
return self.ensure_connected(
File "/Users/afsan.gujarati/.pyenv/versions/3.8.1/envs/celery-servicebus/lib/python3.8/site-packages/celery/worker/consumer/consumer.py", line 430, in ensure_connected
conn = conn.ensure_connection(
File "/Users/afsan.gujarati/.pyenv/versions/3.8.1/envs/celery-servicebus/lib/python3.8/site-packages/kombu/connection.py", line 383, in ensure_connection
self._ensure_connection(*args, **kwargs)
File "/Users/afsan.gujarati/.pyenv/versions/3.8.1/envs/celery-servicebus/lib/python3.8/site-packages/kombu/connection.py", line 435, in _ensure_connection
return retry_over_time(
File "/Users/afsan.gujarati/.pyenv/versions/3.8.1/envs/celery-servicebus/lib/python3.8/site-packages/kombu/utils/functional.py", line 325, in retry_over_time
return fun(*args, **kwargs)
File "/Users/afsan.gujarati/.pyenv/versions/3.8.1/envs/celery-servicebus/lib/python3.8/site-packages/kombu/connection.py", line 866, in _connection_factory
self._connection = self._establish_connection()
File "/Users/afsan.gujarati/.pyenv/versions/3.8.1/envs/celery-servicebus/lib/python3.8/site-packages/kombu/connection.py", line 801, in _establish_connection
conn = self.transport.establish_connection()
File "/Users/afsan.gujarati/.pyenv/versions/3.8.1/envs/celery-servicebus/lib/python3.8/site-packages/kombu/transport/virtual/base.py", line 938, in establish_connection
self._avail_channels.append(self.create_channel(self))
File "/Users/afsan.gujarati/.pyenv/versions/3.8.1/envs/celery-servicebus/lib/python3.8/site-packages/kombu/transport/virtual/base.py", line 920, in create_channel
channel = self.Channel(connection)
File "/Users/afsan.gujarati/.pyenv/versions/3.8.1/envs/celery-servicebus/lib/python3.8/site-packages/kombu/transport/azureservicebus.py", line 64, in __init__
for queue in self.queue_service.list_queues():
File "/Users/afsan.gujarati/.pyenv/versions/3.8.1/envs/celery-servicebus/lib/python3.8/site-packages/azure/servicebus/control_client/servicebusservice.py", line 313, in list_queues
response = self._perform_request(request)
File "/Users/afsan.gujarati/.pyenv/versions/3.8.1/envs/celery-servicebus/lib/python3.8/site-packages/azure/servicebus/control_client/servicebusservice.py", line 1227, in _perform_request
return _service_bus_error_handler(ex)
File "/Users/afsan.gujarati/.pyenv/versions/3.8.1/envs/celery-servicebus/lib/python3.8/site-packages/azure/servicebus/control_client/_serialization.py", line 569, in _service_bus_error_handler
return _general_error_handler(http_error)
File "/Users/afsan.gujarati/.pyenv/versions/3.8.1/envs/celery-servicebus/lib/python3.8/site-packages/azure/servicebus/control_client/_common_error.py", line 41, in _general_error_handler
raise AzureHttpError(message, http_error.status)
azure.common.AzureHttpError: Unauthorized
<Error><Code>401</Code><Detail>claim is empty or token is invalid. TrackingId:295f7c76-770e-40cc-8489-e0eb56248b09_G5S1, SystemTracker:bluenode-dev.servicebus.windows.net:$Resources/Queues, Timestamp:2020-10-09T20:00:31</Detail></Error>
I don't see a straight solution for this anywhere. What am I missing?
P.S. I did not create the Queue in Azure Service Bus. I am assuming that celery would create the Queue by itself when the celery app is executed.
P.S.S. I also tried to use the exact same credentials in Python's Service Bus Client and it seemed to work. It feels like a Celery issue, but I am not able to figure out exactly what.
If you want to use Azure Service Bus Transport to connect Azure service bus, the URL should be azureservicebus://{SAS policy name}:{SAS key}#{Service Bus Namespace}.
For example
Get Shared access policies RootManageSharedAccessKey
Code
from celery import Celery
from kombu.utils.url import safequote
SAS_policy = "RootManageSharedAccessKey" # SAS Policy
# Primary key from the previous SS
SAS_key = safequote("X/*****qyY=")
namespace = "bowman1012"
app = Celery('tasks', backend='db+postgresql://<>#localhost/<>',
broker=f'azureservicebus://{SAS_policy}:{SAS_key}#{namespace}')
#app.task
def add(x, y):
return x + y
I try to implement Apache Airflow with the CeleryExecutor. For the database I use Postgres, for the celery message queue I use Redis. When using LocalExecutor everything works fine, but when I set the CeleryExecutor in the airflow.cfg and want to set the Postgres database as the result_backend
result_backend = postgresql+psycopg2://airflow_user:*******#localhost/airflow
I get this error when running the Airflow scheduler no matter which DAG I trigger:
[2020-03-18 14:14:13,341] {scheduler_job.py:1382} ERROR - Exception when executing execute_helper
Traceback (most recent call last):
File "<PATH_TO_VIRTUALENV>/lib/python3.6/site-packages/kombu/utils/objects.py", line 42, in __get__
return obj.__dict__[self.__name__]
KeyError: 'backend'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "<PATH_TO_VIRTUALENV>/lib/python3.6/site-packages/airflow/jobs/scheduler_job.py", line 1380, in _execute
self._execute_helper()
File "<PATH_TO_VIRTUALENV>/lib/python3.6/site-packages/airflow/jobs/scheduler_job.py", line 1441, in _execute_helper
if not self._validate_and_run_task_instances(simple_dag_bag=simple_dag_bag):
File "<PATH_TO_VIRTUALENV>/lib/python3.6/site-packages/airflow/jobs/scheduler_job.py", line 1503, in _validate_and_run_task_instances
self.executor.heartbeat()
File "<PATH_TO_VIRTUALENV>/lib/python3.6/site-packages/airflow/executors/base_executor.py", line 130, in heartbeat
self.trigger_tasks(open_slots)
File "<PATH_TO_VIRTUALENV>/lib/python3.6/site-packages/airflow/executors/celery_executor.py", line 205, in trigger_tasks
cached_celery_backend = tasks[0].backend
File "<PATH_TO_VIRTUALENV>/lib/python3.6/site-packages/celery/local.py", line 146, in __getattr__
return getattr(self._get_current_object(), name)
File "<PATH_TO_VIRTUALENV>/lib/python3.6/site-packages/celery/app/task.py", line 1037, in backend
return self.app.backend
File "<PATH_TO_VIRTUALENV>/lib/python3.6/site-packages/kombu/utils/objects.py", line 44, in __get__
value = obj.__dict__[self.__name__] = self.__get(obj)
File "<PATH_TO_VIRTUALENV>/lib/python3.6/site-packages/celery/app/base.py", line 1227, in backend
return self._get_backend()
File "<PATH_TO_VIRTUALENV>/lib/python3.6/site-packages/celery/app/base.py", line 944, in _get_backend
self.loader)
File "<PATH_TO_VIRTUALENV>/lib/python3.6/site-packages/celery/app/backends.py", line 74, in by_url
return by_name(backend, loader), url
File "<PATH_TO_VIRTUALENV>/lib/python3.6/site-packages/celery/app/backends.py", line 60, in by_name
backend, 'is a Python module, not a backend class.'))
celery.exceptions.ImproperlyConfigured: Unknown result backend: 'postgresql'. Did you spell that correctly? ('is a Python module, not a backend class.')
The exact same parameter to direct to the database works
sql_alchemy_conn = postgresql+psycopg2://airflow_user:*******#localhost/airflow
Setting Redis as the celery result_backend works, but I read it is not the recommended way.
result_backend = redis://localhost:6379/0
Does anyone see what I am doing wrong?
You need to add the db+ prefix to the database connection string:
f"db+postgresql+psycopg2://{user}:{password}#{host}/{database}"
This is also mentioned in the docs: https://docs.celeryproject.org/en/stable/userguide/configuration.html#database-url-examples
You need to add the db+ prefix to the database connection string:
result_backend = db+postgresql://airflow_user:*******#localhost/airflow