I have a very simple implementation in Python using redis-py to interface with Redis.
As part of development, I am shutting Redis down to simulate a timeout exception.
The problem is that I have set the timeout to a few seconds, but the connection just sits there without timing out.
from redis import StrictRedis

print('Connecting')
redis_instance = StrictRedis(host=settings.REDIS_HOST,
                             port=settings.REDIS_PORT,
                             db=settings.REDIS_DB,
                             socket_connect_timeout=5,
                             socket_timeout=5,
                             )
print('Setting key')
redis_instance.set('X','Y')
print('Key SET')
I can see it gets as far as the 'Setting key' message, but it doesn't go beyond that or throw a timeout.
Any idea what I am doing wrong?
If you shut Redis down before running the code, redis-py raises the socket-level ConnectionRefusedError, wrapped in a redis ConnectionError.
You have not connected to Redis yet, so how could the connection time out? socket_timeout and socket_connect_timeout only apply while a socket is open or being opened; a refused connection fails immediately with a ConnectionError instead.
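If the goal is simply to fail fast and handle the outage, a minimal sketch of catching both cases (connection refused and a genuine socket timeout) could look like this; ConnectionError and TimeoutError are the redis-py exception classes from redis.exceptions, and the host/port values are placeholders:

from redis import StrictRedis
from redis.exceptions import ConnectionError, TimeoutError

redis_instance = StrictRedis(host='127.0.0.1', port=6379, db=0,
                             socket_connect_timeout=5,
                             socket_timeout=5)
try:
    redis_instance.set('X', 'Y')
except ConnectionError as err:
    # Raised immediately when the server is down and refuses the connection
    print(f'Redis refused the connection: {err}')
except TimeoutError as err:
    # Raised only when an open (or opening) socket exceeds the timeouts above
    print(f'Redis operation timed out: {err}')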
I have this piece of code that just downloads files from a WebDAV server.
_download(self) is a thread function; it is managed by a multi_download(self) controller that keeps the thread count under 24, and that part works fine. It should ensure that no more than 24 sockets are used. It is very straightforward, so I am not going to post it here. Maybe relevant: I am using Threads, not ThreadPoolExecutor - I am not much of a fan of Pool - and I handle the maximum thread count manually.
The problem appears when, e.g., the VPN drops and I cannot connect, or some other unhandled network problem occurs. I could handle that, of course, but that's not the point here.
The unexpected behaviour is HERE:
After a while of running retries and logging exceptions, the file descriptor count seems to exceed the limit, because it starts throwing the error below. This never happened when there were no errors/retries during the whole process:
NOTE: the webdav.download() library method uses with open(file, 'wb') to download data, so there should be no hanging FDs there either.
2022-02-09 10:36:53,898 - DEBUG - 2294-1644212940.tdms thrd - Retried download successfull on 25 attempt [webdav.py:_download:183]
2022-02-09 10:36:53,904 - DEBUG - 2294-1644212940.tdms thrd - downloaded 900 files [webdav.py:add_download_counter:67]#just a log
2022-02-09 10:36:59,801 - DEBUG - 2294-1644219643.tdms thrd - Retried download successfull on 25 attempt [webdav.py:_download:183]
2022-02-09 10:36:59,856 - DEBUG - 2294-1644213248.tdms thrd - Retried download successfull on 25 attempt [webdav.py:_download:183]
2022-02-09 10:36:59,905 - WARNING - 2294-1643646904.tdms thrd - WebDav cannot connect: HTTPConnectionPool(host='123.16.456.123', port=987):
Max retries exceeded with url:/path/to/webdav/file (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f7b3377d898>:
Failed to establish a new connection: [Errno 24] Too many open files'))
# Marked in code where this is thrown !
I assume this means I am opening too many sockets, but I have tried to clean up after myself: in the code you can see me closing the session and even deleting the reference to the client to keep things tidy. BUT after a while of debugging I cannot pin down WHERE I am forgetting something and where the hanging sockets are. I am asking for help before I start counting FDs and subclassing easywebdav2 classes :) Thanks, Q.
# Python 3.7.3
from easywebdav2 import Client as WebDavClient
# WebDav source:
# https://github.com/zabuldon/easywebdav/blob/master/easywebdav2/client.py

def clean_webdav(self, webdav):
    """Closing sockets after"""
    try:
        webdav.session.close()
    except Exception as err:
        logger.error(f'Err closing session: {err}')
    finally:
        del webdav
def _download(self, local, remote, *, retry=0):
    """This is a thread function, therefore raising SystemExit."""
    try:
        webdav = WebDavClient(**kw)
    except Exception as err:
        logger.error(f'There is an err creating client: {err}')
        raise SystemExit
    try:
        webdav.download(remote, local)  # <--------------- HERE THROWS
        if retry != 0:
            logger.info(f'Retry number {retry} was successfull')
    except (ConnectionError, requests.exceptions.ConnectionError) as err:
        if retry >= MAX_RETRY:
            logger.exception(f'There was err: {err}')
            return
        retry += 1
        self.clean_webdav(webdav)
        self._download(local, remote, retry=retry)
    except Exception as err:
        logger.error(f'Unhandled Exception: {err}')
    finally:
        self.clean_webdav(webdav)
        raise SystemExit
EDIT: Since one answer mentioned that WebDAV is an extension of HTTP (which it is) - HTTP keep-alive should not play a role here if I am explicitly closing the requests session via webdav.session.close(), which is indeed THE requests session created by webdav. There should be no keep-alive after explicitly closing, right?
I'm not specifically familiar with the WebDAV client package you're using, but any WebDAV client would usually support HTTP Keep-Alive, which means that after you release a connection, it will keep it alive in a pool for a while in case you need it again.
The way you use a client like that would be to construct one WebDavClient for your application, and use that one client for all requests. Of course, you need to know that it's thread safe if you're going to call it from multiple download threads.
Since you are creating a WebDavClient in each thread, there's a good chance that the total number of connections being kept alive across all of their pools exceeds your file handle limit.
--
EDIT: A quick look on the Web indicates that each WebDavClient creates a Session object from requests, which does indeed have a connection pool, but unfortunately isn't thread-safe. You should create one WebDavClient per thread and use it for all of the downloads that that thread does. That will probably require a little refactoring.
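A rough sketch of that idea, assuming a hypothetical worker function that pulls (remote, local) pairs from a queue and reuses a single client for everything it downloads (error handling and retries omitted):

import queue
import threading

from easywebdav2 import Client as WebDavClient

def worker(job_queue, client_kwargs):
    # One client - and therefore one requests session / connection pool - per
    # thread, reused for every download this thread performs.
    webdav = WebDavClient(**client_kwargs)
    try:
        while True:
            try:
                remote, local = job_queue.get_nowait()
            except queue.Empty:
                break
            webdav.download(remote, local)
    finally:
        # Close the underlying requests session exactly once, when the thread is done.
        webdav.session.close()

jobs = queue.Queue()
# jobs.put(('/remote/path/file.tdms', '/local/path/file.tdms'))  # fill with real paths
threads = [threading.Thread(target=worker, args=(jobs, {})) for _ in range(24)]
for t in threads:
    t.start()
for t in threads:
    t.join()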
I have a script that waits for tasks from a task queue and then runs them. Something like this minimal example:
import time

import redis

cache = redis.Redis(host='127.0.0.1', port=6379)

def main():
    while True:
        message = cache.blpop('QUEUE', timeout=0)
        work(message)

def work(message):
    print(f"beginning work: {message}")
    time.sleep(10)

if __name__ == "__main__":
    main()
I am not using a web server because the script does not need to answer HTTP requests. However, I have confused myself a bit about how to make this script robust against errors in production.
With a web server and gunicorn, gunicorn handles forking a process for each request. If a request causes an error, that worker dies and the request fails, but the server keeps running.
How can I achieve this if I'm not running an HTTP server? I could fork a process to perform the work function, but then the code performing the fork would still be application code.
Is it possible to deploy a non-HTTP-server script like mine using gunicorn? Is there something else I should be using to handle forking processes?
Or is it reasonable to fork inside the application and deploy it to production?
What about this:

while True:
    try:
        message = cache.blpop('QUEUE', timeout=0)
        work(message)
    except Exception as e:
        print(e)
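If you also want the process isolation described in the question, so that a crash in work() cannot take down the main loop, one possible sketch (my suggestion, using the standard library's multiprocessing rather than gunicorn) is:

import multiprocessing

def main():
    while True:
        message = cache.blpop('QUEUE', timeout=0)
        # Run each task in its own process; if it crashes, only the child dies
        # and the main loop keeps waiting for the next message.
        p = multiprocessing.Process(target=work, args=(message,))
        p.start()
        p.join()
        if p.exitcode != 0:
            print(f"task failed with exit code {p.exitcode}")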
I have created a Python program that uses Autobahn to make WebSocket connections to a remote host and receive a data flow over these connections.
From time to time, different exceptions occur during these connections. Most often it is either an exception immediately on attempting to connect, stating that the initial WebSocket handshake failed (most likely due to an overloaded server), like this:
2017-05-03T20:31:10 dropping connection to peer tcp:1.2.3.4:443 with abort=True: WebSocket opening handshake timeout (peer did not finish the opening handshake in time)
Or a later exception during a successful and ongoing connection, saying that the connection timed out due to a missing pong response to a ping, as follows:
2017-05-04T13:33:40 dropping connection to peer tcp:1.2.3.4:443 with abort=True: WebSocket ping timeout (peer did not respond with pong in time)
2017-05-04T13:33:40 session closed with reason wamp.close.transport_lost [WAMP transport was lost without closing the session before]
2017-05-04T13:33:40 While firing onDisconnect: Traceback (most recent call last):
File "c:\Python36\lib\site-packages\txaio\aio.py", line 450, in done
f._result = x
AttributeError: attribute '_result' of '_asyncio.Future' objects is not writable
As can be seen above, this also triggers some other strange exception in the txaio module in this particular case.
No matter what kind of exception occurs, I would like to catch it and handle it gracefully, but for some reason none of these exceptions seem to bubble up to the code that initiated the connections (i.e. get caught by my try ... except clause there), which looks like this:
from autobahn.asyncio.wamp import ApplicationSession
from autobahn.asyncio.wamp import ApplicationRunner
...

class MyComponent(ApplicationSession):
    ...

try:
    runner = ApplicationRunner("wss://my.websocket.server.com:443", "realm1")
    runner.run(MyComponent)
except Exception as e:
    print('Unexpected connection error')
    ...
Instead, all these exceptions just hang my program completely after the error messages have been dumped to the terminal as shown above. Why is this?
So, the question is: how and where in the code can I catch these exceptions that occur during the WebSocket connections in Autobahn, and react to/handle them gracefully?
I am using Python and pika on a Linux OS environment.
The message/topic receiver keeps crashing when RabbitMQ is not running.
I am wondering whether there is a way to keep the message/topic receiver running when RabbitMQ is not, because RabbitMQ will not be on the same virtual machine as the receiver.
This covers the case where RabbitMQ crashes for some reason: the message/topic receiver should keep running, saving me from having to start/restart it again.
As far as I understand, the "Message/Topic Receiver" in your case is the consumer.
You are responsible for writing your application in such a way that it catches the exception raised when it tries to connect to a RabbitMQ instance that is not running.
For example:
import time

import pika
from pika.exceptions import (AMQPConnectionError, AuthenticationError,
                             ChannelClosed, ProbableAccessDeniedError,
                             ProbableAuthenticationError)

creds = pika.PlainCredentials(**creds)
params = pika.ConnectionParameters(credentials=creds,
                                   **conn_params)
try:
    connection = pika.BlockingConnection(params)
    LOG.info("Connection to Rabbit was established")
    return connection
except (ProbableAuthenticationError, AuthenticationError):
    LOG.error("Authentication Failed", exc_info=True)
except ProbableAccessDeniedError:
    LOG.error("The Virtual Host is configured wrong!", exc_info=True)
except ChannelClosed:
    LOG.error("ChannelClosed error", exc_info=True)
except AMQPConnectionError:
    LOG.error("RabbitMQ server is down or Host Unreachable")
    LOG.error("Connection attempt timed out!")
    LOG.error("Trying to re-connect to RabbitMQ...")
    time.sleep(reconnection_interval)
    # <here goes your reconnection logic >
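One simple way to fill in that reconnection placeholder is to wrap the whole connect-and-consume step in a loop, so the receiver keeps retrying until the broker comes back. This is only a sketch: consume_forever() is a hypothetical helper, and the basic_consume keyword arguments assume a recent pika (1.x):

import time

import pika
from pika.exceptions import AMQPConnectionError

def consume_forever(params, queue_name, on_message, reconnection_interval=5):
    # Keep the receiver process alive across RabbitMQ outages.
    while True:
        try:
            connection = pika.BlockingConnection(params)
            channel = connection.channel()
            channel.basic_consume(queue=queue_name,
                                  on_message_callback=on_message,
                                  auto_ack=True)
            channel.start_consuming()
        except AMQPConnectionError:
            LOG.error("RabbitMQ unreachable, retrying in %s s", reconnection_interval)
            time.sleep(reconnection_interval)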
And as far as making sure that your Rabbit server is always up and running, you can:
create a cluster and make your queues durable / HA
install some kind of supervision (say monit or supervisord) and configure it to check the rabbit process, for example:
check process rabbitmq with pidfile /var/run/rabbitmq/pid
    start program = "/etc/init.d/rabbitmq-server start"
    stop program = "/etc/init.d/rabbitmq-server stop"
    if 3 restarts within 5 cycles then alert
Edit:
The main issue is that the 3rd party RabbitMQ machine seems to kill idle connections every now and then. That's when I start getting "Broken Pipe" exceptions. The only way to get comms back to normal is for me to kill the processes and restart them. I assume there's a better way?
--
I'm a little lost here. I am connecting to a 3rd party RabbitMQ server to push messages to. Every now and then all the sockets on their machine get dropped and I end up getting a "Broken Pipe" exception.
I've been told to implement a heartbeat check in my code but I'm not sure how exactly. I've found some info here: http://kombu.readthedocs.org/en/latest/changelog.html#version-2-3-0 but no real example code.
Do I only need to add "?heartbeat=x" to the connection string? Does Kombu do the rest? I see I need to call "Connection.heartbeat_check()" at "x/2". Should I create a periodic task to call this? How does the connection get re-established?
I'm using:
celery==3.0.12
kombu==2.5.4
My code looks like this right now. A simple Celery task gets called to send the message through to the 3rd party RabbitMQ server (logging and comments removed to keep it short, but basic enough):
class SendMessageTask(Task):
    name = "campaign.backends.send"
    routing_key = "campaign.backends.send"
    ignore_result = True
    default_retry_delay = 60  # 1 minute.
    max_retries = 5

    def run(self, send_to, message, **kwargs):
        payload = "Testing message"
        try:
            conn = BrokerConnection(
                hostname=HOSTNAME,
                port=PORT,
                userid=USER_ID,
                password=PASSWORD,
                virtual_host=VHOST
            )
            with producers[conn].acquire(block=True) as producer:
                publish = conn.ensure(producer, producer.publish,
                                      errback=sending_errback, max_retries=3)
                publish(
                    body=payload,
                    routing_key=OUT_ROUTING_KEY,
                    delivery_mode=2,
                    exchange=EXCHANGE,
                    serializer=None,
                    content_type='text/xml',
                    content_encoding='utf-8'
                )
        except Exception as ex:
            print(ex)
Thanks for any and all help.
While you certainly can add heartbeat support to a producer, it makes more sense for consumer processes.
Enabling heartbeats means that you have to send heartbeats regularly; e.g. if the heartbeat is set to 1 second, then you have to send a heartbeat at least every second or the remote will close the connection.
This means that you have to use a separate thread or use async io to reliably send heartbeats in time, and since a connection cannot be shared between threads this leaves us with async io.
The good news is that you probably won't get much benefit from adding heartbeats to a produce-only connection.
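For a consumer process, where heartbeats do help, a minimal sketch with kombu could look like the following. The broker URL and queue name are placeholders; the pattern of draining events with a short timeout and calling heartbeat_check() in between is one common way to keep heartbeats flowing well within the negotiated interval:

import socket

from kombu import Connection, Consumer, Queue

BROKER_URL = "amqp://user:password@rabbit.example.com:5672/vhost"  # placeholder

def on_message(body, message):
    print(body)
    message.ack()

with Connection(BROKER_URL, heartbeat=10) as conn:
    conn.connect()
    with Consumer(conn, queues=Queue("my.queue"), callbacks=[on_message]):
        while True:
            try:
                # Waking up every few seconds gives us a chance to send/check
                # heartbeats well before the 10-second interval expires.
                conn.drain_events(timeout=5)
            except socket.timeout:
                conn.heartbeat_check()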