Setting up Rabbit MQ Heartbeat with Kombu - python

The main issue is that the 3rd party RabbitMQ machine seems to kill idle connections every now and then. That's when I start getting "Broken Pipe" exceptions. The only way to get comms back to normal is for me to kill the processes and restart them. I assume there's a better way?
I'm a little lost here. I am connecting to a 3rd party RabbitMQ server to push messages to it. Every now and then all the sockets on their machine get dropped and I end up getting a "Broken Pipe" exception.
I've been told to implement a heartbeat check in my code but I'm not sure how exactly. I've found some info here: but no real example code.
Do I only need to add "?heartbeat=x" to the connection string? Does Kombu do the rest? I see I need to call "Connection.heartbeat_check()" at "x/2". Should I create a periodic task to call this? How does the connection get re-established?
I'm using:
My code looks like this right now. A simple Celery task gets called to send the message through to the 3rd party RabbitMQ server (removed logging and comments to keep it short, basic enough):
class SendMessageTask(Task):
    name = "campaign.backends.send"
    routing_key = "campaign.backends.send"
    ignore_result = True
    default_retry_delay = 60 # 1 minute.
    max_retries = 5

    def run(self, send_to, message, **kwargs):
        payload = "Testing message"
        conn = BrokerConnection(
        with producers[conn].acquire(block=True) as producer:
            publish = conn.ensure(producer, producer.publish, errback=sending_errback, max_retries=3)
            content_encoding = 'utf-8'
        except Exception, ex:
            print ex
Thanks for any and all help.

While you certainly can add heartbeat support to a producer, it makes more sense for consumer processes.
Enabling heartbeats means that you have to send heartbeats regularly, e.g. if the heartbeat is set to 1 second, then you have to send a heartbeat at least once every second or the remote will close the connection.
This means that you have to use a separate thread or async I/O to reliably send heartbeats in time, and since a connection cannot be shared between threads, this leaves us with async I/O.
The good news is that you probably won't get much benefit adding heartbeats to a produce-only connection.
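To make the mechanics concrete, here is a minimal sketch of a heartbeat loop. It assumes Kombu's `Connection(..., heartbeat=N)` option and its `Connection.heartbeat_check(rate=2)` method; the helper name `heartbeat_loop` and its `stop` and `wait` parameters are made up for illustration. The loop is meant to run in the thread that owns the connection, in line with the point above about not sharing a connection between threads.

```python
import threading


def heartbeat_loop(check, heartbeat, rate=2, stop=None, wait=None):
    """Call check(rate=rate) every heartbeat / rate seconds until stop is set.

    check     -- callable such as kombu's Connection.heartbeat_check
    heartbeat -- heartbeat interval in seconds
    Returns the number of checks performed.
    """
    stop = stop if stop is not None else threading.Event()
    wait = wait if wait is not None else stop.wait
    ticks = 0
    while not stop.is_set():
        check(rate=rate)  # may raise if the broker connection has been lost
        ticks += 1
        wait(heartbeat / rate)  # returns early if stop is set meanwhile
    return ticks


# Hypothetical usage with Kombu (URL and interval are placeholders):
#
#   from kombu import Connection
#   with Connection("amqp://guest:guest@localhost//", heartbeat=10) as conn:
#       conn.connect()
#       heartbeat_loop(conn.heartbeat_check, heartbeat=10)
```

With `heartbeat=10` and `rate=2` the check runs every 5 seconds, which matches the "x/2" interval mentioned in the question.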


Test of sending & receiving message for Azure Service Bus Queue

I would like to write an integration test checking connection of the Python script with Azure Service Bus queue. The test should:
send a message to a queue,
confirm that the message landed in the queue.
The test looks like this:
import pytest
from azure.servicebus import ServiceBusClient, ServiceBusMessage, ServiceBusSender

CONNECTION_STRING = <some connection string>
QUEUE = <queue name>

def send_message_to_service_bus(sender: ServiceBusSender, msg: str) -> None:
    message = ServiceBusMessage(msg)

class TestConnectionWithQueue:
    def test_message_is_sent_to_queue_and_received(self):
        msg = "test message sent to queue"
        expected_message = ServiceBusMessage(msg)
        servicebus_client = ServiceBusClient.from_connection_string(conn_str=CONNECTION_STRING, logging_enable=True)
        with servicebus_client:
            sender = servicebus_client.get_queue_sender(queue_name=QUEUE)
            with sender:
                send_message_to_service_bus(sender, expected_message)
            receiver = servicebus_client.get_queue_receiver(queue_name=QUEUE)
            with receiver:
                messages_in_queue = receiver.receive_messages(max_message_count=10, max_wait_time=20)
                assert any(expected_message == str(actual_message) for actual_message in messages_in_queue)
The test occasionally works, but more often than not it doesn't. No other messages are sent to the queue at the same time. When I debugged the code, I found that whenever the test fails, the variable messages_in_queue is just an empty list.
Why doesn't the code work every time, and what should be done to fix it?
Are you sure you don't have another process that receives your messages? Maybe you are sharing your queue connection strings with other colleagues, build machines...
To troubleshoot, keep an eye on the queue monitoring in the Azure Portal. Debug your test and check whether the queue's message count increments by 1 after the send. Then continue debugging and check whether it decrements by 1 after the receive.
Also, are you sure that this unit test is useful? It looks like you are testing your infrastructure instead of testing your code.
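If the test is kept, one way to make the receive side less timing-sensitive is to send a body that is unique per run and poll until it shows up or a deadline passes. The sketch below is library-agnostic: `wait_for_body` and `unique_body` are hypothetical helper names, and `receive_batch` stands for any zero-argument callable that returns a list of messages, for example a lambda wrapping `receiver.receive_messages(...)` from the question. It compares `str(message)` to the expected text, the same comparison the question's assertion attempts.

```python
import time
import uuid


def unique_body(prefix="integration-test"):
    """Return a message body that no other test run will produce."""
    return f"{prefix}-{uuid.uuid4()}"


def wait_for_body(receive_batch, expected_body, timeout=30.0, clock=time.monotonic):
    """Poll receive_batch() until a message whose text equals expected_body
    is seen, or until timeout seconds have elapsed.

    Returns the matching message, or None on timeout.
    """
    deadline = clock() + timeout
    while clock() < deadline:
        for message in receive_batch():
            if str(message) == expected_body:
                return message
    return None


# Hypothetical usage inside the test from the question:
#
#   body = unique_body()
#   sender.send_messages(ServiceBusMessage(body))
#   found = wait_for_body(
#       lambda: receiver.receive_messages(max_message_count=10, max_wait_time=5),
#       body,
#   )
#   assert found is not None
```

A `None` result while the portal shows the message count going up and then down again would point to another consumer taking the message, as suggested above.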

Always Open Publish Channel RabbitMQ

I am trying to integrate snmptrapd and RabbitMQ for delivering traps notifications to an exterior system.
My system is composed of 3 components:
A Linux virtual machine with snmptrapd and RabbitMQ (Publisher);
A Linux virtual machine with RabbitMQ (Consumer);
A Linux bare metal with docker so I can have a lot of containers sending traps (using nping)
The snmptrapd part is simple:
authCommunity execute mycom
traphandle default /root/some_script
In my first attempts the some_script was written in Python, but the performance was not perfect (20 containers sending 1 trap per second during 10 seconds, I only received 160 messages in the consumer).
#!/usr/bin/env python
import pika
import sys
message = ""
for line in sys.stdin:
    message += line
credentials = pika.PlainCredentials('test', 'test')
parameters = pika.ConnectionParameters('my_ip', 5672, '/', credentials)
connection = pika.BlockingConnection(parameters)
channel =
I switched to Perl and now I can get 200 traps/messages.
My Perl script uses Net::AMQP::RabbitMQ
use Net::AMQP::RabbitMQ;
foreach my $line ( <STDIN> ) {
    chomp( $line );
    $message = "$message\n$line";
}
my $mq = Net::AMQP::RabbitMQ->new();
$mq->connect("my_ip", {
    user => "test",
    password => "test",
    vhost => "/"
});
$mq->publish(1, "snmp", $message);
But I want better. I tried 200 containers sending 1 trap per second and it failed miserably: the consumer received only around 10% of the messages.
I think this has to do with the overhead of having to open, publish to, and close the channel in RabbitMQ for every trap received, because at the network level I receive all the messages (checked through tcpdump).
Is there a way to keep an always-open publish channel so I don't have to reopen/create a connection to the queue?
Asking if you can talk to a RabbitMQ server without connecting to it first is like asking if you can talk to someone on the telephone without connecting to their phone first (by dialing and answering).
You really should reuse your connection if you're going to send multiple messages, but you do need a connection first!
Anyway, the problem isn't with the publisher. It's the consumer that's buggy if it's losing messages.
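As a rough illustration of reusing a connection, here is a minimal sketch of a publisher that connects once and keeps its channel open for later messages. It assumes a long-lived process: traphandle starts the script anew for each trap, so a per-trap script cannot keep anything open between traps. `ReusablePublisher` and `pika_connect` are hypothetical names; the pika calls used (`BlockingConnection`, `channel()`, `basic_publish`) are the standard ones, and the host and credentials are the placeholders from the question.

```python
class ReusablePublisher:
    """Open the connection lazily on first publish and reuse it afterwards."""

    def __init__(self, connect, routing_key, exchange=""):
        self._connect = connect          # zero-argument connection factory
        self._routing_key = routing_key
        self._exchange = exchange
        self._connection = None
        self._channel = None

    def publish(self, body):
        if self._channel is None:        # only the first call pays the setup cost
            self._connection = self._connect()
            self._channel = self._connection.channel()
        self._channel.basic_publish(
            exchange=self._exchange,
            routing_key=self._routing_key,
            body=body,
        )

    def close(self):
        if self._connection is not None:
            self._connection.close()
        self._connection = None
        self._channel = None


def pika_connect():
    """Connection factory using pika; values are placeholders from the question."""
    import pika  # imported here so the class above has no hard dependency

    credentials = pika.PlainCredentials("test", "test")
    parameters = pika.ConnectionParameters("my_ip", 5672, "/", credentials)
    return pika.BlockingConnection(parameters)


# Hypothetical usage:
#   publisher = ReusablePublisher(pika_connect, routing_key="snmp")
#   publisher.publish("first trap")
#   publisher.publish("second trap")   # same connection and channel
```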

RabbitMQ Message/Topic Receiver Crashing when RabbitMQ is not running

I am using Python and pika in a Linux environment.
The Message/Topic Receiver keeps crashing when RabbitMQ is not running.
I am wondering whether there is a way to keep the Message/Topic Receiver running when RabbitMQ is not, because RabbitMQ will not be on the same virtual machine as the Message/Topic Receiver.
This would cover the case where RabbitMQ crashes for some reason but the Message/Topic Receiver should keep running, saving us from having to start/restart the Message/Topic Receiver again.
As far as I understand, "Message/Topic Receiver" in your case is the consumer.
You are responsible for writing the application in such a way that it catches the exception raised when it tries to connect to a RabbitMQ server that is not running.
For example:
import pika
from pika.exceptions import (AMQPConnectionError, AuthenticationError,
                             ChannelClosed, ProbableAccessDeniedError,
                             ProbableAuthenticationError)

try:
    creds = pika.PlainCredentials(**creds)
    # add host, port, virtual host etc. as needed
    params = pika.ConnectionParameters(credentials=creds)
    connection = pika.BlockingConnection(params)
    LOG.info("Connection to Rabbit was established")
    return connection
except (ProbableAuthenticationError, AuthenticationError):
    LOG.error("Authentication Failed", exc_info=True)
except ProbableAccessDeniedError:
    LOG.error("The Virtual Host configured wrong!", exc_info=True)
except ChannelClosed:
    LOG.error("ChannelClosed error", exc_info=True)
except AMQPConnectionError:
    LOG.error("RabbitMQ server is down or Host Unreachable")
    LOG.error("Connection attempt timed out!")
    LOG.error("Trying to re-connect to RabbitMQ...")
    # <here goes your reconnection logic >
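The reconnection logic left as a placeholder above could look roughly like the sketch below: retry the connection attempt with a capped exponential backoff instead of letting the exception end the process. `connect_with_retry` and its parameters are hypothetical names chosen for illustration; `connect` is any zero-argument callable, for example a lambda around `pika.BlockingConnection(params)`.

```python
import time


def connect_with_retry(connect, retry_on=(Exception,), max_attempts=None,
                       base_delay=1.0, max_delay=30.0, sleep=time.sleep):
    """Call connect() until it succeeds.

    retry_on     -- exception types that trigger another attempt
    max_attempts -- give up (re-raise) after this many failures; None = forever
    The delay doubles after each failure, capped at max_delay seconds.
    """
    attempt = 0
    while True:
        try:
            return connect()
        except retry_on:
            attempt += 1
            if max_attempts is not None and attempt >= max_attempts:
                raise
            sleep(min(max_delay, base_delay * 2 ** (attempt - 1)))


# Hypothetical usage with pika:
#
#   import pika
#   from pika.exceptions import AMQPConnectionError
#
#   connection = connect_with_retry(
#       lambda: pika.BlockingConnection(params),
#       retry_on=(AMQPConnectionError,),
#   )
```

The same helper can wrap the consume loop as well, so that a broker restart after a successful connection leads to a reconnect rather than a crash.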
As for making sure that your Rabbit server is always up and running:
you can create a cluster and make your queues durable and highly available (HA)
install some type of supervision (let's say monit or supervisord) and configure it to check the rabbit process. For example:
check process rabbitmq with pidfile /var/run/rabbitmq/pid
    start program = "/etc/init.d/rabbitmq-server start"
    stop program = "/etc/init.d/rabbitmq-server stop"
    if 3 restarts within 5 cycles then alert

Tornado connections not closing in freeBSD

I have a tornado web server, something like :
app = tornado.web.Application(handlersList,log_function=printIt)
serverInstance = tornado.ioloop.IOLoop.instance()
the handlers are made with tornado.web.RequestHandler .
When I run the server on FreeBSD, sometimes a page/resource takes a long time to load.
While trying to debug, I see that when waiting for the page to load Tornado hasn't created a request object yet, and looking at the netstat results, I see a lot of connections with status ESTABLISHED.
So my thought is that there are too many unclosed connections, and the operating system refuses new connections coming from the same session.
Can this be the case?
I don't do anything in the get/post functions after writing; should I somehow shut down/close the connection before returning?
EDIT 1: get/post are synchronous (no @asynchronous)
EDIT 2: temporarily fixed by forcing no_keep_alive
class BasicFeedHandler(tornado.web.RequestHandler):
    def finish(self, chunk=None):
        self.request.connection.no_keep_alive = True
        tornado.web.RequestHandler.finish(self, chunk)
I'm not sure if keep-alive connections should stay open so long after the client closed the connection; anyway, this workaround works.
I found how to do this by looking at HTTPConnection._finish_request, when no keep-alive
this line runs: self.stream.read_until(b("\r\n\r\n"), self._header_callback)
What is \r\n\r\n in this context?
Try this:
class Application(tornado.web.Application):
    def __init__(self):
http_server = tornado.httpserver.HTTPServer(Application(), no_keep_alive=True)
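On the `\r\n\r\n` question raised above: in HTTP, each header line ends with CRLF (`\r\n`) and the header section ends with an empty line, so the byte sequence `\r\n\r\n` marks where the headers stop and the body (or the next request on a keep-alive connection) begins. Reading until that delimiter is how a server collects the headers of the next request. The sketch below shows the idea with plain bytes; `split_http_head` is a hypothetical helper, not part of Tornado.

```python
def split_http_head(raw: bytes):
    """Split raw request bytes at the first blank line (CRLF CRLF).

    Returns (header_lines, rest), where header_lines is a list of bytes
    (request line first) and rest is whatever follows the delimiter.
    Returns (None, raw) if the headers are not complete yet.
    """
    head, separator, rest = raw.partition(b"\r\n\r\n")
    if not separator:
        return None, raw  # delimiter not seen: keep reading from the socket
    return head.split(b"\r\n"), rest


request = (
    b"GET /feed HTTP/1.1\r\n"
    b"Host: example.com\r\n"
    b"Connection: close\r\n"
    b"\r\n"
    b"trailing bytes"
)
lines, rest = split_http_head(request)
print(lines[0])  # the request line
print(rest)      # bytes after the header section
```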

tornado - transferring a file to cdn without blocking

I have the nginx upload module handling site uploads, but still need to transfer files (let's say 3-20 MB each) to our CDN, and would rather not delegate that to a background job.
What is the best way to do this with tornado without blocking other requests? Can I do this in an async callback?
You may find it useful in the overall architecture of your site to add a message queuing service such as RabbitMQ.
This would let you complete the upload via the nginx module, then in the tornado handler, post a message containing the uploaded file path and exit. A separate process would be watching for these messages and handle the transfer to your CDN. This type of service would be useful for many other tasks that could be handled offline ( sending emails, etc.. ). As your system grows, this also provides you a mechanism to scale by moving queue processing to separate machines.
I am using an architecture very similar to this. Just make sure to add your message consumer process to supervisord or whatever you are using to manage your processes.
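The hand-off described above can be sketched in a few lines. This is only an illustration under assumptions: the queue name `cdn_uploads`, the JSON message shape, and the helper names `enqueue_upload` and `make_delivery_handler` are all made up, and `channel` is expected to behave like a pika channel (`basic_publish`, `basic_ack`). The `transfer` callable stands for whatever code actually pushes a file to the CDN.

```python
import json


def enqueue_upload(channel, path, queue="cdn_uploads"):
    """Producer side (called from the tornado handler): post the file path."""
    body = json.dumps({"path": path})
    # The default exchange ("") routes directly to the queue named by routing_key.
    channel.basic_publish(exchange="", routing_key=queue, body=body)
    return body


def make_delivery_handler(transfer):
    """Consumer side: build a callback that transfers the file, then acks."""

    def handle_delivery(ch, method, properties, body):
        path = json.loads(body)["path"]
        transfer(path)  # blocking is fine here: this runs in the worker process
        ch.basic_ack(delivery_tag=method.delivery_tag)

    return handle_delivery
```

Acknowledging only after `transfer` returns means that a worker crash mid-transfer leaves the message unacknowledged, so the broker can deliver it again.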
In terms of implementation, if you are on Ubuntu, installing RabbitMQ is as simple as:
    sudo apt-get install rabbitmq-server
On CentOS w/EPEL repositories:
    yum install rabbitmq-server
There are a number of Python bindings to RabbitMQ. Pika is one of them and it happens to be created by an employee of LShift, one of the companies behind RabbitMQ.
Below is a bit of sample code from the Pika repo. You can easily imagine how the handle_delivery method would accept a message containing a filepath and push it to your CDN.
import sys
import pika
import asyncore
conn = pika.AsyncoreConnection(pika.ConnectionParameters(
    sys.argv[1] if len(sys.argv) > 1 else '',
    credentials = pika.PlainCredentials('guest', 'guest')))
print 'Connected to %r' % (conn.server_properties,)
ch =
ch.queue_declare(queue="test", durable=True, exclusive=False, auto_delete=False)
should_quit = False

def handle_delivery(ch, method, header, body):
    print "method=%r" % (method,)
    print "header=%r" % (header,)
    print " body=%r" % (body,)
    ch.basic_ack(delivery_tag = method.delivery_tag)
    global should_quit
    should_quit = True

tag = ch.basic_consume(handle_delivery, queue = 'test')
while conn.is_alive() and not should_quit:
    asyncore.loop(count = 1)
if conn.is_alive():
print conn.connection_close
advice on the tornado google group points to using an async callback to move the file to the cdn.
the nginx upload module writes the file to disk and then passes parameters describing the upload(s) back to the view. therefore, the file isn't in memory, but the time it takes to read from disk–which would cause the request process to block itself, but not other tornado processes, afaik–is negligible.
that said, anything that doesn't need to be processed online shouldn't be, and should be deferred to a task queue like celeryd or similar.
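for the in-process route, here is a minimal sketch of keeping the event loop free while a blocking transfer runs. it does not use the callback style discussed above; instead it swaps in a thread pool and `run_in_executor`, assuming a tornado version that runs on asyncio (5.0 or later) and a coroutine-style handler. `transfer_in_background`, the pool size, and the `blocking_transfer` callable are hypothetical.

```python
import asyncio
from concurrent.futures import ThreadPoolExecutor

# Small shared pool; the size is an arbitrary placeholder.
_executor = ThreadPoolExecutor(max_workers=4)


async def transfer_in_background(blocking_transfer, path):
    """Run blocking_transfer(path) in a worker thread and await its result.

    The event loop stays free to serve other requests while the thread works.
    """
    loop = asyncio.get_running_loop()
    return await loop.run_in_executor(_executor, blocking_transfer, path)


# Hypothetical usage inside a coroutine-style tornado handler:
#
#   class UploadHandler(tornado.web.RequestHandler):
#       async def post(self):
#           path = self.get_argument("file.path")  # argument name is a placeholder
#           await transfer_in_background(push_to_cdn, path)
#           self.write("ok")
```

the request that triggered the transfer still waits for it to finish, but other requests are served in the meantime.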

