Message Routing in AMQP

I'd like to do some routing magic with AMQP. My setup is Python with Pika on the consumer/producer side and RabbitMQ for the AMQP server.
What I'd like to achieve:
1. Send a message to a single exchange.
2. (Insert magic here.)
3. Consume messages like so:
   - one set of subscribers should be able to retrieve messages based on a routing key;
   - one set of subscribers should just get all messages.
The tricky part is that once any server in the second set has received a message, no other server from the second set should receive it. All the servers from the first set should still be able to consume this message.
Is this possible with a single basic_publish call, or do I need to send the message to a routing exchange (for the first set of consumers) and to a "global" exchange (for the second set of consumers)?
CLARIFICATION:
What I'd like to achieve is a single call to publish a message and have it received by 2 distinct sets of consumers.
Case 1: Just receive messages based on routing key (that is, a message with routing key foo will be received by all the consumers currently interested in that topic).
Case 2: This basically resembles the RabbitMQ Tutorial for Worker Queues. There are a number of workers that will receive messages dispatched in a round-robin way. Only one worker will receive a given message.
Still, the message received by the consumers that are interested in a certain routing key should be exactly the same as the message received by the workers, produced by a single API call.
(I hope my question makes sense; I'm not too familiar with AMQP terms.)

To start with, you need to use a topic exchange and publish each message with a routing key that describes its topic. The magic happens when a consumer binds its queue with a binding key (a pattern that is matched against the routing key). The first set of consumers just use the routing keys they care about as their binding keys, while the second set uses a wildcard pattern as its binding key.
For Case 1, you need to create a queue per consumer, and bind each queue with the appropriate routing key.
For Case 2, just create a single queue with a binding key of # (which matches every routing key) and have each of your worker consumers consume from that queue. The broker will dispatch messages to the workers in a round-robin manner.
As an example of what this would look like in RabbitMQ: there are two queues for consumers from your "case 1" (Foo and Bar) and one queue shared by all the workers to satisfy "case 2".
This model should be supported by all AMQP-compliant brokers and wouldn't require any vendor-specific enhancements.
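The setup described above can be sketched with Pika roughly as follows. This is only an illustration, not a definitive implementation: the exchange name "events", the queue name "all_events", and the "subscriber." queue prefix are made up for the example, and `channel` is assumed to behave like a Pika BlockingChannel (keyword names follow Pika 1.x).

```python
# Sketch only: all names below are hypothetical and chosen for illustration.
EXCHANGE = "events"          # assumed topic exchange name
WORKER_QUEUE = "all_events"  # assumed shared queue for the "case 2" workers


def setup_routing(channel, subscribers):
    """Declare the topology for both cases.

    `subscribers` maps a consumer name to the binding key it is interested in.
    Returns a dict mapping each consumer name to its own queue name.
    """
    channel.exchange_declare(exchange=EXCHANGE, exchange_type="topic")

    # Case 1: one queue per consumer, bound with that consumer's key.
    queues = {}
    for consumer_name, binding_key in subscribers.items():
        queue_name = "subscriber." + consumer_name
        channel.queue_declare(queue=queue_name)
        channel.queue_bind(exchange=EXCHANGE, queue=queue_name,
                           routing_key=binding_key)
        queues[consumer_name] = queue_name

    # Case 2: a single queue bound with "#", shared by all workers.
    channel.queue_declare(queue=WORKER_QUEUE)
    channel.queue_bind(exchange=EXCHANGE, queue=WORKER_QUEUE, routing_key="#")
    return queues


def publish(channel, routing_key, body):
    """A single publish call reaches the matching case 1 queues and the worker queue."""
    channel.basic_publish(exchange=EXCHANGE, routing_key=routing_key, body=body)
```

With a real broker, `channel` would typically come from `pika.BlockingConnection(pika.ConnectionParameters("localhost")).channel()`. Each worker would then consume from "all_events", and each case 1 consumer from its own queue.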

Related

Why does the publisher declare a queue in Pika RabbitMQ?

I have gone through the fundamentals of RabbitMQ. One thing I figured out is that a publisher does not publish directly to a queue. The exchange decides which queue the message should be routed to, based on the routing key and the type of exchange (the code below uses the default exchange). I have also found some example publisher code.
import pika, os, logging
logging.basicConfig()
# Parse CLOUDAMQP_URL (fall back to localhost)
url = os.environ.get('CLOUDAMQP_URL', 'amqp://guest:guest@localhost/%2f')
params = pika.URLParameters(url)
params.socket_timeout = 5
connection = pika.BlockingConnection(params)
channel = connection.channel()
channel.queue_declare(queue='pdfprocess')  # make sure the queue exists before publishing
# Send a message via the default exchange; the routing key is the queue name
channel.basic_publish(exchange='', routing_key='pdfprocess', body='User information')
print("[x] Message sent to consumer")
connection.close()
In line #9 the queue is being declared. I am a bit confused, because the publisher should not have to be aware of the queue. For example, if it is using a fanout exchange and there are 100 queues with different names, how would the publisher know about and declare all 100 queues?

The consumer can declare the queue and bind it to the exchange when the consumer connects to RabbitMQ. A fanout exchange then copies and routes a received message to all queues bound to it, regardless of routing keys or pattern matching as with direct and topic exchanges.
So no, the publisher does not have to be aware of all queues bound to the exchange. However, the publisher can declare the queue to make sure it exists before publishing, so that messages are not dropped for lack of a destination. That matters more for other exchange types, such as the default exchange used in your example.
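As a rough sketch of that division of labour, the consumer below declares and binds its own queue, while the publisher only knows the exchange. The exchange name "logs" is a made-up example, and `channel` is assumed to behave like a Pika BlockingChannel (Pika 1.x keyword names).

```python
def bind_consumer_queue(channel, exchange="logs"):
    """Consumer side: declare a private queue and bind it to the fanout exchange.

    The exchange name "logs" is a hypothetical default for this sketch.
    Returns the server-generated queue name.
    """
    channel.exchange_declare(exchange=exchange, exchange_type="fanout")
    # An empty queue name asks the broker to generate one; exclusive=True
    # means the queue is deleted when this consumer's connection closes.
    result = channel.queue_declare(queue="", exclusive=True)
    queue_name = result.method.queue
    channel.queue_bind(exchange=exchange, queue=queue_name)
    return queue_name


def publish_to_all(channel, body, exchange="logs"):
    """Publisher side: only the exchange is referenced, never a queue."""
    channel.exchange_declare(exchange=exchange, exchange_type="fanout")
    # A fanout exchange ignores the routing key, so it is left empty.
    channel.basic_publish(exchange=exchange, routing_key="", body=body)
```

Each of the 100 consumers would call `bind_consumer_queue` on its own channel; the publisher never needs to learn their queue names.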

Any client (Publisher or Consumer) can create queues in RabbitMQ. Sometimes you might want the Publisher to create a queue, but for me that is usually the role of the Consumer. The Publisher doesn't need to know where or even whether anything it sends will be consumed.
For example, the Publisher can get an acknowledgement from the RabbitMQ server that a message has been received. The RabbitMQ server can get an acknowledgement from the Consumer when a message is consumed from a Queue.
A Publisher cannot get an acknowledgement of when a message is consumed from a Queue. It has no visibility of whether the message was routed to zero, one or multiple queues, or whether it was consumed from those queues.

Discarding rabbitmq messages no one is listening to

I am a complete newbie to rabbitmq messaging, apologies if this question is silly or my setup completely pear-shaped.
My setup where I use rabbitmq is sending messages from certain probes. Each probe has a unique name. I have then a centralised server where I process the data - if there is a need.
I use a direct exchange and routing keys that correspond to probe names.
I declare my consumer (server) as follows (this is more or less from rabbitmq tutorials):
connection = pika.BlockingConnection(pika.ConnectionParameters('localhost'))
channel = connection.channel()
channel.exchange_declare(exchange="foo", exchange_type="direct")
result = channel.queue_declare(queue="", exclusive=True)  # server-named, exclusive queue
queue_name = result.method.queue
If at some point I become interested in what a probe is reporting, I issue
channel.queue_bind(exchange="foo", queue=queue_name, routing_key="XXX")
where XXX is the name of the probe.
My publishers at the probes are declared as follows:
connection = pika.BlockingConnection(pika.ConnectionParameters(host="foo.bar.com"))
channel = connection.channel()
channel.exchange_declare(exchange="foo", exchange_type="direct")
and when I send a message, I use
channel.basic_publish(exchange="foo", routing_key="XXX", body=data)
where XXX is the name of the probe.
This all works fine. But how do I make it so that messages to routing keys that no one is listening to get discarded immediately? At the moment, if my consumer stops listening to a routing key or is not running at all, messages sent by probes start piling up. When I start my consumer, or have it listen to a routing key it has not been listening to in a while, I might have a backlog of tens of thousands of messages there. This is not what I need, and that backlog is bound to cause resource exhaustion somewhere.
Is there a way to modify this so that messages get discarded instead of queued if there is no one listening to them when they arrive at the exchange? I would assume there is a way but Google and pika documents did not help.
Thanks in advance.

"But how do I make it so that messages to routing keys that no one is listening to get discarded immediately?"
RabbitMQ already does this by default. You just need to make sure that no queue is bound to that routing key.
"Now if my consumer stops listening to a routing key or is not running at all, messages sent by probes start piling up"
If no queue is bound with that routing key, all such messages are discarded.
"Is there a way to modify this so that messages get discarded instead of queued if there is no one listening to them when they arrive at the exchange?"
RabbitMQ's default behavior already supports this (for a direct exchange).
Go through the page at https://www.rabbitmq.com/tutorials/tutorial-four-python.html
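To make sure no queue stays bound to a routing key the server is no longer interested in, the consumer can remove the binding explicitly. The following is a minimal sketch under that assumption; the `ProbeListener` class and its method names are hypothetical, and `channel` is assumed to behave like a Pika BlockingChannel (Pika 1.x keyword names).

```python
class ProbeListener:
    """Hypothetical helper that tracks which probes the server listens to."""

    def __init__(self, channel, exchange="foo"):
        self.channel = channel
        self.exchange = exchange
        channel.exchange_declare(exchange=exchange, exchange_type="direct")
        # Exclusive, server-named queue: the broker deletes it (and its
        # bindings) when the consumer's connection goes away.
        result = channel.queue_declare(queue="", exclusive=True)
        self.queue_name = result.method.queue
        self.probes = set()

    def listen(self, probe):
        """Start receiving messages published with this probe's routing key."""
        if probe not in self.probes:
            self.channel.queue_bind(exchange=self.exchange,
                                    queue=self.queue_name, routing_key=probe)
            self.probes.add(probe)

    def ignore(self, probe):
        """Remove the binding so the exchange drops this probe's messages."""
        if probe in self.probes:
            self.channel.queue_unbind(queue=self.queue_name,
                                      exchange=self.exchange, routing_key=probe)
            self.probes.discard(probe)
```

Note that unbinding only affects future messages: anything already sitting in the queue stays there until it is consumed or the queue is deleted.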

Do I need rabbitmq bindings for direct exchange?

I have a RabbitMQ server running, with one direct exchange which all my messages go through. The messages are routed to individual non-permanent queues (they may last a couple of hours). I just started reading about binding queues to exchanges and am a bit confused about whether I actually need to bind my queues to the exchange or not. I'm using pika's basic_publish and consume functions, so maybe this is implied? I'm not really sure and just want to understand a bit more.
Thanks

If you are using the default exchange for direct routing (exchange = ''), then you don't have to declare any bindings. Every queue is automatically bound to the default exchange with a binding key equal to its queue name. As long as the routing key exactly matches the name of an existing queue, the message is delivered to that queue. See https://www.rabbitmq.com/tutorials/tutorial-one-dotnet.html.
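A minimal sketch of publishing through the default exchange, assuming `channel` behaves like a Pika BlockingChannel; the helper name `send_to_queue` is made up for this example.

```python
def send_to_queue(channel, queue_name, body):
    """Publish straight to a named queue via the default exchange.

    No queue_bind call is needed: the broker binds every queue to the
    default exchange (exchange="") under the queue's own name.
    """
    # Declaring is idempotent and makes sure the queue exists, otherwise
    # the message would be dropped as unroutable.
    channel.queue_declare(queue=queue_name)
    channel.basic_publish(exchange="", routing_key=queue_name, body=body)
```

With a named direct exchange, by contrast, an explicit `queue_bind` is still required before anything is routed to the queue.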

Always. In fact, even though queues are strictly a consumer-side entity, they should be declared & bound to the direct exchange by the producer(s) at the time they create the exchange.

You have to bind a queue to an exchange with some binding key, otherwise messages will be discarded.
This is how any AMQP broker works: the publisher publishes a message to an exchange with some key, and the AMQP broker (RabbitMQ) routes this message from the exchange to those queue(s) which are bound to the exchange with the given key.
However, it's not mandatory to declare and bind the queue in the publisher.
You can do that in the subscriber, but make sure you run your subscriber before starting your publisher.
If you think your messages are getting routed to a queue without bindings, then you are missing something.

RabbitMQ, delivery tag value and message order after reconnect

I use pika for Python to communicate with RabbitMQ. I have 6 threads, which consume and acknowledge messages from the same queue. I use a different connection (and channel) for each thread. So I have a few closely related questions:
If the connection to RabbitMQ closes in one of the threads and I reconnect, will the delivery tag value reset and start from 0 again after the reconnect?
After the reconnect, will each thread receive the same unacknowledged messages in the same order, will they be redistributed among all threads, or will delivery start from the reconnect point?
This is important in my app because there is a delay between receiving a message and acknowledging it, and I want to avoid duplicates in the next processing steps.

Delivery tags are channel-specific and server-assigned. See the details in the AMQP spec, section 1.1, "AMQP-defined Domains", or RabbitMQ's documentation for delivery-tag. In RabbitMQ the initial value of delivery-tag is 1; zero is reserved for client use, meaning "all messages so far received". Because a reconnect opens a new channel, the tags on that channel start from 1 again.
With multiple consumers on a single queue there is no guarantee that consumers will get messages in the same order they were queued. See RabbitMQ's Broker Semantics, "Message ordering guarantees" paragraph for details:
Section 4.7 of the AMQP 0-9-1 core specification explains the conditions under which ordering is guaranteed: messages published in one channel, passing through one exchange and one queue and one outgoing channel will be received in the same order that they were sent. RabbitMQ offers stronger guarantees since release 2.7.0.
Messages can be returned to the queue using AMQP methods that feature a requeue parameter (basic.recover, basic.reject and basic.nack), or due to a channel closing while holding unacknowledged messages. Any of these scenarios caused messages to be requeued at the back of the queue for RabbitMQ releases earlier than 2.7.0. From RabbitMQ release 2.7.0, messages are always held in the queue in publication order, even in the presence of requeueing or channel closure.
With release 2.7.0 and later it is still possible for individual consumers to observe messages out of order if the queue has multiple subscribers. This is due to the actions of other subscribers who may requeue messages. From the perspective of the queue the messages are always held in the publication order.
Also, see the answer to the "RabbitMQ - Message order of delivery" question.
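Since unacknowledged messages are requeued when a channel closes and may then reach a different thread, one way to avoid duplicate processing is to deduplicate on the consumer side. The sketch below is only an illustration and rests on two assumptions that are not in the question: the publisher sets a unique `message_id` property on every message, and all threads share the `seen_ids` set (a real application would likely need a lock or an external store). The helper name `make_dedup_callback` is hypothetical.

```python
def make_dedup_callback(process, seen_ids):
    """Build a Pika-style consumer callback that skips already processed messages.

    `process` is the application's own handler; `seen_ids` is a shared set of
    message ids that have been processed (assumed, see the notes above).
    """
    def on_message(channel, method, properties, body):
        message_id = properties.message_id
        # The broker sets `redelivered` on messages that were delivered before,
        # e.g. after another channel closed without acknowledging them.
        if method.redelivered and message_id in seen_ids:
            channel.basic_ack(delivery_tag=method.delivery_tag)
            return
        process(body)
        seen_ids.add(message_id)
        # The delivery tag is only valid on the channel that delivered the
        # message, so the ack must go through this same channel.
        channel.basic_ack(delivery_tag=method.delivery_tag)

    return on_message
```

The returned function has the `(channel, method, properties, body)` signature that Pika passes to consumer callbacks, so it could be handed to `basic_consume` as the `on_message_callback`.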

RabbitMQ custom exchange (script-exchange installation trouble)

By default, when I send a message with a routing key that no queue is bound to, the broker discards it. How can I make RabbitMQ send it to some 'default' queue instead? For example, I have 3 consumers with keys 'con1', 'con2' and 'con4'. If I send a message with the key 'con3', I need the broker to route the message to some 'starter' queue whose consumer can start the 'con3' consumer and then requeue the message.
I found this https://github.com/tonyg/script-exchange and I am sure it would help me, but I can't install it because the repository was last updated 4 years ago and the modern umbrella dev kit does not support its old makefile.

It's necessary to use a combination of the alternate exchange protocol extension and the consistent-hash exchange plugin. So, you should declare 2 exchanges: a direct one and an x-consistent-hash one (configured as the alternate exchange of the first). Then all existing consumers should create their own queues bound to the direct exchange. In this case all messages with routing keys 'con1', 'con2' and 'con4' will be routed directly to the consumers, whereas messages with other routing keys will be rerouted to the alternate exchange, which can route them to some 'managers' that start the necessary processors (consumers).
The 'script-exchange' RabbitMQ plugin is no longer supported.
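The two-exchange setup could be declared roughly like this. It is a sketch under assumptions: the exchange names "consumers" and "unrouted" and the queue name "starter" are invented for the example, `channel` is assumed to behave like a Pika BlockingChannel, and the x-consistent-hash type requires the consistent-hash exchange plugin to be enabled on the broker.

```python
def declare_with_fallback(channel, main="consumers", fallback="unrouted",
                          fallback_type="x-consistent-hash",
                          manager_queue="starter"):
    """Declare a direct exchange whose unroutable messages go to a fallback.

    All names are hypothetical defaults for this sketch.
    """
    # The alternate exchange must exist for rerouting to work.
    channel.exchange_declare(exchange=fallback, exchange_type=fallback_type)
    # "alternate-exchange" is the RabbitMQ extension argument that reroutes
    # messages the main exchange cannot route to any queue.
    channel.exchange_declare(exchange=main, exchange_type="direct",
                             arguments={"alternate-exchange": fallback})
    # Queue read by the 'manager' that starts missing consumers. For a
    # consistent-hash exchange the binding key is a numeric weight.
    channel.queue_declare(queue=manager_queue)
    channel.queue_bind(exchange=fallback, queue=manager_queue, routing_key="1")
```

If only a single manager queue is needed, passing `fallback_type="fanout"` should behave the same way without the plugin, since a fanout exchange ignores the binding key.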
