I have gone through the fundamentals of RabbitMQ. One thing I figured out is that a publisher does not publish directly to a queue. The exchange decides which queue the message should be routed to, based on the routing key and the type of exchange (the code below uses the default exchange). I have also found some example code for a publisher.
import pika, os, logging
logging.basicConfig()
# Parse CLOUDAMQP_URL (fallback to localhost)
url = os.environ.get('CLOUDAMQP_URL', 'amqp://guest:guest@localhost/%2f')
params = pika.URLParameters(url)
params.socket_timeout = 5
connection = pika.BlockingConnection(params)
channel = connection.channel()
channel.queue_declare(queue='pdfprocess')
# send a message
channel.basic_publish(exchange='', routing_key='pdfprocess', body='User information')
print ("[x] Message sent to consumer")
connection.close()
On line 9 of the code the queue is being declared. I am a bit confused, because the publisher should not have to be aware of the queue. For example, if it is using a fanout exchange and there are 100 queues with different names, how would the publisher know about and declare all 100 queues?
The consumer can declare the queue and bind it to the exchange when the consumer connects to RabbitMQ. A fanout exchange then copies and routes a received message to all queues bound to it, regardless of routing keys or pattern matching as with direct and topic exchanges.
So no, the publisher does not have to be aware of all queues bound to the exchange. The publisher can still declare a queue to make sure it exists before it publishes, but that matters more for other exchange types.
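As a rough sketch of that consumer-side setup, assuming the channel passed in behaves like a pika 1.x BlockingChannel (the helper name `bind_fanout_consumer` and the exchange name `events` are made up for illustration):

```python
def bind_fanout_consumer(channel, exchange="events"):
    """Declare a private queue and bind it to a fanout exchange.

    `channel` is assumed to behave like a pika 1.x BlockingChannel;
    the exchange name is a placeholder.
    """
    # Creates the exchange if it is missing; re-declaring it with the
    # same settings changes nothing.
    channel.exchange_declare(exchange=exchange, exchange_type="fanout")
    # An empty name asks the broker to generate a unique queue name;
    # exclusive=True ties the queue's lifetime to this connection.
    result = channel.queue_declare(queue="", exclusive=True)
    queue_name = result.method.queue
    # Fanout exchanges ignore the routing key, so none is given here.
    channel.queue_bind(exchange=exchange, queue=queue_name)
    return queue_name
```

Each of the 100 consumers would run something like this for its own queue, so the publisher only ever needs the exchange name.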
Any client (Publisher or Consumer) can create queues in RabbitMQ. Sometimes you might want the Publisher to create a queue, but for me that is usually the role of the Consumer. The Publisher doesn't need to know where or even whether anything it sends will be consumed.
For example, the Publisher can get an acknowledgement from the RabbitMQ server that a message has been received. The RabbitMQ server can get an acknowledgement from the Consumer when a message is consumed from a Queue.
A Publisher cannot get an acknowledgement of when a message is consumed from a Queue; it has no visibility of whether the message was routed to zero, one or multiple queues, or whether it was consumed from those queues.
I am a complete newbie to rabbitmq messaging, apologies if this question is silly or my setup completely pear-shaped.
My setup where I use rabbitmq is sending messages from certain probes. Each probe has a unique name. I have then a centralised server where I process the data - if there is a need.
I use a direct exchange and routing keys that correspond to probe names.
I declare my consumer (server) as follows (this is more or less from rabbitmq tutorials):
connection = pika.BlockingConnection(pika.ConnectionParameters('localhost'))
channel = connection.channel()
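# exchange_type is the keyword in newer pika releases (older ones used type=)
channel.exchange_declare(exchange="foo", exchange_type="direct")
# An empty name lets the broker generate a unique queue name
result = channel.queue_declare(queue="", exclusive=True)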
queue_name = result.method.queue
If at some point I become interested in what a probe is reporting, I issue
channel.queue_bind(exchange="foo", queue=queue_name, routing_key="XXX")
where XXX is the name of the probe.
My publishers at the probes are declared as follows:
connection = pika.BlockingConnection(pika.ConnectionParameters(host="foo.bar.com"))
channel = connection.channel()
channel.exchange_declare(exchange="foo", exchange_type="direct")
and when I send a message, I use
channel.basic_publish(exchange="foo", routing_key="XXX", body=data)
where XXX is the name of the probe.
This all works fine. But how do I make it so that messages to routing keys that no one is listening to get discarded immediately? Right now, if my consumer stops listening to a routing key or is not running at all, messages sent by probes start piling up. When I start my consumer, or have it listen to a routing key it has not been listening to in a while, I might have a backlog of tens of thousands of messages there. This is not what I need, and that backlog is bound to cause resource exhaustion somewhere.
Is there a way to modify this so that messages get discarded instead of queued if there is no one listening to them when they arrive at the exchange? I would assume there is a way but Google and pika documents did not help.
Thanks in advance.
But how do I make it so that messages to routing keys that no one is
listening to get discarded immediately?
RabbitMQ already behaves this way by default. You just need to make sure that no queue is bound to that routing key.
Now if my consumer stops listening to a routing key or is not running
at all, messages sent by probes start piling up
If there is no queue for that routing key, all messages will be discarded.
Is there a way to modify this so that messages get discarded instead
of queued if there is no one listening to them when they arrive at the
exchange?
RabbitMQ's default behavior already supports this (for a direct exchange).
Go through the page at https://www.rabbitmq.com/tutorials/tutorial-four-python.html
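To make the "no binding means the message is dropped" behavior concrete, here is a toy in-memory model of a direct exchange. It is only an illustration of the routing rule, not how RabbitMQ is implemented, and the class and queue names are invented:

```python
from collections import defaultdict, deque


class DirectExchangeModel:
    """Toy in-memory model of a direct exchange (not a real broker)."""

    def __init__(self):
        self.bindings = defaultdict(set)   # routing key -> queue names
        self.queues = defaultdict(deque)   # queue name -> pending messages

    def bind(self, queue, routing_key):
        self.bindings[routing_key].add(queue)

    def unbind(self, queue, routing_key):
        self.bindings[routing_key].discard(queue)

    def publish(self, routing_key, body):
        """Copy the message to every bound queue; return how many got it."""
        targets = self.bindings.get(routing_key, ())
        for queue in targets:
            self.queues[queue].append(body)
        return len(targets)


exchange = DirectExchangeModel()
dropped = exchange.publish("probe-1", "reading 1")        # no binding yet: dropped
exchange.bind("server-queue", "probe-1")
routed = exchange.publish("probe-1", "reading 2")         # bound: queued
exchange.unbind("server-queue", "probe-1")
dropped_again = exchange.publish("probe-1", "reading 3")  # unbound: dropped again
```

With pika, the counterpart of `unbind` here would be `channel.queue_unbind(queue=..., exchange=..., routing_key=...)`, called when the consumer loses interest in a probe.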
I have two brokers configured[1] with federation plugin. Both are pointing to each other as upstream.
My test is:
publish a message on broker A
consume on broker B
The result is:
consuming on broker B works
< good > the queue on broker B pops the message
< not good > the queue on broker A still has the message
< reason why this is not good > The issue I see is: if I always publish on one broker and then always consume on the other --> then the queue on the publishing broker will grow until it's full and start dropping messages.
The result I would like is:
both queues on broker A and B pop their messages when the consumer consumes on broker B
How do I configure RabbitMQ to pop the message from all queues when a consumer consumes the message on broker B? Right now I am trying to do so with RabbitMQ Federation plugin.
[1] The two brokers point to each other as upstreams and I configure them the same way as described in the "simple example" given by the documentation except that there are two brokers each pointing to each other as upstream. The code for the publisher looks like this and the code for the consumer looks like this.
@Trevor Boyd Smith, option 2 or 3 below is probably what you should consider.
Option 1: Bidirectional federated exchange
A message will end up in both broker A and broker B, one copy in each, independent of each other. In other words, even after broker B has delivered the message to its consumer, the other copy of the message still remains in broker A.
Advantage: You will always have two copies of the message, one in each broker, which is highly available.
Disadvantage: You need to have a consumer connected to each broker.
Option 2: Bidirectional federated queue
A message will end up in one of the two brokers. By default, the broker where the message was published has priority to enqueue the message; however, if only the other broker has a consumer, the message will move to the other broker.
It does not matter which broker the message ends up in; the message will be delivered once and only once to a consumer connected to either of the brokers.
Advantage: The message will be delivered once and once only to a consumer connected to either of the brokers.
Disadvantage: There will be only one copy of the message. If the broker that holds the message goes down, the other broker cannot get the message. But if you are fine with eventual consistency, this option is OK, because the message becomes available again once the failed broker is back up and running.
Option 3: Bidirectional federated exchange and queue
In this case, a message will end up in both brokers, one copy in each. The same message will be delivered twice to a consumer connected to either of the brokers. Once the message has been delivered twice, it will be gone from both brokers. (If there are two consumers, one connected to each broker, each broker will deliver the same message to its consumer once.)
Advantage: The consumer can connect to either of the brokers, and the message in each broker will be delivered and dequeued.
Disadvantage: The same message will be delivered twice. The workaround is to check, before handling a message, whether the same message has already been handled.
Note:
None of these options is inherently better than the others. It all depends on your use case, and there are many other configurations that can come into play and change the behaviour.
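The duplicate check mentioned as the workaround for option 3 could look roughly like this. It is a sketch that assumes the publisher stamps every message with a unique ID (for instance in the AMQP `message_id` property); the class and function names are hypothetical:

```python
from collections import OrderedDict


class SeenMessages:
    """Remember recently handled message IDs, bounded in size."""

    def __init__(self, max_size=10000):
        self.max_size = max_size
        self._seen = OrderedDict()

    def first_time(self, message_id):
        """Return True the first time an ID is offered, False afterwards."""
        if message_id in self._seen:
            return False
        self._seen[message_id] = True
        if len(self._seen) > self.max_size:
            self._seen.popitem(last=False)  # evict the oldest ID
        return True


def handle_once(seen, message_id, body, process):
    """Call process(body) unless message_id was already handled."""
    if not seen.first_time(message_id):
        return False
    process(body)
    return True
```

The bound on remembered IDs keeps memory flat, at the cost of letting a very old duplicate through; if the two copies can arrive at different processes, the set would have to live in shared storage instead.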
I created this environment:
Server A, Server B.
Created a bidirectional federation this way:
Federation Upstream: Server_B = amqp://servera
Federation Upstream: Server_A = amqp://serverb
Then created the same policy on both the servers:
Pattern : ^fed\.
Apply to: all
federation-upstream-set:all
Created one queue on Server A called fed.test1, then created a consumer on Server B:
ConnectionFactory factory = new ConnectionFactory();
factory.setHost("localhost");
factory.setPort(5673);
Connection connection = factory.newConnection();
Channel channel = connection.createChannel();
Consumer consumer = new DefaultConsumer(channel) {
    @Override
    public void handleDelivery(String consumerTag, Envelope envelope, AMQP.BasicProperties properties, byte[] body)
            throws IOException {
        String message = new String(body, "UTF-8");
        System.out.println(" [x] Message '" + message + "'");
    }
};
channel.basicConsume("fed.test1", true, consumer);
Then published a message to Server A ---> fed.test1
The message was consumed on Server B, and the message count is zero on both queues (Server A, Server B).
This works as you expected.
Hope it helps.
I use pika for Python to communicate with RabbitMQ. I have 6 threads, which consume and acknowledge messages from the same queue. I use a different connection (and channel) for each thread. So I have a few closely related questions:
If the connection to RabbitMQ closes in one of the threads and I reconnect, will the delivery tag value reset and start from 0 after the reconnect?
After the reconnect, will I receive the same unacknowledged messages in the same order for each thread, will they be redistributed among all threads, or will delivery start from the reconnect point?
This is important in my app, because there is a delay between receiving a message and acknowledging it, and I want to avoid duplicates in the next processing steps.
Delivery tags are channel-specific and server-assigned. See the details in the AMQP spec, section 1.1, "AMQP-defined Domains", or RabbitMQ's documentation for delivery-tag. RabbitMQ's initial value for delivery-tag is 1. Zero is reserved for client use, meaning "all messages so far received".
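The tag rules can be illustrated with a small model. This is not pika code, just a toy class (the name `ChannelModel` is invented) that numbers deliveries per channel the way described above:

```python
class ChannelModel:
    """Toy model of per-channel delivery tags (not a real pika channel)."""

    def __init__(self):
        self._next_tag = 1   # tags are scoped to the channel and start at 1
        self.unacked = {}    # delivery tag -> message body

    def deliver(self, body):
        tag = self._next_tag
        self._next_tag += 1
        self.unacked[tag] = body
        return tag

    def ack(self, delivery_tag, multiple=False):
        if multiple:
            # Tag 0 with multiple=True means "everything outstanding so far".
            limit = max(self.unacked, default=0) if delivery_tag == 0 else delivery_tag
            for tag in [t for t in self.unacked if t <= limit]:
                del self.unacked[tag]
        else:
            del self.unacked[delivery_tag]


first = ChannelModel()
tags = [first.deliver(b) for b in ("a", "b", "c")]
first.ack(2, multiple=True)   # acknowledges tags 1 and 2, leaves 3 outstanding

# A reconnect means a new channel, so numbering starts over and tags
# from the old channel can no longer be acknowledged.
second = ChannelModel()
new_tag = second.deliver("a again")
```

The practical consequence for the threaded setup in the question is that a tag stored before the reconnect must not be acknowledged on the new channel.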
With multiple consumers on a single queue there is no guarantee that consumers will get messages in the same order they were queued. See the "Message ordering guarantees" paragraph of RabbitMQ's Broker Semantics for details:
Section 4.7 of the AMQP 0-9-1 core specification explains the conditions under which ordering is guaranteed: messages published in one channel, passing through one exchange and one queue and one outgoing channel will be received in the same order that they were sent. RabbitMQ offers stronger guarantees since release 2.7.0.
Messages can be returned to the queue using AMQP methods that feature a requeue parameter (basic.recover, basic.reject and basic.nack), or due to a channel closing while holding unacknowledged messages. Any of these scenarios caused messages to be requeued at the back of the queue for RabbitMQ releases earlier than 2.7.0. From RabbitMQ release 2.7.0, messages are always held in the queue in publication order, even in the presence of requeueing or channel closure.
With release 2.7.0 and later it is still possible for individual consumers to observe messages out of order if the queue has multiple subscribers. This is due to the actions of other subscribers who may requeue messages. From the perspective of the queue the messages are always held in the publication order.
Also, see this answer for "RabbitMQ - Message order of delivery" question.
I'd like to do some routing magic with AMQP. My setup is Python with Pika on the consumer/producer side and RabbitMQ for the AMQP server.
What I'd like to achieve:
send a message to a single exchange
(insert magic here)
consume messages like so:
one set of subscribers should be able to retrieve based on a routing key
one set of subscribers should just get all messages.
The tricky part is that once any server in the second set has received a message, no other server from the second set should receive it. All the servers from the first set should still be able to consume this message.
Is this possible with a single basic_publish call or do I need to send the message to a routing exchange (for the first set of consumers) and to a "global" exchange for the second set of consumers?
CLARIFICATION:
What I'd like to achieve is a single call to publish a message and have it received by 2 distinct sets of consumers.
Case 1: Just receive messages based on routing key (that is, a message with routing key foo will be received by all the consumers currently interested in that topic).
Case 2: This basically resembles the RabbitMQ Tutorial for Worker Queues. There are a number of workers that will receive messages dispatched in a round-robin way. Only one worker will receive a given message.
Still, the message received by the consumers that are interested in a certain routing key should be exactly the same as the message received by the workers, produced by a single API call.
(Hope my question makes sense I'm not too familiar with AMQP terms)
To start with, you need to use a topic exchange and publish your messages with a routing key that identifies the topic. The magic happens when the consumer binds its queue with a binding key (a pattern to be matched against the routing key). Some consumers just use the routing keys they care about as their binding key. But the second set will use a wildcard pattern for their binding key.
For Case 1, you need to create a queue per consumer, and bind each queue with the appropriate routing key.
For Case 2, just create a single queue with a binding key of # and have each of your worker consumers consume from that. The broker will dispatch in a round-robin manner to the workers.
Here's a screenshot of what it would look like in RabbitMQ. In this example there are two consumers from your "case 1" (Foo and Bar) and one queue for all the workers to satisfy "case 2".
This model should be supported by all AMQP-compliant brokers and wouldn't require any vendor-specific enhancements.
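The matching rule behind this setup can be sketched in plain Python. In a topic binding key, `*` stands for exactly one dot-separated word and `#` for zero or more words; the queue names in `bindings` below are hypothetical stand-ins for the two "case 1" consumers and the shared worker queue:

```python
def topic_matches(binding_key, routing_key):
    """Return True if a topic binding key matches a routing key.

    '*' stands for exactly one dot-separated word, '#' for zero or more.
    """
    def match(pattern, words):
        if not pattern:
            return not words
        head, rest = pattern[0], pattern[1:]
        if head == "#":
            # Try letting '#' absorb 0, 1, 2, ... of the remaining words.
            return any(match(rest, words[i:]) for i in range(len(words) + 1))
        if not words:
            return False
        return (head == "*" or head == words[0]) and match(rest, words[1:])

    return match(binding_key.split("."), routing_key.split("."))


# Hypothetical bindings: one queue per "case 1" consumer, one shared worker queue.
bindings = {"foo-consumer": "foo", "bar-consumer": "bar", "workers": "#"}


def queues_for(routing_key):
    """List the queues that would receive a copy of the message."""
    return sorted(q for q, key in bindings.items() if topic_matches(key, routing_key))
```

A message published once with routing key `foo` lands in both `foo-consumer` and `workers`; the workers then share their single queue, so only one of them sees each message.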
I'm having some issues getting Pika to work with routing keys or exchanges in a way that's consistent with the AMQP or RabbitMQ documentation. I understand that the RabbitMQ documentation uses an older version of Pika, so I have disregarded their example code.
What I'm trying to do is define a queue, "order", and have two consumers, one that handles the exchange or routing_key "production" and one that handles "test". From looking at the RabbitMQ documentation, that should be easy enough to do by using either a direct exchange and routing keys or a topic exchange.
Pika however doesn't appear to know what to do with the exchanges and routing keys. Using the RabbitMQ management tool to inspect the queues, it's pretty obvious that Pika either didn't queue the message correctly or that RabbitMQ just threw it away.
On the consumer side it isn't really clear how I should bind a consumer to an exchange or handle routing keys and the documentation isn't really helping.
If I drop all ideas of exchanges and routing keys, messages queue up nicely and are easily handled by my consumer.
Any pointers or example code people have would be nice.
As it turns out, my understanding of AMQP was incomplete.
The idea is as follows:
Client:
After getting the connection, the client should not care about anything but the name of the exchange and the routing key. That is, we don't know which queue the message will end up in.
channel.basic_publish(exchange='order',
                      routing_key="order.test.customer",
                      body=pickle.dumps(data),
                      properties=pika.BasicProperties(
                          content_type="text/plain",
                          delivery_mode=2))
Consumer
When the channel is open, we declare the exchange and queue
channel.exchange_declare(exchange='order',
                         exchange_type="topic",  # older pika releases spelled this type=
                         durable=True,
                         auto_delete=False)
channel.queue_declare(queue="test",
                      durable=True,
                      exclusive=False,
                      auto_delete=False,
                      callback=on_queue_declared)
Once the queue is ready (the "on_queue_declared" callback is a good place for this), we can bind the queue to the exchange, using our desired routing key.
channel.queue_bind(queue='test',
                   exchange='order',
                   routing_key='order.test.customer')
# handle_delivery is the callback that will actually pick up and handle messages
# from the "test" queue
channel.basic_consume(handle_delivery, queue='test')
Messages sent to the "order" exchange with the routing key "order.test.customer" will now be routed to the "test" queue, where the consumer can pick them up.
While Simon's answer seems right in general, with newer pika releases you might need to swap the parameters for consuming and pass the callback as on_message_callback:
channel.basic_consume(queue='test', on_message_callback=handle_delivery)
The basic setup is something like:
credentials = pika.PlainCredentials("some_user", "some_password")
parameters = pika.ConnectionParameters(
    "some_host.domain.tld", 5672, "some_vhost", credentials
)
connection = pika.BlockingConnection(parameters)
channel = connection.channel()
To start consuming:
channel.start_consuming()
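Putting the pieces together, a consumer for the "test" queue might be sketched as follows, using the keyword names from pika 1.x. The function name `start_test_consumer` is made up, `channel` is assumed to be an already opened pika BlockingChannel, and the exchange, queue and routing key are the ones from the answer above:

```python
def start_test_consumer(channel, handle_delivery):
    """Declare, bind and consume using pika 1.x keyword names.

    `channel` is assumed to be an open pika BlockingChannel.
    """
    channel.exchange_declare(exchange="order", exchange_type="topic", durable=True)
    channel.queue_declare(queue="test", durable=True)
    channel.queue_bind(queue="test", exchange="order", routing_key="order.test.customer")
    channel.basic_consume(queue="test", on_message_callback=handle_delivery)
    channel.start_consuming()  # blocks until consuming is stopped


def handle_delivery(ch, method, properties, body):
    """pika calls this with (channel, method, properties, body) per message."""
    print("received", body)
    # Acknowledge on the same channel, using that channel's delivery tag.
    ch.basic_ack(delivery_tag=method.delivery_tag)
```

It would be called as `start_test_consumer(channel, handle_delivery)` with the channel created in the basic setup.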