Pika: how to consume messages synchronously in Python

I would like to run a process periodically (say, once every 10 minutes, or once per hour) that gets all the messages from a queue, processes them, and then exits. Is there any way to do this with pika, or should I use a different Python library?

I think an ideal solution here would be the basic_get method. It fetches a single message, but if the queue is already empty it returns None. The advantage of this is that you can clear the queue with a simple loop and break out once None is returned; in addition, it is safe to run basic_get with multiple consumers.
This example is based on my own library, amqpstorm, but you could easily implement the same thing with pika as well.
from amqpstorm import Connection

connection = Connection('127.0.0.1', 'guest', 'guest')
channel = connection.channel()
channel.queue.declare('simple_queue')

while True:
    result = channel.basic.get(queue='simple_queue', no_ack=False)
    if not result:
        print("Channel Empty.")
        # We are done; let's break the loop and stop the application.
        break
    print("Message:", result['body'])
    channel.basic.ack(result['method']['delivery_tag'])

channel.close()
connection.close()
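For comparison, here is a rough pika equivalent of the same drain-and-exit loop (a sketch, assuming a local broker with the default guest/guest credentials). pika's basic_get returns a (method, properties, body) tuple whose members are all None when the queue is empty.

import pika

# Sketch: drain 'simple_queue' with pika's BlockingConnection, then exit.
connection = pika.BlockingConnection(pika.ConnectionParameters('127.0.0.1'))
channel = connection.channel()
channel.queue_declare(queue='simple_queue')

while True:
    method, properties, body = channel.basic_get(queue='simple_queue', auto_ack=False)
    if method is None:
        # basic_get returned (None, None, None): the queue is empty.
        print("Channel Empty.")
        break
    print("Message:", body)
    channel.basic_ack(method.delivery_tag)

channel.close()
connection.close()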

Would this work for you?
Measure the current queue length as N = queue.method.message_count.
Make the callback count the processed messages and, as soon as N have been processed, call channel.stop_consuming().
So, the client code would be something like this:
class CountCallback(object):
    def __init__(self, count):
        self.count = count

    def __call__(self, ch, method, properties, body):
        # process the message here
        self.count -= 1
        if not self.count:
            ch.stop_consuming()

channel = conn.channel()
queue = channel.queue_declare('tasks')
callback = CountCallback(queue.method.message_count)
channel.basic_consume(callback, queue='tasks')
channel.start_consuming()
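Note that the wiring above uses the pre-1.0 pika argument order. In pika 1.x, basic_consume takes the queue name first, so the equivalent would be something like this sketch:

# Sketch, assuming pika 1.x, where basic_consume takes queue first and
# the callback is passed as on_message_callback.
channel = conn.channel()
queue = channel.queue_declare('tasks')
callback = CountCallback(queue.method.message_count)
channel.basic_consume(queue='tasks', on_message_callback=callback)
channel.start_consuming()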

@eandersson's example above is based on his library, amqpstorm; here it is updated for amqpstorm 2.6.1:
from amqpstorm import Connection

connection = Connection('127.0.0.1', 'guest', 'guest')
channel = connection.channel()
channel.queue.declare('simple_queue')

while True:
    result = channel.basic.get(queue='simple_queue', no_ack=False)
    if not result:
        print("Channel Empty.")
        # We are done; let's break the loop and stop the application.
        break
    print("Message:", result.body)
    channel.basic.ack(result.method['delivery_tag'])

channel.close()
connection.close()

Related

Python Pika: how to use GeventConnection

We have several tasks that we consume from a message queue. Their runtimes depend on fetching some data from a database, so we would like to work with gevent to avoid blocking the program when some database requests take a long time. We are trying to couple it with the pika client, which has some asynchronous adapters, one of them for gevent: pika.adapters.gevent_connection.GeventConnection.
I set up some toy code that consumes tasks consisting of integers from one queue and publishes them on another queue, sleeping for 4 seconds for each odd number:
# from gevent import monkey
# # Monkeypatch core python libraries to support asynchronous operations.
# monkey.patch_time()
import pika
from pika.adapters.gevent_connection import GeventConnection
from datetime import datetime
import time

def handle_delivery(unused_channel, method, header, body):
    """Called when we receive a message from RabbitMQ"""
    print(f"Received: {body} at {datetime.now()}")
    channel.basic_ack(method.delivery_tag)
    num = int(body)
    print(num)
    if num % 2 != 0:
        time.sleep(4)
    channel.basic_publish(
        exchange='my_test_exchange2',
        routing_key='my_test_queue2',
        body=body
    )
    print("Finished processing")

def on_connected(connection):
    """Called when we are fully connected to RabbitMQ"""
    # Open a channel
    connection.channel(on_open_callback=on_channel_open)

def on_channel_open(new_channel):
    """Called when our channel has opened"""
    global channel
    channel = new_channel
    channel.basic_qos(prefetch_count=1)
    channel.queue_declare(queue="my_queue_gevent5")
    channel.exchange_declare("my_test_exchange2")
    channel.queue_declare(queue="my_test_queue2")
    channel.queue_bind(exchange="my_test_exchange2", queue="my_test_queue2")
    channel.basic_consume("my_queue_gevent5", handle_delivery)

def start_loop(i):
    conn = GeventConnection(pika.ConnectionParameters('localhost'), on_open_callback=on_connected)
    conn.ioloop.start()

start_loop(1)
If I run it without the monkey.patch_time() call, it works fine and publishes results on my_test_queue2, but it works sequentially. The expected behaviour after adding monkey.patch_time() would be that it still works, but concurrently. However, the code gets stuck (nothing happens any more) once it reaches the time.sleep(4) call. It processes and publishes the first integer, which is 0, and then gets stuck at 1, when the if clause is triggered. What am I doing wrong?
With the help of ChatGPT I managed to make it work. There was a gevent.spawn() call missing:
import gevent  # import added for completeness; needed for gevent.spawn

def handle_delivery(unused_channel, method, header, body):
    print("Handling delivery")
    # Hand the actual work off to a greenlet so the I/O loop is not blocked
    gevent.spawn(process_message, method, body)

def process_message(method, body):
    print(f"Received: {body} at {datetime.now()}")
    channel.basic_ack(method.delivery_tag)
    num = int(body)
    print(num)
    if num % 2 != 0:
        time.sleep(4)
    channel.basic_publish(
        exchange='my_test_exchange2',
        routing_key='my_test_queue2',
        body=body
    )
    print("Finished processing")

Pika / RabbitMQ: why is my deliver rate greater than my acknowledge rate when using prefetch_count=1 for the consumer?

I have a problem with pika, the RabbitMQ client for Python.
I want to consume one message from a queue, work with it, and acknowledge it when the work is done. Then the next message should be received.
I used the prefetch_count=1 option to tell RabbitMQ that this consumer only wants one message at a time and does not want a new message until the current one is acknowledged.
Here is my (very simple) code:
credentials = pika.PlainCredentials("username", "password")
connection = pika.BlockingConnection(
    pika.ConnectionParameters(host='1.2.3.4', credentials=credentials))
channel = connection.channel()

def consume(ch, method, properties, body):
    time.sleep(5)  # Here is the work; for now just hold 5 seconds
    ch.basic_ack(method.delivery_tag)

def init():
    channel.basic_consume(
        queue="raw.archive", on_message_callback=consume, auto_ack=False)
    channel.basic_qos(prefetch_count=1)
    channel.start_consuming()

if __name__ == "__main__":
    init()
So my question is: why does RabbitMQ deliver more messages (40/sec) than are acknowledged (0.20/sec, which is correct, given the 5-second pause)? Shouldn't these two rates be equal?
Furthermore, the Unacked value (1650) should never be greater than 1, because no further message should be delivered until the current one is acknowledged.
The second view shows that the consumer has no prefetch count, yet the prefetch count is set on the connection. Maybe I must set it on the consumer instead, but I don't know how.
What am I doing wrong?
Thanks in advance.
As confirmed by Marcel, the issue is related to when basic_qos is set on the channel: it must be set prior to basic_consume.
def init():
    channel.basic_qos(prefetch_count=1)
    channel.basic_consume(
        queue="raw.archive", on_message_callback=consume, auto_ack=False)
    channel.start_consuming()
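On the per-consumer question above: in pika 1.x, basic_qos also accepts a global_qos flag (an assumption to verify against your pika version). A short sketch:

# Sketch, assuming pika 1.x: basic_qos takes a global_qos flag.
# global_qos=False (the default) applies the limit to each consumer
# individually; global_qos=True applies it across all consumers
# sharing the channel.
def init():
    channel.basic_qos(prefetch_count=1, global_qos=False)
    channel.basic_consume(
        queue="raw.archive", on_message_callback=consume, auto_ack=False)
    channel.start_consuming()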

Non-blocking multiprocessing.connection.Listener?

I use multiprocessing.connection.Listener for communication between processes, and it works like a charm for me. Now I would really love my main loop to do something else between commands from the client. Unfortunately listener.accept() blocks execution until a connection from the client process is established.
Is there a simple way of managing a non-blocking check for multiprocessing.connection? A timeout? Or should I use a dedicated thread?
# Simplified code:
from multiprocessing.connection import Listener

def mainloop():
    listener = Listener(address=('localhost', 6000), authkey=b'secret')
    while True:
        conn = listener.accept()  # <--- This blocks!
        msg = conn.recv()
        print('got message: %r' % msg)
        conn.close()
One solution that I found (although it might not be the most "elegant") is using conn.poll (documentation). poll returns True if the connection has new data and, most importantly, is non-blocking if no argument is passed to it. I'm not 100% sure that this is the best way to do it, but I've had success with running listener.accept() only once and then using the following pattern to repeatedly get input (if any is available):
from multiprocessing.connection import Listener

def mainloop():
    running = True
    listener = Listener(address=('localhost', 6000), authkey=b'secret')
    conn = listener.accept()
    msg = ""
    while running:
        while conn.poll():
            msg = conn.recv()
            print(f"got message: {msg}")
            if msg == "EXIT":
                running = False
        # Other code can go here
        print(f"I can run too! Last msg received was {msg}")
    conn.close()
The while in the conditional statement can be replaced with if, if you only want to get a maximum of one message at a time. Use with caution, as it seems somewhat 'hacky', and I haven't found references to using conn.poll for this purpose elsewhere.
You can run the blocking function in a thread:
conn = await loop.run_in_executor(None, listener.accept)
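For context, a minimal self-contained sketch of that approach (assuming asyncio and the same placeholder address and authkey as above):

import asyncio
from multiprocessing.connection import Listener

async def mainloop():
    listener = Listener(address=('localhost', 6000), authkey=b'secret')
    loop = asyncio.get_running_loop()
    while True:
        # accept() blocks, so run it in a worker thread; the event loop
        # stays free to run other coroutines in the meantime.
        conn = await loop.run_in_executor(None, listener.accept)
        msg = await loop.run_in_executor(None, conn.recv)  # recv() blocks too
        print('got message: %r' % msg)
        conn.close()

asyncio.run(mainloop())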
I've not used the Listener object myself; for this task I normally use multiprocessing.Queue, documented at the following link:
https://docs.python.org/2/library/queue.html#Queue.Queue
That object can be used to send and receive any pickle-able object between Python processes with a nice API; I think you'll be most interested in:
in process A:
    some_queue.put('some message')
in process B:
    some_queue.get_nowait()  # raises queue.Empty if nothing is available; handle that to move on with your execution
The only limitation with this is that you'll need to have control of both Process objects at some point in order to allocate the queue to them, something like this:
import time
from queue import Empty
from multiprocessing import Queue, Process

def receiver(q):
    while 1:
        try:
            message = q.get_nowait()
            print('receiver got', message)
        except Empty:
            print('nothing to receive, sleeping')
            time.sleep(1)

def sender(q):
    while 1:
        message = 'some message'
        q.put(message)
        print('sender sent', message)
        time.sleep(1)

some_queue = Queue()

process_a = Process(
    target=receiver,
    args=(some_queue,)
)
process_b = Process(
    target=sender,
    args=(some_queue,)
)

process_a.start()
process_b.start()

print('ctrl + c to exit')
try:
    while 1:
        time.sleep(1)
except KeyboardInterrupt:
    pass

process_a.terminate()
process_b.terminate()

process_a.join()
process_b.join()
Queues are nice because you can actually have as many consumers and as many producers for that exact same Queue object as you like (handy for distributing tasks).
I should point out that just calling .terminate() on a Process is bad form: you should use your shiny new messaging system to pass a shutdown message or something of that nature.
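For example, the shutdown message could be a simple sentinel value. A sketch along those lines (not part of the original answer):

from multiprocessing import Queue, Process

SENTINEL = None  # sentinel value signalling "shut down"

def receiver(q):
    while True:
        message = q.get()          # blocks until something arrives
        if message is SENTINEL:    # shutdown requested by the producer
            break
        print('receiver got', message)

some_queue = Queue()
worker = Process(target=receiver, args=(some_queue,))
worker.start()
some_queue.put('some message')
some_queue.put(SENTINEL)  # ask the worker to exit cleanly
worker.join()             # no .terminate() needed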
The multiprocessing module comes with a nice feature called Pipe(). It is a nice way to share resources between two processes (I've never tried more than two). Python 3.8 added a shared memory facility to the multiprocessing module, but I have not really tested that, so I cannot vouch for it.
You use the Pipe function something like this:
from multiprocessing import Pipe, Process
# ...

def sending(conn):
    message = 'some message'
    # perform some code
    conn.send(message)
    conn.close()

receiver, sender = Pipe()
p = Process(target=sending, args=(sender,))
p.start()
print(receiver.recv())  # prints "some message"
p.join()
With this you should be able to have separate processes running independently until you reach the point at which you need the input from one process. If the data from the other process is not yet available, you can put the waiting process to sleep, or use a loop to keep checking whether the other process has finished its task and sent the result over:

while not receiver.poll():  # poll() is a non-blocking check for pending data
    time.sleep(5)
result = receiver.recv()

This keeps it in a loop until the other process is done running and sends the result. This is also about 2-3 times faster than Queue. Although Queue is also a good option, personally I do not use it.

Pika channel.stop_consuming doesn't stop start_consuming loop

I have this piece of code, which basically runs channel.start_consuming().
I want it to stop after a while.
I think that channel.stop_consuming() is the right method:
def stop_consuming(self, consumer_tag=None):
    """ Cancels all consumers, signalling the `start_consuming` loop to
    exit.
But it doesn't work: start_consuming() never ends (execution never exits from this call, and "end" is never printed).
import unittest
import pika
import threading
import time

_url = "amqp://user:password@xxx.rabbitserver.com/aaa"

class Consumer_test(unittest.TestCase):

    def test_startConsuming(self):
        def callback(channel, method, properties, body):
            print("callback")
            print(body)

        def connectionTimeoutCallback():
            print("connectionClosedCallback")

        def _closeChannel(channel_):
            print("_closeChannel")
            time.sleep(1)
            print("close")
            if channel_.is_open:
                channel_.stop_consuming()
                print("stop_consuming")
            else:
                print("channel is closed")
            # channel_.close()

        params = pika.URLParameters(_url)
        params.socket_timeout = 5
        connection = pika.BlockingConnection(params)
        # connection.add_timeout(2, connectionTimeoutCallback)
        channel = connection.channel()
        channel.basic_consume(callback,
                              queue='test',
                              no_ack=True)

        t = threading.Thread(target=_closeChannel, args=[channel])
        t.start()

        print("start_consuming")
        channel.start_consuming()  # start consuming (loop never ends)
        connection.close()
        print("end")
connection.add_timeout solves my problem, and maybe calling basic_cancel would too, but I want to use the right method.
Thanks
Note:
I can't respond or add a comment to this (pika, stop_consuming does not work) due to my low reputation points.
Note 2:
I think that I'm not sharing a channel or connection across threads (pika doesn't support this), because I use the "channel_" passed as a parameter and not the "channel" instance of the class. (Am I wrong?)
I was having the same problem; pika is not thread safe, i.e. connections and channels can't be safely shared across threads.
So I used a separate connection to send a shutdown message, then stopped consuming on the original channel from the callback function.
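Another option worth sketching (an assumption based on pika 1.x, reusing the connection and channel from the test above): BlockingConnection exposes add_callback_threadsafe, which schedules a callable to run on the connection's own thread, so no other thread ever touches the channel directly.

import threading
import time

def stop_later(connection, channel):
    time.sleep(1)
    # Schedule stop_consuming() to run inside the connection's own
    # event-processing loop rather than calling it from this thread.
    connection.add_callback_threadsafe(channel.stop_consuming)

t = threading.Thread(target=stop_later, args=(connection, channel))
t.start()
channel.start_consuming()  # returns once the scheduled stop_consuming runs
t.join()
connection.close()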

Synchronous and blocking consumption in RabbitMQ using pika

I want to consume a queue (RabbitMQ) synchronously with blocking.
Note: below is the full code, ready to be run.
The system is set up to use RabbitMQ as its queuing system, but asynchronous consumption is not needed in one of our modules.
I've tried using basic_get on top of a BlockingConnection, which doesn't block (it returns (None, None, None) immediately):
# declare queue
get_connection().channel().queue_declare(TEST_QUEUE)

def blocking_get_1():
    channel = get_connection().channel()
    # get from an empty queue (prints immediately)
    print(channel.basic_get(TEST_QUEUE))
I've also tried to use the consume generator; it fails with "Connection Closed" after a long period of not consuming:
def blocking_get_2():
    channel = get_connection().channel()
    # put messages in TEST_QUEUE
    for i in range(4):
        channel.basic_publish(
            '',
            TEST_QUEUE,
            'body %d' % i
        )
    consume_generator = channel.consume(TEST_QUEUE)
    print(next(consume_generator))
    time.sleep(14400)
    print(next(consume_generator))
Is there a way to use RabbitMQ through the pika client as I would use a Queue.Queue in Python, or anything similar?
My option at the moment is to busy-wait (using basic_get), but I would rather use the existing system and not busy-wait, if possible.
Full code:
#!/usr/bin/env python
import pika
import time

TEST_QUEUE = 'test'

def get_connection():
    # define connection
    connection = pika.BlockingConnection(
        pika.ConnectionParameters(
            host=YOUR_IP,
            port=YOUR_PORT,
            credentials=pika.PlainCredentials(
                username=YOUR_USER,
                password=YOUR_PASSWORD,
            )
        )
    )
    return connection

# declare queue
get_connection().channel().queue_declare(TEST_QUEUE)

def blocking_get_1():
    channel = get_connection().channel()
    # get from an empty queue (prints immediately)
    print(channel.basic_get(TEST_QUEUE))

def blocking_get_2():
    channel = get_connection().channel()
    # put messages in TEST_QUEUE
    for i in range(4):
        channel.basic_publish(
            '',
            TEST_QUEUE,
            'body %d' % i
        )
    consume_generator = channel.consume(TEST_QUEUE)
    print(next(consume_generator))
    time.sleep(14400)
    print(next(consume_generator))

print("blocking_get_1")
blocking_get_1()

print("blocking_get_2")
blocking_get_2()

get_connection().channel().queue_delete(TEST_QUEUE)
A common problem with pika is that it currently does not handle incoming events in the background. This basically means that in many scenarios you will need to call connection.process_data_events() periodically to ensure that it does not miss heartbeats.
It also means that if you sleep for an extended period of time, pika will not handle incoming data and will eventually die, as it is not responding to heartbeats. One option here is to disable heartbeats.
I usually solve this by having a thread in the background check for new events, as seen in this example.
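With pika's BlockingConnection, one way to idle without starving heartbeats is to use process_data_events with a time limit in place of a plain sleep (a sketch, not from the original answer):

# Sketch: poll with basic_get, but "sleep" via process_data_events so
# pika keeps servicing heartbeats and other I/O while we wait.
while True:
    method, properties, body = channel.basic_get(queue='simple_queue')
    if method is not None:
        print("Message:", body)
        channel.basic_ack(method.delivery_tag)
    else:
        connection.process_data_events(time_limit=1)  # wait up to 1 second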
If you want to block completely, I would do something like this (based on my own library, AMQPStorm):
while True:
    message = channel.basic.get(queue='simple_queue', no_ack=False)
    if message:
        print("Message:", message.body)
        message.ack()
    else:
        print("Channel Empty.")
        sleep(1)
This is based on the example found here.
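pika itself also offers a middle ground here: its consume generator accepts an inactivity_timeout (an assumption, present in pika 1.x), after which it yields (None, None, None) instead of blocking forever. A sketch reusing TEST_QUEUE and get_connection() from the question:

# Sketch, assuming pika 1.x: consume() with inactivity_timeout yields
# (None, None, None) when no message arrives within the timeout, giving
# a natural place to do other work without busy-waiting or dying on
# missed heartbeats.
channel = get_connection().channel()
for method, properties, body in channel.consume(TEST_QUEUE, inactivity_timeout=1):
    if method is None:
        print("No message within the timeout; do other work here.")
        continue
    print("Message:", body)
    channel.basic_ack(method.delivery_tag)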
