I have stumbled upon this problem while I was documenting Kombu for the new SO documentation project.
Consider the following Kombu code of a Consumer Mixin:
from kombu import Connection, Queue
from kombu.mixins import ConsumerMixin
from kombu.exceptions import MessageStateError
import datetime
# Send a message to the 'test_queue' queue
with Connection('amqp://guest:guest#localhost:5672//') as conn:
with conn.SimpleQueue(name='test_queue') as queue:
queue.put('String message sent to the queue')
# Callback functions
def print_upper(body, message):
print body.upper()
message.ack()
def print_lower(body, message):
print body.lower()
message.ack()
# Attach the callback function to a queue consumer
class Worker(ConsumerMixin):
def __init__(self, connection):
self.connection = connection
def get_consumers(self, Consumer, channel):
return [
Consumer(queues=Queue('test_queue'), callbacks=[print_even_characters, print_odd_characters]),
]
# Start the worker
with Connection('amqp://guest:guest#localhost:5672//') as conn:
worker = Worker(conn)
worker.run()
The code fails with:
kombu.exceptions.MessageStateError: Message already acknowledged with state: ACK
Because the message was ACK-ed twice, on print_even_characters() and print_odd_characters().
A simple solution that works would be ACK-ing only the last callback function, but it breaks modularity if I want to use the same functions on other queues or connections.
How to ACK a queued Kombu message that is sent to more than one callback function?
Solutions
1 - Checking message.acknowledged
The message.acknowledged flag checks whether the message is already ACK-ed:
def print_upper(body, message):
print body.upper()
if not message.acknowledged:
message.ack()
def print_lower(body, message):
print body.lower()
if not message.acknowledged:
message.ack()
Pros: Readable, short.
Cons: Breaks Python EAFP idiom.
2 - Catching the exception
def print_upper(body, message):
print body.upper()
try:
message.ack()
except MessageStateError:
pass
def print_lower(body, message):
print body.lower()
try:
message.ack()
except MessageStateError:
pass
Pros: Readable, Pythonic.
Cons: A little long - 4 lines of boilerplate code per callback.
3 - ACKing the last callback
The documentation guarantees that the callbacks are called in order. Therefore, we can simply .ack() only the last callback:
def print_upper(body, message):
print body.upper()
def print_lower(body, message):
print body.lower()
message.ack()
Pros: Short, readable, no boilerplate code.
Cons: Not modular: the callbacks can not be used by another queue, unless the last callback is always last. This implicit assumption can break the caller code.
This can be solved by moving the callback functions into the Worker class. We give up some modularity - these functions will not be called from outside - but gain safety and readability.
Summary
The difference between 1 and 2 is merely a matter of style.
Solution 3 should be picked if the order of execution matters, and whether a message should not be ACK-ed before it went through all the callbacks successfully.
1 or 2 should be picked if the message should always be ACK-ed, even if one or more callbacks failed.
Note that there are other possible designs; this answer refers to callback functions that reside outside the worker.
Related
We have several tasks that we consume from a message queue. The runtimes of those tasks are dependent on fetching some data from a database. Therefore we would like to work with Gevent to not block the program if some database requests take a long time. We are trying to couple it with the Pika client, which has some asynchronous adapters, one of them for gevent: pika.adapters.gevent_connection.GeventConnection.
I set up some toy code, which consumes from a MQ tasks that consists of integers and publishes them on another queue, while sleeping for 4 seconds for each odd number:
# from gevent import monkey
# # Monkeypatch core python libraries to support asynchronous operations.
# monkey.patch_time()
import pika
from pika.adapters.gevent_connection import GeventConnection
from datetime import datetime
import time
def handle_delivery(unused_channel, method, header, body):
"""Called when we receive a message from RabbitMQ"""
print(f"Received: {body} at {datetime.now()}")
channel.basic_ack(method.delivery_tag)
num = int(body)
print(num)
if num % 2 != 0:
time.sleep(4)
channel.basic_publish(
exchange='my_test_exchange2',
routing_key='my_test_queue2',
body=body
)
print("Finished processing")
def on_connected(connection):
"""Called when we are fully connected to RabbitMQ"""
# Open a channel
connection.channel(on_open_callback=on_channel_open)
def on_channel_open(new_channel):
"""Called when our channel has opened"""
global channel
channel = new_channel
channel.basic_qos(prefetch_count=1)
channel.queue_declare(queue="my_queue_gevent5")
channel.exchange_declare("my_test_exchange2")
channel.queue_declare(queue="my_test_queue2")
channel.queue_bind(exchange="my_test_exchange2", queue="my_test_queue2")
channel.basic_consume("my_queue_gevent5", handle_delivery)
def start_loop(i):
conn = GeventConnection(pika.ConnectionParameters('localhost'), on_open_callback=on_connected)
conn.ioloop.start()
start_loop(1)
If I run it without the monkey.patch_time() call it works OK and it publishes results on the my_test_queue2, but it works sequentially. The expected behaviour after adding monkey.patch_time() patch would be that it still works but concurrently. However, the code gets stuck (nothing happens anymore) after it comes to the call time.sleep(4). It processes and publishes the first integer, which is 0, and then gets stuck at 1, when the if clause gets triggered. What am I doing wrong?
With the help of ChatGPT I managed to make it work. There was a gevent.spawn() call missing:
def handle_delivery(unused_channel, method, header, body):
print("Handling delivery")
gevent.spawn(process_message, method, body)
def process_message(method, body):
print(f"Received: {body} at {datetime.now()}")
channel.basic_ack(method.delivery_tag)
num = int(body)
print(num)
if num % 2 != 0:
time.sleep(4)
channel.basic_publish(
exchange='my_test_exchange2',
routing_key='my_test_queue2',
body=body
)
print("Finished processing")
I am unable to return the data from the Pika since it start_consuming is not stopping. It prints the results but it does not return the output
def on_request(ch, method, props, body):
directory =body
print(directory.decode('utf-8'))
response = parse(directory.decode('utf-8'))
ch.basic_publish(exchange='',
routing_key=props.reply_to,
properties=pika.BasicProperties(correlation_id = \
props.correlation_id),
body=str(response))
ch.basic_ack(delivery_tag=method.delivery_tag)
def start():
print("hi")
connection = pika.BlockingConnection(
pika.ConnectionParameters(host='localhost'))
channel = connection.channel()
channel.queue_declare(queue='rpc_queue')
channel.basic_qos(prefetch_count=2)
channel.basic_consume(queue='rpc_queue', on_message_callback=on_request)
print(" [x] Awaiting RPC requests")
channel.start_consuming()
By design start_consuming blocks forever. You will have to cancel the consumer in your on_request method.
You can also use this method to consume messages which allows an inactivity_timeout to be set, where you could then cancel your consumer.
Finally, SelectConnection allows much more flexibility in interacting with Pika's I/O loop and is recommended when your requirements are more complex than what BlockingConnection supports.
NOTE: the RabbitMQ team monitors the rabbitmq-users mailing list and only sometimes answers questions on StackOverflow.
Just use channel.stop_consuming()
I'm trying to write an IRC bot that continues to work normally while it executes a long (10+ seconds) function.
I started by writing the bot using socket. When I called a 'blocking' function (computation that takes few seconds to execute), the bot naturally stopped responding and did not record any messages sent in chat while the function was computing.
I did some googling and saw a lot of people recommend using Twisted.
I implemented basic IRC bot, heavily based on some examples:
# twisted imports
from twisted.words.protocols import irc
from twisted.internet import reactor, protocol
from twisted.python import log
# system imports
import time, sys, datetime
def a_long_function():
time.sleep(180)
print("finished")
class BotMain(irc.IRCClient):
nickname = "testIRC_bot"
def connectionMade(self):
irc.IRCClient.connectionMade(self)
def connectionLost(self, reason):
irc.IRCClient.connectionLost(self, reason)
# callbacks for events
def signedOn(self):
"""Signed to server"""
self.join(self.factory.channel)
def joined(self, channel):
"""Joined channel"""
def privmsg(self, user, channel, msg):
"""Received message"""
user = user.split('!', 1)[0]
if 'test' in msg.lower():
print("timeout started")
a_long_function()
msg = "test finished"
self.msg(channel, msg)
if 'ping' in msg.lower():
self.msg(channel, "pong")
print("pong")
class BotMainFactory(protocol.ClientFactory):
"""A factory for BotMains """
protocol = BotMain
def __init__(self, channel, filename):
self.channel = channel
self.filename = filename
def clientConnectionLost(self, connector, reason):
"""Try to reconnect on connection lost"""
connector.connect()
def clientConnectionFailed(self, connector, reason):
print ("connection failed:", reason)
reactor.stop()
if __name__ == '__main__':
log.startLogging(sys.stdout)
f = BotMainFactory("#test", "log.txt")
reactor.connectTCP("irc.freenode.net", 6667, f)
reactor.run()
This approach is definitely better than my earlier socket implementation, because now the bot still receives the messages sent while it executes a_long_function().
However, it only 'sees' these messages after the function is complete. This means that when I was logging the messages to txt file, all messages received when a_long_function() was executing receive the same timestamp of when the function has finished - and not when they were actually sent in the chatroom.
Also, the bot still isn't able to send any messages while its executing the long function.
Could someone point me in the right direction of how I should go about changing the code so that this long function can be executed asynchronously, so that the bot can still log and reply to messages as it's executing?
Thanks in advance.
Edit:
I came across this answer, which gave me an idea that I could add deferLater calls into my a_long_function to split it into smaller chunks (that say take 1s to execute), and have the bot resume normal operation in between to reply to and log any messages that were sent to the IRC channel in mean time. Or perhaps add a timer that counts how long a_long_function has been running for, and if its longer than a threshold, it would call a deferLater to let the bot catch up on the buffered messages.
This does seem like a bit of hack thought - is there a more elegant solution?
No, there is not really a more elegant solution. Unless you want to use threading, which might look more elegant but could easily lead to an unstable program. If you can avoid it, go with the deferral solution.
To asynchronously call a function, you should use the asyncio package along with async/await, or coroutines. Keep in mind that calling async/await is a v3 implementation, not v2.
Using async/await:
#!/usr/bin/env python3
# countasync.py
import asyncio
async def count():
print("One")
await asyncio.sleep(1)
print("Two")
async def main():
await asyncio.gather(count(), count(), count())
if __name__ == "__main__":
import time
s = time.perf_counter()
asyncio.run(main())
elapsed = time.perf_counter() - s
print(f"{__file__} executed in {elapsed:0.2f} seconds.")
There is a really good tutorial you can read here that goes over using asyncio, in depth.
Hope of help!
For server automation, we're trying to develop a tool, which can handle and execute a lot of tasks on different servers. We send the task and the server hostname into a queue. The queue is then consumed from a requester, which give the information to the ansible api. To achieve that we can execute more then one task at once, we're using threading.
Now we're stuck with the acknowledge of the message...
What we have done so far:
The requester.py consumes the queue and starts then a thread, in which the ansible task is running. The result is then sended into another queue. So each new messages creates a new thread. Is the task finished, the thread dies.
But now comes difficult part. We have to made the messages persistent, in case our server dies. So each message should be acknowledged after the result from ansible was sended back.
Our problem is now, when we try to acknowledged the message in the thread itselfs, there is no more "simultaneously" work done, because the consume of pika waits for the acknowledge. So how we can achieve, that the consume consumes messages and dont wait for the acknowledge? Or how we can work around or improve our little programm?
requester.py
#!/bin/python
from worker import *
import ansible.inventory
import ansible.runner
import threading
class Requester(Worker):
def __init__(self):
Worker.__init__(self)
self.connection(self.selfhost, self.from_db)
self.receive(self.from_db)
def send(self, result, ch, method):
self.channel.basic_publish(exchange='',
routing_key=self.to_db,
body=result,
properties=pika.BasicProperties(
delivery_mode=2,
))
print "[x] Sent \n" + result
ch.basic_ack(delivery_tag = method.delivery_tag)
def callAnsible(self, cmd, ch, method):
#call ansible api pre 2.0
result = json.dumps(result, sort_keys=True, indent=4, separators=(',', ': '))
self.send(result, ch, method)
def callback(self, ch, method, properties, body):
print(" [x] Received by requester %r" % body)
t = threading.Thread(target=self.callAnsible, args=(body,ch,method,))
t.start()
worker.py
import pika
import ConfigParser
import json
import os
class Worker(object):
def __init__(self):
#read some config files
def callback(self, ch, method, properties, body):
raise Exception("Call method in subclass")
def receive(self, queue):
self.channel.basic_qos(prefetch_count=1)
self.channel.basic_consume(self.callback,queue=queue)
self.channel.start_consuming()
def connection(self,server,queue):
self.connection = pika.BlockingConnection(pika.ConnectionParameters(
host=server,
credentials=self.credentials))
self.channel = self.connection.channel()
self.channel.queue_declare(queue=queue, durable=True)
We're working with Python 2.7 and pika 0.10.0.
And yes, we noticed in the pika FAQ: http://pika.readthedocs.io/en/0.10.0/faq.html
that pika is not thread safe.
Disable auto-acknowledge and set the prefetch count to something bigger then 1, depending on how many messages would you like your consumer to take.
Here is how to set prefetch
channel.basic_qos(prefetch_count=1), found here.
I have this piece of code, basically it run channel.start_consuming().
I want it to stop after a while.
I think that channel.stop_consuming() is the right method:
def stop_consuming(self, consumer_tag=None):
""" Cancels all consumers, signalling the `start_consuming` loop to
exit.
But it doesn't work: start_consuming() never ends (execution doesn't exit from this call, "end" is never printed).
import unittest
import pika
import threading
import time
_url = "amqp://user:password#xxx.rabbitserver.com/aaa"
class Consumer_test(unittest.TestCase):
def test_startConsuming(self):
def callback(channel, method, properties, body):
print("callback")
print(body)
def connectionTimeoutCallback():
print("connecionClosedCallback")
def _closeChannel(channel_):
print("_closeChannel")
time.sleep(1)
print("close")
if channel_.is_open:
channel_.stop_consuming()
print("stop_cosuming")
else:
print("channel is closed")
#channel_.close()
params = pika.URLParameters(_url)
params.socket_timeout = 5
connection = pika.BlockingConnection(params)
#connection.add_timeout(2, connectionTimeoutCallback)
channel = connection.channel()
channel.basic_consume(callback,
queue='test',
no_ack=True)
t = threading.Thread(target=_closeChannel, args=[channel])
t.start()
print("start_consuming")
channel.start_consuming() # start consuming (loop never ends)
connection.close()
print("end")
connection.add_timeout solve my problem, maybe call basic_cancel too, but I want to use the right method.
Thanks
Note:
I can't respond or add comment to this (pika, stop_consuming does not work) due to my low reputation points.
Note 2:
I think that I'm not sharing channel or connection across threads (Pika doesn't support this) because I use "channel_" passed as parameter and not "channel" instance of the class (Am I wrong?).
I was having the same problem; as pika is not thread safe. i.e. connections and channels can't be safely shared across threads.
So I used a separate connection to send a shutdown message; then stopped consuming the original channel from the callback function.