Kafka consumer keeps hanging waiting for messages when the connection is lost - Python

I am using aiokafka==0.5.2 as the Python library for Kafka.
I have just the code from the example:
async def consume():
    consumer = AIOKafkaConsumer(
        'my_topic', 'my_other_topic',
        loop=loop, bootstrap_servers='localhost:9092',
        group_id="my-group")
    await consumer.start()
    try:
        # Consume messages
        async for msg in consumer:
            # ...
When I run it, it works fine. But when I stop the Kafka server, my app just keeps hanging, waiting for messages. And, I guess, when a production cluster excludes a Kafka node from balancing, my app would know nothing about it. How can I listen for the Kafka connection state inside my app?
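One way to avoid blocking forever (a sketch, not something the aiokafka docs prescribe) is to poll with getmany() and a timeout instead of iterating the consumer directly; when nothing arrives within the window, the loop gets a chance to log, alert, or decide the broker is unreachable. The 10-second window below is arbitrary:

import asyncio
from aiokafka import AIOKafkaConsumer

async def consume(loop):
    consumer = AIOKafkaConsumer(
        'my_topic', 'my_other_topic',
        loop=loop, bootstrap_servers='localhost:9092',
        group_id="my-group")
    await consumer.start()
    try:
        while True:
            # Returns an empty dict if nothing arrives within 10 s,
            # instead of suspending indefinitely like `async for` does.
            batches = await consumer.getmany(timeout_ms=10000)
            if not batches:
                # No data: a good place to check broker health, emit a
                # heartbeat metric, or bail out and restart the consumer.
                print("no messages for 10 s - broker may be unreachable")
                continue
            for tp, messages in batches.items():
                for msg in messages:
                    print(msg.topic, msg.offset, msg.value)
    finally:
        await consumer.stop()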

Related

How to detect gRPC server is down from gRPC AIO python client

I've been facing an issue where I have a gRPC AIO Python client sending a bunch of configuration changes to the gRPC server. Although it's a bidirectional RPC, the client is not expecting any messages from the server: whenever there is a configuration change, the client sends a gRPC message containing the configuration. It keeps the channel open and never closes it (it doesn't call done_writing()). When it has nothing to send, it polls a queue for new messages in a tight loop. During that time, if the server goes down, the client cannot detect it. But as soon as some data is available in the queue and the client pushes it to the server, it does detect that the server went down (an exception is thrown).
How can I detect that the server went down while there is no data to send and the client is waiting for data? Is there any gRPC API I can call on the channel while the client is waiting, in order to detect channel failure? (Any API call that throws an exception when the server is down would also work for me; I was not able to find one.) I tried gRPC keepalive, but it didn't work for my scenario.
async with grpc.aio.insecure_channel('127.0.0.1:51234') as channel:
    stream = hello_pb2.HelloStub(channel).Configure()
    await stream.wait_for_connection()
    while True:
        if queue.empty():
            continue
        if not queue.empty():
            item = queue.get()
            await asyncio.sleep(0.001)
            queue.task_done()
            await stream.write(item)
        await asyncio.sleep(0.01)
    await stream.done_writing()
I tried enabling gRPC keepalive while forming the insecure_channel; it didn't have the desired effect.
I subsequently tried calling channel_ready() inside the tight loop while the queue is empty, expecting it to throw an exception and break out of the loop, but that didn't work either.
async with grpc.aio.insecure_channel(
        '127.0.0.1:51234',
        options=[
            ('grpc.keepalive_time_ms', 60000),
            ('grpc.keepalive_timeout_ms', 600000),
            ('grpc.keepalive_permit_without_calls', 1),
            ('grpc.http2.max_pings_without_data', 0),
            ('grpc.http2.min_time_between_pings_ms', 10000),
            ('grpc.http2.min_ping_interval_without_data_ms', 60000),
            ('grpc.max_connection_age_ms', 2147483647),
        ]) as channel:
I was able to solve it using channel.get_state(). Below is the code snippet:
if queue.empty():
    if channel.get_state() != grpc.ChannelConnectivity.READY:
        break
    await asyncio.sleep(5)  # non-blocking sleep so the event loop keeps running
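If you would rather not poll at all, grpc.aio channels also expose wait_for_state_change(), which suspends until the connectivity state moves away from the one you pass in. A rough sketch: run it as a background task next to the writer loop (e.g. with asyncio.create_task()) and race the two with asyncio.wait(..., return_when=asyncio.FIRST_EXCEPTION).

import asyncio
import grpc

async def fail_when_disconnected(channel: grpc.aio.Channel) -> None:
    # Sleeps inside wait_for_state_change() instead of polling, and raises
    # as soon as the channel reaches a failure state.
    state = channel.get_state(try_to_connect=True)
    while state not in (grpc.ChannelConnectivity.TRANSIENT_FAILURE,
                        grpc.ChannelConnectivity.SHUTDOWN):
        await channel.wait_for_state_change(state)
        state = channel.get_state()
    raise ConnectionError(f"gRPC channel became {state}")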

FastAPI consuming live messages

I have a virtual assistant which receives messages and sends them to an event broker (e.g. RabbitMQ).
An event broker allows me to connect my running assistant to other services that process the data coming in from conversations.
Example:
If I have a RabbitMQ server running, as well as another application that consumes the events, then this consumer needs to implement Pika's start_consuming() method with a callback action. Here's a simple example:
import json
import pika

def _callback(ch, method, properties, body):
    # Do something useful with your incoming message body here, e.g.
    # saving it to a database
    print('Received event {}'.format(json.loads(body)))

if __name__ == '__main__':
    # RabbitMQ credentials with username and password
    credentials = pika.PlainCredentials('username', 'password')
    # Pika connection to the RabbitMQ host - typically 'rabbit' in a
    # docker environment, or 'localhost' in a local environment
    connection = pika.BlockingConnection(
        pika.ConnectionParameters('rabbit', credentials=credentials))
    # start consumption of channel
    channel = connection.channel()
    channel.basic_consume(_callback,
                          queue='events',
                          no_ack=True)
    channel.start_consuming()
What is the correct way to use FastAPI with Pika to consume these live messages and save them to a database?
Do I need a websocket route?
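A websocket route is not needed just to consume from RabbitMQ; websockets are for pushing data out to browser clients. One common pattern, sketched below under the assumption of pika 1.x and an existing 'events' queue, is to run the blocking consumer in a daemon thread that FastAPI starts on application startup, so the API routes and the consumer share one process:

import json
import threading

import pika
from fastapi import FastAPI

app = FastAPI()

def _callback(ch, method, properties, body):
    # Replace the print with your database write.
    print('Received event {}'.format(json.loads(body)))

def _consume_forever():
    credentials = pika.PlainCredentials('username', 'password')
    connection = pika.BlockingConnection(
        pika.ConnectionParameters('rabbit', credentials=credentials))
    channel = connection.channel()
    # pika >= 1.0 keyword style; older pika takes the callback positionally.
    channel.basic_consume(queue='events',
                          on_message_callback=_callback,
                          auto_ack=True)
    channel.start_consuming()  # blocks, which is why it lives in a thread

@app.on_event("startup")
def start_consumer():
    threading.Thread(target=_consume_forever, daemon=True).start()

An async alternative is the aio-pika library, which runs on the same event loop as FastAPI, but the thread approach keeps the consumer code above unchanged.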

How to fix not receiving Kafka messages in Python while receiving the same messages in the shell?

I want to consume messages coming into a Kafka topic. I am using Debezium, which reads the MongoDB oplog and puts the changes into a Kafka topic. I am able to connect to Kafka from my Python code and list the Kafka topics. However, when I try to consume the messages, I get nothing, whereas the same topic, consumed from the shell, gives messages and performs perfectly.
from kafka import KafkaConsumer

topic = "dbserver1.inventory.customers"
# consumer = KafkaConsumer(topic, bootstrap_servers='localhost:9092', auto_offset_reset='earliest', auto_commit_enable=True)
consumer = KafkaConsumer(topic)
print("Consumer connected!")
# print("Topics are {}".format(consumer.topics()))
for message in consumer:
    print(message)
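It is hard to be certain without the broker setup, but the usual culprit is offsets: KafkaConsumer starts at the latest offset by default, so it only shows messages produced after it connects, whereas the console consumer in the shell is typically run with --from-beginning. A hedged variant of the snippet that reads from the earliest offset and stops iterating instead of blocking forever:

from kafka import KafkaConsumer

topic = "dbserver1.inventory.customers"
consumer = KafkaConsumer(
    topic,
    bootstrap_servers='localhost:9092',
    auto_offset_reset='earliest',  # start from the beginning when no committed offset exists
    consumer_timeout_ms=10000)     # end the for-loop after 10 s without messages
print("Consumer connected!")
for message in consumer:
    print(message.topic, message.offset, message.value)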

RabbitMQ durable queue bindings

I am trying to reliably send a message from a publisher to multiple consumers using a RabbitMQ topic exchange.
I have configured durable queues (one per consumer) and I am sending persistent messages with delivery_mode=2. I have also set the channel to confirm_delivery mode and added the mandatory=True flag to publish.
Right now the service is pretty reliable, but messages get lost to one of the consumers if it stays down during a broker restart followed by a message publication.
It seems that the broker can recover queues and messages on restart, but it doesn't seem to keep the bindings between consumers and queues. So messages only reach one of the consumers and get lost for the one that is down.
Note: messages do reach the queue and the consumer if the broker doesn't restart while a consumer is down. They accumulate properly on the queue and are delivered to the consumer when it comes back up.
Edit - adding consumer code:
import pika

class Consumer(object):

    def __init__(self, queue_name):
        self.queue_name = queue_name

    def consume(self):
        credentials = pika.PlainCredentials(
            username='myuser', password='mypassword')
        connection = pika.BlockingConnection(
            pika.ConnectionParameters(host='myhost', credentials=credentials))
        channel = connection.channel()
        channel.exchange_declare(exchange='myexchange', exchange_type='topic')
        channel.queue_declare(queue=self.queue_name, durable=True)
        channel.queue_bind(
            exchange='myexchange', queue=self.queue_name, routing_key='my.route')
        channel.basic_consume(
            consumer_callback=self.message_received, queue=self.queue_name)
        channel.start_consuming()

    def message_received(self, channel, basic_deliver, properties, body):
        print(f'Message received: {body}')
        channel.basic_ack(delivery_tag=basic_deliver.delivery_tag)
You can assume each consumer server does something similar to:
c = Consumer('myuniquequeue')  # each consumer has a permanent queue name
c.consume()
Edit - adding publisher code:
def publish(message):
    credentials = pika.PlainCredentials(
        username='myuser', password='mypassword')
    connection = pika.BlockingConnection(
        pika.ConnectionParameters(host='myhost', credentials=credentials))
    channel = connection.channel()
    channel.exchange_declare(exchange='myexchange', exchange_type='topic')
    channel.confirm_delivery()
    success = channel.basic_publish(
        exchange='myexchange',
        routing_key='my.route',
        body=message,
        properties=pika.BasicProperties(
            delivery_mode=2,  # make message persistent
        ),
        mandatory=True
    )
    if success:
        print("Message sent")
    else:
        print("Could not send message")
        # Save for sending later
It is worth saying that I am handling the error case on my own, and that is not the part I would like to improve. When my messages get lost for one of the consumers, the flow still goes through the success branch.
Use basic_ack(delivery_tag=basic_deliver.delivery_tag) in your consumer callback method. This acknowledgement tells the broker whether the consumer received and processed the message or not. On a negative acknowledgement, the message is requeued.
Edit #1
In order to keep receiving messages through a broker crash, the broker needs to be distributed. This is a concept called mirrored queues in RabbitMQ. Mirrored queues let your queues be replicated across the nodes in your cluster: if the node holding a queue goes down, another node holding a replica of the queue takes over as your broker.
For a complete understanding, refer to the RabbitMQ documentation on Mirrored Queues.
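One more thing worth checking on the binding question itself: RabbitMQ only keeps a binding across a restart when both the queue and the exchange are durable, and the snippets above declare the exchange without durable=True. A sketch of the declarations from the consumer's consume() method with durability on both sides (note that redeclaring an existing exchange with a different durable flag is rejected by the broker, so the old exchange may have to be deleted first):

channel.exchange_declare(exchange='myexchange',
                         exchange_type='topic',
                         durable=True)
channel.queue_declare(queue=self.queue_name, durable=True)
channel.queue_bind(exchange='myexchange',
                   queue=self.queue_name,
                   routing_key='my.route')

The same durable=True flag would go on the exchange_declare call in publish() so both sides agree on the exchange properties.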

How to get messages published to Redis before subscribing to the channel

I am writing an application to get messages published to a channel in Redis and process them. It is a long-lived app that basically never stops listening to the channel.
import threading
import time

import redis

def msg_handler():
    r = redis.client.StrictRedis(host='localhost', port=6379, db=0)
    sub = r.pubsub()
    sub.subscribe(settings.REDIS_CHANNEL)
    while True:
        msg = sub.get_message()
        if msg:
            if msg['type'] == 'message':
                print(msg)

def main():
    for i in range(3):
        t = threading.Thread(target=msg_handler, name='worker-%s' % i)
        print('thread {}'.format(i))
        t.setDaemon(True)
        t.start()
    while True:
        print('Waiting')
        time.sleep(1)
When I run this program, I notice that it does not get messages that were published to the channel before the program started. It works fine for messages sent to the channel after the app subscribes.
In production, it is very likely there will be some messages in the channel before the program starts. Is there a way to get those old messages?
Redis Pub/Sub does not store published messages; it delivers them only to whoever is subscribed at that moment. If you need access to old messages, you can:
Use Redis Streams (introduced in Redis 5.0).
Use another pub/sub system, for example nats.io.
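For reference, a rough sketch of what the Streams option looks like with redis-py 3.x against Redis >= 5 (the stream name and payload here are made up): producers append entries with XADD instead of PUBLISH, and a consumer that starts late can still read everything from the beginning of the stream.

import redis

r = redis.StrictRedis(host='localhost', port=6379, db=0)

# Producer side: append to a stream instead of publishing.
r.xadd('events', {'payload': 'hello'})

# Consumer side: starting from id '0' replays entries added before this
# process started; block=1000 waits up to 1 s for new entries.
last_id = '0'
while True:
    for stream, entries in r.xread({'events': last_id}, block=1000):
        for entry_id, fields in entries:
            print(entry_id, fields)
            last_id = entry_id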
