How can I send a large message to a Kafka producer using Python?

If I send a large JSON payload to the Kafka server, it fails with the error below. How can I increase message.max.bytes=15728640 and replica.fetch.max.bytes=15728640 in Kafka? I tried raising the socket buffer sizes as shown below, but it didn't work:
# The send buffer (SO_SNDBUF) used by the socket server
socket.send.buffer.bytes=15728640
# The receive buffer (SO_RCVBUF) used by the socket server
socket.receive.buffer.bytes=15728640
Error:
[2022-01-06 12:36:51,281] [9015] [ERROR] [^-App]: Crashed reason=ProducerSendError("Error while sending: MessageSizeTooLargeError('The message is 6677420 bytes when serialized which is larger than the maximum request size you have configured with the max_request_size configuration',)",)
Traceback (most recent call last):
File "/home/twilightuser/faust_library/venv/lib/python3.6/site-packages/faust/transport/drivers/aiokafka.py", line 1059, in send
transactional_id=transactional_id,
File "/home/twilightuser/faust_library/venv/lib/python3.6/site-packages/aiokafka/producer/producer.py", line 310, in send
key_bytes, value_bytes = self._serialize(topic, key, value)
File "/home/twilightuser/faust_library/venv/lib/python3.6/site-packages/aiokafka/producer/producer.py", line 231, in _serialize
" max_request_size configuration" % message_size)
kafka.errors.MessageSizeTooLargeError: [Error 10] MessageSizeTooLargeError: The message is 6677420 bytes when serialized which is larger than the maximum request size you have configured with the max_request_size configuration
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/home/twilightuser/faust_library/venv/lib/python3.6/site-packages/mode/services.py", line 779, in _execute_task
await task
File "/home/twilightuser/faust_library/venv/lib/python3.6/site-packages/faust/app/base.py", line 941, in _wrapped
return await task()
File "/home/twilightuser/faust_library/venv/lib/python3.6/site-packages/faust/app/base.py", line 991, in around_timer
await fun(*args)
File "/home/twilightuser/faust_library/producer.py", line 14, in my_send
await topic.send(value=value)
File "/home/twilightuser/faust_library/venv/lib/python3.6/site-packages/faust/topics.py", line 193, in send
callback=callback,
File "/home/twilightuser/faust_library/venv/lib/python3.6/site-packages/faust/channels.py", line 303, in _send_now
schema, key_serializer, value_serializer, callback))
File "/home/twilightuser/faust_library/venv/lib/python3.6/site-packages/faust/topics.py", line 417, in publish_message
headers=headers,
File "/home/twilightuser/faust_library/venv/lib/python3.6/site-packages/faust/transport/drivers/aiokafka.py", line 1062, in send
raise ProducerSendError(f'Error while sending: {exc!r}') from exc
faust.exceptions.ProducerSendError: Error while sending: MessageSizeTooLargeError('The message is 6677420 bytes when serialized which is larger than the maximum request size you have configured with the max_request_size configuration',)
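The limit in the traceback is hit on the producer side: aiokafka's max_request_size (exposed in Faust as producer_max_request_size) defaults to roughly 1 MB, so raising the socket buffers or broker settings alone has no effect. A minimal sketch of the producer-side change, assuming a Faust app (the app id and broker address are placeholders); the broker/topic settings still need to be raised separately:

import faust

app = faust.App(
    'my-app',                              # placeholder app id
    broker='kafka://localhost:9092',       # placeholder broker address
    producer_max_request_size=15728640,    # raise aiokafka's max_request_size (~15 MB)
    consumer_max_fetch_size=15728640,      # let consumers fetch messages this large
)

On the broker (server.properties) or per topic, message.max.bytes=15728640 and replica.fetch.max.bytes=15728640 then cover the server side, as the question already notes.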

Related

TelegramAPIError: Bad Gateway (aiogram)

I created a bot that notifies the user at certain times, but from time to time it raises this error:
dispatcher.py [LINE:390] ERROR | 2022-10-03 04:10:16,846 : Cause exception while getting updates.
Traceback (most recent call last):
File "/usr/local/lib/python3.8/dist-packages/aiogram/dispatcher/dispatcher.py", line 381, in start_polling
updates = await self.bot.get_updates(
File "/usr/local/lib/python3.8/dist-packages/aiogram/bot/bot.py", line 110, in get_updates
result = await self.request(api.Methods.GET_UPDATES, payload)
File "/usr/local/lib/python3.8/dist-packages/aiogram/bot/base.py", line 231, in request
return await api.make_request(await self.get_session(), self.server, self.__token, method, data, files,
File "/usr/local/lib/python3.8/dist-packages/aiogram/bot/api.py", line 140, in make_request
return check_result(method, response.content_type, response.status, await response.text())
File "/usr/local/lib/python3.8/dist-packages/aiogram/bot/api.py", line 128, in check_result
raise exceptions.TelegramAPIError(description)
aiogram.utils.exceptions.TelegramAPIError: Bad Gateway
I think this problem is on Telegram's side and is usually solved by switching to webhooks, but I don't want to use them.
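For what it's worth, the dispatcher already logs this error and keeps polling, so the bot usually recovers on its own. If you want an explicit safety net without webhooks, a minimal supervising-loop sketch, assuming aiogram 2.x (the token is a placeholder):

import asyncio
import logging

from aiogram import Bot, Dispatcher
from aiogram.utils import exceptions

bot = Bot(token="123456789:YOUR_TOKEN_HERE")  # placeholder token
dp = Dispatcher(bot)

async def poll_forever():
    # Restart long polling if a transient server-side error such as
    # Bad Gateway ever bubbles up and stops it.
    while True:
        try:
            await dp.start_polling()
        except exceptions.TelegramAPIError as err:
            logging.warning("Telegram API error, restarting polling in 5s: %s", err)
            await asyncio.sleep(5)

if __name__ == "__main__":
    asyncio.run(poll_forever())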

Trying to get AIOKafka to work with self-signed cert (Python)

I've been banging my head against my keyboard for a day now, so I'm giving up and asking for help.
I've got a working consumer using confluent-kafka, but I need to make it run as a coroutine so I can get things working with FastAPI. I really wanted to try out AIOKafka for this, but for the life of me, I can't get it to work with a self-signed certificate (this is in our dev env).
Here is the working config for my confluent-kafka consumer:
conf = {
    "bootstrap.servers": "10.142.252.214:9093",
    "group.id": "myConsumerID",
    "security.protocol": "SASL_SSL",
    "sasl.username": kafkaUser,
    "sasl.password": kafkaPass,
    "sasl.mechanisms": "PLAIN",
    "enable.ssl.certificate.verification": "False",
    "on_commit": commit_completed,
    "heartbeat.interval.ms": "1000",
    "socket.connection.setup.timeout.ms": "10000",
    "auto.offset.reset": "earliest"
}
Here is the code I'm trying to use for AIOKafka:
async def consume():
    cert = "../foo/cert/certificate.pem"
    key = "../foo/cert/key.pem"
    context2 = ssl.create_default_context()
    context2.load_cert_chain(certfile=cert, keyfile=key)
    context2.check_hostname = False
    context2.verify_mode = CERT_NONE
    #context2.ssl_cafile="../foo/cert/CARoot.pem"
    context2.ssl_certfile = "cert.pem"
    context2.ssl_keyfile = "key.pem"
    context2.ssl_password = kafkaKey
    context2.ssl_keystore_type = "PEM"

    consumer = AIOKafkaConsumer(
        'TopicA', 'TopicB',
        bootstrap_servers="10.142.252.214:9093",
        group_id="myConsumerGroup",
        sasl_plain_username="kafkaUser",
        sasl_plain_password="kafkaPass",
        sasl_mechanism="PLAIN",
        security_protocol="SASL_SSL",
        ssl_context=context2)

    await consumer.start()
    try:
        # Consume messages
        async for msg in consumer:
            print("consumed: ", msg.topic, msg.partition, msg.offset,
                  msg.key, msg.value, msg.timestamp)
    finally:
        # Will leave consumer group; perform autocommit if enabled.
        await consumer.stop()
When I try to run this, I just get the most cryptic errors ever, and I can't make any sense of where to start figuring out what's wrong.
$ python test-main.py
Traceback (most recent call last):
File "/Users/myUser/.pyenv/versions/3.10.5/lib/python3.10/site-packages/aiokafka/conn.py", line 375, in _on_read_task_error
read_task.result()
File "/Users/myUser/.pyenv/versions/3.10.5/lib/python3.10/site-packages/aiokafka/conn.py", line 518, in _read
resp = await reader.readexactly(4)
File "/Users/myUser/.pyenv/versions/3.10.5/lib/python3.10/asyncio/streams.py", line 706, in readexactly
raise exceptions.IncompleteReadError(incomplete, n)
asyncio.exceptions.IncompleteReadError: 0 bytes read on a total of 4 expected bytes
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/Users/myUser/scripts/ansible-hello-world/test-main.py", line 165, in <module>
asyncio.run(consume())
File "/Users/myUser/.pyenv/versions/3.10.5/lib/python3.10/asyncio/runners.py", line 44, in run
return loop.run_until_complete(main)
File "/Users/myUser/.pyenv/versions/3.10.5/lib/python3.10/asyncio/base_events.py", line 646, in run_until_complete
return future.result()
File "/Users/myUser/scripts/ansible-hello-world/test-main.py", line 155, in consume
await consumer.start()
File "/Users/myUser/.pyenv/versions/3.10.5/lib/python3.10/site-packages/aiokafka/consumer/consumer.py", line 346, in start
await self._client.bootstrap()
File "/Users/myUser/.pyenv/versions/3.10.5/lib/python3.10/site-packages/aiokafka/client.py", line 210, in bootstrap
bootstrap_conn = await create_conn(
File "/Users/myUser/.pyenv/versions/3.10.5/lib/python3.10/site-packages/aiokafka/conn.py", line 96, in create_conn
await conn.connect()
File "/Users/myUser/.pyenv/versions/3.10.5/lib/python3.10/site-packages/aiokafka/conn.py", line 234, in connect
await self._do_sasl_handshake()
File "/Users/myUser/.pyenv/versions/3.10.5/lib/python3.10/site-packages/aiokafka/conn.py", line 314, in _do_sasl_handshake
auth_bytes = await self._send_sasl_token(
File "/Users/myUser/.pyenv/versions/3.10.5/lib/python3.10/asyncio/tasks.py", line 445, in wait_for
return fut.result()
kafka.errors.KafkaConnectionError: KafkaConnectionError: Connection at 10.142.252.214:9093 closed
Unclosed AIOKafkaConsumer
consumer: <aiokafka.consumer.consumer.AIOKafkaConsumer object at 0x107338670>
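One likely culprit: ssl_certfile, ssl_keyfile, ssl_password and ssl_keystore_type are not attributes that ssl.SSLContext knows about, so those assignments are silently ignored; aiokafka only uses the ssl_context object you pass in. A hedged sketch of building the context for a self-signed dev setup with aiokafka's create_ssl_context helper (the file paths and the kafkaUser/kafkaPass/kafkaKey names come from the question; note they are variables here, not string literals):

import ssl

from aiokafka import AIOKafkaConsumer
from aiokafka.helpers import create_ssl_context

# Build the context from the PEM files (CA, client cert, client key).
context = create_ssl_context(
    cafile="../foo/cert/CARoot.pem",
    certfile="../foo/cert/certificate.pem",
    keyfile="../foo/cert/key.pem",
    password=kafkaKey,
)

# For a self-signed dev environment, disable verification entirely,
# mirroring enable.ssl.certificate.verification=False in confluent-kafka.
context.check_hostname = False
context.verify_mode = ssl.CERT_NONE

consumer = AIOKafkaConsumer(
    "TopicA", "TopicB",
    bootstrap_servers="10.142.252.214:9093",
    group_id="myConsumerGroup",
    sasl_plain_username=kafkaUser,
    sasl_plain_password=kafkaPass,
    sasl_mechanism="PLAIN",
    security_protocol="SASL_SSL",
    ssl_context=context,
)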

s3fs timeout on big S3 files

This is similar to dask read_csv timeout on Amazon s3 with big files, but that didn't actually resolve my question.
import s3fs
fs = s3fs.S3FileSystem()
fs.connect_timeout = 18000
fs.read_timeout = 18000 # five hours
fs.download('s3://bucket/big_file','local_path_to_file')
The error I then get is
Traceback (most recent call last):
File "/Users/christopherturnbull/PointTopic/PointTopic/lib/python3.9/site-packages/aiobotocore/response.py", line 50, in read
chunk = await self.__wrapped__.read(amt if amt is not None else -1)
File "/Users/christopherturnbull/PointTopic/PointTopic/lib/python3.9/site-packages/aiohttp/streams.py", line 380, in read
await self._wait("read")
File "/Users/christopherturnbull/PointTopic/PointTopic/lib/python3.9/site-packages/aiohttp/streams.py", line 306, in _wait
await waiter
aiohttp.client_exceptions.ServerTimeoutError: Timeout on reading data from socket
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "<input>", line 1, in <module>
File "/Users/christopherturnbull/PointTopic/PointTopic/lib/python3.9/site-packages/fsspec/spec.py", line 1113, in download
return self.get(rpath, lpath, recursive=recursive, **kwargs)
File "/Users/christopherturnbull/PointTopic/PointTopic/lib/python3.9/site-packages/fsspec/asyn.py", line 281, in get
return sync(self.loop, self._get, rpaths, lpaths)
File "/Users/christopherturnbull/PointTopic/PointTopic/lib/python3.9/site-packages/fsspec/asyn.py", line 71, in sync
raise exc.with_traceback(tb)
File "/Users/christopherturnbull/PointTopic/PointTopic/lib/python3.9/site-packages/fsspec/asyn.py", line 55, in f
result[0] = await future
File "/Users/christopherturnbull/PointTopic/PointTopic/lib/python3.9/site-packages/fsspec/asyn.py", line 266, in _get
return await asyncio.gather(
File "/Users/christopherturnbull/PointTopic/PointTopic/lib/python3.9/site-packages/s3fs/core.py", line 701, in _get_file
chunk = await body.read(2**16)
File "/Users/christopherturnbull/PointTopic/PointTopic/lib/python3.9/site-packages/aiobotocore/response.py", line 52, in read
raise AioReadTimeoutError(endpoint_url=self.__wrapped__.url,
aiobotocore.response.AioReadTimeoutError: Read timeout on endpoint URL: "https://ptpiskiss.s3.eu-west-1.amazonaws.com/REBUILD%20FOR%20TIME%20SERIES/v30a%20sept%202019.accdb"
This is strange, because I thought I was setting the appropriate timeouts on the worker copy of the class. It's solely due to my bad internet connection, but is there something I need to do on my S3 end to help here?
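Assigning connect_timeout and read_timeout on the instance after construction may not reach the aiobotocore client, which was already created with its own botocore config. A hedged sketch that passes the timeouts through config_kwargs at construction instead (values and the retry count are illustrative):

import s3fs

# config_kwargs is forwarded to the botocore/aiobotocore client config,
# so these timeouts apply to the actual socket reads during download.
fs = s3fs.S3FileSystem(
    config_kwargs={
        "connect_timeout": 18000,
        "read_timeout": 18000,            # five hours
        "retries": {"max_attempts": 10},  # retry transient read timeouts
    }
)

fs.download("s3://bucket/big_file", "local_path_to_file")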

How to add retry for celery backend connection?

I am using Celery 5.0.1 with CELERY_BACKEND_URL set to redis://:password#redisinstance1:6379/0. It works fine, but when the Redis instance briefly loses its connection, the task fails with an error.
Exception: Error while reading from socket: (104, 'Connection reset by peer')
Traceback (most recent call last):
File "/usr/local/lib/python3.7/dist-packages/redis/connection.py", line 198, in _read_from_socket
data = recv(self._sock, socket_read_size)
File "/usr/local/lib/python3.7/dist-packages/redis/_compat.py", line 72, in recv
return sock.recv(*args, **kwargs)
ConnectionResetError: [Errno 104] Connection reset by peer
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/local/lib/python3.7/dist-packages/celery/app/trace.py", line 477, in trace_task
uuid, retval, task_request, publish_result,
File "/usr/local/lib/python3.7/dist-packages/celery/backends/base.py", line 154, in mark_as_done
self.store_result(task_id, result, state, request=request)
File "/usr/local/lib/python3.7/dist-packages/celery/backends/base.py", line 439, in store_result
request=request, **kwargs)
File "/usr/local/lib/python3.7/dist-packages/celery/backends/base.py", line 855, in _store_result
current_meta = self._get_task_meta_for(task_id)
File "/usr/local/lib/python3.7/dist-packages/celery/backends/base.py", line 873, in _get_task_meta_for
meta = self.get(self.get_key_for_task(task_id))
File "/usr/local/lib/python3.7/dist-packages/celery/backends/redis.py", line 346, in get
return self.client.get(key)
File "/usr/local/lib/python3.7/dist-packages/redis/client.py", line 1606, in get
return self.execute_command('GET', name)
File "/usr/local/lib/python3.7/dist-packages/redis/client.py", line 901, in execute_command
return self.parse_response(conn, command_name, **options)
File "/usr/local/lib/python3.7/dist-packages/redis/client.py", line 915, in parse_response
response = connection.read_response()
File "/usr/local/lib/python3.7/dist-packages/redis/connection.py", line 739, in read_response
response = self._parser.read_response()
File "/usr/local/lib/python3.7/dist-packages/redis/connection.py", line 324, in read_response
raw = self._buffer.readline()
File "/usr/local/lib/python3.7/dist-packages/redis/connection.py", line 256, in readline
self._read_from_socket()
File "/usr/local/lib/python3.7/dist-packages/redis/connection.py", line 223, in _read_from_socket
(ex.args,))
redis.exceptions.ConnectionError: Error while reading from socket: (104, 'Connection reset by peer')
Celery worker: None
Celery task id: 244b56af-7c96-56cf-a01a-9256cfd98ade
Celery retry attempt: 0
Task args: []
Task kwargs: {'address': 'ipadd', 'uid': 'uid', 'hexID': 'hexID', 'taskID': '244b56af-7c96-56cf-a01a-9256cfd98ade'}
When I run the task a second time, it works fine; there was just a glitch in the connection for a short period of time.
Can I set something so that, when Celery tries to write the result to Redis and gets an error, it retries after 2-5 seconds?
I know how to set retries on the task itself, but this is not a task failure. My task runs fine and returns its data; Celery just loses the connection while writing the result to the backend.
To deal with connection timeouts you can have the following in your Celery configuration:
app.conf.broker_transport_options = {
    'retry_policy': {
        'timeout': 5.0
    }
}
app.conf.result_backend_transport_options = {
    'retry_policy': {
        'timeout': 5.0
    }
}
There are a few other Redis backend settings that you may want to consider adding to your configuration, such as redis_retry_on_timeout, for example.
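A hedged sketch of those extra settings (values are illustrative); redis_retry_on_timeout and the socket options are top-level configuration keys, while a fuller retry_policy on the result backend controls how the result write itself is retried:

# Transient-connection handling for the Redis result backend.
app.conf.redis_retry_on_timeout = True
app.conf.redis_socket_keepalive = True
app.conf.redis_socket_timeout = 30           # seconds
app.conf.redis_socket_connect_timeout = 30   # seconds

app.conf.result_backend_transport_options = {
    'retry_policy': {
        'max_retries': 3,
        'interval_start': 0,   # retry immediately the first time...
        'interval_step': 2,    # ...then back off by 2 s per attempt
        'interval_max': 5,     # capped at 5 s between attempts
    },
}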

Surviving an icinga2 restart in a Python requests stream

I have been working on a chatbot interface to icinga2, and have not found a persistent way to survive the restart/reload of the icinga2 server. After a week of moving try/except blocks, using requests sessions, et al, it's time to reach out to the community.
Here is the current iteration of the request function:
def i2api_request(url, headers={}, data={}, stream=False, *, auth=api_auth, ca=api_ca):
    ''' Do not call this function directly; it's a helper for the i2* command functions '''
    # Adapted from http://docs.icinga.org/icinga2/latest/doc/module/icinga2/chapter/icinga2-api
    # Section 11.10.3.1
    try:
        r = requests.post(url,
                          headers=headers,
                          auth=auth,
                          data=json.dumps(data),
                          verify=ca,
                          stream=stream
                          )
    except (requests.exceptions.ChunkedEncodingError,
            requests.packages.urllib3.exceptions.ProtocolError,
            http.client.IncompleteRead,
            ValueError) as drop:
        return("No connection to Icinga API")
    if r.status_code == 200:
        for line in r.iter_lines():
            try:
                if stream == True:
                    yield(json.loads(line.decode('utf-8')))
                else:
                    return(json.loads(line.decode('utf-8')))
            except:
                debug("Could not produce JSON from "+line)
                continue
    else:
        #r.raise_for_status()
        debug('Received a bad response from Icinga API: '+str(r.status_code))
        print('Icinga2 API connection lost.')
(The debug function just flags and prints the indicated error to the console.)
This code works fine handling events from the API and sending them to the chatbot, but if the icinga server is reloaded, as would be needed after adding a new server definition in /etc/icinga2..., the listener crashes.
Here is the error response I get when the server is restarted:
Exception in thread Thread-11:
Traceback (most recent call last):
File "/home/errbot/err3/lib/python3.4/site-packages/requests/packages/urllib3/response.py", line 447, in _update_chunk_length
self.chunk_left = int(line, 16)
ValueError: invalid literal for int() with base 16: b''
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/errbot/err3/lib/python3.4/site-packages/requests/packages/urllib3/response.py", line 228, in _error_catcher
yield
File "/home/errbot/err3/lib/python3.4/site-packages/requests/packages/urllib3/response.py", line 498, in read_chunked
self._update_chunk_length()
File "/home/errbot/err3/lib/python3.4/site-packages/requests/packages/urllib3/response.py", line 451, in _update_chunk_length
raise httplib.IncompleteRead(line)
http.client.IncompleteRead: IncompleteRead(0 bytes read)
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/errbot/err3/lib/python3.4/site-packages/requests/models.py", line 664, in generate
for chunk in self.raw.stream(chunk_size, decode_content=True):
File "/home/errbot/err3/lib/python3.4/site-packages/requests/packages/urllib3/response.py", line 349, in stream
for line in self.read_chunked(amt, decode_content=decode_content):
File "/home/errbot/err3/lib/python3.4/site-packages/requests/packages/urllib3/response.py", line 526, in read_chunked
self._original_response.close()
File "/usr/lib64/python3.4/contextlib.py", line 77, in __exit__
self.gen.throw(type, value, traceback)
File "/home/errbot/err3/lib/python3.4/site-packages/requests/packages/urllib3/response.py", line 246, in _error_catcher
raise ProtocolError('Connection broken: %r' % e, e)
requests.packages.urllib3.exceptions.ProtocolError: ('Connection broken: IncompleteRead(0 bytes read)', IncompleteRead(0 bytes read))
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/lib64/python3.4/threading.py", line 920, in _bootstrap_inner
self.run()
File "/usr/lib64/python3.4/threading.py", line 868, in run
self._target(*self._args, **self._kwargs)
File "/home/errbot/plugins/icinga2bot.py", line 186, in report_events
for line in queue:
File "/home/errbot/plugins/icinga2bot.py", line 158, in i2events
for line in queue:
File "/home/errbot/plugins/icinga2bot.py", line 98, in i2api_request
for line in r.iter_lines():
File "/home/errbot/err3/lib/python3.4/site-packages/requests/models.py", line 706, in iter_lines
for chunk in self.iter_content(chunk_size=chunk_size, decode_unicode=decode_unicode):
File "/home/errbot/err3/lib/python3.4/site-packages/requests/models.py", line 667, in generate
raise ChunkedEncodingError(e)
requests.exceptions.ChunkedEncodingError: ('Connection broken: IncompleteRead(0 bytes read)', IncompleteRead(0 bytes read))
With Icinga 2.4, this crash happened every time the server was restarted. I thought the problem had gone away after we upgraded to 2.5, but it now appears to have turned into a heisenbug.
I wound up getting advice on IRC to reorder the try/except blocks and make sure they were in the right places. Here's the working result.
def i2api_request(url, headers={}, data={}, stream=False, *, auth=api_auth, ca=api_ca):
    ''' Do not call this function directly; it's a helper for the i2* command functions '''
    # Adapted from http://docs.icinga.org/icinga2/latest/doc/module/icinga2/chapter/icinga2-api
    # Section 11.10.3.1
    debug(url)
    debug(headers)
    debug(data)
    try:
        r = requests.post(url,
                          headers=headers,
                          auth=auth,
                          data=json.dumps(data),
                          verify=ca,
                          stream=stream
                          )
        debug("Connecting to Icinga server")
        debug(r)
        if r.status_code == 200:
            try:
                for line in r.iter_lines():
                    debug('in i2api_request: '+str(line))
                    try:
                        if stream == True:
                            yield(json.loads(line.decode('utf-8')))
                        else:
                            return(json.loads(line.decode('utf-8')))
                    except:
                        debug("Could not produce JSON from "+line)
                        return("Could not produce JSON from "+line)
            except (requests.exceptions.ChunkedEncodingError, ConnectionRefusedError):
                return("Connection to Icinga lost.")
        else:
            debug('Received a bad response from Icinga API: '+str(r.status_code))
            print('Icinga2 API connection lost.')
    except (requests.exceptions.ConnectionError,
            requests.packages.urllib3.exceptions.NewConnectionError) as drop:
        debug("No connection to Icinga API. Error received: "+str(drop))
        sleep(5)
        return("No connection to Icinga API.")
