Python and Kafka KeyError: -1

The Python Kafka client does not work. Any reason why? I can't even connect to Kafka without getting an error. Is there a Kafka client that works with 0.8? This is a brand-new server; I just booted it up.
I am using https://github.com/mumrah/kafka-python
from kafka.client import KafkaClient
kafka = KafkaClient(kafka_domain, 9092)
Traceback (most recent call last):
File "/home/ubuntu/workspace/rtbhui-devops/simulations/pixel_druid_simulations.py", line 36, in <module>
kafka = KafkaClient(kafka_domain, 9092)
File "/usr/local/lib/python2.7/dist-packages/kafka/client.py", line 38, in __init__
self.load_metadata_for_topics() # bootstrap with all metadata
File "/usr/local/lib/python2.7/dist-packages/kafka/client.py", line 247, in load_metadata_for_topics
self.topics_to_brokers[topic_part] = brokers[meta.leader]
KeyError: -1
In the Kafka logs I see the following:
[2014-02-26 08:36:21,471] INFO Closing socket connection to /222.127.xxx.xxx. (kafka.network.Processor)
[2014-02-26 08:40:30,801] ERROR [KafkaApi-1393401480] Error while fetching metadata for partition [topic-pixel,0] (kafka.server.KafkaApis)
kafka.common.LeaderNotAvailableException: Leader not available for partition [topic-pixel,0]
at kafka.server.KafkaApis$$anonfun$17$$anonfun$20.apply(KafkaApis.scala:474)
at kafka.server.KafkaApis$$anonfun$17$$anonfun$20.apply(KafkaApis.scala:462)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:206)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:206)
at scala.collection.LinearSeqOptimized$class.foreach(LinearSeqOptimized.scala:61)
at scala.collection.immutable.List.foreach(List.scala:45)
at scala.collection.TraversableLike$class.map(TraversableLike.scala:206)
at scala.collection.immutable.List.map(List.scala:45)
at kafka.server.KafkaApis$$anonfun$17.apply(KafkaApis.scala:462)
at kafka.server.KafkaApis$$anonfun$17.apply(KafkaApis.scala:458)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:206)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:206)
at scala.collection.immutable.Set$Set1.foreach(Set.scala:81)
at scala.collection.TraversableLike$class.map(TraversableLike.scala:206)
at scala.collection.immutable.Set$Set1.map(Set.scala:68)
at kafka.server.KafkaApis.handleTopicMetadataRequest(KafkaApis.scala:458)
at kafka.server.KafkaApis.handle(KafkaApis.scala:68)
at kafka.server.KafkaRequestHandler.run(KafkaRequestHandler.scala:42)
at java.lang.Thread.run(Thread.java:744)

This is fixed in kafka-python version 0.9.0. The error is internal to kafka-python, not the Kafka server, and simply involves handling partitions that are currently without a leader (which happens for any new topic when topics are auto-created, but is otherwise fairly rare under normal operations).
See
https://github.com/mumrah/kafka-python/pull/109
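If upgrading is not an option right away, a minimal defensive sketch (assuming the two-argument constructor from the question, and kafka.common.LeaderNotAvailableError as raised by kafka-python of that era; adjust names to your version) is to retry metadata loading until the auto-created topic has a leader:
import time
from kafka.client import KafkaClient
from kafka.common import LeaderNotAvailableError

def connect_with_retry(host, port, retries=5, delay=1.0):
    # A freshly auto-created topic briefly has no leader; wait for the
    # election to finish instead of failing on the first metadata fetch.
    for _ in range(retries):
        try:
            return KafkaClient(host, port)
        except (LeaderNotAvailableError, KeyError):
            time.sleep(delay)
    raise RuntimeError("Kafka metadata still reports no leader after retries")

kafka = connect_with_retry("kafka_domain", 9092)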

Related

PyFlink - RabbitMQ sink: A serializer has already been registered for the state; re-registration is not allowed

I am using PyFlink with Flink 1.15.2 and source messages from RabbitMQ with the following connector: flink-sql-connector-rabbitmq-1.15.2.jar
However, when I try to sink to RabbitMQ with this code, following this link: https://nightlies.apache.org/flink/flink-docs-master/docs/connectors/datastream/rabbitmq/#installing-rabbitmq
stream.add_sink(RMQSink(
    connection_config,    # config for the RabbitMQ connection
    'queueName',          # name of the RabbitMQ queue to send messages to
    SimpleStringSchema()))
I got the following error trace:
File "/home/ali/.virtualenvs/LAB_920_log_parser_more_investigation-DQLOhTET/lib/python3.8/site-packages/grpc/_channel.py", line 826, in _next
raise self
grpc._channel._MultiThreadedRendezvous: <_MultiThreadedRendezvous of RPC that terminated with:
status = StatusCode.CANCELLED
details = "Multiplexer hanging up"
debug_error_string = "{"created":"#1662371359.807069114","description":"Error received from peer ipv6:[::1]:44295","file":"src/core/lib/surface/call.cc","file_line":966,"grpc_message":"Multiplexer hanging up","grpc_status":1}"
>
Traceback (most recent call last):
File "/home/ali/.virtualenvs/LAB_920_log_parser_more_investigation-DQLOhTET/lib/python3.8/site-packages/apache_beam/runners/worker/sdk_worker.py", line 289, in _execute
response = task()
and further down in the logs:
RuntimeError: java.lang.UnsupportedOperationException: A serializer has already been registered for the state; re-registration is not allowed.
at org.apache.flink.runtime.state.StateSerializerProvider$EagerlyRegisteredStateSerializerProvider.registerNewSerializerForRestoredState(StateSerializerProvider.java:344)
at org.apache.flink.runtime.state.RegisteredKeyValueStateBackendMetaInfo.updateNamespaceSerializer(RegisteredKeyValueStateBackendMetaInfo.java:132)
Thanks for your help.
I fixed this issue by adding this mapping to String, as the original datastream contains Tuples:
ds_string = stream.map(lambda tuple: tuple[0]+str(tuple[0]), output_type=Types.STRING())
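For context, a minimal sketch of how such a map fits in front of the sink (the field index and str() conversion are assumptions about the Tuple's contents; RMQSink and connection_config come from the question's setup):
from pyflink.common.serialization import SimpleStringSchema
from pyflink.common.typeinfo import Types

# Convert the Tuple stream to plain strings first, so the sink only ever
# registers the String serializer for its state.
ds_string = stream.map(lambda t: str(t[0]), output_type=Types.STRING())

ds_string.add_sink(RMQSink(
    connection_config,    # RabbitMQ connection config from the question
    'queueName',          # target queue name
    SimpleStringSchema()))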

Python Kafka client cannot connect to remote Kafka server

I have an Ubuntu VM in the cloud, where I downloaded Kafka version 2.8.1 from the official Kafka site and followed the instructions in Kafka's official quickstart guide.
I am using a Python client to consume one of the topics that I created as part of the quickstart guide. When I run it on the VM, everything runs fine; however, when I run the same program on my local system, I get the error below.
Unable connect to node with id 0: [Errno 8] nodename nor servname provided, or not known
Traceback (most recent call last):
...
...
File "/Path/python3.9/site-packages/aiokafka/client.py", line 547, in check_version
raise KafkaConnectionError(
kafka.errors.KafkaConnectionError: KafkaConnectionError: No connection to node with id 0
The Python program I am using:
import asyncio
import aiokafka

async def consume():
    consumer = aiokafka.AIOKafkaConsumer(
        "quickstart-events", bootstrap_servers="IP:9092"
    )
    try:
        await consumer.start()
        async for msg in consumer:
            print(
                "consumed: ",
                msg.topic,
                msg.partition,
                msg.offset,
                msg.key,
                msg.value,
                msg.timestamp,
            )
    finally:
        await consumer.stop()

asyncio.run(consume())
I have ensured that the necessary port (9092) on Ubuntu is open:
I checked that I could telnet into port 9092 from my local system.
I am not sure what could be the reason that I am unable to access Kafka over the internet. Am I missing something obvious?
Change the following attribute in config/server.properties to the bootstrap server address you are using in your code. Clients connect to the bootstrap address first, but the broker then returns its advertised.listeners value in the metadata response, and all subsequent connections use that address; if the broker advertises a hostname your local machine cannot resolve, you get exactly this "nodename nor servname provided" error.
advertised.listeners = PLAINTEXT://IP or FQDN:9092
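For example, if the broker's public address were kafka.example.com (a placeholder; substitute your VM's public IP or DNS name), the relevant lines in config/server.properties would look something like this:
# Bind on all interfaces so remote clients can reach the broker
listeners=PLAINTEXT://0.0.0.0:9092
# The address the broker hands back to clients in metadata responses
advertised.listeners=PLAINTEXT://kafka.example.com:9092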

Python/P2P - Unable to connect to rendezvous server

I am trying to create a P2P node using Python (pyp2p), but I am getting this error:
Eamons-MacBook-Pro:blockchain eamonwhite$ python3 serveralice.py
HTTP Error 404: Not Found
HTTP Error 404: Not Found
HTTP Error 404: Not Found
HTTP Error 404: Not Found
Traceback (most recent call last):
File "/Users/eamonwhite/.pyenv/versions/3.6.3/lib/python3.6/site-packages/pyp2p/net.py", line 732, in start
rendezvous_con = self.rendezvous.server_connect()
File "/Users/eamonwhite/.pyenv/versions/3.6.3/lib/python3.6/site-packages/pyp2p/rendezvous_client.py", line 92, in server_connect
con.connect(server["addr"], server["port"])
File "/Users/eamonwhite/.pyenv/versions/3.6.3/lib/python3.6/site-packages/pyp2p/sock.py", line 189, in connect
self.s.bind((src_ip, 0))
TypeError: str, bytes or bytearray expected, not NoneType
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "serveralice.py", line 10, in <module>
alice.start()
File "/Users/eamonwhite/.pyenv/versions/3.6.3/lib/python3.6/site-packages/pyp2p/net.py", line 735, in start
raise Exception("Unable to connect to rendezvous server.")
Exception: Unable to connect to rendezvous server.
My relevant code looks like this:
from uuid import uuid4
from blockchain import Blockchain
from flask import Flask, jsonify, request
from pyp2p.net import *
import time
#Setup Alice's p2p node.
alice = Net(passive_bind="192.168.1.131", passive_port=44444, interface="en0", node_type="passive", debug=1)
alice.start()
alice.bootstrap()
alice.advertise()
while 1:
    for con in alice:
        for reply in con:
            print(reply)
    time.sleep(1)
...
It is getting stuck on the Net function right at the beginning, something to do with the rendezvous package. The IP is my IP on my network, and I port forwarded 44444, although I'm not sure if I need to do that or not. Thanks.
I am new to this; apparently, with the way the server code was configured, it needed a rendezvous server to work (a node that handles all the other nodes). It is defined in net.py of the pyp2p package:
# Bootstrapping + TCP hole punching server.
rendezvous_servers = [
    {
        "addr": "162.243.213.95",
        "port": 8000
    }
]
The address was the problem: it is just a placeholder IP. So then I realized I needed my own rendezvous server, and I used this code - https://raw.githubusercontent.com/StorjOld/pyp2p/master/pyp2p/rendezvous_server.py.
However, I had to debug this file a little; it ended up needing import sys, import time, and import re statements at the top before it would work. Now I am going to host it on my Raspberry Pi so that it is always up to handle nodes :)
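If you prefer not to edit the installed package, a sketch of an alternative (assuming net.py keeps rendezvous_servers as a module-level list, as quoted above) is to patch it before constructing the node:
import pyp2p.net

# Replace the package's dead placeholder with your own rendezvous server
# (203.0.113.10 is a documentation-only placeholder address).
pyp2p.net.rendezvous_servers = [
    {"addr": "203.0.113.10", "port": 8000}
]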

Redis Clustering with Python: StrictRedisCluster pubsub.subscribe gives KeyError

I have been trying to configure Redis Cluster with my Python web app (which currently uses a single Redis instance).
Since Redis clustering is supported from 3.x, I upgraded the Redis server version to 3.0.7.
I am using the redis-py-cluster v1.2.0 module in my Python app.
The PubSub commands to publish and subscribe to a particular channel do not seem to be working, failing with the KeyError shown in the stack trace below. It would be great if somebody could help with this.
from rediscluster import StrictRedisCluster

startup_nodes = [{"host": "127.0.0.1", "port": "6379"}]
redis_conn = StrictRedisCluster(startup_nodes=startup_nodes,
                                decode_responses=True)
pubsub = redis_conn.pubsub()
pubsub.subscribe(['a_b:c:d'])  # tried with different keys, getting the same error
Stack Trace of the Error:
File "/home/<username>/.pex/install/redis-2.10.2-py2-none-
any.whl.621ec5075459e032e02a8654e15c5ca320969b0b/redis-2.10.2-py2-none-
any.whl/redis/client.py",line 2196, in subscribe
ret_val = self.execute_command('SUBSCRIBE', *iterkeys(new_channels))
File "/home/<username>/.pex/install/redis_py_cluster-1.2.0-py2-none-
any.whl.5a485619e4e267bf90b972cf0e736baba918e3cc/redis_py_cluster-1.2.0-py2-
none-any.whl/rediscluster/pubsub.py",line 29, in execute_command
channel=args[1],
File "/home/<username>/.pex/install/redis_py_cluster-1.2.0-py2-none-
any.whl.5a485619e4e267bf90b972cf0e736baba918e3cc/redis_py_cluster-1.2.0-py2-
none-any.whl/rediscluster/connection.py",line 141, in get_connection
node = self.get_master_node_by_slot(slot)
File "/home/<username>/.pex/install/redis_py_cluster-1.2.0-py2-none-
any.whl.5a485619e4e267bf90b972cf0e736baba918e3cc/redis_py_cluster-1.2.0-py2-
none-any.whl/rediscluster/connection.py",line 267, inget_master_node_by_slot
return self.nodes.slots[slot][0]
KeyError: 7226
When I debugged, I found that self.nodes.slots seems to be empty. No clue why I get this error.
Thanks in advance,
Priya

Error using Python to read data from HBase through Thrift

I am trying to read data from HBase using Python. I installed Thrift, generated the gen-py files, and moved them into the Python lib directory ~/.local/lib/python2.7/site-packages/.
The HBase Thrift server is at 192.168.15.116:39090 and has been started.
My code runs on the server 192.168.15.146. On that machine, I can read the HBase data with the hbase shell command.
Here is my Python code:
#! /usr/bin/env python
from thrift import Thrift
from thrift.transport import TSocket, TTransport
from thrift.protocol import TBinaryProtocol
from hbase import Hbase
transport = TSocket.TSocket('192.168.15.116', 39090)
transport.setTimeout(5000)
#transport = TTransport.TBufferedTransport(transport)
protocol = TBinaryProtocol.TBinaryProtocol(transport)
client = Hbase.Client(protocol)
transport.open()
print(client.getTableNames())
Everything was OK except the last line; here is the error:
Traceback (most recent call last):
File "test01.py", line 24, in <module>
client.getTableNames()
File "~/.local/lib/python2.7/site-packages/hbase/Hbase.py", line 788, in getTableNames
return self.recv_getTableNames()
File "~/.local/lib/python2.7/site-packages/hbase/Hbase.py", line 803, in recv_getTableNames
raise x
thrift.Thrift.TApplicationException: Internal error processing getTableNames
I googled it but couldn't find a way to solve this error. Could anyone give me some help?
Thanks in advance!
I'm not sure, but after a quick look at the code it seems that the handler/processor code on the server throws an unexpected exception. Unexpected meaning: neither a Thrift one, nor one of the exceptions declared in the IDL file.
In that case the recommendation would be to check on the HBase side whether this is a known issue, etc.
Please check your Thrift host and port. You can also use happybase, for example:
import happybase

connection = happybase.Connection(host='hadoop_env.com', port=9090, timeout=1000000)
connection.open()
print(connection.tables())
There is another set of examples here:
https://github.com/Shadow-Hunter-X/python_practice_stepbystep/blob/master/python-on-bigdata/chapter6/chapter6_happybase.py
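If listing tables works, reading data follows the same happybase pattern; a short sketch (the table name, row key, and scan limit are hypothetical):
table = connection.table('my_table')      # hypothetical table name
print(table.row(b'row-1'))                # fetch a single row by key
for key, data in table.scan(limit=10):    # or scan the first few rows
    print(key, data)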
