Error using Python to read data from HBase through Thrift - python

I am trying to read data from HBase using Python. I installed Thrift, generated the gen-py files, and moved them into the Python library directory: ~/.local/lib/python2.7/site-packages/
The HBase Thrift server is at 192.168.15.116:39090, and it has been started.
My code runs on the server 192.168.15.146. On this machine, I can use the hbase shell command to read the HBase data.
Here is my Python code:
#! /usr/bin/env python
from thrift import Thrift
from thrift.transport import TSocket, TTransport
from thrift.protocol import TBinaryProtocol
from hbase import Hbase
transport = TSocket.TSocket('192.168.15.116', 39090)
transport.setTimeout(5000)
#transport = TTransport.TBufferedTransport(transport)
protocol = TBinaryProtocol.TBinaryProtocol(transport)
client = Hbase.Client(protocol)
transport.open()
print(client.getTableNames())
Everything works except the last line, which produces this error:
Traceback (most recent call last):
File "test01.py", line 24, in <module>
client.getTableNames()
File "~/.local/lib/python2.7/site-packages/hbase/Hbase.py", line 788, in getTableNames
return self.recv_getTableNames()
File "~/.local/lib/python2.7/site-packages/hbase/Hbase.py", line 803, in recv_getTableNames
raise x
thrift.Thrift.TApplicationException: Internal error processing getTableNames
I googled it but couldn't find a way to solve this error. Could anyone give me some help?
Thanks in advance!

I'm not sure, but after a quick look at the code it seems that the handler/processor code on the server throws an unexpected exception. Unexpected here means neither a Thrift exception nor one of the exceptions declared in the IDL file.
In that case, the recommendation would be to look on the HBase side and check whether this is a known issue.

Please check your Thrift host and port. You can also use happybase, for example:
import happybase
connection = happybase.Connection(host='hadoop_env.com', port=9090, timeout=1000000)
connection.open()
print(connection.tables())
More examples are available here:
https://github.com/Shadow-Hunter-X/python_practice_stepbystep/blob/master/python-on-bigdata/chapter6/chapter6_happybase.py

Related

Executing sparql query from python to virtuoso server in linux?

I am having a problem running the following program (sparql_test.py). I am running it on a Linux machine, and the Virtuoso server is installed on the same machine. On this Linux server I have neither sudo permission nor browser access. However, I can execute SPARQL queries from the isql prompt (SQL>) successfully.
Program: sparql_test.py
from SPARQLWrapper import SPARQLWrapper, JSON
sparql = SPARQLWrapper("http://localhost:8890/sparql")
sparql.setQuery("select ?s where { ?s a <http://ehrofip.com/data/Admissions>.} limit 10")
sparql.setReturnFormat(JSON)
result = sparql.query().convert()
for res in result["results"]["bindings"]:
    print(res)
I got the following error:
[suresh@deodar complex2vec]$ python sparql_test.py
Traceback (most recent call last):
File "sparql1.py", line 14, in "<module>"
result = sparql.query().convert()
File "/home/suresh/.local/lib/python2.7/site-packages/SPARQLWrapper/Wrapper.py", line 687, in query
return QueryResult(self._query())
File "/home/suresh/.local/lib/python2.7/site-packages/SPARQLWrapper/Wrapper.py", line 667, in _query
raise e
urllib2.HTTPError: HTTP Error 502: Bad Gateway
However, the same program runs smoothly on my own laptop. What might be the problem? Is this a connection issue?
Thank you
Best,
Suresh
I do not believe this error is raised by Virtuoso. I believe it is raised by SPARQLWrapper.
It looks like there's something between the outside world (which includes the Linux machine itself) and the Virtuoso listener on port 8890. The "Bad Gateway" suggests there may be two things -- a reverse proxy, and a firewall.
Port 8890 (set as [HttpServer]:Listen in the INI file) must be open to communications, direct or proxied, for SPARQL access to work.
iSQL talks to port 1111 (set as [Parameters]:Listen in the INI file), which apparently doesn't have a similar block/proxy.
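One way to tell a blocked port apart from a proxy problem is to test the TCP connection directly from the machine that runs the script. The following is a minimal sketch using only the standard library; the helper name port_is_open is hypothetical, and the host and ports are the ones mentioned above.

```python
import socket


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to (host, port) succeeds."""
    try:
        # create_connection resolves the host and attempts a TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, timed out, unreachable, or name resolution failed.
        return False


if __name__ == "__main__":
    # 8890 is the HTTP/SPARQL listener, 1111 is the iSQL listener.
    for port in (8890, 1111):
        print(port, port_is_open("localhost", port))
```

If port 8890 turns out to be reachable this way but SPARQLWrapper still gets a 502, it may be worth checking whether an http_proxy environment variable is set, since urllib honors it and the request may not be going to Virtuoso directly.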

Python/P2P - Unable to connect to rendezvous server

I am trying to create a P2P node using python (pyp2p) but I am getting this error:
Eamons-MacBook-Pro:blockchain eamonwhite$ python3 serveralice.py
HTTP Error 404: Not Found
HTTP Error 404: Not Found
HTTP Error 404: Not Found
HTTP Error 404: Not Found
Traceback (most recent call last):
File "/Users/eamonwhite/.pyenv/versions/3.6.3/lib/python3.6/site-packages/pyp2p/net.py", line 732, in start
rendezvous_con = self.rendezvous.server_connect()
File "/Users/eamonwhite/.pyenv/versions/3.6.3/lib/python3.6/site-packages/pyp2p/rendezvous_client.py", line 92, in server_connect
con.connect(server["addr"], server["port"])
File "/Users/eamonwhite/.pyenv/versions/3.6.3/lib/python3.6/site-packages/pyp2p/sock.py", line 189, in connect
self.s.bind((src_ip, 0))
TypeError: str, bytes or bytearray expected, not NoneType
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "serveralice.py", line 10, in <module>
alice.start()
File "/Users/eamonwhite/.pyenv/versions/3.6.3/lib/python3.6/site-packages/pyp2p/net.py", line 735, in start
raise Exception("Unable to connect to rendezvous server.")
Exception: Unable to connect to rendezvous server.
My relevant code looks like this:
from uuid import uuid4
from blockchain import Blockchain
from flask import Flask, jsonify, request
from pyp2p.net import *
import time
#Setup Alice's p2p node.
alice = Net(passive_bind="192.168.1.131", passive_port=44444, interface="en0", node_type="passive", debug=1)
alice.start()
alice.bootstrap()
alice.advertise()
while 1:
    for con in alice:
        for reply in con:
            print(reply)
    time.sleep(1)
...
It fails right at the beginning, when the Net node starts; it has something to do with the rendezvous code. The IP is my address on my local network, and I forwarded port 44444, although I'm not sure whether I need to. Thanks.
I am new to this. Apparently, the way the server code is configured, it needs a rendezvous server to work (a node that handles all the other nodes). The setting is in net.py of the pyp2p package:
# Bootstrapping + TCP hole punching server.
rendezvous_servers = [
    {
        "addr": "162.243.213.95",
        "port": 8000
    }
]
The address was the problem; it is evidently just a placeholder IP. I realized I needed my own rendezvous server, so I used this code: https://raw.githubusercontent.com/StorjOld/pyp2p/master/pyp2p/rendezvous_server.py.
However, I had to debug this file a little: it needed import sys, import time, and import re statements at the top before it would work. Now I am going to host it on my Raspberry Pi so that it is always up to handle nodes :)

Redis Clustering with Python: StrictRedisCluster pubsub.subscribe gives KeyError

I have been trying to configure Redis Cluster with my Python web app (which currently uses a single Redis instance).
Since Redis clustering is supported from 3.x, I upgraded the Redis server to version 3.0.7.
I am using the redis-py-cluster v1.2.0 module in my Python app.
The PubSub commands to publish and subscribe to a particular channel do not seem to work; they fail with the KeyError shown in the stack trace below. It would be great if somebody could help with this.
from rediscluster import StrictRedisCluster
startup_nodes = [{"host": "127.0.0.1", "port": "6379"}]
redis_conn = StrictRedisCluster(startup_nodes=startup_nodes,
                                decode_responses=True)
pubsub = redis_conn.pubsub()
pubsub.subscribe(['a_b:c:d'])  # tried with different keys, getting the same error
Stack Trace of the Error:
File "/home/<username>/.pex/install/redis-2.10.2-py2-none-any.whl.621ec5075459e032e02a8654e15c5ca320969b0b/redis-2.10.2-py2-none-any.whl/redis/client.py", line 2196, in subscribe
ret_val = self.execute_command('SUBSCRIBE', *iterkeys(new_channels))
File "/home/<username>/.pex/install/redis_py_cluster-1.2.0-py2-none-any.whl.5a485619e4e267bf90b972cf0e736baba918e3cc/redis_py_cluster-1.2.0-py2-none-any.whl/rediscluster/pubsub.py", line 29, in execute_command
channel=args[1],
File "/home/<username>/.pex/install/redis_py_cluster-1.2.0-py2-none-any.whl.5a485619e4e267bf90b972cf0e736baba918e3cc/redis_py_cluster-1.2.0-py2-none-any.whl/rediscluster/connection.py", line 141, in get_connection
node = self.get_master_node_by_slot(slot)
File "/home/<username>/.pex/install/redis_py_cluster-1.2.0-py2-none-any.whl.5a485619e4e267bf90b972cf0e736baba918e3cc/redis_py_cluster-1.2.0-py2-none-any.whl/rediscluster/connection.py", line 267, in get_master_node_by_slot
return self.nodes.slots[slot][0]
KeyError: 7226
While debugging, I found that self.nodes.slots seems to be empty. I have no clue why I get this error.
Thanks in advance
Priya
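For context, the number in the KeyError looks like a hash slot. Redis Cluster maps every key to one of 16384 slots using CRC16 of the key modulo 16384, honoring {...} hash tags, and judging by the traceback the client does the same lookup for the channel name. Below is a minimal sketch of that mapping, not the library's own code; the helper name key_slot is hypothetical, and it assumes binascii.crc_hqx (CRC-CCITT, polynomial 0x1021, initial value 0) matches the CRC16 variant the cluster specification describes.

```python
import binascii

SLOT_COUNT = 16384  # number of hash slots in Redis Cluster


def key_slot(key):
    """Compute the Redis Cluster hash slot for a key (str or bytes)."""
    data = key.encode("utf-8") if isinstance(key, str) else key
    # Hash tag rule: if the key contains "{...}" with a non-empty body,
    # only the text between the first "{" and the next "}" is hashed.
    start = data.find(b"{")
    if start != -1:
        end = data.find(b"}", start + 1)
        if end != -1 and end != start + 1:
            data = data[start + 1:end]
    return binascii.crc_hqx(data, 0) % SLOT_COUNT


print(key_slot("a_b:c:d"))
```

Under that reading, a KeyError on self.nodes.slots[slot] means the client has no node recorded for the computed slot, which is consistent with the slot table being empty.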

python and kafka KeyError: -1

The Python Kafka client does not work. Any reason why? I can't even connect to Kafka; I just get an error. Is there a Kafka client that works with 0.8? This is a brand new server that I just booted up.
I am using https://github.com/mumrah/kafka-python
from kafka.client import KafkaClient
kafka = KafkaClient(kafka_domain, 9092)
Traceback (most recent call last):
File "/home/ubuntu/workspace/rtbhui-devops/simulations/pixel_druid_simulations.py", line 36, in <module>
kafka = KafkaClient(kafka_domain, 9092)
File "/usr/local/lib/python2.7/dist-packages/kafka/client.py", line 38, in __init__
self.load_metadata_for_topics() # bootstrap with all metadata
File "/usr/local/lib/python2.7/dist-packages/kafka/client.py", line 247, in load_metadata_for_topics
self.topics_to_brokers[topic_part] = brokers[meta.leader]
KeyError: -1
In the Kafka logs I see the following:
[2014-02-26 08:36:21,471] INFO Closing socket connection to /222.127.xxx.xxx. (kafka.network.Processor)
[2014-02-26 08:40:30,801] ERROR [KafkaApi-1393401480] Error while fetching metadata for partition [topic-pixel,0] (kafka.server.KafkaApis)
kafka.common.LeaderNotAvailableException: Leader not available for partition [topic-pixel,0]
at kafka.server.KafkaApis$$anonfun$17$$anonfun$20.apply(KafkaApis.scala:474)
at kafka.server.KafkaApis$$anonfun$17$$anonfun$20.apply(KafkaApis.scala:462)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:206)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:206)
at scala.collection.LinearSeqOptimized$class.foreach(LinearSeqOptimized.scala:61)
at scala.collection.immutable.List.foreach(List.scala:45)
at scala.collection.TraversableLike$class.map(TraversableLike.scala:206)
at scala.collection.immutable.List.map(List.scala:45)
at kafka.server.KafkaApis$$anonfun$17.apply(KafkaApis.scala:462)
at kafka.server.KafkaApis$$anonfun$17.apply(KafkaApis.scala:458)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:206)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:206)
at scala.collection.immutable.Set$Set1.foreach(Set.scala:81)
at scala.collection.TraversableLike$class.map(TraversableLike.scala:206)
at scala.collection.immutable.Set$Set1.map(Set.scala:68)
at kafka.server.KafkaApis.handleTopicMetadataRequest(KafkaApis.scala:458)
at kafka.server.KafkaApis.handle(KafkaApis.scala:68)
at kafka.server.KafkaRequestHandler.run(KafkaRequestHandler.scala:42)
at java.lang.Thread.run(Thread.java:744)
This is fixed in kafka-python version 0.9.0. The error is internal to kafka-python, not the Kafka server, and simply involves handling partitions that are currently without a leader (which happens for any new topic when you auto-create topics, but is otherwise fairly rare under normal operation).
See
https://github.com/mumrah/kafka-python/pull/109
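To illustrate the failure mode: the traceback shows the client using meta.leader directly as a dictionary key, and a partition without a leader is reported with a leader of -1. The sketch below reproduces that pattern with plain dictionaries and shows one way to guard against it. It is not the actual kafka-python patch; PartitionMeta and map_partitions_to_brokers are hypothetical names introduced for illustration.

```python
from collections import namedtuple

# Hypothetical stand-in for the per-partition metadata a client receives.
PartitionMeta = namedtuple("PartitionMeta", ["topic", "partition", "leader"])

NO_LEADER = -1  # value reported when a partition has no leader yet


def map_partitions_to_brokers(brokers, partitions):
    """Map (topic, partition) to its leader broker, or None if leaderless."""
    mapping = {}
    for meta in partitions:
        key = (meta.topic, meta.partition)
        if meta.leader == NO_LEADER:
            # Indexing brokers[-1] here would raise KeyError: -1.
            mapping[key] = None
        else:
            mapping[key] = brokers[meta.leader]
    return mapping


brokers = {0: ("kafka-host", 9092)}
partitions = [
    PartitionMeta("topic-pixel", 0, NO_LEADER),
    PartitionMeta("other-topic", 0, 0),
]
print(map_partitions_to_brokers(brokers, partitions))
```

With a guard like this, callers would need to treat a None leader as a signal to refresh metadata and retry, rather than as a usable broker.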

Python Pysftp Error

My code:
import pysftp
s = pysftp.Connection(host='test.rebex.net', username='demo', password='password')
data = s.listdir()
s.close()
for i in data:
    print(i)
I'm getting an error trying to connect to a SFTP server using pysftp.
This should be straightforward enough, but I get the error below:
Traceback (most recent call last):
File "/Users/gavinhinfey/Documents/Python Files/sftp_test.py", line 3, in <module>
s = pysftp.Connection(host='test.rebex.net', username='demo', password='password')
File "build/bdist.macosx-10.6-intel/egg/pysftp.py", line 55, in __init__
File "build/bdist.macosx-10.5-intel/egg/paramiko/transport.py", line 303, in __init__
paramiko.SSHException: Unable to connect to test.rebex.net: [Errno 60] Operation timed out
Exception AttributeError: "'Connection' object has no attribute '_tranport_live'" in <bound method Connection.__del__ of <pysftp.Connection object at 0x101a5a810>> ignored
I've tried using different versions of python (mostly 2.7), I have all dependencies installed and I tried numerous sftp connections.
I'm using OS X 10.9.1.
Updating the package didn't work for me, as it was already up to date (the latest for Python 2.7, at least).
I found a better approach here.
1) You can manually add the SSH key to the known_hosts file by connecting once with ssh:
ssh test.rebex.net
2) Or you can set a flag to ignore it:
import pysftp
cnopts = pysftp.CnOpts()
cnopts.hostkeys = None  # disable host key checking
# private_key and private_key_password are placeholders for your own values
with pysftp.Connection('host', username='me', private_key=private_key,
                       private_key_pass=private_key_password,
                       cnopts=cnopts) as sftp:
    pass  # do stuff here
The initial error appears to be a problem connecting to the remote server (SSHException). The second (AttributeError) comes from a bug in the code that is triggered when the connection fails. It is fixed in the latest version of pysftp:
https://pypi.python.org/pypi/pysftp
pip install -U pysftp
is your friend.
@Martin.Prikryl: Setting hostkeys = None is very useful in the initial stage of coding with pysftp. Debugging a program that keeps failing on a known exception hides other problems that need attention, such as making an actual connection. I can deal with the man-in-the-middle problem later, once I know my code is actually working correctly.
@All: The current pysftp.CnOpts() object appears to have a bug:
cnopts = pysftp.CnOpts()
cnopts.hostkeys = None
The above code does not prevent host key checking.
python getfile_v3.py --help
Traceback (most recent call last):
File "getfile_v3.py", line 9, in <module>
cnopts = pysftp.CnOpts()
File "c:\Program Files\Python\Python38\lib\site-packages\pysftp\__init__.py", line 64, in __init__
raise HostKeysException('No Host Keys Found')
pysftp.exceptions.HostKeysException: No Host Keys Found
The second line is never executed because the constructor on the first line already performs the host key check by default. If I set the key with:
cnopts = pysftp.CnOpts(hostkeys=None)
the same error results.
It appears that 'hostkeys' has been deprecated, and there is no way to disable the host key check.
Joe White
Workaround for "SSHException: No hostkey for host test.rebex.net found":
Add the host key manually via ssh:
ssh demo@test.rebex.net
It will show you this message:
The authenticity of host 'test.rebex.net (ip-address)' can't be established.
ECDSA key fingerprint is SHA256:OzvpQxxxV9F/ECMXbQ7B7zbKxxxxUno65c.
Are you sure you want to continue connecting (yes/no/[fingerprint])?
Confirm by typing "yes". You will then get this message:
Warning: Permanently added 'test.rebex.net,ip' (ECDSA) to the list of known hosts.
It will then ask for a password; enter "password".
Done. From now on, your SFTP connection to 'test.rebex.net' should work.
