Unable to connect to Cassandra with Python

I am trying to connect to Cassandra from Python. I installed the client with pip install pycassa. When I try to connect to Cassandra, I get the following exception:
from pycassa.pool import ConnectionPool
pool = ConnectionPool('Keyspace1')
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/lib/python2.7/site-packages/pycassa/pool.py", line 382, in __init__
self.fill()
File "/usr/lib/python2.7/site-packages/pycassa/pool.py", line 442, in fill
conn = self._create_connection()
File "/usr/lib/python2.7/site-packages/pycassa/pool.py", line 431, in _create_connection
(exc.__class__.__name__, exc))
pycassa.pool.AllServersUnavailable: An attempt was made to connect to each of the servers twice, but none of the attempts succeeded. The last failure was TTransportException: Could not connect to localhost:9160
I am using Python 2.7.
What is the problem? Any help would be appreciated.

Perhaps try specifying the host:
pool = ConnectionPool('Keyspace1', ['server_node_here:9160'])
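The TTransportException in the traceback means that nothing accepted a TCP connection on localhost:9160, for example because Cassandra is not running on that machine or is not listening for Thrift clients on that port. A minimal sketch for checking this with only the standard library before creating the pool; the helper name can_connect is made up for illustration and is not part of pycassa:

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to (host, port) succeeds within timeout."""
    try:
        sock = socket.create_connection((host, port), timeout=timeout)
    except socket.error:
        # Covers refused connections, unreachable hosts and timeouts
        return False
    sock.close()
    return True
```

For the setup in the question, can_connect("localhost", 9160) returning False would point at the server side rather than at pycassa.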

A general way to connect to Cassandra from Python (using the cassandra-driver package):
from cassandra.cluster import Cluster
cluster = Cluster()  # connects to localhost by default
# To connect to a multi-node cluster, pass the node addresses instead:
# cluster = Cluster(['192.168.0.1', '192.168.0.2'])
session = cluster.connect('testing')  # 'testing' is the keyspace
You can also connect through the cqlengine model classes:
from cassandra.cqlengine import columns
from cassandra.cqlengine.models import Model
from cassandra.cqlengine.management import sync_table
from cassandra.cqlengine import connection
import uuid
from datetime import datetime
connection.setup(['127.0.0.1'], "testing") #testing is the keyspace
For a detailed model class implementation, see: https://github.com/vishal-kr-yadav/NoSQL_Databases

Related

Teradataml error: 'mlengine_alias_definitions_v1.0' is not defined for the current Vantage version 'vantage1.0'

I'm trying to create an exe file that uses the teradataml Python package. The script creates a table in Teradata and loads data into it from a pandas DataFrame.
Here is my code:
import pandas as pd
from sqlalchemy import create_engine
from teradataml.context.context import *
from sqlalchemy import *
from teradataml.dataframe.copy_to import copy_to_sql
from sqlalchemy.dialects import registry
from teradatasqlalchemy import dialect
registry.register('teradata', 'teradatasqlalchemy', 'dialect')
user = 'dbc'
pasw=user
host = '192.168.1.7'
td_engine = create_engine('teradata://' + user + ':' + pasw + '@' + host)
create_context(tdsqlengine=td_engine)
df = pd.read_csv(r"C:/krishna/data/FL_insurance_sample1.csv", delimiter=',')
copy_to_sql(df = df, table_name = "Insurece_sample", primary_index="InsurenceID", if_exists="replace")
remove_context()
Initially I was getting the error below, but I have fixed that one:
sqlalchemy.exc.NoSuchModuleError: Can't load plugin: sqlalchemy.dialects:teradata
The pyinstaller command I tried:
pyinstaller --add-binary "C:\Users\krishna\AppData\Local\Programs\Python\Python38\Lib\site-packages\teradatasql\teradatasql.dll;teradatasql" -F pyinstalletest.py
The error I am getting now:
Traceback (most recent call last):
File "pyinstalletest.py", line 18, in <module>
File "teradataml\context\context.py", line 459, in create_context
File "teradataml\context\context.py", line 751, in _load_function_aliases
File "teradataml\common\utils.py", line 1591, in _check_alias_config_file_exists
teradataml.common.exceptions.TeradataMlException: [Teradata][teradataml](TDML_2069) Alias config file 'C:\Users\krishna\AppData\Local\Temp\_MEI63962\teradataml\config\mlengine_alias_definitions_v1.0' is not defined for the current Vantage version 'vantage1.0'. Please add the config file.
[1660] Failed to execute script pyinstalletest
Please help me resolve this error.

EOFError when trying to connect Pyftpsync to remote server on port 22

I am trying to sync two folders via FTP. I know there are better or different ways, but for now I need to implement it this way. I was trying the example code from pyftpsync, on the assumption that sample code should work easily. I am just trying to connect between some test folders I made: the local one is empty, and the remote one has a single text file that I want to fetch. It tries to connect, but after about 2 minutes I get the error shown below.
My FTP server does work outside of Python: I can connect with WinSCP just fine.
Some sources mention that a proxy could cause this. I do not appear to be behind a proxy, but maybe something is misconfigured and the client believes there should be one?
Here is my code. Running pyftpsync from the command prompt produces the same errors, so it is possible that some input parameter is off.
import time
import os
import re
import shutil
import string
import sys
from ftpsync.targets import FsTarget
from ftpsync.ftp_target import FtpTarget
from ftpsync.synchronizers import DownloadSynchronizer
# Synchronize a local folder with FTP
local = FsTarget("C:\\testfolder\\")
user = "login"
passwd = "password"
remote = FtpTarget("/my/folder/location/testfold/", "126.0.0.1", port=22, username=user, password=passwd, tls=False, timeout=None, extra_opts=None)
opts = {}
s = DownloadSynchronizer(local, remote, opts)
s.run()
This is the output I am getting (I have edited out the folder names and IP addresses):
INFO:keyring.backend:Loading KWallet
INFO:keyring.backend:Loading SecretService
INFO:keyring.backend:Loading Windows
INFO:keyring.backend:Loading chainer
INFO:keyring.backend:Loading macOS
INFO:pyftpsync:Download to C:\testfolder
from ftp://126.0.0.1/.../testfold
INFO:pyftpsync:Encoding local: utf-8, remote: utf-8
Traceback (most recent call last):
File "c:\..\.py", line 30, in <module>
s.run()
File "C:\\AppData\Local\Programs\Python\Python37-32\lib\site-packages\ftpsync\synchronizers.py", line 1268, in run
res = super(DownloadSynchronizer, self).run()
File "C:\\AppData\Local\Programs\Python\Python37-32\lib\site-packages\ftpsync\synchronizers.py", line 827, in run
res = super(BiDirSynchronizer, self).run()
File "C:\\AppData\Local\Programs\Python\Python37-32\lib\site-packages\ftpsync\synchronizers.py", line 211, in run
self.remote.open()
File "C:\\AppData\Local\Programs\Python\Python37-32\lib\site-packages\ftpsync\ftp_target.py", line 141, in open
self.ftp.connect(self.host, self.port)
File "C:\\AppData\Local\Programs\Python\Python37-32\lib\ftplib.py", line 155, in connect
self.welcome = self.getresp()
File "C:\\Local\Programs\Python\Python37-32\lib\ftplib.py", line 236, in getresp
resp = self.getmultiline()
File "C:\\AppData\Local\Programs\Python\Python37-32\lib\ftplib.py", line 226, in getmultiline
nextline = self.getline()
File "C:\\AppData\Local\Programs\Python\Python37-32\lib\ftplib.py", line 210, in getline
raise EOFError
EOFError
Anyway, any troubleshooting ideas would help. Thank you.
Pyftpsync uses the FTP protocol.
You are connecting to port 22, which is used for SSH/SFTP.
So if your server is actually an SFTP server, not an FTP server, you cannot use Pyftpsync with it.
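One way to tell which protocol a port speaks is to look at the first line the server sends after the TCP connection opens: an SSH/SFTP server starts with an identification string beginning with SSH-, while an FTP server greets with a numeric reply such as 220. The sketch below is a rough check under those assumptions, using only the standard library. The function names are invented for illustration and are not part of pyftpsync, and other protocols that also greet with a numeric code would be misreported as FTP.

```python
import socket


def classify_banner(banner):
    """Guess the protocol from the first line a server sends."""
    line = banner.strip()
    if line.startswith("SSH-"):
        return "ssh/sftp"
    # FTP replies start with a three-digit code; 220 means "service ready"
    if len(line) >= 3 and line[:3].isdigit():
        return "ftp"
    return "unknown"


def read_banner(host, port, timeout=10.0):
    """Connect and return the first chunk of data the server sends, as text."""
    sock = socket.create_connection((host, port), timeout=timeout)
    try:
        return sock.recv(256).decode("latin-1")
    finally:
        sock.close()
```

With the values from the question, classify_banner(read_banner("126.0.0.1", 22)) would be the call to try. A result of "ssh/sftp" would confirm that the server on that port is not a plain FTP server.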

Unable to connect to Hive2 using Python

While connecting to Hive2 using Python with the code below:
import pyhs2

with pyhs2.connect(host='localhost',
                   port=10000,
                   authMechanism="PLAIN",
                   user='root',
                   password='test',
                   database='default') as conn:
    with conn.cursor() as cur:
        # Show databases
        print(cur.getDatabases())
        # Execute query
        cur.execute("select * from table")
        # Return column info from query
        print(cur.getSchema())
        # Fetch table results
        for i in cur.fetch():
            print(i)
I am getting the error below:
File "C:\Users\vinbhask\AppData\Roaming\Python\Python36\site-packages\pyhs2-0.6.0-py3.6.egg\pyhs2\connections.py", line 7, in <module>
from cloudera.thrift_sasl import TSaslClientTransport
ModuleNotFoundError: No module named 'cloudera'
I have tried the solutions suggested in other questions, but the issue was not resolved.
Here are the packages installed so far:
bitarray 0.8.1, certifi 2017.7.27.1, chardet 3.0.4, cm-api 16.0.0, cx-Oracle 6.0.1, future 0.16.0, idna 2.6, impyla 0.14.0, JayDeBeApi 1.1.1, JPype1 0.6.2, ply 3.10, pure-sasl 0.4.0, PyHive 0.4.0, pyhs2 0.6.0, pyodbc 4.0.17, requests 2.18.4, sasl 0.2.1, six 1.10.0, teradata 15.10.0.21, thrift 0.10.0, thrift-sasl 0.2.1, thriftpy 0.3.9, urllib3 1.22
Error while using Impyla:
Traceback (most recent call last):
File "C:\Users\xxxxx\AppData\Local\Programs\Python\Python36-32\Scripts\HiveConnTester4.py", line 1, in <module>
from impala.dbapi import connect
File "C:\Users\xxxxx\AppData\Local\Programs\Python\Python36-32\lib\site-packages\impala\dbapi.py", line 28, in <module>
import impala.hiveserver2 as hs2
File "C:\Users\xxxxx\AppData\Local\Programs\Python\Python36-32\lib\site-packages\impala\hiveserver2.py", line 33, in <module>
from impala._thrift_api import (
File "C:\Users\xxxxx\AppData\Local\Programs\Python\Python36-32\lib\site-packages\impala\_thrift_api.py", line 74, in <module>
include_dirs=[thrift_dir])
File "C:\Users\xxxxx\AppData\Local\Programs\Python\Python36-32\lib\site-packages\thriftpy\parser\__init__.py", line 30, in load
include_dir=include_dir)
File "C:\Users\xxxxx\AppData\Local\Programs\Python\Python36-32\lib\site-packages\thriftpy\parser\parser.py", line 496, in parse
url_scheme))
thriftpy.parser.exc.ThriftParserError: ThriftPy does not support generating module with path in protocol 'c'
thrift_sasl.py tries to import cStringIO, which is no longer available in Python 3. Try with Python 2?
You may need to install an unreleased version of thrift_sasl. Try:
pip install git+https://github.com/cloudera/thrift_sasl
If you're comfortable learning PySpark, then you just need to set the hive.metastore.uris property to point at the Hive Metastore address, and you're ready to go.
The easiest way to do that would be to export hive-site.xml from your cluster, then pass --files hive-site.xml during spark-submit.
(I haven't tried running standalone Pyspark, so YMMV)
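As for the Impyla traceback in the question, the message mentions protocol 'c', which suggests that a Windows path such as C:\Users\... was parsed as a URL, so that the drive letter ended up as the URL scheme. That reading is an assumption about thriftpy's internals, but the parsing behaviour itself can be reproduced with the standard library on Python 3 (the paths below are made-up examples):

```python
from urllib.parse import urlparse

# Hypothetical paths to a .thrift file, for illustration only
windows_path = r"C:\Users\xxxxx\thrift\example.thrift"
posix_path = "/home/xxxxx/thrift/example.thrift"

windows_scheme = urlparse(windows_path).scheme
posix_scheme = urlparse(posix_path).scheme

print(repr(windows_scheme))  # the drive letter is reported as the scheme
print(repr(posix_scheme))    # no scheme for a POSIX-style path
```

If that is what happens here, the error depends on running under Windows with a drive-letter path, not on the Hive server or the connection settings.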

How can I list/add Nodes to my Pylons app from the Python command line?

I have a Pylons application that I inherited. It has a MySQL database. There is a model called Node in the application. I want to first list all the Nodes in the database. Then I'd like to be able to add a node. So far, I have been trying:
import myapp.model as model
nodes = model.Session.query(model.Node).all()
for node in nodes:
    print node
The above code throws an error which I have seen in other questions like this one:
Traceback (most recent call last):
File "<stdin>", line 1, in ?
File "/usr/lib/python2.4/site-packages/sqlalchemy/orm/query.py", line 1579, in all
return list(self)
File "/usr/lib/python2.4/site-packages/sqlalchemy/orm/query.py", line 1689, in __iter__
return self._execute_and_instances(context)
File "/usr/lib/python2.4/site-packages/sqlalchemy/orm/query.py", line 1694, in _execute_and_instances
mapper=self._mapper_zero_or_none())
File "/usr/lib/python2.4/site-packages/sqlalchemy/orm/session.py", line 717, in execute
engine = self.get_bind(mapper, clause=clause, **kw)
File "/usr/lib/python2.4/site-packages/sqlalchemy/orm/session.py", line 851, in get_bind
raise sa_exc.UnboundExecutionError(
sqlalchemy.exc.UnboundExecutionError: Could not locate a bind configured on mapper Mapper|Node|node, SQL expression or this Session
I feel like I am missing a step or something. I'm accustomed to working with models in Django, and this is my first time working with a Pylons application. I think this has something to do with Sessions, but I'm not sure. Does anyone know how I could list all the Nodes and then add a Node?
I figured out what was wrong. I never set up a session with my database:
from sqlalchemy import create_engine
from sqlalchemy.orm import sessionmaker
import myapp.model as model
# Create an engine for the MySQL database and bind a session factory to it
engine = create_engine("mysql://username:password@localhost/myapp")
Session = sessionmaker(bind=engine)
session = Session()
nodes = session.query(model.Node).all()
for node in nodes:
    print node
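The question also asks how to add a Node. With a bound session this is typically done with session.add() followed by session.commit(). The sketch below is self-contained so that it can run without the Pylons app: it defines a stand-in Node model with a made-up name column and uses an in-memory SQLite database instead of MySQL. In the real application you would use model.Node and the MySQL engine, and the constructor arguments depend on how Node is actually defined.

```python
from sqlalchemy import Column, Integer, String, create_engine
from sqlalchemy.orm import sessionmaker

try:
    from sqlalchemy.orm import declarative_base  # SQLAlchemy 1.4 and later
except ImportError:
    from sqlalchemy.ext.declarative import declarative_base  # older releases

Base = declarative_base()


class Node(Base):
    """Stand-in for myapp.model.Node; the columns are assumptions."""
    __tablename__ = "node"
    id = Column(Integer, primary_key=True)
    name = Column(String(50))


# In-memory SQLite keeps the example runnable without a MySQL server
engine = create_engine("sqlite://")
Base.metadata.create_all(engine)

Session = sessionmaker(bind=engine)
session = Session()

# Add a new row, then commit so it is written to the database
session.add(Node(name="example-node"))
session.commit()

# List all rows to confirm the insert
names = [node.name for node in session.query(Node).all()]
print(names)
```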

PyHive is Hanging on Connection -- Thrift_sasl appears to be the issue

from pyhive import hive
import thrift_sasl
connection = hive.Connection(host='myhost', port=10000, database='local')
# hangs here
from sqlalchemy import create_engine, MetaData, Table
engine = create_engine('hive://myhost:10000/local')
logs = Table('mytable', MetaData(bind=engine), autoload=True)
# also hangs here
Both of these snippets will hang for me.
Hitting Ctrl+C stops the execution here:
^CTraceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/apps/Python/lib/python2.7/site-packages/pyhive/hive.py", line 86, in __init__
self._transport.open()
File "thrift_sasl.py", line 74, in open
status, payload = self._recv_sasl_message()
File "thrift_sasl.py", line 92, in _recv_sasl_message
header = self._trans.readAll(5)
File "/apps/Python/lib/python2.7/site-packages/thrift/transport/TTransport.py", line 58, in readAll
chunk = self.read(sz - have)
File "/apps/Python/lib/python2.7/site-packages/thrift/transport/TSocket.py", line 105, in read
buff = self.handle.recv(sz)
KeyboardInterrupt
I am using Hive 0.12 and HiveServer2. I can connect to it using the Python Hive library provided with Hadoop (.../hive/lib/py), but cannot do so with pyhive, which uses thrift_sasl.
Some people not using the thrift_sasl module suggest turning off SASL support in hive-site.xml via:
<property><name>hive.server2.authentication</name><value>NOSASL</value></property>
However, after trying this, the code still hung, with the same stack trace when I issued a KeyboardInterrupt.
