Using Python to connect to Impala database (thriftpy error) - python

What I'm trying to do is very basic: connect to an Impala db using Python:
from impala.dbapi import connect
conn = connect(host='impala', port=21050, auth_mechanism='PLAIN')
I'm using Impyla package to do so. I got this error:
Traceback (most recent call last):
File "/usr/local/lib/python3.6/dist-packages/thriftpy/transport/socket.py", line 96, in open
self.sock.connect(addr)
socket.gaierror: [Errno -3] Temporary failure in name resolution
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/alaaeddine/PycharmProjects/test/data_test.py", line 3, in <module>
conn = connect(host='impala', port=21050, auth_mechanism='PLAIN')
File "/usr/local/lib/python3.6/dist-packages/impala/dbapi.py", line 147, in connect
auth_mechanism=auth_mechanism)
File "/usr/local/lib/python3.6/dist-packages/impala/hiveserver2.py", line 758, in connect
transport.open()
File "/usr/local/lib/python3.6/dist-packages/thrift_sasl/__init__.py", line 61, in open
self._trans.open()
File "/usr/local/lib/python3.6/dist-packages/thriftpy/transport/socket.py", line 104, in open
message="Could not connect to %s" % str(addr))
thriftpy.transport.TTransportException: TTransportException(type=1, message="Could not connect to ('impala', 21050)")
Tried also the Ibis package but failed with the same thriftpy related error.
In Windows using Dbeaver, I could connect to the database using the official Cloudera JDBC connector. My questions are:
Should I pass my JDBC connector as a parameter in my connect code? I searched but could not find anything pointing in this direction.
Should I try something other than the Ibis and Impyla packages? I have run into a lot of version and dependency issues when using them. If so, what would you recommend as alternatives?
Thanks!

Solved:
I used pyhive package instead of Ibis/Impyla. Here's an example:
#import hive from pyhive
from pyhive import hive
#establish the connection to the db
conn = hive.Connection(host='host_IP_addr', port='conn_port', auth='auth_type', database='my_db')
#prepare the cursor for the queries
cursor = conn.cursor()
#execute a query
cursor.execute("SHOW TABLES")
#navigate and display the results
for table in cursor.fetchall():
    print(table)

Your impala host name is probably not resolving. Can you run nslookup impala from a command prompt? If you're using Docker, the service must be named "impala" in docker-compose, or you need an "extra_hosts" entry for it. Alternatively, you can add it to /etc/hosts (C:\Windows\System32\drivers\etc\hosts on Windows) as 127.0.0.1 impala
Also try 'NOSASL' instead of 'PLAIN'; it sometimes works better when security is turned off.
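The name-resolution theory can be checked from Python before Impyla is involved at all. The sketch below uses only the standard library; the helper name can_resolve is made up for this example, and the default port is simply the one from the question.

```python
import socket


def can_resolve(host, port=21050):
    """Return True if `host` resolves to at least one address."""
    try:
        socket.getaddrinfo(host, port)
        return True
    except socket.gaierror:
        # Same error class as "Temporary failure in name resolution"
        # in the traceback from the question.
        return False
```

If can_resolve('impala') returns False, the problem is in DNS or the hosts file rather than in Impyla or thriftpy.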

This is a simple method: run the query through impala-shell from Python.
import subprocess
query1 = "select * from table_name limit 10"
impalad = 'hostname'
port = '21000'
# Build the impala-shell command line (-k: Kerberos, -B/--delimited: plain delimited output)
result_string = 'impala-shell -i "' + impalad + ':' + port + '" -k -B --delimited -q "' + query1 + '"'
# getstatusoutput runs the command through the shell and returns (exit status, output)
status, output = subprocess.getstatusoutput(result_string)
if status == 0:
    print(output)
else:
    print("Error encountered while executing HiveQL queries.")
    print(output)
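Building the command as one quoted string breaks as soon as the query itself contains a double quote. A possible variation, sketched here with made-up helper names (build_impala_shell_cmd, run_query), passes the arguments as a list so no shell quoting is needed. The flags are copied unchanged from the answer above.

```python
import subprocess


def build_impala_shell_cmd(host, port, query):
    """Build the impala-shell argument list (flags copied from the answer)."""
    return [
        "impala-shell",
        "-i", "{}:{}".format(host, port),
        "-k",
        "-B",
        "--delimited",
        "-q", query,
    ]


def run_query(host, port, query):
    """Run the query and return (exit status, combined stdout/stderr)."""
    proc = subprocess.run(
        build_impala_shell_cmd(host, port, query),
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,
    )
    return proc.returncode, proc.stdout
```

run_query('hostname', '21000', 'select 1') would then return the same kind of status and output pair as the snippet above, assuming impala-shell is on the PATH.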

Related

How to invoke environmental variables while connecting to DB2 database using python

I have saved the DB2 username and DB2 password as 'DB2_USER' and 'DB2_PASS' in .bashrc(linux). And, I'm trying to invoke them in my python program to connect to DB2 database.
.bashrc content:
export DB2_USER="my_username"
export DB2_PASS="my_password"
My python code snippet:
db_user = os.environ.get('DB2_USER')
db_password = os.environ.get('DB2_PASS')
conn = ibm_db.connect('DRIVER={IBM DB2 ODBC DRIVER};DATABASE=<my_db>;HOSTNAME=<my_hostname>;PORT=50000;PROTOCOL=TCPIP;UID=db_user;pwd=db_password','','')
When I execute the above code, I get the error below. Is there another way to do this?
Traceback (most recent call last):
File "test.py", line 38, in <module>
conn = ibm_db.connect('DRIVER={IBM DB2 ODBC DRIVER};DATABASE=<my_db>;HOSTNAME=<my_hostname>;PORT=50000;PROTOCOL=TCPIP;UID=db_user;pwd=db_password','','')
Exception: [IBM][CLI Driver] SQL30082N Security processing failed
with reason "24" ("USERNAME AND/OR PASSWORD INVALID"). SQLSTATE=08001
SQLCODE=-30082
My earlier code passed the literal strings db_user and db_password as the username and password. Building the connection string by concatenation fixed it:
conn = ibm_db.connect('DRIVER={IBM DB2 ODBC DRIVER};DATABASE=<my_db>;HOSTNAME=<my_hostname>;PORT=50000;PROTOCOL=TCPIP;UID=' + db_user + ';PWD=' + db_password, '', '')
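The same idea can be written with str.format and a check that the variables are actually set, which turns a missing variable into a clear error instead of a failed login. This is only a sketch: the helper name build_db2_conn_string and its parameters are invented for illustration, while DB2_USER and DB2_PASS are the variable names from the question.

```python
import os


def build_db2_conn_string(database, hostname, port=50000):
    """Assemble the connection string from DB2_USER and DB2_PASS."""
    user = os.environ.get("DB2_USER")
    password = os.environ.get("DB2_PASS")
    if not user or not password:
        raise RuntimeError("DB2_USER and DB2_PASS must be set")
    # Doubled braces produce the literal {IBM DB2 ODBC DRIVER}.
    return (
        "DRIVER={{IBM DB2 ODBC DRIVER}};"
        "DATABASE={db};HOSTNAME={host};PORT={port};PROTOCOL=TCPIP;"
        "UID={uid};PWD={pwd}"
    ).format(db=database, host=hostname, port=port, uid=user, pwd=password)
```

The result would be passed to ibm_db.connect(conn_string, '', '') exactly as in the question.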

Sqlalchemy setup for postgresql with timescaledb extension [duplicate]

This question already has an answer here:
Postgres error after updating TimescaleDB on Ubuntu: file not found
(1 answer)
Closed 2 years ago.
I was trying to hook up the sqlalchemy with my underlying postgresql, which uses the timescaledb extension. All queries work fine when I try them from the psql terminal client. But when I try to use python & sqlalchemy to do it, it keeps throwing me an error.
Here's the very basic code snippet that I try to test it with:
engine = create_engine('postgres://usr:pwd@localhost:5432/postgres', echo=True)
engine.execute('select 1;')
And it always shows the following error message:
File "/home/usr/.local/share/virtualenvs/redbird-lRSbFM0t/lib/python3.6/site-packages/psycopg2/extras.py", line 917, in get_oids
""" % typarray)
sqlalchemy.exc.OperationalError: (psycopg2.OperationalError) could not access file "timescaledb-0.9.0": No such file or directory
The connection to the db is fine, otherwise it won't know the db is using timescaledb.
Does anyone have any insights?
UPDATE: I tried using psycopg2 directly. It gives essentially the same error. The DB connection succeeds, but timescaledb-0.9.0 cannot be accessed.
Here's the code snippet:
conn_string = "host='localhost' dbname='db' user='usr' password='pwd'"
print("Connecting to database\n ->%s " % (conn_string))
conn = psycopg2.connect(conn_string)
cursor = conn.cursor()
print("Connected!\n")
cursor.execute("\dx")
records = cursor.fetchall()
Here's the exact same error message:
Connecting to database
Connected!
Traceback (most recent call last):
File "/home/usr/Workspace/somepath/web/model/model.py", line 21, in <module>
cursor.execute("\dx")
psycopg2.OperationalError: could not access file "timescaledb-0.9.0": No such file or directory
This seems very similar to my issue.
I guess you also updated to a new version of Timescale? The thing is: After each update of the timescale package you don't just have to make sure the library is preloaded (as the warning says on the command line) - you also have to upgrade each database that uses the extension manually via psql.
See my own answer to my issue for the steps.
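The linked answer's exact steps are not reproduced here. As a rough sketch, assuming the per-database upgrade is the usual ALTER EXTENSION timescaledb UPDATE; statement run through psql, it could be scripted as follows. The helper names and the list of database names are hypothetical.

```python
import subprocess


def build_upgrade_cmd(dbname):
    """Build a psql call that updates the extension in one database."""
    return [
        "psql",
        "-X",  # skip ~/.psqlrc so nothing else runs first in the session
        "-d", dbname,
        "-c", "ALTER EXTENSION timescaledb UPDATE;",
    ]


def upgrade_databases(dbnames):
    """Run the update for each database; raises if psql exits non-zero."""
    for dbname in dbnames:
        subprocess.run(build_upgrade_cmd(dbname), check=True)
```

For example, upgrade_databases(['postgres']) would run the statement once against the postgres database, using whatever psql connection defaults are configured.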
--
This snippet works for me:
#! /usr/bin/python
# -*- coding: utf-8 -*-
import psycopg2
# Connect to an existing database.
conn = psycopg2.connect(dbname='my-db-name',
user='postgres',
password='super-secret',
host='localhost',
port='5432')
# Open a cursor to perform database operations.
cur = conn.cursor()
# Query the database and obtain data as Python objects.
cur.execute('SELECT * FROM "my-table-name" LIMIT 100;')
# Print results.
results = cur.fetchall()
for result in results:
    print(result)
# Close communication with the database.
cur.close()
conn.close()
Using the cursor to execute psql meta-commands such as \dx doesn't work for me either, and I don't think it is supposed to: they are interpreted by the psql client, not by the server. What works reliably is plain SQL:
# Check if the database has the timescaledb extension installed.
# This is about the same as executing '\dx' in psql.
cur.execute('SELECT * from pg_extension;')
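To see which timescaledb version a database is registered with, the query can be narrowed to the extversion column of pg_extension. This is a small sketch; the function name timescaledb_version is invented here, and cur is assumed to be an open psycopg2 cursor like the one above.

```python
def timescaledb_version(cur):
    """Return the installed timescaledb version string, or None."""
    cur.execute(
        "SELECT extversion FROM pg_extension WHERE extname = %s;",
        ("timescaledb",),
    )
    row = cur.fetchone()
    # No row means the extension is not installed in this database.
    return row[0] if row else None
```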

Unable to connect to server with python and pymssql

I'm developing a script that is supposed to read data from a Microsoft SQL Server database and display it in a nice format. It is also supposed to write to the database. The issue is that I'm not able to connect to the server.
I'm using this code:
import pymssql
server = "serverIpAddress"
user = "username"
password = "pass"
db = "databaseName"
port = 1433
db = pymssql.connect(server,user,password,port= port)
# prepare a cursor object using cursor() method
cursor = db.cursor()
# execute SQL query using execute() method.
cursor.execute("SELECT VERSION()")
# Fetch a single row using fetchone() method.
data = cursor.fetchone()
print "Database version : %s " % data
# disconnect from server
db.close()
And I'm getting this traceback:
Traceback (most recent call last):
File ".\dbtest.py", line 9, in <module>
db = pymssql.connect(server,user,password,port= port)
File "pymssql.pyx", line 641, in pymssql.connect (pymssql.c:10824)
pymssql.OperationalError: (18452, 'Login failed. The login is from an untrusted domain and cannot be used with Windows authentication.DB-Lib error message 20018, severity 14:\nGeneral SQL Server error: Check messages from the SQL Server\nDB-Lib error message 20002, severity 9:\nAdaptive Server connection failed (serverip:1433)\n')
I've changed some data to keep privacy.
This gives me a clue about what is going on:
The login is from an untrusted domain and cannot be used with Windows authentication
But I don't know how to fix it. I've seen that some people use
integratedSecurity=true
But I don't know whether pymssql has something like this, or whether it is even a good idea.
Also, I don't need to use pymssql at all. If you know any other library that can perform what I need, I don't mind changing it.
Thanks and greetings.
--EDIT--
I've also tested this code:
import pyodbc
server = "serverIpAddress"
user = "username"
password = "pass"
db = "databaseName"
connectString = "Driver={SQL Server};server="+server+";database="+db+";uid="+user+";pwd="+password
con = pyodbc.connect(connectString)
cur = con.cursor()
cur.close()
con.close()
and I'm getting this traceback:
Traceback (most recent call last):
  File ".\pyodbc_test.py", line 9, in <module>
    con = pyodbc.connect(connectString)
pyodbc.Error: ('28000', "[28000] [Microsoft][ODBC SQL Server Driver][SQL Server]Login failed for user '.\\sa'. (18456) (SQLDriverConnect); [28000] [Microsoft][ODBC SQL Server Driver][SQL Server]Login failed for user '.\\sa'. (18456)")

Issue to connect into a mysql database with MySQLdb module (python)

I have a MySQL database and I want to connect to it using the Python module MySQLdb. I created a user (abc) with a password (abc) for this database (abc), which has one table, and it connects fine from the mysql command line.
But when I run my python script there is an error in the connection.
My script is:
#!/usr/bin/python
import MySQLdb
# Open database connection
db = MySQLdb.connect("localhost","abc","abc","abc")
# prepare a cursor object using cursor() method
cursor = db.cursor()
# execute SQL query using execute() method.
cursor.execute("SELECT VERSION()")
# Fetch a single row using fetchone() method.
data = cursor.fetchone()
print "Database version : %s " % data
# disconnect from server
db.close()
My error is:
Traceback (most recent call last):
File "./test.py", line 8, in <module>
db = MySQLdb.connect("localhost","abc","abc","abc")
File "/usr/local/lib/python2.7/dist-packages/MySQLdb/__init__.py", line 81, in Connect
return Connection(*args, **kwargs)
File "/usr/local/lib/python2.7/dist-packages/MySQLdb/connections.py", line 193, in __init__
super(Connection, self).__init__(*args, **kwargs2)
_mysql_exceptions.OperationalError: (1044, "Access denied for user 'abc'@'localhost' to database 'abc'")
What is wrong? My script or something in my mysql?
I changed localhost to 127.0.0.1 (as suggested in another post), but that did not solve my issue.
I also checked my permissions for this mysql user:
SHOW GRANTS;
+------------------------------------------------------------------------------+
| Grants for abc@localhost |
+------------------------------------------------------------------------------+
| GRANT USAGE ON *.* TO 'abc'@'localhost' |
| GRANT SELECT, INSERT, UPDATE, DELETE, CREATE, DROP, INDEX, CREATE VIEW ON `abc`.* TO 'abc'@'localhost' |
+------------------------------------------------------------------------------+
You might need to restart the MySQL daemon for the privileges to take effect, or run the FLUSH PRIVILEGES statement. See https://dev.mysql.com/doc/refman/5.7/en/privilege-changes.html

Python pyodbc cursor execution fails on Teradata

I have a Python script which runs successfully from my Windows workstation, and I am trying to migrate it to a Unix server. The script connects to a Teradata database using the pyodbc package and executes a bunch of queries. When it is executed from the server, it triggers the following error message:
Error: ('HY000', 'The driver did not supply an error!')
I am able to consistently reproduce the error with the following code snippet executed on the server:
import pyodbc
oConnexion = pyodbc.connect("Driver={Teradata};DBCNAME=myserver;UID=myuser;PWD=mypassword", autocommit=True)
print("Connected")
oCursor = oConnexion.cursor()
oCursor.execute("select 1")
print("Success")
Configuration:
Python 3.5.2
Pyodbc 3.1.2b2
UnixODBC Driver Manager
Teradata 15.10
After enabling ODBC logging and running a simple SELECT query, I noticed the following invalid cursor state (24000) errors from SQLGetTypeInfo:
Data Type = SQL_VARCHAR
[ODBC][57920][1481847636.278776][SQLGetTypeInfo.c][190]Error: 24000
[ODBC][57920][1481847636.278815][SQLGetTypeInfo.c][168]
Entry:
Statement = 0x1bc69e0
Data Type = Unknown(-9)
[ODBC][57920][1481847636.278839][SQLGetTypeInfo.c][190]Error: 24000
[ODBC][57920][1481847636.278873][SQLGetTypeInfo.c][168]
Entry:
Statement = 0x1bc69e0
Data Type = SQL_BINARY
[ODBC][57920][1481847636.278896][SQLGetTypeInfo.c][190]Error: 24000
Also, trying to list the connection attributes using the following code:
for attr in vars(pyodbc):
    print(attr)
    value = oConnexion.getinfo(getattr(pyodbc, attr))
    print('{:<40s} | {}'.format(attr, value))
Fails with:
SQL_DESCRIBE_PARAMETER
Traceback (most recent call last):
File "test.py", line 28, in <module>
value = oConnexion.getinfo(getattr(pyodbc, attr))
pyodbc.Error: ('IM001', '[IM001] [unixODBC][Driver Manager]Driver does not support this function (0) (SQLGetInfo)')
Upgrading to the latest (unreleased) version of pyodbc (v4) solved the issue.
https://github.com/mkleehammer/pyodbc/tree/v4
