I am trying to connect to Hive (with the default Derby DB) using Python:
from impala.dbapi import connect
conn = connect( host='localhost', port=10000)
cursor = conn.cursor()
cursor.execute('SELECT * FROM employee')
print cursor.description # prints the result set's schema
results = cursor.fetchall()
but I am getting this error:
Traceback (most recent call last):
File "hivetest_b.py", line 2, in <module>
conn = connect( host='localhost', port=10000)
File "/home/ubuntu/.local/lib/python2.7/site-packages/impala/dbapi.py", line 147, in connect
auth_mechanism=auth_mechanism)
File "/home/ubuntu/.local/lib/python2.7/site-packages/impala/hiveserver2.py", line 758, in connect
transport.open()
File "/home/ubuntu/.local/lib/python2.7/site-packages/thrift/transport/TTransport.py", line 149, in open
return self.__trans.open()
File "/home/ubuntu/.local/lib/python2.7/site-packages/thrift/transport/TSocket.py", line 101, in open
message=message)
thrift.transport.TTransport.TTransportException: Could not connect to localhost:10000
The entries in my /etc/hosts are:
127.0.0.1 localhost
# The following lines are desirable for IPv6 capable hosts
::1 ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
ff02::3 ip6-allhosts
I am using the default hive-site.xml and the default Derby database to run Hive. When I run Hive through the shell, it shows me the table:
hive> show databases;
OK
default
test
test_db
Time taken: 0.937 seconds, Fetched: 3 row(s)
hive> show tables;
OK
employee
Time taken: 0.054 seconds, Fetched: 1 row(s)
hive> describe employee;
OK
empname string
age int
gender string
income float
department string
dept string
# Partition Information
# col_name data_type comment
dept string
Time taken: 0.451 seconds, Fetched: 11 row(s)
I am not sure what exactly I am missing here. Any quick references or pointers would be appreciated.
Regards,
Bhupesh
You can check and validate the port with:
hive> set hive.server2.thrift.port;
Also try 0.0.0.0 or 127.0.0.1 instead of localhost as the host for the connection.
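Before changing the client code, it can help to confirm that something is actually listening on the port. The following is a minimal sketch, assuming the default HiveServer2 port 10000 from the question; the helper name `can_connect` and the list of candidate hosts are illustrative and not part of impyla.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to (host, port) succeeds."""
    try:
        # create_connection resolves the host name and opens a TCP socket
        sock = socket.create_connection((host, port), timeout=timeout)
    except socket.error:
        # covers refused connections, timeouts and name-resolution failures
        return False
    sock.close()
    return True


# Try each suggested host against the assumed HiveServer2 port.
for candidate in ("localhost", "127.0.0.1", "0.0.0.0"):
    print("%s:10000 reachable: %s" % (candidate, can_connect(candidate, 10000)))
```

If every candidate reports False, nothing is listening on that port. The Hive CLI does not start HiveServer2 by itself; it is normally launched separately, for example with `hive --service hiveserver2`.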
Related
What I'm trying to do is very basic: connect to an Impala db using Python:
from impala.dbapi import connect
conn = connect(host='impala', port=21050, auth_mechanism='PLAIN')
I'm using the Impyla package to do so, and I got this error:
Traceback (most recent call last):
File "/usr/local/lib/python3.6/dist-packages/thriftpy/transport/socket.py", line 96, in open
self.sock.connect(addr)
socket.gaierror: [Errno -3] Temporary failure in name resolution
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/alaaeddine/PycharmProjects/test/data_test.py", line 3, in <module>
conn = connect(host='impala', port=21050, auth_mechanism='PLAIN')
File "/usr/local/lib/python3.6/dist-packages/impala/dbapi.py", line 147, in connect
auth_mechanism=auth_mechanism)
File "/usr/local/lib/python3.6/dist-packages/impala/hiveserver2.py", line 758, in connect
transport.open()
File "/usr/local/lib/python3.6/dist-packages/thrift_sasl/__init__.py", line 61, in open
self._trans.open()
File "/usr/local/lib/python3.6/dist-packages/thriftpy/transport/socket.py", line 104, in open
message="Could not connect to %s" % str(addr))
thriftpy.transport.TTransportException: TTransportException(type=1, message="Could not connect to ('impala', 21050)")
I also tried the Ibis package, but it failed with the same thriftpy-related error.
On Windows, using DBeaver, I could connect to the database with the official Cloudera JDBC connector. My questions are:
Should I pass my JDBC connector as a parameter in my connect code? I searched but could not find anything pointing in this direction.
Should I try something other than the Ibis and Impyla packages? I have experienced a lot of version-related and dependency issues when using them. If yes, what would you recommend as alternatives?
Thanks!
Solved:
I used the pyhive package instead of Ibis/Impyla. Here's an example:
#import hive from pyhive
from pyhive import hive
#establish the connection to the db
conn = hive.Connection(host='host_IP_addr', port='conn_port', auth='auth_type', database='my_db')
#prepare the cursor for the queries
cursor = conn.cursor()
#execute a query
cursor.execute("SHOW TABLES")
#navigate and display the results
for table in cursor.fetchall():
    print(table)
Your impala host name is probably not resolving. Can you run nslookup impala from a command prompt? If you're using Docker, the service must be named "impala" in docker-compose, or you need the "extra_hosts" option. Alternatively, you can add it to /etc/hosts (on Windows, C:\Windows\System32\drivers\etc\hosts) as 127.0.0.1 impala
Also try 'NOSASL' instead of 'PLAIN'; that sometimes works better with security turned off.
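The same resolution check can be done from Python, which also confirms what the interpreter itself sees (useful inside containers). This is only a sketch: the helper name `resolves` is made up for illustration, and "impala" is the host name from the question.

```python
import socket


def resolves(hostname):
    """Return the sorted list of addresses for hostname, or [] if lookup fails."""
    try:
        infos = socket.getaddrinfo(hostname, None)
    except socket.gaierror:
        # same failure class as the "name resolution" error in the traceback
        return []
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return sorted({info[4][0] for info in infos})


print("impala -> %s" % resolves("impala"))
```

An empty list means the name is unknown to the resolver, so the fix belongs in DNS, docker-compose, or the hosts file rather than in the connect() call.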
This is a simple method: connecting to Impala through impala-shell from Python.
import commands  # Python 2 only; this module was removed in Python 3
query1 = "select * from table_name limit 10"
impalad = 'hostname'
port = '21000'
# -k: Kerberos auth, -B/--delimited: delimited output, -q: run a single query
result_string = 'impala-shell -i "' + impalad + ':' + port + '" -k -B --delimited -q "' + query1 + '"'
status, output = commands.getstatusoutput(result_string)
if status == 0:
    print output
else:
    print "Error encountered while executing HiveQL queries."
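Since the `commands` module exists only in Python 2, here is a hedged sketch of the same idea using `subprocess`, which works on Python 3 as well. The function names are hypothetical, the host and table are the placeholders from the snippet above, and it assumes `impala-shell` is on the PATH.

```python
import subprocess


def build_impala_shell_command(impalad, port, query, kerberos=True):
    """Build the impala-shell argument list used in the snippet above."""
    cmd = ["impala-shell", "-i", "%s:%s" % (impalad, port)]
    if kerberos:
        cmd.append("-k")  # Kerberos authentication, as in the original command
    cmd += ["-B", "-q", query]  # -B: delimited output, -q: run one query
    return cmd


def run_query(impalad, port, query, kerberos=True):
    """Run the query through impala-shell and return (exit_status, output_text)."""
    proc = subprocess.Popen(
        build_impala_shell_command(impalad, port, query, kerberos),
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,  # decode output as text
    )
    output, _ = proc.communicate()
    return proc.returncode, output


# Example (placeholders; requires impala-shell to be installed):
# status, output = run_query("hostname", "21000", "select * from table_name limit 10")
```

Passing the arguments as a list avoids building a shell string by hand, so quotes inside the query do not need escaping.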
I'm developing a script that is supposed to read data from a Microsoft SQL database and display it in a nice format. It is also supposed to write to the database. The issue is that I'm not able to connect to the server.
I'm using this code:
import pymssql
server = "serverIpAddress"
user = "username"
password = "pass"
db = "databaseName"
port = 1433
db = pymssql.connect(server,user,password,port= port)
# prepare a cursor object using cursor() method
cursor = db.cursor()
# execute SQL query using execute() method.
cursor.execute("SELECT VERSION()")
# Fetch a single row using fetchone() method.
data = cursor.fetchone()
print "Database version : %s " % data
# disconnect from server
db.close()
And I'm getting this traceback:
Traceback (most recent call last):
File ".\dbtest.py", line 9, in <module>
db = pymssql.connect(server,user,password,port= port)
File "pymssql.pyx", line 641, in pymssql.connect (pymssql.c:10824)
pymssql.OperationalError: (18452, 'Login failed. The login is from an untrusted domain and cannot be used with Windows authentication.DB-Lib error message 20018, severity 14:\nGeneral SQL Server error: Check messages from the SQL Server\nDB-Lib error message 20002, severity 9:\nAdaptive Server connection failed (serverip:1433)\n')
I've changed some of the data for privacy.
This gives me a clue about what is going on:
The login is from an untrusted domain and cannot be used with Windows authentication
But I don't know how to fix it. I've seen that some people use
integratedSecurity=true
But I don't know whether pymssql has something like this, or even whether it's a good idea.
Also, I don't need to use pymssql at all. If you know any other library that can perform what I need, I don't mind changing it.
Thanks and greetings.
--EDIT--
I've also tested this code:
import pyodbc
server = "serverIpAddress"
user = "username"
password = "pass"
db = "databaseName"
connectString = "Driver={SQL Server};server="+server+";database="+db+";uid="+user+";pwd="+password
con = pyodbc.connect(connectString)
cur = con.cursor()
cur.close()
con.close()
and I'm getting this traceback:
Traceback (most recent call last):
File ".\pyodbc_test.py", line 9, in <module>
con = pyodbc.connect(connectString)
pyodbc.Error: ('28000', "[28000] [Microsoft][ODBC SQL Server Driver][SQL Server]Login failed for user '.\\sa'. (18456) (SQLDriverConnect); [28000] [Microsoft][ODBC SQL Server Driver][SQL Server]Login failed for user '.\\sa'. (18456)")
I have a MySQL database that I want to connect to. I am trying to use the Python module MySQLdb. I created a user (abc) with a password (abc) for this database (abc), which has one table, and connecting works fine from the mysql command line.
But when I run my Python script, the connection fails with an error.
My script is:
#!/usr/bin/python
import MySQLdb
# Open database connection
db = MySQLdb.connect("localhost","abc","abc","abc")
# prepare a cursor object using cursor() method
cursor = db.cursor()
# execute SQL query using execute() method.
cursor.execute("SELECT VERSION()")
# Fetch a single row using fetchone() method.
data = cursor.fetchone()
print "Database version : %s " % data
# disconnect from server
db.close()
My error is:
Traceback (most recent call last):
File "./test.py", line 8, in <module>
db = MySQLdb.connect("localhost","abc","abc","abc")
File "/usr/local/lib/python2.7/dist-packages/MySQLdb/__init__.py", line 81, in Connect
return Connection(*args, **kwargs)
File "/usr/local/lib/python2.7/dist-packages/MySQLdb/connections.py", line 193, in __init__
super(Connection, self).__init__(*args, **kwargs2)
_mysql_exceptions.OperationalError: (1044, "Access denied for user 'abc'@'localhost' to database 'abc'")
What is wrong: my script, or something in my MySQL setup?
I changed localhost to 127.0.0.1 (as suggested in another post), but that did not solve my issue.
I also checked my permissions for this mysql user:
SHOW GRANTS;
+--------------------------------------------------------------------------------------------------------+
| Grants for abc@localhost                                                                               |
+--------------------------------------------------------------------------------------------------------+
| GRANT USAGE ON *.* TO 'abc'@'localhost'                                                                |
| GRANT SELECT, INSERT, UPDATE, DELETE, CREATE, DROP, INDEX, CREATE VIEW ON `abc`.* TO 'abc'@'localhost' |
+--------------------------------------------------------------------------------------------------------+
You might need to restart the mysql daemon for the privileges to take effect, or use the FLUSH PRIVILEGES command. https://dev.mysql.com/doc/refman/5.7/en/privilege-changes.html
I installed Python 2.7 to try to connect to a remote MySQL server. MySQL and phpMyAdmin are on a server, and I can access phpMyAdmin at localhost:8888/phpmyadmin through PuTTY on my Windows desktop. I can't seem to connect to MySQL even with PuTTY running. Any ideas? I face the same issue with Python 3.3 using CyMySQL.
import MySQLdb
db = MySQLdb.connect(host="127.0.0.1", # your host, usually 127.0.0.1
                     user="megamonster", # your username
                     passwd="", # your password
                     db="extractor") # name of the data base
# you must create a Cursor object. It will let
# you execute all the queries you need
cur = db.cursor()
# Use all the SQL you like
cur.execute("SELECT * FROM abc")
# print the first cell of every row
for row in cur.fetchall():
    print row[0]
Error:
Traceback (most recent call last):
File "C:\Users\Jonathan\Desktop\testSQL.py", line 6, in <module>
db="extractor") # name of the data base
File "C:\Python27\lib\site-packages\MySQLdb\__init__.py", line 81, in Connect
return Connection(*args, **kwargs)
File "C:\Python27\lib\site-packages\MySQLdb\connections.py", line 187, in __init__
super(Connection, self).__init__(*args, **kwargs2)
OperationalError: (2003, "Can't connect to MySQL server on '127.0.0.1' (10061)")
Update
I added the port (3306) and got this:
OperationalError: (2013, "Lost connection to MySQL server at 'waiting for initial communication packet', system error: 0")
Currently looking at
MySQL error: 2013, "Lost connection to MySQL server at 'reading initial communication packet', system error: 0"
Hmm, it still doesn't work...
I used sqlplus and it worked:
sqlplus User_name/password@Host_ip:Port/Service_Name @$SQLFILENAME
Just specify the SQLFILENAME if you want to use a file. Otherwise, you can omit this parameter and run SQL statements directly.
It could be a number of things, but as far as MySQL is concerned, permissions are set independently for localhost and for 127.0.0.1. Make sure you can connect with the exact host and credentials.
For example, check this while connected through your PuTTY session.
mysql> use mysql;
Database changed
mysql> SELECT host,user,select_priv FROM user;
+-----------+------+-------------+
| host      | user | select_priv |
+-----------+------+-------------+
| localhost | root | Y           |
| 127.0.0.1 | root | Y           |
+-----------+------+-------------+
Also check who you are connected as (in PuTTY) and use that same info in the script:
mysql> SELECT USER(),CURRENT_USER();
+----------------+----------------+
| USER()         | CURRENT_USER() |
+----------------+----------------+
| root@localhost | root@localhost |
+----------------+----------------+
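The same identity check can be run from the script itself, so the account seen by Python can be compared with the one seen in the mysql client. This is a small sketch that works with any DB-API cursor; the helper name `connected_as` is hypothetical, and the connection values in the comment are the placeholders from the question.

```python
def connected_as(cursor):
    """Return (USER(), CURRENT_USER()) for the connection behind a DB-API cursor."""
    cursor.execute("SELECT USER(), CURRENT_USER()")
    return cursor.fetchone()


# Example with MySQLdb (placeholder credentials, needs a reachable server):
# import MySQLdb
# db = MySQLdb.connect(host="127.0.0.1", port=3306, user="megamonster",
#                      passwd="", db="extractor")
# print(connected_as(db.cursor()))
```

If the two values differ between the script and the interactive session, the script is being matched against a different host entry in the user table.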
I'm getting the following error when running my Python script on a 2008 VM running MySQL Server 5.6, using AJAX:
Traceback (most recent call last):
File "mypythonjob.py", line 22, in <module>
db = mdb.connect('localhost', 'website','servername', 'website')
File "C:\Python27\lib\site-packages\MySQLdb\__init__.py", line 81, in Connect
return Connection(*args, **kwargs)
File "C:\Python27\lib\site-packages\MySQLdb\connections.py", line 187, in __init__
super(Connection, self).__init__(*args, **kwargs2)
_mysql_exceptions.OperationalError: (2003, "Can't connect to MySQL server on 'localhost' (10055)")
I can watch in Resource Monitor as the CPU climbs to 100%. After about 75 seconds, at which point mysql.exe has 30 threads and python.exe has 6 threads, the error is raised and python.exe is terminated. The MySQL server is then unreachable for about 2 minutes before it comes back online.
import os, datetime, pymssql , time, subprocess
import MySQLdb as mdb
today = datetime.datetime.now().replace(hour=0, minute=0, second=0, microsecond=0)
sixmonth = today - datetime.timedelta(days=180)
db = mdb.connect('localhost', 'website','server', 'website') #values changed for post
cursor = db.cursor()
sql2 = "select 1column from mytable where 4column like '"+str(today)+"'" #values changed for post (query produces 1100 rows)
cursor.execute(sql2)
data2 = cursor.fetchall()
cursor.close()
db.close()
for row2 in data2:
    db = pymssql.connect(host="sqldb",user="username", password="pwd", database="somedatabase") #values changed for post
    cursor = db.cursor()
    sql3 = "SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED; select col1,col2,col3 from tbl where col1 like '%somedata%' and col3 < '"+str(sixmonth)+"' and col2 ='data'" #values changed for post
    cursor.execute(sql3)
    data3 = cursor.fetchall()
    db.commit()
    cursor.close()
    db.close()
    for row3 in data3:
        db = mdb.connect('localhost', 'website','server', 'website')
        cursor = db.cursor()
        sql4 = "update mytable set 2column ='"+str(row3[2])+"', 3column ='"+str(row3[1])+"' where 4column like '"+str(today)+"' and 1column like '"+str(row3[0])+"'" #values changed for post (seems to finish the update properly at 575 rows)
        cursor.execute(sql4)
        data4 = cursor.fetchall()
        db.commit()
        cursor.close()
        db.close()
Resolution attempts:
Ran the code with a small 10-row subset of the starting data instead of all 1100 rows.
Checked that I'm closing connections and DBs when I'm done with them.
Checked the MySQL log file; there is nothing in it.
Changed the max_allowed_packet setting in my.ini to 32 MB, then 500 MB, and finally commented it out.
Thanks for looking.
Added note:
If I remove SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED; from the second query, it seems to continue to run. I let it go for 5 minutes, and the count stopped at 575 updated rows and held there.
Solved the issue: I was not passing the variable properly from the first query to the second, so it was flooding the server with results and then passing that bloated row count into the update.
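For readers hitting the same symptom, here is a hedged sketch of the general pattern (not the poster's actual fix): read the keys once, bind each key into the follow-up statement as a parameter, and reuse one connection instead of reconnecting per row. To keep it runnable anywhere, `sqlite3` stands in for both servers, and the table and column names are invented. MySQLdb and pymssql use `%s` placeholders where sqlite3 uses `?`.

```python
import sqlite3

# In-memory database as a stand-in; names below are illustrative only.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE mytable (key_col TEXT, col2 TEXT, col3 TEXT)")
cur.executemany("INSERT INTO mytable (key_col) VALUES (?)", [("a",), ("b",), ("c",)])

# Step 1: read the keys once.
cur.execute("SELECT key_col FROM mytable")
keys = [row[0] for row in cur.fetchall()]

# Step 2: pass each key into the follow-up update as a bound parameter,
# reusing the same connection rather than opening a new one per row.
for key in keys:
    cur.execute(
        "UPDATE mytable SET col2 = ?, col3 = ? WHERE key_col = ?",
        ("two-" + key, "three-" + key, key),
    )

conn.commit()
cur.execute("SELECT key_col, col2, col3 FROM mytable ORDER BY key_col")
updated_rows = cur.fetchall()
conn.close()
print(updated_rows)
```

Binding the key keeps the second query scoped to one row of the first result, and a single long-lived connection avoids churning through sockets inside nested loops.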