Access tables from Impala through Python

I need to access Impala tables from Python on the same Cloudera server.
I have tried the code below to establish the connection:
from impala.dbapi import connect

def query_impala(sql):
    cursor = query_impala_cursor(sql)
    result = cursor.fetchall()
    field_names = [f[0] for f in cursor.description]
    return result, field_names

def query_impala_cursor(sql, params=None):
    conn = connect(host='xx.xx.xx.xx', port=21050, database='am_playbook',
                   user='xxxxxxxx', password='xxxxxxxx')
    cursor = conn.cursor()
    cursor.execute(sql.encode('utf-8'), params)
    return cursor
But since I am on the same Cloudera server, I should not need to provide a remote host name. Could you please provide the correct code to access the Impala/Hive tables that exist on the same server through Python?

You can use PyHive to connect to Hive and access your Hive tables.
from pyhive import hive
import pandas as pd
import datetime

# pass engine settings through `configuration` on the connection itself
conn = hive.Connection(host="hostname", port=10000, username="XXXX",
                       configuration={'hive.execution.engine': 'tez'})

query = "select col1, col2, col3, col4 from db.yourhiveTable"

start_time = datetime.datetime.now()
data = pd.read_sql(query, conn)
print(data)
end_time = datetime.datetime.now()

print('Finished reading from Hive table in',
      (end_time - start_time).seconds / 60.0, 'minutes')
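Since the script runs on the same node as Impala, a minimal sketch with impyla pointed at localhost may be all that is needed (assuming impalad listens on the default HiveServer2 port 21050; your_table is a placeholder):
# minimal sketch, assuming impyla is installed and impalad runs locally
# on the default port 21050; `your_table` is a hypothetical table name
from impala.dbapi import connect
from impala.util import as_pandas

conn = connect(host='localhost', port=21050, database='am_playbook')
cursor = conn.cursor()
cursor.execute('SELECT * FROM your_table LIMIT 10')
df = as_pandas(cursor)  # convert the result set to a pandas DataFrame
print(df.head())
conn.close()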

Related

SQL Server connection - Works in pyodbc, but not SQLAlchemy

This is a fairly common question, and I have tried the existing answers on SO, but I still can't connect.
When I setup my connection to pyodbc I can connect with the following:
cnxn = pyodbc.connect('DRIVER={SQL Server Native Client 11.0};SERVER=ip,port;DATABASE=db;UID=user;PWD=pass')
cursor = cnxn.cursor()
cursor.execute("some select query")
for row in cursor.fetchall():
    print(row)
and it works.
However, to use pandas .read_sql() I need to connect through SQLAlchemy.
I have tried both hostname connections and pass-through pyodbc connections like the one below:
import urllib.parse
import sqlalchemy

quoted = urllib.parse.quote_plus('DRIVER={SQL Server Native Client 11.0};Server=ip;Database=db;UID=user;PWD=pass;Port=port;')
engine = sqlalchemy.create_engine('mssql+pyodbc:///?odbc_connect={}'.format(quoted))
engine.connect()
I have tried both the SERVER=ip,port format and the separate Port=port parameter as above, but still no luck.
The error I'm getting is: Login failed for user 'user'. (18456)
Any help is much appreciated.
I assume that you want to create a DataFrame, so once you have a cnxn you can pass it to the pandas read_sql_query function.
Example:
import pandas
import pyodbc

cnxn = pyodbc.connect('your connection string')
query = 'some query'
df = pandas.read_sql_query(query, cnxn)
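If pyodbc connects but the same credentials fail through SQLAlchemy, it can also be worth trying SQLAlchemy's hostname-style URL instead of hand-quoting the ODBC string; a sketch with placeholder credentials (special characters in the password would still need URL-escaping):
# sketch using the mssql+pyodbc hostname-style URL;
# 'user', 'pass', 'ip', 'port' and 'db' are placeholders
import sqlalchemy

engine = sqlalchemy.create_engine(
    'mssql+pyodbc://user:pass@ip:port/db'
    '?driver=SQL+Server+Native+Client+11.0'
)
with engine.connect() as conn:
    print(conn.execute(sqlalchemy.text('SELECT 1')).scalar())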

Getting error on python while transferring data from SQL server to snowflake

I am getting the error below:
query = command % processed_params
TypeError: not all arguments converted during string formatting
I am trying to pull data from SQL Server and then insert it into Snowflake.
My code is below:
import pyodbc
import sqlalchemy
import snowflake.connector

driver = 'SQL Server'
server = 'tanmay'
db1 = 'testing'
tcon = 'no'
uname = 'sa'
pword = '123'

cnxn = pyodbc.connect(driver='{SQL Server}',
                      host=server, database=db1, trusted_connection=tcon,
                      user=uname, password=pword)
cursor = cnxn.cursor()
cursor.execute("select * from Admin_tbldbbackupdetails")
rows = cursor.fetchall()
# for row in rows:
#     data = [(row[0], row[1], row[2], row[3], row[4], row[5], row[6], row[7])]
print(rows[0])
cnxn.commit()
cnxn.close()

connection = snowflake.connector.connect(user='****', password='****', account='*****')
cursor2 = connection.cursor()
cursor2.execute("USE WAREHOUSE FOOD_WH")
cursor2.execute("USE DATABASE Test")
sql1 = ("INSERT INTO CN_RND.Admin_tbldbbackupdetails_ip "
        "(id, dbname, dbpath, backupdate, backuptime, backupStatus, FaildMsg, Backupsource) "
        "VALUES (?,?,?,?,?,?,?,?)")
cursor2.execute(sql1, *rows[0])
It's obviously a string formatting error.
You are missing the parameters for the %s-style formatting.
If you cannot fix it, step back and try another approach.
Use another script to achieve the same thing and get back to your bug tomorrow :-)
My script does pretty much the same:
1. Connect to SQL Server
2. fetchmany
3. Multipart upload to S3
4. COPY INTO the Snowflake table
Details are here: Snowpipe-for-SQLServer
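As for the original TypeError: the Snowflake Python connector binds parameters with %s-style (pyformat) formatting by default, so ? placeholders are never substituted. A minimal sketch of two ways around it, reusing cursor2 and rows from the question:
# option 1: use %s placeholders with the default pyformat paramstyle
sql1 = ("INSERT INTO CN_RND.Admin_tbldbbackupdetails_ip "
        "(id, dbname, dbpath, backupdate, backuptime, backupStatus, FaildMsg, Backupsource) "
        "VALUES (%s, %s, %s, %s, %s, %s, %s, %s)")
cursor2.execute(sql1, tuple(rows[0]))  # pass the row as one sequence, not unpacked

# option 2: switch to qmark binding and keep the ? placeholders
# (must be set before snowflake.connector.connect() is called)
import snowflake.connector
snowflake.connector.paramstyle = 'qmark'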

How to fetch data from read-only mysql database using python?

I am trying to fetch data in Python from a MySQL database, using a username that has read-only permission. I am using the mysql.connector package to connect to the database.
It connects to the database properly, as I checked with the following:
connection = mysql.connector.connect(host = HOSTNAME, user = USERNAME, passwd = PASSWORD, db = DATABASE, port=PORT)
print(connection.cmd_statistics())
But when I try to fetch data from Database using cursor, it returns 'None'.
My code is:
cursor = connection.cursor()
try:
    query1 = 'SELECT * FROM table_name'
    result = cursor.execute(query1)
    print(result)
finally:
    connection.close()
And the output is:
None
This works with Python 3.6.5 and MySQL Workbench 8.0; I have not tried it with other Python versions.
import _mysql_connector

avi = _mysql_connector.MySQL()
avi.connect(host='127.0.0.1', user='root', port=3306, password='root', database='hr_table')
avi.query("select * from hr_table.countries")
row = avi.fetch_row()
while row:
    print(row)
    row = avi.fetch_row()
avi.free_result()
avi.close()
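As for the None in the question itself: in mysql.connector, cursor.execute() does not return the result set; the rows have to be fetched from the cursor afterwards. A minimal sketch with the public API, reusing the connection from the question:
# sketch: cursor.execute() returns None in mysql.connector;
# the rows come from fetchall()/fetchone() on the cursor
cursor = connection.cursor()
try:
    cursor.execute('SELECT * FROM table_name')
    for row in cursor.fetchall():
        print(row)
finally:
    cursor.close()
    connection.close()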

Python to SQL Server Insert

I'm trying to follow the method for inserting a pandas DataFrame into SQL Server that is mentioned here, as it appears to be the fastest way to import a lot of rows.
However, I am struggling to figure out the connection parameters.
I am not using a DSN; I have a server name, a database name, and I am using a trusted connection (i.e. Windows login).
import sqlalchemy
import urllib
server = 'MYServer'
db = 'MyDB'
cxn_str = "DRIVER={SQL Server Native Client 11.0};SERVER=" + server +",1433;DATABASE="+db+";Trusted_Connection='Yes'"
#cxn_str = "Trusted_Connection='Yes',Driver='{ODBC Driver 13 for SQL Server}',Server="+server+",Database="+db
params = urllib.parse.quote_plus(cxn_str)
engine = sqlalchemy.create_engine("mssql+pyodbc:///?odbc_connect=%s" % params)
conn = engine.connect().connection
cursor = conn.cursor()
I'm just not sure what the correct way to specify my connection string is. Any suggestions?
I have been working with pandas and SQL Server for a while, and the fastest way I found to insert a lot of data into a table was this:
You can create a temporary CSV using:
df.to_csv('new_file_name.csv', sep=',', encoding='utf-8')
Then use pyodbc and the BULK INSERT Transact-SQL statement:
import pyodbc

conn = pyodbc.connect(driver='{SQL Server}', server='server_name',
                      database='Database_name', trusted_connection='yes')
cur = conn.cursor()
cur.execute("""BULK INSERT table_name
FROM 'C:\\Users\\folders path\\new_file_name.csv'
WITH
(
    CODEPAGE = 'ACP',
    FIRSTROW = 2,
    FIELDTERMINATOR = ',',
    ROWTERMINATOR = '\n'
)""")
conn.commit()
cur.close()
conn.close()
Then you can delete the file:
import os
os.remove('new_file_name.csv')
It took about a second to load a lot of data at once into SQL Server. I hope this gives you an idea.
Note: don't forget to have a field for the index; that was my mistake when I started using this.
Connection string parameter values should not be enclosed in quotes so you should use Trusted_Connection=Yes instead of Trusted_Connection='Yes'.
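Putting the two notes together, a sketch of the trusted-connection setup from the question with the quotes removed (server and database names are placeholders):
# sketch: trusted connection without quotes around Yes;
# 'MYServer' and 'MyDB' are placeholders
import urllib.parse
import sqlalchemy

server = 'MYServer'
db = 'MyDB'
cxn_str = ("DRIVER={SQL Server Native Client 11.0};"
           "SERVER=" + server + ",1433;DATABASE=" + db + ";Trusted_Connection=Yes")
params = urllib.parse.quote_plus(cxn_str)
engine = sqlalchemy.create_engine("mssql+pyodbc:///?odbc_connect=%s" % params)
with engine.connect() as conn:
    print(conn.execute(sqlalchemy.text("SELECT 1")).scalar())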

how can I use stored procedure with parameters in python code

I want to use a stored procedure in Python code like below.
import pyodbc

conn = pyodbc.connect('Trusted_Connection=yes', driver='{SQL Server}',
                      server=r'ZAMAN\SQLEXPRESS', database='foy3')

def InsertUser(studentID, name, surname, birth, address, telephone):
    cursor = conn.cursor()
    cursor.execute("exec InserttoDB studentID,name,surname,birth,address,telephone")
    rows = cursor.fetchall()
I have a problem with the part of the code below. How can I send the function parameters to the database through InserttoDB (the stored procedure)?
cursor.execute("exec InserttoDB studentID,name,surname,birth,address,telephone")
I am not sure what database you are using, but I think this should do the job:
cursor.execute("call SP_YOUR_SP_NAME(params)")

Categories

Resources