Pyhive Presto insert select * from not running - python

I can use PyHive to connect to Presto and select data back just fine. I am trying to use PyHive to run "insert into x select from y" on Presto, and it is not running. I am sure I am missing something simple.
from pyhive import presto
import requests
from requests.auth import HTTPBasicAuth
import pandas as pd
req_kw = {'auth': HTTPBasicAuth(user, pw),'verify':False}
conn = presto.connect(host=ht,port=prt,protocol='https',catalog='hive',username=user,requests_kwargs=req_kw)
cursor = conn.cursor()
query='select count(1) from dim.date_dim '
cursor.execute(query)
print(cursor.fetchall())
query='insert into flowersc.date_dim select * from dim.date_dim'
cursor.execute(query)
query='select count(1) from flowersc.date_dim '
cursor.execute(query)
print(cursor.fetchall())
No errors occur, but the results show that no data was loaded:
[(16624,)]
[(0,)]
Any help is greatly appreciated.

You need to fetch the result of the INSERT so that the query runs to completion:
query='insert into flowersc.date_dim select * from dim.date_dim'
cursor.execute(query)
cursor.fetchall() # added: fetch the result of the INSERT
This is needed due to a change in Presto in May 2018 (https://github.com/prestosql/presto/commit/568449b8d058ed8281cc5277bb53902fd044cad7). It is also good practice to verify query results, i.e. to check that your INSERT statement succeeded.
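A minimal sketch of the fetch-after-execute pattern, assuming any DB-API cursor that exposes execute() and fetchall(). The helper name execute_and_drain is hypothetical, and the standard-library sqlite3 module is only a stand-in so that the snippet runs without a Presto server; with PyHive you would pass the cursor obtained from presto.connect(...).cursor() instead.

```python
import sqlite3


def execute_and_drain(cursor, sql):
    """Run a statement and fetch its result so the query runs to completion.

    Hypothetical helper: works with any DB-API cursor that exposes
    execute() and fetchall().
    """
    cursor.execute(sql)
    return cursor.fetchall()


# sqlite3 stands in for Presto here (assumption, for a runnable demo only).
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("create table src (d integer)")
cur.execute("create table dst (d integer)")
cur.executemany("insert into src values (?)", [(1,), (2,), (3,)])

# Fetch after the INSERT, then verify the row count with a second query.
execute_and_drain(cur, "insert into dst select * from src")
loaded = execute_and_drain(cur, "select count(1) from dst")
print(loaded)
```

Comparing the count in the target table against the source is a cheap way to confirm that the INSERT really ran.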

Related

"Associated statement not prepared" caused by pypyodbc?

Error message: ('HY007', '[HY007][ODBC SQL Server Driver] Associated statement is not prepared.')
I downloaded ODBC to better diagnose this error, based on other posts, but it still throws the same error.
What is the actual error here, and what is the way around it?
import requests
import pandas as pd
import pypyodbc
import matplotlib.pyplot as plt
Conn1 = pypyodbc.connect("Driver={SQL Server};" "Server = DESKTOP-KOOxxx;" "Database = Horsesxx;" "Trusted_Connection=yes;", autocommit=True)
Mycursor = Conn1.cursor()
Mycursor.execute('Drop table #temptable SELECT * into #temptable FROM (SELECT HorseName, DayCalender FROM horses WHERE Place = 1) AS T1 Inner Join (SELECT runnerName, day, WIN_ODDS_BSP FROM betfairdata) AS T3 ON T1.HorseName = T3.runnerName AND T1.DayCalender = T3.day SELECT WIN_ODDS_BSP FROM #temptable')
Conn1.commit()
This statement works within SQL, yet not within VS.
The statement also works if I drop the temp table components.

how to connect to sqlite from sqlalchemy

I have a sqlite db in my home dir.
stephen@stephen-AO725:~$ pwd
/home/stephen
stephen@stephen-AO725:~$ sqlite db1
SQLite version 2.8.17
Enter ".help" for instructions
sqlite> select * from test
...> ;
3|4
5|6
sqlite> .quit
When I try to connect from a Jupyter notebook with sqlalchemy and pandas, something does not work.
db=sqla.create_engine('sqlite:////home/stephen/db1')
pd.read_sql('select * from db1.test',db)
~/anaconda3/lib/python3.7/site-packages/sqlalchemy/engine/default.py in do_execute(self, cursor, statement, parameters, context)
578
579 def do_execute(self, cursor, statement, parameters, context=None):
--> 580 cursor.execute(statement, parameters)
581
582 def do_execute_no_params(self, cursor, statement, context=None):
DatabaseError: (sqlite3.DatabaseError) file is not a database
[SQL: select * from db1.test]
(Background on this error at: http://sqlalche.me/e/4xp6)
I also tried:
db=sqla.create_engine('sqlite:///~/db1')
same result
Just to complete @Stephen's code with the required modules:
# 1.-Load module
import sqlalchemy
import pandas as pd
#2.-Turn on database engine
dbEngine=sqlalchemy.create_engine('sqlite:////home/stephen/db1.db') # ensure this is the correct path for the sqlite file.
#3.- Read data with pandas
pd.read_sql('select * from test',dbEngine)
#4.- I also want to add a new table from a dataframe in sqlite (a small one)
df_todb.to_sql(name = 'newTable',con= dbEngine, index=False, if_exists='replace')
Another way to read is to use the sqlite3 library, which may be more straightforward:
#1. - Load libraries
import sqlite3
import pandas as pd
# 2.- Create your connection.
cnx = sqlite3.connect('/home/stephen/db1.db') # sqlite3 takes a plain file path, not a SQLAlchemy URL
cursor = cnx.cursor()
# 3.- Query and print all the tables in the database engine
cursor.execute("SELECT name FROM sqlite_master WHERE type='table';")
print(cursor.fetchall())
# 4.- READ TABLE OF SQLITE CALLED test
dfN_check = pd.read_sql_query("SELECT * FROM test", cnx) # we need real name of table
# 5.- Now I want to delete all rows of this table
cnx.execute("DELETE FROM test;")
# 6. -COMMIT CHANGES! (mandatory if you want to save these changes in the database)
cnx.commit()
# 7.- Close the connection with the database
cnx.close()
Please let me know if this helps!
import sqlalchemy
engine=sqlalchemy.create_engine('sqlite:///db1.db')
Note that you need three slashes in sqlite:/// in order to use a relative path for the DB. If you want an absolute path, use four slashes: sqlite:////
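The slash rule can be sketched with plain string handling. The helper name sqlite_url is hypothetical, and the sketch assumes POSIX-style paths: the prefix is always sqlite:///, and the fourth slash is simply the leading slash of an absolute path.

```python
import os


def sqlite_url(path):
    """Build a SQLAlchemy-style SQLite URL from a file path.

    The prefix is always 'sqlite:///'. An absolute POSIX path starts with
    '/', which is where the fourth slash comes from.
    """
    return "sqlite:///" + path


relative_url = sqlite_url("db1.db")
absolute_url = sqlite_url("/home/stephen/db1")
# Assumption: '~' inside the URL is not expanded, so expand it in Python first.
home_url = sqlite_url(os.path.join(os.path.expanduser("~"), "db1"))
print(relative_url)
print(absolute_url)
```

Any of the resulting strings could then be passed to sqlalchemy.create_engine.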
The issue is the lack of backward compatibility, as noted by Everila. Anaconda installs its own SQLite, which is version 3.x, and that version cannot load databases created by SQLite 2.x.
After creating the db with SQLite 3, the code works fine:
db=sqla.create_engine('sqlite:////home/stephen/db1')
pd.read_sql('select * from test',db)
which confirms the 4 slashes are needed.
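A quick way to spot this kind of version mismatch before blaming the URL is to look at the file header: SQLite 3 database files begin with the 16-byte string "SQLite format 3" followed by a NUL byte. The sketch below is only an illustration; the helper name is_sqlite3_file and the temporary file names are hypothetical.

```python
import os
import sqlite3
import tempfile

# Every SQLite 3 database file starts with this 16-byte header string.
SQLITE3_MAGIC = b"SQLite format 3\x00"


def is_sqlite3_file(path):
    """Return True if the file starts with the SQLite 3 header string."""
    with open(path, "rb") as fh:
        return fh.read(len(SQLITE3_MAGIC)) == SQLITE3_MAGIC


# Demo: a database written by Python's sqlite3 module versus an arbitrary file.
tmpdir = tempfile.mkdtemp()
good_path = os.path.join(tmpdir, "good.db")
conn = sqlite3.connect(good_path)
conn.execute("create table test (a integer, b integer)")
conn.commit()
conn.close()

bad_path = os.path.join(tmpdir, "other.db")
with open(bad_path, "wb") as fh:
    fh.write(b"not an sqlite 3 file")

print(is_sqlite3_file(good_path), is_sqlite3_file(bad_path))
```

A file that fails this check, such as one written by the sqlite 2.x command-line tool, would be expected to produce the "file is not a database" error when opened with SQLite 3.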
None of the sqlalchemy solutions worked for me with Python 3.10.6 and sqlalchemy 2.0.0b4; it could be a beta issue, or version 2.0.0 changed things. @corina-roca's solution was close, but not right, as you need to pass a connection object, not an engine object. That's what the documentation says, but it didn't actually work. After a bit of experimentation, I discovered that engine.raw_connection() works, although you get a warning on the CLI. Here are my working examples.
The sqlite3 one works out of the box, but it's not ideal if you are thinking of changing databases later:
import sqlite3
import pandas as pd
conn = sqlite3.connect("/home/stephen/db1") # plain file path, not a SQLAlchemy URL
df = pd.read_sql_query('SELECT * FROM test', conn)
df.head()
# works, no problem
sqlalchemy lets you abstract your db away
import pandas as pd
from sqlalchemy import create_engine, text
engine = create_engine("sqlite:////home/stephen/db1")
# engine.connect() is also what you are supposed to pass to pandas... it doesn't work
conn = engine.connect()
result = conn.execute(text("select * from test"))
for row in result:
    # outside pandas, this works, proving that the connection is established
    print(row)
# with this workaround, it works, but you get a warning:
# UserWarning: pandas only supports SQLAlchemy connectable ...
conn = engine.raw_connection()
df = pd.read_sql_query(sql='SELECT * FROM test', con=conn)
df.head()

Error while trying to execute the query in Denodo using Python SQLAlchemy

I'm trying to get a table from Denodo using Python and the sqlalchemy library. Here is my code:
from sqlalchemy import create_engine
import os
sql = """SELECT * FROM test_table LIMIT 10 """
engine = create_engine('mssql+pyodbc://DenodoODBC', encoding='utf-8')
con = engine.connect().connection
cursor = con.cursor()
cursor.execute(sql)
df = cursor.fetchall()
cursor.close()
con.close()
When I run it for the first time, I get the following error:
DBAPIError: (pyodbc.Error) (' \x10#', "[ \x10#] ERROR: Function 'schema_name' with arity 0 not found\njava.sql.SQLException: Function 'schema_name' with arity 0 not found;\nError while executing the query (7) (SQLExecDirectW)")
[SQL: SELECT schema_name()]
I think the problem might be with create_engine, because when I run the code a second time without creating the engine again, everything is fine.
I hope somebody can explain to me what is going on. Thanks :)

With statement in PYODBC sql statement

I am not able to execute a SQL statement with pyodbc if I use the With clause in the SQL statement.
This works:
import pyodbc
import pandas as pd
cnxn = pyodbc.connect('DSN=database;PWD=password' )
cursor = cnxn.cursor()
sql = """
SELECT top 10 *
FROM table
"""
qnnum = pd.read_sql(sql, cnxn)
This does NOT work:
import pyodbc
import pandas as pd
cnxn = pyodbc.connect('DSN=database;PWD=password' )
cursor = cnxn.cursor()
sql = """
With A as(SELECT top 10 *
FROM table)
select * from A
"""
qnnum = pd.read_sql(sql, cnxn)
I have tested a WITH clause using pyodbc (Python 3.7) on Teradata (15.10.07.37), and it worked.
Comments are also supported within the query string, in the following form:
/* comment1 */
Hope that helps.
I was able to resolve the problem by simplifying the formatting and removing comments from the original code.
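A rough sketch of that kind of clean-up, assuming the query is held in a Python string. The helper name strip_sql_comments is hypothetical, and the approach is deliberately naive: it does not understand string literals, so a literal containing -- or /* would be damaged.

```python
import re


def strip_sql_comments(sql):
    """Remove /* ... */ and -- comments, then collapse whitespace.

    Naive on purpose: it does not parse string literals.
    """
    sql = re.sub(r"/\*.*?\*/", " ", sql, flags=re.DOTALL)  # block comments
    sql = re.sub(r"--[^\n]*", " ", sql)  # line comments
    return " ".join(sql.split())


raw = """
With A as(SELECT top 10 *   /* first rows only */
FROM table)                 -- the CTE
select * from A
"""
clean = strip_sql_comments(raw)
print(clean)
```

The cleaned string can then be passed to pd.read_sql in place of the original query.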

Python mysql connector with multiple statements

I'm trying to run a SQL query through mysql.connector that requires a SET command in order to query a specific table:
import mysql.connector
import pandas as pd
cnx = mysql.connector.connect(host=ip,
port=port,
user=user,
passwd=pwd,
database="")
sql="""SET variable='Test';
SELECT * FROM table """
df = pd.read_sql(sql, cnx)
When I run this, I get the error "Use multi=True when executing multiple statements". But where do I put multi=True?
Passing the parameters as a dictionary into the params argument might do the trick:
pd.read_sql(sql, cnx, params={'multi': True})
The params are passed to the underlying database driver. Note that they are passed as the query's bind parameters, not as keyword arguments to cursor.execute, so this may not enable multi-statement execution.
After many hours of experimenting, I figured out how to do this. Forgive me if this is not the most succinct way, but it is the best I could come up with:
import mysql.connector
import pandas as pd
cnx = mysql.connector.connect(host=ip,
port=port,
user=user,
passwd=pwd,
database="")
sql1="SET variable='Test';"
sql2="""SELECT * FROM table """
cursor=cnx.cursor()
cursor.execute(sql1)
cursor.close()
df = pd.read_sql(sql2, cnx)
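The same idea can be written as a small reusable helper: execute each setup statement on its own, then run the query on the same connection so that the session state is still in effect. This is only a sketch. The name run_setup_then_query is hypothetical, and the standard-library sqlite3 module stands in for MySQL so that the snippet runs without a server, with a temp table playing the role of the session variable.

```python
import sqlite3


def run_setup_then_query(conn, setup_statements, query):
    """Execute each setup statement on its own, then run the query.

    All statements share one connection, so session state created by the
    setup statements is still visible to the final query.
    """
    cursor = conn.cursor()
    try:
        for statement in setup_statements:
            cursor.execute(statement)
        cursor.execute(query)
        return cursor.fetchall()
    finally:
        cursor.close()


# sqlite3 stands in for MySQL (assumption, for a runnable demo only).
conn = sqlite3.connect(":memory:")
rows = run_setup_then_query(
    conn,
    [
        "create temp table settings (name text, value text)",
        "insert into settings values ('variable', 'Test')",
    ],
    "select value from settings where name = 'variable'",
)
print(rows)
```

With mysql.connector, the final fetch could be replaced by pd.read_sql(query, cnx) on the same connection, as in the answer above.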
