In SQLAlchemy, how do I populate or update a table from a SELECT statement?
SQLAlchemy doesn't build this construct for you. You can execute the query as raw text instead:
session.execute('INSERT INTO t1 (SELECT * FROM t2)')
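The raw statement can be tried end to end without SQLAlchemy. This is a minimal sketch using the standard-library sqlite3 module; the columns of t1 and t2 and the sample rows are made up for illustration, and the filter value is bound as a parameter rather than pasted into the SQL:

```python
import sqlite3

# In-memory database; the schemas of t1 and t2 are assumptions for this demo.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t1 (x INTEGER, y TEXT)")
conn.execute("CREATE TABLE t2 (x INTEGER, y TEXT)")
conn.executemany("INSERT INTO t2 VALUES (?, ?)", [(1, "a"), (7, "b"), (9, "c")])

# Populate t1 from a SELECT on t2; the filter value is bound, not interpolated.
conn.execute("INSERT INTO t1 (x, y) SELECT x, y FROM t2 WHERE x > ?", (5,))
conn.commit()

copied_rows = conn.execute("SELECT x, y FROM t1 ORDER BY x").fetchall()
print(copied_rows)
```

Naming the target columns explicitly, as here, keeps the statement working if the two tables ever differ in column order.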
EDIT:
More than one year later, but now on sqlalchemy 0.6+ you can create it:
from sqlalchemy.ext import compiler
from sqlalchemy.sql.expression import Executable, ClauseElement

class InsertFromSelect(Executable, ClauseElement):
    def __init__(self, table, select):
        self.table = table
        self.select = select

# register the function below as the SQL compiler for InsertFromSelect
@compiler.compiles(InsertFromSelect)
def visit_insert_from_select(element, compiler, **kw):
    return "INSERT INTO %s (%s)" % (
        compiler.process(element.table, asfrom=True),
        compiler.process(element.select)
    )

insert = InsertFromSelect(t1, select([t1]).where(t1.c.x>5))
print(insert)
Produces:
"INSERT INTO mytable (SELECT mytable.x, mytable.y, mytable.z FROM mytable WHERE mytable.x > :x_1)"
Another EDIT:
Now, 4 years later, the syntax is incorporated in SQLAlchemy 0.9 and backported to 0.8.3. You can create any select() and then use the new from_select() method of Insert objects:
>>> from sqlalchemy.sql import table, column
>>> t1 = table('t1', column('a'), column('b'))
>>> t2 = table('t2', column('x'), column('y'))
>>> print(t1.insert().from_select(['a', 'b'], t2.select().where(t2.c.y == 5)))
INSERT INTO t1 (a, b) SELECT t2.x, t2.y
FROM t2
WHERE t2.y = :y_1
More information in the docs.
As of 0.8.3, you can do this directly in SQLAlchemy with Insert.from_select:
sel = select([table1.c.a, table1.c.b]).where(table1.c.c > 5)
ins = table2.insert().from_select(['a', 'b'], sel)
As Noslko pointed out in a comment, you can now get rid of the raw SQL:
http://www.sqlalchemy.org/docs/core/compiler.html#compiling-sub-elements-of-a-custom-expression-construct
from sqlalchemy.ext.compiler import compiles
from sqlalchemy.sql.expression import Executable, ClauseElement

class InsertFromSelect(Executable, ClauseElement):
    def __init__(self, table, select):
        self.table = table
        self.select = select

# register the function below as the SQL compiler for InsertFromSelect
@compiles(InsertFromSelect)
def visit_insert_from_select(element, compiler, **kw):
    return "INSERT INTO %s (%s)" % (
        compiler.process(element.table, asfrom=True),
        compiler.process(element.select)
    )

insert = InsertFromSelect(t1, select([t1]).where(t1.c.x>5))
print(insert)
Produces:
INSERT INTO mytable (SELECT mytable.x, mytable.y, mytable.z FROM mytable WHERE mytable.x > :x_1)
I have the following SQL query:
SELECT *
FROM %s.tableA
The tableA is in db-jablonec so I need to call db-jablonec.tableA.
I use this method in Python:
def my_method(self, expedice):
    self.cursor = self.connection.cursor()
    query = """
        SELECT *
        FROM %s.tableA
    """
    self.cursor.execute(query, [expedice])
    df = pd.DataFrame(self.cursor.fetchall())
I call it like this:
expedice = ["db-jablonec"]
for exp in expedice:
    df = db.my_method(exp)
But I got an error MySQLdb.ProgrammingError: (1146, "Table ''db-jablonec'.tableA' doesn't exist")
Obviously, I want to call 'db-jablonec.tableA' not ''db-jablonec'.tableA'. How can I fix it please?
The driver passes the %s value as a string literal of its own, including the surrounding quotes ''.
You therefore need to pass the whole name as one variable: concatenate .tableA to the variable itself, then pass it in.
Your query will then be:
query = """
    SELECT *
    FROM %s
"""
I think this will be helpful for you:
SELECT * FROM INFORMATION_SCHEMA.TABLES WHERE TABLE_NAME LIKE '%%'
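For the schema-qualified table name in the question above: DB-API drivers such as MySQLdb quote every bound parameter as a string literal, so identifiers generally have to be placed into the SQL text itself. A minimal sketch of doing that with validation first; ALLOWED_SCHEMAS and qualified_table are hypothetical names introduced here, not part of MySQLdb:

```python
import re

# Hypothetical allowlist of schemas the application is permitted to query.
ALLOWED_SCHEMAS = {"db-jablonec"}

def qualified_table(schema, table):
    """Return `schema`.`table` quoted as MySQL identifiers, or raise ValueError."""
    if schema not in ALLOWED_SCHEMAS:
        raise ValueError("unexpected schema: %r" % (schema,))
    if not re.fullmatch(r"\w+", table):
        raise ValueError("unexpected table name: %r" % (table,))
    # Backticks quote identifiers in MySQL; doubling escapes an embedded backtick.
    return "`%s`.`%s`" % (schema.replace("`", "``"), table)

query = "SELECT * FROM %s" % qualified_table("db-jablonec", "tableA")
print(query)
```

The resulting string contains no placeholders, so it would be passed to cursor.execute() without a parameter list; any real values in a WHERE clause should still be bound as parameters.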
I am struggling to run a second query while iterating over the results of a first one. That is, I run a SELECT query against Postgres and iterate over the returned data; after some transformation I write it to another table. But it is not working. Sample Python code is below.
conn = pgconn(------)
cursor = pgconn.Cursor()
query1 = "select * from table"
query2 = "select * from table2 where Id=(%s);"
cursor.execute(query1)
result = query1.fetchall()
for row in result:
    if row.a == 2:
        cursor.execute(query2, [row.time])
In the above Python code I am not able to extract the data by running query2 with a query1 result as its parameter. It seems the cursor is blocked by query1, so query2 is never executed. Could someone please help with this issue?
First of all, you can write a join statement to do this and get the data easily:
select * from table join table2 on table2.id = table.time
As for why this is not working: possibly the cursor object is being overwritten inside the for loop, so the query results change.
Use RealDictCursor, and correct the syntax on your inside call to execute():
import psycopg2
import psycopg2.extras

conn = pgconn(------)
# RealDictCursor returns each row as a dict keyed by column name
cursor = conn.cursor(cursor_factory=psycopg2.extras.RealDictCursor)
query1 = "select * from table"
query2 = "select * from table2 where Id=(%s);"
cursor.execute(query1)
result = cursor.fetchall()  # fetch from the cursor, not from the query string
for row in result:
    if row['a'] == 2:
        cursor.execute(query2, (row['time'],))
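The pattern above works because fetchall() copies the first result set into a list before the cursor is reused. A small sketch of the same flow against the standard-library sqlite3 module, so it runs without a Postgres server; the table names, columns, and rows are invented, and sqlite3 uses ? where psycopg2 uses %s:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row  # rows can be indexed by column name
conn.executescript("""
    CREATE TABLE table1 (a INTEGER, time INTEGER);
    CREATE TABLE table2 (id INTEGER, label TEXT);
    INSERT INTO table1 VALUES (2, 10), (3, 20), (2, 30);
    INSERT INTO table2 VALUES (10, 'ten'), (30, 'thirty');
""")

cursor = conn.cursor()
cursor.execute("SELECT a, time FROM table1 ORDER BY time")
rows = cursor.fetchall()  # copy the first result set before reusing the cursor

labels = []
for row in rows:
    if row["a"] == 2:
        # Safe to reuse the cursor here: `rows` is already a plain list.
        cursor.execute("SELECT label FROM table2 WHERE id = ?", (row["time"],))
        labels.append(cursor.fetchone()["label"])
print(labels)
```

If the first result set were too large to hold in memory, a second cursor for the inner query would be the usual alternative.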
1. Install psycopg2 (pip install psycopg2); psycopg2.extras ships with it.
Then set up your Postgres connection like:
def Postgres_init(self):
    try:
        conn = psycopg2.connect(host=os.environ['SD_POSTGRES_SERVER'],
                                user=os.environ['SD_POSTGRES_USER'],
                                password=os.environ['SD_POSTGRES_PASSWORD'],
                                port=os.environ['SD_POSTGRES_PORT'],
                                database=os.environ['SD_POSTGRES_DATABASE'])
        logging.info("Connected to PostgreSQL")
    except (Exception, psycopg2.Error) as error:
        logging.info(error)
2. Create a cursor from that connection:
cursor = conn.cursor()
3. Execute your query:
cursor.execute("""SELECT COUNT (column1) from tablename WHERE column2 =%s""", (
    Value,))  # Check if already exists
result = cursor.fetchone()
Now the row is stored in the "result" variable as a one-element tuple, so the count itself is result[0]. You can then execute the next query like:
cursor.execute("""
    INSERT INTO tablename2
    (column1, column2, column3)
    VALUES
    (%s, %s, %s)
    ON CONFLICT(column1) DO UPDATE
    SET
    column2=excluded.column2,
    column3=excluded.column3;
    """, (result[0], column2, column3)
)
Now the result of query 1 is stored in the first column of the second table.
Commit the transaction so the insert is saved, then close your connection:
conn.commit()
conn.close()
I have a MySQL select command that counts the number of rows returned, based on two subselects, as follows:
SELECT count(*)-1 as elmlen FROM invertedindextb WHERE
dewId IN
(SELECT dewId FROM invertedindextb WHERE eTypeId = ? and trm = ?)
and docId IN
(SELECT docId FROM invertedindextb WHERE eTypeId = ? and trm = ?)
I wrote a python function that implements the above select command as follows:
def lenOfNode (self, cursor, eTypeId , trm):
    sql = """ SELECT COUNT(*)-1 AS LEN FROM invertedindextb WHERE \
    dewId IN ("SELECT dewId FROM invertedindextb WHERE eTypeId = '%d' and trm = '%s' %(eTypeId,trm)") \
    and docId IN ("SELECT docId FROM invertedindextb WHERE eTypeId = '%d' and trm = '%s' %(eTypeId,trm)") """
    cursor.execute(sql)
    results = cursor.fetchone()
    print results[0]
    return results[0]
Although the function runs (but computes the wrong answer), I am not sure whether the syntax of the select command is correct in Python.
Can somebody help me with the correct syntax for the select statement?
Do not use \ in the SQL string; you started it with triple quotes, so newlines are already fine.
You placed the arguments to the % operator inside the string, turning them into literal text. Try printing your query before executing it, like
def lenOfNode (self, cursor, eTypeId , trm):
    sql = """ SELECT COUNT(*)-1 AS LEN FROM invertedindextb WHERE \
    dewId IN ("SELECT dewId FROM invertedindextb WHERE eTypeId = '%d' and trm = '%s' %(eTypeId,trm)") \
    and docId IN ("SELECT docId FROM invertedindextb WHERE eTypeId = '%d' and trm = '%s' %(eTypeId,trm)") """
    print "My real sql query was:\n", sql
    cursor.execute(sql)
    results = cursor.fetchone()
    print results[0]
    return results[0]
Your mistake is that the expression "aaa %d" % (10,) (which should give "aaa 10") has been turned into the string literal '''"aaa %d" % (10,)''' (which gives '"aaa %d" % (10,)'), and that is not what you intended. For me, the best way to develop an eye for such things was to try every suspicious part of my code in an IPython console with the %ed magic command.
Also, building the SQL query with %s string formatting introduces a direct vulnerability, SQL injection, into your code.
I forgot to add the "and" operator between the first and second select commands in the where clause. After adding it as follows, the function runs and returns the correct result:
sql = """ SELECT (COUNT(*)-1) AS LEN FROM invertedindextb WHERE
(dewId IN (SELECT dewId FROM invertedindextb WHERE eTypeId = '%d' and trm = '%s') and
docId IN (SELECT docId FROM invertedindextb WHERE eTypeId = '%d' and trm = '%s')) """ %(eTypeId,trm,eTypeId,trm)
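The same query can be written with bound parameters instead of string formatting, which sidesteps the injection risk mentioned earlier. This is a sketch under assumptions: it runs against the standard-library sqlite3 module with an invented four-row schema, and it uses sqlite3's ? placeholder (MySQLdb uses %s in the same position, still passed as a separate tuple):

```python
import sqlite3

def len_of_node(cursor, e_type_id, trm):
    """Count matching rows minus one, binding values instead of formatting them in."""
    sql = """
        SELECT COUNT(*) - 1 AS len FROM invertedindextb
        WHERE dewId IN (SELECT dewId FROM invertedindextb WHERE eTypeId = ? AND trm = ?)
          AND docId IN (SELECT docId FROM invertedindextb WHERE eTypeId = ? AND trm = ?)
    """
    # The driver substitutes the four values; nothing is pasted into the SQL text.
    cursor.execute(sql, (e_type_id, trm, e_type_id, trm))
    return cursor.fetchone()[0]

# Tiny made-up table so the function can be exercised.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE invertedindextb (dewId TEXT, docId INTEGER, eTypeId INTEGER, trm TEXT)"
)
conn.executemany(
    "INSERT INTO invertedindextb VALUES (?, ?, ?, ?)",
    [("1.1", 1, 5, "book"), ("1.1", 1, 6, "title"), ("1.2", 2, 5, "paper")],
)
node_len = len_of_node(conn.cursor(), 5, "book")
print(node_len)
```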
I am using pandas.io.sql to execute a SQL script that contains CTE and would like to do something like this:
import pandas.io.sql as psql
param1 = 'park'
param2 = 'zoo'
sqlstr = ("""WITH CTE_A AS (
SELECT *
FROM A
WHERE A.Location = param1),
CTE_B AS (
SELECT *
FROM B
WHERE B.Location = param2)
SELECT A.*, B.*
FROM C
INNER JOIN A
ON C.something = A.something
INNER JOIN B
ON C.something = B.something
WHERE C.combined = param1 || param2
""")
I would like to do something like this
result = psql.frame_query(sqlstr, con = db, params = (param1,param2))
Could anyone help me in passing the two parameters using Pandas?
The only way that I know how to do something like this is to do the following. It doesn't however take advantage of the psql package in Pandas.
import pyodbc
import pandas
conn = pyodbc.connect('yourconnectionstring')
curs = conn.cursor()
param1 = 'park'
param2 = 'zoo'
sqlstr = """WITH CTE_A AS (
SELECT *
FROM A
WHERE A.Location = ?),
CTE_B AS (
SELECT *
FROM B
WHERE B.Location = ?)
SELECT A.*, B.*
FROM C
INNER JOIN A
ON C.something = A.something
INNER JOIN B
ON C.something = B.something
WHERE C.combined = ?|| ?;"""
# one value per ? marker, in the order the markers appear in the query
q = curs.execute(sqlstr,[param1,param2,param1,param2]).fetchall()
df = pandas.DataFrame(q)
curs.close()
conn.close()
This passes the values as parameters to avoid SQL injection and ends with a DataFrame object containing your results.
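Where the driver supports named placeholders, each value only has to be supplied once even if it appears several times in the query. A minimal sketch with the standard-library sqlite3 module, which accepts the :name style; the tables and rows are invented, and unlike the query above the joins here reference CTE_A and CTE_B so that the filtered rows are the ones joined:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE A (something INTEGER, Location TEXT);
    CREATE TABLE B (something INTEGER, Location TEXT);
    CREATE TABLE C (something INTEGER, combined TEXT);
    INSERT INTO A VALUES (1, 'park'), (2, 'mall');
    INSERT INTO B VALUES (1, 'zoo'), (2, 'zoo');
    INSERT INTO C VALUES (1, 'parkzoo'), (2, 'mallzoo');
""")

sqlstr = """
    WITH CTE_A AS (SELECT * FROM A WHERE A.Location = :param1),
         CTE_B AS (SELECT * FROM B WHERE B.Location = :param2)
    SELECT C.something, CTE_A.Location, CTE_B.Location
    FROM C
    INNER JOIN CTE_A ON C.something = CTE_A.something
    INNER JOIN CTE_B ON C.something = CTE_B.something
    WHERE C.combined = :param1 || :param2
"""
# Each named parameter is supplied once, even though it appears twice in the SQL.
rows = conn.execute(sqlstr, {"param1": "park", "param2": "zoo"}).fetchall()
print(rows)
```

pandas.read_sql also takes a params argument and forwards it to the driver, so the same dictionary could be passed there; which placeholder style is accepted depends on the driver in use.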
When using pandas.io.sql in combination with mysql.connector the syntax is as follows:
import pandas.io.sql as psql
import mysql.connector as mysql
db = mysql.connect(host="localhost",user="user",passwd="password")
hour = 7
result = psql.read_sql("""select * from table where
`hour` > %(hour)s and `name` = %(name)s""",con=db,params={'hour':hour,'name':'John'})
So just put %(name)s in the query, replacing 'name' with whatever name you want, and pass a dictionary as params.
Note that this option adds quotes ('') around the string, so it doesn't work if, for example, you need to use the value as a table name. I use a regex to clean the string in that case, i.e.
import re
table_name = re.sub(r'[\W]', ' ',table_name)
(use r'[\W_]' if the table name also doesn't have underscores)
How can I determine if a table exists using the Psycopg2 Python library? I want a true or false boolean.
How about:
>>> import psycopg2
>>> conn = psycopg2.connect("dbname='mydb' user='username' host='localhost' password='foobar'")
>>> cur = conn.cursor()
>>> cur.execute("select * from information_schema.tables where table_name=%s", ('mytable',))
>>> bool(cur.rowcount)
True
An alternative using EXISTS is better in that it doesn't require that all rows be retrieved, but merely that at least one such row exists:
>>> cur.execute("select exists(select * from information_schema.tables where table_name=%s)", ('mytable',))
>>> cur.fetchone()[0]
True
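The EXISTS query can be wrapped in a small helper that returns a plain boolean. This is a sketch, not psycopg2 API: table_exists is a hypothetical name, cur is assumed to be an open psycopg2 cursor, and the schema default of "public" is an assumption about where the table lives:

```python
def table_exists(cur, table_name, schema="public"):
    """Return True if schema.table_name is listed in information_schema.tables.

    `cur` is assumed to be an open psycopg2 cursor; `schema` defaults to
    "public", which is an assumption about where the table lives.
    """
    cur.execute(
        "SELECT EXISTS ("
        "SELECT 1 FROM information_schema.tables "
        "WHERE table_schema = %s AND table_name = %s)",
        (schema, table_name),  # bound by the driver, never concatenated
    )
    return bool(cur.fetchone()[0])
```

Filtering on the schema as well as the name avoids a false positive when a table of the same name exists in a different schema.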
I don't know the psycopg2 lib specifically, but the following query can be used to check for existence of a table:
SELECT EXISTS(SELECT 1 FROM information_schema.tables
WHERE table_catalog='DB_NAME' AND
table_schema='public' AND
table_name='TABLE_NAME');
The advantage of using information_schema over selecting directly from the pg_* tables is some degree of portability of the query.
select exists(select relname from pg_class
where relname = 'mytablename' and relkind='r');
The first answer did not work for me. I found success checking for the relation in pg_class:
def table_exists(con, table_str):
    exists = False
    try:
        cur = con.cursor()
        # bind the table name as a parameter instead of concatenating it
        cur.execute("select exists(select relname from pg_class where relname=%s)", (table_str,))
        exists = cur.fetchone()[0]
        print(exists)
        cur.close()
    except psycopg2.Error as e:
        print(e)
    return exists
#!/usr/bin/python
# -*- coding: utf-8 -*-
import psycopg2
import sys

con = None
try:
    con = psycopg2.connect(database='testdb', user='janbodnar')
    cur = con.cursor()
    cur.execute('SELECT 1 from mytable')
    ver = cur.fetchone()
    print(ver)  # our code on success goes here
except psycopg2.DatabaseError as e:
    print('Error %s' % e)
    sys.exit(1)
finally:
    if con:
        con.close()
I know you asked for psycopg2 answers, but I thought I'd add a utility function based on pandas (which uses psycopg2 under the hood), just because pd.read_sql_query() makes things so convenient, e.g. avoiding creating/closing cursors.
import pandas as pd

def db_table_exists(conn, tablename):
    # thanks to Peter Hansen's answer for this sql
    sql = "select * from information_schema.tables where table_name=%s"
    # return results of sql query from conn as a pandas dataframe;
    # the table name is passed as a bound parameter
    results_df = pd.read_sql_query(sql, conn, params=(tablename,))
    # True if we got any results back, False if we didn't
    return bool(len(results_df))
I still use psycopg2 to create the db-connection object conn similarly to the other answers here.
The following solution handles the schema too:
import psycopg2

with psycopg2.connect("dbname='dbname' user='user' host='host' port='port' password='password'") as conn:
    cur = conn.cursor()
    query = "select to_regclass(%s)"
    cur.execute(query, ['{}.{}'.format('schema', 'table')])
    exists = bool(cur.fetchone()[0])
Expanding on the above use of EXISTS, I needed something to test table existence generally. I found that testing for results using fetch on a select statement yielded the result "None" on an empty existing table -- not ideal.
Here's what I came up with:
import psycopg2

def exist_test(tabletotest):
    schema, table = tabletotest.split('.')
    existtest = "SELECT EXISTS (SELECT 1 FROM information_schema.tables WHERE table_schema = %s AND table_name = %s);"
    print('existtest', existtest)
    cur.execute(existtest, (schema, table))  # assumes you've already got your connection and cursor established
    return cur.fetchone()[0]  # True/False depending on whether the table exists

exist_test('someschema.sometable')
You can look into pg_class catalog:
The catalog pg_class catalogs tables and most everything else that has
columns or is otherwise similar to a table. This includes indexes (but
see also pg_index), sequences (but see also pg_sequence), views,
materialized views, composite types, and TOAST tables; see relkind.
Below, when we mean all of these kinds of objects we speak of
“relations”. Not all columns are meaningful for all relation types.
Assuming an open connection with cur as cursor,
table = 'mytable'
# bind the table name as a parameter so it is quoted correctly
cur.execute("SELECT EXISTS(SELECT relname FROM pg_class WHERE relname = %s);", (table,))
if cur.fetchone()[0]:
    # if table exists, do something here
    pass
cur.fetchone()[0] will resolve to either True or False because of the EXISTS() function.