Checking if a PostgreSQL table exists under Python (and probably Psycopg2)

How can I determine if a table exists using the Psycopg2 Python library? I want a true or false boolean.

How about:
>>> import psycopg2
>>> conn = psycopg2.connect("dbname='mydb' user='username' host='localhost' password='foobar'")
>>> cur = conn.cursor()
>>> cur.execute("select * from information_schema.tables where table_name=%s", ('mytable',))
>>> bool(cur.rowcount)
True
An alternative using EXISTS is better in that it doesn't require that all rows be retrieved, but merely that at least one such row exists:
>>> cur.execute("select exists(select * from information_schema.tables where table_name=%s)", ('mytable',))
>>> cur.fetchone()[0]
True
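Wrapped up as a reusable helper, a minimal sketch (the function name table_exists and the schema parameter are illustrative additions):
import psycopg2

def table_exists(conn, table_name, schema='public'):
    with conn.cursor() as cur:
        # EXISTS fetches a single boolean row rather than every matching row
        cur.execute(
            "select exists(select 1 from information_schema.tables "
            "where table_schema=%s and table_name=%s)",
            (schema, table_name),
        )
        return cur.fetchone()[0]
Calling table_exists(conn, 'mytable') then returns True or False directly.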

I don't know the psycopg2 lib specifically, but the following query can be used to check for existence of a table:
SELECT EXISTS(
    SELECT 1 FROM information_schema.tables
    WHERE table_catalog='DB_NAME' AND
          table_schema='public' AND
          table_name='TABLE_NAME'
);
The advantage of using information_schema over selecting directly from the pg_* tables is some degree of portability of the query.

select exists(select relname from pg_class
              where relname = 'mytablename' and relkind='r');
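Run from psycopg2, that query might look like this sketch (parameterizing the name; relkind='r' restricts the match to ordinary tables):
cur.execute(
    "select exists(select relname from pg_class "
    "where relname = %s and relkind='r')",
    ('mytablename',),
)
print(cur.fetchone()[0])  # True or False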

The first answer did not work for me. I found success checking for the relation in pg_class:
def table_exists(con, table_str):
    exists = False
    try:
        cur = con.cursor()
        # parameterized to avoid injecting table_str into the SQL string
        cur.execute("select exists(select relname from pg_class where relname=%s)", (table_str,))
        exists = cur.fetchone()[0]
        print(exists)
        cur.close()
    except psycopg2.Error as e:
        print(e)
    return exists

#!/usr/bin/python
# -*- coding: utf-8 -*-
import psycopg2
import sys

con = None
try:
    con = psycopg2.connect(database='testdb', user='janbodnar')
    cur = con.cursor()
    cur.execute('SELECT 1 from mytable')
    ver = cur.fetchone()
    print(ver)  # our code for the success case goes here
except psycopg2.DatabaseError as e:
    print('Error %s' % e)
    sys.exit(1)
finally:
    if con:
        con.close()
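Wrapped up as a boolean check, this exception-based approach might look like the sketch below; note the rollback, since a failed statement aborts psycopg2's current transaction (the function name is illustrative):
def table_exists(con, table_name):
    try:
        cur = con.cursor()
        cur.execute('SELECT 1 FROM {} LIMIT 1'.format(table_name))  # table_name must be trusted
        return True
    except psycopg2.Error:
        con.rollback()  # clear the aborted transaction before reusing the connection
        return False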

I know you asked for psycopg2 answers, but I thought I'd add a utility function based on pandas (which uses psycopg2 under the hood), just because pd.read_sql_query() makes things so convenient, e.g. avoiding creating/closing cursors.
import pandas as pd

def db_table_exists(conn, tablename):
    # thanks to Peter Hansen's answer for this sql
    # parameterized rather than interpolated into the string
    sql = "select * from information_schema.tables where table_name=%s"
    # return results of sql query from conn as a pandas dataframe
    results_df = pd.read_sql_query(sql, conn, params=(tablename,))
    # True if we got any results back, False if we didn't
    return bool(len(results_df))
I still use psycopg2 to create the db-connection object conn similarly to the other answers here.
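For completeness, a usage sketch (the connection parameters are placeholders):
import psycopg2
conn = psycopg2.connect("dbname='mydb' user='username' host='localhost' password='foobar'")
print(db_table_exists(conn, 'mytable'))  # True or False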

The following solution handles the schema too:
import psycopg2

with psycopg2.connect("dbname='dbname' user='user' host='host' port='port' password='password'") as conn:
    cur = conn.cursor()
    query = "select to_regclass(%s)"
    cur.execute(query, ['{}.{}'.format('schema', 'table')])
    exists = bool(cur.fetchone()[0])
to_regclass returns NULL when the relation does not exist, hence the bool() check.

Expanding on the above use of EXISTS, I needed something to test table existence generally. I found that testing for results with fetch on a select statement yielded the result None on an empty existing table, which is not ideal.
Here's what I came up with:
import psycopg2

def exist_test(tabletotest):
    schema = tabletotest.split('.')[0]
    table = tabletotest.split('.')[1]
    existtest = ("SELECT EXISTS (SELECT 1 FROM information_schema.tables "
                 "WHERE table_schema = %s AND table_name = %s);")
    print('existtest', existtest)
    cur.execute(existtest, (schema, table))  # assumes you've already got your connection and cursor established
    return cur.fetchone()[0]  # returns True/False depending on whether table exists

exist_test('someschema.sometable')

You can look into pg_class catalog:
The catalog pg_class catalogs tables and most everything else that has
columns or is otherwise similar to a table. This includes indexes (but
see also pg_index), sequences (but see also pg_sequence), views,
materialized views, composite types, and TOAST tables; see relkind.
Below, when we mean all of these kinds of objects we speak of
“relations”. Not all columns are meaningful for all relation types.
Assuming an open connection with cur as cursor,
table = 'mytable'
# pass the name as a query parameter so it is quoted correctly
cur.execute("SELECT EXISTS(SELECT relname FROM pg_class WHERE relname = %s);", (table,))
if cur.fetchone()[0]:
    # if table exists, do something here
    return True
cur.fetchone()[0] will resolve to either True or False because of the EXISTS() function.

Related

Postgres connection in python

I am struggling to run a query inside a data iteration. That is, I am running a select query against Postgres and iterating over the returned data; after some transformation I am writing it to another table. But it is not working. Sample Python code is below.
conn = pgconn(------)
cursor = pgconn.Cursor()
query1 = "select * from table"
query2 = "select * from table2 where Id=(%s);"
cursor.execute(query1)
result = query1.fetchall()
for row in result:
    If row.a == 2:
        cursor.execute(query2, [row.time])
In the above Python code I am not able to extract the data by running query2 with a query1 result as a parameter. It seems the cursor is blocked by query1, so the query2 execution never happens. Please can someone help with this issue.
First of all, you can write a join statement to do this and get the data in a single query:
select * from table join table2 on table2.id = table.time
As for why your version is not working: the cursor object is being overwritten inside the for loop, so the query1 results get replaced.
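If you do need the loop, a sketch of one workaround is to use a separate cursor for the inner query, so the outer result set is not clobbered (column positions are guesses, since the question's schema isn't shown):
outer = conn.cursor()
inner = conn.cursor()
outer.execute("select * from table")
for row in outer.fetchall():
    if row[0] == 2:  # the question's row.a, assumed to be the first column
        inner.execute("select * from table2 where Id = %s;", (row[1],))  # row.time assumed second
        # process inner.fetchall() here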
Use RealDictCursor, and correct the syntax on your inside call to execute():
import psycopg2
import psycopg2.extras

conn = pgconn(------)
cursor = conn.cursor(cursor_factory=psycopg2.extras.RealDictCursor)
query1 = "select * from table"
query2 = "select * from table2 where Id=(%s);"
cursor.execute(query1)
result = cursor.fetchall()
for row in result:
    if row['a'] == 2:
        cursor.execute(query2, (row['time'],))
1. Install psycopg2 (pip install psycopg2); the psycopg2.extras module ships with it, so it needs no separate install.
Then set up your Postgres Connection like:
def Postgres_init(self):
    try:
        conn = psycopg2.connect(host=os.environ['SD_POSTGRES_SERVER'],
                                user=os.environ['SD_POSTGRES_USER'],
                                password=os.environ['SD_POSTGRES_PASSWORD'],
                                port=os.environ['SD_POSTGRES_PORT'],
                                database=os.environ['SD_POSTGRES_DATABASE'])
        logging.info("Connected to PostgreSQL")
    except (Exception, psycopg2.Error) as error:
        logging.info(error)
2. Create your cursor from the defined connection:
cursor = conn.cursor()
3. Execute your query:
cursor.execute("""SELECT COUNT (column1) from tablename WHERE column2 =%s""", (
Value,)) # Check if already exists
result = cursor.fetchone()
Now the value is stored in the "result" variable; note that fetchone() returns a tuple, so the count itself is result[0]. Now you can execute the next query like:
cursor.execute("""
INSERT INTO tablename2
(column1, column2, column3)
VALUES
(%s, %s, %s)
ON CONFLICT(column1) DO UPDATE
SET
column2=excluded.column2,
column3=excluded.column3;
""", (result, column2, column3)
)
Now the result of query 1 is stored in the second table in the first column.
Now you can close your connection:
conn.close()

Python and MySQL - fetchall() doesn't show any result

I have a problem getting the query results from my Python code. The connection to the database seems to work, but I always get the error:
"InterfaceError: No result set to fetch from."
Can somebody help me with my problem? Thank you!!!
cnx = mysql.connector.connect(
    host="127.0.0.1",
    user="root",
    passwd="*****",
    db="testdb"
)
cursor = cnx.cursor()
query = ("Select * from employee ;")
cursor.execute(query)
row = cursor.fetchall()
If your problem is still not solved, you can consider replacing the Python MySQL driver package with pymysql.
You can write code like this
#!/usr/bin/python
import pymysql

db = pymysql.connect(host="localhost",  # your host, usually localhost
                     user="test",       # your username
                     passwd="test",     # your password
                     db="test")         # name of the database

# you must create a Cursor object. It will let
# you execute all the queries you need
cur = db.cursor()
query = ("SELECT * FROM employee")

# Use all the SQL you like
cur.execute(query)

# print the first cell of every row
for row in cur.fetchall():
    print(row[0])
db.close()
This should be able to find the result you want
Add this to your code:
for i in row:
    print(i)
You did not print anything, which is why it looks like nothing is returned. This will print each row on a separate line.
First try print(row); if that fails, try executing it using the for loop, and remove the semicolon from the select query statement.
cursor = connection.cursor()
rows = cursor.execute('SELECT * FROM [DBname].[dbo].TableName where update_status is null ').fetchall()
for row in rows:
    ds = row[0]
    state = row[1]
Here row[0] represents the value of the first column in the result set, row[1] the value of the second column, and so on. (Note that chaining execute(...).fetchall() like this is pyodbc-style; with mysql.connector, call fetchall() on the cursor after execute().)

pyodbc not updating table

I query a table, then loop through the results to update another table.
The console prints show the correct data, but I am not sure how to debug the cursor.execute for the UPDATE query.
It is not updating the table, and it's not a permission issue: if I run the update command in my SQL workbench it works fine.
cursor = conn.cursor()
cursor.execute("Select Account_Name FROM dsf_CS_WebAppView")
for row in cursor.fetchall():
    try:
        cursor.execute("Select fullpath FROM customerdesignmap WHERE fullpath LIKE '%{}%'".format(row.Account_Name))
        rows = cursor.fetchall()
        print(len(cursor.fetchall()))
        if len(rows) > 0:
            for rowb in rows:
                print(rowb.fullpath)
                print(row.Account_Name)
                if len(row.Account_Name) > 2:
                    cursor.execute("UPDATE customerdesignmap SET householdname = {}, msid = {} WHERE fullpath LIKE '{}'".format(row.Account_Name, row.UniqueProjectNumber, rowb.fullpath))
                    conn.commit()
    except:
        pass
Consider a pure SQL solution as SQL Server supports UPDATE and JOIN across multiple tables. This avoids the nested loops, cursor calls, and string formatting of SQL commands.
UPDATE m
SET m.householdname = v.Account_Name,
m.msid = v.UniqueProjectNumber
FROM customerdesignmap m
JOIN dsf_CS_WebAppView v
ON m.fullpath LIKE CONCAT('%', v.Account_Name, '%')
In Python, run the above in a single cursor.execute() followed by a commit() call.
cursor.execute('''my SQL Query''')
conn.commit()

Speeding up performance when writing from pandas to sqlite

Hoping for a few pointers on how I can optimise this code... Ideally I'd like to keep using pandas, but I assume there are some nifty sqlite tricks I can use to get some good speed-up. For additional "points", I'd love to know if Cython could help at all here.
In case it's not obvious from the code, for context: I'm having to read millions of very small sqlite files (the files in "unCompressedDir") and write their contents into a much larger "master" sqlite DB ("6thJan.db").
Thanks in advance everyone!
%%cython -a
import os
import pandas as pd
import sqlite3
import time
import sys

def main():
    rootDir = "/Users/harryrobinson/Desktop/dataForMartin/"
    unCompressedDir = "/Users/harryrobinson/Desktop/dataForMartin/unCompressedSqlFiles/"
    with sqlite3.connect(rootDir+'6thJan.db') as conn:
        destCursor = conn.cursor()
        createTable = "CREATE TABLE IF NOT EXISTS userData(TimeStamp, Category, Action, Parameter1Name, Parameter1Value, Parameter2Name, Parameter2Value, formatVersion, appVersion, userID, operatingSystem)"
        destCursor.execute(createTable)
        for i in os.listdir(unCompressedDir):
            try:
                with sqlite3.connect(unCompressedDir+i) as connection:
                    cursor = connection.cursor()
                    cursor.execute('SELECT * FROM Events')
                    df_events = pd.DataFrame(cursor.fetchall())
                    cursor.execute('SELECT * FROM Global')
                    df_global = pd.DataFrame(cursor.fetchall())
                    cols = ['TimeStamp', 'Category', 'Action', 'Parameter1Name', 'Parameter1Value', 'Parameter2Name', 'Parameter2Value']
                    df_events = df_events.drop(0, axis=1)
                    df_events.columns = cols
                    df_events['formatVersion'] = df_global.iloc[0,0]
                    df_events['appVersion'] = df_global.iloc[0,1]
                    df_events['userID'] = df_global.iloc[0,2]
                    df_events['operatingSystem'] = df_global.iloc[0,3]
            except Exception as e:
                print(e, sys.exc_info()[-1].tb_lineno)
            try:
                df_events.to_sql("userData", conn, if_exists="append", index=False)
            except Exception as e:
                print("Sqlite error, {0} - line {1}".format(e, sys.exc_info()[-1].tb_lineno))
UPDATE: halved the time by adding a transaction instead of to_sql
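(A sketch of what that transaction-based insert might look like, replacing the df_events.to_sql() call inside the loop; it assumes the variables from the code above:)
# instead of df_events.to_sql("userData", conn, if_exists="append", index=False):
destCursor.executemany(
    "INSERT INTO userData VALUES (?,?,?,?,?,?,?,?,?,?,?)",  # 11 columns, matching createTable
    df_events.itertuples(index=False),
)
conn.commit()  # one commit per file keeps each batch inside a single transaction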
Reconsider using Pandas as a staging tool (leave that library for data analysis). Instead, write pure SQL queries, which SQLite accommodates via ATTACH for querying external databases.
with sqlite3.connect(os.path.join(rootDir, '6thJan.db')) as conn:
    destCursor = conn.cursor()
    createTable = """CREATE TABLE IF NOT EXISTS userData(
                        TimeStamp TEXT, Category TEXT, Action TEXT, Parameter1Name TEXT,
                        Parameter1Value TEXT, Parameter2Name TEXT, Parameter2Value TEXT,
                        formatVersion TEXT, appVersion TEXT, userID TEXT, operatingSystem TEXT
                     );"""
    destCursor.execute(createTable)
    conn.commit()
    for i in os.listdir(unCompressedDir):
        # parameters must be a sequence, and ATTACH needs the full file path
        destCursor.execute("ATTACH ? AS curr_db;", (os.path.join(unCompressedDir, i),))
        sql = """INSERT INTO userData
                 SELECT e.*, g.formatVersion, g.appVersion, g.userID, g.operatingSystem
                 FROM curr_db.[events] e
                 CROSS JOIN (SELECT * FROM curr_db.[global] LIMIT 1) g;"""
        destCursor.execute(sql)
        conn.commit()
        destCursor.execute("DETACH curr_db;")

TypeError: 'NoneType' object is not iterable

I need to process MySQL data one row at a time. I have selected all rows and put them in a tuple, but I get the error above.
What does this mean, and how do I go about fixing it?
Provide some code.
You probably call some function that should update the database but does not return any data (like cursor.execute()). Code like:
data = cursor.execute()
makes data a None object (of NoneType). But without code it's hard to point you to the exact cause of your error.
It means that the object you are trying to iterate is actually None; maybe the query produced no results?
Could you please post a code sample?
The function you used to select all rows returned None. This "probably" (because you did not provide code, I am only assuming) means that the SQL query did not return any values.
Try using the cursor.rowcount variable after you call cursor.execute(). (This code will not run as-is, because I don't know what module you are using.)
db = mysqlmodule.connect("a connection string")
curs = db.cursor()
curs.execute("select top 10 * from tablename where fieldA > 100")
for i in range(curs.rowcount):
    row = curs.fetchone()
    print(row)
Alternatively, you can do this (if you know you want every result returned):
db = mysqlmodule.connect("a connection string")
curs = db.cursor()
curs.execute("select top 10 * from tablename where fieldA > 100")
results = curs.fetchall()
if results:
    for r in results:
        print(r)
This error means that you are attempting to loop over a None object. This is like trying to loop over a null array in C/C++. As Abgan, orsogufo, and Dan mentioned, this is probably because the query did not return anything. I suggest that you check your query/database connection.
A simple code fragment to reproduce this error is:
x = None
for i in x:
    # Do Something
    pass
This may occur when you let cursor.fetchone() execute twice, like this:
import sqlite3

db_filename = 'test.db'
with sqlite3.connect(db_filename) as conn:
    cursor = conn.cursor()
    cursor.execute("""
        insert into test_table (id, username, password)
        values ('user_id', 'myname', 'passwd')
    """)
    cursor.execute("""
        select username, password from test_table where id = 'user_id'
    """)
    if cursor.fetchone() is not None:
        username, password = cursor.fetchone()
        print(username, password)
I don't know much about the reason. But I modified it with try and except, like this:
import sqlite3

db_filename = 'test.db'
with sqlite3.connect(db_filename) as conn:
    cursor = conn.cursor()
    cursor.execute("""
        insert into test_table (id, username, password)
        values ('user_id', 'myname', 'passwd')
    """)
    cursor.execute("""
        select username, password from test_table where id = 'user_id'
    """)
    try:
        username, password = cursor.fetchone()
        print(username, password)
    except:
        pass
My guess is that cursor.fetchone() can't execute twice here because the select returns only one row: the first call consumes it, so the second call returns None, which cannot be unpacked.
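A cleaner fix than try/except is to call fetchone() once and keep the row:
row = cursor.fetchone()
if row is not None:
    username, password = row
    print(username, password)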
I know it's an old question but I thought I'd add one more possibility. I was getting this error when calling a stored procedure, and adding SET NOCOUNT ON at the top of the stored procedure solved it. The issue is that earlier selects that are not the final select for the procedure make it look like you've got empty row sets.
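For example, at the top of the procedure (a T-SQL sketch; the procedure and table names are made up):
CREATE PROCEDURE dbo.get_users
AS
BEGIN
    SET NOCOUNT ON;  -- suppress row-count messages that can masquerade as empty result sets
    UPDATE users SET last_seen = GETDATE();  -- an earlier statement that would otherwise emit a count
    SELECT username, password FROM users;    -- the result set the caller actually fetches
END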
Try appending your query result to a list, and then you can access it. Something like this:
try:
    cursor = con.cursor()
    getDataQuery = 'SELECT * FROM everything'
    cursor.execute(getDataQuery)
    result = cursor.fetchall()
except Exception as e:
    print("There was an error while getting the values: %s" % e)
    raise

resultList = []
for r in result:
    resultList.append(r)
Now you have a list that is iterable.
