psycopg2: dynamic table, columns and values - python

I am trying to use psycopg2 to generate a dynamic INSERT INTO statement where the table, columns and values are all dynamic.
I've read the documentation for the psycopg2.sql module, which provides some hints, but I can't get to the final result.
import psycopg2
from psycopg2 import sql

# inputs to the DB. Keys represent COLS. 2 records.
kvalues = [{'first': 11, 'second': 22}, {'first': 33, 'second': 44}]
table = 'my_table'
columns = list(kvalues[0].keys())
values = [list(kvalue.values()) for kvalue in kvalues]

# Generate SQL string
conn = psycopg2.connect()  # ...connection details...
cur = conn.cursor()
query_string = sql.SQL("INSERT INTO {} ({}) VALUES %s").format(
    sql.Identifier(table),
    sql.SQL(', ').join(map(sql.Identifier, columns))).as_string(cur)
print(cur.mogrify(query_string, values))
#>> TypeError: not all arguments converted during string formatting
I've tried adding the values inside the SQL statement as well, but that only produces more errors.
How can I generate a dynamic INSERT INTO statement that accepts the columns and values from kvalues?
Thank you!

The issue is tuple vs. list formatting: psycopg2 adapts a Python tuple to a SQL record such as (11, 22), but a list to an ARRAY. Below is the code that works:
import psycopg2
from psycopg2 import sql

kvalues = [{'first': 11, 'second': 22}, {'first': 33, 'second': 44}]
table = 'my_table'
columns = list(kvalues[0].keys())
values = [tuple(kvalue.values()) for kvalue in kvalues]  # tuples, not lists

# Generate SQL string
conn = psycopg2.connect()  # ...connection details...
cur = conn.cursor()
query_string = sql.SQL("INSERT INTO {} ({}) VALUES {}").format(
    sql.Identifier(table),
    sql.SQL(', ').join(map(sql.Identifier, columns)),
    sql.SQL(', ').join(sql.Placeholder() * len(values)),  # one placeholder per record
).as_string(cur)
print(cur.mogrify(query_string, values))
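As a side note, psycopg2.extras.execute_values can expand a single VALUES %s placeholder over many rows, which pairs naturally with the composed identifiers above. A minimal sketch, assuming the same table, columns and values as in the code just shown:

from psycopg2 import sql
from psycopg2.extras import execute_values

# compose table/column identifiers safely, leave a single %s for the rows
insert = sql.SQL("INSERT INTO {} ({}) VALUES %s").format(
    sql.Identifier(table),
    sql.SQL(', ').join(map(sql.Identifier, columns)),
)
execute_values(cur, insert.as_string(cur), values)  # expands %s into one tuple per record
conn.commit()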

Related

How to insert NULL values into PostgreSQL database using Python?

I have a list of tuples with data, something like this:
list1 = [(1100, 'abc', '{"1209": "Y", "1210": "Y"}'), (1100, 'abc', None)]
def insert_sample_data(col_val):
    cur = self.con.cursor()
    sql = """insert into sampletable values {}""".format(col_val)
    cur.execute(sql)
    self.con.commit()
    cur.close()

values1 = ', '.join(map(str, list1))  # bulk insert
insert_sample_data(values1)
Table Structure:
ssid int, name varchar, rules jsonb
When I try to insert the data, it throws an error saying column "none" does not exist. How can I load the data into the table with NULL or None values?
I looked at this solution but it does not help in this case How to insert 'NULL' values into PostgreSQL database using Python?
As @shmee states, you need to use something like executemany and parameterize your values instead of using format, which is vulnerable to SQL injection.
I would modify your code as follows:
def insert_sample_data(self, values):  # added self since you are referencing it below
    with self.con.cursor() as cur:
        sql = "insert into sampletable values (%s, %s, %s)"  # use %s for parameters
        cur.executemany(sql, values)  # pass the list of tuples directly
        self.con.commit()

list1 = [(1100, 'abc', '{"1209": "Y", "1210": "Y"}'), (1100, 'abc', None)]
self.insert_sample_data(list1)  # pass the list directly
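The key point is that psycopg2 adapts Python None to SQL NULL when the value is passed as a parameter. A quick check with mogrify (a sketch assuming an open cursor cur) illustrates this:

print(cur.mogrify("insert into sampletable values (%s, %s, %s)",
                  (1100, 'abc', None)))
# b"insert into sampletable values (1100, 'abc', NULL)"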

How to insert a dictionary into a PostgreSQL table with psycopg2

How do I insert a Python dictionary into a PostgreSQL table? I keep getting the following error, so my query is not formatted correctly:
Error syntax error at or near "To" LINE 1: INSERT INTO bill_summary VALUES(To designate the facility of...
import psycopg2
import json
import psycopg2.extras
import sys

with open('data.json', 'r') as f:
    data = json.load(f)

con = None
try:
    con = psycopg2.connect(database='sanctionsdb', user='dbuser')
    cur = con.cursor(cursor_factory=psycopg2.extras.DictCursor)
    cur.execute("CREATE TABLE bill_summary(title VARCHAR PRIMARY KEY, summary_text VARCHAR, action_date VARCHAR, action_desc VARCHAR)")
    for d in data:
        title = d['title']
        summary_text = d['summary-text']
        action_date = d['action-date']
        action_desc = d['action-desc']
        q = "INSERT INTO bill_summary VALUES(" + str(title) + str(summary_text) + str(action_date) + str(action_desc) + ")"
        cur.execute(q)
    con.commit()
except psycopg2.DatabaseError, e:
    if con:
        con.rollback()
    print 'Error %s' % e
    sys.exit(1)
finally:
    if con:
        con.close()
You should use the dictionary as the second parameter to cursor.execute(). See the example code after this statement in the documentation:
Named arguments are supported too using %(name)s placeholders in the query and specifying the values into a mapping.
So your code may be as simple as this:
with open('data.json', 'r') as f:
data = json.load(f)
print(data)
""" above prints something like this:
{'title': 'the first action', 'summary-text': 'some summary', 'action-date': '2018-08-08', 'action-desc': 'action description'}
use the json keys as named parameters:
"""
cur = con.cursor()
q = "INSERT INTO bill_summary VALUES(%(title)s, %(summary-text)s, %(action-date)s, %(action-desc)s)"
cur.execute(q, data)
con.commit()
Note also this warning (from the same page of the documentation):
Warning: Never, never, NEVER use Python string concatenation (+) or string parameters interpolation (%) to pass variables to a SQL query string. Not even at gunpoint.
q = "INSERT INTO bill_summary VALUES(" +str(title)+str(summary_text)+str(action_date)+str(action_desc)+")"
You're writing your query the wrong way, by concatenating the values directly. They should instead be comma-separated elements, like this:
q = "INSERT INTO bill_summary VALUES({0},{1},{2},{3})".format(str(title), str(summary_text), str(action_date), str(action_desc))
Since you're not specifying the column names, I assume they are in the same order as the values in your insert query. There are basically two ways of writing an insert query in PostgreSQL. One is by specifying the column names and their corresponding values, like this:
INSERT INTO TABLE_NAME (column1, column2, column3,...columnN)
VALUES (value1, value2, value3,...valueN);
The other way is to omit the column names entirely if you are adding values for all the columns of the table; however, make sure the order of the values matches the order of the columns in the table. This is the form you used in your query:
INSERT INTO TABLE_NAME VALUES (value1,value2,value3,...valueN);
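Note that .format() still interpolates values into the SQL string, so the warning quoted above applies here too. A safer sketch of the same insert, using placeholders and letting the driver do the quoting (variable names as in the question):

q = "INSERT INTO bill_summary VALUES (%s, %s, %s, %s)"
cur.execute(q, (title, summary_text, action_date, action_desc))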

Query format with f-string

I have a very big dictionary that I want to insert into a MySQL table. The dictionary keys are the column names in the table. I'm constructing my query like this as of now:
bigd = {'k1':'v1', 'k2':10}
cols = str(bigd.keys()).strip('[]')
vals = str(bigd.values()).strip('[]')
query = "INSERT INTO table ({}) values ({})".format(cols,vals)
print query
Output:
"INSERT INTO table ('k2', 'k1') values (10, 'v1')"
And this works in Python 2.7.
But in Python 3.6, if I use formatted string literals (f-strings) like this:
query = f"INSERT INTO table ({cols}) values ({vals})"
print(query)
It prints this:
"INSERT INTO table (dict_keys(['k1', 'k2'])) values (dict_values(['v1', 10]))"
Any tips?
For your curiosity: in Python 3, dict.keys() and dict.values() return view objects, so casting them to str embeds the dict_keys(...)/dict_values(...) representation in the f-string.
You could just cast to tuples and then insert:
cols = tuple(bigd.keys())
vals = tuple(bigd.values())
q = f"INSERT INTO table {cols} values {vals}"
but, as the comments note, this isn't a safe approach: interpolating values directly into the query string leaves you open to SQL injection.
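A parameterized sketch instead (assuming a MySQL driver such as mysql.connector or PyMySQL, both of which use %s placeholders, and an open cursor named cursor; the column names still come straight from your own dict, so they must be trusted):

cols = ', '.join(bigd.keys())                 # column names from the dict
placeholders = ', '.join(['%s'] * len(bigd))  # one %s per value
query = f"INSERT INTO table ({cols}) values ({placeholders})"
cursor.execute(query, tuple(bigd.values()))   # driver handles quoting/escaping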

Passing parameter in psycopg2

I am trying to access PostgreSQL using psycopg2:
sql = """
    SELECT
        %s
    FROM
        table;
"""
cur = con.cursor()
input = (['id', 'name'], )
cur.execute(sql, input)
data = pd.DataFrame.from_records(cur.fetchall())
However, the returned result is:
0
0 [id, name]
1 [id, name]
2 [id, name]
3 [id, name]
4 [id, name]
If I try to select a single column, it looks like:
0
0 id
1 id
2 id
3 id
4 id
It looks like something is wrong with the quoting around the column name (a single quote that should not be there):
In [49]: print cur.mogrify(sql, input)
SELECT
    'id'
FROM
    table;
but I am following the docs: http://initd.org/psycopg/docs/usage.html#
Can anyone tell me what is going on here? Thanks a lot!
Use the AsIs extension
import psycopg2
from psycopg2.extensions import AsIs
column_list = ['id','name']
columns = ', '.join(column_list)
cursor.execute("SELECT %s FROM table", (AsIs(columns),))
And mogrify will show that the column names are not quoted and are passed in as is.
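For illustration, checking the composed query with mogrify (a sketch assuming an open cursor named cursor); note that AsIs performs no escaping at all, so column_list must never contain untrusted input:

print(cursor.mogrify("SELECT %s FROM table", (AsIs(columns),)))
# b'SELECT id, name FROM table'  -- no quotes around the identifiers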
Nowadays, you can use sql.Identifier to do this in a clean and secure way:
from psycopg2 import sql

statement = """
    SELECT
        {id}, {name}
    FROM
        table;
"""
with con.cursor() as cur:
    cur.execute(sql.SQL(statement).format(
        id=sql.Identifier("id"),
        name=sql.Identifier("name"),
    ))
    data = pd.DataFrame.from_records(cur.fetchall())
More information on query composition here: https://www.psycopg.org/docs/sql.html
The reason is that you passed the string representation of the list ['id', 'name'] as a SQL query parameter, not as column names. So the resulting query was similar to
SELECT 'id, name' FROM table
It looks like your table had 5 rows, so the returned result was just this literal for each row.
Column names cannot be SQL query parameters, but they can be ordinary Python string parameters that you prepare before executing the query:
sql = """
    SELECT
        %s
    FROM
        table;
"""
input = 'id, name'
sql = sql % input
print(sql)
cur = con.cursor()
cur.execute(sql)
data = pd.DataFrame.from_records(cur.fetchall())
In this case the resulting query is
SELECT
    id, name
FROM
    table;

Python Sqlite3 insert operation with a list of column names

Normally, if I want to insert values into a table, I will do something like this (assuming that I know which columns the values I want to insert belong to):
import sqlite3

conn = sqlite3.connect('mydatabase.db')
conn.execute("INSERT INTO MYTABLE (ID, COLUMN1, COLUMN2) VALUES (?,?,?)",
             [myid, value1, value2])
But now I have a list of columns (the length of the list may vary) and a list of values, one for each column in the list.
For example, if I have a table with 10 columns (namely column1, column2, ..., column10), I might have a list of columns I want to update, say [column3, column4], and a list of values for those columns, [value for column3, value for column4].
How do I insert the values in the list into the individual columns they each belong to?
As far as I know, the parameter list in conn.execute works only for values, so we have to use string formatting, like this:
import sqlite3
conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE t (a integer, b integer, c integer)')
col_names = ['a', 'b', 'c']
values = [0, 1, 2]
conn.execute('INSERT INTO t (%s, %s, %s) values (?,?,?)' % tuple(col_names), values)
Note that this is a very bad practice: strings passed to the database should always be checked for injection attacks. However, you could pass the list of column names through a sanitizing function before insertion.
EDITED:
For variables with various length you could try something like
exec_text = 'INSERT INTO t (' + ','.join(col_names) + ') values (' + ','.join(['?'] * len(values)) + ')'
conn.execute(exec_text, values)
# as long as len(col_names) == len(values)
Of course string formatting will work, you just need to be a bit cleverer about it.
col_names = ','.join(col_list)
col_spaces = ','.join(['?'] * len(col_list))
sql = 'INSERT INTO t (%s) values (%s)' % (col_names, col_spaces)
conn.execute(sql, values)
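Putting it together, a minimal end-to-end sketch of this pattern (the table and column names here are hypothetical):

import sqlite3

conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE t (column1, column2, column3, column4)')

col_list = ['column3', 'column4']   # the columns you actually want to fill
values = [30, 40]                   # matching values

col_names = ','.join(col_list)
col_spaces = ','.join(['?'] * len(col_list))
conn.execute('INSERT INTO t (%s) values (%s)' % (col_names, col_spaces), values)
print(conn.execute('SELECT * FROM t').fetchall())  # [(None, None, 30, 40)]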
I was looking for a solution to create columns based on a list of unknown/variable length and found this question. However, I managed to find a nicer solution (for me, anyway) that's also a bit more modern, so I thought I'd include it in case it helps someone:
import sqlite3

def create_sql_db(my_list):
    file = 'my_sql.db'
    table_name = 'table_1'
    init_col = 'id'
    col_type = 'TEXT'
    conn = sqlite3.connect(file)
    c = conn.cursor()
    # CREATE TABLE (IF IT DOESN'T ALREADY EXIST)
    c.execute('CREATE TABLE IF NOT EXISTS {tn} ({nf} {ft})'.format(
        tn=table_name, nf=init_col, ft=col_type))
    # CREATE A COLUMN FOR EACH ITEM IN THE LIST
    for new_column in my_list:
        c.execute('ALTER TABLE {tn} ADD COLUMN "{cn}" {ct}'.format(
            tn=table_name, cn=new_column, ct=col_type))
    conn.close()

my_list = ["Col1", "Col2", "Col3"]
create_sql_db(my_list)
All my data is of type TEXT, so I just have a single variable col_type, but you could, for example, feed in a list of tuples (or a tuple of tuples, if that's what you're into):
my_other_list = [("ColA", "TEXT"), ("ColB", "INTEGER"), ("ColC", "BLOB")]
and change the CREATE A COLUMN step to:
for tupl in my_other_list:
    new_column = tupl[0]  # "ColA", "ColB", "ColC"
    col_type = tupl[1]    # "TEXT", "INTEGER", "BLOB"
    c.execute('ALTER TABLE {tn} ADD COLUMN "{cn}" {ct}'.format(
        tn=table_name, cn=new_column, ct=col_type))
As a noob, I can't comment on the very succinct, updated solution @ron_g offered. While testing, though, I had to frequently delete the sample database itself, so for any other noobs using this to test, I would advise adding:
c.execute('DROP TABLE IF EXISTS {tn}'.format(
    tn=table_name))
prior to the 'CREATE TABLE ...' portion.
There appear to be multiple instances of
.format(
    tn=table_name, ...)
in both 'CREATE TABLE ...' and 'ALTER TABLE ...', so I'm trying to figure out whether it's possible to define this once (similar to, or inside, the def section) and reuse it.
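One way to do that (a sketch, not from the original answers) is to collect the shared format argument in a dict once, inside the def, and unpack it into each call:

fmt = {'tn': table_name}  # shared argument, defined once

c.execute('DROP TABLE IF EXISTS {tn}'.format(**fmt))
c.execute('CREATE TABLE IF NOT EXISTS {tn} ({nf} {ft})'.format(
    nf=init_col, ft=col_type, **fmt))
for new_column in my_list:
    c.execute('ALTER TABLE {tn} ADD COLUMN "{cn}" {ct}'.format(
        cn=new_column, ct=col_type, **fmt))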
