psycopg2 interpolate table name in executemany statement - python

I am trying to insert data into a table. The table is determined at the beginning of the program and remains constant throughout. How do I interpolate the table name in an executemany statement like the one below?
tbl = 'table_name'
rows = [{'this':x, 'that': x+1} for x in range(10)]
cur.executemany("""INSERT INTO %(tbl)s
    VALUES (
        %(this)s,
        %(that)s
    )""", rows)

As stated in the official documentation: "Only query values should be bound via this method: it shouldn’t be used to merge table or field names to the query. If you need to generate dynamically an SQL query (for instance choosing dynamically a table name) you can use the facilities provided by the psycopg2.sql module."
It has the following syntax:
from psycopg2 import sql
tbl = 'table_name'
rows = [{'this':x, 'that': x+1} for x in range(10)]
cur.executemany(
    sql.SQL("INSERT INTO {} VALUES (%(this)s, %(that)s);")
    .format(sql.Identifier(tbl)), rows)
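If you want to sanity-check the composed statement, sql objects can be rendered against a live connection with as_string(). A quick sketch, assuming an open connection conn:
print(sql.SQL("INSERT INTO {} VALUES (%(this)s, %(that)s);")
      .format(sql.Identifier(tbl))
      .as_string(conn))
# INSERT INTO "table_name" VALUES (%(this)s, %(that)s);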
More on http://initd.org/psycopg/docs/sql.html#module-psycopg2.sql

Related

SQL psycopg2 insert variable that is a list of variable length into database

I am trying to write a row of observations into my database, but one variable, list_variable, is a list of strings that can be of length 1-3: sometimes ['string1'], but sometimes ['string1','string2'] or ['string1','string2','string3'].
When I try to add this to my database by:
def add_to_cockroach_db():
    cur.execute(f"""
        INSERT INTO database (a, b, c)
        VALUES ({time.time()}, {event}, {list_variable}) <--- this one
    """)
    conn.commit()
I would get the following error (values have been changed for readability):
SyntaxError: at or near "[": syntax error
DETAIL: source SQL:
INSERT INTO database (a, b, c)
VALUES (a_value, b_value, ['c_value_1', 'c_value_2'])
^
HINT: try \h VALUES
It seems that having a variable that is a list is not allowed. How could I make this work? Thanks in advance!
Edit: list_variable looks e.g. like this: ['value1', 'value2']
You can either cast it to string using
str(['c_value_1', 'c_value_2'])
which looks like this:
"['c_value_1', 'c_value_2']"
or join the elements of your list with a delimiter of your choice. This, for example, generates a comma-separated string:
",".join(['c_value_1', 'c_value_2'])
which looks like this:
'c_value_1,c_value_2'
As Maurice Meyer has already pointed out in the comments, it is better to pass your values as a list or a tuple instead of formatting the query yourself.
Your command could look like this depending on the solution you choose:
cur.execute("INSERT INTO database (a, b, c) VALUES (%s, %s, %s)", (time.time(), event, ",".join(list_variable)))
There are a few ways you could accomplish this.
The simplest way is to call str on the list and insert the result into a string (VARCHAR) column. While this works, it's not easy to work with the values in database queries, and when it's retrieved from the database it's a string, not a list.
Using a VARCHAR[] column type - an array of string values - reflects the actual data type, and enables use of PostgreSQL's array functions in queries.
Finally, you could use a JSONB column type. This allows storage of lists or dicts, or nested combinations of both, so it's very flexible, and PostgreSQL provides functions for working with JSON objects too. However, it might be overkill if you don't need the flexibility, or if you want to be strict about the data.
This script shows all three methods in action:
import psycopg2
from psycopg2.extras import Json

DROP = """DROP TABLE IF EXISTS t73917632"""
CREATE = """\
CREATE TABLE t73917632 (
    s VARCHAR NOT NULL,
    a VARCHAR[] NOT NULL,
    j JSONB NOT NULL
)
"""
INSERT = """INSERT INTO t73917632 (s, a, j) VALUES (%s, %s, %s)"""
SELECT = """SELECT s, a, j FROM t73917632"""

v = ['a', 'b', 'c']

with psycopg2.connect(dbname='test') as conn:
    with conn.cursor() as cur:
        cur.execute(DROP)
        cur.execute(CREATE)
        conn.commit()
        cur.execute(INSERT, (str(v), v, Json(v)))
        conn.commit()
        cur.execute(SELECT)
        for row in cur:
            print(row)
Output:
("['a', 'b', 'c']", ['a', 'b', 'c'], ['a', 'b', 'c'])
It's worth observing that if the array of strings represents some kind of child relationship to the table - for example the table records teams, and the string array contains the names of team members - it is usually a better design to insert each element in the array into a separate row in a child table, and associate them with the parent row using a foreign key.
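For completeness, a minimal sketch of that normalized design; the team/team_member names are illustrative, not from the question:
cur.execute("""
    CREATE TABLE team (
        id SERIAL PRIMARY KEY,
        name VARCHAR NOT NULL
    )
""")
cur.execute("""
    CREATE TABLE team_member (
        team_id INTEGER NOT NULL REFERENCES team (id),
        member_name VARCHAR NOT NULL
    )
""")
# one parent row, then one child row per list element
cur.execute("INSERT INTO team (name) VALUES (%s) RETURNING id", ('alpha',))
team_id = cur.fetchone()[0]
cur.executemany(
    "INSERT INTO team_member (team_id, member_name) VALUES (%s, %s)",
    [(team_id, m) for m in ['a', 'b', 'c']],
)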

Update SQL Records Based on Pandas DataFrame

I have established a connection with SQL using the code below, extracted the data from a SQL table, converted it into a dataframe, and run the predictive model. I have the output generated and want to add the values of the output column alone to the database, based on the Unique ID column.
import pyodbc
import pandas as pd

server = 'Servername'
database = 'DBName'
username = 'username'
password = 'password'
cnxn = pyodbc.connect('DRIVER={SQL Server};SERVER='+server+';DATABASE='+database+';UID='+username+';PWD='+password)

sql = 'SELECT * FROM TableName'
DF = pd.read_sql(sql, cnxn)
I have the columns 'UniqueID', 'Description', 'Date', and 'Predicted' in dataframe 'DF', which was retrieved from the database. I have predicted the output 'Predicted', and it is available in my dataframe. I need to write back only the value in the 'Predicted' column of the database, matched on UniqueID.
Please let me know if there is a way to do this, or whether I can just overwrite the complete dataframe to the database table.
The best method I've found is to take advantage of an SQL inner join and temporary tables to update the values. This works well if you need to update many records in SQL.
Apologies if there are any errors here as I'm borrowing this from a class I've written.
SQL Cursor
cursor = cnxn.cursor()
# reduce number of calls to server on inserts
cursor.fast_executemany = True
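Create the Temporary Table
One step worth making explicit: the temp table has to exist before the insert below. A minimal sketch, assuming an integer key and a float prediction (adjust the types to match your schema):
cursor.execute("""
    CREATE TABLE #temp_TableName (
        UniqueID INT PRIMARY KEY,
        Predicted FLOAT
    )
""")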
Insert Values into a Temporary Table
# insert only the key and the updated values
subset = DF[['UniqueID','Predicted']]
# form SQL insert statement
columns = ", ".join(subset.columns)
values = '('+', '.join(['?']*len(subset.columns))+')'
# insert
statement = "INSERT INTO #temp_TableName ("+columns+") VALUES "+values
insert = [tuple(x) for x in subset.values]
cursor.executemany(statement, insert)
Update Values in Main Table from Temporary Table
statement = '''
UPDATE
    t
SET
    t.Predicted = u.Predicted
FROM
    TableName AS t
INNER JOIN
    #temp_TableName AS u
ON
    u.UniqueID = t.UniqueID;
'''
cursor.execute(statement)
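Commit the Changes
Depending on how the connection was opened (pyodbc's autocommit is off by default), the update may also need an explicit commit:
cnxn.commit()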

In a Python script I have an insert query, but it gives an error when I insert multiple columns in the same query

In my Python script I have an insert query, but it gives an error when I try to insert multiple columns in the same query.
For a single column it works perfectly.
Below is my code.
My database is AWS S3.
A = []
for score_row in score:
    A.append(score_row[2])
print("A=", A)

B = []
for day_row in score:
    B.append(day_row[1])
print("B=", B)

for x, y in zip(A, B):
    sql = """INSERT INTO calculated_corr_coeff(date,Day) VALUES (?,?)"""
    cursor.executemany(sql, (x,), (y,))
When I replace the above query with the following SQL insert statement, it works perfectly.
sql = """INSERT INTO calculated_corr_coeff(date,Day) VALUES (?)"""
cursor.executemany(sql, (x,))
Fix your code like this:
sql = """INSERT INTO calculated_corr_coeff(date,Day) VALUES (?,?)"""
cursor.execute(sql, (x, y,))  #<-- here
because it is just one insert (not several inserts).
Explanation
I guess you have confused the number of inserts (rows) with the number of parameters (fields to insert on each row). When you want to insert several rows, use executemany; for just one row you should use execute. The second parameter of execute is the list (or sequence) of values to be inserted in that row.
Alternative
You can also change the syntax and insert all the data in one shot by passing a sequence of rows:
values = list(zip(A, B))  # instead of the "for" loop
sql = """INSERT INTO calculated_corr_coeff(date,Day) VALUES (?,?)"""
cursor.executemany(sql, values)
Notice this approach doesn't use a for statement. This means all the data is sent to the database in one call, which is more efficient.

psycopg2 to generate insert statements with variable column counts

I am attempting to insert Excel spreadsheets into a Postgres DB using a Python script with psycopg2.
The problem is not all the spreadsheets have the same number of columns, and I need the insert statement to be flexible enough so I don't have to specify them by name.
My approach is to load the columns of the spreadsheet's header row into a tuple, and likewise with the values being inserted. So for example:
sql = ''''INSERT INTO my_table (%s) VALUES (%s);'''
cur.execute(sql, (cols, vals))
where 'cols' and 'vals' are both tuples.
'cols' can have 7, 9, 10, etc. entries, again depending on how many columns the spreadsheet had.
When I attempt to run this, I get:
psycopg2.ProgrammingError: syntax error at or near "'INSERT INTO my_table
(ARRAY['"
LINE 1: 'INSERT INTO my_table...
^
Not sure if the problem is in my calling syntax, or if you simply can't do what I'm trying to do.
There's an extra apostrophe ' at the beginning of your sql query.
''''INSERT INTO my_table (%s) VALUES (%s);'''
should be
'''INSERT INTO my_table (%s) VALUES (%s);'''
Edit: didn't realize you were trying to fill in the columns dynamically. To do that, you should use string formatting for the column names and placeholders for the values. Assuming cols and vals are lists of equal length:
sql = '''INSERT INTO my_table ({}) VALUES ({})'''.format(','.join(cols), ','.join(['%s'] * len(cols)))
Then, your execution would be:
cur.execute(sql, vals)
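Note that plain string formatting leaves the column names unquoted and unescaped. If the spreadsheet headers aren't fully trusted, the psycopg2.sql module (used in the first answer above) can compose identifiers safely. A minimal sketch, again assuming cols and vals are sequences of equal length:
from psycopg2 import sql

# builds: INSERT INTO my_table ("col1", "col2", ...) VALUES (%s, %s, ...)
stmt = sql.SQL("INSERT INTO my_table ({}) VALUES ({})").format(
    sql.SQL(", ").join(map(sql.Identifier, cols)),
    sql.SQL(", ").join(sql.Placeholder() * len(cols)),
)
cur.execute(stmt, vals)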

Retrieving and selecting binary values from Mysql with Python 3

I'm trying to select data from one table, and perform a query on another table using the returned values from the first table.
Both tables are case-sensitive, and of type utf8-bin.
When I perform my first select, I am returned a tuple of binary values:
query = """SELECT id FROM table1"""
results = (b'1234', b'2345', b'3456')
I'd then like to perform a query on table2 using the ids returned from table1:
query = """SELECT element FROM table2 WHERE id IN (%s) """ % results
Is this the right way to do this?
You need to create the query so that it can be properly parameterized:
query = """SELECT element FROM table2 WHERE id IN (%s) """ % ",".join(['%s'] * len(results))
This will transform the query to:
query = """SELECT element FROM table2 WHERE id IN (%s,%s,%s) """
Then you can just pass query and results to the execute() (or appropriate) method so that results are properly parameterized.
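Putting it together, a minimal usage sketch, assuming cursor is an open MySQL cursor and results is the tuple returned by the first query:
placeholders = ",".join(["%s"] * len(results))
query = "SELECT element FROM table2 WHERE id IN (%s)" % placeholders
# the driver escapes each id; results is passed as parameters, not interpolated
cursor.execute(query, results)
elements = cursor.fetchall()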
