Insert into a large table in psycopg using a dictionary - python

I have a VERY large table (>200 columns) in a database, which I'm accessing through psycopg2. I have the rows I want to insert as dictionaries, column name as key and value as value. Using psycopg2, I want to insert the row into the table.
Because of the prohibitively huge number of columns of the table in question, I would rather not write out an insert statement manually. How do I insert the dictionary efficiently and neatly?

This is the test table:
create table testins (foo int, bar int, baz int)
You can compose a SQL statement this way:
d = dict(foo=10, bar=20, baz=30)
cur.execute(
    "insert into testins (%s) values (%s)"
    % (','.join(d), ','.join('%%(%s)s' % k for k in d)),
    d)
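Note that this pastes the column names into the statement as-is. If the keys could ever come from an untrusted source, a safer variant (a sketch, assuming psycopg2 >= 2.7) builds the statement with the psycopg2.sql module so identifiers are quoted properly:
from psycopg2 import sql

d = dict(foo=10, bar=20, baz=30)
query = sql.SQL("insert into testins ({}) values ({})").format(
    sql.SQL(', ').join(map(sql.Identifier, d)),     # quoted column names
    sql.SQL(', ').join(map(sql.Placeholder, d)))    # %(key)s placeholders
cur.execute(query, d)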

Related

SQL psycopg2 insert variable that is a list of variable length into database

I am trying to write a row of observations into my database, but I have some unique variable called list_variable which is a list of strings that can be of length 1-3. So sometimes ['string1'] but sometimes also ['string1','string2'] or ['string1','string2','string3'].
When I try to add this to my database by:
def add_to_cockroach_db():
    cur.execute(f"""
        INSERT INTO database (a, b, c)
        VALUES ({time.time()}, {event}, {list_variable}); <--- this one
        """)
    conn.commit()
I would get the following error (values have been changed for readability):
SyntaxError: at or near "[": syntax error
DETAIL: source SQL:
INSERT INTO database (a, b, c)
VALUES (a_value, b_value, ['c_value_1', 'c_value_2'])
                          ^
HINT: try \h VALUES
It seems that having a variable that is a list is not allowed; how could I make this work?
Thanks in advance!
Edit:
list_variable looks, e.g., like this: ['value1', 'value2']
You can either cast it to a string using
str(['c_value_1', 'c_value_2'])
which looks like this:
"['c_value_1', 'c_value_2']"
or join the elements of your list with a delimiter of your choice. This, for example, generates a comma-separated string:
",".join(['c_value_1', 'c_value_2'])
which looks like this:
'c_value_1,c_value_2'
As Maurice Meyer has already pointed out in the comments, it is better to pass your values as a list or a tuple and let the driver bind them, instead of formatting the query yourself.
Your command could look like this depending on the solution you choose:
cur.execute("INSERT INTO database (a, b, c) VALUES (%s, %s, %s)", (time.time(), event, ",".join(list_variable)))
There are a few ways you could accomplish this.
The simplest way is to call str on the list and insert the result into a string (VARCHAR) column. While this works, it's not easy to work with the values in database queries, and when it's retrieved from the database it's a string, not a list.
Using a VARCHAR[] column type - an array of string values - reflects the actual data type, and enables use of PostgreSQL's array functions in queries.
Finally, you could use a JSONB column type. This allows storage of lists or dicts, or nested combinations of both, so it's very flexible, and PostgreSQL provides functions for working with JSON objects too. However, it might be overkill if you don't need the flexibility, or if you want to be strict about the data.
This script shows all three methods in action:
import psycopg2
from psycopg2.extras import Json

DROP = """DROP TABLE IF EXISTS t73917632"""

CREATE = """\
CREATE TABLE t73917632 (
    s VARCHAR NOT NULL,
    a VARCHAR[] NOT NULL,
    j JSONB NOT NULL
)
"""

INSERT = """INSERT INTO t73917632 (s, a, j) VALUES (%s, %s, %s)"""
SELECT = """SELECT s, a, j FROM t73917632"""

v = ['a', 'b', 'c']

with psycopg2.connect(dbname='test') as conn:
    with conn.cursor() as cur:
        cur.execute(DROP)
        cur.execute(CREATE)
        conn.commit()
        cur.execute(INSERT, (str(v), v, Json(v)))
        conn.commit()
        cur.execute(SELECT)
        for row in cur:
            print(row)
Output:
("['a', 'b', 'c']", ['a', 'b', 'c'], ['a', 'b', 'c'])
It's worth observing that if the array of strings represents some kind of child relationship to the table - for example, the table records teams and the string array contains the names of team members - it is usually a better design to insert each element of the array as a separate row in a child table, associated with the parent row by a foreign key.
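A minimal sketch of that design, using hypothetical teams and team_members tables:
import psycopg2

CREATE = """
CREATE TABLE teams (id SERIAL PRIMARY KEY, name VARCHAR NOT NULL);
CREATE TABLE team_members (
    team_id INTEGER NOT NULL REFERENCES teams (id),
    member_name VARCHAR NOT NULL
);
"""

with psycopg2.connect(dbname='test') as conn:
    with conn.cursor() as cur:
        cur.execute(CREATE)
        # Insert the parent row and get its generated key back
        cur.execute("INSERT INTO teams (name) VALUES (%s) RETURNING id", ('red',))
        team_id = cur.fetchone()[0]
        # One child row per team member, linked by the foreign key
        cur.executemany(
            "INSERT INTO team_members (team_id, member_name) VALUES (%s, %s)",
            [(team_id, name) for name in ['a', 'b', 'c']])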

Insert values in table with two execute commands

Trying to insert values into one MySQL table using Python.
First, inserting values from a csv file with:
sql = "INSERT INTO mydb.table (time, day, number) VALUES %r" % (tuple(values),)
cursor.execute(sql)
then inserting another value into the same table and the same row:
sql = "INSERT INTO mydb.table(name) values(%s)"
cursor.execute(sql)
With this I get the inserts in two different rows…
But I need to insert it into the same row, without using sql = "INSERT INTO mydb.table (time, day, number, name) VALUES %r" % (tuple(values),)
Is there a way to insert values into the same row in two 'insert statements'?
INSERT will always add a new row. If you want to change values in this row, you have to specify a unique identifier (key) in the WHERE clause to access this row and use UPDATE or REPLACE instead.
When using REPLACE you need to be careful if your table contains an auto_increment column, since a new value will be generated.
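For example, a minimal sketch of the INSERT-then-UPDATE approach, assuming time uniquely identifies the row just inserted and name holds the value from the second statement:
# First statement inserts the row with the CSV values
cursor.execute("INSERT INTO mydb.table (time, day, number) VALUES (%s, %s, %s)",
               tuple(values))
# Second statement fills in `name` on that same row via UPDATE,
# matching on the (assumed unique) `time` value
cursor.execute("UPDATE mydb.table SET name = %s WHERE time = %s",
               (name, values[0]))
conn.commit()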

psycopg2 to generate insert statements with variable column counts

I am attempting to insert Excel spreadsheets into a Postgres DB using a Python script with psycopg2.
The problem is not all the spreadsheets have the same number of columns, and I need the insert statement to be flexible enough so I don't have to specify them by name.
My approach is to load the columns of the spreadsheet's header row into a tuple, and likewise with the values being inserted. So for example:
sql = ''''INSERT INTO my_table (%s) VALUES (%s);'''
cur.execute(sql, (cols, vals))
where 'cols' and 'vals' are both tuples.
'cols' can have 7, 9, 10, etc. entries, again depending on how many columns the spreadsheet had.
When I attempt to run this, I get:
psycopg2.ProgrammingError: syntax error at or near "'INSERT INTO my_table
(ARRAY['"
LINE 1: 'INSERT INTO my_table...
^
Not sure if the problem is in my calling syntax, or if you simply can't do what I'm trying to do.
There's an extra apostrophe ' at the beginning of your SQL query.
''''INSERT INTO my_table (%s) VALUES (%s);'''
should be
'''INSERT INTO my_table (%s) VALUES (%s);'''
Edit: I didn't realize you were trying to fill in the columns dynamically. To do that, you should use string formatting for the column names and generate one placeholder per value. Assuming cols is a list:
sql = '''INSERT INTO my_table ({}) VALUES ({})'''.format(
    ', '.join(cols), ', '.join(['%s'] * len(vals)))
Then, your execution would be:
cur.execute(sql, vals)
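To verify what actually gets sent to the server, cur.mogrify renders the composed statement with the values bound (table and column names here are made up):
cols = ('foo', 'bar', 'baz')
vals = (1, 2, 3)
sql = 'INSERT INTO my_table ({}) VALUES ({})'.format(
    ', '.join(cols), ', '.join(['%s'] * len(vals)))
print(cur.mogrify(sql, vals))
# b'INSERT INTO my_table (foo, bar, baz) VALUES (1, 2, 3)'
Note that the column names themselves are pasted in unescaped, so they must come from a trusted source (the psycopg2.sql module shown in the first answer can quote them safely).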

python sqlite: write a big list to sqlite database

I am new to sqlite and I think this question should have been answered before, but I haven't been able to find an answer.
I have a list of around 50 elements that I need to write to an sqlite database with 50 columns.
I went over the documentation at https://docs.python.org/2/library/sqlite3.html, but in the examples the values are specified by ? (so for writing 3 values, 3 ?s are specified).
sample code:
row_to_write = range(50)
conn = sqlite3.connect('C:\sample_database\sample_database')
c = conn.cursor()
tried these:
Approach 1:
c.execute("INSERT INTO PMU VALUES (?)", row_to_write)
ERROR: OperationalError: table PMU has 50 columns but 1 values were supplied
Approach 2: tried writing a generator that iterates over the list
def write_row_value_generator(row_to_write):
    for val in row_to_write:
        yield (val,)
c.executemany("INSERT INTO PMU VALUES (?)", write_row_value_generator(row_to_write))
ERROR: OperationalError: table PMU has 50 columns but 1 values were supplied
What is the correct way of doing this?
Assuming that your row_to_write has exactly the same number of items as PMU has columns, you can create a string of ? marks easily using str.join: ','.join(['?']*len(row_to_write))
import sqlite3
conn = sqlite3.connect(':memory:')
c = conn.cursor()
c.execute("create table PMU (%s)" % ','.join("col%d"%i for i in range(50)))
row_to_write = list(range(100,150,1))
row_value_markers = ','.join(['?']*len(row_to_write))
c.execute("INSERT INTO PMU VALUES (%s)"%row_value_markers, row_to_write)
conn.commit()
You need to specify the names of the columns. SQLite will not guess those for you.
columns = ['A', 'B', 'C', ...]
n = len(row_to_write)
sql = "INSERT INTO PMU {} VALUES ({})".format(
', '.join(columns[:n]) , ', '.join(['?']*n))
c.execute(sql, row_to_write)
Note also that if your rows have a variable number of columns, then you might want to rethink your database schema. Usually each row should have a fixed number of columns, and the variability expresses itself in the number of rows inserted, not the number of columns used.
For example, instead of having 50 columns, perhaps you need just one extra column, whose value is one of 50 names (what used to be a column name). Each value in row_to_write would have its own row, and for each row you would have two columns: the value and the name of the column.
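A sketch of that long-format design (table and column names are made up): each of the 50 values becomes its own row, tagged with the name of what used to be its column.
import sqlite3

conn = sqlite3.connect(':memory:')
c = conn.cursor()
c.execute("CREATE TABLE PMU (reading_id INTEGER, col_name TEXT, value REAL)")

row_to_write = list(range(100, 150))
# One row per value: (which reading, which field, the value itself)
c.executemany("INSERT INTO PMU VALUES (?, ?, ?)",
              [(1, 'col%d' % i, v) for i, v in enumerate(row_to_write)])
conn.commit()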

Add list to sqlite database

How would I add something in sqlite to an already existing table? This is what I have so far:
>>> rid
'26539249'
>>> for t in [(rid,("billy","jim"))]:
c.execute("insert into whois values (?,?)",t)
How would I add onto jim and create a list? Or is there some way to add onto it so it can have multiple values?
I'll take a guess here, but I suspect I'm wrong.
You can't insert ("billy", "jim") as a column in the database. This is intentional. The whole point of RDBMSs like sqlite is that each field holds exactly one value, not a list of values. You can't search for 'jim' in the middle of a column shared with other people, you can't join tables based on 'jim', etc.
If you really, really want to do this, you have to pick some way to convert the multiple values into a single string, and to convert them back on reading. You can use json.dumps/json.loads, repr/ast.literal_eval, or anything else that seems appropriate. But you have to write the extra code yourself. And you won't be getting any real benefit out of the database if you do so; you'd be better off just using shelve.
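For instance, a sketch of the serialize-on-write, parse-on-read approach with json (schema as in the question):
import json
import sqlite3

conn = sqlite3.connect(':memory:')
c = conn.cursor()
c.execute("CREATE TABLE whois (Rid, Names)")
# Store the list as a JSON string...
c.execute("INSERT INTO whois VALUES (?, ?)",
          ('26539249', json.dumps(["billy", "jim"])))
# ...and parse it back on the way out
rid, names = c.execute("SELECT Rid, Names FROM whois").fetchone()
print(json.loads(names))  # ['billy', 'jim']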
So, I'm guessing you don't want to do this, and you want to know what you want to do instead.
Assuming your schema looks something like this:
CREATE TABLE whois (Rid, Names);
What you want is:
CREATE TABLE whois (Rid);
CREATE TABLE whois_names (Rid, Name, FOREIGN KEY(Rid) REFERENCES whois(Rid));
And then, to do the insert:
tt = [(rid, ("billy", "jim"))]
for rid, names in tt:
    c.execute('INSERT INTO whois VALUES (?)', (rid,))
    for name in names:
        c.execute('INSERT INTO whois_names VALUES (?, ?)', (rid, name))
Or (probably faster, but not as interleaved):
c.executemany('INSERT INTO whois VALUES (?)',
              ((rid,) for rid, names in tt))
c.executemany('INSERT INTO whois_names VALUES (?, ?)',
              ((rid, name) for rid, names in tt for name in names))
Not tested, but should do the trick:
conn = sqlite3.connect(db)
cur = conn.cursor()
cur.execute('''CREATE TABLE IF NOT EXISTS Data
               (id INTEGER PRIMARY KEY AUTOINCREMENT, List)''')
cur.execute("INSERT INTO Data (id, List) VALUES (?, ?)",
            (lid, str(My_list)))
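Since str(My_list) stores the list's Python repr, reading it back yields a plain string; ast.literal_eval can parse it back into a list (a sketch, reusing cur and lid from above):
import ast

cur.execute("SELECT List FROM Data WHERE id = ?", (lid,))
stored = cur.fetchone()[0]          # e.g. "['value1', 'value2']"
my_list = ast.literal_eval(stored)  # back to a Python list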
