Brief syntax for inserting row into sqlite database using python sqlite3 - python

I have a csv file of size 360x120 that I want to import into my sqlite database row by row. For one row, I know that the syntax below works if mytuple has 2 elements:
import sqlite3
conn = sqlite3.connect(dbLoc)
cur = conn.cursor()
mytuple = (a, b, c, ...) #some long tuple of 120 elements
cur.execute('INSERT INTO tablename VALUES (?, ?)', mytuple)
Problem is, my rows contain 120 columns and I can't really type 120 question marks into the cur.execute() line. Actually I have tried that, and it works, but it is not a good solution. One thing I have tried was:
cur.execute('INSERT INTO tablename VALUES ?', mytuple)
I thought it would just do ? = mytuple and replace the ? with mytuple, but it doesn't do that. A user comment on the article sebastianraschka.com/Articles/2014_sqlite_in_python_tutorial.html shows such syntax, which would work for me, but it does not:
t = ('RHAT',)
c.execute('SELECT * FROM stocks WHERE symbol=?', t)
As seen there, he's able to substitute a tuple into the execute string with a single ?. How can I achieve the same with INSERT INTO tablename?

sqlite3 doesn't support a more concise syntax; you have to generate the placeholders yourself:
c.execute('INSERT INTO tablename VALUES ({})'.format(','.join('?' * len(t))), t)
Note: the default SQLITE_MAX_COLUMN is 2000, and some algorithms in SQLite are O(n**2) in the number of columns, so if you increase the limit, it may slow down db operations.
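For example, a minimal sketch of that approach (assuming a hypothetical file data.csv whose rows match the table's 120 columns):
import csv
import sqlite3

conn = sqlite3.connect(dbLoc)
cur = conn.cursor()

with open('data.csv', newline='') as f:  # hypothetical filename
    rows = list(csv.reader(f))

# One ? per column: '?,?,...,?'
placeholders = ','.join('?' * len(rows[0]))
cur.executemany('INSERT INTO tablename VALUES ({})'.format(placeholders), rows)
conn.commit()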

Related

SQL psycopg2 insert variable that is a list of variable length into database

I am trying to write a row of observations into my database, but I have some unique variable called list_variable which is a list of strings that can be of length 1-3. So sometimes ['string1'] but sometimes also ['string1','string2'] or ['string1','string2','string3'].
When I try to add this to my database by:
def add_to_cockroach_db():
    cur.execute(f"""
        INSERT INTO database (a, b, c)
        VALUES ({time.time()}, {event}, {list_variable});  <--- this one
    """)
    conn.commit()
I would get the following error (values have been changed for readability):
SyntaxError: at or near "[": syntax error
DETAIL: source SQL:
INSERT INTO database (a, b, c)
VALUES (a_value, b_value, ['c_value_1', 'c_value_2'])
^
HINT: try \h VALUES
It seems that having a variable that is a list is not allowed, how could I make this work out?
Thanks in advance!
Edit: list_variable looks, e.g., like this: ['value1','value2']
You can either convert it to a string using
str(['c_value_1', 'c_value_2'])
which looks like this:
"['c_value_1', 'c_value_2']"
or join the elements of your list with a delimiter of your choice. This, for example, generates a comma-separated string:
",".join(['c_value_1', 'c_value_2'])
which looks like this:
'c_value_1,c_value_2'
As Maurice Meyer has already pointed out in the comments, it is better to pass your values as a list or a tuple instead of formatting the query yourself.
Your command could look like this depending on the solution you choose:
cur.execute("INSERT INTO database (a, b, c) VALUES (%s, %s, %s)", (time.time(), event, ",".join(list_variable)))
There are a few ways you could accomplish this.
The simplest way is to call str on the list and insert the result into a string (VARCHAR) column. While this works, it's not easy to work with the values in database queries, and when it's retrieved from the database it's a string, not a list.
Using a VARCHAR[] column type - an array of string values - reflects the actual data type, and enables use of PostgreSQL's array functions in queries.
Finally, you could use a JSONB column type. This allows storage of lists or dicts, or nested combinations of both, so it's very flexible, and PostgreSQL provides functions for working with JSON objects too. However it might be overkill if you don't need the flexibility, or if you want to be strict about the data.
This script shows all three methods in action:
import psycopg2
from psycopg2.extras import Json

DROP = """DROP TABLE IF EXISTS t73917632"""

CREATE = """\
CREATE TABLE t73917632 (
    s VARCHAR NOT NULL,
    a VARCHAR[] NOT NULL,
    j JSONB NOT NULL
)
"""

INSERT = """INSERT INTO t73917632 (s, a, j) VALUES (%s, %s, %s)"""
SELECT = """SELECT s, a, j FROM t73917632"""

v = ['a', 'b', 'c']

with psycopg2.connect(dbname='test') as conn:
    with conn.cursor() as cur:
        cur.execute(DROP)
        cur.execute(CREATE)
        conn.commit()
        cur.execute(INSERT, (str(v), v, Json(v)))
        conn.commit()
        cur.execute(SELECT)
        for row in cur:
            print(row)
Output:
("['a', 'b', 'c']", ['a', 'b', 'c'], ['a', 'b', 'c'])
It's worth observing that if the array of strings represents some kind of child relationship to the table - for example the table records teams, and the string array contains the names of team members - it is usually a better design to insert each element in the array into a separate row in a child table, and associate them with the parent row using a foreign key.
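A hedged sketch of that normalized design (the team/team_member names are illustrative, not from the question):
import psycopg2

DDL = """
CREATE TABLE team (id SERIAL PRIMARY KEY, name VARCHAR NOT NULL);
CREATE TABLE team_member (
    team_id INTEGER NOT NULL REFERENCES team (id),
    member_name VARCHAR NOT NULL
);
"""

members = ['string1', 'string2']

with psycopg2.connect(dbname='test') as conn:
    with conn.cursor() as cur:
        cur.execute(DDL)
        # Insert the parent row and fetch its generated key.
        cur.execute("INSERT INTO team (name) VALUES (%s) RETURNING id", ('team_a',))
        team_id = cur.fetchone()[0]
        # One child row per list element, linked by the foreign key.
        cur.executemany(
            "INSERT INTO team_member (team_id, member_name) VALUES (%s, %s)",
            [(team_id, m) for m in members])
    conn.commit()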

Faster solution than executemany to insert multiple rows at once in pyodbc

I would like to insert multiple rows with one insert statement.
I tried with
params = ((1, 2), (3,4), (5,6))
sql = 'insert into tablename (column_name1, column_name2) values (?, ?)'
cursor.fast_executemany = True
cursor.executemany(sql, params)
but it's just a simple loop over the params, running the execute method under the hood.
I also tried creating a longer insert statement, like INSERT INTO tablename (col1, col2) VALUES (?,?), (?,?)...(?,?).
def flat_map_list_of_tuples(list_of_tuples):
    return [element for tupl in list_of_tuples for element in tupl]

args_str = ', '.join('(?,?)' for x in params)
sql = 'insert into tablename (column_name1, column_name2) values '
db.cursor.execute(sql + args_str, flat_map_list_of_tuples(params))
It worked and reduced the time of insertion from 10.9s to 6.1s.
Is this solution correct? Does it have some vulnerabilities?
Is this solution correct?
The solution you propose, which is to build a table value constructor (TVC), is not incorrect but it is really not necessary. pyodbc with fast_executemany=True and Microsoft's ODBC Driver 17 for SQL Server is about as fast as you're going to get short of using BULK INSERT or bcp as described in this answer.
Does it have some vulnerabilities?
Since you are building a TVC for a parameterized query you are protected from SQL Injection vulnerabilities, but there are still a couple of implementation considerations:
A TVC can insert a maximum of 1000 rows at a time.
pyodbc executes SQL statements by calling a system stored procedure, and stored procedures in SQL Server can accept a maximum of 2100 parameters, so the number of rows that your TVC can insert is also limited to (number_of_rows * number_of_columns < 2100).
In other words, your TVC approach will be limited to a "chunk size" of 1000 rows or less. The actual calculation is described in this answer.
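A sketch of chunking under those limits (tablename and the column names follow the question; the helper itself is illustrative):
def insert_in_chunks(cursor, params):
    # Respect both limits: at most 1000 rows per TVC and
    # fewer than 2100 parameters per statement.
    row_width = len(params[0])
    max_rows = min(1000, 2099 // row_width)
    row_placeholder = '(' + ','.join('?' * row_width) + ')'
    for i in range(0, len(params), max_rows):
        chunk = params[i:i + max_rows]
        flat = [value for row in chunk for value in row]
        cursor.execute(
            'insert into tablename (column_name1, column_name2) values '
            + ', '.join(row_placeholder for _ in chunk),
            flat)

insert_in_chunks(cursor, [(1, 2), (3, 4), (5, 6)])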

SQLite query with Python

I'm trying to use Python to do an SQLite query. I can get the desired result in SQLite using
SELECT Four FROM keys2 WHERE One = B
How do I do this in Python?
I am trying
c = conn.cursor()
print "Opened database successfully";
x = c.execute("SELECT Four FROM keys2 WHERE One = B")
print x
conn.close()
But I get the message "sqlite3.OperationalError: no such column: B". I'm trying to select the entry in column Four where the One column is B. I have tried several methods to no avail.
Supply the value using parametrized SQL:
x = c.execute("SELECT Four FROM keys2 WHERE One = ?", ("B",))
The problem with the SQL you posted is that the value B must be quoted. So
x = c.execute("SELECT Four FROM keys2 WHERE One = 'B'")
would also have worked, but it is better to always use parametrized SQL (to guard against SQL injection) and let sqlite3 do the quoting for you.
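Putting it together, a minimal runnable sketch (test.db is a hypothetical database file):
import sqlite3

conn = sqlite3.connect('test.db')
c = conn.cursor()
# The parameters must be a sequence, hence the one-element tuple.
c.execute("SELECT Four FROM keys2 WHERE One = ?", ("B",))
for row in c.fetchall():
    print(row[0])
conn.close()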

Add list to sqlite database

How would I add something in sqlite to an already existing table this is what I have so far
>>> rid
'26539249'
>>> for t in [(rid, ("billy", "jim"))]:
...     c.execute("insert into whois values (?,?)", t)
How would I add onto "jim" and create a list? Or is there some way to add onto it so it can have multiple values?
I'll take a guess here, but I suspect I'm wrong.
You can't insert ("billy", "jim") as a column in the database. This is intentional. The whole point of RDBMSs like sqlite is that each field holds exactly one value, not a list of values. You can't search for 'jim' in the middle of a column shared with other people, you can't join tables based on 'jim', etc.
If you really, really want to do this, you have to pick some way to convert the multiple values into a single string, and to convert them back on reading. You can use json.dumps/json.loads, repr/ast.literal_eval, or anything else that seems appropriate. But you have to write the extra code yourself. And you won't be getting any real benefit out of the database if you do so; you'd be better off just using shelve.
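If you do go that route, a sketch of the JSON round-trip (using the whois table from the question, with Names as a plain text column):
import json
import sqlite3

conn = sqlite3.connect(':memory:')
c = conn.cursor()
c.execute("CREATE TABLE whois (Rid TEXT, Names TEXT)")

# Serialize the list to a JSON string on the way in...
c.execute("INSERT INTO whois VALUES (?, ?)",
          ('26539249', json.dumps(["billy", "jim"])))

# ...and parse it back into a list on the way out.
row = c.execute("SELECT Names FROM whois WHERE Rid = ?", ('26539249',)).fetchone()
print(json.loads(row[0]))  # ['billy', 'jim']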
So, I'm guessing you don't want to do this, and you want to know what you want to do instead.
Assuming your schema looks something like this:
CREATE TABLE whois (Rid, Names);
What you want is:
CREATE TABLE whois (Rid);
CREATE TABLE whois_names (Rid, Name, FOREIGN KEY(Rid) REFERENCES whois(Rid));
And then, to do the insert:
tt = [(rid, ("billy", "jim"))]
for rid, names in tt:
    c.execute('INSERT INTO whois VALUES (?)', (rid,))
    for name in names:
        c.execute('INSERT INTO whois_names VALUES (?, ?)', (rid, name))
Or (probably faster, but not as interleaved):
c.executemany('INSERT INTO whois VALUES (?)', ((rid,) for rid, names in tt))
c.executemany('INSERT INTO whois_names VALUES (?, ?)',
              ((rid, name) for rid, names in tt for name in names))
Not tested but should do the trick
conn = sqlite3.connect(db)
cur = conn.cursor()
cur.execute('''CREATE TABLE if not exists Data
               (id integer primary key autoincrement, List)''')
# store the list as its string representation
cur.execute("INSERT INTO Data (id, List) values (?,?)",
            (lid, str(My_list)))

python db insert

I am facing a performance problem in my code. I am making a db connection, running a select query, and then inserting into a table. Around 500 rows are populated by one select query. Before inserting, I run the select query around 8-9 times first, and then insert them all using cursor.executemany. But it is taking 2 minutes to insert, which is not good. Any ideas?
def insert1(id, state, cursor):
    cursor.execute("select * from qwert where asd_id = %s", [id])
    if somecondition:
        adding.append(rd[i])
    cursor.executemany(indata, adding)
where rd[i] is an array used to build the records and indata is an insert statement
# prog starts here
cursor.execute("select * from assd")
for rows in cursor.fetchall():
    if rows[1] == 'aq':
        insert1(rows[1], rows[2], cursor)
    if rows[1] == 'qw':
        insert2(rows[1], rows[2], cursor)
I don't really understand why you're doing this.
It seems that you want to insert a subset of rows from "assd" into one table, and another subset into another table?
Why not just do it with two SQL statements, structured like this:
insert into tab1 select * from assd where asd_id = 42 and cond1 = 'set';
insert into tab2 select * from assd where asd_id = 42 and cond2 = 'set';
That'd dramatically reduce your number of roundtrips to the database and your client-server traffic. It'd also be an order of magnitude faster.
Of course, I'd also strongly recommend that you specify your column names in both the insert and select parts of the code.
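A hedged sketch of that set-based approach (tab1/tab2 and the column list are illustrative; the %s paramstyle follows the question's code):
def insert_all(cursor, asd_id):
    # One round trip per target table instead of one execute per row.
    cursor.execute(
        "insert into tab1 (col1, col2) "
        "select col1, col2 from assd where asd_id = %s and cond1 = 'set'",
        [asd_id])
    cursor.execute(
        "insert into tab2 (col1, col2) "
        "select col1, col2 from assd where asd_id = %s and cond2 = 'set'",
        [asd_id])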
