I'm using pyodbc with an application that requires me to insert more than 1,000 rows, which I currently do individually with pyodbc. This tends to take more than 30 minutes to finish. I was wondering if there are any faster methods that could do this in under a minute. I know you can use multiple values in an insert command, but according to this (Multiple INSERT statements vs. single INSERT with multiple VALUES) it could possibly be even slower.
The code currently looks like this.
def Insert_X(X_info):
    columns = ', '.join(X_info.keys())
    placeholders = ', '.join('?' * len(X_info.keys()))
    columns = columns.replace("'", "")
    values = list(X_info.values())
    query_string = f"INSERT INTO X ({columns}) VALUES ({placeholders});"
    with conn.cursor() as cursor:
        cursor.execute(query_string, values)
With Insert_X being called more than 1,000 times.
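(For what it's worth, the usual way to get this well under a minute is to batch the rows and send them with a single executemany call, with pyodbc's fast_executemany turned on. A minimal sketch, assuming pyodbc 4.0.19+ and an ODBC driver that supports parameter arrays, e.g. SQL Server's; X_rows is a hypothetical list of the per-row dicts, all sharing the same keys:)

def insert_many_x(conn, X_rows):
    # X_rows: hypothetical list of dicts, all with the same keys
    columns = ', '.join(X_rows[0].keys())
    placeholders = ', '.join('?' * len(X_rows[0]))
    query_string = f"INSERT INTO X ({columns}) VALUES ({placeholders});"
    params = [list(row.values()) for row in X_rows]
    with conn.cursor() as cursor:
        cursor.fast_executemany = True   # one batched round trip instead of one per row
        cursor.executemany(query_string, params)
    conn.commit()   # only needed if autocommit is off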
Related
In my Python script I have an insert query, but when I want to insert multiple columns in the same query it gives an error; for a single column it works perfectly.
Below is my code.
My database is on AWS S3.
A = []
for score_row in score:
    A.append(score_row[2])
print("A=", A)

B = []
for day_row in score:
    B.append(day_row[1])
print("B=", B)

for x, y in zip(A, B):
    sql = """INSERT INTO calculated_corr_coeff(date,Day) VALUES (?,?)"""
    cursor.executemany(sql, (x,), (y,))
When I replace the above query with the following SQL insert statement, it works perfectly.
sql = """INSERT INTO calculated_corr_coeff(date,Day) VALUES (?)"""
cursor.executemany(sql, (x,))
Fix your code like this:
sql = """INSERT INTO calculated_corr_coeff(date,Day) VALUES (?,?)"""
cursor.execute(sql, (x,y,)) #<-- here
Because it is just one insert (not several inserts).
Explanation
I guess you are confusing the number of inserts (rows) with the number of parameters (the fields to insert in each row). When you want to insert several rows, use executemany; for just one row, use execute. The second parameter of execute is the "list" (or sequence) of values to be inserted in that row.
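To make the difference concrete, a minimal sketch (same table and placeholders as above; x1 ... y3 just stand for your values):

# one row -> execute, with one sequence of parameters
cursor.execute("INSERT INTO calculated_corr_coeff(date,Day) VALUES (?,?)", (x, y))

# many rows -> executemany, with a sequence of such sequences
cursor.executemany("INSERT INTO calculated_corr_coeff(date,Day) VALUES (?,?)",
                   [(x1, y1), (x2, y2), (x3, y3)])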
Alternative
You can try to change the syntax and insert all the data in one shot by passing executemany a sequence of parameter tuples:
values = list(zip(A, B))  # instead of the "for" loop
sql = """INSERT INTO calculated_corr_coeff(date,Day) VALUES (?,?)"""
cursor.executemany(sql, values)
Notice that this approach doesn't use a for statement in Python: all the data is handed to the database driver in one call, which is more efficient. (In Python 3, zip returns an iterator, which is why it is wrapped in list above.)
I'm using Python to query a SQL database. I'm fairly new to databases. I've tried looking up this question, but I can't find a similar enough question to get the right answer.
I have a table with multiple columns/rows. I want to find the MAX of a single column, I want ALL columns returned (the entire ROW), and I want only one instance of the MAX. Right now I'm getting ten ROWS returned, because the MAX is repeated ten times. I only want one ROW returned.
The query strings I've tried so far:
sql = 'select max(f) from cbar'
# this returns one ROW, but only a single COLUMN (a single value)
sql = 'select * from cbar where f = (select max(f) from cbar)'
# this returns all COLUMNS, but it also returns multiple ROWS
I've tried a bunch more, but they returned nothing. They weren't right somehow. That's the problem: I'm too new to find the middle ground between my two working query statements.
In SQLite 3.7.11 or later, you can just retrieve all columns together with the maximum value:
SELECT *, max(f) FROM cbar;
But the SQLite bundled with your Python might be too old. In the general case, you can sort the table by that column and then just read the first row:
SELECT * FROM cbar ORDER BY f DESC LIMIT 1;
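From Python, that second query might look like this (a sketch, assuming the standard sqlite3 module; the database file name is made up):

import sqlite3

conn = sqlite3.connect('mydata.db')   # hypothetical database file
cur = conn.cursor()
cur.execute("SELECT * FROM cbar ORDER BY f DESC LIMIT 1")
row = cur.fetchone()   # one tuple with every column of the max-f row, or None if the table is empty
print(row)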
I am new to SQLite and I think this question should have been answered before, but I haven't been able to find an answer.
I have a list of around 50 elements that I need to write to an SQLite table with 50 columns.
I went over the documentation at https://docs.python.org/2/library/sqlite3.html, but in the examples the values are specified by ? placeholders (so for writing 3 values, 3 ?s are specified).
sample code:
import sqlite3

row_to_write = range(50)
conn = sqlite3.connect(r'C:\sample_database\sample_database')
c = conn.cursor()
I tried these:
Approach 1:
c.execute("INSERT INTO PMU VALUES (?)", row_to_write)
ERROR: OperationalError: table PMU has 50 columns but 1 values were supplied
Approach 2: tried writing a generator for iterating over the list
def write_row_value_generator(row_to_write):
    for val in row_to_write:
        yield (val,)

c.executemany("INSERT INTO PMU VALUES (?)", write_row_value_generator(row_to_write))
ERROR: OperationalError: table PMU has 50 columns but 1 values were supplied
What is the correct way of doing this?
Assuming that your row_to_write has exactly the same number of items as PMU has columns, you can create the string of ? marks easily using str.join: ','.join(['?']*len(row_to_write))
import sqlite3
conn = sqlite3.connect(':memory:')
c = conn.cursor()
c.execute("create table PMU (%s)" % ','.join("col%d"%i for i in range(50)))
row_to_write = list(range(100,150,1))
row_value_markers = ','.join(['?']*len(row_to_write))
c.execute("INSERT INTO PMU VALUES (%s)"%row_value_markers, row_to_write)
conn.commit()
You need to specify the names of the columns. SQLite will not guess those for you.
columns = ['A', 'B', 'C', ...]  # the actual column names of PMU
n = len(row_to_write)
sql = "INSERT INTO PMU ({}) VALUES ({})".format(
    ', '.join(columns[:n]), ', '.join(['?'] * n))
c.execute(sql, row_to_write)
Note also that if your rows have a variable number of columns, then you might want to rethink your database schema. Usually each row should have a fixed number of columns, and the variability expresses itself in the number of rows inserted, not the number of columns used.
For example, instead of having 50 columns, perhaps you need just one extra column, whose value is one of 50 names (what used to be a column name). Each value in row_to_write would have its own row, and for each row you would have two columns: the value and the name of the column.
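A sketch of that normalized layout, with made-up table and column names:

import sqlite3

conn = sqlite3.connect(':memory:')
c = conn.cursor()

# one row per (name, value) pair instead of 50 columns
c.execute("CREATE TABLE PMU_values (col_name TEXT, value REAL)")

row_to_write = list(range(100, 150))
names = ['col%d' % i for i in range(50)]   # what used to be the 50 column names
c.executemany("INSERT INTO PMU_values (col_name, value) VALUES (?, ?)",
              zip(names, row_to_write))
conn.commit()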
I have a table that looks like this:
part   min   max   unitPrice
A      1     9     10
A      10    99    5
B      1     9     11
B      10    99    6
...
I also have a production table, and I need to insert the previous data into this production one.
When I do the select statement from one table and fetch the records, I have a hard time inserting them into another table.
Say
cursor_table1.execute('select part, min, max, unitPrice, now() from table1')
for row in cursor_table1.fetchall():
    part, min, max, unitPrice, now = row
    print part, min, max, unitPrice, now
The result turns out to be
'416570S39677N1043', 1L, 24L, 48.5, datetime.datetime(2018, 10, 8, 16, 33, 42)
I know Python smartly figured out the type of every column, but I actually just want the raw content, so that I can do something like this:
cursor_table1.execute('select part, min, max, unitPrice, now() from table1')
for row in cursor_table1.fetchall():
    cursor_table2.execute('insert into table2 values ' + str(tuple(row)))
The question is how I can simply do a select statement from one table and add the results to another.
Let me know if I did not describe my question in a clear way and I can add extra info if you want.
It might be a bit late to answer this question, but I also had the same problem and landed in this page. Now, I happen to have found a different answer and figured that it might be helpful to share it with others who have the same problem.
I have two MySQL servers, one on a Raspberry Pi and another on a VPS, and I had to sync data between these two by reading data on the RPi and inserting it into the VPS. I had done it the usual way by writing a loop, fetching the records one by one and inserting them, and it was really slow; it took about 2 minutes for 2,000 datasets.
Now I solved this problem by using the executemany function. As for the data, I obtained all the tuples returned by the select using the fetchall function.
rows = x.fetchall()
y.executemany("insert into table2 (f1, f2, f3) values (%s,%s,%s);", rows)
And it was super fast 😀; it took about 2 seconds for 5,000 records.
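If the source table is bigger than you want to hold in memory at once, the same idea works in chunks with fetchmany (a sketch; the batch size of 1000 is arbitrary, and target_conn stands for the connection that owns cursor y):

x.execute("select f1, f2, f3 from table1")
while True:
    rows = x.fetchmany(1000)   # pull the next batch from the source server
    if not rows:
        break
    y.executemany("insert into table2 (f1, f2, f3) values (%s,%s,%s);", rows)
target_conn.commit()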
If you wanted all of the data to pass through Python, you could do the following:
cursor_table1.execute('SELECT part, min, max, unitPrice, NOW() FROM table1')
for row in cursor_table1.fetchall():
    part, min, max, unitPrice, now = row
    cursor_table2.execute(
        "INSERT INTO table2 VALUES (%s, %s, %s, %s, %s)",
        (part, min, max, unitPrice, now))
If you don't need to make any calculation with the data selected from table1 and you are only inserting the data into the other table, then you can rely on MySQL and run an insert ... select statement. So the query code would be like this:
cursor_table1.execute('insert into table2 (part, min, max, unitPrice, date) select part, min, max, unitPrice, now() from table1')
EDIT:
After learning that the tables are on different servers, I would suggest using the executemany method to insert the data, as it runs much faster.
First build a list of tuples containing all the data to be inserted, then run the executemany query.
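Roughly like this (a sketch; cursor_table1 and cursor_table2 belong to the two different connections, and conn2 stands for the connection that owns cursor_table2):

cursor_table1.execute('select part, min, max, unitPrice, now() from table1')
rows = cursor_table1.fetchall()   # list of tuples from the source server

cursor_table2.executemany(
    'insert into table2 (part, min, max, unitPrice, date) values (%s, %s, %s, %s, %s)',
    rows)
conn2.commit()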
I expect that several answers here will give you trouble if you have more data than you do memory.
Maybe this doesn't count as solving the problem in Python, but I do this:
from sh import bash

# ... omitted argparse and table listing ...
for table_name in tables_to_sync:
    dump = 'mysqldump -h{host} -u{user} -p{password} {db} {table} '.format(
        host=args.remote_host,
        user=args.remote_user,
        password=args.remote_password,
        db=args.remote_database,
        table=table_name,
    )
    flags = '--no-create-info --lock-tables=false --set-gtid-purged=OFF '
    condition = '--where=\'id > {begin} and id <= {end}\' > {table}.sql '.format(
        begin=begin,
        end=end,
        table=table_name
    )
    bash(['-c', dump + flags + condition])
    load = 'mysql -u{user} -p{password} {db} < {table}.sql'.format(
        user=args.local_user,
        password=args.local_password,
        db=args.local_database,
        table=table_name
    )
    bash(['-c', load])
If you're worried about performance, you might consider cutting out the middleman entirely and using the federated storage engine, but that too would be a non-Python approach.
I am facing a performance problem in my code. I am making a db connection, running a select query, and then inserting into a table. Around 500 rows are populated by one select query. Before inserting, I run the select query around 8-9 times first and then insert everything using cursor.executemany. But it is taking 2 minutes to insert, which is not good. Any ideas?
def insert1(id, state, cursor):
    cursor.execute("select * from qwert where asd_id =%s", [id])
    if sometcondition:
        adding.append(rd[i])
    cursor.executemany(indata, adding)
where rd[i] is an array used to build the records and indata is an insert statement.
# prog starts here
cursor.execute("select * from assd")
for rows in cursor.fetchall():
    if rows[1] == 'aq':
        insert1(rows[1], rows[2], cursor)
    if rows[1] == 'qw':
        insert2(rows[1], rows[2], cursor)
I don't really understand why you're doing this.
It seems that you want to insert a subset of rows from "assd" into one table, and another subset into another table?
Why not just do it with two SQL statements, structured like this:
insert into tab1 select * from assd where asd_id = 42 and cond1 = 'set';
insert into tab2 select * from assd where asd_id = 42 and cond2 = 'set';
That'd dramatically reduce your number of roundtrips to the database and your client-server traffic. It'd also be an order of magnitude faster.
Of course, I'd also strongly recommend that you specify your column names in both the insert and select parts of the code.
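Run from Python, that could be as little as two execute calls (a sketch; the column names asd_id, state, and value are just illustrative, and conn is the connection the cursor came from):

cursor.execute(
    "insert into tab1 (asd_id, state, value) "
    "select asd_id, state, value from assd where asd_id = %s and cond1 = 'set'", [42])
cursor.execute(
    "insert into tab2 (asd_id, state, value) "
    "select asd_id, state, value from assd where asd_id = %s and cond2 = 'set'", [42])
conn.commit()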