import sqlite3
conn = sqlite3.connect('sample.db')
cursor = conn.cursor()
data = cursor.execute('''SELECT * FROM Table''')
for i in data:
    title = i[0]
    status = i[1]
    cursor.execute('''UPDATE Table SET status=? WHERE title=?''', (status, title))
cursor.close()
conn.commit()
I am trying to update rows over multiple iterations. However, the script breaks out of the loop as soon as the database makes the first update. How can I fix this? Thanks!
Use data = data.fetchall() before your loop. Otherwise you wind up recycling the cursor inside of your loop (resetting its result set) while you're trying to loop over that result set.
Using .fetchall() returns a list of results so that you have them stored locally before you re-use the cursor.
Alternatively, create a separate cursor to use for your update statements if you don't want to cache the results of the first query locally.
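For illustration, here is a minimal sketch of both fixes applied to the snippet above (same placeholder table and column names as in the question):
import sqlite3
conn = sqlite3.connect('sample.db')
cursor = conn.cursor()
# Fix 1: materialize the result set locally before re-using the cursor
data = cursor.execute('''SELECT * FROM Table''').fetchall()
for i in data:
    title, status = i[0], i[1]
    cursor.execute('''UPDATE Table SET status=? WHERE title=?''', (status, title))
conn.commit()
cursor.close()

# Fix 2 (alternative): keep iterating lazily, but run the updates on a second cursor
# upd = conn.cursor()
# for i in cursor.execute('''SELECT * FROM Table'''):
#     upd.execute('''UPDATE Table SET status=? WHERE title=?''', (i[1], i[0]))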
I'm using a server-side cursor in PostgreSQL with psycopg2, based on this well-explained answer.
with conn.cursor(name='name_of_cursor') as cursor:
    query = "SELECT * FROM tbl FOR UPDATE"
    cursor.execute(query)
    for row in cursor:
        # process row
In processing each row, I'd like to update a few fields in the row using PostgreSQL's UPDATE tbl SET ... WHERE CURRENT OF name_of_cursor (docs), but it seems that, when the for loop enters and row is set, the server-side cursor is positioned at a different record, so while I can run the command, the wrong record is updated.
How can I make sure the result iterator is in the same position as the cursor? (also preferably in a way that won't make the loop slower than updating using an ID)
The reason a different record was being updated is that internally psycopg2 does a FETCH FORWARD 1000 (or whatever the default chunk size is), positioning the cursor at the end of the block. You can override this by fetching one record at a time:
updcursor = conn.cursor()
with conn.cursor(name='name_of_cursor') as cursor:
    cursor.itersize = 1  # to make the server-side cursor stay in the same position as the iterator
    cursor.execute('SELECT * FROM tbl FOR UPDATE')
    for row in cursor:
        # process row...
        updcursor.execute('UPDATE tbl SET fld1 = %s WHERE CURRENT OF name_of_cursor', [val])
The snippet above will update the correct record. Note that you cannot use the same cursor for selecting and updating, they must be different cursors.
Performance
Reducing the FETCH size to 1 hurts retrieval performance considerably. I definitely wouldn't recommend this technique if you're iterating over a large dataset (which is probably why you're using server-side cursors in the first place) from a different host than the PostgreSQL server.
I ended up using a combination of exporting records to CSV, then importing them later using COPY FROM (with the function copy_expert).
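For reference, a rough sketch of that CSV round trip using psycopg2's copy_expert(); the table, query, and file names here are placeholders:
import psycopg2

conn = psycopg2.connect('dbname=mydb')  # placeholder connection parameters
with conn.cursor() as cur:
    # Export the rows to a local CSV file ...
    with open('rows.csv', 'w') as f:
        cur.copy_expert("COPY (SELECT * FROM tbl) TO STDOUT WITH (FORMAT csv)", f)
    # ... process the file offline, then bulk-load the results back with COPY FROM
    with open('rows.csv') as f:
        cur.copy_expert("COPY tbl FROM STDIN WITH (FORMAT csv)", f)
conn.commit()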
I'm new to SQL and psycopg2. I'm playing around a bit, trying to find out how to display the results of a query. I have a small script where I make a connection to the database and create a cursor to run the query.
from psycopg2 import connect
conn = connect(host="localhost", user="postgres", dbname="portfolio",
               password="empty")
cur = conn.cursor()
cur.execute("SELECT * FROM portfolio")
for record in cur:
    print("ISIN: {}, Naam: {}".format(record[0], record[1]))
print(cur.fetchmany(3))
cur.close()
conn.close()
If I run this code, the first print is fine, but the second print-statement returns [].
If I run only one of the two print-statements, I get a result every time.
Can someone explain why?
The cursor loops over the results and returns one at a time. When it has returned all of them, it can't return any more. This is precisely like when you loop over the lines in a file (there are no more lines once you reach the end of the file) or even looping over a list (there are no more entries in the list after the last one).
If you want to manipulate the results in Python, you should probably read them into a list, which you can then traverse as many times as you like, or search, sort, etc, or access completely randomly.
cur.execute("SELECT * FROM portfolio")
result = cur.fetchall()
for record in result:
    print("ISIN: {}, Naam: {}".format(record[0], record[1]))
print(result[0:3])
I'm trying to generate & execute SQL statements via pyodbc. I expect multiple SQL statements, all of which start with the same SELECT & FROM but have a different value in the WHERE. The value in my WHERE clause is derived from looping through a table: for each distinct value the SQL script finds in the table, I need Python to generate another SQL statement with that value as the WHERE clause.
I'm almost there with this; I'm just struggling to get pyodbc to put my query strings into a format that SQL likes. My code so far:
import pyodbc
cn = pyodbc.connect(connection info)
cursor = cn.cursor()
result = cursor.execute('SELECT distinct searchterm_name FROM table1')
for row in result:
    sql = str("SELECT * from table2 WHERE table1.searchterm_name = {c}".format(c=row)),
    #print sql
This code generates an output like this, where "name here" is based on the value found in table1.
('SELECT * from ifb_person WHERE searchterm_name = (u\'name here\', )',)
I just need to remove all the crap surrounding the query & where clause so it looks like this. Then I can pass it into another cursor.execute()
SELECT * from ifb_person WHERE searchterm_name = 'name here'
EDIT
for row in result:
    cursor.execute("insert into test (searchterm_name) SELECT searchterm_name FROM ifb_person WHERE searchterm_name = ?",
                   (row[0],))
This query fails with the error pyodbc.ProgrammingError: No results. Previous SQL was not a query.
Basically what I am trying to do is get Python to generate a fresh SQL statement for every result it finds in table1. The second query runs searches against the table ifb_person and inserts the results into a table "test". I want to run a separate SQL statement for every result found in table1.
pyodbc allows us to iterate over a Cursor object to return the rows, during which time the Cursor object is still "in use", so we cannot use the same Cursor object to perform other operations. For example, this code will fail:
crsr = cnxn.cursor()
result = crsr.execute("SELECT ...")  # result is just a reference to the crsr object
for row in result:
    # we are actually iterating over the crsr object
    crsr.execute("INSERT ...")  # this clobbers the cursor's previous result set ...
    # ... so the next iteration of the for loop fails with "Previous SQL was not a query."
We can work around that by using fetchall() to retrieve all the rows into result ...
result = crsr.execute("SELECT ...").fetchall()
# result is now a list of pyodbc.Row objects and the crsr object is no longer "in use"
... or use a different Cursor object in the loop
crsr_select = cnxn.cursor()
crsr_insert = cnxn.cursor()
crsr_select.execute("SELECT ...")
for row in crsr_select:
    crsr_insert.execute("INSERT ...")
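Applied to the tables from the question, the two-cursor version might look like this (the connection string is a placeholder):
import pyodbc

cnxn = pyodbc.connect("DSN=mydsn")  # placeholder connection info
crsr_select = cnxn.cursor()
crsr_insert = cnxn.cursor()

crsr_select.execute("SELECT DISTINCT searchterm_name FROM table1")
for row in crsr_select:
    # let pyodbc bind the value to the ? marker; no manual string
    # formatting or quoting is needed
    crsr_insert.execute(
        "INSERT INTO test (searchterm_name) "
        "SELECT searchterm_name FROM ifb_person WHERE searchterm_name = ?",
        row[0])
cnxn.commit()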
I'm using Python to connect to MySQL and write data to a table. I have a while loop, and I update one particular row of the table with some values in each loop. Then, after the loop ends, I commit the execution. Now the problem is: I should have 5000 rows of data updated since there are 5000 loops. However, I only see about 1K rows of data updated. Following is the pseudo-code:
import pymysql
import pymysql.cursors
import requests

connection = pymysql.connect(
    user='root',
    host='localhost',
    database='mysql')
mycursor = connection.cursor()
mycursor.execute('use test_db')
n = 0
while n < 5000:
    id = IDlist[n]
    url = 'www.example.com/' + str(id)
    values = requests.get(url)  ## some parse omitted
    input = (values[1], values[2], id)
    sql = """UPDATE mytable
             SET COL1=%s, COL2=%s
             WHERE ID=%s"""
    mycursor.execute(sql, input)
    n += 1
connection.commit()  ## here all loops done
The following is the structure of the table:
[ID INT(5) NOT NULL, COL1 VARCHAR, COL2 MEDIUMBLOB]
where column ID is the PRIMARY KEY
So basically what I do is: I read a unique id each time, go to the corresponding webpage and read some data values then write the values in the row corresponding to that unique id.
My concern is: could it be that, since the data to be saved in each loop is relatively large (~500KB), some data was simply lost during the while loop, before connection.commit() was made?
If this were the case, should I enable autocommit beforehand? However, it seems that committing right after each insert/update in the loop would make the whole task relatively slow.
Consider separating the web requests from the MySQL updates. First, iteratively append each row's values to one large input list; then iterate through input to run the updates. Notice I move the opening of the database connection and cursor toward the end, to minimize long connection times. Also, you want to commit after each execute, so the commit should be inside the loop.
import pymysql
import requests

# URL DATA
input = []  ## LIST OF LISTS
for id in IDlist:
    url = 'www.example.com/' + str(id)
    values = requests.get(url)  ## some parse omitted
    input.append([values[1], values[2], id])

# DATABASE UPDATE
connection = pymysql.connect(
    host='localhost', db='test_db',
    user='root', passwd='***')
mycursor = connection.cursor()
for items in input:
    sql = """UPDATE mytable
             SET COL1=%s, COL2=%s
             WHERE ID=%s"""
    mycursor.execute(sql, tuple(items))
    connection.commit()
mycursor.close()    ## here all loops done
connection.close()  ## close db connection
I have a problem with PostgreSQL in Python. I am using psycopg2.
Here's what my postgres DB looks like:
Database: qcdata
---information_schema
---pg_catalog
---prod
   ---activation
   --- *** many other tables ***
---public
I want to pull out the information in schema: prod - table: activation
Here's my code:
import psycopg2
conn = psycopg2.connect(host='10.0.80.180', port='5432',
                        dbname='qcdata',
                        user='username', password='pwd')
cur = conn.cursor()
print cur.execute("SELECT * FROM prod.activation")
But it returns None... I am sure there's data in it. How could that be?
Thanks guys.
According to the docs (http://initd.org/psycopg/docs/cursor.html), cur.execute() always returns None. You have to follow up with one of the fetch methods, like:
print cur.fetchall()
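In other words, the corrected tail of the script:
cur.execute("SELECT * FROM prod.activation")
rows = cur.fetchall()  # execute() itself returns None; the rows come from a fetch call
print rows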
I will use cursor.fetchone() in cases where I'm getting back a single value, e.g. when using COUNT:
SELECT count(*) FROM prod.activation
If I want to process a set of rows, I like to leverage the cursor's ability to iterate over the result set. That allows more control over the results, how they're processed, and how they're returned.
The cursor can be iterated over just like a Python list:
for row in cur:
    dostuff()
If you use named cursors (simply pass a name when constructing the cursor), psycopg2 will chunk the SELECT for you, fetching 2000 rows at a time by default. This happens automatically in the background while iterating.
You can also use the with syntax to automatically close the cursor when you're done with it. (Useful for smaller-scoped uses.)
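Putting those pieces together, a small sketch (the cursor name is a placeholder, reusing the table from the question):
with conn.cursor(name='activation_cur') as cur:
    cur.itersize = 2000  # fetch from the server in chunks of 2000 rows (the default)
    cur.execute("SELECT * FROM prod.activation")
    for row in cur:
        dostuff()  # process each row as it streams in
# the named cursor is closed automatically when the with block exits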