Why can I use the results of a query only once? - python

I'm new to SQL and psycopg2. I'm playing around a bit, trying to find out how to display the results of a query. I have a small script where I make a connection to the database and create a cursor to run the query.
from psycopg2 import connect
conn = connect(host="localhost", user="postgres", dbname="portfolio",
               password="empty")
cur = conn.cursor()
cur.execute("SELECT * FROM portfolio")
for record in cur:
    print("ISIN: {}, Naam: {}".format(record[0], record[1]))
print(cur.fetchmany(3))
cur.close()
conn.close()
If I run this code, the first print is fine, but the second print-statement returns [].
If I run only one of the two print-statements, I get a result every time.
Can someone explain why?

The cursor loops over the results and returns one at a time. When it has returned all of them, it can't return any more. This is precisely like when you loop over the lines in a file (there are no more lines once you reach the end of the file) or even looping over a list (there are no more entries in the list after the last one).
If you want to manipulate the results in Python, you should probably read them into a list, which you can then traverse as many times as you like, or search, sort, etc, or access completely randomly.
cur.execute("SELECT * FROM portfolio")
result = cur.fetchall()
for record in result:
    print("ISIN: {}, Naam: {}".format(record[0], record[1]))
print(result[0:3])
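To see the exhaustion behaviour without a PostgreSQL server, here is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in for psycopg2 (an assumption; both follow the DB-API cursor interface). The table contents are made up for illustration.

```python
import sqlite3

# In-memory SQLite database standing in for the PostgreSQL connection.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE portfolio (isin TEXT, naam TEXT)")
cur.executemany(
    "INSERT INTO portfolio VALUES (?, ?)",
    [("ISIN-1", "Alpha"), ("ISIN-2", "Beta"), ("ISIN-3", "Gamma")],  # made-up rows
)

cur.execute("SELECT * FROM portfolio")
first_pass = [record for record in cur]  # iterating consumes every row
leftover = cur.fetchmany(3)              # nothing is left to fetch

cur.execute("SELECT * FROM portfolio")   # running the query again refills the cursor
second_pass = cur.fetchmany(3)

conn.close()
```

Here `leftover` is empty because the loop already consumed the result set, while `second_pass` has rows again because the query was re-executed.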

Related

Store output of mysql select statement as python list

I have a very large table in mysql that I'm accessing through a python script, and I'd like the output of my select query to be stored as a list. Here's what I have:
import MySQLdb
db = MySQLdb.connect(host='', user='', passwd='', db='')
cursor = db.cursor()
sqlselect = "SELECT desig FROM table WHERE num=0;"
desigs = cursor.execute(sqlselect)
print desigs
But this just gives me the number of rows in set which is nearly 250,000. Instead, I'd like it to print a list of each 'desig'. How can I do this?
cursor.execute() will return the number of rows modified or retrieved.
Once you've executed the query, run cursor.fetchall(), which will return you an array containing each row in your query results. That is, you get an array of arrays.
Edit: list! list of lists!! argh
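A minimal sketch of the fetch step, using the built-in sqlite3 module in place of MySQLdb so that it runs without a server (an assumption). The table name `objects` and its rows are hypothetical. Each fetched row is a tuple, so a one-column query still needs flattening to get a plain list of values.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cursor = conn.cursor()
cursor.execute("CREATE TABLE objects (desig TEXT, num INTEGER)")  # hypothetical table
cursor.executemany(
    "INSERT INTO objects VALUES (?, ?)",
    [("A1", 0), ("B2", 1), ("C3", 0)],  # made-up rows
)

cursor.execute("SELECT desig FROM objects WHERE num = 0 ORDER BY desig")
rows = cursor.fetchall()            # one tuple per row
desigs = [row[0] for row in rows]   # flatten to a plain list of values

conn.close()
```

With a quarter of a million rows, fetchall() holds everything in memory at once; iterating over the cursor or calling fetchmany() in a loop is an alternative if that becomes a problem.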

for loop not exiting in a function

I'm having some trouble converting some working code to a function. The first code sample runs without issue, but the function below just hangs. A little debugging using print showed that the function moves all the way through the cursor to the final record, appending it to the list, but then the program hangs and won't exit.
The cursor is part of the cx_Oracle module. The intent is to query an Oracle database, then create a list. I have tested the original code on several queries and had no problems (max return is about 15000 rows). At this point I can make the code work using the original format, but I'd like to know what I might be doing wrong in the function.
Working code:
cursor = db.cursor()
cursor.execute(mysqlexp)
for row in cursor:
    myList.append(row)
cursor.close()
Function (not working):
def sqlToList(listname, sqlexp):
    cursor = db.cursor()
    cursor.execute(sqlexp)
    for row in cursor:
        listname.append(row)
        #print statement here indicates that final record appends
        #but then the program stops responding
    #print statement here never appears (indicating for loop hasn't exited?)
    cursor.close()
sqlToList(myList, mysqlexp)

Python code running too slow (SQLITE)

I have located a piece of code that runs quite slowly (in my opinion) and would like to know what you guys think. The code in question is as follows and is supposed to:
Query a database and get 2 fields, a field and its value
Populate the object dictionary with their values
The code is:
query = "SELECT Field, Value FROM metrics " \
        "WHERE Status NOT LIKE '%ERROR%' AND Symbol LIKE '{0}'".format(self.symbol)
query = self.db.run(query, True)
if query is not None:
    for each in query:
        self.metrics[each[0].lower()] = each[1]
The query is run using a db class I created that is very simple:
def run(self, query, onerrorkeeprunning=False):
    # Run query provided and return result
    try:
        con = lite.connect(self.db)
        cur = con.cursor()
        cur.execute(query)
        con.commit()
        runsql = cur.fetchall()
        data = []
        for rows in runsql:
            line = []
            for element in rows:
                line.append(element)
            data.append(line)
        return data
    except lite.Error, e:
        if onerrorkeeprunning is True:
            if con:
                con.close()
            return
        else:
            print 'Error %s:' % e.args[0]
            sys.exit(1)
    finally:
        if con:
            con.close()
I know there are tons of ways of writing this code, and I was trying to keep things simple, but for 24 fields this takes 0.03s, so if I have 1,000 elements that is 30s, and I find that a little too long!
EDIT: on further review, runsql = cur.fetchall() is the line that takes the most to run.
Any help will be much appreciated.
2nd EDIT: Looking online further, I have found that the issue lies with the fetchall() command and not with my query or the initialization of the DB. Has anybody been able to improve the performance of the result fetching? (Some people mentioned changing the SQL code, but this is not to blame: it runs pretty fast, and the slowness comes when you try to grab the results.)
fetchall() reads all results, and returns them in a temporary list.
Your run() function then just puts all the results into another list.
Your top-level code then copies these values into yet another dictionary.
You should fetch only the row you need (which can be done directly on the cursor), and handle it directly:
cur.execute("SELECT Field, Value ...")
for row in cur:
    self.metrics[row[0].lower()] = row[1]
Note: this spreads the cost of the SQL query over the iterations of the for loop; the overall time spent in the database does not change.
This code improves only on the time that would have been spent handling all the temporary variables.
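As a self-contained illustration of handling rows straight from the cursor, here is a minimal sketch. The helper name `load_metrics`, the column layout of `metrics`, and the sample rows are assumptions made for the example; the symbol is passed as a query parameter instead of being formatted into the SQL string.

```python
import sqlite3

def load_metrics(con, symbol):
    """Build {lowercased field: value} by iterating the cursor directly."""
    cur = con.cursor()
    cur.execute(
        "SELECT Field, Value FROM metrics "
        "WHERE Status NOT LIKE '%ERROR%' AND Symbol LIKE ?",
        (symbol,),
    )
    # No intermediate lists: each row goes straight into the dictionary.
    return {field.lower(): value for field, value in cur}

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE metrics (Symbol TEXT, Field TEXT, Value REAL, Status TEXT)")
con.executemany(
    "INSERT INTO metrics VALUES (?, ?, ?, ?)",
    [
        ("ABC", "Price", 10.5, "OK"),
        ("ABC", "Volume", 1000.0, "OK"),
        ("ABC", "Beta", 1.2, "FETCH ERROR"),  # filtered out by the Status condition
        ("XYZ", "Price", 3.0, "OK"),          # filtered out by the Symbol condition
    ],
)
metrics = load_metrics(con, "ABC")
con.close()
```

The sketch takes an already open connection; reusing one connection across calls, rather than connecting inside every run(), may also save some time, though that would need measuring.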

Python fetch MySQLdb results in chunks with generator - error

I have the following code:
def executeQuery(conn, query):
    cur = conn.cursor()
    cur.execute(query)
    return cur
def trackTagsGenerator(chunkSize, baseCondition):
    """ Returns a dict of trackId:tag limited to chunkSize. """
    sql = """
        SELECT track_id, tag
        FROM tags
        WHERE {baseCondition}
    """.format(baseCondition=baseCondition)
    limit = chunkSize
    offset = 0
    while True:
        trackTags = {}
        # fetch the track ids with the corresponding tag
        limitPhrase = " LIMIT %d OFFSET %d" % (limit, offset)
        query = sql + limitPhrase
        offset += limit
        cur = executeQuery(smacConn, query)
        rows = cur.fetchall()
        if not rows:
            break
        for row in rows:
            trackTags[row['track_id']] = row['tag']
        yield trackTags
I want to use it like this:
for trackTags in list(trackTagsGenerator(DATA_CHUNK_SIZE, baseCondition)):
    print trackTags
    break
This code produces the following error without even fetching one chunk of track tags:
Exception _mysql_exceptions.ProgrammingError: (2014, "Commands out of sync; you can't run this command now") in <bound method SSDictCursor.__del__ of <MySQLdb.cursors.SSDictCursor object at 0x10b067b90>> ignored
I suspect it's because I have the query execution logic in the body of the loop in the generator function.
Can someone tell me how to fetch chunks of data using MySQLdb in this way?
I'm pretty sure this is because the yield lets you end up with two queries running simultaneously. Depending on how you call the function (threads, async, etc.), your cursor might get clobbered too.
As well, you're opening yourself up to (sorry, but I can't sugar coat this part) horrific SQL injection holes by inserting baseCondition using essentially a printf. Take a look at the DB-API's parameter substitution docs for help.
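A minimal sketch of the difference, using sqlite3 (which uses `?` placeholders) as a stand-in; with MySQLdb the placeholder is `%s` and the values are likewise passed as the second argument to execute(). The table contents and the hostile input string are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tags (track_id INTEGER, tag TEXT)")
conn.executemany(
    "INSERT INTO tags VALUES (?, ?)",
    [(1, "rock"), (2, "jazz"), (3, "rock")],  # made-up rows
)

user_input = "rock' OR '1'='1"  # hostile value, for illustration only

# Parameter substitution: the value travels separately from the SQL text,
# so the quote characters are treated as plain data.
safe_rows = conn.execute(
    "SELECT track_id, tag FROM tags WHERE tag = ? ORDER BY track_id LIMIT ? OFFSET ?",
    (user_input, 10, 0),
).fetchall()

# String formatting: the value becomes part of the SQL and changes its meaning.
unsafe_sql = "SELECT track_id, tag FROM tags WHERE tag = '%s' ORDER BY track_id" % user_input
unsafe_rows = conn.execute(unsafe_sql).fetchall()

conn.close()
```

The parameterized query matches nothing, because no tag equals the literal input string, while the formatted query returns every row. Note that placeholders only cover values; a whole condition such as baseCondition cannot be passed this way and would have to be built from trusted pieces.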
Yield isn't going to save you time or energy here at all, the full sql command will always need to run before you'll get a single result. (Hence you're using LIMIT and OFFSET to make it more friendly, kudos)
i.e. someone updates the table while you're yielding out some data, in this particular case - not the end of the world. In many others, it gets ugly.
If you're just goofing around and you want this to work 'right-now-dammit', it'd probably work to modify executeQuery as such:
def executeQuery(conn, query):
    cur = conn.cursor()
    cur.execute(query)
    # read everything and close the cursor before the next query is issued
    rows = cur.fetchall()
    cur.close()
    return rows  # the caller then uses: rows = executeQuery(smacConn, query)
One thing that also kinda jumps out at me - you define trackTags = {}, but then you update tagTrackIds, and yield trackTags.. Which will always be empty dict.
My suggestion would be to not bother yourself with the headache of hand writing SQL if you're just trying to get a hobby project working. Take a look at Elixir which is built on top of SQLAlchemy.
Using an ORM (object-relational-mapper) can be a much more friendly introduction to databases. Defining what your objects look like in Python, and having it automatically generate your schema for you - and being able to add/modify/delete things in a Pythonic manner is really nifty.
If you really need to be async, check out ultramysql python module.
You use an SSDictCursor, which maps to mysql_use_result() on the MySQL API side. This requires that you read out the complete result before you can issue a new command.
Since the error appears before you receive the first chunk of data: are you sure it doesn't happen in the context of a query issued before this part of the code is executed? The results of that earlier query might still be in the pipeline, and executing the next one (i.e., the first one in this context) might break things...
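One way to get chunks without issuing a new query per chunk is to execute once and call fetchmany() repeatedly on the same cursor. The following is a minimal sketch of that idea, not the asker's code: it uses sqlite3 as a stand-in for MySQLdb, and the helper name `iter_chunks` and the sample data are hypothetical.

```python
import sqlite3

def iter_chunks(conn, sql, params=(), chunk_size=2):
    """Yield lists of at most chunk_size rows from a single executed query."""
    cur = conn.cursor()
    try:
        cur.execute(sql, params)
        while True:
            rows = cur.fetchmany(chunk_size)
            if not rows:
                break
            yield rows
    finally:
        cur.close()  # runs when the generator is exhausted or closed

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tags (track_id INTEGER, tag TEXT)")
conn.executemany(
    "INSERT INTO tags VALUES (?, ?)",
    [(i, "tag%d" % i) for i in range(5)],  # made-up rows
)

# Each chunk becomes a {track_id: tag} dict, as in the question.
chunks = [
    dict(rows)
    for rows in iter_chunks(conn, "SELECT track_id, tag FROM tags ORDER BY track_id")
]
conn.close()
```

With an unbuffered cursor such as SSDictCursor, the generator would need to be fully consumed or closed before another command is sent on the same connection, for the reason given above.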

python sqlite3 for loop update

import sqlite3
conn = sqlite3.connect('sample.db')
cursor = conn.cursor()
data = cursor.execute('''SELECT * From Table''')
for i in data:
    title = i[0]
    status = i[1]
    cursor.execute('''UPDATE Table SET status=? WHERE title=?''', (status, title))
cursor.close()
conn.commit()
I am trying to update over multiple iterations. However, the script breaks out of the loop as soon as the database makes the first update. How to fix this? Thanks!
Use data = data.fetchall() before your loop. Otherwise you wind up recycling the cursor inside of your loop (resetting its result set) while you're trying to loop over that result set.
Using .fetchall() returns a list of results so that you have them stored locally before you re-use the cursor.
Alternatively, create a separate cursor to use for your update statements if you don't want to cache the results of the first query locally.
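Both suggestions can be combined in a minimal sketch like the following. The table name `items`, its rows, and the upper-casing of the status are assumptions made so the example has a visible effect.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (title TEXT, status TEXT)")  # hypothetical table
conn.executemany(
    "INSERT INTO items VALUES (?, ?)",
    [("a", "new"), ("b", "new"), ("c", "new")],  # made-up rows
)

# Cache the result set locally before doing anything else with the database.
read_cur = conn.cursor()
rows = read_cur.execute("SELECT title, status FROM items").fetchall()

# Use a separate cursor for the updates so the SELECT cursor is never reset.
write_cur = conn.cursor()
for title, status in rows:
    write_cur.execute(
        "UPDATE items SET status = ? WHERE title = ?",
        (status.upper(), title),
    )
conn.commit()

statuses = [row[0] for row in conn.execute("SELECT status FROM items ORDER BY title")]
conn.close()
```

Every row is updated here because the loop iterates over the cached list rather than over a cursor that is being reused.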
