Executemany SELECT queries with psycopg2 - python

I have a large PostgreSQL DB of users that I connect to using psycopg2. I need to retrieve (SELECT) the information of a specific large subset of users (>200). I am provided with a list of ids and I need to return the age of each of those users. I wrote a working solution:
conn = psycopg2.connect("dbname= bla bla bla")
cur = conn.cursor()
for user_id in interesting_users:
    qr = "SELECT age FROM users WHERE country_code = {0} AND user_id = {1}".format(1, user_id)
    cur.execute(qr)
    fetched_row = cur.fetchall()
    # parse results
This solution works fine, however it is not ideal when the length of interesting_users is large. I am looking for a more efficient approach than executing multiple queries. One solution would be to create a single query by appending all the user ids:
for user_id in interesting_users:
    query += " OR user_id = {0}".format(user_id)
But I was hoping for a more elegant solution.
I found that psycopg2 provides the executemany() method, so I tried to apply it to my problem. However, I can't manage to make it work. This:
cur.executemany("SELECT age FROM users WHERE country_code = %s AND user_id = %s",[(1, user_id) for user_id in interesting_users])
r = cur.fetchall()
returns:
r = cur.fetchall()
psycopg2.ProgrammingError: no results to fetch
So, can executemany() be used for a SELECT statement? If yes, what's wrong with my code? If no, how can I perform multiple SELECT queries at once?
Note: ids in interesting_users have no order so I can't use something like WHERE id < ...
SOLUTION:
query = "SELECT age FROM users WHERE country_code = {0} AND user_id IN ({1});".format(1, ",".join(map(str, interesting_users)))
cur.execute(query)
fetched_rows = cur.fetchall()

executemany() discards any result set, so it is useful for statements such as INSERT, UPDATE or DELETE, not for SELECT. Use execute() with IN instead:
cur.execute("SELECT age FROM users WHERE country_code = %s AND user_id IN ({})".format(','.join(['%s'] * len(interesting_users))),
            [1] + interesting_users)
r = cur.fetchall()
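As a minimal sketch of the same idea, the placeholder list can be built by a small helper so that every value, including the ids, is passed as a parameter rather than formatted into the string. The helper name build_age_query is hypothetical, and the table and column names are simply those from the question; the cursor calls are shown only as comments because no database is opened here.

```python
def build_age_query(country_code, user_ids):
    """Return (sql, params) for one SELECT covering all user_ids.

    Hypothetical helper; table and column names come from the question.
    """
    user_ids = list(user_ids)
    if not user_ids:
        # "IN ()" is not valid SQL, so refuse an empty id list up front.
        raise ValueError("user_ids must not be empty")
    placeholders = ", ".join(["%s"] * len(user_ids))
    sql = (
        "SELECT user_id, age FROM users "
        "WHERE country_code = %s AND user_id IN ({})".format(placeholders)
    )
    return sql, [country_code] + user_ids


sql, params = build_age_query(1, [42, 7, 19])
print(sql)
print(params)

# With an open psycopg2 cursor `cur` (assumed, not created here):
#   cur.execute(sql, params)
#   age_by_user = dict(cur.fetchall())
```

Selecting user_id alongside age is a design choice of this sketch: the rows of an IN query come back in no guaranteed order, so the id is needed to match each age to its user.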

Related

Creating a Search Record Function Python SQLite3

I am currently working on a coursework project for school and it is a database system with a user interface using Tkinter, Python and SQLite3. I have made a form to add, delete, update and search for customers. I am able to display the result from a single field, however, I am struggling to get the message box to display all the fields, which is what I would like it to do. I have attached photos of the form along with the code. Thank you in advance.
def SearchCustomer(self):
    customerid = self.CustomerEntry.get();
    with sqlite3.connect("LeeOpt.db") as db:
        cursor = db.cursor()
        search_customer = ('''SELECT * FROM Customer WHERE CustomerID = ?''')
        cursor.execute(search_customer, [(customerid)])
        results = cursor.fetchall()
        if results:
            for i in results:
                tkinter.messagebox.showinfo("Notification",i[0])
That is because you show only the first column (i[0]) of each result row.
Since there should be only one record for a specific customer ID, you should use fetchone() instead of fetchall(), then you can show the whole record as below:
def SearchCustomer(self):
    customerid = self.CustomerEntry.get()
    with sqlite3.connect("LeeOpt.db") as db:
        cursor = db.cursor()
        search_customer = "SELECT * FROM Customer WHERE CustomerID = ?"
        cursor.execute(search_customer, [customerid])
        result = cursor.fetchone()  # there should be only one record for specific customer ID
        if result:
            tkinter.messagebox.showinfo("Notification", "\n".join(str(x) for x in result))
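If the message should also say which value belongs to which field, the column names can be read from cursor.description after the SELECT. The following is only a sketch: format_record is a hypothetical helper, and the in-memory table with its three columns is an assumed stand-in for LeeOpt.db, whose real schema is not shown in the question.

```python
import sqlite3


def format_record(cursor, row):
    """Return 'column: value' lines for one row of the cursor's last SELECT."""
    column_names = [col[0] for col in cursor.description]
    return "\n".join(
        "{}: {}".format(name, value) for name, value in zip(column_names, row)
    )


# In-memory stand-in for LeeOpt.db; this schema is an assumption.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE Customer (CustomerID INTEGER PRIMARY KEY, Name TEXT, Town TEXT)")
db.execute("INSERT INTO Customer VALUES (1, 'Ada', 'Leeds')")

cursor = db.cursor()
cursor.execute("SELECT * FROM Customer WHERE CustomerID = ?", [1])
result = cursor.fetchone()
message = format_record(cursor, result) if result else "No customer found"
print(message)
```

The resulting string could be passed to tkinter.messagebox.showinfo() in place of the joined values.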

How to protect SELECT * FROM var1 WHERE var2 statements from SQLInjection

I am making a website in Django where I want the user to enter a table id and a group id, and then get back the table and group that they entered. However, I have only found statements that are prone to SQL injection. Does anybody know how to fix this?
mycursor = mydb.cursor()
qry = "SELECT * from %s WHERE group_id = %i;" % (assembly_name, group_id)
mycursor.execute(qry)
return mycursor.fetchall()
Or do something that achieves the same thing?
I have tried doing something like this:
assembly_id = 'peptides_proteins_000005'
group_id = 5
mycursor = mydb.cursor()
mycursor.execute("SELECT * FROM %s WHERE group_id = %s", [assembly_id, group_id])
myresult = mycursor.fetchall()
but I get this error:
1064 (42000): You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near ''peptides_proteins_000005' WHERE group_id = 5' at line 1
It's typically not possible to bind table names as query parameters; placeholders work only for values. For SELECT statements, the easiest approach is to validate table name candidates against a whitelist of known tables.
Also consider whether an abstraction layer, or constraining user input to the finite set of valid names in the user interface, would be worth the overhead.
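A minimal sketch of the whitelisting approach might look like the following. ALLOWED_TABLES and build_group_query are hypothetical names, and the second table name is invented for illustration; only the table name that passed the check is formatted into the SQL, while group_id still goes through a placeholder.

```python
# Hypothetical whitelist; in practice this would list the real assembly tables.
ALLOWED_TABLES = {"peptides_proteins_000005", "peptides_proteins_000006"}


def build_group_query(table_name, group_id):
    """Return (sql, params), refusing any table name not on the whitelist."""
    if table_name not in ALLOWED_TABLES:
        raise ValueError("unknown table: {!r}".format(table_name))
    # The table name is safe to interpolate because it matched the whitelist.
    sql = "SELECT * FROM `{}` WHERE group_id = %s".format(table_name)
    return sql, (int(group_id),)


sql, params = build_group_query("peptides_proteins_000005", 5)
print(sql, params)

# With an open cursor `mycursor` (assumed, not created here):
#   mycursor.execute(sql, params)
#   rows = mycursor.fetchall()
```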

Using COUNT(*) OVER() in current query with SQLAlchemy over PostgreSQL

In a prototype application that uses Python and SQLAlchemy with a PostgreSQL database I have the following schema (excerpt):
class Guest(Base):
    __tablename__ = 'guest'
    id = Column(Integer, primary_key=True)
    name = Column(String(50))
    surname = Column(String(50))
    email = Column(String(255))
    [..]
    deleted = Column(Date, default=None)
I want to build a query, using SQLAlchemy, that retrieves the list of guests, to be displayed in the back-office.
To implement pagination I will be using LIMIT and OFFSET, and also COUNT(*) OVER() to get the total amount of records while executing the query (not with a different query).
An example of the SQL query could be:
SELECT id, name, surname, email,
COUNT(*) OVER() AS total
FROM guest
WHERE (deleted IS NULL)
ORDER BY id ASC
LIMIT 50
OFFSET 0
If I were to build the query using SQLAlchemy, I could do something like:
query = session.query(Guest)
query = query.filter(Guest.deleted == None)
query = query.order_by(Guest.id.asc())
query = query.offset(0)
query = query.limit(50)
result = query.all()
And if I wanted to count all the rows in the guests table, I could do something like this:
from sqlalchemy import func
query = session.query(func.count(Guest.id))
query = query.filter(Guest.deleted == None)
result = query.scalar()
Now the question I am asking is how to execute one single query, using SQLAlchemy, similar to the one above, that kills two birds with one stone (returns the first 50 rows and the count of the total rows to build the pagination links, all in one query).
The interesting bit is the use of window functions in PostgreSQL, which allow the above-mentioned behaviour and save you from having to query twice.
Is it possible?
Thanks in advance.
So I could not find any examples in the SQLAlchemy documentation, but I found these functions:
count()
over()
label()
And I managed to combine them to produce exactly the result I was looking for:
from sqlalchemy import func
query = session.query(Guest, func.count(Guest.id).over().label('total'))
query = query.filter(Guest.deleted == None)
query = query.order_by(Guest.id.asc())
query = query.offset(0)
query = query.limit(50)
result = query.all()
Cheers!
P.S. I also found this question on Stack Overflow, which was unanswered.
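Each row returned by a query of this shape is a (Guest, total) pair, with the same total repeated on every row. A small helper can separate the two; this is only a sketch, split_page is a hypothetical name, and plain strings stand in for Guest instances so that the example runs without a database.

```python
def split_page(rows):
    """Split [(guest, total), ...] into (guests, total).

    total is None for an empty page, because the window count only
    travels with returned rows.
    """
    if not rows:
        return [], None
    guests = [guest for guest, _total in rows]
    return guests, rows[0][1]


# Stand-in for query.all(); strings replace Guest instances here.
result = [("guest-1", 120), ("guest-2", 120)]
guests, total = split_page(result)
print(guests, total)
```

One caveat of this design: if the OFFSET points past the last row, no rows come back and the total is unavailable, so the caller may need a fallback count in that case.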

How do I confine the output of a fetchall() on my table to just the value?

I have the following function:
def credential_check(username, password):
    conn = sqlite3.connect('pythontkinter.db')
    c = conn.cursor()
    idvalue = c.execute('''SELECT ID FROM userdetails WHERE username = "{0}"'''.format(username)).fetchall()
    print(idvalue)
I wish to assign the value of ID in my userdetails table to the variable idvalue in the row where the inputted username = userdetails username, however when I use this fetchall() I get [('0',)] printed out rather than just 0.
How do I go about doing this?
Thanks
You can use fetchone() if you only want one value. However, the result will still be returned as a tuple, just without the list.
import sqlite3
conn = sqlite3.connect('test.db')
c = conn.cursor()
c.execute('''CREATE TABLE IF NOT EXISTS testing(id TEXT)''')
conn.commit()
c.execute("""INSERT INTO testing (id) VALUES ('0')""")
conn.commit()
c.execute("""SELECT id FROM testing""")
data = c.fetchone()
print(data)
# --> ('0',)
You can also use LIMIT if you want to restrict the number of returned values with fetchall().
More importantly, don't format your queries like that. Get used to using the ? placeholder as a habit so that you are not vulnerable to SQL injection.
idvalue = c.execute("""SELECT ID FROM userdetails WHERE username = ?""", (username,)).fetchone()
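To end up with the bare value rather than a one-element tuple, index the row returned by fetchone(), after checking that a row was found. A minimal sketch, using an in-memory database as a stand-in for pythontkinter.db; fetch_scalar is a hypothetical helper and the sample usernames are invented:

```python
import sqlite3


def fetch_scalar(cursor, sql, params=()):
    """Return the first column of the first row, or None if there is no row."""
    cursor.execute(sql, params)
    row = cursor.fetchone()
    return row[0] if row is not None else None


# In-memory stand-in for pythontkinter.db with assumed sample data.
conn = sqlite3.connect(":memory:")
c = conn.cursor()
c.execute("CREATE TABLE userdetails (ID TEXT, username TEXT)")
c.execute("INSERT INTO userdetails VALUES (?, ?)", ("0", "alice"))

idvalue = fetch_scalar(c, "SELECT ID FROM userdetails WHERE username = ?", ("alice",))
missing = fetch_scalar(c, "SELECT ID FROM userdetails WHERE username = ?", ("bob",))
print(idvalue, missing)
```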

Performing an SQL query for each item in a tuple

I am new to Python and am hoping someone can help me figure out how to perform an SQL query on each item in a tuple using Python.
I have a SQL Express server that contains a number of databases for a badge reader system. What I am trying to do is pull the user id's that have scanned into a particular reader, then use those id's to get the actual user names.
Currently, I am able to run the query that pulls the user id's and run a query on the other table using just one id. What I want to be able to do, and seem to be having an issue figuring out, is run that second query on every user id in the tuple that is created from the first query. Below is the code for the two functions I am currently using.
def get_id():
    global cardholder
    global cur
    cur.execute("SELECT user_id FROM db.table WHERE badgereaderid = 'badgereader1'")
    cardholder = []
    rows = cur.fetchall()
    for row in rows:
        if row == None:
            break
        cardholder.append(row[0])
    print(cardholder)

def get_name():
    global cardholder
    global user
    global cur
    cur.execute("SELECT FirstName, LastName FROM db.table WHERE user_id= '%s'" % cardholder)
    while 1:
        row = cur.fetchone()
        if row == None:
            break
        user = row[0] + row[1]
Two possible options
Repeated queries in Python
for user_id in cardholder:
    cur.execute("SELECT FirstName, LastName FROM db.table WHERE user_id= '%s'" % user_id)
But why not just pull all the data in the first query?
cur.execute("SELECT a.user_id, b.FirstName, b.LastName FROM db.table1 a left join db.table2 b on a.user_id = b.user_id WHERE a.badgereaderid = 'badgereader1'")
or, use triple quotes to allow multi-line strings and make the SQL command easier to understand
cur.execute("""SELECT
a.user_id,
b.FirstName,
b.LastName
FROM db.table1 a
left join db.table2 b
on a.user_id = b.user_id
WHERE a.badgereaderid = 'badgereader1'""")
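The single-query approach can be tried end to end with the sketch below. It uses sqlite3 with an in-memory database purely as a stand-in for the SQL Server connection; the table names scans and users, the sample rows and the function names_for_reader are all assumptions made for illustration. The reader id is passed as a parameter instead of being written into the SQL string; the ? marker is the style sqlite3 uses, and other drivers may use a different one.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Hypothetical stand-ins for the two tables in the question.
cur.execute("CREATE TABLE scans (user_id INTEGER, badgereaderid TEXT)")
cur.execute("CREATE TABLE users (user_id INTEGER, FirstName TEXT, LastName TEXT)")
cur.executemany(
    "INSERT INTO scans VALUES (?, ?)",
    [(1, "badgereader1"), (2, "badgereader2"), (3, "badgereader1")],
)
cur.executemany(
    "INSERT INTO users VALUES (?, ?, ?)",
    [(1, "Ada", "Lovelace"), (2, "Alan", "Turing")],
)


def names_for_reader(cur, reader_id):
    """Return (user_id, FirstName, LastName) for everyone scanned at reader_id."""
    cur.execute(
        """SELECT a.user_id, b.FirstName, b.LastName
           FROM scans a
           LEFT JOIN users b ON a.user_id = b.user_id
           WHERE a.badgereaderid = ?
           ORDER BY a.user_id""",
        (reader_id,),
    )
    return cur.fetchall()


rows = names_for_reader(cur, "badgereader1")
print(rows)
```

Because of the LEFT JOIN, a user id that was scanned but has no matching name row still appears, with None in the name columns.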
A good practice in Python is to define the data collections outside the function if you intend to use them later on in your code.
Try this code:
cardholder_names = []

# pass the cardholder as a parameter to the function
def get_name(cardholder):
    # cur is already defined as a global, so there is no need to declare it again
    cur.execute("SELECT FirstName, LastName FROM db.table WHERE user_id='{0}'".format(cardholder))
    return cur.fetchone()

# now use the for loop to iterate over all the cardholders
for holder in cardholder:  # the list of ids built by get_id()
    cardholder_name = get_name(holder)
    cardholder_names.append({"name": cardholder_name[0], "surname": cardholder_name[1]})
