SQL Queries on Python using Variables - python

I have a list of IDs
L1=['A1','A14','B43']
I am trying to use a SQL script to extract information from a table where the ID is in the above list.
sqlquery= "select * from table where ID in " + L1
cur.execute(sqlquery)
I've connected to Vertica using vertica_python and sqlalchemy_vertica, but I'm not sure how to incorporate my variable (the list L1) into the SQL query.
Updated Code:
data = ['A1', 'A14', 'B43', ...]
placeholders = ','.join('?' * len(data)) # this gives you e.g. '?,?,?'
sqlquery = 'SELECT * FROM table WHERE id IN (%s)' % placeholders
cur.execute(sqlquery, tuple(data))

The docs at https://github.com/vertica/vertica-python show that the Vertica DBAPI implementation uses ? for positional placeholders, so you can use a parametrized query.
Unfortunately, lists cannot be passed directly and need one parameter per element, so you need to generate this part dynamically:
data = ['A1', 'A14', 'B43', ...]
placeholders = ','.join('?' * len(data)) # this gives you e.g. '?,?,?'
sqlquery = 'SELECT * FROM table WHERE id IN (%s)' % placeholders
cur.execute(sqlquery, data)
But you still keep data and SQL separate that way, so there's no risk of SQL injection!
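The placeholder-building step above can be wrapped in a small helper. The sketch below is only an illustration, not part of the Vertica driver: build_in_query is a hypothetical helper name, the items table and its rows are invented, and the standard-library sqlite3 module stands in for a Vertica connection simply because it also accepts ? placeholders and needs no server.

```python
import sqlite3


def build_in_query(base_sql, values):
    """Return (sql, params) with one '?' placeholder per value.

    Hypothetical helper for illustration; base_sql must contain a single
    '{placeholders}' field where the IN list goes.
    """
    values = list(values)
    if not values:
        # "IN ()" is a syntax error in most databases, so fail early.
        raise ValueError("need at least one value for the IN clause")
    placeholders = ",".join("?" * len(values))
    return base_sql.format(placeholders=placeholders), values


# sqlite3 is only a stand-in here; the table and rows are invented.
con = sqlite3.connect(":memory:")
cur = con.cursor()
cur.execute("CREATE TABLE items (id TEXT, name TEXT)")
cur.executemany(
    "INSERT INTO items VALUES (?, ?)",
    [("A1", "first"), ("A14", "second"), ("B43", "third"), ("C7", "other")],
)

ids = ["A1", "A14", "B43"]
sql, params = build_in_query(
    "SELECT id, name FROM items WHERE id IN ({placeholders}) ORDER BY id", ids
)
rows = cur.execute(sql, params).fetchall()
print(sql)
print(rows)
con.close()
```

Only the number of ? marks is formatted into the SQL text; the values themselves travel separately as parameters, which is what keeps the query safe from injection.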

Related

How to store a column from a SQL request into a variable?

I want to take each column of the result of my request and store it in a separate variable, so I can work with the results.
To be precise, this is with a single SELECT * and not separate requests.
So, if I do for example:
with connection.cursor() as cursor:
    # Read all records
    sql = 'SELECT * FROM table'
    cursor.execute(sql)
    result = cursor.fetchall()
    print(result)
I want to do :
a = [results from column1]
b = [results from column2]
Each column's values should be collected into a single list (a row rather than a column), so that I can build a dictionary from them.
It's probably very simple but I'm new to Python / SQL, thank you.
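One possible approach, sketched under assumptions: fetchall() returns a list of row tuples, so zip(*result) transposes it into one sequence per column, and cursor.description supplies the column names for a dictionary. The people table and its rows are invented, and sqlite3 is used only so the sketch runs without a server; the asker's driver may differ in details.

```python
import sqlite3

# In-memory stand-in database; the table "people" and its rows are invented.
connection = sqlite3.connect(":memory:")
cursor = connection.cursor()
cursor.execute("CREATE TABLE people (name TEXT, age INTEGER)")
cursor.executemany("INSERT INTO people VALUES (?, ?)", [("Ann", 31), ("Bob", 27)])

cursor.execute("SELECT * FROM people ORDER BY name")
result = cursor.fetchall()  # list of row tuples
column_names = [d[0] for d in cursor.description]

# zip(*result) transposes the rows into columns.
columns = [list(col) for col in zip(*result)]

# One variable per column (this unpacking assumes exactly two columns).
a, b = columns

# Or a dictionary keyed by column name.
by_name = dict(zip(column_names, columns))
print(by_name)
connection.close()
```

If the query returns no rows, zip(*result) yields nothing, so the unpacking into a and b would fail; the dictionary form simply comes out empty in that case.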

Alter query according to user selection in sqlite python

I have a sqlite database named StudentDB which has 3 columns Roll number, Name, Marks. Now I want to fetch only the columns that user selects in the IDE. User can select one column or two or all the three. How can I alter the query accordingly using Python?
I tried:
import sqlite3
sel={"Roll Number":12}
query = 'select * from StudentDB Where({seq})'.format(seq=','.join(['?']*len(sel))),[i for k,i in sel.items()]
con = sqlite3.connect(database)
cur = con.cursor()
cur.execute(query)
all_data = cur.fetchall()
all_data
I am getting:
operation parameter must be str
You should control the text of the query. The WHERE clause should always be of the form WHERE colname=value [AND colname2=...] or (better) WHERE colname=? [AND ...] if you want to build a parameterized query.
So you want:
query = 'select * from StudentDB Where ' + ' AND '.join('"{}"=?'.format(col) for col in sel.keys())
...
cur.execute(query, tuple(sel.values()))
In your code, query is a tuple instead of a str, which is why you get the error.
I assume you want to execute a query like below -
select * from StudentDB Where "Roll number"=?
Then you can change the SQL query like this (assuming you want AND and not OR) -
query = "select * from StudentDB Where {seq}".format(seq=" and ".join('"{}"=?'.format(k) for k in sel.keys()))
and execute the query like -
cur.execute(query, tuple(sel.values()))
Please make sure that the variable database is defined in your code and holds the database file name, and that StudentDB is indeed the table name and not the database name.
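Both answers build the WHERE clause from the dictionary keys. A minimal end-to-end sketch follows, using an in-memory SQLite database so it runs on its own. The sample rows, the helper name build_filter_query, and the ALLOWED_COLUMNS whitelist are assumptions added for illustration; the whitelist is there because column names cannot be bound with ? and end up formatted into the SQL text.

```python
import sqlite3

# Assumed schema, taken from the column names mentioned in the question.
ALLOWED_COLUMNS = {"Roll Number", "Name", "Marks"}


def build_filter_query(table, filters):
    """Return (sql, params) for equality filters joined with AND.

    Hypothetical helper; `table` is expected to be a name fixed in code,
    never user input.
    """
    for col in filters:
        if col not in ALLOWED_COLUMNS:
            raise ValueError(f"unknown column: {col!r}")
    if not filters:
        return f'SELECT * FROM "{table}"', ()
    where = " AND ".join(f'"{col}"=?' for col in filters)
    return f'SELECT * FROM "{table}" WHERE {where}', tuple(filters.values())


con = sqlite3.connect(":memory:")
cur = con.cursor()
cur.execute(
    'CREATE TABLE StudentDB ("Roll Number" INTEGER, "Name" TEXT, "Marks" INTEGER)'
)
# Invented sample rows.
cur.executemany(
    "INSERT INTO StudentDB VALUES (?, ?, ?)",
    [(12, "Asha", 88), (13, "Ravi", 75)],
)

sel = {"Roll Number": 12}
query, params = build_filter_query("StudentDB", sel)
all_data = cur.execute(query, params).fetchall()
print(query)
print(all_data)
con.close()
```

The query text and the parameters are passed to execute() as two separate arguments, which avoids the "operation parameter must be str" error from the question.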

How to use dynamic variables(tuple) in pandasql query?

number_tuple = (1,4,6,3)
sensex_quaterly_df = psql.sqldf("SELECT * FROM sensex_df
WHERE 'Num' IN ('number_tuple')")
"HERE number_tuple has the values that I want to retrieve from sensex_df database"
Because pandasql runs SQL on data frames, you can build the SQL by joining the tuple's values into a comma-separated string with str.join().
number_tuple = (1,4,6,3)
in_values = ", ".join(str(i) for i in number_tuple)
sql = f"SELECT * FROM sensex_df WHERE Num IN ({in_values})"
sensex_quaterly_df = psql.sqldf(sql)
However, concatenating values into SQL strings is not recommended if you use an actual relational database as the backend. In that case, use parameterization: write a prepared SQL statement with placeholders such as %s or ?, then bind the values in a subsequent step. The example below demonstrates this with pandas read_sql:
number_tuple = (1,4,6,3)
in_values = ", ".join('?' for i in number_tuple)
sql = f"SELECT * FROM sensex_df WHERE Num IN ({in_values})"
sensex_quaterly_df = pd.read_sql(sql, conn, params=number_tuple)

Using format in two queries in sql

I have a query that uses EXCEPT. I want to pass the schema and table names in with format when running the SELECT query.
query_2="""select *
from {}.{}
where date(etl_date) = current_date
except select *
from {}_test.{}
where date(etl_date)=current_date""".format(liste[0],liste[1])
But naturally I get an error like this.
IndexError: tuple index out of range
How else can I use the format function here? Thanks...
Do not use simple format for SQL queries; use sql.Identifier for tables and fields, and use the second argument of the execute method to pass variables (if needed).
import psycopg2
from psycopg2.sql import Identifier, SQL
connection = psycopg2.connect("...")
cursor = connection.cursor()
suffix = "_test"
identifiers = [Identifier("some_schema"), Identifier("some_table"), Identifier("other_schema%s" % suffix), Identifier("other_table")]
query_2 = SQL("""select * from {}.{} where date(etl_date) = current_date
except select * from {}.{} where date(etl_date)=current_date""").format(*identifiers)
print(query_2.as_string(cursor)) # if you want to see the final query
cursor.execute(query_2)
Output
select * from "some_schema"."some_table" where date(etl_date) = current_date
except select * from "other_schema_test"."other_table" where date(etl_date)=current_date
This assumes you have multiple schemas in the same database as you can't easily do cross database queries in PostgreSQL.
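As for the IndexError itself: the template in the question has four anonymous {} fields but format() receives only two arguments. The sketch below illustrates just that mechanism with plain str.format and numbered fields, which let one argument be reused; the values in liste are invented stand-ins. Plain formatting does not quote identifiers, so for real queries the sql.Identifier approach above remains the safer choice.

```python
# Hypothetical stand-ins for the asker's liste values.
liste = ["sales", "orders"]

# Numbered fields: {0} and {1} can each appear more than once.
template = """select *
from {0}.{1}
where date(etl_date) = current_date
except select *
from {0}_test.{1}
where date(etl_date) = current_date"""

query_2 = template.format(liste[0], liste[1])
print(query_2)

# Four anonymous fields with only two arguments reproduce the error.
raised = False
try:
    "{}.{} / {}_test.{}".format(liste[0], liste[1])
except IndexError:
    raised = True
print("IndexError raised:", raised)
```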

Oracle Parameters used multiple places in query

I'm trying to pass the same parameters to an Oracle query in two separate places in the SQL code.
My code works if I hard code the criteria for table2 like this:
# define parameters
years = ['2018','2019']
placeholder= ':d'
placeholders= ', '.join(placeholder for unused in years)
placeholders
# create cursor
cursor = connection.cursor()
# query
qry = """
select * from table1
INNER
JOIN table2
ON table1_id = table2_id
where table1_year in (%s)
and table2_year in ('2018','2019') --here's where I say I'm hard coding criteria
""" % placeholders
data = cursor.execute(qry, years)
df = pd.DataFrame(data.fetchall(), columns = [column[0] for column in cursor.description])
# close database connection
connection.close()
If I try to use the parameter for table2 like this:
qry = """
select * from table1
INNER
JOIN table2
ON table1_id = table2_id
where table1_year in (%s)
and table2_year in (%s) --part of code I'm having issues with
""" % placeholders
I get the following error:
TypeError: not enough arguments for format string
I can't simply rewrite the SQL because I frequently have to use someone else's code and it wouldn't be feasible to rewrite all of it.
If you want to fill multiple placeholders, you have to supply the same number of arguments, passed together as a tuple.
"one meal: %s" % "sandwich" # works
"two meals: %s, %s" % "sandwich" # raises TypeError: not enough arguments for format string
"two meals: %s, %s" % ("sandwich", "sandwich") # works
NOTE: It is a bad/dangerous thing to use string formatting for the assembly of SQL queries (lookup "SQL Injection"). In your case it is fine, but in general you should use parameterized queries, especially when dealing with input from untrusted sources like user input. You don't want a user to input "2018; DROP TABLE table1;".
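Applied to the query in the question, this means passing the placeholder string once per %s. The sketch below only builds the SQL text and the bind values, with no database connection. It departs from the question in one respect, stated here as an assumption: the binds are given unique names (y0, y1, ...) so that a single dictionary can serve both IN lists when binding by name.

```python
years = ["2018", "2019"]

# One uniquely named bind per value, e.g. ":y0, :y1".
bind_names = [f"y{i}" for i in range(len(years))]
placeholders = ", ".join(f":{name}" for name in bind_names)
binds = dict(zip(bind_names, years))

qry_template = """
select * from table1
INNER JOIN table2
ON table1_id = table2_id
where table1_year in (%s)
and table2_year in (%s)
"""

# Two %s fields need a tuple of two arguments.
qry = qry_template % (placeholders, placeholders)
print(qry)
print(binds)

# With a live connection this would be run roughly as follows
# (assumption, not executed here):
# data = cursor.execute(qry, binds)
```

Only the bind names are formatted into the SQL text; the year values are still handed to the driver separately.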
