It's not straightforward to find information on this, so I'm wondering if there are docs I can look at. Basically, I want to pass multiple conditions to either .where() or .order_by() in a way that is safe from SQL injection.
Here's how I'm currently doing it. There are two tables, Archive and Backup. I'm trying to filter by archive.city, archive.zip, and backup.serial, and then order by all of those fields. The values come from the user via URL parameters, so I need to make sure they are sanitized and safe from SQL injection.
filters = []
sorts = []
if 'city' in query:
    city = query['city']
    filters.append(text(f'archive.city = {city}'))
    sorts.append(text(f'archive.city = {city}'))
if 'zip' in query:
    zip = query['zip']
    filters.append(text(f'archive.zip > {zip}'))
    sorts.append(text(f'archive.zip DESC'))
if 'serial' in query:
    serial = query['serial']
    filters.append(text(f'backup.serial IN {serial}'))
    sorts.append(text(f'backup.serial ASC'))
with Session(engine) as session:
    results = session.exec(
        select(Archive, Backup)
        .join(Backup)
        .where(and_(*filters))
        .order_by(*sorts)
    ).all()
As I understand it, text() is not safe from SQL injection, so how do I transform this so that it does what I want and is safe?
You can invoke .where() and .order_by() on a select() multiple times and SQLAlchemy will logically "and" them for you:
qry = select(Task)
qry = qry.where(Task.description == "foo")
qry = qry.where(Task.priority < 2)
qry = qry.order_by(Task.priority)
qry = qry.order_by(Task.description)
print(qry)
"""
SELECT task.id, task.description, task.priority
FROM task
WHERE task.description = :description_1 AND task.priority < :priority_1
ORDER BY task.priority, task.description
"""
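As a rough sketch of how the question's filter-building code might look with this approach, the block below uses column expressions instead of text(), so user values travel as bound parameters and never become part of the SQL string. The Core tables, their columns, the archive_id join column and the build_query helper are assumptions made for illustration (the question's real models are not shown), and SQLAlchemy 1.4 or later is assumed.

```python
from sqlalchemy import Column, Integer, MetaData, String, Table, select

metadata = MetaData()

# Hypothetical stand-ins for the Archive and Backup models in the question.
archive = Table(
    "archive", metadata,
    Column("id", Integer, primary_key=True),
    Column("city", String),
    Column("zip", Integer),
)
backup = Table(
    "backup", metadata,
    Column("id", Integer, primary_key=True),
    Column("archive_id", Integer),  # assumed join column
    Column("serial", String),
)


def build_query(query):
    """Build a SELECT from a dict of URL parameters (hypothetical helper)."""
    stmt = select(archive, backup).select_from(archive).join(
        backup, archive.c.id == backup.c.archive_id
    )
    if "city" in query:
        # The value becomes a bound parameter, not SQL text.
        stmt = stmt.where(archive.c.city == query["city"])
        stmt = stmt.order_by(archive.c.city)
    if "zip" in query:
        # int() raises ValueError on non-numeric input.
        stmt = stmt.where(archive.c.zip > int(query["zip"]))
        stmt = stmt.order_by(archive.c.zip.desc())
    if "serial" in query:
        # Assumes a comma-separated list such as "A1,B2".
        serials = query["serial"].split(",")
        stmt = stmt.where(backup.c.serial.in_(serials))
        stmt = stmt.order_by(backup.c.serial.asc())
    return stmt


stmt = build_query({"city": "x'; DROP TABLE archive; --", "zip": "90000"})
compiled = stmt.compile()
sql = str(compiled)          # SQL text with placeholders only
params = compiled.params     # values kept separately
print(sql)
print(params)
```

The sort direction and the column names are chosen by the code, not by the user, which is what keeps the ORDER BY part safe; only the comparison values come from the request.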
Ultimately I'm trying to use pandas read_sql which takes in a raw SQL query.
I'm trying to convert something like:
sql = str(session.query(Post).filter(Post.user_id == 1))
That generates something like:
select * from Post where user_id = %(user_id_1)s
Is there any way to generate that query with the parameter already interpolated?
As you have found, if we str() an ORM query we get the SQL command text with parameter placeholders using the paramstyle for our dialect:
qry = session.query(Parent).filter(Parent.id == 1)
sql = str(qry)
print(sql)
"""console output:
SELECT parent.id AS parent_id, parent.lastname AS parent_lastname, parent.firstname AS parent_firstname
FROM parent
WHERE parent.id = %(id_1)s
"""
If we want to have the parameter values embedded in the SQL statement then we need to .compile() it:
sql_literal = qry.statement.compile(
    compile_kwargs={"literal_binds": True},
)
print(sql_literal)
"""console output:
SELECT parent.id, parent.lastname, parent.firstname
FROM parent
WHERE parent.id = 1
"""
(Standard disclaimers regarding SQL Injection apply.)
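If the goal is only to hand the query to another library, one alternative to interpolating values is to keep the SQL text and the parameter dictionary separate. The sketch below uses a hypothetical Core table named parent (mirroring the columns in the output above) and compiles against the PostgreSQL dialect without connecting to a database; SQLAlchemy 1.4 or later is assumed.

```python
from sqlalchemy import Column, Integer, MetaData, String, Table, select
from sqlalchemy.dialects import postgresql

metadata = MetaData()

# Hypothetical table mirroring the Parent entity used above.
parent = Table(
    "parent", metadata,
    Column("id", Integer, primary_key=True),
    Column("lastname", String),
    Column("firstname", String),
)

stmt = select(parent).where(parent.c.id == 1)

# Compile for PostgreSQL; no connection or driver call is made here.
compiled = stmt.compile(dialect=postgresql.dialect())
sql_text = str(compiled)   # contains the placeholder, not the value
bound = compiled.params    # the values, kept apart from the SQL

# Same statement with the value rendered inline, as in the answer above.
sql_literal = str(
    stmt.compile(
        dialect=postgresql.dialect(),
        compile_kwargs={"literal_binds": True},
    )
)
print(sql_text, bound, sql_literal, sep="\n")
```

pandas.read_sql accepts a params argument, so sql_text and bound could be passed together; it is also documented to accept a SQLAlchemy selectable directly, which may avoid the string conversion altogether. Neither call is exercised here.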
In a prototype application that uses Python and SQLAlchemy with a PostgreSQL database I have the following schema (excerpt):
class Guest(Base):
    __tablename__ = 'guest'
    id = Column(Integer, primary_key=True)
    name = Column(String(50))
    surname = Column(String(50))
    email = Column(String(255))
    [..]
    deleted = Column(Date, default=None)
I want to build a query, using SQLAlchemy, that retrieves the list of guests, to be displayed in the back-office.
To implement pagination I will be using LIMIT and OFFSET, and also COUNT(*) OVER() to get the total number of records while executing the query (not with a separate query).
An example of the SQL query could be:
SELECT id, name, surname, email,
COUNT(*) OVER() AS total
FROM guest
WHERE (deleted IS NULL)
ORDER BY id ASC
LIMIT 50
OFFSET 0
If I were to build the query using SQLAlchemy, I could do something like:
query = session.query(Guest)
query = query.filter(Guest.deleted == None)
query = query.order_by(Guest.id.asc())
query = query.offset(0)
query = query.limit(50)
result = query.all()
And if I wanted to count all the rows in the guests table, I could do something like this:
from sqlalchemy import func
query = session.query(func.count(Guest.id))
query = query.filter(Guest.deleted == None)
result = query.scalar()
Now the question I am asking is how to execute one single query, using SQLAlchemy, similar to the one above, that kills two birds with one stone (returns the first 50 rows and the count of the total rows to build the pagination links, all in one query).
The interesting bit is the use of window functions in PostgreSQL, which allow the above-mentioned behaviour and so save you from having to query twice.
Is it possible?
Thanks in advance.
So I could not find any examples in the SQLAlchemy documentation, but I found these functions:
count()
over()
label()
And I managed to combine them to produce exactly the result I was looking for:
from sqlalchemy import func
query = session.query(Guest, func.count(Guest.id).over().label('total'))
query = query.filter(Guest.deleted == None)
query = query.order_by(Guest.id.asc())
query = query.offset(0)
query = query.limit(50)
result = query.all()
Cheers!
P.S. I also found this question on Stack Overflow, which was unanswered.
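For reference, the same idea can be sketched against a plain Core table so that the generated SQL is easy to inspect. The guest table below is a hypothetical copy of the columns listed in the question, func.count() with no argument is used to get COUNT(*), and the statement is only compiled, not executed. SQLAlchemy 1.4 or later is assumed.

```python
from sqlalchemy import (
    Column, Date, Integer, MetaData, String, Table, func, select,
)

metadata = MetaData()

# Hypothetical Core version of the Guest model from the question.
guest = Table(
    "guest", metadata,
    Column("id", Integer, primary_key=True),
    Column("name", String(50)),
    Column("surname", String(50)),
    Column("email", String(255)),
    Column("deleted", Date),
)

stmt = (
    select(
        guest.c.id,
        guest.c.name,
        guest.c.surname,
        guest.c.email,
        # Window function: total row count before LIMIT/OFFSET apply.
        func.count().over().label("total"),
    )
    .where(guest.c.deleted.is_(None))
    .order_by(guest.c.id.asc())
    .limit(50)
    .offset(0)
)

# Collapse whitespace so the statement prints on one line.
sql = " ".join(str(stmt).split())
print(sql)
```

One thing to keep in mind with this design: the total is attached to each returned row, so a page past the last row returns no rows and therefore no total.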
In trying to replicate a MySQL query in SQL Alchemy, I've hit a snag in specifying which tables to select from.
The query that works is
SELECT c.*
FROM attacks AS a INNER JOIN hosts h ON a.host_id = h.id
INNER JOIN cities c ON h.city_id = c.id
GROUP BY c.id;
I try to accomplish this in SQLAlchemy using the following function
def all_cities():
    session = connection.globe.get_session()
    destination_city = aliased(City, name='destination_city')
    query = session.query(City). \
        select_from(Attack).\
        join((Host, Attack.host_id == Host.id)).\
        join((destination_city, Host.city_id == destination_city.id)).\
        group_by(destination_city.id)
    print query
    results = [result.serialize() for result in query]
    session.close()
    file(os.path.join(os.path.dirname(__file__), "servers.geojson"), 'a').write(geojson.feature_collection(results))
When printing the query, I end up with ALMOST the right query
SELECT
cities.id AS cities_id,
cities.country_id AS cities_country_id,
cities.province AS cities_province,
cities.latitude AS cities_latitude,
cities.longitude AS cities_longitude,
cities.name AS cities_name
FROM cities, attacks
INNER JOIN hosts ON attacks.host_id = hosts.id
INNER JOIN cities AS destination_city ON hosts.city_id = destination_city.id
GROUP BY destination_city.id
However, you will note that it is selecting from cities, attacks...
How can I get it to select only from the attacks table?
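This line:
query = session.query(City)
selects from the unaliased City entity, while the join targets the destination_city alias. SQLAlchemy treats the two as separate FROM elements, which is why the output contains
FROM cities, attacks
Querying the alias instead, session.query(destination_city), keeps the plain cities table out of the FROM list.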
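A minimal sketch of that fix, written with hypothetical Core tables rather than the asker's ORM classes (which are not shown), selects the aliased table so that it is the only reference to cities in the statement. It assumes SQLAlchemy 1.4 or later and only compiles the statement.

```python
from sqlalchemy import Column, Integer, MetaData, String, Table, select

metadata = MetaData()

# Hypothetical tables; only the columns needed for the joins are included.
cities = Table(
    "cities", metadata,
    Column("id", Integer, primary_key=True),
    Column("name", String),
)
hosts = Table(
    "hosts", metadata,
    Column("id", Integer, primary_key=True),
    Column("city_id", Integer),
)
attacks = Table(
    "attacks", metadata,
    Column("id", Integer, primary_key=True),
    Column("host_id", Integer),
)

destination_city = cities.alias("destination_city")

stmt = (
    select(destination_city)          # select the alias, not cities itself
    .select_from(attacks)
    .join(hosts, attacks.c.host_id == hosts.c.id)
    .join(destination_city, hosts.c.city_id == destination_city.c.id)
    .group_by(destination_city.c.id)
)

# Collapse whitespace so the FROM clause is easy to check.
sql = " ".join(str(stmt).split())
print(sql)
```

Because every selected column belongs to the alias that already appears in the join chain, the FROM clause starts at attacks and no second, unjoined cities entry is added.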
I want to return the result as a custom model (class), but I get this error message:
SQL expression, column, or mapped entity expected
I don't know where I went wrong.
connect = Connector()
Session = sessionmaker(bind=connect.ConnectorMySql())
ses = Session()
query = u"""
SELECT
`reports`.ID As 'ID',
reports.Title AS 'ReportTitle',
`reports`.Text as 'ReporText',
`reports`.Status as 'Status',
`user`.ID AS 'ReporterID',
`user`.Name as 'ReporterName' ,
`user`.Username as 'ReporterUserName',
`user`.ImageProfile as 'ReporterAvatar',
`Clinet`.ID AS 'ClinetID',
`Clinet`.SiteUserName AS 'ClinetUserName',
`Clinet`.ImageProfile as 'ClinetAvatar'
FROM reports
JOIN Clinet on reports.ClinetID = `Clinet`.ID
JOIN users user on reports.UserID = `user`.ID
where
:pClinetID IS NULL OR reports.ClinetID=:pClinetID
AND
:pStatus IS NULL OR reports.Status=:pStatus;
"""
QueryResult=CustomModel()
QueryResult=ses.query(CustomModel).from_statement(query).all()
return QueryResult
To use a raw SQL string in SQLAlchemy, you have to wrap it with the text() function.
So you should use
QueryResult=ses.query(CustomModel).from_statement(text(query)).all()
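As an illustration of text() with bound parameters, the sketch below runs a cut-down version of the optional-filter pattern against an in-memory SQLite database. The reports table here is a hypothetical simplification of the one in the question, and the parameter names are made up for the example. Note the parentheses around each OR group: without them, AND binds more tightly than OR, so the original WHERE clause is evaluated as A OR (B AND C) OR D. SQLAlchemy 1.4 or later is assumed.

```python
from sqlalchemy import create_engine, text

engine = create_engine("sqlite://")  # in-memory database for the demo

# Each optional filter is parenthesised so AND/OR precedence is explicit.
FILTERED = text(
    """
    SELECT id, title, status
    FROM reports
    WHERE (:client_id IS NULL OR client_id = :client_id)
      AND (:status IS NULL OR status = :status)
    ORDER BY id
    """
)

with engine.begin() as conn:
    conn.execute(text(
        "CREATE TABLE reports ("
        "id INTEGER PRIMARY KEY, title TEXT, status TEXT, client_id INTEGER)"
    ))
    conn.execute(
        text(
            "INSERT INTO reports (title, status, client_id) "
            "VALUES (:title, :status, :client_id)"
        ),
        [
            {"title": "a", "status": "open", "client_id": 1},
            {"title": "b", "status": "closed", "client_id": 1},
            {"title": "c", "status": "open", "client_id": 2},
        ],
    )
    # Passing None disables a filter; values are bound, never interpolated.
    all_rows = conn.execute(
        FILTERED, {"client_id": None, "status": None}
    ).fetchall()
    open_for_1 = conn.execute(
        FILTERED, {"client_id": 1, "status": "open"}
    ).fetchall()

print(len(all_rows), [tuple(r) for r in open_for_1])
```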
I need to write a query that returns all objects less than or equal to a certain day of a certain month. The year is not important. It's easy enough to get an object by a particular day/month (assume now = datetime.datetime.now()):
posts = TodaysObject.objects.filter(publish_date__day=now.day, publish_date__month=now.month)
But I can't do this:
posts = TodaysObject.objects.filter(publish_date__day__lte=now.day, publish_date__month=now.month)
Seems that Django thinks I'm trying to do a join when combining multiple field lookups (publish_date__day__lte). What's the best way to do this in Django?
Try this:
Option 1:
from django.db.models import Q
datafilter = Q()
for i in range(1, now.day + 1):  # xrange on Python 2
    datafilter = datafilter | Q(publish_date__day=i)
datafilter = datafilter & Q(publish_date__month=now.month)
posts = TodaysObject.objects.filter(datafilter)
Option 2:
Perform raw sql query:
def query_dicts(query_string, *query_args):
    from django.db import connection
    cursor = connection.cursor()
    cursor.execute(query_string, query_args)
    col_names = [desc[0] for desc in cursor.description]
    while True:
        row = cursor.fetchone()
        if row is None:
            break
        # Pair each column name with its value (itertools.izip on Python 2).
        row_dict = dict(zip(col_names, row))
        yield row_dict
posts = query_dicts('SELECT * FROM tablename WHERE DAY(publish_date)<=%s AND MONTH(publish_date)=%s', now.day, now.month)
Option 3:
Using the extra() function:
posts = TodaysObject.objects.extra(where=['DAY(publish_date) <= %s AND MONTH(publish_date) = %s'], params=[now.day, now.month])
It's assumed that you are using MySQL. For PostgreSQL, you need to change DAY(publish_date) and MONTH(publish_date) to DATE_PART('DAY', publish_date) and DATE_PART('MONTH', publish_date) respectively.
It's not always portable from one database engine to another, but you may want to look into the extra() queryset method.
According to the Django docs, it allows you to inject raw SQL to construct more complex queries than the Django queryset API supports.
If your application needs to be portable to different database engines, you can try restructuring so that you have day, month, and year integer fields.
today = datetime.date.today()
post = TodaysObject.objects.raw("SELECT * FROM (app_name)_todaysobject WHERE DAY(publish_date) = %s AND MONTH(publish_date) = %s", [today.day, today.month])