SQLAlchemy - how to get raw SQL of `.count()` queries? - python

The simplest way to get the "raw" SQL for a query is to print it (that is, convert it to str).
However, this does not work for count() queries, because count() is a "firing" method: the docs describe it with "This results in an execution of the underlying query". Other "firing" methods include all(), first() and so on.
How can I get the SQL for such methods?
I'm especially interested in count() because it transforms the underlying query in some way (the transformation is described explicitly in the docs, but things may vary). Other methods, such as first(), can alter the resulting SQL as well.
So it is sometimes useful to get the raw SQL of such queries in order to investigate what happens under the hood.
I have read answers about "getting raw SQL", but this case is special because such methods don't return Query objects.
Note that I need the SQL of existing Query objects that have already been constructed in some way.

The following example builds a count query on top of any existing query object; you can then convert it to its string representation:
from sqlalchemy import func
...
existing_query = session.query(Something)\
    .join(OtherThing)\
    .filter(OtherThing.foo == 'FOO')\
    .subquery()
query = session.query(func.count(existing_query.c.bar).label('bar_count'))
print(query)  # Prints the SQL without executing it
actual_count = query.scalar()  # Executes the query
Notice that you have to specify a column from the query output to count. In the example this is existing_query.c.bar.

Related

SQL query builder with seed-query Parser

Is there an SQL query builder (in Python) which allows me to "parse" an initial SQL query, add certain operators and then get the resulting SQL text?
My use case is the following:
Start with a query like: "SELECT * from my_table"
I want to be able to do something like query_object = Query.parse("SELECT * from my_table") to get a query object I can manipulate, and then write something like query_object.where('column < 10').limit(10) or similar (columns and operators could also be part of the library, and existing WHERE clauses may also have to be considered).
Finally, I want to get the resulting query string with str(query_object), containing the final modified query.
Is this something that can be achieved with any of the ORMs? I don't need the database connections to specific DB engines or the object mappings (although having them is not a limitation).
I've seen pypika, which allows one to create an SQL query from code, but it doesn't allow one to parse an existing query and continue from there.
I've also seen sqlparse, which allows me to parse an SQL query into tokens. But because it does not create a tree, it is non-trivial to add elements to an existing statement. (It is close to what I am looking for, if only it created an actual tree.)

Add own literal string as additional field in result of query - SQLAlchemy

How do I write the SQLAlchemy ORM analogue of the following raw SQL?
SELECT "http://example.com/page/"||table.pagename as pageUrl
I need to get a value from the table, modify it using the ORM/Python (here just a string concatenation), and output it in the result of the SQLAlchemy query as an additional field.
The SQLAlchemy string types have operator overloads that allow you to treat them like you'd treat Python strings in this case (string concatenation), but produce SQL expressions:
session.query(
    Table,
    ("http://example.com/page/" + Table.pagename).label("pageUrl"))
You can read more about SQLAlchemy's operator paradigm here: http://docs.sqlalchemy.org/en/latest/core/tutorial.html#operators
This can also be done via select, although that makes almost no use of the ORM:
from sqlalchemy.sql import select, text
# The URL prefix is a single-quoted SQL string literal; double quotes would denote an identifier in standard SQL
q = select([text("'http://example.com/page/' || pagename AS pageUrl")]).select_from(Table)
session.execute(q).fetchall()
The result will be a list of RowProxy objects.
To me, the solution via session.query (the answer above) seems more convenient. It is short, and it returns result objects that can easily be converted to a dict.

Django Raw Query with params on Table Column (SQL Injection)

I have a somewhat unusual scenario: in addition to my SQL parameters, I need to let the user / API define the table column name too. My problem is that passing the column name through params results in the query:
SELECT device_id, time, 's0' ...
instead of
SELECT device_id, time, s0 ...
Is there another way to do that through raw or would I need to escape the column by myself?
queryset = Measurement.objects.raw(
'''
SELECT device_id, time, %(sensor)s FROM measurements
WHERE device_id=%(device_id)s AND time >= to_timestamp(%(start)s) AND time <= to_timestamp(%(end)s)
ORDER BY time ASC;
''', {'device_id': device_id, 'sensor': sensor, 'start': start, 'end': end})
As with any potential for SQL injection, be careful.
But essentially this is a fairly common problem with a fairly safe solution. The problem, in general, is that query parameters are "the right way" to handle query values, but they're not designed for schema elements.
To dynamically include schema elements in your query, you generally have to resort to string concatenation. Which is exactly the thing we're all told not to do with SQL queries.
But the good news here is that you don't have to use the actual user input. This is because, while possible query values are infinite, the superset of possible valid schema elements is quite finite. So you can validate the user's input against that superset.
For example, consider the following process:
User inputs a value as a column name.
Code compares that value (raw string comparison) against a list of known possible values. (This list can be hard-coded, or can be dynamically fetched from the database schema.)
If no match is found, return an error.
If a match is found, use the matched known value directly in the SQL query.
So all you're ever using are the very strings you, as the programmer, put in the code. Which is the same as writing the SQL yourself anyway.
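The validation steps above can be sketched roughly as follows. This is only an illustration: the set ALLOWED_SENSOR_COLUMNS and the helper build_measurement_query are hypothetical names, and the column names s0, s1, s2 are assumed for the example. The values (device_id, start, end) still go through normal query parameters; only the column name is placed into the SQL text, and only after it matched the whitelist.

```python
# Hypothetical whitelist of sensor columns; in a real project this could be
# hard-coded or read from the database schema.
ALLOWED_SENSOR_COLUMNS = {"s0", "s1", "s2"}


def build_measurement_query(sensor):
    """Return SQL text that selects the requested sensor column.

    Raises ValueError when the name is not a known column.
    """
    if sensor not in ALLOWED_SENSOR_COLUMNS:
        raise ValueError("unknown sensor column: %r" % (sensor,))
    # Only a string that matched the whitelist reaches the SQL text. The
    # values are left as named placeholders for the database driver.
    return (
        "SELECT device_id, time, {column} FROM measurements "
        "WHERE device_id = %(device_id)s "
        "AND time >= to_timestamp(%(start)s) "
        "AND time <= to_timestamp(%(end)s) "
        "ORDER BY time ASC"
    ).format(column=sensor)
```

The returned string could then be passed to raw() together with the dictionary of the remaining parameters, as in the question.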
It doesn't look like you need raw() for the example query you posted. I think the following queryset is very similar.
measurements = Measurement.objects.filter(
    device_id=device_id,
    time__gte=start,  # assumes start and end are datetimes, not epoch seconds
    time__lte=end,
).order_by('time')
for measurement in measurements:
    print(getattr(measurement, sensor))
If you need to optimise and avoid loading other fields, you can use values() or only().

Peewee execute_sql with escaped characters

I have written a query which uses some string replacements. I am trying to update a URL in a table, but the URL contains % signs, which causes a "tuple index out of range" exception.
If I print the query and run it manually it works fine, but running it through peewee causes an issue. How can I get around this? I'm guessing this is because of the percent signs?
query = """
update table
set url = '%s'
where id = 1
""" % 'www.example.com?colour=Black%26white'
db.execute_sql(query)
The code you shared is very unsafe, and the cause is probably the same thing that triggers your bug: the value is interpolated directly into the SQL text. Please do not use it in production, because it is open to SQL injection.
Generally, you practically never want to use normal string operations like %, + or .format() to construct a SQL query. Rather, you should use the built-in mechanism of your SQL API/ORM for providing dynamic values to a query. In your case of SQLite in peewee, that looks like this:
query = """
update table
set url = ?
where id = 1
"""
values = ('www.example.com?colour=Black%26white',)
db.execute_sql(query, values)
The database engine will automatically take care of any special characters in your data, so you don't need to worry about them. If you ever find yourself encountering issues with special characters in your data, it is a very strong warning sign that some kind of security issue exists.
This is mentioned in the Security and SQL Injection section of peewee's docs.
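To see the same idea without peewee, here is a minimal sketch using Python's standard sqlite3 module, which also uses ? placeholders. The table name pages and the in-memory database are assumptions made for the example, not part of the original question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE pages (id INTEGER PRIMARY KEY, url TEXT)")
conn.execute("INSERT INTO pages (id, url) VALUES (1, 'old')")

new_url = "www.example.com?colour=Black%26white"
# The value is passed separately from the SQL text, so the percent sign and
# any quotes in it are never interpreted as SQL or as format specifiers.
conn.execute("UPDATE pages SET url = ? WHERE id = ?", (new_url, 1))

stored_url = conn.execute("SELECT url FROM pages WHERE id = 1").fetchone()[0]
print(stored_url)
```

The stored value comes back unchanged, percent sign included, because it never passes through Python string formatting.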
Why write the SQL by hand at all? Peewee supports updates directly:
Table.update(url=new_url).where(Table.id == some_id).execute()

sqlalchemy using custom methods in filter

I'm having a problem with this sqlalchemy query:
def bvalue(value):
    if isinstance(value, unicode):
        value = re.sub(r'[^\w]', "", value).lower()
    return value
basicValue = bvalue(someVariable)
q = self.session.query(sheet.id).\
    filter(bvalue(sheet.column) == basicValue)
The bvalue function works. I'm trying to match values after stripping them of any special characters and capitalisation. The stripped variable does match the stripped db value, but the query still does not retrieve any results.
What am I doing wrong? Can't you use custom methods in filters?
You are aware that SQLAlchemy translates your queries into plain SQL statements that are then emitted to your configured database?
So naturally you can't simply add arbitrary python functions, since they would have to be translated into SQL which can't be done in a generic way.
Aside from this general issue, bvalue(sheet.column) will simply return sheet.column (since it's not a unicode instance) and it is evaluated before creating the query. So your query is in fact equivalent to:
q = self.session.query(sheet.id).\
    filter(sheet.column == basicValue)
How to get the regex into SQL depends on the database you're using. See e.g. "REGEXP_LIKE in SQLAlchemy" for some suggestions.
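As one illustration of moving the normalisation into the database, here is a minimal sketch that assumes SQLite and uses only Python's standard sqlite3 module (Python 3, so str replaces unicode). SQLite lets a connection register a Python function under an SQL name via create_function; other databases need their own regex or string functions instead. The table layout and sample rows are made up for the example.

```python
import re
import sqlite3


def bvalue(value):
    """Strip non-word characters and lowercase; pass non-strings through."""
    if isinstance(value, str):
        value = re.sub(r"[^\w]", "", value).lower()
    return value


conn = sqlite3.connect(":memory:")
# Register the Python function so SQL statements on this connection can call it.
conn.create_function("bvalue", 1, bvalue)
conn.execute("CREATE TABLE sheet (id INTEGER PRIMARY KEY, col TEXT)")
conn.executemany(
    "INSERT INTO sheet (id, col) VALUES (?, ?)",
    [(1, "Foo-Bar!"), (2, "something else")],
)

# bvalue(col) now runs inside the database for every row, while the
# comparison value is normalised once in Python and passed as a parameter.
rows = conn.execute(
    "SELECT id FROM sheet WHERE bvalue(col) = ?", (bvalue("foo bar"),)
).fetchall()
print(rows)
```

Here the comparison happens per row inside the database, which is what the original filter was expected to do. In SQLAlchemy, a function known to the database can be referenced by name through the func construct (for example func.bvalue(sheet.column)), provided it has been registered on the underlying connection.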
