SQLAlchemy - override orm.Query.count for a database without subselect - python

I am using sqlalchemy with a database that doesn't support subselects. What that means is that something like this wouldn't work (where Calendar is a model inheriting a declarative base):
Calendar.query.filter_by(uuid=uuid).count()
I am trying to override the count method with something like this:
def count(self):
    col = func.count(literal_column("'uuid'"))
    return self.from_self(col).scalar()
However, the from_self bit still does the subselect. I can't do something like this:
session.query(sql.func.count(Calendar.uuid)).scalar()
Because I want all the filter information from the Query. Is there a way I can get the filter arguments for the current Query without doing the subselect?
Thanks~

From the SQLAlchemy documentation:
For fine grained control over specific columns to count, to skip the usage of a subquery or otherwise control of the FROM clause, or to use other aggregate functions, use func expressions in conjunction with query(), i.e.:
from sqlalchemy import func
# count User records, without
# using a subquery.
session.query(func.count(User.id))
# return count of user "id" grouped
# by "name"
session.query(func.count(User.id)).\
    group_by(User.name)
from sqlalchemy import distinct
# count distinct "name" values
session.query(func.count(distinct(User.name)))
Source: SQLAlchemy (sqlalchemy.orm.query.Query.count)
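To keep the filters of an existing Query while still avoiding the subselect, one option is Query.with_entities(), which swaps the SELECT list and leaves the WHERE clause alone. The following is a minimal sketch, not the asker's actual setup: the Calendar model, the in-memory SQLite engine and the helper name count_without_subquery are assumptions made for illustration.

```python
from sqlalchemy import Column, Integer, String, create_engine, func
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()


class Calendar(Base):
    # Hypothetical stand-in for the model in the question.
    __tablename__ = "calendar"
    id = Column(Integer, primary_key=True)
    uuid = Column(String)


def count_without_subquery(query, column):
    # Replace the selected entities with COUNT(column); existing filters are kept.
    # order_by(None) drops any ORDER BY, which is pointless for a single count.
    return query.with_entities(func.count(column)).order_by(None)


engine = create_engine("sqlite://")  # assumption: in-memory SQLite for the demo
Base.metadata.create_all(engine)

session = Session(engine)
session.add_all([Calendar(uuid="a"), Calendar(uuid="a"), Calendar(uuid="b")])
session.commit()

filtered = session.query(Calendar).filter_by(uuid="a")
count_query = count_without_subquery(filtered, Calendar.uuid)
total = count_query.scalar()
count_sql = str(count_query)
print(total)
print(count_sql)
```

The same helper body could live in a count() override on a custom Query subclass; whether that is preferable to calling it explicitly depends on how much code relies on the stock count() behaviour.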

Related

Registering star SELECT (*) #compiles for mapped class entities in a sqlalchemy ORM query

I have a requirement to replace the individual column select that SQLAlchemy automatically generates in its queries with a general star select (SELECT table.* FROM...) whenever the whole table or mapped class is passed to Session.query(). It shouldn't change how InstrumentedAttribute objects are compiled, though.
So for example, for the ORM classes Table1 and Table2:
str(s.query(Table1, Table2.col1, Table2.col2).select_from(Table1).join(Table2))
Should display as:
SELECT table1.*, table2.col1, table2.col2
FROM table1
JOIN table2 ON table1.some_col = table2.some_other_col
I've read the SQLAlchemy documentation on the @compiles decorator factory and I've tried to use it on the following expression constructs with a simple breakpoint:
sqlalchemy.sql.expression.ColumnCollection
sqlalchemy.sql.expression.ClauseList
sqlalchemy.sql.expression.ColumnElement
sqlalchemy.sql.expression.TableClause
so for example:
@compiles(sqlalchemy.sql.expression.ColumnCollection)
def star_select(element, compiler, **kwargs):
    breakpoint()
But no matter which of these I pass to @compiles, the flow of execution never actually goes into the function I've decorated with @compiles. And I can't find any others that would seem to do what I want.
Does anyone have any idea which SQLAlchemy class I'd have to pass to @compiles to override just the column select list portion of an ORM select query?
I feel like I'm going crazy with this.

How to add a SQL function to sqlalchemy

I'm working with Oracle SQL and I want to use some of Oracle's functions that don't exist in other relational databases.
Basically I want to add a function that returns the weekday for a given date.
From what I understand, SQLAlchemy gives me two ways to do that: one is to provide the SQL query as text, the other is to extend SQLAlchemy by implementing a new Python function that represents the SQL function. I'm leaning towards implementing the function because I expect to use it in a few queries.
Here is what I have implemented so far. I'm not really sure what my next step is, or if this is even correct.
from sqlalchemy.sql.expression import FunctionElement
from sqlalchemy.ext.compiler import compiles
class weekday(FunctionElement):
    name = 'weekday'
@compiles(weekday)
def compile_weekday(element, compiler, **kw):
    # element.clauses holds the arguments passed to weekday(...)
    if len(element.clauses) == 1:
        return "TO_CHAR(%s,'DY')" % compiler.process(element.clauses, **kw)
    elif len(element.clauses) == 0:
        raise TypeError("Weekday needs a date as parameter")
    else:
        raise TypeError("Weekday needs just one parameter")
When I tried to add this function to one of my objects, instead of the calculated result I got the function itself back. Here is an example of what I'm talking about:
from sqlalchemy import Column, Date
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.ext.hybrid import hybrid_property
Base = declarative_base()
class SomeObject(Base):
    __tablename__ = 'table1'
    asof = Column(Date, primary_key=True)
    @hybrid_property
    def weekday(self):
        return weekday(self.asof)
In shell I tried:
from datetime import datetime
my_object = SomeObject()
my_object.asof = datetime(2018,1,1)
session.add(my_object)
session.commit()
result = session.query(SomeObject).filter(SomeObject.asof == datetime(2018,1,1)).first()
result.weekday # returns `<orm.weekday as 0x1724b7deeb8; weekday>`
NOTE
I insist on extracting the weekday in the SQL query rather than in Python because I need it to filter out some records, and in my case that function will determine whether SQLAlchemy pulls out a couple of million records or just a couple.
After trying out a few things I realized that the instance-level part of a hybrid_property is not supposed to return a SQL expression; it needs to return the actual value that the SQL expression would have returned.
That being said, my SQL function goes into the 'expression' part of the hybrid_property, which looks like this:
@weekday.expression
def weekday(cls):
    return weekday(cls.asof)
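Putting both halves together, a rough sketch of the complete hybrid might look like the following. The names weekday_sql and compile_weekday are hypothetical (renamed here to avoid clashing with the property name), the Python-side value uses strftime('%a') as an assumed stand-in for what Oracle's 'DY' format returns, and the query is only compiled to a string, not run against Oracle.

```python
from datetime import date

from sqlalchemy import Column, Date, String
from sqlalchemy.ext.compiler import compiles
from sqlalchemy.ext.hybrid import hybrid_property
from sqlalchemy.orm import Session, declarative_base
from sqlalchemy.sql.expression import FunctionElement


class weekday_sql(FunctionElement):
    # Hypothetical name for the custom SQL function element.
    name = "weekday"
    type = String()
    inherit_cache = True


@compiles(weekday_sql)
def compile_weekday(element, compiler, **kw):
    # Render the single argument inside Oracle's TO_CHAR(..., 'DY').
    return "TO_CHAR(%s, 'DY')" % compiler.process(element.clauses, **kw)


Base = declarative_base()


class SomeObject(Base):
    __tablename__ = "table1"
    asof = Column(Date, primary_key=True)

    @hybrid_property
    def weekday(self):
        # Instance level: compute a plain Python value (assumed equivalent of 'DY').
        return self.asof.strftime("%a").upper()

    @weekday.expression
    def weekday(cls):
        # Class level: return the SQL expression used in filters.
        return weekday_sql(cls.asof)


obj = SomeObject(asof=date(2018, 1, 1))
python_value = obj.weekday

# Build a filtered query; compiling it to a string needs no database connection.
query = Session().query(SomeObject).filter(SomeObject.weekday == "MON")
query_sql = str(query)
print(python_value)
print(query_sql)
```

Note that the exact text Oracle returns for 'DY' depends on the session's language settings, so the Python-side branch may need adjusting to match.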
Old question, but you can also use SQLAlchemy's SQL and Generic Functions:
https://docs.sqlalchemy.org/en/13/core/functions.html
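As a small illustration of that approach: func turns any attribute name into a SQL function call, so TO_CHAR can be reached without defining a custom element. The bare column named asof below is an assumption standing in for the mapped column.

```python
from sqlalchemy import column, func

# func.<name>(...) renders <name>(...) in SQL; the string 'DY' becomes a bound parameter.
expr = func.to_char(column("asof"), "DY")
rendered = str(expr)
print(rendered)

# With a mapped class, the same expression could be used in a filter, e.g.
# session.query(SomeObject).filter(func.to_char(SomeObject.asof, "DY") == "MON")
```

This trades the reusable, named weekday construct for less code, which may be enough when the function is used in only a few queries.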

How to make/use a custom database function in Django

Prologue:
This is a question arising often in SO:
Equivalent of PostGIS ST_MakeValid in Django GEOS
Geodjango: How to Buffer From Point
Get random point from django PolygonField
Django custom for complex Func (sql function)
and can be applied to the above as well as in the following:
Django F expression on datetime objects
I wanted to compose an example on SO Documentation but since it got shut down on August 8, 2017, I will follow the suggestion of this widely upvoted and discussed meta answer and write my example as a self-answered post.
Of course, I would be more than happy to see any different approach as well!!
Question:
Django/GeoDjango has some database functions like Lower() or MakeValid() which can be used like this:
Author.objects.create(name='Margaret Smith')
author = Author.objects.annotate(name_lower=Lower('name')).get()
print(author.name_lower)
Is there any way to use and/or create my own custom database function based on existing database functions like:
Position() (MySQL)
TRIM() (SQLite)
ST_MakePoint() (PostgreSQL with PostGIS)
How can I apply/use those functions in Django/GeoDjango ORM?
Django provides the Func() expression to facilitate the calling of database functions in a queryset:
Func() expressions are the base type of all expressions that involve database functions like COALESCE and LOWER, or aggregates like SUM.
There are 2 options on how to use a database function in Django/GeoDjango ORM:
For convenience, let us assume that the model is named MyModel and that the substring is stored in a variable named subst:
from django.db import models
from django.contrib.gis.db import models as gis_models
class MyModel(models.Model):
    name = models.CharField(max_length=255)
    the_geom = gis_models.PolygonField()
1. Use Func() to call the function directly:
We will also need the following to make our queries work:
Annotation to add a field to each entry in our database.
F() which allows the execution of arithmetic operations on and between model fields.
Value() which will sanitize any given value (why is this important?)
The query:
from django.db.models import F, Func, Value
MyModel.objects.annotate(
    pos=Func(F('name'), Value(subst), function='POSITION')
)
2. Create your own database function extending Func:
We can extend the Func class to create our own database functions:
class Position(Func):
    function = 'POSITION'
and use it in a query:
MyModel.objects.annotate(pos=Position('name', Value(subst)))
GeoDjango Appendix:
In GeoDjango, in order to use a GIS-related function (like PostGIS's Transform function), the Func() class must be replaced by GeoFunc(), but it is essentially used under the same principles:
from django.contrib.gis.db.models.functions import GeoFunc
class Transform(GeoFunc):
    function = 'ST_Transform'
There are more complex cases of GeoFunc usage and an interesting use case has emerged here: How to calculate Frechet Distance in Django?
Generalize custom database function Appendix:
In case that you want to create a custom database function (Option 2) and you want to be able to use it with any database without knowing it beforehand, you can use Func's as_<database-name> method, provided that the function you want to use exists in every database:
class Position(Func):
    function = 'POSITION'  # MySQL method
    def as_sqlite(self, compiler, connection):
        # SQLite method
        return self.as_sql(compiler, connection, function='INSTR')
    def as_postgresql(self, compiler, connection):
        # PostgreSQL method
        return self.as_sql(compiler, connection, function='STRPOS')

How to output logarithm of some calculated value using Django, MySQL

I want to create the following query in Django:
select field1, count(field1), log(count(field1)) from object_table
where parent_id = 12345
group by field1;
I've implemented field1, count(field1) and group by field1 as follows:
from django.db.models import Count
Object.objects.filter(
    parent=12345
).values_list(
    'field1'
).annotate(
    count=Count('field1')
)
However if I add something like this
.extra(
    select={'_log': 'log(count)'}
)
it doesn't affect my results. Could you give me a clue as to what I am doing wrong? How can I implement log(count(field)) within Django?
PS, I'm using Django 1.9.
Thanks in advance!
Note that some databases don't natively support a logarithm function (e.g. SQLite). This is probably an operation that should be done in your Python code instead of the database query.
import math
for obj in object_list:
    # use math.log() for natural logarithm
    obj._log = math.log10(obj.count)
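Since the queryset in the question uses values_list(), its rows are tuples rather than objects, so the Python-side computation could also be sketched as below. The sample rows are made up for illustration and stand in for the (field1, count) pairs the query would return.

```python
import math

# Hypothetical (field1, count) rows, shaped like the values_list() + annotate() result.
rows = [("a", 1), ("b", 10), ("c", 1000)]

# Append the base-10 logarithm of each count; counts from GROUP BY are always >= 1.
with_logs = [(field1, count, math.log10(count)) for field1, count in rows]

for field1, count, log_value in with_logs:
    print(field1, count, log_value)
```

Which base to use depends on the SQL being replaced: MySQL's single-argument LOG() is the natural logarithm (math.log in Python), while PostgreSQL's log() is base 10.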
If you are certain you can rely on a database function and you want to use the database to perform the computation, you can use raw queries. For example, PostgreSQL has the log function implemented:
# raw() requires the primary key in the result; min(id) stands in for it per group (assumes the pk column is id)
query = """\
select min(id) as id, count(field1), log(count(field1)) as logvalue
from myapp_mymodel
group by field1"""
queryset = MyModel.objects.raw(query)
for obj in queryset:
    print(obj.logvalue)

Filter on/Order by Postgres range type in SQLAlchemy

SQLAlchemy supports Postgres range types, as described here. It uses the postgresql+psycopg2 dialect for Postgres communication. These testcases give usage examples for the range types in SQLAlchemy.
How can I filter by, or order by, one component (lower or upper) of such a range field in SQLAlchemy?
Using the example from the first link
from datetime import datetime
from psycopg2.extras import DateTimeRange
from sqlalchemy import Column, Integer
from sqlalchemy.dialects.postgresql import TSRANGE
class RoomBooking(Base):
    __tablename__ = 'room_booking'
    room = Column(Integer(), primary_key=True)
    during = Column(TSRANGE())
booking = RoomBooking(
    room=101,
    during=DateTimeRange(datetime(2013, 3, 23), None)
)
I would, e.g., like to filter on bookings with a during that begins on a given datetime or order the bookings by the start of the datetime.
As such I'm looking to generate roughly this SQL:
SELECT room, during
FROM room_booking
WHERE lower(during) = foo
ORDER BY upper(during)
I have tried constructs like
RoomBooking.query.filter(RoomBooking.during.lower == foo).order_by(RoomBooking.during.upper)
but recognize that this is likely not working because lower is an attribute on the python object and not associated with the underlying table column.
One possible solution to this might be finding a way to use the upper()/lower() range functions from SQLAlchemy.
One way to do this is to use the already existing func.lower()/func.upper() functions in SQLAlchemy:
from sqlalchemy import func
RoomBooking.query.filter(func.lower(RoomBooking.during) == foo).order_by(func.upper(RoomBooking.during))
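These functions are usually seen changing the case of text, but func renders whatever name it is given as a SQL function call, so on a range column lower() and upper() reach the Postgres range functions. Other Postgres functions that SQLAlchemy does not wrap explicitly can be called in the same manner.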
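To check what this produces, the query can be compiled against the PostgreSQL dialect without connecting to a database. This is a sketch under assumptions: the model is redeclared with a fresh declarative base, a session-bound query is used instead of RoomBooking.query, and the start datetime is an arbitrary example value.

```python
from datetime import datetime

from sqlalchemy import Column, Integer, func
from sqlalchemy.dialects import postgresql
from sqlalchemy.dialects.postgresql import TSRANGE
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()


class RoomBooking(Base):
    __tablename__ = "room_booking"
    room = Column(Integer, primary_key=True)
    during = Column(TSRANGE)


start = datetime(2013, 3, 23)  # example value standing in for "foo"

query = (
    Session()  # unbound session: enough to build and compile the query
    .query(RoomBooking)
    .filter(func.lower(RoomBooking.during) == start)
    .order_by(func.upper(RoomBooking.during))
)

# Compile for PostgreSQL only; nothing is executed.
sql = str(query.statement.compile(dialect=postgresql.dialect()))
print(sql)
```

The printed statement should contain lower(room_booking.during) in the WHERE clause and upper(room_booking.during) in the ORDER BY, matching the SQL the question was aiming for.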
