I have a CTE inside a table function that requires a parameter to be passed into it. The data I need is then queried with something like
SELECT * FROM myThingFunction('e543149c-6589-49c6-b962-bf2503c0e278')
What I would like to do, if possible, is map a SQLAlchemy model so that I can apply filters, limits, etc. to the record set returned, for example
qry = session.query(Thing).limit(100)
What I am struggling with is how to handle the parameter. I know that I am treating the function like a table, which feels a bit wrong since the function is more of a composite set of relations than a mapping to a single type of domain object, but I need to get this data into Python somehow.
Have you seen the recipe for mapping arbitrary selects? You can write a factory method which returns a class representing the query for a given parameter:
def myThingFunction(param):
    # build the select for the given parameter; the mapped selectable
    # must expose a primary key column
    tmpSelect = select(...)
    class tmpCls(Base):
        __table__ = tmpSelect
    return tmpCls
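For illustration, a hedged sketch of using such a factory (the name attribute here is an assumption about what the inner select exposes):

Thing = myThingFunction('e543149c-6589-49c6-b962-bf2503c0e278')
# the returned class behaves like any mapped class, so filters and
# limits apply as usual
qry = session.query(Thing).filter(Thing.name == 'foo').limit(100)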
But there is a note below this recipe which states that creating a mapping is unnecessary. I haven't tried it, but in principle,
session.query(func.myThingFunction("bar")).all()
might work, too. (func.foo creates a GenericFunction on the fly, see the documentation for FunctionElement and below.)
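If you want SQLAlchemy to know the return type up front, you can register the function explicitly instead of relying on the on-the-fly version. A minimal sketch, assuming a string-typed return value (the type is illustrative):

from sqlalchemy import func
from sqlalchemy.sql.functions import GenericFunction
from sqlalchemy.types import String

class myThingFunction(GenericFunction):
    name = 'myThingFunction'  # SQL name emitted by func.myThingFunction(...)
    type = String()           # assumed return type, for illustration

result = session.query(func.myThingFunction("bar")).all()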
Prologue:
This is a question that arises often on SO:
Equivalent of PostGIS ST_MakeValid in Django GEOS
Geodjango: How to Buffer From Point
Get random point from django PolygonField
Django custom for complex Func (sql function)
and can be applied to the above as well as to the following:
Django F expression on datetime objects
I wanted to compose an example on SO Documentation, but since it was shut down on August 8, 2017, I will follow the suggestion of this widely upvoted and discussed meta answer and write my example as a self-answered post.
Of course, I would be more than happy to see any different approaches as well!
Question:
Django/GeoDjango has some database functions like Lower() or MakeValid() which can be used like this:
Author.objects.create(name='Margaret Smith')
author = Author.objects.annotate(name_lower=Lower('name')).get()
print(author.name_lower)
Is there any way to use and/or create my own custom database function based on existing database functions like:
Position() (MySQL)
TRIM() (SQLite)
ST_MakePoint() (PostgreSQL with PostGIS)
How can I apply/use those functions in Django/GeoDjango ORM?
Django provides the Func() expression to facilitate the calling of database functions in a queryset:
Func() expressions are the base type of all expressions that involve database functions like COALESCE and LOWER, or aggregates like SUM.
There are two options for using a database function in the Django/GeoDjango ORM:
For convenience, let us assume that the model is named MyModel and that the substring is stored in a variable named subst:
from django.db import models
from django.contrib.gis.db import models as gis_models

class MyModel(models.Model):
    name = models.CharField(max_length=100)
    the_geom = gis_models.PolygonField()
Use Func() to call the function directly:
We will also need the following to make our queries work:
Aggregation to add a field to each entry in our database.
F() which allows the execution of arithmetic operations on and between model fields.
Value() which will sanitize any given value (why is this important?)
The query:
MyModel.objects.aggregate(
    pos=Func(F('name'), Value(subst), function='POSITION')
)
Create your own database function extending Func:
We can extend Func class to create our own database functions:
class Position(Func):
    function = 'POSITION'
and use it in a query:
MyModel.objects.aggregate(pos=Position('name', Value(subst)))
GeoDjango Appendix:
In GeoDjango, in order to use a GIS-related function (like PostGIS's Transform function), Func() must be replaced by GeoFunc(), but it is used under essentially the same principles:
class Transform(GeoFunc):
    function = 'ST_Transform'
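A hedged usage sketch with the example model from above (the target SRID is an illustrative assumption; note that Django also ships its own Transform in django.contrib.gis.db.models.functions):

# annotate each row with its geometry reprojected to SRID 4326
reprojected = MyModel.objects.annotate(
    geom_4326=Transform('the_geom', 4326)
)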
There are more complex cases of GeoFunc usage and an interesting use case has emerged here: How to calculate Frechet Distance in Django?
Generalize custom database function Appendix:
If you want to create a custom database function (Option 2) and be able to use it with any database without knowing the backend beforehand, you can use Func's as_<database-name> methods, provided that an equivalent function exists in every database:
class Position(Func):
    function = 'POSITION'  # MySQL

    def as_sqlite(self, compiler, connection):
        # SQLite equivalent
        return self.as_sql(compiler, connection, function='INSTR')

    def as_postgresql(self, compiler, connection):
        # PostgreSQL equivalent
        return self.as_sql(compiler, connection, function='STRPOS')
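With that in place, the call site from Option 2 should stay unchanged; a hedged sketch:

# the same query as before; Django compiles it to POSITION, INSTR,
# or STRPOS depending on the active database connection
MyModel.objects.aggregate(pos=Position('name', Value(subst)))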
I have to use a SQLAlchemy Core expression to fetch objects because the ORM can't do "update and returning" (update() in the ORM doesn't have returning()).
from sqlalchemy import update
class User(ORMBase):
    ...

# Pure SQL expression; the object returned is not an ORM object,
# it is a RowProxy.
object = db_session.execute(
    update(User)
    .values({'name': 'Wayne'})
    .where(User.id == subquery.as_scalar())
    .returning(*User.__table__.c)  # returning() requires the columns to return
).fetchone()
When
db_session.add(object)
it reports UnmappedInstanceError: Class 'sqlalchemy.engine.result.RowProxy' is not mapped.
How do I put that RowProxy object from the SQL expression into the ORM's identity map?
I'm not sure there is a straightforward way to do what you're describing, which is essentially to build an ORM object that maps directly to a database entry without performing the query through the ORM.
My intuition is that the naive approach (just initializing the ORM object with the values from the database) would create another row with the same values (or fail because of uniqueness constraints).
The more standard way to do what you are asking would be to query the row through the ORM first and then update the database from that ORM object.
user = User.query.filter(User.user_attribute == 'foo').one()
user.some_value = 'bar'
session.add(user)
session.commit()
I'm not sure if you have some constraint on your end that prevents you from using that pattern, though. The documentation works through similar examples.
Simple case:
Possible quick solution: construct the object from the kwargs of your RowProxy, since those are dict-like.
Given:
rowproxy = session.execute(
    update(User)
    .values({'name': 'Wayne'})
    .where(User.id == subquery.as_scalar())
    .returning(*User.__table__.c)
).fetchone()
We might be able to do:
user = User(**dict(rowproxy.items()))
rowproxy.items() returns tuples of key-value pairs; dict(...) converts them into an actual dict; and User(...) takes kwargs for the model attribute names.
More difficult case:
But what if you have a model where one of the attribute names isn't quite the same as the SQL table column name? E.g. something like:
class User(ORMBase):
    # etc...
    user_id = Column('id', Integer)  # table column "id", model attribute "user_id"
When we try to unpack our rowproxy into the User class, we'll likely get an error along the lines of: TypeError: 'id' is an invalid keyword argument for User (because it's expecting user_id instead).
Now it gets dirty: we have a mapper lying around that tells us how to get from the table attributes to the model attributes and vice versa:
kw_map = {a.key: a.class_attribute.name for a in User.__mapper__.attrs}
Here, a.key is the model attribute (and kwarg), and a.class_attribute.name is the table attribute. This gives us something like:
{
    "user_id": "id"
}
Well, we actually want to provide the values we got back from our rowproxy, which besides object-like access also allows dict-like access:
kwargs = {a.key: rowproxy[a.class_attribute.name] for a in User.__mapper__.attrs}
And now we can do:
user = User(**kwargs)
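Putting the pieces together, a hedged helper sketch built from the same steps (same assumptions as above):

def model_from_row(model_cls, row):
    # build a model instance from a RowProxy, translating table column
    # names to model attribute names via the class's mapper
    kwargs = {a.key: row[a.class_attribute.name]
              for a in model_cls.__mapper__.attrs}
    return model_cls(**kwargs)

user = model_from_row(User, rowproxy)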
Errata:
You may want to session.commit() right after calling update().returning() to avoid a long delay between making your changes and having them permanently stored in the database. There is no need to session.add(user) later; you already updated() and just need to commit() that transaction.
object is a built-in name in Python, so try not to shadow it; you could get some very bizarre behavior doing that. That's why I renamed it to rowproxy.
I have a model in Flask-Admin with filter (e.g. based on Foreign Key to other model).
I want to generate links from the front-end to this model view in the admin with a filter value applied. I noticed that it adds ?flt0_0= to the URL, so the whole address looks something like:
http://.../admin/model_view_<my model>/?flt0_0=<filter value>
What is the best way to generate routes like this?
I prefer setting named_filter_urls=True on my base view to get rid of these magic numbers (though you can just set it on any specific view as well):
class MyBaseView(BaseModelView):
    ...
    named_filter_urls = True

class MyView(MyBaseView):
    ...
    column_filters = ['name', 'country']
This creates URLs like: http://.../admin/model/?flt_name_equals=foo&flt_country_contains=bar (*)
With this, your URLs can easily be constructed using the name of the attribute you want to filter on. As a bonus, you don't need to have a view instance available - important if you want to link to a view for a different model.
*(When selecting filters from UI, Flask-Admin will insert integers into the parameter keys. I'm not sure why it does that, but they don't appear necessary for simple filtering.)
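For example, a hedged sketch of building such a link from Flask code (assuming the admin view was registered under the endpoint name mymodel):

from flask import url_for

# index_view is the list view Flask-Admin exposes for a ModelView;
# extra keyword arguments become query-string parameters
link = url_for('mymodel.index_view', flt_name_equals='foo')
# -> /admin/mymodel/?flt_name_equals=foo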
Flask-Admin defaults to the flt0_0=<value> syntax to be "robust across translations" if your app needs to support multiple languages. If you don't need to worry about translations, setting named_filter_urls=True is the way to go.
With named_filter_urls=True Flask-Admin generates filter query parameters like:
flt0_country_contains=<value>
The remaining integer after flt (0 in this case) is a sort key used to control the order of the filters as they appear in the UI when you have multiple filters defined. This number does not matter at all if you have a single filter.
For example, in my app I have named filters turned on. If I have multiple filters without the sort key the filters are displayed in the order they appear in the query string:
?flt_balance_smaller_than=100&flt_balance_greater_than=5
Yields: [screenshot: default filter ordering]
With a sort key added to the flt parameters, I can force those filters to be displayed in a different order (flt1 will come before flt2):
?flt2_balance_smaller_than=100&flt1_balance_greater_than=5
Yields: [screenshot: forced filter ordering]
In practice it looks like this sort key can be any single character, e.g. this works too:
?fltB_balance_smaller_than=100&fltA_balance_greater_than=5
This behavior is ultimately defined in the Flask-Admin BaseModelView._get_list_filter_args() method here:
https://github.com/flask-admin/flask-admin/blob/master/flask_admin/model/base.py#L1714-L1739
Unfortunately, there's no public API for this yet. Here's a short snippet you can use for now to generate fltX_Y query string:
class MyView(BaseModelView):
    ...
    def get_filter_arg(self, filter_name, filter_op='equals'):
        filters = self._filter_groups[filter_name].filters
        # list() is needed on Python 3, where .keys() is a view
        position = list(self._filter_groups.keys()).index(filter_name)
        for f in filters:
            if f['operation'] == filter_op:
                return 'flt%d_%d' % (position, f['index'])
Then you can call this method on your view instance:
print(my_view.get_filter_arg('Name', 'contains'))
I'm getting an error I don't understand with AbstractConcreteBase
in my_enum.py
class MyEnum(AbstractConcreteBase, Base):
    pass
in enum1.py
class Enum1(MyEnum):
    years = Column(SmallInteger, default=0)

# class MyEnums1:
#     NONE = Enum1()
#     Y1 = Enum1(years=1)
in enum2.py
class Enum2(MyEnum):
    class_name_python = Column(String(50))
in test.py
from galileo.copernicus.basic_enum.enum1 import Enum1
from galileo.copernicus.basic_enum.enum2 import Enum2
#...
If I uncomment the three lines in enum1.py I get the following error on the second import.
AttributeError: type object 'MyEnum' has no attribute '__table__'
but without MyEnums1 it works fine, and with MyEnums1 in a separate file it works fine. Why would this instantiation affect the import? Is there any way I can keep MyEnums1 in the same file?
The purpose of AbstractConcreteBase is to apply a non-standard order of operations to the standard mapping procedure. Normally, mapping works like this:
define a class to be mapped
define a Table
map the class to the Table using mapper().
Declarative essentially combines these three steps, but that's what it does.
When using an abstract concrete base, we have this totally special step that needs to happen - the base class needs to be mapped to a union of all the tables that the subclasses are mapped to. So if you have enum1 and enum2, the "Base" needs to map to essentially "select * from enum1 UNION ALL select * from enum2".
This mapping to a UNION can't happen piecemeal; the MyEnum base class has to present itself to mapper() with the full UNION of every sub-table at once. So AbstractConcreteBase performs the complex task of rearranging how declarative works such that the base MyEnum is not mapped at all until the mapper configuration occurs, which among other places occurs when you first instantiate a mapped class. It then inserts itself as the mapped base for all the existing mapped subclasses.
So basically by instantiating an Enum1() object at the class level like that, you're invoking configure_mappers() way too early, such that by the time Enum2() comes along the abstractconcretebase is baked and the process fails.
All of that aside, it's not at all correct to be instantiating a mapped class like Enum1() at the class level like that. ORM-mapped objects are the complete opposite of global objects and must always be created local to a specific Session.
edit: also those classes are supposed to have {"concrete": True} on them, which is part of why you're getting this message. I'm trying to see if the message can be improved.
edit 2: yeah, the mechanics here are weird. I've committed something else that skips this particular error message, though it will fail differently now and not much better. Getting this to fail more gracefully would require a little more work.
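For reference, a hedged sketch of what the first edit describes: concrete subclasses of an AbstractConcreteBase declare __mapper_args__ with "concrete": True (the table names, primary keys, and polymorphic identities here are illustrative):

class Enum1(MyEnum):
    __tablename__ = 'enum1'
    id = Column(Integer, primary_key=True)
    years = Column(SmallInteger, default=0)
    __mapper_args__ = {'polymorphic_identity': 'enum1', 'concrete': True}

class Enum2(MyEnum):
    __tablename__ = 'enum2'
    id = Column(Integer, primary_key=True)
    class_name_python = Column(String(50))
    __mapper_args__ = {'polymorphic_identity': 'enum2', 'concrete': True}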
I'm working in an OpenERP environment, but maybe my issue can be answered from a pure Python perspective. What I'm trying to do is define a class whose _columns variable can be set from a function that returns the respective dictionary. So basically:
class repos_report(osv.osv):
    _name = "repos.report"
    _description = "Reposition"
    _auto = False

    def _get_dyna_cols(self):
        ret = {}
        cr = self.cr
        cr.execute('Select ... From ...')
        pass  # <- fill the dictionary
        return ret

    _columns = _get_dyna_cols()

    def init(self, cr):
        pass  # other stuff here too, but I need to set my _columns first, as per OpenERP

repos_report()
I have tried many ways, but this code reflects my basic need. When I execute my module for installation I get the following error:
TypeError: _get_dyna_cols() takes exactly 1 argument (0 given)
When defining the _get_dyna_cols function I'm required to have self as its first parameter (even before anything is executed). Also, I need a reference to OpenERP's cr cursor in order to query data to fill my _columns dictionary. So, how can I call this function so that its result can be assigned to _columns? What parameter could I pass to it?
From an OpenERP perspective, I guess I made my need quite clear. So any other approach suggested is also welcome.
From an OpenERP perspective, the right solution depends on what you're actually trying to do, and that's not quite clear from your description.
Usually the _columns definition of a model must be static, since it will be introspected by the ORM and (among other things) will result in the creation of corresponding database columns. You could set the _columns in the __init__ method (not init¹) of your model, but that would not make much sense, because the result must not change over time (and it will only be called once anyway, when the model registry is initialized).
Now there are a few exceptions to the "static columns" rules:
Function Fields
When you simply want to dynamically handle read/write operations on a virtual column, you can simply use a column of the fields.function type. It needs to emulate one of the other field types, but can do anything it wants with the data dynamically. Typical examples will store the data in other (real) columns after some pre-processing. There are hundreds of example in the official OpenERP modules.
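A hedged sketch of what such a column can look like (OpenERP 6.x-style API; the field name, type, and computation are illustrative):

def _compute_total(self, cr, uid, ids, field_name, arg, context=None):
    # dynamic read: return one value per requested record id
    return dict((rec_id, 0.0) for rec_id in ids)  # placeholder computation

_columns = {
    'total': fields.function(_compute_total, type='float', string='Total'),
}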
Dynamic columns set
When you are developing a wizard model (a subclass of TransientModel, formerly osv_memory), you don't usually care about the database storage, and simply want to obtain some input from the user and take corresponding actions.
It is not uncommon in that case to need a completely dynamic set of columns, where the number and types of the columns may change every time the model is used. This can be achieved by overriding a few key API methods to simulate dynamic columns:
fields_view_get is the API method that is called by the clients to obtain the definition of a view (form/tree/...) for the model.
fields_get is included in the result of fields_view_get but may be called separately, and returns a dict with the columns definition of the model.
search, read, write and create are called by the client in order to access and update record data, and should gracefully accept or return values for the columns that were defined in the result of fields_get
By properly overriding these methods, you can completely implement dynamic columns, but you will need to preserve the API behavior and handle the persistence of the data (if any) yourself, in real static columns or in other models.
There are a few examples of such dynamic columns sets in the official addons, for example in the survey module that needs to simulate survey forms based on the definition of the survey campaign.
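As an illustration, a hedged sketch of overriding fields_get on a wizard model (OpenERP 6.x-style API; the added column is purely illustrative):

class my_wizard(osv.osv_memory):
    _name = 'my.wizard'

    def fields_get(self, cr, uid, allfields=None, context=None, write_access=True):
        res = super(my_wizard, self).fields_get(
            cr, uid, allfields, context, write_access)
        # add a column computed at runtime
        res['dynamic_answer'] = {'type': 'char', 'size': 64, 'string': 'Answer'}
        return res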
¹ The init() method is only called when the model's module is installed or updated, in order to set up or update the database backend for this model. It relies on _columns to do this.
When you write _columns = _get_dyna_cols() in the class body, that function call is made right there, while Python is still executing the class body. At that point, your _get_dyna_cols method is just a function object in the local (class body) namespace, and it is called as a plain function.
The error message you get is due to the missing self parameter, which is inserted only when you access your function as a method. But this error message is not what is wrong here: what is wrong is that you are making an immediate function call and expecting special behavior, like late execution.
The way in Python to achieve what you want, i.e. to have the method called automatically when the attribute _columns is accessed, is to use the property built-in.
In this case, just do this: _columns = property(_get_dyna_cols)
This will create a class attribute named _columns which, through a mechanism called the "descriptor protocol", will call the desired method whenever the attribute is accessed from an instance.
To learn more about the property built-in, check the docs: http://docs.python.org/library/functions.html#property
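A minimal, hedged sketch of the idea in plain Python, outside OpenERP (the returned dict is a placeholder):

class ReposReport(object):
    def _get_dyna_cols(self):
        # would query the database here; static data for illustration
        return {'col_a': 'integer', 'col_b': 'char'}

    _columns = property(_get_dyna_cols)

r = ReposReport()
print(r._columns)  # calls _get_dyna_cols() on attribute access
# note: a property fires on instance access; ReposReport._columns on the
# class itself returns the property object, not the dict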