I'm getting an error I don't understand with AbstractConcreteBase
In my_enum.py:

class MyEnum(AbstractConcreteBase, Base):
    pass
In enum1.py:

class Enum1(MyEnum):
    years = Column(SmallInteger, default=0)

# class MyEnums1:
#     NONE = Enum1()
#     Y1 = Enum1(years=1)
In enum2.py:

class Enum2(MyEnum):
    class_name_python = Column(String(50))
In test.py:

from galileo.copernicus.basic_enum.enum1 import Enum1
from galileo.copernicus.basic_enum.enum2 import Enum2
#...
If I uncomment the three lines in enum1.py I get the following error on the second import.
AttributeError: type object 'MyEnum' has no attribute 'table'
but without MyEnums1 it works fine, and with MyEnums1 in a separate file it works fine. Why would this instantiation affect the import? Is there any way I can keep MyEnums1 in the same file?
The purpose of AbstractConcreteBase is to apply a non-standard order of operations to the standard mapping procedure. Normally, mapping works like this:

1. define a class to be mapped
2. define a Table
3. map the class to the Table using mapper().

Declarative combines these three steps into one, but that's essentially what it does.
When using an abstract concrete base, we have this totally special step that needs to happen: the base class needs to be mapped to a UNION of all the tables that the subclasses are mapped to. So if you have enum1 and enum2, the base needs to map to essentially "SELECT * FROM enum1 UNION ALL SELECT * FROM enum2".
This mapping to a UNION can't happen piecemeal; the MyEnum base class has to present itself to mapper() with the full UNION of every sub-table at once. So AbstractConcreteBase performs the complex task of rearranging how declarative works such that the base MyEnum is not mapped at all until the mapper configuration occurs, which among other places occurs when you first instantiate a mapped class. It then inserts itself as the mapped base for all the existing mapped subclasses.
So basically, by instantiating an Enum1() object at the class level like that, you're invoking configure_mappers() way too early, such that by the time Enum2() comes along the AbstractConcreteBase is already baked and the process fails.
All of that aside, it's not at all correct to be instantiating a mapped class like Enum1() at the class level like that. ORM-mapped objects are the complete opposite of global objects and must always be created local to a specific Session.
edit: also, those subclasses are supposed to have __mapper_args__ = {"concrete": True} on them, which is part of why you're getting this message. I'm trying to see if the message can be improved.
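For reference, here is a sketch of the arrangement described above, assuming a recent SQLAlchemy (1.4+); the table names follow the question, while the primary key columns and polymorphic identities are added for illustration:

```python
from sqlalchemy import Column, Integer, SmallInteger, String
from sqlalchemy.ext.declarative import AbstractConcreteBase
from sqlalchemy.orm import configure_mappers, declarative_base

Base = declarative_base()

class MyEnum(AbstractConcreteBase, Base):
    # Not mapped yet; mapping is deferred until mapper configuration.
    pass

class Enum1(MyEnum):
    __tablename__ = "enum1"
    id = Column(Integer, primary_key=True)
    years = Column(SmallInteger, default=0)
    __mapper_args__ = {"polymorphic_identity": "enum1", "concrete": True}

class Enum2(MyEnum):
    __tablename__ = "enum2"
    id = Column(Integer, primary_key=True)
    class_name_python = Column(String(50))
    __mapper_args__ = {"polymorphic_identity": "enum2", "concrete": True}

# Only after *all* subclasses exist is it safe to trigger mapper
# configuration; instantiating any mapped class triggers it implicitly.
configure_mappers()

e = Enum1(years=1)  # safe now
```

The key point is simply that nothing should force mapper configuration before every concrete subclass has been defined.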
edit 2: yeah, the mechanics here are weird. I've committed something that skips this particular error message, though it will now fail differently and not much better. Getting this to fail more gracefully would require a little more work.
Related
I'm using SQLAlchemy with Python (connecting to a MySQL database) and am having a little design issue, which has presented itself as a problem with a backref.
So the situation is this. I have a SearchGroup which contains TargetObjects and SearchObjects. These are both many-to-many relationships, and so the SearchGroup table comes with two association tables, one for each. The SearchObject is the same type for any SearchGroup, but the TargetObject varies. So far so good. The whole idea here is that a SearchObject is simply a string with a few other variables, and a SearchGroup compares them all to a given string and then, if there's a match, supplies the target objects.
Now for some code: the declaration of these three classes, although with the parent logic hidden for brevity:
class AssocTable_GroupCMClassesGrades_Grades(AssociationTable_Group_TargetObjectsParent, med.DeclarativeBase):
    __tablename__ = 'AssocTable_GroupCMClassesGrades_Grades'
    _groupsTableName = 'SearchGroup_CMClasses_Grades'
    _targetObjectsTableName = 'Grades'

class AssocTable_GroupCMClassesGrades_SearchObjects(AssociationTable_Group_SearchObjectsParent, med.DeclarativeBase):
    __tablename__ = 'AssocTable_GroupCMClassesGrades_SearchObjects'
    _groupsTableName = 'SearchGroup_CMClasses_Grades'
    _searchObjectsTableName = 'SearchObjects'

class SearchGroup_CMClasses_Grades(SearchObjectGroupParent, med.DeclarativeBase):
    __tablename__ = 'SearchGroup_CMClasses_Grades'
    targetAssociatedTargetObjectTableName = 'AssocTable_GroupCMClassesGrades_Grades'
    targetAssociatedSearchObjectTableName = 'AssocTable_GroupCMClassesGrades_SearchObjects'
    targetClassName = 'Grade'
    myClassName = 'SearchGroup_CMClasses_Grades'
    searchObjectClassName = 'SearchObject'
    searchObjectChildrenBackRefName = 'Groups'
The top two are the association tables and the bottom is the main class. The strings are used to set up various foreign keys and relationships and such.
Let's look at a specific example, which is crucial to the question:
@declared_attr
def searchObject_childen(cls):
    return relationship(f'{cls.searchObjectClassName}',
                        secondary=f'{cls.targetAssociatedSearchObjectTableName}',
                        backref=f'{cls.searchObjectChildrenBackRefName}')
This is inside the SearchObjectGroupParent class and, as you can see, is for the 'children' of the SearchGroup, which are SearchObjects.
So now to the problem.
That all works rather well, except for one thing. If I could direct your attention back to the large bit of code above, and to this line:
searchObjectChildrenBackRefName = 'Groups'
This, as seen in the second posted piece of code (the declared_attr one), sets up a backref: a property on the target. It creates that property and then populates it. I'm not an expert at this by any means, so I won't pretend to be. The point is this: if I create another SearchObjectGroupParent-derived class like the one above, with its association tables, I can't put another 'Groups' property into SearchObject; in fact it will throw an error telling me as much:
sqlalchemy.exc.ArgumentError: Error creating backref 'Groups' on relationship 'SearchGroup_CMClasses_Grades.searchObject_childen': property of that name exists on mapper 'mapped class SearchObject->SearchObjects'
There is a rather unsatisfying way to solve this, which is simply to change that name each time, but then SearchObject won't have a single common list of SearchGroups; instead it will contain a separate 'Groups' property for every SearchGroup. This will work, but will be messy and I'd rather not do it. What I would like is to say 'okay, if this backref already exists, just use that one'. I don't know if that's possible, but I think such a thing would solve my problem.
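For what it's worth, the collision itself can be reproduced in a few lines of plain SQLAlchemy; the class names here are illustrative stand-ins, not the real schema:

```python
from sqlalchemy import Column, ForeignKey, Integer
from sqlalchemy.orm import configure_mappers, declarative_base, relationship

Base = declarative_base()

class SearchObject(Base):
    __tablename__ = "search_objects"
    id = Column(Integer, primary_key=True)

class GroupA(Base):
    __tablename__ = "group_a"
    id = Column(Integer, primary_key=True)
    search_object_id = Column(Integer, ForeignKey("search_objects.id"))
    # First relationship creating a backref named 'Groups' -- fine on its own.
    children = relationship(SearchObject, backref="Groups")

class GroupB(Base):
    __tablename__ = "group_b"
    id = Column(Integer, primary_key=True)
    search_object_id = Column(Integer, ForeignKey("search_objects.id"))
    # Second backref with the same name, on the same target class.
    children = relationship(SearchObject, backref="Groups")

err = None
try:
    configure_mappers()  # backrefs are actually created here
except Exception as exc:  # expected: sqlalchemy.exc.ArgumentError
    err = exc

print(err)
```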
Edit: I thought an image might help explain better:
Figure 1: what I have now:
The more of these objects derived from SearchObjectGroupParent I have, the messier it will be (SearchObject will contain Groups03, Groups04, Groups05, etc.).
Figure 2: what I want:
I am using factory.LazyAttribute within a SubFactory call to pass in an object created in the factory_parent. This works fine.
But if I pass the object created to a RelatedFactory, LazyAttribute can no longer see the factory_parent and fails.
This works fine:
class OKFactory(factory.DjangoModelFactory):
    class Meta:
        model = Foo
        exclude = ['sub_object']

    sub_object = factory.SubFactory(SubObjectFactory)
    object = factory.SubFactory(
        ObjectFactory,
        sub_object=factory.LazyAttribute(lambda obj: obj.factory_parent.sub_object))
The identical call to LazyAttribute fails here:
class ProblemFactory(OKFactory):
    class Meta:
        model = Foo
        exclude = ['sub_object', 'object']

    sub_object = factory.SubFactory(SubObjectFactory)
    object = factory.SubFactory(
        ObjectFactory,
        sub_object=factory.LazyAttribute(lambda obj: obj.factory_parent.sub_object))
    another_object = factory.RelatedFactory(AnotherObjectFactory, 'foo', object=object)
The identical LazyAttribute call can no longer see factory_parent, and can only access AnotherObject values. LazyAttribute throws the error:
AttributeError: The parameter sub_object is unknown. Evaluated attributes are...[then lists all attributes of AnotherObjectFactory]
Is there a way round this?
I can't just put sub_object=sub_object into the ObjectFactory call, i.e.:
sub_object = factory.SubFactory(SubObjectFactory)
object = factory.SubFactory(ObjectFactory, sub_object=sub_object)
because if I then do:
object2 = factory.SubFactory(ObjectFactory, sub_object=sub_object)
a second sub_object is created, whereas I need both objects to refer to the same sub_object. I have tried SelfAttribute to no avail.
I think you can leverage the ability to override parameters passed in to the RelatedFactory to achieve what you want.
For example, given:
class MyFactory(OKFactory):
    object = factory.SubFactory(MyOtherFactory)
    related = factory.RelatedFactory(YetAnotherFactory)  # We want to pass object in here
If we knew what the value of object was going to be in advance, we could make it work with something like:
object = MyOtherFactory()
thing = MyFactory(object=object, related__param=object)
We can use this same naming convention to pass the object to the RelatedFactory within the main Factory:
class MyFactory(OKFactory):
    class Meta:
        exclude = ['object']

    object = factory.SubFactory(MyOtherFactory)
    related__param = factory.SelfAttribute('object')
    related__otherrelated__param = factory.LazyAttribute(lambda myobject: 'admin%d_%d' % (myobject.level, myobject.level - 1))
    related = factory.RelatedFactory(YetAnotherFactory)  # Will be called with {'param': object, 'otherrelated__param': 'admin1_2'}
I solved this by simply calling factories within @factory.post_generation. Strictly speaking this isn't a solution to the specific problem posed, but I explain below in detail why this ended up being a better architecture. @rhunwick's solution does genuinely pass a SubFactory(LazyAttribute('')) to RelatedFactory; however, restrictions remained that meant it was not right for my situation.
We move the creation of sub_object and object from ProblemFactory to ObjectWithSubObjectsFactory (and remove the exclude clause), and add the following code to the end of ProblemFactory.
@factory.post_generation
def post(self, create, extracted, **kwargs):
    if not create:
        return  # No IDs, so wouldn't work anyway
    object = ObjectWithSubObjectsFactory()
    sub_object_ids_by_code = dict((sbj.name, sbj.id) for sbj in object.subobject_set.all())
    # self is the `Foo` Django object just created by the `ProblemFactory` that contains this code.
    for another_obj in self.anotherobject_set.all():
        if another_obj.name == 'age_in':
            another_obj.attribute_id = sub_object_ids_by_code['Age']
            another_obj.save()
        elif another_obj.name == 'income_in':
            another_obj.attribute_id = sub_object_ids_by_code['Income']
            another_obj.save()
So it seems RelatedFactory calls are executed before PostGeneration calls.
The naming in this question is easier to understand, so here is the same solution code for that sample problem:
The creation of dataset, column_1 and column_2 are moved into a new factory DatasetAnd2ColumnsFactory, and the code below is then added to the end of FunctionToParameterSettingsFactory.
@factory.post_generation
def post(self, create, extracted, **kwargs):
    if not create:
        return
    dataset = DatasetAnd2ColumnsFactory()
    column_ids_by_name = dict((column.name, column.id) for column in dataset.column_set.all())
    # self is the `FunctionInstantiation` Django object just created by the
    # `FunctionToParameterSettingsFactory` that contains this code.
    for parameter_setting in self.parametersetting_set.all():
        if parameter_setting.name == 'age_in':
            parameter_setting.column_id = column_ids_by_name['Age']
            parameter_setting.save()
        elif parameter_setting.name == 'income_in':
            parameter_setting.column_id = column_ids_by_name['Income']
            parameter_setting.save()
I then extended this approach by passing in options to configure the factory, like this:
whatever = WhateverFactory(options__an_option=True, options__another_option=True)
Then this factory code detected the options and generated the test data required (note the method is renamed to options to match the prefix on the parameter names):
@factory.post_generation
def options(self, create, not_used, **kwargs):
    # The standard code as above
    if kwargs.get('an_option', None):
        pass  # code for custom option 'an_option'
    if kwargs.get('another_option', None):
        pass  # code for custom option 'another_option'
I then further extended this. Because my desired models contained self joins, my factory is recursive. So for a call such as:
whatever = WhateverFactory(options__an_option='xyz',
                           options__an_option_for_a_nested_whatever='abc')
Within @factory.post_generation I have:
class Meta:
    model = Whatever

# self is the top level object being generated
@factory.post_generation
def options(self, create, not_used, **kwargs):
    # This generates the nested object
    nested_object = WhateverFactory(
        options__an_option=kwargs.get('an_option_for_a_nested_whatever', None))
    # then join nested_object to self via the self join
    self.nested_whatever_id = nested_object.id
Some notes, which you do not need to read, on why I went with this option rather than @rhunwick's proper solution to my question above. There were two reasons.
The thing that stopped me experimenting with it was that the order of RelatedFactory and post-generation execution is not reliable: apparently unrelated factors affect it, presumably a consequence of lazy evaluation. I had errors where a set of factories would suddenly stop working for no apparent reason. Once it was because I renamed the variables the RelatedFactory declarations were assigned to. This sounds ridiculous, but I tested it to death (and posted here); there is no doubt that renaming the variables reliably switched the sequence of RelatedFactory and post-generation execution. I still assumed this was some oversight on my behalf until it happened again for some other reason (which I never managed to diagnose).
Secondly, I found the declarative code confusing, inflexible and hard to refactor. It isn't straightforward to pass different configurations in at instantiation so that the same factory can be used for different variations of test data, which meant I had to repeat code. Each working object needs adding to a Factory's Meta.exclude list; that sounds trivial, but when you have pages of code generating data it was a reliable source of errors. As a developer you'd have to pass over several factories several times to understand the control flow. Generation code gets spread through the declarative body until you've exhausted those tricks, and then the rest goes in post-generation or gets very convoluted. A common example for me is a triad of interdependent models (e.g. a parent-children category structure or dataset/attributes/entities) used as a foreign key of another triad of interdependent objects (e.g. models, parameter values, etc., referring to other models' parameter values). A few of these structures, especially if nested, quickly become unmanageable.
I realize it isn't really in the spirit of factory_boy, but putting everything into post-generation solved all these problems. I can pass in parameters, so the same single factory serves all my composite model test data requirements and no code is repeated. The sequence of creation is immediately visible, obvious and completely reliable, rather than depending on confusing chains of inheritance and overriding, and subject to some bug. The interactions are obvious, so you don't need to digest the whole thing to add some functionality, and the different areas of functionality are grouped in the post-generation if clauses. There's no need to exclude working variables, and you can refer to them for the duration of the factory code. The unit test code is simplified, because describing the functionality goes in parameter names rather than Factory class names: you create data with a call like WhateverFactory(options__create_xyz=True, options__create_abc=True, ...) rather than WhateverCreateXYZCreateABC..(). This makes for a nice division of responsibilities that is quite clean to code.
I want to have a base class called MBUser that has some predefined properties, ones that I don't want to be changed. If the client wants to add properties to MBUser, it is advised that MBUser be subclassed, and any additional properties be put in there.
The API code won't know whether the client actually subclasses MBUser or not, but it shouldn't matter. The thinking went that we could just get an MBUser by id. So I expected this to work:
def test_CreateNSUser_FetchMBUser(self):
    from nsuser import NSUser
    id = create_unique_id()
    user = NSUser(id=id)
    user.put()
    # changing MBUser.get.. to NSUser.get makes this test succeed
    get_user = MBUser.get_by_id(id)
    self.assertIsNotNone(get_user)
Here NSUser is a subclass of MBUser. The test fails.
Why can't I do this?
What's a work around?
Models are defined by their "kind", and a subclass is a different kind, even if it seems the same.
The point of subclassing is not to share values, but to share the "schema" you've created for a given "kind".
A kind map is created on the base class ndb.Model (it seems like you're using ndb, since you mentioned get_by_id), and each kind is looked up in it when you do queries like this.
For subclasses, the kind is just defined as the class name:
@classmethod
def _get_kind(cls):
    return cls.__name__
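The mechanism can be sketched in plain Python, without App Engine; this is an illustration of the kind-map idea, not ndb's actual implementation:

```python
# A registry keyed by kind name, which defaults to the class name --
# so a subclass registers under a *different* kind than its base.
class Model:
    _kind_map = {}

    def __init_subclass__(cls, **kw):
        super().__init_subclass__(**kw)
        Model._kind_map[cls._get_kind()] = cls

    @classmethod
    def _get_kind(cls):
        return cls.__name__

class MBUser(Model):
    pass

class NSUser(MBUser):
    pass

# Entities stored under kind 'NSUser' are not found by lookups
# keyed on kind 'MBUser', even though NSUser subclasses MBUser.
print(MBUser._get_kind(), NSUser._get_kind())
```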
I just discovered GAE has a solution for this. It's called the PolyModel:
https://developers.google.com/appengine/docs/python/ndb/polymodelclass
I'm working in an OpenERP environment, but maybe my issue can be answered from a pure Python perspective. What I'm trying to do is define a class whose "_columns" variable can be set from a function that returns the respective dictionary. So basically:
class repos_report(osv.osv):
    _name = "repos.report"
    _description = "Reposition"
    _auto = False

    def _get_dyna_cols(self):
        ret = {}
        cr = self.cr
        cr.execute('Select ... From ...')
        pass  # <- Fill dictionary
        return ret

    _columns = _get_dyna_cols()

    def init(self, cr):
        pass  # Other stuff here too, but I need to set my _columns before as per openerp

repos_report()
I have tried many ways, but this code reflects my basic need. When I execute my module for installation I get the following error:
TypeError: _get_dyna_cols() takes exactly 1 argument (0 given)
When defining the _get_dyna_cols function I'm required to have self as the first parameter (even before it executes). Also, I need a reference to OpenERP's 'cr' cursor in order to query data to fill my _columns dictionary. So, how can I call this function so that it can be assigned to _columns? What parameter could I pass to it?
From an OpenERP perspective, I guess I made my need quite clear. So any other approach suggested is also welcome.
From an OpenERP perspective, the right solution depends on what you're actually trying to do, and that's not quite clear from your description.
Usually the _columns definition of a model must be static, since it will be introspected by the ORM and (among other things) will result in the creation of corresponding database columns. You could set the _columns in the __init__ method (not init¹) of your model, but that would not make much sense, because the result must not change over time (and it will only get called once anyway, when the model registry is initialized).
Now there are a few exceptions to the "static columns" rules:
Function Fields
When you simply want to dynamically handle read/write operations on a virtual column, you can simply use a column of the fields.function type. It needs to emulate one of the other field types, but can do anything it wants with the data dynamically. Typical examples will store the data in other (real) columns after some pre-processing. There are hundreds of examples in the official OpenERP modules.
Dynamic columns set
When you are developing a wizard model (a subclass of TransientModel, formerly osv_memory), you don't usually care about the database storage, and simply want to obtain some input from the user and take corresponding actions.
It is not uncommon in that case to need a completely dynamic set of columns, where the number and types of the columns may change every time the model is used. This can be achieved by overriding a few key API methods to simulate dynamic columns:
fields_view_get is the API method that is called by the clients to obtain the definition of a view (form/tree/...) for the model.
fields_get is included in the result of fields_view_get but may be called separately, and returns a dict with the columns definition of the model.
search, read, write and create are called by the client in order to access and update record data, and should gracefully accept or return values for the columns that were defined in the result of fields_get
By overriding properly these methods, you can completely implement dynamic columns, but you will need to preserve the API behavior, and handle the persistence of the data (if any) yourself, in real static columns or in other models.
There are a few examples of such dynamic columns sets in the official addons, for example in the survey module that needs to simulate survey forms based on the definition of the survey campaign.
¹ The init() method is only called when the model's module is installed or updated, in order to set up/update the database backend for this model. It relies on _columns to do this.
When you write _columns = _get_dyna_cols() in the class body, that function call is made right there, in the class body, while Python is still executing the class statement itself. At that point, your _get_dyna_cols method is just a function object in the local (class body) namespace, and it is called as a plain function.
The error message you get is due to the missing self parameter, which is supplied only when you access your function as a method. But that error message is not what is really wrong here: what is wrong is that you are making an immediate function call while expecting special behavior, like late execution.
The way in Python to achieve what you want, i.e. to have the method called automatically when the _columns attribute is accessed, is to use the "property" built-in.
In this case, just do this: _columns = property(_get_dyna_cols)
This will create a class attribute named "_columns" which, through a mechanism called the "descriptor protocol", will call the desired method whenever the attribute is accessed from an instance.
To learn more about the property built-in, check the docs: http://docs.python.org/library/functions.html#property
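A minimal, OpenERP-independent illustration of that behaviour (the column dict here is made up):

```python
class Example:
    def _get_dyna_cols(self):
        # imagine a database query filling this dict
        return {'col_a': 'int', 'col_b': 'char'}

    # no call here: property defers execution until attribute access
    _columns = property(_get_dyna_cols)

r = Example()
print(r._columns)  # the method is only called now, on attribute access
```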
I want to make attributes of GAE Model properties behave like Python properties. The reason is for cases like turning the value into uppercase before storing it. For a plain Python class, I would do something like:
class Foo(db.Model):
    def get_attr(self):
        return self.something

    def set_attr(self, value):
        self.something = value.upper() if value != None else None

    attr = property(get_attr, set_attr)
However, the GAE Datastore has its own concept of a Property class. I looked into the documentation, and it seems that I could override get_value_for_datastore(model_instance) to achieve my goal. Nevertheless, I don't know what model_instance is or how to extract the corresponding field from it.
Is overriding GAE Property classes the right way to provides getter/setter-like functionality? If so, how to do it?
Added:
One potential issue with overriding get_value_for_datastore that I can think of is that it might not get called until the object is put into the datastore. Hence getting the attribute before storing the object would yield an incorrect value.
Subclassing GAE's Property class is especially helpful if you want more than one "field" with similar behavior, in one or more models. Don't worry, get_value_for_datastore and make_value_from_datastore are going to get called, on any store and fetch respectively -- so if you need to do anything fancy (including but not limited to uppercasing a string, which isn't actually all that fancy;-), overriding these methods in your subclass is just fine.
Edit: let's see some example code (net of imports and main):
class MyStringProperty(db.StringProperty):
    def get_value_for_datastore(self, model_instance):
        vv = db.StringProperty.get_value_for_datastore(self, model_instance)
        return vv.upper()

class MyModel(db.Model):
    foo = MyStringProperty()

class MainHandler(webapp.RequestHandler):
    def get(self):
        my = MyModel(foo='Hello World')
        k = my.put()
        mm = MyModel.get(k)
        s = mm.foo
        self.response.out.write('The secret word is: %r' % s)
This shows you the string's been uppercased in the datastore -- but if you change the get call to a simple mm = my you'll see the in-memory instance wasn't affected.
But a db.Property instance is itself a descriptor; wrapping it into a built-in property (a completely different descriptor) will not work well with the datastore (for example, you can't write GQL queries based on field names that aren't really instances of db.Property but instances of property -- those fields are not in the datastore!).
So if you want to work both with the datastore and with instances of Model that have never actually been to the datastore and back, you'll have to choose two names for what's logically "the same" field. One is the name of the attribute you'll use on in-memory model instances, and that one can be a built-in property; the other is the name of the attribute that ends up in the datastore, and that one needs to be an instance of a db.Property subclass -- and it's this second name that you'll need to use in queries. Of course the methods underlying the first name need to read and write the second name, but you can't just "hide" the latter, because that's the name that's going to be in the datastore, and so that's the name that will make sense to queries!
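Here is a plain-Python sketch of that two-name pattern; the underscore-prefixed attribute stands in for the real db.Property, and all names are illustrative:

```python
class InMemoryModel:
    _attr = None  # stand-in for the real db.Property (the "datastore name")

    @property
    def attr(self):
        # the "in-memory name": a plain built-in property
        return self._attr

    @attr.setter
    def attr(self, value):
        # normalize on the way in; the backing name holds the stored value
        self._attr = value.upper() if value is not None else None

m = InMemoryModel()
m.attr = 'hello'
print(m._attr)
```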
What you want is a DerivedProperty. The procedure for writing one is outlined in that post; it's similar to what Alex describes, but by overriding get instead of get_value_for_datastore you avoid issues with needing to write to the datastore to update it. My aetycoon library includes it and other useful properties.