How to avoid the use of a complicated metaclass with sqlalchemy declarative - python

I've just noticed that in a stackoverflow reply sqlalchemy's creator Mike Bayer (zzzeek) said that "Custom metaclasses aren't needed with Declarative for simple use cases, and pretty much not at all for hard use cases either. Mixins + custom bases should be able to do pretty much everything." This raised my hopes that something I've been doing in sqlalchemy using a custom metaclass with declarative_base can be done using mixins + custom bases instead. However, I have not been able to make it work. I would greatly appreciate any hints!
I have a mixin ConciseRelationshipMixin that enables the creation of relationships using a concise syntax. It is used as follows:
class Eggs(ConciseRelationshipMixin, OtherMixin1, OtherMixin2):
pass
Spam = declarative_base(cls=Eggs)
class Ham(Spam):
some_column = Column(Integer)
crispy_bacon = ConciseRelationship(Bacon)
This creates an extra column in Ham.__table__ to hold the FK of the relationship, as well as the relationship itself under the name Ham.bar, using project-specific conventions for the name of the extra column and the primaryjoin. I really like this mixin because it makes the definition of Ham look very clear and uncluttered.
The reason I have not been able to implement this without using metaclasses is that I am trying to convert the assigment crispy_bacon = ... into two assigments column_for_crispy_bacon = Column(...) and crispy_bacon = relationship(...), I could not see any other way to do that.
I am also open to replies of the form "you should not be doing this kind of thing, because ...".

Related

Django creating multiple tables/model classes from same base class with factory function

I have been trying to figure out the best way to automate creating multiple SQL tables based on separate but identical models, all based on the same base class. I'm basically creating pseudo message boards or walls with different Groups, and I wanted each Group to have its own db_table of Posts, each Post containing the user id, timestamp, etc.
My first thought was to have one base class of Posts and just include a field for Group name, but I thought this would be bad practice. My rationale was that one table containing every Post for all Groups would get really big (in theory anyway) and slow down filtering, and also that the extra field for group name would in the long run be a waste of memory when I could have separate tables per group and skip this field.
I've also considered using a ForeignKey with a Many-to-One relationship, but as far as I can tell this has the same drawbacks. Am I wrong to think that? Or are these size concerns not really an issue?
So my next idea was to make Posts an abstract class, and then create subclasses based on each Group. This is ultimately what I did. However, I found myself having to copy and paste the code over and over and change the class name each time. This felt very unPythonic to me. It was something like:
class Posts(models.Model):
timestamp = models.DateTimeField(auto_now_add=True, unique=False)
user_id = ...
#etc.
#
class Meta:
abstract = True
class GroupA(Posts):
class Meta(Posts.Meta):
db_table = 'groupa_board'
class GroupB(Posts):
class Meta(Posts.Meta):
db_table = 'groupb_board'
class GroupC...etc.
What I really was looking for was a factory function to do this for me. I tried this sort of thing:
def makeBoard(group):
class Board(Posts):
class Meta(Posts.Meta):
db_table = group
return board #note I tried with and without this line
And then I ran a simple for loop using a list of groups.
for group in groups:
makeBoard(group)
I found myself hitting a RuntimeError: conflicting models in application, and I probably deserved it. So then I figured what I need is something like:
def makeBoard(group):
class group(Posts): #***group here being a variable, not the class name
class Meta(Posts.Meta):
db_table = '%s' % group #maybe issues here too, but the table
return group #name is not that important if the class
#name works
But I couldn't figure out how to make this work! Is there a way to pass a variable from a list to a class name?
Anyway if you're still with me I appreciate it. I've been on stackoverflow all day and while I've found guides for creating abstract base classes and subclasses to solve similar issues, I didn't see a way to create a function to do this for me. I ultimately punted here and just make a subclass for each group by hand. If there is a way to automate this process, I'd love to hear it.
Also, if I'm being stupid for not just going with one db table containing every post, I'd like to know that too, and why! Or if there's a better way to implement this kind of system altogether. I apologize if this has been answered before, I really couldn't find it.
Thank you!
Using a single table would not be bad practice. The extra memory is minimal, on modern systems that shouldn't be a problem. You shouldn't worry about performance either, premature optimization (not including the actual system design) is considered bad practice, but if you run into performance problems you can always specify an index on the group column:
group = models.CharField(max_length=100, db_index=True)
That's not to say that it is the best option, or that your method isn't good. Also, it is entirely possible to dynamically create models, using the type() built-in function. The only difference with dynamically creating models and creating other classes is that you must specifically pass the __module__ attribute. You can create subclasses for Posts in the following way:
def fabric(names, baseclass=Posts):
for name in names:
class Meta:
db_table = '%s_table' % name.lower()
attrs = {'__module__': baseclass.__module__, 'Meta': Meta}
# specify any other class attributes here. E.g. you can specify extra fields:
attrs.update({'my_field': models.CharField(max_length=100)})
newclass = type(str(name), (baseclass,), attrs)
globals()[name] = newclass
fabric(['GroupA', 'GroupB', 'GroupC', etc...])
Put that code in your models.py after your Posts class, and all classes will be created for you. They can be used in any way normal classes can be used: Django doesn't even know you dynamically created this class. Though your Meta class doesn't inherit from Posts.Meta, your meta settings should still be preserved.
Tested with Django 1.4.
Try smth like this
import app.models as group_models
from django.db.models.base import ModelBase
def fabric(group):
for item in dir(group_models):
c = getattr(group_models, item)
if type(c) is ModelBase:
if c._meta.db_table == '%s_table' % group:
return c
return None

Google App Engine base and subclass gets

I want to have a base class called MBUser that has some predefined properties, ones that I don't want to be changed. If the client wants to add properties to MBUser, it is advised that MBUser be subclassed, and any additional properties be put in there.
The API code won't know if the client actually subclasses MBUser or not, but it shouldn't matter. The thinking went that we could just get MBUser by id. So I expected this to work:
def test_CreateNSUser_FetchMBUser(self):
from nsuser import NSUser
id = create_unique_id()
user = NSUser(id = id)
user.put()
# changing MBUser.get.. to NSUser.get makes this test succeed
get_user = MBUser.get_by_id(id)
self.assertIsNotNone(get_user)
Here NSUser is a subclass of MBUser. The test fails.
Why can't I do this?
What's a work around?
Models are defined by their "kind", and a subclass is a different kind, even if it seems the same.
The point of subclassing is not to share values, but to share the "schema" you've created for a given "kind".
A kind map is created on base class ndb.Model (it seems like you're using ndb since you mentioned get_by_id) and each kind is looked up when you do queries like this.
For subclasses, the kind is just defined as the class name:
#classmethod
def _get_kind(cls):
return cls.__name__
I just discovered GAE has a solution for this. It's called the PolyModel:
https://developers.google.com/appengine/docs/python/ndb/polymodelclass

Python, SQLAlchemy and Postgresql: understanding inheritance

Pretty recent (but not newborn) to both Python, SQLAlchemy and Postgresql, and trying to understand inheritance very hard.
As I am taking over another programmer's code, I need to understand what is necessary, and where, for the inheritance concept to work.
My questions are:
Is it possible to rely only on SQLAlchemy for inheritance? In other words, can SQLAlchemy apply inheritance on Postgresql database tables that were created without specifying INHERITS=?
Is the declarative_base technology (SQLAlchemy) necessary to use inheritance the proper way. If so, we'll have to rewrite everything, so please don't discourage me.
Assuming we can use Table instance, empty Entity classes and mapper(), could you give me a (very simple) example of how to go through the process properly (or a link to an easily understandable tutorial - I did not find any easy enough yet).
The real world we are working on is real estate objects. So we basically have
- one table immobject(id, createtime)
- one table objectattribute(id, immoobject_id, oatype)
- several attribute tables: oa_attributename(oa_id, attributevalue)
Thanks for your help in advance.
Vincent
Welcome to Stack Overflow: in the future, if you have more than one question; you should provide a separate post for each. Feel free to link them together if it might help provide context.
Table inheritance in postgres is a very different thing and solves a different set of problems from class inheritance in python, and sqlalchemy makes no attempt to combine them.
When you use table inheritance in postgres, you're doing some trickery at the schema level so that more elaborate constraints can be enforced than might be easy to express in other ways; Once you have designed your schema; applications aren't normally aware of the inheritance; If they insert a row; it just magically appears in the parent table (much like a view). This is useful, for instance, for making some kinds of bulk operations more efficient (you can just drop the table for the month of january).
This is a fundamentally different idea from inheritance as seen in OOP (in python or otherwise, with relational persistence or otherwise). In that case, the application is aware that two types are related, and that the subtype is a permissible substitute for the supertype. "A holding is an address, a contact has an address therefore a contact can have a holding."
Which of these, (mostly orthogonal) tools you need depends on the application. You might need neither, you might need both.
Sqlalchemy's mechanisms for working with object inheritance is flexible and robust, you should use it in favor of a home built solution if it is compatible with your particular needs (this should be true for almost all applications).
The declarative extension is a convenience; It allows you to describe the mapped table, the python class and the mapping between the two in one 'thing' instead of three. It makes your code more "DRY"; It is however only a convenience layered on top of "classic sqlalchemy" and it isn't necessary by any measure.
If you find that you need table inheritance that's visible from sqlalchemy; your mapped classes won't be any different from not using those features; tables with inheritance are still normal relations (like tables or views) and can be mapped without knowledge of the inheritance in the python code.
For your #3, you don't necessarily have to declare empty entity classes to use mapper. If your application doesn't need fancy properties, you can just use introspection and metaclasses to model the existing tables without defining them. Here's what I did:
mymetadata = sqlalchemy.MetaData()
myengine = sqlalchemy.create_engine(...)
def named_table(tablename):
u"return a sqlalchemy.Table object given a SQL table name"
return sqlalchemy.Table(tablename, mymetadata, autoload=True, autoload_with=myengine)
def new_bound_class(engine, table):
u"returns a new ORM class (processed by sqlalchemy.orm.mapper) given a sqlalchemy.Table object"
fieldnames = table.c.__dict__['_data']
def format_attributes(obj, transform):
attributes = [u'%s=%s' % (x, transform(x)) for x in fieldnames]
return u', '.join(attributes)
class DynamicORMClass(object):
def __init__(self, **kw):
u"Keyword arguments may be used to initialize fields/columns"
for key in kw:
if key in fieldnames: setattr(self, key, kw[key])
else: raise KeyError, '%s is not a valid field/column' % (key,)
def __repr__(self):
return u'%s(%s)' % (self.__class__.__name__, format_attributes(self, repr))
def __str__(self):
return u'%s(%s)' % (str(self.__class__), format_attributes(self, str))
DynamicORMClass.__doc__ = u"This is a dynamic class created using SQLAlchemy based on table %s" % (table,)
return sqlalchemy.orm.mapper(DynamicORMClass, table)
def named_orm_class(table):
u"returns a new ORM class (processed by sqlalchemy.orm.mapper) given a table name or object"
if not isinstance(table, Table):
table = named_table(table)
return new_bound_class(table)
Example of use:
>>> myclass = named_orm_class('mytable')
>>> session = Session()
>>> obj = myclass(name='Fred', age=25, ...)
>>> session.add(obj)
>>> session.commit()
>>> print str(obj) # will print all column=value pairs
I beefed up my versions of new_bound_class and named_orm_class a little more with decorators, etc. to provide extra capabilities, and you can too. Of course, under the covers, it is declaring an empty entity class. But you don't have to do it, except this one time.
This will tide you over until you decide that you're tired of doing all those joins yourself, and why can't I just have an object attribute that does a lazy select query against related classes whenever I use it. That's when you make the leap to declarative (or Elixir).

Figure out child type with Django MTI or specify type as field?

I'm setting up a data model in django using multiple-table inheritance (MTI) like this:
class Metric(models.Model):
account = models.ForeignKey(Account)
date = models.DateField()
value = models.FloatField()
calculation_in_progress = models.BooleanField()
type = models.CharField( max_length=20, choices= METRIC_TYPES ) # Appropriate?
def calculate(self):
# default calculation...
class WebMetric(Metric):
url = models.URLField()
def calculate(self):
# web-specific calculation...
class TextMetric(Metric):
text = models.TextField()
def calculate(self):
# text-specific calculation...
My instinct is to put a 'type' field in the base class as shown here, so I can tell which sub-class any Metric object belongs to. It would be a bit of a hassle to keep this up to date all the time, but possible. But do I need to do this? Is there some way that django handles this automatically?
When I call Metric.objects.all() every objects returned is an instance of Metric never the subclasses. So if I call .calculate() I never get the sub-class's behavior.
I could write a function on the base class that tests to see if I can cast it to any of the sub-types like:
def determine_subtype(self):
try:
self.webmetric
return WebMetric
except WebMetric.DoesNotExist:
pass
# Repeat for every sub-class
but this seems like a bunch of repetitious code. And it's also not something that can be included in a SELECT filter -- only works in python-space.
What's the best way to handle this?
While it might offend some people's sensibilities, the only practical way to solve this problem is to put either a field or a method in the base class which says what kind of object each record really is. The problem with the method you describe is that it requires a separate database query for every type of subclass, for each object you're dealing with. This could get extremely slow when working with large querysets. A better way is to use a ForeignKey to the django Content Type class.
#Carl Meyer wrote a good solution here: How do I access the child classes of an object in django without knowing the name of the child class?
Single Table Inheritance could help alleviate this issue, depending on how it gets implemented. But for now Django does not support it: Single Table Inheritance in Django so it's not a helpful suggestion.
But do I need to do this?
Never. Never. Never.
Is there some way that django handles this automatically?
Yes. It's called "polymorphism".
You never need to know the subclass. Never.
"What about my WebMetric.url and my TextMetric.text attributes?"
What will you do with these attributes? Define a method function that does something. Implement different versions in WebMetric (that uses url) and TextMetric (that uses text).
That's proper polymorphism.
Please read this: http://docs.djangoproject.com/en/1.2/topics/db/models/#abstract-base-classes
Please make your superclass abstract.
Do NOT do this: http://docs.djangoproject.com/en/1.2/topics/db/models/#multi-table-inheritance
You want "single-table inheritance".

How do I access the child classes of an object in django without knowing the name of the child class?

In Django, when you have a parent class and multiple child classes that inherit from it you would normally access a child through parentclass.childclass1_set or parentclass.childclass2_set, but what if I don't know the name of the specific child class I want?
Is there a way to get the related objects in the parent->child direction without knowing the child class name?
(Update: For Django 1.2 and newer, which can follow select_related queries across reverse OneToOneField relations (and thus down inheritance hierarchies), there's a better technique available which doesn't require the added real_type field on the parent model. It's available as InheritanceManager in the django-model-utils project.)
The usual way to do this is to add a ForeignKey to ContentType on the Parent model which stores the content type of the proper "leaf" class. Without this, you may have to do quite a number of queries on child tables to find the instance, depending how large your inheritance tree is. Here's how I did it in one project:
from django.contrib.contenttypes.models import ContentType
from django.db import models
class InheritanceCastModel(models.Model):
"""
An abstract base class that provides a ``real_type`` FK to ContentType.
For use in trees of inherited models, to be able to downcast
parent instances to their child types.
"""
real_type = models.ForeignKey(ContentType, editable=False)
def save(self, *args, **kwargs):
if self._state.adding:
self.real_type = self._get_real_type()
super(InheritanceCastModel, self).save(*args, **kwargs)
def _get_real_type(self):
return ContentType.objects.get_for_model(type(self))
def cast(self):
return self.real_type.get_object_for_this_type(pk=self.pk)
class Meta:
abstract = True
This is implemented as an abstract base class to make it reusable; you could also put these methods and the FK directly onto the parent class in your particular inheritance hierarchy.
This solution won't work if you aren't able to modify the parent model. In that case you're pretty much stuck checking all the subclasses manually.
In Python, given a ("new-style") class X, you can get its (direct) subclasses with X.__subclasses__(), which returns a list of class objects. (If you want "further descendants", you'll also have to call __subclasses__ on each of the direct subclasses, etc etc -- if you need help on how to do that effectively in Python, just ask!).
Once you have somehow identified a child class of interest (maybe all of them, if you want instances of all child subclasses, etc), getattr(parentclass,'%s_set' % childclass.__name__) should help (if the child class's name is 'foo', this is just like accessing parentclass.foo_set -- no more, no less). Again, if you need clarification or examples, please ask!
Carl's solution is a good one, here's one way to do it manually if there are multiple related child classes:
def get_children(self):
rel_objs = self._meta.get_all_related_objects()
return [getattr(self, x.get_accessor_name()) for x in rel_objs if x.model != type(self)]
It uses a function out of _meta, which is not guaranteed to be stable as django evolves, but it does the trick and can be used on-the-fly if need be.
It turns out that what I really needed was this:
Model inheritance with content type and inheritance-aware manager
That has worked perfectly for me. Thanks to everyone else, though. I learned a lot just reading your answers!
You can use django-polymorphic for that.
It allows to automatically cast derived classes back to their actual type. It also provides Django admin support, more efficient SQL query handling, and proxy model, inlines and formset support.
The basic principle seems to be reinvented many times (including Wagtail's .specific, or the examples outlined in this post). It takes more effort however, to make sure it doesn't result in an N-query issue, or integrate nicely with the admin, formsets/inlines or third party apps.
Here's my solution, again it uses _meta so isn't guaranteed to be stable.
class Animal(models.model):
name = models.CharField()
number_legs = models.IntegerField()
...
def get_child_animal(self):
child_animal = None
for r in self._meta.get_all_related_objects():
if r.field.name == 'animal_ptr':
child_animal = getattr(self, r.get_accessor_name())
if not child_animal:
raise Exception("No subclass, you shouldn't create Animals directly")
return child_animal
class Dog(Animal):
...
for a in Animal.objects.all():
a.get_child_animal() # returns the dog (or whatever) instance
You can achieve this looking for all the fields in the parent that are an instance of django.db.models.fields.related.RelatedManager. From your example it seems that the child classes you are talking about are not subclasses. Right?
An alternative approach using proxies can be found in this blog post. Like the other solutions, it has its benefits and liabilities, which are very well put in the end of the post.

Categories

Resources