I'm using sqlalchemy but find documentation difficult to search.
I have these two columns:
verified = Column(Boolean, default=False)
verified_at = Column(DateTime, nullable=True)
I'd like to create a function that does something like this:
if self.verified and not oldobj.verified:
    self.verified_at = datetime.datetime.utcnow()
if not self.verified and oldobj.verified:
    self.verified_at = None
I'm not sure where to put code like this. I could put it in the application, but would prefer the model object took care of this logic.
I think what you're looking for is a Hybrid Property.
import datetime

from sqlalchemy.ext.hybrid import hybrid_property

class VerifiedAsset(Base):
    id = Column(Integer, primary_key=True)
    verified_at = Column(DateTime, nullable=True)

    @hybrid_property
    def verification(self):
        return self.verified_at

    @verification.setter
    def verification(self, value):
        if value and not self.verification:
            self.verified_at = datetime.datetime.utcnow()
        if not value and self.verification:
            self.verified_at = None
        # Presumably you want to handle your other cases here
You want to update your verified_at value in a particular way based on some incoming new value. Use properties to wrap the underlying value, and only update when it is appropriate, and only to what you're actually persisting in the db.
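Stripped of the ORM, the wrapping pattern is just a plain Python property. A minimal, stdlib-only sketch (the class name mirrors the example above, but this is illustrative, not the SQLAlchemy-mapped version):

```python
import datetime

class VerifiedAsset:
    def __init__(self):
        self.verified_at = None

    @property
    def verification(self):
        return self.verified_at

    @verification.setter
    def verification(self, value):
        # Only stamp the time on a False -> True transition,
        # and clear it on a True -> False transition.
        if value and not self.verification:
            self.verified_at = datetime.datetime.utcnow()
        if not value and self.verification:
            self.verified_at = None

a = VerifiedAsset()
a.verification = True
assert a.verified_at is not None
a.verification = False
assert a.verified_at is None
```

The hybrid_property version behaves the same way at the instance level; the hybrid machinery only adds a class-level, queryable counterpart.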
You can use SQLAlchemy's event registration to hook in code like that: http://docs.sqlalchemy.org/en/latest/core/event.html.
Basically, you can subscribe to certain events that happen in the Core and ORM. I think it's a clean way to manage what you want to achieve.
You would use the listens_for() decorator in order to hook in when those columns change.
Reading "Changing Attribute Behavior" and "ORM Events" is a good start on trying to solve this type of problem.
One way to go about it would be to set an event listener that updates the timestamp:
@event.listens_for(MyModel.verified, 'set')
def mymodel_verified_set(target, value, oldvalue, initiator):
    """Set verified_at when verified changes."""
    if value != oldvalue:
        target.verified_at = datetime.datetime.utcnow() if value else None
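Put together as a runnable sketch (assuming SQLAlchemy 1.4+; the model name Asset is made up to match the question's two columns):

```python
import datetime

from sqlalchemy import Column, Integer, Boolean, DateTime, event
from sqlalchemy.orm import declarative_base

Base = declarative_base()

class Asset(Base):
    __tablename__ = 'assets'
    id = Column(Integer, primary_key=True)
    verified = Column(Boolean, default=False)
    verified_at = Column(DateTime, nullable=True)

@event.listens_for(Asset.verified, 'set')
def asset_verified_set(target, value, oldvalue, initiator):
    """Stamp or clear verified_at whenever verified actually changes."""
    if value != oldvalue:
        target.verified_at = datetime.datetime.utcnow() if value else None

a = Asset()
a.verified = True
assert a.verified_at is not None
a.verified = False
assert a.verified_at is None
```

Note that the listener fires on plain attribute assignment, before any flush, so the model keeps the logic out of the application layer as the question asked.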
I have a complex model. Let's say it contains 100 entities, all of which are related to each other in some way. Some are many to many, some are one to one, some are many to one, and so on.
These entities all have start and end timestamps indicating valid time ranges. When loading these entities via query, I wish to populate the relationship fields only with entities that have start and end stamps wrapping a given timestamp: for example datetime.now(), or yesterday, or whenever.
I'll define two models here for example, but assume there are a vast number of others:
class User(base):
    __tablename__ = 'User'
    uid = Column(Integer, primary_key=True)

class Role(base):
    __tablename__ = 'Role'
    id = Column(Integer, primary_key=True)
    user_id = Column(Integer, ForeignKey('User.uid'))
    user = relationship(User, backref=backref('Role'))
    start = Column(DateTime, default=func.current_timestamp())
    end = Column(DateTime)
Now, I want to return entities via restful endpoints in flask. So, a get might look something like this in flask:
def get(self, uid=None) -> Tuple[Dict, int]:
    query = User.query
    if uid:
        query = query.filter_by(uid=uid)
    return create_response(
        query.all(),
        200
    )
Now, I want to restrict the Role entities returned as children to the User returned by the above query. Obviously, this could easily be done by just extending the query to filter the Roles. The problem comes when this scales up. Consider 100 nested levels of child relationships. Now consider restful endpoints providing a get for any one of them. It would be practically impossible to write out a query to properly filter every different level of child.
My desired solution was to define loading behavior on each entity, making everything composable. For example:
class User(base):
    __tablename__ = 'User'
    role = relationship("Role",
                        primaryjoin="and_(Role.start<={desired_timestamp}, "
                                    "Role.end>={desired_timestamp})")
The problem, of course, is that we don't know our desired_timestamp at class definition time as it is passed at runtime. I have thought of some hacks for this such as redefining everything during every runtime, but I'm not happy with them. Does anyone have some insight as to the "right" way to do something like this?
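One composable approach (assuming SQLAlchemy 1.4+, which this question may predate) is with_loader_criteria: it attaches a filter to every load of a given entity for one query, including relationship loads at any nesting depth, and the timestamp is supplied at runtime rather than at class-definition time. A minimal sketch with hypothetical User/Role models:

```python
import datetime

from sqlalchemy import Column, Integer, DateTime, ForeignKey, create_engine, select
from sqlalchemy.orm import declarative_base, relationship, Session, with_loader_criteria

Base = declarative_base()

class User(Base):
    __tablename__ = 'user'
    uid = Column(Integer, primary_key=True)
    roles = relationship('Role', backref='user')

class Role(Base):
    __tablename__ = 'role'
    id = Column(Integer, primary_key=True)
    user_id = Column(Integer, ForeignKey('user.uid'))
    start = Column(DateTime)
    end = Column(DateTime)

engine = create_engine('sqlite://')
Base.metadata.create_all(engine)
now = datetime.datetime.utcnow()
day = datetime.timedelta(days=1)

with Session(engine) as session:
    u = User(uid=1)
    u.roles = [
        Role(start=now - day, end=now + day),       # currently valid
        Role(start=now - 3 * day, end=now - 2 * day),  # expired
    ]
    session.add(u)
    session.commit()

with Session(engine) as session:
    # The criteria is applied once, at query time, and propagates to
    # every load of Role in this query, however deeply nested.
    stmt = select(User).options(
        with_loader_criteria(Role, (Role.start <= now) & (Role.end >= now))
    )
    user = session.execute(stmt).scalars().one()
    assert len(user.roles) == 1
```

Because the criteria is keyed to the entity rather than to one relationship, the same option covers every path to Role without writing a filter per nesting level.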
I'm building a web application in Python 3 using Flask & SQLAlchemy (via Flask-SQLAlchemy; with either MySQL or SQLite), and I've run into a situation where I'd like to reference a single property on my model class that encapsulates multiple columns in my database. I'm pretty well versed in MySQL, but this is my first real foray into SQLAlchemy beyond the basics. Reading the docs, scouring SO, and searching Google have led me to two possible solutions: Hybrid attributes (docs) or Composite columns (docs).
My question is what are the implications of using each of these, and which of these is the appropriate solution to my situation? I've included example code below that's a snippet of what I'm doing.
Background: I'm developing an application to track & sort photographs, and have a DB table in which I store the metadata for these photos, including when the picture was taken. Since photos are taken in a specific place, the taken date & time have an associated timezone. As SQL has a notoriously love/hate relationship with timezones, I've opted to record when the photo was taken in two columns: a datetime storing the date & time and a string storing the timezone name. (I'd like to sidestep the inevitable debate about how to store timezone aware dates & times in SQL, please.) What I would like is a single parameter on the model class that can I can use to get a proper python datetime object, and that I can also set like any other column.
Here's my table:
class Photo(db.Model):
__tablename__ = 'photos'
id = db.Column(db.Integer, primary_key=True)
...
taken_dt = db.Column(db.DateTime, nullable=False)
taken_tz = db.Column(db.String(64), nullable=False)
...
Here's what I have using a hybrid property (added to the above class; datetime/pytz code is pseudocode):
@hybrid_property
def taken(self):
    return datetime.datetime(self.taken_dt, self.taken_tz)

@taken.setter
def taken(self, dt):
    self.taken_dt = dt
    self.taken_tz = dt.tzinfo
From there I'm not exactly sure what else I need in the way of a @taken.expression or @taken.comparator, or why I'd choose one over the other.
Here's what I have using a composite column (again, added to the above class; datetime/pytz code is pseudocode):
taken = composite(DateTimeTimeZone._make, taken_dt, taken_tz)
class DateTimeTimeZone(object):
def __init__(self, dt, tz):
self.dt = dt
self.tz = tz
@classmethod
def from_db(cls, dt, tz):
return DateTimeTimeZone(dt, tz)
@classmethod
def from_dt(cls, dt):
return DateTimeTimeZone(dt, dt.tzinfo)
def __composite_values__(self):
return (self.dt, self.tz)
def value(self):
#This is here so I can get the actual datetime.datetime object
return datetime.datetime(self.dt, self.tz)
It would seem that this method has a decent amount of extra overhead, and I can't figure out a way to set it like I would any other column directly from a datetime.datetime object without instantiating the value object first using .from_dt.
Any guidance on if I'm going down the wrong path here would be welcome. Thanks!
TL;DR: Look into hooking up an AttributeEvent to your column and have it check for datetime instances which have a tz attribute set and then return a DateTimeTimeZone object. If you look at the SQLAlchemy docs for Attribute Events you can see that you can tell SQLAlchemy to listen to an attribute-set event and call your code on that. In there you can do any modification to the value being set as you like. You can't however access other attributes of the class at that time. I haven't tried this in combination with composites yet, so I don't know if this will be called before or after the type-conversion of the composite. You'd have to try.
edit: It's all about what you want to achieve, though. The AttributeEvent can help you with your data consistency, while the hybrid_property and friends will make querying easier for you. You should use each one for its intended use case.
More detailed discussion on the differences between the various solutions:
hybrid_attribute and composite are two completely different beasts. To understand hybrid_attribute one first has to understand what a column_property is and can do.
1) column_property
This one is placed on a mapper and can contain any selectable. So if you put a concrete sub-select into a column_property, you can access it read-only as if it were a concrete column. The calculation is done on the fly. You can even use it to search for entries. SQLAlchemy will construct the right select containing your sub-select for you.
Example:
class User(Base):
    id = Column(Integer, primary_key=True)
    first_name = Column(Unicode)
    last_name = Column(Unicode)
    name = column_property(first_name + ' ' + last_name)
    category = column_property(select([CategoryName.name])
                               .select_from(Category.__table__
                                            .join(CategoryName.__table__))
                               .where(Category.user_id == id))

db.query(User).filter(User.name == 'John Doe').all()
db.query(User).filter(User.category == 'Paid').all()
As you can see, this can simplify a lot of code, but one has to be careful to think of the performance implications.
2) hybrid_method and hybrid_attribute
A hybrid_attribute is just like a column_property but can call a different code-path when you are in an instance context. So you can have the selectable on the class level but a different implementation on the instance level. With a hybrid_method you can even parametrize both sides.
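A minimal sketch of that dual code path (assuming SQLAlchemy 1.4+; model fields follow the column_property example above):

```python
from sqlalchemy import Column, Integer, Unicode
from sqlalchemy.ext.hybrid import hybrid_property
from sqlalchemy.orm import declarative_base

Base = declarative_base()

class User(Base):
    __tablename__ = 'user'
    id = Column(Integer, primary_key=True)
    first_name = Column(Unicode)
    last_name = Column(Unicode)

    @hybrid_property
    def name(self):
        # Instance level: runs as plain Python on a loaded object.
        # Class level: SQLAlchemy evaluates the same expression against
        # the columns, so User.name == 'John Doe' renders as SQL concat.
        return self.first_name + ' ' + self.last_name

u = User(first_name='John', last_name='Doe')
assert u.name == 'John Doe'
```

Here one body serves both sides because string concatenation is valid in SQL and Python alike; when the two diverge, a separate @name.expression supplies the class-level SQL form.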
3) composite_attribute
This is what enables you to combine multiple concrete columns to a logical single one. You have to write a class for this logical column so that SQLAlchemy can extract the correct values from there and use it in the selects. This integrates neatly in the query framework and should not impose any additional problems. In my experience the use-cases for composite columns are rather rare. Your use-case seems fine. For modification of values you can always use AttributeEvents. If you want to have the whole instance available you'd have to have a MapperEvent called before flush. This certainly works, as I used this to implement a completely transparent Audit Trail tracking system which stored every value changed in every table in a separate set of tables.
I have following code:
class ArchaeologicalRecord(Base, ObservableMixin, ConcurrentMixin):
author_id = Column(Integer, ForeignKey('authors.id'))
author = relationship('Author', backref=backref('record'))
horizont_id = Column(Integer, ForeignKey('horizonts.id'))
horizont = relationship('Horizont', backref=backref('record'))
.....
somefield_id = Column(Integer, ForeignKey('somefields.id'))
somefield = relationship('SomeModel', backref=backref('record'))
At the moment I have one entry (an Author or Horizont or any other entity related to the archaeological record), and I want to ensure that no record holds a reference to it. But I hate to write a lot of code for each case and want to do it in the most generic way.
So, actually I have:
an instance of ArchaeologicalRecord
an instance of a child entity, for example, Horizont
(from the previous) its class definition
How do I check whether any ArchaeologicalRecord contains (or does not contain) a reference to the Horizont (or any other child entity) without writing a great chunk of copy-pasted code?
Are you asking how to find orphaned authors, horizonts, somefields, etc.?
Assuming all your relations are many-to-one (ArchaelogicalRecord-to-Author), you could try something like:
from sqlalchemy.orm.properties import RelationshipProperty
from sqlalchemy.orm import class_mapper
session = ... # However you setup the session
# ArchaelogicalRecord will have various properties defined,
# some of these are RelationshipProperties, which hold the info you want
for rp in class_mapper(ArchaeologicalRecord).iterate_properties:
    if not isinstance(rp, RelationshipProperty):
        continue
    query = session.query(rp.mapper.class_)\
        .filter(~getattr(rp.mapper.class_, rp.backref[0]).any())
    orphans = query.all()
    if orphans:
        # Do something...
        print rp.mapper.class_
        print orphans
This will fail when rp.backref is None (i.e. where you've defined a relationship without a backref) - in this case you'd probably have to construct the query a bit more manually, but the RelationshipProperty, and its .mapper and .mapper.class_ attributes, should get you all the info you need to do this in a generic way.
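For the no-backref case, the same orphan check can be phrased as an anti-join on the foreign-key column. A self-contained sketch (assuming SQLAlchemy 1.4+; the two models are cut down to the minimum from the question):

```python
from sqlalchemy import Column, Integer, ForeignKey, create_engine, select
from sqlalchemy.orm import declarative_base, relationship, Session

Base = declarative_base()

class Horizont(Base):
    __tablename__ = 'horizonts'
    id = Column(Integer, primary_key=True)

class ArchaeologicalRecord(Base):
    __tablename__ = 'records'
    id = Column(Integer, primary_key=True)
    horizont_id = Column(Integer, ForeignKey('horizonts.id'))
    horizont = relationship('Horizont')

engine = create_engine('sqlite://')
Base.metadata.create_all(engine)

with Session(engine) as session:
    used, orphan = Horizont(), Horizont()
    session.add_all([used, orphan, ArchaeologicalRecord(horizont=used)])
    session.commit()

    # Horizonts that no record points to; NULL FKs are excluded from
    # the subquery so NOT IN behaves as expected.
    referenced = select(ArchaeologicalRecord.horizont_id).where(
        ArchaeologicalRecord.horizont_id.is_not(None)
    )
    orphans = session.query(Horizont).filter(
        ~Horizont.id.in_(referenced)
    ).all()
    assert orphans == [orphan]
```

The FK column name can be pulled generically from rp.local_columns, so this composes with the iterate_properties loop above without hand-writing a query per relationship.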
class Quote(db.Model):
id = db.Column(db.Integer, primary_key=True)
content = db.Column(db.Text)
votes = db.Column(db.Integer)
author_id = db.Column(db.Integer, db.ForeignKey('author.id'))
date_added = db.Column(db.DateTime, default=datetime.datetime.now)
last_letter = db.Column(db.String(1))
I have a Model that looks like the above. I want last_letter to be the last letter of whatever the content is. Where should I place this logic so that it will occur every time a model is saved? I'm reading about Hybrid Properties and stuff and I'm not sure which way is the correct one to go.
1. The naive way: you can use a SQLAlchemy column default to set something like:
last_letter = db.Column(db.String(1), default=content[-1:])
I didn't check whether that would actually work; I'd guess not, since a simple column default can't see the other columns' values.
2. You can also add an __init__ to the class:
def __init__(self, id, content, votes, author_id, date_added):
    self.id = id
    self.content = content
    # yadda yadda etc
    self.last_letter = content[-1:]  # or something similar
Or you could listen to the "before insert" event and set it dynamically, as explained here.
3. You can use an SQL computed column with an SQL trigger (in the DB), without SQLAlchemy.
4. You can probably use a SQLAlchemy mapper SQL expression as a hybrid property. I didn't try that myself either, but it looks simple enough and is probably the most elegant way to do this.
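The event-listener route can be sketched end to end like this (assuming SQLAlchemy 1.4+, with the model trimmed to the relevant columns):

```python
from sqlalchemy import Column, Integer, Text, String, create_engine, event
from sqlalchemy.orm import declarative_base, Session

Base = declarative_base()

class Quote(Base):
    __tablename__ = 'quote'
    id = Column(Integer, primary_key=True)
    content = Column(Text)
    last_letter = Column(String(1))

@event.listens_for(Quote, 'before_insert')
def quote_set_last_letter(mapper, connection, target):
    # Derive last_letter from content just before each INSERT,
    # so it stays correct no matter how the object was constructed.
    target.last_letter = target.content[-1] if target.content else None

engine = create_engine('sqlite://')
Base.metadata.create_all(engine)

with Session(engine) as session:
    q = Quote(content='Hello world')
    session.add(q)
    session.commit()
    assert q.last_letter == 'd'
```

A 'before_update' listener with the same body would keep the column in sync when content changes after the initial save.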
last_letter could be decorated with @property and defined:
@property
def last_letter(self):
    return self.content[-1]
Disclaimer: I just learned how to use decorators and am using them everywhere
I have a simple "Invoices" class with a "Number" attribute that has to
be assigned by the application when the user saves an invoice. There
are some constraints:
1) the application is a (thin) client-server one, so whatever
assigns the number must look out for collisions
2) Invoices has a "version" attribute too, so I can't use a simple
DBMS-level autoincrementing field
I'm trying to build this using a custom Type that would kick in every
time an invoice gets saved. Whenever process_bind_param is called with
a None value, it will call a singleton of some sort to determine the
number and avoid collisions. Is this a decent solution?
Anyway, I'm having a problem.. Here's my custom Type:
class AutoIncrement(types.TypeDecorator):
    impl = types.Unicode

    def copy(self):
        return AutoIncrement()

    def process_bind_param(self, value, dialect):
        if not value:
            # Must find next autoincrement value
            value = "1"  # Test value :)
        return value
My problem right now is that when I save an Invoice and AutoIncrement
sets "1" as value for its number, the Invoice instance doesn't get
updated with the new number.. Is this expected? Am I missing
something?
Many thanks for your time!
(SQLA 0.5.3 on Python 2.6, using postgreSQL 8.3)
Edit: Michael Bayer told me that this behaviour is expected, since TypeDecorators don't deal with default values.
Is there any particular reason you don't just use a default= parameter in your column definition? (This can be an arbitrary Python callable).
def generate_invoice_number():
    # special logic to generate a unique invoice number
    ...

class Invoice(DeclarativeBase):
    __tablename__ = 'invoice'
    number = Column(Integer, unique=True, default=generate_invoice_number)
    ...
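Unlike the TypeDecorator approach, a Python-side default is applied before the INSERT and populates the instance, so the new number is visible on the object after the flush. A runnable sketch (the counter is a hypothetical stand-in for the collision-safe generator; assuming SQLAlchemy 1.4+):

```python
import itertools

from sqlalchemy import Column, Integer, create_engine
from sqlalchemy.orm import declarative_base, Session

Base = declarative_base()
_counter = itertools.count(1)

def generate_invoice_number():
    # Hypothetical stand-in: the real implementation would consult
    # whatever singleton/service guards against collisions.
    return next(_counter)

class Invoice(Base):
    __tablename__ = 'invoice'
    id = Column(Integer, primary_key=True)
    number = Column(Integer, unique=True, default=generate_invoice_number)

engine = create_engine('sqlite://')
Base.metadata.create_all(engine)

with Session(engine) as session:
    inv = Invoice()
    session.add(inv)
    session.commit()
    assert inv.number == 1
```

The default callable may also accept an ExecutionContext argument if it needs access to the statement being executed.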