I'm using sqlalchemy and am trying to integrate alembic for database migrations.
My database currently exists and has a number of ForeignKeys defined without names. I would like to add a naming convention to allow for migrations that affect ForeignKey columns.
I've added the naming convention given here to the top of my models.py file:
SQLAlchemy Naming Constraints
convention = {
"ix": 'ix_%(column_0_label)s',
"uq": "uq_%(table_name)s_%(column_0_name)s",
"ck": "ck_%(table_name)s_%(constraint_name)s",
"fk": "fk_%(table_name)s_%(column_0_name)s_%(referred_table_name)s",
"pk": "pk_%(table_name)s"
}
DeclarativeBase = declarative_base()
DeclarativeBase.metadata = MetaData(naming_convention=convention)
def db_connect():
return create_engine(URL(**settings.DATABASE))
def create_reviews_table(engine):
DeclarativeBase.metadata.create_all(engine)
class Review(DeclarativeBase):
__tablename__ = 'reviews'
id = Column(Integer, primary_key=True)
review_id = Column('review_id', String, primary_key=True)
resto_id = Column('resto_id', Integer, ForeignKey('restaurants.id'),
nullable=True)
url = Column('url', String),
resto_name = Column('resto_name', String)
I've set up alembic/env.py as per the tutorial instructions, feeding my model's metadata into target_metadata.
When I run
$: alembic current
I get the following error:
sqlalchemy.exc.InvalidRequestError: Naming convention including %(constraint_name)s token requires that constraint is explicitly named.
In the docs they say that "This same feature [generating names for columns using a naming convention] takes effect even if we just use the Column.unique flag:" 1, so I'm thinking that there shouldn't be a problem (they go on to give an example using a ForeignKey that isn't named too).
Do I need to go back and give all my constraints explicit names, or is there a way to do it automatically?
just modify th "ck" in convention to "ck": "ck_%(table_name)s_%(column_0_name)s"。it works for me .
refer to see sqlalchemy docs
What this error message is telling you is that you should name constraints explicitly. The constraints it's referring to are Boolean, Enum etc but not foreignkeys nor primary keys.
So go through your table, wherever you have a Boolean or Enum add a name to it. For example:
is_active = Column(Boolean(name='is_active'))
That's what you need to do.
This does not aim to be definitive answer and also fails to answer your immediate technical question, but could it be a "philosophical problem"? Either your SQLAlchemy code is the source of truth as far as the database is concerned, or the RDMS is the source. In front of this a mixed situation, where each of the two have part of it, I would see two avenues:
The one that you are exploring: you modify the database's schema to match the SQLAlchemy model and you make your Python code the master. This is the most intuitive, but this may not always be possible, both for technical and administrative reasons.
Accepting that the RDMS has info that SQLAlchemy doesn't have, but is fortunately not relevant for day-to-day work. Your best chance is to use another migration tool (ETL) that will reverse engineer the database before migrating it. After the migration is complete you could give back control of the new instance to SQLAlchemy (which may require some adjustments to the new DB or to the model).
There is no way to tell which approach will work, since both have their own challenges. But I would give some thought to the second method?
I've had some luck altering naming_convention back to {} in each older migration so that they run with the correct historical context.
Still entirely unsure what kind of interesting side-effects this might have.
Related
I'm using SQLAlchemy with Python (linking to an MySQL database) and am having a little design issue, which has presented itself as a problem with a backref.
So the situation is this. I have a SearchGroup which contains TargetObjects and SearchObjects. These are both many to many relationships, and so the SearchGroup table comes with two association tables, one for each. The SearchObject is the same time for any SearchGroup, but the TargetObject varies. So far so good. The whole idea here is that a SearchObject is simply a string with a few other variables, and a SearchGroup compares them all to a given string and then, if there's a match, supplies the target objects.
Now for some code: the declaration of these three classes, although with the parent logic hidden for brevity:
class AssocTable_GroupCMClassesGrades_Grades(AssociationTable_Group_TargetObjectsParent, med.DeclarativeBase):
__tablename__ = 'AssocTable_GroupCMClassesGrades_Grades'
_groupsTableName = 'SearchGroup_CMClasses_Grades'
_targetObjectsTableName = 'Grades'
class AssocTable_GroupCMClassesGrades_SearchObjects(AssociationTable_Group_SearchObjectsParent, med.DeclarativeBase):
__tablename__ = 'AssocTable_GroupCMClassesGrades_SearchObjects'
_groupsTableName = 'SearchGroup_CMClasses_Grades'
_searchObjectsTableName = 'SearchObjects'
class SearchGroup_CMClasses_Grades(SearchObjectGroupParent, med.DeclarativeBase):
__tablename__ = 'SearchGroup_CMClasses_Grades'
targetAssociatedTargetObjectTableName = 'AssocTable_GroupCMClassesGrades_Grades'
targetAssociatedSearchObjectTableName = 'AssocTable_GroupCMClassesGrades_SearchObjects'
targetClassName = 'Grade'
myClassName = 'SearchGroup_CMClasses_Grades'
searchObjectClassName = 'SearchObject'
searchObjectChildrenBackRefName = 'Groups'
The top two are the association tables and the bottom is the main class. The strings are used to set up various foreign keys and relationships and such.
Let's look at a specific example, which is crucial to the question:
#declared_attr
def searchObject_childen(cls):
return relationship(f'{cls.searchObjectClassName}', secondary=f'{cls.targetAssociatedSearchObjectTableName}', backref=f'{cls.searchObjectChildrenBackRefName}')
This is inside the SearchObjectGroupParent class and, as you can see, is for the 'children' of the SearchGroup, which are SearchObjects.
So now to the problem.
That all works rather well, except for one thing. If I could direct your attention back to the large bit of code above, and to this line:
searchObjectChildrenBackRefName = 'Groups'
This, as seen in the second posted piece of code (the declared_attr one), sets up a backref; a property in the target - it creates that property and then populates it. I'm not an expert at this by any means so I won't pretend to be. The point is this: if I create another SearchObjectGroupParent derived class, like the one above, with its association tables, I can't put another 'Groups' property into SearchObject - in fact it will throw an error telling me as much:
sqlalchemy.exc.ArgumentError: Error creating backref 'Groups' on relationship 'SearchGroup_CMClasses_Grades.searchObject_childen': property of that name exists on mapper 'mapped class SearchObject->SearchObjects'
There is a rather unsatisfying way to solve this, which is to simple change that name each time, but then the SearchObject won't have a common list of SearchGroups. In fact it will contain the 'Groups' property for every SearchGroup. This will work, but will be messy and I'd rather not do it. What I would like is to say 'okay, if this backref already exists, just use that one'. I don't know if that's possible, but I think such a thing would solve my problem.
Edit: I thought an image might help explain better:
Figure 1: what I have now:
The more of these objects derived from SearchObjectsGroupParent I have, the messier it will be (SearchObject will contain Groups03, Groups04, Groups05, etc.).
Figure 2: what I want:
I'm building a web application in Python 3 using Flask & SQLAlchemy (via Flask-SQLAlchemy; with either MySQL or SQLite), and I've run into a situation where I'd like to reference a single property on my model class that encapsulates multiple columns in my database. I'm pretty well versed in MySQL, but this is my first real foray into SQLAlchemy beyond the basics. Reading the docs, scouring SO, and searching Google have led me to two possible solutions: Hybrid attributes (docs) or Composite columns (docs).
My question is what are the implications of using each of these, and which of these is the appropriate solution to my situation? I've included example code below that's a snippet of what I'm doing.
Background: I'm developing an application to track & sort photographs, and have a DB table in which I store the metadata for these photos, including when the picture was taken. Since photos are taken in a specific place, the taken date & time have an associated timezone. As SQL has a notoriously love/hate relationship with timezones, I've opted to record when the photo was taken in two columns: a datetime storing the date & time and a string storing the timezone name. (I'd like to sidestep the inevitable debate about how to store timezone aware dates & times in SQL, please.) What I would like is a single parameter on the model class that can I can use to get a proper python datetime object, and that I can also set like any other column.
Here's my table:
class Photo(db.Model):
__tablename__ = 'photos'
id = db.Column(db.Integer, primary_key=True)
...
taken_dt = db.Column(db.datetime, nullable=False)
taken_tz = db.Column(db.String(64), nullable=False)
...
Here's what I have using a hybrid parameter (added to the above class, datetime/pytz code is psuedocode):
#hybrid_parameter
def taken(self):
return datetime.datetime(self.taken_dt, self.taken_tz)
#taken.setter(self, dt):
self.taken_dt = dt
self.taken_tz = dt.tzinfo
From there I'm not exactly sure what else I need in the way of a #taken.expression or #taken.comparator, or why I'd choose one over the other.
Here's what I have using a composite column (again, added to the above class, datetime/pytz code is psuedocode):
taken = composite(DateTimeTimeZone._make, taken_dt, taken,tz)
class DateTimeTimeZone(object):
def __init__(self, dt, tz):
self.dt = dt
self.tz = tz
#classmethod
def from_db(cls, dt, tz):
return DateTimeTimeZone(dt, tz)
#classmethod
def from_dt(cls, dt):
return DateTimeTimeZone(dt, dt.tzinfo)
def __composite_values__(self):
return (self.dt, self.tz)
def value(self):
#This is here so I can get the actual datetime.datetime object
return datetime.datetime(self.dt, self.tz)
It would seem that this method has a decent amount of extra overhead, and I can't figure out a way to set it like I would any other column directly from a datetime.datetime object without instantiating the value object first using .from_dt.
Any guidance on if I'm going down the wrong path here would be welcome. Thanks!
TL;DR: Look into hooking up an AttributeEvent to your column and have it check for datetime instances which have a tz attribute set and then return a DateTimeTimeZone object. If you look at the SQLAlchemy docs for Attribute Events you can see that you can tell SQLAlchemy to listen to an attribute-set event and call your code on that. In there you can do any modification to the value being set as you like. You can't however access other attributes of the class at that time. I haven't tried this in combination with composites yet, so I don't know if this will be called before or after the type-conversion of the composite. You'd have to try.
edit: Its all about what you want to achieve though. The AttributeEvent can help you with your data consistency, while the hybrid_property and friends will make querying easier for you. You should use each one for it's intended use-case.
More detailed discussion on the differences between the various solutions:
hybrid_attribute and composite are two completely different beasts. To understand hybrid_attribute one first has to understand what a column_property is and can do.
1) column_property
This one is placed on a mapper and can contain any selectable. So if you put an concrete sub-select into a column_property you can access it read-only as if it were a concrete column. The calculation is done on the fly. You can even use it to search for entries. SQLAlchemy will construct the right select containing your sub-select for you.
Example:
class User(Base):
id = Column(Integer, primary_key=True)
first_name = Column(Unicode)
last_name = Column(Unicode)
name = column_property(first_name + ' ' + last_name)
category = column_property(select([CategoryName.name])
.select_from(Category.__table__
.join(CategoryName.__table__))
.where(Category.user_id == id))
db.query(User).filter(User.name == 'John Doe').all()
db.query(User).filter(User.category == 'Paid').all()
As you can see, this can simplify a lot of code, but one has to be careful to think of the performance implications.
2) hybrid_method and hybrid_attribute
A hybrid_attribute is just like a column_property but can call a different code-path when you are in an instance context. So you can have the selectable on the class level but a different implementation on the instance level. With a hybrid_method you can even parametrize both sides.
3) composite_attribute
This is what enables you to combine multiple concrete columns to a logical single one. You have to write a class for this logical column so that SQLAlchemy can extract the correct values from there and use it in the selects. This integrates neatly in the query framework and should not impose any additional problems. In my experience the use-cases for composite columns are rather rare. Your use-case seems fine. For modification of values you can always use AttributeEvents. If you want to have the whole instance available you'd have to have a MapperEvent called before flush. This certainly works, as I used this to implement a completely transparent Audit Trail tracking system which stored every value changed in every table in a separate set of tables.
How can I make a relationship without having a foreign key?
#declared_attr
def custom_stuff(cls):
joinstr = 'foreign(Custom.name) == "{name}"'.format(name=cls.__name__)
return db.relationship('Custom', primaryjoin=joinstr)
This raises an error:
ArgumentError: Could not locate any simple equality expressions involving locally mapped foreign key columns for primary join condition
This works, but I think it's a pretty ugly hack.
#declared_attr
def custom_stuff(cls):
joinstr = 'or_(
and_(foreign(Custom.name) == MyTable.title,
foreign(Custom.name) != MyTable.title),
foreign(Custom.name) == "{name}")'.format(name=cls.__name__)
return db.relationship('Custom', primaryjoin=joinstr)
Is there a better way to do this?
EDIT: the extra attribute needs to be added as #declared_attr and has to use a relationship, since our serializer is written so it works with declarred attrs.
Doing this with #hybrid_property or something else would work, but then our json serializer would break. Getting that to work seems harder than defining a relationship.
You don't necessarily have to define relationships when creating tables in a database (this applies for almost every SQL). You can still join tables that don't have a foreign-key relation predefined (the main reason for foreign keys is to enforce data consistency, not to define what you can join or not).
See this for reference - http://docs.sqlalchemy.org/en/latest/orm/query.html#sqlalchemy.orm.query.Query.join
If you would still like to "show" the database and table structure/model, then better use some entity relationship modeler like ERwin (or some diagramming software).
Maybe you should point to the meaning of "relationship" word. It states that some thing "depends" on some other thing or both depend each other.
If you're trying to define a relationship on the ERM (Entity Relationship Model) that means you should state which one of the entities "depends" on the other one. Also, databases have some hacks to deal faster with tables relating each other than just simple tables that are related on an upper abstract way.
Is there any reason why you need to do that?
I have a legacy database with non-django naming conventions. If I have the following (cut down) models:
class Registration(models.Model):
projectId=models.IntegerField(primary_key=True)
class Application(models.Model):
applicationId=models.IntegerField(primary_key=True)
registration=models.ForeignKey(Registration,db_column='projectId')
The ForeignKey instance causes a property to be created on Application called registration_id, but this is neither the correct name for the field (I have a hack to fix this), nor is it able to be used in a QuerySet.
Is there some way of using the id field provided by the ForeignKey on the Application model, rather than having to reference it via Registration?
Ie. I write lots of code like:
Application.objects.get(projectId=1234)
And don't want to have to write it out as:
Application.objects.get(registration__projectId=1234)
or even
Application.objects.get(registration__pk=1234)
I'm slightly surprised that:
Application.objects.get(registration_id=1234)
doesn't work...
Also note, I tried defining the id column as a field as well as the foreignkey which worked for queryset, but inserts complain of trying to insert into the same column twice:
class Application(models.Model):
...
projectId=models.IntegerField()
...
Have you tried this?
Application.objects.get(registration=1234)
I think just doing Application.objects.registration.get(projectId=1234) should do what you want.
I'm using SA 0.6.6, Declarative style, against Postgres 8.3, to map Python objects to a database. I have a table that is self referencing and I'm trying to make a relationship property for it's children. No matter what I try, I end up with a NoReferencedTableError.
My code looks exactly like the sample code from the SA website for how to do this very thing.
Here's the class.
class FilterFolder(Base):
__tablename__ = 'FilterFolder'
id = Column(Integer,primary_key=True)
name = Column(String)
isShared = Column(Boolean,default=False)
isGlobal = Column(Boolean,default=False)
parentFolderId = Column(Integer,ForeignKey('FilterFolder.id'))
childFolders = relationship("FilterFolder",
backref=backref('parentFolder', remote_side=id)
)
Here's the error I get:
NoReferencedTableError: Foreign key assocated with column 'FilterFolder.parentFolderId' could not find table 'FilterFolder' with which to generate a foreign key to target column 'id'
Any ideas what I'm doing wrong here?
This was a foolish mistake on my part. I typically specify my FK's by specifying the Entity type, not the string. I am using different schemas, so when defining the FK entity as a string I also need the schema.
Broken:
parentFolderId = Column(Integer,ForeignKey('FilterFolder.id'))
Fixed:
parentFolderId = Column(Integer,ForeignKey('SchemaName.FilterFolder.id'))
I checked your code with SQLAlchemy 0.6.6 and sqlite. I was able to create the tables, add a parent and child combination, and retrieve them again using a session.query.
As far as I can tell, the exception you mentioned (NoReferencedTableError) is thrown in schema.py (in the SQLAlchemy source) exclusively, and is not database specific.
Some questions: Do you see the same bug if you use an sqlite URL instead of the Postgres one? How are you creating your schema? Mine looks something like this:
engine = create_engine(db_url)
FilterFolder.metadata.create_all(self.dbengine)