This question is about how to design a SQL relationship. I am fairly new to this, and I'd like to hear from people with (far) more experience...
I am currently migrating a ZopeDB (Object oriented) database to MySQL (relational) using MeGrok and SqlAlchemy (although I don't think that's really too relevant, since my question is more about designing a relationship in a relational database).
I have two classes related like this:
class Child(object):
    def __init__(self):
        self.field1 = "hello world"
class Parent(object):
    def __init__(self):
        self.child1 = Child()
        self.child2 = Child()
The "Parent" class has two different instances of a Child() class. I am not even sure about how to treat this (two different 1:1 relationships or a 1:2 relationship).
Currently, I have this:
class Child(rdb.Model):
    rdb.metadata(metadata)
    rdb.tablename("children_table")
    id = Column("id", Integer, primary_key=True)
    field1 = Column("field1", String(64))  # Irrelevant
    def __init__(self):
        self.field1 = "hello world"
class Parent(rdb.Model):
    rdb.metadata(metadata)
    rdb.tablename("parent_table")
    id = Column("id", Integer, primary_key=True)
    child1_id = Column("child_1_id", Integer, ForeignKey("children_table.id"))
    child2_id = Column("child_2_id", Integer, ForeignKey("children_table.id"))
    child1 = relationship(Child,
        primaryjoin = ("parent_table.child1_id == children_table.id")
    )
    child2 = relationship(Child,
        primaryjoin = ("parent_table.child2_id == children_table.id")
    )
Meaning... OK, I store the two "children" ids as foreign keys in the Parent and retrieve the children themselves using that information.
This is working fine, but I don't know if it's the most proper solution.
Or I could do something like:
class Child(rdb.Model):
    rdb.metadata(metadata)
    rdb.tablename("children_table")
    id = Column("id", Integer, primary_key=True)
    parent_id = Column("parent_id", Integer, ForeignKey("parent_table.id"))  # New!
    type = Column("type", SmallInteger)  # New!
    field1 = Column("field1", String(64))  # Irrelevant
    def __init__(self):
        self.field1 = "hello world"
class Parent(rdb.Model):
    rdb.metadata(metadata)
    rdb.tablename("parent_table")
    id = Column("id", Integer, primary_key=True)
    child1 = relationship(
        # Well... this I still don't know how to write it down,
        # but it would be something like:
        #   Give me all the children whose "parent_id" is my own "id"
        #   AND their type == 1
        # I'll deal with the joins and the actual implementation depending
        # on your answer, guys
    )
    child2 = relationship(
        # Would be same as above
        # but selecting children whose type == 2
    )
This may be good for adding new children to the parent class... If I add a "Parent.child3", I just need to create a new relationship very similar to the already existing ones.
The way I have it now would imply creating a new relationship AND adding a new foreign key to the parent.
Also, having a "parent" table with a bunch of foreign keys may not make it the best "parent" table in the world, right?
I'd like to know what people that know much more about databases think :)
Thank you.
PS: Related post? Question 3998545
Expanded in Response to Comments
The issue is, you are thinking in the terms that you know (understandable), and you have the limitations of an OO database ... which would not be good to carry over into the Relational db. So for many reasons, it is best to simply identify the Entities and Relations, and to Normalise them. The method you use to call is easy to change and you will not be limited to only what you have now.
There are some good answers here, but even those are limited and incomplete. If you Normalise Parent and Child (being people, they will have many common columns), you get Person, with no duplicated columns.
People have "upward" relations to other people, their Parents, but that is context, not the fact that the Parent exists as a Person first (and you can have more than two if you like). People also have "downward" relations to their Children, also contextual. The limitation of two children per Parent is absurd (you may have to inspect your methods/classes: I suspect one is an "upward" navigation and the other is "downward"). And you do not want to store the relations as duplicates (once that Fred is a father of Sally; twice that Sally is a child of Fred): that single fact exists in a single row, which can be interpreted Parent⇢Child or Parent⇠Child.
This requirement has come up in many questions, therefore I am using a single generic, but detailed, illustration. The model defines any tree structure that needs to be walked up or down, handled by simple recursion. It is called a Bill of Materials structure, originally created for inventory control systems, and can be applied to any tree structure requirement. It is Fifth Normal Form; no duplicate columns; no Update Anomalies.
Bill of Materials
For Assemblies and Components, which would have many common columns, they are Normalised into Part; whether they are Assemblies or Components is contextual, and these contextual columns are located in the Associative (many-to-many) table.
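To make the idea concrete, here is a minimal sketch of the same structure applied to people, using Python's standard-library sqlite3 purely for illustration. The table names (person, parent_child), the column names and the sample rows are my assumptions, not part of the model above. Each parent/child fact is stored once, in the associative table, and is read either "downward" or "upward".

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE person (
        person_id INTEGER PRIMARY KEY,
        name      TEXT NOT NULL
    );
    -- Associative table: exactly one row per parent/child fact.
    CREATE TABLE parent_child (
        parent_id INTEGER NOT NULL REFERENCES person(person_id),
        child_id  INTEGER NOT NULL REFERENCES person(person_id),
        PRIMARY KEY (parent_id, child_id)
    );
""")
conn.executemany("INSERT INTO person VALUES (?, ?)",
                 [(1, "Fred"), (2, "Sally"), (3, "Tom")])
# Hypothetical sample data: Fred is a parent of Sally, Sally of Tom.
conn.executemany("INSERT INTO parent_child VALUES (?, ?)", [(1, 2), (2, 3)])

def children_of(person_id):
    """Read the relation 'downward'."""
    rows = conn.execute(
        "SELECT p.name FROM parent_child pc "
        "JOIN person p ON p.person_id = pc.child_id "
        "WHERE pc.parent_id = ? ORDER BY p.person_id", (person_id,))
    return [r[0] for r in rows]

def parents_of(person_id):
    """Read the same rows 'upward'."""
    rows = conn.execute(
        "SELECT p.name FROM parent_child pc "
        "JOIN person p ON p.person_id = pc.parent_id "
        "WHERE pc.child_id = ? ORDER BY p.person_id", (person_id,))
    return [r[0] for r in rows]

def descendants_of(person_id):
    """Walk the whole tree below a person with a recursive query."""
    rows = conn.execute("""
        WITH RECURSIVE down(person_id) AS (
            SELECT child_id FROM parent_child WHERE parent_id = ?
            UNION
            SELECT pc.child_id FROM parent_child pc
            JOIN down d ON pc.parent_id = d.person_id
        )
        SELECT p.name FROM down d
        JOIN person p ON p.person_id = d.person_id
        ORDER BY p.person_id
    """, (person_id,))
    return [r[0] for r in rows]
```

With this layout a third child, or a second parent, is just another row in parent_child; no column has to be added anywhere.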
Two Relations 1:1 or one 1:2 ?
Actually, it is two times 1::n.
Ordinals, or Ranking, are explicit in the Primary Key (chronological order). If some other ordinal is required, simply add a column to the Associative table. Better yet, it is truly a derived column, so compute it at runtime from current values.
I'll admit that I'm not too familiar with object databases, but in relational terms this is a straightforward one-to-many (optional) relationship.
create table parent (
    id int PK,
    otherField whatever
)
create table child (
    id int PK,
    parent_id int Fk,
    otherField whatever
)
Obviously, that's not usable code as it stands....
I think this is similar to your second example. If you need to track the ordinal position of the children in their relationships to the parent, you'd add a column to the child table such as:
create table child (
    id int PK,
    parent_id int Fk,
    birth_order int,
    otherField whatever
)
You'd have to be responsible for managing that field at the application level; it's not something you can expect the DBMS to do for you.
I called it an optional relationship on the assumption that childless parents can exist--if that's not true, it becomes a required relationship logically, though you'd still have to let the DBMS create a new parent record childlessly, then grab its id to create the child--and once again manage the requirement at the application level.
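For what it's worth, here is a runnable sketch of the tables above, using Python's standard-library sqlite3 rather than any particular ORM. The name column and the helper functions add_parent and add_child are made up for the example. It shows the two points made above: a parent can be created childlessly, and birth_order is worked out by the application, not by the DBMS.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE parent (
        id   INTEGER PRIMARY KEY,
        name TEXT
    );
    CREATE TABLE child (
        id          INTEGER PRIMARY KEY,
        parent_id   INTEGER NOT NULL REFERENCES parent(id),
        birth_order INTEGER NOT NULL,
        name        TEXT
    );
""")

def add_parent(name):
    # The parent row is created first, with no children yet.
    cur = conn.execute("INSERT INTO parent (name) VALUES (?)", (name,))
    return cur.lastrowid

def add_child(parent_id, name):
    # Application-level bookkeeping: next ordinal = current maximum + 1.
    (next_order,) = conn.execute(
        "SELECT COALESCE(MAX(birth_order), 0) + 1 FROM child WHERE parent_id = ?",
        (parent_id,)).fetchone()
    conn.execute(
        "INSERT INTO child (parent_id, birth_order, name) VALUES (?, ?, ?)",
        (parent_id, next_order, name))
    return next_order

pid = add_parent("Pat")
childless = add_parent("Sam")   # allowed: the relationship is optional
add_child(pid, "first")
add_child(pid, "second")

ordered = conn.execute(
    "SELECT name, birth_order FROM child WHERE parent_id = ? ORDER BY birth_order",
    (pid,)).fetchall()
```

If two writers can add children to the same parent at once, the max-plus-one step would need to run inside a single transaction; that detail is left out here.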
This is probably a little out of context, since I use none of the things you've mentioned - but as far as the general design goes, here are a couple ideas:
Keep relationships based on common types: has_one, has_many, belongs_to, has_and_belongs_to_many.
With children, it's better to not specify N number of children explicitly; either there are none, one, or there could potentially be many. Thus your model declarations of child1 and child2 would be replaced by a single property - an array containing children.
To be totally honest, I don't know how well that fits in with what you're using. However, that's generally how relationships work in an ORM sense. So, based on this:
If a model belongs to another (it has a foreign key for another table), it would have a parent [sic] property with a reference to the parent object
If a model has one model that belongs to it (the other model has a foreign key to the first model's table), it would have a child [sic] property with a reference to the child object
If a model has many models that belong to it (many other models have foreign keys to the first model's table), it would have a children [sic] property that is an array of references to child objects
If a model has and belongs to many other models... you might want to consider using both parents and children properties, or something similar; nomenclature is less important than you having access to a group of models that it belongs to, and another group of models that belong to it.
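In plain Python, with no ORM involved, the has_many / belongs_to shape described above might look roughly like this. The add_child method is a hypothetical helper; an ORM would normally keep the two sides in sync for you.

```python
class Child:
    def __init__(self, field1="hello world"):
        self.field1 = field1
        self.parent = None          # belongs_to: reference to the owning Parent


class Parent:
    def __init__(self):
        self.children = []          # has_many: zero, one or many children

    def add_child(self, child):
        # Keep both sides of the relationship consistent.
        child.parent = self
        self.children.append(child)
        return child


parent = Parent()
first = parent.add_child(Child())
second = parent.add_child(Child("second child"))
```

What used to be parent.child1 and parent.child2 becomes parent.children[0] and parent.children[1], and a third child needs no new attribute.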
Sorry if that's totally unhelpful, but it might shed some light. HTH.
Related
I have a complex model. Let's say it contains 100 entities, all of which are related to each other in some way. Some are many to many, some are one to one, some are many to one, and so on.
These entities all have start and end timestamps indicating valid time ranges. When loading these entities via query, I wish to populate the relationship fields only with entities that have start and end stamps wrapping a given timestamp: for example datetime.now(), or yesterday, or whenever.
I'll define two models here for example, but assume there are a vast number of others:
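class User(base):
    __tablename__ = 'User'
    uid = Column(Integer, primary_key=True)  # referenced by Role.user_id below
class Role(base):
    __tablename__ = 'Role'
    id = Column(Integer, primary_key=True)  # assumed surrogate key; a mapped class needs a primary key
    user_id = Column(Integer, ForeignKey('User.uid'))
    user = relationship(User, backref=backref('Role'))
    start = Column(DateTime, default=func.current_timestamp())
    end = Column(DateTime)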
Now, I want to return entities via restful endpoints in flask. So, a get might look something like this in flask:
def get(self, uid=None) -> Tuple[Dict, int]:
    query = User.query
    if uid:
        # filter_by() returns a new query, so keep the result
        query = query.filter_by(uid=uid)
    return create_response(
        query.all(),
        200
    )
Now, I want to restrict the Role entities returned as children to the User returned by the above query. Obviously, this could easily be done by just extending the query to filter the Roles. The problem comes when this scales up. Consider 100 nested levels of child relationships. Now consider restful endpoints providing a get for any one of them. It would be practically impossible to write out a query to properly filter every different level of child.
My desired solution was to define loading behavior on each entity, making everything composable. For example:
class User(base):
    __tablename__ = 'User'
    role = relationship("Role",
                        primaryjoin="and_(Role.start<={desired_timestamp}, "
                                    "Role.end>={desired_timestamp})")
The problem, of course, is that we don't know our desired_timestamp at class definition time as it is passed at runtime. I have thought of some hacks for this such as redefining everything during every runtime, but I'm not happy with them. Does anyone have some insight as to the "right" way to do something like this?
I'm attempting to make my first app with flask. This is my node class as of now:
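from sqlalchemy.orm.collections import attribute_mapped_collection
class TreePage(db.Model):
    __tablename__ = 'tree'
    id = db.Column(db.Integer, primary_key=True)
    name = db.Column(db.String(100), nullable=False)
    content = db.Column(db.String)
    # A column (not db.create_all) holding the id of this row's parent in the same table
    parent_id = db.Column(db.Integer, db.ForeignKey('tree.id'))
    # The relationship targets this same class, so the names must match the class name
    children = db.relationship('TreePage', cascade="all",
        backref=db.backref("parent", remote_side='TreePage.id'),
        collection_class=attribute_mapped_collection('name')
    )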
Being new to SQLAlchemy and databases in general, lots of the information I'm reading here and on the SQLAlchemy documentation is kind of hard to grasp.
My goal right now is to be able to access a tree and its content, and then if that tree has children to be able to pick which child to study from; this is basically a visual representation of what I want to make.
My guess is this should be a one to many relationship, so I won't have to deal with local/remote sides right? Both of which confuse me at the moment. Next, do I need to worry at all about the cascade parameter, or can I leave it be for now?
Lastly, I want to solidify how self-referential relationships work. With two different classes it's easy for me to grasp: the 'one' side creates a relationship object, and the 'many' side holds a foreign key that references the primary key of its owner. But in the case of a self-referential relationship, is the 'owner' the object that holds id, with parent_id being the id of whatever parent object comes before? Thanks, and I apologize if this is at all confusing.
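As a rough illustration of what the self-referential foreign key means at the table level, here is a sketch using the standard-library sqlite3 instead of Flask-SQLAlchemy. The sample rows and the helper names children_by_name and parent_of are invented for the example. Every row is a node; a child row stores its parent's id in parent_id, and the root stores NULL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.execute("""
    CREATE TABLE tree (
        id        INTEGER PRIMARY KEY,
        name      TEXT NOT NULL,
        content   TEXT,
        parent_id INTEGER REFERENCES tree(id)  -- NULL for the root node
    )
""")
conn.executemany(
    "INSERT INTO tree (id, name, content, parent_id) VALUES (?, ?, ?, ?)",
    [(1, "root", "start here", None),
     (2, "math", "numbers", 1),
     (3, "art", "colours", 1),
     (4, "algebra", "symbols", 2)])

def children_by_name(node_id):
    """The 'one' side: all rows whose parent_id points at this node, keyed by name."""
    rows = conn.execute(
        "SELECT name, id FROM tree WHERE parent_id = ?", (node_id,))
    return dict(rows.fetchall())

def parent_of(node_id):
    """The 'many' side: follow this row's parent_id back to its parent row."""
    return conn.execute(
        "SELECT p.id, p.name FROM tree c "
        "JOIN tree p ON p.id = c.parent_id WHERE c.id = ?",
        (node_id,)).fetchone()
```

So the "owner" is simply the row whose id appears in another row's parent_id; the same table plays both roles.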
Assume I have in models.py something like:
class ModelA(models.Model):
    # many fields, including
    relatives = models.ManyToManyField(Person)
# also, A is foreign key to other models:
class SomeOtherModel(models.Model):
    mya = models.ForeignKey(ModelA, on_delete=models.CASCADE)
# now we produce two classes with multi-table inheritance
# WITHOUT any additional fields
class InheritA1(ModelA):
    pass
class InheritA2(ModelA):
    pass
So as far as I understand, this will create tables for ModelA, InheritA1 and InheritA2; each instance of ModelA will get a row in the ModelA table only, each instance of InheritA1 will have a row both in the ModelA table (basically containing all the data) and another in the InheritA1 table (only containing a PK and a OneToOne key pointing to the ModelA table row), etc. Django queries for ModelA objects will give all objects, queries for InheritA1 objects only the InheritA1 ones, etc.
So now I have an InheritA1 object a1, and want to make it into an InheritA2 object, without changing the corresponding ModelA object. So previously the parent OneToOne key of a1 points to the ModelA row with key 3, say (and the ForeignKey of some SomeOtherModel object is set to 3). In the end, I want an InheritA2 object a2 pointing to the same, unchanged ModelA row (and the object a1 removed).
It seems that django doesn't offer any such move-Class-functionality?
Can I safely implement the according SQL operations myself?
Or will things go horribly wrong? I.e., can I just execute the SQL commands that
1. Create a new row in the InheritA2-table, setting the parent-OneToOne key to the one of a1,
2. Remove the a1 row in the InheritA1-table?
It seems I cannot do this from non-SQL-django without automatically creating a ModelA-row. Well, for 1., maybe I can create a temporary object x that way, then let p be the parent of x, then change the parent-OneToOne key of x to point to the one of a1, then delete the object p? But for 2., I do not think that it is possible in non-SQL-django to remove an instance of a child while keeping the parent object?
Alternatively, is there a good django way to copy instances in django and change references to them?
I.e., I could create a new InheritA2 object y, and copy all the properties of a1 into the new object, and then go through the database and find all ManyToMany and ForeignKey entries that point to the parent of a1
and change it to point to the parent of y instead, and then delete a1.
That could be done with non-SQL-django; the downside is that it seems wasteful performance-wise (which would be of no great concern for my project), and that it might also not be so "stable", i.e., if I change ModelA or other models in relation to it, the code that copies everything might break. Or is there a good way to do something like the following?
Forall models M in my project:
    Forall keys k used in M:
        If k is a descendant of a ManyToMany or Foreign or ... key:
            If k points to ModelA:
                Forall instances x of M:
                    If x.k=a1:
                        x.k=y
The first four lines seem rather dubious.
Remarks:
Copying without changing the instance can be done in a stable, simple, standard way, see e.g. here, but we are still stuck in the same child class (and still have to modify ForeignKeys etc)?
Changing the class by just declaring it in the standard python way, see here, is not an option for me, as nobody seems to know whether it will horribly break django.
If you plan on changing the children but not the parent, then maybe you could use OneToOneField instead of direct inheritance.
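class ModelA(models.Model):
    pass
class InheritA1(models.Model):
    # on_delete is required from Django 2.0 on; CASCADE was the old default
    a = models.OneToOneField(ModelA, primary_key=True, on_delete=models.CASCADE)
class InheritA2(models.Model):
    a = models.OneToOneField(ModelA, primary_key=True, on_delete=models.CASCADE)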
It gives you the same 3 tables in the database. (One difference is that the pk fields of InheritA1 and InheritA2 will be the same id from the parent.)
For changing from InheritA1 to InheritA2 you would delete one child instance (this would not affect the parent instance) and then create the other new instance, pointing it to the parent instance.
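At the table level, the switch described above amounts to something like the following sketch. It uses the standard-library sqlite3 rather than Django, and the table and column names (modela, inherita1, inherita2, someothermodel, a_id, mya_id, label) are stand-ins I picked for the example, not the names Django would generate. The parent row and anything pointing at it are left untouched.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE modela (id INTEGER PRIMARY KEY, label TEXT);
    -- Each child table's primary key is also its link to the parent row.
    CREATE TABLE inherita1 (a_id INTEGER PRIMARY KEY REFERENCES modela(id));
    CREATE TABLE inherita2 (a_id INTEGER PRIMARY KEY REFERENCES modela(id));
    CREATE TABLE someothermodel (
        id     INTEGER PRIMARY KEY,
        mya_id INTEGER NOT NULL REFERENCES modela(id)
    );
    INSERT INTO modela VALUES (3, 'shared parent row');
    INSERT INTO inherita1 VALUES (3);
    INSERT INTO someothermodel VALUES (1, 3);
""")

def move_a1_to_a2(a_id):
    """Delete the InheritA1 child row and create the InheritA2 one atomically."""
    with conn:  # commits on success, rolls back if anything raises
        deleted = conn.execute(
            "DELETE FROM inherita1 WHERE a_id = ?", (a_id,)).rowcount
        if deleted != 1:
            raise LookupError("no inherita1 row for parent id %r" % (a_id,))
        conn.execute("INSERT INTO inherita2 (a_id) VALUES (?)", (a_id,))

move_a1_to_a2(3)
```

In Django itself the equivalent would presumably be deleting the InheritA1 instance and creating an InheritA2 instance pointing at the same ModelA, ideally inside one transaction.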
Note that nothing in this schema stops a parent instance from having children in both of the other models, so you would have to check for that in your view to prevent it.
Let me know if this helps you, even if the answer is a bit late.
This question is essentially in two parts.
1. I have a situation where I require things to be unique together, i.e. the elements in the db need to be unique together with each other.
Let's say we have a model Things (rough pseudocode):
class ShoppingList( object ):
    thing1_id = Column(Integer)
    thing2_id = Column(Integer)
Now I need thing1_id and thing2_id to be unique together, i.e. the combination of thing1_id and thing2_id needs to be unique. Coming from the Django world, I know that you can do a meta declaration of unique_together in Django models. But how can I do this in TurboGears?
2. Also, how do I actually apply a unique_together on a legacy system?
You simply want to add a UniqueConstraint to your table definition (using a primary key would achieve a similar effect, but with different semantics).
This is as simple as:
class ShoppingList( object ):
    thing1_id = Column(Integer)
    thing2_id = Column(Integer)
    __table_args__ = (
        UniqueConstraint('thing1_id', 'thing2_id'),
    )
See also https://docs.sqlalchemy.org/en/latest/orm/extensions/declarative/table_config.html#table-configuration
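To see what such a constraint does at the database level, and how it could be added to a table that already exists, here is a small sketch with the standard-library sqlite3 (not TurboGears or SQLAlchemy). The table name, the index name uq_shopping_list_things and the helper try_insert are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Stand-in for a legacy table that was created without the constraint.
conn.execute("""
    CREATE TABLE shopping_list (
        id        INTEGER PRIMARY KEY,
        thing1_id INTEGER,
        thing2_id INTEGER
    )
""")

# Retrofit uniqueness on the pair; this fails if duplicate pairs already exist.
conn.execute(
    "CREATE UNIQUE INDEX uq_shopping_list_things "
    "ON shopping_list (thing1_id, thing2_id)")

def try_insert(thing1_id, thing2_id):
    """Return True if the row was stored, False if the pair already exists."""
    try:
        with conn:
            conn.execute(
                "INSERT INTO shopping_list (thing1_id, thing2_id) VALUES (?, ?)",
                (thing1_id, thing2_id))
        return True
    except sqlite3.IntegrityError:
        return False

results = [try_insert(1, 2), try_insert(1, 3), try_insert(1, 2)]
```

One thing to be aware of: the constraint applies to the ordered pair, so (2, 1) is still accepted after (1, 2). If you really mean an unordered set, the application would have to store the two ids in a fixed order.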
For the first part of your question, if I understand your question correctly, I believe you are talking about the need for defining composite primary keys. As stated in http://docs.sqlalchemy.org/en/latest/core/schema.html#describing-databases-with-metadata:
Multiple columns may be assigned the primary_key=True flag which denotes a multi-column primary key, known as a composite primary key.
Defining such a relationship on a class using the declarative ORM way in SQLAlchemy, should be as simple as:
class ShoppingList(Base):
    __tablename__ = 'shopping_list'  # placeholder name; declarative needs a table name
    thing1_id = Column(Integer, primary_key=True)
    thing2_id = Column(Integer, primary_key=True)
As for the second part of your question, I believe you mean how one would define the same SQLAlchemy mapping for an existing, legacy database. If so, you should be able to use the above approach, just don't create the database from the ORM definition. You may also use the classic mapping way, described in: http://docs.sqlalchemy.org/en/rel_0_8/orm/mapper_config.html?highlight=composite%20primary%20key#classical-mappings
I'm not sure what this is called since it is new to me, but here is what I want to do:
I have two tables in my database: TableA and TableB. TableA has pk a_id and another field called a_code. TableB has pk b_id and another field called b_code.
I have these tables mapped in my sqlalchemy code and they work fine. I want to create a third object called TableC that doesn't actually exist in my database, but that contains combinations of a_code and b_code, something like this:
class TableC:
a_code = String
b_code = String
Then I'd like to query TableC like:
TableC.query.filter(and_(
TableC.a_code == x,
TableC.b_code == y)).all()
Question 1) Does this type of thing have a name? 2) How do I do the mapping (using declarative would be nice)?
I don't really have a complete understanding of the query you are trying to express, whether it's a union or a join or some third thing, but that aside, it certainly is possible to map an arbitrary selectable (anything you can pass to a database that returns rows).
I'll start with the assumption that you want some kind of union of TableA and TableB, which would be all of the rows in A, and also all of the rows in B. This is easy enough to change to a different concept if you reveal more information about the shape of the data you are expressing.
We'll start by setting up the real tables, and classes to map them, in the declarative style.
from sqlalchemy import *
import sqlalchemy.ext.declarative
Base = sqlalchemy.ext.declarative.declarative_base()
class TableA(Base):
    __tablename__ = 'a'
    id = Column(Integer, primary_key=True)
    a_code = Column(String)
class TableB(Base):
    __tablename__ = 'b'
    id = Column(Integer, primary_key=True)
    b_code = Column(String)
Since we've used declarative, we don't actually have table instances to work from, which is necessary for the next part. There are many ways to access the tables, but the way I prefer is to use sqlalchemy mapping introspection methods, since that will work no matter how the class was mapped.
from sqlalchemy.orm.attributes import manager_of_class
a_table = manager_of_class(TableA).mapper.mapped_table
b_table = manager_of_class(TableB).mapper.mapped_table
Next, we need an actual sql expression that represents the data we are interested in.
This is a union, which results in columns that look the same as the columns defined in the first class, id and a_code. We could rename it, but that's not a very important part of the example.
ab_view_sel = sqlalchemy.alias(a_table.select().union(b_table.select()))
Finally, we map a class to this. It is possible to use declarative for this, but it's actually more code to do it that way instead of classic mapping style, not less. Notice that the class inherits from object, not Base.
class ViewAB(object):
    pass
sqlalchemy.orm.mapper(ViewAB, ab_view_sel)
And that's pretty much it. Of course there are some limitations with this; the most obvious being there's no (trivial) way to save instances of ViewAB back to the database.
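If it helps to see the shape of the data, the same union can be tried out directly in SQL. The sketch below uses the standard-library sqlite3 and a database view instead of a mapped selectable, which is a different mechanism from the mapper call above; the view name ab_view and the sample rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (id INTEGER PRIMARY KEY, a_code TEXT);
    CREATE TABLE b (id INTEGER PRIMARY KEY, b_code TEXT);
    INSERT INTO a VALUES (1, 'x'), (2, 'y');
    INSERT INTO b VALUES (1, 'y'), (2, 'z');
    -- Column names come from the first SELECT: id and a_code.
    CREATE VIEW ab_view AS
        SELECT id, a_code FROM a
        UNION
        SELECT id, b_code FROM b;
""")

rows = conn.execute(
    "SELECT id, a_code FROM ab_view ORDER BY id, a_code").fetchall()

def view_is_writable():
    """Try to insert through the view; a plain view is read-only."""
    try:
        conn.execute("INSERT INTO ab_view VALUES (3, 'w')")
        return True
    except sqlite3.Error:
        return False
```

Here rows holds the combined rows of both tables under the first table's column names, and the insert attempt is rejected, which mirrors the limitation mentioned above.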
There isn't really a concept of 'virtual tables', but it is possible to send a single query that 'joins' the data from multiple tables. This is probably as close as you can get to what you want.
For example, one way to do this in sqlalchemy/elixir would be (and this isn't far off from what you've shown, we're just not querying a 'virtual' table):
result = session.query(TableA, TableB).filter(TableA.a_code==x).filter(TableB.b_code==y).all()
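This is similar to an SQL join, with the qualifying conditions in the filter statements; since no join condition links the two tables, every matching TableA row is paired with every matching TableB row. This isn't going to give you an sqlalchemy table object, but it will give you a list of objects from each real table.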
It looks like SQLAlchemy allows you to map an arbitrary query to a class. e.g. From SQLAlchemy: one classes – two tables:
usersaddresses = sql.join(t_users, t_addresses,
                          t_users.c.id == t_addresses.c.user_id)
class UserAddress(object):
    def __repr__(self):
        return "<FullUser(%s,%s,%s)>" % (self.id, self.name, self.address)
mapper(UserAddress, usersaddresses, properties={
    'id': [t_users.c.id, t_addresses.c.user_id],
})
f = session.query(UserAddress).filter_by(name='Hagar').one()