How would I do these multiple joins as a Django queryset? - python

I have this query that joins multiple tables together:
select
p.player_id
, d.player_data_1
, l.year
, l.league
, s.stat_1
, l.stat_1_league_average
from
stats s
inner join players p on p.player_id = s.player_id
left join player_data d on d.other_player_id = p.other_player_id
left join league_averages as l on l.year = s.year and l.league = s.league
where
p.player_id = 123
My models look like this:
class Stats(models.Model):
    player_id = models.ForeignKey(Player)
    stat_1 = models.IntegerField()
    year = models.IntegerField()
    league = models.IntegerField()
class Player(models.Model):
    player_id = models.IntegerField(primary_key=True)
    other_player_id = models.ForeignKey(PlayerData)
class PlayerData(models.Model):
    other_player_id = models.IntegerField(primary_key=True)
    player_data_1 = models.TextField()
class LeagueAverages(models.Model):
    year = models.IntegerField()
    league = models.IntegerField()
    stat_1_league_average = models.DecimalField()
I can do something like this:
Stats.objects.filter(player_id=123).select_related('player')
to do the first join. For the second join, I tried:
Stats.objects.filter(player_id=123).select_related('player').select_related('player_data')
but I got this error:
django.core.exceptions.FieldError: Invalid field name(s) given in select_related: 'player_data'. Choices are: player
How would I do the third join considering that year and league aren't foreign keys in any of the tables? Thanks!

select_related(*fields) Returns a QuerySet that will “follow” foreign-key relationships, [...]
According to the Django documentation, select_related follows foreign-key relationships. player_data is neither a foreign key nor even a field of Stats. If you want to INNER join PlayerData and Player, you can follow the foreign keys. In your case, use the double underscore to get to PlayerData:
(
    Stats.objects.all()
    .select_related('player_id')
    .select_related('player_id__other_player_id')
)
As for joining LeagueAverages: there is no way to join models without an appropriate foreign key other than using raw SQL. Have a look at a related question: Django JOIN query without foreign key. Using .raw() would also take care of your LEFT join, which, by the way, is not that easy without raw SQL either (see Django Custom Left Outer Join).
Quick notes about your models:
Each model by default has an automatically incrementing primary key that can be accessed via .id or .pk, so there is no need to add, for example, player_id.
A models.ForeignKey field references an object, not its id. It is therefore more intuitive to rename, for example, player_id to player. If you name your field player, Django automatically lets you access its id via player_id.
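To see what the raw-SQL route for the third join returns, here is a minimal sketch that runs the question's query against an in-memory SQLite database using only Python's standard library. The table names, column names and sample rows are assumptions made for illustration, and the last join condition uses l.league = s.league. Rows without a matching league average come back with None in that column, which is the LEFT join behaviour the question asks for.

```python
import sqlite3

# Hypothetical tables and rows that mirror the SQL in the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE player_data (other_player_id INTEGER PRIMARY KEY, player_data_1 TEXT);
CREATE TABLE players (player_id INTEGER PRIMARY KEY, other_player_id INTEGER);
CREATE TABLE stats (id INTEGER PRIMARY KEY, player_id INTEGER, stat_1 INTEGER,
                    year INTEGER, league INTEGER);
CREATE TABLE league_averages (id INTEGER PRIMARY KEY, year INTEGER, league INTEGER,
                              stat_1_league_average REAL);
INSERT INTO player_data VALUES (7, 'some data');
INSERT INTO players VALUES (123, 7);
INSERT INTO stats VALUES (1, 123, 10, 2015, 1), (2, 123, 12, 2016, 2);
INSERT INTO league_averages VALUES (1, 2015, 1, 8.5);
""")

# The join on year and league needs no foreign key at the SQL level.
QUERY = """
SELECT s.id, p.player_id, d.player_data_1, s.year, s.league,
       s.stat_1, l.stat_1_league_average
FROM stats s
INNER JOIN players p ON p.player_id = s.player_id
LEFT JOIN player_data d ON d.other_player_id = p.other_player_id
LEFT JOIN league_averages l ON l.year = s.year AND l.league = s.league
WHERE p.player_id = ?
ORDER BY s.year
"""

rows = conn.execute(QUERY, (123,)).fetchall()
for row in rows:
    print(row)
```

In Django, a statement like this (with %s placeholders instead of ?) could be passed to Stats.objects.raw(); note that raw() expects the model's primary key among the selected columns, which is why s.id is selected here.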

Related

How to sort by hybrid property on a one-on-one relationship in SQLAlchemy with a reference on the same table twice?

I have a table called User and Document.
The table User holds:
first_name
…
Document holds:
id
created_by_id
deleted_by_id
…
I want to sort the Document table by the first_name. Therefore, I created the relationships creation_user and deletion_user. Both foreign keys created_by_id and deleted_by_id reference the same table. The hybrid property created_by shall be sorted on:
creation_user = relationship("User", foreign_keys=[created_by_id], lazy="joined")
deletion_user = relationship("User", foreign_keys=[deleted_by_id], lazy="joined")
@hybrid_property
def created_by(self):
    if self.creation_user:
        return self.creation_user.first_name
    else:
        return None
@created_by.expression
def created_by(cls):
    return User.first_name
Unfortunately, SQLAlchemy does not match the correct user and the resulting SQL query looks like this:
SELECT …
FROM document
LEFT OUTER JOIN [user] AS user_1
ON user_1.id = document.created_by_id
LEFT OUTER JOIN [user] AS user_2
ON user_2.id = document.deleted_by_id
WHERE document.id = ?
ORDER BY [user].first_name ASC
Is there a way to use the hybrid property or a hybrid expression (see sqlalchemy docs) so that the ORDER BY statement resolves to ORDER BY user_1.first_name ASC, i.e. the alias joined on created_by_id?
A solution that is "semi SQLAlchemy" is to use text as follows:
def sort_docs(ids, order):
    # Only allow the two valid sort directions, since order is formatted into the SQL
    if order.lower() not in ['asc', 'desc']:
        return None
    statement = """
    SELECT documents.id AS documents_id, documents.created_by_id AS
    documents_created_by_id, documents.deleted_by_id AS documents_deleted_by_id,
    users_1.id AS users_1_id, users_1.first_name AS users_1_first_name, users_2.id AS
    users_2_id, users_2.first_name AS users_2_first_name
    FROM documents
    LEFT OUTER JOIN users AS users_1 ON users_1.id = documents.created_by_id
    LEFT OUTER JOIN users AS users_2 ON users_2.id = documents.deleted_by_id
    WHERE documents.id in :ids
    ORDER BY users_1.first_name {}
    """.format(order)
    sql_statement = text(statement)
    result = db.session.query(Document).from_statement(sql_statement.bindparams(ids=ids))
    return result
This way you will retrieve your required documents as an object of Document model sorted by the user's first name in the requested order.
Having the hybrid expression specify the required user name generates the desired ordering:
@created_by.expression
def created_by(cls):
    return sql.select([User.first_name]).where(User.id == cls.created_by_id)
However, the generated query adds an additional subquery rather than referencing the aliased table:
SELECT documents.id AS documents_id, documents.created_by_id AS documents_created_by_id, documents.deleted_by_id AS documents_deleted_by_id, users_1.id AS users_1_id, users_1.first_name AS users_1_first_name, users_2.id AS users_2_id, users_2.first_name AS users_2_first_name
FROM documents
LEFT OUTER JOIN users AS users_1 ON users_1.id = documents.created_by_id
LEFT OUTER JOIN users AS users_2 ON users_2.id = documents.deleted_by_id
ORDER BY (SELECT users.first_name
FROM users
WHERE users.id = documents.created_by_id)
Another approach would be to define the ordering on the relationship:
creation_user = relationship("User", foreign_keys=[created_by_id], lazy="joined", order_by="User.first_name")
This produces the SQL specified in the question and removes the need for the hybrid attributes:
SELECT documents.id AS documents_id, documents.created_by_id AS documents_created_by_id, documents.deleted_by_id AS documents_deleted_by_id, users_1.id AS users_1_id, users_1.first_name AS users_1_first_name, users_2.id AS users_2_id, users_2.first_name AS users_2_first_name
FROM documents
LEFT OUTER JOIN users AS users_1 ON users_1.id = documents.created_by_id
LEFT OUTER JOIN users AS users_2 ON users_2.id = documents.deleted_by_id
ORDER BY users_1.first_name
However if you ever want a different order, you'll need to override the loading strategy - see this answer.
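A third option, not covered by the answers above, is to join an explicit alias of User and order by that alias, so the ORDER BY refers to the table joined on created_by_id. The sketch below assumes SQLAlchemy 1.4 or newer and uses hypothetical minimal models and rows that mirror the ones in the question; aliased() and of_type() are existing SQLAlchemy ORM functions, everything else is illustrative.

```python
from sqlalchemy import Column, ForeignKey, Integer, String, create_engine
from sqlalchemy.orm import Session, aliased, declarative_base, relationship

Base = declarative_base()


class User(Base):
    __tablename__ = "users"
    id = Column(Integer, primary_key=True)
    first_name = Column(String)


class Document(Base):
    __tablename__ = "documents"
    id = Column(Integer, primary_key=True)
    created_by_id = Column(ForeignKey("users.id"))
    deleted_by_id = Column(ForeignKey("users.id"))
    # Two relationships to the same table, told apart by foreign_keys
    creation_user = relationship("User", foreign_keys=[created_by_id])
    deletion_user = relationship("User", foreign_keys=[deleted_by_id])


engine = create_engine("sqlite://")  # in-memory database for the demo
Base.metadata.create_all(engine)

with Session(engine) as session:
    alice, bob = User(first_name="Alice"), User(first_name="Bob")
    session.add_all([
        Document(id=1, creation_user=bob, deletion_user=alice),
        Document(id=2, creation_user=alice, deletion_user=bob),
    ])
    session.commit()

    # Join the creator through a named alias and sort on that alias
    creator = aliased(User)
    query = (
        session.query(Document)
        .outerjoin(Document.creation_user.of_type(creator))
        .order_by(creator.first_name)
    )
    ordered_ids = [doc.id for doc in query]
    print(ordered_ids)
```

Because the alias is created per query, the sort column and direction can change from one query to the next without touching the relationship definition.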

Many-to-many join table with additional field in Flask

I have two tables, Products and Orders, inside my Flask-SqlAlchemy setup, and they are linked so an order can have several products:
class Products(db.Model):
    id = db.Column(db.Integer, primary_key=True)
    ....
class Orders(db.Model):
    guid = db.Column(db.String(36), default=generate_uuid, primary_key=True)
    products = db.relationship(
        "Products", secondary=order_products_table, backref="orders")
    ....
linked via:
order_products_table = db.Table("order_products_table",
    db.Column('orders_guid', db.String(36), db.ForeignKey('orders.guid')),
    db.Column('products_id', db.Integer, db.ForeignKey('products.id'))
    # db.Column('license', db.String(36))
)
For my purposes, each product in an order will receive a unique license string, which logically should be added to the order_products_table rows of each product in an order.
How do I declare this third license column on the join table order_products_table so that it gets populated as I insert an Order?
I've since found the documentation for the Association Object from the SQLAlchemy docs, which allows for exactly this expansion to the join table.
Updated setup:
# Instead of a table, provide a model for the JOIN table with additional fields
# and explicit keys and back_populates:
class OrderProducts(db.Model):
    __tablename__ = 'order_products_table'
    orders_guid = db.Column(db.String(36), db.ForeignKey(
        'orders.guid'), primary_key=True)
    products_id = db.Column(db.Integer, db.ForeignKey(
        'products.id'), primary_key=True)
    # back_populates names the attribute on the OTHER class
    order = db.relationship("Orders", back_populates="products")
    products = db.relationship("Products", back_populates="order")
    licenses = db.Column(db.String(36), nullable=False)
class Products(db.Model):
    id = db.Column(db.Integer, primary_key=True)
    # Pairs with OrderProducts.products
    order = db.relationship(OrderProducts, back_populates="products")
    ....
class Orders(db.Model):
    guid = db.Column(db.String(36), default=generate_uuid, primary_key=True)
    # Pairs with OrderProducts.order
    products = db.relationship(OrderProducts, back_populates="order")
    ....
What is really tricky (but also shown on the documentation page), is how you insert the data. In my case it goes something like this:
o = Orders(...)  # insert other data
for id in products:
    # Create OrderProducts join rows with the extra data, e.g. licenses
    join = OrderProducts(licenses="Foo")
    # To the JOIN add the products
    join.products = Products.query.get(id)
    # Add the populated JOIN as the Order products
    o.products.append(join)
# Finally commit to database
db.session.add(o)
db.session.commit()
At first I was trying to populate Order.products (o.products in the example code) directly with Products objects, which gives you an error about using a Products class where an OrderProducts class is expected.
I also struggled with the whole field naming and referencing of the back_populates. Again, the example above and on the docs show this. Note the pluralization is entirely to do with how you want your fields named.

How to make an Inner Join in django without foreign key?

How do I write a JOIN ... ON statement in Django 1.11?
I want to create this statement:
select t1.name ,t2.str, t2.num
from table_1 as t1
join table_2 as t2 on t2.product_id = t1.id and t2.section_num = 2;
the models:
class t1(UTModelTS):
    alt_keys = product_alt_keys
    name = utCharField()
    ...
class t2(UTModelTS):
    alt_keys = [('pr_id', 'section')]
    str = utCharField()
    num = models.IntegerField()
    ...
I tried:
t1 = t1.objects.filter(**params).exclude(**exclude)
t1 = t1.select_related('t2')
but this makes no sense since, according to the Django docs:
select_related
Returns a QuerySet that will “follow” foreign-key relationships...
(from https://docs.djangoproject.com/en/2.2/ref/models/querysets/).
No, unfortunately there isn't an effective or elegant way to do this with the ORM,
although you can use the .raw() / RawSQL() methods for this exact thing. Even if the ORM could do it, it would probably be a lot slower than raw SQL.
https://docs.djangoproject.com/en/2.2/topics/db/sql/
You should not add the ON statement yourself. Django's ORM will perform the inner join for you automatically.
Imagine these are our models: a User table and a Post table.
class User(models.Model):
    name = models.CharField(max_length=30)
    surname = models.CharField(max_length=50)
class Post(models.Model):
    title = models.CharField(max_length=50)
    text = models.TextField()
    user = models.ForeignKey(to='User', on_delete=models.CASCADE)  # this is the FK field to users
Usage:
qs = Post.objects.select_related('user')  # This performs an SQL INNER JOIN.
print(qs.query)  # use the query attribute to show which query is performed.
This is the generated SQL query:
SELECT "myapp_post"."id", "myapp_post"."title", "myapp_post"."text", "myapp_post"."user_id", "myapp_user"."id", "myapp_user"."name", "myapp_user"."surname" FROM "myapp_post" INNER JOIN "myapp_user" ON ("myapp_post"."user_id" = "myapp_user"."id")
Look at your models:
class t1(UTModelTS):
    alt_keys = product_alt_keys
    name = utCharField()
    t2 = models.ForeignKey(to='t2', on_delete=models.CASCADE)
class t2(UTModelTS):
    alt_keys = [('pr_id', 'section')]
    str = utCharField()
    num = models.IntegerField()
Add a t2 FK to your t1 model.
Querying:
qs = t1.objects.select_related('t2')
or
qs = t1.objects.select_related('t2').filter(**lookup_kwargs)
select_related() returns a QuerySet object, so you can use QuerySet methods after select_related().

How to query many-to-many based on some constraints in flask sqlalchemy?

If I have a User and Item model, and they have a many-to-many association with each other, how do I build a query that returns:
(1) All items that belong to any user named 'Bob'
I tried:
Item.query.filter(User.name == 'Bob')
Which returns all items regardless of the user's name (incorrect)
(2) All items that have the name 'shark' and belong to any user named 'Bob'
I tried:
Item.query.filter(User.name == 'Bob' & Item.name == 'shark')
Same as above, but only returns items named 'shark' regardless of the user's name. (incorrect)
My model definitions:
association_table = Table('items_users',
    Column('itemid', Integer, ForeignKey('item.id'), primary_key=True),
    Column('userid', Integer, ForeignKey('user.id'), primary_key=True)
)
class Item(Model):
    # other fields...
    # many to many association
    users = relationship('User', secondary=association_table, lazy='dynamic', backref=backref('items', lazy='dynamic'))
class User(Model):
    # other fields...
What would be appropriate syntax for two queries?
You need to join the tables you will query, so that filtering one will filter the combined row associated with the other. Since you have defined a relationship between the two models, you can join on it rather than specifying a join condition manually.
Item.query.join(Item.users).filter(User.name == 'Bob')
Item.query.join(Item.users).filter(User.name == 'Bob', Item.name == 'shark')
Working with relationships and joins is covered in the comprehensive tutorial in the SQLAlchemy docs.
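The effect of the join can be checked with a small self-contained sketch. It uses plain SQLAlchemy (1.4 or newer is assumed) instead of Flask-SQLAlchemy, an in-memory SQLite database, and hypothetical sample rows; the models are cut down to the columns the two queries need. Without the join, the query selects from both tables with no link between them, so the User.name filter cannot restrict which items are returned.

```python
from sqlalchemy import Column, ForeignKey, Integer, String, Table, create_engine
from sqlalchemy.orm import Session, declarative_base, relationship

Base = declarative_base()

# Association table for the many-to-many link
items_users = Table(
    "items_users",
    Base.metadata,
    Column("itemid", ForeignKey("item.id"), primary_key=True),
    Column("userid", ForeignKey("user.id"), primary_key=True),
)


class User(Base):
    __tablename__ = "user"
    id = Column(Integer, primary_key=True)
    name = Column(String)


class Item(Base):
    __tablename__ = "item"
    id = Column(Integer, primary_key=True)
    name = Column(String)
    users = relationship("User", secondary=items_users, backref="items")


engine = create_engine("sqlite://")  # in-memory database for the demo
Base.metadata.create_all(engine)

with Session(engine) as session:
    bob, eve = User(name="Bob"), User(name="Eve")
    session.add_all([
        Item(name="shark", users=[bob]),
        Item(name="whale", users=[bob, eve]),
        Item(name="shark", users=[eve]),
    ])
    session.commit()

    # (1) items that belong to any user named Bob
    bobs_items = session.query(Item).join(Item.users).filter(User.name == "Bob")
    # (2) the same, restricted to items named shark
    bobs_sharks = bobs_items.filter(Item.name == "shark")

    bob_item_names = sorted(item.name for item in bobs_items)
    bob_shark_count = bobs_sharks.count()
    print(bob_item_names, bob_shark_count)
```

The shark that belongs only to Eve is excluded from both results, which is the behaviour the question was missing.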

SQLAlchemy mapping joined tables' columns to one object

I have three tables: UserTypeMapper, User, and SystemAdmin. In my get_user method, depending on the UserTypeMapper.is_admin row, I then query either the User or SystemAdmin table. The user_id row correlates to the primary key id in the User and SystemAdmin tables.
class UserTypeMapper(Base):
    __tablename__ = 'user_type_mapper'
    id = Column(BigInteger, primary_key=True)
    is_admin = Column(Boolean, default=False)
    user_id = Column(BigInteger, nullable=False)
class SystemAdmin(Base):
    __tablename__ = 'system_admin'
    id = Column(BigInteger, primary_key=True)
    name = Column(Unicode)
    email = Column(Unicode)
class User(Base):
    __tablename__ = 'user'
    id = Column(BigInteger, primary_key=True)
    name = Column(Unicode)
    email = Column(Unicode)
I want to be able to get any user – system admin or regular user – from one query, so I do a join, on either User or SystemAdmin depending on the is_admin row. For example:
DBSession.query(UserTypeMapper, SystemAdmin).join(SystemAdmin, UserTypeMapper.user_id==SystemAdmin.id).first()
and
DBSession.query(UserTypeMapper, User).join(User, UserTypeMapper.user_id==User.id).first()
This works fine; however, I would then like to be able to access these like so:
>>> my_admin_obj.is_admin
True
>>> my_admin_obj.name
Bob Smith
versus
>>> my_user_obj.is_admin
False
>>> my_user_obj.name
Bob Stevens
Currently, I have to specify: my_user_obj.UserTypeMapper.is_admin and my_user_obj.User.name. From what I've been reading, I need to map the tables so that I don't need to specify which table the attribute belongs to. My problem is that I do not understand how I can specify this given that I have two potential tables that the name attribute, for example, may come from.
This is the example I am referring to: Mapping a Class against Multiple Tables
How can I achieve this? Thank you.
You have discovered why a "dual-purpose foreign key" is an antipattern.
There is a related problem that you haven't quite pointed out: there's no way to use a foreign key constraint to enforce that the data is in a valid state. You want to be sure that there's exactly one of something for each row in UserTypeMapper, but that 'something' is not any one table. Formally, you want a functional dependence on
user_type_mapper → (system_admin × 1) ∪ (user × 0)
But most SQL databases won't allow you to write a foreign key constraint expressing that.
It looks complicated because it is complicated.
Instead, let's consider what we really want to say: "every system_admin should be a user", or
system_admin → user
In SQL, that would be written:
CREATE TABLE user (
    id INTEGER PRIMARY KEY,
    name VARCHAR,
    email VARCHAR
);
CREATE TABLE system_admin (
    user_id INTEGER PRIMARY KEY REFERENCES user(id)
);
Or, in SQLAlchemy declarative style:
class User(Base):
    __tablename__ = 'user'
    id = Column(Integer, primary_key=True)
    name = Column(String)
    email = Column(String)
class SystemAdmin(Base):
    __tablename__ = 'system_admin'
    user_id = Column(ForeignKey(User.id), primary_key=True)
What sort of questions does this schema allow us to ask?
"Is there a SystemAdmin by the name of 'john doe'"?
>>> print(session.query(User).join(SystemAdmin).filter(User.name == 'john doe').exists())
EXISTS (SELECT 1
FROM "user" JOIN system_admin ON "user".id = system_admin.user_id
WHERE "user".name = :name_1)
"How many users are there? How many sysadmins?"
>>> print(session.query(func.count(User.id), func.count(SystemAdmin.user_id)).outerjoin(SystemAdmin))
SELECT count("user".id) AS count_1, count(system_admin.user_id) AS count_2
FROM "user" LEFT OUTER JOIN system_admin ON "user".id = system_admin.user_id
I hope you can see why the above is preferable to the design you describe in your question. On the off chance that you don't have a choice (and only in that case; if you still feel what you've got is better, please refine your question), you can still cram that data into a single Python object, which will be very difficult to work with, by providing an alternate mapping to the tables, specifically one that follows the rough structure of the first equation.
We need to mention UserTypeMapper twice, once for each side of the union; for that, we need aliases.
>>> from sqlalchemy.orm import Query, aliased
>>> utm1 = aliased(UserTypeMapper)
>>> utm2 = aliased(UserTypeMapper)
For the union bodies, join each alias to the appropriate table. Since SystemAdmin and User have the same columns in the same order, we don't need to describe them in detail, but if they are at all different, we need to make them "union compatible" by mentioning each column explicitly; this is left as an exercise.
>>> utm_sa = Query([utm1, SystemAdmin]).join(SystemAdmin, (utm1.user_id == SystemAdmin.id) & (utm1.is_admin == True))
>>> utm_u = Query([utm2, User]).join(User, (utm2.user_id == User.id) & (utm2.is_admin == False))
And then we join them together...
>>> print(utm_sa.union(utm_u))
SELECT anon_1.user_type_mapper_1_id AS anon_1_user_type_mapper_1_id, anon_1.user_type_mapper_1_is_admin AS anon_1_user_type_mapper_1_is_admin, anon_1.user_type_mapper_1_user_id AS anon_1_user_type_mapper_1_user_id, anon_1.system_admin_id AS anon_1_system_admin_id, anon_1.system_admin_name AS anon_1_system_admin_name, anon_1.system_admin_email AS anon_1_system_admin_email
FROM (SELECT user_type_mapper_1.id AS user_type_mapper_1_id, user_type_mapper_1.is_admin AS user_type_mapper_1_is_admin, user_type_mapper_1.user_id AS user_type_mapper_1_user_id, system_admin.id AS system_admin_id, system_admin.name AS system_admin_name, system_admin.email AS system_admin_email
FROM user_type_mapper AS user_type_mapper_1 JOIN system_admin ON user_type_mapper_1.user_id = system_admin.id AND user_type_mapper_1.is_admin = 1 UNION SELECT user_type_mapper_2.id AS user_type_mapper_2_id, user_type_mapper_2.is_admin AS user_type_mapper_2_is_admin, user_type_mapper_2.user_id AS user_type_mapper_2_user_id, "user".id AS user_id, "user".name AS user_name, "user".email AS user_email
FROM user_type_mapper AS user_type_mapper_2 JOIN "user" ON user_type_mapper_2.user_id = "user".id AND user_type_mapper_2.is_admin = 0) AS anon_1
While it's theoretically possible to wrap this all up into a Python class that looks a bit like standard SQLAlchemy ORM stuff, I would certainly not do that. Working with non-table mappings, especially when they are more than simple joins (this is a union), is lots of work for zero payoff.
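For comparison, the recommended system_admin → user schema already gives the flat access the question asks for: a single LEFT JOIN returns name and a derived is_admin flag in one row. The sketch below uses only Python's standard sqlite3 module; the get_user helper, the email addresses and the sample rows are assumptions made for illustration, with the names taken from the question's example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row  # allow access to columns by name
conn.executescript("""
CREATE TABLE user (id INTEGER PRIMARY KEY, name VARCHAR, email VARCHAR);
CREATE TABLE system_admin (user_id INTEGER PRIMARY KEY REFERENCES user(id));
INSERT INTO user VALUES (1, 'Bob Smith', 'smith@example.com'),
                        (2, 'Bob Stevens', 'stevens@example.com');
INSERT INTO system_admin VALUES (1);
""")


def get_user(user_id):
    """Return one row with the user's columns plus a derived is_admin flag."""
    return conn.execute(
        """
        SELECT u.id, u.name, u.email,
               sa.user_id IS NOT NULL AS is_admin
        FROM user AS u
        LEFT JOIN system_admin AS sa ON sa.user_id = u.id
        WHERE u.id = ?
        """,
        (user_id,),
    ).fetchone()


admin, regular = get_user(1), get_user(2)
print(admin["name"], bool(admin["is_admin"]))
print(regular["name"], bool(regular["is_admin"]))
```

Whether a user is an admin is decided by the presence of a system_admin row, so there is no separate flag that could fall out of sync with the data.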
