SQLAlchemy - Get all Rows which have matching set of Columns - python

I have a table
class Term(CommonFunctions, Base):
    __tablename__ = 'terms'
    id = Column(Integer, primary_key=True, autoincrement=True)
    term_begin = Column(Date, nullable=False)
    term_end = Column(Date)
    term_served = Column(Integer)  # term_number # calculatable?
    office_type_id = Column(Integer, ForeignKey(OfficeType.id))
    office_type = relationship('OfficeType', backref='terms')
    state_id = Column(Integer, ForeignKey(State.id))
    state = relationship('State', backref='terms')
    district_id = Column(Integer, ForeignKey(District.id))
    district = relationship('District', backref='terms')
    office_class = Column(Integer)
    # ... other fields
I am trying to run a report that finds pairs of row IDs whose rows share the same values for (state_id, district_id, office_type_id, office_class), for a specific office_type_id within a specific date range.
This is the query I have right now (institution is the office_type_id):
date = request.args.get('date')
institution = request.args.get('institution')
term_alias = aliased(schema.Term)
composition = Session.query(schema.Term.id, term_alias.id).\
    filter(schema.Term.id != term_alias.id).\
    filter(schema.Term.office_class == term_alias.office_class).\
    filter(schema.Term.state_id == term_alias.state_id).\
    filter(schema.Term.office_type_id == term_alias.office_type_id).\
    filter(schema.Term.office_type_id == institution).\
    filter(schema.Term.office_class != 0).\
    filter(and_(schema.Term.term_begin <= date,
                or_(schema.Term.term_end >= date,
                    schema.Term.term_end == None))).\
    all()
This works, in a sense. I get back valid results, but each result appears twice, once for each ordering of the pair.
For example:
[(127,196), (196,127)]
My question is: how can I update the query so that it includes only pairs that are not already represented by a logically equivalent pair?
I would like the above set to be either [(127, 196)] or [(196,127)], not both.
Thanks for reading.

A common way is to impose a particular (arbitrary) ordering:
Session.query(...).filter(..., schema.Term.id < term_alias.id)
If you can still get back repeated pairs (the same pair of IDs more than once), you need to apply a distinct as well.
Session.query(...).filter(..., schema.Term.id < term_alias.id).distinct()
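The effect of the ordering condition can be seen in a small standalone sketch. It uses the standard-library sqlite3 module with raw SQL rather than SQLAlchemy, and the trimmed-down terms table and its three rows are made up for illustration; only the `!=` versus `<` comparison mirrors the queries above.

```python
import sqlite3

# Hypothetical, simplified stand-in for the `terms` table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE terms (id INTEGER PRIMARY KEY, state_id INTEGER, "
    "office_type_id INTEGER, office_class INTEGER)"
)
conn.executemany(
    "INSERT INTO terms VALUES (?, ?, ?, ?)",
    [(127, 5, 1, 2), (196, 5, 1, 2), (300, 6, 1, 2)],
)


def matching_pairs(connection, comparison):
    """Self-join `terms` on the shared columns; `comparison` is '!=' or '<'."""
    if comparison not in ("!=", "<"):
        raise ValueError("unsupported comparison")
    sql = (
        "SELECT a.id, b.id FROM terms AS a JOIN terms AS b "
        "ON a.state_id = b.state_id "
        "AND a.office_type_id = b.office_type_id "
        "AND a.office_class = b.office_class "
        f"WHERE a.id {comparison} b.id ORDER BY a.id, b.id"
    )
    return connection.execute(sql).fetchall()


both_orders = matching_pairs(conn, "!=")  # every pair shows up in both orders
one_order = matching_pairs(conn, "<")     # every pair shows up exactly once
print(both_orders)
print(one_order)
```

With `<`, only the ordering where the first ID is the smaller one survives, so (127, 196) is kept and (196, 127) is dropped.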

Related

sqlalchemy orm | fastAPI querying + joining three tables to get parents and all children

I'm trying to add a route to my FastAPI app that returns a list of all parents (portfolios), each with all of its children (stocks), PLUS the extra data stored in the association table for that relationship.
The response is supposed to look somewhat like this:
[ { "parrent1_attr1": bla,
"parrent1_attr2": bla,
"children": [ {
"child1_attr1": bla,
"child1_attr2": bla},
{"child2_attr1": bla,
"child2_attr2": bla}]
},
etc...]
Right now the route that produces this looks like this:
@router.get("/")
def get_all_portfolios(db: Session = Depends(get_db), current_user: int = Depends(oauth2.get_current_user)):
    results = db.query(models.Portfolio).options(joinedload(models.Portfolio.stocks)).all()
    return results
But this gives me the wrong result:
[ { "parrent1_attr1": bla,
"parrent1_attr2": bla,
"children": [ {
"association_table_attr1": bla
"association_table_attr2": bla},]
So I get data from the association table back instead of from the children.
The models I have are here.
class Portfolio(Base):
    __tablename__ = "portfolios"
    id = Column(Integer, primary_key=True, nullable=False)
    ...
    stocks = relationship("PortfolioStock", back_populates="portfolio")

class Stock(Base):
    __tablename__ = "stocks"
    id = Column(Integer, primary_key=True, nullable=False)
    ...
    portfolios = relationship("PortfolioStock", back_populates="stock")

class PortfolioStock(Base):
    __tablename__ = "portfolio_stocks"
    id = Column(Integer, primary_key=True)
    stock_id = Column(Integer, ForeignKey("stocks.id", ondelete="CASCADE"))
    portfolio_id = Column(Integer, ForeignKey("portfolios.id", ondelete="CASCADE"))
    count = Column(Integer, nullable=True)
    buy_in = Column(Float, nullable=True)
    stock = relationship("Stock", back_populates="portfolios")
    portfolio = relationship("Portfolio", back_populates="stocks")
Let me know if you need more information. I appreciate your help.
I find it easier to give the association relationship a name of its own, because the current naming is confusing: Portfolio.stocks is actually a list of association objects (PortfolioStock), NOT actual stocks. You have to get the stocks off the association objects. In the example below I get the stock with assoc.stock.id. That should not trigger another query, because the chained joinedload pre-loads it. If the stock had a name, we'd reference it with assoc.stock.name.
with Session(engine) as session:
    q = session.query(Portfolio).options(joinedload(Portfolio.stocks).joinedload(PortfolioStock.stock))
    for portfolio in q.all():
        print(f"Listing associated stocks for portfolio {portfolio.id}")
        for assoc in portfolio.stocks:
            print(f"    Buy in {assoc.buy_in}, count {assoc.count} and stock id {assoc.stock.id}")
The query looks something like this:
SELECT portfolios.id AS portfolios_id, stocks_1.id AS stocks_1_id, portfolio_stocks_1.id AS portfolio_stocks_1_id, portfolio_stocks_1.stock_id AS portfolio_stocks_1_stock_id, portfolio_stocks_1.portfolio_id AS portfolio_stocks_1_portfolio_id, portfolio_stocks_1.count AS portfolio_stocks_1_count, portfolio_stocks_1.buy_in AS portfolio_stocks_1_buy_in
FROM portfolios LEFT OUTER JOIN portfolio_stocks AS portfolio_stocks_1 ON portfolios.id = portfolio_stocks_1.portfolio_id LEFT OUTER JOIN stocks AS stocks_1 ON stocks_1.id = portfolio_stocks_1.stock_id
For anybody that is looking for an answer, here is how I fixed it.
I used the query from Ian that he mentioned above, (thank you a ton for that).
And then I just manually declared the structure I wanted to have.
The whole code looks like this
results = (
    db.query(models.Portfolio)
    .options(joinedload(models.Portfolio.stocks).joinedload(models.PortfolioStock.stock))
    .all()
)
result_list = []
for portfolio in results:
    result_dict = portfolio.__dict__
    stock_list = []
    for sto in result_dict["stocks"]:
        sto_dict = sto.__dict__
        temp_sto = {}
        temp_sto = sto_dict["stock"]
        setattr(temp_sto, "buy_in", sto_dict["buy_in"])
        setattr(temp_sto, "count", sto_dict["count"])
        stock_list.append(temp_sto)
    result_dict["stocks"] = stock_list
    result_list.append(result_dict)
return result_list
First, declare an empty list, result_list, which will hold the final results and be returned.
Then iterate over the query results (the query returns a list), so each SQLAlchemy model instance is available as portfolio.
Next, get the instance's attributes as a dictionary with result_dict = portfolio.__dict__. Note that __dict__ is an attribute rather than a method: it is the instance's own attribute dictionary, so changes made to it also affect the instance.
result_dict["stocks"] holds a list of PortfolioStock instances, i.e. the association table model, so we iterate over them and repeat the process.
For each one, take its __dict__ as sto_dict and read the stock key, which links the association row to the child. Although temp_sto is first initialised as an empty dictionary, it is then rebound to that Stock instance, which carries all of the child's data.
Then use setattr to copy over any other values wanted from the association table; they are available in sto_dict.
Append the result to stock_list, which is defined inside the portfolio loop but outside the inner loop.
Finally, set result_dict["stocks"] (the key that should contain all the children) to that list, append result_dict to result_list, and return result_list once the loop is done.
Here is a more generic version of the same approach:
# Parent, AssociationTable, relationship_name and child are placeholders
query = db.query(Parent).options(joinedload(Parent.relationship_name).joinedload(AssociationTable.child)).all()
result_list = []
for parent in query:
    parent_dict = parent.__dict__
    child_list = []
    for association in parent_dict["relationship_name"]:
        association_dict = association.__dict__
        temp_child_dict = {}
        temp_child_dict = association_dict["child"]
        setattr(temp_child_dict, "name_you_want", association_dict["key_of_value_you_want"])
        # repeat as many times as you need
        child_list.append(temp_child_dict)
    parent_dict["children"] = child_list
    result_list.append(parent_dict)
return result_list
I hope this helps you if you are in a similar situation.
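A variation on the loop above, offered only as a sketch: instead of modifying `__dict__` (which on SQLAlchemy instances also carries the internal `_sa_instance_state` entry), it builds fresh dictionaries from named attributes. The helper name flatten_portfolio, the field tuples and the SimpleNamespace stand-ins for loaded model instances are all assumptions made so the example runs without a database.

```python
from types import SimpleNamespace


def flatten_portfolio(portfolio, stock_fields=("id",), extra_fields=("buy_in", "count")):
    """Return a plain dict for one portfolio.

    Each entry in "stocks" merges the child's own fields with the extra
    fields stored on the association object.
    """
    stocks = []
    for assoc in portfolio.stocks:
        entry = {name: getattr(assoc.stock, name) for name in stock_fields}
        entry.update({name: getattr(assoc, name) for name in extra_fields})
        stocks.append(entry)
    return {"id": portfolio.id, "stocks": stocks}


# Stand-ins for a Portfolio loaded with its PortfolioStock rows and their Stock.
portfolio = SimpleNamespace(
    id=1,
    stocks=[
        SimpleNamespace(buy_in=10.5, count=3, stock=SimpleNamespace(id=7)),
        SimpleNamespace(buy_in=20.0, count=1, stock=SimpleNamespace(id=9)),
    ],
)

flattened = flatten_portfolio(portfolio)
print(flattened)
```

Because new dictionaries are created, the loaded instances are left untouched and only the listed fields end up in the response.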

Flask-AppBuilder: How to sort on relationship?

According to the documentation, using order_columns you can specify which columns allow sorting, which adds blue arrows in the header to select sorting in ascending or descending order.
However I also want to order by a relationship called "softwareproduct" to another table but when I add that to order_columns, it crashes (as it is not a real column but a relationship). The documentation also lists order_rel_fields, which I tried as well but that doesn't add a sorting function to the "softwareproduct" "column"/relationship:
Add_columns, edit_columns, show_columns and list_columns work perfectly fine, only order doesn't, even though "softwareproduct" isn't technically a real column but a relationship.
How can I let the users sort on such relationships?
models.py
[...]
class Softwareproduct(Model):
    suffix = Column(String(200), primary_key=True)
    label = Column(String(200), nullable=False)
    [...]
    def __repr__(self):
        return self.label

class Citation(Model):
    suffix = Column(String(200), primary_key=True)
    swp_suffix = Column(String(200), ForeignKey("softwareproduct.suffix"), nullable=False)
    softwareproduct = relationship("Softwareproduct")
    label = Column(String(200), nullable=False)
    def __repr__(self):
        return self.label
views.py
class CitationView(ModelView):
    datamodel = SQLAInterface(Citation)
    label_columns = {'label': 'Citation', 'suffix': 'ID'}
    add_columns = ['softwareproduct', "label", "suffix", "classified"]
    edit_columns = ['softwareproduct', "label", "suffix", "classified"]
    show_columns = ['softwareproduct', "label", "suffix", "classified"]
    list_columns = ['softwareproduct', "label", "suffix", "classified"]
    order_columns = ["label", "suffix"]
    order_rel_fields = {'softwareproduct': ('label', 'asc')}
    related_views = [ClassifiedView]
Change
order_columns= ["label","suffix"]
To
base_order = ("label", "asc")

Order query by count of one-to-many relationship per hour - SQLAlchemy

I'm working on a video game auction website for buying/selling in-game items.
I want to be able to query the Auctions table and sort them by the "hottest" auctions. This is based on the number of bids/hour placed on an auction.
Here's the auction model:
class Auctions(db.Model):
    id = db.Column(db.Integer, primary_key=True, index=True)
    posted = db.Column(db.DateTime())
    end = db.Column(db.DateTime())
    ...
    bids = db.relationship('Bids', backref='auctions', lazy='dynamic', order_by='desc(Bids.amount)', cascade="all, delete-orphan")
Here's the Bids model:
class Bids(db.Model):
    id = db.Column(db.Integer, primary_key=True, index=True)
    bidder_id = db.Column(db.Integer, db.ForeignKey('user.id'), index=True)
    auction_id = db.Column(db.Integer, db.ForeignKey('auctions.id'), index=True)
    amount = db.Column(db.Integer)
    posted = db.Column(db.DateTime())
I'm able to sort them by the amount of bids like this:
hot_stmt = db.session.query(models.Bids.auction_id, func.count('*').label('bid_count')).group_by(models.Bids.auction_id).subquery()
hot = db.session.query(models.Auctions, hot_stmt.c.bid_count).outerjoin(hot_stmt, (models.Auctions.id == hot_stmt.c.auction_id)).order_by(hot_stmt.c.bid_count.desc()).limit(5)
I can calculate and list bids/hour with this:
for auc, count in hot:
    time_delta = datetime.utcnow() - auc.posted
    auc_hours = time_delta.seconds / 60 / 60
    print(auc.id, count / auc_hours)
How could I sort the query by bids/hour so that the query returns the top 5 hottest auctions?
One useful approach is to create a dictionary with auctions as keys and bids/hr as values:
d = {}
for auc, count in hot:
    time_delta = datetime.utcnow() - auc.posted
    # total_seconds() counts whole days too; .seconds alone would drop them
    auc_hours = time_delta.total_seconds() / 60 / 60
    d[auc] = count / auc_hours
Make a list of the auctions:
aucs = [auc for auc, count in hot]
Sort the list aucs based on the values (use the reverse keyword to put the highest values at the beginning of the list, since the sort function goes lowest-to-highest by default):
aucs.sort(key=d.get, reverse=True)
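The dictionary-and-sort steps can also be sketched as one self-contained snippet. The rows below are invented (name, posted, bid_count) tuples standing in for the (auction, count) results of the query, and bids_per_hour is a hypothetical helper. It uses timedelta.total_seconds(), because the .seconds attribute holds only the part of the interval left over after whole days, which would overstate the rate for auctions older than a day.

```python
from datetime import datetime, timedelta


def bids_per_hour(posted, bid_count, now):
    """Bids divided by the auction's age in hours (0.0 for a zero-length age)."""
    hours = (now - posted).total_seconds() / 3600
    return bid_count / hours if hours > 0 else 0.0


now = datetime(2024, 1, 3, 12, 0, 0)

# Made-up stand-ins for the (auction, bid_count) rows returned by the query.
rows = [
    ("auction_a", datetime(2024, 1, 1, 12, 0, 0), 48),  # 48 h old, 48 bids
    ("auction_b", datetime(2024, 1, 3, 10, 0, 0), 10),  # 2 h old, 10 bids
    ("auction_c", datetime(2024, 1, 2, 12, 0, 0), 12),  # 24 h old, 12 bids
]

ranked = sorted(rows, key=lambda row: bids_per_hour(row[1], row[2], now), reverse=True)
hottest = [name for name, _posted, _count in ranked[:5]]
print(hottest)

# The pitfall with .seconds: an interval of exactly two days reports 0.
two_days = timedelta(days=2)
print(two_days.seconds, two_days.total_seconds())
```

Note that this ranks only the rows it is given; if the query has already been limited to five rows by raw bid count, an auction with fewer bids but a higher rate will never be considered.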

Abstraction in SQLAlchemy conditional filtering

I've created models for my database:
class Album(db.Model):
    id = db.Column(db.Integer, primary_key=True)
    title = db.Column(db.String(128))
    year = db.Column(db.String(4))
    tracklist = db.relationship('Track', secondary=tracklist,
                                backref=db.backref('albums', lazy='dynamic'),
                                lazy='dynamic')

class Track(db.Model):
    id = db.Column(db.Integer, primary_key=True)
    title = db.Column(db.String(128))

class Artist(db.Model):
    id = db.Column(db.Integer, primary_key=True)
    name = db.Column(db.String(128))
    releases = db.relationship('Track', secondary=releases,
                               backref=db.backref('artists', lazy='dynamic'),
                               lazy='dynamic')
They are many-to-many related Album <--> Track <--> Artist
Next, I have this form:
class SearchForm(FlaskForm):
    search_by_album = StringField('Album', validators=[Optional()])
    search_by_artist = StringField('Artist', validators=[Optional()])
    search_track = StringField('Track', validators=[Optional()])
    year = StringField('Year', validators=[Optional(), Length(max=4)])
My idea is to let the user fill in any combination of fields (at least one is required), so I've got this function, which receives SearchForm().data (an immutable dict of 'field_name': 'data'):
def construct_query(form):
    query = db.session.query(*[field.label.text for field in form if field.data and field.name != 'csrf_token'])
    if form.search_by_album.data:
        query = query.filter(Album.title == form.search_by_album.data)
    if form.search_by_artist.data:
        query = query.filter(Artist.name == form.search_by_artist.data)
    if form.search_track.data:
        query = query.filter(Track.title == form.search_track.data)
    if form.year.data:
        query = query.filter(Album.year == form.year.data)
    result = query.all()
    return result
My question is whether there is a more abstract way of adding filters in the function above. If one day I decide to add more columns to my tables (or even create new tables), I will have to add more monstrous ifs to construct_query(), which will eventually grow enormous. Or is such an abstraction not Pythonic, because "Explicit is better than implicit"?
PS
I know about forms generated from models, but I don't think they fit my case.
One way would be associating the filter-attribute with the fields at some place, e.g. as a class attribute on the form itself:
class SearchForm(FlaskForm):
    search_by_album = StringField('Album', validators=[Optional()])
    search_by_artist = StringField('Artist', validators=[Optional()])
    search_track = StringField('Track', validators=[Optional()])
    year = StringField('Year', validators=[Optional(), Length(max=4)])
    # Map form field names to database attributes. Names are used as keys
    # because iterating over a form yields bound fields, which are different
    # objects from the unbound fields declared in the class body.
    field_to_attr = {'search_by_album': Album.title,
                     'search_by_artist': Artist.name,
                     'search_track': Track.title,
                     'year': Album.year}
When building the query, you could then build the where clause in a pretty comfortable way:
def construct_query(form):
    query = db.session.query(*[field.label.text for field in form if field.data and field.name != 'csrf_token'])
    for field in form:
        # Skip empty fields and fields without a mapping (e.g. csrf_token)
        if field.data and field.name in form.field_to_attr:
            query = query.filter(form.field_to_attr[field.name] == field.data)
    # or:
    # for name, attr in form.field_to_attr.items():
    #     if form[name].data:
    #         query = query.filter(attr == form[name].data)
    result = query.all()
    return result
Adding a new filter then only requires creating the field and mapping it to an attribute.
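The mapping idea is independent of the query layer. As a rough sketch, here it is with the standard-library sqlite3 module and raw SQL instead of SQLAlchemy, so that it runs on its own; the album table, the FIELD_TO_COLUMN mapping, the build_filters helper and the form data dictionary are all hypothetical.

```python
import sqlite3

# Hypothetical mapping from form field names to column names. Column names
# come only from this fixed mapping, never from user input.
FIELD_TO_COLUMN = {
    "search_by_album": "title",
    "year": "year",
}


def build_filters(form_data):
    """Turn the non-empty form values into a WHERE clause and its parameters."""
    conditions, params = [], []
    for field_name, column in FIELD_TO_COLUMN.items():
        value = form_data.get(field_name)
        if value:
            conditions.append(f"{column} = ?")
            params.append(value)
    where = " AND ".join(conditions) if conditions else "1 = 1"
    return where, params


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE album (id INTEGER PRIMARY KEY, title TEXT, year TEXT)")
conn.executemany(
    "INSERT INTO album (title, year) VALUES (?, ?)",
    [("Blue", "1971"), ("Low", "1977"), ("Heroes", "1977")],
)

# Only `year` is filled in, so only one condition is generated.
where, params = build_filters({"search_by_album": "", "year": "1977"})
rows = conn.execute(
    f"SELECT title FROM album WHERE {where} ORDER BY title", params
).fetchall()
print(where, params, rows)
```

Supporting another filter here means adding one entry to the mapping; the loop itself does not change.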

Updating a field in an Association Object in SQLAlchemy

I have an association object in SQLAlchemy that has some extra information (actually a single field) for 2 other objects.
The first object is a Photo model, the second object is a PhotoSet and the association object is called PhotoInSet which holds the position attribute which tells us in what position is the Photo in the current PhotoSet.
class Photo(Base):
    __tablename__ = 'photos'
    id = Column(Integer, primary_key=True)
    filename = Column(String(128), index=True)
    title = Column(String(256))
    description = Column(Text)
    pub_date = Column(SADateTime)

class PhotoInSet(Base):
    __tablename__ = 'set_order'
    photo_id = Column(Integer, ForeignKey('photos.id'), primary_key=True)
    photoset_id = Column(Integer, ForeignKey('photo_set.id'), primary_key=True)
    position = Column(Integer)
    photo = relationship('Photo', backref='sets')
    def __repr__(self):
        return '<PhotoInSet %r>' % self.position

class PhotoSet(Base):
    __tablename__ = 'photo_set'
    id = Column(Integer, primary_key=True)
    name = Column(String(256))
    description = Column(Text)
    timestamp = Column(SADateTime)
    user_id = Column(Integer, ForeignKey('users.id'))
    user = relationship('User', backref=backref('sets', lazy='dynamic'))
    photo_id = Column(Integer, ForeignKey('photos.id'))
    photos = relationship('PhotoInSet', backref=backref('set', lazy='select'))
I have no problems creating a new PhotoSet saving the position and creating the relationship, which is (roughly) done like this:
# Create the Set
new_set = PhotoSet(name, user)
# Add the photos with positions applied in the order they came
new_set.photos.extend(
[
PhotoInSet(position=pos, photo=photo)
for pos, photo in
enumerate(photo_selection)
]
)
But I am having a lot of trouble attempting to figure out how to update the position when the order changes.
If I had, say, 3 Photo objects with ids 1, 2, and 3, and positions 1, 2, and 3 respectively, the set would look like this after creation:
>>> _set = PhotoSet.get(1)
>>> _set.photos
[<PhotoInSet 1>, <PhotoInSet 2>, <PhotoInSet 3>]
If the order changes (let's invert the order for this example), is there any way SQLAlchemy can help me update the position value? So far I am not happy with any of the approaches I can come up with.
What would be the most concise way to do this?
Take a look at the Ordering List extension:
orderinglist is a helper for mutable ordered relationships. It will
intercept list operations performed on a relationship()-managed
collection and automatically synchronize changes in list position onto
a target scalar attribute.
I believe you could change your schema to look like:
from sqlalchemy.ext.orderinglist import ordering_list
# Photo and PhotoInSet stay the same...
class PhotoSet(Base):
    __tablename__ = 'photo_set'
    id = Column(Integer, primary_key=True)
    name = Column(String(256))
    description = Column(Text)
    photo_id = Column(Integer, ForeignKey('photos.id'))
    photos = relationship('PhotoInSet',
                          order_by="PhotoInSet.position",
                          collection_class=ordering_list('position'),
                          backref=backref('set', lazy='select'))
# Sample usage...
session = Session()
# Create two photos, add them to the set...
p_set = PhotoSet(name=u'TestSet')
p = Photo(title=u'Test')
p2 = Photo(title='uTest2')
p_set.photos.append(PhotoInSet(photo=p))
p_set.photos.append(PhotoInSet(photo=p2))
session.add(p_set)
session.commit()
print 'Original list of titles...'
print [x.photo.title for x in p_set.photos]
print ''
# Change the order...
p_set.photos.reverse()
# Any time you change the order of the list in a way that the existing
# items are in a different place, you need to call "reorder". It will not
# automatically try change the position value for you unless you are appending
# an object with a null position value.
p_set.photos.reorder()
session.commit()
p_set = session.query(PhotoSet).first()
print 'List after reordering...'
print [x.photo.title for x in p_set.photos]
The results of this script...
Original list of titles...
[u'Test', u'uTest2']
List after reordering...
[u'uTest2', u'Test']
In your comment, you said...
So this would mean that if I assign a new list to _set.photos I get the positioning for free?
I doubt this is the case.
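If positions need to be correct regardless of how the list was produced, one option is to renumber explicitly. The sketch below is plain Python, not part of the orderinglist extension: renumber is a hypothetical helper, and SimpleNamespace objects stand in for PhotoInSet instances. It simply writes each item's list index into its position attribute.

```python
from types import SimpleNamespace


def renumber(items, attr="position", start=0):
    """Write each item's index in the list (offset by `start`) into `attr`."""
    for index, item in enumerate(items, start=start):
        setattr(item, attr, index)
    return items


# Stand-ins for PhotoInSet objects, using positions 1..3 as in the question.
photos = [
    SimpleNamespace(title="first", position=1),
    SimpleNamespace(title="second", position=2),
    SimpleNamespace(title="third", position=3),
]

photos.reverse()            # the order changes, the stored positions do not
stale = [p.position for p in photos]

renumber(photos, start=1)   # positions now follow the new order again
fresh = [(p.title, p.position) for p in photos]
print(stale, fresh)
```

Calling such a helper on a freshly built list before assigning it to the relationship avoids relying on behaviour that has not been confirmed here.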
