Quick Summary: I want to have an ordered list of Addresses in SQLAlchemy.
But the order of my list changes when I commit.
Why does this happen and how can I change it?
Long explanation:
I start with a list of Address objects attached to a User object.
Then I replace the first element of the "addresses" list with a
new Address.
Then I print the list of addresses ... so far the order is what I would expect.
Finally I commit. After my commit I do a query but the order of my
addresses list has changed.
So is this just something about databasing in general that I don't understand? Or does a SQLAlchemy InstrumentedList not act like an actual list? I thought I could change the order of elements in a relationship but I don't see how.
from sqlalchemy import Column, Integer, String
from sqlalchemy import create_engine
from sqlalchemy import ForeignKey
from sqlalchemy.orm import relationship
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import sessionmaker

Base = declarative_base()
Session = sessionmaker()

class User(Base):
    __tablename__ = 'users'

    id = Column(Integer, primary_key=True)
    name = Column(String(50))
    fullname = Column(String(50))
    password = Column(String(12))
    addresses = relationship("Address", back_populates="user")

    def __repr__(self):
        return "<User(name='%s', fullname='%s', password='%s')>" % (
            self.name, self.fullname, self.password)

class Address(Base):
    __tablename__ = 'addresses'

    id = Column(Integer, primary_key=True)
    email_address = Column(String, nullable=False)
    user_id = Column(Integer, ForeignKey('users.id'))
    user = relationship("User", back_populates="addresses")

    def __repr__(self):
        return "<Address(email_address='%s')>" % self.email_address

if __name__ == "__main__":
    engine = create_engine('sqlite:///:memory:', echo=False)
    Session.configure(bind=engine)
    Base.metadata.create_all(engine)

    session = Session()
    user = User(name='ed', fullname='Ed Jones', password='edspassword')
    user.addresses = [Address(email_address='jack@google.com'),
                      Address(email_address='j25@yahoo.com')]
    session.add(user)
    session.commit()

    user = session.query(User).filter_by(name='ed').first()
    print("Current order of addresses list at start.")
    print(user.addresses)
    print()

    new_primary_address = Address(email_address='primary@google.com')
    user.addresses[0] = new_primary_address
    print("Current order of addresses list before commit,")
    print("but after changing addresses[0].")
    print(user.addresses)
    print()

    session.commit()

    user = session.query(User).filter_by(name='ed').first()
    print("Current order of addresses list after commit.")
    print(user.addresses)
    print()

    print("Why is the order of the InstrumentedList not persistent?")
    print("Isn't persistent order what makes a list a list?")
It is "databasing" in general. An InstrumentedList does act like an actual list with the added ORM instrumentation Python side, but when you commit the Session's default behaviour is to expire all database loaded state of ORM-managed attributes, and so the list has to be refreshed upon next access. This means that a SELECT such as
2017-05-21 13:32:31,124 INFO sqlalchemy.engine.base.Engine SELECT addresses.id AS addresses_id, addresses.email_address AS addresses_email_address, addresses.user_id AS addresses_user_id
FROM addresses
WHERE ? = addresses.user_id
is issued to fetch the list contents. In SQL the order of rows returned by a SELECT is unspecified unless explicitly ordered, so you may or may not get the items in the same order as before. Also note that the ORM operation
user.addresses[0] = new_primary_address
translates to an UPDATE that sets the user_id of the old address row to NULL and an INSERT of a new row into the table, so you would not get the order you expected even if the rows were returned in insertion order.
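As an aside, you can observe the expiry mechanism directly: with expire_on_commit=False the list is not refreshed on commit and keeps its Python-side order until the object is reloaded. A sketch only, not a fix for the ordering problem:

# Keep attribute state after commit instead of expiring it; the addresses
# list then retains its in-memory order until the User is loaded fresh.
Session = sessionmaker(expire_on_commit=False)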
If the order of addresses matters to you, you must choose ordering. Use the order_by parameter of relationship:
class User(Base):
    ...
    addresses = relationship("Address", back_populates="user",
                             order_by="Address.email_address")
would order the addresses by email address when fetched. SQLAlchemy also provides (thank you for digging that up) a helper collection class for mutable ordered relationships: orderinglist, which helps manage an index/position column on changes, if that column is used as the ordering.
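A minimal sketch of the orderinglist approach, assuming a hypothetical position column added to Address (it is not in your original model):

from sqlalchemy.ext.orderinglist import ordering_list

class Address(Base):
    ...
    # Hypothetical integer column that stores each address's list index.
    position = Column(Integer)

class User(Base):
    ...
    addresses = relationship("Address", back_populates="user",
                             order_by="Address.position",
                             collection_class=ordering_list("position"))

With this, replacing user.addresses[0] updates the position column at flush time, so the list order survives the round trip to the database.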
It seems you'd like the order of addresses to signify which is the primary address of a user. A separate flag column would work better for this.
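A minimal sketch of the flag idea, with a hypothetical is_primary column:

from sqlalchemy import Boolean

class Address(Base):
    ...
    # Hypothetical flag; application code keeps it unique per user.
    is_primary = Column(Boolean, default=False, nullable=False)

primary = session.query(Address).\
    filter_by(user_id=user.id, is_primary=True).one_or_none()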
Related
Imagine I've got the following:
class User:
    id = Column(Integer, primary_key=True)
    username = Column(String(20), nullable=False)
    password_hash = Column(String(HASH_LENGTH), nullable=False)

class LoginAttempts:
    id = Column(Integer, primary_key=True)
    user_id = Column(Integer, ForeignKey(User.id))
    attempted_at = Column(DateTime, default=datetime.datetime.utcnow)
Now, I want to add a relationship to User called last_attempt that retrieves the most recent login attempt. How might one do this?
This seems like a use case for a relationship to an aliased class, which was added in SQLAlchemy 1.3 – before that you'd use a non-primary mapper or other methods such as a custom primary join. The idea is to create a subquery representing a derived table of the latest login attempts per user, which is then aliased to LoginAttempts and used as the target of a relationship. The exact query used to derive the latest attempts depends on your DBMS [1], but a generic left-join "antijoin" will work in most. Start by generating the (sub)query for the latest login attempts:
from sqlalchemy import and_, outerjoin, select
from sqlalchemy.orm import aliased

newer_attempts = aliased(LoginAttempts)

# This reads as "find login attempts for which no newer attempt with a larger
# attempted_at exists". The same could be achieved using NOT EXISTS as well.
latest_login_attempts_query = select([LoginAttempts]).\
    select_from(
        outerjoin(LoginAttempts, newer_attempts,
                  and_(newer_attempts.user_id == LoginAttempts.user_id,
                       newer_attempts.attempted_at > LoginAttempts.attempted_at))).\
    where(newer_attempts.id == None).\
    alias()

latest_login_attempts = aliased(LoginAttempts, latest_login_attempts_query)
Then just add the relationship attribute to your User model:
User.last_attempt = relationship(latest_login_attempts, uselist=False,
                                 viewonly=True)
[1]: For example, in PostgreSQL you could replace the LEFT JOIN subquery with a LATERAL subquery, NOT EXISTS, a query using window functions, or SELECT DISTINCT ON (user_id) ... ORDER BY user_id, attempted_at DESC.
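For instance, a sketch of the DISTINCT ON variant (PostgreSQL only; otherwise analogous to the antijoin query above):

# PostgreSQL's DISTINCT ON keeps the first row per user_id; the ORDER BY
# makes that first row the one with the newest attempted_at.
latest_attempts_pg = select([LoginAttempts]).\
    distinct(LoginAttempts.user_id).\
    order_by(LoginAttempts.user_id, LoginAttempts.attempted_at.desc()).\
    alias()

latest_login_attempts = aliased(LoginAttempts, latest_attempts_pg)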
Although the selected answer is more robust, another way you could accomplish this is to use lazy='dynamic' and order_by:
User.last_attempted = relationship(LoginAttempts, order_by=desc(LoginAttempts.attempted_at), lazy='dynamic')
Be careful though, because this returns a query object (and will require .first() or equivalent), and you will need to use a limit clause:
last_attempted_login = session.query(User).get(my_user_id).last_attempted.limit(1).first()
I have two models, one is Identification which contains two IDs (first_id and second_id) and the second is User. The idea is that only authorised users will be given their first_id and second_id pair of values. They go to the site and login by entering the two id's plus a username and password (which they generate there and then).
I am trying to achieve two things here:
Pre-populate the Identification table with many (let's say 100) first_id/second_id values that will serve as the correct value pairs for logging in.
Set up the User class in such a way that only if the user enters a correct first_id/second_id pair in the login form can they log in (presumably this involves checking the form data against the Identification table somehow).
Here are the model classes:
class Identification(db.Model):
    id = db.Column(db.Integer, primary_key=True)
    first_id = db.Column(db.Text, unique=True)
    second_id = db.Column(db.Text, unique=True)

    def __init__(self, first_id, second_id):
        self.first_id = first_id
        self.second_id = second_id

    def __repr__(self):
        return f"ID: {self.id}, first_id: {self.first_id}, second_id: {self.second_id}"

class User(db.Model, UserMixin):
    __tablename__ = 'user'
    first_id = db.relationship('Identification', backref='identificationFID', uselist=False)
    second_id = db.relationship('Identification', backref='identificationSID', uselist=False)
    id = db.Column(db.Integer, primary_key=True)
    username = db.Column(db.Text, unique=True, index=True)
    password_hash = db.Column(db.Text(128))
    identification_id = db.Column(db.Integer, db.ForeignKey('identification.id'), unique=True)
    first_id = db.Column(db.Text, unique=True)
    second_id = db.Column(db.Text, unique=True)
I would appreciate any help on this as I'm struggling and this is really above my understanding of python/Flask. Thanks all!
The other answer here didn't work for me, because the create_tables() function, being part of the User class, required that I pass an instance of that class.
The solution I came up with was to call the function after db.create_all(). This seemed like a good place to put the call because of the @app.before_first_request decorator.
__init__.py
@app.before_first_request
def create_tables():
    """Create Tables and populate certain ones"""
    db.create_all()
    from app.models.init_defaults import init_defaults
    init_defaults()
init_defaults.py
def init_defaults():
    """Pre-Populate Role Table"""
    if Role.query.first() is None:
        roles = ['Admin', 'Master', 'Apprentice']
        for role in roles:
            user_role = Role(access_level=role)
            if role != 'Apprentice':
                user_role.set_password('Passw0rd!')
            db.session.add(user_role)
        db.session.commit()
Due to the decorator, the function is now only called once per instance. Another solution I could imagine working would be to use events:
https://dzone.com/articles/how-to-initialize-database-with-default-values-in
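A minimal sketch of the event-based idea, assuming the same Role model as above (and skipping the password handling for brevity):

from sqlalchemy import event

def insert_default_roles(target, connection, **kwargs):
    # Runs right after CREATE TABLE for the role table.
    connection.execute(
        target.insert(),
        [{'access_level': role} for role in ('Admin', 'Master', 'Apprentice')]
    )

event.listen(Role.__table__, 'after_create', insert_default_roles)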
Note: This is a development solution not fit for production.
You can use mock data to populate these tables: create a function in the models .py file that adds objects to the DB using the ORM, then call that function in __init__.py, which will populate the data once your Flask server starts.
Update:
Here is some code for your reference.
Model.py
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy import create_engine
from sqlalchemy import Column, Integer, String
from sqlalchemy.orm import sessionmaker

Base = declarative_base()
engine = create_engine('sqlite:///:memory:', echo=True)
Session = sessionmaker(bind=engine)
session = Session()

class User(Base):
    __tablename__ = 'users'
    id = Column(Integer, primary_key=True)
    name = Column(String)
    fullname = Column(String)
    # Create getters/setters as needed.

    def create_tables():
        Base.metadata.create_all(engine)
        user = User()
        user.id = 1
        user.name = "ABC"
        user.fullname = "ABCDEF"
        session.add(user)
        # similarly create more user objects with mock data and add them
        # with session.add(), then persist everything:
        session.commit()
__init__.py
from model import User
User.create_tables()
The (executable) code below causes an IntegrityError because an entity with the same PK is added twice (implicitly) to the session. The session does not know that the entity represents the same object (same PK) and triggers two INSERT statements. I was under the impression that the session was supposed to detect that automatically. I realise that both entities are transient/detached and I would need to perform a merge to get a managed instance. But is this not avoidable?
A few notes concerning the example code:
It's immensely simplified to demonstrate the issue at hand. The real code I have is quite a bit more complex.
An important thing to note is that the main entity is "built" using a function "build_data" which does not have any reference to the session. The session is created on a higher level of the application.
The concept of User and Article are just for illustration. In reality it's different business entities. I replaced them here with "article" and "user" as it's a well-known concept.
Between the creation of article and article2 a lot of other stuff is happening, creating a fairly complex data-structure. I also don't know which line is happening first as a non-deterministic loop is involved (over dictionary keys).
I could likely solve this by either:
Passing the session to the build_data function and merging as required, or
Keep a manual reference to the instances and only create them once (a sketch of this idea follows the code below)
What I would like to know: Can I avoid both approaches above to keep the code simple?
Here's the code in question:
from sqlalchemy import (
    create_engine,
    Column,
    ForeignKeyConstraint,
    Unicode,
)
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import (
    relationship,
    sessionmaker)

Base = declarative_base()

class User(Base):
    __tablename__ = 'user'
    name = Column(Unicode, nullable=False, primary_key=True)

class Article(Base):
    __tablename__ = 'article'
    __table_args__ = (
        ForeignKeyConstraint(
            ['user_'],
            ['user.name'],
            ondelete='CASCADE',
            onupdate='CASCADE'),
    )
    user_ = Column(Unicode, nullable=False, primary_key=True)
    title = Column(Unicode, nullable=False, primary_key=True)
    content = Column(Unicode)

    user = relationship(User, backref='articles')

# Prepare the session
Session = sessionmaker()
engine = create_engine('sqlite:///:memory:', echo=True)
Session.configure(bind=engine)
Base.metadata.create_all(engine)

# --- The main code -----------------------------------------------------------

def build_data():
    user = User(name='JDoe')
    article = Article(user=user, title='Hello World', content='Foobar')
    print(article)

    # More stuff is happening here.

    article2 = Article(user=user, title='Hello World', content='Foobar')
    print(article2)

    return user

session = Session()
entity = build_data()
session.add(entity)
session.flush()

# -----------------------------------------------------------------------------
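For what it's worth, a minimal sketch of the second option, a manual cache keyed by the natural key (the helper below is hypothetical, not part of the question's code):

_article_cache = {}

def get_article(user, title, content):
    # Create each Article once per (user, title) primary key and return
    # the same instance on subsequent calls, so the session never sees
    # two transient objects with the same PK.
    key = (user.name, title)
    if key not in _article_cache:
        _article_cache[key] = Article(user=user, title=title, content=content)
    return _article_cache[key]

Inside build_data(), both article and article2 would then be the very same object, and the flush would issue a single INSERT.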
I need to alter data during an Alembic upgrade.
I currently have a 'players' table in a first revision:
def upgrade():
    op.create_table('players',
        sa.Column('id', sa.Integer(), nullable=False),
        sa.Column('name', sa.Unicode(length=200), nullable=False),
        sa.Column('position', sa.Unicode(length=200), nullable=True),
        sa.Column('team', sa.Unicode(length=100), nullable=True),
        sa.PrimaryKeyConstraint('id')
    )
I want to introduce a 'teams' table. I've created a second revision:
def upgrade():
    op.create_table('teams',
        sa.Column('id', sa.Integer(), nullable=False),
        sa.Column('name', sa.String(length=80), nullable=False)
    )
    op.add_column('players', sa.Column('team_id', sa.Integer(), nullable=False))
I would like the second migration to also add the following data:
Populate teams table:
INSERT INTO teams (name) SELECT DISTINCT team FROM players;
Update players.team_id based on players.team name:
UPDATE players AS p JOIN teams AS t SET p.team_id = t.id WHERE p.team = t.name;
How do I execute inserts and updates inside the upgrade script?
What you are asking for is a data migration, as opposed to the schema migration that is most prevalent in the Alembic docs.
This answer assumes you are using declarative (as opposed to class-Mapper-Table or core) to define your models. It should be relatively straightforward to adapt this to the other forms.
Note that Alembic provides some basic data functions: op.bulk_insert() and op.execute(). If the operations are fairly minimal, use those. If the migration requires relationships or other complex interactions, I prefer to use the full power of models and sessions as described below.
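For the minimal case, a sketch of op.bulk_insert() (the ad-hoc table and row values here are illustrative only, not from the question):

import sqlalchemy as sa
from alembic import op

# An ad-hoc table object is enough here; it doesn't need to match a model.
teams_table = sa.table('teams', sa.column('name', sa.String))

def upgrade():
    op.bulk_insert(teams_table, [
        {'name': 'Team A'},
        {'name': 'Team B'},
    ])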
The following is an example migration script that sets up some declarative models that will be used to manipulate data in a session. The key points are:
Define the basic models you need, with the columns you'll need. You don't need every column, just the primary key and the ones you'll be using.
Within the upgrade function, use op.get_bind() to get the current connection, and make a session with it.
Or use bind.execute() to use SQLAlchemy's lower level to write SQL queries directly. This is useful for simple migrations.
Use the models and session as you normally would in your application.
"""create teams table
Revision ID: 169ad57156f0
Revises: 29b4c2bfce6d
Create Date: 2014-06-25 09:00:06.784170
"""
revision = '169ad57156f0'
down_revision = '29b4c2bfce6d'
from alembic import op
import sqlalchemy as sa
from sqlalchemy import orm
from sqlalchemy.ext.declarative import declarative_base
Base = declarative_base()
class Player(Base):
__tablename__ = 'players'
id = sa.Column(sa.Integer, primary_key=True)
name = sa.Column(sa.String, nullable=False)
team_name = sa.Column('team', sa.String, nullable=False)
team_id = sa.Column(sa.Integer, sa.ForeignKey('teams.id'), nullable=False)
team = orm.relationship('Team', backref='players')
class Team(Base):
__tablename__ = 'teams'
id = sa.Column(sa.Integer, primary_key=True)
name = sa.Column(sa.String, nullable=False, unique=True)
def upgrade():
bind = op.get_bind()
session = orm.Session(bind=bind)
# create the teams table and the players.team_id column
Team.__table__.create(bind)
op.add_column('players', sa.Column('team_id', sa.ForeignKey('teams.id'), nullable=False)
# create teams for each team name
teams = {name: Team(name=name) for name in session.query(Player.team).distinct()}
session.add_all(teams.values())
# set player team based on team name
for player in session.query(Player):
player.team = teams[player.team_name]
session.commit()
# don't need team name now that team relationship is set
op.drop_column('players', 'team')
def downgrade():
bind = op.get_bind()
session = orm.Session(bind=bind)
# re-add the players.team column
op.add_column('players', sa.Column('team', sa.String, nullable=False)
# set players.team based on team relationship
for player in session.query(Player):
player.team_name = player.team.name
session.commit()
op.drop_column('players', 'team_id')
op.drop_table('teams')
The migration defines separate models because the models in your code represent the current state of the database, while the migrations represent steps along the way. Your database might be in any state along that path, so the models might not sync up with the database yet. Unless you're very careful, using the real models directly will cause problems with missing columns, invalid data, etc. It's clearer to explicitly state exactly what columns and models you will use in the migration.
You can also use direct SQL (see the Alembic Operations reference), as in the following example:
from alembic import op

# revision identifiers, used by Alembic.
revision = '1ce7873ac4ced2'
down_revision = '1cea0ac4ced2'
branch_labels = None
depends_on = None

def upgrade():
    # ### commands made by andrew ###
    op.execute('UPDATE STOCK SET IN_STOCK = -1 WHERE IN_STOCK IS NULL')
    # ### end Alembic commands ###

def downgrade():
    # ### commands auto generated by Alembic - please adjust! ###
    pass
    # ### end Alembic commands ###
I recommend using SQLAlchemy Core statements with an ad-hoc table, as detailed in the official documentation, because it allows agnostic SQL and pythonic writing, and is also self-contained. SQLAlchemy Core is the best of both worlds for migration scripts.
Here is an example of the concept:
from sqlalchemy.sql import table, column
from sqlalchemy import String
from alembic import op

account = table('account',
    column('name', String)
)
op.execute(
    account.update().\
        where(account.c.name == op.inline_literal('account 1')).\
        values({'name': op.inline_literal('account 2')})
)
# If an insert is required
from sqlalchemy.sql import insert
from sqlalchemy import orm

bind = op.get_bind()
session = orm.Session(bind=bind)

data = {
    "name": "John",
}
ret = session.execute(insert(account).values(data))
# for use in other insert calls
account_id = ret.lastrowid
Let's say I have the following structure (using Flask-SQLAlchemy):
class User(db.Model):
    id = db.Column(db.Integer, primary_key=True)
    name = db.Column(db.String, nullable=False, index=True)
    # The following line throws an error at runtime.
    variant = db.Column(db.Integer, nullable=False, index=True,
                        default=select(func.count(User.id)).where(User.name == self.name))

    def __init__(self, name):
        super(User, self).__init__()
        self.name = name

    @property
    def clause(self):
        return '/'.join([str(self.variant), self.name])
Problem is, "User is not defined." I would like to model a system with Users who may choose the same name but add a field to differentiate between users in a systemic way without using (thereby exposing) the "id" field.
Anyone know how to make a self-referential query to use to populate a default value?
The issue of the default not referring to User here is solved by just assigning "default" to the Column once User is available. However, that's not going to solve the problem here, because "self" means nothing either: there is no User method being called, so you can't just refer to "self". The challenge with this statement is that you want it to be rendered as an inline sub-SELECT, but it still needs to know the in-memory value of ".name", so you have to assign that sub-SELECT per object in some way.
The usual way people approach ORM-level INSERT defaults like this is with a before_insert handler.
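For illustration, a minimal sketch of such a handler, relying on the assign-a-SQL-expression trick described below (it assumes the User model from the final example and is untested):

from sqlalchemy import event, func, select

@event.listens_for(User, 'before_insert')
def default_variant(mapper, connection, target):
    # Assign an inline sub-SELECT to the attribute; the ORM renders it
    # within the INSERT statement at flush time.
    target.variant = select([func.count(User.id)]).where(User.name == target.name)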
Another way that's worth pointing out is creating a SQL-level INSERT trigger. This is overall the most "traditional" approach: here you need access to the row being inserted, and triggers define a means of getting at the values of that row.
As far as using a default at the column level, you'd need a callable function as the default which could look at the current value of the row being inserted, but at the moment that means your SELECT statement would not be rendered inline with the INSERT statement; you'd have to pre-execute the SELECT, which is not really what we want here.
Anyway, the basic task of rendering a SQL expression into the INSERT, while also having that SQL expression refer to some local per-object state, is achieved by assigning that expression to the attribute; the ORM picks up on this at flush time. Below we do this in the constructor, but it can also occur inside of before_insert(), as sketched above:
from sqlalchemy import *
from sqlalchemy.orm import *
from sqlalchemy.ext.declarative import declarative_base

Base = declarative_base()

class User(Base):
    __tablename__ = 'user'
    id = Column(Integer, primary_key=True)
    name = Column(String, nullable=False, index=True)
    variant = Column(Integer, nullable=False, index=True)

    def __init__(self, name):
        self.name = name
        # Assign an inline sub-SELECT; the ORM renders it within the
        # INSERT statement at flush time.
        self.variant = select([func.count(User.id)]).where(User.name == self.name)

e = create_engine("sqlite://", echo=True)
Base.metadata.create_all(e)

s = Session(e)
s.add(User(name='n1'))
s.commit()

s.add(User(name='n1'))
s.commit()

print(s.query(User.variant).all())