SQLALCHEMY/FASTAPI/POSTGRESQL | Only retrieve double entries - python

I have a database with a table named friends. That table has two columns, "user_id" and "friend_id".
Those are foreign keys from the Users table.
My friends table right now:
user_id | friend_id
-------------------------------------+-------------------------------------
google-oauth2|11539665289********** | google-oauth2|11746442253**********
google-oauth2|11746442253********** | google-oauth2|11539665289**********
google-oauth2|11746442253********** | google-oauth2|11111111111**********
The first two rows are the same IDs but flipped. I want to retrieve those users, because they added each other. The third row shows a user who only added someone else; that one shouldn't be retrieved.
My SQLModels (models.py):
class Friends(SQLModel, table=True):
    __tablename__ = "friends"
    user_id: str = Field(sa_column=Column('user_id', VARCHAR(length=50), primary_key=True), foreign_key="users.id")
    friend_id: str = Field(sa_column=Column('friend_id', VARCHAR(length=50), primary_key=True), foreign_key="users.id")

class UserBase(SQLModel):
    id: str
    username: Optional[str]
    country_code: Optional[str]
    phone: Optional[str]
    picture: Optional[str]

    class Config:
        allow_population_by_field_name = True

class User(UserBase, table=True):
    __tablename__ = 'users'
    id: str = Field(primary_key=True)
    username: Optional[str] = Field(sa_column=Column('username', VARCHAR(length=50), unique=True, default=None))
    phone: Optional[str] = Field(sa_column=Column('phone', VARCHAR(length=20), unique=True, default=None))
    picture: Optional[str] = Field(sa_column=Column('picture', VARCHAR(length=255), default=None))
My FastAPI endpoint:
@router.get("", status_code=status.HTTP_200_OK, response_model=models.FriendsList, name="Get Friends for ID", tags=["friends"])
async def get_friends(
    user_id: str = Query(default=None, description="The user_id that you want to retrieve friends for"),
    session: Session = Depends(get_session)
):
    stm = select(models.User, models.Friends).where(models.User.id == models.Friends.friend_id, models.Friends.user_id == user_id)
    res = session.exec(stm).all()
    if not res:
        raise HTTPException(status_code=status.HTTP_404_NOT_FOUND,
                            detail="There are no friendships associated with this id.")
    users = []
    for item in res:
        users.append(item[0])
    return models.FriendsList(users=users)
My code works perfectly fine; only the query needs to be replaced.
stm = select(models.User, models.Friends).where(models.User.id == models.Friends.friend_id, models.Friends.user_id == user_id)
res = session.exec(stm).all()
This query returns every User that has the given ID as user_id, but doesn't check if there is an entry the other way around.
Example for what I want to get:
I make a GET request to my endpoint with the id google-oauth2|11746442253**********. I would get the user google-oauth2|11539665289**********. (The user google-oauth2|11111111111********** would not be retrieved because there is no entry the other way around.)
I hope you guys understand my problem. If there are any questions feel free to ask.
Best regards,
Colin

As I said in the comment, without a reproducible example I can't try this myself, but I just had an idea. You might need to modify the subquery syntax a little bit, but I reckon this could work in theory:
stmt = select(Friends.user_id, Friends.friend_id).where(
    tuple_(Friends.user_id, Friends.friend_id).in_(
        select(Friends.friend_id, Friends.user_id)
    )
)
Basically, for every (user_id, friend_id) pair it checks whether a matching (friend_id, user_id) pair exists.
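Adapted to the endpoint from the question, a minimal sketch could look like the following (assuming tuple_ is imported from sqlalchemy; PostgreSQL supports the (a, b) IN (SELECT …) row-value syntax natively, but I haven't run this against the real schema):

from sqlalchemy import tuple_
from sqlmodel import select

# Keep a (user_id, friend_id) row only if the reversed (friend_id, user_id)
# row also exists, then join to User to return the friend's user record.
stm = (
    select(models.User)
    .join(models.Friends, models.User.id == models.Friends.friend_id)
    .where(models.Friends.user_id == user_id)
    .where(
        tuple_(models.Friends.user_id, models.Friends.friend_id).in_(
            select(models.Friends.friend_id, models.Friends.user_id)
        )
    )
)
res = session.exec(stm).all()  # a plain list of User objects, no tuples to unpack

With this variant the endpoint's unpacking loop (item[0]) becomes unnecessary, since only User is selected.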

Are you able to add another column called "accepted" with values like 0 or 1?
user_id                             | friend_id                           | accepted
------------------------------------+-------------------------------------+---------
google-oauth2|11539665289********** | google-oauth2|11746442253********** | 1
google-oauth2|11746442253********** | google-oauth2|11539665289********** | 1
google-oauth2|11746442253********** | google-oauth2|11111111111********** | 0
Then you have two options (a sketch of the first follows below):
You could define a relationship on the User model called "friends", set the lazy parameter to "dynamic" (lazy='dynamic'), and then query: user.friends.filter_by(accepted=1).all()
Or you could write a query like:
query = Friends.query.filter(Friends.user_id == user_id).filter(Friends.accepted == 1).all()
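A minimal sketch of the first option, assuming classic declarative models with the new column (names are illustrative, not taken from the question's codebase):

from sqlalchemy import Column, ForeignKey, Integer, String
from sqlalchemy.orm import declarative_base, relationship

Base = declarative_base()

class Friends(Base):
    __tablename__ = "friends"
    user_id = Column(String(50), ForeignKey("users.id"), primary_key=True)
    friend_id = Column(String(50), ForeignKey("users.id"), primary_key=True)
    accepted = Column(Integer, default=0)  # 0 = pending, 1 = mutual

class User(Base):
    __tablename__ = "users"
    id = Column(String(50), primary_key=True)
    # lazy="dynamic" makes user.friends a query object you can filter further;
    # foreign_keys disambiguates between the two FKs pointing at users.id.
    friends = relationship(Friends, foreign_keys=[Friends.user_id], lazy="dynamic")

# Usage, assuming a loaded user instance:
# accepted_rows = user.friends.filter_by(accepted=1).all()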
Generally, relational databases aren't the best fit for these kinds of scenarios; if you're flexible and not too far in, you could check out a NoSQL solution like MongoDB.

Related

How should we manage datetime fields in SQLModel in python?

Let's say I want to create an API with a Hero SQLModel; below is a minimal viable example illustrating this:
from typing import Optional
from sqlmodel import Field, Relationship, SQLModel
from datetime import datetime
from sqlalchemy import Column, TIMESTAMP, text

class HeroBase(SQLModel):  # essential fields
    name: str = Field(index=True)
    secret_name: str
    age: Optional[int] = Field(default=None, index=True)
    created_datetime: datetime = Field(sa_column=Column(TIMESTAMP(timezone=True),
                                                        nullable=False, server_default=text("now()")))
    updated_datetime: datetime = Field(sa_column=Column(TIMESTAMP(timezone=True),
                                                        nullable=False, server_onupdate=text("now()")))
    team_id: Optional[int] = Field(default=None, foreign_key="team.id")

class Hero(HeroBase, table=True):  # essential fields + unique identifier + relationships
    id: Optional[int] = Field(default=None, primary_key=True)
    team: Optional["Team"] = Relationship(back_populates="heroes")

class HeroRead(HeroBase):  # + unique identifier
    id: int

class HeroCreate(HeroBase):  # same as Base
    pass

class HeroUpdate(SQLModel):  # all essential fields, without the datetimes
    name: Optional[str] = None
    secret_name: Optional[str] = None
    age: Optional[int] = None
    team_id: Optional[int] = None

class HeroReadWithTeam(HeroRead):
    team: Optional["TeamRead"] = None
My question is, what should the SQLModel for HeroUpdate look like?
Does it include the created_datetime and updated_datetime fields?
How do I delegate the responsibility of creating these fields to the database instead of using the app to do so?
Does [the HeroUpdate model] include the created_datetime and updated_datetime fields?
Well, you tell me! Should the API endpoint for updating an entry in the hero table be able to change the value in the create_datetime and update_datetime columns? I would say, obviously not.
Fields like that serve as metadata about entries in the DB and are typically only ever written to by the DB. It is strange enough that you include them in the model for creating new entries in the table. Why would you let the API set the value of when an entry in the DB was created/updated?
One could even argue that those fields should not be visible to "the outside" at all. But I suppose you could include them in HeroRead for example, if you wanted to present that metadata to the consumers of the API.
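For instance, a hedged sketch of such a read model (the class name is mine, not from the original answer):

class HeroReadWithTimestamps(HeroRead):
    created_datetime: Optional[datetime] = None
    updated_datetime: Optional[datetime] = None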
How do I delegate the responsibility of creating [the created_datetime and updated_datetime] fields to the database instead of using the app to do so?
You already have delegated it. You (correctly) defined server_default and server_onupdate values for the Column instances that represent those fields. That means the DBMS will set their values accordingly, unless they are passed explicitly in a SQL statement.
What I would suggest is the following re-arrangement of your models:
from datetime import datetime
from typing import Optional
from sqlmodel import Column, Field, SQLModel, TIMESTAMP, text

class HeroBase(SQLModel):
    name: str = Field(index=True)
    secret_name: str
    age: Optional[int] = Field(default=None, index=True)

class Hero(HeroBase, table=True):
    id: Optional[int] = Field(default=None, primary_key=True)
    created_datetime: Optional[datetime] = Field(sa_column=Column(
        TIMESTAMP(timezone=True),
        nullable=False,
        server_default=text("CURRENT_TIMESTAMP"),
    ))
    updated_datetime: Optional[datetime] = Field(sa_column=Column(
        TIMESTAMP(timezone=True),
        nullable=False,
        server_default=text("CURRENT_TIMESTAMP"),
        server_onupdate=text("CURRENT_TIMESTAMP"),
    ))

class HeroRead(HeroBase):
    id: int

class HeroCreate(HeroBase):
    pass

class HeroUpdate(SQLModel):
    name: Optional[str] = None
    secret_name: Optional[str] = None
    age: Optional[int] = None
(I use CURRENT_TIMESTAMP to test with SQLite.)
Demo:
from sqlmodel import Session, create_engine, select

# Initialize database & session:
engine = create_engine("sqlite:///", echo=True)
SQLModel.metadata.create_all(engine)
session = Session(engine)

# Create:
hero_create = HeroCreate(name="foo", secret_name="bar")
session.add(Hero.from_orm(hero_create))
session.commit()

# Query (SELECT):
statement = select(Hero).filter(Hero.name == "foo")
hero = session.execute(statement).scalar()

# Read (Response):
hero_read = HeroRead.from_orm(hero)
print(hero_read.json(indent=4))

# Update (comprehensive as in the docs, although we change only one field):
hero_update = HeroUpdate(secret_name="baz")
hero_update_data = hero_update.dict(exclude_unset=True)
for key, value in hero_update_data.items():
    setattr(hero, key, value)
session.add(hero)
session.commit()

# Read again:
hero_read = HeroRead.from_orm(hero)
print(hero_read.json(indent=4))
Here is what the CREATE statement looks like:
CREATE TABLE hero (
    created_datetime TIMESTAMP DEFAULT CURRENT_TIMESTAMP NOT NULL,
    updated_datetime TIMESTAMP DEFAULT CURRENT_TIMESTAMP NOT NULL,
    name VARCHAR NOT NULL,
    secret_name VARCHAR NOT NULL,
    age INTEGER,
    id INTEGER NOT NULL,
    PRIMARY KEY (id)
)
Here is the output of the two HeroRead instances:
{
    "name": "foo",
    "secret_name": "bar",
    "age": null,
    "id": 1
}
{
    "name": "foo",
    "secret_name": "baz",
    "age": null,
    "id": 1
}
I did not include the timestamp columns in the read model, but SQLite does not honor ON UPDATE anyway.
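One caveat worth adding (my note, not part of the original answer): server_onupdate does not emit any DDL; it mainly tells SQLAlchemy that the server computes the value. If you need updated_datetime to actually change on every UPDATE across databases, a common approach is a client-side onupdate expression, which SQLAlchemy renders into the UPDATE statement itself. A hedged sketch of replacing the updated_datetime field in the Hero model above:

from sqlalchemy import func

# Inside class Hero(HeroBase, table=True), replace updated_datetime with:
updated_datetime: Optional[datetime] = Field(sa_column=Column(
    TIMESTAMP(timezone=True),
    nullable=False,
    server_default=text("CURRENT_TIMESTAMP"),
    onupdate=func.now(),  # now() is embedded in every UPDATE touching the row
))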

SQLAlchemy I/O where parameters

I'm currently working on a project with a database that I access through SQLAlchemy's asyncio support, and I've stumbled on a problem I can't solve. In the following, DbSession is an asynchronous session and select is the select function of the library.
I have a class Player with three attributes: id: BigInteger (primary key), name: String, and other_id: BigInteger (nullable; when not null it can serve as an alternate key).
class Player(Base):
    __tablename__ = "players"
    id = Column(Integer, primary_key=True)
    name = Column(String)
    other_id = Column(BigInteger, nullable=True)
I implemented two methods, get and get_by_id. get works well and selects a Player from the table by its id:
@classmethod
async def get(cls, id):
    query = select(cls).where(cls.id == id)
    results = await DbSession.execute(query)
    result = results.scalars().all()[0]
    return result
My problem comes with get_by_id (defined as get_dc_id below), which is supposed to find a player through its other_id.
I tried:
@classmethod
async def get_dc_id(cls, id):
    query = select(cls).filter(cls.other_id == id)
    results = await DbSession.execute(query)
    result = results.scalars().all()[0]
    return result
As well as:
@classmethod
async def get_dc_id(cls, id):
    query = select(cls).where(cls.other_id == id)
    results = await DbSession.execute(query)
    result = results.scalars().all()[0]
    return result
But both send back an error:
ProgrammingError: (sqlalchemy.dialects.postgresql.asyncpg.ProgrammingError) <class 'asyncpg.exceptions.UndefinedFunctionError'>: operator does not exist: character varying = bigint
HINT: No operator matches the given name and argument types. You might need to add explicit type casts.
[SQL: SELECT players.id, players.name, players.other_id
FROM players
WHERE players.other_id = %s]
[parameters: (331534096054616068,)]
If I understand this right, the parameter of the call is the id I gave to my function, wrapped in what looks like a tuple (the driver's parameter list). The error says the comparison doesn't match the BigInteger type other_id is supposed to have. I checked multiple times that I'm passing an integer to the method (here 331534096054616068). I must admit I don't know whether the tuple wrapping is normal behavior, as I just started working with SQLAlchemy.
Any hint or help will be greatly appreciated.
It seems that your actual schema differs from the model and other_id is a VARCHAR column in the database.
Can you inspect the PostgreSQL database directly with psql and check whether \d players matches the model?
I get the same error if I change your column definition to other_id = Column(String, nullable=True) and create the schema, but it works if I recreate the schema with other_id = Column(BigInteger, nullable=True).
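Two hedged ways to resolve the mismatch, depending on whether you can change the schema (the ALTER statement assumes PostgreSQL and that all existing values are numeric):

from sqlalchemy import BigInteger, cast, select

# Option 1 (preferred): fix the schema so the DB matches the model, e.g. in psql:
#   ALTER TABLE players ALTER COLUMN other_id TYPE bigint USING other_id::bigint;

# Option 2: keep the varchar column and make the comparison type-consistent:
query = select(Player).where(Player.other_id == str(id))  # compare as strings
# ...or cast the column server-side:
query = select(Player).where(cast(Player.other_id, BigInteger) == id)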

SQLAlchemy: is it possible to store session wide properties

I am working on software that manipulates SQL tables using SQLAlchemy.
Each operation a user performs (insertion, modification, deletion) must be logged to a specific LOG table.
The log table looks like this:
+---------+------------------------------+
| user_id | log                          |
+---------+------------------------------+
| 21      | Value x added in table y     |
| 12      | Value z deleted from table w |
+---------+------------------------------+
To write such logs, I have a function defined on the Log model that inserts a new log entry, with the following prototype:
class Foo(Base):
    __tablename__ = 'foo'
    id = Column(Integer, primary_key=True)
    value = Column(String)

    @staticmethod
    def insert(value):
        item = Foo()
        item.value = value
        session.add(item)
        Log.add(item)

class Log(Base):
    __tablename__ = 'log'
    id = Column(Integer, primary_key=True)  # a mapped class needs a primary key
    user_id = Column(Integer, nullable=False)
    value = Column(String, nullable=False)

    @staticmethod
    def add(item):
        logitem = Log()
        logitem.user_id = x  # 'x' is the problem: where should the user_id come from?
        logitem.value = "Insertion of %s" % item.value
        session.add(logitem)
The code above does not work because 'x' for the user is not defined.
I don't want to pass the user_id as an argument every time I call Foo.insert. I would like to know whether it is possible to bind the user_id to the session, so that it would be defined once and persist for all SQL queries.
Sessions have an info attribute, a user-modifiable dictionary. The dictionary can be pre-populated when a session is created, and modified and accessed thereafter:
s = Session(info={'foo': 'bar'})
foo = s.info['foo']
s.info['baz'] = 'quux'
As far as I know, you can't use the session as a global namespace beyond that, and it doesn't look like a good idea anyway. If you really need to do that, it should be done by your application, using whatever other global or session state you have, not by coupling SQLAlchemy to your authentication/user-session process.
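To connect this back to the question, a hedged sketch of how Log.add could read the user from session.info (this assumes one session per authenticated user; the engine and model details are illustrative):

from sqlalchemy import Column, Integer, String
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()

# At login / request setup, stash the current user's id on the session:
session = Session(bind=engine, info={"user_id": 21})  # engine assumed to exist

class Log(Base):
    __tablename__ = "log"
    id = Column(Integer, primary_key=True)
    user_id = Column(Integer, nullable=False)
    value = Column(String, nullable=False)

    @staticmethod
    def add(item):
        logitem = Log()
        logitem.user_id = session.info["user_id"]  # set once, reused for all inserts
        logitem.value = "Insertion of %s" % item.value
        session.add(logitem)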

SQLAlchemy: One-Way Relationship, Correlated Subquery

Thanks in advance for your help.
I have two entities, Human and Chimp. Each has a collection of metrics, which can contain subclasses of a MetricBlock, for instance CompleteBloodCount (with fields WHITE_CELLS, RED_CELLS, PLATELETS).
So my object model looks like (forgive the ASCII art):
---------  metrics   ---------------        ----------------------
| Human | ---------> | MetricBlock | <|---- | CompleteBloodCount |
---------            ---------------        ----------------------
                            ^
---------  metrics          |
| Chimp | -------------------
---------
This is implemented with the following tables:
Chimp (id, …)
Human (id, …)
MetricBlock (id, dtype)
CompleteBloodCount (id, white_cells, red_cells, platelets)
CholesterolCount (id, hdl, ldl)
ChimpToMetricBlock(chimp_id, metric_block_id)
HumanToMetricBlock(human_id, metric_block_id)
So a human knows its metric blocks, but a metric block does not know its human or chimp.
I would like to write a query in SQLAlchemy to find all CompleteBloodCounts for a particular human. In SQL I could write something like:
SELECT cbc.id
FROM complete_blood_count cbc
WHERE EXISTS (
SELECT 1
FROM human h
INNER JOIN human_to_metric_block h_to_m on h.id = h_to_m.human_id
WHERE
h_to_m.metric_block_id = cbc.id
)
I'm struggling though to write this in SQLAlchemy. I believe correlate(), any(), or an aliased join may be helpful, but the fact that a MetricBlock doesn't know its Human or Chimp is a stumbling block for me.
Does anyone have any advice on how to write this query? Alternately, are there other strategies to define the model in a way that works better with SQLAlchemy?
Thank you for your assistance.
Python 2.6
SQLAlchemy 0.7.4
Oracle 11g
Edit:
HumanToMetricBlock is defined as:
humanToMetricBlock = Table(
    "human_to_metric_block",
    metadata,
    Column("human_id", Integer, ForeignKey("human.id")),
    Column("metric_block_id", Integer, ForeignKey("metric_block.id")),
)
per the manual.
Each primate should have a unique ID, regardless of what type of primate it is. I'm not sure why each set of attributes (MetricBlock, CompleteBloodCount, CholesterolCount) is a separate table, but I assume they vary over more than one dimension (primate), such as time; otherwise I would use one big table.
Thus, I would structure this problem in the following manner:
Create a parent object Primate and derive humans and chimps from it. This example is using single table inheritance, though you may want to use joined table inheritance based on their attributes.
class Primate(Base):
    __tablename__ = 'primate'
    id = Column(Integer, primary_key=True)
    genus = Column(String)
    # ...attributes all primates have...
    __mapper_args__ = {'polymorphic_on': genus, 'polymorphic_identity': 'primate'}

class Chimp(Primate):
    __mapper_args__ = {'polymorphic_identity': 'chimp'}
    # ...attributes...

class Human(Primate):
    __mapper_args__ = {'polymorphic_identity': 'human'}
    # ...attributes...

class MetricBlock(Base):
    id = ...
Then you create a single many-to-many table (you can use an association proxy instead):
class PrimateToMetricBlock(Base):
    __tablename__ = 'primate_to_metric_block'
    id = Column(Integer, primary_key=True)  # a primary key is needed!
    primate_id = Column(Integer, ForeignKey('primate.id'))
    primate = relationship('Primate')  # if you care for relationships
    metricblock_id = Column(Integer, ForeignKey('metric_block.id'))
    metricblock = relationship('MetricBlock')
Then I would structure the query like so (note that the on clause is not necessary since SQLAlchemy can infer the relationships automatically since there's no ambiguity):
query = DBSession.query(CompleteBloodCount).\
    join(PrimateToMetricBlock, PrimateToMetricBlock.metricblock_id == MetricBlock.id)
If you want to filter by primate type, join the Primate table and filter:
query = query.join(Primate, Primate.id == PrimateToMetricBlock.primate_id).\
    filter(Primate.genus == 'human')
Otherwise, if you know the ID of the primate (primate_id), no additional join is necessary:
query = query.filter(PrimateToMetricBlock.primate_id == primate_id)
If you're only retrieving one object, end the query with:
return query.first()
Otherwise:
return query.all()
Forming your model like this should eliminate any confusion and actually make everything simpler. If I'm missing something, let me know.
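For completeness, the asker's EXISTS query can also be expressed directly against the unchanged schema, without restructuring the model. A hedged sketch, assuming a modern SQLAlchemy (the asker's 0.7 spelling may differ slightly) and a hypothetical some_human_id parameter:

from sqlalchemy import and_, exists

# Correlated EXISTS: all CompleteBloodCounts linked to the given human via
# the association table; no Human join is needed since the id is known.
stmt = exists().where(and_(
    humanToMetricBlock.c.human_id == some_human_id,
    humanToMetricBlock.c.metric_block_id == CompleteBloodCount.id,
))
query = session.query(CompleteBloodCount).filter(stmt)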

sqlalchemy relational mapping

Hi, I have a simple question. I have two tables (addresses and users; a user has one address, and many users can live at the same address). I created a SQLAlchemy mapping like this:
class Person(object):
    '''
    classdocs
    '''
    idPerson = Column("idPerson", Integer, primary_key=True)
    name = Column("name", String)
    surname = Column("surname", String)
    idAddress = Column("idAddress", Integer, ForeignKey("pAddress.idAddress"))
    idState = Column("idState", Integer, ForeignKey("pState.idState"))
    Address = relationship(Address, primaryjoin=idAddress==Address.idAddress)

class Address(object):
    '''
    Class to represent table address object
    '''
    idAddress = Column("idAddress", Integer, primary_key=True)
    street = Column("street", String)
    number = Column("number", Integer)
    postcode = Column("postcode", Integer)
    country = Column("country", String)
    residents = relationship("Person", order_by="desc(Person.surname, Person.name)", primaryjoin="idAddress=Person.idPerson")

self.tablePerson = sqlalchemy.Table("pPerson", self.metadata, autoload=True)
sqlalchemy.orm.mapper(Person, self.tablePerson)
self.tableAddress = sqlalchemy.Table("pAddress", self.metadata, autoload=True)
sqlalchemy.orm.mapper(Address, self.tableAddress)
When I get my session and try to query something like:
myaddress = session.query(Address).get(1)
print myaddress.residents[1].name
=> I get TypeError: 'RelationshipProperty' object does not support indexing
I understand residents is there to define the relationship, but how can I get the list of residents assigned to the given address?!
Thanks
You define the relationship in the wrong place. I think you are mixing the Declarative extension with non-declarative use:
when using declarative, you define your relations in your model;
otherwise, you define them when mapping the model to a table.
If option 2 is what you are doing, then you need to remove both relationship definitions from the models and add one to a mapper (one side is enough):
mapper(Address, tableAddress, properties={
    'residents': relationship(Person, order_by=(desc(Person.name), desc(Person.surname)), backref="Address"),
})
A few more things about the code above:
The relation is defined only on one side; the backref takes care of the other side.
You do not need to specify the primaryjoin (as long as you have a ForeignKey specified and SA is able to infer the columns).
Your order_by configuration was not correct; see the code above for a version that works.
You might try defining Person after Address, with a backref to Address - this will create the array element:
class Address(object):
    __tablename__ = 'address_table'
    idAddress = Column("idAddress", Integer, primary_key=True)

class Person(object):
    idPerson = Column("idPerson", Integer, primary_key=True)
    ...
    address_id = Column(Integer, ForeignKey('address_table.idAddress'))
    address = relationship(Address, backref='residents')
Then you can query:
myaddress = session.query(Address).get(1)
for resident in myaddress.residents:
    print resident.name
Further, if you have a lot of residents at an address, you can filter further using a join:
resultset = session.query(Address).join(Address.residents).filter(Person.name == 'Joe')
# or
resultset = session.query(Person).filter(Person.name == 'Joe').join(Person.address).filter(Address.state == 'NY')
Then take resultset.first(), resultset[0], resultset.all(), and so on.
