How to speed up python and sqlalchemy? - python

The data in my source code is nested in the format below: a list of dicts, each containing a list of dicts, and so on.
# data structure
user_list = [{'user_name': 'A',
              'email': 'aaa@aaa.com',
              'items': [{'name': 'a_item1', 'properties': [{1....}, {2....}...]}]
              }] * 100
I'm trying to insert the above data into a PostgreSQL database with SQLAlchemy.
There is a user table, an item (entity) table, and a properties (attribute) table.
There are also link tables that connect users to items, and items to properties, respectively.
for u in user_list:
    new_user = User(user_name=u.get('user_name'),....)
    session.add(new_user)
    session.flush()
    for item in u.get('items'):
        new_item = Item(name=item.get('name'),.....)
        session.add(new_item)
        session.flush()
        new_item_link = UserItemLink(user_id=new_user.id, item_id=new_item.id,...)
        session.add(new_item_link)
        session.flush()
        for prop in item.get('properties'):
            new_properties = Properties(name=prop.get('name'),...)
            session.add(new_properties)
            session.flush()
            new_prop_link = ItemPropLink(item_id=new_item.id, prop_id=new_properties.id,...)
            session.add(new_prop_link)
            session.flush()
session.commit()
My models look like this:
class User(Base):
    __tablename__ = 'user'
    id = Column(Integer, Identity(always=True, start=1, increment=1, minvalue=1, maxvalue=2147483647, cycle=False, cache=1), primary_key=True)
    name = Column(String(20))
    email = Column(String(50))
    user_item_link = relationship('UserItemLink', back_populates='user')
class Item(Base):
    __tablename__ = 'item'
    id = Column(Integer, Identity(always=True, start=1, increment=1, minvalue=1, maxvalue=2147483647, cycle=False, cache=1), primary_key=True)
    name = Column(String(50))
    note = Column(String(50))
    user_item_link = relationship('UserItemLink', back_populates='item')
class Properties(Base):
    __tablename__ = 'properties'
    id = Column(Integer, Identity(always=True, start=1, increment=1, minvalue=1, maxvalue=2147483647, cycle=False, cache=1), primary_key=True)
    name = Column(String(50))
    value = Column(String(50))
    item_prop_link = relationship('ItemPropLink', back_populates='properties')
class UserItemLink(Base):
    __tablename__ = 'user_item_link'
    id = Column(Integer, Identity(always=True, start=1, increment=1, minvalue=1, maxvalue=2147483647, cycle=False, cache=1), primary_key=True)
    user_id = Column(ForeignKey('db.user.id'), nullable=False)
    item_id = Column(ForeignKey('db.item.id'), nullable=False)
The code above has been simplified for clarity.
Adding the objects one at a time with session.add() as shown above takes a long time.
Inserting 100 users takes 8 seconds or more.
How can I make the Python and SQLAlchemy side of this faster?

As you have relationships configured on the models you can compose complex objects using these relationships instead of relying on ids:
with Session() as s, s.begin():
    for u in user_list:
        user_item_links = []
        for item in u.get('items'):
            item_prop_links = []
            for prop in item['properties']:
                item_prop_link = ItemPropLink()
                item_prop_link.properties = Properties(name=prop.get('name'), value=prop.get('value'))
                item_prop_links.append(item_prop_link)
            item = Item(name=item.get('name'), item_prop_link=item_prop_links)
            user_item_link = UserItemLink()
            user_item_link.item = item
            user_item_links.append(user_item_link)
        new_user = User(name=u.get('user_name'), email=u.get('email'), user_item_link=user_item_links)
        s.add(new_user)
SQLAlchemy will automatically set the foreign keys when the session is flushed at commit time, removing the need to manually flush.
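For anyone who wants to try this end to end, here is a minimal sketch of the same idea. It assumes an in-memory SQLite database instead of PostgreSQL, drops the Properties level to keep it short, and adds the `user` and `item` relationships on `UserItemLink` that the question's simplified models leave out; the sample data is made up.

```python
from sqlalchemy import Column, ForeignKey, Integer, String, create_engine
from sqlalchemy.orm import Session, declarative_base, relationship

Base = declarative_base()


class User(Base):
    __tablename__ = "user"
    id = Column(Integer, primary_key=True)
    name = Column(String(20))
    email = Column(String(50))
    user_item_link = relationship("UserItemLink", back_populates="user")


class Item(Base):
    __tablename__ = "item"
    id = Column(Integer, primary_key=True)
    name = Column(String(50))
    user_item_link = relationship("UserItemLink", back_populates="item")


class UserItemLink(Base):
    __tablename__ = "user_item_link"
    id = Column(Integer, primary_key=True)
    user_id = Column(ForeignKey("user.id"), nullable=False)
    item_id = Column(ForeignKey("item.id"), nullable=False)
    # Assumed relationships: the other side of the back_populates pairs.
    user = relationship("User", back_populates="user_item_link")
    item = relationship("Item", back_populates="user_item_link")


# Made-up sample data in the same shape as the question's user_list.
user_list = [
    {"user_name": "A", "email": "a@example.com",
     "items": [{"name": "a_item1"}, {"name": "a_item2"}]},
    {"user_name": "B", "email": "b@example.com",
     "items": [{"name": "b_item1"}]},
]

engine = create_engine("sqlite://")  # in-memory database, for the demo only
Base.metadata.create_all(engine)

# No manual flush: foreign keys are filled in when the session flushes on commit.
with Session(engine) as s, s.begin():
    for u in user_list:
        links = [UserItemLink(item=Item(name=i["name"])) for i in u["items"]]
        s.add(User(name=u["user_name"], email=u["email"], user_item_link=links))

with Session(engine) as s:
    link_count = s.query(UserItemLink).count()
    item_count = s.query(Item).count()
```

Only the `User` objects are added explicitly; the links and items are picked up through the relationships when the user is added.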

Related

SQLAlchemy - How to correctly connect two sets of data?

I am hoping for some guidance about what I believe is going to be a common pattern in SQLAlchemy for Python. However, I have so far failed to find a simple explanation for someone new to SQLAlchemy.
I have the following objects:
Customers
Orders
Products
I am building a Python FastAPI application and I want to be able to create customers, and products individually. And subsequently, I want to then be able to create an order for a customer that can contain 1 or more products. A customer will be able to have multiple orders also.
Here are my SQLAlchemy models:
order_products = Table('order_products', Base.metadata,
    Column('order_id', ForeignKey('orders.id'), primary_key=True),
    Column('product_id', ForeignKey('products.id'), primary_key=True)
)
class Customer(Base):
    __tablename__ = "customers"
    id = Column(Integer, primary_key=True, index=True)
    name = Column(String, index=True)
    address = Column(String)
    phonenumber = Column(String)
    email = Column(String, unique=True, index=True)
    is_active = Column(Boolean, default=True)
    orders = relationship("Order", back_populates="customers")
class Order(Base):
    __tablename__ = "orders"
    id = Column(Integer, primary_key=True, index=True)
    ordernumber = Column(String, index=True)
    customer_id = Column(Integer, ForeignKey("customers.id"))
    customers = relationship("Customer", back_populates="orders")
    products = relationship("Product", secondary="order_products", back_populates="orders")
class Product(Base):
    __tablename__ = "products"
    id = Column(Integer, primary_key=True, index=True)
    name = Column(String, index=True)
    size = Column(Integer)
    order_id = Column(Integer, ForeignKey("orders.id"))
    orders = relationship("Order", secondary="order_products", back_populates="products")
And here are my CRUD operations:
def create_customer(db: Session, customer: customer.CustomerCreate):
    db_customer = models.Customer(name = customer.name, address = customer.address, email=customer.email, phonenumber=customer.phonenumber)
    db.add(db_customer)
    db.commit()
    db.refresh(db_customer)
    return db_customer
def create_product(db: Session, product: product.ProductCreate):
    db_product = models.Product(name = product.name, size = product.size)
    db.add(db_product)
    db.commit()
    db.refresh(db_product)
    return db_product
def create_order(db: Session, order: order.OrderCreate, cust_id: int):
    db_order = models.Order(**order.dict(), customer_id=cust_id)
    db.add(db_order)
    db.commit()
    db.refresh(db_order)
    return db_order
def update_order_with_product(db: Session, order: order.Order):
    db_order = db.query(models.Order).filter(models.Order.id==1).first()
    if db_order is None:
        return None
    db_product = db.query(models.Order).filter(models.Product.id==1).first()
    if db_order is None:
        return None
    db_order.products.append(db_product)
    db.add(db_order)
    db.commit()
    db.refresh(db_order)
    return db_order
All of the CRUD operations work apart from update_order_with_product which gives me this error:
child_impl = child_state.manager[key].impl
KeyError: 'orders'
I'm not sure if I am taking the correct approach to the pattern needed to define the relationships between my models. If not, can someone point me in the right direction of some good examples for a beginner?
If my pattern is valid then there must be an issue with my CRUD operation trying to create the relationships? Can anyone help with that?
This query could be a problem:
db_product = db.query(models.Order).filter(models.Product.id==1).first()
Should probably be:
db_product = db.query(models.Product).filter(models.Product.id==1).first()
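because you want to get a Product instance, not an Order. Appending an Order to db_order.products is most likely what triggers KeyError: 'orders': SQLAlchemy then looks for the orders back-reference on the appended object, and Order has no such attribute.
Also, when you update a record you do not need to add it to the session again, because it was already registered with the session when you queried it.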
def update_order_with_product(db: Session, order: order.Order):
    db_order = db.query(models.Order).filter(models.Order.id==1).first()
    if db_order is None:
        return None
    db_product = db.query(models.Product).filter(models.Product.id==1).first()
    if db_product is None:
        return None
    db_order.products.append(db_product)
    db.commit()
    db.refresh(db_order)
    return db_order
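As a rough sketch of how this could look without the hard-coded ids, here is a self-contained version on an in-memory SQLite database. The helper name `add_product_to_order` and its `order_id` / `product_id` parameters are hypothetical, and the models are trimmed to what the many-to-many link needs.

```python
from sqlalchemy import Column, ForeignKey, Integer, String, Table, create_engine
from sqlalchemy.orm import Session, declarative_base, relationship

Base = declarative_base()

# Association table for the many-to-many link, as in the question.
order_products = Table(
    "order_products", Base.metadata,
    Column("order_id", ForeignKey("orders.id"), primary_key=True),
    Column("product_id", ForeignKey("products.id"), primary_key=True),
)


class Order(Base):
    __tablename__ = "orders"
    id = Column(Integer, primary_key=True)
    ordernumber = Column(String)
    products = relationship("Product", secondary=order_products, back_populates="orders")


class Product(Base):
    __tablename__ = "products"
    id = Column(Integer, primary_key=True)
    name = Column(String)
    orders = relationship("Order", secondary=order_products, back_populates="products")


def add_product_to_order(db, order_id, product_id):
    """Hypothetical helper: link an existing product to an existing order."""
    db_order = db.get(Order, order_id)
    db_product = db.get(Product, product_id)  # a Product, not an Order
    if db_order is None or db_product is None:
        return None
    db_order.products.append(db_product)  # no db.add() needed, already tracked
    db.commit()
    db.refresh(db_order)
    return db_order


engine = create_engine("sqlite://")  # in-memory database, for the demo only
Base.metadata.create_all(engine)

with Session(engine) as db:
    db.add_all([Order(ordernumber="A-1"), Product(name="widget")])
    db.commit()
    updated = add_product_to_order(db, 1, 1)
    product_names = [p.name for p in updated.products]
    missing = add_product_to_order(db, 1, 999)  # no such product
```

The row in `order_products` is written by the relationship when the session commits; nothing is inserted into the association table by hand.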

How to assign a custom SQL query which returns a collection of Rows as an attribute of an ORM model

For example, suppose I have three models: Book, Author, and BookAuthor where a book can have many authors and an author can have many books.
class BookAuthor(Base):
    __tablename__ = 'book_authors'
    author_id = Column(ForeignKey('authors.id'), primary_key=True)
    book_id = Column(ForeignKey('books.id'), primary_key=True)
    blurb = Column(String(50))
class Author(Base):
    __tablename__ = 'authors'
    id = Column(Integer, primary_key=True)
class Book(Base):
    __tablename__ = 'books'
    id = Column(Integer, primary_key=True)
I would like to create an authors attribute of Book which returns every author for the book and the corresponding blurb about each author. Something like this
class Book(Base):
    __tablename__ = 'books'
    id = Column(Integer, primary_key=True)
    @authors.expression
    def authors(cls):
        strSQL = "my custom SQL query"
        return execute(strSQL)
Demo
from sqlalchemy import create_engine, Column, Integer, String, ForeignKey, text
from sqlalchemy.orm import declarative_base, Session
# Make the engine
engine = create_engine("sqlite+pysqlite:///:memory:", future=True, echo=False)
# Make the DeclarativeMeta
Base = declarative_base()
class BookAuthor(Base):
    __tablename__ = 'book_authors'
    author_id = Column(ForeignKey('authors.id'), primary_key=True)
    book_id = Column(ForeignKey('books.id'), primary_key=True)
    blurb = Column(String(50))
class Author(Base):
    __tablename__ = 'authors'
    id = Column(Integer, primary_key=True)
class Book(Base):
    __tablename__ = 'books'
    id = Column(Integer, primary_key=True)
# Create the tables in the database
Base.metadata.create_all(engine)
# Make data
with Session(bind=engine) as session:
    # add parents
    a1 = Author()
    session.add(a1)
    a2 = Author()
    session.add(a2)
    session.commit()
    # add children
    b1 = Book()
    session.add(b1)
    b2 = Book()
    session.add(b2)
    session.commit()
    # map books to authors
    ba1 = BookAuthor(author_id=a1.id, book_id=b1.id, blurb='foo')
    ba2 = BookAuthor(author_id=a1.id, book_id=b2.id, blurb='bar')
    ba3 = BookAuthor(author_id=a2.id, book_id=b2.id, blurb='baz')
    session.add(ba1)
    session.add(ba2)
    session.add(ba3)
    session.commit()
# Get the authors for book with id 2
with Session(bind=engine) as session:
    # text() marks the string as textual SQL; SQLAlchemy 2.0 rejects plain strings here
    s = text("""
        SELECT foo.* FROM (
            SELECT
                authors.*,
                book_authors.blurb,
                book_authors.book_id
            FROM authors INNER JOIN book_authors ON authors.id = book_authors.author_id
        ) AS foo
        INNER JOIN books ON foo.book_id = books.id
        WHERE books.id = :bookid
    """)
    result = session.execute(s, params={'bookid':2}).fetchall()
    print(result)
See that semi-nasty query at the end? It successfully returns the authors for book 2, including the corresponding blurb about each author. I would like to create a .authors attribute of my Book model that executes this query.
Figured it out. The key was to use a plain property together with object_session():
from sqlalchemy import text
from sqlalchemy.orm import object_session
class Book(Base):
    __tablename__ = 'books'
    id = Column(Integer, primary_key=True)
    @property
    def authors(self):
        # text() marks the string as textual SQL; SQLAlchemy 2.0 rejects plain strings here
        s = text("""
            SELECT foo.* FROM (
                SELECT
                    authors.*,
                    book_authors.blurb,
                    book_authors.book_id
                FROM authors INNER JOIN book_authors ON authors.id = book_authors.author_id
            ) AS foo
            INNER JOIN books ON foo.book_id = books.id
            WHERE books.id = :bookid
        """)
        result = object_session(self).execute(s, params={'bookid': self.id}).fetchall()
        return result
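Here is a compact, self-contained sketch of that property in use, assuming an in-memory SQLite database. The SQL is a shortened stand-in for the query above (it selects only the author id and blurb), the explicit ids are chosen just to make the result predictable, and the `None` check for detached objects is my own addition.

```python
from sqlalchemy import Column, ForeignKey, Integer, String, create_engine, text
from sqlalchemy.orm import Session, declarative_base, object_session

Base = declarative_base()


class Author(Base):
    __tablename__ = "authors"
    id = Column(Integer, primary_key=True)


class BookAuthor(Base):
    __tablename__ = "book_authors"
    author_id = Column(ForeignKey("authors.id"), primary_key=True)
    book_id = Column(ForeignKey("books.id"), primary_key=True)
    blurb = Column(String(50))


class Book(Base):
    __tablename__ = "books"
    id = Column(Integer, primary_key=True)

    @property
    def authors(self):
        # object_session() returns the Session this instance belongs to, or None.
        session = object_session(self)
        if session is None:
            return []  # assumption: a detached book simply reports no authors
        stmt = text(
            "SELECT a.id, ba.blurb "
            "FROM authors AS a JOIN book_authors AS ba ON a.id = ba.author_id "
            "WHERE ba.book_id = :bookid ORDER BY a.id"
        )
        return session.execute(stmt, {"bookid": self.id}).fetchall()


engine = create_engine("sqlite://")  # in-memory database, for the demo only
Base.metadata.create_all(engine)

with Session(engine) as session:
    session.add_all([
        Author(id=1), Author(id=2), Book(id=1), Book(id=2),
        BookAuthor(author_id=1, book_id=1, blurb="foo"),
        BookAuthor(author_id=1, book_id=2, blurb="bar"),
        BookAuthor(author_id=2, book_id=2, blurb="baz"),
    ])
    session.commit()
    book = session.get(Book, 2)
    rows = [tuple(r) for r in book.authors]
    detached_rows = Book(id=99).authors  # never added to a session
```

Note that the property runs a query on every access, so it is worth storing the result in a local variable if you need it more than once.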

SQLAlchemy table defining relationship using two foreign keys

I have two tables, Users and ChatSessions. ChatSessions has two fields, user_id and friend_id, both foreign keys to the Users table.
user_id always contains the user that initiated the chat session, friend_id is the other user. As a certain user can have chat sessions initiated by him, or his friends, he can have his id either as user_id or as friend_id, in various sessions.
Is it possible to define a relationship in the Users table that gives me access to all the chat sessions of a user, regardless of whether his id is in user_id or friend_id?
Something like this:
chat_sessions = db.relationship('chat_sessions',
    primaryjoin="or_(User.id==ChatSession.user_id, User.id==ChatSession.friend_id)",
    backref="user")
I receive the following error when I try to commit an entry to the Users table:
ERROR main.py:76 [10.0.2.2] Unhandled Exception [93e3f515-7dd6-4e8d-b096-8239313433f2]: relationship 'chat_sessions' expects a class or a mapper argument (received: <class 'sqlalchemy.sql.schema.Table'>)
The models:
class User(db.Model):
    __tablename__ = 'users'
    id = db.Column(db.Integer, primary_key=True)
    email = db.Column(db.String(60), index=True, unique=True)
    password = db.Column(db.String(255))
    name = db.Column(db.String(100))
    active = db.Column(db.Boolean(), nullable=False)
    chat_sessions = db.relationship('chat_sessions',
        primaryjoin="or_(User.id==ChatSession.user_id, User.id==ChatSession.friend_id)")
class ChatSession(db.Model):
    __tablename__ = 'chat_sessions'
    id = db.Column(db.Integer, primary_key=True)
    user_id = db.Column(db.Integer, db.ForeignKey('users.id'))
    friend_id = db.Column(db.Integer, db.ForeignKey('users.id'))
    status = db.Column(db.String(50))
    user = db.relationship('User', foreign_keys=[user_id])
    friend = db.relationship('User', foreign_keys=[friend_id])
It's difficult to be certain without seeing the tables' code, but it might be sufficient to remove the backref argument.
Here's a pure SQLAlchemy implementation that seems to do what you want:
import sqlalchemy as sa
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy import orm
Base = declarative_base()
class User(Base):
    __tablename__ = 'users'
    id = sa.Column(sa.Integer, primary_key=True)
    name = sa.Column(sa.String)
    all_chats = orm.relationship('Chat',
        primaryjoin="or_(User.id==Chat.user_id, User.id==Chat.friend_id)")
    def __repr__(self):
        return f'User(name={self.name})'
class Chat(Base):
    __tablename__ = 'chats'
    id = sa.Column(sa.Integer, primary_key=True)
    user_id = sa.Column(sa.Integer, sa.ForeignKey('users.id'))
    friend_id = sa.Column(sa.Integer, sa.ForeignKey('users.id'))
    user = orm.relationship('User', foreign_keys=[user_id])
    friend = orm.relationship('User', foreign_keys=[friend_id])
    def __repr__(self):
        return f'Chat(user={self.user.name}, friend={self.friend.name})'
engine = sa.create_engine('sqlite:///')
Base.metadata.create_all(bind=engine)
Session = orm.sessionmaker(bind=engine)
usernames = ['Alice', 'Bob', 'Carol']
session = Session()
users = [User(name=name) for name in usernames]
session.add_all(users)
session.flush()
a, b, c = users
session.add(Chat(user_id=a.id, friend_id=b.id))
session.add(Chat(user_id=a.id, friend_id=c.id))
session.add(Chat(user_id=c.id, friend_id=a.id))
session.commit()
session.close()
session = Session()
users = session.query(User)
for user in users:
    for chat in user.all_chats:
        print(user, chat)
    print()
session.close()
This is the output:
User(name=Alice) Chat(user=Alice, friend=Bob)
User(name=Alice) Chat(user=Alice, friend=Carol)
User(name=Alice) Chat(user=Carol, friend=Alice)
User(name=Bob) Chat(user=Alice, friend=Bob)
User(name=Carol) Chat(user=Alice, friend=Carol)
User(name=Carol) Chat(user=Carol, friend=Alice)
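Looking at the error message in the question ("expects a class or a mapper argument (received: ... Table)"), the first argument to relationship() seems to have been resolved as the table name 'chat_sessions' rather than the mapped class. Below is a sketch of the question's own models in plain SQLAlchemy (not Flask-SQLAlchemy) with the class name 'ChatSession' as the target. Marking the combined relationship `viewonly=True` is my own suggestion, on the assumption that it is only used for reading, since writes go through `user` and `friend`.

```python
from sqlalchemy import Column, ForeignKey, Integer, String, create_engine
from sqlalchemy.orm import Session, declarative_base, relationship

Base = declarative_base()


class User(Base):
    __tablename__ = "users"
    id = Column(Integer, primary_key=True)
    name = Column(String(100))
    # The target is the class name "ChatSession", not the table name "chat_sessions".
    chat_sessions = relationship(
        "ChatSession",
        primaryjoin="or_(User.id == ChatSession.user_id, "
                    "User.id == ChatSession.friend_id)",
        viewonly=True,  # assumption: read-only view over both foreign keys
    )


class ChatSession(Base):
    __tablename__ = "chat_sessions"
    id = Column(Integer, primary_key=True)
    user_id = Column(Integer, ForeignKey("users.id"))
    friend_id = Column(Integer, ForeignKey("users.id"))
    status = Column(String(50))
    user = relationship("User", foreign_keys=[user_id])
    friend = relationship("User", foreign_keys=[friend_id])


engine = create_engine("sqlite://")  # in-memory database, for the demo only
Base.metadata.create_all(engine)

with Session(engine) as session:
    a, b, c = User(name="Alice"), User(name="Bob"), User(name="Carol")
    # Adding the chat sessions also saves the users they reference.
    session.add_all([
        ChatSession(user=a, friend=b),
        ChatSession(user=a, friend=c),
        ChatSession(user=c, friend=a),
    ])
    session.commit()
    chat_counts = {u.name: len(u.chat_sessions) for u in session.query(User)}
```

With the same three chats as in the output above, Alice ends up with three sessions, Bob with one and Carol with two.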

SqlAlchemy, how to make automatic join

I have three tables:
class Article_Comment(Base):
    __tablename__ = 'article_comment'
    article_id = Column(Integer, ForeignKey('article.id'), primary_key=True)
    comment_id = Column(Integer, ForeignKey('comment.id'), primary_key=True)
    child = relationship("Comment", lazy="joined", innerjoin=True)
class Article(Base):
    __tablename__ = 'article'
    id = Column(Integer, primary_key=True)
    title = Column(String)
    children = relationship("Article_Comment", lazy="joined", innerjoin=True)
class Comment(Base):
    __tablename__ = 'comment'
    id = Column(Integer, primary_key=True)
    text = Column(String)
I need to fetch a specific Article together with its Comments. Currently I do it like this:
Session = sessionmaker(bind=engine)
session = Session()
result = session.query(Article, Comment).join(Article_Comment).join(Comment).filter(Article_Comment.article_id == Article.id).filter(Article_Comment.comment_id == Comment.id).filter(Article.title=='music').all()
for i, j in result:
    print i.title, j.text
But I want to make this query without using .join.
Can someone help me?
Maybe I need to redefine the relationships?
Thanks to univerio, here is the full answer.
Set up a relationship on Article:
comments = relationship(Comment, secondary=Article_Comment.__table__, lazy="joined")
Then the query needs no explicit join:
session.query(Article).filter(Article.title=='music').one().comments
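A small runnable sketch of that setup, assuming an in-memory SQLite database and made-up sample rows. The `children` / `child` relationships from the question are left out here, and `Comment` is defined before `Article` so the class can be passed to relationship() directly.

```python
from sqlalchemy import Column, ForeignKey, Integer, String, create_engine
from sqlalchemy.orm import Session, declarative_base, relationship

Base = declarative_base()


class Comment(Base):
    __tablename__ = "comment"
    id = Column(Integer, primary_key=True)
    text = Column(String)


class Article_Comment(Base):
    __tablename__ = "article_comment"
    article_id = Column(Integer, ForeignKey("article.id"), primary_key=True)
    comment_id = Column(Integer, ForeignKey("comment.id"), primary_key=True)


class Article(Base):
    __tablename__ = "article"
    id = Column(Integer, primary_key=True)
    title = Column(String)
    # The association table is used as "secondary"; lazy="joined" loads the
    # comments in the same SELECT as the article.
    comments = relationship(Comment, secondary=Article_Comment.__table__, lazy="joined")


engine = create_engine("sqlite://")  # in-memory database, for the demo only
Base.metadata.create_all(engine)

with Session(engine) as session:
    session.add(Article(title="music",
                        comments=[Comment(text="first"), Comment(text="second")]))
    session.commit()
    # No .join() in the query: the relationship supplies the join.
    article = session.query(Article).filter(Article.title == "music").one()
    comment_texts = sorted(c.text for c in article.comments)
```

The rows in `article_comment` are also written by the relationship, so no `Article_Comment` objects have to be created by hand.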

sqlalchemy foreign keys / query joins

Hi, I'm having trouble with a foreign key in SQLAlchemy: a primary key ID that is also a foreign key is not auto-incrementing.
I'm using Python 2.7, Pyramid 1.3 and SQLAlchemy 0.7.
Here are my models:
class Page(Base):
    __tablename__ = 'page'
    id = Column(Integer, ForeignKey('mapper.object_id'), autoincrement=True, primary_key=True)
    title = Column(String(30), unique=True)
    title_slug = Column(String(75), unique=True)
    text = Column(Text)
    date_added = Column(DateTime)
class User(Base):
    __tablename__ = 'user'
    id = Column(Integer, primary_key=True)
    name = Column(String(100), unique=True)
    email = Column(String(100), unique=True)
    password = Column(String(100))
class Group(Base):
    __tablename__ = 'groups'
    id = Column(Integer, primary_key=True)
    name = Column(String(100), unique=True)
class Member(Base):
    __tablename__ = 'members'
    user_id = Column(Integer, ForeignKey('user.id'), primary_key=True)
    group_id = Column(Integer, ForeignKey('groups.id'), primary_key=True)
class Resource(Base):
    __tablename__ = 'resource'
    id = Column(Integer, primary_key=True)
    tablename = Column(Text)
    action = Column(Text)
class Mapper(Base):
    __tablename__ = 'mapper'
    resource_id = Column(Integer, ForeignKey('resource.id'), primary_key=True)
    group_id = Column(Integer, ForeignKey('groups.id'), primary_key=True)
    object_id = Column(Integer, primary_key=True)
And here is my raw SQL query, followed by the version I've written with SQLAlchemy's ORM:
'''
SELECT g.name, r.action
FROM groups AS g
INNER JOIN resource AS r
    ON m.resource_id = r.id
INNER JOIN page AS p
    ON p.id = m.object_id
INNER JOIN mapper AS m
    ON m.group_id = g.id
WHERE p.id = ? AND
    r.tablename = ?;
'''
obj = Page
query = DBSession().query(Group.name, Resource.action)\
    .join(Mapper)\
    .join(obj)\
    .join(Resource)\
    .filter(obj.id == obj_id, Resource.tablename == obj.__tablename__).all()
The raw SQL query works fine without any relation between Page and Mapper, but SQLAlchemy's ORM seems to require a ForeignKey link to be able to join them. So I decided to put the ForeignKey on Page.id, since Mapper.object_id will link to several different tables.
This makes the ORM query with the joins work as expected, but adding new data to the Page table results in an exception:
FlushError: Instance <Page at 0x3377c90> has a NULL identity key.
If this is an auto- generated value, check that the database
table allows generation of new primary key values, and that the mapped
Column object is configured to expect these generated values.
Ensure also that this flush() is not occurring at an inappropriate time,
such as within a load() event.
Here is my view code:
try:
    session = DBSession()
    with transaction.manager:
        page = Page(title, text)
        session.add(page)
    return HTTPFound(location=request.route_url('home'))
except Exception as e:
    print e
    pass
finally:
    session.close()
I really don't know why this happens, but I'd rather have a solution in SQLAlchemy than fall back to raw SQL, since I'm doing this project for learning purposes :)
I do not think autoincrement=True and ForeignKey(...) play well together.
In any case, for a join to work without any ForeignKey, you can specify the join condition as the second parameter of join(...):
obj = Page
query = DBSession().query(Group.name, Resource.action)\
    .join(Mapper)\
    .join(Resource)\
    .join(obj, Resource.tablename == obj.__tablename__)\
    .filter(obj.id == obj_id)\
    .all()
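To illustrate the explicit join condition on its own, here is a sketch that spells out every ON clause, so Page needs no ForeignKey at all and can keep an ordinary auto-incrementing primary key. It runs on an in-memory SQLite database with made-up rows. The helper name `allowed_actions` is hypothetical, and the ON clause `obj.id == Mapper.object_id` is taken from the raw SQL in the question rather than from the snippet above.

```python
from sqlalchemy import Column, ForeignKey, Integer, String, Text, create_engine
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()


class Group(Base):
    __tablename__ = "groups"
    id = Column(Integer, primary_key=True)
    name = Column(String(100), unique=True)


class Resource(Base):
    __tablename__ = "resource"
    id = Column(Integer, primary_key=True)
    tablename = Column(Text)
    action = Column(Text)


class Mapper(Base):
    __tablename__ = "mapper"
    resource_id = Column(Integer, ForeignKey("resource.id"), primary_key=True)
    group_id = Column(Integer, ForeignKey("groups.id"), primary_key=True)
    object_id = Column(Integer, primary_key=True)  # no ForeignKey: may point at any table


class Page(Base):
    __tablename__ = "page"
    id = Column(Integer, primary_key=True)  # plain primary key, no ForeignKey
    title = Column(String(30), unique=True)


def allowed_actions(session, obj, obj_id):
    """Hypothetical helper: (group name, action) pairs for one object."""
    rows = (
        session.query(Group.name, Resource.action)
        .select_from(Group)
        .join(Mapper, Mapper.group_id == Group.id)
        .join(Resource, Resource.id == Mapper.resource_id)
        .join(obj, obj.id == Mapper.object_id)  # explicit ON clause, no ForeignKey
        .filter(obj.id == obj_id, Resource.tablename == obj.__tablename__)
        .all()
    )
    return [tuple(row) for row in rows]


engine = create_engine("sqlite://")  # in-memory database, for the demo only
Base.metadata.create_all(engine)

with Session(engine) as session:
    session.add_all([
        Group(id=1, name="editors"),
        Resource(id=1, tablename="page", action="edit"),
        Page(id=7, title="Home"),
        Mapper(resource_id=1, group_id=1, object_id=7),
    ])
    session.commit()
    actions_for_home = allowed_actions(session, Page, 7)
    actions_for_missing = allowed_actions(session, Page, 8)
```

Because the link between Page and Mapper lives only in the query, the same helper could be reused for any other mapped class that has an `id` column and a matching `tablename` entry in `resource`.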
