I am trying to query my database using SQLAlchemy ORM methods. I have created the tables and the engine, and I have tested raw SQL against it. I want to use a location code from the locations table as a parameter and pull the matching origin / destination from the trips table. Here is the code:
import pandas as pd
from sqlalchemy import Column, ForeignKey, Integer, String, create_engine
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import relationship, sessionmaker

Base = declarative_base()  # generated base class for the ORM


class Trip(Base):
    __tablename__ = "trips"

    id = Column("id", Integer, primary_key=True, autoincrement=True)
    route = Column("Route", String(25))
    origin_id = Column("origin_id", String(10), ForeignKey("locations.locationCode"))
    destination_id = Column("destination_id", String(10), ForeignKey("locations.locationCode"))
    # Two foreign keys point at the same table, so each relationship names its own
    origin = relationship("Location", foreign_keys=[origin_id])
    destination = relationship("Location", foreign_keys=[destination_id])


class Location(Base):
    __tablename__ = "locations"

    locationCode = Column("locationCode", String(10), primary_key=True)
    latitude = Column("latitude", String(25))
    longitude = Column("longitude", String(25))
    facilityOwnedByCarvana = Column("facilityOwnedByCarvana", Integer)


engine = create_engine("sqlite:///carvana.db")
Session = sessionmaker(bind=engine)
session = Session()
Base.metadata.create_all(engine)

# Load the CSV files into the two tables
locationsDF = pd.read_csv("data/locations.csv")
tripsDf = pd.read_csv("data/trips.csv")
locationsDF.to_sql(con=engine, name=Location.__tablename__, if_exists="replace", index=False)
tripsDf.to_sql(con=engine, name=Trip.__tablename__, if_exists="replace", index=False)
Here is my attempt at the query:
q = (
    session.query(Location)
    .outerjoin(Trip, Location.locationCode == Trip.destination_id)
    .filter(Location.locationCode == "BALT")
    .order_by(Location.locationCode)
    .limit(10)
)
General:
q = (
    db_session.query(class_table1)
    .join(
        class_table2,
        class_table2.key_table2 == class_table1.key_table1,
        isouter=True,
    )
)
Specific:
q = (
    db_session.query(Location)
    .join(
        Trip,
        Trip.destination_id == Location.locationCode,
        isouter=True,
    )
)
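To see the outer join return actual trip data, here is a minimal sketch against an in-memory SQLite database. The sample rows and the trimmed-down models are assumptions made for illustration, not the asker's real data. It selects the trip columns explicitly, starting from Location, so that each row carries the origin and destination for the requested location code.

```python
from sqlalchemy import Column, ForeignKey, Integer, String, create_engine
from sqlalchemy.orm import declarative_base, relationship, sessionmaker

Base = declarative_base()


class Location(Base):
    __tablename__ = "locations"
    locationCode = Column(String(10), primary_key=True)


class Trip(Base):
    __tablename__ = "trips"
    id = Column(Integer, primary_key=True)
    route = Column(String(25))
    origin_id = Column(String(10), ForeignKey("locations.locationCode"))
    destination_id = Column(String(10), ForeignKey("locations.locationCode"))
    origin = relationship("Location", foreign_keys=[origin_id])
    destination = relationship("Location", foreign_keys=[destination_id])


engine = create_engine("sqlite://")  # in-memory database, illustration only
Base.metadata.create_all(engine)
session = sessionmaker(bind=engine)()

# Hypothetical sample data
session.add_all([Location(locationCode=c) for c in ("BALT", "PHIL", "ATL")])
session.add_all([
    Trip(route="R1", origin_id="PHIL", destination_id="BALT"),
    Trip(route="R2", origin_id="ATL", destination_id="BALT"),
    Trip(route="R3", origin_id="BALT", destination_id="ATL"),
])
session.commit()

# Start from Location, outer-join the trips that arrive there,
# and select the trip columns rather than the Location entity.
rows = (
    session.query(Trip.route, Trip.origin_id, Trip.destination_id)
    .select_from(Location)
    .join(Trip, Trip.destination_id == Location.locationCode, isouter=True)
    .filter(Location.locationCode == "BALT")
    .order_by(Trip.route)
    .all()
)
inbound = [tuple(r) for r in rows]
print(inbound)
```

Querying only `Location`, as in the original attempt, returns Location objects and discards the joined trip columns, which is why the selected columns are listed explicitly here.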
For the back-end, you will find an XML file attached, and you need to:
Create a parser in Node.js/PHP/Python to read the XML
Create a MySQL database (schema) that stores the XML data
Use an ORM to communicate with the database
Handle insert/update/delete in the script
No frontend is needed (a CLI is enough)
I tried to solve this in Python, but I'm stuck on the function I need to write to store the data from the XML file in the database table.
# import xml element tree
import xml.etree.ElementTree as ET
# import mysql connector
import mysql.connector
from sqlalchemy import create_engine, Column, Integer, String
from sqlalchemy.orm import sessionmaker
from sqlalchemy.ext.declarative import declarative_base
# give the connection parameters
# user name is root
# password is 2634687
# server is localhost
database = mysql.connector.connect(user='root', password='2634687', host='localhost')
# reading xml file , file name is dataset.xml
tree = ET.parse('dataset.xml')
# creating the cursor object
c = database.cursor()
# c.execute("CREATE DATABASE testingdb")
# print("testdb Data base is created")
# Connect to a MySQL database
engine = create_engine('mysql+pymysql://root:2634687@localhost/testingdb', echo=True)
# Define the Product model
Base = declarative_base()
class Product(Base):
    __tablename__ = 'Product'
    productId = Column(Integer, primary_key=True)
    cedi = Column(String(100))
    childWeightFrom = Column(String(100))
    childWeightTo = Column(Integer)
    color_code = Column(Integer)
    color_description = Column(String(100))
    countryImages = Column(String(100))
    defaultSku = Column(String(100))
    preferredEan = Column(Integer)
    sapAssortmentLevel = Column(String(100))
    sapPrice = Column(Integer)
    season = Column(String(100))
    showOnLineSku = Column(String(100))
    size_code = Column(String(100))
    size_description = Column(String(100))
    skuID = Column(Integer)
    skuName = Column(String(100))
    stateOfArticle = Column(String(100))
    umSAPprice = Column(String(10))
    volume = Column(Integer)
    weight = Column(Integer)
# Create the Product table
Base.metadata.create_all(engine)
# Create a session to interact with the database
Session = sessionmaker(bind=engine)
session = Session()
# Insert a new product
new_product = Product(cedi='CD01')
session.add(new_product)
session.commit()
# Update the product's childWeightFrom
# (use a lowercase name so the Product class is not overwritten)
product = session.query(Product).filter_by(cedi='CD01').first()
product.childWeightFrom = 31
session.commit()
# Delete the product
session.delete(product)
session.commit()
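The missing piece is a function that turns XML elements into rows. Below is a minimal sketch of one way to write it. The layout of dataset.xml is not shown in the question, so the `<products>/<product>` structure, the `SAMPLE_XML` string and the `store_products` helper are all hypothetical. The sketch uses an in-memory SQLite engine and a three-column model to stay self-contained, and relies on `Session.merge()`, which inserts a row when the primary key is new and updates it otherwise.

```python
import xml.etree.ElementTree as ET

from sqlalchemy import Column, Integer, String, create_engine
from sqlalchemy.orm import declarative_base, sessionmaker

Base = declarative_base()


class Product(Base):
    __tablename__ = "Product"
    productId = Column(Integer, primary_key=True)
    cedi = Column(String(100))
    season = Column(String(100))


# Hypothetical XML layout; the real dataset.xml may look different.
SAMPLE_XML = """
<products>
  <product><productId>1</productId><cedi>CD01</cedi><season>SS23</season></product>
  <product><productId>2</productId><cedi>CD02</cedi><season>FW23</season></product>
</products>
""".strip()


def store_products(session, root):
    """Insert or update one Product row per <product> element."""
    for node in root.iter("product"):
        # Child tag names are assumed to match the model's attribute names
        fields = {child.tag: child.text for child in node}
        fields["productId"] = int(fields["productId"])
        # merge() inserts when the primary key is new and updates otherwise
        session.merge(Product(**fields))
    session.commit()


engine = create_engine("sqlite://")  # swap in the MySQL URL for real use
Base.metadata.create_all(engine)
session = sessionmaker(bind=engine)()

# First run inserts both rows
store_products(session, ET.fromstring(SAMPLE_XML))
# Second run with a changed value updates product 1 instead of duplicating it
store_products(session, ET.fromstring(SAMPLE_XML.replace("SS23", "SS24")))

stored = [
    (p.productId, p.cedi, p.season)
    for p in session.query(Product).order_by(Product.productId)
]
print(stored)
```

With a real file, `ET.parse('dataset.xml').getroot()` would replace `ET.fromstring(...)`, and the tag-to-column mapping would need adjusting to the actual element names.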
I have a schema as follows:
Thing # Base class for below tables
- id
Ball (Thing)
- color
Bin (Thing)
- ball -> Ball.id
Court (Thing)
- homeBin -> Bin.id
- awayBin -> Bin.id
I'd like to ensure that whenever I load a set of Courts, it includes the latest Ball column values. From what I understand, contains_eager() might be able to help with that:
Indicate that the given attribute should be eagerly loaded from columns stated manually in the query.
I have a test that queries every few seconds for any Courts. I'm finding that, even with contains_eager, I only ever see the same value for Ball.color, even though I've explicitly updated the column's value in the database.
Why does SQLAlchemy appear to reuse this old data?
Below is a working example of what's happening:
from sqlalchemy import *
from sqlalchemy.orm import *
from sqlalchemy.ext.associationproxy import association_proxy
from sqlalchemy.ext.declarative import declarative_base
Base = declarative_base()
class Thing(Base):
    __tablename__ = "Things"
    id = Column(Integer, primary_key=True)
    name = Column(String(256))
    thingType = Column(String(256))
    __mapper_args__ = {
        'with_polymorphic': '*',
        'polymorphic_on': "thingType",
        'polymorphic_identity': "thing"
    }

class Ball(Thing):
    __tablename__ = "Balls"
    id = Column('id', Integer, ForeignKey('Things.id'), primary_key=True)
    color = Column('color', String(256))
    __mapper_args__ = {
        'polymorphic_identity': 'ball'
    }

class Bin(Thing):
    __tablename__ = "Bins"
    id = Column('id', Integer, ForeignKey('Things.id'), primary_key=True)
    shape = Column('shape', String(256))
    ballId = Column('ballId', Integer, ForeignKey('Balls.id'))
    ball = relationship(Ball, foreign_keys=[ballId], backref="outputBins")
    __mapper_args__ = {
        'polymorphic_identity': 'bin'
    }
    pass

class Court(Thing):
    __tablename__ = "Courts"
    id = Column('id', Integer, ForeignKey('Things.id'), primary_key=True)
    homeBinId = Column('homeBinId', Integer, ForeignKey('Bins.id'))
    awayBinId = Column('awayBinId', Integer, ForeignKey('Bins.id'))
    homeBin = relationship(Bin, foreign_keys=[homeBinId], backref="homeCourts")
    awayBin = relationship(Bin, foreign_keys=[awayBinId], backref="awayCourts")
    __mapper_args__ = {
        'polymorphic_identity': 'court'
    }

metadata = MetaData()
engine = create_engine("postgresql://localhost:5432/")
Session = sessionmaker(bind=engine)
session = Session()
def courtQuery():
    awayBalls = aliased(Ball, name="awayBalls")
    homeBalls = aliased(Ball, name="homeBalls")
    awayBins = aliased(Bin, name="awayBins")
    homeBins = aliased(Bin, name="homeBins")
    query = session.query(Court)\
        .outerjoin(awayBins, Court.awayBinId == awayBins.id)\
        .outerjoin(awayBalls, awayBins.ballId == awayBalls.id)\
        .outerjoin(homeBins, Court.homeBinId == homeBins.id)\
        .outerjoin(homeBalls, homeBins.ballId == homeBalls.id)\
        .options(contains_eager(Court.awayBin, alias=awayBins).contains_eager(awayBins.ball, alias=awayBalls))\
        .options(contains_eager(Court.homeBin, alias=homeBins).contains_eager(homeBins.ball, alias=homeBalls))
    return [r for r in query]

import time
while True:
    results = courtQuery()
    court = results[0]
    ball = court.homeBin.ball
    print(ball.color)  # does not change
    time.sleep(2)
Environment:
Python 2.7.14
SQLAlchemy 1.3.0b1
PostgreSQL 11.3 (though I've seen this on Oracle as well)
I'm using SQLAlchemy to query a number of similar tables, and union the results. The tables are rows of customer information, but our current database structures it so that different groups of customers are in their own tables e.g. client_group1, client_group2, client_group3:
client_group1:
| id | name | email               |
| 1  | john | johnsmith@gmail.com |
| 2  | greg | gregjones@gmail.com |
Each of the other tables have identical columns. If I'm using SQLAlchemy declarative_base, I can have a class for client_group1 like the following:
class ClientGroup1(Base):
    __tablename__ = 'client_group1'
    __table_args__ = {u'schema': 'clients'}
    id = Column(Integer, primary_key=True)
    name = Column(String(32))
    email = Column(String(32))
Then I can do queries such as:
session.query(ClientGroup1.name)
However, if I use union_all to combine a bunch of client tables into a viewport, such as:
query1 = session.query(ClientGroup1.name)
query2 = session.query(ClientGroup2.name)
viewport = union_all(query1, query2)
then I'm not sure how to map a viewport to an object, and instead I have to access viewport columns using:
viewport.c.name
Is there any way to map the viewport to a specific table structure? Especially considering the fact that each class points to a different __tablename__.
See the Concrete Table Inheritance documentation for an idea of how this can be done. The code below is a running example:
from sqlalchemy import create_engine, Column, String, Integer
from sqlalchemy.orm import sessionmaker, configure_mappers
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.ext.declarative import AbstractConcreteBase
engine = create_engine('sqlite:///:memory:', echo=True)
Session = sessionmaker(bind=engine)
session = Session()
Base = declarative_base(engine)
class ClientGroupBase(AbstractConcreteBase, Base):
    pass

class ClientGroup1(ClientGroupBase):
    __tablename__ = 'client_group1'
    # __table_args__ = {'schema': 'clients'}
    __mapper_args__ = {
        'polymorphic_identity': 'client_group1',
        'concrete': True,
    }
    id = Column(Integer, primary_key=True)
    name = Column(String(32))
    email = Column(String(32))

class ClientGroup2(ClientGroupBase):
    __tablename__ = 'client_group2'
    # __table_args__ = {'schema': 'clients'}
    __mapper_args__ = {
        'polymorphic_identity': 'client_group2',
        'concrete': True,
    }
    id = Column(Integer, primary_key=True)
    name = Column(String(32))
    email = Column(String(32))

def _test_model():
    # create all tables
    Base.metadata.create_all()
    print('-'*80)
    # configure mappers (see documentation)
    configure_mappers()
    print('-'*80)
    # add some test data
    session.add(ClientGroup1(name="name1"))
    session.add(ClientGroup1(name="name1"))
    session.add(ClientGroup2(name="name1"))
    session.add(ClientGroup2(name="name1"))
    session.commit()
    print('-'*80)
    # perform a query across all client tables
    q = session.query(ClientGroupBase).all()
    for r in q:
        print(r)

if __name__ == '__main__':
    _test_model()
The above example has an added benefit that you can also create new objects, as well as query only some tables.
You could also do it by mapping an SQL VIEW to a class, but you need to specify a primary key explicitly (see Is possible to mapping view with class using mapper in SqlAlchemy?). In your case, I am afraid, this might not work because the same PK value occurs in multiple tables, and using a multi-column PK might not be the best idea.
I have defined a few tables in Pyramid like this:
# coding: utf-8
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy import Integer, SmallInteger, Float, DateTime, ForeignKey, ForeignKeyConstraint, String, Column, func
from sqlalchemy.orm import scoped_session, sessionmaker, relationship, backref
from zope.sqlalchemy import ZopeTransactionExtension
DBSession = scoped_session(sessionmaker(extension=ZopeTransactionExtension()))
Base = declarative_base()
class Codes(Base):
    __tablename__ = 'Code'
    __table_args__ = {u'schema': 'Locations'}
    id = Column(Integer, nullable=False)
    code_str = Column(String(9), primary_key=True)
    name = Column(String(100))
    incoming = relationship(u'Voyages', primaryjoin='Voyage.call == Codes.code_str', backref=backref('Code'))

class Locations(Base):
    __tablename__ = 'Location'
    __table_args__ = {u'schema': 'Locations'}
    unit_id = Column(ForeignKey(u'Structure.Definition.unit_id', ondelete=u'RESTRICT', onupdate=u'CASCADE'), primary_key=True, nullable=False)
    timestamp = Column(DateTime, primary_key=True, nullable=False)
    longitude = Column(Float)
    latitude = Column(Float)

class Voyages(Base):
    __tablename__ = 'Voyage'
    __table_args__ = (
        ForeignKeyConstraint(['unit_id', 'Voyage_id'], [u'Locations.Voyages.unit_id', u'Locations.Voyages.voyage_id'], ondelete=u'RESTRICT', onupdate=u'CASCADE'),
        {u'schema': 'Locations'},
    )
    uid = Column(Integer, primary_key=True)
    unit_id = Column(Integer)
    voyage_id = Column(Integer)
    departure = Column(ForeignKey(u'Locations.Code.code_str', ondelete=u'RESTRICT', onupdate=u'CASCADE'))
    call = Column(ForeignKey(u'Locations.Code.code_str', ondelete=u'RESTRICT', onupdate=u'CASCADE'))
    departure_date = Column(DateTime)
    voyage_departure = relationship(u'Codes', primaryjoin='Voyage.departure == Codes.code_str')
    voyage_call = relationship(u'Codes', primaryjoin='Voyage.call == Codes.code_str')

class Definitions(Base):
    __tablename__ = 'Definition'
    __table_args__ = {u'schema': 'Structure'}
    unit_id = Column(Integer, primary_key=True)
    name = Column(String(90))
    type = Column(ForeignKey(u'Structure.Type.id', ondelete=u'RESTRICT', onupdate=u'CASCADE'))
    locations = relationship(u'Locations', backref=backref('Definition'))
    dimensions = relationship(u'Dimensions', backref=backref('Definition'))
    types = relationship(u'Types', backref=backref('Definition'))
    voyages = relationship(u'Voyages', backref=backref('Definition'))

class Dimensions(Base):
    __tablename__ = 'Dimension'
    __table_args__ = {u'schema': 'Structure'}
    unit_id = Column(ForeignKey(u'Structure.Definition.unit_id', ondelete=u'RESTRICT', onupdate=u'CASCADE'), primary_key=True, nullable=False)
    length = Column(Float)

class Types(Base):
    __tablename__ = 'Type'
    __table_args__ = {u'schema': 'Structure'}
    id = Column(SmallInteger, primary_key=True)
    type_name = Column(String(255))
    type_description = Column(String(255))

What I am trying to do here is find a specific row in the Codes table (filtered by code_str) and get all related tables in return, under the conditions that the Location table returns only the last row by timestamp, the Voyage table returns only the last row by departure, and the result includes all information from the Definitions table.
I started to build a query from scratch and came up with something like this:
string_to_search = request.matchdict.get('code')
sub_dest = DBSession.query(func.max(Voyage.departure).label('latest_voyage_timestamp'), Voyage.unit_id, Voyage.call.label('destination_call')).\
    filter(Voyage.call == string_to_search).\
    group_by(Voyage.unit_id, Voyage.call).\
    subquery()
query = DBSession.query(Codes, Voyage).\
    join(sub_dest, sub_dest.c.destination_call == Codes.code_str).\
    outerjoin(Voyage, sub_dest.c.latest_voyage_timestamp == Voyage.departure_date)
but I have noticed that when I iterate through my results (like for code, voyage in query) I am actually iterating over every Voyage I get in return. In theory this is not a big problem for me, but I am trying to construct a JSON response with basic information from the Codes table that would include all possible Voyages (if there are any at all).
For example:
code_data = {}
all_units = []
for code, voyage in query:
    # `code_data is not {}` is always True (identity check), so test emptiness instead
    if not code_data:
        code_data = {
            'code_id': code.id,
            'code_str': code.code_str,
            'code_name': code.name,
        }
    single_unit = {
        'unit_id': voyage.unit_id,
        'unit_departure': str(voyage.departure_date) if voyage.departure_date else None,
    }
    all_units.append(single_unit)
return {
    'code_data': exception.message if exception else code_data,
    'voyages': exception.message if exception else all_units,
}
Now, this seems a bit wrong because I don't like rewriting code_data on each loop iteration, so I added the if not code_data check here, but I suppose it would be much better (more logical) to iterate in a way similar to this:
for code in query:
    code_data = {
        'code_id': code.id,
        'code_str': code.code_str,
        'code_name': code.name,
    }
    for voyage in code.voyages:
        single_unit = {
            'unit_id': voyage.unit_id,
            'unit_departure': str(voyage.departure) if voyage.departure else None,
        }
        all_units.append(single_unit)
return {
    'code_data': exception.message if exception else code_data,
}
So, to get only single Code in return (since I queried the db for that specific Code) which would then have all Voyages related to it as a nested value, and of course, in each Voyage all other information related to Definition of the particular Unit...
Is my approach good at all in the first place, and how could I construct my query in order to iterate it in this second way?
I'm using Python 2.7.6, SQLAlchemy 0.9.7 and Pyramid 1.5.1 with Postgres database.
Thanks!
Try changing the outer query like so:
query = DBSession.query(Codes).options(contains_eager('incoming')).\
    join(sub_dest, sub_dest.c.destination_call == Codes.code_str).\
    outerjoin(Voyage, sub_dest.c.latest_voyage_timestamp == Voyage.departure_date)
In case of problems, try calling the options(...) part like so:
(...) .options(contains_eager(Codes.incoming)). (...)
This should result in a single Codes instance being returned with Voyages objects accessible via the relationship you've defined (incoming), so you could proceed with:
results = query.all()
for code in results:
    print(code)
    # do something with code.incoming
# Actually, you should get only one code, so if this proves to work, you should
# use query.one() so that an exception is thrown if anything other than
# a single Code is returned
Of course, you need an import, e.g.: from sqlalchemy.orm import contains_eager
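The following is a minimal, self-contained sketch of the same idea on an in-memory SQLite database. The `Code` and `Voyage` models here are simplified stand-ins (assumptions for illustration, without schemas or the latest-voyage subquery), and the `unit_id == 10` filter only stands in for whatever restricts the joined rows. It shows that `contains_eager()` fills the collection from exactly the rows the query joined, so a single parent comes back with only the filtered children attached.

```python
from sqlalchemy import Column, ForeignKey, Integer, String, create_engine
from sqlalchemy.orm import (
    contains_eager,
    declarative_base,
    relationship,
    sessionmaker,
)

Base = declarative_base()


class Code(Base):
    __tablename__ = "code"
    code_str = Column(String(9), primary_key=True)
    incoming = relationship("Voyage", back_populates="code")


class Voyage(Base):
    __tablename__ = "voyage"
    uid = Column(Integer, primary_key=True)
    unit_id = Column(Integer)
    call = Column(String(9), ForeignKey("code.code_str"))
    code = relationship("Code", back_populates="incoming")


engine = create_engine("sqlite://")  # in-memory database, illustration only
Base.metadata.create_all(engine)
session = sessionmaker(bind=engine)()

# Hypothetical sample data: three voyages calling at the same code
session.add_all([
    Code(code_str="NLRTM"),
    Voyage(uid=1, unit_id=10, call="NLRTM"),
    Voyage(uid=2, unit_id=10, call="NLRTM"),
    Voyage(uid=3, unit_id=20, call="NLRTM"),
])
session.commit()

# Join explicitly, restrict the joined rows, and tell the ORM to build
# Code.incoming from those rows instead of loading the full collection.
code = (
    session.query(Code)
    .outerjoin(Voyage, Voyage.call == Code.code_str)
    .filter(Code.code_str == "NLRTM", Voyage.unit_id == 10)
    .options(contains_eager(Code.incoming))
    .one()
)
voyage_ids = sorted(v.uid for v in code.incoming)
print(voyage_ids)
```

Because the collection reflects only the joined rows, it is a filtered view of the relationship rather than its full contents.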
I've defined the model's id field of the table like this:
id = Column(Integer(15, unsigned=True),
nullable=False,
server_default='0',
primary_key=True,
unique=True,
autoincrement=True)
and altered the database (MySQL) table accordingly, but when I create the model
and try to commit it (I'm using SQLAlchemy 0.7.8)
m = MyModel(values without defining the id)
session.add(m)
session.commit()
I get this error
FlushError: Instance <MyModel at 0x4566990> has a NULL identity key.
If this is an auto-generated value, check that the database table
allows generation of new primary key values, and that the mapped
Column object is configured to expect these generated values. Ensure
also that this flush() is not occurring at an inappropriate time, such
as within a load() event.
I use Postgres 13, and the ID column has the UUID data type. I ran into the same issue.
I solved it by applying server_default.
from sqlalchemy import text

Base = declarative_base()  # must exist before the model class is defined

class TrafficLightController(Base):
    __tablename__ = 'Tlc'
    # text() renders the default as a SQL expression rather than a quoted string
    id = Column(UUID, primary_key=True, server_default=text('uuid_generate_v4()'))
    type_id = Column('type_id', UUID)
    title = Column('title', String(100))
    gps_x = Column('gps_x', Float)
    gps_y = Column('gps_y', Float)
    hardware_config = Column('hardware_config', JSONB)
    lcu_id = Column('lcu_id', UUID)
    signal_id = Column('signal_id', UUID)

    def __init__(self, type_id, title, gps_x, gps_y, hardware_config, lcu_id, signal_id):
        self.type_id = type_id
        self.title = title
        self.gps_x = gps_x
        self.gps_y = gps_y
        self.hardware_config = hardware_config
        self.lcu_id = lcu_id
        self.signal_id = signal_id

if __name__ == "__main__":
    dbschema = 'asudd'
    engine = create_engine(DB_CONNECTION_STR, connect_args={'options': '-csearch_path={}'.format(dbschema)})
    Session = sessionmaker(bind=engine)
    Base.metadata.create_all(engine)
    session = Session()
    tlc_obj = TrafficLightController("b0322313-0995-40ac-889c-c65702e1841e", "test DK", 35, 45, "{}", None, None)
    session.add(tlc_obj)
    session.commit()  # without a commit the INSERT is never persisted
I solved it by removing the server_default value
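As a minimal sketch of that fix, assuming the `server_default` was what kept the generated key from being picked up: an integer primary key declared without any `server_default` lets the database assign the value on INSERT, and SQLAlchemy then reads it back. The `MyModel` columns and the in-memory SQLite engine below are assumptions for illustration, so MySQL-specific options such as `unsigned` are left out.

```python
from sqlalchemy import Column, Integer, String, create_engine
from sqlalchemy.orm import declarative_base, sessionmaker

Base = declarative_base()


class MyModel(Base):
    __tablename__ = "my_model"
    # No server_default here: the database generates the key on INSERT
    id = Column(Integer, primary_key=True, autoincrement=True)
    name = Column(String(50))


engine = create_engine("sqlite://")  # in-memory database, illustration only
Base.metadata.create_all(engine)
session = sessionmaker(bind=engine)()

m = MyModel(name="first")  # id is deliberately left unset
session.add(m)
session.commit()

new_id = m.id  # populated from the database-generated value
print(new_id)
```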