I've just run across a fairly vexing problem, and after testing I have found that NONE of the available answers are sufficient.
I have seen various suggestions but none seem to be able to return the last inserted value for an auto_increment field in MySQL.
I have seen examples that mention the use of session.flush() to add the record and then retrieve the id. However that always seems to return 0.
I have also seen examples that mention the use of session.refresh() but that raises the following error: InvalidRequestError: Could not refresh instance ''
What I'm trying to do seems insanely simple but I can't seem to figure out the secret.
I'm using the declarative approach.
So, my code looks something like this:
class Foo(Base):
    __tablename__ = 'tblfoo'
    __table_args__ = {'mysql_engine':'InnoDB'}
    ModelID = Column(INTEGER(unsigned=True), default=0, primary_key=True, autoincrement=True)
    ModelName = Column(Unicode(255), nullable=True, index=True)
    ModelMemo = Column(Unicode(255), nullable=True)
f = Foo(ModelName='Bar', ModelMemo='Foo')
session.add(f)
session.flush()
At this point, the object f has been pushed to the DB, and has been automatically assigned a unique primary key id. However, I can't seem to find a way to obtain the value to use in some additional operations. I would like to do the following:
my_new_id = f.ModelID
I know I could simply execute another query to lookup the ModelID based on other parameters but I would prefer not to if at all possible.
I would much appreciate any insight into a solution to this problem.
Thanks for the help in advance.
The problem is that you are setting a default on the auto-increment column. When the INSERT query runs, the server log shows:
2011-12-21 13:44:26,561 INFO sqlalchemy.engine.base.Engine.0x...1150 INSERT INTO tblfoo (`ModelID`, `ModelName`, `ModelMemo`) VALUES (%s, %s, %s)
2011-12-21 13:44:26,561 INFO sqlalchemy.engine.base.Engine.0x...1150 (0, 'Bar', 'Foo')
ID : 0
So the output is 0, which is the default value. It is passed explicitly in the INSERT because you set a default on the auto-increment column.
If I run the same code without the default, it gives the correct output.
Please try this code:
from sqlalchemy import create_engine, Column, Integer, Unicode
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import sessionmaker

engine = create_engine('mysql://test:test@localhost/test1', echo=True)
Base = declarative_base()
Session = sessionmaker(bind=engine)
session = Session()

class Foo(Base):
    __tablename__ = 'tblfoo'
    __table_args__ = {'mysql_engine':'InnoDB'}
    # No default on the primary key, so MySQL assigns the auto-increment value.
    ModelID = Column(Integer, primary_key=True, autoincrement=True)
    ModelName = Column(Unicode(255), nullable=True, index=True)
    ModelMemo = Column(Unicode(255), nullable=True)

Base.metadata.create_all(engine)

f = Foo(ModelName='Bar', ModelMemo='Foo')
session.add(f)
session.flush()
print("ID : %s" % f.ModelID)
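The same behaviour can be checked without a MySQL server. The following is a minimal sketch, assuming SQLAlchemy 1.4 or later and an in-memory SQLite database standing in for MySQL (the engine URL and the trimmed-down model are assumptions made for illustration). With no default on the primary key, flush() is enough to populate ModelID.

```python
from sqlalchemy import Column, Integer, Unicode, create_engine
from sqlalchemy.orm import declarative_base, sessionmaker

Base = declarative_base()


class Foo(Base):
    __tablename__ = "tblfoo"
    # No default=0 here: the database generates the key.
    ModelID = Column(Integer, primary_key=True, autoincrement=True)
    ModelName = Column(Unicode(255), nullable=True, index=True)


# Assumption: in-memory SQLite instead of the MySQL server from the question.
engine = create_engine("sqlite://")
Base.metadata.create_all(engine)
session = sessionmaker(bind=engine)()

f = Foo(ModelName="Bar")
session.add(f)
id_before_flush = f.ModelID  # still None, nothing has been sent to the DB yet

session.flush()  # emits the INSERT inside the open transaction
new_id = f.ModelID  # now holds the key generated by the database

session.commit()
print("ID :", new_id)
```

The value is read straight from the instance after the flush, so no extra lookup query is needed.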
Try using session.commit() instead of session.flush(). You can then use f.ModelID.
I'm not sure why the flagged answer worked for you. In my case, the row does not actually end up in the table unless I call commit() at the end.
So the last few lines of code are:
f = Foo(ModelName='Bar', ModelMemo='Foo')
session.add(f)
session.flush()
print("ID: %s" % f.ModelID)
session.commit()
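The difference between the two calls can be sketched as follows, again assuming SQLAlchemy 1.4 or later and an in-memory SQLite database rather than MySQL (the model is a cut-down stand-in). flush() sends the INSERT inside the current transaction, so the id is already available, but the row only becomes permanent on commit() and disappears on rollback().

```python
from sqlalchemy import Column, Integer, Unicode, create_engine
from sqlalchemy.orm import declarative_base, sessionmaker

Base = declarative_base()


class Foo(Base):
    __tablename__ = "tblfoo"
    ModelID = Column(Integer, primary_key=True, autoincrement=True)
    ModelName = Column(Unicode(255))


# Assumption: in-memory SQLite, used only to keep the sketch self-contained.
engine = create_engine("sqlite://")
Base.metadata.create_all(engine)
session = sessionmaker(bind=engine)()

# flush() emits the INSERT, so the key is populated ...
discarded = Foo(ModelName="discarded")
session.add(discarded)
session.flush()
id_after_flush = discarded.ModelID

# ... but rolling back the transaction removes the row again.
session.rollback()
rows_after_rollback = session.query(Foo).count()

# flush() followed by commit() makes the row permanent.
kept = Foo(ModelName="kept")
session.add(kept)
session.flush()
session.commit()
rows_after_commit = session.query(Foo).count()
```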
I need to change a project that currently uses the library mysqlclient to use pymysql because of license issues.
The project uses sqlalchemy and doesn't use mysqlclient directly, so I thought I would only need to change the connection string, but I seem to have hit an edge case.
I have places in the code where some columns are defined in the sqlalchemy model as String, but for some reason (old code) the code tries to put a dict there. With mysqlclient this works because the dict is cast to str (this is the expected behaviour for all types: if I put an int there, it is cast to str).
When I change from mysqlclient to pymysql, this behaviour seems to break, but only for dicts.
Here is sample code that replicates the issue:
import sqlalchemy
from sqlalchemy import Column, Integer, String, DateTime, func, text, MetaData
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import sessionmaker

SCHEMA = 'testing'
con = "mysql+pymysql://{USERNAME}:{PASSWORD}@{HOST}/{SCHEMA}?charset=utf8mb4".format(
    USERNAME='redacted',
    PASSWORD='redacted',
    HOST='127.0.0.1:3306',
    SCHEMA=SCHEMA)
engine = sqlalchemy.create_engine(con, pool_recycle=3600, pool_size=20, pool_pre_ping=True, max_overflow=100)
metadata = MetaData(bind=engine)
base = declarative_base(metadata=metadata)

class TestModel(base):
    __tablename__ = 'test_table'
    __table_args__ = {'autoload': False,
                      'schema': SCHEMA
                      }
    id = Column(Integer(), primary_key=True, nullable=False, index=True)
    test_value = Column(String(50), nullable=False)
    date_created = Column(DateTime, server_default=func.now(), index=True)
    date_modified = Column(DateTime, server_default=text('CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP'), index=True)

metadata.create_all()
session_maker = sessionmaker(bind=engine)
session = session_maker()

row = TestModel()
row.test_value = {}  # a dict assigned to a String column
session.add(row)
session.commit()
This causes the following error: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near '})' at line 1
If you change pymysql in the connection string to mysqldb, the code works.
My question is this:
Is there a workaround, or is there a sqlalchemy hook I can use to cast the dicts myself?
Also, if anyone knows about other issues in moving from mysqlclient to pymysql, I would appreciate any tips. I can't seem to find any documentation of the differences (except the license part).
"Is there a sqlalchemy hook I can use to cast the dicts myself?"
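You could add a validator to your TestModel class (validates is imported from sqlalchemy.orm):
    @validates("test_value")
    def validate_test_value(self, key, thing):
        # Cast dicts to str before the value is assigned to the attribute.
        if isinstance(thing, dict):
            return str(thing)
        else:
            return thing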
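To show where the validator sits in the model, here is a self-contained sketch. It assumes SQLAlchemy 1.4 or later and uses an in-memory SQLite database instead of MySQL, with a reduced version of the question's model, so it illustrates the hook rather than the pymysql behaviour itself.

```python
from sqlalchemy import Column, Integer, String, create_engine
from sqlalchemy.orm import declarative_base, sessionmaker, validates

Base = declarative_base()


class TestModel(Base):
    __tablename__ = "test_table"
    id = Column(Integer, primary_key=True)
    test_value = Column(String(50), nullable=False)

    @validates("test_value")
    def validate_test_value(self, key, thing):
        # Runs on every assignment to test_value; the returned value is
        # what actually gets stored on the instance.
        if isinstance(thing, dict):
            return str(thing)
        return thing


# Assumption: in-memory SQLite keeps the example runnable anywhere.
engine = create_engine("sqlite://")
Base.metadata.create_all(engine)
session = sessionmaker(bind=engine)()

row = TestModel()
row.test_value = {}  # converted to the string "{}" by the validator
session.add(row)
session.commit()

stored_value = session.query(TestModel.test_value).scalar()
```

Because the conversion happens at assignment time, the database driver only ever sees a plain string.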
I am just trying to get started using sqlalchemy, but for whatever reason I can't get anything to work.
I installed sqlalchemy, and the import alone works. I then tried to follow the code on this site:
https://www.pythoncentral.io/introductory-tutorial-python-sqlalchemy/
The code is as follows:
import os
import sys
from sqlalchemy import Column, ForeignKey, Integer, String
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import relationship
from sqlalchemy import create_engine

Base = declarative_base()

class Person(Base):
    __tablename__ = 'person'
    # Here we define columns for the table person
    # Notice that each column is also a normal Python instance attribute.
    id = Column(Integer, primary_key=True)
    name = Column(String(250), nullable=False)

class Address(Base):
    __tablename__ = 'address'
    # Here we define columns for the table address.
    # Notice that each column is also a normal Python instance attribute.
    id = Column(Integer, primary_key=True)
    street_name = Column(String(250))
    street_number = Column(String(250))
    post_code = Column(String(250), nullable=False)
    person_id = Column(Integer, ForeignKey('person.id'))
    person = relationship(Person)

# Create an engine that stores data in the local directory's
# sqlalchemy_example.db file.
engine = create_engine('sqlite:///sqlalchemy_example.db')

# Create all tables in the engine. This is equivalent to "Create Table"
# statements in raw SQL.
Base.metadata.create_all(engine)
I copied and pasted the code to create a table and I'm getting the following error
sqlalchemy.exc.OperationalError: (sqlite3.OperationalError) unable to
open database file (Background on this error at:
http://sqlalche.me/e/e3q8)
I went to http://sqlalche.me/e/e3q8, and it seems to suggest that adding pool_pre_ping=True to the engine would help resolve the issue. It mentions connection issues, but I don't really understand how that can apply, since the code is just creating the SQLite database.
I would really appreciate any advice on how I can fix this issue.
Edit: I put the specific code into my question.
Also I tried performing the code in pythonanywhere and it works as expected. Any guidance on what could be wrong with my machine would be appreciated.
So for whatever reason I needed to designate the absolute path of where the database needed to be. I updated my engine to be:
sqlite:///C:\user\file_path\test.db
This allowed it to create the database. However, I'd really prefer that it just create the database in the current directory. If someone knows what I need to do to get that to work, that would be great.
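A relative URL such as sqlite:///sqlalchemy_example.db is resolved against the process's current working directory, which is not necessarily the folder that contains the script (an IDE or launcher may start Python somewhere else, possibly in a directory that is not writable). That is a plausible but unconfirmed explanation for the error above. A minimal sketch of one way around it is to build an absolute URL from a directory you choose; the helper name sqlite_url is made up for illustration.

```python
import os


def sqlite_url(directory, filename):
    """Return a SQLAlchemy-style SQLite URL with an absolute file path."""
    path = os.path.abspath(os.path.join(directory, filename))
    # "sqlite:///" followed by an absolute path works for POSIX paths
    # (giving four slashes in total) and for Windows drive-letter paths.
    return "sqlite:///" + path


# Anchor the file to the current working directory. In a script,
# os.path.dirname(os.path.abspath(__file__)) would anchor it to the
# script's own folder instead.
url = sqlite_url(os.getcwd(), "sqlalchemy_example.db")
print(url)
```

The resulting string can be passed to create_engine(url). Printing os.getcwd() on the affected machine would also show which directory the relative URL was being resolved against.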
I've been playing with SQL Alchemy for a couple of months now and so far been really impressed with it.
There is one issue I've run into now that seems to be a bug, but I'm not sure that I'm doing the right thing. We use MS SQL here, with table reflection to define the table classes, however I can replicate the problem using an in-memory SQLite database, code for which I have included here.
What I am doing is defining a many-to-many relationship between two tables using a linking table between them. There is one extra piece of information that the linking table contains which I want to use for filtering the links, requiring the use of a primaryjoin statement on the relationship. This works perfectly for lazy loading; however, for performance reasons we need eager loading, and that's where it all falls over.
If I define the relationship with lazy loading:
activefunds = relationship('Fund', secondary='fundbenchmarklink',
                           primaryjoin='and_(FundBenchmarkLink.isactive==True,'
                                       'Benchmark.id==FundBenchmarkLink.benchmarkid,'
                                       'Fund.id==FundBenchmarkLink.fundid)')
and query the DB normally:
query = session.query(Benchmark)
The behaviour is exactly what I want, though performance is really bad because of the extra SQL queries issued when iterating through all of the benchmarks and their respective funds.
If I define the relationship with eager loading:
activefunds = relationship('Fund', secondary='fundbenchmarklink',
                           primaryjoin='and_(FundBenchmarkLink.isactive==True,'
                                       'Benchmark.id==FundBenchmarkLink.benchmarkid,'
                                       'Fund.id==FundBenchmarkLink.fundid)',
                           lazy='joined')
and query the DB normally:
query = session.query(Benchmark)
it blows up in my face:
sqlalchemy.exc.OperationalError: (sqlite3.OperationalError) no such column: fund.id
[SQL: 'SELECT benchmark.id AS benchmark_id,
benchmark.name AS benchmark_name,
fund_1.id AS fund_1_id,
fund_1.name AS fund_1_name,
fund_2.id AS fund_2_id,
fund_2.name AS fund_2_name
FROM benchmark
LEFT OUTER JOIN (fundbenchmarklink AS fundbenchmarklink_1
JOIN fund AS fund_1 ON fund_1.id = fundbenchmarklink_1.fundid) ON benchmark.id = fundbenchmarklink_1.benchmarkid
LEFT OUTER JOIN (fundbenchmarklink AS fundbenchmarklink_2
JOIN fund AS fund_2 ON fund_2.id = fundbenchmarklink_2.fundid) ON fundbenchmarklink_2.isactive = 1
AND benchmark.id = fundbenchmarklink_2.benchmarkid
AND fund.id = fundbenchmarklink_2.fundid']
The SQL above clearly shows the linked table is not being joined before attempting to access columns from it.
If I query the DB, specifically joining the linked table:
query = session.query(Benchmark).join(FundBenchmarkLink, Fund, isouter=True)
It works, however it means I now have to make sure that whenever I query the Benchmark table, I always have to define the join to add both of the extra tables.
Is there something I'm missing, is this a potential bug, or is it simply the way the library works?
Full working sample code to replicate issue:
import logging
logging.basicConfig(level=logging.INFO)
logging.getLogger('sqlalchemy.engine.base').setLevel(logging.INFO)

from sqlalchemy import Column, DateTime, String, Integer, Boolean, ForeignKey, create_engine
from sqlalchemy.orm import relationship, sessionmaker
from sqlalchemy.ext.declarative import declarative_base

Base = declarative_base()

class FundBenchmarkLink(Base):
    __tablename__ = 'fundbenchmarklink'
    fundid = Column(Integer, ForeignKey('fund.id'), primary_key=True, autoincrement=False)
    benchmarkid = Column(Integer, ForeignKey('benchmark.id'), primary_key=True, autoincrement=False)
    isactive = Column(Boolean, nullable=False, default=True)
    fund = relationship('Fund')
    benchmark = relationship('Benchmark')

    def __repr__(self):
        return "<FundBenchmarkLink(fundid='{}', benchmarkid='{}', isactive='{}')>".format(self.fundid, self.benchmarkid, self.isactive)

class Benchmark(Base):
    __tablename__ = 'benchmark'
    id = Column(Integer, primary_key=True)
    name = Column(String, nullable=False)
    funds = relationship('Fund', secondary='fundbenchmarklink', lazy='joined')
    # activefunds has additional filtering on the secondary table, requiring a primaryjoin statement.
    activefunds = relationship('Fund', secondary='fundbenchmarklink',
                               primaryjoin='and_(FundBenchmarkLink.isactive==True,'
                                           'Benchmark.id==FundBenchmarkLink.benchmarkid,'
                                           'Fund.id==FundBenchmarkLink.fundid)',
                               lazy='joined')

    def __repr__(self):
        return "<Benchmark(id='{}', name='{}')>".format(self.id, self.name)

class Fund(Base):
    __tablename__ = 'fund'
    id = Column(Integer, primary_key=True)
    name = Column(String, nullable=False)

    def __repr__(self):
        return "<Fund(id='{}', name='{}')>".format(self.id, self.name)

if '__main__' == __name__:
    engine = create_engine('sqlite://')
    Base.metadata.create_all(engine)
    maker = sessionmaker(bind=engine)
    session = maker()

    # Create some data
    for bmkname in ['foo', 'bar', 'baz']:
        bmk = Benchmark(name=bmkname)
        session.add(bmk)
    for fname in ['fund1', 'fund2', 'fund3']:
        fnd = Fund(name=fname)
        session.add(fnd)
    session.add(FundBenchmarkLink(fundid=1, benchmarkid=1))
    session.add(FundBenchmarkLink(fundid=2, benchmarkid=1))
    session.add(FundBenchmarkLink(fundid=1, benchmarkid=2))
    session.add(FundBenchmarkLink(fundid=2, benchmarkid=2, isactive=False))
    session.commit()

    # This code snippet works when activefunds doesn't exist, or doesn't use eager loading
    # query = session.query(Benchmark)
    # print(query)
    # for bmk in query:
    #     print(bmk)
    #     for fund in bmk.funds:
    #         print('\t{}'.format(fund))

    # This code snippet works for activefunds with eager loading
    query = session.query(Benchmark).join(FundBenchmarkLink, Fund, isouter=True)
    print(query)
    for bmk in query:
        print(bmk)
        for fund in bmk.activefunds:
            print('\t{}'.format(fund))
I think you've mixed the primary join and the secondary join a bit. Your primary would seem to contain both at the moment. Remove the predicate for Fund and it should work:
activefunds = relationship(
    'Fund',
    secondary='fundbenchmarklink',
    primaryjoin='and_(FundBenchmarkLink.isactive==True,'
                'Benchmark.id==FundBenchmarkLink.benchmarkid)',
    lazy='joined')
The reason why your explicit join seems to fix the query is that it introduces the table fund before the implicit eager loading joins, so they can refer to it. It's not really a fix; rather, it hides the error. If you really want to use an explicit Query.join() with eager loading, inform the query about it with contains_eager(). Just be careful which relationship you choose as being contained, depending on the query in question; without additional filtering you could fill activefunds with inactive funds as well.
Finally, consider using Query.outerjoin() instead of Query.join(..., isouter=True).
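A trimmed-down sketch of the corrected mapping, assuming SQLAlchemy 1.4 or later and the same in-memory SQLite setup as the question. Only the activefunds relationship is kept, and the sample rows are invented for illustration. With the Fund predicate removed from primaryjoin, a plain query eager-loads only the active funds.

```python
from sqlalchemy import Boolean, Column, ForeignKey, Integer, String, create_engine
from sqlalchemy.orm import declarative_base, relationship, sessionmaker

Base = declarative_base()


class FundBenchmarkLink(Base):
    __tablename__ = "fundbenchmarklink"
    fundid = Column(Integer, ForeignKey("fund.id"), primary_key=True, autoincrement=False)
    benchmarkid = Column(Integer, ForeignKey("benchmark.id"), primary_key=True, autoincrement=False)
    isactive = Column(Boolean, nullable=False, default=True)


class Fund(Base):
    __tablename__ = "fund"
    id = Column(Integer, primary_key=True)
    name = Column(String, nullable=False)


class Benchmark(Base):
    __tablename__ = "benchmark"
    id = Column(Integer, primary_key=True)
    name = Column(String, nullable=False)
    # primaryjoin only describes Benchmark -> link table (plus the filter);
    # the link table -> Fund half (secondaryjoin) is derived from the FKs.
    activefunds = relationship(
        "Fund",
        secondary="fundbenchmarklink",
        primaryjoin="and_(FundBenchmarkLink.isactive == True, "
                    "Benchmark.id == FundBenchmarkLink.benchmarkid)",
        lazy="joined",
    )


engine = create_engine("sqlite://")
Base.metadata.create_all(engine)
session = sessionmaker(bind=engine)()

# Made-up sample rows: one benchmark, one active and one inactive link.
session.add_all([
    Benchmark(id=1, name="foo"),
    Fund(id=1, name="fund1"),
    Fund(id=2, name="fund2"),
])
session.flush()
session.add_all([
    FundBenchmarkLink(fundid=1, benchmarkid=1),
    FundBenchmarkLink(fundid=2, benchmarkid=1, isactive=False),
])
session.commit()

benchmarks = session.query(Benchmark).all()
active_names = [fund.name for fund in benchmarks[0].activefunds]
print(active_names)
```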
I have a sqlalchemy schema containing three tables (A, B, and C) related via one-to-many Foreign Key relationships (A->B and B->C), with SQLite as a backend. I create separate database files to store data, each of which uses the exact same sqlalchemy Models and runs identical code to put data into them.
I want to be able to copy data from all these individual databases and put them into a single new database file, while preserving the Foreign Key relationships. I tried the following code to copy data from one file to a new file:
import sqlalchemy
from sqlalchemy.ext import declarative
from sqlalchemy import Column, String, Integer
from sqlalchemy import orm

Base = declarative.declarative_base()
Session = orm.sessionmaker()

class A(Base):
    __tablename__ = 'A'
    a_id = Column(Integer, primary_key=True)
    adata = Column(String)
    b = orm.relationship('B', back_populates='a', cascade='all, delete-orphan', passive_deletes=True)

class B(Base):
    __tablename__ = 'B'
    b_id = Column(Integer, primary_key=True)
    a_id = Column(Integer, sqlalchemy.ForeignKey('A.a_id', ondelete='SET NULL'))
    bdata = Column(String)
    a = orm.relationship('A', back_populates='b')
    c = orm.relationship('C', back_populates='b', cascade='all, delete-orphan', passive_deletes=True)

class C(Base):
    __tablename__ = 'C'
    c_id = Column(Integer, primary_key=True)
    b_id = Column(Integer, sqlalchemy.ForeignKey('B.b_id', ondelete='SET NULL'))
    cdata = Column(String)
    b = orm.relationship('B', back_populates='c')

file_new = 'file_new.db'
resource_new = 'sqlite:////%s' % file_new.lstrip('/')
engine_new = sqlalchemy.create_engine(resource_new, echo=False)
session_new = Session(bind=engine_new)

file_old = 'file_old.db'
resource_old = 'sqlite:////%s' % file_old.lstrip('/')
engine_old = sqlalchemy.create_engine(resource_old, echo=False)
session_old = Session(bind=engine_old)

for arow in session_old.query(A):
    session_new.add(arow)  # I am assuming that this will somehow know to copy all the child rows from the tables B and C due to the Foreign Key.
When run, I get the error, "Object '' is already attached to session '2' (this is '1')". Any pointers on how to do this using sqlalchemy and sessions? I also want to preserve the Foreign Key relationships within each database.
The use case is where data is first generated locally in non-networked machines and aggregated into a central db on the cloud. While the data will get generated in SQLite, the merge might happen in MySQL or Postgres, although here everything is happening in SQLite for simplicity.
First, the reason you get that error is because the instance arow is still tracked by session_old, so session_new will refuse to deal with it. You can detach it from session_old:
session_old.expunge(arow)
This will allow you to add arow to session_new without issue, but you'll notice that nothing gets inserted into file_new. This is because SQLAlchemy knows that arow is persistent (meaning there's a row in the db corresponding to this object), and when you detach it and add it to session_new, SQLAlchemy still thinks it's persistent, so it does not get inserted again.
This is where Session.merge comes in. One caveat is that it won't merge unloaded relationships, so you'll need to eager load all the relationships you want to merge:
query = session_old.query(A).options(orm.subqueryload(A.b),
                                     orm.subqueryload(A.b, B.c))
for arow in query:
    session_new.merge(arow)
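An end-to-end sketch of the merge approach, assuming SQLAlchemy 1.4 or later. To stay self-contained it uses two in-memory SQLite engines in place of file_old.db and file_new.db and only the A and B tables; the sample rows are invented.

```python
from sqlalchemy import Column, ForeignKey, Integer, String, create_engine
from sqlalchemy.orm import declarative_base, relationship, sessionmaker, subqueryload

Base = declarative_base()


class A(Base):
    __tablename__ = "A"
    a_id = Column(Integer, primary_key=True)
    adata = Column(String)
    b = relationship("B", back_populates="a")


class B(Base):
    __tablename__ = "B"
    b_id = Column(Integer, primary_key=True)
    a_id = Column(Integer, ForeignKey("A.a_id"))
    bdata = Column(String)
    a = relationship("A", back_populates="b")


# Assumption: two separate in-memory databases stand in for the two files.
engine_old = create_engine("sqlite://")
engine_new = create_engine("sqlite://")
Base.metadata.create_all(engine_old)
Base.metadata.create_all(engine_new)  # the target needs the tables too
session_old = sessionmaker(bind=engine_old)()
session_new = sessionmaker(bind=engine_new)()

# Sample data in the "old" database.
session_old.add(A(adata="a1", b=[B(bdata="b1"), B(bdata="b2")]))
session_old.commit()

# Eager-load the children so merge() can cascade into them, then copy.
for arow in session_old.query(A).options(subqueryload(A.b)):
    session_new.merge(arow)
session_new.commit()

copied_a = session_new.query(A).count()
copied_b = session_new.query(B).count()
```

Note that merge() matches rows by primary key, so rows from different source files that share a key value would be treated as the same row in the target.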
I have two tables, Eca_users and Eca_user_emails; one user can have many emails. I receive JSON with users and their emails, and I want to load them into an MS SQL database. Users can update their emails, so in this JSON I can get the same users with new (or changed) emails.
My code:
# some import here (the names used below)
import sqlalchemy
from sqlalchemy import Column, DateTime, ForeignKey, Integer, String
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import backref, relationship, sessionmaker

Base = declarative_base()

class Eca_users(Base):
    __tablename__ = 'eca_users'
    sql_id = sqlalchemy.Column(sqlalchemy.Integer(), primary_key = True)
    first_id = sqlalchemy.Column(sqlalchemy.String(15))
    name = sqlalchemy.Column(sqlalchemy.String(200))
    main_email = sqlalchemy.Column(sqlalchemy.String(200))
    user_emails = relationship("Eca_user_emails", backref=backref('eca_users'))

class Eca_user_emails(Base):
    __tablename__ = 'user_emails'
    sql_id = sqlalchemy.Column(sqlalchemy.Integer(), primary_key = True)
    email_address = Column(String(200), nullable=False)
    status = Column(String(10), nullable=False)
    active = Column(DateTime, nullable=True)
    sql_user_id = Column(Integer, ForeignKey('eca_users.sql_id'))

def main():
    engine = sqlalchemy.create_engine('mssql+pymssql://user:pass/ECAusers?charset=utf8')
    Session = sessionmaker()
    Session.configure(bind = engine)
    session = Session()
    # then I get my json, parse it and...
    query = session.query(Eca_users).filter(Eca_users.first_id == str(user_id))
    if query.count() == 0:
        pass  # not interesting now
    else:
        for exstUser in query:
            exstUser.name = name  # update user info
            exstUser.user_emails = []  # empty old emails
            # creating new Email obj
            newEmail = Eca_user_emails(email_address = email_record['email'],
                                       status = email_record['status'],
                                       active = active_date)
            exstUser.user_emails.append(newEmail)  # and I get error here because autoflush
    session.commit()

if __name__ == '__main__':
    main()
Error message:
sqlalchemy.exc.IntegrityError: ...
[SQL: 'UPDATE user_emails SET sql_user_id=%(sql_user_id)s WHERE user_emails.sql_id = %(user_emails_sql_id)s'] [parameters: {'sql_user_id': None, 'user_emails_sql_id': Decimal('1')}]
I can't figure out why this sql_user_id is None :(
When I check the exstUser and newEmail objects in the debugger, everything looks fine; I mean all the references are OK. The session object and its dirty attribute also look OK in the debugger (sql_user_id is set for the Eca_user_emails object).
What is strangest to me is that this code worked absolutely fine when it was not inside a main function, with all the code placed directly after the class declarations. After I wrote the main declaration and moved all the code into it, I started to get this error.
I am completely new to Python, so maybe this is one of those stupid mistakes...
Any ideas how to fix it and what is the reason? Thanks for reading this :)
By the way: Python 3.4, sqlalchemy 1.0, SQL Server 2012
sql_user_id is None because, by default, SQLAlchemy clears out the foreign key when you remove a child object from a relationship. That is, when you clear exstUser.user_emails, SQLAlchemy sets sql_user_id to None for all those instances. If you want SQLAlchemy to issue DELETEs for Eca_user_emails instances when they are detached from Eca_users, you need to add the delete-orphan cascade option to the user_emails relationship. If you want SQLAlchemy to issue DELETEs for Eca_user_emails instances when an Eca_users instance is deleted, you need to add the delete cascade option to the user_emails relationship.
user_emails = relationship("Eca_user_emails", backref=backref('eca_users'), cascade="save-update, merge, delete, delete-orphan")
You can find more information about cascades in the SQLAlchemy docs
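The effect of the cascade can be sketched on a small stand-in model, assuming SQLAlchemy 1.4 or later and an in-memory SQLite database instead of SQL Server. The User and Email classes and the sample addresses are invented for illustration. With delete-orphan in place, replacing the collection deletes the old email row instead of trying to set its foreign key to NULL.

```python
from sqlalchemy import Column, ForeignKey, Integer, String, create_engine
from sqlalchemy.orm import declarative_base, relationship, sessionmaker

Base = declarative_base()


class User(Base):
    __tablename__ = "users"
    id = Column(Integer, primary_key=True)
    name = Column(String(200))
    emails = relationship(
        "Email",
        backref="user",
        cascade="save-update, merge, delete, delete-orphan",
    )


class Email(Base):
    __tablename__ = "emails"
    id = Column(Integer, primary_key=True)
    address = Column(String(200), nullable=False)
    # NOT NULL, so an UPDATE setting this to NULL would fail.
    user_id = Column(Integer, ForeignKey("users.id"), nullable=False)


engine = create_engine("sqlite://")
Base.metadata.create_all(engine)
session = sessionmaker(bind=engine)()

user = User(name="alice", emails=[Email(address="old@example.com")])
session.add(user)
session.commit()

# Replacing the collection orphans the old Email; delete-orphan turns
# that into a DELETE rather than an UPDATE ... SET user_id = NULL.
user.emails = [Email(address="new@example.com")]
session.commit()

remaining = [e.address for e in session.query(Email).order_by(Email.id)]
print(remaining)
```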