I have two tables, Eca_users and Eca_user_emails; one user can have many emails. I receive JSON with users and their emails, and I want to load them into an MS SQL database. Users can update their emails, so the JSON may contain the same users with new (or changed) emails.
My code
# some imports here
Base = declarative_base()

class Eca_users(Base):
    __tablename__ = 'eca_users'
    sql_id = sqlalchemy.Column(sqlalchemy.Integer(), primary_key = True)
    first_id = sqlalchemy.Column(sqlalchemy.String(15))
    name = sqlalchemy.Column(sqlalchemy.String(200))
    main_email = sqlalchemy.Column(sqlalchemy.String(200))
    user_emails = relationship("Eca_user_emails", backref=backref('eca_users'))

class Eca_user_emails(Base):
    __tablename__ = 'user_emails'
    sql_id = sqlalchemy.Column(sqlalchemy.Integer(), primary_key = True)
    email_address = Column(String(200), nullable=False)
    status = Column(String(10), nullable=False)
    active = Column(DateTime, nullable=True)
    sql_user_id = Column(Integer, ForeignKey('eca_users.sql_id'))

def main():
    engine = sqlalchemy.create_engine('mssql+pymssql://user:pass/ECAusers?charset=utf8')
    Session = sessionmaker()
    Session.configure(bind = engine)
    session = Session()
    # then I get my json, parse it and...
    query = session.query(Eca_users).filter(Eca_users.first_id == str(user_id))
    if query.count() == 0:
        pass  # not interesting now
    else:
        for exstUser in query:
            exstUser.name = name  # update user info
            exstUser.user_emails = []  # empty old emails
            # creating new Email obj
            newEmail = Eca_user_emails(email_address = email_record['email'],
                                       status = email_record['status'],
                                       active = active_date)
            exstUser.user_emails.append(newEmail)  # and I get error here because autoflush
    session.commit()

if __name__ == '__main__':
    main()
Error message:
sqlalchemy.exc.IntegrityError: ...
[SQL: 'UPDATE user_emails SET sql_user_id=%(sql_user_id)s WHERE user_emails.sql_id = %(user_emails_sql_id)s'] [parameters: {'sql_user_id': None, 'user_emails_sql_id': Decimal('1')}]
I can't figure out why sql_user_id is None :(
When I check the exstUser and newEmail objects in the debugger, everything looks fine; all the references are OK. The session object and its dirty attribute also look OK in the debugger (sql_user_id is set for the Eca_user_emails object).
What is strangest to me is that this code worked absolutely fine when it had no main function, with all the code placed directly after the class declarations. After I wrote the main declaration and moved the code into it, I started getting this error.
I am completely new to Python, so maybe this is one of those stupid mistakes...
Any ideas on how to fix it, and what is the reason? Thanks for reading this :)
By the way: Python 3.4, sqlalchemy 1.0, SQL Server 2012
sql_user_id is None because, by default, SQLAlchemy clears the foreign key when you remove a child object from a relationship; that is, when you clear exstUser.user_emails, SQLAlchemy sets sql_user_id to None for all those instances. If you want SQLAlchemy to issue DELETEs for Eca_user_emails instances when they are detached from Eca_users, you need to add the delete-orphan cascade option to the user_emails relationship. If you want SQLAlchemy to issue DELETEs for Eca_user_emails instances when an Eca_users instance is deleted, you need to add the delete cascade option to the user_emails relationship.
user_emails = relationship("Eca_user_emails", backref=backref('eca_users'), cascade="save-update, merge, delete, delete-orphan")
You can find more information about cascades in the SQLAlchemy docs
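As a minimal sketch of the delete-orphan behaviour described above, the following uses an in-memory SQLite database and hypothetical User and Email models (stand-ins for Eca_users and Eca_user_emails, not the asker's schema). It assumes SQLAlchemy 1.4 or later. Replacing the collection causes the old email row to be deleted instead of having its foreign key set to NULL.

```python
from sqlalchemy import Column, ForeignKey, Integer, String, create_engine
from sqlalchemy.orm import declarative_base, relationship, sessionmaker

Base = declarative_base()


class User(Base):  # hypothetical stand-in for Eca_users
    __tablename__ = "users"
    id = Column(Integer, primary_key=True)
    name = Column(String(200))
    # delete-orphan: emails removed from the collection are DELETEd on flush
    emails = relationship("Email", backref="user", cascade="all, delete-orphan")


class Email(Base):  # hypothetical stand-in for Eca_user_emails
    __tablename__ = "emails"
    id = Column(Integer, primary_key=True)
    address = Column(String(200), nullable=False)
    user_id = Column(Integer, ForeignKey("users.id"), nullable=False)


engine = create_engine("sqlite://")  # in-memory database, assumption for the demo
Base.metadata.create_all(engine)
session = sessionmaker(bind=engine)()

user = User(name="alice", emails=[Email(address="old@example.com")])
session.add(user)
session.commit()

# Replace the whole collection; the old Email becomes an orphan and is deleted.
user.emails = [Email(address="new@example.com")]
session.commit()

remaining = [e.address for e in session.query(Email).order_by(Email.id)]
print(remaining)
```

Without the cascade option, the same assignment would try to UPDATE the old row with a NULL user_id, which is what fails when the column does not allow NULL.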
Related
I'm having this issue, where sqlalchemy does not recognize the database, even though it is declared with declarative_base. After trying to run a simple query of session.query(AppGeofencing).all() I get sqlalchemy.exc.OperationalError: (pymysql.err.OperationalError) (1046, 'No database selected').
The table is declared as
Base = declarative_base()
AppBase = declarative_base(metadata=MetaData(schema='app'))
class AppGeofencing(AppBase):
    __tablename__ = 'geofencing'
    id = Column(INTEGER, primary_key=True, autoincrement=True)
    name = Column(VARCHAR(45))
    polygon = Column(Geometry('POLYGON'))

    def __init__(self, name=None, polygon=None):
        self.name = name
        self.polygon = polygon
This happens only with this table; I have declared other tables the same way, and they work just fine.
After enabling logging for sqlalchemy, I can see that it does indeed create the correct query:
INFO:sqlalchemy.engine.Engine:SELECT app.geofencing.id AS app_geofencing_id, app.geofencing.name AS app_geofencing_name, ST_AsEWKB(app.geofencing.polygon) AS app_geofencing_polygon
FROM app.geofencing
but somehow it cannot determine which database to use?
Does anyone have any idea what could cause such an issue?
I have a sqlalchemy schema containing three tables, (A, B, and C) related via one-to-many Foreign Key relationships (between A->B) and (B->C) with SQLite as a backend. I create separate database files to store data, each of which use the exact same sqlalchemy Models and run identical code to put data into them.
I want to be able to copy data from all these individual databases and put them into a single new database file, while preserving the Foreign Key relationships. I tried the following code to copy data from one file to a new file:
import sqlalchemy
from sqlalchemy.ext import declarative
from sqlalchemy import Column, String, Integer
from sqlalchemy import orm, engine

Base = declarative.declarative_base()
Session = orm.sessionmaker()

class A(Base):
    __tablename__ = 'A'
    a_id = Column(Integer, primary_key=True)
    adata = Column(String)
    b = orm.relationship('B', back_populates='a', cascade='all, delete-orphan', passive_deletes=True)

class B(Base):
    __tablename__ = 'B'
    b_id = Column(Integer, primary_key=True)
    a_id = Column(Integer, sqlalchemy.ForeignKey('A.a_id', ondelete='SET NULL'))
    bdata = Column(String)
    a = orm.relationship('A', back_populates='b')
    c = orm.relationship('C', back_populates='b', cascade='all, delete-orphan', passive_deletes=True)

class C(Base):
    __tablename__ = 'C'
    c_id = Column(Integer, primary_key=True)
    b_id = Column(Integer, sqlalchemy.ForeignKey('B.b_id', ondelete='SET NULL'))
    cdata = Column(String)
    b = orm.relationship('B', back_populates='c')

file_new = 'file_new.db'
resource_new = 'sqlite:////%s' % file_new.lstrip('/')
engine_new = sqlalchemy.create_engine(resource_new, echo=False)
session_new = Session(bind=engine_new)

file_old = 'file_old.db'
resource_old = 'sqlite:////%s' % file_old.lstrip('/')
engine_old = sqlalchemy.create_engine(resource_old, echo=False)
session_old = Session(bind=engine_old)

for arow in session_old.query(A):
    # I am assuming that this will somehow know to copy all the child rows
    # from the tables B and C due to the Foreign Key.
    session_new.add(arow)
When run, I get the error, "Object '' is already attached to session '2' (this is '1')". Any pointers on how to do this using sqlalchemy and sessions? I also want to preserve the Foreign Key relationships within each database.
The use case is where data is first generated locally in non-networked machines and aggregated into a central db on the cloud. While the data will get generated in SQLite, the merge might happen in MySQL or Postgres, although here everything is happening in SQLite for simplicity.
First, the reason you get that error is because the instance arow is still tracked by session_old, so session_new will refuse to deal with it. You can detach it from session_old:
session_old.expunge(arow)
This will allow you to add arow to session_new without issue, but you'll notice that nothing gets inserted into file_new. This is because SQLAlchemy knows that arow is persistent (meaning there's a row in the db corresponding to this object), and when you detach it and add it to session_new, SQLAlchemy still thinks it's persistent, so it does not get inserted again.
This is where Session.merge comes in. One caveat is that it won't merge unloaded relationships, so you'll need to eager load all the relationships you want to merge:
query = session_old.query(A).options(orm.subqueryload(A.b),
                                     orm.subqueryload(A.b, B.c))
for arow in query:
    session_new.merge(arow)
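A self-contained sketch of the merge approach, assuming SQLAlchemy 1.4 or later and two in-memory SQLite databases in place of file_old.db and file_new.db. Parent and Child are hypothetical two-level stand-ins for A, B and C. The relationship is eager loaded so that merge() cascades to the children.

```python
from sqlalchemy import Column, ForeignKey, Integer, String, create_engine
from sqlalchemy.orm import Session, declarative_base, relationship, subqueryload

Base = declarative_base()


class Parent(Base):  # hypothetical stand-in for A
    __tablename__ = "parent"
    id = Column(Integer, primary_key=True)
    data = Column(String)
    children = relationship("Child", back_populates="parent")


class Child(Base):  # hypothetical stand-in for B
    __tablename__ = "child"
    id = Column(Integer, primary_key=True)
    parent_id = Column(Integer, ForeignKey("parent.id"))
    data = Column(String)
    parent = relationship("Parent", back_populates="children")


# Two independent in-memory databases (assumption for the demo).
engine_old = create_engine("sqlite://")
engine_new = create_engine("sqlite://")
Base.metadata.create_all(engine_old)
Base.metadata.create_all(engine_new)

# Seed the "old" database.
with Session(engine_old) as seed:
    seed.add(Parent(data="p1", children=[Child(data="c1"), Child(data="c2")]))
    seed.commit()

# Copy: eager load the children, then merge each parent into the new session.
with Session(engine_old) as session_old, Session(engine_new) as session_new:
    query = session_old.query(Parent).options(subqueryload(Parent.children))
    for row in query:
        session_new.merge(row)
    session_new.commit()

with Session(engine_new) as check:
    copied = [(p.data, sorted(c.data for c in p.children))
              for p in check.query(Parent)]
print(copied)
```

Note that merge() reconciles objects by primary key, so when several source databases reuse the same key values, rows from a later source would overwrite rows copied earlier. Aggregating many files would likely need the keys to be remapped or cleared first.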
We are making a game server using SQLAlchemy.
Because game servers must be very fast, we have decided to split the data across separate databases depending on the user ID (an integer).
For example, I did it successfully like the following.
from threading import Thread
from sqlalchemy import Column, Integer, String, DateTime, create_engine
from sqlalchemy.ext.declarative import declarative_base, DeferredReflection
from sqlalchemy.orm import sessionmaker
DeferredBase = declarative_base(cls=DeferredReflection)
class BuddyModel(DeferredBase):
    __tablename__ = 'test_x'
    id = Column(Integer(), primary_key=True, autoincrement=True)
    value = Column(String(50), nullable=False)
The next code creates multiple databases.
There will be databases test0 through test9.
for i in range(10):
    url = 'mysql://user@localhost/'
    engine = create_engine(url, encoding='UTF-8', pool_recycle=300)
    con = engine.connect()
    con.execute('create database test%d' % i)
The following code creates 10 separate engines.
The get_engine() function gives you an engine depending on the user ID.
(The user ID is an integer.)
engines = []
for i in range(10):
    url = 'mysql://user@localhost/test%d' % i
    engine = create_engine(url, encoding='UTF-8', pool_recycle=300)
    DeferredBase.metadata.bind = engine
    DeferredBase.metadata.create_all()
    engines.append(engine)

def get_engine(user_id):
    index = user_id % 10
    return engines[index]
Running the prepare function prepares the BuddyModel class and maps it to the engine.
def prepare(user_id):
    engine = get_engine(user_id)
    DeferredBase.prepare(engine)

** The next code does exactly what I want **
for user_id in range(100):
    prepare(user_id)
    engine = get_engine(user_id)
    session = sessionmaker(engine)()
    buddy = BuddyModel()
    buddy.value = 'user_id: %d' % user_id
    session.add(buddy)
    session.commit()
But the problem is that when I do it in multiple threads, it raises errors.
class MetalMultidatabaseThread(Thread):
    def run(self):
        for user_id in range(100):
            prepare(user_id)
            engine = get_engine(user_id)
            session = sessionmaker(engine)()
            buddy = BuddyModel()
            buddy.value = 'user_id: %d' % user_id
            session.add(buddy)
            session.commit()

threads = []
for i in range(100):
    t = MetalMultidatabaseThread()
    t.daemon = True
    t.start()
    threads.append(t)
for t in threads:
    t.join()
The error message is:
ArgumentError: Class '<class '__main__.BuddyModel'>' already has a primary mapper defined. Use non_primary=True to create a non primary Mapper. clear_mappers() will remove *all* current mappers from all classes.
So my question is: how can I do multiple databases like the above architecture using SQLAlchemy?
This is called horizontal sharding, and it is a bit of a tricky use case. The version you have, which makes a session after getting the engine first, will work fine. There are two variants of this which you may like.
One is to use the horizontal sharding extension. This extension allows you to create a Session that automatically selects the correct node.
The other is more or less what you have, but less verbose. Build a Session class that has a routing function, so you could at least share a single session and say session.using_bind('engine1') for a query instead of making a whole new session.
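A rough sketch of the second variant, assuming SQLAlchemy 1.4 or later and ten in-memory SQLite engines in place of the MySQL databases. UserRoutingSession and its user_id argument are hypothetical names chosen for illustration. The sketch overrides Session.get_bind() and routes by user ID; it does not implement the using_bind() helper mentioned above.

```python
from sqlalchemy import Column, Integer, String, create_engine
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()


class BuddyModel(Base):
    __tablename__ = "test_x"
    id = Column(Integer, primary_key=True)
    value = Column(String(50), nullable=False)


# One in-memory SQLite engine per shard (assumption for the demo).
engines = [create_engine("sqlite://") for _ in range(10)]
for engine in engines:
    Base.metadata.create_all(engine)


class UserRoutingSession(Session):
    """Hypothetical session that sends all work to one user's shard."""

    def __init__(self, user_id, **kwargs):
        super().__init__(**kwargs)
        self.user_id = user_id

    def get_bind(self, mapper=None, clause=None, **kw):
        # Route every statement and flush to the shard chosen by user ID.
        return engines[self.user_id % len(engines)]


session = UserRoutingSession(user_id=114)
session.add(BuddyModel(value="user_id: 114"))
session.commit()
session.close()

# Verify directly against the shard engines: 114 % 10 == 4.
with Session(engines[4]) as check:
    rows_on_shard_4 = check.query(BuddyModel).count()
with Session(engines[5]) as check:
    rows_on_shard_5 = check.query(BuddyModel).count()
print(rows_on_shard_4, rows_on_shard_5)
```

The mapped class is declared once and never bound to an engine, so no mapper is configured twice regardless of how many sessions are created.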
I have found an answer to my question.
To build multiple databases keyed on the user ID (an integer), just use a session per engine.
Before explaining this, I want to describe the database architecture in more detail.
For example, if user ID 114 connects to the server, the server determines where to retrieve that user's information with something like this:
user_id % 10  # 114 % 10 == 4, so DB4
Architecture
DATABASES
- DB0 <-- save all user data whose ID ends with 0
- DB1 <-- save all user data whose ID ends with 1
.
.
.
- DB9 <-- save all user data whose ID ends with 9
Here is the answer.
First, do not use the bind parameter; simply leave it empty.
Base = declarative_base()
Declare the model:
class BuddyModel(Base):
    __tablename__ = 'test_x'
    id = Column(Integer(), primary_key=True, autoincrement=True)
    value = Column(String(50), nullable=False)
When you want to do CRUD, make a session:
engine = get_engine_by_user_id(user_id)
session = sessionmaker(bind=engine)()
buddy = BuddyModel()
buddy.value = 'This is Sparta!! %d' % user_id
session.add(buddy)
session.commit()
The engine should be the one that matches the user ID.
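The approach above can be sketched end to end as follows, assuming SQLAlchemy 1.4 or later. In-memory SQLite engines replace the MySQL databases, and get_session() is a hypothetical helper that wraps the get_engine_by_user_id() idea by keeping one sessionmaker per shard.

```python
from sqlalchemy import Column, Integer, String, create_engine
from sqlalchemy.orm import declarative_base, sessionmaker

NUM_SHARDS = 10
Base = declarative_base()  # no bind: the model is not tied to any engine


class BuddyModel(Base):
    __tablename__ = "test_x"
    id = Column(Integer, primary_key=True, autoincrement=True)
    value = Column(String(50), nullable=False)


# In-memory SQLite shards (assumption for the demo).
engines = [create_engine("sqlite://") for _ in range(NUM_SHARDS)]
for engine in engines:
    Base.metadata.create_all(engine)

# One session factory per shard, built once and reused.
session_factories = [sessionmaker(bind=engine) for engine in engines]


def get_session(user_id):
    """Hypothetical helper: open a session on the shard for this user ID."""
    return session_factories[user_id % NUM_SHARDS]()


for user_id in range(100):
    session = get_session(user_id)
    session.add(BuddyModel(value="user_id: %d" % user_id))
    session.commit()
    session.close()

# User 114 lives on shard 4, which received IDs 4, 14, ..., 94.
session = get_session(114)
shard4_count = session.query(BuddyModel).count()
session.close()
print(shard4_count)
```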
I've just run across a fairly vexing problem, and after testing I have found that NONE of the available answers are sufficient.
I have seen various suggestions but none seem to be able to return the last inserted value for an auto_increment field in MySQL.
I have seen examples that mention the use of session.flush() to add the record and then retrieve the id. However that always seems to return 0.
I have also seen examples that mention the use of session.refresh() but that raises the following error: InvalidRequestError: Could not refresh instance ''
What I'm trying to do seems insanely simple but I can't seem to figure out the secret.
I'm using the declarative approach.
So, my code looks something like this:
class Foo(Base):
    __tablename__ = 'tblfoo'
    __table_args__ = {'mysql_engine':'InnoDB'}
    ModelID = Column(INTEGER(unsigned=True), default=0, primary_key=True, autoincrement=True)
    ModelName = Column(Unicode(255), nullable=True, index=True)
    ModelMemo = Column(Unicode(255), nullable=True)
f = Foo(ModelName='Bar', ModelMemo='Foo')
session.add(f)
session.flush()
At this point, the object f has been pushed to the DB, and has been automatically assigned a unique primary key id. However, I can't seem to find a way to obtain the value to use in some additional operations. I would like to do the following:
my_new_id = f.ModelID
I know I could simply execute another query to lookup the ModelID based on other parameters but I would prefer not to if at all possible.
I would much appreciate any insight into a solution to this problem.
Thanks for the help in advance.
The problem is that you are setting a default for the auto-increment column. So when the INSERT query runs, the server log is:
2011-12-21 13:44:26,561 INFO sqlalchemy.engine.base.Engine.0x...1150 INSERT INTO tblfoo (`ModelID`, `ModelName`, `ModelMemo`) VALUES (%s, %s, %s)
2011-12-21 13:44:26,561 INFO sqlalchemy.engine.base.Engine.0x...1150 (0, 'Bar', 'Foo')
ID : 0
So the output is 0, the default value, which is passed because you set a default for the autoincrement column.
If I run the same code without the default, it gives the correct output.
Please try this code:
from sqlalchemy import create_engine
engine = create_engine('mysql://test:test@localhost/test1', echo=True)

from sqlalchemy.ext.declarative import declarative_base
Base = declarative_base()

from sqlalchemy.orm import sessionmaker
Session = sessionmaker(bind=engine)
session = Session()

from sqlalchemy import Column, Integer, Unicode

class Foo(Base):
    __tablename__ = 'tblfoo'
    __table_args__ = {'mysql_engine':'InnoDB'}
    # no default here, so the database assigns the auto-increment value
    ModelID = Column(Integer, primary_key=True, autoincrement=True)
    ModelName = Column(Unicode(255), nullable=True, index=True)
    ModelMemo = Column(Unicode(255), nullable=True)

Base.metadata.create_all(engine)
f = Foo(ModelName='Bar', ModelMemo='Foo')
session.add(f)
session.flush()
print("ID :", f.ModelID)
Try using session.commit() instead of session.flush(). You can then use f.ModelID.
Not sure why the flagged answer worked for you. In my case, that does not actually insert the row into the table; I need to call commit() at the end.
So the last few lines of code are:
f = Foo(ModelName='Bar', ModelMemo='Foo')
session.add(f)
session.flush()
print("ID:", f.ModelID)
session.commit()
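The difference between flush() and commit() discussed in these answers can be sketched as follows, assuming SQLAlchemy 1.4 or later and an in-memory SQLite database instead of MySQL. The key is populated by flush(), but the row only becomes permanent on commit(); rolling back discards it.

```python
from sqlalchemy import Column, Integer, Unicode, create_engine
from sqlalchemy.orm import declarative_base, sessionmaker

Base = declarative_base()


class Foo(Base):
    __tablename__ = "tblfoo"
    # No default= here, so the database generates the key.
    ModelID = Column(Integer, primary_key=True, autoincrement=True)
    ModelName = Column(Unicode(255), nullable=True, index=True)
    ModelMemo = Column(Unicode(255), nullable=True)


engine = create_engine("sqlite://")  # in-memory database, assumption for the demo
Base.metadata.create_all(engine)
session = sessionmaker(bind=engine)()

f = Foo(ModelName="Bar", ModelMemo="Foo")
session.add(f)
id_before_flush = f.ModelID   # nothing has been sent to the database yet

session.flush()               # INSERT is emitted inside the open transaction
id_after_flush = f.ModelID    # the generated key is now on the object

session.rollback()            # the transaction is discarded, row included
rows_after_rollback = session.query(Foo).count()

print(id_before_flush, id_after_flush, rows_after_rollback)
```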
I've got a case where, most of the time, the relationships between objects were such that pre-configuring an eager (joined) load on the relationship made sense. However, now I've got a situation where I really don't want the eager load to be done.
Should I be removing the joined load from the relationship and changing all relevant queries to join at the query location (ick), or is there some way to suppress an eager load in a query once it is set up?
Below is an example where eager loading has been set up on the User->Address relationship. Can the query at the end of the program be configured to NOT eager load?
import sqlalchemy as sa
from sqlalchemy.ext.declarative import declarative_base
import sqlalchemy.orm as orm

## Set up SQLAlchemy for declarative use with Sqlite...
engine = sa.create_engine("sqlite://", echo = True)
DeclarativeBase = declarative_base()
Session = orm.sessionmaker(bind = engine)

class User(DeclarativeBase):
    __tablename__ = "users"
    id = sa.Column(sa.Integer, primary_key = True, autoincrement = True)
    name = sa.Column(sa.String, unique = True)
    addresses = orm.relationship("Address",
                                 lazy = "joined",  # EAGER LOAD CONFIG IS HERE
                                 )

    def __init__(self, Name):
        self.name = Name

class Address(DeclarativeBase):
    __tablename__ = "addresses"
    id = sa.Column(sa.Integer, primary_key = True, autoincrement = True)
    address = sa.Column(sa.String, unique = True)
    FK_user = sa.Column(sa.Integer, sa.ForeignKey("users.id"))

    def __init__(self, Email):
        self.address = Email

## Generate data tables...
DeclarativeBase.metadata.create_all(engine)

## Add some data...
joe = User("Joe")
joe.addresses = [Address("joe@example.com"),
                 Address("joeyjojojs@example.net")]
s1 = Session()
s1.add(joe)
s1.commit()

## Access the data for the demo...
s2 = Session()
# How to suppress the eager load (auto-join) in the query below?
joe = s2.query(User).filter_by(name = "Joe").one()  # <-- HERE?
for addr in joe.addresses:
    print(addr.address)
As far as I remember, you can override the eagerness of properties on a query-by-query basis. Will this work?
from sqlalchemy.orm import lazyload

joe = (s2.query(User)
       .options(lazyload(User.addresses))
       .filter_by(name = "Joe").one())
for addr in joe.addresses:
    print(addr.address)
See the docs.
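One way to check that the override takes effect is to record the SQL that each query emits. The sketch below assumes SQLAlchemy 1.4 or later and an in-memory SQLite database. The record_sql listener and the statements list are illustrative helpers, and the option is given as an attribute (lazyload(User.addresses)) because string names are no longer accepted in SQLAlchemy 2.0.

```python
from sqlalchemy import Column, ForeignKey, Integer, String, create_engine, event
from sqlalchemy.orm import Session, declarative_base, lazyload, relationship

Base = declarative_base()


class User(Base):
    __tablename__ = "users"
    id = Column(Integer, primary_key=True)
    name = Column(String, unique=True)
    addresses = relationship("Address", lazy="joined")  # eager by default


class Address(Base):
    __tablename__ = "addresses"
    id = Column(Integer, primary_key=True)
    address = Column(String, unique=True)
    user_id = Column(Integer, ForeignKey("users.id"))


engine = create_engine("sqlite://")
Base.metadata.create_all(engine)

statements = []  # illustrative log of every SQL string sent to the database


@event.listens_for(engine, "before_cursor_execute")
def record_sql(conn, cursor, statement, parameters, context, executemany):
    statements.append(statement)


with Session(engine) as session:
    session.add(User(name="Joe", addresses=[Address(address="joe@example.com")]))
    session.commit()

# Default behaviour: the relationship is loaded with a JOIN.
with Session(engine) as session:
    statements.clear()
    session.query(User).filter_by(name="Joe").one()
    eager_sql = statements[-1]

# Per-query override: no JOIN; addresses load on first access instead.
with Session(engine) as session:
    statements.clear()
    joe = (session.query(User)
           .options(lazyload(User.addresses))
           .filter_by(name="Joe").one())
    lazy_sql = statements[-1]
    addresses = [a.address for a in joe.addresses]

print("JOIN" in eager_sql, "JOIN" in lazy_sql, addresses)
```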
You can use Query.options(raiseload('*')) or Query.enable_eagerloads(False).
Query.enable_eagerloads(False) will disable all eager loading on the query. That is, even if you put a joinedload() or something, it won't be executed.
Query.options(raiseload('*')) will install a raiseload loader on every column, making sure they're not lazily loaded: an exception is raised instead. Note that this mode is fine for development and testing environments, but may be destructive in production. Make it optional like this:
Query.options(raiseload('*') if development else defaultload([]))
Also note that raiseload('*') only works for top-level relationships. It does not propagate to joined entities! If you request a relationship, you have to specify it twice:
session.query(User).options(
    load_only('id'),
    joinedload(User.addresses).options(
        load_only('id'),
        raiseload('*')
    ),
    raiseload('*')
)
Also, raiseload('*') only works for relationships, not columns :)
For columns, use defer(..., raiseload=True)
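A small sketch of the raiseload('*') behaviour, assuming SQLAlchemy 1.4 or later, an in-memory SQLite database, and a relationship left at its default lazy loading. Columns still load normally, while touching the relationship raises instead of emitting a hidden query.

```python
from sqlalchemy import Column, ForeignKey, Integer, String, create_engine
from sqlalchemy.exc import InvalidRequestError
from sqlalchemy.orm import Session, declarative_base, raiseload, relationship

Base = declarative_base()


class User(Base):
    __tablename__ = "users"
    id = Column(Integer, primary_key=True)
    name = Column(String)
    addresses = relationship("Address")  # default lazy="select"


class Address(Base):
    __tablename__ = "addresses"
    id = Column(Integer, primary_key=True)
    address = Column(String)
    user_id = Column(Integer, ForeignKey("users.id"))


engine = create_engine("sqlite://")
Base.metadata.create_all(engine)

with Session(engine) as session:
    session.add(User(name="Joe", addresses=[Address(address="joe@example.com")]))
    session.commit()

with Session(engine) as session:
    joe = (session.query(User)
           .options(raiseload("*"))
           .filter_by(name="Joe").one())
    loaded_name = joe.name  # plain columns are unaffected
    try:
        joe.addresses       # would normally trigger a lazy SELECT
        lazy_load_blocked = False
    except InvalidRequestError:
        lazy_load_blocked = True

print(loaded_name, lazy_load_blocked)
```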