How to disable SQLAlchemy caching? - python

I have a caching problem when I use sqlalchemy.
I use sqlalchemy to insert data into a MySQL database. Then, I have another application process this data, and update it directly.
But sqlalchemy always returns the old data rather than the updated data. I think sqlalchemy cached my request ... so ... how should I disable it?

The usual cause for people thinking there's a "cache" at play, besides the usual SQLAlchemy identity map which is local to a transaction, is that they are observing the effects of transaction isolation. SQLAlchemy's session works by default in a transactional mode, meaning it waits until session.commit() is called in order to persist data to the database. During this time, other transactions in progress elsewhere will not see this data.
However, due to the isolated nature of transactions, there's an extra twist. Those other transactions in progress will not only not see your transaction's data until it is committed, they also can't see it in some cases until they are committed or rolled back also (which is the same effect your close() is having here). A transaction with an average degree of isolation will hold onto the state that it has loaded thus far, and keep giving you that same state local to the transaction even though the real data has changed - this is called repeatable reads in transaction isolation parlance.
http://en.wikipedia.org/wiki/Isolation_%28database_systems%29

This issue has been really frustrating for me, but I have finally figured it out.
I have a Flask/SQLAlchemy Application running alongside an older PHP site. The PHP site would write to the database and SQLAlchemy would not be aware of any changes.
I tried the sessionmaker setting autoflush=True unsuccessfully
I tried db_session.flush(), db_session.expire_all(), and db_session.commit() before querying and NONE worked. Still showed stale data.
Finally I came across this section of the SQLAlchemy docs: http://docs.sqlalchemy.org/en/latest/dialects/postgresql.html#transaction-isolation-level
Setting the isolation_level worked great. Now my Flask app is "talking" to the PHP app. Here's the code:
engine = create_engine(
"postgresql+pg8000://scott:tiger#localhost/test",
isolation_level="READ UNCOMMITTED"
)
When the SQLAlchemy engine is started with the "READ UNCOMMITED" isolation_level it will perform "dirty reads" which means it will read uncommited changes directly from the database.
Hope this helps
Here is a possible solution courtesy of AaronD in the comments
from flask.ext.sqlalchemy import SQLAlchemy
class UnlockedAlchemy(SQLAlchemy):
def apply_driver_hacks(self, app, info, options):
if "isolation_level" not in options:
options["isolation_level"] = "READ COMMITTED"
return super(UnlockedAlchemy, self).apply_driver_hacks(app, info, options)

Additionally to zzzeek excellent answer,
I had a similar issue. I solved the problem by using short living sessions.
with closing(new_session()) as sess:
# do your stuff
I used a fresh session per task, task group or request (in case of web app). That solved the "caching" problem for me.
This material was very useful for me:
When do I construct a Session, when do I commit it, and when do I close it

This was happening in my Flask application, and my solution was to expire all objects in the session after every request.
from flask.signals import request_finished
def expire_session(sender, response, **extra):
app.db.session.expire_all()
request_finished.connect(expire_session, flask_app)
Worked like a charm.

I have tried session.commit(), session.flush() none worked for me.
After going through sqlalchemy source code, I found the solution to disable caching.
Setting query_cache_size=0 in create_engine worked.
create_engine(connection_string, convert_unicode=True, echo=True, query_cache_size=0)

First, there is no cache for SQLAlchemy.
Based on your method to fetch data from DB, you should do some test after database is updated by others, see whether you can get new data.
(1) use connection:
connection = engine.connect()
result = connection.execute("select username from users")
for row in result:
print "username:", row['username']
connection.close()
(2) use Engine ...
(3) use MegaData...
please folowing the step in : http://docs.sqlalchemy.org/en/latest/core/connections.html
Another possible reason is your MySQL DB is not updated permanently. Restart MySQL service and have a check.

As i know SQLAlchemy does not store caches, so you need to looking at logging output.

Related

How do you create a table with Sqlalchemy within a transaction in Postgres?

I'm using Sqlalchemy in a multitenant Flask application and need to create tables on the fly when a new tenant is added. I've been using Table.create to create individual tables within a new Postgres schema (along with search_path modifications) and this works quite well.
The limitation I've found is that the Table.create method blocks if there is anything pending in the current transaction. I have to commit the transaction right before the .create call or it will block. It doesn't appear to be blocked in Sqlalchemy because you can't Ctrl-C it. You have to kill the process. So, I'm assuming it's something further down in Postgres.
I've read in other answers that CREATE TABLE is transactional and can be rolled back, so I'm presuming this should be working. I've tried starting a new transaction with the current engine and using that for the table create (vs. the current Flask one) but that hasn't helped either.
Does anybody know how to get this to work without an early commit (and risking partial dangling data)?
This is Python 2.7, Postgres 9.1 and Sqlalchemy 0.8.0b2.
(Copy from comment)
Assuming sess is the session, you can do sess.execute(CreateTable(tenantX_tableY)) instead.
EDIT: CreateTable is only one of the things being done when calling table.create(). Use table.create(sess.connection()) instead.

SQLAlchemy/Pyramid DBSession refresh issue

Here's my scenario:
First view renders form, data goes to secend view, where i store it in DB (MySQL) and redirects to third view which shows what was written to db:
Stoing to db:
DBSession.add(object)
transaction.commit()
DB Session:
DBSession = scoped_session(sessionmaker(expire_on_commit=False,
autocommit=False,
extension=ZopeTransactionExtension()))
After that when I refresh my page several time sometimes I can see DB change, sometimes not, one time old data, second time new and so on...
When I restart server (locally, pserve) DB data is up-to-date.
Maybe it's a matter of creating session?
Check MySQL's transaction isolation level.
The default for InnoDB is REPEATABLE READ: "All consistent reads within the same transaction read the snapshot established by the first read."
You can specify the isolation level in the call to create_engine. See the SQLAlchemy docs.
I suggest you try the READ COMMITTED isolation level and see if that fixes your problem.
It's not clear exactly what your transaction object is or how it connects to the SQLAlchemy database session. I couldn't see anything about transactions in the Pyramid docs and I don't see anything in your code that links your transaction object to your SQLAlchemy session so maybe there is some configuration missing. What example are you basing this code on?
Also: the sessionmaker call is normally done at file score to create a single session factory, which is then used repeatedly to create session objects from the same source. "the sessionmaker() function is normally used to create a top level Session configuration which can then be used throughout an application without the need to repeat the configurational arguments."
It may be the case that since you are creating multiple session factories that there is some data that is supposed to be shared across sessions but is actually not shared because it is created once per factory. Try just calling sessionmaker once and see if that makes a difference.
I believe that your issue is likely to be a persistent session. By default, Pyramids expires all objects in the session after a commit -- this means that SQLA will fetch them from the database the next time you want them, and they will be fresh.
You have overridden this default by indicating "expire_on_commit=False" -- so, make sure that after committing a change you call session.expire_all() if you intend for that session object to grab fresh data on subsequent requests. (The session object is the same for multiple requests in Pyramid, but you aren't guaranteed to get the same thread-scoped session) I recommend not setting expire on commit to false, or using a non-global session: see http://docs.pylonsproject.org/projects/pyramid_cookbook/en/latest/database/sqlalchemy.html#using-a-non-global-session
Alternatively, you could make sure you are expiring objects when necessary, knowing that unexpired objects will stay in memory the way they are and will not be refreshed, and may differ from the same object in a different thread-scoped session.
The problem is that you're setting expire_on_commit=False. If you remove that, it should work. You can read more about what it does on http://docs.sqlalchemy.org/en/rel_0_8/orm/session.html#sqlalchemy.orm.session.Session.commit

How to avoid caching in sqlalchemy?

I have a problem with SQL Alchemy - my app works as a constantly working python application.
I have function like this:
def myFunction(self, param1):
s = select([statsModel.c.STA_ID, statsModel.c.STA_DATE)])\
.select_from(statsModel)
statsResult = self.connection.execute(s).fetchall()
return {'result': statsResult, 'calculation': param1}
I think this is clear example - one result set is fetched from database, second is just passed as argument.
The problem is that when I change data in my database, this function still returns data like nothing was changed. When I change data in input parameter, returned parameter "calculation" has proper value.
When I restart the app server, situation comes back to normal - new data are fetched from MySQL.
I know that there were several questions about SQLAlchemy caching like:
How to disable caching correctly in Sqlalchemy orm session?
How to disable SQLAlchemy caching?
but how other can I call this situation? It seems SQLAlchemy keeps the data fetched before and does not perform new queries until application restart. How can I avoid such behavior?
Calling session.expire_all() will evict all database-loaded data from the session. Any access of object attributes subsequent emits a new SELECT statement and gets new data back. Please see http://docs.sqlalchemy.org/en/latest/orm/session_state_management.html#refreshing-expiring for background.
If you still see so-called "caching" after calling expire_all(), then you need to close out transactions as described in my answer linked above.
A few possibilities.
You are reusing your session improperly or at improper time. Best practice is to throw away your session after you commit, and get a new one at the last possible moment before you use it. The behavior that appears to be caching may in fact be due to a session lifetime being very long in your application.
Objects that survive longer than the session are not being merged into a subsequent session. The "metadata" may not be able to update their state if you do not merge them back in. This is more a concern for the ORM API of SQLAlchemy, which you do not appear so far to be using.
Your changes are not committed. You say they are so we'll assume this is not it, but if none of the other avenues explain it you may want to look again.
One general debugging tip: if you want to know exactly what SQLAlchemy is doing in the database, pass echo=True to the create_engine function. The engine will print all queries it runs.
Also check out this suggestion I made to someone else, who was using ORM and had transactionality problems, which resolved their issue without ever pinpointing it. Maybe it will help you.
You need to change transaction isolation level to READ_COMMITTED
http://docs.sqlalchemy.org/en/rel_0_9/dialects/mysql.html#mysql-isolation-level

SQLAlchemy trying to access a table in the wrong database

We use CherryPy and SQLAlchemy to build our web app and everything was fine until we tested with 2 concurrent users - then things started to go wrong! Not very good for a web app so I'd be very appreciative if anyone could shine some light on this.
TL;DR
We're getting the following error about 10% of the time when two users are using our site (but accessing different databases) at the same time:
ProgrammingError: (ProgrammingError) (1146, "Table 'test_one.other_child_entity' doesn't exist")
This table is not present in that database so the error makes sense but the problem is that SQLAlchemy shouldn't be looking for the table in that database.
I have reproduced the error in an example here https://gist.github.com/1729817
Explanation
We're developing an application that is very dynamic and is based on the entity_name pattern found at http://www.sqlalchemy.org/trac/wiki/UsageRecipes/EntityName
We've since grown that idea so that it stores entities in different databases depending on what user you're logged in as. This is because each user in the system has their own database and can create their own entities (tables). To do this we extend a base entity for each database and then extend that new entity for each additional entity they create in their database.
When the app starts we create a dictionary containing the engine, metadata, classes and tables of all these databases and reflect all of the metadata. When a user logs in they get access to one.
When two users are accessing the site at the same time something is going wrong and SQLAlchemy ends up looking for tables in the wrong database. I guess this is to do with threading but as far as I can see we are following all the rules when it comes to sessions (CP and SQLA), engines, metadata, tables and mappers.
If anyone could give my example (https://gist.github.com/1729817) a quick glance over and point out any glaring problems that would be great.
Update
I can fix the problem by changing the code to use my own custom session router like so:
# Thank you zzzeek (http://techspot.zzzeek.org/2012/01/11/django-style-database-routers-in-sqlalchemy/)
class RoutingSession(Session):
def get_bind(self, mapper = None, clause = None):
return databases[cherrypy.session.get('database')]['engine']
And then:
Session = scoped_session(sessionmaker(autoflush = True, autocommit = False, class_ = RoutingSession))
So just hard-coding it to return the engine that's linked to the database that's set in the session. This is great news but now I want to know why my original code didn't work. Either I'm doing it wrong or the following code is not completely safe:
# Before each request (but after the session tool)
def before_request_body():
if cherrypy.session.get('logged_in', None) is True:
# Configure the DB session for this thread to point to the correct DB
Session.configure(bind = databases[cherrypy.session.get('database')]['engine'])
I guess that the binding that's happening here was being overwritten by the user in the other thread which is strange because I thought scoped_session was all about thread-safety?

Execute sql query with Elixir

I'm using Elixir in a project that connects to a postgres database. I want to run the following query on the database I'm connected to, but I'm not sure how to do it as I'm rather new to Elixir and SQLAlchemy. Anyone know how?
VACUUM FULL ANALYZE table
Update
The error is: "UnboundExecutionError: Could not locate a bind configured on SQL expression or this Session". And the same result with session.close() issued before. I did try doing metadata.bind.execute() and that worked for a simple select. But for the VACUUM it said - "InternalError: (InternalError) VACUUM cannot run inside a transaction block", so now I'm trying to figure out how to turn that off.
Update 2
I can get the query to execute, but I'm still getting the same error - even when I create a new session and close the previous one.
from sqlalchemy import create_engine
from sqlalchemy.orm import sessionmaker
# ... insert stuff
old_session.commit()
old_session.close()
new_sess = sessionmaker(autocommit=True)
new_sess.configure(bind=create_engine('postgres://user:pw#host/db', echo=True))
sess = new_sess()
sess.execute('VACUUM FULL ANALYZE table')
sess.close()
and the output I get is
2009-12-10 10:00:16,769 INFO sqlalchemy.engine.base.Engine.0x...05ac VACUUM FULL ANALYZE table
2009-12-10 10:00:16,770 INFO sqlalchemy.engine.base.Engine.0x...05ac {}
2009-12-10 10:00:16,770 INFO sqlalchemy.engine.base.Engine.0x...05ac ROLLBACK
finishing failed run, (InternalError) VACUUM cannot run inside a transaction block
'VACUUM FULL ANALYZE table' {}
Update 3
Thanks to everyone who responded. I wasn't able to find the solution I wanted, but I think I'm just going to go with the one described here PostgreSQL - how to run VACUUM from code outside transaction block?. It's not ideal, but it works.
Dammit. I knew the answer was going to be right under my nose. Assuming you setup your connection like I did.
metadata.bind = 'postgres://user:pw#host/db'
The solution to this was as simple as
conn = metadata.bind.engine.connect()
old_lvl = conn.connection.isolation_level
conn.connection.set_isolation_level(0)
conn.execute('vacuum analyze table')
conn.connection.set_isolation_level(old_lvl)
This is similar to what was suggested here PostgreSQL - how to run VACUUM from code outside transaction block?
because underneath it all, sqlalchemy uses psycopg to make the connection to postgres. Connection.connection is a proxy to the psycopg connection. Once I realized this, this problem came back to mind and I decided to take another whack at it.
Hopefully this helps someone.
You need to bind the session to an engine
session.bind = metadata.bind
session.execute('YOUR SQL STATEMENT')
UnboundExecutionError says that your session is not bound to an engine and there is no way to discover engine from query passed to execute(). You can either use engine.execute() directly or pass additional mapper parameter (either mapper or mapped model corresponding to table used in query) to session.execute() to help SQLAlchemy discover proper engine.
The InternalError says that you are trying to execute this statement inside explicitly (with BEGIN statement) started transaction. Have you issued some statements before it without calling commit()? If so, just call commit() or rollback() method to close transaction before doing VACUUM. Also note, that there are several parameter to sessionmaker() that tell SQLAlchemy when transaction should be started.
If you have access to SQLAlchemy session, you can execute arbitrary SQL statements via its execute method:
session.execute("VACUUM FULL ANALYZE table")
(Depending on the Postgres version) you most likely do not want to run "VACUUM FULL".

Categories

Resources