SQLAlchemy KeyError when using connection pool - python

I created a repository class for Person with the below method in it:
def find_by_order_id(self, order_id: str) -> List[Person]:
    results = []
    client_table = Table(TBL_CLIENT, self._metadata, autoload=True,
                         autoload_with=self._connection)
    order_table = Table(TBL_ORDER, self._metadata, autoload=True,
                        autoload_with=self._connection)
    query = select([client_table.c.first_name, client_table.c.last_name,
                    client_table.c.date_of_birth]).select_from(
        order_table.join(client_table, client_table.c.order_id == order_table.c.order_id)
    ).where(order_table.c.order_id == order_id).distinct()
    for row in self._connection.execute(query).fetchall():
        results.append(dict(row))
    return results
For some odd reason that I can't explain, I sometimes get a KeyError from SQLAlchemy on a key that actually exists when I debug the code:
File "/home/tghasemi/miniconda3/envs/myproj/lib/python3.6/site-packages/sqlalchemy/util/_collections.py", line 210, in __getattr__
return self._data[key]
KeyError: 'order_id'
I noticed this only happens when I pull the connection from a connection pool while multithreading (each thread only uses one connection; I know connections are not thread-safe):
if use_pooling:
    self._engine = create_engine(connection_string, pool_size=db_pool_size,
                                 pool_pre_ping=db_pool_pre_ping, echo=db_echo)
else:
    self._engine = create_engine(connection_string, echo=db_echo)
Considering the fact that the key exists (I even checked it when the exception happens, with a breakpoint where it is raised), I suspect that loading the table has not completed yet when the query is being constructed.
Does anyone have any idea why something like this can happen? I gave up!

Well, I think I managed to fix the problem.
Something I did not mention in my question (in fact I did not think of it as a possibility) was that both the Engine and the MetaData object come from a Singleton class I created for my Database:
class Database(metaclass=SingletonMetaClass):
    def __init__(self, config: Config, use_pooling: bool = True, logger: Logger = None):
        # Initializing the stuff here ...
        ...

    @property
    def metadata(self):  # used to return self._metadata here
        return MetaData(self._engine, reflect=False, schema=self._db_schema)

    @property
    def engine(self):
        return self._engine
SQLAlchemy's official documentation says that the MetaData object is thread-safe for read operations, but for some reason in my case (which is a read operation) it was causing this issue. After I stopped sharing this object among my threads the issue went away (not 100% sure it's literally gone, but it hasn't happened since).
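For anyone hitting the same symptom, here is a minimal sketch of one way to keep reflected metadata private per thread using threading.local; the class name and the autoload style are assumptions modeled on the code above, not the project's actual implementation:

import threading

from sqlalchemy import MetaData, Table

class PerThreadMetadata:
    """Gives every thread its own MetaData so table reflection never races."""

    def __init__(self, engine, schema=None):
        self._engine = engine
        self._schema = schema
        self._local = threading.local()

    @property
    def metadata(self):
        # Lazily create a MetaData that only the calling thread will ever touch.
        if not hasattr(self._local, "metadata"):
            self._local.metadata = MetaData(schema=self._schema)
        return self._local.metadata

    def reflect_table(self, name):
        # Reflect the table into this thread's private MetaData.
        return Table(name, self.metadata, autoload=True, autoload_with=self._engine)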

Related

With Peewee, how to check if an SQLite file has been created vs filled, without creating a table? If I import my models, it seems the table is created

First I'd like to check if the file exists, and I've used os.path for this:
import os
from os.path import exists

def check_db_exist():
    try:
        file_exists = exists('games.db')
        if file_exists:
            file_size = os.path.getsize('games.db')
            if file_size > 3000:
                return True, file_size
            else:
                return False, 'too small'
        else:
            return False, 'does not exist'
    except:
        return False, 'error'
I have a separate file for my models and for creating the database. My concern is that if I import the class for the database, it instantiates the SQL file.
Moreover, pywebview wipes all variables when displaying my HTML.
If I were to run this process as I load my page, then I can't access the variable holding the true/false "sqlite exists" result.
from peewee import SqliteDatabase, Model, CharField, IntegerField

db = SqliteDatabase('games.db')

class Game(Model):
    game = CharField()
    exe = CharField()
    path = CharField()
    longpath = CharField()
    i_d = IntegerField()

    class Meta:
        database = db
This creates the table, so checking if the file exists is useless.
If I uncomment the first line in the file below, the database gets created; otherwise all of my db.* calls are unusable. I must be missing a really obvious function to solve my problems.
# db = SqliteDatabase('games.db')

def add_game(game, exe, path, longpath, i_d):
    try:
        Game.create(game=game, exe=exe, path=path, longpath=longpath, i_d=i_d)
    except:
        pass

def loop_insert(lib):
    db.connect()
    for i in lib[0]:
        add_game(i.name, i.exe, i.path, i.longpath, i.id)
    db.close()

def initial_retrieve():
    db.connect()
    vals = ''
    for games in Game.select():
        val = js.Import.javascript(str(games.game), str(games.exe), str(games.path), games.i_d)
        vals = vals + val
    storage = vals
    db.close()
    return storage
Should I just import the file at a different point, wherever I feel comfortable? I haven't seen that done often, so I didn't want to be improper in formatting.
Edit: Maybe more like this?
def db():
    db = SqliteDatabase('games.db')
    return db

class Game(Model):
    game = CharField()
    exe = CharField()
    path = CharField()
file 2:
from sqlmodel import db, Game

def add_game(game, exe, path, longpath, i_d):
    try:
        Game.create(game=game, exe=exe, path=path, longpath=longpath, i_d=i_d)
    except:
        pass

def loop_insert(lib):
    db.connect()
    for i in lib[0]:
        add_game(i.name, i.exe, i.path, i.longpath, i.id)
    db.close()
I am not sure if this answers your question, since it seems to involve multiple processes and/or processors, but in order to check for the existence of a database file, I have used the following:
import os

DATABASE = 'dbfile.db'

if os.path.isfile(DATABASE) is False:
    # Create the database file here
    pass
else:
    # connect to database here
    db.connect()
I would suggest using sqlite's user_version pragma:
db = SqliteDatabase('/path/to/db.db')

version = db.pragma('user_version')
if not version:  # Assume does not exist/newly-created.
    # do whatever.
    db.pragma('user_version', 1)  # Set user version.
From reddit:
me: To the original challenge, there's a reason I want to know whether the file exists. Maybe it's flawed at the premises; I'll explain and you can fill in from there.
This script will run on multiple machines I don't have access to. At the entry point of a first-time use case, I will be porting data from a remote location; if it's the first time the script runs on that machine, it goes down a different workflow than a repeated opening.
Akin to grabbing all PC programs vs appending and reading from the last session. How would you suggest quickly understanding if that process has started and finished from a previous session?
Checking if the sqlite file is made made the most intuitive sense, and then adjusting by byte size. lmk
them:
This is a good question!
How would you suggest quickly understanding if that process
has started and finished from a previous session.
If the first thing your program does on a new system is download some kind of fixture data, then the way I would approach it is to load the DB file as normal, have Peewee ensure the tables exist, and then do a no-clause SELECT on one of them (either through the model, or directly on the database through the connection if you want.) If it's empty (you get no results) then you know you're on a fresh system and you need to make the remote call. If you get results (you don't need to know what they are) then you know you're not on a fresh system.
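A minimal sketch of that suggestion, assuming the Game model and db handle from the question are importable (the module name models here is hypothetical):

from models import db, Game  # hypothetical module holding the question's Game model

def is_fresh_install():
    db.connect(reuse_if_open=True)
    db.create_tables([Game], safe=True)  # no-op if the table already exists
    fresh = not Game.select().exists()   # empty table -> first run on this machine
    db.close()
    return fresh

If is_fresh_install() returns True, run the remote import; otherwise go down the repeated-opening workflow.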

Python: use the decorator to pass an extra parameter to the function

I have a function that depends on a db connection. This function has a lot of return statements, of this kind:
def do_something_with_data(data: DataClass):
    db = Session()
    if condition1:
        db.do_something1()
        db.close()
        return
    if condition2:
        db.do_something2()
        db.close()
        return
    if condition3:
        db.do_something3()
        db.close()
        return
    ...
After executing the function, I need to run db.close(), but because of the structure of the function this call has to be duplicated before each return, as shown above.
So I made a decorator that passes the created session to the function and closes the session at the end of the execution of the function instead.
def db_depends(function_that_depends_on_db):
    def inner(*args, **kwargs):
        db = Session()
        result = function_that_depends_on_db(db, *args, **kwargs)
        db.close()
        return result
    return inner

@db_depends
def do_something_with_data(db: Session, data: DataClass):
    if condition1:
        db.do_something1()
        return
    if condition2:
        db.do_something2()
        return
    if condition3:
        db.do_something3()
        return
    ...
It all works great, but the fact that the user sees two required arguments in the definition, when there is really only one (data), seems kind of dirty.
Is it possible to do the same thing without misleading people who will read the code, or IDE hints?
Just have the function accept the Session parameter normally:
def do_something_with_data(db: Session, data: DataClass):
    if condition1:
        db.do_something1()
        return
    if condition2:
        db.do_something2()
        return
    if condition3:
        db.do_something3()
        return
This allows the user to specify a Session explicitly, for example to reuse the same Session to do multiple things.
Yes, that doesn't close the Session, because that is the responsibility of the calling code; that's where the Session came from in the first place. After all, if the calling code wants to reuse a Session, then it shouldn't be closed.
If you want a convenience method to open a new, temporary Session for the call, you can easily do that using the existing decorator code:
do_something_in_new_session = db_depends(do_something_with_data)
But if we don't need to apply this logic to multiple functions, then "simple is better than complex" - just write an ordinary wrapper:
def do_something_in_new_session(data: DataClass):
    db = Session()
    result = do_something_with_data(db, data)
    db.close()
    return result
Either way, it would be better to write the closing logic using a with block, assuming your library supports it:
def do_something_in_new_session(data: DataClass):
    with Session() as db:
        return do_something_with_data(db, data)
Among other things, this ensures that .close is called even if an exception is raised in do_something_with_data.
If your DB library doesn't support that (i.e., the Session class isn't defined as a context manager - although that should only be true for very old libraries now), that's easy to fix using contextlib.closing from the standard library:
from contextlib import closing
def do_something_in_new_session(data: DataClass):
    with closing(Session()) as db:
        return do_something_with_data(db, data)
(And of course, if you don't feel the need to make a wrapper like that, you can easily use such a with block directly at the call site.)
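If you do keep the decorator approach, here is a minimal sketch that folds the with block into it, so the session is also closed when the wrapped function raises (again assuming Session works as a context manager):

import functools

def db_depends(function_that_depends_on_db):
    @functools.wraps(function_that_depends_on_db)  # preserve the wrapped function's name and docstring
    def inner(*args, **kwargs):
        with Session() as db:
            return function_that_depends_on_db(db, *args, **kwargs)
    return inner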

Reusing a single session for routing connections with SQLAlchemy between master & read replicas

We needed to route our database requests to either a writer master database or a set of read replicas.
We found a blog post by Mike Bayer suggesting how to do this with SQLAlchemy. We replicated the solution, but it did not work with our existing tests for various reasons.
We went with the approach below instead. It reuses one session rather than creating new ones that stack on top of each other:
class ExplicitRoutingSession(SignallingSession):
    _name = None

    def get_bind(self, mapper=None, clause=None):
        # If reader and writer binds are not configured,
        # connect using the default SQLALCHEMY_DATABASE_URI
        if not self.binds_setup:
            return super().get_bind(mapper, clause)
        return self.load_balance(mapper, clause)

    def load_balance(self, mapper=None, clause=None):
        # Use the explicit name if present
        if self._name and not self._flushing:
            bind = self._name
            self._name = None
            self.app.logger.debug(f"Connecting -> {bind}")
            return get_state(self.app).db.get_engine(self.app, bind=bind)
        # Everything else goes to the writer engine
        else:
            self.app.logger.debug("Connecting -> writer")
            return get_state(self.app).db.get_engine(self.app, bind='writer')

    def using_bind(self, name):
        self._name = name
        return self

    @cached_property
    def binds_setup(self):
        binds = self.app.config['SQLALCHEMY_BINDS'] or {}
        return all(k in binds for k in ['reader', 'writer'])
So far it has worked well for us. We assume we might lose some functionality, such as DB savepoints, by not having stacked sessions... but we'd like to know whether there are stability issues or unforeseen risks with such an approach, beyond losing features.
Notes:
We are also using flask-sqlalchemy.
This is from an open source notification platform and you can browse the code/branch yourself.
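For readers unfamiliar with the pattern, a hypothetical usage sketch (the User model and the exact session wiring are assumptions; only using_bind comes from the class above):

# Route this particular query explicitly to the read replica bind:
users = db.session.using_bind('reader').query(User).all()

# Anything else, including flushes and commits, falls through to the writer via get_bind():
db.session.add(User(name='example'))
db.session.commit()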

Python class variable not updating

I have a class that takes in an ID and tries to update the variable current_account, but when I print out the details of current_account it hasn't updated.
Anyone got any ideas? I'm new to Python, so I might be doing something stupid that I can't see.
class UserData:
    def __init__(self, db_conn=None):
        if None == db_conn:
            raise Exception("DB Connection Required.")
        self.db = db_conn
        self.set_my_account()
        self.set_accounts()
        self.set_current_account()

    def set_current_account(self, account_id=None):
        print account_id
        if None == account_id:
            self.current_account = self.my_account
        else:
            if len(self.accounts) > 0:
                for account in self.accounts:
                    if account['_id'] == account_id:
                        self.current_account = account
                print self.current_account['_id']
            else:
                raise Exception("No accounts available.")
Assume that set_my_account() gets a dictionary of account data and that set_accounts() gets a list of dictionaries of account data.
So when I do the following:
user_data = UserData(db_conn=db_conn)
user_data.set_current_account(account_id=account_id)
Where db_conn is a valid database connection and account_id is a valid account id.
I get the following out of the above two lines.
None
518a310356c02c0756764b4e
512754cfc1f3d16c25c350b7
So the None value is from the declaration of the class and then the next two are from the call to set_current_account(). The first id value is what I'm trying to set. The second id value is what was already set from the class __init__() method.
There were a lot of redundancies and un-Pythonic constructions. I cleaned up the code to help me understand what you were trying to do.
class UserData(object):
    def __init__(self, db_conn):
        self.db = db_conn
        self.set_my_account()
        self.set_accounts()
        self.set_current_account()

    def set_current_account(self, account_id=None):
        print account_id
        if account_id is None:
            self.current_account = self.my_account
        else:
            if not self.accounts:
                raise Exception("No accounts available.")
            for account in self.accounts:
                if account['_id'] == account_id:
                    self.current_account = account
            print self.current_account['_id']

user_data = UserData(db_conn)
user_data.set_current_account(account_id)
You used default arguments (db_conn=None) when a call without an explicit argument is invalid. Yes, you can now call __init__(None) but you could also call __init__('Nalum'); you can't protect against everything.
By moving the "No accounts" exception up, the block fails fast and you save one level of indentation.
The call UserData(db_conn=db_conn) is valid but unnecessarily repetitive.
Unfortunately, I still can't figure out what you are trying to accomplish, and this is perhaps the largest flaw. Variable names are terribly important for helping the reader (which may be the future you) make sense of code. current_account, my_account, account_id and current_account['_id'] so obscure the intention that you should really consider more distinct, informative names.
Figured out what it was.
The data was being changed elsewhere in the code base. It is now working as expected.
Thanks guys for pointing out the Python-centric things I was doing wrong; good to know.

SQLAlchemy Session error

I have a problem with the session in SQLAlchemy. When I add a row to the DB it's OK, but if I want to add another row without closing my app, it doesn't get added.
This is the function in my Model:
def add(self, name):
    self.slot_name = name
    our_slot = self.session_.query(Slot).filter_by(slot_name=str(self.slot_name)).first()
    if our_slot:
        return 0
    else:
        self.session_.add(self)
        self.session_.commit()
        return 1
The problem is with how you handle your session around the commit. After a commit, the session's transaction is closed and a new one is needed for further work. Either commit once after you are done adding, or open a new session after each commit. Also take a look at Session.commit(). You should probably read something about sessions in SQLAlchemy's documentation.
Furthermore, I suggest you do this:
from sqlalchemy.orm.exc import NoResultFound

def add(self, name):
    self.slot_name = name
    try:
        self.session_.query(Slot)\
            .filter_by(slot_name=str(self.slot_name)).one()
        return 0  # a slot with this name already exists
    except NoResultFound:
        self.session_.add(self)
        self.session_.commit()
        return 1
Of course, this only works if you expect exactly one result. It is considered good practice to rely on exceptions and catch them, instead of making up conditions to check first.
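As a minimal sketch of the "new session per unit of work" idea (the sessionmaker factory, the engine name and the Slot constructor usage here are assumptions, not part of the original code):

from sqlalchemy.orm import sessionmaker

Session = sessionmaker(bind=engine)  # engine is assumed to exist elsewhere

def add_slot(name):
    session = Session()
    try:
        if session.query(Slot).filter_by(slot_name=name).first() is not None:
            return 0  # already present
        session.add(Slot(slot_name=name))
        session.commit()
        return 1
    finally:
        session.close()  # always release the connection back to the pool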
