SQLAlchemy best practices: when / how to configure a scoped_session? - python

I am trying to figure out how to use SQLAlchemy scoped sessions the "right way", while keeping the logic of defining a session separate from configuring it and separate from using it. I have been told a number of times that a good approach is to have a global scoped_session factory that I can use everywhere:
"""myapp/db.py
"""
from sqlalchemy.orm import sessionmaker, scoped_session
Session = scoped_session(sessionmaker())
Then when I want to use it:
"""myapp/service/dosomething.py
"""
from myapp.db import Session
def do_something(data):
    """Do something with data
    """
    session = Session()
    bars = session.query(Bar).all()
    for bar in bars:
        bar.data = data
    session.commit()
This seems right, but my problem is that in all the examples I have seen, sessionmaker also sets some parameters of the session, most importantly binding an engine. This makes no sense to me, as the actual DB engine will be created from configuration that is not known at the global scope when the myapp.db module is imported.
What I have looked at doing is to set everything up in my app's "main" (or in a thread's main function), and then just assume that the session is configured in other places (such as when used by do_something() above):
"""myapp/main.py
"""
from sqlalchemy import create_engine
from myapp.db import Session
from myapp.service.dosomething import do_something
def main():
    config = load_config_from_file()
    engine = create_engine(**config['db'])
    Session.configure(bind=engine)

    do_something(['foo', 'bar'])
Does this seem like a correct approach? I have not found any good examples of such a flow; most of the examples I have come across seem either over-simplified or framework-specific.

This is old and I've never accepted any of the answers below, but following @univerio's comment and 3+ years of continued use of SQLAlchemy in various projects, my approach now is to keep doing exactly what I suggested in the OP:
Create a myapp.db module which defines Session = scoped_session(sessionmaker())
Import from myapp.db import Session everywhere it is needed
In my app's main or in the relevant initialization code, do:
def main():
    config = load_config_from_file()
    engine = create_engine(**config['db'])
    Session.configure(bind=engine)

    do_something(['foo', 'bar'])
I've used this pattern successfully in web apps, command line tools and long-running backend processes, and have never had to change it so far. It is simple, reusable and works great, and I'd recommend it to anyone stumbling here because they've asked themselves the same question I did 3 years ago.
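One detail worth adding for web apps and long-running processes (not part of the original pattern above, just the standard scoped_session idiom): each thread or request should call Session.remove() when its unit of work is done, so the thread-local session is closed and a fresh one is used next time. A minimal sketch, assuming the same modules as above:

from myapp.db import Session
from myapp.service.dosomething import do_something

def handle_one_job(data):
    try:
        do_something(data)
    finally:
        # Dispose of the thread-local session so the next job in this
        # thread starts with a clean one.
        Session.remove()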

What you can do is to separate the config out into a separate module:
"""myapp/cfg.py
"""
config = load_config_from_file()
Then you can import this file wherever you need, including in the db module, so you can construct the engine as well as the session:
"""myapp/db.py
"""
from sqlalchemy import create_engine
from sqlalchemy.orm import sessionmaker, scoped_session

from .cfg import config

engine = create_engine(**config['db'])
Session = scoped_session(sessionmaker(bind=engine))
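For completeness, load_config_from_file is not shown in the question; a hypothetical implementation reading a JSON file might look like this (the file name and structure are assumptions):

"""myapp/cfg.py
"""
import json

def load_config_from_file(path='config.json'):
    # Expected to contain e.g. {"db": {"url": "sqlite:///app.db"}},
    # so config['db'] can be passed straight to create_engine(**...).
    with open(path) as f:
        return json.load(f)

config = load_config_from_file()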

Think of it as a singleton.
In your case, with from myapp.db import Session, Session is a global singleton.
Just configure Session once, at the start of your application.
You should have a configuration step that loads the config data from a file or from the environment; once all the configuration is ready, run the real program.
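For example, a minimal sketch of that configuration step, reading the connection string from an environment variable (the DATABASE_URL name and run_the_real_program function are assumptions, not from the answer):

import os

from sqlalchemy import create_engine
from myapp.db import Session

def main():
    # Configure the global Session once, before any code uses it.
    engine = create_engine(os.environ['DATABASE_URL'])
    Session.configure(bind=engine)
    run_the_real_program()  # whatever your application actually does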

Related

How and when to initialise configuration in Python?

I'm getting pretty confused as to how and where to initialise application configuration in Python 3.
I have configuration that consists of application specific config (db connection strings, url endpoints etc.) and logging configuration.
Before my application performs its intended function I want to initialise the application and logging config.
After a few different attempts, I eventually ended up with something like the code below in my main entry module. It has the nice effect of all imports being grouped at the top of the file (https://www.python.org/dev/peps/pep-0008/#imports), but it doesn't feel right since the config modules are being imported for side effects alone which is pretty non-intuitive.
import config.app_config  # sets up the app config
import config.logging_config  # sets up the logging config

...

if __name__ == "__main__":
    ...
config.app_config looks something like the following:
_config = {
    'DB_URL': None
}

def _get_db_url():
    # somehow get the db url
    ...

_config['DB_URL'] = _get_db_url()

def db_url():
    return _config['DB_URL']
and config.logging_config looks like:
import json
import logging.config
import os

if not os.path.isdir('./logs'):
    os.makedirs('./logs')

if os.path.exists('logging_config.json'):
    with open('logging_config.json', 'rt') as f:
        config = json.load(f)
    logging.config.dictConfig(config)
else:
    logging.basicConfig(level=log_level)
What is the common way to set up application configuration in Python? Bear in mind that I will have multiple applications, each using the config.app_config and config.logging_config modules, but with different connection strings, possibly read from a file.
I ended up with a cut down version of the Django approach: https://github.com/django/django/blob/master/django/conf/__init__.py
It seems pretty elegant and has the nice benefit of working regardless of which module imports settings first.
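The Django approach referenced above essentially boils down to a lazily-initialised settings object that only imports the configured settings module on first attribute access, which is why it doesn't matter which module imports it first. A rough sketch of the idea (my own simplified illustration, not the asker's code; the APP_SETTINGS_MODULE variable name is an assumption):

import importlib
import os

class LazySettings:
    def __init__(self):
        self._wrapped = None

    def _setup(self):
        # Import the module named in the environment, e.g. "myapp.settings".
        self._wrapped = importlib.import_module(os.environ['APP_SETTINGS_MODULE'])

    def __getattr__(self, name):
        # Only called for attributes not found on the instance itself,
        # so the settings module is loaded lazily on first access.
        if self._wrapped is None:
            self._setup()
        return getattr(self._wrapped, name)

settings = LazySettings()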

Determine what project id my App Engine code is running on

From within an App Engine app, is there a way to determine the project ID a GAE (App Engine) instance is running on?
I want to access a big query table in the same project that the App Engine instance is running in. I'd rather not hard code it in or include it in another config file if possible.
Edit: forgot to mention that this is from Python
This is the "official" way:
from google.appengine.api import app_identity
GAE_APP_ID = app_identity.get_application_id()
See more here: https://developers.google.com/appengine/docs/python/appidentity/
You can get a lot of info from environment variables:
import os
print os.getenv('APPLICATION_ID')
print os.getenv('CURRENT_VERSION_ID')
print os.environ
I tried the other approaches in 2019 using Python3. So far as I can tell, those approaches are for Python2 (and one for Java).
I was able to accomplish this in Python3 using:
import os
app_id = os.getenv('GAE_APPLICATION')
print(app_id)
project_id = os.getenv('GOOGLE_CLOUD_PROJECT')
print(project_id)
source: https://cloud.google.com/appengine/docs/standard/python3/runtime
I also added an app version, in case you need it too.
import com.google.appengine.api.utils.SystemProperty;
String appId = SystemProperty.applicationId.get();
String appVersion = SystemProperty.applicationVersion.get();

Create pyramid request for testing, so that events are triggered

I would like to test a pyramid view like the following one:
def index(request):
    data = request.some_custom_property.do_something()
    return {'some': data}
some_custom_property is added to the request via such an event handler:
@subscriber(NewRequest)
def prepare_event(event):
    event.request.set_property(
        create_some_custom_property,
        'some_custom_property', reify=True
    )
My problem is: if I create a test request manually, the property is not set up correctly, because no events are triggered. Because the real event handler is more complicated and depends on configuration settings, I don't want to reproduce that code in my test code. I would like to use the pyramid infrastructure as much as possible. I learned from an earlier question how to set up a real pyramid app from an ini file:
from webtest import TestApp
from pyramid.paster import get_app
app = get_app('testing.ini#main')
test_app = TestApp(app)
The test_app works fine, but I can only get back the HTML output (which is the idea of TestApp). What I want to do is execute index in the context of app or test_app, but get back the result of index before it is sent to a renderer.
Any hint how to do that?
First of all, I believe it is a really bad idea to write doctests like this. It requires a lot of initialization work that ends up in your documentation (remember, these are doctests) without actually "documenting" anything, and to me these tests seem to be a job for unit/integration tests. But if you really want to, here's a way to do it:
import myapp
from pyramid.paster import get_appsettings
from webtest import TestApp
app, conf = myapp.init(get_appsettings('settings.ini#appsection'))
rend = conf.testing_add_renderer('template.pt')
test_app = TestApp(app)
resp = test_app.get('/my/view/url')
rend.assert_(key='val')
where myapp.init is a function that does the same work as your application's initialization function called by pserve (like the main function here), except that myapp.init takes one argument, the settings dictionary (instead of main(global_config, **settings)), and returns both app (i.e. conf.make_wsgi_app()) and conf (i.e. the pyramid.config.Configurator instance). rend is a pyramid.testing.DummyTemplateRenderer instance.
P.S. Sorry for my English, I hope you'll be able to understand my answer.
UPD. Forgot to mention that rend has a _received property, which holds the values passed to the renderer, though I would not recommend using it, since it is not part of the public interface.
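For reference, a hypothetical myapp.init along the lines described above might look roughly like this (the included module name is an assumption, not from the answer):

"""myapp/__init__.py
"""
from pyramid.config import Configurator

def init(settings):
    conf = Configurator(settings=settings)
    conf.include('myapp.routes')  # wire up routes, views, subscribers, etc.
    conf.scan()
    return conf.make_wsgi_app(), conf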

Getting config variables outside of the application context

I've extracted several of my sqlalchemy models into a separate, installable package (../lib/site-packages), to use across several applications. So I only need to:
from models_package import MyModel
from any application needing access to these models.
Everything is OK so far, except I cannot find a satisfactory way of getting hold of several application-dependent config variables used by some of the models, which may vary from application to application. So some models need to be aware of certain variables that previously came from the application they lived in.
Neither
current_app.config['XYZ']
nor
config = LocalProxy(lambda: current_app.config['XYZ'])
has worked (both raise "working outside of application context" errors), so I'm stuck right now. Maybe this is poor programming and/or design on my behalf, so how do I clear this up? There must be some way, but I haven't reasoned my way to it yet.
SOLUTION:
As long as you avoid reading these values at module load time (e.g. into a constant holding an API key), both of the above approaches work, and they do. Anything that uses them outside of the model-in-application context will of course still error; methods that return the values you need, called while an application context is active, should be fine.
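In other words, defer the config lookup until the model method is actually called inside an application context. A minimal sketch of the idea (the model and method names are made up for illustration):

from flask import current_app

class MyModel(object):
    def get_xyz(self):
        # Looked up lazily, so this only needs an app context at call time,
        # not when models_package is imported.
        return current_app.config['XYZ']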
If you are using a configuration pattern utilising classes and inheritance as described here, you could simply import your config classes with their respective properties and access them anywhere you want:
class Config(object):
    IMPORT = 'ME'
    DEBUG = False
    TESTING = False
    DATABASE_URI = 'sqlite:///:memory:'

class ProductionConfig(Config):
    DATABASE_URI = 'mysql://user@localhost/foo'

class DevelopmentConfig(Config):
    DEBUG = True

class TestingConfig(Config):
    TESTING = True
Now, in your foo.py:
from config import Config
print(Config.IMPORT) # prints 'ME'
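And the application itself can consume the same classes through Flask's standard config mechanism, for example (app.config.from_object is standard Flask API; the module path is an assumption):

from flask import Flask
from config import ProductionConfig

app = Flask(__name__)
app.config.from_object(ProductionConfig)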
Well, since current_app only becomes a proxy for your Flask application at run time (once the app is created and the blueprint is registered), you can't use it at import time in your models_package modules.
(app tries to import models_package, and models_package requires app's config to initialize things, thus the import fails.)
One option would be a circular import; assuming everything is in the 'App' module:
__init__.py
import flask

application = flask.Flask(__name__)
# load configs into application.config here

import models_package
models_package.py
from App import application

config = application.config
or create your own config object, but that somehow doubles complexity:
models_package.py
import flask

config = flask.config.Config('.', defaults=flask.Flask.default_config)

# pick one of these and apply the same config initialization as you do in
# your __init__.py
config.from_pyfile(..)  # or
config.from_object(..)  # or
config.from_envvar(..)

Python App Engine import issues after app is cached

I'm using a modified version of juno (http://github.com/breily/juno/) in Google App Engine. The problem I'm having is I have code like this:
import juno
import pprint

@get('/')
def home(web):
    pprint.pprint("test")

def main():
    run()

if __name__ == '__main__':
    main()
The first time I start the app up in the dev environment it works fine. The second time and every time after that it can't find pprint. I get this error:
AttributeError: 'NoneType' object has no attribute 'pprint'
If I move the import inside the function it works every time:
@get('/')
def home(web):
    import pprint
    pprint.pprint("test")
So it seems like it is caching the function but for some reason the imports are not being included when it uses that cache. I tried removing the main() function at the bottom to see if that would remove the caching of this script but I get the same problem.
Earlier tonight this code was working fine, I'm not sure what could have changed to cause this. Any insight is appreciated.
I would leave it that way. I saw a slideshare that Google put out about App Engine optimization that said you can get better performance by keeping imports inside of the methods, so they are not imported unless necessary.
Is it possible you are reassigning the name pprint somewhere? The only two ways I know of for a module-level name (like what you get from the import statement) to become None are if you assign it yourself (pprint = None) or upon interpreter shutdown, when Python's cleanup assigns None to all module-level names as it shuts things down.
