I am trying to set up Celery tasks. Our main app is Pyramid with SQLAlchemy.
So I have a task defined as:
from celery.contrib.methods import task
from apipython.celerytasks import celery

class Email():
    def __init__(self, from_name, from_email, to_name, to_email, subject,
                 html_body, sendgrid_category=None):
        self.from_name = from_name
        self.from_email = from_email
        self.to_name = to_name
        self.to_email = to_email
        self.subject = subject
        self.body = None
        self.html_body = html_body
        self.sendgrid_category = sendgrid_category

class EmailService():
    @task()
    def task__send_smtp(self, email, from_user_id=None, to_user_id=None):
        # send the email, not shown here
        # EmailLog is a SQLAlchemy model
        email_log = EmailLog(
            email.subject,
            email.html_body,
            from_user_id=from_user_id,
            to_user_id=to_user_id,
            action_type=email.sendgrid_category)
        DBSession.add(email_log)
        transaction.commit()
And in celerytasks.py I have:
from celery import Celery

celery = Celery('apipython.celery',
                broker='sqla+mysql+mysqldb://root:notarealpassword@127.0.0.1/gs?charset=utf8',
                backend=None,
                include=['apipython.services.NotificationService'])

if __name__ == '__main__':
    celery.start()
It works - the task gets serialized and picked up.
However when I try to use SQLAlchemy / DBSession inside the task, I get an error:
UnboundExecutionError: Could not locate a bind configured on mapper Mapper|EmailLog|emaillogs or this Session
I understand the worker task runs in a separate process and needs to have its settings, session, engine, etc. set up. So I have this:
import sqlalchemy
from celery.signals import worker_init

@worker_init.connect
def bootstrap_pyramid(signal, sender):
    import os
    from pyramid.paster import bootstrap

    sender.app.settings = bootstrap('development.ini')['registry'].settings
    customize_settings(sender.app.settings)
    engine = sqlalchemy.create_engine('mysql+mysqldb://root:notarealpassword@127.0.0.1/gs?charset=utf8')
    DBSession.configure(bind=engine)
    Base.metadata.bind = engine
However I am still getting the same error.
DBSession and Base are defined in models.py as
DBSession = scoped_session(sessionmaker(extension=ZopeTransactionExtension()))
Base = declarative_base()
What step am I missing to make the models binding work?
Second question: can this code for creating the session / binding live in Celery's init instead of worker_init?
(BTW I did try pyramid_celery but prefer to make plain celery work)
Thanks,
My colleague tried the exact same code and it worked. Strange.
When I run tests, it connects to the database successfully, but it does not create the tables. I think maybe there is a different way to create tables when using flask-sqlalchemy, but I can't find the solution.
This is app.py
db = SQLAlchemy()

def create_app(config_name):
    app = Flask(__name__, template_folder='templates')
    app.wsgi_app = ProxyFix(app.wsgi_app)
    app.config.from_object(config_name)
    app.register_blueprint(api)
    db.init_app(app)

    @app.route('/ping')
    def health_check():
        return jsonify(dict(ok='ok'))

    @app.errorhandler(404)
    def ignore_error(err):
        return jsonify()

    app.add_url_rule('/urls', view_func=Shorty.as_view('urls'))
    return app
This is run.py
environment = environ['TINY_ENV']
config = config_by_name[environment]
app = create_app(config)

if __name__ == '__main__':
    app.run()
This is config.py
import os

basedir = os.path.abspath(os.path.dirname(__file__))

class Config:
    """
    set Flask configuration vars
    """
    # General config
    DEBUG = True
    TESTING = False

    # Database
    SECRET_KEY = os.environ.get('SECRET_KEY', 'my_precious_secret_key')
    SQLALCHEMY_DATABASE_URI = 'mysql+pymysql://root@localhost:3306/tiny'
    SQLALCHEMY_TRACK_MODIFICATIONS = False
    SERVER_HOST = 'localhost'
    SERVER_PORT = '5000'

class TestConfig(Config):
    """
    config for test
    """
    TESTING = True
    SQLALCHEMY_DATABASE_URI = 'mysql+pymysql://root@localhost:3306/test_tiny'

config_by_name = dict(
    test=TestConfig,
    local=Config
)

key = Config.SECRET_KEY
This is models.py
from datetime import datetime
from flask_sqlalchemy import SQLAlchemy

db = SQLAlchemy()

class URLS(db.Model):
    __tablename__ = 'urls'
    id = db.Column(db.Integer, primary_key=True)
    original_url = db.Column(db.String(400), nullable=False)
    short_url = db.Column(db.String(200), nullable=False)
    created_at = db.Column(db.DateTime, default=datetime.utcnow)  # pass the callable, not its result
This is the test config setup.
db = SQLAlchemy()

@pytest.fixture(scope='session')
def app():
    test_config = config_by_name['test']
    app = create_app(test_config)
    app.app_context().push()
    return app

@pytest.fixture(scope='session')
def client(app):
    return app.test_client()

@pytest.fixture(scope='session')
def init_db(app):
    db.init_app(app)
    db.create_all()
    yield db
    db.drop_all()
The following might be the problem that is preventing your code from running multiple times and/or preventing you from dropping/creating your tables. Regardless of whether it solves your problem, it is something one might not be aware of and is quite important to keep in mind. :)
When you run your tests multiple times, db.drop_all() might not be called (because one of your tests failed), and therefore the tables cannot be created on the next run (since they already exist). The problem lies in using a context manager without a try: finally:. (NOTE: every fixture using yield is a context manager.)
from contextlib import contextmanager

def test_foo(db):
    print('begin foo')
    raise RuntimeError()
    print('end foo')

@contextmanager
def get_db():
    print('before')
    yield 'DB object'
    print('after')
This code represents your code, but without using the functionality of pytest. Pytest runs it more or less like:
try:
    with get_db() as db:
        test_foo(db)
except Exception as e:
    print('Test failed')
One would expect an output similar to:
before
begin foo
after
Test failed
but we only get
before
begin foo
Test failed
While the context manager is active (yield has been executed), our test method is running. If an exception is raised during the execution of our test function, execution stops WITHOUT running any code after the yield statement. To prevent this, we have to wrap our fixture/contextmanager in a try: ... finally: block, as finally is ALWAYS executed regardless of what has happened.
@contextmanager
def get_db():
    print('before')
    try:
        yield 'DB object'
    finally:
        print('after')
The code after the yield statement is now executed as expected.
before
begin foo
after
Test failed
If you want to learn more, see the relevant section in the contextmanager docs:
At the point where the generator yields, the block nested in the with statement is
executed. The generator is then resumed after the block is exited. If an unhandled
exception occurs in the block, it is reraised inside the generator at the point
where the yield occurred. Thus, you can use a try…except…finally statement to trap
the error (if any), or ensure that some cleanup takes place.
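Applied to the init_db fixture from the question, a minimal sketch (the same create_all/drop_all logic, only wrapped in try: finally:):

@pytest.fixture(scope='session')
def init_db(app):
    db.init_app(app)
    db.create_all()
    try:
        yield db
    finally:
        # runs even if a test fails, so the next run starts from a clean schema
        db.drop_all()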
I'm struggling to make my Flask, SQLAlchemy (mysql) and Celery setup work properly when there are multiple celery workers with multiple threads involved that all query the same database.
The problem is that I cannot figure out how and where to apply the required changes that give the Flask application and each celery worker an isolated database object.
From my understanding, separate sessions are required to avoid nasty database errors such as incomplete transactions that block other database queries.
This is my current project structure
/flask_celery.py
from celery import Celery

def make_celery(app):
    celery = Celery(app.import_name, backend=app.config['CELERY_RESULT_BACKEND'],
                    broker=app.config['CELERY_BROKER_URL'])
    celery.conf.update(app.config)
    TaskBase = celery.Task

    class ContextTask(TaskBase):
        abstract = True

        def __call__(self, *args, **kwargs):
            with app.app_context():
                return TaskBase.__call__(self, *args, **kwargs)

    celery.Task = ContextTask
    return celery
/app.py
#!/usr/bin/env python
import config
from app import app

app.run(port=82, debug=True, host='0.0.0.0')
# app.run(debug=True)
app/__init__.py
from flask import Flask
from celery import Celery
from flask_sqlalchemy import SQLAlchemy
from flask_migrate import Migrate
from flask_celery import make_celery
app = Flask(__name__)
app.config.from_object('config')
app.secret_key = app.config['SECRET_SESSION_KEY']
db = SQLAlchemy(app)
migrate = Migrate(app, db)
celery = make_celery(app)
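For reference, any task defined against this celery instance runs inside app.app_context() via the ContextTask above, so db.session is usable in the worker. A hypothetical sketch (the task body and the Item model are invented for illustration):

@celery.task()
def bump_counter(item_id):
    # executes under app.app_context() thanks to ContextTask
    item = Item.query.get(item_id)  # Item is an assumed model
    item.counter += 1
    db.session.commit()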
Maybe give SQLALCHEMY_BINDS a chance. It's a guideline on how to bind multiple databases. I'm afraid there are still extra steps you need to take, though.
I assume you have a config.py that holds the app configuration. Add SQLALCHEMY_BINDS to the Config class with the values you prepared, which may be the URIs of several other databases (see the sketch after this list).
Handle the model classes in models.py, if that file exists.
Manage your bind_key as an argument somehow (sorry, I can't make this more detailed).
Pass the bind_key argument through to the right Celery worker...
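A minimal sketch of that first step; the bind names and URIs below are placeholders, not values from the question:

# in config.py
class Config:
    SQLALCHEMY_DATABASE_URI = 'mysql+pymysql://user:pw@localhost/main'  # default bind
    SQLALCHEMY_BINDS = {
        'analytics': 'mysql+pymysql://user:pw@localhost/analytics',  # named bind
    }

# in models.py
class Event(db.Model):
    __bind_key__ = 'analytics'  # route this model to the 'analytics' bind
    id = db.Column(db.Integer, primary_key=True)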
I hope this helps you a little. And please let me know if you work this out, so I can edit this answer for people who have similar cases.
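A hedged refinement of the ContextTask from the question that is sometimes used to give each task an isolated session: remove the scoped session when the task finishes. (Depending on the Flask-SQLAlchemy version, popping the app context may already do this on teardown; this is an assumption, not something from the question.)

class ContextTask(TaskBase):
    abstract = True

    def __call__(self, *args, **kwargs):
        with app.app_context():
            try:
                return TaskBase.__call__(self, *args, **kwargs)
            finally:
                db.session.remove()  # hand the next task a fresh session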
I'm currently trying to integrate GraphQL (using graphene and flask_graphql) into my Flask app. I tried a few tutorials from here and here, but it seems none works in my situation.
Currently my project is like this:
-manage.py
|app
  |__init__.py (create_app is here)
  |mod_graphql (this is a folder)
    |__init__.py (blueprint created here)
    |models.py
    |schema.py
    |controller.py
The issue is that both tutorials recommend creating Base and engine in the models file. I did the same, but I need to read the config via current_app to get the URI for creating the engine. Because this happens during Flask initialization, there is no application context yet, so current_app doesn't exist and everything fails.
Would it be possible to help me setup this?
Below are some code:
app/__init__.py
def create_app():
    ...
    from .mod_graphql import bp_graph
    app.register_blueprint(bp_graph)
    ...
mod_graphql/__init__.py
from flask import Blueprint
bp_graph = Blueprint('graphql', __name__)
from . import controller
from . import models
from . import schema
mod_graphql/models.py
Base = declarative_base()
engine = create_engine(current_app.config.get('BASE_URI'),
                       convert_unicode=True)
db_session = scoped_session(sessionmaker(autocommit=False,
                                         autoflush=False,
                                         bind=engine))
Base.query = db_session.query_property()

class Model1_Model(Base):
    ...

class Model2_Model(Base):
    ...
mod_graphql/schema.py
class Model1(SQLAlchemyObjectType):
    class Meta:
        model = Model1_Model
        interfaces = (relay.Node,)

class Model2(SQLAlchemyObjectType):
    class Meta:
        model = Model2_Model
        interfaces = (relay.Node,)

class Query(graphene.ObjectType):
    node = relay.Node.Field()

schema = graphene.Schema(query=Query, types=[Model1])
mod_graphql/controller.py
bp_graph.add_url_rule('/graphql',
                      view_func=GraphQLView.as_view('graphql',
                                                    schema=schema,
                                                    graphiql=True,
                                                    context={'session': db_session}))

@bp_graph.teardown_app_request
def shutdown_session(exception=True):
    db_session.remove()
When I try to start the server, it tells me:
Working outside of application context
Could you please recommend the best practice for setting this up?
Thanks a lot!
I would suggest using the Flask initialization to define the DB connection. This way you can use Flask's internal implementation of connecting to the database without defining it on your own.
app/__init__.py
from flask import Flask
from flask_sqlalchemy import SQLAlchemy

db = SQLAlchemy()

def create_app(config_class=Config):
    app = Flask(__name__)
    app.config.from_object(config_class)
    db.init_app(app)
    return app
Then all you need to do in your models is:
app/models.py
Base = declarative_base()
class Model1_Model(Base):
Then, in your route definition, instead of using the db_session from your model in the context, you can reference the db you created in create_app(), as in context={'session': db.session}.
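A minimal sketch of that route wiring, assuming the blueprint and schema from the question:

from flask_graphql import GraphQLView
from app import db  # the SQLAlchemy() instance created in create_app's module

bp_graph.add_url_rule('/graphql',
                      view_func=GraphQLView.as_view('graphql',
                                                    schema=schema,
                                                    graphiql=True,
                                                    context={'session': db.session}))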
Try initiating the Blueprint in app.__init__. I hit the same issue as you, because the __name__ should be app.
Finally figured out how to do it without using current_app:
In config.py, set up multi-database binding:
SQLALCHEMY_BINDS = {
    'db1': DB1_URI,
    'db2': DB2_URI
}
In models.py, do this:

engine = db.get_engine(bind='db1')
metaData = MetaData()
metaData.reflect(engine)
Base = automap_base(metadata=metaData)  # pass the reflected instance, not the MetaData class

class Model1(Base):
    __tablename__ = 'db1_tablename'
In this way, the model can access the proper database URI without current_app, but if the app context switches, a different solution may be needed.
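A brief, hedged usage sketch of the remaining wiring (automap typically needs Base.prepare() before the declared classes are mapped; this step is an assumption, not shown in the answer):

Base.prepare()  # complete the automap mapping against the reflected metadata
first_row = db.session.query(Model1).first()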
I am building an app where users will occasionally initiate a longer-running process. While running, the process will commit updates to a database entry.
Since the process takes some time, I am using the threading module to execute it. But values updated while in the thread are never actually committed.
An example:
from flask import Flask, url_for, redirect
from flask_sqlalchemy import SQLAlchemy
import time, threading, os

if os.path.exists('test.db'): os.remove('test.db')

app = Flask(__name__)
app.config['SQLALCHEMY_DATABASE_URI'] = 'sqlite:///test.db'
db = SQLAlchemy(app)

class Item(db.Model):
    id = db.Column(db.Integer, primary_key=True)
    value = db.Column(db.Integer)
    def __init__(self, value): self.value = value

db.create_all()
item = Item(1)
db.session.add(item)
db.session.commit()

@app.route('/go', methods=['GET'])
def go():
    def fun(item):
        time.sleep(2)
        item.value += 1
        db.session.commit()
    thr = threading.Thread(target=fun, args=(item,))
    # thr.daemon = True
    thr.start()
    return redirect(url_for('view'))

@app.route('/view', methods=['GET'])
def view(): return str(Item.query.get(1).value)

app.run(host='0.0.0.0', port=8080, debug=True)
My expectation was that the item's value would be asynchronously updated after two seconds (when fun completes), and that additional requests to /view would reveal the updated value. But this never occurs. I am not an expert on what is going on in the threading module; am I missing something?
I have tried setting thr.daemon=True as pointed out in some posts; but that is not it. The closest SO post I have found is this one; that question does not have a minimal and verifiable example and has not been answered.
I guess this is due to the fact that sessions are thread-local, as mentioned in the documentation. In your case, item was created in one thread and then passed to a new thread to be modified directly.
You can either use scoped sessions as suggested in the documentation, or simply change your URI config to bypass this behavior:
app.config['SQLALCHEMY_DATABASE_URI'] = 'sqlite:///test.db?check_same_thread=False'
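For reference, the same flag can be passed outside the URI when building an engine manually with plain SQLAlchemy; a sketch assuming direct create_engine usage:

from sqlalchemy import create_engine

# forward the flag to the underlying sqlite3.connect() call
engine = create_engine('sqlite:///test.db',
                       connect_args={'check_same_thread': False})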
After some debugging I figured out a solution; though I still do not understand the problem. It has to do with referencing a variable for the database object. If fun updates an object returned by a query, it works as expected:
def fun(item_id):
    time.sleep(2)
    Item.query.get(item_id).value += 1
    db.session.commit()
In context:
from flask import Flask, url_for, redirect
from flask_sqlalchemy import SQLAlchemy
import time, threading, os

if os.path.exists('test.db'): os.remove('test.db')

app = Flask(__name__)
app.config['SQLALCHEMY_DATABASE_URI'] = 'sqlite:///test.db'
db = SQLAlchemy(app)

class Item(db.Model):
    id = db.Column(db.Integer, primary_key=True)
    value = db.Column(db.Integer)
    def __init__(self, value): self.value = value

db.create_all()
item = Item(1)
db.session.add(item)
db.session.commit()

@app.route('/go', methods=['GET'])
def go():
    def fun(item_id):
        time.sleep(2)
        Item.query.get(item_id).value += 1
        db.session.commit()
    thr = threading.Thread(target=fun, args=(item.id,))
    # thr.daemon = True
    thr.start()
    return redirect(url_for('view'))

@app.route('/view', methods=['GET'])
def view(): return str(Item.query.get(1).value)

app.run(host='0.0.0.0', port=8080, debug=True)
I would be very pleased to hear from anyone who knows what exactly is going on here!
So I am using Amazon Web Services RDS to run a MySQL server and using Python's Flask framework to run the application server and Flask-SQLAlchemy to interface with the RDS.
My app config.py
SQLALCHEMY_DATABASE_URI = '<RDS Host>'
SQLALCHEMY_POOL_RECYCLE = 60
My __init__.py
from flask import Flask
from flask.ext.sqlalchemy import SQLAlchemy
application = Flask(__name__)
application.config.from_object('config')
db = SQLAlchemy(application)
I have my main application.py
from flask import Flask
from application import db
import flask.ext.restless
from application.models import Person

application = Flask(__name__)
application.debug = True
db.init_app(application)

@application.route('/')
def index():
    return "Hello, World!"

manager = flask.ext.restless.APIManager(application, flask_sqlalchemy_db=db)
manager.create_api(Person, methods=['GET', 'POST', 'DELETE'])

if __name__ == '__main__':
    application.run(host='0.0.0.0')
The models.py
class Person(db.Model):
    __bind_key__ = 'people'
    id = db.Column(db.Integer, primary_key=True)
    firstName = db.Column(db.String(80))
    lastName = db.Column(db.String(80))
    email = db.Column(db.String(80))

    def __init__(self, firstName=None, lastName=None, email=None):
        self.firstName = firstName
        self.lastName = lastName
        self.email = email
I then have a script to populate the database for testing purposes after db creation and app start:
from application import db
from application.models import Person

person = Person('Bob', 'Jones', 'bob@website.net')
db.session.add(person)
db.session.commit()
Once I've reset the database with db.drop_all() and db.create_all() I start the application.py and then the script to populate the database.
The server will respond with correct JSON but if I come back and check it hours later, I get the error that I need to rollback or sometimes the 2006 error that the MySQL server has gone away.
People suggested that I change timeout settings on the MySQL server but that hasn't fixed anything. Here are my settings:
innodb_lock_wait_timeout = 3000
max_allowed_packet = 65536
net_write_timeout = 300
wait_timeout = 300
Then when I look at the RDS monitor, it shows the MySQL server kept the connection open for quite a while until the timeout. Now correct me if I'm wrong but isn't the connection supposed to be closed after it's finished? It seems that the application server keeps making sure that the database connection exists and then when the MySQL server times out, Flask/Flask-SQLAlchemy throws an error and brings down the app server with it.
Any suggestions are appreciated, thanks!
I think what did it was adding
db.init_app(application)
in application.py; I haven't had the error since.
Checking every time whether to roll back or not is troublesome. I made insert and update functions which need commit.
@app.teardown_request
def session_clear(exception=None):
    Session.remove()
    if exception and Session.is_active:
        Session.rollback()
It seems not to be a problem with the transactions in the first place; this is probably caused by a MySQL error like Connection reset by peer beforehand. That means your connection was lost, probably because your application context was not set up correctly.
In general it is preferable to use the factory pattern to create your app. This has a lot of advantages; your code is
easier to read and set up,
easier to test,
and avoids circular imports.
To prevent the invalid transaction error (that is probably caused by an OperationalError: Connection reset by peer) you should ensure that you are handling the database connection right.
The following example is based on this article which gives a nice explanation of the flask application context and how to use it with database connections or any other extensions.
application.py
from flask import Flask
from flask_sqlalchemy import SQLAlchemy

db = SQLAlchemy()  # module-level so routes and other modules can import it

def create_app():
    """Construct the core application."""
    application = Flask(__name__)
    application.config.from_object('config')  # Set globals

    with application.app_context():
        # Initialize globals/extensions in app context
        db.init_app(application)
        # import routes here
        from . import routes
    return application

if __name__ == "__main__":
    app = create_app()
    app.run(host="0.0.0.0")
routes.py
from flask import current_app as application

@application.route('/', methods=['GET'])
def index():
    return "Hello, World!"
If you still run into disconnect-problems you should also check the SQLAlchemy documentation on dealing with disconnects and have a look at this question.
Here you are missing pool recycle: MySQL closes sessions after some time, so you need to add pool recycling so that connections in the pool get reconnected after the recycle interval.
app.config['SQLALCHEMY_POOL_RECYCLE'] = 3600
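With plain SQLAlchemy the same idea can be set directly on the engine; a sketch (the URI is a placeholder), including the optional pre-ping check:

from sqlalchemy import create_engine

engine = create_engine(
    'mysql+pymysql://user:pw@host/db',
    pool_recycle=3600,   # recycle connections older than an hour
    pool_pre_ping=True,  # optional: validate connections before use
)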
This error usually appears when you create the SQLAlchemy engine as a singleton. In that case, after the connection is invalidated (in my case after 3600 seconds), you get the InvalidTransaction error.
The best advice would be to initialise the db session at the time of application initialisation:
db.init_app(app)
and import this db session whenever you have to do a CRUD operation. I have never faced this issue since making this change in my application.
Alternatively, use this at the end of the script that populates your database:
db.session.close()
That should prevent those annoying "MySQL server has gone away" errors.