Use the SQLAlchemy session in custom Flask-SQLAlchemy query method - python

I would like to create a custom method to tell me whether a row exists in my database. The SQLAlchemy exists method returns a subquery but doesn't execute anything, so I wanted to create the method does_exist, which will simply return True or False. Here is my code:
from flask_sqlalchemy import SQLAlchemy, BaseQuery, Model

class CustomBaseQuery(BaseQuery):
    def does_exist(self):
        return db.session.query(self.exists()).scalar()

db = SQLAlchemy(app, query_class=CustomBaseQuery)
This actually does work, but it seems wrong to refer to db.session within the body of the method, which thus depends on later naming the SQLAlchemy instance db. I would like to find a way to reference the eventual db.session object in a more general way.
Full working example here: https://gist.github.com/wbruntra/3db7b630e6ffb86fe792e4ed5a7a9578

Though undocumented, the session used by the Query object is accessible as
self.session
so your more generic CustomBaseQuery could look like

class CustomBaseQuery(BaseQuery):
    def does_exist(self):
        return self.session.query(self.exists()).scalar()
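
A quick usage sketch (the Post model here is hypothetical, not from the question):

db = SQLAlchemy(app, query_class=CustomBaseQuery)

class Post(db.Model):
    id = db.Column(db.Integer, primary_key=True)

# Returns True if at least one matching row exists, False otherwise
Post.query.filter_by(id=1).does_exist()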

Related

Unit testing a function that depends on database

I am running tests on some functions. I have a function that uses database queries. So, I have gone through the blogs and docs that say we have to make an in memory or test database to use such functions. Below is my function,
def already_exists(story_data, c):
    # TODO(salmanhaseeb): Implement de-dupe functionality by checking if it
    # already exists in the DB.
    c.execute("""SELECT COUNT(*) from posts where post_id = ?""", (story_data.post_id,))
    (number_of_rows,) = c.fetchone()
    if number_of_rows > 0:
        return True
    return False
This function hits the production database. My question is: when testing, I create an in-memory database and populate it with values, so I want to query that database (the test DB). But when I call my already_exists() function from a test, the production DB is hit instead. How do I make this function hit my test DB while testing?
There are two routes you can take to address this problem:
Make an integration test instead of a unit test and just use a copy of the real database.
Provide a fake to the method instead of the actual connection object.
Which one you should do depends on what you're trying to achieve.
If you want to test that the query itself works, then you should use an integration test. Full stop. The only way to make sure the query works as intended is to run it with test data already in a copy of the database. Running it against a different database technology (e.g., running against SQLite when your production database is PostgreSQL) will not ensure that it works in production.

Needing a copy of the database means you will need some automated deployment process for it that can be easily invoked against a separate database. You should have such an automated process anyway, as it helps ensure that your deployments across environments are consistent, allows you to test them prior to release, and "documents" the process of upgrading the database. Standard solutions are migration tools written in your programming language, like Alembic, or tools that execute raw SQL, like yoyo or Flyway. You would need to invoke the deployment and fill it with test data prior to running the test, then run the test and assert the output you expect to be returned.
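For example, here is a minimal sketch of such an integration test, assuming the production database is SQLite (the ? paramstyle in your query suggests sqlite3) and using a hypothetical Story stand-in:

import sqlite3
from collections import namedtuple

Story = namedtuple('Story', ['post_id'])  # hypothetical stand-in for the real object

def test_already_exists_against_real_schema():
    conn = sqlite3.connect(':memory:')
    c = conn.cursor()
    # Deploy the schema and seed test data before exercising the query.
    c.execute("CREATE TABLE posts (post_id INTEGER)")
    c.execute("INSERT INTO posts (post_id) VALUES (?)", (10,))
    assert already_exists(Story(post_id=10), c)
    assert not already_exists(Story(post_id=99), c)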
If you want to test the code around the query and not the query itself, then you can use a fake for the connection object. The most common solution to this is a mock. Mocks provide stand-ins that can be configured to accept function calls and inputs and return some output in place of the real object. This would allow you to test that the logic of the method works correctly, assuming that the query returns the results you expect. For your method, such a test might look something like this:
from unittest.mock import Mock
...
def test_already_exists_returns_true_for_positive_count():
    mockConn = Mock(
        execute=Mock(),
        fetchone=Mock(return_value=(5,)),
    )
    story = Story(post_id=10)  # Making some assumptions about what your object might look like.
    result = already_exists(story, mockConn)
    assert result
    # Possibly assert calls on the mock. Value of these asserts is debatable.
    mockConn.execute.assert_called_with("""SELECT COUNT(*) from posts where post_id = ?""", (story.post_id,))
    mockConn.fetchone.assert_called()
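
The zero-count branch can be covered the same way; a complementary sketch:

def test_already_exists_returns_false_for_zero_count():
    mockConn = Mock(
        execute=Mock(),
        fetchone=Mock(return_value=(0,)),
    )
    story = Story(post_id=10)
    assert not already_exists(story, mockConn)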
The issue is ensuring that your code consistently uses the same database connection. Then you can set it once to whatever is appropriate for the current environment.
Rather than passing the database connection around from method to method, it might make more sense to make it a singleton.
def already_exists(story_data):
    # Here `connection` is a singleton which returns the database connection.
    connection.execute("""SELECT COUNT(*) from posts where post_id = ?""", (story_data.post_id,))
    (number_of_rows,) = connection.fetchone()
    if number_of_rows > 0:
        return True
    return False
Or make connection a method on each class and turn already_exists into a method. It should probably be a method regardless.
def already_exists(self):
    # Here the connection is associated with the object.
    self.connection.execute("""SELECT COUNT(*) from posts where post_id = ?""", (self.post_id,))
    (number_of_rows,) = self.connection.fetchone()
    if number_of_rows > 0:
        return True
    return False
But really, you shouldn't be rolling this code yourself. Instead, use an ORM such as SQLAlchemy, which takes care of basic queries and connection management like this for you. It has a single point of contact with the database, the "session".
from sqlalchemy import create_engine
from sqlalchemy.orm import sessionmaker
from sqlalchemy_declarative import Address, Base, Person
engine = create_engine('sqlite:///sqlalchemy_example.db')
Base.metadata.bind = engine
DBSession = sessionmaker(bind=engine)
session = DBSession()
Then you use that to make queries. For example, it has an exists method.
q = session.query(Post).filter(Post.post_id == story_data.post_id)
session.query(q.exists()).scalar()
Using an ORM will greatly simplify your code. Here's a short tutorial for the basics, and a longer and more complete tutorial.
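
Putting it together, a sketch of already_exists rewritten on top of the ORM (the Post model and its post_id column are assumptions about your schema):

def already_exists(story_data, session):
    q = session.query(Post).filter(Post.post_id == story_data.post_id)
    return session.query(q.exists()).scalar()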

How to make/use a custom database function in Django

Prologue:
This is a question arising often in SO:
Equivalent of PostGIS ST_MakeValid in Django GEOS
Geodjango: How to Buffer From Point
Get random point from django PolygonField
Django custom for complex Func (sql function)
and can be applied to the above as well as in the following:
Django F expression on datetime objects
I wanted to compose an example on SO Documentation but since it got shut down on August 8, 2017, I will follow the suggestion of this widely upvoted and discussed meta answer and write my example as a self-answered post.
Of course, I would be more than happy to see any different approach as well!!
Question:
Django/GeoDjango has some database functions like Lower() or MakeValid() which can be used like this:
Author.objects.create(name='Margaret Smith')
author = Author.objects.annotate(name_lower=Lower('name')).get()
print(author.name_lower)
Is there any way to use and/or create my own custom database function based on existing database functions like:
Position() (MySQL)
TRIM() (SQLite)
ST_MakePoint() (PostgreSQL with PostGIS)
How can I apply/use those functions in Django/GeoDjango ORM?
Django provides the Func() expression to facilitate the calling of database functions in a queryset:
Func() expressions are the base type of all expressions that involve database functions like COALESCE and LOWER, or aggregates like SUM.
There are two options for using a database function in the Django/GeoDjango ORM:
For convenience, let us assume that the model is named MyModel and that the substring is stored in a variable named subst:
from django.db import models
from django.contrib.gis.db import models as gis_models

class MyModel(models.Model):
    name = models.CharField(max_length=100)  # max_length is required by Django
    the_geom = gis_models.PolygonField()
Use Func() to call the function directly:
We will also need the following to make our queries work:
Aggregation to add a field to each entry in our database.
F() which allows the execution of arithmetic operations on and between model fields.
Value() which will sanitize any given value (why is this important?)
The query:

from django.db.models import F, Func, Value

MyModel.objects.aggregate(
    pos=Func(F('name'), Value(subst), function='POSITION')
)
Create your own database function extending Func:
We can extend Func class to create our own database functions:
class Position(Func):
    function = 'POSITION'
and use it in a query:
MyModel.objects.aggregate(pos=Position('name', Value(subst)))
GeoDjango Appendix:
In GeoDjango, in order to use a GIS-related function (like PostGIS's ST_Transform), Func() must be replaced by GeoFunc(), but it is used under essentially the same principles:
class Transform(GeoFunc):
    function = 'ST_Transform'
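
A usage sketch, reusing the MyModel definition above (the target SRID 3857 is just an example; depending on your Django version you may also need to declare an output_field):

MyModel.objects.annotate(the_geom_3857=Transform('the_geom', 3857))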
There are more complex cases of GeoFunc usage and an interesting use case has emerged here: How to calculate Frechet Distance in Django?
Generalize custom database function Appendix:
If you want to create a custom database function (Option 2) that can be used with any database without knowing the backend beforehand, you can use Func's as_<database-name> methods, provided that an equivalent function exists in every database:
class Position(Func):
    function = 'POSITION'  # MySQL method

    def as_sqlite(self, compiler, connection):
        # SQLite method
        return self.as_sql(compiler, connection, function='INSTR')

    def as_postgresql(self, compiler, connection):
        # PostgreSQL method
        return self.as_sql(compiler, connection, function='STRPOS')

Add trigger to SQLAlchemy Base Class

I'm making a SQLAlchemy base class for a new Postgres database and want to have bookkeeping fields incorporated into it. Specifically, I want to have two columns for modified_at and modified_by that are updated automatically. I was able to find out how to do this for individual tables, but it seems like making this part of the base class is trickier.
My first thought was to try and leverage the declared_attr functionality, but I don't actually want to make the triggers an attribute in the model so that seems incorrect. Then I looked at adding the trigger using event.listen:
trigger = """
CREATE TRIGGER update_{table_name}_modified
BEFORE UPDATE ON {table_name}
FOR EACH ROW EXECUTE PROCEDURE update_modified_columns()
"""

def create_modified_trigger(target, connection, **kwargs):
    if hasattr(target, 'name'):
        connection.execute(trigger.format(table_name=target.name))

Base = declarative_base()
event.listen(Base.metadata, 'after_create', create_modified_trigger)
I thought I could find table_name using the target parameter as shown in the docs but when used with Base.metadata it returns MetaData(bind=None) rather than a table.
I would strongly prefer to have this functionality as part of the Base rather than including it in migrations or externally to reduce the chance of someone forgetting to add the triggers. Is this possible?
I was able to sort this out with the help of a coworker. The returned MetaData object did in fact have a list of tables. Here is the working code:
modified_trigger = """
CREATE TRIGGER update_{table_name}_modified
BEFORE UPDATE ON {table_name}
FOR EACH ROW EXECUTE PROCEDURE update_modified_columns()
"""

def create_modified_trigger(target, connection, **kwargs):
    """
    This is used to add bookkeeping triggers after a table is created. It hooks
    into the SQLAlchemy event system. It expects the target to be an instance of
    MetaData.
    """
    for key in target.tables:
        table = target.tables[key]
        connection.execute(modified_trigger.format(table_name=table.name))

Base = declarative_base()
event.listen(Base.metadata, 'after_create', create_modified_trigger)
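
Note that the trigger assumes an update_modified_columns() procedure already exists in the database; its body is not shown in the question. A plausible sketch (assuming modified_at/modified_by columns and that current_user is an acceptable value for modified_by):

create_function = """
CREATE OR REPLACE FUNCTION update_modified_columns()
RETURNS TRIGGER AS $$
BEGIN
    NEW.modified_at = now();
    NEW.modified_by = current_user;
    RETURN NEW;
END;
$$ LANGUAGE plpgsql
"""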

Can't delete row from SQLAlchemy due to wrong session

I am trying to delete an entry from my table. This is my code for the delete function.
@app.route("/delete_link/<link_id>", methods=['GET', 'POST'])
def delete_link(link_id):
    link = models.Link.query.filter(models.Link.l_id == link_id).first()
    db.session.delete(link)
    db.session.commit()
    return flask.redirect(flask.url_for('links'))
the line: db.session.delete(link) returns me this error:
InvalidRequestError: Object '' is already attached to session '1' (this is '2')
I've tried this code as well:
@app.route("/delete_link/<link_id>", methods=['GET', 'POST'])
def delete_link(link_id):
    link = models.Link.query.filter(models.Link.l_id == link_id)
    link.delete()
    db.session.commit()
    return flask.redirect(flask.url_for('links'))
which does not update the database. Link must not be in the session I guess, but I don't know how to check that, and how to fix it.
I am new to sqlalchemy.
EDIT:
I use this to create my db variable, which probably creates the session at this stage (this is at the top of the code). It comes from the Flask documentation:
from yourapplication import db
You are creating 2 instances of the db object, inherently creating 2 different sessions.
In models.py:

...
from config import app

db = SQLAlchemy(app)

In erika.py:

...
from config import app
...
db = SQLAlchemy(app)
then when you try to delete the element:
link = models.Link.query.filter(models.Link.l_id == link_id).first()
db.session.delete(link)
db.session.commit()
the following happens:
models.Link.query uses the database session created by models.py to get the record.
db.session.delete uses the session created by erika.py.
link is attached to the models.py session, and you can't use another session (erika.py's) to delete it. Hence:
InvalidRequestError: Object '' is already attached to session '1' (this is '2')
Solution

The solution is simple: have only one instance of a db object at any time, and reuse that instance whenever you need db operations.
erika.py
from models import db
This way you are always using the same session that was used to fetch your records.
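
In other words, the layout becomes (a sketch, assuming Link is defined in models.py):

# models.py
from flask_sqlalchemy import SQLAlchemy
from config import app

db = SQLAlchemy(app)

# erika.py
from models import db, Link

link = Link.query.filter(Link.l_id == link_id).first()
db.session.delete(link)
db.session.commit()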
It appears to be a similar problem to the one described at http://blog.miguelgrinberg.com/post/the-flask-mega-tutorial-part-xvi-debugging-testing-and-profiling
It's a good in-depth description of the problem and how he solved it. The author of that article made a fix that's available as a fork.
The Fix
To address this problem we need to find an alternative way of attaching Flask-WhooshAlchemy's query object to the model.
The documentation for Flask-SQLAlchemy mentions there is a model.query_class attribute that contains the class to use for queries. This is actually a much cleaner way to make Flask-SQLAlchemy use a custom query class than what Flask-WhooshAlchemy is doing. If we configure Flask-SQLAlchemy to create queries using the Whoosh enabled query class (which is already a subclass of Flask-SQLAlchemy's BaseQuery), then we should have the same result as before, but without the bug.
I have created a fork of the Flask-WhooshAlchemy project on github where I have implemented these changes. If you want to see the changes you can see the github diff for my commit, or you can also download the fixed extension and install it in place of your original flask_whooshalchemy.py file.

does sqlalchemy have an equivalent to datamapper's first_or_create method?

Using DataMapper, if you want to "either find the first resource matching some given criteria or just create that resource if it can't be found, you can use #first_or_create."
I am using Flask-SQLAlchemy and am wondering if there is a similar feature.
thanks!
Django has something similar named get_or_create; you can write something like it for SQLAlchemy:
def get_or_create(session, model, **kwargs):
    instance = session.query(model).filter_by(**kwargs).first()
    if instance:
        return instance
    else:
        instance = model(**kwargs)
        session.add(instance)  # add the new instance to the session so it will be persisted
        return instance
From: Does SQLAlchemy have an equivalent of Django's get_or_create? (Stack Overflow)
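
A usage sketch with Flask-SQLAlchemy, reusing the Link model from the earlier question (the names are illustrative):

link = get_or_create(db.session, models.Link, l_id=link_id)
db.session.commit()  # persists the row if get_or_create had to create it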
