Flask / SQL Alchemy raw query instead of class method - python

I'll acknowledge that this is a strange question before I ask it. I'm wondering if it's possible to replicate Flask / SQL Alchemy class methods using raw SQL instead of using the methods themselves?
Long story short, my teammates and I are taking a database design course, and we're now in the implementation phase where we are coding the app that is based on our DB schema design. We want to keep things simple, so we opted for using Flask in Python. We're following the Flask Mega Tutorial, which is a kickass-tic tutorial explaining how to build a basic site like we're doing. We've just completed Chapter 5: User Logins, and are moving on.
In the app/routes.py script, the tutorial does something to grab the user information. Here's the example login route for the example app:
from flask_login import current_user, login_user
from app.models import User
# ...
@app.route('/login', methods=['GET', 'POST'])
def login():
    if current_user.is_authenticated:
        return redirect(url_for('index'))
    form = LoginForm()
    if form.validate_on_submit():
        user = User.query.filter_by(username=form.username.data).first()
        if user is None or not user.check_password(form.password.data):
            flash('Invalid username or password')
            return redirect(url_for('login'))
        login_user(user, remember=form.remember_me.data)
        return redirect(url_for('index'))
    return render_template('login.html', title='Sign In', form=form)
The line user = User.query.filter_by(username=form.username.data).first() is what I'm interested in. Basically, that line instantiates the User class, which is a database model from SQL Alchemy, and grabs information about the user from the email address they entered. Calling those methods generates a SQL statement like the following:
SELECT `User`.`userID` AS `User_userID`,
`User`.user_email AS `User_user_email`,
`User`.user_first_name AS `User_user_first_name`,
`User`.user_last_name AS `User_user_last_name`,
`User`.user_password AS `User_user_password`
FROM `User`
WHERE `User`.user_email = 'test@test.com'
LIMIT 1
And also some information about the user variable itself:
>>> print(type(user))
<class 'myapp.models.User'>
>>> pp(user.__dict__)
{'_sa_instance_state': <sqlalchemy.orm.state.InstanceState object at 0x7f5a026a8438>,
 'userID': 1,
 'user_email': 'test@test.com',
 'user_first_name': 'SomeFirstName',
 'user_last_name': 'SomeLastName',
 'user_password': 'somepassword'}
On our project, we're not supposed to be using generated SQL statements like the one that comes from calling query.filter_by(username=form.username.data).first() on the instantiated User class; we should be writing the raw SQL ourselves, which normally doesn't make sense, but in our case it does.
Is this possible?

First of all: Talk to your professor or TA. You will save yourself time by not making assumptions about something so major. If the goal of the class is to think about database schema design then using an ORM is probably fine. If you need to write your own SQL, then don't use an ORM to begin with.
To answer the technical question: yes, you can use SQLAlchemy purely as a database connection pool, as a tool to create valid SQL statements from Python objects, and as a full-fledged ORM, and every gradation in between.
For example, using the ORM layer, you can tell a Query object to not generate the SQL for you but instead take text. This is covered in the SQLAlchemy ORM tutorial under the Using Textual SQL section:
Literal strings can be used flexibly with Query, by specifying their use with the text() construct, which is accepted by most applicable methods
For your login example, loading the user with hand-written SQL could look like this:
user = User.query.from_statement(
    db.text("SELECT * FROM User where user_email=:email LIMIT 1")
).params(email=form.username.data).first()
if user is None or not user.check_password(form.password.data):
    # ...
You could also read up on the SQL Expression API (the core API of the SQLAlchemy library) to build queries using Python code. The relationship between the Python objects and the resulting query is pretty much one-to-one; you would generally first produce a model of your tables and then build your SQL from there, or you can use literals:
s = select([
    literal_column("User.user_password", String)
]).where(
    literal_column("User.user_email") == form.username.data
).select_from(table("User")).limit(1)
You then execute such objects with the Session.execute() method:
results = db.session.execute(s)
If you wanted to really shoot yourself in the foot, you can pass strings to db.session.execute() directly too:
results = db.session.execute("""
SELECT user_password FROM User where user_email=:email LIMIT 1
""", {'email': form.username.data})
Just know that Session.execute() returns a ResultProxy() instance, not ORM instances.
Also, know that Flask-Login doesn't require you to use an ORM. As the project documentation states:
However, it does not:
Impose a particular database or other storage method on you. You are entirely in charge of how the user is loaded.
So you could just create a subclass of UserMixin that you instantiate each time you queried the database, manually.
class User(flask_login.UserMixin):
    def __init__(self, id):  # add more attributes as needed
        self.id = id
@login_manager.user_loader
def load_user(user_id):
    # perhaps query the database to confirm the user id exists and
    # load more info, but all you basically need is:
    return User(user_id)
# on login, use
user = User(id_of_user_just_logged_in)
login_user(user)
That's it. The extension wants to see instances that implement 4 basic methods, and the UserMixin class provides all of those and you only need to provide the id attribute. How you validate user ids and handle login is up to you.
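As a rough sketch of that last point, the lookup can be done with hand-written SQL and a plain Python object, with no ORM involved. The example below uses the standard-library sqlite3 module only so that it runs on its own. The RawUser class, the find_user_by_email helper and the in-memory table are assumptions made for illustration, with column names borrowed from the question. In a real app the class would presumably also inherit from flask_login.UserMixin, and the password would be stored as a hash rather than as plain text.

```python
import sqlite3
from dataclasses import dataclass


@dataclass
class RawUser:
    """Hypothetical stand-in for the user object handed to login_user()."""
    id: int
    email: str
    password: str  # plain text only to keep the sketch short

    def check_password(self, candidate):
        return candidate == self.password


def find_user_by_email(conn, email):
    """Run a hand-written, parameterised SELECT; return a RawUser or None."""
    row = conn.execute(
        "SELECT userID, user_email, user_password "
        "FROM User WHERE user_email = ? LIMIT 1",
        (email,),
    ).fetchone()
    return None if row is None else RawUser(*row)


def make_demo_connection():
    """Build an in-memory table shaped like the one in the question."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE User (userID INTEGER PRIMARY KEY, "
        "user_email TEXT UNIQUE, user_password TEXT)"
    )
    conn.execute(
        "INSERT INTO User (user_email, user_password) VALUES (?, ?)",
        ("test@test.com", "somepassword"),
    )
    return conn
```

The value is always passed as a bound parameter rather than formatted into the SQL string, which is the part worth keeping whichever database driver ends up being used.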

Related

Unit testing a function that depends on database

I am running tests on some functions. I have a function that uses database queries. So, I have gone through the blogs and docs that say we have to make an in memory or test database to use such functions. Below is my function,
def already_exists(story_data, c):
    # TODO(salmanhaseeb): Implement de-dupe functionality by checking if it already
    # exists in the DB.
    c.execute("""SELECT COUNT(*) from posts where post_id = ?""", (story_data.post_id,))
    (number_of_rows,) = c.fetchone()
    if number_of_rows > 0:
        return True
    return False
This function hits the production database. My question is: when testing, I create an in-memory database and populate my values there, so I should be querying that test DB. But when I call already_exists() from a test, my production DB will be hit. How do I make the function hit the test DB instead while testing?
There are two routes you can take to address this problem:
Make an integration test instead of a unit test and just use a copy of the real database.
Provide a fake to the method instead of actual connection object.
Which one you should do depends on what you're trying to achieve.
If you want to test that the query itself works, then you should use an integration test. Full stop. The only way to make sure the query works as intended is to run it with test data already in a copy of the database. Running it against a different database technology (e.g., running against SQLite when your production database is PostgreSQL) will not ensure that it works in production. Needing a copy of the database means you will need some automated deployment process for it that can be easily invoked against a separate database. You should have such an automated process anyway, as it helps ensure that your deployments across environments are consistent, allows you to test them prior to release, and "documents" the process of upgrading the database. Standard solutions to this are migration tools written in your programming language, like Alembic, or tools that execute raw SQL, like yoyo or Flyway. You would need to invoke the deployment and fill the database with test data prior to running the test, then run the test and assert the output you expect to be returned.
If you want to test the code around the query and not the query itself, then you can use a fake for the connection object. The most common solution to this is a mock. Mocks provide stand ins that can be configured to accept the function calls and inputs and return some output in place of the real object. This would allow you to test that the logic of the method works correctly, assuming that the query returns the results you expect. For your method, such a test might look something like this:
from unittest.mock import Mock
...
def test_already_exists_returns_true_for_positive_count():
    mockConn = Mock(
        execute=Mock(),
        fetchone=Mock(return_value=(5,)),
    )
    story = Story(post_id=10)  # Making some assumptions about what your object might look like.
    result = already_exists(story, mockConn)
    assert result
    # Possibly assert calls on the mock. Value of these asserts is debatable.
    mockConn.execute.assert_called_with("""SELECT COUNT(*) from posts where post_id = ?""", (story.post_id,))
    mockConn.fetchone.assert_called()
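For the integration-test route, note that already_exists() receives its cursor as an argument, so a test only needs to pass in a cursor that belongs to a test database. The sketch below assumes production also runs on SQLite (the `?` placeholder in the question suggests the sqlite3 module), since a different engine would not prove much. The Story namedtuple and the make_test_cursor helper are hypothetical names introduced for the example.

```python
import sqlite3
from collections import namedtuple

# Hypothetical stand-in for the real story object; only post_id is needed.
Story = namedtuple("Story", ["post_id"])


def already_exists(story_data, c):
    """Same logic as in the question: True if the post is already stored."""
    c.execute("SELECT COUNT(*) FROM posts WHERE post_id = ?", (story_data.post_id,))
    (number_of_rows,) = c.fetchone()
    return number_of_rows > 0


def make_test_cursor():
    """Create a throwaway in-memory database containing one known post."""
    conn = sqlite3.connect(":memory:")
    c = conn.cursor()
    c.execute("CREATE TABLE posts (post_id INTEGER PRIMARY KEY)")
    c.execute("INSERT INTO posts (post_id) VALUES (?)", (10,))
    return c


def test_already_exists_against_test_db():
    c = make_test_cursor()
    assert already_exists(Story(post_id=10), c)
    assert not already_exists(Story(post_id=11), c)
```

Because the cursor is injected, the production database is never touched as long as the test builds its own cursor.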
The issue is ensuring that your code consistently uses the same database connection. Then you can set it once to whatever is appropriate for the current environment.
Rather than passing the database connection around from method to method, it might make more sense to make it a singleton.
def already_exists(story_data):
    # Here `connection` is a singleton which returns the database connection.
    connection.execute("""SELECT COUNT(*) from posts where post_id = ?""", (story_data.post_id,))
    (number_of_rows,) = connection.fetchone()
    if number_of_rows > 0:
        return True
    return False
Or make connection a method on each class and turn already_exists into a method. It should probably be a method regardless.
def already_exists(self):
    # Here the connection is associated with the object.
    self.connection.execute("""SELECT COUNT(*) from posts where post_id = ?""", (self.post_id,))
    (number_of_rows,) = self.connection.fetchone()
    if number_of_rows > 0:
        return True
    return False
But really you shouldn't be rolling this code yourself. Instead you should use an ORM such as SQLAlchemy which takes care of basic queries and connection management like this for you. It has a single connection, the "session".
from sqlalchemy import create_engine
from sqlalchemy.orm import sessionmaker
from sqlalchemy_declarative import Address, Base, Person
engine = create_engine('sqlite:///sqlalchemy_example.db')
Base.metadata.bind = engine
DBSession = sessionmaker(bind=engine)
session = DBSession()
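Then you use that to make queries. For example, a query has an exists method. Here q is assumed to be a query that selects the post in question, and Post.post_id is an assumed column name:
q = session.query(Post).filter(Post.post_id == post_id)
session.query(q.exists()).scalar()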
Using an ORM will greatly simplify your code. Here's a short tutorial for the basics, and a longer and more complete tutorial.

Data Model persistence between requests

I'm building an application using Flask and Flask-SQLAlchemy.
In the application I use database models written in the SQLAlchemy declarative language, let's say that I have a table called Server.
The application, by design, asks the user (via WTForms) to set values for the fields of the Server table across different pages (views), and I need to save the instance to the database in the last view.
My problem is: I have 2 'circular' views and would like to store the instances of objects created in the first view directly in the database session in order to be able to query the session in the second view and committing the result only in the last view (the end of the loop), like the pseudocode shows (very simplified, it makes no sense in this form but is to explain the concept):
def first_view():
    form = FormOne()  # prompt the first form
    if form.validate_on_submit():
        # form validated, I create the instance in SQLAlchemy
        server = Server()  # Database Model
        server.hostname = form.hostname.data
        loop_count = form.repetition.data
        session['loop'] = loop_count
        db.session.add(server)  # adding the object to the session
        # but I'm not committing the session; in fact the next view needs to know about this new object and a third view needs to commit based on the user input
        return redirect(url_for('second_view'))
    return render_template("first_view.html", form=form)
def second_view():
    form = FormTwo()  # prompt the second form
    if form.validate_on_submit():
        hostname_to_search = form.hostname.data  # I get some other input
        # I use the input to query the session (here a different server instance can appear, depending on the user input of the first view)
        rslt = db.session.query(Server).filter(Server.hostname == hostname_to_search).all()
        # the session is empty and doesn't contain the instance created in the previous view... <-- :(
        if session['loop'] <= 0:
            # end the loop
            return redirect(url_for('commit_view'))
        else:
            loop_count = session.pop('loop', 1)
            session['loop'] = loop_count - 1
            # go back to the first page to add another server instance
            return redirect(url_for('first_view'))
    return render_template("first_view.html", form=form)
def commit_view():
    # loop finished, now I can commit my final instances
    db.session.commit()  # <-- here I save the data, but db.session is empty
    return 'DONE!'
But it SEEMS that the session in Flask-SQLAlchemy is local to the request, so between one view and the next the db.session is reset/empty.
The first solution that came to mind is to store the server object values in the flask.session as well (in JSON format), but this means that I need to serialize and parse the flask.session every time to rebuild the objects in each view. I also cannot use the query power of the database; for example, I have to manually check whether the hostname entered by the user is already present in the previously created server objects.
My question is: how is it possible (if it is possible and good practice) to keep the db session 'open' between different views:
How can I implement this?
Is it convenient?
Is it thread safe?
Maybe I'm using the wrong approach to solve the problem? (Of course I can do a single page, but the real case is much more complex and needs to be structured across different pages.)

Atomically Compare-Exchange a Model Field in Django

How can I atomically compare-exchange-save a value of Django Model instance Field? (Using PostgreSQL as the DB backend).
An example use case is making sure multiple posts with similar content (e.g. submits of the same form) take effect only once, without relying on insecure and only sometimes-working client-side javascript or server-side tracking of form UUIDs, which isn't secure against malicious multiple-posts.
For example:
def compare_exchange_save(model_object, field_name, comp, exch):
    # How to implement?
    ....
from django.views.generic.edit import FormView
from django.db import transaction
from my_app.models import LicenseCode
class LicenseCodeFormView(FormView):
    def post(self, request, ...):
        # Get object matching code entered in form
        license_code = LicenseCode.objects.get(...)
        # Safely redeem the code exactly once
        # No change is made in case of error
        try:
            with transaction.atomic():
                if compare_exchange_save(license_code, 'was_redeemed', False, True):
                    # Deposit a license for the user with a 3rd party service. Raises an exception if it fails.
                    ...
                else:
                    # License code already redeemed, don't deposit another license.
                    pass
        except:
            # Handle exception
            ...
What you are looking for is the update function on a QuerySet object.
Depending on the value, you can do a comparison with Case and When objects; check out the docs on conditional updates. Note that the link is for 1.10, but Case/When arrived in 1.8.
You might also find utility in using F which is used to reference a value in a field.
For example:
I need to update a value in my model Model:
(Model.objects
    .filter(id=my_id)
    .update(field_to_be_updated=Case(
        When(my_field=True, then=Value(get_new_license_string())),
        default=Value(''),
        output_field=models.CharField())))
If you need to use an F object, just reference it on the right hand side of the equals in the update expression.
The update doesn't necessitate the use of transaction.atomic() context manager but if you need to do any other database operations you should continue to wrap that code with transaction.atomic()
Edit:
You may also like to use the queryset select_for_update method, which takes row locks when the queryset is evaluated (see the docs).
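The compare-and-exchange itself comes down to a single conditional UPDATE whose affected-row count tells you whether the swap happened. In Django that would presumably be something like LicenseCode.objects.filter(pk=..., was_redeemed=False).update(was_redeemed=True), since update() returns the number of rows matched. The sketch below shows the same idea in plain SQL with the standard-library sqlite3 module so that it runs on its own; the license_code table and the compare_exchange function are assumptions made for illustration, not part of the original answer.

```python
import sqlite3


def compare_exchange(conn, code_id, expected, new):
    """Set was_redeemed to `new` only if it currently equals `expected`.

    Returns True if exactly one row was changed, False otherwise.
    """
    cur = conn.execute(
        "UPDATE license_code SET was_redeemed = ? "
        "WHERE id = ? AND was_redeemed = ?",
        (int(new), code_id, int(expected)),
    )
    conn.commit()
    return cur.rowcount == 1


def make_demo_connection():
    """Create an in-memory table holding one unredeemed code (id 1)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE license_code (id INTEGER PRIMARY KEY, "
        "was_redeemed INTEGER NOT NULL DEFAULT 0)"
    )
    conn.execute("INSERT INTO license_code (id) VALUES (1)")
    conn.commit()
    return conn
```

The comparison and the write happen in one statement, so two concurrent requests cannot both see the old value and both succeed; the second one simply changes zero rows.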

Flask/WTF/SQLAlchemy: using QuerySelectMultipleField with Form.populate_obj()

I have a form (WTForms via Flask-WTF) that includes a QuerySelectMultipleField, something like this:
class EditDocumentForm(Form):
    # other fields omitted for brevity
    users = QuerySelectMultipleField('Select Users',
                                     query_factory=User.query.all,
                                     get_label=lambda u: u.username)
This works great—I just instantiate the form and pass it to my template for rendering, and all the right choices are there.
However, when I POST the form back and try to suck up the data with Form.populate_obj(), I get an angry message from SQLAlchemy:
InvalidRequestError: Object '<User at 0x10a4d33d0>' is already attached to session '1' (this is '3')
The view function:
@app.route("/document/edit/<doc_id>", methods=['GET', 'POST'])
@login_required
def edit_document(doc_id):
    doc = Document.query.filter_by(id=doc_id).first()
    if (doc is not None) and (doc.user_id == current_user.id):
        form = EditDocumentForm(obj=doc)
        if request.method == "POST":
            if form.validate():
                form.populate_obj(doc)
                db.session.commit()
                return redirect('/')
            else:
                _flash_validation_errors(form)
        return render_template("edit.html", form=form)
    flash("The document you requested doesn't exist, or you don't have permission to access it.", "error")
    return(redirect('/'))
So it looks like there's one session used when the form is created, and another when I'm trying to populate my model object. This is all happening under the hood, as I'm relying on Flask-SQLAlchemy to do all the session stuff for me.
In the Document model, the user field is declared this way:
users = db.relationship('User',
                        secondary=shares,
                        backref=db.backref('shared_docs', lazy='dynamic'))
(where of course shares is an instance of SQLAlchemy.table for a many-to-many relationship).
So: am I doing something wrong, or is Form.populate_obj() the problem, or perhaps I can blame aliens? Let me rephrase: What am I doing wrong?
Edit
The workaround in this answer seems to fix the problem, namely changing my query_factory by importing my SQLAlchemy object and explicitly using its session:
query_factory=lambda: db.session.query(User)
I have to say, though, this has a weird smell to me. Is this really the best way to handle it?
It all depends on how your model classes are bound to a session. My guess is that you're not using the base class provided by Flask-SQLAlchemy, db.Model, for your Document and User models?
As stated in your edit, by not using User's query method and using db.session.query instead, you are forcing populate_obj to use the same session that you will commit later with your db.session.commit call. That said, you are probably still using another session when doing Document.query.filter_by, which most likely means you are still using two DB connections and could reduce that to one.
Overall, I would advise you to stay away from using the query method on your models (but that's because I don't like magic stuff ;) ), make sure to use Flask-SQLAlchemy's db.Model if you can and read in-depth how the framework / libraries you use work, as it's a very good habit to take, does not take a lot of time, and can significantly improve the quality and maintainability of your code.

How to paginate in Flask-SQLAlchemy for db.session joined queries?

Say, we have the following relationships:
a person can have many email addresses
an email service provider can (obviously) serve multiple email addresses
So, it's a many-to-many relationship. I have three tables: emails, providers, and users. Emails has two foreign keys, one for the provider and one for the user.
Now, given a specific person, I want to print all the email providers and the email address each one hosts for this person, if it exists. (If the person does not have an email at Gmail, I still want Gmail to be in the result. Otherwise, I believe I would only need an inner join to solve this.)
I figured out how to do this with the following subqueries (following the sqlalchemy tutorial):
email_subq = db.session.query(Emails).\
    filter(Emails.user_id==current_user.id).\
    subquery()
provider_and_email = db.session.query(Provider, email_subq).\
    outerjoin(email_subq, Provider.emails).\
    all()
This works okay (it returns a 4-tuple of (Provider, user_id, provider_id, email_address), all the information that I want), but I later found out this is not using the Flask BaseQuery class, so that pagination provided by Flask-SQLAlchemy does not work. Apparently db.session.query() is not the Flask-SQLAlchemy Query instance.
I tried to do Emails.query.outerjoin[...] but that returns only columns in the email table though I want both the provider info and the emails.
My question: how can I do the same thing with Flask-SQLAlchemy so that I do not have to re-implement pagination that is already there?
I guess the simplest option at this point is to implement my own paginate function, but I'd love to know if there is another proper way of doing this.
I'm not sure if this is going to end up being the long-term solution, and it does not directly address my concern about not using the Flask-SQLAlchemy's BaseQuery, but the most trivial way around to accomplish what I want is to reimplement the paginate function.
And, in fact, it is pretty easy to use the original Flask-SQLAlchemy routine to do this:
def paginate(query, page, per_page=20, error_out=True):
    if error_out and page < 1:
        abort(404)
    items = query.limit(per_page).offset((page - 1) * per_page).all()
    if not items and page != 1 and error_out:
        abort(404)
    # No need to count if we're on the first page and there are fewer
    # items than we expected.
    if page == 1 and len(items) < per_page:
        total = len(items)
    else:
        total = query.order_by(None).count()
    return Pagination(query, page, per_page, total, items)
Modified from the paginate function found around line 376: https://github.com/mitsuhiko/flask-sqlalchemy/blob/master/flask_sqlalchemy.py
Your question is how to use Flask-SQLAlchemy's Pagination with regular SQLAlchemy queries.
Since Flask-SQLAlchemy's BaseQuery object holds no state of its own, and is derived from SQLAlchemy's Query, and is really just a container for methods, you can use this hack:
from flask_sqlalchemy import BaseQuery  # formerly flask.ext.sqlalchemy
def paginate(sa_query, page, per_page=20, error_out=True):
    sa_query.__class__ = BaseQuery
    # We can now use BaseQuery methods like .paginate on our SA query
    return sa_query.paginate(page, per_page, error_out)
To use:
@route(...)
def provider_and_email_view(page):
    provider_and_email = db.session.query(...)  # any SQLAlchemy query
    paginated_results = paginate(provider_and_email, page)
    return render_template('...', paginated_results=paginated_results)
*Edit:
Please be careful doing this. It's really just a way to avoid copying/pasting the paginate function, as seen in the other answer. Note that BaseQuery has no __init__ method. See How dangerous is setting self.__class__ to something else?.
*Edit2:
If BaseQuery had an __init__, you could construct one using the SA query object, rather than hacking .__class__.
Hey, I have found a quick fix for this. Here it is:
provider_and_email = Provider.query.with_entities(email_subq).\
    outerjoin(email_subq, Provider.emails).paginate(page, POST_PER_PAGE_LONG, False)
I'm currently using this approach:
query = BaseQuery([Provider, email_subq], db.session())
to create my own BaseQuery. db is the SqlAlchemy instance.
Update: as @afilbert suggests you can also do this:
query = BaseQuery(provider_and_email.subquery(), db.session())
How do you init your application with SQLAlchemy?
Probably your current SQLAlchemy connection has nothing to do with flask.ext.sqlalchemy and you are using the original sqlalchemy.
Check this tutorial and check your imports to make sure they really come from flask.ext.sqlalchemy:
http://pythonhosted.org/Flask-SQLAlchemy/quickstart.html#a-minimal-application
You can try to paginate the list with results.
my_list = [my_list[i:i + per_page] for i in range(0, len(my_list), per_page)][page]
I did this and it works:
query = db.session.query(Table1, Table2, ...).filter(...)
if page_size is not None:
    query = query.limit(page_size)
if page is not None:
    query = query.offset(page*page_size)
query = query.all()
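The answers above differ in how they number pages: the paginate function computes the offset as (page - 1) * per_page, so pages start at 1, while the limit/offset snippet uses page * page_size, so pages start at 0. A small sketch on a plain list makes the 1-indexed arithmetic easy to check; page_slice is a hypothetical helper, not part of Flask-SQLAlchemy.

```python
def page_slice(items, page, per_page):
    """Return the items on a 1-indexed page, mirroring LIMIT/OFFSET."""
    if page < 1 or per_page < 1:
        raise ValueError("page and per_page must be at least 1")
    offset = (page - 1) * per_page  # same arithmetic as .offset() above
    return items[offset:offset + per_page]


rows = list(range(1, 8))           # seven fake result rows: 1..7
second_page = page_slice(rows, page=2, per_page=3)
last_page = page_slice(rows, page=3, per_page=3)
```

Slicing a list that has already been loaded only makes sense for small result sets; for anything larger, the limit and offset belong in the query itself so that the database does the work.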
I could be wrong, but I think your problem may be the .all(). By using that, you're getting a list, not a query object.
Try leaving it off, and pass your query to the pagination method like so (I left off all the subquery details for clarity's sake):
email_query = db.session.query(Emails).filter_by(**filters)
email_query.paginate(page, per_page)
