Proper way to establish persistent complex object in Flask - python

I am looking for the "good practice" advice on how to handle a persistent object in Flask.
I have my own classes that handle user, groups, user membership in groups and user/group permissions. Among those, there is a Passport class that holds information about the current user and their permissions.
The idea is that each user session should be associated with its own Passport object that persists across views, so that certain permissions can be initialized upon user login and checked later while using the views and performing AJAX requests.
Currently I have serialize and deserialize methods in the Passport class, and a FlaskPassport class, initialized in the views.py global scope, that has a read-only "passport" property which reads the serialized passport data from a session variable and returns the deserialized object. It also has a save() method that does the opposite, and a decorator method for views that checks permissions. The code that gives access to the passport data stored in serialized form in the session looks pretty clumsy, and the fact that the passport object has to be saved manually after alteration doesn't seem right: Flask should save the altered passport object to the session automatically after the request is processed.
So, I am looking for some clever pattern that would give access to a global passport object, accessible to all views, and also let me add decorators to the views that need permission checking.

There are multiple ways to do this, including:
Storing the passport instance on g and using a before_request and after_request handler pair to hydrate / serialize the instance from / to the session:
@app.before_request
def load_passport():
    if "passport_id" in session:
        g.passport = create_passport_from_id(session["passport_id"])

@app.after_request
def serialize_passport(response):
    if hasattr(g, "passport"):
        session["passport_id"] = g.passport.id
    return response
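A view can then simply read the passport back off g; has_permission() here is an assumed method on your Passport class:

from flask import g, abort

@app.route("/reports")
def reports():
    passport = getattr(g, "passport", None)  # set by load_passport() above
    if passport is None or not passport.has_permission("view_reports"):
        abort(403)
    return "report data"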
Using the thread-local pattern that Flask uses for request and g (among others). Under the hood this uses Werkzeug's LocalProxy, which is mounted on either the application context or the request context (depending on the lifetime of the underlying object):
from flask import (_request_ctx_stack as request_ctx,
                   has_request_context, session)
from werkzeug.local import LocalProxy

def get_passport():
    if has_request_context() and "passport_id" in session:
        if not hasattr(request_ctx.top, "passport"):
            passport_id = session["passport_id"]
            request_ctx.top.passport = construct_passport_from_id(passport_id)
        return getattr(request_ctx.top, "passport", None)
    return EmptyPassport()

current_passport = LocalProxy(get_passport)
@app.after_request
def serialize(response):
    if current_passport.is_not_empty():
        session["passport_id"] = current_passport.id
    return response
It is worth noting that I have chosen not to serialize the entire passport to the session, since that information is passed back and forth with every request (depending on how much information you are storing in your passport, this may or may not be something that concerns you).
It is also worth noting that neither of these approaches is inherently secure. Flask does sign the session cookie to make it tamper proof, but you'll still need to worry about logout, freshness, etc. Take a look at Flask-Login's code for some of the other things you'll want to think about.
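The question also mentions decorating views that need a permission check. A minimal sketch on top of current_passport from the second approach could look like this; has_permission() is an assumed method on your Passport class:

from functools import wraps
from flask import abort

def passport_required(permission):
    def decorator(view):
        @wraps(view)
        def wrapped(*args, **kwargs):
            # current_passport resolves per request via the LocalProxy above;
            # has_permission() is an assumed method on your Passport class.
            if not current_passport.has_permission(permission):
                abort(403)
            return view(*args, **kwargs)
        return wrapped
    return decorator

@app.route("/admin")
@passport_required("manage_users")
def admin_view():
    return "only users with the manage_users permission get here"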

Related

How To Create Template Based Dynamic URL with Flask

I have a Flask route like this:
@app.route('/product/<string:slug>')
def product(slug):
    # some code...
    return render_template('product.html', product=product)
Different clients use the project (different websites, same infrastructure), and every customer wants the product URL to be different, like:
asite.com/product-nike-shoe-323
bsite.com/nike-shoe
csite.com/product/nike-shoue
and so on.
How do I set the URL structure to come from the database?
like:
url_config = "product-{product_name}-{product_id}"
or
url_config = "product-{product_id}"
Note: please, no redirects.
I’m not 100% clear on what you refer to when you say “database” here. From context I infer you may be talking about the Flask Config object. If that’s the case, you can simply register your view function right after setting up the app configuration. Just call app.add_url_rule() to register the URL pattern from the configuration to point to your view function of choice.
If, however, you are talking about a SQL or NoSQL database and you have built a web UI to register routes, then don't despair. Flask routes can be registered with the app object at any point. There is no point in the Flask app lifecycle after which you can no longer register a route!
All that registering a route does is create a mapping between a URL template and an endpoint name, an opaque string. Most of the time, you also register a function to be called to handle the specific endpoint, and most of the time, Flask infers the endpoint name from the function. Once registered in the mapping, any subsequent incoming request can be routed to the function for the given endpoint.
So, Flask keeps two maps:
from url route -> endpoint name: Flask.url_map
from endpoint name -> function: Flask.view_functions
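Both are ordinary attributes you can inspect on a running app, for example:

print(app.url_map)          # werkzeug routing Map: URL rules -> endpoint names
print(app.view_functions)   # dict: endpoint name -> view function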
That said, there is no API for removing or changing URL registrations (other than restarting your server, of course). You can't change the URL route, the endpoint name for a given route, or which endpoint maps to which function. The intention of the framework is that you register your routes early on, when first starting your server, via code that runs directly when imported or when bound to the app (Blueprints and Flask extensions do the latter). The majority of Flask apps will create their Flask instance, register all their routes and extensions, then pass the instance to the WSGI server for request dispatch, and that's it. But there is nothing in the implementation stopping you from registering more routes after this point.
If you want to register URL routes from database information, you have to take care of at least the following two things:
Register existing routes at start-up. Once you have a connection to your database established, retrieve the existing routes and register them.
If a new entry is added to the database, register a new route.
First of all: if I were to implement something like this I’d use one view function. You can always figure out what url rule was matched and what endpoint name this mapped to by looking at request.url_rule and request.endpoint, respectively.
Next, I’d explicitly generate endpoint names for each url rule from the database. Use the primary key in the name; you want to be able to find the database row from the endpoint name and vice versa. How you do this is up to you; let’s assume you know how to do this, and you have two functions for this named pk_from_endpoint() and endpoint_from_pk().
Your view function can then look like this:
from flask import request

def product_request(**kwargs):
    key = pk_from_endpoint(request.endpoint)
    row = database_query(key)
    # … process request
You register a route for a given database row with:
app.add_url_rule(row.url_config, endpoint_from_pk(row.id), product_request)
As mentioned, you can’t change URL registrations. But, as long as changes to these URLs are infrequent you could always add new registrations and for any old entries use abort(404) to return a 404 Not Found response.
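Putting the pieces of this answer together, a rough, runnable sketch could look like this; URL_ROWS stands in for your database table, and the helper names are the hypothetical ones used above:

from flask import Flask, request

app = Flask(__name__)

# Stand-in for your database table of URL configurations.
URL_ROWS = {
    1: {"id": 1, "url_config": "/product-<string:product_name>-<int:product_id>"},
    2: {"id": 2, "url_config": "/<string:product_name>"},
}

def endpoint_from_pk(pk):
    # Hypothetical helper: derive the endpoint name from the primary key.
    return "product_%d" % pk

def pk_from_endpoint(endpoint):
    # Inverse of endpoint_from_pk().
    return int(endpoint.rsplit("_", 1)[1])

def product_request(**kwargs):
    # One shared view function; the matched endpoint tells us which row applies.
    row = URL_ROWS[pk_from_endpoint(request.endpoint)]
    return "matched %s with args %r" % (row["url_config"], kwargs)

def register_product_route(row):
    app.add_url_rule(row["url_config"], endpoint_from_pk(row["id"]), product_request)

# At start-up: register every route already present in the "database".
for row in URL_ROWS.values():
    register_product_route(row)

# When a new row is added later, call register_product_route(new_row) again;
# as noted above, nothing stops you from registering more routes at that point.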
That's not possible with Flask's routing system. The URL map is supposed to be defined at startup and not change after that.
However, if you have some specific path where you need the dynamic parts (e.g. /product/WHATEVER), then you can register a route for /product/<slug> and query the database within your view function.
That said, if you REALLY want URL rules in a DB, and do not mind connecting to your database during startup (usually that's ugly), then nothing stops you from querying the database at startup time and defining the URL rules based on data from the DB. Quite ugly, but doable.
Example:
with app.app_context():
    url_map = {u.endpoint: u.rule for u in URLRules.query}

@app.route(url_map['foo'])
def foo():
    ...
Of course doing so makes it harder to nicely structure your app unless you use app.add_url_rule() for all the endpoints in a single place instead of the @app.route() decorators.
Likewise with blueprints of course.

When are Flask Resources created?

I'm new to Flask. I have a resource class that handles POST requests. The request processing is quite elaborate, and can't be all in the post function. Do I get a new Resource instance for each request? Or are instances reused? Is it safe for me to do something like:
class MyResource(Resource):
    def post(self):
        self.var = 17
        self.do_some_work()
        return self.var * self.var
Does Flask guarantee that my resource instance will not be used for other transactions?
Resource objects are created at the time the request should be served and they are not persistent. Keep in mind that REST principles say that APIs must be stateless. If you want to store data between requests, you should use some kind of database.
The simplest way to verify this is to put a print(id(self)) in your handler and trigger the request a few times. You will see that the id changes every time.
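For instance, a quick check along those lines (the flask_restful import path is assumed):

from flask import Flask
from flask_restful import Api, Resource

app = Flask(__name__)
api = Api(app)

class MyResource(Resource):
    def post(self):
        # A new MyResource instance is built for every request, so this
        # prints a different id each time you POST to /my-resource.
        print(id(self))
        self.var = 17          # safe: lives only for this single request
        return {"result": self.var * self.var}

api.add_resource(MyResource, "/my-resource")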
Now, if you are interested about Flask internals, here we go.
The class Resource is part of Flask-RESTful and the documentation states the following:
Resources are built on top of Flask pluggable views, giving you easy access to multiple HTTP methods just by defining methods on your resource.
Resources are added with the method Api.add_resource(), which simply registers the underlying View object:
if self.app is not None:
    self._register_view(self.app, resource, *urls, **kwargs)
else:
    self.resources.append((resource, urls, kwargs))
The Api._register_view() method does a lot of crazy stuff, but the most meaningful parts are these two lines:
resource_func = self.output(resource.as_view(endpoint, *resource_class_args, **resource_class_kwargs))
...
self.blueprint_setup.add_url_rule(url, view_func=resource_func, **kwargs)
Here you can see that the view object provides a handler that will be associated with the URL path. This handler will be called every time an HTTP request is made to this route.
Finally, we arrive at the core: the View.as_view() method creates a function on the fly, and this function becomes the route handler.
def view(*args, **kwargs):
    self = view.view_class(*class_args, **class_kwargs)
    return self.dispatch_request(*args, **kwargs)
As you can see, this function creates a new object every time a request must be dispatched and, as you already guessed, view_class holds your custom class for handling the requests.

Django: app level variables

I've created a Django-rest-framework app. It exposes some API which does some get/set operations in the MySQL DB.
I have a requirement of making an HTTP request to another server and piggyback this response along with the usual response. I'm trying to use a self-made HTTP connection pool to make HTTP requests instead of making new connections on each request.
What is the most appropriate place to keep this app level HTTP connection pool object?
I've looked around and there are multiple solutions, each with some cons. Here are some:
To make a singleton class for the pool in a different file, but this is not a very Pythonic way to do things. There are various discussions on why not to use the singleton design pattern.
Also, I don't know how intelligent it would be to pool a pooler? (:P)
To keep it in __init__.py of the app dir. The issues with that are as follows:
It should only contain imports & things related to that.
It will be difficult to unit test the code because the import would happen before mocking and it would actually try to hit the API.
To use sessions, but I guess that makes more sense for something user-session specific, like a user-specific number, etc.
Also, the object needs to be serializable. I don't know how HTTP Connection pool can be serialized.
To keep it global in views.py, but that is also discouraged.
What is the best place to store such app/global level variables?
This thread is a bit old but can still be googled. Generally, if you want a component to be accessible among several apps in your Django project, you can put it in a general or core app as a util or whatever.
In terms of reusability and app-specific pooling, you can use a factory with a cache mechanism, something like:
from dataclasses import dataclass, field

class ConnectionPool:
    pass

@dataclass
class ConnectionPoolFactory:
    connection_pool_cache: dict[str, ConnectionPool] = field(default_factory=dict)

    def get_connection(self, app_name: str) -> ConnectionPool:
        if self.connection_pool_cache.get(app_name) is None:
            self.connection_pool_cache[app_name] = ConnectionPool()
        return self.connection_pool_cache[app_name]
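Usage could then look something like this; the module path and app names are just placeholders:

# e.g. core/pools.py: one factory instance shared by the whole project
pool_factory = ConnectionPoolFactory()

# in any app's code:
pool = pool_factory.get_connection("billing")
same_pool = pool_factory.get_connection("billing")
assert pool is same_pool   # the pool is created once per app name and reused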
A possible solution is to implement a custom Django middleware, as described in https://docs.djangoproject.com/ja/1.9/topics/http/middleware/.
You could initialize the HTTP connection pool in the middleware's __init__ method, which is only called once, at the first request. Then start the HTTP request during process_request and, in process_response, check that it has finished (or wait for it) and append that response to the internal one.
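A rough sketch of that idea, using the old-style (pre-1.10) middleware API the linked docs describe; urllib3's PoolManager, the target URL, and the piggyback header are all assumptions:

from concurrent.futures import ThreadPoolExecutor

import urllib3  # assumption: urllib3's PoolManager serves as the connection pool

class PiggybackMiddleware:
    def __init__(self):
        # Old-style middleware: __init__ runs once (at the first request),
        # so both the pool and the worker threads are created a single time.
        self.pool = urllib3.PoolManager(maxsize=10)
        self.executor = ThreadPoolExecutor(max_workers=4)

    def process_request(self, request):
        # Kick off the side request in the background and stash the future.
        request.piggyback_future = self.executor.submit(
            self.pool.request, "GET", "https://other-server.example/api/info"
        )
        return None  # continue normal view processing

    def process_response(self, request, response):
        future = getattr(request, "piggyback_future", None)
        if future is not None:
            side_response = future.result(timeout=5)  # wait for it if necessary
            # Hypothetical way of piggybacking: expose it as an extra header.
            response["X-Piggyback-Status"] = str(side_response.status)
        return response

# settings.py (old-style, module path is just an example):
# MIDDLEWARE_CLASSES = [..., "myapp.middleware.PiggybackMiddleware"]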

GAE: cookies vs datastore

While looking for a good and efficient way to have a session in my app, I found GAE Boilerplate and GAE Sessions.
GAEB is amazing but far too vast for my needs: I don't need federated login nor a default User structure, but I like the design structure and the way they solved some issues (routes, forms, ...).
GAES is quite simple but powerful for handling sessions. What I like most is the way it stores everything in a cookie; in this case, it stores the full user entity in a cookie, so on subsequent page calls no other datastore hits are made: the user data is always read from the cookie (this obviously requires updating the cookie if the user changes something, which is not usual).
On the other hand, GAEB stores only the user ID and then retrieves, on each page call, the username and the user email. This is part of the code for the BaseHandler it uses (GAEB uses the NDB model):
@webapp2.cached_property
def username(self):
    if self.user:
        try:
            user_info = models.User.get_by_id(long(self.user_id))
            if not user_info.activated:
                self.auth.unset_session()
                self.redirect_to('home')
            else:
                return str(user_info.username)
        except AttributeError, e:
            # avoid AttributeError when the session was deleted from the server
            logging.error(e)
            self.auth.unset_session()
            self.redirect_to('home')
    return None
Same for email, and in the render_template function it does this:
def render_template(self, filename, **kwargs):
    # .... some code .....
    # set or overwrite special vars for jinja templates
    kwargs.update({
        'app_name': self.app.config.get('app_name'),
        'user_id': self.user_id,
        'username': self.username,
        'email': self.email,
        # ... more vars ...
    })
    kwargs.update(self.auth_config)
It seems that this reads from the datastore twice (once for username and once for email), because each of these functions calls models.User.get_by_id(long(self.user_id)).
The only thing I don't know exactly what it means is the @webapp2.cached_property decorator; maybe it means that all these datastore reads are served from a cache and don't really hit the datastore.
Can someone tell me which is the better solution to save hits to the datastore? It seems better to have all the user data in a (properly secured) cookie and not hit the datastore on every page call, but maybe I'm mistaken (I'm relatively new to GAE) and all these datastore reads are cached and therefore free.
Saving session data in the cookie is highly discouraged:
It has to be transferred with each request (slow on mobile connections)
The HTTP header size you can transfer to GAE is limited (64 KB if I remember correctly); that's the upper bound on the data you could store
Even if you encrypt and sign your session, you would still be vulnerable to replay attacks (you cannot perform a logout safely)
I don't know the implementations you mentioned, but we have a session implementation in our CMS, see https://bitbucket.org/viur/server/src/b2e9e3dca3adabee97e1761d630600a387a02c44/session.py?at=master .
The general idea is to generate a random string (used as a session identifier).
On session load, a datastore "get by key" is performed (which is cached, so if that object is still in memcache it won't hit the datastore at all).
Saving is only performed if the data stored inside the session changed or the session has not been updated for the last 5 minutes.
Then you can copy the values of your user object into the session and won't need an additional datastore request.
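A very rough sketch of that idea on plain NDB (this is not the linked CMS code; the model and helpers are made up for illustration, assuming the Python 2 GAE runtime):

import binascii
import datetime
import os

from google.appengine.ext import ndb

SAVE_INTERVAL = datetime.timedelta(minutes=5)

class Session(ndb.Model):
    # NDB caches get-by-key reads in memcache automatically, so loading an
    # existing session usually doesn't hit the datastore at all.
    data = ndb.JsonProperty()
    updated = ndb.DateTimeProperty(auto_now=True)

def new_session():
    # Random string used as the session identifier (and as the entity key id).
    key = binascii.hexlify(os.urandom(16))
    session = Session(id=key, data={})
    session.put()
    return key, session

def load_session(key):
    return Session.get_by_id(key)

def save_session(session, changed):
    # Only write if the data changed or the session hasn't been touched
    # for more than five minutes.
    stale = datetime.datetime.utcnow() - session.updated > SAVE_INTERVAL
    if changed or stale:
        session.put()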

Somewhat confused about how auth and sessions work in Python/Django

I'm using an existing database for my latest Django project, so unless I change my Models or the Django auth code, it's going to be rather difficult to merge the two.
Rather than messing with the existing auth backend, I'm planning on just writing my own authentication app.
Anyway, all of my previous authentication apps have been written in PHP, where basically I just throw everything into session variables and verify them on every page. Here's what I'm a bit confused about: it appears that when a user is authenticated/logged in, the entire user is added to a session, but I can't figure out where or how that is occurring.
In the default Django login function, it's assigning user to request.user ... is this being saved as a session variable somehow or is it just passed into the next view? If it is just being passed to the next view, how are future requests authenticated without requiring further login requests?
The default Django auth login is below..
def login(request, user):
    """
    Persist a user id and a backend in the request. This way a user doesn't
    have to reauthenticate on every request.
    """
    if user is None:
        user = request.user
    # TODO: It would be nice to support different login methods, like signed cookies.
    if SESSION_KEY in request.session:
        if request.session[SESSION_KEY] != user.id:
            # To avoid reusing another user's session, create a new, empty
            # session if the existing session corresponds to a different
            # authenticated user.
            request.session.flush()
    else:
        request.session.cycle_key()
    request.session[SESSION_KEY] = user.id
    request.session[BACKEND_SESSION_KEY] = user.backend
    if hasattr(request, 'user'):
        request.user = user
    user_logged_in.send(sender=user.__class__, request=request, user=user)
I also tried to follow user_logged_in.send(), which is in django.dispatch.dispatcher.send, but I'm not entirely sure what that's supposed to do either.
def send(self, sender, **named):
    """
    Send signal from sender to all connected receivers.
    If any receiver raises an error, the error propagates back through send,
    terminating the dispatch loop, so it is quite possible to not have all
    receivers called if a raises an error.
    Arguments:
        sender
            The sender of the signal. Either a specific object or None.
        named
            Named arguments which will be passed to receivers.
    Returns a list of tuple pairs [(receiver, response), ... ].
    """
    responses = []
    if not self.receivers:
        return responses
    for receiver in self._live_receivers(_make_id(sender)):
        response = receiver(signal=self, sender=sender, **named)
        responses.append((receiver, response))
    return responses
Basically what I'm looking for is for someone to explain an efficient way to save user session data in Python that does not depend on the Django framework. A little run-through of the Django authentication would be nice as well.
HTTP is stateless; regardless of the server, framework, or language used, there is no intrinsic way for an HTTP client to say "this request is part of that session". That's part of the design of HTTP.
So sessions are always a feature of the web application, either supported by the web app framework or implemented in the app itself. The most usual way for a stateful session to be created from the stateless protocol is with cookies: clients will store cookies at the request of a server and return those same cookies to that server in future requests.
Session data can be serialized and stored in the cookie itself, but that's both insecure (secret information could be forged or eavesdropped) and inefficient (it takes bandwidth even though the individual bytes are of no use to the client), so the preferred solution is an opaque (and even better, single-use) session key stored in a cookie, with the web application storing the session data out of band: either in memory, in the filesystem, in a database backend, or some other option.
Django takes care of most of this transparently in "middleware", modules that modify incoming requests and outgoing responses. The auth middleware will read a cookie and check whether it represents a logged-in user, and if so, add a user object to the request; it also attaches cookies to responses when a user gets logged in. The session middleware works in a similar fashion: checking for a cookie, reading the session data from wherever it was stored between requests, and also grabbing session data from responses and storing it, along with setting a cookie to associate the client's session with the session data it just stored.
Since both of these features are useful independently of each other (I tend to avoid sessions, but usually use some kind of authentication), they do not depend on each other. Authentication is implemented in a similar manner to sessions, but authenticated users are not stored in "the session", nor are sessions attached to "the authenticated user".
You might not think so, but django's authentication system is designed to be extended; if you already have a database of valid users you'd like to authenticate against, it's very simple to add a new auth backend that dovetails neatly into the standard django auth application (which means you can also use other applications that depend on it in turn).
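Since the question mentions authenticating against an existing user database, here is a minimal sketch of such a backend; LegacyDatabaseBackend and check_legacy_credentials() are hypothetical names, and note that recent Django versions pass request to authenticate() while older ones omit it:

from django.contrib.auth.models import User

def check_legacy_credentials(username, password):
    # Placeholder: replace with a lookup against your existing user table.
    return False

class LegacyDatabaseBackend(object):
    def authenticate(self, request, username=None, password=None):
        if not check_legacy_credentials(username, password):
            return None
        # Mirror the legacy user into Django's auth table so the rest of the
        # framework (sessions, request.user, permissions) can work with it.
        user, _created = User.objects.get_or_create(username=username)
        return user

    def get_user(self, user_id):
        # Called on every request to turn the id stored in the session
        # back into a User object.
        try:
            return User.objects.get(pk=user_id)
        except User.DoesNotExist:
            return None

# settings.py (module path is just an example):
# AUTHENTICATION_BACKENDS = ["myproject.auth_backends.LegacyDatabaseBackend"]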
