User authentication: prepare vs get_current_user in tornado - python

I need to authenticate a user from a cookie in an application running on Tornado. I need to parse the cookie and load the user from the DB using the cookie's contents. Checking the Tornado RequestHandler documentation, there are two ways of doing it:
by overriding the prepare() method of the RequestHandler class.
by overriding the get_current_user() method of the RequestHandler class.
I'm confused with the following statement:
Note that prepare() may be a coroutine while get_current_user() may
not, so the latter form is necessary if loading the user requires
asynchronous operations.
I don't understand 2 things in it:
What does the doc mean by saying that get_current_user() may not be a coroutine? What does "may not" mean here? Either it can be a coroutine, or it can't.
Why is the latter form, i.e. get_current_user(), necessary if loading the user requires async operations? If prepare() can be a coroutine and get_current_user() may not be, then shouldn't prepare() be the one used for async operations?
I would really appreciate any help with this.

Here, "may not be a coroutine" means "is not allowed to be a coroutine" or "must not be a coroutine". The language used is confusing and it should probably be changed to say "must not".
Again, the docs are confusing: in this sentence prepare() is mentioned first, but the two examples preceding it show get_current_user() first. "Latter" refers to the second of those examples, which uses prepare().
So in summary, it always works to override prepare() and set self.current_user, whether you need a coroutine or not. If you don't need a coroutine to get the current user, you can override get_current_user() instead and it will be called automatically the first time self.current_user is accessed. It doesn't really matter which one you choose; you can use whichever feels more natural to you. (The reason there are two different methods is historical: get_current_user() is older, and a different method was needed once coroutines were involved.)
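For illustration, a minimal sketch of both approaches; load_user_from_db and lookup_user_sync are hypothetical helpers, not Tornado APIs:

import tornado.web

async def load_user_from_db(user_id):
    ...  # hypothetical: look the user up asynchronously

def lookup_user_sync(user_id):
    ...  # hypothetical: blocking lookup

class CoroutineHandler(tornado.web.RequestHandler):
    async def prepare(self):
        # prepare() may be a coroutine, so async DB calls are fine here
        user_id = self.get_secure_cookie("user_id")
        if user_id:
            self.current_user = await load_user_from_db(user_id)

class SyncHandler(tornado.web.RequestHandler):
    def get_current_user(self):
        # get_current_user() must not be a coroutine, so only a
        # blocking (synchronous) lookup is possible here
        user_id = self.get_secure_cookie("user_id")
        return lookup_user_sync(user_id) if user_id else None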

1.
The recommended way of getting the current user is the RequestHandler.current_user property. This property is actually implemented as a function: it returns RequestHandler._current_user if set, otherwise it tries to set it with a call to get_current_user().
Because current_user is a property, it can't be yielded, and therefore get_current_user() can't be a coroutine function.
Of course you could read the cookie, call the DB, and authenticate the user in get_current_user(), but only in a blocking (sync) manner.
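To make that concrete, here is a rough sketch of how such a lazy property works (a simplification, not Tornado's actual code):

class RequestHandler:
    @property
    def current_user(self):
        # A property body is invoked implicitly on attribute access,
        # so there is no place to put a yield/await here.
        if not hasattr(self, "_current_user"):
            self._current_user = self.get_current_user()
        return self._current_user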
2.
In the doc you quoted, the latter example is the one with prepare().

Related

How to define a "limit test context" using TDD?

How can I define the "limit context" of my tests?
I ask because of mocks: my service integrates many other libraries (RabbitMQ, Redis, etc.) and instances of several classes. As a result, most of my test-writing time goes into creating "complex" mocks just to make the service testable.
Is it possible to define this limit? Would it be acceptable not to test these "expensive" methods, and only test the "simple" related methods whose parameters are easy to supply, like a sum(a, b)?
Obviously, more test coverage is better, but a lot of time goes into writing tests of questionable usefulness.
TDD works very well for well-defined methods like sum(a, b), but when a method receives instances of, or makes use of, other integrated services, the mocks get most of the attention rather than the method's actual purpose.
Imagine this service/method to test:
class ClientService:
    def ok_here_is_the_discount(self, some_args):
        # We received the discount for this signature type; now we can
        # calculate it and reply to the broker.
        self.calculate_user_discount_based_on_signature_type(some_args.discount_value)

    def calculate_user_discount_based_on_signature_type(self, some_args):
        # Here we send the result to the broker.
        some_message_broker_publisher(
            some_args.value - some_args.discount_signature
        )

    def request_the_signature_type_discount_in_another_service(self, some_args):
        # We received the client that has the signature type; now we need
        # to ask the signature service what the discount for that
        # signature type is.
        some_message_broker_publisher(
            queue='signature.service.what_discount_for_this_signature',
            signature_type=some_args.client.signature_type
        )
        # This message goes to the broker, and signature.service receives it.

    def method_that_receive_messages(self, some_args):
        # Here we receive the request to calculate a discount, but we only
        # get the client (dict/class/instance) with the signature type.
        # We don't yet know the discount value, because the signature
        # service is the one responsible for managing signatures.
        if some_args.message_type == 'please_calculate_discount':
            self.request_the_signature_type_discount_in_another_service(some_args.client)
        if some_args.message_type == 'message_from_signature_discount':
            self.ok_here_is_the_discount(some_args.value)
1 - It receives a 'please_calculate_discount' message and calls the method
self.request_the_signature_type_discount_in_another_service(some_args.client)
but it still does not have the discount value, because that belongs to the signature service (the message goes to the broker).
2 - Suppose the signature service responds with 'message_from_signature_discount'; it then calls the method:
self.ok_here_is_the_discount(some_args.value)
3 - The method ok_here_is_the_discount receives the discount and calls
calculate_user_discount_based_on_signature_type()
which now has the values needed to calculate the discount, and sends the result to the broker.
Do you see the complexity of these tests (TDD)? Do I need to test method_that_receive_messages, mocking all the related nested actions, or just test the directly related methods, like calculate_user_discount_based_on_signature_type?
Or, in this case, is it better to use a real broker so this can be tested?
Well, it is easy to mock things in Python, but it still comes with some overhead, of course. What I have tended towards over many years now is to set up integration tests with mocks that test the happy path through the system, and then design the system with side-effect-free functions as much as possible and throw unit tests at them. Then you can start your TDDing by setting up the overall test for the happy path, and then unit-test particulars in easy-to-test functions.
It is still useless when the thing behind the mock changes, though, but it gives a great sense of security when refactoring.
The mock libraries in Python are fine, but sometimes it is easier to just write your own and replace the real thing with patch. Pretty sweet, actually.
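As a sketch of that patch approach, here is how the calculation method from the question could be unit-tested with the broker mocked out (assuming, hypothetically, that ClientService and some_message_broker_publisher live in a module called myapp.services):

from unittest import mock

from myapp.services import ClientService  # hypothetical module path

def test_calculate_user_discount_publishes_result():
    args = mock.Mock(value=100, discount_signature=20)
    # Replace the broker publisher with a mock for the duration of the test
    with mock.patch("myapp.services.some_message_broker_publisher") as publisher:
        ClientService().calculate_user_discount_based_on_signature_type(args)
        # The method should publish value - discount_signature = 80
        publisher.assert_called_once_with(80)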

How to wrap functions in try/except and require a parameter

I want to wrap a bunch of functions in a try/except and have them email me a traceback when they fail. I'm using Django's ExceptionReporter, so I need the request object to send the traceback email. Some of the functions I want to wrap have the request object as a parameter already, but not all of them.
I was thinking about using a decorator for the try/except, but then it isn't clear that the request object is a required parameter for all the functions it decorates. Is there a better way to do this?
Edit: The functions I'm trying to wrap are all just supplementary functions that run after the core stuff required for the response is done, so I don't want to use Django's automatic emailing that happens when returning a 500 error because of an uncaught exception. I suppose that opens up the possibility of running these methods as separate processes after returning the response, but that also gets complicated in Django.
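For what it's worth, a minimal sketch of the decorator idea (ExceptionReporter and mail_admins are real Django APIs; making the request the mandatory first parameter of every wrapped function is one way to keep the requirement visible):

import functools
import sys

from django.core.mail import mail_admins
from django.views.debug import ExceptionReporter

def email_on_failure(func):
    # The wrapped function must take the request as its first positional
    # argument; the wrapper's signature documents that in one place.
    @functools.wraps(func)
    def wrapper(request, *args, **kwargs):
        try:
            return func(request, *args, **kwargs)
        except Exception:
            # Build the same traceback report Django uses for 500 emails
            reporter = ExceptionReporter(request, *sys.exc_info())
            mail_admins(
                subject='Failure in %s' % func.__name__,
                message=reporter.get_traceback_text(),
                html_message=reporter.get_traceback_html(),
            )
    return wrapper

@email_on_failure
def some_supplementary_task(request, some_arg):
    ...  # hypothetical wrapped function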

Python: How to add/initialize new global vars IN another module?

I looked up other posts on the topic and I couldn't find my situation exactly. It is in a Django app, although I believe it's purely a (newbie) Python question. Here's my situation:
Let's say I have mymodule.py where I have various constants and common functions, and at some point elsewhere in the program, I will want to add (and initialize) another attribute for mymodule (if it's not yet been added):
import mymodule

class UserView(View):
    # this method always gets called first..
    def get(self, request):
        try:
            # check if the attribute exists
            mymodule.user_data
        except AttributeError:
            # add it if it doesn't
            mymodule.user_data = mymodule.get_user_data()
        # continue on..

    # sometime later, this method is called..
    def post(self, request):
        print(mymodule.user_data)
My assumption was that once mymodule.user_data is added, it would persist as a global variable. But even though I set it in the get() method first, when I try to read it in the post() method later, I get: Error: 'module' object has no attribute 'account'
Does it need to be pre-initialized in mymodule.py, as some empty object? I may not necessarily know what type of object it will be -- how would I do it in Python? (Sorry, coming from JS -- don't shoot!)
You should not do this. Your proposed solution is very dangerous, as now all users will share the same data. You almost certainly don't want that.
For per-user data shared between requests, you should use the session.
Edit
There's no way to know if they are separate processes or not. Your server software (Apache, or whatever) will determine the number of processes to run (based on your settings), and automatically route requests between them. Each process could serve any number of requests before being killed and restarted. So, in all likelihood, two consecutive requests could indeed be served by the same process, in which case the data will indeed collide.
Note that the session data is stored on the server (only a key is stored in the user's cookie), so size shouldn't be a consideration. See the sessions documentation.
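For illustration, a minimal sketch of the session-based version of the question's view (assuming the user data is serializable with the configured session backend):

from django.views.generic import View

import mymodule

class UserView(View):
    def get(self, request):
        # Per-user data lives in the session, not in module state, so it
        # survives across processes and is never shared between users.
        if 'user_data' not in request.session:
            request.session['user_data'] = mymodule.get_user_data()
        # continue on..

    def post(self, request):
        print(request.session['user_data'])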
You should not want to do that.
But it works as "expected": just do
mymodule.variable = value
anywhere in your code.
So, yes, your example code is setting the variable in the current running program -
but then you hit the part where I said: "you should not want to do that" :-)
Because Django, when running with production settings, will behave differently from a single-process, single-thread Python application.
In this case, if the variable is not set in mymodule when you try to access it later, it may be because the access is happening in another process entirely (thus "global variables" (actually, in Python, we have "module" variables) won't work, since they are set per process).
In this particular case, since you have a function to retrieve your desired value, and you may be worried that it is expensive to compute, you should memoize it. Check the documentation on django.utils.functional.memoize (which will change to django.utils.lru_cache.lru_cache in upcoming versions; see https://docs.djangoproject.com/en/dev/releases/1.7/). That way it will be called once per process as your application serves requests from separate processes.
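In current Python this is covered by the standard library; a minimal sketch using functools.lru_cache (equivalent in spirit to the memoize utility mentioned above):

import functools

@functools.lru_cache(maxsize=None)
def get_user_data():
    # expensive lookup; the result is cached, so this body runs at
    # most once per process
    ...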
My solution (for now):
In the module mymodule.py, I initialized a dictionary: data = {}
Then in my get() method:
if 'user' not in mymodule.data:
    mymodule.data['user'] = mymodule.get_user_data()
Subsequently, I'm able to retrieve the mymodule.data['user'] object in the post() method (and presumably elsewhere in my code). Seems to work but please let me know if it's an aberration!

What is a generative method?

I'm familiar with Python generators, however I've just come across the term "generative method" which I am not familiar with and cannot find a satisfactory definition.
To put it in context, I found the term in SQLAlchemy's narrative documentation:
Full control of the “autocommit” behavior is available using the generative Connection.execution_options() method provided on Connection, Engine, Executable, using the “autocommit” flag which will turn on or off the autocommit for the selected scope.
What is a generative method? Trying to iterate the object returned by Connection.execution_options() doesn't work so I'm figuring it's something other than a standard generator.
It doesn't appear to be a common database concept, but SQLAlchemy uses the term generative in the sense of "generated by your program iteratively at runtime" (so, no connection to Python generators). An example from the tutorial:
The Query object is fully generative, meaning that most method calls
return a new Query object upon which further criteria may be added.
For example, to query for users named “ed” with a full name of “Ed
Jones”, you can call filter() twice, which joins criteria using AND:
>>> for user in session.query(User).\
...         filter(User.name == 'ed').\
...         filter(User.fullname == 'Ed Jones'):
...     print(user)
This call syntax is more commonly known as "method chaining", and the design that allows it as a "fluent interface".
So, in the case of Connection.execution_options(), "generative" means that it returns the modified connection object, so that you can chain the calls as above.
Looking at the source code of Connection.execution_options (lib/sqlalchemy/engine/base.py), all that method does is add options to the connection.
The idea is that those options influence the future behaviour of e.g. queries.
As an example:
result = connection.execution_options(stream_results=True).\
    execute(stmt)
Here, the behaviour was changed in the middle of the connection for just this query.
In a way, it "generates" or clones itself as an object that has a slightly different behaviour.
Here you can also set autocommit to True. For example:
# obtain a connection
connection = ...
# do some stuff
# for the next section we want autocommit on
autocommitting_connection = connection.execution_options(autocommit=True)
autocommitting_connection.execute(some_insert)
result = autocommitting_connection.execute(some_query)
# done with this section. Continue using connection (no autocommit)
This is what is meant with that section of the documentation. "generative method" refers to a method that returns a modified copy of the same instance that you can continue working with. This is applicable to the classes Connection, Engine, Executable.
You would have to consult the specific documentation or source code of that project to really make sure, but I would guess that it returns a modified version of some object adapted to the requirements/behaviour defined by the arguments.
The documentation states:
The method returns a copy of this Connection which references the same
underlying DBAPI connection, but also defines the given execution
options which will take effect for a call to execute().
As @zzzeek comments above, this is now documented in the SQLAlchemy glossary.
generative means:
A term that SQLAlchemy uses to refer what’s normally known as method chaining; see that term for details.
And method chaining is:
An object-oriented technique whereby the state of an object is constructed by calling methods on the object. The object features any number of methods, each of which return a new object (or in some cases the same object) with additional state added to the object.
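To make the pattern concrete, here is a toy sketch of a generative API (not SQLAlchemy's actual implementation): each call returns a new object carrying the accumulated state, which is what makes the chaining shown in the tutorial possible:

class Query:
    def __init__(self, criteria=()):
        self._criteria = tuple(criteria)

    def filter(self, condition):
        # "Generative": return a new Query carrying the extra criterion,
        # leaving the original untouched, so calls can be chained safely.
        return Query(self._criteria + (condition,))

q = Query().filter("name = 'ed'").filter("fullname = 'Ed Jones'")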

Designing an interface to a website's API

OK, I am programming a way to interface with Grooveshark (http://grooveshark.com). Right now I have a class Grooveshark with several methods: one gets a session with the server, another gets a token that is based on the session, and another is used to construct API calls to the server (the other methods use that one). Right now I use it like so (note: this uses Twisted and twisted.internet.defer):
g = Grooveshark()
d = g.get_session()
d.addCallback(lambda x: g.get_token())
## and then something like.... ##
g.search("Song")
I find this unpythonic and ugly since even after initializing the class you have to call two methods first or else the other methods won't work. To solve this, I am trying to make the method that creates API calls take care of the session and token. Currently those two methods (the session and token methods) set class variables and don't return anything (well, None). So my question is: is there a common design used when interfacing with sites that require tokens and sessions? Also, the token and session are retrieved from a server, so I can't have them run in the __init__ method (as it would either block or might not be done before an API call is made).
I find this unpythonic and ugly since even after initializing the class you have to call two methods first or else the other methods won't work.
If so, then why not put the get_session part in your class's __init__? If it always must be performed before anything else, that would seem to make sense. Of course, this means that calling the class will still return a yet-unusable instance -- that's kind of inevitable with asynchronous, event-driven programming... you don't "block until the instance is ready for use".
One possibility would be to pass the callback to perform as an argument to the class when you call it; a more Twisted-normal approach would be to have Grooveshark be a function which returns a Deferred (you'll add to the Deferred the callback to perform, and it will be called with the instance as the argument when that instance is finally ready to be used).
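A minimal sketch of that second suggestion, using the Grooveshark class and methods from the question (names assumed from the snippet above):

def make_grooveshark():
    # Returns a Deferred that fires with a fully initialized client
    # once the session and token have both been fetched.
    client = Grooveshark()
    d = client.get_session()
    d.addCallback(lambda _: client.get_token())
    d.addCallback(lambda _: client)  # hand the ready instance to callbacks
    return d

# usage:
# make_grooveshark().addCallback(lambda g: g.search("Song"))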
I would highly recommend looking at the Facebook Graph API. Just because you need sessions and some authentication doesn't mean you can't build a clean REST API. Facebook uses OAuth to handle authentication, but there are other possibilities.
