I am trying to initialize a requests.Session, to keep a connection with a webpage. However, I read that each time the session class is called, a new session is created.
How is it possible to keep the connection alive? Because with my current code, it's giving me the webpage content after I call the login method (that's OK, it shows that I logged into the page and gives me the content I want), but when I call the update method, it gives me the content from the login page again, not from the page I actually want after login.
import requests

class LoginLogout:
    # creating session
    def __init__(self):
        self.s = requests.Session()

    # login method
    def login(self, user, password, server):
        payload_data = {'user': user, 'pass': password, 'server': server}
        print(self.s.post(LOGIN_LINK, payload_data))

    # update method
    def update(self, updt_link):
        print(self.s.get(updt_link))

    def logout(self):
        response = self.s.get('...some webpage/user/logout...')
        self.s.close()
        print(response)
Here I am calling the objects:
if switch_parameter == "login":
    login_var = LoginLogout()
    login_var.login(USER, PASSWORD, SERVER)
    print('IS IT OK ?', login_var.s.get('.../login...'))  # <-OK it shows 200 result (but should I use there "s" too ?)
elif switch_parameter == "start":
    start()
elif switch_parameter == "stop":
    stop()
elif switch_parameter == "update":
    update_prem = LoginLogout()
    update_prem.update('...different/page...')
    # (am I missing here "s" ?, or I shouldnt be using it here anyway)
elif switch_parameter == "logout":
    logout()
else:
    pass
What am I doing wrong here? I just want to use login to log into the website and keep the session active, while calling update every time I need to get another page. Am I even on the right track or completely wrong?
The whole point of requests.Session is to persist state (such as cookies) between requests. In your code, you create a new session object every time you create a LoginLogout object.
You do that here:
if switch_parameter == "login":
    login_var = LoginLogout()
    ...
And you do that here:
elif switch_parameter == "update":
    update_prem = LoginLogout()
    ...
Now login_var and update_prem are obviously different objects and both have the s attribute, each holding a different requests.Session object. How do you expect the attributes of one session to be magically available to the other?
If you want to use an existing session, use it. Don't create a new one.
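To make the point concrete, here is a minimal sketch showing that two separate requests.Session objects do not share cookies. The cookie name and value are made up for illustration, and the cookie is set by hand rather than by a real login response.

```python
import requests

# Two independent sessions, like login_var.s and update_prem.s in the question.
login_session = requests.Session()
update_session = requests.Session()

# Pretend the server set a cookie during login (hypothetical name and value).
login_session.cookies.set("sessionid", "abc123")

# Only the session that "logged in" carries the cookie.
login_cookie = login_session.cookies.get("sessionid")
update_cookie = update_session.cookies.get("sessionid")

print(login_cookie)   # the value set above
print(update_cookie)  # None, because this session never saw the cookie
```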
I don't know about your actual use case of course, but from what you have presented here, it seems you need to do something like this:
scraper_obj = LoginLogout()
scraper_obj.login(USER, PASSWORD, SERVER)
...
scraper_obj.update('...')
...
scraper_obj.logout()
Since you created a wrapper around the actual requests.Session instance with LoginLogout, you should never need to deal with its s attribute directly, assuming you have methods on LoginLogout for every kind of request you want to make. You initialize it once and then use its methods to perform requests via its internal session object.
PS
You casually mentioned in a follow-up comment that you set this up as a script to be called repeatedly from the outside and depending on the parameter passed to the script, you want to either log into the site or scrape a specific page.
This shows that you either don't understand how "logging in" even works or that you don't understand how processes work. Typically some session attribute (e.g. cookie) is created on the client so that it can present it to the server to show that it is already authenticated. When using requests as an HTTP client library, this data is stored inside a requests.Session object.
When you call a Python script, you create a new process. Just because you start the same script twice in a row does not mean that one of those processes has any connection to the other. Calling the script to login once has absolutely no effect on what happens the next time you call that script to do something else. None of those earlier session attributes will be present in the second process. I hope this is clear now.
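If the script really has to be started separately for each action, the session state would have to be written somewhere outside the process and read back on the next run. The following is only a sketch under assumptions: the file name and the helper names save_cookies and load_cookies are hypothetical, and it assumes the site authenticates purely via cookies. It also stores only cookie names and values, so domain and path information is lost.

```python
import json

import requests

COOKIE_FILE = "session_cookies.json"  # hypothetical path


def save_cookies(session, path=COOKIE_FILE):
    """Write the session's cookies to disk as a plain name -> value mapping."""
    with open(path, "w") as fh:
        json.dump(requests.utils.dict_from_cookiejar(session.cookies), fh)


def load_cookies(session, path=COOKIE_FILE):
    """Load previously saved cookies into the session, if the file exists."""
    try:
        with open(path) as fh:
            session.cookies.update(json.load(fh))
    except FileNotFoundError:
        pass  # nothing saved yet, e.g. before the first login
```

The "login" run would call save_cookies(self.s) after a successful login, and every later run would call load_cookies(self.s) right after creating the session. Keep in mind that the cookie file is effectively a credential and should be protected accordingly.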
Related
I am playing around with Flask, trying to understand the details of how sessions work. I am using:
Python 3.6.1
Flask 0.12.2
Flask documentation clearly states (bold is mine):
The session object works pretty much like an ordinary dict, with the
difference that it keeps track on modifications.
This is a proxy.
...
Section on proxies mentions that (again, bold is mine):
If you need to get access
to the underlying object that is proxied, you can use the
_get_current_object() method
Thus, the underlying object (session._get_current_object()) must remain the same for the duration of a request or, as was suggested by this answer and comment, a thread. However, it does not persist within either a request or a thread.
Here is a demonstration code:
import threading
from flask import (
    Flask,
    session,
)

app = Flask(__name__)
app.secret_key = 'some random secret key'

@app.route('/')
def index():
    print("session ID is: {}".format(id(session)))
    print("session._get_current_object() ID is: {}".format(id(session._get_current_object())))
    print("threading.current_thread().ident is: {}".format(threading.current_thread().ident))
    print('________________________________')
    return 'Check the console! ;-)'
If you run the Flask application above and repeatedly visit /, the id returned by session._get_current_object() will occasionally change, while threading.current_thread().ident never changes.
This leads me to ask the following questions:
What exactly is returned by session._get_current_object()?
I get that it is an object underlying session proxy, but what this underlying object is bound to (if it is not a request and not a thread, if anything I would expect it never to change, for the simple application above)?
What exactly is returned by session._get_current_object()?
Technically speaking, it is the object referenced in the session attribute of the top-most element in the LocalStack instance named _request_ctx_stack.
This top-most element of that stack is a RequestContext that is instantiated in Flask.wsgi_app, which is called for every HTTP request. The RequestContext implements methods to push and pop itself to and from the local stack _request_ctx_stack. The push method also takes care of requesting a new session for the context.
This session is what is made available in the session proxy; the request, that the RequestContext has been initialized with, is made available via the request proxy. These two proxies are only usable inside a request context, i.e. with an active HTTP request being processed.
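The mechanism described above can be imitated with a few lines of plain Python. This is not Flask's implementation, only a simplified sketch: SimpleProxy, FakeRequestContext and context_stack are hypothetical stand-ins for LocalProxy, RequestContext and the request context stack.

```python
context_stack = []  # stand-in for the request context stack


class FakeRequestContext:
    """Creates a fresh session and pushes/pops itself, like a request context."""

    def __init__(self):
        self.session = {}

    def push(self):
        context_stack.append(self)

    def pop(self):
        context_stack.pop()


class SimpleProxy:
    """Forwards item access to whatever object the lookup returns right now."""

    def __init__(self, lookup):
        self._lookup = lookup

    def _get_current_object(self):
        return self._lookup()

    def __getitem__(self, key):
        return self._get_current_object()[key]

    def __setitem__(self, key, value):
        self._get_current_object()[key] = value


# One module-level proxy, always pointing at the top-most context's session.
session = SimpleProxy(lambda: context_stack[-1].session)


def handle_request(user):
    ctx = FakeRequestContext()
    ctx.push()
    try:
        session["user"] = user
        return session._get_current_object()
    finally:
        ctx.pop()


first = handle_request("alice")
second = handle_request("bob")
print(first, second, first is second)
```

The proxy object itself is created once, while the object it forwards to is replaced on every simulated request.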
I get that it is an object underlying session proxy, but what this
underlying object is bound to (if it is not a request and not a
thread, if anything I would expect it never to change, for the simple
application above)?
As outlined above, the request context's session, proxied by the session local proxy, belongs to the RequestContext. And it is changing with every request. As documented in Lifetime of the Context, a new context is created for each request, and it creates a new session every time push is executed.
The id of session._get_current_object() staying the same between consecutive requests is, probably, due to the new session object being created in the same memory address that the old one from the previous request occupied.
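The address reuse can be observed outside Flask as well. In the sketch below, FakeSession is a made-up stand-in for a session object. In CPython, id() is derived from the memory address, so the ids of objects that are created and freed one after another frequently repeat, although this is an implementation detail and not guaranteed.

```python
class FakeSession(dict):
    """Stand-in for a session object; not Flask's real class."""


def handle_request():
    # The session only lives for the duration of the call, so it can be
    # freed as soon as the function returns.
    session = FakeSession()
    return id(session)


# ids of short-lived objects created one after another; in CPython these
# frequently repeat because the freed memory is reused.
ids = [handle_request() for _ in range(5)]
print(ids)

# Two objects that are alive at the same time never share an id.
a, b = FakeSession(), FakeSession()
print(id(a) != id(b))  # True
```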
See also: How the Context Works section of the Flask documentation.
Here is a modified code snippet to illustrate the answer by shmee:
import threading
from flask import (
    Flask,
    session,
    request
)

app = Flask(__name__)
app.secret_key = 'some random secret key'

@app.route('/')
def index():
    print(">>> session <<<")
    session_id = id(session)
    session_object_id = id(session._get_current_object())
    print("ID: {}".format(session_id),
          "Same as previous: {}".format(session.get('prev_sess_id', '') == session_id))
    print("_get_current_object() ID: {}".format(session_object_id),
          "Same as previous: {}".format(session.get('prev_sess_obj_id', '') == session_object_id))
    session['prev_sess_id'] = session_id
    session['prev_sess_obj_id'] = session_object_id

    print("\n>>> request <<<")
    request_id = id(request)
    request_object_id = id(request._get_current_object())
    print("request ID is: {}".format(request_id),
          "Same as previous: {}".format(session.get('prev_request_id', '') == request_id))
    print("request._get_current_object() ID is: {}".format(request_object_id),
          "Same as previous: {}".format(session.get('prev_request_obj_id', '') == request_object_id))
    session['prev_request_id'] = request_id
    session['prev_request_obj_id'] = request_object_id

    print("\n>>> thread <<<")
    thread_id = threading.current_thread().ident
    print("threading.current_thread().ident is: {}".format(thread_id),
          "Same as previous: {}".format(session.get('prev_thread', '') == thread_id))
    session['prev_thread'] = thread_id

    print('-' * 100)
    return 'Check the console! ;-)'
The only obscurity left is, indeed, why session._get_current_object() sometimes remains unchanged between consecutive requests. And as suggested by shmee (bold is mine), it is:
probably, due to the new session object being created in the same memory address that the old one from the previous request occupied.
I have an API request to a third-party website that works great in the command line (from https://github.com/haochi/personalcapital):
pc = PersonalCapital()
try:
    pc.login(email, password)
except RequireTwoFactorException:
    pc.two_factor_challenge(TwoFactorVerificationModeEnum.SMS)
    pc.two_factor_authenticate(TwoFactorVerificationModeEnum.SMS, input('code: '))
    pc.authenticate_password(password)
accounts_response = pc.fetch('/newaccount/getAccounts')
accounts = accounts_response.json()['spData']
When I run the above in the command line, I get back a JSON just as intended.
However, I'd like to use it in a web app on a Flask server. So, I need to remove the command line input('code: ') for SMS confirmation. I'm thinking I'll use a form via 'POST' to get the user input.
However, if I redirect() or render_template() to send the user to the form, it interrupts my API session, and I get back a "session not authenticated" response from the API.
Server logic. Routes in question are /update (email and password first) and /authenticate (SMS confirmation form):
@app.route("/update", methods=["GET", "POST"])
@login_required
def update():
    # Via post:
    if request.method == "POST":
        # Ensure user entered email
        if not request.form.get("pc_email"):
            return apology("Please enter username", 400)
        # Ensure user entered password
        elif not request.form.get("pc_password"):
            return apology("Please enter password", 400)
        # Save email & password
        email = request.form.get("pc_email")
        password = request.form.get("pc_password")
        # Try to log in
        try:
            pc.login(email, password)
        # If 2-factor is required, send sms & redirect
        except RequireTwoFactorException:
            pc.two_factor_challenge(TwoFactorVerificationModeEnum.SMS)
            return redirect("/authenticate")
        # Get data:
        else:
            # Get accounts data
            accounts_response = pc.fetch('/newaccount/getAccounts')
            accounts = accounts_response.json()['spData']
            # TODO - update database w/ data from accounts & transactions
            return redirect("/")

@app.route("/authenticate", methods=["GET","POST"])
@login_required
def authenticate():
    # Via POST:
    if request.method == "POST":
        # SMS authentication
        pc.two_factor_authenticate(TwoFactorVerificationModeEnum.SMS,
                                   request.form.get(sms))
        pc.authenticate_password(password)
        # Get accounts data
        accounts_response = pc.fetch('/newaccount/getAccounts')
        accounts = accounts_response.json()
        # TODO - update database w/ data from accounts & transactions
        # Redirect to "/"
        return render_template("test.html", accounts=accounts)
    # Via GET:
    else:
        return render_template("authenticate.html")
Source code for project is here: https://github.com/bennett39/budget/blob/stackoverflow/01/application.py
How do I block the code from executing while waiting for the user to respond with their SMS code? Or, should I be going about this problem a different way?
The error you are experiencing is actually caused by the way you use global variables to persist state between requests. You initially define password as a module-level variable and then set password = request.form.get("pc_password") within your update function. Because of Python's rules regarding global and local variables (https://docs.python.org/3/faq/programming.html#id9), this creates a new local variable containing the password value and leaves the module-level variable untouched. You then access the original global password variable within your authenticate function, which fails because that variable is still set to its original value of ''.
The quick fix would be to add global password at the start of your update function, but this ignores the other problems with this method of persisting state. All of your global variables are shared between everyone using your site, so if multiple people are logged in, they will all be logged into the same Personal Capital account. It would be preferable to use the session object to persist this data, as each user will then only be able to access their own session object and there will be no risk of people accessing each other's accounts.
Your use of the PersonalCapital object complicates things a little, as it uses instance variables to persist state, which is appropriate for a command-line application but less so for a web application. It is a very simple object, however, with only two instance variables. It should therefore be fairly straightforward to extract these and store them in the session at the end of your update function, and to use these values to rebuild the object at the start of your authenticate function.
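The scoping rule behind the error can be reproduced without Flask. In this small sketch the function names are made up for illustration; only the module-level password variable mirrors the question.

```python
password = ""  # module-level variable, as in the question


def update_without_global(value):
    # Assignment inside a function creates a new local variable;
    # the module-level password is left untouched.
    password = value
    return password


def update_with_global(value):
    # The global statement makes the assignment target the module-level name.
    global password
    password = value


update_without_global("secret")
after_local = password   # still ""

update_with_global("secret")
after_global = password  # now "secret"

print(repr(after_local), repr(after_global))
```

This only demonstrates the quick fix. It does nothing about the variable being shared between all users of the site.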
I am using Flask, with the flask-session plugin for server-side sessions stored in a Redis backend. I have flask set up to use persistent sessions, with a session timeout. How can I make an AJAX request to get the time remaining on the session without resetting the timeout?
The idea is for the client to check with the server before displaying a timeout warning (or logging out the user) in case the user is active in a different tab/window of the same browser.
EDIT: after some digging, I found the config directive SESSION_REFRESH_EACH_REQUEST, which it appears I should be able to use to accomplish what I want: set that to False, and then the session should only be refreshed if something actually changes in the session, so I should be able to make a request to get the timeout without the session timeout changing. It was added in 0.11, and I'm running 0.11.1, so it should be available.
Unfortunately, in practice this doesn't appear to work, at least when checking the TTL of the Redis key to get the time remaining. I checked, and session.modified is False, so it's not just that I am doing something in the request that modifies the session (unless it just doesn't set that flag).
The following works, though it is rather hacky:
In the application __init__.py, or wherever you call Session(app) or init_app(app):
# set up the session
Session(app)

# Save a reference to the original save_session function so we can call it
original_save_session = app.session_interface.save_session

#----------------------------------------------------------------------
def discretionary_save_session(self, *args, **kwargs):
    """A wrapper for the save_session function of the app session interface to
    allow the option of not saving the session when calling specific functions,
    for example if the client needs to get information about the session
    (say, expiration time) without changing the session."""
    # bypass: list of functions on which we do NOT want to update the session when called
    # This could be put in the config or the like
    #
    # Improvement idea: "mark" functions on which we want to bypass saving in
    # some way, then check for that mark here, rather than having a hard-coded list.
    bypass = ['check_timeout']
    # convert function names to URLs
    bypass = [flask.url_for(x) for x in bypass]
    if flask.request.path not in bypass:
        # if the current request path isn't in our bypass list, go ahead and
        # save the session normally
        return original_save_session(self, *args, **kwargs)

# Override the save_session function to ours
app.session_interface.save_session = discretionary_save_session
Then, in the check_timeout function (which is in the bypass list, above), we can do something like the following to get the remaining time on the session:
@app.route('/auth/check_timeout')
def check_timeout():
    """Return the time remaining on the session without modifying it."""
    session_id = flask.session.sid
    # Or however you want to get a redis instance
    redis = app.config.get('REDIS_MASTER')
    # If used
    session_prefix = app.config.get('SESSION_KEY_PREFIX')
    # combine prefix and session id to get the session key stored in redis
    redis_key = "{}{}".format(session_prefix, session_id)
    # The redis ttl is the time remaining before the session expires
    time_remain = redis.ttl(redis_key)
    return str(time_remain)
I'm sure the above can be improved upon, however the result is as desired: when calling /auth/check_timeout, the time remaining on the session is returned without modifying the session in any way.
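The "mark functions" improvement mentioned in the code comments could look roughly like the sketch below. It is framework-free so that it runs on its own, and every name in it (skip_session_save, make_discretionary_save, the views dict) is hypothetical. In a Flask app, the view lookup would presumably use app.view_functions and flask.request.endpoint instead of the plain dict and lambda used here.

```python
def skip_session_save(view):
    """Mark a view so the save wrapper leaves the session untouched."""
    view.skip_session_save = True
    return view


def make_discretionary_save(original_save, view_functions, current_endpoint):
    """Wrap original_save so that marked views do not trigger a save."""

    def discretionary_save(*args, **kwargs):
        view = view_functions.get(current_endpoint())
        if getattr(view, "skip_session_save", False):
            return None  # marked view: do not refresh the session
        return original_save(*args, **kwargs)

    return discretionary_save


# --- small simulation of two endpoints ---
saved = []


@skip_session_save
def check_timeout():
    return "42"


def index():
    return "home"


views = {"check_timeout": check_timeout, "index": index}
endpoint = {"name": "index"}

save = make_discretionary_save(
    lambda *args: saved.append(args), views, lambda: endpoint["name"]
)

save("session-data")               # normal view: the save goes through
endpoint["name"] = "check_timeout"
save("session-data")               # marked view: the save is skipped
print(len(saved))
```

Compared with the hard-coded bypass list, the decorator keeps the "do not save" decision next to the view it applies to.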
I have a NewRequest event handler (subscriber) in Pyramid which looks like this:
@subscriber(NewRequest)
def new_request_subscriber(event):
    request = event.request
    print('Opening DB conn')
    # Open the DB
    request.db = my_connect_to_db()
    request.add_finished_callback(close_db_connection)
However, I have observed that a connection to the DB is opened even if the request goes to a static asset, which is obviously unnecessary. Is there a way, from the NewRequest handler, to check if the request is bound for a static asset? I have tried comparing the view_name to my static view's name, but apparently the view_name attribute is not available at this early stage of processing the request.
If anyone has any interesting ideas about this, please let me know!
The brute force way is to compare the request.path variable to your static view's root, a la request.path.startswith('/static/').
The method I like the best and use in my own apps is to add a property to the request object called db that is lazily evaluated upon access. So while you may add it to the request, it doesn't do anything until it is accessed.
import types

def get_db_connection(request):
    if not hasattr(request, '_db'):
        request._db = my_connect_to_db()
        request.add_finished_callback(close_db_connection)
    return request._db

def new_request_subscriber(event):
    request = event.request
    request.db = types.MethodType(get_db_connection, request)
Later in your code you can access request.db() to get the connection. Unfortunately it's not possible to add a property to an object at runtime (afaik), so you can't set it up so that request.db gives you what you want. You can get this behavior without using a subscriber by following the cookbook entry where you subclass Request and add your own lazy property via Pyramid's @reify decorator.
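The idea of a lazily evaluated, cached property can be tried out without Pyramid. The sketch below uses functools.cached_property from the standard library (Python 3.8+) as a stand-in for reify; the Request class and my_connect_to_db here are simplified placeholders, not Pyramid's classes or a real database call.

```python
from functools import cached_property

connections_opened = []


def my_connect_to_db():
    """Placeholder for the real connection function; records each call."""
    conn = object()
    connections_opened.append(conn)
    return conn


class Request:
    @cached_property
    def db(self):
        # Runs only on first access; the result is then stored on the instance.
        return my_connect_to_db()


static_request = Request()  # e.g. a request for a static asset: never touches .db

view_request = Request()
first = view_request.db     # opens the connection
second = view_request.db    # reuses the cached connection

print(len(connections_opened))
```

A request that never reads db never opens a connection, which is the behavior wanted for static assets.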
import MySQLdb
from pyramid.events import NewRequest, subscriber

def _connection(request):
    print("******Create connection***")
    # conn = request.registry.dbsession()
    conn = MySQLdb.connect("localhost", "DB_Login_Name", "DB_Password", "data_base_name")

    def cleanup(_):
        conn.close()

    request.add_finished_callback(cleanup)
    return conn

@subscriber(NewRequest)
def new_request_subscriber(event):
    print("new_request_subscriber")
    request = event.request
    request.set_property(_connection, "db", reify=True)
Try this one. I referred to the "set_property" section of the following web page:
http://pyramid.readthedocs.org/en/1.3-branch/api/request.html
It works for me.
I'm struggling to figure this one out. Sessions work when I run my application normally, but I can't figure out how to set data in the session in my test case.
The docs say in a test case you have to save the session to apply the changes before making the request. https://docs.djangoproject.com/en/1.2/topics/testing/#persistent-state
e.g.
from django.test import TestCase

class TestLogin(TestCase):
    def test_processuser(self):
        redirect = '/processuser/'
        session = self.client.session
        session["id"] = '1234'
        session.save()
        response = self.client.get(redirect)
However the session object returned from self.client.session is just a normal python dict?
Digging into the code, the Client.session call is this:
def _session(self):
    """
    Obtains the current session variables.
    """
    if 'django.contrib.sessions' in settings.INSTALLED_APPS:
        engine = import_module(settings.SESSION_ENGINE)
        cookie = self.cookies.get(settings.SESSION_COOKIE_NAME, None)
        if cookie:
            return engine.SessionStore(cookie.value)
    return {}
session = property(_session)
cookie = self.cookies.get(settings.SESSION_COOKIE_NAME, None) returns None, so it just returns a dict instead of a session store.
It looks like I have to do some more preparation in the test client before I save a session? I don't have much experience with this, so any help would be appreciated.
Django 1.2.5
Python 2.6.5
Cheers,
Asim.
Edit: this answer is now outdated; as of at least Django 1.7, you can just set the cookie directly on the test client.
See e.g. this answer to this question or the comments on this answer to another, similar, question.
Old outdated answer follows...
Adding this for people who really do need to set a cookie, e.g. because they need to do something which isn't covered by the Django auth mechanism...
You can't set cookies directly on TestClient objects but if you use the RequestFactory class you can do it. So instead of (say):
response = Client().post('/foo')
you do:
request = RequestFactory().post('/foo')
request.COOKIES['blah'] = 'hello'
response = foo_view(request)
where foo_view is the view corresponding to the '/foo' path, i.e. the view you're looking to test.
HTH somebody.
The simplest thing would be to login as someone, so the test client would set the cookie for you.
self.client.login(username,password)
should do. Refer to the documentation for more.
Contrary to the most upvoted answer, you CAN set cookies directly on the test client.
Remember that everything is an object; you just have to know where and what to patch. It goes like this:
client.cookies[key] = data
client.cookies is an instance of http.cookies.SimpleCookie from the standard library and behaves like a dict, so you can use .update for bulk updates to a cookie's value. This can be useful if you want to alter other cookie attributes such as max-age, path, domain, etc.
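Since SimpleCookie is part of the standard library, its behavior can be explored without a Django test client. In this sketch the cookie name and values are made up. Each entry in a SimpleCookie is a Morsel, and the bulk update of attributes is done on that Morsel.

```python
from http.cookies import SimpleCookie

cookies = SimpleCookie()

# Setting a value creates a Morsel for that key.
cookies["blah"] = "hello"

# Attributes can be set one at a time ...
cookies["blah"]["path"] = "/"

# ... or in bulk via the Morsel's update(), which only accepts valid
# cookie attribute names.
cookies["blah"].update({"max-age": 3600, "domain": "example.com"})

print(cookies["blah"].value)
print(cookies["blah"]["path"], cookies["blah"]["max-age"], cookies["blah"]["domain"])
```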
Finally, if you want to set a signed_cookie, You can reuse the helpers from django like this:
from django.core.signing import get_cookie_signer
signed_cookie_value = get_cookie_signer(salt=key).sign(data)
client.cookies[key] = signed_cookie_value
Pay attention to the salt. It has to match on both ends (signing and retrieval). A different salt value for signing would generate a different cookie that cannot be retrieved when you call request.get_signed_cookie(key).
For other people who are running into this problem please be aware that the Client.logout() function will throw away your cookies. For example:
response = self.client.post(self.url, self.data)
print(response.client.cookies.items())  # Displays the cookie you just set
self.client.logout()
response = self.client.post(reverse('loginpage'), {'username': 'username', 'password': 'password'}, follow=True)
print(response.client.cookies.items())  # Does not display the cookie you set before since it got destroyed by logout()
To make sure your cookies stay alive during testing, make a call to your logout page instead of using the Client.logout() function, like so:
response = self.client.post(self.url, self.data)
print(response.client.cookies.items())  # Displays the cookie you just set
self.client.get(reverse('logoutpage'))
response = self.client.post(reverse('loginpage'), {'username': 'username', 'password': 'password'}, follow=True)
print(response.client.cookies.items())  # Does display the cookie you set before since it did not get destroyed by client.logout()