Understanding the Python requests module

So I'm currently learning the Python requests module, but I'm a bit confused and was wondering if someone could steer me in the right direction. I've seen some people post headers when they want to log into a website, but where do they get these headers from, and when do you need them? I've also seen some people say you need an authentication token, but other solutions don't use headers or an authentication token at all. This is supposedly the authentication token, but I'm not sure where to go from here after I post my username and password.
<input type="hidden" name="lt" value="LT-970332-9KawhPFuLomjRV3UQOBWs7NMUQAQX7" />

Although your question is a bit vague, I'll try to help you.
Authentication
A web browser (the client) can authenticate to the target server by providing credentials, usually a login/password pair, which is usually encoded for security reasons.
This data can be passed from client to server in the following parts of an HTTP request:
URL parameters (http://httpbin.org/get?foo=bar)
headers
body (this is where POST parameters from HTML forms usually go)
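To make these three places concrete, here is a minimal sketch using requests. It only prepares the request without sending it, so you can inspect what would go over the wire. The header name X-Example and the form field names are made up for illustration; a real site will define its own.

```python
import requests

# Build (but do not send) a POST request that uses all three places.
request = requests.Request(
    "POST",
    "http://httpbin.org/post",
    params={"foo": "bar"},        # ends up in the URL query string
    headers={"X-Example": "1"},   # hypothetical custom header
    data={"username": "alice", "password": "secret"},  # form-encoded body
)
prepared = request.prepare()

print(prepared.url)                      # URL including the query string
print(prepared.headers["X-Example"])     # the custom header
print(prepared.headers["Content-Type"])  # set by requests for form data
print(prepared.body)                     # the encoded POST parameters
```

Passing the same params, headers and data arguments to requests.post would actually send the request.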
Tokens
After successful authentication, the server generates a unique token and sends it to the client. If the server wants the client to store the token as a cookie, it includes a Set-Cookie header in its response.
A token usually represents a unique identifier of a user session. In most cases a token has an expiration date for security reasons.
Web browsers usually store the token as a cookie in their internal cookie storage and send it with all subsequent requests to the corresponding website. A single website can use multiple tokens and other cookies for a single user.
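As a rough illustration of that round trip, the sketch below parses a Set-Cookie value with the standard library and builds the Cookie header a client would send back. The cookie name sessionid and its value are invented for the example.

```python
from http.cookies import SimpleCookie

# A made-up Set-Cookie value, as a server might send after login.
set_cookie = "sessionid=abc123; Path=/; Max-Age=3600"

cookie = SimpleCookie()
cookie.load(set_cookie)
morsel = cookie["sessionid"]

print(morsel.value)       # the token itself
print(morsel["max-age"])  # lifetime in seconds, as a string

# What the client sends back on later requests to the same site.
cookie_header = "; ".join(f"{name}={m.value}" for name, m in cookie.items())
print(cookie_header)
```

With requests you normally do not do this by hand: a requests.Session keeps the cookies it receives and sends them on later requests made through the same session.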
Research
Every website has its own authentication format, rules and restrictions, so the first thing you need to do is a little research on the target website. You need to find out how the client sends authentication information to the server, what the server replies, and where session data is stored (usually you can find it in the client's request headers).
To do that, you can use a proxy (Burp, for example) to intercept browser traffic. It lets you see the data passed from client to server and back.
Try to authenticate and then browse some pages on the target site using your web browser through the proxy. After that, using the proxy, examine which parts of the HTTP requests and responses the client and server use to carry session and authentication information.
After that you can finally use Python and requests to do what you want.
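For a login form that carries a hidden token like the lt field in the question, one possible approach is to read the hidden inputs from the login page and send them back together with the credentials. This is only a sketch: the field names username and password are assumptions, and the real names have to come from the research described above.

```python
from html.parser import HTMLParser


class HiddenInputParser(HTMLParser):
    """Collects name/value pairs of <input type="hidden"> elements."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "input" and attributes.get("type") == "hidden":
            name = attributes.get("name")
            if name:
                self.fields[name] = attributes.get("value") or ""


def build_login_payload(html, username, password):
    """Merge hidden form fields with the (assumed) credential fields."""
    parser = HiddenInputParser()
    parser.feed(html)
    payload = dict(parser.fields)
    payload["username"] = username  # assumed field name
    payload["password"] = password  # assumed field name
    return payload


login_page = (
    '<input type="hidden" name="lt" '
    'value="LT-970332-9KawhPFuLomjRV3UQOBWs7NMUQAQX7" />'
)
payload = build_login_payload(login_page, "alice", "secret")
print(payload)
```

With requests, you would typically fetch the login page with session.get(...) on a requests.Session, build the payload from response.text, and submit it with session.post(..., data=payload), so that any cookies set along the way are kept by the session.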

Related

Is it possible to use Jira cookie based auth with separate connections?

I'm trying to create a command line client for Jira, but I don't really want to store the username/password, and I don't want to have to put in my password with every single request.
Jira says they have a cookie based API, but it doesn't look like it works the way that I think it works.
Specifically, when using Python's requests library I can only re-use the cookie if I have a Session object that I think keeps a connection to Jira.
But if I try to, say, make a requests.post request followed by a requests.get request to the REST URL, it fails with a 401 and tells me that I'm not authenticated. OTOH, if I create a Session, I can do
session.post(.../rest/auth/1/session)
print(session.get(.../rest/auth/1/session).status_code)
And I'll get the 200 that I expect.
I do notice that there's another cookie in the requests response headers:
atlassian.xsrf.token=SOMETHING|RANDOM|lout
but I didn't see anything about that in the documentation.
Is it possible to do this, or do I have to store the username/password if I want to break the connection in between requests?

You are correct, the session is required. From the documentation:
The client creates a new session for the user, via the JIRA REST API.
JIRA returns a session object, which has information about the session including the session cookie. The client stores this session object.
The client can now set the cookie in the header for all subsequent requests to the JIRA REST API.
In other words, the session is integral to the request, receipt and use of the cookie-based authentication token.
Also, the atlassian.xsrf.token would have been added by Atlassian to prevent cross-site request forgery and hijacking of the session/cookie.
The way I see it, here are your simple-but-secure options:
For every invocation of your script, use the session to request-receive-retain the cookie (and then, once all API calls are complete, let everything get discarded)
Base64-encode your username and password, store them in a separate file (encrypted if you so choose), and have your script read (and decrypt) them and place them in an Authorization header. See Hiding a password in a python script (insecure obfuscation only).

If your goal is to avoid authenticating on every request to the API, send (POST) your credentials to the cookie-based authentication endpoint /rest/auth/1/session, not Basic Auth, to get the token. You can then pass that token in the Cookie header of your subsequent API requests, without having to authenticate each one.
Watch out for an important piece missing from the API documentation: you should send the username, NOT the email, to authenticate in a cookie-based manner. Even though both work for Basic Auth, only the username works for cookie-based authentication.
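One way to avoid storing the password between invocations is to store the session cookie instead. The sketch below assumes the login response JSON has a top-level "session" object with "name" and "value" keys; check the actual response from your Jira instance before relying on that shape. The helper names and the file location are made up for illustration.

```python
import json
from pathlib import Path


def save_session(path: Path, login_response: dict) -> None:
    """Persist only the cookie name and value from the login response."""
    session = login_response["session"]  # assumed response shape
    path.write_text(json.dumps({"name": session["name"], "value": session["value"]}))


def load_cookie_header(path: Path) -> dict:
    """Rebuild the Cookie header for a later, separate invocation."""
    session = json.loads(path.read_text())
    return {"Cookie": f"{session['name']}={session['value']}"}
```

A later run of the script could then pass load_cookie_header(path) as the headers argument of requests.get. The cookie will eventually expire, so the script still has to fall back to asking for credentials when it gets a 401, and the file should be protected like a password.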

Extra protection layer for Django Rest Framework and OAuth2 Toolkit

This is a follow up question for this.
I'm using the latest Django OAuth2 Toolkit (0.10.0) with Python 2.7, Django 1.8 and Django REST framework 3.3
Some background:
When authenticating, the client receives a new AccessToken that it uses every time it makes a new request to the server. This AccessToken is owned by the client and is sent in the Authorization header of each request.
As a simple test, I grabbed this access token from an authenticated client and sent it in the Authorization header of a plain HTTP request from a different machine.
The result was that this new "client" was authenticated just like the original client and could make requests as it pleased.
So the issue is:
The access token is not bound to any form of client validation (like a session ID or client IP address). Anyone who can get/find/steal/look up the client's AccessToken can fake requests on behalf of this client.
I researched this issue a lot but couldn't find anyone who has addressed this matter. Maybe I'm doing something wrong in the way I authenticate the client? I would love some insights. Maybe it's a simple configuration or out-of-the-box solution that I missed.
Thanks!

This method of attack is called a replay attack. This video by Professor Messer explains replay attacks.
You can't really implement anything on the client side (browser) to overcome this, because of the transparency of web browsers.
What you can do is implement digest authentication using a nonce.
In cryptography, a nonce is an arbitrary number that may only be used once.
A basic implementation looks like this:
The user requests a resource from the API server.
The API server responds with an HTTP 401 and a nonce in a WWW-Authenticate header. You have to keep track of issued nonces; a JWT containing the nonce, set to expire within a small window (maybe 2 seconds or less), would be better and stateless.
The client signs the request with the received nonce, a client nonce and the password, and calls the resource again.
The API server validates the signature. If the signature is valid, the request is accepted.
An attacker captures the request and tries to replay it as the user.
Since the nonce is expired or already used, the attacker's request is rejected.
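The steps above can be sketched roughly as follows. This is a simplified HMAC-based illustration of the single-use nonce idea, not an implementation of HTTP Digest authentication and not something Django OAuth Toolkit provides; the class and function names, the shared secret and the 2-second lifetime are all assumptions.

```python
import hashlib
import hmac
import secrets
import time

NONCE_TTL_SECONDS = 2.0  # assumed lifetime of a server nonce


def sign(secret: bytes, nonce: str, client_nonce: str, method: str, path: str) -> str:
    """Client side: sign the request together with both nonces."""
    message = "\n".join([nonce, client_nonce, method, path]).encode()
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


class NonceServer:
    """Server side: issues nonces and accepts each one at most once."""

    def __init__(self, secret: bytes):
        self._secret = secret
        self._issued = {}  # nonce -> expiry time

    def issue_nonce(self) -> str:
        nonce = secrets.token_hex(16)
        self._issued[nonce] = time.monotonic() + NONCE_TTL_SECONDS
        return nonce

    def verify(self, nonce, client_nonce, method, path, signature) -> bool:
        # pop() removes the nonce, so a replayed request finds nothing.
        expiry = self._issued.pop(nonce, None)
        if expiry is None or time.monotonic() > expiry:
            return False
        expected = sign(self._secret, nonce, client_nonce, method, path)
        return hmac.compare_digest(expected, signature)
```

A client would call sign with the nonce from the 401 response and resend the request; an attacker replaying the captured request reuses a nonce the server has already discarded, so verify returns False.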

How cookies and tokens work

I'm designing a web application and am now working on the authentication function. I read that there are two approaches: cookies and tokens. I don't really understand how these two work.
I'm planning to use django-rest-framework-jwt if I choose tokens. Here's where I am:
Tokens
The user sends their data (login and password). The application verifies that the data is correct, calculates a token and sends it back to the user. When the user makes a request, they include the token in the request. The application decodes the token and gets the information about the user.
My question :
- How do we get the token? Is it like calculating a hash code?
- How do we get the user information after we decode the token?
- How is it determined that the token is dead?
- Can a web application that uses tokens be used through a browser?
Cookies
Same as tokens, but cookies are sent in an HTTP header, not in the request body. Cookies must be stored on the server side.
My question :
- The articles I read say that tokens have the advantage of having a lifetime. But cookies have that too. So what's the difference between the lifetime of a cookie and that of a token?
- How do we identify the user who made the request? Do we store a dictionary (cookie, user id)?

I believe "tokens", as you call them, are the same as "sessions" as described at https://docs.djangoproject.com/en/dev/topics/http/sessions/.
Similar to what you stated, sessions calculate a hash code/ID that is sent back to the user so they can be identified as an authenticated user, etc.
To answer your questions directly:
Sessions and cookies work together. Once Django generates a session ID, it is stored on the user's computer in a cookie and also recorded in the Django backend. Therefore, I am not sure your question is valid. Try reading up at http://www.tangowithdjango.com/book/chapters/cookie.html.
The link above also answers this question for you. To summarize, the session ID sent back to the user is what identifies that user as authenticated, along with any other properties stored for the session.

Basically, the difference between cookie-based authentication (storing session IDs in cookies on the client) and token authentication is that an authentication token is sent in the HTTP Authorization header. This is more flexible, since there are REST clients (native clients on phones, etc.) that don't support the concept of a cookie at all.
Session authentication is built into Django; token authentication is provided by Django REST framework.
Django REST framework has a built-in way of passing the token to the client, but you're welcome to implement your own.
Tokens are valid until they are deleted from the database. Again, you can roll your own auto-invalidation solution here.
The django-rest-framework documentation is pretty detailed about the different authentication mechanisms it supports. See http://www.django-rest-framework.org/api-guide/authentication
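Since the question asks how a token can be calculated, decoded and expired, here is a minimal standard-library sketch of a signed, self-expiring token in the spirit of a JWT. It is an illustration only, not the format or code used by django-rest-framework-jwt or by Django REST framework's database-backed tokens; the function names and the payload fields are assumptions.

```python
import base64
import hashlib
import hmac
import json
import time


def _b64(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()


def _unb64(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def issue_token(secret: bytes, user_id: int, ttl_seconds: float, now=None) -> str:
    """Encode the user id and an expiry time, then sign them."""
    now = time.time() if now is None else now
    payload = _b64(json.dumps({"user_id": user_id, "exp": now + ttl_seconds}).encode())
    signature = _b64(hmac.new(secret, payload.encode(), hashlib.sha256).digest())
    return f"{payload}.{signature}"


def read_token(secret: bytes, token: str, now=None):
    """Return the claims if the signature is valid and not expired, else None."""
    now = time.time() if now is None else now
    try:
        payload, signature = token.split(".")
    except ValueError:
        return None
    expected = _b64(hmac.new(secret, payload.encode(), hashlib.sha256).digest())
    if not hmac.compare_digest(expected, signature):
        return None
    claims = json.loads(_unb64(payload))
    if claims["exp"] < now:
        return None  # the token is "dead"
    return claims
```

The point of this design is that the server needs no lookup table: the user information travels inside the token, and the signature plus the exp field tell the server whether to trust it. A session cookie, by contrast, usually carries only an ID that the server looks up in its own storage.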

Tornado's XSRF protection

I am using Facebook's Tornado web engine for Python for a project I'm doing and was planning on implementing the XSRF protection, but it left me a little confused.
On a typical request it sets an "_xsrf" cookie to the user's browser if it's not found and then matches that with the value embedded in an HTML form value the browser has sent with the request.
Well let's say an attacker did something like this:
<img src="blah.com/transfer_money?account=0098&destination=0099&_xsrf=
(whatever the client's cookie contains)" title="cool image" />
What's to prevent the attacker from using the cookie outright? As far as I can tell the cookies used for XSRF are not "secure" both from the check_xsrf_cookie method and the xsrf_token method that actually generates the XSRF token. Am I missing something...?

If I understand you correctly, you are asking what prevents an attacker from accessing the user's cookie for a given domain.
Well, the answer is: the browser's security policy. A script from one domain cannot access cookies from another domain (most of the time). More details here: http://en.wikipedia.org/wiki/HTTP_cookie#Domain_and_Path
This can be circumvented with an XSS (Cross-Site Scripting) attack: injecting a script directly into the source of the attacked page. Another approach is to break the client application (the browser).
However, most of the time it is not possible for the attacker to retrieve the user's cookie from another domain. An additional level of security would be to associate a specific CSRF (or "XSRF") token with a specific user (and to check it during validation).
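The cookie-versus-form check and the idea of tying a token to a user can be sketched like this. It is a generic illustration, not Tornado's actual check_xsrf_cookie code; the function names and the server-side secret are hypothetical.

```python
import hashlib
import hmac
import secrets


def new_xsrf_token() -> str:
    """Random value the server sets as a cookie and embeds in its forms."""
    return secrets.token_hex(16)


def xsrf_ok(cookie_token, form_token) -> bool:
    """Accept only if the form carries the same value as the cookie."""
    if not cookie_token or not form_token:
        return False
    return hmac.compare_digest(cookie_token, form_token)


def user_bound_token(secret: bytes, user_id: str) -> str:
    """Optional extra: derive the token from the user, so it cannot be reused for another user."""
    return hmac.new(secret, user_id.encode(), hashlib.sha256).hexdigest()
```

An attacker's page on another domain can make the browser send the cookie, but it cannot read the cookie's value, so it cannot fill in a matching form field and xsrf_ok fails.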

Python urllib2 accesses page without sending authentication details

I was reading the urllib2 tutorial, which mentions that in order to access a page that requires authentication (e.g. a valid username and password), the server first sends an HTTP response with status code 401 and the (Python) client then sends a request with authentication details.
Now, the problem in my case is that there are two different versions of a webpage: one that can be accessed without supplying any authentication details, and one that is quite different when authentication details are supplied (i.e. when the user is logged in to the system). As an example, think about the URL www.gmail.com: when you are not logged in you get a login page, but if your browser remembers you from your last login, the result is your email account homepage with your inbox displayed.
I followed all the steps to set up a handler for authentication and install an opener. However, every time I request the page I get back the version of the webpage for a user who is not logged in.
How can I access the other version of webpage that has the user logged-in?

Requests makes this easy. As its creators say:
Python’s standard urllib2 module provides most of the HTTP capabilities you need, but the API is thoroughly broken.

Try using Mechanize. It has cookie handling features that would allow your program to be "logged in" even though it's not a real person.
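If the site logs users in through a form and a cookie rather than an HTTP 401 challenge, an authentication handler alone will not help; the client needs to keep cookies. As a rough standard-library sketch in Python 3 (where urllib2 became urllib.request), assuming hypothetical form field names:

```python
import http.cookiejar
import urllib.parse
import urllib.request


def make_cookie_opener():
    """Return an opener that stores and resends cookies, plus its jar."""
    jar = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))
    return opener, jar


def encode_form(fields: dict) -> bytes:
    """Encode form fields as a POST body."""
    return urllib.parse.urlencode(fields).encode("ascii")


opener, jar = make_cookie_opener()
body = encode_form({"username": "alice", "password": "secret"})  # assumed field names
# opener.open(login_url, data=body) would submit the form; cookies set by the
# response land in `jar` and are sent on later opener.open(...) calls.
```

A requests.Session does the same cookie bookkeeping for you, which is part of why the answers above suggest it.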
