Testing web-tornado using Firefox's HttpRequest addon - python

I am testing my web-tornado application using Firefox's HttpRequest add-on but after I log in and receive my secure cookie data, I am not able to re-use it to consume protected methods.
This is my response data:
POST http://mylocalurl:8888/user/login
Content-Type: application/x-www-form-urlencoded
Login=mylogin;Pass=123
-- response -- 200 OK Content-Length: 33
Content-Type: text/html; charset=UTF-8
Server: TornadoServer/2.2.1
Set-Cookie:
IdUser="Mjk=|1395170421|ffaf0d6fecf2f91c0dccca7cab03d799ef6637a0";
expires=Thu, 17 Apr 2014 19:20:21 GMT; Path=/
{
"Success": true }
-- end response --
Now why I am trying to do is to configure HttpRequester to use this cookie for my new requests. I tried to add it using the "Headers" tab but my server keeps sending me a 403, Forbidden.
Can anyone help me on this ? It could be with another tool (for linux) too.

I really like fiddler2 for these kind of things and there's an alpha build for mono that you may wish to try out: http://www.telerik.com/download/fiddler
If you don't mind paid software you can use Charles, for which there is a free trial.
And if you are testing and already using python, why not use a simple python script with requests and its Session object with cookie-persistence..

Related

Using 'Requests' python module for POST request, receiving response as if it were GET

So I am trying to make a script that checks a reservation availability of a bus. The starting link for this is https://reservation.pc.gc.ca/.
In the reserve box the following needs to be selected:
Reservation: Day Use (Guided Hikes, Lake O’Hara Bus)
Park: Yoho-Lake O'Hara
Arrival Date: Jun 16
Party Size: 2
When those options are entered, it takes you to the following page: https://reservation.pc.gc.ca/Yoho-LakeO'Hara?Calendar
It is my understanding that if I send a POST request to that second link with the correct data it should return the page I'm looking for
If I look in the dev tools network info when I select the correct parameters the form data is:
__EVENTTARGET:
__EVENTARGUMENT:
__VIEWSTATE: -reallllly long string-
__VIEWSTATEGENERATOR: 8D0E13E6
ctl00$MainContentPlaceHolder$rdbListReservationType: Events
ddlLocations: 213a1bc9-9218-4e98-9a7f-0f209008e437**
ddlArrivalMonth: 2017-06-16
ddlArrivalDay: 19
ddlNights: 1
ddlDepartureMonth:
ddlDepartureDay:
ddlEquipment:
ddlEquipmentSub:
ddlPartySize:2
ctl00$MainContentPlaceHolder$chkExcludeAccessible: on
ctl00$MainContentPlaceHolder$imageButtonCalendar.x: 64
ctl00$MainContentPlaceHolder$imageButtonCalendar.y: 56
So the code I wrote is:
import requests
payload = {
'__EVENTTARGET': '',
'__EVENTARGUMENT': '',
'__VIEWSTATE':-reallly long string-,
'__VIEWSTATEGENERATOR': '8D0E13E6',
'ctl00$MainContentPlaceHolder$rdbListReservationType': 'Events',
'ddlLocations': '213a1bc9-9218-4e98-9a7f-0f209008e437',
'ddlArrivalMonth': 2017-06-16,
'ddlArrivalDay': 19,
'ddlNights': 1,
'ddlDepartureMonth': '',
'ddlDepartureDay': '',
'ddlEquipment': '',
'ddlEquipmentSub': '',
'ddlPartySize': 2,
'ctl00$MainContentPlaceHolder$chkExcludeAccessible': 'on',
'ctl00$MainContentPlaceHolder$imageButtonCalendar.x': 64,
'ctl00$MainContentPlaceHolder$imageButtonCalendar.y': 56
}
r = requests.get(r"https://reservation.pc.gc.ca/Yoho-LakeO'Hara?Calendar", data=payload)
print r.text
r.text ends up just being the second link as if no parameters were entered - as if I just sent a normal GET request to the link. I tried turning the payload values that are integers into strings, I tried removing the empty key:value pairs. No luck. Trying to figure out what I'm missing.
It looks to me like there are 2 things going on:
#errata was correct, and this should be a POST request. You're about halfway there.
What I noticed though is that it seems to post the form data to Home.aspx and the URL that you see after submitting the form is the result of that processing and subsequent redirect.
You might try posting the form data as json to ./Home.aspx.
I found through Postman that this nearly worked, but I had to specify the content-type in order to get the proper results.
If you need to know how to add header and body instructions to the .post() method, it looks like there is a good example (though perhaps slightly outdated) here:
adding header to python request module
Also, fwiw, check out Postman. If you're both inexperienced with requests and with doing it in Python, at least this may lesson some of the burden of trial and error.
You are using
r = requests.get(r"https://reservation.pc.gc.ca/Yoho-LakeO'Hara?Calendar", data=payload)
instead of
r = requests.post(r"https://reservation.pc.gc.ca/Yoho-LakeO'Hara?Calendar", data=payload)
Digging a bit deeper in your problem, I found out that the URL you are calling is actually redirecting to a different URL (returning HTTP response 302):
$ curl -I "https://reservation.pc.gc.ca/Yoho-LakeO'Hara"
HTTP/1.1 302 Found
Cache-Control: private
Content-Length: 77273
Content-Type: text/html; charset=utf-8
Location: https://reservation-pc.fjgc-gccf.gc.ca/GccfLanguage.aspx?lang=eng&ret=https%3a%2f%2freservation.pc.gc.ca%3a443%2fYoho-LakeO%27Hara
Server: Microsoft-IIS/8.0
Set-Cookie: ASP.NET_SessionId=qw4p4e2zxjxx0c2zyq014p45; path=/; secure; HttpOnly
Set-Cookie: CookieLocaleName=en-CA; path=/; secure; HttpOnly
X-Powered-By: ASP.NET
X-Frame-Options: SAMEORIGIN
Date: Wed, 17 May 2017 14:22:53 GMT
However, following the Location from response results also in 302:
$ curl -I "https://reservation-pc.fjgc-gccf.gc.ca/GccfLanguage.aspx?lang=eng&ret=https%3a%2f%2freservation.pc.gc.ca%3a443%2fYoho-LakeO%27Hara"
HTTP/1.1 302 Found
Cache-Control: private
Content-Length: 179
Content-Type: text/html; charset=utf-8
Location: https://reservation.pc.gc.ca:443/Yoho-LakeO'Hara?gccf=true
Server: Microsoft-IIS/8.0
Set-Cookie: ASP.NET_SessionId=rbcuvexfg4fb340ixtcjd1qy; path=/; secure; HttpOnly
Set-Cookie: _gc_lang=eng; domain=.fjgc-gccf.gc.ca; path=/; secure; HttpOnly
X-Powered-By: ASP.NET
X-Frame-Options: SAMEORIGIN
Date: Wed, 17 May 2017 14:24:55 GMT
All this probably results in Requests transforming your POST into GET in the end...

Content-type is blank in the headers of some requests

I've ran this queries millions (yes, millions) of times before with other URLs. However, I'm getting a KeyError when checking the content-type of the following webpage.
Code snippet:
r = requests.get("http://health.usnews.com/health-news/articles/2014/10/15/limiting-malpractice-claims-may-not-curb-costly-medical-tests", timeout=10, headers=headers)
if "text/html" in r.headers["content-type"]:
Error:
KeyError: 'content-type'
I checked the content of r.headers and it's:
CaseInsensitiveDict({'date': 'Fri, 20 May 2016 06:44:19 GMT', 'content-length': '0', 'connection': 'keep-alive', 'server': 'BigIP'})
What could be causing this?
Not all servers set a Content-Type header. Use .get() to retrieve a default if it is missing:
if "text/html" in r.headers.get("content-type", ''):
For the URL you gave I can't reproduce this:
$ curl -s -D - -o /dev/null "http://health.usnews.com/health-news/articles/2014/10/15/limiting-malpractice-claims-may-not-curb-costly-medical-tests"
HTTP/1.1 200 OK
Server: Apache-Coyote/1.1
X-Powered-By: Brightspot
Content-Type: text/html;charset=UTF-8
Date: Fri, 20 May 2016 06:45:12 GMT
Set-Cookie: JSESSIONID=A0C35776067AABCF9E029150C64D8D91; Path=/; HttpOnly
Transfer-Encoding: chunked
but if the header is missing from your response then it usually isn't Python's fault, and certainly not your code's fault.
It could be you encountered a buggy server or temporary glitch, or the server you contacted doesn't like you for one reason or another. Your sample response headers have the content-length set to 0 as well, for example, indicating there was no content to serve at all.
The server that gave you that response is BigIP, a load balancer / network router product from a company called F5. Hard to say exactly what kind (they have global routing servers as well as per-datacenter or cluster load balancers). It could be that the load balancer ran out of back-end servers to serve the request, doesn't have servers in your region, or the load balancer decided that you are sending too many requests and refuses to give you more than just this response, or it is the wrong phase of the moon and Jupiter is in retrograde and it threw a tantrum. We can't know!
But, just in case this happens again, do also look at the response status code. It may well be a 4xx or 5xx status code indicating that something was wrong with your request or with the server. For example, a 429 status code response would indicate you made too many requests in a short amount of time and should slow down. Test for it by checking r.status_code.

Making Head Requests in Twisted

I am relatively new to using Twisted and I am having trouble returning the content-length header when performing a basic head request. I have set up an asynchronous client already but the trouble comes in this bit of code:
def getHeaders(url):
d = Agent(reactor).request("HEAD", url)
d.addCallbacks(handleResponse, handleError)
return d
def handleResponse(r):
print r.code, r.headers
whenFinished = twisted.internet.defer.Deffered()
r.deliverBody(PrinterClient(whenFinished))
return whenFinished
I am making a head request and passing the url. As indicated in this documentation the content-length header is not stored in self.length, but can be accessed from the self.headers response. The output is returning the status code as expected but the header output is not what is expected. Using "uhttp://www.espn.go.com" as an example it currently returns:
Set-Cookie: SWID=77638195-7A94-4DD0-92A5-348603068D58;
path=/; expires=Fri, 31-Jan-2034 00:50:09 GMT; domain=go.com;
X-Ua-Compatible: IE=edge,chrome=1
Cache-Control: max-age=15
Date: Fri, 31 Jan 2014 00:50:09 GMT
P3P: CP="CAO DSP COR CURa ADMa DEVa TAIa PSAa PSDa IVAi IVDi CONi
OUR SAMo OTRo BUS PHY ONL UNI PUR COM NAV INT DEM CNT STA PRE"
Content-Type: text/html; charset=iso-8859-1
As you can see, no content-length field is returned. If the same request is done in requests then the result will contain the content-length header:
r = requests.head("http://www.espn.go.com")
r.headers
({'content-length': '46133', 'content-encoding': 'gzip'...})
(rest omitted for readability)
What is causing this problem? I am sure it is a simple mistake on my part but I for the life of me cannot figure out what I have done wrong. Any help is appreciated.
http://www.espn.go.com/ returns one response if the client sends an Accept-Encoding: gzip header and another response if it doesn't.
One of the differences between the two responses is the inclusion of the Content-Length header.
If you want to make requests using Agent including Accept-Encoding: gzip then take a look at ContentDecoderAgent or the third-party treq package.
http allows (but does not REQUIRE) entity headers in responses to HEAD requests. The only restriction it places is that 200 responses to HEAD requests MUST NOT include an entity payload. Its up to the origin server to decide which, if any entity headers it would like to include.
In the case of Content-Length, it makes sense for this to be optional for HEAD; if the entity will be computed dynamically (as with compressing/decompressing content), it's better for the server to avoid the extra work of computing the content length when the request won't include the content anyway.

Google App Engine adding cache-control and other headers

I have a Flask application on Google App Engine and I want to tell browsers to cache a response that uses the Cache-Control header. It works as expected on dev_appserver.py but when deployed to App Engine the headers are modified and break the cache header.
Here is the Flask view in particular:
#app.route("/resource")
def resource():
response = make_response(render_template("resource.html"))
response.headers['Cache-Control'] = "max-age=31536000"
logging.error("HEADERS: {}".format(response.headers))
return response
The logs for both development server and App Engine show:
Content-Type: text/html; charset=utf-8
Content-Length: 112628
Cache-Control: max-age=31536000
When I run it with the development app server it works as expected, as you can see from the headers below.
When I open Chrome's development tools the headers for App Engine are:
alternate-protocol:443:quic
cache-control:no-cache, must-revalidate
content-encoding:gzip
content-length:19520
content-type:text/html; charset=utf-8
date:Wed, 22 Jan 2014 19:53:47 GMT
expires:Fri, 01 Jan 1990 00:00:00 GMT
pragma:no-cache
server:Google Frontend
set-cookie:session=; expires=Thu, 01-Jan-1970 00:00:00 GMT; Max-Age=0; Path=/
status:200 OK
strict-transport-security:max-age=31536000
vary:Accept-Encoding
version:HTTP/1.1
x-appengine-estimated-cpm-us-dollars:$0.002267
x-appengine-resource-usage:ms=7388 cpu_ms=5069
x-frame-options:DENY
x-ua-compatible:chrome=1
In contrast the development app server headers are as expected:
Cache-Control:max-age=31536000, private
Content-Length:112628
content-type:text/html; charset=utf-8
Date:Wed, 22 Jan 2014 19:57:05 GMT
Expires:Wed, 22 Jan 2014 19:57:05 GMT
Server:Development/2.0
set-cookie:session=; expires=Thu, 01-Jan-1970 00:00:00 GMT; Max-Age=0; Path=/
x-frame-options:DENY
x-ua-compatible:chrome=1
Of course I have checked to make sure that I am not adding the extra headers, and I could find no reference of the cache-related headers (pragma, expires and cache-control) being added outside the given view.
So it seems App Engine is adding a bunch of headers when deployed, which seems unusual. What might I have overlooked?
-- EDIT --
As #dinoboff noted from the docs in a comment below:
Cache-Control, Expires and Vary
These headers specify caching policy to intermediate web proxies (such as Internet Service Providers) and browsers. If your script sets these headers, they will usually be unmodified, unless the response has a Set-Cookie header, or is generated for a user who is signed in using an administrator account.
These headers are additional headers that are added because you are looking at the site as a logged in admin user. They will not be present for "normal" users.
This blog post talks about the X-AppEngine-Resource-Usage header specifically: http://googleappengine.blogspot.co.uk/2009/08/new-features-in-124.html
And as they note:
You can view these headers using plugins such as Firefox's Live HTTP
Headers or Firebug. Note that only logged in administrators see these
figures - ordinary users, and users who aren't logged in, won't see
them at all.

OAuth and the YouTube API

I am trying to use the YouTube services with OAuth. I have been able to obtain request tokens, authorize them and transform them into access tokens.
Now I am trying to use those tokens to actually do requests to the YouTube services. For instance I am trying to add a video to a playlist. Hence I am making a POST request to
https://gdata.youtube.com/feeds/api/playlists/XXXXXXXXXXXX
sending a body of
<?xml version="1.0" encoding="UTF-8"?>
<entry xmlns="http://www.w3.org/2005/Atom"
xmlns:yt="http://gdata.youtube.com/schemas/2007">
<id>XXXXXXXXX</id>
</entry>
and with the headers
Gdata-version: 2
Content-type: application/atom+xml
Authorization: OAuth oauth_consumer_key="www.xxxxx.xx",
oauth_nonce="xxxxxxxxxxxxxxxxxxxxxxxxx",
oauth_signature="XXXXXXXXXXXXXXXXXXX",
oauth_signature_method="HMAC-SHA1",
oauth_timestamp="1310985770",
oauth_token="1%2FXXXXXXXXXXXXXXXXXXXX",
oauth_version="1.0"
X-gdata-key: key="XXXXXXXXXXXXXXXXXXXXXXXXX"
plus some standard headers (Host and Content-Length) which are added by urllib2 (I am using Python) at the moment of the request.
Unfortunately, I get an Error 401: Unknown authorization header, and the headers of the response are
X-GData-User-Country: IT
WWW-Authenticate: GoogleLogin service="youtube",realm="https://www.google.com/youtube/accounts/ClientLogin"
Content-Type: text/html; charset=UTF-8
Content-Length: 179
Date: Mon, 18 Jul 2011 10:42:50 GMT
Expires: Mon, 18 Jul 2011 10:42:50 GMT
Cache-Control: private, max-age=0
X-Content-Type-Options: nosniff
X-Frame-Options: SAMEORIGIN
X-XSS-Protection: 1; mode=block
Server: GSE
Connection: close
In particular I do not know how to interpret the WWW-Authenticate header, whose realm hints to ClientLogin.
I have also tried to play with the OAuth Playground and the Authorization header sent by that site looks exactly like mine, except for the order of the fields. Still, on the plyaground everything works. Well, almost: I get an error telling that a Developer key is missing, but that is reasonable since there is no way to add one on the playground. Still, I go past the Error 401.
I have also tried to manually copy the Authorization header from there, and I got an Error 400: Bad request.
What am I doing wrong?
Turns out the problem was the newline before xmlns:yt. I was able to debug this using ncat, as suggeested here, and inspecting the full response.
i would suggest using the oauth python module, because it much more simple and takes care of the auth headers :) https://github.com/simplegeo/python-oauth2, as a solution i suggest you encode your parameters with 'utf-8' , i had a similar problem, and the solution was that google was expecting utf-8 encoded strings

Categories

Resources