How to get the request header sent by requests.get()? - python

I'd like to know what HTTP GET request headers are sent by requests.get() from the client side.
requests.get('http://localhost:9000')
The request header sent by the above Python command, as monitored by netcat, is the following. However, I haven't found a way to directly monitor the HTTP GET request header on the client side.
GET / HTTP/1.1
Host: localhost:9000
Connection: keep-alive
Accept-Encoding: gzip, deflate
Accept: */*
User-Agent: python-requests/2.18.4
import requests
import sys
req = requests.Request('GET', 'http://localhost:9000')
print(req.headers)
prepared = req.prepare()
s = requests.Session()
page = s.send(prepared)
The request header sent by this script, again monitored by netcat, is the following.
GET / HTTP/1.1
Host: localhost:9000
Accept-Encoding: identity
Also, req.headers can be used to inspect the headers. But it is not exactly what is sent: it does not contain the Accept-Encoding header. Moreover, the HTTP GET request header sent this way also differs from that of the first method.
$ ./main.py
{}
Is there a method to directly monitor what HTTP GET header is sent by the first method?
Also, why are the requests sent by the two methods not the same? Wouldn't it be better to make them consistent, to avoid possible confusion?

I'd like to know what a HTTP GET request header is sent by requests.get() from the client side
If I understand you correctly, you want to view the headers that were actually sent by requests.get().
You can access them via the .request.headers attribute:
import requests
r = requests.get("http://example.com")
print(r.request.headers)
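On the second question: req.prepare() builds the request without any session state, while requests.get() goes through a Session that merges in the default headers (User-Agent, Accept-Encoding, Accept, Connection). You can see the difference without sending anything by comparing prepare() with Session.prepare_request() (a minimal sketch; no server needed):

```python
import requests

req = requests.Request('GET', 'http://localhost:9000')
s = requests.Session()

# req.prepare() ignores the session, so no default headers are merged in
bare = req.prepare()
# s.prepare_request(req) merges the session's default headers,
# matching what requests.get() would send
merged = s.prepare_request(req)

print(dict(bare.headers))    # essentially empty
print(dict(merged.headers))  # includes User-Agent, Accept-Encoding, Accept, ...
```

This is why the netcat dumps differ: only the second code path (Session.send of a bare prepare()) bypasses the session defaults.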

Related

http POST message body not received

I'm currently creating a Python socket HTTP server, and I'm working on my GET and POST requests. I got my GET implementation working fine, but the body of the POST requests won't show up.
Code snippet:
self.host = ''
self.port = 8080
self.listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
self.listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
self.listener.bind((self.host, self.port))
self.listener.listen(1)
while True:
    client_connection, client_address = self.listener.accept()
    request = client_connection.recv(2048)
    print(request)
This code prints the HTTP header after processing a POST request from the webpage:
POST /test.txt HTTP/1.1
Host: localhost:8080
Content-Type: application/x-www-form-urlencoded
Origin: http://localhost:8080
Content-Length: 21
Connection: keep-alive
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_5) AppleWebKit/601.6.17 (KHTML, like Gecko) Version/9.1.1 Safari/601.6.17
Referer: http://localhost:8080/
Accept-Language: nb-no
Accept-Encoding: gzip, deflate
But there is no body, so the question is: why am I not receiving the HTTP body when I know it is sent?
Thanks!
while True:
    client_connection, client_address = self.listener.accept()
    request = client_connection.recv(2048)
    print(request)
recv does not read exactly 2048 bytes; it reads up to 2048 bytes. If some data arrives, recv returns with that data even if more data might follow. My guess is that in your case the client first sends the HTTP header and then the body. If the Nagle algorithm is off on the client side (which is common), it is likely that your first recv gets only the header and that you would need another recv for the body. This would explain what happens in your case: you get the header but not the body, since you never do another recv.
But even that would be too simple an implementation, and would go wrong sooner or later. To do it correctly you should implement the HTTP protocol properly: first read the HTTP header, which might need multiple recv calls if the header is large. Then parse the header, figure out the size of the body from the Content-Length header, and read the remaining bytes.
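A minimal sketch of that read loop (the function name is mine; it assumes a Content-Length body and ignores chunked transfer encoding and error handling):

```python
import socket

def read_http_request(conn):
    """Read one HTTP request: the full header block first, then
    exactly Content-Length bytes of body (simplified sketch)."""
    data = b''
    # Read until the blank line that terminates the header block
    while b'\r\n\r\n' not in data:
        chunk = conn.recv(2048)
        if not chunk:
            break
        data += chunk
    header, _, body = data.partition(b'\r\n\r\n')
    # Find Content-Length to know how many body bytes to expect
    length = 0
    for line in header.split(b'\r\n')[1:]:
        name, _, value = line.partition(b':')
        if name.strip().lower() == b'content-length':
            length = int(value.strip())
    # Keep reading until the whole body has arrived
    while len(body) < length:
        chunk = conn.recv(2048)
        if not chunk:
            break
        body += chunk
    return header, body
```

Dropping this in place of the single recv in the accept loop would make the body show up even when the client sends header and body in separate packets.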

Python Requests module - proxy not working

I'm trying to connect to a website with Python requests, but not with my real IP. So I found a proxy on the internet and wrote this code:
import requests

proksi = {
    'http': 'http://5.45.64.97:3128'
}

x = requests.get('http://www.whatsmybrowser.org/', proxies=proksi)
print(x.text)
When I get the output, the proxy simply doesn't work: the site returns my real IP address. What did I do wrong? Thanks.
The answer is quite simple. Although it is a proxy service, it doesn't guarantee 100% anonymity. When you send the HTTP GET request via the proxy server, the request sent by your program to the proxy server is:
GET http://www.whatsmybrowser.org/ HTTP/1.1
Host: www.whatsmybrowser.org
Connection: keep-alive
Accept-Encoding: gzip, deflate
Accept: */*
User-Agent: python-requests/2.10.0
Now, when the proxy server sends this request to the actual destination, it sends:
GET http://www.whatsmybrowser.org/ HTTP/1.1
Host: www.whatsmybrowser.org
Accept-Encoding: gzip, deflate
Accept: */*
User-Agent: python-requests/2.10.0
Via: 1.1 naxserver (squid/3.1.8)
X-Forwarded-For: 122.126.64.43
Cache-Control: max-age=18000
Connection: keep-alive
As you can see, it puts your IP (in my case, 122.126.64.43) in the X-Forwarded-For HTTP header, and hence the website knows that the request was sent on behalf of 122.126.64.43.
Read more about this header at: https://www.rfc-editor.org/rfc/rfc7239
If you want to host your own squid proxy server and want to disable setting X-Forwarded-For header, read: http://www.squid-cache.org/Doc/config/forwarded_for/
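For reference, the relevant directive in squid.conf looks like this (a fragment; see the squid documentation linked above for the other accepted values, such as transparent and delete):

```
# squid.conf: do not reveal the client's IP in X-Forwarded-For
forwarded_for off
```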

use Python urllib.urlopen to send data accurately

I want to use Python to simulate a login action, which sends some verification data via an HTTP GET request. So I wrote something like this:
from urllib.request import urlopen, Request
urlopen(Request(URL, data=data_for_verify.encode(), method='GET'))
The problem is, it doesn't do the same as a real login action which like this (scratch from Wireshark, HTTP printable data only)
GET /rjsdcctrl?mac%3dfcaa14ec56f3%26ipv4%3d1681312010%26ipv61%3d0%26ipv62%3d0%26ipv63%3d0%26ipv64%3d0%26product%3d33554432%26mainver%3d67108864%26subver%3d1610612736 HTTP/1.1
Accept: text/*
User-Agent: HttpCall
Accept-Language: en-us
Host: 10.0.6.251
Cache-Control: no-cache
And what my program did is:
GET / HTTP/1.1
Accept-Encoding: identity
Content-Type: application/x-www-form-urlencoded
Host: 10.0.6.251:80
User-Agent: Python-urllib/3.4
Connection: close
Content-Length: 161
rjsdcctrl?mac%3dfcaa14ec56f3%26ipv4%3d1681312010%26ipv61%3d0%26ipv62%3d0%26ipv63%3d0%26ipv64%3d0%26product%3d33554432%26mainver%3d67108864%26subver%3d1610612736
The real login request carries the data in the request line itself (the GET line) and has no separate body, whereas my program sends GET / HTTP/1.1 and puts the data in the message body. How can I simulate the real request using Python's urllib?
I use Python 3.4
You shouldn't use the data parameter if you don't want to send data as part of the body. Append the value to the URL instead:
full_url = "%s?%s" % (URL, data_for_verify)  # keep it as str; formatting bytes would give "b'...'"
urlopen(full_url)
To extend @Daniel's answer, you can use urllib.parse.urlencode to prepare your GET parameter string; to override default headers, build a Request object and pass it a headers dict (urlopen itself takes no headers argument). For example:
from urllib.parse import urlencode
from urllib.request import urlopen, Request

url = 'http://www.example.com/'
data = {
    'key1': 'value1',
    'key2': 'value2',
    'key3': 'value3'
}
headers = {
    'Overriden-Header': 'Overriden Header Value'
}

## Update the url and make the actual request
url = '%s?%s' % (url, urlencode(data))
response = urlopen(Request(url, headers=headers))

How to send/receive URL parameters in a HTTP POST correctly?

I am using cakephp 2.4.5. I want to send an HTTP POST with URL parameters, using the Python 2.7 requests module. Please assume the payload is structured correctly, as I have tested that part.
URL_post = "http://127.0.0.1/webroot/TestFunc?identity_number=S111A/post"
r = requests.post(URL_post, payload)
On the cakephp side, the controller looks something like this;
public function TestFunc($id = null)
{
    $identity_number = $this->request->query['identity_number'];
    $this->request->data['Model']['associated_id'] = $identity_number;
    $this->Model->saveAll($this->request->data, array('deep' => true));
}
I have verified that the query is not received correctly. However, if I skip the HTTP POST and just request a normal URL, the query is received correctly.
What have I done wrong?
The query part of the URL is sent correctly:
import requests
requests.post('http://localhost/webroot/TestFunc?identity_number=S111A/post',
              {'Model': 'data'})
The Request
POST /webroot/TestFunc?identity_number=S111A/post HTTP/1.1
Host: localhost
User-Agent: python-requests/2.2.1 CPython/3.4 Linux/3.2
Accept: */*
Accept-Encoding: gzip, deflate, compress
Content-Type: application/x-www-form-urlencoded
Content-Length: 10
Model=data
You could also make the request using params:
requests.post('http://localhost/webroot/TestFunc',
              data={'Model': 'data'},
              params={'identity_number': 'S111A/post'})
The only difference is that S111A/post is sent as S111A%2Fpost (the same URL in the end).
Look at http://docs.python-requests.org/en/latest/user/quickstart/#passing-parameters-in-urls.
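You can confirm the encoding without a server by preparing the request and inspecting the resulting URL (a small sketch using the same values as above):

```python
import requests

# Build the request without sending it, to inspect what would go on the wire
req = requests.Request('POST', 'http://localhost/webroot/TestFunc',
                       data={'Model': 'data'},
                       params={'identity_number': 'S111A/post'})
prepared = req.prepare()

print(prepared.url)   # the slash is percent-encoded as %2F
print(prepared.body)
```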
payload = {"identity_number": "S111A/post"}
URL_post = "http://127.0.0.1/webroot/TestFunc"
req = requests.post(URL_post, params=payload)
print(req.status_code)

requests python lib adding in Accept header

So I have made a request to a server with Python's requests library. The code looks like this (it uses an adapter, so it needs to match a certain pattern):
def getRequest(self, url, header):
    """
    implementation of a get request
    """
    conn = requests.get(url, headers=header)
    newBody = conn.content
    newHeader = conn.headers
    newHeader['status'] = conn.status_code
    response = {"Headers": newHeader, "Body": newBody.decode('utf-8')}
    self._huddleErrors.handleResponseError(response)
    return response
The header parameter I am passing in is this:
{'Authorization': 'OAuth2 handsOffMyToken', 'Accept': 'application/vnd.huddle.data+json'}
However, I am getting an XML response back from the server. After checking Fiddler, I see the request being sent is:
Accept-Encoding: identity
Accept: */*
Host: api.huddle.dev
Authorization: OAuth2 HandsOffMyToken
Accept: application/vnd.huddle.data+json
Accept-Encoding: gzip, deflate, compress
User-Agent: python-requests/1.2.3 CPython/3.3.2 Windows/2008ServerR2
As we can see, there are two Accept headers! The requests library is adding in this Accept: */* header, which is throwing off the server. Does anyone know how I can stop this?
As stated in the comments, this seems to be a problem with the requests library on Python 3.3. requests has default headers (which can be found in the utils module). When you don't specify your own headers, these default headers are used; when you do specify your own headers, requests tries to merge the two sets together to make sure you have all the headers you need.
The problem shows itself in the request() method in sessions.py: instead of merging all the headers, it puts in its own headers and then chucks in yours. For now I have just done the dirty hack of removing the Accept header from the default headers found in utils.
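In current versions of requests the merge works correctly: a per-request header overrides the session default with the same name instead of being duplicated. A less invasive workaround than editing utils is to set the header at the session level (a sketch; the URL and token are the question's placeholders):

```python
import requests

s = requests.Session()
# Overriding the session-level default replaces Accept: */* entirely,
# so only one Accept header is ever sent
s.headers['Accept'] = 'application/vnd.huddle.data+json'

req = requests.Request('GET', 'http://api.huddle.dev',
                       headers={'Authorization': 'OAuth2 handsOffMyToken'})
prepared = s.prepare_request(req)  # inspect the merged headers before sending
print(prepared.headers['Accept'])
```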
