There are a couple of default headers that HTTPie sets. I'm wondering if there is a way to remove some of them, like Accept-Encoding?
The reason I'd like to unset Accept-Encoding is to check our server's behavior regarding HTTP compression.
Per https://github.com/jakubroztocil/httpie#http-headers, you can override those headers. For example, setting Accept-Encoding to an empty value achieves the same effect as removing it, per the rules at http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.3.
Add each header name followed by a colon and no value:
No headers:
http -v https://jsonplaceholder.typicode.com/todos/1 \
Accept: \
Accept-Encoding: \
Connection: \
Host: \
User-Agent:
Request:
GET /todos/1 HTTP/1.1
Host: jsonplaceholder.typicode.com
Response:
HTTP/1.1 200 OK
...
Standard:
http -v https://jsonplaceholder.typicode.com/todos/1
Request:
GET /todos/1 HTTP/1.1
Accept: */*
Accept-Encoding: gzip, deflate
Connection: keep-alive
Host: jsonplaceholder.typicode.com
User-Agent: HTTPie/0.9.8
Response:
HTTP/1.1 200 OK
...
The -v option displays the request. Also, remember: no spaces after the \ in multi-line bash commands.
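Since the goal is to probe the server's compression behavior, the same comparison is easy to script. Here is a minimal sketch using Python requests against the same placeholder API; it relies on requests' documented behavior that setting a header's value to None removes it from the request:

import requests

url = 'https://jsonplaceholder.typicode.com/todos/1'

# Default: requests advertises gzip/deflate, so the server may compress.
r1 = requests.get(url)
print(r1.headers.get('Content-Encoding'))  # e.g. 'gzip'

# Setting the header to None removes it from the request entirely.
r2 = requests.get(url, headers={'Accept-Encoding': None})
print(r2.headers.get('Content-Encoding'))  # expect None if the server skips compression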
I'm working on a project that uses a cookie for user identification.
When a user arrives, the app calls the service (which runs on localhost), and the service sends a cookie in its response headers. The request looks like this:
curl 'http://127.0.0.1:8000/api/v1.0/tracking' -X OPTIONS -H 'Access-Control-Request-Method: POST' -H 'Origin: http://local.com:8080' -H 'Access-Control-Request-Headers: content-type,x-forwarded-for' --compressed
The response headers look like this:
HTTP/1.1 200 OK
Connection: keep-alive
Keep-Alive: 60
Access-Control-Allow-Headers: Content-Type, Access-Control-Allow-Headers, x-forwarded-for
Access-Control-Allow-Origin: *
Access-Control-Allow-Methods: POST, PATCH, GET
Content-Length: 0
Content-Type: text/plain; charset=utf-8
Set-Cookie: id=random_id_123_123; expires=Wed, 06-Dec-2017 10:57:36 GMT; Domain=.local.com; Path=/
Then, after a specific user action, the app sends the following API request:
curl 'http://127.0.0.1:8000/api/v1.0/tracking?event=video_added&user_id=123123123' -H 'Origin: http://local.com:8080' -H 'Accept: */*' -H 'Referer: http://local.com:8080/' -H 'Connection: keep-alive' --compressed
The request headers for the above request look like this:
GET /api/v1.0/tracking?event=video_added&user_id=123123123 HTTP/1.1
Host: 127.0.0.1:8000
Connection: keep-alive
Accept: */*
Origin: http://local.com:8080
User-Agent: My user agent
Referer: http://local.com:8080/
Accept-Encoding: gzip, deflate, br
Accept-Language: en-US,en;q=0.9
I was expecting the cookie (random_id_123_123) set by the first response to be sent back as a Cookie request header on the second request.
The website runs at http://local.com:8080 (actually on the local machine; my vhost config points local.com at 127.0.0.1) and is served by Python's SimpleHTTPServer.
The backend service that sets the cookie also runs on localhost, on port 8000.
It seems I have missed something in the implementation. What is it?
Edit: Here is the code.
Your issue is that cookies are only sent when the request's domain matches the cookie's domain. Your code has:
var settings = {
  "crossDomain": true,
  "url": "http://127.0.0.1:8000/api/v1.0/tracking?event=video_added&tracking_id=123123123",
  "method": "GET"
};
The URL is 127.0.0.1:8000, but it should be local.com:8000 if you want the cookie (scoped to Domain=.local.com) to be sent.
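To see that domain rule in action without touching the server, here is a minimal sketch using requests' cookie-jar helpers; the jar mimics the Set-Cookie from the question, and get_cookie_header shows what each URL would actually receive:

import requests
from requests.cookies import RequestsCookieJar, get_cookie_header

# Mimic the cookie from the question: Domain=.local.com
jar = RequestsCookieJar()
jar.set('id', 'random_id_123_123', domain='.local.com', path='/')

for url in ('http://127.0.0.1:8000/api/v1.0/tracking',
            'http://local.com:8000/api/v1.0/tracking'):
    req = requests.Request('GET', url).prepare()
    # None for 127.0.0.1; 'id=random_id_123_123' for local.com
    print(url, '->', get_cookie_header(jar, req))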
Last time I checked, curl doesn't enable cookie handling by default.
To do so you will need to:
Use the parameter -b /path/to/cookiejar to read cookies.
Use the parameter -c /path/to/cookiejar to write cookies.
So your requests should become:
curl -c cookiejar 'http://127.0.0.1:8000/api/v1.0/tracking' \
-X OPTIONS -H 'Access-Control-Request-Method: POST' \
-H 'Origin: http://local.com:8080' \
-H 'Access-Control-Request-Headers: content-type,x-forwarded-for' \
--compressed
And:
curl -b cookiejar 'http://127.0.0.1:8000/api/v1.0/tracking?event=video_added&user_id=123123123' \
-H 'Origin: http://local.com:8080' \
-H 'Accept: */*' \
-H 'Referer: http://local.com:8080/' \
-H 'Connection: keep-alive' --compressed
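If you test from Python instead of curl, a requests.Session handles the cookie jar for you. A sketch under the same assumptions as above (local.com resolves to 127.0.0.1 via your vhost setup, and the URL uses local.com:8000 so the cookie's domain matches):

import requests

s = requests.Session()

# First call: the Set-Cookie from the response lands in the session's jar.
s.options('http://local.com:8000/api/v1.0/tracking',
          headers={'Origin': 'http://local.com:8080',
                   'Access-Control-Request-Method': 'POST'})

# Second call: the stored cookie is sent back automatically.
s.get('http://local.com:8000/api/v1.0/tracking',
      params={'event': 'video_added', 'user_id': '123123123'})
print(s.cookies.get_dict())  # should contain {'id': 'random_id_123_123'}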
I'm trying to connect to a website with Python requests, but not from my real IP. So I found a proxy on the internet and wrote this code:
import requests

proksi = {
    'http': 'http://5.45.64.97:3128'
}

x = requests.get('http://www.whatsmybrowser.org/', proxies=proksi)
print(x.text)
When I check the output, the proxy simply doesn't work: the site returns my real IP address. What did I do wrong? Thanks.
The answer is quite simple. Although it is a proxy service, it doesn't guarantee 100% anonymity. When you send the HTTP GET request via the proxy server, the request sent by your program to the proxy server is:
GET http://www.whatsmybrowser.org/ HTTP/1.1
Host: www.whatsmybrowser.org
Connection: keep-alive
Accept-Encoding: gzip, deflate
Accept: */*
User-Agent: python-requests/2.10.0
Now, when the proxy server sends this request to the actual destination, it sends:
GET http://www.whatsmybrowser.org/ HTTP/1.1
Host: www.whatsmybrowser.org
Accept-Encoding: gzip, deflate
Accept: */*
User-Agent: python-requests/2.10.0
Via: 1.1 naxserver (squid/3.1.8)
X-Forwarded-For: 122.126.64.43
Cache-Control: max-age=18000
Connection: keep-alive
As you can see, the proxy puts your IP (in my case, 122.126.64.43) in the X-Forwarded-For HTTP header, and hence the website knows that the request was sent on behalf of 122.126.64.43.
Read more about this header at: https://www.rfc-editor.org/rfc/rfc7239
If you want to host your own Squid proxy server and disable the X-Forwarded-For header, read: http://www.squid-cache.org/Doc/config/forwarded_for/
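An easy way to verify what the destination actually sees is to aim the proxied request at a header-echoing service. A sketch (httpbin.org reflects the request back as JSON; the proxy address is just the one from the question and may no longer work):

import requests

proxies = {'http': 'http://5.45.64.97:3128'}  # example proxy from the question

r = requests.get('http://httpbin.org/get', proxies=proxies)
data = r.json()
print(data['origin'])   # the IP address the destination attributes to you
print(data['headers'])  # look for X-Forwarded-For and Via here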
If, for example, I send a request to Google, it looks like this in Fiddler:
GET http://www.google.com HTTP/1.1
Accept-Encoding: identity
Host: google.com
Connection: close
User-Agent: Python-urllib/2.7
and I want to make it look like this:
GET http://www.google.com HTTP/1.1
Accept-Encoding: identity
Host: google.com
Connection: close
User-Agent: Python-urllib/2.7
a=112&b=2335
How can I do it using urllib2 (or urllib)?
Thanks!!
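For what it's worth, a minimal sketch of the usual way to attach such a body with urllib2; the caveat is that passing a data argument makes urllib2 send a POST rather than a GET (GET requests don't normally carry a body):

import urllib2

data = 'a=112&b=2335'
req = urllib2.Request('http://www.google.com', data)  # data turns this into a POST
response = urllib2.urlopen(req)
print response.read()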
I've followed the article below in an attempt to set up Apache2 caching for use with Django on Ubuntu 12.10 with mod_wsgi. I want Apache to cache some requests for me.
http://www.howtoforge.com/caching-with-apaches-mod_cache-on-ubuntu-10.04
Following the article, I enabled the modules and set up the PHP script below to test the caching. That caching works just fine: I only get a new timestamp after 5 minutes.
vi /var/www/cachetest.php
<?php
header("Cache-Control: must-revalidate, max-age=300");
header("Vary: Accept-Encoding");
echo time()."<br>";
?>
Now, in my Django view, I return an HttpResponse object after setting the appropriate headers the same way:
# Create a response object with the content to return and set its headers
response = HttpResponse("%s"%(output_display))
response['Cache-Control'] = 'must-revalidate, max-age=20'
response['Vary'] = 'Accept-Encoding'
return response
The caching of the Django response doesn't work at all. I've used Firefox's Live HTTP Headers extension to examine the HTTP headers.
For the PHP script above, the request and response headers look like this:
http://localhost/cachetest.php
GET /cachetest.php HTTP/1.1
Host: localhost
User-Agent: Mozilla/5.0 (X11; Ubuntu; Linux i686; rv:19.0) Gecko/20100101 Firefox/19.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-US,en;q=0.5
Accept-Encoding: gzip, deflate
Connection: keep-alive
Cache-Control: max-age=0
HTTP/1.1 200 OK
Date: Sun, 10 Mar 2013 02:29:32 GMT
Server: Apache/2.2.22 (Ubuntu)
X-Powered-By: PHP/5.4.6-1ubuntu1.1
Cache-Control: must-revalidate, max-age=300
Vary: Accept-Encoding
Content-Encoding: gzip
Content-Length: 34
Connection: close
Content-Type: text/html
----------------------------------------------------------
For my Django request, the caching doesn't work at all: every load forces the lengthy operation to run to completion, just like reloading the PHP page above with F5. According to the Firefox plugin, I seem to be writing the correct headers:
http://localhost/testdjango/testdjango/
GET /testdjango/testdjango/ HTTP/1.1
Host: localhost
User-Agent: Mozilla/5.0 (X11; Ubuntu; Linux i686; rv:19.0) Gecko/20100101 Firefox/19.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-US,en;q=0.5
Accept-Encoding: gzip, deflate
Connection: keep-alive
HTTP/1.1 200 OK
Date: Sun, 10 Mar 2013 02:32:41 GMT
Server: Apache/2.2.22 (Ubuntu)
Vary: Accept-Encoding
Cache-Control: must-revalidate, max-age=20
Content-Encoding: gzip
Connection: close
Transfer-Encoding: chunked
Content-Type: text/html; charset=utf-8
----------------------------------------------------------
What am I doing wrong? How can I get the Django caching to work like it does for the PHP script? Thanks!
This seems to be your problem:
Transfer-Encoding: chunked
In mod_mem_cache terms, this makes yours a 'streamed response'. And, according to the docs:
By default, a streamed response will not be cached unless it has a
Content-Length header.
You can solve it by setting the MCacheMaxStreamingBuffer directive.
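For reference, a sketch of what that might look like in an Apache 2.2 configuration; the buffer size here is an arbitrary example and should be at least as large as your biggest cacheable response:

<IfModule mod_mem_cache.c>
    CacheEnable mem /
    MCacheMaxStreamingBuffer 65536
</IfModule>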
I have a question about Python regex; I don't have much experience with it. I am working with HTTP request messages and parsing them with regex. As you know, HTTP GET messages come in this format:
GET / HTTP/1.0
User-Agent: Wget/1.12 (linux-gnu)
Accept: */*
Host: 10.2.0.12
Connection: Keep-Alive
I want to parse the URI, method, User-Agent, and Host fields of the message. My regex for this job is:
re.compile(r'^({0})\s+(\S+)\s+[^\n]*$\n.*^User-Agent:\s*(\S+)[^\n]*$\n.*^Host:\s*(\S+)[^\n]*$\n'.format('|'.join(methods)), re.MULTILINE | re.DOTALL)
But when the message arrives with the fields in a different order, like
GET / HTTP/1.0
Host: 10.2.0.12
User-Agent: Wget/1.12 (linux-gnu)
Accept: */*
Connection: Keep-Alive
I cannot match them, because the positions of Host and User-Agent have changed. So I need a generic regex that catches all of these fields even when their positions in the message vary.
Readability Counts (The Zen of Python)
Use findall() for each subexpression you want to find. This way your regex will be short, readable, and independent of the location of the subexpression.
Define a simple, readable regex:
>>> import re
>>> user = re.compile(r"User-Agent: (.*?)\n")
Test it with two different HTTP header blocks:
>>> s1='''GET / HTTP/1.0
Host: 10.2.0.12
User-Agent: Wget/1.12 (linux-gnu)
Accept: */*
Connection: Keep-Alive'''
>>> s2='''GET / HTTP/1.0
User-Agent: Wget/1.12 (linux-gnu)
Accept: */*
Host: 10.2.0.12
Connection: Keep-Alive'''
>>> user.findall(s1)
['Wget/1.12 (linux-gnu)']
>>> user.findall(s2)
['Wget/1.12 (linux-gnu)']
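The same idea extends to the other fields the question asks for: one small pattern per field, with the request line handled separately. Continuing the session above (extend the method alternation as needed):

>>> request_line = re.compile(r'^(GET|POST|HEAD)\s+(\S+)\s+HTTP/\d\.\d', re.MULTILINE)
>>> host = re.compile(r'^Host:\s*(\S+)', re.MULTILINE)
>>> request_line.findall(s1)
[('GET', '/')]
>>> host.findall(s2)
['10.2.0.12']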
You could parse the whole header block into a dictionary, like so:
headers = """GET / HTTP/1.0
Host: 10.2.0.12
User-Agent: Wget/1.12 (linux-gnu)
Accept: */*
Connection: Keep-Alive"""
headers = headers.splitlines()
firstLine = headers.pop(0)
(verb, url, version) = firstLine.split()
d = {'verb' : verb, 'url' : url, 'version' : version}
for h in headers:
h = h.split(': ')
if len(h) < 2:
continue
field=h[0]
value= h[1]
d[field] = value
print d
print d['User-Agent']
print d['url']