Accessing Response Header Location Through Python - python

I'm currently trying to access the Location field in the response header from a GET request to the url https://dbr.ee/aUJA/d?. Currently, I have been able to view the Location field through this Python code:
import requests
r = requests.get('hhttps://dbr.ee/aUJA/d?', allow_redirects=False, headers={'User-Agent': 'Mozilla/5.0'})
print r.headers
But the output has the wrong Location field:
{'Status': '302 Found', 'X-Request-Id':
'9e968067-1bee-4cc9-9305-19d45d5cb6ea', 'X-XSS-Protection': '1;
mode=block', 'X-Content-Type-Options': 'nosniff', 'Transfer-Encoding':
'chunked', 'Set-Cookie':
'__cfduid=d21c538fd46c153a046bf461ca281978d1499637583; expires=Mon,
09-Jul-18 21:59:43 GMT; path=/; domain=.dbr.ee; HttpOnly,
ahoy_visitor=f4f1c08c-add3-45c0-8325-675b1caf3048; path=/;
expires=Tue, 09 Jul 2019 21:59:44 -0000,
ahoy_visit=cdbb4ca8-3272-473c-8562-03596d88ec0f; path=/; expires=Mon,
10 Jul 2017 01:59:44 -0000, ahoy_track=true; path=/, SERVERID=;
Expires=Thu, 01-Jan-1970 00:00:01 GMT; path=/', 'X-Runtime':
'0.006820', 'Server': 'cloudflare-nginx', 'Connection': 'keep-alive',
'Location': 'hhttps://dbr.ee/aUJA', 'Cache-Control': 'no-cache',
'Date': 'Sun, 09 Jul 2017 21:59:44 GMT', 'X-Frame-Options':
'SAMEORIGIN', 'Content-Type': 'text/html; charset=utf-8', 'CF-RAY':
'37be8d52fdc83822-ATL'}
Which is:
'Location': 'hhttps://dbr.ee/aUJA'
While on the site, the actual response header is this (viewed through Chrome Developer Tools):
cache-control:no-cache
cf-ray:37be8bacacb437d4-ATL
content-type:text/html; charset=utf-8
date:Sun, 09 Jul 2017 21:58:36 GMT
location:hhttps://s.dbr.ee/sffc/python%2Dlogo%2Dmaster%2Dv3%2DTM.png.zip?temp_url_sig=41ebabb749293a6fe3f3ec82c5ab8ec01b0ed053&temp_url_expires=1499637816&filename=python-logo-master-v3-TM.png.zip;&attachment
server:cloudflare-nginx
set-cookie:ahoy_visit=f7d15e42-155c-443f-a637-22c3681863a5; path=/; expires=Mon, 10 Jul 2017 01:58:36 -0000
set-cookie:_dbree_session=U2x6akdCbUJ4c28wdW9MeUFYOXo1QUVxLzV3ZVNxcGtTWW1jbVdkWEdPOWZPMWFiOEl4M0VWY1dOWGNYTjNubEJoVWJHejRCTlQwQlkwL0UrM09QallTMzhFZlU3RFBBTDZxaW9xcGRMeXNlQS9mZFByYTZQWTM0ZlBHMU50ekhhTkt1bjZENXJHRnc2a3dWeGY2d3BBPT0tLVNKOTJnL0Q3SjloWEc0MTZqTnRPNFE9PQ%3D%3D--2dd8f3e77a673f385c9a231af426b55f1d1f71c0; domain=dbr.ee; path=/; HttpOnly
set-cookie:SERVERID=; Expires=Thu, 01-Jan-1970 00:00:01 GMT; path=/
status:302 Found
x-content-type-options:nosniff
x-frame-options:SAMEORIGIN
x-request-id:f57f3ca7-c7aa-4449-a2d7-7b5014010d0f
x-runtime:0.015892
x-xss-protection:1; mode=block
where Location is
location:hhttps://s.dbr.ee/sffc/python%2Dlogo%2Dmaster%2Dv3%2DTM.png.zip?temp_url_sig=41ebabb749293a6fe3f3ec82c5ab8ec01b0ed053&temp_url_expires=1499637816&filename=python-logo-master-v3-TM.png.zip;&attachment
Which is the download link I am trying to scrape in Python. This appears in Developer Tools after clicking the Direct Download button.
How can I get the header to show me the correct Location field in Python?
*Links have been modified with an 'h' in front of 'http' because I am not allowed to post more than 2 links, but they are necessary for the context of the question.

Looks like the issue was the missing Referer header. Once I add that to your code, I get the appropriate 302 redirect response with the correct Location header:
import requests
r = requests.get('https://dbr.ee/aUJA/d?', allow_redirects=False, headers={
    'Referer': 'https://dbr.ee/aUJA'
})
print(r.headers)
Which produces:
{'Date': 'Sun, 09 Jul 2017 23:44:55 GMT', 'Content-Type': 'text/html; charset=utf-8', 'Transfer-Encoding': 'chunked', 'Connection': 'keep-alive', 'Set-Cookie': '__cfduid=d071cba66cc515ca7f2bc620362c6d46d1499643895; expires=Mon, 09-Jul-18 23:44:55 GMT; path=/; domain=.dbr.ee; HttpOnly, ahoy_visitor=64d9f580-781e-4037-8951-ce57b73df720; path=/; expires=Tue, 09 Jul 2019 23:44:55 -0000, ahoy_visit=802132cc-4e0e-4089-9be5-49f05223f567; path=/; expires=Mon, 10 Jul 2017 03:44:55 -0000, SERVERID=; Expires=Thu, 01-Jan-1970 00:00:01 GMT; path=/', 'Status': '302 Found', 'Cache-Control': 'no-cache', 'X-XSS-Protection': '1; mode=block', 'X-Request-Id': '14a0d0df-c14d-477d-b87c-b6edb823619c', 'Location': 'https://s.dbr.ee/sffc/python%2Dlogo%2Dmaster%2Dv3%2DTM.png.zip?temp_url_sig=084b2b71c8c12df993d528e991a5b44e46e974ef&temp_url_expires=1499644195&filename=python-logo-master-v3-TM.png.zip;&attachment', 'X-Runtime': '0.006968', 'X-Frame-Options': 'SAMEORIGIN', 'X-Content-Type-Options': 'nosniff', 'Server': 'cloudflare-nginx', 'CF-RAY': '37bf2769fda80fa5-YYZ'}
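From that dictionary, the download link is just the value of the Location header. requests stores response headers in a case-insensitive dict, so the lowercase spelling Chrome shows and the canonical name both work. A minimal sketch of the lookup (the URL here is a stand-in, not the real signed link, so it runs without the network):

```python
from requests.structures import CaseInsensitiveDict

# requests returns response headers as a CaseInsensitiveDict, so
# 'location' (Chrome's spelling) and 'Location' refer to the same entry.
headers = CaseInsensitiveDict({
    'Status': '302 Found',
    'Location': 'https://s.dbr.ee/sffc/example.zip',  # stand-in value
})
download_url = headers.get('location')
print(download_url)  # https://s.dbr.ee/sffc/example.zip
```

In real use, `headers` would be `r.headers` from the request above, and `download_url` is the link to scrape.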

Related

Error 401 after successful NTLM authentication

I am trying to log into a site which requires NTLM authentication, using HttpNtlmAuth from requests_ntlm. Here is the code. I have not shown the actual url (url1) as it is from my company.
import requests
from requests_ntlm import HttpNtlmAuth

url1 = "http://url"
payload = {"ref": "B72048061"}
header_data = {'User-agent': 'Mozilla/5.0'}
auth = HttpNtlmAuth(User, Password)  # User, Password and proxies are defined elsewhere
r1 = requests.get(url1, params=payload, headers=header_data, auth=auth, proxies=proxies, verify=False)
r1
I am getting the error below.
<Response [401]>
Here are more details.
print(r1.url)
print(r1.headers)
print(r1.text)
output
http://url?ref=B72048061
{'Date': 'Mon, 19 Dec 2022 11:55:05 GMT', 'Set-Cookie': 'SMSESSION=8WEKGZ5xQuxJ9Uefebxe+YIv6HDZhltVtmL9gdrSvmguTuaIH6zZ40TW9+y5ZSTQzEKJLpRta26Fj1txU2lHrnnU7F8GSuviGsLPrKoh0xtlxhkrdogKgOtFuaw9k2e9cusAWjtBE+as2i1Qd0D2cPRUGm/4fJez+itFVyAzJ0eP7pW/ggGyIFoZmU7XjgaqliRc9tuqU5SufXrEPztO4yZSEUgaXUz/ul6XK0NZ/QO211cCmxanYkmGHvCdAhT/z9p2V8Xq5wlRRvpgupVbuxgrvu+OQOANdhuLWOj+KZFiZmyyECKe0QQnK08tFiV7GbZUHNRMK+8lm/zYOkSm/w9NXtIluAIBEzuClzk1cmfesjxCmDXjHuZ90jtwQDxuSRwIcXzdZqvl4L6k8oBIPZVJD3QFAkmudxnXZTddHRFx+YaKvF2Yjhq8vmef+ucanEavekMktpoo9r226og31Zd6uV3C1mZT/8zcN4PVdoJR0XZMLSUKwxLh28EQjIkDX0RhXQkOkVaX3BAiMqubPqJsjlZDRVjNdIqa05qbLBmAG9tzNIv1PNzXltJ6/zptiy81Hms8W3IZrNQLbe8Ry2tAqCCbmkdP+KJ1N20OHGbcUiyyigW2YN24u3/JFx8XTEd4DkSpzFk/kVzBWw/zDlNAI3N0rMNcbfZ/MdXO54lLCa35+znwyb9fSxRt3n4tNCnfpaf6okIXJjYFoEZl/moICdXmEXXF+u3LJAWSc5hZM78S7kM4r0rrxz1qfZZSK0cVPCyc4KfAksnrIgq8ciCEZuieVcXiYbYic9d0B9XcOEArkDxoremnavk0RZAxisyaYR5UJcKtQTKfr5r2lzgEwD2mHIU8sjiDG0lNchlE05uJfYTrnas6oNOA1ZLuKHHRJ0kx8U6QSi4yHbuPeJ4Gubqbpiq/a8E6x5j00hwJl/ZDp8CB77HzhgWvACnYD8TrNwzEpjOYD70PyAq1QHeoUsmZVgggPHs0PtX05UhL+77aIVT/STdZCKj4ZdDbPHbTgz81vh9jfpR6lv6FXCWEWO/b+DxDsBuaPlDid36gVxCcvj4r8YnLIclWEVcznrZfulE2wTIB86Ckk+3lVnRw5FWPWLhvgdcbeiQFho7u8B6x4H7zTqWinCpvVH8mVEu6fPA7swECLMii+YrbC2v67Uyc1qdGj5+HO7qJYEcMMA2yI5ygaQK7kMYeeLOPj2U6Qw7ni4u989WFvV24ZdQE9TW6hMDWYz82MEnJkI7spDO3JKxfiMStYY1/TdaeVsGrc3KonDBwNPjfjn03zkI+kX+CtJML; path=/; domain=xxx.corp, SMCHALLENGE=YES; Path=/; Domain=xxx.corp, SMONDENIEDREDIR=NO; Expires=Thu, 01-Dec-94 16:00:00 GMT; Path=/; Domain=xxx.corp', 'X-Powered-By': 'Servlet/3.0', 'WWW-Authenticate': 'Basic realm="REP_F370_TAI-root [12:55:5:945] "', '$WSEP': '', 'Expires': 'Thu, 01 Dec 1994 16:00:00 GMT', 'Cache-Control': 'no-cache="set-cookie, set-cookie2"', 'X-OneAgent-JS-Injection': 'true', 'X-ruxit-JS-Agent': 'true', 'Server-Timing': 'dtSInfo;desc="0", dtRpid;desc="-521266895"', 'Keep-Alive': 'timeout=10, max=99', 'Connection': 'Keep-Alive', 'Transfer-Encoding': 'chunked', 'Content-Type': 
'text/html;charset=ISO-8859-1', 'Content-Language': 'en-US'}
Error 401: ACCESS-DENIED
Take a look into the response history:
for resp in r1.history:
    print(resp.url)
    print(resp.text)
    print('Status Code:', resp.status_code)
    print(resp.headers)
    print('\n')
Output:
http:////url?ref=B72048061
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<html><head>
<title>302 Found</title>
</head><body>
<h1>Found</h1>
<p>The document has moved here.</p>
<hr>
<address>IBM_HTTP_Server at xxx-pdm-services.eu.xxx.corp Port 1080</address>
</body></html>
Status Code: 302
{'Date': 'Mon, 19 Dec 2022 12:25:06 GMT', 'Cache-Control': 'no-store', 'Location': 'https://winssor12-vip.xxx.corp:443/siteminderagent/ntlm/creds.ntc?CHALLENGE=&SMAGENTNAME=-SM-9xex6NsJ4%2b587a54UsP5CW4QMeT35xwiKJoZua8FNdPb8Uvg%2bW%2fjUV2leieKYjCz&TARGET=-SM-HTTP%3a%2f%2fxxx--pdm--services%2eeu%2exxx%2ecorp%3a1080%2faps--web%2fP%2fview%2fO_IP%3fref%3dV92B72048061', 'Server-Timing': 'dtSInfo;desc="0", dtRpid;desc="-139048060"', 'Set-Cookie': 'dtCookie=v_4_srv_9_sn_9A3A0571C2E35B5E0D7C5AB97C0604EB_perc_100000_ol_0_mul_1_app-3Afb71c72ef431f887_1; Path=/; Domain=.xxx.corp', 'Content-Length': '570', 'Keep-Alive': 'timeout=10, max=100', 'Connection': 'Keep-Alive', 'Content-Type': 'text/html; charset=iso-8859-1'}
https://winssor12-vip.xxx.corp:443/siteminderagent/ntlm/creds.ntc?CHALLENGE=&SMAGENTNAME=-SM-9xex6NsJ4%2b587a54UsP5CW4QMeT35xwiKJoZua8FNdPb8Uvg%2bW%2fjUV2leieKYjCz&TARGET=-SM-HTTP%3a%2f%2fxxx--pdm--services.eu.xxx.corp%3a1080%2faps--web%2fP%2fview%2fO_IP%3fref%3dB72048061
Status Code: 302
{'Via': 'proxy A', 'Date': 'Mon, 19 Dec 2022 12:25:06 GMT', 'Server': 'Microsoft-IIS/10.0', 'Location': 'HTTP://xxx-pdm-services.eu.xxx.corp:1080/aps-web/P/view/O_IP?ref=B72048061', 'Connection': 'Keep-Alive', 'set-cookie': 'SMCHALLENGE=NTC_CHALLENGE_DONE; path=/; domain=xxx.corp, SMSESSION=H5Ya4mHFxMf5o5wXubpUuiKrZM09+RY8Na0i+Q2kxE1aor1MQhybJlg4WgFLB5Iw8lRNFhr1qvPUElsa4g+M0obXRSvP6jt0XwQPQOHc6pnmhXYyDB6L2lAIrgccnPvZsFxYF6Cig+coLbE0DIDeWFuDXD47LXBRzTqY1KRwCOkX+oSuC0AdvhGoeIOljgthwm8KLvHEzVq9fvUSfJAPeRPOyoRRtt0JA6Vg4BgCvsiNT1KHV1/9UNugmWp668juLilZHM6ZJZZt5i//5zFMTlgxSH91z7PkQUzKBwHrxZ0BPXIC4dlVsirfaMgRcZ33T5Cj5jLuDWk628Ce3ps6JbxCopaxKiD5xQFxLGbkK0juKvqBPAlJqLG8Rvv3vEGuujPt0G8mGmysGjHBjIYup7xFzCfD4mDkB2haZAOJrIhEJ7NBERoeemak2XmeUfeM8WxdXlAAgZmcxhH5+04RovujA7Qwwv4LW22jO3nPVqBQi81sN+5faW/ZPgCXXZ4Pzd7icTri6yJYVT5SBcaMhXbxkfNwNSk7YgFrU+5BDj7rDJNjVCiGAiapEQjU+3GxrRc9Ql7Rf6+gKWEEcNrWKS68O70+0uZftnldkA8a5MmJuTQ0SyApSWMPIvICQvKzBRrhBPE5R1LWxZRZIQ3ysDhG92fg3s8Arau5dFwuFYFMvQXAAYgjSNZ4c114yfWOjcV2edtDRYH9SekqpX6QCxX0UBex0LeKjbrSmNiY2IPdmBOoxHHRA9KreHgebugh4wEX9Xrp+oUm9MjxhQee+/lqiAqcuh81crWYttu3LkaXkj0BUoRmm6rqhm3GuaSwQSAN35LpcStN/+IgEEYFsIswd1OiWNY6IfARN5nLpqQgupMhVa3G/kZpadSfDmt0/u+G07M1EjjpaJB2eoDCHeDBFVZAJhgNxtzj2Na6T0SE5JYmCSp1XdZj0oeOC33eDX1M9q9xDYOCHLhmcCcFP7KE5mi0Q3KZ7N//ws95uPTdgPwMdxxKVykNebdbEPQjMltN09Uj80ws5N4FRu/9Cyq+4L+jS0gEaZz0OIR4l0v4nhqWqtoq3s1pkTlGU8z52uJE1IKH7pO5gG5iz9+/HYpA1JIdQH/HuL2Nu6SOh14FTdrATU5Id9Ip9Wvyi62G/FgLXHh/BYHC9yxrZYQX0M3NYQSjCuDRtbhGLPKkmYvMSDl8ZGmjlLapRdaQn66J6ILPmLKnifoydT8FJizsAzVtdGhNo5yszH5iCK6Y0976oVWqYfIadT+I6RdBC+/iVVzxHYZEMxsDOLBKAKPLDR5tyK2hilM4; path=/; domain=airbus.corp', 'X-Powered-By': 'ASP.NET', 'Cache-Control': 'no-store', 'Content-Length': '0', 'Persistent-Auth': 'false'}
So, as can be seen in the history, NTLM authentication was completed and the request was redirected back to the required url1. However, due to some issue which I am not able to figure out, access is denied with error code 401 as can be seen above. Another thing I noticed is that SMCHALLENGE in the response header's Set-Cookie is 'YES' and there is an SMONDENIEDREDIR cookie. When I checked the same flow in the browser's network tab for a successful login, SMCHALLENGE in the set cookies remained 'NO' and there was nothing like SMONDENIEDREDIR. I am now wondering whether NTLM authentication was successful or not, and how to solve this issue!

Can not get Set Cookie value python requests headers

I am trying to get a particular cookie from a request. I can see it being set in the browser through a particular endpoint, which is:
https://www.store.com/cart/miniCart/TOTAL?_=1591997339780
These are the response headers I see through Chrome Dev Tools:
Cache-Control: no-cache, no-store, max-age=0, must-revalidate
CF-Cache-Status: DYNAMIC
CF-RAY: 5a26c442fb22801a-SAN
cf-request-id: 034c18fddd0000801ac6b42200000001
Connection: keep-alive
Content-Encoding: gzip
Content-Language: es
Content-Type: application/json;charset=UTF-8
Date: Fri, 12 Jun 2020 21:46:48 GMT
Expect-CT: max-age=604800, report-uri="https://report-uri.cloudflare.com/cdn-cgi/beacon/expect-ct"
Expires: 0
Pragma: no-cache
Server: cloudflare
Set-Cookie: JSESSIONID=18A8A12169ED6472A7359160F663CCF8; Path=/; Secure; HttpOnly
set-cookie: store-cart=d49a003e-41b5-444a-a71d-26b6f8db201c; Expires=Sun, 09-Nov-2031 13:46:48 GMT; Path=/; Secure; HttpOnly
set-cookie: AWSELB=11E5B3D30C8ACAF6D3240C8807474BBC740A29E2E0C61131788A04E3E6A646357EAA774C0A57B3DA33B571BADB93658470F13A3C847B4477CA237BB286CE5F3813ACBA53EEB69427F5D135043AFB3B2DC4835F3057;PATH=/;SECURE;HTTPONLY
Strict-Transport-Security: max-age=31536000 ; includeSubDomains
Transfer-Encoding: chunked
Via: 1.1 www.innvictus.com
X-Content-Type-Options: nosniff
X-Frame-Options: SAMEORIGIN
X-XSS-Protection: 1; mode=block
but through Python requests I get only the following headers from response.headers, even though I use the exact same request headers in my code as in my browser:
{'Date': 'Fri, 12 Jun 2020 21:56:36 GMT', 'Content-Type': 'application/json;charset=UTF-8', 'Transfer-Encoding': 'chunked', 'Connection': 'keep-alive', 'Cache-Control': 'no-cache, no-store, max-age=0, must-revalidate', 'Content-Language': 'es', 'Expires': '0', 'Pragma': 'no-cache', 'Set-Cookie': 'JSESSIONID=66AA2037611590192D2E13C38FF65289; Path=/; Secure; HttpOnly, AWSELB=11E5B3D30C8ACAF6D3240C8807474BBC740A29E2E0D0EAFB9AD200F275E3F63597988B98E611188683EDE09A5FA437554B92ECADED7B4477CA237BB286CE5F3813ACBA53EEF44544C6AD7FBBF8C242FCAC378603C5;PATH=/;SECURE;HTTPONLY', 'Strict-Transport-Security': 'max-age=31536000 ; includeSubDomains', 'Via': '1.1 www.store.com', 'X-Content-Type-Options': 'nosniff', 'X-Frame-Options': 'SAMEORIGIN', 'X-XSS-Protection': '1; mode=block', 'CF-Cache-Status': 'DYNAMIC', 'cf-request-id': '034c21f9160000e6f09b961200000001', 'Expect-CT': 'max-age=604800, report-uri="https://report-uri.cloudflare.com/cdn-cgi/beacon/expect-ct"', 'Server': 'cloudflare', 'CF-RAY': '5a26d2a1beb9e6f0-EWR', 'Content-Encoding': 'gzip'}
The cookie I need is the "store-cart=d49a003e-41b5-444a-a71d-26b6f8db201c; Expires=Sun, 09-Nov-2031 13:46:48" cookie, but as you can see it is not in the dictionary I get from response.headers.
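Worth knowing in general: requests folds repeated Set-Cookie headers into a single comma-joined value in response.headers, and the parsed cookies live in response.cookies (or session.cookies when using a Session), so that jar is the place to look for an individual cookie rather than the raw header. A sketch of the jar API, using the cookie names from the question with placeholder values (in real use the server populates the jar):

```python
import requests

s = requests.Session()
# Placeholder values; in real use these would be set by s.get(...) responses.
s.cookies.set('JSESSIONID', '18A8A12169ED6472A7359160F663CCF8',
              domain='www.store.com', path='/')
s.cookies.set('store-cart', 'd49a003e-41b5-444a-a71d-26b6f8db201c',
              domain='www.store.com', path='/')
# Look the cookie up in the jar instead of parsing the Set-Cookie header.
cart_id = s.cookies.get('store-cart')
print(cart_id)  # d49a003e-41b5-444a-a71d-26b6f8db201c
```

If the cookie still isn't in the jar after the real request, the server simply didn't send it for that request (often because an earlier request or header the browser sent is missing).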

Different JSON response when using requests module in python

I am trying to get a JSON response from this URL.
But the JSON I see in the browser is different from what I get from Python's requests response.
The code and its output:-
#code
import requests
r = requests.get("https://www.bigbasket.com/product/get-products/?slug=fruits-vegetables&page=1&tab_type=[%22all%22]&sorted_on=popularity&listtype=pc")
print("Status code: ", r.status_code)
print("JSON: ", r.json())
print("Headers:\n", r.headers)  # note: headers is a property, not a method
#output
Status code: 200
JSON: '{"cart_info": {}, "tab_info": [], "screen_name": ""}'
Headers:
{'Content-Type': 'application/json',
'Content-Length': '52',
'Server': 'nginx',
'x-xss-protection': '1; mode=block',
'x-content-type-options': 'nosniff',
'x-frame-options': 'SAMEORIGIN',
'Access-Control-Allow-Origin': 'https://b2b.bigbasket.com',
'Date': 'Sat, 02 Sep 2017 18:43:51 GMT',
'Connection': 'keep-alive',
'Set-Cookie': '_bb_cid=4; Domain=.bigbasket.com; expires=Fri, 28-Aug-2037 18:43:51 GMT; Max-Age=630720000; Path=/, ts="2017-09-03 00:13:51.164"; Domain=.bigbasket.com; expires=Sun, 02-Sep-2018 18:43:51 GMT; Max-Age=31536000; Path=/, _bb_rd=6; Domain=.bigbasket.com; expires=Sun, 02-Sep-2018 18:43:51 GMT; Max-Age=31536000; Path=/'}
This is what Chrome shows in dev tools:-
HTTP/1.1 200 OK
Content-Type: application/json
Content-Length: 4206
Server: nginx
x-xss-protection: 1; mode=block
x-content-type-options: nosniff
Content-Encoding: gzip
x-frame-options: SAMEORIGIN
Access-Control-Allow-Origin: https://b2b.bigbasket.com
Date: Sat, 02 Sep 2017 15:43:20 GMT
Connection: keep-alive
Vary: Accept-Encoding
Set-Cookie: ts="2017-09-02 21:13:20.193"; Domain=.bigbasket.com; expires=Sun, 02-Sep-2018 15:43:20 GMT; Max-Age=31536000; Path=/
Set-Cookie: _bb_rd=6; Domain=.bigbasket.com; expires=Sun, 02-Sep-2018 15:43:20 GMT; Max-Age=31536000; Path=/
I also tried separating the query string and specifying it as the params argument, but it gives the same result.
import requests
s = requests.session()
s.get("https://www.bigbasket.com/product/get-products/?slug=fruits-vegetables&page=1&tab_type=[%22all%22]&sorted_on=popularity&listtype=pc")
r = s.get("https://www.bigbasket.com/product/get-products/?slug=fruits-vegetables&page=1&tab_type=[%22all%22]&sorted_on=popularity&listtype=pc")
print("Status code: ", r.status_code)
print("JSON: ", r.json())
This is happening because of a different City ID being identified for your web browser and for Requests.
You can check the value of the _bb_cid cookie in both cases.
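One way to test that theory is to pin the city cookie before making the request. A sketch, assuming the _bb_cid=4 value from the Set-Cookie header shown above is the city the browser was assigned (the actual request is left commented out so the sketch runs offline):

```python
import requests

s = requests.session()
# Pin the City ID requests gets to the one the browser has; '4' is taken
# from the Set-Cookie header in the question and is an assumed example.
s.cookies.set('_bb_cid', '4', domain='.bigbasket.com', path='/')
# r = s.get("https://www.bigbasket.com/product/get-products/?slug=fruits-vegetables&page=1&tab_type=[%22all%22]&sorted_on=popularity&listtype=pc")
city_id = s.cookies.get('_bb_cid')
print(city_id)  # 4
```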

How to get the server info of a website using python requests?

I want to make a web crawler to make a statistic about most popular server software among Bulgarian sites, such as Apache, nginx, etc. Here is what I came up with:
import requests
r = requests.get('http://start.bg')
print(r.headers)
Which return the following:
{'Debug': 'unk',
'Content-Type': 'text/html; charset=utf-8',
'X-Powered-By': 'PHP/5.3.3',
'Content-Length': '29761',
'Connection': 'close',
'Set-Cookie': 'fbnr=1; expires=Sat, 13-Feb-2016 22:00:01 GMT; path=/; domain=.start.bg',
'Date': 'Sat, 13 Feb 2016 13:43:50 GMT',
'Vary': 'Accept-Encoding',
'Server': 'Apache/2.2.15 (CentOS)',
'Content-Encoding': 'gzip'}
Here you can easily see that it runs on Apache/2.2.15 and you can get this result by simply saying r.headers['Server']. I tried that with several Bulgarian websites and they all had the Server key.
However, when I request the header of a more sophisticated website, such as www.teslamotors.com, I get the following info:
{'Content-Type': 'text/html; charset=utf-8',
'X-Cache-Hits': '9',
'Cache-Control': 'max-age=0, no-cache, no-store',
'X-Content-Type-Options': 'nosniff',
'Connection': 'keep-alive',
'X-Varnish-Server': 'sjc04p1wwwvr11.sjc05.teslamotors.com',
'Content-Language': 'en',
'Pragma': 'no-cache',
'Last-Modified': 'Sat, 13 Feb 2016 13:07:50 GMT',
'X-Server': 'web03a',
'Expires': 'Sat, 13 Feb 2016 13:37:55 GMT',
'Content-Length': '10290',
'Date': 'Sat, 13 Feb 2016 13:37:55 GMT',
'Vary': 'Accept-Encoding',
'ETag': '"1455368870-1"',
'X-Frame-Options': 'SAMEORIGIN',
'Accept-Ranges': 'bytes',
'Content-Encoding': 'gzip'}
As you can see, there isn't any 'Server' key in this dictionary (although there are X-Server and X-Varnish-Server keys, which I'm not sure what they mean, but their values are not server names like Apache).
So I'm thinking there must be another request I could send that would yield the desired server information, or perhaps they run their own specific server software (which sounds plausible for Facebook).
I also tried other .com websites, such as https://spotify.com, and it does have a 'Server' key.
So is there a way to find the info about the servers Facebook and Tesla Motors use?
That has nothing to do with Python; most well-configured web servers will not return information in the Server HTTP header due to security implications.
No sane developer would want to let you know that they are running an unpatched version of xxx product.
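For the crawler this mostly means the Server header has to be treated as optional. A small sketch of the lookup with a fallback (the headers are inlined from the teslamotors.com response above so the sketch runs offline; in real use they would be r.headers from requests.get):

```python
from requests.structures import CaseInsensitiveDict

# Inlined stand-in for r.headers: a response with no 'Server' key,
# like the teslamotors.com headers shown above.
headers = CaseInsensitiveDict({
    'X-Server': 'web03a',
    'X-Varnish-Server': 'sjc04p1wwwvr11.sjc05.teslamotors.com',
})
server = headers.get('Server', 'unknown')  # fall back when the key is absent
print(server)  # unknown
```

Counting the 'unknown' bucket separately keeps the statistic honest about how many sites hide their server software.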

Not able to upload a file through python

After several attempts and repeated failures, I am posting my code excerpt here. I keep getting an authentication failure. Can somebody point out what it is that I am doing wrong here?
import requests
fileToUpload = {'file': open('/home/pinku/Desktop/Test_Upload.odt', 'rb')}
res = requests.post('https://upload.backupgrid.net/add', fileToUpload)
print res.headers
cookie = {'PHPSESSID': 'tobfr5f31voqmtdul11nu6n9q1'}
requests.post('https://upload.backupgrid.net/add', cookie, fileToUpload)
By print res.headers, I get the following:
CaseInsensitiveDict({'content-length': '67',
'access-control-allow-methods': 'OPTIONS, HEAD, GET, POST, PUT,
DELETE', 'x-content-type-options': 'nosniff', 'content-encoding':
'gzip', 'set-cookie': 'PHPSESSID=ou8eijalgpss204thu7ht532g1; path=/,
B100Serverpoolcookie=4281246842.1.973348976.502419456; path=/',
'expires': 'Thu, 19 Nov 1981 08:52:00 GMT', 'vary': 'Accept-Encoding',
'server': 'Apache/2.2.15 (CentOS)', 'pragma': 'no-cache',
'cache-control': 'no-store, no-cache, must-revalidate', 'date': 'Mon,
09 Sep 2013 09:13:08 GMT', 'access-control-allow-origin': '*',
'access-control-allow-headers': 'X-File-Name, X-File-Type,
X-File-Size', 'content-type': 'text/html; charset=UTF-8'})
It contains the cookies also. Am I passing the cookies correctly? Please help!
You are not passing the cookies correctly; it should be:
requests.post('https://upload.backupgrid.net/add',
              files=fileToUpload,
              cookies=cookie)
See also documentation:
Cookies
POST a Multipart-Encoded File
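To see what the corrected call actually sends without hitting the server, the request can be prepared locally. A sketch with an in-memory stand-in for the file (the PHPSESSID value is the one from the question):

```python
import requests

cookie = {'PHPSESSID': 'tobfr5f31voqmtdul11nu6n9q1'}
# In-memory stand-in for open('/home/pinku/Desktop/Test_Upload.odt', 'rb')
fileToUpload = {'file': ('Test_Upload.odt', b'dummy bytes')}

# prepare() builds the request exactly as requests.post would send it.
req = requests.Request('POST', 'https://upload.backupgrid.net/add',
                       files=fileToUpload, cookies=cookie)
prepared = req.prepare()
print(prepared.headers['Content-Type'])  # multipart/form-data; boundary=...
print(prepared.headers['Cookie'])        # PHPSESSID=tobfr5f31voqmtdul11nu6n9q1
```

Passing the dicts positionally, as in the original code, binds them to the data parameter instead, which is why neither the file nor the cookie reached the server correctly.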
