I used Fiddler to capture a GET request, and I want to resend the exact same request with Python.
This is the request I captured:
GET https://example.com/api/content/v1/products/search?page=20&page_size=25&q=&type=image HTTP/1.1
Host: example.com
Connection: keep-alive
Search-Version: v3
Accept: application/json
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/72.0.3626.121 Safari/537.36
Referer: https://example.com/search/?q=&type=image&page=20
Accept-Encoding: gzip, deflate, br
Accept-Language: en-US,en;q=0.9
You can use the requests module.
The requests module automatically supplies most of the headers for you so you most likely do not need to manually include all of them.
Since you are sending a GET request, you can use the params parameter to neatly form the query string.
Example:
import requests
BASE_URL = "https://example.com/api/content/v1/products/search"
headers = {
    "Connection": "keep-alive",
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/72.0.3626.121 Safari/537.36"
}
params = {
    "page": 20,
    "page_size": 25,
    "type": "image"
}
response = requests.get(BASE_URL, headers=headers, params=params)
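To check what requests would build before anything is sent, the request can be prepared and inspected. This is a minimal sketch, assuming you also want the empty `q` parameter and the `Search-Version`, `Accept`, and `Referer` headers from the captured request, which the example above leaves out. Whether the server actually needs them is something to verify.

```python
import requests

BASE_URL = "https://example.com/api/content/v1/products/search"

# Headers copied from the captured request.
headers = {
    "Search-Version": "v3",
    "Accept": "application/json",
    "Referer": "https://example.com/search/?q=&type=image&page=20",
}

# An empty string is kept and encoded as "q="; a value of None would be dropped.
params = {"page": 20, "page_size": 25, "q": "", "type": "image"}

# Build the request without sending it, so the final URL can be inspected.
prepared = requests.Request("GET", BASE_URL, headers=headers, params=params).prepare()

print(prepared.url)
print(dict(prepared.headers))
```

Defaults such as User-Agent are added by the session when the request is actually sent, so they do not show up in this standalone prepared request.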
import requests
headers = {
    'authority': 'stackoverflow.com',
    'cache-control': 'max-age=0',
    'upgrade-insecure-requests': '1',
    'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/72.0.3626.121 Safari/537.36',
    'accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8',
    'referer': 'https://stackoverflow.com/questions/tagged/python?sort=newest&page=2&pagesize=15',
    'accept-encoding': 'gzip, deflate, br',
    'accept-language': 'en-US,en;q=0.9,tr-TR;q=0.8,tr;q=0.7',
    'cookie': 'prov=6bb44cc9-dfe4-1b95-a65d-5250b3b4c9fb; _ga=GA1.2.1363624981.1550767314; __qca=P0-1074700243-1550767314392; notice-ctt=4%3B1550784035760; _gid=GA1.2.1415061800.1552935051; acct=t=4CnQ70qSwPMzOe6jigQlAR28TSW%2fMxzx&s=32zlYt1%2b3TBwWVaCHxH%2bl5aDhLjmq4Xr',
}
response = requests.get('https://stackoverflow.com/questions/55239787/how-to-send-a-get-request-with-headers-via-python', headers=headers)
This is an example of how to send a GET request with headers to this page.
You can open an SSL socket (https://docs.python.org/3/library/ssl.html) to example.com:443, write your captured request into the socket as raw bytes, and then read the HTTP response from the socket.
You could also try the http.client.HTTPResponse class to read and parse the HTTP response from your socket, but this class is not meant to be instantiated directly, so you may run into unexpected obstacles.
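The raw-socket approach could look roughly like the sketch below. It is not an exact replay: it assumes `Connection: close` instead of the captured `keep-alive` so the read loop ends when the server closes the connection, and it omits `Accept-Encoding` so the body comes back uncompressed. The function names are made up for illustration.

```python
import socket
import ssl

HOST = "example.com"


def build_raw_request() -> bytes:
    """Return the captured request as raw bytes with CRLF line endings."""
    lines = [
        # Request line uses only path and query; Fiddler displays the absolute URL.
        "GET /api/content/v1/products/search?page=20&page_size=25&q=&type=image HTTP/1.1",
        "Host: example.com",
        # Assumption: "close" rather than "keep-alive" so recv() eventually returns b"".
        "Connection: close",
        "Search-Version: v3",
        "Accept: application/json",
        "User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
        "(KHTML, like Gecko) Chrome/72.0.3626.121 Safari/537.36",
        "Referer: https://example.com/search/?q=&type=image&page=20",
        # Accept-Encoding is left out here so the response is not compressed.
        "Accept-Language: en-US,en;q=0.9",
    ]
    # A blank line terminates the header section.
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")


def send_raw(raw: bytes, host: str = HOST, port: int = 443) -> bytes:
    """Send raw request bytes over TLS and return everything the server sends back."""
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=10) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            tls.sendall(raw)
            chunks = []
            while True:
                data = tls.recv(4096)
                if not data:
                    break
                chunks.append(data)
    return b"".join(chunks)
```

Calling `send_raw(build_raw_request())` returns the status line, headers, and body as one bytes object; splitting on the first `b"\r\n\r\n"` separates the headers from the body. A chunked response body would still need decoding by hand.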
Related
I'm trying to log in to a website through a python script that I've created using the requests module. I've issued a post HTTP request with appropriate parameters and headers to the server, but for some reason I get a different response from that site compared to what I see in dev tools. The status is always 200, though. There is also a get request in place within the script that should fetch the credentials once the login is successful. Currently, it throws a JSONDecodeError on the last line.
import requests
link = 'https://propwire.com/login'
check_url = 'https://propwire.com/search'
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/103.0.0.0 Safari/537.36',
    'x-requested-with': 'XMLHttpRequest',
    'referer': 'https://propwire.com/login',
    'accept-encoding': 'gzip, deflate, br',
    'accept-language': 'en-US,en;q=0.9,bn;q=0.8',
    'origin': 'https://propwire.com',
}
payload = {"email":"some-email","password":"password","remember":"true"}
with requests.Session() as s:
    r = s.get(link)
    headers['x-xsrf-token'] = r.cookies['XSRF-TOKEN'].rstrip('%3D')
    s.headers.update(headers)
    s.post(link,json=payload)
    res = s.get(check_url)
    print(res.json()['props']['auth'])
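One detail in the script above that may be worth checking: `str.rstrip('%3D')` removes any trailing run of the characters `%`, `3`, and `D`, not the literal suffix `%3D`, so it can eat into the token itself. If the cookie is URL-encoded (an assumption; `%3D` is the encoding of `=`), decoding it with `urllib.parse.unquote` may be closer to what the server expects. The cookie value below is made up for illustration.

```python
from urllib.parse import unquote

# Hypothetical cookie value whose real content ends in "3D", followed by an encoded "=".
raw_cookie = "eyJpdiI6abc3D%3D"

# rstrip treats its argument as a set of characters to remove from the right.
stripped = raw_cookie.rstrip("%3D")

# unquote decodes percent-escapes and leaves everything else alone.
decoded = unquote(raw_cookie)

print(stripped)  # eyJpdiI6abc
print(decoded)   # eyJpdiI6abc3D=
```

Whether this explains the JSONDecodeError is not certain; it only shows that the two calls can produce different header values.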
How can I get around the 403 code?
I was trying to get search results from a website, but I got a "Response [403]" message. I found similar questions where the 403 error was solved by adding headers to requests.get, but that didn't work for my problem. What should I do to get the result I want?
Google Chrome Request
Request
Request URL: http://178.32.46.165/
Request Method: GET
Status Code: 200 OK
Remote Address: 178.32.46.165:80
Referrer Policy: strict-origin-when-cross-origin
Request Headers
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9
Accept-Encoding: gzip, deflate
Accept-Language: tr-TR,tr;q=0.9
Cache-Control: max-age=0
Connection: keep-alive
Host: 178.32.46.165
Upgrade-Insecure-Requests: 1
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/85.0.4183.102 Safari/537.36
Python Request
import requests
headers = {
    'Connection': 'keep-alive',
    'Cache-Control': 'max-age=0',
    'Upgrade-Insecure-Requests': '1',
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/85.0.4183.102 Safari/537.36',
    'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9',
    'Accept-Language': 'tr-TR,tr;q=0.9',
}
response = requests.get('http://178.32.46.165/', headers=headers, verify=False)
print(response)
Response
<Response [403]>
If you're getting a 403 response, you're likely not meant to be able to access this resource. This is not a Python issue; it is an issue with the website you're trying to access.
Additionally, this seems to be a temporary issue. I pasted your code as-is into my Python prompt and received <Response [200]>.
I think the site may block requests depending on the country they come from.
I want to obtain the request headers for a URL, in particular the cookie value in those headers.
url = 'https://research.ibfd.org/endecapod/myN=3+10+4293760389+4293746984&Nu=global_rollup_key&Np=2&Nao=20&Nty=0&Ns=decision_date_sortable|1'
Headers for that URL are
headers = {
"Accept":"application/json; charset=utf-8",
"Accept-Encoding":"gzip, deflate, br",
"Accept-Language":"en-US,en;q=0.9",
"Connection":"keep-alive",
"Cookie": "optimizelyEndUserId=oeu1575288961321r0.6329519991798649; optimizelyBuckets=%7B%7D; _ga=GA1.2.693353487.1575288963; _hjid=cfef24b9-ec0f-4677-90eb-6f25e4a4351e; s_ecid=MCMID%7C89844361514127015796100074732098692046; optimizelySegments=%7B%22785891794%22%3A%22gc%22%2C%22786481600%22%3A%22false%22%2C%22787091486%22%3A%22referral%22%7D; angular-ab-tests-ibfd_preview=no; IBFD_SESSION=1xwvMTmbcR714KGplMpDLoYVqTikDSbU; _hjIncludedInSample=1; AMCVS_2F71365B5AA928340A495E05%40AdobeOrg=1; s_cc=true; wt_tags=SUB:kbase:masna.venu#thomsonreuters.com::username:; s_sq=%5B%5BB%5D%5D; _hjAbsoluteSessionInProgress=1; has_js=1; _gid=GA1.2.1180329552.1595404659; s_ppvl=Home%2520-%2520Tax%2520Research%2520Platform%2520-%2520IBFD%2C90%2C90%2C588%2C1422%2C588%2C1280%2C720%2C1.35%2CP; s_ppn=Search%20-%20Tax%20Research%20Platform%20-%20IBFD; s_nr=1595405276724-Repeat; AMCV_2F71365B5AA928340A495E05%40AdobeOrg=1075005958%7CMCIDTS%7C18466%7CMCMID%7C89844361514127015796100074732098692046%7CMCOPTOUT-1595412476s%7CNONE%7CMCAID%7CNONE%7CvVersion%7C4.4.1; s_ppv=Search%2520-%2520Tax%2520Research%2520Platform%2520-%2520IBFD%2C100%2C75%2C21135.408203125%2C360%2C588%2C1280%2C720%2C1.35%2CL",
#"Cookie":"optimizelyEndUserId=oeu1575288961321r0.6329519991798649; optimizelyBuckets=%7B%7D; _ga=GA1.2.693353487.1575288963; _hjid=cfef24b9-ec0f-4677-90eb-6f25e4a4351e; s_ecid=MCMID%7C89844361514127015796100074732098692046; optimizelySegments=%7B%22785891794%22%3A%22gc%22%2C%22786481600%22%3A%22false%22%2C%22787091486%22%3A%22referral%22%7D; angular-ab-tests-ibfd_preview=no; _gid=GA1.2.1850576194.1594914732; IBFD_SESSION=IstWIRdSWRTYlHdmcgyqwaigHIGkr6Q3; AMCVS_2F71365B5AA928340A495E05%40AdobeOrg=1; _hjIncludedInSample=1; s_ppn=Search%20-%20Tax%20Research%20Platform%20-%20IBFD; s_ppvl=%5B%5BB%5D%5D; s_cc=true; wt_tags=SUB:kbase:masna.venu#thomsonreuters.com::username:; s_sq=%5B%5BB%5D%5D; _hjAbsoluteSessionInProgress=1; AMCV_2F71365B5AA928340A495E05%40AdobeOrg=1075005958%7CMCIDTS%7C18461%7CMCMID%7C89844361514127015796100074732098692046%7CMCOPTOUT-1594974899s%7CNONE%7CMCAID%7CNONE%7CvVersion%7C4.4.1; s_nr=1594967699992-Repeat; s_ppv=Search%2520-%2520Tax%2520Research%2520Platform%2520-%2520IBFD%2C100%2C91%2C3987%2C759%2C642%2C1280%2C720%2C1.35%2CP",
"Host":"research.ibfd.org",
"Referer":"https://research.ibfd.org/",
"Sec-Fetch-Dest":"empty",
"Sec-Fetch-Mode":"cors",
"Sec-Fetch-Site":"same-origin",
"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/83.0.4103.106 Safari/537.36"
}
I got those headers from Chrome developer tools. The cookie value in them changes every day, and I want to retrieve it with a script.
Can anyone help me do that?
Thanks in advance.
This code posts form data:
headers = {
    'authority': 'ec.ef.com.cn',
    'user-agent': 'Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/83.0.4103.106 Safari/537.36',
    'accept': '*/*',
    'accept-language': 'en-US,en;q=0.9,ru-RU;q=0.8,ru;q=0.7,uk;q=0.6,en-GB;q=0.5',
}
s = requests.Session()
response = s.post(url, headers=headers)
This seems to differ from what Chrome sends.
I understand that :authority is an HTTP/2 pseudo-header. How do I send it with Python requests?
You could use hyper.contrib.HTTP20Adapter and mount it on the session, like this:
from hyper.contrib import HTTP20Adapter
import requests
def getHeaders():
    headers = {
        ":authority": "xxx",
        ":method": "POST",
        ":path": "/login/secure.ashx",
        ":scheme": "https",
        "User-Agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_3) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/74.0.3729.169 Safari/537.36",
        "X-Requested-With": "XMLHttpRequest"
    }
    return headers
# url_search and payload are assumed to be defined elsewhere.
sessions = requests.session()
sessions.mount('https://xxxx.com', HTTP20Adapter())
r = sessions.post(url_search, data=payload, headers=getHeaders())
Refer to a Chinese blog
How do I change the referer when I'm using the requests library to make a GET request to a web page? I went through the entire manual but couldn't find it.
According to http://docs.python-requests.org/en/latest/user/advanced/#session-objects , you should be able to do:
s = requests.Session()
s.headers.update({'referer': my_referer})
s.get(url)
Or just:
requests.get(url, headers={'referer': my_referer})
Your headers dict will be merged with the default/session headers. From the docs:
Any dictionaries that you pass to a request method will be merged with
the session-level values that are set. The method-level parameters
override session parameters.
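The merge behaviour quoted above can be checked without sending anything by preparing a request through the session. This is a small sketch; the URL and referer values are placeholders, not from the original answer.

```python
import requests

url = "https://example.com/"  # placeholder; no request is actually sent

s = requests.Session()
s.headers.update({"referer": "https://example.com/from-session"})

# The method-level header overrides the session-level one with the same name.
req = requests.Request("GET", url, headers={"referer": "https://example.com/from-call"})
prepared = s.prepare_request(req)

# Header lookup is case-insensitive.
print(prepared.headers["Referer"])
# Session defaults such as User-Agent are still merged in.
print("User-Agent" in prepared.headers)
```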
Here we rotate the User-Agent together with the Referer:
import random
user_agent_list = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/77.0.3865.90 Safari/537.36",
    "Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/79.0.3945.130 Safari/537.36",
    "Mozilla/5.0 (iPad; CPU OS 15_6 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) CriOS/104.0.5112.99 Mobile/15E148 Safari/604.1"
]
referer_list = [
    'https://stackoverflow.com/',
    'https://twitter.com/',
    'https://www.google.co.in/',
    'https://gem.gov.in/'
]
# random.choice picks one value each time this dict is built.
headers = {
    'Connection': 'keep-alive',
    'Cache-Control': 'max-age=0',
    'Upgrade-Insecure-Requests': '1',
    'User-Agent': random.choice(user_agent_list),
    'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8',
    'Accept-Encoding': 'gzip, deflate',
    'Accept-Language': 'en-US,en;q=0.9,fr;q=0.8',
    'referer': random.choice(referer_list)
}
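A dict built once like the one above keeps the same User-Agent and Referer for the whole run. To get a new combination on every request, the dict can be rebuilt each time. The sketch below uses a hypothetical helper name, `random_headers`, and shortened lists.

```python
import random

USER_AGENTS = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/77.0.3865.90 Safari/537.36",
    "Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/79.0.3945.130 Safari/537.36",
]
REFERERS = [
    "https://stackoverflow.com/",
    "https://twitter.com/",
]


def random_headers():
    """Return a fresh headers dict with a randomly chosen User-Agent and Referer."""
    return {
        "User-Agent": random.choice(USER_AGENTS),
        "Referer": random.choice(REFERERS),
        "Accept-Language": "en-US,en;q=0.9,fr;q=0.8",
    }


# Each call draws again, so consecutive requests can carry different values.
for _ in range(3):
    headers = random_headers()
    print(headers["User-Agent"][:40], "|", headers["Referer"])
```

The result can be passed as `headers=random_headers()` to `requests.get` on each call.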