This code posts form data:
headers = {
'authority': 'ec.ef.com.cn',
'user-agent': 'Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/83.0.4103.106 Safari/537.36',
'accept': '*/*',
'accept-language': 'en-US,en;q=0.9,ru-RU;q=0.8,ru;q=0.7,uk;q=0.6,en-GB;q=0.5',
}
s = requests.Session()
response = s.post(url, headers = headers)
which seems to differ from what Chrome sends.
I understand that :authority is an HTTP/2 pseudo-header. How do I send it with Python requests?
You could use hyper.contrib.HTTP20Adapter and mount it on the session with mount(), like this:
from hyper.contrib import HTTP20Adapter
import requests
def getHeaders():
    # HTTP/2 pseudo-headers start with a colon.
    headers = {
        ":authority": "xxx",
        ":method": "POST",
        ":path": "/login/secure.ashx",
        ":scheme": "https",
        "User-Agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_3) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/74.0.3729.169 Safari/537.36",
        "X-Requested-With": "XMLHttpRequest"
    }
    return headers
# url_search and payload are assumed to be defined elsewhere.
sessions = requests.session()
sessions.mount('https://xxxx.com', HTTP20Adapter())
r = sessions.post(url_search, data=payload, headers=getHeaders())
Reference: a Chinese-language blog post.
I'm trying to write a script that logs in to a website and monitors its content. I want to do this with requests, but despite a 200 status code, when I try to get the content or text after the POST, it shows me the login page's HTML.
Code here:
import requests
POST_LOGIN_URL = 'https://app.distill.io/login'
REQUEST_URL = 'https://app.distill.io/watchlist#inbox'
payload = {
'name': 'username',
'password': 'pass'
}
h = {
'referer': 'https://app.distill.io/watchlist#inbox',
'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/88.0.4324.192 Safari/537.36',
}
h1 = {'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/88.0.4324.192 Safari/537.36'}
session = requests.session()
p = session.get(POST_LOGIN_URL)
cookies = dict(p.cookies)
p = session.post(POST_LOGIN_URL, data=payload, headers=h, cookies=cookies)
print(p.status_code)
r = session.get(REQUEST_URL, headers=h1)
print(r.text)
Am I doing anything wrong?
I want to obtain the request headers for a URL, or at least the cookie value in those headers.
url = 'https://research.ibfd.org/endecapod/myN=3+10+4293760389+4293746984&Nu=global_rollup_key&Np=2&Nao=20&Nty=0&Ns=decision_date_sortable|1'
The headers for that URL are:
headers = {
"Accept":"application/json; charset=utf-8",
"Accept-Encoding":"gzip, deflate, br",
"Accept-Language":"en-US,en;q=0.9",
"Connection":"keep-alive",
"Cookie": "optimizelyEndUserId=oeu1575288961321r0.6329519991798649; optimizelyBuckets=%7B%7D; _ga=GA1.2.693353487.1575288963; _hjid=cfef24b9-ec0f-4677-90eb-6f25e4a4351e; s_ecid=MCMID%7C89844361514127015796100074732098692046; optimizelySegments=%7B%22785891794%22%3A%22gc%22%2C%22786481600%22%3A%22false%22%2C%22787091486%22%3A%22referral%22%7D; angular-ab-tests-ibfd_preview=no; IBFD_SESSION=1xwvMTmbcR714KGplMpDLoYVqTikDSbU; _hjIncludedInSample=1; AMCVS_2F71365B5AA928340A495E05%40AdobeOrg=1; s_cc=true; wt_tags=SUB:kbase:masna.venu#thomsonreuters.com::username:; s_sq=%5B%5BB%5D%5D; _hjAbsoluteSessionInProgress=1; has_js=1; _gid=GA1.2.1180329552.1595404659; s_ppvl=Home%2520-%2520Tax%2520Research%2520Platform%2520-%2520IBFD%2C90%2C90%2C588%2C1422%2C588%2C1280%2C720%2C1.35%2CP; s_ppn=Search%20-%20Tax%20Research%20Platform%20-%20IBFD; s_nr=1595405276724-Repeat; AMCV_2F71365B5AA928340A495E05%40AdobeOrg=1075005958%7CMCIDTS%7C18466%7CMCMID%7C89844361514127015796100074732098692046%7CMCOPTOUT-1595412476s%7CNONE%7CMCAID%7CNONE%7CvVersion%7C4.4.1; s_ppv=Search%2520-%2520Tax%2520Research%2520Platform%2520-%2520IBFD%2C100%2C75%2C21135.408203125%2C360%2C588%2C1280%2C720%2C1.35%2CL",
#"Cookie":"optimizelyEndUserId=oeu1575288961321r0.6329519991798649; optimizelyBuckets=%7B%7D; _ga=GA1.2.693353487.1575288963; _hjid=cfef24b9-ec0f-4677-90eb-6f25e4a4351e; s_ecid=MCMID%7C89844361514127015796100074732098692046; optimizelySegments=%7B%22785891794%22%3A%22gc%22%2C%22786481600%22%3A%22false%22%2C%22787091486%22%3A%22referral%22%7D; angular-ab-tests-ibfd_preview=no; _gid=GA1.2.1850576194.1594914732; IBFD_SESSION=IstWIRdSWRTYlHdmcgyqwaigHIGkr6Q3; AMCVS_2F71365B5AA928340A495E05%40AdobeOrg=1; _hjIncludedInSample=1; s_ppn=Search%20-%20Tax%20Research%20Platform%20-%20IBFD; s_ppvl=%5B%5BB%5D%5D; s_cc=true; wt_tags=SUB:kbase:masna.venu#thomsonreuters.com::username:; s_sq=%5B%5BB%5D%5D; _hjAbsoluteSessionInProgress=1; AMCV_2F71365B5AA928340A495E05%40AdobeOrg=1075005958%7CMCIDTS%7C18461%7CMCMID%7C89844361514127015796100074732098692046%7CMCOPTOUT-1594974899s%7CNONE%7CMCAID%7CNONE%7CvVersion%7C4.4.1; s_nr=1594967699992-Repeat; s_ppv=Search%2520-%2520Tax%2520Research%2520Platform%2520-%2520IBFD%2C100%2C91%2C3987%2C759%2C642%2C1280%2C720%2C1.35%2CP",
"Host":"research.ibfd.org",
"Referer":"https://research.ibfd.org/",
"Sec-Fetch-Dest":"empty",
"Sec-Fetch-Mode":"cors",
"Sec-Fetch-Site":"same-origin",
"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/83.0.4103.106 Safari/537.36"
}
I got those headers from Chrome developer tools. The cookie value in them changes every day, and I want to retrieve it with a script.
Can anyone help me to get that?
Thanks in advance.
My school has a website where students can sign up for classes during a study period, and I forget all the time. I'm trying to make a script that logs in and signs up for me. This is what I have:
import requests
payload = {
"username": "",
"pwd": "",
"User-Agent":"Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/42.0.2311.135 Safari/537.36 Edge/12.246"
}
with requests.Session() as s:
    p = s.post('http://www.bloomingtonsouth.org/plus/login.php', data=payload)
    print(p.text)
    r = s.get('http://www.bloomingtonsouth.org/plus/student.php')
    print(r.text)
The output: "Not Acceptable!Not Acceptable!An appropriate representation of the requested resource could not be found on this server. This error was generated by Mod_Security.
Not Acceptable!Not Acceptable!An appropriate representation of the requested resource could not be found on this server. This error was generated by Mod_Security."
What am I doing wrong?
Edit 1:
headers = {
"Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9",
"Accept-Encoding": "gzip, deflate",
"Accept-Language": "en-US,en;q=0.9",
"Cache-Control": "max-age=0",
"Connection": "keep-alive",
"Host": "bloomingtonsouth.org",
"Referer": "http://bloomingtonsouth.org/plus/login.php",
"Upgrade-Insecure-Requests": "1",
"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/79.0.3945.117 Safari/537.36"
}
url = "http://www.bloomingtonsouth.org/plus/login.php"
payload = {"username": "", "password": "", "submitted":"TRUE"}
post = requests.post(url, headers=headers, data=payload)
r = requests.get('http://www.bloomingtonsouth.org/plus/student.php')
print(r.text)
I tried this, using the headers I saw the browser send, and it still doesn't work.
The User-Agent needs to be sent as a header, not in the payload.
import requests
def login():
    # Send the User-Agent as a request header rather than as a form field.
    headers = {"User-Agent":"Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/42.0.2311.135 Safari/537.36 Edge/12.246"}
    url = "http://www.bloomingtonsouth.org/plus/login.php"
    payload = {"username": "", "pwd": ""}
    post = requests.post(url, headers=headers, data=payload)
    return post
login()
Try this code for logging in and let me know if it doesn't work.
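To make the difference concrete, here is a minimal sketch that builds the login request without sending it, so the header and the form body can be inspected separately. The credentials are placeholder values, the User-Agent is shortened, and the field names `username` and `pwd` are taken from the question; whether the server expects exactly these fields is an assumption.

```python
import requests

LOGIN_URL = "http://www.bloomingtonsouth.org/plus/login.php"


def build_login_request(username, password):
    """Prepare (but do not send) the login POST so it can be inspected."""
    # The User-Agent travels in the headers, not in the form data.
    headers = {"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64)"}  # shortened placeholder
    payload = {"username": username, "pwd": password}
    request = requests.Request("POST", LOGIN_URL, headers=headers, data=payload)
    return request.prepare()


prepared = build_login_request("alice", "secret")  # hypothetical credentials
print(prepared.headers["User-Agent"])
print(prepared.headers["Content-Type"])
print(prepared.body)
```

When actually sending, routing both the login POST and the follow-up GET through the same `requests.Session` keeps whatever cookies the server sets between the two calls.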
If you have the full login request, you could use this online tool, which converts a curl command into Python requests code:
https://curl.trillworks.com/
If you can't work out how to use the website, please post the full URL.
I used Fiddler to capture a GET request, and I want to resend the exact request with Python.
This is the request I captured:
GET https://example.com/api/content/v1/products/search?page=20&page_size=25&q=&type=image HTTP/1.1
Host: example.com
Connection: keep-alive
Search-Version: v3
Accept: application/json
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/72.0.3626.121 Safari/537.36
Referer: https://example.com/search/?q=&type=image&page=20
Accept-Encoding: gzip, deflate, br
Accept-Language: en-US,en;q=0.9
You can use the requests module.
The requests module automatically supplies most of the headers for you so you most likely do not need to manually include all of them.
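As a quick check of which headers are filled in by default, the sketch below prints the defaults that `requests` starts from. It assumes a reasonably recent `requests` release; the exact User-Agent value depends on the installed version. Site-specific headers from the capture, such as `Search-Version`, are not among them and still have to be passed explicitly.

```python
import requests

# Default headers that requests attaches to every request it builds.
defaults = requests.utils.default_headers()
for name, value in defaults.items():
    print(f"{name}: {value}")
```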
Since you are sending a GET request, you can use the params parameter to neatly form the query string.
Example:
import requests
BASE_URL = "https://example.com/api/content/v1/products/search"
headers = {
"Connection": "keep-alive",
"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/72.0.3626.121 Safari/537.36"
}
params = {
"page": 20,
"page_size": 25,
"type": "image"
}
response = requests.get(BASE_URL, headers=headers, params=params)
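To check that the rebuilt request matches the capture before sending anything, one option is to prepare the request and print the final URL. This is only a sketch: it adds the empty `q` parameter and the `Search-Version` and `Accept` headers from the captured request, on the assumption that the server may care about them.

```python
import requests

BASE_URL = "https://example.com/api/content/v1/products/search"

params = {"page": 20, "page_size": 25, "q": "", "type": "image"}
headers = {"Accept": "application/json", "Search-Version": "v3"}

# prepare() builds the final URL and headers without opening a connection.
prepared = requests.Request("GET", BASE_URL, headers=headers, params=params).prepare()
print(prepared.url)
print(dict(prepared.headers))
```

The printed URL can then be compared with the first line of the Fiddler capture.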
import requests
headers = {
'authority': 'stackoverflow.com',
'cache-control': 'max-age=0',
'upgrade-insecure-requests': '1',
'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/72.0.3626.121 Safari/537.36',
'accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8',
'referer': 'https://stackoverflow.com/questions/tagged/python?sort=newest&page=2&pagesize=15',
'accept-encoding': 'gzip, deflate, br',
'accept-language': 'en-US,en;q=0.9,tr-TR;q=0.8,tr;q=0.7',
'cookie': 'prov=6bb44cc9-dfe4-1b95-a65d-5250b3b4c9fb; _ga=GA1.2.1363624981.1550767314; __qca=P0-1074700243-1550767314392; notice-ctt=4%3B1550784035760; _gid=GA1.2.1415061800.1552935051; acct=t=4CnQ70qSwPMzOe6jigQlAR28TSW%2fMxzx&s=32zlYt1%2b3TBwWVaCHxH%2bl5aDhLjmq4Xr',
}
response = requests.get('https://stackoverflow.com/questions/55239787/how-to-send-a-get-request-with-headers-via-python', headers=headers)
This is an example of how to send a GET request to this page with headers.
You can open an SSL socket (https://docs.python.org/3/library/ssl.html) to example.com:443, write your captured request into the socket as raw bytes, and then read the HTTP response from the socket.
You can also try to use the http.client.HTTPResponse class to read and parse the HTTP response from your socket, but this class is not meant to be instantiated directly, so you may run into unexpected obstacles.
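A rough sketch of the raw-socket approach, using only the standard library, could look like this. The helper names `build_raw_get` and `send_raw` are made up for illustration. Two deliberate deviations from the capture: the request line uses the path rather than the absolute URL that Fiddler displays, and `Connection` is set to `close` so the read loop ends when the server closes the connection. `Accept-Encoding` is left out so the response is less likely to arrive compressed.

```python
import socket
import ssl


def build_raw_get(host, path, headers):
    """Serialize a GET request as the raw bytes that go on the wire."""
    lines = [f"GET {path} HTTP/1.1", f"Host: {host}"]
    lines += [f"{name}: {value}" for name, value in headers.items()]
    # A blank line terminates the header section.
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")


def send_raw(host, raw_request, port=443, timeout=10.0):
    """Send raw bytes over TLS and return everything the server sends back."""
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            tls.sendall(raw_request)
            chunks = []
            while True:
                data = tls.recv(4096)
                if not data:
                    break
                chunks.append(data)
    return b"".join(chunks)


raw = build_raw_get(
    "example.com",
    "/api/content/v1/products/search?page=20&page_size=25&q=&type=image",
    {
        "Connection": "close",  # changed from keep-alive so the read loop ends
        "Search-Version": "v3",
        "Accept": "application/json",
    },
)
print(raw.decode("ascii"))
# response = send_raw("example.com", raw)  # uncomment to send; needs network access
```

The returned bytes contain the status line, headers, and body together, so they still have to be split and decoded by hand.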
How do I change the referer when using the requests library to make a GET request to a web page? I went through the entire manual but couldn't find it.
According to http://docs.python-requests.org/en/latest/user/advanced/#session-objects , you should be able to do:
s = requests.Session()
s.headers.update({'referer': my_referer})
s.get(url)
Or just:
requests.get(url, headers={'referer': my_referer})
Your headers dict will be merged with the default/session headers. From the docs:
Any dictionaries that you pass to a request method will be merged with
the session-level values that are set. The method-level parameters
override session parameters.
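The merge behaviour described in that quote can be observed without sending anything, by letting the session prepare a request. In this sketch the two referer URLs are made-up values chosen only to show which one wins.

```python
import requests

session = requests.Session()
# Session-level default referer.
session.headers.update({"referer": "https://example.com/from-session"})

# Method-level referer for this one request.
request = requests.Request(
    "GET",
    "https://example.com/page",
    headers={"referer": "https://example.com/from-call"},
)

# prepare_request() merges session headers with the request's own headers.
prepared = session.prepare_request(request)

print(prepared.headers["Referer"])        # header lookup is case-insensitive
print("User-Agent" in prepared.headers)   # session defaults are still merged in
```

The per-request value replaces the session value for that request only; the session's own `headers` mapping is left unchanged.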
Here we rotate the User-Agent together with the Referer:
import random
user_agent_list = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/77.0.3865.90 Safari/537.36",
    "Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/79.0.3945.130 Safari/537.36",
    "Mozilla/5.0 (iPad; CPU OS 15_6 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) CriOS/104.0.5112.99 Mobile/15E148 Safari/604.1"
]
referer_list = [
    'https://stackoverflow.com/',
    'https://twitter.com/',
    'https://www.google.co.in/',
    'https://gem.gov.in/'
]
# random.choice picks one User-Agent and one Referer when this dict is built.
headers = {
    'Connection': 'keep-alive',
    'Cache-Control': 'max-age=0',
    'Upgrade-Insecure-Requests': '1',
    'User-Agent': random.choice(user_agent_list),
    'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8',
    'Accept-Encoding': 'gzip, deflate',
    'Accept-Language': 'en-US,en;q=0.9,fr;q=0.8',
    'referer': random.choice(referer_list)
}
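One thing to keep in mind is that the headers dict above is built once, so `random.choice` runs only at that point and every request that reuses the dict sends the same pair. A small sketch of rotating per request is to build the headers inside a function; the function name `random_headers` and the shortened lists are illustrative only.

```python
import random

USER_AGENTS = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/77.0.3865.90 Safari/537.36",
    "Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/79.0.3945.130 Safari/537.36",
]
REFERERS = [
    "https://stackoverflow.com/",
    "https://twitter.com/",
]


def random_headers(rng=random):
    """Return a fresh headers dict with a newly chosen User-Agent and Referer."""
    return {
        "User-Agent": rng.choice(USER_AGENTS),
        "Referer": rng.choice(REFERERS),
    }


# Each call makes a new random pick.
for _ in range(3):
    print(random_headers())
```

Each request would then be made with something like `requests.get(url, headers=random_headers())`, where `url` is whatever page is being fetched.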