I want to send a POST request to this URL: "https://www.createsend.com/t/securedsubscribe?token=" + token. The token changes every time, but I have figured out a way to retrieve it. Even so, I can't access the site.
When I check in the Chrome console, these request headers are used:
authority:www.createsend.com
method:POST
path:/t/securedsubscribe?token=7B9BCC9AE0CD58E2170E07A7D79E679426EBC1D02FA06CA791557CA7ACC1155F5A0DDA83D987E06C
scheme:https
accept:text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8
accept-encoding:gzip, deflate, br
accept-language:fr
cache-control:max-age=0
content-length:174
content-type:application/x-www-form-urlencoded
cookie:__utma=38149500.1167947680.1522060606.1522060606.1522060606.1; __utmc=38149500; __utmz=38149500.1522060606.1.1.utmcsr=google|utmccn=(organic)|utmcmd=organic|utmctr=(not%20provided); __utmt=1; _ga=GA1.2.1167947680.1522060606; _gid=GA1.2.485326558.1522060607; ajs_group_id=null; __qca=P0-314313712-1522060607057; mp_mixpanel__c=0; ajs_user_id=%22C5EC08CADFFC107B-B6DC4E4B6840339E%22; ajs_anonymous_id=%222e3961d1-be64-4ecf-8481-c8a47295b130%22; __utmv=38149500.|1=user-type=user=1; __utmb=38149500.2.10.1522060606; mp_1c1eda798f92601aecaa904fe7b3520a_mixpanel=%7B%22distinct_id%22%3A%20%22C5EC08CADFFC107B-B6DC4E4B6840339E%22%2C%22mp_lib%22%3A%20%22Segment%3A%20web%22%2C%22%24search_engine%22%3A%20%22google%22%2C%22%24initial_referrer%22%3A%20%22https%3A%2F%2Fwww.google.be%2F%22%2C%22%24initial_referring_domain%22%3A%20%22www.google.be%22%2C%22mp_name_tag%22%3A%20%22C5EC08CADFFC107B-B6DC4E4B6840339E%22%2C%22id%22%3A%20%22C5EC08CADFFC107B-B6DC4E4B6840339E%22%7D; _uetsid=_uetcde93ba3; optimizelyEndUserId=oeu1522060693722r0.545521542891791; optimizelyBuckets=%7B%7D; optimizelySegments=%7B%22341521689%22%3A%22direct%22%2C%22341576276%22%3A%22gc%22%2C%22341588087%22%3A%22false%22%2C%222833930025%22%3A%22none%22%2C%225027931715%22%3A%22true%22%2C%225195510267%22%3A%22true%22%7D; intercom-lou-je5td1qt=1; intercom-session-je5td1qt=WmxjSDdDVlRYYTZUc1Z3bmlENTNhOXFJUzhvK2piZElBakFjbDI5dkpwS0hvWEFLVmMweWNHNER0Ujh3QTFKUy0tdGt0aitQQTNuQWoxZGhXMUllakhMQT09--ec3c7db367f57a30eb5b5818de90e43dc8cc39a6; __ssid=089a6e70-58c2-41bb-98d2-90ce7ada1d73
origin:http://tres-bien.com
referer:http://tres-bien.com/odehhasoidj
upgrade-insecure-requests:1
user-agent:Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Ubuntu Chromium/64.0.3282.167 Chrome/64.0.3282.167 Safari/537.36
This web page uses the HTTP/2 protocol, so I use the hyper module to get through:
import requests
from hyper.contrib import HTTP20Adapter
session = requests.Session()
session.mount("https://www.createsend.com", HTTP20Adapter())
r = session.post(url, data=payload2, headers=header2)
print(r.text)
But I still get this error.
<HTML><HEAD><TITLE>Bad Request</TITLE>
<META HTTP-EQUIV="Content-Type" Content="text/html; charset=us-ascii"></HEAD>
<BODY><h2>Bad Request - Invalid Header</h2>
<hr><p>HTTP Error 400. The request has an invalid header name.</p>
</BODY></HTML>
The form I'm submitting is at http://tres-bien.com/odehhasoidj, where you can check which POST requests are made.
This is my code
header2 = {
#'Host': 'www.createsend.com',
'authority':'www.createsend.com' ,
'method':'POST',
'path':'/t/securedsubscribe?token=' + token,
'scheme':'https',
'accept':'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8',
'accept-encoding':'gzip, deflate, br',
'accept-language':'fr',
'cache-control':'max-age=0',
#'content-length':'182',
'content-type':'application/x-www-form-urlencoded',
'origin':'http://tres-bien.com',
'referer':'http://tres-bien.com/odehhasoidj',
'upgrade-insecure-requests':'1',
#'cookie':'__utma=38149500.1167947680.1522060606.1522060606.1522060606.1; __utmc=38149500; __utmz=38149500.1522060606.1.1.utmcsr=google|utmccn=(organic)|utmcmd=organic|utmctr=(not%20provided); __utmt=1; _ga=GA1.2.1167947680.1522060606; _gid=GA1.2.485326558.1522060607; ajs_group_id=null; __qca=P0-314313712-1522060607057; mp_mixpanel__c=0; ajs_user_id=%22C5EC08CADFFC107B-B6DC4E4B6840339E%22; ajs_anonymous_id=%222e3961d1-be64-4ecf-8481-c8a47295b130%22; __utmv=38149500.|1=user-type=user=1; __utmb=38149500.2.10.1522060606; mp_1c1eda798f92601aecaa904fe7b3520a_mixpanel=%7B%22distinct_id%22%3A%20%22C5EC08CADFFC107B-B6DC4E4B6840339E%22%2C%22mp_lib%22%3A%20%22Segment%3A%20web%22%2C%22%24search_engine%22%3A%20%22google%22%2C%22%24initial_referrer%22%3A%20%22https%3A%2F%2Fwww.google.be%2F%22%2C%22%24initial_referring_domain%22%3A%20%22www.google.be%22%2C%22mp_name_tag%22%3A%20%22C5EC08CADFFC107B-B6DC4E4B6840339E%22%2C%22id%22%3A%20%22C5EC08CADFFC107B-B6DC4E4B6840339E%22%7D; _uetsid=_uetcde93ba3; optimizelyEndUserId=oeu1522060693722r0.545521542891791; optimizelyBuckets=%7B%7D; optimizelySegments=%7B%22341521689%22%3A%22direct%22%2C%22341576276%22%3A%22gc%22%2C%22341588087%22%3A%22false%22%2C%222833930025%22%3A%22none%22%2C%225027931715%22%3A%22true%22%2C%225195510267%22%3A%22true%22%7D; intercom-lou-je5td1qt=1; intercom-session-je5td1qt=WmxjSDdDVlRYYTZUc1Z3bmlENTNhOXFJUzhvK2piZElBakFjbDI5dkpwS0hvWEFLVmMweWNHNER0Ujh3QTFKUy0tdGt0aitQQTNuQWoxZGhXMUllakhMQT09--ec3c7db367f57a30eb5b5818de90e43dc8cc39a6; __ssid=089a6e70-58c2-41bb-98d2-90ce7ada1d73',
#'Authorization': token ,
'user-agent':'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Ubuntu Chromium/64.0.3282.167 Chrome/64.0.3282.167 Safari/537.36'
}
payload2 = {
'cm-name' : 'Namz Foll' ,
'cm-zlutdu-zlutdu' : email ,
'cm-f-qkytli': 'Adress Bilning' ,
'cm-f-qkytld': '1030' ,
'cm-f-qkytlh': 'Ciky' ,
'cm-fo-qkytlk ' : '3324398' ,
'cm-f-qkytlu' : '0412345408' ,
'cm-fo-qkytry' : '3324639'
}
url = "https://www.createsend.com/t/securedsubscribe?token=" + token
There is not enough information to tell, but authority, scheme, etc. are special headers that must be prefixed by a colon, as in :authority, :scheme, etc.
Please see "Pseudo Header Fields" here: https://www.rfc-editor.org/rfc/rfc7540#section-8.1.2.
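As a rough sketch of what that means in practice: HTTP clients normally derive the pseudo-header fields from the URL and the request method, so one option is to drop them from a header dictionary copied out of DevTools before passing it to the client. The helper name `strip_pseudo_headers` and the sample dictionary are made up for illustration, and whether this alone clears the 400 error here is an assumption.

```python
# Pseudo-header names defined for HTTP/2 requests (RFC 7540, section 8.1.2.3).
PSEUDO_HEADERS = {"authority", "method", "path", "scheme"}


def strip_pseudo_headers(copied):
    """Return a copy of `copied` without HTTP/2 pseudo-header fields.

    Handles both the spelling without a colon ("authority"), as pasted
    in the question, and the wire spelling with one (":authority").
    This helper is hypothetical, not part of requests or hyper.
    """
    cleaned = {}
    for name, value in copied.items():
        bare = name.strip().lower().lstrip(":")
        if bare in PSEUDO_HEADERS:
            continue  # the client derives these from the URL and method
        cleaned[name] = value
    return cleaned


# Sample input modelled on the header dictionary in the question.
copied_from_devtools = {
    "authority": "www.createsend.com",
    "method": "POST",
    "path": "/t/securedsubscribe?token=...",
    "scheme": "https",
    "content-type": "application/x-www-form-urlencoded",
    "origin": "http://tres-bien.com",
}

headers = strip_pseudo_headers(copied_from_devtools)
print(sorted(headers))
```

The cleaned dictionary keeps only ordinary header fields, which is the kind of mapping a `headers=` argument expects.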
These headers belong to HTTP/2, and you can't set them with the requests library, but you can use another library, hyper:
from hyper import HTTPConnection
conn = HTTPConnection('http2bin.org:443')
conn.request('GET', '/get')
resp = conn.get_response()
print(resp.read())
But learning a new library can be time-consuming, and as we know requests is much simpler. To find out how to use these headers with it, this link explains it very well.
Related
I am trying to log in to GameStop (gamestop.ca), but it won't let me, as shown below. I've tried adding various headers, with no luck.
I'm wondering whether I am adding the data incorrectly when sending my POST request. Can someone confirm?
The content type of the website in the response section under network is:
content-type: text/html; charset=utf-8
import requests
import json
headers = {'User-Agent':'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/95.0.4638.54 Safari/537.36'}
payload = {
'UserName': '*****',
'Password': '******',
'RememberMe': 'false'
}
with requests.session() as s:
    r = s.post('https://www.gamestop.ca/Account/LogOn', headers=headers, data=payload)
    print(r.status_code)
    print(r.content)
    print(r.text)
403
b'<HTML><HEAD>\n<TITLE>Access Denied</TITLE>\n</HEAD><BODY>\n<H1>Access Denied</H1>\n \nYou don\'t have permission to access "http://www.gamestop.ca/Account/LogOn" on this server.<P>\nReference #18.70fd017.1635453520.211da9c5\n</BODY>\n</HTML>\n'
<HTML><HEAD>
<TITLE>Access Denied</TITLE>
</HEAD><BODY>
<H1>Access Denied</H1>
You don't have permission to access "http://www.gamestop.ca/Account/LogOn" on this server.<P>
Reference #18.70fd017.1635453520.211da9c5
</BODY>
</HTML>
P.S. I can log in with Selenium, so I don't think the error is caused by bot protection.
I would like to get the information on this page:
http://www.jnfdc.gov.cn/onsaling/viewhouse.shtml?fmid=757e06e0-c5b3-4384-9a14-2cb1eac011d1
From the browser debugger tools I get the information in this file:
http://www.jnfdc.gov.cn/r/house/757e06e0-c5b3-4384-9a14-2cb1eac011d1_154810896.xml
But when I use the browser to access the url directly, I can't get the file.
I don't know why.
I use python.
import urllib2
#url1 = 'http://www.jnfdc.gov.cn/onsaling/viewhouse.shtml?fmid=757e06e0-c5b3-4384-9a14-2cb1eac011d1'
url = 'http://www.jnfdc.gov.cn/r/house/757e06e0-c5b3-4384-9a14-2cb1eac011d1_113649432.xml'
headers = {
"Accept" :"*/*",
"Accept-Encoding" :"gzip, deflate, sdch",
"Accept-Language" :"zh-CN,zh;q=0.8",
"Cache-Control" :"max-age=0",
"Connection" :"keep-alive",
"Cookie" :"JSESSIONID=A205D8D7B0807FD34F879D6CB6EEB0CE",
"DNT" :"1",
"Host" :"www.jnfdc.gov.cn",
"Referer" :"http://www.jnfdc.gov.cn/onsaling/viewhouse.shtml?fmid=757e06e0-c5b3-4384-9a14-2cb1eac011d1",
"User-Agent" :"Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/53.0.2785.104 Safari/537.36 Core/1.53.3051.400 QQBrowser/9.6.11301.400"
}
req = urllib2.Request(url, headers=headers)
resp = urllib2.urlopen(req)  # this line throws the exception: HTTPError: Not Found
What should I do? Thanks.
To get the data the way a browser does, you can try Selenium (see the Selenium documentation).
Below is some code that I have been trying to use to log in to the Cook's Illustrated website (https://www.cooksillustrated.com/sign_in).
I start a session, get the authentication token and a hidden encoding field, and then pass the "name" and "value" of the email and password fields (found by inspecting the elements in chrome). The form doesn't seem to contain any other elements; however, the post method doesn't log me in.
I noticed that all of the CSRF tokens ended in "==", so I tried removing them. But it didn't work.
I also tried modifying the post to use the "id" field of the form inputs instead of the "name" (just a shot in the dark, really...name seems like it should work from what I've seen in other examples).
Any thoughts would be much appreciated.
import requests, lxml.html
s = requests.session()
# go to the login page and get its text
login = s.get('https://www.cooksillustrated.com/sign_in')
login_html = lxml.html.fromstring(login.text)
# find the hidden fields names and values; store in a dictionary
hidden_inputs = login_html.xpath(r'//form//input[@type="hidden"]')
form = {x.attrib['name']: x.attrib['value'] for x in hidden_inputs}
print(form)
# I noticed that they all ended in two = signs, so I tried taking that off
# form['authenticity_token'] = form['authenticity_token'][:-2]
# this adds to the form payload the two named fields for user name and password
# found using the "inspect elements" on the login screen
form['user[email]'] = 'my_email'
form['user[password]'] = 'my_pw'
# this uses "id" instead of "name" from the input fields
#form['user_email'] = 'my_email'
#form['user_password'] = 'my_pw'
response = s.post('https://www.cooksillustrated.com/sign_in', data=form)
print(form)
# trying to see if it worked - but the response URL is login again instead of main page
# and it can't find my name
# responses are okay, but I think that just means it posted the form
print(response.url)
print('Christopher' in response.text)
print(response.status_code)
print(response.ok)
Well, the POST request URL should be https://www.cooksillustrated.com/sessions, and if you capture all traffic while logging in, you'll find out the actual POST request made to the server:
POST /sessions HTTP/1.1
Host: www.cooksillustrated.com
Connection: keep-alive
Content-Length: 179
Cache-Control: max-age=0
Origin: https://www.cooksillustrated.com
Upgrade-Insecure-Requests: 1
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/56.0.2924.76 Safari/537.36
Content-Type: application/x-www-form-urlencoded
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
Referer: https://www.cooksillustrated.com/sign_in
Accept-Encoding: gzip, deflate, br
Accept-Language: en-US,en;q=0.8
utf8=%E2%9C%93&authenticity_token=Uvku64N8V2dq8z%2BGerrqWNobn03Ydjvz8xqgOAvfBmvDM%2B71xJWl2DmRU4zbBE15gGVESmDKP2E16KIqBeAJ0g%3D%3D&user%5Bemail%5D=demo&user%5Bpassword%5D=demodemo
Notice that the last line is the encoded data for this request. There are 4 parameters: utf8, authenticity_token, user[email] and user[password].
So in your case, form should include all of them:
form = {'user[email]': 'my_email',
        'user[password]': 'my_pw',
        'utf8': '✓',
        'authenticity_token': 'xxxxxx'  # make sure you don't ignore '=='
        }
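To check a form dictionary against a captured request body like the one above, the standard library's `urlencode` can produce the same `application/x-www-form-urlencoded` string. This is only a sketch: the token value below is made up, and the email and password are the demo values from the capture.

```python
from urllib.parse import urlencode

# Same four fields as in the captured POST body.
form = {
    "utf8": "✓",
    "authenticity_token": "abc+def==",  # made-up token, not a real one
    "user[email]": "demo",
    "user[password]": "demodemo",
}

# Percent-encode the fields the way a browser form submission does.
body = urlencode(form)
print(body)
```

Comparing this string with the last line of the captured request makes it easy to spot a missing or misspelled field name, and shows that the trailing `==` of the token is sent as `%3D%3D` rather than dropped.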
Also, you might want to add some headers so the request appears to come from Chrome (or whatever browser you like), since the default User-Agent sent by requests is python-requests/2.13.0 (the number is the installed version) and some websites don't like traffic from "bots":
headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/56.0.2924.76 Safari/537.36',
'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8',
'Accept-Encoding': 'gzip, deflate, br',
... # more
}
Now we're ready to make the POST request:
response = s.post('https://www.cooksillustrated.com/sessions', data=form, headers=headers)
I'm trying to write a Python script to let me log in to my fantasy football account at https://fantasy.premierleague.com/, but something is not quite right with my log in. When I login through my browser and check the details using Chrome developer tools, I find that the Request URL is https://users.premierleague.com/accounts/login/ and the form data sent is:
csrfmiddlewaretoken:[My token]
login:[My username]
password:[My password]
app:plfpl-web
redirect_uri:https://fantasy.premierleague.com/a/login
There are also a number of Request headers:
Accept:text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
Accept-Encoding:gzip, deflate, br
Accept-Language:en-US,en;q=0.8
Cache-Control:max-age=0
Connection:keep-alive
Content-Length:185
Content-Type:application/x-www-form-urlencoded
Cookie:[My cookies]
Host:users.premierleague.com
Origin:https://fantasy.premierleague.com
Referer:https://fantasy.premierleague.com/
Upgrade-Insecure-Requests:1
User-Agent:Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/52.0.2743.116 Safari/537.36
So I've written a short Python script using the requests library to try to log in and navigate to a page as follows:
import requests
with requests.Session() as session:
    # load the home page first so the session receives the CSRF cookie
    url_home = 'https://fantasy.premierleague.com/'
    html_home = session.get(url_home)
    csrftoken = session.cookies['csrftoken']
    values = {
        'csrfmiddlewaretoken': csrftoken,
        'login': '<My username>',
        'password': '<My password>',
        'app': 'plfpl-web',
        'redirect_uri': 'https://fantasy.premierleague.com/a/login'
    }
    head = {
        'Host': 'users.premierleague.com',
        'Referer': 'https://fantasy.premierleague.com/',
    }
    session.post('https://users.premierleague.com/accounts/login/',
                 data=values, headers=head)
    url_transfers = 'https://fantasy.premierleague.com/a/squad/transfers'
    html_transfers = session.get(url_transfers)
    print(html_transfers.content)
On printing out the content of the response to my POST request, I get an HTTP 500 error with:
b'\n<html>\n<head>\n<title>Fastly error: unknown domain users.premierleague.com</title>\n</head>\n<body>\nFastly error: unknown domain: users.premierleague.com. Please check that this domain has been added to a service.</body></html>'
If I remove 'Host' from my head dict, I get an HTTP 405 error with:
b''
I've tried including various combinations of the Request headers in my head dict and nothing seems to work.
The following worked for me. I simply removed headers = head
session.post('https://users.premierleague.com/accounts/login/',
             data=values)
I think you are trying to pick your team programmatically, like me. Your code got me started thanks.
I'm trying to login to an aspx page then get the contents of another page as a logged in user.
import requests
from bs4 import BeautifulSoup
URL="https://example.com/Login.aspx"
durl="https://example.com/Daily.aspx"
user_agent = 'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/43.0.2357.124 Safari/537.36'
language = 'en-US,en;q=0.8'
encoding = 'gzip, deflate'
accept = 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8'
connection = 'keep-alive'
headers = {
"Accept": accept,
"Accept-Encoding": encoding,
"Accept-Language": language,
"Connection": connection,
"User-Agent": user_agent
}
username="user"
password="pass"
s=requests.Session()
s.headers.update(headers)
r=s.get(URL)
print(r.cookies)
soup=BeautifulSoup(r.content)
LASTFOCUS=soup.find(id="__LASTFOCUS")['value']
EVENTTARGET=soup.find(id="__EVENTTARGET")['value']
EVENTARGUMENT=soup.find(id="__EVENTARGUMENT")['value']
VIEWSTATEFIELDCOUNT=soup.find(id="__VIEWSTATEFIELDCOUNT")['value']
VIEWSTATE=soup.find(id="__VIEWSTATE")['value']
VIEWSTATE1=soup.find(id="__VIEWSTATE1")['value']
VIEWSTATE2=soup.find(id="__VIEWSTATE2")['value']
VIEWSTATE3=soup.find(id="__VIEWSTATE3")['value']
VIEWSTATE4=soup.find(id="__VIEWSTATE4")['value']
VIEWSTATEGENERATOR=soup.find(id="__VIEWSTATEGENERATOR")['value']
login_data={
"__LASTFOCUS":"",
"__EVENTTARGET":"",
"__EVENTARGUMENT":"",
"__VIEWSTATEFIELDCOUNT":"5",
"__VIEWSTATE":VIEWSTATE,
"__VIEWSTATE1":VIEWSTATE1,
"__VIEWSTATE2":VIEWSTATE2,
"__VIEWSTATE3":VIEWSTATE3,
"__VIEWSTATE4":VIEWSTATE4,
"__VIEWSTATEGENERATOR":VIEWSTATEGENERATOR,
"__SCROLLPOSITIONX":"0",
"__SCROLLPOSITIONY":"100",
"ctl00$NameTextBox":"",
"ctl00$ContentPlaceHolderNavPane$LeftSection$UserLogin$UserName":username,
"ctl00$ContentPlaceHolderNavPane$LeftSection$UserLogin$Password":password,
"ctl00$ContentPlaceHolderNavPane$LeftSection$UserLogin$LoginButton":"Login",
"ctl00$ContentPlaceHolder1$RetrievePasswordUserNameTextBox":"",
"hiddenInputToUpdateATBuffer_CommonToolkitScripts":"1"
}
r1=s.post(URL, data=login_data)
print (r1.cookies)
d=s.get(durl)
print (d.cookies)
dsoup=BeautifulSoup(r1.content)
print (dsoup)
But the cookies are not preserved in the session, and I can't get to the next page as a logged-in user.
Can someone give me some pointers on this?
Thanks.
When you post to the login page:
r1=s.post(URL, data=login_data)
It's likely issuing a redirect to another page. The response to the POST request sets the cookies and then redirects to another page. What ends up in r1 is the final response after the redirect, and that one does not contain the cookies.
Try the same command but not allowing redirects:
r1 = s.post(URL, data=login_data, allow_redirects=False)
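If redirects are left enabled, another way to see where the cookies were set is to walk the redirect chain: in requests, the intermediate responses are kept in `r1.history`, and each response has its own `cookies`. The sketch below uses stand-in objects instead of real network calls; the helper name `cookies_per_hop`, the cookie name and the `Default.aspx` URL are all made up for illustration.

```python
from types import SimpleNamespace


def cookies_per_hop(response):
    """List (status_code, url, cookies) for each response in a redirect chain.

    Expects an object shaped like requests.Response, with `history`,
    `status_code`, `url` and a mapping-like `cookies` attribute.
    """
    chain = list(response.history) + [response]
    return [(r.status_code, r.url, dict(r.cookies.items())) for r in chain]


# Stand-ins for a login POST that answers 302 and sets a cookie,
# followed by the page the redirect lands on (hypothetical values).
login_hop = SimpleNamespace(
    status_code=302,
    url="https://example.com/Login.aspx",
    cookies={"session-id": "abc123"},
    history=[],
)
final = SimpleNamespace(
    status_code=200,
    url="https://example.com/Default.aspx",
    cookies={},
    history=[login_hop],
)

for hop in cookies_per_hop(final):
    print(hop)
```

With a real session, `cookies_per_hop(r1)` would show which hop carried the login cookie, and `s.cookies` would show what the session has kept for later requests such as `s.get(durl)`.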