Python requests post doesn't get redirected - python

When I use Chrome to post a form on this website: "http://xh.5156edu.com/index.php", I get redirected to a new page. However, when I use the Python requests module to do the post, like this:
r = requests.post("http://xh.5156edu.com/index.php", data="f_key=%B7%AB&SearchString.x=0&SearchString.y=0")
the status code is 200 and the content is not what I want. I'm sure the data is the same as what Chrome sends. I cannot understand what's wrong with the script. I also tried adding some headers, which didn't work either.

What you're passing as data are actually query parameters.
This is what you need:
import requests

params = {'f_key': '%B7%AB', 'SearchString.x': '0', 'SearchString.y': '0'}
r = requests.post("http://xh.5156edu.com/index.php", params=params)
r.raise_for_status()
with open('x.html', 'w') as html:
    html.write(r.text)
You can then open x.html to view the response.

Related

Python GET request to redirecting URL does not actually redirect me

I have a URL that redirects me to another page, for example:
https://www.redirector.com/1
that redirects me to https://www.redirected.com/1
I am trying to fetch the second URL using Python requests. I tried doing so using the following code:
import requests

rq = requests.get('https://www.redirector.com/1')
for re in rq.history:
    print(re.url)
But that doesn't output anything...
Then I tried printing rq.history, and it turns out it was actually an empty list. Is there a way to get the https://www.redirected.com/1 URL besides using the history attribute?
You could view the headers of the response and see whether there is a Location header (https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Location) and whether the response code is 3xx. This would be the "low-level" approach.
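As a sketch of that low-level approach: turn off requests' automatic redirect following with allow_redirects=False, then read the Location header off the 3xx response. The redirector URL below is the question's placeholder, not a live endpoint.

```python
def redirect_target(resp):
    """Return the Location header if the response is a 3xx redirect, else None."""
    if 300 <= resp.status_code < 400:
        return resp.headers.get('Location')
    return None

# Usage sketch -- the URL is the question's placeholder:
# import requests
# r = requests.get('https://www.redirector.com/1', allow_redirects=False)
# print(redirect_target(r))
```

One more thing to check: when requests does follow redirects (the default), rq.url on the final response already holds the URL you ended up at. If rq.history is empty even though a browser gets redirected, the "redirect" may be done with JavaScript or a meta refresh rather than an HTTP 3xx, in which case there is no Location header to find either.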

how to perform a post request just like the browser to get the same results

I have this webpage: https://www.dsbmobile.de. I would like to automate a bot that checks in and gets the newest file. I have an account and login credentials, so I tried this with Python:
import requests
url = "https://www.dsbmobile.de/Login.aspx?ReturnUrl=%2f"
payload = {'txtUser': 'username', 'txtPass': 'password'}
x = requests.post(url, data=payload)
print(x.text)
I get a result, but it's just the login page instead of the new page I should be redirected to.
When looking at the source, I saw that there are hidden input fields such as "__EVENTVALIDATION".
Do I need to send them too? Or maybe I need to set something in the headers. It would be very nice if someone could tell me how to write the POST request just like the browser sends it, so that I get the right response.
I am new, but it would be extremely useful to me if I could automate that process.
Thank you very much for trying to help me
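Since Login.aspx is an ASP.NET form, one common approach is: GET the login page first, copy every hidden input (such as __VIEWSTATE and __EVENTVALIDATION) into the payload alongside the credentials, and POST inside a requests.Session so cookies carry over. This is a sketch, not a verified recipe for this site; the txtUser/txtPass field names are taken from the question.

```python
import re

def hidden_fields(html):
    """Collect name/value pairs of hidden <input> tags (e.g. __VIEWSTATE,
    __EVENTVALIDATION) from an ASP.NET page, using a rough regex scan."""
    fields = {}
    for tag in re.findall(r'<input[^>]+type=["\']hidden["\'][^>]*>', html):
        name = re.search(r'name=["\']([^"\']+)["\']', tag)
        value = re.search(r'value=["\']([^"\']*)["\']', tag)
        if name:
            fields[name.group(1)] = value.group(1) if value else ''
    return fields

# Sketch of the login flow (txtUser/txtPass are the question's guesses,
# not verified against the live site):
# import requests
# url = 'https://www.dsbmobile.de/Login.aspx?ReturnUrl=%2f'
# with requests.Session() as s:
#     page = s.get(url)
#     payload = hidden_fields(page.text)
#     payload.update({'txtUser': 'username', 'txtPass': 'password'})
#     result = s.post(url, data=payload)
```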

Use Python to make HTTP Post Request for form

I am trying to make an HTTP Post request using Python. The specific form I want to submit is on the following page: http://143.137.111.105/Enlace/Resultados2010/Basica2010/R10Folio.aspx
Using Chrome Dev Tools it seems like pushing the button makes an HTTP Post request but I am trying to figure out the exact request that is made. I currently have the following in Python:
import requests

url = 'http://143.137.111.105/Enlace/Resultados2010/Basica2010/R10Folio.aspx'
values = {
    'txtFolioAlumno': '210227489P10',
}
r = requests.post(url, data=values)
print(r.content)
However, when I run this it simply prints out the HTML of the old page instead of returning the data from the new page (I am interested in getting the number next to 'Matematicas', 422 in this case). I have achieved this task using Selenium which actually opens a test browser, but I want to query the server directly.
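Assuming the POST does return the result page, the number next to 'Matematicas' can be scraped with a small helper. This is a sketch: the regex assumes the score is the first number that follows the subject label in the markup, which may not match the real page layout.

```python
import re

def extract_score(html, subject='Matematicas'):
    """Return the first integer that appears after the subject label,
    or None if the label is not found."""
    m = re.search(re.escape(subject) + r'\D*(\d+)', html)
    return int(m.group(1)) if m else None

# e.g. extract_score('<td>Matematicas</td><td>422</td>') -> 422
```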

Sending POST request on an already open url in python

Basically I want to send a POST request for the following form.
<form method="post" action="">
449 * 803 - 433 * 406 = <input size=6 type="text" name="answer" />
<input type="submit" name="submitbtn" value="Submit" />
</form>
What I basically want to do is read through the page, find the equation in the form, calculate the answer, and send the answer as a parameter with the POST request, without opening a new URL for the page, since a new equation comes up every time the page is opened and the previously obtained result becomes obsolete. Finally, I want to obtain the page that comes up as a result of the POST request. I'm stuck at the part where I have to send the POST request without opening a new URL instance. I would also appreciate help on how to read through the page again after the POST request (would calling read() suffice?).
The python code I have currently looks something like this.
import urllib, urllib2

link = "http://www.websitetoaccess.com"
f = urllib2.urlopen(link)
line = f.readline().strip()
equation = ''
result = ''
file1 = open('firstPage.html', 'w')
file2 = open('FinalPage.html', 'w')
for line in f:
    if 'name="answer"' in line:
        result = getResult(line)  # getResult: helper (not shown) that solves the equation
    file1.write(line)
file1.close()
raw_params = {'answer': str(result), 'submit': 'Submit'}
params = urllib.urlencode(raw_params)
request = urllib2.Request(link, params)
page = urllib2.urlopen(request)
file2.write(page.read())
file2.close()
Yeah, that last link really helped. Turns out I just needed to create a session with requests, like so:
s = requests.session()
res1 = s.get(url)
and then send the POST request through the same session:
res2 = s.post(url, data=post_params)
I believe this stores the cookies from the GET request and sends them with the POST request, thus keeping the same question as in the previous GET request. Many thanks for your help and assistance with this problem, Loknar.
I'm a bit puzzled: the POST request will always be a new, separate request, so I don't understand what you mean by "without opening a new URL instance"... Have you tried taking a look at what happens when you do this manually? Open the developer console in Chrome, go to the Network tab, toggle "Preserve log" on, clear the history, and do what you're trying to do by hand; then replicate that in Python. I also recommend trying the requests module; it makes things simpler than urllib. Simply pip install requests (and pip install lxml).
import requests
from lxml import etree
url = 'http://www.websitetoaccess.com'
res1 = requests.get(url)
# do something with res1.content
# you could try parsing the html page with lxml
root = etree.fromstring(res1.content, etree.HTMLParser())
# do something with root, find question and calc answer?
post_params = {'answer': str(42), 'submit': 'Submit'}
res2 = requests.post(url, data=post_params)
# check res2 for success or content?
edit:
You're possibly experiencing a header or cookie issue. You might be receiving a session ID which enables the server to determine what question you received in the previous GET request. The POST request is a separate request from the previous GET request; they can't be combined into one single request. You should check the headers received from the previous GET request and/or set up session/cookie handling (easy to do with requests, see https://requests.readthedocs.io/en/master/user/advanced/).
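Putting the session advice together with the form shown in the question, the whole flow can be sketched like this. It assumes the page really presents an expression of the form 'a * b - c * d =' next to the answer field, and the URL is the thread's placeholder:

```python
import re

def solve_expression(html):
    """Find an 'a * b - c * d =' expression in the page and evaluate it."""
    m = re.search(r'(\d+)\s*\*\s*(\d+)\s*-\s*(\d+)\s*\*\s*(\d+)\s*=', html)
    if not m:
        raise ValueError('no expression found on the page')
    a, b, c, d = map(int, m.groups())
    return a * b - c * d

# Sketch of the GET-then-POST flow in one session (placeholder URL):
# import requests
# with requests.Session() as s:
#     page = s.get('http://www.websitetoaccess.com')
#     answer = solve_expression(page.text)
#     res = s.post('http://www.websitetoaccess.com',
#                  data={'answer': str(answer), 'submitbtn': 'Submit'})
```

Doing the GET and the POST through the same Session object is what keeps the server-side question and the submitted answer in the same cookie-identified conversation.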

Python urllib2 response 404 error but url can be opened

I came across a situation when I used Python Requests or urllib2 to open URLs: I got 404 'page not found' responses. For example, url = 'https://www.facebook.com/mojombo'. However, I can copy and paste those URLs into a browser and visit them. Why does this happen?
I need to get some content from those pages' HTML source code. Since I can't open those URLs using Requests or urllib2, I can't use BeautifulSoup to extract elements from the HTML source. Is there a way to get those pages' source code and extract content from it using Python?
Although this is a general question, I still need some working code to solve it. Thanks!
It looks like your browser is using cookies to log you in. Try opening that url in a private or incognito tab, and you'll probably not be able to access it.
However, if you are using Requests, you can pass the appropriate login information as a dictionary of values. You'll need to check the form information to see what the fields are, but Requests can handle that as well.
The normal format would be:
payload = {
    'username': 'your username',
    'password': 'your password'
}
p = requests.post(myurl, data=payload)
with more or less fields added as needed.
