How to get the response URL with the Python requests module?

I'm trying to log in to a website login.php using the Python requests module.
If the attempt is successful, the page redirects to index.php.
If not, it stays on login.php.
I was able to do this with the mechanize module.
import mechanize

b = mechanize.Browser()
url = 'http://localhost/test/login.php'
response = b.open(url)
b.select_form(nr=0)
b.form['username'] = 'admin'
b.form['password'] = 'wrongpwd'
b.method = 'post'
response = b.submit()
print(response.geturl())
if response.geturl() == url:
    print('Failed')
else:
    print('OK')
If the login/password is correct:
user@linux:~$ python script.py
http://localhost/test/index.php
OK
user@linux:~$
If the login/password is wrong:
user@linux:~$ python script.py
http://localhost/test/login.php
Failed
user@linux:~$
My question is: how do I do the same with the requests module?
I have tried different approaches, but none of them worked.

I've taken the code from your question and modified it:
import requests
url = 'http://localhost/test/login.php'
values = {'username': 'admin', 'password': 'wrongpwd'}
r = requests.post(url, data=values)
print(r.url) # prints the final url of the response
You can know it's a sure thing because it's documented in the source code: all I did was open the definition of the Response class.
Now, back to your original question.
Python requests module to verify if HTTP login is successful or not
It depends on whether the website is properly implemented.
When you submit a form, the website replies with an HTTP response, which contains a status code. A properly implemented website returns different status codes depending on what you've sent. Here's a list of them. If everything is hunky-dory, the status code of the response will be 200:
import requests
url = 'http://localhost/test/login.php'
values = {'username': 'admin', 'password': 'wrongpwd'}
r = requests.post(url, data=values)
print(r.status_code == 200) # prints True
If the user entered the wrong credentials, the status code of the response will be 401 (see the list above). Now, if a website is not implemented properly, it will respond with 200 anyway, and you'll have to guess whether the login succeeded based on other things, such as response.content and response.url.
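Putting the two together, here is a minimal sketch of your mechanize script rewritten with requests (assuming the same local login.php/index.php setup from the question):

import requests

url = 'http://localhost/test/login.php'
values = {'username': 'admin', 'password': 'wrongpwd'}

# requests follows the redirect by default, so r.url is the final URL
r = requests.post(url, data=values)
if r.url == url:
    print('Failed')
else:
    print('OK')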

Related

How to get information from XML after login with Python requests

I am trying to get some data from an XML page on a website that requires me to log in first. I use a Python requests Session and am able to log in using a GET request (it returns 200). After that I try to access the XML, but I get a 401.
So far I know that the server checks on each call whether the client is sending a JSESSIONID cookie and whether its value matches the current session.
I managed to get the corresponding cookies and tried to send them via a POST request, but I still get a 401.
Maybe I'm overcomplicating this. I also don't need to achieve this with requests; I would be glad to just get the information from the XML.
import requests

login_url = 'https://www.example.com/login'
USER_NAME = 'user'
PASSWORD = 'pass'
xml = 'https://www.example.com/channel_index?123&p=web'

with requests.Session() as s:
    response = s.get(login_url, auth=(USER_NAME, PASSWORD))
    print(response)
    r = s.get(xml)
    cookies = s.cookies.get_dict()
    r = s.post(xml, cookies=cookies)
    print(r)
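One thing worth noting: a requests.Session already stores and re-sends cookies (including JSESSIONID) on every request, so passing cookies= by hand shouldn't be necessary. Here is a minimal sketch, assuming the same example URLs and that the server sets the session cookie on the login GET:

import requests
import xml.etree.ElementTree as ET

login_url = 'https://www.example.com/login'  # example URL from the question
xml_url = 'https://www.example.com/channel_index?123&p=web'

with requests.Session() as s:
    # any JSESSIONID cookie set here stays in the session
    # and is sent automatically on the next request
    s.get(login_url, auth=('user', 'pass'))
    r = s.get(xml_url)
    if r.status_code == 200:
        root = ET.fromstring(r.text)
        print(root.tag)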

Using Python requests I cannot access a private website, though it works in a browser

I'm trying to download some (.csv) files from a private website with the Python requests module.
I can access the website in a browser. After I type in the URL, it pops up a window to fill in a username and password.
After that, it starts to download a (.csv) file.
However, it fails when I use the Python requests module.
Here is my code.
import requests

# username and password in base64
b64_IDpass = '******'
tics_headers = {
    "Host": 'http://tics-sign.com',
    "Authorization": 'Basic {}'.format(b64_IDpass)
}
# company internet proxy
proxy = {'http': '*****'}
# url
url_get = 'http://tics-sign.com/getlist'
r = requests.get(url_get, headers=tics_headers, proxies=proxy)
print(r)
# <Response [404]>
I've checked the headers in a browser, and there is no problem there.
So why does it return <Response [404]> when using Python?
You need to post your username and password before you can get the link.
So you could try this:
requests.post("http://tics-sign.com", headers=tics_headers)
And then get the info:
requests.get(url_get, proxies=proxy)
This has worked for me on all the previous sites I have scraped that need authentication.
The problem is that each site has a different way of accepting authentication, so it may not even work.
It may also be that Python is not being redirected to http://tics-sign.com/displaypanel/login.aspx; curl wasn't for me.
Edit:
I looked at the HTML source of the website and came up with this:
login_data = {"logName": your_id, "pwd": your_password}
requests.post("http://tics-sign.com/displaypanel/login.aspx", data=login_data)
r = requests.get(url_get, proxies=proxy)
You can look at my blog for more info.
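One caveat with the snippet above: a plain requests.post followed by a separate requests.get won't carry the login cookie between the two calls. A sketch using a Session instead (assuming the same login endpoint and form field names from the edit above):

import requests

with requests.Session() as s:
    # endpoint and field names taken from the HTML source, as in the edit above
    login_data = {"logName": "your_id", "pwd": "your_password"}
    s.post("http://tics-sign.com/displaypanel/login.aspx", data=login_data)
    # the session re-sends the login cookie automatically
    r = s.get("http://tics-sign.com/getlist", proxies={'http': '*****'})
    print(r.status_code)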

Remote login to decoupled website with python and requests

I am trying to log in to a website, www.seek.com.au, to test the possibility of remote login using the Python requests module. The site's front end is built with React, so I don't see any form action component on www.seek.com.au/sign-in.
When I run the code below, I see response code 200, indicating success, but I doubt it actually succeeded. The main concern is which URL to use when there is no action element in the login form.
import requests

payload = {'email': <username>, 'password': <password>}
url = 'https://www.seek.com.au'

with requests.Session() as s:
    response_op = s.post(url, data=payload)
    # print the response status code
    print(response_op.status_code)
    print(response_op.text)
When I examine the output (response_op.text), I see the words 'Sign in' and 'Register' in it, which indicates the login failed. If it were successful, the user's first name would be shown in their place. What am I doing wrong here?
P.S.: I am not trying to scrape data from this website; I am trying to log in to a similar website.
Try this code:
import requests

payload = {"email": "test@test.com", "password": "passwordtest", "rememberMe": True}
url = "https://www.seek.com.au:443/userapi/login"

with requests.Session() as s:
    response_op = s.post(url, json=payload)
    # print the response status code
    print(response_op.status_code)
    print(response_op.text)
You were sending the request to the wrong URL; the login endpoint is /userapi/login, not the site root.
Hope this helps
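As a sanity check that the login actually took (rather than trusting the 200 alone), you can reuse the same session to fetch the home page and look for the marker the question mentions, the user's first name replacing 'Sign in':

import requests

payload = {"email": "test@test.com", "password": "passwordtest", "rememberMe": True}

with requests.Session() as s:
    s.post("https://www.seek.com.au:443/userapi/login", json=payload)
    # fetch the home page with the now-authenticated session
    home = s.get("https://www.seek.com.au")
    # a successful login replaces 'Sign in' with the user's first name,
    # so its absence suggests the login stuck
    print('Sign in' not in home.text)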

Failing to log in to a website with requests

First, here is my code:
import requests

payload = {'name': 'loginname',
           'password': 'loginpassword'}

requests.post('http://myurl/auth', data=payload, verify=False)
rl = requests.get('http://myurl/dashboard', verify=False)
print(rl.text)
My problem:
I get status code 200, which means the login was successful.
But when I try to visit the protected page http://myurl/dashboard, the output doesn't fit: it shows me the login page again, and I don't get why.
I know there are many questions like this, and I have studied every answer and the docs, but I still don't get it.
Any help would be very nice. It's driving me crazy. Thanks in advance!
You need to use a Session object so that requests keeps track of your cookies, such as the login cookie that gets set after your requests.post login action. Quoting the first example there:
s = requests.Session()
s.get('http://httpbin.org/cookies/set/sessioncookie/123456789')
r = s.get('http://httpbin.org/cookies')
print(r.text)
# '{"cookies": {"sessioncookie": "123456789"}}'
So in your case:
import requests

payload = {'name': 'loginname',
           'password': 'loginpassword'}

s = requests.Session()
r1 = s.post('http://myurl/auth', data=payload, verify=False)
if r1.status_code == 200:  # not necessary if you are sure it logs in successfully
    r2 = s.get('http://myurl/dashboard', verify=False)

Python URL form submission not working

Using Python, I'm trying to submit a form to a URL and get a response. This is what I'm doing:
import urllib, urllib2

data = {
    'date': '300186',
    'search_type': 'state',
    'search_state': 'NY',
}
req = urllib2.Request(
    url='https://services.aamc.org/20/mcat/findsite',
    data=urllib.urlencode(data),
    headers={"Content-type": "application/x-www-form-urlencoded"}
)
response = urllib2.urlopen(req)
print(response.read())
However, I'm getting this:
<script>location.replace('https://services.aamc.org/20/mcat');</script>
Which I guess simply means a redirect to the main page... Did I miss something, or is the AAMC website doing this on purpose?
Thanks
EDIT:
So I'm basically trying to connect to the URL "https://services.aamc.org/20/mcat/findsite/findexam?date=300186&search_type=state&search_state=NY", and this works fine when I enter it in my browser. So I guess there's nothing wrong with the query.
I suppose that you have already logged into the site from your browser. So the site may have put a cookie in your browser to identify you on the next attempt and log you in automatically (as stackoverflow.com does). But when you send the request from the Python script, there is nothing to identify you, and you are redirected to the login page (I tried the URL you show and was ...).
You will have to do the login in your script to get past the login page. But for it to work, you will have to add an HTTPCookieProcessor handler to an opener:
cj = cookielib.CookieJar()
opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(cj))
then use
response = opener.open(req)
instead of
response = urllib2.urlopen(req)
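Putting the pieces together with the original request, a sketch (Python 2, since urllib2 and cookielib only exist there; the site-specific login step is left as a comment):

import urllib
import urllib2
import cookielib

# a cookie jar lets the opener store and re-send session cookies
cj = cookielib.CookieJar()
opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(cj))

# ... log in here first, so the jar holds an authenticated session cookie ...

data = {
    'date': '300186',
    'search_type': 'state',
    'search_state': 'NY',
}
req = urllib2.Request(
    url='https://services.aamc.org/20/mcat/findsite',
    data=urllib.urlencode(data),
    headers={"Content-type": "application/x-www-form-urlencoded"}
)
response = opener.open(req)
print(response.read())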
