Python: Trying to log in with requests and perform an HTTP request

I am trying to log in to my account using the following Python code, without success. The login process has two steps on two pages: first enter the email address, then enter the password. I am using Python 3:
from bs4 import BeautifulSoup
import requests, lxml.html

with requests.Session() as s:
    # First login page
    login = s.get('https://accounts.ft.com/login')
    login_html = lxml.html.fromstring(login.text)
    # Get the form inputs
    hidden_inputs = login_html.xpath(r'//form//input')
    form = {x.name: x.value for x in hidden_inputs}
    # Fill in the email
    form['email'] = 'me@mail.com'
    response = s.post('https://accounts.ft.com/login', data=form)
    # Receives response 200

    # Second login page
    login_html = lxml.html.fromstring(response.text)
    # Get the form inputs
    hidden_inputs = login_html.xpath(r'//form//input')
    form = {x.name: x.value for x in hidden_inputs}
    # Fill in the email and password
    form['email'] = 'me@mail.com'
    form['password'] = 'p****word'
    response = s.post('https://accounts.ft.com/login', data=form)
    # Receives response 200

    # Try to read an article while logged in
    page = s.get('https://www.ft.com/content/173695cc-1a98-11e7-a266-12672483791a')
    soup = BeautifulSoup(page.content, 'html.parser')
    print(soup.prettify())
    # data-next-is-logged-in="false" => Please Register to read this page...
Here is what the Form looks like:
<div class="js-container" data-component="two-step-login-form" id="content">
<div class="lgn-box">
<form action="/login/submitEmail" class="js-email-lookup-form" data-test-id="enter-email-form" method="POST" name="enter-email-form" novalidate="">
<input name="location" type="hidden" value="" />
<input name="continueUrl" type="hidden" value="" />
<input name="readerId" type="hidden" value="" />
<input name="loginUrl" type="hidden" value="/login" />
<div class="lgn-box__title">
<h1 class="lgn-heading--alpha">
Sign in
</h1>
</div>
<div class="o-forms-group">
<label class="o-forms-label" for="email">
Email address
</label>
<input autocomplete="off" autofocus="" class="o-forms-text js-email" id="email" maxlength="64" name="email" required="" type="email">
<input id="password" name="password" style="display:none" type="password">
<label for="password">
</label>
</input>
</input>
</div>
<div class="o-forms-group">
<button class="o-buttons o-buttons--standout o-buttons--big" name="Next" type="submit">
Next
</button>
</div>
</form>
</div>
Here is what my data passed to POST looks like:
form
{'password': 'p****word', 'continueUrl': '', 'loginUrl': '/login', 'email': 'me@mail.com', 'readerId': '', 'location': ''}
The POST request returns a 200 response for both the first and the second login page, but it seems that I am still not logged in.
I have also tried using http://accounts.ft.com/sso/redirects?email=me@mail.com as the URL for the POST request, which returns a 405 error.
I am not sure that I am actually not logged in, but I have no idea how to check that.
Is it possible that the website prevents me from logging in when I am not using a web browser?

Try using Selenium to drive a real web browser, since it appears that FT blocks automated access.
Alternatively, check whether the page has been archived by a service such as archive.is, which pulls most sites into a more machine-friendly form.
Finally, the FT offers both a data-mining API and a headline API on its developer page.
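On the question of how to check whether the session is logged in: the article page printed in the question carries a data-next-is-logged-in attribute, so one option is to read that attribute from the response. The following is a minimal sketch using only the standard library. The class and function names are made up for illustration, and it assumes the attribute appears on some element with the value "true" or "false", as in the question's output.

```python
from html.parser import HTMLParser


class LoginFlagParser(HTMLParser):
    """Record the value of the first data-next-is-logged-in attribute seen."""

    def __init__(self):
        super().__init__()
        self.logged_in = None  # None means the attribute was not found

    def handle_starttag(self, tag, attrs):
        if self.logged_in is not None:
            return
        for name, value in attrs:
            if name == "data-next-is-logged-in":
                self.logged_in = (value == "true")


def is_logged_in(page_html):
    """Return True or False from the attribute, or None if it is absent."""
    parser = LoginFlagParser()
    parser.feed(page_html)
    return parser.logged_in


# Hypothetical snippets standing in for page.text from the session.
print(is_logged_in('<html data-next-is-logged-in="false"><body></body></html>'))
print(is_logged_in('<html data-next-is-logged-in="true"><body></body></html>'))
```

In the question's script, is_logged_in(page.text) could be called right after the article is fetched.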

Related

Python: How to submit a CGI form using requests

I just started learning Python and want to write a script that submits a form.
I found that the form uses CGI. Here is the form:
<div class="box" id="url_upload">
<div class="tabcontent">
<div class="progress_div"></div>
<div class="reurlupload">
<div class="progress_div"></div>
<form method="post" id="uploadurl" action="https://af03.ayefiles.com/cgi-bin/upload.cgi?upload_type=url">
<input type="hidden" name="sess_id" value="xv71zsrmtr38oh3z">
<input type="hidden" name="utype" value="reg">
<input type="hidden" name="file_public" value="1">
<div class="leftintab">
<p style="margin:0px;">
You can enter up to <b>20</b> URLs, one URL per row</br>
Max file size is <b>10240 Mb</b>
</p>
<textarea name="url_mass" style="width:100%; margin-top: 10px;" placeholder="e.g. http://example.com/xxxxxxxxxx.xyz"></textarea>
</div>
I wrote a Python script using requests, as below:
from lxml import html

# s is a requests session that already holds my login cookies
# Go to the form page
login = s.get('https://ayefiles.com/?op=upload_form')
login_html = html.fromstring(login.content)
hidden_inputs = login_html.xpath('//input[@type="hidden"]')
# Fill in the form data
form = {x.attrib["name"]: x.attrib["value"] for x in hidden_inputs}
form['sess_id'] = 'xv71zsrmtr38oh3z'
form['utype'] = 'reg'
form['file_public'] = '1'
form['url_mass'] = longurl
# POST
login = s.post('https://af03.ayefiles.com/cgi-bin/upload.cgi?upload_type=url', data=form)
print(login.url)
The result I expect for login.url is: ayefiles.com/?op=upload_result&st=OK&fn=xxxxx
But what I actually get is: ayefiles.com/?op=upload_result&st=Torrent%20engine%20is%20not%20running&fn=undef
How can I solve this? What is wrong with my code?
Please help me with the correct code.
My mistake was in the multipart form data part.
Corrected code:
# (None, value) tuples passed through files= are sent as multipart/form-data fields
form = {'sess_id': (None, 'xv71zsrmtr38oh3z'), 'utype': (None, 'reg'), 'file_public': (None, '1'), 'url_mass': (None, longurl)}
login = s.post('https://af03.ayefiles.com/cgi-bin/upload.cgi?upload_type=url', files=form)
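To make the difference concrete, here is a rough sketch of what a multipart/form-data body looks like for these fields. It is built by hand with the standard library purely for illustration. The helper name encode_multipart is made up, the url_mass value is a placeholder, and requests produces a body of this general shape itself when (None, value) tuples are passed through its files= argument, so this code is not needed in practice.

```python
import uuid


def encode_multipart(fields):
    """Encode a dict of text fields as a multipart/form-data body.

    Returns (body_bytes, content_type_header_value).
    """
    boundary = uuid.uuid4().hex
    lines = []
    for name, value in fields.items():
        lines.append("--" + boundary)
        lines.append('Content-Disposition: form-data; name="{}"'.format(name))
        lines.append("")  # blank line separates part headers from the value
        lines.append(str(value))
    lines.append("--" + boundary + "--")
    lines.append("")
    body = "\r\n".join(lines).encode("utf-8")
    content_type = "multipart/form-data; boundary=" + boundary
    return body, content_type


# Field names come from the form in the question; the URL value is a placeholder.
fields = {
    "sess_id": "xv71zsrmtr38oh3z",
    "utype": "reg",
    "file_public": "1",
    "url_mass": "http://example.com/file.xyz",
}
body, content_type = encode_multipart(fields)
print(content_type)
print(body.decode("utf-8"))
```

Each field travels as its own part separated by the boundary string, unlike the single key=value&key=value line that data= produces.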

Python - Login to Website

I know that there are several posts on this subject, and I believe I have read a significant number of them; however, I still can't log in to this website.
Below is my inspection of the login page:
<form id="login" name="login" method="POST" action="/signin">
<div id="login_username">
<label>Email</label>
<input class="textfield" id="email" name="email" type="text" autocomplete="off" value="">
</div>
<div id="login_password">
<label>Password</label>
<input class="textfield" id="password" name="password"
type="password" autocomplete="off" value="">
</div>
<input type="hidden" id="hash" name="hash" value="">
<div id="login_submit">
<a id="forgot_password_link">Forgot Password?</a>
<input class="submitbutton" type="submit" value="Sign In">
</div>
</form>
Below is my code:
import requests

username = 'XXXXX@gmail.com'
password = 'XXXX'
hash = ''
data = {'password': password, 'email': username, 'hash': hash}
login_url = "https://carmel.orangetheoryfitness.com/login"
s = requests.session()
result = s.post(login_url, data=data, headers=dict(referer=login_url))
scrape_url = 'https://carmel.orangetheoryfitness.com/apps/otf/classes/view?id=16297&loc=0'
result = s.get(url=scrape_url)
From here I go on to search the HTML document, but I'm not finding what I want, because I am sent back to the login page when getting scrape_url. I have verified this by inspecting the resulting HTML document.
Things I have considered:
- Almost all blog posts or SO answers indicate that a CSRF token is usually required. I have searched the login page and can't find a CSRF token.
The form has an action="/signin" attribute, so you need to post to https://carmel.orangetheoryfitness.com/signin instead:
result = s.post('https://carmel.orangetheoryfitness.com/signin', data=data, headers=dict(referer=login_url))
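Rather than typing the target URL by hand, the form's action can be resolved against the URL of the page that served it, which is what a browser does. A small sketch with the standard library follows. The variable names are illustrative and the two values are copied from the question.

```python
from urllib.parse import urljoin

# URL of the page that served the login form (from the question).
login_page_url = "https://carmel.orangetheoryfitness.com/login"
# Value of the form's action attribute (from the HTML in the question).
form_action = "/signin"

# A browser resolves the action relative to the page URL; urljoin does the same.
post_url = urljoin(login_page_url, form_action)
print(post_url)  # https://carmel.orangetheoryfitness.com/signin
```

The same call also handles absolute actions, which are returned unchanged.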

Can't login on a form with requests

I think I am doing everything correctly, but the script does not log in on a simple form.
After logging in, I use the get method to check whether I can see the user panel, but I always receive the index page, as if I were not logged in.
The username and password values are correct.
Any ideas?
import requests
url = 'http://streamcloud.eu/login.html'
headers = {'User-Agent': 'Mozilla/5.0'}
payload = {
    'login': 'my_login',
    'password': 'my_password'
}
r = requests.session()
r.get(url)
login = r.post(url,data=payload,headers=headers)
result = r.get('http://streamcloud.eu/?op=my_account')
print(result.text)
You need to "post" your form data to http://streamcloud.eu/. Additionally, pass a parameter called op with the value login to indicate that you want to log in. All of this can be found with a quick look at the HTML of the target website:
<form method="POST" action="http://streamcloud.eu/" class="proform" name="FL">
<input type="hidden" name="op" value="login">
<input type="hidden" name="redirect" value="http://streamcloud.eu/">
<input type="hidden" name="rand" value="">
<p>
<label>Benutzername:</label>
<input type="text" style="font-style: normal;" name="login" value="my_login" class="text_field">
</p>
<div class="clear"></div>
<p>
<label>Passwort:</label>
<input type="password" style="font-style: normal;" name="password" class="text_field">
</p>
<div class="clear"></div>
<div class="clear"></div>
<br>
<div>
<input type="submit" class="button blue medium" value="Senden">
</div>
<div class="clear"></div>
</form>
As you can see the form posts its information to http://streamcloud.eu/:
<form method="POST" action="http://streamcloud.eu/" class="proform" name="FL">
Here you can see the hidden op parameter:
<input type="hidden" name="op" value="login">
Here is the updated code:
import requests
url = 'http://streamcloud.eu'
headers = {'User-Agent': 'Mozilla/5.0'}
payload = {
    'op': 'login',
    'login': 'my_login',
    'password': 'my_password'
}
r = requests.session()
r.get(url)
login = r.post(url, data=payload, headers=headers)
result = r.get('http://streamcloud.eu/?op=my_account')
print(result.text)
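Instead of copying hidden parameters such as op by hand, they can be collected from the form markup. The sketch below does this with the standard library's HTMLParser. The class name FormFieldParser is made up for illustration, and FORM_HTML is a shortened copy of the form quoted above rather than a live response.

```python
from html.parser import HTMLParser


class FormFieldParser(HTMLParser):
    """Collect the action URL and named <input> values of the first form."""

    def __init__(self):
        super().__init__()
        self.action = None
        self.fields = {}
        self._in_form = False
        self._done = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "form" and not self._done:
            self._in_form = True
            self.action = attrs.get("action")
        elif tag == "input" and self._in_form and attrs.get("name"):
            # Skip submit buttons; keep hidden, text and password inputs.
            if attrs.get("type") != "submit":
                self.fields[attrs["name"]] = attrs.get("value") or ""

    def handle_endtag(self, tag):
        if tag == "form" and self._in_form:
            self._in_form = False
            self._done = True


# Shortened copy of the login form quoted above.
FORM_HTML = """
<form method="POST" action="http://streamcloud.eu/" class="proform" name="FL">
<input type="hidden" name="op" value="login">
<input type="hidden" name="redirect" value="http://streamcloud.eu/">
<input type="hidden" name="rand" value="">
<input type="text" name="login" value="my_login" class="text_field">
<input type="password" name="password" class="text_field">
<input type="submit" class="button blue medium" value="Senden">
</form>
"""

parser = FormFieldParser()
parser.feed(FORM_HTML)
payload = parser.fields
payload["login"] = "my_login"        # placeholder credentials
payload["password"] = "my_password"
print(parser.action)
print(payload)
```

With a live session, the text of the login page would be fed to the parser, and the result passed on as r.post(parser.action, data=payload, headers=headers).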
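After r.post(...), the response carries a cookie (see login.headers['Set-Cookie']).
That cookie has to be sent with the following r.get(...). A requests session normally does this automatically, so it is worth checking r.cookies to confirm that the cookie was stored.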

python requests module with redirect

I am trying to perform a GET request in Python using the requests module. However, before I can do the GET, the website redirects me to a login page. I need to log in first, which will then take me to the page I am requesting.
Following is the content I receive after doing the GET. How should I perform the login in order to access the page I am looking for? Any help would be appreciated!
<form action="/idp/profile/SAML2/Redirect/SSO?execution=e1s1" method="post">
<div class="form-element-wrapper">
<label for="username">Username</label>
<input class="form-element form-field" id="username" name="j_username" type="text" value="">
</div>
<div class="form-element-wrapper">
<label for="password">Password</label>
<input class="form-element form-field" id="password" name="j_password" type="password" value="******">
</div>
<div class="form-element-wrapper">
<input type="checkbox" name="donotcache" value="1">Don't Remember Login </div>
<div class="form-element-wrapper">
<input id="_shib_idp_revokeConsent" type="checkbox" name="_shib_idp_revokeConsent" value="true">
Clear prior granting of permission for release of your information to this service.
</div>
<div class="form-element-wrapper">
<button class="form-element form-button" type="submit" name="_eventId_proceed"
onClick="this.childNodes[0].nodeValue='Logging in, please wait...'">Login</button>
</div>
</form>
Following is the code I have written so far:
values = {'j_username': '****'}
with requests.Session() as s:
    p = s.get(url, verify=False)
    logger.info(p.text)
values = {'j_username': '****'}
with requests.Session() as session:
    # login_url is not defined in this snippet; presumably the URL the login form posts to
    login_response = session.post(login_url, data=values, verify=False)
    # The session now holds the session cookie, so subsequent requests
    # will be authenticated. It is worth inspecting the response to make
    # sure it has the expected status code.
    other_response = session.get(url)  # expect this not to redirect to the login page
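Note that the values dictionary above only fills j_username, while the form also has a j_password input and a named submit button. As a sketch of what a browser would send for this form, the snippet below encodes those fields with the standard library. The credential values are placeholders, and whether the server actually requires the button name _eventId_proceed is an assumption that would need to be checked against the real site.

```python
from urllib.parse import urlencode

# Field names are taken from the form above; the values are placeholders.
fields = {
    "j_username": "my_username",
    "j_password": "my_password",
    # A browser also sends the name of the submit button that was clicked.
    "_eventId_proceed": "",
}

# This is the body a browser would send as application/x-www-form-urlencoded.
body = urlencode(fields)
print(body)
```

With requests, the same dictionary can be passed directly as data=fields, and the library performs this encoding itself.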

Python 3 script for logging into a website using the Requests module

I'm trying to write some Python (3.3.2) code to log in to a website using the Requests module. Here is the form section of the login page:
<form method="post" action="https://www.ibvpn.com/billing/dologin.php" name="frmlogin">
<input type="hidden" name="token" value="236647d2da7c8408ceb78178ba03876ea1f2b687" />
<div class="logincontainer">
<fieldset>
<div class="clearfix">
<label for="username">Email Address:</label>
<div class="input">
<input class="xlarge" name="username" id="username" type="text" />
</div>
</div>
<div class="clearfix">
<label for="password">Password:</label>
<div class="input">
<input class="xlarge" name="password" id="password" type="password"/>
</div>
</div>
<div align="center">
<p>
<input type="checkbox" name="rememberme" /> Remember Me
</p>
<p>Request a Password Reset</p>
</div>
</fieldset>
</div>
<div class="actions">
<input type="submit" class="btn primary" value="Login" />
</div>
</form>
Here is my code, trying to deal with hidden input:
import requests
from bs4 import BeautifulSoup
url = 'https://www.ibvpn.com/billing/clientarea.php'
body = {'username':'my email address','password':'my password'}
s = requests.Session()
loginPage = s.get(url)
soup = BeautifulSoup(loginPage.text)
hiddenInputs = soup.findAll(name = 'input', type = 'hidden')
for hidden in hiddenInputs:
    name = hidden['name']
    value = hidden['value']
    body[name] = value
r = s.post(url, data = body)
This just returns the login page. If I post my login data to the URL in the 'action' field, I get a 404 error.
I've seen other posts on StackExchange where automatic cookie handling doesn't seem to work, so I've also tried dealing with the cookies manually using:
cookies = dict(loginPage.cookies)
r = s.post(url, data = body, cookies = cookies)
But this also just returns the login page.
I don't know if this is related to the problem, but after I've run either variant of the code above, entering r.cookies returns <<class 'requests.cookies.RequestsCookieJar'>[]>
If anyone has any suggestions, I'd love to hear them.
You are loading the wrong URL. The form has an action attribute:
<form method="post" action="https://www.ibvpn.com/billing/dologin.php" name="frmlogin">
so you must post your login information to:
https://www.ibvpn.com/billing/dologin.php
instead of posting back to the login page. POST to soup.form['action'] instead:
r = s.post(soup.form['action'], data=body)
Your code is handling cookies just fine; I can see that s.cookies holds a cookie after requesting the login form, for example.
If this still doesn't work (a 404 is returned), then the server is using additional techniques to tell scripts from real browsers. Usually this is done by inspecting the request headers. Look at the headers your browser sends and replicate those. It may just be the User-Agent header that they check, but the Accept-* headers and Referer can also play a role.
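As a sketch of what replicating browser headers can look like, the snippet below builds a POST request with a browser-like header set. It uses the standard library's urllib.request so that it runs without extra packages and without touching the network. All header values and the form body are illustrative placeholders, so the real ones should be copied from the browser's developer tools.

```python
import urllib.request

# Header values are illustrative; copy the real ones from the browser's
# developer tools (Network tab) for the login request.
browser_headers = {
    "User-Agent": "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 "
                  "(KHTML, like Gecko) Chrome/120.0 Safari/537.36",
    "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
    "Accept-Language": "en-US,en;q=0.5",
    "Referer": "https://www.ibvpn.com/billing/clientarea.php",
}

# Building the request does not send anything over the network.
request = urllib.request.Request(
    "https://www.ibvpn.com/billing/dologin.php",
    data=b"username=placeholder&password=placeholder",
    headers=browser_headers,
    method="POST",
)
print(request.get_method())
print(request.header_items())
```

With requests, the same dictionary can be passed as s.post(soup.form['action'], data=body, headers=browser_headers), or set once for the whole session with s.headers.update(browser_headers).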
