Download a CSV file with python requests from an authenticated URL


For one of my projects I need to fetch a CSV file from a URL that requires authentication. Unfortunately there is no API for it, so I decided to use requests with a session.
The login page is:
http://extranet.ffbb.com/fbi/identification.do
I checked the login page's HTML for the names of the login and password fields. The login field is "identificationBean.identifiant" and the password field is "identificationBean.mdp".
I try to log in and print the returned HTML with the following code, but it returns the HTML of the failed-login page (the same page I get when I type a wrong login or password in my browser). I am sure my credentials are right.
import requests
login = "my_login"
password = "my_password"
with requests.Session() as session:
    # Field names taken from the login form's HTML
    data = {
        'identificationBean.identifiant': login,
        'identificationBean.mdp': password,
    }
    url = 'http://extranet.ffbb.com/fbi/identification.do'
    response = session.post(url, data=data)
    print(response.text)
Thank you for your help
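Once the login itself works, the CSV can be fetched with the same session. The following is a minimal sketch, not a fix for the failed login above: `fetch_csv_rows` is a hypothetical helper name, and `session` is assumed to be an already logged-in `requests.Session` (only its `get()` method is used). It also checks the `Content-Type` header, on the assumption that a login page served in place of the file comes back as HTML.

```python
import csv
import io


def fetch_csv_rows(session, url):
    """GET `url` with a logged-in session and parse the body as CSV.

    `session` is assumed to behave like requests.Session; this helper
    is illustrative and not part of the requests library.
    """
    response = session.get(url)
    response.raise_for_status()  # raise on 4xx/5xx responses

    # A login page returned instead of the file is normally HTML.
    content_type = response.headers.get("Content-Type", "")
    if "html" in content_type.lower():
        raise RuntimeError("Got HTML instead of CSV; the session may not be logged in")

    # Parse the decoded text into a list of rows (each row is a list of strings).
    return list(csv.reader(io.StringIO(response.text)))
```

It would be called inside the `with requests.Session() as session:` block, after the login POST, with the URL of the CSV export.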

Related

Python: Remain logged in a website using mechanicalsoup

I am a newbie in Python. I want to write a script that requests thousands of URLs and saves the responses. Accessing these URLs requires credentials, so I have written a basic script that goes to a URL and prints the response. When I go through multiple URLs, the website returns an error saying that multiple users are logged in. I would therefore like to log in once and request the other URLs in the same logged-in session. Is there any way to do that with MechanicalSoup?
Here is my script:
import mechanicalsoup
browser = mechanicalsoup.StatefulBrowser()
# Log in once
browser.open("mywebsite1")
browser.select_form('form[action="/login"]')
browser.get_current_form().print_summary()
browser["userId"] = "myusername"
browser["password"] = "mypassword"
response = browser.submit_selected()
browser.launch_browser()
print(response.text)
# Open the next URL with the same browser object and keep its response
response = browser.open("mywebsite2")
print(response.text)
# ... and so on for all the URLs
How can I save my session? Thanks in advance for your help

Just save & load session.cookies:
import mechanicalsoup
from requests.utils import cookiejar_from_dict
def save_cookies(browser):
    # Return the current cookies as a plain dict
    return browser.session.cookies.get_dict()
def load_cookies(browser, cookies):
    # Replace the session's cookie jar with one built from the dict
    browser.session.cookies = cookiejar_from_dict(cookies)
browser = mechanicalsoup.StatefulBrowser()
browser.open("https://www.google.com")
cookies = save_cookies(browser)
load_cookies(browser, cookies)
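To reuse a login across separate runs of a script, the cookie dict has to be written somewhere. Below is a minimal sketch that stores it as JSON; the function names and the file path are illustrative, and `session` is assumed to be a `requests.Session` (for MechanicalSoup, `browser.session`). Note that `get_dict()` keeps only cookie names and values, so domain, path and expiry information is lost.

```python
import json


def save_cookies_to_file(session, path):
    """Write the session's cookies (name -> value) to `path` as JSON."""
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(session.cookies.get_dict(), fh)


def load_cookies_from_file(session, path):
    """Read cookies from `path` and add them to the session's cookie jar."""
    with open(path, encoding="utf-8") as fh:
        session.cookies.update(json.load(fh))
```

For example, call `save_cookies_to_file(browser.session, "cookies.json")` after logging in, and `load_cookies_from_file(browser.session, "cookies.json")` at the start of the next run. Whether the site still accepts the stored cookies depends on its session lifetime.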

Remote login to decoupled website with python and requests

I am trying to log in to the website www.seek.com.au, to test whether a remote login is possible with the Python requests module. The site's front end is built with React, so I don't see any form action attribute on www.seek.com.au/sign-in
When I run the code below, I get response code 200, indicating success, but I doubt the login actually worked. My main question is which URL to use when the login form has no action attribute.
import requests
# Replace the placeholders with real credentials
payload = {'email': '<username>', 'password': '<password>'}
url = 'https://www.seek.com.au'
with requests.Session() as s:
    response_op = s.post(url, data=payload)
    # print the response status code
    print(response_op.status_code)
    print(response_op.text)
When I examine the output (response_op.text), I see the words 'Sign in' and 'Register', which indicates that the login failed. If it had succeeded, the user's first name would be shown in their place. What am I doing wrong here?
P.S.: I am not trying to scrape data from this website; I am trying to log in to a similar website.

Try this code:
import requests
payload = {"email": "test#test.com", "password": "passwordtest", "rememberMe": True}
url = "https://www.seek.com.au:443/userapi/login"
with requests.Session() as s:
    response_op = s.post(url, json=payload)
    # print the response status code
    print(response_op.status_code)
    print(response_op.text)
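You are sending the request to the wrong URL. The code above posts to /userapi/login and sends the payload as JSON (json=payload) instead of form data.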
Hope this helps
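As the question notes, a 200 status code alone does not show that the login worked. A minimal sketch of the check described there follows; `looks_logged_in` is a hypothetical helper, and the default marker strings are simply the ones the asker saw on the logged-out page, so they are an assumption for any other site.

```python
def looks_logged_in(status_code, body, failure_markers=("Sign in", "Register")):
    """Heuristic: treat the response as logged in only if the status is 200
    and none of the logged-out markers appear in the page text."""
    if status_code != 200:
        return False
    return not any(marker in body for marker in failure_markers)
```

With the code above it would be called as `looks_logged_in(response_op.status_code, response_op.text)`. A stricter check would look for something only a logged-in page contains, such as the user's first name.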

Using the Python requests module to log in to a WordPress-based website

I'm trying to log in to a WordPress-based website using Python's requests module and beautifulsoup4. The code seems to fail to log in successfully. Also, there is no CSRF token on the website. How do I log in successfully?
import requests
import bs4 as bs
with requests.session() as c:
    link = "https://gpldl.com/sign-in/"  # link of the webpage to be logged in
    initial = c.get(link)  # passing the get request
    # the login data from any account on the site; stars must be replaced with username and password
    login_data = {"log": "*****", "pwd": "******"}
    page_login = c.post(link, data=login_data)  # posting the login data into the link
    print(page_login)  # checking status of requested page
    page = c.get("https://gpldl.com/my-gpldl-account/")  # requesting source code of logged in page
    good_data = bs.BeautifulSoup(page.content, "lxml")  # parsing it with BS4
    # printing this gives the title that is got from the page when accessed from a logged-out account
    print(good_data.title)

You are sending your POST request to the wrong URL; the correct one is https://gpldl.com/wp-login.php. The payload also has five parameters: log, pwd, rememberme, redirect_to and redirect_to_automatic.
So it should be:
login_data = {
    "log": "*****",
    "pwd": "******",
    "rememberme": "forever",
    "redirect_to": "https://gpldl.com/my-gpldl-account/",
    "redirect_to_automatic": "1",
}
page_login = c.post('https://gpldl.com/wp-login.php', data=login_data)
Edit:
You can use Chrome DevTools to find all of this information while logging in through the browser: the Network tab shows the request URL and the form data that was sent.
As for the rememberme key, I suggest doing exactly what the browser does. Also add some headers to your request, especially User-Agent, because some websites do not welcome logins made this way.
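The same information can also be read from the login page's HTML in code. The sketch below uses only the standard library to collect the first form's action URL and its named input fields, including hidden ones; `FormFieldParser` and `read_login_form` are hypothetical names. It is an assumption that the fields are present in the served HTML, which is not the case for forms built by JavaScript.

```python
from html.parser import HTMLParser


class FormFieldParser(HTMLParser):
    """Collect the action URL and named <input> values of the first <form>."""

    def __init__(self):
        super().__init__()
        self.action = None
        self.fields = {}
        self._in_form = False
        self._done = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "form" and not self._done:
            self._in_form = True
            self.action = attrs.get("action")
        elif tag == "input" and self._in_form and attrs.get("name"):
            # Inputs without a value attribute are stored as empty strings.
            self.fields[attrs["name"]] = attrs.get("value") or ""

    def handle_endtag(self, tag):
        if tag == "form" and self._in_form:
            self._in_form = False
            self._done = True  # ignore any later forms on the page


def read_login_form(html):
    """Return (action, fields) for the first form found in `html`."""
    parser = FormFieldParser()
    parser.feed(html)
    return parser.action, parser.fields
```

A typical use would be to call `read_login_form(c.get(link).text)`, overwrite the username and password entries in the returned dict, and POST the dict to the action URL (resolved against the page URL with `urllib.parse.urljoin`).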

Python Sessions/Requests Losing Authentication Mid-Session

I'm trying to login to Campaign Monitor to scrape some data from pages related to email campaign performance.
The "login-protected" URL of the page I'm trying to access looks like this:
https://mycompany.createsend.com/campaigns/reports/lists/DFGDF987GD98F7GD?s=BCV98B5XF54BVC54BC
Going to that page in a web browser will redirect to the login page, itself with a URL like this:
https://login.createsend.com/l/98SDF76DS87F68S/DFGDF987GD98F7GD?ReturnUrl=%2Fcampaigns%2Freports%2Flists%2FBCV98B5XF54BVC54BC%3Fs%3BCV98B5XF54BVC54BC&s=7DS6F87S6DF876SDF76
What I've gathered from trying to solve this is that I need to open a session, authenticate on the redirect URL, then request the URL that I actually want (using the authenticated session).
Here is the code I'm using to try to accomplish that:
import requests
payload = {
    'username': 'myUsername',
    'password': 'myPassword',
}
redURL = 'https://login.createsend.com/l/98SDF76DS87F68S/DFGDF987GD98F7GD?ReturnUrl=%2Fcampaigns%2Freports%2Flists%2FBCV98B5XF54BVC54BC%3Fs%3BCV98B5XF54BVC54BC&s=7DS6F87S6DF876SDF76'
with requests.Session() as s:
    p = s.post(redURL, data=payload)
    # This prints the "success" message I've pasted below
    print(p.content)
    r = s.get('https://mycompany.createsend.com/campaigns/reports/lists/DFGDF987GD98F7GD?s=BCV98B5XF54BVC54BC')
    # This prints the HTML of the login page again, as if I'm not authenticated
    print(r.content)
Here is the "successful" response after the first POST for the session:
{"MultipleAccounts":false,"LoginStatus":"Success","SiteAddress":"https://mycompany.createsend.com","ErrorMessage":"","SessionExpired":false,"Url":"https://mycompany.createsend.com/login?Origin=Marketing\u0026ReturnUrl=%2fcampaigns%2freports%2flists%2f92D2FBCV98B5XF54BVC%3fs%7DS6F87S6DF876SDF76\u0026s=2FBCV98B5XF54BVC","DomainSwitchAddress":"https://mycompany.createsend.com","DomainSwitchAddressQueryString":null,"NeedsDomainSwitch":false}
Can someone please help me out with why the second request in the session prints the HTML of the login page instead of the HTML of the authenticated version of the page (i.e. the page with the data I'm looking for)?
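The JSON returned by the first POST contains a "Url" field pointing at https://mycompany.createsend.com/login. One possible reading, which is an assumption and not confirmed anywhere in this thread, is that the login is only completed on the mycompany.createsend.com domain once that URL has been requested with the same session. A minimal sketch of extracting it (`follow_up_url` is a hypothetical helper name):

```python
import json


def follow_up_url(login_body):
    """Return the "Url" field of a successful login response, else None.

    `login_body` is the JSON text returned by the login POST.
    """
    info = json.loads(login_body)
    if info.get("LoginStatus") != "Success":
        return None
    return info.get("Url")
```

In the code above this would be `next_url = follow_up_url(p.text)`, followed by `s.get(next_url)` before requesting the report page.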

save session/cookies with requests

I want to send a POST request to a login form whose action is /ShowLogin.xpl. I wrote code using requests that sends a POST request with the username and password. However, when I print the content of the next page I still see the login page, because the session or the cookie isn't saved. Can anyone help me?
The url is: https://www.xplace.com/ShowLogin.xpl
My code is:
import requests
payload = {"j_username": "myusername", "j_password": "mypassword"}
with requests.Session() as s:
    webpage = s.post("https://www.xplace.com/ShowLogin.xpl", data=payload)
    # An authorised request
    r = s.get("https://www.xplace.com/ShowRecommendedProjects.xpl")
    print(r.text)
