I'm trying to write a script that logs in to this website automatically, and I'm having some trouble. I was hoping I could get help making it work. I've assembled the code below, but the bottom of what's returned contains 'Your request cannot be processed at this time\n', when I should be getting different HTML on a successful login:
from pyquery import PyQuery
import requests
url = 'https://licensing.gov.nl.ca/miriad/sfjsp?interviewID=MRlogin'
values = {
    'd_1553779889165': 'email#email.com',
    'd_1553779889166': 'thisIsMyPassw0rd$$$',
    'd_1618409713756': 'true',
    'd_1642075435596': 'Sign in'
}
r = requests.post(url, data=values)
print(r.content)
I do this in .NET, but I think the logic can be written in Python as well.
First, I always use Fiddler to capture the requests the web page sends, identify the request you want to replicate, and then add all the cookies and headers that are sent with it to your code.
After sending the login request you will get some cookies that identify you as logged in, and you use those cookies for everything that follows on the site. For example, if you want to retrieve a user's info after logging in, you first need to convince the server that you are logged in, and that is where those login cookies come in.
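Here is a minimal sketch of that flow with requests, assuming hypothetical form field names and URLs (the real names come from the request you captured in Fiddler):
import requests

login_url = 'https://example.com/login'      # hypothetical login endpoint
profile_url = 'https://example.com/account'  # hypothetical page that needs the session

payload = {
    'email': 'you@example.com',        # field names depend on the actual form
    'password': 'your_password',
}

with requests.Session() as s:
    # The session stores the cookies returned by the login response...
    login_response = s.post(login_url, data=payload)
    # ...and sends them automatically on every later request.
    profile_response = s.get(profile_url)
    print(profile_response.status_code)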
Also, I don't think the login will be quite that simple from a script: since you're trying to automate a government site, there may be some anti-bot protection in place, such as fingerprinting or a captcha.
Hope this helps!
Related
How do I log in to a website using the requests library? I watched a lot of tutorials, but they all seem to have a 302 POST request in the Network tab of their inspector, while I only see a lot of GET requests in mine when I log in. A friend of mine said cookies, but I'm really a beginner and I don't know how to log in.
Also, I would like to know the scope of what I can do with this library, and any helpful sources of information from which I can begin learning it.
import requests
r = requests.get("https://example.com")
I want to make a POST request. The same friend told me that I would need API access for that website to go any further. Is that true?
Depending on the site you are trying to log in to, it may be necessary to log in via a Chrome browser (Selenium) and from there extract and save the cookies for injection and use with the requests module.
To extract the cookies from Selenium and save them to a JSON file, use:
import json

# driver is your logged-in Selenium WebDriver instance
cookies = driver.get_cookies()
with open('file_you_want_to_save_the_cookies_to.json', 'w') as f:
    json.dump(cookies, f)
To then use these cookies with the requests module, use:
import requests

cookies = {
    'cookie_name': 'cookie_value'
}

with requests.Session() as s:
    # headers and url are whatever the target site expects
    r = s.get(url, headers=headers, cookies=cookies)
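If you saved the cookies to a JSON file as above, you can rebuild that dictionary from the file instead of typing the values by hand. A small sketch, assuming the same file name and a hypothetical target page:
import json
import requests

with open('file_you_want_to_save_the_cookies_to.json') as f:
    saved = json.load(f)

# Selenium exports a list of cookie dicts; requests wants a flat {name: value} mapping.
cookies = {c['name']: c['value'] for c in saved}

with requests.Session() as s:
    r = s.get('https://example.com/account', cookies=cookies)  # hypothetical logged-in page
    print(r.status_code)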
I could help further if you mention which site you are trying to do this on.
Well, best practice is to create a little project that uses the library. For the requests library, that can be accessing some of the free APIs on the internet. For example, I made this one a while ago, which uses a free API:
import requests

##########################################
#                                        #
#               INSULT API               #
#                                        #
##########################################
def insult_api(name):
    # "who" is the name to be inserted into the insult.
    params = {
        "who": name
    }
    response = requests.get(
        "https://insult.mattbas.org/api/en/insult.json",
        params=params)
    return response.json()
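Calling it is then a one-liner (the exact keys in the returned JSON depend on the API):
print(insult_api("Alice"))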
A helpful source is (obviously) the official documentation, plus basically any YouTube video or SO post here. Just look for the thing you want to do.
Anyway, if you are looking to log in to a website without API access, you can use the Selenium library.
The requests module is extremely powerful if used properly and with the necessary information sent in the requests. First, analyse the network traffic with a tool like the Network tab in Chrome DevTools, then try to replicate the request with requests in Python.
Usually, you will need the headers and the data that the browser sent:
headers = {
    <your headers here>
}
data = <data here>

req = requests.post("https://www.examplesite.com/api/login", data=data, headers=headers)
Everything should be easy to find in the network packets, unless the site has some security such as CSRF tokens, which need to be sent along with the login request. In that case you first send a GET request to obtain the token, then send the POST request with it, as sketched below.
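A sketch of that two-step flow, assuming a hypothetical login page whose form carries a hidden input named csrf_token:
import requests
from bs4 import BeautifulSoup

login_url = "https://www.examplesite.com/login"   # hypothetical

with requests.Session() as s:
    # Step 1: GET the login page and pull the token out of the hidden input.
    page = s.get(login_url)
    soup = BeautifulSoup(page.text, 'html.parser')
    token = soup.find('input', {'name': 'csrf_token'})['value']

    # Step 2: POST the credentials together with the token, on the same session.
    data = {
        'username': 'your_username',
        'password': 'your_password',
        'csrf_token': token,
    }
    req = s.post(login_url, data=data)
    print(req.status_code)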
If you could provide the site you're trying to use it would be pretty helpful too. Best of luck!
I have this webpage: https://www.dsbmobile.de. I would like to automate a bot that checks in and gets the newest file. I have an account and login credentials, so I tried this with Python:
import requests
url = "https://www.dsbmobile.de/Login.aspx?ReturnUrl=%2f"
payload = {'txtUser': 'username', 'txtPass': 'password'}
x = requests.post(url, data=payload)
print(x.text)
I get a result, but it's just the login page instead of the new page I should be redirected to.
When looking at the source, I saw that there are hidden input fields such as "__EVENTVALIDATION".
Do I need to send them too? Or maybe I need to set something in the headers, I don't know. It would be very nice if someone could tell me how to write the POST request exactly the way the browser sends it, so that I get the right response.
I am new but it would be extremely useful to me if I could automate that process.
Thank you very much for trying to help me
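For what it's worth, the usual approach for ASP.NET pages like this is to GET the login page first, copy every hidden input (__VIEWSTATE, __EVENTVALIDATION, and so on) into the payload, and only then POST. A sketch, assuming the field names visible in the page source; the submit button's name and value may also be required:
import requests
from bs4 import BeautifulSoup

url = "https://www.dsbmobile.de/Login.aspx?ReturnUrl=%2f"

with requests.Session() as s:
    # Fetch the login page to collect the hidden ASP.NET fields.
    page = s.get(url)
    soup = BeautifulSoup(page.text, 'html.parser')
    payload = {tag.get('name'): tag.get('value', '')
               for tag in soup.find_all('input', type='hidden')
               if tag.get('name')}

    # Add the visible form fields on top of the hidden ones.
    payload['txtUser'] = 'username'
    payload['txtPass'] = 'password'

    x = s.post(url, data=payload)
    print(x.status_code)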
I am looking to open a connection with Python to http://www.horseandcountry.tv, which takes my login parameters via the POST method. I would like to open a connection to this website in order to scrape it for all video links (this I also don't know how to do yet, but I am using the project to learn).
My question is: how do I pass my credentials to the individual pages of the website? For example, if all I wanted to do was use Python code to open a browser window pointing to http://play.horseandcountry.tv/live/ and have it open with me already logged in, how would I go about this?
As far as I know you have two options, depending on how you want to crawl and what you need to crawl:
1) Use urllib. You can make your POST request with the necessary login credentials (a minimal sketch follows after this answer). This is the low-level solution, which means it is fast but doesn't handle high-level things like JavaScript.
2) Use selenium. With that you can simulate a browser (Chrome, Firefox, other...) and drive it from your Python code. It is much slower, but it works well with more "sophisticated" websites.
What I usually do: I try the first option, and if I encounter a problem like a JavaScript security layer on the website, I go for option 2. Moreover, Selenium can open a real web browser on your desktop and give you a visual of your scraping.
In any case, just google "urllib/selenium login to website" and you'll find what you need.
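A minimal sketch of option 1 with urllib from the standard library, using the form field names from the requests answer below (adjust them to whatever the page actually expects):
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Encode the form fields; passing data makes this a POST request.
data = urlencode({'account_email': 'your_email',
                  'account_password': 'your_password'}).encode()

req = Request('https://play.horseandcountry.tv/login/', data=data)
with urlopen(req) as response:
    print(response.status)
Note that plain urlopen does not keep cookies between calls; to stay logged in across requests you would add an http.cookiejar.CookieJar via urllib.request.build_opener.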
If you want to avoid using Selenium (opening web browsers), you can go with requests; it can log in to the website and grab anything you need in the background.
Here is how you can login to that website with requests.
import requests
from bs4 import BeautifulSoup

#Login Form Data
payload = {
    'account_email': 'your_email',
    'account_password': 'your_password',
    'submit': 'Sign In'
}

with requests.Session() as s:
    #Login to the website.
    response = s.post('https://play.horseandcountry.tv/login/', data=payload)
    #Check if logged in successfully
    soup = BeautifulSoup(response.text, 'lxml')
    logged_in = soup.find('p', attrs={'class': 'navbar-text pull-right'})
    print(s.cookies)
    print(response.status_code)
    if logged_in is not None and logged_in.text.startswith('Logged in as'):
        print('Logged In Successfully!')
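Once logged in, the same session can be reused (still inside the with block) to fetch the pages you actually want to scrape, for example the live page mentioned in the question:
    #Fetch a protected page with the logged-in session.
    video_page = s.get('http://play.horseandcountry.tv/live/')
    print(video_page.status_code)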
If you need explanations for this, you can check this answer or the requests documentation.
You could also use the requests module. It is one of the most popular. Here are some questions that relate to what you would like to do.
Log in to website using Python Requests module
logging in to website using requests
I am using the following script:
import requests
import json
import os

COOKIES = json.loads("")  # EditThisCookie export here (json) to send requests
COOKIEDICTIONARY = {}

# Rebuild the exported cookie list as a {name: value} dict that requests can use.
for i in COOKIES:
    COOKIEDICTIONARY[i['name']] = i['value']

def follow(id):
    post = requests.post("https://instagram.com/web/friendships/" + id + "/follow/",
                         cookies=COOKIEDICTIONARY)
    print(post.text)

follow('309438189')
os.system("pause")
This script is supposed to send a follow request to the user '3049438189' on Instagram. However, when the code is run, post.text outputs some HTML, including:
"This page could not be loaded. If you have cookies disabled in your
browser, or you are browsing in Private Mode, please try enabling
cookies or turning off Private Mode, and then retrying your action."
The loop is supposed to put the cookies into the COOKIEDICTIONARY variable in a format the requests module can read. If you print that dictionary (I don't know what it's called in Python), it shows all of the cookies and their values.
The cookies I put in are valid, and the requests syntax is (I believe) correct.
I have fixed it. The problem was that certain headers I needed were not present, such as Origin (I will get the full list soon). Anybody who wants to imitate an Instagram POST request needs those headers, or it will error.
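For reference, adding those headers looks roughly like this; the exact set Instagram expects changes over time, so treat these values as placeholders:
def follow(id):
    # Same request as before, plus the headers a browser would normally send.
    headers = {
        'Origin': 'https://www.instagram.com',
        'Referer': 'https://www.instagram.com/',
        'User-Agent': 'Mozilla/5.0',                            # placeholder user agent
        'X-CSRFToken': COOKIEDICTIONARY.get('csrftoken', ''),   # token usually comes from the cookies
    }
    post = requests.post("https://instagram.com/web/friendships/" + id + "/follow/",
                         cookies=COOKIEDICTIONARY, headers=headers)
    print(post.text)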
At work, I sit behind a proxy. When I connect to the company WiFi and open up a browser, a pop-up box usually appears asking for my company credentials before it will let me navigate to any internal/external site.
I am using the Python Requests package to automate pulling data from an external site, but I am encountering a 401 error related to not having authenticated first. This happens when I don't authenticate first using the browser; if I authenticate with the browser first and then use Python requests, everything is fine and I'm able to navigate to any site.
My question is: how do I perform the work authentication part using Python? I want to be able to automate this process so that I can set up a cron job that grabs data from an external source every night.
I've tried providing a blank URL:
import requests
response = requests.get('')
But requests.get() requires a properly structured URL. I want to emulate opening a browser and capture the pop-up that asks for authentication, which does not rely on any particular URL being used.
From the requests documentation:
import requests
proxies = {
    "http": "http://10.10.1.10:3128",
    "https": "http://10.10.1.10:1080",
}
requests.get("http://example.org", proxies=proxies)