I am trying to log in to a website using python and the requests module.
I am doing:
import requests

payload = {
    'login_Email': 'xxxxx@gmail.com',
    'login_Password': 'xxxxx'
}

with requests.Session() as s:
    p = s.post('https://www.auction4cars.com/', data=payload)
    print(p.text)
The problem is that the output of this just seems to be the login page, not the page AFTER logging in. I.e. the page says 'welcome guest' and 'please enter your username and password', etc.
I was expecting it to return the page saying something like 'thanks for logging in xxxxx' etc.
Can anyone suggest what I'm doing wrong?
EDIT:
I don't think my question is a duplicate of How to "log in" to a website using Python's Requests module? because I am using the script from the most popular answer (regarded on the thread as the answer that should be the accepted one).
I have also tried the accepted one, but my problem remains.
EDIT:
I'm confused about whether I need to do something with cookies. The URLs that I am trying to visit after logging in don't seem to contain cookie values.
EDIT:
This seems to be a similar problem to mine:
get restricted page after login using requests,urllib2 python
However, I don't see any other inputs there that I need to fill out, except perhaps one:
Do I need to do anything with the submit button?
I was helped with the problem here:
Replicate browser actions with a python script using Fiddler
Thank you for any input.
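For reference, the gist of the approach from that Fiddler question is to replay the exact POST the browser makes: post to the form's real action URL with every field the browser sends, then reuse the same Session for the pages behind the login. A rough sketch (the action URL, the extra hidden field and the protected page below are placeholders, not the site's real values):

import requests

login_url = 'https://www.auction4cars.com/login'   # placeholder -- take the real one from Fiddler
payload = {
    'login_Email': 'xxxxx@gmail.com',
    'login_Password': 'xxxxx',
    # plus any hidden fields the form sends, e.g. a CSRF token
}

with requests.Session() as s:
    p = s.post(login_url, data=payload)
    print(p.status_code)
    # further requests on the same Session reuse the login cookies
    r = s.get('https://www.auction4cars.com/members/home')   # placeholder URL
    print(r.text)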
Related
I'm logging in to a webpage using a requests Session in Python, where 2FA is enabled.
self.response = self.session.post(login_url,
                                  data={"email": self.email, "password": self.password},
                                  headers=self.session.headers)
By using the above code I'm able to log in to the webpage successfully, because I know the data it is asking for, i.e. email and password. After this it redirects me to the 2FA page, where it asks for the security questions (note: there are 10 security questions in total). At runtime it selects any 2 random questions.
As in the picture above, it asked me to enter the answers for "questionId": "6" and "questionId": "10".
So my problem is: how do I know which questionIds it is asking for, so that I can post the answer data accordingly?
You could use a library like BeautifulSoup to scrape the security questions asked each time and then pass the corresponding answers in the request.
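A rough sketch of that idea (the URLs and form field names below are assumptions about your site, not something I can know from here):

import requests
from bs4 import BeautifulSoup

session = requests.Session()
# ... log in first, exactly as in your existing code ...

# fetch the 2FA page the login redirects to (URL is a placeholder)
twofa_page = session.get('https://example.com/2fa')
soup = BeautifulSoup(twofa_page.text, 'html.parser')

# pull the questionId out of each hidden input on the form
# (the input name and the answer field names are guesses -- inspect the real form)
my_answers = {'6': 'answer to question 6', '10': 'answer to question 10'}
data = {}
for field in soup.find_all('input', attrs={'name': 'questionId'}):
    qid = field['value']
    data['answer_' + qid] = my_answers[qid]

session.post('https://example.com/2fa', data=data)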
I am a beginner so I apologize if my question is very obvious or not worded correctly.
I need to send a request to a URL so data can then be sent back in XML format. The URL will have a user-specific login and password, so I need to incorporate that as well. Also, there is a port (port 80) that I need to include in the request. Is requests.get the way to go? I'm not exactly sure where to start. After receiving the XML data, I need to process it (store it) on my machine - if anyone also wants to take a stab at that (I am also struggling to understand exactly how XML data is sent over - is it an entire file?). Thanks in advance for the help.
Here is the Python documentation on how to fetch internet resources using the urllib package.
It talks about getting the data, storing it in a file, sending data and some basic authentication.
https://docs.python.org/3/howto/urllib2.html
Getting the URL would look something like this:
import urllib.request
urllib.request.urlopen("http://yoururlhere.co.uk").read()
Note that this is Python 3 only (and .read() gives you the raw bytes, not a string).
Python 2 version can be found here
What is the quickest way to HTTP GET in Python?
If you want to parse the data you may want to use this
https://docs.python.org/2/library/xml.etree.elementtree.html
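For example, a tiny sketch of that (the XML string and tag names are made up for illustration):

import xml.etree.ElementTree as ET

xml_text = "<items><item id='1'>hello</item><item id='2'>world</item></items>"
root = ET.fromstring(xml_text)           # parse the XML from a string
for item in root.findall('item'):        # loop over the child <item> elements
    print(item.get('id'), item.text)

# or save the raw response to a file for later processing
with open('response.xml', 'w') as f:
    f.write(xml_text)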
I hope this helps! I am not too sure on how you would approach the username and password stuff but these links can hopefully provide you with information on how to do some of the other stuff!
Import the requests library and then call the post method as follows:
import requests

data = {
    "email": "netsparkertest@test.com",
    "password": "abcd12333",
}

r = requests.post('https://www.facebook.com', data=data)
print(r.text)
print(r.status_code)
print(r.content)
print(r.headers)
I am still fairly new to Python and have some questions regarding using requests to sign in. I have read for hours but can't seem to get an answer to the following questions. If I choose a site such as www.amazon.com, I can sign in and determine the sign-in link: https://www.amazon.com/gp/sign-in.html...
I can also find the sent form data, which includes items such as:
appActionToken:
appAction:SIGNIN
openid.pape.max_auth_age:ape:MA==
openid.return_to:
password: XXXX
email: XXXX
prevRID:
create:
metadata1: XXXX
My questions are as follows:
When finding form data, how do I know which items I must send back in a dictionary via a POST request? For the above, are email & password sufficient, and when browsing other sites, how do I know which ones are necessary?
The following code should work, but doesn't. What am I doing wrong?
The example includes a header category to determine the browser type. Another site, such as www.slashdot.org, does not need the header value to sign in. How do I know which sites require the header value and which ones don't?
Anyone who could provide input and help me sign in with requests would be doing me a great favor. I thank you very much.
import requests

session = requests.Session()
data = {'email': 'xxxxx', 'password': 'xxxxx'}
header = {'User-Agent': 'Mozilla/5.0'}
response = session.post('https://www.amazon.com/gp/sign-in.html', data, headers=header)
print(response.content)
When finding form data, how do I know which items I must send back in a dictionary via a POST request? For the above, are email & password sufficient, and when browsing other sites, how do I know which ones are necessary?
You generally need to either (a) read the documentation for the site you're using, if it's available, or (b) examine the HTML yourself (and possibly trace the http traffic) to see what parameters are necessary.
The following code should work, but doesn't. What am I doing wrong?
You didn't provide any details about how your code is not working.
The example includes a header category to determine the browser type. Another site, such as www.slashdot.org, does not need the header value to sign in. How do I know which sites require the header value and which ones don't?
The answer here is really the same as for the first question. Either you are using an API for which documentation exists that answers this question, or you're trying to automate a site that was designed primarily for human consumption via a web browser, which means you're going to have to figure out, through investigation, trial, and error, exactly what parameters you need to provide to make the remote server happy.
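As a sketch of the "examine the HTML yourself" route, you can pull every field the form already carries (including hidden ones such as appActionToken) and send them all back. The parsing below is illustrative only, not Amazon's actual current flow, which may well block scripted logins anyway:

import requests
from bs4 import BeautifulSoup
from urllib.parse import urljoin

session = requests.Session()
session.headers['User-Agent'] = 'Mozilla/5.0'

# load the page that contains the sign-in form
login_page = session.get('https://www.amazon.com/gp/sign-in.html')
soup = BeautifulSoup(login_page.text, 'html.parser')
form = soup.find('form')

# start from whatever the form already contains (hidden inputs, tokens, ...)
data = {tag['name']: tag.get('value', '')
        for tag in form.find_all('input') if tag.get('name')}
data['email'] = 'xxxxx'
data['password'] = 'xxxxx'

# post to the form's action URL (it may be relative, hence urljoin)
response = session.post(urljoin(login_page.url, form['action']), data=data)
print(response.status_code)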
I'm doing a small project to help my work go by faster.
I currently have a program written in Python 3.2 that does almost all of the manual labour for me, with one exception.
I need to log on to the company website (username and password) then choose a month and year and click download.
I would like to write a little program to do that for me, so that the whole process is completely done by the program.
I have looked into it and I can only find tools for 2.X.
I have looked into urllib and I know that some of the 2.X modules are now in urllib.request.
I have even found some code to start it off; however, I'm confused as to how to put it into practice.
Here is what I have found:
import urllib2
theurl = 'http://www.someserver.com/toplevelurl/somepage.htm'
username = 'johnny'
password = 'XXXXXX'
# a great password
passman = urllib2.HTTPPasswordMgrWithDefaultRealm()
# this creates a password manager
passman.add_password(None, theurl, username, password)
# because we have put None at the start it will always
# use this username/password combination for urls
# for which `theurl` is a super-url
authhandler = urllib2.HTTPBasicAuthHandler(passman)
# create the AuthHandler
opener = urllib2.build_opener(authhandler)
urllib2.install_opener(opener)
# All calls to urllib2.urlopen will now use our handler
# Make sure not to include the protocol in with the URL, or
# HTTPPasswordMgrWithDefaultRealm will be very confused.
# You must (of course) use it when fetching the page though.
pagehandle = urllib2.urlopen(theurl)
# authentication is now handled automatically for us
All Credit to Michael Foord and his page: Basic Authentication
So I changed the code around a bit and replaced all the 'urllib2' with 'urllib.request'
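Roughly, the converted version comes out like this (same placeholder URL and credentials as before):

import urllib.request

theurl = 'http://www.someserver.com/toplevelurl/somepage.htm'
username = 'johnny'
password = 'XXXXXX'

passman = urllib.request.HTTPPasswordMgrWithDefaultRealm()
passman.add_password(None, theurl, username, password)
authhandler = urllib.request.HTTPBasicAuthHandler(passman)
opener = urllib.request.build_opener(authhandler)
urllib.request.install_opener(opener)

# every urlopen call now sends the credentials when the server asks for them
pagehandle = urllib.request.urlopen(theurl)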
Then I learned how to open a webpage, figuring the program should open the webpage, use the login and password data to open the page, then I'll learn how to download the files from it.
import webbrowser

ie = webbrowser.get('c:\\program files\\internet explorer\\iexplore.exe')
ie.open(theurl)
(I know Explorer is garbage, just using it to test, then I'll be using Chrome ;) )
But that doesn't open the page with the login data entered; it simply opens the page as though you had typed in the URL.
How do I get it to open the page with the password handle?
I sort of understand how Michael made them, but I'm not sure which to use to actually open the website.
Also, an afterthought: might I need to look into cookies?
Thanks for your time
You're getting things confused here.
webbrowser is a wrapper around your actual web browser, and urllib is a library for HTTP- and URL-related stuff.
They don't know each other, and serve very different purposes.
In older IE versions, you could encode an HTTP Basic Auth username and password in the URL like so:
http(s)://Username:Password@Server/Resource.ext - I believe Firefox and Chrome still support that; IE killed it: http://support.microsoft.com/kb/834489/EN-US
If you want to emulate a browser, rather than just open a real one, take a look at mechanize: http://wwwsearch.sourceforge.net/mechanize/
Your browser doesn't know anything about the authentication you've done in Python (and that has nothing to do with whether your browser is garbage or not). The webbrowser module simply offers convenience methods for launching a browser and pointing it at a URL; you can't 'transfer' your credentials to the browser.
As for migrating from Python 2 to Python 3: the 2to3 tool can convert simple scripts like yours automatically.
They are not running in the same environment.
You need to figure out what really happens when you click the download button. Use your browser's developer tools to get the POST format the website is using, then build a request in Python to fetch the file.
Requests is a nice lib that makes that kind of thing much easier.
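A bare-bones sketch of that with requests - every URL and field name here is a placeholder for whatever the developer tools actually show:

import requests

with requests.Session() as s:
    # 1. log in -- copy the real login URL and form field names from the dev tools
    s.post('https://example.com/login',
           data={'username': 'johnny', 'password': 'XXXXXX'})

    # 2. replay the download request with the month/year fields the site expects
    r = s.post('https://example.com/reports/download',
               data={'month': '01', 'year': '2013'})

    # 3. save the returned file to disk
    with open('report.xls', 'wb') as f:
        f.write(r.content)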
I would use Selenium. This is some code from a little script I have hacked about a bit to give you an idea:
from selenium import webdriver

def get_name():
    user = 'johnny'
    passwd = 'XXXXXX'
    try:
        driver = webdriver.Remote(desired_capabilities=webdriver.DesiredCapabilities.HTMLUNIT)
        driver.get('http://www.someserver.com/toplevelurl/somepage.htm')
        assert 'Page Title' in driver.title
        username = driver.find_element_by_name('name_of_userid_box')
        username.send_keys(user)
        password = driver.find_element_by_name('name_of_password_box')
        password.send_keys(passwd)
        submit = driver.find_element_by_name('name_of_login_button')
        submit.click()
        driver.get('http://www.someserver.com/toplevelurl/page_with_download_button.htm')
        assert 'page_with_download_button title' in driver.title
        download = driver.find_element_by_name('download_button')
        download.click()
    except:
        print('process failed')
I'm new to Python so that may not be the best code ever written, but it should give you the general idea.
Hope it helps
I've recently written this with help from SO. Now could someone please tell me how to make it actually log in to the board? It brings up everything, just in a non-logged-in format.
import urllib
import urllib2

logindata = urllib.urlencode({'username': 'x', 'password': 'y'})
page = urllib2.urlopen("http://www.woarl.com/board/index.php", logindata)
pagesource = page.read()
print(pagesource)
Someone recently asked the same question you're asking. If you read through the answers to that question you'll see code examples showing you how to stay logged in while browsing a site in a Python script using only stuff in the standard library.
The accepted answer might not be as useful to you as this other answer, since the accepted answer deals with a specific problem involving redirection. However, I recommend reading through all of the answers regardless.
You probably want to look into preserving cookies from the server.
Pycurl or Mechanize will make this much easier for you.
If you actually look at the page, you'll see that the login link takes you to http://www.woarl.com/board/ucp.php?mode=login
That page has the login form and submits to http://www.woarl.com/board/ucp.php?mode=login again with POST.
You'll then have to extract the cookies that are probably set, and put those in a CookieJar or similar.
You probably want to create an opener with these handlers and apply it to urllib2.
With these applied, your cookies are handled and you'll be redirected if the server decides it wants you somewhere else.
import urllib2

# Create handlers
cookieHandler = urllib2.HTTPCookieProcessor()        # needed for cookie handling
redirectionHandler = urllib2.HTTPRedirectHandler()   # needed for redirection (not needed for a JavaScript redirect?)

# Create an opener that uses both handlers
opener = urllib2.build_opener(cookieHandler, redirectionHandler)

# Install the opener so plain urllib2.urlopen() calls go through it
urllib2.install_opener(opener)
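From there, posting the login form through the installed opener should get the board's session cookie stored for you. A rough sketch (phpBB login forms usually also carry hidden fields such as sid, so you may need to scrape and resend those too):

import urllib
import urllib2

# field names are guesses based on a typical phpBB login form
logindata = urllib.urlencode({'username': 'x', 'password': 'y',
                              'login': 'Login'})  # plus any hidden fields from the form
page = urllib2.urlopen("http://www.woarl.com/board/ucp.php?mode=login", logindata)
print(page.read())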