I want to fetch the page http://192.168.1.1/basic/home_dhcplist.htm from the router, but it asks for a username and password first.
I am fetching the page in Python through urllib2:
import urllib2

response = urllib2.urlopen('http://192.168.1.1/basic/home_dhcplist.htm')
html = response.read()
response.close()

needle = "Prasads"  # renamed from `str`, which shadowed the builtin
value = html.find(needle)
print value
if value != -1:
    print "found"
else:
    print "not found"
Every home router I have seen uses basic auth for authentication. This is simply another header that you send along with the request: each time you request a page, the username and password are sent as headers to the server, which verifies them on every request.
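For reference, here is a minimal sketch of sending that header yourself with urllib2, assuming the router uses basic auth (the admin/admin credentials are placeholders):

import base64
import urllib2

username, password = 'admin', 'admin'  # placeholder credentials
request = urllib2.Request('http://192.168.1.1/basic/home_dhcplist.htm')
# Basic auth is just "Authorization: Basic base64(username:password)"
token = base64.b64encode('%s:%s' % (username, password))
request.add_header('Authorization', 'Basic %s' % token)
response = urllib2.urlopen(request)
print response.read()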
I would suggest the requests library over urllib2.
import requests

r = requests.get('http://192.168.1.1/basic/home_dhcplist.htm',
                 auth=('username', 'password'))
if 'Prasads' in r.text:  # .text is a property, not a method
    print "found"
else:
    print "not found"
Most probably, you need to set the cookie that maintains the session.
Access the page via a browser (Firefox) and enter the username and password when prompted to do so.
Press Ctrl-Shift-K, reload the page, and click on the most recent GET request; you'll get a window showing the request details. Note the request headers and set the cookie accordingly.
The most useful key-value pair is usually Authorization.
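If you want to copy those headers verbatim rather than use requests' auth parameter, a sketch would look like this (the header values below are placeholders, not real credentials):

import requests

headers = {
    'Authorization': 'Basic YWRtaW46YWRtaW4=',  # value copied from the browser
    'Cookie': 'session_id=abc123',              # placeholder session cookie
}
r = requests.get('http://192.168.1.1/basic/home_dhcplist.htm', headers=headers)
print(r.status_code)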
I am trying to get some data from an XML page of a website that requires logging in first. I create a python requests Session and I am able to log in using a GET request (it returns 200). After that I try to access the XML, but I get a 401.
So far I know that the server checks on each call whether the client is sending a JSESSIONID cookie and whether its value matches the current session.
I managed to get the corresponding cookies and tried to send them via a POST request, but I still get a 401.
Maybe I am overcomplicating this. I do not necessarily need to achieve this with requests; I would be glad to just get the information from the XML.
import requests

login_url = 'https://www.example.com/login'
USER_NAME = 'user'
PASSWORD = 'pass'
xml = 'https://www.example.com/channel_index?123&p=web'

with requests.Session() as s:
    response = s.get(login_url, auth=(USER_NAME, PASSWORD))
    print(response)
    r = s.get(xml)
    cookies = s.cookies.get_dict()
    r = s.post(xml, cookies=cookies)
    print(r)
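For what it's worth, requests.Session already persists cookies (including JSESSIONID) between calls, so there should be no need to extract and resend them manually. A minimal sketch, reusing the example endpoints above and assuming the login endpoint accepts basic auth:

import requests

login_url = 'https://www.example.com/login'
xml = 'https://www.example.com/channel_index?123&p=web'

with requests.Session() as s:
    s.get(login_url, auth=('user', 'pass'))  # JSESSIONID stored on the session
    print(s.cookies.get('JSESSIONID'))       # inspect the cookie, if it was set
    r = s.get(xml)                           # stored cookies are sent automatically
    print(r.status_code)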
I am trying to log in to the website www.seek.com.au, to test the possibility of a remote login using Python's requests module. The site's front end is built with React, so I don't see any form action attribute in www.seek.com.au/sign-in.
When I run the code below, the response code is 200, indicating success, but I doubt it actually succeeded. The main concern is which URL to use when there is no action attribute in the login form.
import requests

payload = {'email': <username>, 'password': <password>}
url = 'https://www.seek.com.au'

with requests.Session() as s:
    response_op = s.post(url, data=payload)
    # print the response status code
    print(response_op.status_code)
    print(response_op.text)
When I examine the output (response_op.text), I see the words 'Sign in' and 'Register', which indicate that the login failed; if it were successful, the user's first name would be shown there instead. What am I doing wrong here?
P.S.: I am not trying to scrape data from this website; I am trying to log in to a similar website.
Try this code:
import requests

payload = {"email": "test@test.com", "password": "passwordtest", "rememberMe": True}
url = "https://www.seek.com.au:443/userapi/login"

with requests.Session() as s:
    response_op = s.post(url, json=payload)
    # print the response status code
    print(response_op.status_code)
    print(response_op.text)
You are sending the request to the wrong URL.
Hope this helps.
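To double-check that the login actually worked, you can fetch a page that requires authentication (the profile URL below is a hypothetical example) and look for the markers mentioned in the question:

# still inside the `with requests.Session() as s:` block from above
profile = s.get('https://www.seek.com.au/profile/')  # hypothetical logged-in page
if 'Sign in' not in profile.text:
    print('login looks successful')
else:
    print('login failed')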
I'm trying to log in to a WordPress-based website using Python's requests module and beautifulsoup4. It seems like the code fails to successfully log in. Also, there is no CSRF token on the website. How do I successfully log in to the website?
import requests
import bs4 as bs

with requests.Session() as c:
    link = "https://gpldl.com/sign-in/"  # link of the webpage to be logged in
    initial = c.get(link)  # passing the GET request
    login_data = {"log": "*****", "pwd": "******"}  # login data from any account on the site; stars must be replaced with username and password
    page_login = c.post(link, data=login_data)  # posting the login data to the link
    print(page_login)  # checking status of requested page
    page = c.get("https://gpldl.com/my-gpldl-account/")  # requesting source code of the logged-in page
    good_data = bs.BeautifulSoup(page.content, "lxml")  # parsing it with BS4
    print(good_data.title)  # printing this gives the title got from the page when accessed from a logged-out account
You are sending your POST request to the wrong URL; the correct one should be https://gpldl.com/wp-login.php. Also, there are five parameters for the payload: log, pwd, rememberme, redirect_to and redirect_to_automatic.
So it should be:
login_data = {"log": "*****",
              "pwd": "******",
              "rememberme": "forever",
              "redirect_to": "https://gpldl.com/my-gpldl-account/",
              "redirect_to_automatic": "1"}

page_login = c.post('https://gpldl.com/wp-login.php', data=login_data)
Edit:
You could use the Chrome dev tools to find all of this info while logging in: open the Network tab, log in through the browser, and inspect the POST request it sends.
As for the rememberme key, I would suggest doing exactly the same thing a browser does. Also add some headers to your request, especially User-Agent, because some websites just don't welcome logins made this way.
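A sketch of adding such headers, reusing the session c and login_data from the snippet above (the User-Agent string is just an example; copy the one your own browser actually sends):

headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:115.0) Gecko/20100101 Firefox/115.0',
    'Referer': 'https://gpldl.com/sign-in/',  # assumed referer, mimicking a browser
}
page_login = c.post('https://gpldl.com/wp-login.php', data=login_data, headers=headers)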
I am using the Requests library with Python 2.7.
I am trying to download certain webpages through proxy servers. I have a list of available proxy servers, but not all of them work as desired: some require authentication, others redirect to advertisement pages, etc. In order to detect/verify incorrect responses, I have included two checks in my request code. It looks similar to this:
import requests

proxy = '37.228.111.137:80'
url = 'http://www.google.ca/'
response = requests.get(url, proxies={'http': 'http://%s' % proxy})
if response.url != url or response.status_code != 200:
    print 'incorrect response'
else:
    print 'response correct'
    print response.text
With some proxy servers the requests.get call succeeds and passes both of these conditions, yet response.text still contains invalid HTML. However, if I use the same proxy in my Firefox browser and open the same webpage, Firefox displays an invalid page while my Python script says the response is valid.
Can someone point out what other checks I am missing to weed out incorrect HTML results?
Or: how can I verify that the webpage I received is the one I intended to receive?
Regards.
What is an "invalid webpage" when displayed by your browser? The server can return an HTTP status code of 200 while the content is an error message; you understand it to be an error message because you can comprehend it, but a browser or your code cannot.
If you have any knowledge about the content of the target page, you could check whether the returned HTML contains that content and accept it on that basis.
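A minimal sketch of such a content check, assuming you know a marker string that the genuine page always contains ('Google' here is only an example):

import requests

proxy = '37.228.111.137:80'
url = 'http://www.google.ca/'
expected = 'Google'  # a string you know appears on the real page

response = requests.get(url, proxies={'http': 'http://%s' % proxy})
if response.status_code == 200 and expected in response.text:
    print('response looks correct')
else:
    print('incorrect response')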
I have a service that exposes an API for logging in. In my browser, if I open
<host>:<port>/myservice/api/login?u=admin&pw=admin
the URL returns a ticket that I can pass along with my successive requests.
More details about the same here.
Below is my Python script:
import urllib
url = 'http://<host>:<port>/myservice/api/login?u=admin&pw=admin'
print 'Retrieving', url
uh = urllib.urlopen(url)
data = uh.read()
print 'Retrieved', len(data), 'characters'
print data
When I run this I get:
IOError: ('http error', 401, 'Authorization Required', <httplib.HTTPMessage instance at 0x<somenumber>>)
Now I am not sure what I am supposed to do, so I went to my browser and opened the developer console. Apparently the URL redirects somewhere else; I see two requests.
The first one hits the URL that I am hitting; its response header has a Location parameter.
The second request hits the URL returned in that Location parameter; its Authorization header has 'Negotiate'.
It also has a Set-Cookie in the response header.
Now I am not sure what exactly to do with this information, but if someone can help, thanks.
I believe your problem is having the wrong URL for the login service.
If I change your code to instead be:
import urllib, json
url = 'http://localhost:8080/alfresco/service/api/login?u=admin&pw=admin&format=json'
print "Retrieving %s" % url
uh = urllib.urlopen(url)
data = uh.read()
print "Retrieved %d characters" % len(data)
print "Data is %s" % data
ticket = json.loads(data)["data"]["ticket"]
print "Ticket is %s" % ticket
then, against a freshly installed Alfresco 4.2 server, I get back a login ticket for the admin user.
Note the use of the JSON format of the login API, which is much easier to parse, and of the correct path to the login API: /alfresco/service/api/login.
Try these two small changes; maybe they will help:
1) Use urllib.urlencode when passing parameters to the request URL:
import urllib

params = urllib.urlencode({'u': 'admin', 'pw': 'admin'})
uh = urllib.urlopen("http://<host>:<port>/myservice/api/login?%s" % params)
2) Simulate a web browser while making the request using urllib2:
import urllib2
req = urllib2.Request('http://<host>:<port>/myservice/api/login?u=admin&pw=admin', headers={ 'User-Agent': 'Mozilla/5.0' })
uh = urllib2.urlopen(req)
401 is an Unauthorized error: it means you are not authorized to access the API. Did you already sign up for API keys and access tokens?
Check the detailed description of the 401 error:
http://techproblems.org/http-error-401/
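If the service does expect a key or token, it is usually passed as a header. A generic, purely hypothetical sketch (the endpoint and the bearer-token scheme are assumptions, not taken from the service above):

import requests

headers = {'Authorization': 'Bearer YOUR_ACCESS_TOKEN'}  # placeholder token
r = requests.get('http://<host>:<port>/myservice/api/data', headers=headers)
print(r.status_code)  # 200 once the credentials are accepted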