Log in to a site and navigate to other pages - Python

I have a Python 2 script that logs in to a web page and then navigates the same site to reach a couple of files linked from different pages. Python 2 let me open the site with my credentials and then reuse opener.open() to keep the session available while navigating to the other pages.
Here's the code that worked in Python 2:
import urllib
import urllib2
import cookielib

# Your admin login and password
LOGIN = "*******"
PASSWORD = "********"
ROOT = "https:*********"
# The client has to take care of the cookies.
jar = cookielib.CookieJar()
opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(jar))
# POST the login query to '/login_handler' (POST data: 'login' and 'password').
req = urllib2.Request(ROOT + "/login_handler",
                      urllib.urlencode({'login': LOGIN,
                                        'password': PASSWORD}))
opener.open(req)
# Set the right accountcode (QUEUES is defined elsewhere in the script).
for accountcode, queues in QUEUES.items():
    req = urllib2.Request(ROOT + "/switch_to" + accountcode)
    opener.open(req)
I need to do the same thing in Python 3. I have tried with the requests module and with urllib, and although I can perform the initial login, I don't know how to keep the opener alive so I can navigate to the other pages. I found OpenerDirector, but I haven't managed to use it to reach my goal.
I have used this Python 3 code to get close to the desired result, but unfortunately I can't fetch the CSV file so I can print it.

Question: I don't know how to keep the opener to navigate the site.
Python 3.6 » Documentation: urllib.request.build_opener
Use of Basic HTTP Authentication:
import urllib.request

# Create an OpenerDirector with support for Basic HTTP Authentication...
auth_handler = urllib.request.HTTPBasicAuthHandler()
auth_handler.add_password(realm='PDQ Application',
                          uri='https://mahler:8092/site-updates.py',
                          user='klem',
                          passwd='kadidd!ehopper')
opener = urllib.request.build_opener(auth_handler)
# ...and install it globally so it can be used with urlopen.
urllib.request.install_opener(opener)
f = urllib.request.urlopen('http://www.example.com/login.html')
csv_content = f.read()
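That covers Basic HTTP Authentication. For a cookie-based form login like the original Python 2 script, the direct Python 3 translation keeps one opener built around a cookie jar and reuses it for every request. A minimal sketch, assuming the same LOGIN, PASSWORD, ROOT, and QUEUES values as in the question:
import urllib.parse
import urllib.request
from http.cookiejar import CookieJar

# The cookie jar keeps the session cookie between requests,
# just as cookielib.CookieJar did in Python 2.
jar = CookieJar()
opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))

# POST the login form; urlencode() now lives in urllib.parse
# and the request body must be bytes in Python 3.
data = urllib.parse.urlencode({'login': LOGIN, 'password': PASSWORD}).encode()
opener.open(ROOT + "/login_handler", data)

# Reuse the same opener so the session cookie is sent automatically.
for accountcode, queues in QUEUES.items():
    opener.open(ROOT + "/switch_to" + accountcode)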

For Python 3, use the requests library with a Session.
http://docs.python-requests.org/en/master/user/advanced/#session-objects
Once you log in, your session is managed automatically; you don't need to create your own cookie jar. The following is sample code.
import requests

s = requests.Session()
auth = {"login": LOGIN, "password": PASSWORD}
url = ROOT + "/login_handler"
r = s.post(url, data=auth)
print(r.status_code)
for accountcode, queues in QUEUES.items():
    req = s.get(ROOT + "/switch_to" + accountcode)
    print(req.text)  # response text
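If the goal is to fetch and print a CSV from one of those pages, the same session works; the /export.csv path below is hypothetical, so substitute the real URL of your file:
# Hypothetical endpoint; replace with the real CSV path on your site.
r = s.get(ROOT + "/export.csv")
r.raise_for_status()  # fail loudly if the session was not authenticated
print(r.text)         # print the CSV, or save it:
with open("export.csv", "wb") as f:
    f.write(r.content)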

Related

Log Into Website, Download File

I'm trying to use a Python script to log into my school's website and then download the homework assignment PDFs that are uploaded once a week. I've successfully downloaded PDFs from normal, unprotected websites, but I'm having trouble understanding the mechanics of cookies. I've done a bunch of googling, but the only code I've found is the following.
import urllib, urllib2, cookielib

testfile = urllib.URLopener()
username = 'example@gmail.com'
password = '*****'
cj = cookielib.CookieJar()
opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(cj))
login_data = urllib.urlencode({'username': username, 'j_password': password})
opener.open('http-this.pdf', login_data)
testfile.retrieve("http-path-to-file")
Basically, I've tried putting in all the appropriate information, but it didn't work, and I have no idea how to manipulate the code to make it do what I want. How can I use python to log into the website and then download a pdf?
Edit
Okay, here's the new code I've got that sort of works, but it outputs a copy of the site's HTML with a .pdf extension instead of the file I'm actually trying to download. What's going wrong?
import requests

s = requests.Session()
data = {"login": "MYLOG", "password": "*****"}
url = "https://website.php"
url2 = "https://path-to-pdf.pdf"
r2 = s.post(url, data=data)
r = s.get(url2)  # the session cookie from the login POST is reused here
with open("204_HW.pdf", "wb") as code:
    code.write(r.content)
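A likely cause of getting HTML with a .pdf extension is that the login POST never actually succeeded (wrong field names, wrong action URL, or a missing CSRF token), so the server returns the login page instead of the file. A quick check, as a sketch against the code above:
r2 = s.post(url, data=data)
print(r2.status_code, r2.url)  # being redirected back to the login page suggests the POST failed
r = s.get(url2)
print(r.headers.get("Content-Type"))  # expect application/pdf, not text/html
if "pdf" in r.headers.get("Content-Type", ""):
    with open("204_HW.pdf", "wb") as f:
        f.write(r.content)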

Download a file from https with authentication

I have a Python 2.6 script that downloads a file from a web server. I want this script to pass a username and password (for authentication before fetching the file), and I am passing them as part of the URL as follows:
import urllib2
response = urllib2.urlopen("http://'user1':'password'@server_name/file")
However, I am getting a syntax error in this case. Is this the correct way to go about it? I am pretty new to Python and to coding in general.
Can anybody help me out?
Thanks!
If you can use the requests library, it's insanely easy. I'd highly recommend using it if possible:
import requests
url = 'http://somewebsite.org'
user, password = 'bob', 'I love cats'
resp = requests.get(url, auth=(user, password))
I suppose you are trying to get past Basic Authentication. In this case, you can handle it this way:
import urllib2

username = 'user1'
password = '123456'
# This should be the base URL you want to access.
baseurl = 'http://server_name.com'
# Create a password manager.
manager = urllib2.HTTPPasswordMgrWithDefaultRealm()
manager.add_password(None, baseurl, username, password)
# Create an authentication handler using the password manager.
auth = urllib2.HTTPBasicAuthHandler(manager)
# Create an opener that will replace the default urlopen method on further calls.
opener = urllib2.build_opener(auth)
urllib2.install_opener(opener)
# Here you should access the full URL you wanted to open.
response = urllib2.urlopen(baseurl + "/file")
Use the requests library and put the credentials in your .netrc file.
The library will load them from there, and you will be able to commit the code to your SCM of choice without any security worries.
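A minimal sketch of that setup, assuming the host from the previous answer (server_name.com); the file lives at ~/.netrc and should be readable only by you (chmod 600):
machine server_name.com
login user1
password 123456
With that in place, requests picks the credentials up automatically when no auth argument is given:
import requests

# With no explicit auth, requests falls back to ~/.netrc for matching hosts.
resp = requests.get('http://server_name.com/file')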

Log in to website and download file with python requests

I have a website with an HTML form. After logging in, it takes me to a start.php page and then redirects me to overview.php.
I want to download files from that server. When I click on the download link for a ZIP file, the address behind the link is:
getimage.php?path="vol/img"&id="4312432"
How can I do that with requests? I tried to create a session and issue the GET with the right params, but the answer is just the page I would see when I'm not logged in.
c = requests.Session()
c.auth = ('myusername', 'myPass')
request1 = c.get(myUrlToStart.PHP)
tex = request1.text
with open('data.zip', 'wb') as handle:
    request2 = c.get(urlToGetImage.Php, params=payload2, stream=True)
    print(request2.headers)
    for block in request2.iter_content(1024):
        if not block:
            break
        handle.write(block)
What you're doing is a request with Basic Authentication, which does not fill out the form that is displayed on the page.
If you know the URL that the form sends its POST request to, you can try sending the form data directly to that URL.
Those who are looking for the same thing could try this...
import requests

site_url = 'site_url_here'
userid = 'userid'
password = 'password'
# The download link is relative; requests needs an absolute URL.
file_url = site_url + '/getimage.php?path="vol/img"&id="4312432"'
o_file = 'abc.zip'
# Create a session.
s = requests.Session()
# GET request. This will generate a cookie for you.
s.get(site_url)
# Log in to the site.
s.post(site_url, data={'_username': userid, '_password': password})
# Next, visit the URL for the file you would like to download.
r = s.get(file_url)
# Download the file.
with open(o_file, 'wb') as output:
    output.write(r.content)
print(f"requests:: File {o_file} downloaded successfully!")
# Close the session once all work is done.
s.close()

how to open facebook/gmail/authentication sites using python urllib2.urlopen?

I am trying to write a small web-based proxy using Python. I can fetch and show normal websites, but I cannot log in to facebook/gmail/anything that requires a login.
I have seen some examples of authentication here:
http://docs.python.org/release/2.5.2/lib/urllib2-examples.html
but I don't know how to make a general solution for all websites with a login. Any ideas?
My code is:
def showurl():
    url = request.vars.url
    response = urllib2.urlopen(url)
    html = response.read()
    return html
Your proxy server needs to store cookies; search Stack Overflow for cookielib.
Many websites authenticate clients in different ways, so your job is to mimic the client as closely as possible with your proxy server. Some websites authenticate by browser type, some by creating cookies and storing a sessionId in them, or by JavaScript-driven hidden content that performs extra authentication steps.
As far as my small experience goes, all the important stuff ends up in cookies.
Here is a flat example of how to use cookielib.
import urllib, urllib2, cookielib, getpass
username = ''
button = 'submit'
www_login = 'http://website.com'
cj = cookielib.CookieJar()
opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(cj))
opener.addheaders.append(('User-agent', 'Mozilla/4.0'))
opener.addheaders.append( ('Referer', '/dev/null') )
login_data = urllib.urlencode({'username' : username, 'password': getpass.getpass("Password:"), 'login' : button})
resp = opener.open(www_login, login_data)
print resp.read()
EDITED:
Don't confuse "Basic HTTP Authentication" with the authentication used by facebook/gmail, because they are different things. "Basic HTTP Authentication" or "Digest HTTP Authentication" is done by the web server, not by the web site you want to log in to.
http://www.voidspace.org.uk/python/articles/authentication.shtml#id24
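One quick way to tell which kind you are facing (a sketch in the same Python 2 style as the code above, with a hypothetical URL): server-level Basic/Digest auth announces itself with a 401 response and a WWW-Authenticate header, while a form login just returns a normal HTML page.
import urllib2

try:
    urllib2.urlopen('http://website.com/protected')  # hypothetical protected URL
except urllib2.HTTPError as e:
    # Server-level (Basic/Digest) auth shows up as a 401 with this header.
    print e.code, e.info().getheader('WWW-Authenticate')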

Form-based authentication with Python

I'm trying to use code from Kent's Korner for form-based authentication. At least I'm told the web site I'm trying to read uses form-based authentication.
But I don't seem to be able to get past the login page. The code I'm using is:
import urllib, urllib2, cookielib
# configure an opener that will handle cookies
opener = urllib2.build_opener(urllib2.HTTPCookieProcessor())
urllib2.install_opener(opener)
# use the opener to POST to the login form and the protected page
params = urllib.urlencode(dict(username='user', password='stuff'))
f = opener.open('http://www.hammernutrition.com/forums/memberlist.php?mode=viewprofile&u=1323', params)
data = f.read()
f.close()
f = opener.open('http://www.hammernutrition.com/forums/memberlist.php?mode=viewprofile&u=1323')
data = f.read()
f.close()
You can simulate a web browser in Python without using too many resources with mechanize
(the Debian/Ubuntu package is called python-mechanize). It handles both cookies and submitting forms, just the way a web browser would. One great example is the Python Dropbox Uploader script, which you can adapt to your needs.
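A minimal sketch of that flow with mechanize against the forum from the question; the login URL, form index, and field names are assumptions, so inspect the real login page to confirm them:
import mechanize

br = mechanize.Browser()
br.set_handle_robots(False)  # mechanize obeys robots.txt by default; many forums disallow bots
br.open('http://www.hammernutrition.com/forums/ucp.php?mode=login')  # assumed login URL
br.select_form(nr=0)  # assume the login form is the first form on the page
br['username'] = 'user'  # field names are guesses; check the form's HTML
br['password'] = 'stuff'
br.submit()
data = br.open('http://www.hammernutrition.com/forums/memberlist.php?mode=viewprofile&u=1323').read()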
