import urllib2
import urllib
url = "http://www.torn.com/authenticate.php"
username = raw_input("Your username: ")
password = raw_input("Your password: ")
query = {'player':username, 'password':password}
data_query = urllib.urlencode(query)
sending_data = urllib2.Request(url, data_query)
print data_query  # show the encoded form data that will be POSTed
response = urllib2.urlopen(sending_data)
print "You are currently logged unto to:", response.geturl()
print response.read()
How do I implement cookielib to create a session? Please explain line by line. Thank you.
import urllib
import urllib2
from cookielib import CookieJar
cj = CookieJar()
login_data = urllib.urlencode({'j_username' : username, 'j_password' : password})
opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(cj))
opener.open("http://www.torn.com/authenticate.php", login_data)
First, you initialize your CookieJar. Then you encode your credentials, since they need to be sent in a specific form for the PHP server to read them. Next you build an opener, which is essentially an HTTP client, and configure it to use your CookieJar. Finally, you submit your login data to the server, which creates the session and generates the cookies. To keep using those cookies, call opener.open() instead of urllib2.urlopen() (you can still use urllib2.Request() to build the requests).
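For example, once that login call has gone through, any request made with the same opener carries the session cookies along. A small sketch, where the follow-up URL is a hypothetical logged-in page rather than one taken from the question:
# Reuse the opener that now holds the session cookies (hypothetical follow-up page)
next_page = urllib2.Request("http://www.torn.com/index.php")
response = opener.open(next_page)
print response.geturl()
print response.read()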
You'll want to use an OpenerDirector for that.
From the documentation:
import cookielib, urllib2
cj = cookielib.CookieJar()
opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(cj))
r = opener.open("http://example.com/")
So in order to also pass data your code should look something like this:
import urllib2
import urllib
import cookielib
url = "http://www.torn.com/authenticate.php"
username = raw_input("Your username: ")
password = raw_input("Your password: ")
query = {'player':username, 'password':password}
data_query = urllib.urlencode(query)
cj = cookielib.CookieJar()
opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(cj))
response = opener.open(url, data_query)
print "You are currently logged unto to:", response.geturl()
print response.read()
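If you want to check that the login actually produced a session cookie, the CookieJar can be iterated directly. This is only a sanity check, since the cookie names depend entirely on the site:
# List whatever cookies the server set during login
for cookie in cj:
    print cookie.name, cookie.value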
Related
I wrote a script to cancel old builds running on TeamCity. I'm able to list all running builds as per my requirement (buildTypeId, branchName). I need to send a REST API call to cancel the old build, but this request throws a 401 error. I guess I'm misusing the POST method with urllib2. Can anyone help me with this, please?
import urllib2
import re
from xml.dom import minidom

url = 'https://<TeamCity>/httpAuth/app/rest/builds?locator=running:true'
username = user_name
password = password

p = urllib2.HTTPPasswordMgrWithDefaultRealm()
p.add_password(None, url, username, password)
handler = urllib2.HTTPBasicAuthHandler(p)
opener = urllib2.build_opener(handler)
urllib2.install_opener(opener)

dom = minidom.parse(urllib2.urlopen(url))
build = dom.getElementsByTagName('build')

for i in range(0, len(build)):
    for j in range(i+1, len(build)):
        find_build_type = re.search(r'<Job_1>.*', build[i].getAttribute("buildTypeId"), re.M | re.I)
        if build[i].getAttribute("branchName") != '<default>' and find_build_type:
            if (build[i].getAttribute("buildTypeId") == build[j].getAttribute("buildTypeId") and build[i].getAttribute("branchName") == build[j].getAttribute("branchName")):
                # Get the old build
                old_run = "https://<TeamCity>/app/rest/buildQueue/id:" + str(min(int(build[j].getAttribute("id")), int(build[i].getAttribute("id"))))
                # headers
                headers = {"Content-Type": "application/xml"}
                # data: the request body was undefined in the original snippet;
                # TeamCity's REST API expects a buildCancelRequest element here
                data = '<buildCancelRequest comment="Cancelling superseded build" readdIntoQueue="false"/>'
                passman = urllib2.HTTPPasswordMgrWithDefaultRealm()
                passman.add_password(None, url, username, password)
                opener = urllib2.build_opener(urllib2.HTTPBasicAuthHandler(passman))
                urllib2.install_opener(opener)
                r = urllib2.Request(old_run, data, headers)
                # send API request to cancel the old build
                u = urllib2.urlopen(r)
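A plausible cause of the 401 is that the cancel request is not authenticated the same way as the listing request: old_run drops the /httpAuth/ prefix and does not match the URL registered with the password manager, and urllib2 only retries with credentials after a proper challenge. As an alternative, here is a hedged sketch using the requests library, which sends basic auth on the first request; the endpoint and the buildCancelRequest body follow the TeamCity REST API and are assumptions rather than values from the post:
import requests

# Hedged sketch: cancel a running build through the TeamCity REST API using requests,
# which sends HTTP basic auth up front instead of waiting for a 401 challenge.
cancel_url = "https://<TeamCity>/httpAuth/app/rest/builds/id:12345"  # hypothetical build id
body = '<buildCancelRequest comment="Superseded by a newer build" readdIntoQueue="false"/>'
resp = requests.post(cancel_url,
                     data=body,
                     headers={"Content-Type": "application/xml"},
                     auth=(username, password))
print resp.status_code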
I am new to Python and web scraping, and I am trying to write a very basic script that will get data from a webpage that can only be accessed after logging in. I have looked at a bunch of different examples, but none of them have fixed the issue. This is what I have so far:
from bs4 import BeautifulSoup
import urllib, urllib2, cookielib
username = 'name'
password = 'pass'
cj = cookielib.CookieJar()
opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(cj))
login_data = urllib.urlencode({'username' : username, 'password' : password})
opener.open('WebpageWithLoginForm')
resp = opener.open('WebpageIWantToAccess')
soup = BeautifulSoup(resp, 'html.parser')
print soup.prettify()
As of right now, when I print the page it just prints the contents as if I were not logged in. I think the issue has something to do with the way I am setting the cookies, but I am really not sure, because I do not fully understand what is happening with the cookie processor and its libraries.
Thank you!
Current Code:
import requests
import sys
EMAIL = 'usr'
PASSWORD = 'pass'
URL = 'https://connect.lehigh.edu/app/login'
def main():
    # Start a session so we can have persistent cookies
    session = requests.Session()
    # This is the form data that the page sends when logging in
    login_data = {
        'username': EMAIL,
        'password': PASSWORD,
        'LOGIN': 'login',
    }
    # Authenticate
    r = session.post(URL, data=login_data)
    # Try accessing a page that requires you to be logged in
    r = session.get('https://lewisweb.cc.lehigh.edu/PROD/bwskfshd.P_CrseSchdDetl')

if __name__ == '__main__':
    main()
You can use the requests module.
Take a look at the answer I've linked below.
https://stackoverflow.com/a/8316989/6464893
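The usual approach with requests is to keep a Session so the cookies from the login POST are reused on later requests. A minimal sketch, with the login URL and form field names as placeholders rather than values from this question:
import requests

session = requests.Session()
# The field names must match the site's real login form (placeholders here)
session.post('https://example.com/login', data={'username': 'name', 'password': 'pass'})
# The same session sends the login cookies on subsequent requests
resp = session.get('https://example.com/protected-page')
print resp.text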
I'm trying to open a URL in Python that needs username and password. My specific implementation looks like this:
http://char_user:char_pwd#casfcddb.example.com/......
I get the following error spit to the console:
httplib.InvalidURL: nonnumeric port: 'char_pwd#casfcddb.example.com'
I'm using urllib2.urlopen, but the error implies it doesn't understand the user credentials: it sees the ":" and expects a port number rather than the password and the actual address. Any ideas what I'm doing wrong here?
Use an HTTPBasicAuthHandler to provide the password instead:
import urllib2
passman = urllib2.HTTPPasswordMgrWithDefaultRealm()
passman.add_password(None, "http://casfcddb.xxx.com", "char_user", "char_pwd")
auth_handler = urllib2.HTTPBasicAuthHandler(passman)
opener = urllib2.build_opener(auth_handler)
urllib2.install_opener(opener)
urllib2.urlopen("http://casfcddb.xxx.com")
or using the requests library:
import requests
requests.get("http://casfcddb.xxx.com", auth=('char_user', 'char_pwd'))
I ran into a situation where I needed basic auth handling and only had urllib available (Python 3's urllib.request; no urllib2 or requests). The answer from Uku mostly worked, but here are my mods:
import urllib.request
url = 'https://your/url.xxx'
username = 'username'
password = 'password'
passman = urllib.request.HTTPPasswordMgrWithDefaultRealm()
passman.add_password(None, url, username, password)
auth_handler = urllib.request.HTTPBasicAuthHandler(passman)
opener = urllib.request.build_opener(auth_handler)
urllib.request.install_opener(opener)
resp = urllib.request.urlopen(url)
data = resp.read()
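If you only need a single request and would rather not install an opener globally, another option (a sketch against the same placeholder URL and credentials) is to build the Basic auth header yourself so it is sent preemptively:
import base64
import urllib.request

url = 'https://your/url.xxx'
# Encode "username:password" and attach it as a standard Basic auth header
token = base64.b64encode('username:password'.encode('utf-8')).decode('ascii')
req = urllib.request.Request(url, headers={'Authorization': 'Basic ' + token})
with urllib.request.urlopen(req) as resp:
    data = resp.read()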
I am trying to write a Python script to POST a multipart form to a site that requires authentication through CAS.
There are two approaches that both solve part of the problem:
The Python requests library works well for submitting multipart forms.
There is caslib, with a login function. It returns an OpenerDirector that can presumably be used for further requests.
Unfortunately, I can't figure out how to get a complete solution out of what I have so far.
These are just some ideas from a couple of hours of research; I am open to just about any solution that works.
Thanks for the help.
I accepted J.F. Sebastian's answer because I think it was closest to what I'd asked, but I actually wound up getting it to work by using mechanize, a Python library for web browser automation.
import argparse
import mechanize
import re
import sys
# (SENSITIVE!) Authentication info
username = r'username'
password = r'password'
# Command line arguments
parser = argparse.ArgumentParser(description='Submit lab to CS 235 site (Winter 2013)')
parser.add_argument('lab_num', help='Lab submission number')
parser.add_argument('file_name', help='Submission file (zip)')
args = parser.parse_args()
# Go to login site
br = mechanize.Browser()
br.open('https://cas.byu.edu/cas/login?service=https%3a%2f%2fbeta.cs.byu.edu%2f~sub235%2fsubmit.php')
# Login and forward to submission site
br.form = br.forms().next()
br['username'] = username
br['password'] = password
br.submit()
# Submit
br.form = br.forms().next()
br['labnum'] = list(args.lab_num)
br.add_file(open(args.file_name), 'application/zip', args.file_name)
r = br.submit()
for s in re.findall('<h4>(.+?)</?h4>', r.read()):
    print s
You could use poster to prepare the multipart/form-data. Try passing poster's opener to caslib and then using caslib's opener to make requests (not tested):
import urllib2
import caslib
import poster.encode
import poster.streaminghttp
opener = poster.streaminghttp.register_openers()
r, opener = caslib.login_to_cas_service(login_url, username, password,
                                        opener=opener)
params = {'file': open("test.txt", "rb"), 'name': 'upload test'}
datagen, headers = poster.encode.multipart_encode(params)
response = opener.open(urllib2.Request(upload_url, datagen, headers))
print response.read()
You could write an authentication handler for Requests using caslib. Then you could do something like:
auth = CasAuthentication("url", "login", "password")
response = requests.get("http://example.com/cas_service", auth=auth)
Or if you're making tons of requests against the website:
s = requests.session()
s.auth = auth
s.post('http://casservice.com/endpoint', data={'key': 'value'}, files={'filename': '/path/to/file'})
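For reference, a custom auth handler for Requests is just an object that Requests calls with each outgoing request; the rough shape is sketched below, with the CAS ticket logic left as a placeholder since caslib's API is not shown here:
import requests
from requests.auth import AuthBase

class CasAuthentication(AuthBase):
    """Skeleton CAS auth hook for Requests; the ticket handling is a placeholder."""
    def __init__(self, cas_url, login, password):
        self.cas_url = cas_url
        self.login = login
        self.password = password

    def __call__(self, request):
        # A real implementation would obtain a CAS ticket (e.g. via caslib)
        # and attach it here; this header value is purely illustrative.
        request.headers['Authorization'] = 'CAS placeholder-ticket'
        return request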
I want to download an affymetrix annotation file. But it needs to log in first.
The log in page is https://www.affymetrix.com/estore/user/login.jsp
The file I want to download is:
http://www.affymetrix.com/Auth/analysis/downloads/na32/genotyping/GenomeWideSNP_6.na32.annot.db.zip
I have tried some methods but I cannot figure it out.
from requests import session
payload = {
    'action': 'login',
    'username': 'username',  # This part should be changed
    'password': 'password'   # This part should be changed
}

with session() as c:
    c.post('https://www.affymetrix.com/estore/user/login.jsp', data=payload)
    request = c.get('http://www.affymetrix.com/Auth/analysis/downloads/na32/genotyping/GenomeWideSNP_6.na32.annot.db.zip')
    print request.headers
    print request.text
I also tried urllib2:
import urllib, urllib2, cookielib
username = 'username'
password = 'password'
cj = cookielib.CookieJar()
opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(cj))
login_data = urllib.urlencode({'username' : username, 'password' : password})
opener.open('https://www.affymetrix.com/estore/user/login.jsp', login_data)
resp = opener.open('http://www.affymetrix.com/Auth/analysis/downloads/na32/genotyping/GenomeWideSNP_6.na32.annot.db.zip')
resp.read()
Here's the URL that the information is getting posted to.
https://www.affymetrix.com/estore/user/login.jsp?_DARGS=/estore/user/login.jsp
Here is the information that is being posted.