I'm trying to log into the vRealize Operations REST API using the basic auth method from this question:
HTTP Basic Authentication not working in Python 3.4
So I use the second code sample:
import urllib.parse
import urllib.request
import urllib.response

userName = "username"
passWord = "password"
top_level_url = "URL"

# create an authorization handler
p = urllib.request.HTTPPasswordMgrWithDefaultRealm()
p.add_password(None, top_level_url, userName, passWord)
auth_handler = urllib.request.HTTPBasicAuthHandler(p)
opener = urllib.request.build_opener(auth_handler)
urllib.request.install_opener(opener)

try:
    result = opener.open(top_level_url)
    messages = result.read()
    print(messages)
except IOError as e:
    print(e)
and I get back: HTTP Error 401: Unauthorized
So I think this is an issue with trusting the certificate, or it has something to do with the headers. How can I work around this? Any advice would be appreciated.
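Two things worth checking here: whether the server expects the Authorization header on the very first request (urllib only sends credentials after a 401 challenge), and whether the appliance uses a self-signed certificate. A minimal Python 3 sketch, reusing the placeholders above, that sends Basic auth preemptively and skips certificate verification for testing only:
import base64
import ssl
import urllib.request

userName = "username"
passWord = "password"
top_level_url = "URL"

# build the Basic auth header ourselves instead of waiting for a 401 challenge
credentials = base64.b64encode(("%s:%s" % (userName, passWord)).encode("ascii")).decode("ascii")
request = urllib.request.Request(top_level_url)
request.add_header("Authorization", "Basic %s" % credentials)

# appliances often ship a self-signed certificate; this context skips
# verification (acceptable for a quick test, not for production)
context = ssl._create_unverified_context()

with urllib.request.urlopen(request, context=context) as result:
    print(result.read())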
I'm trying to open a URL in Python that needs username and password. My specific implementation looks like this:
http://char_user:char_pwd@casfcddb.example.com/......
I get the following error printed to the console:
httplib.InvalidURL: nonnumeric port: 'char_pwd@casfcddb.example.com'
I'm using urllib2.urlopen, and the error implies it doesn't understand the user credentials: it sees the ":" and expects a port number rather than the password and actual address. Any ideas what I'm doing wrong here?
Use an HTTPBasicAuthHandler to provide the password instead:
import urllib2
passman = urllib2.HTTPPasswordMgrWithDefaultRealm()
passman.add_password(None, "http://casfcddb.xxx.com", "char_user", "char_pwd")
auth_handler = urllib2.HTTPBasicAuthHandler(passman)
opener = urllib2.build_opener(auth_handler)
urllib2.install_opener(opener)
urllib2.urlopen("http://casfcddb.xxx.com")
or using the requests library:
import requests
requests.get("http://casfcddb.xxx.com", auth=('char_user', 'char_pwd'))
I ran into a situation where I needed BasicAuth handling and only had urllib available (no urllib2 or requests). The answer from Uku mostly worked, but here are my mods:
import urllib.request
url = 'https://your/url.xxx'
username = 'username'
password = 'password'
passman = urllib.request.HTTPPasswordMgrWithDefaultRealm()
passman.add_password(None, url, username, password)
auth_handler = urllib.request.HTTPBasicAuthHandler(passman)
opener = urllib.request.build_opener(auth_handler)
urllib.request.install_opener(opener)
resp = urllib.request.urlopen(url)
data = resp.read()
import urllib2
import urllib

url = "http://www.torn.com/authenticate.php"
username = raw_input("Your username: ")
password = raw_input("Your password: ")

query = {'player': username, 'password': password}
data_query = urllib.urlencode(query)
sending_data = urllib2.Request(url, data_query)

response = urllib2.urlopen(sending_data)
print "You are currently logged in to:", response.geturl()
print response.read()
How do I implement cookielib to create a session? Please explain it line by line. Thank you.
import urllib
import urllib2
from cookielib import CookieJar

cj = CookieJar()
login_data = urllib.urlencode({'j_username': username, 'j_password': password})
opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(cj))
opener.open("http://www.torn.com/authenticate.php", login_data)
First, you want to initialize your CookieJar. Then you encode your credentials, as they need to be sent in a certain form to be read by the PHP server. Then you initialize an opener, which is basically an HTTP client, and configure it to use your CookieJar. Finally, submit your login info to the server to create the session and generate cookies. To continue using these cookies, use opener.open() instead of urllib2.urlopen() (although you can still use urllib2.Request() to generate the requests).
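For example, once the login above has run, a follow-up request through the same opener carries the session cookies automatically (the page URL here is a hypothetical logged-in page):
# cj already holds the session cookies set by authenticate.php,
# so this request is made as the logged-in user
profile = opener.open("http://www.torn.com/index.php")
print profile.read()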
You'll want to use an OpenerDirector for that.
From the documentation:
import cookielib, urllib2
cj = cookielib.CookieJar()
opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(cj))
r = opener.open("http://example.com/")
So in order to also pass data your code should look something like this:
import urllib2
import urllib
import cookielib

url = "http://www.torn.com/authenticate.php"
username = raw_input("Your username: ")
password = raw_input("Your password: ")

query = {'player': username, 'password': password}
data_query = urllib.urlencode(query)

cj = cookielib.CookieJar()
opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(cj))
response = opener.open(url, data_query)
print "You are currently logged in to:", response.geturl()
print response.read()
I am using the requests module.
I have figured out how to submit data to a login form on a website and retrieve the session key, but I can't see an obvious way to use this session key in subsequent requests.
Can someone fill in the ellipsis in the code below or suggest another approach?
>>> import requests
>>> login_data = {'formPosted': '1', 'login_email': 'me@example.com', 'password': 'pw'}
>>> r = requests.post('https://localhost/login.py', login_data)
>>>
>>> r.text
'You are being redirected here'
>>> r.cookies
{'session_id_myapp': '127-0-0-1-825ff22a-6ed1-453b-aebc-5d3cf2987065'}
>>>
>>> r2 = requests.get('https://localhost/profile_data.json', ...)
You can easily create a persistent session using:
s = requests.Session()
After that, continue with your requests as you would:
s.post('https://localhost/login.py', login_data)
# logged in! cookies saved for future requests.
r2 = s.get('https://localhost/profile_data.json', ...)
# cookies sent automatically!
# do whatever, s will keep your cookies intact :)
For more about Sessions: https://requests.readthedocs.io/en/latest/user/advanced/#session-objects
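As a quick sanity check (same hypothetical endpoints and login_data as above), you can inspect what the session stored after the login post:
import requests

s = requests.Session()
s.post('https://localhost/login.py', data=login_data)

# the session's cookie jar now holds whatever the server set on login
print(s.cookies.get_dict())

r2 = s.get('https://localhost/profile_data.json')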
The other answers helped me understand how to maintain such a session. Additionally, I want to provide a class which keeps a session maintained over different runs of a script (with a cache file). This means a proper "login" is only performed when required (timeout, or no session exists in the cache). It also supports proxy settings over subsequent calls to 'get' or 'post'.
It is tested with Python 3.
Use it as a basis for your own code. The following snippets are released under GPL v3.
import pickle
import datetime
import os
from urllib.parse import urlparse

import requests


class MyLoginSession:
    """
    A class which handles and saves login sessions. It also keeps track of
    proxy settings, and maintains a cache file for restoring session data
    from earlier script executions.
    """
    def __init__(self,
                 loginUrl,
                 loginData,
                 loginTestUrl,
                 loginTestString,
                 sessionFileAppendix='_session.dat',
                 maxSessionTimeSeconds=30 * 60,
                 proxies=None,
                 userAgent='Mozilla/5.0 (Windows NT 6.1; WOW64; rv:40.0) Gecko/20100101 Firefox/40.1',
                 debug=True,
                 forceLogin=False,
                 **kwargs):
        """
        Save some information needed to login to the session.

        You'll have to provide 'loginTestString', which will be looked for in the
        response HTML to make sure you've properly been logged in.

        'proxies' is of format { 'https' : 'https://user:pass@server:port', 'http' : ... }
        'loginData' will be sent as POST data (dictionary of id : value).
        'maxSessionTimeSeconds' will be used to determine when to re-login.
        """
        urlData = urlparse(loginUrl)

        self.proxies = proxies
        self.loginData = loginData
        self.loginUrl = loginUrl
        self.loginTestUrl = loginTestUrl
        self.maxSessionTime = maxSessionTimeSeconds
        self.sessionFile = urlData.netloc + sessionFileAppendix
        self.userAgent = userAgent
        self.loginTestString = loginTestString
        self.debug = debug

        self.login(forceLogin, **kwargs)

    def modification_date(self, filename):
        """
        Return the last file modification date as a datetime object.
        """
        t = os.path.getmtime(filename)
        return datetime.datetime.fromtimestamp(t)

    def login(self, forceLogin=False, **kwargs):
        """
        Login to the session. Try to read the last saved session from the cache
        file; if this fails, or if the cached session is too old, do a proper
        login. Always updates the session cache file.
        """
        wasReadFromCache = False
        if self.debug:
            print('loading or generating session...')
        if os.path.exists(self.sessionFile) and not forceLogin:
            time = self.modification_date(self.sessionFile)

            # only load if the cache file is younger than maxSessionTime
            lastModification = (datetime.datetime.now() - time).total_seconds()
            if lastModification < self.maxSessionTime:
                with open(self.sessionFile, "rb") as f:
                    self.session = pickle.load(f)
                    wasReadFromCache = True
                    if self.debug:
                        print("loaded session from cache (last access %ds ago)"
                              % lastModification)
        if not wasReadFromCache:
            self.session = requests.Session()
            self.session.headers.update({'user-agent': self.userAgent})
            res = self.session.post(self.loginUrl, data=self.loginData,
                                    proxies=self.proxies, **kwargs)

            if self.debug:
                print('created new session with login')
            self.saveSessionToCache()

        # test login
        res = self.session.get(self.loginTestUrl)
        if res.text.lower().find(self.loginTestString.lower()) < 0:
            raise Exception("could not log into provided site '%s'"
                            " (did not find successful login string)"
                            % self.loginUrl)

    def saveSessionToCache(self):
        """
        Save the session to the cache file.
        """
        # always save (to update the timeout)
        with open(self.sessionFile, "wb") as f:
            pickle.dump(self.session, f)
            if self.debug:
                print('updated session cache-file %s' % self.sessionFile)

    def retrieveContent(self, url, method="get", postData=None, **kwargs):
        """
        Return the content of the url with respect to the session.
        If 'method' is not 'get', the url is called with 'postData'
        as a POST request.
        """
        if method == 'get':
            res = self.session.get(url, proxies=self.proxies, **kwargs)
        else:
            res = self.session.post(url, data=postData, proxies=self.proxies, **kwargs)

        # the session has been updated on the server, so also update it in the cache
        self.saveSessionToCache()
        return res
A code snippet for using the above class may look like this:
if __name__ == "__main__":
    # proxies = {'https' : 'https://user:pass@server:port',
    #            'http'  : 'http://user:pass@server:port'}

    loginData = {'user': 'usr',
                 'password': 'pwd'}

    loginUrl = 'https://...'
    loginTestUrl = 'https://...'
    successStr = 'Hello Tom'

    s = MyLoginSession(loginUrl, loginData, loginTestUrl, successStr,
                       #proxies = proxies
                       )

    res = s.retrieveContent('https://....')
    print(res.text)

    # if, for instance, login via JSON values is required, try this:
    s = MyLoginSession(loginUrl, None, loginTestUrl, successStr,
                       #proxies = proxies,
                       json=loginData)
Check out my answer in this similar question:
python: urllib2 how to send cookie with urlopen request
import urllib2
import urllib
from cookielib import CookieJar
cj = CookieJar()
opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(cj))
# input-type values from the html form
formdata = { "username" : username, "password": password, "form-id" : "1234" }
data_encoded = urllib.urlencode(formdata)
response = opener.open("https://page.com/login.php", data_encoded)
content = response.read()
EDIT:
I see I've gotten a few downvotes for my answer, but no explanatory comments. I'm guessing it's because I'm referring to the urllib libraries instead of requests. I do that because the OP asked for help with requests or for someone to suggest another approach.
The documentation says that get takes in an optional cookies argument allowing you to specify cookies to use:
from the docs:
>>> url = 'http://httpbin.org/cookies'
>>> cookies = dict(cookies_are='working')
>>> r = requests.get(url, cookies=cookies)
>>> r.text
'{"cookies": {"cookies_are": "working"}}'
http://docs.python-requests.org/en/latest/user/quickstart/#cookies
Upon trying all the answers above, I found that using "RequestsCookieJar" instead of the regular CookieJar for subsequent requests fixed my problem.
import requests
import json
# The Login URL
authUrl = 'https://whatever.com/login'
# The subsequent URL
testUrl = 'https://whatever.com/someEndpoint'
# Logout URL
testlogoutUrl = 'https://whatever.com/logout'
# Whatever you are posting
login_data = {'formPosted': '1',
              'login_email': 'me@example.com',
              'password': 'pw'
              }
# The Authentication token or any other data that we will receive from the Authentication Request.
token = ''
# Post the login Request
loginRequest = requests.post(authUrl, login_data)
print("{}".format(loginRequest.text))
# Save the request content to your variable. In this case I needed a field called token.
token = str(json.loads(loginRequest.content)['token']) # or ['access_token']
print("{}".format(token))
# Verify Successful login
print("{}".format(loginRequest.status_code))
# Create your Requests Cookie Jar for your subsequent requests and add the cookie
jar = requests.cookies.RequestsCookieJar()
jar.set('LWSSO_COOKIE_KEY', token)
# Execute your next request(s) with the Request Cookie Jar set
r = requests.get(testUrl, cookies=jar)
print("R.TEXT: {}".format(r.text))
print("R.STCD: {}".format(r.status_code))
# Execute your logout request(s) with the Request Cookie Jar set
r = requests.delete(testlogoutUrl, cookies=jar)
print("R.TEXT: {}".format(r.text)) # should show "Request Not Authorized"
print("R.STCD: {}".format(r.status_code)) # should show 401
Save only required cookies and reuse them.
import os
import pickle
import requests
from urllib.parse import urljoin, urlparse

login = 'my@email.com'
password = 'secret'

# Assuming two cookies are used for persistent login.
# (Find them by tracing the login process.)
persistentCookieNames = ['sessionId', 'profileId']

URL = 'http://example.com'
urlData = urlparse(URL)
cookieFile = urlData.netloc + '.cookie'
signinUrl = urljoin(URL, "/signin")

with requests.Session() as session:
    try:
        with open(cookieFile, 'rb') as f:
            print("Loading cookies...")
            session.cookies.update(pickle.load(f))
    except Exception:
        # If we could not load cookies from the file, get new ones by logging in
        print("Logging in...")
        post = session.post(
            signinUrl,
            data={
                'email': login,
                'password': password,
            }
        )
        try:
            with open(cookieFile, 'wb') as f:
                jar = requests.cookies.RequestsCookieJar()
                for cookie in session.cookies:
                    if cookie.name in persistentCookieNames:
                        jar.set_cookie(cookie)
                pickle.dump(jar, f)
        except Exception as e:
            os.remove(cookieFile)
            raise(e)

    MyPage = urljoin(URL, "/mypage")
    page = session.get(MyPage)
A snippet to retrieve JSON data from a password-protected page:
import requests

username = "my_user_name"
password = "my_super_secret"
url = "https://www.my_base_url.com"
the_page_i_want = "/my_json_data_page"

session = requests.Session()

# retrieve the CSRF cookie value
resp = session.get(url + '/login')
csrf_token = resp.cookies['csrftoken']

# login, adding a Referer header
resp = session.post(url + "/login",
                    data={
                        'username': username,
                        'password': password,
                        'csrfmiddlewaretoken': csrf_token,
                        'next': the_page_i_want,
                    },
                    headers=dict(Referer=url + "/login"))

print(resp.json())
This will work for you in Python:
# Call the JIRA API with HTTPBasicAuth
import json
import requests
from requests.auth import HTTPBasicAuth

JIRA_EMAIL = "****"
JIRA_TOKEN = "****"
BASE_URL = "https://****.atlassian.net"
API_URL = "/rest/api/3/serverInfo"

API_URL = BASE_URL + API_URL

BASIC_AUTH = HTTPBasicAuth(JIRA_EMAIL, JIRA_TOKEN)
HEADERS = {'Content-Type': 'application/json;charset=iso-8859-1'}

response = requests.get(
    API_URL,
    headers=HEADERS,
    auth=BASIC_AUTH
)

print(json.dumps(json.loads(response.text), sort_keys=True, indent=4, separators=(",", ": ")))
Update: based on Lee's comment I decided to condense my code to a really simple script and run it from the command line:
import urllib2
import sys
username = sys.argv[1]
password = sys.argv[2]
url = sys.argv[3]
print("calling %s with %s:%s\n" % (url, username, password))
passman = urllib2.HTTPPasswordMgrWithDefaultRealm()
passman.add_password(None, url, username, password)
urllib2.install_opener(urllib2.build_opener(urllib2.HTTPBasicAuthHandler(passman)))
req = urllib2.Request(url)
f = urllib2.urlopen(req)
data = f.read()
print(data)
Unfortunately it still won't generate the Authorization header (per Wireshark) :(
I'm having a problem sending basic AUTH over urllib2. I took a look at this article, and followed the example. My code:
import urllib2

passman = urllib2.HTTPPasswordMgrWithDefaultRealm()
passman.add_password(None, "api.foursquare.com", username, password)
urllib2.install_opener(urllib2.build_opener(urllib2.HTTPBasicAuthHandler(passman)))
req = urllib2.Request("http://api.foursquare.com/v1/user")
f = urllib2.urlopen(req)
data = f.read()
I'm seeing the following on the Wire via wireshark:
GET /v1/user HTTP/1.1
Host: api.foursquare.com
Connection: close
Accept-Encoding: gzip
User-Agent: Python-urllib/2.5
You can see the Authorization is not sent, vs. when I send a request via curl: curl -u user:password http://api.foursquare.com/v1/user
GET /v1/user HTTP/1.1
Authorization: Basic =SNIP=
User-Agent: curl/7.19.4 (universal-apple-darwin10.0) libcurl/7.19.4 OpenSSL/0.9.8k zlib/1.2.3
Host: api.foursquare.com
Accept: */*
For some reason my code seems to not send the authentication - anyone see what I'm missing?
thanks
-simon
The problem could be that, per the HTTP standard, the Python libraries first send an unauthenticated request, and only if it's answered with a 401 do they retry with the correct credentials. If the Foursquare servers don't do "totally standard authentication", the libraries won't work.
Try using headers to do authentication:
import urllib2, base64
request = urllib2.Request("http://api.foursquare.com/v1/user")
base64string = base64.b64encode('%s:%s' % (username, password))
request.add_header("Authorization", "Basic %s" % base64string)
result = urllib2.urlopen(request)
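On Python 3, the same idea needs bytes for b64encode; a sketch of the equivalent (assuming username and password are defined as above):
import base64
import urllib.request

request = urllib.request.Request("http://api.foursquare.com/v1/user")
credentials = base64.b64encode(("%s:%s" % (username, password)).encode("ascii")).decode("ascii")
request.add_header("Authorization", "Basic %s" % credentials)
result = urllib.request.urlopen(request)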
Had the same problem as you and found the solution from this thread: http://forums.shopify.com/categories/9/posts/27662
(copy-paste/adapted from https://stackoverflow.com/a/24048772/1733117).
First you can subclass urllib2.BaseHandler or urllib2.HTTPBasicAuthHandler, and implement http_request so that each request has the appropriate Authorization header.
import urllib2
import base64

class PreemptiveBasicAuthHandler(urllib2.HTTPBasicAuthHandler):
    '''Preemptive basic auth.

    Instead of waiting for a 401 to then retry with the credentials,
    send the credentials if the url is handled by the password manager.
    Note: please use realm=None when calling add_password.'''
    def http_request(self, req):
        url = req.get_full_url()
        realm = None
        # this is very similar to the code from retry_http_basic_auth()
        # but returns a request object.
        user, pw = self.passwd.find_user_password(realm, url)
        if pw:
            raw = "%s:%s" % (user, pw)
            auth = 'Basic %s' % base64.b64encode(raw).strip()
            req.add_unredirected_header(self.auth_header, auth)
        return req

    https_request = http_request
Then, if you are lazy like me, install the handler globally:
api_url = "http://api.foursquare.com/"
api_username = "johndoe"
api_password = "some-cryptic-value"

auth_handler = PreemptiveBasicAuthHandler()
auth_handler.add_password(
    realm=None,  # default realm.
    uri=api_url,
    user=api_username,
    passwd=api_password)
opener = urllib2.build_opener(auth_handler)
urllib2.install_opener(opener)
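With the opener installed, a plain urlopen call now sends the Authorization header on the first request (the endpoint path is illustrative):
# credentials are attached preemptively by the handler above
response = urllib2.urlopen(api_url + "v1/user")
print response.read()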
Here's what I'm using to deal with a similar problem I encountered while trying to access MailChimp's API. This does the same thing, just formatted nicer.
import urllib2
import base64

chimpConfig = {
    "headers": {
        "Content-Type": "application/json",
        "Authorization": "Basic " + base64.encodestring("hayden:MYSECRETAPIKEY").replace('\n', '')
    },
    "url": 'https://us12.api.mailchimp.com/3.0/'}

# perform authentication
datas = None
request = urllib2.Request(chimpConfig["url"], datas, chimpConfig["headers"])
result = urllib2.urlopen(request)
The second parameter must be a URI, not a domain name, i.e.:
passman = urllib2.HTTPPasswordMgrWithDefaultRealm()
passman.add_password(None, "http://api.foursquare.com/", username, password)
I would suggest that the current solution is to use my package urllib2_prior_auth, which solves this pretty nicely (I am working on getting it included in the standard library).