401 Authorization Required on Twitter API request - python

I'm building an Python script to automatically post status updates on Twitter. I've carefully read and followed the API documentation for creating a signature and do the request to update the status. But I'm getting an error when doing the request.
HTTP Error 401: Authorization Required
I've seen more questions like this on Stackoverflow and other resources, but couldn't find a solution to this problem yet.
def sendTweet(self):
url = 'https://api.twitter.com/1.1/statuses/update.json'
message = 'Testing'
oauths = [
['oauth_consumer_key', '1234567890'],
['oauth_nonce', self.generate_nonce()],
['oauth_signature', ''],
['oauth_signature_method', 'HMAC-SHA1'],
['oauth_timestamp', str(int(time.time()))],
['oauth_token', '1234567890'],
['oauth_version', '1.0']
]
signature = self.sign(oauths, message)
oauths[2][1] = signature
count = 0
auth = 'OAuth '
for oauth in oauths:
count = count + 1
if oauth[1]:
auth = auth + self.quote(oauth[0]) + '="' + self.quote(oauth[1]) + '"'
if count < len(oauths):
auth = auth + ', '
headers = {
'Authorization': auth,
'Content-Type': 'application/x-www-form-urlencoded'
}
payload = {'status' : message.encode('utf-8')}
req = urllib2.Request(url, data=urllib.urlencode(payload), headers=headers)
try:
response = urllib2.urlopen(req)
data = response.read()
print data
except (urllib2.HTTPError, urllib2.URLError) as err:
print err
This is what the self.sign function does
def sign(self, oauths, message):
signatureParams = ''
count = 0
values = [
['oauth_consumer_key', oauths[0][1]],
['oauth_nonce', oauths[1][1]],
['oauth_signature_method', oauths[3][1]],
['oauth_timestamp', oauths[4][1]],
['oauth_token', oauths[5][1]],
['oauth_version', oauths[6][1]],
['status', message]
]
customerSecret = '1234567890'
tokenSecret = '1234567890'
for value in oauths:
count = count + 1
signatureParams = signatureParams + self.quote(value[0]) + '=' + self.quote(value[1])
if count < len(oauths):
signatureParams = signatureParams + '&'
signature = 'POST&' + self.quote(self.endPoint) + '&' + self.quote(signatureParams)
signinKey = self.quote(customerSecret) + '&' + self.quote(tokenSecret)
signed = base64.b64encode(self.signRequest(signinKey, signature))
return signed
And below are the helper functions.
def generate_nonce(self):
return ''.join([str(random.randint(0, 9)) for i in range(8)])
def quote(self, param):
return urllib.quote(param.encode('utf8'), safe='')
def signRequest(self, key, raw):
from hashlib import sha1
import hmac
hashed = hmac.new(key, raw, sha1)
return hashed.digest().encode('base64').rstrip('\n')
This is what the auth string looks like when I print it.
OAuth oauth_consumer_key="1234567890",
oauth_nonce="78261149",
oauth_signature="akc2alpFWWdjbElZV2RCMmpGcDc0V1d1S1B3PQ%3D%3D",
oauth_signature_method="HMAC-SHA1", oauth_timestamp="1540896473",
oauth_token="1234567890",
oauth_version="1.0"
That is exactly like it should be if I read the Twitter documentation. But still I get the 401 Authorization Required error.. I just can't see what is going wrong here.
The Access level for the keys is Read, write, and direct messages
All keys are replaced with 1234567890, but in the actual code I use the real ones.
Does someone have an idea what I am doing wrong? Thanks in advance!

Related

Unable to send authenticated OKEx API POST requests with body (in python). 401 Signature invalid. Authenticated GET requests work

As the title suggests I am able to send the authenticated GET requests for the OKEx API without any issues. However as soon as I make a POST request by adding in the extra parameters in the signature as well as in the request's body I get the 401 response {'msg': 'Invalid Sign', 'code': '50113'}.
def signature(timestamp, method, request_path, body,secret_key):
if str(body) == '{}' or str(body) == 'None':
message = str(timestamp) + str.upper(method) + request_path + ''
else:
message = str(timestamp) + str.upper(method) + request_path +body
mac = hmac.new(bytes(secret_key, encoding='utf8'), bytes(message, encoding='utf-8'), digestmod='sha256')
d = mac.digest()
return base64.b64encode(d)
def get_header(endpoint,request,body={}):
header = dict()
header['CONTENT-TYPE'] = 'application/json'
header['OK-ACCESS-KEY'] = okex_key
current_time=get_time()
header['OK-ACCESS-SIGN'] = signature(current_time, request, endpoint , body, okex_secret)
header['OK-ACCESS-TIMESTAMP'] = str(current_time)
header['OK-ACCESS-PASSPHRASE'] = okex_pass
return header
def place_market_order(url,pair,side,amount,tdMode='cash'):
endpoint='/api/v5/trade/order'
request='POST'
body={
"instId":pair,
"tdMode":tdMode, #cash, cross, isolated
"side":side,
"ordType":"market",
"sz":str(Decimal(str(amount)))
}
body = json.dumps(body)
header = get_header(endpoint,request,body)
response= requests.post(url+endpoint, headers=header,data=body)
return response
#
url = 'http://www.okex.com'
place_market_order(url,pair="BTC-USDT",side="buy",amount=0.001)
The signature logic should be ok since authenticated GET requests work.
But what's wrong with how I handle the body? I'm using v5 of the OKEx API.
Thanks in advance!
Under place_market_order function:
response= requests.post(url+endpoint, headers=header,data=body)
should be as follows
response= requests.post(url=url+endpoint, headers=header,data=body)
You need send POST request as JSON formate.
Good example:
{"instId":"BTC-USDT","tdMode":"cash","side":"buy","ordType":"limit","sz":"0.00001","px":"30000"}
Bad example:
instId=BTC-USDT&tdMode=cash&side=buy&ordType=limit&sz=0.00001&px=30000

Shouldn't be accessing this method but get UnboundLocalError: local variable 'response' referenced before assignment

I am getting an error on this line
File "/Users/j/udacity/item_catalog/item-catalog-1490/application.py", line 171, in fbconnect
response.headers['Content-Type'] = 'application/json'
UnboundLocalError: local variable 'response' referenced before assignment
But I don't understand why it's even going to that method because the two states seem to be equal
***************state C155X84FWY0WLD2V5VS2UZDDWXN72F1O
************* login C155X84FWY0WLD2V5VS2UZDDWXN72F1O
But inside that method I have a print for state not equal and it doesn't print that out. In the browser I am getting a 500 error.
#app.route('/fbconnect', methods=['POST'])
def fbconnect():
print "***************state" ,request.args.get('state')
print "*************login session state", login_session['state']
if request.args.get('state') != login_session['state']:
print "*********************state not equal"
response = make_response(json.dumps('Invalid state parameter.'), 401)
response.headers['Content-Type'] = 'application/json'
return response
I'm posting the full content below
#app.route('/fbconnect', methods=['POST'])
def fbconnect():
#state = ''.join(random.choice(string.ascii_uppercase + string.digits) for x in xrange(32))
#login_session['state'] = state
print "***************state" ,request.args.get('state')
print "*************login session state", login_session['state']
if request.args.get('state') != login_session['state']:
print "*********************state not equal"
response = make_response(json.dumps('Invalid state parameter.'), 401)
response.headers['Content-Type'] = 'application/json'
return response
access_token = request.data
app_id = json.loads(open('fb_client_secrets.json', 'r').read())['web']['app_id']
app_secret = json.loads(open('fb_client_secrets.json', 'r').read())['web']['app_secret']
url = 'https://graph.facebook.com/oauth/access_token?grant_type=fb_exchange_token&client_id=%s&client_secret=%s&fb_exchange_token=%s' % (app_id, app_secret, access_token)
h = httplib2.Http()
result = h.request(url, 'GET')[1]
# Use token to get user info from API
userinfo_url = "https://graph.facebook.com/v2.4/me"
token = result.split("&")[0]
url = 'https://graph.facebook.com/v2.4/me?%s&fields=name,id,email' % token
h = httplib2.Http()
result = h.request(url, 'GET')[1]
data = json.loads(result)
login_session['provider'] = 'facebook'
login_session['username'] = data["name"]
login_session['email'] = data["email"]
login_session['facebook_id'] = data["id"]
# The token must be stored in the login_session in order to properly logout, let's strip out the information before the equals sign in our token
stored_token = token.split("=")[1]
login_session['access_token'] = stored_token
# Get user picture
url = 'https://graph.facebook.com/v2.4/me/picture?%s&redirect=0&height=200&width=200' % token
h = httplib2.Http()
result = h.request(url, 'GET')[1]
data = json.loads(result)
login_session['picture'] = data["data"]["url"]
# see if user exists
user_id = getUserID(login_session['email'])
print "user_id", user_id
if not user_id:
user_id = createUser(login_session)
login_session['user_id'] = user_id
output = ''
output += '<h1>Welcome, '
output += login_session['username']
output += '!</h1>'
output += '<img src="'
output += login_session['picture']
output += ' " style = "width: 300px; height: 300px;border-radius: 150px;-webkit-border-radius: 150px;-moz-border-radius: 150px;"> '
flash("Now logged in as %s" % login_session['username'])
#session['name']=login_session['username']
#print session['']
return output

Is There Any Way To Check if a Twitch Stream Is Live Using Python?

I'm just wondering if there is any way to write a python script to check to see if a twitch.tv stream is live?
I'm not sure why my app engine tag was removed, but this would be using app engine.
Since all answers are actually outdated as of 2020-05-02, i'll give it a shot. You now are required to register a developer application (I believe), and now you must use an endpoint that requires a user-id instead of a username (as they can change).
See https://dev.twitch.tv/docs/v5/reference/users
and https://dev.twitch.tv/docs/v5/reference/streams
First you'll need to Register an application
From that you'll need to get your Client-ID.
The one in this example is not a real
TWITCH_STREAM_API_ENDPOINT_V5 = "https://api.twitch.tv/kraken/streams/{}"
API_HEADERS = {
'Client-ID' : 'tqanfnani3tygk9a9esl8conhnaz6wj',
'Accept' : 'application/vnd.twitchtv.v5+json',
}
reqSession = requests.Session()
def checkUser(userID): #returns true if online, false if not
url = TWITCH_STREAM_API_ENDPOINT_V5.format(userID)
try:
req = reqSession.get(url, headers=API_HEADERS)
jsondata = req.json()
if 'stream' in jsondata:
if jsondata['stream'] is not None: #stream is online
return True
else:
return False
except Exception as e:
print("Error checking user: ", e)
return False
I hated having to go through the process of making an api key and all those things just to check if a channel was live, so i tried to find a workaround:
As of june 2021 if you send a http get request to a url like https://www.twitch.tv/CHANNEL_NAME, in the response there will be a "isLiveBroadcast": true if the stream is live, and if the stream is not live, there will be nothing like that.
So i wrote this code as an example in nodejs:
const fetch = require('node-fetch');
const channelName = '39daph';
async function main(){
let a = await fetch(`https://www.twitch.tv/${channelName}`);
if( (await a.text()).includes('isLiveBroadcast') )
console.log(`${channelName} is live`);
else
console.log(`${channelName} is not live`);
}
main();
here is also an example in python:
import requests
channelName = '39daph'
contents = requests.get('https://www.twitch.tv/' +channelName).content.decode('utf-8')
if 'isLiveBroadcast' in contents:
print(channelName + ' is live')
else:
print(channelName + ' is not live')
It looks like Twitch provides an API (documentation here) that provides a way to get that info. A very simple example of getting the feed would be:
import urllib2
url = 'http://api.justin.tv/api/stream/list.json?channel=FollowGrubby'
contents = urllib2.urlopen(url)
print contents.read()
This will dump all of the info, which you can then parse with a JSON library (XML looks to be available too). Looks like the value returns empty if the stream isn't live (haven't tested this much at all, nor have I read anything :) ). Hope this helps!
RocketDonkey's fine answer seems to be outdated by now, so I'm posting an updated answer for people like me who stumble across this SO-question with google.
You can check the status of the user EXAMPLEUSER by parsing
https://api.twitch.tv/kraken/streams/EXAMPLEUSER
The entry "stream":null will tell you that the user if offline, if that user exists.
Here is a small Python script which you can use on the commandline that will print 0 for user online, 1 for user offline and 2 for user not found.
#!/usr/bin/env python3
# checks whether a twitch.tv userstream is live
import argparse
from urllib.request import urlopen
from urllib.error import URLError
import json
def parse_args():
""" parses commandline, returns args namespace object """
desc = ('Check online status of twitch.tv user.\n'
'Exit prints are 0: online, 1: offline, 2: not found, 3: error.')
parser = argparse.ArgumentParser(description = desc,
formatter_class = argparse.RawTextHelpFormatter)
parser.add_argument('USER', nargs = 1, help = 'twitch.tv username')
args = parser.parse_args()
return args
def check_user(user):
""" returns 0: online, 1: offline, 2: not found, 3: error """
url = 'https://api.twitch.tv/kraken/streams/' + user
try:
info = json.loads(urlopen(url, timeout = 15).read().decode('utf-8'))
if info['stream'] == None:
status = 1
else:
status = 0
except URLError as e:
if e.reason == 'Not Found' or e.reason == 'Unprocessable Entity':
status = 2
else:
status = 3
return status
# main
try:
user = parse_args().USER[0]
print(check_user(user))
except KeyboardInterrupt:
pass
Here is a more up to date answer using the latest version of the Twitch API (helix). (kraken is deprecated and you shouldn't use GQL since it's not documented for third party use).
It works but you should store the token and reuse the token rather than generate a new token every time you run the script.
import requests
client_id = ''
client_secret = ''
streamer_name = ''
body = {
'client_id': client_id,
'client_secret': client_secret,
"grant_type": 'client_credentials'
}
r = requests.post('https://id.twitch.tv/oauth2/token', body)
#data output
keys = r.json();
print(keys)
headers = {
'Client-ID': client_id,
'Authorization': 'Bearer ' + keys['access_token']
}
print(headers)
stream = requests.get('https://api.twitch.tv/helix/streams?user_login=' + streamer_name, headers=headers)
stream_data = stream.json();
print(stream_data);
if len(stream_data['data']) == 1:
print(streamer_name + ' is live: ' + stream_data['data'][0]['title'] + ' playing ' + stream_data['data'][0]['game_name']);
else:
print(streamer_name + ' is not live');
📚 Explanation
Now, the Twitch API v5 is deprecated. The helix API is in place, where an OAuth Authorization Bearer AND client-id is needed. This is pretty annoying, so I went on a search for a viable workaround, and found one.
🌎 GraphQL
When inspecting Twitch's network requests, while not being logged in, I found out the anonymous API relies on GraphQL. GraphQL is a query language for APIs.
query {
user(login: "USERNAME") {
stream {
id
}
}
}
In the graphql query above, we are querying a user by their login name. If they are streaming, the stream's id will be given. If not, None will be returned.
🐍 The Final Code
The finished python code, in a function, is below. The client-id is taken from Twitch's website. Twitch uses the client-id to fetch information for anonymous users. It will always work, without the need of getting your own client-id.
import requests
# ...
def checkIfUserIsStreaming(username):
url = "https://gql.twitch.tv/gql"
query = "query {\n user(login: \""+username+"\") {\n stream {\n id\n }\n }\n}"
return True if requests.request("POST", url, json={"query": query, "variables": {}}, headers={"client-id": "kimne78kx3ncx6brgo4mv6wki5h1ko"}).json()["data"]["user"]["stream"] else False
I've created a website where you can play with Twitch's GraphQL API. Refer to the GraphQL Docs for help on GraphQL syntax! There's also Twitch GraphQL API documentation on my playground.
Use the twitch api with your client_id as a parameter, then parse the json:
https://api.twitch.tv/kraken/streams/massansc?client_id=XXXXXXX
Twitch Client Id is explained here: https://dev.twitch.tv/docs#client-id,
you need to register a developer application: https://www.twitch.tv/kraken/oauth2/clients/new
Example:
import requests
import json
def is_live_stream(streamer_name, client_id):
twitch_api_stream_url = "https://api.twitch.tv/kraken/streams/" \
+ streamer_name + "?client_id=" + client_id
streamer_html = requests.get(twitch_api_stream_url)
streamer = json.loads(streamer_html.content)
return streamer["stream"] is not None
I'll try to shoot my shot, just in case someone still needs an answer to this, so here it goes
import requests
import time
from twitchAPI.twitch import Twitch
client_id = ""
client_secret = ""
twitch = Twitch(client_id, client_secret)
twitch.authenticate_app([])
TWITCH_STREAM_API_ENDPOINT_V5 = "https://api.twitch.tv/kraken/streams/{}"
API_HEADERS = {
'Client-ID' : client_id,
'Accept' : 'application/vnd.twitchtv.v5+json',
}
def checkUser(user): #returns true if online, false if not
userid = twitch.get_users(logins=[user])['data'][0]['id']
url = TWITCH_STREAM_API_ENDPOINT_V5.format(userid)
try:
req = requests.Session().get(url, headers=API_HEADERS)
jsondata = req.json()
if 'stream' in jsondata:
if jsondata['stream'] is not None:
return True
else:
return False
except Exception as e:
print("Error checking user: ", e)
return False
print(checkUser('michaelreeves'))
https://dev.twitch.tv/docs/api/reference#get-streams
import requests
# ================================================================
# your twitch client id
client_id = ''
# your twitch secret
client_secret = ''
# twitch username you want to check if it is streaming online
twitch_user = ''
# ================================================================
#getting auth token
url = 'https://id.twitch.tv/oauth2/token'
params = {
'client_id':client_id,
'client_secret':client_secret,
'grant_type':'client_credentials'}
req = requests.post(url=url,params=params)
token = req.json()['access_token']
print(f'{token=}')
# ================================================================
#getting user data (user id for example)
url = f'https://api.twitch.tv/helix/users?login={twitch_user}'
headers = {
'Authorization':f'Bearer {token}',
'Client-Id':f'{client_id}'}
req = requests.get(url=url,headers=headers)
userdata = req.json()
userid = userdata['data'][0]['id']
print(f'{userid=}')
# ================================================================
#getting stream info (by user id for example)
url = f'https://api.twitch.tv/helix/streams?user_id={userid}'
headers = {
'Authorization':f'Bearer {token}',
'Client-Id':f'{client_id}'}
req = requests.get(url=url,headers=headers)
streaminfo = req.json()
print(f'{streaminfo=}')
# ================================================================
This solution doesn't require registering an application
import requests
HEADERS = { 'client-id' : 'kimne78kx3ncx6brgo4mv6wki5h1ko' }
GQL_QUERY = """
query($login: String) {
user(login: $login) {
stream {
id
}
}
}
"""
def isLive(username):
QUERY = {
'query': GQL_QUERY,
'variables': {
'login': username
}
}
response = requests.post('https://gql.twitch.tv/gql',
json=QUERY, headers=HEADERS)
dict_response = response.json()
return True if dict_response['data']['user']['stream'] is not None else False
if __name__ == '__main__':
USERS = ['forsen', 'offineandy', 'dyrus']
for user in USERS:
IS_LIVE = isLive(user)
print(f'User {user} live: {IS_LIVE}')
Yes.
You can use Twitch API call https://api.twitch.tv/kraken/streams/YOUR_CHANNEL_NAME and parse result to check if it's live.
The below function returns a streamID if the channel is live, else returns -1.
import urllib2, json, sys
TwitchChannel = 'A_Channel_Name'
def IsTwitchLive(): # return the stream Id is streaming else returns -1
url = str('https://api.twitch.tv/kraken/streams/'+TwitchChannel)
streamID = -1
respose = urllib2.urlopen(url)
html = respose.read()
data = json.loads(html)
try:
streamID = data['stream']['_id']
except:
streamID = -1
return int(streamID)

instapaper and oauth - 403 "Not logged in" error

I am trying to use the instapaper API, but I keep getting a 403 error for my requests. Here's the code:
consumer_key='...'
consumer_secret='...'
access_token_url = 'https://www.instapaper.com/api/1/oauth/access_token'
consumer = oauth.Consumer(consumer_key, consumer_secret)
client = oauth.Client(consumer)
client.add_credentials('...','...')
params = {}
params["x_auth_username"] = '..'
params["x_auth_password"] = '...'
params["x_auth_mode"] = 'client_auth'
client.set_signature_method = oauth.SignatureMethod_HMAC_SHA1()
resp, token = client.request(access_token_url, method="POST",body=urllib.urlencode(params))
result = simplejson.load(urllib.urlopen('https://www.instapaper.com/api/1/bookmarks/list?' + token))
Any ideas?
You're right about the signature method.
But my main problem was that I wasn't handling the token appropriately. Here's the working code:
consumer = oauth.Consumer('key', 'secret')
client = oauth.Client(consumer)
# Get access token
resp, content = client.request('https://www.instapaper.com/api/1/oauth/access_token', "POST", urllib.urlencode({
'x_auth_mode': 'client_auth',
'x_auth_username': 'uname',
'x_auth_password': 'pass'
}))
token = dict(urlparse.parse_qsl(content))
token = oauth.Token(token['oauth_token'], token['oauth_token_secret'])
http = oauth.Client(consumer, token)
# Get starred items
response, data = http.request('https://www.instapaper.com/api/1/bookmarks/list', method='POST', body=urllib.urlencode({
'folder_id': 'starred',
'limit': '100'
}))
res = simplejson.loads(data)
First, make sure oauth2 is the library you're using. It's the most well-maintained python oauth module.
Second, this looks suspect:
client.set_signature_method = oauth.SignatureMethod_HMAC_SHA1()
You're replacing the set_signature_method function. It should be:
client.set_signature_method(oauth.SignatureMethod_HMAC_SHA1())
You should follow the example here: https://github.com/simplegeo/python-oauth2/blob/master/example/client.py

Parse an HTTP request Authorization header with Python

I need to take a header like this:
Authorization: Digest qop="chap",
realm="testrealm#host.com",
username="Foobear",
response="6629fae49393a05397450978507c4ef1",
cnonce="5ccc069c403ebaf9f0171e9517f40e41"
And parse it into this using Python:
{'protocol':'Digest',
'qop':'chap',
'realm':'testrealm#host.com',
'username':'Foobear',
'response':'6629fae49393a05397450978507c4ef1',
'cnonce':'5ccc069c403ebaf9f0171e9517f40e41'}
Is there a library to do this, or something I could look at for inspiration?
I'm doing this on Google App Engine, and I'm not sure if the Pyparsing library is available, but maybe I could include it with my app if it is the best solution.
Currently I'm creating my own MyHeaderParser object and using it with reduce() on the header string. It's working, but very fragile.
Brilliant solution by nadia below:
import re
reg = re.compile('(\w+)[=] ?"?(\w+)"?')
s = """Digest
realm="stackoverflow.com", username="kixx"
"""
print str(dict(reg.findall(s)))
A little regex:
import re
reg=re.compile('(\w+)[:=] ?"?(\w+)"?')
>>>dict(reg.findall(headers))
{'username': 'Foobear', 'realm': 'testrealm', 'qop': 'chap', 'cnonce': '5ccc069c403ebaf9f0171e9517f40e41', 'response': '6629fae49393a05397450978507c4ef1', 'Authorization': 'Digest'}
You can also use urllib2 as [CheryPy][1] does.
here is the snippet:
input= """
Authorization: Digest qop="chap",
realm="testrealm#host.com",
username="Foobear",
response="6629fae49393a05397450978507c4ef1",
cnonce="5ccc069c403ebaf9f0171e9517f40e41"
"""
import urllib2
field, sep, value = input.partition("Authorization: Digest ")
if value:
items = urllib2.parse_http_list(value)
opts = urllib2.parse_keqv_list(items)
opts['protocol'] = 'Digest'
print opts
it outputs:
{'username': 'Foobear', 'protocol': 'Digest', 'qop': 'chap', 'cnonce': '5ccc069c403ebaf9f0171e9517f40e41', 'realm': 'testrealm#host.com', 'response': '6629fae49393a05397450978507c4ef1'}
[1]: https://web.archive.org/web/20130118133623/http://www.google.com:80/codesearch/p?hl=en#OQvO9n2mc04/CherryPy-3.0.1/cherrypy/lib/httpauth.py&q=Authorization Digest http lang:python
Here's my pyparsing attempt:
text = """Authorization: Digest qop="chap",
realm="testrealm#host.com",
username="Foobear",
response="6629fae49393a05397450978507c4ef1",
cnonce="5ccc069c403ebaf9f0171e9517f40e41" """
from pyparsing import *
AUTH = Keyword("Authorization")
ident = Word(alphas,alphanums)
EQ = Suppress("=")
quotedString.setParseAction(removeQuotes)
valueDict = Dict(delimitedList(Group(ident + EQ + quotedString)))
authentry = AUTH + ":" + ident("protocol") + valueDict
print authentry.parseString(text).dump()
which prints:
['Authorization', ':', 'Digest', ['qop', 'chap'], ['realm', 'testrealm#host.com'],
['username', 'Foobear'], ['response', '6629fae49393a05397450978507c4ef1'],
['cnonce', '5ccc069c403ebaf9f0171e9517f40e41']]
- cnonce: 5ccc069c403ebaf9f0171e9517f40e41
- protocol: Digest
- qop: chap
- realm: testrealm#host.com
- response: 6629fae49393a05397450978507c4ef1
- username: Foobear
I'm not familiar with the RFC, but I hope this gets you rolling.
An older question but one I found very helpful.
(edit after recent upvote) I've created a package that builds on
this answer (link to tests to see how to use the class in the
package).
pip install authparser
I needed a parser to handle any properly formed Authorization header, as defined by RFC7235 (raise your hand if you enjoy reading ABNF).
Authorization = credentials
BWS = <BWS, see [RFC7230], Section 3.2.3>
OWS = <OWS, see [RFC7230], Section 3.2.3>
Proxy-Authenticate = *( "," OWS ) challenge *( OWS "," [ OWS
challenge ] )
Proxy-Authorization = credentials
WWW-Authenticate = *( "," OWS ) challenge *( OWS "," [ OWS challenge
] )
auth-param = token BWS "=" BWS ( token / quoted-string )
auth-scheme = token
challenge = auth-scheme [ 1*SP ( token68 / [ ( "," / auth-param ) *(
OWS "," [ OWS auth-param ] ) ] ) ]
credentials = auth-scheme [ 1*SP ( token68 / [ ( "," / auth-param )
*( OWS "," [ OWS auth-param ] ) ] ) ]
quoted-string = <quoted-string, see [RFC7230], Section 3.2.6>
token = <token, see [RFC7230], Section 3.2.6>
token68 = 1*( ALPHA / DIGIT / "-" / "." / "_" / "~" / "+" / "/" )
*"="
Starting with PaulMcG's answer, I came up with this:
import pyparsing as pp
tchar = '!#$%&\'*+-.^_`|~' + pp.nums + pp.alphas
t68char = '-._~+/' + pp.nums + pp.alphas
token = pp.Word(tchar)
token68 = pp.Combine(pp.Word(t68char) + pp.ZeroOrMore('='))
scheme = token('scheme')
header = pp.Keyword('Authorization')
name = pp.Word(pp.alphas, pp.alphanums)
value = pp.quotedString.setParseAction(pp.removeQuotes)
name_value_pair = name + pp.Suppress('=') + value
params = pp.Dict(pp.delimitedList(pp.Group(name_value_pair)))
credentials = scheme + (token68('token') ^ params('params'))
auth_parser = header + pp.Suppress(':') + credentials
This allows for parsing any Authorization header:
parsed = auth_parser.parseString('Authorization: Basic Zm9vOmJhcg==')
print('Authenticating with {0} scheme, token: {1}'.format(parsed['scheme'], parsed['token']))
which outputs:
Authenticating with Basic scheme, token: Zm9vOmJhcg==
Bringing it all together into an Authenticator class:
import pyparsing as pp
from base64 import b64decode
import re
class Authenticator:
def __init__(self):
"""
Use pyparsing to create a parser for Authentication headers
"""
tchar = "!#$%&'*+-.^_`|~" + pp.nums + pp.alphas
t68char = '-._~+/' + pp.nums + pp.alphas
token = pp.Word(tchar)
token68 = pp.Combine(pp.Word(t68char) + pp.ZeroOrMore('='))
scheme = token('scheme')
auth_header = pp.Keyword('Authorization')
name = pp.Word(pp.alphas, pp.alphanums)
value = pp.quotedString.setParseAction(pp.removeQuotes)
name_value_pair = name + pp.Suppress('=') + value
params = pp.Dict(pp.delimitedList(pp.Group(name_value_pair)))
credentials = scheme + (token68('token') ^ params('params'))
# the moment of truth...
self.auth_parser = auth_header + pp.Suppress(':') + credentials
def authenticate(self, auth_header):
"""
Parse auth_header and call the correct authentication handler
"""
authenticated = False
try:
parsed = self.auth_parser.parseString(auth_header)
scheme = parsed['scheme']
details = parsed['token'] if 'token' in parsed.keys() else parsed['params']
print('Authenticating using {0} scheme'.format(scheme))
try:
safe_scheme = re.sub("[!#$%&'*+-.^_`|~]", '_', scheme.lower())
handler = getattr(self, 'auth_handle_' + safe_scheme)
authenticated = handler(details)
except AttributeError:
print('This is a valid Authorization header, but we do not handle this scheme yet.')
except pp.ParseException as ex:
print('Not a valid Authorization header')
print(ex)
return authenticated
# The following methods are fake, of course. They should use what's passed
# to them to actually authenticate, and return True/False if successful.
# For this demo I'll just print some of the values used to authenticate.
#staticmethod
def auth_handle_basic(token):
print('- token is {0}'.format(token))
try:
username, password = b64decode(token).decode().split(':', 1)
except Exception:
raise DecodeError
print('- username is {0}'.format(username))
print('- password is {0}'.format(password))
return True
#staticmethod
def auth_handle_bearer(token):
print('- token is {0}'.format(token))
return True
#staticmethod
def auth_handle_digest(params):
print('- username is {0}'.format(params['username']))
print('- cnonce is {0}'.format(params['cnonce']))
return True
#staticmethod
def auth_handle_aws4_hmac_sha256(params):
print('- Signature is {0}'.format(params['Signature']))
return True
To test this class:
tests = [
'Authorization: Digest qop="chap", realm="testrealm#example.com", username="Foobar", response="6629fae49393a05397450978507c4ef1", cnonce="5ccc069c403ebaf9f0171e9517f40e41"',
'Authorization: Bearer cn389ncoiwuencr',
'Authorization: Basic Zm9vOmJhcg==',
'Authorization: AWS4-HMAC-SHA256 Credential="AKIAIOSFODNN7EXAMPLE/20130524/us-east-1/s3/aws4_request", SignedHeaders="host;range;x-amz-date", Signature="fe5f80f77d5fa3beca038a248ff027d0445342fe2855ddc963176630326f1024"',
'Authorization: CrazyCustom foo="bar", fizz="buzz"',
]
authenticator = Authenticator()
for test in tests:
authenticator.authenticate(test)
print()
Which outputs:
Authenticating using Digest scheme
- username is Foobar
- cnonce is 5ccc069c403ebaf9f0171e9517f40e41
Authenticating using Bearer scheme
- token is cn389ncoiwuencr
Authenticating using Basic scheme
- token is Zm9vOmJhcg==
- username is foo
- password is bar
Authenticating using AWS4-HMAC-SHA256 scheme
- signature is fe5f80f77d5fa3beca038a248ff027d0445342fe2855ddc963176630326f1024
Authenticating using CrazyCustom scheme
This is a valid Authorization header, but we do not handle this scheme yet.
In future if we wish to handle CrazyCustom we'll just add
def auth_handle_crazycustom(params):
If those components will always be there, then a regex will do the trick:
test = '''Authorization: Digest qop="chap", realm="testrealm#host.com", username="Foobear", response="6629fae49393a05397450978507c4ef1", cnonce="5ccc069c403ebaf9f0171e9517f40e41"'''
import re
re_auth = re.compile(r"""
Authorization:\s*(?P<protocol>[^ ]+)\s+
qop="(?P<qop>[^"]+)",\s+
realm="(?P<realm>[^"]+)",\s+
username="(?P<username>[^"]+)",\s+
response="(?P<response>[^"]+)",\s+
cnonce="(?P<cnonce>[^"]+)"
""", re.VERBOSE)
m = re_auth.match(test)
print m.groupdict()
produces:
{ 'username': 'Foobear',
'protocol': 'Digest',
'qop': 'chap',
'cnonce': '5ccc069c403ebaf9f0171e9517f40e41',
'realm': 'testrealm#host.com',
'response': '6629fae49393a05397450978507c4ef1'
}
I would recommend finding a correct library for parsing http headers unfortunately I can't reacall any. :(
For a while check the snippet below (it should mostly work):
input= """
Authorization: Digest qop="chap",
realm="testrealm#host.com",
username="Foob,ear",
response="6629fae49393a05397450978507c4ef1",
cnonce="5ccc069c403ebaf9f0171e9517f40e41"
"""
field, sep, value = input.partition(":")
if field.endswith('Authorization'):
protocol, sep, opts_str = value.strip().partition(" ")
opts = {}
for opt in opts_str.split(",\n"):
key, value = opt.strip().split('=')
key = key.strip(" ")
value = value.strip(' "')
opts[key] = value
opts['protocol'] = protocol
print opts
Your original concept of using PyParsing would be the best approach. What you've implicitly asked for is something that requires a grammar... that is, a regular expression or simple parsing routine is always going to be brittle, and that sounds like it's something you're trying to avoid.
It appears that getting pyparsing on google app engine is easy: How do I get PyParsing set up on the Google App Engine?
So I'd go with that, and then implement the full HTTP authentication/authorization header support from rfc2617.
The http digest Authorization header field is a bit of an odd beast. Its format is similar to that of rfc 2616's Cache-Control and Content-Type header fields, but just different enough to be incompatible. If you're still looking for a library that's a little smarter and more readable than the regex, you might try removing the Authorization: Digest part with str.split() and parsing the rest with parse_dict_header() from Werkzeug's http module. (Werkzeug can be installed on App Engine.)
Nadia's regex only matches alphanumeric characters for the value of a parameter. That means it fails to parse at least two fields. Namely, the uri and qop. According to RFC 2617, the uri field is a duplicate of the string in the request line (i.e. the first line of the HTTP request). And qop fails to parse correctly if the value is "auth-int" due to the non-alphanumeric '-'.
This modified regex allows the URI (or any other value) to contain anything but ' ' (space), '"' (qoute), or ',' (comma). That's probably more permissive than it needs to be, but shouldn't cause any problems with correctly formed HTTP requests.
reg re.compile('(\w+)[:=] ?"?([^" ,]+)"?')
Bonus tip: From there, it's fairly straight forward to convert the example code in RFC-2617 to python. Using python's md5 API, "MD5Init()" becomes "m = md5.new()", "MD5Update()" becomes "m.update()" and "MD5Final()" becomes "m.digest()".
If your response comes in a single string that that never varies and has as many lines as there are expressions to match, you can split it into an array on the newlines called authentication_array and use regexps:
pattern_array = ['qop', 'realm', 'username', 'response', 'cnonce']
i = 0
parsed_dict = {}
for line in authentication_array:
pattern = "(" + pattern_array[i] + ")" + "=(\".*\")" # build a matching pattern
match = re.search(re.compile(pattern), line) # make the match
if match:
parsed_dict[match.group(1)] = match.group(2)
i += 1

Categories

Resources