Auto-renew the Kerberos ticket - Python

I had to use Kerberos authentication for the first time. It kind of works, but I feel like I'm missing a lot of understanding of what is going on and how to set it up properly.
Basically, what I need is for my Python script to run every couple of hours and send a request to a remote web server using a domain account in an MS AD environment.
The following code provides me with a ready-to-go session instance:
from requests import Session
import gssapi
from requests_gssapi import HTTPSPNEGOAuth

# Authenticate every request in this session via SPNEGO/Kerberos
session = Session()
session.auth = HTTPSPNEGOAuth(mech=gssapi.mechs.Mechanism.from_sasl_name("SPNEGO"))
The script was added to the crontab of a user on a Linux box, and kinit was used to obtain a ticket-granting ticket:
kinit -kt ~/ad_user.keytab ad_user@DOMAIN.COM
But after a while it all stopped because the ticket had expired. The fix was simple: adding kinit to the crontab to run every 8 hours.
I'm wondering if there is a better, more proper way to achieve the same? If I don't want/need to create a principal for the server in AD, but simply want some code to always have a valid ticket - can I avoid having a dedicated task in the user's crontab?

Why don't you initiate the ticket cache directly in your code? This might make it more transparent that your job relies on a Kerberos login and where that login lives. (In 5 years, when you come back to this code, it will be hard to remember, and this might save you some grief later.)
It will also help ensure that someone can't accidentally disable your job by removing a cron entry.
kinit from python script using keytab
I would do this in every script that requires authentication, to reduce external dependencies.
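For example, a minimal sketch of that approach, reusing the keytab path and principal from the question (run this at the start of the script, before building the session above):

import os
import subprocess

# Refresh the TGT from the keytab on every run, so the script
# no longer depends on an externally scheduled kinit.
subprocess.check_call(
    ["kinit", "-kt", os.path.expanduser("~/ad_user.keytab"), "ad_user@DOMAIN.COM"]
)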

Related

How do I stop git from asking credentials when I try to clone a repository that doesn't exist?

I'm doing some research and I need to download a very large number of git repositories, something like 17k+.
I wrote a very simple Python script to automate the cloning routine from a dataset containing the GitHub URLs.
import os

first_10 = data.url           # pandas column of GitHub URLs
name = data.project_name      # pandas column of destination folder names
for x, i in zip(first_10, name):
    os.system(r'git clone {} D:\gitres\{}'.format(x, i))
It just iterates over those two pandas dataframe columns for the URL to be cloned and the folder name.
Here comes the problem: every time the script finds a URL that no longer exists on GitHub, the script halts its routine, asks for credentials and won't resume until my input. Doesn't matter if I input correct credentials or gibberish, it will do this every time it finds an invalid GitHub URL. How do I stop git from asking those credentials?
The reason GitHub sends you a 401 to prompt for credentials when the repository is missing is that they don't want to leak whether a private repository exists. If they didn't prompt, you could easily determine that a repository does exist by getting a 401 and that it doesn't by getting a 404. Instead, GitHub always prompts for credentials, and only then returns a 404 if the repository doesn't exist or isn't accessible to you.
If your desire is not to be prompted at all, as torek mentioned, you can simply set the environment variable GIT_ASKPASS to false and this will work. You could also set GIT_TERMINAL_PROMPT to 0 and that would prevent any prompting for credentials.
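For instance, a small sketch of how that might look in the cloning loop (variable names follow the question's script):

import os
import subprocess

# Copy the environment and disable every path git could use to prompt.
env = dict(os.environ, GIT_ASKPASS="false", GIT_TERMINAL_PROMPT="0")
for x, i in zip(first_10, name):
    subprocess.call(["git", "clone", x, i], env=env)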
However, I strongly recommend that you do indeed set some credentials because GitHub will much more aggressively rate-limit you if you don't set any credentials, and if you do end up using an excessive amount of resources, it's much easier for GitHub to contact you about the problem and ask you to fix it, rather than just block you or make an abuse report to your network provider.
On that note, your Python script is not likely to handle the case where you have a large number of failures for that reason, so you should strongly consider handling that case more robustly. In general, anyone making a large number of HTTP requests to any server needs to learn to back off gracefully.
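As a sketch of what backing off could look like (the retry count and delays here are illustrative, not part of the original answer):

import subprocess
import time

def clone_with_backoff(url, dest, max_tries=5):
    delay = 1
    for _ in range(max_tries):
        if subprocess.call(["git", "clone", url, dest]) == 0:
            return True
        time.sleep(delay)
        delay *= 2  # wait twice as long after each failure
    return False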
If you decide you do want to pass credentials, you can do so from the environment using a custom credential helper, or you can use an SSH key and SSH URLs to do this.
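One common shape for the environment-based credential-helper route is an inline shell helper (GIT_USER and GIT_PASS are illustrative variable names, not a git convention):

GIT_USER=me GIT_PASS=token git -c credential.helper='!f() { echo "username=${GIT_USER}"; echo "password=${GIT_PASS}"; }; f' clone <url>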
I'd suggest you've overspecified the problem (turning this into an XY problem): you don't specifically need to make Git not ask for credentials since you could instead merely clone those repositories that do exist.
That said, there are two ways to prevent Git from asking for credentials:
Use a URL that cannot take credentials. (Any given server may or may not accept such URLs. With GitHub, you could try to log in as git@github.com via ssh, and present a valid public key. After ssh has authenticated you, Git will give you access to any accessible URL, and deny you access to any inaccessible or invalid URL, without asking for further credentials.)
Supply a credential helper that never actually provides any credentials, without asking for any. For instance, you could run with GIT_ASKPASS=false in the environment. See the credentials documentation for details.
(There's one more option as well, which is to allow Git to ask for credentials but redirect the input to a program. This is trickier than just overriding GIT_ASKPASS so there is no reason to cover it here.)
To solve the problem better, find a way to list out the repositories that do exist, and do not attempt to clone the ones that don't. This is likely to go significantly faster.
My guess is that you are using https:// data urls. If you use a personal access token, then GitHub shouldn't be asking you for a username/password. Take a look here on how to set it up.
I think that if you use ssh:// data urls instead, you wouldn't encounter that problem, because git defaults to using your SSH key for authentication instead of a password.
You probably want to check that the repo exists before attempting the clone (see the sketch below). There are answers on Stack Overflow for this here.
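A minimal way to do that check, reusing the environment-variable trick to keep git from prompting:

import os
import subprocess

def repo_exists(url):
    # ls-remote exits non-zero if the repo is missing or inaccessible
    return subprocess.call(
        ["git", "ls-remote", url, "HEAD"],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
        env=dict(os.environ, GIT_TERMINAL_PROMPT="0"),
    ) == 0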
Alternatively, if you switch to using subprocess instead of os.system, you can simply "trick" git by reading input from /dev/null, which suppresses the prompt; that way no intervention is needed and the invalid URLs are simply skipped.
import subprocess

for x, i in zip(first_10, name):
    # With no stdin and no controlling terminal, git cannot stop to ask for credentials.
    subprocess.call(['git', 'clone', x, i], shell=False,
                    stdin=subprocess.DEVNULL, start_new_session=True)
I have come across another useful trick that might come in handy in desktop environments.
In conjunction with GIT_TERMINAL_PROMPT=0, running git -c credential.helper= <rest of command> also suppresses credential-manager pop-up windows.

Uber Rides/Python - How to configure OAuth w. Apache

I'm new to this, so I apologize in advance. I am actually an AppleScript developer and would merely like to use the Uber Rides Python library from a script; I have virtually no knowledge of Python. (I'm just looking for an 'easy' way to initiate an Uber ride from within a more complex script using Homebridge and Siri.)
I've built the Python app, but I don't know how to get the OAuth code after using the authorization_code_grant.py script.
I will be the only user of this app (it's just for testing at home), so I'm not that worried about auth, but I understand it's a mandatory Uber process. There's no frontend to the app (it's just the script running and responding to Siri requests via Homebridge) and no web page for the user to authenticate on. The user will be myself, so I just need a way to "get" the OAuth code that I will then use as part of the CLI/Python command within the AppleScript.
I have two main issues:
Can someone give me a step-by-step on how to grab the code sent by Uber once the user has logged in and clicked the authorize button? I seem to understand I should configure my Apache server to "receive" the code, but I don't know how. (If it helps, I have set up a Heroku account, but I'm not sure I need it considering Apache runs on my Mac; I just don't know how to configure it…)
When I try to use the authorization_code script (with the URI set to http://localhost:7000 but, to my knowledge, nothing runs on port 7000 at the moment, hence question 1), it generates the error below.
Error:
>mediacenter$ python example/authorization_code_grant.py
Login and grant access by going to:
login.uber.com/oauth/authorize?scope=profile+request+history&state=MgnYJ18l7DxqbSYxkSfjrbGCL8BQAMg0&redirect_uri=https%3A%2F%2Foauthswift.herokuapp.com%2Fcallback%2Fsiriuber&response_type=code&client_id=3Wk7zJbSLVCFCQ69UZvQJCZ_aBfHJBDu
>Copy the URL you are redirected to and paste here:
oauth-callback/siriuber?state=MgnYJ18l7DxqbSYxkSfjrbGCL8BQAMg0&code=dK1ETADCaHcZCAbXnYKOSapetgexgj
Failed to request access token: UNAUTHORIZED.
[ErrorDetails: 401 UNAUTHORIZED invalid_client]
Traceback (most recent call last):
File "example/authorization_code_grant.py", line 150, in
hello_user(api_client)
File "example/authorization_code_grant.py", line 122, in hello_user
response = api_client.get_user_profile()
AttributeError: 'NoneType' object has no attribute 'get_user_profile'
I hope it kind of makes sense. I know I should spend some time learning a "real" language, but AS is (most of the time) perfect for quickly tying different things together and doing what I want!
Thanks in advance,
JC
Same person from GitHub. After digging around, I came up with this.
Basically, there is probably a configuration issue in either your example/config.yaml or in your app dashboard. Make sure you configured both of those correctly. The example/config.yaml setup should be exactly like this, with the three values replaced. Make sure your redirect URL is the same as the one in your dashboard under the "Authorizations" redirect URLs.
Did you install from source? As in did you clone it from the GitHub repo? Or did you install it using pip?
Hope this helps.
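As for question 1, one way to "receive" the code without configuring Apache at all is a tiny Python listener on the redirect port. This is just a sketch under the question's own assumption that the redirect URI is http://localhost:7000; it prints the code query parameter from Uber's redirect and exits:

from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import urlparse, parse_qs

class CallbackHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Uber redirects to e.g. /?state=...&code=...; pull out "code"
        params = parse_qs(urlparse(self.path).query)
        print("OAuth code:", params.get("code", [""])[0])
        self.send_response(200)
        self.end_headers()
        self.wfile.write(b"You can close this window now.")

# Serve exactly one request (the redirect), then return
HTTPServer(("localhost", 7000), CallbackHandler).handle_request()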

SOLR mysolr pysolr Python 401 reply

If there is someone out there who has already worked with SOLR and a Python library to index/query it, would you be able to try and answer the following question?
I am using the mysolr Python library, but there are others out there (like pysolr), and I don't think the problem is related to the library itself.
I have a default multicore SOLR setup, so no authentication is normally required. I don't need it to access the admin page at http://localhost:8080/solr/testcore/admin/ either.
from mysolr import Solr
solr = Solr('http://localhost:8080/solr/testcore/')
response = solr.search(q='*:*')
print("response")
print(response)
This code used to work, but now I get a 401 reply from SOLR... just like that; no changes have been made to the Python virtual env containing mysolr or to the SOLR setup. Still, something must have changed somewhere, but I'm out of clues.
What could be the causes of a SOLR 401 response?
Additional info: this script and more advanced scripts do work on another PC, just not on the one I am working on. Also, appending "/select?q=*:*" to the URL in the browser does return the correct results. So SOLR is set up correctly; it probably has something to do with my computer itself. Could Windows settings (of any kind) have an impact on how SOLR responds to requests from Python? The Python env itself has been reinstalled several times to no avail.
Thanks in advance!
The problem was: proxy.
If this exact situation ever occurs to you and you are behind a proxy, check whether your HTTP_PROXY and HTTPS_PROXY environment variables are set. If they are... this might cause the Python session to try using the proxy when it shouldn't (connecting to localhost via the proxy).
It didn't cause any trouble for months, but out of the blue it did, so whether you encounter this or not might depend on how your IT department set up your proxy or made some other change... somewhere.
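A quick way to check and work around this (mysolr talks HTTP via requests, which honors these variables; the NO_PROXY exemption shown here is one possible fix, not the only one):

import os

# See whether a proxy is configured in the environment
for var in ("HTTP_PROXY", "HTTPS_PROXY", "http_proxy", "https_proxy"):
    print(var, "=", os.environ.get(var))

# Exempt localhost from proxying before creating the Solr client
os.environ["NO_PROXY"] = "localhost,127.0.0.1"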
thank you everyone!

GAE development server keep full text search indexes after restart?

Is there any way of forcing the GAE dev server to keep full-text search indexes after a restart? I am finding that the index is lost whenever the dev server is restarted.
I am already using a static datastore path when I launch the dev server (the --datastore_path option).
This functionality was added a few releases ago (in either 1.7.1 or 1.7.2, I think). If you're using an SDK from the last few months it should be working. You can try explicitly setting the --search_indexes_path flag on dev_appserver.py; it's possible that the default location (/tmp/) isn't writable. Could you post the first few lines of the logs from when you start dev_appserver?
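For example, a hypothetical invocation pointing the indexes at an explicitly writable location (the paths here are illustrative):

dev_appserver.py --search_indexes_path=/home/me/myapp/search_indexes /home/me/myapp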
In case anyone else comes looking for this, it looks like the simple solution now is to specify
--storage_path=/not/the/tmp/dir
you can still override this with --datastore_path etc.
https://developers.google.com/appengine/docs/python/tools/devserver
(at the bottom of the page..)
Looks like this is not an issue anymore, according to the documentation (and my tests):
"The development web server simulates the App Engine datastore using a
file on your computer. This file persists between invocations of the
web server, so data you store will still be available the next time
you run the web server."
Please let me know if it is otherwise and I will follow up on that.

SPNEGO (kerberos token generation/validation) for SSO using Python

I'm attempting to implement a simple Single Sign On scenario where some of the participating servers will be windows (IIS) boxes. It looks like SPNEGO is a reasonable path for this.
Here's the scenario:
User logs in to my SSO service using his username and password. I authenticate him using some mechanism.
At some later time the user wants to access App A.
The user's request for App A is intercepted by the SSO service. The SSO service uses SPNEGO to log the user in to App A:
The SSO service hits the App A web page, gets a "WWW-Authenticate: Negotiate" response
The SSO service generates an "Authorization: Negotiate xxx" header on behalf of the user and responds to App A. The user is now logged in to App A.
The SSO service intercepts subsequent user requests for App A, inserting the Authorization header into them before passing them on to App A.
Does that sound right?
I need two things (at least that I can think of now):
the ability to generate the "Authorization: Negotiate xxx" token on behalf of the user, preferably using Python
the ability to validate "Authorization: Negotiate xxx" headers in Python (for a later part of the project)
This is exactly what Apple does with its Calendar Server. They have a python gssapi library for the kerberos part of the process, in order to implement SPNEGO.
Look in CalendarServer/twistedcaldav/authkerb.py for the server auth portion.
The kerberos module (which is a C module) doesn't have any useful docstrings, but PyKerberos/pysrc/kerberos.py has all the function definitions.
Here's the urls for the svn trunks:
http://svn.calendarserver.org/repository/calendarserver/CalendarServer/trunk
http://svn.calendarserver.org/repository/calendarserver/PyKerberos/trunk
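To give a feel for that API, here is a minimal sketch of both sides using PyKerberos. The service principal is a made-up example, the client needs a valid TGT in its cache, and the server side needs read access to a keytab for that principal (e.g. via KRB5_KTNAME):

import kerberos

service = "HTTP@appa.example.com"  # hypothetical SPN for App A

# Client side: produce the "Authorization: Negotiate <token>" value
_, client_ctx = kerberos.authGSSClientInit(service)
kerberos.authGSSClientStep(client_ctx, "")          # empty challenge on the first leg
client_token = kerberos.authGSSClientResponse(client_ctx)

# Server side: validate a token received from a client
_, server_ctx = kerberos.authGSSServerInit(service)
kerberos.authGSSServerStep(server_ctx, client_token)
print(kerberos.authGSSServerUserName(server_ctx))   # the authenticated principal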
Take a look at the http://spnego.sourceforge.net/credential_delegation.html tutorial. It seems to be doing what you are trying to do.
I've been searching quite some time for something similar (on Linux), which has led me to this page several times without an answer. So here is the solution I came up with:
The web server is an Apache with mod_auth_kerb. It has already been running in an Active Directory single sign-on setup for quite some time.
What I was already able to do before:
Using Chromium with single sign-on on Linux (with a proper krb5 setup and a working kinit user@domain)
Having Python connect and single sign on using sspi from the pywin32 package, with something like sspi.ClientAuth("Negotiate", targetspn="http/%s" % host)
The following code snippet completes the puzzle (and my needs): Python single sign-on with Kerberos on Linux (using python-gssapi):
import base64
import gssapi  # the older python-gssapi API that provides InitContext

# neg_value is the base64 blob from the server's
# "WWW-Authenticate: Negotiate <blob>" header
in_token = base64.b64decode(neg_value)
service_name = gssapi.Name("HTTP@%s" % host, gssapi.C_NT_HOSTBASED_SERVICE)
spnegoMechOid = gssapi.oids.OID.mech_from_string("1.3.6.1.5.5.2")  # SPNEGO mech OID
ctx = gssapi.InitContext(service_name, mech_type=spnegoMechOid)
out_token = ctx.step(in_token)
# The base64 of out_token is what goes back in "Authorization: Negotiate <...>"
outStr = base64.b64encode(out_token)
