I've been trying to work out how to download YouTube captions via the API but the official instructions are tailored towards command line code whereas I'm trying to do this in Python Shell.
Currently I've been following this page to no avail - https://developers.google.com/youtube/v3/docs/captions/list
What trips me up are the Storage- and args-related pieces of code, which don't make sense to me even after much googling.
See the storage code below:
storage = Storage("%s-oauth2.json" % sys.argv[0])
#nowhere on the page does it refer to this oauth2.json file
credentials = storage.get()
if credentials is None or credentials.invalid:
    credentials = run_flow(flow, storage, args)
#if credentials is none why would storage still be needed as an argument?
In the second half of the page, it's all adding arguments to args.parser which seems to be command line stuff that I don't want to use as I'm working from Python Shell.
I've also been reading the official page on getting authorization OAuth 2.0 for YouTube API so I can rewrite this code myself but I can't get past this part: https://developers.google.com/api-client-library/python/auth/web-app
Step 5: Exchange authorization code for refresh and access tokens
After the web server receives the authorization code, it can exchange the authorization code for an access token.
On your callback page, use the google-auth library to verify the authorization server response. Then, use the flow.fetch_token method to exchange the authorization code in that response for an access token:
state = flask.session['state']
This is the error I get from the statement above:
RuntimeError: Working outside of request context.
This typically means that you attempted to use functionality that needed
an active HTTP request. Consult the documentation on testing for
information about how to avoid this problem.
From a lot of digging into the OAuth 2.0 process, I think the problem is that I can't get flow.fetch_token to swap my authorization code for the access token.
I've tested all my credentials and they are fine; it's getting authorized that I'm struggling with.
This is the code so far I'm writing as a simplified version of the Captions API code on the first link:
import flask
import google.oauth2.credentials
import google_auth_oauthlib.flow
flow = google_auth_oauthlib.flow.Flow.from_client_secrets_file("//client_secret_file.json", scopes=["https://www.googleapis.com/auth/youtube.force-ssl"])
flow.redirect_uri = "http://localhost:8080"
authorization_url, state = flow.authorization_url(access_type="offline", include_granted_scopes="true")
flask.redirect(authorization_url)
[Response 929 bytes [302 FOUND]]
state = flask.session['state']
[RuntimeError]
Summary:
How can I download YouTube captions from code in Python Shell? I've set up my credentials, just can't get the web server to give me the access token so I can execute the rest of the code. I would use the code on the captions.list link but I cannot get past the Storage code statement and args.arguments which is written for command line.
The following solution no longer works; I recommend switching to my other StackOverflow answer.
With Python 3 you can easily grab YouTube video subtitles in XML format by accessing the URL: https://video.google.com/timedtext?lang=en&v=VIDEO_ID
Here is another URL that lists the subtitles available for a given YouTube video: http://video.google.com/timedtext?type=list&v=VIDEO_ID
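As a rough sketch, the two URLs above could be built and a caption response parsed with the standard library alone. The helper names (timedtext_url, parse_transcript) are hypothetical, and the XML layout of the sample (a transcript element containing text elements with start and dur attributes) is an assumption about the historical response format. Because the endpoint is reported above as no longer working, the sketch parses a local sample string and makes no network request.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

BASE_URL = "https://video.google.com/timedtext"  # endpoint named in the answer above


def timedtext_url(video_id, lang=None):
    """Build the caption URL (lang given) or the track-list URL (lang omitted)."""
    params = {"lang": lang, "v": video_id} if lang else {"type": "list", "v": video_id}
    return f"{BASE_URL}?{urlencode(params)}"


def parse_transcript(xml_text):
    """Return (start_seconds, text) pairs from a transcript document.

    Assumes <transcript><text start="..." dur="...">...</text></transcript>.
    """
    root = ET.fromstring(xml_text)
    return [(float(node.get("start", "0")), node.text or "") for node in root.iter("text")]


# Hypothetical sample standing in for a real response.
SAMPLE = (
    '<transcript>'
    '<text start="0.5" dur="2">Hello</text>'
    '<text start="2.5" dur="1">world</text>'
    '</transcript>'
)

print(timedtext_url("VIDEO_ID", lang="en"))
print(parse_transcript(SAMPLE))
```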
I am building a local desktop app where I can read, classify, and create playlists.
The following auth code I have is:
##oauth
scope = "playlist-modify-public playlist-read-private playlist-modify-private"
sp = spotipy.Spotify(
    auth_manager=spotipy.SpotifyOAuth(
        client_id=client_id,
        client_secret=client_secret,
        redirect_uri="https://example.com/callback/",
        scope=scope,
        open_browser=False,
    )
)
When run from cmd, this asks me to click the generated link and then paste the URL I was redirected to. I want to know if there is another way to provide authorization (automatically or permanently) so that my .exe app doesn't run into an error.
Code in your response would help a lot.
You cannot grant permanent access to the APIs in a single call, but you can refresh your token automatically whenever the access expires, as shown in the docs.
If you're using Python, I recommend doing this via Spotipy, which makes the auth process much easier (see https://spotipy.readthedocs.io/en/master/#authorization-code-flow)
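To make the "refresh whenever the access expires" idea concrete, here is a minimal sketch of the general refresh-on-expiry pattern. It is not Spotipy's actual implementation: the dictionary keys (access_token, refresh_token, expires_at), the 60-second leeway and the refresh callable are all assumptions made for illustration, and the refresher is a stand-in so the example runs without network access.

```python
import time


def is_expired(token_info, now=None, leeway=60):
    """True when the access token expires within `leeway` seconds."""
    now = time.time() if now is None else now
    return token_info["expires_at"] - now < leeway


def get_access_token(token_info, refresh, now=None):
    """Return a usable access token, calling `refresh` only when needed.

    `refresh` is a hypothetical callable that takes the refresh token and
    returns updated token fields; a real app would call the token endpoint.
    """
    if is_expired(token_info, now):
        token_info.update(refresh(token_info["refresh_token"]))
    return token_info["access_token"]


# Stand-in refresher so the sketch runs offline.
def fake_refresh(refresh_token):
    return {"access_token": "new-token", "expires_at": 10_000}


token = {"access_token": "old-token", "refresh_token": "r1", "expires_at": 1_000}
print(get_access_token(token, fake_refresh, now=500))    # token still valid
print(get_access_token(token, fake_refresh, now=2_000))  # token expired, refreshed
```

The point of the design is that the user authorizes once; afterwards only the stored refresh token is needed to obtain new access tokens.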
I am developing an application that is supposed to help a friend of mine better organize his YouTube channels. He has multiple channels on different Google accounts. I'm developing this in Python and I currently don't have too much experience with the YouTube Data API, which I'm planning on using, since it seems like the only option.
The application itself isn't very complicated. The only things it needs to be able to do are upload videos with a specified title, description and other properties, and write comments on videos. I started a simple application in the Google Developers Console, enabled the YouTube Data API and created an API key and an OAuth client ID.
So far I've managed to post comments on videos, but every time I run the Python script (currently it's just a simple script that posts a single comment), Google wants me to explicitly choose which account to use, and I have to give the script permission again.
Is there a way I can just run the script once and tell Google which account I want to use to post the comment, give all the permissions and Google then remembers that so I don't have to explicitly give permissions every time?
Also, how would I then switch accounts and upload with a different one? Currently I always have to choose an account when the Google client pops up while running the script.
I've heard you can get an application authorized by Google, would that help with this or is it fine if I just keep my app in test and not in production?
If you have N accounts and want to upload videos to each of them, you will have to complete N OAuth 2 authorization flows successfully, one per account.
After each flow completes, persist the obtained credentials data to a separate file in your computer's local storage.
This can be treated as an initialization step of your app (although you may repeat it at any later stage for any additional channel that you need your app to be aware of). Your code would look like:
# run an OAuth flow; then obtain credentials data
# relative to the channel the app's user had chosen
# during that OAuth flow
from google_auth_oauthlib.flow import InstalledAppFlow
scopes = ['https://www.googleapis.com/auth/youtube']
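flow = InstalledAppFlow.from_client_secrets_file(
    client_secret_file, scopes)
# note: run_console() was removed in google-auth-oauthlib 1.0.0;
# on newer versions use flow.run_local_server(port=0) instead
cred = flow.run_console()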
# build a YouTube service object in order to
# retrieve the ID of the channel that the
# app's user chose during the OAuth flow
from googleapiclient.discovery import build
youtube = build('youtube', 'v3', credentials = cred)
response = youtube.channels().list(
    part = 'id',
    mine = True
).execute()
channel_id = response['items'][0]['id']
# save the credentials data to a JSON text file
cred_file = f"/path/to/credentials/data/dir/{channel_id}.json"
with open(cred_file, 'w', encoding = 'UTF-8') as json_file:
    json_file.write(cred.to_json())
Above, client_secret_file is the full path to your app's client secret JSON file, which you've obtained from Google Developers Console.
Subsequently, each time you want to upload a video, you'll have to choose within the app which channel to upload it to. In terms of program logic, say you've chosen the channel whose ID is channel_id: read in the credentials data file associated with channel_id and pass its content to the YouTube service object youtube, constructed as shown below:
# read in the credentials data associated to
# the channel identified by its ID 'channel_id'
from google.oauth2.credentials import Credentials
cred_file = f"/path/to/credentials/data/dir/{channel_id}.json"
cred = Credentials.from_authorized_user_file(cred_file)
# refresh the access token when the
# previously saved one has expired
from google.auth.transport.requests import Request
# cred.valid is False once the token has expired, so
# only require that a refresh token is available here
assert cred and cred.refresh_token
if cred.expired:
    cred.refresh(Request())
    # save the credentials data after a refresh
    with open(cred_file, 'w', encoding = 'UTF-8') as json_file:
        json_file.write(cred.to_json())
# construct a YouTube service object through
# which all API invocations are authorized on
# behalf of the channel with ID 'channel_id'
from googleapiclient.discovery import build
youtube = build('youtube', 'v3', credentials = cred)
After running this code, the YouTube service object youtube is initialized in such a way that each and every API endpoint call issued through it is an authorized request on behalf of the channel identified by channel_id.
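Since each credentials file is named after its channel ID, the app could offer the user a list of known channels simply by scanning that directory. The following is a small sketch of that idea; the function name known_channel_ids and the channel IDs in the demo are hypothetical, and the demo uses a temporary directory in place of /path/to/credentials/data/dir.

```python
from pathlib import Path
import tempfile


def known_channel_ids(cred_dir):
    """Return the channel IDs that have a saved <channel_id>.json file."""
    return sorted(path.stem for path in Path(cred_dir).glob("*.json"))


# Demo with a throwaway directory and hypothetical channel IDs.
with tempfile.TemporaryDirectory() as demo_dir:
    for cid in ("UC_second", "UC_first"):
        (Path(demo_dir) / f"{cid}.json").write_text("{}", encoding="UTF-8")
    print(known_channel_ids(demo_dir))
```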
An important note: you need to have installed the package Google Authentication Library for Python, google-auth, version >= 1.21.3 (google-auth v1.3.0 introduced Credentials.from_authorized_user_file, v1.8.0 introduced Credentials.to_json and v1.21.3 fixed this latter function w.r.t. its class' expiry member), for the credentials object cred to be saved to and loaded from JSON text files.
Also an important note: the code above is simplified as much as possible. Error conditions are not handled at all. For example, the code above does not handle the error situation when cred_file already exists at the time of writing out a new credentials data file or when cred_file does not exist at the time of reading in credentials data that's supposed to already exist.
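One possible way to handle the two error situations just mentioned, sketched with the standard library only. The helper names (save_new_credentials, read_credentials_text) are hypothetical, and the JSON text in the demo is a placeholder rather than real credentials data; in the real app the text would come from cred.to_json().

```python
import os
import tempfile


def save_new_credentials(cred_file, json_text):
    """Write credentials JSON, refusing to overwrite an existing file."""
    # Mode 'x' raises FileExistsError if cred_file is already present.
    with open(cred_file, "x", encoding="UTF-8") as json_file:
        json_file.write(json_text)


def read_credentials_text(cred_file):
    """Return the saved JSON text, or None when no file has been saved yet."""
    try:
        with open(cred_file, encoding="UTF-8") as json_file:
            return json_file.read()
    except FileNotFoundError:
        return None


# Demo in a throwaway directory; the channel ID and JSON are placeholders.
with tempfile.TemporaryDirectory() as demo_dir:
    cred_file = os.path.join(demo_dir, "UC_example.json")
    print(read_credentials_text(cred_file))  # nothing saved yet
    save_new_credentials(cred_file, '{"token": "placeholder"}')
    try:
        save_new_credentials(cred_file, "{}")
    except FileExistsError:
        print("credentials already saved for this channel")
```

Whether an existing file should be refused, overwritten or backed up is a design decision for the app; refusing is just the most conservative choice for a sketch.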
I am evaluating different options for authentication in a python App Engine flex environment, for apps that run within a G Suite domain.
I am trying to put together the OpenID Connect "Server flow" instructions here with how google-auth-library-python implements the general OAuth2 instructions here.
I kind of follow things up until 4. Exchange code for access token and ID token, which looks like flow.fetch_token, except it says "response to this request contains the following fields in a JSON array," and it includes not just the access token but the id token and other things. I did see this patch to the library. Does that mean I could use some flow.fetch_token to create an IDTokenCredentials (how?) and then use this to build an OpenID Connect API client (and where is that API documented)? And what about validating the id token, is there a separate python library to help with that or is that part of the API library?
It is all very confusing. A great deal would be cleared up with some actual "soup to nuts" example code but I haven't found anything anywhere on the internet, which makes me think (a) perhaps this is not a viable way to do authentication, or (b) it is so recent the python libraries have not caught up? I would however much rather do authentication on the server than in the client with Google Sign-In.
Any suggestions or links to code are much appreciated.
It seems Google's Python library contains a module for ID token validation: google.oauth2.id_token. Once the token is validated, it returns the decoded token, which you can use to obtain user information.
from google.oauth2 import id_token
from google.auth.transport import requests
request = requests.Request()
id_info = id_token.verify_oauth2_token(
    token, request, 'my-client-id.example.com')
if id_info['iss'] != 'https://accounts.google.com':
    raise ValueError('Wrong issuer.')
userid = id_info['sub']
Once you obtain the user information, follow the authentication process described in the Authenticate the user section.
OK, I think I found my answer in the source code now.
google.oauth2.credentials.Credentials exposes id_token:
Depending on the authorization server and the scopes requested, this may be populated when credentials are obtained and updated when refresh is called. This token is a JWT. It can be verified and decoded [as #kavindu-dodanduwa pointed out] using google.oauth2.id_token.verify_oauth2_token.
And several layers down the call stack we can see fetch_token does some minimal validation of the response JSON (checking that an access token was returned, etc.) but basically passes through whatever it gets from the token endpoint, including (i.e. if an OpenID Connect scope is included) the id token as a JWT.
EDIT:
And the final piece of the puzzle is the translation of tokens from the (generic) OAuthSession to (Google-specific) credentials in google_auth_oauthlib.helpers, where the id_token is grabbed, if it exists.
Note that the generic oauthlib library does seem to implement OpenID Connect now, but looks to be very recent and in process (July 2018). Google doesn't seem to use any of this at the moment (this threw me off a bit).
Can someone please give me a clear explanation of how to get the Google Calendar API v3 working with the Python Client? Specifically, the initial OAuth stage is greatly confusing me. All I need to do is access my own calendar, read it, and make changes to it. Google provides this code for configuring my app:
import gflags
import httplib2
from apiclient.discovery import build
from oauth2client.file import Storage
from oauth2client.client import OAuth2WebServerFlow
from oauth2client.tools import run
FLAGS = gflags.FLAGS
# Set up a Flow object to be used if we need to authenticate. This
# sample uses OAuth 2.0, and we set up the OAuth2WebServerFlow with
# the information it needs to authenticate. Note that it is called
# the Web Server Flow, but it can also handle the flow for native
# applications
# The client_id and client_secret are copied from the API Access tab on
# the Google APIs Console
FLOW = OAuth2WebServerFlow(
    client_id='YOUR_CLIENT_ID',
    client_secret='YOUR_CLIENT_SECRET',
    scope='https://www.googleapis.com/auth/calendar',
    user_agent='YOUR_APPLICATION_NAME/YOUR_APPLICATION_VERSION')
# To disable the local server feature, uncomment the following line:
# FLAGS.auth_local_webserver = False
# If the Credentials don't exist or are invalid, run through the native client
# flow. The Storage object will ensure that if successful the good
# Credentials will get written back to a file.
storage = Storage('calendar.dat')
credentials = storage.get()
if credentials is None or credentials.invalid == True:
    credentials = run(FLOW, storage)
# Create an httplib2.Http object to handle our HTTP requests and authorize it
# with our good Credentials.
http = httplib2.Http()
http = credentials.authorize(http)
# Build a service object for interacting with the API. Visit
# the Google APIs Console
# to get a developerKey for your own application.
service = build(serviceName='calendar', version='v3', http=http,
                developerKey='YOUR_DEVELOPER_KEY')
But (a) it makes absolutely no sense to me; the comment explanations are terrible, and (b) I don't know what to put in the variables. I've registered my program with Google and signed up for a Service Account key. But all that gave me was an encrypted key file to download, and a client ID. I have no idea what a "developerKey" is, or what a "client_secret" is? Is that the key? If it is, how do I get it, since it is actually contained in an encrypted file? Finally, given the relatively simple goals of my API use (i.e., it's not a multi-user, multi-access operation), is there a simpler way to be doing this? Thanks.
A simple (read: way I've done it) way to do this is to create a web application instead of a service account. This may sound weird since you don't need any sort of web application, but I use this in the same way you do - make some queries to my own calendar/add events/etc. - all from the command line and without any sort of web-app interaction. There are ways to do it with a service account (I'll tinker around if you do in fact want to go on that route), but this has worked for me thus far.
After you create a web application, you will then have all of the information indicated above (side note: the sample code above is based on a web application - to use a service account your FLOW needs to call flow_from_clientsecrets and further adjustments need to be made - see here). Therefore you will be able to fill out this section:
FLOW = OAuth2WebServerFlow(
    client_id='YOUR_CLIENT_ID',
    client_secret='YOUR_CLIENT_SECRET',
    scope='https://www.googleapis.com/auth/calendar',
    user_agent='YOUR_APPLICATION_NAME/YOUR_APPLICATION_VERSION')
You can now fill out with the values you see in the API console (client_id = the entire Client ID string, client_secret = the client secret, scope is the same and the user_agent can be whatever you want). As for the service line, developerKey is the API key you can find under the Simple API Access section in the API console (label is API key):
service = build(serviceName='calendar', version='v3', http=http,
                developerKey='<your_API_key>')
You can then add in a simple check like the following to see if it worked:
events = service.events().list(calendarId='<your_email_here>').execute()
print(events)
Now when you run this, a browser window will pop up that will allow you to complete the authentication flow. What this means is that all authentication will be handled by Google, and the authentication response info will be stored in calendar.dat. That file (which will be stored in the same directory as your script) will contain the authentication info that the service will now use. That is what is going on here:
storage = Storage('calendar.dat')
credentials = storage.get()
if credentials is None or credentials.invalid == True:
    credentials = run(FLOW, storage)
It checks for the existence of valid credentials by looking for that file and verifying the contents (this is all abstracted away from you to make it easier to implement). After you authenticate, the if statement will evaluate False and you will be able to access your data without needing to authenticate again.
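To illustrate why the storage object is involved both before and after the flow, here is a library-free sketch of the same "load cached credentials, otherwise authorize and write them back" pattern. It is not oauth2client's code: the JSON file format, the invalid field and the run_flow callable are assumptions made for the example, and the fake flow stands in for the interactive browser step.

```python
import json
import os
import tempfile


def get_credentials(path, run_flow):
    """Return cached credentials from `path`, running the flow only if needed.

    `run_flow` is a hypothetical zero-argument callable standing in for the
    interactive authorization; it must return a JSON-serializable dict.
    """
    if os.path.exists(path):
        with open(path, encoding="UTF-8") as fh:
            cached = json.load(fh)
        if not cached.get("invalid", False):
            return cached  # later runs: no prompt needed
    fresh = run_flow()  # first run (or invalid cache): authorize
    with open(path, "w", encoding="UTF-8") as fh:
        json.dump(fresh, fh)  # written back so the next run finds it
    return fresh


calls = []


def fake_flow():
    """Stand-in for the browser flow; records that it was invoked."""
    calls.append("authorized")
    return {"access_token": "placeholder", "invalid": False}


with tempfile.TemporaryDirectory() as demo_dir:
    store = os.path.join(demo_dir, "calendar.dat")
    get_credentials(store, fake_flow)  # runs the flow and writes the file
    get_credentials(store, fake_flow)  # reads the file; flow is not run again
    print(len(calls))
```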
Hopefully that shines a bit more light on the process - long story short, make a web application and use the parameters from that, authenticate once and then forget about it. I'm sure there are various points I'm overlooking, but hopefully it will work for your situation.
Google now has a good sample application that gets you up and running without too much fuss. It is available as the "5 minute experience - Quickstart" on their Getting Started page.
It will give you a URL to visit directly if you are working on a remote server without a browser.
I am trying to implement a button on a web-based dashboard that allows a user to export the current data to a Google Spreadsheet using OAuth and GData API. Currently, I can get the user to a login/grant access page, but if I add the line to convert the request token to an access token, I receive:
"RequestError: Unable to upgrade OAuth request token to access token: 400, parameter_absent
oauth_parameters_absent:oauth_token"
I am following the instructions for OAuth 2 on this page:
https://developers.google.com/gdata/docs/auth/oauth
and have read both PyDocs for the Google APIs and found no details on this issue:
http://gdata-python-client.googlecode.com/hg/pydocs/gdata.docs.client.html#DocsClient
(It won't let me post this hyperlink, but the other PyDoc is at the same URL with the piece after pydocs/ replaced by gdata.gauth.html#ClientLoginToken)
This is the code that works:
def createDocsClient(self, oauth_callback_url):
    docsClient = gdata.docs.client.DocsClient(source='RiskOps-QualityDashboard')
    request_token = docsClient.GetOAuthToken(SCOPES, oauth_callback_url, CONSUMER_KEY, consumer_secret=CONSUMER_SECRET)
    domain = None
    auth_url = request_token.generate_authorization_url(google_apps_domain=domain)
    self.redirect(str(auth_url))
    request_token = gdata.gauth.AuthorizeRequestToken(request_token, self.request.uri)
With the above code, I get to a grant access page, and if you click the grant access button, you get a 404 error because it doesn't know where to go afterwards (as expected), but the page has the proper URL displayed, listing an oauth_verifier and oauth_token. The "AuthorizeRequestToken" line is supposed to use that URL to authorize the token, so up to this line everything seems to work.
When I add the following line right after the code above, I get the "RequestError" I wrote about:
access_token = docsClient.GetAccessToken(request_token)
I've tried different combinations of nesting the calls within each other, using the AeSave and AeLoad (as the instructions mention might be needed but I'm not sure if my case calls for it) and many other random and unsuccessful ideas and nothing is really giving me a good idea of what I'm missing or doing wrong.
I would really appreciate any help or ideas anyone has. (If you can't tell, I'm fairly inexperienced when it comes to real-world code, as opposed to academic code.) Thanks so much.