I am trying to process speech in real time in order to transcribe it, using Google Cloud's speech API.
I have a service account associated with the speech service, plus a local JSON key file which should authorize my requests to Google Cloud (it contains a couple of credentials). Even though the key is set to expire on January 1st, 10000, attempting to obtain the transcript generates the following error:
google.api_core.exceptions.Unauthenticated: 401 Request had invalid authentication credentials. Expected OAuth 2 access token, login cookie or other valid authentication credential. See https://developers.google.com/identity/sign-in/web/devconsole-project. [reason: "ACCESS_TOKEN_EXPIRED"
domain: "googleapis.com"
metadata {
key: "method"
value: "google.cloud.speech.v1.Speech.StreamingRecognize"
}
metadata {
key: "service"
value: "speech.googleapis.com"
}
]
What am I doing wrong? I cannot seem to find a solution for this online.
Thank you!
I am trying to upload a file to a custom Google Cloud Storage bucket with a Flutter web app.
final _storage = FirebaseStorage.instanceFor(bucket: bucketName);
Reference documentRef = _storage.ref().child(filename);
await documentRef.putData(await data);
The code works fine for a default bucket but fails with a new custom GCP bucket.
Error: FirebaseError: Firebase Storage: An unknown error occurred, please check the error payload for server response. (storage/unknown)
The HTTP POST response causing this error says:
{
"error": {
"code": 400,
"message": "Your bucket has not been set up properly for Firebase Storage. Please visit 'https://console.firebase.google.com/project/{my_project_name}/storage/rules' to set up security rules."
}
}
So apparently, I need to add the new bucket to Firebase and set up access rules before I can upload files there.
Since these buckets are created automatically by my backend microservice, is there a way to add them to Firebase and set up the rules with the Python SDK? Alternatively, is there any other way to upload data to GCP buckets with Flutter besides Firebase Storage?
Thank you.
As far as I can tell, this cannot be done with an SDK. However, it can be done by making a request to the Firebase API. Here's how it would be done using curl:
curl -X POST \
-H "Authorization: Bearer "$(gcloud auth application-default print-access-token) \
https://firebasestorage.clients6.google.com/v1alpha/projects/[PROJECT_NUMBER]/buckets/[BUCKET_NAME]:addFirebase
You can make a similar request with the requests library, creating the token as explained here:
import google.auth
import google.auth.transport.requests
creds, project = google.auth.default()
# creds.valid is False, and creds.token is None
# Need to refresh credentials to populate those
auth_req = google.auth.transport.requests.Request()
creds.refresh(auth_req)
# Now you can use creds.token
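Putting the two pieces together, a minimal sketch of the same call from Python might look like the following. It assumes the endpoint behaves as in the curl command above (it is a v1alpha endpoint, so it may change), that Application Default Credentials are available, and that the cloud-platform scope is sufficient. The helper names build_add_firebase_request and add_firebase are hypothetical and not part of any SDK.

```python
def build_add_firebase_request(project_number, bucket_name, token):
    """Return the (url, headers) pair for the addFirebase call from the curl example."""
    url = (
        "https://firebasestorage.clients6.google.com/v1alpha/"
        f"projects/{project_number}/buckets/{bucket_name}:addFirebase"
    )
    headers = {"Authorization": f"Bearer {token}"}
    return url, headers


def add_firebase(project_number, bucket_name):
    """Ask Firebase to adopt an existing GCS bucket (sketch, not verified against the live API)."""
    # Imported here so the helper above has no third-party dependencies.
    import google.auth
    import google.auth.transport.requests
    import requests

    # Assumption: the cloud-platform scope is enough for this endpoint.
    creds, _ = google.auth.default(
        scopes=["https://www.googleapis.com/auth/cloud-platform"]
    )
    # Freshly loaded credentials carry no token until they are refreshed.
    creds.refresh(google.auth.transport.requests.Request())

    url, headers = build_add_firebase_request(project_number, bucket_name, creds.token)
    response = requests.post(url, headers=headers, timeout=30)
    response.raise_for_status()  # surface 4xx/5xx responses as exceptions
    return response
```

Calling add_firebase("<PROJECT_NUMBER>", "<BUCKET_NAME>") from the backend microservice right after it creates a bucket would then correspond to the curl call. Setting up the security rules is a separate step that this sketch does not cover.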
Here's my problem:
I have a 365 Family OneDrive subscription with 3 members, my account being the admin.
I am trying to build a Python application to read/extract the content of the files I have on this OneDrive space based on specific criteria. I want to build it as a command line application, running locally on my PC. I am aware some tools may exist for this but I'd like to code my own solution.
After going through tons of different documentation, I ended up doing the following:
Registered my application on the Azure portal
Granted some permissions on the Microsoft Graph API (User.Read, Files.Read and Files.Read.All)
Created a secret
Grabbed the sample code provided by Microsoft
Replaced some variables with my Client_Id and Secret
Ran the code
The code returns an access token, but the authorized request fails with 401 - Unauthorized: Access is denied due to invalid credentials.
Here's the Python code I'm using.
import json

import msal
import requests

config = {
    "authority": "https://login.microsoftonline.com/consumers",
    "client_id": "<my client ID>",
    "scope": ["https://graph.microsoft.com/.default"],
    "secret": "<My secret stuff>",
    "endpoint": "https://graph.microsoft.com/v1.0/users",
}

# Create a preferably long-lived app instance which maintains a token cache.
app = msal.ConfidentialClientApplication(
    config["client_id"],
    authority=config["authority"],
    client_credential=config["secret"],
)

# Look in the token cache first; fall back to requesting a new token.
result = app.acquire_token_silent(config["scope"], account=None)
if not result:
    result = app.acquire_token_for_client(scopes=config["scope"])

if "access_token" in result:
    # Calling graph using the access token
    graph_data = requests.get(  # Use token to call downstream service
        config["endpoint"],
        headers={"Authorization": "Bearer " + result["access_token"]},
    ).json()
    print("Graph API call result: ")
    print(json.dumps(graph_data, indent=2))
else:
    print(result.get("error"))
    print(result.get("error_description"))
    print(result.get("correlation_id"))  # You may need this when reporting a bug
According to the error message, I'm obviously missing something in the authorization process but can't tell what. I'm not even sure about the Authority and Endpoints I should use. My account being a personal one, I have no tenant.
Do I need to set-up / configure some URI somewhere?
Any help would be welcome.
Thank you in advance.
In your client app, you need to store the token that you get from MSAL and then send that token with each authorized request.
For OneDrive, download the OneDrive SDK for Python. Its documentation describes the different options for authentication.
Which tokens you get back (an access token, an ID token, a refresh token) depends on the flow you're using. My suggestion is to review the flows for a better understanding of how the authentication process works and what will be returned accordingly. You can use the MSAL library for Python.
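To make the point about flows concrete: the code in the question uses the client credentials flow (acquire_token_for_client), which returns an app-only token, and as far as I know that kind of token is not accepted for a personal Microsoft account. Below is a minimal sketch of a delegated flow instead, using MSAL's device code flow. It assumes the app registration allows public client flows and personal accounts, and that the Files.Read scope is enough to list files. The helper names auth_header and list_root_items are hypothetical.

```python
GRAPH_CHILDREN_URL = "https://graph.microsoft.com/v1.0/me/drive/root/children"


def auth_header(token_result):
    """Build the Authorization header from an MSAL token result dict."""
    if "access_token" not in token_result:
        raise RuntimeError(
            token_result.get("error_description", "no access token returned")
        )
    return {"Authorization": "Bearer " + token_result["access_token"]}


def list_root_items(client_id):
    """Sign in with a device code, then list the names in the OneDrive root folder."""
    # Imported here so auth_header() can be used without these packages.
    import msal
    import requests

    app = msal.PublicClientApplication(
        client_id, authority="https://login.microsoftonline.com/consumers"
    )
    flow = app.initiate_device_flow(scopes=["Files.Read"])
    if "user_code" not in flow:
        raise RuntimeError("Could not start the device flow: %s" % flow)
    print(flow["message"])  # tells the user which URL to open and which code to enter
    result = app.acquire_token_by_device_flow(flow)  # blocks until sign-in completes

    response = requests.get(
        GRAPH_CHILDREN_URL, headers=auth_header(result), timeout=30
    )
    response.raise_for_status()
    return [item["name"] for item in response.json().get("value", [])]
```

Note that the request goes to /me/drive rather than /users, because a delegated token acts on behalf of the signed-in user. No client secret is involved in this flow.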
I have created and run DAGs on a Google Cloud Composer environment (dlkpipelinesv1 : composer-1.13.0-airflow-1.10.12). I am able to trigger these DAGs manually and via the scheduler, but I am stuck when it comes to triggering them via Cloud Functions that detect changes in a Google Cloud Storage bucket.
Note that I had another GC-Composer environment (pipelines:composer-1.7.5-airflow-1.10.2) that used those same google cloud functions to trigger the relevant dags, and it was working.
I followed this guide to create the functions that trigger the dags. So I retrieved the following variables:
import logging

import requests
from google.auth.transport.requests import Request
from google.oauth2 import id_token

PROJECT_ID = "<project_id>"
CLIENT_ID = "<client_id_retrieved_by_running_the_code_in_the_guide_within_my_gcp_console>"
WEBSERVER_ID = "<airflow_webserver_id>"
DAG_NAME = "<dag_to_trigger>"
WEBSERVER_URL = f"https://{WEBSERVER_ID}.appspot.com/api/experimental/dags/{DAG_NAME}/dag_runs"


def file_listener(event, context):
    """Entry point of the cloud function: Triggered by a change to a Cloud Storage bucket.

    Args:
        event (dict): Event payload.
        context (google.cloud.functions.Context): Metadata for the event.
    """
    logging.info("Running the file listener process")
    logging.info(f"event : {event}")
    logging.info(f"context : {context}")
    file = event
    if file["size"] == "0" or "DTM_DATALAKE_AUDIT_COMPTAGE" not in file["name"] or ".filepart" in file["name"].lower():
        logging.info("no matching file")
        return  # end this invocation normally instead of calling exit(0)
    logging.info(f"File listener detected the presence of : {file['name']}.")
    # id_token = authorize_iap()
    # make_iap_request({"file_name": file["name"]}, id_token)
    make_iap_request(url=WEBSERVER_URL, client_id=CLIENT_ID, method="POST")


def make_iap_request(url, client_id, method="GET", **kwargs):
    """Makes a request to an application protected by Identity-Aware Proxy.

    Args:
        url: The Identity-Aware Proxy-protected URL to fetch.
        client_id: The client ID used by Identity-Aware Proxy.
        method: The request method to use
            ('GET', 'OPTIONS', 'HEAD', 'POST', 'PUT', 'PATCH', 'DELETE')
        **kwargs: Any of the parameters defined for the request function:
            https://github.com/requests/requests/blob/master/requests/api.py
            If no timeout is provided, it is set to 90 by default.

    Returns:
        The page body, or raises an exception if the page couldn't be retrieved.
    """
    # Set the default timeout, if missing
    if "timeout" not in kwargs:
        kwargs["timeout"] = 90
    # Obtain an OpenID Connect (OIDC) token from metadata server or using service account.
    open_id_connect_token = id_token.fetch_id_token(Request(), client_id)
    logging.info(f"Retrieved open id connect (bearer) token {open_id_connect_token}")
    # Fetch the Identity-Aware Proxy-protected URL, including an authorization header containing "Bearer " followed by a
    # Google-issued OpenID Connect token for the service account.
    resp = requests.request(method, url, headers={"Authorization": f"Bearer {open_id_connect_token}"}, **kwargs)
    if resp.status_code == 403:
        raise Exception("Service account does not have permission to access the IAP-protected application.")
    elif resp.status_code != 200:
        raise Exception(f"Bad response from application: {resp.status_code} / {resp.headers} / {resp.text}")
    else:
        logging.info(f"Response status - {resp.status_code}")
        return resp.text  # the page body, as documented; resp.json would return the unbound method
This is the code that runs in the Cloud Functions.
I have checked the environment details of dlkpipelinesv1 and pipelines respectively, using this code:
import google.auth
import google.auth.transport.requests

credentials, _ = google.auth.default(
    scopes=['https://www.googleapis.com/auth/cloud-platform'])
authed_session = google.auth.transport.requests.AuthorizedSession(
    credentials)
# project_id = 'YOUR_PROJECT_ID'
# location = 'us-central1'
# composer_environment = 'YOUR_COMPOSER_ENVIRONMENT_NAME'
environment_url = (
    'https://composer.googleapis.com/v1beta1/projects/{}/locations/{}'
    '/environments/{}').format(project_id, location, composer_environment)
composer_response = authed_session.request('GET', environment_url)
environment_data = composer_response.json()
The two environments run with the same service accounts, i.e. the same IAM roles. However, I have noticed the following differences:
In the old environment :
"airflowUri": "https://p5<hidden_value>-tp.appspot.com",
"privateEnvironmentConfig": { "privateClusterConfig": {} },
in the new environment:
"airflowUri": "https://da<hidden_value>-tp.appspot.com",
"privateEnvironmentConfig": {
"privateClusterConfig": {},
"webServerIpv4CidrBlock": "<hidden_value>",
"cloudSqlIpv4CidrBlock": "<hidden_value>"
}
The service account that I use to make the post request has the following roles :
Cloud Functions Service Agent
Composer Administrator
Composer User
Service Account Token Creator
Service Account User
The service account that runs my composer environment has the following roles :
BigQuery Admin
Composer Worker
Service Account Token Creator
Storage Object Admin
But I am still receiving a 403 - Forbidden in the Log Explorer when the post request is made to the airflow API.
EDIT 2020-11-16 :
I've updated to the latest make_iap_request code.
I tinkered with the IAP within the security service, but I cannot find the webserver that will accept HTTP POST requests from my cloud functions... See the image below. Anyway, I added the service account to the default and CRM IAP resources just in case, but I still get this error:
Exception: Service account does not have permission to access the IAP-protected application.
The main question is: What IAP is at stake here?? And how do I add my service account as a user of this IAP.
What am I missing?
There is a configuration parameter that causes ALL requests to the API to be denied...
In the documentation, it is mentioned that we need to override the following airflow configuration :
[api]
auth_backend = airflow.api.auth.backend.deny_all
into
[api]
auth_backend = airflow.api.auth.backend.default
This detail is really important to know, and it is not mentioned in Google's documentation...
Useful links :
Triggering DAGS (workflows) with GCS
make_iap_request.py repository
The code throwing the 403 is the way it used to work. There was a breaking change in the middle of 2020. Instead of using requests to make an HTTP call for the token, you should use Google's OAuth2 library:
from google.oauth2 import id_token
from google.auth.transport.requests import Request
open_id_connect_token = id_token.fetch_id_token(Request(), client_id)
See this example.
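For context, here is a minimal sketch of how the id_token call above could fit into a complete trigger request. It assumes the Airflow 1.10 experimental API, which as far as I know expects a JSON body on this POST (here a "conf" object). The helper names build_dag_trigger_request and trigger_dag are hypothetical, and the 90-second timeout simply mirrors the default used in make_iap_request.

```python
def build_dag_trigger_request(webserver_id, dag_name, conf=None):
    """Return the (url, json_payload) pair for a dag_runs POST on the experimental API."""
    url = f"https://{webserver_id}.appspot.com/api/experimental/dags/{dag_name}/dag_runs"
    payload = {"conf": conf or {}}
    return url, payload


def trigger_dag(webserver_id, dag_name, client_id, conf=None):
    """Trigger a DAG behind IAP using a Google-issued OIDC token (sketch)."""
    # Imported here so the helper above has no third-party dependencies.
    import requests
    from google.auth.transport.requests import Request
    from google.oauth2 import id_token

    url, payload = build_dag_trigger_request(webserver_id, dag_name, conf)
    # The audience of the token is the IAP client ID, not the webserver URL.
    token = id_token.fetch_id_token(Request(), client_id)
    resp = requests.post(
        url,
        headers={"Authorization": f"Bearer {token}"},
        json=payload,
        timeout=90,
    )
    resp.raise_for_status()  # a 403 here points at IAP or the Airflow auth backend
    return resp.text
```

Inside the cloud function, something like trigger_dag(WEBSERVER_ID, DAG_NAME, CLIENT_ID, conf={"file_name": file["name"]}) would pass the uploaded file's name to the DAG run.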
I followed the steps in Triggering DAGs and it worked in my environment; please see my recommendations below.
It is a good start that the Composer environment is up and running. Through the process you will only need to upload the new DAG (trigger_response_dag.py) and get the client ID (it ends with .apps.googleusercontent.com), either with a Python script or from the login page the first time you open the Airflow UI.
On the Cloud Functions side, I noticed you have a combination of instructions for Node.js and for Python; for example, USER_AGENT is only for Node.js, and the make_iap_request routine is only for Python. I hope the following points help to resolve your problem:
Service Account (SA). The Node.js code uses the default service account ${projectId}@appspot.gserviceaccount.com, whose default role is Editor, meaning that it has wide access to GCP services, including Cloud Composer. In Python, I think the authentication is handled through client_id, since a token is retrieved with that ID. Please ensure that the SA has the Editor role, and don't forget to assign serviceAccountTokenCreator as specified in the guide.
I used the Node.js 8 runtime, and I noticed that the user agent you are concerned about should be 'gcf-event-trigger', as it is hard-coded: USER_AGENT = 'gcf-event-trigger'. In Python, it does not seem to be necessary.
By default, in the GCS trigger, the GCS Event Type is set to Archive; you need to change it to Finalize/Create. If it is set to Archive, the trigger won't fire when you upload objects, and the DAG won't be started.
If you think your cloud function is correctly configured and an error persists, you can find its cause in your cloud function's LOGS tab in the Console. It can give you more details.
Basically, from the guide, I only had to change the following values in Node.js:
// The project that holds your function. Replace <YOUR-PROJECT-ID>
const PROJECT_ID = '<YOUR-PROJECT-ID>';
// Navigate to your webserver's login page and get this from the URL
const CLIENT_ID = '<ALPHANUMERIC>.apps.googleusercontent.com';
// This should be part of your webserver's URL in the Env's detail page: {tenant-project-id}.appspot.com.
const WEBSERVER_ID = 'v90eaaaa11113fp-tp';
// The name of the DAG you wish to trigger. It's DAG's name in the script trigger_response_dag.py you uploaded to your Env.
const DAG_NAME = 'composer_sample_trigger_response_dag';
For Python I only changed these parameters:
client_id = '<ALPHANUMERIC>.apps.googleusercontent.com'
# This should be part of your webserver's URL:
# {tenant-project-id}.appspot.com
webserver_id = 'v90eaaaa11113fp-tp'
# Change dag_name only if you are not using the example
dag_name = 'composer_sample_trigger_response_dag'
I've created a VM instance on Google Compute Engine. After uploading my project and building my image, I entered my container and authorized access to Google Cloud Platform with my service account:
gcloud auth activate-service-account test@xxx.iam.gserviceaccount.com --key-file=mykey.json
so that I can access my google cloud storage (I did a test with gsutil cp on my bucket and it works). Now I try to execute a tensorflow python script like this:
python object_detection/model_main_tf2.py \
--pipeline_config_path=/raccoon/config.config \
--model_dir=gs://my-bucket/ \
--num_train_steps=10
specifying as model_dir my bucket so that checkpoints and events are stored there (in order to monitor the progress of the training with tensorboard from my laptop).
Problem is I get the following permission error from tensorflow:
tensorflow.python.framework.errors_impl.PermissionDeniedError:
Error executing an HTTP request: HTTP response code
403 with body '{
"error": {
"code": 403,
"message": "Insufficient Permission",
"errors": [
{
"message": "Insufficient Permission",
"domain": "global",
"reason": "insufficientPermissions"
}
]
}
}
'
when initiating an upload to gs://my-bucket/train/events.out.tfevents.1601998426.2661f74c3966.450.2928.v2
Failed to flush 1 events to gs://my-bucket/train/events.out.tfevents.1601998426.2661f74c3966.450.2928.v2
Flushing first event.
Could not initialize events writer. [Op:CreateSummaryFileWriter]
The train directory exists on my bucket and as I said before the following command is working:
gsutil cp test.txt gs://my-bucket/train/.
Am I missing something?
Authenticating gcloud just ensures that future gcloud commands are authenticated. Your script (likely) doesn't use gcloud and thus isn't authenticated.
Instead, if you have service account credentials in a JSON file, you can point the GOOGLE_APPLICATION_CREDENTIALS environment variable at that file so that TensorFlow can read from and write to GCS via gs:// URLs.
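As a rough illustration, the variable can also be set from Python before TensorFlow touches GCS. This is a minimal sketch: the helper names use_service_account_key and write_probe, and the probe file name, are hypothetical, and it assumes the key file from the question (mykey.json) is present inside the container.

```python
import os


def use_service_account_key(key_path):
    """Point Google client libraries (and TensorFlow's GCS filesystem) at a key file."""
    os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = os.path.abspath(key_path)
    return os.environ["GOOGLE_APPLICATION_CREDENTIALS"]


def write_probe(bucket_dir):
    """Write a small file under a gs:// directory to check write access."""
    # Imported here so the variable is already set when TensorFlow is loaded.
    import tensorflow as tf

    probe = bucket_dir.rstrip("/") + "/write_probe.txt"
    with tf.io.gfile.GFile(probe, "w") as handle:
        handle.write("ok")
    return probe
```

For example, use_service_account_key("mykey.json") followed by write_probe("gs://my-bucket/train") should succeed if the service account can write to the bucket. For the training script itself, exporting the variable in the shell before running model_main_tf2.py has the same effect.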