JSONDecodeError when using Airflow GoogleBaseHook in Python

I followed the official documentation to run Airflow in Docker. In my local Airflow webserver (localhost:8080), I created a Google Cloud connection by pasting the contents of my Google credential.json into the Keyfile JSON field and adding 2 scopes.
My credential looks like:
{
"type": "service_account",
"project_id": "xxxx-tech",
"private_key_id": "xxxxx",
"private_key": "-----BEGIN PRIVATE KEY-----\nXXXXX=\n-----END PRIVATE KEY-----\n",
"client_email": "xxx#tech.iam.gserviceaccount.com",
"client_id": "10xxxx17",
"auth_uri": "https://urldefense.proofpoint.com/v2/url?u=https-3xxxxx= ",
"token_uri": "https://urldefense.proofpoint.com/v2/url?u=https-3xxxxx ",
"auth_provider_x509_cert_url": "https://urldefense.proofpoint.com/v2/url?u=https-3xxxxxx= ",
"client_x509_cert_url": "https://urldefense.proofpoint.com/v2/url?u=https-3xxxxxx= "
}
I want to use Airflow's hook to get this credential for reading data from a Google Sheet.
In my Python code, I tried:
from airflow.providers.google.common.hooks.base_google import GoogleBaseHook
from googleapiclient.discovery import build
gcp_hook = GoogleBaseHook(gcp_conn_id="spx-service-account")
creds = gcp_hook._get_credentials()
service = build('sheets', 'v4', credentials=creds, cache_discovery=False)
sheet = service.spreadsheets()
result = sheet.values().get(spreadsheetId="1yN3atY6NG7PfY8yNcNAiVqURra8WtQJWCKXc-ccymk0", range="Sheet1!A1:B4").execute()['values']
But running this Python code in a terminal shows json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
I don't know why it raises JSONDecodeError. I also tried the following code, and printing cred_dict in the terminal shows exactly the same content as my credential.json:
import json
gcp_hook = GoogleBaseHook(gcp_conn_id="spx-service-account")
cred_dict = json.loads(gcp_hook._get_field('keyfile_dict'))
From the test above, I guess it may not be a JSON decode issue, but I still cannot use GoogleBaseHook in my Python code. How can I set up the Airflow connection correctly?
Thank you in advance!

Oops, it turned out that something was wrong with my credential.json. I have no idea why the URLs in it (token_uri and the others) had been rewritten to urldefense.proofpoint.com links by the time I downloaded the file from my Gmail (the tech team sent it to us via email) and opened it on my Windows laptop. If your credential is correct, the code below should work:
from airflow.providers.google.common.hooks.base_google import GoogleBaseHook
from googleapiclient.discovery import build
gcp_hook = GoogleBaseHook(gcp_conn_id="spx-service-account")
creds = gcp_hook._get_credentials()
service = build('sheets', 'v4', credentials=creds, cache_discovery=False)
sheet = service.spreadsheets()
result = sheet.values().get(spreadsheetId="1yN3atY6NG7PfY8yNcNAiVqURra8WtQJWCKXc-ccymk0", range="Sheet1!A1:B4").execute()['values']
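A quick way to catch this kind of damage before pasting a key into Airflow is to check that every URL field still points at a Google host. The sketch below is only an illustration: the function name find_suspicious_urls and the list of trusted domains are assumptions, not part of Airflow or google-auth. A plausible reason for the JSONDecodeError is that the token request went to the rewritten token_uri and got back something other than JSON, but the answer above does not confirm that.

```python
import json
from urllib.parse import urlparse

# URL fields found in a service-account key file (taken from the key shown above).
URL_FIELDS = (
    "auth_uri",
    "token_uri",
    "auth_provider_x509_cert_url",
    "client_x509_cert_url",
)

# Assumption: a genuine key only points at hosts under these domains.
TRUSTED_SUFFIXES = ("google.com", "googleapis.com")


def find_suspicious_urls(keyfile_text):
    """Return {field: url} for URL fields whose host is not a Google domain.

    Missing fields are reported too (with an empty string as the URL).
    """
    key = json.loads(keyfile_text)
    suspicious = {}
    for field in URL_FIELDS:
        url = key.get(field, "")
        host = urlparse(url.strip()).hostname or ""
        trusted = any(host == s or host.endswith("." + s) for s in TRUSTED_SUFFIXES)
        if not trusted:
            suspicious[field] = url
    return suspicious
```

Calling find_suspicious_urls(open("credential.json").read()) on the key from the question would report all four URL fields, since each of them points at urldefense.proofpoint.com.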

Related

I keep getting authentication errors with the Google API

I've been trying to write a script that updates a YouTube video's title with its view count, like the Tom Scott video. But I keep getting this error with the authentication.
I get the code to authenticate, but the request still fails as if I don't have enough permissions.
<HttpError 400 when requesting https://youtube.googleapis.com/youtube/v3/videos?part=%28%27snippet%2Cstatus%2Clocalizations%27%2C%29&alt=json returned "'('snippet'". Details: "[{'message': "'('snippet'", 'domain': 'youtube.part', 'reason': 'unknownPart', 'location': 'part', 'locationType': 'parameter'}]"> File "C:\Users\erik\Documents\PYTHON CODE\view_changer\example.py", line 49, in main response_update = request_update.execute().
Here's my code:
import os
import google_auth_oauthlib.flow
import googleapiclient.discovery
import googleapiclient.errors
scopes = ["https://www.googleapis.com/auth/youtube"]
api_key = "..."
video_1_part_view = 'statistics'
video_1_id_view = 'hVuoPgZJ3Yw'
video_1_part_update= "snippet,status,localizations",
video_1_part_body_update= { "id": "1y94SadSsMU",
"localizations": {"es": { "title": "no hay nada a ver aqui",
"description": "Esta descripcion es en español."
}
}
}
def main():
# Disable OAuthlib's HTTPS verification when running locally.
# *DO NOT* leave this option enabled in production.
os.environ["OAUTHLIB_INSECURE_TRANSPORT"] = "1"
api_service_name = "youtube"
api_version = "v3"
client_secrets_file = (r"C:\Users\erik\Documents\PYTHON CODE\view_changer\client_secret_379587650859-bbl63je7sh6kva19l33auonkledt09a9.apps.googleusercontent.com.json")
# Get credentials and create an API client
flow = google_auth_oauthlib.flow.InstalledAppFlow.from_client_secrets_file(client_secrets_file, scopes)
credentials = flow.run_console()
youtube = googleapiclient.discovery.build(api_service_name, api_version, credentials=credentials)
#check views
request_view= youtube.videos().list( part = video_1_part_view, id = video_1_id_view )
respons_view = request_view.execute()
views = respons_view['items'][0]['statistics']['viewCount']
print (views)
#update video
request_update = youtube.videos().update(part= video_1_part_update, body = video_1_part_body_update)
response_update = request_update.execute()
print(response_update)
main()
# if __name__ == "__main__":
# main()
If the problem is not in the code, I have another idea of what it might be. When I was first setting up OAuth, I didn't know I needed to pick what the user would be granting permission for (a rookie mistake, I know). Since I updated that, when I follow the authentication link the code gives me, the consent screen doesn't ask for permission for all the things I configured.
According to the official documentation, the request parameter part of the Videos.list API endpoint is defined as:
part (string)
The part parameter specifies a comma-separated list of one or more video resource properties that the API response will include.
[...]
But you're having that request parameter set as:
('snippet,status,localizations',).
That's because (though your code above doesn't show it) you have the variable video_1_part_view defined as:
video_1_part_view = "snippet,status,localizations",
(Do notice the trailing comma). That makes video_1_part_view a Python tuple, which is not what you need to send to the API endpoint.
The same is true for your variable video_1_part_update.
Just remove the trailing commas and things will work OK.
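The effect of the trailing comma can be reproduced without calling the API at all. In the short sketch below, urllib.parse.urlencode stands in for whatever the client library does when it serializes the part parameter (that substitution is an assumption), yet the encoded value comes out identical to the one in the error URL above.

```python
from urllib.parse import urlencode

part_with_comma = "snippet,status,localizations",  # trailing comma: a 1-element tuple
part_plain = "snippet,status,localizations"        # no trailing comma: a string

print(type(part_with_comma).__name__)  # tuple
print(type(part_plain).__name__)       # str

# The tuple is converted with str() before encoding, so the parentheses,
# quotes and comma of its repr end up in the query string.
bad_query = urlencode({"part": part_with_comma})
good_query = urlencode({"part": part_plain})

print(bad_query)   # part=%28%27snippet%2Cstatus%2Clocalizations%27%2C%29
print(good_query)  # part=snippet%2Cstatus%2Clocalizations
```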

Firebase storage in free plan: Anonymous caller does not have storage.objects.get access

I am using a free plan for firebase storage. When I want to store a picture from Python, it works fine:
import os
import firebase_admin
from firebase_admin import credentials, storage
file_extension = os.path.splitext(request.files["input_image"].filename)[1] # ex: jpg
cred = credentials.Certificate(os.getenv("FIREBASE_CREDS"))
firebase_admin.initialize_app(cred)
bucket = storage.bucket(os.getenv("FIREBASE_BUCKET"))
# ex: profile_pictures/user1.jpg
blob = bucket.blob(f"profile_pictures/{current_user.username}{file_extension}")
blob.upload_from_string(request.files["input_image"].filename)
profile_picture_url = blob.public_url
However, when I open the picture's public URL, I get an error page ("Anonymous caller does not have storage.objects.get access") instead of the image.
The link is as follows:
https://storage.googleapis.com/bucket_name/folder_name/filename
Note that my rule policy is the following:
rules_version = '2';
service firebase.storage {
match /b/{bucket}/o {
match /{allPaths=**} {
allow read, write;
}
}
}
The problem was the following: it looks like when a file is uploaded from the Python SDK, other clients are not allowed to access its URL.
The good news is that there is a very easy fix. Just add this line of code after uploading the file:
blob.make_public()
That's it.
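Put together, the upload and the fix might look like the helper below. This is a minimal sketch: the name upload_public is hypothetical, while blob(), upload_from_string(), make_public() and public_url are the google-cloud-storage calls already used in this thread.

```python
def upload_public(bucket, path, data, content_type):
    """Upload `data` to `path` in `bucket`, make it public, return its URL."""
    blob = bucket.blob(path)
    blob.upload_from_string(data, content_type=content_type)
    # Grant read access to anonymous callers; without this the public URL
    # is rejected with the storage.objects.get error from the question.
    blob.make_public()
    return blob.public_url
```

In the handler from the question this could be called as upload_public(bucket, f"profile_pictures/{current_user.username}{file_extension}", request.files["input_image"].read(), request.files["input_image"].mimetype). Note that the original snippet passes .filename to upload_from_string, which stores the file name as text rather than the image bytes.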

Creating VMs from Instance Template on Cloud Function via API call

The code I've written seems to be what I need, but it doesn't work: I get a 401 (authentication) error. I've tried everything I could think of: 1. Service account permissions. 2. Creating a secret ID and key (though I'm not sure how to use those to get an access token). 3. Basically, everything else I could come up with over the past 2 days.
import requests
from google.oauth2 import service_account
METADATA_URL = 'http://metadata.google.internal/computeMetadata/v1/'
METADATA_HEADERS = {'Metadata-Flavor': 'Google'}
SERVICE_ACCOUNT = [NAME-OF-SERVICE-ACCOUNT-USED-WITH-CLOUD-FUNCTION-WHICH-HAS-COMPUTE-ADMIN-PRIVILEGES]
def get_access_token():
url = '{}instance/service-accounts/{}/token'.format(
METADATA_URL, SERVICE_ACCOUNT)
# Request an access token from the metadata server.
r = requests.get(url, headers=METADATA_HEADERS)
r.raise_for_status()
# Extract the access token from the response.
access_token = r.json()['access_token']
return access_token
def start_vms(request):
request_json = request.get_json(silent=True)
request_args = request.args
if request_json and 'number_of_instances_to_create' in request_json:
number_of_instances_to_create = request_json['number_of_instances_to_create']
elif request_args and 'number_of_instances_to_create' in request_args:
number_of_instances_to_create = request_args['number_of_instances_to_create']
else:
number_of_instances_to_create = 0
access_token = get_access_token()
address = "https://www.googleapis.com/compute/v1/projects/[MY-PROJECT]/zones/europe-west2-b/instances?sourceInstanceTemplate=https://www.googleapis.com/compute/v1/projects/[MY-PROJECT]/global/instanceTemplates/[MY-INSTANCE-TEMPLATE]"
headers = {'token': '{}'.format(access_token)}
for i in range(1,number_of_instances_to_create):
data = {'name': 'my-instance-{}'.format(i)}
r = requests.post(address, data=data, headers=headers)
r.raise_for_status()
print("my-instance-{} created".format(i))
Any advice or guidance? In particular, could someone tell me how to get an access token using a secret ID and key? Also, I'm not sure OAuth 2.0 will work, because I essentially want to turn these machines on, have them do some processing, and then self-destruct, so there is no user involved to grant access. If OAuth 2.0 is the wrong way to go about it, what else can I use?
I tried using gcloud, but running gcloud commands through subprocess isn't recommended.
I did something similar, though I used the Node 10 Firebase Functions runtime; it should be very similar nevertheless.
I agree that OAuth is not the correct solution since there is no user involved.
What you need is 'Application Default Credentials' (ADC), which is based on the permissions available to your Cloud Functions' default service account. That is the one labelled "App Engine default service account" here:
https://console.cloud.google.com/iam-admin/serviceaccounts?folder=&organizationId=&project=[YOUR_PROJECT_ID]
(For my project that service account already had the permissions necessary for starting and stopping GCE instances, but for other APIs I have had to grant it permissions manually.)
ADC is for server-to-server API calls. To use it I called google.auth.getClient (of the Google APIs Auth Library) with just the scope, i.e. "https://www.googleapis.com/auth/cloud-platform".
This API is versatile in that it returns whatever credentials fit the environment: when I am running on Cloud Functions it returns a 'Compute' object, and when I'm running in the emulator it gives me a "UserRefreshClient" object.
I then include that auth object in my calls to compute.instances.insert() and compute.instances.stop().
Here is the template I used for testing my code:
{
name: 'base',
description: 'Temporary instance used for testing.',
tags: { items: [ 'test' ] },
machineType: `zones/${zone}/machineTypes/n1-standard-1`,
disks: [
{
autoDelete: true, // you will want this!
boot: true,
type: 'PERSISTENT',
initializeParams: {
diskSizeGb: '10',
sourceImage: "projects/ubuntu-os-cloud/global/images/ubuntu-minimal-1804-bionic-v20190628",
}
}
],
networkInterfaces: [
{
network: `https://www.googleapis.com/compute/v1/projects/${projectId}/global/networks/default`,
accessConfigs: [
{
name: 'External NAT',
type: 'ONE_TO_ONE_NAT'
}
]
}
],
}
Hope that helps.
If you’re getting a 401 error that means that the access token you're using is either expired or invalid.
This guide will be able to show you how to request OAuth 2.0 access tokens and make API calls using a Service Account: https://developers.google.com/identity/protocols/OAuth2ServiceAccount
The .json file mentioned is the private key you create in IAM & Admin under your service account.
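Independently of how the token is obtained, Google APIs expect it in an Authorization: Bearer header, whereas the code in the question sends it in a header named token, which is a likely contributor to the 401. The sketch below shows how the request pieces could be assembled; the helper names build_insert_request and instance_names are hypothetical, and the URL layout is copied from the question rather than verified against a live project.

```python
import json


def build_insert_request(project, zone, template, name, access_token):
    """Return (url, headers, body) for creating one instance from a template."""
    base = "https://www.googleapis.com/compute/v1/projects/{}".format(project)
    template_url = "{}/global/instanceTemplates/{}".format(base, template)
    url = "{}/zones/{}/instances?sourceInstanceTemplate={}".format(
        base, zone, template_url)
    headers = {
        # The access token goes in a Bearer Authorization header.
        "Authorization": "Bearer {}".format(access_token),
        "Content-Type": "application/json",
    }
    # The request body is JSON, not form-encoded data.
    body = json.dumps({"name": name})
    return url, headers, body


def instance_names(count, prefix="my-instance"):
    """Names for `count` instances; range(1, count) would yield one too few."""
    return ["{}-{}".format(prefix, i) for i in range(1, count + 1)]
```

Each request could then be sent with requests.post(url, data=body, headers=headers). If number_of_instances_to_create arrives through request.args it is a string, so it would need int() before being passed to instance_names.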

oauth2client is now deprecated

In the Python code for requesting data from Google Analytics via an API ( https://developers.google.com/analytics/devguides/reporting/core/v4/quickstart/service-py ), oauth2client is used. The code was last updated in July 2018, and oauth2client has since been deprecated. My question: can I get the same code with google-auth or oauthlib in place of oauth2client?
I searched for a way to replace the parts of the code that use oauth2client, but since I am not a developer I didn't succeed. This is how I tried to adapt the code in this link ( https://developers.google.com/analytics/devguides/reporting/core/v4/quickstart/service-py ) to google-auth. Any idea how to fix this?
import argparse
from apiclient.discovery import build
from google.oauth2 import service_account
from google.auth.transport.urllib3 import AuthorizedHttp
SCOPES = ['...']
DISCOVERY_URI = ('...')
CLIENT_SECRETS_PATH = 'client_secrets.json' # Path to client_secrets.json file.
VIEW_ID = '...'
def initialize_analyticsreporting():
"""Initializes the analyticsreporting service object.
Returns:
analytics an authorized analyticsreporting service object.
"""
# Parse command-line arguments.
credentials = service_account.Credentials.from_service_account_file(CLIENT_SECRETS_PATH)
# Prepare credentials, and authorize HTTP object with them.
# If the credentials don't exist or are invalid run through the native client
# flow. The Storage object will ensure that if successful the good
# credentials will get written back to a file.
authed_http = AuthorizedHttp(credentials)
response = authed_http.request(
'GET', SCOPES)
# Build the service object.
analytics = build('analytics', 'v4', http=http, discoveryServiceUrl=DISCOVERY_URI)
return analytics
def get_report(analytics):
# Use the Analytics Service Object to query the Analytics Reporting API V4.
return analytics.reports().batchGet(
body=
{
"reportRequests":[
{
"viewId":VIEW_ID,
"dateRanges":[
{
"startDate":"2019-01-01",
"endDate":"yesterday"
}],
"dimensions":[
{
"name":"ga:transactionId"
},
{
"name":"ga:sourceMedium"
},
{
"name":"ga:date"
}],
"metrics":[
{
"expression":"ga:transactionRevenue"
}]
}]
}
).execute()
def printResults(response):
for report in response.get("reports", []):
columnHeader = report.get("columnHeader", {})
dimensionHeaders = columnHeader.get("dimensions", [])
metricHeaders = columnHeader.get("metricHeader", {}).get("metricHeaderEntries", [])
rows = report.get("data", {}).get("rows", [])
for row in rows:
dimensions = row.get("dimensions", [])
dateRangeValues = row.get("metrics", [])
for header, dimension in zip(dimensionHeaders, dimensions):
print (header + ": " + dimension)
for i, values in enumerate(dateRangeValues):
for metric, value in zip(metricHeaders, values.get("values")):
print (metric.get("name") + ": " + value)
def main():
analytics = initialize_analyticsreporting()
response = get_report(analytics)
printResults(response)
if __name__ == '__main__':
main()
I need to obtain the response as JSON with the given dimensions and metrics from Google Analytics.
For those running into this problem who wish to port to the newer auth libraries: do a diff between the 2 versions of the short Google Drive API sample in the code repo for the G Suite APIs intro codelab to see what needs to be updated (and what can stay as-is). The bottom line is that the API client library code can remain the same; all you do is swap out the auth libraries underneath.
Note that the sample covers only user account auth. For service account auth the update is similar, but I don't have an example of that yet (I'm working on one and will update this once it's published).
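For the service account case, a port to google-auth might look like the sketch below. It is not the answerer's promised example, only an illustration that assumes the read-only Analytics scope and a key file path supplied by the caller; build_report_body is a hypothetical helper that assembles the same kind of request body as the question. Credentials.from_service_account_file and build(..., credentials=...) are existing google-auth and google-api-python-client calls.

```python
# Assumption: read-only access is enough for reporting.
SCOPES = ["https://www.googleapis.com/auth/analytics.readonly"]


def initialize_analyticsreporting(key_file):
    """Build an Analytics Reporting API v4 client from a service-account key."""
    # Imported inside the function so the request helper below can be used
    # without the Google client libraries installed.
    from google.oauth2 import service_account
    from googleapiclient.discovery import build

    credentials = service_account.Credentials.from_service_account_file(
        key_file, scopes=SCOPES)
    # The credentials are passed straight to build(); no http object is needed.
    return build("analyticsreporting", "v4", credentials=credentials)


def build_report_body(view_id, start_date, end_date, dimensions, metrics):
    """Assemble a batchGet request body with a single report request."""
    return {
        "reportRequests": [{
            "viewId": view_id,
            "dateRanges": [{"startDate": start_date, "endDate": end_date}],
            "dimensions": [{"name": d} for d in dimensions],
            "metrics": [{"expression": m} for m in metrics],
        }]
    }
```

With these, analytics.reports().batchGet(body=build_report_body(...)).execute() returns a Python dict, which json.dumps can turn into the JSON text the question asks for.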

Google Firebase: Get, update or create documents using Python

I am not able to get, update or create the documents in Google Firebase (Cloud Firestore) database using Python.
What I have:
A) The database with a collection and documents (inserted manually on the web):
B) A credential JSON file saved as test.json (often called path/to/serviceKey.json in the documentation), which looks like this (redacted):
{
"type": "service_account",
"project_id": "test-6f02d",
"private_key_id": "fffca ... 5b7",
"private_key": "-----BEGIN PRIVATE KEY-----\n ... 1IHE=\n-----END PRIVATE KEY-----\n",
"client_email": "test-admin#test-6f02d.iam.gserviceaccount.com",
"client_id": "112 ... 060",
"auth_uri": "https://accounts.google.com/o/oauth2/auth",
"token_uri": "https://oauth2.googleapis.com/token",
"auth_provider_x509_cert_url": "https://www.googleapis.com/oauth2/v1/certs",
"client_x509_cert_url": "https://www.googleapis.com/robot/ ... .gserviceaccount.com"
}
This service account has the Owner role.
C) firebase_admin installed (using virtualenv, pip), I can do:
import firebase_admin
from firebase_admin import credentials, firestore
databaseURL = {'databaseURL': "https://test-6f02d.firebaseio.com"}
cred = credentials.Certificate("test.json")
firebase_admin.initialize_app(cred, databaseURL)
<firebase_admin.App object at 0x7f20056534e0>
The following is working:
db = firestore.client()
for k in db.collection('items').get():
print(k)
I am getting the 3 documents, I can access the id of the documents
<google.cloud.firestore_v1beta1.document.DocumentSnapshot object at 0x7f2003bebc18>
<google.cloud.firestore_v1beta1.document.DocumentSnapshot object at 0x7f2003bebdd8>
<google.cloud.firestore_v1beta1.document.DocumentSnapshot object at 0x7f2003bebcf8>
print(k.id)
a3BxcpWpavHmuz6DpZH3
However, that is as far as I can get.
1) I do not know how to access the values of the document. Something like this:
from firebase_admin import db
ref = db.reference('items')
print(ref)
<firebase_admin.db.Reference object at 0x7f20013b2828>
# GET?
ref.get()
# empty
2) I do not know how to access the values directly (e.g., using browser or requests), something like:
https://test-6f02d.firebaseio.com/items.json
returns
{
"error" : "Permission denied"
}
3) I do not know how to update an existing document or create a new one in the collection items.
# UPDATE?
# PUSH?
I tried to follow this blog and the documentation (but it does not have examples) and several answers here on SO, but without any success.
Thanks in advance.
Another night and I can answer my own question (thanks to Doug for the hint in the discussion):
The problem for me was that there are two similar sets of documentation (the 2nd one covers more than just Python). I found the first one more helpful, but sometimes I needed parts of the second one, too:
https://googleapis.github.io/google-cloud-python/latest/firestore/index.html (particularly the API Reference)
https://firebase.google.com/docs/firestore/
1) Accessing the documents:
import firebase_admin
from firebase_admin import credentials, firestore
databaseURL = {
'databaseURL': "https://test-6f02d.firebaseio.com"
}
cred = credentials.Certificate("test.json")
firebase_admin.initialize_app(cred, databaseURL)
database = firestore.client()
col_ref = database.collection('items') # col_ref is CollectionReference
results = col_ref.where('name', '==', 'Pepa').get() # one way to query
results = col_ref.order_by('date',direction='DESCENDING').limit(1).get() # another way - get the last document by date
for item in results:
print(item.to_dict())
print(item.id)
# item is DocumentSnapshot
# note: the documentation says get() is deprecated in favour of stream(), however stream() did not work for me
2) I still do not know, but I do not need it since 1) works OK.
3) Update or create document:
# Continuing from 1)
# Update:
doc = col_ref.document(item.id) # doc is DocumentReference
field_updates = {"description": "Updated description"}
doc.update(field_updates)
# Create:
import datetime
new_values = {
"name": "Newbie",
"description": "Shiny New Document",
"date": datetime.datetime.now()
}
col_ref.document().create(new_values)
import firebase_admin
from firebase_admin import credentials
from firebase_admin import firestore
cred = credentials.Certificate("serviceAccountKey.json")
firebase_admin.initialize_app(cred)
db = firestore.client()
docs = db.collection("persons").where("Province","==",cprovince.get()).get()
for doc in docs:
print(doc.to_dict())
pip install --upgrade google-cloud-firestore
Then use
# Import
from google.cloud import firestore
# Create your Firestore client
firebase = firestore.Client(project="your project name")
# Define the collection you're working in
collection = firebase.collection("myCollection")
# Filter for docs, get their ref's, grab the first, and convert it back to a dict
collection.where(...).get()[0].to_dict()
# Write a doc with an explicit ID (a DocumentReference has set(), not add())
doc_explicit_ref = collection.document("my unique doc ID")
doc_explicit_ref.set({"my data":"is here"})
# Add an implicit doc (auto generated ID)
collection.add({"my cool":"data"})
This follows the same convention as pretty much any other Python library that GCP has: ensure your GOOGLE_APPLICATION_CREDENTIALS environment variable points to your service account JSON file.
