GCP-Cloud Composer: Secret Manager access variable.json - python

I'm trying to configure Secret Manager for my Composer environment (ver 1.16, Airflow 1.10), but I've run into a weird situation, described below. In my Composer environment, I've been using a variable.json file to manage Variables in Airflow:
# variable.json
{
    "sleep": "False",
    "ssh_host": "my_host"
}
Then I used this article to configure Secret Manager. Following the instructions, I overrode the configuration with a secrets section:
backend: airflow.providers.google.cloud.secrets.secret_manager.CloudSecretManagerBackend
backend_kwargs: {"variables_prefix":"airflow-variables", "sep":"-", "project_id": "my-project"}
And in Secret Manager, I also created my secret: airflow-variables-secret_name.
In fact, everything works fine; I can get my secret via the following (and of course, I don't have any problem with the service account):
from airflow.models.variable import Variable
my_secret = Variable.get('secret_name')
But when I check the logs, I see that Airflow also tries to look up the other variables from my variable.json file in Secret Manager:
2022-04-13 15:49:46.590 CEST airflow-worker Google Cloud API Call Error (NotFound): Secret ID airflow-secrets-sleep not found.
2022-04-13 15:49:46.590 CEST airflow-worker Google Cloud API Call Error (NotFound): Secret ID airflow-secrets-ssh_host not found.
So how can I avoid this situation, please? Or did I misunderstand something?
Thanks!

These errors are a known behavior when you use Secret Manager as a secrets backend, but here are some workarounds:
Add a way to skip the Secret backend.
File a feature request to lower the log priority; you can use this as a template for your issue.
Create a logs exclusion in Cloud Logging (a sketch follows below).
About Logs Exclusion:
Sinks control how Cloud Logging routes logs. Sinks belong to a given Google Cloud resource: Cloud projects, billing accounts, folders, and organizations. When the resource receives a log entry, it routes the log entry according to the sinks contained by that resource. The routing behavior for each sink is controlled by configuring the inclusion filter and exclusion filters for that sink.
When you create a sink, you can set multiple exclusion filters, letting you exclude matching log entries from being routed to the sink's destination or from being ingested by Cloud Logging.
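As an illustration of the logs-exclusion workaround, here is a rough sketch using the google-cloud-logging client library; the exclusion name and the filter are illustrative, not taken from the question, and should be adjusted to match your project and log entries:
# Sketch: create a Cloud Logging exclusion that drops the noisy
# "Secret ID ... not found" entries emitted by the Airflow workers.
# Assumes google-cloud-logging >= 2.0; the exclusion name and filter are illustrative.
from google.cloud.logging_v2.services.config_service_v2 import ConfigServiceV2Client

client = ConfigServiceV2Client()

exclusion = {
    "name": "exclude-airflow-secret-not-found",  # hypothetical exclusion name
    "description": "Drop NotFound lookups from the Secret Manager backend",
    "filter": (
        'resource.type="cloud_composer_environment" '
        'AND textPayload:"Google Cloud API Call Error (NotFound): Secret ID"'
    ),
}

client.create_exclusion(
    request={"parent": "projects/my-project", "exclusion": exclusion}
)
Excluded entries are dropped at ingestion time, so they no longer appear in the Composer logs (and are not billed), while the Airflow behavior itself is unchanged.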

Related

Google Service Account + Drive API unable to set expirationTime (python)

I'm trying to use a service account and the Google Drive API to automate sharing a folder, and I want to be able to set the expirationTime property.
I've found previous threads that mention setting it in a permissions.update() call, but whatever I try I get the same generic error: 'Expiration dates cannot be set on this item.'.
I've validated that I'm passing the correct date format, because I've even shared manually from my Drive account and then used permissions.list() to get the expirationTime from the returned data.
I've also tried creating a folder in my user Drive, making my service account an editor, and then sharing that folder via the API, but I get the same problem.
Is there something that prevents a service account from being able to set this property?
To note: I haven't enabled domain-wide delegation and tried impersonating yet.
Sample code:
update_body = {
    'role': 'reader',
    'expirationTime': '2023-03-13T23:59:59.000Z'
}
driveadmin.permissions().update(
    fileId='<idhere>', permissionId='<idhere>', body=update_body).execute()
Checking the documentation for the feature, it seems that it's only available to paid Google Workspace subscriptions, as mentioned in the Google Workspace updates blog. You are most likely getting the error 'Expiration dates cannot be set on this item' because the service account is treated as a regular Gmail account, and you can see that this feature is not available for that type of account in the availability section of the update.
If you perform impersonation with your Google Workspace user, I'm pretty sure that you won't receive the error, as long as you have one of the subscriptions in which the feature is enabled. You can find more information about how to perform impersonation and domain-wide delegation here.
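For illustration, a minimal sketch of that impersonation approach with domain-wide delegation; the key file path, the user email, and the IDs are placeholders, not values from the question:
# Sketch: call the Drive API as a Workspace user via domain-wide delegation,
# then set an expiration on an existing permission.
from google.oauth2 import service_account
from googleapiclient.discovery import build

SCOPES = ['https://www.googleapis.com/auth/drive']

credentials = service_account.Credentials.from_service_account_file(
    'service-account.json', scopes=SCOPES)  # placeholder key file
# Impersonate a licensed Workspace user instead of acting as the bare service account.
delegated = credentials.with_subject('user@your-workspace-domain.com')  # placeholder user

drive = build('drive', 'v3', credentials=delegated)

drive.permissions().update(
    fileId='<idhere>',
    permissionId='<idhere>',
    body={'role': 'reader', 'expirationTime': '2023-03-13T23:59:59.000Z'},
).execute()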

Azure IoTHub - FileUpload - Remove Device Name from BlobName

I'm working on a solution to upload images from a local file system to Azure Storage.
For the moment we use a SAS token and a BlobClient, but we would like to avoid storing an expiring SAS token locally.
To do this, we thought about Azure IoT Hub, which allows us to replace this process.
# Requires: import os; from azure.storage.blob import BlobClient
def __upload_file_Azure_IoTHub(self, src_path: str, blob_path: str) -> BlobClient:
    if os.path.exists(src_path):
        # We start by asking IoT Hub for the storage information of the blob
        storage_info = self.IoTHub_client.get_storage_info_for_blob(blob_path)
        # We build the SAS URL from host + container + blob name + token
        sas_url = "https://{}/{}/{}{}".format(
            storage_info["hostName"],
            storage_info["containerName"],
            storage_info["blobName"],
            storage_info["sasToken"])
        try:
            with BlobClient.from_blob_url(sas_url) as blob_client:
                with open(src_path, "rb") as fp:
                    blob = blob_client.upload_blob(
                        fp, overwrite=True, timeout=self.config.azure_timeout)
            self.IoTHub_client.notify_blob_upload_status(
                storage_info["correlationId"], True, 200, "OK: {}".format(blob_path))
            return blob
        except Exception as ex:
            self.IoTHub_client.notify_blob_upload_status(
                storage_info["correlationId"], False, 403, "Upload Failed")
            raise Exception("AzureUpload_IoTHub") from ex
https://github.com/MicrosoftDocs/azure-docs/blob/main/articles/iot-hub/iot-hub-devguide-file-upload.md#device-initialize-a-file-upload
The problem is that when we upload this way, the device name is added as a prefix to the blobName.
This is a problem because it breaks backward compatibility:
we defined a naming convention in our storage, and this behavior will break everything.
Let's imagine:
DeviceName = FileWatcherDaemon
BlobPath = YYYY/MM/DD/MyBlob.whatever
# Then:
BlobName = FileWatcherDaemon/YYYY/MM/DD/MyBlob.whatever
# Instead of:
BlobName = YYYY/MM/DD/MyBlob.whatever
I tried replacing the blobName with my blob_path, but it does not work because the generated SAS token is blob-level and not container-level.
Do you have an idea how to remove this device name?
For us it is a problem because we will have many different devices uploading to the same storage. We would like the naming convention to fit our business needs rather than technical details.
Are you using any of the other capabilities of Azure IoT Hub besides the file upload? Do you need your device to receive messages from the cloud? For your scenario, using IoT Hub just for uploading files may be overkill.
For the moment we use a SAS token and a BlobClient, but we would like to avoid storing an expiring SAS token locally.
Have a look at Identity and access management Security Recommendations and consider using Azure Key Vault in your scenario.
Microsoft recommends using Azure AD to authorize requests to Azure Storage. However, if you must use Shared Key authorization, then secure your account keys with Azure Key Vault. You can retrieve the keys from the key vault at runtime, instead of saving them with your application. For more information about Azure Key Vault, see Azure Key Vault overview.
For more security "Azure Key Vaults may be either software-protected or, with the Azure Key Vault Premium tier, hardware-protected by hardware security modules (HSMs)."
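As a sketch of that suggestion, here is one way to fetch the storage secret from Key Vault at runtime with the azure-identity, azure-keyvault-secrets and azure-storage-blob packages; the vault URL, secret name, container and blob names are placeholders:
# Sketch: fetch the storage connection string from Azure Key Vault at runtime
# instead of keeping an expiring SAS token on disk. Names below are placeholders.
from azure.identity import DefaultAzureCredential
from azure.keyvault.secrets import SecretClient
from azure.storage.blob import BlobClient

credential = DefaultAzureCredential()
secret_client = SecretClient(
    vault_url="https://my-keyvault.vault.azure.net", credential=credential)

# The secret holds the storage account connection string (or account key).
conn_str = secret_client.get_secret("storage-connection-string").value

blob_client = BlobClient.from_connection_string(
    conn_str, container_name="images", blob_name="YYYY/MM/DD/MyBlob.whatever")
with open("local_file.jpg", "rb") as fp:
    blob_client.upload_blob(fp, overwrite=True)
With this approach the blob name is entirely under your control (no device-name prefix), and only the Key Vault URL and an Azure AD identity need to live on the device.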

Azure function and Azure Blob Storage

I have created an Azure function which is triggered when a new file is added to my Blob Storage. This part works well!
BUT, now I would like to start the Azure Speech-to-Text service using the API. So I try to build the URI pointing to my new blob and then add it to the API call. To do so, I generated a SAS token (from the Azure portal) and appended it to my new blob path.
https://myblobstorage...../my/new/blob.wav?[SAS Token generated]
By doing so I get an error which says:
Authentification failed Invalid URI
What am I missing here?
N.B.: When I generate the SAS token manually from Azure Storage Explorer, everything works well. Also, my token was not expired during my test.
Thank you for your help!
You might have generated the SAS token with the wrong allowed resource types.
Make sure the Object option is checked.
Here is the reason in the docs:
Service (s): Access to service-level APIs (e.g., Get/Set Service Properties, Get Service Stats, List Containers/Queues/Tables/Shares)
Container (c): Access to container-level APIs (e.g., Create/Delete Container, Create/Delete Queue, Create/Delete Table, Create/Delete Share, List Blobs/Files and Directories)
Object (o): Access to object-level APIs for blobs, queue messages, table entities, and files (e.g., Put Blob, Query Entity, Get Messages, Create File, etc.)
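For reference, a sketch of generating such a token programmatically with the azure-storage-blob SDK, making sure object-level access is included; the account name and key are placeholders, not values from the question:
# Sketch: generate an account SAS that includes object-level access, so a blob URL
# like https://<account>.blob.core.windows.net/<container>/<blob>.wav?<sas>
# can be read by the Speech-to-Text service. Account name/key are placeholders.
from datetime import datetime, timedelta
from azure.storage.blob import (
    generate_account_sas, ResourceTypes, AccountSasPermissions)

sas_token = generate_account_sas(
    account_name="mystorageaccount",
    account_key="<account-key>",
    resource_types=ResourceTypes(object=True),   # the "Object (o)" level described above
    permission=AccountSasPermissions(read=True),
    expiry=datetime.utcnow() + timedelta(hours=1),
)

blob_url = "https://mystorageaccount.blob.core.windows.net/container/blob.wav?" + sas_token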

How can I grant a Cloud Run service access to service account's credentials without the key file?

I'm developing a Cloud Run service that accesses different Google APIs using a service account's secrets file, with the following Python 3 code:
from google.oauth2 import service_account
credentials = service_account.Credentials.from_service_account_file(SECRETS_FILE_PATH, scopes=SCOPES)
In order to deploy it, I upload the secrets file during the build/deploy process (via gcloud builds submit and gcloud run deploy commands).
How can I avoid uploading the secrets file like this?
Edit 1:
I think it is important to note that I need to impersonate user accounts from GSuite/Workspace (with domain wide delegation). The way I deal with this is by using the above credentials followed by:
delegated_credentials = credentials.with_subject(USER_EMAIL)
Using Secret Manager might help you, as you can manage your multiple secrets there and not have them stored as files, as you are doing right now. I would recommend taking a look at this article, so you can get more information on how to use it with Cloud Run and improve the way you manage your secrets.
In addition to that, as clarified in this similar case, you have two options: use the default service account that comes with the service, or deploy another one with the Service Admin role. This way, you won't need to specify keys with variables, as clarified by a Google developer in this specific answer.
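For illustration, a minimal sketch of reading a secret at runtime with the Secret Manager Python client, so no key file ships with the container image; the project ID and secret name are placeholders:
# Sketch: read a secret at runtime instead of uploading a secrets file at deploy time.
# Project ID and secret name are placeholders.
from google.cloud import secretmanager

client = secretmanager.SecretManagerServiceClient()
name = "projects/my-project/secrets/my-api-credentials/versions/latest"
response = client.access_secret_version(request={"name": name})
payload = response.payload.data.decode("UTF-8")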
To improve security, the best way is to never use a service account key file, locally or on GCP (I wrote an article on this). To achieve this, Google Cloud services have an automatically loaded service account: either a default one or, when possible, a custom one.
On Cloud Run, the default service account is the Compute Engine default service account (I recommend never using it: it has the Editor role on the project, which is far too broad!), or you can specify the service account to use (the --service-account= parameter).
Then, in your code, simply use the ADC mechanism (Application Default Credentials) to get your credentials, like this in Python:
import google.auth
credentials, project_id = google.auth.default(scopes=SCOPES)
I've found one way to solve the problem.
First, as suggested in guillaume blaquiere's answer, I used the google.auth ADC mechanism:
import google.auth
credentials, project_id = google.auth.default(scopes=SCOPES)
However, as I need to impersonate GSuite (now Workspace) accounts, this method is not enough, because the credentials object generated this way does not have the with_subject method. This led me to this similar post and its specific answer, which shows a way to convert google.auth credentials into the Credentials object returned by service_account.Credentials.from_service_account_file (a sketch of this conversion is included after this answer). There was one problem with that solution: an authentication scope seemed to be missing.
All I had to do was add the https://www.googleapis.com/auth/cloud-platform scope in the following places:
The SCOPES variable in the code
Google Admin > Security > API Controls > Set client ID and scope for the service account I am deploying with
At the OAuth Consent Screen of my project
After that, my Cloud Run service had access to credentials that were able to impersonate users' accounts without using key files.
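For reference, here is a rough sketch of that conversion, adapted from the approach in the linked answer. It assumes the runtime service account has the Service Account Token Creator role on itself (so it can sign via the IAM Credentials API); USER_EMAIL and the delegation SCOPES are placeholders:
# Sketch: build domain-wide-delegation credentials from ADC, without a key file.
import google.auth
from google.auth import iam
from google.auth.transport import requests as auth_requests
from google.oauth2 import service_account

TOKEN_URI = "https://oauth2.googleapis.com/token"
SCOPES = ["https://www.googleapis.com/auth/admin.directory.user.readonly"]  # example scope
USER_EMAIL = "user@your-workspace-domain.com"  # placeholder Workspace user

source_credentials, project_id = google.auth.default(
    scopes=["https://www.googleapis.com/auth/cloud-platform"])
# Refresh so the metadata server resolves the actual service account email.
source_credentials.refresh(auth_requests.Request())

signer = iam.Signer(
    auth_requests.Request(),
    source_credentials,
    source_credentials.service_account_email,
)

delegated_credentials = service_account.Credentials(
    signer=signer,
    service_account_email=source_credentials.service_account_email,
    token_uri=TOKEN_URI,
    scopes=SCOPES,
    subject=USER_EMAIL,
)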

Custom service account for App Engine service

I have multiple micro-services running on our project's GCP (Google Cloud Platform) App Engine. In our case, it would be preferable to minimize permissions on a per-service basis. Currently, we keep all the credential files in Keybase, keeping secrets sectioned off by team membership. So if I am working on one App Engine service, I can't see the secrets for another team's App Engine service.
For other secrets, e.g. config files with passwords and secret tokens, we give the App Engine service account KMS decryption privileges and simply pull an encrypted copy of the config file from Firestore and decrypt it. But we don't want to use the default App Engine service account everywhere, because different teams using the same service account would have access to everyone's secrets. So we want to move to a per-service service account for each App Engine service in development.
From what I can tell from the Google documentation, they want you to upload the credential file(s) when we deploy the app, which works; however, from the Cloud Console it appears difficult to lock down who can look at the files deployed to the service, and anyone with access could simply copy/paste all the credentials.
If you have the config in a dictionary, you can do something like this:
from google.oauth2 import service_account
from google.cloud import kms_v1

d = {'type': 'service_account',
     'project_id': 'my-awesome-project',
     'private_key_id': '074139282fe9834ac23401',
     'private_key': '-----BEGIN PRIVATE KEY-----\n supersecretkeythatnobodyknows==\n-----END PRIVATE KEY-----\n',
     'client_email': 'my-cool-address@my-awesome-project.iam.gserviceaccount.com',
     'client_id': '1234567890',
     'auth_uri': 'https://accounts.google.com/o/oauth2/auth',
     'token_uri': 'https://oauth2.googleapis.com/token',
     'auth_provider_x509_cert_url': 'https://www.googleapis.com/oauth2/v1/certs',
     'client_x509_cert_url': 'https://www.googleapis.com/robot/v1/metadata/x509/my-cool-address%40my-awesome-project.iam.gserviceaccount.com'}

credentials = service_account.Credentials.from_service_account_info(d)
kms_client = kms_v1.KeyManagementServiceClient(credentials=credentials)
This works, but how do we get the dictionary "d" into the program without it showing up in the code or being accessible to a broad range of people?
If each service in your App Engine environment has to have its own identity, that's not possible with App Engine. Newer services like Cloud Functions or Cloud Run can do this, but App Engine, a very old service in cloud terms (more than 10 years old), can't.
You have one service account for all App Engine services. You can use it to decrypt the other service account key files, one per service, and use each of them in the appropriate service, but the root authorization stays with the default App Engine service account, and thereby all services/all teams have access to it (and to all the encrypted service account key files). A sketch of this decrypt-and-load approach is included after this answer.
Maybe the solution is to redesign the application and have a project per team?
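For illustration, a rough sketch of that decrypt-and-load approach, assuming a recent google-cloud-kms client (the request-dict call style); the key ring, key name, and the ciphertext file are placeholders:
# Sketch: the default App Engine service account decrypts an encrypted key file
# with Cloud KMS and builds per-service credentials from the decrypted JSON.
# Key ring/key names and the ciphertext source are placeholders.
import json
from google.cloud import kms_v1
from google.oauth2 import service_account

kms_client = kms_v1.KeyManagementServiceClient()
key_name = kms_client.crypto_key_path(
    "my-awesome-project", "global", "my-keyring", "my-key")

# The ciphertext itself is stored outside the code base (the question pulls it
# from Firestore); here it is read from a placeholder file for brevity.
with open("encrypted_service_account.bin", "rb") as f:
    encrypted_key_blob = f.read()

response = kms_client.decrypt(
    request={"name": key_name, "ciphertext": encrypted_key_blob})

info = json.loads(response.plaintext.decode("utf-8"))
credentials = service_account.Credentials.from_service_account_info(info)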
What guillaume blaquiere said in his answer is correct, and I also agree with his idea of having a project per team. You cannot assign a different service account to each of your App Engine services.
Something close to what you want may be possible in the App Engine flexible environment, though. You cannot restrict one service to one team and another service to another team, but what you can do (I do not think you want this, and I would not recommend it either) is block certain IPs by controlling access with firewalls. You can create your own firewall rules and allow requests from one place while denying them from another.
This is just a trick that may or may not work for you. The real advice is that if you really want to implement the system you have described cleanly and correctly, with more service accounts, you should consider migrating to Cloud Run.
