I am using the Python Google Cloud Storage client, but with a bucket that has public read/write access. (I know this is usually a terrible idea, but I have a rare use case where it is fine.)
When I try to retrieve some files, I get a DefaultCredentialsError.
from google.cloud import storage

BUCKET_NAME = 'my-public-bucket-name'

storage_client = storage.Client()
bucket = storage_client.get_bucket(BUCKET_NAME)

def list_blobs(prefix, delimiter=None):
    blobs = bucket.list_blobs(prefix=prefix, delimiter=delimiter)
    print('Blobs:')
    for blob in blobs:
        print(blob.name)
The specific error reads:
google.auth.exceptions.DefaultCredentialsError: Could not automatically determine credentials. Please set GOOGLE_APPLICATION_CREDENTIALS or explicitly create credentials and re-run the application. For more information, please see https://cloud.google.com/docs/authentication/getting-started
That page suggests using OAuth or other tokens, but I shouldn't need these since my bucket is public? I can make an HTTP request to the bucket in Chrome and receive data.
How should I get around this issue? Can I provide default or null credentials?
A storage client constructed with no parameters defaults to the environment's credentials (e.g. you authenticate with the gcloud tools first). If you want a client with no credentials, you have to use
the create_anonymous_client method, which lets you access resources available to allUsers.
Be careful which APIs you call, though; not all of them support anonymous credentials. E.g. instead of client.get_bucket('my-bucket') you have to use client.bucket(bucket_name='my-bucket').
Also note that any permissions error seems to come back as a generic ValueError: Anonymous credentials cannot be refreshed. (for example, if you try to overwrite an existing file while only having read/write permissions).
So a full example of uploading a file to a publicly accessible bucket is:
from google.cloud import storage
client = storage.Client.create_anonymous_client()
bucket = client.bucket(bucket_name='my-public-bucket')
blob = bucket.blob('my-file')
blob.upload_from_filename('my-local-file')
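And since the question is about retrieving files, a similar sketch for listing and downloading anonymously (bucket, prefix and object names below are just placeholders):

from google.cloud import storage

client = storage.Client.create_anonymous_client()
bucket = client.bucket(bucket_name='my-public-bucket')  # not get_bucket()

# List objects under a prefix (works if allUsers has object read access)
for blob in bucket.list_blobs(prefix='some/prefix'):
    print(blob.name)

# Download a single object
bucket.blob('some/prefix/file.csv').download_to_filename('file.csv')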
From "Cloud Storage Authentication":
Most of the operations you perform in Cloud Storage must be authenticated. The only exceptions are operations on objects that allow anonymous access. Objects are anonymously accessible if the allUsers group has READ permission. The allUsers group includes anyone on the Internet.
I am using the Python SDK to copy blobs from one container to another. Here is the code:
from azure.storage.blob import BlobServiceClient
src_blob = '{0}/{1}'.format(src_url,blob_name)
destination_client = BlobServiceClient.from_connection_string(connectionstring)
copied_blob = destination_client.get_blob_client(dst_container,b_name)
copied_blob.start_copy_from_url(src_blob)
It throws the error below:
Content: <?xml version="1.0" encoding="utf-8"?><Error><Code>CannotVerifyCopySource</Code><Message>Public access is not permitted on this storage account.
I have already gone through this post here, and in my case public access is disabled.
I do not have sufficient privileges to enable public access on the storage account and test. Is there a workaround to accomplish the copy without changing that setting?
Azcopy 409 Public access is not permitted on this storage account
Do I need to change the way I connect to the account?
When copying a blob across storage accounts, the source blob must be accessible to the Azure Storage service. You were getting the error because you were using just the blob's URL: if the blob is in a private container, Azure Storage cannot read it from the URL alone.
To fix this issue, you would need to generate a SAS token on the source blob with at least Read permission and use that SAS URL as copy source.
So your code would be something like:
src_blob_sas_token = generate_sas_token_somehow()
src_blob = '{0}/{1}?{2}'.format(src_url,blob_name, src_blob_sas_token)
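If you have access to the source account key, a minimal sketch of generating that token with azure-storage-blob v12 might look like this (the account name, container name and key below are placeholders):

from datetime import datetime, timedelta
from azure.storage.blob import generate_blob_sas, BlobSasPermissions

# Placeholder source account/container names and key
src_blob_sas_token = generate_blob_sas(
    account_name='srcaccount',
    container_name='src-container',
    blob_name=blob_name,
    account_key='<source-account-key>',
    permission=BlobSasPermissions(read=True),
    expiry=datetime.utcnow() + timedelta(hours=1),
)
src_blob = '{0}/{1}?{2}'.format(src_url, blob_name, src_blob_sas_token)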
Check the privileges of your SAS token.
In your example, it doesn't look like you are passing a SAS token at all.
I'm working on a solution to upload images from a local file system to Azure Storage.
For the moment we use a SAS token and a BlobClient, but we would like to avoid storing an expiring SAS token locally.
To do this, we thought about Azure IoT Hub, which allows us to replace this process.
def __upload_file_Azure_IoTHub(self, src_path: str, blob_path: str) -> BlobClient:
    if os.path.exists(src_path):
        # We start by getting the blob storage info from Azure IoT Hub
        storage_info = self.IoTHub_client.get_storage_info_for_blob(blob_path)
        # We build the SAS URL from host + container + blob name + token
        sas_url = "https://{}/{}/{}{}".format(
            storage_info["hostName"],
            storage_info["containerName"],
            storage_info["blobName"],
            storage_info["sasToken"])
        try:
            with BlobClient.from_blob_url(sas_url) as blob_client:
                with open(src_path, "rb") as fp:
                    blob = blob_client.upload_blob(fp, overwrite=True, timeout=self.config.azure_timeout)
            self.IoTHub_client.notify_blob_upload_status(storage_info["correlationId"], True, 200, "OK: {}".format(blob_path))
            return blob
        except Exception as ex:
            self.IoTHub_client.notify_blob_upload_status(storage_info["correlationId"], False, 403, "Upload Failed")
            raise Exception("AzureUpload_IoTHub")
https://github.com/MicrosoftDocs/azure-docs/blob/main/articles/iot-hub/iot-hub-devguide-file-upload.md#device-initialize-a-file-upload
The problem is that when we upload this way, the device name is added as a prefix to the blob name.
This is a problem because it breaks backward compatibility:
we defined a naming convention in our storage, and this behavior will break everything.
Let's imagine:
DeviceName = FileWatcherDaemon
BlobPath = YYYY/MM/DD/MyBlob.whatever
# Then:
BlobName = FileWatcherDaemon/YYYY/MM/DD/MyBlob.whatever
# Instead of:
BlobName = YYYY/MM/DD/MyBlob.whatever
I tried replacing the blobName with my blob_path, but it does not work because the generated SAS token is blob-level, not container-level.
Do you have an idea how to remove this device name prefix?
For us it is a problem because we will have many different devices uploading to the same storage. We would like the naming convention to fit our business needs rather than technical details.
Are you using any of the other capabilities of Azure IoT Hub besides the file upload? Do you need your device to receive messages from the cloud? For your scenario, using IoT Hub just for uploading files may be overkill.
For the moment we use a SAS token and a BlobClient, but we would like to avoid storing an expiring SAS token locally.
Have a look at the Identity and access management security recommendations and consider using Azure Key Vault in your scenario.
Microsoft recommends using Azure AD to authorize requests to Azure Storage. However, if you must use Shared Key authorization, then secure your account keys with Azure Key Vault. You can retrieve the keys from the key vault at runtime, instead of saving them with your application. For more information about Azure Key Vault, see Azure Key Vault overview.
For more security "Azure Key Vaults may be either software-protected or, with the Azure Key Vault Premium tier, hardware-protected by hardware security modules (HSMs)."
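As an illustration only (the vault URL, secret name, container and blob names below are hypothetical), fetching the storage connection string from Key Vault at runtime could look like this:

from azure.identity import DefaultAzureCredential
from azure.keyvault.secrets import SecretClient
from azure.storage.blob import BlobClient

# Hypothetical vault URL and secret name; adapt to your setup
secret_client = SecretClient(
    vault_url="https://my-vault.vault.azure.net",
    credential=DefaultAzureCredential(),
)
connection_string = secret_client.get_secret("storage-connection-string").value

# Hypothetical container and blob names
blob_client = BlobClient.from_connection_string(
    connection_string, container_name="images", blob_name="YYYY/MM/DD/MyBlob.whatever"
)
with open("my-local-file", "rb") as fp:
    blob_client.upload_blob(fp, overwrite=True)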
I need to write a Cloud Function on GCP which responds to HTTP requests and has service account access to Cloud Storage. My function will receive a string and a threshold parameter. It will retrieve a CSV file from Cloud Storage, compute the similarity between the supplied text string and the entities in the CSV file, and return the entities that satisfy the threshold requirement.
From Google's Cloud Functions tutorials, I have yet to see anything that shows how to give a function Cloud Storage access, a service account for that access, etc.
Could anyone link a resource or otherwise explain how to get from A to B?
Let the magic happen!
In fact, nothing is magic. With most Google Cloud products, you have a service account that you can grant whatever permissions you want. On Cloud Functions, the default service account is the App Engine default service account, with the pattern <projectID>@appspot.gserviceaccount.com.
When you deploy a Cloud Function you can use a custom service account with the --service-account= parameter. It's safer, because your Cloud Function can have its own service account with limited permissions (the App Engine default service account is Project Editor by default, which is far too broad!).
This service account is loaded automatically with your Cloud Function, and the Google Cloud auth libraries can access it via the metadata server. The credentials are taken from the runtime context; they are the default credentials of the environment.
About your code, keep it as simple as this:
from google.cloud import storage
client = storage.Client() # Use default credentials
bucket = client.get_bucket('myBucket')
blobs = bucket.list_blobs()
for blob in blobs:
    print(blob.size)
On your workstation, if you want to execute the same code, you can use your own credentials by running this command: gcloud auth application-default login. If you prefer using a service account key file (which I strongly don't recommend, but that's not the topic), you can set the environment variable GOOGLE_APPLICATION_CREDENTIALS with the file path as its value.
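Putting it together for your use case, a rough sketch of an HTTP-triggered function follows. The bucket, object and parameter names are placeholders, and the similarity measure is a simple stand-in you would replace with your own:

import json
from difflib import SequenceMatcher
from google.cloud import storage

client = storage.Client()  # Default credentials of the Cloud Functions runtime

def match_entities(request):
    # HTTP Cloud Function entry point (hypothetical names throughout)
    text = request.args.get('text', '')
    threshold = float(request.args.get('threshold', 0.5))

    # Placeholder bucket and object names
    bucket = client.bucket('my-entities-bucket')
    csv_content = bucket.blob('entities.csv').download_as_text()
    entities = [line.strip() for line in csv_content.splitlines() if line.strip()]

    # Simple stand-in similarity measure; swap in your real computation
    matches = [e for e in entities
               if SequenceMatcher(None, text, e).ratio() >= threshold]
    return json.dumps({'matches': matches})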
Now, to create the Google Storage client I'm using:
client = storage.Client.from_service_account_json('creds.json')
But I need to change the client dynamically, and I'd prefer not to deal with storing auth files on the local file system.
So, is there another way to connect by passing credentials as variables?
Something like for AWS and boto3:
iam_client = boto3.client(
'iam',
aws_access_key_id=ACCESS_KEY,
aws_secret_access_key=SECRET_KEY
)
I guess I'm missing something in the docs and would be happy if someone could point me to where I can find this.
If you want to use built-in methods, an option could be to build the Client constructor (Cloud Storage) yourself. These two links can be helpful for performing those actions.
Another possible option to avoid storing auth files locally is to use an environment variable pointing to credentials kept outside of your application code, such as Cloud Key Management Service. For more context, you can take a look at this article.
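For what it's worth, a sketch along the lines of the boto3 example, assuming the service-account JSON is available as a dict (e.g. loaded from an environment variable or a secrets manager; the variable name below is hypothetical):

import json
import os

from google.cloud import storage
from google.oauth2 import service_account

# Hypothetical: the service-account JSON is stored in an environment variable
info = json.loads(os.environ['GCP_SA_JSON'])
credentials = service_account.Credentials.from_service_account_info(info)

client = storage.Client(project=info['project_id'], credentials=credentials)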
I made a publicly listable bucket on Google Cloud Storage. I can see all the keys if I list the bucket objects in the browser. I was trying to use the create_anonymous_client() function so that I can list the bucket keys in a Python script, but it is giving me an exception. I have looked everywhere and still can't find the proper way to use the function.
from google.cloud import storage
client = storage.Client.create_anonymous_client()
a = client.lookup_bucket('publically_listable_bucket')
a.list_blobs()
Exception I am getting:
ValueError: Anonymous credentials cannot be refreshed.
Additional query: can I list and download the contents of public Google Cloud Storage buckets using boto3, and if yes, how do I do it anonymously?
I was also struggling with this and couldn't find an answer anywhere online. It turns out you can access the bucket with just the bucket() method instead of lookup_bucket().
I'm not sure why, but this method can sometimes take several seconds.
client = storage.Client.create_anonymous_client()
bucket = client.bucket('publically_listable_bucket')
blobs = list(bucket.list_blobs())
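To also download one of the public objects (the object and local file names here are just placeholders), the same anonymous client works:

blob = bucket.blob('path/to/object.csv')  # placeholder object name
blob.download_to_filename('local-copy.csv')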
This error means the bucket you are attempting to list does not grant the required permission. You must give the "Storage Object Viewer" or "Storage Legacy Bucket Reader" role to "allUsers".