Azure IoTHub - FileUpload - Remove Device Name from BlobName - python

I am working on a solution to upload images from a local file system to Azure Storage.
At the moment we use a SAS token and a BlobClient, but we would like to avoid storing an expiring SAS token locally.
To do this, we thought about Azure IoT Hub, which would let us replace this process.
# Module-level imports assumed elsewhere in the file:
# import os
# from azure.storage.blob import BlobClient

def __upload_file_Azure_IoTHub(self, src_path: str, blob_path: str) -> BlobClient:
    if os.path.exists(src_path):
        # We start by asking Azure IoT Hub for the storage info of the blob
        storage_info = self.IoTHub_client.get_storage_info_for_blob(blob_path)
        # We build the SAS URL from the storage info + token
        sas_url = "https://{}/{}/{}{}".format(
            storage_info["hostName"],
            storage_info["containerName"],
            storage_info["blobName"],
            storage_info["sasToken"])
        try:
            with BlobClient.from_blob_url(sas_url) as blob_client:
                with open(src_path, "rb") as fp:
                    blob = blob_client.upload_blob(fp, overwrite=True, timeout=self.config.azure_timeout)
            self.IoTHub_client.notify_blob_upload_status(
                storage_info["correlationId"], True, 200, "OK: {}".format(blob_path))
            return blob
        except Exception:
            self.IoTHub_client.notify_blob_upload_status(
                storage_info["correlationId"], False, 403, "Upload Failed")
            raise Exception("AzureUpload_IoTHub")
https://github.com/MicrosoftDocs/azure-docs/blob/main/articles/iot-hub/iot-hub-devguide-file-upload.md#device-initialize-a-file-upload
The problem is that when we upload this way, the device name is added as a prefix to the blobName.
This is a problem because it breaks backward compatibility:
we defined a naming convention in our storage, and this behavior will break everything.
Let's imagine:
DeviceName = FileWatcherDaemon
BlobPath = YYYY/MM/DD/MyBlob.whatever
# Then:
BlobName = FileWatcherDaemon/YYYY/MM/DD/MyBlob.whatever
# Instead of:
BlobName = YYYY/MM/DD/MyBlob.whatever
I tried replacing the blobName with my blob_path, but that does not work because the generated sasToken is scoped to the blob, not to the container.
Do you have an idea how to remove this device name prefix?
It is a problem for us because many different devices will be uploading to the same storage account, and we would like the naming convention to follow our business needs rather than technical details.

Are you using any of the other capabilities of Azure IoT Hub besides the file upload? Do you need your device to receive messages from the cloud? For your scenario, using IoT Hub just for uploading files may be overkill.
"For the moment we use a SAS token and a BlobClient but we would like to avoid storing an expiring SAS token locally."
Have a look at Identity and access management Security Recommendations and consider using Azure Key Vault in your scenario.
Microsoft recommends using Azure AD to authorize requests to Azure Storage. However, if you must use Shared Key authorization, then secure your account keys with Azure Key Vault. You can retrieve the keys from the key vault at runtime, instead of saving them with your application. For more information about Azure Key Vault, see Azure Key Vault overview.
For more security, "Azure Key Vaults may be either software-protected or, with the Azure Key Vault Premium tier, hardware-protected by hardware security modules (HSMs)."
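As a rough sketch of that approach (the vault URL, secret name, container, and file paths below are placeholders, not part of the question), the device could fetch the storage connection string from Key Vault at runtime and upload with a plain BlobClient, keeping full control of the blob name:

# Hypothetical sketch: read the storage connection string from Key Vault at
# runtime instead of keeping an expiring SAS token on the local filesystem.
from azure.identity import DefaultAzureCredential
from azure.keyvault.secrets import SecretClient
from azure.storage.blob import BlobClient

credential = DefaultAzureCredential()  # managed identity, environment vars, CLI login, ...
secrets = SecretClient(vault_url="https://my-vault.vault.azure.net", credential=credential)
conn_str = secrets.get_secret("storage-connection-string").value  # placeholder secret name

blob_client = BlobClient.from_connection_string(
    conn_str,
    container_name="images",                 # placeholder container
    blob_name="YYYY/MM/DD/MyBlob.whatever",  # your own convention, no device prefix
)
with open("local_image.jpg", "rb") as fp:
    blob_client.upload_blob(fp, overwrite=True)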

Related

azure copy blobs across storage accounts fails with ErrorCode:CannotVerifyCopySource

I am using the Python SDK to copy blobs from one container to another. Here is the code:
from azure.storage.blob import BlobServiceClient

src_blob = '{0}/{1}'.format(src_url, blob_name)
destination_client = BlobServiceClient.from_connection_string(connectionstring)
copied_blob = destination_client.get_blob_client(dst_container, b_name)
copied_blob.start_copy_from_url(src_blob)
It throws the error below:
Content: <?xml version="1.0" encoding="utf-8"?><Error><Code>CannotVerifyCopySource</Code><Message>Public access is not permitted on this storage account.
I have already gone through this post here, and in my case public access is disabled.
I do not have sufficient privileges to enable public access on the storage account and test. Is there a workaround to accomplish the copy without changing that setting?
Azcopy 409 Public access is not permitted on this storage account
Do I need to change the way I connect to the account?
When copying a blob across storage accounts, the source blob must be publicly accessible so that Azure Storage Service can access the source blob. You were getting the error because you were using just the blob's URL. If the blob is in a private blob container, Azure Storage Service won't be able to access the blob using just its URL.
To fix this issue, you would need to generate a SAS token on the source blob with at least Read permission and use that SAS URL as copy source.
So your code would be something like:
src_blob_sas_token = generate_sas_token_somehow()
src_blob = '{0}/{1}?{2}'.format(src_url,blob_name, src_blob_sas_token)
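For instance, a minimal sketch of generating that token with the azure-storage-blob SDK (the source account name/key variables are placeholders you would supply) could be:

from datetime import datetime, timedelta
from azure.storage.blob import generate_blob_sas, BlobSasPermissions

# Placeholder values: the name and key of the *source* storage account.
src_blob_sas_token = generate_blob_sas(
    account_name=src_account_name,
    container_name=src_container_name,
    blob_name=blob_name,
    account_key=src_account_key,
    permission=BlobSasPermissions(read=True),  # Read is enough for a copy source
    expiry=datetime.utcnow() + timedelta(hours=1),
)
src_blob = '{0}/{1}?{2}'.format(src_url, blob_name, src_blob_sas_token)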
Check the privileges of your SAS token.
In your example, it doesn't look like you are passing the SAS token.

Azure function and Azure Blob Storage

I have created an Azure function which is triggered when a new file is added to my Blob Storage. This part works well!
BUT, now I would like to start the "Speech-To-Text" Azure service using the API. So I try to create the URI pointing to my new blob and then add it to the API call. To do so, I created a SAS token (from the Azure portal) and appended it to my new blob path.
https://myblobstorage...../my/new/blob.wav?[SAS Token generated]
By doing so I get an error which says :
Authentication failed: Invalid URI
What am I missing here ?
N.B.: When I generate the SAS token manually from Azure Storage Explorer, everything works well. Also, my token is not expired in my tests.
Thank you for your help !
You might be generating the SAS token with the wrong allowed resource types.
Make sure the Object option is checked.
Here is the reason in the docs:
Service (s): Access to service-level APIs (e.g., Get/Set Service Properties, Get Service Stats, List Containers/Queues/Tables/Shares)
Container (c): Access to container-level APIs (e.g., Create/Delete Container, Create/Delete Queue, Create/Delete Table, Create/Delete Share, List Blobs/Files and Directories)
Object (o): Access to object-level APIs for blobs, queue messages, table entities, and files (e.g., Put Blob, Query Entity, Get Messages, Create File, etc.)
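If you generate the SAS from code rather than the portal, the same choice appears as the resource_types argument; here is a hedged sketch with azure-storage-blob (the account name, key, and blob path are placeholders):

from datetime import datetime, timedelta
from azure.storage.blob import generate_account_sas, ResourceTypes, AccountSasPermissions

# object=True corresponds to the "Object (o)" resource type described above.
sas_token = generate_account_sas(
    account_name="myblobstorage",
    account_key="<account-key>",
    resource_types=ResourceTypes(object=True),
    permission=AccountSasPermissions(read=True),
    expiry=datetime.utcnow() + timedelta(hours=1),
)
blob_url_with_sas = "https://myblobstorage.blob.core.windows.net/my/new/blob.wav?" + sas_token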

Connect google storage client python without creds.json

Right now, to create the Google storage client, I'm using:
client = storage.Client.from_service_account_json('creds.json')
But I need to change the client dynamically, and I would prefer not to store auth files on the local filesystem.
So, is there another way to connect by passing the credentials as a variable?
Something like this for AWS and boto3:
iam_client = boto3.client(
    'iam',
    aws_access_key_id=ACCESS_KEY,
    aws_secret_access_key=SECRET_KEY,
)
I guess I am missing something in the docs and would be happy if someone could point me to where I can find this.
If you want to use built-in methods, one option is to construct the Client (Cloud Storage) with explicit credentials. These two links can be helpful for that.
Another possible option to avoid storing auth files locally is to use an environment variable pointing to credentials kept outside of your application's code, for example in Cloud Key Management Service. For more context you can take a look at this article.
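As an illustration of the first option, assuming you already hold the service-account JSON content in a variable (for example loaded from a secret manager rather than from disk), you could build the credentials object yourself and pass it to the client:

import json
from google.cloud import storage
from google.oauth2 import service_account

# creds_json is assumed to come from somewhere other than a local file,
# e.g. an environment variable or a secret manager.
creds_info = json.loads(creds_json)
credentials = service_account.Credentials.from_service_account_info(creds_info)
client = storage.Client(credentials=credentials, project=creds_info["project_id"])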

Use Python Google Storage Client without credentials

I am using the Python Google Storage client; however, I am using a bucket with public read/write access. (I know this is usually a terrible idea, but I have a rare use case where it is fine.)
When I try to retrieve some files, I get a DefaultCredentialsError.
from google.cloud import storage

BUCKET_NAME = 'my-public-bucket-name'
storage_client = storage.Client()
bucket = storage_client.get_bucket(BUCKET_NAME)

def list_blobs(prefix, delimiter=None):
    blobs = bucket.list_blobs(prefix=prefix, delimiter=delimiter)
    print('Blobs:')
    for blob in blobs:
        print(blob.name)
The specific error reads:
google.auth.exceptions.DefaultCredentialsError: Could not automatically determine credentials. Please set GOOGLE_APPLICATION_CREDENTIALS or explicitly create credentials and re-run the application. For more information, please see https://cloud.google.com/docs/authentication/getting-started
That page suggests using OAuth or other tokens, but I shouldn't need these since my bucket is public? I can make an HTTP request to the bucket in Chrome and receive data.
How should I get around this issue? Can I provide default or null credentials?
The default for a storage client with no parameters is to use environment credentials (e.g. authenticate with the gcloud tools first). If you want to use a client with no credentials you have to use
the create_anonymous_client method, which lets you access resources available to allUsers.
Be careful though which APIs you use, not all of them support anonymous credentials. E.g. instead of client.get_bucket('my-bucket') you have to use client.bucket(bucket_name='my-bucket').
Also note that it seems any permissions error returns a generic ValueError: Anonymous credentials cannot be refreshed. E.g. if you try to overwrite an existing file while only having read/write permissions.
So a full example of uploading a file to a publicly accessible bucket is
from google.cloud import storage
client = storage.Client.create_anonymous_client()
bucket = client.bucket(bucket_name='my-public-bucket')
blob = bucket.blob('my-file')
blob.upload_from_filename('my-local-file')
From "Cloud Storage Authentication":
Most of the operations you perform in Cloud Storage must be authenticated. The only exceptions are operations on objects that allow anonymous access. Objects are anonymously accessible if the allUsers group has READ permission. The allUsers group includes anyone on the Internet.
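Applied to the listing code from the question, a minimal anonymous-client sketch (the bucket name is a placeholder, and listing only works if allUsers has the necessary read access) could look like:

from google.cloud import storage

client = storage.Client.create_anonymous_client()
bucket = client.bucket(bucket_name='my-public-bucket-name')

def list_blobs(prefix, delimiter=None):
    # Works anonymously only when allUsers can read/list the bucket's objects.
    for blob in bucket.list_blobs(prefix=prefix, delimiter=delimiter):
        print(blob.name)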

Access to Azure Blob Storage with Firewall from Azure Function App

I have an Azure blob storage account which is firewalled to selected networks only. I would like to access this storage account from a function app running on a dynamic (consumption) plan whose outbound IP addresses are known to me. The problem is that I have added these outbound IPs to the allowed IP addresses in the storage account's Firewall and virtual networks settings, but I still get an error which says:
This request is not authorized to perform this operation.
Can someone please point out where I am going wrong?
N.B. I am using the Python SDK to access the blob storage with the account name and the account key.
I did some tests on my side using a consumption-plan function app to access my files in blob storage, and it works for me.
There are two steps I did:
I enabled the storage account firewall and added all of the function app's outbound IPs to it.
I enabled anonymous access on the container holding the blob file I want to access, so that my function app can access the blob file directly. (As the storage firewall is enabled, only the specified IPs can access your storage, so I think the security here is OK. If you need a higher security level in your scenario, as #Marie Hoeger said, you should use a private container and a SAS token to control blob access; see the sketch below.)
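If you prefer the private-container + SAS route mentioned in the second step, a hedged sketch of generating a short-lived container SAS with the Python SDK (account name, key, and container are placeholders) would be:

from datetime import datetime, timedelta
from azure.storage.blob import generate_container_sas, ContainerSasPermissions

# Placeholder account/container; keep permissions and lifetime as narrow as possible.
container_sas = generate_container_sas(
    account_name="mystorageaccount",
    container_name="mycontainer",
    account_key="<account-key>",
    permission=ContainerSasPermissions(read=True, list=True),
    expiry=datetime.utcnow() + timedelta(hours=1),
)
blob_url = "https://mystorageaccount.blob.core.windows.net/mycontainer/myblob.txt?" + container_sas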
If you have any further concerns, please feel free to let me know :)
