How to generate azure storage container SAS url using python

I am unable to copy a blob (error: CopySourceNotVerified) using the SAS URL generated dynamically with this code, which I found here:
How can I generate an Azure blob SAS URL in Python?
account_name = 'account'
account_key = 'key'
container_name = 'feed'
blob_name = 'standard_feed'

requiredSASToken = generate_blob_sas(account_name=account_name,
                                     container_name=container_name,
                                     blob_name=blob_name,
                                     account_key=account_key,
                                     permission=BlobSasPermissions(read=True),
                                     expiry=datetime.utcnow() + timedelta(hours=24))
However, the copy works when I use a SAS token generated from the Azure portal, hardcoded into the code for testing.
The differences I observed between the two tokens are below:
Code generated : se=2022-09-15T17%3A47%3A06Z&sp=r&sv=2021-08-06&sr=b&sig=xxx (sr is b)
Manually Generated : sp=r&st=2022-09-14T17:13:49Z&se=2022-09-17T01:13:49Z&spr=https&sv=2021-06-08&sr=c&sig=xxxx (sr is c)
How do I generate a SAS token for an Azure storage container, i.e. one that has all the required fields and, in particular, a signedResource (sr) of container (c)?

I tried to reproduce the same in my environment and got the results below:
from datetime import datetime, timedelta
from azure.storage.blob import BlobClient, generate_blob_sas, BlobSasPermissions

account_name = 'your account name'
account_key = 'your account key'
container_name = 'container'
blob_name = 'pandas.json'

def get_blob_sas(account_name, account_key, container_name, blob_name):
    sas_blob = generate_blob_sas(account_name=account_name,
                                 container_name=container_name,
                                 blob_name=blob_name,
                                 account_key=account_key,
                                 permission=BlobSasPermissions(read=True),
                                 expiry=datetime.utcnow() + timedelta(hours=1))
    return sas_blob

blob = get_blob_sas(account_name, account_key, container_name, blob_name)
print(blob)
I got similar output. I then added a start time to the generate_blob_sas call, and the output contains all the standard SAS fields:
st=2022-09-15T12%3A23%3A44Z&se=2022-09-15T13%3A23%3A44Z&sp=r&sv=2021-08-06&sr=b&sig=XXXXX
I also tested my generated SAS token in a URL and got the result below:
It says BlobNotFound, "The specified blob does not exist." Strangely, it says the same for the token generated manually.
I saw this error in your comment; please check that you have given the correct blob name and path. The URL should be in the form:
https://{storage-account}.blob.core.windows.net/{container-name}/files/{file-name}.pdf + SAS token
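If a container-scoped SAS (sr=c), like the portal-generated token, is what is needed, generate_container_sas from azure.storage.blob produces one. A minimal sketch, reusing the account variables from the question (the permission set here is an assumption; grant only what the copy actually needs):
from datetime import datetime, timedelta
from azure.storage.blob import generate_container_sas, ContainerSasPermissions

# Container-level SAS (sr=c) signed with the storage account key.
container_sas = generate_container_sas(
    account_name=account_name,
    container_name=container_name,
    account_key=account_key,
    permission=ContainerSasPermissions(read=True, list=True),
    expiry=datetime.utcnow() + timedelta(hours=24))

# Append the token to a blob URL inside the container when using it as a copy source.
source_url = f"https://{account_name}.blob.core.windows.net/{container_name}/{blob_name}?{container_sas}"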

Related

"Required Content-Range response header is missing or malformed" while using download_blob in azure blob storage using cdn hostname

I have a storage account in Azure with a connection string in the format:
connection_string = 'DefaultEndpointsProtocol=https;AccountName=<storage_account_name>;AccountKey=<redacted_account_key>;EndpointSuffix=azureedge.net'
I am trying to download a blob from a container using the CDN hostname https://<redacted_hostname>.azureedge.net instead of the origin hostname, namely https://<redacted_hostname_2>.blob.core.windows.net.
I am trying to download and store the blob in the following way:
from azure.storage.blob import BlobServiceClient, generate_container_sas, ContainerSasPermissions
from urllib.parse import urlparse
from azure.storage.blob import BlobClient

# get container details
blob_service_client = BlobServiceClient.from_connection_string(connection_string)
container_client = blob_service_client.get_container_client("container_name")

# get permission
perm = ContainerSasPermissions(read=True, list=True)

# set expiry
from datetime import datetime, timedelta
expiry = datetime.utcnow() + timedelta(hours=1)

# generate sas token
sas_token = generate_container_sas(
    container_client.account_name,
    container_client.container_name,
    account_key=container_client.credential.account_key,
    permission=perm,
    expiry=expiry
)

sas_url = f"https://<redacted_hostname>.azureedge.net/container_name?{sas_token}"
container_with_blob = "container_name/file.wav"
sas_url_parts = urlparse(sas_url)
account_endpoint = sas_url_parts.scheme + '://' + sas_url_parts.netloc
sas_token = sas_url_parts.query
blob_sas_url = account_endpoint + '/' + container_with_blob + '?' + sas_token

blob_client = BlobClient.from_blob_url(blob_sas_url)
with open("download_file.wav", "wb") as current_blob:
    stream = blob_client.download_blob()
    current_blob.write(stream.readall())
However, this fails with the following error:
raise ValueError("Required Content-Range response header is missing or malformed.")
ValueError: Required Content-Range response header is missing or malformed
However, the same snippet works with the .blob.core.windows.net hostname.
Attempts to solve the issue:
Changed EndpointSuffix=core.windows.net to EndpointSuffix=azureedge.net in connection_string.
Got the blob_properties from blob_client and passed the size to the download_blob API, as shown below:
...
blob_properties = blob_client.get_blob_properties()
...
stream = blob_client.download_blob(0, blob_properties.size)
This throws the same error when using the CDN hostname but works fine using the origin.
Tried using BlobEndpoint=azureedge.net instead of EndpointSuffix.
Tried set_http_headers on blob_client (doc), but there does not seem to be any content_range property.
However, when I directly use the blob_sas_url, i.e. https://<cdn_hostname>/container_name/file.wav?se=<sas_token>, I am able to download the file in my browser.
Additional point: I have also configured the caching rules to cache every unique URL.
I tried the same code in my environment and got the same error when downloading the blob through the CDN endpoint.
After getting sas_url, the code below only works against the blob endpoint URL (so do not use it with the CDN hostname):
https://<redacted_hostname>.blob.core.windows.net/container/blobname?sas
blob_client = BlobClient.from_blob_url(blob_sas_url)
with open("download_file.wav", "wb") as current_blob:
    stream = blob_client.download_blob()
    current_blob.write(stream.readall())
I then fetched the sas_url with the requests library and it downloaded the file successfully.
Code:
import datetime as dt
import requests
from azure.identity import DefaultAzureCredential
from azure.storage.blob import (
    BlobSasPermissions,
    BlobServiceClient,
    generate_blob_sas,
)

credential = DefaultAzureCredential()
storage_acct_name = "venkat123"
container_name = "test"
blob_name = "scores.json"
url = f"https://{storage_acct_name}.blob.core.windows.net"
blob_service_client = BlobServiceClient(url, credential=credential)

udk = blob_service_client.get_user_delegation_key(
    key_start_time=dt.datetime.utcnow() - dt.timedelta(hours=1),
    key_expiry_time=dt.datetime.utcnow() + dt.timedelta(hours=1))

sas = generate_blob_sas(
    account_name=storage_acct_name,
    container_name=container_name,
    blob_name=blob_name,
    user_delegation_key=udk,
    permission=BlobSasPermissions(read=True, write=True, list=True),
    start=dt.datetime.utcnow() - dt.timedelta(minutes=15),
    expiry=dt.datetime.utcnow() + dt.timedelta(hours=2),
)

sas_url = (
    f'https://venkat3261.azureedge.net/'
    f'{container_name}/{blob_name}?{sas}'
)
print(sas_url)

r = requests.get(sas_url, allow_redirects=True)
open('scores.json', 'wb').write(r.content)
Here, my code uses azure-identity with a user delegation key. Make sure your identity is authorized by assigning it the Storage Blob Data Contributor (or Storage Blob Data Owner) role:
Go to the portal -> your storage account -> Access Control (IAM) -> Add -> Add role assignment -> Storage Blob Data Contributor or Storage Blob Data Owner.
The code above executed successfully and downloaded the file through the Azure CDN hostname using the SAS URL.
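For larger files, the same request can be streamed to disk instead of being read fully into memory. A small sketch, assuming the sas_url produced by the code above:
import requests

# Stream the blob to disk in chunks rather than buffering the whole response.
with requests.get(sas_url, stream=True) as r:
    r.raise_for_status()
    with open("scores.json", "wb") as f:
        for chunk in r.iter_content(chunk_size=1024 * 1024):
            f.write(chunk)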
For more reference:
Create a user delegation SAS - Azure Storage | Microsoft Learn.

Copy file from file storage to blob storage using python

I am trying to copy an image from File storage to Blob storage using Python but am unable to get it to work. I have based it on this Node.js example: Copy Azure File Share to Blob with node.js. The exception I get when running the Python code locally is shown below.
azure.core.exceptions.ResourceNotFoundError: This request is not authorized to perform this operation using this resource type.
Here is a snippet of the Python code:
sas_token = generate_account_sas(
    account_name=account_name,
    account_key=account_key,
    resource_types=ResourceTypes(service=True),
    permission=AccountSasPermissions(read=True),
    expiry=datetime.utcnow() + timedelta(hours=1)
)

blob_service_client = BlobServiceClient(account_url=blob_uri, credential=account_key)
container_client = blob_service_client.get_container_client("event")

share_directory_client = ShareDirectoryClient.from_connection_string(conn_str=conn_string,
                                                                     share_name="test-share",
                                                                     directory_path="")

dir_list = list(share_directory_client.list_directories_and_files())
logging.info(f"list directories and files : {dir_list}")

for item in dir_list:
    if item['is_directory']:
        pass
    else:
        fileClient = share_directory_client.get_file_client(item['name'])
        source_url = fileClient.url + "?" + sas_token
        logging.info(f"{source_url}")
        container_client.get_blob_client(item['name']).start_copy_from_url(source_url)
The source URL created using the SAS token is shown below:
https://<STORAGE_NAME>.file.core.windows.net/test-share/test.jpg?se=2021-10-01T05%3A46%3A22Z&sp=r&sv=2020-10-02&ss=f&srt=s&sig=<signature>
The reason you're getting this error is an incorrect resource type in your SAS token (the srt value).
Since your code is copying files to blob storage, you need object as an allowed resource type. Please see this link to learn more about resource types.
Your code would look something like:
sas_token = generate_account_sas(
    account_name=account_name,
    account_key=account_key,
    resource_types=ResourceTypes(object=True),
    permission=AccountSasPermissions(read=True),
    expiry=datetime.utcnow() + timedelta(hours=1)
)
Your SAS URL should look something like:
https://<STORAGE_NAME>.file.core.windows.net/test-share/test.jpg?se=2021-10-01T05%3A46%3A22Z&sp=r&sv=2020-10-02&ss=f&srt=o&sig=<signature>
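For reference, if the same account SAS also had to authorize listing the share contents (rather than relying on the connection-string-based ShareDirectoryClient from the question), the token would need the container resource type and the list permission as well. A hedged sketch, assuming the same generate_account_sas helper used in the question:
# Sketch: account SAS usable both for listing a share and for reading its files.
sas_token = generate_account_sas(
    account_name=account_name,
    account_key=account_key,
    resource_types=ResourceTypes(container=True, object=True),
    permission=AccountSasPermissions(read=True, list=True),
    expiry=datetime.utcnow() + timedelta(hours=1)
)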

Incorrect SAS URL produced for blob when using generate_blob_sas()?

I am trying to retrieve the URL of a blob in a container of a storage account for an Azure US Government account. In particular, I am trying to return the URL of the most recently updated/uploaded blob through this function. However, the URL generated has the incorrect SAS token at the end and I am not sure why. This is my code:
import pytz
import keys as key
from datetime import datetime, timedelta
from azure.storage.blob import BlobServiceClient, BlobSasPermissions, generate_blob_sas

"""
Access contents of the Azure blob storage. Retrieve the most recently uploaded block blob URL.
#param blob_service_client: BlobServiceClient object
#param container_name: string name of container (can change this parameter in the driver code)
#return URL of the most recently uploaded blob to the storage container with the SAS token at the end
#return name of the blob that was most recently updated/uploaded
"""
def retrieve_blob_from_storage(account_name, container_name):
    # Instantiate a BlobServiceClient using a connection string
    blob_service_client = BlobServiceClient.from_connection_string(key.CONNECTION_STRING_1)
    container_client = blob_service_client.get_container_client(container_name)

    # Get the most recently uploaded (modified) blob on the container
    now = datetime.now(pytz.utc)
    datetime_list = []
    for blob in container_client.list_blobs():
        datetime_list.append(blob.last_modified)
    youngest = max(dt for dt in datetime_list if dt < now)
    blob_name = ""
    for blob in container_client.list_blobs():
        if blob.last_modified == youngest:
            blob_name = blob.name

    # Generate SAS token to place at the end of the URL of the last uploaded (modified) blob on the container
    sas = generate_blob_sas(account_name=account_name,
                            account_key=key.CONNECTION_STRING_1,
                            container_name=container_name,
                            blob_name=blob_name,
                            permission=BlobSasPermissions(read=True),
                            expiry=datetime.utcnow() + timedelta(hours=2))
    sas_url = 'https://' + account_name + '.blob.core.usgovcloudapi.net/' + container_name + '/' + blob_name + '?' + sas
    return sas_url, blob_name
When I try to use the sas_url given, I receive the following authentication error:
<Error>
<Code>AuthenticationFailed</Code>
<Message>Server failed to authenticate the request. Make sure the value of Authorization header is formed correctly including the signature. RequestId:fd40ba66-b01e-008b-7350-5a1de4000000 Time:2021-06-05T21:22:51.5265808Z</Message>
<AuthenticationErrorDetail>Signature did not match. String to sign used was rt 2021-06-05T23:22:33Z /blob/weedsmedia/weeds3d/MD-C2M-1-CALIB-1-GX010215.MP4 2020-06-12 b </AuthenticationErrorDetail>
</Error>
Does anyone know why this error is occurring? To clarify further, the correct blob (most recently updated/uploaded) name is being returned. I suspect it's something related to the Azure US Government and permissions, but I am not sure how to fix this error. How do I go about fixing this?
Here are the Azure dependencies I'm using:
azure-cli-core 2.24.0
azure-cli-telemetry 1.0.6
azure-common 1.1.26
azure-core 1.14.0
azure-identity 1.6.0
azure-nspkg 3.0.2
azure-storage 0.36.0
azure-storage-blob 12.8.1
azure-storage-common 2.1.0
azure-storage-nspkg 3.1.0
I believe the issue is with the following line of code in your generate_blob_sas call:
account_key = key.CONNECTION_STRING_1
It seems you're signing your SAS token parameters with the storage account connection string. You need to use a storage account key (either key1 or key2) to sign the SAS token parameters and obtain a valid SAS token.
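A minimal sketch of the corrected call, assuming a hypothetical key.ACCOUNT_KEY_1 constant that holds one of the account's access keys (key1 or key2) rather than the connection string:
from datetime import datetime, timedelta
from azure.storage.blob import BlobSasPermissions, generate_blob_sas

# Sign with the storage account key, not the connection string.
sas = generate_blob_sas(account_name=account_name,
                        account_key=key.ACCOUNT_KEY_1,  # hypothetical constant holding key1/key2
                        container_name=container_name,
                        blob_name=blob_name,
                        permission=BlobSasPermissions(read=True),
                        expiry=datetime.utcnow() + timedelta(hours=2))

sas_url = 'https://' + account_name + '.blob.core.usgovcloudapi.net/' + container_name + '/' + blob_name + '?' + sas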

not able to copy the blob in GOOGLE CLOUD from one bucket to another bucket please find code below

I am trying to copy a file from one Google Cloud Storage bucket to another in GCP. The function deploys successfully, but it's not working.
Please see the sample code below:
from google.cloud import storage
import os


def copy(
    bucket_name, blob_name, destination_bucket_name, destination_blob_name
):
    """Copies a blob from one bucket to another with a new name."""
    bucket_name = "ups_vikky_test56"
    blob_name = "city.csv"
    destination_bucket_name = "ups_vikky_test57"
    destination_blob_name = "city.csv" + "temp"

    storage_client = storage.Client()

    source_bucket = storage_client.bucket(bucket_name)
    source_blob = source_bucket.blob(blob_name)
    destination_bucket = storage_client.bucket(destination_bucket_name)

    blob_copy = source_bucket.copy_blob(
        source_blob, destination_bucket, destination_blob_name
    )

    print(
        "Blob {} in bucket {} copied to blob {} in bucket {}.".format(
            source_blob.name,
            source_bucket.name,
            blob_copy.name,
            destination_bucket.name,
        )
    )
I tested the code and it works properly. First, before deploying your code, make sure that your Cloud Function's entry point is correct. It must be the name of the function in your source code to be executed, which in this case is copy.
Also, if you're deploying with an HTTP trigger, the function must accept a request argument, even if you're not using it. If you're using a background function, it must accept two arguments (see the sketch after the HTTP sample below). See the overview and examples in this documentation.
Here's a sample of a function with HTTP trigger:
from google.cloud import storage
import os


def copy(request):
    """Copies a blob from one bucket to another with a new name."""
    bucket_name = "ups_vikky_test56"
    blob_name = "city.csv"
    destination_bucket_name = "ups_vikky_test57"
    destination_blob_name = "city.csv" + "temp"
    # ... the rest of your logic
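If the function is instead deployed as a background function (for example on a google.storage.object.finalize trigger), it takes the two arguments mentioned above. A minimal sketch, with the destination bucket name assumed from the question:
from google.cloud import storage


def copy(event, context):
    """Background function for a Cloud Storage trigger; copies the finalized object."""
    # The event payload carries the triggering object's bucket and name.
    bucket_name = event["bucket"]
    blob_name = event["name"]
    destination_bucket_name = "ups_vikky_test57"
    destination_blob_name = blob_name + "temp"

    storage_client = storage.Client()
    source_bucket = storage_client.bucket(bucket_name)
    source_blob = source_bucket.blob(blob_name)
    destination_bucket = storage_client.bucket(destination_bucket_name)
    source_bucket.copy_blob(source_blob, destination_bucket, destination_blob_name)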

Azure: create storage account with container and upload blob to it in Python

I'm trying to create a storage account in Azure and upload a blob into it using their Python SDK.
I managed to create an account like this:
client = get_client_from_auth_file(StorageManagementClient)
storage_account = client.storage_accounts.create(
    resourceGroup,
    name,
    StorageAccountCreateParameters(
        sku=Sku(name=SkuName.standard_ragrs),
        enable_https_traffic_only=True,
        kind=Kind.storage,
        location=region)).result()
The problem is that later, when I try to create a container, I don't know what to pass as "account_url".
I have tried doing:
client = get_client_from_auth_file(BlobServiceClient, account_url=storage_account.primary_endpoints.blob)
return client.create_container(name)
But I'm getting:
azure.core.exceptions.ResourceNotFoundError: The specified resource does not exist
I did manage to create a container using:
client = get_client_from_auth_file(StorageManagementClient)
return client.blob_containers.create(
    resourceGroup,
    storage_account.name,
    name,
    BlobContainer(),
    public_access=PublicAccess.Container
)
But later, when I try to upload a blob using BlobServiceClient or BlobClient, I still need the "account_url", so I'm still getting an error:
azure.core.exceptions.ResourceNotFoundError: The specified resource does not exist
Can anyone help me understand how to get the account_url for a storage account I created with the SDK?
EDIT:
I managed to find a workaround to the problem by creating the connection string from the storage keys.
storage_client = get_client_from_auth_file(StorageManagementClient)
storage_keys = storage_client.storage_accounts.list_keys(resource_group, account_name)
storage_key = next(v.value for v in storage_keys.keys)
return BlobServiceClient.from_connection_string(
    'DefaultEndpointsProtocol=https;' +
    f'AccountName={account_name};' +
    f'AccountKey={storage_key};' +
    'EndpointSuffix=core.windows.net')
This works, but I think George Chen's answer is more elegant.
I could reproduce this problem. I found that get_client_from_auth_file does not pass a credential to the BlobServiceClient; if you create a BlobServiceClient with only the account_url and no credential, it can still print the account name, so it looks valid even though it is unauthenticated.
So if you want to use a credential to get a BlobServiceClient, you could use the code below and then perform the other operations.
from azure.identity import ClientSecretCredential

credentials = ClientSecretCredential(
    'tenant_id',
    'application_id',
    'application_secret'
)
blobserviceclient = BlobServiceClient(account_url=storage_account.primary_endpoints.blob, credential=credentials)
If you don't want to do it this way, you could create the BlobServiceClient with the account key.
client = get_client_from_auth_file(StorageManagementClient, auth_path='auth')
storage_account = client.storage_accounts.create(
    'group name',
    'account name',
    StorageAccountCreateParameters(
        sku=Sku(name=SkuName.standard_ragrs),
        enable_https_traffic_only=True,
        kind=Kind.storage,
        location='eastus',)).result()

storage_keys = client.storage_accounts.list_keys(resource_group_name='group name', account_name='account name')
storage_keys = {v.key_name: v.value for v in storage_keys.keys}

blobserviceclient = BlobServiceClient(account_url=storage_account.primary_endpoints.blob, credential=storage_keys['key1'])
blobserviceclient.create_container(name='container name')
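From there, uploading a blob with the same client is straightforward. A minimal sketch, assuming a local file named data.txt and the container created above:
# Upload a local file into the newly created container.
blob_client = blobserviceclient.get_blob_client(container='container name', blob='data.txt')
with open('data.txt', 'rb') as data:
    blob_client.upload_blob(data, overwrite=True)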
