Using versioning with signed URLs in Google Cloud Storage - Python

I'm having difficulty signing GET requests for Google Cloud Storage (GCS) when specifying a 'generation' (version number) on the object. Signing the URL without the generation works like a charm and GET requests work fine. However, when I append #generation to the path, the GCS server always returns "access denied" when attempting to GET the signed URL.
For example, signing this URL path works fine:
https://storage.googleapis.com/BUCKET/OBJECT
Signing this URL path gives me access denied:
https://storage.googleapis.com/BUCKET/OBJECT#1360887697105000
Note that for brevity and privacy, I've omitted the actual signed URL's Signature, Expires, and GoogleAccessId parameters. Also note that I have verified the bucket, object, and generation are correct using gsutil.
Does GCS allow for Signed URL access to specific object versions by 'generation' number? Is the URL signing procedure different when accessing a specific version?

The URL you're using is gsutil-compatible, but the XML API requires that you denote the generation with a query parameter (which would look like storage.googleapis.com/BUCKET/OBJECT?generation=1360887697105000).
Documentation is here for reference: developers.google.com/storage/docs/reference-headers#generation
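For illustration, here is a minimal sketch of signing a generation-pinned URL with the google-cloud-storage Python client. The bucket and object names are placeholders, and the generation keyword assumes a client version that supports it:

from datetime import timedelta
from google.cloud import storage

client = storage.Client()
blob = client.bucket('BUCKET').blob('OBJECT')  # placeholder names

# The client adds generation=... to the query string it signs,
# matching the XML API form described above.
url = blob.generate_signed_url(
    expiration=timedelta(hours=1),
    method='GET',
    generation=1360887697105000)
print(url)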

Related

Can't get GCS to return md5Hash and crc32c for uploaded object

I have a Python program that uploads a file using Google's resumable upload protocol. The upload works fine, but when I try to follow Google's suggestion of requesting the file's metadata after upload, in order to compare the server-generated md5Hash with the hash I generated during upload, my GET request returns an object metadata JSON blob with no checksum fields.
I've found a reference in the GCS docs indicating that I have to send some special encryption headers in order to get these server-generated checksum fields returned in my metadata GET request, but the docs don't say which headers have to be included - and frankly, I'm not using encryption anyway, so I wouldn't know what headers I should send:
https://cloud.google.com/storage/docs/json_api/v1/objects/get (3rd paragraph)
Interestingly, the Google playground (accessible from the link above) allows me to make the object request from their web interface, using OAuth to access my bucket - I can get this request to return the full object metadata with hash fields. But the playground doesn't indicate the full set of request headers sent (sadly), so I can't even use that to see what I should be sending.
Question: What's the trick to getting Google to return the checksum fields when asking for object metadata?
To get an object's resource representation (metadata), specify the query parameter alt=json.
Example:
GET https://storage.googleapis.com/storage/v1/b/bucket/o/object?alt=json
Note: This is the default case. You do need to process the returned JSON data to extract the md5Hash key/value.
Google Cloud Object Resource:
https://cloud.google.com/storage/docs/json_api/v1/objects
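As a rough sketch of the whole round trip with the google-cloud-storage client (bucket, object, and file names below are hypothetical) - note that GCS reports md5Hash as a base64-encoded MD5 digest, not a hex string:

import base64
import hashlib
from google.cloud import storage

client = storage.Client()
blob = client.bucket('my-bucket').blob('my-object')  # hypothetical names
blob.reload()  # GET .../storage/v1/b/my-bucket/o/my-object (JSON metadata)

# Compute the same base64-encoded MD5 over the local file for comparison.
with open('local-copy.bin', 'rb') as f:
    local_md5 = base64.b64encode(hashlib.md5(f.read()).digest()).decode()

print(blob.md5_hash, blob.crc32c)
assert local_md5 == blob.md5_hash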

How to use Google Python OAuth libraries to implement OpenID Connect?

I am evaluating different options for authentication in a Python App Engine flex environment, for apps that run within a G Suite domain.
I am trying to put together the OpenID Connect "Server flow" instructions here with how google-auth-library-python implements the general OAuth2 instructions here.
I kind of follow things up until step 4, "Exchange code for access token and ID token," which looks like flow.fetch_token, except that it says the "response to this request contains the following fields in a JSON array," and it includes not just the access token but the ID token and other things. I did see this patch to the library. Does that mean I could use some flow.fetch_token to create an IDTokenCredentials (how?), and then use this to build an OpenID Connect API client (and where is that API documented)? And what about validating the ID token - is there a separate Python library to help with that, or is that part of the API library?
It is all very confusing. A great deal would be cleared up with some actual "soup to nuts" example code, but I haven't found anything anywhere on the internet, which makes me think (a) perhaps this is not a viable way to do authentication, or (b) it is so recent that the Python libraries have not caught up. I would, however, much rather do authentication on the server than in the client with Google Sign-In.
Any suggestions or links to code are much appreciated.
It seems Google's Python library contains a module for ID token validation. This can be found in the google.oauth2.id_token module. Once validated, it will return the decoded token, which you can use to obtain user information.
from google.oauth2 import id_token
from google.auth.transport import requests

# Transport used to fetch Google's public signing keys.
request = requests.Request()

# Verifies the JWT's signature, expiry, and audience in one call.
id_info = id_token.verify_oauth2_token(
    token, request, 'my-client-id.example.com')

if id_info['iss'] != 'https://accounts.google.com':
    raise ValueError('Wrong issuer.')

userid = id_info['sub']
Once you obtain user information, you should follow the authentication process as described in the "Authenticate the user" section.
OK, I think I found my answer in the source code now.
google.oauth2.credentials.Credentials exposes id_token:
Depending on the authorization server and the scopes requested, this may be populated when credentials are obtained and updated when refresh is called. This token is a JWT. It can be verified and decoded [as #kavindu-dodanduwa pointed out] using google.oauth2.id_token.verify_oauth2_token.
And several layers down the call stack, we can see that fetch_token does some minimal validation of the response JSON (checking that an access token was returned, etc.) but basically passes through whatever it gets from the token endpoint, including the ID token as a JWT (if an OpenID Connect scope was included).
EDIT:
And the final piece of the puzzle is the translation of tokens from the (generic) OAuth2Session to (Google-specific) credentials in google_auth_oauthlib.helpers, where the id_token is grabbed, if it exists.
Note that the generic oauthlib library does seem to implement OpenID Connect now, but it looks to be very recent and in progress (July 2018). Google doesn't seem to use any of this at the moment (this threw me off a bit).
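Putting the pieces together, here is a hedged sketch of the server-side flow with google_auth_oauthlib (the client secrets file, scopes, and redirect URI are assumptions), ending with the ID-token verification shown in the earlier answer:

from google_auth_oauthlib.flow import Flow
from google.oauth2 import id_token
from google.auth.transport import requests

flow = Flow.from_client_secrets_file(
    'client_secret.json',  # assumed OAuth client credentials file
    scopes=['openid', 'email', 'profile'])
flow.redirect_uri = 'https://example.com/oauth2callback'  # assumed

# Steps 1-2: send the user to Google's consent screen.
auth_url, state = flow.authorization_url()

# Step 4: back at the redirect URI, exchange the code for tokens.
flow.fetch_token(code=code)  # 'code' comes from the redirect query string
credentials = flow.credentials

# credentials.id_token is the raw JWT passed through from the endpoint.
id_info = id_token.verify_oauth2_token(
    credentials.id_token, requests.Request(), 'my-client-id.example.com')
print(id_info['sub'])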

Is there a boto3 function to convert authorization_code into authorization_token

My project is in Python and uses the boto3 lib.
I'm using the AWS Cognito authorization code grant flow with response_type=code instead of response_type=token (the implicit flow). Once my user is authorized, my redirect URL is injected with the query string parameter code=4d55a121-8ffc-4058-844b-xxxx.
outlined here
I need to be able to verify this code, because of course someone could take the redirect URL, make up a fake code, and paste it into the browser. According to this doc I can exchange the code for a token. This works as expected via a REST client: I get the token and can continue to pass the token as the Authorization header. But what I'm asking is whether there is a boto3 method that takes this code and converts it into a token for me. If I have to use the requests lib, I will.
I have tried for days. get_user isn't the answer, as that requires a token, not the code.
For reference on what I'm trying to do, here's my repo. The focus is in def edit(). I'm currently using requests to achieve the same thing, but I'd like to use the boto library:
https://github.com/knittledan/python-lambda-cognito
Nope, I believe you should use an HTTPS client to exchange the authorization code for tokens with the token endpoint provided:
https://docs.aws.amazon.com/cognito/latest/developerguide/token-endpoint.html
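Since the token endpoint lives on the Cognito hosted domain rather than in the API that boto3 wraps, a requests-based sketch along these lines is about as clean as it gets - the domain, client ID/secret, and redirect URI below are placeholders:

import requests

TOKEN_URL = 'https://my-domain.auth.us-east-1.amazoncognito.com/oauth2/token'
CLIENT_ID = 'my-app-client-id'
CLIENT_SECRET = 'my-app-client-secret'

def exchange_code_for_tokens(code, redirect_uri):
    # App clients with a secret must authenticate via HTTP Basic auth.
    resp = requests.post(
        TOKEN_URL,
        auth=(CLIENT_ID, CLIENT_SECRET),
        data={'grant_type': 'authorization_code',
              'client_id': CLIENT_ID,
              'code': code,
              'redirect_uri': redirect_uri})
    resp.raise_for_status()
    # On success the JSON body carries id_token, access_token, refresh_token.
    return resp.json()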

Azure Blob Store SAS token missing Service Resource field

I created a Shared Access Signature (SAS) token on my Azure storage account using the web interface. The token looks like:
?sv=xxxx-xx-xx&ss=b&srt=sco&sp=rl&se=xxxx-xx-xxTxx:xx:xxZ&st=xxxx-xx-xxTxx:xx:xxZ&spr=https&sig=xxxxxxxxxxxxxxxxxxxxxx
The SAS token here is missing the sr (signed resource) field. I have to manually prepend sr=b to the query string to get things to work. I must be doing something wrong, because this seems extremely finicky.
from azure.storage.blob import BlockBlobService

sas_token = "?sv=xxxx-xx-xx&ss=b&srt=sco&sp=rl&se=xxxx-xx-xxTxx:xx:xxZ&st=xxxx-xx-xxTxx:xx:xxZ&spr=https&sig=xxxxxxxxxxxxxxxxxxxxxx"
sas_token = "?sr=b&" + sas_token[1:]
serv = BlockBlobService(account_name='myaccount', sas_token=sas_token)
for cont in serv.list_containers():
    print(cont.name)
Without the sas_token = "?sr=b&" + sas_token[1:] I get the error:
sr is mandatory. Cannot be empty
And if the sr=b field is not first in the query, I get an authentication error like
Access without signed identifier cannot have time window more than 1 hour
Based on this error message, you may need to set expiry time less than 1 hour from now. See Windows Azure Shared Access Signature always gives: Forbidden 403.
I took your code with Python v2.7.12 and azure-storage-python v0.34.3 (the latest version), and it worked well on my side. So, I'd recommend you upgrade to the latest version and try it again.
UPDATE:
I traced the code of the Azure Storage SDK for Python and here's what I found. The SDK is a REST API wrapper which assumes that the SAS token looks like this:
sv=2015-04-05&ss=bfqt&srt=sco&sp=rl&se=2015-09-20T08:49Z&sip=168.1.5.60-168.1.5.70&sig=a39%2BYozJhGp6miujGymjRpN8tsrQfLo9Z3i8IRyIpnQ%3d
As you can see, the token doesn't include ?. And the SDK will append ? before the SAS token when it makes a GET request to the Azure Storage REST service.
This causes the signed-version key to be parsed as ?sv, which raises the issue. So, to avoid this, we should remove the ? from the SAS token.
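Based on that finding, a minimal fix to the question's snippet is to strip the leading ? rather than prepend sr=b (account name and token placeholders as in the question):

from azure.storage.blob import BlockBlobService

sas_token = "?sv=xxxx-xx-xx&ss=b&srt=sco&sp=rl&se=xxxx-xx-xxTxx:xx:xxZ&st=xxxx-xx-xxTxx:xx:xxZ&spr=https&sig=xxxxxxxxxxxxxxxxxxxxxx"
# Drop the '?' copied from the portal; the SDK prepends its own.
sas_token = sas_token.lstrip('?')
serv = BlockBlobService(account_name='myaccount', sas_token=sas_token)
for cont in serv.list_containers():
    print(cont.name)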

Signed CloudFront URL for an S3 bucket

I'm trying to create a signed CloudFront URL for an object in a Frankfurt S3 bucket (using the Python library boto). This used to work very well with eu-west-1 buckets, but now I'm getting the following error message:
<Error>
<Code>InvalidRequest</Code>
<Message>
The authorization mechanism you have provided is not supported. Please use AWS4-HMAC-SHA256.
</Message>
...
I understand that new S3 locations require API requests to be signed using AWS4-HMAC-SHA256, but I can't find anything in the AWS documentation about how this changes the creation of signed CloudFront URLs.
Edit:
To clarify, the following code produces a signed URL without raising an error. The error occurs when opening the created URL in the browser afterwards:
cf = cloudfront.CloudFrontConnection(
    aws_access_key_id=settings.AWS_ACCESS_KEY_ID,
    aws_secret_access_key=settings.AWS_SECRET_ACCESS_KEY)
distribution_summary = cf.get_all_distributions()[0]
distribution = distribution_summary.get_distribution()
return distribution.create_signed_url(
    url,
    settings.CLOUDFRONT_KEY_ID,
    int(time()) + expiration,
    private_key_file=settings.PRIVATE_KEY_FILE)
I found the issue: it was actually the CloudFront distribution itself. It seems that moving the origin of an already-existing (long-lived) distribution from a US bucket to an EU bucket didn't work out.
I created a new distribution with the same settings (except a new Origin Access Identity) and it worked without any issues.
