I have my .aws/credentials file at a location other than the default one. How do I specify a different location?
from boto3.session import Session

# Create a client using the credentials and region defined in the [adminuser]
# section of the AWS credentials file (~/.aws/credentials).
session = Session(profile_name="adminuser")
polly = session.client("polly")
It appears you are using the Python boto3 library.
The shared credentials file has a default location of
~/.aws/credentials. You can change the location of the shared
credentials file by setting the AWS_SHARED_CREDENTIALS_FILE
environment variable.
e.g. export AWS_SHARED_CREDENTIALS_FILE=mycustompath
Then run your Python script.
https://boto3.amazonaws.com/v1/documentation/api/latest/guide/credentials.html#:~:text=The%20shared%20credentials%20file%20has,section%20names%20corresponding%20to%20profiles.
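To show the moving parts together, here is a minimal, self-contained sketch; the [adminuser] profile, the example key values, and the temp-file location are placeholders, not anything from your setup:

```python
import os
import tempfile

# Hypothetical credentials file at a custom location.
creds_path = os.path.join(tempfile.gettempdir(), "my_aws_credentials")
with open(creds_path, "w") as f:
    f.write("[adminuser]\n"
            "aws_access_key_id = EXAMPLEKEY\n"
            "aws_secret_access_key = EXAMPLESECRET\n")

# Must be set before the Session is created, since boto3 reads the
# environment when it resolves credentials.
os.environ["AWS_SHARED_CREDENTIALS_FILE"] = creds_path

# With boto3 installed, the following would now resolve the [adminuser]
# profile from creds_path instead of ~/.aws/credentials:
# import boto3
# session = boto3.Session(profile_name="adminuser")
# polly = session.client("polly")
print(os.environ["AWS_SHARED_CREDENTIALS_FILE"])
```

Setting the variable inside the process (as above) and exporting it in the shell before launching Python are equivalent; the only requirement is that it is set before boto3 resolves credentials.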
I am using Python boto3 to get info on one of my AWS accounts. Can I automate it in such a way that I get the same info from all accounts? I know the boto3 library reads credentials from the credentials file, but can I have it read multiple credentials?
Boto3 can read credentials in many different ways.
One option is to have a credentials file with a profile for each of your accounts (as mentioned in the comment by @jordanm), like this (examples from the AWS documentation):
[default]
aws_access_key_id=foo
aws_secret_access_key=bar
[dev]
aws_access_key_id=foo2
aws_secret_access_key=bar2
[prod]
aws_access_key_id=foo3
aws_secret_access_key=bar3
And select the profile you want in the code, like this (again, from the AWS documentation):
import boto3

session = boto3.Session(profile_name='dev')
# Any clients created from this session will use credentials
# from the [dev] section of ~/.aws/credentials.
dev_s3_client = session.client('s3')
Now you just need to loop through all your profiles.
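The loop can be sketched without touching AWS at all. This minimal example writes a credentials file shaped like the one above to a temp location and enumerates its profiles; the [dev]/[prod] names and the file location are just placeholders (in practice you would read ~/.aws/credentials):

```python
import configparser
import os
import tempfile

# Hypothetical credentials file with one section per account.
path = os.path.join(tempfile.gettempdir(), "demo_credentials")
with open(path, "w") as f:
    f.write("[dev]\n"
            "aws_access_key_id=foo2\n"
            "aws_secret_access_key=bar2\n"
            "[prod]\n"
            "aws_access_key_id=foo3\n"
            "aws_secret_access_key=bar3\n")

# Each ini section is a profile name.
parser = configparser.ConfigParser()
parser.read(path)
profiles = parser.sections()
print(profiles)

for profile in profiles:
    # With boto3 installed and real keys in place, each iteration would run:
    # session = boto3.Session(profile_name=profile)
    # s3 = session.client("s3")
    # print(profile, [b["Name"] for b in s3.list_buckets()["Buckets"]])
    pass
```

boto3 also exposes the list directly as boto3.Session().available_profiles, which saves you from parsing the file yourself.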
I would like to download a public dataset from the NIMH Data Archive. After creating an account on their website and accepting their Data Usage Agreement, I can download a CSV file which contains the paths to all the files in the dataset I am interested in. Each path is of the form s3://NDAR_Central_1/....
1 Download on my personal computer
In the NDA Github repository, the nda-tools Python library exposes some useful Python code to download those files to my own computer. Say I want to download the following file:
s3://NDAR_Central_1/submission_13364/00m/0.C.2/9007827/20041006/10263603.tar.gz
Given my username (USRNAME) and password (PASSWD) (the ones I used to create my account on the NIMH Data Archive), the following code allows me to download this file to TARGET_PATH on my personal computer:
from NDATools.clientscripts.downloadcmd import configure
from NDATools.Download import Download
config = configure(username=USRNAME, password=PASSWD)
s3Download = Download(TARGET_PATH, config)
target_fnames = ['s3://NDAR_Central_1/submission_13364/00m/0.C.2/9007827/20041006/10263603.tar.gz']
s3Download.get_links('paths', target_fnames, filters=None)
s3Download.get_tokens()
s3Download.start_workers(False, None, 1)
Under the hood, the get_tokens method of s3Download will use USRNAME and PASSWD to generate a temporary access key, secret key, and security token. Then, the start_workers method will use the boto3 and s3transfer Python libraries to download the selected file.
Everything works fine!
2 Download to a GCP bucket
Now, say I created a project on GCP and would like to directly download this file to a GCP bucket.
Ideally, I would like to do something like:
gsutil cp s3://NDAR_Central_1/submission_13364/00m/0.C.2/9007827/20041006/10263603.tar.gz gs://my-bucket
To do this, I execute the following Python code in the Cloud Shell (by running python3):
from NDATools.TokenGenerator import NDATokenGenerator
data_api_url = 'https://nda.nih.gov/DataManager/dataManager'
generator = NDATokenGenerator(data_api_url)
token = generator.generate_token(USRNAME, PASSWD)
This gives me the access key, the secret key and the session token. Indeed, in the following,
ACCESS_KEY refers to the value of token.access_key,
SECRET_KEY refers to the value of token.secret_key,
SECURITY_TOKEN refers to the value of token.session.
Then, I set these credentials as environment variables in the Cloud Shell:
export AWS_ACCESS_KEY_ID=[copy-paste ACCESS_KEY here]
export AWS_SECRET_ACCESS_KEY=[copy-paste SECRET_KEY here]
export AWS_SECURITY_TOKEN=[copy-paste SECURITY_TOKEN here]
Finally, I also set up the .boto configuration file in my home directory. It looks like this:
[Credentials]
aws_access_key_id = $AWS_ACCESS_KEY_ID
aws_secret_access_key = $AWS_SECRET_ACCESS_KEY
aws_session_token = $AWS_SECURITY_TOKEN
[s3]
calling_format = boto.s3.connection.OrdinaryCallingFormat
use-sigv4=True
host=s3.us-east-1.amazonaws.com
When I run the following command:
gsutil cp s3://NDAR_Central_1/submission_13364/00m/0.C.2/9007827/20041006/10263603.tar.gz gs://my-bucket
I end up with:
AccessDeniedException: 403 AccessDenied
The full traceback is below:
Non-MD5 etag ("a21a0b2eba27a0a32a26a6b30f3cb060-6") present for key <Key: NDAR_Central_1,submission_13364/00m/0.C.2/9007827/20041006/10263603.tar.gz>, data integrity checks are not possible.
Copying s3://NDAR_Central_1/submission_13364/00m/0.C.2/9007827/20041006/10263603.tar.gz [Content-Type=application/x-gzip]...
Exception in thread Thread-2:
Traceback (most recent call last):
File "/usr/lib/python2.7/threading.py", line 801, in __bootstrap_inner
self.run()
File "/usr/lib/python2.7/threading.py", line 754, in run
self.__target(*self.__args, **self.__kwargs)
File "/google/google-cloud-sdk/platform/gsutil/gslib/daisy_chain_wrapper.py", line 213, in PerformDownload
decryption_tuple=self.decryption_tuple)
File "/google/google-cloud-sdk/platform/gsutil/gslib/cloud_api_delegator.py", line 353, in GetObjectMedia
decryption_tuple=decryption_tuple)
File "/google/google-cloud-sdk/platform/gsutil/gslib/boto_translation.py", line 590, in GetObjectMedia
generation=generation)
File "/google/google-cloud-sdk/platform/gsutil/gslib/boto_translation.py", line 1723, in _TranslateExceptionAndRaise
raise translated_exception # pylint: disable=raising-bad-type
AccessDeniedException: AccessDeniedException: 403 AccessDenied
<?xml version="1.0" encoding="UTF-8"?>
<Error><Code>AccessDenied</Code><Message>Access Denied</Message><RequestId>A93DBEA60B68E04D</RequestId><HostId>Z5XqPBmUdq05btXgZ2Tt7HQMzodgal6XxTD6OLQ2sGjbP20AyZ+fVFjbNfOF5+Bdy6RuXGSOzVs=</HostId></Error>
I would like to be able to download this file directly from an S3 bucket to my GCP bucket (without having to create a VM, set up Python, and run the code above [which works]). Why do the temporary credentials work on my computer but not in GCP Cloud Shell?
The complete log of the debug command
gsutil -DD cp s3://NDAR_Central_1/submission_13364/00m/0.C.2/9007827/20041006/10263603.tar.gz gs://my-bucket
can be found here.
The procedure you are trying to implement is called a "Transfer Job".
In order to transfer a file from Amazon S3 bucket to a Cloud Storage bucket:
A. Click the burger menu in the top left corner
B. Go to Storage > Transfer
C. Click Create Transfer
Under Select source, select Amazon S3 bucket.
In the Amazon S3 bucket text box, specify the source Amazon S3 bucket name.
The bucket name is the name as it appears in the AWS Management Console.
In the respective text boxes, enter the Access key ID and Secret key associated
with the Amazon S3 bucket.
To specify a subset of files in your source, click Specify file filters beneath
the bucket field. You can include or exclude files based on file name prefix and
file age.
Under Select destination, choose a sink bucket or create a new one.
To choose an existing bucket, enter the name of the bucket (without the prefix
gs://), or click Browse and browse to it.
To transfer files to a new bucket, click Browse and then click the New bucket
icon.
Enable overwrite/delete options if needed.
By default, your transfer job only overwrites an object when the source version is
different from the sink version. No other objects are overwritten or deleted.
Enable additional overwrite/delete options under Transfer options.
Under Configure transfer, schedule your transfer job to Run now (one time) or Run
daily at the local time you specify.
Click Create.
Before setting up the Transfer Job please make sure you have the necessary roles assigned to your account and the required permissions described here.
Also take into consideration that the Storage Transfer Service is currently available only for certain Amazon S3 regions, described under the AMAZON S3 tab of the Setting up a transfer job documentation.
Transfer jobs can also be done programmatically. More information here
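As a hedged sketch of the programmatic route, a one-time transfer job body for the Storage Transfer Service v1 REST API might look like the following; the project id, bucket names, keys, and dates are placeholders, and the actual submission call is left commented out since it needs google-api-python-client and valid credentials:

```python
# Placeholder request body for storagetransfer v1 transferJobs.create.
transfer_job = {
    "description": "S3 -> GCS one-time transfer",
    "status": "ENABLED",
    "projectId": "my-project",
    "schedule": {
        # Same start and end date => run once.
        "scheduleStartDate": {"year": 2020, "month": 1, "day": 1},
        "scheduleEndDate": {"year": 2020, "month": 1, "day": 1},
    },
    "transferSpec": {
        "awsS3DataSource": {
            "bucketName": "my-source-bucket",
            "awsAccessKey": {
                "accessKeyId": "ACCESS_KEY",
                "secretAccessKey": "SECRET_KEY",
            },
        },
        "gcsDataSink": {"bucketName": "my-bucket"},
    },
}

# With google-api-python-client installed, the job would be submitted as:
# from googleapiclient import discovery
# client = discovery.build("storagetransfer", "v1")
# client.transferJobs().create(body=transfer_job).execute()
print(transfer_job["transferSpec"]["gcsDataSink"]["bucketName"])
```

Note that awsAccessKey only carries an access key id and secret key; there is no field for a session token, which is consistent with the limitation described in the EDIT below.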
Let me know if this was helpful.
EDIT
Neither the Transfer Service nor the gsutil command currently supports "Temporary Security Credentials", even though they are supported by AWS. A workaround to do what you want is to change the source code of the gsutil command.
I also filed a Feature Request on your behalf; I suggest you star it in order to get updates on the procedure.
I'm currently working on using the Google Cloud Platform Speech-to-Text Api. I've followed the prompts and read some of the possible errors and troubleshooting information on the developer site. But I can't seem to make the env variable needed to access the auth credentials.
The name of the variable is GOOGLE_APPLICATION_CREDENTIALS. I have tried the following:
Setting up the variable using the command prompt: set GOOGLE_APPLICATION_CREDENTIALS=[path]
Setting up the variable using the anaconda power shell: $env:GOOGLE_APPLICATION_CREDENTIALS=[path]
Including the path in PATH variables on the environment variables native option on Windows 10
Including the environment variables on the same native option state above.
Setting up the variable inside Sublime3 using the following:
def explicit():
    from google.cloud import storage

    # Explicitly use service account credentials by specifying the private key
    # file.
    storage_client = storage.Client.from_service_account_json(
        'path')

    # Make an authenticated API request
    buckets = list(storage_client.list_buckets())
    print(buckets)
The code I am trying to run is the following:
import io
import os

# Imports the Google Cloud client library
from google.cloud import speech
from google.cloud.speech import enums
from google.cloud.speech import types

# Instantiates a client
client = speech.SpeechClient()

# The name of the audio file to transcribe
file_name = os.path.join(
    os.path.dirname(__file__),
    'resources',
    'audio.raw')

# Loads the audio into memory
with io.open(file_name, 'rb') as audio_file:
    content = audio_file.read()
    audio = types.RecognitionAudio(content=content)

config = types.RecognitionConfig(
    encoding=enums.RecognitionConfig.AudioEncoding.LINEAR16,
    sample_rate_hertz=16000,
    language_code='en-US')

# Detects speech in the audio file
response = client.recognize(config, audio)

for result in response.results:
    print('Transcript: {}'.format(result.alternatives[0].transcript))
I have read other answers on setting up and verifying environment variables, but no solution has worked.
I can also see the environment in the env folder in the anaconda shell prompt. Does anyone have any ideas as to why the script cannot find the file?
The error message is:
line 317, in default
raise exceptions.DefaultCredentialsError(_HELP_MESSAGE)
google.auth.exceptions.DefaultCredentialsError: Could not automatically determine credentials. Please set GOOGLE_APPLICATION_CREDENTIALS or explicitly create credentials and re-run the application.
Hi, I was also facing the same issue but resolved it later. The reason the path is not getting set is probably that the file is blocked by the system. Go to the location of the file, right-click on it, and select Properties. There you will see an option called Unblock; click on it, and that should resolve your problem.
I am trying to upload a folder from my local machine to a Google Cloud bucket. I get an error with the credentials. Where should I be providing the credentials, and what information is needed in them?
from google.cloud import storage

from_dest = '/Users/xyzDocuments/tmp'
gsutil_link = 'gs://bucket-1991'

try:
    storage_client = storage.Client()
    bucket = storage_client.get_bucket(bucket_name)
    blob = bucket.blob(destination_blob_name)
    blob.upload_from_filename(source_file_name)
    print('File {} uploaded to {}.'.format(source_file_name, destination_blob_name))
except Exception as e:
    print(e)
The error is
could not automatically determine credentials. Please set GOOGLE_APPLICATION_CREDENTIALS or explicitly create credentials and re-run the application. For more information, please see https://developers.google.com/accounts/docs/application-default-credentials.
You need to acquire the application default credentials for your project and set them as an environment variable:
Go to the Create service account key page in the GCP Console.
From the Service account drop-down list, select New service account.
Enter a name into the Service account name field.
From the Role drop-down list, select Project > Owner.
Click Create. A JSON file that contains your key downloads to your computer.
Then, set an environment variable which will provide the application credentials to your application when it runs locally:
$ export GOOGLE_APPLICATION_CREDENTIALS="/home/user/Downloads/[FILE_NAME].json"
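A quick way to confirm the variable actually reached your Python process is to read it back from inside Python; the fallback path below is just a placeholder, not a real key file:

```python
import os

# If the export above ran in the same shell session that launches Python,
# the variable will be present; otherwise the placeholder is used.
path = os.environ.get("GOOGLE_APPLICATION_CREDENTIALS",
                      "/home/user/Downloads/key.json")
print("credentials file:", path)
print("file exists:", os.path.exists(path))
```

If "file exists" prints False, the client library will fail with the same DefaultCredentialsError even though the variable is set, so check the path as well as the variable.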
This error message is usually thrown when the application is not being authenticated correctly, for one of several reasons: missing files, invalid credential paths, or incorrect environment variable assignments, among other causes. Keep in mind that when you set an environment variable in a session, it is reset every time the session is dropped.
Based on this, I recommend you validate that the credential file and file path are correctly assigned, and follow the Obtaining and providing service account credentials manually guide in order to specify your service account file directly in your code. This way, you can set it permanently and verify that you are passing the service credentials correctly.
Passing the path to the service account key in code example:
def explicit():
    from google.cloud import storage

    # Explicitly use service account credentials by specifying the private key
    # file.
    storage_client = storage.Client.from_service_account_json('service_account.json')

    # Make an authenticated API request
    buckets = list(storage_client.list_buckets())
    print(buckets)
I am using CKAN version 2.3 and currently using local storage. I want to store the resources in Amazon S3 buckets using boto. I did the configuration in CKAN's production.ini file, but uploads via the GUI are not stored on S3, and there are no exceptions or errors.
You need to setup your file uploads to point to s3 in config file e.g.- /etc/ckan/default/development.ini:
## OFS configuration
ofs.impl = s3
ofs.aws_access_key_id = xxxxxxxxx
ofs.aws_secret_access_key = xxxxxxxx
## bucket to use in storage:
ckanext.storage.bucket = your_s3_bucket
Save and exit. Then run the following on the command line to migrate from local file storage to remote storage:
paster db migrate-filestore
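As a quick sanity check (this is not CKAN itself), you can confirm the options above are well-formed ini syntax by parsing them the way an .ini reader would; the [app:main] section name is where CKAN options live in production.ini, and the values are the placeholders from the answer:

```python
import configparser

# The OFS options from the answer, embedded as a string for the check.
ini = """\
[app:main]
ofs.impl = s3
ofs.aws_access_key_id = xxxxxxxxx
ofs.aws_secret_access_key = xxxxxxxx
ckanext.storage.bucket = your_s3_bucket
"""

parser = configparser.ConfigParser()
parser.read_string(ini)
opts = dict(parser["app:main"])
print(opts["ofs.impl"])
print(opts["ckanext.storage.bucket"])
```

A stray character (like the backticks in the original paste) would surface here as a wrong value, which is an easy thing to miss in a config that fails silently.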