Boto3 doesn't use keys from config file - python

I want Boto3 to get the access and secret key from a config file instead of hard-coding them. On my Linux server I set the environment variable AWS_SHARED_CREDENTIALS_FILE to /app/.aws/credentials. In /app/.aws/ I put a file named credentials with the following content:
[default]
aws_access_key_id = abcd
aws_secret_access_key = abcd
Of course I used the actual keys instead of abcd.
Python:
import boto3
conn = boto3.client('s3',
                    region_name="eu-west-1",
                    endpoint_url="endpoint",
                    aws_access_key_id=aws_access_key_id,
                    aws_secret_access_key=aws_secret_access_key,
                    config=Config(signature_version="s3", s3={'addressing_style': 'path'}))
However, it fails with NameError: name 'aws_access_key_id' is not defined. How can I fix it? Thanks.
Edit:
>>> os.environ['AWS_SHARED_CREDENTIALS_FILE']
'/app/.aws/credentials'

If you already have a credentials file with your AWS credentials, you don't need to specify them when instantiating your client. The following should work:
import boto3
from botocore.client import Config

conn = boto3.client('s3',
                    region_name="eu-west-1",
                    endpoint_url="endpoint",
                    config=Config(signature_version="s3", s3={'addressing_style': 'path'}))

If you are running your application on an EC2 instance, you can also assign an S3 role to the instance and have boto3 use that role. This prevents you from having to store keys on your instance.
Look at the section "Assume Role Provider" in the docs:
http://boto3.readthedocs.io/en/latest/guide/configuration.html
Link to IAM roles as well:
http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/iam-roles-for-amazon-ec2.html
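For reference, the assume-role setup that guide describes can be expressed as a profile in ~/.aws/config; the profile name and role ARN below are placeholders, not values from the question:
[profile assume-instance-role]
role_arn = arn:aws:iam::123456789012:role/my-s3-role
credential_source = Ec2InstanceMetadata
A session created with boto3.Session(profile_name='assume-instance-role') then uses the instance's role to assume the target role automatically.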

Related

do you have to pass SSO profile credentials in order to assume the IAM role using boto3

I have my config file set up with multiple profiles and I am trying to assume an IAM role, but all the articles I see about assuming roles start by making an STS client like this:
import boto3
client = boto3.client('sts')
That makes sense, but it gives me an error when I try to do it that way. However, when I pass a profile that exists in my config file, it works. Here is the code:
import boto3

session = boto3.Session(profile_name="test_profile")
sts = session.client("sts")
response = sts.assume_role(
    RoleArn="arn:aws:iam::xxx:role/role-name",
    RoleSessionName="test-session"
)
new_session = boto3.Session(
    aws_access_key_id=response['Credentials']['AccessKeyId'],
    aws_secret_access_key=response['Credentials']['SecretAccessKey'],
    aws_session_token=response['Credentials']['SessionToken']
)
When other people assume roles in their code without passing a profile in, how does that even work? Does boto3 automatically grab the default profile from the config file, or something like that, in their case?
Yes. This line:
client = boto3.client('sts')
tells boto3 to create a client on the default session, which resolves the default credentials.
The credentials can be provided in the ~/.aws/credentials file. If the code is running on an Amazon EC2 instance, boto3 will automatically use the credentials of the IAM Role attached to the instance.
Credentials can also be passed via Environment Variables.
See: Credentials — Boto3 documentation
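As a minimal sketch of that default flow (the role ARN below is a placeholder):
import boto3

# No profile_name: boto3 builds a default session and resolves credentials from
# ~/.aws/credentials, environment variables, or an attached IAM role.
sts = boto3.client('sts')
response = sts.assume_role(
    RoleArn="arn:aws:iam::111111111111:role/example-role",  # placeholder ARN
    RoleSessionName="example-session"
)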

Boto3: How to assume IAM Role to access other account

Looking for some guidance with regard to uploading files into an AWS S3 bucket via a Python script and an IAM role. I am able to upload files using boto3 and an aws_access_key_id & aws_secret_access_key in other scripts.
However, I have now been given an IAM role to log in to a certain account. I have no issue using the AWS CLI to authenticate and query the S3 data, so I believe my .aws/credentials and .aws/config files are correct. I am just not sure how to use the ARN value within my Python code.
This is what I have put together so far, but I get a variety of errors which all lead to access being denied:
import boto3

session = boto3.Session(profile_name='randomName')
session.client('sts').get_caller_identity()

assumed_role_session = boto3.Session(profile_name='randomNameAccount')
print(assumed_role_session.client('sts').get_caller_identity())

credentials = session.get_credentials()
aws_access_key_id = credentials.access_key
aws_secret_access_key = credentials.secret_key

s3 = boto3.client('s3',
                  aws_access_key_id=aws_access_key_id,
                  aws_secret_access_key=aws_secret_access_key)
bucket_name = 'bucketName'
This is a sample of what my credentials and config files look like, for reference.
.aws/config file:
[profile randomNameAccount]
role_arn = arn:aws:iam::12345678910:role/roleName
source_profile = randomName
.aws/credentials file:
[randomName]
aws_access_key_id = 12345678910
aws_secret_access_key = 1234567-abcdefghijk
My question is about the Python code needed to authenticate against AWS and navigate an S3 bucket using an IAM role, and then upload files when I call an upload function.
Thank you in advance.
You should create an entry for the IAM Role in ~/.aws/credentials that refers to a set of IAM User credentials that have permission to assume the role:
[my-user]
aws_access_key_id = AKIAxxx
aws_secret_access_key = xxx
[my-role]
source_profile = my-user
role_arn = arn:aws:iam::123456789012:role/the-role
Add an entry to ~/.aws/config to provide a default region:
[profile my-role]
region = ap-southeast-2
Then you can assume the IAM Role with this code:
import boto3
# Create a session by assuming the role in the named profile
session = boto3.Session(profile_name='my-role')
# Use the session to access resources via the role
s3_client = session.client('s3')
response = s3_client.list_objects(Bucket=...)
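Since the goal is to upload files into the other account's bucket, a short follow-up with the same session might look like this (the local file, bucket, and key names are placeholders):
# Upload a local file using the assumed role's permissions
s3_client.upload_file('local-file.txt', 'my-bucket', 'path/in/bucket/local-file.txt')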

Setting ["GOOGLE_APPLICATION_CREDENTIALS"] from a dict rather than file path

I'm trying to set the environment variable from a dict, but I get an error when connecting.
#service account pulls in airflow variable that contains the json dict with service_account credentials
service_account = Variable.get('google_cloud_credentials')
os.environ["GOOGLE_APPLICATION_CREDENTIALS"]=str(service_account)
error
PermissionDeniedError: Error executing an HTTP request: HTTP response code 403 with body '<?xml version='1.0' encoding='UTF-8'?><Error><Code>AccessDenied</Code><Message>Access denied.</Message><Details>Anonymous caller does not have storage.objects.get access to the Google Cloud Storage object.</Details></Error>'
When reading, if I instead point to a file, there are no issues:
os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = "/file/path/service_account.json"
I'm wondering: is there a way to convert the dict object to an os.path-like object? I don't want to store the JSON file on the container, and the Airflow/Google documentation isn't clear at all.
Python's io.StringIO lets you create a file-like object backed by a string, but that won't help here because the consumer of this environment variable is expecting a file path, not a file-like object. I don't think it's possible to do what you're trying to do. Is there a reason you don't want to just put the credentials in a file?
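If writing them to a file is acceptable, a minimal sketch (assuming the Airflow variable holds the key as a JSON string) is to dump it to a temporary file and point the environment variable at that path:
import os
import tempfile
from airflow.models import Variable

service_account = Variable.get('google_cloud_credentials')  # JSON string

# Persist the key to a temp file and hand its path to the Google client libraries.
with tempfile.NamedTemporaryFile(mode="w", suffix=".json", delete=False) as f:
    f.write(str(service_account))

os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = f.name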
There is a way to do it, but the Google documentation is terrible. So I wrote a Github gist to document the recipe that I and a colleague (Imee Cuison) developed to use the key securely. Sample code below:
import json

from google.oauth2.service_account import Credentials
from google.cloud import secretmanager


def access_secret(project_id: str, secret_id: str, version_id: str = "latest") -> str:
    """Return the secret in string format"""
    # Create the Secret Manager client.
    client = secretmanager.SecretManagerServiceClient()
    # Build the resource name of the secret version.
    name = f"projects/{project_id}/secrets/{secret_id}/versions/{version_id}"
    # Access the secret version.
    response = client.access_secret_version(name=name)
    # Return the decoded payload.
    return response.payload.data.decode('UTF-8')


def get_credentials_from_token(token: str) -> Credentials:
    """Given a service account key as a JSON string, return a Credentials object"""
    credential_dict = json.loads(token)
    return Credentials.from_service_account_info(credential_dict)


credentials_secret = access_secret("my_project", "my_secret")
creds = get_credentials_from_token(credentials_secret)
# And now you can use the `creds` Credentials object to authenticate to an API
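Given the Cloud Storage error in the question, a hedged usage sketch (bucket and object names are placeholders) would pass creds to the client explicitly instead of relying on GOOGLE_APPLICATION_CREDENTIALS:
from google.cloud import storage

# Authenticate the client with the Credentials object built above.
storage_client = storage.Client(credentials=creds, project="my_project")
blob = storage_client.bucket("my-bucket").blob("path/to/object.csv")
data = blob.download_as_bytes()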
Putting the service account key into the repository is not good practice. As a best practice, use authentication propagated from the default Google auth within your application.
For instance, on Google Kubernetes Engine you can use the following Python code:
import google.auth
import google.auth.transport.requests
from google.cloud.container_v1 import ClusterManagerClient

# Resolve the application default credentials available in the environment.
credentials, project = google.auth.default(
    scopes=['https://www.googleapis.com/auth/cloud-platform', ])
credentials.refresh(google.auth.transport.requests.Request())
cluster_manager = ClusterManagerClient(credentials=credentials)

Boto3 Error in AWS SDK: botocore.exceptions.NoCredentialsError: Unable to locate credentials

When I simply run the following code, I always get this error.
import boto3 as boto
import sys
import json

role_to_assume_arn = "arn:aws:iam::xxxxxxxxxxxx:role/AWSxxxx_xxxxxxAdminaccess_xxxxx24fexxx"
role_session_name = 'AssumeRoleSession1'

sts_client = boto.client('sts')
assumed_role_object = sts_client.assume_role(
    RoleArn=role_to_assume_arn,
    RoleSessionName="Sess1",
)
creds = assumed_role_object['Credentials']

sts_assumed_role = boto.client('sts',
                               aws_access_key_id=creds['AccessKeyId'],
                               aws_secret_access_key=creds['SecretAccessKey'],
                               aws_session_token=creds['SessionToken'],
                               )
rds_client = boto.client('rds',
                         aws_access_key_id=creds['AccessKeyId'],
                         aws_secret_access_key=creds['SecretAccessKey'],
                         aws_session_token=creds['SessionToken']
                         )
I don't want to set and change temporary session keys manually all the time; instead I want them to be set directly through code like I've just written.
Am I wrong? Is there a way to set the credentials like this directly in the program or not?
Or is it mandatory to provide the credentials in ~/.aws/credentials?
I assume you are running this code on your local machine.
The STS client you created still needs an access key and secret access key to make its first call.
You either have to configure them in a credentials file, or you can hardcode your access key and secret access key directly, like below (not recommended):
client = boto3.client('sts', aws_access_key_id=key, aws_secret_access_key=sec_key, region_name=region_name)
https://docs.aws.amazon.com/sdk-for-php/v3/developer-guide/guide_credentials_profiles.html
If you are running this code on an EC2 instance, install boto3 and run aws configure. Follow the link below:
https://docs.aws.amazon.com/cli/latest/userguide/cli-chap-configure.html
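Once the keys are in ~/.aws/credentials (for example via aws configure), a minimal sketch of the same call needs no keys in the code:
import boto3

# With a [default] profile configured, the default credential chain supplies the keys.
sts_client = boto3.client('sts')
print(sts_client.get_caller_identity())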

Download Files From AWS S3 Without Access and Secret Keys in Python

I have working code that downloads files from one of my S3 buckets and does some conversion work in Python. I do not embed the access and secret keys in the code, but the keys are in my AWS CLI configuration.
import boto3
import botocore
import pyarrow.parquet as pq  # pq is used below to read the parquet file

BUCKET_NAME = 'converted-parquet-bucket'  # replace with your own bucket name
KEY = 'json-to-parquet/names.snappy.parquet'  # replace with your path and key object

s3 = boto3.resource('s3')
try:
    s3.Bucket(BUCKET_NAME).download_file(KEY, 'names.snappy.parquet')  # replace the key object name
except botocore.exceptions.ClientError as e:  # exception handling
    if e.response['Error']['Code'] == "404":
        print("The object does not exist.")  # printed if the object you are looking for does not exist
    else:
        raise

# Uncomment the two lines below (and `import pandas`) to convert csv to parquet
# dataframe = pandas.read_csv('names.csv')
# dataframe.to_parquet('names.snappy.parquet', engine='auto', compression='snappy')

data = pq.read_pandas('names.snappy.parquet',
                      columns=['Year of Birth', 'Gender', 'Ethnicity',
                               "Child's First Name", 'Count', 'Rank']).to_pandas()
# print(data)  # this would print ALL the data in the parquet file
print(data.loc[data['Gender'] == 'MALE'])  # print only the rows matching the query
Could someone help me get this code working without having the access and secret keys embedded in the code or in the AWS CLI configuration?
If you are running your function locally, you need to have your credentials in your local credentials/config file to interact with AWS resources.
One alternative would be to run on AWS Lambda (if your function runs periodically, you can set that up with CloudWatch Events) and use Environment Variables or AWS Security Token Service (STS) to generate temporary credentials.
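For the environment-variable option, boto3 picks the standard variables up automatically, so a rough sketch again has no keys in the code itself:
import boto3

# AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY and (optionally) AWS_SESSION_TOKEN
# are read from the environment by boto3's default credential chain.
s3 = boto3.resource('s3')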
If you do not want to use secret/access keys, then you should use roles and policies. Here's the deal:
Define a role (ex. RoleWithAccess) and be sure that your user (defined in your credentials) can assume this role
Set a policy for RoleWithAccess, giving read/write access to your buckets
If you are executing it on your local machine, run the necessary AWS CLI commands to create a profile that makes you assume RoleWithAccess (ex. ProfileWithAccess)
Execute your script using a session, passing this profile as the argument, which means you need to replace:
s3 = boto3.resource('s3')
with
session = boto3.session.Session(profile_name='ProfileWithAccess')
s3 = session.resource('s3')
The upside of this approach is that if you are running it inside an EC2 instance, you can tie your instance to a specific role when you build it (ex. RoleWithAccess). In that case, you can completely ignore session, profile, all the AWS CLI hocus pocus, and just run s3 = boto3.resource('s3').
You can also use AWS Lambda, setting a role and a policy with read/write permission to your bucket.
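Tying this back to the original download code, a sketch under the assumption that a ProfileWithAccess profile exists might look like:
import boto3

session = boto3.session.Session(profile_name='ProfileWithAccess')
s3 = session.resource('s3')
# Same download as before, now authorized through the assumed role
s3.Bucket('converted-parquet-bucket').download_file(
    'json-to-parquet/names.snappy.parquet', 'names.snappy.parquet')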
