Boto3 - how to connect to S3 via proxy? - python

I'm using a container that simulates an S3 server running on http://127.0.0.1:4569 (with no authorization or credentials needed),
and I'm trying to simply connect and print a list of all the bucket names using Python and boto3.
here's my docker-compose:
s3:
  image: andrewgaul/s3proxy
  environment:
    S3PROXY_AUTHORIZATION: none
  hostname: s3
  ports:
    - 4569:80
  volumes:
    - ./data/s3:/data
here's my code:
s3 = boto3.resource('s3', endpoint_url='http://127.0.0.1:4569')
for bucket in s3.buckets.all():
    print(bucket.name)
here's the error message that I received:
botocore.exceptions.NoCredentialsError: Unable to locate credentials
I tried this solution => How do you use an HTTP/HTTPS proxy with boto3?
but it still isn't working, and I don't understand what I'm doing wrong.

First, boto3 always tries to handshake with the S3 server using AWS API keys. Even if your simulation server doesn't need a password, you still need to specify credentials, either in your .aws/credentials or inside your program, e.g.
[default]
aws_access_key_id = x
aws_secret_access_key = x
Or hardcode dummy access keys in your program, for example:
import boto3
session = boto3.session.Session(
    aws_access_key_id='x',
    aws_secret_access_key='x')
s3 = session.resource('s3', endpoint_url='http://127.0.0.1:4569')
Second, I don't know how reliable your "S3 simulation container" is or what kind of protocol it implements. To make life easier, I always suggest that anyone who wants to simulate S3 for load testing or anything else use fake-s3.
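Since the question title mentions a proxy: if you do also need to route boto3 through an HTTP/HTTPS proxy, one sketch (the proxy URL is a placeholder, not something from your setup) is to pass a botocore Config with a proxies dict, or simply set the HTTP_PROXY/HTTPS_PROXY environment variables:
import boto3
from botocore.config import Config

# Dummy keys so botocore can sign requests against the local endpoint,
# plus an explicit proxy. Replace the proxy URL with your own.
session = boto3.session.Session(
    aws_access_key_id='x',
    aws_secret_access_key='x')
s3 = session.resource(
    's3',
    endpoint_url='http://127.0.0.1:4569',
    config=Config(proxies={'http': 'http://my-proxy:3128'}))
for bucket in s3.buckets.all():
    print(bucket.name)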

Related

S3 AWS Session Expired

I am new to AWS S3. I did a lot of googling around how to connect to S3 using Python and found that everyone is using Boto, so it's what I am using as the client. I used PowerShell to log in and create the .aws/credentials file. In that file, I was able to get the aws_access_key_id, aws_secret_access_key AND aws_session_token needed to establish the session. I understand the session only lasts about 8 hours, so the next day when my Python script runs to connect to S3 the session is obviously expired. How can I overcome this and how can I establish a new session daily? Below is my code.
s3_client = boto3.client(
"s3",
aws_access_key_id=id_,
aws_secret_access_key=secret,
aws_session_token=token,
region_name='r'
)
# Test it on a service (yours may be different)
# s3 = session.resource('s3')
# Print out bucket names
for bucket in s3.buckets.all():
# print(bucket.name)
bucket = 'automated-reports' # already created on S3
csv_buffer = StringIO()
all_active_scraper_counts_df.to_csv(csv_buffer, index=False)
# s3_resource = boto3.resource('s3')
put_response = s3_client.put_object(Bucket=bucket, Key="all_active_scrapers.csv", Body=csv_buffer.getvalue())
status = put_response.get("ResponseMetadata", {}).get("HTTPStatusCode")
if status == 200:
print(f"Successful S3 put_object response. Status - {status}")
else:
print(f"Unsuccessful S3 put_object response. Status - {status}")
You have three options:
Run a script that updates the credentials before you run your Python script. You can use the AWS CLI sts assume-role command to get a new set of credentials.
Add a try/except block inside your code to handle the expired-credentials error, then generate new credentials and re-initialize the S3 client (see the sketch after this list).
Instead of using a role, use a user (IAM identities). User credentials can be valid indefinitely, so you won't need to update the credentials in this case.
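For the first two options, here is a rough sketch of refreshing credentials in code with STS assume_role and retrying on an expired-token error (the role ARN and session name are placeholders, and csv_buffer is the StringIO from the question):
import boto3
from botocore.exceptions import ClientError

def make_s3_client():
    # Assume the role again to obtain a fresh set of temporary credentials.
    sts = boto3.client('sts')
    creds = sts.assume_role(
        RoleArn='arn:aws:iam::123456789012:role/my-role',  # placeholder
        RoleSessionName='daily-report',                     # placeholder
    )['Credentials']
    return boto3.client(
        's3',
        aws_access_key_id=creds['AccessKeyId'],
        aws_secret_access_key=creds['SecretAccessKey'],
        aws_session_token=creds['SessionToken'],
    )

s3_client = make_s3_client()
try:
    s3_client.put_object(Bucket='automated-reports', Key='all_active_scrapers.csv', Body=csv_buffer.getvalue())
except ClientError as err:
    if err.response['Error']['Code'] in ('ExpiredToken', 'ExpiredTokenException'):
        # Credentials went stale: re-initialize the client and retry once.
        s3_client = make_s3_client()
        s3_client.put_object(Bucket='automated-reports', Key='all_active_scrapers.csv', Body=csv_buffer.getvalue())
    else:
        raise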

boto3 s3 initialized session return credentials

I am facing something similar to How to load file from custom hosted Minio s3 bucket into pandas using s3 URL format?
However, I already have an initialized S3 session (from boto3).
How can I get the credentials returned from it to feed these directly to pandas?
I.e. how can I extract the AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY from the initialized boto3 s3 client?
You can use session.get_credentials
import boto3
session = boto3.Session()
credentials = session.get_credentials()
AWS_ACCESS_KEY_ID = credentials.access_key
AWS_SECRET_ACCESS_KEY = credentials.secret_key
AWS_SESSION_TOKEN = credentials.token
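If the goal (as in the question) is to feed these values to pandas, one way, assuming pandas with s3fs installed so that storage_options is forwarded to s3fs, looks roughly like this (the bucket and key are placeholders):
import boto3
import pandas as pd

session = boto3.Session()
# get_frozen_credentials() returns a consistent snapshot, even for refreshable credentials
credentials = session.get_credentials().get_frozen_credentials()

df = pd.read_csv(
    "s3://my-bucket/path/to/file.csv",   # placeholder bucket/key
    storage_options={
        "key": credentials.access_key,
        "secret": credentials.secret_key,
        "token": credentials.token,
    },
)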
If you only have access to boto client (like the S3 client), you can find the credentials hidden here:
client = boto3.client("s3")
client._request_signer._credentials.access_key
client._request_signer._credentials.secret_key
client._request_signer._credentials.token
If you don't want to handle credentials at all (I assume you're using SSO here), you can load the S3 object directly with pandas: pd.read_csv(s3_client.get_object(Bucket='Bucket', Key='FileName').get('Body'))

botocore.exceptions.InvalidConfigError: The source profile "default" must have credentials

The code below fails at the line s3 = boto3.client('s3'), returning the error botocore.exceptions.InvalidConfigError: The source profile "default" must have credentials.
def connect_s3_boto3():
    try:
        os.environ["AWS_PROFILE"] = "a"
        s3 = boto3.client('s3')
        return s3
    except:
        raise
I have set up the key and secret using aws configure
My file vim ~/.aws/credentials looks like:
[default]
aws_access_key_id = XXXXXXXXXXXXXXXXX
aws_secret_access_key = YYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYY
My file vim ~/.aws/config looks like:
[default]
region = eu-west-1
output = json
[profile b]
region=eu-west-1
role_arn=arn:aws:iam::XX
source_profile=default
[profile a]
region=eu-west-1
role_arn=arn:aws:iam::YY
source_profile=default
[profile d]
region=eu-west-1
role_arn=arn:aws:iam::EE
source_profile=default
If I run aws-vault exec --no-session --debug a
it returns:
aws-vault: error: exec: Failed to get credentials for a9e: InvalidClientTokenId: The security token included in the request is invalid.
status code: 403, request id: 7087ea72-32c5-4b0a-a20e-fd2da9c3c747
I noticed you tagged this question with "docker". Is it possible that you're running your code from a Docker container that does not have your AWS credentials in it?
Use a docker volume to pass your credential files into the container:
https://docs.docker.com/storage/volumes/
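For example (the image name is a placeholder), a read-only bind mount of your local credentials directory into the container could look like:
docker run -v ~/.aws:/root/.aws:ro my-image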
It is not a good idea to add credentials to a container image, because anybody who uses the image will have, and can use, your credentials.
This is considered bad practice.
For more information on how to properly deal with secrets, see https://docs.docker.com/engine/swarm/secrets/
I ran into this problem while trying to assume a role on an ECS container. It turned out that in such cases, instead of source_profile, credential_source should be used. It takes the value EcsContainer for a container, Ec2InstanceMetadata for an EC2 instance, or Environment for other cases.
Since the solution is not very intuitive, I thought it might save someone the trouble, despite the age of this question.
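Applied to the ~/.aws/config from the question, profile a would then look roughly like this when the code runs on ECS (keeping whatever role_arn you already have):
[profile a]
region=eu-west-1
role_arn=arn:aws:iam::YY
credential_source=EcsContainer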
In the end, the issue was that Docker didn't have the credentials, and despite connecting through bash and adding them there, it didn't work.
So, in the Dockerfile I added:
ADD myfolder/aws/credentials /root/.aws/credentials
to copy my local credentials file (created through the AWS CLI with aws configure) into the image. Then I built the image again and it works.

I can use CyberDuck to connect to S3 Bucket, but I cannot programmatically connect

I am attempting to connect to an S3 bucket (A 3rd party is the owner, so I cannot access through AWS console). Using CyberDuck, I can connect and upload files no problem. However I have tried several libraries to connect to the bucket all of which return a 403 forbidden. I am posting here in hopes that someone can spot what I am doing incorrectly.
def send_to_s3(file_name):
    csv = open("/tmp/" + file_name, 'rb')
    conn = tinys3.Connection("SECRET",
                             "SECRET",
                             tls=True,
                             endpoint="s3.amazonaws.com")
    conn.upload("MDA-Data-Ingest/input/" + file_name, csv, bucket="gsext-69qlakrroehhgr0f47bhffnwct")

def send_via_ftp(file_name):
    cnopts = pysftp.CnOpts()
    cnopts.hostkeys = None
    srv = pysftp.Connection(host="gsext-69qlakrroehhgr0f47bhffnwct.s3.amazonaws.com",
                            username="SECRET",
                            password="SECRET",
                            port=443,
                            cnopts=cnopts)
    with srv.cd('\\MDA-Data-Ingest\\input'):
        srv.put('\\tmp\\' + file_name)
    # Closes the connection
    srv.close()

def send_via_boto(file_name):
    access_key = 'SECRET'
    secret_key = 'SECRET'
    conn = boto.connect_s3(
        aws_access_key_id=access_key,
        aws_secret_access_key=secret_key,
        host='s3.amazonaws.com',
        # is_secure=False,  # uncomment if you are not using ssl
        calling_format=boto.s3.connection.OrdinaryCallingFormat(),
    )
All of these functions return a 403 Forbidden, as shown below:
HTTPError: 403 Client Error: Forbidden for url: https://gsext-69qlakrroehhgr0f47bhffnwct.s3.amazonaws.com/MDA-Data-Ingest/input/accounts.csv
However, when I use CyberDuck I can connect just fine.
The easiest method would be to use the AWS Command-Line Interface (CLI), which uses boto3 to access AWS services.
For example:
aws s3 ls s3://bucket-name --region us-west-2
aws s3 cp s3://gsext-69qlakrroehhgr0f47bhffnwct/MDA-Data-Ingest/input/accounts.csv accounts.csv
You would first run aws configure to provide your credentials and a default region, but the syntax above allows you to specify the particular region in which the bucket is located. (It is possible that your Python code failed because it was calling the wrong region.)
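If you want to stay in Python, a boto3 sketch of the same upload, assuming the bucket really is in us-west-2 as in the CLI example above, would be roughly:
import boto3

# Region, bucket and key mirror the CLI example above; adjust as needed.
s3 = boto3.client(
    's3',
    region_name='us-west-2',
    aws_access_key_id='SECRET',
    aws_secret_access_key='SECRET',
)
s3.upload_file(
    '/tmp/accounts.csv',
    'gsext-69qlakrroehhgr0f47bhffnwct',
    'MDA-Data-Ingest/input/accounts.csv',
)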

"TypeError: expected string, tuple found" when passing aws credentials to amazon client constructor

I have a python script that calls the Amazon SES api using boto3. It works when I create the client like this client = boto3.client('ses') and allow the aws credentials to come from ~/.aws/credentials, but I wanted to pass the aws_access_key_id and aws_secret_access_key into the constructor somehow.
I thought I had found somewhere that said it was acceptable to do something like this
client = boto3.client(
'ses',
aws_access_key_id=kwargs['aws_access_key_id'],
aws_secret_access_key=kwargs['aws_secret_access_key'],
region_name=kwargs['region_name']
)
but then when I try to send an email, it tells me that there is a TypeError: sequence item 0: expected string, tuple found when it tries to return '/'.join(scope) in botocore/auth.py (line 276).
I know it's a bit of a long shot, but I was hoping someone had an idea of how I can pass these credentials to the client from somewhere other than the aws credentials file. I also have the full stack trace from the error, if that's helpful I can post it as well. I just didn't want to clutter up the question initially.
You need to configure your connection info elsewhere and then connect using:
client = boto3.client('ses', AWS_REGION)
An alternative way, using Session can be done like this:
from boto3.session import Session
# create boto session
session = Session(
    aws_access_key_id=settings.AWS_ACCESS_KEY_ID,
    aws_secret_access_key=settings.AWS_SECRET_ACCESS_KEY,
    region_name=settings.AWS_REGION
)
# make connection
client = session.client('ses')
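Once the client is created this way, sending a message is an ordinary send_email call; a minimal sketch (the addresses are placeholders) is:
response = client.send_email(
    Source='sender@example.com',
    Destination={'ToAddresses': ['recipient@example.com']},
    Message={
        'Subject': {'Data': 'Test'},
        'Body': {'Text': {'Data': 'Hello from boto3'}},
    },
)
print(response['MessageId'])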
