I was given this S3 URL: s3://file.share.external.bdex.com/Offrs
Under this URL is a set of files I need to download.
I have this code:
import boto3

s3_client = boto3.client('s3',
    aws_access_key_id='<<ACCESS KEY>>',
    aws_secret_access_key='<<SECRET_ACCESS_KEY>>'
)

object_listing = s3_client.list_objects_v2(Bucket='file.share.external.bdex.com/Offrs',
                                           Prefix='')
print(object_listing)
I have tried:
Bucket='file.share.external.bdex.com', Prefix='Offrs'
Bucket='s3://file.share.external.bdex.com/Offrs/'
Bucket='file.share.external.bdx.com/Offrs', Prefix='Offrs'
and several other configurations, all of which either fail the bucket-name regex validation (because of the slash) or come back as not found.
What am I missing?
Thank you.
The bucket name and the key prefix are separate parameters:
Bucket = 'file.share.external.bdex.com'
Prefix = 'Offrs/'
You can test your access permissions via the AWS CLI:
aws s3 ls s3://file.share.external.bdex.com/Offrs/
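Once the bucket name and prefix are split correctly, the files can be listed and downloaded with the same client. A minimal sketch, assuming credentials are already configured; the local downloads/ directory is just an example target:

import os
import boto3

s3_client = boto3.client('s3')   # credentials from env vars, ~/.aws, or explicit keys

bucket = 'file.share.external.bdex.com'
prefix = 'Offrs/'

# Paginate in case there are more than 1000 objects under the prefix.
paginator = s3_client.get_paginator('list_objects_v2')
for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
    for obj in page.get('Contents', []):
        key = obj['Key']
        if key.endswith('/'):                        # skip "folder" placeholder objects
            continue
        local_path = os.path.join('downloads', key)  # local target dir is an assumption
        os.makedirs(os.path.dirname(local_path), exist_ok=True)
        s3_client.download_file(bucket, key, local_path)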
I am using Python and a Jupyter notebook to read files from an AWS S3 bucket, and I am getting the error NoCredentialsError: Unable to locate credentials when running the following code:
import boto3

s3 = boto3.resource('s3')
bucket = s3.Bucket('my-bucket')

for obj in bucket.objects.all():
    key = obj.key
    body = obj.get()['Body'].read()
I believe I need to put my access key somewhere, but I am not sure where. Thank you!
If you have the AWS CLI installed, just run aws configure. Then you will be good to go.
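If you would rather not depend on aws configure, the credentials can also be passed to boto3 directly when the resource is created. A minimal sketch; the key values are placeholders, and a credentials file or IAM role is generally preferable to hard-coding keys:

import boto3

# Placeholders: substitute your own keys, or better, load them from the
# environment, a shared credentials file, or an IAM role.
s3 = boto3.resource(
    's3',
    aws_access_key_id='YOUR_ACCESS_KEY_ID',
    aws_secret_access_key='YOUR_SECRET_ACCESS_KEY',
)

bucket = s3.Bucket('my-bucket')
for obj in bucket.objects.all():
    print(obj.key)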
I'm trying to move the contents of a bucket from account-a to a bucket in account-b, and I already have credentials for both accounts.
Here's the code I'm currently using:
import boto3

SRC_AWS_KEY = 'src-key'
SRC_AWS_SECRET = 'src-secret'
DST_AWS_KEY = 'dst-key'
DST_AWS_SECRET = 'dst-secret'

srcSession = boto3.session.Session(
    aws_access_key_id=SRC_AWS_KEY,
    aws_secret_access_key=SRC_AWS_SECRET
)
dstSession = boto3.session.Session(
    aws_access_key_id=DST_AWS_KEY,
    aws_secret_access_key=DST_AWS_SECRET
)

copySource = {
    'Bucket': 'src-bucket',
    'Key': 'test-bulk-src'
}

srcS3 = srcSession.resource('s3')
dstS3 = dstSession.resource('s3')

dstS3.meta.client.copy(CopySource=copySource, Bucket='dst-bucket',
                       Key='test-bulk-dst', SourceClient=srcS3.meta.client)
print('success')
The problem is that when I append a file name to the Key (for example test-bulk-src/file.csv) it works fine, but when I point it at the whole folder, as shown in the code, it fails and throws this exception:
botocore.exceptions.ClientError: An error occurred (404) when calling the HeadObject operation: Not Found
What I need is to move the contents in one call, not by iterating through the contents of the source folder, because that is time- and cost-consuming when there may be thousands of files to move.
There is no API call in Amazon S3 to copy folders. (Folders do not actually exist — the Key of each object includes its full path.)
You will need to iterate through each object and copy it.
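In boto3 that iteration could look roughly like the sketch below, reusing the two sessions from the question; the bucket names and prefixes are the question's placeholders:

# Sketch: copy everything under 'test-bulk-src/' from src-bucket to dst-bucket,
# reusing srcSession / dstSession from the code above.
srcClient = srcSession.client('s3')
dstClient = dstSession.client('s3')

paginator = srcClient.get_paginator('list_objects_v2')
for page in paginator.paginate(Bucket='src-bucket', Prefix='test-bulk-src/'):
    for obj in page.get('Contents', []):
        key = obj['Key']
        dstClient.copy(
            CopySource={'Bucket': 'src-bucket', 'Key': key},
            Bucket='dst-bucket',
            Key=key.replace('test-bulk-src/', 'test-bulk-dst/', 1),
            SourceClient=srcClient,
        )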
The AWS CLI (written in Python) provides some higher-level commands that will do this iteration for you:
aws s3 cp --recursive s3://source-bucket/folder/ s3://destination-bucket/folder/
If the buckets are in different accounts, I would recommend:
Use a set of credentials for the destination account (avoids problems with object ownership)
Modify the bucket policy on the source bucket to permit access by the credentials from the destination account (avoids the need to use two sets of credentials)
import tinys3

conn = tinys3.Connection(S3_ACCESS_KEY, S3_SECRET_KEY)
f = open('sample.zip', 'rb')
conn.upload('sample.zip', f, bucketname)
I can upload the file to my bucket (test) via the code above, but I want to upload it directly to test/images/example. I am open to moving over to boto, but I can't seem to import boto.s3 in my environment.
I have looked through How to upload a file to directory in S3 bucket using boto but none of the tinys3 examples show this.
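For what it's worth, S3 has no real directories: the "folder" is just a prefix on the object key. Assuming tinys3's upload() takes the destination key as its first argument (as the snippet above suggests), a sketch would be:

import tinys3

S3_ACCESS_KEY = '<access key>'   # placeholders
S3_SECRET_KEY = '<secret key>'
bucketname = 'test'              # the bucket from the question

conn = tinys3.Connection(S3_ACCESS_KEY, S3_SECRET_KEY)

# Assumption: the first argument to upload() is the full object key,
# so the "folder" path can simply be included in it.
with open('sample.zip', 'rb') as f:
    conn.upload('images/example/sample.zip', f, bucketname)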
import boto3
client = boto3.client('s3', region_name='ap-southeast-2')
client.upload_file('/tmp/foo.txt', 'my-bucket', 'test/images/example/foo.txt')
The following worked for me
from boto3 import client
from boto3.s3.transfer import S3Transfer

client_obj = client('s3',
    aws_access_key_id='my_aws_access_key_id',
    aws_secret_access_key='my_aws_secret_access_key')

transfer = S3Transfer(client_obj)
transfer.upload_file(src_file,
                     'my_s3_bucket_name',
                     dst_file,
                     extra_args={'ContentType': "application/zip"})
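A note on the above: the boto3 client also exposes upload_file directly with an ExtraArgs parameter, so the same upload can be done without constructing S3Transfer yourself. A small sketch with the same placeholder names:

# Equivalent upload using the client method directly (src_file, dst_file as above).
client_obj.upload_file(src_file,
                       'my_s3_bucket_name',
                       dst_file,
                       ExtraArgs={'ContentType': 'application/zip'})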
I'm trying to upload a file to an existing AWS s3 bucket, generate a public URL and use that URL (somewhere else) to download the file.
I'm closely following the example here:
import os
import boto3
import requests
import tempfile

s3 = boto3.client('s3')

with tempfile.NamedTemporaryFile(mode="w", delete=False) as outfile:
    outfile.write("dummycontent")
    file_name = outfile.name

with open(file_name, mode="r") as outfile:
    s3.upload_file(outfile.name, "twistimages", "filekey")

os.unlink(file_name)

url = s3.generate_presigned_url(
    ClientMethod='get_object',
    Params={
        'Bucket': 'twistimages',
        'Key': 'filekey'
    }
)

response = requests.get(url)
print(response)
I would expect to see a success return code (200) from the requests library.
Instead, stdout is: <Response [400]>
Also, if I navigate to the corresponding URL in a web browser, I get an XML document with the error code InvalidRequest and this error message:
The authorization mechanism you have provided is not supported. Please
use AWS4-HMAC-SHA256.
How can I use boto3 to generate a public URL, which can easily be downloaded by any user by just navigating to the corresponding URL, without generating complex headers?
Why does the example code from the official documentation not work in my case?
AWS S3 still supports the legacy v2 signature in the older US regions (those launched before 2014), but newer AWS regions only accept AWS4-HMAC-SHA256 (signature version s3v4). To use s3v4 you must specify it explicitly, either in the ~/.aws/config file or when instantiating the boto3 S3 resource/client, e.g.:
# add this entry under ~/.aws/config
[default]
s3.signature_version = s3v4
[other profile]
s3.signature_version = s3v4
Or declare it explicitly when creating the client or resource:
s3client = boto3.client('s3', config=boto3.session.Config(signature_version='s3v4'))
s3resource = boto3.resource('s3', config=boto3.session.Config(signature_version='s3v4'))
I solved the issue. I'm using an S3 bucket in the eu-central-1 region and after specifying the region in the config file, everything worked as expected and the script stdout was <Response [200]>.
The configuration file (~/.aws/config) now looks like:
[default]
region=eu-central-1
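If both fixes are needed (a non-default region and SigV4 signing), they can be combined when building the client; a small sketch:

import boto3
from botocore.client import Config

s3 = boto3.client(
    's3',
    region_name='eu-central-1',               # the bucket's region
    config=Config(signature_version='s3v4'),  # SigV4, required by newer regions
)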
When my Airflow DAG fails, I get the error logs under a path like:
/home/ec2-user/software/airflow/logs/dagtest_dag/trigger_load/2019-10-10T06:01:33.342433+00:00/1.log
How can I copy these logs to an S3 bucket?
Configure this as a cron job.
import boto3

s3 = boto3.client('s3')  # make sure your client is authenticated

with open('path/to/your/logs.log', 'rb') as data:
    s3.upload_fileobj(data, 'bucketname', 'path/to/your/logs/in/s3.log')
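If the cron job should pick up every log file under the Airflow log directory rather than one fixed file, a sketch along the same lines; the local root, bucket name, and key prefix are placeholders:

import os
import boto3

s3 = boto3.client('s3')

log_root = '/home/ec2-user/software/airflow/logs'  # placeholder: your Airflow log dir
bucket = 'bucketname'                              # placeholder: your bucket

# Walk the log tree and mirror it under an "airflow-logs/" prefix in S3.
for dirpath, _, filenames in os.walk(log_root):
    for name in filenames:
        local_path = os.path.join(dirpath, name)
        key = 'airflow-logs/' + os.path.relpath(local_path, log_root)
        s3.upload_file(local_path, bucket, key)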