Say I have the following bucket set up on S3:
MyBucket/MySubBucket
And I am planning on serving static media for a website out of MySubBucket (images users have uploaded, etc.). The way the S3 configuration currently appears to work, there is no way to make the top-level bucket "MyBucket" public, so to mimic this you have to make every individual item in that bucket public after it has been inserted.
You can make a sub-bucket like "MySubBucket" public, but the problem I am currently having is figuring out how to call boto to insert a test image into that bucket. Can anyone provide an example?
Thanks
You actually just need to put the sub-bucket name and a / before the file name. The following code snippet will put a file called 'test.jpg' into MySubBucket. The k.make_public() call makes that file public, but not necessarily the bucket itself.
from boto.s3.connection import S3Connection
from boto.s3.key import Key

connection = S3Connection(AWSKEY, AWS_SECRET_KEY)
bucket = connection.get_bucket('MyBucket')

file = 'test.jpg'
k = Key(bucket)
k.key = 'MySubBucket/' + file       # prefix the key with the "sub-bucket" name
k.set_contents_from_filename(file)  # upload the local file under that key
k.make_public()                     # make this object (not the whole bucket) public
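For reference, the same upload with boto3 might look roughly like the sketch below; it assumes credentials are configured in the environment and reuses the bucket and key names from the question.

import boto3

s3 = boto3.client('s3')
s3.upload_file(
    'test.jpg',                        # local file to upload
    'MyBucket',                        # bucket name from the question
    'MySubBucket/test.jpg',            # key, prefixed with the "sub-bucket" name
    ExtraArgs={'ACL': 'public-read'},  # make just this object publicly readable
)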
I'm trying to read a file's contents (not download it) from an S3 bucket. The problem is that the file is located under a multi-level folder. For instance, the full path could be s3://s3-bucket/folder-1/folder-2/my_file.json. How can I get that specific file instead of using my iterative approach that lists all objects?
Here is the code that I want to change:
import boto3

s3 = boto3.resource('s3')
bucket = s3.Bucket('s3-bucket')

for obj in bucket.objects.all():
    key = obj.key
    if key == 'folder-1/folder-2/my_file.json':
        return obj.get()['Body'].read()  # assumes this snippet lives inside a function
Can it be done in a simpler, more direct way?
Yes - there is no need to enumerate the bucket.
Read the file directly using s3.Object, providing the bucket name as the first parameter and the object key as the second parameter.
"Folders" don't really exist in S3 - Amazon S3 doesn't use hierarchy to organize its objects and files. For the sake of organizational simplicity, the Amazon S3 console shows "folders" as a means of grouping objects, but they are ultimately baked into your object key.
This should work:
import boto3
s3 = boto3.resource('s3')
obj = s3.Object("s3-bucket", "folder-1/folder-2/my_file.json")
body = obj.get()['Body'].read()
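If the goal is the parsed JSON rather than the raw bytes, a short continuation of the snippet above might look like this (assuming the object really is UTF-8 encoded JSON):

import json

# 'body' is the bytes value read in the snippet above
data = json.loads(body.decode('utf-8'))
print(data)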
I am trying to check all the keys in an S3 bucket: if a key has a certain user-defined metadata header, move on to the next one; if it does not have the user-defined metadata header, I want to add it with a certain key/value pair. I have been trying to do some kind of put command, but I am fairly new to this.
I have gotten this far:
Basically, I am able to print all the keys within my specified bucket:
import boto3

s3 = boto3.resource('s3')

# This will search ALL of the files within a specified bucket
my_bucket = s3.Bucket('client-data')
for file in my_bucket.objects.all():
    print(file.key)
If I could do a conditional statement saying "if the metadata header does not exist, then add it", that would be awesome. Any help is appreciated.
Thanks!
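One possible approach, sketched below with boto3: S3 metadata cannot be edited in place, so to add a user-defined header you copy the object onto itself with MetadataDirective='REPLACE'. The header name 'processed' and its value are placeholders for whatever key/value pair you need; the bucket name comes from the snippet above.

import boto3

s3 = boto3.client('s3')
bucket_name = 'client-data'
paginator = s3.get_paginator('list_objects_v2')

for page in paginator.paginate(Bucket=bucket_name):
    for item in page.get('Contents', []):
        key = item['Key']
        head = s3.head_object(Bucket=bucket_name, Key=key)
        metadata = head['Metadata']      # user-defined metadata, without the x-amz-meta- prefix
        if 'processed' in metadata:      # header already present, move on to the next key
            continue
        metadata['processed'] = 'false'  # placeholder key/value pair to add
        # Metadata cannot be changed in place; copy the object onto itself with REPLACE.
        s3.copy_object(
            Bucket=bucket_name,
            Key=key,
            CopySource={'Bucket': bucket_name, 'Key': key},
            Metadata=metadata,
            MetadataDirective='REPLACE',
            ContentType=head['ContentType'],  # re-specify so the content type is not reset
        )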
What I've been doing:
I've been reading a lot of the boto3 documentation, but I'm still struggling to get it working the way I want, as this is my first time using AWS.
I'm trying to use boto3 to access Microsoft Excel files that are uploaded to an S3 bucket. I'm able to use boto3.Session() to supply my "hard coded" credentials and from there print the names of the files that are in my bucket.
However, I'm trying to figure out how to access the contents of that Excel file in that bucket.
My end goal / what I'm trying to do:
The end goal of this project is to have people upload Excel files with zip codes in them (organized in the cells) into the S3 bucket, and then have each file sent to an EC2 instance so that a program I wrote can read one zip code at a time from the file and process certain things.
Any help is greatly appreciated, as this is all new and overwhelming.
This is the code I am trying:
import boto3

session = boto3.Session(
    aws_access_key_id='put key here',
    aws_secret_access_key='put key here',
)

s3 = session.resource('s3')
bucket = s3.Bucket('bucket name')

for f in bucket.objects.all():
    print(f.key)
f.download_file('testrun')
Since you are using the boto3 resource API instead of the client API, you should be calling download_file inside the for loop, on the full Object resource (an ObjectSummary doesn't expose download_file itself), like so:
for f in bucket.objects.all():
    print(f.key)
    f.Object().download_file('the name i want it to be downloaded as')
Try the following:
for f in bucket.objects.all():
    obj = bucket.Object(f.key)
    # write each object to a local file named after its key
    # (keys containing '/' need the matching local directories to exist)
    with open(f.key, 'wb') as data:
        obj.download_fileobj(data)
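If the eventual goal is to read the spreadsheet's contents rather than just save it to disk, one option is to pull the object into memory and hand it to a spreadsheet library. A rough sketch, assuming pandas (with openpyxl) is installed and using placeholder bucket, key, and column names:

import io

import boto3
import pandas as pd  # assumes pandas and openpyxl are installed

s3 = boto3.resource('s3')
obj = s3.Object('bucket name', 'zipcodes.xlsx')  # placeholder bucket and key

buffer = io.BytesIO(obj.get()['Body'].read())    # download the file into memory
df = pd.read_excel(buffer)                       # parse the workbook

for zip_code in df['zip_code']:                  # 'zip_code' is a hypothetical column name
    print(zip_code)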
I have an S3 bucket with a few top level folders, and hundreds of files in each of these folders. How do I get the names of these top level folders?
I have tried the following:
import boto3

s3 = boto3.resource('s3', region_name='us-west-2', endpoint_url='https://s3.us-west-2.amazonaws.com')
bucket = s3.Bucket('XXX')
for obj in bucket.objects.filter(Prefix='', Delimiter='/'):
    print(obj.key)
But this doesn't seem to work. I have thought about using regex to filter all the folder names, but this doesn't seem time efficient.
Thanks in advance!
Try this.
import boto3
client = boto3.client('s3')
paginator = client.get_paginator('list_objects')
result = paginator.paginate(Bucket='my-bucket', Delimiter='/')
for prefix in result.search('CommonPrefixes'):
    print(prefix.get('Prefix'))
The Amazon S3 data model is a flat structure: you create a bucket, and the bucket stores objects. There is no hierarchy of subbuckets or subfolders; however, you can infer logical hierarchy using key name prefixes and delimiters as the Amazon S3 console does (source)
In other words, there's no way around iterating all of the keys in the bucket and extracting whatever structure you want to see (depending on your needs, a dict-of-dicts may be a good approach for you), for example as in the sketch below.
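A minimal sketch of that brute-force approach, collecting the top-level "folder" names with the resource API from the question (the bucket name is the question's placeholder):

import boto3

s3 = boto3.resource('s3')
bucket = s3.Bucket('XXX')  # placeholder bucket name from the question

top_level = set()
for obj in bucket.objects.all():
    if '/' in obj.key:
        top_level.add(obj.key.split('/', 1)[0])  # everything before the first '/' is the "folder"

print(sorted(top_level))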
You could also use Amazon Athena in order to analyse/query S3 buckets.
https://aws.amazon.com/athena/
I have found many questions related to this with solutions using boto3; however, I am in a position where I have to use boto, running Python 2.38.
Now I can successfully transfer my files in their folders (not real folders, I know, as S3 doesn't have this concept), but I want them to be saved into a particular folder in my destination bucket.
from boto.s3.connection import S3Connection

def transfer_files():
    conn = S3Connection()
    srcBucket = conn.get_bucket("source_bucket")
    dstBucket = conn.get_bucket(bucket_name="destination_bucket")
    objectlist = srcBucket.list()
    for obj in objectlist:
        dstBucket.copy_key(obj.key, srcBucket.name, obj.key)
My srcBucket will look like folder/subFolder/anotherSubFolder/file.txt, which when transferred will land in the dstBucket like so: destination_bucket/folder/subFolder/anotherSubFolder/file.txt.
I would like it to end up in destination_bucket/targetFolder, so the final directory structure would look like
destination_bucket/targetFolder/folder/subFolder/anotherSubFolder/file.txt
Hopefully I have explained this well enough and it makes sense.
The first parameter of copy_key() is the name of the destination key.
Therefore, just use:
dstBucket.copy_key('targetFolder/' + obj.key, srcBucket.name, obj.key)
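Putting that into the original function, the whole transfer might look roughly like this (same placeholder bucket names and target folder as above):

from boto.s3.connection import S3Connection

def transfer_files():
    conn = S3Connection()
    srcBucket = conn.get_bucket("source_bucket")
    dstBucket = conn.get_bucket(bucket_name="destination_bucket")
    for obj in srcBucket.list():
        # prepend the target "folder" so every key lands under targetFolder/
        dstBucket.copy_key('targetFolder/' + obj.key, srcBucket.name, obj.key)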