I am following this help guide, which explains how to save a file to an S3 bucket after creating it. I cannot find, however, an explanation of how to save to an existing bucket. More specifically, I am unsure how to reference a preexisting bucket. I believe replacing create_bucket with get_bucket does the trick. This allows me to save the file, but because the documentation says that get_bucket "retrieves a bucket by name", I wanted to check here and make sure that it only retrieves the bucket's metadata and does not download all of the contents of the bucket to my computer. Am I doing this right, or is there a more Pythonic way?
import boto
from boto.s3.key import Key

s3 = boto.connect_s3()
bucket = s3.get_bucket('mybucket')
k = Key(bucket)
k.key = 'foobar'
k.set_contents_from_string('This is a test of S3')
Your code looks reasonable. The get_bucket method will either return a Bucket object or raise an S3ResponseError if the specified bucket name does not exist.
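To address your concern: get_bucket() only makes a lightweight request to look up the bucket, it does not download any of the bucket's objects. A minimal sketch of handling a missing bucket, assuming your default credentials are configured:

import boto
from boto.exception import S3ResponseError

s3 = boto.connect_s3()
try:
    bucket = s3.get_bucket('mybucket')
except S3ResponseError:
    # The bucket does not exist (or you don't have access to it)
    print("Bucket 'mybucket' not found")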
You could simplify your code a little:
import boto
s3 = boto.connect_s3()
bucket = s3.get_bucket('mybucket')
k = bucket.new_key('foobar')
k.set_contents_from_string('This is a test of S3')
But it accomplishes the same thing.
I have written code on my backend (hosted on Elastic Beanstalk) to retrieve a file from an S3 bucket and save it back to the bucket under a different name. I am using boto3 and have created an s3 client called 's3'.
bucketname is the name of the bucket, keyname is the name of the key. I am also using the tempfile module:
tmp = tempfile.NamedTemporaryFile()
with open(tmp.name, 'wb') as f:
    s3.download_fileobj(bucketname, keyname, f)
s3.upload_file(tmp, bucketname, 'fake.jpg')
I was wondering if my understanding is off (I'm still debugging why there is an error): I created a tempfile, opened it, and saved into it the contents of the object with that keyname and bucketname. Then I uploaded that temp file to the bucket under a different name. Is my reasoning correct?
The upload_file() command is expecting a filename (as a string) in the first parameter, not a file object.
Instead, you should use upload_fileobj().
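For example, a small sketch of the corrected upload that reuses the tmp file from your code (the temp file has to be re-opened for reading first):

# Re-open the temp file in binary read mode and hand the file object
# to upload_fileobj(), which accepts a file-like object
with open(tmp.name, 'rb') as f:
    s3.upload_fileobj(f, bucketname, 'fake.jpg')

# Alternatively, keep upload_file() but pass the filename string instead:
# s3.upload_file(tmp.name, bucketname, 'fake.jpg')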
However, I would recommend something different...
If you simply wish to make a copy of an object, you can use copy_object:
response = client.copy_object(
    Bucket='destinationbucket',
    CopySource='/sourcebucket/HappyFace.jpg',
    Key='HappyFaceCopy.jpg',
)
I am using the boto3 library to create an S3 folder using Python. (I want to create a directory 'c' in an already existing directory structure like '/a/b'.)
s3_client=boto3.client('s3')
s3_client =put_object(Bucket=bucket_name, Key='a/b/c/')
I am not getting any error, but the directory is also not getting created. I can't really figure out the reason, any suggestions?
Not sure if it is a typo in the code you show but it should give you an error. I think what you are trying to do is:
s3_client = boto3.client('s3')
response = s3_client.put_object(Bucket=bucket_name, Key='a/b/c/')
print("Response: {}".format(response)) # See result of request.
There are no such things as folders or directories in S3.
To upload a file to your bucket you could use:
s3_client=boto3.client('s3')
# you have to provide my_binary_data
response = s3_client.put_object(Body=my_binary_data, Bucket=bucket_name, Key='a/b/c/')
where Key represents the name of your file.
You can read more about Client.put_object here.
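For example, you do not need to create 'a/b/c/' at all; uploading an object whose key includes the full prefix is enough, and the S3 console will then show 'a', 'b' and 'c' as folders. A small sketch with a placeholder payload and file name:

import boto3

s3_client = boto3.client('s3')

# 'a/b/c/' never has to exist beforehand: the key prefix alone makes the
# console render the nested "folders"
response = s3_client.put_object(
    Body=b'example file contents',   # placeholder payload
    Bucket=bucket_name,
    Key='a/b/c/myfile.txt',          # placeholder file name under the prefix
)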
I'm having trouble figuring out how to access a certain folder within a bucket in s3 using Python
Let's say I'm trying to access this folder in the bucket which contains a bunch of images that I want to run rekognition on:
"myBucket/subfolder/images/"
In /images/ folder there are:
one.jpg
two.jpg
three.jpg
four.jpg
I want to run rekognition's detect_labels on this folder. However, I can't seem to access this folder, but if I change bucket_name to just the root bucket ("myBucket"), then I can access the bucket itself.
bucket_name = "myBucket/subfolder/images/"
rekognition = boto3.client('rekognition')
s3 = boto3.resource('s3')
bucket = s3.Bucket(name=bucket_name)
That is operating as expected. The bucket name should be just the name of the bucket.
You can then run operations on the bucket, eg:
import boto3
s3 = boto3.resource('s3', region_name='ap-southeast-2')
bucket = s3.Bucket('my-bucket')
for object in bucket.objects.all():
    if object.key.startswith('images'):
        print(object.key)
Or, using the client instead of the resource:
import boto3
client = boto3.client('s3', region_name='ap-southeast-2')
response = client.list_objects_v2(Bucket='my-bucket', Prefix='images/')
for object in response['Contents']:
    print(object['Key'])
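Note that list_objects_v2 returns at most 1000 keys per call, so for larger buckets a paginator is safer; a small sketch using the same example bucket and prefix:

import boto3

client = boto3.client('s3', region_name='ap-southeast-2')
paginator = client.get_paginator('list_objects_v2')

# Iterate over every page of results under the prefix
for page in paginator.paginate(Bucket='my-bucket', Prefix='images/'):
    for obj in page.get('Contents', []):
        print(obj['Key'])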
You could index what you have in S3 somewhere else, so that you can access exactly what you need directly. Keep in mind that looping over the files stored in a bucket can perform very poorly, and it gets really slow when the number of keys is large.
Following your example, another way to do it:
bucket_name = "myBucket"
folder_name = "subfolder/images/"
rekognition = boto3.client('rekognition')
keys = ['one.jpg', 'two.jpg', 'three.jpg', 'four.jpg']
s3 = boto3.resource('s3')
for k in keys:
    obj = s3.Object(bucket_name, folder_name + k)
    print(obj.key)
Get the list of items (keys) from any DB table in your system.
In the case of AWS Rekognition (as asked), an image file stored in a folder in an S3 bucket will have a key of the form folder_name/subfolder_name/image_name.jpg. So, since the boto3 Rekognition detect_labels() method has this syntax (per https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/rekognition.html#Rekognition.Client.detect_labels):
response = client.detect_labels(
    Image={
        'Bytes': b'bytes',
        'S3Object': {
            'Bucket': 'string',
            'Name': 'string',
            'Version': 'string'
        }
    },
    MaxLabels=123,
    MinConfidence=...
)
where the value of Name should be an S3 object key name, you can just feed the entire folder-image path to that dictionary as a string. To loop over multiple images, generate a list of image file names, as recommended in Evhz's answer, and loop over that list while calling the detect_labels() method above (or use a generator).
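A minimal sketch of that loop, reusing the bucket, folder and key list from Evhz's answer (the MaxLabels and MinConfidence values here are just illustrative):

import boto3

rekognition = boto3.client('rekognition')
bucket_name = "myBucket"
folder_name = "subfolder/images/"
keys = ['one.jpg', 'two.jpg', 'three.jpg', 'four.jpg']

for k in keys:
    response = rekognition.detect_labels(
        Image={'S3Object': {'Bucket': bucket_name, 'Name': folder_name + k}},
        MaxLabels=10,        # illustrative values
        MinConfidence=80,
    )
    # Each response carries the detected labels and their confidence scores
    print(k, [label['Name'] for label in response['Labels']])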
This code writes JSON to a file in S3.
What I wanted to achieve is, instead of opening the data.json file and writing it to S3 (as sample.json),
how do I pass the JSON directly and write it to a file in S3?
import boto3
s3 = boto3.resource('s3', aws_access_key_id='aws_key', aws_secret_access_key='aws_sec_key')
s3.Object('mybucket', 'sample.json').put(Body=open('data.json', 'rb'))
I'm not sure if I get the question right. You just want to write JSON data to a file using Boto3? The following code writes a Python dictionary to a JSON file.
import json
import boto3
s3 = boto3.resource('s3')
s3object = s3.Object('your-bucket-name', 'your_file.json')
s3object.put(
    Body=(bytes(json.dumps(json_data).encode('UTF-8')))
)
I don't know if anyone is still attempting to use this thread, but I was attempting to upload a JSON to S3 using the method above, which didn't quite work for me. Boto and S3 might have changed since 2018, but this achieved the results for me:
import json
import boto3
s3 = boto3.client('s3')
json_object = 'your_json_object here'
s3.put_object(
    Body=json.dumps(json_object),
    Bucket='your_bucket_name',
    Key='your_key_here'
)
Amazon S3 is an object store (a file store, in reality). The primary operations are PUT and GET. You cannot add data into an existing object in S3; you can only replace the entire object itself.
For a list of available operations you can perform on s3 see this link
http://docs.aws.amazon.com/AmazonS3/latest/API/RESTObjectOps.html
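So if you need to "append" to a JSON object, the usual pattern is read-modify-write; a rough sketch, assuming the object holds a JSON list and reusing the placeholder bucket and key names from above:

import json
import boto3

s3 = boto3.client('s3')

# Read the whole object, change it locally, then replace it entirely
response = s3.get_object(Bucket='your_bucket_name', Key='your_key_here')
data = json.loads(response['Body'].read())
data.append({'new': 'record'})   # hypothetical extra record
s3.put_object(Bucket='your_bucket_name', Key='your_key_here', Body=json.dumps(data))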
An alternative to Joseph McCombs' answer can be achieved using s3fs.
import json
from s3fs import S3FileSystem

json_object = {'test': 3.14}
path_to_s3_object = 's3://your-s3-bucket/your_json_filename.json'

s3 = S3FileSystem()
with s3.open(path_to_s3_object, 'w') as file:
    json.dump(json_object, file)
I'm attempting to save an image to S3 using boto. It does save a file, but it doesn't appear to save it correctly. If I try to open the file in S3, it just shows a broken image icon. Here's the code I'm using:
# Get and verify the file
file = request.FILES['file']
try:
    img = Image.open(file)
except:
    return api.error(400)
# Determine a filename
filename = file.name
# Upload to AWS and register
s3 = boto.connect_s3(aws_access_key_id=settings.AWS_KEY_ID,
                     aws_secret_access_key=settings.AWS_SECRET_ACCESS_KEY)
bucket = s3.get_bucket(settings.AWS_BUCKET)
f = bucket.new_key(filename)
f.set_contents_from_file(file)
I've also tried replacing the last line with:
f.set_contents_from_string(file.read())
But that didn't work either. Is there something obvious that I'm missing here? I'm aware django-storages has a boto backend, but because of complexity with this model, I do not want to use forms with django-storages.
In case you don't want to go for django-storages and just want to upload a few files to S3 rather than all the files, then below is the code:
import boto3
file = request.FILES['upload']
s3 = boto3.resource('s3', aws_access_key_id=settings.AWS_ACCESS_KEY, aws_secret_access_key=settings.AWS_SECRET_ACCESS_KEY)
bucket = s3.Bucket('bucket-name')
# filename as determined from the upload, e.g. file.name
bucket.put_object(Key=filename, Body=file)
You should use django-storages, which uses boto internally.
You can either swap the default FileSystemStorage, or create a new storage instance and manually save files. Based on your code example I guess you really want to go with the first option.
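If you go the django-storages route, here is a minimal sketch of the manual-save option (it assumes the S3 backend is configured as your default storage in settings, and reuses request.FILES from your view):

from django.core.files.storage import default_storage

# With the S3 storage backend configured as the default storage, this
# writes the uploaded file directly to the bucket defined in settings
file = request.FILES['file']
saved_path = default_storage.save(file.name, file)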
Please consider using Django's Form instead of directly accessing the request.