I have a file.csv in my bucket 'test'. I'm creating a new session and I want to download the contents of this file:
session = boto3.Session(
    aws_access_key_id=KEY,
    aws_secret_access_key=SECRET_KEY
)
s3 = session.resource('s3')
obj = s3.Bucket('test').objects.filter(Prefix='file.csv')
This returns a collection, but is there a way to fetch the file directly? Without any loops, I want to do something like:
s3.Bucket('test').objects.get(key='file.csv')
I could achieve the same result without passing credentials like this:
s3 = boto3.client('s3')
obj = s3.get_object(Bucket='test', Key='file.csv')
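If you do need to pass the credentials explicitly, the same direct fetch works on a client built from your session. A minimal sketch that reads the file contents without any loop (it assumes the object is UTF-8 text):
import boto3

session = boto3.Session(
    aws_access_key_id=KEY,
    aws_secret_access_key=SECRET_KEY
)
s3 = session.client('s3')

# get_object fetches a single key directly -- no collection, no filtering loop
obj = s3.get_object(Bucket='test', Key='file.csv')
contents = obj['Body'].read().decode('utf-8')  # assumes the CSV is UTF-8 text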
If you take a look at the client method:
import boto3
s3_client = boto3.client('s3')
s3_client.download_file('mybucket', 'hello.txt', '/tmp/hello.txt')
and the resource method:
import boto3
s3 = boto3.resource('s3')
s3.meta.client.download_file('mybucket', 'hello.txt', '/tmp/hello.txt')
you'll notice that you can convert from the resource to the client with meta.client.
So, combine it with your code to get:
session = boto3.Session(aws_access_key_id=KEY, aws_secret_access_key=SECRET_KEY)
s3 = session.resource('s3')
obj = s3.meta.client.download_file('mybucket', 'hello.txt', '/tmp/hello.txt')
I like mpu.aws.s3_download, but I'm biased ;-)
It does it like this:
import os
import boto3
def s3_download(bucket_name, key, destination, profile_name, exists_strategy='raise'):
    session = boto3.Session(profile_name=profile_name)
    s3 = session.resource('s3')
    if os.path.isfile(destination):
        if exists_strategy == 'raise':
            raise RuntimeError('File \'{}\' already exists.'
                               .format(destination))
        elif exists_strategy == 'abort':
            return
    s3.Bucket(bucket_name).download_file(key, destination)
For authentication, I recommend using environment variables. See boto3: Configuring Credentials for details.
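For example, with the standard variables exported in your shell, a minimal sketch with no keys in the code at all:
# Assumes AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY are set in the
# environment; boto3 resolves them automatically.
import boto3

s3 = boto3.client('s3')
s3.download_file('mybucket', 'hello.txt', '/tmp/hello.txt')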
You can use the following boto3 method:
download_file(Bucket, Key, Filename, ExtraArgs=None, Callback=None,
              Config=None)
s3 = boto3.resource('s3')
s3.meta.client.download_file('mybucket', 'hello.txt', '/tmp/hello.txt')
Find more details here: download_file()
I am trying to copy multiple files from one S3 bucket to another S3 bucket using a Lambda function, but it is only copying 2 files to the destination S3 bucket.
Here is my code:
# using python and boto3
import json
import boto3
s3_client = boto3.client('s3')
def lambda_handler(event, context):
    source_bucket_name = event['Records'][0]['s3']['bucket']['name']
    file_name = event['Records'][0]['s3']['object']['key']
    destination_bucket_name = 'nishantnkd'
    copy_object = {'Bucket': source_bucket_name, 'Key': file_name}
    s3_client.copy_object(CopySource=copy_object,
                          Bucket=destination_bucket_name, Key=file_name)
    return {'statusCode': 3000,
            'body': json.dumps('File has been Successfully Copied')}
I presume that the Amazon S3 bucket is configured to trigger the AWS Lambda function when a new object is created.
When the Lambda function is triggered, it is possible that multiple event records are sent to the function. Therefore, it should loop through the event records like this:
# using python and boto3
import json
import urllib.parse
import boto3

s3_client = boto3.client('s3')

def lambda_handler(event, context):
    for record in event['Records']:  # This loop added
        source_bucket_name = record['s3']['bucket']['name']
        file_name = urllib.parse.unquote_plus(record['s3']['object']['key'])  # Note this change too
        destination_bucket_name = 'nishantnkd'
        copy_object = {'Bucket': source_bucket_name, 'Key': file_name}
        s3_client.copy_object(CopySource=copy_object, Bucket=destination_bucket_name, Key=file_name)
    return {'statusCode': 200,  # a valid HTTP status code
            'body': json.dumps('File has been Successfully Copied')}
I need to write a Lambda function that retrieves an S3 object URL for object preview. I came across this solution, but I have a question about it. In my case, I would like to retrieve the URL of any object in my S3 bucket, hence there is no key name. How can I retrieve the URL of any future objects stored in my S3 bucket?
bucket_name = 'aaa'
aws_region = boto3.session.Session().region_name
object_key = 'aaa.png'
s3_url = f"https://{bucket_name}.s3.{aws_region}.amazonaws.com/{object_key}"
return {
    'statusCode': 200,
    'body': json.dumps({'s3_url': s3_url})
}
You have some examples here. But what exactly would you like to do? What do you mean by future objects? You can put a creation event on your bucket that will trigger your Lambda each time a new object is uploaded into that bucket.
import boto3
def lambda_handler(event, context):
    print(event)
    bucket = event['Records'][0]['s3']['bucket']['name']
    key = event['Records'][0]['s3']['object']['key']
    s3 = boto3.client('s3')
    obj = s3.get_object(
        Bucket=bucket,
        Key=key
    )
    print(obj['Body'].read().decode('utf-8'))
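To tie this back to the URL question, here is a minimal sketch that builds the object URL from the event record instead of a hard-coded key, assuming the bucket lives in the same region as the Lambda function:
import json
import boto3

def lambda_handler(event, context):
    record = event['Records'][0]
    bucket = record['s3']['bucket']['name']
    key = record['s3']['object']['key']
    region = boto3.session.Session().region_name  # assumes the bucket is in this region
    s3_url = f"https://{bucket}.s3.{region}.amazonaws.com/{key}"
    return {
        'statusCode': 200,
        'body': json.dumps({'s3_url': s3_url})
    }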
I've been scratching my head over this for days now and still couldn't solve the problem. Basically, I just want to put a CSV file into a LocalStack S3 bucket, and I can't get it working.
Here's a snippet of my code:
api.py
from files import s3, AWS_S3_BUCKET_NAME, upload_file_to_bucket
@router.post('/api/customer/generate/upload',
             name='Upload CSV to AWS S3 Bucket',
             status_code=201)
async def post_upload_user_csv(file_obj: UploadFile = File(...)):
    upload_obj = upload_file_to_bucket(s3_client=s3(),
                                       file_obj=file_obj.file,
                                       bucket=AWS_S3_BUCKET_NAME,
                                       folder='CSV',  # To Be updated
                                       object_name=file_obj.filename)
    if upload_obj:
        return JSONResponse(content="Object has been uploaded to bucket successfully",
                            status_code=status.HTTP_201_CREATED)
    else:
        raise HTTPException(status_code=status.HTTP_500_INTERNAL_SERVER_ERROR,
                            detail="File could not be uploaded")
files.py
import os
import boto3
import logging
from botocore.client import BaseClient
from botocore.exceptions import ClientError
AWS_ACCESS_KEY_ID = os.getenv('AWS_ACCESS_KEY_ID')
AWS_SECRET_KEY = os.getenv('AWS_SECRET_KEY')
AWS_S3_BUCKET_NAME = os.getenv('AWS_S3_BUCKET_NAME')
def s3() -> BaseClient:
    client = boto3.client(service_name='s3',
                          aws_access_key_id=AWS_ACCESS_KEY_ID,
                          aws_secret_access_key=AWS_SECRET_KEY,
                          endpoint_url='http://localhost:4566/')  # Use LocalStack endpoint
    return client
def upload_file_to_bucket(s3_client, file_obj, bucket, folder, object_name=None):
    """Upload a file to an S3 bucket

    :param s3_client: S3 Client
    :param file_obj: File to upload
    :param bucket: Bucket to upload to
    :param folder: Folder to upload to
    :param object_name: S3 object name. If not specified then file_name is used
    :return: True if file was uploaded, else False
    """
    # If S3 object_name was not specified, use file_name
    if object_name is None:
        object_name = file_obj
    # Upload the file
    try:
        # with open("files", "rb") as f:
        s3_client.upload_fileobj(file_obj, bucket, f"{folder}/{object_name}")
    except ClientError as e:
        logging.error(e)
        return False
    return True
The problem is that s3_client needs the file opened in binary mode before I can upload it to the S3 bucket. However, this can't be done directly, and the file would need to be saved temporarily on the FastAPI server, which I really don't want to do for obvious reasons.
Any help will be much appreciated. Thank you in advance!
Can you show the error first?
Basically, I have done this before: uploading a file directly from the front end to AWS S3. Can you try adding ContentType to upload_fileobj? Here is my code:
content_type = mimetypes.guess_type(fpath)[0]
s3.Bucket(bucket_name).upload_fileobj(Fileobj=file, Key=file_path,
                                      ExtraArgs={"ACL": "public-read",
                                                 "ContentType": content_type})
Another way: you could try converting the file to io.BytesIO.
def s3_upload(self, file, file_path, bucket_name, width=None, height=None, make_thumb=False, make_cover=False):
    s3 = boto3.resource(service_name='s3')
    obj = BytesIO(self.image_optimize_from_buffer(file, width, height, make_thumb, make_cover))
    s3.Bucket(bucket_name).upload_fileobj(Fileobj=obj, Key=file_path,
                                          ExtraArgs={"ACL": "public-read", "ContentType": file.content_type})
    return f'https://{bucket_name}.s3.amazonaws.com/{file_path}'
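Applied to the helper in files.py, a minimal sketch along those lines -- it assumes the route keeps passing file_obj.file (the spooled file behind FastAPI's UploadFile), so nothing is ever written to disk on the server:
import io
import logging

from botocore.exceptions import ClientError

def upload_file_to_bucket(s3_client, file_obj, bucket, folder, object_name):
    # Read the spooled upload into memory and hand boto3 a plain BytesIO
    buffer = io.BytesIO(file_obj.read())
    try:
        s3_client.upload_fileobj(buffer, bucket, f"{folder}/{object_name}")
    except ClientError as e:
        logging.error(e)
        return False
    return True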
I cannot download a file, or even get a listing of a public S3 bucket, with boto3.
The code below works with my own bucket, but not with the public one:
def s3_list(bucket, s3path_or_prefix):
    bsession = boto3.Session(aws_access_key_id=settings.AWS['ACCESS_KEY'],
                             aws_secret_access_key=settings.AWS['SECRET_ACCESS_KEY'],
                             region_name=settings.AWS['REGION_NAME'])
    s3 = bsession.resource('s3')
    my_bucket = s3.Bucket(bucket)
    items = my_bucket.objects.filter(Prefix=s3path_or_prefix)
    return [ii.key for ii in items]
I get an AccessDenied error on this code. The bucket is not my own and I cannot set permissions on it, but I am sure it is open for public read.
I had a similar issue in the past. I found the key to this in https://github.com/boto/boto3/issues/134 .
You can use an undocumented trick:
import boto3
import botocore.handlers

def s3_list(bucket, s3path_or_prefix, public=False):
    bsession = boto3.Session(aws_access_key_id=settings.AWS['ACCESS_KEY'],
                             aws_secret_access_key=settings.AWS['SECRET_ACCESS_KEY'],
                             region_name=settings.AWS['REGION_NAME'])
    client = bsession.client('s3')
    if public:
        client.meta.events.register('choose-signer.s3.*', botocore.handlers.disable_signing)
    result = client.list_objects(Bucket=bucket, Delimiter='/', Prefix=s3path_or_prefix)
    return [obj['Prefix'] for obj in result.get('CommonPrefixes')]
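As a side note on the design choice: the same effect is available through the documented botocore.UNSIGNED signature version, which makes the client send anonymous (unsigned) requests. A minimal sketch, with the bucket and prefix names as placeholders:
import boto3
from botocore import UNSIGNED
from botocore.config import Config

# Unsigned requests need no credentials at all
client = boto3.client('s3', config=Config(signature_version=UNSIGNED))
result = client.list_objects(Bucket='some-public-bucket', Prefix='some/prefix/', Delimiter='/')
keys = [obj['Key'] for obj in result.get('Contents', [])]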
I am looking to copy files from GCS to my S3 bucket. In boto2, it was as easy as pressing a button:
conn = connect_gs(user_id, password)
gs_bucket = conn.get_bucket(gs_bucket_name)
for obj in gs_bucket:
    s3_key = key.Key(s3_bucket)
    s3_key.key = obj
    s3_key.set_contents_from_filename(obj)
However, in boto3 I am lost trying to find the equivalent code. Any takers?
If all you're doing is a copy:
import boto3
s3 = boto3.resource('s3')
bucket = s3.Bucket('bucket-name')
for obj in gcs:
    s3_obj = bucket.Object(gcs.key)
    s3_obj.put(Body=gcs.data)
Docs: s3.Bucket, s3.Bucket.Object, s3.Bucket.Object.put
Alternatively, if you don't want to use the resource model:
import boto3
s3_client = boto3.client('s3')
for obj in gcs:
    s3_client.put_object(Bucket='bucket-name', Key=gcs.key, Body=gcs.body)
Docs: s3_client.put_object
Caveat: The gcs bits are pseudocode, I am not familiar with their API.
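For what it's worth, one concrete way to fill in the gcs side, sketched under the assumption that the google-cloud-storage package is used and with placeholder bucket names:
import boto3
from google.cloud import storage  # assumes the google-cloud-storage package

gcs_client = storage.Client()
s3_client = boto3.client('s3')

# Stream each GCS object into memory and write it to S3 under the same key
for blob in gcs_client.list_blobs('my-gcs-bucket'):  # placeholder bucket names
    s3_client.put_object(Bucket='my-s3-bucket',
                         Key=blob.name,
                         Body=blob.download_as_bytes())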
EDIT:
So it seems gcs supports an old version of the S3 API and with that an old version of the signer. We still have support for that old signer, but you have to opt into it. Note that some regions don't support old signing versions (you can see a list of which S3 regions support which versions here), so if you're trying to copy over to one of those you will need to use a different client.
import boto3
from botocore.client import Config
# Create a resource that uses the old 's3' (SigV2) signer
resource = boto3.resource('s3', config=Config(signature_version='s3'))
gcs_bucket = resource.Bucket('phjordon-test-bucket')
s3_bucket = resource.Bucket('phjordon-test-bucket-tokyo')
for obj in gcs_bucket.objects.all():
    s3_bucket.Object(obj.key).copy_from(
        CopySource=obj.bucket_name + "/" + obj.key)
Docs: s3.Object.copy_from
This, of course, will only work assuming gcs is still S3 compliant.