While writing Lambda code to copy a file from a source bucket to a destination bucket, I'm facing an issue: the CloudWatch logs show the file name (including the folder) as
insidesource/newsource/test/Active%3D1/devopsnotes.txt
whereas my actual folder name contains Active=1.
Please find the CloudWatch logs and the Lambda code below.
I need assistance on how to decode this particular file name from the CloudWatch logs.
lambda code and cloudwatch logs
The S3 object key names are URL-encoded in CloudWatch logs.
Check out Characters that might require special handling for S3 object names.
You can use unquote_plus() from urllib.parse to get the decoded file name:
from urllib.parse import unquote_plus
file_name = "insidesource/newsource/test/Active%3D1/devopsnotes.txt"
print(unquote_plus(file_name))
Output:
insidesource/newsource/test/Active=1/devopsnotes.txt
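For example, the decoding can be done directly in the Lambda handler, straight from the incoming S3 event (a minimal sketch, assuming the standard S3 event notification structure; the destination bucket name is a placeholder):
from urllib.parse import unquote_plus
import boto3

s3 = boto3.client('s3')

def lambda_handler(event, context):
    record = event['Records'][0]
    source_bucket = record['s3']['bucket']['name']
    # the key arrives URL-encoded, e.g. Active%3D1 instead of Active=1
    key = unquote_plus(record['s3']['object']['key'])
    # use the decoded key when copying to the destination bucket
    s3.copy_object(Bucket='destination-bucket',  # hypothetical name
                   Key=key,
                   CopySource={'Bucket': source_bucket, 'Key': key})
    return {'bucket': source_bucket, 'key': key}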
Related
I am coding in Python in AWS Lambda and I need to write the output that shows up in the output logs to a folder of an S3 bucket. The code is:
client = boto3.client('s3')
output_data = b'Checksums are matching for file'
client.put_object(Body=output_data, Bucket='bucket', Key='key')
Now I want to write "Checksums are matching for file" followed by the file name. How can I add that?
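One way to do that is to build the message with string formatting before calling put_object (a rough sketch; file_name here is a hypothetical variable holding the name of the file whose checksum you just verified):
import boto3

client = boto3.client('s3')
file_name = 'devopsnotes.txt'  # hypothetical: the file that was checked
output_data = 'Checksums are matching for file {}'.format(file_name).encode('utf-8')
client.put_object(Body=output_data, Bucket='bucket', Key='key')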
What I've been doing:
I've been reading a lot of the documentation on boto3, but I'm still struggling to get it working the way I want, as this is my first time using AWS.
I'm trying to use boto3 to access Microsoft Excel files that are uploaded to an S3 bucket. I'm able to use boto3.Session() to pass my hard-coded credentials and from there print the names of the files in my bucket.
However, I'm trying to figure out how to access the contents of an Excel file in that bucket.
My end goal/what I'm trying to do:
The end goal of this project is to have people upload Excel files with zip codes in them (organized in the cells) into the S3 bucket, and then have each file sent to an EC2 instance where a program I wrote reads the file one zip code at a time and processes certain things.
Any help is greatly appreciated, as this is all new and overwhelming.
This is the code I am trying:
import boto3
session = boto3.Session(
    aws_access_key_id='put key here',
    aws_secret_access_key='put key here',
)
s3 = session.resource('s3')
bucket = s3.Bucket('bucket name')
for f in bucket.objects.all():
    print(f.key)
    f.download_file('testrun')
Since you are using the boto3 resource API instead of the client API, you should be calling the Object.download_file method inside the for loop. Note that bucket.objects.all() yields ObjectSummary items, which do not have download_file themselves, so get the full Object first, like so:
for f in bucket.objects.all():
    print(f.key)
    # f.Object() returns the full Object, which has download_file
    f.Object().download_file('the name i want it to be downloaded as')
Try the following:
for f in bucket.objects.all():
    obj = bucket.Object(f.key)
    with open(f.key, 'wb') as data:
        obj.download_fileobj(data)
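If the goal is to read the spreadsheet contents directly instead of saving them to disk first, one option is to pull the object into memory and open it with a library such as openpyxl (a sketch under that assumption; openpyxl is not part of boto3 and would have to be installed separately, and the key name is a placeholder):
import io
import boto3
from openpyxl import load_workbook

s3 = boto3.resource('s3')
obj = s3.Object('bucket name', 'zipcodes.xlsx')  # hypothetical key
data = obj.get()['Body'].read()                  # read the whole file into memory
workbook = load_workbook(io.BytesIO(data), read_only=True)
sheet = workbook.active
for row in sheet.iter_rows(values_only=True):
    print(row)  # each row is a tuple of cell values, e.g. the zip codes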
I am trying to write a file to S3 from a JSON structure in a Python 2.7 script. The code is as follows:
S3_bucket = s3.Bucket(__S3_BUCKET__)
result = S3_bucket.put_object(Key=__S3_BUCKET_PATH__ + 'file_prefix_' + str(int(time.time())) + '.json', Body=str(json.dumps(dict_list)).encode("utf-8"))
The S3 bucket handler I end up with is
s3.Bucket(name='bucket_name')
the S3 file path is /file_prefix_1545039898.json, and the result is
{'statusCode': s3.Object(bucket_name='bucket_name', key='/file_prefix_1545039898.json')}
But I see nothing on S3 - no files were created. I have a suspicion that I may require a commit of some kind, but all the manuals I came across say otherwise. Has anyone had a problem like this?
Apparently, the leading slash does not work as a standard path designator - it creates a "folder" with an empty name, which is why the file was not seen. Removing the slash puts things where they belong.
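Applied to the snippet above, that just means stripping the leading slash before building the key (a small sketch reusing the question's variables):
# a key of '/file_prefix_...' creates an empty-named top-level "folder"; use a relative key instead
key = __S3_BUCKET_PATH__.lstrip('/') + 'file_prefix_' + str(int(time.time())) + '.json'
result = S3_bucket.put_object(Key=key, Body=str(json.dumps(dict_list)).encode("utf-8"))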
How do I link my S3 bucket folder to the output bucket in Python?
I tried several permutations and combinations, but nothing worked out. All I need is to point the output bucket at my folder's address in Python.
I got an error when I tried the combinations below:
output Bucket = "s3-bucket.folder-name"
output Bucket = "s3-bucket/folder-name/"
output Bucket = "s3-bucket\folder-name\"
None of the above worked; they throw an error like:
Parameter validation failed:
Invalid bucket name "s3-bucket/folder-name/": Bucket name must match the
regex "^[a-zA-Z0-9.\-_]{1,255}$"
Is there an alternate way to put the folder address into the Python script?
Please help!
In AWS, the "folder", i.e. the object's file path, is all part of the object key.
So when accessing a bucket you specify the bucket name, which is strictly the bucket name and nothing else, and the object key is the file path (including any "folders").
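So a call would look something like this (bucket and folder names taken from the question; the file name is a placeholder):
import boto3

s3 = boto3.client('s3')
s3.put_object(Bucket='s3-bucket',             # bucket name only, no slashes
              Key='folder-name/output.json',  # the "folder" lives in the key
              Body=b'example content')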
I am trying to access a bucket subfolder using python's boto3.
The problem is that I cannot find anywhere how to input the subfolder information inside the boto code.
All I can find is how to specify the bucket name, but I do not have access to the whole bucket, just to a specific subfolder. Can anyone shed some light on this?
What I did so far:
BUCKET = "folder/subfolder"
conn = S3Connection(AWS_KEY, AWS_SECRET)
bucket = conn.get_bucket(BUCKET)
for key in bucket.list():
    print key.name.encode('utf-8')
The error messages:
botocore.exceptions.ClientError: An error occurred (AccessDenied) when calling the ListBuckets operation: Access Denied
I do not need to use boto for the operation, I just need to list/get the files inside this subfolder.
P.S.: I can access the files using Cyberduck by putting the path folder/subfolder, which means I do have access to the data.
Sincerely,
Israel
I fixed the problem using something similar to what vtl suggested:
I had to pass a prefix for my bucket and a delimiter. The final code was something like this:
objects = s3.list_objects(Bucket=bucketName, Prefix=bucketPath+'/', Delimiter='/')
As he said, there is no folder structure, so you have to state a delimiter and also put it after the Prefix like I did.
Thanks for the reply.
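For completeness, the matching keys can then be read out of that response like this (a small sketch; 'Contents' is absent from the response when nothing matches the prefix):
for entry in objects.get('Contents', []):
    print(entry['Key'])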
Try:
for obj in bucket.objects.filter(Prefix="your_subfolder"):
    do_something()
AWS doesn't actually have a directory structure - it just fakes one by putting "/"s in names. The Prefix option restricts the search to all objects whose name starts with the given prefix, which should be your "subfolder".