How to read a bucket that isn't my own with Python and boto3?

I know the bucket name, I have access to it, and I can browse it via the web console and the AWS CLI.
How do I access it with Python's boto3? All the examples assume accessing my own buckets:
import boto3
s3 = boto3.resource('s3')
for bucket in s3.buckets.all():
    print(bucket.name)
How do I reach someone else's bucket?

If you have access to someone else's bucket and you know its name, you can access it like this:
import boto3
s3 = boto3.resource('s3')
bucket = s3.Bucket('some-bucket-i-have-access-to')
for obj in bucket.objects.all():
    print(obj.key)
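If you also want to read an object's contents rather than just list keys, here is a minimal sketch; the bucket name is taken from above and the object key is hypothetical:
import boto3

s3 = boto3.resource('s3')
bucket = s3.Bucket('some-bucket-i-have-access-to')

# Hypothetical key; adjust to an object that actually exists in the bucket.
obj = bucket.Object('path/to/some-key.txt')
body = obj.get()['Body'].read()
print(body[:100])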

Related

boto3 s3 initialized session return credentials

I am facing something similar to How to load file from custom hosted Minio s3 bucket into pandas using s3 URL format?; however, I already have an initialized s3 session (from boto3).
How can I get the credentials returned from it to feed directly to pandas?
I.e., how can I extract the AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY from the initialized boto3 s3 client?
You can use session.get_credentials
import boto3
session = boto3.Session()
credentials = session.get_credentials()
AWS_ACCESS_KEY_ID = credentials.access_key
AWS_SECRET_ACCESS_KEY = credentials.secret_key
AWS_SESSION_TOKEN = credentials.token
If you only have access to a boto3 client (like the S3 client), you can find the credentials hidden here:
client = boto3.client("s3")
client._request_signer._credentials.access_key
client._request_signer._credentials.secret_key
client._request_signer._credentials.token
If you don't want to handle credentials at all (I assume you're using SSO here), you can load the S3 object directly with pandas: pd.read_csv(s3_client.get_object(Bucket='Bucket', Key='FileName').get('Body'))
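Putting it together, here is a minimal sketch of feeding the extracted credentials to pandas via storage_options (this assumes pandas with s3fs installed; the bucket and file names are placeholders):
import boto3
import pandas as pd

session = boto3.Session()
creds = session.get_credentials().get_frozen_credentials()

# Placeholder S3 URL; storage_options passes the keys through to s3fs.
df = pd.read_csv(
    "s3://some-bucket/some-file.csv",
    storage_options={
        "key": creds.access_key,
        "secret": creds.secret_key,
        "token": creds.token,
    },
)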

Boto3 client alternative for resource

What would be the client alternative for this?
import boto3
s3 = boto3.resource('s3')
bucket = s3.Bucket('name')
I want to use the client, but it only has a list_buckets function. Is there a way to pass a bucket name to the client instead of picking the details out of the result array?
This will give you the bucket names
import boto3
s3 = boto3.resource('s3')
buckets = s3.buckets.all()
for bucket in buckets:
    print(bucket.name)
I don't think you can get only the names.
But there is a trick to use the client:
import boto3
s3 = boto3.resource('s3')
s3.meta.client.list_buckets()['Buckets']
which gives you the client's response.
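In practice, though, the client does not need a Bucket object at all; you just pass the bucket name to each call. A minimal sketch (the bucket name is a placeholder):
import boto3

client = boto3.client('s3')

# List objects in a specific bucket without going through the resource API.
response = client.list_objects_v2(Bucket='name')
for obj in response.get('Contents', []):
    print(obj['Key'])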

No Credentials Error - Using boto3 and aws s3 bucket

I am using Python and a Jupyter notebook to read files from an AWS S3 bucket, and I am getting the error 'NoCredentialsError: Unable to locate credentials' when running the following code:
import boto3
s3 = boto3.resource('s3')
bucket = s3.Bucket('my-bucket')
for obj in bucket.objects.all():
    key = obj.key
    body = obj.get()['Body'].read()
I believe I need to put my access key somewhere, but I am not sure where. Thank you!
If you have the AWS CLI installed, just run aws configure. Then you will be good to go.
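Alternatively, if you prefer not to use aws configure, you can pass the keys directly when creating the resource. A minimal sketch with placeholder values (environment variables or a shared credentials file are usually the safer choice):
import boto3

# Placeholder credentials; do not hard-code real keys in notebooks you share.
s3 = boto3.resource(
    's3',
    aws_access_key_id='YOUR_ACCESS_KEY_ID',
    aws_secret_access_key='YOUR_SECRET_ACCESS_KEY',
)
bucket = s3.Bucket('my-bucket')
for obj in bucket.objects.all():
    print(obj.key)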

How to check whether a bucket exists in GCS with python

Code:
from google.cloud import storage
client = storage.Client()
bucket = ['symbol_wise_nse', 'symbol_wise_final']
for i in bucket:
    if client.get_bucket(i).exists():
        BUCKET = client.get_bucket(i)
If the bucket exists, I want to do client.get_bucket. How can I check whether the bucket exists or not?
Another option that doesn't use try: except is:
from google.cloud import storage
client = storage.Client()
bucket = ['symbol_wise_nse', 'symbol_wise_final']
for i in bucket:
    BUCKET = client.bucket(i)
    if BUCKET.exists():
        BUCKET = client.get_bucket(i)
There is no method to check whether the bucket exists or not; however, you will get an error if you try to access a non-existent bucket.
I would recommend either listing the buckets in the project with storage_client.list_buckets() and then using the response to confirm whether the bucket exists in your code, or, if you wish to perform client.get_bucket on every bucket in your project, simply iterating through the response directly.
Hope you find this information useful.
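A minimal sketch of that first suggestion, re-using the bucket names from the question and checking them against list_buckets():
from google.cloud import storage

client = storage.Client()
wanted = ['symbol_wise_nse', 'symbol_wise_final']

# Collect the names of the buckets that exist in the project.
existing = {b.name for b in client.list_buckets()}
for name in wanted:
    if name in existing:
        BUCKET = client.get_bucket(name)
        print(BUCKET)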
You can use something like this:
from google.cloud import storage
client = storage.Client()
buckets = ['symbol_wise_nse', 'symbol_wise_final']
for i in buckets:
    try:
        bucket = client.get_bucket(i)
        print(bucket)
    except:
        pass
The following worked for me (re-using params in question):
from google.cloud import storage
from google.cloud.storage import Bucket
client = storage.Client()
exists = Bucket(client, 'symbol_wise_nse').exists()

Uploading File to a specific location in Amazon S3 Bucket using boto3?

I am trying to upload a few files to Amazon S3. I am able to upload a file to my bucket. However, the file needs to end up under my-bucket-name/Folder1/Folder2.
import boto3
session = boto3.Session(aws_access_key_id=AWS_ACCESS_KEY_ID,
                        aws_secret_access_key=AWS_SECRET_ACCESS_KEY)
bucket_name = 'my-bucket-name'
prefix = 'Folder1/Folder2'
s3 = session.resource('s3')
bucket = s3.Bucket(bucket_name)
objs = bucket.objects.filter(Prefix=prefix)
I tried to upload into the bucket using this code and succeeded:
s3.meta.client.upload_file('C:/hello.txt', bucket, 'hello.txt')
When I tried to upload the same file into the specified Folder2 using this code, it failed with an error:
s3.meta.client.upload_file('C:/hello.txt', objs, 'hello.txt')
ERROR>>>
botocore.exceptions.ParamValidationError: Parameter validation failed:
Invalid bucket name "s3.Bucket.objectsCollection(s3.Bucket(name='my-bucket-name'), s3.ObjectSummary)": Bucket name must match the regex "^[a-zA-Z0-9.\-_]{1,255}$"
So, how can I upload the file into my-bucket-name/Folder1/Folder2?
s3.meta.client.upload_file('C:/hello.txt', objs, 'hello.txt')
What's happening here is that the bucket argument to client.upload_file needs to be the name of the bucket as a string.
For a specific folder:
s3.meta.client.upload_file('C:/hello.txt', bucket_name, 'Folder1/Folder2/hello.txt')
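Equivalently, a minimal sketch using the Bucket resource's own upload_file, with the names from the question; note that the "folder" is just part of the object key in S3:
import boto3

s3 = boto3.resource('s3')
bucket = s3.Bucket('my-bucket-name')

# The prefix 'Folder1/Folder2/' becomes part of the key; S3 has no real folders.
bucket.upload_file('C:/hello.txt', 'Folder1/Folder2/hello.txt')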
