I am trying to execute a query on Athena using Python.
Sample code:
import boto3

client = boto3.client(
    'athena',
    region_name=region,
    aws_access_key_id=AWS_ACCESS_KEY_ID,
    aws_secret_access_key=AWS_SECRET_ACCESS_KEY
)

execution = client.start_query_execution(
    QueryString=query,
    QueryExecutionContext={
        'Database': database
    },
    WorkGroup=workgroup,
    ResultConfiguration={
        'OutputLocation': S3_OUTPUT_LOCATION
    }
)
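As an aside (my addition, not part of the original question), start_query_execution only submits the query; the returned QueryExecutionId can be polled with get_query_execution until the query reaches a terminal state. A minimal sketch, with an arbitrary polling interval:

import time

query_execution_id = execution['QueryExecutionId']

# Poll until Athena reports a terminal state for this execution.
while True:
    status = client.get_query_execution(QueryExecutionId=query_execution_id)
    state = status['QueryExecution']['Status']['State']
    if state in ('SUCCEEDED', 'FAILED', 'CANCELLED'):
        break
    time.sleep(2)  # arbitrary polling interval

if state == 'SUCCEEDED':
    results = client.get_query_results(QueryExecutionId=query_execution_id)
else:
    # StateChangeReason carries Athena's error message when the query fails.
    raise RuntimeError(status['QueryExecution']['Status'].get('StateChangeReason', state))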
The sample code above works, but I ran into an unusual scenario.
One day it started throwing an InvalidRequestException:
Error
InvalidRequestException: An error occurred (InvalidRequestException) when calling the StartQueryExecution operation: Unable to verify/create output bucket <BUCKET NAME>
According to DevOps, the application has all the required permissions, so it should work.
We tried executing the same query in the AWS Athena console (Query editor), and there it works.
Then we re-ran the Python script, and it no longer threw any error.
But the next day the Python script started throwing the same InvalidRequestException again.
We executed the same query in the Athena console (Query editor), re-ran the Python script, and it started working again.
We observed this pattern for a few days: roughly every 24 hours the Python script throws the error until we run the query in the Athena console (Query editor) and then re-run the script.
I don't understand why this is happening. Is there a permission issue?
Permissions:
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "VisualEditor0",
            "Effect": "Allow",
            "Action": [
                "s3:GetObject",
                "athena:GetWorkGroup",
                "athena:StartQueryExecution",
                "athena:ListDatabases",
                "athena:StopQueryExecution",
                "athena:GetQueryExecution",
                "athena:GetQueryResults",
                "athena:GetDatabase",
                "athena:GetDataCatalog",
                "athena:ListQueryExecutions",
                "s3:ListBucket"
            ],
            "Resource": [
                "arn:aws:s3:::<BUCKET NAME>",
                "arn:aws:s3:::<BUCKET NAME>/*"
            ]
        },
        {
            "Sid": "VisualEditor1",
            "Effect": "Allow",
            "Action": [
                "s3:PutObject",
                "s3:GetObject",
                "s3:ListBucket",
                "athena:UpdateWorkGroup"
            ],
            "Resource": [
                "arn:aws:s3:::<BUCKET NAME>/*",
                "arn:aws:s3:::<BUCKET NAME>",
                "arn:aws:athena:*:<BUCKET NAME>/<PATH>"
            ]
        },
        {
            "Sid": "VisualEditor2",
            "Effect": "Allow",
            "Action": [
                "athena:ListDataCatalogs",
                "s3:ListAllMyBuckets"
            ],
            "Resource": "*"
        }
    ]
}
I also faced the same error today and found that the execution role also requires the s3:GetBucketLocation permission. AWS doc: https://aws.amazon.com/premiumsupport/knowledge-center/athena-output-bucket-error/
I was experiencing the same issue - random failures. It turned out that the s3:GetBucketLocation permission was configured wrong: it was bundled into the same statement as the other S3 actions, whose resource points to the S3 bucket including a path. It does not work that way - s3:GetBucketLocation has to be granted on the bucket (or all buckets), not on an object path.
I fixed it as below, and it works now.
- Effect: Allow
  Action:
    - s3:GetBucketLocation
  Resource:
    - arn:aws:s3:::*
- Effect: Allow
  Action:
    - s3:PutObject
    - s3:GetObject
  Resource:
    - arn:aws:s3:::<BUCKET NAME>/<PATH>/*
See documentation: https://docs.aws.amazon.com/AmazonS3/latest/userguide/using-with-s3-actions.html
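As a quick sanity check (my addition, not from the original answer), you can verify that the script's credentials can resolve the output bucket's location, since that is roughly what Athena does when it verifies the output bucket; the bucket name below is a placeholder:

import boto3

s3 = boto3.client('s3')

# Fails with AccessDenied if s3:GetBucketLocation is not granted on the bucket itself.
location = s3.get_bucket_location(Bucket='<BUCKET NAME>')
print(location['LocationConstraint'])  # None means us-east-1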
Related
I have the following permission policy on an IAM Role
statement {
  effect = "Allow"
  actions = [
    "sagemaker:CreateModelPackageGroup",
    "sagemaker:ListModelPackageGroups",
  ]
  resources = [
    "arn:aws:sagemaker:my_account_region:my_account_id:model-package-group/*",
  ]
}
to be able to create a ModelPackageGroup using boto3
import boto3
import logging

logger = logging.getLogger(__name__)

sagemaker_client = boto3.client('sagemaker')

mpg_name = 'my_model_package_name'
mpg_description = 'my model package group description'  # placeholder; not defined in the original snippet

matching_mpg = sagemaker_client.list_model_package_groups(NameContains=mpg_name)[
    "ModelPackageGroupSummaryList"
]
if matching_mpg:
    logger.info(f"Using existing Model Package Group: {mpg_name}")
else:
    mpg_input_dict = {
        "ModelPackageGroupName": mpg_name,
        "ModelPackageGroupDescription": mpg_description,
    }
    mpg_response = sagemaker_client.create_model_package_group(**mpg_input_dict)
but I got the following error
botocore.exceptions.ClientError: An error occurred (AccessDeniedException) when calling the ListModelPackageGroups operation: User: arn:aws:sts::my_account_id:assumed-role/my_role_name/botocore-session-some_session_id is not authorized to perform: sagemaker:ListModelPackageGroups because no identity-based policy allows the sagemaker:ListModelPackageGroups action
After doing some research I found out that the SageMaker action ListModelPackageGroups requires an "All resources" permission:
The actions in your policy do not support resource-level permissions and require you to choose All resources
In fact, the Actions defined by Amazon SageMaker documentation mentions that some SageMaker actions support resource-level permissions while others must specify all resources:
The Resource types column indicates whether each action supports resource-level permissions. If there is no value for this column, you must specify all resources ("*") in the Resource element of your policy statement. If the column includes a resource type, then you can specify an ARN of that type in a statement with that action.
So, changing to the following policy worked out
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "",
            "Effect": "Allow",
            "Action": [
                "sagemaker:ListModelPackageGroups"
            ],
            "Resource": "*"
        },
        {
            "Sid": "",
            "Effect": "Allow",
            "Action": [
                "sagemaker:CreateModelPackageGroup"
            ],
            "Resource": "arn:aws:sagemaker:my_account_region:my_account_id:model-package-group/*"
        }
    ]
}
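As a side note (my addition, not part of the original answer), the IAM policy simulator can confirm how each action evaluates against the role's policies before you run the SageMaker code again; the role ARN below is a placeholder and the caller needs permission for iam:SimulatePrincipalPolicy:

import boto3

iam = boto3.client('iam')

# Ask IAM how the role's attached policies evaluate each SageMaker action.
results = iam.simulate_principal_policy(
    PolicySourceArn='arn:aws:iam::my_account_id:role/my_role_name',  # placeholder role ARN
    ActionNames=[
        'sagemaker:ListModelPackageGroups',
        'sagemaker:CreateModelPackageGroup',
    ],
)
for result in results['EvaluationResults']:
    print(result['EvalActionName'], result['EvalDecision'])  # e.g. 'allowed' or 'implicitDeny'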
I am attempting to create an S3 Batch (not AWS Batch, this is S3 Batch) job via boto3, and cannot figure out what permissions I need to enable for successful creation. I keep getting an "Access Denied" when I try to create the job, but it works fine when I apply the S3 Full Access policy to the execution role. Not a good long-term solution, obviously...
I am pretty certain that I need to add a specific permission in IAM, but I can't figure out which one. I can't see a "CreateJob" permission anywhere. Possibly I need to add access to some kind of s3 control bucket where the job is written?
I have tried adding permissions to a couple variations of what could be the S3 control bucket, but I haven't been successful yet.
This works fine when the full S3 permissions policy is applied:
import boto3
s3_control_client = boto3.client('s3control', region_name='us-east-1')
response = s3_control_client.create_job([very long and boring])
This is the output (scrubbed) that I get in the logs when I try to run with what I think are acceptable permissions.
2019-05-23 18:35:37,934 Starting new HTTPS connection (1): [ACCOUNTIDNUMBER].s3-control.us-east-1.amazonaws.com:443
2019-05-23 18:35:38,040 https://[ACCOUNTIDNUMBER].s3-control.us-east-1.amazonaws.com:443 "POST /v20180820/jobs HTTP/1.1" 403 204
2019-05-23 18:35:38,040 Response headers: {'x-amz-id-2': '[SCRUBBED]', 'x-amz-request-id': '[SCRUBBED], [SCRUBBED]', 'Date': 'Thu, 23 May 2019 18:35:38 GMT', 'Content-Type': 'application/xml', 'Content-Length': '204', 'Server': 'AmazonS3'}
2019-05-23 18:35:38,041 Response body:
b'<Error><Code>AccessDenied</Code><Message>Access Denied</Message><RequestId>[SCRUBBED]</RequestId>
Any ideas on what permissions I need to enable here for this to complete?
According to this, you need s3:CreateJob, as well as iam:PassRole on the role that will be attached to the batch job.
So, something likes this:
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Action": [
"iam:PassRole"
],
"Resource": [
"arn:aws:iam::ACCOUNT_ID:role/ROLE_NAME"
]
},
{
"Effect": "Allow",
"Action": [
"s3:CreateJob"
],
"Resource": [
"*"
]
}
]
}
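To make the iam:PassRole relationship concrete, here is a rough sketch (my addition, not from the original answer) of a create_job call; the account ID, ARNs, ETag, and the tagging operation are placeholders, and the RoleArn passed here is the role that the PassRole statement above must cover:

import boto3

s3_control_client = boto3.client('s3control', region_name='us-east-1')

# Hypothetical job that tags every object listed in a CSV manifest.
response = s3_control_client.create_job(
    AccountId='123456789012',                             # placeholder account ID
    RoleArn='arn:aws:iam::123456789012:role/ROLE_NAME',   # must match the iam:PassRole resource
    Priority=10,
    ConfirmationRequired=False,
    Operation={
        'S3PutObjectTagging': {
            'TagSet': [{'Key': 'processed', 'Value': 'true'}]
        }
    },
    Manifest={
        'Spec': {
            'Format': 'S3BatchOperations_CSV_20180820',
            'Fields': ['Bucket', 'Key']
        },
        'Location': {
            'ObjectArn': 'arn:aws:s3:::manifest-bucket/manifest.csv',  # placeholder manifest
            'ETag': 'example-etag'                                     # placeholder ETag
        }
    },
    Report={
        'Bucket': 'arn:aws:s3:::report-bucket',            # placeholder report bucket
        'Format': 'Report_CSV_20180820',
        'Enabled': True,
        'Prefix': 'batch-job-reports',
        'ReportScope': 'AllTasks'
    }
)
print(response['JobId'])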
There are no special S3 Batch job permissions as such. The exact permissions you need may vary depending on your use case, but in general you will need the following.
Permissions for your destination bucket
s3:PutObject
s3:PutObjectAcl
s3:PutObjectTagging
Permissions for your source bucket
s3:GetObject
Permissions for your manifest bucket
s3:GetObject
s3:GetObjectVersion
s3:GetBucketLocation
Permissions for your report bucket
s3:PutObject
s3:GetBucketLocation
Here is a template that you can use
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Action": [
                "s3:PutObject",
                "s3:PutObjectAcl",
                "s3:PutObjectTagging"
            ],
            "Effect": "Allow",
            "Resource": "arn:aws:s3:::{{DestinationBucket}}/*"
        },
        {
            "Action": [
                "s3:GetObject"
            ],
            "Effect": "Allow",
            "Resource": "arn:aws:s3:::{{SourceBucket}}/*"
        },
        {
            "Effect": "Allow",
            "Action": [
                "s3:GetObject",
                "s3:GetObjectVersion",
                "s3:GetBucketLocation"
            ],
            "Resource": [
                "arn:aws:s3:::{{ManifestBucket}}/*"
            ]
        },
        {
            "Effect": "Allow",
            "Action": [
                "s3:PutObject",
                "s3:GetBucketLocation"
            ],
            "Resource": [
                "arn:aws:s3:::{{ReportBucket}}/*"
            ]
        }
    ]
}
You can check this link for more information.
First time boto3 user.
I had a user with the S3FullAccess managed policy and used the following code to try to upload a file; it uses a pandas DataFrame as the source.
import boto3
from io import StringIO

s3_client = boto3.client('s3')

io = StringIO()
df.to_csv(io)

response = s3_client.put_object(
    Bucket=self.bucket,
    Body=io.getvalue(),  # pass the CSV text rather than the StringIO object
    Key=self.filename
)
This led to this response:
botocore.exceptions.ClientError: An error occurred (AccessDenied) when calling the PutObject operation: Access Denied
So I checked that the secret key and access key were being picked up by boto3 from my ~/.aws/credentials file, and they are, on line 604 of client.py in boto3 - request_signer=self._request_signer
So I researched here on SO, and it seemed a lot of people had to add a Policy document, so I did that as follows:
{
    "Version": "2012-10-17",
    "Id": "Policyxxx",
    "Statement": [
        {
            "Sid": "Stmtx1",
            "Effect": "Allow",
            "Principal": {
                "AWS": "arn:aws:iam::<12 digit id>:root"
            },
            "Action": [
                "s3:GetObject",
                "s3:PutObject"
            ],
            "Resource": "arn:aws:s3:::<my-bucket>/*"
        },
        {
            "Sid": "Stmtx6",
            "Effect": "Allow",
            "Principal": {
                "AWS": "arn:aws:iam::<12 digit id>:root"
            },
            "Action": "s3:ListBucket",
            "Resource": "arn:aws:s3:::<my-bucket>"
        }
    ]
}
I still got the same error, so I added the following to my put_object call. The S3 bucket uses AES-256 encryption, which I thought was server-side only, but I was running out of ideas, so it seemed worth a try:
SSECustomerKey=os.urandom(32),
SSECustomerAlgorithm='AES256',
Next I removed those SSE key arguments, realising that the AES-256 encryption is server-side and should not affect my access.
Then I tried generating a new pair of access keys and using those instead; same result.
Does anyone have any idea what I might look at next? What have I missed in the hundreds of pages of AWS documentation?
This was simply a case of the user having been added to a couple of groups when it was created: Administrators and EC2MFA. I was unaware of the implications of this, but I assume the EC2MFA group prevented API or CLI access without an MFA session. I am assuming the combination of the policy on the user and the S3 side is sufficiently secure.
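For anyone in a similar situation, here is a rough sketch (my assumption, not part of the original answer) of obtaining temporary credentials with an MFA code when a group such as EC2MFA enforces MFA for API access; the MFA device ARN and token code are placeholders:

import boto3

sts = boto3.client('sts')

# Exchange a one-time MFA code for temporary credentials.
token = sts.get_session_token(
    SerialNumber='arn:aws:iam::<12 digit id>:mfa/my-user',  # placeholder MFA device ARN
    TokenCode='123456',                                     # current code from the MFA device
    DurationSeconds=3600,
)
creds = token['Credentials']

# Use the temporary credentials for subsequent S3 calls.
s3_client = boto3.client(
    's3',
    aws_access_key_id=creds['AccessKeyId'],
    aws_secret_access_key=creds['SecretAccessKey'],
    aws_session_token=creds['SessionToken'],
)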
I'm trying to download some files I already uploaded to S3 with some Python code, but I'm getting headaches trying to use a tight policy.
I can list all the files in the bucket, but when I try to download them with what I see as a correct policy, I get botocore.exceptions.ClientError: An error occurred (403) when calling the HeadObject operation: Forbidden
Then, while trying a different policy that had worked for two other buckets, I used only the beginning of the bucket's name followed by an asterisk, and for some reason the exact same thing worked.
So can someone tell me why this happens?
This, for example, works like a charm:
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "Stmt1499955913000",
            "Effect": "Allow",
            "Action": [
                "s3:GetObject",
                "s3:ListBucket"
            ],
            "Resource": "arn:aws:s3:::THE-BEGINING-OF-THE-NAME*"
        }
    ]
}
But this doesn't:
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "Stmt1499955913000",
            "Effect": "Allow",
            "Action": [
                "s3:GetObject",
                "s3:ListBucket"
            ],
            "Resource": "arn:aws:s3:::THE-EXACT-COMPLETE-FULL-NAME"
        }
    ]
}
I can add the Python code for the download if it's relevant, but this question seems long enough already, and the code is pretty straightforward.
Seems I just needed some rubber duck debugging; the answer was, I think, counter-intuitive but easy:
It seems the ARN is not only an identifier for the AWS resource itself, but also for its contents. So when granting permissions, you need to grant them on "the bucket" to list it, and on "the contents" to download from it.
Which leads to a policy like this:
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "Stmt1499955913000",
            "Effect": "Allow",
            "Action": [
                "s3:GetObject",
                "s3:ListBucket"
            ],
            "Resource": [
                "arn:aws:s3:::THE-EXACT-COMPLETE-FULL-NAME",
                "arn:aws:s3:::THE-EXACT-COMPLETE-FULL-NAME/*"
            ]
        }
    ]
}
Which, as I said, grants access to the bucket itself (the ARN with no asterisk) as well as to whatever comes after the slash.
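For context (my addition, not the original poster's code), a download along these lines is what triggers the HeadObject call mentioned in the question; the key and local path are placeholders:

import boto3

s3 = boto3.client('s3')

# download_file issues a HeadObject first, so it needs s3:GetObject on the object ARN;
# listing the bucket separately needs s3:ListBucket on the bucket ARN.
s3.download_file(
    'THE-EXACT-COMPLETE-FULL-NAME',   # bucket
    'some/prefix/file.csv',           # key (placeholder)
    '/tmp/file.csv'                   # local destination (placeholder)
)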
I have found a lot of questions regarding this on Stack Overflow, but none solved my problem. After a lot of googling I am still facing an AccessDenied exception:
<Error>
    <Code>AccessDenied</Code>
    <Message>Access Denied</Message>
    <RequestId>ADF9C0DE6C86DF4F</RequestId>
    <HostId>JwQLkNB0LuJvh0jwrsJe9wazxLsd+hrZ2qwvjCvmXYd2A/ckCrsotRMHm</HostId>
</Error>
Here are my policy docs for user and group:
User Policy:
{
    "Statement": [
        {
            "Sid": "AllowListBucketIfSpecificPrefixIsIncludedInRequest",
            "Action": "s3:*",
            "Effect": "Allow",
            "Resource": ["arn:aws:s3::: mybucket", "arn:aws:s3:::mybucket/*"],
            "Condition": {
                "StringLike": {"s3:prefix": ["Development/*"]}
            }
        },
        {
            "Sid": "AllowUserToReadWriteObjectDataInDevelopmentFolder",
            "Action": "s3:*",
            "Effect": "Allow",
            "Resource": ["arn:aws:s3::: mybucket/Development/*"]
        },
        {
            "Sid": "ExplicitlyDenyAnyRequestsForAllOtherFoldersExceptDevelopment",
            "Action": ["s3:ListBucket"],
            "Effect": "Deny",
            "Resource": ["arn:aws:s3::: mybucket", "arn:aws:s3::: mybucket/*"],
            "Condition": {
                "StringNotLike": {"s3:prefix": ["Development/*"]},
                "Null": {"s3:prefix": false}
            }
        }
    ]
}
Group Policy:
{
    "Statement": [
        {
            "Sid": "AllowGroupToSeeBucketListAndAlsoAllowGetBucketLocationRequiredForListBucket",
            "Action": ["s3:ListAllMyBuckets", "s3:GetBucketLocation"],
            "Effect": "Allow",
            "Resource": ["arn:aws:s3:::*"]
        },
        {
            "Sid": "AllowRootLevelListingOfCompanyBucket",
            "Action": ["s3:ListBucket"],
            "Effect": "Allow",
            "Resource": ["arn:aws:s3::: mybucket", "arn:aws:s3::: mybucket/*"],
            "Condition": {
                "StringEquals": {"s3:prefix": [""]}
            }
        },
        {
            "Sid": "RequireFolderStyleList",
            "Action": ["s3:ListBucket"],
            "Effect": "Deny",
            "Resource": ["arn:aws:s3:::*"],
            "Condition": {
                "StringNotEquals": {"s3:delimiter": "/"}
            }
        },
        {
            "Sid": "ExplictDenyAccessToPrivateFolderToEveryoneInTheGroup",
            "Action": ["s3:*"],
            "Effect": "Deny",
            "Resource": ["arn:aws:s3:::mybucket/Private/*"]
        },
        {
            "Sid": "DenyListBucketOnPrivateFolder",
            "Action": ["s3:ListBucket"],
            "Effect": "Deny",
            "Resource": ["arn:aws:s3:::*"],
            "Condition": {
                "StringLike": {"s3:prefix": ["Private/"]}
            }
        }
    ]
}
I created a user with the username testuser, then got the access_key and secret_access_key for this IAM user.
Now I am able to access mybucket and its subfolders using the AWS web console and Cyberduck.
But whenever I try to access it using boto, I get an AccessDenied exception (error 403).
Boto Code:
from boto.s3.connection import S3Connection

conn = S3Connection('_______________________', '_____________________')

# Without validate
bucket = conn.get_bucket('mybucket', validate=False)  # here got bucket object
bucket.get_key('one/two/three.png')  # AccessDenied

# With validate
bucket = conn.get_bucket('mybucket')  # AccessDenied
I faced the same problem when I was trying to use boto-rsync.
Any suggestions?
Error 403 means Access Denied, so there is an authorization problem. To analyze the API call and the response, you can use the following line:
boto.set_stream_logger('boto')
Some points that I have noticed:
the group and user policies are okay once the leading space in front of "mybucket" is removed
the first directory name is "Development" instead of "one"
"Without Validate" means the file is accessed directly
The following code works fine:
import boto
conn = boto.connect_s3("id","secret")
bucket = conn.get_bucket('mybucket', validate=False)
bucket.get_key('Development/two/three.png')
# <Key: mybucket,Development/two/three.png>
But I am new to IAM, and it seems "With Validate" first tries to read "/mybucket/", which is denied via the user policy statement ExplicitlyDenyAnyRequestsForAllOtherFoldersExceptDevelopment.
Edit, in response to the comment "to access all keys inside Development", try this:
keys = bucket.list("Development/", delimiter="/")
for key in keys:
    print key.name
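For readers on boto3 (my addition, not part of the original answer), the equivalent folder-style listing under the Development/ prefix looks roughly like this:

import boto3

s3 = boto3.client('s3')

# List only the keys under the Development/ prefix, folder-style.
response = s3.list_objects_v2(
    Bucket='mybucket',
    Prefix='Development/',
    Delimiter='/'
)
for obj in response.get('Contents', []):
    print(obj['Key'])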