Is there a way to get all the resources in an AWS account through Python code using boto3? I went through the documentation but didn't find any list function that solves this.
Try this code. As a prerequisite, enable AWS Config in the region before running it.
import boto3

session = boto3.Session(profile_name='your-profilename')
client = session.client('config')
resources = ["AWS::EC2::CustomerGateway", "AWS::EC2::EIP", "AWS::EC2::Host", "AWS::EC2::Instance", "AWS::EC2::InternetGateway", "AWS::EC2::NetworkAcl", "AWS::EC2::NetworkInterface", "AWS::EC2::RouteTable", "AWS::EC2::SecurityGroup", "AWS::EC2::Subnet", "AWS::CloudTrail::Trail", "AWS::EC2::Volume", "AWS::EC2::VPC", "AWS::EC2::VPNConnection", "AWS::EC2::VPNGateway", "AWS::EC2::RegisteredHAInstance", "AWS::EC2::NatGateway", "AWS::EC2::EgressOnlyInternetGateway", "AWS::EC2::VPCEndpoint", "AWS::EC2::VPCEndpointService", "AWS::EC2::FlowLog", "AWS::EC2::VPCPeeringConnection", "AWS::IAM::Group", "AWS::IAM::Policy", "AWS::IAM::Role", "AWS::IAM::User", "AWS::ElasticLoadBalancingV2::LoadBalancer", "AWS::ACM::Certificate", "AWS::RDS::DBInstance", "AWS::RDS::DBParameterGroup", "AWS::RDS::DBOptionGroup", "AWS::RDS::DBSubnetGroup", "AWS::RDS::DBSecurityGroup", "AWS::RDS::DBSnapshot", "AWS::RDS::DBCluster", "AWS::RDS::DBClusterParameterGroup", "AWS::RDS::DBClusterSnapshot", "AWS::RDS::EventSubscription", "AWS::S3::Bucket", "AWS::S3::AccountPublicAccessBlock", "AWS::Redshift::Cluster", "AWS::Redshift::ClusterSnapshot", "AWS::Redshift::ClusterParameterGroup", "AWS::Redshift::ClusterSecurityGroup", "AWS::Redshift::ClusterSubnetGroup", "AWS::Redshift::EventSubscription", "AWS::SSM::ManagedInstanceInventory", "AWS::CloudWatch::Alarm", "AWS::CloudFormation::Stack", "AWS::ElasticLoadBalancing::LoadBalancer", "AWS::AutoScaling::AutoScalingGroup", "AWS::AutoScaling::LaunchConfiguration", "AWS::AutoScaling::ScalingPolicy", "AWS::AutoScaling::ScheduledAction", "AWS::DynamoDB::Table", "AWS::CodeBuild::Project", "AWS::WAF::RateBasedRule", "AWS::WAF::Rule", "AWS::WAF::RuleGroup", "AWS::WAF::WebACL", "AWS::WAFRegional::RateBasedRule", "AWS::WAFRegional::Rule", "AWS::WAFRegional::RuleGroup", "AWS::WAFRegional::WebACL", "AWS::CloudFront::Distribution", "AWS::CloudFront::StreamingDistribution", "AWS::Lambda::Alias", "AWS::Lambda::Function", "AWS::ElasticBeanstalk::Application", "AWS::ElasticBeanstalk::ApplicationVersion", "AWS::ElasticBeanstalk::Environment", "AWS::MobileHub::Project", "AWS::XRay::EncryptionConfig", "AWS::SSM::AssociationCompliance", "AWS::SSM::PatchCompliance", "AWS::Shield::Protection", "AWS::ShieldRegional::Protection", "AWS::Config::ResourceCompliance", "AWS::LicenseManager::LicenseConfiguration", "AWS::ApiGateway::DomainName", "AWS::ApiGateway::Method", "AWS::ApiGateway::Stage", "AWS::ApiGateway::RestApi", "AWS::ApiGatewayV2::DomainName", "AWS::ApiGatewayV2::Stage", "AWS::ApiGatewayV2::Api", "AWS::CodePipeline::Pipeline", "AWS::ServiceCatalog::CloudFormationProvisionedProduct", "AWS::ServiceCatalog::CloudFormationProduct", "AWS::ServiceCatalog::Portfolio"]
for resource in resources:
    response = client.list_discovered_resources(resourceType=resource)
    print('##################### {} #################'.format(resource))
    for identifier in response['resourceIdentifiers']:
        print('{} , {}'.format(identifier['resourceType'], identifier['resourceId']))
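Note that list_discovered_resources returns at most one page of results per call, so resource types with many instances will be truncated. A minimal sketch that follows the nextToken manually, assuming the same profile setup as above:
import boto3

session = boto3.Session(profile_name='your-profilename')
client = session.client('config')

# Follow nextToken so resource types with many instances are fully listed.
kwargs = {'resourceType': 'AWS::EC2::Instance'}
while True:
    page = client.list_discovered_resources(**kwargs)
    for identifier in page['resourceIdentifiers']:
        print('{} , {}'.format(identifier['resourceType'], identifier['resourceId']))
    if 'nextToken' in page:
        kwargs['nextToken'] = page['nextToken']
    else:
        break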
In boto3 you can use the ResourceGroupsTaggingAPI method get_resources(), which is mainly used to get resources based on tags, but you can leave the tag filter parameter blank and get all the supported resources.
Consider that not all resource types are included and that the call is limited to a specific region, but I hope it can help you.
Examples:
Get all resources:
import boto3
client = boto3.client('resourcegroupstaggingapi')
client.get_resources()
Get resources of a specific service type:
import boto3
client = boto3.client('resourcegroupstaggingapi')
client.get_resources(
    ResourceTypeFilters=[
        'ec2:instance'
    ])
Official documentation:
https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/resourcegroupstaggingapi.html#ResourceGroupsTaggingAPI.Client.get_resources
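Note that get_resources() is paginated via PaginationToken; a minimal sketch using the built-in paginator to collect every ARN in the region:
import boto3

client = boto3.client('resourcegroupstaggingapi')

# The GetResources paginator follows the PaginationToken automatically.
paginator = client.get_paginator('get_resources')
arns = []
for page in paginator.paginate():
    arns.extend(m['ResourceARN'] for m in page['ResourceTagMappingList'])
print(len(arns))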
Following on from the answer Omar gave, I came up with the following:
from pprint import pprint

import boto3
from botocore.exceptions import ClientError

regions = boto3.session.Session().get_available_regions('ec2')

for region in regions:
    print(region)
    try:
        client = boto3.client('resourcegroupstaggingapi', region_name=region)
        pprint([x.get('ResourceARN') for x in client.get_resources().get('ResourceTagMappingList')])
    except ClientError as e:
        print(f'Could not connect to region with error: {e}')
    print()
which will loop over most regions and list any ARNs in each region, like so:
eu-north-1
[]
eu-west-1
['arn:aws:mq:eu-west-1:xxxxxxxxxxxx:broker:example:b-125099aa-8e22-462e-a8e9-bcc6b29c010a']
eu-west-2
[]
eu-west-3
[]
I am trying to get all resources and providers from an Azure subscription using the Python SDK.
Here is my code, which does the following:
1. get all resources by "resource group"
2. extract the id of each resource within the "resource group"
3. then call details about a particular resource by its id
The problem is that each call in step 3 requires a correct "API version", and it differs from object to object, so my code keeps failing whenever I try some common API version that fits everything.
Is there a way to retrieve the suitable API version per resource in a resource group (similarly to retrieving id, name, ...)?
# Import specific methods and models from other libraries
from azure.mgmt.resource import SubscriptionClient
from azure.identity import AzureCliCredential
from azure.mgmt.resource import ResourceManagementClient

credential = AzureCliCredential()
client = ResourceManagementClient(credential, "<subscription_id>")

rg = [i for i in client.resource_groups.list()]

# Retrieve the list of resources in each resource group.
# The expand argument includes additional properties in the output.
rg_resources = {}
for i in range(0, len(rg)):
    rg_resources[rg[i].as_dict()["name"]] = client.resources.list_by_resource_group(
        rg[i].as_dict()["name"],
        expand="properties,created_time,changed_time")

data = {}
for i in rg_resources.keys():
    details = []
    for _data in iter(rg_resources[i]):
        details.append(client.resources.get_by_id(vars(_data)['id'], 'latest'))
    data[i] = details
print(data)
error:
azure.core.exceptions.HttpResponseError: (NoRegisteredProviderFound) No registered resource provider found for location 'westeurope' and API version 'latest' for type 'workspaces'. The supported api-versions are '2015-03-20, 2015-11-01-preview, 2017-01-01-preview, 2017-03-03-preview, 2017-03-15-preview, 2017-04-26-preview, 2020-03-01-preview, 2020-08-01, 2020-10-01, 2021-06-01, 2021-03-01-privatepreview'. The supported locations are 'eastus, westeurope, southeastasia, australiasoutheast, westcentralus, japaneast, uksouth, centralindia, canadacentral, westus2, australiacentral, australiaeast, francecentral, koreacentral, northeurope, centralus, eastasia, eastus2, southcentralus, northcentralus, westus, ukwest, southafricanorth, brazilsouth, switzerlandnorth, switzerlandwest, germanywestcentral, australiacentral2, uaecentral, uaenorth, japanwest, brazilsoutheast, norwayeast, norwaywest, francesouth, southindia, jioindiawest, canadaeast, westus3
What information exactly do you want to retrieve from the resources?
In most cases, I would recommend using the Graph API to query over all resources. This is very powerful, as you can query the whole platform using a simple query language - the Kusto Query Language (KQL).
You can try the queries directly in the Azure Resource Graph Explorer in the Portal.
A query that summarizes all types of resources would be:
resources
| project resourceGroup, type
| summarize count() by type, resourceGroup
| order by count_
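For reference, once the getresources helper defined below is in place, the same summary query can be run from Python, e.g.:
getresources("resources | project resourceGroup, type | summarize count() by type, resourceGroup | order by count_")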
A simple Python code block can be seen in the linked documentation above.
The sample below uses DefaultAzureCredential for authentication and lists, in detail, the first resource that sits in a resource group whose name starts with "rg".
# Import Azure Resource Graph library
import azure.mgmt.resourcegraph as arg

# Import specific methods and models from other libraries
from azure.mgmt.resource import SubscriptionClient
from azure.identity import DefaultAzureCredential

# Wrap all the work in a function
def getresources(strQuery):
    # Get your credentials from the environment (CLI, MSI, ...)
    credential = DefaultAzureCredential()
    subsClient = SubscriptionClient(credential)
    subsRaw = []
    for sub in subsClient.subscriptions.list():
        subsRaw.append(sub.as_dict())
    subsList = []
    for sub in subsRaw:
        subsList.append(sub.get('subscription_id'))

    # Create Azure Resource Graph client and set options
    argClient = arg.ResourceGraphClient(credential)
    argQueryOptions = arg.models.QueryRequestOptions(result_format="objectArray")

    # Create query
    argQuery = arg.models.QueryRequest(subscriptions=subsList, query=strQuery, options=argQueryOptions)

    # Run query
    argResults = argClient.resources(argQuery)

    # Show Python object
    print(argResults)

getresources("Resources | where resourceGroup startswith 'rg' | limit 1")
Following this tutorial: https://www.usgs.gov/media/files/landsat-cloud-direct-access-requester-pays-tutorial
import boto3
import rasterio as rio
from matplotlib.pyplot import imshow
from rasterio.session import AWSSession

s3 = boto3.client('s3', aws_access_key_id=AWS_KEY_ID,
                  aws_secret_access_key=AWS_SECRET)
resources = boto3.resource('s3', aws_access_key_id=AWS_KEY_ID,
                           aws_secret_access_key=AWS_SECRET)

aws_session = AWSSession(boto3.Session())

cog = 's3://usgs-landsat/collection02/level-2/standard/oli-tirs/2020/026/027/LC08_L2SP_026027_20200827_20200906_02_T1/LC08_L2SP_026027_20200827_20200906_02_T1_SR_B2.TIF'

with rio.Env(aws_session):
    with rio.open(cog) as src:
        profile = src.profile
        arr = src.read(1)
        imshow(arr)
I get the below error:
rasterio.errors.RasterioIOError: '/vsis3/usgs-landsat/collection02/level-2/standard/oli-tirs/2020/026/027/LC08_L2SP_026027_20200827_20200906_02_T1/LC08_L2SP_026027_20200827_20200906_02_T1_SR_B2.TIF' does not exist in the file system, and is not recognized as a supported dataset name.
In AWS CloudShell if I run:
```
aws s3 ls s3://usgs-landsat/collection02/level-2/standard/oli-tirs/2020/026/027/LC08_L2SP_026027_20200827_20200906_02_T1/
```
I get:
An error occurred (AccessDenied) when calling the ListObjectsV2 operation: Access Denied
I ran the CloudShell commands on an EC2 instance - same errors.
I needed to specify that I am the requester - it's right there in the documentation. This works:
aws s3 ls s3://usgs-landsat/collection02/level-2/standard/oli-tirs/2020/026/027/LC08_L2SP_026027_20200827_20200906_02_T1/ --request-payer requester
Using boto3 still doesn't work.
I have admin permissions on the user I was running boto3 with. I got the same error in CloudShell as both the boto user and root. I have used the access key and secret key before, and they work fine for downloading from the "landsat-pds" bucket (which only has L8 images) and the "sentinel-s2-l1c" bucket. It only seems to be an issue with the "usgs-landsat" bucket (https://registry.opendata.aws/usgs-landsat/).
Also tried accessing the usgs-landsat bucket with s3.list_objects:
landsat = resources.Bucket("usgs-landsat")
all_objects = s3.list_objects(Bucket='usgs-landsat')
I get a similar error:
botocore.exceptions.ClientError: An error occurred (AccessDenied) when calling the ListObjects operation: Access Denied
After looking at other solutions, I saw that some users set:
os.environ["AWS_REQUEST_PAYER"] = "requester"
os.environ["CURL_CA_BUNDLE"] = "/etc/ssl/certs/ca-certificates.crt"
to fix their issue, but it hasn't worked for me.
As you have correctly pointed out, the usgs-landsat S3 bucket is requester pays, so you need to configure rasterio correctly in order to handle that.
As you can see here, rasterio.session.AWSSession has a requester_pays argument that you can set to True in order to do this.
I can also point out that the lines:
s3 = boto3.client('s3', aws_access_key_id=AWS_KEY_ID,
aws_secret_access_key=AWS_SECRET)
resources = boto3.resource('s3', aws_access_key_id=AWS_KEY_ID,
aws_secret_access_key=AWS_SECRET)
in your code snippet are not needed since you do not reuse the s3 and resources variables later on.
In fact, if your credentials are correctly located in your ~/.aws/ folder - which can be done by running the command-line utility aws configure provided by the awscli python package (see documentation) - you do not need to import boto3 at all; rasterio does it for you.
Your code snippet can therefore be modified to:
import rasterio as rio
from matplotlib.pyplot import imshow
from rasterio.session import AWSSession

aws_session = AWSSession(requester_pays=True)

cog = 's3://usgs-landsat/collection02/level-2/standard/oli-tirs/2020/026/027/LC08_L2SP_026027_20200827_20200906_02_T1/LC08_L2SP_026027_20200827_20200906_02_T1_SR_B2.TIF'

with rio.Env(aws_session):
    with rio.open(cog) as src:
        profile = src.profile
        arr = src.read(1)
        imshow(arr)
which runs correctly on my machine.
This worked for me:
import boto3

s3sr = boto3.resource('s3')
bucket = 'usgs-landsat'
prefix = 'collection02/'

keys_list = []
paginator = s3sr.meta.client.get_paginator('list_objects_v2')
for page in paginator.paginate(Bucket=bucket, Prefix=prefix, Delimiter='/', RequestPayer='requester'):
    keys = [content['Key'] for content in page.get('Contents')]
    keys_list.extend(keys)

len(keys_list)
# keys_list
['collection02/catalog.json',
'collection02/landsat-c2l1.json',
'collection02/landsat-c2l2-sr.json',
'collection02/landsat-c2l2-st.json',
'collection02/landsat-c2l2alb-bt.json',
'collection02/landsat-c2l2alb-sr.json',
'collection02/landsat-c2l2alb-st.json',
'collection02/landsat-c2l2alb-ta.json']
# getting the catalog.json
response = boto3.client('s3').get_object(Bucket=bucket, Key='collection02/catalog.json', RequestPayer='requester')
jsondata = response['Body'].read().decode()
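If you need to download an object rather than read its body in memory, the same requester-pays flag can be passed through ExtraArgs; a small sketch (the local filename is just illustrative):
import boto3

s3 = boto3.client('s3')
# RequestPayer is one of the allowed transfer arguments for requester-pays buckets.
s3.download_file('usgs-landsat', 'collection02/catalog.json', 'catalog.json',
                 ExtraArgs={'RequestPayer': 'requester'})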
How do I get a list of AWS EMR cluster IDs matching a specific name with boto3?
I have this code here:
import sys
import time

import boto3

client = boto3.client("emr")
cluster_name = 'Adhoc-CSDP-EMR'

response = client.list_clusters(
    ClusterStates=[
        'RUNNING', 'WAITING'
    ]
)

for cluster in response['Clusters']:
    print(cluster['Name'])
    print(cluster['Id'])
That will print all clusters in the running or waiting state. How do I filter the results that match cluster_name?
Umm, why can't we do something like this?
matching_cluster_ids = list()
for cluster in response['Clusters']:
    if cluster_name == cluster['Name']:
        matching_cluster_ids.append(cluster['Id'])
Later you can execute a describe_cluster() (or any other operation) on any of the matching cluster_ids if you want.
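One caveat: list_clusters returns paginated results, so with many clusters a match could sit on a later page. A minimal sketch using the built-in paginator:
import boto3

client = boto3.client('emr')
cluster_name = 'Adhoc-CSDP-EMR'

matching_cluster_ids = []
# The ListClusters paginator follows the Marker token automatically.
paginator = client.get_paginator('list_clusters')
for page in paginator.paginate(ClusterStates=['RUNNING', 'WAITING']):
    for cluster in page['Clusters']:
        if cluster['Name'] == cluster_name:
            matching_cluster_ids.append(cluster['Id'])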
My Bash script using kubectl create/apply -f ... to deploy lots of Kubernetes resources has grown too large for Bash. I'm converting it to Python using the PyPI kubernetes package.
Is there a generic way to create resources given the YAML manifest? Otherwise, the only way I can see to do it would be to create and maintain a mapping from Kind to the API method create_namespaced_<kind>. That seems tedious and error-prone to me.
Update: I'm deploying many (10-20) resources to many (10+) GKE clusters.
Update in the year 2020, for anyone still interested in this (since the docs for the Python library are mostly empty).
At the end of 2018 this pull request was merged, so it's now possible to do:
from kubernetes import client, config
from kubernetes import utils
config.load_kube_config()
api = client.ApiClient()
file_path = ... # A path to a deployment file
namespace = 'default'
utils.create_from_yaml(api, file_path, namespace=namespace)
EDIT: following a request in a comment, here is a snippet for skipping the Python error if the deployment already exists:
from kubernetes import client, config
from kubernetes import utils

config.load_kube_config()
api = client.ApiClient()

def skip_if_already_exists(e):
    import json
    # found in https://github.com/kubernetes-client/python/blob/master/kubernetes/utils/create_from_yaml.py#L165
    info = json.loads(e.api_exceptions[0].body)
    if info.get('reason').lower() == 'alreadyexists':
        pass
    else:
        raise e

file_path = ... # A path to a deployment file
namespace = 'default'

try:
    utils.create_from_yaml(api, file_path, namespace=namespace)
except utils.FailToCreateError as e:
    skip_if_already_exists(e)
I have written the following piece of code to achieve the functionality of creating k8s resources from a json/yaml file:
import re

import yaml
from kubernetes import client


def create_from_yaml(yaml_file):
    """
    :param yaml_file: path to a json/yaml manifest containing a single object
    :return: the API client used to create the object
    """
    with open(yaml_file) as f:
        yaml_object = yaml.safe_load(f)
    group, _, version = yaml_object["apiVersion"].partition("/")
    if version == "":
        version = group
        group = "core"
    group = "".join(group.rsplit(".k8s.io", 1))
    func_to_call = "{0}{1}Api".format(group.capitalize(), version.capitalize())
    k8s_api = getattr(client, func_to_call)()
    kind = yaml_object["kind"]
    kind = re.sub('(.)([A-Z][a-z]+)', r'\1_\2', kind)
    kind = re.sub('([a-z0-9])([A-Z])', r'\1_\2', kind).lower()
    if "namespace" in yaml_object["metadata"]:
        namespace = yaml_object["metadata"]["namespace"]
    else:
        namespace = "default"
    try:
        if hasattr(k8s_api, "create_namespaced_{0}".format(kind)):
            resp = getattr(k8s_api, "create_namespaced_{0}".format(kind))(
                body=yaml_object, namespace=namespace)
        else:
            resp = getattr(k8s_api, "create_{0}".format(kind))(
                body=yaml_object)
    except Exception as e:
        raise e
    print("{0} created. status='{1}'".format(kind, str(resp.status)))
    return k8s_api
In the above function, if you provide any object's yaml/json file, it will automatically pick up the API type and object type and create the object, like a statefulset, deployment, service, etc.
PS: The above code doesn't handle multiple Kubernetes resources in one file, so you should have only one object per yaml file (see the sketch below for a workaround).
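If you do have multiple objects in one file, a workaround is to split the manifest into documents with yaml.safe_load_all and create them one at a time. A sketch, assuming a hypothetical create_from_dict variant of the function above that accepts an already-parsed object instead of a file path:
import yaml

def create_all_from_yaml(yaml_file):
    with open(yaml_file) as f:
        for yaml_object in yaml.safe_load_all(f):
            if yaml_object:  # skip empty documents between '---' separators
                create_from_dict(yaml_object)  # hypothetical dict-accepting variant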
I see what you are looking for. This is possible with other k8s clients available in other languages; here is an example in Java. Unfortunately, the Python client library does not support that functionality yet. I opened a new feature request for the same, and you can either choose to track it or contribute yourself :). Here is the link for the issue on GitHub.
The other way to still do what you are trying to do is to use the Java/Golang client and put your code in a Docker container.
I'm trying to create a Redis ElastiCache cluster using boto in the sa-east-1 region, and boto is giving me this error message:
{"Error":{"Code":"InvalidParameterValue","Message":"sa-east-1 is not a valid availability zone.","Type":"Sender"},"RequestId":"2q34hj192-6902-11e4-8b4a-afafaefasefsadfsadf"}
with this code:
from boto.elasticache.layer1 import ElastiCacheConnection

self.elasticache = ElastiCacheConnection()
boto.elasticache.connect_to_region(
    'sa-east-1a',
    aws_access_key_id=settings.AWS_ACCESS_KEY,
    aws_secret_access_key=settings.AWS_SECRET_KEY
)
elasticache.create_cache_cluster(
    cache_cluster_id='test1',
    engine='redis',
    cache_node_type='cache.m3.medium',
    num_cache_nodes=1,
    preferred_availability_zone='sa-east-1',
)
Thanks
It's asking you for an availability zone but you are providing it with a region. Correct values would be one of sa-east-1a or sa-east-1b or just leave it blank if you have no preference.
After searching the boto code, I found that
import boto.elasticache

elasticache = boto.elasticache.connect_to_region(
    'sa-east-1',
    aws_access_key_id=settings.AWS_ACCESS_KEY,
    aws_secret_access_key=settings.AWS_SECRET_KEY
)
elasticache.create_cache_cluster(
    cache_cluster_id=cache_cluster_id,
    engine=engine,
    cache_node_type=cache_node_type,
    num_cache_nodes=num_cache_nodes,
    preferred_availability_zone='sa-east-1a',
    preferred_maintenance_window=preferred_maintenance_window,
)
works.
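For what it's worth, boto (boto2) is long deprecated; with boto3 the equivalent call would look roughly like this (a sketch - parameter values are illustrative, taken from the question):
import boto3

client = boto3.client('elasticache', region_name='sa-east-1')
client.create_cache_cluster(
    CacheClusterId='test1',
    Engine='redis',
    CacheNodeType='cache.m3.medium',
    NumCacheNodes=1,
    PreferredAvailabilityZone='sa-east-1a',  # an availability zone, not the region
)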