EC2 instances are not starting using ec2.start_instances()
import boto3

ec2 = boto3.client('ec2')

# Define the tag key and values
tag_key = "Environment"
tag_values = ["MT1", "UAT2"]

# Exclusion
exclude_tag_key = "Exclude"
exclude_tag_value = "Yes"

def lambda_handler(event, context):
    # Get all the instances with either of the specified tag values
    response = ec2.describe_instances(
        Filters=[
            {
                'Name': 'tag:' + tag_key,
                'Values': tag_values
            },
            {
                'Name': 'instance-state-name',
                'Values': ['stopped']
            }
            # {
            #     'Name': 'tag:' + exclude_tag_key,
            #     'Values': [exclude_tag_value],
            #     'Operator': 'NOT'
            # }
        ]
    )

    # Extract the instance IDs from the response
    instance_ids = [instance['InstanceId']
                    for reservation in response['Reservations']
                    for instance in reservation['Instances']]

    # Start the instances
    ec2.start_instances(InstanceIds=instance_ids)
    print(instance_ids)
We are able to get the filtered instances in "instance_ids", but none of the stopped instances transitions to the starting state.
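One thing worth checking: the commented-out exclusion filter would not work even if uncommented, because describe_instances filters have no NOT operator, so the exclusion has to happen client-side; it also typically fails if start_instances is called with an empty InstanceIds list. A sketch of the client-side exclusion as a pure function (the input shape matches the describe_instances response; the helper name is my own):

```python
def ids_to_start(response, exclude_key="Exclude", exclude_value="Yes"):
    """Flatten a describe_instances response into instance IDs,
    skipping any instance carrying the exclusion tag."""
    ids = []
    for reservation in response.get('Reservations', []):
        for instance in reservation['Instances']:
            tags = {t['Key']: t['Value'] for t in instance.get('Tags', [])}
            if tags.get(exclude_key) == exclude_value:
                continue  # excluded client-side; the filter API cannot negate
            ids.append(instance['InstanceId'])
    return ids
```

With that in place, the handler would only call ec2.start_instances(InstanceIds=ids) when ids is non-empty.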
Related
I'm looking to find instances whose platform is not "Windows" and tag them with specific tags.
For now I have this script, which tags the instances whose platform is "Windows":
import boto3

ec2 = boto3.client('ec2')

response = ec2.describe_instances(Filters=[{'Name': 'platform', 'Values': ['windows']}])

for each_res in response['Reservations']:
    for each_inst in each_res['Instances']:
        response = ec2.create_tags(
            Resources=[each_inst['InstanceId']],
            Tags=[
                {
                    'Key': 'test',
                    'Value': 'test01'
                }
            ]
        )
I need help adding a block to this script that will add another tag only to EC2 instances whose platform is NOT "Windows".
Try this; it works for me. Also, by running create_tags inside the for loop you execute one API call per resource, whereas create_tags accepts multiple resources as input. Reference: https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/ec2.html#EC2.Client.create_tags
import boto3

# Initialize an empty list to store non-Windows instance IDs.
list_nonwindows = []

ec2 = boto3.client("ec2", region_name="us-east-1")
response = ec2.describe_instances()

for each_res in response["Reservations"]:
    for each_inst in each_res["Instances"]:
        # 'Platform' is absent for non-Windows instances
        if each_inst.get('Platform') is None:
            list_nonwindows.append(each_inst.get('InstanceId'))

response = ec2.create_tags(
    Resources=list_nonwindows,
    Tags=[
        {
            'Key': 'test',
            'Value': 'test01'
        }
    ]
)
Just remove the filter, iterate over all the instances, and add an if condition on the platform key inside the loop.
#!/usr/bin/env python
# -*- coding: utf-8 -*-
import boto3

ec2 = boto3.client("ec2", region_name="eu-central-1")
response = ec2.describe_instances()

for each_res in response["Reservations"]:
    for each_inst in each_res["Instances"]:
        platform = each_inst.get('Platform')
        instance_id = each_inst.get('InstanceId')
        if platform == 'Windows':
            response = ec2.create_tags(
                Resources=[instance_id],
                Tags=[
                    {
                        'Key': 'test',
                        'Value': 'test01'
                    }
                ]
            )
        else:
            print(f'found non-Windows instance: {instance_id}')
            response = ec2.create_tags(
                Resources=[instance_id],
                Tags=[
                    {
                        'Key': 'nonwindow',
                        'Value': 'nonwindowvalue'
                    }
                ]
            )
As per the API docs:
The value is Windows for Windows instances; otherwise blank.
The code works correctly; I tested it:
$ python3 describe_instances.py
found non-Windows instance: i-0ba1a62801c895
describe_instances
Response structure received from the describe_instances call:
{
    'Reservations': [
        {
            'Groups': [
                {
                    'GroupName': 'string',
                    'GroupId': 'string'
                },
            ],
            'Instances': [
                {
                    'AmiLaunchIndex': 123,
                    'ImageId': 'string',
                    'InstanceId': 'string',
                    ....
                    'Platform': 'Windows',
                    'PrivateDnsName': 'string',
                    'PrivateIpAddress': 'string',
                    'ProductCodes': [
                        ....
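One detail the snippets above gloss over: describe_instances returns results in pages, so a single call can silently miss instances on larger accounts. boto3 ships a paginator for this (get_paginator('describe_instances')); the flattening step is sketched below as a pure function over pages shaped like the response above, so it can be checked without an AWS account (the helper name is my own):

```python
def platforms_by_instance(pages):
    """Walk paginated describe_instances pages and map each instance ID
    to its 'Platform' value; the key is absent for non-Windows instances."""
    result = {}
    for page in pages:
        for reservation in page['Reservations']:
            for inst in reservation['Instances']:
                result[inst['InstanceId']] = inst.get('Platform')
    return result

# With real credentials this would be driven by:
#   pages = boto3.client('ec2').get_paginator('describe_instances').paginate()
```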
@app.route("/companies/<string:companyId>/<string:name>/")
def get_search(companyId, name):
    resp = client.get_item(
        TableName=COMPANIES_TABLE,
        Key={
            'companyId': {'S': companyId},
            'name': {'S': name}
        }
    )
    item = resp.get('Item')
    if not item:
        return jsonify({'error': 'Company does not exist'}), 404
    return jsonify({
        'companyId': item.get('companyId').get('S'),
        'name': item.get('name').get('S'),
        'region': item.get('region').get('S')
    })
The response from a DynamoDB resource object doesn't require me to parse the low-level data structure from DynamoDB, but when I use the boto3 client I have to. Why is that?
response = table.scan(
    FilterExpression=Attr('name').eq(name)
)
item = response['Items']
import pdb; pdb.set_trace()
if not item:
    return jsonify({'error': 'Company does not exist'}), 404
return jsonify({
    'companyId': item.get('companyId'),
    'name': item.get('name'),
    'region': item.get('region')
})
In general, the resource API in boto3 is a higher-level abstraction over the underlying client API. It hides some implementation details of the underlying client calls, but comes at a performance cost.
You can also use the deserializer that comes with boto3 to turn the values from client.get_item() into a Python object.
from boto3.dynamodb.types import TypeDeserializer
def main():
dynamodb_item = {
"PK": {
"S": "key"
},
"SK": {
"S": "value"
}
}
deserializer = TypeDeserializer()
deserialized = {
key: deserializer.deserialize(dynamodb_item[key])
for key in dynamodb_item.keys()
}
print(deserialized) # {'PK': 'key', 'SK': 'value'}
if __name__ == "__main__":
main()
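For the opposite direction boto3 also ships TypeSerializer, which wraps plain Python values back into the type envelope. Since showing it live would need boto3 at runtime, here is a simplified stand-in covering only strings, numbers, and booleans, just to make the shape of the transformation concrete (this is not the real implementation):

```python
def serialize(value):
    """Minimal sketch of what boto3's TypeSerializer does: wrap a plain
    Python value in the one-key DynamoDB type envelope."""
    if isinstance(value, bool):
        return {'BOOL': value}
    if isinstance(value, (int, float)):
        return {'N': str(value)}  # DynamoDB numbers travel as strings
    if isinstance(value, str):
        return {'S': value}
    raise TypeError('unsupported type in this sketch: %r' % type(value))

python_item = {'PK': 'key', 'SK': 'value', 'count': 3}
dynamodb_item = {k: serialize(v) for k, v in python_item.items()}
```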
I want to execute a spark-submit job on an AWS EMR cluster based on a file upload event on S3. I am using an AWS Lambda function to capture the event, but I have no idea how to submit a spark-submit job to the EMR cluster from the Lambda function.
Most of the answers I found talk about adding a step to the EMR cluster, but I do not know whether an added step can fire "spark-submit --with args".
You can; I had to do the same thing last week!
Using boto3 for Python (other languages definitely have a similar solution), you can either start a cluster with the step defined, or attach a step to an already running cluster.
Defining the cluster with the step
import boto3

def lambda_handler(event, context):
    conn = boto3.client("emr")
    cluster_id = conn.run_job_flow(
        Name='ClusterName',
        ServiceRole='EMR_DefaultRole',
        JobFlowRole='EMR_EC2_DefaultRole',
        VisibleToAllUsers=True,
        LogUri='s3n://some-log-uri/elasticmapreduce/',
        ReleaseLabel='emr-5.8.0',
        Instances={
            'InstanceGroups': [
                {
                    'Name': 'Master nodes',
                    'Market': 'ON_DEMAND',
                    'InstanceRole': 'MASTER',
                    'InstanceType': 'm3.xlarge',
                    'InstanceCount': 1,
                },
                {
                    'Name': 'Slave nodes',
                    'Market': 'ON_DEMAND',
                    'InstanceRole': 'CORE',
                    'InstanceType': 'm3.xlarge',
                    'InstanceCount': 2,
                }
            ],
            'Ec2KeyName': 'key-name',
            'KeepJobFlowAliveWhenNoSteps': False,
            'TerminationProtected': False
        },
        Applications=[{
            'Name': 'Spark'
        }],
        Configurations=[{
            "Classification": "spark-env",
            "Properties": {},
            "Configurations": [{
                "Classification": "export",
                "Properties": {
                    "PYSPARK_PYTHON": "python35",
                    "PYSPARK_DRIVER_PYTHON": "python35"
                }
            }]
        }],
        BootstrapActions=[{
            'Name': 'Install',
            'ScriptBootstrapAction': {
                'Path': 's3://path/to/bootstrap.script'
            }
        }],
        Steps=[{
            'Name': 'StepName',
            'ActionOnFailure': 'TERMINATE_CLUSTER',
            'HadoopJarStep': {
                'Jar': 's3n://elasticmapreduce/libs/script-runner/script-runner.jar',
                'Args': [
                    "/usr/bin/spark-submit", "--deploy-mode", "cluster",
                    's3://path/to/code.file', '-i', 'input_arg',
                    '-o', 'output_arg'
                ]
            }
        }],
    )
    return "Started cluster {}".format(cluster_id['JobFlowId'])
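To wire this to the S3 upload, the handler also needs the bucket and key out of the triggering event so they can be passed along in the step Args (for example as the '-i' input argument above). The record layout below follows the standard S3 event notification; the helper name is my own:

```python
def s3_object_from_event(event):
    """Extract (bucket, key) from the S3 put event that invoked the Lambda."""
    record = event['Records'][0]
    return (record['s3']['bucket']['name'],
            record['s3']['object']['key'])
```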
Attaching a step to an already running cluster
As per here
import sys
import time

import boto3

def lambda_handler(event, context):
    conn = boto3.client("emr")
    # chooses the first cluster which is Running or Waiting
    # possibly can also choose by name or already have the cluster id
    clusters = conn.list_clusters()
    # choose the correct cluster
    clusters = [c["Id"] for c in clusters["Clusters"]
                if c["Status"]["State"] in ["RUNNING", "WAITING"]]
    if not clusters:
        sys.stderr.write("No valid clusters\n")
        sys.exit(1)
    # take the first relevant cluster
    cluster_id = clusters[0]
    # code location on your emr master node
    CODE_DIR = "/home/hadoop/code/"
    # spark configuration example
    step_args = ["/usr/bin/spark-submit", "--spark-conf", "your-configuration",
                 CODE_DIR + "your_file.py", '--your-parameters', 'parameters']
    step = {"Name": "what_you_do-" + time.strftime("%Y%m%d-%H:%M"),
            'ActionOnFailure': 'CONTINUE',
            'HadoopJarStep': {
                'Jar': 's3n://elasticmapreduce/libs/script-runner/script-runner.jar',
                'Args': step_args
            }}
    action = conn.add_job_flow_steps(JobFlowId=cluster_id, Steps=[step])
    return "Added step: %s" % (action)
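The cluster-selection logic above is the part most likely to misbehave silently (for example picking the wrong cluster when several are up), so it can help to factor it into a pure function that can be exercised against a canned list_clusters response — a sketch:

```python
def pick_cluster(list_clusters_response):
    """Return the Id of the first RUNNING or WAITING cluster, or None."""
    candidates = [c["Id"] for c in list_clusters_response["Clusters"]
                  if c["Status"]["State"] in ("RUNNING", "WAITING")]
    return candidates[0] if candidates else None
```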
AWS Lambda function Python code if you want to execute a Spark jar via spark-submit, submitted through the REST endpoint on the EMR master node:
from botocore.vendored import requests
import json

def lambda_handler(event, context):
    headers = {"content-type": "application/json"}
    url = 'http://ip-address.ec2.internal:8998/batches'
    payload = {
        'file': 's3://Bucket/Orchestration/RedshiftJDBC41.jar,'
                's3://Bucket/Orchestration/mysql-connector-java-8.0.12.jar,'
                's3://Bucket/Orchestration/SparkCode.jar',
        'className': 'Main Class Name',
        'args': [event.get('rootPath')]
    }
    res = requests.post(url, data=json.dumps(payload), headers=headers, verify=False)
    json_data = json.loads(res.text)
    return json_data.get('id')
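The POST above only queues the batch; port 8998 with a /batches path is the Apache Livy REST API, and the returned id can be polled at GET /batches/{id} until the 'state' field turns terminal. A sketch of that loop, with the state lookup injected as a callable so the loop itself can be exercised without a cluster (state names per the Livy docs):

```python
import time

TERMINAL_STATES = {'success', 'error', 'dead', 'killed'}

def wait_for_batch(get_state, delay=5.0, max_attempts=60):
    """Poll until the Livy batch reaches a terminal state.

    get_state: callable returning the current 'state' string, e.g.
      lambda: requests.get(url + '/' + str(batch_id)).json()['state']
    """
    for _ in range(max_attempts):
        state = get_state()
        if state in TERMINAL_STATES:
            return state
        time.sleep(delay)
    raise TimeoutError('Livy batch did not finish in time')
```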
To follow up on this question:
Filter CloudWatch Logs to extract Instance ID
I think it leaves the question incomplete because it does not say how to access the event object with Python.
My goal is to:
read the instance that was triggered by a change in running state
get a tag value associated with the instance
start all other instances that have the same tag
The Cloudwatch trigger event is:
{
"source": [
"aws.ec2"
],
"detail-type": [
"EC2 Instance State-change Notification"
],
"detail": {
"state": [
"running"
]
}
}
I can see examples like this:
def lambda_handler(event, context):
# here I want to get the instance tag value
# and set the tag filter based on the instance that
# triggered the event
filters = [{
'Name': 'tag:StartGroup',
'Values': ['startgroup1']
},
{
'Name': 'instance-state-name',
'Values': ['running']
}
]
instances = ec2.instances.filter(Filters=filters)
I can see the event object, but I don't see how to drill down into the tags of the instance whose state changed to running.
Please, what is the object attribute through which I can get a tag from the instance that triggered the event?
I suspect it is something like:
myTag = event.details.instance-id.tags["startgroup1"]
The event data passed to Lambda contains the Instance ID.
You then need to call describe_tags() to retrieve a dictionary of the tags.
import boto3

client = boto3.client('ec2')
client.describe_tags(
    Filters=[
        {
            'Name': 'resource-id',
            'Values': [
                event['detail']['instance-id']
            ]
        }
    ]
)
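describe_tags returns a flat list of {'Key', 'Value', ...} entries under 'Tags'; collapsing it into a plain dict makes the lookup the question asks for a one-liner (the helper name is my own; the input shape matches the describe_tags response):

```python
def tags_as_dict(describe_tags_response):
    """Flatten a describe_tags response into {key: value}."""
    return {t['Key']: t['Value']
            for t in describe_tags_response.get('Tags', [])}

# usage: tags_as_dict(response).get('StartGroup')
```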
In the detail section of the event you will get the instance ID. Using the instance ID and the AWS SDK you can query the tags. The following is a sample event:
{
    "version": "0",
    "id": "ee376907-2647-4179-9203-343cfb3017a4",
    "detail-type": "EC2 Instance State-change Notification",
    "source": "aws.ec2",
    "account": "123456789012",
    "time": "2015-11-11T21:30:34Z",
    "region": "us-east-1",
    "resources": [
        "arn:aws:ec2:us-east-1:123456789012:instance/i-abcd1111"
    ],
    "detail": {
        "instance-id": "i-abcd1111",
        "state": "running"
    }
}
This is what I came up with...
Please let me know how it can be done better. Thanks for the help.
# StartMeUp_Instances_byOne
#
# This lambda script is triggered by a CloudWatch Event, startGroupByInstance.
# Every evening a separate lambda script is launched on a schedule to stop
# all non-essential instances.
#
# This script will turn on all instances with a LaunchGroup tag that matches
# a single instance which has been changed to the running state.
#
# To start all instances in a LaunchGroup,
# start one of the instances in the LaunchGroup and wait about 5 minutes.
#
# Costs to run: approx. $0.02/month
# https://s3.amazonaws.com/lambda-tools/pricing-calculator.html
# 150 executions per month * 128 MB Memory * 60000 ms Execution Time
#
# Problems: talk to chrisj
# ======================================
# test system
# this is what the event object looks like (see below)
# it is configured in the test event object with a specific instance-id
# change that to test a different instance-id with a different LaunchGroup
# { "version": "0",
# "id": "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx",
# "detail-type": "EC2 Instance State-change Notification",
# "source": "aws.ec2",
# "account": "999999999999999",
# "time": "2015-11-11T21:30:34Z",
# "region": "us-east-1",
# "resources": [
# "arn:aws:ec2:us-east-1:123456789012:instance/i-abcd1111"
# ],
# "detail": {
# "instance-id": "i-0aad9474", # <---------- chg this
# "state": "running"
# }
# }
# ======================================
import boto3
import logging
import json
ec2 = boto3.resource('ec2')
def get_instance_LaunchGroup(iid):
# When given an instance ID as str e.g. 'i-1234567',
# return the instance LaunchGroup.
ec2 = boto3.resource('ec2')
ec2instance = ec2.Instance(iid)
thisTag = ''
for tags in ec2instance.tags:
if tags["Key"] == 'LaunchGroup':
thisTag = tags["Value"]
return thisTag
# this is the entry point for the cloudwatch trigger
def lambda_handler(event, context):
# get the instance id that triggered the event
thisInstanceID = event['detail']['instance-id']
print("instance-id: " + thisInstanceID)
# get the LaunchGroup tag value of the thisInstanceID
thisLaunchGroup = get_instance_LaunchGroup(thisInstanceID)
print("LaunchGroup: " + thisLaunchGroup)
if thisLaunchGroup == '':
print("No LaunchGroup associated with this InstanceID - ending lambda function")
return
# set the filters
filters = [{
'Name': 'tag:LaunchGroup',
'Values': [thisLaunchGroup]
},
{
'Name': 'instance-state-name',
'Values': ['stopped']
}
]
# get the instances based on the filter, thisLaunchGroup and stopped
instances = ec2.instances.filter(Filters=filters)
# get the stopped instance IDs
stoppedInstances = [instance.id for instance in instances]
# make sure there are some instances not already started
if len(stoppedInstances) > 0:
startingUp = ec2.instances.filter(InstanceIds=stoppedInstances).start()
print ("Finished launching all instances for tag: " + thisLaunchGroup)
So, here's how I got the tags in my Python code for my Lambda function.
ec2 = boto3.resource('ec2')
instance = ec2.Instance(instanceId)

# get image_id from instance-id
imageId = instance.image_id
print(imageId)

for tags in instance.tags:
    if tags["Key"] == 'Name':
        newName = tags["Value"] + ".mydomain.com"
        print(newName)
So, using instance.tags and then checking for the "Key" matching my Name tag, I pull out the "Value" to build the FQDN (Fully Qualified Domain Name).
I am trying to add a tag to existing ec2 instances using create_tags.
ec2 = boto3.resource('ec2', region_name=region)
instances = ec2.instances.filter(Filters=[{'Name': 'instance-state-name',
                                           'Values': ['running']}])
for instance in instances:
    ec2.create_tags([instance.id], {"TagName": "TagValue"})
This is giving me this error:
TypeError: create_tags() takes exactly 1 argument (3 given)
First, you CANNOT use boto3.resource("ec2") like that. boto3.resource is a high-level layer that is associated with particular resources, so the filtered collection above already returns instance resources. The collection documentation always looks like this:
# resource will inherit the associated instance's/service's resource.
tag = resource.create_tags(
    DryRun=True|False,
    Tags=[
        {
            'Key': 'string',
            'Value': 'string'
        },
    ]
)
So in your code, you just call it directly on each instance in the resource collection:
for instance in instances:
    instance.create_tags(Tags=[{'Key': 'TagName', 'Value': 'TagValue'}])
Next is the tag format; follow the documentation. You got the filter format correct, but not the create_tags dict:
response = client.create_tags(
    DryRun=True|False,
    Resources=[
        'string',
    ],
    Tags=[
        {
            'Key': 'string',
            'Value': 'string'
        },
    ]
)
On the other hand, boto3.client() is a low-level client that requires explicit resource IDs:
import boto3

ec2 = boto3.client("ec2")
reservations = ec2.describe_instances(
    Filters=[{'Name': 'instance-state-name',
              'Values': ['running']}])["Reservations"]

mytags = [{
    "Key": "TagName",
    "Value": "TagValue"
},
{
    "Key": "APP",
    "Value": "webapp"
},
{
    "Key": "Team",
    "Value": "xteam"
}]

for reservation in reservations:
    for each_instance in reservation["Instances"]:
        ec2.create_tags(
            Resources=[each_instance["InstanceId"]],
            Tags=mytags
        )
(Update)
A reason to use resources is code reuse for a universal object; for example, the following wrapper lets you create tags for any resource that supports create_tags:
def make_resource_tag(resource, tags_dictionary):
    # tags_dictionary is a list of {'Key': ..., 'Value': ...} dicts
    response = resource.create_tags(
        Tags=tags_dictionary)
    return response