Python - AWS Lambda extract a key from JSON input

I'm trying to implement a function that receives an event from CloudWatch and prints the results. I am able to get the event, but I want to extract one particular key from that JSON.
Here is my function:
import json
def lambda_handler(event, context):
    print("Received event: " + json.dumps(event, indent=2))
    message = event['Records'][0]['Sns']['Message']
    print(message)
The event received from CloudWatch:
{
"Records": [
{
"EventVersion": "1.0",
"EventSubscriptionArn": "arn:aws:sns:us-east-1:xxxxxxxxxxxxx:bhuvi:XXXXXXXXXXXXXXXXXXXXXXXXXX",
"EventSource": "aws:sns",
"Sns": {
"SignatureVersion": "1",
"Timestamp": "2018-01-13T19:18:44.369Z",
"Signature": "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx",
"SigningCertUrl": "https://sns.us-east-1.amazonaws.com/SimpleNotificationService-XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX.pem",
"MessageId": "4b76b0ea-5e0f-502f-81ec-e23e03dbaf01",
"Message": "{\"AlarmName\":\"test\",\"AlarmDescription\":\"test\",\"AWSAccountId\":\"xxxxxxxxxxxxx\",\"NewStateValue\":\"ALARM\",\"NewStateReason\":\"Threshold Crossed: 1 out of the last 1 datapoints [2.6260535333900545 (13/01/18 19:13:00)] was greater than or equal to the threshold (1.0) (minimum 1 datapoint for OK -> ALARM transition).\",\"StateChangeTime\":\"2018-01-13T19:18:44.312+0000\",\"Region\":\"US East (N. Virginia)\",\"OldStateValue\":\"OK\",\"Trigger\":{\"MetricName\":\"CPUUtilization\",\"Namespace\":\"AWS/RDS\",\"StatisticType\":\"Statistic\",\"Statistic\":\"AVERAGE\",\"Unit\":null,\"Dimensions\":[{\"name\":\"DBInstanceIdentifier\",\"value\":\"myrds\"}],\"Period\":300,\"EvaluationPeriods\":1,\"ComparisonOperator\":\"GreaterThanOrEqualToThreshold\",\"Threshold\":1.0,\"TreatMissingData\":\"\",\"EvaluateLowSampleCountPercentile\":\"\"}}",
"MessageAttributes":
{}
,
"Type": "Notification",
"UnsubscribeUrl": "https://sns.us-east-1.amazonaws.com/?xcsgagrgrwgwrg",
"TopicArn": "arn:aws:sns:us-east-1:xxxxxxxxxxxxx:bhuvi",
"Subject": "ALARM: \"test\" in US East (N. Virginia)"
}
}
]
}
My extraction code (up to Message) and its result:
message = event['Records'][0]['Sns']['Message']
print(message)
Result
{
"AlarmName": "test",
"AlarmDescription": "test",
"AWSAccountId": "xxxxxxxxxxxxx",
"NewStateValue": "ALARM",
"NewStateReason": "Threshold Crossed: 1 out of the last 1 datapoints [2.6260535333900545 (13/01/18 19:13:00)] was greater than or equal to the threshold (1.0) (minimum 1 datapoint for OK -> ALARM transition).",
"StateChangeTime": "2018-01-13T19:18:44.312+0000",
"Region": "US East (N. Virginia)",
"OldStateValue": "OK",
"Trigger": {
"MetricName": "CPUUtilization",
"Namespace": "AWS/RDS",
"StatisticType": "Statistic",
"Statistic": "AVERAGE",
"Unit": null,
"Dimensions": [
{
"name": "DBInstanceIdentifier",
"value": "myrds"
}
],
"Period": 300,
"EvaluationPeriods": 1,
"ComparisonOperator": "GreaterThanOrEqualToThreshold",
"Threshold": 1,
"TreatMissingData": "",
"EvaluateLowSampleCountPercentile": ""
}
}
I want to extract some values from this message.
For example, I want to extract name. I tried the code below, but unfortunately it's not working. Can anyone help me with this?
My code for this:
message = event['Records'][0]['Sns']['Message']['Trigger']['Dimensions']['name']
print(message)
ERROR:
{
"stackTrace": [
[
"/var/task/lambda_function.py",
14,
"lambda_handler",
"message = event['Records'][0]['Sns']['Message']['Trigger']['Dimensions']['name']"
]
],
"errorType": "TypeError",
"errorMessage": "string indices must be integers"
}

So there are 3 problems:
Problem 1: In your example event, event['Records'][0]['Sns']['Message'] is a str containing JSON, not a dict. Indexing a str with a key such as ['Trigger'] is what raises "string indices must be integers". Parse it into a dict first:
message = event['Records'][0]['Sns']['Message']
message = json.loads(message)
Problem 2: message['Trigger']['Dimensions'] is a list, but you are accessing it as if it were a dict. Index its first element before asking for the key:
message = message['Trigger']['Dimensions'][0]['name']
Problem 3: Message is a str, so you should check whether it is plain text or a JSON string before using it (otherwise you will run into problems when different structures and types arrive). Your code could look like this:
message = event['Records'][0]['Sns']['Message']
if isinstance(message, str):
    try:
        message = json.loads(message)
    except ValueError as e:  # raised when the string is not valid JSON
        print(e)  # Or do nothing, this is just to log the error
elif isinstance(message, list):
    message = message[0]
# Maybe evaluate bool, tuple, etc other types
print('RESPONSE', message['Trigger']['Dimensions'][0]['name'] if isinstance(message, dict) else message)
However, I would also recommend making it more extensible by iterating over the elements that you know are lists. For safety (to avoid a KeyError on a missing key), use the dict get() method with a default value: http://www.tutorialspoint.com/python/dictionary_get.htm. Consider wrapping the parsing in a function so that it is reusable.
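The three fixes and the recommendation above can be combined into one small helper. This is only a minimal sketch: the function name extract_dimension_names and the trimmed sample event are assumptions made for illustration, and only the key names are taken from the event in the question.

```python
import json


def extract_dimension_names(event):
    """Return the 'name' of every dimension found in each SNS record's message."""
    names = []
    for record in event.get('Records', []):
        raw_message = record.get('Sns', {}).get('Message')
        if not isinstance(raw_message, str):
            continue  # nothing to parse in this record
        try:
            message = json.loads(raw_message)
        except ValueError:
            continue  # plain-text message, not JSON
        if not isinstance(message, dict):
            continue
        trigger = message.get('Trigger') or {}
        for dimension in trigger.get('Dimensions') or []:
            if isinstance(dimension, dict) and 'name' in dimension:
                names.append(dimension['name'])
    return names


# Trimmed sample event (hypothetical, same shape as the one in the question)
sample_event = {
    'Records': [
        {
            'Sns': {
                'Message': json.dumps({
                    'AlarmName': 'test',
                    'Trigger': {
                        'Dimensions': [
                            {'name': 'DBInstanceIdentifier', 'value': 'myrds'}
                        ]
                    },
                })
            }
        }
    ]
}

print(extract_dimension_names(sample_event))  # ['DBInstanceIdentifier']
```

Returning a list rather than a single value means the helper keeps working if an alarm ever carries more than one dimension or the event carries more than one record.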
Good luck!

Records is a list, which is why you write ['Records'][0]['Sns']... to reach its first element. Dimensions is a list too, so you need to index its first element in the same way: ['Dimensions'][0]['name'].

Related

Save values from POST request of a list of dicts

I am trying to expose an API (if that's the correct way to say it). I am using Quart, a Python library modeled on Flask, and this is what my code looks like:
async def capture_post_request(request_json):
    for item in request_json:
        callbackidd = item['callbackid']
        print(callbackidd)
@app.route('/start_work/', methods=['POST'])
async def start_work():
    content_type = request.headers.get('content-type')
    if (content_type == 'application/json'):
        request_json = await request.get_json()
        loop = asyncio.get_event_loop()
        loop.create_task(capture_post_request(request_json))
        body = "Async Job Started"
        return body
    else:
        return 'Content-Type not supported!'
My request body looks like this:
[
{
"callbackid": "dd",
"itemid": "234r",
"input": [
{
"type": "thistype",
"uri": "www.uri.com"
}
],
"destination": {
"type": "thattype",
"uri": "www.urino2.com"
}
},
{
"statusCode": "202"
}
]
So far what I am getting is this error:
line 11, in capture_post_request
callbackidd = item['callbackid']
KeyError: 'callbackid'
I've tried the suggestions from many Stack Overflow posts on how to iterate through my list of dicts, but nothing worked. At one point in my start_work function I was using the get_data(as_text=True) method, but still got no results. In fact, with that last method I got:
TypeError: string indices must be integers
Any help on how to access those values is greatly appreciated. Cheers.
Your schema indicates there are two items in request_json. The first one does have callbackid; the second only has statusCode.
Debugging this should be easy:
async def capture_post_request(request_json):
    for item in request_json:
        print(item)
        callbackidd = item.get('callbackid')
        print(callbackidd)  # will be None in case of the 2nd 'item'
This will print two dicts:
{
"callbackid": "dd",
"itemid": "234r",
"input": [
{
"type": "thistype",
"uri": "www.uri.com"
}
],
"destination": {
"type": "thattype",
"uri": "www.urino2.com"
}
}
And the 2nd, the cause of your KeyError:
{
"statusCode": "202"
}
I included the 'fix' of sorts already:
callbackidd = item.get('callbackid')
This will default to None if the key isn't in the dict.
Hopefully this will get you further!
Edit
How do you work with only the dict that contains your key? There are two options.
First, using filter. Something like this:
def has_callbackid(dict_to_test):
    return 'callbackid' in dict_to_test
list_with_only_list_callbackid_items = list(filter(has_callbackid, request_json))
# Still a list at this point! With dicts which have the `callbackid` key
filter takes two arguments: a function that is called on each value to decide whether to keep it, and the iterable you want to filter.
You could also use a lambda function; it's a bit evil, but it serves the purpose just as well:
list_with_only_list_callbackid_items = list(filter(lambda x: 'callbackid' in x, request_json))
# Still a list at this point! With dict(s) which have the `callbackid` key
Option 2: simply loop over the items and grab only the one you want to use.
found_item = None  # default
for item in request_json:
    if 'callbackid' in item:
        found_item = item
        break  # found what we're looking for, stop now
# Do stuff with the found_item from this point.
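Both options can also be written more compactly with a list comprehension and with next() plus a default. This is a sketch under the assumption that request_json is a list of dicts shaped like the payload in the question; the shortened sample data is made up for illustration.

```python
# Shortened sample data (hypothetical), same shape as the request body above
request_json = [
    {"callbackid": "dd", "itemid": "234r"},
    {"statusCode": "202"},
]

# List comprehension: same result as the filter() calls
items_with_callbackid = [item for item in request_json if 'callbackid' in item]

# next() with a default: the first matching dict, or None if there is none
found_item = next((item for item in request_json if 'callbackid' in item), None)

print(items_with_callbackid)
print(found_item['callbackid'] if found_item else 'no callbackid found')
```

The default passed to next() matters: without it, an input that contains no matching dict would raise StopIteration instead of giving None.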

KeyError in Python 3 when looping items

I get a KeyError when one of the fields in my JSON has no value or doesn't exist at all. To work around it, I wrapped my loop in a try/except so it can skip the error and continue with the rest of the loop. However, the problem with this approach is that when one field is missing, I can't print the rest of the JSON data because of those one or two missing items.
For example:
Scenario 1:
JSON format:
{
"data": {
"name": "Bloomberg",
"city": "NYC",
"country": "USA"
}
}
Scenario 1 works fine, since all items and their values are available.
Output:
('name:Bloomberg', 'city:NYC', 'country:USA')
Scenario 2:
JSON format:
{
"data": {
"name": "Bloomberg",
"country": "USA"
}
}
In this scenario, the exception will capture that KeyError and skip it. However, I still need to print that data out regardless of that one missing field. I am looking for an output like this:
('name:Bloomberg', 'city field not available', 'country:USA')
The exception I used in the loop:
try:
    myData = (myJSON['data']['name'], myJSON['data']['city'], myJSON['data']['country'])
    print(myData)
except KeyError as e:
    pass
You can use the dict get method. It returns None (or a default that you pass as the second argument) when the requested key doesn't exist in the dictionary.
my_json = {
    "data": {
        "name": "Bloomberg",
        "country": "USA"
    }
}
my_json_data = my_json.get('data')
if my_json_data is not None:
    my_data = (my_json_data.get('name'), my_json_data.get('city'), my_json_data.get('country'))
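To get the exact output asked for in the question, the result of get can be checked per field. The following is a minimal sketch; the helper name describe_fields is hypothetical, and it treats a field whose value is None the same as a missing field, which is an assumption about what "has no value" means here.

```python
my_json = {"data": {"name": "Bloomberg", "country": "USA"}}


def describe_fields(record, fields):
    """Build 'field:value' strings, with a placeholder for absent fields."""
    parts = []
    for field in fields:
        value = record.get(field)
        if value is None:  # key missing, or present with a null value
            parts.append('%s field not available' % field)
        else:
            parts.append('%s:%s' % (field, value))
    return tuple(parts)


my_data = describe_fields(my_json.get('data', {}), ('name', 'city', 'country'))
print(my_data)  # ('name:Bloomberg', 'city field not available', 'country:USA')
```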

Feed string using slack webhook

I have a set of data (domain name, severity score) in string format, stored in post, and I am trying to post it to Slack. It keeps throwing errors and I don't know why. I appreciate the help.
I have tried changing the JSON portion a bit, as well as changing what is being sent to the function in general, and nothing helps.
def slackHook(post):
    webhook_url = ...  # Omitted
    slack_content = {"channel": "#brian", "user": "Awesom-O", "attachment": [{
        "type": "section",
        "text": {
            "text": "Random message before domains",
            "type": "mrkdwn",
        },
        "fields": [
            {
                "type": "mrkdwn",
                "text": "Domain Severity Score"
            },
            {
                "type": "plain_text",
                "text": post
            }
        ]
    }]}
    string_payload = json.dumps(slack_content)
    r = requests.post(webhook_url, data=string_payload)
    if r.status_code != 200:
        raise ValueError('Request to slack.com returned an error %s, the response is:\n%s' % (r.status_code, r.text))
domains = db_query()
domains = str(domains)
slackHook(domains)
Happy Path: I would just like to take my string and post it to my slack channel using the fields that I've given for context.
The current error:
raise ValueError('Request to slack.com returned an error %s, the response is:\n%s' % (r.status_code, r.text))
ValueError: Request to slack.com returned an error 400, the response is:
no_text
Your main issue is that you were mixing the syntax for attachments and blocks, which are different concepts. Attachments are outdated and should no longer be used.
Just replace "attachment" with "blocks" like so:
slack_content = {"channel": "#brian", "user": "Awesom-O", "blocks": [{
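For reference, a complete payload in the blocks format might look like the sketch below. The helper name build_slack_payload, the fallback text and the sample input are assumptions made for illustration, and the request itself is left as a comment because the webhook URL is omitted in the question. Check the result against Slack's current Block Kit documentation before relying on it.

```python
import json


def build_slack_payload(post):
    """Build a payload with a single section block (sketch, not verified against Slack)."""
    return {
        # Top-level text is commonly included as a fallback for notifications
        "text": "Domain severity report",
        "blocks": [
            {
                "type": "section",
                "text": {"type": "mrkdwn", "text": "Random message before domains"},
                "fields": [
                    {"type": "mrkdwn", "text": "Domain Severity Score"},
                    {"type": "plain_text", "text": post},
                ],
            }
        ],
    }


payload = build_slack_payload("example.com 7")  # hypothetical data
print(json.dumps(payload, indent=2))
# To send it: requests.post(webhook_url, json=payload)
```

Passing the dict through the json argument of requests.post serializes it and sets the Content-Type header, so the separate json.dumps step is not needed when sending.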

Getting KeyError when parsing JSON in Python for following response

TL;DR:
Confused on how to parse following JSON response and get the value of [status of 12345 of dynamicValue_GGG of payload] in this case.
Full question:
I get the following (sanitized) response when calling a REST API with the Python code below:
response = requests.request("POST", url, data=payload,
                            headers=headers).json()
{
"payload": {
"name": "asdasdasdasd",
"dynamicValue_GGG": {
"12345": {
"model": "asad",
"status": "active",
"subModel1": {
"dynamicValue_67890": {
"model": "qwerty",
"status": "active"
},
"subModel2": {
"dynamicValue_33445": {
"model": "gghjjj",
"status": "active"
},
"subModel3": {
"dynamicValue_66778": {
"model": "tyutyu",
"status": "active"
}
}
}
},
"date": "2016-02-04"
},
"design": "asdasdWWWsaasdasQ"
}
If I do a type(response['payload']), it gives me 'dict'.
Now, I'm trying to parse the response above and fetch certain keys and values out of it. The problem is that I'm not able to iterate through using "index" and rather have to specify the "key", but then the response has certain "keys" that are dynamically generated and sent over. For instance, the keys called "dynamicValue_GGG", "dynamicValue_66778" etc are not static unlike the "status" key.
I can successfully parse by mentioning like:
print(response['payload']['dynamicValue_GGG']['12345']['status'])
in which case I get the expected output = 'active'.
However, since I have no control over 'dynamicValue_GGG', it would only work if I could write something like this instead:
print(response['payload'][0][0]['status'])
But the above line gives me the error "KeyError: 0" when the Python code is executed.
Is there some way in which I can use both keys and indexes together in this case?
A dictionary is accessed by key, not by position, so response['payload'][0] looks for a key 0 and raises KeyError (and in older Python versions the order of a dict's entries is not even guaranteed). You'll have to iterate over all elements, potentially recursively, and test whether each one is the thing you're looking for. For example:
def find_submodels(your_dict):
    for item_key, item_values in your_dict.items():
        # Only dicts can hold a 'status' key or further nested models
        if isinstance(item_values, dict):
            if 'status' in item_values:
                print('%s %s' % (item_key, item_values['status']))
            find_submodels(item_values)
find_submodels(response)
Which would output (the order may vary between Python versions):
12345 active
dynamicValue_67890 active
dynamicValue_66778 active
dynamicValue_33445 active
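If you also need to know where each status was found, the same recursive walk can yield the path of keys instead of printing. This is a sketch in Python 3 (it uses yield from); the function name iter_statuses and the shortened sample response are assumptions made for illustration.

```python
def iter_statuses(node, path=()):
    """Yield (path, status) for every nested dict that has a 'status' key."""
    if not isinstance(node, dict):
        return
    if 'status' in node:
        yield path, node['status']
    for key, value in node.items():
        # Recurse into every value; non-dict values yield nothing
        yield from iter_statuses(value, path + (key,))


# Shortened sample (hypothetical), same shape as the response in the question
response = {
    "payload": {
        "name": "asdasdasdasd",
        "dynamicValue_GGG": {
            "12345": {
                "model": "asad",
                "status": "active",
                "subModel1": {
                    "dynamicValue_67890": {"model": "qwerty", "status": "active"}
                },
            }
        },
    }
}

for path, status in iter_statuses(response):
    print('/'.join(path), status)
```

Because the dynamic keys are returned as part of the path, the caller never has to know them in advance.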

Equivalent of Python "json.dumps()" in R?

I'm a beginner in R (still taking the "R Programming" course on Coursera) and I'm trying to practice by porting some simple code from Python to R.
Currently I'm trying to make API calls to a KairosDB database. To make the query, I need to encode the Python object with json.dumps() (from the standard json library), but after a lot of searching I still don't understand how to do that with R and its jsonlite library. I don't even know if I'm creating the JSON object correctly; this is just what I found in some searches.
My code written in Python 3 (from this repo):
import requests
import json
kairosdb_server = "http://localhost:8080"
# Simple test
query = {
"start_relative": {
"value": "4",
"unit": "years"
},
"metrics": [
{
"name": "test",
"limit": 10000
}
]
}
response = requests.post(kairosdb_server + "/api/v1/datapoints/query", data=json.dumps(query))
print("Status code: %d" % response.status_code)
print("JSON response:")
print(response.json())
My current code written in R 3.2.3:
library(httr)
library(jsonlite)
kairosdb_server <- 'http://localhost:8080'
query <- serializeJSON(toJSON('
"start_relative": {
"value": "4",
"unit": "years"
},
"metrics": [
{
"name": "test",
"limit": 1000
}
]
'))
url <- paste(kairosdb_server, '/api/v1/datapoints/query')
response <- POST(url, body = query, encode = 'json')
print(paste("Query status code: ", response$status_code))
print(paste("JSON response: \n", content(response, type = 'application/json')))
If I run that, I get the following error:
print(paste("Query status code: ", response$status_code))
# [1] "Query status code: 400"
print(paste("JSON response: \n", content(response, type = 'application/json')))
# [1] "JSON response: \n list(\"query.metric[] must have a size of at least 1\")"
What am I doing wrong?
Normally one would pass a named list into body, but getting R to preserve the array in "metrics" is tricky. Since you kinda already have JSON from the original Python structure, why not just add the enclosing braces and pass it in as a character vector? i.e.
query <- '{"start_relative": {
"value": "4",
"unit": "years"
},
"metrics": [
{
"name": "test",
"limit": 10000
}
]}'
(then just use that query in the POST). It's equivalent JSON to what json.dumps() spits out:
# get rid of newlines and spaces just to show they are the same,
# the server won't (shouldn't) care if there are newlines/spaces
cat(gsub(" \\]", "]", gsub("\\[ ", "[", gsub(" \\}", "}", gsub("\\{ ", "{", gsub(" +", " ", gsub("\\n", "", query)))))))
{"start_relative": {"value": "4", "unit": "years"}, "metrics": [{"name": "test", "limit": 10000}]}
# python
json.dumps(query)
'{"metrics": [{"limit": 10000, "name": "test"}], "start_relative": {"unit": "years", "value": "4"}}'
If you do need an R data structure to work with, you're going to end up manipulating the output of toJSON.
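On the Python side, the claim that the hand-written string and the json.dumps() output are equivalent can be checked by parsing both and comparing the results. This is a small sketch; the variable name hand_written is an assumption, and the string is the same text as the R character vector above.

```python
import json

query = {
    "start_relative": {"value": "4", "unit": "years"},
    "metrics": [{"name": "test", "limit": 10000}],
}

# Same text as the R character vector, newlines included
hand_written = '''{"start_relative": {
"value": "4",
"unit": "years"
},
"metrics": [
{
"name": "test",
"limit": 10000
}
]}'''

# Key order and whitespace do not matter once both sides are parsed
print(json.loads(hand_written) == json.loads(json.dumps(query)))  # True
```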
