Extracting a value in python with specific JSON array - python

I'm new to Python. How would I get the value of the appid key-value pair from the JSON below?
{
"Datadog":[
"host:i-068fee2324438213477be9a4"
],
"Amazon Web Services":[
"availability-zone:us-east-1a",
"aws:cloudformation:logical-id:ec2instance01",
"aws:cloudformation:stack-id:arn:aws:cloudformation:us-east-1:353245",
"appid:42928482474dh28424a",
"name:devinstance",
"region:us-east-1",
"security-group:sg-022442414d8a",
"security-group:sg-0691af18875ad9d0b",
"security-group:sg-022442414d8a",
"security-group:sg-022442414d8a"
]
}

What you're using is a dictionary. You can access its values like this:
nameOfYourDict["nameOfYourKey"]
For example, if your dict is named data and you want to access Datadog:
data["Datadog"]

Start by getting the AWS pairs into their own variable:
aws_pairs = data["Amazon Web Services"]
Then loop over the pairs until you find one with the correct anchor:
appid_pair = None
for pair in aws_pairs:
    if pair.startswith("appid:"):
        appid_pair = pair
        break
appid_value = None
if appid_pair:
    appid_value = appid_pair.split(":", 1)[1]
print(appid_value)
Breaking this down into a simple next statement:
aws_pairs = data["Amazon Web Services"]
appid_value = next(
    (
        pair.split(":", 1)[1]
        for pair in aws_pairs
        if pair.startswith("appid:")
    ),
    None
)
print(appid_value)

It's not really a JSON thing, you have a dictionary of lists so extract the relevant list then search it for the item you're looking for:
x = {
"Datadog":[
"host:i-068fee2324438213477be9a4"
],
"Amazon Web Services":[
"availability-zone:us-east-1a",
"aws:cloudformation:logical-id:ec2instance01",
"aws:cloudformation:stack-id:arn:aws:cloudformation:us-east-1:353245",
"appid:42928482474dh28424a",
"name:devinstance",
"region:us-east-1",
"security-group:sg-022442414d8a",
"security-group:sg-0691af18875ad9d0b",
"security-group:sg-022442414d8a",
"security-group:sg-022442414d8a"
]
}
aws = x["Amazon Web Services"]
for string in aws:
    name, value = string.split(":", 1)
    if name == "appid":
        print(value)
Gives:
42928482474dh28424a
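If more than one tag is needed, it may be simpler to parse the list once into a mapping from tag name to values. The following is a minimal sketch under that assumption; parse_tags is a hypothetical helper name, not part of any library, and values are kept in lists because a name such as security-group can repeat.

```python
from collections import defaultdict


def parse_tags(tags):
    """Group "name:value" strings by name (hypothetical helper)."""
    parsed = defaultdict(list)
    for tag in tags:
        # Split on the first colon only, so values may contain colons.
        name, _, value = tag.partition(":")
        parsed[name].append(value)
    return dict(parsed)


# A shortened copy of the "Amazon Web Services" list from the question.
aws_tags = [
    "availability-zone:us-east-1a",
    "appid:42928482474dh28424a",
    "security-group:sg-022442414d8a",
    "security-group:sg-0691af18875ad9d0b",
]
tags_by_name = parse_tags(aws_tags)
print(tags_by_name["appid"][0])
print(tags_by_name["security-group"])
```

Note that a tag such as "aws:cloudformation:logical-id:ec2instance01" ends up under the name "aws" with this approach, because only the first colon is treated as the separator.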

The most efficient approach I can think of is to check if each tag (under the "Amazon Web Services" key) starts with a specified prefix, or tag name in this case.
Note that you can also use str.startswith; here I compare a slice of the string against the prefix, which has the same effect.
data = {
"Datadog": [
"host:i-068fee2324438213477be9a4"
],
"Amazon Web Services": [
"availability-zone:us-east-1a",
"aws:cloudformation:logical-id:ec2instance01",
"aws:cloudformation:stack-id:arn:aws:cloudformation:us-east-1:353245",
"appid:42928482474dh28424a",
"name:devinstance",
"region:us-east-1",
"security-group:sg-022442414d8a",
"security-group:sg-0691af18875ad9d0b",
"security-group:sg-022442414d8a",
"security-group:sg-022442414d8a"
]
}['Amazon Web Services']
target_tag = 'appid:'
len_tag_name = len(target_tag)
for tag in data:
    if tag[:len_tag_name] == target_tag:
        app_id = tag[len_tag_name:]
        break
else:  # no `break` statement encountered, hence app_id not found
    app_id = None
assert app_id == '42928482474dh28424a'  # True
And finally, here is a one-liner version of the above, using next to take the first match from a generator expression. Use it only if you know for sure that an appid tag exists: without a default argument, next raises StopIteration when nothing matches.
app_id = next(tag[len_tag_name:] for tag in data if tag[:len_tag_name] == target_tag)

Related

python trading bot( paper trading)

So I am writing a trading bot in Python. It is more for fun and I just started. Every method works on its own, so I left them out here rather than give you 300 lines of code. I also left out the whole analyze method, since the same error appears even if I clear the rest of the method. When I call analyze just once, it doesn't do anything but also raises no error; when I call it in a loop I get an error:
Exception has occurred: KeyError
'result' (mark get_crypto_data)
This doesn't make sense to me, since if I print get_crypto_data it works just fine.
import time
import krakenex
def get_crypto_data(pair, since):
    return api.query_public("OHLC", data={"pair": pair, "since": since})["result"][pair]  # array of prices (60 sec)
def analyze(pair, since):
    data = get_crypto_data(pair[0] + pair[1], since)
if __name__ == "__main__":
    api = krakenex.API()
    api.load_key("KrakenKey.txt")
    pair = ('XETH', 'ZEUR')  # Currency pair
    since = str(int(time.time() - 3600))
    while True:
        analyze(pair, since)
The data structure received from the API looks like this (example, without indents):
{
"error": [ ],
"result": {
"XXBTZUSD": [
[
1616662740,
"52591.9",
"52599.9",
"52591.8",
"52599.9",
"52599.1",
"0.11091626",
5
],
[
1616662800,
"52600.0",
"52674.9",
"52599.9",
"52665.2",
"52643.3",
"2.49035996",
30
],
[
1616662860,
"52677.7",
"52686.4",
"52602.1",
"52609.5",
"52634.5",
"1.25810675",
20
],
[
1616662920,
"52603.9",
"52627.5",
"52601.2",
"52616.4",
"52614.0",
"3.42391799",
23
],
[
1616662980,
"52601.2",
"52601.2",
"52599.9",
"52599.9",
"52599.9",
"0.43748934",
7
]
],
"last": 1616662920
}
}
Context
A KeyError in Python is raised when you look up a key that doesn't exist in a dictionary. For example, if you make a request:
response = requests.get(url).json()
response['nonexistent']
# KeyError raised as nonexistent doesn't exist in the object
With that in mind, when a KeyError is raised by the API call that returns this object:
api.query_public("OHLC", data = {"pair" : pair, "since" : since})
we can infer that, for whatever reason, "result" is not a key in the returned object. To debug the issue, follow the steps below.
Debugging
Make the API call and save the response to a variable. Then print the variable along with its type to understand how you can interact with it.
response = api.query_public("OHLC", data = {"pair" : pair, "since" : since})
print(response, type(response))
If it's in a String format (or another standard convertible format), you can use the inbuilt json library to convert it to a dictionary object you can call as you did in your example.
import json
response = api.query_public("OHLC", data = {"pair" : pair, "since" : since})
response = json.loads(response)
Otherwise, if the response is already a dictionary, no conversion is needed. (Calling str() on a dictionary produces a Python repr, not valid JSON, so json.loads would fail on it.) Given the structure of the output you displayed, check the "error" list before reading "result":
# Get the response
response = api.query_public("OHLC", data = {"pair" : pair, "since" : since})
# The sample output has an "error" list next to "result"; inspect it first
if response.get("error"):
    print("API error:", response["error"])
else:
    # Call the data you want
    data = response["result"]["XXBTZUSD"]
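One possible explanation for an error that appears only in a loop is that some calls come back with a populated "error" list and no "result" key (for instance, if the API limits request frequency; that is an assumption, not something the question confirms). A minimal sketch of a lookup that reports the API's own error list instead of a bare KeyError follows. extract_ohlc is a hypothetical helper, and the two sample responses are made-up stand-ins rather than real API output.

```python
def extract_ohlc(response, pair):
    """Return the OHLC rows for pair, or raise with the API's own error list."""
    # Assumption: failed calls carry messages in "error" and may omit "result".
    if response.get("error") or "result" not in response:
        raise RuntimeError(f"API returned no result: {response.get('error')}")
    return response["result"][pair]


# Stand-ins for two possible responses (made-up values).
ok_response = {
    "error": [],
    "result": {"XETHZEUR": [[1616662740, "1.0"]], "last": 1616662740},
}
failed_response = {"error": ["example error message"]}

print(extract_ohlc(ok_response, "XETHZEUR"))
try:
    extract_ohlc(failed_response, "XETHZEUR")
except RuntimeError as exc:
    print(exc)
```

Inside the while loop, catching this exception and sleeping before the next call would keep the bot running while making the underlying message visible.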

s3 policy update through lambda/boto3 - list index out of range error

I have a requirement to update S3 bucket policy (Get the current resource ARNs and append new Resource ARN). Here is the snippet of code:
import os
import boto3
import pprint
import json
import sys
s3 = boto3.resource('s3')
buckets = ["test090909"]
for s3b in buckets:
    print("processing bucket" + s3b)
    bucket = s3.Bucket(s3b)
    policy = bucket.Policy()
    p = json.loads(policy.policy)
    print(p)  # Good until here
    stmt = p["Statement"][1]
    print(stmt)
The output of p is shown below, and everything is fine up to that point. To get the Resource section, I thought it should be stmt = p["Statement"][1], since this is a dict and the list index is 1, but I get IndexError: list index out of range. If I use stmt = p["Statement"][0], it returns everything. I believe I am doing something wrong with the string/JSON items.
{
u'Version': u'2012-10-17',
u'Id': u'Policy1544682557303',
u'Statement': [
{
u'Action': u's3:DeleteBucket',
u'Principal': {
u'Service': u'config.amazonaws.com'
},
u'Resource': [
u'arn:aws:s3:::test090909',
u'arn:aws:s3:::test090909/AWSLogs/111111111111/Config/*'
],
u'Effect': u'Allow',
u'Sid': u'Stmt1544682555302'
}
]
}
it should be "stmt = p["Statement"][1]" as this is dict and list index
is 1 but i am getting an error " IndexError: list index out of range"
but if i do "stmt = p["Statement"][0]" it returning everything
This is not correct. Given the JSON output, p["Statement"][1] doesn't exist, hence the out of range error. p["Statement"] contains only one item. Using your words, p["Statement"][0] returns "everything" because it actually contains everything: it is a dictionary, and one of its values (under the Resource key) is a list of ARN resources.
There you go:
>>> print(p["Statement"][0])
{ u'Action': u's3:DeleteBucket', u'Principal': { u'Service': u'config.amazonaws.com' }, u'Resource': [ u'arn:aws:s3:::test090909', u'arn:aws:s3:::test090909/AWSLogs/111111111111/Config/*' ], u'Effect': u'Allow', u'Sid': u'Stmt1544682555302' }
>>> print(p["Statement"][0]["Resource"])
[ u'arn:aws:s3:::test090909', u'arn:aws:s3:::test090909/AWSLogs/111111111111/Config/*' ]
Then if you want to access one specific resource:
>>> print(p["Statement"][0]["Resource"][0])
arn:aws:s3:::test090909
>>> print(p["Statement"][0]["Resource"][1])
arn:aws:s3:::test090909/AWSLogs/111111111111/Config/*
Happy coding!
Python indexing begins at 0, so p["Statement"][0] will return the 1st item in that list.
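Since the original goal was to append a new resource ARN, here is a minimal sketch of that step on a plain dictionary shaped like the policy in the question. add_resource is a hypothetical helper, and the new ARN is a made-up example value.

```python
import json

# Shortened policy document with the same shape as the one in the question.
policy_doc = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": "s3:DeleteBucket",
            "Resource": ["arn:aws:s3:::test090909"],
        }
    ],
}


def add_resource(policy, arn, statement_index=0):
    """Append arn to one statement's Resource list (hypothetical helper)."""
    stmt = policy["Statement"][statement_index]
    resources = stmt["Resource"]
    # A policy may store a single resource as a plain string; normalise to a list.
    if isinstance(resources, str):
        resources = [resources]
    if arn not in resources:
        resources.append(arn)
    stmt["Resource"] = resources
    return policy


add_resource(policy_doc, "arn:aws:s3:::test090909/new-prefix/*")
new_policy_json = json.dumps(policy_doc)
print(policy_doc["Statement"][0]["Resource"])
```

The resulting JSON string is what would be written back to the bucket; with the boto3 resource used in the question, that should be the policy object's put method with a Policy argument, but check the boto3 documentation before relying on it.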

How to get all documents under an elasticsearch index with python client ?

I'm trying to get all the documents in an index using the Python client, but the result shows me only the first document.
This is my Python code:
res = es.search(index="92c603b3-8173-4d7a-9aca-f8c115ff5a18", doc_type="doc", body = {
    'size' : 10000,
    'query': {
        'match_all' : {}
    }
})
print("%d documents found" % res['hits']['total'])
data = [doc for doc in res['hits']['hits']]
for doc in data:
    print(doc)
    return "%s %s %s" % (doc['_id'], doc['_source']['0'], doc['_source']['5'])
Try "_doc" instead of "doc":
res = es.search(index="92c603b3-8173-4d7a-9aca-f8c115ff5a18", doc_type="_doc", body = {
    'size' : 100,
    'query': {
        'match_all' : {}
    }
})
By default, Elasticsearch retrieves only 10 documents. You can change this behaviour with the size parameter. The best practices for pagination are the search_after query and the scroll query; which one fits depends on your needs. Please read this answer: "Elastic search not giving data with big number for page size".
To show all the results:
for doc in res['hits']['hits']:
    print(doc['_id'], doc['_source'])
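If the return statement in the question sits inside the for loop (the original indentation is not certain), the function ends during the first iteration, which would explain seeing only the first document regardless of the size setting. The sketch below contrasts the two behaviours on a made-up stand-in for res['hits']['hits']; first_only and all_rows are hypothetical names.

```python
# Stand-in for res['hits']['hits'] (made-up documents, not real search output).
hits = [
    {"_id": "a1", "_source": {"0": "x", "5": "y"}},
    {"_id": "b2", "_source": {"0": "p", "5": "q"}},
]


def first_only(hits):
    for doc in hits:
        # Returning here ends the function during the first iteration.
        return "%s %s %s" % (doc["_id"], doc["_source"]["0"], doc["_source"]["5"])


def all_rows(hits):
    rows = []
    for doc in hits:
        rows.append("%s %s %s" % (doc["_id"], doc["_source"]["0"], doc["_source"]["5"]))
    # Return once, after every document has been formatted.
    return "\n".join(rows)


print(first_only(hits))
print(all_rows(hits))
```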
You can try the following query. It will return all the documents.
result = es.search(index="index_name", body={"query":{"match_all":{}}})
You can also use elasticsearch_dsl and its Search API which allows you to iterate over all your documents via the scan method.
import elasticsearch
from elasticsearch_dsl import Search
client = elasticsearch.Elasticsearch()
search = Search(using=client, index="92c603b3-8173-4d7a-9aca-f8c115ff5a18")
for hit in search.scan():
    print(hit)
I don't see it mentioned that the index must be refreshed if you just added data. Use this:
es.indices.refresh(index="index_name")

how can i take a specific element from this list in python?

I'm working with the Microsoft Azure face API and I want to get only the glasses response.
Here's my code:
########### Python 3.6 #############
import http.client, urllib.request, urllib.parse, urllib.error, base64, requests, json
###############################################
#### Update or verify the following values. ###
###############################################
# Replace the subscription_key string value with your valid subscription key.
subscription_key = '(MY SUBSCRIPTION KEY)'
# Replace or verify the region.
#
# You must use the same region in your REST API call as you used to obtain your subscription keys.
# For example, if you obtained your subscription keys from the westus region, replace
# "westcentralus" in the URI below with "westus".
#
# NOTE: Free trial subscription keys are generated in the westcentralus region, so if you are using
# a free trial subscription key, you should not need to change this region.
uri_base = 'https://westcentralus.api.cognitive.microsoft.com'
# Request headers.
headers = {
'Content-Type': 'application/json',
'Ocp-Apim-Subscription-Key': subscription_key,
}
# Request parameters.
params = {
'returnFaceAttributes': 'glasses',
}
# Body. The URL of a JPEG image to analyze.
body = {'url': 'https://upload.wikimedia.org/wikipedia/commons/c/c3/RH_Louise_Lillian_Gish.jpg'}
try:
    # Execute the REST API call and get the response.
    response = requests.request('POST', uri_base + '/face/v1.0/detect', json=body, data=None, headers=headers, params=params)
    print('Response:')
    parsed = json.loads(response.text)
    info = (json.dumps(parsed, sort_keys=True, indent=2))
    print(info)
except Exception as e:
    print('Error:')
    print(e)
and it returns a list like this:
[
{
"faceAttributes": {
"glasses": "NoGlasses"
},
"faceId": "0f0a985e-8998-4c01-93b6-8ef4bb565cf6",
"faceRectangle": {
"height": 162,
"left": 177,
"top": 131,
"width": 162
}
}
]
I want just the glasses attribute so it would just return either "Glasses" or "NoGlasses"
Thanks for any help in advance!
I think you're printing the whole response, when really you want to drill down and get the elements inside it. In your code info is the string returned by json.dumps, so index the parsed list instead. Try this:
print(parsed[0]["faceAttributes"]["glasses"])
I'm not sure how the API works so I don't know what your specified params are actually doing, but this should work on this end.
EDIT: Thank you to @Nuageux for noting that this is indeed an array, and you will have to specify that the first object is the one you want.
I guess you can get several elements in that list, so you could do this:
info = [
{
"faceAttributes": {
"glasses": "NoGlasses"
},
"faceId": "0f0a985e-8998-4c01-93b6-8ef4bb565cf6",
"faceRectangle": {
"height": 162,
"left": 177,
"top": 131,
"width": 162
}
}
]
for item in info:
    print(item["faceAttributes"]["glasses"])
# prints NoGlasses
Did you try:
glasses = parsed[0]['faceAttributes']['glasses']
This looks more like a dictionary than a list. Dictionaries are defined using the { key: value } syntax, and their values are looked up by key. In your code, faceAttributes is a key whose value is another dictionary, and that dictionary has a key glasses holding the value you want.
Your info object is a list with one element: a dictionary. (Strictly, in the code you posted the list is parsed; info is the string json.dumps made from it, so index parsed, or read info below as that list.) So in order to get at the values in that dictionary, you'll need to tell the list where the dictionary is (at the head of the list, so info[0]).
So your reference syntax will be:
#If you want to store it in a variable, like glass_var
glass_var = info[0]["faceAttributes"]["glasses"]
#Or if you want to print it directly
print(info[0]["faceAttributes"]["glasses"])
What's going on here? info[0] is the dictionary containing several keys, including faceAttributes,faceId and faceRectangle. faceRectangle and faceAttributes are both dictionaries in themselves with more keys, which you can reference to get their values.
Your printed tree there is showing all the keys and values of your dictionary, so you can reference any part of it using the right keys:
print(info[0]["faceId"]) #prints "0f0a985e-8998-4c01-93b6-8ef4bb565cf6"
print(info[0]["faceRectangle"]["left"]) #prints 177
print(info[0]["faceRectangle"]["width"]) #prints 162
If you have multiple entries in your info list, then you'll have multiple dictionaries, and you can get all the outputs as so:
for entry in info:  # Note: "entry" is just a variable name,
                    # this can be any name you want. Every
                    # iteration of entry is one of the
                    # dictionaries in info.
    print(entry["faceAttributes"]["glasses"])
Edit: I didn't see that info was a list of a dictionary, adapted for that fact.
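Putting the answers above together, a minimal sketch of the whole extraction, starting from the raw response text, might look like this. glasses_values is a hypothetical helper name, and response_text is a shortened copy of the output shown in the question rather than a live API response.

```python
import json

# Shortened copy of the response text shown in the question.
response_text = (
    '[{"faceAttributes": {"glasses": "NoGlasses"}, '
    '"faceId": "0f0a985e-8998-4c01-93b6-8ef4bb565cf6"}]'
)


def glasses_values(text):
    """Return the glasses attribute of every face in the response text."""
    faces = json.loads(text)  # a list of dictionaries, one per detected face
    return [face["faceAttributes"]["glasses"] for face in faces]


print(glasses_values(response_text))
```

Returning a list keeps the helper usable when the image contains several faces, or none.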

Example of update_item in dynamodb boto3

Following the documentation, I'm trying to create an update statement that will update or add if not exists only one attribute in a dynamodb table.
I'm trying this
response = table.update_item(
    Key={'ReleaseNumber': '1.0.179'},
    UpdateExpression='SET',
    ConditionExpression='Attr(\'ReleaseNumber\').eq(\'1.0.179\')',
    ExpressionAttributeNames={'attr1': 'val1'},
    ExpressionAttributeValues={'val1': 'false'}
)
The error I'm getting is:
botocore.exceptions.ClientError: An error occurred (ValidationException) when calling the UpdateItem operation: ExpressionAttributeNames contains invalid key: Syntax error; key: "attr1"
If anyone has done anything similar to what I'm trying to achieve please share example.
I found a working example. It is very important to list all the key attributes of the table (partition key and sort key) in Key; this requires an additional query before the update, but it works.
response = table.update_item(
    Key={
        'ReleaseNumber': releaseNumber,
        'Timestamp': result[0]['Timestamp']
    },
    UpdateExpression="set Sanity = :r",
    ExpressionAttributeValues={
        ':r': 'false',
    },
    ReturnValues="UPDATED_NEW"
)
Details on dynamodb updates using boto3 seem incredibly sparse online, so I'm hoping these alternative solutions are useful.
get / put
import boto3
table = boto3.resource('dynamodb').Table('my_table')
# get item
response = table.get_item(Key={'pkey': 'asdf12345'})
item = response['Item']
# update
item['status'] = 'complete'
# put (idempotent)
table.put_item(Item=item)
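actual update
import boto3
table = boto3.resource('dynamodb').Table('my_table')
table.update_item(
    Key={'pkey': 'asdf12345'},
    # AttributeUpdates is a legacy parameter; each entry needs a Value and an Action
    AttributeUpdates={
        'status': {'Value': 'complete', 'Action': 'PUT'},
    },
)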
If you don't want to build the update parameter by parameter, I wrote a function that returns the arguments needed to call update_item with boto3.
def get_update_params(body):
    """Given a dictionary we generate an update expression and a dict of values
    to update a dynamodb table.
    Params:
        body (dict): Parameters to use for formatting.
    Returns:
        update expression, dict of values.
    """
    update_expression = ["set "]
    update_values = dict()
    for key, val in body.items():
        update_expression.append(f" {key} = :{key},")
        update_values[f":{key}"] = val
    # Drop the trailing comma left by the last appended item
    return "".join(update_expression)[:-1], update_values
Here is a quick example:
def update(body):
    a, v = get_update_params(body)
    response = table.update_item(
        Key={'uuid':str(uuid)},
        UpdateExpression=a,
        ExpressionAttributeValues=dict(v)
    )
    return response
The original code example:
response = table.update_item(
    Key={'ReleaseNumber': '1.0.179'},
    UpdateExpression='SET',
    ConditionExpression='Attr(\'ReleaseNumber\').eq(\'1.0.179\')',
    ExpressionAttributeNames={'attr1': 'val1'},
    ExpressionAttributeValues={'val1': 'false'}
)
Fixed:
from boto3.dynamodb.conditions import Attr
response = table.update_item(
    Key={'ReleaseNumber': '1.0.179'},
    UpdateExpression='SET #attr1 = :val1',
    ConditionExpression=Attr('ReleaseNumber').eq('1.0.179'),
    ExpressionAttributeNames={'#attr1': 'val1'},
    ExpressionAttributeValues={':val1': 'false'}
)
In the marked answer it was also revealed that there is a range key, so that should also be included in the Key. The update_item method must seek to the exact record to be updated; there are no batch updates, and you can't update a range of values filtered by a condition to get to a single record. The ConditionExpression is there to make updates idempotent, i.e. don't update the value if it is already that value. It's not like a SQL WHERE clause.
Regarding the specific error seen:
ExpressionAttributeNames is a mapping of placeholders to attribute names for use in the UpdateExpression, useful if the attribute name is a reserved word.
From the docs, "An expression attribute name must begin with a #, and be followed by one or more alphanumeric characters". The error occurs because the code uses an ExpressionAttributeNames key that doesn't start with a # and also doesn't use it in the UpdateExpression.
ExpressionAttributeValues are placeholders for the values you want to update to, and they must start with :
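The placeholder rules described above can be checked locally before calling DynamoDB. The following is a rough sketch, not part of boto3: check_placeholders is a hypothetical helper that only tests the # and : prefixes and whether each placeholder appears somewhere in the expression (a plain substring test, so it can miss some mistakes).

```python
def check_placeholders(update_expression, names, values):
    """Return a list of problems found in the placeholder mappings."""
    problems = []
    for key in names:
        if not key.startswith("#"):
            problems.append(f"name placeholder {key!r} must start with '#'")
        elif key not in update_expression:
            problems.append(f"name placeholder {key!r} is not used in the expression")
    for key in values:
        if not key.startswith(":"):
            problems.append(f"value placeholder {key!r} must start with ':'")
        elif key not in update_expression:
            problems.append(f"value placeholder {key!r} is not used in the expression")
    return problems


# Arguments from the original (failing) call in the question.
print(check_placeholders("SET", {"attr1": "val1"}, {"val1": "false"}))

# Arguments from the fixed call.
print(check_placeholders("SET #attr1 = :val1", {"#attr1": "val1"}, {":val1": "false"}))
```

The first call reports one problem per mapping, while the second returns an empty list.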
Based on the official example, here's a simple and complete solution which could be used to manually update (not something I would recommend) a table used by a terraform S3 backend.
Let's say this is the table data as shown by the AWS CLI:
$ aws dynamodb scan --table-name terraform_lock --region us-east-1
{
"Items": [
{
"Digest": {
"S": "2f58b12ae16dfb5b037560a217ebd752"
},
"LockID": {
"S": "tf-aws.tfstate-md5"
}
}
],
"Count": 1,
"ScannedCount": 1,
"ConsumedCapacity": null
}
You could update it to a new digest (say you rolled back the state) as follows:
import boto3
dynamodb = boto3.resource('dynamodb', 'us-east-1')
try:
    table = dynamodb.Table('terraform_lock')
    response = table.update_item(
        Key={
            "LockID": "tf-aws.tfstate-md5"
        },
        UpdateExpression="set Digest=:newDigest",
        ExpressionAttributeValues={
            ":newDigest": "50a488ee9bac09a50340c02b33beb24b"
        },
        ReturnValues="UPDATED_NEW"
    )
except Exception as msg:
    print(f"Oops, could not update: {msg}")
Note the : at the start of ":newDigest": "50a488ee9bac09a50340c02b33beb24b"; it is easy to miss or forget.
Small update of Jam M. Hernandez Quiceno's answer, which includes ExpressionAttributeNames to prevent encountering errors such as:
"errorMessage": "An error occurred (ValidationException) when calling the UpdateItem operation:
Invalid UpdateExpression: Attribute name is a reserved keyword; reserved keyword: timestamp",
def get_update_params(body):
    """
    Given a dictionary of key-value pairs to update an item with in DynamoDB,
    generate three objects to be passed to UpdateExpression, ExpressionAttributeValues,
    and ExpressionAttributeNames respectively.
    """
    update_expression = []
    attribute_values = dict()
    attribute_names = dict()
    for key, val in body.items():
        update_expression.append(f" #{key.lower()} = :{key.lower()}")
        attribute_values[f":{key.lower()}"] = val
        attribute_names[f"#{key.lower()}"] = key
    return "set " + ", ".join(update_expression), attribute_values, attribute_names
Example use:
update_expression, attribute_values, attribute_names = get_update_params(
    {"Status": "declined", "DeclinedBy": "username"}
)
response = table.update_item(
    Key={"uuid": "12345"},
    UpdateExpression=update_expression,
    ExpressionAttributeValues=attribute_values,
    ExpressionAttributeNames=attribute_names,
    ReturnValues="UPDATED_NEW"
)
print(response)
An example that updates any number of attributes given as a dict and keeps track of the number of updates. It works with reserved words (e.g. name).
The following attribute names shouldn't be used as we will overwrite the value: _inc, _start.
from typing import Dict
from boto3 import Session
def getDynamoDBSession(region: str = "eu-west-1"):
    """Connect to DynamoDB resource from boto3."""
    return Session().resource("dynamodb", region_name=region)
DYNAMODB = getDynamoDBSession()
def updateItemAndCounter(db_table: str, item_key: Dict, attributes: Dict) -> Dict:
    """
    Update item or create new. If the item already exists, return the previous value and
    increase the counter: update_counter.
    """
    table = DYNAMODB.Table(db_table)
    # Init update-expression
    update_expression = "SET"
    # Build expression-attribute-names, expression-attribute-values, and the update-expression
    expression_attribute_names = {}
    expression_attribute_values = {}
    for key, value in attributes.items():
        update_expression += f' #{key} = :{key},'  # Notice the "#" to solve issue with reserved keywords
        expression_attribute_names[f'#{key}'] = key
        expression_attribute_values[f':{key}'] = value
    # Add counter start and increment attributes
    expression_attribute_values[':_start'] = 0
    expression_attribute_values[':_inc'] = 1
    # Finish update-expression with our counter
    update_expression += " update_counter = if_not_exists(update_counter, :_start) + :_inc"
    return table.update_item(
        Key=item_key,
        UpdateExpression=update_expression,
        ExpressionAttributeNames=expression_attribute_names,
        ExpressionAttributeValues=expression_attribute_values,
        ReturnValues="ALL_OLD"
    )
Hope it might be useful to someone!
In its simplest form, you can use the code below to update an item's value with a new one:
response = table.update_item(
    Key={"my_id_name": "my_id_value"},  # to get record
    UpdateExpression="set item_key_name=:value",  # Operation action (set)
    ExpressionAttributeValues={":value": "new_value"},  # item that you need to update
    ReturnValues="UPDATED_NEW"  # optional for declarative message
)
Simple example with multiple fields:
import boto3
dynamodb_client = boto3.client('dynamodb')
dynamodb_client.update_item(
    TableName=table_name,
    Key={
        'PK1': {'S': 'PRIMARY_KEY_VALUE'},
        'SK1': {'S': 'SECONDARY_KEY_VALUE'}
    },
    UpdateExpression='SET #field1 = :field1, #field2 = :field2',
    ExpressionAttributeNames={
        '#field1': 'FIELD_1_NAME',
        '#field2': 'FIELD_2_NAME',
    },
    ExpressionAttributeValues={
        ':field1': {'S': 'FIELD_1_VALUE'},
        ':field2': {'S': 'FIELD_2_VALUE'},
    }
)
Using the previous answer from eltbus worked for me, except for a minor bug: you have to delete the extra comma using update_expression[:-1].
