How to retrieve data from this list - python

This list how to retrieve only problems by using python.
[{"ordination": [{"condition": "system_drive_free_space < 10000","match": true,"problems": [{"id": "disk_cleanup","point": "/remote_action/disk_cleanup/hgk5255sfghjkd516465s"
}]},{"condition": "total_drive_free_space < 20000","match": true,"problems": [{ "id": "disk_cleanup","point": "/remote_action/disk_cleanup/h41525c274558hgfdbd3b"}]}]},{
"ordination": [{"condition": "\"action:Get Startup Impact/HighImpactCount\" > 0","match": true}]},{"ordination": [{"condition": "average_network_response_time > 30000",
"match": true},{"condition": "network_availability_level != high","match": true}] }]
For example the output like this
"problems": [{"id": "disk_cleanup","point": "/remote_action/disk_cleanup/hgk5255sfghjkd516465s" }]

you can try something
items = [{"ordination": [{"condition": "system_drive_free_space < 10000", "match": True, "problems": [
{"id": "disk_cleanup", "point": "/remote_action/disk_cleanup/hgk5255sfghjkd516465s"}]},
{"condition": "total_drive_free_space < 20000", "match": True, "problems": [
{"id": "disk_cleanup", "point": "/remote_action/disk_cleanup/h41525c274558hgfdbd3b"}]}]},
{"ordination": [{"condition": "action: Get Startup Impact / HighImpactCount> 0", "match": True}]},
{"ordination": [{"condition": "average_network_response_time > 30000", "match": True},
{"condition": "network_availability_level != high", "match": True}]}]
for item in items:
for ordination in item['ordination']:
print(ordination.get('problems'))
It prints None for ordinations without problems.

If that array is stored in a variable called list, (e.g. list = [{ordination...) then you can use this code to retrieve the value of problems:
problems = list[0].get("ordination")[0].get("problems")
print(problems)
This should print out the list which is stored the problems section.

if your data is in a variable named data, then try this:
for item in data:
for inside_item in item.get("ordination", []):
for inside_value in inside_item.get("problems", []):
print(inside_value[0]["id"])

Related

retrieving data from json with python

I have a nested JSON data. I want to get the value of key "name" inside the dictionary "value" based on the key "id" in "key" dictionary (let the user enter the id). I don't want to use indexing which, because places are changing on every url differently. Also data is large, so I need one row solution (without for loop).
Code
import requests, re, json
r = requests.get('https://www.trendyol.com/apple/macbook-air-13-m1-8gb-256gb-ssd-altin-p-67940132').text
json_data1 = json.loads(re.search(r"window.__PRODUCT_DETAIL_APP_INITIAL_STATE__=({.*}});window", r).group(1))
print(json_data1)
print('json_data1:',json_data1['product']['attributes'][0]['value']['name'])
Output
{'product': {'attributes': [{'key': {'name': 'İşlemci Tipi', 'id': 168}, 'value': {'name': 'Apple M1', 'id': 243383}, 'starred': True, 'description': '', 'mediaUrls': []}, {'key': {'name': 'SSD Kapasitesi', 'id': 249}..........
json_data1: Apple M1
JSON Data
{
"product": {
"attributes": [
{
"key": { "name": "İşlemci Tipi", "id": 168 },
"value": { "name": "Apple M1", "id": 243383 },
"starred": true,
"description": "",
"mediaUrls": []
},
{
"key": { "name": "SSD Kapasitesi", "id": 249 },
"value": { "name": "256 GB", "id": 3376 },
"starred": true,
"description": "",
"mediaUrls": []
},
.
.
.
]
}
}
Expected Output is getting value by key id: (type must be str)
input >> id: 168
output >> name: Apple M1
Since you originally didn't want a for loop, but now it's a matter of speed,
Here's a solution with for loop, you can test it and see if it's faster than the one you already had
import json
with open("file.json") as f:
data = json.load(f)
search_key = int(input("Enter id: "))
for i in range(0, len(data['product']['attributes'])):
if search_key == data['product']['attributes'][i]['key']['id']:
print(data['product']['attributes'][i]['value']['name'])
Input >> Enter id: 168
Output >> Apple M1
I found the solution with for loop. It works fast so I preferred it.
for i in json_data1['product']['attributes']:
cpu = list(list(i.values())[0].values())[1]
if cpu == 168:
print(list(list(i.values())[1].values())[0])
Iteration is unavoidable if the index is unknown, but the cost can be reduced substantially by using a generator expression and Python's built-in next function:
next((x["value"]["name"] for x in data["product"]["attributes"] if x["key"]["id"] == 168), None)
To verify that a generator expression is in fact faster than a for loop, here is a comparison of the running time of xFranko's solution and the above:
import time
def time_func(func):
def timer(*args):
time1 = time.perf_counter()
func(*args)
time2 = time.perf_counter()
return (time2 - time1) * 1000
return timer
number_of_attributes = 100000
data = {
"product": {
"attributes": [
{
"key": { "name": "İşlemci Tipi", "id": i },
"value": { "name": "name" + str(i), "id": 243383 },
"starred": True,
"description": "",
"mediaUrls": []
} for i in range(number_of_attributes)
]
}
}
def getName_generator(id):
return next((x["value"]["name"] for x in data["product"]["attributes"] if x["key"]["id"] == id), None)
def getName_for_loop(id):
return_value = None
for i in range(0, len(data['product']['attributes'])):
if id == data['product']['attributes'][i]['key']['id']:
return_value = data['product']['attributes'][i]['value']['name']
return return_value
print("Generator:", time_func(getName_generator)(0))
print("For loop:", time_func(getName_for_loop)(0))
print()
print("Generator:", time_func(getName_generator)(number_of_attributes - 1))
print("For loop:", time_func(getName_for_loop)(number_of_attributes - 1))
My results:
Generator: 0.0075999999999964984
For loop: 43.73920000000003
Generator: 23.633300000000023
For loop: 49.839699999999986
Conclusion:
For large data sets, a generator expression is indeed faster, even if it has to traverse the entire set.

Pull key from json file when values is known (groovy or python)

Is there any way to pull the key from JSON if the only thing I know is the value? (In groovy or python)
An example:
I know the "_number" value and I need a key.
So let's say, known _number is 2 and as an output, I should get dsf34f43f34f34f
{
"id": "8e37ecadf4908f79d58080e6ddbc",
"project": "some_project",
"branch": "master",
"current_revision": "3rtgfgdfg2fdsf",
"revisions": {
"43g5g534534rf34f43f": {
"_number": 3,
"created": "2019-04-16 09:03:07.459000000",
"uploader": {
"_account_id": 4
},
"description": "Rebase"
},
"dsf34f43f34f34f": {
"_number": 2,
"created": "2019-04-02 10:54:14.682000000",
"uploader": {
"_account_id": 2
},
"description": "Rebase"
}
}
}
With Groovy:
def json = new groovy.json.JsonSlurper().parse("x.json" as File)
println(json.revisions.findResult{ it.value._number==2 ? it.key : null })
// => dsf34f43f34f34f
Python 3: (assuming that data is saved in data.json):
import json
with open('data.json') as f:
json_data = json.load(f)
for rev, revdata in json_data['revisions'].items():
if revdata['_number'] == 2:
print(rev)
Prints all revs where _number equals 2.
using dict-comprehension:
print({k for k,v in d['revisions'].items() if v.get('_number') == 2})
OUTPUT:
{'dsf34f43f34f34f'}

Using .values() with list of dictionaries?

I'm comparing json files between two different API endpoints to see which json records need an update, which need a create and what needs a delete. So, by comparing the two json files, I want to end up with three json files, one for each operation.
The json at both endpoints is structured like this (but they use different keys for same sets of values; different problem):
{
"records": [{
"id": "id-value-here",
"c": {
"d": "eee"
},
"f": {
"l": "last",
"f": "first"
},
"g": ["100", "89", "9831", "09112", "800"]
}, {
…
}]
}
So the json is represented as a list of dictionaries (with further nested lists and dictionaries).
If a given json endpoint (j1) id value ("id":) exists in the other endpoint json (j2), then that record should be added to j_update.
So far I have something like this, but I can see that .values() doesn't work because it's trying to operate on the list instead of on all the listed dictionaries(?):
j_update = {r for r in j1['records'] if r['id'] in
j2.values()}
This doesn't return an error, but it creates an empty set using test json files.
Seems like this should be simple, but tripping over the nesting I think of dictionaries in a list representing the json. Do I need to flatten j2, or is there a simpler dictionary method python has to achieve this?
====edit j1 and j2====
have same structure, use different keys; toy data
j1
{
"records": [{
"field_5": 2329309841,
"field_12": {
"email": "cmix#etest.com"
},
"field_20": {
"last": "Mixalona",
"first": "Clara"
},
"field_28": ["9002329309999", "9002329309112"],
"field_44": ["1002329309832"]
}, {
"field_5": 2329309831,
"field_12": {
"email": "mherbitz345#test.com"
},
"field_20": {
"last": "Herbitz",
"first": "Michael"
},
"field_28": ["9002329309831", "9002329309112", "8002329309999"],
"field_44": ["1002329309832"]
}, {
"field_5": 2329309855,
"field_12": {
"email": "nkatamaran#test.com"
},
"field_20": {
"first": "Noriss",
"last": "Katamaran"
},
"field_28": ["9002329309111", "8002329309112"],
"field_44": ["1002329309877"]
}]
}
j2
{
"records": [{
"id": 2329309831,
"email": {
"email": "mherbitz345#test.com"
},
"name_primary": {
"last": "Herbitz",
"first": "Michael"
},
"assign": ["8003329309831", "8007329309789"],
"hr_id": ["1002329309877"]
}, {
"id": 2329309884,
"email": {
"email": "yinleeshu#test.com"
},
"name_primary": {
"last": "Lee Shu",
"first": "Yin"
},
"assign": ["8002329309111", "9003329309831", "9002329309111", "8002329309999", "8002329309112"],
"hr_id": ["1002329309832"]
}, {
"id": 23293098338,
"email": {
"email": "amlouis#test.com"
},
"name_primary": {
"last": "Maxwell Louis",
"first": "Albert"
},
"assign": ["8002329309111", "8007329309789", "9003329309831", "8002329309999", "8002329309112"],
"hr_id": ["1002329309877"]
}]
}
If you read the json it will output a dict. You are looking for a particular key in the list of the values.
if 'records' in j2:
r = j2['records'][0].get('id', []) # defaults if id does not exist
It it prettier to do a recursive search but i dunno how you data is organized to quickly come up with a solution.
To give an idea for recursive search consider this example
def resursiveSearch(dictionary, target):
if target in dictionary:
return dictionary[target]
for key, value in dictionary.items():
if isinstance(value, dict):
target = resursiveSearch(value, target)
if target:
return target
a = {'test' : 'b', 'test1' : dict(x = dict(z = 3), y = 2)}
print(resursiveSearch(a, 'z'))
You tried:
j_update = {r for r in j1['records'] if r['id'] in j2.values()}
Aside from the r['id'/'field_5] problem, you have:
>>> list(j2.values())
[[{'id': 2329309831, ...}, ...]]
The id are buried inside a list and a dict, thus the test r['id'] in j2.values() always return False.
The basic solution is fairly simple.
First, create a set of j2 ids:
>>> present_in_j2 = set(record["id"] for record in j2["records"])
Then, rebuild the json structure of j1 but without the j1 field_5 that are not present in j2:
>>> {"records":[record for record in j1["records"] if record["field_5"] in present_in_j2]}
{'records': [{'field_5': 2329309831, 'field_12': {'email': 'mherbitz345#test.com'}, 'field_20': {'last': 'Herbitz', 'first': 'Michael'}, 'field_28': ['9002329309831', '9002329309112', '8002329309999'], 'field_44': ['1002329309832']}]}
It works, but it's not totally satisfying because of the weird keys of j1. Let's try to convert j1 to a more friendly format:
def map_keys(json_value, conversion_table):
"""Map the keys of a json value
This is a recursive DFS"""
def map_keys_aux(json_value):
"""Capture the conversion table"""
if isinstance(json_value, list):
return [map_keys_aux(v) for v in json_value]
elif isinstance(json_value, dict):
return {conversion_table.get(k, k):map_keys_aux(v) for k,v in json_value.items()}
else:
return json_value
return map_keys_aux(json_value)
The function focuses on dictionary keys: conversion_table.get(k, k) is conversion_table[k] if the key is present in the conversion table, or the key itself otherwise.
>>> j1toj2 = {"field_5":"id", "field_12":"email", "field_20":"name_primary", "field_28":"assign", "field_44":"hr_id"}
>>> mapped_j1 = map_keys(j1, j1toj2)
Now, the code is cleaner and the output may be more useful for a PUT:
>>> d1 = {record["id"]:record for record in mapped_j1["records"]}
>>> present_in_j2 = set(record["id"] for record in j2["records"])
>>> {"records":[record for record in mapped_j1["records"] if record["id"] in present_in_j2]}
{'records': [{'id': 2329309831, 'email': {'email': 'mherbitz345#test.com'}, 'name_primary': {'last': 'Herbitz', 'first': 'Michael'}, 'assign': ['9002329309831', '9002329309112', '8002329309999'], 'hr_id': ['1002329309832']}]}

Python3 remove entire nested dictionary

{'result':{'result':[
{
'Company':{
'PostAddress':None
},
'ExternalPartnerProperties':None,
'Id':123456,
'Level':'Level1',
'Name':'Name1',
'ParentId':456789,
'State':'InTrial',
'TrialExpirationTime':1435431669
},
{
'Company':{
'PostAddress':None
},
'ExternalPartnerProperties':None,
'Id':575155,
'Level':'Level2',
'Name':'Name2',
'ParentId':456789,
'State':'InTrial',
'TrialExpirationTime':1491590226
},
{
'Company':{
'PostAddress':None
},
'ExternalPartnerProperties':None,
'Id':888888,
'Level':'Level2',
'Name':'Name3',
'ParentId':456789,
'State':'InProduction',
'TrialExpirationTime':1493280310
},
My code:
for i in partner_output['result']['result']:
if "InProduction" in i['State']:
del i['Company'], i['ExternalPartnerProperties'], i['Id'], i['Level'], i['Name'], i['ParentId'], i['State'], i['TrialExpirationTime']
If I do this, then I return the following result
{'result': {'result': [{
'Company':{
'PostAddress':None
},
'ExternalPartnerProperties':None,
'Id':123456,
'Level':'Level1',
'Name':'Name1',
'ParentId':456789,
'State':'InTrial',
'TrialExpirationTime':1435431669
},
{
'Company':{
'PostAddress':None
},
'ExternalPartnerProperties':None,
'Id':575155,
'Level':'Level2',
'Name':'Name2',
'ParentId':456789,
'State':'InTrial',
'TrialExpirationTime':1491590226
},
{},
but the total number of items is still 3 ... the 3rd container is just empty, but still a container. How can I delete the 3rd container all together?
I cannot use:
for i in partner_output['result']['result']:
if "InProduction" in i['State']:
del partner_output['result'][i]
because I get the error:
TypeError: unhashable type: 'dict'
So I don't know what to do now :-(
You can use a list comprehension to replace the whole list, keeping the other items:
partner_output['result']['result'] = [
i for i in partner_output['result']['result']
if i['State'] != "InProduction"
]
Note that the test for the filter has been reversed; you want to keep all items that do not have 'State' set to InProduction. Alternatively, keep values where the state is set to InTrial:
partner_output['result']['result'] = [
i for i in partner_output['result']['result']
if i['State'] == "InTrial"
]
Your second attempt failed because you tried to use i, a reference to a dictionary, as a key in the outer partner_output['result'] dictionary. If you wanted to delete something from the partner_output['result']['result'] list, you'd have to use the integer index (del partner_output['result']['result'][2]), but you can't do that in a loop because that has consequences for the for loop progress across the list.

Adding key to values in json using Python

This is the structure of my JSON:
"docs": [
{
"key": [
null,
null,
"some_name",
"12345567",
"test_name"
],
"value": {
"lat": "29.538208354844658",
"long": "71.98762580927113"
}
},
I want to add the keys to the key list. This is what I want the output to look like:
"docs": [
{
"key": [
"key1":null,
"key2":null,
"key3":"some_name",
"key4":"12345567",
"key5":"test_name"
],
"value": {
"lat": "29.538208354844658",
"long": "71.98762580927113"
}
},
What's a good way to do it. I tried this but doesn't work:
for item in data['docs']:
item['test'] = data['docs'][3]['key'][0]
UPDATE 1
Based on the answer below, I have tweaked the code to this:
for number, item in enumerate(data['docs']):
# pprint (item)
# print item['key'][4]
newdict["key1"] = item['key'][0]
newdict["yek1"] = item['key'][1]
newdict["key2"] = item['key'][2]
newdict["yek2"] = item['key'][3]
newdict["key3"] = item['key'][4]
newdict["latitude"] = item['value']['lat']
newdict["longitude"] = item['value']['long']
This creates the JSON I am looking for (and I can eliminate the list I had previously). How does one make this JSON persist outside the for loop? Outside the loop, only the last value from the dictionary is added otherwise.
In your first block, key is a list, but in your second block it's a dict. You need to completely replace the key item.
newdict = {}
for number,item in enumerate(data['docs']['key']):
newdict['key%d' % (number+1)] = item
data['docs']['key'] = newdict

Categories

Resources