Iterating through nested JSON values with Python

I want to be able to iterate through nested values in a JSON file and add them to a list.
For example, I want to find the values contained in each instance of Test below.
A= {"Tags": [
{
"Item":{
"Test":"mouse",
},
},
{
"Item":{
"Test":"dog",
},
},
{
"Item":{
"Test":"cat",
},
},
{
"Item":{
"Test":"dog",
},
}
]
}
I can select values individually like so:
print(A['Tags'][1]['Item']['Test'])
But I can't iterate over the entire JSON file.

This iterates over the whole list and appends each value to a new list.
output = list()
for tag in A['Tags']:
    output.append(tag['Item']['Test'])
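If you're worried about missing values, the following skips any entry where Item or Test is absent. Note that it also skips empty (falsy) values, and that the := operator requires Python 3.8 or later.
output = list()
for tag in A['Tags']:
    if item := tag.get('Item', dict()).get('Test'):
        output.append(item)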
Output:
['mouse', 'dog', 'cat', 'dog']
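The same loop can also be written as a list comprehension. This is only a sketch, assuming the structure from the question; the sample `A` here is shortened, and its last entry deliberately lacks `Test` to show the guard. Unlike the `:=` version, it keeps empty strings and other falsy values.

```python
# Shortened sample data; the last entry deliberately lacks "Test".
A = {"Tags": [
    {"Item": {"Test": "mouse"}},
    {"Item": {"Test": "dog"}},
    {"Item": {}},
]}

# Keep only the entries that actually carry a "Test" key.
output = [
    tag["Item"]["Test"]
    for tag in A["Tags"]
    if "Test" in tag.get("Item", {})
]
print(output)  # ['mouse', 'dog']
```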

I have a simpler method: use the jsonpath package.
JSONPath is worth learning because it is very convenient, but its disadvantage is that it is slow.
from jsonpath import jsonpath
A = {"Tags": [
{
"Item": {
"Test": "mouse",
},
},
{
"Item": {
"Test": "dog",
},
},
{
"Item": {
"Test": "cat",
},
},
{
"Item": {
"Test": "dog",
},
}
]
}
print(jsonpath(A, "$..Test"))
Output:
['mouse', 'dog', 'cat', 'dog']
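If installing a package is not an option, the effect of `$..Test` (every value stored under `Test`, at any depth) can be approximated with a small recursive generator. This is a sketch using only the standard language features; `find_values` is a hypothetical helper name, not part of any library, and the sample data is shortened.

```python
def find_values(node, key):
    """Yield every value stored under `key`, at any nesting depth."""
    if isinstance(node, dict):
        for k, v in node.items():
            if k == key:
                yield v
            # Keep descending: further matches may be nested deeper.
            yield from find_values(v, key)
    elif isinstance(node, list):
        for item in node:
            yield from find_values(item, key)
    # Strings, numbers, etc. contain nothing to search.

A = {"Tags": [
    {"Item": {"Test": "mouse"}},
    {"Item": {"Test": "dog"}},
]}
result = list(find_values(A, "Test"))
print(result)  # ['mouse', 'dog']
```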

for tag in A['Tags']:
    print(tag['Item']['Test'])
A['Tags'] is just a list of dictionaries. Each one is bound to tag in turn, and you can then access any value inside it by key.

Related

Creating a defining dictionary in python (glossary and input)

I already have all the code done; I just need help with how to get all my words. I have built databases before with countries, currencies, leaders of the world, etc., but now I need definitions. Can someone point me in the right direction?
CCCdatbase = [
{
"country": "Afghanistan",
"capital": "Kabul",
"currency": "Afghani",
"language": "Dari Persian; Pashto",
"leader": "Amrullah Saleh",
"k2": ""
},
]
#except like this:
dictionary = [
{
"word": 'dog',
"definition": "worse than cats"
}
]
Why not use the words themselves as keys?
dictionary = [
{
'dog': 'worse than cats',
'cat': 'feline...'
}
]
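With words as keys, looking up a definition is a single dictionary access. A minimal sketch, assuming a plain dict rather than a list wrapping one; `glossary` and `define` are hypothetical names chosen for illustration.

```python
# Hypothetical glossary: a plain dict mapping each word to its definition.
glossary = {
    "dog": "worse than cats",
    "cat": "feline...",
}

def define(word):
    """Return the definition, or a fallback message for unknown words."""
    # Normalise the input so "Dog " and "dog" find the same entry.
    return glossary.get(word.strip().lower(), "no definition found")

print(define("Dog"))   # worse than cats
print(define("bird"))  # no definition found
```

The same function could be fed from `input()` to build the interactive lookup the question describes.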

Create a list of new urls contained in objects in python

I have two json databases. If there is a new value in the "img_url" (one in the last json that isn't in the other), I want to print the url or place it in a variable. The goal is just to find a list of the new values.
Input json:
last_data = [
{
"objectID": 16240,
"results": [
{
"img_url": "https://img.com/1.jpg"
},
{
"img_url": "https://img.com/2.jpg"
},
{
"img_url": "https://img.com/30.jpg"
}
]
},
{
"objectID": 16242,
"results": [
{
"img_url": "https://img.com/1.jpg"
},
{
"img_url": "https://img.com/2.jpg"
},
{
"img_url": "https://img.com/3.jpg"
}
]
},
# ...
#multiple other objectIDs
]
Second input:
second_data =[
{
"objectID": 16240,
"results": [
{
"img_url": "https://img.com/1.jpg"
},
{
"img_url": "https://img.com/2.jpg"
}
]
},
{
"objectID": 16242,
"results": [
{
"img_url": "https://img.com/1.jpg"
},
{
"img_url": "https://img.com/2.jpg"
}
]
}...
#multiple other objectIDs
]
And I want to output only the https://img.com/30.jpg and https://img.com/3.jpg URLs (it can be a list, because I have multiple objects) or place them in a variable.
My code:
#last file
for item_last in last_data:
    results_last = item_last["results"]
    if results_last is not []:
        for result_last in results_last:
            ccv_last = result_last["img_url"]
            #second file
            for item_second in second_data:
                results_second = item_second["results"]
                if results_second is not []:
                    # loop in results
                    for result_second in results_second:
                        ccv_second = result_second["img_url"]
                        if gm_last != gm_second and gm_last is not None:
                            print(gm_last)
If you are trying to find the difference between two lists, here it is.
I have slightly modified your code to get the expected result.
#last file
ccv_last = []
for item_last in last_data:
    results_last = item_last["results"]
    if results_last:
        for result_last in results_last:
            ccv_last.append(result_last["img_url"])
#second file
ccv_second = []
for item_second in second_data:
    results_second = item_second["results"]
    if results_second:
        for result_second in results_second:
            ccv_second.append(result_second["img_url"])
# URLs present in the last file but not in the second one
diff_list = list(set(ccv_last) - set(ccv_second))
Output:
['https://img.com/30.jpg', 'https://img.com/3.jpg']
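Because sets are unordered, the order of the list above is not guaranteed. If the original order matters, one option is to keep a set only for membership checks and build the result with a loop. This is a sketch on shortened sample data, not the answerer's code; `seen` and `new_urls` are names introduced here.

```python
# Shortened versions of the two inputs from the question.
last_data = [
    {"objectID": 16240, "results": [
        {"img_url": "https://img.com/1.jpg"},
        {"img_url": "https://img.com/30.jpg"},
    ]},
    {"objectID": 16242, "results": [
        {"img_url": "https://img.com/1.jpg"},
        {"img_url": "https://img.com/3.jpg"},
    ]},
]
second_data = [
    {"objectID": 16240, "results": [{"img_url": "https://img.com/1.jpg"}]},
    {"objectID": 16242, "results": [{"img_url": "https://img.com/1.jpg"}]},
]

# URLs already known, stored in a set for fast membership checks.
seen = {r["img_url"] for item in second_data for r in item["results"]}

new_urls = []
for item in last_data:
    for r in item["results"]:
        url = r["img_url"]
        if url not in seen:
            new_urls.append(url)
            seen.add(url)  # also drops duplicates within last_data

print(new_urls)  # ['https://img.com/30.jpg', 'https://img.com/3.jpg']
```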
However, you could change your results model slightly to make this simpler and faster, as follows.
If you don't plan to add further keys to the dictionaries in the results list, you probably just want a plain list of URLs, so you can change the list of dicts
from
...
"results": [
{
"img_url": "https://img.com/1.jpg"
},
{
"img_url": "https://img.com/2.jpg"
}
]
...
to just a list of URLs:
...
"img_url_results": ["https://img.com/1.jpg","https://img.com/2.jpg"]
...
With this change you can skip one for loop.
#last file
ccv_last = []
for item_last in last_data:
    if item_last.get('img_url_results'):
        ccv_last.extend(item_last["img_url_results"])

Transforming JSON keys and values

I have a simple JSON object in Python that looks like this:
{
"list": [{
"key1": "value1"
},
{
"key1": "value1"
}
]
}
I want to transform this to the following json. Any suggestions how I can do it with python without installing additional libraries?
{
"list": [{
"keys": {
"name": "key1",
"value": "value1"
}
}, {
"keys": {
"name": "key1",
"value": "value1"
}
}]
}
Not sure from your question if you already have the json read into a variable, or if it is in a file. This is assuming you have it in a variable already:
in_json = {
"list": [{
"key1": "value1"
},
{
"key2": "value2"
}
]
}
out_json = {"list":[]}
for kd in in_json["list"]:
    sub_kd = {"keys": {}}
    # items() is the Python 3 spelling; Python 2 used iteritems()
    for k,v in kd.items():
        sub_kd["keys"]["name"] = k
        sub_kd["keys"]["value"] = v
    out_json["list"].append(sub_kd)
print(out_json)
It just loops through the JSON, building dictionaries to append to the list in out_json. You could pretty-print the result with the json library and also save it to a file with it.
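Pretty-printing and saving can both be done with the standard `json` module. A minimal sketch, assuming `out_json` holds the transformed data; the file path is a throwaway temporary file chosen for illustration.

```python
import json
import os
import tempfile

out_json = {"list": [{"keys": {"name": "key1", "value": "value1"}}]}

# Pretty-print to the console.
print(json.dumps(out_json, indent=4))

# Save to a file; the path is a temporary file used only for this example.
path = os.path.join(tempfile.gettempdir(), "out_example.json")
with open(path, "w") as f:
    json.dump(out_json, f, indent=4)

# Read it back to confirm the round trip.
with open(path) as f:
    reloaded = json.load(f)
```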
You didn't indicate exactly what the JSON data is stored in, so I've put it all in a string in the example code below and used the json.loads() function to turn it into a Python dictionary. If it's in a file, you can use the module's json.load() function instead.
It also assumes that each sub-JSON object in the "list" list consists of only one key/value pair, as shown in your question.
The code below changes the deserialized dictionary in place and pretty-prints the result by using the json.dumps() function to reserialize it.
Note that I changed the keys and values in the sample input JSON slightly to make it easier to see the correspondence between it and the printed results.
import json
json_in = '''
{
"list": [
{
"key1": "value1"
},
{
"key2": "value2"
}
]
}
'''
json_data = json.loads(json_in) # Deserialize.
for i, obj in enumerate(json_data['list']):
    # Assumes each object in list contains only one key, value pair.
    newobj = {'name': next(iter(obj.keys())),
              'value': next(iter(obj.values()))}
    json_data['list'][i] = {'keys': newobj}
print(json.dumps(json_data, indent=4)) # Reserialize and print.
Printed result:
{
"list": [
{
"keys": {
"name": "key1",
"value": "value1"
}
},
{
"keys": {
"name": "key2",
"value": "value2"
}
}
]
}
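Both answers assume one key/value pair per object. If an object could hold several pairs, one possible approach is to emit one `keys` entry per pair. The question does not say what the output should look like in that case, so this shape is an assumption, and the sample input is made up for illustration.

```python
# Made-up input where the first object has two key/value pairs.
in_json = {"list": [{"key1": "value1", "key2": "value2"}, {"key3": "value3"}]}

# One {"keys": {...}} entry per pair, in the order the pairs appear.
out_json = {
    "list": [
        {"keys": {"name": k, "value": v}}
        for obj in in_json["list"]
        for k, v in obj.items()
    ]
}
print(out_json["list"][0])  # {'keys': {'name': 'key1', 'value': 'value1'}}
```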

Filtering out desired data from a JSON file (Python)

This is a sample of my JSON file:
{
"pops": [{
"name": "pop_a",
"subnets": {
"Public": ["1.1.1.0/24,2.2.2.0/24"],
"Private": ["192.168.0.0/24,192.168.1.0/24"],
"more DATA":""
}
},
{
"name": "pop_b",
"subnets": {
"Public": ["3.3.3.0/24,4.4.4.0/24"],
"Private": ["192.168.2.0/24,192.168.3.0/24"],
"more DATA":""
}
}
]
}
After I read it, I want to build a dict object and store some of the things that I need from this file.
I want my object to look like this:
[{
"name": "pop_a",
"subnets": {"Public": ["1.1.1.0/24,2.2.2.0/24"],"Private": ["192.168.0.0/24,192.168.1.0/24"]}
},
{
"name": "pop_b",
"subnets": {"Public": ["3.3.3.0/24,4.4.4.0/24"],"Private": ["192.168.2.0/24,192.168.3.0/24"]}
}]
Then I want to be able to access some of the Public/Private values.
Here is what I tried. I know there are also update() and setdefault(), but they gave the same unwanted results.
def my_funckion():
    nt_json = [{'name':"",'subnets':[]}]
    Pname = []
    Psubnet= []
    for pop in pop_json['pops']: # it print only the last key/value
        nt_json[0]['name']= pop['name']
        nt_json[0]['subnet'] = pop['subnet']
    pprint (nt_json)
    for pop in pop_json['pops']:
        """
        it print the names in a row then all of the ipss
        """
        Pname.append(pop['name'])
        Pgre.append(pop['subnet'])
    nt_json['pop_name'] = Pname
    nt_json['subnet']= Psubnet
    pprint (nt_json)
Here's a quick solution using a list comprehension. Note that this approach works only if you know the JSON structure in advance.
>>> import json
>>>
>>> data = ... # your data
>>> new_data = [{ "name" : x["name"], "subnets" : {"Public" : x["subnets"]["Public"], "Private" : x["subnets"]["Private"]}} for x in data["pops"]]
>>>
>>> print(json.dumps(new_data, indent=2))
[
{
"name": "pop_a",
"subnets": {
"Private": [
"192.168.0.0/24,192.168.1.0/24"
],
"Public": [
"1.1.1.0/24,2.2.2.0/24"
]
}
},
{
"name": "pop_b",
"subnets": {
"Private": [
"192.168.2.0/24,192.168.3.0/24"
],
"Public": [
"3.3.3.0/24,4.4.4.0/24"
]
}
}
]
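To then access individual Public/Private values, one option is to index the result by name. Note that in the sample data each list holds a single comma-joined string, so the sketch below splits it; `by_name` and `subnets` are hypothetical names introduced here, and `new_data` is copied from the output above.

```python
new_data = [
    {"name": "pop_a", "subnets": {
        "Public": ["1.1.1.0/24,2.2.2.0/24"],
        "Private": ["192.168.0.0/24,192.168.1.0/24"],
    }},
    {"name": "pop_b", "subnets": {
        "Public": ["3.3.3.0/24,4.4.4.0/24"],
        "Private": ["192.168.2.0/24,192.168.3.0/24"],
    }},
]

# Index by name so a pop can be looked up directly.
by_name = {pop["name"]: pop["subnets"] for pop in new_data}

def subnets(pop_name, kind):
    """Return individual subnets, splitting the comma-joined strings."""
    return [s for joined in by_name[pop_name][kind] for s in joined.split(",")]

print(subnets("pop_a", "Public"))  # ['1.1.1.0/24', '2.2.2.0/24']
```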

How can I convert my JSON into the format required to make a D3 sunburst diagram?

I have the following JSON data:
{
"data": {
"databis": {
"dataexit": {
"databis2": {
"1250": { }
}
},
"datanode": {
"20544": { }
}
}
}
}
I want to use it to generate a D3 sunburst diagram, but that requires a different data format:
{
"name": "data",
"children": [
{
"name": "databis",
"children": [
{
"name": "dataexit",
"children": [
{
"name": "databis2",
"size": "1250"
}
]
},
{
"name": "datanode",
"size": "20544"
}
]
}
]
}
How can I do this with Python? I think I need to use a recursive function, but I don't know where to start.
You could use a recursive solution with a function that takes a name and a dictionary as parameters. For every item in the given dict it calls itself again to generate a list of children, each of which looks like this: {'name': 'name here', 'children': []}.
Then it checks for the special case where there is only one child and that child has the key children with an empty list as its value. In that case it returns a dict with the given parameter as name and the child's name as size. In all other cases the function returns a dict with name and children.
import json
data = {
"data": {
"databis": {
"dataexit": {
"databis2": {
"1250": { }
}
},
"datanode": {
"20544": { }
}
}
}
}
def helper(name, d):
    # Collect all children
    children = [helper(*x) for x in d.items()]
    # Return dict containing size in case only child looks like this:
    # {'name': '1250', 'children': []}
    # Note that get is used so that cases where the child already has size
    # instead of children work correctly
    if len(children) == 1 and children[0].get('children') == []:
        return {'name': name, 'size': children[0]['name']}
    # Normal case where returned dict has children (reuse the list built above)
    return {'name': name, 'children': children}
def transform(d):
    # Start from the single top-level key/value pair
    return helper(*next(iter(d.items())))
print(json.dumps(transform(data), indent=4))
Output:
{
"name": "data",
"children": [
{
"name": "databis",
"children": [
{
"name": "dataexit",
"children": [
{
"name": "databis2",
"size": "1250"
}
]
},
{
"name": "datanode",
"size": "20544"
}
]
}
]
}
