I've been trying to get better at data manipulation and flattening large sources of data. I have a JSON structure that looks like this:
json_structure = {
"a": "1",
"b": {
"c": 2,
"d": "3",
"e": {
"f": 4
}
},
"g": 5,
"h": [
{
"i": "6",
"j": "7"
}
],
"k": {
"l": "8",
"m": "9"
},
"n": {
"o": "10",
"p": {
"q": 11,
"r": 12,
},
"s": [
{
"t": "13",
"u": {
"v": 14,
},
"w": 15,
}
],
"x": [
{
"y": "16",
"z": [],
},
{
"aa": "17",
"bb": [
"abc"
]
}
]
}
}
I am able to get keys and values using this recursive code:
def deepValue(D,key,*rest,default=None):
    try: return deepValue(D[key],*rest,default=default) if rest else D[key]
    except: return default

def deepKeys(D,key,*rest):
    try:
        return deepKeys(D[key],*rest) if rest \
            else D[key].keys() if isinstance(D[key],dict) \
            else range(len(D[key]))
    except:
        return []

def deepItems(D,key,*rest):
    try:
        if rest:
            yield from deepItems(D[key],*rest)
        elif isinstance(D[key],dict):
            yield from D[key].items()
        else:
            yield from enumerate(D[key])
    except: return
However, when I get these keys, I also want to rename and store them in a dictionary that would look like this:
flattened = {
"a": 1,
"b_c": 2,
"b_d": 3,
"b_e_f": 4,
"g": 5,
"h_i_n": 6, # n refers to the index of the list, 0, 1, 2, etc.
"h_j_n": 7, # n refers to the index of the list, 0, 1, 2, etc.
"k_l": 8,
"k_m": 9,
"n_o": 10,
"n_o_p_q": 11,
"n_o_p_q_r": 12,
"n_s_t_i": 13, # i refers to the index of the list, 0, 1, 2, etc.
"s_u_v_n": 14 # n refers to the index of the list, 0, 1, 2, etc.
...
}
The naming convention being parentkey_childkey_nextchildkey.
This seems very complex to me, and I'm wondering if it's even possible to do.
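One possible approach, as a minimal sketch rather than a definitive answer: walk the structure recursively and build each key from the path that leads to the value. The function name flatten and its parameters parent_key and sep are made up for this illustration. Note one difference from the desired output above: this sketch puts a list index where it occurs in the path (h_0_i) rather than at the end (h_i_0), and empty lists produce no entry.

```python
def flatten(obj, parent_key="", sep="_"):
    """Flatten nested dicts and lists into a single-level dict (illustrative helper)."""
    items = {}
    if isinstance(obj, dict):
        for key, value in obj.items():
            new_key = f"{parent_key}{sep}{key}" if parent_key else str(key)
            items.update(flatten(value, new_key, sep))
    elif isinstance(obj, list):
        # The list index becomes part of the key, at the position where the list occurs.
        for index, value in enumerate(obj):
            new_key = f"{parent_key}{sep}{index}" if parent_key else str(index)
            items.update(flatten(value, new_key, sep))
    else:
        # Leaf value: store it under the accumulated key.
        items[parent_key] = obj
    return items


sample = {"a": "1", "b": {"c": 2, "e": {"f": 4}}, "h": [{"i": "6", "j": "7"}], "z": []}
flat = flatten(sample)
print(flat)
# {'a': '1', 'b_c': 2, 'b_e_f': 4, 'h_0_i': '6', 'h_0_j': '7'}
```

Values keep their original types, so "1" stays a string; converting them to integers would be a separate step.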
Working on a freshwater fish conservation project. I scraped a JSON file that looks like this:
{
"fish": [
{
"id": 0,
"n": "NO INFORMATION",
"a": "NONE",
"i": "none.png"
},
{
"id": 1,
"n": "Hampala barb",
"a": "Hampala macrolepidota",
"i": "hampala.png"
},
{
"id": 2,
"n": "Giant snakehead",
"a": "Channa micropeltes",
"i": "toman.png"
},
{
"id": 3,
"n": "Clown featherback",
"a": "Chitala ornata",
"i": "belida.png"
}
]
}
And I'm trying to extract the values of the keys "id" and "a" into a Python dictionary like this:
fish_id = {
0 : "NONE",
1 : "Hampala macrolepidota",
2 : "Channa micropeltes",
3 : "Chitala ornata"
}
import json
data = """{
"fish": [
{
"id": 0,
"n": "NO INFORMATION",
"a": "NONE",
"i": "none.png"
},
{
"id": 1,
"n": "Hampala barb",
"a": "Hampala macrolepidota",
"i": "hampala.png"
},
{
"id": 2,
"n": "Giant snakehead",
"a": "Channa micropeltes",
"i": "toman.png"
},
{
"id": 3,
"n": "Clown featherback",
"a": "Chitala ornata",
"i": "belida.png"
}
]
}"""
data_dict = json.loads(data)
fish_id = {}
for item in data_dict["fish"]:
    fish_id[item["id"]] = item["a"]
print(fish_id)
First, save the JSON as fish.json and load it:
with open('fish.json') as json_file:
    data = json.load(json_file)
Then take each fish:
fish1 = data['fish'][0]
fish2 = data['fish'][1]
fish3 = data['fish'][2]
fish4 = data['fish'][3]
After that, take only the values of each fish, because the dictionary is built from values alone:
value_list1=list(fish1.values())
value_list2=list(fish2.values())
value_list3=list(fish3.values())
value_list4=list(fish4.values())
Finally, create the fish_id dictionary:
fish_id = {
f"{value_list1[0]}" : f"{value_list1[2]}",
f"{value_list2[0]}" : f"{value_list2[2]}",
f"{value_list3[0]}" : f"{value_list3[2]}",
f"{value_list4[0]}" : f"{value_list4[2]}",
}
If you run:
print(fish_id)
the result will look like the line below. Note that the keys are strings here, and that a for loop would be more effective than handling each fish by hand.
{'0': 'NONE', '1': 'Hampala macrolepidota', '2': 'Channa micropeltes', '3': 'Chitala ornata'}
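As a hedged sketch of the loop-based version, a dictionary comprehension builds the same mapping in one expression and keeps the integer ids. The data below is a shortened copy of the JSON from the question, inlined so the snippet runs on its own.

```python
import json

# Shortened copy of the question's data, inlined for a self-contained example.
raw = """{"fish": [
  {"id": 0, "n": "NO INFORMATION", "a": "NONE", "i": "none.png"},
  {"id": 1, "n": "Hampala barb", "a": "Hampala macrolepidota", "i": "hampala.png"}
]}"""

data = json.loads(raw)

# Map each fish's "id" to its "a" value, looking fields up by name, not position.
fish_id = {item["id"]: item["a"] for item in data["fish"]}
print(fish_id)
# {0: 'NONE', 1: 'Hampala macrolepidota'}
```

Looking fields up by name also avoids depending on the order of the keys inside each object.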
I use indent = 2, but I want the first level of indentation to be zero. For example:
Partial Code
json.dump(json_data, json_file, indent=2)
Output
{
  "a": 1,
  "b": "2",
  "list": [
    {
      "c": 3,
      "d": 4
    }
  ]
}
What I want instead
{
"a": 1,
"b": "2",
"list": [
  {
    "c": 3,
    "d": 4
  }
]
}
As stated in the comments, it makes no functional difference, and you will need custom pretty-printing, something like:
import json
import textwrap
spam = {"a": 1, "b": "2",
"list": [{"c": 3, "d": 4,}]}
eggs = json.dumps(spam, indent=2).splitlines()
eggs = '\n'.join([eggs[0], textwrap.dedent('\n'.join(eggs[1:-1])), eggs[-1]])
print(eggs)
with open('spam.json', 'w') as f:
    f.write(eggs)
Output
{
"a": 1,
"b": "2",
"list": [
  {
    "c": 3,
    "d": 4
  }
]
}
I have the following JSON document:
{
"A": "A_VALUE",
"B": {
"C": [
{
"D": {
"E": "E_VALUE1",
"F": "F_VALUE1",
"G": "G_VALUE1"
},
"H": ["01", "23" ]
},
{
"D": {
"E": "E_VALUE2",
"F": "F_VALUE2",
"G": "G_VALUE3"
},
"H": ["45", "67" ]
}
]
}
}
and I would like to extract field H using a jsonpath2 expression where I specify a value for E field,
for example :
$..C[?(@.D.G="G_VALUE1")].H[1]
The code I use to parse this is the following (jsonpath2 version 0.4.3):
from jsonpath2.path import Path
s='{ "A": "A_VALUE", "B": { "C": [ { "D": { "E": "E_VALUE1", "F": "F_VALUE1", "G": "G_VALUE1" }, "H": ["01", "23" ] }, { "D": { "E": "E_VALUE2", "F": "F_VALUE2", "G": "G_VALUE3" }, "H": ["45", "67" ] } ] } }"'
p = Path.parse_str("$..C[?(@.D.E=\"E_VALUE1\")].H[1]")
print ([m.current_value for m in p.match(s)])
output
[]
Now, if I use the JSONPath evaluator on https://jsonpath.com/ I obtain the following result, which is not exactly what I need:
$..C[?(@.D.E="E_VALUE1")].H[1]
output
[23,67]
But if I change the expression this way, then it works and I obtain what I need:
$..C[?(@.D.E=="E_VALUE1")].H[1]
output
[23]
Same results with other online evaluator such as https://codebeautify.org/jsonpath-tester
So what would be the correct JSONPath expression to use with the jsonpath2 API in order to correctly extract the required field?
You have to use [*] to access the individual objects inside an array, and you have to parse the JSON string before matching. This code works:
from jsonpath2.path import Path
import json
s='{ "A": "A_VALUE", "B": { "C": [ { "D": { "E": "E_VALUE1", "F": "F_VALUE1", "G": "G_VALUE1" }, "H": ["01", "23" ] }, { "D": { "E": "E_VALUE2", "F": "F_VALUE2", "G": "G_VALUE3" }, "H": ["45", "67" ] } ] } }'
jso = json.loads(s)
p = Path.parse_str('$..C[*][?(@.D.E="E_VALUE1")].H[1]') # C[*] accesses each object in the array
print (*[m.current_value for m in p.match(jso)]) # 23
You can refer to this example from the jsonpath2 docs
You should use the == syntax.
Full disclosure: I had never heard of JSONPath before coming across your question, but being somewhat familiar with XPath, I figured I would read about this tool. I came across a site that can evaluate your expression using different implementations: http://jsonpath.herokuapp.com. The net result was that your expression with = could not be parsed by 3 of the 4 implementations. Moreover, the Goessner implementation returned results that you weren't expecting (all C elements matched and the result was [23,67]). With the == boolean expression, 3 of the 4 implementations provided the expected result of [23]. The Nebhale implementation again complained about the expression.
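For comparison, here is a minimal sketch of the same selection in plain Python, without any JSONPath library. It is not a replacement for the jsonpath2 answer; it only shows what the filter is expected to select: entries of C whose D.E equals "E_VALUE1", then the second element of H. The document is parsed with json.loads first, since the filter works on parsed data rather than on the raw string.

```python
import json

s = ('{ "A": "A_VALUE", "B": { "C": [ '
     '{ "D": { "E": "E_VALUE1", "F": "F_VALUE1", "G": "G_VALUE1" }, "H": ["01", "23"] }, '
     '{ "D": { "E": "E_VALUE2", "F": "F_VALUE2", "G": "G_VALUE3" }, "H": ["45", "67"] } ] } }')

doc = json.loads(s)  # parse the string into dicts and lists first

# Keep only the entries whose D.E matches, then take index 1 of H.
matches = [entry["H"][1] for entry in doc["B"]["C"] if entry["D"]["E"] == "E_VALUE1"]
print(matches)
# ['23']
```

The values in H are strings in this document, so the result is ['23'] rather than [23].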
I have a json that looks like this
{
"values": {
"a": 1,
"b": 2,
"c": 3,
"d": 4
},
"sales-year": [
{ "a": 0, "b": 0, "c": 0, "d": 0, "e": "karl" },
{ "a": 0, "b": 0, "c": 0, "d": 0, "e": "karl" },
{ "a": 4, "b": 10, "c": 20, "d": 30, "e": "karl" },
{ "a": 0, "b": 0, "c": 0, "d": 0, "e": "karl" }
]
}
And I pass it through get_context_data with django to my 'index.html'. Further explanation here
I can access the values pretty easily with {{my_json.values.a}}. However, I am having problems accessing the sales-year array. How do I do that? I tried the following, and none of them work:
{{my_json['sales-this'].2.a}}
{{my_json.['sales-this'].2.a}}
{{my_json.[sales-this].2.a}}
{{my_json[sales-this].2.a}}
You need to create a custom template filter to handle this.
First, create a custom template filter like this:
from django import template
register = template.Library()

@register.filter
def getItem(mapping, key):
    # Return the value stored under key, or None if the key is missing.
    return mapping.get(key)
Next, use it in your template like this:
{{my_json|getItem:'sales-year'}}
Learn more on how to use/create custom filters here
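An alternative that avoids a custom filter, offered only as a rough sketch: rename the keys in the view before they reach the template, replacing hyphens with underscores so that ordinary dot lookups work. The helper name underscore_keys is hypothetical, and the sketch assumes all dictionary keys are strings, which holds for data loaded from JSON.

```python
def underscore_keys(value):
    """Recursively replace hyphens in dictionary keys with underscores (illustrative helper)."""
    if isinstance(value, dict):
        return {key.replace("-", "_"): underscore_keys(item) for key, item in value.items()}
    if isinstance(value, list):
        return [underscore_keys(item) for item in value]
    return value  # leaf values are returned unchanged


my_json = {"values": {"a": 1}, "sales-year": [{"a": 0}, {"a": 0}, {"a": 4}]}
cleaned = underscore_keys(my_json)
print(cleaned["sales_year"][2]["a"])
# 4
```

With the cleaned dictionary in the context, the template lookup would become {{my_json.sales_year.2.a}}.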
I have a JSON file that looks like this
{
"values": {
"a": 1,
"b": 2,
"c": 3,
"d": 4
},
"sales": [
{ "a": 0, "b": 0, "c": 0, "d": 0, "e": "karl" },
{ "a": 0, "b": 0, "c": 0, "d": 0, "e": "karl" },
{ "a": 4, "b": 10, "c": 20, "d": 30, "e": "karl" },
{ "a": 0, "b": 0, "c": 0, "d": 0, "e": "karl" }
]
}
and I am importing that via get_context_data
import json
from django.views.generic import CreateView

class MyCreateView(CreateView):
    def get_context_data(self, **kwargs):
        context = super(MyCreateView, self).get_context_data(**kwargs)
        with open('/path/to/my/JSON/file/my_json.cfg', 'r') as f:
            myfile = json.load(f)
        # Expose the parsed JSON to the template as my_json.
        context['my_json'] = myfile
        return context
This works: when I do print myfile["sales"][0]["a"] I get 0, and when I put {{my_json}} into index.html I get the whole array.
So now my question is how best to read the values. Do I have to create context variables for each of the values, or is it possible to read the JSON array in my HTML?
I tried {{my_json["sales"][0]["a"]}}, but it didn't work.
If you want to get myfile["sales"][0]["a"] in the template, you can write:
{{my_json.sales.0.a}}
or, if you want to get myfile["values"]["a"], you can write:
{{my_json.values.a}}
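To illustrate why the dotted form works, here is a simplified emulation in plain Python of how a dotted template variable is resolved: each part is tried as a dictionary key, then as an attribute, then as a list index. The function resolve is a hypothetical stand-in written for this example, not Django's actual implementation, and it skips details such as calling methods.

```python
def resolve(obj, dotted_path):
    """Follow a dotted path through dicts, attributes and lists (simplified emulation)."""
    for part in dotted_path.split("."):
        try:
            obj = obj[part]                # 1. dictionary lookup
        except (TypeError, KeyError, IndexError):
            try:
                obj = getattr(obj, part)   # 2. attribute lookup
            except AttributeError:
                obj = obj[int(part)]       # 3. list-index lookup
    return obj


my_json = {
    "values": {"a": 1, "b": 2},
    "sales": [{"a": 0, "b": 0}, {"a": 0, "b": 0}, {"a": 4, "b": 10}],
}
print(resolve(my_json, "sales.2.b"))
# 10
print(resolve(my_json, "values.a"))
# 1
```

Because the dictionary lookup comes first, values in the second call resolves to the "values" key of the data rather than to the dict.values method.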