I'm a beginner in Python pulling JSON data consisting of nested objects (dictionaries?). I'm trying to iterate through everything to locate a key all of them share, and select only the objects that have a specific value in that key. I spent days researching and applying and now everything is kind of blurring together in some mix of JS/Python analysis paralysis. This is the general format for the JSON data:
{
    "things": {
        "firstThing": {
            "one": "x",
            "two": "y",
            "three": "z"
        },
        "secondThing": {
            "one": "a",
            "two": "b",
            "three": "c"
        },
        "thirdThing": {
            "one": "x",
            "two": "y",
            "three": "z"
        }
    }
}
In this example I want to isolate the dictionaries where two == y. I'm unsure whether I should be using:
- JSON selection (things.things[i].two)
- a for loop through things, then things[i], looking for two
- k/v when I have 3 sets of keys
Can anyone point me in the right direction?
Assuming this is only ever one level deep (things), and you want a 'duplicate' of this dictionary with only the matching child dicts included, then you can do this with a dictionary comprehension:
data = {
    "things": {
        "firstThing": {
            "one": "x",
            "two": "y",
            "three": "z"
        },
        "secondThing": {
            "one": "a",
            "two": "b",
            "three": "c"
        },
        "thirdThing": {
            "one": "x",
            "two": "y",
            "three": "z"
        }
    }
}
print({"things": {k:v for k, v in data['things'].items() if 'two' in v and v['two'] == 'y'}})
Since you've tagged this with python I assume you'd prefer a python solution. If you know that your 'two' key (whatever it is) is only present at the level of objects that you want, this might be a nice place for a recursive solution: a generator that takes a dictionary and yields any sub-dictionaries that have the correct key and value. This way you don't have to think too much about the structure of your data. Something like this will work, if you're using at least Python 3.3:
def findSubdictsMatching(target, targetKey, targetValue):
    if not isinstance(target, dict):
        # base case: nothing to search inside a non-dict value
        return
    # check "in" rather than get() to allow None as target value
    if targetKey in target and target[targetKey] == targetValue:
        yield target
    else:
        # recurse into every value; non-dict values hit the base case
        for value in target.values():
            yield from findSubdictsMatching(value, targetKey, targetValue)
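JSON often contains arrays as well as objects. Here is a minimal sketch of the same generator idea that also descends into lists. The name find_matching and the sample data (including the "extras" list) are illustrative assumptions, not part of the question.

```python
def find_matching(node, target_key, target_value):
    """Yield every dict under node whose target_key equals target_value."""
    if isinstance(node, dict):
        if target_key in node and node[target_key] == target_value:
            yield node
        else:
            for value in node.values():
                yield from find_matching(value, target_key, target_value)
    elif isinstance(node, list):
        # Assumption: list elements may themselves be dicts or lists.
        for item in node:
            yield from find_matching(item, target_key, target_value)


# Hypothetical sample data; "extras" is added only to show the list case.
data = {
    "things": {
        "firstThing": {"one": "x", "two": "y", "three": "z"},
        "secondThing": {"one": "a", "two": "b", "three": "c"},
        "extras": [{"one": "q", "two": "y"}],
    }
}

matches = list(find_matching(data, "two", "y"))
print(matches)
```

Because it is a generator, you can also loop over the matches one at a time instead of building the whole list.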
This code collects the objects that have "two":"y" into a list:
import json
m = '{"things":{"firstThing":{"one":"x","two":"y","three":"z"},"secondThing":{"one":"a","two":"b","three":"c"},"thirdThing":{"one":"x","two":"y","three":"z"}}}'
l = json.loads(m)
y_objects = []
for key in l["things"]:
    l_2 = l["things"][key]  # the nested object, e.g. "firstThing"
    for key_1 in l_2:
        if key_1 == "two":
            if l_2[key_1] == 'y':
                y_objects.append(l_2)
print(y_objects)
Console:
[{'one': 'x', 'two': 'y', 'three': 'z'}, {'one': 'x', 'two': 'y', 'three': 'z'}]
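The same result can probably be had with less nesting. This is a sketch using a list comprehension, assuming every value under "things" is a dictionary; the shortened raw string is made up for illustration.

```python
import json

# Shortened, hypothetical sample in the same shape as the question's data.
raw = '{"things": {"firstThing": {"one": "x", "two": "y"}, "secondThing": {"one": "a", "two": "b"}}}'
things = json.loads(raw)["things"]

# .get() returns None when "two" is missing, so such objects are skipped.
y_objects = [obj for obj in things.values() if obj.get("two") == "y"]
print(y_objects)
```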
Related
I have a JSON with an unknown number of keys & values, I need to store the user's selection in a list & then access the selected key's value; (it'll be guaranteed that the keys in the list are always stored in the correct sequence).
Example
I need to access the value_key1-2.
mydict = {
    'key1': {
        'key1-1': {
            'key1-2': 'value_key1-2'
        },
    },
    'key2': 'value_key2'
}
I can see the keys & they're limited so I can manually use:
>>> print(mydict['key1']['key1-1']['key1-2'])
value_key1-2
Now after storing the user's selections in a list, we have the following list:
Uselection = ['key1', 'key1-1', 'key1-2']
How can I convert those list elements into the similar code we used earlier?
How can I automate it using Python?
You have to loop the list of keys and update the "current value" on each step.
val = mydict
try:
    for key in Uselection:
        val = val[key]
except KeyError:
    # handle non-existing keys here
    val = None  # placeholder fallback; choose whatever suits your case
Another, more 'posh' way to do the same (not generally recommended):
from functools import reduce
val = reduce(dict.get, Uselection, mydict)
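One caveat with the reduce(dict.get, ...) form: if a key in the middle of the path is missing, dict.get returns None and the next step raises a TypeError. Here is a sketch of a more forgiving helper; get_path and _MISSING are hypothetical names introduced here.

```python
from functools import reduce

_MISSING = object()  # sentinel so that a stored None is still a valid value


def get_path(mapping, keys, default=None):
    """Follow keys through nested dicts; return default if any step fails."""
    def step(current, key):
        if isinstance(current, dict) and key in current:
            return current[key]
        return _MISSING

    result = reduce(step, keys, mapping)
    return default if result is _MISSING else result


mydict = {'key1': {'key1-1': {'key1-2': 'value_key1-2'}}, 'key2': 'value_key2'}
Uselection = ['key1', 'key1-1', 'key1-2']

print(get_path(mydict, Uselection))
print(get_path(mydict, ['key2', 'nope'], default='n/a'))
```

The first call prints the nested value. The second falls back to the default, because 'key2' holds a string rather than a dictionary.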
I'm not sure if I'm just having a brainblock here or if this is actually supposed to be a challenge, but I am having trouble figuring out how to check the depth of nested dictionaries if the keys are not known.
Here is an example of what I am trying to do (in the most simple/efficient way):
Optimally, there would be some way for me to determine the maximum depth of this dict, without knowing the keys and values -
nested_dict = {
'nest1': {
'nest2': {
'nest3': 'val'
},
'unknown_key', 'val',
'unknown_key': 'val'
}
}
Please let me know if this makes sense.
Check whether the object is a dict; if so, recurse into each of its values, take the maximum of the results, and add one for the current level.
PS: the dict in the question had a syntax error; I fixed it.
def max_depth(d):
    if isinstance(d, dict):
        # default=0 covers the empty dict, which has no values to recurse into
        return 1 + max((max_depth(value) for value in d.values()), default=0)
    return 0
nested_dict = {'nest1': {'nest2': {'nest3': 'val'}, 'unknown_key': 'val', 'unknown_key': 'val'}}
print(max_depth(nested_dict))
Output
3
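For very deeply nested data, the recursive version could run into Python's recursion limit. Here is a sketch of an iterative variant with an explicit stack, which should give the same answers; max_depth_iterative is a name chosen here for illustration.

```python
def max_depth_iterative(d):
    """Return the dict nesting depth of d without using recursion."""
    deepest = 0
    stack = [(d, 1)]  # (node, depth the node has if it turns out to be a dict)
    while stack:
        node, depth = stack.pop()
        if isinstance(node, dict):
            deepest = max(deepest, depth)
            stack.extend((value, depth + 1) for value in node.values())
    return deepest


nested_dict = {'nest1': {'nest2': {'nest3': 'val'}, 'unknown_key': 'val'}}
print(max_depth_iterative(nested_dict))  # 3
```

Like the recursive version, it counts only dictionaries, so a plain value has depth 0 and an empty dict has depth 1.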
I am trying to serialize a python class named Origin, containing a dictionary as an attribute, into an xml with lxml objectify. This dictionary is initialized with the value "default" for each key.
class Origin:
    def __init__(self):
        self.dict = {"A":"default", "B":"default" ...}  # my dictionary actually has 6 keys
The dictionary is filled by first parsing an XML file. Not every key is filled. For example: dict = {"A":"hello", "B":"default" ...}
I want to create my Origin XML Element with my dictionary as attribute but I don't want to have the "default" keys.
My solution is to have nested if:
if(self.dict["A"] != "default"):
    if(self.dict["B"] != "default"):
        ...
            objectify.Element("Origin", A=self.dict["A"], B=self.dict["B"]...)
But it's an ugly and impractical solution if you have more than one or two keys.
Is there a way to first create my Element origin = objectify.Element("Origin") and then add my dictionary's keys if they are different from "default"?
Something more dynamic, like
for key in self.dict:
    if(self.dict[key] != "default"):
        origin.addAttribute(key=self.dict[key])
Thank you
I would filter the dictionary to only the values that are not "default".
The dict comprehension feature is a big help here.
Example:
data = {
"A": "foo",
"B": "default",
"C": "bar",
}
data = {key: value for key, value in data.items() if value != "default"}
print(data)
Output:
{'A': 'foo', 'C': 'bar'}
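To show how the filtered dictionary can become element attributes, here is a sketch that uses the standard library's xml.etree.ElementTree as a stand-in for lxml.objectify, which is an assumption made so the example runs without extra packages. lxml elements expose a similar set(name, value) method, but check the lxml documentation for the objectify specifics. The values dictionary is made up.

```python
import xml.etree.ElementTree as ET

# Hypothetical parsed values; "default" marks keys that were never filled.
values = {"A": "hello", "B": "default", "C": "bar"}

# Keep only the entries that were actually filled in.
attrs = {key: value for key, value in values.items() if value != "default"}

# Assumption: ElementTree stands in for lxml.objectify here.
origin = ET.Element("Origin", attrib=attrs)

# Attributes can also be added one at a time after the element exists.
origin.set("D", "later")

print(origin.attrib)
```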
I'm using this as a reference: Elegant way to remove fields from nested dictionaries
I have a large amount of JSON-formatted data here and we've determined a list of unnecessary keys (and all their underlying values) that we can remove.
I'm a bit new to working with JSON and Python specifically (mostly did sysadmin work) and initially thought it was just a plain dictionary of dictionaries. While some of the data looks like that, several more pieces of data consist of dictionaries of lists, which can in turn contain more lists or dictionaries with no specific pattern.
The idea is to keep the data identical EXCEPT for the specified keys and associated values.
Test Data:
to_be_removed = ['leecher_here']
easy_modo = {
    'hello_wold':'konnichiwa sekai',
    'leeching_forbidden':'wanpan kinshi',
    'leecher_here':'nushiyowa'
}
lunatic_modo = {
    'hello_wold': {
        'leecher_here':'nushiyowa','goodbye_world':'aokigahara'
    },
    'leeching_forbidden':'wanpan kinshi',
    'leecher_here':'nushiyowa',
    'something_inside': {
        'hello_wold':'konnichiwa sekai',
        'leeching_forbidden':'wanpan kinshi',
        'leecher_here':'nushiyowa'
    },
    'list_o_dicts': [
        {
            'hello_wold':'konnichiwa sekai',
            'leeching_forbidden':'wanpan kinshi',
            'leecher_here':'nushiyowa'
        }
    ]
}
Obviously, the original question posted there isn't accounting for lists.
My code, modified appropriately to work with my requirements.
from copy import deepcopy
def remove_key(json,trash):
    """
    <snip>
    """
    keys_set = set(trash)
    modified_dict = {}
    if isinstance(json,dict):
        for key, value in json.items():
            if key not in keys_set:
                if isinstance(value, dict):
                    modified_dict[key] = remove_key(value, keys_set)
                elif isinstance(value,list):
                    for ele in value:
                        modified_dict[key] = remove_key(ele,trash)
                else:
                    modified_dict[key] = deepcopy(value)
    return modified_dict
Something is clearly breaking the structure, because the function fails the test I wrote; the expected data is exactly the same as the input, minus the removed keys. The test shows that the keys are removed properly, but where the data should be a list of dictionaries, a single dictionary is returned instead, which will have unfortunate implications down the line.
I'm sure it's because the function always returns a dictionary, but I don't know how to proceed from here in order to maintain the structure.
At this point, I need help finding what I could have overlooked.
When you go through your json file, you only need to determine whether it is a list, a dict or neither. Here is a recursive way to modify your input dict in place:
def remove_key(d, trash=None):
    if not trash: trash = []
    if isinstance(d,dict):
        keys = [k for k in d]
        for key in keys:
            if any(key==s for s in trash):
                del d[key]
        for value in d.values():
            remove_key(value, trash)
    elif isinstance(d,list):
        for value in d:
            remove_key(value, trash)
remove_key(lunatic_modo,to_be_removed)
remove_key(easy_modo,to_be_removed)
Result:
{
    "hello_wold": {
        "goodbye_world": "aokigahara"
    },
    "leeching_forbidden": "wanpan kinshi",
    "something_inside": {
        "hello_wold": "konnichiwa sekai",
        "leeching_forbidden": "wanpan kinshi"
    },
    "list_o_dicts": [
        {
            "hello_wold": "konnichiwa sekai",
            "leeching_forbidden": "wanpan kinshi"
        }
    ]
}
{
    "hello_wold": "konnichiwa sekai",
    "leeching_forbidden": "wanpan kinshi"
}
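If the original data must stay untouched, a non-mutating variant might be preferable. Here is a sketch that builds a new structure and keeps lists as lists; strip_keys and the trimmed sample are names and data introduced here for illustration.

```python
def strip_keys(node, trash):
    """Return a copy of node with every key in trash removed at any depth."""
    trash = set(trash)
    if isinstance(node, dict):
        return {k: strip_keys(v, trash) for k, v in node.items() if k not in trash}
    if isinstance(node, list):
        # Rebuild the list so a list of dicts stays a list of dicts.
        return [strip_keys(item, trash) for item in node]
    return node  # scalars (str, int, None, ...) are returned as-is


# Trimmed-down, hypothetical version of the question's test data.
sample = {
    "hello_wold": {"leecher_here": "nushiyowa", "goodbye_world": "aokigahara"},
    "leecher_here": "nushiyowa",
    "list_o_dicts": [{"hello_wold": "konnichiwa sekai", "leecher_here": "nushiyowa"}],
}

cleaned = strip_keys(sample, ["leecher_here"])
print(cleaned)
```

Returning a value from every branch (dict, list, or scalar) is what preserves the shape of the input; the function in the question loses lists because it always returns a dictionary.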
I am facing some trouble with trying to add more key:value pairs to a dictionary object that is itself nested within another dictionary object. Also, the usual way of doing dict[key] = value to assign additional key:value pairs to the dictionary is not suitable for my case here (I'll explain why later below), and thus this makes my objective a lot more challenging to achieve.
I'll illustrate what I'm trying to achieve with some statements from my source code.
First, I have a dictionary object that contains nesting:
environment = { 'v' :
{
'SDC_PERIOD':'{period}s'.format(period = self.period),
'FAMILY':'{legup_family}s'.format(legup_family = self.legup_family),
'DEVICE_FAMILY':'"{fpga_family}s"'.format(fpga_family = self.fpga_family)
}
}
and then following this line, I will do an if test that, if passed, will require me to add this other dictionary:
environment_add = { 'v' : {'LM_LICENSE_FILE' : '1800#adsc-linux'} ,
'l' : 'quartus_full' }
to ultimately form this complete dictionary:
environment = { 'v' :
{
'SDC_PERIOD':'{period}s'.format(period = self.period),
'FAMILY':'{legup_family}s'.format(legup_family = self.legup_family),
'DEVICE_FAMILY':'"{fpga_family}s"'.format(fpga_family = self.fpga_family),
'LM_LICENSE_FILE' : '1800#adsc-linux'
} ,
'l' : 'quartus_full'
}
As you can see, if I were to try to assign a new key:value pair using the dict[key] = value syntax, it would not work for me, because it would either create a new top-level key:value pair or overwrite the existing dictionary object, and the key:value pairs in it, nested under the 'v' key.
So far, in order to accomplish the creation of the dictionary, I've been using the following:
environment = """{ v: {'SDC_PERIOD':'%(period)s','FAMILY':'%(legup_family)s','DEVICE_FAMILY':'"%(fpga_family)s"'}}""" % self
if self.require_license: # this is the if statement test that I was referring to earlier
    environment = environment.replace('}', '')
    environment += """ ,'LM_LICENSE_FILE':'1800#adsc-linux'}, 'l': 'quartus_full'}"""
and then obtaining the dictionary object later with:
import ast
env_dict = ast.literal_eval(environment)
which effectively converts the environment string into a dictionary object stored under the new variable name env_dict.
My teammates think that this is much too overkill, especially since the environment or env_dict object will be parsed in 2 separate modules later on. In the first module, the key-value pairs will be broken up and reconstructed to form strings that look like '-v SDC_PERIOD=2500s, LM_LICENSE_FILE=1800#adsc-linux' , while in the second module, the dictionary nested under the 'v' key (of the environment/env_dict dictionary object) will be extracted out and directly fed as an argument to a function that accepts a mapping object.
So as you can see, there is quite a lot of precise parsing required to do the job, and although my method fulfills the objective, it is not accepted by my team and they think that there must be a better way to do this directly from environment being a dictionary object and not a string object.
Thank you very much for studying my detailed post, and I will greatly appreciate any help or suggestions to move forward on this!
for k, v in environment_add.items():  # .iteritems() in Python 2
    if k in environment:
        environment[k].update(v)
    else:
        environment[k] = v
That is, for each item to add, check if it exists, and update it if so, or simply create it. This assumes the items being added, if they exist, will be dicts (you can't update a string like quartus_full).
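If the added values may be a mix of dictionaries and plain values at any depth, a recursive merge is one way to lift that assumption. This is a sketch only: merge_nested is a hypothetical helper, and the literal values below are placeholders standing in for the formatted strings in the question.

```python
def merge_nested(base, extra):
    """Merge extra into base in place; nested dicts are merged, other values overwritten."""
    for key, value in extra.items():
        if isinstance(value, dict) and isinstance(base.get(key), dict):
            merge_nested(base[key], value)
        else:
            base[key] = value
    return base


# Placeholder values; the real ones come from self.period and friends.
environment = {"v": {"SDC_PERIOD": "2500s", "FAMILY": "placeholder_family"}}
environment_add = {"v": {"LM_LICENSE_FILE": "1800#adsc-linux"}, "l": "quartus_full"}

merge_nested(environment, environment_add)
print(environment)
```

The existing entries under 'v' are kept, the licence entry is added beside them, and 'l' is created at the top level.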
Why not just use update?
In [4]: dict_ = {"a": {"b": 2, "c": 3}}
In [5]: dict_["a"].update(d=4)
In [6]: dict_
Out[6]: {'a': {'b': 2, 'c': 3, 'd': 4}}