Python get/set value of dict of dicts by key in variable - python

I have a dict of dicts in Python. It can be arbitrarily deep, not just 2 levels.
data = {
    0: {
        1: {
            2: "Yes"
        },
        3: {
            4: "No"
        }
    },
}
I need to get and set values, but the key path is dynamic and is stored in a list:
key_to_get = [0, 1, 2]
print(data.get(key_to_get))
This should print Yes.
Any ideas?

Here is a simple recursive function that solves the problem:
def getValue(d, keys):
    # Base case: the last key indexes directly into the current dict.
    if len(keys) == 1:
        return d.get(keys[0])
    # Otherwise descend one level and recurse on the remaining keys.
    return getValue(d.get(keys[0]), keys[1:])
And an iterative approach, if you prefer:
temp = data
for key in key_to_get:
    temp = temp.get(key)
print(temp)
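The question also asks about setting a value, which the snippets above do not cover. A minimal sketch of both operations follows; the helper names get_nested and set_nested are made up for illustration, and set_nested assumes that missing intermediate levels should be created as empty dicts.

```python
from functools import reduce
import operator

def get_nested(d, keys):
    """Follow keys one level at a time and return the value at the end."""
    return reduce(operator.getitem, keys, d)

def set_nested(d, keys, value):
    """Set the value at the end of keys, creating missing intermediate dicts."""
    for key in keys[:-1]:
        d = d.setdefault(key, {})
    d[keys[-1]] = value

data = {0: {1: {2: "Yes"}, 3: {4: "No"}}}
print(get_nested(data, [0, 1, 2]))    # Yes
set_nested(data, [0, 3, 4], "Maybe")  # overwrite an existing leaf
set_nested(data, [5, 6], "New")       # create a new branch
print(data)
```

Unlike the .get-based versions, get_nested raises KeyError when a key along the path is missing, which may or may not be what you want.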

Related

Turning dynamically sized list into tree structured like dictionary

I'm trying to build a dictionary from a list of dynamic size.
parts = [['Sect', 'H1'],
['Sect', 'P'],
['Sect', 'P[2]'],
['Sect[2]', 'Sect', 'H2']]
This should result in a dictionary like:
{
"Sect": {
"H1": {
},
"P": {
},
"P[2]": {
},
},
"Sect[2]": {
"Sect": {
"H2": {
}
}
}
}
I can't get it. Any idea how to turn a list of dynamic size into a tree structured dictionary?
My approach so far:
for i, part in enumerate(parts):
    if i == 0:
        if part not in result:
            result[part] = dict()
    if i == 1:
        if part not in result[parts[i-1]]:
            result[parts[i-1]][part] = dict()
    if i == 2:
        if part not in result[parts[i-2]][parts[i-1]]:
            result[parts[i-2]][parts[i-1]][part] = dict()
    ...
But this approach isn't dynamic.
Similar to quamrana's answer, but a bit terser with dict.setdefault.
d = {}
for p in parts:
    inner = d
    for k in p:
        inner = inner.setdefault(k, {})
print(d) # {'Sect[2]': {'Sect': {'H2': {}}}, 'Sect': {'P[2]': {}, 'P': {}, 'H1': {}}}
The answer is iteration: iterate over the list of lists, and for each inner list iterate over its keys to build the hierarchy of dicts.
parts = [['Sect', 'H1'],
         ['Sect', 'P'],
         ['Sect', 'P[2]'],
         ['Sect[2]', 'Sect', 'H2']]
data = {}
for part in parts:
    d = data  # refer to the current dict
    for key in part:
        new_dict = d[key] if key in d else {}
        d[key] = new_dict
        d = new_dict  # descend into the nested dict
print(data)
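Another way to get the same tree is a recursive collections.defaultdict, which creates child nodes on first access. This is a sketch of that alternative technique, not one of the answers above; the names tree and to_plain_dict are made up for illustration.

```python
from collections import defaultdict

def tree():
    """A dict whose missing keys are created as nested trees on first access."""
    return defaultdict(tree)

def to_plain_dict(node):
    """Convert the nested defaultdicts back into ordinary dicts."""
    return {key: to_plain_dict(child) for key, child in node.items()}

parts = [['Sect', 'H1'],
         ['Sect', 'P'],
         ['Sect', 'P[2]'],
         ['Sect[2]', 'Sect', 'H2']]

root = tree()
for path in parts:
    node = root
    for key in path:
        node = node[key]  # indexing creates the child if it is missing

result = to_plain_dict(root)
print(result)
```

Converting back to plain dicts is optional, but it avoids accidentally creating keys later just by reading them.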

How can I access value in nested dictionary using a single key?

Assuming I have a nested dictionary, loaded from a pickle file, that contains several levels, I would like to get a value by giving only the last key. Keys are unique within their own 'branch'.
The main problem is that I have multiple keys and levels:
dict = {
    'A': {
        'X': {
            1: [...],
            2: [...]
        },
        'Y': {
            3: [...],
            4: [...]
        }
    },
    'B': {
        'G': {
            'H': {
                'Z': [...]
            }
        }
    },
    'C': [...]
}
How can I do that?
A simple solution is a recursive function, which works for arbitrarily deep nested dictionaries:
outer_dict = {'outer': {'inner': 10, 'even_inner': {'innerst': 25}}}
and the function:
def get_val(search_dict, key):
    """ recursive searching the dict """
    for elem in search_dict:
        if elem == key:
            return search_dict[elem]
        if isinstance(search_dict[elem], dict):
            retval = get_val(search_dict[elem], key)
            if retval is not None:
                return retval
value = get_val(outer_dict, 'innerst')
print(value)
>> 25
Problems:
If the key is not unique, you will get only the first match. You will need to collect the values in a list if the key can occur more than once.
Please provide an example next time!
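For the non-unique case mentioned above, one option is a generator that yields every match instead of returning the first one. This is only a sketch; find_all and the sample data are made up for illustration.

```python
def find_all(search_dict, key):
    """Yield every value stored under key, at any depth."""
    for k, v in search_dict.items():
        if k == key:
            yield v
        if isinstance(v, dict):
            # Keep searching below, even if this level already matched.
            yield from find_all(v, key)

nested = {'A': {'X': {1: 'a1', 2: 'a2'}}, 'B': {'G': {'X': 'deep'}}, 'C': 'c'}
matches = list(find_all(nested, 'X'))
print(matches)  # [{1: 'a1', 2: 'a2'}, 'deep']
```

A side benefit is that a stored value of None is still reported, whereas get_val treats None as "not found".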

Pyspark - get attribute names from json file

I am new to PySpark. My requirement is to extract the attribute names from a nested JSON file. I tried using json_normalize from the pandas package. It works for top-level attributes but never fetches the attributes inside JSON array attributes. My JSON doesn't have a static structure; it varies for each document that we receive. Could someone please help, with an explanation, using the small example below?
{
"id":"1",
"name":"a",
"salaries":[
{
"salary":"1000"
},
{
"salary":"5000"
}
],
"states":{
"state":"Karnataka",
"cities":[
{
"city":"Bangalore"
},
{
"city":"Mysore"
}
],
"state":"Tamil Nadu",
"cities":[
{
"city":"Chennai"
},
{
"city":"Coimbatore"
}
]
}
}
I need help especially with the JSON array elements.
Expected output :
id
name
salaries.salary
states.state
states.cities.city
Here is another solution for extracting all nested attributes from the JSON:
import json
result_set = set([])
def parse_json_array(json_obj, parent_path):
    array_obj = list(json_obj)
    for i in range(0, len(array_obj)):
        json_ob = array_obj[i]
        # Only descend into array elements that are JSON objects (dicts).
        if isinstance(json_ob, dict):
            parse_json(json_ob, parent_path)
    return None
def parse_json(json_obj, parent_path):
    for key in json_obj.keys():
        key_value = json_obj.get(key)
        # Nested object: recurse with the extended path.
        if type(key_value) == type(json_obj):
            parse_json(key_value, str(key) if parent_path == "" else parent_path + "." + str(key))
        # Array: recurse into each element with the extended path.
        elif type(key_value) == type(list(json_obj)):
            parse_json_array(key_value, str(key) if parent_path == "" else parent_path + "." + str(key))
        # Record the path of every key, including intermediate ones.
        result_set.add((parent_path + "." + key).encode('ascii', 'ignore'))
    return None
file_name = "C:/input/sample.json"
file_data = open(file_name, "r")
json_data = json.load(file_data)
print json_data
parse_json(json_data, "")
print list(result_set)
Output:
{u'states': {u'state': u'Tamil Nadu', u'cities': [{u'city': u'Chennai'}, {u'city': u'Coimbatore'}]}, u'id': u'1', u'salaries': [{u'salary': u'1000'}, {u'salary': u'5000'}], u'name': u'a'}
['states.cities.city', 'states.cities', '.id', 'states.state', 'salaries.salary', '.salaries', '.states', '.name']
Note:
My Python version: 2.7
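For Python 3, the same idea can be sketched as a function that returns only the leaf paths, without the leading dot. The name attribute_paths is made up for illustration, and the sample is a reduced version of the question's JSON with the duplicate "state"/"cities" keys removed (json.loads keeps only the last value for a repeated key anyway).

```python
import json

def attribute_paths(node, parent=""):
    """Return the set of dotted paths that lead to leaf (non-container) values."""
    if isinstance(node, dict):
        paths = set()
        for key, value in node.items():
            path = f"{parent}.{key}" if parent else str(key)
            paths |= attribute_paths(value, path)
        return paths
    if isinstance(node, list):
        # Array elements share their parent's path; indices are not recorded.
        paths = set()
        for item in node:
            paths |= attribute_paths(item, parent)
        return paths
    # Leaf value. Note that empty dicts and empty lists contribute no path.
    return {parent} if parent else set()

doc = json.loads("""
{"id": "1", "name": "a",
 "salaries": [{"salary": "1000"}, {"salary": "5000"}],
 "states": {"state": "Karnataka",
            "cities": [{"city": "Bangalore"}, {"city": "Mysore"}]}}
""")
for path in sorted(attribute_paths(doc)):
    print(path)
```

Using a set means repeated array elements such as the two salary objects produce a single salaries.salary entry.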
You can also do it this way.
data = { "id":"1", "name":"a", "salaries":[ { "salary":"1000" }, { "salary":"5000" } ], "states":{ "state":"Karnataka", "cities":[ { "city":"Bangalore" }, { "city":"Mysore" } ], "state":"Tamil Nadu", "cities":[ { "city":"Chennai" }, { "city":"Coimbatore" } ] } }
def dict_ittr(lin, data):
    for k, v in data.items():
        if type(v) is list:
            # Visit every element of the list under the same path.
            for l in v:
                dict_ittr(lin + "." + k, l)
        elif type(v) is dict:
            dict_ittr(lin + "." + k, v)
        else:
            print(lin + "." + k)
dict_ittr("", data)
Output:
.states.state
.states.cities.city
.states.cities.city
.id
.salaries.salary
.salaries.salary
.name
If you treat the json like a python dictionary, this should work.
I just wrote a simple recursive program.
Script
import json
def js_r(filename):
    with open(filename) as f_in:
        return json.load(f_in)
g = js_r("city.json")
answer_d = {}
def base_line(g, answer_d):
    for key in g.keys():
        answer_d[key] = {}
    return answer_d
answer_d = base_line(g, answer_d)
def recurser_func(g, answer_d):
    for k in g.keys():
        if type(g[k]) == type([]):  # If the value is a list (assumed to be a non-empty list of dicts)
            answer_d[k] = {list(g[k][0].keys())[0]: {}}
        if type(g[k]) == type({}):  # If the value is a dictionary
            answer_d[k] = {list(g[k].keys())[0]: {}}  # seed with the first child key
            answer_d[k] = recurser_func(g[k], answer_d[k])
    return answer_d
recurser_func(g, answer_d)
def printer_func(answer_d, list_to_print, parent):
    for k in answer_d.keys():
        if len(answer_d[k].keys()) == 1:
            list_to_print.append(parent)
            list_to_print[-1] += k
            list_to_print[-1] += "." + str(list(answer_d[k].keys())[0])
        if len(answer_d[k].keys()) == 0:
            list_to_print.append(parent)
            list_to_print[-1] += k
        if len(answer_d[k].keys()) > 1:
            printer_func(answer_d[k], list_to_print, k + ".")
    return list_to_print
l = printer_func(answer_d, [], "")
final = " ".join(l)
print(final)
Explanation
base_line makes a dictionary of all your base keys.
recurser_func checks whether each key's value is a list or a dict, then adds to the answer dictionary as necessary until answer_d looks like: {'id': {}, 'name': {}, 'salaries': {'salary': {}}, 'states': {'state': {}, 'cities': {'city': {}}}}
After these two functions are called, you have, in a sense, a dictionary of keys. Then printer_func is a recursive function that prints it as you wanted.
NOTE:
Your question is similar to this one: Get all keys of a nested dictionary. But since you have nested lists and dictionaries instead of just nested dictionaries, the answers there won't work for you. There is more discussion of the topic on that question if you would like more info.
EDIT 1
My Python version is 3.7.1.
I have added a JSON file opener at the top. I assume that the JSON file is named city.json and is in the same directory.
EDIT 2: More thorough explanation
The main difficulty I found in dealing with your data is that lists and dictionaries can be nested to arbitrary depth. Since the nesting can be arbitrarily deep, I knew this was a recursion problem.
So I build a dictionary of dictionaries representing the key structure that you are looking for. First, I start with the baseline.
base_line makes {'id': {}, 'name': {}, 'salaries': {}, 'states': {}}. This is a dictionary of empty dictionaries. I know that, when you print, every key structure (like states.state) starts with one of these words.
Recursion
Then I add all the child keys using recurser_func.
When given a dictionary g, this function loops through all the keys in that dictionary and (assuming answer_d has each key that g has) adds each key's child to answer_d.
If the child is a dictionary, I recurse, with g now being the sub-part of the dictionary that pertains to the child, and answer_d being the sub-part of answer_d that pertains to the child.

Check dictionary's values are included in another dictionary in Python 3

I have two dictionaries like the ones below. I want to check that all of a's values are included in b. The two dictionaries may have different structures, and some of a's keys are not included in b. I would like to know a generic way to do this.
Check list: all of a's values should be included in b.
The expected output is like the text below. I know a[0].name is not valid in Python; this is not literal Python code.
a[0]['name'] in b? => yes, same value
a[0]['vals'][0]['apple'] in b? => yes, but different value
a[0]['vals'][0]['banana'][0]['hoge'] in b? => does not exist
a[0]['vals'][0]['banana'][0]['fuga'] in b? => does not exist
The two dictionaries:
a = [
{
"name":"hoge",
"vals":[
{
"apple":11,
"banana":{
"hoge":1,
"fuga":"aaa"
}
}
]
}
]
b = [
{
"name":"hoge",
"vals":[
{
"apple":21,
"grape":{
"foo":1
}
}
]
}
]
You can implement a dict comparison function as I did below:
def compare_ndic(src, dst, pre=''):
    for skey, sval in src.items():
        # Build the dotted path used in the messages.
        if pre:
            print_skey = pre + '.' + skey
        else:
            print_skey = skey
        if skey not in dst.keys():
            print('Key "{}" in {} does not exist in {}'.format(print_skey, 'src', 'dst'))
        else:
            if isinstance(sval, dict) and isinstance(dst.get(skey), dict):
                # If the value of the same key is still a dict, recurse
                compare_ndic(sval, dst.get(skey), print_skey)
            elif sval == dst.get(skey):
                print('Value of key "{}" in {} is the same with value in {}'.format(print_skey, 'src', 'dst'))
            else:
                print('Value of key "{}" in {} is different with value in {}'.format(print_skey, 'src', 'dst'))
a = {
"name":"hoge",
"vals":
{
"apple":11,
"banana":{
"hoge":1,
"fuga":"aaa"
}
}
}
b = {
"name": "hoge",
"vals":
{
"apple": 11,
"banana": {
"hoge": 2,
"fuga": "aaa",
}
}
}
compare_ndic(a, b)
The output is like this:
Value of key "vals.banana.fuga" in src is the same with value in dst
Value of key "vals.banana.hoge" in src is different with value in dst
Value of key "vals.apple" in src is the same with value in dst
Value of key "name" in src is the same with value in dst
Be careful: my code cannot be used directly for your scenario, because your data contains lists. You can add some conditional statements to iterate over the lists where necessary. I've only provided an idea for comparing two dicts; you will need to adapt it to your own case.
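One possible way to extend the idea to data that contains lists is sketched below. It is not the answer's code: compare_nested is a made-up name, it returns a report instead of printing, and it assumes list elements should be paired by position, which may not suit every data set.

```python
def compare_nested(src, dst, path=""):
    """Return a list of (path, status) pairs describing how src appears in dst."""
    report = []
    if isinstance(src, dict):
        for key, sval in src.items():
            child = f"{path}.{key}" if path else str(key)
            if not isinstance(dst, dict) or key not in dst:
                report.append((child, "missing"))
            else:
                report.extend(compare_nested(sval, dst[key], child))
    elif isinstance(src, list):
        # Assumption: element i of src corresponds to element i of dst.
        for i, sval in enumerate(src):
            child = f"{path}[{i}]"
            if not isinstance(dst, list) or i >= len(dst):
                report.append((child, "missing"))
            else:
                report.extend(compare_nested(sval, dst[i], child))
    else:
        report.append((path, "same" if src == dst else "different"))
    return report

a = [{"name": "hoge", "vals": [{"apple": 11, "banana": {"hoge": 1, "fuga": "aaa"}}]}]
b = [{"name": "hoge", "vals": [{"apple": 21, "grape": {"foo": 1}}]}]
for entry in compare_nested(a, b):
    print(entry)
```

With the question's data this reports name as same, apple as different, and banana as missing. A missing subtree is reported once at its root rather than once per leaf.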
You have a mistake in how you access the data.
a is a list, which is accessed as
a[0]
but a[0] is a dictionary, which is accessed as
a[0]['vals'] # 'vals' is a key stored in dictionary
To get only the keys of the dictionary, you can try
a[0].keys() # gives you result dict_keys(['name', 'vals']) which you can iterate further as you wish
and you can get all elements using
a[0].items() # gives you dict_items([('name', 'hoge'), ('vals', [{'banana': {'hoge': 1, 'fuga': 'aaa'}, 'apple': 21}])])
Moreover, you have used incorrect syntax in your code. Use correct syntax, for example:
a = [{"name": "hoge", "vals": [{"apple": 21, "banana": {"hoge": 1, "fuga": "aaa"}}]}]
b = [{"name": "hoge", "vals": [{ "apple": 21, "grape": {"foo": 1}}]}]
if a[0]['name'] in b[0]['name']:
    print('first match')
    if a[0]['name'] == b[0]['name']:
        print('item exist with same value')
    else:
        print('item exist but not same value')
    for key in a[0]['vals'][0].keys():
        if key in b[0]['vals'][0].keys():
            print('second match with key : ' + str(key))
            if a[0]['vals'][0][str(key)] == b[0]['vals'][0][str(key)]:
                print('match exist with same value for key : ' + str(key))
        else:
            print('match failed for key : ' + str(key))
else:
    print('match failed at 1')

convert a list of delimited strings to a tree/nested dict, using python

I am trying to convert a list of dot-separated strings, e.g.
['one.two.three.four', 'one.six.seven.eight', 'five.nine.ten', 'twelve.zero']
into a tree (nested lists or dicts - anything that is easy to walk through).
The real data happens to have 1 to 4 dot-separated parts of different length and has 2200 records in total.
My actual goal is to fill a set of 4 QComboBoxes with this data, such that the 1st QComboBox is filled with the first-level items ['one', 'five', 'twelve'] (no duplicates). Then, depending on the chosen item, the 2nd QComboBox is filled with its related items: for 'one' it would be ['two', 'six'], and so on, if there's another nested level.
So far I've got a working list -> nested dicts solution, but it's horribly slow, since I use a regular dict(). And I seem to have trouble redesigning it around a defaultdict in a way that makes filling the ComboBoxes easy.
My current code:
def list2tree(m):
    tmp = {}
    for i in range(len(m)):
        if m.count('.') == 0:
            return m
        a = m.split('.', 1)
        try:
            tmp[a[0]].append(list2tree(a[1]))
        except (KeyError, AttributeError):
            tmp[a[0]] = list2tree(a[1])
    return tmp
main_dict = {}
i = 0
for m in methods:
    main_dict = list2tree(m)
    i += 1
    if (i % 100) == 0: print i, len(methods)
print main_dict, i, len(methods)
ls = ['one.two.three.four', 'one.six.seven.eight', 'five.nine.ten', 'twelve.zero']
tree = {}
for item in ls:
    t = tree
    for part in item.split('.'):
        t = t.setdefault(part, {})
Result:
{
"twelve": {
"zero": {}
},
"five": {
"nine": {
"ten": {}
}
},
"one": {
"six": {
"seven": {
"eight": {}
}
},
"two": {
"three": {
"four": {}
}
}
}
}
While this is beyond the scope of the original question, some comments mentioned a form of this algorithm that incorporates values. I came up with this to that end:
def dictionaryafy(self, in_dict):
    tree = {}
    for key, value in in_dict.items():
        t = tree
        parts = key.split(".")
        # Walk (or create) every level except the last one.
        for part in parts[:-1]:
            t = t.setdefault(part, {})
        # Store the value under the final part of the key.
        t[parts[-1]] = value
    return tree
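For the combo-box use case in the question, the nested dict can be walked with the list of items selected so far to get the choices for the next box. The sketch below shows only the lookup, not the Qt wiring; build_tree and options_for are made-up names.

```python
def build_tree(dotted_names):
    """Build nested dicts from dot-separated strings using dict.setdefault."""
    tree = {}
    for name in dotted_names:
        node = tree
        for part in name.split('.'):
            node = node.setdefault(part, {})
    return tree

def options_for(tree, selected):
    """Return the sorted child names under the path of already-selected items."""
    node = tree
    for part in selected:
        node = node[part]
    return sorted(node)

names = ['one.two.three.four', 'one.six.seven.eight', 'five.nine.ten', 'twelve.zero']
tree = build_tree(names)
print(options_for(tree, []))              # ['five', 'one', 'twelve']
print(options_for(tree, ['one']))         # ['six', 'two']
print(options_for(tree, ['one', 'two']))  # ['three']
```

An empty result means the selected path ends at a leaf, so any remaining boxes can stay empty.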
