I had a task to flatten a nested dict, which was easy. This is my code for that:
class Simple:
    def __init__(self):
        self.store_data = {}
    def extract_data(self, config):
        for key in config:
            if isinstance(config[key], dict):
                self.extract_data(config[key])
            else:
                # use the key itself; {key} would be a set, which is unhashable
                self.store_data[key] = config[key]
        return self.store_data
This was my input:
input = {
'k1_lv1': {
'k1_lv2': 'v1_lv2', 'k2_lv2': 'v2_lv2'},
'k2_lv1': 'v1_lv1',
'k3_lv1': {
'k1_lv2': 'v1_lv2', 'k2_lv2': 'v2_vl2'},
'k4_lv1': 'v1_lv1',
}
and this was my output (imagine that the keys are unique):
output = {
'k1_lv2': 'v1_lv2', 'k2_lv2': 'v2_lv2',
'k2_lv1': 'v1_lv1',
'k1_lv2': 'v1_lv2', 'k2_lv2': 'v2_vl2',
'k4_lv1': 'v1_lv1'
}
but now my task has been changed and my output has to become like this:
output = {
'k1_lv1_k1_lv2': 'v1_lv2',
'k1_lv1_k2_lv2': 'v2_lv2',
'k2_lv1': 'v1_lv1',
'k3_lv1_k1_lv2': 'v1_lv2',
'k3_lv1_k2_lv2': 'v2_vl2',
'k4_lv1': 'v1_lv1'
}
So I have to not only flatten the nested dict, but also keep the keys of the parent dicts as prefixes of the flattened keys.
I tried to achieve that output, but I am failing.
You can use recursion for the task:
dct = {
"k1_lv1": {"k1_lv2": "v1_lv2", "k2_lv2": "v2_lv2"},
"k2_lv1": "v1_lv1",
"k3_lv1": {"k1_lv2": "v1_lv2", "k2_lv2": "v2_vl2"},
"k4_lv1": "v1_lv1",
}
def flatten(d, path=""):
    if isinstance(d, dict):
        for k, v in d.items():
            yield from flatten(v, (path + "_" + k).strip("_"))
    else:
        yield (path, d)
out = dict(flatten(dct))
print(out)
Prints:
{
"k1_lv1_k1_lv2": "v1_lv2",
"k1_lv1_k2_lv2": "v2_lv2",
"k2_lv1": "v1_lv1",
"k3_lv1_k1_lv2": "v1_lv2",
"k3_lv1_k2_lv2": "v2_vl2",
"k4_lv1": "v1_lv1",
}
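One caveat of the strip("_") call above is that keys which themselves begin or end with an underscore lose those underscores. A minimal sketch that avoids this, assuming all keys are strings; the function name flatten_keys and the sep parameter are made up for illustration:

```python
def flatten_keys(d, parent_key="", sep="_"):
    """Flatten a nested dict, joining parent and child keys with `sep`."""
    items = {}
    for key, value in d.items():
        # Only add the separator when there actually is a parent key.
        new_key = f"{parent_key}{sep}{key}" if parent_key else key
        if isinstance(value, dict):
            # Recurse and merge the flattened children into the result.
            items.update(flatten_keys(value, new_key, sep))
        else:
            items[new_key] = value
    return items


nested = {"k1_lv1": {"k1_lv2": "v1_lv2", "_private": 1}, "k2_lv1": "v1_lv1"}
flat = flatten_keys(nested)
print(flat)
```

Like the generator version, this silently drops empty nested dicts, which may or may not be what you want.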
Why don't you loop through the keys using input.keys() and then combine the keys using
output['{}_{}'.format(key_level1, key_level2)] = input[key_level1][key_level2]
You might need nested for loops and a condition that tests whether the value at each key is itself a dictionary.
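The loop-based idea can be sketched as follows for exactly two levels of nesting. The variable names data, key_level1 and key_level2 are illustrative, and deeper nesting would need more loops or recursion:

```python
data = {
    "k1_lv1": {"k1_lv2": "v1_lv2", "k2_lv2": "v2_lv2"},
    "k2_lv1": "v1_lv1",
}

output = {}
for key_level1, value_level1 in data.items():
    if isinstance(value_level1, dict):
        # Nested dict: prefix every inner key with the outer key.
        for key_level2, value_level2 in value_level1.items():
            output["{}_{}".format(key_level1, key_level2)] = value_level2
    else:
        # Plain value: keep the key unchanged.
        output[key_level1] = value_level1

print(output)
```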
I have a nested dictionary that looks like below:
d= {"key1":"A", "key2":"B", "score1":0.1, "score2":0.4, "depth":0,
"chain":[
{"key1":"A1", "key2":"B1", "score1":0.2, "score2":0.5, "depth":1,
"chain":[{"key1":"A11", "key2":"B11","score1":0.3, "score2":0.6, "depth":2},
{"key1":"A12", "key2":"B12","score1":0.5, "score2":0.7, "depth":2}]
},
{"key1":"A2", "key2":"B2","score1":0.1, "score2":0.2,"depth":1,
"chain":[None,
{"key1":"A22", "key2":"B22","score1":0.1, "score2":0.5, "depth":2}]
}
]
}
I want to create a function that when I call fun(key1, d), it could return me a dictionary keeping the original hierarchy, but within each level, it will return the value of key1, and sum up the value of score1 and score2, like below:
{"A":0.5, "depth":0,
"chain":[
{"A1":0.7, "depth":1,
"chain":[{"A11":0.9,"depth":2},
{"A12":1.3, "depth":2}]
},
{"A2":0.3,"depth":1,
"chain":[None,
{"A22":0.6, "depth":2}]
}
]
}
How can I do this?
I have tried
def gen_dict_extract(key, input_dic):
    return {input_dic[key]:input_dic["score1"]+input_dic["score2"],
            "depth":input_dic["depth"],
            "chain": gen_dict_extract(key,input_dic["chain"])}
There are two problems with the solution you've tried:
chain is not guaranteed to be present and
chain is a list of dictionaries and you are treating it as a single dictionary
Hopefully the following does what you want it to do:
def gen_dict_extract(key, input_dic):
    rv = {
        input_dic[key]: input_dic["score1"] + input_dic["score2"],
        "depth": input_dic["depth"],
    }
    if "chain" in input_dic:
        rv["chain"] = [gen_dict_extract(key, x) for x in input_dic["chain"]]
    return rv
Since I have some None entries in the "chain" lists, the following function worked in the end; it is a slight update of the solution @dvk provided:
def gen_dict_extract(key, input_dic):
    rv = {
        input_dic[key]: input_dic["score1"] + input_dic["score2"],
        "depth": input_dic["depth"],
    }
    if "chain" in input_dic:
        # keep None placeholders in position; recurse into real entries
        rv["chain"] = [
            None if x is None else gen_dict_extract(key, x)
            for x in input_dic["chain"]
        ]
    return rv
I made a successful attempt at the TestDome.com Fileowners problem and wanted to see if anyone had suggestions to simplify my answer. The online IDE uses Python 3.5.1. If you are trying to do the problem yourself and are just looking for an answer, here's one. I am by no means a Python expert, so this took me quite a while to produce, with lots of tinkering. Any comments at all would be helpful, even if they are about syntax or general cleanliness. Thanks!
Implement a group_by_owners function that:
Accepts a dictionary containing the file owner name for each file name.
Returns a dictionary containing a list of file names for each owner name, in any order.
For example, for dictionary {'Input.txt': 'Randy', 'Code.py': 'Stan', 'Output.txt': 'Randy'} the group_by_owners function should return {'Randy': ['Input.txt', 'Output.txt'], 'Stan': ['Code.py']}.
class FileOwners:
    @staticmethod
    def group_by_owners(files):
        val = (list(files.values()))  # get values from dict
        val = set(val)  # make values a set to remove duplicates
        val = list(val)  # make set a list so we can work with it
        keyst = (list(files.keys()))  # get keys from dict
        result = {}  # create empty dict for output
        for i in range(len(val)):  # loop over values (owners)
            for j in range(len(keyst)):  # loop over keys (files)
                if val[i] == list(files.values())[j]:  # pick out files for the current owner
                    dummylist = [keyst[j]]  # wrap the file name in a list so it fits the output format
                    if val[i] in result:  # owner already in the output: extend the existing entry
                        result[val[i]].append(keyst[j])  # add the new file
                    else:  # owner NOT yet in the output: make a new entry
                        result[val[i]] = dummylist  # make a new entry
        return result
files = {
'Input.txt': 'Randy',
'Code.py': 'Stan',
'Output.txt': 'Randy'
}
print(FileOwners.group_by_owners(files))
Output:
{'Stan': ['Code.py'], 'Randy': ['Output.txt', 'Input.txt']}
Holy moly, that's a lot of code for something so simple:
def group_by_owners(files):
    result = {}
    for file, owner in files.items():  # use files.iteritems() on Python 2.x
        result[owner] = result.get(owner, []) + [file]  # you can use setdefault(), too
    return result
files = {
'Input.txt': 'Randy',
'Code.py': 'Stan',
'Output.txt': 'Randy'
}
print(group_by_owners(files))
# {'Stan': ['Code.py'], 'Randy': ['Output.txt', 'Input.txt']}
You can simplify it even further by using collections.defaultdict for result and initializing all its keys to list - then you don't even need to do the acrobatics of creating a new list if it's not already there before appending to it.
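For illustration, here is a minimal sketch of the two alternatives mentioned above, dict.setdefault() and collections.defaultdict. The function names group_with_setdefault and group_with_defaultdict are made up for this example:

```python
from collections import defaultdict


def group_with_setdefault(files):
    result = {}
    for file_name, owner in files.items():
        # setdefault returns the existing list, or stores and returns a new one.
        result.setdefault(owner, []).append(file_name)
    return result


def group_with_defaultdict(files):
    # Missing keys are created on first access as empty lists.
    result = defaultdict(list)
    for file_name, owner in files.items():
        result[owner].append(file_name)
    return dict(result)  # convert back to a plain dict for the caller


files = {
    'Input.txt': 'Randy',
    'Code.py': 'Stan',
    'Output.txt': 'Randy'
}
print(group_with_setdefault(files))
print(group_with_defaultdict(files))
```

Both variants append in place, so they avoid building a new list for every file the way result.get(owner, []) + [file] does.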
I personally found the upvoted answer hard to understand and some others a bit bulky. Here is my version:
def group_by_owners(files):
    ownerdict = {}
    for key, value in files.items():
        if value in ownerdict:
            ownerdict[value].append(key)
        else:
            ownerdict[value] = [key]
    return ownerdict
files = {
'Input.txt': 'Randy',
'Code.py': 'Stan',
'Output.txt': 'Randy'
}
print(group_by_owners(files))
This worked for me. I believe it is a simpler and more efficient solution.
# Input dictionary.
files = {
'Input.txt': 'Randy',
'Code.py': 'Stan',
'Output.txt': 'Randy'
}
# Function to group the files.
def group_by_owners(files):
    # Dictionary object to hold the result.
    result = {}
    for key, value in files.items():
        if value not in result.keys():
            # Insert the values into the resulting dictionary
            # by interchanging the key, values.
            result[value] = [key]
        else:
            # Append the other file name if the owner is the same.
            result[value].append(key)
    return result
print(group_by_owners(files))
Here is the output:
{'Randy': ['Input.txt', 'Output.txt'], 'Stan': ['Code.py']}
I just started learning and I am a noob at Python too, but I used 2.7, so please excuse the missing parentheses on the prints.
I like your idea of a dummy list and append. Here is something similar that may be a little cleaner:
files = {
'Input.txt': 'Randy',
'Code.py': 'Stan',
'Output.txt': 'Randy'
}
owners=[]
filenames=files.keys()
for key in files:
    if files[key] not in owners:
        owners.append(files[key])
print owners
print filenames
result={}
for i in owners:
    resultvalues=[]
    for key in files:
        if files[key]==i:
            resultvalues.append(key)
    print resultvalues
    result[i]=resultvalues
print result
Here is an easily understandable solution (dict1 is the input dictionary):
owner = []
for j in dict1.values():
    if j not in owner:
        owner.append(j)
results = {}
for i in owner:
    result = []
    for j in dict1.keys():
        if dict1.get(j) == i:
            result.append(j)
    results[i] = result
print(results)
def group_by_owners(files):
    values = []
    dic = {}
    for k,v in files.items():
        values.append(v)
    s_values = set(values)
    l_values = list(s_values)
    for i in l_values:
        keys=[]
        for k,v in files.items():
            if v == i:
                keys.append(k)
        dic[i]=keys
    return dic
You all seem better at this than I am. It took me 3.5 hours and a lot of trial and error to get an answer, as I am new to Python and not good at focused thinking. Thanks to @zwer for teaching me something new.
dic={'apple':['green','red'], 'power': ['voltage'],'banana': ['green','red'],'current':['voltage'],'grass':['green'],'tiger':['lion']}
i = 0
adic = {}
while i < len(dic):
    key = list(dic)[i]
    for key2 in list(dic):
        if key2 != key:
            j = 0
            while j < len(dic[key]):
                if dic[key][j] in dic[key2]:
                    color = dic[key][j]
                    if dic[key][j] in adic:
                        if key not in adic[color]:
                            adic[dic[key][j]].append(key)
                    else:
                        mdic = {dic[key][j]: [key]}
                        adic.update(mdic)
                else:
                    if dic[key][j] not in adic:
                        mdic = {dic[key][j]: [key]}
                        adic.update(mdic)
                j = j + 1
    i = i + 1
print(adic)
class FileOwners:
    @staticmethod
    def group_by_owners(files):
        temp=[]
        for i in files:
            if files[i] not in temp:
                temp.append(files[i])
        result={}
        for j in temp:
            temp1=[]
            for i in files:
                if files[i]==j:
                    temp1.append(i)
            result[j]=temp1
        return result
files = {'Input.txt': 'Randy',
'Code.py': 'Stan',
'Output.txt': 'Randy'}
print(FileOwners.group_by_owners(files))
You have to loop over the dictionary inside another loop over the same dictionary, compare the values, and append the matching keys to a new list. Collect the output in a new dictionary.
class FileOwners:
    @staticmethod
    def group_by_owners(files):
        new_dic = {}
        for key, val in files.items():
            key_list = []
            for k, v in files.items():
                if v == val:
                    key_list.append(k)
            # store the list under the owner from the outer loop
            new_dic[val] = key_list
        return new_dic
files = {
'Input.txt': 'Randy',
'Code.py': 'Stan',
'Output.txt': 'Randy'
}
print(FileOwners.group_by_owners(files))
I made a dict of owners and their associated lists of file names. To do that, I first created a dictionary with the values of the passed files dictionary as keys and empty lists as values. Then I looped through files and appended each file name to its owner's list.
def group_by_owners(files):
    return_dict = {}
    owner_files = files.keys()
    for f in owner_files:
        return_dict[files[f]] = []
    for f, n in files.items():
        return_dict[n].append(f)
    return return_dict
I started learning Python yesterday and came across this question on testdome. Here's my answer :
def group_by_owners(files):
    f = files.values()
    s = set(f)
    newdict={}
    fileOwners = list(s)
    for owner in fileOwners:
        l=[]
        for file in files:
            if owner == files[file]:
                l.append(file)
        newdict[owner]=l
    return newdict
files = {
'Input.txt': 'Randy',
'Code.py': 'Stan',
'Output.txt': 'Randy'
}
print(group_by_owners(files))
This code is somewhat complex to understand, but it is one of the possible solutions.
files={
'input.txt':'randy',
'output.txt':'randy',
'code.py':'stan'
}
k=files.keys()
newfiles={value:key for key,value in files.items()}
emptylist=[]
for key in k:
    if key not in newfiles.values():
        missfile=key
        valuetoadd=files[missfile]
        emptylist.append(newfiles[valuetoadd])
        emptylist.append(missfile)
        newfiles[valuetoadd]=emptylist
print(newfiles)
Here is my answer with defaultdict:
from collections import defaultdict
def group_by_owners(files):
    # create the defaultdict per call so results do not leak between calls
    d = defaultdict(list)
    for k,v in files.items():
        d[v].append(k)
    return dict(d)
if __name__ == "__main__":
    files = {
        'Input.txt': 'Randy',
        'Code.py': 'Stan',
        'Output.txt': 'Randy'
    }
    print(group_by_owners(files))
I am trying to skim through a dictionary that contains asymmetrical data and make a list of unique headings. Aside from the normal key:value items, the data within the dictionary also includes other dictionaries, lists, lists of dictionaries, NoneTypes, and so on at various levels throughout. I would like to be able to keep the hierarchy of keys/indexes if possible. This will be used to assess the scope of the data and its availability. The data comes from a JSON file and its contents are subject to change.
My latest attempt is to do this through a series of type checks within a function, skim(), as seen below.
def skim(obj, header='', level=0):
    if obj is None:
        return
    def skim_iterable(iterable):
        lvl = level +1
        if isinstance(iterable, (list, tuple)):
            for value in iterable:
                h = ':'.join([header, iterable.index(value)])
                return skim(value, header=h, level=lvl)
        elif isinstance(iterable, dict):
            for key, value in iterable.items():
                h = ':'.join([header, key])
                return skim(value, header=h, level=lvl)
    if isinstance(obj, (int, float, str, bool)):
        return ':'.join([header, obj, level])
    elif isinstance(obj, (list, dict, tuple)):
        return skim_iterable(obj)
The intent is to make recursive calls to skim() until the key or list index position at the deepest level is reached and then returned. skim() has an inner function that handles iterable objects; it carries the level, along with the key or list index position, forward through each nested iterable object.
An example below
test = {"level_0Item_1": {
"level_1Item_1": {
"level_2Item_1": "value",
"level_2Item_2": "value"
},
"level_1Item_2": {
"level_2Item_1": "value",
"level_2Item_2": {}
}},
"level_0Item_2": [
{
"level_1Item_1": "value",
"level_1Item_2": 569028742
}
],
"level_0Item_3": []
}
collection = [skim(test)]
Right now I'm getting a return of [None] on the above code and would like some help troubleshooting or guidance on how best to approach this. What I was expecting is something like this:
['level_0Item_1:level_1Item_1:level_2Item_1',
'level_0Item_1:level_1Item_1:level_2Item_2',
'level_0Item_1:level_1Item_2:level_2Item_1',
'level_0Item_1:level_1Item_2:level_2Item_2',
'level_0Item_2:level_1Item_1',
'level_0Item_2:level_1Item_2',
'level_0Item_3']
Among other resources, I recently came across this question (python JSON complex objects (accounting for subclassing)) and read it and its included references. Full disclosure: I've only begun coding recently.
Thank you for your help.
You can try something like:
def skim(obj, connector=':', level=0, builded_str= ''):
    if isinstance(obj, dict):
        for k, v in obj.items():
            if isinstance(v, dict) and v:
                yield from skim(v, connector, level + 1, builded_str + k + connector)
            elif isinstance(v, list) and v:
                yield from skim(v[0], connector, level + 1, builded_str + k + connector)
            else:
                yield builded_str + k
    else:
        yield builded_str
Test:
test = {"level_0Item_1": {
"level_1Item_1": {
"level_2Item_1": "value",
"level_2Item_2": "value"
},
"level_1Item_2": {
"level_2Item_1": "value",
"level_2Item_2": {}
}},
"level_0Item_2": [
{
"level_1Item_1": "value",
"level_1Item_2": 569028742
}
],
"level_0Item_3": []
}
lst = list(skim(test))
print(lst)
['level_0Item_1:level_1Item_2:level_2Item_1',
'level_0Item_1:level_1Item_2:level_2Item_2',
'level_0Item_1:level_1Item_1:level_2Item_1',
'level_0Item_1:level_1Item_1:level_2Item_2',
'level_0Item_2:level_1Item_2',
'level_0Item_2:level_1Item_1',
'level_0Item_3']
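The generator above only looks at the first element of each list. If list index positions should also appear in the paths, as the question asks, a variant could be sketched like this. The name skim_paths and its parameters are made up for illustration, and the output format (indexes joined with the same separator) is an assumption:

```python
def skim_paths(obj, prefix="", sep=":"):
    """Yield one path per leaf; empty containers yield the path leading to them."""
    if isinstance(obj, dict) and obj:
        children = obj.items()
    elif isinstance(obj, (list, tuple)) and obj:
        # enumerate() gives the index position of every element
        children = enumerate(obj)
    else:
        # scalar, None, or empty container: the path ends here
        yield prefix
        return
    for key, value in children:
        path = f"{prefix}{sep}{key}" if prefix else str(key)
        yield from skim_paths(value, path, sep)


sample = {"a": {"b": 1, "c": {}}, "d": [{"e": 2}, 3], "f": []}
paths = list(skim_paths(sample))
print(paths)
```

Using enumerate() instead of list.index() also avoids reporting the wrong position when a list contains equal values.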
For some third-party APIs, a large amount of data needs to be sent in the API parameters, and the input data comes to our application in CSV format.
I receive all the rows of the CSV, containing around 120 columns, in a plain dict format from csv.DictReader.
file_data_obj = csv.DictReader(open(file_path, 'rU'))
This gives me each row in following format:
CSV_PARAMS = {
'param7': "Param name",
'param6': ["some name"],
'param5': 1234,
'param4': 999999999,
'param3': "some ",
'param2': {"x name":"y_value"},
'param1': None,
'paramA': "",
'paramZ': 2.687
}
And there is one nested dictionary containing all the third-party API parameters as keys with blank value.
eg. API_PARAMS = {
"param1": "",
"param2": "",
"param3": "",
"paramAZ": {
"paramA": "",
"paramZ": {"test1":1234, "name":{"hello":1}},
...
},
"param67": {
"param6": "",
"param7": ""
},
...
}
I have to map all the CSV values to API parameters dynamically. The following code works, but only up to 3 levels of nesting.
def update_nested_params(self, paramdict, inpdict, result={}):
    """Iterate over nested dictionary up to level 3 """
    for k, v in paramdict.items():
        if isinstance(v, dict):
            for k1, v1 in v.items():
                if isinstance(v1, dict):
                    for k2, _ in v1.items():
                        result.update({k:{k1:{k2: inpdict.get(k2, '')}}})
                else:
                    result.update({k:{k1: inpdict.get(k1, '')}})
        else:
            result.update({k: inpdict.get(k, '')})
    return result
self.update_nested_params(API_PARAMS, CSV_PARAMS)
Is there any other efficient way to achieve this for n number of nestings of the API Parameters?
You could use recursion:
def update_nested_params(self, template, source):
    result = {}
    for key, value in template.items():
        if key in source:
            result[key] = source[key]
        elif not isinstance(value, dict):
            # assume the template value is a default
            result[key] = value
        else:
            # recurse
            result[key] = self.update_nested_params(value, source)
    return result
This copies the 'template' (API_PARAMS) recursively, taking any key it finds from source if available, and recurses if not but the value in template is another dictionary. This handles nesting up to sys.getrecursionlimit() levels (default 1000).
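As a quick illustration of that behaviour, here is the same logic as a standalone function with a small made-up template and CSV row. The name fill_template and the sample data are assumptions for the sketch, not part of the original application:

```python
def fill_template(template, source):
    """Copy `template`, replacing any key found in the flat `source` dict."""
    result = {}
    for key, value in template.items():
        if key in source:
            result[key] = source[key]
        elif isinstance(value, dict):
            # nested template section: recurse with the same flat source
            result[key] = fill_template(value, source)
        else:
            # key is missing from source: keep the template value as a default
            result[key] = value
    return result


api_params = {
    "param1": "",
    "paramAZ": {"paramA": "", "nested": {"paramZ": ""}},
    "param67": {"param6": "", "param7": ""},
}
csv_row = {
    "param1": None,
    "paramA": "",
    "paramZ": 2.687,
    "param6": ["some name"],
    "param7": "Param name",
}
filled = fill_template(api_params, csv_row)
print(filled)
```

Note that the template itself is not modified; a new nested dictionary is built on every call.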
Alternatively, use an explicit stack:
# extra import to add at the module top
from collections import deque
def update_nested_params(self, template, source):
    top_level = {}
    stack = deque([(top_level, template)])
    while stack:
        result, template = stack.pop()
        for key, value in template.items():
            if key in source:
                result[key] = source[key]
            elif not isinstance(value, dict):
                # assume the template value is a default
                result[key] = value
            else:
                # push nested dict into the stack
                result[key] = {}
                stack.append((result[key], value))
    return top_level
This essentially just moves the call stack used in recursion to an explicit stack. The order in which keys are processed changes (all keys of a dictionary are handled before any of its nested dictionaries is descended into), but this doesn't matter for your specific problem.
I'm using Yahoo Placemaker API which gives different structure of json depending on input.
Simple json file looks like this:
{
    'document':{
        'itemDetails':{
            'id':'0',
            'prop1':'1',
            'prop2':'2'
        },
        'other':{
            'propA':'A',
            'propB':'B'
        }
    }
}
When I want to access itemDetails I simply write json_file['document']['itemDetails'].
But when I get more complicated response, such as
{
    'document':{
        '1':{
            'itemDetails':{
                'id':'1',
                'prop1':'1',
                'prop2':'2'
            }
        },
        '0':{
            'itemDetails':{
                'id':'0',
                'prop1':'1',
                'prop2':'2'
            }
        },
        '2':{
            'itemDetails':{
                'id':'1',
                'prop1':'1',
                'prop2':'2'
            }
        },
        'other':{
            'propA':'A',
            'propB':'B'
        }
    }
}
the solution obviously does not work.
I use id, prop1 and prop2 to create objects.
What would be the best approach to automatically access itemDetails in the second case without writing json_file['document']['0']['itemDetails'] ?
If I understand correctly, you want to loop through all of json_file['document']['0']['itemDetails'], json_file['document']['1']['itemDetails'], ...
If that's the case, then:
item_details = {}
for key, value in json_file['document'].items():
    item_details[key] = value['itemDetails']
Or, a one-liner:
item_details = {k: v['itemDetails'] for k, v in json_file['document'].items()}
Then, you would access them as item_details['0'], item_details['1'], ...
Note: If you prefer integer keys (0, 1, ...) over string keys ('0', '1', ...), use int(key) or int(k) when building the dict.
Edit:
If you want to access both cases seamlessly (whether there is one result or many), you could check:
if 'itemDetails' in json_file['document']:
    item_details = {'0': json_file['document']['itemDetails']}
else:
    item_details = {k: v['itemDetails'] for k, v in json_file['document'].items() if k != 'other'}
Then loop through the item_details dict.
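Putting the two cases together, a minimal sketch could look like the following. The helper name collect_item_details and the sample responses are made up for illustration, and the real Placemaker response may contain other keys that need handling:

```python
def collect_item_details(json_file):
    """Return {key: itemDetails} for both the single and the multi-result shape."""
    document = json_file["document"]
    if "itemDetails" in document:
        # single result: wrap it so callers can always iterate over a dict
        return {"0": document["itemDetails"]}
    # multiple results: keep only entries that actually carry itemDetails
    return {
        key: value["itemDetails"]
        for key, value in document.items()
        if isinstance(value, dict) and "itemDetails" in value
    }


single = {"document": {"itemDetails": {"id": "0"}, "other": {"propA": "A"}}}
many = {
    "document": {
        "0": {"itemDetails": {"id": "0"}},
        "1": {"itemDetails": {"id": "1"}},
        "other": {"propA": "A"},
    }
}

for key, details in collect_item_details(many).items():
    print(key, details["id"])
```

Filtering on the presence of 'itemDetails' rather than on the key name 'other' is a design choice here; it skips any entry that has no details instead of raising a KeyError.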