I have a list of dictionaries l. Each of them is a simple one-level dictionary with the same keys a, b, c, d.
Now I want to build a nested dictionary from l in this shape (i is a member of l):
{
    i['a']: {
        i['b']: {
            i['c']: {
                i['d']: some_value,
            }
        }
    }
}
Right now I'm using this snippet:
tmp = {}
for i in l:
    if i['a'] not in tmp:
        tmp[i['a']] = {}
    if i['b'] not in tmp[i['a']]:
        tmp[i['a']][i['b']] = {}
    if i['c'] not in tmp[i['a']][i['b']]:
        tmp[i['a']][i['b']][i['c']] = {}
    tmp[i['a']][i['b']][i['c']][i['d']] = some_value
Is that the most efficient way if the original list is huge?
I would nest several collections.defaultdict objects to avoid the key tests, so the lookup-or-create step runs in native code instead of slower Python code. If a key doesn't exist, a default dictionary is created, except at the deepest level, which can just be a plain dict:
import collections
tmp = collections.defaultdict(lambda: collections.defaultdict(lambda: collections.defaultdict(dict)))
for i in l:
    tmp[i['a']][i['b']][i['c']][i['d']] = some_value
You can shorten the definition by aliasing the object name:
dd = collections.defaultdict
tmp = dd(lambda : dd(lambda : dd(dict)))
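Here is a minimal sketch of the nested defaultdict approach on made-up data. The rows in l, the stand-in some_value and the helper to_plain_dict are assumptions for illustration, not part of the question. The helper converts the result back to ordinary dicts, which may be useful because a defaultdict silently creates entries on lookup:

```python
import collections

dd = collections.defaultdict

# Hypothetical sample rows; the real list would be much larger.
l = [
    {'a': 1, 'b': 2, 'c': 3, 'd': 4},
    {'a': 1, 'b': 2, 'c': 5, 'd': 6},
    {'a': 7, 'b': 8, 'c': 9, 'd': 10},
]
some_value = True  # stand-in for whatever value is stored

tmp = dd(lambda: dd(lambda: dd(dict)))
for i in l:
    tmp[i['a']][i['b']][i['c']][i['d']] = some_value


def to_plain_dict(d):
    """Recursively convert nested defaultdicts into plain dicts."""
    if isinstance(d, dict):
        return {k: to_plain_dict(v) for k, v in d.items()}
    return d


result = to_plain_dict(tmp)
print(result)
```

Whether the conversion step is worth it depends on how the result is used afterwards; if nothing ever looks up missing keys, it can be skipped.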
I'm trying to sort a list of objects in Python into a dictionary of lists, keyed by a property of the objects in the original list.
I've done it below, but this feels like something I should be able to do with a dictionary comprehension?
for position in totals["positions"]:
    if not hasattr(totals["positions_dict"], position.get_asset_type_display()):
        totals["positions_dict"][position.get_asset_type_display()] = []
    totals["positions_dict"][position.get_asset_type_display()].append(position)
Some self-improvements (hasattr checks attributes, not dictionary keys, so the membership test now uses in):
totals["positions_dict"] = {}
for position in totals["positions"]:
    key = position.get_asset_type_display()
    if key not in totals["positions_dict"]:
        totals["positions_dict"][key] = []
    totals["positions_dict"][key].append(position)
You could use itertools.groupby and operator.methodcaller in a dict comprehension:
from operator import methodcaller
from itertools import groupby
key = methodcaller('get_asset_type_display')
totals["positions_dict"] = {k: list(g) for k, g in groupby(sorted(totals["positions"], key=key), key=key)}
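The sorted() call matters here, because groupby only groups adjacent items with equal keys. A small sketch of the difference, using a hypothetical Position class as a stand-in for the real objects (the class and the sample names are assumptions):

```python
from itertools import groupby
from operator import methodcaller


class Position:
    """Hypothetical stand-in for the objects in totals["positions"]."""

    def __init__(self, name, asset_type):
        self.name = name
        self.asset_type = asset_type

    def get_asset_type_display(self):
        return self.asset_type


positions = [
    Position("p1", "Stock"),
    Position("p2", "Bond"),
    Position("p3", "Stock"),
]
key = methodcaller("get_asset_type_display")

# Without sorting, the two "Stock" runs are separate groups, and the
# later one overwrites the earlier one in the dict comprehension.
unsorted_groups = {k: [p.name for p in g] for k, g in groupby(positions, key=key)}

# Sorting by the same key first makes equal keys adjacent.
sorted_groups = {
    k: [p.name for p in g]
    for k, g in groupby(sorted(positions, key=key), key=key)
}

print(unsorted_groups)
print(sorted_groups)
```

The sort costs O(n log n), so for large inputs a single pass with a dict of lists is usually the cheaper option.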
Using a defaultdict, as suggested by @Jean-FrançoisFabre, lets you do it in one loop with a single call to get_asset_type_display() per position:
from collections import defaultdict
totals["positions_dict"] = defaultdict(list)
for position in totals["positions"]:
    totals["positions_dict"][position.get_asset_type_display()].append(position)
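A self-contained sketch of the defaultdict version, assuming a hypothetical Position class in place of the real objects. The final conversion to a plain dict is optional; it is shown because a defaultdict would otherwise add an empty list whenever an unknown key is looked up:

```python
from collections import defaultdict


class Position:
    """Hypothetical stand-in for the real position objects."""

    def __init__(self, name, asset_type):
        self.name = name
        self.asset_type = asset_type

    def get_asset_type_display(self):
        return self.asset_type


totals = {
    "positions": [
        Position("p1", "Stock"),
        Position("p2", "Bond"),
        Position("p3", "Stock"),
    ]
}

grouped = defaultdict(list)
for position in totals["positions"]:
    grouped[position.get_asset_type_display()].append(position)

# Convert to a plain dict so later lookups of unknown keys raise KeyError
# instead of silently adding empty lists.
totals["positions_dict"] = dict(grouped)

names = {k: [p.name for p in v] for k, v in totals["positions_dict"].items()}
print(names)
```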
Haven't tested this, because I don't have your data.
And I think it's rather ugly, but it just may work:
totals['positions_dict'] = {
    key: [
        position
        for position in totals['positions']
        if position.get_asset_type_display() == key
    ]
    for key in {
        position.get_asset_type_display()
        for position in totals['positions']
    }
}
But I would prefer something very simple, and avoid needless lookups / calls:
positions = totals['positions']
positions_dict = {}
for position in positions:
    key = position.get_asset_type_display()
    if key in positions_dict:
        positions_dict[key].append(position)
    else:
        positions_dict[key] = [position]
totals['positions_dict'] = positions_dict
I have a list of master keys and a list of lists of lists, where the first value of each inner list (like 'key_01') should be a sub key for the corresponding values (like 'val_01', 'val_02'). The data is shown here:
master_keys = ["Master_01", "Master_02", "Master_03"]
data_long = [[['key_01','val_01','val_02'],['key_02','val_03','val_04'], ['key_03','val_05','val_06']],
[['key_04','val_07','val_08'], ['key_05','val_09','val_10'], ['key_06','val_11','val_12']],
[['key_07','val_13','val_14'], ['key_08','val_15','val_16'], ['key_09','val_17','val_18']]]
I would like these lists to be combined into a dictionary of dictionaries, like this:
master_dic = {
    "Master_01": {'key_01': ['val_01','val_02'], 'key_02': ['val_03','val_04'], 'key_03': ['val_05','val_06']},
    "Master_02": {'key_04': ['val_07','val_08'], 'key_05': ['val_09','val_10'], 'key_06': ['val_11','val_12']},
    "Master_03": {'key_07': ['val_13','val_14'], 'key_08': ['val_15','val_16'], 'key_09': ['val_17','val_18']}
}
What I've got so far is the sub dict:
master_dic = {}
servant_dic = {}
keys = []
values = []
for line in data_long:
    for item in line:
        keys.extend(item[:1])
        values.append(item[1:])
# zip works on both Python 2 and 3; itertools.izip exists only on Python 2
servant_dic = dict(zip(keys, values))
This outputs a dictionary, as expected:
servant_dic = {
    'key_06': ['val_11','val_12'], 'key_04': ['val_07','val_08'], 'key_05': ['val_09','val_10'],
    'key_02': ['val_03','val_04'], 'key_03': ['val_05','val_06'], 'key_01': ['val_01','val_02']
}
The problem is that, to add the master_keys to this dictionary and get the desired result, I'd have to do it in a certain order. That would be possible if each line had a counter, like this:
enumerated_dic = {
    0: {'key_01': ['val_01','val_02'], 'key_02': ['val_03','val_04'], 'key_03': ['val_05','val_06']},
    1: {'key_04': ['val_07','val_08'], 'key_05': ['val_09','val_10'], 'key_06': ['val_11','val_12']},
    2: {'key_07': ['val_13','val_14'], 'key_08': ['val_15','val_16'], 'key_09': ['val_17','val_18']}
}
I'd love to do this with enumerate() while each line of the servant_dic is built, but I can't figure out how. Afterwards, I could simply replace the counters 0, 1, 2 etc. with the master_keys.
Thanks for your help.
master_keys = ["Master_01", "Master_02", "Master_03"]
data_long = [[['key_01','val_01','val_02'],['key_02','val_03','val_04'], ['key_03','val_05','val_06']],
[['key_04','val_07','val_08'], ['key_05','val_09','val_10'], ['key_06','val_11','val_12']],
[['key_07','val_13','val_14'], ['key_08','val_15','val_16'], ['key_09','val_17','val_18']]]
_dict = {}
for master_key, item in zip(master_keys, data_long):
    # the first element of each inner list is the sub key, the rest are its values
    _dict[master_key] = {x[0]: x[1:] for x in item}
print(_dict)
Hope this will help:
{master_key: {i[0]: i[1:] for i in subkeys} for master_key, subkeys in zip(master_keys, data_long)}
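For reference, a runnable sketch of the zip-based comprehension on the data from the question, plus the enumerate() variant the question asks about. The variable names group and item are my own choices:

```python
master_keys = ["Master_01", "Master_02", "Master_03"]
data_long = [
    [['key_01', 'val_01', 'val_02'], ['key_02', 'val_03', 'val_04'], ['key_03', 'val_05', 'val_06']],
    [['key_04', 'val_07', 'val_08'], ['key_05', 'val_09', 'val_10'], ['key_06', 'val_11', 'val_12']],
    [['key_07', 'val_13', 'val_14'], ['key_08', 'val_15', 'val_16'], ['key_09', 'val_17', 'val_18']],
]

# Pair each master key with its group; the first element of every inner
# list becomes the sub key and the remaining elements become the value.
master_dic = {
    master_key: {item[0]: item[1:] for item in group}
    for master_key, group in zip(master_keys, data_long)
}

# The same idea keyed by position, as in the enumerate() idea from the question.
enumerated_dic = {
    index: {item[0]: item[1:] for item in group}
    for index, group in enumerate(data_long)
}

print(master_dic)
print(enumerated_dic)
```

With zip there is no need for the intermediate counters, since the master keys and the groups are paired directly by position.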
My functional approach:
master_dic = dict(zip(master_keys, [{k[0]: k[1:] for k in emb_list} for emb_list in data_long]))
print(master_dic)
You can also use pop and a dict comprehension (note that pop removes the key from each inner list, so data_long is modified in place):
for key, elements in zip(master_keys, data_long):
    print({key: {el.pop(0): el for el in elements}})
{'Master_01': {'key_01': ['val_01', 'val_02'], 'key_02': ['val_03', 'val_04'], 'key_03': ['val_05', 'val_06']}}
{'Master_02': {'key_06': ['val_11', 'val_12'], 'key_04': ['val_07', 'val_08'], 'key_05': ['val_09', 'val_10']}}
{'Master_03': {'key_07': ['val_13', 'val_14'], 'key_08': ['val_15', 'val_16'], 'key_09': ['val_17', 'val_18']}}
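If the original lists must stay intact, one possible alternative to pop is extended unpacking (Python 3 only). This is a sketch on a shortened, made-up copy of the data, not the method from the answer above:

```python
master_keys = ["Master_01"]
data = [
    [['key_01', 'val_01', 'val_02'], ['key_02', 'val_03', 'val_04']],
]

# "sub_key, *values" splits each inner list into its first element and
# a new list holding the rest, without touching the original list.
result = {
    master: {sub_key: values for sub_key, *values in elements}
    for master, elements in zip(master_keys, data)
}

print(result)
print(data[0][0])  # the inner list still contains its key
```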
I have the following dictionary (short version, real data is much larger):
dict = {'C-STD-B&M-SUM:-1': 0, 'C-STD-B&M-SUM:-10': 4.520475, 'H-NSW-BAC-ART:-9': 0.33784000000000003, 'H-NSW-BAC-ART:0': 0, 'H-NSW-BAC-ENG:-59': 0.020309999999999998, 'H-NSW-BAC-ENG:-6': 0,}
I want to divide it into smaller nested dictionaries, depending on a part of the key name.
Expected output would be:
dict1 = {'C-STD-B&M-SUM': {'-1': 0, '-10': 4.520475}}
dict2 = {'H-NSW-BAC-ART': {'-9': 0.33784000000000003, '0': 0}}
dict3 = {'H-NSW-BAC-ENG': {'-59': 0.020309999999999998, '-6': 0}}
The logic behind this is:
dict1: if the part of the key name is 'C-STD-B&M-SUM', add to dict1.
dict2: if the part of the key name is 'H-NSW-BAC-ART', add to dict2.
dict3: if the part of the key name is 'H-NSW-BAC-ENG', add to dict3.
Partial code so far:
def divide_dictionaries(dict):
    c_std_bem_sum = {}
    for k, v in dict.items():
        if k[0:13] == 'C-STD-B&M-SUM':
            c_std_bem_sum = k[14:17], v
What I'm trying to do is to create the nested dictionaries that I need and then I'll create the dictionary and add the nested one to it, but I'm not sure if it's a good way to do it.
When I run the code above, the variable c_std_bem_sum becomes a tuple, with only two values that are changed at each iteration. How can I make it be a dictionary, so I can later create another dictionary, and use this one as the value for one of the keys?
One way to approach it would be to do something like
d = {'C-STD-B&M-SUM:-1': 0, 'C-STD-B&M-SUM:-10': 4.520475, 'H-NSW-BAC-ART:-9': 0.33784000000000003, 'H-NSW-BAC-ART:0': 0, 'H-NSW-BAC-ENG:-59': 0.020309999999999998, 'H-NSW-BAC-ENG:-6': 0,}
def divide_dictionaries(somedict):
    out = {}
    for k, v in somedict.items():
        head, tail = k.split(":")
        subdict = out.setdefault(head, {})
        subdict[tail] = v
    return out
which gives
>>> dnew = divide_dictionaries(d)
>>> import pprint
>>> pprint.pprint(dnew)
{'C-STD-B&M-SUM': {'-1': 0, '-10': 4.520475},
 'H-NSW-BAC-ART': {'-9': 0.33784000000000003, '0': 0},
 'H-NSW-BAC-ENG': {'-59': 0.020309999999999998, '-6': 0}}
A few notes:
(1) We're using nested dictionaries instead of creating separate named dictionaries, which aren't convenient.
(2) We used setdefault, which is a handy way to say "give me the value in the dictionary, but if there isn't one, add this to the dictionary and return it instead". Saves an if.
(3) We can use .split(":") instead of hardcoding the width, which isn't very robust -- at least assuming that's the delimiter, anyway!
(4) It's a bad idea to use dict, the name of a builtin type, as a variable name.
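As a variation on notes (2) and (3), here is a sketch that swaps setdefault for collections.defaultdict and split for str.partition, so that a key containing a second colon does not raise an unpacking error. The sep parameter and the extra test keys are assumptions of mine, not part of the question:

```python
from collections import defaultdict


def divide_dictionaries(somedict, sep=":"):
    """Group values by the part of the key before the first separator."""
    out = defaultdict(dict)
    for k, v in somedict.items():
        # partition splits at the first separator only; if the separator
        # is missing, tail is the empty string.
        head, _, tail = k.partition(sep)
        out[head][tail] = v
    return dict(out)


d = {
    'C-STD-B&M-SUM:-1': 0,
    'C-STD-B&M-SUM:-10': 4.520475,
    'H-NSW-BAC-ART:-9': 0.33784000000000003,
    'H-NSW-BAC-ART:0': 0,
}
dnew = divide_dictionaries(d)
print(dnew)

# Hypothetical edge cases: an extra colon and a key with no colon at all.
edge = divide_dictionaries({'x:y:z': 1, 'plain': 2})
print(edge)
```

Whether an empty-string sub key is the right behaviour for keys without a delimiter depends on the data; raising an error may be preferable.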
That's because you initialize c_std_bem_sum as a dictionary and then overwrite it with a tuple. Comma-separated values on the right-hand side of an assignment create a tuple:
>>> a = 1, 2
>>> print(a)
(1, 2)
Now for your example:
def divide_dictionaries(dict):
    c_std_bem_sum = {}
    for k, v in dict.items():
        if k[0:13] == 'C-STD-B&M-SUM':
            new_key = k[14:17]  # sure you don't want [14:], open ended?
            c_std_bem_sum[new_key] = v
    return c_std_bem_sum
Basically, this takes the rest of the key (or three characters, as you have it; [14:None] or [14:] would take the rest of the string) and uses it as the new key for the dict.
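To illustrate why the fixed-width slice can be a problem, here is a small sketch with a made-up key whose suffix is longer than three characters (the '-100' suffix is hypothetical and does not appear in the question's data):

```python
prefix = 'C-STD-B&M-SUM'
key = prefix + ':-100'  # hypothetical key with a four-character suffix

fixed = key[14:17]             # fixed width: silently drops the last character
open_ended = key[14:]          # open ended: keeps the whole suffix
via_len = key[len(prefix) + 1:]  # same, without hardcoding the offset

print(fixed)
print(open_ended)
print(via_len)
```

With the fixed-width slice, '-10' and '-100' would end up under the same new key, and one value would overwrite the other.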
I am initializing a dictionary of empty lists. Is there a more efficient method of initializing the dictionary than the following?
dictLists = {}
dictLists['xcg'] = []
dictLists['bsd'] = []
dictLists['ghf'] = []
dictLists['cda'] = []
...
Is there a way I do not have to write dictLists each time, or is this unavoidable?
You can use collections.defaultdict. It lets you set a factory function that supplies the value for any missing key.
import collections
a = collections.defaultdict(list)
Edit:
Here are my keys
b = ['a', 'b','c','d','e']
Now here is me using the "predefined keys"
for key in b:
    a[key].append(33)
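One thing to be aware of with this approach: indexing a defaultdict with a missing key creates that key, while .get() and in do not. A short sketch (the keys 'z' and 'y' are made up for the demonstration):

```python
import collections

a = collections.defaultdict(list)
for key in ['a', 'b', 'c', 'd', 'e']:
    a[key].append(33)

# Indexing a missing key calls the factory and stores the result.
missing = a['z']
created = 'z' in a

# .get() and "in" do not trigger the factory.
fetched = a.get('y')
still_absent = 'y' not in a

print(missing, created, fetched, still_absent)
```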
If the keys are known in advance, you can do
dictLists = dict((key, []) for key in ["xcg", "bsd", ...])
Or, in Python >=2.7:
dictLists = {key: [] for key in ["xcg", "bsd", ...]}
If you are adding static keys, why not just put them directly in the dict constructor?
dictLists = dict(xcg=[], bsd=[], ghf=[])
or, if your keys are not always valid Python identifiers:
dictLists = {'xcg': [], 'bsd': [], 'ghf': []}
If, on the other hand, your keys are not static, or are easily stored in a sequence, looping over the keys could be more efficient if there are a lot of them. The following example still types out the keys, making it no more efficient than the methods above, but if the keys were generated in some way, this could be more efficient:
keys = ['xcg', 'bsd', 'ghf', …]
dictLists = {key: [] for key in keys}
You can also use a for loop for the keys to add:
for i in ("xcg", "bsd", "ghf", "cda"):
    dictLists[i] = []
In the versions of Python that support dictionary comprehensions (Python 2.7+):
dictLists = {listKey: list() for listKey in listKeys}
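A related shortcut that looks tempting but behaves differently is dict.fromkeys with a list as the default: every key then refers to the same list object. This sketch contrasts it with the comprehension (the variable names shared and separate are mine):

```python
keys = ['xcg', 'bsd', 'ghf', 'cda']

# dict.fromkeys reuses the single list object for every key.
shared = dict.fromkeys(keys, [])
shared['xcg'].append(1)

# The comprehension evaluates [] once per key, giving independent lists.
separate = {key: [] for key in keys}
separate['xcg'].append(1)

print(shared)
print(separate)
```

So for mutable values such as lists, the comprehension (or a defaultdict) is the safer choice.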