I have a dictionary of dictionaries like this small example:
dict = {1: {'A': 8520, 'C': 5772, 'T': 7610, 'G': 5518}, 2: {'A': 8900, 'C': 6155, 'T': 6860, 'G': 5505}}
I want to make another dictionary of dictionaries in which, instead of absolute numbers, I have the frequency of every number within its sub-dictionary. For example, for the 1st inner dictionary I would have the following sub-dictionary:
1: {'A': 31.25, 'C': 21, 'T': 27.75, 'G': 20}
here is the expected output:
dict2 = {1: {'A': 31.25, 'C': 21, 'T': 27.75, 'G': 20}, 2: {'A': 32.5, 'C': 22.50, 'T': 25, 'G': 20}}
I am trying to do that in Python using the following code:
dict2 = {}
for item in dict.items():
    freq = item.items/sum(item.items())
    dict2[] = freq
but the result of this code is not what I want. Do you know how to fix it?
What you want is to process the inner dictionaries without modifying the keys of the big one. Outsource the frequency into a function:
def get_frequency(d):
    total = sum(d.values())
    return {key: value / total * 100 for key, value in d.items()}
Then use a dict comprehension to apply the function on all your sub dictionaries:
dict2 = {key: get_frequency(value) for key, value in dict1.items()}
Note that I added a * 100; it appears from your output that you are looking for percentages from 0-100 and not a fraction from 0-1.
Edit:
If you're using Python 2, / on two integers is integer division, so convert to float like so:
    return {key: float(value) / total * 100 for key, value in d.items()}
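As a quick check of the approach above, here is a minimal sketch that applies the same function to the data from the question and rounds for display. The rounding step and the name percentages are my additions, not part of the original answer; the exact figures also come out slightly different from the approximate ones given in the question.

```python
def get_frequency(d):
    """Return each value of d as a percentage of the sum of all values."""
    total = sum(d.values())
    return {key: value / total * 100 for key, value in d.items()}

dict1 = {1: {'A': 8520, 'C': 5772, 'T': 7610, 'G': 5518},
         2: {'A': 8900, 'C': 6155, 'T': 6860, 'G': 5505}}

# Round only for display; keep full precision if the values feed further maths.
percentages = {key: {k: round(v, 2) for k, v in get_frequency(sub).items()}
               for key, sub in dict1.items()}
print(percentages)
```

Each inner dictionary of unrounded percentages sums to 100 (up to floating-point error), which is a handy sanity check.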
You could do the following:
dct = {1: {'A': 8520, 'C': 5772, 'T': 7610, 'G': 5518}, 2: {'A': 8900, 'C': 6155, 'T': 6860, 'G': 5505}}
result = {}
for key, d in dct.items():
    total = sum(d.values())
    result[key] = {k : a / total for k, a in d.items()}
print(result)
Output
{1: {'C': 0.21050328227571116, 'T': 0.2775346462436178, 'G': 0.2012399708242159, 'A': 0.31072210065645517}, 2: {'C': 0.22447118891320203, 'T': 0.25018234865062, 'G': 0.20076586433260393, 'A': 0.32458059810357404}}
Related
I am a beginner in Python trying to create a function that filters my nested dictionary by several key/value pairs given in another dictionary, like
filtered_options = {'a': 5, 'b': 'Cloth'}
For my dictionary
my_dict = {1.0:{'a': 1, 'b': 'Food', 'c': 500, 'd': 'Yams'},
           2.0:{'a': 5, 'b': 'Cloth', 'c': 210, 'd': 'Linen'}}
If I pass my dictionary to the filter function with these options, I should get something that looks like
filtered_dict(my_dict, filtered_options = {'a': 5, 'b': 'Cloth'})
which returns the 2nd entry, along with any other entries in my dictionary that match the same options.
This should do what you want.
def dict_matches(d, filters):
    return all(k in d and d[k] == v for k, v in filters.items())
def filter_dict(d, filters=None):
    filters = filters or {}
    return {k: v for k, v in d.items() if dict_matches(v, filters)}
Here's what happens when you test it:
>>> filters = {'a': 5, 'b': 'Cloth'}
>>> my_dict = {
...     1.0: {'a': 1, 'b': 'Food', 'c': 500, 'd': 'Yams'},
...     2.0: {'a': 5, 'b': 'Cloth', 'c': 210, 'd': 'Linen'}
... }
>>> filter_dict(my_dict, filters)
{2.0: {'b': 'Cloth', 'a': 5, 'd': 'Linen', 'c': 210}}
You can do this:
import operator
from functools import reduce
def multi_level_indexing(nested_dict, key_list):
    """Index a nested dictionary, nested_dict, through a list of keys, key_list.
    """
    return reduce(operator.getitem, key_list, nested_dict)
def filtered_dict(my_dict, filtered_options):
    return {k : v for k, v in my_dict.items() if all(multi_level_indexing(my_dict, [k,f_k]) == f_v for f_k, f_v in filtered_options.items())}
So that:
my_dict = {1.0:{'a': 1, 'b': 'Food', 'c': 500, 'd': 'Yams'},
           2.0:{'a': 5, 'b': 'Cloth', 'c': 210, 'd': 'Linen'}}
will give you:
print(filtered_dict(my_dict, {'a': 5, 'b': 'Cloth'}))
# prints {2.0: {'a': 5, 'b': 'Cloth', 'c': 210, 'd': 'Linen'}}
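One caveat with the reduce(operator.getitem, ...) approach: operator.getitem raises KeyError when a sub-dictionary lacks one of the filter keys (unless an earlier filter already failed and all() short-circuited). A minimal sketch of a tolerant variant follows; filtered_dict_safe and _MISSING are hypothetical names introduced only for this illustration.

```python
_MISSING = object()  # sentinel, so a stored None can still be matched explicitly

def filtered_dict_safe(my_dict, filtered_options):
    """Keep entries whose sub-dict matches every key/value in filtered_options."""
    return {
        k: v for k, v in my_dict.items()
        # .get with a sentinel treats a missing key as "no match" instead of raising
        if all(v.get(f_k, _MISSING) == f_v for f_k, f_v in filtered_options.items())
    }

my_dict = {1.0: {'a': 5, 'c': 500, 'd': 'Yams'},            # no 'b' key here
           2.0: {'a': 5, 'b': 'Cloth', 'c': 210, 'd': 'Linen'}}
result = filtered_dict_safe(my_dict, {'a': 5, 'b': 'Cloth'})
print(result)
```

Entries with a missing key are simply filtered out, which is probably what a filter is expected to do.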
I have two dictionaries. One is a nested dictionary and the other is a flat dictionary. I want to do some divisions:
dict1 = {'document1': {'a': 3, 'b': 1, 'c': 5}, 'document2': {'d': 2, 'e': 4}}
dict2 = {'document1': 28, 'document2': 36}
I want to divide the inner dictionary values from dict1 by the value of the matching document in dict2. The expected output would be:
dict3 = {'document1': {'a': 3/28, 'b': 1/28, 'c': 5/28}, 'document2': {'d': 2/36, 'e': 4/36}}
I tried using two for loops to run through each dictionary, but the values get duplicated multiple times and I have no idea how to fix this. Does anyone have an idea of how to achieve this? I would appreciate it!
You can achieve this using a dictionary comprehension.
dict3 = {} # create a new dictionary
# iterate over the dict1 keys, looking up the matching value in dict2, which is used to divide the dict1 values
for d in dict1:
    y = dict2[d]
    dict3[d] = {k:(v/y) for k, v in dict1[d].items() }
You can try the following code:
dict1 = {'document1': {'a': 3, 'b': 1, 'c': 5},
         'document2': {'d': 2, 'e': 4}}
dict2 = {'document1': 28, 'document2': 36}
# note: this overwrites the values of dict1 in place
for k,v in dict1.items():
    for ki,vi in v.items():
        dict1[k][ki] /= dict2[k]
print(dict1)
# output
#{'document1': {'a': 0.10714285714285714, 'b': 0.03571428571428571, 'c': 0.17857142857142858},
#'document2': {'d': 0.05555555555555555, 'e': 0.1111111111111111}}
In one line, using nested dictionary comprehensions:
dict3 = {doc_key: {k: (v/doc_value) for k, v in dict1[doc_key].items()} for doc_key, doc_value in dict2.items()}
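If the exact ratios from the expected output (3/28, 1/28, ...) matter more than floats, the standard-library fractions module can keep them exact. This is only a sketch of a swapped-in technique, not what the answers above do. It assumes every document in dict1 has a non-zero total in dict2; otherwise a KeyError or ZeroDivisionError would be raised.

```python
from fractions import Fraction

dict1 = {'document1': {'a': 3, 'b': 1, 'c': 5}, 'document2': {'d': 2, 'e': 4}}
dict2 = {'document1': 28, 'document2': 36}

# Fraction(v, total) stores the exact ratio and reduces it automatically
dict3 = {doc: {k: Fraction(v, dict2[doc]) for k, v in inner.items()}
         for doc, inner in dict1.items()}

print(dict3['document1']['a'])         # exact ratio
print(float(dict3['document1']['a']))  # float, if needed later
```

Note that fractions are reduced, so 4/36 is stored as 1/9.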
Having a dict like:
x = {
'1': {'a': 1, 'b': 3},
'2': {'a': 2, 'b': 4}
}
I'd like to have a new key total with the sum of each key in the subdictionaries, like:
x['total'] = {'a': 3, 'b': 7}
I've tried adapting the answer from this question but found no success.
Could someone shed some light on this?
Assuming all the values of x are dictionaries, you can iterate over their items to compose your new dictionary.
from collections import defaultdict
x = {
'1': {'a': 1, 'b': 3},
'2': {'a': 2, 'b': 4}
}
total = defaultdict(int)
for d in x.values():
    for k, v in d.items():
        total[k] += v
print(total)
# defaultdict(<class 'int'>, {'a': 3, 'b': 7})
A variation of Patrick's answer, using collections.Counter and just update, since the sub-dicts are already in the proper format:
from collections import Counter
x = {
'1': {'a': 1, 'b': 3},
'2': {'a': 2, 'b': 4}
}
total = Counter()
for d in x.values():
    total.update(d)
print(total)
result:
Counter({'b': 7, 'a': 3})
(update works differently for Counter, it doesn't overwrite the keys but adds to the current value, that's one of the subtle differences with defaultdict(int))
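The difference in update behaviour described above can be seen side by side in a small sketch. The variable names first, second, counted and plain are made up for the illustration.

```python
from collections import Counter, defaultdict

first = {'a': 1, 'b': 3}
second = {'a': 2, 'b': 4}

counted = Counter()
counted.update(first)
counted.update(second)   # Counter.update adds to the existing counts

plain = defaultdict(int)
plain.update(first)
plain.update(second)     # dict.update (inherited by defaultdict) overwrites

print(dict(counted))
print(dict(plain))
```

So with defaultdict(int) the explicit += loop is needed, while Counter can rely on update alone.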
You can use a dictionary comprehension:
x = {'1': {'a': 1, 'b': 3}, '2': {'a': 2, 'b': 4}}
full_sub_keys = {i for b in map(dict.keys, x.values()) for i in b}
x['total'] = {i:sum(b.get(i, 0) for b in x.values()) for i in full_sub_keys}
Output:
{'1': {'a': 1, 'b': 3}, '2': {'a': 2, 'b': 4}, 'total': {'b': 7, 'a': 3}}
from collections import defaultdict
dictionary = defaultdict(int)
x = {
'1': {'a': 1, 'b': 3},
'2': {'a': 2, 'b': 4}
}
for _, numbers in x.items():
    for key, num in numbers.items():
        dictionary[key] += num
x['total'] = {key: value for key, value in dictionary.items()}
print(x)
We can create a defaultdict and iterate through each of the key, value pairs in the nested dictionary, summing up the total for each key. That makes a evaluate to 3 and b evaluate to 7. After we increment the values, a simple dictionary comprehension creates another nested dictionary for the totals, with a and b as the keys and their sums as the values. Here is your output:
{'1': {'a': 1, 'b': 3}, '2': {'a': 2, 'b': 4}, 'total': {'a': 3, 'b': 7}}
I have output from Python NetworkX code:
flow_value, flow_dict = nx.maximum_flow(T, 'O', 'T')
print(flow_dict)
# Output as follows:
#{'O': {'A': 4, 'B': 6, 'C': 4}, 'A': {'B': 1, 'D': 3}, 'B': {'C': 0, 'E': 3,'D': 4}, 'C': {'E': 4}, 'E': {'D': 1, 'T': 6}, 'D': {'T': 8}, 'T': {}}
I want to extract all the data in a form that looks like:
#('O','A',4),('O','B',6),('O','C',4),('A','B',1),......,('D','T',8)
Is there a way to traverse the nested dict and get the data I need?
I tried this and it works. It does some type checking to capture only strings:
def retrieve_all_strings_from_dict(nested_dict, keys_to_ignore=None):
    values = []
    if not keys_to_ignore:
        keys_to_ignore = []
    elif isinstance(keys_to_ignore, str):
        # the original called an undefined to_list helper; wrap a single key in a list
        keys_to_ignore = [keys_to_ignore]
    else:
        keys_to_ignore = list(keys_to_ignore)
    if not isinstance(nested_dict, dict):
        return values
    # dict_stack grows while we iterate over it, so nested dicts get visited too
    dict_stack = [nested_dict]
    for dict_var in dict_stack:
        data_list = [v for k, v in dict_var.items() if isinstance(v, str) and k not in keys_to_ignore]
        additional_dicts = [v for k, v in dict_var.items() if isinstance(v, dict)]
        dict_stack.extend(additional_dicts)
        values.extend(data_list)
    return values
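For the specific two-level structure in the question (source node, then target node, then flow), a nested list comprehension may be all that is needed. The sketch below uses the flow_dict literal copied from the question rather than calling NetworkX, and the name edges is an assumption of mine.

```python
flow_dict = {'O': {'A': 4, 'B': 6, 'C': 4}, 'A': {'B': 1, 'D': 3},
             'B': {'C': 0, 'E': 3, 'D': 4}, 'C': {'E': 4},
             'E': {'D': 1, 'T': 6}, 'D': {'T': 8}, 'T': {}}

# One (source, target, flow) tuple per inner entry; empty inner dicts add nothing
edges = [(u, v, flow)
         for u, targets in flow_dict.items()
         for v, flow in targets.items()]
print(edges)
```

Adding a condition such as `if flow > 0` at the end of the comprehension would drop the zero-flow edges, if those are not wanted.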
I have a multidictionary:
{'a': {'b': {'c': {'d': '2'}}},
'b': {'b': {'c': {'d': '7'}}},
'c': {'b': {'c': {'d': '3'}}},
'f': {'d': {'c': {'d': '1'}}}}
How can I sort it based on the values '2', '3', '7', '1',
so that my output will be:
f.d.c.d.1
a.b.c.d.2
c.b.c.d.3
b.b.c.d.7
You've got a fixed-shape structure, which is pretty simple to sort:
>>> d = {'a': {'b': {'c': {'d': '2'}}}, 'c': {'b': {'c': {'d': '3'}}}, 'b': {'b': {'c': {'d': '7'}}}, 'f': {'d': {'c': {'d': '1'}}}}
>>> sorted(d, key=lambda x: d[x].values()[0].values()[0].values()[0])
['f', 'a', 'c', 'b']
>>> sorted(d.items(), key=lambda x: x[1].values()[0].values()[0].values()[0])
[('f', {'d': {'c': {'d': '1'}}}),
('a', {'b': {'c': {'d': '2'}}}),
('c', {'b': {'c': {'d': '3'}}}),
('b', {'b': {'c': {'d': '7'}}})]
Yes, this is a bit ugly and clumsy, but only because your structure is inherently ugly and clumsy.
In fact, other than the fact that d['f'] has a key 'd' instead of 'b', it's even more straightforward. I suspect that may be a typo, in which case things are even easier:
>>> d = {'a': {'b': {'c': {'d': '2'}}}, 'c': {'b': {'c': {'d': '3'}}}, 'b': {'b': {'c': {'d': '7'}}}, 'f': {'b': {'c': {'d': '1'}}}}
>>> sorted(d.items(), key=lambda x:x[1]['b']['c']['d'])
[('f', {'b': {'c': {'d': '1'}}}),
('a', {'b': {'c': {'d': '2'}}}),
('c', {'b': {'c': {'d': '3'}}}),
('b', {'b': {'c': {'d': '7'}}})]
As others have pointed out, this is almost certainly not the right data structure for whatever it is you're trying to do. But, if it is, this is how to deal with it.
PS, it's confusing to call this a "multidictionary". That term usually means "dictionary with potentially multiple values per key" (a concept which in Python you'd probably implement as a defaultdict with list or set as its default). A single, single-valued dictionary that happens to contain dictionaries is better named a "nested dictionary".
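A note for Python 3 users: dict.values() returns a view there, which cannot be indexed, so the values()[0] chains above only work on Python 2. One possible adaptation is sketched below. The helper name only_leaf is hypothetical, and it assumes every level is a non-empty dictionary with a single entry, as in the question.

```python
def only_leaf(node):
    """Follow single-entry dicts downwards until a non-dict leaf is reached."""
    while isinstance(node, dict):
        node = next(iter(node.values()))  # first (here: only) value at this level
    return node

d = {'a': {'b': {'c': {'d': '2'}}}, 'c': {'b': {'c': {'d': '3'}}},
     'b': {'b': {'c': {'d': '7'}}}, 'f': {'d': {'c': {'d': '1'}}}}

ordered_keys = sorted(d, key=lambda k: only_leaf(d[k]))
print(ordered_keys)
```

This also sidesteps the 'd' versus 'b' key difference, because the helper never looks at the key names.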
In my opinion this kind of design is very hard to read and maintain. Can you consider replacing the internal dictionaries with string-names?
E.g.:
mydict = {
    'a.b.c.d' : 2,
    'b.b.c.d' : 7,
    'c.b.c.d' : 3,
    'f.d.c.d' : 1,
}
This one is much easier to sort and waaaay more readable.
Now, a dictionary is not something you sort in place. Instead, you sort a representation of it, e.g. a list of its items:
my_sorted_dict_as_list = sorted(mydict.items(),
                                key=lambda kv_pair: kv_pair[1])
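If the nested form already exists, it can be converted to the dotted-key form suggested above. Here is a minimal sketch, assuming all keys and leaves are strings as in the question; flatten is a hypothetical helper name, and sorting by int(...) is my assumption that the leaves should be compared as numbers.

```python
def flatten(nested, prefix=''):
    """Yield (dotted_path, leaf) pairs for every leaf of a nested dict."""
    for key, value in nested.items():
        path = prefix + '.' + key if prefix else key
        if isinstance(value, dict):
            yield from flatten(value, path)   # descend, carrying the path so far
        else:
            yield path, value

nested = {'a': {'b': {'c': {'d': '2'}}},
          'b': {'b': {'c': {'d': '7'}}},
          'c': {'b': {'c': {'d': '3'}}},
          'f': {'d': {'c': {'d': '1'}}}}

mydict = dict(flatten(nested))
ordered = sorted(mydict.items(), key=lambda kv: int(kv[1]))
for path, value in ordered:
    print(path + '.' + value)
```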
You can do it recursively:
d = {'a': {'b': {'c': {'d': '2'}}}, 'c': {'b': {'c': {'d': '3'}}}, 'b': {'b': {'c': {'d': '7'}}}, 'f': {'d': {'c': {'d': '1'}}}}
def nested_to_string(item):
    if hasattr(item, 'items'):
        out = ''
        for key in item.keys():
            out += '%s.' % key + nested_to_string(item[key])
        return out
    else:
        return item + '\n'
print(nested_to_string(d))
or
def nested_to_string(item):
    def rec_fun(item, temp, res):
        if hasattr(item, 'items'):
            for key in item.keys():
                # pass the extended prefix down instead of mutating temp
                rec_fun(item[key], temp + '%s.' % key, res)
        else:
            res.append(temp + item)
    res = []
    rec_fun(item, '', res)
    return res
Why do you want to do this?
Your data structure is basically a multi-level tree, so a good way to do what you want is a depth-first traversal of it, which can be done recursively, followed by a little massaging of the intermediate results to sort them and put them into the desired format.
multidict = {'a': {'b': {'c': {'d': '2'}}},
             'b': {'b': {'c': {'d': '7'}}},
             'c': {'b': {'c': {'d': '3'}}},
             'f': {'d': {'c': {'d': '1'}}}}
def nested_dict_to_string(nested_dict):
    chains = []
    for key,value in nested_dict.items():
        chains.append([key] + visit(value))
    chains = ['.'.join(chain) for chain in sorted(chains, key=lambda chain: chain[-1])]
    return '\n'.join(chains)
def visit(node):
    result = []
    try:
        for key,value in node.items():
            result += [key] + visit(value)
    except AttributeError:
        result = [node]
    return result
print(nested_dict_to_string(multidict))
Output:
f.d.c.d.1
a.b.c.d.2
c.b.c.d.3
b.b.c.d.7
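One thing to watch with the sort key chain[-1] used above: the leaves are strings, so they are compared character by character. That gives the expected order for single digits, but not once multi-digit values appear. A small sketch, assuming the leaves are always integer strings (the list leaves is made up for the illustration):

```python
leaves = ['2', '10', '7', '1']

as_text = sorted(leaves)              # string comparison: '10' sorts before '2'
as_numbers = sorted(leaves, key=int)  # numeric comparison

print(as_text)
print(as_numbers)
```

In the answer above this would mean sorting with key=lambda chain: int(chain[-1]).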