I have the following list, which can contain multiple dictionaries of different sizes.
The keys in each dictionary are unique, but one key may exist in different dictionaries. Values are unique across dictionaries.
I want to trim down my dictionaries so that each key/value pair is kept only in the dictionary where that key has its highest value among all dictionaries.
For example, the key '1258' exists in three of the four dictionaries, and it has the highest value only in the last one, so in the reconstructed list, this key and its value will be in the last dictionary only.
If a key doesn't exist in any other dictionary, it stays in the dictionary it belongs to.
Here is sample data:
[{'1258': 1.0167004,
'160': 1.5989301000000002,
'1620': 1.3058813000000002,
'2571': 0.7914598,
'26': 4.554409,
'2943': 0.5072369,
'2951': 0.4955711,
'2952': 1.2380746000000002,
'2953': 1.6159719,
'2958': 0.4340355,
'2959': 0.6026906,
'2978': 0.619001,
'2985': 1.5677016,
'3075': 1.04948,
'3222': 0.9721148000000001,
'3388': 1.680108,
'341': 0.8871856,
'3443': 0.6000103,
'361': 2.6682623000000003,
'4': 5.227341,
'601': 2.2614983999999994,
'605': 0.6303175999999999,
'9': 5.0326675},
{'1457': 5.625237999999999,
'1469': 25.45585200000001,
'1470': 25.45585200000001,
'160': 0.395728,
'1620': 0.420267,
'2571': 0.449151,
'26': 0.278281,
'601': 0.384822,
'605': 5.746278700000001,
'9': 1.487241},
{'1258': 0.27440200000000003,
'1457': 0.8723639999999999,
'1620': 0.182567,
'2571': 0.197134,
'2943': 0.3461654,
'2951': 0.47372800000000004,
'2952': 0.6662919999999999,
'2953': 0.6725458,
'2958': 0.4437159,
'2959': 0.690856,
'2985': 0.8106226999999999,
'3075': 0.352618,
'3222': 0.7866500000000001,
'3388': 0.760664,
'3443': 0.129771,
'601': 0.345448,
'605': 1.909823,
'9': 0.888999},
{'1258': 1.0853083,
'160': 0.622579,
'1620': 0.7419095,
'2571': 0.9828758,
'2943': 2.254124,
'2951': 0.6294688,
'2952': 1.0965362,
'2953': 1.8409954000000002,
'2958': 0.7394122999999999,
'2959': 0.9398920000000001,
'2978': 0.672122,
'2985': 1.2385512999999997,
'3075': 0.912366,
'3222': 0.8364904,
'3388': 0.37316499999999997,
'341': 1.0399186,
'3443': 0.547093,
'361': 0.3313275,
'601': 0.5318834,
'605': 0.2909876}]
Here's one approach. I shortened your example to one that's easier to reason about.
>>> dcts = [
... {1:2, 3:4, 5:6},
... {1:3, 6:7, 8:9},
... {6:10, 8:11, 9:12}]
>>>
>>> [{k: v for k, v in d.items() if v == max(x[k] for x in dcts if k in x)} for d in dcts]
[{3: 4, 5: 6}, {1: 3}, {6: 10, 8: 11, 9: 12}]
Edit: this version is more efficient because the max is computed only once for each key:
>>> from operator import or_
>>> from functools import reduce
>>> allkeys = reduce(or_, (d.keys() for d in dcts))
>>> max_vals = {k: max(d[k] for d in dcts if k in d) for k in allkeys}
>>> result = [{k: v for k, v in d.items() if v == max_vals[k]} for d in dcts]
>>> result
[{3: 4, 5: 6}, {1: 3}, {6: 10, 8: 11, 9: 12}]
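A further variation, offered only as a sketch: record in a single pass which dictionary holds the largest value for each key, then rebuild the list. It assumes, as the question states, that values for a key never tie across dictionaries. The names keep_max_only and best are made up for this illustration.

```python
def keep_max_only(dcts):
    """Return new dicts in which each key survives only where its value is largest."""
    # best maps key -> (largest value seen so far, index of the dict holding it)
    best = {}
    for i, d in enumerate(dcts):
        for k, v in d.items():
            if k not in best or v > best[k][0]:
                best[k] = (v, i)
    # Rebuild each dict, keeping a key only in the dict that owns its maximum.
    return [{k: v for k, v in d.items() if best[k][1] == i}
            for i, d in enumerate(dcts)]


dcts = [{1: 2, 3: 4, 5: 6}, {1: 3, 6: 7, 8: 9}, {6: 10, 8: 11, 9: 12}]
result = keep_max_only(dcts)
print(result)
```

Because ownership is tracked by index rather than by comparing values, a key ends up in exactly one dictionary even if two dictionaries happened to share the same maximum (the first one seen would win).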
Problem
I have two dictionaries: a and b
a={'e':[1,2]}
b={'d':[11,22]}
How can I convert them to a dictionary of dictionaries that contains all possible combinations of the lists [1,2] and [11,22]? The expected result should be:
dic={1:{'e':1,'d':11},
2:{'e':2,'d':22},
3:{'e':1,'d':11},
4:{'e':2,'d':22}}
My attempt:
I can easily get the combinations of two lists or any set of lists using itertools like so:
l=list(itertools.product([1,2],[11,22]))
But I don't know how to proceed from here. Any suggestions?
You almost have it. You just have to create dictionaries by mapping the keys to the tuples in l.
keys = list(a.keys()) + list(b.keys())
dic = {k: dict(zip(keys, tpl)) for k, tpl in enumerate(l, 1)}
Here's a bit more generalized approach:
(i) First combine the dictionaries:
combined = {**a, **b}
(ii) Find the Cartesian product of the list items in combined.values().
(iii) Iterate over outcome from (ii) and create dictionaries with each tuple and the keys in combined.keys():
dic = {k: dict(zip(combined.keys(), tpl)) for k, tpl in enumerate(itertools.product(*combined.values()), 1)}
Output:
{1: {'e': 1, 'd': 11}, 2: {'e': 1, 'd': 22}, 3: {'e': 2, 'd': 11}, 4: {'e': 2, 'd': 22}}
Given l as you calculated it you can use dict comprehension and achieve the required output -
{idx+1 : {'e': item[0], 'd': item[1]} for idx, item in enumerate(l)}
(BTW there is no need to convert the itertools.product result to list)
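The approaches above can be wrapped into one helper that accepts any number of input dictionaries. This is only a sketch; the function name numbered_combinations is invented here, and it assumes every value is a list and that no key is repeated across the inputs.

```python
import itertools


def numbered_combinations(*dicts):
    """Map 1, 2, 3, ... to one dict per combination of the list values."""
    merged = {}
    for d in dicts:
        merged.update(d)
    keys = list(merged)
    # enumerate consumes itertools.product lazily, so no list() call is needed.
    return {i: dict(zip(keys, combo))
            for i, combo in enumerate(itertools.product(*merged.values()), 1)}


a = {'e': [1, 2]}
b = {'d': [11, 22]}
dic = numbered_combinations(a, b)
print(dic)
```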
I have a complex dictionary:
l = {10: [{'a':1, 'T':'y'}, {'a':2, 'T':'n'}], 20: [{'a':3,'T':'n'}]}
When I try to iterate over the dictionary, I don't get a dictionary whose values are lists of dictionaries; I get tuples, like so:
for m in l.items():
    print(m)
(10, [{'a': 1, 'T': 'y'}, {'a': 2, 'T': 'n'}])
(20, [{'a': 3, 'T': 'n'}])
But when I just print l I get my original dictionary:
In [7]: l
Out[7]: {10: [{'a': 1, 'T': 'y'}, {'a': 2, 'T': 'n'}], 20: [{'a': 3, 'T': 'n'}]}
How do I iterate over the dictionary? I still need the keys and to process each dictionary in the value list.
There are two questions here. First, you ask why each item is turned into a tuple: that is simply what the .items() method on dictionaries yields, one (key, value) tuple per pair.
Knowing this, you can then decide how to use this information. You can choose to unpack the tuple into its two parts during iteration:
for k, v in l.items():
    # Now k holds the key and v holds the value (here, a list of dictionaries),
    # so you can either use the value directly
    print(v[0])
    # or look it up again through the key
    value = l[k]
    print(value[0])
    # Both print the same thing
With a dictionary, you can unpack each item into two variables while iterating over it:
for key, value in l.items():
    print(key, value)
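Since each value in the question's dictionary is itself a list of dictionaries, a second loop is needed to reach the inner records. The following is a minimal sketch of that nesting; the names records, record and rows are illustrative only.

```python
l = {10: [{'a': 1, 'T': 'y'}, {'a': 2, 'T': 'n'}], 20: [{'a': 3, 'T': 'n'}]}

rows = []
for key, records in l.items():   # key is 10 or 20, records is the list
    for record in records:       # each record is one inner dictionary
        rows.append((key, record['a'], record['T']))

print(rows)
```

Collecting flat (outer key, a, T) tuples like this keeps the outer key available while each inner dictionary is processed.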
I often rely on pprint when processing a nested object, to see at a glance what structure I am dealing with.
from pprint import pprint
l = {10: [{'a':1, 'T':'y'}, {'a':2, 'T':'n'}], 20: [{'a':3,'T':'n'}]}
pprint(l, indent=4, width=40)
Output:
{ 10: [ {'T': 'y', 'a': 1},
{'T': 'n', 'a': 2}],
20: [{'T': 'n', 'a': 3}]}
Others have already answered with implementations.
Thanks for all the help. I did figure out how to process this. Here is the implementation I came up with:
for m in l.items():
    k, v = m
    print(f"key: {k}, val: {v}")
    for n in v:
        print(f"key: {n['a']}, val: {n['T']}")
Thanks for everyone's help!
I have a dictionary of values:
dic = {1: "a1+b+c", 2: "a1+c+v", 3: "a1+z+e", 4: "a2+p+a", 5: "a2+z+v", 6: "a3+q+v", ...}
I have a page in Flask, that has checkboxes for each partial string value in a dictionary, e.g. checkboxes "a", "b", "c",... etc. On the page, the checkboxes are located in groups a1, a2, a3, etc.
I need to filter the dictionary by the partial values based on the values of the selected checkboxes, for example, when selecting "c" in group a1, it would return:
1: a1+b+c
2: a1+c+v
When selecting "z" from group a2, it would return:
5: "a2+z+v"
The code that generates an error is:
sol = [k for k in dic if 'a1' in k]
Can someone point me in the right direction?
You can easily solve this with a quite short function:
def lookup(dct, *args):
    for needle in args:
        dct = {key: value for key, value in dct.items() if needle in value}
    return dct
For example:
>>> dic = {1: "a1+b+c", 2: "a1+c+v", 3: "a1+z+e", 4: "a2+p+a", 5: "a2+z+v", 6: "a3+q+v"}
>>> lookup(dic, "a1", "c")
{1: 'a1+b+c', 2: 'a1+c+v'}
However that always needs to iterate over all keys for each "needle". You can do better if you have a helper dictionary (I'll use a collections.defaultdict here) that stores all keys that match one needle (assuming + is supposed to be a delimiter in your dictionary):
from collections import defaultdict
helperdict = defaultdict(set)
for key, value in dic.items():
    for needle in value.split('+'):
        helperdict[needle].add(key)
That helperdict now contains all keys that match one particular part of a value:
>>> print(dict(helperdict))
{'z': {3, 5}, 'p': {4}, 'a1': {1, 2, 3}, 'a3': {6}, 'v': {2, 5, 6}, 'a2': {4, 5}, 'e': {3}, 'b': {1}, 'a': {4}, 'c': {1, 2}, 'q': {6}}
And using set.intersection allows you to quickly get all matches for different combinations:
>>> search = ['a2', 'z']
>>> matches = set.intersection(*[helperdict[needle] for needle in search])
>>> {match: dic[match] for match in matches}
{5: 'a2+z+v'}
It's definitely longer than the first approach and requires extra memory for the helper dictionary, but if you plan to run several queries it will be much faster.
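The helper-dictionary idea can be packaged as two small functions, sketched below. The names build_index and query are hypothetical, and '+' is assumed to be the delimiter. Note that this matches whole parts exactly, whereas the shorter lookup function matches substrings (so "a" would also match "a1" there).

```python
from collections import defaultdict


def build_index(dct, sep='+'):
    """Map each delimited part of a value to the set of keys containing it."""
    index = defaultdict(set)
    for key, value in dct.items():
        for part in value.split(sep):
            index[part].add(key)
    return index


def query(dct, index, *needles):
    """Return the entries of dct whose value contains every needle as a part."""
    if not needles:
        return dict(dct)
    # index.get avoids creating empty entries for needles that never occur.
    matches = set.intersection(*(index.get(n, set()) for n in needles))
    return {k: dct[k] for k in sorted(matches)}


dic = {1: "a1+b+c", 2: "a1+c+v", 3: "a1+z+e",
       4: "a2+p+a", 5: "a2+z+v", 6: "a3+q+v"}
index = build_index(dic)       # built once
selected = query(dic, index, "a1", "c")
print(selected)
```

The index is built once and can then be reused for every combination of checkboxes the page submits.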
I have a dictionary where the values are lists, and I would like to know how many elements are in the list associated with each key. I found the approach below, but I need the total number of elements for only one key. For example:
>>> from collections import Counter
>>> my_dict = {'I': [23,24,23,23,24], 'P': [17,23,23,17,24,12]}
>>> {k: Counter(v) for k, v in my_dict.items()}
{'P': Counter({17: 2, 23: 2, 24: 1, 12: 1}), 'I': Counter({23: 3, 24: 2})}
For example, {'P': 6}. It would be even better if it gave just the number, e.g. count_elements=5.
This gets the number of values stored under a given key (held in the variable key). I believe that's what the question asked.
my_dict= {"I":[23,24,23,23,24],"P":[17,23,23,17,24,12]}
number = len(my_dict.get(key, []))
>>> my_dict= {'I':[23,24,23,23,24],'P':[17,23,23,17,24,12]}
>>> {k: len(v) for k, v in my_dict.items()}
{'I': 5, 'P': 6}
A single key is simple:
>>> len(my_dict['P'])
6
As @Joe suggested, len(my_dict.get(key, [])) also works when a key doesn't exist, but then you can't distinguish between keys with empty lists and keys that don't exist. If that distinction matters, you can catch the KeyError instead.
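A minimal sketch of the KeyError variant follows. The helper name count_for and the extra 'E' key with an empty list are assumptions added purely to show the difference between an empty list and a missing key.

```python
my_dict = {'I': [23, 24, 23, 23, 24], 'P': [17, 23, 23, 17, 24, 12], 'E': []}


def count_for(d, key):
    """Return the list length for key, or None when the key is absent."""
    try:
        return len(d[key])
    except KeyError:
        return None


print(count_for(my_dict, 'P'))   # length of the list stored under 'P'
print(count_for(my_dict, 'E'))   # key exists, list is empty
print(count_for(my_dict, 'X'))   # key does not exist
```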
Is that what you had in mind?
my_dict= {'I':[23,24,23,23,24],'P':[17,23,23,17,24,12]}
print({k: len(v) for k, v in my_dict.items()})
{'I': 5, 'P': 6}
My question is: How can I get a dictionary key using a dictionary value?
d={'dict2': {1: 'one', 2: 'two'}, 'dict1': {3: 'three', 4: 'four'}}
I want to get dict2, i.e. the outer key of the dictionary that contains the value 'two'.
Thanks.
Here's a recursive solution that can handle arbitrarily nested dictionaries:
>>> import collections.abc
>>> def dict_find_recursive(d, target):
...     if not isinstance(d, collections.abc.Mapping):
...         return d == target
...     else:
...         for k in d:
...             if dict_find_recursive(d[k], target) != False:
...                 return k
...         return False
It's not as efficient in the long run as a "reverse dictionary," but if you aren't doing such reverse searches frequently, it probably doesn't matter. (Note that you have to explicitly compare the result of dict_find_recursive(d[k], target) to False because otherwise falsy keys like '' cause the search to fail. In fact, even this version fails if False is used as a key; a fully general solution would use a unique sentinel object() to indicate falseness.)
A few usage examples:
>>> d = {'dict1': {3: 'three', 4: 'four'}, 'dict2': {1: 'one', 2: 'two'}}
>>> dict_find_recursive(d, 'two')
'dict2'
>>> dict_find_recursive(d, 'five')
False
>>> d = {'dict1': {3: 'three', 4: 'four'}, 'dict2': {1: 'one', 2: 'two'},
...      'dict3': {1: {1:'five'}, 2: 'six'}}
>>> dict_find_recursive(d, 'five')
'dict3'
>>> dict_find_recursive(d, 'six')
'dict3'
If you want to reverse an arbitrarily nested set of dictionaries, recursive generators are your friend:
>>> def dict_flatten(d):
...     if not isinstance(d, collections.abc.Mapping):
...         yield d
...     else:
...         for value in d:
...             for item in dict_flatten(d[value]):
...                 yield item
...
>>> list(dict_flatten(d))
['three', 'four', 'one', 'two', 'five', 'six']
The above simply lists all the values in the dictionary that aren't mappings. You can then map each of those values to a key like so:
>>> def reverse_nested_dict(d):
...     for k in d:
...         if not isinstance(d[k], collections.abc.Mapping):
...             yield (d[k], k)
...         else:
...             for item in dict_flatten(d[k]):
...                 yield (item, k)
...
This generates an iterable of tuples, so no information is lost:
>>> for tup in reverse_nested_dict(d):
...     print(tup)
...
('three', 'dict1')
('four', 'dict1')
('one', 'dict2')
('two', 'dict2')
('five', 'dict3')
('six', 'dict3')
If you know that all your non-mapping values are hashable -- and if you know they are unique, or if you don't care about collisions -- then just pass the resulting tuples to dict():
>>> dict(reverse_nested_dict(d))
{'three': 'dict1', 'four': 'dict1', 'one': 'dict2', 'two': 'dict2',
 'five': 'dict3', 'six': 'dict3'}
If you don't want to reverse the dictionary, here's another possible solution:
def get_key_from_value(my_dict, v):
    for key, value in my_dict.items():
        if value == v:
            return key
    return None
>>> d = {1: 'one', 2: 'two'}
>>> get_key_from_value(d,'two')
2
The following will create a reverse dictionary for the two-level example:
d={'dict2': {1: 'one', 2: 'two'}, 'dict1': {3: 'three', 4: 'four'}}
r = {}
for d1 in d:
    for d2 in d[d1]:
        r[d[d1][d2]] = d1
The result:
>>> r
{'four': 'dict1', 'three': 'dict1', 'two': 'dict2', 'one': 'dict2'}
I don't know about the best solution, but one possibility is reversing the dictionary (so that values become keys) and then just doing a normal key lookup. This will reverse a dictionary:
forward_dict = { 'key1': 'val1', 'key2': 'val2'}
reverse_dict = dict([(v,k) for k,v in forward_dict.items()])
So given "val1", I can just do:
reverse_dict["val1"]
to find the corresponding key. There are obvious problems with this solution -- for example, if your values aren't unique, you're going to lose some information.
Write code to reverse the dictionary (i.e. create a new dictionary that maps the values of the old one to the keys of the old one).
Since you seem to be dealing with nested dictionaries, this will obviously be trickier. Figure out the least you need to get your problem solved and code that up (i.e. don't create a solution that will work for arbitrary depths of nesting if your problem only deals with dicts in dicts which in turn don't have any dicts).
To handle the nested dictionaries I would do just as senderle's answer states.
However if in the future it does not contain nested dictionaries, be very careful doing a simple reversal. By design the dictionary keys are unique, but the values do not have this requirement.
If you have values that are the same for multiple keys, when reversing the dictionary you will lose all but one of them. And because dictionaries are not sorted, you could lose different data arbitrarily.
Example of reversal working:
>>> d={'dict1': 1, 'dict2': 2, 'dict3': 3, 'dict4': 4}
>>> rd = dict([(v,k) for k,v in d.items()])
>>> print d
{'dict4': 4, 'dict1': 1, 'dict3': 3, 'dict2': 2}
>>> print rd
{1: 'dict1', 2: 'dict2', 3: 'dict3', 4: 'dict4'}
Example of reversal failure: Note that dict4 is lost
>>> d={'dict1': 1, 'dict2': 4, 'dict3': 3, 'dict4': 4}
>>> rd = dict([(v,k) for k,v in d.items()])
>>> print d
{'dict4': 4, 'dict1': 1, 'dict3': 3, 'dict2': 4}
>>> print rd
{1: 'dict1', 3: 'dict3', 4: 'dict2'}
Here's an example nobody thinks about (it could be used similarly):
raw_dict = { 'key1': 'val1', 'key2': 'val2', 'key3': 'val1' }
new_dict = {}
for k, v in raw_dict.items():
    try:
        new_dict[v].append(k)
    except KeyError:
        new_dict[v] = [k]
Result:
>>> new_dict
{'val2': ['key2'], 'val1': ['key3', 'key1']}
Maybe not the best of methods, but it works for what I need it for.
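The same grouping can be written without try/except by using collections.defaultdict. This is a sketch of an alternative, not a correction of the loop above; the name grouped is illustrative.

```python
from collections import defaultdict

raw_dict = {'key1': 'val1', 'key2': 'val2', 'key3': 'val1'}

# Each value of raw_dict becomes a key that collects every original key
# pointing at it, so duplicate values no longer overwrite each other.
grouped = defaultdict(list)
for k, v in raw_dict.items():
    grouped[v].append(k)

new_dict = dict(grouped)   # convert back to a plain dict if preferred
print(new_dict)
```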