How could we access the multiple dict in a list? - python

I have a list of dicts that all share the same keys but hold different values. I want to get the dict that has the maximum value for a particular key.
l = [{'a':23, 'b': 64, 'c':4},{'a':83, 'b': 34, 'c':47}]
In this case, I want the dict with the maximum value for 'b'.

Use max with a custom key:
>>> l = [{'a':23, 'b': 64, 'c':4},{'a':83, 'b': 34, 'c':47}]
>>> max(l, key=lambda x: x["b"])
{'a': 23, 'b': 64, 'c': 4}
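An equivalent spelling uses operator.itemgetter in place of the lambda. The sketch below also passes default=None so that an empty list yields None instead of raising ValueError; that fallback is an assumption about the desired behaviour, not something the question asks for, and the name `records` is only illustrative.

```python
from operator import itemgetter

records = [{'a': 23, 'b': 64, 'c': 4}, {'a': 83, 'b': 34, 'c': 47}]

# itemgetter('b') builds a callable equivalent to lambda x: x['b'].
best = max(records, key=itemgetter('b'), default=None)
print(best)  # {'a': 23, 'b': 64, 'c': 4}

# Assumed fallback: with an empty list, default=None avoids a ValueError.
empty_result = max([], key=itemgetter('b'), default=None)
print(empty_result)  # None
```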

Related

Combine two dicts and replace missing values [duplicate]

I want to combine two dictionaries by grouping the values that share a key, while also accounting for keys that appear in only one of the two dictionaries. For instance, given the following two dictionaries:
d1 = {'a':1, 'b':2, 'c': 3, 'e':5}
d2 = {'a':11, 'b':22, 'c': 33, 'd':44}
The intended code would output
df = {'a':[1,11] ,'b':[2,22] ,'c':[3,33] ,'d':[0,44] ,'e':[5,0]}
Or some array like:
df = [[a,1,11] , [b,2,22] , [c,3,33] , [d,0,44] , [e,5,0]]
The choice of 0 to denote a missing entry is not important; any placeholder would do.
I have tried using the following code
from collections import defaultdict
df = defaultdict(list)
for d in (d1, d2):
    for key, value in d.items():
        df[key].append(value)
But get the following result:
df = {'a':[1,11] ,'b':[2,22] ,'c':[3,33] ,'d':[44] ,'e':[5]}
This does not tell me which dict was missing the entry.
I could go back and look through both of them, but I was looking for a more elegant solution.
You can use a dict comprehension like so:
d1 = {'a':1, 'b':2, 'c': 3, 'e':5}
d2 = {'a':11, 'b':22, 'c': 33, 'd':44}
res = {k: [d1.get(k, 0), d2.get(k, 0)] for k in set(d1).union(d2)}
print(res)
Another solution:
d1 = {"a": 1, "b": 2, "c": 3, "e": 5}
d2 = {"a": 11, "b": 22, "c": 33, "d": 44}
df = [[k, d1.get(k, 0), d2.get(k, 0)] for k in sorted(d1.keys() | d2.keys())]
print(df)
Prints:
[['a', 1, 11], ['b', 2, 22], ['c', 3, 33], ['d', 0, 44], ['e', 5, 0]]
If you do not want sorted results, leave the sorted() out.
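The same idea extends to any number of dictionaries. The following is a minimal sketch, not taken from either answer: the helper name `merge_with_fill` and its `fill` parameter are hypothetical, and it keeps keys in first-seen order instead of sorting them.

```python
def merge_with_fill(dicts, fill=0):
    """Collect the values for each key across dicts, using fill where a key is absent."""
    # dict.fromkeys de-duplicates the keys while keeping first-seen order.
    keys = dict.fromkeys(k for d in dicts for k in d)
    return {k: [d.get(k, fill) for d in dicts] for k in keys}

d1 = {'a': 1, 'b': 2, 'c': 3, 'e': 5}
d2 = {'a': 11, 'b': 22, 'c': 33, 'd': 44}

merged = merge_with_fill([d1, d2])
print(merged)
# {'a': [1, 11], 'b': [2, 22], 'c': [3, 33], 'e': [5, 0], 'd': [0, 44]}
```

Passing fill=None instead of 0 may be safer when 0 is itself a legitimate value in the data.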

get dict from dataframe which rows contains many values?

Input
df
A B
a 23
b,c 34
d,e,%f 30
Goal
df_dct = {'a':23,'b':34,'c':34,'d':30,'e':30,'f':30}
The details are as follows:
Column A supplies the keys and column B the values.
The values in A are strings, and some of them group several keys separated by ','.
The keys come from splitting on ',', and every '%' and space should be removed from them.
Try
I know how to use zip to build a dict from two dataframes, but I could not handle the splitting.
You can use df.explode() for pandas >= 0.25 with df.to_dict():
In [32]: df.A = df.A.str.replace("%", "")
In [42]: df_dct = df.assign(var1=df['A'].str.split(',')).explode('var1').drop(columns='A').set_index('var1').to_dict()['B']
In [43]: df_dct
Out[43]: {'a': 23, 'b': 34, 'c': 34, 'd': 30, 'e': 30, 'f': 30}
Remove the percent sign from column A with str.replace:
df["A"] = df.A.str.replace("%", "")
Use itertools' product to pair every key in A with the value in B for each row, then flatten the pairs into one sequence with chain:
from itertools import product, chain
# pass the flattened pairs to dict to get the final result
dict(chain.from_iterable((product(A.split(","),[B])) for A,B in df.to_numpy()))
{'a': 23, 'b': 34, 'c': 34, 'd': 30, 'e': 30, 'f': 30}
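For comparison, the split-and-clean logic can also be written without pandas or itertools. This is a sketch under assumptions: `rows` is a hypothetical stand-in for the (A, B) pairs of the dataframe, with an extra space added to one key to show the space removal the question asks for.

```python
# Hypothetical stand-in for the dataframe rows: (A, B) pairs.
rows = [("a", 23), ("b,c", 34), ("d, e,%f", 30)]

df_dct = {
    # Remove every '%' and every space from each individual key.
    key.replace("%", "").replace(" ", ""): value
    for keys, value in rows
    for key in keys.split(",")
}
print(df_dct)  # {'a': 23, 'b': 34, 'c': 34, 'd': 30, 'e': 30, 'f': 30}
```

With a real dataframe, the pairs could be supplied as zip(df["A"], df["B"]) in place of `rows`.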

Prefer a key by max-value in dictionary?

You can get the key with the maximum value in a dictionary with max(d, key=d.get).
The question is: when two or more keys share the maximum value, how can you set a preferred key?
I found a way to do this by prepending a number to each key.
Is there a better way?
In [56]: d = {'1a' : 5, '2b' : 1, '3c' : 5 }
In [57]: max(d, key=d.get)
Out[57]: '1a'
In [58]: d = {'4a' : 5, '2b' : 1, '3c' : 5 }
In [59]: max(d, key=d.get)
Out[59]: '3c'
The function passed as the key argument can return a tuple. When several keys tie on the first element, the second element breaks the tie. This lets you rank ties however you like, for example with a second dictionary of preferences:
d = {'a' : 5, 'b' : 1, 'c' : 5 }
d_preference = {'a': 1, 'b': 2, 'c': 3}
max(d, key=lambda key: (d[key], d_preference[key]))
# >> 'c'
d_preference = {'a': 3, 'b': 2, 'c': 1}
max(d, key=lambda key: (d[key], d_preference[key]))
# >> 'a'
This is a similar idea to @AxelPuig's solution. But, instead of relying on an auxiliary dictionary each time you wish to retrieve an item with max or min value, you can perform a single sort and utilise collections.OrderedDict:
from collections import OrderedDict
d = {'a' : 5, 'b' : 1, 'c' : 5 }
d_preference1 = {'a': 1, 'b': 2, 'c': 3}
d_preference2 = {'a': 3, 'b': 2, 'c': 1}
d1 = OrderedDict(sorted(d.items(), key=lambda x: -d_preference1[x[0]]))
d2 = OrderedDict(sorted(d.items(), key=lambda x: -d_preference2[x[0]]))
max(d1, key=d.get) # c
max(d2, key=d.get) # a
Since OrderedDict is a subclass of dict, there's generally no need to convert to a regular dict. If you are using Python 3.7+, you can use the regular dict constructor, since dictionaries are insertion ordered.
As noted in the docs for max:
If multiple items are maximal, the function returns the first one encountered.
A slight variation on @AxelPuig's answer: fix the order of the keys in a priorities list and take the max with key=d.get.
d = {"1a": 5, "2b": 1, "3c": 5}
priorities = list(d.keys())
print(max(priorities, key=d.get))
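The priorities list does not have to follow the dictionary's own order. A minimal sketch with a hand-written (hypothetical) preference order, relying on the documented behaviour that max returns the first maximal item it encounters:

```python
d = {"a": 5, "b": 1, "c": 5}

# Hypothetical preference order: earlier entries win ties.
priorities = ["c", "a", "b"]

# "a" and "c" both map to 5; "c" comes first in priorities, so it wins.
winner = max(priorities, key=d.get)
print(winner)  # c
```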

Sum values of similar keys inside two nested dictionary in python

I have nested dictionary like this:
data = {
    "2010": {
        'A': 2,
        'B': 3,
        'C': 5,
        'D': -18,
    },
    "2011": {
        'A': 1,
        'B': 2,
        'C': 3,
        'D': 1,
    },
    "2012": {
        'A': 1,
        'B': 2,
        'C': 4,
        'D': 2
    }
}
In my case, I need to sum the values for each key across all the years, from 2010 to 2012.
So the result I expect looks like this:
data = {'A':4,'B':7, 'C':12, 'D':-15}
You can use collections.Counter() (works only for positive values!):
In [17]: from collections import Counter
In [18]: sum((Counter(d) for d in data.values()), Counter())
Out[18]: Counter({'C': 12, 'B': 7, 'A': 4, 'D': 3})
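Note that, according to the Python documentation, Counter's multiset operations (including +) are designed only for use cases with positive values, which is why 'D' comes out as 3 above instead of -15: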
The multiset methods are designed only for use cases with positive values. The inputs may be negative or zero, but only outputs with positive values are created. There are no type restrictions, but the value type needs to support addition, subtraction, and comparison.
The elements() method requires integer counts. It ignores zero and negative counts.
So if you want a complete result, you can do the summation manually. collections.defaultdict is a good way around this problem:
In [28]: from collections import defaultdict
In [29]: d = defaultdict(int)
In [30]: for sub in data.values():
   ....:     for i, j in sub.items():
   ....:         d[i] += j
   ....:
In [31]: d
Out[31]: defaultdict(<class 'int'>, {'D': -15, 'A': 4, 'C': 12, 'B': 7})
Try this (written for Python 3; it assumes every inner dict has the same keys):
from functools import reduce
reduce(lambda x, y: {k: v + y[k] for k, v in x.items()}, data.values())
Result
{'A': 4, 'B': 7, 'C': 12, 'D': -15}
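Counter can still be used if the totals are accumulated with update() instead of +. This sketch is not from the answers above; it relies on the fact that Counter.update adds counts in place and keeps zero and negative totals. The names `totals` and `yearly` are illustrative.

```python
from collections import Counter

data = {
    "2010": {"A": 2, "B": 3, "C": 5, "D": -18},
    "2011": {"A": 1, "B": 2, "C": 3, "D": 1},
    "2012": {"A": 1, "B": 2, "C": 4, "D": 2},
}

totals = Counter()
for yearly in data.values():
    # update() adds the counts in place and, unlike the + operator,
    # does not discard zero or negative results.
    totals.update(yearly)

print(dict(totals))  # {'A': 4, 'B': 7, 'C': 12, 'D': -15}
```

Unlike the reduce version, this also tolerates a key that is missing from some of the years.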

Sort a list of dictionary provided an order

I've a list
order = [8, 7, 5, 9, 10, 11]
and a list of dictionaries
list_of_dct = [{'value':11}, {'value':8}, {'value':5}, {'value':7}, {'value':10}, {'value':9}]
I want to sort this list_of_dct by the order given in order list, i.e. the output should be the following:
list_of_dct = [{'value':8}, {'value':7}, {'value':5}, {'value':9}, {'value':10}, {'value':11}]
I know how to sort by a given key, but not when an order is already given. How can I sort it?
PS: I already have an O(n^2) solution. Looking for a better solution.
Use each value's index in the order list as the sort key. This works if every dictionary holds exactly one value and you want to sort by it (Python 3):
sorted(list_of_dct, key=lambda x: order.index(list(x.values())[0]))
If a dictionary holds several values, change the index ([0]) to pick the one to sort by. Note that order.index scans the list on every call, so this is still quadratic in the worst case.
Make a mapping of 8 to 0, 7 to 1, ..., 11 to 5 using enumerate:
>>> order = [8,7,5,9,10,11]
>>> list_of_dct = [{'value':11}, {'value':8}, {'value':5},
...                {'value':7}, {'value':10}, {'value':9}]
>>> sort_keys = {item: i for i, item in enumerate(order)}
>>> sort_keys
{5: 2, 7: 1, 8: 0, 9: 3, 10: 4, 11: 5}
And use it as a sorting key:
>>> list_of_dct.sort(key=lambda d: sort_keys.get(d['value'], len(sort_keys)))
>>> list_of_dct
[{'value': 8}, {'value': 7}, {'value': 5}, {'value': 9},
{'value': 10}, {'value': 11}]
Use sort_keys.get(...) instead of sort_keys[...] to prevent a KeyError in case a value is missing from order; such items sort to the end.
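If a comparison sort is not wanted at all, the dictionaries can be grouped by value and emitted by walking order once. This is a sketch under assumptions, not part of the answers: it drops any dictionary whose value is absent from order, and the names `buckets` and `result` are illustrative.

```python
from collections import defaultdict

order = [8, 7, 5, 9, 10, 11]
list_of_dct = [{'value': 11}, {'value': 8}, {'value': 5},
               {'value': 7}, {'value': 10}, {'value': 9}]

# Group the dictionaries by their 'value'; a list per value tolerates duplicates.
buckets = defaultdict(list)
for dct in list_of_dct:
    buckets[dct['value']].append(dct)

# Walk order once and emit each bucket in turn.
# Assumption: dictionaries whose value is not listed in order are dropped.
result = [dct for value in order for dct in buckets.get(value, [])]
print(result)
# [{'value': 8}, {'value': 7}, {'value': 5}, {'value': 9}, {'value': 10}, {'value': 11}]
```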
