Filter dictionary and remove lowest values

I have a dictionary as below. Is there a way to output a dictionary with the 5 highest values?
If there are ties for the 5th highest value, I need to include those keys.
Input dictionary:
{
    "1": 1,
    "12": 1,
    "13": 2,
    "3": 5,
    "5": 8,
    "7": 3,
    "4": 8,
    "10": 7
}
Desired result:
{
    "3": 5,
    "5": 8,
    "7": 3,
    "4": 8,
    "10": 7
}

Accounting for ties:
val = sorted(d.values(), reverse=True)[4]
res = {k: v for k, v in d.items() if v >= val}
print(res)
{'3': 5, '5': 8, '7': 3, '4': 8, '10': 7}
Explanation
Calculate the 5th highest value using sorted with reverse=True. Remember indexing begins at 0 so index with [4].
Use a dictionary comprehension to select all items from your dictionary whose value is greater than or equal to the calculated value. The >= comparison is what keeps keys tied with the 5th highest value.
Optimisation
A more efficient method, as pointed out by @Chris_Rands, is to use heapq to calculate the 5th highest value:
import heapq
val = heapq.nlargest(5, d.values())[-1]
res = {k: v for k, v in d.items() if v >= val}
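
The `sorted(...)[4]` lookup raises an IndexError when the dictionary holds fewer than five items. A minimal sketch that wraps the heapq approach in a function and guards that case; the name `top_n_with_ties` and its parameters are hypothetical, not part of the answer:

```python
import heapq

def top_n_with_ties(d, n):
    """Keep items whose value is at least the n-th highest value (ties included)."""
    if n <= 0:
        return {}
    if len(d) <= n:
        return dict(d)  # n or fewer items: nothing to cut off
    threshold = heapq.nlargest(n, d.values())[-1]
    return {k: v for k, v in d.items() if v >= threshold}

d = {"1": 1, "12": 1, "13": 2, "3": 5, "5": 8, "7": 3, "4": 8, "10": 7}
print(top_n_with_ties(d, 5))
# {'3': 5, '5': 8, '7': 3, '4': 8, '10': 7}
print(len(top_n_with_ties(d, 7)))
# 8, because "1" and "12" tie for the 7th highest value
```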

from collections import Counter
dict(Counter(your_dict).most_common(5))
OUTPUT:
{'10': 7, '3': 5, '4': 8, '5': 8, '7': 3}
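
Note that `most_common(n)` returns exactly n items, so it does not keep extra keys that tie with the n-th value. A small illustration with a made-up dictionary (`scores` is hypothetical data, not from the question):

```python
from collections import Counter

scores = {"a": 3, "b": 2, "c": 2}  # hypothetical: "b" and "c" tie for 2nd place

top_two = dict(Counter(scores).most_common(2))
print(top_two)  # exactly two items; one of the tied keys is dropped
```

If the ties must be kept, a threshold-based filter such as the `v >= val` comprehension is the safer choice.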

Related

Find keys of dictionary that sum to 6

I'm having trouble accessing multiple values in a dictionary. Let's say I have this dictionary:
{'1': 0, '2': 1, '3': 2, '4': 3, '5': 4, '6': 5}
I want to find two keys that sum to 6 and display their values. Here, the keys 4 and 2 add to 6, so the 2 values are 3 and 1.
Where do I start? This is the code I have so far:
for key in dico:
    if sum(key + key) == 6:
        print(f"Numbers # {key:dico} have a sum of 6")

No need for extra loops (or itertools), they will only slow your program down. You already know what the other index needs to be (because you can subtract the index from 6), so just check if that index exists:
dct = {'1': 0, '2': 1, '3': 2, '4': 3, '5': 4, '6': 5}
for i, key in enumerate(dct):
    if i + 2 > len(dct)/2:
        break
    matchIndex = str(6 - int(key))
    if dct.get(matchIndex) is not None:
        print(f'Keys {key} and {matchIndex} have values {dct[key]} and {dct[matchIndex]}')
This approach has O(n) time complexity (it visits at most half of the keys and does one constant-time lookup for each), while the itertools answer compares every pair of keys and has O(n^2) time complexity.
When I tested this approach with timeit, it took 1.72 seconds to run this answer one million times, but the itertools answer took 5.83 seconds.
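
The complement lookup can also be written without the index-based early exit, by reporting a pair only when the first key is the smaller one. A minimal sketch; the function name `pairs_summing_to` is hypothetical, and it assumes every key is the string form of an integer:

```python
def pairs_summing_to(dct, target):
    """Return (key, partner) pairs of keys whose integer forms sum to target."""
    pairs = []
    for key in dct:
        partner = str(target - int(key))
        # int(key) < int(partner) reports each pair once and skips key == partner
        if partner in dct and int(key) < int(partner):
            pairs.append((key, partner))
    return pairs

dct = {'1': 0, '2': 1, '3': 2, '4': 3, '5': 4, '6': 5}
for key, partner in pairs_summing_to(dct, 6):
    print(f'Keys {key} and {partner} have values {dct[key]} and {dct[partner]}')
# Keys 1 and 5 have values 0 and 4
# Keys 2 and 4 have values 1 and 3
```

Using `partner in dct` instead of `dct.get(...) is not None` also handles dictionaries in which a stored value happens to be None.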

You will need to compare each of the dictionary keys with the rest of the other keys. You can use itertools for this.
As you mention you would like to print the value of each of the keys you have in your dictionary, it would be something like this:
import itertools
dico = {'1': 0, '2': 1, '3': 2, '4': 3, '5': 4, '6': 5}
for a, b in itertools.combinations(dico.keys(), 2):
    if int(a) + int(b) == 6:
        print(f"{dico[a]} - {dico[b]}")

You need two loops for that.
Also, keep in mind that there is more than one answer to that problem:
a = {'1': 0, '2': 1, '3': 2, '4': 3, '5': 4, '6': 5}
results = list()
for key_1 in a.keys():
    for key_2 in a.keys():
        # Compare the keys as integers; the < test reports each pair only once.
        if int(key_1) < int(key_2) and int(key_1) + int(key_2) == 6:
            results.append((key_1, key_2))
print(results)

Prefer a key by max-value in dictionary?

You can get the key with the max value in a dictionary this way: max(d, key=d.get).
The question is: when two or more keys share the max value, how can you set a preferred key?
I found a way to do this by prepending a number to the key.
Is there a better way?
In [56]: d = {'1a' : 5, '2b' : 1, '3c' : 5 }
In [57]: max(d, key=d.get)
Out[57]: '1a'
In [58]: d = {'4a' : 5, '2b' : 1, '3c' : 5 }
In [59]: max(d, key=d.get)
Out[59]: '3c'

The function given in the key argument can return a tuple. The second element of the tuple is used to break ties when several keys share the maximum for the first element. With that, you can use whatever preference you want, for example with two dictionaries:
d = {'a' : 5, 'b' : 1, 'c' : 5 }
d_preference = {'a': 1, 'b': 2, 'c': 3}
max(d, key=lambda key: (d[key], d_preference[key]))
# >> 'c'
d_preference = {'a': 3, 'b': 2, 'c': 1}
max(d, key=lambda key: (d[key], d_preference[key]))
# >> 'a'
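
If the preference is naturally an ordered list rather than a dictionary of scores, the same tuple key works with the list position. A sketch, assuming a hypothetical `preferred` list in which earlier entries win ties:

```python
d = {'a': 5, 'b': 1, 'c': 5}
preferred = ['c', 'a', 'b']  # hypothetical: earlier entries win ties

# Negate the position so that an earlier entry yields a larger tie-breaker.
rank = {key: -pos for pos, key in enumerate(preferred)}
best = max(d, key=lambda key: (d[key], rank[key]))
print(best)  # c
```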

This is a similar idea to @AxelPuig's solution. But, instead of relying on an auxiliary dictionary each time you wish to retrieve an item with max or min value, you can perform a single sort and utilise collections.OrderedDict:
from collections import OrderedDict
d = {'a' : 5, 'b' : 1, 'c' : 5 }
d_preference1 = {'a': 1, 'b': 2, 'c': 3}
d_preference2 = {'a': 3, 'b': 2, 'c': 1}
d1 = OrderedDict(sorted(d.items(), key=lambda x: -d_preference1[x[0]]))
d2 = OrderedDict(sorted(d.items(), key=lambda x: -d_preference2[x[0]]))
max(d1, key=d.get) # c
max(d2, key=d.get) # a
Since OrderedDict is a subclass of dict, there's generally no need to convert to a regular dict. If you are using Python 3.7+, you can use the regular dict constructor, since dictionaries are insertion ordered.
As noted on the docs for max:
If multiple items are maximal, the function returns the first one
encountered.

A slight variation on @AxelPuig's answer. You fix an order of keys in a priorities list and take the max with key=d.get.
d = {"1a": 5, "2b": 1, "3c": 5}
priorities = list(d.keys())
print(max(priorities, key=d.get))

Choose dictionary keys only if their values don't have a certain number of duplicates

Given a dictionary and a limit for the number of keys in a new dictionary, I would like the new dictionary to contain the keys with the highest values.
The given dict is:
dictation = {'apple':5, 'pears':4, 'orange':3, 'kiwi':3, 'banana':1 }
I want to get a new dictionary that has the keys with the highest values of length limit.
For instance for limit=1 the new dict is
{'apple':5}
if the limit=2
{'apple':5, 'pears':4}
I tried this:
return dict(sorted(dictation.items(),key=lambda x: -x[1])[:limit])
but when I try limit=3, I get
{'apple':5, 'pears':4, 'orange':3}
But it shouldn't include orange:3, because orange and kiwi have the same priority. Including both kiwi and orange would exceed the limit, so neither should be included. It should return
{'apple':5, 'pears':4}

The way to go would be to use a collections.Counter and most_common(n). Then you can take one more as needed and keep popping until the last value changes:
from collections import Counter
dictation = {'apple':5, 'pears':4, 'orange':3, 'kiwi':3, 'banana':1}
n = 3
items = Counter(dictation).most_common(n+1)
last_val = items[-1][1]
if len(items) > n:
    # Drop every item tied with the (n+1)th value; stop if the list empties.
    while items and items[-1][1] == last_val:
        items.pop()
new = dict(items)
# {'apple': 5, 'pears': 4}

This is computationally not very good, but it works. It creates a Counter object to get the sorted output for your data, and an inverted defaultdict that maps each score to the list of keys with that score. It builds the result using both and some math:
from collections import defaultdict, Counter
def gimme(d,n):
    c = Counter(d)
    grpd = defaultdict(list)
    for key,value in c.items():
        grpd[value].append(key)
    result = {}
    for key,value in c.most_common():
        if len(grpd[value])+len(result) <= n:
            result.update( {k:value for k in grpd[value] } )
        else:
            break
    return result
Test:
data = {'apple':5, 'pears':4, 'orange':3, 'kiwi':3, 'banana':1 }
for k in range(10):
    print(k, gimme(data,k))
Output:
0 {}
1 {'apple': 5}
2 {'apple': 5, 'pears': 4}
3 {'apple': 5, 'pears': 4}
4 {'apple': 5, 'pears': 4, 'orange': 3, 'kiwi': 3}
5 {'apple': 5, 'pears': 4, 'orange': 3, 'kiwi': 3}
6 {'apple': 5, 'pears': 4, 'orange': 3, 'kiwi': 3, 'banana': 1}
7 {'apple': 5, 'pears': 4, 'orange': 3, 'kiwi': 3, 'banana': 1}
8 {'apple': 5, 'pears': 4, 'orange': 3, 'kiwi': 3, 'banana': 1}
9 {'apple': 5, 'pears': 4, 'orange': 3, 'kiwi': 3, 'banana': 1}

As you note, filtering by the top n doesn't exclude by default all equal values which exceed the stated cap. This is by design.
The trick is to consider the (n+1)th highest value and keep only the values in your dictionary that are strictly higher than this number:
from heapq import nlargest
dictation = {'apple':5, 'pears':4, 'orange':3, 'kiwi':3, 'banana':1}
n = 3
largest_items = nlargest(n+1, dictation.items(), key=lambda x: x[1])
n_plus_one_value = largest_items[-1][1]
res = {k: v for k, v in largest_items if v > n_plus_one_value}
print(res)
{'apple': 5, 'pears': 4}
We assume here that len(dictation) > n, so that an (n+1)th highest value exists; otherwise you can just take the input dictionary as the result.
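
The size guard mentioned above can be folded into a function. A minimal sketch of the (n+1)th-value trick; the name `top_n_strict` is hypothetical:

```python
from heapq import nlargest

def top_n_strict(d, n):
    """Return at most n highest-valued items, dropping a tie group that would not fit entirely."""
    if n <= 0:
        return {}
    if len(d) <= n:
        return dict(d)  # everything fits, no cutoff needed
    largest = nlargest(n + 1, d.items(), key=lambda kv: kv[1])
    cutoff = largest[-1][1]  # the (n+1)th highest value
    return {k: v for k, v in largest if v > cutoff}

dictation = {'apple': 5, 'pears': 4, 'orange': 3, 'kiwi': 3, 'banana': 1}
for n in range(1, 6):
    print(n, top_n_strict(dictation, n))
# 1 {'apple': 5}
# 2 {'apple': 5, 'pears': 4}
# 3 {'apple': 5, 'pears': 4}
# 4 {'apple': 5, 'pears': 4, 'orange': 3, 'kiwi': 3}
# 5 {'apple': 5, 'pears': 4, 'orange': 3, 'kiwi': 3, 'banana': 1}
```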
The dictionary comprehension seems expensive. For larger inputs, you can use bisect, something like:
from heapq import nlargest
from operator import itemgetter
from bisect import bisect
dictation = {'apple':5, 'pears':4, 'orange':3, 'kiwi':3, 'banana':1}
n = 3
largest_items = nlargest(n+1, dictation.items(), key=lambda x: x[1])
n_plus_one_value = largest_items[-1][1]
index = bisect(list(map(itemgetter(1), largest_items))[::-1], n_plus_one_value)
res = dict(largest_items[:len(largest_items) - index])
print(res)
{'apple': 5, 'pears': 4}

Return key with highest value

I have the following graph:
graph = {0 : {5:6, 4:8},
1 : {4:11},
2 : {3: 9, 0:12},
3 : {},
4 : {5:3},
5 : {2: 7, 3:4}}
I am trying to return the key that has the highest value in this graph. The expected output in this case would be 2 as key 2 has the highest value of 12.
Any help on how I can achieve this would be greatly appreciated.

Find the key whose maximum value is maximal:
max((k for k in graph), key=lambda k: max(graph[k].values(), default=float("-inf")))
The empty sub-dictionaries are disqualified by the -inf default, which can never be the maximum. Alternatively, you can just pre-filter such keys:
max((k for k in graph if graph[k]), key=lambda k: max(graph[k].values()))

Assuming it's all positive numbers
graph = {0 : {5:6, 4:8},
1 : {4:11},
2 : {3: 9, 0:12},
3 : {},
4 : {5:3},
5 : {2: 7, 3:4}}
highestKey = 0
max_value = 0  # renamed from max so the built-in max() is not shadowed
for key, value in graph.items():
    for key2, value2 in value.items():
        if max_value < value2:
            max_value = value2
            highestKey = key
print(highestKey)

You can also create (max_weight, key) tuples for each key and get the max of those:
max_val = max((max(e.values()), k) for k, e in graph.items() if e)
# (12, 2)
print(max_val[1])
# 2
Note that we don't need a custom key function for max here because the first value in the tuple is the one we want max to consider.
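
The same tuple trick extends to reporting which inner key carries the maximum. A sketch, assuming the inner keys are target nodes and the inner values are edge weights (an interpretation; the question does not say so explicitly):

```python
graph = {0: {5: 6, 4: 8},
         1: {4: 11},
         2: {3: 9, 0: 12},
         3: {},
         4: {5: 3},
         5: {2: 7, 3: 4}}

# (weight, source, target) tuples compare by weight first.
edges = ((w, src, dst) for src, nbrs in graph.items() for dst, w in nbrs.items())
weight, src, dst = max(edges)
print(src, dst, weight)  # 2 0 12
```

Here `max` raises a ValueError if the graph contains no edges at all, so an empty graph would need a separate check.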

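The recursive solution is below. It does not make assumptions about the depth of your tree. It only assumes that the data types are int, float or dict.
def get_largest(d):
    """Return the largest number found anywhere in a nested dict, or None if there is none."""
    largest = None

    def visit(node):
        nonlocal largest
        if isinstance(node, dict):
            # Recurse into every value of a nested dictionary.
            for child in node.values():
                visit(child)
        elif largest is None or node > largest:
            largest = node

    visit(d)
    return largest

# Largest value under each top-level key (None for an empty sub-dictionary).
largest_values = {k: get_largest(v) for k, v in graph.items()}
answer = max((k for k, v in largest_values.items() if v is not None), key=largest_values.get)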

You can also use a dict comprehension to flatten the dictionary, keeping the largest value under each key, and then print the key with the max value:
graph = {0 : {5:6, 4:8},
1 : {4:11},
2 : {3: 9, 0:12},
3 : {},
4 : {5:3},
5 : {2: 7, 3:4}}
# max(v.values()) keeps the largest inner value; empty sub-dictionaries are skipped.
flat_dict = {k: max(v.values()) for k, v in graph.items() if v}
print(max(flat_dict.keys(), key=(lambda k: flat_dict[k])))
# output:
2

You can also try flattening your dictionary into a list of tuples then take the max of the tuple with the highest second value:
from operator import itemgetter
graph = {
0: {5: 6, 4: 8},
1: {4: 11},
2: {3: 9, 0: 12},
3: {},
4: {5: 3},
5: {2: 7, 3: 4},
}
result = max(((k, v) for k in graph for v in graph[k].values()), key=itemgetter(1))
print(result)
# (2, 12)
print(result[0])
# 2

For each value in dict?

I've got a dict with integer values, and I'd like to perform an operation on every value in the dict. I'd like to use a for loop for this, but I can't get it right. Something like:
>>> print(myDict)
{'ten': 10, 'fourteen': 14, 'six': 6}
>>> for value in myDict:
...     value = value / 2
>>> print(myDict)
{'ten': 5, 'fourteen': 7, 'six': 3}

To iterate over keys and values:
for key, value in myDict.items():
    myDict[key] = value / 2
The default loop over a dictionary iterates over its keys, like
for key in myDict:
    myDict[key] /= 2
or you could use a map or a comprehension.
map (iterate over the items and rebuild the dict from the resulting pairs):
myDict = dict(map(lambda item: (item[0], item[1] / 2), myDict.items()))
comprehension:
myDict = { k: v / 2 for k, v in myDict.items() }

for k in myDict:
    myDict[k] /= 2

Using the dict.items() method and a dict comprehension:
dic = {'ten': 10, 'fourteen': 14, 'six': 6}
print({k: v/2 for k, v in dic.items()})
Output:
{'ten': 5.0, 'six': 3.0, 'fourteen': 7.0}

Python 3:
>>> my_dict = {'ten': 10, 'fourteen': 14, 'six': 6}
>>> for key, value in my_dict.items():
...     my_dict[key] = value / 2
>>> my_dict
{'fourteen': 7.0, 'six': 3.0, 'ten': 5.0}
This changes the original dictionary. Use // instead of / to get floor division.
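
If the original dictionary should stay untouched and integer results are wanted, a comprehension with floor division covers both. A small sketch; the name `halved` is only illustrative:

```python
my_dict = {'ten': 10, 'fourteen': 14, 'six': 6}

# Build a new dictionary and leave the original as it is.
halved = {key: value // 2 for key, value in my_dict.items()}
print(halved)   # {'ten': 5, 'fourteen': 7, 'six': 3}
print(my_dict)  # {'ten': 10, 'fourteen': 14, 'six': 6}
```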
