Basic way to find largest values in dictionary [duplicate] - python

This question already has answers here:
finding top k largest keys in a dictionary python
(6 answers)
Closed 7 years ago.
Let's say I have a dictionary. I want to find the 4 keys with the highest values. I want to do this in a very basic way. I'm not that advanced with CS. I just want to iterate over the dictionary. Or how should I do this? I realize it cannot be that challenging. I don't want to use heapq. How can I do this?

I think the most pythonic way is:
sorted(d, key=d.get, reverse=True)[:4]
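As a quick check of the one-liner above, here is a minimal sketch on a made-up dictionary (the name `scores` and its contents are assumptions for illustration only). Iterating over a dict yields its keys, and `scores.get` maps each key to its value, so the keys end up ordered by value.

```python
# Hypothetical sample data; any dict with comparable values would do.
scores = {"a": 5, "b": 12, "c": 7, "d": 1, "e": 9, "f": 3}

# Sort the keys by their values, largest first, and keep the first four.
top4 = sorted(scores, key=scores.get, reverse=True)[:4]
print(top4)  # ['b', 'e', 'c', 'a']
```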

You should be using a collections.Counter object instead.
import collections
c = collections.Counter(original_dict)
c.most_common(4)
If you just want the keys, then:
[k for k, v in c.most_common(4)]
For reference on how it is implemented, check the source code of collections.Counter.most_common.
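A small runnable sketch of the Counter approach, assuming a made-up `original_dict` with numeric values (the sample data is not from the question):

```python
import collections

# Hypothetical sample data standing in for the asker's dictionary.
original_dict = {"a": 5, "b": 12, "c": 7, "d": 1, "e": 9, "f": 3}

c = collections.Counter(original_dict)
pairs = c.most_common(4)       # (key, value) pairs, largest value first
keys = [k for k, _ in pairs]   # keep only the keys

print(pairs)  # [('b', 12), ('e', 9), ('c', 7), ('a', 5)]
print(keys)   # ['b', 'e', 'c', 'a']
```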

Sort by the values and then slice. The sorted function takes a key function.
items_by_value = sorted(d.items(), key=lambda kv: kv[1])  # tuple parameters like lambda (k, v) are Python 2 only
keys = [k for k, v in items_by_value[-4:]]

Good question.
Assuming d is the dictionary, and if you don't care what order the keys are in, you could use something like the following:
keys = [v[0] for v in sorted(d.items(), key=lambda v: v[1])[-4:]]
For huge dictionaries, it would be more efficient to use:
import operator
keys = [v[0] for v in sorted(d.items(), key=operator.itemgetter(1))[-4:]]
The [-4:] means the last four entries. In both cases, if you want the keys in the same order as their corresponding highest values, use [-1:-5:-1] instead, which are the last four entries in reverse order.
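To make the difference between the two slices concrete, here is a short sketch on made-up data (the dictionary contents are an assumption for illustration):

```python
import operator

# Hypothetical sample data.
d = {"a": 5, "b": 12, "c": 7, "d": 1, "e": 9, "f": 3}

# Items sorted by value, smallest first.
by_value = sorted(d.items(), key=operator.itemgetter(1))

# Last four entries, still in ascending order of value.
ascending = [k for k, _ in by_value[-4:]]
# Last four entries, walked backwards: largest value first.
descending = [k for k, _ in by_value[-1:-5:-1]]

print(ascending)   # ['a', 'c', 'e', 'b']
print(descending)  # ['b', 'e', 'c', 'a']
```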

Related

Python dictionary comprehension to group together equal keys

I have a code snippet that groups together equal keys from a list of dicts and adds each dict with an equal ObjectID to a list under that key.
The code below works, but I am trying to convert it to a dictionary comprehension.
# group together subblocks if they have equal ObjectID
output = {}
subblkDBF: list[dict]
for row in subblkDBF:
    if row["OBJECTID"] not in output:
        output[row["OBJECTID"]] = []
    output[row["OBJECTID"]].append(row)
Using a comprehension is possible, but likely inefficient in this case, since you need to (a) check if a key is in the dictionary at every iteration, and (b) append to, rather than set the value. You can, however, eliminate some of the boilerplate using collections.defaultdict:
from collections import defaultdict
output = defaultdict(list)
for row in subblkDBF:
    output[row['OBJECTID']].append(row)
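A minimal runnable sketch of the defaultdict grouping, with a few made-up rows standing in for `subblkDBF` (the `name` field and the values are assumptions for illustration):

```python
from collections import defaultdict

# Hypothetical rows; only the "OBJECTID" key comes from the question.
subblkDBF = [
    {"OBJECTID": 1, "name": "x"},
    {"OBJECTID": 2, "name": "y"},
    {"OBJECTID": 1, "name": "z"},
]

output = defaultdict(list)
for row in subblkDBF:
    # A missing key is created with an empty list on first access.
    output[row["OBJECTID"]].append(row)

print(dict(output))
```

Each row is visited once, so the grouping stays O(n) in the number of rows.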
The problem with using a comprehension is that if you really want a one-liner, you have to nest a list comprehension that traverses the entire list multiple times (once for each key):
{k: [d for d in subblkDBF if d['OBJECTID'] == k] for k in set(d['OBJECTID'] for d in subblkDBF)}
Iterating over subblkDBF in both the inner and outer loop leads to O(n^2) complexity, which is pointless, especially given how illegible the result is.
As the other answer shows, these problems go away if you're willing to sort the list first, or better yet, if it is already sorted.
If rows are sorted by Object ID (or all rows with equal Object ID are at least next to each other, no matter the overall order of those IDs) you could write a neat dict comprehension using itertools.groupby:
from itertools import groupby
from operator import itemgetter
output = {k: list(g) for k, g in groupby(subblkDBF, key=itemgetter("OBJECTID"))}
However, if this is not the case, you'd have to sort by the same key first, making this a lot less neat, and less efficient than above or the loop (O(nlogn) instead of O(n)).
key = itemgetter("OBJECTID")
output = {k: list(g) for k, g in groupby(sorted(subblkDBF, key=key), key=key)}
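The need for sorting is easy to miss, so here is a small sketch of what can go wrong without it. The rows are made up for illustration; `groupby` only merges adjacent items, so on unsorted input the same key shows up as several groups and the dict comprehension keeps only the last one.

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical rows; the two rows with OBJECTID 1 are not adjacent.
rows = [
    {"OBJECTID": 1, "name": "x"},
    {"OBJECTID": 2, "name": "y"},
    {"OBJECTID": 1, "name": "z"},
]
key = itemgetter("OBJECTID")

# Without sorting: key 1 is produced twice and the later group overwrites the earlier one.
unsorted_result = {k: list(g) for k, g in groupby(rows, key=key)}
# With sorting: all rows for a key are adjacent, so each key is grouped once.
sorted_result = {k: list(g) for k, g in groupby(sorted(rows, key=key), key=key)}

print(len(unsorted_result[1]))  # 1
print(len(sorted_result[1]))    # 2
```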
You can add an else block to save a little time and slightly improve performance:
output = {}
subblkDBF: list[dict]
for row in subblkDBF:
    if row["OBJECTID"] not in output:
        output[row["OBJECTID"]] = [row]
    else:
        output[row["OBJECTID"]].append(row)

How to unpack dict with one key-value pair to two variables more elegantly? [duplicate]

This question already has answers here:
How to extract dictionary single key-value pair in variables
(11 answers)
Closed 3 years ago.
Currently, I'm using this:
d = {'a': 'xyz'}
k, v = list(*d.items())
The starred expression is required here, as omitting it causes the list function/constructor to return a list with a single tuple, which contains the key and value.
However, I was wondering if there were a better way to do it.
Keep nesting:
>>> d = {'a': 'xyz'}
>>> ((k,v),) = d.items()
>>> k
'a'
>>> v
'xyz'
Or equivalently:
>>> (k,v), = d.items()
>>> k
'a'
>>> v
'xyz'
>>>
Not sure which I prefer, the last one might be a bit difficult to read if I was glancing at it.
Note, the advantage here is that it is non-destructive and fails if the dict has more than one key-value pair.
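A short sketch of that failure mode, using a second made-up dictionary (`two`) with two pairs. The unpacking raises ValueError instead of silently picking one pair, and the original dict is left untouched.

```python
d = {"a": "xyz"}
(k, v), = d.items()  # succeeds: exactly one key-value pair

# Hypothetical dict with two pairs, to show the unpacking refusing it.
two = {"a": 1, "b": 2}
failed = False
try:
    (k2, v2), = two.items()
except ValueError:
    failed = True

print(k, v)    # a xyz
print(failed)  # True
print(d)       # {'a': 'xyz'}
```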
Since dict items do not support indexed access, you might resort to the following non-mutating retrieval of the first (and only) item:
k, v = next(iter(d.items()))
This has the advantage of working for dicts of any size while remaining an O(1) operation, unlike solutions that unpack all the items or convert them to a list.
If you don't mind the dictionary getting altered, this'll do:
k, v = d.popitem()
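The two retrieval styles behave differently on a dict with more than one pair. A minimal sketch, with a made-up two-item dict, assuming Python 3.7 or later (where dicts keep insertion order and popitem() removes the most recently inserted pair):

```python
# Hypothetical two-item dict to show the difference.
d = {"a": "xyz", "b": "uvw"}

# Non-mutating: looks at the first inserted pair and leaves d alone.
first_k, first_v = next(iter(d.items()))
size_after_peek = len(d)

# Mutating: removes and returns the last inserted pair.
last_k, last_v = d.popitem()

print(first_k, first_v)  # a xyz
print(last_k, last_v)    # b uvw
print(d)                 # {'a': 'xyz'}
```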

Python printing single key/value pairs from a dict [duplicate]

This question already has answers here:
Get key by value in dictionary
(43 answers)
Closed 7 years ago.
Say I have the following code that makes a dict:
x = 0
myHash = {}
name = ["Max","Fred","Alice","Bobby"]
while x <= 3:
    myHash[name[x]] = x
    x += 1
l = sorted(myHash.values(), reverse=True)
largestNum = l[0]
# print myHash.getKeyFromValue(largestNum)
Is it possible to easily get the key that is paired to my largestNum variable without looping through the entire dict? Something like the pseudo code in the line at the bottom.
Note: I don't want to get a value from a key. I want the reverse of that.
Don't just sort the values. Sort the items by their values, and get the key for free.
from operator import itemgetter
l = sorted(myHash.items(), key=itemgetter(1), reverse=True)
largestKey, largestNum = l[0]
Note: If you only want the largest value, not the rest of the sort results, you can save some work and skip the full sorted work (reducing work from O(n log n) to O(n)):
largestKey, largestNum = max(myHash.items(), key=itemgetter(1))
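Put together with the dictionary from the question, the max() version could look like the sketch below. The dict is built with enumerate here instead of the while loop, which is an editorial shortcut and not part of the original answer.

```python
from operator import itemgetter

name = ["Max", "Fred", "Alice", "Bobby"]
# Same mapping as in the question: each name maps to its list index.
myHash = {n: i for i, n in enumerate(name)}

# A single O(n) pass over the items, comparing by value (index 1 of each pair).
largestKey, largestNum = max(myHash.items(), key=itemgetter(1))

print(largestKey, largestNum)  # Bobby 3
```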
For the general case of inverting a dict, if the values are unique, it's trivial to create a reversed mapping:
invert_dict = {v: k for k, v in orig_dict.items()}
If the values aren't unique, and you want to find all keys corresponding to a single value with a single lookup, you'd invert to a multi-dict:
from collections import defaultdict
invert_dict = defaultdict(set)
for k, v in orig_dict.items():
    invert_dict[v].add(k)
# Optionally convert back to regular dict to avoid lookup auto-vivification in the future:
# invert_dict = dict(invert_dict)
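A runnable sketch of the multi-dict inversion on made-up data, which also shows the auto-vivification that the optional conversion avoids (the sample names and the `probe` variable are assumptions for illustration):

```python
from collections import defaultdict

# Hypothetical mapping in which two keys share the value 1.
orig_dict = {"Max": 1, "Fred": 2, "Alice": 1}

invert_dict = defaultdict(set)
for k, v in orig_dict.items():
    invert_dict[v].add(k)

# A plain lookup on a defaultdict silently creates the missing key.
probe = defaultdict(set)
probe[99]
created_by_lookup = 99 in probe

# Converting back to a regular dict makes missing keys raise KeyError again.
invert_dict = dict(invert_dict)

print(invert_dict[1] == {"Max", "Alice"})  # True
print(created_by_lookup)                   # True
```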

filter items in a python dictionary where keys contain a specific string

I'm a C coder developing something in python. I know how to do the following in C (and hence in C-like logic applied to python), but I'm wondering what the 'Python' way of doing it is.
I have a dictionary d, and I'd like to operate on a subset of the items, only those whose key (string) contains a specific substring.
i.e. the C logic would be:
for key in d:
    if filter_string in key:
        # do something
    else:
        # do nothing, continue
I'm imagining the python version would be something like
filtered_dict = crazy_python_syntax(d, substring)
for key,value in filtered_dict.iteritems():
    # do something
I've found a lot of posts on here regarding filtering dictionaries, but couldn't find one which involved exactly this.
My dictionary is not nested and I'm using Python 2.7.
How about a dict comprehension:
filtered_dict = {k:v for k,v in d.iteritems() if filter_string in k}
Once you see it, it should be self-explanatory, as it reads much like English.
This syntax requires Python 2.7 or greater.
In Python 3, there is only dict.items(), not iteritems() so you would use:
filtered_dict = {k:v for (k,v) in d.items() if filter_string in k}
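A minimal Python 3 sketch of the comprehension in use. The dictionary contents and the substring are made up for illustration.

```python
# Hypothetical data: keep only the items whose key contains "foo".
d = {"foo_a": 1, "bar_b": 2, "foo_c": 3}
filter_string = "foo"

filtered_dict = {k: v for k, v in d.items() if filter_string in k}

print(filtered_dict)  # {'foo_a': 1, 'foo_c': 3}
```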
Go for whatever is most readable and easily maintainable. Just because you can write it out in a single line doesn't mean that you should. Your existing solution is close to what I would use, except that I would use iteritems to skip the value lookup, and I hate nested ifs if I can avoid them:
for key, val in d.iteritems():
    if filter_string not in key:
        continue
    # do something
However if you realllly want something to let you iterate through a filtered dict then I would not do the two step process of building the filtered dict and then iterating through it, but instead use a generator, because what is more pythonic (and awesome) than a generator?
First we create our generator, and good design dictates that we make it abstract enough to be reusable:
# The implementation of my generator may look vaguely familiar, no?
def filter_dict(d, filter_string):
    for key, val in d.iteritems():
        if filter_string not in key:
            continue
        yield key, val
And then we can use the generator to solve your problem nice and cleanly with simple, understandable code:
for key, val in filter_dict(d, some_string):
    # do something
In short: generators are awesome.
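For readers on Python 3, where iteritems() no longer exists, the same generator might be sketched as follows. The sample dictionary and substring are assumptions for illustration.

```python
def filter_dict(d, filter_string):
    """Yield (key, value) pairs whose key contains filter_string."""
    for key, val in d.items():  # items() replaces iteritems() in Python 3
        if filter_string not in key:
            continue
        yield key, val


# Hypothetical data to exercise the generator.
d = {"foo_a": 1, "bar_b": 2, "foo_c": 3}

matches = []
for key, val in filter_dict(d, "foo"):
    matches.append((key, val))

print(matches)  # [('foo_a', 1), ('foo_c', 3)]
```

No intermediate dictionary is built; pairs are produced one at a time as the loop asks for them.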
You can use the built-in filter function to filter dictionaries, lists, etc. based on specific conditions.
filtered_dict = dict(filter(lambda item: filter_string in item[0], d.items()))
The advantage is that you can use it for different data structures.
input = {"A":"a", "B":"b", "C":"c"}
output = {k:v for (k,v) in input.items() if key_satisfies_condition(k)}
Jonathon gave you an approach using dict comprehensions in his answer. Here is an approach that deals with your do something part.
If you want to do something with the values of the dictionary, you don't need a dictionary comprehension at all:
I'm using iteritems() since you tagged your question with python-2.7
results = map(some_function, [(k,v) for k,v in a_dict.iteritems() if 'foo' in k])
Now the result will be in a list with some_function applied to each key/value pair of the dictionary, that has foo in its key.
If you just want to deal with the values and ignore the keys, just change the list comprehension:
results = map(some_function, [v for k,v in a_dict.iteritems() if 'foo' in k])
some_function can be any callable, so a lambda would work as well:
results = map(lambda x: x*2, [v for k,v in a_dict.iteritems() if 'foo' in k])
The inner list is actually not required, as you can pass a generator expression to map as well:
>>> map(lambda a: a[0]*a[1], ((k,v) for k,v in {2:2, 3:2}.iteritems() if k == 2))
[4]
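The snippets above are written for Python 2.7. On Python 3, map() returns a lazy iterator rather than a list and iteritems() is gone, so a roughly equivalent sketch would wrap the call in list() and use items(). The sample dictionary is made up for illustration.

```python
# Hypothetical data: double the values whose key contains "foo".
a_dict = {"foo1": 2, "bar": 5, "foo2": 3}

# A generator expression feeds map(); list() forces the lazy result.
results = list(map(lambda x: x * 2, (v for k, v in a_dict.items() if "foo" in k)))

print(results)  # [4, 6]
```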
You can use the built-in function 'filter()':
data = {'aaa':12, 'bbb':23, 'ccc':8, 'ddd':34}
# filter by key
print(dict(filter(lambda e:e[0]=='bbb', data.items() ) ) )
# filter by value
print(dict(filter(lambda e:e[1]>18, data.items() ) ) )
OUTPUT:
{'bbb': 23}
{'bbb': 23, 'ddd': 34}

Slicing a dictionary of list

I have a dictionary of lists, each list longer than 50 items, and to simplify, let's say the dictionary keys are ['a','b','c']. I spent way too long trying to figure out a very pythonic way to sort and slice these lists. What I have so far:
dict = dictionary_of_lists  # the dictionary under discussion
[dict[k].sort(reverse=True) for k in dict.keys()]
for k, l in dict.items():
    slice = 10 if k in ('a','c') else 20
    dict[k] = l[:slice]
I end up with sorted and trimmed lists, just like I want. But what I wanted was a one-line piece of code, like [dict[k].sort(reverse=True) for k in dict.keys()], that also slices the sorted list. And if someone can figure out how to put the sorting and slicing together, they would be my hero.
UPDATE: First, I like being able to ask somewhat complex questions because they help me learn better coding skills (since I am self taught). So thanks everyone below! My new code:
for c in list_of_categories:
    list = [getattr(p,c.name) for p in people if hasattr(p,c.name)]
    slice = c.get_slice_value # I added a @property function to a class named `Category`
    c.total = sum(sorted(list, reverse=True)[:slice])
List comprehensions with side effects are usually considered bad style. Create a new dict instead:
dct = {k: sorted(l, reverse=True)[:10 if k in ('a','c') else 20]
       for k, l in dct.items()}
Also, the slice values look arbitrary at the moment, so it might be better to configure them separately, for example:
slices = {
    'a': 10,
    'b': 20,
    'c': 10
}
dct = {k: sorted(l, reverse=True)[:slices[k]]
       for k, l in dct.items()}
sort() works in place, affecting each list. You'd want to create new ones:
[sorted(d[k], reverse = True)[:10 if k in ('a','c') else 20] for k in d.keys()]
Note that it's not very readable.
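A small runnable sketch of sorting and slicing in one dict comprehension. The lists and the per-key slice sizes are made up and much shorter than the 50+ items in the question, purely to keep the output readable.

```python
# Hypothetical data; the real lists would be longer.
dct = {"a": [3, 1, 2], "b": [9, 7, 8, 6], "c": [5, 4]}
# Hypothetical per-key slice sizes, configured separately from the logic.
slices = {"a": 2, "b": 3, "c": 1}

# sorted() returns a new list, so the original lists are left unmodified.
trimmed = {k: sorted(l, reverse=True)[:slices[k]] for k, l in dct.items()}

print(trimmed)   # {'a': [3, 2], 'b': [9, 8, 7], 'c': [5]}
print(dct["a"])  # [3, 1, 2]
```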
