Top Three Values in a Dictionary, No Repeated Values - python

I want to be able to print the top three values in a dictionary created in another function, where there may be repeating values.
For example, if I have a dictionary d = { a:1, b:2, c:3, d:3, e:4 } I would only want a, b, and c returned.
This is what I currently have, but the output would be a, b, c, d. I don't want to remove d from the dictionary, I just don't want it returned when I run this function.
def top3(filename: str):
    """
    Takes dict defined in wd_inventory, identifies top 3 words in dict
    :param filename:
    :return:
    """
    d = max_frequency(filename)
    x = list(d.values())
    x.sort(reverse=True)
    y = set(x)
    x = x[0:3]
    for i in x:
        for j in d.keys():
            if d[j] == i:
                print(str(j) + " : " + str(d[j]))
    return

One solution could be the following:
d = { "a":3, "b":4, "c":2, "d":5, "e":1}
print(sorted(d.items(), key=lambda x: x[1])[:3])
OUTPUT
[('e', 1), ('c', 2), ('a', 3)]
Note that this returns the three entries with the smallest values (sorted by value), not simply the first three entries of the dictionary.
EDIT
I don't know what repeating value means exactly, but let's assume that in a dictionary like:
d = {"a":1, "b": 2, "c": 3, "d": 1, "e": 1}
You would like to print just a, b and c (given that d and e repeat the same value as a)
You could use the following approach:
from collections import defaultdict
res = defaultdict(list)
for key, val in sorted(d.items()):
    res[val].append(key)
print([y[0] for x, y in list(res.items())])
OUTPUT
['a', 'b', 'c']

You can use heapq.nsmallest() to get the n smallest values in an iterable. This might be especially useful if the dict is very large, because it saves sorting a whole list only to select just three elements of it.
from heapq import nsmallest
from operator import itemgetter
def top3(dct):
    return nsmallest(3, dct.items(), key=itemgetter(1))
dct = {'a':1, 'b':2, 'c':3, 'd':3, 'e':4}
for k, v in top3(dct):
    print(f"{k}: {v}")
Output
a: 1
b: 2
c: 3
Due credit: I copied parts of j1-lee's code to use as a template.
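If repeated values should be skipped rather than merely cut off after three entries, one possible approach is to walk the items in value order and remember which values have already been seen. This is only a sketch: the name top3_unique and the parameter n are made up for illustration, and it assumes "top" means smallest value, as in the question.

```python
def top3_unique(dct, n=3):
    """Hypothetical helper: first n entries by ascending value, one per distinct value."""
    seen = set()
    result = []
    # sorted() is stable, so among equal values the earliest-inserted key wins
    for key, value in sorted(dct.items(), key=lambda kv: kv[1]):
        if value in seen:
            continue  # skip keys that repeat an already reported value
        seen.add(value)
        result.append((key, value))
        if len(result) == n:
            break
    return result


for k, v in top3_unique({'a': 1, 'b': 2, 'c': 3, 'd': 3, 'e': 4}):
    print(f"{k}: {v}")
```

The original dictionary is left untouched; only the returned list omits the repeated values.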

[edited]
Sorry, I had overlooked that the smallest number has the highest rank.
The code now sorts the dictionary's items by value, which produces a list of tuples.
dic = {'aaa':3, 'xxx':1, 'ccc':8, 'yyy': 4, 'kkk':12}
res = sorted(dic.items(), key=lambda x: x[1])
print(res[:3])
result is:
[('xxx', 1), ('aaa', 3), ('yyy', 4)]

Related

Insert nested values into a dictionary in Python

say I have a simple dictionary d={'a':1}
I wish to run a line d['b']['c'] = 2 but I can't, I get: KeyError: 'b'
I don't want to insert b first with an empty dictionary because most of the time, this dictionary will contain b with more values except for c.
Is there an elegant way to do it so my final dictionary is:
d = {'a':1,
'b':{'c':2}}
Is defaultdict sufficient for you?
from collections import defaultdict
d = defaultdict()
d['a'] = 1
print(d) # this gives back: defaultdict(None, {'a': 1})
d['b'] = {'c':2}
print(d) # this gives back: defaultdict(None, {'a': 1, 'b': {'c': 2}})
For a better example of defaultdict:
s = 'mississippi'
d = defaultdict(int)
for k in s:
    d[k] += 1
sorted(d.items()) # this gives back: [('i', 4), ('m', 1), ('p', 2), ('s', 4)]
When a letter is first encountered, it is missing from the mapping, so the default_factory function calls int() to supply a default count of zero. The increment operation then builds up the count for each letter.
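Applied to the original question, the same mechanism might look like the sketch below. It assumes that every missing top-level key should default to an empty dict, which may or may not fit your data; the variable name result is only for illustration.

```python
from collections import defaultdict

d = defaultdict(dict)   # missing keys are created as empty dicts
d['a'] = 1              # ordinary assignment still works
d['b']['c'] = 2         # 'b' does not exist yet, so dict() supplies {}

result = dict(d)        # convert back to a plain dict if desired
print(result)
```

Keep in mind that merely reading a missing key (for example d['x']) also creates it.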
If you don't want to assign an empty dict first (which could overwrite an existing one), you can check whether the nested dict is already there. It's not a one-liner, but it is quite clear, I think:
d ={'a':1}
b = d.get('b', {})
b['c'] = 2
d['b'] = b
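The get-then-assign steps above can also be folded into a single statement with dict.setdefault, which returns the existing value for the key or inserts the given default and returns that. A minimal sketch:

```python
d = {'a': 1}

# Creates d['b'] = {} only if 'b' is missing, then sets 'c' inside it
d.setdefault('b', {})['c'] = 2

# An existing nested dict is kept, not replaced
d.setdefault('b', {})['e'] = 3
print(d)
```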

Filter out entries in dictionary based on the sum of the keys

# given I want to find the x largest,
dictionary = {'cat': (3,4,5), 'meow': (6,4,1), 'dog': (1,2,3)}
x = list(dictionary.values())
lst = []
for i in x:
    lst.append(sum(i))
for num in range (x-1):
    final = (filter(lambda x: sum(x.key()) == lst[num] , dictionary)
print(final)
# expected answer (if x = 1):
{"cat": (3,4,5)}
#because it has the largest sum (3+4+5)
This is what I got so far, but I got EOL while scanning string literal error which means I can't even test to see if it's right.
EDIT: Just tried another method, which seems more succinct. However, how to make it loop for the entire dictionary?
def wrtd2(s):
    dictionary = {}
    for val in s:
        x = max(list(s.items()), key = lambda x: sum(x[1]))
        if x[0] not in dictionary:
            dictionary[x[0]] = x[1]
    return dictionary
You can also use max with map to retrieve the maximum value sum.
Then filter your dictionary via a dictionary comprehension.
d = {'cat': (3,4,5), 'meow': (6,4,1), 'dog': (1,2,3)}
maxval = max(map(sum, d.values()))
res = {k: v for k, v in d.items() if sum(v) == maxval}
# {'cat': (3, 4, 5)}
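The comprehension above keeps only the entries with the single largest sum. For the more general "x largest" case from the question, heapq.nlargest with a key function is one option. The sketch below uses a made-up helper name, top_by_sum, and assumes ties can be resolved in whatever order nlargest returns them.

```python
from heapq import nlargest


def top_by_sum(d, x):
    """Hypothetical helper: the x entries of d whose value tuples have the largest sums."""
    return dict(nlargest(x, d.items(), key=lambda kv: sum(kv[1])))


d = {'cat': (3, 4, 5), 'meow': (6, 4, 1), 'dog': (1, 2, 3)}
print(top_by_sum(d, 1))
print(top_by_sum(d, 2))
```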

Append dict items to a list with a single line for loop

I have a list
lst = []
I have dict entries
a= {'a':1,'b':2}
I wish to write a for loop in a comprehension manner filling the list.
What I have tried is
lst.append(k,v) for (k,v) in a.items()
I need to then update the dict as
a = {'c':3, 'd':4}
Then again update the list lst.
so that it ends up containing the tuples [('a',1), ('b',2), ('c',3), ('d',4)].
What is the right way to iterate through a dict and fill the list?
This is what the syntax for a list comprehension is and should do what you're looking for:
lst = [(k,v) for k,v in a.items()]
In general list comprehension works like this:
someList = [doSomething(x) for x in somethingYouCanIterate]
OUTPUT
>>> lst
[('a', 1), ('b', 2)]
P.S. Apart from the question asked, you can also get the same result without a list comprehension by simply calling:
lst = list(a.items())
This again gives you a list of (key, value) tuples for the dictionary's items. (In Python 3, a.items() on its own returns a view object rather than a list, hence the list() call.)
EDIT
After your updated question, since you're updating the dictionary and want the key value pairs in a list, you should do it like:
a= {'a':1,'b':2}
oldA = a.copy()
#after performing some operation
a = {'c':3, 'd':4}
oldA.update(a)
# when all your updates on a is done
lst = list(oldA.items()) # or [(k,v) for k,v in oldA.items()]
# or instead of updating a and maintaining a copy
# you can simply update it like : a.update({'c':3, 'd':4}) instead of a = {'c':3, 'd':4}
One approach is:
a = {"a" : 1, "b" : 2}
lst = [(k, a[k]) for k in a]
a = {"c" : 3, "d" : 4}
lst += [(k, a[k]) for k in a]
Where the contents of lst are [('a', 1), ('b', 2), ('c', 3), ('d', 4)].
Alternatively, using the dict class' .items() function to accomplish the same:
a = {"a" : 1, "b" : 2}
lst = [b for b in a.items()]
a = {"c" : 3, "d" : 4}
lst += [b for b in a.items()]
There are many valid ways to achieve this. The easiest route is using
a = {"a" : 1, "b" : 2}
lst = list(a.items())
Alternatives include using the zip function, list comprehension etc.
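Since the list in the question is supposed to grow each time the dict is replaced, list.extend may be the closest match to the attempted lst.append(...) loop. A minimal sketch using the values from the question:

```python
lst = []

a = {'a': 1, 'b': 2}
lst.extend(a.items())   # appends each (key, value) tuple

a = {'c': 3, 'd': 4}
lst.extend(a.items())   # earlier entries are kept

print(lst)
```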

list to dictionary conversion with multiple values per key?

I have a Python list which holds pairs of key/value:
l = [[1, 'A'], [1, 'B'], [2, 'C']]
I want to convert the list into a dictionary, where multiple values per key would be aggregated into a tuple:
{1: ('A', 'B'), 2: ('C',)}
The iterative solution is trivial:
l = [[1, 'A'], [1, 'B'], [2, 'C']]
d = {}
for pair in l:
    if pair[0] in d:
        d[pair[0]] = d[pair[0]] + tuple(pair[1])
    else:
        d[pair[0]] = tuple(pair[1])
print(d)
{1: ('A', 'B'), 2: ('C',)}
Is there a more elegant, Pythonic solution for this task?
from collections import defaultdict
d1 = defaultdict(list)
for k, v in l:
    d1[k].append(v)
d = dict((k, tuple(v)) for k, v in d1.items())
d contains now {1: ('A', 'B'), 2: ('C',)}
d1 is a temporary defaultdict with lists as values, which will be converted to tuples in the last line. This way you are appending to lists and not recreating tuples in the main loop.
Using lists instead of tuples as dict values:
l = [[1, 'A'], [1, 'B'], [2, 'C']]
d = {}
for key, val in l:
    d.setdefault(key, []).append(val)
print(d)
Using a plain dictionary is often preferable over a defaultdict, in particular if you build it just once and then continue to read from it later in your code:
First, the plain dictionary is faster to build and access.
Second, and more importantly, the later read operations will error out if you try to access a key that doesn't exist, instead of silently creating that key. A plain dictionary lets you explicitly state when you want to create a key-value pair, while the defaultdict always implicitly creates them, on any kind of access.
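The difference in read behaviour described above can be seen in a small sketch (the variable names are illustrative only):

```python
from collections import defaultdict

plain = {}
plain.setdefault(1, []).append('A')   # key creation is explicit

dd = defaultdict(list)
dd[1].append('A')                     # key creation is implicit

try:
    plain[99]                         # missing key in a plain dict
    plain_raised = False
except KeyError:
    plain_raised = True

_ = dd[99]                            # no error: key 99 now exists with []

print(plain_raised, 99 in plain, 99 in dd)
```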
This method is relatively efficient and quite compact:
reduce(lambda x, (k,v): x[k].append(v) or x, l, defaultdict(list))
In Python 3 this becomes (making the imports explicit):
dict(functools.reduce(lambda x, d: x[d[0]].append(d[1]) or x, l, collections.defaultdict(list)))
Note that reduce has moved to functools and that lambdas no longer support tuple parameter unpacking. This version still works in 2.6 and 2.7.
Are the keys already sorted in the input list? If that's the case, you have a functional solution:
import itertools
lst = [(1, 'A'), (1, 'B'), (2, 'C')]
dct = dict((key, tuple(v for (k, v) in pairs))
           for (key, pairs) in itertools.groupby(lst, lambda pair: pair[0]))
print(dct)
# {1: ('A', 'B'), 2: ('C',)}
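If the keys are not already sorted, groupby would start a new group every time the key changes, so the input needs to be sorted by key first. A sketch under that assumption (the names pairs and grouped are illustrative):

```python
from itertools import groupby
from operator import itemgetter

pairs = [(2, 'C'), (1, 'A'), (1, 'B')]   # deliberately unsorted

first = itemgetter(0)
# sorted() is stable, so values keep their original relative order per key
grouped = {
    key: tuple(value for _, value in group)
    for key, group in groupby(sorted(pairs, key=first), key=first)
}
print(grouped)
```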
I had a list of values created as follows:
performance_data = driver.execute_script('return window.performance.getEntries()')
Then I had to store the data (name and duration) in a dictionary with multiple values:
dictionary = {}
for _ in range(3):
    driver.get(self.base_url)
    performance_data = driver.execute_script('return window.performance.getEntries()')
    for result in performance_data:
        key=result['name']
        val=result['duration']
        dictionary.setdefault(key, []).append(val)
print(dictionary)
My data was in a Pandas.DataFrame
myDict = dict()
for id in set(data['id'].values):
    temp = data[data['id'] == id]
    myDict[id] = temp['IP_addr'].to_list()
myDict
This gave me a dict mapping each id to a list of one or more IP_addr values. At least one IP_addr per id is guaranteed in my data, but the code should also work if temp['IP_addr'].to_list() == []
{'fooboo_NaN': ['1.1.1.1', '8.8.8.8']}
My two cents for this discussion.
I tried to come up with a one-line solution using only the standard library (excuse the two extra imports). Perhaps the code below solves the problem well enough (for Python 3):
from functools import reduce
from collections import defaultdict
a = [1, 1, 2, 3, 1]
b = ['A', 'B', 'C', 'D', 'E']
c = zip(a, b)
print({**reduce(lambda d,e: d[e[0]].append(e[1]) or d, c, defaultdict(list))})

How to write a function that takes a string and prints the letters in decreasing order of frequency?

I got this far:
def most_frequent(string):
    d = dict()
    for key in string:
        if key not in d:
            d[key] = 1
        else:
            d[key] += 1
    return d
print most_frequent('aabbbc')
Returning:
{'a': 2, 'c': 1, 'b': 3}
Now I need to:
reverse the pair
sort by number by decreasing order
only print the letters out
Should I convert this dictionary to tuples or list?
Here's a one line answer
sortedLetters = sorted(d.iteritems(), key=lambda (k,v): (v,k))
This should do it nicely.
def frequency_analysis(string):
    d = dict()
    for key in string:
        d[key] = d.get(key, 0) + 1
    return d
def letters_in_order_of_frequency(string):
    frequencies = frequency_analysis(string)
    # frequencies is of bounded size because number of letters is bounded by the dictionary, not the input size
    frequency_list = [(freq, letter) for (letter, freq) in frequencies.iteritems()]
    frequency_list.sort(reverse=True)
    return [letter for freq, letter in frequency_list]
string = 'aabbbc'
print letters_in_order_of_frequency(string)
Here is something that returns a list of tuples rather than a dictionary:
import operator
if __name__ == '__main__':
    test_string = 'cnaa'
    string_dict = dict()
    for letter in test_string:
        if letter not in string_dict:
            string_dict[letter] = test_string.count(letter)
    # Sort dictionary by values, credits go here http://stackoverflow.com/questions/613183/sort-a-dictionary-in-python-by-the-value/613218#613218
    ordered_answer = sorted(string_dict.items(), key=operator.itemgetter(1), reverse=True)
    print ordered_answer
Python 2.7 supports this use case directly:
>>> from collections import Counter
>>> Counter('abracadabra').most_common()
[('a', 5), ('r', 2), ('b', 2), ('c', 1), ('d', 1)]
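To get only the letters, as the question asks, the pairs returned by most_common() can be reduced to their first element. The sketch below is written for Python 3; the function name letters_by_frequency is made up for illustration.

```python
from collections import Counter


def letters_by_frequency(s):
    """Hypothetical helper: letters of s in decreasing order of frequency."""
    # most_common() yields (letter, count) pairs, highest count first
    return ''.join(letter for letter, _ in Counter(s).most_common())


print(letters_by_frequency('aabbbc'))
```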
chills42's lambda function wins, I think, but as an alternative, how about generating the dictionary with the counts as the keys instead?
def count_chars(string):
    distinct = set(string)
    dictionary = {}
    for s in distinct:
        num = len(string.split(s)) - 1
        dictionary[num] = s
    return dictionary
def print_dict_in_reverse_order(d):
    _list = d.keys()
    _list.sort()
    _list.reverse()
    for s in _list:
        print d[s]
EDIT: This will do what you want. I'm stealing chills42's line and adding another:
sortedLetters = sorted(d.iteritems(), key=lambda (k,v): (v,k))
sortedString = ''.join([c[0] for c in reversed(sortedLetters)])
------------original answer------------
To print out the sorted string, add another line to chills42's one-liner:
''.join(map(lambda c: str(c[0]*c[1]), reversed(sortedLetters)))
This prints out 'bbbaac'
If you want single letters, 'bac' use this:
''.join([c[0] for c in reversed(sortedLetters)])
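The lines above rely on Python 2 features (iteritems and tuple parameters in lambdas). A rough Python 3 equivalent, assuming d is the count dictionary from the question, could look like this sketch:

```python
d = {'a': 2, 'c': 1, 'b': 3}   # counts for 'aabbbc', as in the question

# Sort by (count, letter); index into the pair instead of unpacking in the lambda
sorted_letters = sorted(d.items(), key=lambda kv: (kv[1], kv[0]))

# Reverse to get decreasing frequency and keep only the letters
sorted_string = ''.join(letter for letter, _ in reversed(sorted_letters))
print(sorted_string)
```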
from collections import defaultdict
def most_frequent(s):
    d = defaultdict(int)
    for c in s:
        d[c] += 1
    return "".join([
        k for k, v in sorted(
            d.iteritems(), reverse=True, key=lambda (k, v): v)
    ])
EDIT:
here is my one liner:
def most_frequent(s):
    return "".join([
        c for frequency, c in sorted(
            [(s.count(c), c) for c in set(s)], reverse=True
        )
    ])
Here's the code for your most_frequent function:
>>> a = 'aabbbc'
>>> {i: a.count(i) for i in set(a)}
{'a': 2, 'c': 1, 'b': 3}
This particular syntax is for py3k, but it's easy to write something similar using the syntax of previous versions. It seems to me a bit more readable than yours.
def reversedSortedFrequency(string):
    from collections import defaultdict
    d = defaultdict(int)
    for c in string:
        d[c]+=1
    return sorted([(v,k) for k,v in d.items()], key=lambda (k,v): -k)
Here is the fixed version (thank you for pointing out bugs)
def frequency(s):
    return ''.join(
        [k for k, v in
         sorted(
             reduce(
                 lambda d, c: d.update([[c, d.get(c, 0) + 1]]) or d,
                 list(s),
                 dict()).items(),
             lambda a, b: cmp(a[1], b[1]),
             reverse=True)])
I think the use of reduce is what sets this solution apart from the others...
In action:
>>> from frequency import frequency
>>> frequency('abbbccddddxxxyyyyyz')
'ydbxcaz'
This includes extracting the keys (and counting them) as well!!! Another nice property is the initialization of the dictionary on the same line :)
Also: no includes, just builtins.
The reduce function is kinda hard to wrap my head around, and setting dictionary values in a lambda is also a bit cumbersome in python, but, ah well, it works!
