I am trying to write code that implements a greedy algorithm, and for that I need to make sure my calculations use the highest value possible. The potential values are stored in a dictionary, and my goal is to use the largest value first and then move on to lower values. However, since dictionaries are unordered, my for loop visits the values in an arbitrary order. For example, the output of the code below starts with 25.
How can I make sure that my code is using a dictionary yet following the sequence of (500,100,25,10,5)?
a={"f":500,"o":100,"q":25,"d":10,"n":5}
for i in a:
    print(a[i])
Two ideas spring to mind:
Use collections.OrderedDict, a dictionary subclass which remembers the order in which items are added. As long as you add the pairs in descending value order, looping over this dict will return them in the right order.
If you can't be sure the items will be added to the dict in the right order, you could construct them by sorting:
Get the values of the dictionary with values()
Sort by (ascending) value: this is sorted(), and Python will default to sorting in ascending order
Get them by descending value instead: this is reverse=True
Here's an example:
for value in sorted(a.values(), reverse=True):
    print(value)
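The first idea above, collections.OrderedDict, could look roughly like the sketch below. The name coins is only illustrative, and the sketch assumes the pairs really are inserted in descending value order.

```python
from collections import OrderedDict

# Pairs are added in descending value order, so iteration follows that order.
coins = OrderedDict([("f", 500), ("o", 100), ("q", 25), ("d", 10), ("n", 5)])

ordered_values = [coins[key] for key in coins]
print(ordered_values)  # [500, 100, 25, 10, 5]
```

If the insertion order cannot be guaranteed, the sorted() approach is the safer of the two.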
Dictionaries yield their keys when you iterate them normally, but you can use the items() view to get tuples of the key and value. That'll be un-ordered, but you can then use sorted() on the "one-th" element of the tuples (the value) with reverse set to True:
import operator

a={"f":500,"o":100,"q":25,"d":10,"n":5}
for k, v in sorted(a.items(), key=operator.itemgetter(1), reverse=True):
    print(v)
I'm guessing that you do actually need the keys, but if not, you can just use values() instead of items(): sorted(a.values(), reverse=True)
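As a rough illustration of how the sorted items could drive the greedy step from the question, here is a minimal sketch. The function name greedy_breakdown and the amount 640 are made up for this example; the question does not show the actual calculation.

```python
import operator

a = {"f": 500, "o": 100, "q": 25, "d": 10, "n": 5}

def greedy_breakdown(amount, denominations):
    """Return {key: count}, always using the largest value first (hypothetical helper)."""
    counts = {}
    ordered = sorted(denominations.items(), key=operator.itemgetter(1), reverse=True)
    for key, value in ordered:
        # Take as many of this value as fit, then carry the remainder forward.
        count, amount = divmod(amount, value)
        if count:
            counts[key] = count
    return counts

breakdown = greedy_breakdown(640, a)
print(breakdown)  # {'f': 1, 'o': 1, 'q': 1, 'd': 1, 'n': 1}
```

Keeping the keys in the loop is what lets the result say which entry each count belongs to.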
You can use this
>>> a={"f":500,"o":100,"q":25,"d":10,"n":5}
>>> for value in sorted(a.itervalues(),reverse=True):
...     print value
...
500
100
25
10
5
>>>
a={"f":500,"o":100,"q":25,"d":10,"n":5}
k = sorted(a, key=a.__getitem__, reverse=True)
v = sorted(a.values(), reverse=True)
sorted_a = list(zip(k,v))  # list() is needed on Python 3, where zip returns an iterator
print (sorted_a)
Output:
[('f', 500), ('o', 100), ('q', 25), ('d', 10), ('n', 5)]
Related
I have a list of tuples that can be understood as key-value pairs, where a key can appear several times, possibly with different values, for example
[(2,8),(5,10),(2,5),(3,4),(5,50)]
I now want to get a list of tuples with the highest value for each key, i.e.
[(2,8),(3,4),(5,50)]
The order of the keys is irrelevant.
How do I do that in an efficient way?
Sort them and then cast to a dictionary and take the items again from it:
l = [(2,8),(5,10),(2,5),(3,4),(5,50)]
list(dict(sorted(l)).items()) #python3, if python2 list cast is not needed
[(2, 8), (3, 4), (5, 50)]
The idea is that sorted() puts the pairs in ascending order, so when dict() consumes them, each later (higher) value overwrites the earlier (lower) one for the same key. Only the highest value per key survives, and items() hands the result back as tuples.
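Splitting the one-liner into its two steps may make the overwrite behaviour easier to see. This is only a sketch; the names ordered and highest are introduced here for illustration.

```python
l = [(2, 8), (5, 10), (2, 5), (3, 4), (5, 50)]

# Step 1: tuples sort by first element, then by second, both ascending.
ordered = sorted(l)
print(ordered)  # [(2, 5), (2, 8), (3, 4), (5, 10), (5, 50)]

# Step 2: dict() keeps the last value seen for each key, which is the largest here.
highest = list(dict(ordered).items())
print(highest)  # [(2, 8), (3, 4), (5, 50)]
```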
At its core, this problem is essentially about grouping the tuples based on their first element and then keeping only the maximum of each group.
Grouping can be done easily with a defaultdict. A detailed explanation of grouping with defaultdicts can be found in my answer here. In your case, we group the tuples by their first element and then use the max function to find the tuple with the largest number.
import collections
tuples = [(2,8),(5,10),(2,5),(3,4),(5,50)]
groupdict = collections.defaultdict(list)
for tup in tuples:
    group = tup[0]
    groupdict[group].append(tup)
result = [max(group) for group in groupdict.values()]
# result: [(2, 8), (5, 50), (3, 4)]
In your particular case, we can optimize the code a little bit by storing only the maximum 2nd element in the dict, rather than storing a list of all tuples and finding the maximum at the end:
tuples = [(2,8),(5,10),(2,5),(3,4),(5,50)]
groupdict = {}
for tup in tuples:
    group, value = tup
    if group in groupdict:
        groupdict[group] = max(groupdict[group], value)
    else:
        groupdict[group] = value
result = [(group, value) for group, value in groupdict.items()]
This keeps the memory footprint to a minimum, but only works for tuples with exactly 2 elements.
This has a number of advantages over Netwave's solution:
It's more readable. Anyone who sees a defaultdict being instantiated knows that it'll be used to group data, and the use of the max function makes it easy to understand which tuples are kept. Netwave's one-liner is clever, but clever solutions are rarely easy to read.
Since the data doesn't have to be sorted, this runs in linear O(n) time instead of O(n log n).
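The two-element restriction mentioned for the memory-saving variant could be lifted by storing the best whole tuple per group instead of just the value. The sketch below assumes the second element is still what gets compared; the three-element sample data is invented for illustration.

```python
tuples = [(2, 8, "x"), (5, 10, "y"), (2, 5, "z"), (3, 4, "w"), (5, 50, "v")]

best = {}
for tup in tuples:
    group = tup[0]
    # Compare on the second element only, but keep the whole tuple.
    if group not in best or tup[1] > best[group][1]:
        best[group] = tup

result = list(best.values())
print(result)  # [(2, 8, 'x'), (5, 50, 'v'), (3, 4, 'w')]
```

This still makes a single pass over the data and stores one tuple per group.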
I noticed that these two snippets give different results. One is a sorted list, while the other looks like a sorted dictionary. I can't figure out why adding .items() makes this difference:
aa={'a':1,'d':2,'c':3,'b':4}
bb=sorted(aa,key=lambda x:x[0])
print(bb)
#['a', 'b', 'c', 'd']
aa={'a':1,'d':2,'c':3,'b':4}
bb=sorted(aa.items(),key=lambda x:x[0])
print(bb)
# [('a', 1), ('b', 4), ('c', 3), ('d', 2)]
The first version implicitly sorts the keys in the dictionary, and is equivalent to sorting aa.keys(). The second version sorts the items, that is: a list of tuples of the form (key, value).
When you iterate over a dictionary, you get its keys, not (key, value) pairs. sorted() accepts any iterable, so passing the dictionary itself sorts only its keys, and that is why you see a difference.
You can verify this by printing while iterating over the dict:
aa={'a':1,'d':2,'c':3,'b':4}
for key in aa:
    print(key)
for key in aa.keys():
    print(key)
Both of the for loops above print the same values.
In the second example, the items() method applied to a dictionary returns an iterable collection of tuples (dictionary_key, dictionary_value). That collection is then sorted.
In the first example, the dictionary is automatically converted to an iterable collection of its keys first. (Note: with key=lambda x:x[0], only the first character of each key is used for comparison while sorting, which is probably NOT what you want.)
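The first-character caveat is invisible with single-letter keys, so here is a small sketch with longer keys. The fruit dictionary is made up for this example.

```python
fruit = {"banana": 1, "avocado": 2, "apple": 3}

# Iterating the dict yields key strings, so x[0] is the first character.
by_first_char = sorted(fruit, key=lambda x: x[0])
print(by_first_char)  # ['avocado', 'apple', 'banana']

# Without a key function the whole string is compared.
by_whole_key = sorted(fruit)
print(by_whole_key)  # ['apple', 'avocado', 'banana']

# With items(), x is a (key, value) tuple, so x[0] is the whole key.
by_item = sorted(fruit.items(), key=lambda x: x[0])
print(by_item)  # [('apple', 3), ('avocado', 2), ('banana', 1)]
```

In the first call 'avocado' stays ahead of 'apple' because both compare equal on 'a' and sorted() is stable.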
I have seen this post and this post as well as many others, but haven't quite found the answer to my question nor can I figure it out.
I have a dictionary of lists. For example it looks like:
Dict = {'a':[1,2,3,4], 'b':[9,8,7,6], 'c':[8,5,3,2]}
I want to return a list of the keys sorted (descending/reverse) based on a specific item in the lists. For example, I want to sort a,b,c based on the 4th item in each list.
This should return the list sorted_keys = ['b','a','c'] which were sorted by values [6,4,2].
Make sense? Please help...thanks!
Supply a key function, a lambda is easiest, and sort reversed:
sorted(Dict.keys(), key=lambda k: Dict[k][3], reverse=True)
The key function tells sorted what to sort by; the 4th item in the value for the given key.
Demo:
>>> sorted(Dict.keys(), key=lambda k: Dict[k][3], reverse=True)
['b', 'a', 'c']
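If the position to sort on varies, the same idea can be wrapped in a small helper. This is only a sketch; keys_sorted_by_item is a name invented here, not something from the question.

```python
Dict = {'a': [1, 2, 3, 4], 'b': [9, 8, 7, 6], 'c': [8, 5, 3, 2]}

def keys_sorted_by_item(d, index, reverse=True):
    """Return the keys of d ordered by d[key][index] (hypothetical helper)."""
    return sorted(d, key=lambda k: d[k][index], reverse=reverse)

by_fourth = keys_sorted_by_item(Dict, 3)
by_first = keys_sorted_by_item(Dict, 0)
print(by_fourth)  # ['b', 'a', 'c']
print(by_first)   # ['b', 'c', 'a']
```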
I have a dictionary of objects where the key is a simple string, and the value is a data object with a few attributes. I'd like to sort my dictionary based on an attribute of its values. I have used this to sort based on the dictionary's values:
sorted = dict.values()
sorted.sort(key = operator.attrgetter('total'), reverse=True)
This yields a sorted list of values (which is expected) and I lose my original keys from the dictionary (naturally). I would like to sort both the keys and values together... how can I achieve this? Any help would be greatly appreciated?
Use .items() (or its iterator version iteritems) instead of .values() to get a list of (key, value) tuples.
items = sorted(dct.iteritems(), key=lambda x: x[1].total, reverse=True)
You'll want to use .items() rather than .values(), for example:
def keyFromItem(func):
    return lambda item: func(*item)
sorted(
    dict.items(),
    key=keyFromItem( lambda k,v: (v.total, k) )  # the values are objects, so use attribute access
)
The above will sort first based on total, and for items with equal total, will sort them alphabetically by key. It will return items as (key,value) pairs, which you could just do [x[1] for x in sorted(...)] to get the values.
Use items instead of values, and just use a lambda to fetch the sorting key itself, since there won't be a ready-made operator for it:
sorted_items = sorted(dict.items(), key=lambda item: item[1].total, reverse=True)
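Put together as a self-contained sketch, sorting keys and values together by an attribute might look like this. The Record class and the sample data are assumptions made for the example; the question only says the values have a total attribute.

```python
class Record:
    """Stand-in for the question's data object (hypothetical)."""
    def __init__(self, total):
        self.total = total

records = {"alpha": Record(3), "beta": Record(10), "gamma": Record(7)}

# Each item is a (key, value) tuple, so item[1] is the Record.
ranked = sorted(records.items(), key=lambda item: item[1].total, reverse=True)

ranked_keys = [key for key, _ in ranked]
ranked_totals = [value.total for _, value in ranked]
print(ranked_keys)    # ['beta', 'gamma', 'alpha']
print(ranked_totals)  # [10, 7, 3]
```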
I have a dictionary that looks like this:
{'key_info': (rank, raw_data1, raw_data2),
'key_info2': ...}
Basically I need back a list of the keys in sorted order, that is sorted based on the rank field in the tuple.
My code looks something like this right now (diffs is the name of the dict above):
def _sortRanked(self):
    print(type(self.diffs))
    return sorted(self.diffs.keys(), key=lambda x: x[1], reverse=True)
that right now returns this when I run it:
return sorted(self.diffs.keys(), key=lambda x: x[1], reverse=True)
IndexError: string index out of range
keys() only gives you keys, not values, so you have to use the keys to retrieve values from the dict if you want to sort on them:
return sorted(self.diffs.keys(), key=lambda x: self.diffs[x], reverse=True)
Since you're sorting on rank, which is the first item in the tuple, you don't need to specify which item in the value tuple you want to sort on. But if you wanted to sort on raw_data1:
return sorted(self.diffs.keys(), key=lambda x: self.diffs[x][1], reverse=True)
You're passing the key as the argument to, uh, key.
[k for (k, v) in sorted(D.iteritems(), key=lambda x: x[1], reverse=True)]
You're attempting to sort on the keys of the dictionary, not the values. Replace your self.diffs.keys() call with self.diffs.items(), and then it should work (but do keep the lambda, or use operator.itemgetter(1). Tuples sort starting with the first element, so you don't have to worry about that.)
Just noticed that you only want the keys. With my suggestion, you'd have to wrap the sort with zip()[0] (making sure to unpack the resultant list of tuples from the sort by prefixing with * in the call to zip()).
You're close. Try this instead:
return sorted(self.diffs.keys(), key = lambda x: self.diffs[x][0], reverse = True)
You're sorting a list of keys, so you have to take each key back to the dictionary and retrieve element 0 of its tuple (the rank) in order to use it as a comparison value.
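A standalone sketch of the same lookup, outside the class, is shown below. The contents of diffs are invented here, since the question elides the real data.

```python
# Each value is (rank, raw_data1, raw_data2); the entries are made-up samples.
diffs = {
    "key_info": (2, "a", "b"),
    "key_info2": (5, "c", "d"),
    "key_info3": (1, "e", "f"),
}

# Look each key up in the dict and sort on the rank (element 0 of the tuple).
ranked_keys = sorted(diffs, key=lambda k: diffs[k][0], reverse=True)
print(ranked_keys)  # ['key_info2', 'key_info', 'key_info3']
```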