Python sort items by specific defined rule - python

Say I have a dict: d = {'Abc':5,'Jack':4,'amy':9,'Tom':0,'abc':5}
I want to write a function that I can pass to the built-in sort, e.g. list(d).sort(function), so that the list is sorted by value (highest first) and, where values are identical, by key in alphabetical order. So, in this case, d = {'Abc':5,'Jack':4,'amy':9,'Tom':0,'abc':5,'TAM':0} returns ['amy','Abc','abc','Jack','TAM','Tom']
The function should look something like this:
def arrange_items(something, thing):  # possibly with a function as a third argument
    if something < thing:
        return -1
    elif something > thing:
        return 1
    # etc.
If I call some_list.sort(arrange_items), I should get a sorted list back.
Thank you in advance
Modification of the specification (another question):
If I have a dict of Twitter users, in this format:
dict = {'JohnZ': {'name': 'Jonny Zue', 'follow': 'MiniT'}, etc}  # JohnZ is one of the Twitter users. 'follow' is the user that JohnZ follows, in this case MiniT.
The popularity of a user is the number of people who follow that user. In the above example, the popularity of MiniT is at least one, because at least one user follows MiniT.
Say I have a list of Twitter user names, L1 = ['JohnZ','MiniT',etc], and I want to sort L1 by the users' popularity (higher popularity first). dict is already defined in the global namespace, so we can access it directly. The requirement is to use L1.sort(pass_function).
How should I write pass_function so that sort will order L1 by the popularity of the users?
Thanks for helping

[k for k, v in sorted(d.iteritems(), key=lambda x: (-x[1], x[0].lower()))]
EDIT:
(I refuse to use the name "dict" since it shadows a builtin, and shadowing builtins is stupid)
L1.sort(key=lambda x: (-d.get(x, 0), x.lower()))
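For the second question, the sort key needs each user's popularity, which the dict does not store directly. A minimal sketch, assuming Python 3 and assuming that every record's 'follow' entry is a single user name as in the question's example; the names `users` and `popularity` and the sample records are hypothetical:

```python
from collections import Counter

# Hypothetical data in the shape described in the question.
users = {
    'JohnZ': {'name': 'Jonny Zue', 'follow': 'MiniT'},
    'MiniT': {'name': 'Mini T', 'follow': 'JohnZ'},
    'Amy': {'name': 'Amy A', 'follow': 'MiniT'},
}

# Count how many users follow each name; unknown names count as 0.
popularity = Counter(record['follow'] for record in users.values())

L1 = ['JohnZ', 'Amy', 'MiniT']
# Higher popularity first; ties broken by case-insensitive name.
L1.sort(key=lambda name: (-popularity[name], name.lower()))
print(L1)  # ['MiniT', 'JohnZ', 'Amy']
```

Counting once up front keeps the key function cheap, since it is called once per list element.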

You can't achieve this with list(d).sort(function), because list(d) gives you only the dictionary keys. You can achieve your objective with an alternative approach:
l1 = sorted(d.items(), key=lambda x: (x[1], x[0]))
l2 = sorted(l1, key=lambda x: x[1], reverse=True)
result = [x[0] for x in l2]
This approach converts the dictionary to a list of (key, value) tuples. l1 is sorted by value and then by key; l2 re-sorts it by value in descending order. Since Python's sort is stable (and reverse=True keeps equal elements in their original order), keys with identical values stay in alphabetical order.
Edit: Ignacio Vazquez-Abrams's approach is similar, but more elegant, because the list needs to be sorted only once.
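The question asks for a comparison function rather than a key function. In Python 3, list.sort no longer accepts a comparison function directly, but functools.cmp_to_key can wrap one. A sketch under those assumptions, using the dictionary from the question; letting arrange_items read d from the enclosing scope is an assumption about how the values would be reached:

```python
from functools import cmp_to_key

d = {'Abc': 5, 'Jack': 4, 'amy': 9, 'Tom': 0, 'abc': 5, 'TAM': 0}

def arrange_items(a, b):
    """Old-style comparison: higher value first, then key in ascending order."""
    if d[a] != d[b]:
        return -1 if d[a] > d[b] else 1
    # (a > b) - (a < b) yields -1, 0 or 1, like Python 2's cmp(a, b).
    return (a > b) - (a < b)

keys = list(d)
keys.sort(key=cmp_to_key(arrange_items))
print(keys)  # ['amy', 'Abc', 'abc', 'Jack', 'TAM', 'Tom']
```

Note that sort works in place and returns None, so the list has to be bound to a name before sorting.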

Related

Ordered Dictionary is not sorting

I've been researching online for a simple way to create an ordered dictionary and landed on OrderedDict and its update method. I've successfully implemented this once, but now the code does not sort the added terms. For example, the items being added are:
Doc1: Alpha, zebra, top
Doc2: Andres, tell, exta
Output: Alpha, top, zebra, Andres, exta, tell
My goal is to have Alpha, Andres......, top, zebra
This is the code:
finalindex = collections.OrderedDict()
ctr = 0
while ctr < docCtr:
    filename = 'dictemp%d.csv' % (ctr,)
    ctr += 1
    dicTempList = io.openTempDic(filename)
    print filename
    for key in dicTempList:
        if key in finalindex:
            print key
            for k, v in finalindex.items():
                newvalue = v + "," + dicTempList.get(key)
                finalindex.update([(key, newvalue)])
        else:
            finalindex.update([(key, dicTempList.get(key))])
io.saveTempDic(filename, finalindex)
Can someone please assist me?
OrderedDicts remember the order in which items were inserted. If you want one sorted, you need to sort the items when you create it. Here's how to sort an OrderedDict, in an example taken from the docs:
from collections import OrderedDict
d = {'banana': 3, 'apple':4, 'pear': 1, 'orange': 2}
sorted_dict = OrderedDict(sorted(d.items(), key=lambda t: t[0]))
This also works when the source is another ordered dict. I prefer to import the module and reference its functions and classes through it, for clarity to the reader, so the style here is slightly different. Again, to get a sorted result, you need to sort the items before creating a new OrderedDict:
import collections
ordered_dict=collections.OrderedDict()
ordered_dict['foo'] = 1
ordered_dict['bar'] = 2
ordered_dict['baz'] = 3
sorted_dict = collections.OrderedDict(sorted(ordered_dict.items(),
                                             key=lambda t: t[0]))
and sorted_dict returns:
OrderedDict([('bar', 2), ('baz', 3), ('foo', 1)])
If lambdas are confusing, you can use operator.itemgetter:
import operator
get_first = operator.itemgetter(0)
sorted_dict = collections.OrderedDict(sorted(ordered_dict.items(),
                                             key=get_first))
I'm using key arguments to demonstrate their usage in case you want to sort by values, but Python sorts tuples (which dict.items() provides as a list in Python 2 and as a view in Python 3) by first element, then second, and so on, so you can even do this and get the same result:
sorted_dict = collections.OrderedDict(sorted(ordered_dict.items()))
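Applied to the question, this means merging all documents first and sorting once at the end. A minimal sketch, assuming Python 3; the contents of `doc1` and `doc2` are hypothetical stand-ins for whatever io.openTempDic returns, and the case-insensitive key is an assumption based on the desired output:

```python
import collections

# Hypothetical contents of the two documents from the question.
doc1 = {'Alpha': 'doc1', 'zebra': 'doc1', 'top': 'doc1'}
doc2 = {'Andres': 'doc2', 'tell': 'doc2', 'exta': 'doc2'}

merged = {}
for doc in (doc1, doc2):
    for key, value in doc.items():
        # Join values with a comma when a key appears in more than one document.
        merged[key] = merged[key] + ',' + value if key in merged else value

# Sort once, after all insertions, ignoring case.
final_index = collections.OrderedDict(
    sorted(merged.items(), key=lambda item: item[0].lower()))
print(list(final_index))  # ['Alpha', 'Andres', 'exta', 'tell', 'top', 'zebra']
```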
An ordered dictionary is not a sorted dictionary.
From the documentation 8.3. collections — High-performance container datatypes:
OrderedDict dict subclass that remembers the order entries were added
(emphasis mine)
The ordered dictionary is a hash-table-backed structure that also maintains a linked list alongside it, which stores the order in which items were inserted. The dictionary, when iterated over, uses that linked list.
This type of structure is very useful for LRU caches where one wants to only maintain the N most recent items requested, and then evict the oldest one when a new one would push it over capacity.
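To illustrate that use case, here is a rough sketch of such a cache, assuming Python 3.2 or later (for OrderedDict.move_to_end). The class name `LRUCache` and its methods are hypothetical, not part of the standard library:

```python
from collections import OrderedDict

class LRUCache:
    """Keep at most `capacity` items, evicting the least recently used."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.items = OrderedDict()

    def get(self, key, default=None):
        if key not in self.items:
            return default
        self.items.move_to_end(key)  # mark as most recently used
        return self.items[key]

    def put(self, key, value):
        self.items[key] = value
        self.items.move_to_end(key)
        if len(self.items) > self.capacity:
            self.items.popitem(last=False)  # drop the oldest entry

cache = LRUCache(2)
cache.put('a', 1)
cache.put('b', 2)
cache.get('a')     # 'a' is now the most recently used
cache.put('c', 3)  # evicts 'b'
print(list(cache.items))  # ['a', 'c']
```

The ordering here tracks recency of use, not the sort order of the keys, which is why an OrderedDict does not sort anything on its own.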
The code is working correctly.
Some explanation of the design philosophy behind this can be found at Why are there no containers sorted by insertion order in Python's standard libraries? which suggests that the lack of sorted structures confuses the "one obvious way to do it" when it comes to selecting which container you want (compare with all the different types of classes implementing Map, Set and List in Java - do you use a LinkedHashMap? or a ConcurrentSkipListMap? or a TreeMap? or a WeakHashMap?).

What's the fastest way to identify the 'name' of a dictionary that contains a specific key-value pair?

I'd like to identify the dictionary within the following list that contains the key-value pair 'Keya':'123a', which is ID1 in this case.
lst = {ID1:{'Keya':'123a','Keyb':456,'Keyc':789},ID2:{'Keya':'132a','Keyb':654,'Keyc':987},ID3:{'Keya':'5433a','Keyb':222,'Keyc':333},ID4:{'Keya':'444a','Keyb':777,'Keyc':666}}
It's safe to assume all dictionaries have the same keys, but different values.
I currently have the following to identify which dictionary has the value '123a' for the key 'Keya', but is there a shorter and faster way?
DictionaryNames = map(lambda Dict: str(Dict),lst)
Dictionaries = [i[1] for i in lst.items()]
Dictionaries = map(lambda Dict: str(Dict),Dictionaries)
Dict = filter(lambda item:'123a' in item,Dictionaries)
val = DictionaryNames[Dictionaries.index(Dict[0])]
return val
If you actually had a list of dictionaries, this would be:
next(d for d in list_o_dicts if d[key]==value)
Since you actually have a dictionary of dictionaries, and you want the key associated with the dictionary, it's:
next(k for k, d in dict_o_dicts.items() if d[key]==value)
This returns the first matching value. If you're absolutely sure there is exactly one, or if you don't care which you get if there are more than one, and if you're happy with a StopIteration exception if you were wrong and there isn't one, that's exactly what you want.
If you need all matching values, just do the same with a list comprehension:
[k for k, d in dict_o_dicts.items() if d[key]==value]
That list can of course have 0, 1, or 17 values.
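If a missing match should not raise StopIteration, next accepts a default as its second argument. A small sketch, assuming Python 3 and assuming the ID names are strings; the helper name `find_name` is hypothetical:

```python
dict_o_dicts = {
    'ID1': {'Keya': '123a', 'Keyb': 456, 'Keyc': 789},
    'ID2': {'Keya': '132a', 'Keyb': 654, 'Keyc': 987},
    'ID3': {'Keya': '5433a', 'Keyb': 222, 'Keyc': 333},
    'ID4': {'Keya': '444a', 'Keyb': 777, 'Keyc': 666},
}

def find_name(dicts, key, value, default=None):
    """Return the first name whose sub-dictionary maps key to value."""
    matches = (name for name, sub in dicts.items() if sub.get(key) == value)
    # The generator is lazy, so the search stops at the first match.
    return next(matches, default)

print(find_name(dict_o_dicts, 'Keya', '123a'))  # ID1
print(find_name(dict_o_dicts, 'Keya', 'nope'))  # None
```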
You can just do [name for name, d in lst.iteritems() if d['Keya']=='123a'] to get a list of all the dictionaries in lst that have that value for that key. If you know there is only one, you can get it with [name for name, d in lst.iteritems() if d['Keya']=='123a'][0]. (As Andy mentions in a comment, your name lst is misleading, since lst is actually a dictionary of dictionaries, not a list.)
Since you want the fastest, you should short-circuit your search as soon as you find the data you are after. Iterating through the whole list is not necessary, nor is producing any temporary dictionary:
for key, data in lst.iteritems():
    if data['Keya'] == '123a':
        return key  # or break, if not in a function
A different way to do this is to use the appropriate data structure: Keep a "reverse map" of key-value pairs to names. If your dictionary of dictionaries is static after being built, you can build the reverse dictionary like this:
revdict = {(key, value): name
for name, subdict in dictodicts.items()
for key, value in subdict.items()}
If not, you just need to add revdict[key, value] = name for each d[name][key] = value statement and build them up in parallel.
Either way, to find the name of the dict that maps key to value, it's just:
revdict[key, value]
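As a worked illustration of the reverse map, here is a sketch on the question's data, assuming Python 3 and assuming the ID names are strings. It also assumes all values are hashable and that each (key, value) pair occurs in only one sub-dictionary; otherwise a later name silently overwrites an earlier one:

```python
dictodicts = {
    'ID1': {'Keya': '123a', 'Keyb': 456, 'Keyc': 789},
    'ID2': {'Keya': '132a', 'Keyb': 654, 'Keyc': 987},
    'ID3': {'Keya': '5433a', 'Keyb': 222, 'Keyc': 333},
    'ID4': {'Keya': '444a', 'Keyb': 777, 'Keyc': 666},
}

# Build the reverse map once: (key, value) -> name.
revdict = {(key, value): name
           for name, subdict in dictodicts.items()
           for key, value in subdict.items()}

# Each lookup is now a single hash-table access instead of a scan.
print(revdict['Keya', '123a'])               # ID1
print(revdict.get(('Keya', 'missing')))      # None
```

Building the map costs one full pass, so it pays off only when many lookups follow.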
For (a whole lot) more information (than you actually want), and some sample code for wrapping things up in different ways… I dug up an unfinished blog post, considered editing it, and decided to not bother and just clicked Publish instead, so: Reverse dictionary lookup and more, on beyond z.

Complex sort dictionary into sorted list

I have the following class:
class SalesRecord:
    name = ''
    dollarsSpent = 0
    itemsPurchased = 0
    itemsSold = 0
    position = 0
I also have a dictionary that is structured with a bunch of these SalesRecords with the SalesRecord.name as the key. How can I sort the values into a sorted list with the following criteria (Assume everything is an integer):
Sort by dollarsSpent desc
Then by itemsPurchased - itemsSold desc
Then by itemsPurchased desc
Then by position asc
Also, just curious, but what would the overall run-time of such a sort be? Does it need to iterate 4 times in the worst case?
Use a compound key.
sorted(somedict.items(), key=lambda kv:
       (-kv[1].dollarsSpent, -(kv[1].itemsPurchased - kv[1].itemsSold),
        -kv[1].itemsPurchased, kv[1].position))
You have two options. Both options run in linearithmic time, O(nlog(n)), but you may consider one more readable. You can use a compound sort key:
def sortkey(salesrecord):
    r = salesrecord
    return (
        -r.dollarsSpent,
        -r.itemsPurchased + r.itemsSold,
        -r.itemsPurchased,
        r.position
    )
sorted(salesdict.values(), key=sortkey)
or sort multiple times, relying on the stability of the sort algorithm:
l = list(salesdict.values())  # copy into a list; in Python 3, values() is a view with no sort()
l.sort(key=lambda r: r.position)
l.sort(key=lambda r: r.itemsPurchased, reverse=True)
l.sort(key=lambda r: r.itemsPurchased - r.itemsSold, reverse=True)
l.sort(key=lambda r: r.dollarsSpent, reverse=True)
With a stable sort, you can sort by a number of keys with different priorities by sorting by each key from lowest priority to highest. After sorting by a higher-priority key, items that compare equal by that key are guaranteed to remain in the order they had after the lower-priority key sorts.
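The two options can be checked against each other on a small data set. A sketch, assuming Python 3; the constructor added to SalesRecord and the sample records are hypothetical:

```python
class SalesRecord:
    def __init__(self, name, dollarsSpent, itemsPurchased, itemsSold, position):
        self.name = name
        self.dollarsSpent = dollarsSpent
        self.itemsPurchased = itemsPurchased
        self.itemsSold = itemsSold
        self.position = position

# Hypothetical sample data, keyed by name as in the question.
salesdict = {r.name: r for r in [
    SalesRecord('ann', 100, 5, 2, 3),
    SalesRecord('bob', 100, 5, 2, 1),
    SalesRecord('cat', 200, 1, 0, 2),
    SalesRecord('dan', 100, 7, 4, 4),
    SalesRecord('eve', 100, 4, 0, 5),
]}

def sortkey(r):
    # Negate the fields that should sort in descending order.
    return (-r.dollarsSpent, -(r.itemsPurchased - r.itemsSold),
            -r.itemsPurchased, r.position)

# Option 1: a single sort with a compound key.
one_pass = sorted(salesdict.values(), key=sortkey)

# Option 2: stable sorts from lowest to highest priority.
multi_pass = list(salesdict.values())
multi_pass.sort(key=lambda r: r.position)
multi_pass.sort(key=lambda r: r.itemsPurchased, reverse=True)
multi_pass.sort(key=lambda r: r.itemsPurchased - r.itemsSold, reverse=True)
multi_pass.sort(key=lambda r: r.dollarsSpent, reverse=True)

print([r.name for r in one_pass])    # ['cat', 'eve', 'dan', 'bob', 'ann']
print([r.name for r in multi_pass])  # ['cat', 'eve', 'dan', 'bob', 'ann']
```

Negation only works for numeric fields; for a descending string field, the multi-pass approach with reverse=True is the simpler choice.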
The easiest approach is to take advantage of sort stability to sort in multiple stages (first by position, then by items purchased descending, etc).
This is less expensive than it seems because the Timsort algorithm takes advantage of elements that are already partially ordered.
This approach is also easier to get right than trying to build an overly complex key-function.

python dictionary sorted based on time

I have a dictionary such as below.
d = {
'0:0:7': '19734',
'0:0:0': '4278',
'0:0:21': '19959',
'0:0:14': '9445',
'0:0:28': '14205',
'0:0:35': '3254'
}
Now I want to sort it by keys with time priority.
Dictionaries are not sorted. If you want to print it out or iterate through it in sorted order, you should convert it to a sorted list first, e.g.:
def parseTime(s):
    return tuple(int(x) for x in s.split(':'))
# d.items() yields (key, value) tuples, so parse only the key
sorted_dict = sorted(d.items(), key=lambda kv: parseTime(kv[0]))
# or
for t in sorted(d, key=parseTime):
    pass
Note that sorted_dict is a list of (key, value) tuples, so you cannot use the d['0:0:7'] syntax on it.
Passing a 'key' argument to sorted tells Python how to compare the items in your list. Standard string comparison will not sort these keys by time.
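The difference between the two orderings is easy to see on the question's data. A short sketch, assuming Python 3; `parse_time` is the same helper under a hypothetical snake_case name:

```python
d = {
    '0:0:7': '19734',
    '0:0:0': '4278',
    '0:0:21': '19959',
    '0:0:14': '9445',
    '0:0:28': '14205',
    '0:0:35': '3254',
}

def parse_time(s):
    """Turn 'h:m:s' into a tuple of ints so that it compares numerically."""
    return tuple(int(x) for x in s.split(':'))

# Plain string comparison puts '0:0:7' last, because '7' > '3' as a character.
by_string = sorted(d)
# Comparing integer tuples gives chronological order.
by_time = sorted(d, key=parse_time)

print(by_string)  # ['0:0:0', '0:0:14', '0:0:21', '0:0:28', '0:0:35', '0:0:7']
print(by_time)    # ['0:0:0', '0:0:7', '0:0:14', '0:0:21', '0:0:28', '0:0:35']
```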
Dictionaries in python have no guarantees on order. There is collections.OrderedDict, which retains insertion order, but if you want to work through the keys of a standard dictionary in order you can just do:
for k in sorted(d):
    pass  # loop body goes here
In your case, the problem is that your time strings won't sort correctly. You need to include the additional zeroes needed to make them do so, e.g. "00:00:07", or interpret them as actual time objects, which will sort correctly. This function may be useful:
def padded(s, c=":"):
    return c.join("{0:02d}".format(int(i)) for i in s.split(c))
You can use this as a key for sorted if you really want to retain the current format in your output:
for k in sorted(d, key=padded):
    pass  # loop body goes here
Have a look at collections.OrderedDict.

python how to sort a list on 2 values

If I have the list
listOfFiles = [<str>,<intA>,<intB>]
How can I sort this list first by intA then by intB?
The end result would look like
<str>,1,1
<str>,1,2
<str>,1,3
<str>,2,1
<str>,2,2
etc
Use a compound key (or rather, a sequence as a key).
import operator
listOfFiles.sort(key=operator.itemgetter(1, 2))
Python list sorting is done in place and has been guaranteed to be stable since Python 2.3. That means you can sort like so and should get the results you want:
listOfFiles.sort(key = lambda x: x[2])
listOfFiles.sort(key = lambda x: x[1])
I presume you actually have a list of lists, or a list of tuples. If not, please provide a more complete example of your data structure.
This also works:
listOfFiles.sort(key=lambda x: (x[1], x[2]))
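A quick check that the compound key and the two stable passes agree, assuming Python 3 and assuming listOfFiles is a list of (str, intA, intB) tuples; the file names are hypothetical:

```python
import operator

# Hypothetical data: (name, intA, intB).
listOfFiles = [
    ('c.txt', 2, 1),
    ('a.txt', 1, 3),
    ('b.txt', 1, 1),
    ('d.txt', 2, 2),
    ('e.txt', 1, 2),
]

# One pass with a compound key: intA first, then intB.
by_key = sorted(listOfFiles, key=operator.itemgetter(1, 2))

# Two stable passes: lowest-priority field first.
two_pass = sorted(listOfFiles, key=lambda x: x[2])
two_pass.sort(key=lambda x: x[1])

print(by_key)
# [('b.txt', 1, 1), ('e.txt', 1, 2), ('a.txt', 1, 3), ('c.txt', 2, 1), ('d.txt', 2, 2)]
print(by_key == two_pass)  # True
```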
