Okay, so I'm trying to sort the key-value pairs in a dictionary. The keys are words and the values are decimal numbers. I'm trying to get a list of the keys sorted by their values. Right now I have:
sortedList = sorted(wordSimDic, key=wordSimDic.get,reverse=True)
This works. However, I would like to go one step further. If two keys have the same value, I would like them to appear in the output list in alphabetical order by key.
For example
Input:
{'c':1,'a':1,'d':3}
Output:
[a,c,d]
right now my output is:
[c,a,d]
Do you have any suggestions? Thanks so much!
In general, to 'sort by X then Y' in Python, you want a key function that produces an (X, Y) tuple (or other sequence, but tuples are simplest and cleanest really). Thus:
sorted(wordSimDic, key=lambda k: (wordSimDic.get(k), k), reverse=True)
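One thing to watch with reverse=True: it reverses the whole tuple comparison, so tied keys also come out in reverse alphabetical order. If the values are numeric, negating them in the key gives descending values with ascending keys. A minimal sketch, assuming a small made-up dictionary named word_sim standing in for wordSimDic:

```python
# Hypothetical sample data standing in for wordSimDic.
word_sim = {'c': 1.0, 'a': 1.0, 'd': 3.0}

# reverse=True flips both parts of the tuple, so ties are reversed too.
both_reversed = sorted(word_sim, key=lambda k: (word_sim[k], k), reverse=True)

# Negating the numeric value sorts values descending, keys ascending.
value_desc_key_asc = sorted(word_sim, key=lambda k: (-word_sim[k], k))

# Plain tuple key: values ascending, keys ascending.
value_asc_key_asc = sorted(word_sim, key=lambda k: (word_sim[k], k))

print(both_reversed)       # ['d', 'c', 'a']
print(value_desc_key_asc)  # ['d', 'a', 'c']
print(value_asc_key_asc)   # ['a', 'c', 'd']
```

The negation trick only works for numbers; for string values a two-pass stable sort is the usual fallback.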
Related
I have an array here of 6 elements:
["#note1", "#note2", "=dir1/", "=dir2/", "#dir1/note1", "#dir1/note2"]
I want to sort them based on the characters and the length, with the characters taking precedence over the length.
Example output of the sorted array should look like this:
["=dir1/","#dir1/note1", "#dir1/note2", "=dir2/", "#note1", "#note2"]
Right now I have a piece of code that looks like this:
s = ["#note1", "#note2", "=dir1/", "=dir2/", "#dir1/note1", "#dir1/note2"]
s.sort()
s.sort(key=len , reverse=True)
Is there a way to do this with Python's built-in sort, or do I have to write a custom sorting algorithm?
You can use a sort key with a tuple:
d = ["#note1", "#note2", "=dir1/", "=dir2/", "#dir1/note1", "#dir1/note2"]
new_d = sorted(d, key=lambda x:(x[1:], len(x)))
Output:
['=dir1/', '#dir1/note1', '#dir1/note2', '=dir2/', '#note1', '#note2']
I have a dictionary:
dict_1 = {'ABCDEFG': ['AB'], 'GFE': ['AB']}
I want to sort the dictionary by value ascending and by key descending. I have written the following code, which I believe to be correct. Unfortunately, from experience, what I believe to be correct usually works only for my specific case. In short, I suspect the code is wrong and that there is a smarter way to get the desired result.
Code:
dict_1 = {'ABCDEFG': ['AC'], 'GFE': ['AB']}
sorted_1 = sorted(dict_1.items(), key=lambda x: (x[1][0], [-ord(x[0][n]) for n in range(len(x[0]))]), reverse=False)
print(sorted_1)
What I expect as an output:
A sorted list based on value first ascending and based on key second if values are equal - descending.
Note:
The key and value will always be strings with more than one symbol.
Because Python's sort is stable, you can do this in two sort steps:
sort by descending key
sort by ascending values
Equal values will retain the original (descending) order of their respective keys
dict1 = {'GFE': ['AB'], 'ABCDEFG': ['AC'], 'XYZ':['AB'] }
sorted1 = sorted(dict1.items(), reverse=True)
sorted1 = sorted(sorted1, key=lambda kv:kv[1])
print(sorted1)
[('XYZ', ['AB']), ('GFE', ['AB']), ('ABCDEFG', ['AC'])]
You could also do this in a single line:
sorted1 = sorted(sorted(dict1.items(),reverse=True),key=lambda kv:kv[1])
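The same idea generalizes to any number of sort criteria: apply the sorts from the least significant key to the most significant one and let stability do the rest. A rough sketch, where multisort and its specs argument are names I made up for illustration:

```python
def multisort(items, specs):
    """Sort by several (key_function, reverse) pairs.

    specs is listed from most to least significant; the passes run in
    the opposite order so that the most significant sort is applied last.
    """
    result = list(items)
    for key, reverse in reversed(specs):
        result.sort(key=key, reverse=reverse)  # list.sort is stable
    return result


dict1 = {'GFE': ['AB'], 'ABCDEFG': ['AC'], 'XYZ': ['AB']}
sorted1 = multisort(
    dict1.items(),
    [
        (lambda kv: kv[1], False),  # value ascending (most significant)
        (lambda kv: kv[0], True),   # key descending (tie-breaker)
    ],
)
print(sorted1)
# [('XYZ', ['AB']), ('GFE', ['AB']), ('ABCDEFG', ['AC'])]
```

This avoids having to negate strings, which is what makes a single tuple key awkward here.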
How to sort a list of 1 billion elements in python
Please elaborate
Assuming we have unlimited space.
Thanks for the answers, but this question is about optimizing the sorting algorithm, not about working in Python specifically. It was asked in an interview, in the context of having a large number of elements, which may be integers or strings. This probably won't come up in the real world, since we have techniques like pagination.
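Dictionaries do not keep their keys sorted. They are hash tables, and before Python 3.7 you were not guaranteed any particular key order at all (since 3.7 a dict preserves insertion order, which is still not sorted order).
If you require the keys to stay in a particular order, try the OrderedDict class in collections.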
If you need to sort the keys you could place them in a list, and sort the list.
my_dict = dict(zip(keys, values))  # Example dict; keys and values are assumed to exist already
sorted_keys = list(my_dict)        # copy the keys into a list
sorted_keys.sort()                 # sort the list in place
A dictionary on its own does not keep its keys sorted, and before Python 3.7 it did not guarantee any key order at all. To store an explicit order you can use OrderedDict, which remembers the order of insertion.
If you just need to iterate through the sorted keys, you can use sorted:
for key in sorted(my_dict):
    # output is already sorted by dictionary key
    print(key, my_dict[key])
If you need a special sort key, you can pass it to sorted as the key argument. The following example sorts by value:
for key, value in sorted(my_dict.items(), key=lambda x: x[1]):
    # output is sorted by value
    print(key, value)
Sort orders can be reversed by using reversed:
for key in reversed(sorted(my_dict)):
    # output is already sorted descending by dictionary key
    print(key, my_dict[key])
Finally, this code snippet would fill an OrderedDict with sorted key/value pairs:
from collections import OrderedDict
my_ordered_dict = OrderedDict(sorted(my_dict.items(), key=lambda t: t[0]))
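On newer interpreters the same thing works with a plain dict. This is a small sketch that assumes Python 3.7 or later, where dict is guaranteed to preserve insertion order; the sample data is made up:

```python
# Assumes Python 3.7+, where a plain dict preserves insertion order.
my_dict = {'b': 2, 'c': 1, 'a': 3}  # hypothetical example data

# Build new dicts whose insertion order is the sorted order.
by_key = dict(sorted(my_dict.items()))
by_value = dict(sorted(my_dict.items(), key=lambda kv: kv[1]))

print(list(by_key))    # ['a', 'b', 'c']
print(list(by_value))  # ['c', 'b', 'a']
```

Keys added later are appended at the end, so the dict is only sorted at the moment it is built.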
Since you updated your question from a dictionary to a list:
Sorting lists is even easier: just use sorted (again, pass a small key function if you need a different sort key):
sorted_list = sorted(unsorted_list)
Say you have a dictionary:
a={'20101216':5,'20100216':1,'20111226':2,'20131216':5}
Two keys have the same value. How would I go about printing the maximum key date (which is a string) and its value? Like:
5 at 12/16/2013
I tried to for loop the key and the values and print the max key and max value, but it's not working out.
edit: I originally started out trying to convert an array of string dates to date objects. But it fails [b]
b=['20101216','20100216','20111226','20131216']
c=[5,1,2,5]
z=[]
for strDate in b:
    g=[datetime.datetime.strptime(strDate, '%Y%m%d')]
    if g not in z:
        z.append(g)
From there, if it had worked, I would have done another for loop over my new array [z] to format each date element properly (m/d/y). After that I would have zipped both arrays into a dictionary.
Like:
d = dict(zip(z,c))
Which would have resulted in
d={12/16/2010:5,02/16/2010:1,12/26/2011:2,12/16/2013:5}
Finally I would have attempted to find max date key and max value. And printed it like so:
5 at 12/16/2013
But because of the failure converting array b, I was thinking maybe working with a dictionary from the start might yield better results.
TL;DR:
max(a.items(), key = lambda x: (x[1], x[0]))
Basically, the problem is that you can't compare the dict's values on their own; you need to compare each value together with its key. dict.items() gives you the key/value pairs as tuples, i.e.
a.items()
[('20101216', 5), ('20131216', 5), ('20111226', 2), ('20100216', 1)]
Then all you need is the maximum of these pairs. The simple way to get a maximum is the max function. As your problem is slightly more complicated, you should use max's key argument (take a look at the docs) with a "compound" sort key. A lambda function works well here, since it lets you express pretty much any ordering you need. So, comparing by two values inside a tuple, in priority order, looks like this:
max(l, key = lambda x: (x[1], x[0])) # where l is iterable with tuples
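Putting it together for the dictionary in the question, here is a short sketch that also formats the date the way the expected output shows. It relies on the keys being YYYYMMDD strings, which compare chronologically as plain strings; the variable names are mine:

```python
import datetime

a = {'20101216': 5, '20100216': 1, '20111226': 2, '20131216': 5}

# Highest value wins; among equal values, the latest date string wins.
date_str, value = max(a.items(), key=lambda kv: (kv[1], kv[0]))

# Reformat the YYYYMMDD key as m/d/Y for printing.
date = datetime.datetime.strptime(date_str, '%Y%m%d')
line = '{} at {}'.format(value, date.strftime('%m/%d/%Y'))
print(line)  # 5 at 12/16/2013
```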
I have a dictionary such as below.
d = {
'0:0:7': '19734',
'0:0:0': '4278',
'0:0:21': '19959',
'0:0:14': '9445',
'0:0:28': '14205',
'0:0:35': '3254'
}
Now I want to sort it by keys with time priority.
Dictionaries are not sorted. If you want to print one out or iterate through it in sorted order, you should convert it to a sorted list first:
e.g.:
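def parseTime(s):
    # '0:0:7' -> (0, 0, 7), which compares numerically
    return tuple(int(x) for x in s.split(':'))

# d.items() yields (key, value) tuples, so parse only the key part
sorted_dict = sorted(d.items(), key=lambda kv: parseTime(kv[0]))
#or
for t in sorted(d, key=parseTime):
    pass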
Note that this means you cannot use the d['0:0:7'] syntax on sorted_dict, since it is a list of (key, value) tuples.
Passing a key argument to sorted tells Python what to compare for each item in your list; standard string comparison will not sort these keys by time.
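To see the difference, here is a small self-contained comparison of plain string order against the parsed-tuple order, using the dictionary from the question:

```python
d = {
    '0:0:7': '19734',
    '0:0:0': '4278',
    '0:0:21': '19959',
    '0:0:14': '9445',
    '0:0:28': '14205',
    '0:0:35': '3254',
}


def parseTime(s):
    # Turn 'h:m:s' into a tuple of ints so fields compare numerically.
    return tuple(int(x) for x in s.split(':'))


as_strings = sorted(d)             # character-by-character comparison
as_times = sorted(d, key=parseTime)

print(as_strings)
# ['0:0:0', '0:0:14', '0:0:21', '0:0:28', '0:0:35', '0:0:7']
print(as_times)
# ['0:0:0', '0:0:7', '0:0:14', '0:0:21', '0:0:28', '0:0:35']
```

With plain strings, '0:0:7' lands last because the character '7' is greater than '1', '2' and '3'.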
Dictionaries in Python do not keep their keys sorted (and before Python 3.7 they gave no guarantee on order at all). There is collections.OrderedDict, which retains insertion order, but if you want to work through the keys of a standard dictionary in order you can just do:
for k in sorted(d):
    print(k, d[k])
In your case, the problem is that your time strings won't sort correctly. You need to include the additional zeroes needed to make them do so, e.g. "00:00:07", or interpret them as actual time objects, which will sort correctly. This function may be useful:
def padded(s, c=":"):
    return c.join("{0:02d}".format(int(i)) for i in s.split(c))
You can use this as a key for sorted if you really want to retain the current format in your output:
for k in sorted(d, key=padded):
    print(k, d[k])
Have a look at the OrderedDict class in the collections module.