Python OrderedDict ordered by date - python

I am trying to use an OrderedDict (Raymond Hettinger's version for pre-2.7 Python) where my keys are dates. However, it does not order them correctly; I imagine it may be ordering based on the ID.
Does anyone have any suggestions of how this could be done?

In [1]: from collections import OrderedDict
In [2]: import operator
In [3]: from datetime import date
In [4]: d = {date(2012, 1, 1): 123, date(2010,2,5): 542, date(2011,3,3):76 }
In [5]: d # Good old dict
Out[5]: #it seems sorted, but it isn't guaranteed to be that way.
{datetime.date(2010, 2, 5): 542,
datetime.date(2011, 3, 3): 76,
datetime.date(2012, 1, 1): 123}
In [6]: o = OrderedDict(sorted(d.items(), key=operator.itemgetter(0)))
In [7]: o #Now it is ordered(and sorted, because we give it by sorted order.).
Out[7]: OrderedDict([(datetime.date(2010, 2, 5), 542), (datetime.date(2011, 3, 3), 76), (datetime.date(2012, 1, 1), 123)])
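For readers on Python 3, the same approach might look like the sketch below. It assumes Python 3.7 or later, where a plain dict also keeps insertion order; the names ordered and plain are illustrative only.

```python
from collections import OrderedDict
from datetime import date
from operator import itemgetter

d = {date(2012, 1, 1): 123, date(2010, 2, 5): 542, date(2011, 3, 3): 76}

# date objects compare chronologically, so sorting on the key (item 0)
# sorts the pairs by date.
ordered = OrderedDict(sorted(d.items(), key=itemgetter(0)))

# Assumption: Python 3.7+, where a plain dict preserves insertion order too.
plain = dict(sorted(d.items(), key=itemgetter(0)))

print(list(ordered))
```

Note that neither container re-sorts itself: a key inserted later is appended at the end, whatever its date.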

OrderedDict, according to its docstring, is a kind of dict that remembers insertion order.
Thus, you need to manually insert the key/value pairs in the correct order.
from collections import OrderedDict
# assuming unordered_dict is a dict that contains your data
ordered_dict = OrderedDict()
# sort the items by key, then insert them in that order
for key, value in sorted(unordered_dict.iteritems(), key=lambda t: t[0]):
    ordered_dict[key] = value
Edit: See utdemir's answer for a better example. Using operator.itemgetter gives better performance (about 60% faster in the benchmark code below) and is better coding style. You can also pass sorted(...) directly to OrderedDict.
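from operator import itemgetter
N_RUNS = 1000000  # assumed iteration count; the original does not define N_RUNS
a = (1, 2)
empty__func = 0
def empty():  # baseline: measures the loop overhead only
    for i in xrange(N_RUNS):
        empty__func
lambda_func = lambda t: t[0]
def using_lambda():
    for i in xrange(N_RUNS):
        lambda_func(a)
getter_func = itemgetter(0)
def using_getter():
    for i in xrange(N_RUNS):
        getter_func(a)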
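The functions above still need to be timed. One way to do that, sketched here for Python 3 with the standard timeit module, is shown below. N_RUNS is an assumed value, and the measured times depend on the machine and interpreter, so no particular speedup is implied.

```python
import timeit
from operator import itemgetter

a = (1, 2)
N_RUNS = 100000  # assumed iteration count, chosen only for illustration

lambda_func = lambda t: t[0]
getter_func = itemgetter(0)

# Each statement is executed N_RUNS times; timeit returns the total seconds.
lambda_time = timeit.timeit(
    "lambda_func(a)", globals={"lambda_func": lambda_func, "a": a}, number=N_RUNS
)
getter_time = timeit.timeit(
    "getter_func(a)", globals={"getter_func": getter_func, "a": a}, number=N_RUNS
)

print("lambda:     %.4f s" % lambda_time)
print("itemgetter: %.4f s" % getter_time)
```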

Related

Order dictionary by key with numerical representation

I have this input, where each value has a range of 200:
d = {'600-800': 3, '1800-2000': 3, '1000-1200': 5, '400-600': 1, '2600-2800': 1}
And I am looking for this expected order:
{'400-600': 1, '600-800': 3, '1000-1200': 5, '1800-2000': 3, '2600-2800': 1}
I have already tried something like this, but the order is wrong:
import collections
od = collections.OrderedDict(sorted(d.items()))
print od
You can split the key into parts at '-' and use the first part as integer value to sort it. The second part is irrelevant for ordering because of the nature of your key-values (when converted to integer):
d = {'600-800': 3, '1800-2000': 3, '1000-1200': 5, '400-600': 1, '2600-2800': 1}
import collections
od = collections.OrderedDict(sorted(d.items(), key=lambda x: int(x[0].split("-")[0])))
print od
Output:
OrderedDict([('400-600', 1), ('600-800', 3), ('1000-1200', 5),
('1800-2000', 3), ('2600-2800', 1)])
Documentation:
sorted(iterable,key)
Related:
How to sort a list of objects based on an attribute of the objects? for more "sort by key" examples
Are dictionaries ordered in Python 3.6+? .. which lets you omit the OrderedDict from 3.7+ on (or 3.6 CPython)
If you want to order your dictionary by the first year first (and then by the second year if needed, which is unnecessary in the given example, but feels more natural), you need to convert to integers and set a custom key:
d = {'600-800': 3, '1800-2000': 3, '1000-1200': 5, '400-600': 1, '2600-2800': 1}
sorted(d.items(), key=lambda t: tuple(map(int, t[0].split("-"))))
# [('400-600', 1),
# ('600-800', 3),
# ('1000-1200', 5),
# ('1800-2000', 3),
# ('2600-2800', 1)]
The conversion to integers is needed because e.g. "1000" < "200", but 1000 > 200. This list can be passed to OrderedDict afterwards like in your code, if needed.
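To make the difference between text order and numeric order concrete, here is a small Python 3 sketch. The helper name range_key is hypothetical and not part of any library.

```python
d = {'600-800': 3, '1800-2000': 3, '1000-1200': 5, '400-600': 1, '2600-2800': 1}

def range_key(item):
    """Turn ('600-800', 3) into (600, 800) so the bounds compare as numbers."""
    label, _count = item
    return tuple(int(part) for part in label.split('-'))

# Default sort: keys are compared as text, character by character.
by_text = [k for k, _ in sorted(d.items())]
# Custom key: keys are compared by their integer bounds.
by_number = [k for k, _ in sorted(d.items(), key=range_key)]

print(by_text)    # ['1000-1200', '1800-2000', '2600-2800', '400-600', '600-800']
print(by_number)  # ['400-600', '600-800', '1000-1200', '1800-2000', '2600-2800']
```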

Get index range of the repetitive elements in the list

Suppose I have a list a = [-1,-1,-1,1,1,1,2,2,2,-1,-1,-1,1,1,1] in Python. Is there a built-in function that takes a list and returns the index ranges at which each element occurs? For example:
>>> index_range(a)
{-1 :'0-2,9-11', 1:'3-5,12-14', 2:'6-8'}
I have tried to use Counter from the collections module, but it only outputs the count of each element.
If there is no built-in function, can you please guide me on how I can achieve this in my own function? I don't need the whole code, just a guideline.
You can write a custom function using itertools.groupby and collections.defaultdict that returns the index ranges as a list of (start, end) tuples:
from itertools import groupby
from collections import defaultdict
def index_range(my_list):
    my_dict = defaultdict(list)
    # groupby yields one group per run of equal consecutive values
    for i, j in groupby(enumerate(my_list), key=lambda x: x[1]):
        indices, numlist = list(zip(*j))
        my_dict[numlist[0]].append((indices[0], indices[-1]))
    return my_dict
Sample Run:
>>> index_range([-1,-1,-1,1,1,1,2,2,2,-1,-1,-1,1,1,1])
{1: [(3, 5), (12, 14)], 2: [(6, 8)], -1: [(0, 2), (9, 11)]}
In order to get the values as string in your dict, you may either modify the above function, or use the return value of the function in dictionary comprehension as:
>>> result_dict = index_range([-1,-1,-1,1,1,1,2,2,2,-1,-1,-1,1,1,1])
>>> {k: ','.join('{}:{}'.format(*i) for i in v) for k, v in result_dict.items()}
{1: '3:5,12:14', 2: '6:8', -1: '0:2,9:11'}
You can use a dict that uses list items as keys and their indexes as values:
>>> lst = [-1,-1,-1,1,1,1,2,2,2,-1,-1,-1,1,1,1]
>>> indexes = {}
>>> for index, item in enumerate(lst):
...     indexes.setdefault(item, []).append(index)
...
>>> indexes
{1: [3, 4, 5, 12, 13, 14], 2: [6, 7, 8], -1: [0, 1, 2, 9, 10, 11]}
You could then merge the index lists into ranges if that's what you need. I can help you with that too if necessary.
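One possible way to merge the index lists into ranges is sketched below for Python 3. The helper merge_into_ranges is a hypothetical name, and it assumes each index list is already in ascending order, which holds when the list is built with enumerate.

```python
def merge_into_ranges(indexes):
    """Collapse an ascending list of indexes into (start, end) pairs,
    one pair per run of consecutive indexes."""
    ranges = []
    for index in indexes:
        if ranges and index == ranges[-1][1] + 1:
            # Extends the current run: move its end forward.
            ranges[-1] = (ranges[-1][0], index)
        else:
            # Starts a new run.
            ranges.append((index, index))
    return ranges

lst = [-1, -1, -1, 1, 1, 1, 2, 2, 2, -1, -1, -1, 1, 1, 1]
indexes = {}
for index, item in enumerate(lst):
    indexes.setdefault(item, []).append(index)

# Format each run as 'start-end' and join the runs with commas.
result = {item: ','.join('%d-%d' % pair for pair in merge_into_ranges(found))
          for item, found in indexes.items()}
print(result)  # {-1: '0-2,9-11', 1: '3-5,12-14', 2: '6-8'}
```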

Easy way to find what item repeated in list

So I have a list like
l = [1,2,3,4,4]
If I make a set, obviously I will get
([1,2,3,4])
I need a way to find which item was repeated in the list and dropped from the set, and I do not want to use a loop.
Is there an easy way to do so?
I'm using python 2.7
You'll have to iterate the list, explicitly or implicitly. One way using standard libraries would be with collections.Counter:
In [1]: from collections import Counter
In [2]: l = [1,2,3,4,4]
In [3]: Counter(l).most_common(1)[0][0]
Out[3]: 4
A Counter object is a dictionary with elements of some iterable as keys and their respective counts as values:
In [4]: Counter(l)
Out[4]: Counter({4: 2, 1: 1, 2: 1, 3: 1})
Its most_common() method returns a list of (element, count) pairs, ordered from the most common element to the least:
In [5]: Counter(l).most_common()
Out[5]: [(4, 2), (1, 1), (2, 1), (3, 1)]
The optional argument restricts the length of the returned list:
In [6]: Counter(l).most_common(1)
Out[6]: [(4, 2)]
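most_common(1) only reports a single element. If several items may be repeated, a sketch along the following lines (Python 3, with a sample list made up for illustration) collects every element whose count is greater than one.

```python
from collections import Counter

items = [1, 2, 3, 4, 4, 2]  # illustrative data with two repeated values

# Keep every element that occurs more than once.
repeated = sorted(item for item, count in Counter(items).items() if count > 1)
print(repeated)  # [2, 4]
```

The generator expression still iterates over the counts, so this is implicit iteration rather than no iteration at all.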

Sort an array of tuples by product in python

I have an array of 3-tuples and I want to sort them in order of decreasing product of the elements of each tuple in Python. So, for example, given the array
[(3,2,3), (2,2,2), (6,4,1)]
since 3*2*3 = 18, 2*2*2 = 8, 6*4*1 = 24, the final result would be
[(6,4,1), (3,2,3), (2,2,2)]
I know how to sort by, for example, the first element of the tuple, but I'm not sure how to tackle this.
Any help would be greatly appreciated. Thanks!
Use the key argument of sorted/list.sort to specify a function for computing the product, and set the reverse argument to True to make the results descending rather than ascending, e.g.:
from operator import mul
print sorted([(3,2,3), (2,2,2), (6,4,1)], key=lambda tup: reduce(mul, tup), reverse=True)
In [176]: L = [(3,2,3), (2,2,2), (6,4,1)]
In [177]: L.sort(key=lambda (a,b,c):a*b*c, reverse=True)
In [178]: L
Out[178]: [(6, 4, 1), (3, 2, 3), (2, 2, 2)]
A simpler solution from my point of view:
a = [(3,2,3), (2,2,2), (6,4,1)]
def f(L):
    return L[0]*L[1]*L[2]
print sorted(a, key = f, reverse = True)
key must be a function that returns a value that will be used in order to sort the list
reverse is True because you want it ordered in decreasing order
>>> from operator import mul
>>> input_list = [(3,2,3), (2,2,2), (6,4,1)]
>>> input_list.sort(key=lambda tup: reduce(mul,tup), reverse=True)
>>> print input_list
[(6, 4, 1), (3, 2, 3), (2, 2, 2)]
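The snippets in these answers are written for Python 2. In Python 3, reduce lives in functools and a lambda can no longer unpack a tuple parameter, so a rough equivalent, assuming Python 3.8 or later for math.prod, might be:

```python
from math import prod  # assumption: Python 3.8+, where math.prod is available

triples = [(3, 2, 3), (2, 2, 2), (6, 4, 1)]

# prod multiplies all elements of a tuple, so it can serve directly as the key.
by_product = sorted(triples, key=prod, reverse=True)
print(by_product)  # [(6, 4, 1), (3, 2, 3), (2, 2, 2)]
```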

Sorting a dictionary of tuples in Python

I know there are tonnes of questions on sorting Python lists/dictionaries already, but I can't seem to find one that helps in my case, and I'm looking for the most efficient solution, as I'm going to be sorting a rather large dataset.
My data basically looks like this at the moment:
a = {'a': (1, 2, 3), 'b': (3, 2, 1)}
I'm basically creating a word list in which I store each word along with some stats about it (n, Sigma(x), Sigma(x^2) )
I want to sort it based on a particular stat. So far I've been trying something along the lines of:
b = a.items()
b.sort(key = itemgetter(1), reverse=True)
I'm not sure how to control which index it is sorted on when it's effectively a list of tuples of tuples. I guess I effectively need to nest two itemgetter operations, but I'm not really sure how to do this.
If there's a better data structure I should be using instead please let me know. Should I perhaps create a small class/struct and then use a lambda function to access a member of the class?
Many Thanks
Something like this?
>>> a = {'a': (1, 2, 3), 'b': (3, 2, 1)}
>>> b = a.items()
>>> b
[('a', (1, 2, 3)), ('b', (3, 2, 1))]
>>> b.sort(key=lambda x:x[1][2]) # sorting by the third item in the tuple
>>> b
[('b', (3, 2, 1)), ('a', (1, 2, 3))]
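The question also asks about nesting two itemgetter operations. A minimal Python 3 sketch of that idea follows; by_stat is a hypothetical helper name, and whether it beats a plain lambda for a large dataset would need to be measured.

```python
from operator import itemgetter

a = {'a': (1, 2, 3), 'b': (3, 2, 1)}

def by_stat(stat_index):
    """Build a sort key that picks one stat out of a (word, stats) pair."""
    get_stats = itemgetter(1)          # (word, stats) -> stats
    get_stat = itemgetter(stat_index)  # stats -> stats[stat_index]
    return lambda pair: get_stat(get_stats(pair))

by_first = sorted(a.items(), key=by_stat(0), reverse=True)
by_third = sorted(a.items(), key=by_stat(2), reverse=True)

print(by_first)  # [('b', (3, 2, 1)), ('a', (1, 2, 3))]
print(by_third)  # [('a', (1, 2, 3)), ('b', (3, 2, 1))]
```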
Names are easier to work with and remember than indices, so I would go with a class:
class Word(object): # don't need `object` in Python 3
    def __init__(self, word):
        self.word = word
        self.sigma = (some calculation)
        self.sigma_sq = (some other calculation)
    def __repr__(self):
        return "Word(%r)" % self.word
    def __str__(self):
        return self.word
    @property
    def sigma(self):
        return self._sigma
    @sigma.setter # requires python 2.6+
    def sigma(self, value):
        if not value:
            raise ValueError("sigma must be ...")
        self._sigma = value
word_list = [Word('python'), Word('totally'), Word('rocks')]
word_list.sort(key=lambda w: w.sigma_sq)
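If a full class is more than is needed, a lighter option is collections.namedtuple, which also gives the stats names. The sketch below is for Python 3; the type name WordStats and its field names are made up for illustration, and the values reuse the tuples from the question.

```python
from collections import namedtuple
from operator import attrgetter

# Hypothetical record type: one word plus the three stats from the question.
WordStats = namedtuple('WordStats', ['word', 'n', 'sigma', 'sigma_sq'])

words = [WordStats('a', 1, 2, 3), WordStats('b', 3, 2, 1)]

# attrgetter selects the field to sort on by name instead of by index.
words.sort(key=attrgetter('sigma_sq'))
print([w.word for w in words])  # ['b', 'a']
```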
