I've been researching online for a simple way to create an ordered dictionary and landed on OrderedDict and its update method. I implemented this successfully once, but now the code does not sort the added terms. For example, the items being inserted are:
Doc1: Alpha, zebra, top
Doc2: Andres, tell, exta
Output: Alpha, top, zebra, Andres, exta, tell
My goal is to have Alpha, Andres......, top, zebra
This is the code:
finalindex = collections.OrderedDict()
ctr = 0
while ctr < docCtr:
    filename = 'dictemp%d.csv' % (ctr,)
    ctr += 1
    dicTempList = io.openTempDic(filename)
    print filename
    for key in dicTempList:
        if key in finalindex:
            print key
            for k, v in finalindex.items():
                newvalue = v + "," + dicTempList.get(key)
                finalindex.update([(key, newvalue)])
        else:
            finalindex.update([(key, dicTempList.get(key))])
io.saveTempDic(filename, finalindex)
Can someone please assist me?
OrderedDicts remember the order in which their entries were inserted. If you want one sorted, you need to sort the items when you create it. Here's how to sort an OrderedDict, an example taken from the docs:
from collections import OrderedDict
d = {'banana': 3, 'apple':4, 'pear': 1, 'orange': 2}
sorted_dict = OrderedDict(sorted(d.items(), key=lambda t: t[0]))
This will also work with another ordered dict. I prefer to import the module and reference functions and classes from it, for clarity for the reader, so this is done in a slightly different style. But again, to have it sorted, you need to sort the items before creating a new OrderedDict:
import collections
ordered_dict=collections.OrderedDict()
ordered_dict['foo'] = 1
ordered_dict['bar'] = 2
ordered_dict['baz'] = 3
sorted_dict = collections.OrderedDict(sorted(ordered_dict.items(),
                                             key=lambda t: t[0]))
and sorted_dict returns:
OrderedDict([('bar', 2), ('baz', 3), ('foo', 1)])
If lambdas are confusing, you can use operator.itemgetter:
import operator
get_first = operator.itemgetter(0)
sorted_dict = collections.OrderedDict(sorted(ordered_dict.items(),
                                             key=get_first))
I'm using key arguments to demonstrate their usage in case you want to sort by values. However, Python sorts tuples by first element, then second, and so on, and tuples are what dict.items() yields (as a list in Python 2, as a view in Python 3). So you can even do this and get the same result:
sorted_dict = collections.OrderedDict(sorted(ordered_dict.items()))
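Applied to the question's case, the idea is to merge all documents first and sort once at the end. The following is only a sketch: doc1 and doc2 are hypothetical stand-ins for whatever io.openTempDic() returns, and sorting with a lower-cased key is an assumption about the desired ordering.

```python
import collections

# Hypothetical stand-ins for the per-document dictionaries that
# io.openTempDic() returns in the question.
doc1 = {'Alpha': 'doc1', 'zebra': 'doc1', 'top': 'doc1'}
doc2 = {'Andres': 'doc2', 'tell': 'doc2', 'exta': 'doc2', 'top': 'doc2'}

merged = {}
for temp in (doc1, doc2):
    for key, value in temp.items():
        if key in merged:
            # Term already seen: append the new value, comma-separated.
            merged[key] = merged[key] + "," + value
        else:
            merged[key] = value

# Sort once, after all documents are merged; lower-casing the key makes
# the ordering case-insensitive.
final_index = collections.OrderedDict(
    sorted(merged.items(), key=lambda item: item[0].lower()))

print(list(final_index))
# ['Alpha', 'Andres', 'exta', 'tell', 'top', 'zebra']
```

Any key added to final_index after this point is appended at the end, so the sort has to be repeated whenever new terms are merged in.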
An ordered dictionary is not a sorted dictionary.
From the documentation 8.3. collections — High-performance container datatypes:
OrderedDict dict subclass that remembers the order entries were added
(emphasis mine)
The ordered dictionary is a hash-table-backed structure that also maintains a linked list alongside it, recording the order in which items were inserted. When the dictionary is iterated over, it follows that linked list.
This type of structure is very useful for LRU caches where one wants to only maintain the N most recent items requested, and then evict the oldest one when a new one would push it over capacity.
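As a rough illustration of the LRU use case, here is a minimal sketch for Python 3. The LRUCache class and its get/put method names are made up for this example; only OrderedDict.move_to_end and OrderedDict.popitem come from the standard library.

```python
from collections import OrderedDict


class LRUCache:
    """Keep at most `capacity` items, evicting the least recently used."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._items = OrderedDict()

    def get(self, key, default=None):
        if key not in self._items:
            return default
        self._items.move_to_end(key)  # mark as most recently used
        return self._items[key]

    def put(self, key, value):
        self._items[key] = value
        self._items.move_to_end(key)  # new or updated keys go to the end
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)  # drop the oldest entry


cache = LRUCache(2)
cache.put('a', 1)
cache.put('b', 2)
cache.get('a')       # 'a' is now the most recently used
cache.put('c', 3)    # over capacity, so 'b' is evicted
print(list(cache._items))  # ['a', 'c']
```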
The code is working correctly.
Some explanation of the design philosophy behind this can be found at Why are there no containers sorted by insertion order in Python's standard libraries? which suggests that the lack of sorted structures confuses the "one obvious way to do it" when it comes to selecting which container you want (compare with all the different types of classes implementing Map, Set and List in Java - do you use a LinkedHashMap? or a ConcurrentSkipListMap? or a TreeMap? or a WeakHashMap?).
Related
Say I want to store ordered values where the key values represent a lower bound. Like this example:
d = {1: "pear", 4: "banana", 7: "orange"}
I can access the first object by d[1]. Say I want to store it so that I can access the first object "pear" by asking for any value in [1,4). If I input any key value in [4,7), I want "banana" to be returned. Is there any data structure like that in Python? I found interval trees, but they looked a bit more advanced than what I was looking for. In interval trees the intervals, which are the keys, can overlap, and I don't want that. Or maybe a dictionary of any type is not what I want, since you can mix datatypes as keys in one dictionary. What do you think?
EDIT: From the tip I got, this would be working code:
import bisect
d = [(1, "pear"), (4, "banana"), (7, "orange")]
keys = [j[0] for j in d]
for v in range(1, 10):
    print("Using input ", v)
    i = bisect.bisect(keys, v) - 1
    out = d[i]
    print(out)
    print("")
# Or using SortedDict
from sortedcontainers import SortedDict
d2 = SortedDict()
d2[1] = 'pear'
d2[4] = 'banana'
d2[7] = 'orange'
for v in range(1, 10):
    print("Using input ", v)
    i = bisect.bisect(d2.keys(), v) - 1
    j = d2.keys()[i]
    out = d2[j]
    print(out)
    print("")
The data structure you're looking for is a binary search tree (BST), and preferably a balanced BST. Your dictionary keys are the keys of the BST, and each node would just have an additional field to store the corresponding value. Then your lookup is just a lower-bound / bisect-left on the keys. Looking up Python implementations for Red-Black trees or AVL trees returns many possible packages.
There is no builtin library for always-sorted data. If you never need to add or delete keys, you can use bisect with (key, value) tuples in a sorted list.
For a pure Python implementation that allows modification, I would recommend checking out SortedDict from the SortedContainers library. It's built to be a drop-in replacement for BST's, is very usable and tested, and claims to outperform pointer-based BST's in memory and speed on reasonably sized datasets (but does not have the same asymptotic guarantees as a BST). You can also provide a custom key for comparing objects of different types.
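For the static case (keys never added or deleted), the bisect approach can be wrapped in a small function. This is a sketch only: the names lower_bounds, values and lookup are made up, and raising KeyError for inputs below the smallest bound is an assumption about the desired behavior.

```python
import bisect

# Sorted lower bounds and their values, kept in two parallel lists.
lower_bounds = [1, 4, 7]
values = ["pear", "banana", "orange"]


def lookup(x):
    """Return the value whose lower bound is the largest one <= x."""
    i = bisect.bisect_right(lower_bounds, x) - 1
    if i < 0:
        raise KeyError(x)  # x is below the smallest lower bound
    return values[i]


print(lookup(1))    # pear
print(lookup(3.5))  # pear
print(lookup(4))    # banana
print(lookup(100))  # orange
```

Without the i < 0 check, an input below the first bound would give index -1 and silently return the last value.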
I have a small program that works just fine. I am trying to force myself to review/analyze my code in order to make my code (and myself) just a little better.
I was wondering if this small portion of code:
temp2 = {}
for key in sorted(temp1.keys()):
    temp2[key] = temp1[key]
couldn't be rewritten as a dictionary comprehension.
Due primarily to my profound lack of experience, I am not able to 'convert' this into a comprehension.
All the loop does is take dictionary temp1, sort it and place the newly sorted key:value pairs into temp2.
As I stated above, the whole thing works as is, but I am trying to learn to spot patterns where I can make improvements.
Directly translating this to a dictionary comprehension is easy:
temp2 = {key: value for key, value in sorted(temp1.items())}
Dictionary keys are unique, so the values are never actually reached as a tie-breaker. If you want to make it explicit that only the key is used for sorting, you can also provide a key function:
temp2 = {key: value for key, value in sorted(temp1.items(), key=lambda x: x[0])}
Even though dictionaries are "ordered" in python-3.6, that doesn't mean you should rely on it (in 3.6 the orderedness is officially just a side-effect; it only became a language guarantee in Python 3.7). Better to use an OrderedDict and be on the safe side:
from collections import OrderedDict
temp2 = OrderedDict([(key, value) for key, value in sorted(temp1.items(), key=lambda x: x[0])])
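A small check of how sorting the items behaves, as a sketch with made-up data: because dictionary keys are unique, comparing two (key, value) tuples is always decided by the keys, so even values that cannot be ordered against each other in Python 3 (None and int here) cause no error.

```python
from collections import OrderedDict

# Values of mixed, mutually unorderable types (None vs. int in Python 3).
temp1 = {'b': None, 'c': 3, 'a': 1}

# Tuple comparison stops at the first differing element, which is
# always the key, so the values are never compared.
temp2 = OrderedDict(sorted(temp1.items()))
print(list(temp2.items()))  # [('a', 1), ('b', None), ('c', 3)]
```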
I have a collections.OrderedDict with a list of key, value pairs. I would like to compute the index i such that the ith key matches a given value. For example:
food = OrderedDict([('beans',33),('rice',44),('pineapple',55),('chicken',66)])
I want to go from the key chicken to the index 3, or from the key rice to the index 1. I can do this now with
food.keys().index('rice')
but is there any way to leverage the OrderedDict's ability to look things up quickly by key name? Otherwise it seems like the index-finding would be O(N) rather than O(log N), and I have a lot of items.
I suppose I can do this manually by making my own index:
>>> foodIndex = {k:i for i,k in enumerate(food.keys())}
>>> foodIndex
{'chicken': 3, 'rice': 1, 'beans': 0, 'pineapple': 2}
but I was hoping there might be something built in to an OrderedDict.
Basically, no. OrderedDict gets its ability to look things up quickly by key name just by using a regular, unordered dict under the hood. The order information is stored separately in a doubly linked list. Because of this, there's no way to go directly from the key to its index. The order in an OrderedDict is mainly intended to be available for iteration; a key does not "know" its own order.
As others have pointed out, an OrderedDict is just a dictionary that internally remembers what order entries were added to it. However, you can leverage its ability to look things up quickly by storing the desired index along with the rest of the data for each entry. Here's what I mean:
from collections import OrderedDict
foods = [('beans', 33), ('rice', 44), ('pineapple', 55), ('chicken', 66)]
food = OrderedDict(((v[0], (v[1], i)) for i, v in enumerate(foods))) # saves i
print(food['rice'][1]) # --> 1
print(food['chicken'][1]) # --> 3
The OrderedDict is a subclass of dict which has the ability to traverse its keys in order (and reversed order) by maintaining a doubly linked list. So it does not know the index of a key. It can only traverse the linked list to find the items in O(n) time.
Perusing the source code may be the most satisfying way to confirm that the index is not maintained by OrderedDict. You'll see that nowhere is an index ever used or obtained.
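To make the trade-off concrete, here is a sketch for Python 3 contrasting the linear scan with a separately maintained index. The name food_index is made up, and the index has to be rebuilt by hand whenever the OrderedDict changes.

```python
from collections import OrderedDict

food = OrderedDict([('beans', 33), ('rice', 44),
                    ('pineapple', 55), ('chicken', 66)])

# O(n): walk the keys in order until the key is found.
# (In Python 3, food.keys() is a view and has no .index method.)
print(list(food).index('rice'))  # 1

# O(1) per lookup after an O(n) build; must be rebuilt if keys are
# inserted, deleted, or reordered.
food_index = {key: i for i, key in enumerate(food)}
print(food_index['chicken'])  # 3
```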
Say I have a dict: d = {'Abc':5,'Jack':4,'amy':9,'Tom':0,'abc':5}
I want to write a function such that, if I pass it to the built-in sort function, e.g. list(d).sort(function), the list is sorted based on the values and, if any items have identical values, by their keys (alphabetical order). So, in this case, d = {'Abc':5,'Jack':4,'amy':9,'Tom':0,'abc':5,'TAM':0} returns ['amy','Abc','abc','Jack','TAM','Tom']
The function should look something like this:
def arrange_items(something, thing,**may be a function**):
    if something < thing:
        return -1
    elif something > thing:
        return 1
    etc
If I call some_list.sort(arrange_items), I should get a sorted list back.
Thank you in advance.
Modification of the specification (another question):
Say I have a dict of Twitter users in this format:
dict = {'JohnZ': {'name': 'Jonny Zue', 'follow': 'MiniT'}, etc} # JohnZ is one of the Twitter users. 'follow' means the people that JohnZ follows, in this case MiniT.
The popularity of a user is the number of people that follow that user. In the above example, the popularity of MiniT is at least one, because at least one user follows MiniT.
Say I have a list of Twitter user names, L1 = ['JohnZ','MiniT',etc], and I want to sort L1 by the users' popularity (higher popularity comes first). dict is already defined in the global namespace (we can access it directly). The requirement for this sort function is to use L1.sort(pass_function).
How should I write pass_function so that sort will automatically sort L1 based on the popularity of the users?
Thanks for helping.
[k for k, v in sorted(d.iteritems(), key=lambda x: (-x[1], x[0].lower()))]
EDIT:
(I refuse to use the name "dict" since it shadows a builtin, and shadowing builtins is stupid)
L1.sort(key=lambda x: (-d.get(x, 0), x.lower()))
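The line above assumes d maps each user name to a number. For the modified specification, popularity first has to be counted from the nested dict. The following is a sketch with hypothetical data: the users dict, the extra users, and the assumption that 'follow' holds a single name (as in the question's example) are all made up for illustration.

```python
from collections import Counter

# Hypothetical data in the question's format; 'follow' is assumed to
# hold a single user name, as in the question's example.
users = {
    'JohnZ': {'name': 'Jonny Zue', 'follow': 'MiniT'},
    'MiniT': {'name': 'Mini T', 'follow': 'JohnZ'},
    'Ann': {'name': 'Ann', 'follow': 'MiniT'},
}

# popularity[u] = number of users that follow u (0 for unknown names).
popularity = Counter(info['follow'] for info in users.values())

L1 = ['Ann', 'JohnZ', 'MiniT']
# Higher popularity first; ties broken by name, ignoring case.
L1.sort(key=lambda name: (-popularity[name], name.lower()))
print(L1)  # ['MiniT', 'JohnZ', 'Ann']
```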
You can't achieve this with list(d).sort(function), because you'll get a list with only the dictionary keys. You can achieve your objective with an alternative approach:
l1 = sorted(d.items(), key=lambda x: (x[1], x[0]))
l2 = sorted(l1, key=lambda x: x[1], reverse=True)
result = [x[0] for x in l2]
This approach converts the dictionary to a list of (key, value) tuples. l1 is sorted by value and then by key; l2 re-sorts l1 by value in descending order. Since Python's sort is stable, the key order from l1 is preserved among items with identical values.
Edit: Ignacio Vazquez-Abrams's approach is similar, but more elegant, because the list needs to be sorted only once.
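If a comparison-style function like the question's arrange_items is really required, Python 3 no longer accepts one directly in sort, but functools.cmp_to_key can wrap it. This is a sketch: the body of arrange_items is one possible implementation, and the case-insensitive tie-break is an assumption based on the expected output in the question.

```python
from functools import cmp_to_key

d = {'Abc': 5, 'Jack': 4, 'amy': 9, 'Tom': 0, 'abc': 5, 'TAM': 0}


def arrange_items(a, b):
    """Old-style comparison: higher value first, then key ignoring case."""
    key_a = (-d[a], a.lower())
    key_b = (-d[b], b.lower())
    if key_a < key_b:
        return -1
    if key_a > key_b:
        return 1
    return 0


keys = list(d)
keys.sort(key=cmp_to_key(arrange_items))
print(keys)  # ['amy', 'Abc', 'abc', 'Jack', 'TAM', 'Tom']
```

'Abc' and 'abc' compare as equal here, so their relative order comes from the dict's insertion order (guaranteed on Python 3.7+) combined with the stable sort.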
A Python dictionary is stored in no particular order (mappings have no order), e.g.
>>> myDict = {'first':'uno','second':'dos','third':'tres'}
>>> myDict
{'second': 'dos', 'third': 'tres', 'first': 'uno'}
While it is possible to retrieve a sorted list or tuple from a dictionary, I wonder if it is possible to make a dictionary store the items in the order they are passed to it. In the previous example, this would mean having the internal ordering as {'first':'uno','second':'dos','third':'tres'} and no different.
I need this because I am using the dictionary to store the values as I read them from a configuration file; once read and processed (the values are altered), they have to be written to a new configuration file in the same order as they were read (this order is not alphabetical nor numerical).
Any thoughts?
Please note that I am not looking for secondary ways to retrieve the order (like lists), but for ways to make a dictionary be ordered in itself (as it will be in upcoming versions of Python).
Try Python 2.7 or later (or Python 3.1 or later), which include OrderedDict:
http://www.python.org/
http://python.org/download/releases/2.7/
>>> from collections import OrderedDict
>>> d = OrderedDict([('first', 1), ('second', 2),
... ('third', 3)])
>>> d.items()
[('first', 1), ('second', 2), ('third', 3)]
PEP 372: Adding an ordered dictionary to collections
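For the configuration-file use case in the question, a round trip might look like the sketch below. The "key=value" line format and the variable names are assumptions made for illustration; the question does not say what the file looks like.

```python
from collections import OrderedDict

# Hypothetical "key=value" configuration lines.
lines = ["first=uno", "second=dos", "third=tres"]

config = OrderedDict()
for line in lines:
    key, value = line.split("=", 1)
    config[key] = value

# Altering an existing key keeps its original position.
config["second"] = config["second"].upper()

# Writing back iterates in the order the keys were first inserted.
output = ["%s=%s" % (key, value) for key, value in config.items()]
print(output)  # ['first=uno', 'second=DOS', 'third=tres']
```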
Use a list to hold the key order
Implementations of order-preserving dictionaries certainly do exist.
There is this one in Django, confusingly called SortedDict, that will work in Python >= 2.3 iirc.
Dictionaries in Python are implemented as hash tables, which is why the order appears random. You could implement your own variation of a dict that sorts, but you'd lose out on the convenient syntax. Instead, keep track of the order of the keys, too.
Initialization:
keys = []
myDict = {}
While reading:
myDict[key] = value
keys.append(key)
While writing:
for key in keys:
    print key, myDict[key]
Rather than explaining the theoretical part, I'll give a simple example.
>>> from collections import OrderedDict
>>> my_dictionary=OrderedDict()
>>> my_dictionary['foo']=3
>>> my_dictionary['aol']=1
>>> my_dictionary
OrderedDict([('foo', 3), ('aol', 1)])
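There is a very short answer to that:
dictCopy = yourdictname.copy()
Then use dictCopy. Note that copy() only makes a shallow copy and does not add any ordering of its own, so the copy preserves insertion order only if the original dictionary already does.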