A Python set is meant to be unordered, so why does enumerate accept it as input?
The same question applies to dictionaries.
From my point of view, this gives the false impression that there is a predictable way of enumerating them, when there is not.
This is quite misleading. I would have expected at least a warning from enumerate when I request enumerate(set) or enumerate(dict).
Can anyone explain why this warning is not there? Is it "pythonic" to allow enumeration that may be unpredictable?
There is a distinction between a container and its iterator. Technically, enumerate doesn't work with set, dict, or list, because none of those types is an iterator. They are iterable, though, meaning enumerate can get an iterator from each by implicitly using the iter function (i.e., enumerate(some_list_dict_or_set) behaves the same as enumerate(iter(some_list_dict_or_set))).
>>> iter([1,2,3])
<list_iterator object at 0x109d924e0>
>>> iter(dict(a=1, b=2))
<dict_keyiterator object at 0x109d4b818>
>>> iter({1,2,3})
<set_iterator object at 0x109d53ab0>
So while a given container may not have any inherent ordering of its elements, its iterator can impose an order, and enumerate simply pairs that ordering with a sequence of int values.
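As a minimal sketch of that pairing (the set contents are made up for illustration), the indices produced by enumerate are always 0, 1, 2, ... while the order of the elements is whatever the set's iterator happens to yield:

```python
# A set has no positional index, but its iterator still yields items one by one.
items = {"pear", "apple", "fig"}

# enumerate() simply numbers whatever order the iterator produces.
pairs = list(enumerate(items))

indices = [i for i, _ in pairs]
values = {v for _, v in pairs}

print(indices)          # [0, 1, 2]
print(values == items)  # True
```

The element order in pairs is deliberately not shown, since it may differ between runs and Python versions.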
You can really see the difference between inherent ordering and imposed ordering when comparing dict and OrderedDict in Python 3.7 or later. Both remember the order in which their keys were added, but that order isn't an important part of a dict's identity. That is, two dicts with the same keys and the same values mapped to those keys are equal, no matter what order the keys were added in.
>>> dict(a=1, b=2) == dict(b=2, a=1)
True
The same is not true of two OrderedDicts, which are only equal if they have the same keys, the same values for those keys, and the keys were added in the same order.
>>> from collections import OrderedDict
>>> OrderedDict(a=1, b=2) == OrderedDict(b=2, a=1)
False
enumerate accepts any iterable which includes set and dict. set might be unordered but its order of iteration is not arbitrary; if you iterate the same set multiple times, it will yield elements in the same order.
Also note that as of Python 3.7 dict preserves insertion order. Whether or not this is useful solely depends on your use case.
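A small sketch of both points, assuming CPython and a set that is not modified between passes (the variable names are illustrative only):

```python
s = {"x", "y", "z"}

# Two passes over the same, unmodified set yield the same order.
first_pass = list(s)
second_pass = list(s)
same_order = first_pass == second_pass
print(same_order)  # True

# A dict (Python 3.7+) iterates in insertion order, so enumerate is predictable.
d = {}
d["b"] = 1
d["a"] = 2
dict_order = list(enumerate(d))
print(dict_order)  # [(0, 'b'), (1, 'a')]
```

What is not guaranteed for the set is which order you get in the first place, or that it stays the same after adding or removing elements.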
In Python 3.10, I am aware that a dictionary preserves insertion order. However when performing conditional list comprehensions, can this order still be guaranteed?
For example, given:
my_dict = {}
my_dict['a'] = 1
my_dict['b'] = 2
my_dict['c'] = 3
my_dict['d'] = 4
Can one guarantee that either (option A):
print([k for k in my_dict.keys() if k not in ['c']])
or (option B):
print([k for k in (my_dict.keys() - {'c'})])
will always return:
['a', 'b', 'd']
Iterating over dict or dict.keys() should give the same results for any version of Python, since the language guarantees the current order will always be stable, even if it doesn't necessarily match the insertion order. In Python 3, the keys() method provides a dynamic view of the dictionary's entries, so it will directly reflect the current state of the dict. The views themselves may be "set-like", but that does not imply they are unordered (or independently ordered).
The problem with the examples in the question is that they don't compare like with like. The keys() method returns a view (or a list in earlier versions), whereas keys() - {'c'} evaluates to a set (i.e. an object with no guaranteed order). So it is safe to assume option A will always give the same results, but not option B.
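The difference can be checked directly. This is only a sketch using the dictionary from the question; the names option_a and option_b are mine:

```python
my_dict = {"a": 1, "b": 2, "c": 3, "d": 4}

# Option A: filter while iterating the dict, so insertion order is kept.
option_a = [k for k in my_dict if k not in {"c"}]
print(option_a)  # ['a', 'b', 'd']

# Option B: subtracting from a keys view produces a plain set.
option_b = my_dict.keys() - {"c"}
print(type(option_b).__name__)  # set

# The set has the right members, but its iteration order is not guaranteed.
print(option_b == {"a", "b", "d"})  # True
```

If the set form is convenient for other reasons, wrapping it in sorted() gives a deterministic order, though that is sorted order rather than insertion order.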
I think the short answer is yes: the "preserves insertion order" clause gives you a proper order of keys whenever you go through them (be it via for k in my_dict or my_dict.keys()), and together with the one that @Larry pointed out, it gives you what you ask for.
However, the downvotes on this question are probably due to the fact that if you need an answer to this question for a coding problem, you should either learn more about list comprehensions or just rethink your solution and sort the keys based on insertion order, or use whatever other way of guaranteeing the order you can think of.
I need set-like data structure with these properties:
hashable
no duplicate elements
maintains order
immutable
iterable
part of standard library? want to keep it simple
What is happening:
frozenset([3,1,2,2,3]) -> frozenset(1,2,3)
What I need:
frozenset*([3,1,2,2,3]) -> frozenset*(3,1,2)
I thought I could use frozenset but both sets and frozensets
reorder elements. I assume this is for faster duplicate checks?
But in any case I can't have a reordering.
As of Python 3.7 dicts no longer reorder elements and instead guarantee to preserve insertion order. You could use a dict where the keys are your set items and the values are ignored.
>>> dict.fromkeys([3,1,2,2,3])
{3: None, 1: None, 2: None}
Dicts aren't frozen, so if that's crucial then you could first put all the items into a dict and then build a tuple from the keys.
>>> tuple(dict.fromkeys([3,1,2,2,3]).keys())
(3, 1, 2)
This would be pretty close to a frozenset. The main difference is that it would take O(n) rather than O(1) time to check if an item is in the tuple.
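A minimal sketch of that approach, assuming Python 3.7+ and hashable elements. The helper name ordered_unique is hypothetical, not something from the standard library:

```python
def ordered_unique(iterable):
    """Hypothetical helper: drop duplicates, keep first-seen order, return a tuple."""
    return tuple(dict.fromkeys(iterable))

items = ordered_unique([3, 1, 2, 2, 3])
print(items)  # (3, 1, 2)

# A tuple of hashable elements is itself hashable, so it can serve as a dict key.
lookup = {items: "some value"}
print(lookup[(3, 1, 2)])  # some value

# Membership tests work, but scan the tuple linearly.
print(2 in items)  # True
```

One difference from frozenset to keep in mind: tuples compare with order, so ordered_unique([1, 2]) and ordered_unique([2, 1]) are not equal.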
There is no such implementation in the standard library.
So the Python documentation suggests using itemgetter, attrgetter, or methodcaller from the operator module when applying sorted to complex data types. Further, iterators are smaller and faster than lists for large objects.
Thus I am wondering how to create an iterator over an OrderedDict's values. The reason is that the values in the OrderedDict I wish to sort are themselves (regular) dictionaries.
For regular dictionaries, one could do this with:
sorted(my_dict.itervalues(), key=itemgetter('my_key'))
however OrderedDict only seems to have the method __iter__() which works on the OrderedDict keys.
So how can I efficiently make an iterator for the values of the OrderedDict?
Note, I am not looking for list comprehension, a lambda function, or extracting the relevant sub key (key inside the dictionary (a value)) values of the OrderedDict.
e.g.
sorted(my_dict, key=lambda key: my_dict[key]['my_key'])
example nested:
test = OrderedDict({'a': {'x':1, 'y':2, 'z':3},
'b': {'x':1, 'y':2, 'z':3}
})
Neither dict nor OrderedDict have an itervalues() method in Python 3. That method only exists in Python 2.
Use dict.values():
sorted(my_dict.values(), key=itemgetter('my_key'))
In Python 2 you want to use itervalues() not so much because it is an iterator, but because dict.values() has to create a new list object which is then discarded again. Iterators are also not faster (rather, they are often slower!); they are instead more memory efficient. In this case it is faster because creating a (large) list that you then discard again takes time.
In Python 3, dict.values() creates a view instead, a lightweight object that like dict.itervalues() yields values on demand and doesn't have to produce a list up front.
You don't have to call iter() on this. sorted() takes an iterable, and will itself call iter() on whatever you passed in. Because it does this from native code and doesn't have to look up a global name, it can do this much faster than Python code ever could.
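Put together, a sketch for Python 3 could look like this. The nested data is invented for illustration, with distinct 'x' values so the sort is visible:

```python
from collections import OrderedDict
from operator import itemgetter

# Illustrative data: each value is a regular dict.
test = OrderedDict([
    ("a", {"x": 3, "y": 2}),
    ("b", {"x": 1, "y": 5}),
    ("c", {"x": 2, "y": 0}),
])

# values() returns a view; sorted() iterates it directly, no iter() call needed.
by_x = sorted(test.values(), key=itemgetter("x"))
x_values = [row["x"] for row in by_x]
print(x_values)  # [1, 2, 3]
```

Note that sorted() always builds a new list for its result, so the view only avoids creating an intermediate list of the values.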
The answer is to call the .values() method to get a view and pass it to iter:
sorted(iter(my_dict.values()), key=itemgetter('my_subkey'))
If we have 2 separate dicts, both with the same keys and values, printing them may show the items in different orders, as expected.
So, let's say I want to use hash() on those dicts:
hash(frozenset(dict1.items()))
hash(frozenset(dict2.items()))
I'm doing this to make a new dict that uses the values returned by hash() as its keys.
Even though the dicts show up differently when printed, will the value created by hash() always be equal? If not, how can I make it always the same so that I can make comparisons successfully?
If the keys and values hash the same, frozenset is designed to be a stable and unique representation of the underlying values. The docs explicitly state:
Two sets are equal if and only if every element of each set is contained in the other (each is a subset of the other).
And the rules for hashable types require that:
Hashable objects which compare equal must have the same hash value.
So by definition frozensets with equal, hashable elements are equal and hash to the same value. This can only be violated if a user-defined class which does not obey the rules for hashing and equality is contained in the resulting frozenset (but then you've got bigger problems).
Note that this does not mean they'll iterate in the same order or produce the same repr; because of the way hash collisions are resolved, two frozensets constructed from the same elements in a different order need not iterate in the same order. But they're still equal to one another, and hash the same (precise outputs and ordering are implementation dependent and could easily vary between different versions of Python; this just happens to work on my Py 3.5 install to create the desired "different iteration order" behavior):
>>> frozenset([1,9])
frozenset({1, 9})
>>> frozenset([9,1])
frozenset({9, 1}) # <-- Different order; consequence of 8 buckets colliding for 1 and 9
>>> hash(frozenset([1,9]))
-7625378979602737914
>>> hash(frozenset([9,1]))
-7625378979602737914 # <-- Still the same hash though
>>> frozenset([1,9]) == frozenset([9,1])
True # <-- And still equal
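Applied to the dicts from the question, a sketch might look like the following. The contents of dict1 and dict2 are assumed, and all values must be hashable for items() to go into a frozenset:

```python
# Same keys and values, inserted in a different order.
dict1 = {"a": 1, "b": 2, "c": 3}
dict2 = {"c": 3, "b": 2, "a": 1}

key1 = frozenset(dict1.items())
key2 = frozenset(dict2.items())

print(key1 == key2)              # True
print(hash(key1) == hash(key2))  # True

# The frozenset itself can be used as the key of the new dict.
index = {key1: "first seen"}
print(index[key2])  # first seen
```

Using the frozenset as the key, rather than the integer from hash(), may be the safer choice: two different dicts could in principle hash to the same integer, whereas dict lookup also checks equality. The actual hash numbers can also differ between interpreter runs when strings are involved, so they are only comparable within one process.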
>>> {x for x in 'spam'}
{'a', 'p', 's', 'm'}
Why does it change the order? If you take a look at a loop, it works perfectly:
>>> for x in 'spam':
... print(x)
...
s
p
a
m
>>>
Sets in Python (and in set theory) are not ordered. So when you loop over them, there is no defined ordering.
You looped over the string literal 'spam' to make a set containing each character in that string. Once you did that, the ordering was gone.
When you perform the for loop over 'spam', you are performing the loop against a string which does have ordering.
From Set types:
These represent unordered, finite sets of unique, immutable objects. As such, they cannot be indexed by any subscript [because no ordering is defined among the elements]. However, they can be iterated over, and the built-in function len() returns the number of items in a set. Common uses for sets are fast membership testing, removing duplicates from a sequence, and computing mathematical operations such as intersection, union, difference, and symmetric difference.
But if you really need to preserve the order, then please check ordered set.
In any case, you could simply write set('spam') instead of using a comprehension.
set is not an ordered collection, and as such, the internal order of keys is undefined.
From docs.python.org
A set object is an unordered collection of distinct hashable objects. Common uses include membership testing, removing duplicates from a sequence, and computing mathematical operations such as intersection, union, difference, and symmetric difference. (For other containers see the built in dict, list, and tuple classes, and the collections module.)
sets are unordered by definition. The reason for this is that their implementation runs faster that way, by using appropriate data structures that do not preserve order. If you need order, you can use the (slower) OrderedDict type.
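As a sketch of the order-preserving alternative, OrderedDict.fromkeys keeps the first occurrence of each character in the order it appeared. On Python 3.7+ a plain dict behaves the same way, which is my addition and not part of the answer above:

```python
from collections import OrderedDict

# Duplicates removed, first-seen order kept.
ordered_chars = list(OrderedDict.fromkeys("spam"))
print(ordered_chars)  # ['s', 'p', 'a', 'm']

# Same idea with a plain dict (Python 3.7+), on input with repeated letters.
deduped = "".join(dict.fromkeys("banana"))
print(deduped)  # ban
```

The values stored in the mapping are all None and are simply ignored; only the keys matter here.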
Python sets are defined as unordered, so Python is free to order them any way it likes (efficiently, I presume).