I've noticed that Python lets me do this:
>>> {1: "foo"} < {2: "bar"}
True
It lets me do the same thing for lists, deques, etc. What are the semantics of < when applied to dictionaries in Python?
In general where can I find out the semantics of < for any given type of collection? In most cases it seems not to be found in the documentation. For example:
>>> help(dict.__cmp__)
Help on wrapper_descriptor:
__cmp__(...)
x.__cmp__(y) <==> cmp(x,y)
>>> help(cmp)
Help on built-in function cmp in module __builtin__:
cmp(...)
cmp(x, y) -> integer
Return negative if x<y, zero if x==y, positive if x>y.
I ask because I have a list of tuples of the form (int, dict). I want to sort this array based on the first element, but if the first elements are equal for two items then I don't care about the second. I'd like to know if myArray.sort() will do something complicated involving recursing through the dicts in this case, or if it will just return an arbitrary value.
Quoting from comparison docs,
Tuples and Lists
Tuples and lists are compared lexicographically using comparison of corresponding elements. This means that to compare equal, each element must compare equal and the two sequences must be of the same type and have the same length.
If not equal, the sequences are ordered the same as their first differing elements. For example, cmp([1,2,x], [1,2,y]) returns the same as cmp(x,y). If the corresponding element does not exist, the shorter sequence is ordered first (for example, [1,2] < [1,2,3]).
Dictionaries
Mappings (dictionaries) compare equal if and only if their sorted (key, value) lists compare equal. (The implementation computes this efficiently, without constructing lists or sorting.) Outcomes other than equality are resolved consistently, but are not otherwise defined. (Earlier versions of Python [prior to 2.7.6] used lexicographic comparison of the sorted (key, value) lists, but this was very expensive for the common case of comparing for equality. An even earlier version of Python compared dictionaries by identity only, but this caused surprises because people expected to be able to test a dictionary for emptiness by comparing it to {}.)
Also, see this part of the documentation, which specifically covers comparing sequence types with themselves and with other types:
Sequence objects may be compared to other objects with the same sequence type. The comparison uses lexicographical ordering: first the first two items are compared, and if they differ this determines the outcome of the comparison; if they are equal, the next two items are compared, and so on, until either sequence is exhausted. If two items to be compared are themselves sequences of the same type, the lexicographical comparison is carried out recursively. If all items of two sequences compare equal, the sequences are considered equal. If one sequence is an initial sub-sequence of the other, the shorter sequence is the smaller (lesser) one. Lexicographical ordering for strings uses the ASCII ordering for individual characters.
Note that comparing objects of different types is legal. The outcome is deterministic but arbitrary: the types are ordered by their name. Thus, a list is always smaller than a string, a string is always smaller than a tuple, etc. (The rules for comparing objects of different types should not be relied upon; they may change in a future version of the language.) Mixed numeric types are compared according to their numeric value, so 0 equals 0.0, etc.
The actual dictionary comparison, as per the Python 2.7 source code, goes like this:
1. Compare the lengths first (-1 is returned if the first dict has fewer keys, 1 if the second has fewer keys).
2. If the lengths are the same, find the smallest key in one dict that is either missing from the other dict or maps to a different value there (the source calls this characterizing the dict).
3. Step 2 is done both ways, for (a, b) and for (b, a). If either search finds no such key, the dictionaries are considered equal.
4. Otherwise, the differences found by characterizing the dictionaries are compared (keys first, then values) to get the actual comparison result.
Like the answer from @thefourtheye.
Written in Python, it can be explained like this:
def dict_compare(a, b):
    # Python 2 only: relies on the built-in cmp()
    if len(a) != len(b):  # STEP 1: compare by length
        return -1 if len(a) < len(b) else 1
    res = 0
    akey, aval = characterize(a, b)  # smallest key of a that is missing from b or differs there
    bkey, bval = characterize(b, a)
    if akey is None:  # if no difference
        return 0
    if bkey is not None:  # STEP 2: compare by key
        res = cmp(akey, bkey)
    if res == 0 and bval is not None:  # STEP 3: compare by value
        res = cmp(aval, bval)
    return res
Where the characterize function is something like this:
def characterize(a, b):
    """Find the smallest key k in a that is missing from b or where a[k] != b[k]."""
    akey, aval = None, None
    for k, v in a.items():
        if akey is not None and akey < k:  # already holding a smaller candidate
            continue
        if (k not in b) or (v != b[k]):
            akey, aval = k, v
    return akey, aval
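For the sorting case in the question, one way to avoid depending on dict ordering at all is to sort on the first element only. This is a minimal sketch written for Python 3, where ordering two dicts with < raises TypeError instead of returning a consistent but undefined result. The records list and the by_first name are made up for illustration.

```python
from operator import itemgetter

# Hypothetical sample data in the (int, dict) shape from the question.
records = [(2, {"b": 1}), (1, {"z": 0}), (2, {"a": 5}), (1, {"y": 9})]

# Sort on the int only; the dicts are never compared.
by_first = sorted(records, key=itemgetter(0))
print(by_first)

# Without a key, ties on the int fall through to comparing the dicts,
# which Python 3 refuses to order.
try:
    sorted(records)
except TypeError as exc:
    print("plain sort failed:", exc)
```

Because sorted is stable, items with equal first elements keep their original relative order.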
Related
I have been reading the Core Python programming book, and the author shows an example like:
(4, 5) < (3, 5) # Equals false
So, I'm wondering, how/why does it equal false? How does python compare these two tuples?
Btw, it's not explained in the book.
Tuples are compared position by position:
the first item of the first tuple is compared to the first item of the second tuple; if they are not equal (i.e. the first is greater or smaller than the second) then that's the result of the comparison, else the second item is considered, then the third and so on.
See Common Sequence Operations:
Sequences of the same type also support comparisons. In particular, tuples and lists are compared lexicographically by comparing corresponding elements. This means that to compare equal, every element must compare equal and the two sequences must be of the same type and have the same length.
Also Value Comparisons for further details:
Lexicographical comparison between built-in collections works as follows:
For two collections to compare equal, they must be of the same type, have the same length, and each pair of corresponding elements must compare equal (for example, [1,2] == (1,2) is false because the type is not the same).
Collections that support order comparison are ordered the same as their first unequal elements (for example, [1,2,x] <= [1,2,y] has the same value as x <= y). If a corresponding element does not exist, the shorter collection is ordered first (for example, [1,2] < [1,2,3] is true).
If not equal, the sequences are ordered the same as their first differing elements. For example, cmp([1,2,x], [1,2,y]) returns the same as cmp(x,y). If the corresponding element does not exist, the shorter sequence is considered smaller (for example, [1,2] < [1,2,3] returns True).
Note 1: < and > do not mean "smaller than" and "greater than" but "is before" and "is after": so (0, 1) "is before" (1, 0).
Note 2: tuples must not be considered as vectors in an n-dimensional space, compared according to their length.
Note 3: referring to question https://stackoverflow.com/questions/36911617/python-2-tuple-comparison: do not think that a tuple is "greater" than another only if any element of the first is greater than the corresponding one in the second.
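The position-by-position rule described in this answer can be sketched in a few lines. This is only an illustration of the documented behaviour, not the interpreter's actual code, and lex_less is a hypothetical helper name.

```python
def lex_less(a, b):
    """Return True if sequence a sorts before sequence b (lexicographic order)."""
    for x, y in zip(a, b):
        if x != y:
            # The first differing pair decides the result.
            return x < y
    # All shared positions are equal: the shorter sequence comes first.
    return len(a) < len(b)

print(lex_less((4, 5), (3, 5)))     # 4 vs 3 decides; the second items are never looked at
print(lex_less((0, 1), (1, 0)))     # 0 vs 1 decides
print(lex_less((1, 2), (1, 2, 3)))  # equal prefix, so the shorter one comes first
```

For tuples of mutually comparable items, this should agree with the built-in < operator.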
The Python documentation does explain it.
Tuples and lists are compared lexicographically using comparison of corresponding elements. This means that to compare equal, each element must compare equal and the two sequences must be of the same type and have the same length.
The python 2.5 documentation explains it well.
Tuples and lists are compared lexicographically using comparison of corresponding elements. This means that to compare equal, each element must compare equal and the two sequences must be of the same type and have the same length.
If not equal, the sequences are ordered the same as their first differing elements. For example, cmp([1,2,x], [1,2,y]) returns the same as cmp(x,y). If the corresponding element does not exist, the shorter sequence is ordered first (for example, [1,2] < [1,2,3]).
Unfortunately that page seems to have disappeared in the documentation for more recent versions.
I was confused about integer comparison at first, so here is a more beginner-friendly explanation with an example:
a = ('A','B','C') # see it as the string "ABC"
b = ('A','B','D')
Strings are compared by the code points of their characters, so 'A' is compared as ord('A') # 65, and likewise for the other elements.
So,
>>> a > b # False, because 'C' (67) comes before 'D' (68)
You can think of it as comparing two strings (for single-character elements the result is the same).
The same thing goes for integers too:
x = (1,2,2) # see it as the string "122"
y = (1,2,3)
x > y # False
because 1 is not greater than 1, so move to the next; 2 is not greater than 2, so move to the next; 2 is less than 3 (lexicographically).
The key point is mentioned in the answer above:
think of it as one element coming before another alphabetically, not as one element being greater than another, and in this case treat all the tuple elements as one string.
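The "treat the tuple as one string" picture is only an analogy: each element is compared by its own type's rules, so integers are compared numerically rather than digit by digit. A quick check, written for Python 3:

```python
# Integers inside tuples are compared as numbers.
print((2,) < (10,))   # True: 2 is less than 10

# The same digits as strings are compared character by character.
print("2" < "10")     # False: "2" sorts after "1"

# Single-character strings are compared by code point.
print(("A", "B", "C") < ("A", "B", "D"))  # True
print(ord("C") < ord("D"))                # True: 67 < 68
```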
I am implementing a method in my python program that checks if a mathematical function is valid.
An example of one in my program would be:
['set',['tuple',1,2],['tuple',3,4]]
Which equates to, {(1,2),(3,4)}
For the check to return True all tuples within the set must have a unique number as their leftmost value. So the function {(1,2),(1,4)} would return false.
Currently I have implemented this for a set with one tuple, which would require no check for a unique value in the tuple:
if "set" in argument:
    print("Found a set")
    print("Next part of argument", argument[1])
    if "tuple" in argument[1]:
        print("Found a tuple, only one found so this argument is a function")
I am unsure how to implement this for a set that may contain multiple tuples like the examples above.
How about this:
def is_function(thing):
    if thing[0] == 'set':
        different = len(set(element[1] for element in thing if element[0] == 'tuple'))
        tuples = sum(1 for element in thing if element[0] == 'tuple')
        return different == tuples
If the first element is 'set', then count the number of different first items in the tuples (by measuring the length of its set), and compare it with the amount of "tuples" in the list.
>>> is_function(['set',['tuple',1,2],['tuple',3,4]])
True
>>> is_function(['set',['tuple',1,2],['tuple',1,4]])
False
Better explanation:
The function first tests whether the first element of the list is "set", if it's not the function terminates (and returns None).
A set is created from the generator expression element[1] for element in thing if element[0] == 'tuple', which yields the second element of every sublist in the main list whose first element is "tuple". This set will contain all first values, each of them once (because it's a set).
The cardinality of that set is stored in different. It is the amount of different elements directly after "tuple".
A sum is calculated from a similar generator expression. Again, this iterates over all sublists whose first element is "tuple", but what is added up is just the number 1, therefore the result will be the number of sublists that start with "tuple".
The function returns the result of different == tuples; so True if they're the same and False otherwise. If there are several "tuples" with the same starting element, then different will be smaller than tuples, so it will return False. If there aren't, it will return True, because the number of "tuples" with different starting elements will be the same as the number of "tuples".
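As a variation on the same idea, here is a sketch that stops at the first repeated leftmost value and returns False (rather than None) when the input is not a set. It assumes the same nested-list encoding as in the question and that the leftmost values are hashable; is_function_single_pass is a made-up name.

```python
def is_function_single_pass(expr):
    """Return True if expr is ['set', ['tuple', x, y], ...] with all x distinct."""
    if not expr or expr[0] != 'set':
        return False
    seen = set()
    for element in expr[1:]:
        # Only sublists tagged 'tuple' take part in the check.
        if isinstance(element, list) and element and element[0] == 'tuple':
            first = element[1]
            if first in seen:
                return False  # same leftmost value twice: not a function
            seen.add(first)
    return True

print(is_function_single_pass(['set', ['tuple', 1, 2], ['tuple', 3, 4]]))
print(is_function_single_pass(['set', ['tuple', 1, 2], ['tuple', 1, 4]]))
```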
Simply put: I have a list, say LST = [[12,1],[23,2],[16,3],[12,4],[14,5]], and I want to get all the minimum elements of this list according to the first element of each inner list. So for the above example the answer would be [12,1] and [12,4]. Is there a typical way of doing this in Python?
Thanking you in advance.
Two passes:
minval = min(LST)[0]
return [x for x in LST if x[0] == minval]
One pass:
def all_minima(iterable, key=None):
    if key is None:
        key = lambda x: x  # identity; the built-in id() would compare memory addresses
    hasminvalue = False
    minvalue = None
    minlist = []
    for entry in iterable:
        value = key(entry)
        if not hasminvalue or value < minvalue:
            # New minimum found: restart the list of minima
            minvalue = value
            hasminvalue = True
            minlist = [entry]
        elif value == minvalue:
            minlist.append(entry)
    return minlist
from operator import itemgetter
return all_minima(LST, key=itemgetter(0))
A compact single-pass solution requires sorting the list -- that's technically O(N log N) for an N-long list, but Python's sort is so good, and so many sequences "just happen" to have some embedded order in them (which timsort cleverly exploits to go faster), that sorting-based solutions sometimes have surprisingly good performance in the real world.
Here's a solution requiring 2.6 or better:
import itertools
import operator
f = operator.itemgetter(0)
def minima(lol):
    return list(next(itertools.groupby(sorted(lol, key=f), key=f))[1])
To understand this approach, looking "from the inside, outwards" helps.
f, i.e., operator.itemgetter(0), is a key-function that picks the first item of its argument for ordering purposes -- the very purpose of operator.itemgetter is to easily and compactly build such functions.
sorted(lol, key=f) therefore returns a sorted copy of the list-of-lists lol, ordered by increasing first item. If you omit the key=f the sorted copy will be ordered lexicographically, so it will also be in order of increasing first item, but that acts only as the "primary key" -- items with the same first sub-item will in turn be sorted among them by the values of their second sub-items, and so forth -- while with the key=f you're guaranteed to preserve the original order among items with the same first sub-item. You don't specify which behavior you require (and in your example the two behaviors happen to produce the same result, so we cannot distinguish from that example) which is why I'm carefully detailing both possibilities so you can choose.
itertools.groupby(sorted(lol, key=f), key=f) performs the "grouping" task that is the heart of the operation: it yields groups from the sequence (in this case, the sequence sorted provides) based on the key ordering criteria. That is, a group with all adjacent items producing the same value among themselves when you call f with the item as an argument, then a group with all adjacent items producing a different value from the first group (but the same among themselves), and so forth. groupby respects the ordering of the sequence it takes as its argument, which is why we had to sort the lol first (and this behavior of groupby makes it very useful in many cases in which the sequence's ordering does matter).
Each result yielded by groupby is a pair k, g: a key k which is the result of f(i) on each item in the group, an iterator g which yields each item in the group in sequence.
The next built-in (the only bit in this solution which requires Python 2.6) given an iterator produces its next item -- in particular, the first item when called on a fresh, newly made iterator (and, every generator of course is an iterator, as is groupby's result). In earlier Python versions, it would have to be groupby(...).next() (since next was only a method of iterators, not a built-in), which is deprecated since 2.6.
So, summarizing, the result of our next(...) is exactly the pair k, g where k is the minimum (i.e., first after sorting) value for the first sub-item, and g is an iterator for the group's items.
So, with that [1] we pick just the iterator, so we have an iterator yielding just the subitems we want.
Since we want a list, not an iterator (per your specs), the outermost list(...) call completes the job.
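The steps above can be traced one at a time on the list from the question. This is a Python 3 sketch with intermediate names (ordered, groups, min_key, min_group) introduced purely for illustration:

```python
import itertools
import operator

LST = [[12, 1], [23, 2], [16, 3], [12, 4], [14, 5]]
first = operator.itemgetter(0)

# Step 1: sort by the first sub-item (stable, so [12, 1] stays before [12, 4]).
ordered = sorted(LST, key=first)
print(ordered)

# Step 2: group adjacent items that share the same first sub-item.
groups = [(k, list(g)) for k, g in itertools.groupby(ordered, key=first)]
print(groups)

# Step 3: take only the first (key, iterator) pair and materialize its group.
min_key, min_group = next(itertools.groupby(ordered, key=first))
minima = list(min_group)
print(min_key, minima)
```

Each group iterator has to be consumed before groupby is advanced again, which is why step 2 turns g into a list immediately.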
Is all of this worth it, performance-wise? Not on the tiny example list you give -- minima is actually slower than either code in @Kenny's answer (of which the first, "two-pass" solution is speedier). I just think it's worth keeping the ideas in mind for the next sequence processing problem you may encounter, where the details of typical inputs may be quite different (longer lists, rarer minima, partial ordering in the input, &c, &c;-).
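The difference between sorted(lol, key=f) and a plain sorted(lol), described in the walkthrough above, only shows up when several items share the same first sub-item. A small sketch with made-up data (rows is not from the question):

```python
import operator

first = operator.itemgetter(0)
rows = [[1, 'b'], [0, 'z'], [1, 'a']]

# With key=first, ties keep their original relative order (stable sort).
by_key = sorted(rows, key=first)
print(by_key)

# Without a key, ties are broken lexicographically by the second sub-item.
plain = sorted(rows)
print(plain)
```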
m = min(LST, key=operator.itemgetter(0))[0]
print [x for x in LST if x[0] == m]
minval = min(x[0] for x in LST)
result = [x for x in LST if x[0]==minval]