I've got a dict where some of the values are not hashable. I need some way to compare two unordered groups of these to ensure they contain equal elements. I can't use lists because list equality takes the order into account but sets won't work because dicts aren't hashable. I had a look through the python docs, and the only thing that looks useful is a dict's view, which is hashable under some circumstances but in this case this doesn't help either as one of the values is an object which contains lists itself, meaning the dict's view won't be hashable either.
Is there a standard container for situations like this, or should I just use lists and loop through every element in both lists and ensure an equal element is somewhere in the other list?
When duplicate entries don't exist, the usual choices are:
If the elements are hashable: set(a) == set(b)
If the elements are orderable: sorted(a) == sorted(b)
If all you have is equality: len(a) == len(b) and all(x in b for x in a)
If you have duplicates and their multiplicity matters, the choices are:
If the elements are hashable: Counter(a) == Counter(b)
If the elements are orderable: sorted(a) == sorted(b)
If all you have is equality: len(a) == len(b) and all(a.count(x) == b.count(x) for x in a)
I think the simplest method is to use lists.
group_1 = my_dict_1.values()
group_2 = my_dict_2.values()
Your comparison won't be as simple as if order mattered, or if the values were hashable, but the following should work:
def contain_the_same(group_1, group_2):
for item in group_1:
if item not in group_2:
return False
else:
group_2.pop(group_2.index(item))
if len(group_2) != 0:
return False
return True
This should be able to handle unhashable objects just fine:
>>> contain_the_same([1,2,3], [1,2,3])
True
>>> contain_the_same([1,2,3], [1,2,3,4])
False
>>> contain_the_same([1,2,[3,2,1]], [1,2,[3,2,1]])
True
>>> contain_the_same([5,1,2,[3,2,1,[1]]], [1,[3,2,1,[1]],2,5])
True
A caveat: This will return false if there are duplicates in one list, but no the other. This would require some modification if you wanted to make that an allowable case.
Edit: Even easier:
sorted(my_dict_1.values()) == sorted(my_dict_1.values())
It even looks like this is twice as fast as my contain_the_same function:
>>> timeit("contain_the_same([5,1,2,[3,2,1,[1]]], [1,[3,2,1,[1]],2,5])",
"from __main__ import contain_the_same", number=10000)/10000
8.868489032757054e-06
>>>timeit("sorted([5,1,2,[3,2,1,[1]]]) == sorted([1,[3,2,1,[1]],2,5])",
number=10000)/10000
4.928951884845034e-06
Although it would not be as easy to extend to the case where duplicates are allowed.
Related
I want to check if all lists in a list of lists are equal.
One example for which I succeeded is a lists of two lists l2 for which
all([a == b for a, b in zip(*l2)])
correctly returns True if l2 = [[1,2],[1,2]] and Falsewhen l2 = [[1,2],[1,666]].
I expected to be able to directly use this code in the case in which the list of lists l has more lists in it by using the same code, but it seems to not work.
For example, when
l=[[1,2],[1,2],[1,2]]
all([a == b for a, b in zip(*l)])
returns the following error:
ValueError: too many values to unpack (expected 2)
I do not understand why this is the case as the result of zip(*l) looks like it should work:
list(zip(*l))
>> [(1, 1, 1), (2, 2, 2)]
Using the observation a == b and a == c imply a == c. You should test the first list with the other lists.
def equalLists(lists):
return not lists or all(lists[0] == b for b in lists[1:])
>>> equalLists([])
True
>>> equalLists([1,2],[1,2])
True
>>> equalLists([1,2],[1,2],[1,2])
True
>>> equalLists([1,2],[1,2],[1,3])
False
You could create a set of tuples and check the size is 1, otherwise they are not all the same:
len(set(tuple(elem) for elem in l)) == 1
This will work for a list of lists of any length. It will also be more efficient than linear time comparisons.
(You have to convert to a tuple first because a list is not hashable and a set requires its members to be hashable.)
Your method (and the other answers here) don't consider that if the lists' lengths vary, zip will shorten them to the length of the shortest:
all(a == b for a,b in zip([1,2], [1,2,3]))
>>> True
Firstly note that it's not necessary to construct a list in all like all([...]) as this adds an extra iteration after list creation, whereas as I've done above uses a generator which evaluates as it goes along.
If each list has hashable elements, I'd exploit set to calculate the distinct elements and check there's only 1:
len(set(tuple(x) for x in l)) == 1
If the elements aren't hashable, but do have the equals method defined on them (unlike your examples, since int is hashable) I'd compare each list to the first, possibly using a generator if you want to avoid comparing the first to itself:
li = iter(l)
first = next(li)
all(x == first for x in li)
This still makes use of python's built-in list equals method and won't do more comparisons than any zip methods in the case that all the lists are equal.
The only case where the above is inefficient is if you have a list of long lists, where most but not all are equal. In that case it's possible a zip method would be quicker:
from itertools import zip_longest
all(len(set(x)) == 1 for x in zip_longest(*l))
Here I used zip_longest for the case the list lengths are unequal. If you knw the lengths are equal you can use zip. By default it fills values with None from the shorter lists once they 'run out' in the iterator, so only use this if your lists have no legitimate Nones! (In that case you can set zip_longest(..., fillvalue="<something not in the lists>").
Equivalent for non-hashable list elements (with equals method):
all(all(i == x[0] for i in x[1:]) for x in zip_longest(*l))
l=[[1,2],[1,2],[1,2]]
all([a == b == c for a, b, c in zip(*l)])
You're telling zip() there's 2 values to unpack but zip has 3 lists to unpack.
You have three item after zip (1,1,1)
Use
l=[[1,2],[1,2],[1,2]]
all([a == b for a, b, c in zip(*l)])
or
l=[[1,2],[1,2],[1,2]]
all([a == b for a, b, *_ in zip(*l)])
# OR all([a == b for a, b, _ in zip(*l)])
# OR all([i[0] == i[1] for i in zip(*l)])
Edit as per comment.
To test all sub elements are equal
all([len(set(i)) == 1 for i in zip(*l)])
a = [1, 2, 3, 1, 2, 3]
b = [3, 2, 1, 3, 2, 1]
a & b should be considered equal, because they have exactly the same elements, only in different order.
The thing is, my actual lists will consist of objects (my class instances), not integers.
O(n): The Counter() method is best (if your objects are hashable):
def compare(s, t):
return Counter(s) == Counter(t)
O(n log n): The sorted() method is next best (if your objects are orderable):
def compare(s, t):
return sorted(s) == sorted(t)
O(n * n): If the objects are neither hashable, nor orderable, you can use equality:
def compare(s, t):
t = list(t) # make a mutable copy
try:
for elem in s:
t.remove(elem)
except ValueError:
return False
return not t
You can sort both:
sorted(a) == sorted(b)
A counting sort could also be more efficient (but it requires the object to be hashable).
>>> from collections import Counter
>>> a = [1, 2, 3, 1, 2, 3]
>>> b = [3, 2, 1, 3, 2, 1]
>>> print (Counter(a) == Counter(b))
True
If you know the items are always hashable, you can use a Counter() which is O(n)
If you know the items are always sortable, you can use sorted() which is O(n log n)
In the general case you can't rely on being able to sort, or has the elements, so you need a fallback like this, which is unfortunately O(n^2)
len(a)==len(b) and all(a.count(i)==b.count(i) for i in a)
If you have to do this in tests:
https://docs.python.org/3.5/library/unittest.html#unittest.TestCase.assertCountEqual
assertCountEqual(first, second, msg=None)
Test that sequence first contains the same elements as second, regardless of their order. When they don’t, an error message listing the differences between the sequences will be generated.
Duplicate elements are not ignored when comparing first and second. It verifies whether each element has the same count in both sequences. Equivalent to: assertEqual(Counter(list(first)), Counter(list(second))) but works with sequences of unhashable objects as well.
New in version 3.2.
or in 2.7:
https://docs.python.org/2.7/library/unittest.html#unittest.TestCase.assertItemsEqual
Outside of tests I would recommend the Counter method.
The best way to do this is by sorting the lists and comparing them. (Using Counter won't work with objects that aren't hashable.) This is straightforward for integers:
sorted(a) == sorted(b)
It gets a little trickier with arbitrary objects. If you care about object identity, i.e., whether the same objects are in both lists, you can use the id() function as the sort key.
sorted(a, key=id) == sorted(b, key==id)
(In Python 2.x you don't actually need the key= parameter, because you can compare any object to any object. The ordering is arbitrary but stable, so it works fine for this purpose; it doesn't matter what order the objects are in, only that the ordering is the same for both lists. In Python 3, though, comparing objects of different types is disallowed in many circumstances -- for example, you can't compare strings to integers -- so if you will have objects of various types, best to explicitly use the object's ID.)
If you want to compare the objects in the list by value, on the other hand, first you need to define what "value" means for the objects. Then you will need some way to provide that as a key (and for Python 3, as a consistent type). One potential way that would work for a lot of arbitrary objects is to sort by their repr(). Of course, this could waste a lot of extra time and memory building repr() strings for large lists and so on.
sorted(a, key=repr) == sorted(b, key==repr)
If the objects are all your own types, you can define __lt__() on them so that the object knows how to compare itself to others. Then you can just sort them and not worry about the key= parameter. Of course you could also define __hash__() and use Counter, which will be faster.
If the comparison is to be performed in a testing context, use assertCountEqual(a, b) (py>=3.2) and assertItemsEqual(a, b) (2.7<=py<3.2).
Works on sequences of unhashable objects too.
If the list contains items that are not hashable (such as a list of objects) you might be able to use the Counter Class and the id() function such as:
from collections import Counter
...
if Counter(map(id,a)) == Counter(map(id,b)):
print("Lists a and b contain the same objects")
Let a,b lists
def ass_equal(a,b):
try:
map(lambda x: a.pop(a.index(x)), b) # try to remove all the elements of b from a, on fail, throw exception
if len(a) == 0: # if a is empty, means that b has removed them all
return True
except:
return False # b failed to remove some items from a
No need to make them hashable or sort them.
I hope the below piece of code might work in your case :-
if ((len(a) == len(b)) and
(all(i in a for i in b))):
print 'True'
else:
print 'False'
This will ensure that all the elements in both the lists a & b are same, regardless of whether they are in same order or not.
For better understanding, refer to my answer in this question
You can write your own function to compare the lists.
Let's get two lists.
list_1=['John', 'Doe']
list_2=['Doe','Joe']
Firstly, we define an empty dictionary, count the list items and write in the dictionary.
def count_list(list_items):
empty_dict={}
for list_item in list_items:
list_item=list_item.strip()
if list_item not in empty_dict:
empty_dict[list_item]=1
else:
empty_dict[list_item]+=1
return empty_dict
After that, we'll compare both lists by using the following function.
def compare_list(list_1, list_2):
if count_list(list_1)==count_list(list_2):
return True
return False
compare_list(list_1,list_2)
from collections import defaultdict
def _list_eq(a: list, b: list) -> bool:
if len(a) != len(b):
return False
b_set = set(b)
a_map = defaultdict(lambda: 0)
b_map = defaultdict(lambda: 0)
for item1, item2 in zip(a, b):
if item1 not in b_set:
return False
a_map[item1] += 1
b_map[item2] += 1
return a_map == b_map
Sorting can be quite slow if the data is highly unordered (timsort is extra good when the items have some degree of ordering). Sorting both also requires fully iterating through both lists.
Rather than mutating a list, just allocate a set and do a left-->right membership check, keeping a count of how many of each item exist along the way:
If the two lists are not the same length you can short circuit and return False immediately.
If you hit any item in list a that isn't in list b you can return False
If you get through all items then you can compare the values of a_map and b_map to find out if they match.
This allows you to short-circuit in many cases long before you've iterated both lists.
plug in this:
def lists_equal(l1: list, l2: list) -> bool:
"""
import collections
compare = lambda x, y: collections.Counter(x) == collections.Counter(y)
ref:
- https://stackoverflow.com/questions/9623114/check-if-two-unordered-lists-are-equal
- https://stackoverflow.com/questions/7828867/how-to-efficiently-compare-two-unordered-lists-not-sets
"""
compare = lambda x, y: collections.Counter(x) == collections.Counter(y)
set_comp = set(l1) == set(l2) # removes duplicates, so returns true when not sometimes :(
multiset_comp = compare(l1, l2) # approximates multiset
return set_comp and multiset_comp #set_comp is gere in case the compare function doesn't work
I have two lists say
List1 = ['a','c','c']
List2 = ['x','b','a','x','c','y','c']
Now I want to find out if all elements of List1 are there in List2. In this case all there are. I can't use the subset function because I can have repeated elements in lists. I can use a for loop to count the number of occurrences of each item in List1 and see if it is less than or equal to the number of occurrences in List2. Is there a better way to do this?
Thanks.
When number of occurrences doesn't matter, you can still use the subset functionality, by creating a set on the fly:
>>> list1 = ['a', 'c', 'c']
>>> list2 = ['x', 'b', 'a', 'x', 'c', 'y', 'c']
>>> set(list1).issubset(list2)
True
If you need to check if each element shows up at least as many times in the second list as in the first list, you can make use of the Counter type and define your own subset relation:
>>> from collections import Counter
>>> def counterSubset(list1, list2):
c1, c2 = Counter(list1), Counter(list2)
for k, n in c1.items():
if n > c2[k]:
return False
return True
>>> counterSubset(list1, list2)
True
>>> counterSubset(list1 + ['a'], list2)
False
>>> counterSubset(list1 + ['z'], list2)
False
If you already have counters (which might be a useful alternative to store your data anyway), you can also just write this as a single line:
>>> all(n <= c2[k] for k, n in c1.items())
True
Be aware of the following:
>>>listA = ['a', 'a', 'b','b','b','c']
>>>listB = ['b', 'a','a','b','c','d']
>>>all(item in listB for item in listA)
True
If you read the "all" line as you would in English, This is not wrong but can be misleading, as listA has a third 'b' but listB does not.
This also has the same issue:
def list1InList2(list1, list2):
for item in list1:
if item not in list2:
return False
return True
Just a note. The following does not work:
>>>tupA = (1,2,3,4,5,6,7,8,9)
>>>tupB = (1,2,3,4,5,6,6,7,8,9)
>>>set(tupA) < set(TupB)
False
If you convert the tuples to lists it still does not work. I don't know why strings work but ints do not.
Works but has same issue of not keeping count of element occurances:
>>>set(tupA).issubset(set(tupB))
True
Using sets is not a comprehensive solution for multi-occurrance element matching.
But here is a one-liner solution/adaption to shantanoo's answer without try/except:
all(True if sequenceA.count(item) <= sequenceB.count(item) else False for item in sequenceA)
A builtin function wrapping a list comprehension using a ternary conditional operator. Python is awesome! Note that the "<=" should not be "==".
With this solution sequence A and B can be type tuple and list and other "sequences" with "count" methods. The elements in both sequences can be most types. I would not use this with dicts as it is now, hence the use "sequence" instead of "iterable".
A solution using Counter and the builtin intersection method (note that - is proper multiset difference, not an element-wise subtraction):
from collections import Counter
def is_subset(l1, l2):
c1, c2 = Counter(l1), Counter(l2)
return not c1 - c2
Test:
>>> List1 = ['a','c','c']
>>> List2 = ['x','b','a','x','c','y','c']
>>> is_subset(List1, List2)
True
I can't use the subset function because I can have repeated elements in lists.
What this means is that you want to treat your lists as multisets rather than sets. The usual way to handle multisets in Python is with collections.Counter:
A Counter is a dict subclass for counting hashable objects. It is an unordered collection where elements are stored as dictionary keys and their counts are stored as dictionary values. Counts are allowed to be any integer value including zero or negative counts. The Counter class is similar to bags or multisets in other languages.
And, while you can implement subset for multisets (implemented with Counter) by looping and comparing counts, as in poke's answer, this is unnecessary—just as you can implement subset for sets (implemented with set or frozenset) by looping and testing in, but it's unnecessary.
The Counter type already implements all the set operators extended in the obvious way for multisets.<1 So you can just write subset in terms of those operators, and it will work for both set and Counter out of the box.
With (multi)set difference:2
def is_subset(c1, c2):
return not c1 - c2
Or with (multi)set intersection:
def is_subset(c1, c2):
def c1 & c2 == c1
1. You may be wondering why, if Counter implements the set operators, it doesn't just implement < and <= for proper subset and subset. Although I can't find the email thread, I'm pretty sure this was discussed, and the answer was that "the set operators" are defined as the specific set of operators defined in the initial version of collections.abc.Set (which has since been expanded, IIRC…), not all operators that set happens to include for convenience, in the exact same way that Counter doesn't have named methods like intersection that's friendly to other types than & just because set does.
2. This depends on the fact that collections in Python are expected to be falsey when empty and truthy otherwise. This is documented here for the builtin types, and the fact that bool tests fall back to len is explained here—but it's ultimately just a convention, so that "quasi-collections" like numpy arrays can violate it if they have a good reason. It holds for "real collections" like Counter, OrderedDict, etc. If you're really worried about that, you can write len(c1 - c2) == 0, but note that this is against the spirit of PEP 8.
This will return true is all the items in List1 are in List2
def list1InList2(list1, list2):
for item in list1:
if item not in list2:
return False
return True
def check_subset(list1, list2):
try:
[list2.remove(x) for x in list1]
return 'all elements in list1 are in list2'
except:
return 'some elements in list1 are not in list2'
This question already has answers here:
How to efficiently compare two unordered lists (not sets)?
(12 answers)
Closed 6 years ago.
Sorry for the simple question, but I'm having a hard time finding the answer.
When I compare 2 lists, I want to know if they are "equal" in that they have the same contents, but in different order.
Ex:
x = ['a', 'b']
y = ['b', 'a']
I want x == y to evaluate to True.
You can simply check whether the multisets with the elements of x and y are equal:
import collections
collections.Counter(x) == collections.Counter(y)
This requires the elements to be hashable; runtime will be in O(n), where n is the size of the lists.
If the elements are also unique, you can also convert to sets (same asymptotic runtime, may be a little bit faster in practice):
set(x) == set(y)
If the elements are not hashable, but sortable, another alternative (runtime in O(n log n)) is
sorted(x) == sorted(y)
If the elements are neither hashable nor sortable you can use the following helper function. Note that it will be quite slow (O(n²)) and should generally not be used outside of the esoteric case of unhashable and unsortable elements.
def equal_ignore_order(a, b):
""" Use only when elements are neither hashable nor sortable! """
unmatched = list(b)
for element in a:
try:
unmatched.remove(element)
except ValueError:
return False
return not unmatched
Determine if 2 lists have the same elements, regardless of order?
Inferring from your example:
x = ['a', 'b']
y = ['b', 'a']
that the elements of the lists won't be repeated (they are unique) as well as hashable (which strings and other certain immutable python objects are), the most direct and computationally efficient answer uses Python's builtin sets, (which are semantically like mathematical sets you may have learned about in school).
set(x) == set(y) # prefer this if elements are hashable
In the case that the elements are hashable, but non-unique, the collections.Counter also works semantically as a multiset, but it is far slower:
from collections import Counter
Counter(x) == Counter(y)
Prefer to use sorted:
sorted(x) == sorted(y)
if the elements are orderable. This would account for non-unique or non-hashable circumstances, but this could be much slower than using sets.
Empirical Experiment
An empirical experiment concludes that one should prefer set, then sorted. Only opt for Counter if you need other things like counts or further usage as a multiset.
First setup:
import timeit
import random
from collections import Counter
data = [str(random.randint(0, 100000)) for i in xrange(100)]
data2 = data[:] # copy the list into a new one
def sets_equal():
return set(data) == set(data2)
def counters_equal():
return Counter(data) == Counter(data2)
def sorted_lists_equal():
return sorted(data) == sorted(data2)
And testing:
>>> min(timeit.repeat(sets_equal))
13.976069927215576
>>> min(timeit.repeat(counters_equal))
73.17287588119507
>>> min(timeit.repeat(sorted_lists_equal))
36.177085876464844
So we see that comparing sets is the fastest solution, and comparing sorted lists is second fastest.
This seems to work, though possibly cumbersome for large lists.
>>> A = [0, 1]
>>> B = [1, 0]
>>> C = [0, 2]
>>> not sum([not i in A for i in B])
True
>>> not sum([not i in A for i in C])
False
>>>
However, if each list must contain all the elements of other then the above code is problematic.
>>> A = [0, 1, 2]
>>> not sum([not i in A for i in B])
True
The problem arises when len(A) != len(B) and, in this example, len(A) > len(B). To avoid this, you can add one more statement.
>>> not sum([not i in A for i in B]) if len(A) == len(B) else False
False
One more thing, I benchmarked my solution with timeit.repeat, under the same conditions used by Aaron Hall in his post. As suspected, the results are disappointing. My method is the last one. set(x) == set(y) it is.
>>> def foocomprehend(): return not sum([not i in data for i in data2])
>>> min(timeit.repeat('fooset()', 'from __main__ import fooset, foocount, foocomprehend'))
25.2893661496
>>> min(timeit.repeat('foosort()', 'from __main__ import fooset, foocount, foocomprehend'))
94.3974742993
>>> min(timeit.repeat('foocomprehend()', 'from __main__ import fooset, foocount, foocomprehend'))
187.224562545
As mentioned in comments above, the general case is a pain. It is fairly easy if all items are hashable or all items are sortable. However I have recently had to try solve the general case. Here is my solution. I realised after posting that this is a duplicate to a solution above that I missed on the first pass. Anyway, if you use slices rather than list.remove() you can compare immutable sequences.
def sequences_contain_same_items(a, b):
for item in a:
try:
i = b.index(item)
except ValueError:
return False
b = b[:i] + b[i+1:]
return not b
a = [1, 2, 3, 1, 2, 3]
b = [3, 2, 1, 3, 2, 1]
a & b should be considered equal, because they have exactly the same elements, only in different order.
The thing is, my actual lists will consist of objects (my class instances), not integers.
O(n): The Counter() method is best (if your objects are hashable):
def compare(s, t):
return Counter(s) == Counter(t)
O(n log n): The sorted() method is next best (if your objects are orderable):
def compare(s, t):
return sorted(s) == sorted(t)
O(n * n): If the objects are neither hashable, nor orderable, you can use equality:
def compare(s, t):
t = list(t) # make a mutable copy
try:
for elem in s:
t.remove(elem)
except ValueError:
return False
return not t
You can sort both:
sorted(a) == sorted(b)
A counting sort could also be more efficient (but it requires the object to be hashable).
>>> from collections import Counter
>>> a = [1, 2, 3, 1, 2, 3]
>>> b = [3, 2, 1, 3, 2, 1]
>>> print (Counter(a) == Counter(b))
True
If you know the items are always hashable, you can use a Counter() which is O(n)
If you know the items are always sortable, you can use sorted() which is O(n log n)
In the general case you can't rely on being able to sort, or has the elements, so you need a fallback like this, which is unfortunately O(n^2)
len(a)==len(b) and all(a.count(i)==b.count(i) for i in a)
If you have to do this in tests:
https://docs.python.org/3.5/library/unittest.html#unittest.TestCase.assertCountEqual
assertCountEqual(first, second, msg=None)
Test that sequence first contains the same elements as second, regardless of their order. When they don’t, an error message listing the differences between the sequences will be generated.
Duplicate elements are not ignored when comparing first and second. It verifies whether each element has the same count in both sequences. Equivalent to: assertEqual(Counter(list(first)), Counter(list(second))) but works with sequences of unhashable objects as well.
New in version 3.2.
or in 2.7:
https://docs.python.org/2.7/library/unittest.html#unittest.TestCase.assertItemsEqual
Outside of tests I would recommend the Counter method.
The best way to do this is by sorting the lists and comparing them. (Using Counter won't work with objects that aren't hashable.) This is straightforward for integers:
sorted(a) == sorted(b)
It gets a little trickier with arbitrary objects. If you care about object identity, i.e., whether the same objects are in both lists, you can use the id() function as the sort key.
sorted(a, key=id) == sorted(b, key==id)
(In Python 2.x you don't actually need the key= parameter, because you can compare any object to any object. The ordering is arbitrary but stable, so it works fine for this purpose; it doesn't matter what order the objects are in, only that the ordering is the same for both lists. In Python 3, though, comparing objects of different types is disallowed in many circumstances -- for example, you can't compare strings to integers -- so if you will have objects of various types, best to explicitly use the object's ID.)
If you want to compare the objects in the list by value, on the other hand, first you need to define what "value" means for the objects. Then you will need some way to provide that as a key (and for Python 3, as a consistent type). One potential way that would work for a lot of arbitrary objects is to sort by their repr(). Of course, this could waste a lot of extra time and memory building repr() strings for large lists and so on.
sorted(a, key=repr) == sorted(b, key==repr)
If the objects are all your own types, you can define __lt__() on them so that the object knows how to compare itself to others. Then you can just sort them and not worry about the key= parameter. Of course you could also define __hash__() and use Counter, which will be faster.
If the comparison is to be performed in a testing context, use assertCountEqual(a, b) (py>=3.2) and assertItemsEqual(a, b) (2.7<=py<3.2).
Works on sequences of unhashable objects too.
If the list contains items that are not hashable (such as a list of objects) you might be able to use the Counter Class and the id() function such as:
from collections import Counter
...
if Counter(map(id,a)) == Counter(map(id,b)):
print("Lists a and b contain the same objects")
Let a,b lists
def ass_equal(a,b):
try:
map(lambda x: a.pop(a.index(x)), b) # try to remove all the elements of b from a, on fail, throw exception
if len(a) == 0: # if a is empty, means that b has removed them all
return True
except:
return False # b failed to remove some items from a
No need to make them hashable or sort them.
I hope the below piece of code might work in your case :-
if ((len(a) == len(b)) and
(all(i in a for i in b))):
print 'True'
else:
print 'False'
This will ensure that all the elements in both the lists a & b are same, regardless of whether they are in same order or not.
For better understanding, refer to my answer in this question
You can write your own function to compare the lists.
Let's get two lists.
list_1=['John', 'Doe']
list_2=['Doe','Joe']
Firstly, we define an empty dictionary, count the list items and write in the dictionary.
def count_list(list_items):
empty_dict={}
for list_item in list_items:
list_item=list_item.strip()
if list_item not in empty_dict:
empty_dict[list_item]=1
else:
empty_dict[list_item]+=1
return empty_dict
After that, we'll compare both lists by using the following function.
def compare_list(list_1, list_2):
if count_list(list_1)==count_list(list_2):
return True
return False
compare_list(list_1,list_2)
from collections import defaultdict
def _list_eq(a: list, b: list) -> bool:
if len(a) != len(b):
return False
b_set = set(b)
a_map = defaultdict(lambda: 0)
b_map = defaultdict(lambda: 0)
for item1, item2 in zip(a, b):
if item1 not in b_set:
return False
a_map[item1] += 1
b_map[item2] += 1
return a_map == b_map
Sorting can be quite slow if the data is highly unordered (timsort is extra good when the items have some degree of ordering). Sorting both also requires fully iterating through both lists.
Rather than mutating a list, just allocate a set and do a left-->right membership check, keeping a count of how many of each item exist along the way:
If the two lists are not the same length you can short circuit and return False immediately.
If you hit any item in list a that isn't in list b you can return False
If you get through all items then you can compare the values of a_map and b_map to find out if they match.
This allows you to short-circuit in many cases long before you've iterated both lists.
plug in this:
def lists_equal(l1: list, l2: list) -> bool:
"""
import collections
compare = lambda x, y: collections.Counter(x) == collections.Counter(y)
ref:
- https://stackoverflow.com/questions/9623114/check-if-two-unordered-lists-are-equal
- https://stackoverflow.com/questions/7828867/how-to-efficiently-compare-two-unordered-lists-not-sets
"""
compare = lambda x, y: collections.Counter(x) == collections.Counter(y)
set_comp = set(l1) == set(l2) # removes duplicates, so returns true when not sometimes :(
multiset_comp = compare(l1, l2) # approximates multiset
return set_comp and multiset_comp #set_comp is gere in case the compare function doesn't work