lists in list comparison with another list [duplicate] - python

This question already has answers here:
Python: Determine if an unsorted list is contained in a 'list of lists', regardless of the order to the elements
(4 answers)
Closed 9 years ago.
I need to check if list with lists contains list with similar values to the specified list, values may be in different order but in case all values are same it should return true
a= ["1","2","3","4","5"]
b= ["2","3","6","4","7"]
e = (["1","3","2","4","5"],["2","3","6","4","7"])
CombinationFound = []
for i in e:
if i == a:
CombinationFound = True
break;
else:
CombinationFound = False
it should return true since ["1","2","3","4","5"] and ["1","3","2","4","5"] have same values

Python sets are a much better implementation for your particular problem.
Sets are mathematical objects that contain data, but have methods to determine unions, intersections, differences etc between two collections.
Using:
set(a) == set(b)
should give you the result you want. As long as by 'similar' you mean 'the same'.

Different way to compare without the for loop:
found = any(set(a)==set(l) for l in e)

Try to convert those lists into sets:
def isthesame(a,b):
return set(b) == set(a)
For example if you have:
a= ["1","2","3","4","5"]
b= ["2","3","6","4","7"]
This solution should work for lists that don't have duplicit items.

You need to convert the list you're trying to match against into something you can compare against without caring about the original order. If the number of appearances of an element of the list counts, use sorted lists. If it does not, use sets.
list_to_match = sorted(a)
combination_found = False
for i in e:
if sorted(i) == list_to_match:
combination_found = True
break
That version will distinguish lists with different numbers of repeated elements - that is, [0, 1, 1, 2] will not match [0, 1, 1, 1, 2]. If you don't care about that possibility, use set(a) and set(i) instead:
set_to_match = set(a)
combination_found = False
for i in e:
if set(i) == set_to_match:
combination_found = True
break
Or, for a more concise version, use the built-in any function with a generator expression:
list_to_match = sorted(a)
combination_found = any(sorted(i) == list_to_match for i in e)

def checker(list_of_lists, example):
for i in list_of_lists:
if tuple(sorted(i)) == tuple(sorted(example)):
return True
return False

Related

Python: How can 2 dictionaries with a list be compared neglecting the list items order? [duplicate]

a = [1, 2, 3, 1, 2, 3]
b = [3, 2, 1, 3, 2, 1]
a & b should be considered equal, because they have exactly the same elements, only in different order.
The thing is, my actual lists will consist of objects (my class instances), not integers.
O(n): The Counter() method is best (if your objects are hashable):
def compare(s, t):
return Counter(s) == Counter(t)
O(n log n): The sorted() method is next best (if your objects are orderable):
def compare(s, t):
return sorted(s) == sorted(t)
O(n * n): If the objects are neither hashable, nor orderable, you can use equality:
def compare(s, t):
t = list(t) # make a mutable copy
try:
for elem in s:
t.remove(elem)
except ValueError:
return False
return not t
You can sort both:
sorted(a) == sorted(b)
A counting sort could also be more efficient (but it requires the object to be hashable).
>>> from collections import Counter
>>> a = [1, 2, 3, 1, 2, 3]
>>> b = [3, 2, 1, 3, 2, 1]
>>> print (Counter(a) == Counter(b))
True
If you know the items are always hashable, you can use a Counter() which is O(n)
If you know the items are always sortable, you can use sorted() which is O(n log n)
In the general case you can't rely on being able to sort, or has the elements, so you need a fallback like this, which is unfortunately O(n^2)
len(a)==len(b) and all(a.count(i)==b.count(i) for i in a)
If you have to do this in tests:
https://docs.python.org/3.5/library/unittest.html#unittest.TestCase.assertCountEqual
assertCountEqual(first, second, msg=None)
Test that sequence first contains the same elements as second, regardless of their order. When they don’t, an error message listing the differences between the sequences will be generated.
Duplicate elements are not ignored when comparing first and second. It verifies whether each element has the same count in both sequences. Equivalent to: assertEqual(Counter(list(first)), Counter(list(second))) but works with sequences of unhashable objects as well.
New in version 3.2.
or in 2.7:
https://docs.python.org/2.7/library/unittest.html#unittest.TestCase.assertItemsEqual
Outside of tests I would recommend the Counter method.
The best way to do this is by sorting the lists and comparing them. (Using Counter won't work with objects that aren't hashable.) This is straightforward for integers:
sorted(a) == sorted(b)
It gets a little trickier with arbitrary objects. If you care about object identity, i.e., whether the same objects are in both lists, you can use the id() function as the sort key.
sorted(a, key=id) == sorted(b, key==id)
(In Python 2.x you don't actually need the key= parameter, because you can compare any object to any object. The ordering is arbitrary but stable, so it works fine for this purpose; it doesn't matter what order the objects are in, only that the ordering is the same for both lists. In Python 3, though, comparing objects of different types is disallowed in many circumstances -- for example, you can't compare strings to integers -- so if you will have objects of various types, best to explicitly use the object's ID.)
If you want to compare the objects in the list by value, on the other hand, first you need to define what "value" means for the objects. Then you will need some way to provide that as a key (and for Python 3, as a consistent type). One potential way that would work for a lot of arbitrary objects is to sort by their repr(). Of course, this could waste a lot of extra time and memory building repr() strings for large lists and so on.
sorted(a, key=repr) == sorted(b, key==repr)
If the objects are all your own types, you can define __lt__() on them so that the object knows how to compare itself to others. Then you can just sort them and not worry about the key= parameter. Of course you could also define __hash__() and use Counter, which will be faster.
If the comparison is to be performed in a testing context, use assertCountEqual(a, b) (py>=3.2) and assertItemsEqual(a, b) (2.7<=py<3.2).
Works on sequences of unhashable objects too.
If the list contains items that are not hashable (such as a list of objects) you might be able to use the Counter Class and the id() function such as:
from collections import Counter
...
if Counter(map(id,a)) == Counter(map(id,b)):
print("Lists a and b contain the same objects")
Let a,b lists
def ass_equal(a,b):
try:
map(lambda x: a.pop(a.index(x)), b) # try to remove all the elements of b from a, on fail, throw exception
if len(a) == 0: # if a is empty, means that b has removed them all
return True
except:
return False # b failed to remove some items from a
No need to make them hashable or sort them.
I hope the below piece of code might work in your case :-
if ((len(a) == len(b)) and
(all(i in a for i in b))):
print 'True'
else:
print 'False'
This will ensure that all the elements in both the lists a & b are same, regardless of whether they are in same order or not.
For better understanding, refer to my answer in this question
You can write your own function to compare the lists.
Let's get two lists.
list_1=['John', 'Doe']
list_2=['Doe','Joe']
Firstly, we define an empty dictionary, count the list items and write in the dictionary.
def count_list(list_items):
empty_dict={}
for list_item in list_items:
list_item=list_item.strip()
if list_item not in empty_dict:
empty_dict[list_item]=1
else:
empty_dict[list_item]+=1
return empty_dict
After that, we'll compare both lists by using the following function.
def compare_list(list_1, list_2):
if count_list(list_1)==count_list(list_2):
return True
return False
compare_list(list_1,list_2)
from collections import defaultdict
def _list_eq(a: list, b: list) -> bool:
if len(a) != len(b):
return False
b_set = set(b)
a_map = defaultdict(lambda: 0)
b_map = defaultdict(lambda: 0)
for item1, item2 in zip(a, b):
if item1 not in b_set:
return False
a_map[item1] += 1
b_map[item2] += 1
return a_map == b_map
Sorting can be quite slow if the data is highly unordered (timsort is extra good when the items have some degree of ordering). Sorting both also requires fully iterating through both lists.
Rather than mutating a list, just allocate a set and do a left-->right membership check, keeping a count of how many of each item exist along the way:
If the two lists are not the same length you can short circuit and return False immediately.
If you hit any item in list a that isn't in list b you can return False
If you get through all items then you can compare the values of a_map and b_map to find out if they match.
This allows you to short-circuit in many cases long before you've iterated both lists.
plug in this:
def lists_equal(l1: list, l2: list) -> bool:
"""
import collections
compare = lambda x, y: collections.Counter(x) == collections.Counter(y)
ref:
- https://stackoverflow.com/questions/9623114/check-if-two-unordered-lists-are-equal
- https://stackoverflow.com/questions/7828867/how-to-efficiently-compare-two-unordered-lists-not-sets
"""
compare = lambda x, y: collections.Counter(x) == collections.Counter(y)
set_comp = set(l1) == set(l2) # removes duplicates, so returns true when not sometimes :(
multiset_comp = compare(l1, l2) # approximates multiset
return set_comp and multiset_comp #set_comp is gere in case the compare function doesn't work

Python list comprehensions that return multiple lists [duplicate]

This question already has an answer here:
What is the inverse function of zip in python? [duplicate]
(1 answer)
Closed 5 years ago.
I don't know whether it is anyway possible in Python and that is the reason why I ask it here.
I have a Python function that returns a tuple:
def my_func(i):
return i * 2, 'a' * i
This is just a dumb function that given a number k, it returns k * 2 as is and another string is the letter 'a' concatenated k times.
I want now to form two lists, calling the function with i = 0...9, I want to create one list with all the first values and another one with the rest of them.
What I do with my current knowledge is:
Option 1: Run the same list comprehension two times, that is not very efficient:
first_vals = [my_func(i)[0] for i in range(10)]
second_vals = [my_func(i)[1] for i in range(10)]
Option 2: Avoiding list comprehensions:
first_vals = []
second_vals = []
for i in range(10):
f, s = my_func(i)
first_vals.append(f)
second_vals.append(s)
Option 3: Use list comprehension to get a list of tuples and then two other list comprehension to copy the values. It is better than Option 1, as here my_func() is only called once for each i:
ret = [my_func(i) for i in range(10)]
first_vals = [r[0] for r in ret]
second_vals = [r[1] for r in ret]
Is it possible to somehow use the list comprehension feature in order to return two lists in one command and one iteration, assigning each returned parameter into a different list?
Option 4: Use the inverse of zip:
first_vals, second_vals = zip(*[my_func(i) for i in range(10)])
As Mark Dickinson pointed out in the comment this will lead to tuples for first_vals and second_vals. If you need them to be of type list you can, for example, apply a map:
first_vals, second_vals = map(list, zip(*[my_func(i) for i in range(10)]))

Optimized code to check the element of list unique or not

python3 program that takes input a list and output if it is unique or not. The following is an example:
list_a = [1,2,3,4,5] #unique
list_b = [1,2,2,3,4] #not unique
I have wrote a python3 script for this problem:
for i in range(len(list_a)):
j = i+1
for j in range(len(list_a)):
if list_a[i] == list_a[j]:
print ("not unique")
else:
print ("unique")
Is this the only way to check it. I bet it isn't! I want some optimized code that is equivalent to above or simply that ouputs "unique" or "not unique" for a given list.
Thank you in advance.
The easiest way to do this is compare length of set of given list with length of list:
if len(l) != len(set(l)):
# not unique
You can use all() and sets, this will short-circuit as soon as a repeated item is found.
>>> def solve(lis):
... seen = set()
... return all(item not in seen and not seen.add(item) for item in lis)
...
>>> solve(range(5))
True
>>> solve([1,2,2,3,4])
False
Throw the list into a set:
set_a = set(list_a)
if len(set_a) == len(list_a):
print("unique")
else:
print("not unique")
You could use AVL Trees, add each element one by one in the tree while the inserted element is not yet in the tree.
In the case of big lists, it will be very fast in comparison of your current code.
If you want to easily find duplicate elements you can use collections.Counter:
import collections
a = [1, 2, 2, 3]
b = collections.Counter(a)
duplicates = [i for i in b if b[i] > 1]
Variable b is an object which acts a bit like a dictionary with keys being values from a and values being numbers indicating how many times this value appeared in the original list.
print(b[2])
would give 2.
Variable duplicates has all duplicate elements and you can use it like this:
if duplicates:
print("not unique")
else:
print("unique")
This is longer than generating a set, but you have much more information you can use.

Determine if 2 lists have the same elements, regardless of order? [duplicate]

This question already has answers here:
How to efficiently compare two unordered lists (not sets)?
(12 answers)
Closed 6 years ago.
Sorry for the simple question, but I'm having a hard time finding the answer.
When I compare 2 lists, I want to know if they are "equal" in that they have the same contents, but in different order.
Ex:
x = ['a', 'b']
y = ['b', 'a']
I want x == y to evaluate to True.
You can simply check whether the multisets with the elements of x and y are equal:
import collections
collections.Counter(x) == collections.Counter(y)
This requires the elements to be hashable; runtime will be in O(n), where n is the size of the lists.
If the elements are also unique, you can also convert to sets (same asymptotic runtime, may be a little bit faster in practice):
set(x) == set(y)
If the elements are not hashable, but sortable, another alternative (runtime in O(n log n)) is
sorted(x) == sorted(y)
If the elements are neither hashable nor sortable you can use the following helper function. Note that it will be quite slow (O(n²)) and should generally not be used outside of the esoteric case of unhashable and unsortable elements.
def equal_ignore_order(a, b):
""" Use only when elements are neither hashable nor sortable! """
unmatched = list(b)
for element in a:
try:
unmatched.remove(element)
except ValueError:
return False
return not unmatched
Determine if 2 lists have the same elements, regardless of order?
Inferring from your example:
x = ['a', 'b']
y = ['b', 'a']
that the elements of the lists won't be repeated (they are unique) as well as hashable (which strings and other certain immutable python objects are), the most direct and computationally efficient answer uses Python's builtin sets, (which are semantically like mathematical sets you may have learned about in school).
set(x) == set(y) # prefer this if elements are hashable
In the case that the elements are hashable, but non-unique, the collections.Counter also works semantically as a multiset, but it is far slower:
from collections import Counter
Counter(x) == Counter(y)
Prefer to use sorted:
sorted(x) == sorted(y)
if the elements are orderable. This would account for non-unique or non-hashable circumstances, but this could be much slower than using sets.
Empirical Experiment
An empirical experiment concludes that one should prefer set, then sorted. Only opt for Counter if you need other things like counts or further usage as a multiset.
First setup:
import timeit
import random
from collections import Counter
data = [str(random.randint(0, 100000)) for i in xrange(100)]
data2 = data[:] # copy the list into a new one
def sets_equal():
return set(data) == set(data2)
def counters_equal():
return Counter(data) == Counter(data2)
def sorted_lists_equal():
return sorted(data) == sorted(data2)
And testing:
>>> min(timeit.repeat(sets_equal))
13.976069927215576
>>> min(timeit.repeat(counters_equal))
73.17287588119507
>>> min(timeit.repeat(sorted_lists_equal))
36.177085876464844
So we see that comparing sets is the fastest solution, and comparing sorted lists is second fastest.
This seems to work, though possibly cumbersome for large lists.
>>> A = [0, 1]
>>> B = [1, 0]
>>> C = [0, 2]
>>> not sum([not i in A for i in B])
True
>>> not sum([not i in A for i in C])
False
>>>
However, if each list must contain all the elements of other then the above code is problematic.
>>> A = [0, 1, 2]
>>> not sum([not i in A for i in B])
True
The problem arises when len(A) != len(B) and, in this example, len(A) > len(B). To avoid this, you can add one more statement.
>>> not sum([not i in A for i in B]) if len(A) == len(B) else False
False
One more thing, I benchmarked my solution with timeit.repeat, under the same conditions used by Aaron Hall in his post. As suspected, the results are disappointing. My method is the last one. set(x) == set(y) it is.
>>> def foocomprehend(): return not sum([not i in data for i in data2])
>>> min(timeit.repeat('fooset()', 'from __main__ import fooset, foocount, foocomprehend'))
25.2893661496
>>> min(timeit.repeat('foosort()', 'from __main__ import fooset, foocount, foocomprehend'))
94.3974742993
>>> min(timeit.repeat('foocomprehend()', 'from __main__ import fooset, foocount, foocomprehend'))
187.224562545
As mentioned in comments above, the general case is a pain. It is fairly easy if all items are hashable or all items are sortable. However I have recently had to try solve the general case. Here is my solution. I realised after posting that this is a duplicate to a solution above that I missed on the first pass. Anyway, if you use slices rather than list.remove() you can compare immutable sequences.
def sequences_contain_same_items(a, b):
for item in a:
try:
i = b.index(item)
except ValueError:
return False
b = b[:i] + b[i+1:]
return not b

How to efficiently compare two unordered lists (not sets)?

a = [1, 2, 3, 1, 2, 3]
b = [3, 2, 1, 3, 2, 1]
a & b should be considered equal, because they have exactly the same elements, only in different order.
The thing is, my actual lists will consist of objects (my class instances), not integers.
O(n): The Counter() method is best (if your objects are hashable):
def compare(s, t):
return Counter(s) == Counter(t)
O(n log n): The sorted() method is next best (if your objects are orderable):
def compare(s, t):
return sorted(s) == sorted(t)
O(n * n): If the objects are neither hashable, nor orderable, you can use equality:
def compare(s, t):
t = list(t) # make a mutable copy
try:
for elem in s:
t.remove(elem)
except ValueError:
return False
return not t
You can sort both:
sorted(a) == sorted(b)
A counting sort could also be more efficient (but it requires the object to be hashable).
>>> from collections import Counter
>>> a = [1, 2, 3, 1, 2, 3]
>>> b = [3, 2, 1, 3, 2, 1]
>>> print (Counter(a) == Counter(b))
True
If you know the items are always hashable, you can use a Counter() which is O(n)
If you know the items are always sortable, you can use sorted() which is O(n log n)
In the general case you can't rely on being able to sort, or has the elements, so you need a fallback like this, which is unfortunately O(n^2)
len(a)==len(b) and all(a.count(i)==b.count(i) for i in a)
If you have to do this in tests:
https://docs.python.org/3.5/library/unittest.html#unittest.TestCase.assertCountEqual
assertCountEqual(first, second, msg=None)
Test that sequence first contains the same elements as second, regardless of their order. When they don’t, an error message listing the differences between the sequences will be generated.
Duplicate elements are not ignored when comparing first and second. It verifies whether each element has the same count in both sequences. Equivalent to: assertEqual(Counter(list(first)), Counter(list(second))) but works with sequences of unhashable objects as well.
New in version 3.2.
or in 2.7:
https://docs.python.org/2.7/library/unittest.html#unittest.TestCase.assertItemsEqual
Outside of tests I would recommend the Counter method.
The best way to do this is by sorting the lists and comparing them. (Using Counter won't work with objects that aren't hashable.) This is straightforward for integers:
sorted(a) == sorted(b)
It gets a little trickier with arbitrary objects. If you care about object identity, i.e., whether the same objects are in both lists, you can use the id() function as the sort key.
sorted(a, key=id) == sorted(b, key==id)
(In Python 2.x you don't actually need the key= parameter, because you can compare any object to any object. The ordering is arbitrary but stable, so it works fine for this purpose; it doesn't matter what order the objects are in, only that the ordering is the same for both lists. In Python 3, though, comparing objects of different types is disallowed in many circumstances -- for example, you can't compare strings to integers -- so if you will have objects of various types, best to explicitly use the object's ID.)
If you want to compare the objects in the list by value, on the other hand, first you need to define what "value" means for the objects. Then you will need some way to provide that as a key (and for Python 3, as a consistent type). One potential way that would work for a lot of arbitrary objects is to sort by their repr(). Of course, this could waste a lot of extra time and memory building repr() strings for large lists and so on.
sorted(a, key=repr) == sorted(b, key==repr)
If the objects are all your own types, you can define __lt__() on them so that the object knows how to compare itself to others. Then you can just sort them and not worry about the key= parameter. Of course you could also define __hash__() and use Counter, which will be faster.
If the comparison is to be performed in a testing context, use assertCountEqual(a, b) (py>=3.2) and assertItemsEqual(a, b) (2.7<=py<3.2).
Works on sequences of unhashable objects too.
If the list contains items that are not hashable (such as a list of objects) you might be able to use the Counter Class and the id() function such as:
from collections import Counter
...
if Counter(map(id,a)) == Counter(map(id,b)):
print("Lists a and b contain the same objects")
Let a,b lists
def ass_equal(a,b):
try:
map(lambda x: a.pop(a.index(x)), b) # try to remove all the elements of b from a, on fail, throw exception
if len(a) == 0: # if a is empty, means that b has removed them all
return True
except:
return False # b failed to remove some items from a
No need to make them hashable or sort them.
I hope the below piece of code might work in your case :-
if ((len(a) == len(b)) and
(all(i in a for i in b))):
print 'True'
else:
print 'False'
This will ensure that all the elements in both the lists a & b are same, regardless of whether they are in same order or not.
For better understanding, refer to my answer in this question
You can write your own function to compare the lists.
Let's get two lists.
list_1=['John', 'Doe']
list_2=['Doe','Joe']
Firstly, we define an empty dictionary, count the list items and write in the dictionary.
def count_list(list_items):
empty_dict={}
for list_item in list_items:
list_item=list_item.strip()
if list_item not in empty_dict:
empty_dict[list_item]=1
else:
empty_dict[list_item]+=1
return empty_dict
After that, we'll compare both lists by using the following function.
def compare_list(list_1, list_2):
if count_list(list_1)==count_list(list_2):
return True
return False
compare_list(list_1,list_2)
from collections import defaultdict
def _list_eq(a: list, b: list) -> bool:
if len(a) != len(b):
return False
b_set = set(b)
a_map = defaultdict(lambda: 0)
b_map = defaultdict(lambda: 0)
for item1, item2 in zip(a, b):
if item1 not in b_set:
return False
a_map[item1] += 1
b_map[item2] += 1
return a_map == b_map
Sorting can be quite slow if the data is highly unordered (timsort is extra good when the items have some degree of ordering). Sorting both also requires fully iterating through both lists.
Rather than mutating a list, just allocate a set and do a left-->right membership check, keeping a count of how many of each item exist along the way:
If the two lists are not the same length you can short circuit and return False immediately.
If you hit any item in list a that isn't in list b you can return False
If you get through all items then you can compare the values of a_map and b_map to find out if they match.
This allows you to short-circuit in many cases long before you've iterated both lists.
plug in this:
def lists_equal(l1: list, l2: list) -> bool:
"""
import collections
compare = lambda x, y: collections.Counter(x) == collections.Counter(y)
ref:
- https://stackoverflow.com/questions/9623114/check-if-two-unordered-lists-are-equal
- https://stackoverflow.com/questions/7828867/how-to-efficiently-compare-two-unordered-lists-not-sets
"""
compare = lambda x, y: collections.Counter(x) == collections.Counter(y)
set_comp = set(l1) == set(l2) # removes duplicates, so returns true when not sometimes :(
multiset_comp = compare(l1, l2) # approximates multiset
return set_comp and multiset_comp #set_comp is gere in case the compare function doesn't work

Categories

Resources