Python list difference


I am trying to find all the elements that are in list A and not in list B.
I thought something like newList = list(set(a) & !set(b)) or newList = list(set(a) & (not set(b))) would work, but it doesn't.
Is there a better way to achieve this than the following?
newList = []
for item in a:
    if item not in b:
        newList.append(item)
Also important: it needs to work in Python 2.6.

You're looking for the set difference:
newList = list(set(a).difference(b))
Alternatively, use the minus operator:
list(set(a) - set(b))

Did you try
list(set(a) - set(b))
Here is a list of all Python set operations.
But this unnecessarily creates a new set for b. As @phihag mentions, the difference method avoids this.
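To illustrate the point about difference: the method accepts any iterable, so b does not have to be converted to a set first, whereas the - operator needs a set on both sides. A minimal sketch with made-up sample lists:

```python
a = [1, 2, 3, 4, 5]
b = [4, 5, 6, 7, 8]

# The operator form requires both operands to be sets.
with_operator = set(a) - set(b)

# The method form accepts any iterable: a plain list or even a generator.
with_method = set(a).difference(b)
with_generator = set(a).difference(x for x in b)

print(sorted(with_operator))  # [1, 2, 3]
print(sorted(with_method))    # [1, 2, 3]
```

Either way the result is a set, so the original order of a is not kept.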

If you care about maintaining order:
def list_difference(a, b):
    # returns new list of items in a that are not in b
    b = set(b)
    return [x for x in a if x not in b]
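A quick comparison of the two approaches, as a sketch with made-up input (it assumes the items are hashable, which both approaches need). The order-preserving version also keeps duplicates from a, while the pure set version collapses them:

```python
def list_difference(a, b):
    """Return the items of a that are not in b, keeping a's order."""
    exclude = set(b)  # set membership tests are O(1) on average
    return [x for x in a if x not in exclude]

a = [5, 3, 3, 1, 4]
b = [4, 9]

ordered = list_difference(a, b)
unordered = list(set(a) - set(b))

print(ordered)            # [5, 3, 3, 1]
print(sorted(unordered))  # [1, 3, 5]
```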

>>> list1 = [1,2,3,4,5]
>>> list2 = [4,5,6,7,8]
>>> print list(set(list1)-set(list2))
[1, 2, 3]

Related

How to XOR two lists in Python? [duplicate]

This question already has answers here:
Comparing two lists and only printing the differences? (XORing two lists)
(6 answers)
Closed 2 years ago.
I've got two lists, for example:
a = ['hello','world']
b = ['hello','world','im','steve']
If I want to create a third list that only contains elements NOT in both:
c = ['im','steve']
How do I do this if the order of the elements IS important? I know I can use sets but they keep throwing out the order of my lists. I could use ' '.join(list) to convert them to strings but not sure how to do this operation in that format either.

You can concatenate the lists and use list comprehension:
a = ['hello','world']
b = ['hello','world','im','steve']
final_vals = [i for i in a+b if i not in a or i not in b]
Output:
['im', 'steve']

Option 1: set method (recommended)
Sets have a symmetric_difference method that returns the elements found in exactly one of a or b. Order can be preserved by filtering the concatenated list a + b with a list comprehension.
comp = set(a).symmetric_difference(b)
[x for x in a + b if x in comp]
# ['im', 'steve']
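One thing to watch with the comprehension over a + b: if an item that belongs in the result occurs more than once in the input, it occurs more than once in the output. If that is not wanted, something like the following sketch could be used (ordered_xor is a hypothetical helper name, not from any library):

```python
def ordered_xor(a, b):
    """Items found in exactly one of a or b, in first-seen order, no repeats."""
    only_one = set(a).symmetric_difference(b)
    seen = set()
    result = []
    for x in a + b:
        if x in only_one and x not in seen:
            seen.add(x)
            result.append(x)
    return result

a = ['hello', 'world', 'foo', 'foo']
b = ['hello', 'world', 'im', 'steve']
print(ordered_xor(a, b))  # ['foo', 'im', 'steve']
```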
Option 2: pathlib method
For reference, another way to diff two lists is the pathlib.Path.relative_to method:
import pathlib
p = pathlib.Path(*b)
r = p.relative_to(*a)
list(r.parts)
# ['im', 'steve']
Note: b is the longer list and must start with the elements of a, since relative_to raises ValueError otherwise. This option is potentially less efficient than a simple list comprehension.

Concatenate the two lists, then remove every occurrence of each element in their intersection. Order is preserved.
c = a + b
for v in set(a).intersection(set(b)):
    while v in c:
        c.remove(v)

a = ['hello','world']
b = ['hello','world','im','steve']
a = set(a)
b = set(b)
print(a.symmetric_difference(b))
This code prints the elements that are in only one of the lists.
See:
https://learnpython.org/en/Sets

You could also just create a function that keeps the elements of l1 that don't exist in l2, and call it twice with the arguments flipped:
a = ['hello','world', 'foo']
b = ['hello','world','im','steve']
def difference(l1, l2):
    return list(filter(lambda x: x not in l2, l1))
print(difference(a, b) + difference(b, a))
# ['foo', 'im', 'steve']
If you don't wish to use filter(), a simple list comprehension like this also works:
[item for item in l1 if item not in l2]

The question is not very clear, and you're probably fine with @Ajax1234's answer, but here's another take on it.
If you want to compare positions (roughly what a bit-wise XOR would do), you can take the shortest list and walk it position by position against the longest list, checking whether the item at each position matches. Then add the remainder (the "unwalked" part of the longest list). Something like the following:
a = ['hello', 'world']
b = ['hello', 'world', 'im', 'steve']
min_list = a if len(a) < len(b) else b
max_list = b if len(b) > len(a) else a
results = []
for i, item in enumerate(min_list):
    # Iterate through the shortest list to avoid IndexError(s)
    if item != max_list[i]:
        results.append(item)
        results.append(max_list[i])
# Add the part of the longest list that was never walked
# (using len(min_list) also works when the shortest list is empty)
results.extend(max_list[len(min_list):])
print(results)
# Prints: ['im', 'steve']
However, you then have to decide what to do when the items at the same position don't match. In the code above, I just added both entries to the results list, which means the following inputs:
a = ['hello', 'foo']
b = ['hello', 'world', 'im', 'steve']
would output:
>>> ['foo', 'world', 'im', 'steve']
(notice both foo from list a and world from list b have been added)
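The same position-by-position idea can also be written with zip_longest, which pads the shorter list so no length bookkeeping is needed. This is only a sketch: positional_diff and the _MISSING sentinel are made-up names, and unlike the code above it always emits the item from a before the item from b.

```python
try:
    from itertools import zip_longest                    # Python 3
except ImportError:
    from itertools import izip_longest as zip_longest    # Python 2.6+

_MISSING = object()  # sentinel marking "no item at this position"

def positional_diff(a, b):
    """Collect items that differ position by position; a's item comes first."""
    results = []
    for x, y in zip_longest(a, b, fillvalue=_MISSING):
        if x == y:
            continue
        if x is not _MISSING:
            results.append(x)
        if y is not _MISSING:
            results.append(y)
    return results

print(positional_diff(['hello', 'world'], ['hello', 'world', 'im', 'steve']))
# ['im', 'steve']
print(positional_diff(['hello', 'foo'], ['hello', 'world', 'im', 'steve']))
# ['foo', 'world', 'im', 'steve']
```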

Using standard for loop to check for items not in one or the other list (may be more understandable than list comprehension):
a = ['hello','world', 'foo']
b = ['hello','world','im','steve']
c = a+b
ans = []
for i in c:
    if i not in a or i not in b:
        ans.append(i)
print(ans)
Output:
['foo', 'im', 'steve']

I recommend using the ^ operator with sets, as in set(a) ^ set(b). Example:
>>> a = ['hello','world']
>>> b = ['hello','world','im','steve']
>>> set(a) ^ set(b)
{'steve', 'im'}
>>> sorted(set(a) ^ set(b),key=max([a,b],key=len).index)
['im', 'steve']
>>>
https://docs.python.org/2/library/stdtypes.html#frozenset.symmetric_difference
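A caveat on the sorted(...) line: the key is the index method of the longer list, so it raises ValueError as soon as the result contains an item that only occurs in the shorter list. One possible workaround, sketched here with a made-up 'foo' entry, is to sort by first position in a + b instead:

```python
a = ['hello', 'world', 'foo']
b = ['hello', 'world', 'im', 'steve']

# Record the first position of every item in the concatenated list.
position = {}
for i, item in enumerate(a + b):
    position.setdefault(item, i)

result = sorted(set(a) ^ set(b), key=position.__getitem__)
print(result)  # ['foo', 'im', 'steve']
```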

Explore all object of a list and delete some of them

l is a list that I want to walk through in order to suppress some items. The function do.i.want.to.suppress.i returns TRUE or FALSE to tell me whether I want to suppress a given item. The details of this function are not important.
I tried this:
l = [1,4,2,3,5,3,5,2]
for i in l:
    if do.i.want.to.suppress.i(i):
        del i
print l
but l does not change! So I tried
l = [1,4,2,3,5,3,5,2]
for position,i in enumerate(l):
    if do.i.want.to.suppress.i(i):
        del l[position]
But then the problem is that the position no longer matches the object i, because l gets modified during the loop.
I could do something like this:
l = [1,4,2,3,5,3,5,2]
for position,i in enumerate(l):
    if do.i.want.to.suppress.i(i):
        l[position] = 'bulls'
l = [x for x in l if x!='bulls']
But I guess there should be a smarter solution. Do you have one?

l = [item for item in my_list if not do_I_suppress(item)]
List comprehensions! learn them! love them! live them!

The list comprehension approach is the most pythonic way, but if you really need to modify the list itself then I found this to be the best approach, nicer than the while loop approach:
for position in xrange(len(l) - 1, -1, -1):
    i = l[position]
    if do.i.want.to.suppress.i(i):
        del l[position]
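Another way to modify the list itself, if that is really the requirement, is to combine the list comprehension with slice assignment. A small sketch; should_suppress is a hypothetical stand-in for the asker's predicate:

```python
def should_suppress(item):
    # Hypothetical stand-in for the question's predicate: drop odd numbers.
    return item % 2 == 1

l = [1, 4, 2, 3, 5, 3, 5, 2]
alias = l  # a second reference to the same list object

# Slice assignment replaces the contents without rebinding the name,
# so every reference to the list sees the filtered result.
l[:] = [item for item in l if not should_suppress(item)]

print(l)           # [4, 2, 2]
print(alias is l)  # True
```

This still builds a temporary list, so it is in-place only in the sense that the original list object is kept.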

This is a good place to use a while loop:
i = 0
while i < len(l):
    if do.i.want.to.suppress.i(l[i]):
        del l[i]
    else:
        i = i + 1

Besides a list comprehension (which returns a list, creating the full list in memory):
filtered_list = [itm for itm in lst if i_want_to_keep(itm)]
you can use filter() (same result as the list comprehension):
filtered_list = filter(i_want_to_keep, lst)
or itertools.ifilter() (which returns an iterator and avoids creating the whole list in memory, especially useful for iterating):
import itertools
filtered_list = itertools.ifilter(i_want_to_keep, lst)
for itm in filtered_list:
    do_whatever(itm)

filter will also work:
answer = filter(lambda x: not do_I_suppress(x), lis)
Note that in Python 3.x, filter returns an iterator, so you will need to wrap it in list():
answer = list(filter(lambda x: not do_I_suppress(x), lis))

How to copy data in Python

After entering a command I am given data, which I then transform into a list. How do I save a copy of ALL the data in that list [A], so that when I enter another command and get a second list [B], I can compare the two? Items that appear in both lists should cancel out, so that only what differs between [A] and [B] is output. For example:
List [A]
1
2
3
List [B]
1
2
3
4
Using Python, I now want to compare the two lists to each other, and then output the differences.
Output = 4
Hopefully this makes sense!

You can use set operations.
a = [1,2,3]
b = [1,2,3,4]
print set(b) - set(a)
To output the data as a list, use the following print statement:
print list(set(b) - set(a))

>>> b=[1,2,3,4]
>>> a=[1,2,3]
>>> [x for x in b if x not in a]
[4]

for element in b:
    if element in a:
        a.remove(element)
This modifies a in place, so the result is a list, not a set, and duplicates are taken into account. That way [1,2,1] - [1,2] gives [1], not [].
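If the input list should not be modified, the same duplicate-aware subtraction could be sketched as a function that returns a new list. Note that collections.Counter only exists from Python 2.7 onward, and multiset_difference is a made-up name:

```python
from collections import Counter  # available from Python 2.7 onward

def multiset_difference(a, b):
    """Items of a left over after cancelling one occurrence per item in b."""
    to_cancel = Counter(b)
    result = []
    for x in a:
        if to_cancel[x] > 0:
            to_cancel[x] -= 1  # this occurrence is cancelled by one in b
        else:
            result.append(x)
    return result

print(multiset_difference([1, 2, 1], [1, 2]))        # [1]
print(multiset_difference([1, 2, 3, 4], [1, 2, 3]))  # [4]
```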

Try itertools.izip_longest
import itertools
a = [1,2,3]
b = [1,2,3,4]
[y for x, y in itertools.izip_longest(a, b) if x != y]
# [4]
You could easily modify this further to return a tuple for each difference, where the first item in the tuple is the position in b and the second item is the value.
[(i, pair[1]) for i, pair in enumerate(itertools.izip_longest(a, b)) if pair[0] != pair[1]]
# [(3, 4)]

For entering the data use a loop:
def enterList():
    result = []
    while True:
        value = raw_input()
        if value:
            result.append(value)
        else:
            return result
A = enterList()
B = enterList()
For comparing you can use zip to build pairs and compare each of them:
for a, b in zip(A, B):
    if a != b:
        print a, "!=", b
This will truncate the comparison at the length of the shorter list; use the solution in another answer given here using itertools.izip_longest() to handle that.

python - Common lists among lists in a list

I need to be able to find the first common list (which is a list of coordinates in this case) between a variable amount of lists.
i.e. this list
>>> [[[1,2],[3,4],[6,7]],[[3,4],[5,9],[8,3],[4,2]],[[3,4],[9,9]]]
should return
>>> [3,4]
If easier, I can work with a list of all common lists(coordinates) between the lists that contain the coordinates.
I can't use sets or dictionaries because lists are not hashable(i think?).

Correct, list objects are not hashable because they are mutable. tuple objects are hashable (provided that all their elements are hashable). Since your innermost lists are all just integers, that provides a wonderful opportunity to work around the non-hashableness of lists:
>>> lists = [[[1,2],[3,4],[6,7]],[[3,4],[5,9],[8,3],[4,2]],[[3,4],[9,9]]]
>>> sets = [set(tuple(x) for x in y) for y in lists]
>>> set.intersection(*sets)
set([(3, 4)])
Here I give you a set which contains tuples of the coordinates which are present in all the sublists. To get a list of lists like you started with:
[list(x) for x in set.intersection(*sets)]
does the trick.
To address the concern raised by @wim: if you really want a reference to the first element in the intersection (where first is defined by being first in lists[0]), the easiest way is probably like this:
#... Stuff as before
intersection = set.intersection(*sets)
reference_to_first = next( (x for x in lists[0] if tuple(x) in intersection), None )
This will return None if the intersection is empty.
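Put together as one function, the idea could look like the sketch below. It assumes every coordinate is a flat list of hashable values, and first_common_coordinate is a made-up name. It checks candidates from the first group in order, so it can stop at the first hit without building the full intersection:

```python
def first_common_coordinate(lists):
    """First sublist of lists[0] that appears in every other group, or None."""
    if not lists:
        return None
    # Lists are unhashable, so compare tuple copies stored in sets.
    others = [set(tuple(x) for x in group) for group in lists[1:]]
    for candidate in lists[0]:
        key = tuple(candidate)
        if all(key in group for group in others):
            return candidate
    return None

lists = [[[1, 2], [3, 4], [6, 7]],
         [[3, 4], [5, 9], [8, 3], [4, 2]],
         [[3, 4], [9, 9]]]
print(first_common_coordinate(lists))  # [3, 4]
```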

If you are looking for the first child list that is common amongst all parent lists, the following will work.
def first_common(lst):
    first = lst[0]
    rest = lst[1:]
    for x in first:
        if all(x in r for r in rest):
            return x

Solution with recursive function. :)
This gets the first duplicated element.
def get_duplicated_element(array):
    global result, checked_elements
    checked_elements = []
    result = -1
    def array_recursive_check(array):
        global result, checked_elements
        if result != -1: return
        for i in array:
            if type(i) == list:
                if i in checked_elements:
                    result = i
                    return
                checked_elements.append(i)
                array_recursive_check(i)
    array_recursive_check(array)
    return result
get_duplicated_element([[[1,2],[3,4],[6,7]],[[3,4],[5,9],[8,3],[4,2]],[[3,4],[9,9]]])
[3, 4]

you can achieve this with a list comprehension:
>>> l = [[[1,2],[3,4],[6,7]],[[3,4],[5,9],[8,3],[4,2]],[[3,4],[9,9]]]
>>> lcombined = sum(l, [])
>>> [k[0] for k in [(i,lcombined.count(i)) for i in lcombined] if k[1] > 1][0]
[3, 4]

What is the Pythonic way to find the longest common prefix of a list of lists?

Given: a list of lists, such as [[3,2,1], [3,2,1,4,5], [3,2,1,8,9], [3,2,1,5,7,8,9]]
Todo: Find the longest common prefix of all sublists.
Exists: In another thread, "Common elements between two lists not using sets in Python", it is suggested to use "Counter", which is available from Python 2.7 onward. However, our current project is written in Python 2.6, so "Counter" is not an option.
I currently code it like this:
l = [[3,2,1], [3,2,1,4,5], [3,2,1,8,9], [3,2,1,5,7,8,9]]
newl = l[0]
if len(l)>1:
    for li in l[1:]:
        newl = [x for x in newl if x in li]
But I find it not very pythonic. Is there a better way of coding this?
Thanks!
New edit: Sorry, I forgot to mention: in my case, the shared elements of the lists in 'l' have the same order and always start from the 0th item. So you won't have cases like [[1,2,5,6],[2,1,7]].

os.path.commonprefix() works well for lists :)
>>> x = [[3,2,1], [3,2,1,4,5], [3,2,1,8,9], [3,2,1,5,7,8,9]]
>>> import os
>>> os.path.commonprefix(x)
[3, 2, 1]

I am not sure how pythonic it is:
from itertools import takewhile,izip
x = [[3,2,1], [3,2,1,4,5], [3,2,1,8,9], [3,2,1,5,7,8,9]]
def allsame(x):
    return len(set(x)) == 1
r = [i[0] for i in takewhile(allsame ,izip(*x))]
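Since izip only exists in Python 2, here is a sketch of the same takewhile idea that should run on both Python 2.6 and Python 3. It compares with == rather than building a set, so the items do not have to be hashable. common_prefix is a made-up name:

```python
from itertools import takewhile

def common_prefix(lists):
    """Longest prefix shared by every list in lists."""
    if not lists:
        return []
    # zip stops at the shortest list, so the prefix can never overrun one.
    columns = zip(*lists)
    same = takewhile(lambda col: all(v == col[0] for v in col), columns)
    return [col[0] for col in same]

x = [[3, 2, 1], [3, 2, 1, 4, 5], [3, 2, 1, 8, 9], [3, 2, 1, 5, 7, 8, 9]]
print(common_prefix(x))  # [3, 2, 1]
```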

Here's an alternative way using itertools:
>>> import itertools
>>> L = [[3,2,1,4], [3,2,1,4,5], [3,2,1,8,9], [3,2,1,5,7,8,9]]
>>> common_prefix = []
>>> for i in itertools.izip(*L):
...     if i.count(i[0]) == len(i):
...         common_prefix.append(i[0])
...     else:
...         break
...
>>> common_prefix
[3, 2, 1]
Not sure how "pythonic" it might be considered though.

Given your example code, you seem to want a version of reduce(set.intersection, map(set, l)) that preserves the initial order of the first list.
This requires algorithmic improvements, not stylistic improvements; "pythonic" code alone won't do you any good here. Think about the situation that must hold for all values that occur in every list:
Given a list of lists, a value occurs in every list if and only if it occurs in nlist lists, where nlist is the total number of lists.
If we can guarantee that each value occurs only once in every list, then the above can be rephrased:
Given a list of lists of unique items, a value occurs in every list if and only if it occurs nlist times total.
We can use sets to guarantee that the items in our lists are unique, so we can combine this latter principle with a simple counting strategy:
>>> import itertools
>>> l = [[3,2,1], [3,2,1,4,5], [3,2,1,8,9], [3,2,1,5,7,8,9]]
>>> count = {}
>>> for i in itertools.chain.from_iterable(map(set, l)):
...     count[i] = count.get(i, 0) + 1
...
Now all we have to do is filter the original list:
>>> [i for i in l[0] if count[i] == len(l)]
[3, 2, 1]

It is inefficient, as it doesn't exit early as soon as a mismatch is found, but it's tidy:
([i for i,(j,k) in enumerate(zip(a,b)) if j!=k] or [0])[0]
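For comparison, an early-exit variant might use next() on a generator expression, which stops at the first mismatch. This is just a sketch: first_mismatch is a made-up name, and the default of 0 mirrors the one-liner above, which means "no mismatch" and "mismatch at index 0" cannot be told apart unless a different default is passed.

```python
def first_mismatch(a, b, default=0):
    """Index of the first position where a and b differ, else default."""
    return next((i for i, (j, k) in enumerate(zip(a, b)) if j != k), default)

print(first_mismatch([3, 2, 1, 4], [3, 2, 1, 8]))  # 3
print(first_mismatch([3, 2, 1], [3, 2, 1]))        # 0
```

The prefix itself would then be a[:first_mismatch(a, b, default=min(len(a), len(b)))].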
