Getting the difference between two 2D lists - python

I have a 2D list (a list of lists). I process it and get back a slightly modified 2D list, and I cannot track which changes are made until I have the new list. I want a list of the positions of all items that changed, so that
[[1,2,3], [4,5,6], [7,8,9]] becoming [[1,None,3], [4,None,6], [7,None, None]] would give me [(0,1), (1,1), (2, 1), (2,2)]. I know you can normally do list(set(a)-set(b)), but when I tried it I got TypeError: unhashable type: 'list'. So what is the most efficient way of doing this?

Using zip, enumerate and a generator function:
def diff(lis1, lis2):
    for i, (x, y) in enumerate(zip(lis1, lis2)):
        for j, (x1, y1) in enumerate(zip(x, y)):
            if x1 != y1:
                yield i, j
...
>>> lis1 = [[1,2,3], [4,5,6], [7,8,9]]
>>> lis2 = [[1,None,3], [4,None,6], [7,None, None]]
>>> list(diff(lis1, lis2))
[(0, 1), (1, 1), (2, 1), (2, 2)]
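One caveat with the zip-based generator: zip stops at the shorter input, so if the modified list could gain or lose rows or cells, those positions would be skipped silently. A minimal sketch of a variant that reports them too, assuming Python 3; diff_ragged and the _MISSING sentinel are hypothetical names, not part of the answer above:

```python
from itertools import zip_longest

_MISSING = object()  # hypothetical sentinel meaning "no element at this position"

def diff_ragged(lis1, lis2):
    """Yield (row, col) for every position where the two nested lists differ."""
    # A row that exists in only one list is compared against an empty row.
    for i, (row1, row2) in enumerate(zip_longest(lis1, lis2, fillvalue=())):
        for j, (x, y) in enumerate(zip_longest(row1, row2, fillvalue=_MISSING)):
            # The sentinel is checked by identity, so a missing cell counts as
            # a difference even when the other side holds None.
            if x is _MISSING or y is _MISSING or x != y:
                yield i, j

old = [[1, 2, 3], [4, 5, 6]]
new = [[1, None, 3], [4, 5]]
changed = list(diff_ragged(old, new))
print(changed)  # [(0, 1), (1, 2)]
```

For lists that are guaranteed to keep their shape, the plain zip version above is simpler and does the same job.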

Using list comprehension:
>>> a = [[1,2,3], [4,5,6], [7,8,9]]
>>> b = [[1,None,3], [4,None,6], [7,None, None]]
>>> [(i,j) for i, row in enumerate(a) for j, x in enumerate(row) if b[i][j] != x]
[(0, 1), (1, 1), (2, 1), (2, 2)]

If the lists have a regular structure, that is, every sub-list has the same length, and you don't mind using external packages, numpy can help.
import numpy as np
a = np.array([[1,2,3], [4,5,6], [7,8,9]])
b = np.array([[1,None,3], [4,None,6], [7,None, None]])
print(np.where(a!=b))
# prints (array([0, 1, 2, 2]), array([1, 1, 1, 2])), i.e. the row indices and the column indices of the differing cells

Related

Generating 2D grid indices using list comprehension

I would like to store the row and column indices of a 3 x 3 list in list. It should look like the following:
rc = [(0,0),(0,1),(0,2),(1,0),(1,1),(1,2),(2,0),(2,1),(2,2)]
How can I get this list using a list comprehension in Python?
Some consider multiple for loops inside a list comprehension to be poor style. Use itertools.product() instead:
from itertools import product
list(product(range(3), repeat=2))
This outputs:
[(0, 0), (0, 1), (0, 2), (1, 0), (1, 1), (1, 2), (2, 0), (2, 1), (2, 2)]
How about:
[(x, y) for x in range(3) for y in range(3)]
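Both approaches generalise to grids that are not square. A small sketch, assuming Python 3; grid_indices is a hypothetical helper name:

```python
from itertools import product

def grid_indices(n_rows, n_cols):
    """Return every (row, col) pair of an n_rows x n_cols grid, row by row."""
    return list(product(range(n_rows), range(n_cols)))

rc = grid_indices(2, 3)
print(rc)  # [(0, 0), (0, 1), (0, 2), (1, 0), (1, 1), (1, 2)]

# The nested comprehension produces the same pairs in the same order.
same = [(r, c) for r in range(2) for c in range(3)]
print(rc == same)  # True
```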
If I understand correctly and the input is A, then B can be built as follows:
A = [[1,2,3],[4,5,6],[7,8,9]]
B = [(A[i], A[j]) for i in range(3) for j in range(3)]

combine 2 lists of list to list of tuples

I'm trying to combine two different nested lists into a list of tuples (x, y),
where x comes from the first nested list and y from the second nested list.
nested_list1 = [[1, 2, 3],[3],[0, 3],[1]]
nested_list2 = [[.0833, .0833, .0833], [.2], [.175, .175], [.2]]
When you combine them, the result should be:
result = [(1,.0833), (2,.0833), (3,.0833), (3,.2), (0,.175), (3,.175), (1,.2)]
My approach is to iterate through the lists of lists and join them one at a time.
I know how to iterate through one nested list, like so:
for list in nested_list1:
    for number in list:
        print(number)
But I can't iterate through two nested lists at the same time:
for list, list in zip(nested_list1, nested_list2):
    for number, prob in zip(list,list):
        print(tuple(number, prob)) #will not work
Any ideas?
You could do a double zip through lists:
lst1 = [[1, 2, 3],[3],[0, 3],[1]]
lst2 = [[.0833, .0833, .0833], [.2], [.175, .175], [.2]]
print([(u, v) for x, y in zip(lst1, lst2) for u, v in zip(x, y)])
Or use itertools.chain.from_iterable to flatten list and zip:
from itertools import chain
lst1 = [[1, 2, 3],[3],[0, 3],[1]]
lst2 = [[.0833, .0833, .0833], [.2], [.175, .175], [.2]]
print(list(zip(chain.from_iterable(lst1), chain.from_iterable(lst2))))
Use itertools.chain:
>>> nested_list1 = [[1, 2, 3],[3],[0, 3],[1]]
>>> nested_list2 = [[.0833, .0833, .0833], [.2], [.175, .175], [.2]]
>>> import itertools
>>> res = list(zip(itertools.chain.from_iterable(nested_list1), itertools.chain.from_iterable(nested_list2)))
>>> res
[(1, 0.0833), (2, 0.0833), (3, 0.0833), (3, 0.2), (0, 0.175), (3, 0.175), (1, 0.2)]
Flatten your lists and then pass to zip():
list1 = [item for sublist in nested_list1 for item in sublist]
list2 = [item for sublist in nested_list2 for item in sublist]
final = list(zip(list1, list2))
Yields:
[(1, 0.0833), (2, 0.0833), (3, 0.0833), (3, 0.2), (0, 0.175), (3, 0.175), (1, 0.2)]
There are two errors in your code:
You shadow the built-in list twice, and in a way that makes the two loop variables indistinguishable. Don't do this.
You use tuple(x, y) to create a tuple from two variables. This is incorrect, as tuple takes at most one argument (an iterable). To construct a tuple of two variables, just use the syntax (x, y).
So this will work:
for L1, L2 in zip(nested_list1, nested_list2):
    for number, prob in zip(L1, L2):
        print((number, prob))
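To make the point about tuple concrete, here is a small sketch (assuming Python 3) of the three spellings side by side; the variable names are only for illustration:

```python
number, prob = 1, 0.0833

pair_literal = (number, prob)               # plain tuple syntax
pair_from_iterable = tuple([number, prob])  # tuple() given ONE iterable argument

try:
    tuple(number, prob)                     # two positional arguments
except TypeError as exc:
    error_message = str(exc)                # tuple() rejects the extra argument
else:
    error_message = None

print(pair_literal == pair_from_iterable)   # True
print(error_message)
```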
More idiomatic would be to flatten your nested lists; for example, via itertools.chain:
from itertools import chain
res = list(zip(chain.from_iterable(nested_list1),
               chain.from_iterable(nested_list2)))
[(1, 0.0833), (2, 0.0833), (3, 0.0833), (3, 0.2), (0, 0.175), (3, 0.175), (1, 0.2)]
This one-liner will achieve what you want (in Python 3, reduce has to be imported from functools first):
from functools import reduce
reduce(lambda x, y: x+y, [[(i, j) for i, j in zip(x,y)] for x, y in zip(nested_list1, nested_list2)])
One way is to flatten both nested lists into flat lists and then use zip. Sample code below:
>>> nested_list1 = [[1, 2, 3],[3],[0, 3],[1]]
>>> nested_list2 = [[.0833, .0833, .0833], [.2], [.175, .175], [.2]]
>>> new_list1 = [x for val in nested_list1 for x in val]
>>> new_list2 = [x for val in nested_list2 for x in val]
>>> print(new_list1)
[1, 2, 3, 3, 0, 3, 1]
>>> print(new_list2)
[0.0833, 0.0833, 0.0833, 0.2, 0.175, 0.175, 0.2]
>>> new_val = list(zip(new_list1, new_list2))
>>> print(new_val)
[(1, 0.0833), (2, 0.0833), (3, 0.0833), (3, 0.2), (0, 0.175), (3, 0.175), (1, 0.2)]
result = []
# Pair up corresponding sublists first, then pair their elements.
for x, y in zip(nested_list1, nested_list2):
    result.extend(zip(x, y))
print(result)
Use zip twice and flatten the list (each inner zip is turned into a list so the results can be concatenated with +):
from functools import reduce
reduce(lambda x, y: x + y, [list(zip(i, j)) for i, j in zip(nested_list1, nested_list2)])
You can flatten using chain as well
from itertools import chain
list(chain(*[(zip(i,j)) for i,j in zip(nested_list1,nested_list2)]))
output
[(1, 0.0833), (2, 0.0833), (3, 0.0833), (3, 0.2), (0, 0.175), (3, 0.175), (1, 0.2)]
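Note that the flatten-then-zip answers pair elements purely by position in the flattened sequences, so a sublist that is one element short would shift every later pair without raising any error. If that matters, here is a hedged sketch that checks the shapes first; pair_nested is a hypothetical helper name, and Python 3 is assumed:

```python
def pair_nested(xs, ys):
    """Pair xs[i][j] with ys[i][j], refusing inputs whose shapes differ."""
    if len(xs) != len(ys):
        raise ValueError("outer lists differ in length")
    pairs = []
    for i, (row_x, row_y) in enumerate(zip(xs, ys)):
        if len(row_x) != len(row_y):
            raise ValueError(f"sublists at index {i} differ in length")
        pairs.extend(zip(row_x, row_y))
    return pairs

nested_list1 = [[1, 2, 3], [3], [0, 3], [1]]
nested_list2 = [[.0833, .0833, .0833], [.2], [.175, .175], [.2]]
result = pair_nested(nested_list1, nested_list2)
print(result)
```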

Compare two lists and return the differing indices and elements in Python

I want to compare two lists and return the indices and elements that differ.
So I wrote the following code:
l1 = [1,1,1,1,1]
l2 = [1,2,1,1,3]
ind = []
diff = []
for i in range(len(l1)):
    if l1[i] != l2[i]:
        ind.append(i)
        diff.append([l1[i], l2[i]])
print(ind)
print(diff)
# output:
# [1, 4]
# [[1, 2], [1, 3]]
The code works, but are there better ways to do this?
Update to the question:
I am asking for other solutions, for example with an iterator or a ternary-style expression like [a,b](expression), not the straightforward loop I already have, which I want to exclude. Thank you very much for your patience! :)
You could use a list comprehension to output all the information in a single list.
>>> [[idx, (i,j)] for idx, (i,j) in enumerate(zip(l1, l2)) if i != j]
[[1, (1, 2)], [4, (1, 3)]]
This will produce a list where each element is: [index, (first value, second value)] so all the information regarding a single difference is together.
An alternative way is the following
>>> l1 = [1,1,1,1,1]
>>> l2 = [1,2,1,1,3]
>>> z = zip(l1,l2)
>>> ind = [i for i, x in enumerate(z) if x[0] != x[1]]
>>> ind
[1, 4]
>>> diff = [z[i] for i in ind]
>>> diff
[(1, 2), (1, 3)]
In Python 3, zip returns an iterator that cannot be indexed or reused, so wrap it in list: z = list(zip(l1, l2)).
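You can try a functional style (the lambda takes a single argument here, because tuple parameters such as lambda (idx, x) are Python 2 only):
res = list(filter(lambda item: item[1][0] != item[1][1], enumerate(zip(l1, l2))))
# [(1, (1, 2)), (4, (1, 3))]
To unzip res you can use:
list(zip(*res))
# [(1, 4), ((1, 2), (1, 3))]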
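When the goal is two separate lists, as in the question, unzipping needs a guard: with no differences, zip(*res) yields nothing and unpacking it into two names fails. A minimal sketch, assuming Python 3; split_differences is a hypothetical name:

```python
def split_differences(l1, l2):
    """Return (indices, pairs) for the positions where l1 and l2 differ."""
    res = [(idx, pair) for idx, pair in enumerate(zip(l1, l2))
           if pair[0] != pair[1]]
    if not res:
        # Nothing differs, so there is nothing to unzip.
        return [], []
    ind, diff = zip(*res)
    return list(ind), list(diff)

ind, diff = split_differences([1, 1, 1, 1, 1], [1, 2, 1, 1, 3])
print(ind)   # [1, 4]
print(diff)  # [(1, 2), (1, 3)]
```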

Find all occurrences of an element in a matrix in Python

I have a list of lists and I want to find the coordinates of all occurrences of a value. I managed to do it, but I wonder if there is a better way, for example using numpy.where.
This is what I did:
my_list = [[1,2,3,1, 3], [1,3,2]]
target_value = 3
locations = []
for k in range(len(my_list)):
    indices = [i for i, x in enumerate(my_list[k]) if x == target_value]
    locations.append((k, indices))
locations2 = []
for row in locations:
    for i in row[1]:
        locations2.append((row[0], i))
print(locations2)  # prints [(0, 2), (0, 4), (1, 1)]
While you could get this to work in numpy, numpy isn't all that happy with ragged arrays. I think the pure python comprehension version looks okay:
>>> my_list = [[1,2,3,1, 3], [1,3,2]]
>>> [(i,j) for i,x in enumerate(my_list) for j,y in enumerate(x) if y == 3]
[(0, 2), (0, 4), (1, 1)]
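If several different values need to be located in the same nested list, scanning it once and grouping the coordinates by value avoids repeating the comprehension for each target. A sketch under that assumption (Python 3; index_positions is a hypothetical helper, and the values must be hashable):

```python
from collections import defaultdict

def index_positions(rows):
    """Map each value to the list of (row, col) positions where it occurs."""
    positions = defaultdict(list)
    for i, row in enumerate(rows):
        for j, value in enumerate(row):
            positions[value].append((i, j))
    return dict(positions)

my_list = [[1, 2, 3, 1, 3], [1, 3, 2]]
positions = index_positions(my_list)
print(positions[3])          # [(0, 2), (0, 4), (1, 1)]
print(positions.get(7, []))  # [] because 7 never occurs
```

For a single lookup the comprehension above is still the simpler choice.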

How to get first element in a list of tuples?

I have a list like below where the first element is the id and the other is a string:
[(1, u'abc'), (2, u'def')]
I want to create a list of ids only from this list of tuples as below:
[1,2]
I'll use this list in __in so it needs to be a list of integer values.
>>> a = [(1, u'abc'), (2, u'def')]
>>> [i[0] for i in a]
[1, 2]
Use the zip function to decouple elements:
>>> inpt = [(1, u'abc'), (2, u'def')]
>>> unzipped = zip(*inpt)
>>> print unzipped
[(1, 2), (u'abc', u'def')]
>>> print list(unzipped[0])
[1, 2]
Edit (@BradSolomon):
The above works for Python 2.x, where zip returns a list.
In Python 3.x, zip returns an iterator and the following is equivalent to the above:
>>> print(list(list(zip(*inpt))[0]))
[1, 2]
Do you mean something like this?
new_list = [ seq[0] for seq in yourlist ]
What you actually have is a list of tuple objects, not a list of sets (as your original question implied). If it is actually a list of sets, then there is no first element because sets have no order.
Here I've created a flat list because generally that seems more useful than creating a list of 1 element tuples. However, you can easily create a list of 1 element tuples by just replacing seq[0] with (seq[0],).
I thought it might be useful to compare the runtimes of the different approaches, so I made a benchmark (using the simple_benchmark library).
I) Benchmark with tuples of 2 elements
As you may expect, selecting the first element by index 0 turns out to be the fastest solution, very close to the unpacking solution that expects exactly 2 values.
import operator
import random
from simple_benchmark import BenchmarkBuilder
b = BenchmarkBuilder()
@b.add_function()
def rakesh_by_index(l):
    return [i[0] for i in l]
@b.add_function()
def wayneSan_zip(l):
    return list(list(zip(*l))[0])
@b.add_function()
def bcattle_itemgetter(l):
    return list(map(operator.itemgetter(0), l))
@b.add_function()
def ssoler_upacking(l):
    return [idx for idx, val in l]
@b.add_function()
def kederrack_unpacking(l):
    return [f for f, *_ in l]
@b.add_arguments('Number of tuples')
def argument_provider():
    for exp in range(2, 21):
        size = 2**exp
        yield size, [(random.choice(range(100)), random.choice(range(100))) for _ in range(size)]
r = b.run()
r.plot()
II) Benchmark with tuples of 2 or more elements
import operator
import random
from simple_benchmark import BenchmarkBuilder
b = BenchmarkBuilder()
@b.add_function()
def kederrack_unpacking(l):
    return [f for f, *_ in l]
@b.add_function()
def rakesh_by_index(l):
    return [i[0] for i in l]
@b.add_function()
def wayneSan_zip(l):
    return list(list(zip(*l))[0])
@b.add_function()
def bcattle_itemgetter(l):
    return list(map(operator.itemgetter(0), l))
@b.add_arguments('Number of tuples')
def argument_provider():
    for exp in range(2, 21):
        size = 2**exp
        yield size, [tuple(random.choice(range(100)) for _
                     in range(random.choice(range(2, 100)))) for _ in range(size)]
from pylab import rcParams
rcParams['figure.figsize'] = 12, 7
r = b.run()
r.plot()
This is what operator.itemgetter is for.
>>> a = [(1, u'abc'), (2, u'def')]
>>> import operator
>>> b = map(operator.itemgetter(0), a)
>>> b
[1, 2]
Calling itemgetter returns a function that returns the element at the index you specify. It's exactly the same as writing
>>> b = map(lambda x: x[0], a)
But I find that itemgetter is clearer and more explicit.
This is handy for making compact sort statements. For example,
>>> c = sorted(a, key=operator.itemgetter(0), reverse=True)
>>> c
[(2, u'def'), (1, u'abc')]
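For completeness, a short Python 3 sketch of the same idea: map returns a lazy iterator there, so it is wrapped in list, and itemgetter also accepts several indices, in which case it returns tuples. The sample data below is made up for illustration:

```python
from operator import itemgetter

rows = [(1, 'abc', 3.5), (2, 'def', 1.0)]

ids = list(map(itemgetter(0), rows))              # first field only
id_and_score = list(map(itemgetter(0, 2), rows))  # several fields -> tuples
by_score = sorted(rows, key=itemgetter(2))        # sort on the third field

print(ids)           # [1, 2]
print(id_and_score)  # [(1, 3.5), (2, 1.0)]
print(by_score)      # [(2, 'def', 1.0), (1, 'abc', 3.5)]
```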
You can use "tuple unpacking":
>>> my_list = [(1, 'abc'), (2, 'def')]
>>> my_ids = [idx for idx, val in my_list]
>>> my_ids
[1, 2]
At iteration time each tuple is unpacked and its values are set to the variables idx and val.
>>> x = (1, 'abc')
>>> idx, val = x
>>> idx
1
>>> val
'abc'
If the first elements (the ids) are unique, then this can work:
>>> a = [(1, u'abc'), (2, u'def')]
>>> a
[(1, u'abc'), (2, u'def')]
>>> dict(a).keys()
[1, 2]
>>> dict(a).values()
[u'abc', u'def']
>>>
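The uniqueness caveat is about the ids: dict keeps one entry per key, so repeated ids collapse and only the last value for each one survives. A small sketch assuming Python 3 (where keys() and values() return views, hence the list calls); the data is made up:

```python
a = [(1, 'abc'), (1, 'xyz'), (2, 'def')]

ids_via_dict = list(dict(a).keys())       # duplicates of id 1 collapse
values_via_dict = list(dict(a).values())  # the later value for id 1 wins
ids_via_index = [t[0] for t in a]         # keeps every occurrence

print(ids_via_dict)     # [1, 2]
print(values_via_dict)  # ['xyz', 'def']
print(ids_via_index)    # [1, 1, 2]
```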
From a performance point of view, in Python 3.x
[i[0] for i in a] and list(zip(*a))[0] perform about the same,
and both are faster than list(map(operator.itemgetter(0), a)).
Code:
import timeit
iterations = 100000
init_time = timeit.timeit('''a = [(i, u'abc') for i in range(1000)]''', number=iterations)/iterations
print(timeit.timeit('''a = [(i, u'abc') for i in range(1000)]\nb = [i[0] for i in a]''', number=iterations)/iterations - init_time)
print(timeit.timeit('''a = [(i, u'abc') for i in range(1000)]\nb = list(zip(*a))[0]''', number=iterations)/iterations - init_time)
output
3.491014136001468e-05
3.422205176000717e-05
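A hedged alternative to subtracting the initialisation time: timeit accepts a setup statement that runs once and is excluded from the measurement. The sketch below assumes Python 3, uses fewer iterations than above, and also times the itemgetter variant; absolute numbers depend on the machine, so none are shown:

```python
import timeit

# Runs once per timeit call and is not included in the reported time.
setup = "import operator\na = [(i, 'abc') for i in range(1000)]"

candidates = {
    "index": "[i[0] for i in a]",
    "zip": "list(zip(*a))[0]",
    "itemgetter": "list(map(operator.itemgetter(0), a))",
}
iterations = 1000

# Average seconds per execution of each statement.
timings = {
    name: timeit.timeit(stmt, setup=setup, number=iterations) / iterations
    for name, stmt in candidates.items()
}
for name, seconds in timings.items():
    print(f"{name}: {seconds:.3e} s per call")
```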
When I ran (as suggested above):
>>> a = [(1, u'abc'), (2, u'def')]
>>> import operator
>>> b = map(operator.itemgetter(0), a)
>>> b
instead of returning:
[1, 2]
I received this as the return:
<map at 0xb387eb8>
I found I had to use list():
>>> b = list(map(operator.itemgetter(0), a))
to successfully return a list using this suggestion. That said, I'm happy with this solution, thanks. (Tested using Spyder, IPython console, Python v3.6.)
I'd prefer zipping this way:
>>> lst = [(1, u'abc'), (2, u'def')]
>>> new, _ = zip(*lst)
>>> new
(1, 2)
>>>
Or if you don't know how many extra values there are:
>>> new, *_ = zip(*lst)
>>> new
(1, 2)
>>>
You can unpack your tuples and get only the first element using a list comprehension:
l = [(1, u'abc'), (2, u'def')]
[f for f, *_ in l]
output:
[1, 2]
This will work no matter how many elements you have in a tuple:
l = [(1, u'abc'), (2, u'def', 2, 4, 5, 6, 7)]
[f for f, *_ in l]
output:
[1, 2]
I wondered why nobody suggested using numpy, but after checking I understand: it is probably not the best fit for mixed-type arrays.
This would be a solution in numpy:
>>> import numpy as np
>>> a = np.asarray([(1, u'abc'), (2, u'def')])
>>> a[:, 0].astype(int).tolist()
[1, 2]
To get the first element of each tuple, you can iterate over the list by index:
a = [(1, u'abc'), (2, u'def')]
list1 = [a[i][0] for i in range(len(a))]
print(list1)
Those are tuples, not sets. You can do this:
l1 = [(1, u'abc'), (2, u'def')]
l2 = [(tup[0],) for tup in l1]
l2
>>> [(1,), (2,)]
Another simple suggestion, if you just want the first element of every tuple collected with an explicit loop:
s = []
for i in range(len(a)):
    s.append(a[i][0])
print(s)
Output:
[1, 2]
If you need a nested list instead, wrap each first element in its own list (list(i[0]) would fail, because an int is not iterable):
a = [(1, u'abc'), (2, u'def')]
print([[i[0]] for i in a])
output:
[[1], [2]]
Solution using list comprehension.
og_list = [(1, u'abc'), (2, u'def')]
list_of_keys = [key for key, _ in og_list]
output
[1,2]
