How to remove duplicates from the list of tuples? [duplicate]

I have a 2D list which I create like so:
Z1 = [[0 for x in range(3)] for y in range(4)]
I then proceed to populate this list, such that Z1 looks like this:
[[1, 2, 3], [4, 5, 6], [2, 3, 1], [2, 5, 1]]
I need to extract the unique 1x3 elements of Z1, without regard to order:
Z2 = makeUnique(Z1) # The solution
The contents of Z2 should look like this:
[[4, 5, 6], [2, 5, 1]]
As you can see, I consider [1, 2, 3] and [2, 3, 1] to be duplicates because I don't care about the order.
Also note that single numeric values may appear more than once across elements (e.g. [2, 3, 1] and [2, 5, 1]); it's only when all three values appear together more than once (in the same or different order) that I consider them to be duplicates.
I have searched dozens of similar problems, but none of them seems to address my exact issue. I'm a complete Python beginner so I just need a push in the right direction.
I have already tried:
Z2 = dict((x[0], x) for x in Z1).values()
Z2 = set(i for j in Z2 for i in j)
But this does not produce the desired behaviour.
Thank you very much for your help!
Louis Vallance

If the order of the elements inside the sublists does not matter, you could use the following:
from collections import Counter
z1 = [[1, 2, 3], [4, 5, 6], [2, 3, 1], [2, 5, 1]]
temp = Counter([tuple(sorted(x)) for x in z1])
z2 = [list(k) for k, v in temp.items() if v == 1]
print(z2) # [[4, 5, 6], [1, 2, 5]]
Some remarks:
Sorting makes the lists [1, 2, 3] and [2, 3, 1] from the example equal, so the Counter groups them together.
Casting to tuple converts each list into something hashable, which can therefore be used as a dictionary key.
The Counter creates a dict with those tuples as keys and, as values, the number of times each appears in the original list.
The final list comprehension keeps the keys from the Counter that have a count of 1.
If the original order of the elements inside each sublist must be preserved in the output, you can use the following instead:
z1 = [[1, 2, 3], [4, 5, 6], [2, 3, 1], [2, 5, 1]]
def test(sublist, list_):
    for sub in list_:
        if all(x in sub for x in sublist):
            return False
    return True
z2 = [x for i, x in enumerate(z1) if test(x, z1[:i] + z1[i+1:])]
print(z2) # [[4, 5, 6], [2, 5, 1]]
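Neither snippet above is wrapped in the makeUnique function the question asks for. Here is a minimal sketch of one, assuming the sublists are short enough that sorting each of them twice is acceptable (the name make_unique is my own, not from the answer). It counts sorted-tuple keys as in the first snippet but returns the original sublists, so [2, 5, 1] keeps its element order.

```python
from collections import Counter

def make_unique(rows):
    """Keep only the sublists whose elements occur together exactly once."""
    # Sorting makes the key independent of element order; tuple makes it hashable.
    counts = Counter(tuple(sorted(row)) for row in rows)
    return [row for row in rows if counts[tuple(sorted(row))] == 1]

z1 = [[1, 2, 3], [4, 5, 6], [2, 3, 1], [2, 5, 1]]
z2 = make_unique(z1)
print(z2)  # [[4, 5, 6], [2, 5, 1]]
```

Unlike the all(x in sub ...) test, this compares whole sublists including repeated values, so [1, 1, 2] would not be treated as a duplicate of [1, 2, 3].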

Related

Why is a different outcome occurring on the following use of sort

I was working on a puzzle in which I had to combine two lists of the same length into a new list and sort that list by the second element of each pair.
for x in range(n):
    tmp.append([start[x], end[x]])
where start and end are lists with the same number of elements and n is their length.
I don't understand why the following two pieces of code produce different results.
end.sort()
for x in range(n):
    tmp.append([start[x], end[x]])
and
for x in range(n):
    tmp.append([start[x], end[x]])
tmp.sort(key=lambda x: x[1])
EDIT:
Input lists
start = [1, 3, 0, 5, 8, 5]
end = [2, 4, 6, 7, 9, 9]
Output when sorting first
[[1, 2], [3, 4], [0, 6], [5, 7], [8, 9], [5, 9]]
Output when sorting later
[[1, 2], [3, 4], [0, 6], [5, 7], [8, 9], [5, 9]]
This works fine for these lists but not for a bigger input
(it contains 80 elements, which is why I am not posting it here).

If you sort end first, you combine the original order of start with the sorted order of end.
If you combine the two lists first and then sort by the end element, the start elements will get reordered, too, as they "tag along" with their end partner. Consider
start = [1, 2, 3]
end = [3, 2, 1]
Now, sorting end and combining, you'll end up with:
start = [1, 2, 3]
end = [1, 2, 3]
# =>
tmp = [[1, 1], [2, 2], [3, 3]]
Combining first, however, produces:
tmp = [[1, 3], [2, 2], [3, 1]]
And sorting this by the second element will shuffle the old start elements as well:
tmp.sort(key=lambda x:x[1])
# [[3, 1], [2, 2], [1, 3]]
Side note: Check out zip:
tmp = list(zip(start, end))
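The difference can be checked directly. Here is a small sketch using the three-element lists from the example above; the names pairs_sorted_first and pairs_sorted_later are made up for illustration.

```python
start = [1, 2, 3]
end = [3, 2, 1]

# Approach 1: sort end on its own, then pair it with the untouched start.
pairs_sorted_first = [[s, e] for s, e in zip(start, sorted(end))]

# Approach 2: pair first, then sort the pairs by their second element.
pairs_sorted_later = sorted(([s, e] for s, e in zip(start, end)),
                            key=lambda pair: pair[1])

print(pairs_sorted_first)  # [[1, 1], [2, 2], [3, 3]]
print(pairs_sorted_later)  # [[3, 1], [2, 2], [1, 3]]
```

The two results coincide only when end is already sorted, which is the case for the six-element input in the question.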

List comprehension headaches

I have a nested list like this:
list = [[1,2,3], [2,5,7,6], [1,-1], [5,7], [6,3,7,4,3], [2, 5, 1, -5]]
What I am trying to do is remove the nested lists in which a value appears both as a positive and as a negative number. I tried to do it with a list comprehension, but I couldn't figure it out.
def method(list):
    return [obj for obj in list if (x for x in obj if -x not in obj)]
The obtained results should be like:
list = [[1,2,3], [2,5,7,6], [5,7], [6,3,7,4,3]]

Assuming you want lists where the elements are either all negative or all positive, you can use the built-in all function to check for both possibilities:
result = [L for L in x if all(y>0 for y in L) or all(y<0 for y in L)]
EDIT:
In the comments you clarified what is a valid list (e.g. [-1, 2] is valid)... with this new formulation the test should be
result = [L for L in x if all(-y not in L for y in L)]
where each individual test is, however, now quadratic in the size of the list. Using a set removes this problem:
result = [L for L in x if all(-y not in S for S in (set(L),) for y in L)]
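The nested for S in (set(L),) clause is a compact way to build the set once per sublist, but it can be hard to read. Here is a sketch of the same check with a named helper (has_opposites is a hypothetical name, and x is replaced by the list from the question).

```python
def has_opposites(values):
    """Return True if some v and -v both occur in values."""
    lookup = set(values)  # built once; membership tests are O(1) on average
    # Note: as in the one-liner above, a 0 counts as its own opposite.
    return any(-v in lookup for v in values)

data = [[1, 2, 3], [2, 5, 7, 6], [1, -1], [5, 7], [6, 3, 7, 4, 3], [2, 5, 1, -5]]
result = [sub for sub in data if not has_opposites(sub)]
print(result)  # [[1, 2, 3], [2, 5, 7, 6], [5, 7], [6, 3, 7, 4, 3]]
```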

Using a list comprehension, you can do something like:
def method2(list):
    return [obj for obj in list if (all(n>0 for n in obj) or all(n<0 for n in obj))]
which, with your example, gives as output:
[[1, 2, 3], [2, 5, 7, 6], [5, 7], [6, 3, 7, 4, 3]]

In general, it is better to split the task into steps:
Given a list, find the positives (positives function).
Given a list, find the negatives and multiply them by -1 (negatives function).
If the intersection of the positives and negatives is not empty, remove the list.
So, you could do:
def positives(ls):
    return set(l for l in ls if l > 0)
def negatives(ls):
    return set(-1*l for l in ls if l < 0)
list = [[1, 2, 3], [2, 5, 7, 6], [1, -1], [5, 7], [6, 3, 7, 4, 3], [2, 5, 1, -5]]
result = [l for l in list if not negatives(l) & positives(l)]
print(result)
Output
[[1, 2, 3], [2, 5, 7, 6], [5, 7], [6, 3, 7, 4, 3]]
As a side note, you should not use list as a variable name, as it shadows the built-in list type.

Your generator should yield whether the filter condition applies to each element.
You then feed the generator to an aggregator to determine whether obj should be filtered out.
The aggregator could be any or all, or something different.
# assuming obj should be filtered out if both x and the negation of x are in obj
def method_with_all(src):
    return [obj for obj in src if all(-x not in obj for x in obj)]
def method_with_any(src):
    return [obj for obj in src if not any(-x in obj for x in obj)]
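The two aggregators are related by De Morgan's law: all(not p) gives the same answer as not any(p). Here is a short sketch of that equivalence on the question's data (keep_with_all and keep_with_any are illustrative names, not from the answer).

```python
def keep_with_all(src):
    # Keep obj when no element has its negation in obj.
    return [obj for obj in src if all(-x not in obj for x in obj)]

def keep_with_any(src):
    # Same filter, phrased as "not (some element has its negation in obj)".
    return [obj for obj in src if not any(-x in obj for x in obj)]

data = [[1, 2, 3], [2, 5, 7, 6], [1, -1], [5, 7], [6, 3, 7, 4, 3], [2, 5, 1, -5]]
same = keep_with_all(data) == keep_with_any(data)
print(same)  # True
```

Without the not, the any-based comprehension would return the complement, that is, only the lists that should be removed.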

You can filter out the lists that have both negative and positive elements:
def keep_list(nested_list):
    is_first_positive = nested_list[0] > 0
    for element in nested_list[1:]:
        if (element > 0) != is_first_positive:
            return False
    return True
my_list = [[1,2,3], [2,5,7,6], [1,-1], [5,7], [6,3,7,4,3], [2, 5, 1, -5]]
print(list(filter(keep_list, my_list)))
output:
[[1, 2, 3], [2, 5, 7, 6], [5, 7], [6, 3, 7, 4, 3]]

NumPy can be used as well. My solution here is similar to the all-based approach suggested by others, but it is coded explicitly and only needs one condition. It checks whether the sign of every element equals the sign of the first element (it could be any other element as well).
import numpy as np
def f(b):
    return [a for a in b if np.sum(np.sign(np.array(a)) == np.sign(a[0])) == len(a)]
For your case...
data = [[1,2,3], [2,5,7,6], [1,-1], [5,7], [6,3,7,4,3], [2, 5, 1, -5]]
print(f(data))
...it will return:
[[1, 2, 3], [2, 5, 7, 6], [5, 7], [6, 3, 7, 4, 3]]

How to do Math Functions on Lists within a List

I'm very new to python (using python3) and I'm trying to add numbers from one list to another list. The only problem is that the second list is a list of lists. For example:
[[1, 2, 3], [4, 5, 6]]
What I want is to, say, add 1 to each item in the first list and 2 to each item in the second, returning something like this:
[[2, 3, 4], [6, 7, 8]]
I tried this:
original_lst = [[1, 2, 3], [4, 5, 6]]
transposition_lst = [1, 2]
new_lst = [x+y for x,y in zip(original_lst, transposition_lst)]
print(new_lst)
When I do this, I get an error
can only concatenate list (not "int") to list
This leads me to believe that I can't operate in this way on the lists as long as they are nested within another list. I want to do this operation without flattening the nested list. Is there a solution?

One approach using enumerate
Demo:
l = [[1, 2, 3], [4, 5, 6]]
print( [[j+i for j in v] for i,v in enumerate(l, 1)] )
Output:
[[2, 3, 4], [6, 7, 8]]

You can use enumerate:
l = [[1, 2, 3], [4, 5, 6]]
new_l = [[c+i for c in a] for i, a in enumerate(l, 1)]
Output:
[[2, 3, 4], [6, 7, 8]]

Why not use NumPy instead?
import numpy as np
mat = np.array([[1, 2, 3], [4, 5, 6]])
mul = np.array([1,2])
m = np.ones(mat.shape)
res = (m.T *mul).T + mat

You were very close with your original method. You just fell one step short.
Small addition
original_lst = [[1, 2, 3], [4, 5, 6]]
transposition_lst = [1, 2]
new_lst = [[xx + y for xx in x] for x, y in zip(original_lst, transposition_lst)]
print(new_lst)
Output
[[2, 3, 4], [6, 7, 8]]
Reasoning
If you print your original zip it is easy to see the issue. Your original zip yielded this:
In:
original_lst = [[1, 2, 3], [4, 5, 6]]
transposition_lst = [1, 2]
for x,y in zip(original_lst, transposition_lst):
    print(x, y)
Output
[1, 2, 3] 1
[4, 5, 6] 2
Now it is easy to see that you are trying to add an integer to a list (hence the error), which Python doesn't support. If they were both integers it would add them, and if they were both lists it would concatenate them.
To fix this, you need one extra step in your code that adds the integer to each value in the list. Hence the extra list comprehension in the solution above.
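The three cases described above can be tried in isolation. A minimal sketch (the variable names are illustrative):

```python
int_sum = 1 + 2             # two integers are added
list_sum = [1, 2] + [3]     # two lists are concatenated
print(int_sum)   # 3
print(list_sum)  # [1, 2, 3]

try:
    [1, 2, 3] + 1           # a list and an integer cannot be combined with +
except TypeError as exc:
    message = str(exc)
    print(message)          # the "can only concatenate list" error from the question
```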

A different approach than numpy that could work even for lists of different lengths is
lst = [[1, 2, 3], [4, 5, 6, 7]]
c = [1, 2]
res = [[l + c[i] for l in lst[i]] for i in range(len(c))]

In the for-loop, when I use 'append' to add a new element to the list, the one that has added before also change

I want to get the transpose of matrix B without using NumPy. When I use append to add a new element to the list, the elements that were added before also change. How can I fix it?
from decimal import *
B = [[1,2,3,5],
     [2,3,3,5],
     [1,2,5,1]]
def shape(M):
    r = len(M)
    c = len(M[0])
    return r,c
def matxRound(M, decPts=4):
    for p in M:
        for index in range(len(M[0])):
            p[index] = round(p[index], decPts)
def transpose(M):
    c_trans, r_trans = shape(M)
    new_row = [0]*c_trans
    trans_M = []
    for i in range(r_trans):
        for j in range(c_trans):
            new_row[j] = M[j][i]
        print 'new_row',new_row
        print 'trans_M before append',trans_M
        trans_M.append(new_row)
        print 'trans_M after append',trans_M
    return trans_M
print transpose(B)
The output is here:
new_row [1, 2, 1]
trans_M before append []
trans_M after append [[1, 2, 1]]
new_row [2, 3, 2]
trans_M before append [[2, 3, 2]]
trans_M after append [[2, 3, 2], [2, 3, 2]]
new_row [3, 3, 5]
trans_M before append [[3, 3, 5], [3, 3, 5]]
trans_M after append [[3, 3, 5], [3, 3, 5], [3, 3, 5]]
new_row [5, 5, 1]
trans_M before append [[5, 5, 1], [5, 5, 1], [5, 5, 1]]
trans_M after append [[5, 5, 1], [5, 5, 1], [5, 5, 1], [5, 5, 1]]
[[5, 5, 1], [5, 5, 1], [5, 5, 1], [5, 5, 1]]

I will expand on @glibdud's comment:
You create a single list, new_row, to hold a row of the transpose.
You then create your new matrix, trans_M.
On each pass you fill new_row and append it to the new matrix, without creating a new row list.
So on the next pass you modify the list you just appended and then append it again.
In the end, you have added the same list to your new matrix 4 times. Because all 4 entries refer to the same object in memory, your new matrix has 4 identical rows.
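Here is a minimal sketch of the fix this explanation points to, assuming the rest of the question's code stays the same: create new_row inside the outer loop so that every appended row is a distinct list. The debugging prints are dropped, and the parameter is renamed for readability.

```python
def transpose(matrix):
    n_rows, n_cols = len(matrix), len(matrix[0])
    transposed = []
    for i in range(n_cols):
        new_row = [0] * n_rows      # a fresh list on every pass
        for j in range(n_rows):
            new_row[j] = matrix[j][i]
        transposed.append(new_row)  # appends a reference to this pass's list only
    return transposed

B = [[1, 2, 3, 5],
     [2, 3, 3, 5],
     [1, 2, 5, 1]]
result = transpose(B)
print(result)  # [[1, 2, 1], [2, 3, 2], [3, 3, 5], [5, 5, 1]]
```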

The most Pythonic way I know to transpose a matrix without using NumPy (which should be the preferred way) is to use argument unpacking with the built-in zip function: transposed = list(zip(*B)).
However, zip() returns tuples, while your original matrix is a list of lists. So, if you want to keep that structure, you can use transposed = [list(i) for i in zip(*B)]
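A quick sketch of both variants on the matrix from the question, for comparison (as_tuples and as_lists are illustrative names):

```python
B = [[1, 2, 3, 5],
     [2, 3, 3, 5],
     [1, 2, 5, 1]]

as_tuples = list(zip(*B))                   # rows come back as tuples
as_lists = [list(col) for col in zip(*B)]   # rows converted back to lists

print(as_tuples)  # [(1, 2, 1), (2, 3, 2), (3, 3, 5), (5, 5, 1)]
print(as_lists)   # [[1, 2, 1], [2, 3, 2], [3, 3, 5], [5, 5, 1]]
```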

Remove duplicated lists in list of lists in Python

I've seen some closely related questions here, but their answers don't work for me. I have a list of lists where some sublists are repeated, but their elements may be in a different order. For example:
g = [[1, 2, 3], [3, 2, 1], [1, 3, 2], [9, 0, 1], [4, 3, 2]]
The output should be, according to my question:
g = [[1,2,3],[9,0,1],[4,3,2]]
I've tried using set, but it only removes lists that are exactly equal (I thought it would work because sets are by definition unordered). The other questions I visited only have examples with exactly duplicated lists, like this one: Python : How to remove duplicate lists in a list of list?. For now, the order of the output (for the list and the sublists) is not a problem.

A version that (ab)uses side effects in a list comprehension:
seen = set()
[x for x in g if frozenset(x) not in seen and not seen.add(frozenset(x))]
Out[4]: [[1, 2, 3], [9, 0, 1], [4, 3, 2]]
For those (unlike myself) who don't like using side-effects in this manner:
res = []
seen = set()
for x in g:
    x_set = frozenset(x)
    if x_set not in seen:
        res.append(x)
        seen.add(x_set)
The reason that you add frozensets to the set is that you can only add hashable objects to a set, and vanilla sets are not hashable.

If you don't care about the order for lists and sublists (and all items in sublists are unique):
result = set(map(frozenset, g))
If a sublist may contain duplicates, e.g. [1, 2, 1, 3], then you could use tuple(sorted(sublist)) instead of frozenset(sublist), because frozenset removes duplicates from a sublist.
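To see why the choice of key matters, here is a small sketch with two hypothetical sublists that differ only by a repeated element:

```python
a = [1, 2, 1, 3]
b = [1, 2, 3]

same_as_frozenset = frozenset(a) == frozenset(b)        # repeated 1 is collapsed
same_as_sorted_tuple = tuple(sorted(a)) == tuple(sorted(b))  # repeated 1 is kept

print(same_as_frozenset)     # True
print(same_as_sorted_tuple)  # False
```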
If you want to preserve the order of sublists:
def del_dups(seq, key=frozenset):
    seen = {}
    pos = 0
    for item in seq:
        if key(item) not in seen:
            seen[key(item)] = True
            seq[pos] = item
            pos += 1
    del seq[pos:]
Example:
del_dups(g, key=lambda x: tuple(sorted(x)))
See In Python, what is the fastest algorithm for removing duplicates from a list so that all elements are unique while preserving order?

What about using frozenset, as mentioned by roippi, this way:
>>> g = [list(x) for x in set(frozenset(i) for i in [set(i) for i in g])]
[[0, 9, 1], [1, 2, 3], [2, 3, 4]]

I would convert each element in the list to a frozenset (which is hashable), then create a set out of it to remove duplicates:
>>> g = [[1, 2, 3], [3, 2, 1], [1, 3, 2], [9, 0, 1], [4, 3, 2]]
>>> set(map(frozenset, g))
set([frozenset([0, 9, 1]), frozenset([1, 2, 3]), frozenset([2, 3, 4])])
If you need to convert the elements back to lists:
>>> map(list, set(map(frozenset, g)))
[[0, 9, 1], [1, 2, 3], [2, 3, 4]]
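The session above appears to be from Python 2, where map returns a list and sets print as set([...]). Here is a sketch of the equivalent for Python 3, where map returns a lazy iterator. The order of the result is not guaranteed, since sets are unordered, so the check below sorts before printing.

```python
g = [[1, 2, 3], [3, 2, 1], [1, 3, 2], [9, 0, 1], [4, 3, 2]]

# frozenset makes each sublist hashable and order-insensitive;
# the outer set then drops the repeats.
unique = [list(s) for s in set(map(frozenset, g))]

print(len(unique))                            # 3
print(sorted(sorted(sub) for sub in unique))  # [[0, 1, 9], [1, 2, 3], [2, 3, 4]]
```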
