Remove rows that contain at least one duplicate value - python

I have an array of indexes A:
A = np.array([[1, 4, 3],
              [1, 2, 5],
              [6, 7, 8],
              [9, 10, 2],
              [11, 3, 12]])
I would like to delete every row that contains at least one duplicated value (no matter which column it is in) to obtain an array with no duplicate indexes:
[[1, 4, 3],
 [6, 7, 8],
 [9, 10, 2]]
Is there a quick and convenient way to do this?

Given that no sub-list should be in the output if a value in it has already appeared in any previous sub-list, one possible solution looks like this:
# Assuming that A is a 2-D list
output = []
# Add all values to a set
st = set()
for i in range(len(A)):
    for j in range(len(A[i])):
        if A[i][j] not in st:
            st.add(A[i][j])
for i in range(len(A)):
    flag = True
    j = 0
    while j < len(A[i]):
        if A[i][j] not in st:
            # set flag to False if the value is no longer in the set, meaning a duplicate was encountered
            flag = False
            break
        else:
            # if the value is in the set, remove it from the set
            st.remove(A[i][j])
        j += 1
    # append A[i] to output only if the sub-list did not encounter any duplicates
    if flag:
        output.append(A[i])
The time complexity is O(n^2) for an n x n input, since every element is visited a constant number of times.
When the input is,
A = [[1, 4, 3],
[1, 2, 5],
[6, 7, 8],
[9, 10, 2],
[11, 3, 12]]
It provides the output,
[[1, 4, 3], [6, 7, 8], [9, 10, 2]]

A mask value of True means the row is not shown.
mask = 0 (False) means the row is shown.
At the beginning, assume all rows are unmasked.
One by one, check whether a row intersects a later row and, if so, mask that later row.
If a row is masked, don't use it for masking later rows.
Finally, show all unmasked rows.
mask = np.zeros((A.shape[0],), dtype=bool)
for i in range(A.shape[0]):
    if mask[i]: continue
    a = set(A[i])
    for j in range(i+1, A.shape[0]):
        if not a.intersection(A[j]): continue
        mask[j] = True
A[~mask]
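The masking idea above can also be written as a single pass that remembers the values of every row kept so far. This is only a sketch: the name drop_rows_with_seen_values is made up for illustration, it works on plain lists (for a NumPy array, pass A.tolist()), and it assumes that a rejected row does not block later rows, which is the same rule the mask loop follows.

```python
def drop_rows_with_seen_values(rows):
    """Keep a row only if none of its values appeared in an earlier kept row.

    Hypothetical helper for illustration; not part of NumPy.
    """
    seen = set()   # values of all rows kept so far
    kept = []
    for row in rows:
        values = set(row)
        if seen.isdisjoint(values):
            kept.append(list(row))
            seen |= values
    return kept


A = [[1, 4, 3],
     [1, 2, 5],
     [6, 7, 8],
     [9, 10, 2],
     [11, 3, 12]]
result = drop_rows_with_seen_values(A)
print(result)  # [[1, 4, 3], [6, 7, 8], [9, 10, 2]]
```

Each value is looked up in a set once, so the work grows with the number of elements rather than with the number of row pairs.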

Related

How to split a nested list into a smaller nested list

So I have a nested list as my input (and the nested list is always square. ie. same number of rows as columns). I want to break this list up into another nested list of which the elements are just 2x2 "parts" of the original list.
For example, if my input was
[[1,2,3,4],
[5,6,7,8],
[9,10,11,12],
[13,14,15,16]]
my output should be
[[1,2,5,6], [3,4,7,8], [9,10,13,14], [11,12,15,16]]
Another example:
input:
[[1,2,3],
[5,6,7],
[9,10,11]]
output:
[[1,2,5,6],[3,7],[9,10],[11]]
I've tried making a nested for loop that goes through the first two columns and rows, makes that into a list, appends it to another list and then repeats the process, but I get an index-out-of-bounds error.
This is what I've done so far:
def get_2_by_2(map: List[List[int]]) -> int:
    i = 0
    j = 0
    lst_2d = []
    lst = []
    for row in range(i, min(i+2, len(map))):
        for column in range(j, min(j+2, len(map))):
            print(row,column)
            lst.append(map[row][column])
    lst_2d.append(lst)
    return lst_2d
Basically, this one only returns the first 2x2 block. I tried putting a while loop on the outside that increments i and j, with the loop condition depending on one of them, but that resulted in an index-out-of-bounds error.

You can iterate through the rows and columns in steps of 2, and slice the list of lists accordingly:
def get_2_by_2(matrix):
    output = []
    for row in range(0, len(matrix), 2):
        for col in range(0, len(matrix[0]), 2):
            output.append([i for r in matrix[row: row + 2] for i in r[col: col + 2]])
    return output
or with a nested list comprehension:
def get_2_by_2(matrix):
    return [
        [i for r in matrix[row: row + 2] for i in r[col: col + 2]]
        for row in range(0, len(matrix), 2)
        for col in range(0, len(matrix[0]), 2)
    ]
so that given:
m = [[1, 2, 3, 4],
[5, 6, 7, 8],
[9, 10, 11, 12],
[13, 14, 15, 16]]
get_2_by_2(m) returns:
[[1, 2, 5, 6], [3, 4, 7, 8], [9, 10, 13, 14], [11, 12, 15, 16]]
and that given:
m = [[1, 2, 3],
[5, 6, 7],
[9, 10, 11]]
get_2_by_2(m) returns:
[[1, 2, 5, 6], [3, 7], [9, 10], [11]]
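The same slicing approach extends to other block sizes. The sketch below is a hypothetical generalization (the name split_into_blocks and the size parameter are not from the answer above) and it assumes a non-empty matrix whose rows all have the same length.

```python
def split_into_blocks(matrix, size=2):
    """Flatten each size x size tile of `matrix`, left to right, top to bottom.

    Tiles on the right and bottom edges are shorter when the dimensions
    are not multiples of `size`, because slicing past the end is allowed.
    """
    blocks = []
    for row in range(0, len(matrix), size):
        for col in range(0, len(matrix[0]), size):
            blocks.append([value
                           for r in matrix[row:row + size]
                           for value in r[col:col + size]])
    return blocks


m = [[1, 2, 3],
     [5, 6, 7],
     [9, 10, 11]]
print(split_into_blocks(m))     # [[1, 2, 5, 6], [3, 7], [9, 10], [11]]
print(split_into_blocks(m, 3))  # [[1, 2, 3, 5, 6, 7, 9, 10, 11]]
```

Slices never raise an IndexError, which is why no bounds checks are needed for the partial tiles.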

Sorting a list of lists such that the last element of a list equals the first element of the next list

I have a list of lists which is
my_list=[[9, 10, 1], [1, 7, 5, 6, 11], [0, 4], [4, 2, 9]]
I want to sort this list such that it looks like this:
result=[[0, 4], [4, 2, 9],[9, 10, 1], [1, 7, 5, 6, 11]]
The conditions are:
1. The list should start with the element containing zero.
2. The last element of a list should be same as first element of the next list and so on.
3. The elements inside the sub-lists should be in the same order as the original list.
Thank you.

The fast solution is to build a dict that maps numbers to sublists based on the first element:
dct = {sublist[0]: sublist for sublist in my_list}
# {0: [0, 4], 9: [9, 10, 1], 1: [1, 7, 5, 6, 11], 4: [4, 2, 9]}
And then, starting with the number 0, look up in the dict the next sublist that needs to be added:
result = []
num = 0  # start with the sublist where the first element is 0
while True:
    try:
        # find the sublist that has `num` as the first element
        sublist = dct[num]
    except KeyError:
        # if there is no such sublist, we're done
        break
    # add the sublist to the result and update `num`
    result.append(sublist)
    num = sublist[-1]
This runs in linear O(n) time and gives the expected result:
[[0, 4], [4, 2, 9], [9, 10, 1], [1, 7, 5, 6, 11]]
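One thing to watch with the dict lookup is input that loops back on itself, for example [[0, 1], [1, 0]], where the lookup would never hit a KeyError. A possible variant, sketched here with a made-up function name chain_sublists, removes each sublist from the dict as it is used. It assumes, like the answer above, that every sublist is non-empty and that first elements are unique.

```python
def chain_sublists(sublists, start=0):
    """Order sublists so each one starts with the last element of the previous one.

    Hypothetical helper; assumes unique first elements and non-empty sublists.
    """
    by_first = {sub[0]: sub for sub in sublists}
    result = []
    num = start
    while num in by_first:
        # pop, so a sublist is used at most once and cycles terminate
        sub = by_first.pop(num)
        result.append(sub)
        num = sub[-1]
    return result


my_list = [[9, 10, 1], [1, 7, 5, 6, 11], [0, 4], [4, 2, 9]]
print(chain_sublists(my_list))
# [[0, 4], [4, 2, 9], [9, 10, 1], [1, 7, 5, 6, 11]]
```

Any sublists that cannot be reached from the starting value are simply left out of the result.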

You could check each permutation of your list to see whether it is a valid ordering. It might be possible to write a more efficient algorithm, but this one does not assume that there is exactly one possible solution.
from itertools import permutations
my_list=[[9, 10, 1], [1, 7, 5, 6, 11], [0, 4], [4, 2, 9]]
def sortCheck(a):
    if a[0][0] != 0:
        return False
    for i in range(0, len(a) - 1):
        if a[i][-1] != a[i+1][0]:
            return False
    return True
result_list = []
for permutation in permutations(my_list):
    if sortCheck(permutation):
        result_list.append(list(permutation))

Not a very nice implementation, but it works:
my_list=[[9, 10, 1], [1, 7, 5, 6, 11], [0, 4], [4, 2, 9]]
new_list = []
index = 0
while my_list:
    index = [item[0] for item in my_list].index(index)
    item = my_list[index]
    del my_list[index]
    new_list.append(item)
    index = item[-1]
print(new_list)
A ValueError is raised when no sublist is found that meets the criterion.

def sortList(list)
  hash = {}
  list.each do |value|
    hash[value[0]] = value
  end
  key = 0
  sorted = []
  list.each do |k|
    item = hash[key.to_i]
    key = item[-1]
    sorted << item
  end
  sorted
end

Finding a minimum value in a list of lists and returning that list

This code is supposed to iterate over the list of lists and return the entire list that contains the smallest value. I have already identified that it keeps returning the list at index[0], but I cannot figure out why. Any help, or even hints, would be greatly appreciated.
def list_with_min(list_of_lists):
    m = 0
    for i in range(len(list_of_lists)-1):
        list_i = list_of_lists[m]
        min_list = min(list_of_lists[m])
        if min_list < list_i[0]:
            m = i
    answer = list_of_lists[m]
    return answer
print(list_with_min([[9, 10, 15], [1, 8, 4], [-3, 7, 8]]))
# [9, 10, 15]--------> should be [-3, 7, 8]
print(list_with_min([[5], [9], [6], [2], [7], [10], [72]]))
# [5]----------------> should be [2]
print(list_with_min([[-2, 6, 9], [-9, 6, 9], [4, 8, 2], [5, -2]]))
# [-2, 6, 9]---------> should be [[-2, 6, 9], [5, -2]] (I assume two lists with the same minimum value should both be returned?)

You can provide a key to the function min, that is a function used for comparison. It turns out that here you want key to be the function min itself.
list_of_lists = [[9, 10, 15], [1, 8, 4], [-3, 7, 8]]
min(list_of_lists, key=min) # [-3, 7, 8]
This does not return multiple minima, but can be improved to do so.
list_of_lists = [[9, 10, 15], [1, -3, 4], [-3, 7, 8]]
min_value = min(map(min, list_of_lists))
[lst for lst in list_of_lists if min(lst) == min_value] # [[1, -3, 4], [-3, 7, 8]]

You can get this in one line with a list comprehension (I've added three, though, to help you work through the logic). It deals with duplicates differently, however:
#for each list return a list with the minimum and the list
mins_and_lists = [[min(_list), _list] for _list in lists]
#find the minimum one
min_and_list = min([[min(_list), _list] for _list in lists])
#parse the result of the minimum one list
minimum, min_list = min([[min(_list), _list] for _list in lists])
If you want to handle duplicate minimums by returning both, then:
dup_mins = [_list for _min, _list in mins_and_lists if _min == minimum]

EDIT: A more Pythonic way could be this:
def list_with_min(l):
    min_vals = [min(x) for x in l]
    return l[min_vals.index(min(min_vals))]
A bit bulky but it works...
l=[[9, 10, 15], [1, 8, 4], [-3, 7, 8]]
def list_with_min(l):
    m = min(l[0])
    for i in l[1:]:
        m = min(i) if min(i) < m else m
    for i in l:
        if m in i:
            return i
print(list_with_min(l))
Output:
[-3, 7, 8]

You could also use this:
min([ (min(a),a) for a in list_of_lists ])[1]

Your condition just makes no sense. You're checking
if min_list < list_i[0]:
which means, if the smallest value of list_i is less than the first value of list_i.
I don't think you'd ever want to compare to just list_i[0]. You need to store min_list across loops, and compare to that.
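To make that advice concrete, here is one way the original loop could be repaired. It is a sketch rather than the only possible fix: it keeps the function name from the question, tracks the smallest value seen so far in a variable (best_value, a name chosen here), and returns only the first list when several share the minimum.

```python
def list_with_min(list_of_lists):
    """Return the first sublist that contains the overall smallest value."""
    best_index = 0
    best_value = min(list_of_lists[0])
    # Visit every remaining sublist, including the last one.
    for i in range(1, len(list_of_lists)):
        current = min(list_of_lists[i])
        if current < best_value:
            best_value = current
            best_index = i
    return list_of_lists[best_index]


print(list_with_min([[9, 10, 15], [1, 8, 4], [-3, 7, 8]]))      # [-3, 7, 8]
print(list_with_min([[5], [9], [6], [2], [7], [10], [72]]))     # [2]
print(list_with_min([[-2, 6, 9], [-9, 6, 9], [4, 8, 2], [5, -2]]))  # [-9, 6, 9]
```

Note that the original range(len(list_of_lists)-1) also skipped the last sublist, which this version avoids.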

Finding unique solution puzzle python

x = np.array([[0,1,11],[0,2,11],[0,3,10],[0,4,10],[0,5,9],[0,6,9],[1,7,9],
[1,5,11],[1,6,11],[2,7,11],[2,8,10]])
I'm pretty new to this, so I'm going to refer to the parts of each entry as [element1, element2, element3].
I have an array as shown above, and I want to find a solution to this array.
It should satisfy the following conditions:
For first element 0:
it should have at least one solution from [0,1,11],[0,2,11],[0,3,10],[0,4,10],[0,5,9],[0,6,9]
For first element 1:
one from these: [1,7,9],[1,5,11],[1,6,11]
For first element 2:
and one from these: [2,7,11],[2,8,10]
such that the second and third elements are unique across the chosen entries (one entry with first element 0, one with first element 1 and one with first element 2).
A valid output could be:
[0,1,11] and [1,7,9] and [2,8,10]
An invalid output:
[0,1,11], [1,6,11], [2,8,10]
Here the third element of the first and second entries is the same.

If I understand correctly, you want to produce triplets from the given x array so that the first, second and third element are all unique in one triplet. Code to do that:
import itertools
x = [[0,1,11],[0,2,11],[0,3,10],[0,4,10],[0,5,9],[0,6,9],[1,7,9],
[1,5,11],[1,6,11],[2,7,11],[2,8,10]]
triplets = itertools.combinations(x,3)
for t in triplets:
    isGood = True
    for pos in range(3):
        if (t[0][pos] == t[1][pos] or t[0][pos] == t[2][pos] or t[1][pos] == t[2][pos]):
            isGood = False
    if (isGood):
        print(repr(t))
This produces the following output:
([0, 1, 11], [1, 7, 9], [2, 8, 10])
([0, 2, 11], [1, 7, 9], [2, 8, 10])
([0, 5, 9], [1, 6, 11], [2, 8, 10])
([0, 6, 9], [1, 5, 11], [2, 8, 10])
A more Pythonic solution that does the same in only 3 lines:
for t in itertools.combinations(x,3):
    if all(len(col) == len(set(col)) for col in zip(*t)):
        print(repr(t))
Insane one-liner:
print(''.join(repr(t) + '\n' for t in itertools.combinations(x,3) if all(len(col) == len(set(col)) for col in zip(*t))))
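Since a valid answer needs exactly one entry per first element, another option is to group the entries by first element and take one from each group with itertools.product, instead of testing every 3-combination. This is a sketch under that assumption; the function name solutions is made up here, and it relies on dicts keeping insertion order (Python 3.7+).

```python
from itertools import product

x = [[0, 1, 11], [0, 2, 11], [0, 3, 10], [0, 4, 10], [0, 5, 9], [0, 6, 9],
     [1, 7, 9], [1, 5, 11], [1, 6, 11], [2, 7, 11], [2, 8, 10]]


def solutions(rows):
    """Yield one row per first element such that 2nd and 3rd elements never repeat."""
    groups = {}
    for row in rows:
        groups.setdefault(row[0], []).append(row)
    # product() picks exactly one row from each group
    for pick in product(*groups.values()):
        seconds = {row[1] for row in pick}
        thirds = {row[2] for row in pick}
        if len(seconds) == len(pick) and len(thirds) == len(pick):
            yield pick


for pick in solutions(x):
    print(pick)
```

For this input it tries 6 * 3 * 2 = 36 candidates instead of the 165 combinations of 3 out of 11, and it yields the same four triplets.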

how to merge two sublists sharing any number in common?(2)

Given that:
list=[[1,2,3],[3,4,5],[5,6],[6,7],[9,10],[10,11]]
I have asked a similar question before, I have tried the code on
how to merge two sublists sharing any number in common?
but I am stuck in my code now.
I want to merge the sublists that share a common number,
e.g. [1,2,3] and [3,4,5] can merge to give [1,2,3,4,5] as they share a common number, 3.
In [[1,2,3],[3,4,5],[5,6]], although [1,2,3] and [3,4,5] share a common number, 3,
[3,4,5] and [5,6] also share a common number, 5, so I want all three of them to merge then gives
[1,2,3,4,5,6].
So for list,
my expected result is:
[[1,2,3,4,5,6,7],[9,10,11]]
I have tried the following code but don't know what is wrong, can anyone help?
s = map(set, list)
for i, si in enumerate(s):
    for j, sj in enumerate(s):
        if i != j and si & sj:
            s[i] = si | sj
            s[j] = set()
list=[list(el) for el in s if el]
print list
>>>[[5, 6, 7], [9, 10, 11]]

def merge_containing(input_list):
    merged_list = [set(input_list[0])]
    i = 0
    for sub_list in input_list:
        if not set(sub_list).intersection(set(merged_list[i])): # 1
            merged_list.append(set()) # 2
            i += 1 # 2
        merged_list[i].update(set(sub_list)) # 3
    return [sorted(sub_list) for sub_list in merged_list] # 4
mylist=[[1,2,3],[3,4,5],[5,6],[6,7],[9,10],[10,11]]
print(merge_containing(mylist))
Output:
[[1, 2, 3, 4, 5, 6, 7], [9, 10, 11]]
How does it work:
1. Check if the sub_list set shares any common member with the set at the current index of merged_list.
2. If it doesn't, add a new empty set to merged_list and increment the index.
3. Add the sub_list set to the set at the current index in merged_list.
4. Convert each set to a sorted list and return.

def merge(list_of_lists):
    number_set = set()
    for l in list_of_lists:
        for item in l:
            number_set.add(item)
    return sorted(number_set)
if __name__ == '__main__':
    list_of_lists = [[1,2,3],[3,4,5],[5,6],[6,7],[9,10],[10,11]]
    merged = merge(list_of_lists)
    print(merged)

I'm posting this as a new answer since the OP already accepted my other.
But as pointed out by @Eithos,
the input:
[[3,4], [1,2], [1,3]]
should return
[[1,2,3,4]]
and the input
[[1,2,3],[3,4,5],[5,6],[6,7],[9,10],[10,11],[65,231,140], [13,14,51]]
should return
[[1, 2, 3, 4, 5, 6, 7], [9, 10, 11], [13, 14], [51], [65], [140], [231]]
Here's my attempt:
from itertools import chain
def merge_containing(input_list):
    print(" input:", input_list)
    chain_list = sorted(set(chain(*input_list))) # 1
    print(" chain:", chain_list)
    return_list = []
    new_sub_list = []
    for i, num in enumerate(chain_list):
        try:
            if chain_list[i + 1] == chain_list[i] + 1: # 2
                new_sub_list.append(num)
            else:
                new_sub_list.append(num) # 3
                return_list.append(new_sub_list)
                new_sub_list = []
        except IndexError:
            new_sub_list.append(num) # 3
            return_list.append(new_sub_list)
    print('result:', return_list)
    return return_list
mylist = [[3,4], [1,2], [1,3]]
merge_containing(mylist)
print()
mylist = [[1,2,3],[3,4,5],[5,6],[6,7],[9,10],[10,11],[65,231,140], [13,14,51]]
merge_containing(mylist)
Output:
input: [[3, 4], [1, 2], [1, 3]]
chain: [1, 2, 3, 4]
result: [[1, 2, 3, 4]]
input: [[1, 2, 3], [3, 4, 5], [5, 6], [6, 7], [9, 10], [10, 11], [65, 231, 140], [13, 14, 51]]
chain: [1, 2, 3, 4, 5, 6, 7, 9, 10, 11, 13, 14, 51, 65, 140, 231]
result: [[1, 2, 3, 4, 5, 6, 7], [9, 10, 11], [13, 14], [51], [65], [140], [231]]
Explanation:
This one is a little hackier than the last one.
1. I use itertools.chain to flatten all the lists, and then I sort the result.
2. Then I check whether the current number is exactly 1 less than the next one.
3. If it is, I store it in new_sub_list; if not, I store it in new_sub_list, then store new_sub_list in return_list, and reset new_sub_list to an empty list.
Note the try/except IndexError: it avoids comparing the last item of the list with one that doesn't exist.

Well... I couldn't resist answering @f.rodrigues' last answer with one of my own.
I have to be honest though, this final version was heavily influenced by jonrsharpe's solution (the code went through various revisions, each one more efficient, until I realized his method was the only way to squeeze out the most juice) over here: Using sublists to create new lists where numbers don't repeat
Which made me wonder... why are we answering the same question over and over again (from the very same person)? This question, how to merge two sublists sharing any number in common? and the one with jonrsharpe's solution.
Anyway, this joins lists in the way outlined in his first question, but, like the solutions he already received over there, this one also works just as well for solving this problem.
sequence = [[1, 4, 9], [2, 3, 6], [4, 13, 50], [13, 23, 29], [2, 3, 7]]
def combineSequences(seq):
    for index, y in enumerate(seq):
        while True:
            for x in seq[index + 1:]:
                if any(i in x for i in seq[index]):
                    seq.remove(x)
                    y.extend(x)
                    break
            else:
                index += 1
                break
    return [sorted(set(l)) for l in seq]
sequence = [[1, 4, 9], [2, 3, 6], [4, 13, 50], [13, 23, 29], [2, 3, 7]]
print(combineSequences(sequence))
>>> [[1, 4, 9, 13, 23, 29, 50], [2, 3, 6, 7]]
sequence = [[3, 4], [1, 2], [1, 3]]
print(combineSequences(sequence))
>>> [[1, 2, 3, 4]]
This solution operates under a different assumption than the one I made earlier, just to clarify. This simply joins lists that have a common number. If the idea, however, was to only have them separated by intervals of 1, see my other answer.
That's it!
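For the "join lists that have a common number" reading, a union-find (disjoint-set) structure is another way to get the same grouping without removing items from a list while looping over it. The sketch below is not taken from any of the answers here; merge_sharing_values and its internals are names chosen for illustration, and it returns the groups sorted.

```python
def merge_sharing_values(sublists):
    """Merge sublists that are connected through shared values (union-find sketch)."""
    parent = {}

    def find(v):
        # Follow parent links to the representative, shortening the path as we go.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    for sub in sublists:
        for value in sub:
            parent.setdefault(value, value)
        # Union every value in the sublist with the first one.
        for value in sub[1:]:
            parent[find(value)] = find(sub[0])

    groups = {}
    for value in parent:
        groups.setdefault(find(value), []).append(value)
    return sorted(sorted(group) for group in groups.values())


print(merge_sharing_values([[1, 2, 3], [3, 4, 5], [5, 6], [6, 7], [9, 10], [10, 11]]))
# [[1, 2, 3, 4, 5, 6, 7], [9, 10, 11]]
print(merge_sharing_values([[3, 4], [1, 2], [1, 3]]))
# [[1, 2, 3, 4]]
```

Because values are linked through their representatives, the order of the input sublists does not affect which groups are produced.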

Okay. This solution may be difficult to grasp at first, but it's quite logical and succinct, IMO.
The list comprehension basically has two layers. The outer layer creates a separate list for each value of i (the outer index) that satisfies the condition that the value it points to in the list is not equal to the value at the previous index plus 1. So every numerical jump greater than one starts a new list within the outer list comprehension.
The math around the second (inner list comprehension) condition is a bit more complicated to explain, but essentially the condition makes sure that the inner list only begins counting from the point where the outer index is, and stops where there is once again a numerical jump greater than one.
Assuming an even more complicated input:
listVar=[[1,2,3],[3,4,5],[5,6],[6,7],[9,10],[10,11],[65,231,140], [13,14,51]]
# Flattens the lists into one sorted list with no duplicates.
l = sorted(set([x for b in listVar for x in b]))
lGen = range(len(l))
print([
    [l[i2] for i2 in lGen if l[i2] + i == l[i] + i2]
    for i in lGen if l[i] != l[i-1] + 1
])
>>> [[1, 2, 3, 4, 5, 6, 7], [9, 10, 11], [13, 14], [51], [65], [140], [231]]
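The condition l[i2] + i == l[i] + i2 amounts to saying that value minus index is constant within a run of consecutive numbers. The same observation can be used with itertools.groupby, which may be easier to read. This is a sketch of that alternative, with consecutive_runs as a made-up name, not a rewrite of the comprehension above.

```python
from itertools import groupby


def consecutive_runs(nested):
    """Flatten, deduplicate and sort, then split into runs of consecutive integers."""
    flat = sorted({x for sub in nested for x in sub})
    runs = []
    # Within a run of consecutive numbers, value - index stays the same.
    for _, group in groupby(enumerate(flat), key=lambda pair: pair[1] - pair[0]):
        runs.append([value for _, value in group])
    return runs


listVar = [[1, 2, 3], [3, 4, 5], [5, 6], [6, 7], [9, 10], [10, 11],
           [65, 231, 140], [13, 14, 51]]
print(consecutive_runs(listVar))
# [[1, 2, 3, 4, 5, 6, 7], [9, 10, 11], [13, 14], [51], [65], [140], [231]]
```

This makes a single pass over the sorted values, whereas the nested comprehension rescans the whole list for every run.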
