Remove sublist duplicates including reversed

For example, I have the following list:
list = [['1', '2'], ['1', '3'], ['1', '4'], ['1', '5'], ['2', '1'], ['4', '1'], ['2', '6']]
I want to check whether a sublist has a reversed counterpart within the same list (i.e. ['1', '2'] matches ['2', '1']) and, if so, remove the mirrored one from the list.
The final list should look like:
list = [['1', '2'], ['1', '3'], ['1', '4'], ['1', '5'], ['2', '6']]
This is what I tried:
for i in range(len(list)):
    if list[i] == list[i][::-1]:
        print("Match found")
        del list[i][::-1]
print(list)
But in the end I get the same list as the original. I am not sure if my matching condition is correct.

You could iterate over the elements of the list and use a set to keep track of those seen so far. A set makes the membership check cheap (average O(1), versus O(n) for a list), but you'll need to work with tuples, since lists aren't hashable. Then keep an item only if neither the tuple itself nor its reverse has been seen (if you only want to drop items whose reverse has been seen, the check is just if tuple(reversed(t)) in s):
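l = [['1', '2'], ['1', '3'], ['1', '4'], ['1', '5'], ['2', '1'], ['4', '1'], ['2', '6']]
s = set()  # tuples seen so far
out = []
for i in l:
    t = tuple(i)
    if t in s or tuple(reversed(t)) in s:
        continue
    s.add(t)
    out.append(i)
print(out)
# [['1', '2'], ['1', '3'], ['1', '4'], ['1', '5'], ['2', '6']]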
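A variant of the same idea, as a minimal sketch: instead of two membership checks, store one canonical key per pair, so a tuple and its mirror map to the same entry. The function name drop_mirrored is made up for illustration, and the sketch assumes the sublist elements can be compared with each other (true for the strings here).

```python
def drop_mirrored(pairs):
    """Keep the first occurrence of each sublist, treating a sublist and its reverse as equal."""
    seen = set()
    kept = []
    for sub in pairs:
        t = tuple(sub)
        key = min(t, t[::-1])  # a tuple and its mirror produce the same key
        if key not in seen:
            seen.add(key)
            kept.append(sub)
    return kept

data = [['1', '2'], ['1', '3'], ['1', '4'], ['1', '5'], ['2', '1'], ['4', '1'], ['2', '6']]
print(drop_mirrored(data))
# [['1', '2'], ['1', '3'], ['1', '4'], ['1', '5'], ['2', '6']]
```

Unlike the frozenset or sorted() approaches, this only merges exact reversals, which matters once sublists have more than two elements.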

lists = [['1', '2'], ['1', '3'], ['1', '4'], ['1', '5'], ['2', '1'], ['4', '1'], ['2', '6']]
for x in lists:
    z = x[::-1]
    if z in lists:
        lists.remove(z)
Explanation: while looping over lists, reverse each element and store it in z. If z exists in lists, remove it with remove().
The problem with your solution is that you compare the element at index i with its own reverse. That is only true for a sublist that reads the same in both directions, which never happens in your data, so nothing is removed and you get the same list back.

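Approach 1:
List = [['1', '2'], ['1', '3'], ['1', '4'], ['1', '5'], ['2', '1'], ['4', '1'], ['2', '6']]
new_list = []
for l in List:
    # skip l if it, or its reverse, has already been kept
    if l not in new_list and l[::-1] not in new_list:
        new_list.append(l)
print(new_list)
Approach 2:
You can also try it like this (a frozenset ignores order, so any reordering of a sublist counts as a duplicate):
seen = set()
print([x for x in List if frozenset(x) not in seen and not seen.add(frozenset(x))])
# [['1', '2'], ['1', '3'], ['1', '4'], ['1', '5'], ['2', '6']]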

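my_list = [['1', '2'], ['1', '3'], ['1', '4'], ['1', '5'], ['2', '1'], ['4', '1'], ['2', '6']]
# sorted() returns lists, which are not hashable, so convert to tuples before building the set
# note: a set does not preserve the original order
my_list = [list(t) for t in set(tuple(sorted(l)) for l in my_list)]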

This is similar to the solution by @Mehul Gupta, but I think their solution traverses the list twice on a match: once for checking and once for removing. Instead, we could look up the index of the reversed sublist and pop it directly:
the_list = [['1', '2'], ['1', '3'], ['1', '4'], ['1', '5'], ['2', '1'], ['4', '1'], ['2', '6']]
for sub_list in the_list:
    try:
        idx = the_list.index(sub_list[::-1])
    except ValueError:
        continue
    else:
        the_list.pop(idx)
print(the_list)
# [['1', '2'], ['1', '3'], ['1', '4'], ['1', '5'], ['2', '6']]
This works because it is easier to ask for forgiveness than permission.
Note: Removing elements while looping is generally not a good idea, but for this specific problem it does no harm. In fact, it helps, because we do not check the mirrored sublist again; we have already removed it.
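To see why removing while looping is usually discouraged, here is a small sketch unrelated to the sublist data (the function names are made up for illustration). Removing an item shifts the later items left, so the loop skips the element that slides into the freed position.

```python
def remove_twos_in_place(nums):
    for n in nums:          # iterating over the list that is being mutated
        if n == 2:
            nums.remove(n)  # later items shift left, so the next one is skipped
    return nums

def remove_twos_over_copy(nums):
    for n in nums[:]:       # iterate over a shallow copy instead
        if n == 2:
            nums.remove(n)
    return nums

print(remove_twos_in_place([1, 2, 2, 3]))   # [1, 2, 3]  (one 2 survives)
print(remove_twos_over_copy([1, 2, 2, 3]))  # [1, 3]
```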

As I wrote in a comment, never use list (or any other built-in name) as a variable name:
L = [['1', '2'], ['1', '3'], ['1', '4'], ['1', '5'], ['2', '1'], ['4', '1'], ['2', '6']]
Have a look at your code:
for i in range(len(L)):
    if L[i] == L[i][::-1]:
        print("Match found")
        del L[i][::-1]
There are two issues. First, you compare L[i] with L[i][::-1], but you want to compare L[i] with L[j][::-1] for any j != i. Second, you try to delete elements of a list while iterating over it. If you delete an element, the list gets shorter and the loop index eventually goes out of the bounds of the list:
>>> L = [1,2,3]
>>> for i in range(len(L)):
...     del L[i]
...
Traceback (most recent call last):
  ...
IndexError: list assignment index out of range
To fix the first issue, you can iterate twice over the elements: for each element, is there another element that is the reverse of the first? To fix the second issue, you have two options: 1. build a new list; 2. proceed in reverse order, to delete first the last indices.
First version:
new_L = []
for i in range(len(L)):
    for j in range(i):  # look only at earlier elements, so the first occurrence is kept
        if L[i] == L[j][::-1]:
            print("Match found")
            break
    else: # no break
        new_L.append(L[i])
print(new_L)
Second version:
for i in range(len(L)-1, -1, -1):
    for j in range(0, i):
        if L[i] == L[j][::-1]:
            print("Match found")
            del L[i]
            break  # after the deletion, L[i] is a different element or no longer exists
print(L)
(For a better time complexity, see @yatu's answer.)
For a one-liner, you can use the functools module:
>>> L = [['1', '2'], ['1', '3'], ['1', '4'], ['1', '5'], ['2', '1'], ['4', '1'], ['2', '6']]
>>> import functools
>>> functools.reduce(lambda acc, x: acc if x[::-1] in acc else acc + [x], L, [])
[['1', '2'], ['1', '3'], ['1', '4'], ['1', '5'], ['2', '6']]
The logic is the same as the logic of the first version.

You can also try this:
l = [['1', '2'], ['1', '3'], ['1', '4'], ['1', '5'], ['2', '1'], ['4', '1'], ['2', '6']]
res = []
for sub_list in l:
    if sub_list[::-1] not in res:
        res.append(sub_list)
print(res)
Output:
[['1', '2'], ['1', '3'], ['1', '4'], ['1', '5'], ['2', '6']]

Related

Find all possible variants of max pair of 2

Given a string of numbers like 123456, I want to find all the possibilities they can be paired in by 2 or by itself. For example, from the string 123456 I would like to get the following:
12 3 4 5 6, 12 34 5 6, 1 23 4 56, etc.
The nearest I was able to come to was this:
strr = list("123456")
x = list("123456")
for i in range(int(len(strr)/2)):
    newlist = []
    for j in range(i):
        newlist.append(x[j])
    newlist.append(x[i] + x[i+1])
    for j in range(len(x))[i+2:]:
        newlist.append(x[j])
    x = newlist.copy()
    b = x.copy()
    for f in range(len(b))[i:]:
        if f == i:
            print(b)
            continue
        b[f] = b[f - 1][1] + b[f]
        b[f - 1] = b[f - 1][0]
        print(b)
This code gives the output:

It's easy to solve this problem with a recursive generator. This is similar to how you solve change-making problems, just here we have only two "coins", either two characters together, or one character at a time. The total change we're trying to make is the length of the input string. The fact that the characters are digits in a numeric string is irrelevant.
def singles_and_pairs(string):
    if len(string) <= 1: # base case
        yield list(string) # yield either [] or [string] and then quit
        return
    for result in singles_and_pairs(string[:-1]): # first recursion
        result.append(string[-1:])
        yield result
    for result in singles_and_pairs(string[:-2]): # second recursion
        result.append(string[-2:])
        yield result
If you plan on running this on large input strings, you might want to add memoization, since the recursive calls recalculate the same results quite often.
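One possible way to add that memoization, as a rough sketch rather than the answerer's own code: functools.lru_cache cannot cache a generator usefully, so this hypothetical variant (singles_and_pairs_cached) returns immutable tuples instead of yielding mutable lists.

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def singles_and_pairs_cached(string):
    """Return a tuple of tuples, one per way to cut `string` into pieces of length 1 or 2."""
    if len(string) <= 1:
        # "" has one (empty) split; a single character has one split containing it
        return ((string,),) if string else ((),)
    ends_with_single = tuple(r + (string[-1:],) for r in singles_and_pairs_cached(string[:-1]))
    ends_with_pair = tuple(r + (string[-2:],) for r in singles_and_pairs_cached(string[:-2]))
    return ends_with_single + ends_with_pair

for split in singles_and_pairs_cached("1234"):
    print(" ".join(split))
```

Caching only avoids recomputing the shared prefixes; the number of results still grows like the Fibonacci numbers with the length of the input.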

Phew, this one took me some time to get right, but it seems to finally work (edited for prettier ordering):
def max_2_partitions(my_string):
    if not my_string:
        return [[]]
    if len(my_string) == 1:
        return [[my_string]]
    ret = []
    for i in range(len(my_string)):
        for l in max_2_partitions(my_string[:i] + my_string[i + 1:]):
            li = sorted([my_string[i]]+l, key = lambda x: (len(x),x))
            if li not in ret:
                ret.append(li)
        for j in range(i+1,len(my_string)):
            for l in max_2_partitions(my_string[:i]+my_string[i+1:j]+my_string[j+1:]):
                li = sorted([my_string[i] + my_string[j]] + l, key = lambda x: (len(x),x))
                if li not in ret:
                    ret.append(li)
    return sorted(ret, key=lambda x: (-len(x),x))
Example:
print(max_2_partitions("1234"))
# [['1', '2', '3', '4'], ['1', '2', '34'], ['1', '3', '24'], ['1', '4', '23'], ['2', '3', '14'], ['2', '4', '13'], ['3', '4', '12'], ['12', '34'], ['13', '24'], ['14', '23']]

12 lines of code, full permutations:
You can first create permutations of the string, and then add spacing:
from itertools import permutations
def solution(A):
    result = []
    def dfs(A,B):
        if not B:
            result.append(A)
        else:
            for i in range(1,min(2,len(B))+1):
                dfs(A+[B[:i]],B[i:])
    for x in permutations(A):
        dfs([],''.join(x))
    return result
print(f"{solution('123') = }")
# solution('123') = [['1', '2', '3'], ['1', '23'], ['12', '3'], ['1', '3', '2'], ['1', '32'], ['13', '2'], ['2', '1', '3'], ['2', '13'], ['21', '3'], ['2', '3', '1'], ['2', '31'], ['23', '1'], ['3', '1', '2'], ['3', '12'], ['31', '2'], ['3', '2', '1'], ['3', '21'], ['32', '1']]

how to split a list every nth item

I am trying to split a list every 5th item, then delete the next two items ('nan'). I have attempted to use List[:5], but that does not seem to work in a loop. The desired output is: [['1','2','3','4','5'],['1','2','3','4','5'],['1','2','3','4','5'],['1','2','3','4','5']]
List = ['1','2','3','4','5','nan','nan','1','2','3','4','5','nan','nan','1','2','3','4','5','nan','nan','1','2','3','4','5','nan','nan']
for i in List:
    # split first 5 items
    # delete next two items
# Desired output:
# [['1','2','3','4','5'],['1','2','3','4','5'],['1','2','3','4','5'],['1','2','3','4','5']]

There are lots of ways to do this. I recommend stepping by 7 and then slicing off the first 5.
data = ['1','2','3','4','5','nan','nan','1','2','3','4','5','nan','nan','1','2','3','4','5','nan','nan','1','2','3','4','5','nan','nan']
# Step by 7 and keep the first 5
chunks = [data[i:i+5] for i in range(0, len(data), 7)]
print(*chunks, sep='\n')
Output:
['1', '2', '3', '4', '5']
['1', '2', '3', '4', '5']
['1', '2', '3', '4', '5']
['1', '2', '3', '4', '5']
Reference: Split a python list into other “sublists”...
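The step-and-slice idea generalizes to any "keep n, skip m" pattern. A minimal sketch, where keep_then_skip and its parameter names are made up for illustration:

```python
def keep_then_skip(items, keep, skip):
    """Return chunks of `keep` items, dropping the `skip` items that follow each chunk."""
    step = keep + skip
    return [items[i:i + keep] for i in range(0, len(items), step)]

data = ['1', '2', '3', '4', '5', 'nan', 'nan'] * 4
print(keep_then_skip(data, keep=5, skip=2))
# [['1', '2', '3', '4', '5'], ['1', '2', '3', '4', '5'], ['1', '2', '3', '4', '5'], ['1', '2', '3', '4', '5']]
```

Like the original, it relies purely on positions, so it only gives the desired result when the input really follows the fixed pattern.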

WARNING: make sure the list follows the rule you described: every 5 items are followed by 2 'nan' entries.
This loop adds the first 5 items as a list, then deletes the first 7 items.
lst = ['1','2','3','4','5','nan','nan','1','2','3','4','5','nan','nan','1','2','3','4','5','nan','nan','1','2','3','4','5','nan','nan']
output = []
while True:
    if len(lst) <= 0:
        break
    output.append(lst[:5])
    del lst[:7]
print(output) # [['1', '2', '3', '4', '5'], ['1', '2', '3', '4', '5'], ['1', '2', '3', '4', '5'], ['1', '2', '3', '4', '5']]

List=['1','2','3','4','5','nan','nan','1','2','3','4','5','nan','nan','1','2','3','4','5','nan','nan','1','2','3','4','5','nan','nan']
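new_list = list()
for k in range(len(List)//7):
    new_list.append(List[k*7:k*7+5])
# append any leftover items after the last full group of 7 (at most 5 of them)
rest = List[len(List)//7*7:][:5]
if rest:
    new_list.append(rest)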

A straightforward solution in case the list doesn't follow the rule you mentioned but you always want to split the sequence between the 'nan' entries:
result, temp = [], []
for item in lst:
    if item != 'nan':
        temp.append(item)
    elif temp:
        result.append(list(temp))
        temp = []
if temp:  # keep a trailing group that is not followed by 'nan'
    result.append(temp)

Using itertools.groupby would also support chunks of different lengths:
from itertools import groupby
[list(v) for k, v in groupby(List, key='nan'.__ne__) if k]

I guess there is a more Pythonic way to do the same, but:
result = []
while len(List) >= 5:
    result.append(List[0:5])
    del List[0:5]
    del List[0:2]
This results in: [['1', '2', '3', '4', '5'], ['1', '2', '3', '4', '5'], ['1', '2', '3', '4', '5'], ['1', '2', '3', '4', '5']]

mainlist=[]
sublist=[]
count=0
for i in List:
    if i != "nan":
        # collect the next item of the current group of 5
        sublist.append(i)
        count += 1
        if count == 5:
            # group complete; the following 'nan' items are skipped by the check above
            mainlist.append(sublist)
            count = 0
            sublist = []

Generally numpy.split(...) will do any kind of custom splitting for you. Some reference:
https://docs.scipy.org/doc/numpy/reference/generated/numpy.split.html
And the code:
import numpy as np
lst = ['1','2','3','4','5','nan','nan','1','2','3','4','5','nan','nan','1','2','3','4','5','nan','nan','1','2','3','4','5','nan','nan']
ind=np.ravel([[i*7+5, (i+1)*7] for i in range(len(lst)//7)])
lst2=np.split(lst, ind)[:-1:2]
print(lst2)
Outputs:
[array(['1', '2', '3', '4', '5'], dtype='<U3'), array(['1', '2', '3', '4', '5'], dtype='<U3'), array(['1', '2', '3', '4', '5'], dtype='<U3'), array(['1', '2', '3', '4', '5'], dtype='<U3')]

I like the slice answers.
Here is my 2 cents.
# changed var name away from var type
myList = ['1','2','3','4','5','nan','nan','1','2','3','4','10','nan','nan','1','2','3','4','15','nan','nan','1','2','3','4','20','nan','nan']
newList = [] # declare new list of lists to create
addItem = [] # declare temp list
myIndex = 0 # declare temp counting variable
for i in myList:
    myIndex += 1
    if myIndex == 6:
        pass  # do nothing
    elif myIndex == 7:  # add sub list to new list and reset variables
        if len(addItem) > 0:
            newList.append(list(addItem))
        addItem = []
        myIndex = 0
    else:
        addItem.append(i)
#output
print(newList)

'IndexError: list index out of range' during assignment

j = [['4', '5'], ['1', '1'], ['1', '5'], ['3', '4'], ['3', '1']]
k = [['5', '2'], ['4', '2'], ['2', '4'], ['3', '3'], ['4', '3']]
t = []
indexPointer = 0
for coord in j:
    for number in coord:
        t[indexPointer][0] = number
        indexPointer += 1
indexPointer = 0
for coord in k:
    for number in coord:
        t[indexPointer][1] = number
        indexPointer += 1
print(t)
This should output:
[[4,5],[5,2],[1,4],[1,2],[1,2],[5,4],[3,3],[4,3],[3,4],[1,3]]
Instead I get:
    t[indexPointer][0] = number
IndexError: list index out of range
How can I solve this? I've tried to find a way but without any luck.
Edit:
I didn't include all the code necessary. It has been updated.

You can't index into an empty list, since there's nothing there. You'll either have to append things to it, or prefill it with empty values, e.g.:
t = [None] * 10
But even this won't exactly work, since you expect t to be two-dimensional. You may want to try making t a defaultdict, like so:
from collections import defaultdict
t = defaultdict(dict)
t[1][0] = 'a'
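If you would rather keep a plain list, one option is to prefill it with one two-element row per number before assigning. This is a sketch under the assumption that j and k contain the same total number of values; index_pointer is just the question's indexPointer renamed.

```python
j = [['4', '5'], ['1', '1'], ['1', '5'], ['3', '4'], ['3', '1']]
k = [['5', '2'], ['4', '2'], ['2', '4'], ['3', '3'], ['4', '3']]

# One [None, None] row per number; the comprehension makes every row a separate list
# (writing [[None, None]] * n would repeat the same row object n times).
t = [[None, None] for _ in range(sum(len(coord) for coord in j))]

index_pointer = 0
for coord in j:
    for number in coord:
        t[index_pointer][0] = number
        index_pointer += 1

index_pointer = 0
for coord in k:
    for number in coord:
        t[index_pointer][1] = number
        index_pointer += 1

print(t)
# [['4', '5'], ['5', '2'], ['1', '4'], ['1', '2'], ['1', '2'], ['5', '4'], ['3', '3'], ['4', '3'], ['3', '4'], ['1', '3']]
```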

Why wouldn't this be out of range?
Your t variable is just an empty one-dimensional list, which you're trying to access as if it were two-dimensional.
I think you're just trying to add everything that is in j to t? In that case you could do something like this (after import itertools): t = list(itertools.chain(*j))
Edit: Just noticed each element is in its own list: t = [[x] for x in itertools.chain(*j)]

I would recommend the code posted by Pythonista, but to adjust your code to make it work:
j = [['4', '5'], ['1', '1'], ['1', '5'], ['3', '4'], ['3', '1']]
t = []
for coord in j:
    for number in coord:
        t.append([number])
print(t)
#[['4'], ['5'], ['1'], ['1'], ['1'], ['5'], ['3'], ['4'], ['3'], ['1']]
As you're looping through each element the .append list method is tacking on [number] to the end of your list t.
You can accomplish this same nested loop code using a list comprehension:
t = [[number] for coord in j for number in coord]
Update:
Since you've updated your question:
You should consider the zip function in this situation.
import itertools
t = list(itertools.chain(*zip(j, k)))
If you want to keep using a for loop, you can use the list .extend method:
j = [['4', '5'], ['1', '1'], ['1', '5'], ['3', '4'], ['3', '1']]
k = [['5', '2'], ['4', '2'], ['2', '4'], ['3', '3'], ['4', '3']]
t = []
for coord in zip(j,k):
    t.extend(coord)
print(t)
#[['4', '5'], ['5', '2'], ['1', '1'], ['4', '2'], ['1', '5'], ['2', '4'], ['3', '4'], ['3', '3'], ['3', '1'], ['4', '3']]
And as a comprehension:
t=[i for coord in zip(j,k) for i in coord]

how to create a sub list for a specific string in a nested list

I have Python nested list that I'm trying to organize and eventually count number of occurrences. The nested list looks like:
[['22', '1'], ['21', '15'], ['11', '3'], ['31', '4'], ['41', '13'],...]
The first thing I want to do is create a sublist that only contains the entries whose second item is '1'. I was able to do this with the following command:
Subbasin_1 = []
Subbasin_1.append([x for x in Subbasins_Imp if x[1] == '1'])
print Subbasin_1
Giving these results, which are correct:
[['21', '1'], ['21', '1'], ['21', '1'], ['21', '1'], ['22', '1'],...]
Now I want to create another sublist that gives me all the '21' entries in Subbasin_1. When I use the same line of script with the appropriate items changed, I get an empty list. Not sure what is going on...?
OS_Count1 = []
OS_Count1.append([x for x in Subbasin_1 if x[0] == '21'])
print OS_Count1
Result is [[]] ??? What's the difference between the two?
Thanks for any help...

I don't believe that your
[['21', '1'], ['21', '1'], ['21', '1'], ['21', '1'], ['22', '1'],...]
line could be produced by the code you gave. Your Subbasin_1.append line appends a list to the empty list Subbasin_1, so you should get something like
[[['22', '1'], ['21', '1']]]
with one extra level of nesting.
If you avoid the unnecessary construction of an empty list + append, you should get what you want:
>>> Subbasins_Imp = [['22', '1'], ['21', '15'], ['11', '3'], ['31', '4'], ['41', '13'], ['21', '1']]
>>>
>>> Subbasin_1 = [x for x in Subbasins_Imp if x[1] == '1']
>>> print Subbasin_1
[['22', '1'], ['21', '1']]
>>> OS_Count1 = [x for x in Subbasin_1 if x[0] == '21']
>>> print OS_Count1
[['21', '1']]
Alternatively, you could simply replace append by extend. I don't recommend this, but it might help you to see what's happening:
>>> Subbasins_Imp = [['22', '1'], ['21', '15'], ['11', '3'], ['31', '4'], ['41', '13'], ['21', '1']]
>>>
>>> Subbasin_1 = []
>>> Subbasin_1.extend([x for x in Subbasins_Imp if x[1] == '1'])
>>> print Subbasin_1
[['22', '1'], ['21', '1']]
>>>
>>> OS_Count1 = []
>>> OS_Count1.extend([x for x in Subbasin_1 if x[0] == '21'])
>>> print OS_Count1
[['21', '1']]

Your list comprehension [x for x in Subbasins_Imp if x[1] == '1'] creates a list by itself, which means when you append that list to Subbasin_1, you end up with a doubly nested list.
Compare:
sub_imp = [['22', '1'], ['21', '15'], ['11', '3'], ['31', '4'], ['41', '13']]
sub_1 = [x for x in sub_imp if x[1] == '1']
sub_2 = []
sub_2.append([x for x in sub_imp if x[1] == '1'])
print(sub_1)
print(sub_2)

Running your code I obtained a triple nested list....
Sub = [[['21','1'],....]]
Instead of doing:
Subbasin_1 = []
Subbasin_1.append([x for x in Sub if x[1]=='1'])
Simply use the list comprehension:
Subbasin_1 = [x for x in Sub if x[1] == '1']
This will give you the result you are expecting.

There is no difference, which implies that Subbasin_1 might be empty at the time of the call or doesn't contain the data you think it does. It might also be that Subbasin_1 is nested 3 layers deep, not 2.

How could i refresh a list once an item has been removed from a list within a list in python

This is quite complicated, but I would like to be able to refresh a larger list once an item has been taken out of a mini list within the bigger list.
listA = ['1','2','3','4','5','6','6','8','9','5','3','7']
I used the code below to split it into lists of three:
split = [listA[i:(i+3)] for i in range(0, len(listA) - 1, 3)]
print(split)
# [['1','2','3'],['4','5','6'],['6','8','9'],['5','3','7']]
split = [['1','2','3'],['4','5','6'],['6','8','9'],['5','3','7']]
If I delete #3 from the first list, split will now be:
del split[0][-1]
split = [['1','2'],['4','5','6'],['6','8','9'],['5','3','7']]
After #3 has been deleted, I would like to be able to refresh the list so that it looks like:
split = [['1','2','4'],['5','6','6'],['8','9','5'],['3','7']]
Thanks in advance.

Not sure how big this list is getting, but you would need to flatten it and recalculate it:
>>> listA = ['1','2','3','4','5','6','6','8','9','5','3','7']
>>> split = [listA[i:(i+3)] for i in range(0, len(listA) - 1, 3)]
>>> split
[['1', '2', '3'], ['4', '5', '6'], ['6', '8', '9'], ['5', '3', '7']]
>>> del split[0][-1]
>>> split
[['1', '2'], ['4', '5', '6'], ['6', '8', '9'], ['5', '3', '7']]
>>> listA = sum(split, []) # <- flatten split list back to 1 level
>>> listA
['1', '2', '4', '5', '6', '6', '8', '9', '5', '3', '7']
>>> split = [listA[i:(i+3)] for i in range(0, len(listA) - 1, 3)]
>>> split
[['1', '2', '4'], ['5', '6', '6'], ['8', '9', '5'], ['3', '7']]

Just recreate the single list from your nested lists, then re-split.
You can join the lists, assuming they are only one level deep, with something like:
rejoined = [element for sublist in split for element in sublist]
There are no doubt fancier ways, or single-liners that use itertools or some other library, but don't overthink it. If you're only talking about a few hundred or even a few thousand items this solution is quite good enough.
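Put together, the rejoin-then-resplit step might look like the following sketch. The helper name resplit and the size parameter are illustrative, not from the question.

```python
def resplit(nested, size=3):
    """Flatten a one-level nested list and cut it into chunks of `size` again."""
    flat = [element for sublist in nested for element in sublist]
    return [flat[i:i + size] for i in range(0, len(flat), size)]

split = [['1', '2'], ['4', '5', '6'], ['6', '8', '9'], ['5', '3', '7']]
print(resplit(split))
# [['1', '2', '4'], ['5', '6', '6'], ['8', '9', '5'], ['3', '7']]
```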

I need this for turning of cards in the deck in a solitaire game.
You can deal your cards using itertools.groupby() with a good key function:
import itertools
def group_key(x, n=3, flag=[0], counter=itertools.count(0)):
    if next(counter) % n == 0:
        flag[0] = flag[0] ^ 1
    return flag[0]
^ is the bitwise XOR operator; here it flips the flag between 0 and 1 every n calls. The flag and the counter live in mutable default arguments so that their state persists between calls. Because that state is shared, a new run only starts on a fresh group if the previous runs consumed a multiple of n items in total.
Example:
>>> deck = ['1', '2', '3', '4', '5', '6', '6', '8', '9', '5', '3', '7']
>>> for k,g in itertools.groupby(deck, key=group_key):
...     print(list(g))
['1', '2', '3']
['4', '5', '6']
['6', '8', '9']
['5', '3', '7']
Now let's say you've used card '9' and '8', so your new deck looks like:
>>> deck = ['1', '2', '3', '4', '5', '6', '6', '5', '3', '7']
>>> for k,g in itertools.groupby(deck, key=group_key):
...     print(list(g))
['1', '2', '3']
['4', '5', '6']
['6', '5', '3']
['7']

Build an object that contains a list and tracks when the list is altered (probably by controlling writes to it), then have the object redo its own split every time the data is altered and save the split list to a member of the object.
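A minimal sketch of such an object, assuming all changes go through one method. The class name SplitList and the method remove_at are hypothetical, chosen only for this illustration.

```python
class SplitList:
    """Hypothetical wrapper: holds a flat list and recomputes the chunks after each change."""

    def __init__(self, items, size=3):
        self._items = list(items)
        self._size = size
        self.split = []
        self._resplit()

    def _resplit(self):
        # Rebuild the chunked view from the flat data
        self.split = [self._items[i:i + self._size]
                      for i in range(0, len(self._items), self._size)]

    def remove_at(self, index):
        """Delete the item at a flat index, then refresh the split view."""
        del self._items[index]
        self._resplit()

deck = SplitList(['1', '2', '3', '4', '5', '6', '6', '8', '9', '5', '3', '7'])
deck.remove_at(2)  # removes '3'
print(deck.split)
# [['1', '2', '4'], ['5', '6', '6'], ['8', '9', '5'], ['3', '7']]
```

Keeping the flat list as the single source of truth means the split view can never drift out of sync, at the cost of rebuilding it on every change.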
