Removing repeated sub-lists from a list

I have a list as follows:
l = [['A', 'C', 'D'], ['B', 'E'], ['A', 'C', 'D'], ['A', 'C', 'D'], ['B', 'E'], ['F']]
The result should be:
[['A', 'C', 'D'], ['B', 'E'], ['F']]
The order of elements is also not important.
I tried as:
print list(set(l))
Does numpy have a better way?

Lists are not a "hashable" type and cannot be members of a set.
Frozen sets can, so we first convert to those (which also makes the sublists order-insensitive), and later convert back to lists.
print map(list, set(map(frozenset, l)))
or if you prefer comprehensions,
print [list(x) for x in {frozenset(x) for x in l}]
I doubt numpy offers any "better" (for some definition of better) way.
This way is IMO the clearest and most pythonic.
The reason lists cannot be part of sets is that they are mutable, so the hash now is different from the hash after they are changed; being in a hash-based set would make for confusing behavior.
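For readers on Python 3, where print is a function and map returns an iterator, here is a minimal sketch of the same idea. It swaps frozenset for tuple, which is a substitution and not part of the answer above: tuples are also hashable, but they keep the order of, and any repeats inside, each sublist. The helper name dedupe_sublists is hypothetical, and the sketch relies on dicts keeping insertion order (Python 3.7+).

```python
def dedupe_sublists(nested):
    """Return the distinct sublists of `nested`, in first-seen order."""
    # Tuples are hashable, so they can serve as dict keys.
    # dict keys are unique and (Python 3.7+) keep insertion order.
    unique = dict.fromkeys(tuple(sub) for sub in nested)
    return [list(t) for t in unique]


l = [['A', 'C', 'D'], ['B', 'E'], ['A', 'C', 'D'], ['A', 'C', 'D'], ['B', 'E'], ['F']]
result = dedupe_sublists(l)
print(result)
```

Unlike the frozenset version, this treats ['A', 'C'] and ['C', 'A'] as different sublists, which may or may not be what you want.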

#!/usr/bin/python
l1 = [['A', 'C', 'D'], ['B', 'E'], ['A', 'C', 'D'], ['A', 'C', 'D'], ['B', 'E'], ['F']]
l2=[]
for l in l1:
    if l not in l2:
        l2.append(l)
print l2
OUTPUT
[['A', 'C', 'D'], ['B', 'E'], ['F']]

An easy and straightforward approach that avoids converting a non-hashable type to a hashable one and back (which has a performance cost) is to sort the list and use itertools.groupby.
Of course, the original order won't be maintained, but the OP explicitly stated that it is not a requirement.
>>> l = [['A', 'C', 'D'], ['B', 'E'], ['A', 'C', 'D'], ['A', 'C', 'D'], ['B', 'E'], ['F']]
>>> from itertools import groupby
>>> [k for k, g in groupby(sorted(l))]
[['A', 'C', 'D'], ['B', 'E'], ['F']]
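The sorted() call matters here because groupby only merges runs of adjacent equal items. A small Python 3 sketch of the difference, using the same data (the variable names are just for illustration):

```python
from itertools import groupby

l = [['A', 'C', 'D'], ['B', 'E'], ['A', 'C', 'D'], ['A', 'C', 'D'], ['B', 'E'], ['F']]

# Without sorting, only *adjacent* repeats are collapsed.
unsorted_keys = [k for k, _ in groupby(l)]

# After sorting, equal sublists sit next to each other, so each appears once.
sorted_keys = [k for k, _ in groupby(sorted(l))]

print(unsorted_keys)
print(sorted_keys)
```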

Related

adding list as a sublist in a nested list at the start of the nested list Python

given a nested list:
input_list = [['c', 'd'], ['e', 'f']]
addition_to_input_list = ['a', 'b']
required_output = [['a', 'b'], ['c', 'd'], ['e', 'f']]
For my current program it is enough to put the addition at the start; in the future I may also have to put the addition at a specific index in the nested list.
Thanks in advance

This is a simple list insertion. It doesn't matter that the elements are lists themselves. So, this will do it:
input_list.insert( 0, addition_to_input_list )
Or you can build a new list:
required_output = [addition_to_input_list] + input_list
Proof that both options work:
>>> input_list = [['c', 'd'], ['e', 'f']]
>>> addition_to_input_list = ['a', 'b']
>>> input_list.insert(0,addition_to_input_list)
>>> input_list
[['a', 'b'], ['c', 'd'], ['e', 'f']]
>>> input_list = [['c', 'd'], ['e', 'f']]
>>> [addition_to_input_list]+input_list
[['a', 'b'], ['c', 'd'], ['e', 'f']]
>>>
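The question also mentions inserting at a specific index later on. Both options extend to that case; a short Python 3 sketch, where the value of index is an arbitrary choice for illustration:

```python
input_list = [['c', 'd'], ['e', 'f']]
addition_to_input_list = ['a', 'b']
index = 1  # hypothetical position, chosen only for this example

# In place: list.insert(i, x) puts x before the element currently at position i.
in_place = list(input_list)  # shallow copy so input_list itself is left alone
in_place.insert(index, addition_to_input_list)

# New list: slice around the index and concatenate.
new_list = input_list[:index] + [addition_to_input_list] + input_list[index:]

print(in_place)
print(new_list)
```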

Sorting a 2D list alphabetically?

I have a 2D list such as this:
lst = [['c', 'd', 'b'], ['d', 'c', 'a'], ['b', 'a', 'c']]
I would first like to sort each list within the list alphabetically like this:
lst = [['b', 'c', 'd'], ['a', 'c', 'd'], ['a', 'b', 'c']]
And finally, I would like to sort the whole list alphabetically which takes into account each element in a sublist:
lst = [['a', 'b', 'c'], ['a', 'c', 'd'], ['b', 'c', 'd']]
What would be the fastest way to achieve this? Thank you.

The fastest way in general should be just as you described it:
for sublist in lst:
    sublist.sort()
lst.sort()
Alternatively, if you want to do it out of place:
new_lst = [sorted(sublist) for sublist in lst]
new_lst.sort()
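If it helps, the out-of-place variant can also be written as one expression. A sketch on the data from the question, which also checks that the original list is left untouched:

```python
lst = [['c', 'd', 'b'], ['d', 'c', 'a'], ['b', 'a', 'c']]

# Sort each sublist into a new list, then sort those new lists.
new_lst = sorted(sorted(sublist) for sublist in lst)

print(new_lst)
print(lst)  # the original nested list is unchanged
```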

Concatenate a list of lists excluding one index

Is there a Pythonic way for concatenating a list of lists, excluding an index of choice? For example, if I had
[['a'], ['b', 'c'], ['d'], ['e', 'f', 'g']]
and did not want index 1 in the result, my concatenated list would look like:
['a', 'd', 'e', 'f', 'g']
I could do this with a loop and checking against the iteration against my index of choice, but I'm hoping there's a cleaner way.

You can use slicing:
from itertools import chain
ls = [['a'], ['b', 'c'], ['d'], ['e', 'f', 'g']]
list(chain.from_iterable(ls[:1] + ls[2:]))
If you want to avoid the cost of adding the slices together and creating new lists, it gets a bit more complicated:
from itertools import chain, islice
list(chain.from_iterable(chain(islice(ls, 1), islice(ls, 2, None))))
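To exclude an arbitrary index instead of the hard-coded 1, the islice version could be wrapped in a small function. This is only a sketch: the name concat_excluding and its parameters are made up here, and it assumes a non-negative index, since islice does not accept negative values.

```python
from itertools import chain, islice


def concat_excluding(lists, skip):
    """Flatten `lists`, leaving out the sublist at non-negative index `skip`."""
    before = islice(lists, skip)           # items at positions 0 .. skip-1
    after = islice(lists, skip + 1, None)  # items at positions skip+1 .. end
    return list(chain.from_iterable(chain(before, after)))


ls = [['a'], ['b', 'c'], ['d'], ['e', 'f', 'g']]
print(concat_excluding(ls, 1))
```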

Here is one way:
lists = [['a'], ['b', 'c'], ['d'], ['e', 'f', 'g']]
subset = [x for ind, x in enumerate(lists) if ind != 1]
subset # [['a'], ['d'], ['e', 'f', 'g']]
flattened = [item for l in subset for item in l]
flattened # ['a', 'd', 'e', 'f', 'g']
You could combine these into a single comprehension; I did it in two steps here to show more clearly what each part does.
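For reference, the single-comprehension form mentioned above might look like this (the name skip is introduced here for the excluded index):

```python
lists = [['a'], ['b', 'c'], ['d'], ['e', 'f', 'g']]
skip = 1  # index of the sublist to leave out

# The filter on the index runs once per sublist, before its items are unpacked.
flattened = [item for ind, sub in enumerate(lists) if ind != skip for item in sub]
print(flattened)
```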

This is roughly the "I could do this with a loop and checking against the iteration against my index of choice" approach, but written as a list comprehension with no libraries:
nix = 1
myls = [['a'], ['b', 'c'], ['d'], ['e', 'f', 'g']]
[e for ls in myls for e in ls if ls != myls[nix]]
Out[11]: ['a', 'd', 'e', 'f', 'g']
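There is no need for enumerate either, but note that the filter compares by value, so any other sublist equal to myls[nix] is dropped as well.
Another of the slice-and-flatten possibilities that reads nicely: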
sum(myls[:nix] + myls[nix+1:],[])
Some people object to using sum that way, though: https://mathieularose.com/how-not-to-flatten-a-list-of-lists-in-python/
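A small demonstration of how the `ls != myls[nix]` filter and the slice-based version differ once a sublist is repeated. The input is modified from the one above so that index 3 repeats index 1:

```python
nix = 1
myls = [['a'], ['b', 'c'], ['d'], ['b', 'c']]  # index 3 has the same value as index 1

# Filters by value: every sublist equal to myls[nix] is skipped.
by_value = [e for ls in myls for e in ls if ls != myls[nix]]

# Filters by position: only the sublist at index nix is skipped.
by_index = sum(myls[:nix] + myls[nix + 1:], [])

print(by_value)
print(by_index)
```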

If you don't mind using an external library, I could offer remove and flatten from iteration_utilities [1]:
>>> from iteration_utilities import remove, flatten
>>> l = [['a'], ['b', 'c'], ['d'], ['e', 'f', 'g']]
>>> list(flatten(remove(l, 1)))
['a', 'd', 'e', 'f', 'g']
[1] I'm the author of that library.

Sorting lists based on a particular element - Python

How do I sort a list of lists based on the first element of the lists in Python?
>>> list01 = (['a','b','c'],['b','a','d'],['d','e','c'],['a','f','d'])
>>> map(sorted, list01)
[['a', 'b', 'c'], ['a', 'b', 'd'], ['c', 'd', 'e'], ['a', 'd', 'f']]
>>> sorted(map(sorted, list01))
[['a', 'b', 'c'], ['a', 'b', 'd'], ['a', 'd', 'f'], ['c', 'd', 'e']]

Python's sorted() can receive a function to sort by.
If you want to sort by the first element in each sublist, you can use the following:
>>> lst = [[2, 3], [1, 2]]
>>> sorted(lst, key=lambda x: x[0])
[[1, 2], [2, 3]]
For more information on sorted(), please see the official docs.

from operator import itemgetter
sorted(list01, key=itemgetter(0))

>>> sorted(list01, key=lambda l: l[0])
[['a', 'b', 'c'], ['a', 'f', 'd'], ['b', 'a', 'd'], ['d', 'e', 'c']]
Is this what you mean?

Apart from passing a key function to sorted (as shown in earlier answers), you can also pass it a cmp (comparison) function in Python 2, as follows:
sorted(list01, cmp=lambda b, a: cmp(b[0], a[0]))
The output of the above expression is the same as with the key function.
However, the cmp argument was removed from sorted in Python 3 (https://docs.python.org/3.3/library/functions.html#sorted), so a key function is the only choice there; an old-style comparison function has to be wrapped with functools.cmp_to_key.
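In Python 3, a comparison function can still be used by wrapping it with functools.cmp_to_key. A minimal sketch, where compare_first is a name made up for this example:

```python
from functools import cmp_to_key

list01 = (['a', 'b', 'c'], ['b', 'a', 'd'], ['d', 'e', 'c'], ['a', 'f', 'd'])


def compare_first(x, y):
    # Python 3 has no built-in cmp(); this expression is the usual stand-in.
    # It returns -1, 0 or 1 depending on how the first elements compare.
    return (x[0] > y[0]) - (x[0] < y[0])


result = sorted(list01, key=cmp_to_key(compare_first))
print(result)
```

Because sorted is stable, sublists with the same first element keep their original relative order.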

Sorting a list in Python without repeat previous item

Well
I have a unique combination of elements (A B C D E F)
from itertools import combinations
data = ['A', 'B', 'C', 'D', 'E', 'F']
comb = combinations(data, 2)
d = []
for i in comb:
    d.append([i[0], i[1]])
print d
This returns to me:
[['A', 'B'],
['A', 'C'],
['A', 'D'],
['A', 'E'],
['A', 'F'],
['B', 'C'],
['B', 'D'],
['B', 'E'],
['B', 'F'],
['C', 'D'],
['C', 'E'],
['C', 'F'],
['D', 'E'],
['D', 'F'],
['E', 'F']]
The question is how to sort this so that line N does not repeat element [0] or element [1] of line (N-1). Put more simply:
AB (This line can have any element)
CD (This line can't have A or B)
EF (This line can't have C or D)
AC (This line can't have E or F)
...

mylist= [['A', 'B'],
['A', 'C'],
['A', 'D'],
['A', 'E'],
['A', 'F'],
['B', 'C'],
['B', 'D'],
['B', 'E'],
['B', 'F'],
['C', 'D'],
['C', 'E'],
['C', 'F'],
['D', 'E'],
['D', 'F'],
['E', 'F']]
a = mylist[:]  # shallow copy, so mylist itself stays untouched
b = []
b.append(a[0])  # start with the first pair
del a[0]  # and remove it from the pool
while len(a) > 0:
    for val, i in enumerate(a):  # val is the index, i is the pair
        # append the pair only if it shares no element with the last pair in b
        if len(set(b[len(b)-1]).intersection(set(i))) == 0:
            b.append(a[val])
            del a[val]
print b
#output [['A', 'B'], ['C', 'D'], ['E', 'F'], ['A', 'C'], ['B', 'D'], ['C', 'E'], ['D', 'F'], ['A', 'E'], ['B', 'C'], ['D', 'E'], ['A', 'F'], ['B', 'E'], ['C', 'F'], ['A', 'D'], ['B', 'F']]
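The while loop above only ends once every pair has been placed, so it would spin forever on an input where no remaining pair is disjoint from the last one placed. Below is a Python 3 sketch of a related greedy approach that stops in that case. It is not the same algorithm: it rescans the pool from the start after every placement, so the ordering it finds can differ from the output above, and it may leave pairs unplaced. The function name and the (ordered, leftover) return shape are assumptions made for this example.

```python
from itertools import combinations


def order_without_repeats(pairs):
    """Greedily order `pairs` so that neighbours share no element.

    Returns (ordered, leftover); leftover holds pairs that could not be placed.
    """
    remaining = [list(p) for p in pairs]
    if not remaining:
        return [], []
    ordered = [remaining.pop(0)]
    while remaining:
        last = set(ordered[-1])
        # Index of the first remaining pair that shares nothing with `last`.
        pos = next((i for i, p in enumerate(remaining) if last.isdisjoint(p)), None)
        if pos is None:
            break  # nothing fits; stop instead of looping forever
        ordered.append(remaining.pop(pos))
    return ordered, remaining


ordered, leftover = order_without_repeats(combinations('ABCDEF', 2))
print(ordered)
print(leftover)
```

A greedy scan like this can get stuck before the pool is empty, which is why the leftover pairs are returned and not silently dropped.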

Using the neighborhood generator from this answer, you can get the previous, current and next element in your loop, so that you can compare them. Then you can do something like this:
from itertools import combinations
# Credit to Markus Jarderot for this function
def neighborhood(iterable):
    iterator = iter(iterable)
    prev = None
    item = iterator.next()  # throws StopIteration if empty.
    for next in iterator:
        yield (prev, item, next)
        prev = item
        item = next
        # this can also be written as: prev, item = item, next
    yield (prev, item, None)
data = ['A', 'B', 'C', 'D', 'E', 'F']
comb = combinations(data, 2)
d = []
for prev, item, next in neighborhood(comb):
    # If prev and item both exist and no element of item is in the last entry of d
    if prev and item and not any(x in d[-1] for x in item):
        d.append([item[0], item[1]])
    elif item and not prev:  # For the first element
        d.append([item[0], item[1]])
print d
This prints
[['A', 'B'],
['C', 'D'],
['E', 'F']]
I'm aware this is probably not 100% what you need, but it should be able to get you where you want
