Remove duplicate elements in python - python

for a given list for example:
[
'A',
[
'B',
['D','C'],
['C','D']
],
['C','D'],
['E','F'],
['F','E'],
[
['not','M'],
'N',
['not','M']
]
]
I want to remove the duplicate element in it, the result of the above list should be:
[
'A',
[
'B',
['C','D']
],
['C','D'],
['E','F'],
[
['not','M'],
'N'
]
]
It has two rules: ['not','A'] represents ~A, and it can be seen as one element.
If the value is same but order not, we consider it the same.So ['C','D'] is same as ['D','C']
Can anyone one help me write this function in python to realize the requirement?

In general, I agree with Klaus D.'s comment. But I thought the question was interesting since the easiest way I could think of was using lists->tuples->sets and that gets kind of interesting when you want to remove elements. It also causes you to lose the ordering of the original input list (As noted by Adam Smith).
So given all that, consider:
import itertools
def _reduce(lst):
if not isinstance(lst, list): return lst
seen = []
for x in lst:
if not any(list(perm) in seen for perm in itertools.permutations(x)):
seen.append(_reduce(x))
return seen
Which you can run by:
lst = [
'A',
[
'B',
['D','C'],
['C','D']
],
['C','D'],
['E','F'],
['F','E'],
[
['not','M'],
'N',
['not','M']
]
]
print _reduce(lst)
Which outputs:
[
'A',
[
'B',
['D', 'C']
],
['C', 'D'],
['E', 'F'],
[
['not', 'M'],
'N'
]
]
Note that this preserves input ordering of list elements. Also note, that because of this, this output differs slightly from your expected output (['D', 'C'] is preserved and ['C', 'D'] is discarded).
Edit Per your commment, itertools.permutations() won't suffice, since you seem to want some recursive function that will consider the permutations of the subelements as well. What about:
import itertools
def _permutations(x):
if not isinstance(x, list): return x
perms = []
for prod in itertools.product(*[_permutations(elem) for elem in x]):
for perm in itertools.permutations(prod):
perms.append(list(perm))
return perms
def _reduce(lst):
if not isinstance(lst, list): return lst
seen = []
for x in lst:
if not any(list(perm) in seen for perm in _permutations(x)):
seen.append(_reduce(x))
return seen
def lexsort(x): return sorted(str(e) for e in _permutations(x))
arrs = [
['B',['C','D']],
[['D','C'],'B'],
]
print _reduce(arrs)
Outputs:
[['B', ['C', 'D']]]

Related

How to create sublists out of a list with strings?

Having a Python list like the following one:
list1 = [ "abc", "def", "ghi" ]
How can I obtain sub-lists for each character of the strings ?
list3 = [ ["a"], ["b"], ["c"] ]
list4 = [ ["d"], ["e"], ["f"] ]
list5 = [ ["g"], ["h"], ["i"] ]
Do not generate variables programmatically, this leads to unclear code and is often a source of bugs, use a container instead (dictionaries are ideal).
Using a dictionary comprehension:
list1 = [ "abc", "def", "ghi" ]
lists = {'list%s' % (i+3): [[e] for e in s]]
for i,s in enumerate(list1)}
output:
>>> lists
{'list3': [['a'], ['b'], ['c']],
'list4': [['d'], ['e'], ['f']],
'list5': [['g'], ['h'], ['i']]}
>>> lists['list3']
[['a'], ['b'], ['c']]
NB. you could also simply use the number as key (3/4/5)
To get a list from a string use list(string)
list('hello') # -> ['h', 'e', 'l', 'l', 'o']
And if you want to wrap the individual characters in a list do it like this:
[list(c) for c in 'hello'] # --> [['h'], ['e'], ['l'], ['l'], ['o']]
But that is useful only if you want to change the list afterward.
Note that you can index a string like so
'hello'[2] # --> 'l'
This is how to break a list then break it into character list.
list1 = ["abc", "def", "ghi"]
list3 = list(list1[0])
list4 = list(list1[1])
list5 = list(list1[2])

How to Check if Element exists in Second Sublist?

If I had a list like this:
L = [
['a', 'b'],
['c', 'f'],
['d', 'e']
]
I know that I could check if e.g. 'f' was contained in any of the sub lists by using any in the following way:
if any('f' in sublist for sublist in L) # True
But how would I go about searching through second sub lists, i.e. if the list was initialized the following way:
L = [
[
['a', 'b'],
['c', 'f'],
['d', 'e']
],
[
['z', 'i', 'l'],
['k']
]
]
I tried chaining the for in expressions like this:
if any('f' in second_sublist for second_sublist in sublist for sublist in L)
However, this crashes because name 'sublist' is not defined.
First write your logic as a regular for loop:
for first_sub in L:
for second_sub in first_sub:
if 'f' in second_sub:
print('Match!')
break
Then rewrite as a generator expression with the for statements in the same order:
any('f' in second_sub for first_sub in L for second_sub in first_sub)
If you don't need to know where 'f' is located you can leverage itertools here as well.
import itertools
any('f' in x for x in itertools.chain.from_iterable(l))
This will flatten your nested lists and evaluate each list separately. The benefit here is if you have three nested lists this solution would still function without having to continue writing nest for loops.

Including first and last elements in list comprehension

I would like to keep the first and last elements of a list, and exclude others that meet defined criteria without using a loop. The first and last elements may or may not have the criteria of elements being removed.
As a very basic example,
aList = ['a','b','a','b','a']
[x for x in aList if x !='a']
returns ['b', 'b']
I need ['a','b','b','a']
I can split off the first and last values and then re-concatenate them together, but this doesn't seem very Pythonic.
You can use slice assignment:
>>> aList = ['a','b','a','b','a']
>>> aList[1:-1]=[x for x in aList[1:-1] if x !='a']
>>> aList
['a', 'b', 'b', 'a']
Yup, it looks like dawg’s and jez’s suggested answers are the right ones, here. Leaving the below for posterity.
Hmmm, your sample input and output don’t match what I think your question is, and it is absolutely pythonic to use slicing:
a_list = ['a','b','a','b','a']
# a_list = a_list[1:-1] # take everything but the first and last elements
a_list = a_list[:2] + a_list[-2:] # this gets you the [ 'a', 'b', 'b', 'a' ]
Here's a list comprehension that explicitly makes the first and last elements immune from removal, regardless of their value:
>>> aList = ['a', 'b', 'a', 'b', 'a']
>>> [ letter for index, letter in enumerate(aList) if letter != 'a' or index in [0, len(x)-1] ]
['a', 'b', 'b', 'a']
Try this:
>>> list_ = ['a', 'b', 'a', 'b', 'a']
>>> [value for index, value in enumerate(list_) if index in {0, len(list_)-1} or value == 'b']
['a', 'b', 'b', 'a']
Although, the list comprehension is becoming unwieldy. Consider writing a generator like so:
>>> def keep_bookends_and_bs(list_):
... for index, value in enumerate(list_):
... if index in {0, len(list_)-1}:
... yield value
... elif value == 'b':
... yield value
...
>>> list(keep_bookends_and_bs(list_))
['a', 'b', 'b', 'a']

Inserting values in 2D lists in Python

I'm looking for a method of inserting values to 2D lists in Python. My sample list is as follows:
List= [ ['A', 'B'], ['C', 'D'] ]
I would like to insert a value at the beginning of each list within my List, so that it would look like this:
List = [ ['#','A', 'B'], ['#','C', 'D'] ]
I have written a function as follows:
def Foo(l):
rows = len(l)
cols = len(l[0])
for row in xrange(rows):
l.insert(row, '#')
But that has given me the following output:
List= [ '#', '#', ['A', 'B'], ['C', 'D'] ]
when you do l.insert() it adds an item to l not the sub lists, to iterate over the sub lists you can do:
for row in l:
row.insert(0,"#")
Or using xrange:
for i in xrange(len(l)):
l[i].insert(0,"#")

Pythonic way to combine (interleave, interlace, intertwine) two lists in an alternating fashion?

I have two lists, the first of which is guaranteed to contain exactly one more item than the second. I would like to know the most Pythonic way to create a new list whose even-index values come from the first list and whose odd-index values come from the second list.
# example inputs
list1 = ['f', 'o', 'o']
list2 = ['hello', 'world']
# desired output
['f', 'hello', 'o', 'world', 'o']
This works, but isn't pretty:
list3 = []
while True:
try:
list3.append(list1.pop(0))
list3.append(list2.pop(0))
except IndexError:
break
How else can this be achieved? What's the most Pythonic approach?
If you need to handle lists of mismatched length (e.g. the second list is longer, or the first has more than one element more than the second), some solutions here will work while others will require adjustment. For more specific answers, see How to interleave two lists of different length? to leave the excess elements at the end, or How to elegantly interleave two lists of uneven length? to try to intersperse elements evenly, or Insert element in Python list after every nth element for the case where a specific number of elements should come before each "added" element.
Here's one way to do it by slicing:
>>> list1 = ['f', 'o', 'o']
>>> list2 = ['hello', 'world']
>>> result = [None]*(len(list1)+len(list2))
>>> result[::2] = list1
>>> result[1::2] = list2
>>> result
['f', 'hello', 'o', 'world', 'o']
There's a recipe for this in the itertools documentation (note: for Python 3):
from itertools import cycle, islice
def roundrobin(*iterables):
"roundrobin('ABC', 'D', 'EF') --> A D E B F C"
# Recipe credited to George Sakkis
num_active = len(iterables)
nexts = cycle(iter(it).__next__ for it in iterables)
while num_active:
try:
for next in nexts:
yield next()
except StopIteration:
# Remove the iterator we just exhausted from the cycle.
num_active -= 1
nexts = cycle(islice(nexts, num_active))
import itertools
print([x for x in itertools.chain.from_iterable(itertools.zip_longest(list1,list2)) if x])
I think this is the most pythonic way of doing it.
In Python 2, this should do what you want:
>>> iters = [iter(list1), iter(list2)]
>>> print list(it.next() for it in itertools.cycle(iters))
['f', 'hello', 'o', 'world', 'o']
Without itertools and assuming l1 is 1 item longer than l2:
>>> sum(zip(l1, l2+[0]), ())[:-1]
('f', 'hello', 'o', 'world', 'o')
In python 2, using itertools and assuming that lists don't contain None:
>>> filter(None, sum(itertools.izip_longest(l1, l2), ()))
('f', 'hello', 'o', 'world', 'o')
If both lists have equal length, you can do:
[x for y in zip(list1, list2) for x in y]
As the first list has one more element, you can add it post hoc:
[x for y in zip(list1, list2) for x in y] + [list1[-1]]
Edit: To illustrate what is happening in that first list comprehension, this is how you would spell it out as a nested for loop:
result = []
for y in zip(list1, list2): # y is is a 2-tuple, containining one element from each list
for x in y: # iterate over the 2-tuple
result.append(x) # append each element individually
I know the questions asks about two lists with one having one item more than the other, but I figured I would put this for others who may find this question.
Here is Duncan's solution adapted to work with two lists of different sizes.
list1 = ['f', 'o', 'o', 'b', 'a', 'r']
list2 = ['hello', 'world']
num = min(len(list1), len(list2))
result = [None]*(num*2)
result[::2] = list1[:num]
result[1::2] = list2[:num]
result.extend(list1[num:])
result.extend(list2[num:])
result
This outputs:
['f', 'hello', 'o', 'world', 'o', 'b', 'a', 'r']
Here's a one liner that does it:
list3 = [ item for pair in zip(list1, list2 + [0]) for item in pair][:-1]
Here's a one liner using list comprehensions, w/o other libraries:
list3 = [sub[i] for i in range(len(list2)) for sub in [list1, list2]] + [list1[-1]]
Here is another approach, if you allow alteration of your initial list1 by side effect:
[list1.insert((i+1)*2-1, list2[i]) for i in range(len(list2))]
This one is based on Carlos Valiente's contribution above
with an option to alternate groups of multiple items and make sure that all items are present in the output :
A=["a","b","c","d"]
B=[1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16]
def cyclemix(xs, ys, n=1):
for p in range(0,int((len(ys)+len(xs))/n)):
for g in range(0,min(len(ys),n)):
yield ys[0]
ys.append(ys.pop(0))
for g in range(0,min(len(xs),n)):
yield xs[0]
xs.append(xs.pop(0))
print [x for x in cyclemix(A, B, 3)]
This will interlace lists A and B by groups of 3 values each:
['a', 'b', 'c', 1, 2, 3, 'd', 'a', 'b', 4, 5, 6, 'c', 'd', 'a', 7, 8, 9, 'b', 'c', 'd', 10, 11, 12, 'a', 'b', 'c', 13, 14, 15]
Might be a bit late buy yet another python one-liner. This works when the two lists have equal or unequal size. One thing worth nothing is it will modify a and b. If it's an issue, you need to use other solutions.
a = ['f', 'o', 'o']
b = ['hello', 'world']
sum([[a.pop(0), b.pop(0)] for i in range(min(len(a), len(b)))],[])+a+b
['f', 'hello', 'o', 'world', 'o']
from itertools import chain
list(chain(*zip('abc', 'def'))) # Note: this only works for lists of equal length
['a', 'd', 'b', 'e', 'c', 'f']
itertools.zip_longest returns an iterator of tuple pairs with any missing elements in one list replaced with fillvalue=None (passing fillvalue=object lets you use None as a value). If you flatten these pairs, then filter fillvalue in a list comprehension, this gives:
>>> from itertools import zip_longest
>>> def merge(a, b):
... return [
... x for y in zip_longest(a, b, fillvalue=object)
... for x in y if x is not object
... ]
...
>>> merge("abc", "defgh")
['a', 'd', 'b', 'e', 'c', 'f', 'g', 'h']
>>> merge([0, 1, 2], [4])
[0, 4, 1, 2]
>>> merge([0, 1, 2], [4, 5, 6, 7, 8])
[0, 4, 1, 5, 2, 6, 7, 8]
Generalized to arbitrary iterables:
>>> def merge(*its):
... return [
... x for y in zip_longest(*its, fillvalue=object)
... for x in y if x is not object
... ]
...
>>> merge("abc", "lmn1234", "xyz9", [None])
['a', 'l', 'x', None, 'b', 'm', 'y', 'c', 'n', 'z', '1', '9', '2', '3', '4']
>>> merge(*["abc", "x"]) # unpack an iterable
['a', 'x', 'b', 'c']
Finally, you may want to return a generator rather than a list comprehension:
>>> def merge(*its):
... return (
... x for y in zip_longest(*its, fillvalue=object)
... for x in y if x is not object
... )
...
>>> merge([1], [], [2, 3, 4])
<generator object merge.<locals>.<genexpr> at 0x000001996B466740>
>>> next(merge([1], [], [2, 3, 4]))
1
>>> list(merge([1], [], [2, 3, 4]))
[1, 2, 3, 4]
If you're OK with other packages, you can try more_itertools.roundrobin:
>>> list(roundrobin('ABC', 'D', 'EF'))
['A', 'D', 'E', 'B', 'F', 'C']
My take:
a = "hlowrd"
b = "el ol"
def func(xs, ys):
ys = iter(ys)
for x in xs:
yield x
yield ys.next()
print [x for x in func(a, b)]
def combine(list1, list2):
lst = []
len1 = len(list1)
len2 = len(list2)
for index in range( max(len1, len2) ):
if index+1 <= len1:
lst += [list1[index]]
if index+1 <= len2:
lst += [list2[index]]
return lst
How about numpy? It works with strings as well:
import numpy as np
np.array([[a,b] for a,b in zip([1,2,3],[2,3,4,5,6])]).ravel()
Result:
array([1, 2, 2, 3, 3, 4])
Stops on the shortest:
def interlace(*iters, next = next) -> collections.Iterable:
"""
interlace(i1, i2, ..., in) -> (
i1-0, i2-0, ..., in-0,
i1-1, i2-1, ..., in-1,
.
.
.
i1-n, i2-n, ..., in-n,
)
"""
return map(next, cycle([iter(x) for x in iters]))
Sure, resolving the next/__next__ method may be faster.
Multiple one-liners inspired by answers to another question:
import itertools
list(itertools.chain.from_iterable(itertools.izip_longest(list1, list2, fillvalue=object)))[:-1]
[i for l in itertools.izip_longest(list1, list2, fillvalue=object) for i in l if i is not object]
[item for sublist in map(None, list1, list2) for item in sublist][:-1]
An alternative in a functional & immutable way (Python 3):
from itertools import zip_longest
from functools import reduce
reduce(lambda lst, zipped: [*lst, *zipped] if zipped[1] != None else [*lst, zipped[0]], zip_longest(list1, list2),[])
using for loop also we can achive this easily:
list1 = ['f', 'o', 'o']
list2 = ['hello', 'world']
list3 = []
for i in range(len(list1)):
#print(list3)
list3.append(list1[i])
if i < len(list2):
list3.append(list2[i])
print(list3)
output :
['f', 'hello', 'o', 'world', 'o']
Further by using list comprehension this can be reduced. But for understanding this loop can be used.
My approach looks as follows:
from itertools import chain, zip_longest
def intersperse(*iterators):
# A random object not occurring in the iterators
filler = object()
r = (x for x in chain.from_iterable(zip_longest(*iterators, fillvalue=filler)) if x is not filler)
return r
list1 = ['f', 'o', 'o']
list2 = ['hello', 'world']
print(list(intersperse(list1, list2)))
It works for an arbitrary number of iterators and yields an iterator, so I applied list() in the print line.
def alternate_elements(small_list, big_list):
mew = []
count = 0
for i in range(len(small_list)):
mew.append(small_list[i])
mew.append(big_list[i])
count +=1
return mew+big_list[count:]
if len(l2)>len(l1):
res = alternate_elements(l1,l2)
else:
res = alternate_elements(l2,l1)
print(res)
Here we swap lists based on size and perform, can someone provide better solution with time complexity O(len(l1)+len(l2))
I'd do the simple:
chain.from_iterable( izip( list1, list2 ) )
It'll come up with an iterator without creating any additional storage needs.
This is nasty but works no matter the size of the lists:
list3 = [
element for element in
list(itertools.chain.from_iterable([
val for val in itertools.izip_longest(list1, list2)
]))
if element != None
]
Obviously late to the party, but here's a concise one for equal-length lists:
output = [e for sub in zip(list1,list2) for e in sub]
It generalizes for an arbitrary number of equal-length lists, too:
output = [e for sub in zip(list1,list2,list3) for e in sub]
etc.
I'm too old to be down with list comprehensions, so:
import operator
list3 = reduce(operator.add, zip(list1, list2))

Categories

Resources