Remove first duplicate element of a list - python

I need to find the first element in a list that occurs more than once and remove its second occurrence, while preserving the order of the input list. For example, for the input
in = [2, 3, 4, 5, 3, 6, 4, 1]
output should be
out = [2, 3, 4, 5, 6, 4, 1]
I have tried the code below and it gives the correct result. I just wanted to check with the community whether there is a better or more Pythonic solution:
input = [2, 3, 4, 5, 3, 6, 4, 1]
first_dupe = None
for elem in input:
    if input.count(elem) > 1:
        first_dupe = elem
        break
flg = True
new_list = []
for x in input:
    if x != first_dupe or flg is True:
        new_list.append(x)
    if x == first_dupe:
        flg = False
print(new_list)

If you just want to remove the first duplicate, you can create a set of the elements, then append each element to a new list while it is still in the set, removing it from the set as you go. As soon as an element is no longer in the set (it has been seen before), skip it and append the rest of the list.
This is fairly efficient if the duplicate is seen early, but takes O(n) if the duplicate is seen late.
x = [2, 3, 4, 5, 3, 6, 4, 1]
s = set(x)
out = []
for i,z in enumerate(x):
    if z in s:
        out.append(z)
        s.remove(z)
    else:
        break
out += x[i+1:]
out
# returns:
[2, 3, 4, 5, 6, 4, 1]

You could keep track of what you have already used and check if the value is in there.
lst = [2, 3, 4, 5, 3, 6, 4, 1]
used = set()
for i, x in enumerate(lst):
    if x in used:
        lst.pop(i)
        break
    used.add(x)
print(lst)
Also, don't give variables, or anything else, the same name as a Python built-in: input is already a built-in function.
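As a rough sketch, the same "remember what you have seen" idea can be wrapped in a function that returns a new list and leaves its argument untouched. The name remove_first_duplicate is hypothetical, and the elements are assumed to be hashable. Note that it drops the earliest repeated occurrence, which can differ from the question's code when duplicates interleave (for example [2, 3, 4, 4, 3]).

```python
def remove_first_duplicate(items):
    """Return a copy of items without the earliest repeated occurrence."""
    seen = set()
    for i, x in enumerate(items):
        if x in seen:
            # Drop only this occurrence; everything else keeps its order.
            return items[:i] + items[i + 1:]
        seen.add(x)
    return list(items)  # no duplicate found, return an unchanged copy


lst = [2, 3, 4, 5, 3, 6, 4, 1]
print(remove_first_duplicate(lst))  # [2, 3, 4, 5, 6, 4, 1]
print(lst)                          # [2, 3, 4, 5, 3, 6, 4, 1]
```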

This is my version of Chiheb Nexus' answer.
def find_dup(lst):
    seen = set()
    it = iter(lst)
    for item in it:
        if item not in seen:
            seen.add(item)
            yield item
        else:
            yield from it
lst = [2, 3, 4, 5, 3, 6, 4, 1]
list(find_dup(lst))
[2, 3, 4, 5, 6, 4, 1]

Related

List containing only every second pair of elements

I am new to python and so I am experimenting a little bit, but I have a little problem now.
I have a list of n numbers and I want to make a new list that contains only every second pair of the numbers.
So basically if I have list like this
oldlist = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
then I want that the new list looks like this
newlist = [3, 4, 7, 8]
I already tried the slice() function, but I didn't find any way to make it slice my list into pairs. Then I thought that I could use two slice() functions that goes by four and are moved by one, but if I merge these two new lists they won't be in the right order.
If you enumerate the list, you'd be taking those entries whose indices give either 2 or 3 as a remainder when divided by 4:
>>> [val for j, val in enumerate(oldlist) if j % 4 in (2, 3)]
[3, 4, 7, 8]
a = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
b = [a[i] for i in range(len(a)) if i%4 in (2,3)]
# Output: b = [3, 4, 7, 8]
Here, we use the idea that the 3rd, 4th, 7th, 8th, ... elements sit at indices that leave a remainder of 2 or 3 when divided by 4.
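The remainder trick generalizes beyond pairs. A minimal sketch, assuming a hypothetical helper every_second_chunk with a size parameter (neither appears in the answers): an index belongs to a kept chunk when its remainder modulo 2 * size is at least size.

```python
def every_second_chunk(seq, size=2):
    """Keep the 2nd, 4th, ... chunk of `size` items and drop the others."""
    return [val for i, val in enumerate(seq) if i % (2 * size) >= size]


oldlist = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
print(every_second_chunk(oldlist))     # [3, 4, 7, 8]
print(every_second_chunk(oldlist, 3))  # [4, 5, 6, 10]
```

A trailing chunk that is shorter than size is still kept if it falls in a "kept" position, as the 10 in the second call shows.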
from itertools import chain
first_part = oldlist[2::4] # every 4th item, starting from the 3rd item
second_part = oldlist[3::4] # every 4th item, starting from the 4th item
pairs = zip(first_part, second_part)
final_result = list(chain.from_iterable(pairs)) # [3, 4, 7, 8]
Break this problem into parts.
first = oldlist[2::4]
second = oldlist[3::4]
pairs = [(x, y) for x, y in zip(first, second)]
Now unwrap the pairs:
newlist = [x for p in pairs for x in p]
Combining:
newlist = [z for p in [(x, y) for x, y in zip(oldlist[2::4], oldlist[3::4])] for z in p]
I would first split the original list into two lists, holding the elements at even and odd positions, then iterate over their zip and keep every second pair.
old = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
result = list()
part1, part2 = old[::2], old[1::2]
for i, z in enumerate(zip(part1,part2)):
    if i % 2 == 1: # keep the 2nd, 4th, ... pair
        result.extend(z)
You could use a double range:
oldlist = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
newlist = []
for i,j in zip(range(2, len(oldlist), 4), range(3, len(oldlist), 4)):
    newlist += [oldlist[i], oldlist[j]]
#> newlist: [3, 4, 7, 8]
import more_itertools
oldlist = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
[*more_itertools.interleave(oldlist[2::4], oldlist[3::4])]
# [3, 4, 7, 8]
oldlist[2::4], oldlist[3::4]: every 4th item, starting from the 3rd and from the 4th item respectively
[*more_itertools.interleave(...)]: interleave the two above and convert back to a list
Here is what I have come up with:
oldList = list(range(1,10))
newList = []
# note: this works on the values themselves, so it relies on the list being 1, 2, 3, ...
for i in oldList:
    if (i%2 == 0) and (i%4 != 0):
        try:
            newList.append(i+1)
            newList.append(i+2)
        except IndexError:
            break
Result:
>>> newList
[3, 4, 7, 8]

Skipping through one iteration of a loop

Say I had a list:
lis = [4, 8, 2, 4, 6]
I want to go through each value in the list and double it, but if I come across the number 2, then after doubling it I should skip the next number and only double the one after. For example, in the end my list should look like this:
lis = [8, 16, 4, 4, 12]
Can this be possible with a for loop?
The algorithm boils down to which multiplier (1 or 2) you apply to each item in the list. Here is my take on this problem:
lis = [4, 8, 2, 4, 6]
def double_with_my_condition(l, doubler=2):
    for i in l:
        yield i * doubler
        if i == 2:
            doubler = 1
            continue
        doubler = 2
new_lis = [*double_with_my_condition(lis)]
print(new_lis)
Outputs:
[8, 16, 4, 4, 12]
I wrote out a really simple solution that should be easy to understand since it appears you are a beginner
lis = [4, 8, 2, 4, 6]
new_lis = []
i = 0
while (i < len(lis)):
    new_lis.append(lis[i] * 2)
    if (lis[i] == 2):
        if (i+1 < len(lis)):
            new_lis.append(lis[i+1])
            i = i+1
    i = i+1
print(new_lis)
This creates a new list, loops through the old list, and appends each doubled value to the new list. When the value at the current index is 2, the number after it is copied unchanged and skipped.
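To answer the original "can this be done with a for loop?" directly, here is one possible sketch that loops over an explicit iterator and pulls the skipped item with next(). The function name double_skipping_after_two and the _MISSING sentinel are made up for this illustration.

```python
_MISSING = object()  # sentinel, so that None stays a legal list value


def double_skipping_after_two(values):
    """Double every value, but copy the value right after a 2 unchanged."""
    result = []
    it = iter(values)
    for v in it:
        result.append(v * 2)
        if v == 2:
            # Take the next item out of the loop's hands, if there is one.
            skipped = next(it, _MISSING)
            if skipped is not _MISSING:
                result.append(skipped)
    return result


print(double_skipping_after_two([4, 8, 2, 4, 6]))  # [8, 16, 4, 4, 12]
```

Because the for loop and next() share the same iterator, the item consumed by next() is never seen by the loop body, which is what "skipping one iteration" amounts to here.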
This will work!
Method-1:
lis = [4, 8, 2, 4, 6]
for i in range(len(lis)-1, -1, -1):
    if i > 0 and lis[i-1] == 2: # i > 0 stops lis[-1] from wrapping around to the last item
        continue
    else:
        lis[i] = lis[i]*2
lis
Method-2:
lis1 = [4, 8, 2, 4, 6]
indices = [i+1 for i, x in enumerate(lis1) if x == 2 and i+1 < len(lis1)] # indices of the items right after a 2
lis2 = [i*2 for i in lis1]
for i in indices:
    lis2[i] = lis1[i] # restore the original (undoubled) value
print(lis2)
You can also use a list comprehension:
lis = [4, 8, 2, 4, 6]
print([lis[x] if x > 0 and lis[x - 1] == 2 else lis[x] * 2 for x in range(len(lis))])

Compare the first element of sublists (within a list of lists), merge them, then convert to a set and back

I have a list of lists. I want to collect all of the sublists with the same first element, combine their elements, and remove duplicates. I'm getting close, but my current solution dies on an index error.
Current output:
[[1, 2, 3, 7, 8, 9], [5, 5, 8], [2, 2, 0, 2, 3, 4, 5, 6]]
Correct output:
[[1, 2, 3, 7, 8, 9], [4, 5, 6, 3, 2], [5, 5, 8], [2, 2, 0]] (not necessarily in that order)
Code:
listoflist = [[1,2,3,1],[4,5,6],[1,7,8],[4,3,2],[5,5,8],[2,2,0],[1,9,9]]
try: # (try and except is to deal with when lists are removed)
    for i in range(len(listoflist)): # iterating forwards
        x = listoflist[i][0] # indexing through sublists and setting the first value
        for j in range(len(listoflist)-1,0,-1): # iterating backwards
            if x == listoflist[j][0] and listoflist[i] != listoflist[j]: # comparing
                listoflist[i].extend(listoflist[j]) # combining lists and removing the old list
                listoflist.remove(listoflist[j])
                new = set(listoflist[i]) # conversion to set and back to list
                listoflist[i] = list()
                for obj in new:
                    listoflist[i].append(obj)
                print(listoflist[i])
    print(listoflist)
except IndexError:
    print(listoflist)
ANALYSIS
Your main problem is that you violate a basic practice: do not change an iterable while you're iterating on it. Your outer loop runs i through values 0-6. However, by the time you get to 4, you no longer have an item #4 in the list. I did some simple tracing on your code, including a better message on error:
except IndexError:
    print("ERROR", i, j, listoflist)
Output:
ERROR 4 1 [[1, 2, 3, 1, 1, 9, 9, 1, 7, 8], [4, 5, 6, 4, 3, 2], [5, 5, 8], [2, 2, 0]]
REPAIR
Don't remove sublists from the original list while you iterate over it. Instead, build a new list from the old one, marking each sublist as used once it has been merged. Gather all of the lists that begin with the same element; append the resulting set to the result you want.
listoflist = [[1,2,3,1],[4,5,6],[1,7,8],[4,3,2],[5,5,8],[2,2,0],[1,9,9]]
result = []
try:
    for i in range(len(listoflist)):
        x = listoflist[i][0]
        if x < 0: # Skip any sub-list already used
            continue
        gather = listoflist[i][:] # Copy list for processing
        for j in range(len(listoflist)-1,0,-1):
            if x == listoflist[j][0] and i != j:
                gather.extend(listoflist[j])
                listoflist[j][0] = -j # Uniquely mark this as not in use (assumes first elements are non-negative).
                print(i, j, gather)
        result.append(list(set(gather)))
        print(i, result)
    print(result)
except IndexError:
    print("ERROR", i, j, listoflist)
Yes, this is still much longer than needed; if you really want to dig into Python, you can group the lists by first element and do all the processing in one long line. I've tried to keep this more accessible for you.
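For reference, the grouping approach hinted at above might look roughly like this. It is only a sketch: merge_by_first is a hypothetical name, empty sublists are skipped by assumption, and duplicates are removed in every group while keeping first-seen order (which relies on dicts preserving insertion order, Python 3.7+).

```python
def merge_by_first(list_of_lists):
    """Merge sublists that share a first element and drop repeated values."""
    groups = {}
    for sub in list_of_lists:
        if not sub:
            continue  # assumption: empty sublists carry no key and are ignored
        groups.setdefault(sub[0], []).extend(sub)
    # dict.fromkeys removes repeats while keeping the first-seen order.
    return [list(dict.fromkeys(merged)) for merged in groups.values()]


listoflist = [[1, 2, 3, 1], [4, 5, 6], [1, 7, 8], [4, 3, 2], [5, 5, 8], [2, 2, 0], [1, 9, 9]]
print(merge_by_first(listoflist))
# [[1, 2, 3, 7, 8, 9], [4, 5, 6, 3, 2], [5, 8], [2, 0]]
```

The input list is not modified, so nothing is removed or marked while iterating.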
Here is my solution. I went for building dict{} based on the first elem in the list as the key, and merging the lists based on that.
listoflist = [[1,2,3,1],[4,5,6],[1,7,8],[4,3,2],[5,5,8],[2,2,0],[1,9,9]]
temp_dict = {}
result_list = []
# build a dict{int:list[]} based on first elem of list
for ll in listoflist:
    if ll[0] not in temp_dict:
        temp_dict[ll[0]] = list(ll) # copy, so the input sublists are not modified
    else:
        for x in ll:
            temp_dict[ll[0]].append(x)
# now build a list of list with our dict
for key, values in temp_dict.items():
    temp_list = []
    temp_list.append(key)
    for x in values:
        if x not in temp_list:
            temp_list.append(x)
    result_list.append(temp_list)
print(result_list)
Output:
[[1, 2, 3, 7, 8, 9], [4, 5, 6, 3, 2], [5, 8], [2, 0]]

Python: Find-replace on lists

I first want to note that my question is different from what's in this link:
finding and replacing elements in a list (python)
What I want to ask is whether there is some known API or conventional way to achieve such a functionality (If it's not clear, a function/method like my imaginary list_replace() is what I'm looking for):
>>> list = [1, 2, 3]
>>> list_replace(list, 3, [3, 4, 5])
>>> list
[1, 2, 3, 4, 5]
An API with limitation of number of replacements will be better:
>>> list = [1, 2, 3, 3, 3]
>>> list_replace(list, 3, [8, 8], 2)
>>> list
[1, 2, 8, 8, 8, 8, 3]
And another optional improvement is that the input to replace will be a list itself, instead of a single value:
>>> list = [1, 2, 3, 3, 3]
>>> list_replace(list, [2, 3], [8, 8], 2)
>>> list
[1, 8, 8, 3, 3]
Is there any API that looks at least similar and performs these operations, or should I write it myself?
Try:
def list_replace(ls, val, l_insert, num = 1):
    l_insert_len = len(l_insert)
    indx = 0
    for i in range(num):
        indx = ls.index(val, indx) # raises ValueError if val is not found
        ls = ls[:indx] + l_insert + ls[(indx + 1):]
        indx += l_insert_len
    return ls
This function works for both the first and the second case.
It won't work with your third requirement.
Demo
>>> list = [1, 2, 3]
>>> list_replace(list, 3, [3, 4, 5])
[1, 2, 3, 4, 5]
>>> list = [1, 2, 3, 3, 3]
>>> list_replace(list, 3, [8, 8], 2)
[1, 2, 8, 8, 8, 8, 3]
Note
It returns a new list; The list passed in will not change.
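If the list should be changed in place, as in the question's examples, slice assignment is one option. The sketch below is a variation on the function above, not the answer's own code: the name list_replace_inplace is hypothetical, and it stops quietly when fewer than num occurrences exist instead of raising ValueError.

```python
def list_replace_inplace(ls, val, l_insert, num=1):
    """Replace up to num occurrences of val in ls with the items of l_insert."""
    indx = 0
    for _ in range(num):
        try:
            indx = ls.index(val, indx)
        except ValueError:
            break  # assumption: running out of matches is not an error
        ls[indx:indx + 1] = l_insert  # slice assignment mutates ls itself
        indx += len(l_insert)         # continue after the inserted items


data = [1, 2, 3, 3, 3]
list_replace_inplace(data, 3, [8, 8], 2)
print(data)  # [1, 2, 8, 8, 8, 8, 3]
```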
How about this? It handles the first two requirements. With a list as the pattern it only replaces the first element of each match (note the leftover 23 in the last example below).
def list_replace(origen,elem,new,cantidad=None):
    n=0
    resul=list()
    len_elem=0
    if isinstance(elem,list):
        len_elem=len(elem)
    for i,x in enumerate(origen):
        if x==elem or elem==origen[i:i+len_elem]:
            if cantidad and n<cantidad:
                resul.extend(new)
                n+=1
                continue
            elif not cantidad:
                resul.extend(new)
                continue
        resul.append(x)
    return resul
>>> list_replace([1,2,3,4,5,3,5,33,23,3],3,[42,42])
[1, 2, 42, 42, 4, 5, 42, 42, 5, 33, 23, 42, 42]
>>> list_replace([1,2,3,4,5,3,5,33,23,3],3,[42,42],2)
[1, 2, 42, 42, 4, 5, 42, 42, 5, 33, 23, 3]
>>> list_replace([1,2,3,4,5,3,5,33,23,3],[33,23],[42,42,42],2)
[1, 2, 3, 4, 5, 3, 5, 42, 42, 42, 23, 3]
Given this isn't hard to write, and not a very common use case, I don't think it will be in the standard library. What would it be named, replace_and_flatten? It's quite hard to explain what that does, and justify the inclusion.
Explicit is also better than implicit, so...
def replace_and_flatten(lst, searched_item, new_list):
    def _replace():
        for item in lst:
            if item == searched_item:
                yield from new_list # element matches, yield all the elements of the new list instead
            else:
                yield item # element doesn't match, yield it as is
    return list(_replace()) # convert the iterable back to a list
I developed my own function, you are welcome to use and to review it.
Note that, in contrast to the examples in the question, my function creates and returns a new list. It does not modify the provided list.
Working examples:
list = [1, 2, 3]
l2 = list_replace(list, [3], [3, 4, 5])
print('Changed: {0}'.format(l2))
print('Original: {0}'.format(list))
list = [1, 2, 3, 3, 3]
l2 = list_replace(list, [3], [8, 8], 2)
print('Changed: {0}'.format(l2))
print('Original: {0}'.format(list))
list = [1, 2, 3, 3, 3]
l2 = list_replace(list, [2, 3], [8, 8], 2)
print('Changed: {0}'.format(l2))
print('Original: {0}'.format(list))
I always print also the original list, so you can see that it is not modified:
Changed: [1, 2, 3, 4, 5]
Original: [1, 2, 3]
Changed: [1, 2, 8, 8, 8, 8, 3]
Original: [1, 2, 3, 3, 3]
Changed: [1, 8, 8, 3, 3]
Original: [1, 2, 3, 3, 3]
Now, the code (tested with Python 2.7 and with Python 3.4):
def list_replace(lst, source_sequence, target_sequence, limit=0):
    if limit < 0:
        raise Exception('A negative replacement limit is not supported')
    source_sequence_len = len(source_sequence)
    target_sequence_len = len(target_sequence)
    original_list_len = len(lst)
    # An empty source sequence would match everywhere and never advance i
    if source_sequence_len == 0 or source_sequence_len > original_list_len:
        return list(lst)
    new_list = []
    i = 0
    replace_counter = 0
    while i < original_list_len:
        suffix_is_long_enough = source_sequence_len <= (original_list_len - i)
        limit_is_satisfied = (limit == 0 or replace_counter < limit)
        if suffix_is_long_enough and limit_is_satisfied:
            if lst[i:i + source_sequence_len] == source_sequence:
                new_list.extend(target_sequence)
                i += source_sequence_len
                replace_counter += 1
                continue
        new_list.append(lst[i])
        i += 1
    return new_list
I developed a function for you (it works for your 3 requirements):
def list_replace(lst,elem,repl,n=0):
    ii=0
    if type(repl) is not list:
        repl = [repl]
    if type(elem) is not list:
        elem = [elem]
    if type(elem) is list:
        length = len(elem)
    else:
        length = 1
    for i in range(len(lst)-(length-1)):
        if ii>=n and n!=0:
            break
        e = lst[i:i+length]
        if e==elem:
            lst[i:i+length] = repl
            if n!=0:
                ii+=1
    return lst
I've tried with your examples and it works ok.
Tests made:
print(list_replace([1,2,3], 3, [3, 4, 5]))
print(list_replace([1, 2, 3, 3, 3], 3, [8, 8], 2))
print(list_replace([1, 2, 3, 3, 3], [2, 3], [8, 8], 2))
NOTE: never use list as a variable name. I need the built-in list object for the "is list" check.

Python - Remove between indexes of two values if it occurs twice in a list

Title is definitely confusing, so here's an example: Say I have a list of values [1,2,3,2,1,4,5,6,7,8]. I want to remove between the two 1s in the list, and by pythonic ways it will also end up removing the first 1 and output [1,4,5,6,7,8]. Unfortunately, due to my lack of pythonic ability, I have only been able to produce something that removes the first set:
a = [1,2,3,2,1,4,5,6,7]
uniques = []
junks = []
for value in a:
    junks.append(value)
    if value not in uniques:
        uniques.append(value)
for value in uniques:
    junks.remove(value)
for value in junks:
    a.remove(value)
    a.remove(value)
a[0] = 1
print(a)
[1,4,5,6,7]
Works with the first double occurrence and will not work with the next occurrence in a larger list. I have an idea which is to remove between the index of the first occurrence and the second occurrence which will preserve the second and not have me do some dumb thing like a[0] = 1 but I'm really not sure how to implement it.
Would this do what you asked:
a = [1, 2, 3, 2, 1, 4, 5, 6, 7, 8]
def f(l):
    x = l.copy()
    for i in l:
        if x.count(i) > 1:
            first_index = x.index(i)
            second_index = x.index(i, first_index + 1)
            x = x[:first_index] + x[second_index:]
    return x
So the output of f(a) would be [1, 4, 5, 6, 7, 8] and the output of f([1, 2, 3, 2, 1, 4, 5, 6, 7, 8, 7, 6, 5, 15, 16]) would be [1, 4, 5, 15, 16].
If you only want the unique elements, you can use set and list (note that this does not preserve the original order):
mylist = list(set(mylist))
a = [1, 2, 3, 2, 1, 4, 5, 6, 7, 8, 7, 6, 5, 15, 16]
dup = [x for x in a if a.count(x) > 1] # list of duplicates
while dup:
    pos1 = a.index(dup[0])
    pos2 = a.index(dup[0], pos1+1)
    a = a[:pos1]+a[pos2:]
    dup = [x for x in a if a.count(x) > 1]
print(a) #[1, 4, 5, 15, 16]
A more efficient solution would be
a = [1, 2, 3, 2, 1, 4, 5, 6, 7, 8, 7, 6, 5, 15, 16]
pos1 = 0
while pos1 < len(a):
    if a[pos1] in a[pos1+1:]:
        pos2 = a.index(a[pos1], pos1+1)
        a = a[:pos1]+a[pos2:]
    pos1 += 1
print(a) #[1, 4, 5, 15, 16]
(This probably isn't the most efficient way, but hopefully it helps)
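Couldn't you just check whether something appears twice? If it does, you have firstIndex and secondIndex, then:
a=[1,2,3,4,5,1,7,8,9]
b=[]
# one way to get the first and second index of the first repeated number
firstIndex = secondIndex = None
for value in a:
    if a.count(value) > 1:
        firstIndex = a.index(value)
        secondIndex = a.index(value, firstIndex + 1)
        break
for index in range(0, len(a)):
    if firstIndex is not None and index>firstIndex and index<secondIndex:
        print("We removed: "+ str(a[index]))
    else:
        b.append(a[index])
print(b)
The output is [1, 1, 7, 8, 9]. If the first occurrence should be dropped as well, change the test to index>=firstIndex, which gives [1, 7, 8, 9].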
To do the job you need:
the first and the last position of duplicated values
all indexes between, to remove them
Funny thing is, you can simply tell python to do this:
# we can use a 'smart' dictionary, that can construct default value:
from collections import defaultdict
# and 'chain' to flatten lists (ranges)
from itertools import chain
a = [1, 2, 3, 2, 1, 4, 5, 6, 7]
# build dictionary where each number is key, and value is list of positions:
index = defaultdict(list)
for i, item in enumerate(a):
    index[item].append(i)
# let's take only the first and last index for non-single values
edges = ((pos[0], pos[-1]) for pos in index.values() if len(pos) > 1)
# we can use range() to get us all index positions in-between
# ...use chain.from_iterable to flatten our list
# ...and make set of it for faster lookup:
to_remove = set(chain.from_iterable(range(start, end)
                                    for start, end in edges))
result = [item for i, item in enumerate(a) if i not in to_remove]
# expected: [1, 4, 5, 6, 7]
print(result)
Of course you can make it shorter:
index = defaultdict(list)
for i, item in enumerate(a):
    index[item].append(i)
to_remove = set(chain.from_iterable(range(pos[0], pos[-1])
                                    for pos in index.values() if len(pos) > 1))
print([item for i, item in enumerate(a) if i not in to_remove])
This solution is roughly linear in the length of the list and should be pretty fast. The cost is additional memory for the dictionary and the set, so you should be careful with huge data sets. But if you have a lot of data, other solutions that use lst.index will choke anyway, because they are O(n^2) with a lot of dereferencing and function calls.
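Another way to avoid repeated lst.index calls is a single pass that treats the output like a stack: when a value shows up again, everything appended after its first occurrence is thrown away. This is only a sketch under assumptions: collapse_repeats is a hypothetical name, elements must be hashable, and with interleaved repeats such as [1, 2, 1, 2] the result can differ from the index-range answer above.

```python
def collapse_repeats(items):
    """Cut out everything between two occurrences of a value, keeping one copy."""
    out = []
    position = {}  # value -> index of that value inside out
    for item in items:
        if item in position:
            keep = position[item] + 1
            # Forget the values that are about to be cut, then cut them.
            for dropped in out[keep:]:
                del position[dropped]
            del out[keep:]
        else:
            position[item] = len(out)
            out.append(item)
    return out


print(collapse_repeats([1, 2, 3, 2, 1, 4, 5, 6, 7, 8]))
# [1, 4, 5, 6, 7, 8]
print(collapse_repeats([1, 2, 3, 2, 1, 4, 5, 6, 7, 8, 7, 6, 5, 15, 16]))
# [1, 4, 5, 15, 16]
```

Each element is appended at most once and removed at most once, so the total work stays proportional to the length of the input.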
