Problem to solve: Define a Python function remdup(l) that takes a non-empty list of integers l
and removes all duplicates in l, keeping only the last occurrence of each number. For instance:
remdup([3,1,3,5]) should give the result [1,3,5].
def remdup(l):
    for last in reversed(l):
        pos=l.index(last)
        for search in reversed(l[pos]):
            if search==last:
                l.remove(search)
    print(l)
remdup([3,5,7,5,3,7,10])
# intended output [5, 3, 7, 10]
In the for loop on line 4, I want reversed() to check each number except the one at index pos, but written the way it is above, it takes the value at pos rather than a part of the list. How can I solve this?
You need to reverse the entire slice, not merely one element:
for search in reversed(l[:pos]):
Note that you will likely run into problems because you are modifying the list while iterating over it.
It took me a few minutes to follow the logic. What you actually need is the rest of the list after pos:
for search in reversed(l[pos+1:]):
Output:
[5, 3, 7, 10]
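The pitfall mentioned above, mutating a list while a for loop is walking it, can be seen in a small sketch. The lists and values here are made up for illustration; the point is only that a removal shifts the later elements left, so the loop steps over one of them. Iterating over a copy is one common way around it.

```python
# Illustrative values only; not taken from the question.
nums = [1, 2, 2, 3]
for n in nums:
    if n == 2:
        nums.remove(n)  # shifts later items left, so the second 2 is never visited

# Iterating over a copy (nums_safe[:]) keeps the loop unaffected by the removals.
nums_safe = [1, 2, 2, 3]
for n in nums_safe[:]:
    if n == 2:
        nums_safe.remove(n)

print(nums)       # one 2 survives
print(nums_safe)  # both 2s are gone
```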
Your original algorithm could be improved. The nested loop leads to some unnecessary complexity.
Alternatively, you can do this:
def remdup(l):
    seen = set()
    for i in reversed(l):
        if i in seen:
            l.remove(i)
        else:
            seen.add(i)
    print(l)
I use the 'seen' set to keep track of the numbers that have already appeared.
However, this would be more efficient:
def remdup(l):
    seen = set()
    for i in range(len(l)-1, -1, -1):
        if l[i] in seen:
            del l[i]
        else:
            seen.add(l[i])
    print(l)
In the second algorithm, we iterate over the indices in reverse order using range(), and delete any item whose value already exists in 'seen'. l.remove(i) has to search the list from the front for the value on every call, whereas del l[i] goes straight to the index, so the second version does less work per removal. It is also clear to see exactly which element is deleted in the second algorithm, so I would say that it is the safer option.
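If mutating the argument is not required, the same 'seen' idea can build a new list instead, which avoids the repeated deletions altogether. This is only a sketch; the name remdup_new is made up here, and unlike the versions above it returns the result rather than printing it.

```python
def remdup_new(l):
    """Return a new list keeping only the last occurrence of each value."""
    seen = set()
    kept = []
    for value in reversed(l):   # walk from the end so the last occurrence wins
        if value not in seen:
            seen.add(value)
            kept.append(value)
    kept.reverse()              # restore the original left-to-right order
    return kept


print(remdup_new([3, 5, 7, 5, 3, 7, 10]))
```

The input list is left untouched, so the caller decides whether to rebind it or assign the result back with slice assignment.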
This is a fairly inefficient way of accomplishing this:
def remdup(l):
    i = 0
    while i < len(l):
        v = l[i]
        scan = i + 1
        while scan < len(l):
            if l[scan] == v:
                l.remove(v)  # removes the earlier occurrence, which is at index i
                i -= 1       # cancels the i += 1 below so the shifted-in element is checked
                break        # the original element is gone, so stop scanning for it
            scan += 1
        i += 1
l = [3,5,7,5,3,7,10]
remdup(l)
print(l)
It essentially walks through the list (indexed by i). For each element, it scans forward in the list for a match, and when it finds one, it removes the original element and stops scanning. Since removing an element shifts the indices, it steps i back so that the element which moved into that position is examined next.
It takes advantage of the built-in list.remove: "Remove the first item from the list whose value is equal to x."
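As a quick illustration of the quoted behaviour (the list here is invented for the example), remove() drops only the earliest match, which is why the later occurrence is the one that survives:

```python
values = [5, 3, 5, 7]
values.remove(5)  # only the first 5 (index 0) is removed
print(values)
```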
Here is another solution, iterating backward and popping the index of a previously encountered item:
def remdup(l):
    visited= []
    for i in range(len(l)-1, -1, -1):
        if l[i] in visited:
            l.pop(i)
        else:
            visited.append(l[i])
    print(l)
remdup([3,5,7,5,3,7,10])
#[5, 3, 7, 10]
Using dictionary:
def remdup(ar):
    d = {}
    for i, v in enumerate(ar):
        d[v] = i
    return [pair[0] for pair in sorted(d.items(), key=lambda x: x[1])]
if __name__ == "__main__":
    test_case = [3, 1, 3, 5]
    output = remdup(test_case)
    expected_output = [1, 3, 5]
    assert output == expected_output, f"Error in {test_case}"
    test_case = [3, 5, 7, 5, 3, 7, 10]
    output = remdup(test_case)
    expected_output = [5, 3, 7, 10]
    assert output == expected_output, f"Error in {test_case}"
Explanation
Keep the last index of each occurrence of the numbers in a dictionary. So, we store like: dict[number] = last_occurrence
Sort the dictionary by values and use list comprehension to make a new list from the keys of the dictionary.
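A variation on the same dictionary idea can skip the sort. This sketch assumes Python 3.7 or later, where dicts keep insertion order; the name remdup_ordered is made up for the example. Deleting a key before re-adding it moves that key to the end, so the keys end up ordered by last occurrence.

```python
def remdup_ordered(ar):
    d = {}
    for v in ar:
        d.pop(v, None)  # forget any earlier position of v
        d[v] = None     # re-insert v at the end (insertion order is preserved)
    return list(d)


print(remdup_ordered([3, 5, 7, 5, 3, 7, 10]))
```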
Along with the other correct answers, here's one more, using the third-party iteration_utilities package:
from iteration_utilities import duplicates
list1=[3,5,7,5,3,7,10]
# duplicates() yields every repeated occurrence; collect them before mutating list1
dup=list(duplicates(list1))
for value in dup:
    list1.remove(value)  # drop the earliest remaining occurrence of that value
print(list1)
How about this one-liner? (Converting it to a function is easy enough to leave as an exercise.)
# one-liner version
>>> lst = [3,5,7,5,3,7,10]
>>> list(dict.fromkeys(reversed(lst)))[::-1]
[5, 3, 7, 10]
If you don't want a new list, you can do this instead:
lst[:] = list(dict.fromkeys(reversed(lst)))[::-1]
To illustrate my problem, imagine I have a list and I want to compare each element with the next one to check if they are the same value. The problem is that when I try to access the last element of the list and compare it with "the next one", that one is out of range, so I would get an error. So, to avoid this, I put a condition when accessing that last element, so I avoid the comparison.
list = [1, 2, 1, 1, 5, 6, 1,1]
for i in range(len(list)):
    if i == len(list)-1:
        print('Last element. Avoid comparison')
    else:
        if list[i] == list[i+1]:
            print('Repeated')
I guess that there should be a more efficient way to do this. For instance, I was trying to set the condition in the definition of the for loop, something like this:
for i in range(len(list)) and i < len(list)-1
But that is invalid. Any suggestion about how to do this in a more efficient/elegant way?
If you need to start from 0, you should use:
for i in range(len(list) - 1):
    if list[i] == list[i + 1]:
        print('Repeated')
The stop parameter of the range function is just an integer, so you can pass len(list) - 1 instead of len(list) to stop iterating at the second-to-last element.
Other answers have solved this, but I think it's worth mentioning an approach that may be closer to idiomatic Python. Python provides iterable unpacking and other tools like the zip function to avoid accessing elements of sequences by index.
# Better to avoid shadowing the built-in name `list`
a_list = [1, 2, 1, 1, 5, 6, 1, 1]
for value, following_value in zip(a_list, a_list[1:]):
    if value == following_value:
        print("Repeated!")
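If the positions of the repeats matter and not just the fact that one occurred, the same zip pairing can be combined with enumerate. This is a sketch; the helper name repeated_positions is made up for the example.

```python
def repeated_positions(values):
    """Return each index i where values[i] == values[i + 1]."""
    return [
        i
        for i, (value, following_value) in enumerate(zip(values, values[1:]))
        if value == following_value
    ]


a_list = [1, 2, 1, 1, 5, 6, 1, 1]
print(repeated_positions(a_list))
```

Because zip stops at the shorter of its two inputs, the last element is never paired with a missing successor, so no bounds check is needed.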
You can utilize the functionality of range as follows:
for i in range(1, len(list)):
    if list[i-1] == list[i]:
        print('Repeated')
In this way, you won't overrun the list.
Start from one and look backwards:
for i in range(1, len(list)):
    if list[i-1] == list[i]:
        print('Repeated')
This works!
list = [1, 2, 1, 1, 5, 6, 1, 1]
for i in range(len(list)):
    if i+1 < len(list) and list[i] == list[i+1]:
        print('Repeated')
len(list) is 8
range(len(list)) is 0, 1, ..., 7
but you want the for loop to skip when the index is 6 right?
so given that case ... if i == len(list)-1: this condition will be True when the index is 7 (not the index that you want)
Just change that to if i == len(list)-2:
There are many ways to do this. The most common one is to use zip to pair each item with its successor:
if any(item == successor for item,successor in zip(lst,lst[1:])):
    print('repeated')
groupby from itertools is also a popular choice (but not optimal for this):
import itertools
if any(duplicate for _,(_,*duplicate) in itertools.groupby(lst)):
    print('repeated')
A for-loop would only need to track the previous value (no need for indexing):
prev = object()      # non-matching initial value
for x in lst:
    if prev==x:      # compare to previous
        print('repeated')
        break
    prev = x         # track previous for next iteration
Iterators can be interesting when traversing data in parallel (here the elements and their predecessors):
predecessor = iter(lst)         # iterate over items from first
for x in lst[1:]:               # iterate from 2nd item
    if x == next(predecessor):  # compare to corresponding predecessor
        print('repeated')
        break
list = [1, 2, 1, 1, 5, 6, 1,1]
for i in range(len(list)):
    if list[i] in list[i+1:i+2]:
        print('repeated')
If your list contains only numbers, you might want to work with numpy, for instance:
import numpy as np
np_arr = np.array(lst) # don't use 'list' for your object name.
diffs = np.diff(np_arr)
diffs_indices = np.where(diffs != 0)[0]
It is unclear what your exact use case is, but with the list from the question this code gives:
>>> diffs_indices
array([0, 1, 3, 4, 5])
These are the indices where element[i] != element[i+1].
I'm trying to manipulate a given list in an unusual way (at least for me).
Basically, I have the list a (also image 1), it has the first index as principal. Now, I want to iterate through the other indexes and if a certain value match with one of those in the first index, I want to insert the sublist of this index inside the first one.
I don't know if I was clear enough, but the goal should be the list b (also image 2). I think a recursive function should be used here, but I don't know how. Do you guys think it's possible?
Original list:
a = [[1,2,3],[2,5],[6,3],[10,5]]
Expected Output:
b = [[1,2,[2,5,[10,5]],3,[6,3]]]
You could use a dictionary to record where the first occurrence of each number is found, recording the list in which it was found, and at which index. If then a list is found that has a value that was already encountered, the recorded list can be mutated having the matching list inserted. If there was no match (which is the case for the very first list [1,2,3]), then this list is just appended to the result.
Because insertion into a list will impact other insertion points, I suggest first collecting the insertion actions and then applying them in reversed order.
Here is the code for that:
def solve(a):
    dct = {}
    result = []
    insertions = []
    for lst in a:
        found = None
        for i, val in enumerate(lst):
            if val in dct:
                found = val
            else:
                dct[val] = [lst, i]
        if found is None:
            result.append(lst)
        else:
            insertions.append((*dct[found], lst))
    for target, i, lst in reversed(insertions):
        target.insert(i + 1, lst)
    return result
# Example run:
a = [[1,2,3],[2,5],[6,3],[10,5]]
print(solve(a))
Output:
[[1, 2, [2, 5, [10, 5]], 3, [6, 3]]]
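Why the recorded insertions are applied in reversed order can be seen on a flat list. This is only a sketch with made-up data and a hypothetical helper, apply_insertions, assuming the positions were recorded against the original list in ascending order: applying them front to back shifts every later target, while applying them back to front does not.

```python
def apply_insertions(base, insertions, reverse):
    """Insert each (index, item) pair into a copy of base.

    The indices refer to positions in the original, unmodified list.
    """
    result = list(base)
    ordered = reversed(insertions) if reverse else insertions
    for index, item in ordered:
        result.insert(index, item)
    return result


base = ["a", "b", "c"]
insertions = [(1, "X"), (2, "Y")]  # X should land before "b", Y before "c"

forward = apply_insertions(base, insertions, reverse=False)
backward = apply_insertions(base, insertions, reverse=True)
print(forward)   # Y ends up in the wrong place
print(backward)  # both items land where intended
```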
I had the following code:
return [p.to_dict() for p in points]
I changed it to only return every nth row:
n = 100
count = 0
output = []
for p in points:
    if (count % n == 0):
        output.append(p.to_dict())
    count += 1
return output
Is there a more pythonic way to write this, to achieve the same result?
Use enumerate and a modulo test on the index in a list comprehension to keep only the items whose index is divisible by n:
return [p.to_dict() for i,p in enumerate(points) if i % n == 0]
List comprehension filtering is good, but in this case eduffy's answer, which suggests slicing with a step, is better since the indices are computed directly. Use the filter part only when you cannot predict the indices.
Improving on that answer even more: it's even better to use itertools.islice so that no temporary list is generated:
import itertools
return [p.to_dict() for p in itertools.islice(points,None,None,n)]
itertools.islice(points,None,None,n) is equivalent to points[::n] but performing lazy evaluation.
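One practical difference is that islice also works on iterables that cannot be sliced at all, such as generators. The generator below is a hypothetical stand-in for such a data source, used only to illustrate the call.

```python
import itertools


def numbers():
    """Hypothetical stand-in for a data source that does not support slicing."""
    yield from range(10)


# numbers()[::3] would raise TypeError; islice consumes the generator lazily instead.
every_third = list(itertools.islice(numbers(), None, None, 3))
print(every_third)
```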
The list slicing syntax takes an optional third argument to define the "step". This takes every 3rd element of a list:
>>> list(range(10))
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
>>> list(range(10))[::3]
[0, 3, 6, 9]
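Applied to the question's comprehension, the step slice might look like the sketch below. The Point class is a made-up stand-in for the question's objects (only to_dict() is taken from the question), and it assumes points is a list or another sequence that supports slicing.

```python
class Point:
    """Hypothetical stand-in for the question's point objects."""

    def __init__(self, x, y):
        self.x, self.y = x, y

    def to_dict(self):
        return {"x": self.x, "y": self.y}


points = [Point(i, i * i) for i in range(10)]
n = 3
output = [p.to_dict() for p in points[::n]]  # indices 0, n, 2n, ...
print(output)
```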
You can use enumerate with a list comprehension:
[p.to_dict() for i, p in enumerate(points) if i %100 == 0]
I have a bug in my attempt to add to a list a sequence of numbers recursively. E.g. if the input is [5,3,9], I do [5+1,3+2,9+3] and output [6,5,12]. I want to do this recursively so the way I'm doing it is going through and adding one to a smaller and smaller part of the list as below:
def add_position_recur(lst, number_from=0):
    length = len(lst)
    # base case
    if (length <= 1):
        lst = [x+1 for x in lst]
        print "last is", lst
    else:
        lst = [x+1 for x in lst]
        print "current list is", lst
        add_position_recur(lst[1:], number_from)
    return lst
The problem, though, is that all this does is add 1 to every element of the list. Where is the bug? Is it to do with the way I return the list in the base case?
When you recurse down your call stack you slice lst, which creates a new list. That new list is not the one you return, so you only ever return the changes applied in the first call to the function and lose all the changes made further down the stack:
>>> add_position_recur([1,2,3])
[2, 3, 4]
This should have returned [2, 4, 6].
You need to consider reassembling the list on the way out to get the changes.
return [lst[0]] + add_position_recur(lst[1:], number_from)
and you need to return lst in your base case:
def add_position_recur(lst, number_from=0):
    length = len(lst)
    # base case
    if (length <= 1):
        lst = [x+1 for x in lst]
        return lst
    else:
        lst = [x+1 for x in lst]
        return [lst[0]] + add_position_recur(lst[1:], number_from)
>>> add_position_recur([1,2,3])
[2, 4, 6]
However, this is quite a complicated approach to this recursion. It is idiomatic for the base case to be the empty list, otherwise take the head and recurse down the tail. So something to consider which uses the number_from:
def add_position_recur(lst, number_from=1):
    if not lst:
        return lst
    return [lst[0]+number_from] + add_position_recur(lst[1:], number_from+1)
>>> add_position_recur([1,2,3])
[2, 4, 6]
This also has the advantage(?) of not changing the passed in lst
Why don't you instead do something like this:
def func(lon, after=None):
    if after is None:  # a default of [] would be shared between calls
        after = []
    if lon:
        v = len(lon) + lon[-1]
        after.append(v)
        func(lon[:-1], after)
    return after[::-1]
The output of the function for the example you provided matches what you want.
Currently, you are simply adding 1 to each value of your list.
lst = [x+1 for x in lst]
Rather, you should be increasing a variable which is being added to x with each iteration of x in lst.
lst = [x+(lst.index(x)+1) for x in lst]
This solution assumes that you want the number being added to x to depend on its position in the list relative to the start of the list, rather than being dependent on the position of x relative to the first element which was >1. Meaning, do you want to add 1 or 3 to the value 2 in the following list? The solution above adds three.
lst = [0.5, 0.1, 2, 3]
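One thing to watch with lst.index(x) is that it returns the position of the first matching value, so repeated values all get the same offset. If the offset should follow each element's own position, enumerate is a possible alternative; the function name add_position below is made up for this sketch.

```python
def add_position(lst):
    """Add 1 to the first element, 2 to the second, and so on."""
    return [x + i for i, x in enumerate(lst, start=1)]


print(add_position([5, 3, 9]))

# With repeated values the two approaches differ:
repeated = [2, 2]
by_index = [x + (repeated.index(x) + 1) for x in repeated]  # both 2s get offset 1
by_enumerate = add_position(repeated)                        # offsets 1 and 2
print(by_index, by_enumerate)
```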
I'm trying to sort a list from smallest to biggest integers. Unfortunately I get the error stated above when I try to run it.
Traceback (most recent call last):
  File "lesson_4/selection_sort.py", line 24, in <module>
    print selection_sort([-8, 8, 4, -4, -2, 2]) # [-8, -4, -2, 2, 4, 8]
  File "lesson_4/selection_sort.py", line 14, in selection_sort
    lst.remove(min)
ValueError: list.remove(x): x not in list
Here is the code of selection_sort.py
def selection_sort(lst):
    sorted = []
    list_len = len(lst) # Store this now because our loop will make it
                        # smaller
    min = lst[0]
    i = 1
    while list_len > 0:
        while i < list_len:
            item = lst[i]
            if item < min:
                min = item
            i += 1
        lst.remove(min)
        sorted.append(min)
    return sorted
# Test Code
print "Testing"
print selection_sort([-8, 8, 4, -4, -2, 2]) # [-8, -4, -2, 2, 4, 8]
Thanks for helping me out!
On your first pass through the list, you find the minimum element. However, on your second pass, min is still set to the minimum element in the original list. As a result, item < min is never true, and min forever remains the minimum element of the original list. Then when you try to remove it, you can't, because you already got rid of that item on the previous pass (unless there is a tie for the minimum, in which case this will happen as soon as all those elements are removed).
To solve this, just move min = lst[0] inside the first loop, so you reset it to a valid value each time.
You've also got some other issues, which I will mention here briefly:
You never update list_len, so you'll get an error at the end of the second pass through the outer loop (when you attempt to go beyond the length of the list). You'd also loop forever if it didn't break first. Luckily this whole variable is unneeded: you can use len(lst) in the outer loop, and replace your inner while loop with this:
for item in lst: # But see below regarding variable names!
    if item < min:
        min = item
This eliminates the need to track i separately and avoids any issues with the length of the list.
Next: this looks like homework, so it's probably not critical at this moment, but it's definitely worth mentioning: if I pass a list to a function called selection_sort, I would be very surprised to discover that after being sorted, my original list is now empty!! It's generally bad form to modify an input unless you're doing so explicitly (e.g. an in-place sort), so I highly recommend that you do all your work on a copy of the input, to avoid deleting all the content of the original:
lst_copy = lst[:] # If `lst` contains mutable objects (e.g. other lists), use deepcopy instead!
# Do stuff with lst_copy and avoid modifying lst
Finally, you've got two variables shadowing built-in functions: sorted and min. While this will technically work, it's poor form, and it's best to get into the habit of not naming local variables the same as builtins. By convention, if it's really the best name for the object, you can just add an underscore to the name to distinguish it from the builtin: min_ and sorted_ (or maybe better, output), for example.
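Putting those suggestions together, one possible version is sketched below. It is not the only way to write it; the names remaining, min_ and output are choices made for this example, and the shallow copy assumes the list holds plain values such as integers.

```python
def selection_sort(lst):
    remaining = lst[:]  # work on a copy so the caller's list is left intact
    output = []
    while remaining:
        min_ = remaining[0]      # reset to a valid value on every pass
        for item in remaining:
            if item < min_:
                min_ = item
        remaining.remove(min_)
        output.append(min_)
    return output


original = [-8, 8, 4, -4, -2, 2]
print(selection_sort(original))
print(original)  # unchanged
```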
If you simply want to sort the list, you can use the built-in list.sort() method:
>>> lst=[-8, 8, 4, -4, -2, 2]
>>> lst.sort()
>>> lst
[-8, -4, -2, 2, 4, 8]
If you want to sort by your method, there are two slight mistakes in your code: you need to decrement list_len every time you remove an element, and you need to reinitialize min to lst[0] (and i to 1) at the start of every pass. The outer loop should stay while list_len > 0 so that the final remaining element is also moved into the result. Demo is given below:
>>> def selection_sort(lst):
        sorted = []
        list_len = len(lst) # Store this now because our loop will make it
                            # smaller
        while list_len > 0:
            min = lst[0]   # reinitialize min on every pass
            i = 1          # restart the scan on every pass
            while i < list_len:
                item = lst[i]
                if item < min:
                    min = item
                i += 1
            lst.remove(min)
            list_len-=1    # decrement length of list
            sorted.append(min)
        return sorted
>>> selection_sort([-8, 8, 4, -4, -2, 2])
[-8, -4, -2, 2, 4, 8]