I have read in several places that it is bad practice to modify an array/list during iteration. However, many common algorithms appear to do this, for example Bubble Sort, Insertion Sort, and the example below for finding the minimum number of swaps needed to sort a list.
Is swapping list items during iteration an exception to the rule? If so, why?
Is there a difference between what happens with enumerate and with a simple for i in range(len(arr)) loop in this regard?
def minimumSwaps(arr):
    ref_arr = sorted(arr)
    index_dict = {v: i for i,v in enumerate(arr)}
    swaps = 0
    for i,v in enumerate(arr):
        print("i:", i, "v:", v)
        print("arr: ", arr)
        correct_value = ref_arr[i]
        if v != correct_value:
            to_swap_ix = index_dict[correct_value]
            print("swapping", arr[to_swap_ix], "with", arr[i])
            # Why can you modify list during iteration?
            arr[to_swap_ix],arr[i] = arr[i], arr[to_swap_ix]
            index_dict[v] = to_swap_ix
            index_dict[correct_value] = i
            swaps += 1
    return swaps

arr = list(map(int, "1 3 5 2 4 6 7".split(" ")))
assert minimumSwaps(arr) == 3
An array should not be modified while iterating through it, because iterators cannot handle the changes. But there are other ways to go through an array, without using iterators.
This is using iterators:
for index, item in enumerate(array):
    # don't modify array here
This is without iterators:
for index in range(len(array)):
    item = array[index]
    # feel free to modify array, but make sure index and len(array) are still OK
If the length & index need to be modified when modifying an array, do it even more "manually":
index = 0
while index < len(array):
    item = array[index]
    # feel free to modify array and modify index if needed
    index += 1
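As an illustration of the manual while loop above, here is a minimal sketch that deletes items in place. The function name remove_zeros_in_place and the choice of deleting zeros are assumptions made up for this example, not part of the question.

```python
def remove_zeros_in_place(array):
    """Delete every 0 from array in place using a manually managed index."""
    index = 0
    while index < len(array):  # len(array) is re-evaluated on every pass
        if array[index] == 0:
            # After the deletion the next element has shifted into this
            # position, so the index must not advance.
            del array[index]
        else:
            index += 1
    return array


print(remove_zeros_in_place([0, 1, 0, 0, 2]))
```

Because the loop condition re-reads len(array) each time and the index only advances when nothing was deleted, no element is skipped and no index runs past the end.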
Modifying items in a list during iteration can sometimes produce unexpected results, but it is perfectly fine to do if you are aware of the effects. It is not unpredictable.
You need to understand that you are iterating over the original list, not a copy of it. The next item is always whatever is stored at the next index of the list. So if you alter the item at an index before the iterator reaches it, the iterator will yield the new value.
That means that if, for example, you intend to move all items one index up by setting the item at index+1 to the current value yielded by enumerate(), you will end up with a list completely filled with the item originally at index 0.
a = ['a','b','c','d']
for i, v in enumerate(a):
    next_i = (i + 1) % len(a)
    a[next_i] = v
print(a) # prints ['a', 'a', 'a', 'a']
And if you append or insert items into the list while iterating, you may never reach the end.
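A small sketch of that effect, with an artificial length limit of 10 added (an assumption for this example) so that the loop does terminate:

```python
items = [1, 2, 3]
visited = 0
for x in items:
    visited += 1
    # Without this limit the loop would keep chasing the growing list.
    if len(items) < 10:
        items.append(x)

# The loop visited the appended items too, not just the original three.
print(visited, len(items))
```

The list iterator looks up the next index on every step, so every appended item is eventually visited as well.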
In your example, and, as you pointed out, in many algorithms (for example combinatorial and sorting algorithms), changing the upcoming items is part of the algorithm.
An iterator over a range, as in for i in range(len(arr)), won't adapt to changes in the original list, because the range is created before the loop starts and is immutable. So if the list has length 4 at the beginning, the loop will try to iterate exactly 4 times regardless of changes to the list's length.
# This is probably a bad idea
for i in range(len(arr)):
    item = arr[i]
    if item == 0:
        arr.pop()

# This will work (don't ask for a use case)
for i, item in enumerate(arr):
    if item == 0:
        arr.pop()
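To make the difference concrete, the two loops can be wrapped in functions and run on the same input. The function names and the sample list [0, 1, 2, 3] are assumptions chosen for this sketch.

```python
def pop_with_range(arr):
    arr = list(arr)  # work on a copy so the caller's list is untouched
    try:
        for i in range(len(arr)):  # the range is fixed at the original length
            if arr[i] == 0:
                arr.pop()
    except IndexError:
        return "IndexError"
    return arr


def pop_with_enumerate(arr):
    arr = list(arr)
    for i, item in enumerate(arr):  # stops when the shortened list runs out
        if item == 0:
            arr.pop()
    return arr


print(pop_with_range([0, 1, 2, 3]))
print(pop_with_enumerate([0, 1, 2, 3]))
```

The range version still tries index 3 after the list has shrunk to three items, while the enumerate version simply stops at the new end of the list.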
A=[2,3,4,1] B=[1,2,3,4]
I need to find how many elements of list A appear earlier in A than the same element appears in list B. In this case those are the values 2, 3 and 4, so the expected return value is 3.
def count(a, b):
    muuttuja = 0
    for i in range(0, len(a)-1):
        if a[i] != b[i] and a[i] not in b[:i]:
            muuttuja += 1
    return muuttuja
I have tried this kind of solution, but it is very slow when processing lists that have a great number of values. I would appreciate some suggestions for alternative methods of doing the same thing more efficiently. Thank you!
If both lists have unique elements, you can build a map from element (as key) to index (as value). In Python this can be done with a dictionary. Since a dictionary lookup takes O(1) time on average, this code has a time complexity of O(n).
A=[2,3,4,1]
B=[1,2,3,4]
d = {}
count = 0
for i,ele in enumerate(A) :
    d[ele] = i
for i,ele in enumerate(B) :
    if i > d[ele] :
        count+=1
Use a set of already seen B-values.
def count(A, B):
    result = 0
    seen = set()
    for a, b in zip(A, B):
        seen.add(b)
        if a not in seen:
            result += 1
    return result
This only works if the values in your lists are hashable (as immutable built-in values such as numbers, strings and tuples are).
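A short sketch of what that restriction means in practice. The nested-list inputs are made up for this example, and converting them to tuples is just one possible workaround.

```python
def count(A, B):
    result = 0
    seen = set()
    for a, b in zip(A, B):
        seen.add(b)
        if a not in seen:
            result += 1
    return result


# Hypothetical inputs whose elements are lists, which cannot go into a set.
A = [[2], [3], [4], [1]]
B = [[1], [2], [3], [4]]

try:
    count(A, B)
    failed = False
except TypeError:
    failed = True  # adding a list to a set raises TypeError

# Tuples are hashable, so converting the elements first makes it work.
as_tuples = count([tuple(x) for x in A], [tuple(x) for x in B])
print(failed, as_tuples)
```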
Your method is slow because it has a time complexity of O(N²): checking if an element exists in a list of length N is O(N), and you do this N times. We can do better by using up some more memory instead of time.
First, iterate over b and create a dictionary mapping the values to the first index that value occurs at:
b_map = {}
for index, value in enumerate(b):
    if value not in b_map:
        b_map[value] = index
b_map is now {1: 0, 2: 1, 3: 2, 4: 3}
Next, iterate over a, counting how many elements have an index less than that element's value in the dictionary we just created:
result = 0
for index, value in enumerate(a):
    if index < b_map.get(value, -1):
        result += 1
Which gives the expected result of 3.
b_map.get(value, -1) is used to protect against the situation when a value in a doesn't occur in b, and you don't want to count it towards the total: .get returns the default value of -1, which is guaranteed to be less than any index. If you do want to count it, you can replace the -1 with len(a).
The second snippet can be replaced by a single call to sum:
result = sum(index < b_map.get(value, -1)
             for index, value in enumerate(a))
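Putting the two steps together, here is a sketch that also shows the effect of the default passed to .get. The function name count_earlier and the count_missing flag are hypothetical names introduced for this example.

```python
def count_earlier(a, b, count_missing=False):
    """Count elements of a whose index in a is smaller than their first index in b."""
    b_map = {}
    for index, value in enumerate(b):
        b_map.setdefault(value, index)  # keeps only the first index of each value

    # -1 is smaller than any index, len(a) is larger than any index into a.
    default = len(a) if count_missing else -1
    return sum(index < b_map.get(value, default)
               for index, value in enumerate(a))


print(count_earlier([2, 3, 4, 1], [1, 2, 3, 4]))
# 5 does not occur in b, so the flag decides whether it is counted.
print(count_earlier([2, 5, 1], [1, 2, 3]))
print(count_earlier([2, 5, 1], [1, 2, 3], count_missing=True))
```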
You can make a prefix-count of A, which is an array where for each index you keep track of the number of occurrences of each element before the index.
You can use this to efficiently look-up the prefix-counts when looping over B:
import collections

A=[2,3,4,1]
B=[1,2,3,4]
prefix_count = [collections.defaultdict(int) for _ in range(len(A))]
prefix_count[0][A[0]] += 1
for i, n in enumerate(A[1:], start=1):
    prefix_count[i] = collections.defaultdict(int, prefix_count[i-1])
    prefix_count[i][n] += 1
prefix_count_b = sum(prefix_count[i][n] for i, n in enumerate(B))
print(prefix_count_b)
This outputs 3.
This could still be O(N*N) because of the copy from the previous index when initializing the prefix_count array. If someone knows a better way to do this, please let me know.
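One possible way to avoid the per-index copy, sketched here as a suggestion rather than as the method above: since the prefix counts are only ever read at the current index, a single running counter can be updated while walking both lists together. The function name prefix_count_sum is made up for this example.

```python
from collections import Counter


def prefix_count_sum(a, b):
    """Sum, over each index i, the number of times b[i] occurs in a[:i+1]."""
    seen = Counter()  # running counts of the elements of a seen so far
    total = 0
    for x, y in zip(a, b):
        seen[x] += 1
        total += seen[y]  # a Counter returns 0 for keys it has not seen
    return total


print(prefix_count_sum([2, 3, 4, 1], [1, 2, 3, 4]))
```

This computes the same sum as the prefix_count version in a single pass with one dictionary, assuming the elements are hashable.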
I have an array that I am looping over and I want to compare each element to the element next to it, and if it is larger say, then I want to do something with its index. It's clear to me that enumeration would help in this case; however I am running into an 'index out of range error':
array = [1,2,3,4]
for index,i in enumerate(array):
    if array[index]>array[index+1]:
        ...
While I know there are other ways of doing this, is there a way I can make the above work with enumerate? I tried enumerate(array)-1, knowing this would not work. Is there anything of this sort that would fix the indexing error? Thanks
I know we can easily do the above with simply using 'i' from the for loop, but just curious if I can manipulate enumeration here.
You can just shorten the range:
for i, val in enumerate(array[:-1]):
    if val > array[i+1]:
        # do stuff
If you don't need the index, you can use zip to the same effect:
for prev, crnt in zip(array, array[1:]):
    if prev > crnt:
        # do stuff not requiring index
The slicing requires O(n) extra space. If you don't want that, you can use your original approach with a simple range instead of enumerate:
for i in range(len(array)-1):
    if array[i] > array[i+1]:
        # ...
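If you want to keep both enumerate and zip without copying the list, one option is itertools.islice, which yields the shifted elements lazily. This is a sketch of that idea; the function name descents and the second sample list are assumptions for the example.

```python
from itertools import islice


def descents(array):
    """Return every index i for which array[i] > array[i + 1]."""
    # islice(array, 1, None) walks the list from index 1 without copying it.
    pairs = zip(array, islice(array, 1, None))
    return [i for i, (prev, crnt) in enumerate(pairs) if prev > crnt]


print(descents([1, 2, 3, 4]))
print(descents([3, 1, 2, 0]))
```

zip stops at the shorter input, so the last element is never compared with a non-existent neighbour.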
Use zip and slice:
for i, j in zip(array, array[1:]):
    print(f'i: {i} - j: {j}')
Output:
i: 1 - j: 2
i: 2 - j: 3
i: 3 - j: 4
Doing it this way sets index to 0 and i to 1 on the first iteration, because list indices start at 0 in Python. So at index=3, you are looking at array[index+1]=array[4], which does not exist! That is why the program reports an 'index out of range' error.
Here is what I suggest if you want to stick with lists:
array = [1,2,3,4]
for i in range(len(array)-1):
    if array[i]>array[i+1]:
        ...
If you want to manipulate the index, it is the current index in your loop (i). Maybe I have not understood your issue correctly, but I suggest using numpy if you want to work with array-like objects.
Charles
What logic do you need to apply for the last element in the list?
You can use the range function instead of enumerate.
If you don't need to apply any logic to the last element, use the following:
array = [1,2,3,4]
l = len(array)
for i in range(l-1):
    if array[i]>array[i+1]:
        ...
If you do need to apply logic to the last element, use the following:
array = [1,2,3,4]
l = len(array)
for i in range(l):
    if i==l-1:
        ...  # implement the last-element logic here
    else:
        if array[i]>array[i+1]:
            ...
I may not use advanced functions, as this is a logic test for an interview.
I am trying to remove the duplicates of all numbers which appear more than once in the array.
testcase:
a=[1,1,2,3,2,4,5,6,7]
code:
def dup(a):
    i=0
    arraySize = len(a)
    print(arraySize)
    while i < arraySize:
        #print("1 = ",arraySize)
        k=i+1
        for k in range(k,arraySize):
            if a[i] == a[k]:
                a.remove(a[k])
                arraySize -= 1
                #print("2 = ",arraySize)
        i += 1
    print(a)
The result should be: 1,2,3,4,5,6,7
But I keep getting an index out of range error. I know that it is because the list changed inside the loop, so the index range the "while" loop started with no longer matches the new one.
The question is: is there any way to sync the new length (of the array inside the loop) with the parent loop (the index in the "while" loop)?
The only thing I can think of is to use a function inside the loop.
Any hints?
Re-Calculating Array Size Per Iteration
It looks like we have a couple issues here. The first issue is that you can't update the "stop" value in your inner loop (the range function). So first off, let's remove that and use another while loop to give us the ability to re-calculate our array size every iteration.
Re-Checking Values Shifted Into Removed List Spot
Next, after you fix that, you will run into a larger issue. When you remove an element, the rest of the list shifts one position to the left to fill the removed spot, and you are not re-checking the value that got moved into that spot. To resolve this, we need to decrement i whenever we remove an element; this makes sure we check the value that gets placed into the removed element's spot.
remove vs del
You should use del rather than remove in this case. remove iterates over the list and removes the first occurrence of the value, whereas we already know the exact index of the element we want to remove. remove might work, but its usage here overcomplicates things a bit.
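A small sketch of the difference, using a made-up list in which the value at index 2 also occurs at index 0:

```python
by_value = [1, 2, 1, 3]
by_value.remove(by_value[2])  # searches for the value 1 and deletes its FIRST occurrence

by_index = [1, 2, 1, 3]
del by_index[2]               # deletes exactly the element at index 2

print(by_value)
print(by_index)
```

remove deleted the element at index 0 rather than the one at index 2, which is why del is the safer choice when the index is already known.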
Functional Code with Minimal Changeset
def dup(a):
    i = 0
    arraySize = len(a)
    print(arraySize)
    while i < arraySize:
        k = i + 1
        while k < arraySize: # CHANGE: use a while loop to have greater control over the array size.
            if a[i] == a[k]:
                print("Duplicate found at indexes %d and %d." % (i, k))
                del a[i] # CHANGE: used del instead of remove.
                i -= 1 # CHANGE: you need to recheck the new value that got placed into the old removed spot.
                arraySize -= 1
                break
            k += 1
        i += 1
    return a
Now, I'd like to note that we have some readability and maintainability issues with the code above. Iterating through an array and manipulating the iterator in the way we are doing is a bit messy and could be prone to simple mistakes. Below are a couple ways I'd implement this problem in a more readable and maintainable manner.
Simple Readable Alternative
def remove_duplicates(old_numbers):
    """ Simple/naive implementation to remove duplicate numbers from a list of numbers. """
    new_numbers = []
    for old_number in old_numbers:
        is_duplicate = False
        for new_number in new_numbers:
            if old_number == new_number:
                is_duplicate = True
        if not is_duplicate:
            new_numbers.append(old_number)
    return new_numbers
Optimized Low Level Alternative
def remove_duplicates(numbers):
    """ Removes all duplicates in the list of numbers in place. """
    for i in range(len(numbers) - 1, -1, -1):
        for k in range(i, -1, -1):
            if i != k and numbers[i] == numbers[k]:
                print("Duplicate found. Removing number at index: %d" % i)
                del numbers[i]
                break
    return numbers
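Both alternatives above compare every number against the others, which is quadratic in the worst case. If a set is allowed in the interview setting (an assumption, since the question rules out "advanced" functions), a linear-time sketch could look like this; the function name remove_duplicates_with_set is hypothetical.

```python
def remove_duplicates_with_set(numbers):
    """Return a new list with duplicates removed, keeping first occurrences in order."""
    seen = set()   # values already copied into the result
    unique = []
    for number in numbers:
        if number not in seen:
            seen.add(number)
            unique.append(number)
    return unique


print(remove_duplicates_with_set([1, 1, 2, 3, 2, 4, 5, 6, 7]))
```

It builds a new list instead of deleting from the original, which also sidesteps the index bookkeeping entirely.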
You could copy contents in another list and remove duplicates from that and return the list. For example:
duplicate = a.copy()
f = 0
for j in range(len(a)):
    for i in range(len(duplicate)):
        if i < len(duplicate):
            if a[j] == duplicate[i]:
                f = f+1
                if f > 1:
                    f = 0
                    duplicate.remove(duplicate[i])
    f=0
print(duplicate)
I'm not sure about the space complexity of these two selection sort implementations:
def selection_sort(lst):
    n = len(lst)
    for i in range(n):
        m_index = i
        for j in range(i+1,n):
            if lst[m_index] > lst[j]:
                m_index = j
        swap(lst, i, m_index)
    return None
and this one:
def selection_sort2(lst):
    n = len(lst)
    for i in range(n):
        m = min(lst[i:n])
        m_index = lst.index(m) #find the index of the minimum
        lst[i], lst[m_index] = lst[m_index], lst[i]
    return None
and, regarding the second code, where are the previous slices being saved, once m gets a new slice?
Thanks!
The first point to make is that your second function contains a bug in its use of index. Running this:
def selection_sort2(lst):
    n = len(lst)
    for i in range(n):
        m = min(lst[i:n])
        m_index = lst.index(m) #find the index of the minimum
        lst[i], lst[m_index] = lst[m_index], lst[i]
    return

l = [5,4,1,3,4]
selection_sort2(l)
print(l)
prints out
[1, 3, 5, 4, 4]
This is because you have misunderstood the index function. What it does is to find the first occurrence of the supplied value (here m) in the supplied list (here lst). So what your code is doing is first of all to create a slice and find its min. Then the slice goes out of scope and is garbage collected. Then you find the value in the whole list (in the wrong place in this example).
We can fix this by restricting the index to the slice, though bear in mind that this is not good code, as I will explain next.
m_index = lst.index(m,i) #find the index of the minimum
With this change, the function works, but it has two problems. The first is that the slicing does (as you suspected) create a copy and so doubles the memory requirement of the code. But the second problem is that once you find the minimum value, you then pointlessly iterate through the slice a second time to find the index of the place where you found the minimum, so also doubling the run time.
The copying can be fixed by replacing the slice with a generator expression. So instead of a slice we just produce the values one at a time.
Then we can arrange to find the index of the minimum by carrying it along with the value in a tuple. Then minimising the tuples provides us with the index at the same time. The resulting code looks like this:
def selection_sort2(lst):
    n = len(lst)
    for i in range(n):
        m,m_index = min((lst[j],j) for j in range(i,n))
        lst[i], lst[m_index] = lst[m_index], lst[i]
    return
However, this code is functionally more or less the same as your first example and probably not any clearer - so why change?
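For completeness, another way to get the index of the minimum without a slice or tuples is to take the minimum over the indices, using the list's own lookup as the key. This is only a sketch of that variation; the name selection_sort3 is made up here.

```python
def selection_sort3(lst):
    """In-place selection sort that never copies part of the list."""
    n = len(lst)
    for i in range(n):
        # min over the indices i..n-1, compared by the value stored at each index
        m_index = min(range(i, n), key=lst.__getitem__)
        lst[i], lst[m_index] = lst[m_index], lst[i]


l = [5, 4, 1, 3, 4]
selection_sort3(l)
print(l)
```

Like the generator version, this needs only constant extra memory, because range objects produce their values on demand.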
I am very new to programming, so please bear with me...I have been learning Python and I just did an assessment that involved looping through a list using your current value as the next index value to go to while looping. This is roughly what the question was:
You have a zero-indexed array length N of positive and negative integers. Write a function that loops through the list, creates a new list, and returns the length of the new list. While looping through the list, you use your current value as the next index value to go to. It stops looping when A[i] = -1
For example:
A[0] = 1
A[1] = 4
A[2] = -1
A[3] = 3
A[4] = 2
This would create:
newlist = [1, 4, 2, -1]
len(newlist) = 4
It was timed and I was not able to finish, but this is what I came up with. Any criticism is appreciated. Like I said I am new and trying to learn. In the meantime, I will keep looking. Thanks in advance!
def sol(A):
    i = 0
    newlist = []
    for A[i] in range(len(A)):
        e = A[i]
        newlist.append(e)
        i == e
        if A[i] == -1:
            return len(newlist)
This might be the easiest way to do it if you're looking for the fewest lines of code to write.
A = [1,4,-1,3,2]
B = []
n = 0
while A[n] != -1:
    B.append(A[n])
    n = A[n]
B.append(-1)
print(len(B))
First of all, note that for A[i] in range(len(A)) is a pattern you certainly want to avoid, as it is an obscure construct that will modify the list A by storing increasing integers into A[i]. To loop over the elements of A, use for val in A. To loop over the indices of A, use for ind in range(len(A)) (xrange in Python 2).
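A tiny demonstration of that side effect, with a made-up three-element list:

```python
A = [10, 20, 30]
i = 0
for A[i] in range(len(A)):
    # Each pass assigns the next number from the range to A[0].
    pass

print(A)
```

The loop target A[i] is an assignment target like any other, so the loop overwrites A[0] with 0, 1 and finally 2 instead of reading from the list.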
The for loop, normally the preferred Python looping construct, is not the right tool for this problem because the problem requires iterating over the sequence in an unpredictable order mandated by the contents of the sequence. For this, you need to use the more general while loop and manage the list index yourself. Here is an example:
def extract(l):
    newlist = []
    ind = 0
    while l[ind] != -1:
        newlist.append(l[ind])
        ind = l[ind]
    newlist.append(-1)  # the problem requires the trailing -1
    print(newlist)      # for debugging
    return len(newlist)
>>> extract([1, 4, -1, 3, 2])
[1, 4, 2, -1]
4
Note that collecting the values into the new list doesn't really make sense in any kind of real-world scenario because the list is not visible outside the function in any way. A more sensible implementation would simply increment a counter in each loop pass and return the value of the counter. But since the problem explicitly requests maintaining the list, code like the above will have to do.
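The counter-only variant mentioned above could be sketched like this. The function name chain_length is hypothetical, and the sketch assumes, like the code above, that the chain eventually reaches -1.

```python
def chain_length(values):
    """Count the visited values, including the terminating -1, without building a list."""
    count = 1  # accounts for the -1 that ends the chain
    ind = 0
    while values[ind] != -1:
        count += 1
        ind = values[ind]  # the current value is the next index to visit
    return count


print(chain_length([1, 4, -1, 3, 2]))
```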
It's simpler to just use a while loop:
data = [1,4,-1,3,2]
ls = []
i = 0
steps = 0
while data[i] != -1:
    ls.append(data[i])
    i = data[i]
    steps += 1
    assert steps < len(data), "Infinite loop detected"
ls.append(-1)
print(ls, len(ls))