I'm kind of a newbie to Python and I've looked around a bit but haven't found a satisfying answer to my question. I'm doing some practice problems and I want to make a method that gets rid of duplicate values in a list. So far, this is my code:
def noDouble(nums):
    for x in xrange(len(nums) - 2):
        if nums[x] == nums[x + 1]:
            nums.pop(x)
            x -= 1
    return nums
What I want to happen is that if there's a duplicate, pop off one of the duplicates and then move back again (so that if there are, say, 3 instances of the same number, it'll get rid of all of them by 'rewinding').
I'm looking for an explanation for why my code doesn't work as well as an explained solution and I'd really appreciate any and all help. Thanks.
Because for doesn't count numbers; it iterates over a sequence. Every time the body of the loop executes, x is set to the next value produced by the xrange. The for statement couldn't care less what x is at the end of the loop.
You really shouldn't modify a list within a loop over it, anyway. The easier way to do this is to build a new list:
def remove_doubles(old):
    new = []
    for item in old:
        if new and item == new[-1]:
            continue
        new.append(item)
    return new
Of course, the really easy way to remove duplicates is list(set(foo)). But that will remove all duplicates, not just adjacent ones, and probably destroy the item order too.
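To make the difference between "adjacent duplicates" and "all duplicates" concrete, here is a small sketch in Python 3. The function names are made up for illustration. It assumes Python 3.7 or later, where dict preserves insertion order, so dict.fromkeys removes all duplicates without scrambling the order the way a set can.

```python
from itertools import groupby


def remove_adjacent(seq):
    # groupby collapses each run of equal neighbours into one group;
    # keeping only the group keys drops adjacent duplicates.
    return [key for key, _group in groupby(seq)]


def remove_all_keep_order(seq):
    # dict keys are unique and (Python 3.7+) keep insertion order.
    return list(dict.fromkeys(seq))


data = [1, 1, 2, 2, 2, 1, 3]
adjacent_removed = remove_adjacent(data)
all_removed = remove_all_keep_order(data)
```

Note that the second 1 survives in the first result because it is not next to the earlier run of 1s.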
As far as I understand it, the for loop isn't a simple increment like it is in C or Java; Python will actually force x back to the value it's expected to be for the next loop iteration. So decrementing x won't have the effect of adding more loop iterations, because it will be forcibly reset.
for x in range(10):
    x -= 1
    print x
will yield
-1
0
1
2
3
4
5
6
7
8
EDIT: here's a possibility for what you're trying to do, though the accepted answer is easier for your specific use case. Use while instead of for:
def noDouble(nums):
    limit = len(nums) - 1
    x = 0
    while x < limit:
        if nums[x] == nums[x+1]:
            nums.pop(x)
            x -= 1
        x += 1
    return nums
Though this code may still fail with an IndexError because of how you're accessing list elements. But still, that's how you'd add extra iterations.
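The IndexError comes from limit being computed once while the list keeps shrinking. One way around it, sketched here in Python 3 under the assumption that modifying the list in place is really what you want, is to re-check len(nums) on every pass and to advance the index only when nothing was removed. The name no_double is just a renamed variant of the question's function.

```python
def no_double(nums):
    """Remove adjacent duplicates from nums in place and return it."""
    x = 0
    # len(nums) is re-evaluated on every pass, so the bound shrinks with the list.
    while x < len(nums) - 1:
        if nums[x] == nums[x + 1]:
            nums.pop(x)   # stay on the same index to re-check the new neighbour
        else:
            x += 1
    return nums


result = no_double([1, 1, 1, 2, 3, 3])
```

Not moving x after a pop plays the role of the "rewind" the question was after.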
Generally you never want to manipulate the counter in a loop (aside from setting it to break out of the loop). For a similar reason, you don't want to saw a branch you are sitting on, when cutting down a tree :-)
The safer route (compared to nested loops) is to build a set of operations in one loop, then pass this set to a second loop (each loop one level deep).
I'm practicing python and one of the coding tasks assigned was to create a function that looks through a list and ignores numbers that occur between a 6 and a 9 and returns the sum of all other values.
Edit: this does not mean to add numbers whose values are less than 6 or greater than 9. It means to add all numbers of any value, but to ignore any numbers that come after a 6, until a 9 is seen. Symbolically if i means include and x means exclude, the code should return all the values marked as i:
[i,i...6, x,x,...,9,i,i...,6,x,x,...]
In other words, 6 turns off adding and if adding is off, 9 turns it back on.
Note that a 9 with no preceding 6 is just a number and will be added.
For example if I have a list:
[4,5,6,7,8,9,9]
the output should be:
18 <---(4+5+9)
The solution is provided but I'm having trouble understanding the code. I don't understand the purpose of the break statements in the code. The solution provided is as follows:
def summer_69(*arr):
    total = 0
    add = True
    for num in arr:
        while add == True:
            if num!=6:
                total = total + num
                break
            else:
                add = False
        while add == False:
            if num !=9:
                break
            else:
                add = True
                break
    return total
I'm really confused how the break statements help with the code. Particularly, I'm confused why the first 'break' is needed when there is already an 'else'.
The second break confuses me as well.
I understand that 'break' statements stop a loop and go onto the next loop.
My interpretation of the code so far is: 'if the number does not equal 6 then total = total + num; if it does equal 6 then the loop is broken, and if it is broken, add changes to False'.
I'm not sure if that interpretation is correct or not.
I was wondering how seasoned Python coders interpret 'breaks' vs 'else'.
break will exit whatever loop the statement is in. It's useful for many things, but often it's used to "short-circuit" the loop. If we know that the rest of the loop is irrelevant after some condition is met, then it's just wasted resources to keep looping through.
The break statement lets you leave the while loop immediately. The if/else on its own does not leave the loop: without a break, the loop keeps running until its condition changes.
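The solution you've provided is extremely convoluted and hard to understand.
If the task were simply to skip every number whose value lies between 6 and 9, a much simpler solution would be the following (note that the edited question asks for something different: skipping the numbers positioned between a 6 and the next 9):
total = 0
for num in arr_2:
    if 6 <= num <= 9:
        continue
    total += num
Or a more pythonic way:
from functools import reduce  # built in on Python 2, needs this import on Python 3
filtered_arr = filter(lambda x: x < 6 or x > 9, arr_2)
total = reduce(lambda x, y: x + y, filtered_arr, 0)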
Anyways, in your solution, the first break is absolutely redundant. The reason why there is a break there, is because when you've found a number that doesn't equal 6, you add it, and you get out of the while loop.
In other words, the solution should have used an if statement, instead of the while statement. The break is there to basically have the while loop execute once.
Because, if a number does equal 6, then add will be false, and the while loop will terminate. If a number does not equal 6, you get out of the while loop. So the while loop is pointless, and meant to be an if statement instead.
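The point that the while loops are really if statements can be sketched as follows. This is one possible rewrite, not the course's official solution. It keeps the *arr signature from the question, so the numbers are passed as separate arguments rather than as one list.

```python
def summer_69(*arr):
    total = 0
    add = True              # True while numbers should be summed
    for num in arr:
        if add:
            if num == 6:
                add = False  # a 6 switches adding off (the 6 is not added)
            else:
                total += num
        elif num == 9:
            add = True       # a 9 switches adding back on (that 9 is not added)
    return total


example_total = summer_69(4, 5, 6, 7, 8, 9, 9)
```

With the question's example, the 6, 7, 8 and the first 9 are skipped, and the second 9 is added as an ordinary number.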
This is a tricky way to handle program flow with a toggle nested in conditional loops.
It's a little hard to follow, but it is a well-known classic pattern.
Initially ADD == True, so if we start with a number that is not 6 (as in your example), the algorithm adds the number & breaks out of the first while loop. When it breaks, the next statement executed will be the line while add == False
At this point ADD == TRUE so the second while loop will not be entered. The next statement executed will be for num in arr (the outermost loop).
The outer FOR loop will go again and this process will repeat.
When you encounter a 6, the number will not be added and the break will not occur. The program will execute the else clause, setting ADD = FALSE.
After the else clause, execution continues with statement while add == false. Since ADD == FALSE at this point, the second while loop will be entered.
From now on ADD will be FALSE so the first while loop will not be entered and numbers will not be added. Instead, the second while loop will be entered for each number. As long as the number is not equal to 9, it breaks out immediately, leaving ADD at FALSE.
When you encounter a 9, you will enter the second while loop, switch ADD back to TRUE, and break out of the while loop.
The first 9 comes after a 6 (ADD is FALSE) so it just toggles ADD from FALSE to TRUE and the number 9 doesn't get added.
When the NEXT 9 is encountered, ADD is TRUE and the number is not 6, so the first while loop will be entered and the number 9 will get added.
This is a classic pattern that used to be used in assembly language code perhaps 40 years ago. As written, the IF statements toggle a state variable. The state variable is turned on when the start condition is met, and turned off when a stop condition is met. The while loops ensure that the toggle can only be turned ON when it was OFF and vice versa, and provide places to put in different handling when the state is ON vs when it is OFF. This pattern brings certain efficiencies that are completely irrelevant in modern high-level languages.
There are better ways to do this in all modern languages, but as an exercise in following tricky program flow it's quite good :)
I have the following function that generates the longest palindrome of a string by removing and re-ordering the characters:
from collections import Counter

def find_longest_palindrome(s):
    count = Counter(s)
    chars = list(set(s))
    beg, mid, end = '', '', ''
    for i in range(len(chars)):
        if count[chars[i]] % 2 != 0:
            mid = chars[i]
            count[chars[i - 1]] -= 1
        else:
            for j in range(0, int(count[chars[i]] / 2)):
                beg += chars[i]
    end = beg
    end = ''.join(list(reversed(end)))
    return beg + mid + end

out = find_longest_palindrome('aacggg')
print(out)
I got this function by 'translating' this example from C++
Whenever I run my function, I get one of the following outputs, seemingly at random:
a
aca
agcga
The correct one in this case is 'agcga' as this is the longest palindrome for the input string 'aacggg'.
Could anyone suggest why this is occurring and how I could get the function to reliably return the longest palindrome?
P.S. The C++ code does not have this issue.
Your code depends on the order of list(set(s)).
But sets are unordered.
In CPython 3.4-3.7, the specific order you happen to get for sets of strings depends on the hash values for strings, which are explicitly randomized at startup, so it makes sense that you’d get different results on each run.
The reason you don’t see this in C++ is that the C++ set class template is not an unordered set, but a sorted set (based on a binary search tree, instead of a hash table), so you always get the same order in every run.
You could get the same behavior in Python by calling sorted on the set instead of just copying it to a list in whatever order it has.
But the code still isn’t correct; it just happens to work for some examples because the sorted order happens to give you the characters in most-repeated order. But that’s obviously not true in general, so you need to rethink your logic.
The most obvious difference introduced in your translation is this:
count[ch--]--;
… or, since you're looping over the characters by index instead of directly, more like:
count[chars[i--]]--;
Either way, this decrements the count of the current character, and then decrements the current character so that the loop will re-check the same character the next time through. You've turned this into something completely different:
count[chars[i - 1]] -= 1
This just decrements the count of the previous character.
In a for-each loop, you can't just change the loop variable and have any effect on the looping. To exactly replicate the C++ behavior, you'd either need to switch to a while loop, or put a while True: loop inside the for loop to get the same "repeat the same character" effect.
And, of course, you have to decrement the count of the current character, not decrement the count of the previous character that you're never going to see again.
for i in range(len(chars)):
    while True:
        if count[chars[i]] % 2 != 0:
            mid = chars[i]
            count[chars[i]] -= 1
        else:
            for j in range(0, int(count[chars[i]] / 2)):
                beg += chars[i]
            break
Of course you could obviously simplify this—starting with just looping for ch in chars:, but if you think about the logic of how the two loops work together, you should be able to see how to remove a whole level of indentation here. But this seems to be the smallest change to your code.
Notice that if you do this change, without the sorted change, the answer is chosen randomly when the correct answer is ambiguous—e.g., your example will give agcga one time, then aggga the next time.
Adding the sorted will make that choice consistent, but no less arbitrary.
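One way to remove the extra level of indentation, offered as a sketch rather than a faithful translation of the C++ original: use divmod to split each count into pairs and a leftover, iterate over the sorted characters so the output is deterministic, and keep the first odd-count character as the middle. The name longest_palindrome and the "first odd character wins" rule are assumptions made for this example.

```python
from collections import Counter


def longest_palindrome(s):
    counts = Counter(s)
    half = []
    mid = ''
    for ch in sorted(counts):           # sorted() makes the result repeatable
        pairs, leftover = divmod(counts[ch], 2)
        half.append(ch * pairs)         # each pair contributes one char per side
        if leftover and not mid:
            mid = ch                    # arbitrary but consistent choice of middle
    beg = ''.join(half)
    return beg + mid + beg[::-1]


out = longest_palindrome('aacggg')
```

Any odd-count character would serve equally well as the middle; the length of the result is the same either way.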
I have the following code.
for idx in range(len(networks)):
    net_ = networks[idx]
    lastId=0
    for layerUptID in range(len(net_[1])):
        retNet,lastId=cn_.UpdateTwoConvLayers(deepcopy(net_),lastId)
        networks.append(retNet)
        if(lastId==-1):
            break
networks has only one net at the beginning.
After running the line retNet,lastId=cn_.UpdateTwoConvLayers(deepcopy(net_),lastId), six additional nets have been appended to networks.
At that point lastId == -1, so execution goes back to the first for loop, with len(networks) now 7.
The next idx is 1, and the loop continues.
Then len(networks) is 13, and execution goes back to the first for loop again.
After this, the first for loop ends.
I am expecting it to continue with idx 2, but it stops.
What could be the issue?
If you use a while loop instead of a for loop, the loop condition can check whether you have reached the last item in the networks collection.
That way the length of networks is re-evaluated on every iteration.
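A minimal sketch of that idea, using plain integers instead of networks so that it runs on its own. The expand function is a hypothetical stand-in for whatever produces the new items (UpdateTwoConvLayers in the question); it is not part of any real library.

```python
def expand(n):
    # Hypothetical stand-in: every positive n produces two smaller items.
    return [n - 1, n - 1] if n > 0 else []


items = [2]
idx = 0
# len(items) is evaluated again before every pass, so items appended
# inside the loop are visited too.
while idx < len(items):
    items.extend(expand(items[idx]))
    idx += 1
```

The loop ends only because expand eventually returns nothing; with a producer that never stops, this would run forever.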
For starters: Iterating, or looping, over the list (or data) you're editing is bad practice. Keep that in mind while coding.
This means if you plan to edit what you're looping on, in your case networks, then you're going to have a bad time looping over it. I would advise to break it up into two code parts:
The first part creates a new list of whatever it is you want WHILE looping.
The second part replaces the list you've used to generate what you wanted.
Another thing which could go wrong is net_[i] may not be set up for some i, and you're trying to access it here:
for layerUptID in range(len(net_[1])):
What if there is nothing in net_[1]?
To avoid these errors, usually verifying your data is a great way to start. If it is not null, then proceed, otherwise, print an error.
This is what I can think of. Hope it helps.
If I understood correctly, your problem is that you've added new elements to networks (i.e. increased its length) and expect the for loop to pick up these changes. It doesn't. Let's look at the following snippet:
elements = [1]
indices = range(len(elements))
for index in indices:
    print('index is', index)
    elements.append(2)
    print('elements count is', len(elements))
    print('indices count is', len(indices))
outputs are
index is 0
elements count is 2
indices count is 1
So, as we can see, even though the length of the elements list has changed, the range object used in the for loop has not. This happens because range(len(elements)) is evaluated once, before the loop starts: len returns a plain int, an immutable snapshot of the length at that moment, so the range object has no way of seeing later changes to the list.
Finally, we can use while loop here like
while networks:
    net_ = networks.pop()
    lastId = 0
    for layerUptID in range(len(net_[1])):
        retNet, lastId = cn_.UpdateTwoConvLayers(deepcopy(net_), lastId)
        networks.append(retNet)
        if lastId == -1:
            break
An issue with my existing code. Code goes:
example_dic = {'name': 'jim','value': 4}
list_of_dic = [example_dic,dic2,dic3,...]
empty_list = [] #will be filled with multiple dictionaries all in same format/same keys
key_sum = sum(blah['value'] for blah in empty_list) #tested this with a filled in "list_of_dic", works as expected
if not empty_list or key_sum < arbitrary_value:
    for things in list_of_dic[:]:
        if case1:
            empty_list.append(things)
            list_of_dic.remove(things)
        elif case2:
            empty_list.append(things)
            list_of_dic.remove(things)
        else:
            pass
Problem is that key_sum does not get updated ever even though things are being appended onto empty_list. As I said in the comments, I know the key_sum line works because I tried it by filling in the list of dictionaries with random stuff first.
What I want is for items to keep being added onto empty_list only while key_sum < arbitrary_value. If, for example, I want key_sum < 20 and the next item would cause key_sum >= 20, I do not want it to be added at all, not simply break and end after it's already been added. I also do not want the code to end there: if there is a list of 10 items and the 1st one has value = 22, I don't want the whole thing to stop. I want it to keep going through the rest, adding items until it cannot add any more without causing key_sum >= 20.
Simpler answer would be, is there any other language which doesn't require such unnecessary complication for what seems like a very simple task?
There are a couple of issues with this. One is that your code assumes that key_sum gets automatically updated when you change empty_list, but that's not the case. It just gets calculated once. You'll need to recalculate key_sum on every iteration, or if you're really worried about efficiency, increment the key_sum every time you append to empty_list. It also seems like you want to check the value of key_sum on every iteration of your for loop, rather than only after you've iterated over the entire list_of_dic.
The second issue is removing items from list_of_dic while iterating over it. Python does not raise an error for this, but the iterator keeps advancing by index, so elements generally get skipped. Iterating over a copy of the list (list_of_dic[:]) avoids the problem.
Summarizing the changes:
for things in list_of_dic[:]:  # Iterate over a copy of list_of_dic
    do_append = False
    if case1:
        do_append = True
    elif case2:
        do_append = True
    if do_append:
        if (key_sum + things['value']) >= arbitrary_value:
            continue
        empty_list.append(things)
        list_of_dic.remove(things)
        key_sum += things['value']
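For a version that can be run as is, here is a sketch with made-up data. The function name take_under_limit, the wanted predicate (standing in for case1/case2) and the sample dictionaries are all assumptions for illustration. The running total is updated by hand each time an item is accepted, and an item that would push the total to the limit or beyond is skipped without stopping the loop.

```python
def take_under_limit(items, limit, wanted):
    """Move matching dicts out of items while the summed 'value' stays below limit."""
    chosen = []
    running = 0
    for item in items[:]:                      # iterate over a copy; items is mutated below
        if not wanted(item):
            continue
        if running + item['value'] >= limit:   # would reach the limit: skip, keep looking
            continue
        chosen.append(item)
        items.remove(item)
        running += item['value']               # keep the total in sync manually
    return chosen, running


pool = [
    {'name': 'a', 'value': 22},
    {'name': 'b', 'value': 4},
    {'name': 'c', 'value': 10},
    {'name': 'd', 'value': 7},
    {'name': 'e', 'value': 5},
]
chosen, total = take_under_limit(pool, 20, lambda d: True)
chosen_names = [d['name'] for d in chosen]
left_names = [d['name'] for d in pool]
```

Here the 22 is skipped outright, 4 and 10 are taken, 7 is skipped because it would reach 21, and 5 still fits.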
I know that it is not allowed to remove elements while iterating a list, but is it allowed to add elements to a python list while iterating. Here is an example:
for a in myarr:
    if somecond(a):
        myarr.append(newObj())
I have tried this in my code and it seems to work fine, however I don't know if it's because I am just lucky and that it will break at some point in the future?
EDIT: I prefer not to copy the list since "myarr" is huge, and therefore it would be too slow. Also I need to check the appended objects with "somecond()".
EDIT: At some point "somecond(a)" will be false, so there can not be an infinite loop.
EDIT: Someone asked about the "somecond()" function. Each object in myarr has a size, and each time "somecond(a)" is true and a new object is appended to the list, the new object will have a size smaller than a. "somecond()" has an epsilon for how small objects can be and if they are too small it will return "false"
Why don't you just do it the idiomatic C way? This ought to be bullet-proof. Indexing into a Python list is O(1) (CPython lists are arrays, not linked lists), so this is not a "Shlemiel the Painter" algorithm, although an explicit index loop is somewhat slower than a plain for loop. But I tend not to worry about optimization until it becomes clear that a particular section of code is really a problem. First make it work; then worry about making it fast, if necessary.
If you want to iterate over all the elements:
i = 0
while i < len(some_list):
    more_elements = do_something_with(some_list[i])
    some_list.extend(more_elements)
    i += 1
If you only want to iterate over the elements that were originally in the list:
i = 0
original_len = len(some_list)
while i < original_len:
    more_elements = do_something_with(some_list[i])
    some_list.extend(more_elements)
    i += 1
Well, according to http://docs.python.org/tutorial/controlflow.html:
It is not safe to modify the sequence being iterated over in the loop (this can only happen for mutable sequence types, such as lists). If you need to modify the list you are iterating over (for example, to duplicate selected items) you must iterate over a copy.
You could use islice from itertools to create an iterator over just the original portion of the list. Then you can append entries to the list without affecting the items you're iterating over (the stop index is exclusive, so len(myarr) covers every original element):
islice(myarr, 0, len(myarr))
Even better, you don't even have to iterate over all the elements. You can increment a step size.
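A small runnable sketch of the islice approach in Python 3, with integers standing in for the objects and an odd-number test standing in for somecond (both are assumptions for the example). The stop value is fixed when islice is created, so elements appended during the loop are never visited.

```python
from itertools import islice

myarr = [1, 2, 3]

# The stop index is computed once, up front; appends do not extend the iteration.
for a in islice(myarr, len(myarr)):
    if a % 2:                 # stand-in for somecond(a)
        myarr.append(a * 10)  # stand-in for newObj()
```

Note that this does not re-check the appended objects, so it only fits when they are known not to need processing.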
In short: if you're absolutely sure all new objects fail the somecond() check, then your code works fine; it just wastes some time iterating over the newly added objects.
Before giving a proper answer, you have to understand why it is considered a bad idea to change a list/dict while iterating over it. When you use a for statement, Python does not take a snapshot; it fetches each item dynamically. Taking a list as an example, Python remembers an index, and each time it returns l[index] to you. If you are changing l, the result of l[index] can be messy.
NOTE: Here is a stackoverflow question to demonstrate this.
The worst case when adding elements while iterating is an infinite loop. Try (or don't, if you can already spot the bug) the following in a Python REPL:
import random
l = [0]
for item in l:
    l.append(random.randint(1, 1000))
    print item
It will print numbers non-stop until memory is used up, or killed by system/user.
Now that we understand the underlying reason, let's discuss the solutions. Here are a few:
1. Make a copy of the original list
Iterate over the original list, and modify the copy.
result = l[:]
for item in l:
    if somecond(item):
        result.append(Obj())
2. Control when the loop ends
Instead of handing control to Python, you decide how to iterate over the list:
length = len(l)
for index in range(length):
    if somecond(l[index]):
        l.append(Obj())
Before iterating, calculate the list length, and only loop length times.
3. Store added objects in a new list
Instead of modifying the original list, store the new objects in a new list and concatenate them afterward.
added = [Obj() for item in l if somecond(item)]
l.extend(added)
You can do this.
bonus_rows = []
for a in myarr:
    if somecond(a):
        bonus_rows.append(newObj())
myarr.extend(bonus_rows)
Access your list elements directly by index i. Then you can append to your list (the loop covers only the original elements, because the range is computed once):
for i in xrange(len(myarr)):
    if somecond(myarr[i]):
        myarr.append(newObj())
Make a copy of your original list and iterate over the copy; see the modified code below:
for a in myarr[:]:
    if somecond(a):
        myarr.append(newObj())
I had a similar problem today. I had a list of items that needed checking; if the objects passed the check, they were added to a result list. If they didn't pass, I changed them a bit and if they might still work (size > 0 after the change), I'd add them on to the back of the list for rechecking.
I went for a solution like
items = [...what I want to check...]
result = []
while items:
    recheck_items = []
    for item in items:
        if check(item):
            result.append(item)
        else:
            item = change(item)  # Note that this always lowers the integer size(),
                                 # so no danger of an infinite loop
            if item.size() > 0:
                recheck_items.append(item)
    items = recheck_items  # Let the loop restart with these, if any
My list is effectively a queue, should probably have used some sort of queue. But my lists are small (like 10 items) and this works too.
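For reference, the same recheck logic with an actual queue might look like the sketch below, using collections.deque from the standard library. The process function and the integer-based check, change and size callables are hypothetical stand-ins chosen so the example runs on its own; they are not the functions from the answer above.

```python
from collections import deque


def process(items, check, change, size):
    queue = deque(items)
    result = []
    while queue:
        item = queue.popleft()          # take the oldest pending item
        if check(item):
            result.append(item)
        else:
            item = change(item)         # must shrink the item, or this never ends
            if size(item) > 0:
                queue.append(item)      # re-queue at the back for rechecking
    return result


# Hypothetical stand-ins: integers pass when divisible by 3, shrink by 1 otherwise.
result = process(
    [6, 7, 2, 1],
    check=lambda n: n % 3 == 0,
    change=lambda n: n - 1,
    size=lambda n: n,
)
```

popleft and append on a deque are both O(1), which matters only once the list is much larger than the ten or so items mentioned.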
You can use an index and a while loop instead of a for loop if you want the loop to also cover the elements that are added to the list during the loop:
i = 0
while i < len(myarr):
    a = myarr[i]
    i = i + 1
    if somecond(a):
        myarr.append(newObj())
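Here is a runnable sketch of that loop that mirrors the question's termination argument: every appended object is smaller than the one that produced it, and an epsilon stops the growth. The sizes are plain floats, and EPSILON, somecond and the halving rule are assumptions made for the example.

```python
EPSILON = 1.0


def somecond(size):
    # Hypothetical condition: split only if the resulting half is not too small.
    return size / 2 >= EPSILON


sizes = [8.0, 3.0]
i = 0
while i < len(sizes):        # also visits elements appended during the loop
    a = sizes[i]
    i += 1
    if somecond(a):
        sizes.append(a / 2)  # the new object is always smaller than a
```

Because each new size is half of its parent, somecond eventually fails for every branch and the loop stops on its own.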
Expanding S.Lott's answer so that new items are processed as well:
todo = myarr
done = []
while todo:
    added = []
    for a in todo:
        if somecond(a):
            added.append(newObj())
    done.extend(todo)
    todo = added
The final list is in done.
Alternate solution:
from functools import reduce  # built in on Python 2, needs this import on Python 3
reduce(lambda acc, a: acc + [newObj()] if somecond(a) else acc, myarr, myarr)
Assuming you are adding at the end of the list arr, you can try this method, which I often use:
arr = [...The list I want to work with]
current_length = len(arr)
i = 0
while i < current_length:
    current_element = arr[i]
    do_something(current_element)
    # Time to insert
    insert_count = 1  # How many items you are adding at the end
    arr.append(item_to_be_inserted)
    # IMPORTANT!!!! increase the current limit and indexer
    i += 1
    current_length += insert_count
This is just boilerplate: if you run it as is, your program will freeze in an infinite loop, because an item is appended on every pass. DO NOT FORGET TO TERMINATE THE LOOP, unless an endless loop is what you need.