I have recently been working on LeetCode problem No. 113, Path Sum II, and I found a solution online.
Given a binary tree and a sum, find all root-to-leaf paths where each path's sum equals the given sum.
The code is below:
class Solution:
    def pathSum(self, R: TreeNode, S: int) -> List[List[int]]:
        A, P = [], []
        def dfs(N):
            if N == None: return
            P.append(N.val)
            if (N.left,N.right) == (None,None) and sum(P) == S: A.append(list(P))
            else: dfs(N.left), dfs(N.right)
            P.pop()
        dfs(R)
        return A
- Junaid Mansuri
- Chicago, IL
I have some questions about the code above to help me better understand how Python works.
Why do we need to use list(), as in A.append(list(P)), to append the list to A when P is already a list?
What happens when the interpreter runs dfs(N.left), dfs(N.right)? Both calls append a value to P, but they don't seem to affect each other (as if they were running at the same time with the same P). Is it something like multithreading?
A related question: does A, P = [], [] work on the same principle as dfs(N.left), dfs(N.right)? If not, what is the difference?
What exactly does P.pop() pop? I mean, which value is popped if both dfs(N.left) and dfs(N.right) run? Will there be two copies of P after the two calls run?
Updates (more question)
10 while head != None:
11     if id(head.next) in hashMap: return True
12     head = head.next
13     hashMap.add(id(head.next))
Line 13: AttributeError: 'NoneType' object has no attribute 'next'
The above is part of the code. It simply walks through a linked list. It raises the error shown above, which I think is expected when it reaches the end of the linked list.
What I want to understand is why, if the code is changed as below, there is no error and the code runs successfully. Is that related to the comma, or is there another reason it works?
10 while head != None:
11     if id(head.next) in hashMap: return True
12     head, _ = head.next, hashMap.add(id(head.next))
An internet search will point you to lots of articles on depth-first search.
To answer your immediate questions.
Why do we need to use list(), as in A.append(list(P)), to append the
list to A when P is already a list?
A.append(list(P))
uses the list constructor to make a shallow copy of P and appends that copy to A.
Otherwise, if you just used:
A.append(P)
then A would hold a reference to P itself, so the list stored in A would change every time P changes.
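The difference between appending the list itself and appending a copy can be seen in a small sketch. The names aliased and copied are made up for this illustration and do not appear in the original solution.

```python
# Minimal illustration: appending the list itself vs. appending a copy.
P = [1, 2]
aliased = []
copied = []

aliased.append(P)        # stores a reference to the same list object
copied.append(list(P))   # stores a new list with the same elements

P.append(3)              # mutate P afterwards
P.pop(0)

print(aliased)  # [[2, 3]]  (follows every change to P)
print(copied)   # [[1, 2]]  (snapshot taken at append time)
```

In the path-sum solution P is emptied again by the time dfs finishes, so without the copy every entry in A would end up as the same empty list.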
What happens when the interpreter runs dfs(N.left), dfs(N.right)? Both
calls append a value to P, but they don't seem to affect each other
(as if they were running at the same time with the same P). Is it
something like multithreading?
These functions are run sequentially. First dfs(N.left) followed by dfs(N.right).
This performs a depth-first search (DFS) on the left subtree, followed by a DFS on right subtree.
Each function is run for its side-effect of updating A and P.
A related question: does A, P = [], [] work on the same principle as
dfs(N.left), dfs(N.right)? If not, what is the difference?
Variables A and P are local variables of pathSum. Because dfs is a nested function within pathSum, it has access to these local variables. Thus there is only one A and one P, which dfs updates as it is called recursively.
A, P = [], []
initializes A and P (done once within pathSum).
dfs(N.left), dfs(N.right)
calls dfs on the left and right child nodes, which updates A and P as the recursive calls run.
What exactly does P.pop() pop? I mean, which value is popped if
both dfs(N.left) and dfs(N.right) run? Will there be two copies of P
after the two calls run?
P.pop()
Removes the last value appended to list P.
dfs(N.left) and dfs(N.right) are run one after the other. For example, with N corresponding to the node with value 1:
dfs(N.left), dfs(N.right)
First dfs(N.left) will recursively traverse nodes with values of:
2, 4, 5
Then, dfs(N.right) will traverse the node with value 3.
The values of A and P are updated during the traversal. P contains the path to the current node. When we branch left (i.e. dfs(N.left)), the left path is added to P. Before each call returns, it must remove its own node from P (thus P.pop()). dfs(N.right) is then run in the same way.
When we have traversed both the left and right children, we remove the current node with P.pop() and control flow returns to the parent.
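To make the push and pop pattern concrete, here is a minimal sketch that records P on entry to each node. The Node class, the trace_paths function, and the tree shape (1 with children 2 and 3, and 2 with children 4 and 5) are assumptions made for this illustration, chosen to match the traversal order described above.

```python
class Node:
    """Hypothetical minimal tree node (stands in for LeetCode's TreeNode)."""
    def __init__(self, val, left=None, right=None):
        self.val = val
        self.left = left
        self.right = right

def trace_paths(root):
    """Record the contents of the single shared list P on entry to each node."""
    P, snapshots = [], []

    def dfs(N):
        if N is None:
            return
        P.append(N.val)
        snapshots.append(list(P))   # copy, so later pops do not alter the record
        dfs(N.left)                 # runs to completion first
        dfs(N.right)                # then the right subtree
        P.pop()                     # undo this node's append before returning

    dfs(root)
    return snapshots, P

# Assumed tree: 1 -> (2 -> (4, 5), 3)
root = Node(1, Node(2, Node(4), Node(5)), Node(3))
snapshots, final_P = trace_paths(root)
for s in snapshots:
    print(s)
# [1]
# [1, 2]
# [1, 2, 4]
# [1, 2, 5]
# [1, 3]
print(final_P)  # []
```

Every append is matched by exactly one pop, which is why there is only ever one P and why it is empty again once the outermost call returns.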
Tuple Abbreviation
a, b is an abbreviation for tuple (a, b)
Thus:
[], [] computes tuple ([], [])
b, a computes tuple (b, a)
dfs(N.left), dfs(N.right) computes tuple (dfs(N.left), dfs(N.right))
Tuples can be unpacked
Example:
t = (2, 4, 6, 8)
x, y, z, w = t
Will have x = 2, y = 4, z = 6, w = 8
With:
A, P = [], [] equivalent to A, P = ([], [])
Unpacking then has:
A = [], P = []
With:
a, b = b, a is equivalent to a, b = (b, a)
Unpacking then has:
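a = (old value of b) and b = (old value of a)
In terms of the result, think of these assignments as occurring in parallel: the whole right-hand side is evaluated before either name is rebound.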
So with:
(a, b) = (5, 3)
a, b = (b, a) will have a = 3, b = 5 (thus swapping a & b)
dfs(N.left), dfs(N.right) computes the tuple (dfs(N.left), dfs(N.right)).
This necessitates running DFS on N.left and N.right.
The resulting tuple is discarded, but this has the desired effect of updating A & P as DFS is run recursively.
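The same evaluation rule also bears on the linked-list update in the question: the right-hand side of a tuple assignment is evaluated in full, left to right, before any name on the left is rebound. The sketch below illustrates this. The visit helper and the ListNode class are hypothetical stand-ins, and seen plays the role of hashMap.

```python
calls = []

def visit(name):
    """Hypothetical helper that records when it is called."""
    calls.append(name)
    return name

# Building the tuple evaluates its elements left to right.
_ = visit("left"), visit("right")
print(calls)  # ['left', 'right']

# The whole right-hand side is evaluated before any name is rebound.
a, b = 5, 3
a, b = b, a
print(a, b)  # 3 5

class ListNode:
    """Hypothetical singly linked list node."""
    def __init__(self, val, next=None):
        self.val = val
        self.next = next

head = ListNode(1)           # a one-node list, so head.next is None
seen = set()
# Both head.next expressions use the old head, because the right-hand side
# is evaluated in full before head is reassigned.
head, _ = head.next, seen.add(id(head.next))
print(head)  # None
```

In the two-statement version, head has already been replaced by head.next when hashMap.add(id(head.next)) runs, so on the last node it evaluates None.next and raises the AttributeError.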
Related
I have a problem with recursion. The function I wrote should recursively generate and return a list of pairs, called chain. The breaking condition is that when the pair (remainder, quotient) already belongs to the chain list, the function stops recursing and returns the list. Instead of completing, the recursion blows up, raising a RecursionError. The list doesn't update and contains only a single term, so the breaking condition is never met. I don't understand why...
How should I properly implement the recursive step to make the list update?
def proper_long_division(a, b):
    """a < b"""
    chain = []
    block_size = len(str(b)) - len(str(a))
    a_new_str = str(a) + '0' * block_size
    a_new = int(a_new_str)
    if a_new < b:
        a_new = int(a_new_str + '0')
    quotient = a_new // b
    remainder = a_new - b * quotient
    print(remainder)
    #print(chain)
    # breaking condition <--- !
    if (remainder, quotient) in chain:
        return chain
    # next step
    chain.append((remainder, quotient))
    chain.extend(proper_long_division(remainder, b))
    return chain

try:
    a = proper_long_division(78, 91)
    print(a)
except RecursionError:
    print('boom')
Here is an example of recursion which (as far as I can tell) follows the same structure, but where the returned list is updated. I don't know why one version works while the other does not.
import random
random.seed(1701)

def recursion():
    nrs = []
    # breaking condition
    if (r := random.random()) > .5:
        return nrs
    # update
    nrs.append(r)
    # recursive step
    nrs.extend(recursion())
    return nrs

a = recursion()
print(a)
# [0.4919374389681155, 0.4654907396198952]
When you enter proper_long_division, the first thing you do is chain = []. That means that the local variable chain refers to a new empty list. Then you do some algebra, which does not affect chain, and check if (remainder, quotient) in chain:. Clearly this will always be False, since chain was and has remained empty.
The next line, chain.append((remainder, quotient)) runs just fine, but remember that only this call to proper_long_division has a reference to it.
Now you call chain.extend(proper_long_division(remainder, b)). You seem to expect that the recursive call will be able to check and modify chain. However, the object referred to by chain in a given call of proper_long_division is only visible within that call.
To fix that, you can use a piece of shared memory that any invocation of the recursive function can see. You could use a global variable, but that would make the function have unpredictable behavior since anyone could modify the list. A better way would be to use a nested function that has access to a list in the enclosing scope:
def proper_long_division(a, b):
    """a < b"""
    chain = {}
    def nested(a, b):
        if a == 0:
            # a zero remainder means the division terminated; without this
            # guard the while loop below would never exit
            return chain
        while a < b:
            a *= 10
        quotient = a // b
        remainder = a - b * quotient
        key = (remainder, quotient)
        if key in chain:
            return chain
        # next step
        chain[key] = None
        nested(remainder, b)
    nested(a, b)
    return list(chain.keys())
A couple of suggested changes are showcased above. Multiplication by 10 is the same as padding with a zero to the right, so you don't need to play games with strings. Lookup in a hash table is much faster than in a list. Since ordering is important, you can't use a set. Instead, I turned chain into a dict, which preserves insertion order (an implementation detail of CPython 3.6, guaranteed by the language since Python 3.7), and used only the keys for lookup. The values all refer to the singleton None.
The second example does not match the structure of the first in the one way that matters: you do not use nrs as part of your exit criterion.
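Another way to share one list across all invocations, not taken from the answer above, is to pass it along explicitly as an argument. The following is only a sketch under the assumption 0 < a < b. The function name long_division_chain and the chain=None default are choices made for this illustration.

```python
def long_division_chain(a, b, chain=None):
    """Sketch: thread one list through every recursive call (assumes 0 < a < b)."""
    if chain is None:          # create the shared list once, in the outermost call
        chain = []
    if a == 0:                 # the division terminated; nothing more to record
        return chain
    while a < b:
        a *= 10
    quotient, remainder = divmod(a, b)
    if (remainder, quotient) in chain:
        return chain           # the pair repeats, so the cycle is complete
    chain.append((remainder, quotient))
    return long_division_chain(remainder, b, chain)

print(long_division_chain(78, 91))
# [(52, 8), (65, 5), (13, 7), (39, 1), (26, 4), (78, 2)]
```

Because every call receives the same list object, the membership test sees the pairs added by earlier calls, which is exactly what the original version was missing.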
I'm new to the Python language and I'm having trouble understanding why this code doesn't work as I expect.
I want to compute the primitive Pythagorean triples (a^2+b^2=c^2) for a,b,c < 100 and put them in tuples.
This is the code:
n=100
# Here I compute all the Pythagorean triples and put them in a list (I wanted to use a nested list comprehension)
d=[(ap,b,c) for ap in range(1,n+1) for b in range(ap,n+1) for c in range(b,n+1) if ap**2 + b**2 == c**2 ]
# it works
# now I want to find the primitive ones:
q=[]
for q in d: # I take each triple
    # I check whether it is primitive
    v=2
    for v in range(2,q[0]) :
        if q[0]%v==0 and q[1]%v==0 and q[2]%v== 0 :
            d.remove(q) # if not, I remove it and exit from this inner loop
            break
# then I would expect it to read all the triples, but it doesn't
# it misses the triples after the one I have removed
Can you tell me why?
Is there another way to solve it?
Am I missing a step?
The missing 3-tuples are not caused by the break but by modifying a list while you loop over it. When you remove an element from a list, the indexes of the remaining elements shift, which can cause the loop to skip certain elements. It's never good practice to remove elements from a list you are iterating over. One usually iterates over a copy of the list, or uses a function such as filter.
Also, you can remove v = 2. There's no need to set the value of v to 2 for every iteration when you already do so with the instruction for v in range(2,q[0])
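A small sketch of the skipping effect, using a made-up list of integers rather than the triples from the question:

```python
# Removing during iteration: the element right after each removed one is skipped.
skipped = [2, 4, 6, 7, 8]
for x in skipped:
    if x % 2 == 0:
        skipped.remove(x)
print(skipped)  # [4, 7]  (4 was never examined)

# Iterating over a copy visits every element of the original contents.
safe = [2, 4, 6, 7, 8]
for x in list(safe):
    if x % 2 == 0:
        safe.remove(x)
print(safe)  # [7]
```

In the first loop, removing 2 shifts 4 into index 0 while the loop moves on to index 1, so 4 is never tested. The same thing happens to the triple that follows each removed one.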
Iterating over a copy of d
If you do list(d), you create a copy of the list and iterate over that copy. The original list can then be modified without any problem, because the loop always iterates over the copy, which is never changed.
n=100
d=[(ap,b,c) for ap in range(1,n+1) for b in range(ap,n+1) for c in range(b,n+1) if ap**2 + b**2 == c**2]
q=[]
for q in list(d):
    for v in range(2,q[0]) :
        if q[0]%v==0 and q[1]%v==0 and q[2]%v== 0 :
            d.remove(q)
            break
Use the function filter
For the filter function, you need to define a function that is applied to every element of your list. The filter function uses that function to build a new sequence from the elements that pass. If the function returns True, the element is kept; if it returns False, the element is dropped. filter() returns an iterator, so you need to build the list from it.
def removePrimitives(q):
    # Returns False for non-primitive triples (dropped by filter)
    # and True for primitive ones (kept).
    for v in range(2, q[0]):
        if q[0] % v == 0 and q[1] % v == 0 and q[2] % v == 0:
            return False
    return True

n=100
d=[(ap,b,c) for ap in range(1,n+1) for b in range(ap,n+1) for c in range(b,n+1) if ap**2 + b**2 == c**2]
q=[]
d = list(filter(removePrimitives, d))
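A further option, not part of the answer above, is to build a new list with a comprehension and test for a common factor with math.gcd from the standard library. This is a sketch. The names triples and primitive are chosen for illustration.

```python
from math import gcd

n = 100
triples = [(a, b, c)
           for a in range(1, n + 1)
           for b in range(a, n + 1)
           for c in range(b, n + 1)
           if a**2 + b**2 == c**2]

# A triple is primitive when its three numbers share no common factor > 1.
primitive = [t for t in triples if gcd(gcd(t[0], t[1]), t[2]) == 1]
print(primitive[:3])  # [(3, 4, 5), (5, 12, 13), (7, 24, 25)]
```

Since nothing is removed from the list being iterated, the skipping problem cannot occur here.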
Bonus: debugging
When it comes to coding, no matter what language, I believe one of the first things you should learn is how to debug. An amazing interactive debugger for Python is the third-party package ipdb.
Here are the commands:
n[ext]: next line, step over
s[tep]: step into next line or function.
c[ontinue]: continue
l[ist]: show more lines of source around the current line
p <variable>: print value
q[uit]: quit debugging
help: show all commands
help <command>: show help
Install the package with your pip installer. Here's how you could have used it in your code to see exactly what happens when a primitive tuple is found and you break from the inner loop:
import ipdb
n=100
d=[(ap,b,c) for ap in range(1,n+1) for b in range(ap,n+1) for c in range(b,n+1) if ap**2 + b**2 == c**2 ]
q=[]
for q in d:
    for v in range(2,q[0]) :
        if q[0]%v==0 and q[1]%v==0 and q[2]%v== 0 :
            ipdb.set_trace() # This sets the breakpoint
            d.remove(q)
            break
At this breakpoint you can print the variable q, and d, and see how it gets modified and what happens after the break is executed.
Consider the following piece of code that generates all subsets of size k of an array [1,2,3,...,n]:
def combinations(n, k):
    result = []
    directed_combinations(n, k, 1, [], result)
    return result

def directed_combinations(n, k, offset, partial_combination, result):
    if len(partial_combination) == k:
        new_partial = [x for x in partial_combination]
        result.append(new_partial)
        return
    num_remaining = k - len(partial_combination)
    i = offset
    # kind of checks if expected num remaining is no greater than actual num remaining
    while i <= n and num_remaining <= n - i + 1:
        partial_combination.append(i)
        directed_combinations(n, k, i + 1, partial_combination, result)
        del partial_combination[-1]
        # partial_combination = partial_combination[:-1] <-- same functionality as line above, but produces weird bug.
        i += 1

print(combinations(n=4,k=2))
For example, combinations(n=4,k=2) will generate all subsets of length 2 of [1,2,3,4].
There are two lines in the code that produce a list with the last element removed. I tried accomplishing it with del and with creating a brand new list by slicing off the last element (i.e. [:-1]). The version with del produces the correct result, but the version with [:-1] doesn't. There is no runtime error, just a logical bug (i.e. an incorrect result).
I suspect this has something to do with creating a new list when doing slicing vs. keeping the same list with del. I can't seem to understand why this is an issue.
I didn't notice at first that your function is recursive (should've read your tags better).
You're right, functionally the two are almost the same. Here is the exact same thing:
# del partial_combination[-1] # working (mutate)
# partial_combination = partial_combination[:-1] # different (rebind)
partial_combination[:] = partial_combination[:-1] # same (mutate)
The result of each of the above will be that you end up with a list containing the same elements. But while del and partial_combination[:] mutate your original list, the middle one rebinds the name to a new list with the same elements. When you pass on this new list to the next recursive step, it will operate on its own copy rather than on the single list the previous recursive levels are working on.
To prove this, you can call print(id(partial_combination)) after each of the above options, and see that the id changes in the rebinding case, while it stays the same throughout the mutating ones.
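The effect on the caller's list can be sketched with two small functions. Both names are hypothetical, and "x" stands in for the value appended at one level of the recursion.

```python
def visit_and_undo_by_del(path):
    path.append("x")
    del path[-1]             # undoes the append on the caller's list

def visit_and_undo_by_rebinding(path):
    path.append("x")         # mutates the caller's list
    path = path[:-1]         # new local list; the caller's list still ends in "x"

a = [1, 2]
visit_and_undo_by_del(a)
print(a)  # [1, 2]

b = [1, 2]
visit_and_undo_by_rebinding(b)
print(b)  # [1, 2, 'x']
```

In the rebinding version the append reaches the shared list but the removal only affects a fresh local copy, so the caller is left with an element that was never taken back out.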
I am new to Python and have very little experience with other languages.
This may be a stupid question for most of you, but I have to start somewhere.
def fib(n):
    a, b = 0, 1
    while a < n:
        print(a, end=' ')
        a, b = b, a+b
    print()
I don't understand why one should enter a, b = b, a+b
I see and understand the result and I can conclude the basic algorithm but I don't get the real understanding of what is happening with this line and why we need it.
Many thanks
This line is executed in the following order:
New tuple is created with first element equal to b and second to a + b
The tuple is unpacked and first element is stored in a and the second one in b
The tricky part is that the right part is executed first and you do not need to use temporary variables.
The reason you need it is because, if you update a with a new value, you won't be able to calculate the new value of b. You could always use temporary variables to keep the old value while you calculate the new values, but this is a very neat way of avoiding that.
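The three ways of writing one update step can be compared directly. The function names below are made up for this sketch.

```python
def step_with_tuple(a, b):
    a, b = b, a + b          # right-hand side uses the old a and the old b
    return a, b

def step_with_temp(a, b):
    old_a = a                # keep the old value before overwriting it
    a = b
    b = old_a + b
    return a, b

def step_naive(a, b):
    a = b                    # the old a is lost here
    b = a + b                # so this computes b + b, not a + b
    return a, b

print(step_with_tuple(3, 5))  # (5, 8)
print(step_with_temp(3, 5))   # (5, 8)
print(step_naive(3, 5))       # (5, 10)
```

The tuple form and the temporary-variable form agree, while the naive sequential form produces the wrong second value.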
It's called sequence unpacking.
In your statement:
a, b = b, a + b
the right side b, a + b creates a tuple:
>>> 8, 5 + 8
(8, 13)
You then assign this to the left side, which is also a tuple a, b.
>>> a, b = 8, 13
>>> a
8
>>> b
13
See the last paragraph of the documentation on Tuples and Sequences:
The statement t = 12345, 54321, 'hello!' is an example of tuple packing: the values 12345, 54321 and 'hello!' are packed together in a tuple. The reverse operation is also possible:
>>> x, y, z = t
This is called, appropriately enough, sequence unpacking and works for any sequence on the right-hand side. Sequence unpacking requires the list of variables on the left to have the same number of elements as the length of the sequence. Note that multiple assignment is really just a combination of tuple packing and sequence unpacking.
Can I access a list while it is being sorted by list.sort()?
b = ['b', 'e', 'f', 'd', 'c', 'g', 'a']
f = 'check this'
def m(i):
    print i, b, f
    return None
b.sort(key=m)
print b
this returns
b [] check this
e [] check this
f [] check this
d [] check this
c [] check this
g [] check this
a [] check this
Note that the individual items of list b are sent to function m. But inside m the list b is empty, even though m can see the variable f, which has the same scope as list b. Why does function m print b as []?
Looking at the source code (of CPython, maybe different behaviour for other implementations) the strange output of your script becomes obvious:
/* The list is temporarily made empty, so that mutations performed
* by comparison functions can't affect the slice of memory we're
* sorting (allowing mutations during sorting is a core-dump
* factory, since ob_item may change).
*/
saved_ob_size = Py_SIZE(self);
saved_ob_item = self->ob_item;
saved_allocated = self->allocated;
Py_SET_SIZE(self, 0);
The comment says it all: When you begin sorting, the list is emptied. Well, it is "empty" in the eye of an external observer.
I quite like the term "core-dump factory".
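The same observation can be reproduced in Python 3 with a key function that records the length of the list while sort() is running. This is a sketch of CPython behaviour only. The names key and lengths_seen are made up, and other implementations may behave differently.

```python
b = ['b', 'e', 'f', 'd', 'c', 'g', 'a']
lengths_seen = []

def key(item):
    lengths_seen.append(len(b))  # observe b while sort() is running
    return item

b.sort(key=key)
print(lengths_seen)  # on CPython: [0, 0, 0, 0, 0, 0, 0]
print(b)             # ['a', 'b', 'c', 'd', 'e', 'f', 'g']
```

Once sort() returns, the list has its contents back, now in sorted order.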
Compare also:
b = ['b','e','f','d','c','g','a']
f = 'check this'
def m(i):
    print i, b, f
    return None
b = sorted(b, key= m)
print b
This is something you can't rely on in general, not just for lists, unless the documentation for the method you're using explicitly says otherwise. Accessing an object in an intermediate state (i.e., after some operation has started but before it has finished) is a problem that concurrent code runs into a lot. You've found a rare non-concurrent case of it, but the advice is the same: avoid this situation. The intermediate state is not guaranteed to be meaningful to you, and is not guaranteed to be a "valid" state according to the rules of that object (in which case it tends to be called an "inconsistent" state).