Python mutation of lists in tuples

After learning about lists and box-and-pointer diagrams, I decided to make up some examples for myself and test my knowledge. I am going to use the terms shallow copy and suspected shallow copy, as I'm not really sure whether they are correct by definition. My questions are about the reasons for the behaviour of the code below; please tell me whether my thinking is sound.
Code A
from copy import *

x = [1, [2, [3, [4]]]]   # normal copy/hardcopy
a = x
v = list(x)              # suspected shallow copy
y = x.copy()             # shallow copy
z = deepcopy(x)          # theoretical deep copy
w = x[:]                 # suspected shallow copy

def test():
    print("Original:", x)
    print("hardcopy:", a)
    print("suspected shallow copy", v)
    print("shallow copy", y)
    print("deep copy:", z)
    print("suspected shallow copy", w)

x[1] = x[1] + [4]
test()
Output A:
Original: [1, [2, [3, [4]], 4]]
hardcopy: [1, [2, [3, [4]], 4]]
suspected shallow copy [1, [2, [3, [4]]]]
shallow copy [1, [2, [3, [4]]]]
deep copy: [1, [2, [3, [4]]]]
suspected shallow copy [1, [2, [3, [4]]]]
Code B
a = (1, 2, [1, 2, 3])

def shallow_copy(x):
    tup = ()
    for i in x:
        tup += (i,)
    return tup

def hardcopy(x):
    return x

b = hardcopy(a)
c = shallow_copy(a)
a[2] += [3]
Output B:
I see a TypeError in IDLE here, but the list element is still mutated, and the change appears in ALL of a, b and c.
Continuation from output B:
a[2][0]=a[2][0]+99
a,b,c
Output C:
((1, 2, [100, 2, 3, 3]), (1, 2, [100, 2, 3, 3]), (1, 2, [100, 2, 3, 3]))
Code D:
a = [1, 2, (1, 2, 3)]

def shallow_copy(x):
    tup = []
    for i in x:
        tup += [i]
    return tup

def hardcopy(x):
    return x

b = hardcopy(a)
c = shallow_copy(a)
d = a.copy()
a[2] = a[2] + (4,)
a, b, c, d
Output D:
[1, 2, (1, 2, 3, 4)], [1, 2, (1, 2, 3, 4)],
[1, 2, (1, 2, 3)], [1, 2, (1, 2, 3)]
From Output A, we observe the following:
1) For lists which have shallow copies, doing x[1] = x[1] + [4] does not affect the shallow copies. My reasons for this could be:
a) = followed by + does __add__ instead of __iadd__ (which is +=), and doing __add__ should not modify the object; it only changes the value for one pointer (x and its hardcopy in this case).
This is further supported by Output B but somehow contradicted by Output C; it could be partly due to reason (b) below, but I can't be too sure.
b) We executed this in the first layer (only one level of indexing); maybe there's some kind of rule which prevents these elements from being modified.
This is supported by both Output B and Output C. Output B might be argued to be in the first layer, but if you think of it as adding elements in the second layer, it fits the above observation.
2) Why did the TypeError appear in Output B even though the operation was still executed? I know that whether an exception is triggered depends on the final sequence you are actually changing (the list in this case), so why is there still TypeError: 'tuple' object does not support item assignment?
I have presented my views on the above questions. I would appreciate any thoughts (theoretical explanations preferably), as I'm still relatively new to programming.

To answer question 1, which looks complex but whose answer is probably quite simple:
when you have another name referencing the original object, you will see the changes in the original. Those changes will not be reflected in the other copies (whether shallow or deep) if(!) you change the object using the form x[1] = x[1] + [4]. This is because you are assigning a new object to x[1], instead of making an in-place change as in x[1].append(4).
You can check that with the id() function.
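For instance, here is a minimal sketch of that check, reusing the names from Code A:
from copy import deepcopy

x = [1, [2, [3, [4]]]]
a = x            # another name for the same outer list
y = x.copy()     # shallow copy: new outer list, same inner objects
z = deepcopy(x)  # deep copy

inner_before = id(x[1])
x[1] = x[1] + [4]   # __add__ builds a NEW inner list, then the slot x[1] is rebound

print(id(x[1]) == inner_before)  # False: x[1] now refers to a different list
print(a[1] is x[1])              # True: a is the same outer list as x, so it sees the change
print(id(y[1]) == inner_before)  # True: the shallow copy still holds the old inner list
print(z)                         # [1, [2, [3, [4]]]] - unaffected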
To answer your question 2, and adapted from the official docs:
let's make
a = (['hello'],)
then
a[0] += [' world']
this is the same as
a[0] = operator.iadd(a[0],[' world'])
The iadd changes the list in place, but then the assignment fails because you can't assign to a tuple (immutable type) index.
If you make
a[0] = a[0] + [' world']
the concatenation goes into a new list object, then the assignment to the tuple index fails too. But the new object gets lost. a[0] wasn't changed in place.
To clarify the OP's comment, the docs for the operator module say that
Many operations have an “in-place” version. Listed below are functions providing a more primitive access to in-place operators than the usual syntax does; for example, the statement x += y is equivalent to x = operator.iadd(x, y). Another way to put it is to say that z = operator.iadd(x, y) is equivalent to the compound statement z = x; z += y.
In those examples, note that when an in-place method is called, the computation and assignment are performed in two separate steps. The in-place functions listed below only do the first step, calling the in-place method. The second step, assignment, is not handled.
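Putting question 2 together, here is a small sketch showing that the in-place step has already happened by the time the tuple item assignment raises (using a tuple like the one in Code B):
t = (1, 2, [1, 2, 3])
try:
    t[2] += [3]   # step 1: list.__iadd__ extends the list in place
                  # step 2: t[2] = <result> raises, because t is a tuple
except TypeError as e:
    print(e)      # 'tuple' object does not support item assignment
print(t)          # (1, 2, [1, 2, 3, 3]) - the list was mutated before the error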
As for your Output D:
Writing
b = hardcopy(a)
does nothing more than writing
b = a
really, b is a new name referencing the same object that a references.
A reference to the original object is passed into the function's local name x (this happens for every argument, whether mutable or not), and returning x just hands that same reference back to be bound to b.
That's why you see further changes to a reflected in b. Again, you make a[2] a new, different tuple object by assignment, so now a[2] and b[2] reference a new tuple (1, 2, 3, 4), while c and d still reference the old tuple object. And because these are tuples, you can't change them in place the way you can with lists.
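If it helps, here is a minimal sketch of that check, reusing the hardcopy and shallow_copy functions from Code D:
a = [1, 2, (1, 2, 3)]
b = hardcopy(a)        # just another reference to the same list
c = shallow_copy(a)    # a new outer list holding the same element objects
d = a.copy()

print(b is a)            # True
print(c is a, d is a)    # False False

old_tuple = a[2]
a[2] = a[2] + (4,)       # a new tuple is bound into slot 2 of the shared list

print(a[2] is b[2])        # True  - same list object, so b sees the new tuple
print(c[2] is old_tuple)   # True  - c kept the old tuple
print(d[2] is old_tuple)   # True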
As for the term "hardcopy", I wouldn't use it. It doesn't appear even once in the official docs, and the mentions in other Python SO questions appear in different contexts. It is also ambiguous (contrary to "shallow" and "deep", which give a good clue to their meaning). For what you describe (an additional name/reference/pointer to the same object) I would expect "hardcopy" to mean exactly the opposite (an actual object copy). Of course there are many ways to say the same thing. We say "copy" because it's shorter, and for immutables it doesn't matter whether the copy happens or not (you can't change them anyway). For mutables, saying "copy" usually means "shallow copy", because you have to "go further" in your code if you want a "deep copy".

Related

List is unchanged even after element is changed

While trying to implement an algorithm, I couldn't get Python lists to mutate via a function. After reading up on the issue, this StackOverflow answer suggested using [:] in order to mutate the list passed in the function argument.
However, as seen in the following code snippet, the issue still persists when trying to mutate the list l. I am expecting the output to be Before: [1, 2, 3, 4] and After: [69, 69, 69, 69], but instead I get back the original value of l, as shown below.
def mutate_list(a, b):
    c = [69] * 4
    a[:] = c[:2]  # changed the elements, but array's still unchanged outside function
    b[:] = c[2:]

if __name__ == '__main__':
    l = [1, 2, 3, 4]
    print("Before: {}".format(l))
    mutate_list(l[:2], l[2:])
    print("After: {}".format(l))
Output:
Before: [1, 2, 3, 4]
After : [1, 2, 3, 4]
Any insights into why this is happening?
The error is that you are not actually passing l itself but two slices of it, and slicing creates new lists. You should change the function, for example:
def mutate_list(a):
    c = [69] * 4
    a[:2] = c[:2]
    a[2:] = c[2:]

if __name__ == '__main__':
    l = [1, 2, 3, 4]
    print("Before: {}".format(l))
    mutate_list(l)
    print("After: {}".format(l))
It's all about scope: mutability applies to the list object, not to the reference variable.
The variables a and b are local variables, so their scope is always the function scope.
The operations which you have performed:
a[:]=c[:2]
b[:]=c[2:]
Note: a and b are both lists now, so inside the function you will get the following output:
[69, 69], [69, 69]
but if you use the + operator, which joins them, the output will look like:
[69, 69, 69, 69]
Everything above happens in local scope. If you want the list to be mutated across the program, you have to declare the list as global inside the function and make your changes on that variable; in that case you also don't need to pass any arguments:
def mutate_list():
    global l  # telling python to use this global variable in a local function
    c = [69] * 4
    l = c     # assigning new values to actual list i.e. l
Now the Before output will be [1, 2, 3, 4]
and the After output will be [69, 69, 69, 69].
As pointed out by others, the issue arose from the fact that the function parameters were slices of the original list, and slicing creates new list objects, so the function was mutating those new lists rather than the original.
Following @Selcuk's suggestion, the correct way of doing such an operation is to pass the original list along with the relevant indices to the function and then perform any slicing inside the function, as sketched after the note below.
NOTE: This concept comes in handy for (recursive) divide-and-conquer algorithms where subarrays must be mutated and combined to form the solution.
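A rough sketch of that suggestion (the index parameters lo, mid and hi are just illustrative names, not part of the original question):
def mutate_list(lst, lo, mid, hi):
    # operate on the original list through index ranges instead of passing slices
    c = [69] * 4
    lst[lo:mid] = c[:2]
    lst[mid:hi] = c[2:]

l = [1, 2, 3, 4]
print("Before: {}".format(l))   # Before: [1, 2, 3, 4]
mutate_list(l, 0, 2, 4)
print("After: {}".format(l))    # After: [69, 69, 69, 69]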

What's the point of assignment to slice?

I found this line in the pip source:
sys.path[:] = glob.glob(os.path.join(WHEEL_DIR, "*.whl")) + sys.path
As I understand it, the line above does the same as the line below:
sys.path = glob.glob(os.path.join(WHEEL_DIR, "*.whl")) + sys.path
With one difference: in the first case sys.path still points to the same object in memory while in the second case sys.path points to the new list created from two existing.
Another thing is that the first case is about twice as slow as the second:
>>> timeit('a[:] = a + [1,2]', setup='a=[]', number=20000)
2.111023200035561
>>> timeit('a = a + [1,2]', setup='a=[]', number=20000)
1.0290934000513516
The reason, I think, is that in the case of slice assignment the elements of a (references to objects) are copied into a new list and then copied back into the resized a.
So what are the benefits of using a slice assignment?
Assigning to a slice is useful if there are other references to the same list, and you want all references to pick up the changes.
So if you do something like:
bar = [1, 2, 3]
foo = bar
bar[:] = [5, 4, 3, 2, 1]
print(foo)
this will print [5, 4, 3, 2, 1]. If you instead do:
bar = [5, 4, 3, 2, 1]
print(foo)
the output will be [1, 2, 3].
With one difference: in the first case sys.path still points to the same object in memory while in the second case sys.path points to the new list created from two existing.
Right: That’s the whole point, you’re modifying the object behind the name instead of the name. Thus all other names referring to the same object also see the changes.
Another thing is that the first case is about twice as slow as the second:
Not really. Slice assignment performs a copy. Performing a copy is an O(n) operation while performing a name assignment is O(1). In other words, the bigger the list, the slower the copy; whereas the name assignment always takes the same (short) time.
Your assumptions are very good!
In Python a variable is a name that points to an object in memory, which in essence is what gives Python the ability to be a dynamically typed language, i.e. you can have the same variable hold a number, then reassign it to a string, etc.
As shown here, whenever you assign a new value to a variable, you are just pointing the name at a different object in memory:
>>> a = 1
>>> id(a)
10968800
>>> a = 1.0
>>> id(a)
140319774806136
>>> a = 'hello world'
>>> id(a)
140319773005552
(in CPython the id refers to its address in memory).
Now for your question: sys.path is a list, and a Python list is a mutable type, meaning the object itself can be changed in place, i.e.
>>> l = []
>>> id(l)
140319772970184
>>> l.append(1)
>>> id(l)
140319772970184
>>> l.append(2)
>>> id(l)
140319772970184
Even though I modified the list by adding items, the name still points to the same object. Following the nature of Python, a list's elements are themselves only references to other objects in memory (the slots aren't the objects, they are just like variables referring to the objects held there), as shown here:
>>> l
[1, 2]
>>> id(l[0])
10968800
>>> l[0] = 3
>>> id(l[0])
10968864
>>> id(l)
140319772970184
After reassigning l[0], the id of that element has changed, but once again the list's hasn't.
Seeing that assigning to an index in the list only changes where that element points, you will now understand that when I reassign l itself, I don't modify the list, I just change where the name l points:
>>> id(l)
140319772970184
>>> l = [4, 5, 6]
>>> id(l)
140319765766728
but if I reassign to all of l's indices, then l stays the same object; only the elements point to different places:
>>> id(l)
140319765766728
>>> l[:] = [7, 8, 9]
>>> id(l)
140319765766728
That also explains why it is slower: Python is reassigning the elements of the list, not just pointing the name somewhere else.
One more little point, if you are wondering about the part where the line finishes with
sys.path[:] = ... + sys.path
It follows the same concept: Python first builds the object on the right side of the = and only then performs the assignment on the left side. So while Python is still building the new list on the right side, sys.path is still the original list and all of its elements are included; then, because we used [:], Python copies the newly created elements back into the slots of the original sys.path object.
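A quick way to see both points at once, using a plain list in place of sys.path:
p = ["a", "b"]
alias = p                 # e.g. another module's reference to the same list
p[:] = ["new"] + p        # the right side is built first from the current contents,
                          # then copied back into the existing list object
print(p)                  # ['new', 'a', 'b']
print(alias is p)         # True: the alias still points at the same, updated list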
Now, as for why pip uses [:] instead of reassigning, I don't know for sure, but I would believe the benefit is reusing the same object in memory for sys.path, so any other references to it see the update.
Python itself also does this for small integers, for example
>>> a = 1
>>> b = 1
>>> c = 1
>>> id(a)
10968800
>>> id(b)
10968800
>>> id(c)
10968800
a, b and c all point to the same object in memory, even though each assignment asked to create a 1 and point to it. Python knows that small numbers are probably going to be used a lot in programs (for example in for loops), so it creates them once and reuses them throughout.
(You might also find this being the case with file handles that Python recycles instead of creating new ones.)
You are right: slice assignment will not rebind the name. Also, a slice object is one type of object in Python; you can use it both to get and to set:
In [1]: a = [1, 2, 3, 4]
In [2]: a[slice(0, len(a), 2)]
Out[2]: [1, 3]
In [3]: a[slice(0, len(a), 2)] = 6, 6
In [4]: a[slice(0, len(a), 1)] = range(10)
In [5]: a
Out[5]: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
In [6]: a[:] = range(4)
In [7]: a
Out[7]: [0, 1, 2, 3]

Why deepcopy of list of integers returns the same integers in memory?

I understand the differences between shallow copy and deep copy as I have learnt in class. However the following doesn't make sense
import copy
a = [1, 2, 3, 4, 5]
b = copy.deepcopy(a)
print(a is b)
print(a[0] is b[0])
Output:
False
True
Shouldn't print(a[0] is b[0]) evaluate to False as the objects and their constituent elements are being recreated at a different memory location in a deep copy? I was just testing this out as we had discussed this in class yet it doesn't seem to work.
It was suggested in another answer that this may be due to the fact that Python has interned objects for small integers. While this statement is correct, it is not what causes that behaviour.
Let's have a look at what happens when we use bigger integers.
>>> from copy import deepcopy
>>> x = 1000
>>> x is deepcopy(x)
True
If we dig down in the copy module we find out that calling deepcopy with an atomic value defers the call to the function _deepcopy_atomic.
def _deepcopy_atomic(x, memo):
    return x
So what is actually happening is that deepcopy will not copy an atomic value, but only return it.
For example, this is the case for int, float, str, functions, and more.
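A small sketch of that behaviour (checked on CPython; whether fully-immutable objects are copied is an implementation detail, as noted further below):
from copy import deepcopy

s = "hello"
n = 10 ** 6
def f(): pass

print(s is deepcopy(s))   # True: strings are atomic, returned unchanged
print(n is deepcopy(n))   # True: ints too, regardless of their size
print(f is deepcopy(f))   # True: functions as well
print([s] is deepcopy([s]))         # False: the list itself is copied...
print([s][0] is deepcopy([s])[0])   # True: ...but its atomic element is shared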
The reason for this behavior is that Python optimizes small integers, so they are not actually at different memory locations. Check out the id of 1; it is always the same:
>>> x = 1
>>> y = 1
>>> id(x)
1353557072
>>> id(y)
1353557072
>>> a = [1, 2, 3, 4, 5]
>>> id(a[0])
1353557072
>>> import copy
>>> b = copy.deepcopy(a)
>>> id(b[0])
1353557072
Reference from Integer Objects:
The current implementation keeps an array of integer objects for all integers between -5 and 256, when you create an int in that range you actually just get back a reference to the existing object. So it should be possible to change the value of 1. I suspect the behaviour of Python in this case is undefined. :-)
Olivier Melançon's answer is the correct one if we take this as a mechanical question of how the deepcopy function call ends up returning references to the same int objects rather than copies of them. I'll take a step back and answer the question of why that is the sensible thing for deepcopy to do.
The reason we need to make copies of data structures - either deep or shallow copies - is so we can modify their contents without affecting the state of the original; or so we can modify the original while still keeping a copy of the old state. A deep copy is needed for that purpose when a data structure has nested parts which are themselves mutable. Consider this example, which multiplies every number in a 2D grid, like [[1, 2], [3, 4]]:
import copy

def multiply_grid(grid, k):
    new_grid = copy.deepcopy(grid)
    for row in new_grid:
        for i in range(len(row)):
            row[i] *= k
    return new_grid
Objects such as lists are mutable, so the operation row[i] *= k changes their state. Making a copy of the list is a way to defend against mutation; a deep copy is needed here to make copies of both the outer list and the inner lists (i.e. the rows), which are also mutable.
But objects such as integers and strings are immutable, so their state cannot be modified. If an int object is 13 then it will stay 13, even if you multiply it by k; the multiplication results in a different int object. There is no mutation to defend against, and hence no need to make a copy.
Interestingly, deepcopy doesn't need to make copies of tuples if their components are all immutable*, but it does when they have mutable components:
>>> import copy
>>> x = ([1, 2], [3, 4])
>>> x is copy.deepcopy(x)
False
>>> y = (1, 2)
>>> y is copy.deepcopy(y)
True
The logic is the same: if an object is immutable but has nested components which are mutable, then a copy is needed to avoid mutation to the components of the original. But if the whole structure is completely immutable, there is no mutation to defend against and hence no need for a copy.
* As Kelly Bundy points out in the comments, deepcopy sometimes does make copies of deeply-immutable objects, for example it does generally make copies of frozenset instances. The principle is that it doesn't need to make copies of those objects; it is an implementation detail whether or not it does in some specific cases.

Python Variable Scope (passing by reference or copy?)

Why does the variable L get modified by the sorting(L) function call? In other languages, wouldn't a copy of L be passed to sorting(), so that any changes to x would not change the original variable?
def sorting(x):
    A = x  # Passed by reference?
    A.sort()

def testScope():
    L = [5, 4, 3, 2, 1]
    sorting(L)  # Passed by reference?
    return L
>>> print testScope()
[1, 2, 3, 4, 5]
Long story short: Python uses pass-by-value, but the things that are passed by value are references. The actual objects have 0 to infinity references pointing at them, and for purposes of mutating that object, it doesn't matter who you are and how you got a reference to the object.
Going through your example step by step:
L = [...] creates a list object somewhere in memory, the local variable L stores a reference to that object.
sorting (strictly speaking, the callable object pointed to by the global name sorting) gets called with a copy of the reference stored by L, and stores it in a local called x.
The method sort of the object pointed to by the reference contained in x is invoked. It gets a reference to the object (in the self parameter) as well. It somehow mutates that object (the object itself, not some reference to the object, which is little more than a memory address).
Now, since references were copied, but not the object the references point to, all the other references we discussed still point to the same object. The one object that was modified "in-place".
testScope then returns another reference to that list object.
print uses it to request a string representation (calls the __str__ method) and outputs it. Since it's still the same object, of course it's printing the sorted list.
So whenever you pass an object anywhere, you share it with whoever receives it. Functions can (but usually won't) mutate the objects (pointed to by the references) they are passed, from calling mutating methods to assigning members. Note though that assigning a member is different from assigning a plain ol' name - which merely means mutating your local scope, not any of the caller's objects. So you can't mutate the caller's locals (this is why it's not pass-by-reference).
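A small sketch of that distinction (the function names are just for illustration):
def mutate(seq):
    seq.append(99)   # changes the shared object; the caller sees it

def rebind(seq):
    seq = [99]       # only rebinds the local name; the caller's list is untouched

L = [1, 2, 3]
mutate(L)
print(L)   # [1, 2, 3, 99]
rebind(L)
print(L)   # [1, 2, 3, 99] - unchanged by the rebinding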
Further reading: A discussion on effbot.org why it's not pass-by-reference and not what most people would call pass-by-value.
Python has the concept of Mutable and Immutable objects. An object like a string or integer is immutable - every change you make creates a new string or integer.
Lists are mutable and can be manipulated in place. See below.
a = [1, 2, 3]
b = [1, 2, 3]
c = a
print a is b, a is c
# False True
print a, b, c
# [1, 2, 3] [1, 2, 3] [1, 2, 3]
a.reverse()
print a, b, c
# [3, 2, 1] [1, 2, 3] [3, 2, 1]
print a is b, a is c
# False True
Note how c was reversed, because c "is" a. There are many ways to copy a list to a new object in memory. An easy method is to slice: c = a[:]
It's specifically mentioned in the documentation that the .sort() method mutates the collection in place. If you want a sorted version without changing the original, use sorted(L) instead; it returns a new sorted list rather than sorting the list in place.
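A quick check of the difference:
L = [5, 4, 3, 2, 1]
print(sorted(L))   # [1, 2, 3, 4, 5]  - a new sorted list
print(L)           # [5, 4, 3, 2, 1]  - the original is untouched
L.sort()           # sorts in place and returns None
print(L)           # [1, 2, 3, 4, 5]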
a = 1
b = a
a = 2
print b
References are not the same as separate objects.
.sort() also mutates the collection.

Confusing [...] List in Python: What is it?

So I was writing up a simple binary tree in Python and came across [...]
I don't believe this is related to the Ellipsis object; it seems to have something to do with an infinite loop (due to Python's shallow copy?). The source of this infinite loop, and why it doesn't get expanded when printed yet does expand when accessed, is something I'm completely lost on, however.
>>> a
[[[[[], [], 8, 3], [[], [], 3, 2], 6, 3], [], 1, 4], [[], [], -4, 2], 0, 0]
>>> Keys(a)#With a+b
[0, 1, 6, 8, 3, -4]
>>> Keys(a)#With [a,b]
[8, [...], [...], 3, [...], [...], 6, [...], [...], 1, [...], [...], -4, [...], [...], 0, [...], [...]]
>>> Keys(a)[1]#??
[8, [...], [...], 3, [...], [...], 6, [...], [...], 1, [...], [...], -4, [...], [...], 0, [...], [...], 8, [...], [...], 3, [...], [...], 6, [...], [...], 1, [...], [...], -4, [...], [...], 0, [...], [...]]
Version using a+b
def Keys(x, y=[]):
    if len(x): y += [x[2]] + Keys(x[0], y) + Keys(x[1], y)  # Though it seems I was using y=y[:]+, this actually outputs an ugly mess
    return y
Version using [a,b]
def Keys(x, y=[]):
    if len(x): y += [x[2], Keys(x[0], y), Keys(x[1], y)]
    return y
So what exactly is [...]?
It can also appear if you have a circular structure with a list pointing to itself. Like this:
>>> a = [1,2]
>>> a.append(a)
>>> a
[1, 2, [...]]
>>>
Since python can't print out the structure (it would be an infinite loop) it uses the ellipsis to show that there is recursion in the structure.
I'm not quite sure if the question was what was going on or how to fix it, but I'll try to correct the functions above.
In both of them, you first make two recursive calls, which add data to the list y, and then AGAIN append the returned data to y. This means the same data will be present several times in the result.
Either just collect all the data without adding to any y, with something like
return [x[2]]+keys(x[0])+keys(x[1])
or just do the appending in the calls, with something like
y += [x[2]]
keys(x[0], y) #Add left children to y...
keys(x[1], y) #Add right children to y...
return y
(Of course, both these snippets need handling for empty lists etc)
@Abgan also noted that you really don't want y=[] in the initializer.
I believe that your 'tree' contains itself, and therefore it contains cycles.
Try this code:
a = [1,2,3,4]
print a
a.append(a)
print a
The first print outputs:
[1,2,3,4]
while the second:
[1,2,3,4, [...]]
The reason is using
def Keys(x,y=[]):
This is wrong and evil. List is a mutable object, and when used as a default parameter, it is preserved between function calls.
So each y += "anything" operation adds to the same list (in all function calls, and since the function is recursive...)
See the Effbot or Devshed for more details on mutable objects passed as default values for functions.
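A minimal illustration of that pitfall (the function name is just for illustration):
def collect(item, acc=[]):   # the default list is created once, at definition time
    acc.append(item)
    return acc

print(collect(1))   # [1]
print(collect(2))   # [1, 2]     - the same default list again
print(collect(3))   # [1, 2, 3]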
I don't understand your code above, but the [...] is, I think, the Python interpreter's way of skipping over an infinite data structure when printing. For example:
>>> a = [0, 1]
>>> a[0] = a
>>> a
[[...], 1]
It looks like your tree structure is becoming looped.
The answers about slice objects are beside the point.
I don't believe this is related to the Ellipsis object; it seems to have something to do with an infinite loop (due to Python's shallow copy?). The source of this infinite loop, and why it doesn't get expanded when printed yet does expand when accessed, is something I'm completely lost on, however.
Look at the following code:
>>> a = [0]
>>> a.append(a)
>>> print a
[0, [...]]
How is Python supposed to print a? It is a list that contains a zero and a reference to itself. Hence it is a list that contains a zero and a reference to a list
[0, [...]]
which in turn contains a zero and a reference to a list
[0, [0, [...]]]
which in turn contains a zero and a reference to a list,
and so on, recursively:
[0, [0, [0, [...]]]]
[0, [0, [0, [0, [...]]]]]
[0, [0, [0, [0, [0, [...]]]]]]
...
There is nothing wrong with the recursive data structure itself. The only problem is that it cannot be displayed, for this would imply an infinite recursion. Hence Python stops at the first recursion step and deals with the infinity issue printing only the ellipsis, as was pointed out in previous answers.
If you had used a PrettyPrinter, the output would have been self-explanatory:
>>> import pprint
>>> l = [1,2,3,4]
>>> l[0] = l
>>> l
[[...], 2, 3, 4]
>>> pp = pprint.PrettyPrinter(indent=4)
>>> pp.pprint(l)
[<Recursion on list with id=70327632>, 2, 3, 4]
>>> id(l)
70327632
EDIT: As mentioned above, this isn't the Ellipsis object, but the result of a looped list. I jumped the gun here. Knowing about the Ellipsis object is a good bit of back shelf knowledge should you find an Ellipsis in some actual code, rather than the output.
The Ellipsis object in Python is used for extended slice notation. It's not used in current Python core libraries, but is available for developers to define in their own libraries. For example, NumPy (or SciPy) use this as part of their array object. You'll need to look at the documentation for tree() to know exactly how Ellipsis behaves in this object.
From Python documentation:
3.11.8 The Ellipsis Object
This object is used by extended slice notation (see the Python Reference Manual). It supports no special operations. There is exactly one ellipsis object, named Ellipsis (a built-in name).
It is written as Ellipsis.
Ok, so in points:
1) You're creating an infinite data structure: def Keys(x,y=[]) will use the same 'y' in each call. This just isn't correct.
2) The print statement, however, is clever enough not to print infinite data, but to mark the self-reference with a [...] (known as Ellipsis).
3) Python will allow you to address such a structure correctly, so you can write a.keys()[1][1][1] and so on. Why shouldn't you?
4) The y = y[:] statement simply copies the list y. It can be done more soundly with y = list(y).
Try using the following code:
def Keys(x, y=None):
    if y is None:
        y = []
    if len(x):
        y += [x[2], Keys(x[0], y), Keys(x[1], y)]
    return y
But still I guess that it can bite you. You're still using the same variable y (I mean the same object) in three places in one expression:
y += [x[2], Keys(x[0], y), Keys(x[1], y)]
Is that what you really want to achieve?
Or maybe you should try:
def mKeys(x, y=None):
    if y is None:
        y = []
    if len(x):
        z = [x[2], mKeys(x[0], y), mKeys(x[1], y)]
        return z
    return []
For the difference between the two versions of the function Keys, note the following difference:
y+=[x[2]]+Keys(x[0],y)+Keys(x[1],y)
The right side value in this statement is a list which contains x[2], plus the ELEMENTS OF Keys(x[0],y) and the ELEMENTS OF Keys(x[1],y)
y+=[x[2],Keys(x[0],y),Keys(x[1],y)]
The right side value in this statement is a list which contains x[2], plus the LIST Keys(x[0],y) and the LIST Keys(x[1],y).
So the version using [a,b] will cause y to contain itself among its elements.
Some other notes:
Since in Python the default value object is created once, when the function is defined, the first version will not work like the example shows: it will contain multiple copies of some keys. It's hard to explain briefly, but you can get some idea by printing the values of x and y on each call of Keys.
This is confirmed by running the function on my machine with python 2.5.2.
Also, because the default value is defined only once at function definition time, even if the function works correctly the first time, it will not work when called with a different a, since the keys of the first binary tree will remain in y.
You can see this by calling Keys(a) twice, or calling it on two different lists.
The second parameter is not required for this problem. The function can be like this:
def Keys(a):
    if a == []:
        return []
    else:
        return [a[2]] + Keys(a[0]) + Keys(a[1])
Defining a recursive function basically involves two parts: solving subproblems and combining the results. In your code, the combining part is done twice: once by accumulating results in y, and once by adding the lists together.
The issue is that one of the list's elements references the list itself, so an attempt to print all the elements would never end.
Illustration:
x = range(3)
x.append(x)
x[3][3][3][3][3][0] = 5
print x
Output:
[5, 1, 2, [...]]
x[3] refers to x itself. The same goes for x[3][3].
This can be visualized better here.
