Copying arrays in python and manipulating one - python

I am loading a file "data.imputation", which holds 2-dimensional data, into the variable 'x'. Variable 'y' is a copy of 'x'. I pop the first array from 'y' (y is 2D). Why is the change reflected in x? (The first array of 'x' is also popped.)
import pickle
with open('data.meanimputation', 'rb') as ip:  # pickle files must be opened in binary mode on Python 3
    x = pickle.load(ip)
y = x
y.pop(0)
At the start, len(x) == len(y). Even after y.pop(0), len(x) == len(y). Why is that? And how can I avoid it?

Use y = x[:] instead of y = x. y = x means that both y and x now point to the same object.
Take a look at this example:
>>> x=[1,2,3,4]
>>> y=x
>>> y is x
True # both y and x are references to the same object [1,2,3,4], so a change made through either name affects that one list
>>> y=x[:] # this makes a copy of x and assigns that copy to y
>>> y is x # y and x now point to different objects, so changing one will not affect the other
False
If x is a list of lists, then [:] is not enough:
>>> x= [[1,2],[4,5]]
>>> y=x[:] # this makes a shallow copy, i.e. only the references to the inner objects are copied into y
>>> y is x
False # y and x are not the same object, but the objects contained in them are the same
>>> y[0].append(99)
>>> x
[[1, 2, 99], [4, 5]]
>>> y
[[1, 2, 99], [4, 5]]
>>> y[0] is x[0]
True #see both point to the same object
In such a case you should use the copy module's deepcopy() function, which makes deep (non-shallow) copies of an object.
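A minimal sketch of the deepcopy() approach, using the same nested list as the example above (the names shallow and deep are only illustrative):

```python
import copy

x = [[1, 2], [4, 5]]
shallow = x[:]             # new outer list, but the inner lists are shared with x
deep = copy.deepcopy(x)    # new outer list and new inner lists

deep[0].append(99)         # only the deep copy changes
print(x)                   # [[1, 2], [4, 5]]
print(deep)                # [[1, 2, 99], [4, 5]]
print(shallow[0] is x[0])  # True: the shallow copy still shares the inner lists
print(deep[0] is x[0])     # False
```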

y = x does not copy anything. It binds the name y to the same object already referred to by x. Assignment to a bare name never copies anything in Python.
If you want to copy an object, you need to copy it explicitly, using whatever methods are available for the object you're trying to copy. You don't say what kind of object x is, so there's no way to say how you might be able to copy it, but the copy module provides some functions that work for many types. See also this answer.
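As a rough illustration of explicit copying applied to the question, assuming x is a plain list of lists (the data below is a made-up stand-in for the unpickled file):

```python
import copy

x = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]  # hypothetical stand-in for the loaded data

y = copy.copy(x)       # shallow copy: a new outer list that shares the rows with x
y.pop(0)               # removes the first row from y only
print(len(x), len(y))  # 3 2

z = copy.deepcopy(x)   # fully independent copy, including the inner lists
z[0][0] = -1.0         # does not touch x
print(x[0][0])         # 1.0
```

A shallow copy is enough if only the outer list is modified (as with pop); deepcopy is the safer choice if the rows themselves will be changed.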

Related

Assignment operator and variable identifiers

The following code (python3) prints 5 and not 4.
x = 5
y = x
x = 4
print(y)
But every now and then I find a situation where the opposite behavior occurs, where a variable is assigned to some location in memory rather than the value stored there. A few times I have found a bug in my code because of something like this.
Is there a similar situation to the above code (and in Python), in which a beginner might intend for the code to store the current value of one variable in a new variable, but instead of the value it assigns the identifier? Also, I don't mean to impose any specific data type by the words "value" and "variable" here, so my apologies if these terms are not correct.
If there are some comments about how this behavior varies over some common languages (esp. python, javascript, java, c, haskell), I would be interested to hear about that. And of course, any suggestions on what is appropriate terminology for this question (and how to tag it) would be kindly appreciated as well.
EDIT: I'm accepting an answer which describes how the behavior varies with immutable/mutable types as this is likely the behavior I had encountered, and I had asked in particular about a source of confusion for a beginner programmer. However, someone visiting this page with a similar question should refer also to the comments section which indicates that a general answer isn't as simple as mutable/immutable data types.
>>> 1
1
>>> id(1)
140413541333480
>>> x = 1
>>> id(x)
140413541333480
>>> z = 1
>>> id(z)
140413541333480
>>> y = x
>>> id(y)
140413541333480
>>>
As an optimisation, CPython keeps only a single copy of 1 (small integers are cached), and all of these variables reference it.
Now, integers and strings in Python are immutable: an existing object cannot be changed in place. Assigning a new value to a name therefore makes the name point to a different object, with a different id.
>>> x = 1 # x points to reference (id) of 1
>>> y = x # now y points to reference (id) of 1
>>> x = 5 # x now points to a new reference: id(5)
>>> y # y still points to the reference of 1
1
>>> x = "foo"
>>> y = x
>>> x = "bar"
>>> y
'foo'
>>>
Lists and dicts are mutable, i.e., you can modify the value in place and the reference (id) stays the same.
>>> x = [1, 'foo']
>>> id(x)
4493994032
>>> x.append('bar')
>>> x
[1, 'foo', 'bar']
>>> id(x)
4493994032
>>>
So, if your variable points to a reference, the reference contains a mutable value, and the value is updated in place, the variable will reflect the latest value.
If the variable is instead assigned a new reference, it points to the new object, and any other variable that held the old reference still sees the old value.
>>> x = [1, 'foo']
>>> y = x # y points to reference of [1, 'foo']
>>> x = [1, 'foo', 'bar'] # reference is overridden. x points to reference of [1, 'foo', 'bar']. This is a new reference. In memory, we now have [1, 'foo'] and [1, 'foo', 'bar'] at two different locations.
>>> y
[1, 'foo']
>>>
>>> x = [1, 'foo']
>>> y = x
>>> x.append(10) # the object is modified in place; the reference is unchanged
>>> y
[1, 'foo', 10]
>>> x = {'foo': 10}
>>> y = x
>>> x = {'foo': 20, 'bar': 20}
>>> y
{'foo': 10}
>>> x = {'foo': 10}
>>> y = x
>>> x['bar'] = 20 # the object is modified in place; the reference is unchanged
>>> y
{'foo': 10, 'bar': 20}
>>>
The 2nd part of your question (common languages) is too broad, and Stack Overflow isn't the right forum for that. Please do your own research on your languages of interest; there's plenty of information available for each language group and its forums.
Your solution is in this video. Watch it carefully and you should be able to understand it properly:
https://www.youtube.com/watch?v=_AEJHKGk9ns
Unlike some other languages, Python does not make a copy of the object and assign it to the new variable when you write y = x.
What Python actually does is make y refer to the same object that x refers to.
Let's make this clearer with your example:
x=5
After this line executes, x refers to an object in memory that stores the value 5.
y = x
Now x and y both refer to that same object, 5.
x=4
This is the line that matters. Keep in mind that x and y refer to immutable objects (integers), whose values cannot be changed. Assigning 4 to x does not alter the object 5; it makes x refer to a different object holding 4, while y still refers to the old object holding 5.
So the output will be 5.
In the case of a mutable object like a list, if you do this:
x = [5, 6]
y = x
x.append(7)
print(y)
This will print
[5, 6, 7]
as output
x = 5
print(id(x))  # e.g. 94516543976904; x refers to the int object 5
y = x
print(id(y))  # same id as above: no new object is created, y refers to the same object as x
x = 4
print(id(x))  # a different id, e.g. 94516543976928: integers are immutable, so x is rebound to the object 4 while y still refers to 5

What is the need of ellipsis[...] while modifying array values in numpy?

import numpy as np
a = np.arange(0,60,5)
a = a.reshape(3,4)
for x in np.nditer(a, op_flags = ['readwrite']):
    x[...] = 2*x
print('Modified array is:')
print(a)
In the above code, why can't we simply write x=2*x instead of x[...]=2*x?
No matter what kind of object we were iterating over or how that object was implemented, it would be almost impossible for x = 2*x to do anything useful to that object. x = 2*x is an assignment to the variable x; even if the previous contents of the x variable were obtained by iterating over some object, a new assignment to x would not affect the object we're iterating over.
In this specific case, iterating over a NumPy array with np.nditer(a, op_flags = ['readwrite']), each iteration of the loop sets x to a zero-dimensional array that's a writeable view of a cell of a. x[...] = 2*x writes to the contents of the zero-dimensional array, rather than rebinding the x variable. Since the array is a view of a cell of a, this assignment writes to the corresponding cell of a.
This is very similar to the difference between l = [] and l[:] = [] with ordinary lists, where l[:] = [] will clear an existing list and l = [] will replace the list with a new, empty list without modifying the original. Lists don't support views or zero-dimensional lists, though.
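The list analogy can be sketched as follows; the variable names are made up for this illustration, and each list gets a second name so the effect on the original object stays visible:

```python
first = [1, 2, 3]
first_alias = first     # second name for the same list object
first[:] = []           # slice assignment: empties the existing list in place
print(first_alias)      # []

second = [1, 2, 3]
second_alias = second   # second name for the same list object
second = []             # rebinding: second now names a brand-new empty list
print(second_alias)     # [1, 2, 3]
```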

Does slice operation allocate a new object always?

I am confused about the slice operation.
>>> s = "hello world"
>>> y = s[::]
>>> id(s)
4507906480
>>> id(y)
4507906480 # they are the same - no new object was created
>>> z = s[:2]
>>> z
'he'
>>> id(z)
4507835488 # z is a new object
What allocation rule does slice operation follow?
For most built-in types, slicing is always a shallow copy... in the sense that modifying the copy will not modify the original. This means that for immutable types, an object counts as a copy of itself. The copy module also uses this concept of "copy":
>>> import copy
>>> t = (1, 2, 3)
>>> copy.copy(t) is t
True
Objects are free to use whatever allocation strategy they choose, as long as they implement the semantics they document. y can be the same object as s, but z cannot, because s and z store different values.
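A small sketch contrasting the two cases. Whether a full slice of a string or tuple returns the very same object is an implementation detail (CPython does this), so those results are printed rather than assumed here:

```python
s = "hello world"
t = (1, 2, 3)
lst = [1, 2, 3]

# Immutable objects: the full slice may be the original object itself.
print(s[:] is s)
print(t[:] is t)

# Lists: the full slice is a new list, so the copy can change independently.
lst_copy = lst[:]
print(lst_copy is lst)  # False
lst_copy.append(4)
print(lst)              # [1, 2, 3]
print(lst_copy)         # [1, 2, 3, 4]
```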

Variable Assignment: Pointers in Python and Containers in Other Languages

I am reading the book A Whirlwind Tour of Python by Jake VanderPlas. He explains a consequence of the "variable as pointer" model in Python that one should be aware of. He uses the following example:
x = [1, 2, 3]
y = x
print(y)  # ==> [1, 2, 3]
x.append(4)
print(y)  # ==> [1, 2, 3, 4]
This shows that x and y are pointing to the same object as modifying x changes y (if the object is mutable).
My question is: in other languages that treat variables as containers, does the assignment y = x simply create a copy of x with the same elements, so that when you modify x, y is unaffected? I think that is what he implies, but it's not clearly stated. I am also wondering whether the strict type declarations in such languages have something to do with the fact that y would be unaffected if x changes.

Why does 'remove' mutate a list when called locally? [duplicate]

Possible Duplicate:
Understanding Python's call-by-object style of passing function arguments
I recently came across this:
x = [1,2,3]
def change_1(x):
    x = x.remove(x[0])
    return x
Which results in:
>>> change_1(x)
>>> x
[2, 3]
I find this behavior surprising, as I thought that whatever goes on inside a function has no effect on outside variables. Furthermore, I constructed an example where I basically do the same thing but without using remove:
x = [1,2,3]
def change_2(x):
    x = x[1:]
    return x
Which results in:
>>> change_2(x)
[2, 3] # Also the output prints out here not sure why this is
>>> x
[1, 2, 3]
And I get the result that I would expect, the function does not change x.
So it must be something remove specific that has effect. What is going on here?
One of the things that is confusing is that you've called lots of different things 'x'. For example:
def change_1(x): # the parameter x is a local name that refers to the same list as the global 'x'
    x = x.remove(x[0]) # remove() mutates that list in place and returns None, so the local x is rebound to None
    return x # the local x (None) is returned
>>> x = [1, 2, 3]
>>> y = change_1(x) # assign the return value to 'y'
>>> print(y)
None # this is None because x.remove() returns None, which was assigned to the local 'x' inside the function
>>> print(x)
[2, 3] # but x.remove() modified the global list inside the function
def change_2(x):
    x = x[1:] # the local x is rebound to a new list (a copy of the slice); the list that was passed in is not changed
    return x # return the slice (copy)
>>> x = [1, 2, 3]
>>> y = change_2(x)
>>> print(x)
[1, 2, 3] # the global 'x' is not changed!
>>> print(y)
[2, 3] # but the slice created in the function is assigned to 'y'
You would get the same result if you called the parameter of your function n, or q.
It's not the variable name that's being affected. The x in the scope of your function and the x outside that scope are two different "labels". However, since you passed the object that x was attached to into change_1(), they both refer to the same object. When you call x.remove() on the object in the function, you are basically saying: "get the object that x refers to. Now remove an element from that object." This is very different from an assignment with x on the left-hand side. If you did y=0; x=y inside your function, you would not be doing anything to the list object. You would basically just be tearing the 'x' label off of [1, 2, 3] and putting it on whatever y is pointing to.
Your variable x is just a reference to the list you've created. When you call that function, you are passing that reference by value. But in the function, you have a reference to the same list. So when the function modifies the list, the change is visible in every scope that refers to it.
Also, when executing commands in the interactive interpreter, python prints the return value, if it is not assigned to a variable.
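The difference between mutating the passed-in object and rebinding the local name can be condensed into a short sketch. The function names mutate and rebind, and the parameter name items, are hypothetical and chosen only to avoid reusing x everywhere:

```python
def mutate(items):
    items.remove(items[0])  # changes the list object the caller passed in

def rebind(items):
    items = items[1:]       # binds the local name to a new list (a slice copy)
    return items

a = [1, 2, 3]
mutate(a)
print(a)        # [2, 3]

b = [1, 2, 3]
result = rebind(b)
print(b)        # [1, 2, 3]
print(result)   # [2, 3]
```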
