Changes to copies of object mutate original object - python

I have a class within which there is a DataFrame type property. I want to be able to perform arithmetic on the objects using the built-ins while keeping the original objects immutable. Unfortunately, the operations seem to be mutating the original objects as well. Here's an example:
import numpy as np
import pandas as pd
class Container:
    def __init__(self):
        self.data = pd.DataFrame()
    def generate(self):
        self.data = pd.DataFrame(np.random.randint(0,100,size=(100, 1)), columns=['A'])
        return self
    def __add__(self, other):
        copy = self
        new = Container()
        new.data['A'] = copy.data.eval(f"A + {0}".format(other))
        return new
one = Container().generate()
two = one + 1
print(one.data == two.data)
I think the problem is the copy = self line, but I can't seem to preserve the original object even using the copy() method.
How do I make sure the original object doesn't change when a new one is created from it?

Surprisingly, while copy = self isn't a copy, your bug doesn't actually have anything to do with that. I don't think you even need a copy there.
Your bug is due to double-formatting a string:
f"A + {0}".format(other)
f"A + {0}" is an f-string. Unlike format, it evaluates the text 0 as a Python expression and substitutes the string representation of the resulting object into the resulting string, producing "A + 0". Calling format on that doesn't do anything, since there's no format placeholder left. You end up calling
copy.data.eval("A + 0")
instead of adding what you wanted to add.
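A minimal sketch of the difference, using plain strings so it runs without pandas. The name `other` mirrors the question; the remaining variable names are only illustrative.

```python
other = 1

# Original expression: the f-string evaluates the literal 0 first,
# so .format() has no placeholder left to fill.
double_formatted = f"A + {0}".format(other)
print(double_formatted)   # A + 0

# Either formatting style on its own substitutes the operand.
with_fstring = f"A + {other}"
with_format = "A + {0}".format(other)
print(with_fstring)       # A + 1
print(with_format)        # A + 1
```

Passing either corrected string to `eval` should then add the operand instead of zero.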

Did you deepcopy?
from copy import deepcopy
dupe=deepcopy(thing)
#now thing and dupe are two separate objects
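For reference, a small sketch of how plain assignment, a shallow copy, and `deepcopy` differ. The `Thing` class and its `items` attribute are hypothetical, made up for this illustration.

```python
from copy import copy, deepcopy


class Thing:
    """Hypothetical class holding a nested mutable attribute."""

    def __init__(self):
        self.items = [[1, 2], [3, 4]]


thing = Thing()
alias = thing            # same object under another name
shallow = copy(thing)    # new Thing, but it shares the items list
dupe = deepcopy(thing)   # new Thing with its own nested lists

thing.items[0].append(99)

print(alias.items[0])    # [1, 2, 99]
print(shallow.items[0])  # [1, 2, 99]
print(dupe.items[0])     # [1, 2]
```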

Related

Quick way to convert all instance variables in a class, to a list (Python)

I have created a class with 100+ instance variables (it will be used in a function to do something else).
Is there a way to put all the instance variables into a list without manually appending each one?
For instance:
class CreateHouse(object):
    def __init__(self):
        self.name = "Foobar"
        self.title = "FooBarTest"
        self.value = "FooBarValue"
        # ...
        # ...
        # (100 more instance variables)
Is there a quicker way to append all these items to a list:
Quicker than:
theList = []
theList.append(self.name)
theList.append(self.title)
theList.append(self.value)
# ... (x100 elements)
The list would be used to perform another task, in another class/method.
The only solution (without totally rethinking your whole design - which FWIW might be an option to consider, cf my comments on your question) is to have a list of the attribute names (in the order you want them in the final list) and use getattr
class MonstruousGodClass(object):
    _fields_list = ["name", "title", "value", ] #etc...
    def as_list(self):
        return [getattr(self, fieldname) for fieldname in self._fields_list]
Now since, as I mentioned in a comment, a list is NOT the right datatype here (from a semantic point of view at least), you may want to use a dict instead, which makes the code much simpler:
import copy
def as_dict(self):
    # we return a deepcopy to avoid unexpected side-effects
    return copy.deepcopy(self.__dict__)
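Put together, the two approaches might look like the sketch below. `House` is a hypothetical, shortened stand-in for the question's class with only three attributes, and `vars(self)` is used as an equivalent spelling of `self.__dict__`.

```python
import copy


class House:
    """Hypothetical, shortened stand-in for the question's class."""

    # Attribute names, in the order wanted in the final list.
    _fields_list = ["name", "title", "value"]

    def __init__(self):
        self.name = "Foobar"
        self.title = "FooBarTest"
        self.value = "FooBarValue"

    def as_list(self):
        return [getattr(self, field) for field in self._fields_list]

    def as_dict(self):
        # Deep copy so callers cannot mutate the instance by accident.
        return copy.deepcopy(vars(self))


house = House()
print(house.as_list())  # ['Foobar', 'FooBarTest', 'FooBarValue']
print(house.as_dict())
```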

Assigning an (OOP) object to another

I was trying to assign a Python object to another in-place using a member function such as replace_object() below. However, as you can see, object_A remains unchanged and the only way to copy object_B is to create an entirely new object object_C, which defeats the purpose of in-place assignment.
What is going on here and how can I make the assignment in-place?
class some_class():
    def __init__(self, attribute):
        self.attribute = attribute
    def replace_object(self, new_object):
        self = new_object
        # Does this line even have any effect?
        self.attribute = new_object.attribute
        self.new_attribute = 'triangle'
        return self
object_A = some_class('yellow')
print(object_A.attribute) # yellow
object_B = some_class('green')
object_C = object_A.replace_object(object_B)
print(object_A.attribute) # yellow
print(object_C.attribute) # green
#print(object_A.new_attribute) # AttributeError!
print(object_B.new_attribute) # triangle
print(object_C.new_attribute) # triangle
I also tried to play around with deep copies using copy.copy(), but to no avail.
An interesting twist to this is that if I replace
object_C = object_A.replace_object(object_B)
with
object_A = object_A.replace_object(object_B)
then I get what I want. But why can't the same result be achieved by the self = new_object statement within replace_object()?
PS: I have a very good reason to do this in-place assignment, so although it may not be best practice in general, just go along with me here.
You can't 'assign an object to another'. You can assign new and existing objects to new and existing names.
self = new_object only says 'from now on the name self will refer to new_object', and does nothing to the old object. (Note self is just a variable name like any other and only by convention refers to an object within a class definition.)
The subsequent command self.attribute = new_object.attribute has no effect because self has already become a duplicate label for the new_object.
You could copy all the properties of a new object to the old object. You would end up with two distinct objects with different names and identical properties. A test of equality (a == b) would return false unless you overrode the equality operator for these objects.
To copy all the properties inline you could do something like this:
def replace_object(self, new_object):
    self.__dict__ = new_object.__dict__.copy() # just a shallow copy of the attributes
There are very likely better ways to do whatever it is you want to do.
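A short sketch contrasting the two behaviours described above. The class `Shape` and the method names `rebind` and `replace_in_place` are hypothetical, chosen only for this illustration.

```python
class Shape:
    """Hypothetical class used only for this illustration."""

    def __init__(self, attribute):
        self.attribute = attribute

    def rebind(self, new_object):
        # Only changes what the local name `self` refers to.
        self = new_object
        return self

    def replace_in_place(self, new_object):
        # Overwrites this object's attributes with a shallow copy.
        self.__dict__ = new_object.__dict__.copy()


a = Shape("yellow")
b = Shape("green")

a.rebind(b)
print(a.attribute)   # yellow: the object named a is untouched

a.replace_in_place(b)
print(a.attribute)   # green
print(a is b)        # False: still two distinct objects
```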

Why does .append() not work on this list?

I have an object scene which is an instance of class Scene and has a list children which returns:
[<pythreejs.pythreejs.Mesh object at 0x000000002E836A90>, <pythreejs.pythreejs.SurfaceGrid object at 0x000000002DBF9F60>, <pythreejs.pythreejs.Mesh object at 0x000000002E8362E8>, <pythreejs.pythreejs.AmbientLight object at 0x000000002E8366D8>, <pythreejs.pythreejs.DirectionalLight object at 0x000000002E836630>]
If I want to update this list with a point which has type:
<class 'pythreejs.pythreejs.Mesh'>
I need to execute:
scene.children = list(scene.children) + [point]
Usually, I would execute:
scene.children.append(point)
However, while these two approaches both append point, only the first actually updates the list and produces the expected output (that is, voxels on a grid). Why?
The full code can be found here.
I am guessing your issue is due to children being a property (or other descriptor) rather than a simple attribute of the Scene instance you're interacting with. You can get a list of the children, or assign a new list of children to the attribute, but the lists you're dealing with are not really how the class keeps track of its children internally. If you modify the list you get from scene.children, the modifications are not reflected in the class.
One way to test this would be to save the list from scene.children several times in different variables and see if they are all the same list or not. Try:
a = scene.children
b = scene.children
c = scene.children
print(id(a), id(b), id(c))
I suspect you'll get different ids for each list.
Here's a class that demonstrates the same issue you are seeing:
class Test(object):
    def __init__(self, values=()):
        self._values = list(values)
    @property
    def values(self):
        return list(self._values)
    @values.setter
    def values(self, new_values):
        self._values = list(new_values)
Each time you check the values property, you'll get a new (copied) list.
I don't think there's a fix that is fundamentally different than what you've found to work. You might streamline things a little by using:
scene.children += [point]
Because of how the += operator in Python works, this extends the list and then reassigns it back to scene.children (a += b is equivalent to a = a.__iadd__(b) if the __iadd__ method exists).
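The sketch below exercises a property that returns a copy, to show why `append` is lost while `+=` sticks. It illustrates the hypothesis with a standalone `Test` class, not pythreejs itself.

```python
class Test:
    def __init__(self, values=()):
        self._values = list(values)

    @property
    def values(self):
        # Every read hands out a fresh copy of the internal list.
        return list(self._values)

    @values.setter
    def values(self, new_values):
        self._values = list(new_values)


t = Test([1, 2])

t.values.append(3)           # appends to a temporary copy, then discards it
print(t.values)              # [1, 2]

t.values += [3]              # read copy, extend it, assign it back via the setter
print(t.values)              # [1, 2, 3]

print(t.values is t.values)  # False: each access returns a new list
```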
Per this issue, it turns out this is a traitlets issue. Modifying elements of self.children does not trigger an event notification unless a new list is defined.

Python: How do I pass a string by reference?

From this link: How do I pass a variable by reference?, we know Python will copy a string (an immutable type variable) when it is passed to a function as a parameter, but I think it will waste memory if the string is huge. In many cases, we need to use functions to wrap some operations on strings, so I want to know how to do it more efficiently.
Python does not make copies of objects (this includes strings) passed to functions:
>>> def foo(s):
...     return id(s)
...
>>> x = 'blah'
>>> id(x) == foo(x)
True
If you need to "modify" a string in a function, return the new string and assign it back to the original name:
>>> def bar(s):
...     return s + '!'
...
>>> x = 'blah'
>>> x = bar(x)
>>> x
'blah!'
Unfortunately, this can be very inefficient when making small changes to large strings because the large string gets copied. The pythonic way of dealing with this is to hold strings in a list and join them together once you have all the pieces.
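A minimal sketch of the list-and-join approach. The loop and the names `pieces` and `result` are purely illustrative.

```python
pieces = []
for i in range(5):
    # Collect the fragments; nothing is concatenated yet.
    pieces.append(str(i))

# One join builds the final string in a single pass.
result = ", ".join(pieces)
print(result)  # 0, 1, 2, 3, 4
```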
Python does pass a string by reference. Notice that two identical string literals typically end up referring to the same object (CPython reuses them):
a = 'hello'
b = 'hello'
a is b # True
This is because when b is assigned a value that already exists in memory, it reuses the reference to that string. Note also that if the string is created dynamically, that is, by string operations such as concatenation, the new variable will reference a new instance of the same string:
c = 'hello'
d = 'he'
d += 'llo'
c is d # False
That being said, creating a new string allocates a new string in memory and returns a reference to it, whereas using an existing string reuses the same string instance. Therefore, passing a string as a function parameter will pass it by reference, or in other words, will pass the address in memory of the string.
And now to the point you were looking for: if you change the string inside the function, the string outside of the function will remain the same, and that stems from string immutability. Changing a string means allocating a new string in memory.
a = 'a'
b = a # b will hold a reference to string a
a += 'a'
a is b # False
Bottom line:
You cannot really change a string. The same goes for maybe every other programming language (but don't quote me).
When you pass the string as an argument, you pass a reference. When you change its value, you change the variable to point to another place in memory. But when you change a variable's reference, other variables that point to the same address will naturally keep the old value (reference) they held.
I hope the explanation was clear enough.
In [7]: strs="abcd"
In [8]: id(strs)
Out[8]: 164698208
In [9]: def func(x):
   ...:     print(id(x))
   ...:     x = x.lower()  # perform some operation on string object, it returns a new object
   ...:     print(id(x))
   ...:
In [10]: func(strs)
164698208 # same as strs, i.e it actually passes the same object
164679776 # new object is returned if we perform an operation
# That's why they are called immutable
But operations on strings always return a new string object.
def modify_string( t ):
    the_string = t[0]
    # do stuff
modify_string( ["my very long string"] )
If you want to potentially change the value of something passed in, wrap it in a dict or a list:
This doesn't change s:
def x(s):
    s += 1
This does change s:
def x(s):
    s[0] += 1
This is the only way to "pass by reference".
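Applied to strings, the same idea might look like the sketch below. The function names `rebind` and `mutate` and the variable `box` are hypothetical.

```python
def rebind(s):
    # Builds a new string and binds it to the local name only.
    s += "!"


def mutate(box):
    # Replaces the element inside the caller's list.
    box[0] += "!"


text = "blah"
rebind(text)
print(text)     # blah

box = ["blah"]
mutate(box)
print(box[0])   # blah!
```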
Wrapping the string in a class will make it pass by reference:
class refstr:
    "wrap string in object, so it is passed by reference rather than by value"
    def __init__(self, s=""):
        self.s = s
    def __add__(self, s):
        self.s += s
        return self
    def __str__(self):
        return self.s
def fn(s):
    s += " world"
s = refstr("hello")
fn(s) # s gets modified because objects are passed by reference
print(s) # prints 'hello world'
Just pass it in as you would any other parameter. The contents won't get copied, only the reference will.

Python: make a copy of object when equal old object to new

I've created a new class based on the default str class. I've also changed default methods like __add__, __mul__, __repr__ etc. But I want to change the default behaviour when the user assigns an old variable to a new one. Look what I have now:
a = stream('new stream')
b = a
b += ' was modified'
a == b
>>> True
print a
>>> stream('new stream was modified')
print b
>>> stream('new stream was modified')
So as you can see, each time I modify the second variable, Python also changes the original variable. As I understand it, Python simply gives the address of variable a to variable b. Is it possible to make a copy of the variable on creation, like with the usual str? I think I need something like new in C++.
a = 'new string'
b = a
b += ' was modified'
a == b
>>> False
P.S. Creation of the object begins in the __new__ method. It is done like this:
def __new__(self, string):
    return(str.__new__(self, string))
It is more complicated, because it takes care of unicode and QString types, first getting a str object from them, but I think that's not necessary here.
I don't believe you can change the behavior of the assignment operator, but there are explicit ways to create a copy of an object rather than just using a reference. For a complex object, take a look at the copy module. For a basic sequence type (like str), the following works assuming you're implementing slice properly:
Code
a = str('abc')
#A slice creates a copy of a sequence object.
#[:] creates a copy of the entire thing.
b = a[:]
#Since b is a full copy of a, this will not modify a
b += ' was modified'
#Check the various values
print(a == b)
print(a)
print(b)
Output
False
abc
abc was modified
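Since assignment itself cannot be customised, another option is to make the arithmetic methods return new objects, so that `b += ...` rebinds `b` without touching `a`. The sketch below assumes the question's class keeps mutable state; `Stream` and its `text` attribute are hypothetical stand-ins, not the asker's actual implementation.

```python
class Stream:
    """Hypothetical stand-in for the question's stream class."""

    def __init__(self, text):
        self.text = text

    def __add__(self, other):
        # Build and return a new object instead of mutating self.
        return Stream(self.text + str(other))

    def __eq__(self, other):
        return isinstance(other, Stream) and self.text == other.text

    def __repr__(self):
        return f"stream({self.text!r})"


a = Stream("new stream")
b = a                    # a and b name the same object
b += " was modified"     # no __iadd__ defined, so this is b = b.__add__(...)

print(a)                 # stream('new stream')
print(b)                 # stream('new stream was modified')
print(a == b)            # False
```

Leaving `__iadd__` undefined is the design choice here: Python then falls back to `__add__` and rebinds the name, which matches how the built-in str behaves.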
