Python - changing object attributes initialised from dictionary affects the original dictionary?

I have a class with attributes initialised based on a user-defined dictionary (read in using JSON):
import json
class Knight(object):
    def __init__(self, traits):
        for k, v in traits.items():
            self.__setattr__(k, v)
traitfile = json.load(open(input(), 'r'))
# Where the input file is e.g.
# {'helmet': 'horned',
#  'sword': 'big',
#  'words': ['Ni!', 'Peng', 'Neee-Wom!']}
When I instantiate the object, helmet, sword, and words become attributes as expected. But if I then change an instance attribute, it seems like it affects the original dictionary from which the object was initialised in the first place:
tall_knight = Knight(traitfile)
print(tall_knight.words) # prints ['Ni!', 'Peng', 'Neee-Wom!']
print(traitfile['words']) # also prints ['Ni!', 'Peng', 'Neee-Wom!']
tall_knight.words.append('Ekke ekke!')
print(tall_knight.words) # prints ['Ni!', 'Peng', 'Neee-Wom!', 'Ekke ekke!'] as expected
print(traitfile['words']) # also prints ['Ni!', 'Peng', 'Neee-Wom!', 'Ekke ekke!'] NOT EXPECTED
I did not expect that changing the object's attribute would affect the dictionary it was initialised from. I thought the whole point of instantiation is that the instance is, well, its own instance! What is going on here?! (And how can I stop it?)

Your problem is that traitfile['words'] is a list, and when you copy it to tall_knight.words, you are copying a reference to the list, not the values in it. So when you modify the list in tall_knight, you also modify the value in traitfile['words']. You can work around this by making a copy of the value in the object using copy.copy (or copy.deepcopy if the values may be nested):
import copy
class Knight(object):
    def __init__(self, traits):
        for k, v in traits.items():
            self.__setattr__(k, copy.copy(v))
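A minimal sketch of the fix above, assuming the traits come from an in-memory dictionary instead of a JSON file (the `traits` literal is only illustrative data). It checks that appending to the instance's list no longer touches the source dictionary.

```python
import copy

class Knight(object):
    def __init__(self, traits):
        for k, v in traits.items():
            # Store a shallow copy so the instance gets its own list object.
            setattr(self, k, copy.copy(v))

# Illustrative data standing in for the parsed JSON file.
traits = {'helmet': 'horned', 'sword': 'big', 'words': ['Ni!', 'Peng']}
tall_knight = Knight(traits)
tall_knight.words.append('Ekke ekke!')

print(tall_knight.words)  # ['Ni!', 'Peng', 'Ekke ekke!']
print(traits['words'])    # ['Ni!', 'Peng']
```

A shallow copy is enough here because the list holds only strings; lists of lists or dictionaries would need copy.deepcopy.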

A list is a mutable object in Python, so assigning it to another name (or attribute) does not create a new list; both names refer to the same object behind the scenes. To get an independent list you need to copy it, for example with list.copy(); changes to the copy are then not reflected in the original. The same sharing happens when a dictionary is assigned to a second name:
first_list = {"a":1, "b":[2,3,4]}
second_list = first_list
second_list["b"].append(34)
print("first one: ", first_list)
print("second one: ", second_list)
Output:
first one: {'a': 1, 'b': [2, 3, 4, 34]}
second one: {'a': 1, 'b': [2, 3, 4, 34]}
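One way to avoid changing the original is to copy it: second_list = first_list.copy(). Note that this is a shallow copy, so a nested list such as first_list["b"] is still shared between the two dictionaries; use copy.deepcopy when the values can themselves be mutable.
In your case, you likewise need to copy each value before storing it on the object: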
import copy
import json
class Knight(object):
    def __init__(self, traits):
        for k, v in traits.items():
            self.__setattr__(k, copy.deepcopy(v))
traitfile = json.load(open(input(), 'r'))
Here is the link for reference: Mutable and Immutable datatypes
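To make the difference between a shallow and a deep copy concrete, here is a small sketch using the same dictionary shape as the example above. The names `original`, `shallow` and `deep` are chosen for illustration only.

```python
import copy

original = {"a": 1, "b": [2, 3, 4]}

shallow = original.copy()        # new dict, but "b" is the same list object
deep = copy.deepcopy(original)   # new dict and a new list for "b"

shallow["b"].append(34)          # also visible through original["b"]
deep["b"].append(99)             # affects only the deep copy

print(original)  # {'a': 1, 'b': [2, 3, 4, 34]}
print(shallow)   # {'a': 1, 'b': [2, 3, 4, 34]}
print(deep)      # {'a': 1, 'b': [2, 3, 4, 99]}
```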

Related

How to make a class method in place?

Is there a way to make a method in a Python class modify its data in place, specifically for lists?
For example, I want to write a method that behaves like list.append() by modifying the original list instead of returning a new one.
I have already
class newlist(list):
    def add(self, otherlist):
        self = self+otherlist
A method written like that does not seem to modify the variable it is called on.
Obviously, I could add return self at the end, but then it would have to be called as mylist = mylist.add(stuff) to actually modify mylist. How do I write the method so that it modifies mylist when called with just mylist.add(stuff)?
Since newlist is a subclass of list it already has a method that does exactly what you want: extend. You don't have to write any code in newlist at all.
But if you really want to reinvent the wheel you can call extend within your new add method and get the same result:
class newlist(list):
    def add(self, otherlist):
        self.extend(otherlist)
mylist = newlist()
mylist.append(0)
mylist.extend([1,2,3])
print(mylist)
mylist = newlist()
mylist.append(0)
mylist.add([1,2,3])
print(mylist)
[0, 1, 2, 3]
[0, 1, 2, 3]
Plain assignment to self only rebinds the local name: self is bound to the new object, and the method has lost its reference to the original object.
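The effect of rebinding can be checked directly. In this sketch the method names `add_rebind` and `add_inplace` are made up for the comparison; the first mirrors the question's code, the second mutates the object.

```python
class newlist(list):
    def add_rebind(self, otherlist):
        # Builds a new list and rebinds the local name only;
        # the object the caller holds is left unchanged.
        self = self + otherlist

    def add_inplace(self, otherlist):
        # Mutates the very object the caller refers to.
        self.extend(otherlist)

mylist = newlist([0])
mylist.add_rebind([1, 2, 3])
after_rebind = list(mylist)
print(after_rebind)   # [0]

mylist.add_inplace([1, 2, 3])
after_inplace = list(mylist)
print(after_inplace)  # [0, 1, 2, 3]
```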
The easiest approach here is to use list's existing overload for augmented assignment, +=:
class newlist(list):
    def add(self, otherlist):
        self += otherlist
That mutates self in place, rather than making a new object and reassigning it (it works because list is a mutable type; it wouldn't work for an immutable type without an overload for +=). You could also implement it as self.extend(otherlist), or for extra cleverness, don't even bother to write a Python level implementation at all and just alias the existing list method:
class newlist(list):
    add = list.__iadd__ # Or add = list.extend
Since the definition of add is basically identical to existing += or list.extend behavior, just under a new name, aliasing concisely gets the performance of the built-in function; the only downside is that introspection (print(newlist.add)) will not indicate that the function's name is add (because it's __iadd__ or extend; aliasing doesn't change the function metadata).
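A short sketch of the aliasing approach, using the list.extend variant, shows both that it works in place and what the introspection caveat looks like in practice.

```python
class newlist(list):
    add = list.extend  # alias the built-in method under a new name

mylist = newlist([0])
mylist.add([1, 2, 3])

print(mylist)                # [0, 1, 2, 3]
print(newlist.add.__name__)  # extend
```

The reported name is that of the built-in, because the alias is the same function object stored under a second attribute.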
Try using the in-place addition += for your function:
class newlist(list):
    def add(self, other):
        self += other
a = newlist([1,2])
b = newlist([3,4])
a.add(b)
a
# returns:
[1, 2, 3, 4]

Python call by reference issue

As far as I understand Python, when you pass a variable as a function argument, the parameter is already a reference to the original variable. In my implementation, when I assign to a variable that I passed into the function, the caller still sees an empty list.
This is my code:
#on the main -------------
temp_obj = []
obj = [
    {'name':'a1', 'level':0},
    {'name':'a2', 'level':0},
    {'name':'a3', 'level':1},
    {'name':'a4', 'level':1},
    {'name':'a5', 'level':2},
    {'name':'a6', 'level':2},
]
the_result = myFunction(obj, temp_obj)
print(temp_obj)
#above print would result to an empty list
#this is my problem
#end of main body -------------
def myFunction(obj, new_temp_obj):
    inside_list = []
    for x in obj[:]:
        if x['level'] == 0:
            inside_list.append(x)
            obj.remove(x) #removing the element that was added to the inside_list
    new_temp_obj = obj[:] #copying the remaining element
    print(new_temp_obj)
    # the above print would result to
    #[{'name': 'a3', 'level': 1}, {'name': 'a4', 'level': 1}, {'name': 'a5', 'level': 2}, {'name': 'a6', 'level': 2}]
    return inside_list
Am I missing something or did I misunderstand the idea of python call by reference?
Python is not pass-by-reference. It is pass-by-object.
Consider the following two functions:
def f(mylist):
    mylist = []
def g(mylist):
    mylist.append(1)
Now let's say I call them.
mylist = [1]
f(mylist)
print(mylist)
mylist = [1] # reset the list
g(mylist)
print(mylist)
What would the output be?
If Python were pass-by-value, the functions would take a copy of the list, so modifying it would not affect the original list once you return out of the function. So in both cases, you would be printing the original list, [1].
If Python were pass-by-reference, the functions would accept a reference to the caller's variable, so assigning to the parameter would rebind the caller's variable as well; the first output would be [] and the second, [1, 1].
If you run this example, you will find that the first output is [1] (the list is unaffected) and the second output is [1, 1] (the list is affected).
O_O
When you do new_temp_obj = obj[:], Python is constructing a new object obj[:] and giving it the name new_temp_obj.
If you were to append, Python would look for the thing called new_temp_obj and add elements to it. The argument you passed in tells it where to look for the list.
You are creating a totally new object at a totally new location in memory and simply giving it the same name, new_temp_obj.
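The difference between giving a name a new object and changing the object it names can be observed with id(). This is a sketch only; the helper names `rebind` and `mutate` are hypothetical.

```python
def rebind(param):
    """Assign a new list to the local name; report whether identity was kept."""
    before = id(param)
    param = param[:]            # new list object, bound only to the local name
    return id(param) == before

def mutate(param):
    """Replace the contents in place; report whether identity was kept."""
    before = id(param)
    param[:] = [9]              # same list object, new contents
    return id(param) == before

data = [1]
kept_after_rebind = rebind(data)
snapshot_after_rebind = list(data)
kept_after_mutate = mutate(data)

print(kept_after_rebind, snapshot_after_rebind)  # False [1]
print(kept_after_mutate, data)                   # True [9]
```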
new_temp_obj = obj[:] #copying the remaining element
This makes new_temp_obj refer to a new list object. You can use id() to see that its identity changes with this assignment.
Change it to:
new_temp_obj[:] = obj[:]
You'll see your expected result.
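Applied to the function from the question, the slice-assignment fix might look like the sketch below. The function is renamed `my_function` and the input is shortened purely for illustration.

```python
def my_function(obj, new_temp_obj):
    inside_list = []
    for x in obj[:]:              # iterate over a copy while removing
        if x['level'] == 0:
            inside_list.append(x)
            obj.remove(x)
    new_temp_obj[:] = obj         # fill the caller's list in place
    return inside_list

temp_obj = []
obj = [
    {'name': 'a1', 'level': 0},
    {'name': 'a3', 'level': 1},
    {'name': 'a5', 'level': 2},
]
the_result = my_function(obj, temp_obj)

print(temp_obj)    # [{'name': 'a3', 'level': 1}, {'name': 'a5', 'level': 2}]
print(the_result)  # [{'name': 'a1', 'level': 0}]
```

Returning both lists from the function would avoid the output parameter entirely, which is usually easier to follow.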

How to make live connection between two objects

I declare two variables. The first is a dictionary. The second is a list (it is the output of dictionary's '.values()' method).
dictVar={'one':1,'two':2,'three':3}
listVar=dictVar.values()
At this point the content of listVar accurately represents every value stored in dictionary dictVar
Later somewhere down the code the dictionary is updated with a new value:
dictVar['four']=4
Now the content of listVar is "outdated". It does not represent every value stored in dictionary.
In order to keep list updated I have to manually append a new value such as:
dictVar['four']=4
listVar.append(4)
I wonder if there is a way to establish a "live" update between the list variable and dictionary. So every time dictionary is changed the list is updated too.
Use a dictionary view object (dict.viewvalues() in Python 2.7; in Python 3, dict.values() already returns such a view):
>>> dictVar={'one':1,'two':2,'three':3}
>>> listVar=dictVar.viewvalues()
>>> listVar
dict_values([3, 2, 1])
>>> dictVar['one']=100
>>> listVar
dict_values([3, 2, 100])
>>> dictVar['four']=4
>>> listVar
dict_values([4, 3, 2, 100])
>>> list(listVar)==dictVar.values()
True
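The session above is Python 2.7. As a sketch of the Python 3 equivalent, where dict.values() itself is the live view and dictionaries keep insertion order (Python 3.7+):

```python
dictVar = {'one': 1, 'two': 2, 'three': 3}
listVar = dictVar.values()     # a live view in Python 3, not a list

dictVar['four'] = 4
after_insert = list(listVar)
print(after_insert)            # [1, 2, 3, 4]

dictVar['one'] = 100
after_update = list(listVar)
print(after_update)            # [100, 2, 3, 4]
```

A view has no append method, so it stays read-only; call list(listVar) when an independent snapshot is needed.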
Something you could do would be to create a custom class that acts as a wrapper for the dictionary. Whenever you call obj[key] = val, you're implicitly calling that object's __setitem__(self, key, val) method. When you create a custom class, you can override this method to do what you like with it (namely, update an associated list).
Here's a sample class wrapper:
class EnhancedDict(object):
    def __init__(self): # The constructor
        self.dictVar = {} # Your dictionary
        self.listVar = [] # Your list
    def __getitem__(self, key): # Equivalent to obj[key]
        return self.dictVar[key]
    def __setitem__(self, key, val): # Equivalent to obj[key] = val
        self.dictVar[key] = val
        self.listVar.append(val)
Then the list is automatically updated whenever you add a new item to the dictionary, which you can do easily:
>>> dict_obj = EnhancedDict()
>>> dict_obj["foo"] = "bar" # Automatically updates both the list and the dict
>>> dict_obj["foo"]
'bar'
>>> dict_obj.dictVar
{'foo': 'bar'}
>>> dict_obj.listVar
['bar']
There's also a __delitem__ function you can override to complete the functionality of the class. Lots more information can be found in the docs:
https://docs.python.org/2/reference/datamodel.html
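One possible way to round out the wrapper with __delitem__ is sketched below. It also handles reassignment of an existing key, which the simpler version would record twice. This is an illustration under those assumptions, not the answer's own code.

```python
class EnhancedDict(object):
    def __init__(self):
        self.dictVar = {}
        self.listVar = []

    def __getitem__(self, key):
        return self.dictVar[key]

    def __setitem__(self, key, val):
        if key in self.dictVar:
            # Replacing an existing key: drop its old value from the list first.
            self.listVar.remove(self.dictVar[key])
        self.dictVar[key] = val
        self.listVar.append(val)

    def __delitem__(self, key):
        val = self.dictVar.pop(key)   # raises KeyError if the key is missing
        self.listVar.remove(val)      # removes the first equal value

d = EnhancedDict()
d['foo'] = 'bar'
d['spam'] = 'eggs'
d['foo'] = 'baz'
del d['spam']

print(d.dictVar)  # {'foo': 'baz'}
print(d.listVar)  # ['baz']
```

Because list.remove drops the first equal element, the list order can drift from the dictionary order when several keys share the same value.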

__getitem__ for a list vs a dict

The Dictionary __getitem__ method does not seem to work the same way as it does for List, and it is causing me headaches. Here is what I mean:
If I subclass list, I can overload __getitem__ as:
class myList(list):
    def __getitem__(self,index):
        if isinstance(index,int):
            pass #do one thing
        if isinstance(index,slice):
            pass #do another thing
If I subclass dict, however, the __getitem__ does not expose index, but key instead as in:
class myDict(dict):
    def __getitem__(self,key):
        pass #Here I want to inspect the INDEX, but only have access to key!
So, my question is how can I intercept the index of a dict, instead of just the key?
Example use case:
a = myDict()
a['scalar'] = 1 # Create dictionary entry called 'scalar', and assign 1
a['vector_1'] = [1,2,3,4,5] # I want all subsequent vectors to be 5 long
a['vector_2'][[0,1,2]] = [1,2,3] # I want to intercept this and force vector_2 to be 5 long
print(a['vector_2'])
[1,2,3,0,0]
a['test'] # This should throw a KeyError
a['test'][[0,2,3]] # So should this
Dictionaries have no order; there is no index to pass in; this is why Python can use the same syntax ([..]) and the same magic method (__getitem__) for both lists and dictionaries.
When you index a dictionary on an integer like 0, the dictionary treats that like any other key:
>>> d = {'foo': 'bar', 0: 42}
>>> d.keys()
[0, 'foo']
>>> d[0]
42
>>> d['foo']
'bar'
Chained indexing applies to return values; the expression:
a['vector_2'][0, 1, 2]
is executed as:
_result = a['vector_2'] # via a.__getitem__('vector_2')
_result[0, 1, 2] # via _result.__getitem__((0, 1, 2))
so if you want values in your dictionary to behave in a certain way, you must return objects that support those operations.
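As a sketch of "returning objects that support those operations", the dictionary can simply store values of a list subclass that understands a tuple of positions. The class name `Vector` is hypothetical and only read access is shown.

```python
class Vector(list):
    """Hypothetical list subclass that also accepts a tuple of positions."""
    def __getitem__(self, index):
        if isinstance(index, tuple):
            # a[0, 1, 2] arrives here as the tuple (0, 1, 2)
            return [list.__getitem__(self, i) for i in index]
        return list.__getitem__(self, index)

d = {'vector_2': Vector([10, 20, 30, 40, 50])}

picked = d['vector_2'][0, 1, 2]   # dict lookup first, then Vector.__getitem__
single = d['vector_2'][4]

print(picked)  # [10, 20, 30]
print(single)  # 50
```

The dictionary itself needs no special __getitem__; the second pair of brackets is handled entirely by the object it returns.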

Delete all objects in a list

I create many objects and store them in a list. I want to delete them after some time, because I create new ones and don't want memory usage to climb (in my case it jumps to 20 GB of RAM if I don't delete them).
Here is a little code to illustrate what I am trying to do:
class test:
    def __init__(self):
        self.a = "Hello World"
    def kill(self):
        del self
a = test()
b = test()
c = [a,b]
print("1)Before:",a,b)
for i in c:
    del i
for i in c:
    i.kill()
print("2)After:",a,b)
a and b are my objects; c is a list of these two objects. I'm trying to delete them for good with a for loop over c: once with del and once with a method. It doesn't seem to work, because the print still shows the objects.
I need this because I create 100,000 objects many times. The first time I create 100k objects, the second time another 100k, but I don't need to keep the previous 100k. If I don't delete them, memory usage climbs very quickly.
tl;dr;
mylist.clear() # Added in Python 3.3
del mylist[:]
are probably the best ways to do this. The rest of this answer tries to explain why some of your other efforts didn't work.
CPython, at least, uses reference counting to determine when objects are deleted. Here you have multiple references to the same objects: a refers to the same object that c[0] references. When you loop over c (for i in c:), at some point i also refers to that same object. The del keyword removes a single reference, so:
for i in c:
    del i
creates a reference to an object in c and then deletes that reference -- but the object still has other references (one stored in c for example) so it will persist.
In the same way:
def kill(self):
    del self
only deletes a reference to the object in that method. One way to remove all the references from a list is to use slice assignment:
mylist = list(range(10000))
mylist[:] = []
print(mylist)
Apparently you can also delete the slice to remove objects in place:
del mylist[:] #This will implicitly call the `__delslice__` or `__delitem__` method.
This empties the list object itself, so every other name that refers to the same list sees it empty as well. Compare that to simply deleting the name, e.g.
mylist = list(range(10000))
b = mylist
del mylist
#here we didn't get all the references to the objects we created ...
print(b) #[0, 1, 2, 3, 4, ...]
Finally, more recent python revisions have added a clear method which does the same thing that del mylist[:] does.
mylist = [1, 2, 3]
mylist.clear()
print(mylist)
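The reference-counting behaviour described in this answer can be observed with a weak reference, which watches an object without keeping it alive. This is a sketch that assumes CPython; other implementations may free the object later, which is why gc.collect() is called.

```python
import gc
import weakref

class Test:
    def __init__(self):
        self.a = "Hello World"

a = Test()
c = [a]
probe = weakref.ref(a)   # does not keep the object alive

del a                    # removes one name; the list still refers to the object
alive_after_del = probe() is not None

c.clear()                # drops the last strong reference
gc.collect()             # only matters on implementations without refcounting
alive_after_clear = probe() is not None

print(alive_after_del)    # True
print(alive_after_clear)  # False
```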
Here's how you delete every item from a list.
del c[:]
Here's how you delete the first two items from a list.
del c[:2]
Here's how you delete a single item from a list (a in your case), assuming c is a list.
del c[0]
If the goal is to delete the objects a and b themselves (which appears to be the case), forming the list [a, b] is not helpful. Instead, one should keep a list of strings used as the names of those objects. These allow one to delete the objects in a loop, by accessing the globals() dictionary.
c = ['a', 'b']
# create and work with a and b
for i in c:
    del globals()[i]
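To drop all objects in a list, you can rebind the name to a new empty list, e.g. a = []. Note that this only frees the objects if nothing else still refers to the old list.
Here is an example: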
>>> a = [1, 2, 3]
>>> a
[1, 2, 3]
>>> a = []
>>> a
[]
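The difference between rebinding a name and emptying the list in place shows up as soon as a second name refers to the same list. A minimal sketch, where `alias` is an illustrative name:

```python
a = [1, 2, 3]
alias = a

a = []                          # rebinds the name a; the original list is untouched
after_rebind = list(alias)
print(after_rebind)             # [1, 2, 3]

alias.clear()                   # empties the original list object in place
print(alias)                    # []
```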
