How to manage access to a mutable attribute in Python

In Python, we can use the @property decorator to manage access to attributes. For example, if we define the class:
class C:
    def __init__(self, value):
        self._x = value
    @property
    def x(self):
        """I'm the 'x' property."""
        return self._x
we can get the value of x, but not change it:
c = C(1)
#c.x = 4 # <= this would raise an AttributeError: can't set attribute
However, if the attribute is of a mutable type (e.g., a list), we can set a different value for a position of the attribute:
c = C([0,0])
c.x[0] = 1 # <= this works
Is there a way to prevent it? If x is a list, I would like to be able to change the values at positions of x only by using methods of class C.

One way to do this would be to return a copy of the attribute, rather than the list itself.
>>> class C:
...     def __init__(self, value):
...         self._x = value
...     @property
...     def x(self):
...         return self._x[:]
...
>>> c = C([1, 2, 3])
>>> c.x
[1, 2, 3]
>>> c.x.append(5)
>>> c.x
[1, 2, 3]
>>> c.x[0] = 6
>>> c.x
[1, 2, 3]
Alternatively, the property could return an iterator over the attribute, or a view (for example dict.items() instead of a dict). Returning iterators or views may help limit memory use if the attribute is large, and is more consistent with the behaviour of modern Python builtin functions and types.
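As a rough sketch of the "view" idea for a dict attribute, the standard library's types.MappingProxyType gives a read-only, live view without copying. The Inventory class and its add method are made-up names for illustration only:

```python
from types import MappingProxyType


class Inventory:  # hypothetical example class
    def __init__(self, counts):
        # Copy the caller's dict so later changes to it do not leak in.
        self._counts = dict(counts)

    @property
    def counts(self):
        # A read-only, live view of the internal dict (no copy is made).
        return MappingProxyType(self._counts)

    def add(self, name, amount=1):
        # The only supported way to change the counts.
        self._counts[name] = self._counts.get(name, 0) + amount


inv = Inventory({"apple": 1})
view = inv.counts
inv.add("apple")          # the existing view reflects the change
try:
    view["apple"] = 100   # item assignment on the proxy is rejected
except TypeError as exc:
    error = exc
```

Unlike a copy, the view never goes out of sync with the real attribute, but any mutable values stored inside the dict can still be mutated through it.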
If the mutable attribute contains mutable objects itself (for example a list of lists or dictionaries), then it may be necessary to return copies of these objects too. This can be expensive in terms of time and resources if the object graph is deep. See the docs for the copy module for ways to customise how objects are copied.
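To make the shallow-versus-deep difference concrete, here is a small sketch using copy.deepcopy. The Grid class and the two property names are invented for the comparison; in real code you would pick one of them:

```python
import copy


class Grid:  # hypothetical example class
    def __init__(self, rows):
        self._rows = copy.deepcopy(rows)

    @property
    def rows_shallow(self):
        # New outer list, but the inner lists are still the internal ones.
        return self._rows[:]

    @property
    def rows_deep(self):
        # Independent copy of the outer list and of every inner list.
        return copy.deepcopy(self._rows)


g = Grid([[0, 0], [0, 0]])
g.rows_shallow[0][0] = 1   # leaks through: mutates an internal inner list
after_shallow = g.rows_deep
g.rows_deep[1][1] = 9      # only changes the throwaway copy
after_deep = g.rows_deep
```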
This technique is commonly used to prevent the problem of aliasing - where other objects hold references to your object's internal state.
It does mean that the copies may go out of sync with the real attribute, but if your code is well designed then other classes should not be holding onto the values of your class anyway.

Related

What's the difference between the following ways of initializing attributes? [duplicate]

Is there any meaningful distinction between:
class A(object):
    foo = 5  # some default value
vs.
class B(object):
    def __init__(self, foo=5):
        self.foo = foo
If you're creating a lot of instances, is there any difference in performance or space requirements for the two styles? When you read the code, do you consider the meaning of the two styles to be significantly different?

There is a significant semantic difference (beyond performance considerations):
when the attribute is defined on the instance (which is what we usually do), there can be multiple objects referred to. Each gets a totally separate version of that attribute.
when the attribute is defined on the class, there is only one underlying object referred to, so if operations on different instances of that class both attempt to set/(append/extend/insert/etc.) the attribute, then:
if the attribute is an immutable builtin type (like int, float, bool, str), operations on one object will overwrite (clobber) the value
if the attribute is a mutable type (like a list or a dict), we will get unwanted leakage.
For example:
>>> class A: foo = []
>>> a, b = A(), A()
>>> a.foo.append(5)
>>> b.foo
[5]
>>> class A:
...     def __init__(self): self.foo = []
>>> a, b = A(), A()
>>> a.foo.append(5)
>>> b.foo
[]

The difference is that the attribute on the class is shared by all instances. The attribute on an instance is unique to that instance.
If coming from C++, attributes on the class are more like static member variables.

Here is a very good post; I summarize it below.
class Bar(object):
    ## No need for dot syntax
    class_var = 1
    def __init__(self, i_var):
        self.i_var = i_var
## Need dot syntax as we've left scope of class namespace
Bar.class_var
## 1
foo = Bar(2)
## Finds i_var in foo's instance namespace
foo.i_var
## 2
## Doesn't find class_var in instance namespace…
## So looks in class namespace (Bar.__dict__)
foo.class_var
## 1
Class attribute assignment
If a class attribute is set by accessing the class, it will override the value for all instances
foo = Bar(2)
foo.class_var
## 1
Bar.class_var = 2
foo.class_var
## 2
If a class variable is set by accessing an instance, it will override the value only for that instance. This essentially overrides the class variable and turns it into an instance variable available, intuitively, only for that instance.
foo = Bar(2)
foo.class_var
## 1
foo.class_var = 2
foo.class_var
## 2
Bar.class_var
## 1
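One way to watch this shadowing happen is to look at the instance namespace with vars(). This is only an illustrative sketch, using a stripped-down Bar without __init__:

```python
class Bar:
    class_var = 1


foo = Bar()
before = dict(vars(foo))   # instance namespace is empty at first
foo.class_var = 2          # creates an instance attribute that shadows the class one
after = dict(vars(foo))
del foo.class_var          # removing it uncovers the class attribute again
restored = foo.class_var
```

The class attribute itself is never touched by the assignment through the instance, which is why deleting the instance attribute brings the old value back.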
When would you use a class attribute?
Storing constants. As class attributes can be accessed as attributes of the class itself, it's often nice to use them for storing class-wide, class-specific constants:
class Circle(object):
    pi = 3.14159
    def __init__(self, radius):
        self.radius = radius
    def area(self):
        return Circle.pi * self.radius * self.radius
Circle.pi
## 3.14159
c = Circle(10)
c.pi
## 3.14159
c.area()
## 314.159
Defining default values. As a trivial example, we might create a bounded list (i.e., a list that can only hold a certain number of elements or fewer) and choose to have a default cap of 10 items:
class MyClass(object):
    limit = 10
    def __init__(self):
        self.data = []
    def item(self, i):
        return self.data[i]
    def add(self, e):
        if len(self.data) >= self.limit:
            raise Exception("Too many elements")
        self.data.append(e)
MyClass.limit
## 10

Since people in the comments here and in two other questions marked as dups all appear to be confused about this in the same way, I think it's worth adding an additional answer on top of Alex Coventry's.
The fact that Alex is assigning a value of a mutable type, like a list, has nothing to do with whether things are shared or not. We can see this with the id function or the is operator:
>>> class A: foo = object()
>>> a, b = A(), A()
>>> a.foo is b.foo
True
>>> class A:
...     def __init__(self): self.foo = object()
>>> a, b = A(), A()
>>> a.foo is b.foo
False
(If you're wondering why I used object() instead of, say, 5, that's to avoid running into two whole other issues which I don't want to get into here; for two different reasons, entirely separately-created 5s can end up being the same instance of the number 5. But entirely separately-created object()s cannot.)
So, why is it that a.foo.append(5) in Alex's example affects b.foo, but a.foo = 5 in my example doesn't? Well, try a.foo = 5 in Alex's example, and notice that it doesn't affect b.foo there either.
a.foo = 5 is just making a.foo into a name for 5. That doesn't affect b.foo, or any other name for the old value that a.foo used to refer to.* It's a little tricky that we're creating an instance attribute that hides a class attribute,** but once you get that, nothing complicated is happening here.
Hopefully it's now obvious why Alex used a list: the fact that you can mutate a list means it's easier to show that two variables name the same list, and also means it's more important in real-life code to know whether you have two lists or two names for the same list.
* The confusion for people coming from a language like C++ is that in Python, values aren't stored in variables. Values live off in value-land, on their own, variables are just names for values, and assignment just creates a new name for a value. If it helps, think of each Python variable as a shared_ptr<T> instead of a T.
** Some people take advantage of this by using a class attribute as a "default value" for an instance attribute that instances may or may not set. This can be useful in some cases, but it can also be confusing, so be careful with it.
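Putting the two operations side by side may help. A minimal sketch, reusing the class A from the list example:

```python
class A:
    foo = []


a, b = A(), A()
a.foo.append(5)   # mutation: changes the one list that every name refers to
shared = b.foo    # b sees the appended item
a.foo = [1, 2]    # rebinding: a gets its own instance attribute
```

After the rebinding, b.foo and A.foo are still the same list containing 5, while a.foo names a different list.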

There is one more situation: the class or instance attribute is a descriptor.
# -*- encoding: utf-8 -*-
class RevealAccess(object):
    def __init__(self, initval=None, name='var'):
        self.val = initval
        self.name = name
    def __get__(self, obj, objtype):
        return self.val
class Base(object):
    attr_1 = RevealAccess(10, 'var "x"')
    def __init__(self):
        self.attr_2 = RevealAccess(10, 'var "x"')
def main():
    b = Base()
    print("Access to class attribute, return: ", Base.attr_1)
    print("Access to instance attribute, return: ", b.attr_2)
if __name__ == '__main__':
    main()
Above will output:
('Access to class attribute, return: ', 10)
('Access to instance attribute, return: ', <__main__.RevealAccess object at 0x10184eb50>)
An object of the same type gives a different result depending on whether it is stored on the class or on the instance!
I found the explanation in the C definition of PyObject_GenericGetAttr, and in a great post.
Explanation:
If the attribute is found in the dictionary of the classes which make up the object's MRO, then check to see if the attribute being looked up points to a Data Descriptor (which is nothing more than a class implementing both the __get__ and the __set__ methods).
If it does, resolve the attribute lookup by calling the __get__ method of the Data Descriptor (lines 28–33).
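The lookup order can be checked with a small experiment. This is just a sketch; DataDesc and NonDataDesc are made-up names, and the values are written straight into the instance __dict__ to bypass __set__:

```python
class DataDesc:
    """Defines __get__ and __set__, so it is a data descriptor."""

    def __get__(self, obj, objtype=None):
        return "from data descriptor"

    def __set__(self, obj, value):
        raise AttributeError("read-only")


class NonDataDesc:
    """Defines only __get__, so it is a non-data descriptor."""

    def __get__(self, obj, objtype=None):
        return "from non-data descriptor"


class Base:
    d = DataDesc()
    n = NonDataDesc()


b = Base()
# Write directly into the instance dict, bypassing __set__.
b.__dict__["d"] = "instance value"
b.__dict__["n"] = "instance value"
```

Here b.d still goes through the data descriptor on the class, while b.n returns the value from the instance dict, because only data descriptors take precedence over instance attributes.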

How to prevent class field(list) change in Python?

For example, I have a class with a field __x, which is a list:
class C():
    def __init__(self, xx):
        self.__x = xx
    @property
    def x(self):
        return self.__x
    @x.setter
    def x(self, xx):
        raise Exception("Attempt to change an immutable field")
I can prevent changes such as these:
c = C([1,2,3])
c.x = [3,2,1]
But how can I prevent a change such as this?
c.x.append(4)

In the final analysis, you cannot protect your objects from inspection and manipulation.
Also, always ask yourself "from whom, exactly?" when you want to "protect" data.
Sometimes it's just not worth the effort to code around users not reading the documentation.
That being said, you could consider return tuple(self.__x) in the getter.
On the other hand, if __x contains other mutable objects, that would not prevent a user from manipulating those inner objects. (return list(self.__x) would also return a shallow copy of the data, but with less implicit "hey, I'm supposed to be immutable!" signaling.)
Something you should definitely consider is to change self.__x = xx to self.__x = list(xx) in the __init__ method, such that users doing
var = []
c = C(var)
can't "easily" (or by mistake, and again, there could be mutable inner objects) change the state of c by mutating var.
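Combining both suggestions (copy on the way in, tuple on the way out) could look roughly like this. The set_item method is a hypothetical addition, standing in for whatever mutator methods the class wants to offer:

```python
class C:
    def __init__(self, xx):
        self.__x = list(xx)       # copy on the way in

    @property
    def x(self):
        return tuple(self.__x)    # immutable snapshot on the way out

    def set_item(self, index, value):   # hypothetical mutator method
        self.__x[index] = value


var = [1, 2, 3]
c = C(var)
var.append(4)        # does not affect c
c.set_item(0, 10)    # changes go through the class's own method
```

As noted above, this is still a shallow copy, so mutable objects stored inside the list remain reachable.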

The simplest approach would be to accept an iterable on __init__ and turn it to a tuple internally:
class C(object):
    def __init__(self, iterable):
        self._tuple = tuple(iterable)
    @property
    def x(self):
        return self._tuple
    @x.setter
    def x(self, value):
        raise RuntimeError('can\'t reset the x attribute.')
c = C([1, 2, 3])
# c.x = 'a' Will raise 'RuntimeError: can't reset the x attribute.'
print(c.x)
A design like this one makes any object instantiated from the class immutable, so that mutating operations should return new objects instead of changing the state of the current one.
Let's say, for instance, that you want to implement a function that increments each item in self.x by one. With this approach you need to write something similar to:
def increment_by_one(c):
    return C(t+1 for t in c.x)
As there's a cost associated with creating and destroying objects, the trade-offs between this approach (which prevents mutation of the x attribute) and the one suggested by @timgeb should be evaluated for your use case.

Does @property update changed elements in an attribute or calculate it again?

I was wondering if using @property in Python to update an attribute overwrites it or simply updates it, as the speed is very different in the two cases.
And in case it gets overwritten, what alternative can I use? Example:
class sudoku:
    def __init__(self,puzzle):
        self.grid={(i,j):puzzle[i][j] for i in range(9) for j in range(9)}
        self.elements
        self.forbidden=set()
    @property
    def elements(self):
        self.rows=[[self.grid[(i,j)] for j in range(9)] for i in range(9)]
        self.columns=[[self.grid[(i,j)] for i in range(9)] for j in range(9)]
        self.squares={(i,j): [self.grid[(3*i+k,3*j+l)] for k in range(3) for l in range(3)] for i in range(3) for j in range(3) }
        self.stack=[self.grid]
        self.empty={k for k in self.grid.keys() if self.grid[k]==0}
Basically, I work with the grid method, and whenever I need to update the other attributes I call elements. I prefer to call it manually tho. The question, however, is that if I change self.grid[(i,j)], does python calculate each attribute from scratch because self.grid was changed or does it only change the i-th row, j-th column etc?
Thank you
edit: added example code

As is, your question is totally unclear - but anyway, since you don't seem to understand what a property is and how it works...
class Obj(object):
    def __init__(self, x, y):
        self.x = x
        self.y = y
    @property
    def x(self):
        return self._x / 2
    @x.setter
    def x(self, value):
        self._x = value * 2
Here we have a class with a get/set ("binding") property x, backed by a protected attribute _x.
The "@property" syntax here is mainly syntactic sugar; you could actually write this code as
class Obj(object):
    def __init__(self, x, y):
        self.x = x
        self.y = y
    def get_x(self):
        return self._x / 2
    def set_x(self, value):
        self._x = value * 2
    x = property(fget=get_x, fset=set_x)
The only difference with the previous version being that the get_x and set_x functions remain available as methods. Then if we have an obj instance:
obj = Obj(2, 4)
Then
x = obj.x
is just a shortcut for
x = obj.get_x()
and
obj.x = 42
is just a shortcut for
obj.set_x(42)
How this "shortcut" works is fully documented here, with a whole chapter dedicated to the property type.
As you can see there's nothing magical here, and once you get (no pun intended) the descriptor protocol and how the property class uses it, you can answer the question by yourself.
Note that properties will ALWAYS add some overhead (vs plain attributes or direct method calls) since you have more levels of indirection and more method calls, so it's best to only use them when it really makes sense.
EDIT: now you posted your code, I confirm that you don't understand Python's "properties" - not only the technical side of it but even the basic concept of a "computed attribute".
The point of computed attributes in general (the builtin property type being just one generic implementation of them) is to have the interface of a plain attribute (something you can get the value of with value = obj.attrname and possibly set the value of with obj.attrname = somevalue) while actually invoking a getter (and possibly a setter) under the hood.
Your elements "property", while technically implemented as a read-only property, is really a method that initializes half a dozen attributes of your class, doesn't return anything (well, it implicitly returns None), and whose return value is never actually used (of course). This is definitely not what computed attributes are for. This should NOT be a property, it should be a plain method (with some explicit name such as "setup_elements" or whatever makes sense here).
# nb1 : class names should be CamelCased
# nb2 : in Python 2.x, you want to inherit from 'object'
class Sudoku(object):
    def __init__(self,puzzle):
        self.grid={(i,j):puzzle[i][j] for i in range(9) for j in range(9)}
        self.setup_elements()
        self.forbidden=set()
    def setup_elements(self):
        self.rows=[[self.grid[(i,j)] for j in range(9)] for i in range(9)]
        self.columns=[[self.grid[(i,j)] for i in range(9)] for j in range(9)]
        self.squares={(i,j): [self.grid[(3*i+k,3*j+l)] for k in range(3) for l in range(3)] for i in range(3) for j in range(3) }
        self.stack=[self.grid]
        self.empty={k for k, v in self.grid.items() if v==0}
Now to answer your question:
if I change self.grid[(i,j)], does python calculate each attribute from scratch because self.grid was changed
self.grid is a plain attribute, so just rebinding self.grid[(i, j)] doesn't make "python" calculate anything else, of course. None of your object's other attributes will be impacted. Actually Python (the interpreter) has no mind-reading ability and will only do exactly what you asked for, nothing less, nothing more, period.
or does it only change the i-th row, j-th column
This :
obj = Sudoku(some_puzzle)
obj.grid[(1, 1)] = "WTF did you expect ?"
will NOT (I repeat: "NOT") do anything else than assigning the literal string "WTF did you expect ?" to obj.grid[(1, 1)]. None of the other attributes will be updated in any way.
Now if your question was: "if I change something to self.grid and call self.setup_elements() after, will Python recompute all attributes or only update self.rows[xxx] and self.columns[yyy]", then the answer is plain simple: Python will do exactly what you asked for: it will execute self.setup_elements(), line after line, statement after statement. Plain and simple. No magic here, and the only thing you'll get from making it a property instead of a plain method is that you won't have to type the () after to invoke the method.
So if what you expected from making this elements() method a property was to have some impossible magic happening behind the scenes to detect that you actually only wanted to recompute impacted elements, then bad news: this is not going to happen, and you will have to explicitly tell the interpreter how to do so. Computed attributes might be part of the solution here, but not by any magic. You will have to write all the code needed to intercept assignments to any of those attributes and recompute what needs to be recomputed.
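For what it's worth, "telling the interpreter explicitly" can be as simple as routing every change through one method that updates only the derived entries that depend on the cell. The sketch below is a reduced stand-in, not the full Sudoku class; Grid and set_cell are made-up names, and the squares and stack attributes are left out:

```python
class Grid:  # hypothetical, reduced stand-in for the Sudoku class
    def __init__(self, puzzle):
        size = len(puzzle)
        self.grid = {(i, j): puzzle[i][j] for i in range(size) for j in range(size)}
        self.rows = [[self.grid[(i, j)] for j in range(size)] for i in range(size)]
        self.columns = [[self.grid[(i, j)] for i in range(size)] for j in range(size)]
        self.empty = {k for k, v in self.grid.items() if v == 0}

    def set_cell(self, i, j, value):
        """Update one cell and only the derived entries that depend on it."""
        self.grid[(i, j)] = value
        self.rows[i][j] = value
        self.columns[j][i] = value
        if value == 0:
            self.empty.add((i, j))
        else:
            self.empty.discard((i, j))


g = Grid([[0, 2], [3, 0]])
g.set_cell(0, 0, 1)
```

This only stays consistent as long as callers use set_cell and never write to g.grid directly.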
Beware, since all those attributes are mutable containers, just wrapping each of them into properties won't be enough - consider this:
class Foo(object):
    def __init__(self):
        self._bar = {"a":1, "b": 2}
    @property
    def bar(self):
        print("getting self._bar")
        return self._bar
    @bar.setter
    def bar(self, value):
        print("setting self._bar to {}".format(value))
        self._bar = value
>>> f = Foo()
>>> f.bar
getting self._bar
{'a': 1, 'b': 2}
>>> f.bar['z'] = "WTF ?"
getting self._bar
>>> f.bar
getting self._bar
{'a': 1, 'b': 2, 'z': 'WTF ?'}
>>> bar = f.bar
getting self._bar
>>> bar
{'a': 1, 'b': 2, 'z': 'WTF ?'}
>>> bar["a"] = 99
>>> f.bar
getting self._bar
{'a': 99, 'b': 2, 'z': 'WTF ?'}
As you can see, we could mutate self._bar without the bar.setter function ever being invoked, because f.bar["x"] = "y" is actually NOT assigning to f.bar (which would need f.bar = "something else") but getting the f._bar dict through the Foo.bar getter, then invoking __setitem__() on this dict.
So if you want to intercept something like f.bar["x"] = "y", you will also have to write some dict-like object that intercepts all mutator access on the dict itself (__setitem__, but also __delitem__ etc.) and notifies f of those changes, and change your property so that it returns an instance of this dict-like object instead.
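A very reduced sketch of such a dict-like object follows. NotifyingDict, its on_change callback and the changes list are all invented for the illustration. It only covers __setitem__ and __delitem__; a plain dict subclass does not route update(), pop(), setdefault() and friends through these, so a complete version would have to override them too (or build on collections.abc.MutableMapping):

```python
class NotifyingDict(dict):
    """Hypothetical dict subclass that reports item changes to a callback."""

    def __init__(self, data, on_change):
        super().__init__(data)
        self._on_change = on_change

    def __setitem__(self, key, value):
        super().__setitem__(key, value)
        self._on_change("set", key)

    def __delitem__(self, key):
        super().__delitem__(key)
        self._on_change("del", key)


class Foo:
    def __init__(self):
        self.changes = []
        self._bar = NotifyingDict({"a": 1, "b": 2}, self._bar_changed)

    def _bar_changed(self, action, key):
        # This is where derived state would be recomputed.
        self.changes.append((action, key))

    @property
    def bar(self):
        return self._bar


f = Foo()
f.bar["z"] = 3
del f.bar["a"]
```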

Specifying a data attribute in multiple instances of the same class in Python

I have already seen this post and even though the symptoms are similar, the way I am defining my class is different as I am using __init__:
>>> class foo(object):
...     def __init__(self,x):
...         self.x = x
...
>>>
I next define an instance of this class:
>>> inst1 = foo(10)
>>> inst1.x
10
Now, I would like to copy the same instance into a new variable and then change the value of x:
>>> inst2 = inst1
>>> inst2.x = 20
>>> inst2.x
20
It seems, however, (like a class-level attribute) all data attributes are shared between inst1 and inst2 since changing the value of x for inst2 will also change that for inst1:
>>> inst1.x
20
I do know that an alternative method is to say:
>>> inst2 = foo(20)
However, I don't like to do this because my actual class takes a lot of input arguments, out of which I need to change only one or two specific data attribute(s) when creating different instances (i.e., the rest of the input arguments remain the same for all instances). Any suggestions are greatly appreciated!

You are not copying the class instance (the object). The following line
>>> inst2 = inst1
copies a reference to the object that inst1 names. It does not copy the object.
An easy way to confirm this is to look at the result of the builtin id() function, which returns an identifier that is unique among objects alive at the same time (in CPython, the object's memory address). The value for both inst2 and inst1 should be the same in this case.
And an easy way to solve it is to use Joran's answer.
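The check described above, as a short sketch (the class is renamed Foo here and inst3 is added for contrast):

```python
class Foo:
    def __init__(self, x):
        self.x = x


inst1 = Foo(10)
inst2 = inst1    # a second name for the same object
inst3 = Foo(10)  # a genuinely separate object
inst2.x = 20     # visible through inst1 as well, not through inst3
same_id = id(inst1) == id(inst2)
```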

class foo(object):
    def __init__(self,x):
        self.x = x
    def copy(self):
        return foo(self.x)
foo1 = foo(10)
foo2 = foo1.copy()
is a pretty safe way to implement it.
There is also the standard library's copy module:
from copy import deepcopy
foo2 = deepcopy(foo1)
If you define a __copy__ method, copy.copy will use your own __copy__:
class foo2:
    def __init__(self,val):
        self.state = 0
        self.val = val
    def __copy__(self):
        newfoo = foo2(self.val)
        newfoo.state = self.state
        print("Copied self:", self)
        return newfoo
from copy import copy
f = foo2(5)
f2 = copy(f)  # will use the __copy__ we defined
This makes your code somewhat more generic, allowing you to just use copy.copy on all objects without worrying about what the object is.
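For the case in the question (many constructor arguments, only one or two differing), copying and then changing the attributes that differ might look like this. Config and its fields are made-up names for the sketch:

```python
import copy


class Config:  # hypothetical class with many constructor arguments
    def __init__(self, host, port, timeout, retries):
        self.host = host
        self.port = port
        self.timeout = timeout
        self.retries = retries


base = Config("localhost", 8080, timeout=30, retries=3)
variant = copy.copy(base)   # shallow copy: new object, same attribute values
variant.port = 9090         # change only what differs
```

Since copy.copy is shallow, any mutable attribute values are shared between base and variant; use copy.deepcopy if they need to be independent.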
