How should I declare default values for instance variables in Python?


Should I give my class members default values like this:
class Foo:
    num = 1
or like this?
class Foo:
    def __init__(self):
        self.num = 1
In this question I discovered that in both cases,
bar = Foo()
bar.num += 1
is a well-defined operation.
I understand that the first method will give me a class variable while the second one will not. However, if I do not require a class variable, but only need to set a default value for my instance variables, are both methods equally good? Or is one of them more 'pythonic' than the other?
One thing I've noticed is that in the Django tutorial, they use the second method to declare Models. Personally I think the second method is more elegant, but I'd like to know what the 'standard' way is.

Extending bp's answer, I wanted to show you what he meant by immutable types.
First, this is okay:
>>> class TestB():
...     def __init__(self, attr=1):
...         self.attr = attr
...
>>> a = TestB()
>>> b = TestB()
>>> a.attr = 2
>>> a.attr
2
>>> b.attr
1
However, this only works for immutable (unchangeable) types. If the default value were mutable (meaning it can be modified in place), this would happen instead:
>>> class Test():
...     def __init__(self, attr=[]):
...         self.attr = attr
...
>>> a = Test()
>>> b = Test()
>>> a.attr.append(1)
>>> a.attr
[1]
>>> b.attr
[1]
>>>
Note that a and b share the same list object. This is often unwanted.
This is the Pythonic way of defining default values for instance variables when the type is mutable:
>>> class TestC():
...     def __init__(self, attr=None):
...         if attr is None:
...             attr = []
...         self.attr = attr
...
>>> a = TestC()
>>> b = TestC()
>>> a.attr.append(1)
>>> a.attr
[1]
>>> b.attr
[]
The reason my first snippet of code works is that an immutable object can never be changed in place: whenever you need a different value, you get a different object. If you add 1 to 1, you get a separate object 2, because the old 1 cannot be changed. Immutability is mostly there for hashing, I believe.
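The difference between the two constructors can be checked directly. The following is a minimal sketch, relying on the fact that Python evaluates default argument values once, when the def statement runs. The class names Shared and Fresh are made up for this illustration.

```python
class Shared:
    def __init__(self, attr=[]):      # the list is built once, at definition time
        self.attr = attr


class Fresh:
    def __init__(self, attr=None):    # None is only a marker for "not given"
        self.attr = [] if attr is None else attr


a, b = Shared(), Shared()
a.attr.append(1)
print(a.attr is b.attr)                 # True: one list, two names
print(Shared.__init__.__defaults__)     # ([1],): the stored default itself changed

c, d = Fresh(), Fresh()
c.attr.append(1)
print(c.attr, d.attr)                   # [1] []
```

Looking at `__defaults__` shows that the "default" is a single stored object, which is why mutating it through one instance is visible everywhere.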

The two snippets do different things, so it's not a matter of taste but a matter of what's the right behaviour in your context. Python documentation explains the difference, but here are some examples:
Exhibit A
class Foo:
    def __init__(self):
        self.num = 1
This binds num to each Foo instance. A change to this field is not propagated to other instances.
Thus:
>>> foo1 = Foo()
>>> foo2 = Foo()
>>> foo1.num = 2
>>> foo2.num
1
Exhibit B
class Bar:
    num = 1
This binds num to the Bar class. Changes are propagated!
>>> bar1 = Bar()
>>> bar2 = Bar()
>>> bar1.num = 2 #this creates an INSTANCE variable that HIDES the propagation
>>> bar2.num
1
>>> Bar.num = 3
>>> bar2.num
3
>>> bar1.num
2
>>> bar1.__class__.num
3
Actual answer
If I do not require a class variable, but only need to set a default value for my instance variables, are both methods equally good? Or one of them more 'pythonic' than the other?
The code in Exhibit B is plain wrong for this: why would you bind the default to the class when what you want is a value that belongs to each single instance?
The code in Exhibit A is okay.
If you want to give defaults for instance variables in your constructor, I would however do this:
class Foo:
    def __init__(self, num=None):
        self.num = num if num is not None else 1
...or even:
class Foo:
    DEFAULT_NUM = 1
    def __init__(self, num=None):
        # a class attribute is not visible as a bare name inside a method
        self.num = num if num is not None else self.DEFAULT_NUM
...or even (preferable, but if and only if you are dealing with immutable types!):
class Foo:
    def __init__(self, num=1):
        self.num = num
This way you can do:
foo1 = Foo(4)
foo2 = Foo()  # use default

Using class members to give default values works very well just so long as you are careful only to do it with immutable values. If you try to do it with a list or a dict that would be pretty deadly. It also works where the instance attribute is a reference to a class just so long as the default value is None.
I've seen this technique used very successfully in repoze which is a framework that runs on top of Zope. The advantage here is not just that when your class is persisted to the database only the non-default attributes need to be saved, but also when you need to add a new field into the schema all the existing objects see the new field with its default value without any need to actually change the stored data.
I find it also works well in more general coding, but it's a style thing. Use whatever you are happiest with.
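To make the persistence argument concrete, here is a small sketch. It does not use repoze or Zope at all; the Document class and its fields are hypothetical, and vars() simply stands in for "what would be stored for this object".

```python
class Document:
    title = "untitled"    # class-level defaults (immutable values only)
    pages = 0


doc = Document()
doc.title = "Report"      # creates an instance attribute on doc
print(vars(doc))          # {'title': 'Report'}
print(doc.pages)          # 0, found on the class

Document.author = "unknown"   # a "schema change" made after doc was created
print(doc.author)             # unknown
print(vars(doc))              # {'title': 'Report'}
```

Only the attribute that differs from the default lives on the instance, and an existing object picks up a newly added class-level default without being touched.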

With dataclasses, a feature added in Python 3.7, there is now yet another (quite convenient) way to set default values on class instances. The dataclass decorator will automatically generate a few methods on your class, such as the constructor. As the documentation notes, "[t]he member variables to use in these generated methods are defined using PEP 526 type annotations".
Considering OP's example, we could implement it like this:
from dataclasses import dataclass
@dataclass
class Foo:
    num: int = 0
When constructing an object of this class's type we could optionally override the value.
print('Default val: {}'.format(Foo()))
# Default val: Foo(num=0)
print('Custom val: {}'.format(Foo(num=5)))
# Custom val: Foo(num=5)
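For mutable defaults, dataclasses offer field(default_factory=...), and the decorator refuses a bare list, dict, or set as a default. The sketch below illustrates both; the Basket and Broken classes are invented for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Basket:
    owner: str = "nobody"
    items: list = field(default_factory=list)   # a fresh list per instance


a, b = Basket(), Basket()
a.items.append("apple")
print(a)   # Basket(owner='nobody', items=['apple'])
print(b)   # Basket(owner='nobody', items=[])

try:
    @dataclass
    class Broken:
        items: list = []       # rejected when the decorator runs
except ValueError as exc:
    print("rejected:", exc)
```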

Using class members for default values of instance variables is not a good idea, and it's the first time I've seen this idea mentioned at all. It works in your example, but it may fail in a lot of cases. E.g., if the value is mutable, mutating it on an unmodified instance will alter the default:
>>> class c:
...     l = []
...
>>> x = c()
>>> y = c()
>>> x.l
[]
>>> y.l
[]
>>> x.l.append(10)
>>> y.l
[10]
>>> c.l
[10]

You can also declare class variables as None, which will prevent propagation. This is useful when you need a well-defined class and want to prevent AttributeErrors.
For example:
>>> class TestClass(object):
...     t = None
...
>>> test = TestClass()
>>> test.t
>>> test2 = TestClass()
>>> test.t = 'test'
>>> test.t
'test'
>>> test2.t
>>>
Also, if you need defaults:
>>> class TestClassDefaults(object):
...     t = None
...     def __init__(self, t=None):
...         self.t = t
...
>>> test = TestClassDefaults()
>>> test.t
>>> test2 = TestClassDefaults([])
>>> test2.t
[]
>>> test.t
>>>
Of course, still follow the advice in the other answers about mutable vs immutable types as defaults in __init__.

Related

What's the difference between the following ways of initializing attributes? [duplicate]

Is there any meaningful distinction between:
class A(object):
    foo = 5  # some default value
vs.
class B(object):
    def __init__(self, foo=5):
        self.foo = foo
If you're creating a lot of instances, is there any difference in performance or space requirements for the two styles? When you read the code, do you consider the meaning of the two styles to be significantly different?

There is a significant semantic difference (beyond performance considerations):
- When the attribute is defined on the instance (which is what we usually do), there can be multiple objects referred to. Each instance gets a totally separate version of that attribute.
- When the attribute is defined on the class, there is only one underlying object referred to, so if operations on different instances of that class both attempt to set/(append/extend/insert/etc.) the attribute, then:
    - if the attribute is an immutable type (like int, float, bool, str), assigning through the class will overwrite (clobber) the value that every instance sees, while assigning through one instance only creates an instance attribute that hides it;
    - if the attribute is a mutable type (like a list or a dict), we will get unwanted leakage.
For example:
>>> class A: foo = []
>>> a, b = A(), A()
>>> a.foo.append(5)
>>> b.foo
[5]
>>> class A:
...     def __init__(self): self.foo = []
>>> a, b = A(), A()
>>> a.foo.append(5)
>>> b.foo
[]

The difference is that the attribute on the class is shared by all instances. The attribute on an instance is unique to that instance.
If coming from C++, attributes on the class are more like static member variables.

Here is a very good post; I summarize it below.
class Bar(object):
    ## No need for dot syntax
    class_var = 1
    def __init__(self, i_var):
        self.i_var = i_var
## Need dot syntax as we've left scope of class namespace
Bar.class_var
## 1
foo = Bar(2)
## Finds i_var in foo's instance namespace
foo.i_var
## 2
## Doesn't find class_var in instance namespace…
## So looks in class namespace (Bar.__dict__)
foo.class_var
## 1
Class attribute assignment
If a class attribute is set by accessing the class, it will override the value for all instances:
foo = Bar(2)
foo.class_var
## 1
Bar.class_var = 2
foo.class_var
## 2
If a class variable is set by accessing an instance, it will override the value only for that instance. This shadows the class variable with an instance variable that is available only on that instance:
foo = Bar(2)
foo.class_var
## 1
foo.class_var = 2
foo.class_var
## 2
Bar.class_var
## 1
When would you use a class attribute?
Storing constants. As class attributes can be accessed as attributes of the class itself, it's often nice to use them for storing class-wide, class-specific constants:
class Circle(object):
    pi = 3.14159
    def __init__(self, radius):
        self.radius = radius
    def area(self):
        return Circle.pi * self.radius * self.radius
Circle.pi
## 3.14159
c = Circle(10)
c.pi
## 3.14159
c.area()
## 314.159
Defining default values. As a trivial example, we might create a bounded list (i.e., a list that can only hold a certain number of elements or fewer) and choose to have a default cap of 10 items:
class MyClass(object):
    limit = 10
    def __init__(self):
        self.data = []
    def item(self, i):
        return self.data[i]
    def add(self, e):
        if len(self.data) >= self.limit:
            raise Exception("Too many elements")
        self.data.append(e)
MyClass.limit
## 10
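Because the check reads self.limit, a single instance can override the cap without affecting the class default. A minimal sketch under that assumption follows; BoundedList is a hypothetical rename of the class above, trimmed to the parts that matter.

```python
class BoundedList:
    limit = 10                      # default cap shared by all instances

    def __init__(self):
        self.data = []

    def add(self, e):
        # self.limit is the instance value if one was set, else the class value
        if len(self.data) >= self.limit:
            raise Exception("Too many elements")
        self.data.append(e)


small = BoundedList()
small.limit = 2                     # override for this instance only
small.add("a")
small.add("b")
try:
    small.add("c")
except Exception as exc:
    print(exc)                      # Too many elements

print(BoundedList.limit)            # 10
print(BoundedList().limit)          # 10
```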

Since people in the comments here and in two other questions marked as dups all appear to be confused about this in the same way, I think it's worth adding an additional answer on top of Alex Coventry's.
The fact that Alex is assigning a value of a mutable type, like a list, has nothing to do with whether things are shared or not. We can see this with the id function or the is operator:
>>> class A: foo = object()
>>> a, b = A(), A()
>>> a.foo is b.foo
True
>>> class A:
...     def __init__(self): self.foo = object()
>>> a, b = A(), A()
>>> a.foo is b.foo
False
(If you're wondering why I used object() instead of, say, 5, that's to avoid running into two whole other issues which I don't want to get into here; for two different reasons, entirely separately-created 5s can end up being the same instance of the number 5. But entirely separately-created object()s cannot.)
So, why is it that a.foo.append(5) in Alex's example affects b.foo, but a.foo = 5 in my example doesn't? Well, try a.foo = 5 in Alex's example, and notice that it doesn't affect b.foo there either.
a.foo = 5 is just making a.foo into a name for 5. That doesn't affect b.foo, or any other name for the old value that a.foo used to refer to.* It's a little tricky that we're creating an instance attribute that hides a class attribute,** but once you get that, nothing complicated is happening here.
Hopefully it's now obvious why Alex used a list: the fact that you can mutate a list means it's easier to show that two variables name the same list, and also means it's more important in real-life code to know whether you have two lists or two names for the same list.
* The confusion for people coming from a language like C++ is that in Python, values aren't stored in variables. Values live off in value-land, on their own; variables are just names for values, and assignment just creates a new name for a value. If it helps, think of each Python variable as a shared_ptr<T> instead of a T.
** Some people take advantage of this by using a class attribute as a "default value" for an instance attribute that instances may or may not set. This can be useful in some cases, but it can also be confusing, so be careful with it.
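The distinction between mutating a shared object and rebinding a name can be seen side by side. This is only a sketch that reuses the class A from Alex's example; vars() is used to show where each attribute actually lives.

```python
class A:
    foo = []


a, b = A(), A()

a.foo.append(5)                 # mutates the one list that A.foo names
print(b.foo, A.foo)             # [5] [5]

a.foo = [99]                    # rebinds: creates an instance attribute on a
print(a.foo, b.foo, A.foo)      # [99] [5] [5]
print("foo" in vars(a), "foo" in vars(b))   # True False

del a.foo                       # remove the instance attribute again
print(a.foo)                    # [5], the class attribute is visible once more
```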

There is one more situation.
The attribute is a descriptor, assigned once on the class and once on an instance.
# -*- encoding: utf-8 -*-
class RevealAccess(object):
    def __init__(self, initval=None, name='var'):
        self.val = initval
        self.name = name
    def __get__(self, obj, objtype):
        return self.val
class Base(object):
    attr_1 = RevealAccess(10, 'var "x"')
    def __init__(self):
        self.attr_2 = RevealAccess(10, 'var "x"')
def main():
    b = Base()
    print("Access to class attribute, return: ", Base.attr_1)
    print("Access to instance attribute, return: ", b.attr_2)
if __name__ == '__main__':
    main()
Under Python 2, the above will output (Python 3 prints the same values without the tuple parentheses):
('Access to class attribute, return: ', 10)
('Access to instance attribute, return: ', <__main__.RevealAccess object at 0x10184eb50>)
The same type of object, accessed through the class or through an instance, returns a different result!
I found the explanation in the C definition of PyObject_GenericGetAttr, and in a great post.
Explanation:
If the attribute is found in the dictionary of the classes which make up the object's MRO, then check to see if the attribute being looked up points to a Data Descriptor (which is nothing more than a class implementing both the __get__ and the __set__ methods).
If it does, resolve the attribute lookup by calling the __get__ method of the Data Descriptor (lines 28–33).
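A Python 3 sketch of the same effect, under the assumption that all we need is a descriptor with __get__ only (a non-data descriptor). The names Ten, on_class and on_instance are invented for the illustration.

```python
class Ten:
    """A non-data descriptor: it defines __get__ only."""

    def __get__(self, obj, objtype=None):
        return 10


class Base:
    on_class = Ten()                # stored in Base.__dict__

    def __init__(self):
        self.on_instance = Ten()    # stored in the instance __dict__


b = Base()
print(Base.on_class)                    # 10
print(b.on_class)                       # 10
print(type(b.on_instance).__name__)     # Ten
```

The descriptor protocol is only applied to objects found on the class (or its bases); an object stored in the instance dictionary is returned as-is.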

Class variable vs instance variable

While learning Python through the Python docs, I came across the following, wherein it's explained that a class variable is common to the class and that any object can change it:
Sample Code 1:
class Dog:
    tricks = []  # mistaken use of a class variable
    def __init__(self, name):
        self.name = name
    def add_trick(self, trick):
        self.tricks.append(trick)
Output:
>>> d = Dog('Fido')
>>> e = Dog('Buddy')
>>> d.add_trick('roll over')
>>> e.add_trick('play dead')
>>> d.tricks # unexpectedly shared by all dogs
['roll over', 'play dead']
Question => If so, then why doesn't y in the following example get affected when x changes its tricks attribute to 5?
Sample Code 2:
class Complex:
    tricks = 3
    def __init__(self, var1):
        self.tricks = var1
    def add_tricks(self, var1):
        self.tricks = var1
x = Complex(11)
y = Complex(12)
print(x.tricks)
print(y.tricks)
x.add_tricks(5)
print(x.tricks)
print(y.tricks)  # --> remains unchanged
Output:
11
12
5
12 -->Remains unchanged
And what exactly is the difference when I remove the self in the following program:
Sample Code 3:
class Complex:
    tricks = 3
    def __init__(self, var1):
        self.tricks = var1
    def add_tricks(self, var1):
        tricks = var1
x = Complex(11)
y = Complex(12)
print(x.tricks)
print(y.tricks)
x.add_tricks(5)  # --> this change is not reflected anywhere
print(x.tricks)
print(y.tricks)
print(Complex.tricks)
Output:
11
12
11
12
3

This example may be illustrative. Given the following class (I've dropped the initialiser from your example because it doesn't let us demonstrate the behaviour):
class Complex:
    tricks = 3
    def add_tricks(self, value):
        self.tricks = value
We can see that, upon creation, both instances report 3 for their tricks attribute:
>>> a = Complex()
>>> b = Complex()
>>>
>>> a.tricks
3
>>> b.tricks
3
Let's take a second and look at the names defined on those objects:
>>> a.__dict__
{}
>>> b.__dict__
{}
They're both objects with no attributes themselves. Let's see what happens after we call add_tricks on b:
>>> b.add_tricks(5)
>>>
>>> a.tricks
3
>>> b.tricks
5
Okay. So, this looks like the shared value hasn't been affected. Let's take a look at their names again:
>>> a.__dict__
{}
>>> b.__dict__
{'tricks': 5}
And there it is. Assigning to self.tricks creates an attribute local to that object with name tricks, which when accessed via the object (or self) is the one that we'll use from that point forward.
The shared value is still there and unchanged:
>>> a.__class__.tricks
3
>>> b.__class__.tricks
3
It's just on the class, not on the object.
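The third sample in the question (assigning to a bare tricks inside the method) can be explained the same way. The sketch below contrasts the three possible assignment targets; the method names set_local, set_instance and set_class are hypothetical and not part of the question's code.

```python
class Complex:
    tricks = 3

    def set_local(self, value):
        tricks = value          # a local variable, discarded when the method returns

    def set_instance(self, value):
        self.tricks = value     # an attribute on this one object

    @classmethod
    def set_class(cls, value):
        cls.tricks = value      # the shared attribute on the class


x, y = Complex(), Complex()

x.set_local(5)
print(x.tricks, y.tricks, Complex.tricks)   # 3 3 3

x.set_instance(5)
print(x.tricks, y.tricks, Complex.tricks)   # 5 3 3

Complex.set_class(7)
print(x.tricks, y.tricks, Complex.tricks)   # 5 7 7
```

After the last call, x still reports 5 because its own instance attribute hides the class attribute.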

Specifying a data attribute in multiple instances of the same class in Python

I have already seen this post and even though the symptoms are similar, the way I am defining my class is different as I am using __init__:
>>> class foo(object):
...     def __init__(self, x):
...         self.x = x
...
>>>
I next define an instance of this class:
>>> inst1 = foo(10)
>>> inst1.x
10
Now, I would like to copy the same instance into a new variable and then change the value of x:
>>> inst2 = inst1
>>> inst2.x = 20
>>> inst2.x
20
It seems, however, (like a class-level attribute) all data attributes are shared between inst1 and inst2 since changing the value of x for inst2 will also change that for inst1:
>>> inst1.x
20
I do know that an alternative method is to say:
>>> inst2 = foo(20)
However, I don't like to do this because my actual class takes a lot of input arguments, of which I need to change only one or two specific data attributes when creating different instances (i.e., the rest of the input arguments remain the same for all instances). Any suggestions are greatly appreciated!

You are not copying the class instance (the object). The following line
>>> inst2 = inst1
copies a reference to the object that inst1 names. It does not copy the object.
An easy way to confirm this is to look at the result of the builtin id() function, which returns a unique identifier for each live Python object (in CPython, its memory address). The value for both inst2 and inst1 should be the same in this case.
And an easy way to solve it is to use Joran's answer.

class foo(object):
    def __init__(self, x):
        self.x = x
    def copy(self):
        return foo(self.x)
foo1 = foo(10)
foo2 = foo1.copy()
is a pretty safe way to implement it.
There is also the standard-library copy module:
from copy import deepcopy
foo2 = deepcopy(foo1)
If you define a __copy__ method, copy.copy will use your own __copy__:
class foo2:
    def __init__(self, val):
        self.state = 0
        self.val = val
    def __copy__(self):
        newfoo = foo2(self.val)
        newfoo.state = self.state
        print("Copied self:", self)
        return newfoo
from copy import copy
f = foo2(5)
f2 = copy(f)  # will do the copy we defined
This makes your code somewhat more generic, allowing you to just use copy on all objects without worrying about what the object is.
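For the use case in the question (many constructor arguments, only one or two attributes changed per instance), copying and then modifying is one option. The sketch below compares plain assignment, copy.copy and copy.deepcopy; the Config class and its tags attribute are made up for the illustration.

```python
import copy


class Config:
    def __init__(self, x, tags):
        self.x = x
        self.tags = tags


inst1 = Config(10, ["a"])
alias = inst1                     # same object, second name
shallow = copy.copy(inst1)        # new object, but it shares the tags list
deep = copy.deepcopy(inst1)       # new object with its own copy of tags

shallow.x = 20                    # change only what differs
shallow.tags.append("b")          # mutates the list shared with inst1

print(alias is inst1)             # True
print(inst1.x, shallow.x)         # 10 20
print(inst1.tags)                 # ['a', 'b']
print(deep.tags)                  # ['a']
```

A shallow copy is enough when the attributes you change are rebound; use deepcopy when the copies must not share mutable attributes.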

Creating equivalent classes in Python?

I played around with overloading or masking classes in Python. Do the following code examples create equivalent classes?
class CustASample(object):
    def __init__(self):
        self.__class__.__name__ = "Sample"
    def doSomething(self):
        dummy = 1
and
class Sample(object):
    def doSomething(self):
        dummy = 1
EDIT: From the comments and the good answer by gs, it occurred to me that what I really wanted to ask is: what "attributes" make these classes differ?
Because
>>> dir(a) == dir(b)
True
and
>>> print Sample
<class '__main__.Sample'>
>>> print CustASample
<class '__main__.Sample'>
but
>>> Sample == CustASample
False

No, they are still different.
a = CustASample()
b = Sample()
a.__class__ is b.__class__
-> False
Here's how you could do it:
class A(object):
    def __init__(self):
        self.__class__ = B
class B(object):
    def bark(self):
        print("Wuff!")
a = A()
b = B()
a.__class__ is b.__class__
-> True
a.bark()
-> Wuff!
b.bark()
-> Wuff!
Usually you would do it in the __new__ method instead of in __init__:
class C(object):
    def __new__(cls):
        return A()
To answer your updated question:
>>> a = object()
>>> b = object()
>>> a == b
False
Why would a not be equal to b, since both are just plain objects without attributes?
Well, that answer is simple. The == operator invokes __eq__, if it's available. But unless you define it yourself, the default comparison falls back to identity, so a is b gets used.
is compares the identities of the objects (in CPython, the id is the memory address). You can get the id of an object like this:
>>> id(a)
156808

Classes, not only instances, are objects too. For example, you can get id(Sample). Try it and see that the two numbers differ, because the two classes live at different memory locations. They are not the same object. It's like asking whether [] is [].
EDIT: Too late and the explanation by gs is better.
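The difference between the default identity-based == and a user-defined __eq__ can be sketched as follows. The Plain and Point classes are invented for the example and have nothing to do with the classes in the question.

```python
class Plain:
    pass


class Point:
    def __init__(self, x):
        self.x = x

    def __eq__(self, other):
        # Compare by value; defer to the other operand for foreign types.
        if not isinstance(other, Point):
            return NotImplemented
        return self.x == other.x
    # Note: defining __eq__ without __hash__ makes instances unhashable.


p, q = Plain(), Plain()
print(p == q, p == p)       # False True: default == is an identity check

p1, p2 = Point(1), Point(1)
print(p1 == p2)             # True: our __eq__ compares values
print(p1 is p2)             # False: still two distinct objects
```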
