I have found no reference for a short constructor call that would initialize variables of the caller's choice. I am looking for
class AClass:
    def __init__(self):
        pass
instance = AClass(var1=3, var2=5)
instead of writing the heavier
class AClass:
    def __init__(self, var1, var2):
        self.var1 = var1
        self.var2 = var2
or the much heavier
instance = AClass()
instance.var1 = 3
instance.var2 = 5
Am I missing something?
This is an excellent question, and one that has puzzled me too.
In the modern Python world, there are three (excellent) shorthand initializers (this term is clever, I am adopting it), depending on your needs. None requires any footwork with __init__ methods (which is what you wanted to avoid in the first place).
Namespace object
If you wish to assign arbitrary values to an instance (i.e. not enforced by the class), you should use a particular data structure called namespace. A namespace object is an object accessible with the dot notation, to which you can assign basically what you want.
You can import the Namespace class from argparse (it is covered here: How do I create a Python namespace (argparse.parse_args value)?). Since Python 3.3, a SimpleNamespace class is available from the standard types module.
from types import SimpleNamespace
instance = SimpleNamespace(var1=var1, var2=var2)
You can also write:
instance = SimpleNamespace()
instance.var1 = var1
instance.var2 = var2
Let's say it's the "quick and dirty way", which would work in a number of cases. In general there is not even the need to declare your class.
If you want your instances to still have a few methods and properties you could still do:
class AClass(Namespace):
    def mymethod(self, ...):
        pass
And then:
instance = AClass(var1=var1, var2=var2, etc.)
That gives you maximum flexibility.
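As a minimal runnable sketch of the subclassing idea above, here is the same pattern built on SimpleNamespace from types rather than argparse.Namespace. The method name total and the attribute values are made up for illustration:

```python
from types import SimpleNamespace


class AClass(SimpleNamespace):
    """Namespace-style class: attributes come from keyword arguments."""

    def total(self):
        # Hypothetical helper; assumes the caller supplied var1 and var2.
        return self.var1 + self.var2


instance = AClass(var1=3, var2=5)
instance.var3 = 7          # new attributes can still be added freely
print(instance.total())    # 8
```

Nothing stops a caller from omitting var1 or var2, in which case total() would fail with an AttributeError; that is the price of the flexibility.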
Named tuple
On the other hand, if you want the class to enforce those attributes, then you have another, more solid option.
A named tuple produces immutable instances, which are initialized once and for all. Think of them as ordinary tuples, but with each item also accessible with the dot notation. The namedtuple class factory is part of the standard distribution of Python. This is how you generate your class:
from collections import namedtuple
AClass = namedtuple("AClass", "var1 var2")
Note how cool and short the definition is, with no __init__ method required. You can actually complete your class after that.
And to create an object:
instance = AClass(var1, var2)
or
instance = AClass(var1=var1, var2=var2)
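To make the "ordinary tuple with dot access" and the immutability concrete, here is a short sketch with made-up values (3 and 5); _replace is the standard namedtuple method for deriving a modified copy:

```python
from collections import namedtuple

AClass = namedtuple("AClass", "var1 var2")
instance = AClass(var1=3, var2=5)

a, b = instance                 # unpacks like an ordinary tuple

try:
    instance.var1 = 10          # named tuples are immutable
except AttributeError as exc:
    error = exc                 # assignment is rejected

updated = instance._replace(var1=10)   # returns a new tuple instead
print(instance, updated)
```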
Named list
But what if you want that instance to be mutable, i.e. to allow you to update the properties of the instance? The answer is the named list (also known as RecordClass). Conceptually it is like a normal list, where the items are also accessible with the dot notation.
There are various implementations. I personally use the aptly named namedlist.
The syntax is identical:
from namedlist import namedlist
AClass = namedlist("AClass", "var1 var2")
And to create an object:
instance = AClass(var1, var2)
or:
instance = AClass(var1=var1, var2=var2)
And you can then modify them:
instance.var1 = var3
But you can't add an attribute that is not defined.
>>> instance.var4 = var4
File "<stdin>", line 1, in <module>
AttributeError: 'instance' object has no attribute 'var4'
Usage
Here are my two cents:
Namespace object is for maximum flexibility and there is not even the need to declare a class; with the risk of having instances that don't behave properly (but Python is a language for consenting adults). If you have only one instance and/or you know what you're doing, that would be the way to go.
namedtuple class generator is perfect to generate objects for returns from functions (see this brief explanation in a lecture from Raymond Hettinger). Rather than returning bland tuples that the user needs to look up in the documentation, the tuple returned is self-explanatory (a dir or help will do it). And it's compatible with tuple usage anyway (e.g. k, v, z = my_func()). Plus it's immutable, which has its own advantages.
namedlist class generator is useful in a wide range of cases, including when you need to return multiple values from a function, which then need to be amended at a later stage (and you can still unpack them: k, v, z = instance). If you need a mutable object from a proper class with enforced attributes, that might be the go-to solution.
If you use them well, this might significantly cut down time spent on writing classes and handling instances!
Update (September 2020)
@PPC: your dream has come true.
Since Python 3.7, a new tool is available as a standard: dataclasses (unsurprisingly, the designer of the named list package, Eric V. Smith, is also behind it).
In essence, it provides an automatic initialization of class variables.
from dataclasses import dataclass
@dataclass
class InventoryItem:
    """Class for keeping track of an item in inventory."""
    name: str
    unit_price: float
    quantity_on_hand: int = 0
    def total_cost(self) -> float:
        return self.unit_price * self.quantity_on_hand
(from the official doc)
The @dataclass decorator automatically adds the __init__() method:
def __init__(self, name: str, unit_price: float, quantity_on_hand: int=0):
    self.name = name
    self.unit_price = unit_price
    self.quantity_on_hand = quantity_on_hand
IMHO, it's a pretty, eminently pythonic solution.
Eric also maintains a backport of dataclasses on GitHub, for Python 3.6.
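For illustration, this is roughly how the generated methods behave when the class from the official documentation is used. The item name and numbers are made up:

```python
from dataclasses import dataclass


@dataclass
class InventoryItem:
    """Class for keeping track of an item in inventory."""
    name: str
    unit_price: float
    quantity_on_hand: int = 0

    def total_cost(self) -> float:
        return self.unit_price * self.quantity_on_hand


item = InventoryItem("widget", unit_price=2.5, quantity_on_hand=4)
item.quantity_on_hand = 6            # instances are mutable by default
print(item)                          # a readable __repr__ is generated
print(item.total_cost())             # 15.0
print(item == InventoryItem("widget", 2.5, 6))   # True, __eq__ is generated too
```

Unlike the namespace approach, an unknown keyword such as InventoryItem("widget", 2.5, colour="red") is rejected with a TypeError, because the generated __init__ has a fixed signature.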
You can update the __dict__ attribute of your object directly, which is where the attributes are stored:
class AClass:
    def __init__(self, **kwargs):
        self.__dict__.update(kwargs)
c = AClass(var1=1, var2='a')
You can use the dictionary representation of the object's attributes, and update its elements with the keyword arguments given to the constructor:
class AClass:
    def __init__(self, **kwargs):
        self.__dict__.update(**kwargs)
instance = AClass(var1=3, var2=5)
print(instance.var1, instance.var2) # prints 3 5
However, consider this question and its answers regarding the style of this approach. Unless you know what you are doing, it is better to set the arguments explicitly, one by one. It will be easier to understand for you and other people later: explicit is better than implicit. If you do it the __dict__.update way, document it properly.
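One possible middle ground, sketched here as an assumption rather than something the answers above propose, is to keep the __dict__.update shortcut but reject names the class does not expect. The _allowed set is a hypothetical whitelist:

```python
class AClass:
    # Hypothetical whitelist; not part of the original answers.
    _allowed = {"var1", "var2"}

    def __init__(self, **kwargs):
        unknown = set(kwargs) - self._allowed
        if unknown:
            # Fail loudly instead of silently storing a misspelled name.
            raise TypeError(f"unexpected arguments: {sorted(unknown)}")
        self.__dict__.update(kwargs)


instance = AClass(var1=3, var2=5)
print(instance.var1, instance.var2)
```

The whitelist also doubles as documentation of which attributes an instance may carry.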
Try
class AClass:
    def __init__(self, **vars):
        self.var1 = vars.get('var1')
Related
My confusion is with the interplay between dataclasses & __init_subclass__.
I am trying to implement a base class that will exclusively be inherited from. In this example, A is the base class. It is my understanding from reading the python docs on dataclasses that simply adding a decorator should automatically create some special dunder methods for me. Quoting their docs:
For example, this code:
from dataclasses import dataclass
@dataclass
class InventoryItem:
    """Class for keeping track of an item in inventory."""
    name: str
    unit_price: float
    quantity_on_hand: int = 0
    def total_cost(self) -> float:
        return self.unit_price * self.quantity_on_hand
will add, among other things, a __init__() that looks like:
def __init__(self, name: str, unit_price: float, quantity_on_hand: int = 0):
    self.name = name
    self.unit_price = unit_price
    self.quantity_on_hand = quantity_on_hand
This is an instance variable, no? From the classes docs, it shows a toy example, which reads super clear.
class Dog:
    kind = 'canine' # class variable shared by all instances
    def __init__(self, name):
        self.name = name # instance variable unique to each instance
A main gap in my understanding is: is it an instance variable or a class variable? From my testing below, it is a class variable, but from the docs, it shows an instance variable as its proximal implementation. It may be that most of my problem is there. I've also read the python docs on classes, which do not go into dataclasses.
The problem continues with the seemingly limited docs on __init_subclass__, which yields another gap in my understanding. I am also making use of __init_subclass__, in order to enforce that my subclasses have indeed instantiated the variable x.
Below, we have A, which has an instance variable x set to None. B, C, and D all subclass A, in different ways (hoping) to determine implementation specifics.
B inherits from A, setting a class variable of x.
D is a dataclass, which inherits from A, setting what would appear to be a class variable of x. However, given their docs from above, it seems that the class variable x of D should be created as an instance variable. Thus, when D is created, it should first call __init_subclass__, in that function, it will check to see if x exists in D - by my understanding, it should not; however, the code passes scot-free. I believe D() will create x as an instance variable because the dataclass docs show that this will create an __init__ for the user.
"will add, among other things..." <insert __init__ code>
I must be wrong here but I'm struggling to put it together.
import dataclasses
class A:
    def __init__(self):
        self.x = None
    def __init_subclass__(cls):
        if not getattr(cls, 'x') or not cls.x:
            raise TypeError(
                f'Cannot instantiate {cls.__name__}, as all subclasses of {cls.__base__.__name__} must set x.'
            )
class B(A):
    x = 'instantiated-in-b'
@dataclasses.dataclass
class D(A):
    x : str = 'instantiated-in-d'
class C(A):
    def __init__(self):
        self.x = 'instantiated-in-c'
print('B', B())
print('D', D())
print('C', C())
The code, per my expectation, properly fails with C(). Executing the above code will succeed with D, which does not compute for me. In my understanding (which is wrong), I am defining a field, which means that dataclass should expand my class variables as instance variables. (The previous statement is most probably where I am wrong, but I cannot find anything that documents this behavior. Are data classes not actually expanding class variables as instance variables? It certainly appears that way from the visual explanation in their docs.) From the dataclass docs:
The dataclass() decorator examines the class to find fields. A field is defined as a class variable that has a type annotation.
Thus - why - when creating an instance D() - does it slide past the __init_subclass__ of its parent A?
Apologies for the lengthy post, I must be missing something simple, so if someone can point me in the right direction, that would be excellent. TIA!
I have just found the implementation for dataclasses from the CPython github.
Related Articles:
Understanding __init_subclass__
python-why-use-self-in-a-class
proper-way-to-create-class-variable-in-data-class
how-to-get-instance-variables-in-python
enforcing-class-variables-in-a-subclass
__init_subclass__ is called when initializing a subclass. Not when initializing an instance of a subclass - it's called when initializing the subclass itself. Your exception occurs while trying to create the C class, not while trying to evaluate C().
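The timing can be checked with a small sketch. Base, Child and the events list are illustrative names, not taken from the question:

```python
events = []


class Base:
    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        # Runs once, when the subclass itself is being created.
        events.append(f"subclass created: {cls.__name__}")


class Child(Base):
    pass


events.append("before instantiation")
Child()
Child()
print(events)
# ['subclass created: Child', 'before instantiation']
```

Creating instances adds nothing to the list: the hook has already run by the time the class statement finishes.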
Decorators, such as @dataclass, are a post-processing mechanism, not a pre-processing mechanism. A class decorator takes an existing class that has already gone through all the standard initialization, including __init_subclass__, and modifies the class. Since this happens after __init_subclass__, __init_subclass__ doesn't see any of the modifications that @dataclass performs.
Even if the decorator were to be applied first, D still would have passed the check in A.__init_subclass__, because the dataclass decorator will set D.x to the default value of the x field anyway, so __init_subclass__ will find a value of x. In this case, that happens to be the same thing you set D.x to in the original class definition, but it can be a different object in cases where you construct field objects explicitly.
(Also, you probably wanted to write hasattr instead of getattr in not getattr(cls, 'x').)
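A trimmed-down sketch of the question's classes shows both points. It uses getattr with a default, which is one way (an assumption on my part, equivalent in effect to the hasattr check) to avoid an AttributeError when x is missing altogether; the error message is shortened for the example:

```python
import dataclasses


class A:
    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        # A default of None covers both "missing" and "empty" values of x.
        if not getattr(cls, "x", None):
            raise TypeError(f"{cls.__name__} must set x")


@dataclasses.dataclass
class D(A):
    x: str = "instantiated-in-d"


print(D.x)          # the default is still a plain class attribute
print(D().x)        # and the generated __init__ also sets it per instance

try:
    class C(A):     # fails here, at class creation, before any C() call
        def __init__(self):
            self.x = "instantiated-in-c"
except TypeError as exc:
    message = str(exc)
    print(message)
```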
I wrote a class that can handle integers with arbitrary precision (just for learning purposes). The class takes a string representation of an integer and converts it into an instance of BigInt for further calculations.
Oftentimes you need the numbers Zero and One, so I thought it would be helpful if the class could return these. I tried the following:
class BigInt():
    zero = BigInt("0")
    def __init__(self, value):
        ####yada-yada####
This doesn't work. Error: "name 'BigInt' is not defined"
Then I tried the following:
class BigInt():
    __zero = None
    @staticmethod
    def zero():
        if BigInt.__zero is None:
            BigInt.__zero = BigInt('0')
        return BigInt.__zero
    def __init__(self, value):
        ####yada-yada####
This actually works very well. What I don't like is that zero is a method (and thus has to be called with BigInt.zero()) which is counterintuitive since it should just refer to a fixed value.
So I tried changing zero to become a property, but then writing BigInt.zero returns an instance of the class property instead of BigInt because of the decorator used. That instance cannot be used for calculations because of the wrong type.
Is there a way around this issue?
A static property...? We call a static property an "attribute". This is not Java, Python is a dynamically typed language and such a construct would be really overcomplicating matters.
Just do this, setting a class attribute:
class BigInt:
    def __init__(self, value):
        ...
BigInt.zero = BigInt("0")
If you want it to be entirely defined within the class, do it using a decorator (but be aware it's just a more fancy way of writing the same thing).
def add_zero(cls):
    cls.zero = cls("0")
    return cls
@add_zero
class BigInt:
    ...
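Here is the decorator variant as a self-contained sketch. The BigInt below is only a stand-in that stores the digit string in a hypothetical digits attribute, since the real class body is not shown in the question:

```python
def add_zero(cls):
    # Runs after the class body, so cls can already be instantiated.
    cls.zero = cls("0")
    return cls


@add_zero
class BigInt:
    # Stand-in for the asker's class: it only stores the digit string.
    def __init__(self, value):
        self.digits = value


print(BigInt.zero.digits)             # 0
print(BigInt.zero is BigInt.zero)     # True, a single shared instance
```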
The question is contradictory: static and property don't go together in this way. Static attributes in Python are simply ones that are only assigned once, and the language itself includes a very large number of these (most strings are interned, all integers below a certain value are pre-constructed, etc.; see, for example, the string module). The easiest approach is to statically assign the attributes after construction, as wim illustrates:
class Foo:
    ...
Foo.first = Foo()
...
Or, as he further suggested, using a class decorator to perform the assignments, which is functionally the same as the above. A decorator is, effectively, a function that is given the "decorated" function as an argument, and must return a function to effectively replace the original one. This may be the original function, say, modified with some annotations, or may be an entirely different function. The original (decorated) function may or may not be called as appropriate for the decorator.
def preload(**values):
    def inner(cls):
        for k, v in values.items():
            setattr(cls, k, cls(v))
        return cls
    return inner
This can then be used dynamically:
@preload(zero=0, one=1)
class Foo:
    ...
If the purpose is to save some time on common integer values, a defaultdict mapping integers to constructed BigInts could be useful as a form of caching and streamlined construction / singleton storage. (E.g. BigInt.numbers[27])
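A sketch of that caching idea follows. Note that defaultdict's default_factory is called without the missing key, so this example swaps in a plain dict subclass with __missing__ instead. BigIntCache and the stand-in BigInt (with its digits attribute) are hypothetical names:

```python
class BigInt:
    # Stand-in for the asker's class: it only stores the digit string.
    def __init__(self, value):
        self.digits = value


class BigIntCache(dict):
    def __missing__(self, key):
        # Called by dict on a failed lookup: build, store and return.
        value = self[key] = BigInt(str(key))
        return value


BigInt.numbers = BigIntCache()

a = BigInt.numbers[27]
b = BigInt.numbers[27]
print(a is b)        # True, the second lookup hits the cache
```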
However, the problem of utilizing @property at the class level intrigued me, so I did some digging. It is entirely possible to make use of "descriptor protocol objects" (which the @property decorator returns) at the class level if you punt the attribute up the object model hierarchy, to the metaclass.
class Foo(type):
    @property
    def bar(cls):
        print("I'm a", cls)
        return 27
class Bar(metaclass=Foo):
    ...
>>> Bar.bar
I'm a <class '__main__.Bar'>
<<< 27
Notably, this attribute is not accessible from instances:
>>> Bar().bar
AttributeError: 'Bar' object has no attribute 'bar'
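Applied to the original BigInt question, the metaclass trick might look like the sketch below. BigIntMeta, the _zero cache attribute and the stand-in BigInt are assumptions made for the example:

```python
class BigIntMeta(type):
    @property
    def zero(cls):
        # Build the shared instance lazily on first access.
        # vars(cls) is checked so each class keeps its own cached value.
        if "_zero" not in vars(cls):
            cls._zero = cls("0")
        return cls._zero


class BigInt(metaclass=BigIntMeta):
    # Stand-in for the asker's class: it only stores the digit string.
    def __init__(self, value):
        self.digits = value


print(BigInt.zero.digits)            # 0
print(BigInt.zero is BigInt.zero)    # True
```

BigInt.zero now reads like a plain attribute yet returns a real BigInt, at the cost of introducing a metaclass for what the simpler class-attribute assignment already achieves.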
Hope this helps!
This is about multiple inheritance. Parent class A provides a few methods and parent class B a few additional ones. By creating a class inheriting from A and B, I could instantiate an object having both method sets.
Now my problem is, that I detect only after having instantiated A, that the methods from B would be helpful too (or more strictly stated, that my object is also of class B).
While
aInstance.bMethod = types.MethodType(localFunction, aInstance)
works in principle, it has to be repeated for every bMethod, and looks unnecessarily complicated. It also requires stand-alone (local) functions instead of a conceptually cleaner class B. Is there a more streamlined approach?
Update:
I tried abstract base class with some success, but there only the methods of one additional class could be added.
What I finally achieved is a little routine, which adds all top-level procedures of a given module:
from types import MethodType
from inspect import ismodule, isfunction, getmembers
# adds all functions found in module as methods to given obj
def classMagic(obj, module):
    assert(ismodule(module))
    for name, fn in getmembers(module, isfunction):
        if not name.startswith("__"):
            setattr(obj, name, MethodType(fn, obj))
Functionally this is sufficient, and I'm also pleased with the automatism: all functions are processed, and I don't have separate places for defining a function and adding it as a method, so maintenance is easy. The only remaining issue is reflected by the startswith line, as an example of a necessary naming convention if selected functions shall not be added.
If I understand correctly, you want to add mixins to your class at run time. A very common way of adding mixins in Python is through decorators (rather than inheritance), so we can borrow this idea to do something at runtime to the object (instead of to the class).
I used functools.partial to freeze the self parameter, to emulate the process of binding a function to an object (i.e. turn a function into a method).
from functools import partial
class SimpleObject():
    pass
def MixinA(obj):
    def funcA1(self):
        print('A1 - propertyA is equal to %s' % self.propertyA)
    def funcA2(self):
        print('A2 - propertyA is equal to %s' % self.propertyA)
    obj.propertyA = 0
    obj.funcA1 = partial(funcA1, self=obj)
    obj.funcA2 = partial(funcA2, self=obj)
    return obj
def MixinB(obj):
    def funcB1(self):
        print('B1')
    obj.funcB1 = partial(funcB1, self=obj)
    return obj
o = SimpleObject()
# need A characteristics?
o = MixinA(o)
# need B characteristics?
o = MixinB(o)
Instead of functools.partial, you can also use types.MethodType as you did in your question; I think that is a better/cleaner solution.
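A sketch of the types.MethodType variant, reduced to a single function; func_b1 and mixin_b are illustrative names:

```python
from types import MethodType


class SimpleObject:
    pass


def func_b1(self):
    return "B1 called on %s" % type(self).__name__


def mixin_b(obj):
    # Bind the plain function to this one instance only.
    obj.func_b1 = MethodType(func_b1, obj)
    return obj


o = mixin_b(SimpleObject())
print(o.func_b1())                # B1 called on SimpleObject
print(o.func_b1.__self__ is o)    # True, it is a real bound method
```

Other instances of SimpleObject are unaffected, because the method is stored on the instance rather than on the class.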
I'm writing a class for something and I keep stumbling across the same tiresome to type out construction. Is there some simple way I can set up class so that all the parameters in the constructor get initialized as their own name, i.e. fish = 0 -> self.fish = fish?
class Example(object):
    def __init__(self, fish=0, birds=0, sheep=0):
        self.fish = fish
        self.birds = birds
        self.sheep = sheep
Short answer: no. You are not required to initialize everything in the constructor (you could do it lazily), unless you need it immediately or expose it (meaning that you don't control access). But, since in Python you don't declare data fields, it will become difficult, very difficult, to track them all if they appear in different parts of the code.
More comprehensive answer: you could do some magic with **kwargs (which holds a dictionary of argument name/value pairs), but that is highly discouraged, because it makes documenting the changes almost impossible and difficult for users to check if a certain argument is accepted or not. Use it only for optional, internal flags. It could be useful when having 20 or more parameters to pass, but in that case I would suggest to rethink the design and cluster data.
In case you need a simple key/value storage, consider using a builtin, such as dict.
You could use the inspect module:
import inspect
class Example(object):
    def __init__(self, fish=0, birds=0, sheep=0):
        frame = inspect.currentframe()
        args, _, _, values = inspect.getargvalues(frame)
        for i in args:
            setattr(self, i, values[i])
This works, but is more complicated than just setting them manually. It should be possible to hide this with a decorator:
@set_attributes
def __init__(self, fish=0, birds=0, sheep=0):
    pass
but defining set_attributes gets tricky because the decorator inserts another stack frame into the mix, and I can't quite get the details right.
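One way to sidestep the stack-frame problem, offered here as a sketch rather than as the decorator the answer had in mind, is to avoid frames entirely and bind the call arguments against the function's signature with inspect.signature:

```python
import functools
import inspect


def set_attributes(init):
    sig = inspect.signature(init)

    @functools.wraps(init)
    def wrapper(self, *args, **kwargs):
        bound = sig.bind(self, *args, **kwargs)
        bound.apply_defaults()
        # Skip the first parameter (self) and copy the rest onto the instance.
        for name, value in list(bound.arguments.items())[1:]:
            setattr(self, name, value)
        return init(self, *args, **kwargs)

    return wrapper


class Example:
    @set_attributes
    def __init__(self, fish=0, birds=0, sheep=0):
        pass


e = Example(2, sheep=5)
print(e.fish, e.birds, e.sheep)   # 2 0 5
```

This sketch assumes a plain parameter list; an __init__ that itself declares *args or **kwargs would have those collected values stored as attributes too.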
For Python 3.7+, you can try using data classes in combination with type annotations.
https://docs.python.org/3/library/dataclasses.html
Import the module and use the decorator. Type-annotate your variables and there's no need to define an init method, because it will automatically be created for you.
from dataclasses import dataclass
@dataclass
class Example:
    fish: int = 0
    birds: int = 0
    sheep: int = 0
I'm teaching myself Python and I see the following in Dive into Python section 5.3:
By convention, the first argument of any Python class method (the reference to the current instance) is called self. This argument fills the role of the reserved word this in C++ or Java, but self is not a reserved word in Python, merely a naming convention. Nonetheless, please don't call it anything but self; this is a very strong convention.
Considering that self is not a Python keyword, I'm guessing that it can sometimes be useful to use something else. Are there any such cases? If not, why is it not a keyword?
No, unless you want to confuse every other programmer that looks at your code after you write it. self is not a keyword because it is an identifier. It could have been a keyword and the fact that it isn't one was a design decision.
As a side observation, note that Pilgrim is committing a common misuse of terms here: a class method is quite a different thing from an instance method, which is what he's talking about here. As wikipedia puts it, "a method is a subroutine that is exclusively associated either with a class (in which case it is called a class method or a static method) or with an object (in which case it is an instance method).". Python's built-ins include a staticmethod type, to make static methods, and a classmethod type, to make class methods, each generally used as a decorator; if you don't use either, a def in a class body makes an instance method. E.g.:
>>> class X(object):
...     def noclass(self): print(self)
...     @classmethod
...     def withclass(cls): print(cls)
...
>>> x = X()
>>> x.noclass()
<__main__.X object at 0x698d0>
>>> x.withclass()
<class '__main__.X'>
>>>
As you see, the instance method noclass gets the instance as its argument, but the class method withclass gets the class instead.
So it would be extremely confusing and misleading to use self as the name of the first parameter of a class method: the convention in this case is instead to use cls, as in my example above. While this IS just a convention, there is no real good reason for violating it -- any more than there would be, say, for naming a variable number_of_cats if the purpose of the variable is counting dogs!-)
The only case of this I've seen is when you define a function outside of a class definition, and then assign it to the class, e.g.:
class Foo(object):
    def bar(self):
        # Do something with 'self'
        pass
def baz(inst):
    return inst.bar()
Foo.baz = baz
In this case, self is a little strange to use, because the function could be applied to many classes. Most often I've seen inst or cls used instead.
I once had some code like (and I apologize for lack of creativity in the example):
class Animal:
    def __init__(self, volume=1):
        self.volume = volume
        self.description = "Animal"
    def Sound(self):
        pass
    def GetADog(self, newvolume):
        class Dog(Animal):
            def Sound(this):
                return self.description + ": " + ("woof" * this.volume)
        return Dog(newvolume)
Then we have output like:
>>> a = Animal(3)
>>> d = a.GetADog(2)
>>> d.Sound()
'Animal: woofwoof'
I wasn't sure if self within the Dog class would shadow self within the Animal class, so I opted to make Dog's reference the word "this" instead. In my opinion and for that particular application, that was more clear to me.
Because it is a convention, not language syntax. There is a Python style guide that people who program in Python follow. This way libraries have a familiar look and feel. Python places a lot of emphasis on readability, and consistency is an important part of this.
I think that the main reason self is used by convention rather than being a Python keyword is because it's simpler to have all methods/functions take arguments the same way rather than having to put together different argument forms for functions, class methods, instance methods, etc.
Note that if you have an actual class method (i.e. one defined using the classmethod decorator), the convention is to use "cls" instead of "self".