__new__ and __init__ in Python

I am learning Python, and so far this is what I understand about __new__ and __init__:
__new__ is for object creation.
__init__ is for object initialization.
__new__ is invoked before __init__: __new__ returns a new instance, and __init__ is invoked afterwards to initialize its inner state.
__new__ is useful for immutable objects, because they cannot be changed once they are created. So we can return a new instance that has the new state.
For mutable objects we can use both __new__ and __init__, because their inner state can be changed.
But I have some other questions now.
When I create a new instance such as a = MyClass("hello","world"), how are these arguments passed? I mean, how should I structure the class using __init__ and __new__, given that they are different and both accept arbitrary arguments besides the default first argument?
Can the name self be changed to something else? And is cls likewise subject to change, since it is just a parameter name?
I made a little experiment:
>>> class MyClass(tuple):
...     def __new__(tuple):
...         return [1,2,3]
and then did this:
>>> a = MyClass()
>>> a
[1, 2, 3]
Although I said I wanted to return a tuple, this code works fine and returned [1,2,3]. I thought the first parameter was the type we wanted to receive once the __new__ function is invoked. We are talking about a "new" function, right? I don't know of other languages where it can return a type other than the bound type.
I also tried a few other things:
>>> issubclass(MyClass,list)
False
>>> issubclass(MyClass,tuple)
True
>>> isinstance(a,MyClass)
False
>>> isinstance(a,tuple)
False
>>> isinstance(a,list)
True
I didn't experiment further because the way forward wasn't clear, so I decided to stop there and ask StackOverflow.
The SO posts I read:
Python object creation
Python's use of __new__ and __init__?

how should I structure the class using __init__ and __new__, given that they are different and both accept arbitrary arguments besides the default first argument?
Only rarely will you have to worry about __new__. Usually, you'll just define __init__ and let the default __new__ pass the constructor arguments to it.
Can the name self be changed to something else? And is cls likewise subject to change, since it is just a parameter name?
Both are just parameter names with no special meaning in the language. But their use is a very strong convention in the Python community; most Pythonistas will never change the names self and cls in these contexts and will be confused when someone else does.
Note that your use of def __new__(tuple) re-binds the name tuple inside the constructor function. When actually implementing __new__, you'll want to do it as
def __new__(cls, *args, **kwargs):
    # do allocation to get an object, say, obj
    return obj
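To make the argument flow concrete, here is a minimal sketch. The class name Pair and the calls list are made up for illustration; the point is only that the same constructor arguments reach __new__ first and then __init__.

```python
calls = []  # records the order of calls (illustrative only)


class Pair:
    def __new__(cls, *args, **kwargs):
        calls.append(("__new__", args))
        # object.__new__ only needs the class; the extra arguments stop here
        return super().__new__(cls)

    def __init__(self, first, second):
        calls.append(("__init__", (first, second)))
        self.first = first
        self.second = second


a = Pair("hello", "world")
print(calls)
```

Accepting *args and **kwargs in __new__ keeps it compatible with whatever signature __init__ declares, which is why that shape is a common default.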
Although I said I wanted to return a tuple, this code works fine and returned [1,2,3].
MyClass() will have the value that __new__ returns. There's no implicit type checking in Python; it's the responsibility of the programmer to return the correct type ("we're all consenting adults here"). Being able to return a different type than requested can be useful for implementing factories: you can return a subclass of the type requested.
This also explains the issubclass/isinstance behavior you observe: the subclass relationship follows from your use of class MyClass(tuple), the isinstance reflects that you return the "wrong" type from __new__.
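As a rough illustration of both points (the Shape, Triangle and Broken classes are hypothetical), the sketch below shows a factory-style __new__ that returns a subclass of the requested type, next to a __new__ that returns an unrelated type, in which case __init__ is skipped.

```python
class Shape:
    def __new__(cls, sides):
        # Pick a more specific subclass when called on the base class
        target = Triangle if (cls is Shape and sides == 3) else cls
        return super().__new__(target)

    def __init__(self, sides):
        self.sides = sides


class Triangle(Shape):
    pass


class Broken(tuple):
    def __new__(cls):
        return [1, 2, 3]  # not an instance of cls, so __init__ is skipped

    def __init__(self):
        raise RuntimeError("never reached")


t = Shape(3)   # a Triangle; __init__ still runs because Triangle is a Shape
s = Shape(4)   # a plain Shape
b = Broken()   # a plain list
```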
For reference, check out the requirements for __new__ in the Python Language Reference.
Edit: ok, here's an example of potentially useful use of __new__. The class Eel keeps track of how many eels are alive in the process and refuses to allocate if this exceeds some maximum.
class HovercraftFull(Exception):
    """Raised when no more eels can be allocated."""
class Eel(object):
    MAX_EELS = 20
    n_eels = 0
    def __new__(cls, *args, **kwargs):
        if cls.n_eels == cls.MAX_EELS:
            raise HovercraftFull()
        obj = super(Eel, cls).__new__(cls)
        cls.n_eels += 1
        return obj
    def __init__(self, voltage):
        self.voltage = voltage
    def __del__(self):
        type(self).n_eels -= 1
    def electric(self):
        """Is this an electric eel?"""
        return self.voltage > 0
Mind you, there are smarter ways to accomplish this behavior.

Related

What's the correct way to implement a metaclass with a different signature than `type`?

Say I want to implement a metaclass that should serve as a class factory. But unlike the type constructor, which takes 3 arguments, my metaclass should be callable without any arguments:
Cls1 = MyMeta()
Cls2 = MyMeta()
...
For this purpose I defined a custom __new__ method with no parameters:
class MyMeta(type):
    def __new__(cls):
        return super().__new__(cls, 'MyCls', (), {})
But the problem is that python automatically calls the __init__ method with the same arguments as the __new__ method, so trying to call MyMeta() ends up throwing an exception:
TypeError: type.__init__() takes 1 or 3 arguments
Which makes sense, since type can be called with 1 or 3 arguments. But what's the correct way to fix this? I see 3 (4?) options:
I could add an empty __init__ method to my metaclass, but since I'm not sure if type.__init__ does anything important, this might not be a good idea.
I could implement an __init__ method that calls super().__init__(cls.__name__, cls.__bases__, vars(cls)).
I could use a meta-metaclass and override its __call__ method, rather than messing with __new__ and __init__.
Bonus option: Maybe I shouldn't try to change the signature?
So my question is: Are the 3 solutions I listed correct or are there any subtle bugs hidden in them? Which solution is best (i.e. the most correct)?
An interface deviating from the parent signature is a questionable design in regular classes too. You don't need the extra complexity of metaclasses to get into this kind of mess - you can cause the same new/init jumble by subclassing a datetime or whatever.
I want to have a metaclass and an easy way to create instances of that metaclass.
The usual pattern in Python is to write a factory using a from_something classmethod. To take the example of creating datetime instances from a different init signature, there is for example datetime.fromtimestamp, but you have many other examples too (dict.fromkeys, int.from_bytes, bytes.fromhex...)
There is nothing specific to metaclasses here, so use the same pattern:
class MyMeta(type):
    @classmethod
    def from_no_args(cls, name=None):
        if name is None:
            name = cls.__name__ + 'Instance'
        return cls(name, (), {})
Usage:
>>> class A(metaclass=MyMeta):
...     pass
...
>>> B = MyMeta.from_no_args()
>>> C = MyMeta.from_no_args(name='C')
>>> A.__name__
'A'
>>> B.__name__
'MyMetaInstance'
>>> C.__name__
'C'

Porting Subclass of Unicode to Python 3

I'm porting a legacy codebase from Python 2.7 to Python 3.6. In that codebase I have a number of instances of things like:
class EntityName(unicode):
    @staticmethod
    def __new__(cls, s):
        clean = cls.strip_junk(s)
        return super(EntityName, cls).__new__(cls, clean)
    def __init__(self, s):
        self._clean = s
        self._normalized = normalized_name(self._clean)
        self._simplified = simplified_name(self._clean)
        self._is_all_caps = None
        self._is_all_lower = None
        super(EntityName, self).__init__(self._clean)
It might be called like this:
EntityName("Guy DeFalt")
When porting this to Python 3 the above code fails because unicode is no longer a class you can extend (at least, if there is an equivalent class I cannot find it). Given that str is unicode now, I tried to just swap str in, but the parent __init__ doesn't take the string value I'm trying to pass:
TypeError: object.__init__() takes no parameters
This makes sense because str does not have an __init__ method - this does not seem to be an idiomatic way of using this class. So my question has two major branches:
Is there a better way to be porting classes that sub-classed the old unicode class?
If subclassing str is appropriate, how should I modify the __init__ function for idiomatic behavior?
The right way to subclass a string or another immutable class in Python 3 is the same as in Python 2:
class MyString(str):
    def __new__(cls, initial_arguments): # no staticmethod
        desired_string_value = get_desired_string_value(initial_arguments)
        return super(MyString, cls).__new__(cls, desired_string_value)
        # can be shortened to super().__new__(...)
    def __init__(self, initial_arguments): # arguments are unused
        self.whatever = whatever(self)
        # no need to call super().__init__(), but if you must, do not pass arguments
There are several issues with your sample. First, why is __new__ decorated with @staticmethod? __new__ is already special-cased as a static method that receives the class as its first argument, so you don't need to specify this. Second, the code seems to operate under the assumption that when you call __new__ of the superclass, it somehow calls your __init__ as well. I'm deriving this from looking at how self._clean is supposed to be set. This is not the case. When you call MyString(arguments), the following happens:
First Python calls __new__ with the class parameter (usually called cls) and arguments. __new__ must return the class instance. To do this it can create it, as we do, or do something else; e.g. it may return an existing one or, in fact, anything.
Then Python calls __init__ with the instance it received from __new__ (this parameter is usually called self) and the same arguments.
(There's a special case: Python won't call __init__ if __new__ returned something that is not an instance of the passed class or of one of its subclasses.)
Python uses class hierarchy to see which __new__ and __init__ to call. It's up to you to correctly sort out the arguments and use proper superclass calls in these two methods.
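A small sketch of that sequence for a str subclass follows. The class name EntityLabel and its attributes are invented for illustration and are not the asker's real helpers: the string value is fixed in __new__, and __init__ only attaches extra attributes.

```python
class EntityLabel(str):
    def __new__(cls, raw):
        # str is immutable, so the final value has to be chosen here
        return super().__new__(cls, raw.strip())

    def __init__(self, raw):
        # self already holds the cleaned value; raw is the original argument.
        # No arguments are forwarded to the parent initializer.
        self.raw = raw
        self.is_all_caps = self.isupper()


label = EntityLabel("  GUY DEFALT  ")
print(repr(label), repr(label.raw), label.is_all_caps)
```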

Accessing the parameters of a constructor from a metaclass

TL;DR -
I have a class that uses a metaclass.
I would like to access the parameters of the object's constructor from the metaclass, just before the initialization process, but I couldn't find a way to access those parameters.
How can I access the constructor's parameters from the metaclass function __new__?
In order to practice the use of metaclasses in python, I would like to create a class that would be used as the supercomputer "Deep Thought" from the book "The Hitchhiker's Guide to the Galaxy".
The purpose of my class would be to store the various queries the supercomputer gets from users.
At the bottom line, it would just get some arguments and store them.
If one of the given arguments is number 42 or the string "The answer to life, the universe, and everything", I don't want to create a new object but rather return a pointer to an existing object.
The idea behind this is that those objects would be the exact same so when using the is operator to compare those two, the result would be true.
In order to be able to use the is operator and get True as an answer, I would need to make sure those variables point to the same object. So, in order to return a pointer to an existing object, I need to intervene in the middle of the initialization process of the object. I cannot check the given arguments at the constructor itself and modify the object's inner-variables accordingly because it would be too late: If I check the given parameters only as part of the __init__ function, those two objects would be allocated on different portions of the memory (they might be equal but won't return True when using the is operator).
I thought of doing something like that:
class SuperComputer(type):
    answer = 42
    def __new__(meta, name, bases, attributes):
        # Check if args contains the number "42"
        # or has the string "The answer to life, the universe, and everything"
        # If so, just return a pointer to an existing object:
        return SuperComputer.answer
        # Else, just create the object as it is:
        return super(SuperComputer, meta).__new__(meta, name, bases, attributes)
class Query(object):
    __metaclass__ = SuperComputer
    def __init__(self, *args, **kwargs):
        self.args = args
        for key, value in kwargs.items():
            setattr(self, key, value)
def main():
    number = Query(42)
    string = Query("The answer to life, the universe, and everything")
    other = Query("Sunny", "Sunday", 123)
    num2 = Query(45)
    print number is string # Should print True
    print other is string # Should print False
    print number is num2 # Should print False
if __name__ == '__main__':
    main()
But I'm stuck on getting the parameters from the constructor.
I saw that the __new__ method gets only four arguments:
The metaclass instance itself, the name of the class, its bases, and its attributes.
How can I send the parameters from the constructor to the metaclass?
What can I do in order to achieve my goal?
You don't need a metaclass for that.
The fact is that __init__ is not the "constructor" of an object in Python; rather, it is commonly called an "initializer". __new__ is closer to the role of a "constructor" in other languages, and it is not available only on metaclasses: all classes have a __new__ method. If it is not explicitly implemented, the inherited object.__new__ is called.
And actually, it is object.__new__ which creates a new object in Python. From pure Python code, there is no other possible way to create an object: it will always go through there. That means that if you implement the __new__ method on your own class, you have the option of not creating a new instance, and instead return another pre-existing instance of the same class (or any other object).
You only have to keep in mind that: if __new__ returns an instance of the same class, then the default behavior is that __init__ is called on the same instance. Otherwise, __init__ is not called.
It is also worth noting that in recent years a recipe for creating "singletons" in Python using metaclasses became popular. It is actually an overkill approach, as overriding __new__ is enough, and preferable, for creating singletons.
In your case, you just need to have a dictionary with the parameters you want to track as your keys, and check if you create a new instance or "recycle" one whenever __new__ runs. The dictionary may be a class attribute, or a global variable at module level - that is your pick:
class Recycler:
    _instances = {}
    def __new__(cls, parameter1, *args, **kwargs):
        if parameter1 in cls._instances:
            return cls._instances[parameter1]
        self = super().__new__(cls) # don't pass remaining parameters to object.__new__
        cls._instances[parameter1] = self
        return self
If you would otherwise have any code in __init__, move it to __new__ as well, since __init__ runs again every time an existing instance is returned.
You can have a baseclass with this behavior and have a class hierarchy without needing to re-implement __new__ for every class.
As for a metaclass, none of its methods are called when actually creating a new instance of the classes created with that metaclass. It would only be of use to automatically insert this behavior, by decorating or creating a fresh __new__ method, on classes created with that metaclass. Since this behavior is easier to track, maintain, and overall to combine with other classes just using ordinary inheritance, no need for a metaclass at all.
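Applied to the question's Query class, the idea might look like the sketch below. The ANSWER_KEYS tuple and the _answer attribute are names chosen here for illustration, and the code is only one possible arrangement.

```python
ANSWER_KEYS = (42, "The answer to life, the universe, and everything")


class Query:
    _answer = None  # the single shared instance for "answer" queries

    def __new__(cls, *args, **kwargs):
        if any(arg in ANSWER_KEYS for arg in args):
            if cls._answer is None:
                cls._answer = super().__new__(cls)
            return cls._answer
        return super().__new__(cls)

    def __init__(self, *args, **kwargs):
        # Runs on every call, including when the shared instance is returned
        self.args = args
        for key, value in kwargs.items():
            setattr(self, key, value)


number = Query(42)
string = Query("The answer to life, the universe, and everything")
other = Query("Sunny", "Sunday", 123)
num2 = Query(45)
print(number is string, other is string, number is num2)
```

Because __init__ runs again on the shared instance, number.args ends up holding the arguments of the most recent "answer" query. If that matters, the attribute assignment can be moved into __new__ so that it happens only once.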

What happens when we edit(append, remove...) a list and can we execute actions each time a list is edited

I would like to know if there is a way to create a list that executes some actions each time I call append (or another similar method).
I know that I could create a class that inherits from list and override append, remove, and all the other methods that change the content of the list, but I would like to know if there is another way.
By comparison, if I want to print 'edited' each time I edit an attribute of an object, I do not call print("edited") in every method of that object's class. Instead, I only override __setattr__.
I tried to create my own type that inherits from list and overrides __setattr__, but that doesn't work: when I use myList.append, __setattr__ isn't called. I would like to know what really happens when I use myList.append. Are there magic methods being called that I could override?
I know that the question has already been asked here: What happens when you call `append` on a list?. The answer given is just that there is no answer... I hope that's a mistake.
I don't know if there is an answer to my request, so I will also explain why I'm facing this problem; maybe I can look in another direction to do what I want. I have a class with several attributes. When an attribute is edited, I want to execute some actions. As I explained before, I usually do this by overriding __setattr__. That works fine for most attributes. The problem is lists: if the attribute is used like myClass.myListAttr.append(something), __setattr__ isn't called even though the value of the attribute has changed.
The problem would be the same with dictionaries: methods like pop don't call __setattr__.
If I understand correctly, you would want something like Notify_list that would call some method (argument to the constructor in my implementation) every time a mutating method is called, so you could do something like this:
class Test:
    def __init__(self):
        self.list = Notify_list(self.list_changed)
    def list_changed(self,method):
        print("self.list.{} was called!".format(method))
>>> x = Test()
>>> x.list.append(5)
self.list.append was called!
>>> x.list.extend([1,2,3,4])
self.list.extend was called!
>>> x.list[1] = 6
self.list.__setitem__ was called!
>>> x.list
[5, 6, 2, 3, 4]
The most simple implementation of this would be to create a subclass and override every mutating method:
class Notifying_list(list):
    __slots__ = ("notify",)
    def __init__(self,notifying_method, *args,**kw):
        self.notify = notifying_method
        list.__init__(self,*args,**kw)
    def append(self,*args,**kw):
        self.notify("append")
        return list.append(self,*args,**kw)
    #etc.
This is obviously not very practical, writing the entire definition would be very tedious and very repetitive, so we can create the new subclass dynamically for any given class with functions like the following:
import functools
import types
def notify_wrapper(name,method):
    """wraps a method to call self.notify(name) when called
    used by notifying_type"""
    @functools.wraps(method)
    def wrapper(*args,**kw):
        self = args[0]
        # use object.__getattribute__ instead of self.notify in
        # case __getattribute__ is one of the notifying methods
        # in which case self.notify will raise a RecursionError
        notify = object.__getattribute__(self, "_Notify__notify")
        # I'd think knowing which method was called would be useful
        # you may want to change the arguments to the notify method
        notify(name)
        return method(*args,**kw)
    return wrapper
def notifying_type(cls, notifying_methods="all"):
    """creates a subclass of cls that adds an extra function call when calling certain methods
    The constructor of the subclass will take a callable as the first argument
    and arguments for the original class constructor after that.
    The callable will be called every time any of the methods specified in notifying_methods
    is called on the object, it is passed the name of the method as the only argument
    if notifying_methods is left to the special value 'all' then this uses the function
    get_all_possible_method_names to create wrappers for nearly all methods."""
    if notifying_methods == "all":
        notifying_methods = get_all_possible_method_names(cls)
    def init_for_new_cls(self,notify_method,*args,**kw):
        self._Notify__notify = notify_method
        # forward the remaining arguments to the original initializer,
        # as the docstring promises (works for list and dict)
        cls.__init__(self,*args,**kw)
    namespace = {"__init__":init_for_new_cls,
                 "__slots__":("_Notify__notify",)}
    for name in notifying_methods:
        method = getattr(cls,name) #if this raises an error then you are trying to wrap a method that doesn't exist
        namespace[name] = notify_wrapper(name, method)
    # I figured using the type() constructor was easier than using a metaclass.
    return type("Notify_"+cls.__name__, (cls,), namespace)
unbound_method_or_descriptor = ( types.FunctionType,
                                 type(list.append), #method_descriptor (types.MethodDescriptorType in Python 3.7+)
                                 type(list.__add__),#wrapper_descriptor (types.WrapperDescriptorType in Python 3.7+)
                                 )
def get_all_possible_method_names(cls):
    """generates the names of nearly all methods the given class defines
    three methods are blacklisted: __init__, __new__, and __getattribute__ for these reasons:
    __init__ conflicts with the one defined in notifying_type
    __new__ will not be called with a initialized instance, so there will not be a notify method to use
    __getattribute__ is fine to override, just really annoying in most cases.
    Note that this function may not work correctly in all cases
    it was only tested with very simple classes and the builtin list."""
    blacklist = ("__init__","__new__","__getattribute__")
    for name,attr in vars(cls).items():
        if (name not in blacklist and
            isinstance(attr, unbound_method_or_descriptor)):
            yield name
Once we have notifying_type, creating Notify_list or Notify_dict is as simple as:
import collections.abc
mutating_list_methods = set(dir(collections.abc.MutableSequence)) - set(dir(collections.abc.Sequence))
Notify_list = notifying_type(list, mutating_list_methods)
mutating_dict_methods = set(dir(collections.abc.MutableMapping)) - set(dir(collections.abc.Mapping))
Notify_dict = notifying_type(dict, mutating_dict_methods)
I have not tested this extensively and it quite possibly contains bugs / unhandled corner cases but I do know it worked correctly with list!
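For readers who only need the list case, a more compact and self-contained variant of the same wrapping idea could look like this. It is a sketch, not the implementation above: NotifyingList, MUTATORS and _notifying are names invented here, the list of mutating methods is hand-picked, and the callback is invoked after the change rather than before it.

```python
import functools

# Hand-picked mutating methods of list (an assumption; adjust as needed)
MUTATORS = ("append", "extend", "insert", "remove", "pop", "clear", "sort",
            "reverse", "__setitem__", "__delitem__", "__iadd__", "__imul__")


def _notifying(name):
    original = getattr(list, name)

    @functools.wraps(original)
    def wrapper(self, *args, **kwargs):
        result = original(self, *args, **kwargs)
        self._callback(name)  # notify once the change has been applied
        return result
    return wrapper


class NotifyingList(list):
    def __init__(self, callback, iterable=()):
        self._callback = callback
        super().__init__(iterable)


# Install one wrapper per mutating method on the subclass
for _name in MUTATORS:
    setattr(NotifyingList, _name, _notifying(_name))

events = []
items = NotifyingList(events.append, [1, 2, 3])
items.append(4)
items[0] = 10
items += [5]
print(events, items)
```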

understanding instance object in reference to self convention in __init__(self) function when defining class

New to Python, trying to understand exactly what the self in the __init__(self) function is referring to.
A few tutorials I'm working with describe self as
referring to the instance whose method was called.
Which is not exactly a trivial statement for someone new to OOP.
I've been reading a lot about the whole backstory as to why you have to actually include an explicit self in Python, but need a simple explanation as to what it means to say that self is used to refer to the instance object ——> Does that mean that self is actually referring to the object that is the class itself you've just created? In other words, self somehow "boots up" the class in memory as an object?
Your second-last sentence is correct, but the last sentence is not. It has nothing to do with "booting up" or creating the object at all - the object already exists by that point.
I think you are missing the fact that self is used in all methods, not just __init__, to refer to the specific object that the method belongs to.
For instance, if you had a simple object with a name property, and a method called print_name, it might look like this:
def print_name(self):
    print(self.name)
So here the method is using self to refer to the properties of the object it has been called on.
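A slightly fuller sketch of the same idea (the Person class and the get_name method are illustrative additions) shows that calling a method on an instance is equivalent to calling the function on the class and passing the instance as self:

```python
class Person:
    def __init__(self, name):
        self.name = name

    def get_name(self):
        return self.name

    def print_name(self):
        print(self.name)


alice = Person("Alice")
bob = Person("Bob")

# alice.get_name() is shorthand for Person.get_name(alice)
same = alice.get_name() == Person.get_name(alice)
alice.print_name()
bob.print_name()
```

Each instance carries its own name, and self is simply whichever instance the method was called on.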
When objects are instantiated, the object itself is passed into the self parameter.
Because of this, the object’s data is bound to the object. Below is an example of how you might like to visualize what each object’s data might look like. Notice how ‘self’ is replaced with the object's name. I'm not saying this example diagram below is wholly accurate, but it hopefully will serve a purpose in visualizing the use of self.
EDIT (due to further question: Could you explain why exactly when objects are instantiated, the object itself is passed into the self parameter?)
The Object is passed into the self parameter so that the object can keep hold of its own data.
Although this may not be wholly accurate, think of the process of instantiating an object like this: When an object is made it uses the class as a template for its own data and methods. Without passing its own name into the self parameter, the attributes and methods in the class would remain a general template and would not belong to the object. So passing the object's name into the self parameter means that if 100 objects are instantiated from the one class, they can all keep track of their own data and methods.
See the illustration below:
Every member function of a class, including the constructor (__init__) is invoked for a certain instance (object) of that class. Member functions have to be able to access the object for which they are called.
So e.g. in a.f(), f() has to have access to a. In f, defined as f (this), this refers to a.
The special thing for a constructor is that there is no object "before the dot" yet, because precisely that object is being constructed. So this refers to the object "just being constructed" in that case.
When you write myClass(), python first creates an instance of your class, then immediately calls __init__() passing this object as the argument. self is a defined object in memory by the time you call __init__().
Behind the scenes, object construction is actually quite complicated.
Classes are objects too, and the type of a class is type (or a subclass, if using metaclasses). type has a __call__ method that is responsible for constructing instances. It works something like:
class type:
    def __call__(cls, *args, **kwargs):
        self = cls.__new__(cls, *args, **kwargs)
        if isinstance(self, cls):
            cls.__init__(self, *args, **kwargs)
        return self
Note, the above is for demonstrative purposes only.
Remember that, if a function is not defined on a class itself, it is looked up on its parent (as controlled by the mro), and usually.
Ultimately, __new__ must either call object.__new__(cls) to allocate a new instance of a class cls, or else return an existing object. If the existing object is of a different class, __init__ will not be called. Note that if it returns an existing object of the right class (or a subclass), __init__ will be called more than once. For such classes, all of the work is usually done in __new__.
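The repeated __init__ call can be observed with a small sketch (the Cached class and its init_calls counter are hypothetical names used only for this illustration):

```python
class Cached:
    _instance = None
    init_calls = 0

    def __new__(cls):
        # Allocate once, then keep returning the same object
        if cls._instance is None:
            cls._instance = object.__new__(cls)
        return cls._instance

    def __init__(self):
        # Runs on every Cached() call because the returned object is a Cached
        type(self).init_calls += 1


first = Cached()
second = Cached()
print(first is second, Cached.init_calls)
```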
Chances are you'll never use any of this, but it might help you understand what's going on behind the scenes.
Simply, it means you are referring to a method or variable that is local to the object.
You can look at self as a reference, or pointer, to the object's internals, through which you can invoke methods or add, update, and delete attributes. An object is a somewhat isolated thing with its own representation of the data given to it. So basically, self is just an explicitly declared argument that gives you access to those internals. Some programming languages do not require you to declare it explicitly, and some use this instead (like C++). Take a look here:
a = 1
b = 2
class test(object):
    def __init__(self,a,b):
        self.a = a + 1
        self.b = b + 1
    def show_internals(self):
        print(self.a, '\t', self.b)
    def change_internals(self,a,b):
        self.a = a
        self.b = b
_my_class = test(3,4)
print(a, b)
_my_class.show_internals()
_my_class.change_internals(5,6)
_my_class.show_internals()
print(a, b)
the result is :
1 2
4 5
5 6
1 2
As you can see, with using self you can manipulate the data within the object itself. Otherwise you would end up editing global variables.
