Python __init__ Compared to C++ Constructor

I have worked with Python for about 4 years and have recently started learning C++. In C++ you create a constructor method for each class. I was wondering if it is correct to think that this is equivalent to the __init__(self) function in Python? Are there any notable differences? Same question for a C++ destructor method vs. Python __exit__(self).

Yes, Python's __init__ is analogous to C++'s constructor. Both are typically where non-static data members are initialized. In both languages, these functions receive the object under construction as the first argument: explicit and named self by convention in Python, implicit and named this by the language in C++. In both languages, these functions return no value. One notable difference between the languages is that in Python a base-class __init__ must be called explicitly from a derived-class __init__, whereas in C++ the base-class constructor call is implicit and automatic. C++ also has ways to initialize data members outside the body of the constructor, both through member initializer lists and through default (non-static data) member initializers. C++ will also generate a default constructor for you in some circumstances.
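To illustrate the explicit base-class call mentioned above, here is a minimal sketch. The class names Base, Forgetful and Careful are hypothetical and only serve the example.

```python
class Base:
    def __init__(self):
        self.base_ready = True


class Forgetful(Base):
    def __init__(self):
        # Base.__init__ is never called here, so base_ready is never set.
        self.name = "forgetful"


class Careful(Base):
    def __init__(self):
        # Explicit call; a C++ base-class constructor would run automatically.
        super().__init__()
        self.name = "careful"


forgetful = Forgetful()
careful = Careful()
print(hasattr(forgetful, "base_ready"))  # False
print(hasattr(careful, "base_ready"))    # True
```

If a derived class does not define __init__ at all, the inherited Base.__init__ runs as usual; the explicit call is only needed once the derived class overrides it.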
Python's __new__ is analogous to C++'s class-level operator new. Both are static class functions which must return a value for the creation to proceed. In C++, that value is a pointer to raw memory; in Python, it is a not-yet-initialized instance of the class being created.
Python's __del__ has no direct analogue in C++. It is an object finalizer, a concept that also exists in other garbage-collected languages like Java. It is not called at a lexically predetermined time; instead, the runtime calls it when it is time to deallocate the object.
__exit__ plays a role similar to C++'s destructor, in that it can provide deterministic cleanup at a lexically predetermined point. In C++, this tends to be done through the destructor of an RAII type. In Python, the same object can have __enter__ and __exit__ called multiple times. In C++, that would be accomplished with the constructor and destructor of a separate RAII resource-holding type. For example, in Python, given an instance lock of a mutual exclusion lock type, one can write with lock: to introduce a critical section. In C++, we create an instance of a different type that takes the lock as a parameter, std::lock_guard g{lock}, to accomplish the same thing. The Python __enter__ and __exit__ calls map to the constructor and destructor of the C++ RAII type.
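The following sketch shows the with lock: pattern described above, and that one object can be entered and exited more than once. threading.Lock is the standard-library lock; the Tracer class is a hypothetical helper invented for this example.

```python
import threading


class Tracer:
    """Hypothetical context manager that records its enter/exit calls."""

    def __init__(self):
        self.events = []

    def __enter__(self):
        self.events.append("enter")
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        self.events.append("exit")
        return False  # do not suppress exceptions


lock = threading.Lock()
with lock:                      # roughly std::lock_guard g{lock}; in C++
    held_inside = lock.locked()
held_after = lock.locked()      # released at the end of the with block

tracer = Tracer()
with tracer:
    pass
with tracer:                    # the same object is entered a second time
    pass

print(held_inside, held_after)  # True False
print(tracer.events)            # ['enter', 'exit', 'enter', 'exit']
```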

The best you can say is that __init__ and a C++ constructor are called at roughly the same point in the lifetime of a new object, and that __del__ and a C++ destructor are also called near the end of the lifetime of an object. The semantics, however, are markedly different, and the execution model of each language makes further comparison more difficult.
Suffice it to say that __init__ is used to initialize an object after it has been created. __del__ is like a destructor that may be called at some unspecified point in time after the last reference to an object goes away, and __exit__ is more like a callback invoked at the end of a with statement, whether or not the object's reference count reaches zero.

I was wondering if it is correct to think that this is equivalent to
the __init__(self) function in Python?
No. You can tell just by looking at the method's signature: self is a reference to the instance, so the instance must already have been constructed before __init__ is called.
See this for more information (__new__ is actually what you're looking for)
Same question for a C++ destructor method vs. Python __exit__(self)
No. __exit__ only exits the runtime context related to the object. In this case, what you are really looking for is __del__.
See this, which clearly states:
Called when the instance is about to be destroyed. This is also called
a destructor.

Related

How does python interpreter build objects [duplicate]

I recently started learning Python after studying Java. I'm confused by the way the Python interpreter constructs objects.
In Java, we construct an object by simply providing our arguments, and that is the case in Python too.
But I can't see why the __init__() method requires a self parameter when we define it in our class.
I read this question and understood that methods require a self parameter because Python calls a method in the form ClassA.methodA(ObjectA, arg1, arg2).
But I really don't get why the __init__() method requires this.
Is it because the way Python creates an object differs from the way Java creates one?
I would really appreciate it if someone could explain this to me.

Why must ‘self’ be used explicitly in method definitions and calls?
The idea was borrowed from Modula-3. It turns out to be very useful, for a variety of reasons.
First, it’s more obvious that you are using a method or instance attribute instead of a local variable. Reading self.x or self.meth() makes it absolutely clear that an instance variable or method is used even if you don’t know the class definition by heart. In C++, you can sort of tell by the lack of a local variable declaration (assuming globals are rare or easily recognizable) – but in Python, there are no local variable declarations, so you’d have to look up the class definition to be sure. Some C++ and Java coding standards call for instance attributes to have an m_ prefix, so this explicitness is still useful in those languages, too.
Second, it means that no special syntax is necessary if you want to explicitly reference or call the method from a particular class. In C++, if you want to use a method from a base class which is overridden in a derived class, you have to use the :: operator – in Python you can write baseclass.methodname(self, <argument list>). This is particularly useful for __init__() methods, and in general in cases where a derived class method wants to extend the base class method of the same name and thus has to call the base class method somehow.
Finally, for instance variables it solves a syntactic problem with assignment: since local variables in Python are (by definition!) those variables to which a value is assigned in a function body (and that aren’t explicitly declared global), there has to be some way to tell the interpreter that an assignment was meant to assign to an instance variable instead of to a local variable, and it should preferably be syntactic (for efficiency reasons). C++ does this through declarations, but Python doesn’t have declarations and it would be a pity having to introduce them just for this purpose. Using the explicit self.var solves this nicely. Similarly, for using instance variables, having to write self.var means that references to unqualified names inside a method don’t have to search the instance’s directories. To put it another way, local variables and instance variables live in two different namespaces, and you need to tell Python which namespace to use.
reference: https://docs.python.org/3/faq/design.html#why-must-self-be-used-explicitly-in-method-definitions-and-calls
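As a small illustration of the FAQ's second point, and of the ClassA.methodA(ObjectA, arg1, arg2) call form mentioned in the question, here is a sketch. Base, Derived and greet are hypothetical names chosen for the example.

```python
class Base:
    def __init__(self, name):
        self.name = name

    def greet(self, greeting):
        return f"{greeting}, {self.name}"


class Derived(Base):
    def __init__(self, name, extra):
        # Explicit call through the class: baseclass.methodname(self, ...)
        Base.__init__(self, name)
        self.extra = extra

    def greet(self, greeting):
        # Extend the base-class method of the same name.
        return Base.greet(self, greeting) + "!"


d = Derived("Ada", 1)
bound = d.greet("Hello")             # instance is passed as self implicitly
unbound = Derived.greet(d, "Hello")  # same call with self passed explicitly
print(bound)             # Hello, Ada!
print(bound == unbound)  # True
```

__init__ follows the same rule as any other method: by the time it runs, the instance already exists and is passed in as self.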

When (and why) was Python `__new__()` introduced?

When (and why) was the Python __new__() function introduced?
There are three steps in creating an instance of a class, e.g. MyClass():
MyClass.__call__() is called. This method must be defined in the metaclass of MyClass.
MyClass.__new__() is called (by __call__). Defined on MyClass itself. This creates the instance.
MyClass.__init__() is called (also by __call__). This initializes the instance.
Creation of the instance can be influenced by overriding either __call__ or __new__. There is usually little reason to override __call__ instead of __new__ (see e.g. Using the __call__ method of a metaclass instead of __new__?).
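The three steps listed above can be observed with a short sketch. Meta and the calls list are hypothetical names introduced only to record the order of events.

```python
calls = []


class Meta(type):
    def __call__(cls, *args, **kwargs):
        calls.append("Meta.__call__")
        # type.__call__ runs cls.__new__ and then cls.__init__.
        return super().__call__(*args, **kwargs)


class MyClass(metaclass=Meta):
    def __new__(cls, *args, **kwargs):
        calls.append("MyClass.__new__")
        return super().__new__(cls)

    def __init__(self, value):
        calls.append("MyClass.__init__")
        self.value = value


obj = MyClass(42)
print(calls)  # ['Meta.__call__', 'MyClass.__new__', 'MyClass.__init__']
```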
We have some old code (still running strong!) where __call__ is overridden. The reason given was that __new__ was not available at the time. So I tried to learn more about the history of both Python and our code, but I could not figure out when __new__ was introduced.
__new__ appears in the documentation for Python 2.4 and not in that for Python 2.3, but it does not appear in the "What's New" document of any of the Python 2 versions. The first commit introducing __new__ that I could find (Merge of descr-branch back into trunk.) is from 2001, but the 'back into trunk' message indicates that there was something before it. PEP 252 (Making Types Look More Like Classes) and PEP 253 (Subtyping Built-in Types) from a few months earlier seem to be relevant.
Learning more about the introduction of __new__ would teach us more about why Python is the way it is.
Edit for clarification:
It seems that class.__new__ duplicates functionality that is already provided by metaclass.__call__. It seems un-Pythonic to add a method only to replicate existing functionality in a better way.
__new__ is one of the few class methods that you get out of the box (i.e. with cls as first argument), thereby introducing complexity that wasn't there before. If the class is the first argument of a function, then it can be argued that the function should be a normal method of the metaclass. But that method did already exist: __call__(). I feel like I'm missing something.
There should be one-- and preferably only one --obvious way to do it.

The blog post The Inside Story on New-Style Classes
(from the aptly named http://python-history.blogspot.com) written by Guido van Rossum (Python's BDFL) provides some good information regarding this subject.
Some relevant quotes:
New-style classes introduced a new class method __new__() that lets
the class author customize how new class instances are created. By
overriding __new__() a class author can implement patterns like the
Singleton Pattern, return a previously created instance (e.g., from a
free list), or to return an instance of a different class (e.g., a
subclass). However, the use of __new__ has other important
applications. For example, in the pickle module, __new__ is used to
create instances when unserializing objects. In this case, instances
are created, but the __init__ method is not invoked.
Another use of __new__ is to help with the subclassing of immutable
types. By the nature of their immutability, these kinds of objects can
not be initialized through a standard __init__() method. Instead, any
kind of special initialization must be performed as the object is
created; for instance, if the class wanted to modify the value being
stored in the immutable object, the __new__ method can do this by
passing the modified value to the base class __new__ method.
You can read the entire post for more information on this subject.
Another post about New-style Classes which was written along with the above quoted post has some additional information.
Edit:
In response to OP's edit and the quote from the Zen of Python, I would say this.
The Zen of Python was not written by the creator of the language but by Tim Peters, and it was only published on August 19, 2004. We have to take into account the fact that __new__ appears only in the documentation of Python 2.4 (which was released on November 30, 2004), and this particular guideline (or aphorism) did not even exist publicly when __new__ was introduced into the language.
Even if such a document of guidelines existed informally before, I do not think that the author(s) intended them to be misinterpreted as a design document for an entire language and ecosystem.

I will not explain the history of __new__ here, because I have only used Python since 2005, after it was introduced into the language. But here is the rationale behind it.
The normal configuration method for a new object is the __init__ method of its class. The object has already been created (usually via an indirect call to object.__new__) and the method just initializes it. Simply put, if you have a truly immutable object, that is too late.
In that use case the Pythonic way is the __new__ method, which builds and returns the new object. The nice point is that it is still part of the class definition and does not require a specific metaclass. The standard documentation states:
__new__() is intended mainly to allow subclasses of immutable types (like int, str, or tuple) to customize instance creation. It is also commonly overridden in custom metaclasses in order to customize class creation.
Defining a __call__ method on the metaclass is indeed allowed but is IMHO not Pythonic, because __new__ should be enough. In addition, __init__, __new__ and metaclasses each dig deeper into the internal Python machinery. So the rule should be: do not use __new__ if __init__ is enough, and do not use metaclasses if __new__ is enough.
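Here is a minimal sketch of the immutable-type case described above, using str as the base. UpperStr and LateUpper are hypothetical classes written for this illustration.

```python
class UpperStr(str):
    def __new__(cls, value):
        # The value of an immutable str must be decided at creation time.
        return super().__new__(cls, value.upper())


class LateUpper(str):
    def __init__(self, value):
        # Too late: the string already exists with its final value.
        # Rebinding the local name changes nothing on the instance.
        value = value.upper()


print(UpperStr("abc"))   # ABC
print(LateUpper("abc"))  # abc
```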

Is it conventional to say that functions are called and methods are invoked?

I’m reading Think Python: How to Think Like a Computer Scientist. The author uses “invoke” with methods and “call” with functions.
Is it a convention? And, if so, why is this distinction made? Why are functions said to be called, but methods are said to be invoked?

Not really. Maybe it is easier for new readers if an explicit distinction is made, so that they understand that the two are invoked slightly differently. At least that's why I suspect the author might have chosen different wording for each.
There doesn't seem to be a convention that dictates this in the Reference Manual for the Python language. What I see them doing is choosing invoke when the call made to a function is implicit rather than explicit.
For example, in the Callables section of the Standard Type Hierarchy you see:
[..] When an instance method object is called, the underlying function (__func__) is called, inserting the class instance (__self__) in front of the argument list. [...]
(Emphasis mine) Explicit call
Further down in Basic Customization and specifically for __new__ you can see:
Called to create a new instance of class cls. __new__() is a static method [...]
(Emphasis mine) Explicit call
While just a couple of sentences later you'll see how invoked is used because __new__ implicitly calls __init__:
If __new__() does not return an instance of cls, then the new instance’s __init__() method will not be invoked.
(Emphasis mine) Implicitly called
So no, no convention seems to be used, at least by the creators of the language. Simple is better than complex, I guess :-).
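The documented behaviour quoted above, where __init__ is skipped when __new__ does not return an instance of cls, can be checked with a short sketch. Both classes and the init_calls list are hypothetical.

```python
init_calls = []


class ReturnsInstance:
    def __new__(cls):
        return super().__new__(cls)

    def __init__(self):
        init_calls.append("ReturnsInstance")


class ReturnsOther:
    def __new__(cls):
        return 42  # not an instance of cls

    def __init__(self):
        init_calls.append("ReturnsOther")


a = ReturnsInstance()
b = ReturnsOther()
print(b)           # 42
print(init_calls)  # ['ReturnsInstance']
```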

One good source for this would be the Python documentation. A simple text search through the section on Classes reveals the word "call" being used many times in reference to "calling methods", and the word "invoke" being used only once.
In my experience, the same is true: I regularly hear "call" used in reference to methods and functions, while I rarely hear "invoke" for either. However, I assume this is mainly a matter of personal preference and/or context (is the setting informal?, academic?, etc.).
You will also see places in the documentation where the word "invoke" is used in reference to functions:
void Py_FatalError(const char *message)
Print a fatal error message
and kill the process. No cleanup is performed. This function should
only be invoked when a condition is detected that would make it
dangerous to continue using the Python interpreter; e.g., when the
object administration appears to be corrupted. On Unix, the standard C
library function abort() is called which will attempt to produce a
core file.
And from here:
void Py_DECREF(PyObject *o)
Decrement the reference count for object o. The object must not be NULL; if you aren’t sure that it isn’t NULL,
use Py_XDECREF(). If the reference count reaches zero, the object’s
type’s deallocation function (which must not be NULL) is invoked.
Both of these references are from the Python C API, though, which may be significant.
To summarize:
I think it is safe to use either "invoke" or "call" in the context of functions or methods without sounding either like a noob or a showoff.
Note that I speak only of Python, and what I know from my own experience. I cannot speak to the difference between these terms in other languages.

Is __del__ really a destructor?

I work mostly in C++, where the destructor method is really meant for releasing an acquired resource. Recently I started with Python (which is really fun and fantastic), and I learned that it has a GC like Java.
Thus, there is no heavy emphasis on object ownership (construction and destruction).
As far as I've learned, the __init__() method makes more sense to me in Python than its counterpart does in Ruby. But what about the __del__() method: do we really need to implement this built-in function in our classes? Will my class lack something if I omit __del__()? The one scenario where I could see __del__() being useful is if I want to log something when an object is destroyed. Is there anything other than this?

In the Python 3 docs the developers have now made clear that destructor is in fact not the appropriate name for the method __del__.
object.__del__(self)
Called when the instance is about to be destroyed. This is also called a finalizer or (improperly) a destructor.
Note that the OLD Python 3 docs used to suggest that 'destructor' was the proper name:
object.__del__(self)
Called when the instance is about to be destroyed. This is also called a destructor. If a base class has a __del__() method, the derived class’s __del__() method, if any, must explicitly call it to ensure proper deletion of the base class part of the instance.
From other answers but also from the Wikipedia:
In a language with an automatic garbage collection mechanism, it would be difficult to deterministically ensure the invocation of a destructor, and hence these languages are generally considered unsuitable for RAII [Resource Acquisition Is Initialization]
So you should almost never implement __del__, but it is available if you need it in some (rare?) use cases.

As the other answers have already pointed out, you probably shouldn't implement __del__ in Python. If you find yourself in the situation thinking you'd really need a destructor (for example if your class wraps a resource that needs to be explicitly closed) then the Pythonic way to go is using context managers.
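A minimal sketch of that approach follows. Connection is a hypothetical resource wrapper in which a plain flag stands in for a real resource; the point is that __exit__ runs even when the block raises.

```python
class Connection:
    """Hypothetical resource wrapper; a flag stands in for the real resource."""

    def __init__(self):
        self.open = False

    def __enter__(self):
        self.open = True
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        self.open = False  # cleanup runs whether or not an exception occurred
        return False       # let the exception propagate


conn = Connection()
try:
    with conn:
        was_open = conn.open
        raise RuntimeError("simulated failure")
except RuntimeError:
    pass

print(was_open)   # True
print(conn.open)  # False
```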

Is __del__ really a destructor?
No, the __del__ method is not a destructor. It is just a normal method that you can call whenever you want to perform some operation, but it is also called automatically before the garbage collector destroys the object.
Think of it as a cleanup or "last will" method.

It is so uncommon that I only learned about it today (and I have been using Python for a long time).
Memory is deallocated, files are closed, and so on by the GC. But you may need to perform some task with effects outside of the class.
My use case is implementing a sort of RAII for some temporary directories. I'd like them to be removed no matter what.
Instead of removing the directory after the processing (a step which, after some changes, was no longer run), I moved the removal to the __del__ method, and it works as expected.
This is a very specific case, where we don't really care about when the method is called, as long as it's called before leaving the program. So, use with care.
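For comparison only, and not the approach used in the answer above, here is a sketch of temporary-directory cleanup with the standard library's tempfile.TemporaryDirectory used as a context manager. The file name scratch.txt is made up for the example.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    scratch = os.path.join(tmp_dir, "scratch.txt")
    with open(scratch, "w") as fh:
        fh.write("intermediate data")
    existed_inside = os.path.isdir(tmp_dir)

# The directory and its contents are removed when the with block ends.
removed_after = not os.path.exists(tmp_dir)
print(existed_inside, removed_after)  # True True
```

This ties the removal to the end of the with block rather than to whenever __del__ happens to run.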

Destructor in metaclass Singleton object

I'm modifying a legacy library that uses the singleton pattern through the metaclass approach.
The Singleton class, inheriting from type, defines the __call__ function.
Right now, my singleton objects using this library are never deleted. I defined the __del__ method in the singleton classes, and that function is never called.
Clarification: I have implemented one (meta)class named Singleton, which is used by several classes that set Singleton as their __metaclass__.
For example, I have class A(object), which has __metaclass__ = Singleton. The A class has several members that I want to be destroyed when my program ends and the A object (the only one that can exist) is destroyed.
I tried defining a __del__ method in the A class, but it doesn't work.

Point 1: __del__() may not be called at process exit
The first thing to say is that
It is not guaranteed that __del__() methods are called for objects that still exist when the interpreter exits.
This comes from the Python data model docs. Therefore you should not rely on it to tidy up state that needs tidying up at exit, and at the highest level, that's why your __del__() may not be being called. That's what atexit is for.
Point 2: predictable object lifetimes are an implementation detail in Python
The next thing to say is that CPython uses reference counting, which lets it detect that an object can be released without having to use the garbage collector (leading to more predictable CPU impact and potentially more efficient applications). However, it only takes one circular reference, one uncleared exception, one forgotten closure or one different Python implementation to break that predictability, so you should think really, really hard about whether you want to rely on __del__() being called at a particular point.
Point 3: Singleton implementations generally maintain a global reference to the singleton instance
By the sound of it, I would guess your singleton metaclass (itself a singleton...) retains your singleton instance the first time __call__() is called. The metaclass is never released, because it belongs to the module, which is itself retained by sys.modules. That reference is therefore not going to go away before the program terminates, so even if all external references to the singleton were guaranteed to be released promptly, your __del__() would not get called.
What you could try
Add an atexit handler when you create your singleton instance to do your necessary tidy up at process exit.
Also do that tidy up in the __del__() method if you want. E.g, you may decide for neatness / future extensibility (e.g. pluralizing the singleton) that you would like the singleton instance to tidy up after itself when it is no longer being used.
And if you implement a __del__() method expecting to want tidy up to be done during normal program execution, you will probably want to remove the atexit handler too.
If you would like your singleton to be cleaned up when no one is using it anymore, consider storing it on your metaclass using weakref so that you don't retain it yourself.
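The atexit suggestion above could look roughly like the sketch below. It assumes Python 3 metaclass syntax (the question uses the Python 2 __metaclass__ attribute) and assumes each singleton class provides a close() method; the _instances dictionary and close() are names invented for this example, not part of the original library.

```python
import atexit


class Singleton(type):
    _instances = {}  # strong references: instances live until interpreter exit

    def __call__(cls, *args, **kwargs):
        if cls not in Singleton._instances:
            instance = super().__call__(*args, **kwargs)
            Singleton._instances[cls] = instance
            # Assumption: the class defines close(); run it at process exit.
            atexit.register(instance.close)
        return Singleton._instances[cls]


class A(metaclass=Singleton):
    def __init__(self):
        self.closed = False

    def close(self):
        # Idempotent, so calling it manually and again at exit is harmless.
        if not self.closed:
            self.closed = True


first = A()
second = A()
print(first is second)  # True
```

If cleanup is also done earlier during normal execution, atexit.unregister(instance.close) can be used to remove the handler, as the answer suggests.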
