Why do functions/methods in python need self as parameter? [duplicate]


This question already has answers here:
What is the purpose of the `self` parameter? Why is it needed?
(26 answers)
Closed 8 years ago.
I can understand why it is needed for local variables (self.x), but why is it necessary as a parameter in a function? Is there something else you could put there instead of self?
Please explain in as much layman terms as possible, I never had decent programming education.

By default, every function declared in the namespace of a class assumes that its first argument will be a reference to an instance of that class. (Other types of functions are decorated with @classmethod and @staticmethod to change this assumption.) Such functions are called methods. By convention, Python programmers name that first parameter self, but Python doesn't care what you call it. When a method is called, you must provide that reference. For example (with self replaced by foobar to demonstrate that self is not the required name):
class A:
    def __init__(foobar):
        foobar.x = 5
    def somefunc(foobar, y):
        print(foobar.x + y)
a = A()
A.somefunc(a, 3)  # Prints 8
Python provides some syntactic sugar to make the link between an object and a method called on it more obvious, by letting you call a bound method instead of the function itself. That is, a.somefunc(3) and A.somefunc(a, 3) are equivalent. In Python terminology, A.somefunc is an unbound method, since it still needs an instance when it is called:
f = A.somefunc
f(a, 3)  # Prints 8
By contrast, a.somefunc is called a bound method, since you have already provided the instance to use as the first argument:
f = a.somefunc
f(3)  # Prints 8
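The equivalence between the two call styles can be checked directly. This is a minimal sketch in Python 3 syntax (where A.somefunc is a plain function rather than an "unbound method"). The class mirrors the one above, except that somefunc returns its result instead of printing it, which is an assumption made here so the two results can be compared:

```python
class A:
    def __init__(self):
        self.x = 5

    def somefunc(self, y):
        # Return rather than print, so the two call styles can be compared.
        return self.x + y


a = A()

# Calling through the class: the instance must be passed explicitly.
via_class = A.somefunc(a, 3)

# Calling through the instance: Python supplies `a` as the first argument.
via_instance = a.somefunc(3)

# A bound method remembers the instance it was retrieved from.
bound = a.somefunc
print(via_class, via_instance, bound.__self__ is a)
```

Both calls run the same function; the only difference is who supplies the first argument.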

If you think about it, EVERY programming language does this (or at least the most common ones, such as Pascal, C++ and Java, do), EXCEPT that in most programming languages the this keyword is assumed rather than passed as a parameter. Consider function pointers in those languages: they are different from method pointers.
Pascal:
function(a: Integer): Integer;
vs
function(a: Integer): Integer of object;
The latter takes the self pointer into account (yes, it is named self, but it is an implicit pointer like this in C++, whereas Python's self is explicit).
C++:
typedef int (*mytype)(int a);
vs
typedef int (Anyclass::*mytype)(int a);
Unlike Pascal, C++ requires you to specify the class that owns the method. Either way, the method pointer declaration shows the difference between a function that expects a this and one that does not.
But Python takes its Zen seriously, just as the Quichua people take their Ama Suway, Ama Llullay, Ama K'ellay seriously:
Explicit is better than implicit.
So that is why you explicitly see (and must write, of course) the self parameter for instance methods and for @classmethods (although there it is usually called cls, since its purpose is to identify the class dynamically rather than the instance). Python does not assume that a this or self keyword must exist inside methods (so the namespace contains only true variables; remember that you are NOT forced to name them self or cls, although it is usual and expected).
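A short sketch of the three kinds of first argument mentioned here, in Python 3 syntax. The class Counter and its method names are hypothetical, chosen only for illustration:

```python
class Counter:
    instances = 0

    def __init__(self):
        # `self` is the instance being initialised.
        type(self).instances += 1

    @classmethod
    def created(cls):
        # `cls` is the class the method was called on, not an instance.
        return cls.instances

    @staticmethod
    def describe():
        # No implicit first argument at all.
        return "counts how many instances were created"


Counter()
Counter()
count_via_class = Counter.created()
count_via_instance = Counter().created()  # creates a third instance first
```

In each case the first parameter is an ordinary name in the function's namespace; only the decorator decides what Python passes into it.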
Finally, if you get:
x = AClass.amethod #unbound method
You must call it as
x(aninstance, param, param2, ..., named=param, named2=param2, ...)
While getting it as:
x = anInstance.method #bound method, has `im_self` (`__self__` in Python 3) attribute set to the instance.
must be called as:
x(param, param2, ..., named=param, named2=param2, ...)
Yes, self is explicit in the parameter list because Python does not assume that a keyword or 'backdoor' must exist, but it is absent from the argument list of a bound call because of the syntactic sugar that every OOP language has (weird criteria, huh?).

It's how Python's implementation of object oriented programming works -- a method of an instance (a so-called bound method) is called with that instance as its first argument.
Besides variables of the instance (self.x) you can also use it for anything else, e.g. call another method (self.another_method()), pass this instance as a parameter to something else entirely (mod.some_function(3, self)), or use it to call this method in the superclass of this class (return super(ThisClass, self).this_method()).
You can give it an entirely different name as well. Using pony instead of self will work just as well. But you shouldn't, for obvious reasons.
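The uses of self listed above can be put together in one small sketch. All names here (describe, Base, Child) are hypothetical and exist only for illustration:

```python
def describe(prefix, obj):
    # A hypothetical external function that receives the instance.
    return "{} {}".format(prefix, obj.name)


class Base(object):
    def greet(self):
        return "hello"


class Child(Base):
    def __init__(self, name):
        self.name = name                  # instance variable

    def shout(self):
        return self.greet().upper()       # call another method on the same instance

    def label(self):
        return describe("child:", self)   # pass the instance to another function

    def greet(self):
        # Call the superclass version of this method.
        return super(Child, self).greet() + ", " + self.name


c = Child("pony")
print(c.greet(), "|", c.shout(), "|", c.label())
```

Inside each method, self is just a reference to the instance, so anything you can do with an object you can do with self.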

The use of self for the first reference in a method is entirely convention.
You can call it something else, even inconsistently in the same class:
class Foo(object):
    def __init__(notself, i):
        notself.i = i  # note 'notself' instead of 'self'
    def __str__(self):
        return str(self.i)  # back to the convention
f = Foo(22)
print(f)
But please don't do that. It is confusing to others that may read your code (or yourself when you read it later).

Related

Why method accepts class name and name 'object' as an argument?

Consider the following code. I expected it to generate an error, but it worked. mydef1(self) should only be invoked with an instance of MyClass1 as an argument, but it accepts MyClass1 itself as well as the rather vague object as the instance.
Can someone explain why mydef1 accepts the class name (MyClass1) and object as arguments?
class MyClass1:
    def mydef1(self):
        return "Hello"
print(MyClass1.mydef1(MyClass1))
print(MyClass1.mydef1(object))
Output
Hello
Hello

There are several parts to the answer to your question because your question signals confusion about a few different aspects of Python.
First, type names are not special in Python. They're just another variable. You can even do something like object = 5 and cause all kinds of confusion.
Secondly, the self parameter is just that, a parameter. When you say MyClass1.mydef1 you're asking for the value of the attribute named mydef1 inside the object MyClass1 (which may be a module, a class, or anything else that supports attribute access). You get back a function that takes one argument.
If you had done this:
aVar = MyClass1()
aVar.mydef1(object)
it would've failed. When Python gets a method from an instance of a class, attribute lookup (object.__getattribute__, which calls the function's __get__ method) binds the first argument to the same object the method was retrieved from. It then returns the bound method, which now takes one less argument.
I would recommend fiddling around in the interpreter and type in your MyClass1 definition, then type in MyClass1.mydef1 and aVar = MyClass1(); aVar.mydef1 and observe the difference in the results.
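For readers without an interpreter at hand, the experiment can be sketched as follows in Python 3 syntax. The variable names plain, bound and manual are illustrative, and the manual __get__ call is shown only to expose the binding step that attribute lookup normally performs for you:

```python
class MyClass1:
    def mydef1(self):
        return "Hello"


a_var = MyClass1()

# Looked up on the class: a plain function (Python 3) that takes one argument.
plain = MyClass1.mydef1

# Looked up on an instance: a bound method that already carries the instance.
bound = a_var.mydef1

# The binding step can be performed by hand through the function's __get__.
manual = MyClass1.__dict__["mydef1"].__get__(a_var, MyClass1)

# Passing an extra argument to the bound method fails with a TypeError.
try:
    a_var.mydef1(object)
    failed = False
except TypeError:
    failed = True

print(plain(a_var), bound(), manual(), failed)
```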
If you come from a language like C++ or Java, this can all seem very confusing. But, it's actually a very regular and logical structure. Everything works the same way.
Also, as people have pointed out, names have no type associated with them. The type is associated with the object the name references. So any name can reference any kind of thing. This is also referred to as 'dynamic typing'. Python is dynamically typed in another way as well. You can actually mess around with the internal structure of something and change the type of an object as well. This is fairly deep magic, and I wouldn't suggest doing it until you know what you're doing. And even then you shouldn't do it as it will just confuse everybody else.

Python is dynamically typed, so it doesn't care what gets passed. It only cares that the single required parameter gets an argument as a value. Once inside the function, you never use self, so it doesn't matter what the argument was; you can't misuse what you don't use in the first place.
This question only arises because you are taking the uncommon action of running an instance method as an unbound method with an explicit argument, rather than invoking it on an instance of the class and letting the Python runtime system take care of passing that instance as the first argument to mydef1: MyClass1().mydef1() == MyClass1.mydef1(MyClass1()).
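The point that you cannot misuse what you do not use can be illustrated with a hypothetical class (Greeter and its method names are assumptions made for this sketch). Python 3 syntax is assumed, since Python 2 checks the type of the first argument of an unbound method:

```python
class Greeter:
    def ignores_self(self):
        return "Hello"

    def uses_self(self):
        return "Hello, " + self.name


# Works: the argument is never inspected.
ok = Greeter.ignores_self(object)

# Fails only when the body touches an attribute the argument lacks.
try:
    Greeter.uses_self(object)
    error = None
except AttributeError as exc:
    error = exc
```

The failure comes from the attribute access inside the body, not from any check on what was passed as self.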

Python is not a statically-typed language, so you can pass to any function any objects of any data types as long as you pass in the right number of parameters, and the self argument in a class method is no different from arguments in any other function.

There is no problem with that whatsoever - self is an object like any other and may be used in any context where object of its type/behavior would be welcome.
Python - Is it okay to pass self to an external function

What is the point of _=None in a Python method signature?

What is the purpose of _=None in a method signature?
Example:
def method(_=None):
    pass

_ is a conventional name for an unused placeholder variable in Python (and shell, and some other languages).
When not in the context of a class (that is, when defining a function rather than a method), this would define a function with a single optional (keyword) argument (defaulting to None), with a name indicating that that argument is deliberately unused (and exempting it from warnings about unused variables in some static-checking tools).
When defining a method where the first and only argument accepted is defined in this way, this becomes effectively a poor man's static method. That is to say, it's indicating that self is unused and optional, and allowing the method to be called independently of whether any object instance is associated (that is, whether a self is actually available at runtime).
Note that this is not a common idiom, and using the @staticmethod decorator is going to make much more sense to readers of your code.
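A side-by-side sketch of the two approaches, assuming Python 3 (in Python 2, calling the undecorated version through the class without an instance raises a TypeError). The class Tools and its method names are hypothetical:

```python
class Tools:
    def poor_mans_static(_=None):
        # `_` silently receives the instance when called on one.
        return "called"

    @staticmethod
    def proper_static():
        # No instance or class is passed in at all.
        return "called"


r1 = Tools.poor_mans_static()     # `_` defaults to None
r2 = Tools().poor_mans_static()   # `_` is the Tools instance
r3 = Tools.proper_static()
r4 = Tools().proper_static()
```

Both versions can be called with or without an instance, but only the decorated one states that intent explicitly.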

Does the Python interpreter bind instances to methods or the self parameter?

I am reading a book about Object-Oriented Programming in Python. There is a sentence that I am confused by:
The interpreter automatically binds the instance upon which the method is invoked to the self parameter.
In this sentence, what is bound to the instance: the method, or the self parameter?

This is actually not such a bad question and I'm not sure why it got downvoted so quickly...
Even though Python supports object-oriented programming, I find it to be much closer to functional programming languages. One of the reasons is that functions are invoked "on" objects, not "by" them.
For example: len(obj), where in a "true" object-oriented programming language you'd expect to be able to do something like obj.length().
As for the self parameter, you call obj.method(other_args), but what really happens under the hood is a translation of this call to method(obj, other_args). You can see this in the method declaration, where self is the first parameter:
class ...
    def method(self, other_args):
        ...
so it's basically all about the "translation" of obj.method(other_args) to method(obj, other_args)
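The translation can be written out explicitly. A minimal sketch, where the class Speaker and its greet method are hypothetical names used only for illustration:

```python
class Speaker:
    def greet(self, name):
        return "Hello, " + name


obj = Speaker()

# The usual spelling.
sugar = obj.greet("world")

# Roughly what happens underneath: look the function up on the type
# and pass the instance as the first argument.
explicit = type(obj).greet(obj, "world")
```

Both expressions produce the same result, so the instance is what gets bound to the self parameter.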

Why are parentheses optional when defining a class, but mandatory when defining a function?

In Python, defining a function with an empty parameter list requires a set of empty parentheses. However, defining a class with the default superclass does not require a set of empty parentheses; rather, those are optional, and appear to be uncommon. Why is it so?
See also: Python class definition syntax.

I think the answer to your question is simply syntax. That is just the way Python is set up, but my take on how it got that way is:
I would think functions came out of mathematics things like:
f(x) = x
So when computer programming languages were being created there seems to have been some logical continuity from analog mathematics into programming languages.
Classes on the other hand are more constructs of Computer Science, and repetitive memory management, so they were not created in such a fashion, but because they have a functional quality to them, they were given similar notation.
For Python, I will use the term method for function as that is the usual lingo...
I understand your argument that both a class and method should be allowed to be defined using a short-cut in the no argument case:
for classes when there is no inheritance
for methods when there are no arguments
One reason I can think of is for consistency across usage and definition. Let's look at some examples:
definition:
def funcA():
    return 0
def funcB(arg):
    return arg
and you want to call that function:
>>> funcA()
>>> funcB("an argument")
and
>>> f1 = funcA
>>> f2 = funcB
>>> f1()
>>> f2("another argument")
to pass references and call them.
The syntax of the parentheses in a method declaration is consistent with calling the method.
You need to put those empty parentheses; otherwise the interpreter will give you a reference to the method and not actually call it.
So one benefit is it makes your code very clear.
definition:
class classA:
    pass
class classB(object):
    pass
usage:
# create an instance
my_instance_of_A = classA()
my_instance_of_B = classB()
# pass a reference
my_ref_to_A = classA
my_ref_to_B = classB
# call by reference
new_A = my_ref_to_A()
new_B = my_ref_to_B()
Here there is no change in behavior with regards to whether the class inherits or not, its calling behavior is dictated by what its internal or inherited __init__ method is defined as.
I think the current set up of requiring the empty () makes the code more readable to the untrained eye.
If you really really really want to do what you ask, there is a workaround... you could always do this:
func = lambda: "im a function declared with no arguments, and I didn't use parenthesis =p"
which can be called:
>>> func
<function <lambda> at 0x6ffffef26e0>
>>> func()
"im a function declared with no arguments, and I didn't use parenthesis =p"
But the python holy book says No

Python and reference passing. Limitation?

I would like to do something like the following:
class Foo(object):
    def __init__(self):
        self.member = 10
        pass
def factory(foo):
    foo = Foo()
aTestFoo = None
factory(aTestFoo)
print(aTestFoo.member)
However it crashes with AttributeError: 'NoneType' object has no attribute 'member':
the object aTestFoo has not been modified inside the call of the function factory.
What is the Pythonic way of doing that? Is it a pattern to avoid? If it is a common mistake, what is it called?
In C++, in the function prototype, I would have added a reference to the pointer to be created in the factory... but maybe this is not the kind of thing I should think about in Python.
In C#, there's the keyword ref, which allows modifying the reference itself, really close to the C++ way. I don't know about Java... and I do wonder about Python.

Python does not have pass by reference. One of the few things it shares with Java, by the way. Some people describe argument passing in Python as call by value (and define the values as references, where reference does not mean what it means in C++), some people describe it as pass by reference with reasoning I find quite questionable (they redefine it in terms of what Python calls a "reference", and end up with something which has nothing to do with what has been known as pass by reference for decades), others go for terms which are not as widely used and abused (popular examples are "{pass,call} by {object,sharing}"). See Call By Object on effbot.org for a rather extensive discussion on the definitions of the various terms, on history, and on the flaws in some of the arguments for the terms pass by reference and pass by value.
The short story, without naming it, goes like this:
Every variable, object attribute, collection item, etc. refers to an object.
Assignment, argument passing, etc. create another variable, object attribute, collection item, etc. which refers to the same object but has no knowledge which other variables, object attributes, collection items, etc. refer to that object.
Any variable, object attribute, collection item, etc. can be used to modify an object, and any other variable, object attribute, collection item, etc. can be used to observe that modification.
No variable, object attribute, collection item, etc. refers to another variable, object attribute, collection items, etc. and thus you can't emulate pass by reference (in the C++ sense) except by treating a mutable object/collection as your "namespace". This is excessively ugly, so don't use it when there's a much easier alternative (such as a return value, or exceptions, or multiple return values via iterable unpacking).
You may consider this like using pointers, but not pointers to pointers (but sometimes pointers to structures containing pointers) in C. And then passing those pointers by value. But don't read too much into this simile. Python's data model is significantly different from C's.
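The rules above can be seen in a few lines. This is a minimal sketch; the function names mutate, rebind and rebind_via_container are hypothetical and chosen only to label the three cases:

```python
def mutate(items):
    # Modifies the object the caller also refers to.
    items.append(4)


def rebind(items):
    # Only changes what the local name refers to; the caller is unaffected.
    items = [0]
    return items


def rebind_via_container(namespace):
    # Emulating "pass by reference" through a mutable container (ugly).
    namespace["value"] = [0]


data = [1, 2, 3]
mutate(data)                    # data is now [1, 2, 3, 4]
returned = rebind(data)         # data is unchanged; returned is [0]

holder = {"value": data}
rebind_via_container(holder)    # holder["value"] is now [0]
```

Returning the new object, as rebind does, is usually the cleaner choice compared with the container trick.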

You are making a mistake here because in Python
"We call the argument passing technique _call by sharing_, because the argument objects are shared between the caller and the called routine. This technique does not correspond to most traditional argument passing techniques (it is similar to argument passing in LISP). In particular it is not call by value because mutations of arguments performed by the called routine will be visible to the caller. And it is not call by reference because access is not given to the variables of the caller, but merely to certain objects."
In Python, the variables in the formal argument list are bound to the actual argument objects. The objects are shared between caller and callee; there are no "fresh locations" or extra "stores" involved. (Which, of course, is why the CLU folks called this mechanism "call-by-sharing".)
And by the way, Python functions don't run in an extended environment, either. Function bodies have very limited access to the surrounding environment.

The Assignment Statements section of the Python docs might be interesting.
The = statement in Python acts differently depending on the situation, but in the case you present, it just binds the new object to a new local variable:
def factory(foo):
    # This makes a new instance of Foo
    # and binds it to a local variable `foo`.
    foo = Foo()
# This binds `None` to a top-level variable `aTestFoo`
aTestFoo = None
# Call `factory` with first argument of `None`
factory(aTestFoo)
print(aTestFoo.member)
Although it can potentially be more confusing than helpful, the dis module can show you the byte-code representation of a function, which can reveal how Python works internally. Here is the disassembly of `factory`:
>>> dis.dis(factory)
  4           0 LOAD_GLOBAL              0 (Foo)
              3 CALL_FUNCTION            0
              6 STORE_FAST               0 (foo)
              9 LOAD_CONST               0 (None)
             12 RETURN_VALUE
What that says is, Python loads the global Foo class by name (0), and calls it (3, instantiation and calling are very similar), then stores the result in a local variable (6, see STORE_FAST). Then it loads the default return value None (9) and returns it (12)
What is the Pythonic way of doing that? Is it a pattern to avoid? If it is a common mistake, what is it called?
Factory functions are rarely necessary in Python. In the occasional case where they are necessary, you would just return the new instance from your factory (instead of trying to assign it to a passed-in variable):
class Foo(object):
    def __init__(self):
        self.member = 10
def factory():
    return Foo()
aTestFoo = factory()
print(aTestFoo.member)

Your factory method doesn't return anything - and by default it will have a return value of None. You assign aTestFoo to None, but never re-assign it - which is where your actual error is coming from.
Fixing these issues:
class Foo(object):
    def __init__(self):
        self.member = 10
def factory(obj):
    return obj()
aTestFoo = factory(Foo)
print(aTestFoo.member)
This should do what I think you are after, although such patterns (i.e., factory methods) are not that typical in Python.
