I'm teaching myself Python and I was translating some sample code into this
class Student(object):
def __init__( self, name, a,b,c ):
self.name = name
self.a = a
self.b = b
self.c = c
def average(self):
return ( a+b+c ) / 3.0
Which is pretty much my intended class definition.
Later in the main method I create an instance and call it a:
if __name__ == "__main__" :
a = Student( "Oscar", 10, 10, 10 )
That's how I find out that the variable a declared in main is available to the method average and to make that method work, I have to type self.a + self.b + self.c instead.
What's the rationale for this?
Barenames (like a, b, c) are always scoped as local or global (save for nested functions, which are nowhere around in your code). The rationale is that adding further scopes would needlessly make things more complicated -- e.g, if in your self.a = a the barename a could be scoped to mean what you appear to want (equivalent to self.a) then the assignment itself would be meaningless (assigning a name to itself), so you'd need further complicated rules.
Just using qualified names (like self.a) when you want something different than barenames' simple, straightforward, and optimized behavior, is by far the simplest approach -- perfectly workable, no complicated rules whatsoever, and allows the compiler to optimize things effectively (since e.g. a barename's scope is always lexically determined, not dependent on dynamically varying characteristics of the environment). So, besides perhaps nostalgia for other language with more complicated scoping rules, there's really no rationale for complicating the semantics of barenames.
There are several reasons, though the main one is from the Zen of Python: "Explicit is better than implicit." In a language like C++, a method on the class always has an implicit argument this which is pushed onto the stack every time the method is called. In this case, when an instance variable b exists as well as a global variable b, then the user may just refer to b referring to one without realizing that the other will be used. So Python forces you to be explicit about your scope to avoid confusion.
With that being said, there are other reasons as well. For example, I may define a function outside of a class and then attach it to a class at runtime. For example:
def log(self):
print "some library function requires all objects to have a log method"
print "unfortunately we're using the Student class, which doesn't have one"
print "this class is defined in a separate library, so we can't add the method"
print "fortunately, we can just add the method dynamically at runtime"
Student.log = log
Here the fact that self is explicit makes it trivial for us to define a function outside of a class and then attach it to that class. I don't do this sort of thing incredibly often, but it's EXTREMELY useful when I do.
Here's an even more complex example; suppose we want to define a class inside another class, such as for the purposes of unit testing:
class SomeUnitTests(TestCase):
def test_something(self):
class SomeMockObject(SomeActualObject):
def foo(self2):
self.assertEqual(self2.x, SOME_CONSTANT)
some_lib.do_something_with(SomeMockObject)
Here the presence of an explicit self (which we can call whatever we want, it doesn't have to be self) allows to to distinguish between the self of the inner and outer classes. Again, this isn't something I do frequently, but when I do then it's incredibly useful.
All instance variables should be called using self
Related
This question already has answers here:
What is the purpose of the `self` parameter? Why is it needed?
(26 answers)
Closed 6 months ago.
When defining a method on a class in Python, it looks something like this:
class MyClass(object):
def __init__(self, x, y):
self.x = x
self.y = y
But in some other languages, such as C#, you have a reference to the object that the method is bound to with the "this" keyword without declaring it as an argument in the method prototype.
Was this an intentional language design decision in Python or are there some implementation details that require the passing of "self" as an argument?
I like to quote Peters' Zen of Python. "Explicit is better than implicit."
In Java and C++, 'this.' can be deduced, except when you have variable names that make it impossible to deduce. So you sometimes need it and sometimes don't.
Python elects to make things like this explicit rather than based on a rule.
Additionally, since nothing is implied or assumed, parts of the implementation are exposed. self.__class__, self.__dict__ and other "internal" structures are available in an obvious way.
It's to minimize the difference between methods and functions. It allows you to easily generate methods in metaclasses, or add methods at runtime to pre-existing classes.
e.g.
>>> class C:
... def foo(self):
... print("Hi!")
...
>>>
>>> def bar(self):
... print("Bork bork bork!")
...
>>>
>>> c = C()
>>> C.bar = bar
>>> c.bar()
Bork bork bork!
>>> c.foo()
Hi!
>>>
It also (as far as I know) makes the implementation of the python runtime easier.
I suggest that one should read Guido van Rossum's blog on this topic - Why explicit self has to stay.
When a method definition is decorated, we don't know whether to automatically give it a 'self' parameter or not: the decorator could turn the function into a static method (which has no 'self'), or a class method (which has a funny kind of self that refers to a class instead of an instance), or it could do something completely different (it's trivial to write a decorator that implements '#classmethod' or '#staticmethod' in pure Python). There's no way without knowing what the decorator does whether to endow the method being defined with an implicit 'self' argument or not.
I reject hacks like special-casing '#classmethod' and '#staticmethod'.
Python doesn't force you on using "self". You can give it whatever name you want. You just have to remember that the first argument in a method definition header is a reference to the object.
Also allows you to do this: (in short, invoking Outer(3).create_inner_class(4)().weird_sum_with_closure_scope(5) will return 12, but will do so in the craziest of ways.
class Outer(object):
def __init__(self, outer_num):
self.outer_num = outer_num
def create_inner_class(outer_self, inner_arg):
class Inner(object):
inner_arg = inner_arg
def weird_sum_with_closure_scope(inner_self, num)
return num + outer_self.outer_num + inner_arg
return Inner
Of course, this is harder to imagine in languages like Java and C#. By making the self reference explicit, you're free to refer to any object by that self reference. Also, such a way of playing with classes at runtime is harder to do in the more static languages - not that's it's necessarily good or bad. It's just that the explicit self allows all this craziness to exist.
Moreover, imagine this: We'd like to customize the behavior of methods (for profiling, or some crazy black magic). This can lead us to think: what if we had a class Method whose behavior we could override or control?
Well here it is:
from functools import partial
class MagicMethod(object):
"""Does black magic when called"""
def __get__(self, obj, obj_type):
# This binds the <other> class instance to the <innocent_self> parameter
# of the method MagicMethod.invoke
return partial(self.invoke, obj)
def invoke(magic_self, innocent_self, *args, **kwargs):
# do black magic here
...
print magic_self, innocent_self, args, kwargs
class InnocentClass(object):
magic_method = MagicMethod()
And now: InnocentClass().magic_method() will act like expected. The method will be bound with the innocent_self parameter to InnocentClass, and with the magic_self to the MagicMethod instance. Weird huh? It's like having 2 keywords this1 and this2 in languages like Java and C#. Magic like this allows frameworks to do stuff that would otherwise be much more verbose.
Again, I don't want to comment on the ethics of this stuff. I just wanted to show things that would be harder to do without an explicit self reference.
I think it has to do with PEP 227:
Names in class scope are not accessible. Names are resolved in the
innermost enclosing function scope. If a class definition occurs in a
chain of nested scopes, the resolution process skips class
definitions. This rule prevents odd interactions between class
attributes and local variable access. If a name binding operation
occurs in a class definition, it creates an attribute on the resulting
class object. To access this variable in a method, or in a function
nested within a method, an attribute reference must be used, either
via self or via the class name.
I think the real reason besides "The Zen of Python" is that Functions are first class citizens in Python.
Which essentially makes them an Object. Now The fundamental issue is if your functions are object as well then, in Object oriented paradigm how would you send messages to Objects when the messages themselves are objects ?
Looks like a chicken egg problem, to reduce this paradox, the only possible way is to either pass a context of execution to methods or detect it. But since python can have nested functions it would be impossible to do so as the context of execution would change for inner functions.
This means the only possible solution is to explicitly pass 'self' (The context of execution).
So i believe it is a implementation problem the Zen came much later.
As explained in self in Python, Demystified
anything like obj.meth(args) becomes Class.meth(obj, args). The calling process is automatic while the receiving process is not (its explicit). This is the reason the first parameter of a function in class must be the object itself.
class Point(object):
def __init__(self,x = 0,y = 0):
self.x = x
self.y = y
def distance(self):
"""Find distance from origin"""
return (self.x**2 + self.y**2) ** 0.5
Invocations:
>>> p1 = Point(6,8)
>>> p1.distance()
10.0
init() defines three parameters but we just passed two (6 and 8). Similarly distance() requires one but zero arguments were passed.
Why is Python not complaining about this argument number mismatch?
Generally, when we call a method with some arguments, the corresponding class function is called by placing the method's object before the first argument. So, anything like obj.meth(args) becomes Class.meth(obj, args). The calling process is automatic while the receiving process is not (its explicit).
This is the reason the first parameter of a function in class must be the object itself. Writing this parameter as self is merely a convention. It is not a keyword and has no special meaning in Python. We could use other names (like this) but I strongly suggest you not to. Using names other than self is frowned upon by most developers and degrades the readability of the code ("Readability counts").
...
In, the first example self.x is an instance attribute whereas x is a local variable. They are not the same and lie in different namespaces.
Self Is Here To Stay
Many have proposed to make self a keyword in Python, like this in C++ and Java. This would eliminate the redundant use of explicit self from the formal parameter list in methods. While this idea seems promising, it's not going to happen. At least not in the near future. The main reason is backward compatibility. Here is a blog from the creator of Python himself explaining why the explicit self has to stay.
The 'self' parameter keeps the current calling object.
class class_name:
class_variable
def method_name(self,arg):
self.var=arg
obj=class_name()
obj.method_name()
here, the self argument holds the object obj. Hence, the statement self.var denotes obj.var
There is also another very simple answer: according to the zen of python, "explicit is better than implicit".
I'm coming from the Java world and reading Bruce Eckels' Python 3 Patterns, Recipes and Idioms.
While reading about classes, it goes on to say that in Python there is no need to declare instance variables. You just use them in the constructor, and boom, they are there.
So for example:
class Simple:
def __init__(self, s):
print("inside the simple constructor")
self.s = s
def show(self):
print(self.s)
def showMsg(self, msg):
print(msg + ':', self.show())
If that’s true, then any object of class Simple can just change the value of variable s outside of the class.
For example:
if __name__ == "__main__":
x = Simple("constructor argument")
x.s = "test15" # this changes the value
x.show()
x.showMsg("A message")
In Java, we have been taught about public/private/protected variables. Those keywords make sense because at times you want variables in a class to which no one outside the class has access to.
Why is that not required in Python?
It's cultural. In Python, you don't write to other classes' instance or class variables. In Java, nothing prevents you from doing the same if you really want to - after all, you can always edit the source of the class itself to achieve the same effect. Python drops that pretence of security and encourages programmers to be responsible. In practice, this works very nicely.
If you want to emulate private variables for some reason, you can always use the __ prefix from PEP 8. Python mangles the names of variables like __foo so that they're not easily visible to code outside the namespace that contains them (although you can get around it if you're determined enough, just like you can get around Java's protections if you work at it).
By the same convention, the _ prefix means _variable should be used internally in the class (or module) only, even if you're not technically prevented from accessing it from somewhere else. You don't play around with another class's variables that look like __foo or _bar.
Private variables in Python is more or less a hack: the interpreter intentionally renames the variable.
class A:
def __init__(self):
self.__var = 123
def printVar(self):
print self.__var
Now, if you try to access __var outside the class definition, it will fail:
>>> x = A()
>>> x.__var # this will return error: "A has no attribute __var"
>>> x.printVar() # this gives back 123
But you can easily get away with this:
>>> x.__dict__ # this will show everything that is contained in object x
# which in this case is something like {'_A__var' : 123}
>>> x._A__var = 456 # you now know the masked name of private variables
>>> x.printVar() # this gives back 456
You probably know that methods in OOP are invoked like this: x.printVar() => A.printVar(x). If A.printVar() can access some field in x, this field can also be accessed outside A.printVar()... After all, functions are created for reusability, and there isn't any special power given to the statements inside.
As correctly mentioned by many of the comments above, let's not forget the main goal of Access Modifiers: To help users of code understand what is supposed to change and what is supposed not to. When you see a private field you don't mess around with it. So it's mostly syntactic sugar which is easily achieved in Python by the _ and __.
Python does not have any private variables like C++ or Java does. You could access any member variable at any time if wanted, too. However, you don't need private variables in Python, because in Python it is not bad to expose your classes' member variables. If you have the need to encapsulate a member variable, you can do this by using "#property" later on without breaking existing client code.
In Python, the single underscore "_" is used to indicate that a method or variable is not considered as part of the public API of a class and that this part of the API could change between different versions. You can use these methods and variables, but your code could break, if you use a newer version of this class.
The double underscore "__" does not mean a "private variable". You use it to define variables which are "class local" and which can not be easily overridden by subclasses. It mangles the variables name.
For example:
class A(object):
def __init__(self):
self.__foobar = None # Will be automatically mangled to self._A__foobar
class B(A):
def __init__(self):
self.__foobar = 1 # Will be automatically mangled to self._B__foobar
self.__foobar's name is automatically mangled to self._A__foobar in class A. In class B it is mangled to self._B__foobar. So every subclass can define its own variable __foobar without overriding its parents variable(s). But nothing prevents you from accessing variables beginning with double underscores. However, name mangling prevents you from calling this variables /methods incidentally.
I strongly recommend you watch Raymond Hettinger's Python's class development toolkit from PyCon 2013, which gives a good example why and how you should use #property and "__"-instance variables.
If you have exposed public variables and you have the need to encapsulate them, then you can use #property. Therefore you can start with the simplest solution possible. You can leave member variables public unless you have a concrete reason to not do so. Here is an example:
class Distance:
def __init__(self, meter):
self.meter = meter
d = Distance(1.0)
print(d.meter)
# prints 1.0
class Distance:
def __init__(self, meter):
# Customer request: Distances must be stored in millimeters.
# Public available internals must be changed.
# This would break client code in C++.
# This is why you never expose public variables in C++ or Java.
# However, this is Python.
self.millimeter = meter * 1000
# In Python we have #property to the rescue.
#property
def meter(self):
return self.millimeter *0.001
#meter.setter
def meter(self, value):
self.millimeter = value * 1000
d = Distance(1.0)
print(d.meter)
# prints 1.0
There is a variation of private variables in the underscore convention.
In [5]: class Test(object):
...: def __private_method(self):
...: return "Boo"
...: def public_method(self):
...: return self.__private_method()
...:
In [6]: x = Test()
In [7]: x.public_method()
Out[7]: 'Boo'
In [8]: x.__private_method()
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
<ipython-input-8-fa17ce05d8bc> in <module>()
----> 1 x.__private_method()
AttributeError: 'Test' object has no attribute '__private_method'
There are some subtle differences, but for the sake of programming pattern ideological purity, it's good enough.
There are examples out there of #private decorators that more closely implement the concept, but your mileage may vary. Arguably, one could also write a class definition that uses meta.
As mentioned earlier, you can indicate that a variable or method is private by prefixing it with an underscore. If you don't feel like this is enough, you can always use the property decorator. Here's an example:
class Foo:
def __init__(self, bar):
self._bar = bar
#property
def bar(self):
"""Getter for '_bar'."""
return self._bar
This way, someone or something that references bar is actually referencing the return value of the bar function rather than the variable itself, and therefore it can be accessed but not changed. However, if someone really wanted to, they could simply use _bar and assign a new value to it. There is no surefire way to prevent someone from accessing variables and methods that you wish to hide, as has been said repeatedly. However, using property is the clearest message you can send that a variable is not to be edited. property can also be used for more complex getter/setter/deleter access paths, as explained here: https://docs.python.org/3/library/functions.html#property
Python has limited support for private identifiers, through a feature that automatically prepends the class name to any identifiers starting with two underscores. This is transparent to the programmer, for the most part, but the net effect is that any variables named this way can be used as private variables.
See here for more on that.
In general, Python's implementation of object orientation is a bit primitive compared to other languages. But I enjoy this, actually. It's a very conceptually simple implementation and fits well with the dynamic style of the language.
The only time I ever use private variables is when I need to do other things when writing to or reading from the variable and as such I need to force the use of a setter and/or getter.
Again this goes to culture, as already stated. I've been working on projects where reading and writing other classes variables was free-for-all. When one implementation became deprecated it took a lot longer to identify all code paths that used that function. When use of setters and getters was forced, a debug statement could easily be written to identify that the deprecated method had been called and the code path that calls it.
When you are on a project where anyone can write an extension, notifying users about deprecated methods that are to disappear in a few releases hence is vital to keep module breakage at a minimum upon upgrades.
So my answer is; if you and your colleagues maintain a simple code set then protecting class variables is not always necessary. If you are writing an extensible system then it becomes imperative when changes to the core is made that needs to be caught by all extensions using the code.
"In java, we have been taught about public/private/protected variables"
"Why is that not required in python?"
For the same reason, it's not required in Java.
You're free to use -- or not use private and protected.
As a Python and Java programmer, I've found that private and protected are very, very important design concepts. But as a practical matter, in tens of thousands of lines of Java and Python, I've never actually used private or protected.
Why not?
Here's my question "protected from whom?"
Other programmers on my team? They have the source. What does protected mean when they can change it?
Other programmers on other teams? They work for the same company. They can -- with a phone call -- get the source.
Clients? It's work-for-hire programming (generally). The clients (generally) own the code.
So, who -- precisely -- am I protecting it from?
In Python 3, if you just want to "encapsulate" the class attributes, like in Java, you can just do the same thing like this:
class Simple:
def __init__(self, str):
print("inside the simple constructor")
self.__s = str
def show(self):
print(self.__s)
def showMsg(self, msg):
print(msg + ':', self.show())
To instantiate this do:
ss = Simple("lol")
ss.show()
Note that: print(ss.__s) will throw an error.
In practice, Python 3 will obfuscate the global attribute name. It is turning this like a "private" attribute, like in Java. The attribute's name is still global, but in an inaccessible way, like a private attribute in other languages.
But don't be afraid of it. It doesn't matter. It does the job too. ;)
Private and protected concepts are very important. But Python is just a tool for prototyping and rapid development with restricted resources available for development, and that is why some of the protection levels are not so strictly followed in Python. You can use "__" in a class member. It works properly, but it does not look good enough. Each access to such field contains these characters.
Also, you can notice that the Python OOP concept is not perfect. Smalltalk or Ruby are much closer to a pure OOP concept. Even C# or Java are closer.
Python is a very good tool. But it is a simplified OOP language. Syntactically and conceptually simplified. The main goal of Python's existence is to bring to developers the possibility to write easy readable code with a high abstraction level in a very fast manner.
Here's how I handle Python 3 class fields:
class MyClass:
def __init__(self, public_read_variable, private_variable):
self.public_read_variable_ = public_read_variable
self.__private_variable = private_variable
I access the __private_variable with two underscores only inside MyClass methods.
I do read access of the public_read_variable_ with one underscore
outside the class, but never modify the variable:
my_class = MyClass("public", "private")
print(my_class.public_read_variable_) # OK
my_class.public_read_variable_ = 'another value' # NOT OK, don't do that.
So I’m new to Python but I have a background in C# and JavaScript. Python feels like a mix of the two in terms of features. JavaScript also struggles in this area and the way around it here, is to create a closure. This prevents access to data you don’t want to expose by returning a different object.
def print_msg(msg):
# This is the outer enclosing function
def printer():
# This is the nested function
print(msg)
return printer # returns the nested function
# Now let's try calling this function.
# Output: Hello
another = print_msg("Hello")
another()
https://www.programiz.com/python-programming/closure
https://developer.mozilla.org/en-US/docs/Web/JavaScript/Closures#emulating_private_methods_with_closures
About sources (to change the access rights and thus bypass language encapsulation like Java or C++):
You don't always have the sources and even if you do, the sources are managed by a system that only allows certain programmers to access a source (in a professional context). Often, every programmer is responsible for certain classes and therefore knows what he can and cannot do. The source manager also locks the sources being modified and of course, manages the access rights of programmers.
So I trust more in software than in human, by experience. So convention is good, but multiple protections are better, like access management (real private variable) + sources management.
I have been thinking about private class attributes and methods (named members in further reading) since I have started to develop a package that I want to publish. The thought behind it were never to make it impossible to overwrite these members, but to have a warning for those who touch them. I came up with a few solutions that might help. The first solution is used in one of my favorite Python books, Fluent Python.
Upsides of technique 1:
It is unlikely to be overwritten by accident.
It is easily understood and implemented.
Its easier to handle than leading double underscore for instance attributes.
*In the book the hash-symbol was used, but you could use integer converted to strings as well. In Python it is forbidden to use klass.1
class Technique1:
def __init__(self, name, value):
setattr(self, f'private#{name}', value)
setattr(self, f'1{name}', value)
Downsides of technique 1:
Methods are not easily protected with this technique though. It is possible.
Attribute lookups are just possible via getattr
Still no warning to the user
Another solution I came across was to write __setattr__. Pros:
It is easily implemented and understood
It works with methods
Lookup is not affected
The user gets a warning or error
class Demonstration:
def __init__(self):
self.a = 1
def method(self):
return None
def __setattr__(self, name, value):
if not getattr(self, name, None):
super().__setattr__(name, value)
else:
raise ValueError(f'Already reserved name: {name}')
d = Demonstration()
#d.a = 2
d.method = None
Cons:
You can still overwrite the class
To have variables not just constants, you need to map allowed input.
Subclasses can still overwrite methods
To prevent subclasses from overwriting methods you can use __init_subclass__:
class Demonstration:
__protected = ['method']
def method(self):
return None
def __init_subclass__(cls):
protected_methods = Demonstration.__protected
subclass_methods = dir(cls)
for i in protected_methods:
p = getattr(Demonstration,i)
j = getattr(cls, i)
if not p is j:
raise ValueError(f'Protected method "{i}" was touched')
You see, there are ways to protect your class members, but it isn't any guarantee that users don't overwrite them anyway. This should just give you some ideas. In the end, you could also use a meta class, but this might open up new dangers to encounter. The techniques used here are also very simple minded and you should definitely take a look at the documentation, you can find useful feature to this technique and customize them to your need.
That is a kind of best practices question.
I have a class structure with some methods defined. In some cases I want to override a particular part of a method. First thought on that is splitting my method to more atomic pieces and override related parts like below.
class myTest(object):
def __init__(self):
pass
def myfunc(self):
self._do_atomic_job()
...
...
def _do_atomic_job(self):
print "Hello"
That is a practical-looking way to solve the problem. But since I have too many parameters that is needed to be transferred to and revieced back from _do_atomic_job(), I do not want to pass and retrieve tons of parameters. Other option is setting these parameters as class variables with self.param_var etc but those parameters are used in a small part of the code and using self is not my preferred way of solving this.
Last option I thought is using inner functions. (I know I will have problems in variable scopes but as I said, this is a best practise and just ignore them and think scope and all things about the inner functions are working as expected)
class MyTest2(object):
mytext = ""
def myfunc(self):
def _do_atomic_job():
mytext = "Hello"
_do_atomic_job()
print mytext
Lets assume that works as expected. What I want to do is overriding the inner function _do_atomic_job()
class MyTest3(MyTest2):
def __init__(self):
super(MyTest3, self).__init__()
self.myfunc._do_atomic_job = self._alt_do_atomic_job # Of course this do not work!
def _alt_do_atomic_job(self):
mytext = "Hollla!"
Do what I want to achieve is overriding inherited class' method's inner function _do_atomic_job
Is it possible?
Either factoring _do_atomic_job() into a proper method, or maybe factoring it
into its own class seem like the best approach to take. Overriding an inner
function can't work, because you won't have access to the local variable of the
containing method.
You say that _do_atomic_job() takes a lot of parameters returns lots of values. Maybe you group some of these parameters into reasonable objects:
_do_atomic_job(start_x, start_y, end_x, end_y) # Separate coordinates
_do_atomic_job(start, end) # Better: start/end points
_do_atomic_job(rect) # Even better: rectangle
If you can't do that, and _do_atomic_job() is reasonably self-contained,
you could create helper classes AtomicJobParams and AtomicJobResult.
An example using namedtuples instead of classes:
AtomicJobParams = namedtuple('AtomicJobParams', ['a', 'b', 'c', 'd'])
jobparams = AtomicJobParams(a, b, c, d)
_do_atomic_job(jobparams) # Returns AtomicJobResult
Finally, if the atomic job is self-contained, you can even factor it into its
own class AtomicJob.
class AtomicJob:
def __init__(self, a, b, c, d):
self.a = a
self.b = b
self.c = c
self.d = d
self._do_atomic_job()
def _do_atomic_job(self):
...
self.result_1 = 42
self.result_2 = 23
self.result_3 = 443
Overall, this seems more like a code factorization problem. Aim for rather lean
classes that delegate work to helpers where appropriate. Follow the single responsibility principle. If values belong together, bundle them up in a value class.
As David Miller (a prominent Linux kernel developer) recently said:
If you write interfaces with more than 4 or 5 function arguments, it's
possible that you and I cannot be friends.
Inner variables are related to where they are defined and not where they are executed. This prints "hello".
class MyTest2(object):
def __init__(self):
localvariable = "hello"
def do_atomic_job():
print localvariable
self.do_atomic_job = do_atomic_job
def myfunc(self):
localvariable = "hollla!"
self.do_atomic_job()
MyTest2().myfunc()
So I can't see any way you could use the local variables without passing them, which is probably the best way to do it.
Note: Passing locals() will get you a dict of the variables, this is considered quite bad style though.
I just can't see why do we need to use #staticmethod. Let's start with an exmaple.
class test1:
def __init__(self,value):
self.value=value
#staticmethod
def static_add_one(value):
return value+1
#property
def new_val(self):
self.value=self.static_add_one(self.value)
return self.value
a=test1(3)
print(a.new_val) ## >>> 4
class test2:
def __init__(self,value):
self.value=value
def static_add_one(self,value):
return value+1
#property
def new_val(self):
self.value=self.static_add_one(self.value)
return self.value
b=test2(3)
print(b.new_val) ## >>> 4
In the example above, the method, static_add_one , in the two classes do not require the instance of the class(self) in calculation.
The method static_add_one in the class test1 is decorated by #staticmethod and work properly.
But at the same time, the method static_add_one in the class test2 which has no #staticmethod decoration also works properly by using a trick that provides a self in the argument but doesn't use it at all.
So what is the benefit of using #staticmethod? Does it improve the performance? Or is it just due to the zen of python which states that "Explicit is better than implicit"?
The reason to use staticmethod is if you have something that could be written as a standalone function (not part of any class), but you want to keep it within the class because it's somehow semantically related to the class. (For instance, it could be a function that doesn't require any information from the class, but whose behavior is specific to the class, so that subclasses might want to override it.) In many cases, it could make just as much sense to write something as a standalone function instead of a staticmethod.
Your example isn't really the same. A key difference is that, even though you don't use self, you still need an instance to call static_add_one --- you can't call it directly on the class with test2.static_add_one(1). So there is a genuine difference in behavior there. The most serious "rival" to a staticmethod isn't a regular method that ignores self, but a standalone function.
Today I suddenly find a benefit of using #staticmethod.
If you created a staticmethod within a class, you don't need to create an instance of the class before using the staticmethod.
For example,
class File1:
def __init__(self, path):
out=self.parse(path)
def parse(self, path):
..parsing works..
return x
class File2:
def __init__(self, path):
out=self.parse(path)
#staticmethod
def parse(path):
..parsing works..
return x
if __name__=='__main__':
path='abc.txt'
File1.parse(path) #TypeError: unbound method parse() ....
File2.parse(path) #Goal!!!!!!!!!!!!!!!!!!!!
Since the method parse is strongly related to the classes File1 and File2, it is more natural to put it inside the class. However, sometimes this parse method may also be used in other classes under some circumstances. If you want to do so using File1, you must create an instance of File1 before calling the method parse. While using staticmethod in the class File2, you may directly call the method by using the syntax File2.parse.
This makes your works more convenient and natural.
I will add something other answers didn't mention. It's not only a matter of modularity, of putting something next to other logically related parts. It's also that the method could be non-static at other point of the hierarchy (i.e. in a subclass or superclass) and thus participate in polymorphism (type based dispatching). So if you put that function outside the class you will be precluding subclasses from effectively overriding it. Now, say you realize you don't need self in function C.f of class C, you have three two options:
Put it outside the class. But we just decided against this.
Do nothing new: while unused, still keep the self parameter.
Declare you are not using the self parameter, while still letting other C methods to call f as self.f, which is required if you wish to keep open the possibility of further overrides of f that do depend on some instance state.
Option 2 demands less conceptual baggage (you already have to know about self and methods-as-bound-functions, because it's the more general case). But you still may prefer to be explicit about self not being using (and the interpreter could even reward you with some optimization, not having to partially apply a function to self). In that case, you pick option 3 and add #staticmethod on top of your function.
Use #staticmethod for methods that don't need to operate on a specific object, but that you still want located in the scope of the class (as opposed to module scope).
Your example in test2.static_add_one wastes its time passing an unused self parameter, but otherwise works the same as test1.static_add_one. Note that this extraneous parameter can't be optimized away.
One example I can think of is in a Django project I have, where a model class represents a database table, and an object of that class represents a record. There are some functions used by the class that are stand-alone and do not need an object to operate on, for example a function that converts a title into a "slug", which is a representation of the title that follows the character set limits imposed by URL syntax. The function that converts a title to a slug is declared as a staticmethod precisely to strongly associate it with the class that uses it.
I'm teaching myself Python and I see the following in Dive into Python section 5.3:
By convention, the first argument of any Python class method (the reference to the current instance) is called self. This argument fills the role of the reserved word this in C++ or Java, but self is not a reserved word in Python, merely a naming convention. Nonetheless, please don't call it anything but self; this is a very strong convention.
Considering that self is not a Python keyword, I'm guessing that it can sometimes be useful to use something else. Are there any such cases? If not, why is it not a keyword?
No, unless you want to confuse every other programmer that looks at your code after you write it. self is not a keyword because it is an identifier. It could have been a keyword and the fact that it isn't one was a design decision.
As a side observation, note that Pilgrim is committing a common misuse of terms here: a class method is quite a different thing from an instance method, which is what he's talking about here. As wikipedia puts it, "a method is a subroutine that is exclusively associated either with a class (in which case it is called a class method or a static method) or with an object (in which case it is an instance method).". Python's built-ins include a staticmethod type, to make static methods, and a classmethod type, to make class methods, each generally used as a decorator; if you don't use either, a def in a class body makes an instance method. E.g.:
>>> class X(object):
... def noclass(self): print self
... #classmethod
... def withclass(cls): print cls
...
>>> x = X()
>>> x.noclass()
<__main__.X object at 0x698d0>
>>> x.withclass()
<class '__main__.X'>
>>>
As you see, the instance method noclass gets the instance as its argument, but the class method withclass gets the class instead.
So it would be extremely confusing and misleading to use self as the name of the first parameter of a class method: the convention in this case is instead to use cls, as in my example above. While this IS just a convention, there is no real good reason for violating it -- any more than there would be, say, for naming a variable number_of_cats if the purpose of the variable is counting dogs!-)
The only case of this I've seen is when you define a function outside of a class definition, and then assign it to the class, e.g.:
class Foo(object):
def bar(self):
# Do something with 'self'
def baz(inst):
return inst.bar()
Foo.baz = baz
In this case, self is a little strange to use, because the function could be applied to many classes. Most often I've seen inst or cls used instead.
I once had some code like (and I apologize for lack of creativity in the example):
class Animal:
def __init__(self, volume=1):
self.volume = volume
self.description = "Animal"
def Sound(self):
pass
def GetADog(self, newvolume):
class Dog(Animal):
def Sound(this):
return self.description + ": " + ("woof" * this.volume)
return Dog(newvolume)
Then we have output like:
>>> a = Animal(3)
>>> d = a.GetADog(2)
>>> d.Sound()
'Animal: woofwoof'
I wasn't sure if self within the Dog class would shadow self within the Animal class, so I opted to make Dog's reference the word "this" instead. In my opinion and for that particular application, that was more clear to me.
Because it is a convention, not language syntax. There is a Python style guide that people who program in Python follow. This way libraries have a familiar look and feel. Python places a lot of emphasis on readability, and consistency is an important part of this.
I think that the main reason self is used by convention rather than being a Python keyword is because it's simpler to have all methods/functions take arguments the same way rather than having to put together different argument forms for functions, class methods, instance methods, etc.
Note that if you have an actual class method (i.e. one defined using the classmethod decorator), the convention is to use "cls" instead of "self".