I'm looking at the source code for a trie implementation
On lines 80-85:
def keys(self, prefix=[]):
    return self.__keys__(prefix)

def __keys__(self, prefix=[], seen=[]):
    result = []
    etc.
What is def __keys__? Is that a magic object that is self-created? If so, is this poor code? Or does __keys__ exist as a standard Python magic method? I can't find it anywhere in the Python documentation, though.
Why is it legal for keys to call self.__keys__ before def __keys__ has even been defined? Wouldn't def __keys__ have to come before def keys (since keys calls __keys__)?
For your second question: it is legal. The functions of a class are defined when the class itself is defined, so you can be sure both functions exist before keys() is ever called. The same logic applies to ordinary functions; we can do -
>>> def a():
... b()
...
>>> def b():
... print("In B()")
...
>>> a()
In B()
This is legal because both a() and b() are defined before a() is called. It would only fail if you tried to call a() before b() had been defined. Note that defining a function does not automatically call it, and Python does not check at definition time whether every name used inside the function exists; names are only resolved at runtime, when the function is called, and a NameError is raised if one is missing.
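For example, calling a() before b() exists fails only at that point, with exactly that NameError:
>>> def a():
...     b()
...
>>> a()
Traceback (most recent call last):
  ...
NameError: name 'b' is not defined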
For your first question: I do not know of any magic method called __keys__(), and I cannot find it in the documentation either.
All of the real "magic methods" are in the data model documentation; __keys__ isn't one of them. The style guide says:
Never invent such names; only use them as documented.
so yes, making up a new one is bad form (the convention would have been to call it _keys).
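For illustration only (the trie's internals are not shown in the question, so the body below is just a placeholder), the conventional spelling would be:

class Trie:
    def keys(self, prefix=None):
        # Public API: delegate to the single-underscore helper.
        return self._keys(prefix if prefix is not None else [])

    def _keys(self, prefix, seen=None):
        # A single leading underscore marks this as internal by convention;
        # dunder names like __keys__ are reserved for the language itself.
        seen = seen if seen is not None else []
        result = []
        # ... actual trie traversal would go here ...
        return result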
The second part of your question doesn't make sense; even if this wasn't a class, there is no need to define methods and functions in the order they're called. As long as they exist by the time the call actually gets made, it's not a problem. I tend to define public methods before private ones, even though the former may call the latter, simply for the reader's convenience.
There is no magic method named __keys__(), so as you suspected this is just poor naming.
The code in the class definition can be in any order. All that matters is that the definition has been made by the time the actual call happens downstream.
There is no magic method named __keys__, so it's just a poor choice of name. Looking at the code, the author simply wanted a private method that is used internally and called from the public method keys; as you can see, __keys__ accepts an additional argument.
About the second question: there is no need to define the functions in the same order they are called. Both are attributes of the class by the time the class body has finished executing, well before either is called.
The body of a class in Python is executed long before the class is instantiated.
When the class statement runs, its block is compiled and executed, and the functions defined inside it become attributes of the new class object (plain functions, or classmethod/staticmethod objects if decorated). Nothing is copied into instances: when you later look up a method on an instance, the function is found on the class and bound to that instance on the fly via the descriptor protocol.
Therefore, at the moment instance.keys() is called, both keys and __keys__ already exist on the class, and the order in which they were written does not matter.
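A tiny sketch (not the trie code itself) makes this visible:

class C:
    def keys(self):
        return self._keys()

    def _keys(self):
        return []

print('keys' in C.__dict__, '_keys' in C.__dict__)   # True True -- both exist once the class body has run
c = C()
print(c.keys())               # [] -- keys is bound to c at lookup time, via the descriptor protocol
print('keys' in c.__dict__)   # False -- nothing was copied onto the instance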
Also, there is no __keys__ method anywhere in the data model, as far as I know.
What is the purpose of the `self` parameter? Why is it needed?
When defining a method on a class in Python, it looks something like this:
class MyClass(object):
    def __init__(self, x, y):
        self.x = x
        self.y = y
But in some other languages, such as C#, you have a reference to the object that the method is bound to with the "this" keyword without declaring it as an argument in the method prototype.
Was this an intentional language design decision in Python or are there some implementation details that require the passing of "self" as an argument?
I like to quote Tim Peters' Zen of Python: "Explicit is better than implicit."
In Java and C++, 'this.' can be deduced, except when you have variable names that make it impossible to deduce. So you sometimes need it and sometimes don't.
Python elects to make things like this explicit rather than based on a rule.
Additionally, since nothing is implied or assumed, parts of the implementation are exposed. self.__class__, self.__dict__ and other "internal" structures are available in an obvious way.
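Using the MyClass definition from the question, for instance:

m = MyClass(1, 2)
print(m.__class__)    # <class '__main__.MyClass'>
print(m.__dict__)     # {'x': 1, 'y': 2} -- the instance's attributes, as a plain dict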
It's to minimize the difference between methods and functions. It allows you to easily generate methods in metaclasses, or add methods at runtime to pre-existing classes.
e.g.
>>> class C:
... def foo(self):
... print("Hi!")
...
>>>
>>> def bar(self):
... print("Bork bork bork!")
...
>>>
>>> c = C()
>>> C.bar = bar
>>> c.bar()
Bork bork bork!
>>> c.foo()
Hi!
>>>
It also (as far as I know) makes the implementation of the python runtime easier.
I suggest that one should read Guido van Rossum's blog on this topic - Why explicit self has to stay.
When a method definition is decorated, we don't know whether to automatically give it a 'self' parameter or not: the decorator could turn the function into a static method (which has no 'self'), or a class method (which has a funny kind of self that refers to a class instead of an instance), or it could do something completely different (it's trivial to write a decorator that implements '@classmethod' or '@staticmethod' in pure Python). There's no way, without knowing what the decorator does, to decide whether to endow the method being defined with an implicit 'self' argument or not.
I reject hacks like special-casing '@classmethod' and '@staticmethod'.
Python doesn't force you to use "self". You can give it whatever name you want; you just have to remember that the first argument in a method definition header is a reference to the object.
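For example, this works, even though it is frowned upon:

class Greeter:
    def greet(this, name):     # "this" instead of "self" -- perfectly legal, just unconventional
        print("Hello,", name, "from a", this.__class__.__name__)

Greeter().greet("world")       # Hello, world from a Greeter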
It also allows you to do this (in short: invoking Outer(3).create_inner_class(4)().weird_sum_with_closure_scope(5) will return 12, but will do so in the craziest of ways):
class Outer(object):
    def __init__(self, outer_num):
        self.outer_num = outer_num

    def create_inner_class(outer_self, inner_arg):
        class Inner(object):
            # A different name is used for the class attribute on purpose:
            # rebinding the name inner_arg inside the class body would make the
            # right-hand side an unbound class-local name and raise NameError here.
            inner_attr = inner_arg

            def weird_sum_with_closure_scope(inner_self, num):
                return num + outer_self.outer_num + inner_arg
        return Inner
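Which indeed gives:
>>> Outer(3).create_inner_class(4)().weird_sum_with_closure_scope(5)
12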
Of course, this is harder to imagine in languages like Java and C#. By making the self reference explicit, you're free to refer to any object by that self reference. Also, such a way of playing with classes at runtime is harder to do in the more static languages - not that it's necessarily good or bad. It's just that the explicit self allows all this craziness to exist.
Moreover, imagine this: We'd like to customize the behavior of methods (for profiling, or some crazy black magic). This can lead us to think: what if we had a class Method whose behavior we could override or control?
Well here it is:
from functools import partial

class MagicMethod(object):
    """Does black magic when called"""

    def __get__(self, obj, obj_type):
        # This binds the <other> class instance to the <innocent_self> parameter
        # of the method MagicMethod.invoke
        return partial(self.invoke, obj)

    def invoke(magic_self, innocent_self, *args, **kwargs):
        # do black magic here
        ...
        print(magic_self, innocent_self, args, kwargs)

class InnocentClass(object):
    magic_method = MagicMethod()
And now: InnocentClass().magic_method() will act as expected. The method will be bound with the innocent_self parameter to the InnocentClass instance, and with magic_self to the MagicMethod instance. Weird huh? It's like having 2 keywords this1 and this2 in languages like Java and C#. Magic like this allows frameworks to do stuff that would otherwise be much more verbose.
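For example (the object addresses in the output will of course differ):
>>> InnocentClass().magic_method()
<__main__.MagicMethod object at 0x...> <__main__.InnocentClass object at 0x...> () {}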
Again, I don't want to comment on the ethics of this stuff. I just wanted to show things that would be harder to do without an explicit self reference.
I think it has to do with PEP 227:
Names in class scope are not accessible. Names are resolved in the innermost enclosing function scope. If a class definition occurs in a chain of nested scopes, the resolution process skips class definitions. This rule prevents odd interactions between class attributes and local variable access. If a name binding operation occurs in a class definition, it creates an attribute on the resulting class object. To access this variable in a method, or in a function nested within a method, an attribute reference must be used, either via self or via the class name.
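A short example of the rule being quoted:

class Config:
    default_name = "anonymous"         # name binding in the class body -> class attribute

    def describe(self):
        # A bare reference to default_name here would raise NameError,
        # because the class scope is skipped during name resolution.
        return self.default_name        # so go through self (or Config.default_name)

print(Config().describe())              # anonymous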
I think the real reason, besides "The Zen of Python", is that functions are first-class citizens in Python.
That essentially makes them objects. Now the fundamental issue is: if your functions are objects as well, then, in the object-oriented paradigm, how would you send messages to objects when the messages themselves are objects?
It looks like a chicken-and-egg problem. To reduce this paradox, the only possible way is either to pass an execution context to methods or to detect it. But since Python allows nested functions, detecting it would be impossible, as the execution context changes for inner functions.
This means the only possible solution is to explicitly pass 'self' (the execution context).
So I believe it is an implementation matter, and the Zen justification came much later.
As explained in self in Python, Demystified:
anything like obj.meth(args) becomes Class.meth(obj, args). The calling process is automatic while the receiving process is not (it's explicit). This is the reason the first parameter of a function in a class must be the object itself.
class Point(object):
    def __init__(self, x=0, y=0):
        self.x = x
        self.y = y

    def distance(self):
        """Find distance from origin"""
        return (self.x**2 + self.y**2) ** 0.5
Invocations:
>>> p1 = Point(6,8)
>>> p1.distance()
10.0
__init__() defines three parameters, but we just passed two (6 and 8). Similarly, distance() requires one, but zero arguments were passed.
Why is Python not complaining about this argument number mismatch?
Generally, when we call a method with some arguments, the corresponding class function is called by placing the method's object before the first argument. So, anything like obj.meth(args) becomes Class.meth(obj, args). The calling process is automatic while the receiving process is not (it's explicit).
This is the reason the first parameter of a function in a class must be the object itself. Writing this parameter as self is merely a convention. It is not a keyword and has no special meaning in Python. We could use other names (like this), but I strongly suggest against it: using names other than self is frowned upon by most developers and degrades the readability of the code ("Readability counts").
...
In the first example, self.x is an instance attribute whereas x is a local variable. They are not the same and they live in different namespaces.
Self Is Here To Stay
Many have proposed to make self a keyword in Python, like this in C++ and Java. This would eliminate the redundant use of explicit self from the formal parameter list in methods. While this idea seems promising, it's not going to happen. At least not in the near future. The main reason is backward compatibility. Here is a blog from the creator of Python himself explaining why the explicit self has to stay.
The 'self' parameter holds the object the method is being called on.

class class_name:
    class_variable = 0

    def method_name(self, arg):
        self.var = arg

obj = class_name()
obj.method_name(10)

Here, the self argument receives the object obj, so the statement self.var = arg sets obj.var.
There is also another very simple answer: according to the Zen of Python, "explicit is better than implicit".
I have a class like this
class A(object):
def __init__(self, name):
self.name = name
def run(self):
pass
if we look at the type of run, it is a function. I am now writing a decorator, and this decorator should be usable with either a standalone function or a method, but it has different behavior if the thing it is decorating is a method. When registering the method run, the decorator cannot really tell whether the function is a method, because it has not been bound to an object yet. I have tried inspect.ismethod and it does not work either. Is there a way I can detect in my decorator that run is a method rather than a standalone function? Thanks!
To add a bit more info:
Basically I am logging something. If the decorator is decorating an object's method, I need the name of that object's class and the method name; if it is decorating a plain function, I just need the function name.
As mentioned by chepner, a function only becomes a method when it's used as one - i.e. when it's looked up on an instance and resolved on the class. What you are decorating is and will always be a function (well, unless you already decorated it with something that returns another callable type, of course - cf. the classmethod type).
At this point you have two options: the safe and explicit one, and the unsafe guessing game one.
The safe and explicit solution is, simply, to have two distinct decorators, one for plain functions, and another for "functions to be used as methods".
The unsafe guessing-game one is to inspect the function's first argument name (using inspect.getfullargspec() or inspect.signature()) and consider it a "function to be used as a method" if the first argument is named "self", as sketched below.
Obviously the safe and explicit solution is also much simpler ;-)
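A rough sketch of that guessing-game approach (names invented for illustration; it just logs, per the question):

import functools
import inspect

def log_calls(func):
    # Guess "method vs plain function" from the first parameter's name.
    params = list(inspect.signature(func).parameters)
    looks_like_method = bool(params) and params[0] == "self"

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        if looks_like_method and args:
            # args[0] is presumably the instance, so we can log its class too
            print(f"calling {type(args[0]).__name__}.{func.__name__}")
        else:
            print(f"calling {func.__name__}")
        return func(*args, **kwargs)

    return wrapper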
In Python, I have a class that I've built.
However, there is one method where I apply a rather specific type of substring-search procedure. This procedure could be a standalone function by itself (it just requires a needle and a haystack string), but it feels odd to have the function outside the class, because my class depends on it.
What is the typical design paradigm for this? Is it typical to just have myClassName.py with the main class, as well as all the support functions outside the class itself, in the same file? Or is it better to have the support function embedded within the class at the expense of modularity?
You can create a staticmethod, like so:
class yo:
    @staticmethod
    def say_hi():
        print("Hi there!")
Then, you can do this:
>>> yo.say_hi()
Hi there!
>>> a = yo()
>>> a.say_hi()
Hi there!
They can be used non-statically, and statically (if that makes sense).
About where to put your functions...
If a method is required by a class, and it is appropriate for the method to operate on data that is specific to an instance of the class, then make it a method. This is what you would want:
class yo:
    def __init__(self):
        self.message = "Hello there!"

    def say_message(self):
        print(self.message)
My say_message relies on the data that is particular to the instance of a class.
If you feel the need to have a function, in addition to the class method, by all means go ahead. Use whichever one is more appropriate in your script. There are many examples of this, including in the python built-ins. Take generator objects for example:
a = my_new_generator()
a.__next__()   # the method form (this was spelled a.next() in Python 2)
Can also be done as:
a = my_new_generator()
next(a)
Use whichever is more appropriate, and obviously whichever one is more readable. :)
If you can think of any reason to override this function one day, make it a staticmethod, else a plain function is just fine - FWIW, your class probably depends on much more than this simple function. And if you cannot think of any reason for anyone else to ever use this function, keep it in the same module as your class.
As a side note: "myClassName.py" is definitely unpythonic. First because module names should be all_lower, then because the one-module-per-class pattern makes no sense in Python - we group related classes and functions (and exceptions and whatnot) together.
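As a sketch of that layout (module and names invented for illustration), the helper simply lives next to the class in the same module:

# textsearch.py -- related class and helper grouped in one module

def find_needle(needle, haystack):
    # Plain module-level function: the substring search needs no instance state.
    return haystack.find(needle)

class Document:
    def __init__(self, text):
        self.text = text

    def contains(self, needle):
        # The class depends on the helper, but the helper stays a free function.
        return find_needle(needle, self.text) != -1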
If the search method you are talking about is really so specific and you will never need to reuse it somewhere else, I do not see any reason to make it static. The fact that it doesn't require access to instance variables doesn't make it static by definition.
If there is a possibility that this method is going to be reused, refactor it into a helper/utility class (again, not static).
ADDED:
Just wanted to add that when you consider whether something should be static or not, think about how the method name relates to the class name. Does this method name make more sense when used in a class context or in an object context?
class Foo(object):
    pass

foo = Foo()

def bar(self):
    print('bar')

Foo.bar = bar
foo.bar()  # bar
Coming from JavaScript: if a "class" prototype is augmented with a certain attribute, it is known that all instances of that "class" will have that attribute in their prototype chain, hence no modifications have to be made to any of its instances or "sub-classes".
In that sense, how can a Class-based language like Python achieve Monkey patching?
The real question is, how can it not? In Python, classes are first-class objects in their own right. Attribute access on instances of a class is resolved by looking up attributes on the instance, and then the class, and then the parent classes (in the method resolution order.) These lookups are all done at runtime (as is everything in Python.) If you add an attribute to a class after you create an instance, the instance will still "see" the new attribute, simply because nothing prevents it.
In other words, it works because Python doesn't cache attributes (unless your code does), because it doesn't use negative caching or shadowclasses or any of the optimization techniques that would inhibit it (or, when Python implementations do, they take into account the class might change) and because everything is runtime.
I just read through a bunch of documentation, and as far as I can tell, the whole story of how foo.bar is resolved is as follows:
1. Can we find foo.__getattribute__ by the following process? If so, use the result of foo.__getattribute__('bar').
   (Looking up __getattribute__ will not cause infinite recursion, but the implementation of it might.)
   (In reality, we will always find __getattribute__ on new-style objects, since a default implementation is provided by object - but that implementation is essentially the process described below. ;) )
   (If we define a __getattribute__ method in Foo, and access foo.__getattribute__, then foo.__getattribute__('__getattribute__') will be called! But this does not imply infinite recursion - if you are careful ;) )
2. Is bar a "special" name for an attribute provided by the Python runtime (e.g. __dict__, __class__, __bases__, __mro__)? If so, use that. (As far as I can tell, __getattribute__ falls into this category, which avoids infinite recursion.)
3. Is bar in the foo.__dict__ dict? If so, use foo.__dict__['bar'].
4. Does foo.__mro__ exist (i.e., is foo actually a class)? If so, for each base class base in foo.__mro__[1:]:
   (Note that the first entry would be foo itself, which we already searched.)
   - Is bar in base.__dict__? If so:
     - Let x be base.__dict__['bar'].
     - Can we find (again, recursively, but it won't cause a problem) x.__get__?
       If so, use x.__get__(foo, foo.__class__).
       (Note that the function bar is, itself, an object, and the Python compiler automatically gives functions a __get__ attribute which is designed to be used this way.)
       Otherwise, use x.
5. For each base class base of foo.__class__.__mro__:
   (Note that this recursion is not a problem: those attributes should always exist, and fall into the "provided by the Python runtime" case. foo.__class__.__mro__[0] will always be foo.__class__, i.e. Foo in our example.)
   (Note that we do this even if foo.__mro__ exists. This is because classes have a class, too: its name is type, and it provides, among other things, the method used to calculate __mro__ attributes in the first place.)
   - Is bar in base.__dict__? If so:
     - Let x be base.__dict__['bar'].
     - Can we find (again, recursively, but it won't cause a problem) x.__get__?
       If so, use x.__get__(foo, foo.__class__).
       (Note that the function bar is, itself, an object, and the Python compiler automatically gives functions a __get__ attribute which is designed to be used this way.)
       Otherwise, use x.
6. If we still haven't found something to use: can we find foo.__getattr__ by the preceding process? If so, use the result of foo.__getattr__('bar').
7. If everything failed, raise AttributeError.
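You can poke at a couple of these steps by hand, reusing the Foo/foo/bar definitions from the monkey-patching snippet above (the 0x... addresses will vary):
>>> Foo.__dict__['bar']                  # found in a class along the MRO, still a plain function
<function bar at 0x...>
>>> bound = Foo.__dict__['bar'].__get__(foo, Foo)   # functions are descriptors, so bind to foo
>>> bound
<bound method bar of <__main__.Foo object at 0x...>>
>>> bound()                              # exactly what foo.bar() does
bar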
bar.__get__ is not really a function - it's a "method-wrapper" - but you can imagine it being implemented vaguely like this:
# Somewhere in the Python internals
class __method_wrapper(object):
    def __init__(self, func):
        self.func = func

    def __call__(self, obj, cls):
        return lambda *args, **kwargs: self.func(obj, *args, **kwargs)
        # Except it actually returns a "bound method" object
        # that uses cls for its __repr__
        # and there is a __repr__ for the method_wrapper that I *think*
        # uses the hashcode of the underlying function, rather than of itself,
        # but I'm not sure.

# Automatically done after compiling bar
bar.__get__ = __method_wrapper(bar)
The "binding" that happens within the __get__ automatically attached to bar (called a descriptor), by the way, is more or less the reason why you have to specify self parameters explicitly for Python methods. In Javascript, this itself is magical; in Python, it is merely the process of binding things to self that is magical. ;)
And yes, you can explicitly set a __get__ method on your own objects and have it do special things when you set a class attribute to an instance of the object and then access it from an instance of that other class. Python is extremely reflective. :) But if you want to learn how to do that, and get a really full understanding of the situation, you have a lot of reading to do. ;)
Being probably one of the worst OOP programmers on the planet, I've been reading through a lot of example code to help 'get' what a class can be used for. Recently I found this example:
class NextClass:                       # define class
    def printer(self, text):          # define method
        self.message = text           # change instance
        print(self.message)           # access instance

x = NextClass()                        # make instance
x.printer('instance call')             # call its method
print(x.message)                       # instance changed

NextClass.printer(x, 'class call')     # direct class call
print(x.message)                       # instance changed again
It doesn't appear there is any difference between what the direct class call does and the instance call does; but it goes against the Zen to include features like that without some use to them. So if there is a difference, what is it? Performance? Overhead reduction? Maybe readability?
There is no difference. instance.method(...) is class.method(instance, ...). But this doesn't go against the Zen, since it says (emphasis mine):
There should be one-- and preferably only one --obvious way to do it.
The second way is possible, and everyone with good knowledge of Python should know that (and why), but it's a nonobvious way of doing it, and nobody uses it in real code.
So why is it that way? It's just how methods work in any language - a method is some code that operates on an object/instance (and possibly more arguments). Except that usually the instance is supplied implicitly (e.g. this in C++/Java/D) - but since the Zen says "explicit is better than implicit", self is explicitly a parameter of every method, which inevitably allows this. Explicitly prohibiting it would be pointless.
And apart from that, the fact that methods are not forced to (implicitly) take an instance allows class methods and static methods to be defined without special treatment of the language - the first is just a method that expects a class instead of an instance, and the latter is just a method that doesn't expect an instance at all.
In this situation, there is no difference. In both calls, you are supplying an instance of that class:
x.printer('instance call') # you supplied x and then called its printer method
NextClass.printer(x, 'class call') # you supplied x as a parameter this time
Normally, though, I wouldn't write the second method very often. I usually think of any method that operates on an instance as an instance method. Things like:
car.drive('place')
car.refuel()
car.impound()
And, I use class methods to operate more generally (I'm struggling to describe this):
Car.numberintheworld # returns the number of cars in the world (or your program)
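A sketch of what that pseudo-code is reaching for (attribute and method names invented for illustration):

class Car:
    number_in_the_world = 0

    def __init__(self):
        Car.number_in_the_world += 1

    def drive(self, place):
        # instance method: operates on one particular car
        print("driving to", place)

    @classmethod
    def count(cls):
        # class method: about cars in general, not any single car
        return cls.number_in_the_world

Car(); Car()
print(Car.count())   # 2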
Here's some more help, for your reading.
This might give you a better clarification.
When an instance attribute is referenced that isn't a data attribute, its class is searched. If the name denotes a valid class attribute that is a function object, a method object is created by packing (pointers to) the instance object and the function object just found together in an abstract object: this is the method object. When the method object is called with an argument list, a new argument list is constructed from the instance object and the argument list, and the function object is called with this new argument list.
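In terms of the NextClass example above, that "packing" is easy to observe (the 0x... addresses will vary):
>>> NextClass.printer
<function NextClass.printer at 0x...>
>>> x.printer
<bound method NextClass.printer of <__main__.NextClass object at 0x...>>
>>> x.printer.__self__ is x                    # the instance that got packed into the method object
True
>>> x.printer.__func__ is NextClass.printer    # the underlying function object
True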