Dynamically overriding __functions__ in Python

Let's say I want to override a function like __int__ in a Python class so that I may do something like this.
class A(object):
    def __init__(self):
        self.__int__ = lambda: 1

a = A()
print(int(a))
I expect that it would output "1" here instead of producing this error message:
TypeError: int() argument must be a string or a number, not 'A'
When __int__ is instead defined as a method on the class, it works as expected. Why? (The same problem exists with any of the double-underscore methods.)

That appears to be one more bit of magic in the __ magic methods. Unlike other methods, they're not looked up on the instance when called implicitly.
It's well documented that they don't get resolved with the __getattribute__ magic method (and it would be a nice paradox if they did, since __getattribute__ would have to call itself to resolve its own name). But not checking the instances surprises me.
It's discussed a bit here, under the header "Special Method Lookup":
http://segfaulthunter.github.io/articles/biggestsurprise/
For instances of new-style classes, all special method lookup that is done implicitly is done in the class struct. Thus changing an instance's __str__ attribute does not affect the result of str(). Still, explicitly getting the attribute from the instance gives you the method in the instance.
I will be curious to see if anyone else can offer a more detailed explanation.
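A quick way to see the difference is to set __int__ on the instance and then on the class and compare (a minimal sketch of the behavior described above, assuming a new-style class):
class A(object):
    def __init__(self):
        self.__int__ = lambda: 1

a = A()
print(a.__int__())          # 1 -- explicit attribute lookup does find the instance attribute
# print(int(a))             # TypeError -- the implicit special method lookup skips the instance

A.__int__ = lambda self: 1  # assign on the class instead
print(int(a))               # 1 -- the implicit lookup now succeeds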

If we for the moment ignore that you are asking about special methods, then your code looks like this:
class A(object):
    def __init__(self):
        self.a_fun = lambda: 1
which is written more clearly like this:
class A(object):
    def __init__(self):
        self._int = 1

    def a_fun(self):
        return self._int
The resulting code isn't exactly the same, but close enough for it to not make much of a difference. The only difference is that the _int name has to be looked up as an attribute.
But if we now change it back to be a special method, it looks like this:
class A(object):
    def __init__(self):
        self.__int__ = lambda: 1
vs:
class A(object):
    def __init__(self):
        self._int = 1

    def __int__(self):
        return self._int
And now there is a very big difference: the second variant works, the first one doesn't. This is because special methods are always looked up on the class, not the instance. This is by design.
So instead of trying to be clever, just write what is clear and readable. In Python that tends to work best. ;-)

How to keep python project modular?

Context
I've been working on a Python project recently, and found modularity very important. For example, suppose you made a class with some attributes and a line of code that uses those attributes, like:
a = A()
print("hi"+a.imA)
If you were to modify imA of class A to another type, you would have to modify the print statement. In my case I had to do this many times. It was annoying and time-consuming. Get/set methods would've solved this, but I heard that get/set are not 'good Python'. So how would you solve this problem without using get and set methods?
First point: you would have saved yourself quite some hassle by using string formatting instead of string concatenation, ie:
print("hi {}".format(a.imA))
Granted, the final result may or may not be what you'd expect, depending on how a.imA's type implements __str__() and __repr__(), but at least this will not break the code.
As for getters and setters, they are indeed considered rather unpythonic, because Python has strong support for computed attributes, and a simple generic implementation is available as the built-in property type.
NB: what's actually considered unpythonic is to systematically use implementation attributes and getters/setters (either explicit or, as is the case with computed attributes, implicit) when a plain public attribute is enough. This is considered unpythonic because you can always turn a plain attribute into a computed one without breaking the client code (assuming, of course, you don't change the type or semantics of the attribute) - something that was not possible with early OOPLs like Smalltalk, C++ or Java (Smalltalk being a bit of a special case, actually, but that's another topic).
In your case, if the point was to change the stored value's type without breaking the API, the simple obvious canonical solution was to use a property delegating to an implementation attribute:
before:
class Foo(object):
    def __init__(self, bar):
        # `bar` is expected to be the string representation of an int.
        self.bar = bar

    def frobnicate(self, val):
        return (int(self.bar) + val) / 2
after:
class Foo(object):
    def __init__(self, bar):
        # `bar` is expected to be the string representation of an int.
        self.bar = bar

    # but we want to store it as an int
    @property
    def bar(self):
        return str(self._bar)

    @bar.setter
    def bar(self, value):
        self._bar = int(value)

    def frobnicate(self, val):
        # internally we use the implementation attribute `_bar`
        return (self._bar + val) / 2
And you now have the value stored internally as an int, but the public interface is (almost) exactly the same - the only difference being that passing something that cannot be passed to int() will raise at the expected place (when you set it) instead of breaking at the most unexpected one (when you call .frobnicate()).
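For example, a quick check of the "after" version (assuming Python 3 and the Foo class exactly as sketched above):
f = Foo("42")
print(f.bar)             # '42' -- the public interface still hands back a string
print(f.frobnicate(8))   # 25.0 -- computed from the int stored in `_bar`
Foo("not a number")      # ValueError is raised here, at assignment time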
Now note that changing the type of a public attribute is just like changing the return type of a getter (or the type of a setter argument) - in both cases you are breaking the contract - so if what you wanted was really to change the type of A.imA, neither getters nor properties would have solved your issue - getters and setters (or, in Python, computed attributes) can only protect you from implementation changes.
EDIT: oh and yes: this has nothing to do with modularity (which is about writing decoupled, self-contained code that's easier to read, test, maintain and eventually reuse), but with encapsulation (whose aim is to make the public interface resilient to implementation changes).
First, use
print(f"hi {a.imA}") # Python 3.6+
or
print("hi {}".format(a.imA)) # all Python 3
instead of
print("hi"+a.imA)
That way, str will be called automatically on each argument.
Then define a __str__ method in all your classes, so that printing any instance always works.
class A:
    def __init__(self):
        self._member_1 = "spam"

    def __str__(self):
        return f"A(member 1: {self._member_1})"
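With that in place, both formatting approaches shown above call __str__ automatically:
a = A()
print(f"hi {a}")           # hi A(member 1: spam)
print("hi {}".format(a))   # hi A(member 1: spam)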

Why do we use @staticmethod?

I just can't see why we need to use @staticmethod. Let's start with an example.
class test1:
    def __init__(self, value):
        self.value = value

    @staticmethod
    def static_add_one(value):
        return value + 1

    @property
    def new_val(self):
        self.value = self.static_add_one(self.value)
        return self.value

a = test1(3)
print(a.new_val)  ## >>> 4
class test2:
    def __init__(self, value):
        self.value = value

    def static_add_one(self, value):
        return value + 1

    @property
    def new_val(self):
        self.value = self.static_add_one(self.value)
        return self.value

b = test2(3)
print(b.new_val)  ## >>> 4
In the example above, the method static_add_one in the two classes does not require the instance of the class (self) in its calculation.
The method static_add_one in the class test1 is decorated with @staticmethod and works properly.
But at the same time, the method static_add_one in the class test2, which has no @staticmethod decoration, also works properly, by using the trick of accepting a self argument that it never uses.
So what is the benefit of using @staticmethod? Does it improve performance? Or is it just due to the Zen of Python, which states that "explicit is better than implicit"?
The reason to use staticmethod is if you have something that could be written as a standalone function (not part of any class), but you want to keep it within the class because it's somehow semantically related to the class. (For instance, it could be a function that doesn't require any information from the class, but whose behavior is specific to the class, so that subclasses might want to override it.) In many cases, it could make just as much sense to write something as a standalone function instead of a staticmethod.
Your example isn't really the same. A key difference is that, even though you don't use self, you still need an instance to call static_add_one --- you can't call it directly on the class with test2.static_add_one(1). So there is a genuine difference in behavior there. The most serious "rival" to a staticmethod isn't a regular method that ignores self, but a standalone function.
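A small sketch of that comparison (the names are made up for illustration):
def add_one(value):            # standalone module-level function: no class involved
    return value + 1

class Adder:
    @staticmethod
    def add_one(value):        # same code, but namespaced under the class
        return value + 1

print(add_one(3))          # 4
print(Adder.add_one(3))    # 4 -- callable on the class, no instance required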
Today I suddenly found a benefit of using @staticmethod.
If you create a staticmethod within a class, you don't need to create an instance of the class before using it.
For example,
class File1:
    def __init__(self, path):
        out = self.parse(path)

    def parse(self, path):
        ..parsing works..
        return x

class File2:
    def __init__(self, path):
        out = self.parse(path)

    @staticmethod
    def parse(path):
        ..parsing works..
        return x

if __name__ == '__main__':
    path = 'abc.txt'
    File1.parse(path)  # TypeError: unbound method parse() ....
    File2.parse(path)  # Goal!!!!!!!!!!!!!!!!!!!!
Since the method parse is strongly related to the classes File1 and File2, it is more natural to put it inside the class. However, sometimes this parse method may also be used in other classes under some circumstances. If you want to do so with File1, you must create an instance of File1 before calling the method parse. With the staticmethod in the class File2, you can call the method directly using the syntax File2.parse.
This makes your work more convenient and natural.
I will add something other answers didn't mention. It's not only a matter of modularity, of putting something next to other logically related parts. It's also that the method could be non-static at some other point of the hierarchy (i.e. in a subclass or superclass) and thus participate in polymorphism (type-based dispatching). So if you put that function outside the class you will be precluding subclasses from effectively overriding it. Now, say you realize you don't need self in function C.f of class C; you have three options:
Put it outside the class. But we just decided against this.
Do nothing new: while unused, still keep the self parameter.
Declare that you are not using the self parameter, while still letting other C methods call f as self.f, which is required if you wish to keep open the possibility of further overrides of f that do depend on some instance state.
Option 2 demands less conceptual baggage (you already have to know about self and methods-as-bound-functions, because it's the more general case). But you still may prefer to be explicit about self not being used (and the interpreter could even reward you with some optimization, not having to partially apply a function to self). In that case, you pick option 3 and add @staticmethod on top of your function.
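A sketch of option 3 (with hypothetical names), showing that other methods still call self.f and that a subclass override may use instance state:
class C:
    @staticmethod
    def f(x):              # no instance state needed here
        return x * 2

    def g(self, x):
        return self.f(x)   # other C methods keep calling it as self.f

class D(C):
    def __init__(self, factor):
        self.factor = factor

    def f(self, x):        # override that *does* depend on instance state
        return x * self.factor

print(C().g(3))     # 6
print(D(10).g(3))   # 30 -- the override is picked up polymorphically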
Use @staticmethod for methods that don't need to operate on a specific object, but that you still want located in the scope of the class (as opposed to module scope).
Your example in test2.static_add_one wastes its time passing an unused self parameter, but otherwise works the same as test1.static_add_one. Note that this extraneous parameter can't be optimized away.
One example I can think of is in a Django project I have, where a model class represents a database table, and an object of that class represents a record. There are some functions used by the class that are stand-alone and do not need an object to operate on, for example a function that converts a title into a "slug", which is a representation of the title that follows the character set limits imposed by URL syntax. The function that converts a title to a slug is declared as a staticmethod precisely to strongly associate it with the class that uses it.
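A rough sketch of that idea (not the actual project code; the slug logic is simplified and the names are made up):
import re

class Article:
    def __init__(self, title):
        self.title = title
        self.slug = self.make_slug(title)

    @staticmethod
    def make_slug(title):
        # lowercase, collapse runs of non-alphanumeric characters into hyphens
        return re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")

print(Article.make_slug("Hello, World!"))   # hello-world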

In python, how can I tell which class a super object is wrapping?

I.e., if I have a class MyClass and I do super(MyClass).__init__, how can I tell which class's __init__ is actually going to be called?
Some code to illustrate:
class MyClass(OtherClass, ThirdClass):
    def __init__(self):
        mySuper = super(MyClass)
        if mySuper == SomeClass:
            # doesn't work - mySuper is a super object (not a normal class object)
            pass
        if mySuper.__init__ == SomeClass.__init__:
            # doesn't work - mySuper.__init__ is a super-method-wrapper object
            pass
        if mySuper.__thisclass__ == SomeClass:
            # doesn't work - __thisclass__ is set to be MyClass, not the "parent" class
            pass
Any ideas?
EDIT:
If I hadn't already awarded points here, I would probably delete this question, as it's not really very useful as posed, and could potentially encourage bad habits.
As sven-marnach notes, I'm using the one-arg version, super(MyClass), instead of the more useful two-arg version, super(MyClass, self)... and now, I have no idea why I would have wanted to do that. My best guess is that I was still unclear on the proper usage of super at the time.
If you're using the two-arg version, then the second check works - with the caveat that you would need to get .im_func, i.e.:
if mySuper.__init__.im_func == SomeClass.__init__.im_func:
See Determine whether super().__new__ will be object.__new__ in Python 3? for an example of why this sort of check is useful...
You can extract the wrapped class using
mro = my_super.__self_class__.mro()
wrapped_class = mro[mro.index(my_super.__thisclass__) + 1]
This looks complex, but I also think it is rather pointless to do this.
Edit: I just noticed you don't pass self to super(). For that case, you could use
wrapped_class = my_super.__thisclass__.mro()[1]
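A small demonstration of both lookups (with made-up base classes):
class Base1: pass
class Base2: pass

class MyClass(Base1, Base2):
    def __init__(self):
        my_super = super(MyClass, self)                     # two-arg form
        mro = my_super.__self_class__.mro()
        print(mro[mro.index(my_super.__thisclass__) + 1])   # <class '__main__.Base1'>

        one_arg = super(MyClass)                            # one-arg form, as in the question
        print(one_arg.__thisclass__.mro()[1])               # <class '__main__.Base1'>

MyClass()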
The question that remains is: why would you want to do this?

Mutate an object into an instance of one its subclasses

Is it possible to mutate an object into an instance of a derived class of the initial object's class?
Something like:
class Base():
    def __init__(self):
        self.a = 1

    def mutate(self):
        self = Derived()

class Derived(Base):
    def __init__(self):
        self.b = 2
But that doesn't work.
>>> obj = Base()
>>> obj.mutate()
>>> obj.a
1
>>> obj.b
AttributeError...
If this isn't possible, how should I do otherwise?
My problem is the following:
My Base class is like a "summary", and the Derived class is the "whole thing". Of course getting the "whole thing" is a bit expensive so working on summaries as long as it is possible is the point of having these two classes. But you should be able to get it if you want, and then there's no point in having the summary anymore, so every reference to the summary should now be (or contain, at least) the whole thing. I guess I would have to create a class that can hold both, right?
class Thing():
    def __init__(self):
        self.summary = Summary()
        self.whole = None

    def get_whole_thing(self):
        self.whole = Whole()
Responding to the original question as posed, changing the mutate method to:
def mutate(self):
    self.__class__ = Derived
will do exactly what was requested -- change self's class to be Derived instead of Base. This does not automatically execute Derived.__init__, but if that's desired it can be explicitly called (e.g. as self.__init__() as the second statement in the method).
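A runnable sketch of that approach (Python 3; unlike the question's code, Derived.__init__ here also chains to Base.__init__ so that both attributes end up set):
class Base:
    def __init__(self):
        self.a = 1

    def mutate(self):
        self.__class__ = Derived   # rebind the instance's class
        self.__init__()            # optionally re-run the (now Derived) initializer

class Derived(Base):
    def __init__(self):
        super().__init__()
        self.b = 2

obj = Base()
obj.mutate()
print(type(obj).__name__, obj.a, obj.b)   # Derived 1 2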
Whether this is a good approach for the OP's actual problem is a completely different question than the original question, which was
Is it possible to mutate an object into an instance of a derived class of the initial object's class?
The answer to this is "yes, it's possible" (and it's done the way I just showed). "Is it the best approach for my specific application problem" is a different question than "is it possible";-)
A general OOP approach would be to make the summary object a facade that delegates the expensive operations to a (dynamically constructed) back-end object. You could even make it totally transparent, so that callers of the object don't see that there is anything going on (well, not unless they start timing things, of course).
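A minimal sketch of such a facade (the names are made up; __getattr__ forwards anything the facade itself doesn't define to the current back end):
class Summary:
    def __init__(self):
        self.title = "short version"

class Whole:
    def __init__(self):
        self.title = "short version"
        self.body = "the expensive full text"

class Thing:
    def __init__(self):
        self._impl = Summary()      # start with the cheap back end

    def load_whole(self):
        self._impl = Whole()        # swap in the expensive back end

    def __getattr__(self, name):
        # only called for attributes not found on Thing itself; delegate them
        return getattr(self._impl, name)

t = Thing()
print(t.title)    # served by the Summary
t.load_whole()
print(t.body)     # now served by the Whole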
I forgot to say that I also wanted to be able to create a "whole thing" from the start and not a summary if it wasn't needed.
I've finally done it like that:
class Thing():
    def __init__(self, summary=False):
        if summary:
            self.summary = "summary"
            self._whole = None
        else:
            self._whole = "wholething"

    @property
    def whole(self):
        if self._whole:
            return self._whole
        else:
            self.__init__()
            return self._whole
Works like a charm :)
You cannot assign to self to do what you want, but you can change the class of an object by assigning to self.__class__ in your mutate method.
However this is really bad practice - for your situation delegation is better than inheritance.

Is it ever useful to define a class method with a reference to self not called 'self' in Python?

I'm teaching myself Python and I see the following in Dive into Python section 5.3:
By convention, the first argument of any Python class method (the reference to the current instance) is called self. This argument fills the role of the reserved word this in C++ or Java, but self is not a reserved word in Python, merely a naming convention. Nonetheless, please don't call it anything but self; this is a very strong convention.
Considering that self is not a Python keyword, I'm guessing that it can sometimes be useful to use something else. Are there any such cases? If not, why is it not a keyword?
No, unless you want to confuse every other programmer that looks at your code after you write it. self is not a keyword because it is an identifier. It could have been a keyword and the fact that it isn't one was a design decision.
As a side observation, note that Pilgrim is committing a common misuse of terms here: a class method is quite a different thing from an instance method, which is what he's talking about here. As wikipedia puts it, "a method is a subroutine that is exclusively associated either with a class (in which case it is called a class method or a static method) or with an object (in which case it is an instance method).". Python's built-ins include a staticmethod type, to make static methods, and a classmethod type, to make class methods, each generally used as a decorator; if you don't use either, a def in a class body makes an instance method. E.g.:
>>> class X(object):
...     def noclass(self): print self
...     @classmethod
...     def withclass(cls): print cls
...
>>> x = X()
>>> x.noclass()
<__main__.X object at 0x698d0>
>>> x.withclass()
<class '__main__.X'>
>>>
As you see, the instance method noclass gets the instance as its argument, but the class method withclass gets the class instead.
So it would be extremely confusing and misleading to use self as the name of the first parameter of a class method: the convention in this case is instead to use cls, as in my example above. While this IS just a convention, there is no real good reason for violating it -- any more than there would be, say, for naming a variable number_of_cats if the purpose of the variable is counting dogs!-)
The only case of this I've seen is when you define a function outside of a class definition, and then assign it to the class, e.g.:
class Foo(object):
    def bar(self):
        # Do something with 'self'
        pass

def baz(inst):
    return inst.bar()

Foo.baz = baz
In this case, self is a little strange to use, because the function could be applied to many classes. Most often I've seen inst or cls used instead.
I once had some code like (and I apologize for lack of creativity in the example):
class Animal:
    def __init__(self, volume=1):
        self.volume = volume
        self.description = "Animal"

    def Sound(self):
        pass

    def GetADog(self, newvolume):
        class Dog(Animal):
            def Sound(this):
                return self.description + ": " + ("woof" * this.volume)
        return Dog(newvolume)
Then we have output like:
>>> a = Animal(3)
>>> d = a.GetADog(2)
>>> d.Sound()
'Animal: woofwoof'
I wasn't sure if self within the Dog class would shadow self within the Animal class, so I opted to make Dog's reference the word "this" instead. In my opinion and for that particular application, that was more clear to me.
Because it is a convention, not language syntax. There is a Python style guide that people who program in Python follow. This way libraries have a familiar look and feel. Python places a lot of emphasis on readability, and consistency is an important part of this.
I think that the main reason self is used by convention rather than being a Python keyword is because it's simpler to have all methods/functions take arguments the same way rather than having to put together different argument forms for functions, class methods, instance methods, etc.
Note that if you have an actual class method (i.e. one defined using the classmethod decorator), the convention is to use "cls" instead of "self".
