Python std methods hierarchy calls documented? - python

I just ran into a problem while subclassing the dict type. I overrode the __iter__ method and expected that to affect other methods such as iterkeys, keys, etc., because I believed they call __iter__ to get their values. It seems, however, that they are implemented independently and I have to override all of them.
Is this a bug, or is it intentional that they don't use the other methods and retrieve the values separately?
I couldn't find a description of the call dependencies between methods of the standard classes in the standard Python documentation. It would be handy for subclassing work and for working out which methods need to be overridden for proper behaviour. Is there any supplemental documentation about the internals of Python's base types/classes?

Subclass Mapping or MutableMapping from the collections module (collections.abc in Python 3) instead of dict and you get all those methods for free.
Here is an example of a minimal mapping and some of the methods you get for free:
import collections
class MinimalMapping(collections.Mapping):
    def __init__(self, *items):
        self.elements = dict(items)
    def __getitem__(self, key):
        return self.elements[key]
    def __len__(self):
        return len(self.elements)
    def __iter__(self):
        return iter(self.elements)
t = MinimalMapping()
print (t.iteritems, t.keys, t.itervalues, t.get)
To subclass any of the built-in containers you should always use the appropriate base class from the collections module.
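For readers on Python 3, where these base classes live in collections.abc, a minimal sketch of the same idea might look like the following. The PublicMapping class and its rule of hiding keys that start with an underscore are made up for illustration; the point is only that keys(), items() and get() follow whatever __iter__ and __getitem__ do.

```python
from collections.abc import Mapping


class PublicMapping(Mapping):
    """Hypothetical read-only mapping that hides keys starting with '_'."""

    def __init__(self, *items):
        self._elements = dict(items)

    def __getitem__(self, key):
        if key.startswith("_"):
            raise KeyError(key)
        return self._elements[key]

    def __iter__(self):
        # The only place that decides which keys are visible.
        return (k for k in self._elements if not k.startswith("_"))

    def __len__(self):
        return sum(1 for _ in self)


m = PublicMapping(("a", 1), ("_secret", 2))
print(list(m.keys()))    # derived from __iter__
print(list(m.items()))   # derived from __iter__ and __getitem__
print(m.get("_secret"))  # derived from __getitem__
```

Only three methods are written by hand here; the rest are mixin methods supplied by the abstract base class.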

If it is not specified in the documentation, it is implementation specific. Implementations other than CPython might reuse the __iter__ method to implement iterkeys and others. I would not consider this to be a bug, but simply a bit of freedom for the implementors.
I suspect there is a performance factor in implementing the methods independently, especially as dictionaries are so widely used in Python.
So basically, you should implement them.
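The behaviour the question describes can be reproduced with a few lines. This is a sketch of what CPython does with Python 3 syntax; LoudDict is a made-up name, and other implementations are free to behave differently, as noted above.

```python
class LoudDict(dict):
    """Hypothetical dict subclass that only overrides __iter__."""

    def __iter__(self):
        return (key.upper() for key in super().__iter__())


d = LoudDict(a=1, b=2)
print(list(d))         # goes through the overridden __iter__
print(list(d.keys()))  # in CPython, keys() does not consult the override
```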

You know the saying: "You know what happens when you assume." :-)
They don't officially document that stuff because they may decide to change it in the future. Any unofficial documentation you may find would simply document the current behavior of one Python implementation, and relying on it would result in your code being very, very fragile.
When there is official documentation of special methods, it tends to describe behavior of the interpreter with respect to your own classes, such as using __len__() when __nonzero__() isn't implemented, or only needing __lt__() for sorting.
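Both documented behaviors can be seen in a short sketch. It assumes Python 3, where the truth-value hook is called __bool__ rather than __nonzero__; the Basket and Version classes are hypothetical.

```python
class Basket:
    """Hypothetical container defining only __len__ (no __bool__)."""

    def __init__(self, items):
        self.items = list(items)

    def __len__(self):
        return len(self.items)


class Version:
    """Hypothetical value type defining only __lt__."""

    def __init__(self, number):
        self.number = number

    def __lt__(self, other):
        return self.number < other.number


empty, full = Basket([]), Basket(["apple"])
ordered = sorted([Version(3), Version(1), Version(2)])

print(bool(empty), bool(full))      # truthiness falls back to __len__
print([v.number for v in ordered])  # sorting needs nothing beyond __lt__
```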
Since Python uses duck typing, you usually don't actually need to inherit from a built-in class to make your own class act like one. So you might reconsider whether subclassing dict is really what you want to do. You might choose a different class, such as something from the collections module, or to encapsulate rather than inheriting. (The UserString class uses encapsulation.) Or just start from scratch.

Instead of subclassing dict, you could just make your own class that has exactly the properties you want without too much trouble. Here's a blog post with an example of how to do this. The __str__() method in it isn't the greatest, but that's easily corrected; the rest provides the functionality you seek.

Related

Why __slots__ isn't the default in Python?

I've been programming in Python for a long time, but I still can't understand why classes base their attribute lookup on the __dict__ dictionary by default instead of the faster __slots__ tuple.
Wouldn't it make more sense to use the more efficient and less flexible __slots__ method as the default implementation and instead make the more flexible, but slower __dict__ method optional?
Also, if a class uses __slots__ to store its attributes, there's no chance of mistakenly creating new attributes like this:
class Object:
    __slots__ = ("name",)
    def __init__(self, name):
        self.name = name
obj = Object("Karen")
# Note the typo here: with __slots__ this raises AttributeError
obj.namr = "Karen"
So, I was wondering if there's a valid reason why Python defaults to accessing instance attributes through __dict__ instead of through __slots__.
Python is designed to be an extremely flexible language, and allows objects to modify themselves in many interesting ways at runtime. Making a change to prevent that kind of flexibility would break a massive amount of other people's code, so for the sake of backwards compatibility I don't think it will happen any time soon (if at all).
As well as this, due to the way Python code is interpreted, it is very difficult to design a system that can look ahead and determine exactly what variables a particular class will use ahead of time, especially given the existence of setattr() and other similar functions, which can modify the state of other objects in unpredictable ways.
In summary, Python is designed to value flexibility over performance, and as such, having __slots__ be an optional technique to speed up parts of your code is a trade-off that you choose to make if you wish to write your code in Python. I can't answer whether this is a worthwhile design decision for you, since it's entirely based on opinion.
If you wish to have a bit more safety to prevent issues such as the one you described, there are tools such as mypy and pylint which can catch that sort of error.
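To make the trade-off concrete, here is a small sketch comparing a default class with one that opts into __slots__. The class names are invented for the comparison.

```python
class Plain:
    def __init__(self, name):
        self.name = name


class Slotted:
    __slots__ = ("name",)

    def __init__(self, name):
        self.name = name


plain = Plain("Karen")
plain.namr = "Karin"        # typo silently creates a new attribute

slotted = Slotted("Karen")
try:
    slotted.namr = "Karin"  # same typo, but there is no __dict__ to hold it
except AttributeError as exc:
    error = exc
else:
    error = None

print(hasattr(plain, "__dict__"), hasattr(slotted, "__dict__"))
print(type(error).__name__)
```

The flexible version accepts the misspelled attribute at runtime, which is exactly the kind of mistake a static checker is meant to flag before the code runs.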

Static classes in Python

I once read (I think on a page from Microsoft) that it's good practice to use static classes when you don't NEED two or more instances of a class.
I'm writing a program in Python. Is it bad style if I use @classmethod for every method of a class?
Generally, usage like this is better done by just using functions in a module, without a class at all.
It's terrible style, unless you actually need to access the class.
A static method [...] does not translate to a Python classmethod. Oh sure, it results in more or less the same effect, but the goal of a classmethod is actually to do something that's usually not even possible [...] (like inheriting a non-default constructor). The idiomatic translation of a [...] static method is usually a module-level function, not a classmethod or staticmethod.
source
In my experience creating a class is a very good solution for a number of reasons. One is that you wind up using the class as a 'normal' class (esp. making more than just one instance) more often than you might think. It's also a reasonable style choice to stick with classes for everything; this can make it easier for others who read/maintain your code, esp. if they are very OO - they will be comfortable with classes. As noted in other replies, it's also reasonable to just use 'bare' functions for the implementation. You may wish to start with a class and make it a singleton/Borg pattern (lots of examples if you google for these); it gives you the flexibility to (re)use the class to meet other needs. I would recommend against the 'static class' approach as being non-conventional and non-Pythonic, which makes it harder to read and maintain.
There are a few approaches you might take for this. As others have mentioned, you could just use module-level functions. In this case, the module itself is the namespace that holds them together. Another option, which can be useful if you need to keep track of state, is to define a class with normal methods (taking self), and then define a single global instance of it, and copy its instance methods to the module namespace. This is the approach taken by the standard library "random" module -- take a look at lib/python2.5/random.py in your python directory. At the bottom, it has something like this:
# Create one instance, seeded from current time, and export its methods
# as module-level functions. [...]
_inst = Random()
seed = _inst.seed
random = _inst.random
uniform = _inst.uniform
...
Or you can take the basic approach you described (though I would recommend using @staticmethod rather than @classmethod in most cases).
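The single-instance approach used by the random module can be sketched for your own code roughly as follows. The _Counter class and the exported names are hypothetical; the file is assumed to be a module of its own.

```python
class _Counter:
    """Hypothetical stateful helper; stands in for something like Random."""

    def __init__(self):
        self._value = 0

    def increment(self, step=1):
        self._value += step
        return self._value

    def reset(self):
        self._value = 0


# One hidden instance; its bound methods become module-level functions.
_inst = _Counter()
increment = _inst.increment
reset = _inst.reset

increment()
increment(5)
total = increment()
print(total)
```

Callers just write increment() as if it were a plain function, while anyone who needs independent state can still create another _Counter.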
You might actually want a singleton class rather than a static class:
Making a singleton class in python

Decorators versus inheritance

How do you decide between using decorators and inheritance when both are possible?
E.g., this problem has two solutions.
I'm particularly interested in Python.
Decorators...:
...should be used if what you are trying to do is "wrapping". Wrapping consists of taking something, modifying (or registering it with something), and/or returning a proxy object that behaves "almost exactly" like the original.
...are okay for applying mixin-like behavior, as long as you aren't creating a large stack of proxy objects.
...have an implied "stack" abstraction:
e.g.
@decoA
@decoB
@decoC
def myFunc(...): ...
    ...
Is equivalent to:
def myFunc(...): ...
    ...
myFunc = decoA(decoB(decoC(myFunc))) #note the *ordering*
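A runnable sketch of the stacking order described above. The tag factory and the three decorators built from it are invented purely to make the nesting visible in the output.

```python
def tag(label):
    """Hypothetical decorator factory: wraps the result in the given label."""
    def decorator(func):
        def wrapper(*args, **kwargs):
            return f"{label}({func(*args, **kwargs)})"
        return wrapper
    return decorator


deco_a, deco_b, deco_c = tag("A"), tag("B"), tag("C")


@deco_a
@deco_b
@deco_c
def stacked():
    return "x"


def manual():
    return "x"


# Same thing written out by hand: the decorator closest to the def runs first.
manual = deco_a(deco_b(deco_c(manual)))

print(stacked(), manual())
```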
Multiple inheritance...:
... is best for adding methods to classes; you cannot use it to decorate functions easily. In this context, it can be used to achieve mixin-like behavior if all you need is a set of "duck-typing style" extra methods.
... may be a bit unwieldy if your problem is not a good match for it, with issues with superclass constructors, etc. For example, a superclass's __init__ method will not be called unless it is called explicitly (via the method-resolution-order protocol)!
To sum up, I would use decorators for mixin-like behavior if they didn't return proxy objects. Some examples would include any decorator which returns the original function, slightly modified (or after registering it somewhere or adding it to some collection).
Things you will often find decorators for (like memoization) are also good candidates, but should be used in moderation if they return proxy objects; the order in which they are applied matters. And stacking too many decorators on top of one another is using them in a way they aren't intended to be used.
I would consider using inheritance if it was a "classic inheritance problem", or if all I needed for the mixin behavior were methods. A classic inheritance problem is one where you can use the child wherever you could use the parent.
In general, I try to write code where it is not necessary to enhance arbitrary things.
The problem you reference is not deciding between decorators and classes. It is using decorators, but you have the option of using either:
a decorator, which returns a class
a decorator, which returns a function
A decorator is just a fancy name for the "wrapper" pattern, i.e. replacing something with something else. The implementation is up to you (class or function).
When deciding between them, it's completely a matter of personal preference. You can do everything you can do in one with the other.
if decorating a function, you may prefer decorators which return proxy functions
if decorating a class, you may prefer decorators which return proxy classes
(Why is it a good idea? There may be assumptions that a decorated function is still a function, and a decorated class is still a class.)
Even better in both cases would be to use a decorator which just returns the original, modified somehow.
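A decorator that returns the original object after a side effect might look like this sketch. HANDLERS and register are hypothetical names.

```python
HANDLERS = {}


def register(name):
    """Hypothetical decorator factory: records the function, returns it as is."""
    def decorator(func):
        HANDLERS[name] = func
        return func  # the original object, not a proxy
    return decorator


@register("greet")
def greet(who):
    return f"hello {who}"


print(HANDLERS["greet"] is greet)
print(greet("world"))
```

Because nothing is wrapped, the function keeps its name, docstring and signature, and stacking several such decorators costs nothing at call time.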
edit: After better understanding your question, I have posted another solution at Python functools.wraps equivalent for classes
The other answers are quite great, but I wanted to give a succinct list of pros and cons.
The main advantage of mixins is that the type can be checked at runtime using isinstance and it can be checked with static type checkers like mypy. Like all inheritance, it should be used when you have an is-a relationship. For example, dataclass should probably have been a mixin in order to expose dataclass-specific introspection variables like the list of dataclass fields.
Decorators should be preferred when you don't have an is-a relationship. For example, a decorator that propagates documentation from another class, or registers a class in some collection.
Decoration typically only affects the class it decorates, but not classes that inherit from the base class:
@decorator
class A:
    ... # Can be affected by the decorator.
class B(A):
    ... # Not affected by the decorator in most cases.
Now that Python has __init_subclass__, everything that decorators can do can be done with mixins, and they typically do affect child subclasses:
class A(Mixin):
    ... # Is affected by Mixin.__init_subclass__.
class B(A):
    ... # Is affected by Mixin.__init_subclass__.
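The difference in reach can be observed with a sketch like the one below. Registered, mark and the four small classes are all hypothetical.

```python
class Registered:
    """Hypothetical mixin that records every subclass, direct or indirect."""
    registry = []

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        Registered.registry.append(cls.__name__)


def mark(cls):
    """Hypothetical class decorator that records only what it decorates."""
    mark.seen.append(cls.__name__)
    return cls


mark.seen = []


class A(Registered): ...
class B(A): ...


@mark
class C: ...
class D(C): ...


print(Registered.registry)  # both the direct and the indirect subclass
print(mark.seen)            # only the decorated class
```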
Mixins have another advantage, which is that they can provide empty base class methods. Child classes can override these methods with some "augmenting" behavior, and then call super. The decorator cannot easily provide such base class methods. This is another way in which mixins are more flexible.
In summary, the questions you should ask when deciding between a mixin and decoration are:
Is there an is-a pattern?
Would you ever call isinstance?
Would you use the mixin in a type annotation?
Do you want the behavior to affect child classes?
Do you need augmenting methods?
In general, lean towards inheritance.
If both are equivalent, I would prefer decorators, since you can use the same decorator for many classes, while inheritance applies to only one specific class.
Personally, I would think in terms of code reuse. A decorator is sometimes more flexible than inheritance.
Let's take caching as an example. If you want to add a caching facility to two classes in your system, A and B, then with inheritance you'll probably wind up having ACached and BCached. And by overriding some of the methods in these classes, you'll probably duplicate a lot of code for the same caching logic. But if you use a decorator in this case, you only need to define one decorator to decorate both classes.
So, when deciding which one to use, you may first want to check whether the extended functionality is specific to this class or whether it can be reused in other parts of your system. If it cannot be reused, then inheritance should probably do the job. Otherwise, you can think about using a decorator.
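As a rough sketch of the caching example, one class decorator can serve both classes. The cache_method name, the "_cache" attribute and the compute methods are assumptions made for illustration, and the cache only handles positional, hashable arguments.

```python
import functools


def cache_method(name):
    """Hypothetical class decorator factory: memoize one method per instance."""
    def decorate(cls):
        original = getattr(cls, name)

        @functools.wraps(original)
        def wrapper(self, *args):
            cache = self.__dict__.setdefault("_cache", {})
            if args not in cache:
                cache[args] = original(self, *args)
            return cache[args]

        setattr(cls, name, wrapper)
        return cls  # the original class, modified in place
    return decorate


@cache_method("compute")
class A:
    calls = 0

    def compute(self, x):
        type(self).calls += 1
        return x * 2


@cache_method("compute")
class B:
    calls = 0

    def compute(self, x):
        type(self).calls += 1
        return x + 100


a, b = A(), B()
print(a.compute(3), a.compute(3), A.calls)
print(b.compute(1), b.compute(1), B.calls)
```

The caching logic lives in one place, and neither A nor B needs a cached subclass.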

When is using __call__ a good idea?

What are people's opinions on using __call__? I've only very rarely seen it used, but I think it's a very handy tool to use when you know that a class is going to be used for some default behaviour.
I think your intuition is about right.
Historically, callable objects (or what I've sometimes heard called "functors") have been used in the OO world to simulate closures. In C++ they're frequently indispensable.
However, __call__ has quite a bit of competition in the Python world:
A regular named method, whose behavior can sometimes be a lot more easily deduced from the name. Can convert to a bound method, which can be called like a function.
A closure, obtained by returning a function that's defined in a nested block.
A lambda, which is a limited but quick way of making a closure.
Generators and coroutines, whose bodies hold accumulated state much like a functor can.
I'd say the time to use __call__ is when you're not better served by one of the options above. Check the following criteria, perhaps:
Your object has state.
There is a clear "primary" behavior for your class that's kind of silly to name. E.g. if you find yourself writing run() or doStuff() or go() or the ever-popular and ever-redundant doRun(), you may have a candidate.
Your object has state that exceeds what would be expected of a generator function.
Your object wraps, emulates, or abstracts the concept of a function.
Your object has other auxiliary methods that conceptually belong with your primary behavior.
One example I like is UI command objects. They are designed so that their primary task is to execute the command, but with extra methods to control their display as a menu item, for example. This seems to me to be the sort of thing you'd still want a callable object for.
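A command object of that kind could be sketched as follows. The Command class, its menu_text method and the lambda action are invented for the example and are not tied to any particular UI toolkit.

```python
class Command:
    """Hypothetical UI command: calling the object runs its action."""

    def __init__(self, label, action):
        self.label = label
        self.action = action
        self.run_count = 0

    def __call__(self, *args, **kwargs):
        # The "primary" behavior needs no method name of its own.
        self.run_count += 1
        return self.action(*args, **kwargs)

    def menu_text(self):
        # An auxiliary method that belongs with the primary behavior.
        return f"{self.label} (used {self.run_count}x)"


save = Command("Save", lambda path: f"saved {path}")
result = save("notes.txt")
print(result)
print(save.menu_text())
```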
Use it if you need your objects to be callable; that's what it's there for.
I'm not sure what you mean by default behaviour.
One place I have found it particularly useful is when using a wrapper or somesuch where the object is called deep inside some framework/library.
More generally, Python has a lot of double-underscore methods. They're there for a reason: they are the Python way of overloading operators. For instance, if you want a new class in which addition, I don't know, prints "foo", you define the __add__ and __radd__ methods. There's nothing inherently good or bad about this, any more than there's anything good or bad about using for loops.
In fact, using __call__ is often the more Pythonic approach, because it encourages clarity of code. You could replace MyCalculator.calculateValues( foo ) with MyCalculator( foo ), say.
It's usually used when a class is used as a function with some instance context, like some DecoratorClass which would be used as @DecoratorClass('some param'), so 'some param' would be stored in the instance's namespace and the instance then called as the actual decorator.
It is not very useful when your class provides several different methods, since it's usually not obvious what the call would do, and explicit is better than implicit in these cases.
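The decorator-class usage mentioned above might be sketched like this. Prefix and describe are hypothetical names standing in for DecoratorClass and the decorated function.

```python
import functools


class Prefix:
    """Hypothetical decorator class: the argument is stored on the instance."""

    def __init__(self, text):
        self.text = text

    def __call__(self, func):
        # The instance itself is called with the function to decorate.
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            return self.text + func(*args, **kwargs)
        return wrapper


@Prefix("some param: ")
def describe(value):
    return str(value)


print(describe(42))
```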

Is it correct to inherit from built-in classes?

I want to parse an Apache access.log file with a python program in a certain way, and though I am completely new to object-oriented programming, I want to start doing it now.
I am going to create a class ApacheAccessLog, and the only thing I can imagine it doing for now is a 'readline' method. Is it conventionally correct to inherit from the builtin file class in this case, so the class will behave just like an instance of the file class itself, or not? What is the best way of doing that?
In this case I would use delegation rather than inheritance. It means that your class should contain the file object as an attribute and invoke a readline method on it. You could pass a file object in the constructor of the logger class.
There are at least two reasons for this:
Delegation reduces coupling; for example, in place of file objects you can use any other object that implements a readline method (duck typing comes in handy here).
When inheriting from file the public interface of your class becomes unnecessarily broad. It includes all the methods defined on file even if these methods don't make sense in case of Apache log.
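A minimal sketch of the delegation approach, assuming Python 3. The sample line and the split on spaces are placeholders, not a real access-log parser; io.StringIO stands in for an open file to show that any object with readline() will do.

```python
import io


class ApacheAccessLog:
    """Sketch: wraps any object that has a readline() method."""

    def __init__(self, source):
        self._source = source

    def readline(self):
        # Delegate to the wrapped object, then do the log-specific work.
        line = self._source.readline()
        return line.rstrip("\n").split(" ") if line else None


sample = io.StringIO('127.0.0.1 - - "GET / HTTP/1.1" 200\n')
log = ApacheAccessLog(sample)
first = log.readline()
end = log.readline()
print(first)
print(end)
```

The public interface of ApacheAccessLog is just readline; methods such as seek or write from the wrapped object are not exposed.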
I am coming from a Java background but I am fairly confident that the same principles will apply in Python. As a rule of thumb you should never inherit from a class whose implementation you don't understand and control unless that class has been designed specifically for inheritance. If it has been designed in this way it should describe this clearly in its documentation.
The reason for this is that inheritance can potentially bind you to the implementation details of the class that you are inheriting from.
To use an example from Josh Bloch's book 'Effective Java'
If we were to extend the ArrayList class in order to be able to count the number of items that were added to it during its lifetime (not necessarily the number it currently contains), we may be tempted to write something like this:
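import java.util.ArrayList;
import java.util.Collection;
public class CountingList extends ArrayList {
    int counter = 0;
    // Count every single element that is added.
    @Override
    public boolean add(Object o) {
        counter++;
        return super.add(o);
    }
    // Count every element of a bulk addition.
    @Override
    public boolean addAll(Collection c) {
        counter += c.size();
        return super.addAll(c);
    }
    // Etc.
}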
Now this extension looks like it would accurately count the number of elements that were added to the list, but in fact it may not. If ArrayList has implemented addAll by iterating over the Collection provided and calling its interface method add for each element, then we will count each element added through the addAll method twice. Now the behaviour of our class is dependent on the implementation details of ArrayList.
This is of course in addition to the disadvantage of not being able to use other implementations of List with our CountingList class. Plus the disadvantages of inheriting from a concrete class that are discussed above.
It is my understanding that Python uses a similar (if not identical) method dispatch mechanism to Java and will therefore be subject to the same limitations. If someone could provide an example in Python I'm sure it would be even more useful.
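A possible Python counterpart, offered as a sketch of what CPython does rather than a language guarantee. Here the coupling shows up in mirrored form: the built-in list.extend does not go through an overridden append, so a subclass that overrides only append under-counts instead of double counting.

```python
class CountingList(list):
    """Hypothetical list subclass that counts items added via append()."""

    def __init__(self, *args):
        super().__init__(*args)
        self.counter = 0

    def append(self, item):
        self.counter += 1
        super().append(item)


items = CountingList()
items.append("a")
items.extend(["b", "c"])  # in CPython this bypasses the overridden append()
print(len(items), items.counter)
```

Either way, the subclass only behaves correctly if its author knows which methods of the parent call which others, which is the implementation dependence described above.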
It is perfectly acceptable to inherit from a built in class. In this case I'd say you're right on the money.
The log "is a" file, so that tells you inheritance is OK.
General rule:
A dog "is an" animal, therefore inherit from animal.
An owner "has an" animal, therefore don't inherit from animal.
Although it is in some cases useful to inherit from builtins, the real question here is what you want to do with the output and what's your big-picture design. I would usually write a reader (that uses a file object) and spit out whatever data class I need to hold the information I just read. It's then easy to design that data class to fit in with the rest of my design.
You should be fairly safe inheriting from a "builtin" class, as later modifications to these classes will usually be compatible with the current version.
However, you should think seriously about whether you really want to tie your class to the additional functionality provided by the builtin class. As mentioned in another answer, you should consider (perhaps even prefer) using delegation instead.
As an example of why to avoid inheritance if you don't need it you can look at the java.util.Stack class. As it extends Vector it inherits all of the methods on Vector. Most of these methods break the contract implied by Stack, e.g. LIFO. It would have been much better to implement Stack using a Vector internally, only exposing Stack methods as the API. It would then have been easy to change the implementation to ArrayList or something else later, none of which is possible now due to inheritance.
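Translated to Python, the "use a container internally, expose only stack methods" idea could be sketched as below. The Stack class is hypothetical and uses a plain list as its hidden storage.

```python
class Stack:
    """Sketch of a LIFO stack that hides its underlying list."""

    def __init__(self):
        self._items = []

    def push(self, item):
        self._items.append(item)

    def pop(self):
        return self._items.pop()

    def __len__(self):
        return len(self._items)


s = Stack()
s.push(1)
s.push(2)
top = s.pop()
print(top, len(s))
```

Because callers never see the list, methods such as insert that would break the LIFO contract are not available, and the storage could later be swapped for something else without changing the API.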
You seem to have found your answer that in this case delegation is the better strategy. Nevertheless, I would like to add that, excepting delegation, there is nothing wrong with extending a built-in class, particularly if your alternative, depending on the language, is "monkey patching" (see http://en.wikipedia.org/wiki/Monkey_patch).
