I realize that in most cases, it's preferred in Python to just access attributes directly, since there's no real concept of encapsulation like there is in Java and the like. However, I'm wondering if there aren't any exceptions, particularly with abstract classes that have disparate implementations.
Let's say I'm writing a bunch of abstract classes (because I am) and that they represent things having to do with version control systems like repositories and revisions (because they do). Something like an SvnRevision and an HgRevision and a GitRevision are very closely semantically linked, and I want them to be able to do the same things (so that I can have code elsewhere that acts on any kind of Repository object, and is agnostic of the subclass), which is why I want them to inherit from an abstract class. However, their implementations vary considerably.
So far, the subclasses that have been implemented share a lot of attribute names, and in a lot of code outside of the classes themselves, direct attribute access is used. For example, every subclass of Revision has an author attribute, and a date attribute, and so on. However, the attributes aren't described anywhere in the abstract class. This seems to me like a very fragile design.
If someone wants to write another implementation of the Revision class, I feel like they should be able to do so just by looking at the abstract class. However, an implementation of the class that satisfies all of the abstract methods will almost certainly fail, because the author won't know that they need attributes called 'author' and 'date' and so on, so code that tries to access Revision.author will throw an exception. Probably not hard to find the source of the problem, but irritating nonetheless, and it just feels like an inelegant design.
My solution was to write accessor methods for the abstract classes (get_id, get_author, etc.). I thought this was actually a pretty clean solution, since it eliminates arbitrary restrictions on how attributes are named and stored, and just makes clear what data the object needs to be able to access. Any class that implements all of the methods of the abstract class will work... that feels right.
Anyways, the team I'm working with hates this solution (seemingly for the reason that accessors are unpythonic, which I can't really argue with). So... what's the alternative? Documentation? Or is the problem I'm imagining a non-issue?
Note: I've considered properties, but I don't think they're a cleaner solution.
Note: I've considered properties, but I don't think they're a cleaner solution.
But they are. By using properties, you'll have the class signature you want, while being able to use the property as an attribute itself.
def _get_id(self):
return self._id
def _set_id(self, newid):
self._id = newid
Is likely similar to what you have now. To placate your team, you'd just need to add the following:
id = property(_get_id, _set_id)
You could also use property as a decorator:
#property
def id(self):
return self._id
#id.setter
def id(self, newid):
self._id = newid
And to make it readonly, just leave out set_id/the id.setter bit.
You've missed the point. It isn't the lack of encapsulation that removes the need for accessors, it's the fact that, by changing from a direct attribute to a property, you can add an accessor at a later time without changing the published interface in any way.
In many other languages, if you expose an attribute as public and then later want to wrap some code round it on access or mutation then you have to change the interface and anyone using the code has at the very least to recompile and possibly to edit their code also. Python isn't like that: you can flip flop between attribute or property just as much as you want and no code that uses the class will break.
Only behind a property.
"they should be able to do so just by looking at the abstract class"
Don't know what this should be true. A "programmer's guide", a "how to extend" document, plus some training seems appropriate to me.
"the author won't know that they need attributes called 'author' and 'date' and so on".
In that case, the documentation isn't complete. Perhaps the abstract class needs a better docstring. Or a "programmer's guide", or a "how to extend" document.
Also, it doesn't seem very difficult to (1) document these attributes in the docstring and (2) provide default values in the __init__ method.
What's wrong with providing extra support for programmers?
It sounds like you have a social problem, not a technical one. Writing code to solve a social problem seems like a waste of time and money.
The discussion already ended a year ago, but this snippet seemed telltale, it's worth discussing:
However, an implementation of the
class that satisfies all of the
abstract methods will almost certainly
fail, because the author won't know
that they need attributes called
'author' and 'date' and so on, so code
that tries to access Revision.author
will throw an exception.
Uh, something is deeply wrong. (What S. Lott said, minus the personal comments).
If these are required members, aren't they referenced (if not required) in the constructor, and defined by docstring? or at very least, as required args of methods, and again documented?
How could users of the class not know what the required members are?
To be the devil's advocate, what if your constructor(s) requires you to supply all the members that will/may be required, what issue does that cause?
Also, are you checking the parameters when passed, and throwing informative exceptions?
(The ideological argument of accessor-vs-property is a sidebar. Properties are preferable but I don't think that's the issue with your class design.)
Related
I've been programming in Python for a long time, but I still can't understand why classes base their attribute lookup on the __dict__ dictionary by default instead of the faster __slots__ tuple.
Wouldn't it make more sense to use the more efficient and less flexible __slots__ method as the default implementation and instead make the more flexible, but slower __dict__ method optional?
Also, if a class uses __slots__ to store its attributes, there's no chance of mistakenly creating new attributes like this:
class Object:
__slots__ = ("name",)
def __init__(self, name):
self.name = name
obj = Object()
# Note the typo here
obj.namr = "Karen"
So, I was wondering if there's a valid reason why Python defaults to accessing instance attributes through __dict__ instead of through __slots__.
Python is designed to be an extremely flexible language, and allows objects to modify themselves in many interesting ways at runtime. Making a change to prevent that kind of flexibility would break a massive amount of other people's code, so for the sake of backwards compatibility I don't think it will happen any time soon (if at all).
As well as this, due to the way Python code is interpreted, it is very difficult to design a system that can look ahead and determine exactly what variables a particular class will use ahead of time, especially given the existence of setattr() and other similar functions, which can modify the state of other objects in unpredictable ways.
In summary, Python is designed to value flexibility over performance, and as such, having __slots__ be an optional technique to speed up parts of your code is a trade-off that you choose to make if you wish to write your code in Python. I can't answer whether this is a worthwhile design decision for you, since it's entirely based on opinion.
If you wish to have a bit more safety to prevent issues such as the one you described, there are tools such as mypy and pylint which can catch that sort of error.
I am running into an issue with Liskov Substitution Principle and am not quite sure what would be the best way to go around it.
Code in question
class BaseModel:
def run(self, base_model_input: BaseModelInput) -> BaseModelOutput:
"""Throws NotImplemented or #abstractmethod"""
pass
class SpecificModel(BaseModel):
def run(self, specific_input: SpecificModelInput) -> SpecificModelOutput:
# do things...
I understand well why is this not a great code, and why it violates the Liskov Substitution Principle. I am wondering how to design my system better to avoid this problem in the first place.
Fundamentally I have a BaseModel class that acts like an interface, providing some methods like run that the extending classes must implement. But extending classes also deal with specific input/output, that are also extensions of the base input/output classes (that is SpecificModelInput inherits from BaseModelInput and adds some fields and functionality, same with output)
What would be a better approach here?
As I stated in the comments:If you can really constrain the inputs and outputs to subtypes of a base-input and base-output, I don't see any problem with this model
If by violation you mean that on restricting the input options, the subclass can no longer b used everywhere the baseclass can, I'd say it is a case where you are taking the Liskov Substitution principle as dogma where it should be very good advice.
See, the Liskov..principle is a good guideline, and a bit simplistic, as it will give you a really good feeling of what inheritance should be.
But in real-world terms, restricting input parameters (and attribute types) is not a concern in all cases. Anyway, having a concrete example will serve us well: think of an abstract "Vehicle" class with a "board" method which allows you to board "Transportables" - and concrete "Transportables" and some os the allowable subclasses of "Transportable" are Persons, Dogs, Grocery Bag, Pianos, and Elephants. If your "Vehicle" class is a "Ship" - it can take all of those. If your vehicle class is a Car, some of those are out.
That seems to be your use case.
So, of course, you are violating the principle here. The wording in most places explaining it is that "if you substitute an instance of the subclass anywhere the superclass appears, the program should not break".
So, the most correct thing to do, is to modify the superclass contracts in a way that any input may or may not work, and allow it to raise a runtime exception (or return an object signalizing an error) even if the input is valid.
Everyone calling that method should handle the "did not work" state - with the example above it is easy to see that if I have a method in an unrelated class that calls vehicle.board, it should be responsible to see it is not trying to put an Elephant inside a car. If everywhere the method is called these checks are made, the principle holds!
If engineering this is overkill for whatever task you have, I'd say "...practicality beats purity" in this case, and simply set the annotations to silence the static type checker.
Of course, telling that to the static type checker is another thing - I think the use of Generics as pointed in the answer linked in the comments could do: How do I annotate the type of a parameter of an abstractmethod, when the parameter can have any type derived from a specific base type?
General Python Question
I'm importing a Python library (call it animals.py) with the following class structure:
class Animal(object): pass
class Rat(Animal): pass
class Bat(Animal): pass
class Cat(Animal): pass
...
I want to add a parent class (Pet) to each of the species classes (Rat, Bat, Cat, ...); however, I cannot change the actual source of the library I'm importing, so it has to be a run time change.
The following seems to work:
import animals
class Pet(object): pass
for klass in (animals.Rat, animals.Bat, animals.Cat, ...):
klass.__bases__ = (Pet,) + klass.__bases__
Is this the best way to inject a parent class into an inheritance tree in Python without making modification to the source definition of the class to be modified?
Motivating Circumstances
I'm trying to graft persistence onto the a large library that controls lab equipment. Messing with it is out of the question. I want to give ZODB's Persistent a try. I don't want to write the mixin/facade wrapper library because I'm dealing with 100+ classes and lots of imports in my application code that would need to be updated. I'm testing options by hacking on my entry point only: setting up the DB, patching as shown above (but pulling the species classes w/ introspection on the animals module instead of explicit listing) then closing out the DB as I exit.
Mea Culpa / Request
This is an intentionally general question. I'm interested in different approaches to injecting a parent and comments on the pros and cons of those approaches. I agree that this sort of runtime chicanery would make for really confusing code. If I settle on ZODB I'll do something explicit. For now, as a regular user of python, I'm curious about the general case.
Your method is pretty much how to do it dynamically. The real question is: What does this new parent class add? If you are trying to insert your own methods in a method chain that exists in the classes already there, and they were not written properly, you won't be able to; if you are adding original methods (e.g. an interface layer), then you could possibly just use functions instead.
I am one who embraces Python's dynamic nature, and would have no problem using the code you have presented. Make sure you have good unit tests in place (dynamic or not ;), and that modifying the inheritance tree actually lets you do what you need, and enjoy Python!
You should try really hard not to do this. It is strange, and will likely end in tears.
As #agf mentions, you can use Pet as a mixin. If you tell us more about why you want to insert a parent class, we can help you find a nicer solution.
I recently discovered metaclasses in python.
Basically a metaclass in python is a class that creates a class. There are many useful reasons why you would want to do this - any kind of class initialisation for example. Registering classes on factories, complex validation of attributes, altering how inheritance works, etc. All of this becomes not only possible but simple.
But in python, metaclasses are also plain classes. So, I started wondering if the abstraction could usefully go higher, and it seems to me that it can and that:
a metaclass corresponds to or implements a role in a pattern (as in GOF pattern languages).
a meta-metaclass is the pattern itself (if we allow it to create tuples of classes representing abstract roles, rather than just a single class)
a meta-meta-metaclass is a pattern factory, which corresponds to the GOF pattern groupings, e.g. Creational, Structural, Behavioural. A factory where you could describe a case of a certain type of problem and it would give you a set of classes that solved it.
a meta-meta-meta-metaclass (as far as I could go), is a pattern factory factory, a factory to which you could perhaps describe the type of your problem and it would give you a pattern factory to ask.
I have found some stuff about this online, but mostly not very useful. One problem is that different languages define metaclasses slightly differently.
Has anyone else used metaclasses like this in python/elsewhere, or seen this used in the wild, or thought about it? What are the analogues in other languages? E.g. in C++ how deep can the template recursion go?
I'd very much like to research it further.
This reminds me of the eternal quest some people seem to be on to make a "generic implementation of a pattern." Like a factory that can create any object (including another factory), or a general-purpose dependency injection framework that is far more complex to manage than simply writing code that actually does something.
I had to deal with people intent on abstraction to the point of navel-gazing when I was managing the Zend Framework project. I turned down a bunch of proposals to create components that didn't do anything, they were just magical implementations of GoF patterns, as though the pattern were a goal in itself, instead of a means to a goal.
There's a point of diminishing returns for abstraction. Some abstraction is great, but eventually you need to write code that does something useful.
Otherwise it's just turtles all the way down.
To answer your question: no.
Feel free to research it further.
Note, however, that you've conflated design patterns (which are just ideas) with code (which is an implementation.)
Good code often reflects a number of interlocking design patterns. There's no easy way for formalize this. The best you can do is a nice picture, well-written docstrings, and method names that reflect the various design patterns.
Also note that a meta-class is a class. That's a loop. There's no higher level of abstractions. At that point, it's just intent. The idea of meta-meta-class doesn't mean much -- it's a meta-class for meta-classes, which is silly but technically possible. It's all just a class, however.
Edit
"Are classes that create metaclasses really so silly? How does their utility suddenly run out?"
A class that creates a class is fine. That's pretty much it. The fact that the target class is a meta class or an abstract superclass or a concrete class doesn't matter. Metaclasses make classes. They might make other metaclasses, which is weird, but they're still just metaclasses making classes.
The utility "suddenly" runs out because there's no actual thing you need (or can even write) in a metaclass that makes another metaclass. It isn't that it "suddenly" becomes silly. It's that there's nothing useful there.
As I seed, feel free to research it. For example, actually write a metaclass that builds another metaclass. Have fun. There might be something useful there.
The point of OO is to write class definitions that model real-world entities. As such, a metaclass is sometimes handy to define cross-cutting aspects of several related classes. (It's a way to do some Aspect-Oriented Programming.) That's all a metaclass can really do; it's a place to hold a few functions, like __new__(), that aren't proper parts of the class itself.
During the History of Programming Languages conference in 2007, Simon Peyton Jones commented that Haskell allows meta programming using Type Classes, but that its really turtles all the way down. You can meta-meta-meta-meta etc program in Haskell, but that he's never heard of anyone using more than 3 levels of indirection.
Guy Steele pointed out that its the same thing in Lisp and Scheme. You can do meta-programming using backticks and evals (you can think of a backtick as a Python lambda, kinda), but he's never seen more than 3 backticks used.
Presumably they have seen more code than you or I ever has, so its only a slight exaggeration to say that no-one has ever gone beyond 3 levels of meta.
If you think about it, most people don't ever use meta-programming, and two levels is pretty hard to wrap your head around. I would guess that three is nearly impossible, and the that last guy to try four ended up in an asylum.
Since when I first understood metaclasses in Python, I kept wondering "what could be done with a meta-meta class?". This is at least 10 years ago - and now, just a couple months ago, it became clear for me that there is one mechanism in Python class creation that actually involves a "meta-meta" class. And therefore, it is possible to try to imagine some use for that.
To recap object instantiation in Python: Whenever one instantiates an object in Python by "calling" its class with the same syntax used for calling an ordinary function, the class's __new__ and __init__. What "orchestrates" the calling of these methods on the class is exactly the class'metaclass' __call__ method. Usually when one writes a metaclass in Python, either the __new__ or __init__ method of the metaclass is customized.
So, it turns out that by writing a "meta-meta" class one can customize its __call__ method and thus control which parameters are passed and to the metaclass's __new__ and __init__ methods, and if some other code is to be called before of after those. What turns out in the end is that metcalsses themselves are usually hardcoded and one needs just a few, if any, even in very large projects. So any customization that might be done at the "meta meta" call is usually done directly on the metaclass itself.
And them, there are those other less frequent uses for Python metaclasses - one can customize an __add__ method in a metaclass so that the classes they define are "addable", and create a derived class having the two added classes as superclasses. That mechanism is perfectly valid with metaclasses as well - therefore, so just we "have some actual code", follows an example of "meta-meta" class that allows one to compose "metaclasses" for a class just by adding them on class declaration:
class MM(type):
def __add__(cls, other):
metacls = cls.__class__
return metacls(cls.__name__ + other.__name__, (cls, other), {})
class M1(type, metaclass=MM):
def __new__(metacls, name, bases, namespace):
namespace["M1"] = "here"
print("At M1 creation")
return super().__new__(metacls, name, bases, namespace)
class M2(type, metaclass=MM):
def __new__(metacls, name, bases, namespace):
namespace["M2"] = "there"
print("At M2 creation")
return super().__new__(metacls, name, bases, namespace)
And we can see that working on the interactive console:
In [22]: class Base(metaclass = M1 + M2):
...: pass
...:
At M1 creation
At M2 creation
Note that as different metaclasses in Python are usually difficult to combine, this can actually be useful by allowing a user-made metaclass to be combined with a library's or stdlib one, without this one having to be explicitly declared as parent of the former:
In [23]: import abc
In [24]: class Combined(metaclass=M1 + abc.ABCMeta):
...: pass
...:
At M1 creation
The class system in Smalltalk is an interesting one to study. In Smalltalk, everything is an object and every object has a class. This doesn't imply that the hierarchy goes to infinity. If I remember correctly, it goes something like:
5 -> Integer -> Integer class -> Metaclass -> Metaclass class -> Metaclass -> ... (it loops)
Where '->' denotes "is an instance of".
I want to parse an Apache access.log file with a python program in a certain way, and though I am completely new to object-oriented programming, I want to start doing it now.
I am going to create a class ApacheAccessLog, and the only thing I can imagine now, it will be doing is 'readline' method. Is it conventionally correct to inherit from the builtin file class in this case, so the class will behave just like an instance of the file class itself, or not? What is the best way of doing that?
In this case I would use delegation rather than inheritance. It means that your class should contain the file object as an attribute and invoke a readline method on it. You could pass a file object in the constructor of the logger class.
There are at least two reasons for this:
Delegation reduces coupling, for example in place of file objects you can use any other object that implements a readline method (duck typing comes handy here).
When inheriting from file the public interface of your class becomes unnecessarily broad. It includes all the methods defined on file even if these methods don't make sense in case of Apache log.
I am coming from a Java background but I am fairly confident that the same principles will apply in Python. As a rule of thumb you should never inherit from a class whose implementation you don't understand and control unless that class has been designed specifically for inheritance. If it has been designed in this way it should describe this clearly in its documentation.
The reason for this is that inheritance can potentially bind you to the implementation details of the class that you are inheriting from.
To use an example from Josh Bloch's book 'Effective Java'
If we were to extend the class ArrayList class in order to be able to count the number of items that were added to it during its life-time (not necessarily the number it currently contains) we may be tempted to write something like this.
public class CountingList extends ArrayList {
int counter = 0;
public void add(Object o) {
counter++;
super.add(0);
}
public void addAll(Collection c) {
count += c.size();
super.addAll(c);
}
// Etc.
}
Now this extension looks like it would accurately count the number of elements that were added to the list but in fact it may not. If ArrayList has implemented addAll by iterating over the Collection provided and calling its interface method addAll for each element then we will count each element added through the addAll method twice. Now the behaviour of our class is dependent on the implementation details of ArrayList.
This is of course in addition to the disadvantage of not being able to use other implementations of List with our CountingList class. Plus the disadvantages of inheriting from a concrete class that are discussed above.
It is my understanding that Python uses a similar (if not identical) method dispatch mechanism to Java and will therefore be subject to the same limitations. If someone could provide an example in Python I'm sure it would be even more useful.
It is perfectly acceptable to inherit from a built in class. In this case I'd say you're right on the money.
The log "is a" file so that tells you inheritance is ok..
General rule.
Dog "is a"n animal, therefore inherit from animal.
Owner "has a"n animal therefore don't inherit from animal.
Although it is in some cases useful to inherit from builtins, the real question here is what you want to do with the output and what's your big-picture design. I would usually write a reader (that uses a file object) and spit out whatever data class I need to hold the information I just read. It's then easy to design that data class to fit in with the rest of my design.
You should be fairly safe inheriting from a "builtin" class, as later modifications to these classes will usually be compatible with the current version.
However, you should think seriously about wether you really want to tie your class to the additional functionality provided by the builtin class. As mentioned in another answer you should consider (perhaps even prefer) using delegation instead.
As an example of why to avoid inheritance if you don't need it you can look at the java.util.Stack class. As it extends Vector it inherits all of the methods on Vector. Most of these methods break the contract implied by Stack, e.g. LIFO. It would have been much better to implement Stack using a Vector internally, only exposing Stack methods as the API. It would then have been easy to change the implementation to ArrayList or something else later, none of which is possible now due to inheritance.
You seem to have found your answer that in this case delegation is the better strategy. Nevertheless, I would like to add that, excepting delegation, there is nothing wrong with extending a built-in class, particularly if your alternative, depending on the language, is "monkey patching" (see http://en.wikipedia.org/wiki/Monkey_patch)