I have a project I'm working on (http://github.com/lusis/vogeler). One of the goals is to provide swappable persistance and messaging backends. I think I have a workable model in place but wanted to get input from the Python crowd about best practices. You can see the new implementation here:
http://github.com/lusis/vogeler/blob/master/vogeler/db/generic.py
couch2.py is my subclass of generic.
Essentially the generic class provides a common set of interfaces (createdb, usedb, create, update) which call private methods such as _create_db, _use_db and so on.
My expectation is that the database specific stuff will subclass GenericPersistence and override the private methods. Is that considered bad form? Overriding private methods in general feels kind of weird but the end result is that it works. I just want to make sure I'm not breaking some sort of unwritten contract about subclassing in Python.
I think, by convention, the single underscore is a hint that the attribute is an implementation detail that may be changed in the future. Subclasses should not override or invoke underscored methods, because their presence may not be relied on.
So, I'd change the underscored methods to hooks: _update --> update_hook
and ask developers to override the *_hook methods.
Related
I have a class A that need to implement a method meth().
Now, I don't want this method to be called by the end-user of my package. Thus, I have to make this method private (i.e. _meth(). I know that it's not really, private, but conventions matter.)
The problem though is that I have yet another class B in my package that has to call that method _meth(). Problem is that I now get the warning method that say that B tries to access a protected method of a class. Thus, I have to make the method public, i.e. without the leading underscore. This contradicts my intentions.
What is the most pythonic way to solve this dilemma?
I know I can re-implement that method outside of A, but it will lead to code duplication and, as meth() uses private attributes of A, will lead to the same problem.
Inheriting from a single metaclass is not an option as those classes have entirely different purposes and that will be contributing towards a ghastly mess.
The fact that pylint/your editor/whatever external tool gives you a warning doesn't prevent code execution. I don't know about your editor but pylint warnings can be disabled on a case-by-case basis using special comments (nb: "case by case" meaning: "do not warn me for this line or block", not "totally disable this warning").
And it's perfectly ok for your own code to access protected attributes and methods in the same package - the "_protected" naming convention does not mean "None shall pass", just "are you sure you understand what you're doing and willing to take responsability if you break something ?". Since you're the author/maintainer of the package and those are intra-package access you are obviously entitled to take this responsability ;)
The "most pythonic way" would be to not care about private and protected, as these concepts do not exist in Python.
Everything is public. Adding a underscore in the name does not make it private, it just indicates the method is for internal use in the class (not to prevent usage by some end-user).
If you need to use the method from another class, it shows that you're not using classes and objects correctly, and you probably come from a different language like Java where classes are used to group methods together in some namespace.
Just move the function to the module level (outside the class), as you're not using the object (self) anyway.
My IDE keeps suggesting I convert my instance methods to static methods. I guess because I haven't referenced any self within these methods.
An example is :
class NotificationViewSet(NSViewSet):
def pre_create_processing(self, request, obj):
log.debug(" creating messages ")
# Ensure data is consistent and belongs to the sending bot.
obj['user_id'] = request.auth.owner.id
obj['bot_id'] = request.auth.id
So my question would be: do I lose anything by just ignoring the IDE suggestions, or is there more to it?
This is a matter of workflow, intentions with your design, and also a somewhat subjective decision.
First of all, you are right, your IDE suggests converting the method to a static method because the method does not use the instance. It is most likely a good idea to follow this suggestion, but you might have a few reasons to ignore it.
Possible reasons to ignore it:
The code is soon to be changed to use the instance (on the other hand, the idea of soon is subjective, so be careful)
The code is legacy and not entirely understood/known
The interface is used in a polymorphic/duck typed way (e.g. you have a collection of objects with this method and you want to call them in a uniform way, but the implementation in this class happens to not need to use the instance - which is a bit of a code smell)
The interface is specified externally and cannot be changed (this is analog to the previous reason)
The AST of the code is read/manipulated either by itself or something that uses it and expects this method to be an instance method (this again is an external dependency on the interface)
I'm sure there can be more, but failing these types of reasons I would follow the suggestion. However, if the method does not belong to the class (e.g. factory method or something similar), I would refactor it to not be part of the class.
I think that you might be mixing up some terminology - the example is not a class method. Class methods receive the class as the first argument, they do not receive the instance. In this case you have a normal instance method that is not using its instance.
If the method does not belong in the class, you can move it out of the class and make it a standard function. Otherwise, if it should be bundled as part of the class, e.g. it's a factory function, then you should probably make it a static method as this (at a minimum) serves as useful documentation to users of your class that the method is coupled to the class, but not dependent on it's state.
Making the method static also has the advantage this it can be overridden in subclasses of the class. If the method was moved outside of the class as a regular function then subclassing is not possible.
I've been making a lot of classes an Python recently and I usually just access instance variables like this:
object.variable_name
But often I see that objects from other modules will make wrapper methods to access variables like this:
object.getVariable()
What are the advantages/disadvantages to these different approaches and is there a generally accepted best practice (even if there are exceptions)?
There should never be any need in Python to use a method call just to get an attribute. The people who have written this are probably ex-Java programmers, where that is idiomatic.
In Python, it's considered proper to access the attribute directly.
If it turns out that you need some code to run when accessing the attribute, for instance to calculate it dynamically, you should use the #property decorator.
The main advantages of "getters" (the getVariable form) in my modest opinion is that it's much easier to add functionality or evolve your objects without changing the signatures.
For instance, let's say that my object changes from implementing some functionality to encapsulating another object and providing the same functionality via Proxy Pattern (composition). If I'm using getters to access the properties, it doesn't matter where that property is being fetched from, and no change whatsoever is visible to the "clients" using your code.
I use getters and such methods especially when my code is being reused (as a library for instance), by others. I'm much less picky when my code is self-contained.
In Java this is almost a requirement, you should never access your object fields directly. In Python it's perfectly legitimate to do so, but you may take in consideration the possible benefits of encapsulation that I mentioned. Still keep in mind that direct access is not considered bad form in Python, on the contrary.
making getVariable() and setVariable() methods is called enncapsulation.
There are many advantages to this practice and it is the preffered style in object-oriented programming.
By accessing your variables through methods you can add another layer of "error checking/handling" by making sure the value you are trying to set/get is correct.
The setter method is also used for other tasks like notifying listeners that the variable have changed.
At least in java/c#/c++ and so on.
I realize that in most cases, it's preferred in Python to just access attributes directly, since there's no real concept of encapsulation like there is in Java and the like. However, I'm wondering if there aren't any exceptions, particularly with abstract classes that have disparate implementations.
Let's say I'm writing a bunch of abstract classes (because I am) and that they represent things having to do with version control systems like repositories and revisions (because they do). Something like an SvnRevision and an HgRevision and a GitRevision are very closely semantically linked, and I want them to be able to do the same things (so that I can have code elsewhere that acts on any kind of Repository object, and is agnostic of the subclass), which is why I want them to inherit from an abstract class. However, their implementations vary considerably.
So far, the subclasses that have been implemented share a lot of attribute names, and in a lot of code outside of the classes themselves, direct attribute access is used. For example, every subclass of Revision has an author attribute, and a date attribute, and so on. However, the attributes aren't described anywhere in the abstract class. This seems to me like a very fragile design.
If someone wants to write another implementation of the Revision class, I feel like they should be able to do so just by looking at the abstract class. However, an implementation of the class that satisfies all of the abstract methods will almost certainly fail, because the author won't know that they need attributes called 'author' and 'date' and so on, so code that tries to access Revision.author will throw an exception. Probably not hard to find the source of the problem, but irritating nonetheless, and it just feels like an inelegant design.
My solution was to write accessor methods for the abstract classes (get_id, get_author, etc.). I thought this was actually a pretty clean solution, since it eliminates arbitrary restrictions on how attributes are named and stored, and just makes clear what data the object needs to be able to access. Any class that implements all of the methods of the abstract class will work... that feels right.
Anyways, the team I'm working with hates this solution (seemingly for the reason that accessors are unpythonic, which I can't really argue with). So... what's the alternative? Documentation? Or is the problem I'm imagining a non-issue?
Note: I've considered properties, but I don't think they're a cleaner solution.
Note: I've considered properties, but I don't think they're a cleaner solution.
But they are. By using properties, you'll have the class signature you want, while being able to use the property as an attribute itself.
def _get_id(self):
return self._id
def _set_id(self, newid):
self._id = newid
Is likely similar to what you have now. To placate your team, you'd just need to add the following:
id = property(_get_id, _set_id)
You could also use property as a decorator:
#property
def id(self):
return self._id
#id.setter
def id(self, newid):
self._id = newid
And to make it readonly, just leave out set_id/the id.setter bit.
You've missed the point. It isn't the lack of encapsulation that removes the need for accessors, it's the fact that, by changing from a direct attribute to a property, you can add an accessor at a later time without changing the published interface in any way.
In many other languages, if you expose an attribute as public and then later want to wrap some code round it on access or mutation then you have to change the interface and anyone using the code has at the very least to recompile and possibly to edit their code also. Python isn't like that: you can flip flop between attribute or property just as much as you want and no code that uses the class will break.
Only behind a property.
"they should be able to do so just by looking at the abstract class"
Don't know what this should be true. A "programmer's guide", a "how to extend" document, plus some training seems appropriate to me.
"the author won't know that they need attributes called 'author' and 'date' and so on".
In that case, the documentation isn't complete. Perhaps the abstract class needs a better docstring. Or a "programmer's guide", or a "how to extend" document.
Also, it doesn't seem very difficult to (1) document these attributes in the docstring and (2) provide default values in the __init__ method.
What's wrong with providing extra support for programmers?
It sounds like you have a social problem, not a technical one. Writing code to solve a social problem seems like a waste of time and money.
The discussion already ended a year ago, but this snippet seemed telltale, it's worth discussing:
However, an implementation of the
class that satisfies all of the
abstract methods will almost certainly
fail, because the author won't know
that they need attributes called
'author' and 'date' and so on, so code
that tries to access Revision.author
will throw an exception.
Uh, something is deeply wrong. (What S. Lott said, minus the personal comments).
If these are required members, aren't they referenced (if not required) in the constructor, and defined by docstring? or at very least, as required args of methods, and again documented?
How could users of the class not know what the required members are?
To be the devil's advocate, what if your constructor(s) requires you to supply all the members that will/may be required, what issue does that cause?
Also, are you checking the parameters when passed, and throwing informative exceptions?
(The ideological argument of accessor-vs-property is a sidebar. Properties are preferable but I don't think that's the issue with your class design.)
I want to parse an Apache access.log file with a python program in a certain way, and though I am completely new to object-oriented programming, I want to start doing it now.
I am going to create a class ApacheAccessLog, and the only thing I can imagine now, it will be doing is 'readline' method. Is it conventionally correct to inherit from the builtin file class in this case, so the class will behave just like an instance of the file class itself, or not? What is the best way of doing that?
In this case I would use delegation rather than inheritance. It means that your class should contain the file object as an attribute and invoke a readline method on it. You could pass a file object in the constructor of the logger class.
There are at least two reasons for this:
Delegation reduces coupling, for example in place of file objects you can use any other object that implements a readline method (duck typing comes handy here).
When inheriting from file the public interface of your class becomes unnecessarily broad. It includes all the methods defined on file even if these methods don't make sense in case of Apache log.
I am coming from a Java background but I am fairly confident that the same principles will apply in Python. As a rule of thumb you should never inherit from a class whose implementation you don't understand and control unless that class has been designed specifically for inheritance. If it has been designed in this way it should describe this clearly in its documentation.
The reason for this is that inheritance can potentially bind you to the implementation details of the class that you are inheriting from.
To use an example from Josh Bloch's book 'Effective Java'
If we were to extend the class ArrayList class in order to be able to count the number of items that were added to it during its life-time (not necessarily the number it currently contains) we may be tempted to write something like this.
public class CountingList extends ArrayList {
int counter = 0;
public void add(Object o) {
counter++;
super.add(0);
}
public void addAll(Collection c) {
count += c.size();
super.addAll(c);
}
// Etc.
}
Now this extension looks like it would accurately count the number of elements that were added to the list but in fact it may not. If ArrayList has implemented addAll by iterating over the Collection provided and calling its interface method addAll for each element then we will count each element added through the addAll method twice. Now the behaviour of our class is dependent on the implementation details of ArrayList.
This is of course in addition to the disadvantage of not being able to use other implementations of List with our CountingList class. Plus the disadvantages of inheriting from a concrete class that are discussed above.
It is my understanding that Python uses a similar (if not identical) method dispatch mechanism to Java and will therefore be subject to the same limitations. If someone could provide an example in Python I'm sure it would be even more useful.
It is perfectly acceptable to inherit from a built in class. In this case I'd say you're right on the money.
The log "is a" file so that tells you inheritance is ok..
General rule.
Dog "is a"n animal, therefore inherit from animal.
Owner "has a"n animal therefore don't inherit from animal.
Although it is in some cases useful to inherit from builtins, the real question here is what you want to do with the output and what's your big-picture design. I would usually write a reader (that uses a file object) and spit out whatever data class I need to hold the information I just read. It's then easy to design that data class to fit in with the rest of my design.
You should be fairly safe inheriting from a "builtin" class, as later modifications to these classes will usually be compatible with the current version.
However, you should think seriously about wether you really want to tie your class to the additional functionality provided by the builtin class. As mentioned in another answer you should consider (perhaps even prefer) using delegation instead.
As an example of why to avoid inheritance if you don't need it you can look at the java.util.Stack class. As it extends Vector it inherits all of the methods on Vector. Most of these methods break the contract implied by Stack, e.g. LIFO. It would have been much better to implement Stack using a Vector internally, only exposing Stack methods as the API. It would then have been easy to change the implementation to ArrayList or something else later, none of which is possible now due to inheritance.
You seem to have found your answer that in this case delegation is the better strategy. Nevertheless, I would like to add that, excepting delegation, there is nothing wrong with extending a built-in class, particularly if your alternative, depending on the language, is "monkey patching" (see http://en.wikipedia.org/wiki/Monkey_patch)