Need advice how to decouple logging functionality from data processing in Python - python

In my project, I have a set of classes that do some job by calling external commands, and return their results.
Let's call them "reports".
I want to add logging to these reports; however, the concrete logger object will be defined at runtime rather than at the time the classes are defined.
So, for now I see two ways to implement logging:
I. At runtime, instantiate a ReportLogger instance that monkey-patches all given report instances with logging functionality, using the given concrete logger.
Pros:
It is possible to apply logging to any child report class I really need, and not touch other classes.
Cons:
Magic! Monkey-patching is not an explicit way to do things.
Logging is only applied at runtime, so when reading the report classes' code it is less obvious that any logging happens.
II. A singleton ReportLogger class that wraps all reports at creation time via decorators, but accepts the concrete logger at runtime.
Pros:
Explicit and clean way to mark that these reports require logging (as well as apply it actually).
Cons:
It's harder to deal with child classes that inherit from the base Report class. If, for example, some method like collect_data() in the base Report class is decorated with @log_collect_data, then for child classes logging will be tightly coupled with collect_data(). Or maybe I have to move the actual code from collect_data() to, say, _collect_data() so that child classes can modify it, call _collect_data() from collect_data(), and then wrap collect_data() with @log_collect_data.
I like the second method, but I want a better way to deal with child classes than using _collect_data(). Any advice is welcome!
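One possible way to keep variant II without the _collect_data() split is to let the base class wrap each subclass's own collect_data() automatically via __init_subclass__. This is only a sketch under assumptions: the names ReportLogger.set_logger, ReportLogger.logged and DiskReport are hypothetical, and the logged messages are placeholders.

```python
import functools
import logging


class ReportLogger:
    """Holds the concrete logger; it can be replaced at runtime."""
    _logger = logging.getLogger("reports")  # placeholder until set_logger() is called

    @classmethod
    def set_logger(cls, logger):
        cls._logger = logger

    @classmethod
    def logged(cls, func):
        """Wrap func; the logger is looked up at call time, not at decoration time."""
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            cls._logger.info("%s started", func.__qualname__)
            result = func(*args, **kwargs)
            cls._logger.info("%s finished", func.__qualname__)
            return result
        return wrapper


class Report:
    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        # Wrap collect_data only when this subclass defines its own version.
        if "collect_data" in cls.__dict__:
            cls.collect_data = ReportLogger.logged(cls.__dict__["collect_data"])

    def collect_data(self):
        raise NotImplementedError


class DiskReport(Report):
    # Child classes override collect_data() directly; logging is added for them.
    def collect_data(self):
        return {"free_gb": 42}
```

With this layout the application calls ReportLogger.set_logger(...) once it knows the concrete logger. One caveat: a child that calls super().collect_data() on an already wrapped parent will log twice.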

Related

How to structure methods of classes that inherit from one BaseClass?

I have a lot of different child classes that inherit from one base class. However all the different child classes implement very similar methods. So if I want to change code in the child classes, I have to change it multiple times.
To me this sounds like bad practice and I would like to implement it correctly. But after a lot of googling I still haven't found a coherent way of doing this.
Here is an example of what I mean:
from abc import ABC, abstractmethod
import logging.config
import os
class BaseModel(ABC):
    def __init__(self):
        # initialize logging
        logging.config.fileConfig(os.path.join(os.getcwd(),
                                               '../myconfig.ini'))
        self.logger = logging.getLogger(__name__)
    @abstractmethod
    def prepare_data(self):
        """
        Prepares the needed data.
        """
        self.logger.info('Data preparation started.\n')
So this is my BaseClass. Multiple other classes inherit the __init__ and prepare_data methods from it. The prepare_data method is very similar for every class.
class Class_One(BaseModel):
    def __init__(self):
        super().__init__()
    def prepare_data(self):
        super().prepare_data()
        # Some code that this method does
class Class_Two(BaseModel):
    def __init__(self):
        super().__init__()
    def prepare_data(self):
        super().prepare_data()
        # Some code that this method does
        # Code is almost the same as for Class_One
class Class_Three(BaseModel):
    def __init__(self):
        super().__init__()
    def prepare_data(self):
        super().prepare_data()
        # Some code that this method does
        # Code is almost the same as for Class_One and Class_Two
# etc.
I suppose you could refactor the methods into another file and then call them in each class. I would love to know how to do this correctly. Thanks a lot in advance!
I'm afraid there's no generic one-size-fits-all magic answer - it all really depends on the "almost" part AND on the forces that will drive change in those parts of the code. IOW, one can only really answer on a concrete example...
This being said, there are a couple of lessons learned from experience, which are mostly summarized in the famous (but unfortunately often misunderstood) GoF "Design Patterns" book. If you take the time to read the first part of the book, you'll understand that most (if not all) of the patterns in the catalog are based on the same principle: separate the variant from the invariant. Once you can tell one from the other in your code (warning: there's a trap here and beginners almost always fall into it), which pattern to apply is usually obvious (sometimes to the point that you only realize which patterns you used after you've refactored your code).
Now as I said, there IS a trap: accidental duplication. Just because two pieces of code look similar doesn't mean they are duplicates - quite often, they are only "accidentally" similar now, but the forces that will make one or the other change are mostly unrelated. If you try to immediately refactor this code, you'll soon find yourself making the "generic" case more and more complicated to support changes that are actually unrelated, and end up with an overcomplicated, undecipherable mess that only makes your code unmaintainable. So the trick here is to carefully examine the whole context, ask yourself what would drive change in one or the other "similar" parts, and if in doubt, wait until you know more. If it turns out that each time you change A you have to make the exact same change in B for the exact same reasons, then you DO have real duplication.
For more practical, short-term advice based on what we can guess from your way too abstract example (and from experience), there are at least two patterns that are most often involved in factoring out duplication in a class hierarchy: the template method and the strategy.
NB: I said "unfortunately often misunderstood" because most people seem to jump to the patterns catalog and try to force-fit all of them into their code (whether it makes sense for the problem at hand or not), usually by copy-pasting the canonical textbook _implementation_ (usually Java or C++ based) instead of understanding the _concept_ and implementing it in a way that's both idiomatic and adapted to the concrete use case (example: when functions are first-class objects, you don't necessarily need a Strategy class with an abstract base and concrete subclasses - most often plain old callback functions JustWork(tm)).
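To make the two patterns a bit more concrete, here is a minimal sketch. The invariant step (stripping and dropping empty rows) and the names UpperModel, transform and prepare_with are invented for illustration, since the question doesn't show what prepare_data really does.

```python
class BaseModel:
    """Template method: the invariant steps are written once, here."""

    def prepare_data(self, raw):
        cleaned = [row.strip() for row in raw if row.strip()]  # invariant part
        return self.transform(cleaned)                         # variant part

    def transform(self, rows):
        raise NotImplementedError


class UpperModel(BaseModel):
    # A subclass only supplies the part that actually differs.
    def transform(self, rows):
        return [row.upper() for row in rows]


def prepare_with(raw, transform):
    """Strategy as a plain callback: no class hierarchy needed at all."""
    cleaned = [row.strip() for row in raw if row.strip()]
    return transform(cleaned)
```

Whether the template method or the callback fits better depends on what will drive change: a subclass hook if the variant part is tied to the class, a callback if callers should be able to swap it freely.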
EDIT: totally unrelated, but this:
    def __init__(self):
        # initialize logging
        logging.config.fileConfig(os.path.join(os.getcwd(),
                                               '../myconfig.ini'))
        self.logger = logging.getLogger(__name__)
is NOT how to use logging. Library code can use loggers, but must not configure anything - this is the application's (your main script / function / whatever) responsibility, the rationale being that the proper logging config depends on the context - which type of application is using the lib (a CLI app, a local GUI app and a backend web app don't have the same needs at all) and in which kind of environment (a local dev env will want many more logs than a production one, for example).
Also, with the logger created with __name__ in your base class module, all child classes will send their logs to the same logger, which is certainly not what you want (you want them to have their own package / module specific loggers so you can fine-tune the config per package / module).
And finally, this:
os.path.join(os.getcwd(), '../myconfig.ini')
certainly doesn't work as you expect - your cwd can be just anything at this point and you have no way of knowing it in advance. If you want to reference a path relative to the current file's directory, you want os.path.dirname(os.path.realpath(__file__)). And of course adding system-specific path stuff (i.e. "../") in an os.path.join() call totally defeats the whole point of using os.path.
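The three points above could look roughly like this. It is a sketch, not the one true layout: path_next_to and configure_logging are hypothetical helper names, and logging.basicConfig merely stands in for whatever configuration the application really wants.

```python
import logging
import os

# Library side: ask for a logger, configure nothing.
logger = logging.getLogger(__name__)


class BaseModel:
    def __init__(self):
        # Use the module where the concrete class is defined, so every
        # package / module gets its own logger name.
        self.logger = logging.getLogger(type(self).__module__)


def path_next_to(module_file, *parts):
    """Build a path relative to the directory that contains module_file."""
    here = os.path.dirname(os.path.realpath(module_file))
    return os.path.normpath(os.path.join(here, *parts))


def configure_logging(level=logging.INFO):
    """Application side: only the entry point decides how logging is set up."""
    logging.basicConfig(level=level)
```

In a real module the config path would then be built with path_next_to(__file__, os.pardir, "myconfig.ini"), using os.pardir instead of a hard-coded "../".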

Python/Django and services as classes

Are there any conventions on how to implement services in Django? Coming from a Java background, we create services for business logic and we "inject" them wherever we need them.
Not sure if I'm using python/django the wrong way, but I need to connect to a 3rd party API, so I'm using an api_service.py file to do that. The question is, I want to define this service as a class, and in Java, I can inject this class wherever I need it and it acts more or less like a singleton. Is there something like this I can use with Django or should I build the service as a singleton and get the instance somewhere or even have just separate functions and no classes?
TL;DR It's hard to tell without more details but chances are you only need a mere module with a couple plain functions or at most just a couple simple classes.
Longer answer:
Python is not Java. You can of course (technically I mean) use Java-ish designs, but this is usually not the best thing to do.
Your description of the problem to solve is a bit too vague to come up with a concrete answer, but we can at least give you a few hints and pointers (no pun intended):
1/ Everything is an object
In python, everything (well, everything you can find on the RHS of an assignment that is) is an object, including modules, classes, functions and methods.
One of the consequences is that you don't need any complex framework for dependency injection - you just pass the desired object (module, class, function, method, whatever) as argument and you're done.
Another consequence is that you don't necessarily need classes for everything - a plain function or module can be just enough.
A typical use case is the strategy pattern, which, in Python, is most often implemented using a mere callback function (or any other callable FWIW).
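As a small illustration of "injection by just passing the object": the function below receives its HTTP getter as an argument. fetch_orders, fake_get and the example URL are all made up for this sketch; no real HTTP library is involved.

```python
def fetch_orders(http_get, base_url):
    """http_get is injected: any callable that takes a URL and returns data."""
    return http_get(base_url + "/orders")


def fake_get(url):
    # Hypothetical stand-in for a real HTTP client, handy in tests.
    return {"url": url, "orders": []}


result = fetch_orders(fake_get, "https://api.example.test")
```

In production code the caller would pass a real client function instead of fake_get; the service code itself doesn't change.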
2/ a python module is a singleton.
As stated above, at runtime a python module is an object (of type module) whose attributes are the names defined at the module's top-level.
Except for some (pathological) corner cases, a python module is only imported once for a given process and is guaranteed to be unique. Combined with the fact that python's "global" scope is really only "module-level" global, this makes modules proper singletons, so this design pattern is actually already built in.
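This is easy to check with any module; the standard library's json is used here only as an example.

```python
import importlib
import sys

import json
import json as json_again

# Both names are bound to the very same module object...
same_object = json is json_again
# ...because the first import cached it in sys.modules,
# and every later import just reuses that entry.
cached = sys.modules["json"] is json
reimported = importlib.import_module("json") is json
```

All three flags come out True, so module-level names in your own api_service.py are shared by every importer within the process.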
3/ a python class is (almost) a singleton
Python classes are objects too (instances of type type, directly or indirectly), and python has classmethods (methods that act on the class itself instead of acting on the current instance) and class-level attributes (attributes that belong to the class object itself, not to its instances), so if you write a class that only has classmethods and class attributes, you technically have a singleton - and you can use this class either directly or through instances without any difference, since classmethods can be called on instances too.
The main difference here wrt/ "modules as singletons" is that with classes you can use inheritance...
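A rough sketch of such a class-level "singleton", including the inheritance bonus. ApiConfig, StagingConfig, the URLs and the token handling are hypothetical.

```python
class ApiConfig:
    base_url = "https://api.example.test"  # hypothetical value
    _token = None

    @classmethod
    def set_token(cls, token):
        cls._token = token

    @classmethod
    def auth_header(cls):
        return {"Authorization": f"Bearer {cls._token}"}


class StagingConfig(ApiConfig):
    # Inheritance lets a variant override just what differs.
    base_url = "https://staging.example.test"


ApiConfig.set_token("abc")
```

After this, ApiConfig.auth_header() and ApiConfig().auth_header() return the same dict. Note that StagingConfig.set_token(...) would set the token on the subclass only.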
4/ python has callables
Python has the concept of "callable" objects. A "callable" is an object whose class implements the __call__() operator, and each such object can be called as if it were a function.
This means that you can not only use functions as objects but also use objects as functions - IOW, the "functor" pattern is builtin. This makes it very easy to "capture" some context in one part of the code and use this context for computations in another part.
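For instance, a callable object can capture the API's base URL and token once and then be passed around like a function. ApiCaller and the string it returns are invented for the example; a real one would perform the request.

```python
class ApiCaller:
    def __init__(self, base_url, token):
        # Context captured once, at construction time.
        self.base_url = base_url
        self.token = token

    def __call__(self, path):
        # A real implementation would perform the request here.
        return f"GET {self.base_url}{path} (token={self.token})"


call_api = ApiCaller("https://api.example.test", "abc")
response = call_api("/users")  # used exactly like a function
```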
5/ a python class is a factory
Python has no new keyword. Python classes are callables, and instantiation is done by just calling the class.
This means that you can actually use a class or function the same way to get an instance, so the "factory" pattern is also builtin.
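In practice this means client code can accept "something that builds a client" without caring whether it is a class or a function. JsonClient, cached_client and connect below are hypothetical names.

```python
class JsonClient:
    def __init__(self, url):
        self.url = url


_clients = {}


def cached_client(url):
    """Factory function: same call signature as the class, but reuses instances."""
    if url not in _clients:
        _clients[url] = JsonClient(url)
    return _clients[url]


def connect(url, factory=JsonClient):
    # The caller cannot tell (and does not care) whether factory is a class
    # or a plain function; both are simply called.
    return factory(url)
```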
6/ python has computed attributes
Besides the most obvious application (replacing a public attribute with a getter/setter pair without breaking client code), this - combined with other features like callables etc. - can prove to be very powerful. As a matter of fact, that's how functions defined in a class become methods.
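Both halves of that statement can be seen in a few lines. The Temperature class is just an example; the last lines call the function's __get__ by hand, which is what attribute lookup normally does for you.

```python
class Temperature:
    def __init__(self, celsius):
        self._celsius = celsius

    @property
    def celsius(self):
        return self._celsius

    @celsius.setter
    def celsius(self, value):
        if value < -273.15:
            raise ValueError("below absolute zero")
        self._celsius = value

    def describe(self):
        return f"{self._celsius} C"


t = Temperature(20)
t.celsius = 25  # looks like plain attribute assignment, runs the setter

# Functions are descriptors too: looking one up through an instance
# calls its __get__, which returns a bound method.
plain_function = Temperature.__dict__["describe"]
bound_method = plain_function.__get__(t, Temperature)
```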
7/ Python is dynamic
Python's objects are (usually) dict-based (there are exceptions but those are few and mostly low-level C-coded classes), which means you can dynamically add / replace (and even remove) attributes and methods (since methods are attributes) on a per-instance or per-class basis.
While this is not a feature you want to use without reason, it's still a very powerful one, as it lets you dynamically customize an object (remember that classes are objects too), allowing for more complex object and class creation schemes than what you can do in a static language.
But Python's dynamic nature goes even further - you can use class decorators and/or metaclasses to tailor the creation of a class object (you may want to have a look at Django models source code for a concrete example), or even just dynamically create a new class using its metaclass and a dict of functions and other class-level attributes.
Here again, this can really make seemingly complex issues a breeze to solve (and avoid a lot of boilerplate code).
Actually, Python exposes and lets you hook into most of its internals (object model, attribute resolution rules, import mechanism etc.), so once you understand the whole design and how everything fits together you really have control over most aspects of your code at runtime.
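A tiny sketch of both ideas: a class decorator that customizes a class, and a class built at runtime by calling its metaclass (type) with a name, bases and a dict. add_repr, greet and Greeter are hypothetical names.

```python
def add_repr(cls):
    """Class decorator: attach a generic __repr__ to the class it receives."""
    def __repr__(self):
        return f"{type(self).__name__}({vars(self)})"
    cls.__repr__ = __repr__
    return cls


def greet(self):
    return f"hello from {self.name}"


# type(name, bases, namespace) creates a new class object at runtime;
# the functions in the namespace dict become its methods.
Greeter = add_repr(type("Greeter", (object,), {"greet": greet, "name": "dynamic"}))

g = Greeter()
g.extra = 1  # attributes can also be added per instance
```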
Python is not Java
Now I understand that all of this looks a bit like a vendor's catalog, but the point is to highlight how Python differs from Java and why canonical Java solutions - or (at least) canonical Java implementations of those solutions - usually don't port well to the Python world. It's not that they don't work at all, just that Python usually has more straightforward (and much simpler IMHO) ways to implement common (and less common) design patterns.
Wrt/ your concrete use case, you will have to post a much more detailed description, but "connecting to a 3rd party API" (I assume a REST API?) from a Django project is so trivial that it really doesn't warrant much design consideration by itself.
In Python you can write the same program structure as in Java. You don't need to be as strongly typed, but you can be. I use type hints when creating common classes and libraries that are used across multiple scripts.
Here you can read about Python typing
You can do the same here in Python. Define your class in a package (folder) called services.
Then if you want a singleton you can do it like this:
class Service(object):
    instance = None
    def __new__(cls):
        if cls.instance is not None:
            return cls.instance
        else:
            # object.__new__ must be given the class explicitly
            inst = cls.instance = super(Service, cls).__new__(cls)
            return inst
And now you import it wherever you want in the rest of the code
from services import Service
Service().do_action()
Adding to the answer given by bruno desthuilliers and TreantBG.
There are certain questions that you can ask about the requirements.
For example, one question could be: does the API being called change with different types of objects?
If the API doesn't change, you will probably be okay with keeping it as a function in some file or a method in some class.
If it does change, such that you are calling API 1 for some scenarios, API 2 for others and so on, you will likely be better off moving/abstracting this logic out to some class (from a code organisation point of view).
PS: Python allows you to be as flexible as you want when it comes to code organisation. It's really up to you to decide how you want to organise the code.

How to redesign an instance where common functionality (in a parent class) warrants returning instances of the children?

As per the title, I know this is inherently terrible design, as a parent should know nothing about its children. However, I am in the scenario where
All children have reusable behaviour that could be derived from the parent class,
Those reusable methods that all subclasses need, warrant returning instances of the current sub classes!
I need all child classes to have the repeatable behaviour, but I also need the parent class to be capable of returning instances of the children. This will need a complete refactor, but how do I design it properly?
I have tried using composition instead; the only problem is that every class then has to declare an explicit public API for using the common functionality, and each and every future subclass will need to declare it too.
class BasePage(object):
    # Some Very Common Behaviour goes here.
    # Nothing product specific, just selenium specific
    pass
class ProductBasePage(BasePage):
    # Reusable behaviour here for the product
    # Some of this behaviour in the parent does web-app navigation
    # which merits (quite rightly) to return instances of children
    # pages, for example (CustomersPage, DashboardPage, ProductsPage)
    # but this is clearly flawed and inheritance is not the answer?
    # but how do I keep the functionality that all subclasses need
    # without then declaring it explicitly in every one of them?
    pass
class CustomersPage(ProductBasePage):
    # Customer specific behaviour
    # Should have access the common functionality implicitly from the parent
    pass
CustomersPage and all other pages which extend the product page should be able to use a lot of common functionality that all subclasses of ProductBasePage should have, without each and every subclass having to define that behaviour explicitly. The reusable methods, however, return instances of ProductBasePage's subclasses, as their actions warrant this behaviour.
So how do I achieve this? I think Inheritance is not the answer here, but then how do I get reusability of the common functionality without declaring it explicitly in every class?
The solution to this problem should avoid circular Python import dependencies as well.
Just put those reusable methods into a separate class, and make them static. Then the subclasses can call those methods implicitly.
If you don't want to use static methods, go with plain functions.
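A sketch of how that might look, under assumptions: Navigation, to_customers and the nav attribute are hypothetical names, and driver stands for whatever selenium object the pages hold.

```python
class BasePage:
    def __init__(self, driver):
        self.driver = driver


class Navigation:
    """Shared navigation helpers, kept outside the page hierarchy."""

    @staticmethod
    def to_customers(page):
        # CustomersPage is only looked up when this method runs, so it can be
        # defined (or imported) after this class.
        return CustomersPage(page.driver)


class ProductBasePage(BasePage):
    # Declared once here; every subclass reaches the helpers via self.nav.
    nav = Navigation


class CustomersPage(ProductBasePage):
    pass
```

If the helpers and the pages live in different modules, doing the import of the page classes inside the helper's body rather than at module top level avoids the circular import at load time.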

Python: add a parent class to a class after initial evaluation

General Python Question
I'm importing a Python library (call it animals.py) with the following class structure:
class Animal(object): pass
class Rat(Animal): pass
class Bat(Animal): pass
class Cat(Animal): pass
...
I want to add a parent class (Pet) to each of the species classes (Rat, Bat, Cat, ...); however, I cannot change the actual source of the library I'm importing, so it has to be a run time change.
The following seems to work:
import animals
class Pet(object): pass
for klass in (animals.Rat, animals.Bat, animals.Cat, ...):
    klass.__bases__ = (Pet,) + klass.__bases__
Is this the best way to inject a parent class into an inheritance tree in Python without making modification to the source definition of the class to be modified?
Motivating Circumstances
I'm trying to graft persistence onto a large library that controls lab equipment. Messing with it is out of the question. I want to give ZODB's Persistent a try. I don't want to write the mixin/facade wrapper library because I'm dealing with 100+ classes and lots of imports in my application code that would need to be updated. I'm testing options by hacking on my entry point only: setting up the DB, patching as shown above (but pulling the species classes w/ introspection on the animals module instead of explicit listing), then closing out the DB as I exit.
Mea Culpa / Request
This is an intentionally general question. I'm interested in different approaches to injecting a parent and comments on the pros and cons of those approaches. I agree that this sort of runtime chicanery would make for really confusing code. If I settle on ZODB I'll do something explicit. For now, as a regular user of python, I'm curious about the general case.
Your method is pretty much how to do it dynamically. The real question is: What does this new parent class add? If you are trying to insert your own methods in a method chain that exists in the classes already there, and they were not written properly, you won't be able to; if you are adding original methods (e.g. an interface layer), then you could possibly just use functions instead.
I am one who embraces Python's dynamic nature, and would have no problem using the code you have presented. Make sure you have good unit tests in place (dynamic or not ;), and that modifying the inheritance tree actually lets you do what you need, and enjoy Python!
You should try really hard not to do this. It is strange, and will likely end in tears.
As @agf mentions, you can use Pet as a mixin. If you tell us more about why you want to insert a parent class, we can help you find a nicer solution.
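For comparison, the mixin route could be sketched like this: new subclasses are generated that combine Pet with each library class, and the library classes themselves stay untouched. The stand-in Animal/Rat/Bat classes, name_tag and with_pet are hypothetical; the drawback mentioned in the question still applies, since application code has to use the new classes.

```python
# Stand-ins for the classes of the imported library.
class Animal: pass
class Rat(Animal): pass
class Bat(Animal): pass


class Pet:
    def name_tag(self):
        return f"pet {type(self).__name__}"


def with_pet(klass):
    """Build a subclass that mixes Pet in front of klass; klass is not modified."""
    return type("Pet" + klass.__name__, (Pet, klass), {})


PetRat = with_pet(Rat)
PetBat = with_pet(Bat)
```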

Traversing object hierarchy pickle style

I need to do some sort of processing on objects just before they get pickled. More precisely, for instances of subclasses of a certain base class, I would like something totally different to be pickled instead and then recreated on loading.
I'm aware of __getstate__ & __setstate__ however this is a very invasive approach. My understanding is that these are private methods (begin with double underscore: __), and as such are subject to name mangling. Therefore this effectively would force me to redefine those two methods for every single class that I want to be subject to this non standard behavior. In addition I don't really have a full control over the hierarchy of all classes.
I was wondering if there is some sort of brief way of hooking into pickling process and applying this sort of control that __getstate__ and __setstate__ give but without having to modify the pickled classes as such.
A side note for the curious ones. This is a use case taken from a project using Django and Celery. Django models are either unpicklable or very impractical and cumbersome to pickle. Therefore it's much more advisable to pickle pairs of values, ID + model class, instead. However, sometimes it's not the model directly that is pickled but rather a dictionary of models, a list of models, a list of lists of models, you name it. This forces me to write a lot of copy-paste code that I really dislike. The need for pickling models itself comes from the Django-celery setup, where functions along with their call arguments are scheduled for later execution. Unfortunately, among those arguments there are usually a lot of models mixed up in some nontrivial hierarchy.
EDIT
I do have a possibility of specifying a custom serializer to be used by Celery, so it's really a question of being able to build a slightly modified serializer on top of pickle without much effort.
The only additional hooks that are related are __reduce__() and __reduce_ex__()
http://docs.python.org/library/pickle.html
What is the difference between __reduce__ and __reduce_ex__?
Python: Ensuring my class gets pickled only with the latest protocol
Not sure if they really provide what you need in particular.
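For what it's worth, the pickle module documents a separate pair of hooks aimed at exactly this "pickle a reference instead of the object" case: Pickler.persistent_id() and Unpickler.persistent_load(). This is a different mechanism from the reduce hooks named above. The sketch below uses a hypothetical Model class with an in-memory registry standing in for Django models and the database.

```python
import io
import pickle


class Model:
    """Hypothetical stand-in for a Django model, identified by class name and pk."""
    registry = {}  # (class name, pk) -> instance; plays the role of the database

    def __init__(self, pk, payload):
        self.pk = pk
        self.payload = payload
        Model.registry[(type(self).__name__, pk)] = self


class ModelPickler(pickle.Pickler):
    def persistent_id(self, obj):
        # Called for every object being pickled, however deeply nested.
        if isinstance(obj, Model):
            return (type(obj).__name__, obj.pk)
        return None  # None means: pickle this object the normal way


class ModelUnpickler(pickle.Unpickler):
    def persistent_load(self, pid):
        try:
            return Model.registry[pid]  # a real version would query the DB
        except KeyError:
            raise pickle.UnpicklingError(f"unknown persistent id: {pid!r}")


def dumps(obj):
    buffer = io.BytesIO()
    ModelPickler(buffer).dump(obj)
    return buffer.getvalue()


def loads(data):
    return ModelUnpickler(io.BytesIO(data)).load()
```

Because the hook sees every nested object, dicts and lists of models need no special code, and the model classes themselves are not modified. Wiring dumps/loads into Celery as a custom serializer is left out here.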
