Python's dependency injection, and custom pickler - python

I am trying to serialize a class that has an external dependency.
The class receives a config in its __init__ function, creates an object from that config, and assigns it to self.
What I want is to serialize the class and, depending on the context in which it is recreated, inject a different config.
class Foo:
    def __init__(self, some_value, config):
        self.some_value = some_value
        self.some_service = SomeService(config)
What I'd want in this scenario is to serialize self.some_value, but not self.some_service (nor config, as this changes).
So, what is the proper pattern? I've had a look at the __getstate__/__setstate__ dunders, which are perfect for serializing only part of the class, but not for injecting a config when unpickling. I would have expected Unpickler to work perfectly in this instance, but it doesn't seem like it (and it seems to work only with files for some reason? The data is serialized in a Redis DB, so there is no file). I'd also rather not have a service locator: I want the config injected rather than fetched.
Clarifications :
The issue is not how to use pickle or the pickler. The issue is more of a choice of pattern. I have external dependencies in the object (as denoted by self.some_service = SomeService(config)).
There are 2 ways to reconstruct that object at unpickling:
Use a custom Pickler/Unpickler to detect the external dependency instance, serialize a hash of it, and when unpickling, give it whatever instance it requires at that moment.
Create the __getstate__/__setstate__ dunder functions that detect the service and do not serialize it.
Both have their pros and cons, but I'd like to know which one would be recommended. The unpickler can hold the external dependencies and reassign them to the object when unpickling, but it seems like a 'heavy' solution. Using the __getstate__/__setstate__ dunders requires the class to know how to fetch the external dependencies, and seems a bit 'magical' (the class fetching external services instead of them being given to the class).
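For the first option, pickle already documents a pair of hooks meant for objects that live outside the pickle stream: Pickler.persistent_id and Unpickler.persistent_load. Both classes accept any file-like object, so io.BytesIO is enough when the bytes come from Redis rather than a file. The following is only a sketch: SomeService is a stand-in for the real service, and the "some_service" tag and the services dictionary are names assumed for illustration.

```python
import io
import pickle


class SomeService:
    """Hypothetical stand-in for the external dependency."""

    def __init__(self, config):
        self.config = config


class Foo:
    def __init__(self, some_value, config):
        self.some_value = some_value
        self.some_service = SomeService(config)


class ServicePickler(pickle.Pickler):
    def persistent_id(self, obj):
        # Replace every SomeService instance with a symbolic tag.
        if isinstance(obj, SomeService):
            return "some_service"
        return None  # everything else is pickled as usual


class ServiceUnpickler(pickle.Unpickler):
    def __init__(self, file, services):
        super().__init__(file)
        self._services = services  # tag -> instance to inject

    def persistent_load(self, pid):
        try:
            return self._services[pid]
        except KeyError:
            raise pickle.UnpicklingError(f"unknown persistent id: {pid!r}") from None


def dumps(obj):
    buffer = io.BytesIO()
    ServicePickler(buffer).dump(obj)
    return buffer.getvalue()  # plain bytes, ready to store in Redis


def loads(data, services):
    return ServiceUnpickler(io.BytesIO(data), services).load()


payload = dumps(Foo(42, {"env": "producer"}))
restored = loads(payload, {"some_service": SomeService({"env": "consumer"})})
```

With this approach Foo itself stays unaware of pickling; the knowledge about which objects are external lives in the pickler pair.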

The solution I ended up with is essentially a bit like a service locator : I unpickle the instance, then I call a custom install_dependencies(injector)
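A minimal sketch of that final approach, assuming the injector is a plain dictionary keyed by dependency name (the real injector type is not shown in the question, and SomeService here is a stand-in):

```python
import pickle


class SomeService:
    """Hypothetical stand-in for the external dependency."""

    def __init__(self, config):
        self.config = config


class Foo:
    def __init__(self, some_value, config):
        self.some_value = some_value
        self.some_service = SomeService(config)

    def __getstate__(self):
        # Pickle everything except the external service.
        state = self.__dict__.copy()
        state.pop("some_service", None)
        return state

    def __setstate__(self, state):
        self.__dict__.update(state)
        self.some_service = None  # filled in by install_dependencies

    def install_dependencies(self, injector):
        # Assumption: the injector behaves like a mapping of name -> instance.
        self.some_service = injector["some_service"]


data = pickle.dumps(Foo(42, {"env": "producer"}))
foo = pickle.loads(data)
foo.install_dependencies({"some_service": SomeService({"env": "consumer"})})
```

The object is unusable between pickle.loads and install_dependencies, so the two calls are best kept together in one helper.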

Related

Python/Django and services as classes

Are there any conventions on how to implement services in Django? Coming from a Java background, we create services for business logic and we "inject" them wherever we need them.
Not sure if I'm using python/django the wrong way, but I need to connect to a 3rd party API, so I'm using an api_service.py file to do that. The question is, I want to define this service as a class, and in Java, I can inject this class wherever I need it and it acts more or less like a singleton. Is there something like this I can use with Django or should I build the service as a singleton and get the instance somewhere or even have just separate functions and no classes?
TL;DR It's hard to tell without more details but chances are you only need a mere module with a couple plain functions or at most just a couple simple classes.
Longest answer:
Python is not Java. You can of course (technically I mean) use Java-ish designs, but this is usually not the best thing to do.
Your description of the problem to solve is a bit too vague to come up with a concrete answer, but we can at least give you a few hints and pointers (no pun intended):
1/ Everything is an object
In python, everything (well, everything you can find on the RHS of an assignment that is) is an object, including modules, classes, functions and methods.
One of the consequences is that you don't need any complex framework for dependency injection - you just pass the desired object (module, class, function, method, whatever) as argument and you're done.
Another consequence is that you don't necessarily need classes for everything - a plain function or module can be just enough.
A typical use case is the strategy pattern, which, in Python, is most often implemented using a mere callback function (or any other callable FWIW).
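As an illustration of both points, here is a small sketch in which the dependency and the strategy are ordinary callables passed as arguments. All names (fetch_user, fake_http_get, apply_discount) are made up for the example.

```python
def fetch_user(user_id, http_get):
    """`http_get` is the injected dependency: any callable taking a URL."""
    return http_get(f"/users/{user_id}")


def fake_http_get(url):
    # Test double standing in for a real HTTP client.
    return {"url": url, "name": "test user"}


result = fetch_user(7, http_get=fake_http_get)


def apply_discount(price, strategy):
    # Strategy pattern: the strategy is just a callable.
    return strategy(price)


half_price = apply_discount(100, lambda price: price / 2)
no_discount = apply_discount(100, lambda price: price)
```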
2/ a python module is a singleton.
As stated above, at runtime a python module is an object (of type module) whose attributes are the names defined at the module's top-level.
Except for some (pathological) corner cases, a python module is only imported once for a given process and is guaranteed to be unique. Combined with the fact that python's "global" scope is really only "module-level" global, this makes modules proper singletons, so this design pattern is actually already builtin.
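To keep the following sketch self-contained, a module object is built by hand and registered in sys.modules; in a real project it would simply be a file (the name app_settings is hypothetical). The point is that every import yields the same object, so state set through one name is visible through the other.

```python
import sys
import types

# Stand-in for a real file such as app_settings.py (hypothetical name).
_module = types.ModuleType("app_settings")
_module.API_URL = "https://example.invalid/api"
sys.modules["app_settings"] = _module

import app_settings as first
import app_settings as second

# Both names refer to the single module object.
second.API_URL = "https://example.invalid/v2"
same_object = first is second
seen_from_first = first.API_URL
```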
3/ a python class is (almost) a singleton
Python classes are objects too (instances of type type, directly or indirectly), and python has classmethods (methods that act on the class itself instead of acting on the current instance) and class-level attributes (attributes that belong to the class object itself, not to its instances), so if you write a class that only has classmethods and class attributes, you technically have a singleton - and you can use this class either directly or through instances without any difference since classmethods can be called on instances too.
The main difference here wrt/ "modules as singletons" is that with classes you can use inheritance...
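A short sketch of such a class, with inheritance used to override one attribute. ApiConfig, StagingApiConfig and the URLs are invented for the example.

```python
class ApiConfig:
    # Class-level attributes: shared state that lives on the class object.
    base_url = "https://example.invalid/v1"
    timeout = 10

    @classmethod
    def endpoint(cls, path):
        return f"{cls.base_url}/{path.lstrip('/')}"


class StagingApiConfig(ApiConfig):
    # Inheritance: only the differing attribute is overridden.
    base_url = "https://staging.example.invalid/v1"


from_class = ApiConfig.endpoint("/users")
from_instance = ApiConfig().endpoint("/users")
from_subclass = StagingApiConfig.endpoint("users")
```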
4/ python has callables
Python has the concept of "callable" objects. A "callable" is an object whose class implements the __call__() operator, and each such object can be called as if it were a function.
This means that you can not only use functions as objects but also use objects as functions - IOW, the "functor" pattern is builtin. This makes it very easy to "capture" some context in one part of the code and use this context for computations in another part.
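For instance, a callable object can capture the base URL and token of a third-party API once and then be passed around like a function. This is a sketch only: ApiCaller and build_requests are hypothetical names, and no network call is made.

```python
class ApiCaller:
    def __init__(self, base_url, token):
        # Context captured once, at construction time.
        self.base_url = base_url.rstrip("/")
        self.token = token

    def __call__(self, path):
        # Describes the request instead of sending it, to stay self-contained.
        return {
            "url": f"{self.base_url}/{path.lstrip('/')}",
            "headers": {"Authorization": f"Bearer {self.token}"},
        }


def build_requests(paths, make_request):
    # `make_request` can be a function or any other callable object.
    return [make_request(path) for path in paths]


call_api = ApiCaller("https://example.invalid/v1/", token="secret")
requests_to_send = build_requests(["/users", "orders"], call_api)
```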
5/ a python class is a factory
Python has no new keyword. Python classes are callables, and instantiation is done by just calling the class.
This means that you can actually use a class or function the same way to get an instance, so the "factory" pattern is also builtin.
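In the sketch below, open_twice does not care whether it receives the class itself or a function wrapping it. Connection and make_cached_factory are illustrative names, not part of any library.

```python
class Connection:
    def __init__(self, host):
        self.host = host


def make_cached_factory(factory):
    """Wrap any factory so that each host gets a single shared instance."""
    cache = {}

    def cached(host):
        if host not in cache:
            cache[host] = factory(host)
        return cache[host]

    return cached


def open_twice(factory, host):
    # Works the same whether `factory` is a class or a function.
    return factory(host), factory(host)


fresh_a, fresh_b = open_twice(Connection, "db.example.invalid")
cached_a, cached_b = open_twice(make_cached_factory(Connection), "db.example.invalid")
```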
6/ python has computed attributes
and besides the most obvious application (replacing a public attribute by a pair of getter/setter without breaking client code), this - combined with other features like callables etc - can prove to be very powerful. As a matter of fact, that's how functions defined in a class become methods.
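A brief sketch of both claims: a property replacing a plain attribute, and the same descriptor mechanism (__get__) turning a plain function into a bound method. Temperature and greet are names chosen for the example.

```python
class Temperature:
    def __init__(self, celsius):
        self._celsius = celsius

    @property
    def celsius(self):
        return self._celsius

    @celsius.setter
    def celsius(self, value):
        # Validation added later without changing client code (t.celsius = ...).
        if value < -273.15:
            raise ValueError("below absolute zero")
        self._celsius = value

    @property
    def fahrenheit(self):
        # Computed on access, never stored.
        return self._celsius * 9 / 5 + 32


def greet(self):
    return f"hello from {type(self).__name__}"


t = Temperature(100)
# Functions are descriptors: __get__ is what binds them to an instance.
bound_greet = greet.__get__(t, Temperature)
```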
7/ Python is dynamic
Python's objects are (usually) dict-based (there are exceptions but those are few and mostly low-level C-coded classes), which means you can dynamically add / replace (and even remove) attributes and methods (since methods are attributes) on a per-instance or per-class basis.
While this is not a feature you want to use without reasons, it's still a very powerful one as it allows to dynamically customize an object (remember that classes are objects too), allowing for more complex objects and classes creation schemes than what you can do in a static language.
But Python's dynamic nature goes even further - you can use class decorators and/or metaclasses to tailor the creation of a class object (you may want to have a look at Django models source code for a concrete example), or even just dynamically create a new class using its metaclass and a dict of functions and other class-level attributes.
Here again, this can really make seemingly complex issues a breeze to solve (and avoid a lot of boilerplate code).
Actually, Python exposes and lets you hook into most of its internals (object model, attribute resolution rules, import mechanism etc), so once you understand the whole design and how everything fits together you really have control over most aspects of your code at runtime.
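Three of these mechanisms in one small sketch: a class decorator, a class created at runtime by calling its metaclass (type), and a per-instance attribute replacement. The registry and the notifier classes are invented for illustration.

```python
registry = {}


def register(cls):
    """Class decorator: record the class under its name, return it unchanged."""
    registry[cls.__name__] = cls
    return cls


@register
class EmailNotifier:
    def send(self, message):
        return f"email: {message}"


def send_sms(self, message):
    return f"sms: {message}"


# Create a class at runtime by calling the metaclass with
# a name, a tuple of bases, and a dict of attributes.
SmsNotifier = register(type("SmsNotifier", (object,), {"send": send_sms}))

notifier = registry["SmsNotifier"]()
# Per-instance replacement: a plain callable stored on the instance
# is not bound, so it does not take `self`.
notifier.send = lambda message: f"patched: {message}"
```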
Python is not Java
Now I understand that all of this looks a bit like a vendor's catalog, but the point is to highlight how Python differs from Java and why canonical Java solutions - or (at least) canonical Java implementations of those solutions - usually don't port well to the Python world. It's not that they don't work at all, just that Python usually has more straightforward (and much simpler IMHO) ways to implement common (and less common) design patterns.
wrt/ your concrete use case, you will have to post a much more detailed description, but "connecting to a 3rd party API" (I assume a REST api ?) from a Django project is so trivial that it really doesn't warrant much design considerations by itself.
In Python you can use the same program structure as in Java. You don't need to be so strongly typed, but you can be. I use types when creating common classes and libraries that are used across multiple scripts.
Here you can read about Python typing
You can do the same here in Python. Define your class in a package (folder) called services.
Then if you want a singleton you can do it like this:
class Service(object):
    instance = None

    def __new__(cls):
        if cls.instance is not None:
            return cls.instance
        else:
            # object.__new__ must be given the class explicitly
            inst = cls.instance = super(Service, cls).__new__(cls)
            return inst
And now you import it wherever you want in the rest of the code
from services import Service
Service().do_action()
Adding to the answer given by bruno desthuilliers and TreantBG.
There are certain questions that you can ask about the requirements.
For example, one question could be: does the API being called change with different types of objects?
If the API doesn't change, you will probably be okay with keeping it as a method in some file or class.
If it does change, such that you are calling API 1 for some scenarios, API 2 for others, and so on, you will likely be better off moving/abstracting this logic out to some class (from a code organisation point of view).
PS: Python allows you to be as flexible as you want when it comes to code organisation. It's really up to you to decide how you want to organise the code.

Advantages of using static methods over instance methods in python

My IDE keeps suggesting I convert my instance methods to static methods. I guess because I haven't referenced any self within these methods.
An example is :
class NotificationViewSet(NSViewSet):
    def pre_create_processing(self, request, obj):
        log.debug(" creating messages ")
        # Ensure data is consistent and belongs to the sending bot.
        obj['user_id'] = request.auth.owner.id
        obj['bot_id'] = request.auth.id
So my question would be: do I lose anything by just ignoring the IDE suggestions, or is there more to it?
This is a matter of workflow, intentions with your design, and also a somewhat subjective decision.
First of all, you are right, your IDE suggests converting the method to a static method because the method does not use the instance. It is most likely a good idea to follow this suggestion, but you might have a few reasons to ignore it.
Possible reasons to ignore it:
The code is soon to be changed to use the instance (on the other hand, the idea of soon is subjective, so be careful)
The code is legacy and not entirely understood/known
The interface is used in a polymorphic/duck typed way (e.g. you have a collection of objects with this method and you want to call them in a uniform way, but the implementation in this class happens to not need to use the instance - which is a bit of a code smell)
The interface is specified externally and cannot be changed (this is analogous to the previous reason)
The AST of the code is read/manipulated either by itself or something that uses it and expects this method to be an instance method (this again is an external dependency on the interface)
I'm sure there can be more, but failing these types of reasons I would follow the suggestion. However, if the method does not belong to the class (e.g. factory method or something similar), I would refactor it to not be part of the class.
I think that you might be mixing up some terminology - the example is not a class method. Class methods receive the class as the first argument, they do not receive the instance. In this case you have a normal instance method that is not using its instance.
If the method does not belong in the class, you can move it out of the class and make it a standard function. Otherwise, if it should be bundled as part of the class, e.g. it's a factory function, then you should probably make it a static method as this (at a minimum) serves as useful documentation to users of your class that the method is coupled to the class, but not dependent on its state.
Making the method static also has the advantage that it can be overridden in subclasses of the class. If the method were moved outside of the class as a regular function, then overriding it through subclassing would not be possible.
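The overriding point can be sketched as follows. Notifier and LoudNotifier are made-up classes; the important detail is that the static method is called through self, so the subclass version is picked up.

```python
class Notifier:
    @staticmethod
    def format_message(text):
        # Needs neither the instance nor the class.
        return text.strip()

    def notify(self, text):
        # Looked up through self, so subclasses can override it.
        return f"sent: {self.format_message(text)}"


class LoudNotifier(Notifier):
    @staticmethod
    def format_message(text):
        return text.strip().upper()


plain = Notifier().notify("  hi  ")
loud = LoudNotifier().notify("  hi  ")
without_instance = Notifier.format_message("  x  ")
```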

Add methods to a third party library class in Python

I'm using a third party library (PySphere) for a project I'm working on. PySphere provides a simple API to interact with VMware. This is a general problem though, not specific to this library.
A simple use of the library would be to get a VM object, and then perform various operations on it:
vm_obj = vcenter.get_vm_by_name("My VM")
vm_obj.get_status()
vm_obj.power_on()
I'd like to add a few methods to the vm_obj class. These methods are highly specific to the OS in use on the VM and wouldn't be worthwhile to commit back to the library. Right now I've been doing it like so:
set_config_x(vm_obj, args)
This seems really unpythonic. I'd like to be able to add my methods to the vm_obj class, without modifying the class definition in the third party library directly.
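You can attach a plain function to the class object (that is, vm_obj.__class__), and it then becomes a method of every instance of that class. A function attached directly to a single instance, however, is not bound: it is not a method and does not receive self automatically. To make a real bound method for one instance, you can use types.MethodType from the standard library (the new module and its new.instancemethod existed only in Python 2):
vm_obj.set_config_x = types.MethodType(callableFunction, vm_obj)
where callableFunction takes self (vm_obj) as its first argument.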
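A runnable sketch for Python 3, where types.MethodType does the binding. VirtualMachine is a hypothetical stand-in for the third-party class, since PySphere itself is not imported here.

```python
import types


class VirtualMachine:
    """Hypothetical stand-in for the third-party VM class."""

    def __init__(self, name):
        self.name = name


def set_config_x(self, value):
    self.config_x = value
    return f"{self.name}: config_x={value}"


vm_obj = VirtualMachine("My VM")

# Option 1: bind the function to this one instance only.
vm_obj.set_config_x = types.MethodType(set_config_x, vm_obj)
result = vm_obj.set_config_x(3)

# Option 2: attach it to the class, so every instance gets the method.
VirtualMachine.set_config_x = set_config_x
other = VirtualMachine("Other VM")
other_result = other.set_config_x(5)
```

Patching the class affects all code that uses it, so subclassing or a small wrapper may be preferable when the library lets you control which class gets instantiated.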

Traversing object hierarchy pickle style

I need to do some sort of processing on objects just before they get pickled. More precisely, for instances of subclasses of a certain base class, I would like something totally different to be pickled instead and then recreated on loading.
I'm aware of __getstate__ & __setstate__ however this is a very invasive approach. My understanding is that these are private methods (begin with double underscore: __), and as such are subject to name mangling. Therefore this effectively would force me to redefine those two methods for every single class that I want to be subject to this non standard behavior. In addition I don't really have a full control over the hierarchy of all classes.
I was wondering if there is some sort of brief way of hooking into pickling process and applying this sort of control that __getstate__ and __setstate__ give but without having to modify the pickled classes as such.
A side note for the curious ones. This is a use case taken from a project using Django and Celery. Django models are either unpicklable or very impractical and cumbersome to pickle. Therefore it's much more advisable to pickle pairs of values ID + model class instead. However sometimes it's not the model directly that is pickled but rather a dictionary of models, a list of models, a list of lists of models, you name it. This forces me to write a lot of copy-paste code that I really dislike. The need for pickling models itself comes from the Django-Celery setup, where functions along with their call arguments are scheduled for later execution. Unfortunately among those arguments there are usually a lot of models mixed up in some nontrivial hierarchy.
EDIT
I do have a possibility of specifying a custom serializer to be used by Celery, so it's really a question of being able to build a slightly modified serializer on top of pickle without much effort.
The only additional hooks that are related are __reduce__() and __reduce_ex__()
http://docs.python.org/library/pickle.html
What is the difference between __reduce__ and __reduce_ex__?
Python: Ensuring my class gets pickled only with the latest protocol
Not sure if they really provide what you need in particular.
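On Python 3.8 and later, pickle.Pickler also offers reducer_override, which lets a custom pickler decide how to reduce any object without touching the pickled classes. The sketch below replaces every instance of a base class by a (class, pk) pair wherever it appears in a nested structure. Model, Article, DATABASE and load_model are stand-ins invented here; a real Django version would query the ORM instead of a dictionary.

```python
import io
import pickle

DATABASE = {}  # fake storage keyed by (class, pk), standing in for the ORM


class Model:
    """Hypothetical stand-in for a Django model base class."""

    def __init__(self, pk):
        self.pk = pk
        DATABASE[(type(self), pk)] = self


class Article(Model):
    pass


def load_model(model_class, pk):
    # Called at unpickling time to recreate (here: look up) the instance.
    return DATABASE[(model_class, pk)]


class ModelPickler(pickle.Pickler):
    def reducer_override(self, obj):
        if isinstance(obj, Model):
            # Pickle only the class and primary key.
            return load_model, (type(obj), obj.pk)
        return NotImplemented  # fall back to the default behaviour


def dumps(obj):
    buffer = io.BytesIO()
    ModelPickler(buffer).dump(obj)
    return buffer.getvalue()


article = Article(1)
data = dumps({"items": [article, [article]], "count": 2})
restored = pickle.loads(data)  # a standard unpickler is enough
```

Because the substitution happens inside the pickler, dictionaries, lists and lists of lists of models need no special handling.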

python global object cache

Little question concerning app architecture:
I have a python script, running as a daemon.
Inside I have many objects, all inheriting from one class (let's name it 'entity').
I have also one main object, let it be 'topsys'
Entities are identified by a pair (id, type (= class, roughly)), and they are connected in many wicked ways. They are also created and deleted all the time, and they need to access other entities.
So, I need a kind of storage, basically a dictionary of dictionaries (one for each type), holding all entities.
And the question is, what is better: attach this dictionary to 'topsys' as an object property, or to class entity, as a property of the class? I would opt for the second (so entities do not need to know of the existence of 'topsys'), but I am not feeling good about using properties directly in classes. Or maybe there is another way?
There's not enough detail here to be certain of what's best, but in general I'd store the actual object registry as a module-level (global) variable in the module that defines the base class, and have a method in the base class to access it.
_entities = []

class entity(object):
    @staticmethod
    def get_entity_registry():
        return _entities
Alternatively, hide _entities entirely and expose a few methods, e.g. get_object_by_id, register_object, so you can change the storage of _entities itself more easily later on.
By the way, a tip in case you're not there already: you'll probably want to look into weakrefs when creating object registries like this.
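A sketch of such a registry using weakref.WeakValueDictionary, so that an entity disappears from the registry once nothing else references it. The key layout (class name, id) and the Entity/Player names are assumptions made for the example.

```python
import weakref


class Entity:
    # Values are held weakly: the registry alone does not keep entities alive.
    _registry = weakref.WeakValueDictionary()

    def __init__(self, entity_id):
        self.id = entity_id
        Entity._registry[(type(self).__name__, entity_id)] = self

    @classmethod
    def get(cls, entity_id):
        # Returns None when the entity is unknown or already collected.
        return Entity._registry.get((cls.__name__, entity_id))


class Player(Entity):
    pass


player = Player(1)
found = Player.get(1)
missing = Player.get(2)
```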
There is no problem with using properties on classes. Classes are just objects, too.
In your case, with this little information available, I would go for a class property, too, because not creating dependencies is great and will be one less worry later on.
