I have a Python module called model with basically the following content:
class Database:
    class Publication(object):
        pass
    class Article(Publication):
        pass
    class Book(Publication):
        pass

class AnotherDatabase:
    class Seminar(object):
        pass
    ...
I define the objects in the database as classes nested under a main class in order to organize them more distinctly. The objects are parsed from a large XML file, which takes time, so I would like to pickle the parsed objects to make them quicker to load.
I get the error:

pickle.PicklingError: Can't pickle <class 'project.model.Article'>: it's not found as project.model.Article
Pickle looks the class up as project.model.Article, not as project.model.Database.Article where it is defined. Can I fix this error and keep the classes nested as above? Is it a bad idea to organize classes by nesting them?
When an inner class is created, there is no way for the interpreter to know which class it was defined inside of; that information is not recorded. This is why pickle does not know where to look for the class Article.
Because of this there are numerous issues when using inner classes, not just with pickling. If there are classes at module scope with the same name, it introduces a lot of ambiguity, as there is no easy way to tell the two types apart (e.g. with repr or when debugging).
As a result it is generally best to avoid nested classes in Python unless you have a very good reason for doing so.
It's certainly a lot simpler to keep your classes unnested. As an alternative, you can use packages to group the classes together.
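For instance, the nesting could become a package layout along these lines (the file names here are made up):

# project/model/database.py
class Publication(object):
    pass

class Article(Publication):
    pass

class Book(Publication):
    pass

# project/model/__init__.py -- optionally re-export for convenient imports
from project.model.database import Publication, Article, Book

Each class then lives at module scope, so pickle can locate it by its recorded module and name.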
In any case, there is an alternative serializer named cerealizer which I think could handle the nested classes. You would need to register the classes with it before deserialization. I've used it before when pickle wouldn't suffice (also for problems related to the classes) and it works well!
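For illustration, usage might look roughly like this, assuming cerealizer's pickle-like register/dumps/loads API; a sketch, not tested against your project:

import cerealizer
from project import model

# every class that will cross the serialization boundary must be registered
cerealizer.register(model.Database.Publication)
cerealizer.register(model.Database.Article)
cerealizer.register(model.Database.Book)

an_article = model.Database.Article()
data = cerealizer.dumps(an_article)      # serialize the parsed object
restored = cerealizer.loads(data)        # recreate it later, without re-parsing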
Related
I'm working on a Python project where I have tons of very different functions. I want to organize them somehow for future use and, of course, for easier debugging.
However, is it a good habit to arrange lots of functions in a class so that they become attributes, or should I just put them all in a file?
A related post can be found here: Differences between `class` and `def`, where the author specifically indicated that "class is used to define a class (a template from which you can instantiate objects)", so it seemed that plain functions, as opposed to methods that operate on objects, might not belong in a class.
But the official Python documentation states that "Classes provide a means of bundling data and functionality together." So arranging a bunch of functions in a class seems to be a suggested habit.
An example attempt is as follows:
class class1(object):
    """description of class"""
    def fun1(x, y):
        return x**2 + y**2
where
class1.fun1(1,2)
returns the result 5. As shown, the function fun1 is now better organized, and one can easily find where it is and debug it.
However, one could simply import the function fun1 from a file as
def fun1(x, y):
    return x**2 + y**2
and use it as
fun1(1,2)
which seems messier.
Should I arrange all those functions in a class as attributes, or just put them all in a file?
As an example, let's say I am building a REST API using Django Rest Framework. As part of the application, a few methods are common across all views. My approach is to create a services.py file in the root directory, containing a class (CommonUtils) with all the common utility methods. In that same services.py module I have instantiated an object of CommonUtils.
Now across the application, in the different views.py files I am importing the object from the module and calling the methods on that object. So, essentially I am using a singleton object for the common utility methods.
I feel like this is not a good design approach. So I want to get an explanation of why this approach is not a good idea, and what would be the best practice or best approach to achieve the same thing, i.e. using a set of common utility methods across all views.py files.
Thanks in advance.
Is this the right design? Why? How to do better?
I feel like this is not a good design approach. So I want to get an explanation of why this approach is not a good idea, and what would be the best practice or best approach to achieve the same thing, i.e. using a set of common utility methods across all views.py files.
As @Dmitry Belaventsev wrote above, there is no general rule to solve this problem. This is a typical case of cross-cutting concerns.
Now across the application, in the different views.py files I am importing the object from the module and calling the methods on that object. So, essentially I am using a singleton object for the common utility methods.
Yes, your implementation is actually a singleton and there is nothing wrong with it. You should ask yourself what you want to achieve or what you really need. There are a lot of solutions, and you can start with the most basic one:
A simple function in a Python module
# file is named utils.py and lives in the root directory
def helper_function_one(param):
    return transcendent_all_evil_of(param)

def helper_function_two(prename, lastname):
    return 'Hello {} {}'.format(prename, lastname)
In Python it is not uncommon to use just plain functions in a module. You can upgrade it to a method (and a class) if this is really necessary and you need the advantages of classes and objects.
You can also use a class with static methods:
# utils.py
class Utils:
    @staticmethod
    def helper_one():
        print('do something')
But as you can see, this is no different from the solution with plain functions, apart from the extra layer of the class; it adds no further value.
You could also write a singleton class, but in my opinion this is not very pythonic, because you get the same result with a simple object instance in a module.
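To illustrate that last point, a minimal sketch (module and names are made up): Python caches modules in sys.modules after the first import, so a module-level instance already behaves like a singleton.

# utils.py
class CommonUtils:
    def greet(self, prename, lastname):
        return 'Hello {} {}'.format(prename, lastname)

common_utils = CommonUtils()  # created exactly once, at first import

# any views.py
from utils import common_utils  # every importer gets the same instance
print(common_utils.greet('Jane', 'Doe'))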
I have a class whose instances are classes. Each instance has its own name. I can add that name to the enclosing module when the new class is created, which allows me to pickle.dump it. Originally, I didn't add the name to the module, but I had a __reduce__ method in the top-level class. Unfortunately, there's some mysterious code in the pickle module that special-cases subclasses of type and looks for the name in the module rather than calling __reduce__. If that code were inside a try, and just continued on to the __reduce__ code on failure, I think using __reduce__ would work fine.
Anyway, the problem with just adding the name is that on pickle.load the name doesn't exist and the load fails. Presumably, I could add a __getattr__ to the module to interpret the name and create the corresponding class, but my understanding is that not all versions of Python support module-level __getattr__.
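(Module-level __getattr__ was added in Python 3.7 by PEP 562.) A rough sketch of that idea, where make_class and the naming convention are hypothetical placeholders:

# at the bottom of the module that defines the dynamic classes
def __getattr__(name):
    if name.startswith('Generated_'):   # hypothetical naming convention
        cls = make_class(name)          # recreate the class on demand
        globals()[name] = cls           # cache it so the lookup runs only once
        return cls
    raise AttributeError(name)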
The __reduce__ method in the class instances allows instances of those class instances to be pickled successfully, by recreating the class instances on pickle.load, but pickling the class instances themselves doesn't work.
Other than making a slightly nonstandard pickle module, is there some reasonable solution to allow pickling of the class instances?
General Python Question
I'm importing a Python library (call it animals.py) with the following class structure:
class Animal(object): pass
class Rat(Animal): pass
class Bat(Animal): pass
class Cat(Animal): pass
...
I want to add a parent class (Pet) to each of the species classes (Rat, Bat, Cat, ...); however, I cannot change the actual source of the library I'm importing, so it has to be a run time change.
The following seems to work:
import animals
class Pet(object): pass
for klass in (animals.Rat, animals.Bat, animals.Cat, ...):
    klass.__bases__ = (Pet,) + klass.__bases__
Is this the best way to inject a parent class into an inheritance tree in Python without making modification to the source definition of the class to be modified?
Motivating Circumstances
I'm trying to graft persistence onto a large library that controls lab equipment. Messing with its source is out of the question. I want to give ZODB's Persistent a try. I don't want to write the mixin/facade wrapper library because I'm dealing with 100+ classes and lots of imports in my application code that would need to be updated. I'm testing options by hacking on my entry point only: setting up the DB, patching as shown above (but pulling the species classes with introspection on the animals module instead of listing them explicitly), then closing out the DB as I exit.
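That introspection variant might look roughly like this (a sketch using the hypothetical animals names from above):

import inspect
import animals

class Pet(object):
    pass

# patch every Animal subclass found in the module, instead of listing them
for _name, klass in inspect.getmembers(animals, inspect.isclass):
    if issubclass(klass, animals.Animal) and klass is not animals.Animal:
        klass.__bases__ = (Pet,) + klass.__bases__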
Mea Culpa / Request
This is an intentionally general question. I'm interested in different approaches to injecting a parent class, and in comments on the pros and cons of those approaches. I agree that this sort of runtime chicanery would make for really confusing code. If I settle on ZODB, I'll do something explicit. For now, as a regular user of Python, I'm curious about the general case.
Your method is pretty much how to do it dynamically. The real question is: What does this new parent class add? If you are trying to insert your own methods in a method chain that exists in the classes already there, and they were not written properly, you won't be able to; if you are adding original methods (e.g. an interface layer), then you could possibly just use functions instead.
I am one who embraces Python's dynamic nature, and would have no problem using the code you have presented. Make sure you have good unit tests in place (dynamic or not ;), and that modifying the inheritance tree actually lets you do what you need, and enjoy Python!
You should try really hard not to do this. It is strange, and will likely end in tears.
As @agf mentions, you can use Pet as a mixin. If you tell us more about why you want to insert a parent class, we can help you find a nicer solution.
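For comparison, the mixin route subclasses locally instead of mutating the library's classes; a sketch (PersistentRat is a made-up name):

import animals

class Pet(object):
    """Mixin carrying the behaviour to graft on."""

class PersistentRat(Pet, animals.Rat):
    pass

# application code instantiates the combined class instead of animals.Rat
rat = PersistentRat()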
I need to do some processing on objects just before they get pickled. More precisely, for instances of subclasses of a certain base class I would like something totally different to be pickled instead, and then recreated on loading.
I'm aware of __getstate__ & __setstate__; however, this is a very invasive approach, since those methods have to be defined on the pickled classes themselves, effectively forcing me to modify every single class that I want to be subject to this non-standard behavior. In addition, I don't really have full control over the hierarchy of all the classes.
I was wondering if there is some concise way of hooking into the pickling process to get the kind of control that __getstate__ and __setstate__ give, but without having to modify the pickled classes as such.
A side note for the curious: this use case is taken from a project using Django and Celery. Django models are either unpicklable or very impractical and cumbersome to pickle, so it's much more advisable to pickle pairs of ID + model class instead. However, sometimes it's not the model directly that is pickled but a dictionary of models, a list of models, a list of lists of models, you name it. This forces me to write a lot of copy-paste code that I really dislike. The need to pickle models comes from the django-celery setup, where functions along with their call arguments are scheduled for later execution. Unfortunately, among those arguments there are usually a lot of models mixed up in some nontrivial hierarchy.
EDIT
I do have a possibility of specifying a custom serializer to be used by Celery, so it's really a question of being able to build a slightly modified serializer on top of pickle without much effort.
The only additional hooks that are related are __reduce__() and __reduce_ex__():
http://docs.python.org/library/pickle.html
What is the difference between __reduce__ and __reduce_ex__?
Python: Ensuring my class gets pickled only with the latest protocol
Not sure if they really provide what you need in particular.
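One possibility worth sketching: a Pickler subclass with reducer_override (available since Python 3.8) can intercept instances of a base class, subclasses included, without touching the classes themselves. All names below are made up for illustration:

import io
import pickle

class Base:
    """Stands in for the real base class whose subclasses need special pickling."""
    def __init__(self, pk):
        self.pk = pk

def rebuild(cls, pk):
    # a real project would refetch the object here, e.g. by database ID
    return cls(pk)

class ModelPickler(pickle.Pickler):
    def reducer_override(self, obj):
        # intercept every instance of Base, including subclass instances
        if isinstance(obj, Base):
            return (rebuild, (type(obj), obj.pk))
        return NotImplemented  # everything else pickles as usual

buf = io.BytesIO()
ModelPickler(buf).dump({'models': [Base(1), Base(2)]})
restored = pickle.loads(buf.getvalue())

Because reducer_override is consulted for every object the pickler encounters, the models are handled even when buried in dictionaries, lists, or lists of lists, which avoids the copy-paste code described above.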