How do I visualize and modify the content of a pickled function? - python

I need to use a pickled data processing function that was not written by me, so I do not know its content or structure. When I load it, a ModuleNotFoundError occurs:
ModuleNotFoundError: No module named 'sklearn.preprocessing.label'
I assume the error occurs because the pickled object is trying to import a module named 'sklearn.preprocessing.label', which doesn't exist. I have tried downgrading my sklearn version, but that didn't work either.
If I knew what the pickled object was doing, I could simply write my own function to replace the one inside the pickled object. To do that I would have to visualize the function contained in the pickled object, or remove the import of sklearn.preprocessing.label.

sklearn.preprocessing.label was available in scikit-learn version 0.21 and below: https://github.com/scikit-learn/scikit-learn/tree/0.21.X/sklearn/preprocessing
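One way to approach this (not from the original question, just a sketch): first inspect which modules and classes the pickle refers to with pickletools, then redirect the stale path while unpickling with a custom Unpickler. The filename processing_func.pkl is a placeholder, and the remap assumes the class the pickle needs (e.g. LabelEncoder) is still importable from sklearn.preprocessing in your installed version.
import pickle
import pickletools

# Inspect which globals the pickle references without executing it:
# look for GLOBAL / STACK_GLOBAL opcodes in the disassembly.
with open('processing_func.pkl', 'rb') as f:
    pickletools.dis(f.read())

# Remap the removed module path to its current location while unpickling.
class RenamedModuleUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        if module == 'sklearn.preprocessing.label':
            module = 'sklearn.preprocessing'   # assumes the class still lives here
        return super().find_class(module, name)

with open('processing_func.pkl', 'rb') as f:
    processing_function = RenamedModuleUnpickler(f).load()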

Related

When would re-importing a module cause a different object to be added to sys.modules?

The importlib library provides the reload function, which can be used to re-import a module. I have often used this when importing a custom module in an interactive session (e.g. in an IPython shell or Jupyter notebook). I usually do the following to ensure the re-loaded object is actually the updated one.
import importlib
import mymodule
result = mymodule.fun()
...
mymodule = importlib.reload(mymodule)
I do this because the docs state
The return value is the module object (which can be different if re-importing causes a different object to be placed in sys.modules).
However, I'm wondering
Under what scenarios would re-importing a module "cause a different object to be placed in sys.modules"?
Is it necessary to assign the result of importlib.reload to the original object?
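One concrete scenario, as far as I know, is a module that replaces its own entry in sys.modules while its body runs (a trick some libraries use to install a wrapper or lazy-loading proxy). reload() returns whatever ends up in sys.modules afterwards, so the return value can differ from the object you passed in, and re-assigning is how you keep your local name in sync. Below is a self-contained sketch; the module name selfswap is made up, and the file is written to the current directory only to keep the example in one piece.
import importlib
import pathlib
import sys

# A throwaway module that swaps itself out of sys.modules as it executes.
pathlib.Path('selfswap.py').write_text(
    'import sys, types\n'
    '_proxy = types.ModuleType(__name__)\n'
    '_proxy.__dict__.update(globals())  # keep __spec__ etc. so reload still works\n'
    'sys.modules[__name__] = _proxy     # replace this module in sys.modules\n'
)
sys.path.insert(0, '.')

import selfswap                        # bound to the proxy that ended up in sys.modules

reloaded = importlib.reload(selfswap)  # re-runs selfswap.py, which installs a new proxy
print(reloaded is selfswap)            # False: reload returned the new sys.modules entry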

Save data with locally defined objects: df.to_pickle or %store won't work

I use to_pickle / read_pickle a lot to quickly back up my dataframes while I'm manipulating them. The problem is that some of the columns hold objects of a class that I defined myself in a module that I'm using, so I'm getting
AttributeError: Can't pickle local object 'get_Rc.<locals>.Rc'
where Rc is a class that I defined myself. Is there any other way I can dump my dataframes and re-read them? The %store magic command throws the exact same error.
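For what it's worth, the usual cause of this error (a sketch of the pattern, not the asker's actual code): pickle stores class instances by the importable name of their class, so a class created inside a function body can't be located again at load time. Moving the class to module level avoids it; third-party serializers such as dill or cloudpickle, which can serialize the class definition itself, are another option.
import pickle

def get_Rc():
    class Rc:          # defined inside a function: pickle can't import it by name
        pass
    return Rc()

# pickle.dumps(get_Rc())   # AttributeError: Can't pickle local object 'get_Rc.<locals>.Rc'

class RcTopLevel:          # module-level class: importable, so it pickles fine
    pass

def get_rc_top_level():
    return RcTopLevel()

data = pickle.dumps(get_rc_top_level())
print(type(pickle.loads(data)).__name__)   # RcTopLevel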

import hooks (custom module loaders) for pypy do not work

I'm able to successfully create import hooks to load files directly from memory in Python 2.7. The example I used was the accepted response to this question:
python:Import module from memory
However, when applying this code on PyPy, I get an import error. I have also tried other import hook examples that work with regular Python (CPython) but not with PyPy, such as this:
python load zip with modules from memory
Does anyone know why import hooks do not work in PyPy? Is there something I am missing?
The problem is that in both of the examples you point to, load_module() does not add the loaded module to sys.modules. Normally, it should do so (and then PyPy works like CPython).
If load_module() does not add the module to sys.modules, then every single import a statement will call load_module() again and return a new copy of the module. For example, in the example from python:Import module from memory:
import a as a1
import a as a2
print a1 is a2 # False!
a1.foo = "foo"
print a2.foo # AttributeError
This is documented in https://www.python.org/dev/peps/pep-0302/#id27. The load_module() method is responsible for doing more checks than these simple examples show. In particular, note this line (emphasis in the original):
Note that the module object must be in sys.modules before the loader executes the module code.
So the fact that PyPy behaves differently from CPython in this case comes down to example code that doesn't respect the documented protocol.
But anyway, my opinion is that it should be fixed. I've created an issue at https://bitbucket.org/pypy/pypy/issues/2382/sysmeta_path-not-working-like-cpythons.
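To make the point concrete, here is a minimal sketch of an in-memory importer that follows the rule quoted above, written against the legacy PEP 302 find_module/load_module API from the question's Python 2.7 context (modern Python 3 code would implement importlib.abc.Loader.exec_module instead). The in-memory source for module a is made up; the important part is that the module goes into sys.modules before its code runs, and is returned from the cache on re-import.
import imp
import sys

MODULES = {'a': 'foo = 42\n'}   # hypothetical in-memory sources

class MemoryImporter(object):
    def find_module(self, fullname, path=None):
        return self if fullname in MODULES else None

    def load_module(self, fullname):
        if fullname in sys.modules:       # re-import: return the cached module
            return sys.modules[fullname]
        mod = imp.new_module(fullname)
        mod.__loader__ = self
        sys.modules[fullname] = mod       # register BEFORE executing the code
        try:
            exec(MODULES[fullname], mod.__dict__)
        except Exception:
            del sys.modules[fullname]     # clean up on failure, per PEP 302
            raise
        return mod

sys.meta_path.insert(0, MemoryImporter())

import a as a1
import a as a2
print(a1 is a2)   # True: the second import is served from the sys.modules cache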

unpickle instance after refactoring module name

Previously I defined an ElectrodePositionsModel class in the module gselu.py in the package gselu, and pickled the ElectrodePositionsModel objects into some files.
Some time later it was decided to refactor the project, and the package name gselu was changed to ielu.
When I attempt to unpickle the old pickle files with pickle.load(), the process fails with the error 'module' object has no attribute 'ElectrodePositionsModel'. What I understand of the Unpickler's behavior is that this happens because the pickle thinks it has stored an instance of gselu.gselu.ElectrodePositionsModel and therefore tries to import this class from that module. When it doesn't exist, it gives up.
I think I am supposed to add something to the package's __init__.py to tell it where gselu.gselu.ElectrodePositionsModel now lives, but I can't get the pickle.load() function to give me any error message other than 'module' has no attribute 'ElectrodePositionsModel', and I can't figure out where I am supposed to provide the correct path to find it. The code that does the unpickling is in the same module file (gselu.py) as the ElectrodePositionsModel class.
When I load the pickle file in an ipython session and manually import ElectrodePositionsModel, it loads correctly.
How do I tell the pickler where to load this module?
I realise this question is old, but I just ran into a similar problem.
What I did to solve it was to take the old code and unpickle the data using that.
Then, instead of pickling the custom class instances directly, I pickled each instance's __dict__, which contained only plain Python data.
This data could then easily be imported in the new module by doing
a = NewNameOfCustomClass()
with open('olddata.p', 'rb') as f:
    a.__dict__ = pickle.load(f)
This method works if your custom class only has standard variables (such as builtins or numpy arrays, etc).
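Another commonly used workaround (not the __dict__ approach above, just a sketch): alias the old dotted path in sys.modules before unpickling, so the reference stored in the pickle (gselu.gselu.ElectrodePositionsModel) resolves against the renamed package. This assumes the renamed package ielu still contains a gselu module that defines ElectrodePositionsModel; the filename is a placeholder.
import pickle
import sys

import ielu.gselu as new_gselu               # the renamed package/module

sys.modules['gselu'] = sys.modules['ielu']   # old package name -> renamed package
sys.modules['gselu.gselu'] = new_gselu       # old module path  -> renamed module

with open('olddata.p', 'rb') as f:
    model = pickle.load(f)                   # now resolves ElectrodePositionsModel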

Python dynamically run functions with arguments

I have a bunch of modules to import and run. I have dynamically imported the modules using Dynamic module import in Python. This is in the main code. Once imported, I'm trying to run the functions in the modules.
All of the modules have this structure:
# function foo
def run(a, b):
    c = a + b
    return c
foo has been imported, and I need to say something like bar = foo.run(a, b) dynamically.
Based on this example: How to call Python functions dynamically, I have already tried the following:
i='foo'
bar = getattr(sys.modules[__name__], i+'.run()')(a,b)
traceback AttributeError: 'module' object has no attribute 'foo.run()'
I'm confused about the AttributeError; the "calling functions dynamically" example is clearly calling functions.
If you have imported foo already, but don't have a reference to it, use:
sys.modules['foo'].run(a, b)
the_module.run(a, b)
Regardless of what magic made the module come into existence, it's an ordinary module object with ordinary attributes, and you know that the function is called run.
If you always know you'll use module foo, you're done.
You may also need to find the module object dynamically, because the module to choose varies.
If you imported the module properly, under the name you use to refer to it (e.g. foo) rather than some other name, you can also use sys.modules[mod_name].
Otherwise, you should probably have a dictionary of modules so that you can say, the_module = modules[mod_name].
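Putting the answer's pieces together, a short sketch (the module names foo and bar are placeholders, and each module is assumed to define run(a, b)): import the modules dynamically, keep them in a dict, and look up run on each module object.
import importlib

module_names = ['foo', 'bar']                 # hypothetical module names
modules = {name: importlib.import_module(name) for name in module_names}

a, b = 1, 2
results = {name: mod.run(a, b) for name, mod in modules.items()}

# Equivalent lookups if you prefer to go by name:
#   sys.modules['foo'].run(a, b)              # module already imported as 'foo'
#   getattr(modules['foo'], 'run')(a, b)      # fetch the function attribute dynamically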
