Python pickle instance variables - python

I am doing some calculations on an instance variable, and once that is done I want to pickle the class instance so that I don't have to do the calculations again. Here is an example:
import cPickle as pickle
class Test(object):
    def __init__(self, a, b):
        self.a = a
        self.b = b
        self.c = None
    def compute(self, x):
        print 'calculating c...'
        self.c = x * 2
test = Test(10, 'hello')
test.compute(6)
# I have computed c and I want to store it, so I don't have to recompute it again:
pickle.dump(test, open('test_file.pkl', 'wb'))
After test.compute(6) I can check to see what test.__dict__ is:
>>> test.__dict__
{'a': 10, 'c': 12, 'b': 'hello'}
I thought that was what would get pickled. However, when I load the class instance:
import cPickle as pickle
from pickle_class_object import Test
t2 = pickle.load(open('test_file.pkl', 'rb'))
I see this in the shell:
calculating c...
This suggests that c was not pickled and is being computed all over again.
Is there a way to pickle test the way I want to, so that I don't have to compute c again? I see that I could just pickle test.__dict__, but I am wondering if there is a better solution. Also, my understanding of what is going on here is weak, so any comment about what is happening would be great. I've read about __getstate__ and __setstate__, but I don't see how to apply them here.

You are importing the pickle_class_object module again, and Python runs all code in that module.
Your top-level module code includes a call to .compute(); that is what is being called.
You may want to move the code that creates the pickle out of the module, or move it into an if __name__ == '__main__': guarded section:
if __name__ == '__main__':
    test = Test(10, 'hello')
    test.compute(6)
    pickle.dump(test, open('test_file.pkl', 'wb'))
Only when a Python file is run as the main script is __name__ set to '__main__'; when the file is imported as a module, __name__ is set to the module name instead and the if branch does not run.
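A minimal sketch of that behaviour, written for Python 3 rather than the Python 2 of the question. The module name guard_demo_module and its contents are made up for illustration; the module is written to a temporary directory and then imported, so only the unguarded top-level statement should run:

```python
import importlib
import sys
import tempfile
import textwrap
from pathlib import Path

# Hypothetical module: one unguarded statement and one guarded statement.
MODULE_SOURCE = textwrap.dedent('''
    events = ["top-level code ran"]
    if __name__ == "__main__":
        events.append("guarded code ran")
''')

with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "guard_demo_module.py").write_text(MODULE_SOURCE)
    sys.path.insert(0, tmp)
    try:
        importlib.invalidate_caches()
        guard_demo = importlib.import_module("guard_demo_module")
    finally:
        sys.path.remove(tmp)

# On import, __name__ is the module name, so the guarded branch is skipped.
print(guard_demo.__name__)
print(guard_demo.events)
```

Running the same file directly as a script would set __name__ to '__main__' and execute both statements.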

Pickling works as you expect it to. The problem is that when you run the new script, you import the module that contains the class Test. That entire module is run, including the part where you create test.
The typical way to handle this is to protect that code in an if __name__ == "__main__": block.
class Test(object):
    def __init__(self, a, b):
        self.a = a
        self.b = b
        self.c = None
    def compute(self, x):
        print 'calculating c...'
        self.c = x * 2
if __name__ == "__main__":
    import cPickle as pickle
    test = Test(10, 'hello')
    test.compute(6)
    # I have computed c and I want to store it, so I don't have to recompute it again:
    pickle.dump(test, open('test_file.pkl', 'wb'))

That isn't what's happening. You import a Python module that has code at the top level, and that code executes when the module is imported. You can see that your code works as you intended:
import cPickle as pickle
class Test(object):
    def __init__(self, a, b):
        self.a = a
        self.b = b
        self.c = None
    def compute(self, x):
        print 'calculating c...'
        self.c = x * 2
test = Test(10, 'hello')
test.compute(6)
pickle.dump(test, open('test_file.pkl', 'wb'))
t2 = pickle.load(open('test_file.pkl', 'rb'))
print t2.c
--output:--
calculating c...
12
If your code worked as you describe, then you would see "calculating c..." twice.
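The same point can be checked with a call counter instead of a print. This is a Python 3 sketch, not the asker's exact setup: the module name pickle_demo_module and the compute_calls counter are assumptions added for illustration, and the class lives in a small temporary module so that pickle can find it by name.

```python
import importlib
import pickle
import sys
import tempfile
import textwrap
from pathlib import Path

# Hypothetical module holding only the class, with no top-level calls.
MODULE_SOURCE = textwrap.dedent('''
    class Test:
        compute_calls = 0  # counts how often compute() runs

        def __init__(self, a, b):
            self.a = a
            self.b = b
            self.c = None

        def compute(self, x):
            type(self).compute_calls += 1
            self.c = x * 2
''')

with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "pickle_demo_module.py").write_text(MODULE_SOURCE)
    sys.path.insert(0, tmp)
    try:
        importlib.invalidate_caches()
        demo = importlib.import_module("pickle_demo_module")
    finally:
        sys.path.remove(tmp)

test = demo.Test(10, "hello")
test.compute(6)

# Round-trip through pickle: the instance __dict__ is restored directly,
# without running __init__ or compute() again.
restored = pickle.loads(pickle.dumps(test))
print(restored.__dict__)
print(demo.Test.compute_calls)
```

The counter stays at one after loading, which is consistent with the stored value of c coming from the pickle rather than from a second call to compute().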

Related

How to mock imwrite in a loop

I am trying to use the Python unittest library to test the following code.
from cv2 import imwrite
import h5py
import numpy as np
class Myclass:
    def __init__(self):
        self.data = []
    def get_data(self):
        return self.data
    def load_data(self, path):
        count = 10
        for i in range(count):
            mat_file = path + f'{i}.mat'
            with h5py.File(mat_file, 'r') as fin:
                data = np.array(fin['data'])
            for j in range(len(data)):
                filename = path + f'{i}_{j}.jpg'
                imwrite(filename, data)
            self.data = data
I tried the following, but I get an AttributeError saying that Myclass has no attribute 'imwrite'. I also do not know how to mock the nested loops that write images with imwrite.
from mock import MagicMock, patch
def test():
    m = MagicMock()
    m.__enter__.return_value = data
    with patch("h5py.File", return_value=m):
        with patch(Myclass.imwrite) as mock_imwrite:
            testclass = Myclass()
            testclass.load_data(path='test')
            assert testclass.get_data() == data
I hope someone can help me out. Any help is very much appreciated.
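The issue here is that your class Myclass does not have an attribute imwrite. Only the module (the file the class is defined in) knows about this function.
patch expects the target as a dotted string. Because your module does from cv2 import imwrite, the name that load_data looks up lives in that module, so patching 'cv2.imwrite' would not replace it. Patch the name in the module that defines Myclass instead (mymodule below is a placeholder for that module's real name):
(untested code)
with patch('mymodule.imwrite') as mock_imwrite:
    testclass = Myclass()
    ...
The as mock_imwrite part is only needed if you run an assert on the mock. Otherwise, you may skip it.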
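One detail worth checking is where the patched name is looked up. When a module does from cv2 import imwrite, it holds its own reference to the function, so the patch target has to be the name inside that module. The sketch below shows this without needing cv2 or h5py: fake_imaging, fake_consumer, and save_all are made-up stand-ins built in memory, not real packages.

```python
import sys
import types
from unittest.mock import patch

# Stand-in for the library that provides imwrite (hypothetical module).
imaging = types.ModuleType("fake_imaging")

def imwrite(filename, data):
    raise RuntimeError("the real imwrite must not run in a test")

imaging.imwrite = imwrite
sys.modules["fake_imaging"] = imaging

# Stand-in for the module under test, which imports the name directly.
consumer = types.ModuleType("fake_consumer")
exec(
    "from fake_imaging import imwrite\n"
    "def save_all(prefix, frames):\n"
    "    for j, frame in enumerate(frames):\n"
    "        imwrite(f'{prefix}_{j}.jpg', frame)\n",
    consumer.__dict__,
)
sys.modules["fake_consumer"] = consumer

# Patching the name where it is looked up intercepts every call in the loop.
with patch("fake_consumer.imwrite") as mock_imwrite:
    consumer.save_all("test", [1, 2, 3])
call_count = mock_imwrite.call_count

# Patching only the source module leaves the consumer's reference untouched.
try:
    with patch("fake_imaging.imwrite"):
        consumer.save_all("test", [1])
    source_patch_intercepted = True
except RuntimeError:
    source_patch_intercepted = False

print(call_count, source_patch_intercepted)
```

A single mock covers all iterations of a nested loop; mock_imwrite.call_count or mock_imwrite.call_args_list can then be asserted on.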

Looking for a short way to generate subclass (that would be created into a new file), with parent class' methods included in it (Python)

I will use the following example to explain.
An existing Python file (a.py) contains one class:
class A:
    def method1(self, par1, par2='e'):
        # some code here
        pass
    def method2(self, parA):
        # some code here
        pass
    def method3(self, a, b, c):
        # lots of code here
        pass
    def anothermethod(self):
        pass
if __name__ == '__main__':
    A().anothermethod()
Now there is a need to create another file (b.py) containing a subclass (class B) of class A.
The subclass should list all the methods inherited from the parent class, but without their implementation. The result might look like:
class B(A):
    def method1(self, par1, par2='e'):
        # empty here; ready to override
        pass
    def method2(self, parA):
        # empty here; ready to override
        pass
    def method3(self, a, b, c):
        # empty here; ready to override
        pass
    def anothermethod(self):
        # empty here; ready to override
        pass
if __name__ == '__main__':
    B().anothermethod()
Having described the example, the question is: how could one generate the skeleton-like file shown last, so that after generating it you can open it and start filling in the specific implementation right away?
There must be a shorter way, a 1-2 line solution. Maybe this is already solvable with existing functionality in the modules that ship with Python (Python 3)?
Edit (2018 Mar 14). Thank you https://stackoverflow.com/a/49152537/4958287 (though I was looking for a short, already existing solution here). I will have to settle for a longer solution for now. A rough version is included here in case it is helpful to someone else:
import inspect
from a import A
def construct_skeleton_subclass_from_parent(subcl_name, parent_cl_obj):
    """
    subcl_name : str
        Name for subclass and
        file to be generated.
    parent_cl_obj : obj (of any class to create subclass for)
        Object of parent class.
    """
    lines = []
    subcl_name = subcl_name.capitalize()
    parent_cl_module_name = parent_cl_obj.__class__.__module__
    parent_cl_name = parent_cl_obj.__class__.__name__
    lines.append('from {} import {}'.format(parent_cl_module_name, parent_cl_name))
    lines.append('')
    lines.append('class {}({}):'.format(subcl_name, parent_cl_name))
    for name, method in inspect.getmembers(parent_cl_obj, predicate=inspect.ismethod):
        # The signature of a bound method does not include self.
        args = inspect.signature(method)
        args_others = str(args).strip('()').strip()
        if len(args_others) == 0:
            lines.append('    def {}(self):'.format(name))
        else:
            lines.append('    def {}(self, {}):'.format(name, str(args).strip('()')))
        lines.append('        pass')
        lines.append('')
    #...
    #lines.append('if __name__ == \'__main__\':')
    #lines.append('    ' + subcl_name + '().anothermethod()')
    #...
    with open(subcl_name.lower() + '.py', 'w') as f:
        for c in lines:
            f.write(c + '\n')
a_obj = A()
construct_skeleton_subclass_from_parent('B', a_obj)
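Get the list of methods and each of their signatures using the inspect module. In Python 3, functions looked up on the class itself are plain functions, so the predicate is inspect.isfunction; the string form of a signature already includes the parentheses and self:
import a
import inspect
for name, method in inspect.getmembers(a.A, predicate=inspect.isfunction):
    args = inspect.signature(method)
    print("    def {}{}:".format(name, args))
    print("        pass")
    print()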
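A self-contained sketch of the inspect approach for Python 3. The class A here is a cut-down stand-in for the one in a.py, and skeleton_source is a hypothetical helper name; it returns the generated text instead of writing a file, so the result can be checked before saving.

```python
import inspect


class A:  # stand-in for the parent class from a.py
    def method1(self, par1, par2='e'):
        return par1

    def method2(self, parA):
        return parA

    def anothermethod(self):
        return None


def skeleton_source(parent, subclass_name):
    """Return source text for a subclass whose methods are empty stubs."""
    lines = ["class {}({}):".format(subclass_name, parent.__name__)]
    # Functions looked up on the class itself are plain functions in Python 3.
    for name, func in inspect.getmembers(parent, predicate=inspect.isfunction):
        # str() of a Signature already includes the parentheses and self.
        lines.append("    def {}{}:".format(name, inspect.signature(func)))
        lines.append("        pass")
    return "\n".join(lines) + "\n"


source = skeleton_source(A, "B")
print(source)
```

The returned string can be written to b.py together with a suitable import line for the parent class.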

Some trouble with python scope and objects

class Thing:
    def __init__ (self, a, b, c,):
        self.a = a
        self.b = b
        self.c = c
stuff = Thing("apples","butter","charlie")
otherThing = stuff
def doTheThings():
    if(otherThing.a == "apples"):
        print("done")
doTheThings()
I'm having a problem with the second line of the "doTheThings" function, and I have no idea what's wrong. Any help would be much appreciated.
I think adding the following might solve your problem:
if __name__ == "__main__":
    doTheThings()
because this is the entry point of the Python code.

Using multiprocessing with a decorated function results in a PicklingError

I am trying to write a convenience function based on the multiprocessing library, that takes any function and argument, and runs that function using multiple processes. I have the following file "MultiProcFunctions.py" that I am importing:
import multiprocessing
from multiprocessing import Manager
def MultiProcDecorator(f,*args):
    """
    Takes a function f, and formats it so that results are saved to a shared dict
    """
    def g(procnum,return_dict,*args):
        result = f(*args)
        return_dict[procnum] = result
    return g
def MultiProcFunction(f,n_procs,*args):
    """
    Takes a function f, and runs it in n_procs with given args
    """
    manager = Manager()
    return_dict = manager.dict()
    jobs = []
    for i in range(n_procs):
        p = multiprocessing.Process( target = f, args = (i,return_dict) + args )
        jobs.append(p)
        p.start()
    for proc in jobs:
        proc.join()
    return dict(return_dict)
Here is the code I run:
from MultiProcFunctions import *
def sq(x):
    return [i**2 for i in x]
g = MultiProcDecorator(sq)
if __name__ == '__main__':
    result = MultiProcFunction(g,2,[1,2,3])
I get the following error: PicklingError: Can't pickle <function g at 0x01BD83B0>: it's not found as MultiProcFunctions.g
If I use the following definition for g instead, everything is fine:
def g(procnum,return_dict,x):
    result = [i**2 for i in x]
    return_dict[procnum] = result
Why are the two definitions of g different, and is there anything I can do to get the original g definition to "work"?
dano's trick seems to work only in Python 2. When I try it in Python 3, I get the following error:
pickle.PicklingError: Can't pickle <function serialize at 0x7f7a1ac1fd08>: it's not the same object as __main__.orig_fn
I solved this issue by "decorating" the function from the worker's initializer:
from functools import wraps
import multiprocessing as mp
import sys
def worker_init(fn, *args):
    @wraps(fn)
    def wrapper(x):
        # wrapper logic
        pass
    # Rebind the original name in its module to the wrapper inside each worker.
    setattr(sys.modules[fn.__module__], fn.__name__, wrapper)
pool = mp.Pool(initializer=worker_init, initargs=[orig_fn, *args])
# ...
This is happening because g is defined as a nested function in MultiProcFunctions, which means it is not importable from the top level of that module, so it won't pickle properly. However, we are clearly binding g at the top level of the __main__ module when we do this:
g = MultiProcDecorator(sq)
So, it really should be picklable. We can make it work by explicitly setting the __module__ of g to be "__main__":
g = MultiProcDecorator(sq)
g.__module__ = "__main__" # Fix the __module__
This will allow the pickling process to work, since it will look for the definition of g in __main__, where it is defined at the top-level, rather than MultiProcFunctions, where it is only defined in a nested scope.
Edit:
Note that you could also make the change in the decorator itself:
def MultiProcDecorator(f,*args):
    """
    Takes a function f, and formats it so that results are saved to a shared dict
    """
    def g(procnum,return_dict,*args):
        result = f(*args)
        return_dict[procnum] = result
    g.__module__ = "__main__"
    return g
This probably makes more sense for you, since this decorator is strictly meant to be used for multiprocessing purposes.
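The underlying rule is that pickle stores a function as a reference (its module plus its name) and not as code. A short Python 3 sketch of the difference follows; make_wrapper is a hypothetical helper standing in for the decorator, and math.sqrt is used only because it is importable by name. In Python 3 the lookup uses the qualified name, which may be why setting __module__ alone was reported above as not sufficient there.

```python
import math
import pickle


def make_wrapper(f):
    """Hypothetical decorator-style helper that returns a nested function."""
    def wrapper(*args):
        return f(*args)
    return wrapper


# A function that is importable by name pickles as a reference and
# round-trips to the very same object.
same_object = pickle.loads(pickle.dumps(math.sqrt)) is math.sqrt

# The nested function's qualified name contains "<locals>", so pickle
# cannot find it by name. The exception type differs between versions.
nested = make_wrapper(math.sqrt)
try:
    pickle.dumps(nested)
    nested_pickled = True
except (pickle.PicklingError, AttributeError) as exc:
    nested_pickled = False
    print(type(exc).__name__, exc)

print(nested.__qualname__)
```

This is why a wrapper created inside another function has to be reachable under a top-level name before multiprocessing can send it to a worker.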

Python classes and modules

I am teaching myself Python and hit a roadblock with classes and modules.
The code below is something that you would probably never write, but I would like to just understand my error.
import random
class GetRandom:
    def __init__(self):
        self.data = ""
    def ranNumber():
        return random.random()
b = GetRandom()
bnum = b.ranNumber
print bnum
The output I am getting is:
<bound method GetRandom.ranNumber of <__main__.GetRandom instance at 0x7fe87818df38>>
I had expected a random number between 0 and 1. What am I doing wrong?
Thanks
There are two problems here:
You forgot to actually invoke GetRandom.ranNumber. Add () after it to do this:
bnum = b.ranNumber()
You need to make GetRandom.ranNumber accept the self argument that is passed implicitly when you invoke the method:
def ranNumber(self):
    return random.random()
Once you address these issues, the code works as expected:
>>> import random
>>> class GetRandom:
...     def __init__(self):
...         self.data = ""
...     def ranNumber(self):
...         return random.random()
...
>>> b = GetRandom()
>>> bnum = b.ranNumber()
>>> print bnum
0.819458844177
>>>
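The difference between referring to a method and calling it can be seen side by side in this short Python 3 sketch (the question itself uses Python 2 print syntax):

```python
import random


class GetRandom:
    def ranNumber(self):
        return random.random()


b = GetRandom()
method = b.ranNumber    # a bound method object; nothing has been called yet
value = b.ranNumber()   # the parentheses perform the call

print(callable(method), type(value).__name__)
```

Printing method instead of value gives the bound method representation seen in the question.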
