Can't Pickle Wrapper Class Object - python

All, I'm trying to parallelize some code using multiprocessing, and I've stripped my code down to the point where commenting out this line in my main class
self.obs = Observer(self.guess)
makes the system run error-free. If I write
obs = Observer(self.guess)
it works, but if I write self.obs, I get
TypeError: can't pickle Observer objects
Here is the entire class I'm trying to import. It is just a wrapper for ephem.Observer, which gave the same error when imported directly.
import numpy as np
from req import SETTINGS
from req.helpers import load_db, pack_into_vector, create_observer

class Observer:
    def __init__(self, beta=np.zeros((2,))):
        self.observer = create_observer(beta)

    def __getstate__(self):
        return {'observer': self.observer}
The error occurs on p.start() where
p = Process(target=selector,args=(first_guess, recording_queue, guess_queue))

I actually solved this. I guess the issue was with all of the self.___ methods/attributes of my selector class. I created an additional class, selector_wrapper, with an init method that created and ran the selector class. This worked perfectly.
In summary, creating a wrapper class with no methods other than init, and with no attributes, fixed the problem!
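An alternative to the wrapper-class workaround is to make the class itself pickle-friendly by pairing `__getstate__` with a `__setstate__` that rebuilds the unpicklable member on load. A minimal sketch, with `Handle`/`create_handle` as hypothetical stand-ins for `ephem.Observer`/`create_observer` (which aren't shown in the question):

```python
import pickle

class Handle:
    """Stand-in for an unpicklable resource (e.g. ephem.Observer)."""
    def __init__(self, beta):
        self.beta = beta

def create_handle(beta):
    # hypothetical stand-in for create_observer
    return Handle(beta)

class Observer:
    def __init__(self, beta=0.0):
        self.beta = beta
        self.observer = create_handle(beta)

    def __getstate__(self):
        # drop the unpicklable resource; keep only what's needed to rebuild it
        return {'beta': self.beta}

    def __setstate__(self, state):
        # rebuild the resource when unpickling
        self.beta = state['beta']
        self.observer = create_handle(self.beta)

obs = Observer(1.5)
restored = pickle.loads(pickle.dumps(obs))
print(restored.beta)                      # 1.5
print(type(restored.observer).__name__)   # Handle
```

With this in place, instances can cross a multiprocessing boundary even though the wrapped resource itself cannot be pickled.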


Register classes in different files to a Class factory

I am trying to register classes that live in different files with a factory class. The factory class has a dictionary called "registry" that maps a user-defined name to the registering class. My issue is that if the factory class and the registering classes are in the same .py file, everything works as expected, but the moment I move the registering classes into their own .py files and import the factory class to apply the register decorator (as described in the question & article below), the "registry" dictionary stays empty, meaning the classes never get registered.
The way I am registering these classes is via a decorator. My code looks very much like what we see here:
Registering classes to factory with classes in different files (my question is a duplicate of this, but bumping this question to the top)
https://medium.com/@geoffreykoh/implementing-the-factory-pattern-via-dynamic-registry-and-python-decorators-479fc1537bbe
I would like to know:
Why does keeping them in the same file work, while splitting them out doesn't?
How can I make the separate-file approach work?
Hopefully the code samples in the articles clarify what I am trying to do and struggling with.
I'm currently exploring a similar problem, and I think I may have found a solution. It is a bit of a 'hack' though, so take it with a grain of salt.
Why does keeping them in the same file work, while splitting them out doesn't?
In order to make your classes self-register in the factory while keeping their definition in single .py files, we have to somehow force the loading of the classes in the .py files.
How can I make the separate file approach work?
In my case, I came across this problem when trying to implement a 'Simple Factory' with self-registering subclasses, to avoid having to modify the typical 'if/else' idiom in the factory's get() method.
I'll use a simple example, starting with the decorator method you've mentioned.
Example with decorators
Let's say we have a ShoeFactory as shown below, in which we register different 'classes' of shoes:
# file shoe.py
class ShoeFactory:
    _shoe_classes = {}

    @classmethod
    def get(cls, shoe_type: str):
        try:
            return cls._shoe_classes[shoe_type]()
        except KeyError:
            raise ValueError(f"unknown product type : {shoe_type}")

    @classmethod
    def register(cls, shoe_type: str):
        def inner_wrapper(wrapped_class):
            cls._shoe_classes[shoe_type] = wrapped_class
            return wrapped_class
        return inner_wrapper
Examples of shoe classes:
# file sandal.py
from shoe import ShoeFactory

@ShoeFactory.register('Sandal')
class Sandal:
    def __init__(self):
        print("i'm a sandal")

# file croc.py
from shoe import ShoeFactory

@ShoeFactory.register('Croc')
class Croc:
    def __init__(self):
        print("i'm a croc")
In order to make Sandal self-register in the ShoeFactory while keeping its definition in its own .py file, we have to somehow force the loading of the Sandal class from that file.
I've done this in 3 steps:
Keeping all class implementations in a specific folder, e.g., structuring the files as follows:
.
├─ shoe.py        # file with the ShoeFactory class
└─ shoes/
   ├─ __init__.py
   ├─ croc.py
   └─ sandal.py
Adding the following statement to the end of the shoe.py file, which will take care of loading and registering each individual class:
from shoes import *
Add a piece of code like the snippet below to the __init__.py within the shoes/ folder, to dynamically load all classes [1]:
from inspect import isclass
from pkgutil import iter_modules
from pathlib import Path
from importlib import import_module

# iterate through the modules in the current package
package_dir = Path(__file__).resolve().parent
for (_, module_name, _) in iter_modules([str(package_dir)]):
    # import the module and iterate through its attributes
    module = import_module(f"{__name__}.{module_name}")
    for attribute_name in dir(module):
        attribute = getattr(module, attribute_name)
        if isclass(attribute):
            # add the class to this package's variables
            globals()[attribute_name] = attribute
Following this approach, I get the results below when running a small test:
# file shoe_test.py
from shoe import ShoeFactory

if __name__ == "__main__":
    croc = ShoeFactory.get('Croc')
    sandal = ShoeFactory.get('Sandal')

$ python shoe_test.py
i'm a croc
i'm a sandal
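The reason the single-file version "just works" is that the register decorator runs as a side effect of executing the class definition; when everything lives in one module, that execution happens automatically on import. The mechanism can be checked with a single-file sketch:

```python
class ShoeFactory:
    _shoe_classes = {}

    @classmethod
    def get(cls, shoe_type):
        try:
            return cls._shoe_classes[shoe_type]()
        except KeyError:
            raise ValueError("unknown product type : %s" % shoe_type)

    @classmethod
    def register(cls, shoe_type):
        # returns a decorator that records the class under shoe_type
        def inner_wrapper(wrapped_class):
            cls._shoe_classes[shoe_type] = wrapped_class
            return wrapped_class
        return inner_wrapper

@ShoeFactory.register('Sandal')   # runs the moment this definition executes
class Sandal:
    def name(self):
        return "sandal"

print(ShoeFactory.get('Sandal').name())   # sandal
```

Split the classes into separate files and nothing executes those definitions until the modules are imported, which is exactly what the __init__.py trick forces.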
Example with __init_subclass__()
I've personally followed a slightly different approach for my simple factory design, which does not use decorators.
I've defined a RegistrableShoe base class, and then used an __init_subclass__() approach to do the self-registering ([1] item 49, [2]).
The idea is that when Python finds the definition of a subclass of RegistrableShoe, the __init_subclass__() method is run, which in turn registers the subclass with the factory.
This approach requires the following changes when compared to the example above:
Added a RegistrableShoe base class to the shoe.py file, and re-factored ShoeFactory a bit:
# file shoe.py
class RegistrableShoe:
    def __init_subclass__(cls, shoe_type: str):
        ShoeFactory.register(shoe_type, shoe_class=cls)

class ShoeFactory:
    _shoe_classes = {}

    @classmethod
    def get(cls, shoe_type: str):
        try:
            return cls._shoe_classes[shoe_type]()
        except KeyError:
            raise ValueError(f"unknown product type : {shoe_type}")

    @classmethod
    def register(cls, shoe_type: str, shoe_class: RegistrableShoe):
        cls._shoe_classes[shoe_type] = shoe_class

from shoes import *
Changed the concrete shoe classes to derive from the RegistrableShoe base class and pass a shoe_type parameter:
# file croc.py
from shoe import RegistrableShoe

class Croc(RegistrableShoe, shoe_type='Croc'):
    def __init__(self):
        print("i'm a croc")

# file sandal.py
from shoe import RegistrableShoe

class Sandal(RegistrableShoe, shoe_type='Sandal'):
    def __init__(self):
        print("i'm a sandal")
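As with the decorator version, registration is a side effect of executing the subclass definition, so the shoes package still has to be imported somewhere. The mechanism itself can be verified in a single file:

```python
class ShoeFactory:
    _shoe_classes = {}

    @classmethod
    def get(cls, shoe_type):
        return cls._shoe_classes[shoe_type]()

    @classmethod
    def register(cls, shoe_type, shoe_class):
        cls._shoe_classes[shoe_type] = shoe_class

class RegistrableShoe:
    # called by Python whenever a subclass definition is executed;
    # the keyword argument comes from the class statement itself
    def __init_subclass__(cls, shoe_type, **kwargs):
        super().__init_subclass__(**kwargs)
        ShoeFactory.register(shoe_type, shoe_class=cls)

class Croc(RegistrableShoe, shoe_type='Croc'):
    def name(self):
        return "croc"

print(type(ShoeFactory.get('Croc')).__name__)   # Croc
```

Note that __init_subclass__ requires Python 3.6+; forwarding `**kwargs` to `super().__init_subclass__()` keeps further subclassing with extra keyword arguments working.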

What is the best way to pass arguments from a locust user to taskset parameters, where the tasksets have been separated to different files?

entry_point.py
from other_file import UserBehaviour

class ApiUser(HttpUser):
    tasks = [UserBehaviour]

    def on_start(self):
        # log in and return session id and cookie
        # example: self.foo = "foo"

other_file.py
from entry_point import ApiUser

class UserBehaviour(TaskSet):
    @task
    def do_something(self, session_id, session_cookie):
        # use session id and session cookie from user instance running the taskset
        # example: print(self.ApiUser.foo)
NOTE: Going through the documentation, I did find that "the User instance can be accessed from within a TaskSet instance through the TaskSet.user", however all my attempts to import the user into the taskset file led to a "cannot import name 'ApiUser' from 'entry_point'" error. If instead of from entry_point import ApiUser I do from entry_point import *, then I get a name 'ApiUser' is not defined error.
Thank you very much @Cyberwiz for putting me on the right track. I've finally managed to figure out what I was doing wrong... which, as it turns out, was a couple of things.
Firstly, importing ApiUser in other_file.py was incorrect for two reasons: 1) it creates a cyclical dependency, and 2) even if it would eventually work it would import the ApiUser class, not the instance of the ApiUser class.
Secondly, I was previously getting a module locust.user has no attribute {name} error, and that was because my code looked like this:
class UserBehaviour(TaskSet):
    # do something with user.foo
Having figured this out, I honestly have no idea why I thought the above would work. I've changed my code to reflect the example below and everything now works like a charm:
class UserBehaviour(TaskSet):
    @task
    def do_something(self):
        # do something with self.user.foo
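Stripped of the Locust specifics, the working pattern is simply: the framework hands each TaskSet a reference to the User that runs it, and tasks read shared state through self.user. A plain-Python sketch of that wiring (these classes are simplified stand-ins, not the real Locust API):

```python
class TaskSet:
    # mimics what Locust provides: a TaskSet holds a reference to its User
    def __init__(self, user):
        self.user = user

class UserBehaviour(TaskSet):
    def do_something(self):
        # read state that the User set up in on_start
        return self.user.foo

class ApiUser:
    def __init__(self):
        self.on_start()
        self.behaviour = UserBehaviour(self)

    def on_start(self):
        self.foo = "foo"   # e.g. session id / cookie from logging in

print(ApiUser().behaviour.do_something())   # foo
```

The key point is that the TaskSet never imports the User class; it only touches the instance it was given, which is what avoids the circular import.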

class declaration in exec inits class, but functions don't work

I am going to attach two blocks of code: the first is the main code that is run, the second is the testClass file containing a sample class for testing purposes. To understand what's going on, it's probably easiest to run the code yourself. When I call sC.cls.print2(), it says that the self parameter is unfulfilled. Normally when working with classes, self (in this case) would be sC.cls and you wouldn't have to pass it as a parameter. Any advice is greatly appreciated on why this is occurring. I think it's something to do with exec's scope, but even if I run this function inside exec it gives the same error, and I can't figure out a way around it. If you'd like any more info, please just ask!
import testClass

def main():
    inst = testClass.myClass()
    classInfo = str(type(inst)).split()[1].split("'")[1].split('.')
    print(classInfo)

    class StoreClass:
        def __init__(self):
            pass

    exec('from {} import {}'.format(classInfo[0], classInfo[1]))
    sC = StoreClass()
    exec('sC.cls = {}'.format(classInfo[1]))
    print(sC.cls)
    sC.cls.print2()

if __name__ == '__main__':
    main()
class myClass:
    def printSomething(self):
        print('hello')

    def print2(self):
        print('hi')
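One detail worth noting about the error described above: `sC.cls = myClass` stores the class object itself, not an instance of it, and calling a method through the class requires passing self explicitly. A small sketch of that distinction, independent of exec (class names here are illustrative):

```python
class MyClass:
    def print2(self):
        return 'hi'

class StoreClass:
    pass

sC = StoreClass()

# storing the *class*: print2 is called with no instance bound to self
sC.cls = MyClass
try:
    sC.cls.print2()
except TypeError as e:
    print('class stored ->', type(e).__name__)   # class stored -> TypeError

# storing an *instance*: self is bound automatically, as usual
sC.cls = MyClass()
print(sC.cls.print2())                           # hi
```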

Having the class handle pickle

I am changing some code to spin up VMs in EC2 instead of OpenStack. Main starts a thread per VM, and then various modules perform tasks on these VMs. Each thread controls its own VM. So, instead of either having to add parameters to all of the downstream modules to look up information, or having to change all of the code to unpickle the class instance that created the VM, I am hoping that I can have the class itself decide whether to start a new VM or return the existing pickle. That way the majority of the code won't need to be altered.
This is the general idea, and closest I have gotten to getting it to work:
import os
import sys
import pickle

if sys.version_info >= (2, 7):
    from threading import current_thread
else:
    from threading import currentThread as current_thread

class testA:
    def __init__(self, var="Foo"):
        self.class_pickle_file = "%s.p" % current_thread().ident
        if os.path.isfile(self.class_pickle_file):
            self.load_self()
        else:
            self.var = var
            pickle.dump(self, open(self.class_pickle_file, "wb"))

    def test_method(self):
        print self.var

    def load_self(self):
        return pickle.load(open(self.class_pickle_file, "rb"))

x = testA("Bar")
y = testA()
y.test_method()
But that results in: NameError: global name 'var' is not defined
But, If I do y = pickle.load(open("140355004004096.p", "rb")) it works just fine. So the data IS getting in there by storing self inside the class, it's a problem of getting the class to return the pickle instead of itself...
Any ideas? Thanks.
It looks to me like you create a file named by the current thread's ident, then you instantiate another TestA object using the same thread (!!same ident!!), so it checks for a pickle file (and finds it, that's bad), then self.var never gets set.
In test_method, you check for a variable that was never set.
Run each item in its own thread to get different idents, or ensure you set self.var no matter what.
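One way to "ensure you set self.var no matter what" is to have load_self copy the unpickled state onto the current instance instead of returning the loaded object and discarding it. A sketch of that fix (in Python 3 syntax, using a temp-file path rather than the thread ident so the demo is deterministic):

```python
import os
import pickle
import tempfile

PICKLE_PATH = os.path.join(tempfile.gettempdir(), "testa_demo.p")
if os.path.isfile(PICKLE_PATH):
    os.remove(PICKLE_PATH)   # start clean so the demo is repeatable

class TestA:
    def __init__(self, var="Foo"):
        self.class_pickle_file = PICKLE_PATH
        if os.path.isfile(self.class_pickle_file):
            self.load_self()
        else:
            self.var = var
            with open(self.class_pickle_file, "wb") as f:
                pickle.dump(self, f)

    def test_method(self):
        return self.var

    def load_self(self):
        with open(self.class_pickle_file, "rb") as f:
            saved = pickle.load(f)
        # copy the saved attributes onto *this* instance,
        # instead of returning the loaded object and dropping it
        self.__dict__.update(saved.__dict__)

x = TestA("Bar")          # no pickle yet: saves itself with var="Bar"
y = TestA()               # pickle exists: restores var from it
print(y.test_method())    # Bar
os.remove(PICKLE_PATH)
```

This keeps the "the class decides whether to load or create" behaviour without any caller having to know about the pickle file.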

Reference inherited class functions

I am inheriting from both threading.Thread and bdb.Bdb. Thread requires a run function for the start function to call, and I need to use the Bdb.run function. How do I reference Bdb's run function, since I can't do it with self.run? I tried super, but I'm apparently not using it right; I get TypeError: must be type, not classobj.
import sys
import os
import multiprocessing
import threading
import bdb
from bdb import Bdb
from threading import Thread
from el_tree_o import ElTreeO, _RUNNING, _PAUSED, _WAITING
from pysignal import Signal

class CommandExec(Thread, Bdb):
    '''
    CommandExec is an implementation of the Bdb python class, which is a base
    debugger. This will give the user the ability to pause scripts when needed
    and see script progress through line numbers. Useful for command and
    control scripts.
    '''
    def __init__(self, mainFile, skip=None):
        Bdb.__init__(self, skip=skip)
        Thread.__init__(self)
        # need to define botframe to protect against an error
        # generated in bdb.py when set_quit is called before
        # self.botframe is defined
        self.botframe = None
        # self.event is used to pause execution
        self.event = threading.Event()
        # used so I know when to start debugging
        self.mainFile = mainFile
        self.start_debug = 0
        # used to run a file
        self.statement = ""

    def run(self):
        self.event.clear()
        self.set_step()
        super(bdb.Bdb, self).run(self.statement)
Just as you invoked Bdb's __init__ method on line 22, you can invoke its run method:
Bdb.run(self, self.statement)
super is only useful when you don't know which parent class you need to invoke next, and you want to let Python's inheritance machinery figure it out for you. Here, you know precisely which function you want to call, Bdb.run, so just call it.
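The difference is easy to see with stand-in classes (these mimic only the shape of Thread/Bdb, not the real APIs):

```python
class Thread:
    def run(self):
        return "Thread.run"

class Bdb:
    def run(self, statement):
        return "Bdb.run(%s)" % statement

class CommandExec(Thread, Bdb):
    def __init__(self):
        self.statement = "cmd"

    def run(self):
        # self.run would resolve through the MRO to this very method
        # (or Thread's); naming the parent class explicitly and passing
        # self by hand picks Bdb's version unambiguously
        return Bdb.run(self, self.statement)

print(CommandExec().run())   # Bdb.run(cmd)
```

Since both parents define run with different signatures, the explicit Bdb.run(self, ...) call sidesteps the MRO entirely, which is exactly what's wanted here.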
