I have a large Python 3.6 system where multiple processes and threads interact with each other and the user. Simplified, there is a Scheduler instance (subclasses threading.Thread) and a Worker instance (subclasses multiprocessing.Process). Both objects run for the entire duration of the program.
The user interacts with the Scheduler by adding Task instances and the Scheduler passes the task to the Worker at the correct moment in time. The worker uses the information contained in the task to do its thing.
Below is some stripped-down and simplified code from the project:
import threading
import multiprocessing

class Task:
    def __init__(self, name: str):
        self.name = name
        self.state = 'idle'

class Scheduler(threading.Thread):
    def __init__(self, worker: 'Worker'):
        super().__init__()
        self.worker = worker
        self.start()

    def run(self):
        while True:
            # Do stuff until the user schedules a new task
            task = Task('example')  # <-- In reality the Task instance is not created here but the thread gets it from elsewhere
            task.state = 'scheduled'
            self.worker.change_task(task)
            # Do stuff until task.state == 'finished'

class Worker(multiprocessing.Process):
    def __init__(self):
        super().__init__()
        self.current_task = None
        self.start()

    def change_task(self, new_task: Task):
        self.current_task = new_task
        self.current_task.state = 'accepted-idle'

    def run(self):
        while True:
            # Do stuff until the current task is updated
            self.current_task.state = 'accepted-running'
            # Task is running
            self.current_task.state = 'finished'
The system used to be structured so that the task contained multiple multiprocessing.Event objects, one for each of its possible states. Instead of passing the whole Task instance to the worker, each of the task's attributes was passed individually. As these were all multiprocessing-safe, it worked, with a caveat: the events changed in worker.run had to be created in worker.run and passed back to the task object for it to work. Not only is this a less than ideal solution, it no longer works with some changes I am making to the project.
Back to the current state of the project, as described by the Python code above. As is, this will never work because nothing makes it multiprocessing-safe at the moment. So I implemented a Proxy/BaseManager structure so that when a new Task is needed, the system gets it from the multiprocessing manager. I use this structure in a slightly different way elsewhere in the project as well. The issue is that worker.run never learns that self.current_task is updated; it remains None. I expected this to be fixed by using the proxy, but clearly I am mistaken.
def Proxy(target: typing.Type) -> typing.Type:
    """
    Normally a Manager only exposes an object's methods. A NamespaceProxy can be used when registering the object with
    the manager to expose all the attributes. This also works for attributes created at runtime.
    https://stackoverflow.com/a/68123850/8353475
    1. Instead of exposing all the attributes manually, we effectively override __getattr__ to do it dynamically.
    2. Instead of defining a class that subclasses NamespaceProxy for each specific object class that needs to be
    proxied, this method is used to do it dynamically. The target parameter should be the class of the object you want
    to generate the proxy for. The generated proxy class will be returned.
    Example usage: FooProxy = Proxy(Foo)
    :param target: The class of the object to build the proxy class for
    :return: The generated proxy class
    """

    # __getattr__ is called when an attribute 'bar' is requested from 'foo' and is not found, e.g. 'foo.bar'. 'bar' can
    # be a class method as well as a variable. The call gets rerouted from the base object to this proxy, where it is
    # processed.
    def __getattr__(self, key):
        result = self._callmethod('__getattribute__', (key,))
        # If the attribute call was for a method we need some further processing
        if isinstance(result, types.MethodType):
            # A wrapper around the method that passes the arguments, actually calls the method and returns the result.
            # Note that at this point wrapper() does not get called, just defined.
            def wrapper(*args, **kwargs):
                # Call the method and pass the return value along
                return self._callmethod(key, args, kwargs)
            # Return the wrapper method (not the result, but the method itself)
            return wrapper
        else:
            # If the attribute call was for a variable it can be returned as is
            return result

    dic = {'types': types, '__getattr__': __getattr__}
    proxy_name = target.__name__ + 'Proxy'
    ProxyType = type(proxy_name, (NamespaceProxy,), dic)
    # This is a tuple of all the attributes that are/will be exposed. We copy all of them from the base class
    ProxyType._exposed_ = tuple(dir(target))
    return ProxyType

class TaskManager(BaseManager):
    pass

TaskProxy = Proxy(Task)
TaskManager.register('get_task', callable=Task, proxytype=TaskProxy)
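As an aside: for a Task that only carries plain attributes (no methods), the stock NamespaceProxy is already enough, since it forwards attribute reads and writes to the real object in the manager's server process. A minimal self-contained sketch of the registration and usage (names here are illustrative, not the project's):

```python
from multiprocessing.managers import BaseManager, NamespaceProxy

class Task:
    def __init__(self, name):
        self.name = name
        self.state = 'idle'

class TaskManager(BaseManager):
    pass

# NamespaceProxy forwards plain attribute reads/writes to the real Task,
# which lives inside the manager's server process.
TaskManager.register('get_task', callable=Task, proxytype=NamespaceProxy)

if __name__ == '__main__':
    manager = TaskManager()
    manager.start()
    task = manager.get_task('demo')  # a proxy handle, not a plain Task
    task.state = 'scheduled'         # the write happens in the server process
    print(task.state)                # -> scheduled
    manager.shutdown()
```

Any process holding such a proxy sees the same underlying Task, which is exactly what the Scheduler/Worker pair needs.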
Related
I would like to create a shared object among processes. First I created a server process which spawned a process for class ProcessClass. Then I created another process from which I want to connect to the shared object.
But the connection from the other process created its own instance of ProcessClass.
So what do I need to do to access this remote shared object?
Here is my test code.
from multiprocessing.managers import BaseManager
from multiprocessing import Process

class ProcessClass:
    def __init__(self):
        self._state = False

    def set(self):
        self._state = True

    def get(self):
        return self._state

class MyManager(BaseManager):
    pass

def another_process():
    MyManager.register('my_object')
    m = MyManager(address=('', 50000))
    m.connect()
    proxy = m.my_object()
    print(f'state from another process: {proxy.get()}')

def test_spawn_and_terminate_process():
    MyManager.register('my_object', ProcessClass)
    m = MyManager(address=('', 50000))
    m.start()
    proxy = m.my_object()
    proxy.set()
    print(f'state from main process: {proxy.get()}')
    p = Process(target=another_process)
    p.start()
    p.join()
    print(f'state from main process: {proxy.get()}')

if __name__ == '__main__':
    test_spawn_and_terminate_process()
Output is
python test_communication.py
state from main process: True
state from another process: False
state from main process: True
Your code is working as it is supposed to. If you look at the documentation for multiprocessing.managers.SyncManager you will see that there is, for example, a method dict() to create a shareable dictionary. Would you expect that calling this method multiple times would return the same dictionary over and over again or new instances of sharable dictionaries?
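You can check this directly with a SyncManager: every call to one of its factory methods creates a fresh shared object, not a reference to a previous one.

```python
from multiprocessing import Manager

if __name__ == '__main__':
    m = Manager()
    d1 = m.dict()
    d2 = m.dict()     # a second call creates a second, independent shared dict
    d1['x'] = 1
    print('x' in d2)  # False: d2 is a different object than d1
    m.shutdown()
```

Registering ProcessClass as a callable behaves the same way: each m.my_object() call asks the server to build a new instance.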
What you need to do is enforce a singleton instance to be used repeatedly for successive invocations of proxy = m.my_object() and the way to do that is to first define the following function:
singleton = None

def get_singleton_process_instance():
    global singleton
    if singleton is None:
        singleton = ProcessClass()
    return singleton
Then you need to make a one-line change in function test_spawn_and_terminate_process:
def test_spawn_and_terminate_process():
    #MyManager.register('my_object', ProcessClass)
    MyManager.register('my_object', get_singleton_process_instance)
This ensures that to satisfy requests for 'my_object', it always invokes get_singleton_process_instance() (returning the singleton) instead of ProcessClass(), which would return a new instance.
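Putting it together (same structure and address as the question, only the registration changed), both processes now report True. Note that the singleton global lives inside the manager's server process, which is why every connection sees the same instance:

```python
from multiprocessing.managers import BaseManager
from multiprocessing import Process

class ProcessClass:
    def __init__(self):
        self._state = False
    def set(self):
        self._state = True
    def get(self):
        return self._state

class MyManager(BaseManager):
    pass

singleton = None

def get_singleton_process_instance():
    # Runs inside the manager's server process: every request for
    # 'my_object' returns the same ProcessClass instance.
    global singleton
    if singleton is None:
        singleton = ProcessClass()
    return singleton

def another_process():
    MyManager.register('my_object')
    m = MyManager(address=('', 50000))
    m.connect()
    print(f'state from another process: {m.my_object().get()}')  # True

if __name__ == '__main__':
    MyManager.register('my_object', get_singleton_process_instance)
    m = MyManager(address=('', 50000))
    m.start()
    m.my_object().set()
    p = Process(target=another_process)
    p.start()
    p.join()
    m.shutdown()
```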
I am trying to use multiprocessing in a class. I use multiprocessing.Pipe() to pass an instance from the parent process to the child process.
self.listofwritetags = self.collectwritetaglist()
self.progressbar['value'] = 20
self.frame.update_idletasks()
self.alldevice = alldevices_V3.AllDevices(self.comm_object)
self.progressbar['value'] = 40
self.frame.update_idletasks()
con1, con2 = multiprocessing.Pipe()
con1.send(self.alldevice)
con2.send(self.comm_object)
# Multithreading section
# self.callmotor1dprocess = thread_with_trace(target=self.callallmotor1d, args=(self.comm_object, self.alldevice))
self.callmotor1dprocess = multiprocessing.Process(target=self.callallmotor1d, args=(con1, con2))
self.listofthread.append(self.callmotor1dprocess)
self.button2.config(text="Initialized")
initial = "True"
self.progressbar.stop()
Now I start all the processes:
def startprocess(self):
    for item in self.listofthread:
        item.start()
    self.button3.config(text="started")

def stopprocess(self):
    for item in self.listofthread:
        item.kill()
This code I call inside the class. The method that gets executed is defined outside of the class:
def callallmotor1d(con1, con2):
    comobject = con1.recv()
    devices = con2.recv()
    while True:
        Allmotorprocessing.process(comobject, devices)
But I got the following error, which is very common:
Error message:
Traceback (most recent call last):
File "C:\Users\misu01\AppData\Local\Programs\Python\Python37\lib\multiprocessing\queues.py", line 236, in _feed
obj = _ForkingPickler.dumps(obj)
File "C:\Users\misu01\AppData\Local\Programs\Python\Python37\lib\multiprocessing\reduction.py", line 51, in dumps
cls(buf, protocol).dump(obj)
TypeError: can't pickle _thread.lock objects
I don't know why a _thread.lock object is created.
To avoid this error I tried to modify my AllDevices class and comm_object class.
Here is my modification:
class AllDevices:
    def __init__(self, comobject):
        self.mylock = threading.Lock()
        self.comobject = comobject
        self.dfM1D = pd.read_excel(r'C:\OPCUA\Working_VF1_5.xls', sheet_name='Motor1D')
        self.allmotor1dobjects = callallmotor1D_V3.Cal_AllMotor1D(self.dfM1D, self.comobject)

    def __getstate__(self):
        state = vars(self).copy()
        # Remove the unpicklable entries.
        del state['mylock']
        return state

    def __setstate__(self, state):
        # Restore instance attributes.
        vars(self).update(state)
Here is the comobject class.
class General:
    def __init__(self):
        self.client = Communication()
        self.mylock = threading.Lock()
        self.sta_con_plc = self.client.opc_client_connect()
        self.readgeneral = ReadGeneral(self.client.PLC)
        self.writegeneral = WriteGeneral(self.client.PLC)

    def __getstate__(self):
        state = vars(self).copy()
        # Remove the unpicklable entries.
        del state['mylock']
        return state

    def __setstate__(self, state):
        # Restore instance attributes.
        vars(self).update(state)
But I still get the error.
Is my implementation correct?
self.allmotor1dobjects = callallmotor1D_V2.Cal_AllMotor1D(self.dfM1D, self.comobject,self.logger)
Here self.allmotor1dobjects is also a class instance,
like:
self.client = Communication()
self.readgeneral = ReadGeneral(self.client.PLC)
self.writegeneral = WriteGeneral(self.client.PLC)
These are also class instances.
I never used threading.Lock in either of these two classes, so I don't know how it is created.
As the docs suggest (https://docs.python.org/3/library/pickle.html#pickling-class-instances), using __getstate__ and __setstate__ should remove this error.
In my case it doesn't.
How can I remove this error? Any help in this regard will be highly appreciated.
You can pass most classes through a pipe, but not if the instances have attributes of unpicklable types. In this case, the AllDevices instance has, directly or indirectly, a threading.Lock attribute (the error message says _thread.lock because under all the abstraction, that's the actual class that threading.Lock returns an instance of).
threading.Lock isn't picklable (because thread locks only make sense within a given process; even if you recreated them in another process, they wouldn't actually provide any sort of synchronization between the processes).
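The failure is easy to reproduce in isolation; pickling any object that directly or indirectly holds a threading.Lock raises the same TypeError (the class below is a toy stand-in, and the exact wording of the message varies slightly between Python versions):

```python
import pickle
import threading

class Holder:
    def __init__(self):
        self.mylock = threading.Lock()  # the unpicklable attribute

try:
    pickle.dumps(Holder())
except TypeError as e:
    print(e)  # e.g. "can't pickle _thread.lock objects"
```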
If the alldevice class is under your control, you have a few options:
Remove the per-instance threading.Lock if possible (easiest solution, but assumes synchronization isn't required, or that synchronization could be done by a shared global lock)
If synchronization is required, and must operate across processes, but needn't be per-instance, and you're on a UNIX-like system (Linux, BSD) you could use a shared global multiprocessing.Lock() instead; it will be inherited when the Process call triggers a fork
If synchronization is required, and you must have a per-instance Lock (and it's okay for the lock to operate across processes), you can replace threading.Lock with a pickle-friendly multiprocessing.Manager's Lock. You'd need to make a common multiprocessing.Manager() instance somewhere (e.g. globally, or as a class attribute of alldevice), then use Lock from that manager instead of from threading. Simple example:
import multiprocessing

class alldevice:
    MANAGER = multiprocessing.Manager()  # Shared manager for all alldevice instances

    def __init__(self, ... other args here ...):
        self.mylock = self.MANAGER.Lock()  # Make picklable lock
        # ... rest of initialization ...
When multiprocessing isn't involved, this will be slower than a threading.Lock, as it will require IPC to lock and unlock the lock (the Manager's Lock instances are actually a proxy object that communicates with the Manager on whatever process it's actually running in), and it will lock across processes (probably for the best if you're locking access to actual hardware that can't be used concurrently from multiple processes), but it's relatively simple.
If synchronization is required, but only within a process, not across processes, you can take control of the pickling process to avoid trying to pickle the threading.Lock, and instead recreate it when it's unpickled on the other side. Just explicitly implement the pickle support methods to avoid pickling the Lock, and force it to be recreated on the other side. Example:
import copy

class alldevice:
    def __init__(self, ... other args here ...):
        self.mylock = threading.Lock()  # Use regular lock
        # ... rest of initialization ...

    def __getstate__(self):
        state = vars(self).copy()  # Make copy of instance dict
        del state['mylock']        # Remove lock attribute
        return state

    def __setstate__(self, state):
        vars(self).update(state)
        self.mylock = threading.Lock()  # Make new lock on other side

    # If you ever make copies within the same thread, you may want to define
    # __deepcopy__ so the local copy process doesn't make a new lock:
    def __deepcopy__(self, memo):
        # Make a new empty instance without invoking __init__
        newself = self.__class__.__new__(self.__class__)
        # Individually deepcopy attributes *except* mylock, which we alias
        for name, value in vars(self).items():
            # Cascading deepcopy for all other attributes
            if name != 'mylock':
                value = copy.deepcopy(value, memo)
            setattr(newself, name, value)
        return newself
The __deepcopy__ override is only needed if you want the copy to continue sharing the lock; otherwise, if a deep copy should behave as an entirely independent instance, you can omit it, and you'll end up with an unrelated lock in the copy.
If you don't have control of the alldevice class, but can identify the problematic attribute, your only option is to register a copyreg handler for alldevice to do the same basic thing as option #4, which would look something like this:
import copyreg

def unpickle_alldevice(state):
    self = alldevice.__new__(alldevice)  # Make empty alldevice
    vars(self).update(state)             # Update with provided state
    self.mylock = threading.Lock()       # Make fresh lock
    return self

def pickle_alldevice(ad):
    state = vars(ad).copy()  # Make shallow copy of instance dict
    del state['mylock']      # Remove lock attribute
    return unpickle_alldevice, (state,)  # Return __reduce__-style info for reconstruction

# Register alternate pickler for alldevice
copyreg.pickle(alldevice, pickle_alldevice)
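A quick round-trip check of the copyreg approach, using a minimal stand-in for alldevice with the same problematic attribute (the class body here is illustrative, not the real one):

```python
import copyreg
import pickle
import threading

class alldevice:  # minimal stand-in with the same problematic attribute
    def __init__(self):
        self.mylock = threading.Lock()
        self.name = 'motor'

def unpickle_alldevice(state):
    self = alldevice.__new__(alldevice)
    vars(self).update(state)
    self.mylock = threading.Lock()  # fresh lock on the receiving side
    return self

def pickle_alldevice(ad):
    state = vars(ad).copy()
    del state['mylock']
    return unpickle_alldevice, (state,)

copyreg.pickle(alldevice, pickle_alldevice)

clone = pickle.loads(pickle.dumps(alldevice()))
print(clone.name)                   # motor
print(type(clone.mylock).__name__)  # lock: recreated, not copied
```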
I created a singleton class using a metaclass. It works well with multiple threads and creates only one instance of the MySingleton class, but with multiprocessing it always creates a new instance.
import multiprocessing

class SingletonType(type):
    # meta class for making a class singleton
    def __call__(cls, *args, **kwargs):
        try:
            return cls.__instance
        except AttributeError:
            cls.__instance = super(SingletonType, cls).__call__(*args, **kwargs)
            return cls.__instance

class MySingleton(object):
    # singleton class
    __metaclass__ = SingletonType

    def __init__(*args, **kwargs):
        print "init called"

def task():
    # create singleton class instance
    a = MySingleton()

# create two processes
pro_1 = multiprocessing.Process(target=task)
pro_2 = multiprocessing.Process(target=task)

# start processes
pro_1.start()
pro_2.start()
My output:
init called
init called
I need the MySingleton __init__ method to be called only once.
Each of your child processes runs its own instance of the Python interpreter, hence the SingletonType in one process doesn't share its state with those in another process. This means that a true singleton that only exists in one of your processes will be of little use, because you won't be able to use it in the other processes: while you can manually share data between processes, that is limited to only basic data types (for example dicts and lists).
Instead of relying on singletons, simply share the underlying data between the processes:
#!/usr/bin/env python3

import multiprocessing
import os

def log(s):
    print('{}: {}'.format(os.getpid(), s))

class PseudoSingleton(object):
    def __init__(*args, **kwargs):
        if not shared_state:
            log('Initializating shared state')
            with shared_state_lock:
                shared_state['x'] = 1
                shared_state['y'] = 2
            log('Shared state initialized')
        else:
            log('Shared state was already initalized: {}'.format(shared_state))

def task():
    a = PseudoSingleton()

if __name__ == '__main__':
    # We need the __main__ guard so that this part is only executed in
    # the parent
    log('Communication setup')
    shared_state = multiprocessing.Manager().dict()
    shared_state_lock = multiprocessing.Lock()

    # create two processes
    log('Start child processes')
    pro_1 = multiprocessing.Process(target=task)
    pro_2 = multiprocessing.Process(target=task)
    pro_1.start()
    pro_2.start()

    # Wait until processes have finished
    # See https://stackoverflow.com/a/25456494/857390
    log('Wait for children')
    pro_1.join()
    pro_2.join()
    log('Done')
This prints
16194: Communication setup
16194: Start child processes
16194: Wait for children
16200: Initializating shared state
16200: Shared state initialized
16201: Shared state was already initalized: {'x': 1, 'y': 2}
16194: Done
However, depending on your problem setting there might be better solutions using other mechanisms of inter-process communication. For example, the Queue class is often very useful.
I'm trying to override the DaemonRunner in the python-daemon library (found here: https://pypi.python.org/pypi/python-daemon/).
The DaemonRunner responds to command line arguments for start, stop, and restart, but I want to add a fourth option for status.
The class I want to override looks something like this:
class DaemonRunner(object):
    def _start(self):
        ...  # etc.

    action_funcs = {'start': _start}
I've tried to override it like this:
class StatusDaemonRunner(DaemonRunner):
    def _status(self):
        ...

    DaemonRunner.action_funcs['status'] = _status
This works to some extent, but the problem is that every instance of DaemonRunner now has the new behaviour. Is it possible to override it without modifying every instance of DaemonRunner?
I would override action_funcs to make it a non-static member of class StatusDaemonRunner(DaemonRunner).
In terms of code I would do:
class StatusDaemonRunner(runner.DaemonRunner):
    def __init__(self, app):
        self.action_funcs = runner.DaemonRunner.action_funcs.copy()
        self.action_funcs['status'] = StatusDaemonRunner._status
        super(StatusDaemonRunner, self).__init__(app)

    def _status(self):
        pass  # do your stuff
Indeed, if we look at the getter in the implementation of DaemonRunner (here) we can see that it accesses the attribute through self:
def _get_action_func(self):
    """ Return the function for the specified action.

        Raises ``DaemonRunnerInvalidActionError`` if the action is
        unknown.
        """
    try:
        func = self.action_funcs[self.action]
    except KeyError:
        raise DaemonRunnerInvalidActionError(
            u"Unknown action: %(action)r" % vars(self))
    return func
Hence the previous code should do the trick.
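What makes this work is ordinary attribute lookup: the assignment in __init__ creates an instance attribute that shadows the class attribute, so self.action_funcs finds the per-instance copy while other instances keep seeing the class dict. A minimal illustration with toy names:

```python
class Base:
    funcs = {'start': 'start-handler'}  # class attribute, shared by default

class Sub(Base):
    def __init__(self):
        # Copy, then extend: shadows Base.funcs for this instance only.
        self.funcs = Base.funcs.copy()
        self.funcs['status'] = 'status-handler'

s = Sub()
print('status' in s.funcs)     # True: the instance copy has the new entry
print('status' in Base.funcs)  # False: the class attribute is untouched
```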
The problem is basically this, in python's gobject and gtk bindings. Assume we have a class that binds to a signal when constructed:
class ClipboardMonitor(object):
    def __init__(self):
        clip = gtk.clipboard_get(gtk.gdk.SELECTION_CLIPBOARD)
        clip.connect("owner-change", self._clipboard_changed)
The problem now is that no instance of ClipboardMonitor will ever die. The gtk clipboard is an application-wide object, and connecting to it keeps a reference to the object, since we use the callback self._clipboard_changed.
I'm debating how to work around this using weak references (weakref module), but I have yet to come up with a plan. Anyone have an idea how to pass a callback to the signal registration, and have it behave like a weak reference (if the signal callback is called when the ClipboardMonitor instance is out of scope, it should be a no-op).
Addition: Phrased independently of GObject or GTK+:
How do you provide a callback method to an opaque object, with weakref semantics? If the connecting object goes out of scope, it should be deleted and the callback should act as a no-op; the connectee should not hold a reference to the connector.
To clarify: I explicitly want to avoid having to call a "destructor/finalizer" method
The standard way is to disconnect the signal. This however needs to have a destructor-like method in your class, called explicitly by code which maintains your object. This is necessary, because otherwise you'll get circular dependency.
class ClipboardMonitor(object):
    [...]
    def __init__(self):
        self.clip = gtk.clipboard_get(gtk.gdk.SELECTION_CLIPBOARD)
        self.signal_id = self.clip.connect("owner-change", self._clipboard_changed)

    def close(self):
        self.clip.disconnect(self.signal_id)
As you pointed out, you need weakrefs if you want to avoid explicit destroying. I would write a weak callback factory, like:
import weakref

class CallbackWrapper(object):
    def __init__(self, sender, callback):
        self.weak_obj = weakref.ref(callback.im_self)
        self.weak_fun = weakref.ref(callback.im_func)
        self.sender = sender
        self.handle = None

    def __call__(self, *things):
        obj = self.weak_obj()
        fun = self.weak_fun()
        if obj is not None and fun is not None:
            return fun(obj, *things)
        elif self.handle is not None:
            self.sender.disconnect(self.handle)
            self.handle = None
            self.sender = None

def weak_connect(sender, signal, callback):
    wrapper = CallbackWrapper(sender, callback)
    wrapper.handle = sender.connect(signal, wrapper)
    return wrapper
(this is a proof of concept code, works for me -- you should probably adapt this piece to your needs). Few notes:
I am storing the callback object and function separately. You cannot simply make a weakref of a bound method, because bound methods are very temporary objects. Actually, weakref.ref(obj.method) will destroy the bound method object instantly after creating the weakref. I didn't check whether it is needed to store a weakref to the function too... I guess if your code is static, you probably can avoid that.
The object wrapper will remove itself from the signal sender when it notices that the weak reference ceased to exist. This is also necessary to destroy the circular dependence between the CallbackWrapper and the signal sender object.
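The bound-method point is easy to verify in CPython: each access to obj.method builds a fresh bound-method object, so a weakref to it dies immediately, while weakrefs to the instance and the underlying function survive. (In Python 3 the attributes are __self__/__func__ rather than the im_self/im_func used above.)

```python
import weakref

class Foo:
    def bar(self):
        return 42

foo = Foo()
ref = weakref.ref(foo.bar)  # the temporary bound-method object is collected at once
print(ref())                # None, even though foo is still alive

# Keeping the instance and the function separately works:
obj_ref = weakref.ref(foo)
fun_ref = weakref.ref(Foo.bar)   # in Python 3, Foo.bar is the plain function
print(fun_ref()(obj_ref()))      # 42: the call rebuilt from the two parts
```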
(This answer tracks my progress)
This second version will disconnect as well; I have a convenience function for gobjects, but I actually need this class for a more general case -- both for D-Bus signal callbacks and GObject callbacks.
Anyway, what can one call the WeakCallback implementation style? It is a very clean encapsulation of the weak callback, but with the gobject/dbus specialization unnoticeably tacked on. Beats writing two subclasses for those two cases.
import weakref

class WeakCallback (object):
    """A Weak Callback object that will keep a reference to
    the connecting object with weakref semantics.

    This allows to connect to gobject signals without it keeping
    the connecting object alive forever.

    Will use #gobject_token or #dbus_token if set as follows:
        sender.disconnect(gobject_token)
        dbus_token.remove()
    """
    def __init__(self, obj, attr):
        """Create a new Weak Callback calling the method #obj.#attr"""
        self.wref = weakref.ref(obj)
        self.callback_attr = attr
        self.gobject_token = None
        self.dbus_token = None

    def __call__(self, *args, **kwargs):
        obj = self.wref()
        if obj:
            attr = getattr(obj, self.callback_attr)
            attr(*args, **kwargs)
        elif self.gobject_token:
            sender = args[0]
            sender.disconnect(self.gobject_token)
            self.gobject_token = None
        elif self.dbus_token:
            self.dbus_token.remove()
            self.dbus_token = None

def gobject_connect_weakly(sender, signal, connector, attr, *user_args):
    """Connect weakly to GObject #sender's #signal,
    with a callback in #connector named #attr.
    """
    wc = WeakCallback(connector, attr)
    wc.gobject_token = sender.connect(signal, wc, *user_args)
not actually tried it yet, but:
class WeakCallback(object):
    """
    Used to wrap bound methods without keeping a ref to the underlying object.
    You can also pass in user_data and user_kwargs in the same way as with
    rpartial. Note that refs will be kept to everything you pass in other than
    the callback, which will have a weakref kept to it.
    """
    def __init__(self, callback, *user_data, **user_kwargs):
        self.im_self = weakref.proxy(callback.im_self, self._invalidated)
        self.im_func = weakref.proxy(callback.im_func)
        self.user_data = user_data
        self.user_kwargs = user_kwargs

    def __call__(self, *args, **kwargs):
        kwargs.update(self.user_kwargs)
        args += self.user_data
        self.im_func(self.im_self, *args, **kwargs)

    def _invalidated(self, im_self):
        """Called by the weakref.proxy object."""
        cb = getattr(self, 'cancel_callback', None)
        if cb is not None:
            cb()

    def add_cancel_function(self, cancel_callback):
        """
        A ref will be kept to cancel_callback. It will be called back without
        any args when the underlying object dies.

        You can wrap it in WeakCallback if you want, but that's a bit too
        self-referential for me to do by default. Also, that would stop you
        being able to use a lambda as the cancel_callback.
        """
        self.cancel_callback = cancel_callback

def weak_connect(sender, signal, callback):
    """
    API-compatible with the function described in
    http://stackoverflow.com/questions/1364923/. Mostly used as an example.
    """
    cb = WeakCallback(callback)
    handle = sender.connect(signal, cb)
    cb.add_cancel_function(WeakCallback(sender.disconnect, handle))