Make Singleton class in Multiprocessing - python

I created a singleton class using a metaclass. It works well with multiple threads and creates only one instance of MySingleton, but with multiprocessing it always creates a new instance:
import multiprocessing
class SingletonType(type):
    # meta class for making a class singleton
    def __call__(cls, *args, **kwargs):
        try:
            return cls.__instance
        except AttributeError:
            cls.__instance = super(SingletonType, cls).__call__(*args, **kwargs)
            return cls.__instance
class MySingleton(object):
    # singleton class
    __metaclass__ = SingletonType
    def __init__(*args,**kwargs):
        print "init called"
def task():
    # create singleton class instance
    a = MySingleton()
# create two processes
pro_1 = multiprocessing.Process(target=task)
pro_2 = multiprocessing.Process(target=task)
# start the processes
pro_1.start()
pro_2.start()
My output:
init called
init called
I need the MySingleton __init__ method to be called only once.

Each of your child processes runs its own instance of the Python interpreter, hence the SingletonType in one process doesn't share its state with those in another process. This means that a true singleton that only exists in one of your processes will be of little use, because you won't be able to use it in the other processes: while you can manually share data between processes, that is limited to only basic data types (for example dicts and lists).
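To make the per-process state concrete, here is a minimal sketch (Python 3, not part of the original answer). It assumes a POSIX system where the "fork" start method is available; the names init_calls, make_instance and run_demo are made up for illustration. Each child increments its own copy of a module-level counter, so each child reports 1 and the parent's counter is never touched:

```python
import multiprocessing as mp

# Assumption: a POSIX system where the "fork" start method is available.
ctx = mp.get_context("fork")

init_calls = 0  # module-level state; every process gets its own copy


def make_instance(results):
    """Runs in a child: bump the child's private counter and report it."""
    global init_calls
    init_calls += 1
    results.put(init_calls)


def run_demo():
    results = ctx.Queue()
    children = [ctx.Process(target=make_instance, args=(results,))
                for _ in range(2)]
    for child in children:
        child.start()
    # Collect one value per child before joining
    seen = [results.get(timeout=10) for _ in children]
    for child in children:
        child.join()
    # The parent's counter was never incremented
    return seen, init_calls


if __name__ == "__main__":
    print(run_demo())
```

The counter plays the role of the metaclass's cached instance: a change made in one process is invisible to the others.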
Instead of relying on singletons, simply share the underlying data between the processes:
#!/usr/bin/env python3
import multiprocessing
import os
def log(s):
    print('{}: {}'.format(os.getpid(), s))
class PseudoSingleton(object):
    def __init__(*args,**kwargs):
        # Check and initialize under the lock so that two children cannot
        # both find the shared state empty
        with shared_state_lock:
            if not shared_state:
                log('Initializating shared state')
                shared_state['x'] = 1
                shared_state['y'] = 2
                log('Shared state initialized')
            else:
                log('Shared state was already initalized: {}'.format(shared_state))
def task():
    a = PseudoSingleton()
if __name__ == '__main__':
    # We need the __main__ guard so that this part is only executed in
    # the parent. Note that the children only see these globals when they
    # are created with the "fork" start method.
    log('Communication setup')
    shared_state = multiprocessing.Manager().dict()
    shared_state_lock = multiprocessing.Lock()
    # create two processes
    log('Start child processes')
    pro_1 = multiprocessing.Process(target=task)
    pro_2 = multiprocessing.Process(target=task)
    pro_1.start()
    pro_2.start()
    # Wait until processes have finished
    # See https://stackoverflow.com/a/25456494/857390
    log('Wait for children')
    pro_1.join()
    pro_2.join()
    log('Done')
This prints
16194: Communication setup
16194: Start child processes
16194: Wait for children
16200: Initializating shared state
16200: Shared state initialized
16201: Shared state was already initalized: {'x': 1, 'y': 2}
16194: Done
However, depending on your problem setting there might be better solutions using other mechanisms of inter-process communication. For example, the Queue class is often very useful.

Related

Pass complex object instance to class that subclasses process

I have a large Python 3.6 system where multiple processes and threads interact with each other and the user. Simplified, there is a Scheduler instance (subclasses threading.Thread) and a Worker instance (subclasses multiprocessing.Process). Both objects run for the entire duration of the program.
The user interacts with the Scheduler by adding Task instances and the Scheduler passes the task to the Worker at the correct moment in time. The worker uses the information contained in the task to do its thing.
Below is some stripped-down, simplified code from the project:
import multiprocessing
import threading
class Task:
    def __init__(self, name:str):
        self.name = name
        self.state = 'idle'
class Scheduler(threading.Thread):
    def __init__(self, worker:'Worker'):
        super().__init__()
        self.worker = worker
        self.start()
    def run(self):
        while True:
            # Do stuff until the user schedules a new task
            task = Task(name='placeholder') # <-- In reality the Task instance is not created here but the thread gets it from elsewhere
            task.state = 'scheduled'
            self.worker.change_task(task)
            # Do stuff until the task.state == 'finished'
class Worker(multiprocessing.Process):
    def __init__(self):
        super().__init__()
        self.current_task = None
        self.start()
    def change_task(self, new_task:Task):
        self.current_task = new_task
        self.current_task.state = 'accepted-idle'
    def run(self):
        while True:
            # Do stuff until the current task is updated
            self.current_task.state = 'accepted-running'
            # Task is running
            self.current_task.state = 'finished'
The system used to be structured so that the task contained multiple multiprocessing.Events, one for each of its possible states. Instead of passing the whole Task instance to the worker, each of the task's attributes was passed individually. As they were all multiprocessing-safe, this worked, with a caveat: the events changed in worker.run had to be created in worker.run and passed back to the task object for it to work. Not only is this a less than ideal solution, it also no longer works with some changes I am making to the project.
Back to the current state of the project, as described by the Python code above. As is, this will never work because nothing makes it multiprocessing-safe at the moment. So I implemented a Proxy/BaseManager structure so that when a new Task is needed, the system gets it from the multiprocessing manager. I use this structure in a slightly different way elsewhere in the project as well. The issue is that worker.run never knows that self.current_task has been updated; it remains None. I expected this to be fixed by using the proxy, but clearly I am mistaken.
import types
import typing
from multiprocessing.managers import BaseManager, NamespaceProxy
def Proxy(target: typing.Type) -> typing.Type:
    """
    Normally a Manager exposes only object methods. A NamespaceProxy can be used when registering the object with
    the manager to expose all the attributes. This also works for attributes created at runtime.
    https://stackoverflow.com/a/68123850/8353475
    1. Instead of exposing all the attributes manually, we effectively override __getattr__ to do it dynamically.
    2. Instead of defining a class that subclasses NamespaceProxy for each specific object class that needs to be
    proxied, this method is used to do it dynamically. The target parameter should be the class of the object you want
    to generate the proxy for. The generated proxy class will be returned.
    Example usage: FooProxy = Proxy(Foo)
    :param target: The class of the object to build the proxy class for
    :return: The generated proxy class
    """
    # __getattr__ is called when an attribute 'bar' is requested from 'foo' and it is not found, e.g. 'foo.bar'. 'bar' can
    # be a class method as well as a variable. The call gets rerouted from the base object to this proxy, where it is
    # processed.
    def __getattr__(self, key):
        result = self._callmethod('__getattribute__', (key,))
        # If attr call was for a method we need some further processing
        if isinstance(result, types.MethodType):
            # A wrapper around the method that passes the arguments, actually calls the method and returns the result.
            # Note that at this point wrapper() does not get called, just defined.
            def wrapper(*args, **kwargs):
                # Call the method and pass the return value along
                return self._callmethod(key, args, kwargs)
            # Return the wrapper method (not the result, but the method itself)
            return wrapper
        else:
            # If the attr call was for a variable it can be returned as is
            return result
    dic = {'types': types, '__getattr__': __getattr__}
    proxy_name = target.__name__ + "Proxy"
    ProxyType = type(proxy_name, (NamespaceProxy,), dic)
    # This is a tuple of all the attributes that are/will be exposed. We copy all of them from the base class
    ProxyType._exposed_ = tuple(dir(target))
    return ProxyType
class TaskManager(BaseManager):
    pass
TaskProxy = Proxy(Task)
TaskManager.register('get_task', callable=Task, proxytype=TaskProxy)
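One reason current_task stays None in the simplified code, independent of the proxy: change_task is called from the Scheduler thread, so it runs in the parent process and only modifies the parent's copy of the Worker object, while run executes in the child process on its own copy. A minimal sketch of one alternative, handing work to the child explicitly through a multiprocessing queue and reporting state changes back through a second queue, could look like this. It is not the project's code: worker_loop, run_demo and the None sentinel are hypothetical, plain strings stand in for Task instances, and the "fork" start method (POSIX) is assumed.

```python
import multiprocessing as mp

# Assumption: a POSIX system where the "fork" start method is available.
ctx = mp.get_context("fork")


def worker_loop(task_queue, state_queue):
    """Runs in the child: wait for task names and report state changes."""
    while True:
        task_name = task_queue.get()
        if task_name is None:  # hypothetical sentinel meaning "shut down"
            break
        state_queue.put((task_name, 'accepted-running'))
        # ... the actual work for the task would happen here ...
        state_queue.put((task_name, 'finished'))


def run_demo():
    task_queue = ctx.Queue()
    state_queue = ctx.Queue()
    worker = ctx.Process(target=worker_loop, args=(task_queue, state_queue))
    worker.start()
    # The scheduler side: hand over a task, then read the state updates
    task_queue.put('example-task')
    updates = [state_queue.get(timeout=10) for _ in range(2)]
    task_queue.put(None)
    worker.join(timeout=10)
    return updates


if __name__ == "__main__":
    print(run_demo())
```

With this layout the child never relies on attributes of the Worker object being changed from outside; everything it needs arrives through the queue.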

python multiprocessing manager connect creates another object

I would like to create an object that is shared among processes. First I created a server process which spawned a process for the class ProcessClass. Then I created another process from which I want to connect to the shared object.
But the connection from the other process created its own instance of ProcessClass.
So what do I need to do to access this remote shared object?
Here is my test code.
from multiprocessing.managers import BaseManager
from multiprocessing import Process
class ProcessClass:
    def __init__(self):
        self._state = False
    def set(self):
        self._state = True
    def get(self):
        return self._state
class MyManager(BaseManager):
    pass
def another_process():
    MyManager.register('my_object')
    m = MyManager(address=('', 50000))
    m.connect()
    proxy = m.my_object()
    print(f'state from another process: {proxy.get()}')
def test_spawn_and_terminate_process():
    MyManager.register('my_object', ProcessClass)
    m = MyManager(address=('', 50000))
    m.start()
    proxy = m.my_object()
    proxy.set()
    print(f'state from main process: {proxy.get()}')
    p = Process(target=another_process)
    p.start()
    p.join()
    print(f'state from main process: {proxy.get()}')
if __name__ == '__main__':
    test_spawn_and_terminate_process()
Output is
python test_communication.py
state from main process: True
state from another process: False
state from main process: True
Your code is working as it is supposed to. If you look at the documentation for multiprocessing.managers.SyncManager you will see that there is, for example, a method dict() to create a shareable dictionary. Would you expect that calling this method multiple times would return the same dictionary over and over again or new instances of sharable dictionaries?
What you need to do is enforce a singleton instance to be used repeatedly for successive invocations of proxy = m.my_object() and the way to do that is to first define the following function:
singleton = None
def get_singleton_process_instance():
    global singleton
    if singleton is None:
        singleton = ProcessClass()
    return singleton
Then you need to make a one-line change in the function test_spawn_and_terminate_process:
def test_spawn_and_terminate_process():
    #MyManager.register('my_object', ProcessClass)
    MyManager.register('my_object', get_singleton_process_instance)
This ensures that to satisfy requests for 'my_object', it always invokes get_singleton_process_instance() (returning the singleton) instead of ProcessClass(), which would return a new instance.
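Put together, the idea can be sketched as a small self-contained script. This is an illustration rather than the answerer's exact code: run_demo is a made-up helper, the manager uses its default address instead of port 50000, both proxies are requested from the same client process for brevity, and the "fork" start method (POSIX) is assumed.

```python
import multiprocessing as mp
from multiprocessing.managers import BaseManager


class ProcessClass:
    def __init__(self):
        self._state = False

    def set(self):
        self._state = True

    def get(self):
        return self._state


_singleton = None


def get_singleton_process_instance():
    """Factory executed in the manager's server process."""
    global _singleton
    if _singleton is None:
        _singleton = ProcessClass()
    return _singleton


class MyManager(BaseManager):
    pass


# Register the factory instead of the class itself
MyManager.register('my_object', get_singleton_process_instance)


def run_demo():
    # Assumption: "fork" start method; default (automatically chosen) address
    manager = MyManager(ctx=mp.get_context("fork"))
    manager.start()
    try:
        first = manager.my_object()
        first.set()
        # A second request is served by the same underlying object
        second = manager.my_object()
        return second.get()
    finally:
        manager.shutdown()


if __name__ == "__main__":
    print(run_demo())
```

Because the factory runs inside the server process, the module-level variable it caches the instance in lives there too, which is what makes every proxy refer to the same object.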

Python subclassing multiprocessing.Lock

I'm trying to understand why python can not compile the following class.
class SharedResource(multiprocessing.Lock):
    def __init__(self, blocking=True, timeout=-1):
        # super().__init__(blocking=True, timeout=-1)
        self.blocking = blocking
        self.timeout = timeout
        self.data = {}
TypeError: method expected 2 arguments, got 3
The reason why I'm subclassing Lock:
My objective is to create a shared list of resources, each of which should be usable by only one process at a time.
This concept will eventually be used in a Flask application, where requests should not be able to use a resource concurrently.
RuntimeError: Lock objects should only be shared between processes through inheritance
import time
from multiprocessing import Lock, Manager, Process
class SharedResource():
    def __init__(self, id, model):
        '''
        id: model id
        model: Keras Model only one worker at a time can call predict
        '''
        self.mutex = Lock()
        self.id = id
        self.model = model
manager = Manager()
shared_list = manager.list() # a List of models
shared_list.append(SharedResource())
def worker1(l):
    # ...read some data
    while True:
        resource = l[0]
        with resource.mutex:
            resource.model.predict(...) # ...some data
        time.sleep(60)
if __name__ == "__main__":
    processes = [ Process(target=worker1, args=[shared_list])]
    for p in processes:
        p.start()
    for p in processes:
        p.join()
You are getting this error because multiprocessing.Lock is not a class but a function (a bound method of the default context), so it cannot be subclassed.
In .../multiprocessing/context.py there are these lines:
def Lock(self):
    '''Returns a non-recursive lock object'''
    from .synchronize import Lock
    return Lock(ctx=self.get_context())
This may change in the future so you can verify this on your version of python by doing:
import multiprocessing
print(type(multiprocessing.Lock))
To actually subclass Lock you will need to do something like this:
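import multiprocessing
from multiprocessing.synchronize import Lock
# Since Lock is now a class, this should work:
class SharedResource(Lock):
    pass
# synchronize.Lock takes a keyword-only ctx argument when it is instantiated
resource = SharedResource(ctx=multiprocessing.get_context())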
I'm not endorsing this approach as a "good" solution, but it should solve your problem if you really need to subclass Lock. Subclassing things that try to avoid being subclassed is usually not a great idea, but sometimes it can be necessary. If you can solve the problem in a different way you may want to consider that.
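One "different way" is composition: keep the lock as an attribute instead of inheriting from it. The following is a minimal sketch under that assumption, not the asker's final design; the use method is a hypothetical helper and a plain list stands in for the Keras model.

```python
import multiprocessing
import multiprocessing.synchronize


class SharedResource:
    """Holds a resource together with the lock that guards it."""

    def __init__(self, resource_id, model):
        # multiprocessing.Lock is a factory; it returns a synchronize.Lock
        self.mutex = multiprocessing.Lock()
        self.resource_id = resource_id
        self.model = model

    def use(self, func):
        """Call func(model) while holding the lock and return the result."""
        with self.mutex:
            return func(self.model)


# A list stands in for the model in this sketch
resource = SharedResource(resource_id=1, model=[1, 2, 3])
total = resource.use(sum)
print(type(multiprocessing.Lock), total)
```

As the RuntimeError quoted in the question indicates, an object that contains such a lock cannot be appended to a manager list; it has to be handed to the worker when the Process is created.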

python sharing singleton object between child processes

I know that processes do not share same context in python. But what about singleton objects? I was able to get the child process share same internal object as parent process, but am unable to understand how. Is there something wrong with the code below?
This could be a follow up to this stackoverflow question.
This is the code I have:
Singleton.py:
import os
class MetaSingleton(type):
    _instances = {}
    def __call__(cls, *args, **kwargs):
        if cls not in cls._instances:
            cls._instances[cls] = super(MetaSingleton, cls).__call__(*args, **kwargs)
        return cls._instances[cls]
class Singleton:
    __metaclass__ = MetaSingleton
    def __init__(self):
        self.key="KEY TEST"
        print "inside init"
    def getKey(self):
        return self.key
    def setKey(self,key1):
        self.key = key1
process_singleton.py:
import os
from Singleton import Singleton
def callChildProc():
    singleton = Singleton()
    print ("singleton key: %s"%singleton.getKey())
def taskRun():
    singleton = Singleton()
    singleton.setKey("TEST2")
    for i in range(1,10):
        print ("In parent process, singleton key: %s" %singleton.getKey())
        try:
            pid = os.fork()
        except OSError,e:
            print("Could not create a child process: %s"%e)
        if pid == 0:
            print("In the child process that has the PID %d"%(os.getpid()))
            callChildProc()
            exit()
        print("Back to the parent process")
taskRun()
On forking systems, child processes have a copy-on-write view of the parent's memory space. Processes use virtual memory, and right after the fork both processes' virtual address spaces point to the same physical RAM. On a write, the affected physical page is copied and the virtual memory is remapped so that this part of the memory is no longer shared. This deferred copy is usually faster than cloning the whole memory space at fork time.
The result is that neither parent nor child sees the other side's changes. Since you set up the singleton before the fork, both parent and child see the same value.
Here is a quick example where I use time.sleep to control when parent and child make their private changes:
import multiprocessing as mp
import time
def proc():
    global foo
    time.sleep(1)
    print('child foo should be 1, the value before the fork:', foo)
    foo = 3 # child private copy
foo = 1 # the value both see at fork
p = mp.Process(target=proc)
p.start()
foo = 2 # parent private copy
time.sleep(2)
print('parent foo should be 2, not the 3 set by child:', foo)
When run:
child foo should be 1, the value before the fork: 1
parent foo should be 2, not the 3 set by child: 2

Python sharing class instance among threads

I have a class, that loads all resources into memory that are needed for my application (mostly images).
Then several threads need to access these resources through this class.
I don't want every instance to reload all resources, so I thought I use the Singleton Pattern.
I did it like this:
class DataContainer(object):
    _instance = None
    _lock = threading.Lock()
    _initialised = True
    def __new__(cls, *args, **kwargs):
        with cls._lock:
            if not cls._instance:
                cls._initialised = False
                cls._instance = object.__new__(cls, *args, **kwargs)
            return cls._instance
    def __init__(self, map_name = None):
        # instance has already been created
        if self._initialised:
            return
        self._initialised = True
        # load images
This works fine, as long as I am not using multiple threads. But with multiple Threads every thread has a different instance. So using 4 threads, they each create a new instance.
I want all threads to use the same instance of this class,
so the resources are only loaded into memory once.
I also tried to do this in the same module where the class is defined, but outside the class definition:
def getDataContainer():
    global dataContainer
    return dataContainer
dataContainer = DataContainer()
but every thread still has its own instance.
I am new to python, if this is the wrong approach plz let me know,
I appreciate any help
To expand on @Will's comment: if a "shared object" is created by the parent and then passed in to each thread, all threads will share the same object.
(With processes, see the multiprocessing.Manager class, which directly supports sharing state, including modifications.)
import threading, time
class SharedObj(object):
    image = 'beer.jpg'
class DoWork(threading.Thread):
    def __init__(self, shared, *args, **kwargs):
        super(DoWork,self).__init__(*args, **kwargs)
        self.shared = shared
    def run(self):
        print threading.current_thread(), 'start'
        time.sleep(1)
        print 'shared', self.shared.image, id(self.shared)
        print threading.current_thread(), 'done'
myshared = SharedObj()
threads = [ DoWork(shared=myshared, name='a'),
            DoWork(shared=myshared, name='b')
]
for t in threads:
    t.start()
for t in threads:
    t.join()
print 'DONE'
Output:
<DoWork(a, started 140381090318080)> start
<DoWork(b, started 140381006067456)> start
shared beer.jpg shared140381110335440
<DoWork(b, started 140381006067456)> done
beer.jpg 140381110335440
<DoWork(a, started 140381090318080)> done
DONE
Note that the thread IDs are different, but they both use the same SharedObj instance, at memory address ending in 440.
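If passing the object to every thread is inconvenient, a module-level accessor guarded by a lock is another option. The following Python 3 sketch is only an illustration: get_data_container, collect_ids and the empty images dict are hypothetical, and the actual image loading is left out.

```python
import threading


class DataContainer:
    def __init__(self):
        # Placeholder for the expensive resource loading
        self.images = {}


_container = None
_container_lock = threading.Lock()


def get_data_container():
    """Create the container on first use; afterwards return the same one."""
    global _container
    with _container_lock:
        if _container is None:
            _container = DataContainer()
        return _container


def collect_ids(n_threads=4):
    """Ask for the container from several threads and record its id()."""
    ids = []

    def worker():
        ids.append(id(get_data_container()))

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return ids


if __name__ == "__main__":
    print(len(set(collect_ids())))
```

All threads of one process share the module's globals, so each of them receives the same DataContainer and the resources are loaded once.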
