How to use a Python context manager to "borrow" objects

I want to create a Python context manager to act as a controlled sort of "library" that lends out objects, and then takes them back when the scope of the with statement exits.
In pseudo-code I was thinking something like this:
from threading import Condition, Lock

class Library:
    def __init__(self):
        self.lib = [1, 2, 3, 4]
        self.lock = Condition(Lock())

    def __enter__(self):
        with self.lock:
            # Somehow keep track of this object-thread association
            if len(self.lib) > 0:
                return self.lib.pop()
            else:
                self.lock.wait()
                return self.lib.pop()

    def __exit__(self, exc_type, exc_val, exc_tb):
        with self.lock:
            # Push the object that the calling thread obtained with
            # __enter__() back into the array
            self.lock.notify()
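One way to sidestep the "object-thread association" problem is to hand the borrowed object out through the with statement itself. The following is only a sketch of that idea, using contextlib.contextmanager (not part of the original pseudo-code), so the object is bound by the with block and returned automatically on exit:

import contextlib
from threading import Condition, Lock

class Library:
    def __init__(self, items):
        self.lib = list(items)
        self.lock = Condition(Lock())

    @contextlib.contextmanager
    def borrow(self):
        # Take an item, blocking until one is available
        with self.lock:
            while not self.lib:
                self.lock.wait()
            item = self.lib.pop()
        try:
            yield item
        finally:
            # Return the item and wake one waiting borrower
            with self.lock:
                self.lib.append(item)
                self.lock.notify()

# Usage (illustrative):
# library = Library([1, 2, 3, 4])
# with library.borrow() as obj:
#     ...  # obj goes back into the library when the block exits

Because the item is bound by the with statement, no per-thread bookkeeping is needed; each borrower returns exactly the object it was given.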

Related

Pass complex object instance to class that subclasses process

I have a large Python 3.6 system where multiple processes and threads interact with each other and the user. Simplified, there is a Scheduler instance (subclasses threading.Thread) and a Worker instance (subclasses multiprocessing.Process). Both objects run for the entire duration of the program.
The user interacts with the Scheduler by adding Task instances and the Scheduler passes the task to the Worker at the correct moment in time. The worker uses the information contained in the task to do its thing.
Below is some stripped-down and simplified code from the project:
import threading
import multiprocessing

class Task:
    def __init__(self, name: str):
        self.name = name
        self.state = 'idle'

class Scheduler(threading.Thread):
    def __init__(self, worker: 'Worker'):
        super().__init__()
        self.worker = worker
        self.start()

    def run(self):
        while True:
            # Do stuff until the user schedules a new task
            task = Task('example')  # <-- In reality the Task instance is not created here but the thread gets it from elsewhere
            task.state = 'scheduled'
            self.worker.change_task(task)
            # Do stuff until task.state == 'finished'

class Worker(multiprocessing.Process):
    def __init__(self):
        super().__init__()
        self.current_task = None
        self.start()

    def change_task(self, new_task: Task):
        self.current_task = new_task
        self.current_task.state = 'accepted-idle'

    def run(self):
        while True:
            # Do stuff until the current task is updated
            self.current_task.state = 'accepted-running'
            # Task is running
            self.current_task.state = 'finished'
The system used to be structured so that the task contained multiple multiprocessing.Event objects, one for each of its possible states. Back then the whole Task instance was not passed to the worker; each of the task's attributes was passed individually. As they were all multiprocessing-safe, this worked, with a caveat: the events changed in worker.run had to be created in worker.run and passed back to the task object for it to work. Not only is this a less than ideal solution, it no longer works with some changes I am making to the project.
Back to the current state of the project, as described by the Python code above. As is, this will never work because nothing makes it multiprocessing-safe at the moment. So I implemented a Proxy/BaseManager structure so that when a new Task is needed, the system gets it from the multiprocessing manager. I use this structure in a slightly different way elsewhere in the project as well. The issue is that worker.run never sees that self.current_task has been updated; it remains None. I expected the proxy to fix this, but clearly I am mistaken.
import types
import typing
from multiprocessing.managers import BaseManager, NamespaceProxy

def Proxy(target: typing.Type) -> typing.Type:
    """
    Normally a Manager only exposes object methods. A NamespaceProxy can be used when registering the object with
    the manager to expose all the attributes. This also works for attributes created at runtime.
    https://stackoverflow.com/a/68123850/8353475
    1. Instead of exposing all the attributes manually, we effectively override __getattr__ to do it dynamically.
    2. Instead of defining a class that subclasses NamespaceProxy for each specific object class that needs to be
    proxied, this method is used to do it dynamically. The target parameter should be the class of the object you want
    to generate the proxy for. The generated proxy class will be returned.
    Example usage: FooProxy = Proxy(Foo)
    :param target: The class of the object to build the proxy class for
    :return: The generated proxy class
    """
    # __getattr__ is called when an attribute 'bar' is requested from 'foo' and not found, e.g. 'foo.bar'. 'bar' can
    # be a class method as well as a variable. The call gets rerouted from the base object to this proxy, where it is
    # processed.
    def __getattr__(self, key):
        result = self._callmethod('__getattribute__', (key,))
        # If the attribute call was for a method we need some further processing
        if isinstance(result, types.MethodType):
            # A wrapper around the method that passes the arguments, actually calls the method and returns the result.
            # Note that at this point wrapper() does not get called, just defined.
            def wrapper(*args, **kwargs):
                # Call the method and pass the return value along
                return self._callmethod(key, args, kwargs)
            # Return the wrapper method (not the result, but the method itself)
            return wrapper
        else:
            # If the attribute call was for a variable it can be returned as is
            return result

    dic = {'types': types, '__getattr__': __getattr__}
    proxy_name = target.__name__ + "Proxy"
    ProxyType = type(proxy_name, (NamespaceProxy,), dic)
    # This is a tuple of all the attributes that are/will be exposed. We copy all of them from the base class
    ProxyType._exposed_ = tuple(dir(target))
    return ProxyType

class TaskManager(BaseManager):
    pass

TaskProxy = Proxy(Task)
TaskManager.register('get_task', callable=Task, proxytype=TaskProxy)
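For reference, a hedged sketch of how this manager might be exercised; the manager has to be started before get_task is called, and the 'example' task name and the print call are illustrative assumptions, not part of the original post:

# Illustrative sketch only
if __name__ == '__main__':
    manager = TaskManager()
    manager.start()
    # Any process holding this proxy sees attribute updates, because reads
    # and writes are forwarded to the single Task in the manager process
    task = manager.get_task('example')
    task.state = 'scheduled'
    print(task.state)  # -> 'scheduled'

Keep in mind that the proxy only shares the Task itself: an assignment like self.current_task = new_task made on the parent's copy of the Worker object still is not visible inside the child's worker.run, so the hand-off to the worker process has to cross the process boundary by some other means (for example a multiprocessing.Queue).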

Is there a synchronization lock with key in Python?

I need a Lock object, similar to multiprocessing.Manager().Lock(), which is only allowed to be released from the process that actually acquired it.
My manual implementation would be something similar to the following:
from threading import Lock  # assumption: the KeyLock instance lives inside the manager process, so a threading.Lock suffices
from typing import Optional

class KeyLock:
    def __init__(self):
        self._lock = Lock()
        self._key: Optional[str] = None

    def acquire(self, key: str, blocking: bool = True, timeout: float = 10.0) -> bool:
        if self._lock.acquire(blocking=blocking, timeout=timeout):
            self._key = key
            return True
        return False

    def release(self, key, raise_error: bool = False) -> bool:
        if self._key == key:
            self._lock.release()
            return True
        if raise_error:
            raise RuntimeError(
                'KeyLock.release called with a non-matching key!'
            )
        return False

    def locked(self):
        return self._lock.locked()
To create an instance of this lock and use it from multiple processes I would use a custom manager class:
from multiprocessing.managers import BaseManager

class KeyLockManager(BaseManager):
    pass

KeyLockManager.register('KeyLock', KeyLock)

manager = KeyLockManager()
manager.start()
lock = manager.KeyLock()
From different processes I can then do:
lock.acquire(os.getpid())
# use shared resource
...
lock.release(os.getpid())
This works as expected, but it seems to be a pretty big effort for a relatively simple task.
So I wonder whether there is an easier way to do this?
There is multiprocessing.RLock, which by definition can only be released by the process that acquired it. Alternatively, you might consider something like the following, where the Lock instance is encapsulated and is meant to be used only as a context manager, making it impossible to release it unless you have acquired it, short of violating the encapsulation. One could, of course, add extra protection to the class to guard against attempts to violate the encapsulation and get at the Lock instance itself. Then again, in your implementation one could always violate encapsulation too, because one can obtain the pid of other processes. So we assume that all the users abide by the rules.
from multiprocessing import Lock

class KeyLock:
    def __init__(self):
        self.__lock = Lock()

    def __enter__(self):
        self.__lock.acquire()
        return None

    def __exit__(self, exc_type, exc_val, exc_tb):
        self.__lock.release()
        return False

# Usage:
key_lock = KeyLock()
with key_lock:
    # do something
    ...

Does a class instance that shares access between coroutines need an asyncio Lock?

For a project I am writing a class that holds data to be shared by multiple coroutines; a basic example would be:
import copy

class Data:
    def __init__(self):
        self.ls = []
        self.tiles = {}

    def add(self, element):
        self.ls.append(element)

    def rem(self, element):
        self.ls.remove(element)

    def set_tiles(self, t):
        self.tiles = t

    def get_tiles(self):
        return copy.deepcopy(self.tiles)
And it would be used like this:
async def test_coro(d):
    # Do multiple things including using all methods from d
    ...

test_data = Data()
# Simultaneously start many instances of `test_coro`, passing `test_data` to all of them
I'm struggling to understand when you would need to use a lock in a situation like this. My question about this code is: would I need to use an asyncio Lock at all, or would it be safe, given that all variable access and assignment inside the class happens without any awaiting?
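For context, here is a hedged sketch (not from the original question; the Counter class is purely illustrative) contrasting a purely synchronous update, which no other coroutine can interrupt, with a read-modify-write that spans an await and therefore does benefit from an asyncio.Lock:

import asyncio

class Counter:
    def __init__(self):
        self.value = 0
        self.lock = asyncio.Lock()

    def bump_sync(self):
        # No await between the read and the write: no other coroutine can
        # run in between, so no lock is needed, just like the Data methods.
        self.value += 1

    async def bump_with_io(self):
        # The read and the write are separated by an await, so another
        # coroutine could interleave here; the lock makes the pair atomic.
        async with self.lock:
            current = self.value
            await asyncio.sleep(0)  # stand-in for some I/O
            self.value = current + 1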

Make Singleton class in Multiprocessing

I created a singleton class using a metaclass. It works well with multiple threads and creates only one instance of MySingleton, but with multiprocessing it always creates a new instance.
import multiprocessing

class SingletonType(type):
    # meta class for making a class singleton
    def __call__(cls, *args, **kwargs):
        try:
            return cls.__instance
        except AttributeError:
            cls.__instance = super(SingletonType, cls).__call__(*args, **kwargs)
            return cls.__instance

class MySingleton(object):
    # singleton class
    __metaclass__ = SingletonType

    def __init__(*args, **kwargs):
        print "init called"

def task():
    # create singleton class instance
    a = MySingleton()

# create two processes
pro_1 = multiprocessing.Process(target=task)
pro_2 = multiprocessing.Process(target=task)

# start processes
pro_1.start()
pro_2.start()
My output:
init called
init called
I need the MySingleton class's __init__ method to be called only once.
Each of your child processes runs its own instance of the Python interpreter, hence the SingletonType in one process doesn't share its state with those in another process. This means that a true singleton that only exists in one of your processes will be of little use, because you won't be able to use it in the other processes: while you can manually share data between processes, that is limited to only basic data types (for example dicts and lists).
Instead of relying on singletons, simply share the underlying data between the processes:
#!/usr/bin/env python3
import multiprocessing
import os

def log(s):
    print('{}: {}'.format(os.getpid(), s))

class PseudoSingleton(object):
    def __init__(self, *args, **kwargs):
        # Hold the lock while checking, so two children cannot both see an
        # empty state and initialize it twice
        with shared_state_lock:
            if not shared_state:
                log('Initializing shared state')
                shared_state['x'] = 1
                shared_state['y'] = 2
                log('Shared state initialized')
            else:
                log('Shared state was already initialized: {}'.format(shared_state))

def task():
    a = PseudoSingleton()

if __name__ == '__main__':
    # We need the __main__ guard so that this part is only executed in
    # the parent
    log('Communication setup')
    shared_state = multiprocessing.Manager().dict()
    shared_state_lock = multiprocessing.Lock()

    # create two processes
    log('Start child processes')
    pro_1 = multiprocessing.Process(target=task)
    pro_2 = multiprocessing.Process(target=task)
    pro_1.start()
    pro_2.start()

    # Wait until processes have finished
    # See https://stackoverflow.com/a/25456494/857390
    log('Wait for children')
    pro_1.join()
    pro_2.join()
    log('Done')
This prints
16194: Communication setup
16194: Start child processes
16194: Wait for children
16200: Initializing shared state
16200: Shared state initialized
16201: Shared state was already initialized: {'x': 1, 'y': 2}
16194: Done
However, depending on your problem setting there might be better solutions using other mechanisms of inter-process communication. For example, the Queue class is often very useful.
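As a hedged illustration of that last point, here is a minimal producer/consumer sketch using multiprocessing.Queue; the function and variable names are illustrative, not taken from the original question:

import multiprocessing

def worker(queue):
    # Each child receives work items instead of relying on a shared singleton
    for item in iter(queue.get, None):  # None acts as a stop sentinel
        print('processing', item)

if __name__ == '__main__':
    queue = multiprocessing.Queue()
    proc = multiprocessing.Process(target=worker, args=(queue,))
    proc.start()
    for i in range(3):
        queue.put(i)
    queue.put(None)  # tell the worker to stop
    proc.join()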

Multiple threads in Python accessing same data

I am wrapping a function in a decorator so that each time it is called it starts in a new thread. Like this:
import threading

def threaded(func):
    def start_thread(*args, **kw):
        """Spawn a new thread on the target function"""
        threading.Thread(target=func, args=args, kwargs=kw).start()
    return start_thread
Now I have a class whose __init__ method I want to run each time in a separate thread with its own data. The idea is to give it all the data it needs, and my job is done when the __init__ method ends. So I have my class like this:
class CreateFigures(object):
    @threaded
    def __init__(self, options, sList):
        self.opt = options
        self.li = sList
        self.generateFigs()
        self.saveFigsToDisk()
I call the constructor of this class from another class with new arguments passed each time, like this:
CreateFigures(newOpts, newList)
Here newOpts and newList can change every time the CreateFigures() constructor is called. My problem is how to safely pass data to this constructor every time I want to start a new thread, because the way I do it currently it is just becoming a mess, with multiple threads accessing the same data at once. I tried enclosing all the statements of the CreateFigures() constructor in a with block with an RLock object, like this:
def __init__(self, options, sList):
    # Note: this creates a fresh RLock for every call, so each thread locks
    # its own private lock and no actual synchronization takes place
    lock = threading.RLock()
    with lock:
        self.opt = options
        self.li = sList
        self.generateFigs()
        self.saveFigsToDisk()
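If the real problem is that generateFigs() and saveFigsToDisk() touch shared state (files, global plotting state, and so on), a hedged sketch of the usual variant of that idea is a single module-level lock shared by every instance, rather than a new RLock per call; the lock name below is illustrative and the method bodies are placeholders, not the original implementation:

import threading

figures_lock = threading.RLock()  # one lock shared by all instances (illustrative name)

class CreateFigures(object):
    def __init__(self, options, sList):
        # Each instance still gets its own copies of the arguments; the shared
        # lock only serializes the work that must not run concurrently
        with figures_lock:
            self.opt = options
            self.li = sList
            self.generateFigs()
            self.saveFigsToDisk()

    def generateFigs(self):
        ...  # placeholder for the original figure generation

    def saveFigsToDisk(self):
        ...  # placeholder for the original saving logic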
