Multiple threads in Python accessing same data

I am wrapping a function in a decorator so that each time it is called it starts in a new thread. Like this:
import threading

def threaded(func):
    def start_thread(*args, **kw):
        """Spawn a new thread on the target function"""
        threading.Thread(target=func, args=args, kwargs=kw).start()
    return start_thread
Now I have a class whose __init__ method I want to run in a separate thread each time, with its own data. The idea is to give it all the data it needs, and my job is done when the __init__ method ends. So I have my class like this:
class CreateFigures(object):
    @threaded
    def __init__(self, options, sList):
        self.opt = options
        self.li = sList
        self.generateFigs()
        self.saveFigsToDisk()
I call the constructor of this class from another class with new arguments passed each time, like this:
CreateFigures(newOpts, newList)
Here newOpts and newList can change every time the CreateFigures() constructor is called. My problem is how to safely pass data to this constructor every time I want to start a new thread, because the way I do it currently it is just becoming a mess, with multiple threads accessing the same data at once. I tried enclosing all the statements of the CreateFigures() constructor in a with block with an RLock object, like this:
def __init__(self, options, sList):
    lock = threading.RLock()
    with lock:
        self.opt = options
        self.li = sList
        self.generateFigs()
        self.saveFigsToDisk()
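A lock created inside __init__ like that protects nothing: every thread gets its own fresh RLock, so no two threads ever contend for it. A minimal sketch of one way to fix this, assuming the genuinely shared state is the figure/disk resources rather than the constructor arguments (the lock name is illustrative):

import threading

# One lock shared by every CreateFigures instance; a lock created inside
# __init__ is new for each call and therefore never contended.
figures_lock = threading.RLock()

class CreateFigures(object):
    @threaded
    def __init__(self, options, sList):
        # These attributes belong to this instance only, so they need no lock
        self.opt = options
        self.li = sList
        with figures_lock:  # serialize access to any truly shared resource
            self.generateFigs()
            self.saveFigsToDisk()

Note that if newOpts and newList are freshly created for each call, the arguments themselves are not shared between threads; only resources all threads touch (for example the same files on disk) need the lock.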

Related

Pass complex object instance to class that subclasses process

I have a large Python 3.6 system where multiple processes and threads interact with each other and the user. Simplified, there is a Scheduler instance (subclasses threading.Thread) and a Worker instance (subclasses multiprocessing.Process). Both objects run for the entire duration of the program.
The user interacts with the Scheduler by adding Task instances and the Scheduler passes the task to the Worker at the correct moment in time. The worker uses the information contained in the task to do its thing.
Below is some stripped out and simplified code out of the project:
import threading
import multiprocessing

class Task:
    def __init__(self, name: str):
        self.name = name
        self.state = 'idle'

class Scheduler(threading.Thread):
    def __init__(self, worker: Worker):
        super().__init__()
        self.worker = worker
        self.start()

    def run(self):
        while True:
            # Do stuff until the user schedules a new task
            task = Task()  # <-- In reality the Task instance is not created here but the thread gets it from elsewhere
            task.state = 'scheduled'
            self.worker.change_task(task)
            # Do stuff until task.state == 'finished'

class Worker(multiprocessing.Process):
    def __init__(self):
        super().__init__()
        self.current_task = None
        self.start()

    def change_task(self, new_task: Task):
        self.current_task = new_task
        self.current_task.state = 'accepted-idle'

    def run(self):
        while True:
            # Do stuff until the current task is updated
            self.current_task.state = 'accepted-running'
            # Task is running
            self.current_task.state = 'finished'
The system used to be structured so that the task contained multiple multiprocessing.Events indicating each of its possible states. The whole Task instance was then not passed to the worker; each of the task's attributes was passed instead. As they were all multiprocessing-safe, it worked, with a caveat: the events changed in worker.run had to be created in worker.run and passed back to the task object for it to work. Not only is this a less than ideal solution, it no longer works with some changes I am making to the project.
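As a rough illustration of that older design (a hypothetical sketch, not code from the project; the attribute names are illustrative):

import multiprocessing

class Task:
    def __init__(self, name: str):
        self.name = name
        # One process-safe Event per possible state, instead of a string
        self.scheduled = multiprocessing.Event()
        self.running = multiprocessing.Event()
        self.finished = multiprocessing.Event()

# The worker would call task.finished.set() instead of assigning a string
# to task.state, and the scheduler would block on task.finished.wait().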
Back to the current state of the project, as described by the Python code above. As is, this will never work, because nothing makes it multiprocessing-safe at the moment. So I implemented a Proxy/BaseManager structure so that when a new Task is needed, the system gets it from the multiprocessing manager. I use this structure in a slightly different way elsewhere in the project as well. The issue is that worker.run never knows that self.current_task is updated; it remains None. I expected this to be fixed by using the proxy, but clearly I am mistaken.
import types
import typing
from multiprocessing.managers import BaseManager, NamespaceProxy

def Proxy(target: typing.Type) -> typing.Type:
    """
    Normally a Manager exposes only object methods. A NamespaceProxy can be used when registering the object with
    the manager to expose all the attributes. This also works for attributes created at runtime.
    https://stackoverflow.com/a/68123850/8353475
    1. Instead of exposing all the attributes manually, we effectively override __getattr__ to do it dynamically.
    2. Instead of defining a class that subclasses NamespaceProxy for each specific object class that needs to be
    proxied, this method is used to do it dynamically. The target parameter should be the class of the object you want
    to generate the proxy for. The generated proxy class will be returned.
    Example usage: FooProxy = Proxy(Foo)
    :param target: The class of the object to build the proxy class for
    :return: The generated proxy class
    """
    # __getattr__ is called when an attribute 'bar' is accessed on 'foo' and is not found, e.g. 'foo.bar'. 'bar' can
    # be a class method as well as a variable. The call gets rerouted from the base object to this proxy, where it is
    # processed.
    def __getattr__(self, key):
        result = self._callmethod('__getattribute__', (key,))
        # If the attr call was for a method we need some further processing
        if isinstance(result, types.MethodType):
            # A wrapper around the method that passes the arguments, actually calls the method and returns the result.
            # Note that at this point wrapper() does not get called, just defined.
            def wrapper(*args, **kwargs):
                # Call the method and pass the return value along
                return self._callmethod(key, args, kwargs)
            # Return the wrapper method (not the result, but the method itself)
            return wrapper
        else:
            # If the attr call was for a variable it can be returned as is
            return result

    dic = {'types': types, '__getattr__': __getattr__}
    proxy_name = target.__name__ + "Proxy"
    ProxyType = type(proxy_name, (NamespaceProxy,), dic)
    # This is a tuple of all the attributes that are/will be exposed. We copy all of them from the base class
    ProxyType._exposed_ = tuple(dir(target))
    return ProxyType

class TaskManager(BaseManager):
    pass

TaskProxy = Proxy(Task)
TaskManager.register('get_task', callable=Task, proxytype=TaskProxy)
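For reference, a minimal usage sketch of the registration above (my addition, not from the original post; the task name is illustrative and this assumes the manager is started in the main process):

if __name__ == '__main__':
    manager = TaskManager()
    manager.start()
    # Returns a TaskProxy; the real Task object lives in the manager process
    task = manager.get_task('demo')
    task.state = 'scheduled'  # attribute access is routed through the proxy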

Python Multiprocessing outside of main producing unexpected result

I have a program that makes a call to an API every minute and does some operations. When some condition is met, I want to create a new process that makes calls to another API every second and does its own operations. The parent process doesn't care about the result the child process produces; the child will run on its own until everything is done. This way the parent process can continue making calls to the API every minute and doing operations without interruption.
I looked into multiprocessing. However, I can't get it to work outside of main. I tried passing a callback function, but that produced an unexpected result (the parent process started running again in parallel at some point).
Another solution I can think of is to just create another project and make a request to it, but then I would have a lot of repeated code.
What is the best approach to my problem?
example code:
class Main:
    [...]
    foo = Foo()
    child = Child()
    foo.Run(child.Process)

class Foo:
    [...]
    def Run(self, callbackfunction):
        while True:
            x = self.dataServices.GetDataApi()
            if x == 1020:
                callbackfunction()
            # start next loop after a minute

class Child:
    [...]
    def Compute(self):
        while True:
            self.dataServices.GetDataApiTwo()
            # do stuff
            # start next loop after a second

    def Process(self):
        self.Compute()  # I want this function to run from a new process, so it won't interfere
Edit 2: added my multiprocessing attempt
class Main:
    def CreateNewProcess(self, callBack):
        if __name__ == '__main__':
            p = Process(target=callBack)
            p.start()
            p.join()

    foo = Foo()
    child = Child(CreateNewProcess)
    foo.Run(child.Process)

class Foo:
    def Run(callbackfunction):
        while True:
            x = dataServices.GetDataApi()
            if x == 1020:
                callbackfunction()
            # start next loop after a minute

class Child:
    _CreateNewProcess = None

    def __init__(self, CreateNewProcess):
        self._CreateNewProcess = CreateNewProcess

    def Compute(self, CreateNewProcess):
        while True:
            dataServices.GetDataApiTwo()
            # do stuff
            # start next loop after a second

    def Process(self):
        self.CreateNewProcess(self.Compute)  # I want this function to run from a new process, so it won't interfere
I had to reorganize a few things. Among others:
- The guard if __name__ == '__main__': should include the creation of objects and especially calls to functions and methods. Usually it is placed at the global level near the end of the code.
- Child objects shouldn't be created in the main process. In theory you can do this to use them as containers for the data the child process needs and then send them as a parameter, but I think a separate class should be used for that if it is seen as necessary. Here I used a simple data parameter, which can be anything pickleable.
- It is cleaner to have a function at the global level as the process target (in my opinion).
Finally it looks like:
from multiprocessing import Process

class Main:
    @staticmethod
    def CreateNewProcess(data):
        p = Process(target=run_child, args=(data,))
        p.start()
        p.join()

class Foo:
    def Run(self, callbackfunction):
        while True:
            x = dataServices.GetDataApi()
            if x == 1020:
                callbackfunction(data)
            # start next loop after a minute

class Child:
    def __init__(self, data):
        self._data = data

    def Compute(self):
        while True:
            dataServices.GetDataApiTwo()
            # do stuff
            # start next loop after a second

# Target for the new process. It is cleaner to have a function outside of a
# class for this.
def run_child(data):
    # "data" represents one or more parameters passed from parent to child,
    # necessary to run the specific child. "data" must be pickleable.
    # It can be omitted if unnecessary.
    global child
    child = Child(data)
    child.Compute()

if __name__ == '__main__':
    foo = Foo()
    foo.Run(Main.CreateNewProcess)

Does a class instance that shares access between coroutines need a asyncio Lock?

For a project I am writing a class that holds data to be shared by multiple coroutines. A basic example would be:
import copy

class Data:
    def __init__(self):
        self.ls = []
        self.tiles = {}

    def add(self, element):
        self.ls.append(element)

    def rem(self, element):
        self.ls.remove(element)

    def set_tiles(self, t):
        self.tiles = t

    def get_tiles(self):
        return copy.deepcopy(self.tiles)
And it would be used like this:
async def test_coro(d):
    ...  # Do multiple things, including using all methods from d

test_data = Data()
# Simultaneously start many instances of `test_coro`, passing `test_data` to all of them
I'm struggling to understand when you would need to use a lock in a situation like this. My question about this code is: would I need to use an asyncio Lock at all, or would it be safe, since all variable access and assignment happens without any awaiting inside the class?
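For comparison, a minimal sketch (my addition, not from the original post; the function name is illustrative) of the access pattern that does need a lock: a read-modify-write that spans an await, where another coroutine can run in between:

import asyncio

tiles_lock = asyncio.Lock()

async def update_tile(d, key, value):
    async with tiles_lock:
        tiles = d.get_tiles()      # read
        await asyncio.sleep(0)     # any await here lets other coroutines run
        tiles[key] = value         # modify
        d.set_tiles(tiles)         # write back

As long as each Data method runs from call to return without awaiting, it cannot be interleaved with other coroutines on the same event loop, so the class as written should not need a lock.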

How to kill a subprocess initiated by a different function in the same class

I want to kill a process from one function in a class, given that it was started by a different function of the same class. Here's an example:
import time

class foo_class:
    global foo_global

    def foo_start(self):
        import subprocess
        self.foo_global = subprocess.Popen(['a daemon service'])

    def foo_stop(self):
        self.foo_start.foo_global.kill()
        self.foo_start.foo_global.wait()

foo_class().foo_start()
time.sleep(5)
foo_class().foo_stop()
How should I define foo_stop?
jterrace's code works. If you don't want it to start when you initialize, just call Popen in a separate function and pass nothing to the __init__ function:
import subprocess
import time

class foo_class(object):
    def __init__(self):
        pass

    def start(self):
        self.foo = subprocess.Popen(['a daemon service'])

    def stop(self):
        self.foo.kill()
        self.foo.wait()  # don't know if this is necessary?

    def restart(self):
        self.start()

foo = foo_class()
foo.start()
time.sleep(5)
foo.stop()
I'm guessing you want something like this:
import subprocess
import time

class foo_class(object):
    def __init__(self):
        self.foo = None

    def start(self):
        self.stop()
        # Popen starts the process immediately; no separate start() call is needed
        self.foo = subprocess.Popen(['a daemon service'])

    def stop(self):
        if self.foo:
            self.foo.kill()
            self.foo.wait()
            self.foo = None

foo = foo_class()
foo.start()
time.sleep(5)
foo.stop()
Some things I've changed:
Imports should generally go at the top of the file.
Classes should inherit from object.
You want to use an instance variable.
It doesn't make much sense for your class's method names to start with the class name.
You were creating a new instance of foo_class each time you called one of its methods. Instead, you want to create a single instance and call the methods on it.

thread class instance creating a thread function

I have a thread class; in it, I want to create a thread function to do its job concurrently with the thread instance. Is it possible, and if yes, how?
The run function of the thread class does a job every x seconds, exactly. I want to create a thread function to do a job in parallel with the run function.
class Concurrent(threading.Thread):
    def __init__(self, consType, consTemp):
        # something

    def run(self):
        # make foo as a thread

    def foo(self):
        # something
If not, think about the case below. Is it possible, and how?
class Concurrent(threading.Thread):
    def __init__(self, consType, consTemp):
        # something

    def run(self):
        # make foo as a thread

    def foo():
        # something
If it is unclear, please tell me and I will try to re-edit.
Just launch another thread. You already know how to create and start threads, so simply write another subclass of Thread and start() it alongside the ones you already have.
Change def foo() into a Thread subclass with run() instead of foo().
First of all, I suggest that you reconsider using threads. In most cases in Python you should use multiprocessing instead, because of Python's GIL, unless you are using Jython or IronPython.
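For completeness, a minimal multiprocessing sketch of that alternative (my addition; the worker function and its arguments are illustrative):

import multiprocessing

def foo(consType, consTemp):
    pass  # CPU-bound work here runs in its own process, unaffected by the GIL

if __name__ == '__main__':
    p = multiprocessing.Process(target=foo, args=('consType', 'consTemp'))
    p.start()
    p.join()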
If I understood you correctly, just open another thread inside the thread you already opened:
import threading

class FooThread(threading.Thread):
    def __init__(self, consType, consTemp):
        super(FooThread, self).__init__()
        self.consType = consType
        self.consTemp = consTemp

    def run(self):
        print('FooThread - I just started')
        # here will be the implementation of the foo function

class Concurrent(threading.Thread):
    def __init__(self, consType, consTemp):
        super(Concurrent, self).__init__()
        self.consType = consType
        self.consTemp = consTemp

    def run(self):
        print('Concurrent - I just started')
        threadFoo = FooThread('consType', 'consTemp')
        threadFoo.start()
        # do something every X seconds

if __name__ == '__main__':
    thread = Concurrent('consType', 'consTemp')
    thread.start()
The output of the program will be:
Concurrent - I just started
FooThread - I just started
