Cannot change class variables with multiprocessing.Process object in Python3

If I write a class with a class variable, generate two class objects, and change the value of the class variable by using a method of one of the two objects, the class variable value is of course also changed for the other object. Here's what I mean in code:
class DemoClass:
    ClassVariable = False

    def __init__(self):
        pass

    def do(self):
        print(DemoClass.ClassVariable)
        DemoClass.ClassVariable = True

class1 = DemoClass()
class1.do()  # False
class2 = DemoClass()
class2.do()  # True
However, if I do the same with multiprocessing.Process, it does not work. The class variable value will only change for the object that changed it:
import multiprocessing

class DemoProcess(multiprocessing.Process):
    ClassVariable = False

    def __init__(self):
        multiprocessing.Process.__init__(self)

    def run(self):
        print(DemoProcess.ClassVariable)
        DemoProcess.ClassVariable = True
        print(DemoProcess.ClassVariable)

if __name__ == '__main__':
    process_list = []
    p1 = DemoProcess()
    process_list.append(p1)
    p1.start()  # False True
    p2 = DemoProcess()
    process_list.append(p2)
    p2.start()  # False True; should be: True True
    for p in process_list:
        p.join()
The code behaves as if each process generates a new class variable. Am I doing something wrong?

With the help of the commenters on my original question I came to the conclusion that I had not yet understood how processes work.
Every DemoProcess.start() creates a new process which cannot share its class variables with other processes.
I solved the issue by using a multiprocessing.Value object, as Mike McKerns proposed in the comments. The value of this object can be shared with multiple processes.
import multiprocessing

class DemoProcess(multiprocessing.Process):
    def __init__(self, class_variable):
        multiprocessing.Process.__init__(self)
        self.class_variable = class_variable

    def run(self):
        print(self.class_variable.value)
        with self.class_variable.get_lock():
            self.class_variable.value = True
        print(self.class_variable.value)

if __name__ == '__main__':
    ClassVariable = multiprocessing.Value('b', False)
    process_list = []
    p1 = DemoProcess(ClassVariable)
    process_list.append(p1)
    p1.start()  # Output: 0 1
    p2 = DemoProcess(ClassVariable)
    process_list.append(p2)
    p2.start()  # Output: 1 1
    for p in process_list:
        p.join()


Changing values of list in multiprocessing

I am new to Python multiprocessing; here is some background on the code below. I am trying to create three processes: one to add an element to a list, one to modify elements in the list, and one to print the list.
The three processes should ideally use the same list, which lives in shared memory and is initiated using a manager.
The problem I face is that testprocess2 is not able to set the value to 0; basically, it is not able to alter the list.
from multiprocessing import Process, Manager, Lock
from time import sleep
import random

class Trade:
    def __init__(self, id):
        self.exchange = None
        self.order_id = id

class testprocess2(Process):
    def __init__(self, trades, lock):
        super().__init__(args=(trades, lock))
        self.trades = trades
        self.lock = lock

    def run(self):
        while True:
            # lock.acquire()
            print("Altering")
            for idx in range(len(self.trades)):
                self.trades[idx].order_id = 0
            # lock.release()
            sleep(1)

class testprocess1(Process):
    def __init__(self, trades, lock):
        super().__init__(args=(trades, lock))
        self.trades = trades
        self.lock = lock

    def run(self):
        while True:
            print("start")
            for idx in range(len(self.trades)):
                print(self.trades[idx].order_id)
            sleep(1)

class testprocess(Process):
    def __init__(self, trades, lock):
        super().__init__(args=(trades, lock))
        self.trades = trades
        self.lock = lock

    def run(self):
        while True:
            # lock.acquire()
            n = random.randint(0, 9)
            print("adding random {}".format(n))
            self.trades.append(Trade(n))
            # lock.release()
            # print(trades)
            sleep(5)

if __name__ == "__main__":
    with Manager() as manager:
        records = manager.list([Trade(5)])
        lock = Lock()
        p1 = testprocess(records, lock)
        p1.start()
        p2 = testprocess1(records, lock)
        p2.start()
        p3 = testprocess2(records, lock)
        p3.start()
        p1.join()
        p2.join()
        p3.join()
Strictly speaking, your managed list is not in shared memory, and it is very important to understand what is going on. The actual list holding your Trade instances resides in a process that is created when you execute the Manager() call. When you then execute records = manager.list([Trade(5)]), records is not a direct reference to that list because, as I said, we are not dealing with shared memory. It is instead a special proxy object that implements the same methods as a list. When you, for example, invoke append on this proxy object, it takes the argument you are trying to append, serializes it, and transmits it to the manager's process via a socket or pipe, where it gets de-serialized and appended to the actual list. In short, operations on the proxy object are turned into remote method calls.
Now for your problem. You are trying to reset the order_id attribute with the following statement:
self.trades[idx].order_id = 0
Since we are dealing with a remote list via a proxy object, the above statement unfortunately becomes the equivalent of:
trade = self.trades[idx] # fetch object from the remote list
trade.order_id = 0 # reset the order_id to 0 on the local copy
What is missing is updating the list with the newly updated trade object:
self.trades[idx] = trade
So your single update statement really needs to be replaced with the above 3-statement sequence.
I have also taken the liberty of modifying your code in several ways.
The PEP8 Style Guide for Python Code recommends that class names be capitalized.
Since all of your process classes are constructed identically (i.e. have identical __init__ methods), I have created an abstract base class, TestProcess, that these classes inherit from. All they have to do is provide a run method.
I have made these processes daemon processes. That means they terminate automatically when the main process terminates. I did this for demo purposes so that the program does not loop endlessly; the main process terminates after 15 seconds.
You do not need to pass the trades and lock arguments to the __init__ method of the Process class. If you were not deriving your classes from Process and you simply wanted your newly created process to run a function foo that takes arguments trades and lock, then you would specify p1 = Process(target=foo, args=(trades, lock)). That is the real purpose of the args argument, i.e. to be used together with the target argument. See the documentation for the threading.Thread class for details. I actually see very little value in deriving your classes from multiprocessing.Process (by not doing so there is better opportunity for reuse). But since you did, you are already setting the instance attributes self.trades and self.lock in your __init__ method, and they will be used when your run method is invoked implicitly by your calling the start method. There is nothing further you need to do. See the two additional code examples at the end.
from multiprocessing import Process, Manager, Lock
from time import sleep
import random
from abc import ABC, abstractmethod

class Trade:
    def __init__(self, id):
        self.exchange = None
        self.order_id = id

class TestProcess(Process, ABC):
    def __init__(self, trades, lock):
        Process.__init__(self, daemon=True)
        self.trades = trades
        self.lock = lock

    @abstractmethod
    def run(self):
        pass

class TestProcess2(TestProcess):
    def run(self):
        while True:
            # lock.acquire()
            print("Altering")
            for idx in range(len(self.trades)):
                trade = self.trades[idx]
                trade.order_id = 0
                # We must tell the managed list that it has been updated!!!:
                self.trades[idx] = trade
            # lock.release()
            sleep(1)

class TestProcess1(TestProcess):
    def run(self):
        while True:
            print("start")
            for idx in range(len(self.trades)):
                print(f'index = {idx}, order id = {self.trades[idx].order_id}')
            sleep(1)

class TestProcess(TestProcess):
    def run(self):
        while True:
            # lock.acquire()
            n = random.randint(0, 9)
            print("adding random {}".format(n))
            self.trades.append(Trade(n))
            # lock.release()
            # print(trades)
            sleep(5)

if __name__ == "__main__":
    with Manager() as manager:
        records = manager.list([Trade(5)])
        lock = Lock()
        p1 = TestProcess(records, lock)
        p1.start()
        p2 = TestProcess1(records, lock)
        p2.start()
        p3 = TestProcess2(records, lock)
        p3.start()
        sleep(15)  # run for 15 seconds
Using classes not derived from multiprocessing.Process
from multiprocessing import Process, Manager, Lock
from time import sleep
import random
from abc import ABC, abstractmethod

class Trade:
    def __init__(self, id):
        self.exchange = None
        self.order_id = id

class TestProcess(ABC):
    def __init__(self, trades, lock):
        self.trades = trades
        self.lock = lock

    @abstractmethod
    def process(self):
        pass

class TestProcess2(TestProcess):
    def process(self):
        while True:
            # lock.acquire()
            print("Altering")
            for idx in range(len(self.trades)):
                trade = self.trades[idx]
                trade.order_id = 0
                # We must tell the managed list that it has been updated!!!:
                self.trades[idx] = trade
            # lock.release()
            sleep(1)

class TestProcess1(TestProcess):
    def process(self):
        while True:
            print("start")
            for idx in range(len(self.trades)):
                print(f'index = {idx}, order id = {self.trades[idx].order_id}')
            sleep(1)

class TestProcess(TestProcess):
    def process(self):
        while True:
            # lock.acquire()
            n = random.randint(0, 9)
            print("adding random {}".format(n))
            self.trades.append(Trade(n))
            # lock.release()
            # print(trades)
            sleep(5)

if __name__ == "__main__":
    with Manager() as manager:
        records = manager.list([Trade(5)])
        lock = Lock()
        tp = TestProcess(records, lock)
        p1 = Process(target=tp.process, daemon=True)
        p1.start()
        tp1 = TestProcess1(records, lock)
        p2 = Process(target=tp1.process, daemon=True)
        p2.start()
        tp2 = TestProcess2(records, lock)
        p3 = Process(target=tp2.process, daemon=True)
        p3.start()
        sleep(15)  # run for 15 seconds
Using functions instead of classes derived from multiprocessing.Process
from multiprocessing import Process, Manager, Lock
from time import sleep
import random

class Trade:
    def __init__(self, id):
        self.exchange = None
        self.order_id = id

def testprocess2(trades, lock):
    while True:
        # lock.acquire()
        print("Altering")
        for idx in range(len(trades)):
            trade = trades[idx]
            trade.order_id = 0
            # We must tell the managed list that it has been updated!!!:
            trades[idx] = trade
        # lock.release()
        sleep(1)

def testprocess1(trades, lock):
    while True:
        print("start")
        for idx in range(len(trades)):
            print(f'index = {idx}, order id = {trades[idx].order_id}')
        sleep(1)

def testprocess(trades, lock):
    while True:
        # lock.acquire()
        n = random.randint(0, 9)
        print("adding random {}".format(n))
        trades.append(Trade(n))
        # lock.release()
        # print(trades)
        sleep(5)

if __name__ == "__main__":
    with Manager() as manager:
        records = manager.list([Trade(5)])
        lock = Lock()
        p1 = Process(target=testprocess, args=(records, lock), daemon=True)
        p1.start()
        p2 = Process(target=testprocess1, args=(records, lock), daemon=True)
        p2.start()
        p3 = Process(target=testprocess2, args=(records, lock), daemon=True)
        p3.start()
        sleep(15)  # run for 15 seconds

Sharing a Queue created within a process - Python multiprocessing

I wish to have a list of Queues shared between processes. The idea is that from a "main" process, I can pipe whatever information I want to one of the other processes, but the number of other processes isn't determined.
I cannot create the Queue in the "main" process. I am simulating a decentralised system, and creating the Queue in the main process does not fit this paradigm. As such, the Queues must be created within the other processes.
This poses a difficulty, as I can't find how to share these Queues with the main process. I have a managed list using multiprocessing.Manager, but if I append a multiprocessing.Queue to it, I get:
RuntimeError: Queue objects should only be shared between processes
through inheritance
Appending a standard data type such as an integer works just fine.
MRE below:
import multiprocessing as mp
from time import sleep

class test:
    def __init__(self, qlist):
        self.qlist = qlist
        self.q = mp.Queue()
        qlist.append(4)
        self.next = None
        self.run()

    def run(self):
        while True:
            val = self.q.get()
            if val == 1:
                p = mp.Process(target=test, args=(self.qlist, ))
                p.start()
            else:
                print(val)

if __name__ == '__main__':
    manager = mp.Manager()
    qlist = manager.list()
    p = mp.Process(target=test, args=(qlist, ))
    p.start()
    sleep(0.5)
    print(qlist)
    p.join()
The idea would be that in the if __name__ == '__main__': code, I could look through qlist and select one of the Queues to pipe information to, such as qlist[2].put(1) to add a test object, or qlist[3].put("Hello") to print "Hello".
The best-case scenario would be to have a list of test objects (where each test object has its self.q attribute for accessing its Queue) that I could access from the "main" process, but I'm even less sure of how to do that, hence why I'm asking about the Queues.
Any help with this would be greatly appreciated.
You can definitely create queue instances; this already occurs in your test.__init__ method with the statement self.q = mp.Queue(). The problem is that a multiprocessing queue cannot be added to a managed list. Here is your program, slightly modified so that it does not attempt to add the queues to a managed list. I have also made your test class (now renamed Test) a subclass of Process, and it will now terminate:
import multiprocessing as mp

class Test(mp.Process):
    def __init__(self, value):
        mp.Process.__init__(self)
        self.value = value
        self.q = mp.Queue()
        self.q.put(value)
        self.next = None

    def run(self):
        value = self.q.get()
        print('value = ', value)
        value -= 1
        if value > 0:
            Test(value).start()

if __name__ == '__main__':
    Test(4).start()
Prints:
value = 4
value = 3
value = 2
value = 1
If you want to maintain a list of objects, then it would be better if Test is not a subclass of Process:
import multiprocessing as mp
import time

class Test():
    def __init__(self, lst, value):
        lst.append(self)
        self.lst = lst
        self.value = value
        self.q = mp.Queue()
        self.q.put(value)
        self.next = None

    def run(self):
        value = self.q.get()
        print('value = ', value)
        value -= 1
        if value > 0:
            test = Test(self.lst, value)
            mp.Process(target=test.run).start()

if __name__ == '__main__':
    manager = mp.Manager()
    lst = manager.list()
    test = Test(lst, 4)
    mp.Process(target=test.run).start()
    time.sleep(3)
    print(lst)
Prints:
value = 4
value = 3
value = 2
value = 1
[<__mp_main__.Test object at 0x0000028E6DAD5DC0>, <__mp_main__.Test object at 0x0000028E6DAD5FA0>, <__mp_main__.Test object at 0x0000028E6DAD5E50>, <__mp_main__.Test object at 0x0000028E6DAD5D90>]
But here is a big BUT:
Each of those objects "live" in a different address space and the references can only have meaning when accessed from the original address space they were created in. So this is pretty useless:
import multiprocessing as mp
import time

class Test():
    def __init__(self, lst, value):
        lst.append(self)
        self.lst = lst
        self.value = value
        self.q = mp.Queue()
        self.q.put(value)
        self.next = None

    def run(self):
        value = self.q.get()
        print('value = ', value)
        value -= 1
        if value > 0:
            test = Test(self.lst, value)
            mp.Process(target=test.run).start()

if __name__ == '__main__':
    manager = mp.Manager()
    lst = manager.list()
    test = Test(lst, 4)
    mp.Process(target=test.run).start()
    time.sleep(3)
    print(test, test.__class__, test.value)
    print(lst)
    for elem in lst:
        print(type(elem))
        print(elem.value)
Prints:
value = 4
value = 3
value = 2
value = 1
<__main__.Test object at 0x0000020E52E6A640> <class '__main__.Test'> 4
[<__mp_main__.Test object at 0x0000016827704DC0>, <__mp_main__.Test object at 0x0000016827704FA0>, <__mp_main__.Test object at 0x0000016827704250>, <__mp_main__.Test object at 0x0000016827704D90>]
<class '__main__.Test'>
Traceback (most recent call last):
File "C:\Ron\test\test.py", line 31, in <module>
print(elem.value)
AttributeError: 'Test' object has no attribute 'value'

Communicating between processes Python

I am trying to work out how one process can tell another process that some values have changed.
import multiprocessing
import time

class Consumer(multiprocessing.Process):
    def __init__(self, share):
        super().__init__()
        self.share = share

    def run(self):
        print(self.share)
        self.share = "xxx"

share = "ssss"
A = Consumer(share)
B = Consumer(share)

if __name__ == '__main__':
    A = Consumer(share)
    A.start()
    time.sleep(5)
    B = Consumer(share)
    B.start()
I am expecting "xxx" to be printed when B runs, but got the initial value "ssss" instead.
After some research, I found that multiprocessing's manager can be used to achieve this. But due to concerns about speed (i.e. 100 processes with high-frequency access to the shared value), the lock would become a bottleneck.
Is there a way to lock the object when changing the value, but not when reading it?
Use a manager to share objects across processes:
import multiprocessing
import time

class Consumer(multiprocessing.Process):
    def __init__(self, manager_namespace):
        super().__init__()
        self.share = manager_namespace

    def run(self):
        print(self.share.myString)
        self.share.myString = "xxx"

if __name__ == '__main__':
    manager = multiprocessing.Manager()
    namespace = manager.Namespace()
    namespace.myString = 'sss'
    A = Consumer(namespace)
    A.start()
    time.sleep(5)
    B = Consumer(namespace)
    B.start()
At least in my system it gives the required output.

In a parent process, how to see child variables that are managed by child processes?

I defined a class Node, which defines a listener service to constantly communicate and update local variables. The listener is started using multiprocessing. The class looks like:
# Pseudo-code
import multiprocessing

class Node(object):
    def __init__(self, x):
        self.variable = x

    def listener(self):
        while True:
            COMMUNICATE WITH OTHERS  # Pseudo-code
            UPDATE self.variable     # Pseudo-code
            print(self.variable)     # local printer

    def run(self):
        p = multiprocessing.Process(target=self.listener)
        p.start()
In the main process, I created two nodes a = Node(x1), b = Node(x2), and let them run
if __name__ == "__main__":
    x1 = 1     # for example
    x2 = 1000  # for example
    a = Node(x1)
    b = Node(x2)
    a.run()
    b.run()
    while True:
        print(a.variable)  # global printer
        print(b.variable)  # global printer
In this way, Node-a communicates with Node-b and updates its variables, and so does Node-b.
Now I come with a problem: Local printers output updated variable values correctly, but global printers do not. Actually, the global printers always output unchanged values (x1, x2, same as initial).
What's wrong with the code? Or how can I see the child processes' variables?
You won't be able to do that unless you use some mechanism to communicate with the parent. I recommend you use a Manager dict.
import random
import time
import multiprocessing as mp

class Child(mp.Process):
    def __init__(self, shared_variables):
        super(Child, self).__init__()
        self.shared_variables = shared_variables

    def run(self):
        for _ in range(5):  # Change shared variable value 5 times
            self.shared_variables['var'] = random.randint(0, 10)
            self.shared_variables['var1'] = random.randint(0, 10)
            time.sleep(3)

if __name__ == "__main__":
    shared_variables = mp.Manager().dict()
    child = Child(shared_variables)
    child.start()
    while True:
        print('Form parent')
        for k, v in shared_variables.items():
            print(f'-> {k}: {v}')
        print('***********')
        time.sleep(3)
And the output would look like this:
Form parent
-> var: 8
-> var1: 6
***********
Form parent
-> var: 7
-> var1: 7
***********
....

python concurrency for instances of same class using Object Oriented methods

I want two objects of the same class to operate concurrently. The class "MyClass" has a function that connects an instance to another instance of the class. I also need to keep track of the objects that have been created (oList). What I am trying is:
main.py:
from MyClass import MyClass
from threading import Thread
import time

oList = []

class oCreator1(Thread):
    def __init__(self):
        Thread.__init__(self)
        self.o1 = MyClass()

    def run(self):
        while 1:
            time.sleep(1)
            print "Hi"

    def getO1(self):
        return self.o1

class oCreator2(Thread):
    def __init__(self):
        Thread.__init__(self)
        self.o2 = MyClass()

    def run(self):
        while 1:
            time.sleep(1)
            print "Bye!"

    def getO2(self):
        return self.o2

def main():
    threadList = []
    global oList
    oc1 = oCreator1()
    threadList.append(oc1)
    o1 = oc1.getO1()
    oList.append(o1)
    oc2 = oCreator2()
    threadList.append(oc2)
    o2 = oc2.getO2()
    oList.append(o2)
    o1.connToAnotherO(o2)
    print oList
    for t in threadList:
        t.start()
        t.join()

if __name__ == '__main__':
    main()
But the only thing that is printed is "Hi". I really want to know the things I'm doing wrong and the right way to do it. Thank you in advance.
for t in threadList:
    t.start()
    t.join()
The t.join() call waits for the thread t to finish. That means when you start the first thread, you wait for it to finish before starting the second, but the first thread is just going to keep printing Hi forever. It'll never finish.
Don't join, or don't start joining until all threads have started.
