Python: sharing class variables across threads

I have a counter (training_queue) shared among many instances of a class. The class inherits threading.Thread, so it implements a run() method. When I call start(), I expect each thread to increment this counter, so when it reaches a limit no more threads are started. However, none of the threads modifies the variable. Here's the code:
class Engine(threading.Thread):
    training_mutex = threading.Semaphore(MAX_TRAIN)
    training_queue = 0
    analysis_mutex = threading.Semaphore(MAX_ANALYSIS)
    analysis_queue = 0
    variable_mutex = threading.Lock()

    def __init__(self, config):
        threading.Thread.__init__(self)
        self.config = config
        self.deepnet = None
        # prevents engine from doing analysis while training
        self.analyze_lock = threading.Lock()

    def run(self):
        with self.variable_mutex:
            self.training_queue += 1
            print self.training_queue
        with self.training_mutex:
            with self.analyze_lock:
                self.deepnet = self.loadLSTM3Model()
I protect training_queue with a Lock, so it should be thread-safe. However, if I print its value, it's always 1. How does threading affect variable scope in this case?

Your understanding of how state is shared between threads is correct. However, you are using the instance attribute training_queue instead of the class attribute training_queue.
That is, each new object sets its own training_queue to 1.
For example:
import threading

class Engine(threading.Thread):
    training_queue = 0
    print_lock = threading.Lock()

    def __init__(self, config):
        threading.Thread.__init__(self)

    def run(self):
        with Engine.print_lock:
            self.training_queue += 1
            print self.training_queue

Engine('a').start()
Engine('b').start()
Engine('c').start()
Engine('d').start()
Engine('e').start()
Will return:
1
1
1
1
1
But:
import threading

class Engine(threading.Thread):
    training_queue = 0
    print_lock = threading.Lock()

    def __init__(self, config):
        threading.Thread.__init__(self)

    def run(self):
        with Engine.print_lock:
            Engine.training_queue += 1  # <- here
            print self.training_queue

Engine('a').start()
Engine('b').start()
Engine('c').start()
Engine('d').start()
Engine('e').start()
Returns:
1
2
3
4
5
Note self.training_queue vs Engine.training_queue
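The mechanism behind this: reading self.training_queue falls back to the class attribute, but the assignment half of += binds a new attribute on the instance, which then shadows the class value. A quick illustration outside of threading (the class name here is just a stand-in, not the original Engine):

class Demo(object):
    training_queue = 0

d = Demo()
d.training_queue += 1          # reads 0 from the class, stores 1 on the instance
print(d.training_queue)        # 1
print(Demo.training_queue)     # still 0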
btw. I think += in Python should be atomic, so I wouldn't bother with the lock. However, note the usage of the lock for printing to stdout in the example above.

Related

Call existing class function using Threading or Concurrent Futures - Python

I would like to know if you can call an existing class method using threading or concurrent.futures without any issues. It works, but I'm curious about the implications of doing this. The reason I want to do this is that I want to keep class state information.
This is a sample with the general idea.
import concurrent.futures
import time

class Test:
    def __init__(self) -> None:
        super().__init__()
        self.counter = 0

    def run(self):
        print('Test')
        self.counter += 1

test1 = Test()
test2 = Test()

with concurrent.futures.ThreadPoolExecutor() as executer:
    while True:
        t1 = executer.submit(test1.run)
        t2 = executer.submit(test2.run)
        time.sleep(1.0)
Edit: the threads will use shared information downstream
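Submitting a bound method is generally fine: the executor simply calls it, and self carries the state between calls. A minimal sketch of the idea, assuming several workers may update the same instance and therefore guarding the counter with a per-instance lock (the _lock attribute is an illustration, not part of the original code):

import concurrent.futures
import threading

class Test:
    def __init__(self) -> None:
        self.counter = 0
        self._lock = threading.Lock()   # illustrative: protects counter under concurrent updates

    def run(self):
        with self._lock:
            self.counter += 1

test1 = Test()
with concurrent.futures.ThreadPoolExecutor() as executor:
    futures = [executor.submit(test1.run) for _ in range(100)]
    concurrent.futures.wait(futures)
print(test1.counter)   # 100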

In a parent process, how to see child variables that are managed by child processes?

I defined a class Node, which defines a listener service to constantly communicate and update local variables. The listener is started using multiprocessing. The class looks like:
# Pseudo-code
import multiprocessing

class Node(object):
    def __init__(self, x):
        self.variable = x

    def listener(self):
        while(True):
            COMMUNICATE WITH OTHERS   # Pseudo-code
            UPDATE self.variable      # Pseudo-code
            print(self.variable)      # local printer

    def run(self):
        p = multiprocessing.Process(target=self.listener)
        p.start()
In the main process, I created two nodes a = Node(x1), b = Node(x2), and let them run
if __name__ == "__main__":
    x1 = 1     # for example
    x2 = 1000  # for example
    a = Node(x1)
    b = Node(x2)
    a.run()
    b.run()
    while(True):
        print(a.variable)  # global printer
        print(b.variable)  # global printer
In this way, Node-a communicates with Node-b and updates its variables, and so does Node-b.
Now I run into a problem: the local printers output the updated variable values correctly, but the global printers do not. Actually, the global printers always output the unchanged values (x1, x2, same as the initial values).
What's wrong with the code? Or how can I see the child processes' variables?
You won't be able to do that unless you use some mechanism to communicate with the parent. I recommend you use a Manager dict.
import random
import time
import multiprocessing as mp

class Child(mp.Process):
    def __init__(self, shared_variables):
        super(Child, self).__init__()
        self.shared_variables = shared_variables

    def run(self):
        for _ in range(5):  # Change shared variable value 5 times
            self.shared_variables['var'] = random.randint(0, 10)
            self.shared_variables['var1'] = random.randint(0, 10)
            time.sleep(3)

if __name__ == "__main__":
    shared_variables = mp.Manager().dict()
    child = Child(shared_variables)
    child.start()
    while True:
        print('From parent')
        for k, v in shared_variables.items():
            print(f'-> {k}: {v}')
        print('***********')
        time.sleep(3)
And the output would look like this:
From parent
-> var: 8
-> var1: 6
***********
From parent
-> var: 7
-> var1: 7
***********
....
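If the shared state is a single number rather than a mapping, a multiprocessing.Value would also work and avoids the Manager process entirely. A minimal sketch under that assumption:

import multiprocessing as mp
import time

def listener(variable):
    for i in range(5):
        with variable.get_lock():
            variable.value = i       # child updates the shared value
        time.sleep(1)

if __name__ == "__main__":
    variable = mp.Value('i', 0)      # 'i' = C int stored in shared memory
    p = mp.Process(target=listener, args=(variable,))
    p.start()
    for _ in range(5):
        print(variable.value)        # parent sees the child's updates
        time.sleep(1)
    p.join()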

cannot increment value when using threadpool in python

I am using the class provided by Python Thread Pool (Python recipe) to simulate thread pooling. I am trying to increment the value counter in the function test, but it stays 0. I used the lock explained in Is this simple python code thread safe, but it still doesn't work.
Source code
#! /usr/bin/python
# -*- coding: utf-8 -*-

from Queue import Queue
from threading import Thread
import threading

lock = threading.Lock()

class Worker(Thread):
    """Thread executing tasks from a given tasks queue"""
    def __init__(self, tasks):
        Thread.__init__(self)
        self.tasks = tasks
        self.daemon = True
        self.start()

    def run(self):
        while True:
            func, args, kargs = self.tasks.get()
            try: func(*args, **kargs)
            except Exception, e: print e
            self.tasks.task_done()

class ThreadPool:
    """Pool of threads consuming tasks from a queue"""
    def __init__(self, num_threads):
        self.tasks = Queue(num_threads)
        for _ in range(num_threads): Worker(self.tasks)

    def add_task(self, func, *args, **kargs):
        """Add a task to the queue"""
        self.tasks.put((func, args, kargs))

    def wait_completion(self):
        """Wait for completion of all the tasks in the queue"""
        self.tasks.join()

def exp1_thread(counter):
    with lock:
        print counter
        counter = counter + 1

def test():
    # 1) Init a Thread pool with the desired number of threads
    pool = ThreadPool(6)
    counter = 0
    for i in range(0, 10):
        pool.add_task(exp1_thread, counter)
    # 3) Wait for completion
    pool.wait_completion()

if __name__ == "__main__":
    test()
output
counter 0
counter 0
counter 0
counter 0
counter 0
counter 0
counter 0
counter 0
counter 0
counter 0
The reason you are getting all zeros is that the integer value counter is passed by value to the threads. Each thread receives a copy of the counter and then goes to town.
You can fix this by finding a way to pass the value by reference.
Option 1. Define counter as a list you pass around:
def exp1_thread(counter):
    with lock:
        print counter[0]
        counter[0] = counter[0] + 1

def test():
    # 1) Init a Thread pool with the desired number of threads
    pool = ThreadPool(6)
    counter = [0]
    for i in range(0, 10):
        pool.add_task(exp1_thread, counter)
    # 3) Wait for completion
    pool.wait_completion()
Option 2. Create an object you pass around.
class Counter:
    def __init__(self, initial_count):
        self.count = initial_count

def exp1_thread(counter):
    with lock:
        print counter.count
        counter.count = counter.count + 1

def test():
    # 1) Init a Thread pool with the desired number of threads
    pool = ThreadPool(6)
    counter = Counter(0)
    for i in range(0, 10):
        pool.add_task(exp1_thread, counter)
    # 3) Wait for completion
    pool.wait_completion()
This has nothing to do with threading. The actual reason is that int is immutable in Python.
A function that just increments an int would not have the desired effect.
def inc(x):
    x += 1

y = 0
inc(y)
print y  # 0
If you want to increment the number you can store it in a mutable datatype (such as a list or dict) and manipulate the list.
ints work differently from dictionaries; I'm sure someone well versed in Python internals can explain the difference. This is the approach I usually use, and it works. As I just realized, it's because you're passing the object (dict, list, etc.) and not the value itself.
Either declare your variables as globals (but be careful) or use, say, a dictionary with individual key slots for the different threads and sum them up at the end (a sketch of that variant follows the example below).
from threading import *
from time import sleep

myMap = {'counter': 0}

class worker(Thread):
    def __init__(self, counterMap):
        Thread.__init__(self)
        self.counterMap = counterMap
        self.start()

    def run(self):
        self.counterMap['counter'] += 1

worker(myMap)
sleep(0.2)
worker(myMap)
sleep(0.2)

print(myMap)
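For the "individual key slots" variant mentioned above, each thread writes only to its own key, so no lock is needed, and the results are summed at the end. A sketch; the key names are made up for illustration:

from threading import Thread

counts = {}

class Worker(Thread):
    def __init__(self, key):
        Thread.__init__(self)
        self.key = key

    def run(self):
        # each thread only ever touches its own slot
        counts[self.key] = counts.get(self.key, 0) + 1

workers = [Worker('t%d' % i) for i in range(4)]
for w in workers:
    w.start()
for w in workers:
    w.join()

print(sum(counts.values()))  # 4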

class method process is not changing objects' attributes

It's my second day with Python. I find it's a really cool language and I want to try different things in it.
Is it possible to call an object and create a daemon of that object's method which would change the object's attributes?
from multiprocessing import Process
import time

class Foo(object):
    def __init__(self):
        self.number = 1
        # this attribute...

    def loop(self):
        while 1:
            print self.number
            # ...is changed here
            self.number += 1
            time.sleep(1)

if __name__ == '__main__':
    f = Foo()
    p = Process(target=f.loop)
    p.daemon = True  # this makes it work in the background
    p.start()

    # proceed with the main loop...
    while 1:
        time.sleep(1)
        print f.number * 10
The result:
1
10
2
10
3
10
4
10
...
Why doesn't f.loop() change f's self.number? They are both part of the same class Foo.
What can I change to receive this output:
1
10
2
20
3
30
4
40
...
/edit 1:
I tried this, with the same result (why?):
class Foo(Process):
    def __init__(self):
        super(Foo, self).__init__()
        self.daemon = True          # is daemon
        self.number = 1
        self._target = self.loop    # on start() it will run loop()

    def loop(self):
        while 1:
            print self.number
            self.number += 1
            time.sleep(1)

if __name__ == '__main__':
    f = Foo()   # is now Process
    f.start()   # runs f.loop()
    while 1:
        time.sleep(1)
        print f.number * 10
Same output as before.
You're using multiprocessing. The short (and somewhat simplified) answer is that processes do not share memory by default. Try using threading instead.
If you're hell-bent on experimenting with shared memory and processes, then look at sharing state in the documentation on multiprocessing.
Also, daemon doesn't do what you think it does. If a process creates children, it will attempt to kill all its daemonic children when it exits. All Processes run in the background; you just need to start them.
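As a concrete illustration of the "sharing state" route the answer points to, the single integer could live in a multiprocessing.Value instead of a plain attribute. A minimal sketch, not the only way to do it; timing still decides the exact interleaving of the two printers:

from multiprocessing import Process, Value
import time

class Foo(object):
    def __init__(self):
        self.number = Value('i', 1)   # shared between parent and child

    def loop(self):
        while True:
            print(self.number.value)
            with self.number.get_lock():
                self.number.value += 1
            time.sleep(1)

if __name__ == '__main__':
    f = Foo()
    p = Process(target=f.loop)
    p.daemon = True
    p.start()
    while True:
        time.sleep(1)
        print(f.number.value * 10)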

Python threading. How do I lock a thread?

I'm trying to understand the basics of threading and concurrency. I want a simple case where two threads repeatedly try to access one shared resource.
The code:
import threading

class Thread(threading.Thread):
    def __init__(self, t, *args):
        threading.Thread.__init__(self, target=t, args=args)
        self.start()

count = 0
lock = threading.Lock()

def increment():
    global count
    lock.acquire()
    try:
        count += 1
    finally:
        lock.release()

def bye():
    while True:
        increment()

def hello_there():
    while True:
        increment()

def main():
    hello = Thread(hello_there)
    goodbye = Thread(bye)
    while True:
        print count

if __name__ == '__main__':
    main()
So, I have two threads, both trying to increment the counter. I thought that if thread 'A' called increment(), the lock would be established, preventing 'B' from accessing until 'A' has released.
Running the code makes it clear that this is not the case. You get all of the random data-race-ish increments.
How exactly is the lock object used?
Additionally, I've tried putting the locks inside of the thread functions, but still no luck.
You can see that your locks are pretty much working as you are using them if you slow the process down and make the threads block a bit longer. You had the right idea: surround the critical pieces of code with the lock. Here is a small adjustment to your example to show how each thread waits on the other to release the lock.
import threading
import time
import inspect

class Thread(threading.Thread):
    def __init__(self, t, *args):
        threading.Thread.__init__(self, target=t, args=args)
        self.start()

count = 0
lock = threading.Lock()

def incre():
    global count
    caller = inspect.getouterframes(inspect.currentframe())[1][3]
    print "Inside %s()" % caller
    print "Acquiring lock"
    with lock:
        print "Lock Acquired"
        count += 1
        time.sleep(2)

def bye():
    while count < 5:
        incre()

def hello_there():
    while count < 5:
        incre()

def main():
    hello = Thread(hello_there)
    goodbye = Thread(bye)

if __name__ == '__main__':
    main()
Sample output:
...
Inside hello_there()
Acquiring lock
Lock Acquired
Inside bye()
Acquiring lock
Lock Acquired
...
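The next snippet takes the opposite tack and demonstrates the race itself: two threads bump a plain global with no lock, and, depending on the interpreter version and scheduling, some iterations end with x noticeably below the expected 200000. Wrapping x += 1 in a Lock, as in the example above, makes the total come out right every time.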
import threading

# global variable x
x = 0

def increment():
    """
    function to increment global variable x
    """
    global x
    x += 1

def thread_task():
    """
    task for thread
    calls increment function 100000 times.
    """
    for _ in range(100000):
        increment()

def main_task():
    global x
    # setting global variable x as 0
    x = 0
    # creating threads
    t1 = threading.Thread(target=thread_task)
    t2 = threading.Thread(target=thread_task)
    # start threads
    t1.start()
    t2.start()
    # wait until threads finish their job
    t1.join()
    t2.join()

if __name__ == "__main__":
    for i in range(10):
        main_task()
        print("Iteration {0}: x = {1}".format(i, x))
