Python multi-threading: Need advice to synchronize 2 threads using conditional variable

I only know the basic concepts of multi-threading, and I have run into a situation where I need some help.
I have two tasks to finish, and both should run continuously. The catch is that the second task should start only after the first thread has done some work. Right now the two thread classes look roughly like this:
import threading

finished = False  # shared flag

class first(threading.Thread):
    def __init__(self, cond, finished):
        threading.Thread.__init__(self)
        self.cond = cond
        self.finished = finished

    def run(self):
        self.cond.acquire()
        do_something()
        self.finished = True  # change the flag
        self.cond.notify()
        self.cond.release()
        do_something_else()

class second(threading.Thread):
    def __init__(self, cond, finished):
        threading.Thread.__init__(self)
        self.cond = cond
        self.finished = finished

    def run(self):
        self.cond.acquire()
        while self.finished == False:
            self.cond.wait()
        self.cond.release()
        do_something()
However, the program still runs the two tasks in an arbitrary order regardless of wait() and notify(). Can anybody help me with this issue? Thanks.

self.finished in class first is a copy of the value of the global finished, not a reference to it, so it has no live relationship to the self.finished of class second.
You should probably create a global Queue object (which is designed to be used with the threading module). Have both classes refer to the queue, and have the first thread write a go-ahead signal to the queue, and the second thread block until it reads the go-ahead.
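A minimal sketch of the queue-based go-ahead signal described above, assuming Python 3 (where the module is named queue). The names go_ahead, log, first_task, second_task and run_both are hypothetical, and the log entries merely stand in for do_something() and do_something_else():

```python
import queue
import threading

go_ahead = queue.Queue()  # shared by both threads
log = []                  # records the order of steps, for illustration only

def first_task():
    log.append("first: initial work")    # stands in for do_something()
    go_ahead.put("go")                   # tell the second thread it may start
    log.append("first: remaining work")  # stands in for do_something_else()

def second_task():
    go_ahead.get()                       # blocks until the signal arrives
    log.append("second: work")

def run_both():
    log.clear()
    # Start the second thread first to show that it really waits.
    threads = [threading.Thread(target=second_task),
               threading.Thread(target=first_task)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return list(log)
```

Because both threads refer to the same queue object rather than to a copied flag, the second task cannot proceed before the first one has put its signal on the queue.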

You can avoid synchronization altogether. Use 3 threads instead of 2.
Thread 1a 'does some job' and terminates.
Thread 1b starts where 1a ended, and
Thread 2 starts independently.
(Also I suppose you know that you cannot effectively share CPU with Python threads; these are only good for I/O waiting in parallel. When you need CPU-bound parallelization, you use multiprocessing.)

Related

Python3: How to stop/kill thread

My code runs N threads. I want to stop specific threads on some condition while the remaining threads continue running. I perform some operation once each thread finishes its job. Is there a way to stop a running thread in Python 3?
My current code is written in Python 2 and does this with "_Thread__stop()". Is there an equivalent in Python 3?

The usual practice is to "signal" the thread that it is time to finish, after which the thread itself exits. This is not killing, as when you kill a process, but regular state-machine behavior of your thread function.
For example, suppose your thread is looping. Insert an if statement inside the loop that makes the thread function break or return when stop is True. The stop variable should be shared with the main thread (or whichever thread needs to stop our thread), which sets it to True. Usually the stopping thread then waits for the thread to complete by calling join().
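The signal-then-join pattern described above might look like the following sketch. It uses a threading.Event as the shared stop variable instead of a plain boolean, which is a substitution on my part. The names worker, counter and run_and_stop are hypothetical:

```python
import threading
import time

def worker(stop, counter):
    while True:
        counter[0] += 1          # one unit of work
        # wait() doubles as a short sleep and returns True once stop is set
        if stop.wait(0.01):
            break

def run_and_stop():
    stop = threading.Event()
    counter = [0]
    t = threading.Thread(target=worker, args=(stop, counter))
    t.start()
    time.sleep(0.05)             # let the thread work for a moment
    stop.set()                   # signal the thread to finish
    t.join()                     # wait until it has actually exited
    return t.is_alive(), counter[0]
```

An Event is convenient here because wait(timeout) lets the thread react to the stop request immediately instead of finishing a full sleep first.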

Killing a thread is a bad habit; it is better to create a "flag" that tells the thread when its work is done.
Consider the following example:
import threading
import random

class CheckSomething(threading.Thread):
    def __init__(self, variable):
        super(CheckSomething, self).__init__()
        self.start_flag = threading.Event()
        self.variable = variable

    def check_position(self, variable):
        # randint needs both bounds; 0..100 is assumed here
        x = random.randint(0, 100)
        if variable == x:
            self.stop_checking()

    def run(self):
        # keep checking until the flag is set
        while not self.stopped():
            self.check_position(self.variable)

    def stop_checking(self):
        self.start_flag.set()

    def stopped(self):
        return self.start_flag.is_set()
The set() method of Event sets its internal flag to True. You can read more in the docs: https://docs.python.org/3.5/library/threading.html
So you need to call stop_checking() when you reach a condition where you want to exit.

How to safe Python multithreading?

I have threading code in Python like this, but I am not sure whether I am doing it the correct way.
import threading
import time

class MyThread(threading.Thread):
    def __init__(self, thread_id, thread_name):
        super().__init__()  # required before the thread can be started
        self.thread_name = thread_name
        self.thread_id = thread_id

    def run(self):
        do_something()

def do_something():
    while True:
        do_something_else()
        time.sleep(5)

class SomeClass:
    def __init__(self):
        pass

    def run(self):
        thread1 = MyThread(1, "thread1")
        thread2 = MyThread(2, "thread2")
        thread3 = MyThread(3, "thread3")
        # the threads only run once start() is called
        for thread in (thread1, thread2, thread3):
            thread.start()

def main():
    agent = SomeClass()
    agent.run()
Is this a safe way to deal with multiple threads? How does it impact other applications? Is there a chance that the execution of one thread can hinder the execution of the others? What happens if a thread gets blocked in some cycle?
Also, how can I make sure that a thread doesn't get blocked forever for whatever reason? If it gets blocked, then after a fixed time interval it should come out gracefully and continue with the next loop.

That is why Python and some other languages provide locks.
This page will help you; you need to read about Lock, RLock and Condition.

Your code's thread safety is really dependent on what's in do_something() and do_something_else(). It's thread safe if you're only modifying local variables. But the moment you start reading/modifying shared variables/storage, like a file or a global variable, then you need to use something like locks or semaphores to ensure thread safety.
You can read about Python's threading module here.
These Wikipedia articles on synchronization and locks may be helpful to you too.
If you need examples for writing multi-threading code, here's a good example using different synchronization mechanisms.
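As a rough illustration of the point about shared variables, the sketch below protects a global counter with a threading.Lock. The names counter, counter_lock, add_many and run_counters are hypothetical and not taken from the question:

```python
import threading

counter = 0
counter_lock = threading.Lock()

def add_many(n):
    global counter
    for _ in range(n):
        # Only one thread at a time may perform the read-modify-write.
        with counter_lock:
            counter += 1

def run_counters(num_threads=4, n=10000):
    global counter
    counter = 0
    threads = [threading.Thread(target=add_many, args=(n,))
               for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter
```

Without the lock, two threads could read the same old value and one increment could be lost. With it, run_counters(4, 10000) always yields 40000.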

Python threading design

I'm trying to write a mini-game that lets me practice my Python threading skills. The game itself involves timed bombs and the cities that contain them.
Here is my code:
import threading
from time import sleep

class City(threading.Thread):
    def __init__(self, name):
        super().__init__()
        self.name = name
        self.bombs = []       # was None, which breaks append()
        self.activeBomb = []  # IDs of bombs that are not yet disarmed
        self.bombID = 0
        self.exploded = False

    def addBomb(self, name, time, puzzle, answer, hidden=False):
        self.bombs.append(Bomb(name, self.bombID, time, puzzle, answer, hidden))
        self.activeBomb.append(self.bombID)
        self.bombID += 1

    def run(self):
        for b in self.bombs:
            b.start()
        while True:
            # listen to the bombs in self.bombs  # The part that I don't know how to do
            # if one explodes
            #     print(self.name + ' has been destroyed')
            #     break
            # if one is disarmed
            #     remove the bombID from the activeBomb
            # if all bombs are disarmed (no activeBomb left)
            #     print('The city of ' + self.name + ' has been cleansed')
            #     break
            break  # placeholder so the loop body is valid Python

class Bomb(threading.Thread):
    def __init__(self, name, bombID, time, puzzle, answer, hidden=False):
        super(Bomb, self).__init__()
        self.name = name
        self.bombID = bombID
        self._timer = time
        self._MAXTIME = time
        self._disarmed = False
        self._puzzle = puzzle
        self._answer = answer
        self._detonated = False
        self._hidden = hidden

    def run(self):
        # A bomb goes off!!
        if not self._hidden:
            print('You have ' + str(self._MAXTIME)
                  + ' seconds to solve the puzzle!')
            print(self._puzzle)
        while True:
            if self._detonated:
                print('BOOM')
                # Communicate to city that bomb is detonated
                break
            elif not self._disarmed:
                if self._timer == 0:
                    self._detonated = True
                else:
                    self._timer -= 1
                    sleep(1)
            else:
                print('You have successfully disarmed bomb ' + str(self.name))
                # Communicate to city that this bomb is disarmed
                break

    def answerPuzzle(self, ans):
        print('Is answer ' + str(ans) + ' ?')
        if ans == self._answer:
            self._disarmed = True
        else:
            self._detonated = True

    def __eq__(self, bomb):
        return self.bombID == bomb.bombID

    def __hash__(self):
        return id(self)
I currently don't know a good way for the City class to keep track of the bomb status effectively.
My first thought was to use a for loop to have the City check all of its bombs, but I found that too clumsy and inefficient.
So here is the question:
What is the most efficient way of implementing the Bomb and City so that the city immediately knows about a bomb's state change without having to check it every second?
PS: I do NOT mean to use this program to set off real bombs, so relax :D

This is a good case for a queue. Here is an example of the so-called producer-consumer pattern.
The worker threads run until your main program is done (that is what the daemon flag and the "while True" are for). They monitor in_queue for work packages and process them until none is left, so once in_queue.join() returns, the workers' jobs are done. The out_queue here is an optional downstream processing step, where you can assemble the pieces from the worker threads into a summary. Useful when they are in a function.
If you need output, e.g. each worker thread prints its results to the screen or writes to one single file, don't forget to use a semaphore! Otherwise the output from different threads will be interleaved.
Good luck!
from threading import Thread
import queue  # named Queue in Python 2

in_queue = queue.Queue()
out_queue = queue.Queue()

def work():
    while True:
        try:
            sonId = in_queue.get()
            ### do your things here
            result = sonId + 1
            ### you can even put your thread results again in another queue here
            out_queue.put(result)  ### optional
        except Exception:
            pass
        finally:
            in_queue.task_done()

for i in range(20):
    t = Thread(target=work)
    t.daemon = True
    t.start()

for son in range(10):
    in_queue.put(son)

in_queue.join()

while not out_queue.empty():
    result = out_queue.get()
    ### do something with your result here
    out_queue.task_done()

out_queue.join()

The standard way of doing something like this is to use a queue - one thread watches the queue and waits for an object to handle (allowing it to idle happily), and the other thread pushes items onto the queue.
Python has the queue module (Queue in 2.x). Construct a queue in your listener thread and get() on it - this will block until something gets put on.
In your other thread, when a relevant event occurs, push it onto the queue and the listener thread will wake up and handle it. If you do this in a loop, you have the behaviour you want.
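Applied to the bomb game, the listener idea could be sketched roughly as follows, assuming Python 3. The functions bomb, city and demo are hypothetical simplifications of the question's classes, and the short sleep only stands in for the countdown and puzzle:

```python
import queue
import threading
import time

def bomb(bomb_id, delay, outcome, events):
    time.sleep(delay)               # stands in for the countdown / puzzle
    events.put((bomb_id, outcome))  # report the state change to the city

def city(bomb_ids, events):
    active = set(bomb_ids)
    while active:
        bomb_id, outcome = events.get()  # blocks until some bomb reports
        if outcome == "exploded":
            return "destroyed"
        active.discard(bomb_id)          # this bomb was disarmed
    return "cleansed"

def demo(outcomes):
    events = queue.Queue()
    threads = [threading.Thread(target=bomb, args=(i, 0.01, o, events))
               for i, o in enumerate(outcomes)]
    for t in threads:
        t.start()
    result = city(range(len(outcomes)), events)
    for t in threads:
        t.join()
    return result
```

The city never inspects the bombs directly. It sleeps inside events.get() and wakes only when a bomb pushes a status change.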

The easiest way would be to use a scheduler library, e.g. https://docs.python.org/2/library/sched.html. With it you can simply schedule bombs to call a function or method at the time they go off. This is what I would recommend if you did not want to learn about threads.
E.g.
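import sched
import time

s = sched.scheduler(time.time, time.sleep)

class Bomb():
    def __init__(self, max_time):
        self._MAXTIME = max_time
        self._disarmed = False
        # schedule explode() to be called after max_time seconds;
        # the events fire once s.run() is called
        s.enter(self._MAXTIME, 1, self.explode)

    def explode(self):
        if not self._disarmed:
            print("BOOM")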
However, that way you will not learn about threads.
If you really want to use threads directly, then you can simply let the bombs call sleep until it is their time to go off. E.g.
class Bomb(threading.Thread):
    def run(self):
        time.sleep(self._MAXTIME)
        if not self._disarmed:
            print("BOOM")
However, this is not a nice way to handle threads, since the threads will block your application. You will not be able to exit the application until you stop the threads. You can avoid this by making the thread a daemon thread. bomb.daemon = True.
In some cases, the best way to handle this is to actually "wake up" each second and check the status of the world. This may be the case when you need to perform some cleanup actions when the thread is stopped. E.g. You may need to close a file. Checking each second may seem wasteful, but it is actually the proper way to handle such problems. Modern desktop computers are mostly idle. To be interrupted for a few milliseconds each second will not cause them much sweat.
class Bomb(threading.Thread):
    def run(self):
        while not self._disarmed:
            # time.time() returns the current time in seconds since the epoch
            if time.time() > self.time_to_explode:
                print("BOOM")
                break
            else:
                time.sleep(1)
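As an alternative to waking up every second, and not something this answer itself proposes, the disarmed flag can be a threading.Event whose wait(timeout) serves as the fuse. The sketch below assumes Python 3; the attribute names _fuse and exploded and the method disarm are hypothetical:

```python
import threading

class Bomb(threading.Thread):
    def __init__(self, fuse_seconds):
        super().__init__()
        self._fuse = fuse_seconds
        self._disarmed = threading.Event()
        self.exploded = False

    def disarm(self):
        self._disarmed.set()  # wakes run() immediately

    def run(self):
        # wait() returns True if the event was set before the timeout,
        # and False if the fuse ran out first
        if not self._disarmed.wait(self._fuse):
            self.exploded = True
```

For example, a Bomb(5.0) that is started and disarmed right away finishes with exploded still False, while a Bomb(0.01) that nobody disarms ends with exploded set to True.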

Before you start "practising threading with Python", I think it is important to understand Python's threading model. It is loosely based on Java's threading model but is more restrictive:
https://docs.python.org/2/library/threading.html
The design of this module is loosely based on Java’s threading model.
However, where Java makes locks and condition variables basic behavior
of every object, they are separate objects in Python. Python’s Thread
class supports a subset of the behavior of Java’s Thread class;
currently, there are no priorities, no thread groups, and threads
cannot be destroyed, stopped, suspended, resumed, or interrupted. The
static methods of Java’s Thread class, when implemented, are mapped to
module-level functions.
Because locks are separate objects rather than part of every object, different objects may end up guarded by the same lock, so threads that access different objects can still block each other and are scheduled less independently.
In some Python implementations, threads do not run fully concurrently:
http://uwpce-pythoncert.github.io/EMC-Python300-Spring2015/html_slides/07-threading-and-multiprocessing.html#slide-5
A thread is the entity within a process that can be scheduled for
execution
Threads are lightweight processes, run in the address space of an OS
process.
These threads share the memory and the state of the process. This
allows multiple threads access to data in the same scope.
Python threads are true OS level threads
Threads can not gain the performance advantage of multiple processors
due to the Global Interpreter Lock (GIL)
http://uwpce-pythoncert.github.io/EMC-Python300-Spring2015/html_slides/07-threading-and-multiprocessing.html#slide-6

Simulating Cancellation Tokens in Python Threading

I just wrote a task queue in Python whose job is to limit the number of tasks that are run at one time. This is a little different than Queue.Queue because instead of limiting how many items can be in the queue, it limits how many can be taken out at one time. It still uses an unbounded Queue.Queue to do its job, but it relies on a Semaphore to limit the number of threads:
from Queue import Queue
from threading import BoundedSemaphore, Lock, Thread

class TaskQueue(object):
    """
    Queues tasks to be run in separate threads and limits the number
    concurrently running tasks.
    """

    def __init__(self, limit):
        """Initializes a new instance of a TaskQueue."""
        self.__semaphore = BoundedSemaphore(limit)
        self.__queue = Queue()
        self.__cancelled = False
        self.__lock = Lock()

    def enqueue(self, callback):
        """Indicates that the given callback should be ran."""
        self.__queue.put(callback)

    def start(self):
        """Tells the task queue to start running the queued tasks."""
        thread = Thread(target=self.__process_items)
        thread.start()

    def stop(self):
        self.__cancel()
        # prevent blocking on a semaphore.acquire
        self.__semaphore.release()
        # prevent blocking on a Queue.get
        self.__queue.put(lambda: None)

    def __cancel(self):
        print 'canceling'
        with self.__lock:
            self.__cancelled = True

    def __process_items(self):
        while True:
            # see if the queue has been stopped before blocking on acquire
            if self.__is_canceled():
                break
            self.__semaphore.acquire()
            # see if the queue has been stopped before blocking on get
            if self.__is_canceled():
                break
            callback = self.__queue.get()
            # see if the queue has been stopped before running the task
            if self.__is_canceled():
                break
            def runTask():
                try:
                    callback()
                finally:
                    self.__semaphore.release()
            thread = Thread(target=runTask)
            thread.start()
            self.__queue.task_done()

    def __is_canceled(self):
        with self.__lock:
            return self.__cancelled
The Python interpreter runs forever unless I explicitly stop the task queue. This is a lot more tricky than I thought it would be. If you look at the stop method, you'll see that I set a canceled flag, release the semaphore and put a no-op callback on the queue. The last two parts are necessary because the code could be blocking on the Semaphore or on the Queue. I basically have to force these to go through so that the loop has a chance to break out.
This code works. This class is useful when running a service that is trying to run thousands of tasks in parallel. In order to keep the machine running smoothly and to prevent the OS from screaming about too many active threads, this code will limit the number of threads living at any one time.
I have written a similar chunk of code in C# before. What made that code particularly cut and dried was that .NET has something called a CancellationToken that just about every threading class uses. Any time there is a blocking operation, that operation takes an optional token. If the parent task is ever canceled, any child tasks blocking with that token will be immediately canceled as well. This seems like a much cleaner way to exit than to "fake it" by releasing semaphores or putting values in a queue.
I was wondering if there is an equivalent way of doing this in Python. I definitely want to use threads rather than something like asynchronous events. I am wondering if there is a way to achieve the same thing using two Queue.Queues where one has a max size and the other doesn't, but I'm still not sure how to handle cancellation.

I think your code can be simplified by using poisoning and Thread.join():
from Queue import Queue
from threading import Thread

poison = object()

class TaskQueue(object):

    def __init__(self, limit):
        def process_items():
            while True:
                callback = self._queue.get()
                if callback is poison:
                    break
                try:
                    callback()
                except:
                    pass
                finally:
                    self._queue.task_done()
        self._workers = [Thread(target=process_items) for _ in range(limit)]
        self._queue = Queue()

    def enqueue(self, callback):
        self._queue.put(callback)

    def start(self):
        for worker in self._workers:
            worker.start()

    def stop(self):
        for worker in self._workers:
            self._queue.put(poison)
        while self._workers:
            self._workers.pop().join()
Untested.
I removed the comments, for brevity.
Also, in this version process_items() is truly private.
BTW: The whole point of the Queue module is to free you from the dreaded locking and event stuff.

You seem to be creating a new thread for each task from the queue. This is wasteful in itself, and also leads you to the problem of how to limit the number of threads.
Instead, a common approach is to create a fixed number of worker threads and let them freely pull tasks from the queue. To cancel the queue, you can clear it and let the workers stay alive in anticipation of future work.

I took Janne Karila's advice and created a thread pool. This eliminated the need for a semaphore. The problem is if you ever expect the queue to go away, you have to stop the worker threads from running (just a variation of what I did before). The new code is fairly similar:
from Queue import Queue
from threading import Event, Lock, Thread

class TaskQueue(object):
    """
    Queues tasks to be run in separate threads and limits the number
    concurrently running tasks.
    """

    def __init__(self, limit):
        """Initializes a new instance of a TaskQueue."""
        self.__workers = []
        for _ in range(limit):
            worker = Thread(target=self.__process_items)
            self.__workers.append(worker)
        self.__queue = Queue()
        self.__cancelled = False
        self.__lock = Lock()
        self.__event = Event()

    def enqueue(self, callback):
        """Indicates that the given callback should be ran."""
        self.__queue.put(callback)

    def start(self):
        """Tells the task queue to start running the queued tasks."""
        for worker in self.__workers:
            worker.start()

    def stop(self):
        """
        Stops the queue from processing anymore tasks. Any actively running
        tasks will run to completion.
        """
        self.__cancel()
        # prevent blocking on a Queue.get
        for _ in range(len(self.__workers)):
            self.__queue.put(lambda: None)
            self.__event.wait()

    def __cancel(self):
        with self.__lock:
            self.__queue.queue.clear()
            self.__cancelled = True

    def __process_items(self):
        while True:
            callback = self.__queue.get()
            # see if the queue has been stopped before running the task
            if self.__is_canceled():
                break
            try:
                callback()
            except:
                pass
            finally:
                self.__queue.task_done()
        self.__event.set()

    def __is_canceled(self):
        with self.__lock:
            return self.__cancelled
If you look carefully, I had to do some accounting to kill off the workers. I basically wait on an Event for as many times as there are workers. I clear the underlying queue to prevent workers from being cancelled any other way. I also wait after pumping each bogus value into the queue, so only one worker can cancel out at a time.
I've run some tests on this and it appears to be working. It would still be nice to eliminate the need for bogus values.
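One possible way to drop the bogus values, offered only as a sketch and assuming Python 3, is to let a threading.Event play the role of the cancellation token and have each worker call get() with a timeout so that it re-checks the token regularly. The poll_interval parameter and the underscore-prefixed names are hypothetical:

```python
import queue
import threading

class TaskQueue:
    def __init__(self, limit, poll_interval=0.05):
        self._queue = queue.Queue()
        self._cancelled = threading.Event()  # plays the role of the token
        self._poll = poll_interval
        self._workers = [threading.Thread(target=self._process_items)
                         for _ in range(limit)]

    def enqueue(self, callback):
        self._queue.put(callback)

    def start(self):
        for worker in self._workers:
            worker.start()

    def stop(self):
        self._cancelled.set()    # no sentinel values are needed
        for worker in self._workers:
            worker.join()

    def _process_items(self):
        while not self._cancelled.is_set():
            try:
                callback = self._queue.get(timeout=self._poll)
            except queue.Empty:
                continue         # nothing queued; re-check the token
            try:
                callback()
            except Exception:
                pass             # a failing task must not kill the worker
            finally:
                self._queue.task_done()
```

The trade-off is that idle workers wake up once per poll interval, and stop() may take up to one interval (plus any running task) to return. This is still polling, not a true equivalent of .NET's CancellationToken.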

How do I terminate selected threads

I have code with two types of threads. Three threads are spawned from the second type. I wanted to know if there is a function I can call that will terminate the three spawned threads of the second type while keeping the first one running.

A common solution is to have a global variable that the threads check if they should terminate or not.
Edit: An example of one way of doing it:
from threading import Thread

class MyThread(Thread):
    def __init__(self):
        Thread.__init__(self)  # required before the thread can be started
        self.keep_running = True

    def run(self):
        while self.keep_running:
            pass  # Do stuff

my_thread = MyThread()
my_thread.start()
# Do some other stuff
my_thread.keep_running = False
my_thread.join()

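You can keep a pool for each type of thread and then terminate them accordingly. For instance, you can keep the threads of one type in a global Queue.Queue and then call .stop() on each as needed. Note that threading.Thread has no built-in stop(); it has to be a method you define yourself, for example one that sets a flag the thread checks.
Edit: After signalling the child threads to stop, the parent can wait for each of them to finish with .join().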
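A sketch of stopping one group of threads while another keeps running, assuming each group shares its own threading.Event as the stop signal. The names worker, stop_first, stop_second and demo are hypothetical:

```python
import threading

def worker(stop):
    # Do a slice of work, then pause briefly; wait() returns True
    # as soon as this group's stop event is set.
    while not stop.wait(0.01):
        pass  # real work would go here

def demo():
    stop_first = threading.Event()   # controls the first type of thread
    stop_second = threading.Event()  # controls the three spawned threads
    first = threading.Thread(target=worker, args=(stop_first,))
    spawned = [threading.Thread(target=worker, args=(stop_second,))
               for _ in range(3)]
    for t in [first] + spawned:
        t.start()

    stop_second.set()                # terminate only the second type
    for t in spawned:
        t.join()
    state = (first.is_alive(), [t.is_alive() for t in spawned])

    stop_first.set()                 # clean up the remaining thread
    first.join()
    return state
```

Setting stop_second ends all three spawned threads at once, while the first thread keeps running until its own event is set.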
