I'm currently learning to use threads in Python, and I'm playing around with this dummy bit of code for practice:
import threading
import queue
import time

my_queue = queue.Queue()
lock = threading.Lock()

for i in range(5):
    my_queue.put(i)

def something_useful(CPU_number):
    while not my_queue.empty():
        lock.acquire()
        print("\n CPU_C " + str(CPU_number) + ": " + str(my_queue.get()))
        lock.release()
    print("\n CPU_C " + str(CPU_number) + ": the next line is the return")
    return

number_of_threads = 8
practice_threads = []

for i in range(number_of_threads):
    thread = threading.Thread(target=something_useful, args=(i, ))
    practice_threads.append(thread)
    thread.start()
All this does is create a queue with 5 items, and pull them out and print them with different threads.
What I noticed, though, is that some of the threads aren't terminating properly. For example, if I later add something to the queue (e.g. my_queue.put(7)) then some thread will instantly print that number.
That's why I added the last print line print("\n CPU_C " + str(CPU_number) + ": the next line is the return"), and I noticed that only one thread will terminate. In other words, when I run the code above, only one thread will print "the next line is the return".
The weird thing is, this issue disappears when I remove the lock. Without the lock, it works perfectly fine.
What am I missing?
Actually, it's not just one thread that prints "the next line is the return" -- anywhere between 1 and 8 of them do.
In my runs I sometimes got threads 1, 3, 4, 5, 6, 7, or 1, 2, 3, 4, 5, 6, 7, or 1, 4, 5, 6, 7, or only 5, 6, 7, etc.
You have a race condition.
The race condition sits between the while check not my_queue.empty() and the lock.acquire().
Essentially, .empty() can report "not empty", but before you acquire the lock another thread can take that last value out. Your .get() then blocks forever waiting for an item, which is why a later put(7) is printed instantly and the thread never reaches its final print. Hence you need to do the emptiness check inside the lock.
Here is a safer implementation:
import threading
import queue
import time

my_queue = queue.Queue()
lock = threading.Lock()

for i in range(50):
    my_queue.put(i)

def something_useful(CPU_number):
    while True:
        lock.acquire()
        if not my_queue.empty():
            print("CPU_C " + str(CPU_number) + ": " + str(my_queue.get()))
            lock.release()
        else:
            lock.release()
            break
    print("CPU_C " + str(CPU_number) + ": the next line is the return")
    return

number_of_threads = 8
practice_threads = []

for i in range(number_of_threads):
    thread = threading.Thread(target=something_useful, args=(i, ))
    practice_threads.append(thread)
    thread.start()
Note: in your current code, because you hold the lock for the entire body, the loop is effectively serialized -- only one thread at a time does any work. Ideally you would release the lock as soon as you have the value:
lock.acquire()
if not my_queue.empty():
    val = my_queue.get()
    lock.release()
    print("CPU_C " + str(CPU_number) + ": " + str(val))
    heavy_processing(val)  # while this is going on, another thread can read the next val
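Since queue.Queue already does its own internal locking, an alternative sketch of the same fix drops the explicit Lock entirely and uses get_nowait(), which performs the emptiness check and the removal as one atomic step (the names here are illustrative, not the answer's exact code):

```python
import queue
import threading

my_queue = queue.Queue()
for i in range(50):
    my_queue.put(i)

seen = []  # list.append is atomic under CPython's GIL

def worker(cpu_number):
    while True:
        try:
            # get_nowait() checks and removes atomically, so there is
            # no gap between the emptiness test and the get()
            val = my_queue.get_nowait()
        except queue.Empty:
            break  # queue drained: let this thread finish
        seen.append(val)

threads = [threading.Thread(target=worker, args=(i,)) for i in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(sorted(seen))  # every queued value consumed exactly once
```

Because the check and the get are a single operation, no thread can block on an empty queue, and all eight threads terminate on their own.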
Related
I have a program (Python 3.9.10) with a read queue and a write queue. One thread reads and, once it has read, sends the item to the write queue; another thread writes.
All works fine unless there is an error. If there is, the threads do not stop.
In the following code I simulate an error being detected in the read thread and try to stop the threads from reading/writing so the program exits; however, the threads stay active and the program never finishes. If I remove the error-simulation code, the threads stop and the program finishes.
I wish to handle the errors WITHIN the threads and, if need be, stop the threads/program without propagating an error upward.
What am I doing wrong? Thanks
Here is a working example of my issue:
import pandas as pd
import datetime
import traceback
from queue import Queue
from threading import Thread
import time

dlQueue = Queue()
writeQueue = Queue()
dlQDone = False
errorStop = False

def log(text):
    text = datetime.datetime.now().strftime("%Y/%m/%d, %H:%M:%S ") + text
    print(text)

def errorBreak():
    global dlQueue
    global writeQueue
    global errorStop
    global dlQDone
    dlQueue = Queue()
    writeQueue = Queue()
    errorStop = True
    dlQDone = True

def downloadTable(t, q):
    global dlQDone
    global errorStop
    while True:
        if errorStop:
            return
        nextQ = q.get()
        log("READING: " + nextQ)
        writeQueue.put("Writing " + nextQ)
        log("DONE READING: " + nextQ)
        #### simulating an error and need to exit threads ###
        if nextQ == "Read 7":
            log("Breaking Read")
            errorBreak()
            return
        ###################################################
        q.task_done()
        if q.qsize() == 0:
            log("Download QUEUE finished")
            dlQDone = True
            return

def writeTable(t, q):
    global errorStop
    global dlQDone
    while True:
        if errorStop:
            log("Error Stop return")
            return
        nextQ = q.get()
        log("WRITING: " + nextQ)
        log("DONE WRITING: " + nextQ)
        q.task_done()
        if dlQDone:
            if q.qsize() == 0:
                log("Writing QUEUE finished")
                return

try:
    log("PROCESS STARTING!!")
    for i in range(10):
        dlQueue.put("Read " + str(i))
    startTime = time.time()
    log("Starting threaded pull....")
    dlWorker = Thread(
        target=downloadTable,
        args=(
            "DL",
            dlQueue,
        ),
    )
    dlWorker.start()
    writeWorker = Thread(
        target=writeTable,
        args=(
            "Write",
            writeQueue,
        ),
    )
    writeWorker.start()
    dlQueue.join()
    writeQueue.join()
    log(f"Finished thread in {str(time.time() - startTime)} seconds")  # CANNOT GET HERE
    log("Threads: " + str(dlWorker.is_alive()) + str(writeWorker.is_alive()))
except Exception as error:
    log(error)
    log(traceback.format_exc())
If I understood you correctly, you want to stop both threads when some error warrants it. You can do that with a threading.Event, together with giving your queue reads a timeout so the threads regularly wake up and can notice the event.
import datetime
import time
import queue
import threading

dlQueue = queue.Queue()
writeQueue = queue.Queue()
stop_event = threading.Event()

def log(text):
    text = datetime.datetime.now().strftime("%Y/%m/%d, %H:%M:%S ") + text
    print(text)

def downloadTable(t: str, q: queue.Queue):
    while not stop_event.is_set():
        try:
            nextQ = q.get(timeout=1)
        except queue.Empty:
            continue
        log("READING: " + nextQ)
        writeQueue.put("Writing " + nextQ)
        log("DONE READING: " + nextQ)
        if nextQ == "7":
            log("Breaking Read")
            stop_event.set()
            break
        q.task_done()
    log("Download thread exiting")

def writeTable(t, q):
    while not stop_event.is_set():
        try:
            nextQ = q.get(timeout=1)
        except queue.Empty:
            continue
        log("WRITING: " + nextQ)
        log("DONE WRITING: " + nextQ)
        q.task_done()
    log("Write thread exiting")

def main():
    log("PROCESS STARTING!!")
    for i in range(10):
        dlQueue.put(f"{i}")
    log("Starting threaded pull....")
    dlWorker = threading.Thread(
        target=downloadTable,
        args=(
            "DL",
            dlQueue,
        ),
    )
    dlWorker.start()
    writeWorker = threading.Thread(
        target=writeTable,
        args=(
            "Write",
            writeQueue,
        ),
    )
    writeWorker.start()
    dlWorker.join()
    writeWorker.join()

if __name__ == "__main__":
    main()
I'm running the following code block in my application. While running it with Python 3.4 I get a 'python quit unexpectedly' popup on my screen. The data missing from the aOut file covers a contiguous chunk of iterations: say, items 0-1000 of the list are absent while the others have data. The affected items run properly on their own without intervention.
With Python 2.7 the failures are for items ~3400-4400 in the list.
From logging I see that the detect() calls are not made for processes 0-1000, i.e. the process.start() calls don't trigger the detect method.
I am doing this on macOS Sierra. What is happening here? Is there a better way to achieve my purpose?
from multiprocessing import Manager, Process

def detectInBatch(aList, aOut):
    # iterate through the objects
    processPool = []
    pthreadIndex = 0
    pIndex = 0
    manager = Manager()
    dict = manager.dict()
    outline = ""
    print("Threads: ", getMaxThreads())  # max threads is 20
    for key in aList:
        print("Key: %s, pIndex: %d" % (key.key, pIndex))
        processPool.append(Process(target=detect, args=(key.key, dict)))
        pthreadIndex = pthreadIndex + 1
        pIndex = pIndex + 1
        #print("Added for %d" % (pIndex))
        if(pthreadIndex == getMaxThreads()):
            print("ProcessPool size: %d" % len(processPool))
            for process in processPool:
                #print("Started")
                process.start()
            #end for
            print("20 Processes started")
            for process in processPool:
                #print("Joined")
                process.join()
            #end for
            print("20 Processes joined")
            for key in dict.keys():
                outline = outline + dict.get(key)
            #end for
            dict.clear()
            pthreadIndex = 0
            processPool = []
        #endif
    #endfor
    if(pthreadIndex != 0):
        for process in processPool:
            # print("End Start")
            process.start()
        #end for
        for process in processPool:
            # print("End done")
            process.join()
        #end for
        for key in dict.keys():
            print("Dict: " + dict.get(key))
            outline = outline + dict.get(key)
        #end for
    #endif
    aOut.write(outline)
#end method detectInBatch
To avoid the 'unexpected quit' perhaps try to ignore the exception with
try:
    your_loop()
except:
    pass
Then, put in some logging to track the root cause.
Is it possible to stop a thread prematurely when it is stuck inside a while loop? Below is my sample code, which runs correctly, since each iteration of loop_thread checks whether the threading.Event() flag is set. But when the work inside the loop takes much longer than a second, there is no way to stop the function mid-iteration; it keeps executing until the next check of the loop condition. For example, if I run dld_img_thread, it takes about 5 minutes to complete its execution and recheck the while loop to see if it should proceed. What I want is to kill dld_img_thread in less time than that (e.g. 1 minute). I don't care if the data is lost, just that the thread stops before the function finishes executing. Thank you
import threading, time, pythoncom, read_mt0
import powerfail_debugport_reader as pf_dbg_rdr
import powerfail_firmware_downloader as pf_fwdld

def loop_thread(thread_name, thread_event):
    loopCnt = 0
    print "\nstarting {}".format(thread_name)
    print "is {0} alive? {1}\n".format(thread_name, L00P_thread.is_alive())
    while not thread_event.is_set():
        print("value of loopCnt = {}".format(loopCnt))
        loopCnt += 1
        time.sleep(1)
    print('stopping {}\n'.format(thread_name))

def image_dld(thread_name, thread_event):
    pythoncom.CoInitializeEx(pythoncom.COINIT_MULTITHREADED)
    print "\nstarting {}".format(thread_name)
    print "is {0} alive? {1}\n".format(thread_name, dld_img_thread.is_alive())
    while not thread_event.is_set():
        pf_fwdld.power_fail_test()
    print('stopping {}'.format(thread_name))

def debug_port_thread(thread_name, thread_event):
    pythoncom.CoInitializeEx(pythoncom.COINIT_MULTITHREADED)
    print "\nstarting {}".format(thread_name)
    print "is {0} alive? {1}\n".format(thread_name, debug_thread.is_alive())
    pf_dbg_rdr.debug_port_reader()
    print('\nstopping {}'.format(thread_name))

def main():
    global L00P_thread, debug_thread
    pf_dbg_rdr.samurai_event = threading.Event()
    L00P_thread = threading.Thread(target=loop_thread, \
                                   args=('L00P_thread', pf_dbg_rdr.samurai_event))
    dld_img_thread = threading.Thread(target=image_dld, \
                                      args=('image_download', pf_dbg_rdr.samurai_event))
    debug_thread = threading.Thread(target=debug_port_thread, \
                                    args=('debug_port_reader', pf_dbg_rdr.samurai_event))
    L00P_thread.start()
    dld_img_thread.start()
    debug_thread.start()
    debug_thread.join()

if __name__ == '__main__':
    main()
    print('processes stopped')
    print "Exiting Main Thread"
Use a second variable in your while condition that you can change once your timeout is reached.
For example:
shouldRun = True
while not thread_event.is_set() and shouldRun:
    print("value of loopCnt = {}".format(loopCnt))
    loopCnt += 1
    time.sleep(1)
    if loopCnt > 60: shouldRun = False
would stop after 60 iterations (about 60 seconds given you sleep for 1 second on each iteration).
I've just been trying to get threading working properly and I've hit a problem. The default thread module doesn't seem to be able to return values, so I looked up a solution and found this answer - how to get the return value from a thread in python?
I've got this working for getting multiple threads running, but I can't seem to print any values from inside the thread until they've all finished. Here is the code I currently have:
import random
from multiprocessing.pool import ThreadPool

# only 2 threads for now to make sure they don't all finish at once
pool = ThreadPool(processes=2)

# should take a few seconds to process
def printNumber(number):
    num = random.randint(50000, 500000)
    for i in range(num):
        if i % 10000 == 0:
            print "Thread " + str(number) + " progress: " + str(i)
        test = random.uniform(0, 10) ** random.uniform(0, 1)
    return number

thread_list = []

# Execute threads
for i in range(1, 10):
    m = pool.apply_async(printNumber, (i,))
    thread_list.append(m)

# Wait for values and get output
totalNum = 0
for i in range(len(thread_list)):
    totalNum += thread_list[i].get()
    print "Thread finished"

# Demonstrates that the main process waited for threads to complete
print "Done"
What happens is you get 9x "Thread finished", then "Done", and only then everything the threads printed.
However, if I remove the "wait for values" part, it prints them correctly. Is there any way I can keep it waiting for completion but still see the prints from inside the function as they happen?
Edit: Here is the output (a bit long to add to the post), it weirdly reverses the print order - http://pastebin.com/9ZRhg52Q
I'm trying to check a list of answers like so:
import zipfile
from threading import Thread

def checkAns(File, answer):
    answer = bytes(answer, "UTF-8")
    try:
        File.extractall(pwd=answer)
    except:
        pass
    else:
        print("[+] Correct Answer: " + answer.decode("UTF-8") + "\n")

def main():
    File = zipfile.ZipFile("questions.zip")
    ansFile = open("answers.txt")
    for line in ansFile.readlines():
        answer = line.strip("\n")
        t = Thread(target=checkAns, args=(File, answer))
        t.start()
Assume the correct answer is 4 and your list contains values 1 through 1000000.
How do I get it to stop after it gets to 4 and not run through the remaining numbers in the list?
I have tried it several different ways:
else:
    print("[+] Correct Answer: " + answer.decode("UTF-8") + "\n")
    exit(0)
and also
try:
    File.extractall(pwd=answer)
    print("[+] Correct Answer: " + answer.decode("UTF-8") + "\n")
    exit(0)
except:
    pass
How do I get all the threads to stop after the correct answer is found?
Strangely, in Python you can't kill threads:
Python’s Thread class supports a subset of the behavior of Java’s
Thread class; currently, there are no priorities, no thread groups,
and threads cannot be destroyed, stopped, suspended, resumed, or
interrupted.
https://docs.python.org/2/library/threading.html#threading.ThreadError
This sample creates a thread that runs for 10 seconds. The parent waits two seconds, is then "done", and join()s the outstanding threads before exiting cleanly.
import sys, threading, time

class MyThread(threading.Thread):
    def run(self):
        for _ in range(10):
            print 'ding'
            time.sleep(1)

MyThread().start()
time.sleep(2)

print 'joining threads'
for thread in threading.enumerate():
    if thread is not threading.current_thread():
        thread.join()
print 'done'
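Since threads can't be killed, the usual workaround is cooperative: share a threading.Event that the winning thread sets and every other thread checks before doing its work. A minimal sketch (the equality test is a hypothetical stand-in for the question's successful extractall() attempt):

```python
import threading

found = threading.Event()

def check_answer(answer):
    if found.is_set():
        return  # another thread already found the answer; skip the work
    if answer == 4:  # stand-in for a successful File.extractall(pwd=...)
        found.set()  # signal every other thread to stop

threads = [threading.Thread(target=check_answer, args=(a,)) for a in range(100)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(found.is_set())
```

Threads started after the event is set return immediately, so only the work already in flight when the answer is found still completes.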