Print value from inside thread during its execution - python

I've just been trying to get threading working properly and I've hit a problem. The default thread module doesn't seem to be able to return values, so I looked up a solution and found this answer - how to get the return value from a thread in python?
I've got this working for getting multiple threads running, but I can't seem to print any values from inside the thread until they've all finished. Here is the code I currently have:
import random
from multiprocessing.pool import ThreadPool

# only 2 threads for now to make sure they don't all finish at once
pool = ThreadPool(processes=2)

# should take a few seconds to process
def printNumber(number):
    num = random.randint(50000, 500000)
    for i in range(num):
        if i % 10000 == 0:
            print "Thread " + str(number) + " progress: " + str(i)
        test = random.uniform(0, 10) ** random.uniform(0, 1)
    return number
thread_list = []

# Execute threads
for i in range(1, 10):
    m = pool.apply_async(printNumber, (i,))
    thread_list.append(m)

# Wait for values and get output
totalNum = 0
for i in range(len(thread_list)):
    totalNum += thread_list[i].get()
    print "Thread finished"

# Demonstrates that the main process waited for threads to complete
print "Done"
What happens is that you get 9x "Thread finished", then "Done", and only then everything that was printed by the threads.
However, remove the #wait for values part, and it prints them correctly. Is there any way I can keep it waiting for completion, but print things from inside the function?
Edit: Here is the output (a bit long to add to the post); it weirdly reverses the print order - http://pastebin.com/9ZRhg52Q

Related

Why threads are not working simultaneously?

For starters, I'm new to Python. I'll be brief: I'm trying to fetch all the links from a website using threads.
The problem is that the threads wait for their turn, but I want them to work simultaneously.
For example, I set the number of threads to 2 and split the links into 2 chunks.
I want the first thread to iterate over the links in the first chunk, and the second thread to iterate over the links in the second chunk SIMULTANEOUSLY. But my program runs the threads one after another. What am I doing wrong, guys? Much obliged for your help
My code:
target()
def url_target(text, e):
    global links
    global chunks
    number = int(sys.argv[1])
    for m in text:
        time.sleep(0.2)
        print(m, e)
    print('\n')
main()
def main():
    global links
    global chunks
    url = sys.argv[2]
    links = fetch_links(url)
    number = int(sys.argv[1])
    url_chunk = len(links) // number
    start, stop = 0, url_chunk + len(links) % number
    chunks = []
    time.sleep(1)
    while start < len(links):
        for i in range(number):
            part_links = links[start:stop]
            p = Thread(name='myThread', target=url_target, args=(part_links, i+1))
            p.start()
            chunks.append(p)
            start, stop = stop, stop + url_chunk
            p.join()
    time.sleep(1)
    while chunks:
        d = chunks.pop()
        print(f'{d.ident} done')
Thanks! I'd appreciate any help you can give!
p.join() blocks until p completes. You want to start all the threads first, then wait on each in turn.
while start < len(links):
    for i in range(number):
        part_links = links[start:stop]
        p = Thread(name='myThread', target=url_target, args=(part_links, i+1))
        p.start()
        chunks.append(p)
        start, stop = stop, stop + url_chunk
time.sleep(1)
for p in chunks:
    p.join()
If you aren't planning on doing anything while waiting for all the threads to complete, this is fine. However, you might want to unblock as soon as any thread completes, rather than waiting on an arbitrarily chosen one. A thread pool can help, but a simple way to approximate one is to wait a short period for a thread to finish and, if it doesn't, move on to another one and come back to the first later. For example,
from collections import deque

chunks = deque()
for i, start in enumerate(range(0, len(links), url_chunk), 1):
    part_links = links[start:start + url_chunk]
    p = Thread(name='myThread', target=url_target, args=(part_links, i))
    p.start()
    chunks.append(p)

while chunks:
    p = chunks.popleft()
    p.join(5)  # wait 5 seconds, or some other small period of time
    if p.is_alive():
        chunks.append(p)  # not done yet; put it back and try the next one
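Since the answer mentions that a thread pool can help: the standard library ships one in concurrent.futures. A minimal sketch of the same chunk-per-thread idea (the link list and the counting url_target here are illustrative stand-ins, not the asker's real code):

```python
from concurrent.futures import ThreadPoolExecutor

def url_target(text, e):
    # stand-in for the real per-chunk work: just count the links seen
    return len(text)

links = ["link%d" % i for i in range(10)]
number = 2
url_chunk = len(links) // number
chunks = [links[i:i + url_chunk] for i in range(0, len(links), url_chunk)]

# map() runs at most max_workers calls simultaneously, and leaving the
# with-block joins all workers before continuing.
with ThreadPoolExecutor(max_workers=number) as pool:
    results = list(pool.map(url_target, chunks, range(1, len(chunks) + 1)))

print(results)  # [5, 5]
```

This removes the manual start/join bookkeeping entirely; the executor owns the threads.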

Python processes fail to start

I'm running the following code block in my application. When running it with python3.4 I get a 'python quit unexpectedly' popup on my screen. The data missing from the aOut file covers a bunch of iterations, and it is missing in chunks: say items 0-1000 in the list have no data while the others do. The affected items run properly on their own without intervention.
With python2.7 the failures are for items ~3400-4400 in the list.
From logging I see that the detect() calls are never made for processes 0-1000, i.e. the process.start() calls don't trigger the detect method.
I am doing this on macOS Sierra. What is happening here? Is there a better way to achieve my purpose?
def detectInBatch(aList, aOut):
    # iterate through the objects
    processPool = []
    pthreadIndex = 0
    pIndex = 0
    manager = Manager()
    dict = manager.dict()
    outline = ""
    print("Threads: ", getMaxThreads())  # max threads is 20
    for key in aList:
        print("Key: %s, pIndex: %d" % (key.key, pIndex))
        processPool.append(Process(target=detect, args=(key.key, dict)))
        pthreadIndex = pthreadIndex + 1
        pIndex = pIndex + 1
        # print("Added for %d" % (pIndex))
        if pthreadIndex == getMaxThreads():
            print("ProcessPool size: %d" % len(processPool))
            for process in processPool:
                # print("Started")
                process.start()
            print("20 Processes started")
            for process in processPool:
                # print("Joined")
                process.join()
            print("20 Processes joined")
            for key in dict.keys():
                outline = outline + dict.get(key)
            dict.clear()
            pthreadIndex = 0
            processPool = []
    if pthreadIndex != 0:
        for process in processPool:
            # print("End Start")
            process.start()
        for process in processPool:
            # print("End done")
            process.join()
        for key in dict.keys():
            print("Dict: " + dict.get(key))
            outline = outline + dict.get(key)
    aOut.write(outline)
# end method detectInBatch
To avoid the 'unexpected quit', perhaps try catching the exception instead of letting it escape:
try:
    your_loop()
except Exception:
    pass
Then put some logging inside the except block to track down the root cause.

While loop in python doesn't end when it contains a lock

I'm currently learning to use threads in Python, and I'm playing around with this dummy bit of code for practice:
import threading
import queue
import time

my_queue = queue.Queue()
lock = threading.Lock()

for i in range(5):
    my_queue.put(i)

def something_useful(CPU_number):
    while not my_queue.empty():
        lock.acquire()
        print("\n CPU_C " + str(CPU_number) + ": " + str(my_queue.get()))
        lock.release()
    print("\n CPU_C " + str(CPU_number) + ": the next line is the return")
    return

number_of_threads = 8
practice_threads = []
for i in range(number_of_threads):
    thread = threading.Thread(target=something_useful, args=(i, ))
    practice_threads.append(thread)
    thread.start()
All this does is create a queue with 5 items, and pull them out and print them with different threads.
What I noticed, though, is that some of the threads aren't terminating properly. For example, if I later add something to the queue (e.g. my_queue.put(7)) then some thread will instantly print that number.
That's why I added the last print line print("\n CPU_C " + str(CPU_number) + ": the next line is the return"), and I noticed that only one thread will terminate. In other words, when I run the code above, only one thread will print "the next line is the return".
The weird thing is, this issue disappears when I remove the lock. Without the lock, it works perfectly fine.
What am I missing?
Actually, it's not just one thread that prints "the next line is the return"; anywhere between 1 and 8 of them can.
In my runs I sometimes got threads 1,3,4,5,6,7 or 1,2,3,4,5,6,7 or 1,4,5,6,7 or only 5,6,7, etc.
You have a race condition, between the while check not my_queue.empty() and the lock.acquire().
Essentially, .empty() can report "not empty", but before you acquire the lock another thread may have taken the last value out. Your my_queue.get() then blocks forever waiting for an item that never arrives, so that thread never reaches the return. Hence you need to do the check inside the lock.
Here is a safer implementation:
import threading
import queue
import time

my_queue = queue.Queue()
lock = threading.Lock()

for i in range(50):
    my_queue.put(i)

def something_useful(CPU_number):
    while True:
        lock.acquire()
        if not my_queue.empty():
            print("CPU_C " + str(CPU_number) + ": " + str(my_queue.get()))
            lock.release()
        else:
            lock.release()
            break
    print("CPU_C " + str(CPU_number) + ": the next line is the return")
    return

number_of_threads = 8
practice_threads = []
for i in range(number_of_threads):
    thread = threading.Thread(target=something_useful, args=(i, ))
    practice_threads.append(thread)
    thread.start()
Note: in your current code, because the lock is held while getting the value, it's always a blocker, i.e. only one thread at a time can run the loop body. Ideally you would release the lock before doing the slow work:
if not my_queue.empty():
    val = my_queue.get()
    lock.release()
    print("CPU_C " + str(CPU_number) + ": " + str(val))
    heavy_processing(val)  # while this runs, another thread can read the next val
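Since queue.Queue is itself thread-safe, the empty()/get() race can also be eliminated by dropping the lock around the queue entirely and letting get_nowait() report emptiness via the queue.Empty exception. A minimal sketch (the seen list and its lock are only there to collect results for inspection):

```python
import queue
import threading

my_queue = queue.Queue()
for i in range(50):
    my_queue.put(i)

seen = []
seen_lock = threading.Lock()

def worker(cpu_number):
    while True:
        try:
            # non-blocking get: raises queue.Empty instead of racing
            # an empty() check against the other consumers
            val = my_queue.get_nowait()
        except queue.Empty:
            return
        with seen_lock:  # only protects the shared results list
            seen.append(val)

threads = [threading.Thread(target=worker, args=(i,)) for i in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(sorted(seen))  # every item consumed exactly once
```

With this shape, every thread reaches its return as soon as the queue drains, and no thread can block on a get() for an item another thread already took.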

Causing a thread to stop while stuck within a while loop?

Is it possible to stop a thread prematurely when it is stuck inside a while loop? Below is my sample code, which runs correctly: each time through the loop, loop_thread checks whether the threading.Event() flag is set. But when the loop body takes much longer than a second, there is no way to stop the function mid-iteration; it keeps executing until the next check. For example, the pf_fwdld.power_fail_test() call made by dld_img_thread takes about 5 minutes to complete before the while loop is rechecked. What I want is to kill dld_img_thread in a shorter time (e.g. 1 minute). I don't care if the data is lost, just that the thread stops before the function finishes execution. Thank you
import threading, time, pythoncom, read_mt0
import powerfail_debugport_reader as pf_dbg_rdr
import powerfail_firmware_downloader as pf_fwdld

def loop_thread(thread_name, thread_event):
    loopCnt = 0
    print "\nstarting {}".format(thread_name)
    print "is {0} alive? {1}\n".format(thread_name, L00P_thread.is_alive())
    while not thread_event.is_set():
        print("value of loopCnt = {}".format(loopCnt))
        loopCnt += 1
        time.sleep(1)
    print('stopping {}\n'.format(thread_name))

def image_dld(thread_name, thread_event):
    pythoncom.CoInitializeEx(pythoncom.COINIT_MULTITHREADED)
    print "\nstarting {}".format(thread_name)
    print "is {0} alive? {1}\n".format(thread_name, dld_img_thread.is_alive())
    while not thread_event.is_set():
        pf_fwdld.power_fail_test()
    print('stopping {}'.format(thread_name))

def debug_port_thread(thread_name, thread_event):
    pythoncom.CoInitializeEx(pythoncom.COINIT_MULTITHREADED)
    print "\nstarting {}".format(thread_name)
    print "is {0} alive? {1}\n".format(thread_name, debug_thread.is_alive())
    pf_dbg_rdr.debug_port_reader()
    print('\nstopping {}'.format(thread_name))

def main():
    global L00P_thread, dld_img_thread, debug_thread
    pf_dbg_rdr.samurai_event = threading.Event()
    L00P_thread = threading.Thread(target=loop_thread,
                                   args=('L00P_thread', pf_dbg_rdr.samurai_event))
    dld_img_thread = threading.Thread(target=image_dld,
                                      args=('image_download', pf_dbg_rdr.samurai_event))
    debug_thread = threading.Thread(target=debug_port_thread,
                                    args=('debug_port_reader', pf_dbg_rdr.samurai_event))
    L00P_thread.start()
    dld_img_thread.start()
    debug_thread.start()
    debug_thread.join()

if __name__ == '__main__':
    main()
    print('processes stopped')
    print "Exiting Main Thread"
Use a second variable in your while condition that you can change once your timeout is reached.
For example:
shouldRun = True
while not thread_event.is_set() and shouldRun:
    print("value of loopCnt = {}".format(loopCnt))
    loopCnt += 1
    time.sleep(1)
    if loopCnt > 60: shouldRun = False
would stop after 60 iterations (about 60 seconds, given you sleep for 1 second on each iteration).
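A variation on the same idea: instead of counting iterations, compare against a wall-clock deadline, which stays accurate even when individual iterations take longer than a second. A minimal sketch (the sleep is shortened so it runs quickly; the real loop body would go in its place):

```python
import threading
import time

def loop_thread(thread_name, thread_event, timeout_seconds):
    deadline = time.time() + timeout_seconds
    loopCnt = 0
    # stop when either the event fires or the wall-clock deadline passes
    while not thread_event.is_set() and time.time() < deadline:
        loopCnt += 1
        time.sleep(0.01)
    return loopCnt

event = threading.Event()
count = loop_thread('L00P_thread', event, 0.1)  # runs ~0.1 s, then stops
print("iterations:", count)
```

Note that neither approach can interrupt a single long-running call such as pf_fwdld.power_fail_test() mid-execution; the loop only stops at the next condition check.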

Is there a way to stop a Python thread when the correct answer is found?

I'm trying to check a list of answers like so:
def checkAns(File, answer):
    answer = bytes(answer, "UTF-8")
    try:
        File.extractall(pwd=answer)
    except:
        pass
    else:
        print("[+] Correct Answer: " + answer.decode("UTF-8") + "\n")

def main():
    File = zipfile.ZipFile("questions.zip")
    ansFile = open("answers.txt")
    for line in ansFile.readlines():
        answer = line.strip("\n")
        t = Thread(target=checkAns, args=(File, answer))
        t.start()
Assume the correct answer is 4 and your list contains values 1 through 1000000.
How do I get it to stop after it gets to 4 and not run through the remaining numbers in the list?
I have tried it several different ways:
else:
    print("[+] Correct Answer: " + answer.decode("UTF-8") + "\n")
    exit(0)
and also
try:
    File.extractall(pwd=answer)
    print("[+] Correct Answer: " + answer.decode("UTF-8") + "\n")
    exit(0)
except:
    pass
How do I get all the threads to stop after the correct answer is found?
Strangely, in Python you can't kill threads:
Python’s Thread class supports a subset of the behavior of Java’s
Thread class; currently, there are no priorities, no thread groups,
and threads cannot be destroyed, stopped, suspended, resumed, or
interrupted.
https://docs.python.org/2/library/threading.html#threading.ThreadError
This sample creates a thread that will run for 10 seconds. The parent then waits two seconds, decides it is "done", and waits on (i.e. join()s) the outstanding threads before exiting cleanly.
import sys, threading, time

class MyThread(threading.Thread):
    def run(self):
        for _ in range(10):
            print 'ding'
            time.sleep(1)

MyThread().start()
time.sleep(2)
print 'joining threads'
for thread in threading.enumerate():
    if thread is not threading.current_thread():
        thread.join()
print 'done'
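Since threads can't be killed, the usual pattern for "stop when the answer is found" is cooperative: share a threading.Event, set it when a worker succeeds, and have every worker check it before doing any work. A minimal sketch with a stand-in check in place of File.extractall (the value 42 and the range are illustrative):

```python
import threading

found = threading.Event()
correct = []

def checkAns(answer):
    if found.is_set():   # another thread already succeeded; skip the work
        return
    if answer == 42:     # stand-in for File.extractall(pwd=...) succeeding
        correct.append(answer)
        found.set()      # signal every other thread to stop early

threads = [threading.Thread(target=checkAns, args=(n,)) for n in range(1, 101)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(correct)  # [42]
```

Threads that start after the event is set return immediately; ones already inside extractall still finish that one call, which is the best a cooperative scheme can do.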
