Why doesn't multithreading increase the speed of my code?

I tried improving my code by timing it with and without a second thread:
from threading import Lock
from threading import Thread
import time

start_time = time.clock()
arr_lock = Lock()
arr = range(5000)

def do_print():
    # Disable arr access to other threads; they will have to wait if they need to read
    a = 0
    while True:
        arr_lock.acquire()
        if len(arr) > 0:
            item = arr.pop(0)
            print item
            arr_lock.release()
            b = 0
            for a in range(30000):
                b = b + 1
        else:
            arr_lock.release()
            break

thread1 = Thread(target=do_print)
thread1.start()
thread1.join()

print time.clock() - start_time, "seconds"
When running with 2 threads, my code's run time increased. Does anyone know why this happened, or perhaps know a different way to increase the performance of my code?

The primary reason you aren't seeing any performance improvement with multiple threads is that your program only enables one thread to do anything useful at a time. The other thread is always blocked.
Two things:
Remove the print statement that's invoked inside the lock. print statements drastically impact performance and timing, and the I/O channel to stdout is essentially single-threaded, so you've built another implicit lock into your code.
Use a proper sleep technique instead of "spin locking" and counting up from 0 to 30000. That's just going to burn a core needlessly.
Try this as your main loop
while True:
    arr_lock.acquire()
    if len(arr) > 0:
        item = arr.pop(0)
        arr_lock.release()
        time.sleep(0)
    else:
        arr_lock.release()
        break
This should run slightly better... I would even advocate getting the sleep statement out altogether so you can just let each thread have a full quantum.
However, because each thread is either doing "nothing" (sleeping or blocked on acquire) or just doing a single pop call on the array while in the lock, the majority of the time spent is going to be in the acquire/release calls instead of actually operating on the array. Hence, multiple threads aren't going to make your program run faster.
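For completeness, here is a minimal two-thread timing harness in the spirit of the question (the name do_pop is my own, and I use time.time() rather than time.clock() for wall-clock timing). Since each worker holds the lock for the whole pop, a second thread mostly adds lock traffic, so expect the two-thread run to be no faster than one thread:
import time
from threading import Lock, Thread

arr_lock = Lock()
arr = list(range(5000))

def do_pop():
    # Pop items until the list is empty; the lock serializes all access.
    while True:
        arr_lock.acquire()
        if len(arr) > 0:
            arr.pop(0)
            arr_lock.release()
        else:
            arr_lock.release()
            break

start = time.time()
threads = [Thread(target=do_pop) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print("%.3f seconds" % (time.time() - start))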

Related

Starting n number of threads from a loop

So basically, I have this function th() which counts up to a certain number and then prints "done".
I'd want to start n number of such threads at the same time, running simultaneously.
So I wrote:
from threading import Thread

thread_num = 3  # here n is 3, but I'd normally want something way higher
thrds = []
i = 0
while i < thread_num:
    thr = Thread(target=th, args=())
    thrds.append(thr)
    i += 1
    print("thread", str(i), "added")

for t in thrds:
    t.start()
    t.join()
I'd want all the threads to print "done" at the same time, but there is a noticeable lag between them. They print "thread i added" at seemingly the same time, but print "done" with quite a bit of time lag.
Why is this happening?
Edit: Since someone asked me to add the th() function as well, here it is:
def th():
    v = 0
    num = 10**7
    while v < num:
        v += 1
    print("done")
This is happening because of the t.join() method that you are calling on each thread before starting the next one. t.join() blocks the execution of the current thread until the thread t has completed execution. So each thread starts after the previous one has finished.
You first have to start all the threads, then join all the threads in separate for loops; otherwise, each thread starts but runs to completion due to join before starting another thread.
for t in thrds:  # start all the threads
    t.start()

for t in thrds:  # wait for all threads to finish
    t.join()
If you only have a simple counting thread, you may need to add a short sleep to actually see the threads' output intermingle, as they may still run fast enough to complete before another thread starts.
Because you start and join each thread sequentially, one thread will run to completion before the next even starts. You'd be better off using a thread pool, which is a more comprehensive implementation that handles many of these details for you; a sketch follows below.
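As a sketch of the thread-pool idea (using concurrent.futures.ThreadPoolExecutor from the Python 3 standard library; th is the counting function from the question):
from concurrent.futures import ThreadPoolExecutor

def th():
    v = 0
    while v < 10**7:
        v += 1
    print("done")

thread_num = 3
with ThreadPoolExecutor(max_workers=thread_num) as pool:
    for _ in range(thread_num):
        pool.submit(th)  # schedule each task; the pool reuses its worker threads
# Leaving the with-block implicitly joins: it waits for all tasks to finish.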
Because of memory management and object reference counting, Python only lets a single thread execute bytecode at a time. Periodically, each thread releases and reacquires the Global Interpreter Lock (GIL) to let other threads run. Exactly which thread runs at any given time is up to the operating system, and you may find one gets more time slices than another, causing staggered results.
To get them all to print "done" at the same time, you could use a control structure like a barrier for threads to wait until all are done. With a barrier, all threads must call wait before any can continue.
import threading
from threading import Thread

thread_num = 3  # here n is 3, but I'd normally want something way higher
wait_done = threading.Barrier(thread_num)

def th(waiter):
    x = 1  # do whatever work you want here
    waiter.wait()
    print("done")

thrds = []
i = 0
while i < thread_num:
    thr = Thread(target=th, args=(wait_done,))
    thrds.append(thr)
    i += 1
    print("thread", str(i), "added")

for t in thrds:
    t.start()
for t in thrds:
    t.join()

Python multithreading producing funky results

I'm fairly new to multithreading in Python and encountered an issue (likely due to concurrency problems). When I run the code below, it produces "normal" 1,2,3,4,5,6,7,8,9 digits for the first 9 numbers. However, when it moves on to the next batch of numbers (the ones that should be printed by each thread after it "sleeps" for 2 seconds) it spits out:
different numbers each time
often very large numbers
sometimes no numbers at all
I'm guessing this is a concurrency issue where, by the time each original thread gets to printing its second number after the sleep, the i variable has been tampered with by the code. But can someone please explain step by step what exactly is happening, and why the numbers sometimes go missing or come out very large?
import threading
import time

def foo(text):
    print(text)
    time.sleep(2)
    print(text)

for i in range(1, 10):
    allTreads = []
    current_thread = threading.Thread(target=foo, args=(i,))
    allTreads.append(current_thread)
    current_thread.start()
Well, your problem is called a race condition. Sometimes when the code is executed, one thread will print a number before the implicit '\n' of another thread, and that's why you often see that kind of behaviour.
Also, what's the purpose of the allTreads list there? It is re-created on every iteration, so it only ever stores the current thread and is then discarded at the end of the iteration.
In order to avoid race conditions, you need some kind of synchronization between threads. Consider threading.Lock(), so that no more than one thread at a time prints the given text:
import threading
import time

lock = threading.Lock()

def foo(text):
    with lock:
        print(text)
    time.sleep(2)
    with lock:
        print(text)

for i in range(1, 10):
    allTreads = []
    current_thread = threading.Thread(target=foo, args=(i,))
    allTreads.append(current_thread)
    current_thread.start()
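Another option, as a sketch of my own rather than part of the answer above (Python 3): instead of locking around print, funnel all output through one dedicated printer thread via a queue, so no two threads ever write to stdout at once:
import threading
import time
from queue import Queue

out_queue = Queue()

def printer():
    # The only thread that touches stdout; runs until it sees the sentinel.
    while True:
        text = out_queue.get()
        if text is None:
            break
        print(text)

def foo(text):
    out_queue.put(text)
    time.sleep(2)
    out_queue.put(text)

printer_thread = threading.Thread(target=printer)
printer_thread.start()

workers = [threading.Thread(target=foo, args=(i,)) for i in range(1, 10)]
for w in workers:
    w.start()
for w in workers:
    w.join()

out_queue.put(None)  # sentinel: tell the printer to exit
printer_thread.join()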
The threading documentation in Python is quite good. I recommend reading these two links:
Python Threading Documentation
Real Python Threading

How to jump out of a dead loop automatically in Python?

I have a "do..., until..." structure in Python as follows:
while True:
    if foo() == bar():
        break
It works fine (jumps out in the end) in most cases. However, in cases where the condition is never met, it gets stuck there.
Figuring out what these cases are is difficult, since there is essentially a random process behind it. So I wish to set a kind of timeout for the while loop.
Say, if the loop has been running for 1 second but still has not stopped, I want it to terminate itself.
How may I do this?
Update: Here is the actual code:
while True:
    possibleJunctions = junctionReachability[junctions.index(currentJunction)]
    nextJunction = random.choice(
        filter(lambda (jx, jy): (jx - currentJunction[0]) * (endJunction[0] - currentJunction[0]) > 0
                             or (jy - currentJunction[1]) * (endJunction[1] - currentJunction[1]) > 0,
               possibleJunctions) or possibleJunctions)
    if previousJunction != nextJunction:  # never go back
        junctionSequence.append(nextJunction)
        previousJunction = currentJunction
        currentJunction = nextJunction
    if currentJunction == endJunction:
        break
The simplest approach is to check the elapsed time on each pass through the loop:
import time

loop_start = time.time()
while time.time() - loop_start <= 1:
    if foo() == bar():
        break
EDIT
Dan Doe's solution is simplest and best if your code is synchronous (just runs in a single thread) and you know that the foo and bar functions always terminate within some period of time.
If you have asynchronous code (like a GUI), or if the foo and bar functions you use to test for termination conditions can themselves take too long to complete, then read on.
Run the loop inside a separate process, and run a timer in another process. Once the timer expires, it sets a flag that causes the loop to terminate.
Something like this (warning: untested code):
import multiprocessing
import time

SECONDS = 10

def worker(event):
    """Does stuff until work is complete, or until signaled to terminate by the timer."""
    while not event.is_set():
        if foo() == bar():
            break

def timer(event):
    """Signals the worker to terminate after SECONDS seconds."""
    time.sleep(SECONDS)
    event.set()

def main():
    """Kicks off both subprocesses and waits for them to terminate."""
    # The event is passed explicitly so this also works where processes
    # are spawned rather than forked (e.g. on Windows).
    event = multiprocessing.Event()
    worker_process = multiprocessing.Process(target=worker, args=(event,))
    timer_process = multiprocessing.Process(target=timer, args=(event,))
    timer_process.start()
    worker_process.start()
    timer_process.join()
    worker_process.join()

if __name__ == "__main__":
    main()
If you were worried about the foo and bar functions taking too long to complete, you could explicitly terminate the worker process from within the timer process.
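As a sketch of that variation (terminating from the parent instead of a separate timer process, building on the code above):
def main():
    event = multiprocessing.Event()
    worker_process = multiprocessing.Process(target=worker, args=(event,))
    worker_process.start()
    worker_process.join(SECONDS)      # wait at most SECONDS for the worker
    if worker_process.is_alive():
        worker_process.terminate()    # hard-kill a worker stuck inside foo()/bar()
        worker_process.join()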
I recommend using a counter. This is a common trick to detect non-convergence.
import sys

maxiter = 10000
while True:
    if stopCondition():
        break
    maxiter = maxiter - 1
    if maxiter <= 0:
        print >>sys.stderr, "Did not converge."
        break
This requires the least overhead and usually adapts best to different CPUs: even on a faster CPU, you want the same termination behavior, which a time-based timeout would not give you.
However, it would be even better to detect being stuck directly, e.g. with some criterion function that no longer improves.
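As a sketch of that idea (step() and score() are hypothetical stand-ins for one iteration of the random process and an objective to minimize):
best = float("inf")
stall = 0
while True:
    step()                    # hypothetical: one iteration of the random process
    current = score()         # hypothetical: objective value to minimize
    if current < best:
        best = current
        stall = 0             # progress was made; reset the stall counter
    else:
        stall += 1
    if stall >= 100:          # no improvement for 100 iterations: assume stuck
        break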

Use of threading.Thread.join()

I am new to multithreading in Python and am trying to learn it using the threading module. I have made a very simple multithreading program and am having trouble understanding the threading.Thread.join method.
Here is the source code of the program I have made
import threading

val = 0

def increment():
    global val
    print "Inside increment"
    for x in range(100):
        val += 1
    print "val is now {} ".format(val)

thread1 = threading.Thread(target=increment, args=())
thread2 = threading.Thread(target=increment, args=())
thread1.start()
#thread1.join()
thread2.start()
#thread2.join()
What difference does it make if I use
thread1.join()
thread2.join()
which I have commented out in the above code? I ran both versions of the source (one with the join calls and one with them commented out), but the output is the same.
A call to thread1.join() blocks the thread in which you're making the call, until thread1 is finished. It's like wait_until_finished(thread1).
For example:
import time
from threading import Thread

def printer():
    for _ in range(3):
        time.sleep(1.0)
        print "hello"

thread = Thread(target=printer)
thread.start()
thread.join()
print "goodbye"
prints
hello
hello
hello
goodbye
Without the .join() call, goodbye would come first, followed by the three hellos.
Also, note that threads in Python do not provide any additional performance (in terms of CPU processing power) because of a thing called the Global Interpreter Lock. They are useful for spawning off potentially blocking (e.g. IO, network) or time-consuming (e.g. number crunching) tasks to keep the main thread free for other work, but they do not allow you to leverage multiple cores or CPUs. For that, look at multiprocessing, which uses subprocesses but exposes an API equivalent to that of threading.
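To illustrate the multiprocessing point with a sketch (my example, not the original poster's code): pure CPU work that threads would serialize on the GIL can run in parallel across processes:
from multiprocessing import Pool

def count(n):
    v = 0
    for _ in range(n):
        v += 1
    return v

if __name__ == "__main__":
    pool = Pool(processes=2)
    # The two counts run in separate processes, so on a multi-core machine
    # this takes roughly half the time of doing both counts in one process.
    results = pool.map(count, [10**7, 10**7])
    pool.close()
    pool.join()
    print(results)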
PLUG: ...and it is also for the above reason that, if you're interested in concurrency, you might want to look into a fine library called Gevent. It essentially makes threading much easier to use, much faster (when you have many concurrent activities), and less prone to concurrency-related bugs, while letting you keep coding much the same way as with "real" threads. Twisted, Eventlet, Tornado and many others are either equivalent or comparable. Furthermore, in any case, I'd strongly suggest reading these classics:
Generator Tricks for Systems Programmers
A Curious Course on Coroutines and Concurrency
I modified the code so that you can see exactly how join works. Run this code with the join calls commented out, then again with them enabled, and observe the output in both cases.
import threading
import time

val = 0

def increment(msg, sleep_time):
    global val
    print "Inside increment"
    for x in range(10):
        val += 1
        print "%s : %d\n" % (msg, val)
        time.sleep(sleep_time)

thread1 = threading.Thread(target=increment, args=("thread_01", 0.5))
thread2 = threading.Thread(target=increment, args=("thread_02", 1))
thread1.start()
#thread1.join()
thread2.start()
#thread2.join()
As the relevant documentation states, join makes the caller wait until the thread terminates.
In your case, the output is the same because join doesn't change the program's behaviour; it's being used so that the program exits cleanly, only after all the threads have terminated.

Return whichever expression returns first

I have two different functions f, and g that compute the same result with different algorithms. Sometimes one or the other takes a long time while the other terminates quickly. I want to create a new function that runs each simultaneously and then returns the result from the first that finishes.
I want to create that function with a higher order function
h = firstresult(f, g)
What is the best way to accomplish this in Python?
I suspect that the solution involves threading. I'd like to avoid discussion of the GIL.
I would simply use a Queue for this. Start the threads and the first one which has a result ready writes to the queue.
Code
from threading import Thread
from time import sleep
from Queue import Queue

def firstresult(*functions):
    queue = Queue()
    threads = []
    for f in functions:
        def thread_main(f=f):  # default argument binds f now; a plain closure would see the loop's last f
            queue.put(f())
        thread = Thread(target=thread_main)
        threads.append(thread)
        thread.start()
    result = queue.get()
    return result

def slow():
    sleep(1)
    return 42

def fast():
    return 0

if __name__ == '__main__':
    print firstresult(slow, fast)
Live demo
http://ideone.com/jzzZX2
Notes
Stopping the threads is an entirely different topic. For that you need to add some state variable to the threads, which they check at regular intervals. To keep this example short, I skipped that part and assumed that all workers get the time to finish their work, even though the extra results are never read.
Skipping the discussion about the GIL as requested by the questioner. ;-)
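To sketch what that state variable could look like (my variation on the code above, keeping its Python 2 flavor; note it changes the workers' signature so they accept and poll a threading.Event):
from threading import Thread, Event
from time import sleep
from Queue import Queue

def firstresult(*functions):
    queue = Queue()
    stop = Event()
    for f in functions:
        def thread_main(f=f):       # default argument binds this iteration's f
            queue.put(f(stop))
        Thread(target=thread_main).start()
    result = queue.get()
    stop.set()                      # signal the losing workers to give up
    return result

def slow(stop):
    for _ in range(100):
        if stop.is_set():           # cooperative cancellation point
            return None
        sleep(0.01)
    return 42

def fast(stop):
    return 0

if __name__ == '__main__':
    print firstresult(slow, fast)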
Now, unlike my suggestion in the other answer, this piece of code does exactly what you are requesting:
from multiprocessing import Process, Queue
import random
import time

def firstresult(func1, func2):
    queue = Queue()
    proc1 = Process(target=func1, args=(queue,))
    proc2 = Process(target=func2, args=(queue,))
    proc1.start(); proc2.start()
    result = queue.get()            # blocks until the first result arrives
    proc1.terminate(); proc2.terminate()
    return result

def algo1(queue):
    time.sleep(random.uniform(0, 1))
    queue.put("algo 1")

def algo2(queue):
    time.sleep(random.uniform(0, 1))
    queue.put("algo 2")

if __name__ == '__main__':
    print firstresult(algo1, algo2)
Run each function in a new worker thread; the two worker threads send the result back to the main thread via a one-item queue or something similar. When the main thread receives the result from the winner, it kills (do Python threads support kill yet? lol.) both worker threads to avoid wasting time (one function may take hours while the other takes only a second).
Replace the word thread with process if you want.
You will need to run each function in another process (with multiprocessing) or in a different thread.
If both are CPU bound, multithreading won't help much, precisely because of the GIL, so multiprocessing is the way to go.
If the return value is a pickleable (serializable) object, I have this decorator I created that simply runs the function in background, in another process:
https://bitbucket.org/jsbueno/lelo/src
It is not exactly what you want, as both are non-blocking and start executing right away. The trick with this decorator is that it blocks (and waits for the function to complete) only when you try to use the return value.
But on the other hand - it is just a decorator that does all the work.
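For reference, on Python 3 the same idea can be sketched with the standard-library concurrent.futures module, waiting for whichever future finishes first:
from concurrent.futures import ProcessPoolExecutor, wait, FIRST_COMPLETED
import random
import time

def algo1():
    time.sleep(random.uniform(0, 1))
    return "algo 1"

def algo2():
    time.sleep(random.uniform(0, 1))
    return "algo 2"

def firstresult(*functions):
    pool = ProcessPoolExecutor()
    futures = [pool.submit(f) for f in functions]
    done, not_done = wait(futures, return_when=FIRST_COMPLETED)
    for f in not_done:
        f.cancel()               # only cancels tasks that have not started yet
    pool.shutdown(wait=False)    # return immediately; do not wait for the loser
    return next(iter(done)).result()

if __name__ == "__main__":
    print(firstresult(algo1, algo2))
Note that unlike Process.terminate() above, this does not kill a worker that is already running; the losing function simply finishes in the background.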
