How can I use threading in Python? - python

I am trying to understand threading in Python. I've looked at the documentation and examples, but quite frankly, many examples are overly sophisticated and I'm having trouble understanding them.
How do you clearly show tasks being divided for multi-threading?

Since this question was asked in 2010, there has been real simplification in how to do simple multithreading with Python with map and pool.
The code below comes from an article/blog post that you should definitely check out (no affiliation) - Parallelism in one line: A Better Model for Day to Day Threading Tasks. I'll summarize below - it ends up being just a few lines of code:
from multiprocessing.dummy import Pool as ThreadPool
pool = ThreadPool(4)
results = pool.map(my_function, my_array)
Which is the multithreaded version of:
results = []
for item in my_array:
    results.append(my_function(item))
Description
Map is a cool little function, and the key to easily injecting parallelism into your Python code. For those unfamiliar, map is something lifted from functional languages like Lisp. It is a function which maps another function over a sequence.
Map handles the iteration over the sequence for us, applies the function, and stores all of the results in a handy list at the end.
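As a minimal sketch of that equivalence (the square function and the input list are hypothetical stand-ins, not taken from the article), the built-in map and a thread pool's map return the same results in the same order:

```python
from multiprocessing.dummy import Pool as ThreadPool

def square(x):
    # Hypothetical stand-in for my_function
    return x * x

numbers = [1, 2, 3, 4, 5]

# Built-in map: applies square to each item, one after another
serial_results = list(map(square, numbers))

# Thread-pool map: same call shape, but the items are handled by 4 worker threads
with ThreadPool(4) as pool:
    threaded_results = pool.map(square, numbers)

print(serial_results == threaded_results)
```

pool.map blocks until every item has been processed, so the results are complete before the with block exits.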
Implementation
Parallel versions of the map function are provided by two libraries: multiprocessing, and its little-known but equally fantastic stepchild, multiprocessing.dummy.
multiprocessing.dummy is exactly the same as the multiprocessing module, but uses threads instead (an important distinction: use multiple processes for CPU-intensive tasks, and threads for I/O-bound tasks):
multiprocessing.dummy replicates the API of multiprocessing, but is no more than a wrapper around the threading module.
import urllib2
from multiprocessing.dummy import Pool as ThreadPool
urls = [
'http://www.python.org',
'http://www.python.org/about/',
'http://www.onlamp.com/pub/a/python/2003/04/17/metaclasses.html',
'http://www.python.org/doc/',
'http://www.python.org/download/',
'http://www.python.org/getit/',
'http://www.python.org/community/',
'https://wiki.python.org/moin/',
]
# Make the Pool of workers
pool = ThreadPool(4)
# Open the URLs in their own threads
# and return the results
results = pool.map(urllib2.urlopen, urls)
# Close the pool and wait for the work to finish
pool.close()
pool.join()
And the timing results:
Single thread: 14.4 seconds
4 Pool: 3.1 seconds
8 Pool: 1.4 seconds
13 Pool: 1.3 seconds
Passing multiple arguments (works like this only in Python 3.3 and later):
To pass multiple arrays:
results = pool.starmap(function, zip(list_a, list_b))
Or to pass a constant and an array:
results = pool.starmap(function, zip(itertools.repeat(constant), list_a))
If you are using an earlier version of Python, you can pass multiple arguments via this workaround.
(Thanks to user136036 for the helpful comment.)
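A small runnable sketch of both starmap forms, assuming a hypothetical two-argument function add and made-up input lists:

```python
import itertools
from multiprocessing.dummy import Pool as ThreadPool

def add(a, b):
    # Hypothetical two-argument function
    return a + b

list_a = [1, 2, 3]
list_b = [10, 20, 30]

with ThreadPool(4) as pool:
    # Pair up items from both lists: add(1, 10), add(2, 20), add(3, 30)
    pairwise = pool.starmap(add, zip(list_a, list_b))
    # Pass the same constant with every item: add(100, 1), add(100, 2), add(100, 3)
    with_constant = pool.starmap(add, zip(itertools.repeat(100), list_a))

print(pairwise, with_constant)
```

zip stops at the shortest input, so pairing the endless itertools.repeat with a finite list is safe.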

Here's a simple example: you need to try a few alternative URLs and return the contents of the first one to respond.
import Queue
import threading
import urllib2
# Called by each thread
def get_url(q, url):
    q.put(urllib2.urlopen(url).read())
theurls = ["http://google.com", "http://yahoo.com"]
q = Queue.Queue()
for u in theurls:
    t = threading.Thread(target=get_url, args = (q,u))
    t.daemon = True
    t.start()
s = q.get()
print s
This is a case where threading is used as a simple optimization: each subthread is waiting for a URL to resolve and respond, to put its contents on the queue; each thread is a daemon (won't keep the process up if the main thread ends -- that's more common than not); the main thread starts all subthreads, does a get on the queue to wait until one of them has done a put, then emits the results and terminates (which takes down any subthreads that might still be running, since they're daemon threads).
Proper use of threads in Python is invariably connected to I/O operations (since CPython doesn't use multiple cores to run CPU-bound tasks anyway, the only reason for threading is not blocking the process while there's a wait for some I/O). Queues are almost invariably the best way to farm out work to threads and/or collect the work's results, by the way, and they're intrinsically threadsafe, so they save you from worrying about locks, conditions, events, semaphores, and other inter-thread coordination/communication concepts.
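For readers on Python 3, here is a minimal sketch of the same "first one to respond wins" pattern. To keep it runnable without a network, it assumes a hypothetical fake_fetch function that sleeps instead of opening a URL; the source names and delays are made up.

```python
import queue
import threading
import time

def fake_fetch(q, name, delay):
    # Hypothetical stand-in for a network request: wait, then report back
    time.sleep(delay)
    q.put(name)

def first_to_respond(sources):
    q = queue.Queue()
    for name, delay in sources:
        t = threading.Thread(target=fake_fetch, args=(q, name, delay))
        t.daemon = True  # do not keep the process alive for slow sources
        t.start()
    # Blocks until the fastest thread has put its result on the queue
    return q.get()

winner = first_to_respond([("slow", 0.5), ("fast", 0.05), ("medium", 0.2)])
print(winner)
```

The queue is the only shared object, so no explicit lock is needed.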

NOTE: For actual parallelization in Python, you should use the multiprocessing module to fork multiple processes that execute in parallel (due to the global interpreter lock, Python threads provide interleaving, but they are in fact executed serially, not in parallel, and are only useful when interleaving I/O operations).
However, if you are merely looking for interleaving (or are doing I/O operations that can be parallelized despite the global interpreter lock), then the threading module is the place to start. As a really simple example, let's consider the problem of summing a large range by summing subranges in parallel:
import threading
class SummingThread(threading.Thread):
    def __init__(self,low,high):
        super(SummingThread, self).__init__()
        self.low=low
        self.high=high
        self.total=0
    def run(self):
        for i in range(self.low,self.high):
            self.total+=i
thread1 = SummingThread(0,500000)
thread2 = SummingThread(500000,1000000)
thread1.start() # This actually causes the thread to run
thread2.start()
thread1.join() # This waits until the thread has completed
thread2.join()
# At this point, both threads have completed
result = thread1.total + thread2.total
print result
Note that the above is a very stupid example, as it does absolutely no I/O and will be executed serially albeit interleaved (with the added overhead of context switching) in CPython due to the global interpreter lock.

As others have mentioned, CPython can use threads only for I/O waits, due to the GIL.
If you want to benefit from multiple cores for CPU-bound tasks, use multiprocessing:
from multiprocessing import Process
def f(name):
    print 'hello', name
if __name__ == '__main__':
    p = Process(target=f, args=('bob',))
    p.start()
    p.join()

Just a note: A queue is not required for threading.
This is the simplest example I could imagine that shows several threads (nine, numbered 1 to 9) running concurrently.
import threading
from random import randint
from time import sleep
def print_number(number):
    # Sleeps a random 1 to 10 seconds
    rand_int_var = randint(1, 10)
    sleep(rand_int_var)
    print "Thread " + str(number) + " slept for " + str(rand_int_var) + " seconds"
thread_list = []
for i in range(1, 10):
    # Instantiates the thread
    # (i) does not make a sequence, so (i,)
    t = threading.Thread(target=print_number, args=(i,))
    # Sticks the thread in a list so that it remains accessible
    thread_list.append(t)
# Starts threads
for thread in thread_list:
    thread.start()
# This blocks the calling thread until the thread whose join() method is called is terminated.
# From http://docs.python.org/2/library/threading.html#thread-objects
for thread in thread_list:
    thread.join()
# Demonstrates that the main process waited for threads to complete
print "Done"

The answer from Alex Martelli helped me. However, here is a modified version that I thought was more useful (at least to me).
Updated: works in both Python 2 and Python 3
try:
    # For Python 3
    import queue
    from urllib.request import urlopen
except ImportError:
    # For Python 2
    import Queue as queue
    from urllib2 import urlopen
import threading
worker_data = ['http://google.com', 'http://yahoo.com', 'http://bing.com']
# Load up a queue with your data. This will handle locking
q = queue.Queue()
for url in worker_data:
    q.put(url)
# Define a worker function
def worker(url_queue):
    queue_full = True
    while queue_full:
        try:
            # Get your data off the queue, and do some work
            url = url_queue.get(False)
            data = urlopen(url).read()
            print(len(data))
        except queue.Empty:
            queue_full = False
# Create as many threads as you want
thread_count = 5
for i in range(thread_count):
    t = threading.Thread(target=worker, args = (q,))
    t.start()

Given a function, f, thread it like this:
import threading
threading.Thread(target=f).start()
To pass arguments to f:
threading.Thread(target=f, args=(a,b,c)).start()
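Note that a Thread does not hand back the return value of its target. A minimal sketch of one common workaround, assuming a hypothetical function f that writes its result into a shared dict (the names results and label are illustrative only):

```python
import threading

results = {}

def f(a, b, c, label="sum"):
    # Hypothetical worker: store the result where the caller can find it
    results[label] = a + b + c

# args supplies positional arguments, kwargs supplies keyword arguments
t = threading.Thread(target=f, args=(1, 2, 3), kwargs={"label": "total"})
t.start()
t.join()  # wait for the thread to finish before reading its result
print(results)
```

Keeping a reference to the thread, rather than calling start() on a temporary, is what makes the join() possible.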

I found this very useful: create as many threads as cores and let them execute a (large) number of tasks (in this case, calling a shell program):
import Queue
import threading
import multiprocessing
import subprocess
q = Queue.Queue()
for i in range(30): # Put 30 tasks in the queue
    q.put(i)
def worker():
    while True:
        item = q.get()
        # Execute a task: call a shell program and wait until it completes
        subprocess.call("echo " + str(item), shell=True)
        q.task_done()
cpus = multiprocessing.cpu_count() # Detect number of cores
print("Creating %d threads" % cpus)
for i in range(cpus):
    t = threading.Thread(target=worker)
    t.daemon = True
    t.start()
q.join() # Block until all tasks are done
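The same task_done()/join() handshake can be sketched in Python 3 without a shell call. This version assumes a hypothetical run_tasks helper that simply doubles each item; the worker count and the doubling are illustrative choices, not part of the answer above.

```python
import queue
import threading

def run_tasks(items, num_workers=4):
    q = queue.Queue()
    done = []
    lock = threading.Lock()

    def worker():
        while True:
            item = q.get()
            with lock:              # protect the shared result list
                done.append(item * 2)
            q.task_done()           # tell the queue this item is finished

    for _ in range(num_workers):
        threading.Thread(target=worker, daemon=True).start()
    for item in items:
        q.put(item)
    q.join()                        # returns once task_done() was called for every item
    return sorted(done)

processed = run_tasks(range(30))
print(processed[:5])
```

The daemon workers stay blocked on q.get() afterwards and disappear when the program exits.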

Python 3 provides a facility for launching parallel tasks, concurrent.futures, which makes our work easier.
It offers both thread pooling and process pooling.
The following examples give an idea of how it works:
ThreadPoolExecutor Example (source)
import concurrent.futures
import urllib.request
URLS = ['http://www.foxnews.com/',
        'http://www.cnn.com/',
        'http://europe.wsj.com/',
        'http://www.bbc.co.uk/',
        'http://some-made-up-domain.com/']
# Retrieve a single page and report the URL and contents
def load_url(url, timeout):
    with urllib.request.urlopen(url, timeout=timeout) as conn:
        return conn.read()
# We can use a with statement to ensure threads are cleaned up promptly
with concurrent.futures.ThreadPoolExecutor(max_workers=5) as executor:
    # Start the load operations and mark each future with its URL
    future_to_url = {executor.submit(load_url, url, 60): url for url in URLS}
    for future in concurrent.futures.as_completed(future_to_url):
        url = future_to_url[future]
        try:
            data = future.result()
        except Exception as exc:
            print('%r generated an exception: %s' % (url, exc))
        else:
            print('%r page is %d bytes' % (url, len(data)))
ProcessPoolExecutor (source)
import concurrent.futures
import math
PRIMES = [
    112272535095293,
    112582705942171,
    112272535095293,
    115280095190773,
    115797848077099,
    1099726899285419]
def is_prime(n):
    if n % 2 == 0:
        return False
    sqrt_n = int(math.floor(math.sqrt(n)))
    for i in range(3, sqrt_n + 1, 2):
        if n % i == 0:
            return False
    return True
def main():
    with concurrent.futures.ProcessPoolExecutor() as executor:
        for number, prime in zip(PRIMES, executor.map(is_prime, PRIMES)):
            print('%d is prime: %s' % (number, prime))
if __name__ == '__main__':
    main()

I saw a lot of examples here where no real work was being performed, and they were mostly CPU-bound. Here is an example of a CPU-bound task that computes all prime numbers between 10 million and 10.05 million. I have used all four methods here:
import math
import timeit
import threading
import multiprocessing
from concurrent.futures import ThreadPoolExecutor, ProcessPoolExecutor
def time_stuff(fn):
    """
    Measure time of execution of a function
    """
    def wrapper(*args, **kwargs):
        t0 = timeit.default_timer()
        fn(*args, **kwargs)
        t1 = timeit.default_timer()
        print("{} seconds".format(t1 - t0))
    return wrapper
def find_primes_in(nmin, nmax):
    """
    Compute a list of prime numbers between the given minimum and maximum arguments
    """
    primes = []
    # Loop from minimum to maximum
    for current in range(nmin, nmax + 1):
        # Take the square root of the current number
        sqrt_n = int(math.sqrt(current))
        found = False
        # Check if any number from 2 to the square root + 1 divides the current number under consideration
        for number in range(2, sqrt_n + 1):
            # If divisible we have found a factor, hence this is not a prime number, let's move to the next one
            if current % number == 0:
                found = True
                break
        # If not divisible, add this number to the list of primes that we have found so far
        if not found:
            primes.append(current)
    # I am merely printing the length of the array containing all the primes, but feel free to do what you want
    print(len(primes))
@time_stuff
def sequential_prime_finder(nmin, nmax):
    """
    Use the main process and main thread to compute everything in this case
    """
    find_primes_in(nmin, nmax)
@time_stuff
def threading_prime_finder(nmin, nmax):
    """
    If the minimum is 1000 and the maximum is 2000 and we have four workers,
    1000 - 1250 to worker 1
    1250 - 1500 to worker 2
    1500 - 1750 to worker 3
    1750 - 2000 to worker 4
    so let’s split the minimum and maximum values according to the number of workers
    """
    nrange = nmax - nmin
    threads = []
    for i in range(8):
        start = int(nmin + i * nrange/8)
        end = int(nmin + (i + 1) * nrange/8)
        # Start the thread with the minimum and maximum split up to compute
        # Parallel computation will not work here due to the GIL since this is a CPU-bound task
        t = threading.Thread(target = find_primes_in, args = (start, end))
        threads.append(t)
        t.start()
    # Don’t forget to wait for the threads to finish
    for t in threads:
        t.join()
@time_stuff
def processing_prime_finder(nmin, nmax):
    """
    Split the minimum, maximum interval similar to the threading method above, but use processes this time
    """
    nrange = nmax - nmin
    processes = []
    for i in range(8):
        start = int(nmin + i * nrange/8)
        end = int(nmin + (i + 1) * nrange/8)
        p = multiprocessing.Process(target = find_primes_in, args = (start, end))
        processes.append(p)
        p.start()
    for p in processes:
        p.join()
@time_stuff
def thread_executor_prime_finder(nmin, nmax):
    """
    Split the min max interval similar to the threading method, but use a thread pool executor this time.
    This method is slightly faster than using pure threading as the pools manage threads more efficiently.
    This method is still slow due to the GIL limitations since we are doing a CPU-bound task.
    """
    nrange = nmax - nmin
    with ThreadPoolExecutor(max_workers = 8) as e:
        for i in range(8):
            start = int(nmin + i * nrange/8)
            end = int(nmin + (i + 1) * nrange/8)
            e.submit(find_primes_in, start, end)
@time_stuff
def process_executor_prime_finder(nmin, nmax):
    """
    Split the min max interval similar to the threading method, but use the process pool executor.
    This is the fastest method recorded so far as it manages process efficiently + overcomes GIL limitations.
    RECOMMENDED METHOD FOR CPU-BOUND TASKS
    """
    nrange = nmax - nmin
    with ProcessPoolExecutor(max_workers = 8) as e:
        for i in range(8):
            start = int(nmin + i * nrange/8)
            end = int(nmin + (i + 1) * nrange/8)
            e.submit(find_primes_in, start, end)
def main():
    nmin = int(1e7)
    nmax = int(1.05e7)
    print("Sequential Prime Finder Starting")
    sequential_prime_finder(nmin, nmax)
    print("Threading Prime Finder Starting")
    threading_prime_finder(nmin, nmax)
    print("Processing Prime Finder Starting")
    processing_prime_finder(nmin, nmax)
    print("Thread Executor Prime Finder Starting")
    thread_executor_prime_finder(nmin, nmax)
    print("Process Executor Finder Starting")
    process_executor_prime_finder(nmin, nmax)
if __name__ == "__main__":
    main()
Here are the results on my Mac OS X four-core machine
Sequential Prime Finder Starting
9.708213827005238 seconds
Threading Prime Finder Starting
9.81836523200036 seconds
Processing Prime Finder Starting
3.2467174359990167 seconds
Thread Executor Prime Finder Starting
10.228896902000997 seconds
Process Executor Finder Starting
2.656402041000547 seconds

Using the blazing new concurrent.futures module
def sqr(val):
    import time
    time.sleep(0.1)
    return val * val
def process_result(result):
    print(result)
def process_these_asap(tasks):
    import concurrent.futures
    with concurrent.futures.ProcessPoolExecutor() as executor:
        futures = []
        for task in tasks:
            futures.append(executor.submit(sqr, task))
        for future in concurrent.futures.as_completed(futures):
            process_result(future.result())
        # Or instead of all this just do:
        # results = executor.map(sqr, tasks)
        # list(map(process_result, results))
def main():
    tasks = list(range(10))
    print('Processing {} tasks'.format(len(tasks)))
    process_these_asap(tasks)
    print('Done')
    return 0
if __name__ == '__main__':
    import sys
    sys.exit(main())
The executor approach might seem familiar to all those who have gotten their hands dirty with Java before.
Also, on a side note: to keep the universe sane, don't forget to close your pools/executors if you don't use a with block (which is so awesome that it does it for you).
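A minimal sketch of closing an executor by hand when a with block is not an option. It uses a ThreadPoolExecutor rather than the ProcessPoolExecutor from the example above, purely to keep the snippet short; shutdown() works the same way on both.

```python
from concurrent.futures import ThreadPoolExecutor

def sqr(val):
    return val * val

executor = ThreadPoolExecutor(max_workers=4)
try:
    squares = list(executor.map(sqr, range(5)))
finally:
    # Equivalent to what leaving a with block does: wait for pending
    # work and release the worker threads
    executor.shutdown(wait=True)

print(squares)
```

The try/finally ensures the shutdown happens even if one of the tasks raises.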

For me, the perfect example for threading is monitoring asynchronous events. Look at this code.
# thread_test.py
import threading
import time
class Monitor(threading.Thread):
    def __init__(self, mon):
        threading.Thread.__init__(self)
        self.mon = mon
    def run(self):
        while True:
            if self.mon[0] == 2:
                print "Mon = 2"
                self.mon[0] = 3
You can play with this code by opening an IPython session and doing something like:
>>> from thread_test import Monitor
>>> a = [0]
>>> mon = Monitor(a)
>>> mon.start()
>>> a[0] = 2
Mon = 2
>>> a[0] = 2
Mon = 2
Wait a few minutes
>>> a[0] = 2
Mon = 2
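The Monitor above polls in a tight loop, which keeps a CPU core busy while nothing happens. As an alternative sketch (a different technique from the answer's, using threading.Event; the EventMonitor class and the reports queue are hypothetical names), the monitoring thread can sleep until it is notified:

```python
import queue
import threading

class EventMonitor(threading.Thread):
    def __init__(self, event, reports):
        super().__init__(daemon=True)
        self.event = event
        self.reports = reports

    def run(self):
        while True:
            self.event.wait()    # sleeps until another thread calls set()
            self.event.clear()   # re-arm for the next notification
            self.reports.put("event seen")

event = threading.Event()
reports = queue.Queue()
EventMonitor(event, reports).start()

event.set()                              # plays the role of a[0] = 2
first_report = reports.get(timeout=5)    # wait for the monitor to react
print(first_report)
```

Because the thread is a daemon, it does not prevent the interpreter from exiting.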

Most documentation and tutorials use Python's threading and queue modules, which can seem overwhelming for beginners.
Perhaps consider the concurrent.futures.ThreadPoolExecutor class of Python 3.
Combined with a with statement and a comprehension, it can be a real charm.
from concurrent.futures import ThreadPoolExecutor, as_completed
def get_url(url):
    # Your actual program here. Use threading.Lock() if necessary
    return ""
# List of URLs to fetch
urls = ["url1", "url2"]
with ThreadPoolExecutor(max_workers = 5) as executor:
    # Submit one task per URL; each submit() returns a future
    futures = {executor.submit(get_url, url) for url in urls}
    # as_completed() yields each future as soon as it has finished
    for f in as_completed(futures):
        # Get the results
        rs = f.result()

Borrowing from this post, we know how to choose between multithreading, multiprocessing, and async/asyncio, and how each is used.
Python 3 has a built-in library for concurrency and parallelism: concurrent.futures
So I'll demonstrate with an experiment that runs four tasks (calls to sleep()) on a thread pool:
from concurrent.futures import ThreadPoolExecutor, as_completed
from time import sleep, time
def concurrent(max_worker):
    futures = []
    tic = time()
    with ThreadPoolExecutor(max_workers=max_worker) as executor:
        futures.append(executor.submit(sleep, 2)) # Two seconds sleep
        futures.append(executor.submit(sleep, 1))
        futures.append(executor.submit(sleep, 7))
        futures.append(executor.submit(sleep, 3))
        for future in as_completed(futures):
            if future.result() is not None:
                print(future.result())
    print(f'Total elapsed time by {max_worker} workers:', time()-tic)
concurrent(5)
concurrent(4)
concurrent(3)
concurrent(2)
concurrent(1)
Output:
Total elapsed time by 5 workers: 7.007831811904907
Total elapsed time by 4 workers: 7.007944107055664
Total elapsed time by 3 workers: 7.003149509429932
Total elapsed time by 2 workers: 8.004627466201782
Total elapsed time by 1 workers: 13.013478994369507
[NOTE]:
As you can see in the above results, the best case was 3 workers for those four tasks.
If you have a CPU-bound task rather than an I/O-bound or blocking one (a case for multiprocessing instead of threading), you can change ThreadPoolExecutor to ProcessPoolExecutor.

I would like to contribute with a simple example and the explanations I've found useful when I had to tackle this problem myself.
In this answer you will find some information about Python's GIL (global interpreter lock) and a simple day-to-day example written using multiprocessing.dummy plus some simple benchmarks.
Global Interpreter Lock (GIL)
Python doesn't allow multi-threading in the truest sense of the word. It has a multi-threading package, but if you want to multi-thread to speed your code up, then it's usually not a good idea to use it.
Python has a construct called the global interpreter lock (GIL).
The GIL makes sure that only one of your 'threads' can execute at any one time. A thread acquires the GIL, does a little work, then passes the GIL onto the next thread.
This happens very quickly so to the human eye it may seem like your threads are executing in parallel, but they are really just taking turns using the same CPU core.
All this GIL passing adds overhead to execution. This means that if you want to make your code run faster then using the threading package often isn't a good idea.
There are reasons to use Python's threading package. If you want to run some things simultaneously, and efficiency is not a concern, then it's totally fine and convenient. Or if you are running code that needs to wait for something (like some I/O) then it could make a lot of sense. But the threading library won't let you use extra CPU cores.
Multi-threading can be outsourced to the operating system (by doing multi-processing), to some external application that calls your Python code (for example, Spark or Hadoop), or to some code that your Python code calls (for example, you could have your Python code call a C function that does the expensive multi-threaded stuff).
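To make the "waiting for something" case concrete, here is a rough sketch in which time.sleep stands in for a blocking I/O call (an assumption for illustration; a sleeping thread releases the GIL, as a thread blocked on real I/O does). The four waits overlap when run in threads:

```python
import threading
import time

def wait_for_io():
    # time.sleep stands in for a blocking I/O call
    time.sleep(0.2)

# Four waits one after another
start = time.perf_counter()
for _ in range(4):
    wait_for_io()
sequential_seconds = time.perf_counter() - start

# The same four waits, each in its own thread
start = time.perf_counter()
threads = [threading.Thread(target=wait_for_io) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
threaded_seconds = time.perf_counter() - start

print(f"sequential: {sequential_seconds:.2f}s, threaded: {threaded_seconds:.2f}s")
```

Replacing the sleep with a pure-Python computation would remove most of the difference in CPython, for the GIL reasons described above.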
Why This Matters
Because lots of people spend a lot of time trying to find bottlenecks in their fancy Python multi-threaded code before they learn what the GIL is.
Once this information is clear, here's my code:
#!/bin/python
from multiprocessing.dummy import Pool
from subprocess import PIPE,Popen
import time
import os
# In the variable pool_size we define the "parallelness".
# For CPU-bound tasks, it doesn't make sense to create more Pool processes
# than you have cores to run them on.
#
# On the other hand, if you are using I/O-bound tasks, it may make sense
# to create a quite a few more Pool processes than cores, since the processes
# will probably spend most their time blocked (waiting for I/O to complete).
pool_size = 8
def do_ping(ip):
    if os.name == 'nt':
        print ("Using Windows Ping to " + ip)
        proc = Popen(['ping', ip], stdout=PIPE)
        return proc.communicate()[0]
    else:
        print ("Using Linux / Unix Ping to " + ip)
        proc = Popen(['ping', ip, '-c', '4'], stdout=PIPE)
        return proc.communicate()[0]
os.system('cls' if os.name=='nt' else 'clear')
print ("Running using threads\n")
start_time = time.time()
pool = Pool(pool_size)
website_names = ["www.google.com","www.facebook.com","www.pinterest.com","www.microsoft.com"]
result = {}
for website_name in website_names:
    result[website_name] = pool.apply_async(do_ping, args=(website_name,))
pool.close()
pool.join()
print ("\n--- Execution took {} seconds ---".format((time.time() - start_time)))
# Now we do the same without threading, just to compare time
print ("\nRunning NOT using threads\n")
start_time = time.time()
for website_name in website_names:
    do_ping(website_name)
print ("\n--- Execution took {} seconds ---".format((time.time() - start_time)))
# Here's one way to print the final output from the threads
output = {}
for key, value in result.items():
    output[key] = value.get()
print ("\nOutput aggregated in a Dictionary:")
print (output)
print ("\n")
print ("\nPretty printed output: ")
for key, value in output.items():
    print (key + "\n")
    print (value)

Here is a very simple example of CSV import using threading. (The imports may differ depending on your project.)
Helper Functions:
from threading import Thread
from project import app
import csv
def import_handler(csv_file_name):
    thr = Thread(target=dump_async_csv_data, args=[csv_file_name])
    thr.start()
def dump_async_csv_data(csv_file_name):
    with app.app_context():
        with open(csv_file_name) as File:
            reader = csv.DictReader(File)
            for row in reader:
                # DB operation/query
                pass
Driver Function:
import_handler(csv_file_name)

Here is a simple multithreading example that may be helpful. You can run it to see how multithreading works in Python. I used a lock to keep other threads waiting until the earlier threads have finished their work. With this line of code,
tLock = threading.BoundedSemaphore(value=4)
you can allow a fixed number of threads (here, four) to run at a time and hold back the rest, which run once an earlier thread has finished.
import threading
import time
#tLock = threading.Lock()
tLock = threading.BoundedSemaphore(value=4)
def timer(name, delay, repeat):
    print "\r\nTimer: ", name, " Started"
    tLock.acquire()
    print "\r\n", name, " has acquired the lock"
    while repeat > 0:
        time.sleep(delay)
        print "\r\n", name, ": ", str(time.ctime(time.time()))
        repeat -= 1
    print "\r\n", name, " is releasing the lock"
    tLock.release()
    print "\r\nTimer: ", name, " Completed"
def Main():
    t1 = threading.Thread(target=timer, args=("Timer1", 2, 5))
    t2 = threading.Thread(target=timer, args=("Timer2", 3, 5))
    t3 = threading.Thread(target=timer, args=("Timer3", 4, 5))
    t4 = threading.Thread(target=timer, args=("Timer4", 5, 5))
    t5 = threading.Thread(target=timer, args=("Timer5", 0.1, 5))
    t1.start()
    t2.start()
    t3.start()
    t4.start()
    t5.start()
    print "\r\nMain Complete"
if __name__ == "__main__":
    Main()
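A compact Python 3 sketch of the same idea, which also measures how many threads were inside the semaphore at once. The state dict, the thread count of 10, and the 0.05-second sleep are illustrative assumptions.

```python
import threading
import time

limit = threading.BoundedSemaphore(value=4)
counter_lock = threading.Lock()
state = {"running": 0, "peak": 0}

def limited_task():
    with limit:                      # at most 4 threads get past this line
        with counter_lock:
            state["running"] += 1
            state["peak"] = max(state["peak"], state["running"])
        time.sleep(0.05)             # pretend to work
        with counter_lock:
            state["running"] -= 1

threads = [threading.Thread(target=limited_task) for _ in range(10)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print("peak concurrency:", state["peak"])
```

Using the semaphore as a context manager guarantees the release even if the task raises, which the explicit acquire()/release() pair does not.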

None of the previous solutions actually used multiple cores on my GNU/Linux server (where I don't have administrator rights). They just ran on a single core.
I used the lower level os.fork interface to spawn multiple processes. This is the code that worked for me:
from os import fork
values = ['different', 'values', 'for', 'threads']
for i in range(len(values)):
    p = fork()
    if p == 0:
        my_function(values[i])
        break

As a Python 3 version of the second answer:
import queue as Queue
import threading
import urllib.request
# Called by each thread
def get_url(q, url):
    q.put(urllib.request.urlopen(url).read())
theurls = ["http://google.com", "http://yahoo.com", "http://www.python.org","https://wiki.python.org/moin/"]
q = Queue.Queue()
def thread_func():
    for u in theurls:
        t = threading.Thread(target=get_url, args = (q,u))
        t.daemon = True
        t.start()
    s = q.get()
def non_thread_func():
    for u in theurls:
        get_url(q,u)
    s = q.get()
And you can test it:
import time
start = time.time()
thread_func()
end = time.time()
print(end - start)
start = time.time()
non_thread_func()
end = time.time()
print(end - start)
non_thread_func() should take roughly four times as long as thread_func().

import threading
import requests
def send():
    r = requests.get('https://www.stackoverlow.com')
thread = []
# Pass the function itself; target=send() would call it in the main thread
t = threading.Thread(target=send)
thread.append(t)
t.start()

It's very easy to understand. Here are the two simple ways to do threading.
import time
from concurrent.futures import ThreadPoolExecutor, as_completed
import threading
def a(a=1, b=2):
    print(a)
    time.sleep(5)
    print(b)
    return a+b
def b(**kwargs):
    if "a" in kwargs:
        print("am b")
    else:
        print("nothing")
to_do=[]
executor = ThreadPoolExecutor(max_workers=4)
ex1=executor.submit(a)
to_do.append(ex1)
ex2=executor.submit(b, **{"a":1})
to_do.append(ex2)
for future in as_completed(to_do):
    print("Future {} and Future Return is {}\n".format(future, future.result()))
print("threading")
to_do=[]
to_do.append(threading.Thread(target=a))
to_do.append(threading.Thread(target=b, kwargs={"a":1}))
for threads in to_do:
    threads.start()
for threads in to_do:
    threads.join()

This code below can run 10 threads concurrently printing the numbers from 0 to 99:
from threading import Thread
def test():
    for i in range(0, 100):
        print(i)
thread_list = []
for _ in range(0, 10):
    thread = Thread(target=test)
    thread_list.append(thread)
for thread in thread_list:
    thread.start()
for thread in thread_list:
    thread.join()
And, this code below is the shorthand for loop version of the above code running 10 threads concurrently printing the numbers from 0 to 99:
from threading import Thread
def test():
    [print(i) for i in range(0, 100)]
thread_list = [Thread(target=test) for _ in range(0, 10)]
[thread.start() for thread in thread_list]
[thread.join() for thread in thread_list]
This is the result below:
...
99
83
97
84
98
99
85
86
87
88
...

The easiest way of using threading/multiprocessing is to use higher-level libraries like autothread.
import autothread
from time import sleep as heavyworkload
@autothread.multithreaded() # <-- This is all you need to add
def example(x: int, y: int):
    heavyworkload(1)
    return x*y
Now, you can feed your functions lists of ints. Autothread will handle everything for you and just give you the results computed in parallel.
result = example([1, 2, 3, 4, 5], 10)

Related

Python dynamic MultiThread with Queue - Class

I had been struggling to implement a proper dynamic multi-thread system until now. The idea is to spin up multiple new pools of sub-threads from the main thread (each pool has its own number of threads and queue size) to run functions, and the user can define whether the main thread should wait for the sub-threads to finish or just move on to the next line after starting them. This multi-thread logic helps to extract data in parallel and at a high frequency.
The solution to my issue is shared below for everyone who wants it. If you have any doubts and questions, please let me know.
# -*- coding: utf-8 -*-
"""
Created on Mon Jul 5 00:00:51 2021
#author: Tahasanul Abraham
"""
#%% Initialization of Libraries
import sys, os, inspect
currentdir = os.path.dirname(os.path.abspath(inspect.getfile(inspect.currentframe())))
parentdir = os.path.dirname(currentdir)
sys.path.insert(0,parentdir)
parentdir_1up = os.path.dirname(parentdir)
sys.path.insert(0,parentdir_1up)
from queue import Queue
from threading import Thread, Lock
class Worker(Thread):
    def __init__(self, tasks):
        Thread.__init__(self)
        self.tasks = tasks
        self.daemon = True
        self.lock = Lock()
        self.start()
    def run(self):
        while True:
            func, args, kargs = self.tasks.get()
            try:
                if func.lower() == "terminate":
                    break
            except:
                try:
                    with self.lock:
                        func(*args, **kargs)
                except Exception as exception:
                    print(exception)
            self.tasks.task_done()
class ThreadPool:
    def __init__(self, num_threads, num_queue=None):
        if num_queue is None or num_queue < num_threads:
            num_queue = num_threads
        self.tasks = Queue(num_queue)
        self.threads = num_threads
        for _ in range(num_threads): Worker(self.tasks)
    # This function can be called to terminate all the worker threads of the queue
    def terminate(self):
        self.wait_completion()
        for _ in range(self.threads): self.add_task("terminate")
        return None
    # This function can be called to add new work to the queue
    def add_task(self, func, *args, **kargs):
        self.tasks.put((func, args, kargs))
    # This function can be called to wait till all the workers are done processing the pending works. If this function is called, the main will not process any new lines unless all the workers are done with the pending works.
    def wait_completion(self):
        self.tasks.join()
    # This function can be called to check if there are any pending/running works in the queue. If there are any works pending, the call will return Boolean True or else it will return Boolean False
    def is_alive(self):
        if self.tasks.unfinished_tasks == 0:
            return False
        else:
            return True
#%% Standalone Run
if __name__ == "__main__":
    import time
    def test_return(x,d):
        print (str(x) + " - pool completed")
        d[str(x)] = x
        time.sleep(5)
    # 2 thread and 10000000000 FIFO queues
    pool = ThreadPool(2,1000000000)
    r ={}
    for i in range(10):
        pool.add_task(test_return, i, r)
        print (str(i) + " - pool added")
    print ("Waiting for completion")
    pool.wait_completion()
    print ("pool done")
    # 1 thread and 2 FIFO queues
    pool = ThreadPool(1,2)
    r ={}
    for i in range(10):
        pool.add_task(test_return, i, r)
        print (str(i) + " - pool added")
    print ("Waiting for completion")
    pool.wait_completion()
    print ("pool done")
    # 2 thread and 1 FIFO queues
    pool = ThreadPool(2,1)
    r ={}
    for i in range(10):
        pool.add_task(test_return, i, r)
        print (str(i) + " - pool added")
    print ("Waiting for completion")
    pool.wait_completion()
    print ("pool done")
Making a new Pool
Using the above classes, one can make a pool of their own choice with the number of parallel threads they want and the size of the queue. Example of creating a pool of 10 threads with a queue size of 200:
pool = ThreadPool(10,200)
Adding work to Pool
Once a pool is created, one can use pool.add_task to run sub-routine work. In my example I used the pool to call a function with its arguments. For example, I called the test_return function with its arguments i and r.
pool.add_task(test_return, i, r)
Waiting for the pool to complete its work
If a pool is given some work to do, the user can either move on to other lines of code or wait for the pool to finish its work before the next lines are executed. To wait for the pool to finish its work and then return, a call to wait_completion is required. Example:
pool.wait_completion()
Terminate and close down the pool threads
Once the pool threads are no longer needed, they can be terminated and closed down to free memory and release the blocked threads. This can be done by calling the following function:
pool.terminate()
Checking if there are any pending works from the pool
There is a function that can be called to check whether there is any pending or running work in the queue. It returns True if any work is pending and False otherwise. To check whether the pool is still working, call the following function:
pool.is_alive()

How can I apply multithreading to a for loop in Python? [duplicate]

I am trying to understand threading in Python. I've looked at the documentation and examples, but quite frankly, many examples are overly sophisticated and I'm having trouble understanding them.
How do you clearly show tasks being divided for multi-threading?
Since this question was asked in 2010, there has been real simplification in how to do simple multithreading with Python with map and pool.
The code below comes from an article/blog post that you should definitely check out (no affiliation) - Parallelism in one line: A Better Model for Day to Day Threading Tasks. I'll summarize below - it ends up being just a few lines of code:
from multiprocessing.dummy import Pool as ThreadPool
pool = ThreadPool(4)
results = pool.map(my_function, my_array)
Which is the multithreaded version of:
results = []
for item in my_array:
results.append(my_function(item))
Description
Map is a cool little function, and the key to easily injecting parallelism into your Python code. For those unfamiliar, map is something lifted from functional languages like Lisp. It is a function which maps another function over a sequence.
Map handles the iteration over the sequence for us, applies the function, and stores all of the results in a handy list at the end.
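As a minimal sketch of that equivalence (square is a hypothetical stand-in for my_function and numbers for my_array; it assumes Python 3, where the pool can be used as a context manager), the built-in map and the pool's map return the same results in the same order:

```python
from multiprocessing.dummy import Pool as ThreadPool


def square(x):
    # Hypothetical stand-in for my_function
    return x * x


numbers = [1, 2, 3, 4]  # hypothetical stand-in for my_array

# Built-in map: applies the function to each item, one after another
sequential = list(map(square, numbers))

# Pool.map: same call shape, but the items are handed to 4 worker threads
with ThreadPool(4) as pool:
    threaded = pool.map(square, numbers)

print(sequential == threaded)
```

The order of the results follows the order of the input list, regardless of which worker finishes first.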
Implementation
Parallel versions of the map function are provided by two libraries: multiprocessing, and its little-known but equally fantastic stepchild, multiprocessing.dummy.
multiprocessing.dummy is exactly the same as the multiprocessing module, but uses threads instead (an important distinction: use multiple processes for CPU-intensive tasks; threads for (and during) I/O):
multiprocessing.dummy replicates the API of multiprocessing, but is no more than a wrapper around the threading module.
import urllib2
from multiprocessing.dummy import Pool as ThreadPool
urls = [
'http://www.python.org',
'http://www.python.org/about/',
'http://www.onlamp.com/pub/a/python/2003/04/17/metaclasses.html',
'http://www.python.org/doc/',
'http://www.python.org/download/',
'http://www.python.org/getit/',
'http://www.python.org/community/',
'https://wiki.python.org/moin/',
]
# Make the Pool of workers
pool = ThreadPool(4)
# Open the URLs in their own threads
# and return the results
results = pool.map(urllib2.urlopen, urls)
# Close the pool and wait for the work to finish
pool.close()
pool.join()
And the timing results:
Single thread: 14.4 seconds
4 Pool: 3.1 seconds
8 Pool: 1.4 seconds
13 Pool: 1.3 seconds
Passing multiple arguments (works like this only in Python 3.3 and later):
To pass multiple arrays:
results = pool.starmap(function, zip(list_a, list_b))
Or to pass a constant and an array:
results = pool.starmap(function, zip(itertools.repeat(constant), list_a))
If you are using an earlier version of Python, you can pass multiple arguments via this workaround.
(Thanks to user136036 for the helpful comment.)
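The two starmap calls above can be sketched end to end as follows. This is an illustration only: add, list_a, list_b and the constant 100 are made-up names and values, and it assumes Python 3.3 or later.

```python
import itertools
from multiprocessing.dummy import Pool as ThreadPool


def add(a, b):
    # Hypothetical two-argument function
    return a + b


list_a = [1, 2, 3]
list_b = [10, 20, 30]

with ThreadPool(2) as pool:
    # Multiple arrays: zip pairs the elements up, starmap unpacks each pair
    pairwise = pool.starmap(add, zip(list_a, list_b))
    # Constant plus array: itertools.repeat supplies the constant for every item
    with_constant = pool.starmap(add, zip(itertools.repeat(100), list_a))

print(pairwise)
print(with_constant)
```

zip stops at the shortest input, so the endless itertools.repeat iterator is safe here.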
Here's a simple example: you need to try a few alternative URLs and return the contents of the first one to respond.
import Queue
import threading
import urllib2
# Called by each thread
def get_url(q, url):
q.put(urllib2.urlopen(url).read())
theurls = ["http://google.com", "http://yahoo.com"]
q = Queue.Queue()
for u in theurls:
t = threading.Thread(target=get_url, args = (q,u))
t.daemon = True
t.start()
s = q.get()
print s
This is a case where threading is used as a simple optimization: each subthread is waiting for a URL to resolve and respond, to put its contents on the queue; each thread is a daemon (won't keep the process up if the main thread ends -- that's more common than not); the main thread starts all subthreads, does a get on the queue to wait until one of them has done a put, then emits the results and terminates (which takes down any subthreads that might still be running, since they're daemon threads).
Proper use of threads in Python is invariably connected to I/O operations (since CPython doesn't use multiple cores to run CPU-bound tasks anyway, the only reason for threading is not blocking the process while there's a wait for some I/O). Queues are almost invariably the best way to farm out work to threads and/or collect the work's results, by the way, and they're intrinsically threadsafe, so they save you from worrying about locks, conditions, events, semaphores, and other inter-thread coordination/communication concepts.
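To illustrate the point about queues, here is a minimal Python 3 sketch (not the code above) in which one queue farms work out to threads and a second queue collects the results. The names worker, tasks and results, the doubling "work", and the None sentinel are all assumptions made for this example.

```python
import queue
import threading


def worker(tasks, results):
    while True:
        item = tasks.get()
        if item is None:
            # Sentinel: no more work for this thread
            tasks.task_done()
            break
        results.put(item * 2)  # stand-in for real (I/O-bound) work
        tasks.task_done()


tasks = queue.Queue()
results = queue.Queue()

threads = [threading.Thread(target=worker, args=(tasks, results)) for _ in range(3)]
for t in threads:
    t.start()

for n in range(5):
    tasks.put(n)
for _ in threads:
    tasks.put(None)  # one sentinel per worker thread

tasks.join()  # blocks until task_done() was called for every item
for t in threads:
    t.join()

# Results arrive in completion order, so sort them for a stable view
doubled = sorted(results.get() for _ in range(5))
print(doubled)
```

No explicit lock appears anywhere: the two Queue objects do all the synchronization.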
NOTE: For actual parallelization in Python, you should use the multiprocessing module to fork multiple processes that execute in parallel (due to the global interpreter lock, Python threads provide interleaving, but they are in fact executed serially, not in parallel, and are only useful when interleaving I/O operations).
However, if you are merely looking for interleaving (or are doing I/O operations that can be parallelized despite the global interpreter lock), then the threading module is the place to start. As a really simple example, let's consider the problem of summing a large range by summing subranges in parallel:
import threading
class SummingThread(threading.Thread):
def __init__(self,low,high):
super(SummingThread, self).__init__()
self.low=low
self.high=high
self.total=0
def run(self):
for i in range(self.low,self.high):
self.total+=i
thread1 = SummingThread(0,500000)
thread2 = SummingThread(500000,1000000)
thread1.start() # This actually causes the thread to run
thread2.start()
thread1.join() # This waits until the thread has completed
thread2.join()
# At this point, both threads have completed
result = thread1.total + thread2.total
print result
Note that the above is a very stupid example, as it does absolutely no I/O and will be executed serially albeit interleaved (with the added overhead of context switching) in CPython due to the global interpreter lock.
Like others mentioned, CPython can use threads only for I/O waits due to GIL.
If you want to benefit from multiple cores for CPU-bound tasks, use multiprocessing:
from multiprocessing import Process
def f(name):
print 'hello', name
if __name__ == '__main__':
p = Process(target=f, args=('bob',))
p.start()
p.join()
Just a note: A queue is not required for threading.
This is the simplest example I could imagine that shows several threads running concurrently.
import threading
from random import randint
from time import sleep
def print_number(number):
# Sleeps a random 1 to 10 seconds
rand_int_var = randint(1, 10)
sleep(rand_int_var)
print "Thread " + str(number) + " slept for " + str(rand_int_var) + " seconds"
thread_list = []
for i in range(1, 10):
# Instantiates the thread
# (i) does not make a sequence, so (i,)
t = threading.Thread(target=print_number, args=(i,))
# Sticks the thread in a list so that it remains accessible
thread_list.append(t)
# Starts threads
for thread in thread_list:
thread.start()
# This blocks the calling thread until the thread whose join() method is called is terminated.
# From http://docs.python.org/2/library/threading.html#thread-objects
for thread in thread_list:
thread.join()
# Demonstrates that the main process waited for threads to complete
print "Done"
The answer from Alex Martelli helped me. However, here is a modified version that I thought was more useful (at least to me).
Updated: works in both Python 2 and Python 3
try:
# For Python 3
import queue
from urllib.request import urlopen
except ImportError:
# For Python 2
import Queue as queue
from urllib2 import urlopen
import threading
worker_data = ['http://google.com', 'http://yahoo.com', 'http://bing.com']
# Load up a queue with your data. This will handle locking
q = queue.Queue()
for url in worker_data:
q.put(url)
# Define a worker function
def worker(url_queue):
queue_full = True
while queue_full:
try:
# Get your data off the queue, and do some work
url = url_queue.get(False)
data = urlopen(url).read()
print(len(data))
except queue.Empty:
queue_full = False
# Create as many threads as you want
thread_count = 5
for i in range(thread_count):
t = threading.Thread(target=worker, args = (q,))
t.start()
Given a function, f, thread it like this:
import threading
threading.Thread(target=f).start()
To pass arguments to f
threading.Thread(target=f, args=(a,b,c)).start()
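A small runnable sketch of the same idea, assuming a made-up function f that adds its arguments and stores the sum in a shared list; it also shows that keyword arguments can be passed via kwargs and that join() waits for the thread to finish:

```python
import threading

results = []


def f(a, b, c=0):
    # list.append is atomic in CPython, so no lock is needed for this sketch
    results.append(a + b + c)


t1 = threading.Thread(target=f, args=(1, 2, 3))
t2 = threading.Thread(target=f, args=(10, 20), kwargs={"c": 30})
t1.start()
t2.start()

# Wait for both threads before reading the shared list
t1.join()
t2.join()

print(sorted(results))
```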
I found this very useful: create as many threads as cores and let them execute a (large) number of tasks (in this case, calling a shell program):
import Queue
import threading
import multiprocessing
import subprocess
q = Queue.Queue()
for i in range(30): # Put 30 tasks in the queue
q.put(i)
def worker():
while True:
item = q.get()
# Execute a task: call a shell program and wait until it completes
subprocess.call("echo " + str(item), shell=True)
q.task_done()
cpus = multiprocessing.cpu_count() # Detect number of cores
print("Creating %d threads" % cpus)
for i in range(cpus):
t = threading.Thread(target=worker)
t.daemon = True
t.start()
q.join() # Block until all tasks are done
Python 3 ships the concurrent.futures module for launching parallel tasks, which makes our work easier.
It provides both thread pooling and process pooling.
The following examples give an insight:
ThreadPoolExecutor Example (source)
import concurrent.futures
import urllib.request
URLS = ['http://www.foxnews.com/',
'http://www.cnn.com/',
'http://europe.wsj.com/',
'http://www.bbc.co.uk/',
'http://some-made-up-domain.com/']
# Retrieve a single page and report the URL and contents
def load_url(url, timeout):
with urllib.request.urlopen(url, timeout=timeout) as conn:
return conn.read()
# We can use a with statement to ensure threads are cleaned up promptly
with concurrent.futures.ThreadPoolExecutor(max_workers=5) as executor:
# Start the load operations and mark each future with its URL
future_to_url = {executor.submit(load_url, url, 60): url for url in URLS}
for future in concurrent.futures.as_completed(future_to_url):
url = future_to_url[future]
try:
data = future.result()
except Exception as exc:
print('%r generated an exception: %s' % (url, exc))
else:
print('%r page is %d bytes' % (url, len(data)))
ProcessPoolExecutor (source)
import concurrent.futures
import math
PRIMES = [
112272535095293,
112582705942171,
112272535095293,
115280095190773,
115797848077099,
1099726899285419]
def is_prime(n):
if n % 2 == 0:
return False
sqrt_n = int(math.floor(math.sqrt(n)))
for i in range(3, sqrt_n + 1, 2):
if n % i == 0:
return False
return True
def main():
with concurrent.futures.ProcessPoolExecutor() as executor:
for number, prime in zip(PRIMES, executor.map(is_prime, PRIMES)):
print('%d is prime: %s' % (number, prime))
if __name__ == '__main__':
main()
I saw a lot of examples here where no real work was being performed, and they were mostly CPU-bound. Here is an example of a CPU-bound task that computes all prime numbers between 10 million and 10.05 million. I have used all four methods here:
import math
import timeit
import threading
import multiprocessing
from concurrent.futures import ThreadPoolExecutor, ProcessPoolExecutor
def time_stuff(fn):
"""
Measure time of execution of a function
"""
def wrapper(*args, **kwargs):
t0 = timeit.default_timer()
fn(*args, **kwargs)
t1 = timeit.default_timer()
print("{} seconds".format(t1 - t0))
return wrapper
def find_primes_in(nmin, nmax):
"""
Compute a list of prime numbers between the given minimum and maximum arguments
"""
primes = []
# Loop from minimum to maximum
for current in range(nmin, nmax + 1):
# Take the square root of the current number
sqrt_n = int(math.sqrt(current))
found = False
# Check if any number from 2 to the square root (inclusive) divides the current number under consideration
for number in range(2, sqrt_n + 1):
# If divisible we have found a factor, hence this is not a prime number, lets move to the next one
if current % number == 0:
found = True
break
# If not divisible, add this number to the list of primes that we have found so far
if not found:
primes.append(current)
# I am merely printing the length of the array containing all the primes, but feel free to do what you want
print(len(primes))
@time_stuff
def sequential_prime_finder(nmin, nmax):
"""
Use the main process and main thread to compute everything in this case
"""
find_primes_in(nmin, nmax)
@time_stuff
def threading_prime_finder(nmin, nmax):
"""
If the minimum is 1000 and the maximum is 2000 and we have four workers,
1000 - 1250 to worker 1
1250 - 1500 to worker 2
1500 - 1750 to worker 3
1750 - 2000 to worker 4
so let’s split the minimum and maximum values according to the number of workers
"""
nrange = nmax - nmin
threads = []
for i in range(8):
start = int(nmin + i * nrange/8)
end = int(nmin + (i + 1) * nrange/8)
# Start the thread with the minimum and maximum split up to compute
# Parallel computation will not work here due to the GIL since this is a CPU-bound task
t = threading.Thread(target = find_primes_in, args = (start, end))
threads.append(t)
t.start()
# Don’t forget to wait for the threads to finish
for t in threads:
t.join()
@time_stuff
def processing_prime_finder(nmin, nmax):
"""
Split the minimum, maximum interval similar to the threading method above, but use processes this time
"""
nrange = nmax - nmin
processes = []
for i in range(8):
start = int(nmin + i * nrange/8)
end = int(nmin + (i + 1) * nrange/8)
p = multiprocessing.Process(target = find_primes_in, args = (start, end))
processes.append(p)
p.start()
for p in processes:
p.join()
@time_stuff
def thread_executor_prime_finder(nmin, nmax):
"""
Split the min max interval similar to the threading method, but use a thread pool executor this time.
The pool reuses its worker threads instead of creating a new thread per task.
This method is still slow due to the GIL limitations since we are doing a CPU-bound task.
"""
nrange = nmax - nmin
with ThreadPoolExecutor(max_workers = 8) as e:
for i in range(8):
start = int(nmin + i * nrange/8)
end = int(nmin + (i + 1) * nrange/8)
e.submit(find_primes_in, start, end)
@time_stuff
def process_executor_prime_finder(nmin, nmax):
"""
Split the min max interval similar to the threading method, but use the process pool executor.
This is the fastest method recorded so far as it manages process efficiently + overcomes GIL limitations.
RECOMMENDED METHOD FOR CPU-BOUND TASKS
"""
nrange = nmax - nmin
with ProcessPoolExecutor(max_workers = 8) as e:
for i in range(8):
start = int(nmin + i * nrange/8)
end = int(nmin + (i + 1) * nrange/8)
e.submit(find_primes_in, start, end)
def main():
nmin = int(1e7)
nmax = int(1.05e7)
print("Sequential Prime Finder Starting")
sequential_prime_finder(nmin, nmax)
print("Threading Prime Finder Starting")
threading_prime_finder(nmin, nmax)
print("Processing Prime Finder Starting")
processing_prime_finder(nmin, nmax)
print("Thread Executor Prime Finder Starting")
thread_executor_prime_finder(nmin, nmax)
print("Process Executor Finder Starting")
process_executor_prime_finder(nmin, nmax)
if __name__ == "__main__":
main()
Here are the results on my Mac OS X four-core machine
Sequential Prime Finder Starting
9.708213827005238 seconds
Threading Prime Finder Starting
9.81836523200036 seconds
Processing Prime Finder Starting
3.2467174359990167 seconds
Thread Executor Prime Finder Starting
10.228896902000997 seconds
Process Executor Finder Starting
2.656402041000547 seconds
Using the blazing new concurrent.futures module
def sqr(val):
import time
time.sleep(0.1)
return val * val
def process_result(result):
print(result)
def process_these_asap(tasks):
import concurrent.futures
with concurrent.futures.ProcessPoolExecutor() as executor:
futures = []
for task in tasks:
futures.append(executor.submit(sqr, task))
for future in concurrent.futures.as_completed(futures):
process_result(future.result())
# Or instead of all this just do:
# results = executor.map(sqr, tasks)
# list(map(process_result, results))
def main():
tasks = list(range(10))
print('Processing {} tasks'.format(len(tasks)))
process_these_asap(tasks)
print('Done')
return 0
if __name__ == '__main__':
import sys
sys.exit(main())
The executor approach might seem familiar to all those who have gotten their hands dirty with Java before.
Also on a side note: To keep the universe sane, don't forget to close your pools/executors if you don't use with context (which is so awesome that it does it for you)
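To make that side note concrete, here is a hedged sketch of the two ways to release an executor. It uses ThreadPoolExecutor rather than ProcessPoolExecutor to keep the example light, and the variable names are made up; the with form and the explicit shutdown() call are intended to be equivalent.

```python
from concurrent.futures import ThreadPoolExecutor


def sqr(val):
    return val * val


# Without a with block: shut the executor down yourself, ideally in finally
executor = ThreadPoolExecutor(max_workers=2)
try:
    manual = list(executor.map(sqr, range(5)))
finally:
    executor.shutdown(wait=True)  # waits for pending work, then frees the threads

# With a with block: shutdown(wait=True) is called for you on exit
with ThreadPoolExecutor(max_workers=2) as managed_executor:
    managed = list(managed_executor.map(sqr, range(5)))

print(manual == managed)
```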
For me, the perfect example for threading is monitoring asynchronous events. Look at this code.
# thread_test.py
import threading
import time
class Monitor(threading.Thread):
def __init__(self, mon):
threading.Thread.__init__(self)
self.mon = mon
def run(self):
while True:
if self.mon[0] == 2:
print "Mon = 2"
self.mon[0] = 3;
You can play with this code by opening an IPython session and doing something like:
>>> from thread_test import Monitor
>>> a = [0]
>>> mon = Monitor(a)
>>> mon.start()
>>> a[0] = 2
Mon = 2
>>> a[0] = 2
Mon = 2
Wait a few minutes
>>> a[0] = 2
Mon = 2
Most documentation and tutorials use Python's Threading and Queue module, and they could seem overwhelming for beginners.
Perhaps consider the ThreadPoolExecutor class from Python 3's concurrent.futures module.
Combined with a with statement and a comprehension, it can be a real charm.
from concurrent.futures import ThreadPoolExecutor, as_completed
def get_url(url):
# Your actual program here. Using threading.Lock() if necessary
return ""
# List of URLs to fetch
urls = ["url1", "url2"]
with ThreadPoolExecutor(max_workers = 5) as executor:
# Create threads
futures = {executor.submit(get_url, url) for url in urls}
# as_completed() gives you the threads once finished
for f in as_completed(futures):
# Get the results
rs = f.result()
Borrowing from this post, we know how to choose between multithreading, multiprocessing, and async/asyncio, and when to use each.
Python 3 has a built-in library for concurrency and parallelism: concurrent.futures.
So I'll demonstrate through an experiment that runs four tasks (each a sleep() call) on a thread pool:
from concurrent.futures import ThreadPoolExecutor, as_completed
from time import sleep, time
def concurrent(max_worker):
futures = []
tic = time()
with ThreadPoolExecutor(max_workers=max_worker) as executor:
futures.append(executor.submit(sleep, 2)) # Two seconds sleep
futures.append(executor.submit(sleep, 1))
futures.append(executor.submit(sleep, 7))
futures.append(executor.submit(sleep, 3))
for future in as_completed(futures):
if future.result() is not None:
print(future.result())
print(f'Total elapsed time by {max_worker} workers:', time()-tic)
concurrent(5)
concurrent(4)
concurrent(3)
concurrent(2)
concurrent(1)
Output:
Total elapsed time by 5 workers: 7.007831811904907
Total elapsed time by 4 workers: 7.007944107055664
Total elapsed time by 3 workers: 7.003149509429932
Total elapsed time by 2 workers: 8.004627466201782
Total elapsed time by 1 workers: 13.013478994369507
[NOTE]:
As you can see in the above results, 3 workers were already enough to reach the best time for those four tasks; adding more did not help.
If you have a CPU-bound task instead of an I/O-bound or blocking one (multiprocessing instead of threading), you can change the ThreadPoolExecutor to ProcessPoolExecutor.
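The measured totals can be reproduced with a little arithmetic. The sketch below is a simplified model, not part of the experiment: it assumes tasks start in submission order and that each task goes to whichever worker becomes free first. The helper name makespan is made up.

```python
import heapq


def makespan(durations, workers):
    # Each heap entry is the time at which one worker becomes free
    free_at = [0] * workers
    heapq.heapify(free_at)
    finish = 0
    for d in durations:
        start = heapq.heappop(free_at)  # earliest available worker
        end = start + d
        finish = max(finish, end)
        heapq.heappush(free_at, end)
    return finish


durations = [2, 1, 7, 3]  # the four sleep() calls, in submission order
for n in (5, 4, 3, 2, 1):
    print(n, "workers:", makespan(durations, n), "seconds")
```

With 2 workers, the 7-second task can only start after the 1-second task has finished, which gives 1 + 7 = 8 seconds; with a single worker, the total is simply 2 + 1 + 7 + 3 = 13 seconds.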
I would like to contribute with a simple example and the explanations I've found useful when I had to tackle this problem myself.
In this answer you will find some information about Python's GIL (global interpreter lock) and a simple day-to-day example written using multiprocessing.dummy plus some simple benchmarks.
Global Interpreter Lock (GIL)
Python doesn't allow multi-threading in the truest sense of the word. It has a multi-threading package, but if you want to multi-thread to speed your code up, then it's usually not a good idea to use it.
Python has a construct called the global interpreter lock (GIL).
The GIL makes sure that only one of your 'threads' can execute at any one time. A thread acquires the GIL, does a little work, then passes the GIL onto the next thread.
This happens very quickly so to the human eye it may seem like your threads are executing in parallel, but they are really just taking turns using the same CPU core.
All this GIL passing adds overhead to execution. This means that if you want to make your code run faster then using the threading package often isn't a good idea.
There are reasons to use Python's threading package. If you want to run some things simultaneously, and efficiency is not a concern, then it's totally fine and convenient. Or if you are running code that needs to wait for something (like some I/O) then it could make a lot of sense. But the threading library won't let you use extra CPU cores.
Multi-threading can be outsourced to the operating system (by doing multi-processing), and some external application that calls your Python code (for example, Spark or Hadoop), or some code that your Python code calls (for example: you could have your Python code call a C function that does the expensive multi-threaded stuff).
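The "waiting for something" case can be shown with a short sketch. Here time.sleep stands in for a blocking I/O call (an assumption made for this example); like real I/O waits, it releases the GIL, so the waits of several threads overlap.

```python
import threading
import time


def wait_for_io(delay):
    # time.sleep stands in for a blocking I/O call; the GIL is released while waiting
    time.sleep(delay)


start = time.perf_counter()
threads = [threading.Thread(target=wait_for_io, args=(0.2,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
elapsed = time.perf_counter() - start

# Run one after another, the four waits would need about 0.8 s in total
print("elapsed: {:.2f} s".format(elapsed))
```

Replacing the sleep with a pure-Python computation would remove the benefit, because then the threads would take turns holding the GIL.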
Why This Matters
Because lots of people spend a lot of time trying to find bottlenecks in their fancy Python multi-threaded code before they learn what the GIL is.
Once this information is clear, here's my code:
#!/bin/python
from multiprocessing.dummy import Pool
from subprocess import PIPE,Popen
import time
import os
# In the variable pool_size we define the "parallelness".
# For CPU-bound tasks, it doesn't make sense to create more Pool processes
# than you have cores to run them on.
#
# On the other hand, if you are using I/O-bound tasks, it may make sense
# to create a quite a few more Pool processes than cores, since the processes
# will probably spend most their time blocked (waiting for I/O to complete).
pool_size = 8
def do_ping(ip):
if os.name == 'nt':
print ("Using Windows Ping to " + ip)
proc = Popen(['ping', ip], stdout=PIPE)
return proc.communicate()[0]
else:
print ("Using Linux / Unix Ping to " + ip)
proc = Popen(['ping', ip, '-c', '4'], stdout=PIPE)
return proc.communicate()[0]
os.system('cls' if os.name=='nt' else 'clear')
print ("Running using threads\n")
start_time = time.time()
pool = Pool(pool_size)
website_names = ["www.google.com","www.facebook.com","www.pinterest.com","www.microsoft.com"]
result = {}
for website_name in website_names:
result[website_name] = pool.apply_async(do_ping, args=(website_name,))
pool.close()
pool.join()
print ("\n--- Execution took {} seconds ---".format((time.time() - start_time)))
# Now we do the same without threading, just to compare time
print ("\nRunning NOT using threads\n")
start_time = time.time()
for website_name in website_names:
do_ping(website_name)
print ("\n--- Execution took {} seconds ---".format((time.time() - start_time)))
# Here's one way to print the final output from the threads
output = {}
for key, value in result.items():
output[key] = value.get()
print ("\nOutput aggregated in a Dictionary:")
print (output)
print ("\n")
print ("\nPretty printed output: ")
for key, value in output.items():
print (key + "\n")
print (value)
Here is a very simple example of a CSV import using threading. (The imports may differ depending on your project.)
Helper Functions:
from threading import Thread
from project import app
import csv
def import_handler(csv_file_name):
    # Run the import in a background thread so the caller is not blocked
    thr = Thread(target=dump_async_csv_data, args=[csv_file_name])
    thr.start()
def dump_async_csv_data(csv_file_name):
    with app.app_context():
        with open(csv_file_name) as File:
            reader = csv.DictReader(File)
            for row in reader:
                # DB operation/query
                pass
Driver Function:
import_handler(csv_file_name)
Here is a simple multithreading example that should be helpful. You can run it to see how multithreading works in Python. I used a lock to keep other threads out until the earlier threads have finished their work. With this line of code,
tLock = threading.BoundedSemaphore(value=4)
you can allow a fixed number of threads to run at a time and hold back the rest, which will run once earlier threads have finished.
import threading
import time
#tLock = threading.Lock()
tLock = threading.BoundedSemaphore(value=4)
def timer(name, delay, repeat):
print "\r\nTimer: ", name, " Started"
tLock.acquire()
print "\r\n", name, " has the acquired the lock"
while repeat > 0:
time.sleep(delay)
print "\r\n", name, ": ", str(time.ctime(time.time()))
repeat -= 1
print "\r\n", name, " is releaseing the lock"
tLock.release()
print "\r\nTimer: ", name, " Completed"
def Main():
t1 = threading.Thread(target=timer, args=("Timer1", 2, 5))
t2 = threading.Thread(target=timer, args=("Timer2", 3, 5))
t3 = threading.Thread(target=timer, args=("Timer3", 4, 5))
t4 = threading.Thread(target=timer, args=("Timer4", 5, 5))
t5 = threading.Thread(target=timer, args=("Timer5", 0.1, 5))
t1.start()
t2.start()
t3.start()
t4.start()
t5.start()
print "\r\nMain Complete"
if __name__ == "__main__":
Main()
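A condensed Python 3 sketch of the same semaphore technique, written so that the limit can be checked rather than read off the printed output. The names limit, counter_lock, state and task, as well as the limit of 2 and the 6 threads, are illustrative choices and not part of the code above.

```python
import threading
import time

limit = threading.BoundedSemaphore(value=2)  # at most 2 threads inside the guarded section
counter_lock = threading.Lock()
state = {"active": 0, "peak": 0}


def task():
    with limit:  # acquired on entry, released on exit, even if an exception is raised
        with counter_lock:
            state["active"] += 1
            state["peak"] = max(state["peak"], state["active"])
        time.sleep(0.05)  # stand-in for the real work
        with counter_lock:
            state["active"] -= 1


threads = [threading.Thread(target=task) for _ in range(6)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print("peak concurrency:", state["peak"])
```

Using the semaphore as a context manager avoids the risk of a missing release() if the guarded code raises.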
None of the previous solutions actually used multiple cores on my GNU/Linux server (where I don't have administrator rights). They just ran on a single core.
I used the lower level os.fork interface to spawn multiple processes. This is the code that worked for me:
from os import fork
values = ['different', 'values', 'for', 'threads']
for i in range(len(values)):
p = fork()
if p == 0:
my_function(values[i])
break
As a Python 3 version of the second answer:
import queue as Queue
import threading
import urllib.request
# Called by each thread
def get_url(q, url):
q.put(urllib.request.urlopen(url).read())
theurls = ["http://google.com", "http://yahoo.com", "http://www.python.org","https://wiki.python.org/moin/"]
q = Queue.Queue()
def thread_func():
for u in theurls:
t = threading.Thread(target=get_url, args = (q,u))
t.daemon = True
t.start()
s = q.get()
def non_thread_func():
for u in theurls:
get_url(q,u)
s = q.get()
And you can test it:
import time
start = time.time()
thread_func()
end = time.time()
print(end - start)
start = time.time()
non_thread_func()
end = time.time()
print(end - start)
non_thread_func() should take about 4 times as long as thread_func().
import threading
import requests
def send():
r = requests.get('https://www.stackoverlow.com')
thread = []
t = threading.Thread(target=send)  # pass the function itself; send() would run it in the calling thread
thread.append(t)
t.start()
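Note that target expects the function object itself: writing target=send() would call the function in the calling thread and hand its return value to Thread. A minimal sketch of the difference, in which the HTTP request is replaced by recording the thread name so that the example needs no network access (the name worker-1 and the list ran_in are illustrative):

```python
import threading

ran_in = []


def send():
    # Stand-in for the HTTP request: record which thread runs this function
    ran_in.append(threading.current_thread().name)


# Pass the callable itself, without parentheses
t = threading.Thread(target=send, name="worker-1")
t.start()
t.join()

print(ran_in)
```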
It's very easy to understand. Here are the two simple ways to do threading.
import time
from concurrent.futures import ThreadPoolExecutor, as_completed
import threading
def a(a=1, b=2):
print(a)
time.sleep(5)
print(b)
return a+b
def b(**kwargs):
if "a" in kwargs:
print("am b")
else:
print("nothing")
to_do=[]
executor = ThreadPoolExecutor(max_workers=4)
ex1=executor.submit(a)
to_do.append(ex1)
ex2=executor.submit(b, **{"a":1})
to_do.append(ex2)
for future in as_completed(to_do):
print("Future {} and Future Return is {}\n".format(future, future.result()))
print("threading")
to_do=[]
to_do.append(threading.Thread(target=a))
to_do.append(threading.Thread(target=b, kwargs={"a":1}))
for threads in to_do:
threads.start()
for threads in to_do:
threads.join()
This code below can run 10 threads concurrently printing the numbers from 0 to 99:
from threading import Thread
def test():
for i in range(0, 100):
print(i)
thread_list = []
for _ in range(0, 10):
thread = Thread(target=test)
thread_list.append(thread)
for thread in thread_list:
thread.start()
for thread in thread_list:
thread.join()
And, this code below is the shorthand for loop version of the above code running 10 threads concurrently printing the numbers from 0 to 99:
from threading import Thread
def test():
[print(i) for i in range(0, 100)]
thread_list = [Thread(target=test) for _ in range(0, 10)]
[thread.start() for thread in thread_list]
[thread.join() for thread in thread_list]
This is the result below:
...
99
83
97
84
98
99
85
86
87
88
...
The easiest way to use threading/multiprocessing is to use higher-level libraries like autothread.
import autothread
from time import sleep as heavyworkload
@autothread.multithreaded() # <-- This is all you need to add
def example(x: int, y: int):
heavyworkload(1)
return x*y
Now, you can feed your functions lists of ints. Autothread will handle everything for you and just give you the results computed in parallel.
result = example([1, 2, 3, 4, 5], 10)

Force Multiprocessing Pool to iterate over argument

I'm using a multiprocessing Pool to run a function over multiple arguments, over and over. I use a list of jobs that is filled by another thread and a job_handler function that handles each job. My problem is that when the list becomes empty, the Pool finishes. I want to keep the pool alive and have it wait until the list is filled again. As I see it, there are two approaches to this.
1. Use one pool, which ends after the list becomes empty:
from multiprocessing import Pool
from threading import Thread
from time import sleep
def job_handler(i):
print("Doing job:", i)
sleep(0.5)
def job_adder():
i = 0
while True:
jobs.append(i)
i += 1
sleep(0.1)
if __name__ == "__main__":
pool = Pool(4)
jobs = []
thr = Thread(target=job_adder)
thr.start()
# wait for job_adder to add to list
sleep(1)
pool.map_async(job_handler, jobs)
while True:
pass
2. Multiple map_async calls:
from multiprocessing import Pool
from threading import Thread
from time import sleep
def job_handler(i):
print("Doing job:", i)
sleep(0.5)
def job_adder():
i = 0
while True:
jobs.append(i)
i += 1
sleep(0.1)
if __name__ == "__main__":
pool = Pool(4)
jobs = []
thr = Thread(target=job_adder)
thr.start()
while True:
for job in jobs:
pool1 = pool.map_async(job_handler, (job,))
jobs.remove(job)
What is the difference between the two? I think the first option would be nicer because the map itself would handle the iteration. My aim is to get better performance to handle each job separately.
The need to “slow down” a Pool comes up in a number of situations. This case is easier than some:
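import queue
q=queue.Queue()
# imap needs the function as its first argument; job_handler is the one from the question
m=pool.imap(job_handler, iter(q.get,None))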
You can also use imap_unordered; None is a sentinel to terminate the Pool. The Pool has to use a thread to collect the tasks (since those functions are “lazier [than] map()”), and it will block on q as needed.
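A self-contained sketch of this pattern follows. It uses the thread-based pool from multiprocessing.dummy so that it runs anywhere, and a simplified job_handler that returns a value instead of printing; both are assumptions made for this example. The two-argument form iter(q.get, None) keeps calling q.get() until it returns the sentinel None.

```python
import queue
from multiprocessing.dummy import Pool  # thread-based pool with the same API


def job_handler(i):
    # Simplified stand-in for the question's job_handler
    return i * 10


q = queue.Queue()
for i in range(5):
    q.put(i)
q.put(None)  # sentinel: ends the iteration once it is reached

with Pool(4) as pool:
    # iter(q.get, None) yields items from the queue, blocking while it is empty
    results = list(pool.imap(job_handler, iter(q.get, None)))

print(results)
```

In a long-running program, another thread would keep calling q.put() and would only put None when the pool should stop.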

Python 3 Limit count of active threads (finished threads do not quit)

I want to limit the number of active threads. What I have seen is that a finished thread stays alive and does not exit by itself, so the number of active threads keeps growing until an error occurs.
The following code starts only 8 threads at a time, but they stay alive even after they have finished, so the number keeps growing:
import threading
import time
class ThreadEx(threading.Thread):
__thread_limiter = None
__max_threads = 2
@classmethod
def max_threads(cls, thread_max):
ThreadEx.__max_threads = thread_max
ThreadEx.__thread_limiter = threading.BoundedSemaphore(value=ThreadEx.__max_threads)
def __init__(self, target=None, args:tuple=()):
super().__init__(target=target, args=args)
if not ThreadEx.__thread_limiter:
ThreadEx.__thread_limiter = threading.BoundedSemaphore(value=ThreadEx.__max_threads)
def run(self):
ThreadEx.__thread_limiter.acquire()
try:
#success = self._target(*self._args)
#if success: return True
super().run()
except:
pass
finally:
ThreadEx.__thread_limiter.release()
def call_me(test1, test2):
print(test1 + test2)
time.sleep(1)
ThreadEx.max_threads(8)
for i in range(0, 99):
t = ThreadEx(target=call_me, args=("Thread count: ", str(threading.active_count())))
t.start()
Due to the for loop, the number of threads keeps growing to 99.
I know that a thread has done its work because call_me has been executed and threading.active_count() was printed.
Does somebody know how I can make sure a finished thread does not stay alive?
This may be a silly answer but to me it looks you are trying to reinvent ThreadPool.
from multiprocessing.pool import ThreadPool
from time import sleep
p = ThreadPool(8)
def call_me(test1):
    print(test1)
    sleep(1)
for i in range(0, 99):
    p.apply_async(call_me, args=(i,))
p.close()
p.join()
This ensures that at most 8 threads are running your function at any point in time. If you want more performance for CPU-bound work, you can import Pool from multiprocessing and use that instead. The interface is the same, but the pool then consists of subprocesses instead of threads, which usually gives a performance boost because the GIL does not get in the way.
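To check the claim about the concurrency cap, here is a small sketch that records how many tasks run at once. The counters running and peak, and the task function, are made up for this illustration:

```python
import threading
import time
from multiprocessing.pool import ThreadPool

lock = threading.Lock()
running = 0   # tasks currently inside task()
peak = 0      # highest value of running seen so far

def task(i):
    global running, peak
    with lock:
        running += 1
        peak = max(peak, running)
    time.sleep(0.01)  # stand-in for real I/O-bound work
    with lock:
        running -= 1
    return i

with ThreadPool(8) as pool:
    results = pool.map(task, range(40))

print("peak concurrency:", peak)
```

The pool creates its 8 worker threads once and reuses them, so the thread count does not grow with the number of submitted tasks.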
I have changed the class following Hannu's advice.
I am posting it for reference; maybe it is useful to others who come across this post:
import threading
from multiprocessing.pool import ThreadPool
import time
class MultiThread():
    __thread_pool = None
    @classmethod
    def begin(cls, max_threads):
        MultiThread.__thread_pool = ThreadPool(max_threads)
    @classmethod
    def end(cls):
        MultiThread.__thread_pool.close()
        MultiThread.__thread_pool.join()
    def __init__(self, target=None, args:tuple=()):
        self.__target = target
        self.__args = args
    def run(self):
        try:
            result = MultiThread.__thread_pool.apply_async(self.__target, args=self.__args)
            # Note: get() blocks until this task has finished
            return result.get()
        except:
            pass
def call_me(test1, test2):
    print(test1 + test2)
    time.sleep(1)
    return 0
MultiThread.begin(8)
for i in range(0, 99):
    t = MultiThread(target=call_me, args=("Thread count: ", str(threading.active_count())))
    t.run()
MultiThread.end()
The maximum number of threads at any given time is 8, as set by the begin method.
The run method also returns the result of the passed function, if it returns something.
Hope that helps.
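One caveat with the class above: run() calls result.get(), which waits for that task to finish, so calling run() in a loop executes the tasks one after another. A possible variation, sketched here under the assumption that the results are only needed once everything has been submitted, is to queue all tasks first and collect afterwards (the call_me arguments are illustrative):

```python
import time
from multiprocessing.pool import ThreadPool

def call_me(prefix, n):
    time.sleep(0.01)  # stand-in for real work
    return prefix + str(n)

with ThreadPool(8) as pool:
    # Submit everything first so up to 8 tasks can overlap...
    pending = [pool.apply_async(call_me, args=("task ", i)) for i in range(20)]
    # ...and only then wait for the individual results.
    results = [p.get() for p in pending]

print(results[:3])
```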

python multithreading wait till all threads finished

This may have been asked in a similar context, but I was unable to find an answer after about 20 minutes of searching, so I will ask.
I have written a Python script (let's say scriptA.py) and another script (let's say scriptB.py).
In scriptB I want to call scriptA multiple times with different arguments. Each run takes about an hour (it's a huge script that does lots of stuff, don't worry about it), and I want to run scriptA with all the different arguments simultaneously, but I need to wait until ALL of them are done before continuing. My code:
import subprocess
#setup
do_setup()
#run scriptA
subprocess.call(scriptA + argumentsA)
subprocess.call(scriptA + argumentsB)
subprocess.call(scriptA + argumentsC)
#finish
do_finish()
I want to run all the subprocess.call() invocations at the same time and then wait until they are all done. How should I do this?
I tried to use threading like the example here:
from threading import Thread
import subprocess
def call_script(args):
    subprocess.call(args)
#run scriptA
# args must be a tuple, hence the trailing comma
t1 = Thread(target=call_script, args=(scriptA + argumentsA,))
t2 = Thread(target=call_script, args=(scriptA + argumentsB,))
t3 = Thread(target=call_script, args=(scriptA + argumentsC,))
t1.start()
t2.start()
t3.start()
But I do not think this is right.
How do I know they have all finished running before going to my do_finish()?
Put the threads in a list and then call the join method on each of them:
threads = []
t = Thread(...)
threads.append(t)
# ...repeat as often as necessary...
# Start all threads
for x in threads:
    x.start()
# Wait for all of them to finish
for x in threads:
    x.join()
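A self-contained sketch of the same pattern applied to subprocess calls. The commands here just launch the current Python interpreter with a one-line program as a stand-in for scriptA, and the return_codes dictionary is an addition for illustration:

```python
import subprocess
import sys
from threading import Thread

return_codes = {}

def call_script(name, args):
    # subprocess.call blocks this thread until the child process exits.
    return_codes[name] = subprocess.call(args)

# Stand-ins for "scriptA + arguments": each runs a trivial Python one-liner.
commands = {
    "A": [sys.executable, "-c", "print('run A')"],
    "B": [sys.executable, "-c", "print('run B')"],
    "C": [sys.executable, "-c", "print('run C')"],
}

threads = [Thread(target=call_script, args=(name, cmd))
           for name, cmd in commands.items()]
for t in threads:
    t.start()
for t in threads:
    t.join()  # returns only once that thread's subprocess has finished

print(return_codes)
```

Anything placed after the second loop, such as do_finish(), runs only after all three child processes have exited.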
You need to call the join method of each Thread object at the end of the script.
t1 = Thread(target=call_script, args=(scriptA + argumentsA,))
t2 = Thread(target=call_script, args=(scriptA + argumentsB,))
t3 = Thread(target=call_script, args=(scriptA + argumentsC,))
t1.start()
t2.start()
t3.start()
t1.join()
t2.join()
t3.join()
Thus the main thread will wait till t1, t2 and t3 finish execution.
Since Python 3.2 there is another way to achieve the same result, which I personally prefer to the traditional thread creation/start/join: the concurrent.futures package: https://docs.python.org/3/library/concurrent.futures.html
Using a ThreadPoolExecutor the code would be:
from concurrent.futures import ThreadPoolExecutor
import time
def call_script(ordinal, arg):
    print('Thread', ordinal, 'argument:', arg)
    time.sleep(2)
    print('Thread', ordinal, 'Finished')
args = ['argumentsA', 'argumentsB', 'argumentsC']
# Leaving the with block waits for all submitted tasks to finish
with ThreadPoolExecutor(max_workers=2) as executor:
    ordinal = 1
    for arg in args:
        executor.submit(call_script, ordinal, arg)
        ordinal += 1
print('All tasks have been finished')
The output of the previous code is something like:
Thread 1 argument: argumentsA
Thread 2 argument: argumentsB
Thread 1 Finished
Thread 2 Finished
Thread 3 argument: argumentsC
Thread 3 Finished
All tasks have been finished
One of the advantages is that you can control the throughput by setting the maximum number of concurrent workers.
To use multiprocessing instead, you can use ProcessPoolExecutor.
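If the return values are needed as well, the Future objects returned by submit can be collected. A minimal sketch, where the work function and its inputs are placeholders rather than anything from the question:

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

def work(arg):
    # Placeholder task; a real one might call subprocess.call(...) instead.
    return arg.upper()

args = ['argumentsA', 'argumentsB', 'argumentsC']
with ThreadPoolExecutor(max_workers=2) as executor:
    # Map each future back to the argument it was submitted with.
    futures = {executor.submit(work, a): a for a in args}
    # as_completed yields futures in the order they finish.
    results = {futures[f]: f.result() for f in as_completed(futures)}

print(results)
```

f.result() also re-raises any exception raised inside the task, so failures do not go unnoticed.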
I prefer using a list comprehension based on an input list:
inputs = [scriptA + argumentsA, scriptA + argumentsB, ...]
threads = [Thread(target=call_script, args=(i,)) for i in inputs]
[t.start() for t in threads]
[t.join() for t in threads]
You can use a class like the one below: add any number of functions or console scripts you want to execute in parallel, start the execution, and wait for all jobs to complete.
from multiprocessing import Process
class ProcessParallel(object):
    """
    Runs the given functions in parallel, one process each
    """
    def __init__(self, *jobs):
        """
        Stores the functions to run
        """
        self.jobs = jobs
        self.processes = []
    def fork_processes(self):
        """
        Creates the process objects for the given function delegates
        """
        for job in self.jobs:
            proc = Process(target=job)
            self.processes.append(proc)
    def start_all(self):
        """
        Starts all the function processes together.
        """
        for proc in self.processes:
            proc.start()
    def join_all(self):
        """
        Waits until all the functions have executed.
        """
        for proc in self.processes:
            proc.join()
def two_sum(a=2, b=2):
    return a + b
def multiply(a=2, b=2):
    return a * b
#How to run:
if __name__ == '__main__':
    #note: two_sum, multiply can be replaced with any Python console scripts
    #you want to run in parallel.
    procs = ProcessParallel(two_sum, multiply)
    #Add all the processes to the list
    procs.fork_processes()
    #start process execution
    procs.start_all()
    #wait until all the processes have executed
    procs.join_all()
I just came across the same problem, where I needed to wait for all the threads that were created in a for loop. I tried out the following piece of code. It may not be the perfect solution, but I thought it would be a simple solution to test:
for t in threading.enumerate():
    try:
        t.join()
    except RuntimeError as err:
        # str(err) is needed: an exception object does not support "in"
        if 'cannot join current thread' in str(err):
            continue
        else:
            raise
From the threading module documentation:
There is a “main thread” object; this corresponds to the initial
thread of control in the Python program. It is not a daemon thread.
There is the possibility that “dummy thread objects” are created.
These are thread objects corresponding to “alien threads”, which are
threads of control started outside the threading module, such as
directly from C code. Dummy thread objects have limited functionality;
they are always considered alive and daemonic, and cannot be join()ed.
They are never deleted, since it is impossible to detect the
termination of alien threads.
So, to catch those two cases when you are not interested in keeping a list of the threads you create:
import threading as thrd
def alter_data(data, index):
    data[index] *= 2
data = [0, 2, 6, 20]
for i, value in enumerate(data):
    thrd.Thread(target=alter_data, args=[data, i]).start()
for thread in thrd.enumerate():
    if thread.daemon:
        continue
    try:
        thread.join()
    except RuntimeError as err:
        if 'cannot join current thread' in err.args[0]:
            # catches the main thread
            continue
        else:
            raise
Whereupon:
>>> print(data)
[0, 4, 12, 40]
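A possible variation on the loop above, offered only as a sketch: instead of matching the text of the RuntimeError, skip the calling thread up front by comparing against threading.current_thread().

```python
import threading

def alter_data(data, index):
    data[index] *= 2

data = [0, 2, 6, 20]
for i in range(len(data)):
    threading.Thread(target=alter_data, args=(data, i)).start()

current = threading.current_thread()
for thread in threading.enumerate():
    # Skip the thread doing the joining, and daemon threads
    # (which includes the non-joinable dummy thread objects).
    if thread is current or thread.daemon:
        continue
    thread.join()

print(data)
```

This avoids depending on the exact wording of the exception message, which is not a documented part of the API.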
Maybe, something like
for t in threading.enumerate():
    if t.daemon:
        t.join()
Using only join can result in a false-positive interaction with the thread. As the docs say:
When the timeout argument is present and not None, it should be a
floating point number specifying a timeout for the operation in
seconds (or fractions thereof). As join() always returns None, you
must call isAlive() after join() to decide whether a timeout happened
– if the thread is still alive, the join() call timed out.
and an illustrative piece of code:
threads = []
for name in some_data:
    new = threading.Thread(
        target=self.some_func,
        args=(name,)
    )
    threads.append(new)
    new.start()
over_threads = iter(threads)
curr_th = next(over_threads)
while True:
    curr_th.join()
    if curr_th.is_alive():
        continue
    try:
        curr_th = next(over_threads)
    except StopIteration:
        break
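The quoted passage concerns join() with a timeout. A short sketch of that case, using is_alive() (the current spelling of isAlive()); the slow function and the timing values are arbitrary choices for illustration:

```python
import threading
import time

def slow():
    time.sleep(0.5)  # stand-in for a long-running task

t = threading.Thread(target=slow)
t.start()

t.join(timeout=0.05)      # returns None whether or not the thread finished
timed_out = t.is_alive()  # still alive means the join timed out

t.join()                  # no timeout: blocks until the thread is done
finished = not t.is_alive()

print(timed_out, finished)
```

Without a timeout argument, join() returns only after the thread has ended, so the is_alive() check matters only when a timeout is passed.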
