I've defined a (test) function in Python to compare the computation time required to execute the same code three ways: normally (without multiprocessing or multithreading), and then with each of them in turn.
Function (for Basic Usage):
from random import randint as rInt

def highComputationFunction(rangeNumber):
    count_ = 0
    for i in range(rangeNumber):
        count_ = count_*2 + rInt(rangeNumber**2, rangeNumber**3)
    count_ = 10**100//count_
    return count_
Also, for multiprocessing and multithreading I wanted to return the thread's result to the parent, so I modified the function like this:
from random import randint as rInt

def highComputationFunction(rangeNumber, result):
    count_ = 0
    for i in range(rangeNumber):
        count_ = count_*2 + rInt(rangeNumber**2, rangeNumber**3)
    count_ = 10**100//count_
    return count_
Timing each call in the main block as below:
import time

if __name__ == '__main__':
    startTime = time.time()
    rangeNumber = 10000
    computedNum = float(round(highComputationFunction(rangeNumber)//100**5000, 3))
    print('\tFunction of {} Executed in: {} seconds. Result = {}'.format(rangeNumber, round(time.time() - startTime, 2), computedNum))

    inTime = time.time()
    rangeNumber = 100000
    computedNum = float(round(highComputationFunction(rangeNumber)//100**5000, 3))
    print('\tFunction of {} Executed in: {} seconds. Result = {}'.format(rangeNumber, round(time.time() - inTime, 2), computedNum))

    inTime = time.time()
    rangeNumber = 1000000
    computedNum = float(round(highComputationFunction(rangeNumber)//100**5000, 3))
    print('\tFunction of {} Executed in: {} seconds. Result = {}'.format(rangeNumber, round(time.time() - inTime, 2), computedNum))

    print('Total Execution Time: {}'.format(round(time.time() - startTime, 2)))
This executed in approximately 46 seconds in total. One run's output:
# python understandComputation.py
# Function of 10000 Executed in: 0.03 seconds. Result = 0.0
# Function of 100000 Executed in: 0.91 seconds. Result = 0.0
# Function of 1000000 Executed in: 45.49 seconds. Result = 0.0
# Total Execution Time: 46.44
I executed the same thing with multithreading:
import time
import threading

if __name__ == '__main__':
    startTime = time.time()
    result_ = 0
    threadList = []
    for i in [10000, 100000, 1000000]:
        curThread = threading.Thread(target=highComputationFunction, args=(i, result_))
        curThread.start()
        print('\tThread for {} Started.'.format(i))
        threadList.append(curThread)
        result_ += result_
    for i in threadList:
        i.join()
    print('Total Function Executed in: {} seconds. Result = {}'.format(round(time.time() - startTime, 2), result_))
For Multi-Processing:
import time
import multiprocessing

if __name__ == '__main__':
    startTime = time.time()
    result_ = 0
    procList = []
    for i in [10000, 100000, 1000000]:
        curProc = multiprocessing.Process(target=highComputationFunction, args=(i, result_))
        curProc.start()
        print('\tProcess for {} Started.'.format(i))
        procList.append(curProc)
        result_ += result_
    for i in procList:
        i.join()
    print('Total Function Executed in: {} seconds. Result = {}'.format(round(time.time() - startTime, 2), result_))
With both of these, the output actually took longer than the plain version.
# python understandComputation.py
# Thread for 10000 Started.
# Thread for 100000 Started.
# Thread for 1000000 Started.
# Total Function Executed in: 47.04 seconds. Result = 0
# python understandComputation.py
# Process for 10000 Started.
# Process for 100000 Started.
# Process for 1000000 Started.
# Total Function Executed in: 47.21 seconds. Result = 0
Please tell me whether something is wrong with my implementation. I expected multithreading and multiprocessing to finish in under 45.5 seconds, which is the time taken by the 1000000 case alone in the plain version, but I'm not getting that result.
I have been working on a synthetic benchmarking program in Python to stress-test the CPU on various systems for a class project. My approach is based on Mersenne primality tests (inspired by Prime95): the program tests the Mersenne primality of numbers over a user-defined working set. I had this working in plain Python, but once I introduced the concurrent.futures module to run the task in parallel and maximize CPU utilization, I hit a snag. When testing my program I ran into two issues:
CPU utilization is still only ~35%
When testing larger working sets, the program stalls for several minutes before it starts iterating through each prime; I assume this has something to do with concurrent.futures' setup.
I was hoping someone could provide some insight into how to maximize system resource usage with this program and iron out the issues with larger sets.
Code below:
import platform
import psutil
import wmi
from concurrent.futures import ThreadPoolExecutor
from time import perf_counter

sys = platform.uname()
c = wmi.WMI()
winSys = c.Win32_ComputerSystem()[0]
mode1 = "Integer Mode"
mode2 = "Floating-Point Mode"

def lehmer(p: int) -> bool:
    s = 4
    M = (1 << p) - 1
    for i in range(p - 2):
        s = ((s * s) - 2) % M
    return s == 0
#Initial printout of system information and menu screen showing benchmarking options
print("_________________________________________________________________________________")
print("------------------------------System Information---------------------------------")
print(f"\tOS: {sys.system} {sys.release}")
print(f"\tMachine Name: {sys.node}")
print(f"\tVersion: {sys.version}")
print(f"\tCPU: {sys.processor}")
print("\tNumber of Cores: " + str(psutil.cpu_count()))
print(f"\tRAM: {psutil.virtual_memory()}")
print("---------------------------------------------------------------------------------")
modeSelect = 0
print("Welcome to ParaBench! Please select what benchmarking mode you would like to use." + '\n')
modeSelect = int(input("[1] -> " + mode1 + '\n' + "[2] -> " + mode2
                       + "\n[9] -> Exit\n_________________________________________________________________________________\n"))

#User selects Integer benchmarking mode
if modeSelect == 1:
    print("User Selected " + mode1)
    #Printout of selection for order of magnitude
    print("[1] -> First 1x10^2 Primes\n" + "[2] -> First 1x10^3 Primes\n"
          + "[3] -> First 1x10^4 Primes\n" + "[4] -> First 1x10^5 Primes\n" + "[5] -> First 1x10^6 Primes\n")
    mersenneOrder = int(input("Please Select an option\n"))
    if mersenneOrder == 1:
        print("Starting Benchmark...")
        with ThreadPoolExecutor(15) as executor:
            timeStart = perf_counter()
            for result in executor.map(lehmer, range(2, 100)):
                print(result)
            timeStop = perf_counter()
        print("1E2 Benchmark Complete in ", timeStop - timeStart)
    if mersenneOrder == 2:
        print("Starting Benchmark...")
        with ThreadPoolExecutor(15) as executor:
            timeStart = perf_counter()
            for result in executor.map(lehmer, range(2, 1000)):
                print(result)
            timeStop = perf_counter()
        print("1E3 Benchmark Complete!!", timeStop - timeStart)
    if mersenneOrder == 3:
        print("Starting Benchmark...")
        with ThreadPoolExecutor() as executor:
            timeStart = perf_counter()
            for result in executor.map(lehmer, range(2, 10000)):
                print(result)
            timeStop = perf_counter()
        print("1E4 Benchmark Complete!!", timeStop - timeStart)
    if mersenneOrder == 4:
        print("Starting Benchmark...")
        with ThreadPoolExecutor(15) as executor:
            timeStart = perf_counter()
            for result in executor.map(lehmer, range(2, 100000)):
                print(result)
            timeStop = perf_counter()
        print("1E5 Benchmark Complete!!", timeStop - timeStart)
    if mersenneOrder == 5:
        print("Starting Benchmark...")
        with ThreadPoolExecutor(15) as executor:
            timeStart = perf_counter()
            for result in executor.map(lehmer, range(2, 1000000)):
                print(result)
            timeStop = perf_counter()
        print("1E6 Benchmark Complete!!", timeStop - timeStart)
#Single-threaded test (DEPRECATED)
#for x in range(2,1000000):
# if lehmer(x):
# print(x)
I am wondering how to calculate the amount of time it would take to, for example, complete a brute-force word list.
I know how to use the time function and measure elapsed time, but the problem is that I need the program itself to estimate how long it will take...
Here is the code; I made this yesterday:
import itertools, math
import os

Alphabet = ("abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ1234567890") # Add or remove whatevs you think will be in the password you're cracking (example, [symbols])
counter = 1
CharLength = 1
range_num = int(raw_input("Enter range: "))
stopper = range_num + 1
filename = "bruteforce_%r.txt" % (range_num)
f = open(filename, 'a')
#n_1 = len(Alphabet)
#n_2 = n_1 - 1 # <-- total useless peice of garbage that could of been great in vurtual life
#n_3 = '0' * n_2
#n = '1' + n_3
x = range_num
y = len(Alphabet)
amount = math.pow(y, x)
total_items = math.pow(y, x)
for CharLength in range(range_num, stopper):
    passwords = (itertools.product(Alphabet, repeat = CharLength))
    for i in passwords:
        counter += 1
        percentage = (counter / total_items) * 100
        amount -= 1
        i = str(i)
        i = i.replace("[", "")
        i = i.replace("]", "")
        i = i.replace("'", "")
        i = i.replace(" ", "")
        i = i.replace(",", "")
        i = i.replace("(", "")
        i = i.replace(")", "")
        f.write(i)
        f.write('\n')
        print "Password: %r\tPercentage: %r/100\tAmount left: %r" % (i, int(percentage), amount)
        if i == '0' * range_num:
            print "*Done"
            f.close()
            exit(0)
        else:
            pass
This is the timer function I managed to make:
#import winsound # Comment this out if you're using linux
import os
import time
from sys import exit

print "This is the timer\nHit CTRL-C to stop the timer\nOtherwise just let it rip until the time's up"
hours = int(raw_input('Enter the hours.\n>>> '))
os.system('clear') # Linux
#os.system('cls') # Windows
minutes = int(raw_input('Enter the minutes.\n>>> '))
os.system('clear') # Linux
#os.system('cls') # Windows
seconds = int(raw_input('Enter the seconds.\n>>> '))
os.system('clear') # Linux
#os.system('cls') # Windows
stop_time = '%r:%r:%r' % (hours, minutes, seconds)
t_hours = 00
t_minutes = 00
t_seconds = 00
while t_seconds <= 60:
    try:
        os.system('clear') # Linux
        #os.system('cls') # Windows
        current_time = '%r:%r:%r' % (t_hours, t_minutes, t_seconds)
        print current_time
        time.sleep(1)
        t_seconds += 1
        if current_time == stop_time:
            print "// Done"
            #winsound.Beep(500,1000)
            #winsound.Beep(400,1000)
            break
        elif t_seconds == 60:
            t_minutes += 1
            t_seconds = 0
        elif t_minutes == 60:
            t_hours += 1
            t_minutes = 00
    except KeyboardInterrupt:
        print "Stopped at: %r:%r:%r" % (t_hours, t_minutes, t_seconds)
        raw_input("Hit enter to continue\nHit CTRL-C to end")
        try:
            pass
        except KeyboardInterrupt:
            exit(0)
Now I just can't figure out how to make this calculate how long it will take, rather than how long it is taking...
You cannot predict the time a script is going to take.
Firstly because two machines wouldn't run the script in the same time, and secondly because the execution time on one machine can vary from one run to another.
What you can do, however, is compute the percentage of execution.
You need to figure out, for example, how many iterations your main loop will do, and calculate at each iteration the ratio current iteration count / total number of iterations.
Here is a minimalist example of what you can do:
n = 10000
for i in range(n):
    print("Processing file {} ({}%)".format(i, 100*i//n))
    process_file(i)
You can take it further and add the time as an additional info:
import time

n = 10000
t0 = time.time()
for i in range(n):
    t1 = time.time()
    print("Processing file {} ({}%)".format(i, 100*i//n), end="")
    process_file(i)
    t2 = time.time()
    print(" {}s (total: {}s)".format(t2-t1, t2-t0))
The output will look like this:
...
Processing file 2597 (25%) 0.2s (total: 519.4s)
Processing file 2598 (25%) 0.3s (total: 519.7s)
Processing file 2599 (25%) 0.1s (total: 519.8s)
Processing file 2600 (25%)
This is my implementation, which returns time elapsed, time left, and finish time in H:M:S format.
def calcProcessTime(starttime, cur_iter, max_iter):
    telapsed = time.time() - starttime
    testimated = (telapsed/cur_iter)*(max_iter)
    finishtime = starttime + testimated
    finishtime = dt.datetime.fromtimestamp(finishtime).strftime("%H:%M:%S")  # as wall-clock time
    lefttime = testimated - telapsed  # in seconds
    return (int(telapsed), int(lefttime), finishtime)
Example:
import time
import datetime as dt

start = time.time()
cur_iter = 0
max_iter = 10

for i in range(max_iter):
    time.sleep(5)
    cur_iter += 1
    prstime = calcProcessTime(start, cur_iter, max_iter)
    print("time elapsed: %s(s), time left: %s(s), estimated finish time: %s" % prstime)
Output:
time elapsed: 5(s), time left: 45(s), estimated finish time: 14:28:18
time elapsed: 10(s), time left: 40(s), estimated finish time: 14:28:18
time elapsed: 15(s), time left: 35(s), estimated finish time: 14:28:18
....
You will never be able to know exactly how long it is going to take to finish. The best you can do is calculate what percentage of the work you have finished and how long that has taken, then project that out.
For example if you are doing some work on the range of numbers from 1 to 100 you could do something such as
start_time = get the current time
for i in range(1, 101):
    # Do some work
    current_time = get the current time
    elapsed_time = current_time - start_time
    time_left = 100 * elapsed_time / i - elapsed_time
    print(time_left)
Please understand that the above is largely pseudo-code
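A runnable version of that pseudo-code might look like the following (the `sleep` is a stand-in for the real per-item work):

```python
import time

def do_work(i):
    # stand-in for real work; a short sleep keeps per-item time roughly constant
    time.sleep(0.01)

total = 100
start_time = time.time()
for i in range(1, total + 1):
    do_work(i)
    elapsed = time.time() - start_time
    # projected total = elapsed / fraction done; remaining = projection - elapsed
    time_left = total * elapsed / i - elapsed
    print('{}/{} done, about {:.1f}s left'.format(i, total, time_left))
```

The projection assumes each item costs about the same; if the work per item varies a lot, a moving average (as in the next answer) gives a steadier estimate.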
The following function will calculate the remaining time:
last_times = []
def get_remaining_time(i, total, time):
last_times.append(time)
len_last_t = len(last_times)
if len_last_t > 5:
last_times.pop(0)
mean_t = sum(last_times) // len_last_t
remain_s_tot = mean_t * (total - i + 1)
remain_m = remain_s_tot // 60
remain_s = remain_s_tot % 60
return f"{remain_m}m{remain_s}s"
The parameters are:
i : The current iteration
total : the total number of iterations
time : the duration of the last iteration
It uses the average time taken by the last 5 iterations to calculate the remaining time. You can then use it in your code as follows:
import time

last_t = 0
iterations = range(1, 1000)
for i in iterations:
    t = time.time()
    # Do your task here
    last_t = time.time() - t
    print(get_remaining_time(i, len(iterations), last_t))
I have a counter that counts every frame. What I want to do is divide it by elapsed time to determine the FPS of my program, but I'm not sure how to perform operations on timing functions in Python.
I've tried initializing time as
fps_time = time.time
fps_time = float(time.time)
fps_time = np.float(time.time)
fps_time = time()
Then for calculating the fps,
FPS = (counter / fps_time)
FPS = float(counter / fps_time)
FPS = float(counter (fps_time))
But the errors I'm getting are "object is not callable" or "unsupported operand for /: 'int' and 'builtin_function_or_method'".
Thanks in advance for the help!
Here is a very simple way to print your program's frame rate at each frame (no counter needed):

import time

while True:
    start_time = time.time()  # start time of the loop

    ########################
    # your fancy code here #
    ########################

    print("FPS: ", 1.0 / (time.time() - start_time))  # FPS = 1 / time to process loop
If you want the average frame rate over x seconds, you can do it like so (counter needed):

import time

start_time = time.time()
x = 1  # displays the frame rate every 1 second
counter = 0
while True:
    ########################
    # your fancy code here #
    ########################
    counter += 1
    if (time.time() - start_time) > x:
        print("FPS: ", counter / (time.time() - start_time))
        counter = 0
        start_time = time.time()
Hope it helps!
Works like a charm:

import time
import collections

class FPS:
    def __init__(self, averageof=50):
        self.frametimestamps = collections.deque(maxlen=averageof)

    def __call__(self):
        self.frametimestamps.append(time.time())
        if len(self.frametimestamps) > 1:
            # N timestamps span N - 1 frame intervals
            return (len(self.frametimestamps) - 1) / (self.frametimestamps[-1] - self.frametimestamps[0])
        else:
            return 0.0

fps = FPS()
for i in range(100):
    time.sleep(0.1)
    print(fps())
Make sure fps is called once per frame
You might want to do something in this taste:
import time

def program():
    start_time = time.time()  # record start time of program
    frame_counter = 0
    # random logic
    for i in range(0, 100):
        for j in range(0, 100):
            # do stuff that renders a new frame
            frame_counter += 1  # count frame
    end_time = time.time()  # record end time of program
    fps = frame_counter / float(end_time - start_time)
Of course you don't have to wait until the end of the program to compute end_time and fps; you can do it every now and then to report the FPS as the program runs. Re-initializing start_time after each report also helps give a more precise, up-to-date estimate.
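A brief sketch of that periodic-report variant (resetting start_time after each report so the estimate only covers recent frames; the sleep and `report_every` value are stand-ins):

```python
import time

start_time = time.time()
frame_counter = 0
report_every = 30  # frames between FPS reports (arbitrary choice)

for frame in range(120):
    time.sleep(0.01)   # stand-in for rendering a frame
    frame_counter += 1
    if frame_counter == report_every:
        now = time.time()
        print('FPS: {:.1f}'.format(frame_counter / (now - start_time)))
        # re-init so the next estimate only covers recent frames
        start_time = now
        frame_counter = 0
```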
This is sample code for finding FPS. I have used it for pre-processing, inference, and post-processing. Hope it helps!
import time
...

dt, tt, num_im = [0.0, 0.0, 0.0], 0.0, 0
for image in images:
    num_im += 1
    t1 = time.time()
    # task1....
    t2 = time.time()
    dt[0] += t2 - t1
    # task2...
    t3 = time.time()
    dt[1] += t3 - t2
    # task3...
    dt[2] += time.time() - t3
    tt += time.time() - t1

t = tuple(x / num_im * 1E3 for x in dt)
print(f'task1 {t[0]:.2f}ms, task2 {t[1]:.2f}ms, task3 {t[2]:.2f}ms, FPS {num_im / tt:.2f}')
from time import sleep, time

fps = 0
fps_count = 0
start_time = time()
while True:
    if (time() - start_time) > 1:
        fps = fps_count
        fps_count = 1
        start_time = time()
    else:
        fps_count += 1
    print("FPS:", fps)
FPS = the number of cycles running per second
I'm experimenting with multiprocessing in Python. I know that it can be slower than serialized computation; that is not the point of my post.
I'm just wondering why a single-process pool is faster than the serialized computation of my basic problem. Shouldn't these times be the same?
Here is the code:
import time
import multiprocessing as mp
import matplotlib.pyplot as plt

def func(x):
    return x*x*x

def multi_proc(nb_procs):
    tic = time.time()
    pool = mp.Pool(processes=nb_procs)
    pool.map_async(func, range(1, 10000000))
    toc = time.time()
    return toc-tic

def single_core():
    tic = time.time()
    [func(x) for x in range(1, 10000000)]
    toc = time.time()
    return toc-tic

if __name__ == '__main__':
    sc_times = [0]
    mc_times = [0]

    print('single core computation')
    sc_constant_time = single_core()
    print('{} secs'.format(sc_constant_time))

    for nb_procs in range(1, 12):
        print('computing for {} processes...'.format(nb_procs))
        time_elapsed = (multi_proc(nb_procs))
        print('{} secs'.format(time_elapsed))
        mc_times.append(time_elapsed)

    sc_times = [sc_constant_time for _ in mc_times]

    plt.plot(sc_times, 'r--')
    plt.plot(mc_times, 'b--')
    plt.xlabel('nb procs')
    plt.ylabel('time (s)')
    plt.show()
And the plot of times per number of processes (red = serial computation, blue = multiprocessing):
EDIT 1:
I modified my code as Sidhnarth Gupta indicated; here is the new code. I also changed my func for no particular reason.
import time
import multiprocessing as mp
import matplotlib.pyplot as plt
import random

def func(x):
    return random.choice(['a', 'b', 'c', 'd', 'e', 'f', 'g'])

def multi_proc(nb_procs, nb_iter):
    tic = time.time()
    pool = mp.Pool(processes=nb_procs)
    pool.map_async(func, range(1, nb_iter)).get()
    toc = time.time()
    return toc-tic

def single_core(nb_iter):
    tic = time.time()
    [func(x) for x in range(1, nb_iter)]
    toc = time.time()
    return toc-tic

if __name__ == '__main__':
    # configure
    nb_iter = 100000
    max_procs = 16
    sc_times = [0]
    mc_times = [0]

    # multi proc calls
    for nb_procs in range(1, max_procs):
        print('computing for {} processes...'.format(nb_procs))
        time_elapsed = (multi_proc(nb_procs, nb_iter))
        print('{} secs'.format(time_elapsed))
        mc_times.append(time_elapsed)

    # single proc call
    print('single core computation')
    for nb in range(1, len(mc_times)):
        print('{}...'.format(nb))
        sc_times.append(single_core(nb_iter))

    # average time
    average_time = sum(sc_times)/len(sc_times)
    print('average time on single core: {} secs'.format(average_time))

    # plot
    plt.plot(sc_times, 'r--')
    plt.plot(mc_times, 'b--')
    plt.xlabel('nb procs')
    plt.ylabel('time (s)')
    plt.show()
Here is the new plot I have:
I think I can now say that I have increased my program's speed by using multiprocessing.
Your current code for the multiprocessing case is actually measuring the time taken to submit the task to the pool. The processing happens asynchronously, without blocking the calling thread.
I tried your program with following changes:
def multi_proc(nb_procs):
    tic = time.time()
    pool = mp.Pool(processes=nb_procs)
    pool.map_async(func, range(1, 10000000)).get()
    toc = time.time()
    return toc-tic
and
def multi_proc(nb_procs):
    tic = time.time()
    pool = mp.Pool(processes=nb_procs)
    pool.map(func, range(1, 10000000))
    toc = time.time()
    return toc-tic
Both of them take significantly more time than the serialized computation.
Also, when creating such graphs, you should call the single_core() function afresh for each plotted value instead of reusing one measurement multiple times; you will see significant variance in the time it takes.
When I run a Python script that uses multiprocessing, I find it hard to get it to stop cleanly on Ctrl-C: Ctrl-C has to be pressed multiple times and all sorts of error messages appear on the screen.
How can you make a Python script that uses multiprocessing quit cleanly when it receives Ctrl-C?
Take this script for example:
import numpy as np, time
from multiprocessing import Pool

def countconvolve(N):
    np.random.seed() # ensure seed is random
    count = 0
    iters = 1000000 # 1 million
    l = 12
    k = 12
    l0 = l + k - 1
    for n in range(N):
        t = np.random.choice(np.array([-1,1], dtype=np.int8), size=l0 * iters)
        v = np.random.choice(np.array([-1,1], dtype=np.int8), size=l * iters)
        for i in xrange(iters):
            if (not np.convolve(v[(l*i):(l*(i+1))],
                                t[(l0*i):(l0*(i+1))], 'valid').any()):
                count += 1
    return count

if __name__ == '__main__':
    start = time.clock()
    num_processes = 8
    N = 13
    pool = Pool(processes=num_processes)
    res = pool.map(countconvolve, [N] * num_processes)
    print res, sum(res)
    print (time.clock() - start)
Jon's solution is probably better, but here it is using a signal handler. I tried it in a VirtualBox VM, which was extremely slow, but it worked. I hope it helps.
import numpy as np, time
from multiprocessing import Pool
import signal

# define pool as global
pool = None

def term_signal_handler(signum, frame):
    global pool
    print 'CTRL-C pressed'
    try:
        pool.close()
        pool.join()
    except AttributeError:
        print 'Pool has been already closed'

def countconvolve(N):
    np.random.seed() # ensure seed is random
    count = 0
    iters = 1000000 # 1 million
    l = 12
    k = 12
    l0 = l + k - 1
    for n in range(N):
        t = np.random.choice(np.array([-1,1], dtype=np.int8), size=l0 * iters)
        v = np.random.choice(np.array([-1,1], dtype=np.int8), size=l * iters)
        for i in xrange(iters):
            if (not np.convolve(v[(l*i):(l*(i+1))], t[(l0*i):(l0*(i+1))], 'valid').any()):
                count += 1
    return count

if __name__ == '__main__':
    # Register the signal handler
    signal.signal(signal.SIGINT, term_signal_handler)
    start = time.clock()
    num_processes = 8
    N = 13
    pool = Pool(processes=num_processes)
    res = pool.map(countconvolve, [N] * num_processes)
    print res, sum(res)
    print (time.clock() - start)
I believe the try-except mentioned in a similar post here on SO could be adapted to cover it.
If you wrap the pool.map call in a try-except and then call terminate and join, I think that would do it.
[Edit]
Some experimentation suggests something along these lines works well:
from multiprocessing import Pool
import random
import time

def countconvolve(N):
    try:
        sleepTime = random.randint(0, 5)
        time.sleep(sleepTime)
        count = sleepTime
    except KeyboardInterrupt as e:
        pass
    return count

if __name__ == '__main__':
    random.seed(0)
    start = time.clock()
    num_processes = 8
    N = 13
    pool = Pool(processes=num_processes)
    try:
        res = pool.map(countconvolve, [N] * num_processes)
        print res, sum(res)
        print (time.clock() - start)
    except KeyboardInterrupt as e:
        print 'Stopping..'
I simplified your example somewhat to avoid having to load numpy on my machine to test, but the critical part is the two try-except blocks which handle the Ctrl-C key presses.