Why is multiprocessing module not producing the desired result?

import multiprocessing as mp
import os

def cube(num):
    print(os.getpid())
    print("Cube is {}".format(num*num*num))

def square(num):
    print(os.getpid())
    print("Square is {}".format(num*num))

if __name__ == "__main__":
    p1 = mp.Process(target = cube, args = (3,))
    p2 = mp.Process(target = square, args = (4,))
    p1.start()
    p2.start()
    p1.join()
    p2.join()
    print("Done")
I was using the multiprocessing module, but I am not able to print any output from a function using that.
I even tried flushing the stdout using the sys module.

Q : "Why is multiprocessing module not producing the desired result?"
Why?
Because it crashes.
The MWE/MCVE representation of the problem contains wrong code. It crashes, and that has nothing to do with sys.stdout.flush():
>>> cube( 4 )
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 2, in cube
NameError: global name 'os' is not defined
Solution :
>>> import os # be it in the __main__ or in the def()-ed functions...
>>> cube( 4 )
14165
Cube is 64
and your mp.Process()-based replicas of the python-process instances will stop crashing too.
MCVE that works :
(base) Fri May 29 14:29:33 $ conda activate py3
(py3) Fri May 29 14:34:55 $ python StackOverflow_mp.py
This is ____6745::__main__
This is ____6746::PID
This is ____6747::PID
Cube(__3) is _______27.
Square(__4) is _______16.
Done.
Works.
Q.E.D.
import multiprocessing as mp
import os
import sys
import time

def cube( num ):
    print( "This is {0:_>8d}::PID".format( os.getpid() ) )
    print( "Cube({0:_>3d}) is {1:_>9d}.".format( num, num*num*num ) )
    sys.stdout.flush()

def square( num ):
    print( "This is {0:_>8d}::PID".format( os.getpid() ) )
    print( "Square({0:_>3d}) is {1:_>9d}.".format( num, num*num ) )
    sys.stdout.flush()

if __name__ == "__main__":
    print( "This is {0:_>8d}::__main__".format( os.getpid() ) )
    p1 = mp.Process( target = cube,   args = (3, ) )
    p2 = mp.Process( target = square, args = (4, ) )
    p1.start()
    p2.start()
    p1.join()
    p2.join()
    time.sleep( 1 )
    print( "Done.\nWorks.\nQ.E.D." )
I copied and pasted your exact code. But I still didn't get the output from the called functions using the multiprocessing library. – Kartikeya Agarwal 47 mins ago
So,
- I opened a new Terminal process,
- I copied the conda activate py3 command and
- I hit Enter to let it run, so as to make python3 ecosystem go live.
- I re-launched the proof-of-solution, python StackOverflow_mp.py, and
- I hit Enter to let it run
- I saw it working the very same way as it worked last time.
- I doubt the problem is on the side of this twice (re)validated proof-of-solution, is it?
Q.E.D.
(py3) Fri May 29 19:53:58 $ python StackOverflow_mp.py
This is ___27202::__main__
This is ___27203::PID
Cube(__3) is _______27.
This is ___27204::PID
Square(__4) is _______16.
Done

Related

Multiprocessing a for loop - got errors

I have some code in Python and I want to run it with multiprocessing:
import multiprocessing as mp
from multiprocessing.sharedctypes import Value
import time
import math

resault_a = []
resault_b = []
resault_c = []

def make_calculation_one(numbers):
    for number in numbers:
        resault_a.append(math.sqrt(number**3))

def make_calculation_two(numbers):
    for number in numbers:
        resault_a.append(math.sqrt(number**4))

def make_calculation_three(numbers):
    for number in numbers:
        resault_c.append(math.sqrt(number**5))

number_list = list(range(1000000))

if __name__ == "__main__":
    mp.set_start_method("fork")
    p1 = mp.Process(target=make_calculation_one, args=(number_list))
    p2 = mp.Process(target=make_calculation_two, args=(number_list))
    p3 = mp.Process(target=make_calculation_three, args=(number_list))
    start = time.time()
    p1.start()
    p2.start()
    p3.start()
    end = time.time()
    print(end - start)
I got an empty array, where is the problem?
I got some errors:
Process Process-1:
Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/multiprocessing/process.py", line 315, in _bootstrap
    self.run()
How can I fix it?
Thanks!
There are several issues with your code:
The major problem is that the args argument to the Process initializer requires a tuple or list. You are specifying args=(number_list). The parentheses around number_list do not make this a tuple; without the comma you just have a parenthesized expression that still evaluates to the list itself. So instead of passing a single argument that is a list, you are passing 1,000,000 arguments, while your "worker" functions take only 1 argument. You need: args=(number_list,) (see the short illustration after this list of issues).
Your worker functions are doing calculations but neither printing nor returning the results of these calculations. Assuming you want to return the results, you need a mechanism for doing so. If you are using multiprocessing.Process then the usual solution is to pass to the worker function a multiprocessing.Queue instance to which the worker function can put the results (see below). You can also use a multiprocessing pool (also see below).
Your timing is not quite right. You have started the child processes and immediately set end without waiting for the tasks to complete. To get the actual time, end should only be set when the child processes have finished creating their results.
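As a short illustration of the first point (the names mirror the question's code):

number_list = list(range(1000000))

args = (number_list)    # parentheses only: still just the list itself, so
                        # target(*args) receives 1,000,000 positional arguments
args = (number_list,)   # a 1-tuple: the list is passed as a single argument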
Using Process with queues
import multiprocessing as mp
import time
import math

def make_calculation_one(numbers, out_q):
    out_q.put([math.sqrt(number**3) for number in numbers])

def make_calculation_two(numbers, out_q):
    out_q.put([math.sqrt(number**4) for number in numbers])

def make_calculation_three(numbers, out_q):
    out_q.put([math.sqrt(number**5) for number in numbers])

if __name__ == "__main__":
    # We only want one copy of `number_list`, i.e. in our main process.
    # But there is actually no need to convert to a list:
    number_list = range(1000000)
    mp.set_start_method("fork")
    out_q_1 = mp.Queue()
    out_q_2 = mp.Queue()
    out_q_3 = mp.Queue()
    # Create the 3 worker processes:
    p1 = mp.Process(target=make_calculation_one, args=(number_list, out_q_1))
    p2 = mp.Process(target=make_calculation_two, args=(number_list, out_q_2))
    p3 = mp.Process(target=make_calculation_three, args=(number_list, out_q_3))
    start = time.time()
    p1.start()
    p2.start()
    p3.start()
    results = []
    # Get return values (before join(), so the queues are drained first):
    results.append(out_q_1.get())
    results.append(out_q_2.get())
    results.append(out_q_3.get())
    end = time.time()
    p1.join()
    p2.join()
    p3.join()
    print(end - start)
Using a shared memory array to pass the number list and to return the results
import multiprocessing as mp
import time
import math

def make_calculation_one(numbers, results):
    for idx, number in enumerate(numbers):
        results[idx] = math.sqrt(number**3)

def make_calculation_two(numbers, results):
    for idx, number in enumerate(numbers):
        results[idx] = math.sqrt(number**4)

def make_calculation_three(numbers, results):
    for idx, number in enumerate(numbers):
        results[idx] = math.sqrt(number**5)

if __name__ == "__main__":
    # We only want one copy of `number_list`, i.e. in our main process
    number_list = mp.RawArray('d', range(1000000))
    mp.set_start_method("fork")
    results_1 = mp.RawArray('d', len(number_list))
    results_2 = mp.RawArray('d', len(number_list))
    results_3 = mp.RawArray('d', len(number_list))
    # Create the 3 worker processes:
    p1 = mp.Process(target=make_calculation_one, args=(number_list, results_1))
    p2 = mp.Process(target=make_calculation_two, args=(number_list, results_2))
    p3 = mp.Process(target=make_calculation_three, args=(number_list, results_3))
    start = time.time()
    p1.start()
    p2.start()
    p3.start()
    p1.join()
    p2.join()
    p3.join()
    end = time.time()
    print(end - start)
Using a multiprocessing pool
import multiprocessing as mp
import time
import math

def make_calculation_one(numbers):
    return [math.sqrt(number**3) for number in numbers]

def make_calculation_two(numbers):
    return [math.sqrt(number**4) for number in numbers]

def make_calculation_three(numbers):
    return [math.sqrt(number**5) for number in numbers]

if __name__ == "__main__":
    # We only want one copy of `number_list`, i.e. in our main process
    number_list = range(1000000)
    mp.set_start_method("fork")
    # Create pool of size 3:
    pool = mp.Pool(3)
    start = time.time()
    async_results = []
    async_results.append(pool.apply_async(make_calculation_one, args=(number_list,)))
    async_results.append(pool.apply_async(make_calculation_two, args=(number_list,)))
    async_results.append(pool.apply_async(make_calculation_three, args=(number_list,)))
    # Now wait for the results:
    results = [async_result.get() for async_result in async_results]
    end = time.time()
    pool.close()
    pool.join()
    print(end - start)
Conclusion
Since your calculations yield a type readily supported by shared memory, the second code example above should result in the best performance. You could also adapt the multiprocessing pool example to use shared memory.
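For instance, here is a minimal sketch of that adaptation (an illustration, not code from the question: init_worker and cube_sqrt are hypothetical names, the array is shortened to 1,000 elements, and the fork start method from the examples above is assumed):

import math
import multiprocessing as mp

def init_worker(numbers_, results_):
    # Runs once in every pool worker; publishes the shared arrays as globals
    global numbers, results
    numbers = numbers_
    results = results_

def cube_sqrt(i):
    # Writes straight into shared memory; nothing needs to be returned
    results[i] = math.sqrt(numbers[i] ** 3)

if __name__ == "__main__":
    mp.set_start_method("fork")
    numbers = mp.RawArray('d', range(1000))
    results = mp.RawArray('d', len(numbers))
    with mp.Pool(3, initializer=init_worker, initargs=(numbers, results)) as pool:
        pool.map(cube_sqrt, range(len(numbers)))
    print(results[0], results[999])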
I'm getting some other error:
Process Process-1:
Traceback (most recent call last):
  File "/usr/lib/python3.10/multiprocessing/process.py", line 314, in _bootstrap
    self.run()
  File "/usr/lib/python3.10/multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
TypeError: make_calculation_one() takes 1 positional argument but 1000000 were given
but if I change these lines as follows, then it works:
p1 = mp.Process(target=make_calculation_one, args=([number_list]))
p2 = mp.Process(target=make_calculation_two, args=([number_list]))
p3 = mp.Process(target=make_calculation_three, args=([number_list]))
The function that is run in a worker Process cannot access data in the parent process. If the "fork" start method is used, it does have access to the copy of that data made in the forked process, but modifying that copy does not alter the value in the parent process, as the sketch below illustrates.
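A minimal sketch of that isolation (illustrative code, not from the question):

import multiprocessing as mp

data = []

def worker():
    data.append(1)                # mutates the child's copy only
    print("child sees:", data)    # -> child sees: [1]

if __name__ == "__main__":
    p = mp.Process(target=worker)
    p.start()
    p.join()
    print("parent sees:", data)   # -> parent sees: [] (parent list unchanged)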
In this case, the easiest thing to do is to create a multiprocessing.Array, and pass that to the process to use.
import math
import multiprocessing as mp

def make_calculation_one(numbers, res):
    for idx, number in enumerate(numbers):
        res[idx] = math.sqrt(number**3)

number_list = list(range(10000))

if __name__ == "__main__":
    result_a = mp.Array("d", len(number_list))
    p1 = mp.Process(target=make_calculation_one, args=(number_list, result_a))
    p1.start()
    p1.join()
    print(sum(result_a))
This code prints the value 3999500012.4745193.

How to get every second's GPU usage in Python

I have a model which runs with tensorflow-gpu and my device is an NVIDIA GPU. I want to list every second's GPU usage so that I can measure the average/maximum GPU usage. I can do this manually by opening two terminals: one runs the model and the other measures GPU usage with nvidia-smi -l 1. Of course, this is not a good way. I also tried to use a Thread to do it; here it is.
import subprocess as sp
import os
from threading import Thread

class MyThread(Thread):
    def __init__(self, func, args):
        super(MyThread, self).__init__()
        self.func = func
        self.args = args

    def run(self):
        self.result = self.func(*self.args)

    def get_result(self):
        return self.result

def get_gpu_memory():
    output_to_list = lambda x: x.decode('ascii').split('\n')[:-1]
    ACCEPTABLE_AVAILABLE_MEMORY = 1024
    COMMAND = "nvidia-smi -l 1 --query-gpu=memory.used --format=csv"
    memory_use_info = output_to_list(sp.check_output(COMMAND.split()))[1:]
    memory_use_values = [int(x.split()[0]) for i, x in enumerate(memory_use_info)]
    return memory_use_values

def run():
    pass

t1 = MyThread(run, args=())
t2 = MyThread(get_gpu_memory, args=())
t1.start()
t2.start()
t1.join()
t2.join()
res1 = t2.get_result()
However, this does not return every second's usage either. Is there a good solution?
In the command nvidia-smi -l 1 --query-gpu=memory.used --format=csv
the -l stands for:
-l, --loop= Probe until Ctrl+C at specified second interval.
So the command:
COMMAND = 'nvidia-smi -l 1 --query-gpu=memory.used --format=csv'
sp.check_output(COMMAND.split())
will never terminate and return.
It works if you move the looping from the command (nvidia-smi) into Python.
Here is the code:
import subprocess as sp
import os
from threading import Thread, Timer
import sched, time

def get_gpu_memory():
    output_to_list = lambda x: x.decode('ascii').split('\n')[:-1]
    ACCEPTABLE_AVAILABLE_MEMORY = 1024
    COMMAND = "nvidia-smi --query-gpu=memory.used --format=csv"
    try:
        memory_use_info = output_to_list(sp.check_output(COMMAND.split(), stderr=sp.STDOUT))[1:]
    except sp.CalledProcessError as e:
        raise RuntimeError("command '{}' return with error (code {}): {}".format(e.cmd, e.returncode, e.output))
    memory_use_values = [int(x.split()[0]) for i, x in enumerate(memory_use_info)]
    # print(memory_use_values)
    return memory_use_values

def print_gpu_memory_every_5secs():
    """
    This function calls itself every 5 secs and prints the gpu_memory.
    """
    Timer(5.0, print_gpu_memory_every_5secs).start()
    print(get_gpu_memory())

print_gpu_memory_every_5secs()

"""
Do stuff.
"""
Here is a more rudimentary way of getting this output, but it is just as effective, and I think easier to understand. I added a small 10-value cache to get a good recent average and upped the check interval to every second. It outputs the average of the last 10 seconds and the current value each second, so the operations that cause the usage can be identified (which I think was the original question).
import subprocess as sp
import time

memory_total = 8192  # found with this command: nvidia-smi --query-gpu=memory.total --format=csv
memory_used_command = "nvidia-smi --query-gpu=memory.used --format=csv"
isolate_memory_value = lambda x: "".join(y for y in x.decode('ascii') if y in "0123456789")

def main():
    percentage_cache = []
    while True:
        memory_used = isolate_memory_value(sp.check_output(memory_used_command.split(), stderr=sp.STDOUT))
        percentage = float(memory_used)/float(memory_total)*100
        percentage_cache.append(percentage)
        percentage_cache = percentage_cache[max(0, len(percentage_cache) - 10):]
        print("curr: " + str(percentage) + " %", "\navg: " + str(sum(percentage_cache)/len(percentage_cache))[:4] + " %\n")
        time.sleep(1)

main()

Python - For loop finishing before it is supposed to

I am currently executing tasks via a pool based on a for loop's length, and it is ending its execution before it is supposed to (before the end of the loop). Any ideas why? Here is the relevant code:
from classes.scraper import size
from multiprocessing import Pool
import threading

if __name__ == '__main__':
    print("Do something")
    size = size()
    pool = Pool(processes=50)
    with open('size.txt','r') as file:
        asf = file.read()
    for x in range(0,1000000):
        if '{num:06d}'.format(num=x) in asf:
            continue
        else:
            res = pool.apply_async(size.scrape, ('{num:06d}'.format(num=x),))
Here is the console output (I am printing out the values inside size.scrape()).
...
...
...
013439
013440
013441
013442
013443
Process finished with exit code 0

Why is the import instruction not executed in a Python Process

I'm looking for a solution to run scripts using multiprocessing.
I have a function which launches 4 processes, and each process executes a script through runpy.run_path(), from which I get the return value back.
Example :
def valorise(product, dico_valo):
    res = runpy.run_path(product + "/PyScript.py", run_name="__main__")
    dico_valo[product] = res["ret"]

def f(mutex, l, dico):
    while len(l) != 0:
        mutex.acquire()
        product = l.pop(0)
        mutex.release()
        p = Process(target=valorise, args=(product, dico))
        p.start()
        p.join()

def run_parallel_computations(valuationDate, list_scripts):
    if len(product_list) > 0:
        print '\n\nPARALLEL COMPUTATIONS BEGIN..........\n\n'
        manager = Manager()
        l = manager.list(list_scripts)
        dico = manager.dict()
        mutex = Lock()
        p1 = Process(target=f, args=(mutex, l, dico), name="script1")
        p2 = Process(target=f, args=(mutex, l, dico), name="script2")
        p3 = Process(target=f, args=(mutex, l, dico), name="script3")
        p4 = Process(target=f, args=(mutex, l, dico), name="script4")
        p1.start()
        p2.start()
        p3.start()
        p4.start()
        p1.join()
        p2.join()
        p3.join()
        p4.join()
        dico_isin = {}
        for i in iter(dico.keys()):
            dico_isin[i] = dico[i]
        return dico
        print '\n\nPARALLEL COMPUTATIONS END..........'
    else:
        print '\n\nNOTHING TO PRICE !'
In every PyScript.py I import a library, and each script has to import it again. However, in this case, it doesn't work as I want and I don't understand why: my library is imported only once, during the first process, and the same 'import' is reused in the other processes.
Could you help me?
Thank you!
It might not be the case in multiprocessing (but it looks like it is).
When you try to import something more than once (e.g. import re in most of your modules), Python will not 're-import' it: it sees the module among those already imported and skips it.
To force reloading you can try reload(module_name) (it cannot reload a single class/method imported from a module; you can reload the whole module or nothing). A minimal sketch follows.
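For reference, a minimal sketch of forcing a re-import (mymodule is a hypothetical module name; note that in Python 3 the reload built-in moved to importlib):

import importlib

import mymodule             # first import: executes the module, caches it in sys.modules
import mymodule             # second import: a no-op, the cached module object is reused

importlib.reload(mymodule)  # forces the module's top-level code to run again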

Ctypes callback functions not working with embedded python 3.1 (AMD64)

I am trying to communicate with a PLC through a DLL (a C API interface distributed by the manufacturer of the PLC). I'm using Python 3.1.4, which is embedded as a scripting environment in another piece of software (x64, Windows 7).
The callback function below doesn't work in this embedded scripting environment (nothing happens). If the trigger for the callback function is generated after the script has been started and stopped, then the software with the embedded Python crashes completely.
The code works fine in a standalone Python (3.1.4 MSC v1500 64-bit AMD as well).
I have successfully implemented other functions of the DLL that don't use callbacks in the embedded Python.
Does anyone have an idea what it could be?
def callback_func(amsaddr, notifHeader, huser):
    print('do something')
    pass

CMPFUNC = WINFUNCTYPE(None, POINTER(AmsAddr), POINTER(AdsNotificationHeader), c_ulong)
cmp_func = CMPFUNC(callback_func)

netId = AmsNetId((c_ubyte*6)(5,18,18,27,1,1))
plcAddress = AmsAddr(netId, 801)
nIndexGroup = c_ulong(0xF021)
nIndexOffset = c_ulong(0x0)
adsNotif = AdsNotificationAttrib(1, 4, 1000000, 1000000)
handle = c_ulong()
huser = c_ulong(10)

ADS_DLL = WinDLL("C:/Program Files/TwinCAT/Ads Api/TcAdsDll/x64/TcAdsDll.dll")
ADS_DLL.AdsSyncAddDeviceNotificationReq.argtypes = [POINTER(AmsAddr), c_ulong, c_ulong, POINTER(AdsNotificationAttrib), CMPFUNC, c_ulong, POINTER(c_ulong)]
ADS_DLL.AdsSyncAddDeviceNotificationReq.restype = c_long

# Function in the DLL with the callback
errCode = ADS_DLL.AdsSyncAddDeviceNotificationReq(pointer(plcAddress), nIndexGroup, nIndexOffset, pointer(adsNotif), cmp_func, huser, pointer(handle))
print('Device Notification error Code : %s' % errCode)
EDIT ^2
I tried a simple ctypes callback and it failed miserably in the embedded Python version... The software just hangs and I have to kill it in the Task Manager.
I tried the following code (from the docs):
from ctypes import *

IntArray5 = c_int * 5
ia = IntArray5(5, 1, 7, 33, 99)
libc = cdll.msvcrt  # or libc = cdll.msvcr90 --> same problem
qsort = libc.qsort
qsort.restype = None
CMPFUNC = CFUNCTYPE(c_int, POINTER(c_int), POINTER(c_int))

def py_cmp_func(a, b):
    print("py_cmp_func", a[0], b[0])
    return 0

cmp_func = CMPFUNC(py_cmp_func)
qsort(ia, len(ia), sizeof(c_int), cmp_func)
EDIT ^3
Managed to get some improvements with the use of threading. Only the print() function doesn't print anything in the callback... Another function, such as os.system('c:/windows/notepad.exe'), doesn't work either, for example.
from ctypes import *
import threading, queue
import os

IntArray5 = c_int * 5
ia = IntArray5(5, 1, 7, 33, 99)
libc = cdll.msvcrt
qsort = libc.qsort
qsort.restype = None
q = queue.Queue()
CMPFUNC = CFUNCTYPE(None, POINTER(c_int), POINTER(c_int))

def py_cmp_func(a, b):
    print("py_cmp_func", a[0], b[0])  # --> doesn't print anything
    print('Callback, in thread %s' % threading.current_thread().name)  # --> doesn't print anything
    q.put('something')

cmp_func = CMPFUNC(py_cmp_func)
t = threading.Thread(target=qsort, args=(ia, len(ia), sizeof(c_int), cmp_func))
t.start()
print(threading.enumerate())  # --> prints [<Thread(Thread-1, started 2068)>, <_MainThread(MainThread, started 2956)>]
t.join()
print(threading.enumerate())  # --> prints [<_MainThread(MainThread, started 2956)>]
print(q.get())  # --> prints 'something'
