Multithreading prints the output, but multiprocessing does not. I searched Stack Overflow and the answered questions didn't solve the problem. Multiprocessing is not working:
from threading import Thread
import datetime
from multiprocessing import Process
import sys
import time

def func1():
    print('Working')
    time.sleep(5)
    global a
    a = 10
    print(datetime.datetime.now())

def func2():
    print("Working")
    time.sleep(10)
    print(datetime.datetime.now())

p1 = Process(target=func1)
p1.start()
p2 = Process(target=func2)
p2.start()
p1.join()
p2.join()
print(a)
Even the print(a) does not print the value. It raises
NameError: name 'a' is not defined
As I commented, plain variables, be they global or not, won't magically travel between multiprocessing Processes. (Well, actually, that's a bit of a simplification and depends on the OS and the multiprocessing start method you're using, but I digress.)
The simplest communication channel is a multiprocessing.Queue (that actually "magically" works between processes).
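If all you need is that single number, another minimal channel is multiprocessing.Value, which allocates a piece of shared memory both processes can see. A sketch (the worker function name is made up for illustration, not part of your code):

```python
import multiprocessing

def worker(shared):
    # Assign through the shared-memory wrapper instead of a plain global
    shared.value = 10

if __name__ == "__main__":
    a = multiprocessing.Value("i", 0)  # "i" = C signed int, initial value 0
    p = multiprocessing.Process(target=worker, args=(a,))
    p.start()
    p.join()
    print(a.value)  # prints 10
```

A Queue is still the better fit when you have a stream of results rather than one fixed slot.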
As discussed in further comments:
- You can't use multiprocessing in an IDE that doesn't save your script before executing it, since multiprocessing requires being able to spawn a copy of the script, and if there's no script on disk, there's nothing to spawn.
- On a similar note, you can't use multiprocessing very well from Jupyter notebooks, since they're not run as regular Python scripts, but via the Python kernel process Jupyter starts.
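A common workaround for notebooks is to define the worker functions in a regular module on disk and import them; the spawned children can then import that module too. A self-contained sketch (the module name mp_helpers is made up, and the file is written from code here only so the example runs on its own; in a notebook you'd create it by hand next to the notebook):

```python
import multiprocessing
import pathlib
import sys
import tempfile

# Write the worker module to disk; in a notebook you would create this
# file manually instead of generating it from code.
moddir = tempfile.mkdtemp()
pathlib.Path(moddir, "mp_helpers.py").write_text(
    "def work(x):\n"
    "    return x * 2\n"
)
sys.path.insert(0, moddir)  # child processes inherit sys.path

if __name__ == "__main__":
    import mp_helpers
    with multiprocessing.Pool(2) as pool:
        print(pool.map(mp_helpers.work, [1, 2, 3]))  # [2, 4, 6]
```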
Here's a simple adaptation of your code to actually pass data between the processes.
Remember to guard your multiprocessing main() with if __name__ == "__main__".
import datetime
import time
import multiprocessing

def func1(q: multiprocessing.Queue):
    print("func1 thinking...")
    time.sleep(2)
    q.put(("func1", 10))
    print("func1 quit at", datetime.datetime.now())

def func2(q: multiprocessing.Queue):
    for x in range(10):
        print("func2 working", x)
        q.put(("func2", x))
        time.sleep(0.3)
    print(datetime.datetime.now())

def main():
    queue = multiprocessing.Queue()
    p1 = multiprocessing.Process(target=func1, args=(queue,))
    p2 = multiprocessing.Process(target=func2, args=(queue,))
    p1.start()
    p2.start()
    p1.join()
    p2.join()
    print("Subprocesses ended, reading their results...")
    while not queue.empty():
        print(queue.get())

if __name__ == "__main__":
    main()
The output is:
func1 thinking...
func2 working 0
func2 working 1
func2 working 2
func2 working 3
func2 working 4
func2 working 5
func2 working 6
func1 quit at 2021-06-16 17:58:46.542275
func2 working 7
func2 working 8
func2 working 9
2021-06-16 17:58:47.577008
Subprocesses ended, reading their results...
('func2', 0)
('func2', 1)
('func2', 2)
('func2', 3)
('func2', 4)
('func2', 5)
('func2', 6)
('func1', 10)
('func2', 7)
('func2', 8)
('func2', 9)
I am testing multiprocessing in a Jupyter notebook and in Spyder:
import multiprocessing
import time

start = time.perf_counter()

def do_something():
    print(f'Sleeping 5 second(s)...')
    time.sleep(5)
    print(f'Done Sleeping...')

p2 = multiprocessing.Process(target=do_something)
p3 = multiprocessing.Process(target=do_something)
p2.start()
p3.start()
p2.join()
p3.join()

finish = time.perf_counter()
print(f'Finished in {round(finish-start, 2)} seconds')
And I got:
Finished in 0.12 seconds
This is much shorter than 5 seconds.
I did test the do_something function on its own and it seems fine. I feel like in the above code, the do_something function was not even executed:
start = time.perf_counter()

def do_something(seconds):
    print(f'Sleeping {seconds} second(s)...')
    time.sleep(seconds)
    print(f'Done Sleeping...{seconds}')

do_something(5)

finish = time.perf_counter()
print(f'Finished in {round(finish-start, 2)} seconds')

Sleeping 5 second(s)...
Done Sleeping...5
Finished in 5.0 seconds
Your code should throw an error (I won't write the traceback to keep the answer short):
RuntimeError:
An attempt has been made to start a new process before the
current process has finished its bootstrapping phase.
This probably means that you are not using fork to start your
child processes and you have forgotten to use the proper idiom
in the main module:
if __name__ == '__main__':
freeze_support()
...
The "freeze_support()" line can be omitted if the program
is not going to be frozen to produce an executable.
Long story short: the multiprocessing package is unable to correctly understand and execute your code as written. You should keep the definitions at the beginning of the file and put the code you want to execute inside an
if __name__ == '__main__':
block. Otherwise, each new process will try to execute the same file (and spawn other processes as well). The corrected code takes about 5.22 seconds to complete on my PC.
The need for the "if" is explained in the programming guidelines (section "Safe importing of main module") of the multiprocessing package. Be sure to read them to avoid unwanted behaviour: multithreading and multiprocessing are prone to elusive bugs when not used correctly.
Here is the corrected code:
import multiprocessing
import time

def do_something():
    print('Sleeping 5 seconds...')
    time.sleep(5)
    print('Done Sleeping.')

if __name__ == '__main__':
    start = time.perf_counter()
    p2 = multiprocessing.Process(target=do_something, args=())
    p3 = multiprocessing.Process(target=do_something, args=())
    p2.start()
    p3.start()
    p2.join()
    p3.join()
    finish = time.perf_counter()
    print(f'Finished in {round(finish-start, 2)} seconds')
Why do you see the output after 0.12 seconds? This happens because each child process throws its error and crashes (you should get two identical runtime errors), then the parent process is able to complete.
Hey everyone, I have a script that runs tasks in parallel. I was using APScheduler for the scheduling, but it runs things synchronously: neither BlockingScheduler nor BackgroundScheduler works with parallel processes. What would be your advice on running the parallel processes every second? I'm using multiprocessing for the parallelism.
EDIT: I have just solved it. If anyone runs into the same issue, here is the example:
from multiprocessing import Process
from apscheduler.schedulers.blocking import BlockingScheduler

def work_log_cpu1():
    print("Process work_log_cpu1")
    list11 = []
    for i in range(10000000):
        list11.append(i*2)
    print("Process work_log_cpu1 finished")

def work_log_cpu2():
    print("Process work_log_cpu2")
    list12 = []
    for i in range(10000000):
        list12.append(i*2)
    print("Process work_log_cpu2 finished")

def work_log_cpu3():
    print("Process work_log_cpu3")
    list13 = []
    for i in range(10000000):
        list13.append(i*2)
    print("Process work_log_cpu3 finished")

def main():
    # sleeps=[3,5,2,7]
    process = Process(target=work_log_cpu1)
    process2 = Process(target=work_log_cpu2)
    process3 = Process(target=work_log_cpu3)
    process.start()
    process2.start()
    process3.start()
    process.join()
    process2.join()
    process3.join()

if __name__ == '__main__':
    # main()
    sched = BlockingScheduler()
    sched.add_job(main, 'interval', seconds=1, id='first_job', max_instances=1)
    sched.start()
What's wrong with multiprocessing?
import multiprocessing
p1 = multiprocessing.Process(target=func1, args=("var1", "var2",))
p2 = multiprocessing.Process(target=func2, args=("var3", "var4",))
p1.start()
p2.start()
p2.join()
I am learning about Python multiprocessing and trying to understand how I can make my code wait for all processes to finish before continuing with the rest of the code. I thought the join() method should do the job, but the output of my code is not what I expected.
Here is the code:
from multiprocessing import Process
import time

def fun():
    print('starting fun')
    time.sleep(2)
    print('finishing fun')

def fun2():
    print('starting fun2')
    time.sleep(5)
    print('finishing fun2')

def fun3():
    print('starting fun3')
    print('finishing fun3')

if __name__ == '__main__':
    processes = []
    print('starting main')
    for i in [fun, fun2, fun3]:
        p = Process(target=i)
        p.start()
        processes.append(p)
    for p in processes:
        p.join()
    print('finishing main')

g = 0
print("g", g)
I expected all processes under if __name__ == '__main__': to finish before the lines g = 0 and print("g", g) are executed, so I expected something like this:
starting main
starting fun2
starting fun
starting fun3
finishing fun3
finishing fun
finishing fun2
finishing main
g 0
But the actual output indicates that there's something I don't understand about join() (or multiprocessing in general):
starting main
g 0
g 0
starting fun2
g 0
starting fun
starting fun3
finishing fun3
finishing fun
finishing fun2
finishing main
g 0
The question is: how do I write the code so that all processes finish first and only then the non-multiprocessing code continues, giving the former output? I run the code from the command prompt on Windows, in case it matters.
On waiting for the processes to finish:
You can just join your list of processes, something like
import multiprocessing
import time

def func1():
    time.sleep(1)
    print('func1')

def func2():
    time.sleep(2)
    print('func2')

def func3():
    time.sleep(3)
    print('func3')

def main():
    processes = [
        multiprocessing.Process(target=func1),
        multiprocessing.Process(target=func2),
        multiprocessing.Process(target=func3),
    ]
    for p in processes:
        p.start()
    for p in processes:
        p.join()

if __name__ == '__main__':
    main()
But if you're thinking about giving your process more complexity, try using a Pool:
import multiprocessing
import time

def func1():
    time.sleep(1)
    print('func1')

def func2():
    time.sleep(2)
    print('func2')

def func3():
    time.sleep(3)
    print('func3')

def main():
    result = []
    with multiprocessing.Pool() as pool:
        result.append(pool.apply_async(func1))
        result.append(pool.apply_async(func2))
        result.append(pool.apply_async(func3))
        # Wait inside the with block: leaving it terminates the pool
        for r in result:
            r.wait()

if __name__ == '__main__':
    main()
More info on Pool
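For the common case where each task returns a value, Pool.map is even shorter than apply_async and collects the results in submission order. A sketch (the square function is made up for illustration):

```python
import multiprocessing
import time

def square(x):
    time.sleep(0.1)  # stand-in for real work
    return x * x

def main():
    # map blocks until every worker is done, so no explicit wait is needed
    with multiprocessing.Pool() as pool:
        results = pool.map(square, range(5))
    print(results)  # [0, 1, 4, 9, 16]

if __name__ == '__main__':
    main()
```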
On why g 0 prints multiple times:
This happens because your platform uses spawn (or forkserver) to start your Process, and the g = 0 assignment and its print call sit outside any function and outside the __main__ if block.
From the docs:
Make sure that the main module can be safely imported by a new Python interpreter without causing unintended side effects (such as starting a new process).
(...)
This allows the newly spawned Python interpreter to safely import the module and then run the module’s foo() function.
Similar restrictions apply if a pool or manager is created in the main module.
It's basically running your top-level code again, because each child process imports your .py file as a module.
I am learning the multiprocessing module of Python. I am on Python 3.8. This is my sample code:
# import stuff
def add(x, y):
    time.sleep(10)
    print(f'{x + y} \n')

def main():
    start = time.perf_counter()
    if __name__ == '__main__':
        p1 = mp.Process(target=add, args=(100, 200))
        p2 = mp.Process(target=add, args=(200, 300))
        p1.start(); p2.start()
        p1.join(); p2.join()
    end = time.perf_counter()
    print(f'{end - start} seconds \n')

main()
I am expecting outputs such as:
300
500
10.something seconds
But when I run it I am getting:
5.999999999062311e-07 seconds
5.00000000069889e-07 seconds
500
300
10.704853300000002 seconds
For some reason the end = time.perf_counter(); print(f'{end - start} seconds \n') part is getting executed once after each process is started and one more time after they both end. But here I am specifically writing p1.join(); p2.join() to tell the computer to wait until these processes are finished and then move on to the following line of code.
Why is it behaving like this? And what can I do to fix it?
This is happening because you are running on Windows, which does not support fork. On Linux, I see the output you expect. Because Windows can't fork, it has to re-import your entire module in each child process in order to run your worker function. Because you're not protecting the code that calculates/prints the runtime in the if __name__ == "__main__": guard, they are executed in both of your worker processes when they are launched, in addition to running in your main process once the workers finish. Move them (and any other code you only want to run in the parent process) into the guard to get the output you want:
# import stuff
def add(x, y):
    time.sleep(10)
    print(f'{x + y} \n')

def main():
    p1 = mp.Process(target=add, args=(100, 200))
    p2 = mp.Process(target=add, args=(200, 300))
    p1.start(); p2.start()
    p1.join(); p2.join()

if __name__ == '__main__':
    start = time.perf_counter()
    main()
    end = time.perf_counter()
    print(f'{end - start} seconds \n')
I'm getting "EOFError: EOF when reading a line" when I try to take input:
def one():
    xyz = input("enter : ")
    print(xyz)
    time.sleep(1)

if __name__ == '__main__':
    from multiprocessing import Process
    import time
    p1 = Process(target=one)
    p1.start()
The main process owns standard input; the forked process doesn't.
What would work is to use multiprocessing.dummy, which doesn't create subprocesses but threads:
def one():
    xyz = input("enter: ")
    print(xyz)
    time.sleep(1)

if __name__ == '__main__':
    from multiprocessing.dummy import Process
    import time
    p1 = Process(target=one)
    p1.start()
Since threads share the process, they also share standard input.
For real multiprocessing, I suggest that you collect interactive input in the main process and pass it to the child as an argument.
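A minimal sketch of that suggestion: the parent reads the line and hands the string to the child, so the child never needs stdin at all. (The isatty fallback is only there so the sketch also runs non-interactively.)

```python
import sys
import time
from multiprocessing import Process

def one(xyz):
    # The child only receives the already-read string; it never touches stdin.
    print(xyz)
    time.sleep(1)

if __name__ == '__main__':
    # Read in the main process, which owns standard input.
    xyz = input("enter : ") if sys.stdin.isatty() else "default"
    p1 = Process(target=one, args=(xyz,))
    p1.start()
    p1.join()
```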