Multiprocessing question in Python: Windows vs. Linux

import random
import os
from multiprocessing import Process

num = random.randint(0, 100)

def show_num():
    print("pid:{}, num is {}".format(os.getpid(), num))

if __name__ == '__main__':
    print("pid:{}, num is {}".format(os.getpid(), num))
    p = Process(target=show_num)
    p.start()
    p.join()
    print('Parent Process Stop')
The above code shows the basic usage of creating a process. If I run this script in a Windows environment, the variable num is different in the parent process and the child process. However, num is the same in both processes when the script is run in a Linux environment.
I understand that their mechanisms for creating processes are different; for example, Windows doesn't have the fork method.
But can someone give me a more detailed explanation of the difference?
Thank you very much.

The difference explaining the behavior described in your post is exactly what you mentioned: the start method used for creating the process. On Unix-style OSs, the default is fork. On Windows, the only available option is spawn.
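If you want to check which start method your interpreter actually defaults to, here is a minimal sketch (my addition, using the standard multiprocessing.get_start_method() API):

import multiprocessing

if __name__ == '__main__':
    # Typically 'fork' on Linux, 'spawn' on Windows (and on macOS since Python 3.8).
    print(multiprocessing.get_start_method())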
fork
As described in the Overview section of this Wiki page (in a slightly different order):
The fork operation creates a separate address space for the child. The child process has an exact copy of all the memory segments of the parent process.
The child process calls the exec system call to overlay itself with the other program: it ceases execution of its former program in favor of the other.
This means that, when using fork, the child process already has the variable num in its address space and uses it. random.randint(0, 100) is not called again.
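To see this in isolation, here is a minimal sketch of my own (POSIX-only; get_context("fork") raises ValueError on Windows) that forces the fork start method, so both prints show the same num:

import os
import random
from multiprocessing import get_context

num = random.randint(0, 100)

def show_num():
    # num was inherited through fork; random.randint() is not called again here
    print("child pid:{}, num is {}".format(os.getpid(), num))

if __name__ == '__main__':
    ctx = get_context("fork")  # POSIX only
    print("parent pid:{}, num is {}".format(os.getpid(), num))
    p = ctx.Process(target=show_num)
    p.start()
    p.join()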
spawn
As the multiprocessing docs describe:
The parent process starts a fresh python interpreter process.
In this fresh interpreter process, the module from which the child is spawned is executed. Oversimplified, this does python.exe your_script.py a second time. Hence, a new variable num is created in the child process by assigning the return value of another call to random.randint(0, 100) to it. It is therefore very likely that the content of num differs between the processes.
This is, by the way, also the reason why you absolutely need to safeguard the instantiation and start of a process with the if __name__ == '__main__' idiom when using spawn as the start method; otherwise you end up with:
RuntimeError:
An attempt has been made to start a new process before the
current process has finished its bootstrapping phase.
This probably means that you are not using fork to start your
child processes and you have forgotten to use the proper idiom
in the main module:
    if __name__ == '__main__':
        freeze_support()
        ...
The "freeze_support()" line can be omitted if the program
is not going to be frozen to produce an executable.
You can use spawn in POSIX OSs as well, to mimic the behavior you have seen on Windows:
import random
import os
from multiprocessing import Process, set_start_method
import platform

num = random.randint(0, 100)

def show_num():
    print("pid:{}, num is {}".format(os.getpid(), num))

if __name__ == '__main__':
    print(platform.system())
    # change the start method for new processes to spawn
    set_start_method("spawn")
    print("pid:{}, num is {}".format(os.getpid(), num))
    p = Process(target=show_num)
    p.start()
    p.join()
    print('Parent Process Stop')
Output:
Linux
pid:26835, num is 41
pid:26839, num is 13
Parent Process Stop
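A quick way to observe the re-import that spawn performs, in isolation (a minimal sketch of my own): the module-level print below runs once in the parent and once more when the spawned child re-imports the file, with two different pids.

import os
from multiprocessing import Process, set_start_method

print("module executed in pid {}".format(os.getpid()))

def work():
    pass

if __name__ == '__main__':
    set_start_method("spawn")
    p = Process(target=work)
    p.start()
    p.join()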

Related

Cannot kill a loading animation when using multiprocessing

I'm trying to use multiprocessing to run multiple scripts. At the start, I launch a loading animation, however I am unable to ever kill it. Below is an example...
Animation: foo.py
import sys
import time
import itertools
# Simple loading animation that runs infinitely.
for c in itertools.cycle(['|', '/', '-', '\\']):
    sys.stdout.write('\r' + c)
    sys.stdout.flush()
    time.sleep(0.1)
Useful script: bar.py
from time import sleep
# Stand-in for a script that does something useful.
sleep(5)
Attempt to run them both:
import multiprocessing
from multiprocessing import Process
import subprocess
pjt_dir = "/home/solebay/path/to/project" # Setup paths..
foo_path = pjt_dir + "/foo.py" # ..
bar_path = pjt_dir + "/bar.py" # ..
def run_script(path):                    # Simple function that..
    """Launches python scripts."""       # ..allows me to set a..
    subprocess.run(["python", path])     # ..script as a process.
foo_p = Process(target=run_script, args=(foo_path,)) # Define the processes..
bar_p = Process(target=run_script, args=(bar_path,)) # ..
foo_p.start() # start loading animation
bar_p.start() # start 'useful' script
bar_p.join() # Wait for useful script to finish executing
foo_p.kill() # Kill loading animation
I get no error messages, and (my_venv) solebay@computer:~$ comes up in my terminal, but the loading animation persists (clipping over my name and environment). How can I kill it?
I've run into a similar situation before where I couldn't terminate the program using Ctrl + C. The issue is (more or less) solved by using daemonic processes/threads (see the multiprocessing docs). To do this, you simply change
foo_p = Process(target=run_script, args=(foo_path,))
to
foo_p = Process(target=run_script, args=(foo_path,), daemon=True)
and similarly for any other child processes that you would like to create.
That being said, I am not entirely sure whether this is the correct way to remedy the issue of not being able to terminate the multiprocessing program, or just an artifact that happens to help with it. I would suggest this thread, which goes into the discussion of daemon threads in more detail. But essentially, from my understanding, daemon processes/threads are terminated automatically whenever their parent process terminates, regardless of whether they have finished. Meanwhile, if a child is not daemonic, you need to wait for it to finish before you can fully terminate the program.
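As a minimal illustration of that daemon behaviour (a sketch of my own, not taken from the question's code): the child below loops forever, but because daemon=True it is terminated automatically as soon as the parent exits.

import time
from multiprocessing import Process

def spin_forever():
    while True:
        time.sleep(0.1)

if __name__ == '__main__':
    p = Process(target=spin_forever, daemon=True)
    p.start()
    time.sleep(1)  # the parent does its "useful" work here
    # No join() or kill() needed: the daemonic child is terminated when the parent exits.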
You are creating too many processes. These two lines:
foo_p = Process(target=run_script, args=(foo_path,)) # Define the processes..
bar_p = Process(target=run_script, args=(bar_path,)) # ..
create two new processes. Let's call them "A" and "B". Each process runs this function:
def run_script(path):                    # Simple function that..
    """Launches python scripts."""       # ..allows me to set a..
    subprocess.run(["python", path])     # ..script as a process.
which then creates another subprocess. Let's call those two processes "C" and "D". In all you have created 4 extra processes, instead of just the 2 that you need. It is actually process "C" that's producing the output on the terminal. This line:
bar_p.join()
waits for "B" to terminate, which implies that "D" has terminated. But this line:
foo_p.kill()
kills process "A" but orphans process "C". So the output to the terminal continues forever.
This is well documented; see the description of multiprocessing.Process.terminate(), which says:
"Note that descendant processes of the process will not be terminated – they will simply become orphaned."
The following program works as you intended, exiting gracefully from the second process after the first one has finished. (I renamed "foo.py" to useless.py and "bar.py" to useful.py, and made small changes so I could run it on my computer.)
import subprocess
import os

def run_script(name):
    s = os.path.join(r"c:\pyproj310\so", name)
    return subprocess.Popen(["py", s])

if __name__ == "__main__":
    useless_p = run_script("useless.py")
    useful_p = run_script("useful.py")
    useful_p.wait()    # Wait for useful script to finish executing
    useless_p.kill()   # Kill loading animation
You can't use subprocess.run() to launch the new processes since that function will block the main script until the process completes. So I used Popen instead. Also I placed the running code under an if __name__ == "__main__" which is good practice (and maybe necessary on Windows).
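As an aside, if you do keep the Process wrapper around subprocess, one way to avoid orphaning the grandchild is to kill the whole process tree. The sketch below relies on the third-party psutil package (an assumption on my part, not something the answer above uses):

import psutil

def kill_process_tree(pid):
    """Kill a process and all of its descendants."""
    parent = psutil.Process(pid)
    for child in parent.children(recursive=True):  # includes grandchildren
        child.kill()
    parent.kill()

# e.g. call kill_process_tree(foo_p.pid) instead of foo_p.kill()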

Why is my child process creating a new process, and what does it inherit from its parent process?

from multiprocessing import Process, cpu_count
import time

def count(element):
    count_value = 0
    while count_value < element:
        count_value += 1

x = Process(target=count, args=(1000000000,))
x.start()
print(cpu_count())
x.join()
print(cpu_count())
print(time.perf_counter())
When I execute the code above I get a RuntimeError:
RuntimeError:
An attempt has been made to start a new process before the
current process has finished its bootstrapping phase.
This probably means that you are not using fork to start your
child processes and you have forgotten to use the proper idiom
in the main module:
    if __name__ == '__main__':
        freeze_support()
        ...
Now my question is: what does my child process inherit from its parent such that this error occurs?
This happens because the part of your program that creates and starts the process is not under an if __name__ == '__main__': block. The function you want the new process to run lives in the main file, so the new interpreter imports the main file; that re-import executes all of the module-level code, including the attempt to start another process, which is what causes the error. All of this behaviour is documented here.
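For completeness, here is a corrected version of the snippet from the question (a minimal sketch based on its own code): everything that starts the process sits under the guard, so the re-import performed by the spawned child is harmless.

from multiprocessing import Process, cpu_count
import time

def count(element):
    count_value = 0
    while count_value < element:
        count_value += 1

if __name__ == '__main__':
    x = Process(target=count, args=(1000000000,))
    x.start()
    print(cpu_count())
    x.join()
    print(cpu_count())
    print(time.perf_counter())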

Running Python on multiple cores

I have created a (rather large) program that takes quite a long time to finish, and I started looking into ways to speed up the program.
I found that if I open Task Manager while the program is running, only one core is being used.
After some research, I found this website:
Why does multiprocessing use only a single core after I import numpy? which gives a solution of os.system("taskset -p 0xff %d" % os.getpid()),
however this doesn't work for me, and my program continues to run on a single core.
I then found this:
is python capable of running on multiple cores?,
which pointed towards using multiprocessing.
So after looking into multiprocessing, I came across the documentation on how to use it: https://docs.python.org/3/library/multiprocessing.html#examples
I tried the code:
from multiprocessing import Process
def f(name):
print('hello', name)
if __name__ == '__main__':
p = Process(target=f, args=('bob',))
p.start()
p.join()
a = input("Finished")
After running the code (not in IDLE), it said this:
Finished
hello bob
Finished
Note: after it said "Finished" the first time, I pressed Enter.
So after this I am now even more confused, and I have two questions:
First: it still doesn't run on multiple cores (I have an 8-core Intel i7).
Second: why does it print "Finished" before it has even run the code in the if statement (and it's not even finished yet!)?
To answer your second question first, "Finished" is printed to the terminal because a = input("Finished") sits outside of your if __name__ == '__main__': code block. It is module-level code, so it runs every time the module is executed; in particular, on Windows the spawned child process re-imports your script and hits that line before it runs your target function, which is why the prompt appears before "hello bob".
To answer the first question, you only created one process which you run and then wait to complete before continuing. This gives you zero benefits of multiprocessing and incurs overhead of creating the new process.
Because you want to create several processes, you need to create a pool via a collection of some sort (e.g. a python list) and then start all of the processes.
In practice, you need to be concerned with more than the number of processors (such as the amount of available memory, the ability to restart workers that crash, etc.). However, here is a simple example that completes your task above.
import datetime as dt
from multiprocessing import Process, current_process
import sys

def f(name):
    print('{}: hello {} from {}'.format(
        dt.datetime.now(), name, current_process().name))
    sys.stdout.flush()

if __name__ == '__main__':
    worker_count = 8
    worker_pool = []
    for _ in range(worker_count):
        p = Process(target=f, args=('bob',))
        p.start()
        worker_pool.append(p)
    for p in worker_pool:
        p.join()  # Wait for all of the workers to finish.

    # Allow time to view results before program terminates.
    a = input("Finished")  # raw_input(...) in Python 2.
Also note that if you join workers immediately after starting them, you are waiting for each worker to complete its task before starting the next worker. This is generally undesirable unless the ordering of the tasks must be sequential.
Typically Wrong
worker_1.start()
worker_1.join()
worker_2.start() # Must wait for worker_1 to complete before starting worker_2.
worker_2.join()
Usually Desired
worker_1.start()
worker_2.start() # Start all workers.
worker_1.join()
worker_2.join() # Wait for all workers to finish.
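If you would rather not manage the list of workers yourself, the standard library's multiprocessing.Pool handles that bookkeeping and spreads the inputs over a fixed set of worker processes (and hence over your cores). A minimal sketch of my own:

from multiprocessing import Pool, cpu_count

def square(n):
    return n * n

if __name__ == '__main__':
    # One worker per core; Pool.map splits the iterable across the workers.
    with Pool(processes=cpu_count()) as pool:
        results = pool.map(square, range(16))
    print(results)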
For more information, please refer to the following links:
https://docs.python.org/3/library/multiprocessing.html
Dead simple example of using Multiprocessing Queue, Pool and Locking
https://pymotw.com/2/multiprocessing/basics.html
https://pymotw.com/2/multiprocessing/communication.html
https://pymotw.com/2/multiprocessing/mapreduce.html

How can I restrict the scope of a multiprocessing process?

Using python's multiprocessing module, the following contrived example runs with minimal memory requirements:
import multiprocessing

# completely_unrelated_array = range(2**25)

def foo(x):
    for x in xrange(2**28): pass
    print x**2

P = multiprocessing.Pool()
for x in range(8):
    multiprocessing.Process(target=foo, args=(x,)).start()
Uncomment the creation of the completely_unrelated_array and you'll find that each spawned process allocates the memory for a copy of the completely_unrelated_array! This is a minimal example of a much larger project that I can't figure out how to workaround; multiprocessing seems to make a copy of everything that is global. I don't need a shared memory object, I simply need to pass in x, and process it without the memory overhead of the entire program.
Side observation: What's interesting is that print id(completely_unrelated_array) inside foo gives the same value, suggesting that somehow that might not be copies...
Because of the nature of os.fork(), any variables in the global namespace of your __main__ module will be inherited by the child processes (assuming you're on a Posix platform), so you'll see the memory usage in the children reflect that as soon as they're created. I'm not sure if all that memory is really being allocated, though; as far as I know, that memory is shared until you actually try to change it in the child, at which point a new copy is made.

Windows, on the other hand, doesn't use os.fork() - it re-imports the main module in each child, and pickles any local variables you want sent to the children. So, on Windows you can actually avoid the large global ending up copied in the child by only defining it inside an if __name__ == "__main__": guard, because everything inside that guard will only run in the parent process:
import time
import multiprocessing

def foo(x):
    for x in range(2**28): pass
    print(x**2)

if __name__ == "__main__":
    completely_unrelated_array = list(range(2**25))  # This will only be defined in the parent on Windows
    P = multiprocessing.Pool()
    for x in range(8):
        multiprocessing.Process(target=foo, args=(x,)).start()
Now, in Python 2.x, you can only create new multiprocessing.Process objects by forking if you're using a Posix platform. But on Python 3.4, you can specify how the new processes are created, by using contexts. So, we can specify the "spawn" context, which is the one Windows uses, to create our new processes, and use the same trick:
# Note that this is Python 3.4+ only
import time
import multiprocessing

def foo(x):
    for x in range(2**28): pass
    print(x**2)

if __name__ == "__main__":
    completely_unrelated_array = list(range(2**23))  # Again, this only exists in the parent
    ctx = multiprocessing.get_context("spawn")  # Use process spawning instead of fork
    P = ctx.Pool()
    for x in range(8):
        ctx.Process(target=foo, args=(x,)).start()
If you need 2.x support, or want to stick with using os.fork() to create new Process objects, I think the best you can do to get the reported memory usage down is immediately delete the offending object in the child:
import time
import multiprocessing
import gc

def foo(x):
    init()
    for x in range(2**28): pass
    print(x**2)

def init():
    global completely_unrelated_array
    completely_unrelated_array = None
    del completely_unrelated_array
    gc.collect()

if __name__ == "__main__":
    completely_unrelated_array = list(range(2**23))
    P = multiprocessing.Pool(initializer=init)
    for x in range(8):
        multiprocessing.Process(target=foo, args=(x,)).start()
    time.sleep(100)
What is important here is which platform you are targeting.
On Unix systems, processes are created using copy-on-write (COW) memory. So even though each process gets a copy of the full memory of the parent process, that memory is only actually allocated on a per-page basis (4 KiB) when it is modified.
So if you are only targeting these platforms, you don't have to change anything.
If you are targeting platforms without copy-on-write fork, you may want to use Python 3.4 and its new start-method contexts spawn and forkserver; see the documentation.
These methods create new processes which share nothing or only limited state with the parent, and all memory passing is explicit.
But note that the spawned process will import your module, so all global data will be copied explicitly and no copy-on-write is possible. To prevent this you have to reduce the scope of the data.
import multiprocessing as mp
import numpy as np

def foo(x):
    import time
    time.sleep(60)

if __name__ == "__main__":
    mp.set_start_method('spawn')
    # Not global, so the spawned children will not have this allocated.
    # If the start method were fork, the children would still have this memory
    # mapped, but it could be copy-on-write.
    completely_unrelated_array = np.ones((5000, 10000))
    P = mp.Pool()
    for x in range(3):
        mp.Process(target=foo, args=(x,)).start()
E.g., top output with spawn:
%MEM TIME+ COMMAND
29.2 0:00.52 python3
0.5 0:00.00 python3
0.5 0:00.00 python3
0.5 0:00.00 python3
and with fork:
%MEM TIME+ COMMAND
29.2 0:00.52 python3
29.1 0:00.00 python3
29.1 0:00.00 python3
29.1 0:00.00 python3
Note how the total is more than 100%, due to copy-on-write.
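To make the "all memory passing is explicit" point concrete, here is a minimal sketch of my own (not from the answer above): with spawn, only the data you pass as arguments is pickled and sent to the workers; big_list itself, being defined under the guard, is never recreated in them.

import multiprocessing as mp

def worker(chunk):
    # Only `chunk` was pickled and sent to this process.
    return sum(chunk)

if __name__ == "__main__":
    mp.set_start_method("spawn")
    big_list = list(range(1000000))  # lives only in the parent
    chunks = [big_list[:1000], big_list[1000:2000]]
    with mp.Pool(processes=2) as pool:
        print(pool.map(worker, chunks))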

Python's semaphore hangs forever

I'm trying to do things concurrently in my program and to throttle the number of processes open at the same time (10).
from multiprocessing import Process
from threading import BoundedSemaphore

semaphore = BoundedSemaphore(10)

for x in xrange(100000):
    semaphore.acquire(blocking=True)
    print 'new'
    p = Process(target=f, args=(x,))
    p.start()

def f(x):
    ...  # do some work
    semaphore.release()
    print 'done'
The first 10 processes are launched and they end correctly (I see 10 "new" and "done" on the console), and then nothing. I don't see another "new"; the program just hangs there (and Ctrl-C doesn't work either). What's wrong?
Your problem is the use of threading.BoundedSemaphore across process boundaries:
import threading
import multiprocessing
import time

semaphore = threading.BoundedSemaphore(10)

def f(x):
    semaphore.release()
    print('done')

semaphore.acquire(blocking=True)
print('new')
print(semaphore._value)

p = multiprocessing.Process(target=f, args=(100,))
p.start()
time.sleep(3)
print(semaphore._value)
When you create a new process, the child gets a copy of the parent process's memory. Thus the child is releasing its own copy of the semaphore, and the semaphore in the parent is untouched. (Typically, processes are isolated from each other: it takes some extra work to communicate across processes; this is what multiprocessing is for.)
This is in contrast to threads, where the two threads share the memory space and belong to the same process.
multiprocessing.BoundedSemaphore is probably what you want. (If you replace threading.BoundedSemaphore with it, and replace semaphore._value with semaphore.get_value(), you'll see the above's output change.)
Your bounded semaphore is not shared properly between the various processes which are being spawned; you might want to switch to using multiprocessing.BoundedSemaphore. See the answers to this question for some more details.
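A minimal sketch of that fix (my reconstruction, not code from either answer): a multiprocessing.BoundedSemaphore passed to each child is shared across the process boundary, so a release in a worker is seen by the parent and at most 10 workers run at any one time.

from multiprocessing import BoundedSemaphore, Process

def f(x, semaphore):
    try:
        pass  # do some work here
    finally:
        semaphore.release()

if __name__ == '__main__':
    semaphore = BoundedSemaphore(10)
    workers = []
    for x in range(100):
        semaphore.acquire()  # blocks once 10 workers are in flight
        p = Process(target=f, args=(x, semaphore))
        p.start()
        workers.append(p)
    for p in workers:
        p.join()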
