When spawn processes, Does Lock have different id? - python

I'm trying to figure out how Lock works under the hood. I run this code on MacOS which using "spawn" as default method to start new process.
from multiprocessing import Process, Lock, set_start_method
from time import sleep
def f(lock, i):
lock.acquire()
print(id(lock))
try:
print('hello world', i)
sleep(3)
finally:
lock.release()
if __name__ == '__main__':
# set_start_method("fork")
lock = Lock()
for num in range(3):
p = Process(target=f, args=(lock, num))
p.start()
p.join()
Output:
140580736370432
hello world 0
140251759281920
hello world 1
140398066042624
hello world 2
The Lock works in my code. However, the ids of lock make me confused. Since idare different, are they still same one lock or there are multiple locks and they somehow communicate secretly? Is id() still hold the position in multiprocessing, I quote "CPython implementation detail: id is the address of the object in memory."?
If I use "fork" method, set_start_method("fork"), it prints out identical id which totally make sense for me.

id is implemented as but not required to be the memory location of the given object. when using fork, the separate process does not get it's own memory space until it modifies something (copy on write), so the memory location does not change because it "is" the same object. When using spawn, an entire new process is created and the __main__ file is imported as a library into the local namesapce, so all your same functions, classes, and module level variables are accessable (sans any modifications from anything that results from if __name__ == "__main__":). Then python creates a connection between the processes (pipe) in which it can send which function to call, and the arguments to call it with. everything passing through this pipe must be pickle'd then unpickle'd. Locks specifically are re-created when un-pickling by asking the operating system for a lock with a specific name (which was created in the parent process when the lock was created, then this name is sent across using pickle). This is how the two locks are synchronized, because it is backed by an object the operating system controls. Python then stores this lock along with some other data (the PyObject as it were) in the memory of the new process. calling id now will get the location of this struct which is different because it was created by a different process in a different chunk of memory.
here's a quick example to convince you that a "spawn'ed" lock is still synchronized:
from multiprocessing import Process, Lock, set_start_method
def foo(lock):
with lock:
print(f'child process lock id: {id(lock)}')
if __name__ == "__main__":
set_start_method("spawn")
lock = Lock()
print(f'parent process lock id: {id(lock)}')
lock.acquire() #lock the lock so child has to wait
p = Process(target=foo, args=(lock,))
p.start()
input('press enter to unlock the lock')
lock.release()
p.join()
The different "id's" are the different PyObject locations, but have little to do with the underlying mutex. I am not aware that there's a direct way to inspect the underlying lock which the operating system manages.

Related

why does multiprocessing create a clone for base variables for each thread

So I'm using multiprocessing pool with 3 threads, to run a function that does a certain job, and I have a variable defined outside this function which equals 0, and every time the function do it job it should add 1 to that variable and print it, but every thread uses a separated variable
here is the code:
from multiprocessing import Pool
number_of_doe_jobs = 0
def thefunction():
global number_of_doe_jobs
# JOB CODE GOES HERE
number_of_doe_jobs+=1
if __name__ =="__main__":
p = Pool(3)
p.map(checker, datalist)
the desired output is that it adds 1 to number_of_doe_jobs ,
but every thread add 1 to it own number_of_doe_jobs , so there are 3 number_of_doe_jobs variables now.
You are not spawning 3 threads. You are spawning 3 processes. Each process has its own memory space, with its own copy of the interpreter and its own independent object space. Global variables are not shared across processes. There are ways to create shared variables (which communicate over sockets), but you might be better served by using a multiprocessing.Queue. Create it in the mainline code, and pass it as a parameter to the subprocesses. Have the jobs push a "complete" flag on the queue, and have the mainline code read the results.
FOLLOWUP
The NUMBER of jobs will always be equal to len(datalist), so it's not clear why you would track that. Here, I create a multiprocessing queue and pass that to the function. Python implements that by creating a socket. The checker function sends a signal when it finishes, and the mainline code fetches each one and prints it. q.get will block until something is in the queue.
import multiprocessing
def checker(q):
# JOB CODE GOES HERE
q.put( "done" )
if __name__ =="__main__":
q = multiprocessing.Queue()
p = Pool(3)
p.map(lambda: checker(q), datalist)
for _ in datalist:
print( q.get() )

multiprocessing global variable memory copying

I am running a program which loads 20 GB data to the memory at first. Then I will do N (> 1000) independent tasks where each of them may use (read only) part of the 20 GB data. I am now trying to do those tasks via multiprocessing. However, as this answer says, the entire global variables are copied for each process. In my case, I do not have enough memory to perform more than 4 tasks as my memory is only 96 GB. I wonder if there is any solution to this kind of problem so that I can fully use all my cores without consuming too much memory.
In linux, forked processes have a copy-on-write view of the parent address space. forking is light-weight and the same program runs in both the parent and the child, except that the child takes a different execution path. As a small exmample,
import os
var = "unchanged"
pid = os.fork()
if pid:
print('parent:', os.getpid(), var)
os.waitpid(pid, 0)
else:
print('child:', os.getpid(), var)
var = "changed"
# show parent and child views
print(os.getpid(), var)
Results in
parent: 22642 unchanged
child: 22643 unchanged
22643 changed
22642 unchanged
Applying this to multiprocessing, in this example I load data into a global variable. Since python pickles the data sent to the process pool, I make sure it pickles something small like an index and have the worker get the global data itself.
import multiprocessing as mp
import os
my_big_data = "well, bigger than this"
def worker(index):
"""get char in big data"""
return my_big_data[index]
if __name__ == "__main__":
pool = mp.Pool(os.cpu_count())
for c in pool.imap_unordered(worker, range(len(my_big_data)), chunksize=1):
print(c)
Windows does not have a fork-and-exec model for running programs. It has to start a new instance of the python interpreter and clone all relevant data to the child. This is a heavy lift!

Multiprocessing with pool in python: About several instances with same name at the same time

I'm kind of new to multiprocessing. However, assume that we have a program as below. The program seems to work fine. Now to the question. In my opinion we will have 4 instances of SomeKindOfClass with the same name (a) at the same time. How is that possible? Moreover, is there a potential risk with this kind of programming?
from multiprocessing.dummy import Pool
import numpy
from theFile import someKindOfClass
n = 8
allOutputs = numpy.zeros(n)
def work(index):
a = SomeKindOfClass()
a.theSlowFunction()
allOutputs[index] = a.output
pool = Pool(processes=4)
pool.map(work,range(0,n))
The name a is only local in scope within your work function, so there is no conflict of names here. Internally python will keep track of each class instance with a unique identifier. If you wanted to check this you could check the object id using the id function:
print(id(a))
I don't see any issues with your code.
Actually, you will have 8 instances of SomeKindOfClass (one for each worker), but only 4 will ever be active at the same time.
multiprocessing vs multiprocessing.dummy
Your program will only work if you continue to use the multiprocessing.dummy module, which is just a wrapper around the threading module. You are still using "python threads" (not separate processes). "Python threads" share the same global state; "Processes" don't. Python threads also share the same GIL, so they're still limited to running one python bytecode statement at a time, unlike processes, which can all run python code simultaneously.
If you were to change your import to from multiprocessing import Pool, you would notice that the allOutputs array remains unchanged after all the workers finish executing (also, you would likely get an error because you're creating the pool in the global scope, you should probably put that inside a main() function). This is because multiprocessing makes a new copy of the entire global state when it makes a new process. When the worker modifies the global allOutputs, it will be modifying a copy of that initial global state. When the process ends, nothing will be returned to the main process and the global state of the main process will remain unchanged.
Sharing State Between Processes
Unlike threads, processes aren't sharing the same memory
If you want to share state between processes, you have to explicitly declare shared variables and pass them to each process, or use pipes or some other method to allow the worker processes to communicate with each other or with the main process.
There are several ways to do this, but perhaps the simplest is using the Manager class
import multiprocessing
def worker(args):
index, array = args
a = SomeKindOfClass()
a.some_expensive_function()
array[index] = a.output
def main():
n = 8
manager = multiprocessing.Manager()
array = manager.list([0] * n)
pool = multiprocessing.Pool(4)
pool.map(worker, [(i, array) for i in range(n)])
print array
You can declare class instances inside the pool workers, because each instance has a separate place in memory so they don't conflict. The problem is if you declare a class instance first, then try to pass that one instance into multiple pool workers. Then each worker has a pointer to the same place in memory, and it will fail (this can be handled, just not this way).
Basically pool workers must not have overlapping memory anywhere. As long as the workers don't try to share memory somewhere, or perform operations that may result in collisions (like printing to the same file), there shouldn't be any problem.
Make sure whatever they're supposed to do (like something you want printed to a file, or added to a broader namespace somewhere) is returned as a result at the end, which you then iterate through.
If you are using multiprocessing you shouldn't worry - process doesn't share memory (by-default). So, there is no any risk to have several independent objects of class SomeKindOfClass - each of them will live in own process. How it works? Python runs your program and after that it runs 4 child processes. That's why it's very important to have if __init__ == '__main__' construction before pool.map(work,range(0,n)). Otherwise you will receive a infinity loop of process creation.
Problems could be if SomeKindOfClass keeps state on disk - for example, write something to file or read it.

Identify current process is child process in Python

In Python 2.7, is there's a way to identify if the current forked/spawned process is a child process instance (as opposed to being starting as a regular process). My goal is to set a global variable differently if it's a child process (e.g. create a pool with size 0 for child else pool with some number greater than 0).
I can't pass a parameter into the function (being called to execute in the child process), as even before the function is invoked the process would have been initialized and hence the global variable (especially for spawned process).
Also I am not in a position to use freeze_support (unless of course I am miss understood how to use it) as my application is running in a web service container (flask). Hence there's no main method.
Any help will be much appreciated.
Sample code that goes into infinite loop if you run it on windows:
from multiprocessing import Pool, freeze_support
p = Pool(5) # This should be created only in parent process and not the child process
def f(x):
return x*x
if __name__ == '__main__':
freeze_support()
print(p.map(f, [1, 2, 3]))
I would suggest restructuring your program to something more like my example code below. You mentioned that you don't have a main function, but you can create a wrapper that handles your pool:
from multiprocessing import Pool, freeze_support
def f(x):
return x*x
def handle_request():
p = Pool(5) # pool will only be in the parent process
print(p.map(f, [1, 2, 3]))
p.close() # remember to clean up the resources you use
p.join()
return
if __name__ == '__main__':
freeze_support() # do you really need this?
# start your web service here and make it use `handle_request` as the callback
# when a request needs to be serviced
It sounds like you are having a bit of an XY problem. You shouldn't be making a pool of processes global. It's just bad. You're giving your subprocesses access to their own process objects, which allows you to accidentally do bad things, like make a child process join itself. If you create your pool within a wrapper that is called for each request, then you don't need to worry about a global variable.
In the comments, you mentioned that you want a persistent pool. There is indeed some overhead to creating a pool on each request, but it's far safer than having a global pool. Also, you now have the capability to handle multiple requests simultaneously, assuming your web service handles each request in their own thread/process, without multiple requests trampling on each other by trying to use the same pool. I would strongly suggest you try to use this approach, and if it doesn't meet your performance specifications, you can look at optimizing it in other ways (ie, no global pool) to meet your spec.
One other note: multiprocessing.freeze_support() only needs to be called if you intend to bundle your scripts into a Windows executable. Don't use it if you are not doing that.
Move the pool creation into the main section to only create a multiprocessing pool once, any only in the main process:
from multiprocessing import Pool
def f(x):
return x*x
if __name__ == '__main__':
p = Pool(5)
print(p.map(f, [1, 2, 3]))
This works because the only process that is executing in the __main__ name is the original process. Spawned processes run with the __mp_main__ module name.
create a pool with size 0 for child
The child processes should never start a new multiprocessing pool. Only handle your processes from a single entry point.

Python's semaphore hangs for ever

Im trying to do things concurrently in my program and to throttle the number of processes opened at the same time (10).
from multiprocessing import Process
from threading import BoundedSemaphore
semaphore = BoundedSemaphore(10)
for x in xrange(100000):
semaphore.acquire(blocking=True)
print 'new'
p = Process(target=f, args=(x,))
p.start()
def f(x):
... # do some work
semaphore.release()
print 'done'
The first 10 processes are launched and they end correctly (I see 10 "new" and "done" on the console), and then nothing. I don't see another "new", the program just hangs there (and Ctrl-C doesn't work either). What's wrong ?
Your problem is the use of threading.BoundedSemaphore across process boundaries:
import threading
import multiprocessing
import time
semaphore = threading.BoundedSemaphore(10)
def f(x):
semaphore.release()
print('done')
semaphore.acquire(blocking=True)
print('new')
print(semaphore._value)
p = multiprocessing.Process(target=f, args=(100,))
p.start()
time.sleep(3)
print(semaphore._value)
When you create a new process, the child gets a copy of the parent process's memory. Thus the child is decrementing it's semaphore, and the semaphore in the parent is untouched. (Typically, processes are isolated from each other: it takes some extra work to communicate across processes; this is what multiprocessing is for.)
This is opposed to threads, where the two threads share the memory space, and are considered the same process.
multiprocessing.BoundedSemaphore is probably what you want. (If you replace threading.BoundedSemaphore with it, and replace semaphore._value with semaphore.get_value()`, you'll see the above's output change.)
Your bounded semaphore is not shared properly between the various processes which are being spawned; you might want to switch to using multiprocessing.BoundedSemaphore. See the answers to this question for some more details.

Categories

Resources