Python multiprocessing.Pool does not start right away

I want to input text to Python and process it in parallel. For that purpose I use multiprocessing.Pool. The problem is that sometimes, not always, I have to input text multiple times before anything is processed.
This is a minimal version of my code to reproduce the problem:
import multiprocessing as mp
import time

def do_something(text):
    print('Out: ' + text, flush=True)
    # do some awesome stuff here

if __name__ == '__main__':
    p = None
    while True:
        message = input('In: ')
        if not p:
            p = mp.Pool()
        p.apply_async(do_something, (message,))
What happens is that I have to input text multiple times before I get a result, no matter how long I wait after I have inputted something the first time. (As stated above, that does not happen every time.)
python3 test.py
In: a
In: a
In: a
In: Out: a
Out: a
Out: a
If I create the pool before the while loop or if I add time.sleep(1) after creating the pool, it seems to work every time. Note: I do not want to create the pool before I get an input.
Does anyone have an explanation for this behavior?
I'm running Windows 10 with Python 3.4.2
EDIT: Same behavior with Python 3.5.1
EDIT:
An even simpler example with Pool and also ProcessPoolExecutor. I think the problem is the call to input() right after applying/submitting, which only seems to be a problem the first time something is applied/submitted.
import concurrent.futures
import multiprocessing as mp
import time

def do_something(text):
    print('Out: ' + text, flush=True)
    # do some awesome stuff here

# ProcessPoolExecutor
# if __name__ == '__main__':
#     with concurrent.futures.ProcessPoolExecutor() as executor:
#         executor.submit(do_something, 'a')
#         input('In:')
#     print('done')

# Pool
if __name__ == '__main__':
    p = mp.Pool()
    p.apply_async(do_something, ('a',))
    input('In:')
    p.close()
    p.join()
    print('done')

Your code works when I tried it on my Mac.
In Python 3, it might help to explicitly declare how many processes will be in your pool (i.e. the number of simultaneous worker processes).
Try using p = mp.Pool(1):
import multiprocessing as mp
import time

def do_something(text):
    print('Out: ' + text, flush=True)
    # do some awesome stuff here

if __name__ == '__main__':
    p = None
    while True:
        message = input('In: ')
        if not p:
            p = mp.Pool(1)
        p.apply_async(do_something, (message,))

I could not reproduce it on Windows 7, but there are a few long shots worth mentioning:
Your antivirus might be interfering with the newly spawned processes. Try temporarily disabling it and see if the issue is still present.
Windows 10 might use a different I/O caching algorithm. Try inputting larger strings. If that works, the OS is probably trying to be smart and only sends the data once a certain amount has piled up.
As Windows has no fork() primitive, you might be seeing the delay caused by the spawn start method.
Python 3 added a new pool of workers called ProcessPoolExecutor (in concurrent.futures). I'd recommend using it regardless of the issue you are seeing.
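For reference, a minimal sketch of the ProcessPoolExecutor approach recommended above. It assumes the executor should be created lazily on the first input, as the question requires. The helper process_inputs and the choice to return the text instead of printing it from the worker are illustrative, not part of the original code. Whether this avoids the delay on Windows 10 has not been verified here.

```python
import concurrent.futures


def do_something(text):
    # Return the result instead of printing from the worker, so the
    # parent process decides when the output appears.
    return 'Out: ' + text


def process_inputs(messages):
    """Submit each message to a lazily created executor and collect results."""
    executor = None
    results = []
    try:
        for message in messages:
            if executor is None:
                # Created only once the first input has arrived.
                executor = concurrent.futures.ProcessPoolExecutor(max_workers=1)
            future = executor.submit(do_something, message)
            # result() blocks until the worker has finished this task.
            results.append(future.result())
    finally:
        if executor is not None:
            executor.shutdown()
    return results


if __name__ == '__main__':
    for line in process_inputs(['a', 'b', 'c']):
        print(line)
```

Blocking on result() inside the loop serializes the work. If the tasks should really run in parallel, collect the futures in a list and call result() on them later.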

Related

Queue timeout when I execute unrelated code

I'm trying to use "exec" to check some external code snippets for correctness and I wanted to trap infinite loops by spawning a process, waiting for a short period of time, then checking the local variables. I managed to shrink the code to this example:
import multiprocessing

def fHelper(queue, codeIn, globalsParamIn, localsParamIn):
    exec(codeIn, globalsParamIn, localsParamIn) # Execute code string with limited builtins
    queue.put(localsParamIn['spam'])

def f(codeIn):
    globalsParam = {"float" : float, "int" : int, "len" : len}
    spam = False
    localsParam = {'spam': spam}
    if __name__ == '__main__':
        queue = multiprocessing.Queue()
        p = multiprocessing.Process(target=fHelper, args=(queue, codeIn, globalsParam, localsParam))
        p.start()
        p.join(3) # Wait for 3 seconds or until process finishes
        if p.is_alive(): # Just in case p hangs
            p.terminate()
            p.join()
        return queue.get(timeout=3)

fOut = f("spam=True")
print(fOut)
# assert fOut
Now the code as-is executes fine, but if you uncomment the last line (or use almost anything else - print(fOut.copy()) will do it) the queue times out. I'm using Python 3.8.2 on Windows.
I would welcome any suggestions on how to fix the bug, or better yet understand what on earth is going on.
Thanks!

Multiprocessing spawns idle processes and doesn't compute anything

There seems to be a litany of questions and answers on Stack Overflow about the multiprocessing library. I have looked through all the relevant ones I can find and have not found one that directly speaks to my problem.
I am trying to apply the same function to multiple files in parallel. Whenever I start the processing, though, the computer just spins up several instances of Python and then does nothing. No computations happen at all and the processes just sit idle.
I have looked at all of the similar questions on Stack Overflow, and none seem to have my problem of idle processes.
What am I doing wrong?
define the function (abbreviated for example. checked to make sure it works)
import pandas as pd
import numpy as np
import glob
import os
#from timeit import default_timer as timer
import talib
from multiprocessing import Process

def example_function(file):
    df=pd.read_csv(file, header = 1)
    stock_name = os.path.basename(file)[:-4]
    macd, macdsignal, macdhist = talib.MACD(df.Close, fastperiod=12, slowperiod=26, signalperiod=9)
    df['macd'] = macdhist*1000
    print(f'stock{stock_name} processed')
    final_macd_report.append(df)
getting a list of all the files in the directory i want to run the function on
import glob
path = r'C:\Users\josiahh\Desktop\big_test3/*'
files = [f for f in glob.glob(path, recursive=True)]
attempting multiprocessing
import multiprocessing as mp

if __name__ == '__main__':
    p = mp.Pool(processes = 5)
    async_result = p.map_async(example_function, files)
    p.close()
    p.join()
    print("Complete")
any help would be greatly appreciated.

There's nothing wrong with the structure of the code, so something is going wrong that can't be guessed from what you posted. Start with something very much simpler, then move it in stages to what you're actually trying to do. You're importing mountains of extension (3rd party) code, and the problem could be anywhere. Here's a start:
def example_function(arg):
    from time import sleep
    msg = "crunching " + str(arg)
    print(msg)
    sleep(arg)
    print("done " + msg)

if __name__ == '__main__':
    import multiprocessing as mp
    p = mp.Pool(processes = 5)
    async_result = p.map_async(example_function, reversed(range(15)))
    print("result", async_result.get())
    p.close()
    p.join()
    print("Complete")
That works fine on Win10 under 64-bit Python 3.7.4 for me. Does it for you?
Note especially the async_result.get() at the end. That displays a list with 15 None values. You never do anything with your async_result. Because of that, if any exception was raised in a worker process, it will most likely silently vanish. In such cases .get()'ing the result will (re)raise the exception in your main program.
Also please verify that your files list isn't in fact empty. We can't guess at that from here either ;-)
EDIT
I moved the async_result.get() into its own line, right after the map_async(), to maximize the chance of revealing otherwise silent exceptions in the worker processes. At least add that much to your code too.
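To illustrate the point about silent exceptions, here is a small sketch in which one task fails. The worker risky and its failure condition are made up for the demonstration. Without the call to get(), the ValueError raised in the worker would never surface in the parent.

```python
import multiprocessing as mp


def risky(n):
    # Hypothetical worker: fails for one specific input.
    if n == 3:
        raise ValueError('cannot process %d' % n)
    return n * n


if __name__ == '__main__':
    with mp.Pool(processes=2) as pool:
        async_result = pool.map_async(risky, range(5))
        try:
            # get() re-raises the worker's exception in the parent.
            print(async_result.get(timeout=30))
        except ValueError as exc:
            print('worker failed:', exc)
```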

While I don't see anything wrong per se, I would like to suggest some changes.
In general, worker functions in a Pool are expected to return something. This return value is transferred back to the parent process. I like to use that as a status report. It is also a good idea to catch exceptions in the worker process, just in case.
For example:
def example_function(file):
    status = 'OK'
    try:
        df=pd.read_csv(file, header = 1)
        stock_name = os.path.basename(file)[:-4]
        macd, macdsignal, macdhist = talib.MACD(df.Close, fastperiod=12, slowperiod=26, signalperiod=9)
        df['macd'] = macdhist*1000
        final_macd_report.append(df)
    except:
        status = 'exception caught!'
    return {'filename': file, 'result': status}
(This is just a quick example. You might want to e.g. report the full exception traceback to help with debugging.)
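One possible way to report the full traceback, sketched with a stand-in worker. parse_number is hypothetical and replaces the pandas/talib code, and the dictionary keys are arbitrary. traceback.format_exc() returns a plain string, so it can be sent back to the parent like any other return value.

```python
import multiprocessing as mp
import traceback


def parse_number(text):
    """Hypothetical worker that reports success or the full traceback."""
    try:
        value = float(text)
        return {'input': text, 'status': 'OK', 'value': value}
    except Exception:
        # format_exc() captures the traceback as a string, which can be
        # pickled and sent back to the parent process.
        return {'input': text, 'status': 'error',
                'traceback': traceback.format_exc()}


if __name__ == '__main__':
    with mp.Pool(processes=2) as pool:
        for res in pool.imap_unordered(parse_number, ['1.5', 'oops', '3']):
            print(res['input'], res['status'])
            if res['status'] == 'error':
                print(res['traceback'])
```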
If workers run for a long time, I like to get feedback ASAP.
So I prefer to use imap_unordered, especially if some tasks can take much longer than others. This returns an iterator that yields results in the order that jobs finish.
if __name__ == '__main__':
    with mp.Pool() as p:
        for res in p.imap_unordered(example_function, files):
            print(res)
This way you get unambiguous proof that a worker finished, and what the result was and if any problems occurred.
This is preferable over just calling print from the workers. With stdout buffering and multiple workers inheriting the same output stream there is no saying when you actually see something.
Edit: As you can see here, multiprocessing.Pool does not work well with interactive interpreters, especially on ms-windows. Basically, ms-windows lacks the fork system call that lets UNIX-like systems duplicate a process. So on ms-windows, multiprocessing has to try to mimic fork, which means importing the original program file in the child processes. That doesn't work well with interactive interpreters like IPython. One would probably have to dig deep into the internals of Jupyter and multiprocessing to find out the exact cause of the problem.
It seems that a workaround for this problem is to define the worker function in a separate module and import that in your code in IPython.
It is actually mentioned in the documentation that multiprocessing.Pool doesn't work well with interactive interpreters. See the note at the end of this section.

Multiprocessing with Python and Windows

I have code that works with Thread in Python, but I want to switch to Process because, if I have understood correctly, that will give me a speed-up.
Here is the code with Thread:
threads.append(Thread(target=getId, args=(my_queue, read)))
threads.append(Thread(target=getLatitude, args=(my_queue, read)))
The code works by putting the return values in the Queue; after a join on the threads list, I can retrieve the results.
After changing the code and the import statement, my code now looks like this:
threads.append(Process(target=getId, args=(my_queue, read)))
threads.append(Process(target=getLatitude, args=(my_queue, read)))
However, it does not execute anything and the Queue is empty. With Thread the Queue is not empty, so I think the problem is related to Process.
I have read answers claiming that the Process class does not work on Windows. Is that true, or is there a way to make it work (adding freeze_support() does not help)?
If it cannot work, is multithreading on Windows actually executed in parallel on different cores?
ref:
Python multiprocessing example not working
Python code with multiprocessing does not work on Windows
Multiprocessing process does not join when putting complex dictionary in return queue
(in which is described that fork does not exist on Windows)
EDIT:
To add some details:
the code with Process actually works on CentOS.
EDIT2:
Added a simplified version of my code with processes, tested on CentOS:
import pandas as pd
from multiprocessing import Process, freeze_support
from multiprocessing import Queue

#%% Global variables
datasets = []
latitude = []

def fun(key, job):
    global latitude
    if(key == 'LAT'):
        latitude.append(job)

def getLatitude(out_queue, skip = None):
    latDict = {'LAT' : latitude}
    out_queue.put(latDict)

n = pd.read_csv("my.csv", sep =',', header = None).shape[0]
print("Number of baboon:" + str(n))
read = []
for i in range(0,n):
    threads = []
    my_queue = Queue()
    threads.append(Process(target=getLatitude, args=(my_queue, read)))
    for t in threads:
        freeze_support() # try both with and without this line
        t.start()
    for t in threads:
        t.join()
    while not my_queue.empty():
        try:
            job = my_queue.get()
            key = list(job.keys())
            fun(key[0],job[key[0]])
        except:
            print("END")
    read.append(i)

Per the documentation, you need the following after the function definitions. When Python creates the subprocesses on Windows, each one imports your script, so any code at the global level is run again in every child. Put the code you only want to run in the main process under this guard:
if __name__ == '__main__':
    n = pd.read_csv("my.csv", sep =',', header = None).shape[0]
    # etc.
Indent the rest of the code under this if.
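A reduced sketch of that layout, assuming a single child process and leaving out pandas. The names get_latitude and collect are illustrative. The functions are defined at module level so a spawned child can import them, and everything that starts processes sits under the guard.

```python
from multiprocessing import Process, Queue


def get_latitude(out_queue, values):
    # Runs in the child process; sends its result back through the queue.
    out_queue.put({'LAT': list(values)})


def collect(values):
    """Start one child process and return what it put on the queue."""
    queue = Queue()
    proc = Process(target=get_latitude, args=(queue, values))
    proc.start()
    # Read before join() so the child can flush the queue and exit.
    result = queue.get(timeout=30)
    proc.join()
    return result


if __name__ == '__main__':
    # Only the parent executes this block; a spawned child that
    # re-imports the module skips it.
    print(collect([1.0, 2.0]))
```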

Running Python on multiple cores

I have created a (rather large) program that takes quite a long time to finish, and I started looking into ways to speed up the program.
I found that if I open task manager while the program is running only one core is being used.
After some research, I found this website:
Why does multiprocessing use only a single core after I import numpy? which gives a solution of os.system("taskset -p 0xff %d" % os.getpid()),
however this doesn't work for me, and my program continues to run on a single core.
I then found this:
is python capable of running on multiple cores?,
which pointed towards using multiprocessing.
So after looking into multiprocessing, I came across this documentation on how to use it: https://docs.python.org/3/library/multiprocessing.html#examples
I tried the code:
from multiprocessing import Process

def f(name):
    print('hello', name)

if __name__ == '__main__':
    p = Process(target=f, args=('bob',))
    p.start()
    p.join()
a = input("Finished")
After running the code (not in IDLE) It said this:
Finished
hello bob
Finished
Note: after it said Finished the first time I pressed enter
So after this I am now even more confused, and I have two questions.
First: It still doesn't run on multiple cores (I have an 8-core Intel i7).
Second: Why does it print "Finished" before it has even run the code in the if statement (and it hasn't even finished yet!)?

To answer your second question first: "Finished" is printed to the terminal because a = input("Finished") is outside of your if __name__ == '__main__': code block. It is therefore a module-level statement that runs every time the module is loaded. On Windows, the child process starts a fresh interpreter that imports your script, so the prompt also runs in the child before f is called, and then once more in the parent after p.join().
To answer the first question, you only created one process which you run and then wait to complete before continuing. This gives you zero benefits of multiprocessing and incurs overhead of creating the new process.
Because you want to create several processes, you need to create a pool via a collection of some sort (e.g. a python list) and then start all of the processes.
In practice, you need to be concerned with more than the number of processors (such as the amount of available memory, the ability to restart workers that crash, etc.). However, here is a simple example that completes your task above.
import datetime as dt
from multiprocessing import Process, current_process
import sys

def f(name):
    print('{}: hello {} from {}'.format(
        dt.datetime.now(), name, current_process().name))
    sys.stdout.flush()

if __name__ == '__main__':
    worker_count = 8
    worker_pool = []
    for _ in range(worker_count):
        p = Process(target=f, args=('bob',))
        p.start()
        worker_pool.append(p)
    for p in worker_pool:
        p.join() # Wait for all of the workers to finish.
    # Allow time to view results before program terminates.
    a = input("Finished") # raw_input(...) in Python 2.
Also note that if you join workers immediately after starting them, you are waiting for each worker to complete its task before starting the next worker. This is generally undesirable unless the ordering of the tasks must be sequential.
Typically Wrong
worker_1.start()
worker_1.join()
worker_2.start() # Must wait for worker_1 to complete before starting worker_2.
worker_2.join()
Usually Desired
worker_1.start()
worker_2.start() # Start all workers.
worker_1.join()
worker_2.join() # Wait for all workers to finish.
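The difference between the two patterns can be measured with a rough timing sketch. The worker work just sleeps as a stand-in for a real task, and the helper names and durations are arbitrary. With four workers of 0.5 seconds each, the sequential variant should take at least 2 seconds, while the parallel one should take roughly 0.5 seconds plus process start-up overhead.

```python
import time
from multiprocessing import Process


def work(seconds):
    time.sleep(seconds)  # stand-in for a real task


def run_sequential(count, seconds):
    """Start and join each worker in turn; returns elapsed seconds."""
    start = time.perf_counter()
    for _ in range(count):
        p = Process(target=work, args=(seconds,))
        p.start()
        p.join()  # blocks before the next worker is even created
    return time.perf_counter() - start


def run_parallel(count, seconds):
    """Start every worker, then join them all; returns elapsed seconds."""
    start = time.perf_counter()
    workers = [Process(target=work, args=(seconds,)) for _ in range(count)]
    for p in workers:
        p.start()
    for p in workers:
        p.join()
    return time.perf_counter() - start


if __name__ == '__main__':
    print('sequential: %.2fs' % run_sequential(4, 0.5))
    print('parallel:   %.2fs' % run_parallel(4, 0.5))
```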
For more information, please refer to the following links:
https://docs.python.org/3/library/multiprocessing.html
Dead simple example of using Multiprocessing Queue, Pool and Locking
https://pymotw.com/2/multiprocessing/basics.html
https://pymotw.com/2/multiprocessing/communication.html
https://pymotw.com/2/multiprocessing/mapreduce.html

Python multiprocessing - Is it possible to introduce a fixed time delay between individual processes?

I have searched and cannot find an answer to this question elsewhere. Hopefully I haven't missed something.
I am trying to use Python multiprocessing to essentially batch run some proprietary models in parallel. I have, say, 200 simulations, and I want to batch run them ~10-20 at a time. My problem is that the proprietary software crashes if two models happen to start at the same / similar time. I need to introduce a delay between processes spawned by multiprocessing so that each new model run waits a little bit before starting.
So far, my solution has been to introduce a random time delay at the start of the child process before it fires off the model run. However, this only reduces the probability of any two runs starting at the same time, and therefore I still run into problems when trying to process a large number of models. I therefore think that the time delay needs to be built into the multiprocessing part of the code, but I haven't been able to find any documentation or examples of this.
Edit: I am using Python 2.7
This is my code so far:
from time import sleep
import numpy as np
import subprocess
import multiprocessing

def runmodels(arg):
    sleep(np.random.rand(1,1)*120) # this is my interim solution to reduce the probability that any two runs start at the same time, but it isn't a guaranteed solution
    subprocess.call(arg) # this line actually fires off the model run

if __name__ == '__main__':
    arguments = [big list of runs in here
                 ]
    count = 12
    pool = multiprocessing.Pool(processes = count)
    r = pool.imap_unordered(runmodels, arguments)
    pool.close()
    pool.join()

multiprocessing.Pool() already limits number of processes running concurrently.
You could use a lock, to separate the starting time of the processes (not tested):
import threading
import multiprocessing

def init(lock):
    global starting
    starting = lock

def run_model(arg):
    starting.acquire() # no other process can get it until it is released
    threading.Timer(1, starting.release).start() # release in a second
    # ... start your simulation here

if __name__=="__main__":
    arguments = ...
    pool = multiprocessing.Pool(processes=12,
                                initializer=init, initargs=[multiprocessing.Lock()])
    for _ in pool.imap_unordered(run_model, arguments):
        pass

One way to do this with thread and semaphore :
from time import sleep
import subprocess
import threading

def runmodels(arg):
    subprocess.call(arg)
    sGlobal.release() # release for next launch

if __name__ == '__main__':
    threads = []
    global sGlobal
    sGlobal = threading.Semaphore(12) #Semaphore for max 12 Thread
    arguments = [big list of runs in here
                 ]
    for arg in arguments :
        sGlobal.acquire() # Block if more than 12 thread
        t = threading.Thread(target=runmodels, args=(arg,))
        threads.append(t)
        t.start()
        sleep(1)
    for t in threads :
        t.join()

The answer suggested by jfs caused problems for me as a result of starting a new thread with threading.Timer. If the worker just so happens to finish before the timer does, the timer is killed and the lock is never released.
I propose an alternative route, in which each successive worker waits until enough time has passed since the start of the previous one. This seems to have the same desired effect, but without having to rely on an extra timer thread.
import multiprocessing as mp
import time

def init(shared_val):
    global start_time
    start_time = shared_val

def run_model(arg):
    with start_time.get_lock():
        wait_time = max(0, start_time.value - time.time())
        time.sleep(wait_time)
        start_time.value = time.time() + 1.0 # Specify interval here
    # ... start your simulation here

if __name__=="__main__":
    arguments = ...
    pool = mp.Pool(processes=12,
                   initializer=init, initargs=[mp.Value('d')])
    for _ in pool.imap_unordered(run_model, arguments):
        pass
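A runnable variant of the same idea, useful for checking the spacing before wiring in the real model. It assumes a short interval of 0.2 seconds and replaces the simulation with a placeholder. The names INTERVAL and next_start are illustrative. Each call returns the moment it was allowed to start, so the parent can print the gaps.

```python
import multiprocessing as mp
import time

INTERVAL = 0.2  # assumed minimum gap between starts, in seconds


def init(shared_val):
    # Pool initializer: stash the shared value in a module-level global
    # of each worker process.
    global next_start
    next_start = shared_val


def run_model(arg):
    with next_start.get_lock():
        # Wait until the earliest permitted start time has passed.
        time.sleep(max(0.0, next_start.value - time.time()))
        started = time.time()
        next_start.value = started + INTERVAL
    # ... the real model run would be launched here ...
    return started


if __name__ == '__main__':
    with mp.Pool(processes=4, initializer=init,
                 initargs=(mp.Value('d', 0.0),)) as pool:
        starts = sorted(pool.imap_unordered(run_model, range(6)))
    gaps = [b - a for a, b in zip(starts, starts[1:])]
    print(['%.2f' % g for g in gaps])
```

The lock is held only while waiting and recording the start time, so the model runs themselves still overlap.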
