I have a queue of 500 processes that I want to run through a Python script, N at a time in parallel.
What my python script does so far:
It runs N processes in parallel, waits for all of them to terminate, then runs the next N files.
What I need to do:
When one of the N processes is finished, another process from the queue is automatically started, without waiting for the rest of the processes to terminate.
Note: I do not know how much time each process will take, so I can't schedule a process to run at a particular time.
Following is the code that I have.
I am currently using subprocess.Popen, but I'm not limited to its use.
for i in range(0, len(queue), N):
    # Take the next N commands; the last batch may be shorter
    batch = queue[i:i + N]
    ps = []
    for process in batch:
        p = subprocess.Popen([process])
        ps.append(p)
    # Wait for the whole batch before starting the next one
    for p in ps:
        p.communicate()
I believe this should work:
import subprocess
import time

def check_for_done(l):
    # Return (True, index) for the first finished process, else (False, None)
    for i, p in enumerate(l):
        if p.poll() is not None:
            return True, i
    return False, None

processes = list()
N = 5
queue = list()  # fill this with the commands to run
for process in queue:
    p = subprocess.Popen(process)
    processes.append(p)
    if len(processes) == N:
        wait = True
        while wait:
            done, num = check_for_done(processes)
            if done:
                processes.pop(num)
                wait = False
            else:
                time.sleep(0.5) # set this so the CPU does not go crazy

# Wait for the processes still running once the queue is exhausted
for p in processes:
    p.wait()
So you have a list of active processes, and check_for_done loops through it. poll() returns None while a process is still running and its return code once it has finished, so any non-None value means the process is done (whether or not it succeeded). You then remove that process from the list, which lets the loop start another one.
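The same polling idea can be wrapped in a reusable function. This is only a minimal sketch: the names run_rolling, max_parallel and poll_interval are illustrative and not from the post above, and the demo commands simply start the current Python interpreter so that the example does not depend on any external program.

```python
import subprocess
import sys
import time


def run_rolling(commands, max_parallel, poll_interval=0.05):
    """Run commands with at most max_parallel alive at once.

    Returns the exit codes in the same order as commands.
    """
    pending = list(enumerate(commands))
    running = {}                       # index -> Popen object
    codes = [None] * len(commands)
    while pending or running:
        # Top up the window while there is room and work left
        while pending and len(running) < max_parallel:
            idx, cmd = pending.pop(0)
            running[idx] = subprocess.Popen(cmd)
        # poll() is None while a process runs, its exit code afterwards
        finished = [idx for idx, p in running.items() if p.poll() is not None]
        for idx in finished:
            codes[idx] = running.pop(idx).returncode
        # Nothing finished this round: back off briefly instead of spinning
        if not finished and running:
            time.sleep(poll_interval)
    return codes


# Hypothetical demo commands: six short-lived Python child processes
demo_commands = [[sys.executable, "-c", "import time; time.sleep(0.1)"]
                 for _ in range(6)]
exit_codes = run_rolling(demo_commands, max_parallel=3)
print(exit_codes)
```

Unlike the loop above, this version also records each exit code, which may be useful when some of the 500 processes can fail.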
Assuming Python 3, you could use ThreadPoolExecutor from concurrent.futures, like this:
$ cat run.py
from subprocess import Popen, PIPE
from concurrent.futures import ThreadPoolExecutor
import time

def exec_(cmd):
    proc = Popen(cmd, stdout=PIPE, stderr=PIPE)
    stdout, stderr = proc.communicate()
    #print(stdout, stderr)

def main():
    with ThreadPoolExecutor(max_workers=4) as executor:
        # to demonstrate it will take a batch of 4 jobs at the same time
        cmds = [['sleep', '4'] for i in range(10)]
        start = time.time()
        futures = executor.map(exec_, cmds)
        for future in futures:
            pass
        end = time.time()
        print(f'Took {end-start} seconds')

if __name__ == '__main__':
    main()
This will process 4 tasks at a time, and since there are 10 tasks, it should take only around 4 + 4 + 4 = 12 seconds:
First 4 seconds for the first 4 tasks
Second 4 seconds for the next 4 tasks
And the final 4 seconds for the last 2 remaining tasks
Output:
$ python run.py
Took 12.005989074707031 seconds
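The 12-second figure follows from simple arithmetic when every task takes the same time: the tasks run in ceil(10 / 4) = 3 waves of 4 seconds each. A small sketch of that estimate (estimated_wall_time is a hypothetical helper, and it assumes equal task durations and ignores process start-up overhead):

```python
import math


def estimated_wall_time(n_tasks, max_workers, seconds_per_task):
    # With equal-length tasks the pool works through them in "waves"
    waves = math.ceil(n_tasks / max_workers)
    return waves * seconds_per_task


estimate = estimated_wall_time(n_tasks=10, max_workers=4, seconds_per_task=4)
print(estimate)  # 12
```

With tasks of unequal length the pool does better than this wave model suggests, because a free worker picks up the next task immediately.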
Related
I am a beginner in Python, so I would very much appreciate clear and easy explanations.
In my Python script, I have a function that creates several threads to do an I/O-bound task (it actually makes several Azure requests concurrently using the Azure Python SDK). I also have a list of time differences like [1 second, 3 seconds, 10 seconds, 5 seconds, ..., 7 seconds], and I execute the function again after each time difference.
Let's say I want to execute the function and execute it again after 5 seconds. The first execution can take much more than 5 seconds to finish as it has to wait for the requests it makes to be done. So, I want to execute each function in a different process so that different executions of the function do not block each other (Even if they don't block each other without using different processes, I just didn't want threads in different executions to be mixed).
My code is like:
import multiprocessing as mp
from time import sleep

def function(num_threads):
    """
    This function creates num_threads threads to make num_threads requests
    """

# Time to wait in seconds between each execution of the function
times = [1, 10, 7, 3, 13, 19]
# List of number of requests to make for each execution of the function
num_threads_list = [1, 2, 3, 4, 5, 6]
processes = []
for i in range(len(times)):
    p = mp.Process(target=function, args=[num_threads_list[i]])
    p.start()
    processes.append(p)
    sleep(times[i])
for process in processes:
    process.join()
Question I have:
The list "times" is very long in my real script (1000 entries). Given the time differences in the list, I guess there are at most 5 executions of the function running concurrently. I wonder whether each process terminates when it is done executing the function, so that there are actually at most 5 processes running, or whether the processes remain, so that there will be 1000 of them, which sounds very odd given the number of CPU cores of my computer.
Please tell me if you think there is a better way to do what I explained above.
Thank you!
The main problem I distill from your question is having a large number of processes running simultaneously.
You can prevent that by maintaining a list of processes with a maximum length. Something like this:
import multiprocessing as mp
from time import sleep
from random import randint

def function(num_threads):
    """
    This function creates num_threads threads to make num_threads requests
    """
    sleep(randint(3, 7))

# Time to wait in seconds between each execution of the function
times = [1, 10, 7, 3, 13, 19]
# List of number of requests to make for each execution of the function
num_threads_list = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
process_data_list = []
max_processes = 4

# =======================================================================================
def main():
    times_index = 0
    while times_index < len(times):
        # cleanup stopped processes -------------------------------
        cleanup_done = False
        while not cleanup_done:
            cleanup_done = True
            # search stopped processes
            for i, process_data in enumerate(process_data_list):
                if not process_data[1].is_alive():
                    print(f'process {process_data[0]} finished')
                    # remove from processes
                    p = process_data_list.pop(i)
                    del p
                    # start new search
                    cleanup_done = False
                    break
        # try start new process ---------------------------------
        if len(process_data_list) < max_processes:
            process = mp.Process(target=function, args=[num_threads_list[times_index]])
            process.start()
            process_data_list.append([times_index, process])
            print(f'process {times_index} started')
            times_index += 1
        else:
            sleep(0.1)
    # wait for all processes to finish --------------------------------
    while process_data_list:
        for i, process_data in enumerate(process_data_list):
            if not process_data[1].is_alive():
                print(f'process {process_data[0]} finished')
                # remove from processes
                p = process_data_list.pop(i)
                del p
                # start new search
                break
        else:
            sleep(0.1)  # nothing finished yet; avoid a busy loop
    print('ALL DONE !!!!!!')

# =======================================================================================
if __name__ == '__main__':
    main()
It runs at most max_processes at once, as you can see in the result.
process 0 started
process 1 started
process 2 started
process 3 started
process 3 finished
process 4 started
process 1 finished
process 5 started
process 0 finished
process 2 finished
process 5 finished
process 4 finished
ALL DONE !!!!!!
You could also use a timer to do the job, as in the following code.
I deliberately gave process 2 a 15-second delay so that you can see it does finish last, once that time has elapsed.
This code sample has two main functions.
The first one, your_process_here(), is, as its name says, a placeholder for your own code.
The second one is a manager that launches the processes in slices so as not to overload the system.
Parameters
max_process : total number of processes being executed by the script
simultp : maximum number of simultaneous processes
timegl : time guideline which defines the waiting time for each process, measured from the moment the parent starts. The waiting time is therefore at least the time defined in the guideline (relative to the parent's start time).
In other words, once its guideline time has elapsed, a process starts as soon as possible, taking into account the maximum number of simultaneous processes allowed.
In this example
max_process = 6
simultp = 3
timegl = [1, 15, 1, 0.22, 6, 0.5] (just for illustration; an increasing series would be more logical here)
Result in the shell
simultaneously launched processes : 3
process n°2 is active and will wait 14.99 seconds more before treatment function starts
process n°1 is active and will wait 0.98 seconds more before treatment function starts
process n°3 is active and will wait 0.98 seconds more before treatment function starts
---- process n°1 ended ----
---- process n°3 ended ----
simultaneously launched processes : 3
process n°5 is active and will wait 2.88 seconds more before treatment function starts
process n°4 is active and will start now
---- process n°4 ended ----
---- process n°5 ended ----
simultaneously launched processes : 2
process n°6 is active and will start now
---- process n°6 ended ----
---- process n°2 ended ----
Code
import multiprocessing as mp
from threading import Timer
import time

def your_process_here(starttime, pnum, timegl):
    # Delay since the parent process started
    delay_since_pstart = time.time() - starttime
    # Time to sleep in order to follow the time guideline as closely as possible
    diff = timegl[pnum-1] - delay_since_pstart
    if diff > 0: # if time elapsed since the parent started < guideline time
        print('process n°{0} is active and will wait {1} seconds more before treatment function starts'\
              .format(pnum, round(diff, 2)))
        time.sleep(diff) # wait for X more seconds
    else:
        print('process n°{0} is active and will start now'.format(pnum))
    ########################################################
    ## PUT THE CODE AFTER SLEEP() TO START CODE WITH A DELAY
    ## if pnum == 1:
    ##     function1()
    ## elif pnum == 2:
    ##     function2()
    ## ...
    print('---- process n°{0} ended ----'.format(pnum))

def process_manager(max_process, simultp, timegl, starttime=0, pnum=1, launchp=[]):
    # While your number of simultaneous current processes is less than simultp and
    # the historical number of processes is less than max_process
    while len(mp.active_children()) < simultp and len(launchp) < max_process:
        # Incrementation of the process number
        pnum = len(launchp) + 1
        # Start a new process
        mp.Process(target=your_process_here, args=(starttime, pnum, timegl)).start()
        # History of all unique launched processes
        launchp = list(set(launchp + mp.active_children()))
    # ...
    ####### THESE 2 FOLLOWING LINES ARE TO DELETE IN OPERATIONAL CODE ############
    print('simultaneously launched processes : ', len(mp.active_children()))
    time.sleep(3) # optional : a pause of 3 seconds before the next slice of processes is treated
    ##############################################################################
    if pnum < max_process:
        delay_repeat = 0.1 # 100 ms
        # If all the processes have not been launched renew the operation
        Timer(delay_repeat, process_manager, (max_process, simultp, timegl, starttime, pnum, launchp)).start()

if __name__ == '__main__':
    max_process = 6 # maximum of processes
    simultp = 3 # maximum of simultaneous processes to save resources
    timegl = [1, 15, 1, 0.22, 6, 0.5] # Time guideline
    starttime = time.time()
    process_manager(max_process, simultp, timegl, starttime)
I have 50 different methods that I want to run. I have 10 CPUs available, so I can run only 10 processes at the same time, which means I run them in 5 rounds. The problem is that the first 10 processes must all finish before the second 10 start, and this increases the total time. What I want is that as soon as one process finishes (so only 9 are running), a new one starts, so that 10 processes are always running.
I put my 50 classes into 5 groups and run them group by group.
group1 = [class1, class2...]
group2 = [class11, class12..]
groups = [group1, group2, ..., group5]
for group in groups:
    threads = []
    for x in group:
        # pass the bound method itself; x().method() would call it right here
        threads.append(mp.Process(target=x().method, args=(b,)))
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
You should create a Pool of processes and use the apply_async method:
from multiprocessing import Pool

pool = Pool(processes=10) # start 10 worker processes
for arg in args:
    pool.apply_async(yourFunc, args = (arg, ))
pool.close()
pool.join()
https://docs.python.org/2/library/multiprocessing.html
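One thing the snippet above does not show is how to get return values back. apply_async returns an AsyncResult, and calling get() on it returns the value or re-raises any exception from the worker, which would otherwise go unnoticed. The sketch below uses multiprocessing.pool.ThreadPool, which has the same interface as Pool, purely so that it runs as a plain script without an `if __name__ == '__main__':` guard; that swap is my assumption for the demo, and work is a hypothetical stand-in for one of the 50 methods.

```python
from multiprocessing.pool import ThreadPool


def work(x):
    # Hypothetical stand-in for one of the 50 methods
    return x * x


# ThreadPool mirrors the Pool API; with a real Pool, keep this part
# under an `if __name__ == '__main__':` guard.
with ThreadPool(processes=10) as pool:
    pending = [pool.apply_async(work, args=(n,)) for n in range(50)]
    # get() blocks until that task is done and re-raises worker errors
    results = [r.get() for r in pending]

print(results[:5])  # [0, 1, 4, 9, 16]
```

Because all 50 tasks are submitted before the first get(), the pool always keeps its 10 workers busy while results are collected in submission order.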
My problem is that whenever I use thr.result(), the program acts as if it's running on one thread. When I don't use thr.result(), it uses x threads.
So if I remove my if statement, it runs on 10 threads; if I leave it in, it acts as if it's on 1 thread.
def search(query):
    r = requests.get("https://www.google.com/search?q=" + query)
    return r.status_code

pool = ThreadPoolExecutor(max_workers=10)
for i in range(50):
    thr = pool.submit(search, "stocks")
    print(i)
    if thr.result() != 404:
        print("Ran")
pool.shutdown(wait=True)
That's because result will wait for the future to complete:
Return the value returned by the call. If the call hasn’t yet completed then this method will wait up to timeout seconds. If the call hasn’t completed in timeout seconds, then a concurrent.futures.TimeoutError will be raised. timeout can be an int or float. If timeout is not specified or None, there is no limit to the wait time.
When you call result inside the loop, you submit a task, wait for it to complete, and only then submit the next one, so only one task can be running at a time.
Update: You can either store the returned futures in a list and iterate over them once you have submitted all the tasks, or use map:
from concurrent.futures import ThreadPoolExecutor
import time

def square(x):
    time.sleep(0.3)
    return x * x

print(time.time())
with ThreadPoolExecutor(max_workers=3) as pool:
    for res in pool.map(square, range(10)):
        print(res)
print(time.time())
Output:
1485845609.983702
0
1
4
9
16
25
36
49
64
81
1485845611.1942203
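The first option mentioned in the update, storing the futures and reading their results only after everything has been submitted, might look like the sketch below. It reuses the square function from the map example with a shorter sleep; the dictionary mapping each future back to its input is my own addition for illustration.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
import time


def square(x):
    time.sleep(0.05)
    return x * x


with ThreadPoolExecutor(max_workers=3) as pool:
    # Submit everything first; submit() returns immediately
    futures = {pool.submit(square, n): n for n in range(10)}
    results = {}
    # Only now wait, handling each future as soon as it finishes
    for fut in as_completed(futures):
        results[futures[fut]] = fut.result()

print(sorted(results.items()))
```

as_completed yields futures in completion order rather than submission order, which is why the results are keyed by input here.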
The following code starts a pool of three processes to handle 20 worker calls:
import multiprocessing

def worker(nr):
    print(nr)

numbers = [i for i in range(20)]

if __name__ == '__main__':
    multiprocessing.freeze_support()
    pool = multiprocessing.Pool(processes=3)
    results = pool.map(worker, numbers)
    pool.close()
    pool.join()
Is there a way to start the processes in a sequence (as opposed to having them starting all at the same time), with a delay inserted between each process start?
If not using a Pool I would have used multiprocessing.Process(target=worker, args=(nr,)).start() in a loop, starting them one after the other and inserting the delay as needed. I find Pool to be extremely useful, though (together with the map call) so I would be glad to keep it if possible.
According to the documentation, no such control over pooled processes exists. You could, however, simulate it with a lock:
import multiprocessing
import time

lock = multiprocessing.Lock()

def worker(nr):
    lock.acquire()
    time.sleep(0.100)
    lock.release()
    print(nr)

numbers = [i for i in range(20)]

if __name__ == '__main__':
    multiprocessing.freeze_support()
    pool = multiprocessing.Pool(processes=3)
    results = pool.map(worker, numbers)
    pool.close()
    pool.join()
Your 3 processes will still start simultaneously. Well, what I mean is you don't have control over which process starts executing the callback first. But at least you get your delay. This effectively has each worker "starting" (but really, continuing) at designated intervals.
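To see the spacing effect of the lock in isolation, here is a small sketch that uses threads instead of processes so that it runs as a plain script. That substitution is an assumption made for the demo (threading.Lock plays the role of multiprocessing.Lock), and DELAY, gate and start_times are illustrative names.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor

DELAY = 0.05                 # assumed gap between job starts, in seconds
gate = threading.Lock()
start_times = []             # moment each job left the gate


def worker(nr):
    with gate:
        # Only one worker at a time can hold the gate, so jobs
        # leave it roughly DELAY seconds apart
        time.sleep(DELAY)
        start_times.append(time.monotonic())
    return nr * 2            # the "real" work happens outside the gate


with ThreadPoolExecutor(max_workers=3) as pool:
    results = list(pool.map(worker, range(6)))

gaps = [later - earlier for earlier, later in zip(start_times, start_times[1:])]
print(results)
print([round(g, 2) for g in gaps])
```

Using `with gate:` instead of separate acquire() and release() calls also guarantees the lock is released if the code inside raises.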
Amendment resulting from discussion below:
Note that on Windows it's not possible to inherit a lock from a parent process. Instead, you can use multiprocessing.Manager().Lock() to communicate a global lock object between processes (with additional IPC overhead, of course). The global lock object needs to be initialized in each process, as well. This would look like:
from multiprocessing import Process, freeze_support
import multiprocessing
import time
from datetime import datetime as dt

def worker(nr):
    glock.acquire()
    print('started job: {} at {}'.format(nr, dt.now()))
    time.sleep(1)
    glock.release()
    print('ended job: {} at {}'.format(nr, dt.now()))

numbers = [i for i in range(6)]

def init(lock):
    global glock
    glock = lock

if __name__ == '__main__':
    multiprocessing.freeze_support()
    lock = multiprocessing.Manager().Lock()
    pool = multiprocessing.Pool(processes=3, initializer=init, initargs=(lock,))
    results = pool.map(worker, numbers)
    pool.close()
    pool.join()
Couldn't you do something simple like this:
from multiprocessing import Process
from time import sleep

def f(n):
    print('started job: ' + str(n))
    sleep(3)
    print('ended job: ' + str(n))

if __name__ == '__main__':
    for i in range(0,100):
        p = Process(target=f, args=(i,))
        p.start()
        sleep(1)
Result
started job: 0
started job: 1
started job: 2
ended job: 0
started job: 3
ended job: 1
started job: 4
ended job: 2
started job: 5
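Could you try defining a function that yields your values slowly?
import time

def get_numbers_on_delay(numbers, delay):
    for i in numbers:
        yield i
        time.sleep(delay)
and then:
results = pool.map(worker, get_numbers_on_delay(numbers, 5))
I haven't tested it, so I'm not sure, but give it a shot. One caveat: pool.map turns a generator into a list before dispatching any work, so pool.imap may be needed for the delay to fall between submissions.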
I couldn't get the locking answer to work for some reason, so I implemented it this way.
I realize the question is old, but maybe someone else has the same problem.
It spawns all the processes similar to the locking solution, but sleeps before work based on their process name number.
from multiprocessing import current_process
from re import search
from time import sleep

def worker():
    # raw string so that \d is not treated as a string escape
    process_number = search(r'\d+', current_process().name).group()
    time_between_workers = 5
    sleep(time_between_workers * int(process_number))
    #do your work here
Since the names given to the processes seem to be unique and incremental, this grabs the number of the process and sleeps based on that.
SpawnPoolWorker-1 sleeps 1 * 5 seconds, SpawnPoolWorker-2 sleeps 2 * 5 seconds etc.
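The name parsing can be checked on its own, without starting a pool. In this sketch startup_delay is a hypothetical helper, and the names passed in are examples of what current_process().name may look like; the exact naming scheme is an implementation detail of multiprocessing (with the fork start method the prefix is typically ForkPoolWorker), so treat it as an assumption rather than a guarantee.

```python
import re


def startup_delay(process_name, seconds_between_workers):
    """Derive a start-up delay from the number embedded in a worker name."""
    match = re.search(r'\d+', process_name)
    if match is None:
        return 0  # e.g. "MainProcess" carries no number
    return seconds_between_workers * int(match.group())


names = ("SpawnPoolWorker-1", "SpawnPoolWorker-2", "MainProcess")
delays = [startup_delay(name, 5) for name in names]
print(delays)  # [5, 10, 0]
```

Handling the no-match case avoids the AttributeError that search(...).group() would raise if the function were ever called from the main process.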
The following code works, but it is very slow because of the large data sets being passed. In the actual implementation, the time it takes to create a process and send the data is almost the same as the calculation time, so by the time the second process is created, the first process is almost finished with its calculation, making parallelization pointless.
The code is the same as in the question "Multiprocessing has cutoff at 992 integers being joined as result", with the suggested change implemented and working below. However, I ran into what I assume is a common problem: pickling large data takes a long time.
I see answers using multiprocessing.Array to pass a shared memory array. I have an array of ~4000 indexes, but each index has a dictionary with 200 key/value pairs. The data is only read by each process, some calculation is done, and then a matrix (4000x3) (with no dicts) is returned.
Answers such as the one to "Is shared readonly data copied to different processes for Python multiprocessing?" use map. Is it possible to keep the system below and implement shared memory? Is there an efficient way to send an array of dicts to each process, such as wrapping the dict in some manager and then putting that inside the multiprocessing.Array?
import multiprocessing

def main():
    data = {}
    total = []
    for j in range(0,3000):
        total.append(data)
        for i in range(0,200):
            data[str(i)] = i
    CalcManager(total,start=0,end=3000)

def CalcManager(myData,start,end):
    print('in calc manager')
    #Multi processing
    #Set the number of processes to use.
    nprocs = 3
    #Initialize the multiprocessing queue so we can get the values returned to us
    tasks = multiprocessing.JoinableQueue()
    result_q = multiprocessing.Queue()
    #Setup an empty array to store our processes
    procs = []
    #Divide up the data for the set number of processes
    interval = (end-start)//nprocs
    new_start = start
    #Create all the processes while dividing the work appropriately
    for i in range(nprocs):
        print('starting processes')
        new_end = new_start + interval
        #Make sure we dont go past the size of the data
        if new_end > end:
            new_end = end
        #Generate a new process and pass it the arguments
        data = myData[new_start:new_end]
        #Create the processes and pass the data and the result queue
        p = multiprocessing.Process(target=multiProcess,args=(data,new_start,new_end,result_q,i))
        procs.append(p)
        p.start()
        #Set our next start to the current end (slices exclude the end index)
        new_start = new_end
    print('finished starting')
    #Print out the results
    for i in range(nprocs):
        result = result_q.get()
        print(result)
    #Join the processes to wait for all data/process to be finished
    for p in procs:
        p.join()

#MultiProcess Handling
def multiProcess(data,start,end,result_q,proc_num):
    print('started process')
    results = []
    temp = []
    for i in range(0,22):
        results.append(temp)
        for j in range(0,3):
            temp.append(j)
    result_q.put(results)
    return

if __name__== '__main__':
    main()
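The chunk arithmetic in CalcManager is easy to get subtly wrong, because Python slices are half-open: the slice data[a:b] excludes index b, so the next chunk should start at b itself. A minimal sketch of that calculation in isolation follows; chunk_bounds is a hypothetical helper, not part of the code above, and spreading the remainder over the first chunks is just one possible choice.

```python
def chunk_bounds(start, end, nprocs):
    """Split [start, end) into nprocs contiguous half-open ranges."""
    size, extra = divmod(end - start, nprocs)
    bounds = []
    lo = start
    for i in range(nprocs):
        # The first `extra` chunks take one additional item each
        hi = lo + size + (1 if i < extra else 0)
        bounds.append((lo, hi))
        lo = hi  # next chunk starts where this one ended
    return bounds


even_split = chunk_bounds(0, 3000, 3)
uneven_split = chunk_bounds(0, 500, 3)
print(even_split)    # [(0, 1000), (1000, 2000), (2000, 3000)]
print(uneven_split)  # [(0, 167), (167, 334), (334, 500)]
```

Each pair can be used directly as data[lo:hi], and together the ranges cover every index exactly once.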
Solved
Just putting the list of dictionaries into a manager solved the problem.
manager=Manager()
d=manager.list(myData)
It seems that the manager holding the list also manages the dicts contained in that list. The startup time is a bit slow, so it seems the data is still being copied, but it's done once at the beginning, and then the data is sliced inside each process.
import multiprocessing
import multiprocessing.sharedctypes as mt
from multiprocessing import Process, Lock, Manager
from ctypes import Structure, c_double

def main():
    data = {}
    total = []
    for j in range(0,3000):
        total.append(data)
        for i in range(0,100):
            data[str(i)] = i
    CalcManager(total,start=0,end=500)

def CalcManager(myData,start,end):
    print('in calc manager')
    print(type(myData[0]))
    manager = Manager()
    d = manager.list(myData)
    #Multi processing
    #Set the number of processes to use.
    nprocs = 3
    #Initialize the multiprocessing queue so we can get the values returned to us
    tasks = multiprocessing.JoinableQueue()
    result_q = multiprocessing.Queue()
    #Setup an empty array to store our processes
    procs = []
    #Divide up the data for the set number of processes
    interval = (end-start)//nprocs
    new_start = start
    #Create all the processes while dividing the work appropriately
    for i in range(nprocs):
        new_end = new_start + interval
        #Make sure we dont go past the size of the data
        if new_end > end:
            new_end = end
        #Generate a new process and pass it the arguments
        data = myData[new_start:new_end]
        #Create the processes and pass the managed list and the result queue
        p = multiprocessing.Process(target=multiProcess,args=(d,new_start,new_end,result_q,i))
        procs.append(p)
        p.start()
        #Set our next start to the current end (slices exclude the end index)
        new_start = new_end
    print('finished starting')
    #Print out the results
    for i in range(nprocs):
        result = result_q.get()
        print(len(result))
    #Join the processes to wait for all data/process to be finished
    for p in procs:
        p.join()

#MultiProcess Handling
def multiProcess(data,start,end,result_q,proc_num):
    #print 'started process'
    results = []
    temp = []
    data = data[start:end]
    for i in range(0,22):
        results.append(temp)
        for j in range(0,3):
            temp.append(j)
    print(len(data))
    result_q.put(results)
    return

if __name__ == '__main__':
    main()
You may see some improvement by using a multiprocessing.Manager to store your list in a manager server, and having each child process access items from the dict by pulling them from that one shared list, rather than copying slices to each child process:
def CalcManager(myData,start,end):
    print('in calc manager')
    print(type(myData[0]))
    manager = Manager()
    d = manager.list(myData)
    nprocs = 3
    result_q = multiprocessing.Queue()
    procs = []
    interval = (end-start)//nprocs
    new_start = start
    for i in range(nprocs):
        new_end = new_start + interval
        if new_end > end:
            new_end = end
        p = multiprocessing.Process(target=multiProcess,
                                    args=(d, new_start, new_end, result_q, i))
        procs.append(p)
        p.start()
        #Set our next start to the current end (slices exclude the end index)
        new_start = new_end
    print('finished starting')
    for i in range(nprocs):
        result = result_q.get()
        print(len(result))
    #Join the processes to wait for all data/process to be finished
    for p in procs:
        p.join()
This copies your entire data list to a Manager process prior to creating any of your workers. The Manager returns a Proxy object that allows shared access to the list. You then just pass the Proxy to the workers, which means their startup time will be greatly reduced, since there's no longer any need to copy slices of the data list. The downside here is that accessing the list will be slower in the children, since the access needs to go to the manager process via IPC. Whether or not this will really help performance is very dependent on exactly what work you're doing on the list in your work processes, but it's worth a try, since it requires very few code changes.
Looking at your question, I assume the following:
For each item in myData, you want to return an output (a matrix of some sort)
You created a JoinableQueue (tasks), probably to hold the input, but are not sure how to use it
The Code
import logging
import multiprocessing

def create_logger(logger_name):
    ''' Create a logger that logs to the console '''
    logger = logging.getLogger(logger_name)
    logger.setLevel(logging.DEBUG)
    # create console handler and set appropriate level
    ch = logging.StreamHandler()
    formatter = logging.Formatter("%(processName)s %(funcName)s() %(levelname)s: %(message)s")
    ch.setFormatter(formatter)
    logger.addHandler(ch)
    return logger

def main():
    global logger
    logger = create_logger(__name__)
    logger.info('Main started')
    data = []
    for i in range(0,100):
        data.append({str(i):i})
    CalcManager(data,start=0,end=50)
    logger.info('Main ended')

def CalcManager(myData,start,end):
    logger.info('CalcManager started')
    #Initialize the multiprocessing queue so we can get the values returned to us
    tasks = multiprocessing.JoinableQueue()
    results = multiprocessing.Queue()
    # Add tasks
    for i in range(start, end):
        tasks.put(myData[i])
    # Create processes to do work
    nprocs = 3
    for i in range(nprocs):
        logger.info('starting processes')
        p = multiprocessing.Process(target=worker,args=(tasks,results))
        p.daemon = True
        p.start()
    # Wait for tasks completion, i.e. tasks queue is empty
    try:
        tasks.join()
    except KeyboardInterrupt:
        logger.info('Cancel tasks')
    # Print out the results
    print('RESULTS')
    while not results.empty():
        result = results.get()
        print(result)
    logger.info('CalcManager ended')

def worker(tasks, results):
    while True:
        try:
            task = tasks.get() # one row of input
            task['done'] = True # simulate work being done
            results.put(task) # Save the result to the output queue
        finally:
            # JoinableQueue: for every get(), we need a task_done()
            tasks.task_done()

if __name__== '__main__':
    main()
Discussion
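For multi-process situations, I recommend using the logging module, as it offers a few advantages:
It is thread-safe, and with a console handler each record is written as a whole line, so the output of different processes is much less likely to get mingled together (logging to a single file from several processes is a different matter and is not supported by the standard handlers)
You can configure logging to show the process name and function name, which is very handy for debugging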
CalcManager is essentially a task manager which does the following
Populates the input queue, tasks
Creates three processes
Waits for the tasks to complete
Prints out the results
Note that when creating the processes, I mark them as daemon, meaning they will be killed when the main program exits. You don't have to worry about killing them
worker is where the work is done
Each of them runs forever (while True loop)
Each time through the loop, they will get one unit of input, do some processing, then put the result in the output
After a task is done, it calls task_done() so that the main process knows when all jobs are done. I put task_done in the finally clause to ensure it will run even if an error occurred during processing
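The get() / task_done() / join() handshake described above can be tried out without processes: queue.Queue in the standard library offers the same three methods for threads. The sketch below is that thread-based analogue, not the multiprocessing code itself; the task dictionaries and the finished_ids name are illustrative. Here get() sits outside the try block so that task_done() is only called for a task that was actually fetched.

```python
import queue
import threading


def worker(tasks, results):
    while True:
        task = tasks.get()               # blocks until a task is available
        try:
            done = dict(task, done=True)  # simulate work being done
            results.put(done)
        finally:
            # Every successful get() is matched by exactly one task_done()
            tasks.task_done()


tasks = queue.Queue()
results = queue.Queue()
for i in range(10):
    tasks.put({'id': i})

for _ in range(3):
    # Daemon threads are abandoned when the main thread exits
    threading.Thread(target=worker, args=(tasks, results), daemon=True).start()

tasks.join()                             # returns once every task is marked done

finished = []
while not results.empty():
    finished.append(results.get())
finished_ids = sorted(item['id'] for item in finished)
print(finished_ids)
```

If a worker forgot to call task_done(), tasks.join() would block forever, which is why the call lives in a finally clause.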