The following code starts three processes in a pool to handle 20 worker calls:
import multiprocessing

def worker(nr):
    print(nr)

numbers = [i for i in range(20)]

if __name__ == '__main__':
    multiprocessing.freeze_support()
    pool = multiprocessing.Pool(processes=3)
    results = pool.map(worker, numbers)
    pool.close()
    pool.join()
Is there a way to start the processes in sequence (as opposed to having them all start at the same time), with a delay inserted between each process start?
Without a Pool I would have used multiprocessing.Process(target=worker, args=(nr,)).start() in a loop, starting them one after the other and inserting the delay as needed (a sketch of that fallback is below). I find Pool extremely useful, though (together with the map call), so I would be glad to keep it if possible.
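For reference, the Pool-less loop I describe would look roughly like this (a minimal sketch; the half-second delay is just a placeholder):
import multiprocessing
import time

def worker(nr):
    print(nr)

if __name__ == '__main__':
    procs = []
    for nr in range(20):
        p = multiprocessing.Process(target=worker, args=(nr,))
        p.start()          # start this worker now...
        procs.append(p)
        time.sleep(0.5)    # ...then wait before starting the next one
    for p in procs:
        p.join()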
According to the documentation, no such control over pooled processes exists. You could, however, simulate it with a lock:
import multiprocessing
import time

lock = multiprocessing.Lock()

def worker(nr):
    lock.acquire()
    time.sleep(0.100)
    lock.release()
    print(nr)

numbers = [i for i in range(20)]

if __name__ == '__main__':
    multiprocessing.freeze_support()
    pool = multiprocessing.Pool(processes=3)
    results = pool.map(worker, numbers)
    pool.close()
    pool.join()
Your 3 processes will still start simultaneously, in the sense that you don't control which process starts executing its callback first. But at least you get your delay: each worker effectively "starts" (really, continues) at a designated interval.
Amendment resulting from discussion below:
Note that on Windows it's not possible to inherit a lock from a parent process. Instead, you can use multiprocessing.Manager().Lock() to communicate a global lock object between processes (with additional IPC overhead, of course). The global lock object needs to be initialized in each process, as well. This would look like:
from multiprocessing import Process, freeze_support
import multiprocessing
import time
from datetime import datetime as dt

def worker(nr):
    glock.acquire()
    print('started job: {} at {}'.format(nr, dt.now()))
    time.sleep(1)
    glock.release()
    print('ended job: {} at {}'.format(nr, dt.now()))

numbers = [i for i in range(6)]

def init(lock):
    global glock
    glock = lock

if __name__ == '__main__':
    multiprocessing.freeze_support()
    lock = multiprocessing.Manager().Lock()
    pool = multiprocessing.Pool(processes=3, initializer=init, initargs=(lock,))
    results = pool.map(worker, numbers)
    pool.close()
    pool.join()
Couldn't you do something simple like this:
from multiprocessing import Process
from time import sleep

def f(n):
    print 'started job: '+str(n)
    sleep(3)
    print 'ended job: '+str(n)

if __name__ == '__main__':
    for i in range(0,100):
        p = Process(target=f, args=(i,))
        p.start()
        sleep(1)
Result
started job: 0
started job: 1
started job: 2
ended job: 0
started job: 3
ended job: 1
started job: 4
ended job: 2
started job: 5
Could you try defining a function that yields your values slowly?
import time

def get_numbers_on_delay(numbers, delay):
    for i in numbers:
        yield i
        time.sleep(delay)
and then:
results = pool.map(worker, get_numbers_on_delay(numbers, 5))
I haven't tested it, so I'm not sure, but give it a shot.
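One caveat (also untested): as far as I know, pool.map materializes its input iterable into a list before dispatching any work, so with map the sleeps would happen while the argument list is being built rather than between task starts. pool.imap pulls from the iterable lazily as it submits tasks, so a variant along these lines may be closer to the intent:
import multiprocessing
import time

def worker(nr):
    print(nr)

def get_numbers_on_delay(numbers, delay):
    for i in numbers:
        yield i
        time.sleep(delay)   # pause before handing out the next task

if __name__ == '__main__':
    pool = multiprocessing.Pool(processes=3)
    # imap consumes the generator lazily, so each task is submitted
    # roughly `delay` seconds after the previous one
    for _ in pool.imap(worker, get_numbers_on_delay(range(20), 5)):
        pass
    pool.close()
    pool.join()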
I couldn't get the locking answer to work for some reason, so I implemented it this way.
I realize the question is old, but maybe someone else has the same problem.
It spawns all the processes just like the locking solution, but each worker sleeps before doing work, based on the number in its process name.
from multiprocessing import current_process
from re import search
from time import sleep

def worker():
    process_number = search(r'\d+', current_process().name).group()
    time_between_workers = 5
    sleep(time_between_workers * int(process_number))
    # do your work here
Since the names given to the processes seem to be unique and incremental, this grabs the number of the process and sleeps based on that.
SpawnPoolWorker-1 sleeps 1 * 5 seconds, SpawnPoolWorker-2 sleeps 2 * 5 seconds, and so on.
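For completeness, a minimal, untested sketch of how this worker might be driven by a Pool (note the sleep runs on every task the process picks up, not only its first one):
from multiprocessing import Pool, current_process
from re import search
from time import sleep

def worker(nr):
    # extract the trailing number from a name like "SpawnPoolWorker-3"
    process_number = int(search(r'\d+', current_process().name).group())
    sleep(5 * process_number)   # stagger this process before it does its work
    print(nr)

if __name__ == '__main__':
    pool = Pool(processes=3)
    pool.map(worker, range(20))
    pool.close()
    pool.join()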
Related
It is possible to retrieve the outputs of workers with Pool.map, but when one worker fails, an exception is raised and it's not possible to retrieve the outputs anymore. So, my idea was to log the outputs in a process-synchronized queue so as to retrieve the outputs of all successful workers.
The following snippet seems to work:
from multiprocessing import Pool, Manager
from functools import partial

def f(x, queue):
    if x == 4:
        raise Exception("Error")
    queue.put_nowait(x)

if __name__ == '__main__':
    queue = Manager().Queue()
    pool = Pool(2)
    try:
        pool.map(partial(f, queue=queue), range(6))
        pool.close()
        pool.join()
    except:
        print("An error occurred")
    while not queue.empty():
        print("Output => " + str(queue.get()))
But I was wondering whether a race condition could occur during the queue polling phase. I'm not sure whether the queue process will necessarily be alive when all workers have completed. Do you think my code is correct from that point of view?
As far as "how to correctly handle exceptions", which is your main question:
First, in your case, you will never get to execute pool.close and pool.join. But pool.map will not return until all the submitted tasks have returned their results or generated an exception, so you really don't need to call these to be sure that all of your submitted tasks have completed. If it weren't for worker function f writing the results to a queue, you would never be able to get any results back using map as long as any of your tasks resulted in an exception. You would instead have to submit individual tasks with apply_async and get an AsyncResult instance for each one.
So I would say that a better way of handling exceptions in your worker functions, without having to resort to a queue, would be as follows. But note that with apply_async tasks are submitted one at a time, which can result in many shared-memory accesses. This only really becomes a performance issue when the number of tasks being submitted is very large. In that case it would be better for the worker functions to handle exceptions themselves and pass back an error indication somehow, allowing the use of map or imap, where you can specify a chunksize.
When using a queue, be aware that writing to a managed queue has a fair bit of overhead. The second piece of code shows how you can reduce that overhead a bit by using a multiprocessing.Queue instance, which, unlike the managed queue, does not use a proxy. Note the output order: it is not the order in which the tasks were submitted but the order in which they completed -- another potential downside (or upside) of using a queue (you can also use a callback with apply_async if you want the results as they complete). Even with your original code, you should not depend on the order of results in the queue.
from multiprocessing import Pool

def f(x):
    if x == 4:
        raise Exception("Error")
    return x

if __name__ == '__main__':
    pool = Pool(2)
    results = [pool.apply_async(f, args=(x,)) for x in range(6)]
    for x, result in enumerate(results):  # result is an AsyncResult instance
        try:
            return_value = result.get()
        except:
            print(f'An error occurred for x = {x}')
        else:
            print(f'For x = {x} the return value is {return_value}')
Prints:
For x = 0 the return value is 0
For x = 1 the return value is 1
For x = 2 the return value is 2
For x = 3 the return value is 3
An error occurred for x = 4
For x = 5 the return value is 5
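If you do want the results in completion order without a queue, the callback mentioned above could look something like this (a sketch, not taken from the original code):
from multiprocessing import Pool

def f(x):
    if x == 4:
        raise Exception("Error")
    return x

def on_result(value):
    # called in the main process, in the order tasks complete
    print(f'Completed with return value {value}')

def on_error(exc):
    print(f'A task failed: {exc}')

if __name__ == '__main__':
    pool = Pool(2)
    for x in range(6):
        pool.apply_async(f, args=(x,), callback=on_result, error_callback=on_error)
    pool.close()
    pool.join()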
OP's Original Code Modified to Use multiprocessing.Queue
from multiprocessing import Pool, Queue

def init_pool(q):
    global queue
    queue = q

def f(x):
    if x == 4:
        raise Exception("Error")
    queue.put_nowait(x)

if __name__ == '__main__':
    queue = Queue()
    pool = Pool(2, initializer=init_pool, initargs=(queue,))
    try:
        pool.map(f, range(6))
    except:
        print("An error occurred")
    while not queue.empty():
        print("Output => " + str(queue.get()))
Prints:
An error occurred
Output => 0
Output => 2
Output => 3
Output => 1
Output => 5
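And for the case mentioned earlier where the task count is very large, here is a sketch of the workers handling exceptions themselves so that map/imap with a chunksize can still be used (the (result, error) tuple convention is just one possible choice, not part of the original code):
from multiprocessing import Pool

def f(x):
    try:
        if x == 4:
            raise Exception("Error")
        return x, None                  # (result, no error)
    except Exception as e:
        return None, f'x = {x}: {e}'    # (no result, error description)

if __name__ == '__main__':
    pool = Pool(2)
    # chunksize batches task submission, reducing per-task overhead
    for result, error in pool.imap(f, range(6), chunksize=3):
        if error is not None:
            print('An error occurred for', error)
        else:
            print('Return value:', result)
    pool.close()
    pool.join()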
I have a queue of 500 processes that I want to run through a Python script, and I want to run N of them in parallel at a time.
What my python script does so far:
It runs N processes in parallel, waits for all of them to terminate, then runs the next N files.
What I need to do:
When one of the N processes is finished, another process from the queue is automatically started, without waiting for the rest of the processes to terminate.
Note: I do not know how much time each process will take, so I can't schedule a process to run at a particular time.
Following is the code that I have.
I am currently using subprocess.Popen, but I'm not limited to its use.
for i in range(0, len(queue), N):
    batch = []
    for _ in range(int(jobs)):
        batch.append(queue.pop(0))
    for process in batch:
        p = subprocess.Popen([process])
        ps.append(p)
    for p in ps:
        p.communicate()
I believe this should work:
import subprocess
import time

def check_for_done(l):
    for i, p in enumerate(l):
        if p.poll() is not None:
            return True, i
    return False, False

processes = list()   # currently running processes
N = 5
queue = list()       # your queue of commands to run

for process in queue:
    p = subprocess.Popen(process)
    processes.append(p)
    if len(processes) == N:
        wait = True
        while wait:
            done, num = check_for_done(processes)
            if done:
                processes.pop(num)
                wait = False
            else:
                time.sleep(0.5)  # set this so the CPU does not go crazy
So you have a list of active processes, and the check_for_done function loops through it; poll() returns None if a subprocess is not finished and a return code once it is. When a return code comes back, the process is done (though you don't know whether it was successful). You then remove that process from the list, which allows the loop to add another one.
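One detail the snippet above leaves out is the final batch: once the for loop over queue ends, any processes still running are never waited on. A small addition (a sketch, to be appended after the loop) could be:
# after the main loop: wait for whatever is still running in the last batch
for p in processes:
    p.wait()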
Assuming Python 3, you could make use of ThreadPoolExecutor from concurrent.futures like this:
$ cat run.py
from subprocess import Popen, PIPE
from concurrent.futures import ThreadPoolExecutor
import time

def exec_(cmd):
    proc = Popen(cmd, stdout=PIPE, stderr=PIPE)
    stdout, stderr = proc.communicate()
    #print(stdout, stderr)

def main():
    with ThreadPoolExecutor(max_workers=4) as executor:
        # to demonstrate it will take a batch of 4 jobs at the same time
        cmds = [['sleep', '4'] for i in range(10)]
        start = time.time()
        futures = executor.map(exec_, cmds)
        for future in futures:
            pass
        end = time.time()
        print(f'Took {end-start} seconds')

if __name__ == '__main__':
    main()
This will process 4 tasks at a time, and since there are 10 tasks, it should take only around 4 + 4 + 4 = 12 seconds:
First 4 seconds for the first 4 tasks
Second 4 seconds for the next 4 tasks
Final 4 seconds for the 2 remaining tasks
Output:
$ python run.py
Took 12.005989074707031 seconds
For a web-scraping analysis I need two loops that run permanently: one returns a list of websites, updated every x minutes, while the other analyses the sites (old and new ones) every y seconds. The following code construction exemplifies what I am trying to do, but it doesn't work. (The code has been edited to incorporate answers and my own research.)
from multiprocessing import Process
import time, random
from threading import Lock
from collections import deque

class MyQueue(object):
    def __init__(self):
        self.items = deque()
        self.lock = Lock()

    def put(self, item):
        with self.lock:
            self.items.append(item)

    # Example pointed at in [this][1] answer
    def get(self):
        with self.lock:
            return self.items.popleft()

def a(queue):
    while True:
        x = [random.randint(0,10), random.randint(0,10), random.randint(0,10)]
        print 'send', x
        queue.put(x)
        time.sleep(10)

def b(queue):
    try:
        while queue:
            x = queue.get()
            print 'receive', x
            for i in x:
                print i
                time.sleep(2)
    except IndexError:
        print queue.get()

if __name__ == '__main__':
    q = MyQueue()
    p1 = Process(target=a, args=(q,))
    p2 = Process(target=b, args=(q,))
    p1.start()
    p2.start()
    p1.join()
    p2.join()
So, this is my first Python project after an online introduction course and I am struggling here big time. I understand now that the functions don't truly run in parallel, as b does not start until a is finished (I used this answer and tinkered with the timer and while True). EDIT: Even after using the approach given in the answer, I think this is still the case, because queue.get() throws an IndexError saying the deque is empty. I can only explain that by process a not finishing, because when I print queue.get()
immediately after .put(x) it is not empty.
I eventually want an output like this:
send [3,4,6]
3
4
6
3
4
send [3,8,6,5] # the code above always gives 3 entries, but in my project
3 #the length varies
8
6
5
3
8
6
.
.
What do I need in order to have two truly parallel loops, where one returns an updated list every x minutes that the other loop needs as the basis for its analysis? Is Process really the right tool here?
And where can I get good information about designing such a program?
I did something a little like this a while ago. I think using Process is the correct approach, but if you want to pass data between processes you should probably use a Queue.
https://docs.python.org/2/library/multiprocessing.html#exchanging-objects-between-processes
Create the queue first and pass it into both processes. One can write to it, the other can read from it.
One issue I remember is that the reading process will block on the queue until something is pushed to it, so you may need to push a special 'terminate' message of some kind to the queue when process 1 is done, so that process 2 knows to stop (a sketch of that idea follows the example below).
EDIT: Here is a simple example. It doesn't include a clean way to stop the processes, but it shows how you can start 2 new processes and pass data from one to the other. Since the queue blocks on get(), function b will automatically wait for data from a before continuing.
from multiprocessing import Process, Queue
import time, random

def a(queue):
    while True:
        x = [random.randint(0,10), random.randint(0,10), random.randint(0,10)]
        print 'send', x
        queue.put(x)
        time.sleep(5)

def b(queue):
    x = []
    while True:
        time.sleep(1)
        try:
            x = queue.get(False)
            print 'receive', x
        except:
            pass
        for i in x:
            print i

if __name__ == '__main__':
    q = Queue()
    p1 = Process(target=a, args=(q,))
    p2 = Process(target=b, args=(q,))
    p1.start()
    p2.start()
    p1.join()
    p2.join()
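As noted above, the example has no clean way to stop the processes. Here is a sketch of the 'terminate' message idea, using None as a sentinel (the fixed batch count is only there to make the example finite):
from multiprocessing import Process, Queue
import time, random

SENTINEL = None   # special 'terminate' message

def a(queue, n_batches):
    for _ in range(n_batches):
        x = [random.randint(0, 10) for _ in range(3)]
        print('send', x)
        queue.put(x)
        time.sleep(5)
    queue.put(SENTINEL)   # tell the consumer we are done

def b(queue):
    while True:
        x = queue.get()     # blocks until something arrives
        if x is SENTINEL:
            break           # producer is finished, stop cleanly
        print('receive', x)
        for i in x:
            print(i)

if __name__ == '__main__':
    q = Queue()
    p1 = Process(target=a, args=(q, 3))
    p2 = Process(target=b, args=(q,))
    p1.start()
    p2.start()
    p1.join()
    p2.join()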
The following code works, but it is very slow due to passing the large data sets. In the actual implementation, the time it takes to create the processes and send the data is almost the same as the calculation time, so by the time the second process is created, the first process is almost finished with its calculation, making parallelization pointless.
The code is the same as in this question, Multiprocessing has cutoff at 992 integers being joined as result, with the suggested change working and implemented below. However, I ran into the same problem as others, which I assume is the pickling of large data taking a long time.
I see answers using multiprocessing.Array to pass a shared-memory array. I have an array of ~4000 indexes, but each index holds a dictionary with 200 key/value pairs. The data is only read by each process, some calculation is done, and then a matrix (4000x3, with no dicts) is returned.
Answers like Is shared readonly data copied to different processes for Python multiprocessing? use map. Is it possible to keep the system below and implement shared memory? Is there an efficient way to send the data to each process as an array of dicts, such as wrapping the dicts in some manager and then putting that inside a multiprocessing.Array?
import multiprocessing

def main():
    data = {}
    total = []
    for j in range(0,3000):
        total.append(data)
        for i in range(0,200):
            data[str(i)] = i

    CalcManager(total,start=0,end=3000)

def CalcManager(myData,start,end):
    print 'in calc manager'

    #Multi processing
    #Set the number of processes to use.
    nprocs = 3
    #Initialize the multiprocessing queue so we can get the values returned to us
    tasks = multiprocessing.JoinableQueue()
    result_q = multiprocessing.Queue()
    #Setup an empty array to store our processes
    procs = []

    #Divide up the data for the set number of processes
    interval = (end-start)/nprocs
    new_start = start

    #Create all the processes while dividing the work appropriately
    for i in range(nprocs):
        print 'starting processes'
        new_end = new_start + interval
        #Make sure we dont go past the size of the data
        if new_end > end:
            new_end = end
        #Generate a new process and pass it the arguments
        data = myData[new_start:new_end]
        #Create the processes and pass the data and the result queue
        p = multiprocessing.Process(target=multiProcess,args=(data,new_start,new_end,result_q,i))
        procs.append(p)
        p.start()

        #Increment our next start to the current end
        new_start = new_end+1

    print 'finished starting'

    #Print out the results
    for i in range(nprocs):
        result = result_q.get()
        print result

    #Join the processes to wait for all data/processes to be finished
    for p in procs:
        p.join()

#MultiProcess Handling
def multiProcess(data,start,end,result_q,proc_num):
    print 'started process'

    results = []
    temp = []
    for i in range(0,22):
        results.append(temp)
        for j in range(0,3):
            temp.append(j)

    result_q.put(results)
    return

if __name__ == '__main__':
    main()
Solved
By just putting the list of dictionaries into a manager, the problem was solved:
manager=Manager()
d=manager.list(myData)
It seems that the manager holding the list also manages the dicts contained in that list. The startup time is a bit slow, so it seems the data is still being copied, but it's done once at the beginning, and then the data is sliced inside each process.
import multiprocessing
import multiprocessing.sharedctypes as mt
from multiprocessing import Process, Lock, Manager
from ctypes import Structure, c_double

def main():
    data = {}
    total = []
    for j in range(0,3000):
        total.append(data)
        for i in range(0,100):
            data[str(i)] = i

    CalcManager(total,start=0,end=500)

def CalcManager(myData,start,end):
    print 'in calc manager'
    print type(myData[0])

    manager = Manager()
    d = manager.list(myData)

    #Multi processing
    #Set the number of processes to use.
    nprocs = 3
    #Initialize the multiprocessing queue so we can get the values returned to us
    tasks = multiprocessing.JoinableQueue()
    result_q = multiprocessing.Queue()
    #Setup an empty array to store our processes
    procs = []

    #Divide up the data for the set number of processes
    interval = (end-start)/nprocs
    new_start = start

    #Create all the processes while dividing the work appropriately
    for i in range(nprocs):
        new_end = new_start + interval
        #Make sure we dont go past the size of the data
        if new_end > end:
            new_end = end
        #Generate a new process and pass it the arguments
        data = myData[new_start:new_end]
        #Create the processes and pass the data and the result queue
        p = multiprocessing.Process(target=multiProcess,args=(d,new_start,new_end,result_q,i))
        procs.append(p)
        p.start()

        #Increment our next start to the current end
        new_start = new_end+1

    print 'finished starting'

    #Print out the results
    for i in range(nprocs):
        result = result_q.get()
        print len(result)

    #Join the processes to wait for all data/processes to be finished
    for p in procs:
        p.join()

#MultiProcess Handling
def multiProcess(data,start,end,result_q,proc_num):
    #print 'started process'
    results = []
    temp = []

    data = data[start:end]
    for i in range(0,22):
        results.append(temp)
        for j in range(0,3):
            temp.append(j)

    print len(data)
    result_q.put(results)
    return

if __name__ == '__main__':
    main()
You may see some improvement by using a multiprocessing.Manager to store your list in a manager server, and having each child process access items from the dict by pulling them from that one shared list, rather than copying slices to each child process:
def CalcManager(myData,start,end):
    print 'in calc manager'
    print type(myData[0])

    manager = Manager()
    d = manager.list(myData)

    nprocs = 3
    result_q = multiprocessing.Queue()
    procs = []

    interval = (end-start)/nprocs
    new_start = start

    for i in range(nprocs):
        new_end = new_start + interval
        if new_end > end:
            new_end = end
        p = multiprocessing.Process(target=multiProcess,
                                    args=(d, new_start, new_end, result_q, i))
        procs.append(p)
        p.start()

        #Increment our next start to the current end
        new_start = new_end+1

    print 'finished starting'

    for i in range(nprocs):
        result = result_q.get()
        print len(result)

    #Join the processes to wait for all data/processes to be finished
    for p in procs:
        p.join()
This copies your entire data list to a Manager process prior to creating any of your workers. The Manager returns a Proxy object that allows shared access to the list. You then just pass the Proxy to the workers, which means their startup time will be greatly reduced, since there's no longer any need to copy slices of the data list. The downside here is that accessing the list will be slower in the children, since the access needs to go to the manager process via IPC. Whether or not this will really help performance depends very much on exactly what work you're doing on the list in your worker processes, but it's worth a try, since it requires very few code changes.
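The worker side is not shown above; a minimal sketch of what it might look like if the children pull items directly from the shared list proxy instead of receiving a copied slice (len(item) here is just a stand-in for the real per-dict calculation):
def multiProcess(data, start, end, result_q, proc_num):
    # 'data' is the Manager list proxy; each index access is an IPC
    # round-trip to the manager process
    results = []
    for i in range(start, end):
        item = data[i]              # fetch one dict from the shared list
        results.append(len(item))   # stand-in for the real calculation
    result_q.put(results)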
Looking at your question, I assume the following:
For each item in myData, you want to return an output (a matrix of some sort)
You created a JoinableQueue (tasks), probably for holding the input, but are not sure how to use it
The Code
import logging
import multiprocessing

def create_logger(logger_name):
    ''' Create a logger that logs to the console '''
    logger = logging.getLogger(logger_name)
    logger.setLevel(logging.DEBUG)

    # create console handler and set appropriate level
    ch = logging.StreamHandler()
    formatter = logging.Formatter("%(processName)s %(funcName)s() %(levelname)s: %(message)s")
    ch.setFormatter(formatter)
    logger.addHandler(ch)
    return logger

def main():
    global logger
    logger = create_logger(__name__)
    logger.info('Main started')

    data = []
    for i in range(0,100):
        data.append({str(i):i})

    CalcManager(data,start=0,end=50)
    logger.info('Main ended')

def CalcManager(myData,start,end):
    logger.info('CalcManager started')

    #Initialize the multiprocessing queue so we can get the values returned to us
    tasks = multiprocessing.JoinableQueue()
    results = multiprocessing.Queue()

    # Add tasks
    for i in range(start, end):
        tasks.put(myData[i])

    # Create processes to do work
    nprocs = 3
    for i in range(nprocs):
        logger.info('starting processes')
        p = multiprocessing.Process(target=worker,args=(tasks,results))
        p.daemon = True
        p.start()

    # Wait for tasks completion, i.e. tasks queue is empty
    try:
        tasks.join()
    except KeyboardInterrupt:
        logger.info('Cancel tasks')

    # Print out the results
    print 'RESULTS'
    while not results.empty():
        result = results.get()
        print result

    logger.info('CalcManager ended')

def worker(tasks, results):
    while True:
        try:
            task = tasks.get()   # one row of input
            task['done'] = True  # simulate work being done
            results.put(task)    # Save the result to the output queue
        finally:
            # JoinableQueue: for every get(), we need a task_done()
            tasks.task_done()

if __name__ == '__main__':
    main()
Discussion
For a multi-process situation, I recommend using the logging module, as it offers a few advantages:
It is thread- and process-safe, meaning you won't have situations where the output of different processes mingles together
You can configure logging to show the process name and function name -- very handy for debugging
CalcManager is essentially a task manager which does the following
Creates three processes
Populates the input queue, tasks
Waits for task completion
Prints out the results
Note that when creating the processes, I mark them as daemon, meaning they will be killed when the main program exits, so you don't have to worry about killing them yourself
worker is where the work is done
Each of them runs forever (while True loop)
Each time through the loop, they will get one unit of input, do some processing, then put the result in the output queue
After a task is done, it calls task_done() so that the main process knows when all jobs are done. I put task_done() in the finally clause to ensure it runs even if an error occurs during processing
After looking over the Thread/Queue model, I would like to ask how to create something in a similar fashion:
I have, for example, 100 items that need processing, but I want to process them 20 at a time -- so there are 20 slots, and once a slot is emptied (its item has finished processing) the next item is seated in it and starts processing.
The limit doesn't need to be 20; it could be adjustable (30, 50, etc.).
Thank you for your suggestions and answers!
Use multiprocessing:
import multiprocessing
import time

def process(x):
    time.sleep(1)
    return x

if __name__ == '__main__':
    jobs = range(100)

    pool = multiprocessing.Pool(20)
    for result in pool.imap_unordered(process, jobs):
        print(result)

    pool.close()
    pool.join()