The following code works, but it is very slow because of the cost of passing the large data sets. In the actual implementation, the time it takes to create a process and send it the data is about the same as the calculation time, so by the time the second process is created, the first process is almost finished with its calculation, making parallelization pointless.
The code is the same as in this question, Multiprocessing has cutoff at 992 integers being joined as result, with the suggested change implemented and working below. However, I ran into the same common problem as others, which I assume is pickling large data taking a long time.
I see answers using multiprocessing.Array to pass a shared-memory array. I have an array of ~4000 indexes, but each index holds a dictionary with 200 key/value pairs. The data is just read by each process, some calculation is done, and then a matrix (4000x3, with no dicts) is returned.
Answers like this one, Is shared readonly data copied to different processes for Python multiprocessing?, use map. Is it possible to keep the system below and still implement shared memory? Is there an efficient way to send the data to each process when it is an array of dicts, such as wrapping each dict in some manager and then putting that inside a multiprocessing.Array?
import multiprocessing

def main():
    data = {}
    total = []
    for j in range(0,3000):
        total.append(data)
        for i in range(0,200):
            data[str(i)] = i
    CalcManager(total,start=0,end=3000)

def CalcManager(myData,start,end):
    print 'in calc manager'

    #Multi processing
    #Set the number of processes to use.
    nprocs = 3
    #Initialize the multiprocessing queue so we can get the values returned to us
    tasks = multiprocessing.JoinableQueue()
    result_q = multiprocessing.Queue()
    #Setup an empty array to store our processes
    procs = []

    #Divide up the data for the set number of processes
    interval = (end-start)/nprocs
    new_start = start

    #Create all the processes while dividing the work appropriately
    for i in range(nprocs):
        print 'starting processes'
        new_end = new_start + interval
        #Make sure we don't go past the size of the data
        if new_end > end:
            new_end = end
        #Generate a new process and pass it the arguments
        data = myData[new_start:new_end]
        #Create the processes and pass the data and the result queue
        p = multiprocessing.Process(target=multiProcess,args=(data,new_start,new_end,result_q,i))
        procs.append(p)
        p.start()
        #Increment our next start to the current end
        new_start = new_end+1
    print 'finished starting'

    #Print out the results
    for i in range(nprocs):
        result = result_q.get()
        print result

    #Join the processes to wait for all data/processes to be finished
    for p in procs:
        p.join()

#MultiProcess Handling
def multiProcess(data,start,end,result_q,proc_num):
    print 'started process'

    results = []
    temp = []
    for i in range(0,22):
        results.append(temp)
        for j in range(0,3):
            temp.append(j)

    result_q.put(results)
    return

if __name__== '__main__':
    main()
Solved
By putting the list of dictionaries into a Manager, the problem was solved.
manager=Manager()
d=manager.list(myData)
It seems that the manager holding the list also manages the dicts contained in that list. The startup time is a bit slow, so it seems the data is still being copied, but it's done once at the beginning, and then inside each process the data is sliced.
import multiprocessing
import multiprocessing.sharedctypes as mt
from multiprocessing import Process, Lock, Manager
from ctypes import Structure, c_double

def main():
    data = {}
    total = []
    for j in range(0,3000):
        total.append(data)
        for i in range(0,100):
            data[str(i)] = i
    CalcManager(total,start=0,end=500)

def CalcManager(myData,start,end):
    print 'in calc manager'
    print type(myData[0])

    manager = Manager()
    d = manager.list(myData)

    #Multi processing
    #Set the number of processes to use.
    nprocs = 3
    #Initialize the multiprocessing queue so we can get the values returned to us
    tasks = multiprocessing.JoinableQueue()
    result_q = multiprocessing.Queue()
    #Setup an empty array to store our processes
    procs = []

    #Divide up the data for the set number of processes
    interval = (end-start)/nprocs
    new_start = start

    #Create all the processes while dividing the work appropriately
    for i in range(nprocs):
        new_end = new_start + interval
        #Make sure we don't go past the size of the data
        if new_end > end:
            new_end = end
        #Generate a new process and pass it the arguments
        data = myData[new_start:new_end]
        #Create the processes and pass the data and the result queue
        p = multiprocessing.Process(target=multiProcess,args=(d,new_start,new_end,result_q,i))
        procs.append(p)
        p.start()
        #Increment our next start to the current end
        new_start = new_end+1
    print 'finished starting'

    #Print out the results
    for i in range(nprocs):
        result = result_q.get()
        print len(result)

    #Join the processes to wait for all data/processes to be finished
    for p in procs:
        p.join()

#MultiProcess Handling
def multiProcess(data,start,end,result_q,proc_num):
    #print 'started process'

    results = []
    temp = []
    data = data[start:end]
    for i in range(0,22):
        results.append(temp)
        for j in range(0,3):
            temp.append(j)

    print len(data)
    result_q.put(results)
    return

if __name__ == '__main__':
    main()
You may see some improvement by using a multiprocessing.Manager to store your list in a manager server, and having each child process access items by pulling them from that one shared list, rather than copying slices to each child process:
def CalcManager(myData,start,end):
    print 'in calc manager'
    print type(myData[0])

    manager = Manager()
    d = manager.list(myData)

    nprocs = 3
    result_q = multiprocessing.Queue()
    procs = []

    interval = (end-start)/nprocs
    new_start = start

    for i in range(nprocs):
        new_end = new_start + interval
        if new_end > end:
            new_end = end
        p = multiprocessing.Process(target=multiProcess,
                                    args=(d, new_start, new_end, result_q, i))
        procs.append(p)
        p.start()
        #Increment our next start to the current end
        new_start = new_end+1
    print 'finished starting'

    for i in range(nprocs):
        result = result_q.get()
        print len(result)

    #Join the processes to wait for all data/processes to be finished
    for p in procs:
        p.join()
This copies your entire data list to a Manager process prior to creating any of your workers. The Manager returns a Proxy object that allows shared access to the list. You then just pass the Proxy to the workers, which means their startup time will be greatly reduced, since there's no longer any need to copy slices of the data list. The downside here is that accessing the list will be slower in the children, since each access needs to go to the manager process via IPC. Whether or not this will really help performance depends heavily on exactly what work you're doing on the list in your worker processes, but it's worth a try, since it requires very few code changes.
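For illustration, here is a minimal sketch, assuming the worker signature from the question and using a placeholder calculation, of how a worker can copy its slice out of the Manager proxy once (a single IPC round-trip) instead of going back to the manager for every item:
def multiProcess(d, start, end, result_q, proc_num):
    local_data = d[start:end]   # one trip to the manager process
    results = []
    for item in local_data:     # item is now a plain dict; no further IPC
        # placeholder for the real per-item calculation producing a 3-element row
        results.append([proc_num, len(item), 0])
    result_q.put(results)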
Looking at your question, I assume the following:
For each item in myData, you want to return an output (a matrix of some sort)
You created a JoinableQueue (tasks), probably for holding the input, but weren't sure how to use it
The Code
import logging
import multiprocessing

def create_logger(logger_name):
    ''' Create a logger that logs to the console '''
    logger = logging.getLogger(logger_name)
    logger.setLevel(logging.DEBUG)

    # create console handler and set appropriate level
    ch = logging.StreamHandler()
    formatter = logging.Formatter("%(processName)s %(funcName)s() %(levelname)s: %(message)s")
    ch.setFormatter(formatter)
    logger.addHandler(ch)
    return logger

def main():
    global logger
    logger = create_logger(__name__)
    logger.info('Main started')

    data = []
    for i in range(0,100):
        data.append({str(i):i})

    CalcManager(data,start=0,end=50)
    logger.info('Main ended')

def CalcManager(myData,start,end):
    logger.info('CalcManager started')
    #Initialize the multiprocessing queue so we can get the values returned to us
    tasks = multiprocessing.JoinableQueue()
    results = multiprocessing.Queue()

    # Add tasks
    for i in range(start, end):
        tasks.put(myData[i])

    # Create processes to do work
    nprocs = 3
    for i in range(nprocs):
        logger.info('starting processes')
        p = multiprocessing.Process(target=worker,args=(tasks,results))
        p.daemon = True
        p.start()

    # Wait for tasks completion, i.e. tasks queue is empty
    try:
        tasks.join()
    except KeyboardInterrupt:
        logger.info('Cancel tasks')

    # Print out the results
    print 'RESULTS'
    while not results.empty():
        result = results.get()
        print result

    logger.info('CalcManager ended')

def worker(tasks, results):
    while True:
        try:
            task = tasks.get()   # one row of input
            task['done'] = True  # simulate work being done
            results.put(task)    # save the result to the output queue
        finally:
            # JoinableQueue: for every get(), we need a task_done()
            tasks.task_done()

if __name__== '__main__':
    main()
Discussion
For multiprocess situations, I recommend using the logging module, as it offers a few advantages:
It is thread- and process-safe, meaning you won't have situations where the output from different processes mingles together
You can configure logging to show the process name and function name--very handy for debugging
CalcManager is essentially a task manager which does the following
Creates three processes
Populate the input queue, tasks
Waits for the task completion
Prints out the result
Note that when creating the processes, I mark them as daemon processes, meaning they will be killed when the main program exits, so you don't have to worry about killing them (a sentinel-based alternative is sketched after this list)
worker is where the work is done
Each of them runs forever (while True loop)
Each time through the loop, they will get one unit of input, do some processing, then put the result in the output
After a task is done, it calls task_done() so that the main process knows when all jobs are done. I put task_done in the finally clause to ensure it will run even if an error occurred during processing
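If you would rather not rely on daemon processes, a hedged alternative (not part of the answer's code above) is to shut the workers down with sentinels: the manager puts one sentinel per worker onto the tasks queue after the real tasks, and each worker exits cleanly when it sees one.
SENTINEL = None                  # assumes None never appears as a real task

def worker(tasks, results):
    while True:
        task = tasks.get()
        try:
            if task is SENTINEL:
                break            # clean exit instead of a daemon kill
            task['done'] = True  # simulate work being done
            results.put(task)
        finally:
            tasks.task_done()    # runs even when we break on the sentinel

# In CalcManager, after putting the real tasks:
#     for _ in range(nprocs):
#         tasks.put(SENTINEL)
# and keep the Process objects in a list so they can be join()ed after tasks.join().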
Related
I'm working on an optimization problem, and you can see a simplified version of my code posted below (the original code is too complicated to post for such a question, and I hope my simplified code mirrors the original one as closely as possible).
My purpose:
use the function foo in the function optimization, but foo can take a very long time in some hard cases. So I use multiprocessing to set a time limit for the execution of the function (proc.join(iter_time); the method is from an answer to this question: How to limit execution time of a function call?).
My problem:
In the while loop, the value generated for extra is the same every time.
The list lst always has length 1, which means every iteration of the while loop starts from an empty list.
My guess: a possible reason is that each time I create a process, the random seed starts counting from the beginning, and each time the process is terminated, some garbage-collection mechanism cleans up the memory the process used, so the list is cleared.
My questions
Does anyone know the real reason for these problems?
If not with multiprocessing, is there any other way I can achieve my purpose while generating different random numbers? By the way, I have tried func_timeout, but it has other problems that I cannot handle...
import multiprocessing
import random
import time

random.seed(123)
lst = []  # a global list for logging data

def foo(epoch):
    ...
    extra = random.random()
    lst.append(epoch + extra)
    ...

def optimization(loop_time, iter_time):
    start = time.time()
    epoch = 0
    while time.time() <= start + loop_time:
        proc = multiprocessing.Process(target=foo, args=(epoch,))
        proc.start()
        proc.join(iter_time)
        if proc.is_alive():  # if the process is not terminated within the time limit
            print("Time out!")
            proc.terminate()

if __name__ == '__main__':
    optimization(300, 2)
You need to use shared memory if you want to share variables across processes, because child processes do not share their memory space with the parent. The simplest way to do that here is to use a managed list and delete the line where you set a fixed seed. The fixed seed is what causes the same number to be generated every time, because every child process starts from that same seed. To get different random numbers, either don't set a seed, or pass a different seed to each process:
import time, random
from multiprocessing import Manager, Process

def foo(epoch, lst):
    extra = random.random()
    lst.append(epoch + extra)

def optimization(loop_time, iter_time, lst):
    start = time.time()
    epoch = 0
    while time.time() <= start + loop_time:
        proc = Process(target=foo, args=(epoch, lst))
        proc.start()
        proc.join(iter_time)
        if proc.is_alive():  # if the process is not terminated within time limit
            print("Time out!")
            proc.terminate()
    print(lst)

if __name__ == '__main__':
    manager = Manager()
    lst = manager.list()
    optimization(10, 2, lst)
Output
[0.2035898948744943, 0.07617925389396074, 0.6416754412198231, 0.6712193790613651, 0.419777147554235, 0.732982735576982, 0.7137712131028766, 0.22875414425414997, 0.3181113880578589, 0.5613367673646847, 0.8699685474084119, 0.9005359611195111, 0.23695341111251134, 0.05994288664062197, 0.2306562314450149, 0.15575356275408125, 0.07435292814989103, 0.8542361251850187, 0.13139055891993145, 0.5015152768477814, 0.19864873743952582, 0.2313646288041601, 0.28992667535697736, 0.6265055915510219, 0.7265797043535446, 0.9202923318284002, 0.6321511834038631, 0.6728367262605407, 0.6586979597202935, 0.1309226720786667, 0.563889613032526, 0.389358766191921, 0.37260564565714316, 0.24684684162272597, 0.5982042933298861, 0.896663326233504, 0.7884030244369596, 0.6202229004466849, 0.4417549843477827, 0.37304274232635715, 0.5442716244427301, 0.9915536257041505, 0.46278512685707873, 0.4868394190894778, 0.2133187095154937]
Keep in mind that using managers will affect the performance of your code. As an alternative, you could also use multiprocessing.Array, which is faster than managers but less flexible in what data it can store, or Queues.
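As a rough illustration of that alternative (names and the loop count here are made up for the example), a fixed-size shared array can be written to directly by each child process, with no manager round-trips:
import random
from multiprocessing import Array, Process

def foo(epoch, arr):
    # plain shared-memory write; no manager/IPC round-trip involved
    arr[epoch] = epoch + random.random()

if __name__ == '__main__':
    n = 10                   # hypothetical number of iterations
    arr = Array('d', n)      # 'd' -> C double; the size must be fixed up front
    procs = [Process(target=foo, args=(e, arr)) for e in range(n)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
    print(list(arr))         # e.g. [0.53, 1.04, 2.77, ...]
The trade-off is visible in the sketch: the array's size and element type must be fixed up front, whereas the managed list can grow and hold arbitrary objects.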
import multiprocessing
import time

def WORK(x,q,it):
    for i in range(it):
        t = x + '---'+str(i)
        q.put(t)

def cons(q,cp):
    while not q.empty():
        cp.append(q.get())
    return q.put(cp)

if __name__ == '__main__':
    cp = []
    it = 600  # iterations

    start = time.perf_counter()

    q = multiprocessing.Queue()
    p1 = multiprocessing.Process(target = WORK, args = ('n',q,it))
    p2 = multiprocessing.Process(target=WORK, args=('x',q,it))
    p3 = multiprocessing.Process(target=cons, args=(q,cp,))

    p1.start()
    p2.start()
    p3.start()
    p1.join()
    p2.join()
    p3.join()

    print(q.get())

    end = time.perf_counter()
    print(end - start)
I encountered a problem running this code in PyCharm and Colab. In Colab it works fine only with 1000 iterations or fewer in the WORK() process; with more, it freezes.
In PyCharm it works fine only with 500 iterations or fewer.
What is the problem? Are there any limitations?
I found a (not very good) workaround: remove the join calls, or move them after the get() call on the queue. That raises the limit; with the code below it started to work with 1000 iterations in PyCharm, but 10000 iterations deadlocks again:
p1.join()
p2.join()
print(q.get())
p3.join()
end = time.perf_counter()
print(end - start)
A further change helped me increase the iteration limit to 10000: adding a maxsize to the queue:
q = multiprocessing.Queue(maxsize = 1000)
So what are the limitations and rules with these queues?
How do you manage an endless queue, for example one fed from websockets, which send data continuously?
You have several issues with your code. First, according to the documentation on multiprocessing.Queue, method empty is not reliable. So in function cons the statement while not q.empty(): is problematic. But even if method Queue.empty were reliable, you have here a race condition. You have started processes WORK and cons in parallel where the former is writing elements to a queue and the latter is reading until it finds the queue is empty. But if cons runs before WORK gets to write its first element, it will find the queue immediately empty and that is not your expected result. And as I mentioned in my comment above, you must not try to join a process that is writing to a queue before you have retrieved all of the records that process has written.
Another problem you have is you are passing to cons an empty list cp to which you keep on appending. But cons is a function belonging to a process running in a different address space and consequently the cp list it is appending to is not the same cp list as in the main process. Just be aware of this.
Finally, cons is writing its result to the same queue that it is reading from and consequently the main process is reading this result from that same queue. So we have another race condition: Once the main process has been modified not to read from this queue until after it has joined all the processes, the main process and cons are now both reading from the same queue in parallel. We now need a separate input and output queue so that there is no conflict. That solves this race condition.
To solve the first race condition, the WORK process should write a special sentinel record that serves as an end-of-records indicator. It could be the value None if None is not a valid normal record, or it could be any special object that cannot be mistaken for an actual record. Since we have two processes writing records to the same input queue for cons to read, we will end up with two sentinel records, which cons will have to be looking for to know that there are truly no more records left.
import multiprocessing
import time

SENTINEL = 'SENTINEL' # or None

def WORK(x, q, it):
    for i in range(it):
        t = x + '---' + str(i)
        q.put(t)
    q.put(SENTINEL) # show end of records

def cons(q_in, q_out, cp):
    # We are now looking for two end-of-records indicators, one per producer:
    for record in iter(q_in.get, SENTINEL):
        cp.append(record)
    for record in iter(q_in.get, SENTINEL):
        cp.append(record)
    q_out.put(cp)

if __name__ == '__main__':
    it = 600  # iterations

    start = time.perf_counter()

    q_in = multiprocessing.Queue()
    q_out = multiprocessing.Queue()

    p1 = multiprocessing.Process(target=WORK, args = ('n', q_in, it))
    p2 = multiprocessing.Process(target=WORK, args=('x', q_in, it))

    cp = []
    p3 = multiprocessing.Process(target=cons, args=(q_in, q_out, cp))

    p1.start()
    p2.start()
    p3.start()

    cp = q_out.get()
    print(len(cp))

    p1.join()
    p2.join()
    p3.join()

    end = time.perf_counter()
    print(end - start)
Prints:
1200
0.1717168
I'm trying Python multiprocessing, and I want to use a Lock to keep the 'es_id' values from overlapping.
According to theory and examples, when a process holds the lock, 'es_id' can't overlap because another process can't access it; but the results show that es_id often overlaps.
How can I keep the id values from overlapping?
Part of my code is:
def saveDB(imgName, imgType, imgStar, imgPull, imgTag, lock): #lock=Lock() in main
    imgName=NameFormat(imgName) #name/subname > name:subname
    i=0
    while i < len(imgName):
        lock.acquire() #since global es_id
        global es_id
        print "getIMG.pt:save information about %s"%(imgName[i])
        cmd="curl -XPUT http://localhost:9200/kimhk/imgName/"+str(es_id)+" -d '{" +\
            '"image_name":"'+imgName[i]+'", '+\
            '"image_type":"'+imgType[i]+'", '+\
            '"image_star":"'+imgStar[i]+'", '+\
            '"image_pull":"'+imgPull[i]+'", '+\
            '"image_Tag":"'+",".join(imgTag[i])+'"'+\
            "}'"
        try:
            subprocess.call(cmd,shell=True)
        except subprocess.CalledProcessError as e:
            print e.output
        i+=1
        es_id+=1
        lock.release()

...

#main
if __name__ == "__main__":
    lock = Lock()
    exPg, proc_num=option()
    procs=[]
    pages=[ [] for i in range(proc_num)]
    i=1

    #Use Multiprocessing to get HTML data quickly
    if proc_num >= exPg: #if page is less than proc_num, don't need to distribute the page to the process.
        while i<=exPg:
            page=i
            proc=Process(target=getExplore, args=(page,lock,))
            procs.append(proc)
            proc.start()
            i+=1
    else:
        while i<=exPg: #distribute the page to the process
            page=i
            index=(i-1)%proc_num #if proc_num=4 -> 0 1 2 3
            pages[index].append(page)
            i+=1
        i=0
        while i<proc_num:
            proc=Process(target=getExplore, args=(pages[i],lock,))
            procs.append(proc)
            proc.start()
            i+=1

    for proc in procs:
        proc.join()
Execution result screen (screenshot not included):
The result is the output of subprocess.call(cmd, shell=True). I use XPUT to add data to Elasticsearch, and es_id is the id of the data. I want these ids to increase sequentially without overlapping (because data will be overwritten if the ids overlap).
I know XPOST doesn't need the lock code because it automatically generates an id, but I need to access all the data sequentially in the future (like reading a file one line at a time).
If you know how to access all the data sequentially after using XPOST, can you tell me?
It looks like you are trying to access a global variable with a lock, but global variables are different instances between processes. What you need to use is a shared memory value. Here's a working example. It has been tested on Python 2.7 and 3.6:
from __future__ import print_function
import multiprocessing as mp

def process(counter):
    # Increment the counter 3 times.
    # Hold the counter's lock for read/modify/write operations.
    # Keep holding it so the value doesn't change before printing,
    # and keep prints from multiple processes from trying to write
    # to a line at the same time.
    for _ in range(3):
        with counter.get_lock():
            counter.value += 1
            print(mp.current_process().name, counter.value)

def main():
    counter = mp.Value('i') # shared integer
    processes = [mp.Process(target=process, args=(counter,)) for i in range(3)]
    for p in processes:
        p.start()
    for p in processes:
        p.join()

if __name__ == '__main__':
    main()
Output:
Process-2 1
Process-2 2
Process-1 3
Process-3 4
Process-2 5
Process-1 6
Process-3 7
Process-1 8
Process-3 9
You've only given part of your code, so I can only see a potential problem. It doesn't do any good to lock-protect one access to es_id. You must lock-protect them all, anywhere they occur in the program. Perhaps it is best to create an access function for this purpose, like:
def increment_es_id():
    global es_id
    lock.acquire()
    es_id += 1
    lock.release()
This can be called safely from any thread.
In your code, it's a good practice to move the acquire/release calls as close together as you can make them. Here you only need to protect one variable, so you can move the acquire/release pair to just before and after the es_id += 1 statement.
Even better is to use the lock in a context manager (although in this simple case it won't make any difference):
def increment_es_id2():
    global es_id
    with lock:
        es_id += 1
I am about to start on an endeavour with Python. The goal is to multithread different tasks and use queues to communicate between them. For the sake of clarity, I would like to be able to pass a queue to a sub-function and send information to the queue from there. Something like this:
from queue import Queue
from threading import Thread
import copy

# Object that signals shutdown
_sentinel = object()

# increment function
def increment(i, out_q):
    i += 1
    print(i)
    out_q.put(i)
    return

# A thread that produces data
def producer(out_q):
    i = 0
    while True:
        # Produce some data
        increment( i , out_q)
        if i > 5:
            out_q.put(_sentinel)
            break

# A thread that consumes data
def consumer(in_q):
    while True:
        # Get some data
        data = in_q.get()
        # Process the data
        # Check for termination
        if data is _sentinel:
            in_q.put(_sentinel)
            break

# Create the shared queue and launch both threads
q = Queue()
t1 = Thread(target=consumer, args=(q,))
t2 = Thread(target=producer, args=(q,))
t1.start()
t2.start()

# Wait for all produced items to be consumed
q.join()
Currently the output is a row of 0's, where I would like it to be the numbers 1 to 6. I have read about the difficulty of passing references in Python, but would like to clarify whether this is just not possible in Python, or whether I am looking at the issue wrongly.
The problem has nothing to do with the way the queues are passed; you're doing that right. The issue is actually related to how you're trying to increment i. Because variables in Python are passed by assignment, you have to actually return the incremented value of i back to the caller for the change you made inside increment to have any effect. Otherwise, you just rebind the local variable i inside of increment, and then i gets thrown away when increment completes.
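A tiny illustration of that point (separate from the answer's code, with made-up function names):
def increment_in_place(i):
    i += 1          # rebinds the local name only; the caller never sees this

def increment_and_return(i):
    return i + 1    # the caller must rebind: i = increment_and_return(i)

i = 0
increment_in_place(i)
print(i)            # still 0
i = increment_and_return(i)
print(i)            # now 1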
You can also simplify your consumer function a bit by using the iter built-in function, along with a for loop, to consume from the queue until _sentinel is reached, rather than a while True loop:
from queue import Queue
from threading import Thread
import copy

# Object that signals shutdown
_sentinel = object()

# increment function
def increment(i):
    i += 1
    return i

# A thread that produces data
def producer(out_q):
    i = 0
    while True:
        # Produce some data
        i = increment( i )
        print(i)
        out_q.put(i)
        if i > 5:
            out_q.put(_sentinel)
            break

# A thread that consumes data
def consumer(in_q):
    for data in iter(in_q.get, _sentinel):
        # Process the data
        pass

# Create the shared queue and launch both threads
q = Queue()
t1 = Thread(target=consumer, args=(q,))
t2 = Thread(target=producer, args=(q,))
t1.start()
t2.start()
Output:
1
2
3
4
5
6
This is a follow-up question to this one. User Will suggested using a queue, and I tried to implement that solution below. The solution works just fine with j=1000; however, it hangs as I try to scale to larger numbers. I am stuck here and cannot determine why it hangs. Any suggestions would be appreciated. Also, the code is starting to get ugly as I keep messing with it; I apologize for all the nested functions.
def run4(j):
    """
    a multicore approach using queues
    """
    from multiprocessing import Process, Queue, cpu_count
    import os

    def bazinga(uncrunched_queue, crunched_queue):
        """
        Pulls the next item off the queue, generates its collatz
        length and puts the result on the crunched queue
        """
        num = uncrunched_queue.get()
        while num != 'STOP': #Signal that there are no more numbers
            length = len(generateChain(num, []) )
            crunched_queue.put([num , length])
            num = uncrunched_queue.get()

    def consumer(crunched_queue):
        """
        A process to pull data off the queue and evaluate it
        """
        maxChain = 0
        biggest = 0
        while not crunched_queue.empty():
            a, b = crunched_queue.get()
            if b > maxChain:
                biggest = a
                maxChain = b
        print('%d has a chain of length %d' % (biggest, maxChain))

    uncrunched_queue = Queue()
    crunched_queue = Queue()
    numProcs = cpu_count()

    for i in range(1, j): #Load up the queue with our numbers
        uncrunched_queue.put(i)
    for i in range(numProcs): #put sufficient stops at the end of the queue
        uncrunched_queue.put('STOP')

    ps = []
    for i in range(numProcs):
        p = Process(target=bazinga, args=(uncrunched_queue, crunched_queue))
        p.start()
        ps.append(p)

    p = Process(target=consumer, args=(crunched_queue, ))
    p.start()
    ps.append(p)

    for p in ps: p.join()
You're putting 'STOP' poison pills into your uncrunched_queue (as you should), and having your producers shut down accordingly; on the other hand your consumer only checks for emptiness of the crunched queue:
while not crunched_queue.empty():
(this working at all depends on a race condition, btw, which is not good)
When you start throwing non-trivial work units at your bazinga producers, they take longer. If all of them take long enough, your crunched_queue dries up, and your consumer dies. I think you may be misidentifying what's happening - your program doesn't "hang", it just stops outputting stuff because your consumer is dead.
You need to implement a smarter methodology for shutting down your consumer. Either look for n poison pills, where n is the number of producers (who accordingly each toss one in the crunched_queue when they shut down), or use something like a Semaphore that counts up for each live producer and down when one shuts down.
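For example, here is a rough sketch of the poison-pill approach, reusing the names from your code (generateChain is your existing function and isn't defined here): each producer drops one 'STOP' into crunched_queue as it exits, and the consumer only quits after seeing one per producer.
def bazinga(uncrunched_queue, crunched_queue):
    for num in iter(uncrunched_queue.get, 'STOP'):
        length = len(generateChain(num, []))
        crunched_queue.put([num, length])
    crunched_queue.put('STOP')            # tell the consumer this producer is done

def consumer(crunched_queue, num_producers):
    maxChain = 0
    biggest = 0
    stops_seen = 0
    while stops_seen < num_producers:     # don't rely on queue.empty()
        item = crunched_queue.get()
        if item == 'STOP':
            stops_seen += 1
            continue
        a, b = item
        if b > maxChain:
            biggest, maxChain = a, b
    print('%d has a chain of length %d' % (biggest, maxChain))

# started with: Process(target=consumer, args=(crunched_queue, numProcs))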