I wrote a script in Python 3.6 initially using a for loop which called an API, then putting all results into a pandas dataframe and writing them to a SQL database. (approximately 9,000 calls are made to that API every time the script runs).
Realising the calls inside the for loop were processed one-by-one, I decided to use the multiprocessing module to speed things up.
Therefore, I created a module level function called parallel_requests and now I call that instead of having the for loop:
list_of_lists = multiprocessing.Pool(processes=4).starmap(parallel_requests, zip(....))
Side note: I use starmap instead of map only because my parallel_requests function takes multiple arguments which I need to zip.
The good: this approach works and is much faster.
The bad: this approach works but is too fast. By using 4 processes (I tried that because I have 4 cores), parallel_requests is getting executed too fast. More than 15 calls per second are made to the API, and I'm getting blocked by the API itself.
In fact, it only works if I use 1 or 2 processes, otherwise it's too damn fast.
Essentially what I want is to keep using 4 processes, but also to limit the execution of my parallel_requests function to only 15 times per second overall.
Is there any parameter of multiprocessing.Pool that would help with this, or it's more complicated than that?
For this case I'd use a leaky bucket. You can have one process that fills a queue at the proscribed rate, with a maximum size that indicates how many requests you can "bank" if you don't make them at the maximum rate; the worker processes then just need to get from the queue before doing its work.
import time
def make_api_request(this, that, rate_queue):
rate_queue.get()
print("DEBUG: doing some work at {}".format(time.time()))
return this * that
def throttler(rate_queue, interval):
try:
while True:
if not rate_queue.full(): # avoid blocking
rate_queue.put(0)
time.sleep(interval)
except BrokenPipeError:
# main process is done
return
if __name__ == '__main__':
from multiprocessing import Pool, Manager, Process
from itertools import repeat
rq = Manager().Queue(maxsize=15) # conservative; no banking
pool = Pool(4)
Process(target=throttler, args=(rq, 1/15.)).start()
pool.starmap(make_api_request, zip(range(100), range(100, 200), repeat(rq)))
I'll look at the ideas posted here, but in the meantime I've just used a simple approach of opening and closing a Pool of 4 processes for every 15 requests and appending all the results in a list_of_lists.
Admittedly, not the best approach, since it takes time/resources to open/close a Pool, but it was the most handy solution for now.
# define a generator for use below
def chunks(l, n):
"""Yield successive n-sized chunks from l."""
for i in range(0, len(l), n):
yield l[i:i + n]
list_of_lists = []
for current_chunk in chunks(all_data, 15): # 15 is the API's limit of requests per second
pool = multiprocessing.Pool(processes=4)
res = pool.starmap(parallel_requests, zip(current_chunk, [to_symbol]*len(current_chunk), [query]*len(current_chunk), [start]*len(current_chunk), [stop]*len(current_chunk)) )
sleep(1) # Sleep for 1 second after every 15 API requests
list_of_lists.extend(res)
pool.close()
flatten_list = [item for sublist in list_of_lists for item in sublist] # use this to construct a `pandas` dataframe
PS: This solution is really not at all that fast due to the multiple opening/closing of pools. Thanks Nathan VÄ“rzemnieks for suggesting to open just one pool, it's much faster, plus your processor won't look like it's running a stress test.
One way to do is to use Queue, which can share details about api-call timestamps with other processes.
Below is an example how this could work. It takes the oldest entry in queue, and if it is younger than one second, sleep functions is called for the duration of the difference.
from multiprocessing import Pool, Manager, queues
from random import randint
import time
MAX_CONNECTIONS = 10
PROCESS_COUNT = 4
def api_request(a, b):
time.sleep(randint(1, 9) * 0.03) # simulate request
return a, b, time.time()
def parallel_requests(a, b, the_queue):
try:
oldest = the_queue.get()
time_difference = time.time() - oldest
except queues.Empty:
time_difference = float("-inf")
if 0 < time_difference < 1:
time.sleep(1-time_difference)
else:
time_difference = 0
print("Current time: ", time.time(), "...after sleeping:", time_difference)
the_queue.put(time.time())
return api_request(a, b)
if __name__ == "__main__":
m = Manager()
q = m.Queue(maxsize=MAX_CONNECTIONS)
for _ in range(0, MAX_CONNECTIONS): # Fill the queue with zeroes
q.put(0)
p = Pool(PROCESS_COUNT)
# Create example data
data_length = 100
data1 = range(0, data_length) # Just some dummy-data
data2 = range(100, data_length+100) # Just some dummy-data
queue_iterable = [q] * (data_length+1) # required for starmap -function
list_of_lists = p.starmap(parallel_requests, zip(data1, data2, queue_iterable))
print(list_of_lists)
Related
I'm working on an optimization problem, and you can see a simplified version of my code posted below (the origin code is too complicated for asking such a question, and I hope my simplified code has simulated the original one as much as possible).
My purpose:
use the function foo in the function optimization, but foo can take very long time due to some hard situations. So I use multiprocessing to set a time limit for execution of the function (proc.join(iter_time), the method is from an anwser from this question; How to limit execution time of a function call?).
My problem:
In the while loop, every time the generated values for extra are the same.
The list lst's length is always 1, which means in every iteration in the while loop it starts from an empty list.
My guess: possible reason can be each time I create a process the random seed is counting from the beginning, and each time the process is terminated, there could be some garbage collection mechanism to clean the memory the processused, so the list is cleared.
My question
Anyone know the real reason of such problems?
if not using multiprocessing, is there anyway else that I can realize my purpose while generate different random numbers? btw I have tried func_timeout but it has other problems that I cannot handle...
random.seed(123)
lst = [] # a global list for logging data
def foo(epoch):
...
extra = random.random()
lst.append(epoch + extra)
...
def optimization(loop_time, iter_time):
start = time.time()
epoch = 0
while time.time() <= start + loop_time:
proc = multiprocessing.Process(target=foo, args=(epoch,))
proc.start()
proc.join(iter_time)
if proc.is_alive(): # if the process is not terminated within time limit
print("Time out!")
proc.terminate()
if __name__ == '__main__':
optimization(300, 2)
You need to use shared memory if you want to share variables across processes. This is because child processes do not share their memory space with the parent. Simplest way to do this here would be to use managed lists and delete the line where you set a number seed. This is what is causing same number to be generated because all child processes will take the same seed to generate the random numbers. To get different random numbers either don't set a seed, or pass a different seed to each process:
import time, random
from multiprocessing import Manager, Process
def foo(epoch, lst):
extra = random.random()
lst.append(epoch + extra)
def optimization(loop_time, iter_time, lst):
start = time.time()
epoch = 0
while time.time() <= start + loop_time:
proc = Process(target=foo, args=(epoch, lst))
proc.start()
proc.join(iter_time)
if proc.is_alive(): # if the process is not terminated within time limit
print("Time out!")
proc.terminate()
print(lst)
if __name__ == '__main__':
manager = Manager()
lst = manager.list()
optimization(10, 2, lst)
Output
[0.2035898948744943, 0.07617925389396074, 0.6416754412198231, 0.6712193790613651, 0.419777147554235, 0.732982735576982, 0.7137712131028766, 0.22875414425414997, 0.3181113880578589, 0.5613367673646847, 0.8699685474084119, 0.9005359611195111, 0.23695341111251134, 0.05994288664062197, 0.2306562314450149, 0.15575356275408125, 0.07435292814989103, 0.8542361251850187, 0.13139055891993145, 0.5015152768477814, 0.19864873743952582, 0.2313646288041601, 0.28992667535697736, 0.6265055915510219, 0.7265797043535446, 0.9202923318284002, 0.6321511834038631, 0.6728367262605407, 0.6586979597202935, 0.1309226720786667, 0.563889613032526, 0.389358766191921, 0.37260564565714316, 0.24684684162272597, 0.5982042933298861, 0.896663326233504, 0.7884030244369596, 0.6202229004466849, 0.4417549843477827, 0.37304274232635715, 0.5442716244427301, 0.9915536257041505, 0.46278512685707873, 0.4868394190894778, 0.2133187095154937]
Keep in mind that using managers will affect performance of your code. Alternate to this, you could also use multiprocessing.Array, which is faster than managers but is less flexible in what data it can store, or Queues as well.
The use case I have in mind is as follows: I would like to start a range of jobs with ThreadPoolExecutor and then when a job completes, I would like to add a new job to the queue. I would also like to know when the next job finishes and repeat the procedure above. After I have had the opportunity to observe a predefined number of results, I would like to terminate everything properly. For instance, consider the following code, where the commented out bits in the run_con method show what I would like to achieve.
import numpy as np
import time
from concurrent import futures
MAX_WORKERS = 20
seed = 1234
np.random.seed(seed)
MSG = "Wall time: {:.2f}s"
def expensive_function(x):
time.sleep(x)
return x
def run_con(func, ts):
t0 = time.time()
workers = min(MAX_WORKERS, len(ts))
with futures.ThreadPoolExecutor(workers) as executor:
jobs = []
for t in ts:
jobs.append(executor.submit(func, t))
done = futures.as_completed(jobs)
for future in done:
print("Job complete: ", future.result())
# depending on some condition, add new job to jobs, e.g.
# jobs.append(func, np.random.random())
# update done generator
# if a threshold on total jobs is reached, close every thing down sensibly.
print(MSG.format(time.time()-t0))
ts = np.random.random(10)*5
print("Sleep times: ", ts)
run_con(expensive_function, ts)
Is this achievable with concurrent.futures? If not, what are the alternatives?
I have been looking around for some time, but haven't had luck finding an example that could solve my problem. I have added an example from my code. As one can notice this is slow and the 2 functions could be done separately.
My aim is to print every second the latest parameter values. At the same time the slow processes can be calculated in the background. The latest value is shown and when any process is ready the value is updated.
Can anybody recommend a better way to do it? An example would be really helpful.
Thanks a lot.
import time
def ProcessA(parA):
# imitate slow process
time.sleep(5)
parA += 2
return parA
def ProcessB(parB):
# imitate slow process
time.sleep(10)
parB += 5
return parB
# start from here
i, parA, parB = 1, 0, 0
while True: # endless loop
print(i)
print(parA)
print(parB)
time.sleep(1)
i += 1
# update parameter A
parA = ProcessA(parA)
# update parameter B
parB = ProcessB(parB)
I imagine this should do it for you. This has the benefit of you being able to add extra parallel funcitons up to a total equal to the number of cores you have. Edits are welcome.
#import time module
import time
#import the appropriate multiprocessing functions
from multiprocessing import Pool
#define your functions
#whatever your slow function is
def slowFunction(x):
return someFunction(x)
#printingFunction
def printingFunction(new,current,timeDelay):
while new == current:
print current
time.sleep(timeDelay)
#set the initial value that will be printed.
#Depending on your function this may take some time.
CurrentValue = slowFunction(someTemporallyDynamicVairable)
#establish your pool
pool = Pool()
while True: #endless loop
#an asynchronous function, this will continue
# to run in the background while your printing operates.
NewValue = pool.apply_async(slowFunction(someTemporallyDynamicVairable))
pool.apply(printingFunction(NewValue,CurrentValue,1))
CurrentValue = NewValue
#close your pool
pool.close()
I have a dataframe which I perform some operation on and print out. To do this, I have to iterate through each row.
for count, row in final_df.iterrows():
x = row['param_a']
y = row['param_b']
# Perform operation
# Write to output file
I decided to parallelize this using the python multiprocessing module
def write_site_files(row):
x = row['param_a']
y = row['param_b']
# Perform operation
# Write to output file
pkg_num = 0
total_runs = final_df.shape[0] # Total number of rows in final_df
threads = []
import multiprocessing
while pkg_num < total_runs or len(threads):
if(len(threads) < num_proc and pkg_num < total_runs):
print pkg_num, total_runs
t = multiprocessing.Process(target=write_site_files,args=[final_df.iloc[pkg_num],pkg_num])
pkg_num = pkg_num + 1
t.start()
threads.append(t)
else:
for thread in threads:
if not thread.is_alive():
threads.remove(thread)
However, the latter (parallelized) method is way slower than the simple iteration based approach. Is there anything I am missing?
thanks!
This will be way less efficient that doing this in a single process unless the actual operation take a lot of time, like seconds per row.
Normally parallelization is the last tool in the box. After profiling, after local vectorization, after local optimization, then you parallelize.
You are spending time just doing the slicing, then spinning up new processes (which is generally a constant overhead), then pickling a single row (not clear how big it is from your example).
At the very least, you should chunk the rows, e.g. df.iloc[i:(i+1)*chunksize].
There hopefully will be some support for parallel apply in 0.14, see here: https://github.com/pydata/pandas/issues/5751
This is a followup question to this. User Will suggested using a queue, I tried to implement that solution below. The solution works just fine with j=1000, however, it hangs as I try to scale to larger numbers. I am stuck here and cannot determine why it hangs. Any suggestions would be appreciated. Also, the code is starting to get ugly as I keep messing with it, I apologize for all the nested functions.
def run4(j):
"""
a multicore approach using queues
"""
from multiprocessing import Process, Queue, cpu_count
import os
def bazinga(uncrunched_queue, crunched_queue):
"""
Pulls the next item off queue, generates its collatz
length and
"""
num = uncrunched_queue.get()
while num != 'STOP': #Signal that there are no more numbers
length = len(generateChain(num, []) )
crunched_queue.put([num , length])
num = uncrunched_queue.get()
def consumer(crunched_queue):
"""
A process to pull data off the queue and evaluate it
"""
maxChain = 0
biggest = 0
while not crunched_queue.empty():
a, b = crunched_queue.get()
if b > maxChain:
biggest = a
maxChain = b
print('%d has a chain of length %d' % (biggest, maxChain))
uncrunched_queue = Queue()
crunched_queue = Queue()
numProcs = cpu_count()
for i in range(1, j): #Load up the queue with our numbers
uncrunched_queue.put(i)
for i in range(numProcs): #put sufficient stops at the end of the queue
uncrunched_queue.put('STOP')
ps = []
for i in range(numProcs):
p = Process(target=bazinga, args=(uncrunched_queue, crunched_queue))
p.start()
ps.append(p)
p = Process(target=consumer, args=(crunched_queue, ))
p.start()
ps.append(p)
for p in ps: p.join()
You're putting 'STOP' poison pills into your uncrunched_queue (as you should), and having your producers shut down accordingly; on the other hand your consumer only checks for emptiness of the crunched queue:
while not crunched_queue.empty():
(this working at all depends on a race condition, btw, which is not good)
When you start throwing non-trivial work units at your bazinga producers, they take longer. If all of them take long enough, your crunched_queue dries up, and your consumer dies. I think you may be misidentifying what's happening - your program doesn't "hang", it just stops outputting stuff because your consumer is dead.
You need to implement a smarter methodology for shutting down your consumer. Either look for n poison pills, where n is the number of producers (who accordingly each toss one in the crunched_queue when they shut down), or use something like a Semaphore that counts up for each live producer and down when one shuts down.