How can I call a def to be executed simultaneously? - Python

I want to call a function that takes a long time to execute (the time varies, from 0 to 300 seconds), and it has to be executed as many times as there are lists in a JSON file. How can I make it run three calls in parallel, for example?
To keep the explanation simple, the code would just call a function like this:
import time

def f(x, j):
    print(str(x*x) + "  " + str(j))
    time.sleep(2)

j = 0
for i in range(10):
    j += 1
    f(i, j)
Expected result:
thread 1 : 0, 1
thread 2 : 1, 2
thread 3 : 4, 3
thread 1 : 9, 4
thread 2 : 16, 5
thread 3 : 25, 6
thread 1 : 36, 7
thread 2: 49, 8
thread 3: 64, 9
thread 1: 81, 10
Thread three could finish before thread one, but I always want three running at once.

Take a look at ProcessPoolExecutor and ThreadPoolExecutor, which are APIs for executing tasks concurrently:
import time
from concurrent.futures import ProcessPoolExecutor

def f(x, j):
    print(str(x*x) + "  " + str(j))
    time.sleep(2)

j = 0
with ProcessPoolExecutor() as e:
    for i in range(10):
        e.submit(f, i, j)
You can also use the .map() method in this case. Read the documentation for more information.
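To match the question's requirement of at most three calls running at a time, a minimal sketch (reusing the same f as above) could cap the pool size with max_workers and use map(); the second range simply mirrors the 1-based counter from the question:

import time
from concurrent.futures import ProcessPoolExecutor

def f(x, j):
    print(str(x*x) + "  " + str(j))
    time.sleep(2)

if __name__ == "__main__":
    with ProcessPoolExecutor(max_workers=3) as e:
        # map() pairs each i with its 1-based counter; at most 3 calls run at once
        e.map(f, range(10), range(1, 11))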

Python overwriting lines in console

I am attempting to learn about concurrent programming with Python. I have a simple example script to show what I am attempting to do. Basically, some of the output is overwriting other lines in the console, so I am missing part of the output.
Here is my code:
import multiprocessing

lock = multiprocessing.Lock()

def dostuff(th):
    for x in range(8):
        lock.acquire()
        print(th, ": loop", x)
        lock.release()

def run():
    pool = multiprocessing.Pool()
    inputs = [1, 2, 3, 4]
    pool.map(dostuff, inputs)
    print('end')
Any help would be greatly appreciated.
Here is some of the output:
>>> run()
2 : loop 0
2 : loop 1
2 : loop 2
2 : loop 3
>>> run()
3 : loop 0
3 : loop 1
3 : loop 2
3 : loop 3
>>> run()
end
Here is some of the expected output:
>>> run()
2 : loop 0
3 : loop 1
1 : loop 1
4 : loop 2
>>> run()
end
Basically I want to show concurrency. Thanks
OK, problem solved. I executed the script through Geany and it seems to be working perfectly. Thanks for all your support.
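For reference, a minimal sketch (reusing the names from the question) that shares the lock with the worker processes explicitly through the Pool initializer; this keeps the prints serialized even under the spawn start method, where a module-level lock created at import time would not be the same object in the workers:

import multiprocessing

def init_worker(shared_lock):
    # make the lock passed from the parent visible to dostuff in each worker
    global lock
    lock = shared_lock

def dostuff(th):
    for x in range(8):
        with lock:
            print(th, ": loop", x)

def run():
    l = multiprocessing.Lock()
    with multiprocessing.Pool(initializer=init_worker, initargs=(l,)) as pool:
        pool.map(dostuff, [1, 2, 3, 4])
    print('end')

if __name__ == '__main__':
    run()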

How to properly parallelize and append a for loop?

I have a for loop below that runs a few functions and appends the result to a list. The list 'results' will end up with 12 elements (3 functions x 4 loop iterations = 12).
How can I parallelize this? Since my loop will iterate 4 times, does that mean I can only use 4 threads/cores?
results = []

for points in [250, 500, 1000, 4000]:
    df_dx, db_dy = simulate_cointegration(num_points=points, stax=0.0100, stay=0.0050,
                                          coex=0.0200, coey=0.0200)
    johansen_results = johansen_test(df_dx, db_dy)
    cadf_results = cadf_test(df_dx, db_dy)
    po_results = po_test(df_dx, db_dy)
    results.append(johansen_results)
    results.append(cadf_results)
    results.append(po_results)
This is a basic answer showing how you can parallelize your computations. Note Thymen's comment explaining that this answer is limited to the use of a single core. So if you can run 4 threads on that core, you'll potentially go up to 4 times faster (certainly less in reality).
import threading

results = []

# 1. define the task run by each of the first 4 threads (one per point value)
test_threads = []

def point_task(points):
    # you have to do this computation once before you can run your 3 test threads
    df_dx, db_dy = simulate_cointegration(num_points=points, stax=0.0100, stay=0.0050, coex=0.0200, coey=0.0200)
    # each thread does one computation & appends its result
    thread1 = threading.Thread(target=lambda dx, dy: results.append(johansen_test(dx, dy)), args=(df_dx, db_dy))
    thread2 = threading.Thread(target=lambda dx, dy: results.append(cadf_test(dx, dy)), args=(df_dx, db_dy))
    thread3 = threading.Thread(target=lambda dx, dy: results.append(po_test(dx, dy)), args=(df_dx, db_dy))
    # remember the three threads so thread.join() can be called at the end
    test_threads.append(thread1)
    test_threads.append(thread2)
    test_threads.append(thread3)
    # then start all three while the other point values are being prepared
    thread1.start()
    thread2.start()
    thread3.start()

# 2. create one thread per point value; each of these threads creates three new threads
points_threads = []
for points in [250, 500, 1000, 4000]:
    thread = threading.Thread(target=point_task, args=(points,))
    points_threads.append(thread)

# 3. start all the threads
for thread in points_threads:
    thread.start()

# 4. join the first four threads before joining the test threads; if one of the
#    point threads is not over yet, a test thread might still be missing from
#    'test_threads' and could not be joined
for thread in points_threads:
    thread.join()
for thread in test_threads:
    thread.join()

print(results)
So if you use a simple stub implementation of the other functions:
#!/usr/bin/env python
# -*- coding: utf-8 -*-

def simulate_cointegration(num_points=100, stax=0.0100, stay=0.0050, coex=0.0200, coey=0.0200):
    return 1, 1

def johansen_test(dx, dy):
    return 1

def cadf_test(dx, dy):
    return 2

def po_test(dx, dy):
    return 3
Then you get the following results:
print(results)
>>> [1, 2, 1, 3, 1, 2, 1, 3, 2, 3, 2, 3]
If you wish to use more cores, see multiprocessing or MPI.
Also, I am not sure exactly how Python manages shared resources here, but you might want to use a lock around your results list, or replace it with a queue, as a matter of good practice.
Edit: I just found this answer: Are lists thread-safe?. So you should be able to keep your results as a plain Python list, as long as you stick to the threading module and do not switch to multiprocessing or something else.
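As a sketch of the multi-core route mentioned above (assuming the same simulate_cointegration, johansen_test, cadf_test and po_test functions as in the question), concurrent.futures can run one worker process per point value and collect all twelve results:

from concurrent.futures import ProcessPoolExecutor

def run_tests(points):
    # one simulation per point value, then the three tests on its output
    df_dx, db_dy = simulate_cointegration(num_points=points, stax=0.0100, stay=0.0050,
                                          coex=0.0200, coey=0.0200)
    return [johansen_test(df_dx, db_dy), cadf_test(df_dx, db_dy), po_test(df_dx, db_dy)]

if __name__ == "__main__":
    results = []
    with ProcessPoolExecutor() as executor:
        for triple in executor.map(run_tests, [250, 500, 1000, 4000]):
            results.extend(triple)
    print(results)

Unlike the threaded version, executor.map returns the results grouped per point value and in input order, so no lock is needed around the results list.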

Multiprocessing and Queues

This code is an attempt to use a queue to feed tasks to a number of worker processes.
I wanted to time the difference in speed between different numbers of processes and different methods of handling data.
But the output is not doing what I thought it would.
from multiprocessing import Process, Queue
import time

result = []
base = 2
data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 23, 45, 76, 4567, 65423, 45, 4, 3, 21]

# create queue for new tasks
new_tasks = Queue(maxsize=0)

# put tasks in queue
print('Putting tasks in Queue')
for i in data:
    new_tasks.put(i)

# worker function definition
def f(q, p_num):
    print('Starting process: {}'.format(p_num))
    while not q.empty():
        # mimic some process being done
        time.sleep(0.05)
        print(q.get(), p_num)
    print('Finished', p_num)

print('initiating processes')
processes = []
for i in range(0, 2):
    if __name__ == '__main__':
        print('Creating process {}'.format(i))
        p = Process(target=f, args=(new_tasks, i))
        processes.append(p)

# record start time
start = time.time()

# start processes
for p in processes:
    p.start()

# wait for processes to finish
for p in processes:
    p.join()

# record end time
end = time.time()

# print time result
print('Time taken: {}'.format(end - start))
I expect this:
Putting tasks in Queue
initiating processes
Creating process 0
Creating process 1
Starting process: 1
Starting process: 0
1 1
2 0
3 1
4 0
5 1
6 0
7 1
8 0
9 1
10 0
11 1
23 0
45 1
76 0
4567 1
65423 0
45 1
4 0
3 1
21 0
Finished 1
Finished 0
Time taken: <some-time>
But instead I actually get this:
Putting tasks in Queue
initiating processes
Creating process 0
Creating process 1
Time taken: 0.01000523567199707
Putting tasks in Queue
Putting tasks in Queue
initiating processes
Time taken: 0.0
Starting process: 1
initiating processes
Time taken: 0.0
Starting process: 0
1 1
2 0
3 1
4 0
5 1
6 0
7 1
8 0
9 1
10 0
11 1
23 0
45 1
76 0
4567 1
65423 0
45 1
4 0
3 1
21 0
Finished 0
There seem to be two major problems, and I am not sure how related they are:
The print statements such as:
Putting tasks in Queue
initiating processes
Time taken: 0.0
are repeated systematically throughout the output - I say systematically because they repeat exactly the same way every time.
The second process never finishes; it never recognizes that the queue is empty and therefore fails to exit.
1) I cannot reproduce this.
2) Look at the following code:
while not q.empty():
time.sleep(0.05)
print(q.get(), p_num)
Each line can be run in any order by either process. Now consider q holding a single item, two processes A and B, and the following order of execution:
# A runs
while not q.empty():
time.sleep(0.05)
# B runs
while not q.empty():
time.sleep(0.05)
# A runs
print(q.get(), p_num) # Removes and prints the last element of q
# B runs
print(q.get(), p_num) # q is now empty so q.get() blocks forever
Swapping the order of time.sleep and q.get removes the blocking in all of my runs, but it is still possible for more than one process to enter the loop with a single item left.
The way to fix this is to use a non-blocking get call and catch the queue.Empty exception:
import queue

while True:
    time.sleep(0.05)
    try:
        print(q.get(False), p_num)
    except queue.Empty:
        break
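An alternative sketch with the same effect is a blocking get with a timeout, which also raises queue.Empty once no item arrives in time:

import queue

while True:
    try:
        value = q.get(timeout=0.05)
    except queue.Empty:
        break
    print(value, p_num)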
Your worker processes should look like this:
def f(q, p_num):
    print('Starting process: {}'.format(p_num))
    while True:
        value = q.get()
        if value is None:
            break
        # mimic some process being done
        time.sleep(0.05)
        print(value, p_num)
    print('Finished', p_num)
And the queue should be filled with sentinel markers after the real data:
for i in data:
    new_tasks.put(i)

# num_of_threads = the number of worker processes (2 in the question)
for _ in range(num_of_threads):
    new_tasks.put(None)
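Putting this together, a minimal self-contained sketch of the sentinel approach (reusing the names from the question) might look like this:

from multiprocessing import Process, Queue
import time

def f(q, p_num):
    print('Starting process: {}'.format(p_num))
    while True:
        value = q.get()        # blocks until an item (or a sentinel) arrives
        if value is None:      # sentinel: no more work
            break
        time.sleep(0.05)       # mimic some processing being done
        print(value, p_num)
    print('Finished', p_num)

if __name__ == '__main__':
    data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 23, 45, 76, 4567, 65423, 45, 4, 3, 21]
    num_workers = 2

    new_tasks = Queue()
    for i in data:
        new_tasks.put(i)
    for _ in range(num_workers):
        new_tasks.put(None)    # one sentinel per worker

    processes = [Process(target=f, args=(new_tasks, i)) for i in range(num_workers)]
    start = time.time()
    for p in processes:
        p.start()
    for p in processes:
        p.join()
    print('Time taken: {}'.format(time.time() - start))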

Complete a multithreading parallelize process with k threads

The 3SUM problem is defined as:
Given: A positive integer k ≤ 20, a positive integer n ≤ 10^4, and k arrays of size n containing integers from −10^5 to 10^5.
Return: For each array A[1..n], output three different indices 1 ≤ p < q < r ≤ n such that A[p] + A[q] + A[r] = 0 if they exist, and "-1" otherwise.
Sample Dataset
4 5
2 -3 4 10 5
8 -6 4 -2 -8
-5 2 3 2 -4
2 4 -5 6 8
Sample Output
-1
1 2 4
1 2 3
-1
However, I want to speed up the code using threads. To do so I am using the following Python code:
def TS(arr):
    original = arr[:]
    arr.sort()
    n = len(arr)
    for i in xrange(n - 2):
        a = arr[i]
        j = i + 1
        k = n - 1
        # two-pointer scan for b + c == -a
        while j < k:
            b = arr[j]
            c = arr[k]
            if a + b + c == 0:
                return sorted([original.index(a) + 1, original.index(b) + 1, original.index(c) + 1])
            elif a + b + c > 0:
                k = k - 1
            else:
                j = j + 1
    return [-1]

with open("dataset.txt") as dataset:
    k = int(dataset.readline().split()[0])
    for i in xrange(k):
        aux = map(int, dataset.readline().split())
        results = TS(aux)
        print ' '.join(map(str, results))
I was thinking of creating k threads and a global output array, but I do not know how to continue developing the idea:
from threading import Thread

class thread_it(Thread):
    def __init__(self, param):
        Thread.__init__(self)
        self.param = param
    def run(self):
        mutex.acquire()
        output.append(TS(aux))
        mutex.release()

threads = []  # k threads
output = []   # global answer
mutex = thread.allocate_lock()

with open("dataset.txt") as dataset:
    k = int(dataset.readline().split()[0])
    for i in xrange(k):
        aux = map(int, dataset.readline().split())
        current = thread_it(aux)
        threads.append(current)
        current.start()

for t in threads:
    t.join()
What would be the correct way to compute results = TS(aux) inside a thread, wait until all threads have finished, and then print ' '.join(map(str, results)) for each of them?
Update
I got this issue when running the script from the console.
First, like @Cyphase said, because of the GIL you cannot speed things up with threading: only one thread executes Python bytecode at a time, so the threads effectively share a single core. Consider using multiprocessing to utilize multiple cores; multiprocessing has a very similar API to threading.
Second, even if we pretend the GIL doesn't exist: by putting everything in a critical section protected by the mutex, you are actually serializing all the threads. What you need to protect is only the access to output, so move the processing code out of the critical section to let the threads run concurrently:
def run(self):
    result = TS(self.param)
    mutex.acquire()
    output.append(result)
    mutex.release()
But don't reinvent the wheel: the Python standard library provides a thread-safe Queue, so use that:
try:
    import Queue as queue  # python2
except ImportError:
    import queue

output = queue.Queue()

def run(self):
    result = TS(self.param)
    output.put(result)
With multiprocessing, the final code looks something like this:
from multiprocessing import Process, Queue

output = Queue()

class TSProcess(Process):
    def __init__(self, param):
        Process.__init__(self)
        self.param = param
    def run(self):
        result = TS(self.param)
        output.put(result)

processes = []
with open("dataset.txt") as dataset:
    k = int(dataset.readline().split()[0])
    for i in xrange(k):
        aux = map(int, dataset.readline().split())
        current = TSProcess(aux)
        processes.append(current)
        current.start()

for p in processes:
    p.join()

# process result with output.get()
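A small sketch of how the results could then be collected, one output.get() per input array (matching the Python 2 style above); note that items come back in completion order rather than input order, so an index would have to be put alongside each result if the original line order matters:

# drain the queue: one result per input array
all_results = [output.get() for _ in range(k)]
for results in all_results:
    print ' '.join(map(str, results))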

Multiprocessing and Queue with Dataframe

I am having some trouble exchanging an object (a DataFrame) between two processes through a Queue.
The first process gets the data from the queue, the second puts data into it.
The put-process is faster, so the get-process should clear the queue by reading all of the objects.
I get strange behaviour: my code works perfectly and as expected, but only for 100 rows in the DataFrame; with 1000 rows the get-process always takes only one object.
import multiprocessing, time, sys
import pandas as pd

NR_ROWS = 1000
i = 0

def getDf():
    global i, NR_ROWS
    myheader = ["name", "test2", "test3"]
    myrow1 = [i, i+400, i+250]
    df = pd.DataFrame([myrow1]*NR_ROWS, columns=myheader)
    i = i+1
    return df

def f_put(q):
    print "f_put start"
    while(1):
        data = getDf()
        q.put(data)
        print "P:", data["name"].iloc[0]
        sys.stdout.flush()
        time.sleep(1.55)

def f_get(q):
    print "f_get start"
    while(1):
        data = pd.DataFrame()
        while not q.empty():
            data = q.get()
            print "get"
        if not data.empty:
            print "G:", data["name"].iloc[0]
        else:
            print "nothing new"
        time.sleep(5.9)

if __name__ == "__main__":
    q = multiprocessing.Queue()
    p = multiprocessing.Process(target=f_put, args=(q,))
    p.start()
    while(1):
        f_get(q)
    p.join()
Output for a 100-row DataFrame; the get-process takes all objects:
f_get start
nothing new
f_put start
P: 0 # put 1.object into the queue
P: 1 # put 2.object into the queue
P: 2 # put 3.object into the queue
P: 3 # put 4.object into the queue
get # get-process takes all 4 objects from the queue
get
get
get
G: 3
P: 4
P: 5
P: 6
get
get
get
G: 6
P: 7
P: 8
Output for a 1000-row DataFrame; the get-process takes only one object:
f_get start
nothing new
f_put start
P: 0 # put 1.object into the queue
P: 1 # put 2.object into the queue
P: 2 # put 3.object into the queue
P: 3 # put 4.object into the queue
get <-- #!!! get-process takes ONLY 1 object from the queue!!!
G: 1
P: 4
P: 5
P: 6
get
G: 2
P: 7
P: 8
P: 9
P: 10
get
G: 3
P: 11
Any idea what I am doing wrong and how to pass the bigger DataFrame through as well?
At the risk of not being able to provide a fully functional solution, here is what goes wrong.
First of all, it's a timing issue.
I tried your code again with larger DataFrames (10000 or even 100000 rows) and I start to see the same thing you do. This means you see this behaviour as soon as the size of the arrays crosses a certain threshold, which will be system (CPU?) dependent.
I modified your code a bit to make it easier to see what happens. First, 5 DataFrames are put into the queue without any custom time.sleep. In the f_get function I added a counter (and a time.sleep(0), see below) to the loop (while not q.empty()).
The new code:
import multiprocessing, time, sys
import pandas as pd

NR_ROWS = 10000
i = 0

def getDf():
    global i, NR_ROWS
    myheader = ["name", "test2", "test3"]
    myrow1 = [i, i+400, i+250]
    df = pd.DataFrame([myrow1]*NR_ROWS, columns=myheader)
    i = i+1
    return df

def f_put(q):
    print "f_put start"
    j = 0
    while(j < 5):
        data = getDf()
        q.put(data)
        print "P:", data["name"].iloc[0]
        sys.stdout.flush()
        j += 1

def f_get(q):
    print "f_get start"
    while(1):
        data = pd.DataFrame()
        loop = 0
        while not q.empty():
            data = q.get()
            print "get (loop: %s)" % loop
            time.sleep(0)
            loop += 1
        time.sleep(1.)

if __name__ == "__main__":
    q = multiprocessing.Queue()
    p = multiprocessing.Process(target=f_put, args=(q,))
    p.start()
    while(1):
        f_get(q)
    p.join()
Now, if you run this for different numbers of rows, you will see something like this:
N=100:
f_get start
f_put start
P: 0
P: 1
P: 2
P: 3
P: 4
get (loop: 0)
get (loop: 1)
get (loop: 2)
get (loop: 3)
get (loop: 4)
N=10000:
f_get start
f_put start
P: 0
P: 1
P: 2
P: 3
P: 4
get (loop: 0)
get (loop: 1)
get (loop: 0)
get (loop: 0)
get (loop: 0)
What does this tell us?
As long as the DataFrame is small, your assumption that the put process is faster than the get process seems to hold: we can fetch all 5 items within one pass of while not q.empty().
But as the number of rows increases, something changes: the while-condition q.empty() evaluates to True (the queue appears empty) and the outer while(1) cycles.
This could mean that put is now slower than get and we have to wait. But even if we set the sleep time for the whole f_get to something like 15 seconds, we still get the same behaviour.
On the other hand, if we change the time.sleep(0) in the inner q.get() loop to 1,
while not q.empty():
    data = q.get()
    time.sleep(1)
    print "get (loop: %s)" % loop
    loop += 1
we get this:
f_get start
f_put start
P: 0
P: 1
P: 2
P: 3
P: 4
get (loop: 0)
get (loop: 1)
get (loop: 2)
get (loop: 3)
get (loop: 4)
This looks right! And it means that get actually does something strange: while it is still processing a get, the queue reports itself as empty, and only after the get is done does the next item become available.
I'm sure there is a reason for that, but I'm not familiar enough with multiprocessing to see it.
Depending on your application, you could just add an appropriate time.sleep to your inner loop and see if that's enough.
Or, if you want to solve it properly (instead of using the time.sleep workaround), you could look into the multiprocessing documentation for information on blocking, non-blocking and asynchronous communication - I think the solution will be found there.
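For what it's worth, a sketch of the blocking route hinted at above, in the same Python 2 style as the question: multiprocessing.Queue hands items to a background feeder thread that pickles them and writes them to a pipe, and the documentation notes that empty() is not reliable, which would explain why a large DataFrame (slower to pickle and transfer) still looks absent right after put(). Using get with a timeout avoids polling empty() altogether:

import Queue as queue  # on Python 3 this is just "import queue"

def f_get(q):
    print "f_get start"
    while(1):
        data = pd.DataFrame()
        # wait up to half a second for each item instead of trusting q.empty()
        while True:
            try:
                data = q.get(timeout=0.5)
                print "get"
            except queue.Empty:
                break
        if not data.empty:
            print "G:", data["name"].iloc[0]
        else:
            print "nothing new"
        time.sleep(5.9)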
