I'm using Python 3.3 on a computer with 2 cores (4 hardware threads). I am trying to learn to use multiprocessing to speed up my code, but when I use it the code slows down.
To start my learning, I have made a small program:
from multiprocessing import Process
import time

def f():
    s = 0
    for i in range(2*10**7):
        s += i
    return s

if __name__ == '__main__':
    t = time.time()
    p1 = Process(target=f)
    p2 = Process(target=f)
    p3 = Process(target=f)
    p4 = Process(target=f)
    p1.start()
    p2.start()
    p3.start()
    p4.start()
    p1.join()
    p2.join()
    p3.join()
    p4.join()
    print(time.time() - t)

t2 = time.time()
for a in range(4):
    f()
print(time.time() - t2)
Averaged over 3 runs, the first part (with multiprocessing) takes 17.15 seconds, while the second part (without multiprocessing) takes 6.24 seconds. Using the Windows Task Manager, I can see that my computer is indeed at 100% CPU for the first part and only 25% for the second part, and that I am not running out of memory.
Why is this program so much slower with multiprocessing?
Windows has no fork(), so multiprocessing has to work around that by importing the __main__ module each time a new process is started.
This means that each of your subprocesses runs not only the target function, but also the code at the end of the file. Move that code into the if __name__ == '__main__': block and it should be much faster!
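For illustration, a minimal sketch of that fix (same workload as above, with the sequential comparison moved inside the guard so that spawned child processes never execute it):

# Sketch: everything below the function definition lives inside the __main__
# guard, so a spawned child process only imports f() and runs its own target.
from multiprocessing import Process
import time

def f():
    s = 0
    for i in range(2 * 10**7):
        s += i
    return s

if __name__ == '__main__':
    t = time.time()
    procs = [Process(target=f) for _ in range(4)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
    print(time.time() - t)

    # Sequential comparison, also guarded, so child processes skip it.
    t2 = time.time()
    for _ in range(4):
        f()
    print(time.time() - t2)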
Related
I have this web scraper, which scrapes the prices of 4 different metals.
The run time, as mentioned, is about 12 seconds with or without multiprocessing. Shouldn't this code run the function 4 times roughly at the same time, and cut about 75% off the run time?
My processor has 4 cores and 4 threads, if that has something to do with it.
import requests
from bs4 import BeautifulSoup
from multiprocessing import Process

def scraper(url, metal):
    global aluPriser
    global zinkPriser
    global messingPriser
    global kobberPriser
    global tal
    url.status_code
    url.headers
    c = url.content
    soup = BeautifulSoup(c, "html.parser")
    samples = soup.find_all("td", style="text-align:center;white-space:nowrap;border-left:solid black 1px")
    for a in samples:
        for b in a:
            if b.startswith("$"):
                b = b.replace(" ", "")
                b = b.replace("$", "")
                b = int(b)
                tal.append(b)
I run this code with the following multiprocessing code:
if __name__ == '__main__':
    url = "https://www.alumeco.dk/viden-og-teknik/metalpriser/aluminiumpriser?s=0"
    url = requests.get(url)
    whatDate(url)
    p1 = Process(target=scraper(url, "alu"))
    p1.start()
    url = "https://www.alumeco.dk/viden-og-teknik/metalpriser/kobber?s=0"
    url = requests.get(url)
    p2 = Process(target=scraper(url, "kobber"))
    p2.start()
    url = "https://www.alumeco.dk/viden-og-teknik/metalpriser/metal-priser-mp58?s=0"
    url = requests.get(url)
    p3 = Process(target=scraper(url, "messing"))
    p3.start()
    url = "https://www.alumeco.dk/viden-og-teknik/metalpriser/zink?s=0"
    url = requests.get(url)
    p4 = Process(target=scraper(url, "zink"))
    p4.start()
    p1.join()
    p2.join()
    p3.join()
    p4.join()
To get any real benefit from parallelization here, you need to move the requests.get() into the scraper function. Almost all your time is spent doing network requests; parallelizing the CPU-bound bits doesn't matter if almost no time is spent in them.
That said, multiprocessing is also the wrong tool for this particular job: you pay more in serialization/deserialization costs than you gain from avoiding GIL contention. Use threading instead.
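As a sketch of that suggestion (the URLs, selector, and price parsing are copied from the question; the results dictionary and the target/args style of thread creation are additions of mine), something along these lines lets the four requests overlap:

# Sketch only: the network request happens inside the worker, one thread per metal.
import requests
from bs4 import BeautifulSoup
from threading import Thread

def scraper(page_url, metal, results):
    response = requests.get(page_url)  # network I/O now runs inside the thread
    soup = BeautifulSoup(response.content, "html.parser")
    samples = soup.find_all("td", style="text-align:center;white-space:nowrap;border-left:solid black 1px")
    prices = []
    for cell in samples:
        for text in cell:
            if text.startswith("$"):
                prices.append(int(text.replace(" ", "").replace("$", "")))
    results[metal] = prices

if __name__ == '__main__':
    urls = {
        "alu": "https://www.alumeco.dk/viden-og-teknik/metalpriser/aluminiumpriser?s=0",
        "kobber": "https://www.alumeco.dk/viden-og-teknik/metalpriser/kobber?s=0",
        "messing": "https://www.alumeco.dk/viden-og-teknik/metalpriser/metal-priser-mp58?s=0",
        "zink": "https://www.alumeco.dk/viden-og-teknik/metalpriser/zink?s=0",
    }
    results = {}
    threads = [Thread(target=scraper, args=(u, m, results)) for m, u in urls.items()]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    print(results)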
import multiprocessing
import time

def WORK(x, q, it):
    for i in range(it):
        t = x + '---' + str(i)
        q.put(t)

def cons(q, cp):
    while not q.empty():
        cp.append(q.get())
    return q.put(cp)

if __name__ == '__main__':
    cp = []
    it = 600  # iterations
    start = time.perf_counter()
    q = multiprocessing.Queue()
    p1 = multiprocessing.Process(target=WORK, args=('n', q, it))
    p2 = multiprocessing.Process(target=WORK, args=('x', q, it))
    p3 = multiprocessing.Process(target=cons, args=(q, cp))
    p1.start()
    p2.start()
    p3.start()
    p1.join()
    p2.join()
    p3.join()
    print(q.get())
    end = time.perf_counter()
    print(end - start)
I encountered a problem running this code in PyCharm and Colab. In Colab it works fine only with 1000 iterations or less in the WORK() processes; with more, it freezes.
In PyCharm it works fine only with 500 iterations or less.
What is the problem? Are there any limitations?
A not-very-good workaround I found is to remove the join calls, or to put them after the get() call on the queue; that raises the limit a bit. With the following code it started to work with 1000 iterations in PyCharm, but 10000 iterations deadlock again:
p1.join()
p2.join()
print(q.get())
p3.join()
end = time.perf_counter()
print(end - start)
A further change helped me increase the iteration limit to 10000: adding a queue maxsize:
q = multiprocessing.Queue(maxsize = 1000)
So what are the limitations and rules for these queues?
And how do I manage an endless queue, for example one fed by websockets, which send data continuously?
You have several issues with your code. First, according to the documentation on multiprocessing.Queue, method empty is not reliable. So in function cons the statement while not q.empty(): is problematic. But even if method Queue.empty were reliable, you have here a race condition. You have started processes WORK and cons in parallel where the former is writing elements to a queue and the latter is reading until it finds the queue is empty. But if cons runs before WORK gets to write its first element, it will find the queue immediately empty and that is not your expected result. And as I mentioned in my comment above, you must not try to join a process that is writing to a queue before you have retrieved all of the records that process has written.
Another problem you have is you are passing to cons an empty list cp to which you keep on appending. But cons is a function belonging to a process running in a different address space and consequently the cp list it is appending to is not the same cp list as in the main process. Just be aware of this.
Finally, cons is writing its result to the same queue that it is reading from and consequently the main process is reading this result from that same queue. So we have another race condition: Once the main process has been modified not to read from this queue until after it has joined all the processes, the main process and cons are now both reading from the same queue in parallel. We now need a separate input and output queue so that there is no conflict. That solves this race condition.
To solve the first race condition, the WORK process should write a special sentinel record that serves as an end-of-records indicator. It could be the value None if None is not a valid normal record, or it could be any special object that cannot be mistaken for an actual record. Since we have two processes writing records to the same input queue for cons to read, we will end up with two sentinel records, which cons will have to be looking for to know that there are truly no more records left.
import multiprocessing
import time

SENTINEL = 'SENTINEL'  # or None

def WORK(x, q, it):
    for i in range(it):
        t = x + '---' + str(i)
        q.put(t)
    q.put(SENTINEL)  # show end of records

def cons(q_in, q_out, cp):
    # We are now looking for two end-of-records indicators,
    # one from each WORK process:
    for record in iter(q_in.get, SENTINEL):
        cp.append(record)
    for record in iter(q_in.get, SENTINEL):
        cp.append(record)
    q_out.put(cp)

if __name__ == '__main__':
    it = 600  # iterations
    start = time.perf_counter()
    q_in = multiprocessing.Queue()
    q_out = multiprocessing.Queue()
    p1 = multiprocessing.Process(target=WORK, args=('n', q_in, it))
    p2 = multiprocessing.Process(target=WORK, args=('x', q_in, it))
    cp = []
    p3 = multiprocessing.Process(target=cons, args=(q_in, q_out, cp))
    p1.start()
    p2.start()
    p3.start()
    cp = q_out.get()
    print(len(cp))
    p1.join()
    p2.join()
    p3.join()
    end = time.perf_counter()
    print(end - start)
Prints:
1200
0.1717168
I am trying to write values to a dictionary every 5 seconds for 1 minute. I then want to take those values, put them into a dataframe, write it to CSV, clear the original dictionary, and keep going.
import time
import random
import pandas as pd
from multiprocessing import Process

a = {'value': [], 'timeStamp': []}

def func1():
    global a
    print "starting First Function"
    a['value'].append(random.randint(1, 101))
    a['timeStamp'].append(time.time() * 1000.0)
    time.sleep(5)
    return a

def func2():
    print "starting Second Function"
    time.sleep(60)
    d = pd.DataFrame(a)
    print d
    # here I would write the df out to csv and del d
    a.update({}.fromkeys(a, 0))
    print "cleared"

if __name__ == '__main__':
    while True:
        p1 = Process(target=func1)
        p1.start()
        p2 = Process(target=func2)
        p2.start()
        p1.join()
        p2.join()
        print "test"
        print a
This is where I'm at now, which may or may not be the correct way to do this. Regardless, this code is not giving me the correct results. I am trying to figure out the best way to get the dict into the dataframe and then clear it. Hopefully someone has done something similar?
Processes do not share memory - each function modifies a separate a. Therefore, changes are not seen across functions and the main process.
To share memory between your functions, use the threading module instead. You can test this in your example by replacing Process with Thread:
from threading import Thread as Process
This allows you to run your example unchanged otherwise.
Note that threading in Python is limited by the Global Interpreter Lock. Threads run concurrently, but not in parallel - Python code only ever runs on one core. Extensions and system calls such as time.sleep and the underlying data structures of pandas can sidestep this, however.
Your code has so many problems that it is hardly suitable for any use. You may start your research with something like this (python 3, threads instead of processes):
import time
import random
import threading

def func1(a):
    print("starting First Function")
    for dummy in range(10):
        a['value'].append(random.randint(1, 101))
        a['timeStamp'].append(time.time() * 1000.0)
        time.sleep(1)
    print("stopping First Function")

def func2(a):
    print("starting Second Function")
    for dummy in range(2):
        time.sleep(5)
        print(a)
        a['value'] = list()
        a['timeStamp'] = list()
        print("cleared")
    print('stopping Second Function')

if __name__ == '__main__':
    a = {'value': list(), 'timeStamp': list()}
    t1 = threading.Thread(target=func1, args=(a,))
    t1.start()
    t2 = threading.Thread(target=func2, args=(a,))
    t2.start()
The output is:
starting First Function
starting Second Function
{'value': [32, 95, 2, 71, 65], 'timeStamp': [1536244351577.3914, 1536244352584.13, 1536244353586.6367, 1536244354589.3767, 1536244355591.9202]}
cleared
{'value': [43, 44, 28, 69, 25], 'timeStamp': [1536244356594.6294, 1536244357597.2498, 1536244358599.9812, 1536244359602.9592, 1536244360605.9316]}
cleared
stopping Second Function
stopping First Function
I want to run four .exes in parallel. After the first iteration of the first .exe, the second .exe must start while the first continues with its second iteration, and so on for the others. The goal is to have the four running in parallel, feeding data back to each other. The .exes are written in Fortran 90, but the driver code is Python on Linux.
import os, threading

e = range(10)
for a in e:
    def exe1():
        os.system("./exe1")
    t1 = threading.Thread(target=exe1, args=())
    t1.start()
    t1.join()
    if a > 0:
        for b in e:
            def exe2():
                os.system("./exe2")
            t2 = threading.Thread(target=exe2, args=())
            t2.start()
            t2.join()
            if b > 0:
                for c in e:
                    def exe3():
                        os.system("./exe3")
                    t3 = threading.Thread(target=exe3, args=())
                    t3.start()
                    t3.join()
                    if c > 0:
                        for d in e:
                            def exe4():
                                os.system("./exe4")
                            t4 = threading.Thread(target=exe4, args=())
                            t4.start()
                            t4.join()
This is my idea, but I can't get them to run in parallel. They must each do 10 iterations.
I won't comment further on the loops that define functions (very weird, probably because the indentation is really off), so there may be more than 4 threads running in parallel (I figured out that much).
But to answer your question: your executables don't run in parallel simply because you call join() on each thread as soon as you start it.
So the main program waits for the current thread to terminate before it tries to start another one.
I would do this:
thread_list = []
at the start of your program.
Each time you create a thread, store its reference in thread_list:
t1 = threading.Thread(...)
thread_list.append(t1)
Then, remove all the join calls inside your program. Now you're really starting x processes within x threads, in parallel.
And at the end of your program wait for all threads to finish:
for t in thread_list:
    t.join()
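Applied to your four executables, a rough sketch could look like this. The run_exe helper and the outer loop are mine; it assumes each program can simply be re-run for its 10 iterations, and it does not attempt the data feedback between the programs, which is a separate problem:

# Rough sketch: start all four executables in their own threads, join at the end.
import os
import threading

def run_exe(path):
    for _ in range(10):  # 10 iterations each, as stated in the question
        os.system(path)

thread_list = []
for path in ("./exe1", "./exe2", "./exe3", "./exe4"):
    t = threading.Thread(target=run_exe, args=(path,))
    t.start()
    thread_list.append(t)

# Wait for all of them only after every thread has been started.
for t in thread_list:
    t.join()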
For a web-scraping analysis I need two loops that run permanently: one returns a list of websites, updated every x minutes, while the other analyses the sites (old and new ones) every y seconds. This is a code construction that exemplifies what I am trying to do, but it doesn't work. (The code has been edited to incorporate the answers and my own research.)
from multiprocessing import Process
import time, random
from threading import Lock
from collections import deque

class MyQueue(object):
    def __init__(self):
        self.items = deque()
        self.lock = Lock()

    def put(self, item):
        with self.lock:
            self.items.append(item)

    # Example pointed at in [this][1] answer
    def get(self):
        with self.lock:
            return self.items.popleft()

def a(queue):
    while True:
        x = [random.randint(0, 10), random.randint(0, 10), random.randint(0, 10)]
        print 'send', x
        queue.put(x)
        time.sleep(10)

def b(queue):
    try:
        while queue:
            x = queue.get()
            print 'recieve', x
            for i in x:
                print i
                time.sleep(2)
    except IndexError:
        print queue.get()

if __name__ == '__main__':
    q = MyQueue()
    p1 = Process(target=a, args=(q,))
    p2 = Process(target=b, args=(q,))
    p1.start()
    p2.start()
    p1.join()
    p2.join()
So, this is my first Python project after an online introduction course and I am struggling here big time. I understand now that the functions don't truly run in parallel, as b does not start until a is finished (I used this answer and tinkered with the timer and while True). EDIT: Even after using the approach given in the answer, I think this is still the case, as queue.get() throws an IndexError saying the deque is empty. I can only explain that by process a not finishing, because when I print queue.get() immediately after .put(x) it is not empty.
I eventually want an output like this:
send [3,4,6]
3
4
6
3
4
send [3,8,6,5] # the code above always gives 3 entries, but in my project
3             # the length varies
8
6
5
3
8
6
.
.
What do I need to have two truly parallel loops, where one returns an updated list every x minutes that the other loop uses as the basis for its analysis? Is Process really the right tool here?
And where can I find good information about designing such a program?
I did something a little like this a while ago. I think using the Process is the correct approach, but if you want to pass data between processes then you should probably use a Queue.
https://docs.python.org/2/library/multiprocessing.html#exchanging-objects-between-processes
Create the queue first and pass it into both processes. One can write to it, the other can read from it.
One issue I remember is that the reading process will block on the queue until something is pushed to it, so you may need to push a special 'terminate' message of some kind to the queue when process 1 is done so process 2 knows to stop.
EDIT: Simple example. This doesn't include a clean way to stop the processes, but it shows how you can start 2 new processes and pass data from one to the other. Since b uses a non-blocking get() inside a try/except, it keeps working with the last list it received until new data arrives from a.
from multiprocessing import Process, Queue
import time, random

def a(queue):
    while True:
        x = [random.randint(0, 10), random.randint(0, 10), random.randint(0, 10)]
        print 'send', x
        queue.put(x)
        time.sleep(5)

def b(queue):
    x = []
    while True:
        time.sleep(1)
        try:
            x = queue.get(False)
            print 'receive', x
        except:
            pass
        for i in x:
            print i

if __name__ == '__main__':
    q = Queue()
    p1 = Process(target=a, args=(q,))
    p2 = Process(target=b, args=(q,))
    p1.start()
    p2.start()
    p1.join()
    p2.join()
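Building on the 'terminate' message idea mentioned above, here is a minimal sketch (Python 3 syntax, with a hypothetical STOP sentinel and a fixed number of rounds as assumptions of mine) of how a could tell b to shut down cleanly:

# Minimal sketch: the producer sends a STOP sentinel when it is done,
# and the consumer exits its loop when it sees it.
from multiprocessing import Process, Queue
import random
import time

STOP = None  # hypothetical sentinel; any object that is not a normal record works

def a(queue, rounds):
    for _ in range(rounds):
        x = [random.randint(0, 10) for _ in range(3)]
        print('send', x)
        queue.put(x)
        time.sleep(1)
    queue.put(STOP)  # tell the consumer there is nothing more to come

def b(queue):
    while True:
        x = queue.get()  # blocks until the producer puts something
        if x is STOP:
            break
        for i in x:
            print(i)

if __name__ == '__main__':
    q = Queue()
    p1 = Process(target=a, args=(q, 3))
    p2 = Process(target=b, args=(q,))
    p1.start()
    p2.start()
    p1.join()
    p2.join()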