While double-checking that threading.Condition is correctly monkey-patched, I noticed that a monkey-patched threading.Thread(…).start() behaves differently from gevent.spawn(…).
Consider:
from gevent import monkey; monkey.patch_all()
from threading import Thread, Condition
import gevent

cv = Condition()

def wait_on_cv(x):
    cv.acquire()
    cv.wait()
    print "Here:", x
    cv.release()

# XXX: This code yields "This operation would block forever" when joining the first thread
threads = [ gevent.spawn(wait_on_cv, x) for x in range(10) ]

"""
# XXX: This code, which seems semantically similar, works correctly
threads = [ Thread(target=wait_on_cv, args=(x, )) for x in range(10) ]
for t in threads:
    t.start()
"""

cv.acquire()
cv.notify_all()
print "Notified!"
cv.release()

for x, thread in enumerate(threads):
    print "Joining", x
    thread.join()
Note, specifically, the two comments starting with XXX.
When using the first line (with gevent.spawn), the first thread.join() raises an exception:
Notified!
Joining 0
Traceback (most recent call last):
  File "foo.py", line 30, in <module>
    thread.join()
  File "…/gevent/greenlet.py", line 291, in join
    result = self.parent.switch()
  File "…/gevent/hub.py", line 381, in switch
    return greenlet.switch(self)
gevent.hub.LoopExit: This operation would block forever
However, with Thread(…).start() (the second block), everything works as expected.
Why would this be? What's the difference between gevent.spawn() and Thread(…).start()?
What happens in your code is that the greenlets you created in your threads list haven't yet had a chance to run, because gevent does not trigger a context switch until you do so explicitly (e.g. with gevent.sleep()) or implicitly, by calling a function that blocks (e.g. semaphore.wait()) or by yielding. To see this, insert a print before cv.wait() and notice that it is only executed after cv.notify_all() is called:
def wait_on_cv(x):
    cv.acquire()
    print 'acquired ', x
    cv.wait()
    ....
So an easy fix for your code is to insert something that triggers a context switch after you create your list of greenlets, for example:
...
threads = [ gevent.spawn(wait_on_cv, x) for x in range(10) ]
gevent.sleep() # Trigger a context switch
...
Note: I am still new to gevent, so I don't know if this is the right way to do it :)
This way all the greenlets get a chance to be executed, and each of them triggers a context switch when it calls cv.wait(); in the meantime each greenlet registers itself with the condition's waiters, so that when cv.notify_all() is called it notifies all of them.
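Putting it all together, here is a minimal corrected sketch of the original script (the same code as in the question, with only the gevent.sleep() added after spawning):

from gevent import monkey; monkey.patch_all()
from threading import Condition
import gevent

cv = Condition()

def wait_on_cv(x):
    cv.acquire()
    cv.wait()             # yields to the hub and registers with the condition's waiters
    print "Here:", x
    cv.release()

threads = [ gevent.spawn(wait_on_cv, x) for x in range(10) ]
gevent.sleep()            # let every greenlet run as far as cv.wait()

cv.acquire()
cv.notify_all()           # all 10 greenlets are now registered as waiters
print "Notified!"
cv.release()

for x, thread in enumerate(threads):
    print "Joining", x
    thread.join()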
HTH,
I'm trying to create a networking project using UDP connections. The server I'm creating has to multithread in order to receive multiple commands from multiple clients. However, when trying to multithread the server, only one thread runs. Here is the code:
def action_assigner():
    print('Hello Assign')
    while True:
        if work_queue.qsize() != 0:
            data, client_address, request_number = work_queue.get()
            do_actions(data, client_address, request_number)

def task_putter():
    request_number = 0
    print('Hello Task')
    while True:
        data_received = server_socket.recvfrom(1024)
        request_number += 1
        taskRunner(data_received, request_number)

try:
    thread_task = threading.Thread(target=task_putter())
    action_thread = threading.Thread(target=action_assigner())
    action_thread.start()
    thread_task.start()
    action_thread.join()
    thread_task.join()
except Exception as e:
    server_socket.close()
When running the code, I only get Hello Task as the result, meaning that action_thread never started. Can someone explain how to fix this?
The problem here is that you are calling the functions that should be the "body" of each thread when creating the Threads themselves.
Upon executing the line thread_task = threading.Thread(target=task_putter()) Python first resolves the expression inside the parentheses - it calls the function task_putter, which never returns. None of the subsequent lines in your program ever runs.
What we do when creating threads, and in other calls that take callable objects as arguments, is pass the function itself, not call it (which would run the function and evaluate to its return value).
Just change both lines creating the threads to not put the calling parentheses on the target= argument and you will get past this point:
...
try:
    thread_task = threading.Thread(target=task_putter)
    action_thread = threading.Thread(target=action_assigner)
    ...
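As an aside, if the thread function needs arguments, they are passed separately via the args parameter rather than by calling the function. A small illustrative sketch (the greet function here is hypothetical, not from the question):

import threading

def greet(name):
    print('Hello, ' + name)

# Pass the callable itself; its arguments go in the args tuple.
t = threading.Thread(target=greet, args=('world',))
t.start()
t.join()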
When running using multiprocessing pool, I find that the worker process keeps running past a point where an exception is thrown.
Consider the following code:
import multiprocessing

def worker(x):
    print("input: " + x)
    y = x + "_output"
    raise Exception("foobar")
    print("output: " + y)
    return(y)

def main():
    data = [str(x) for x in range(4)]
    pool = multiprocessing.Pool(1)
    chunksize = 1
    results = pool.map(worker, data, chunksize)
    pool.close()
    pool.join()
    print("Printing results:")
    print(results)

if __name__ == "__main__":
    main()
The output is:
$ python multiprocessing_fail.py
input: 0
input: 1
input: 2
Traceback (most recent call last):
input: 3
  File "multiprocessing_fail.py", line 25, in <module>
    main()
  File "multiprocessing_fail.py", line 16, in main
    results = pool.map(worker, data, 1)
  File "/usr/lib/python2.7/multiprocessing/pool.py", line 251, in map
    return self.map_async(func, iterable, chunksize).get()
  File "/usr/lib/python2.7/multiprocessing/pool.py", line 558, in get
    raise self._value
Exception: foobar
As you can see, the worker process never proceeds beyond raise Exception("foobar") to the second print statement. However, it resumes work at the beginning of function worker() again and again.
I looked for an explanation in the documentation, but couldn't find any. Here is a potentially related SO question:
Keyboard Interrupts with python's multiprocessing Pool
But that is different (about keyboard interrupts not being picked by the master process).
Another SO question:
How to catch exceptions in workers in Multiprocessing
This question is also different, since in it the master process doesn't catch any exception, whereas here the master did catch the exception (line 16). More importantly, in that question the worker did not run past an exception (there is only one executable line for the worker).
I am running Python 2.7.
Comment: Pool should start one worker since the code has pool = multiprocessing.Pool(1).
From the Documentation:
A process pool object which controls a pool of worker processes to which jobs can be submitted
Comment: That one worker is running the worker() function multiple times
From the Documentation:
map(func, iterable[, chunksize])
This method chops the iterable into a number of chunks which it submits to the process pool as separate tasks.
Your worker() is the separate task. Renaming your worker() to task() could help to clarify what is what.
Comment: What I expect is that the worker process crashes at the Exception
It does: the separate task (your worker()) dies, and Pool starts the next task.
What you want is Pool.terminate()
From the Documentation:
terminate()
Stops the worker processes immediately without completing outstanding work.
Question: ... I find that the worker process keeps running past a point where an exception is thrown.
You give iterable data to Pool, therefore Pool does what it has to do:
start len(data) tasks.
data = [str(x) for x in range(4)]
The main question is: what do you expect to happen with
raise Exception("foobar")
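If the intent is to keep the pool going but still find out which items failed, one common pattern (a sketch, not part of the original answer; safe_worker is a hypothetical name) is to catch the exception inside the task and return it as an ordinary value:

import multiprocessing

def safe_worker(x):
    # Wrap the real work so one bad item cannot abort the whole map() call.
    try:
        if x == "2":
            raise Exception("foobar")
        return x + "_output"
    except Exception as e:
        return e  # returned instead of raised, so Pool keeps processing

def main():
    data = [str(x) for x in range(4)]
    pool = multiprocessing.Pool(1)
    results = pool.map(safe_worker, data)
    pool.close()
    pool.join()
    print(results)  # the failed item shows up as an Exception instance

if __name__ == "__main__":
    main()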
I am trying to create some multiprocessing code for my project. I have created a snippet of the things that I want to do. However, it's not working as per my expectations. Can you please let me know what is wrong with it?
from multiprocessing import Process, Pipe
import time

class A:
    def __init__(self,rpipe,spipe):
        print "In the function fun()"

    def run(self):
        print "in run method"
        time.sleep(5)
        message = rpipe.recv()
        message = str(message).swapcase()
        spipe.send(message)

workers = []
my_pipe_1 = Pipe(False)
my_pipe_2 = Pipe(False)
proc_handle = Process(target = A, args=(my_pipe_1[0], my_pipe_2[1],))
workers.append(proc_handle)
proc_handle.run()

my_pipe_1[1].send("hello")
message = my_pipe_2[0].recv()
print message
print "Back in the main function now"
The traceback displayed when I press Ctrl-C:
^CTraceback (most recent call last):
  File "sim.py", line 22, in <module>
    message = my_pipe_2[0].recv()
KeyboardInterrupt
When I run the above code, the main process does not continue after calling proc_handle.run(). Why is this?
You've misunderstood how to use Process. You're creating a Process object, and passing it a class as target, but target is meant to be passed a callable (usually a function) that Process.run then executes. So in your case it's just instantiating A inside Process.run, and that's it.
You should instead make your A class a Process subclass, and just instantiate it directly:
#!/usr/bin/python
from multiprocessing import Process, Pipe
import time

class A(Process):
    def __init__(self,rpipe,spipe):
        print "In the function fun()"
        super(A, self).__init__()
        self.rpipe = rpipe
        self.spipe = spipe

    def run(self):
        print "in run method"
        time.sleep(5)
        message = self.rpipe.recv()
        message = str(message).swapcase()
        self.spipe.send(message)

if __name__ == "__main__":
    workers = []
    my_pipe_1 = Pipe(False)
    my_pipe_2 = Pipe(False)
    proc_handle = A(my_pipe_1[0], my_pipe_2[1])
    workers.append(proc_handle)
    proc_handle.start()

    my_pipe_1[1].send("hello")
    message = my_pipe_2[0].recv()
    print message
    print "Back in the main function now"
mgilson was right, though. You should call start(), not run(), to make A.run execute in a child process.
With these changes, the program works fine for me:
dan#dantop:~> ./mult.py
In the function fun()
in run method
HELLO
Back in the main function now
Taking a stab at this one, I think it's because you're calling proc_handle.run() instead of proc_handle.start().
The former is the activity that the process is going to perform; the latter actually arranges for run to be called in a separate process. In other words, you're never forking the process, so there's no other process for my_pipe_1[1] to communicate with, and it hangs.
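To make the difference visible, here is a small sketch (my own, using os.getpid()) showing that run() executes in the parent process while start() spawns a child:

from multiprocessing import Process
import os

def work():
    print("work() running in pid %d" % os.getpid())

if __name__ == "__main__":
    print("parent pid %d" % os.getpid())

    p1 = Process(target=work)
    p1.run()    # runs work() right here, in the parent process

    p2 = Process(target=work)
    p2.start()  # forks: work() runs in a new child process
    p2.join()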
Often there is a need for the program to wait for a function to complete its work. Sometimes it is the opposite: there is no need for the main program to wait.
I've put a simple example. There are four buttons. Clicking each will call the same calculate() function. The only difference is the way the function is called.
"Call Directly" button calls calculate() function directly. Since there is a 'Function End' print out it is evident that the program is waiting for the calculate function to complete its job.
"Call via Threading" calls the same function this time using threading mechanism. Since the program prints out ': Function End' message immidiately after the button is presses I can conclude the program doesn't wait for calculate() function to complete. How to override this behavior? How to make program wait till calculate() function is finished?
"Call via Multiprocessing" buttons utilizes multiprocessing to call calculate() function.
Just like with threading multiprocessing doesn't wait for function completion. What statement we have to put in order to make it wait?
"Call via Subprocess" buttons doesn't do anything since I didn't figure out the way to hook subprocess to run internal script function or method. It would be interesting to see how to do it...
Example:
import sys
from PyQt4 import QtCore, QtGui

app = QtGui.QApplication(sys.argv)

def calculate(listArg=None):
    print '\n\t Starting calculation...'
    m=0
    for i in range(50000000):
        m+=i
    print '\t ...calculation completed\n'

class Dialog_01(QtGui.QMainWindow):
    def __init__(self):
        super(Dialog_01, self).__init__()
        myQWidget = QtGui.QWidget()
        myBoxLayout = QtGui.QVBoxLayout()

        directCall_button = QtGui.QPushButton("Call Directly")
        directCall_button.clicked.connect(self.callDirectly)
        myBoxLayout.addWidget(directCall_button)

        Button_01 = QtGui.QPushButton("Call via Threading")
        Button_01.clicked.connect(self.callUsingThreads)
        myBoxLayout.addWidget(Button_01)

        Button_02 = QtGui.QPushButton("Call via Multiprocessing")
        Button_02.clicked.connect(self.callUsingMultiprocessing)
        myBoxLayout.addWidget(Button_02)

        Button_03 = QtGui.QPushButton("Call via Subprocess")
        Button_03.clicked.connect(self.callUsingSubprocess)
        myBoxLayout.addWidget(Button_03)

        myQWidget.setLayout(myBoxLayout)
        self.setCentralWidget(myQWidget)
        self.setWindowTitle('Dialog 01')

    def callUsingThreads(self):
        print '------------------------------- callUsingThreads() ----------------------------------'
        import threading
        self.myEvent=threading.Event()
        self.c_thread=threading.Thread(target=calculate)
        self.c_thread.start()
        print "\n\t\t : Function End"

    def callUsingMultiprocessing(self):
        print '------------------------------- callUsingMultiprocessing() ----------------------------------'
        from multiprocessing import Pool
        pool = Pool(processes=3)
        try: pool.map_async( calculate, ['some'])
        except Exception, e: print e
        print "\n\t\t : Function End"

    def callDirectly(self):
        print '------------------------------- callDirectly() ----------------------------------'
        calculate()
        print "\n\t\t : Function End"

    def callUsingSubprocess(self):
        print '------------------------------- callUsingSubprocess() ----------------------------------'
        import subprocess
        print '-missing code solution'
        print "\n\t\t : Function End"

if __name__ == '__main__':
    dialog_1 = Dialog_01()
    dialog_1.show()
    dialog_1.resize(480,320)
    sys.exit(app.exec_())
Use a queue: each thread, when completed, puts its result on the queue, and then you just need to read the appropriate number of results and ignore the remainder:
#!python3.3
import queue  # For Python 2.x use 'import Queue as queue'
import threading, time, random

def func(id, result_queue):
    print("Thread", id)
    time.sleep(random.random() * 5)
    result_queue.put((id, 'done'))

def main():
    q = queue.Queue()
    threads = [ threading.Thread(target=func, args=(i, q)) for i in range(5) ]
    for th in threads:
        th.daemon = True
        th.start()
    result1 = q.get()
    result2 = q.get()
    print("Second result: {}".format(result2))

if __name__=='__main__':
    main()
Documentation for Queue.get() (with no arguments it is equivalent to Queue.get(True, None)):
Queue.get([block[, timeout]])
Remove and return an item from the queue. If optional args block is true and timeout is None (the default), block if necessary until an item is available. If timeout is a positive number, it blocks at most timeout seconds and raises the Empty exception if no item was available within that time. Otherwise (block is false), return an item if one is immediately available, else raise the Empty exception (timeout is ignored in that case).
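For instance, a sketch of the non-default form, which waits at most two seconds and then gives up:

import Queue as queue  # Python 3: just 'import queue'

q = queue.Queue()
try:
    item = q.get(True, 2)  # block for at most 2 seconds
except queue.Empty:
    print("no result arrived within 2 seconds")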
How to wait until only the first thread is finished in Python
You can use the .join() method too.
what is the use of join() in python threading
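A minimal sketch of the join() approach, which blocks the caller until that particular thread has finished:

import threading, time

def func():
    time.sleep(2)

t = threading.Thread(target=func)
t.start()
t.join()  # blocks here until func() returns
print("thread finished")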
I find that using the "pool" submodule within "multiprocessing" works amazingly well for executing multiple processes at once within a Python script.
See Section: Using a pool of workers
Look carefully at "# launching multiple evaluations asynchronously may use more processes" in the example. Once you understand what those lines are doing, the following example I constructed will make a lot of sense.
import numpy as np
from multiprocessing import Pool

def desired_function(option, processes, data, etc...):
    # your code will go here. option allows you to make choices within your script
    # to execute desired sections of code for each pool or subprocess.
    return result_array  # "for example"

result_array = np.zeros("some shape")  # This is normally populated by 1 loop, lets try 4.
processes = 4
pool = Pool(processes=processes)
args = (processes, data, etc...)  # Arguments to be passed into desired function.

multiple_results = []
for i in range(processes):  # Launches each subprocess w/ option (1-4 in this case).
    multiple_results.append(pool.apply_async(desired_function, (i+1,)+args))
results = np.array([res.get() for res in multiple_results])  # Retrieves results after
                                                             # every subprocess is finished!

for i in range(processes):
    result_array = result_array + results[i]  # Combines all datasets!
The code will basically run the desired function for a set number of processes. You will have to make sure your function can distinguish between the processes (hence why I added the variable "option"). Additionally, it doesn't have to be an array that is being populated in the end, but for my example that's how I used it. Hope this simplifies things or helps you better understand the power of multiprocessing in Python!
I'm running two Python threads (import threading). Both of them are blocked on an open() call; in fact, they try to open named pipes in order to write to them, so it's normal behaviour to block until somebody tries to read from the named pipe.
In short, it looks like:
import threading

def f():
    open('pipe2', 'r')

if __name__ == '__main__':
    t = threading.Thread(target=f)
    t.start()
    open('pipe1', 'r')
When I type a ^C, the open() in the main thread is interrupted (raises IOError with errno == 4).
My problem is: the t thread still waits, and I'd like to propagate the interruption behaviour, in order to make it raise IOError too.
I found this in the Python docs:

"... only the main thread can set a new signal handler, and the main thread will be the only one to receive signals (this is enforced by the Python signal module, even if the underlying thread implementation supports sending signals to individual threads). This means that signals can’t be used as a means of inter-thread communication. Use locks instead."
Maybe you should also check these docs:
exceptions.KeyboardInterrupt
library/signal.html
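Since only the main thread receives signals, one workable pattern (a sketch of my own, assuming the worker can poll a flag instead of sitting in a blocking open(); the select-based answer below tackles the blocking open() itself) is to install a SIGINT handler in the main thread that sets an event the workers check:

import signal
import threading

stop = threading.Event()

def handler(signum, frame):
    stop.set()  # the main thread got SIGINT; tell the workers to wind down

def worker():
    while not stop.is_set():
        stop.wait(0.1)  # poll; real work would go here
    print("worker exiting")

if __name__ == '__main__':
    signal.signal(signal.SIGINT, handler)
    t = threading.Thread(target=worker)
    t.start()
    while t.is_alive():
        t.join(0.5)  # short timeouts keep the main thread responsive to SIGINT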
One other idea is to use select to read the pipe asynchronously in the threads. This works in Linux, not sure about Windows (it's not the cleanest, nor the best implementation):
#!/usr/bin/python
import threading
import os
import select

def f():
    f = os.fdopen(os.open('pipe2', os.O_RDONLY|os.O_NONBLOCK))
    finput = [ f ]
    foutput = []
    # here the pipe is scanned and whatever gets in will be printed out
    # ...as long as 'getout' is False
    while finput and not getout:
        # the timeout lets the loop re-check 'getout' periodically
        fread, fwrite, fexcep = select.select(finput, foutput, finput, 1.0)
        for q in fread:
            if q in finput:
                s = q.read()
                if len(s) > 0:
                    print s
if __name__ == '__main__':
    getout = False
    t = threading.Thread(target=f)
    t.start()

    try:
        open('pipe1', 'r')
    except:
        getout = True