Python, Multiprocessing and GUI - python

I have a program with a GUI that needs to do some multiprocessing. The point of it is to avoid freezing the GUI, and to allow the user using some other buttons while the process is running.
I'd like then to define a method like the following:
def start_waiting(self, parent, func, args):
self.window_waiting.show()
task=Process(func(*args))
task.start()
task.join()
# Some code to execute at the end of the process
#
#...
The problem is that join() is not working, and I need it because the code executed after join() indicates when the Process has ended. I'll use that code to update a cancel button of the window_waiting.
I then came in mind with another solution to avoid usingjoin(), I replaced it with:
while task.is_alive():
time.sleep(0.5)
But it didn't worked neither, so I tried a plan C, which was to create a queue:
def worker(input, output):
for func, args in iter(input.get, 'STOP'):
result=func(*args)
output.put(result)
task_queue = Queue()
done_queue = Queue()
task_queue.put(task)
Process(target=worker, args=(task_queue, done_queue)).start()
done_queue.get()
The last code gave me the error: 'Pickling an AuthenticationString object is '
TypeError: Pickling an AuthenticationString object is disallowed for security reasons
Which lead me to multiprocessing.Process subclass using shared queue but I haven't managed to solve the problem :/

Your first example should look like this:
def start_waiting(self,parent,func,args):
...not relevant code..
self.window_waiting.show()
task=Process(target=func, args=args) # This line is different.
task.start()
task.join()
The way you had it, it wasn't actually executing func in a child process; it was executing it in the parent, and then passing the return value to Process. When you called task.start(), it would probably instantly fail, because you were passing it whatever func returned, rather than a function object.
Note that because you're calling task.join() inside start_waiting, it's probably going to block your GUI, because start_waiting won't return until func completes, even though its running in a child process. The only way it wouldn't block is if you're running start_waiting in a separate thread from the GUI's event loop. You probably want something more like this:
def start_waiting(self,parent,func,args):
...not relevant code..
self.window_waiting.show() # You should interact with the GUI in the main thread only.
self.task = Process(target=func, args=args) # This line is different.
self.thrd = threading.Thread(target=self.start_and_wait_for_task)
self.thrd.start()
def start_and_wait_for_task(self):
""" This runs in its own thread, so it won't block the GUI. """
self.task.start()
self.task.join()
# If you need to update the GUI at all here, use GLib.idle_add (see https://wiki.gnome.org/Projects/PyGObject/Threading)
# This is required to safely update the GUI from outside the main thread.

Related

Stop a long-running action in web2py with multiprocessing

I have a web2py application that basically serves as a browser interface for a Python script. This script usually returns pretty quickly, but can occasionally take a long time. I want to provide a way for the user to stop the script's execution if it takes too long.
I am currently calling the function like this:
def myView(): # this function is called from ajax
session.model = myFunc() # myFunc is from a module which i have complete control over
return dict(model=session.model)
myFunc, when called with certain options, uses multiprocessing but still ends up taking a long time. I need some way to terminate the function, or at the very least the thread's children.
The first thing i tried was to run myFunc in a new process, and roll my own simple event system to kill it:
# in the controller
def myView():
p_conn, c_conn = multiprocessing.Pipe()
events = multiprocessing.Manager().dict()
proc = multiprocessing.Process(target=_fit, args=(options, events c_conn))
proc.start()
sleep(0.01)
session.events = events
proc.join()
session.model = p_conn.recv()
return dict(model=session.model)
def _fit(options, events pipe):
pipe.send(fitting.logistic_fit(options=options, events=events))
pipe.close()
def stop():
try:
session.events['kill']()
except SystemExit:
pass # because it raises that error intentionally
return dict()
# in the module
def kill():
print multiprocessing.active_children()
for p in multiprocessing.active_children():
p.terminate()
raise SystemExit
def myFunc(options, events):
events['kill'] = kill
I ran into a few major problems with this.
The session in stop() wasn't always the same as the session in myView(), so session.events was None.
Even when the session was the same, kill() wasn't properly killing the children.
The long-running function would hang the web2py thread, so stop() wasn't even processed until the function finished.
I considered not calling join() and using AJAX to pick up the result of the function at a later time, but I wasn't able to save the process object in session for later use. The pipe seemed to be able to be pickled, but then I had the problem with not being able to access the same session from another view.
How can I implement this feature?
For long running tasks, you are better off queuing them via the built-in scheduler. If you want to allow the user to manually stop a task that is taking too long, you can use the scheduler.stop_task(ref) method (where ref is the task id or uuid). Alternatively, when you queue a task, you can specify a timeout, so it will automatically stop if not completed within the timeout period.
You can do simple Ajax polling to notify the client when the task has completed (or implement something more sophisticated with websockets or SSE).

Process invoked from python thread keeps on running on closing main process

I have a Tkinter wrapper written over robocopy.exe, the code is organized as shown below :-
Tkinter Wrapper :
Spawns a new thread and pass the arguments, which includes source/destination and other parameters
(Note : Queue object is also passed to thread, since the thread will read the output from robocopy and will put in queue, main tkinter thread will keep on polling queue and will update Tkinter text widget with the output)
Code Snippet
... Code to poll queue and update tk widget ...
q = Queue.Queue()
t1 = threading.Thread(target=CopyFiles,args=(q,src,dst,), kwargs={"ignore":ignore_list})
t1.daemon = True
t1.start()
Thread : (In a separate file)
Below is the code snippet from thread
def CopyFiles(q,src,dst,ignore=None):
extra_args = ['/MT:15', '/E', '/LOG:./log.txt', '/tee', '/r:2', '/w:2']
if len(ignore) > 0:
extra_args.append('/xf')
extra_args.extend(ignore)
extra_args.append('/xd')
extra_args.extend(ignore)
command_to_pass = ["robocopy",src, dst]
command_to_pass.extend(extra_args)
proc = subprocess.Popen(command_to_pass,stdout=subprocess.PIPE)
while True:
line = proc.stdout.readline()
if line == '':
break
q.put(line.strip())
Code which is called, when tkinter application is closed :-
def onQuit(self):
global t1
if t1.isAlive():
pass
if tkMessageBox.askyesno("Title", "Do you really want to exit?"):
self.destroy()
self.master.destroy()
Problem
Whenever, I close the tkinter application when robocopy is running, python application closes but the robocopy.exe keeps on running.
I have tried setting the thread as daemon, but it has no effect. How can I stop robocopy.exe when onQuit method is called?
To simplify things let's ignore Tkinter and the fact that a separate thread is used.
The situation is that your app spawns a subprocess to execute an external program (robocopy.exe in this question), and you'd need to stop the spawned program from you're application on a certain event (when the Tkinter app is closing in this question).
This requires an inter process communication mechanism, so the spawned process would be notified of the event, and reacts accordingly. A common mechanism is to use signals provided by the OS.
You could send a signal (SIGTERM) to the external process and ask from it to quit. Assuming that the program reacts to signal as expected (most well written applications do) you'll get the desired behavior (the process will terminate).
Using the terminate method on the subprocess will send the proper signal of current platform to the subprocess.
You'd need a reference to the subprocess object proc in the onQuit function (from the provided code I see onQuit is a function and not an object method, so it could use a global variable to access proc), so you can call the terminate method of the process:
def onQuit(self):
global t1, proc
if t1.isAlive():
pass
# by the way I'm not sure about the logic, but I guess this
# below statement should be an elif instead of if
if tkMessageBox.askyesno("Title", "Do you really want to exit?"):
proc.terminate()
self.destroy()
self.master.destroy()
This code assumes you're storing the reference to the subprocess in the global scope, so you'd have to modify the CopyFiles as well.
I'm not sure how robocopy handles terminate signals and I'm guessing that's not something that we have any control over.
If you had more control on the external program (could modify the source), there might have been more options, for example sending messages using stdio, or using a shared memory, etc.

python simple thread not working

I'm trying to use a new thread or multiprocessing to run a function.
The function is called like this:
Actualize_files_and_folders(self)
i've read alot about multiprocessing and threads, and looking the questions at StackOverflow but i can't make it work... It would be great to have some help >.<
im calling the function with a button.
def on_button_act_clicked(self, menuitem, data=None):
self.window_waiting.show()
Actualize_files_and_folders(self)
self.window_waiting.hide()
In the waiting_window i have a button called 'cancel', it would be great if i can have a command/function that kills the thread.
i've tryed a lot of stuff, for exemple:
self.window_waiting.show()
from multiprocessing import Process
a=Process(Target=Actualize_files_and_folders(self))
a.start()
a.join()
self.window_waiting.hide()
But the window still freezing, and window_waiting is displayed at the end of Actualize_files_and_folders(self), like if i had called a normal function.
Thanks so much for any help!!
It looks like the worker function is being called rather than used as a callback for the process target:
process = Process(target=actualize_files_and_folders(self))
This is essentially equivalent to:
tmp = actualize_files_and_folders(self)
process = Process(target=tmp)
So the worker function is called in the main thread blocking it. The result of this function is passed into the Process as the target which will do nothing if it is None. You need to pass the function itself as a callback, not its result:
process = Process(target=actualize_files_and_folders, args=[self])
process.start()
See: https://docs.python.org/2/library/multiprocessing.html

Function within Worker/Child instance does not return, freezes program

I am using the multiprocessing module in python. Here is a sample of the code I am using:
import multiprocessing as mp
def function(fun_var1, fun_var2):
b = fun_var1 + fun_var2
# and more computationally intensive stuff happens here
return b
# my program freezes after the return command
class Worker(mp.Process):
def __init__(self, queue_obj, func_var1, func_var2):
mp.Process.__init__(self)
self.queue_obj = queue_obj
self.func_var1 = func_var1
self.func_var2 = func_var2
def run(self):
self.var = function( self.func_var1, self.func_var2 )
self.queue_obj.put(self.var)
if __name__ == '__main__':
mp.freeze_support()
queue_list = []
processes = []
result = []
for i in range(2):
queue_list.append(mp.Queue())
processes.append( Worker(queue_list[i], i, var1, var2 )
processes[i].start()
for i in range(2):
processes[i].join()
result.append(queue_list[i].get())
During runtime of the program two instances of the worker class are generated which work simultaneously. One instance finishes after about 2 minutes and the other would take about 7 minutes. The first instance returns its results fine. However, the second instance freezes the program when the function() that is called within the run() method returns its value. No error is being thrown, the program just does not continue to execute. The console also indicates that it is busy but not displaying the >>> prompt. I am completely clueless why this behavior occurs. The same code works fine for slightly different inputs in the two Worker instances. The only difference I can make out is that the work loads are more equal when it executes correctly. Could the time difference cause trouble? Does anyone have experience with this kind of behavior? Also note that if I run a serial setup of the program in which function() is just called twice by the main program, the code executes flawlessly. Could there be some timeout involved in the worker instance that makes it impossible for function() to return its value to the Worker instance? The return value of function() is actually a list that is fairly small. It contains about 100 float values.
Any suggestions are welcomed!
This is a bit of an educated guess without actually seeing what's going on in worker, but is it possible that your child has put items into the Queue that haven't been consumed? The documentation has a warning about this:
Warning
As mentioned above, if a child process has put items on a queue (and
it has not used JoinableQueue.cancel_join_thread), then that process
will not terminate until all buffered items have been flushed to the
pipe.
This means that if you try joining that process you may get a deadlock
unless you are sure that all items which have been put on the queue
have been consumed. Similarly, if the child process is non-daemonic
then the parent process may hang on exit when it tries to join all its
non-daemonic children.
Note that a queue created using a manager does not have this issue.
See Programming guidelines.
It might be worth trying to create your Queue object using mp.Manager.Queue and see if the issue goes away.

Python, non-blocking threads

There are a lot of tutorials etc. on Python and asynchronous coding techniques, but I am having difficulty filtering the through results to find what I need. I am new to Python, so that doesn't help.
Setup
I currently have two objects that look sort of like this (please excuse my python formatting):
class Alphabet(parent):
def init(self, item):
self.item = item
def style_alphabet(callback):
# this method presumably takes a very long time, and fills out some properties
# of the Alphabet object
callback()
class myobj(another_parent):
def init(self):
self.alphabets = []
refresh()
def foo(self):
for item in ['a', 'b', 'c']:
letters = new Alphabet(item)
self.alphabets.append(letters)
self.screen_refresh()
for item in self.alphabets
# this is the code that I want to run asynchronously. Typically, my efforts
# all involve passing item.style_alphabet to the async object / method
# and either calling start() here or in Alphabet
item.style_alphabet(self.screen_refresh)
def refresh(self):
foo()
# redraw screen, using the refreshed alphabets
redraw_screen()
def screen_refresh(self):
# a lighter version of refresh()
redraw_screen()
The idea is that the main thread initially draws the screen with incomplete Alphabet objects, fills out the Alphabet objects, updating the screen as they complete.
I've tried a number of implementations of threading.Tread, Queue.Queue, and even futures, and for some reason they either haven't worked, or they have blocked the main thread. so that the initial draw doesn't take place.
A few of the async methods I've attempted:
class Async (threading.Thread):
def __init__(self, f, cb):
threading.Thread.__init__(self)
self.f = f
self.cb = cb
def run(self):
self.f()
self.cb()
def run_as_thread(f):
# When I tried this method, I assigned the callback to a property of "Alphabet"
thr = threading.Thread(target=f)
thr.start()
def run_async(f, cb):
pool = Pool(processes=1)
result = pool.apply_async(func=f, args=args, callback=cb)
I ended up writing a thread pool to deal with this use pattern. Try creating a queue and handing a reference off to all the worker threads. Add task objects to the queue from the main thread. Worker threads pull objects from the queue and invoke the functions. Add an event to each task to be signaled on the worker thread at task completion. Keep a list of task objects on the main thread and use polling to see if the UI needs an update. One can get fancy and add a pointer to a callback function on the task objects if needed.
My solution was inspired by what I found on Google: http://code.activestate.com/recipes/577187-python-thread-pool/
I kept improving on that design to add features and give the threading, multiprocessing, and parallel python modules a consistent interface. My implementation is at:
https://github.com/nornir/nornir-pools
Docs:
http://nornir.github.io/packages/nornir_pools.html
If you are new to Python and not familiar with the GIL I suggest doing a search for Python threading and the global interpreter lock (GIL). It isn’t a happy story. Generally I find I need to use the multiprocessing module to get decent performance.
Hope some of this helps.

Categories

Resources