How do I run a Python method as a subprocess?

I need help with a Python project:
Example:
class MyFrame(wx.Frame):

    def __init__(self, parent, title):
        super(MyFrame, self).__init__(parent, title=title, size=(330, 300))
        self.InitUI()
        self.Centre()
        self.Show()

    def InitUI(self):
        """
        Subprocess
        """
        subprocess.execMethodFromClass( self , 'Connection' , args1 , args2 , ... )

    def Connection( self ):
        self.connection = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        self.connection.connect(( '192.0.1.135' , 3345 ))
        while True:
            data = self.connection.recv(1024)
            if not data:
                break
            else:
                print data
Show:
subprocess.execMethodFromClass( self , 'Connection' , args1 , args2 , ... )
Thanks!

As the friendly dogcow says, to run a function in a child process, all you have to do is use a multiprocessing.Process:
p = multiprocessing.Process(target=f, args=('bob',))
p.start()
p.join()
Of course you probably want to hang onto p and join it later in most* real-life use cases. You're obviously not getting any parallelism by spawning a new process just to make your main process sit around and wait.
So, in your case, that's just:
p = multiprocessing.Process(target=self.Connection, args=(args1, args2))
But this probably won't work in your case, because you're trying to call a method on the current self object.
First, depending on your platform and Python version, multiprocessing may have to pass the bound method self.Connection to the child by pickling it and sending it over a pipe. This involves pickling the self instance as well as the method. So it will only work if MyFrame objects are pickleable. And I'm pretty sure that a wx.Frame can't be pickled.
And even if you do get the self object to the child, it will obviously be a copy, not a shared instance. So, when the child process's Connection method sets self.connection = …, that won't affect the original parent process's self.
Even worse if you try to call any wx.Frame methods. Even if all the Python stuff worked, on most platforms, trying to modify GUI resources like windows from the wrong process will not work.
The only kinds of objects you can actually share are the kinds you can put in multiprocessing.Value or multiprocessing.sharedctypes.
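For instance, a minimal sketch of sharing a counter through multiprocessing.Value (the worker function here is made up purely for illustration):

import multiprocessing

def worker(counter):
    # The synchronized wrapper's lock makes the increment safe
    with counter.get_lock():
        counter.value += 1

if __name__ == '__main__':
    counter = multiprocessing.Value('i', 0)   # a shared C int
    procs = [multiprocessing.Process(target=worker, args=(counter,)) for _ in range(4)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
    print(counter.value)   # 4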
The way around this is to factor out the code you want to childify into a separate, isolated function, that shares as little as possible (ideally nothing, or nothing but a Queue or Pipe) with the parent.
For your example, this is easy:
import multiprocessing
import socket

class Client(object):
    def connect_and_fetch(self):
        self.connection = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        self.connection.connect(('192.0.1.135', 3345))
        while True:
            data = self.connection.recv(1024)
            if not data:
                break
            else:
                print data

def do_client():
    client = Client()
    client.connect_and_fetch()

class MyFrame(wx.Frame):
    # ...
    def Connection(self):
        self.child = multiprocessing.Process(target=do_client)
        self.child.start()
        # and now put a self.child.join() somewhere
In fact, you don't even need a class at all here, because the only use you have for self is to store a variable that could just as easily be a local. But I'm guessing in your real-life program, there's a bit more state than that.
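If the child does need to hand data back, a multiprocessing.Queue is usually the simplest channel. A rough sketch under that assumption (the host, port, and polling strategy are placeholders):

import multiprocessing
import socket

def do_client(q):
    connection = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    connection.connect(('192.0.1.135', 3345))
    while True:
        data = connection.recv(1024)
        if not data:
            break
        q.put(data)        # hand each chunk back to the parent
    q.put(None)            # sentinel so the parent knows we're done

# in MyFrame.Connection:
#     self.queue = multiprocessing.Queue()
#     self.child = multiprocessing.Process(target=do_client, args=(self.queue,))
#     self.child.start()
#     # then poll self.queue (e.g. from a wx.Timer) with get_nowait()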
There's an interesting (if a bit outdated) example on the wxpython wiki, called MultiProcessing, which looks like it does most of what you want and more. (It's using a classmethod for the child process instead of a standalone function for some reason, and using old-style syntax for it because it's old, but hopefully it's still helpful.)
If you're using wx for your GUI, you may want to consider using its inter-process mechanisms instead of the native Python ones. While it's more complicated and less pythonic in the general case, when you're trying to integrate a child process and its communications pipe into your main event loop, why not let wx take care of it?
The alternative is to create a thread to wait on the child process and/or whatever Pipe or Queue you give it, and then create and post wx.Events to the main thread.
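A rough sketch of that watcher-thread idea, using wx.CallAfter (which is safe to call from a non-GUI thread) to get results back onto the main thread; on_data is a hypothetical handler on the frame:

import threading
import wx

def watch(queue, frame):
    # Runs in an ordinary thread inside the parent process
    while True:
        data = queue.get()                 # blocks on the multiprocessing.Queue
        if data is None:                   # sentinel from the child process
            break
        wx.CallAfter(frame.on_data, data)  # schedule the GUI update on the main thread

# in MyFrame.Connection, after starting the child process:
#     threading.Thread(target=watch, args=(self.queue, self)).start()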
* Most, not all. For example, if f temporarily uses up a whole lot of memory, running it in a child process means you release that memory to the OS as quickly as possible. Or, if it calls badly-written third-party/legacy/whatever code that has nasty and poorly-documented global side-effects, you're isolated from those side-effects. And so on.

From http://docs.python.org/dev/library/multiprocessing.html:
from multiprocessing import Process

def f(name):
    print('hello', name)

if __name__ == '__main__':
    p = Process(target=f, args=('bob',))
    p.start()
    p.join()

You can't. You use subprocess to call another application or script to run in a separate process.
subprocess.Popen(cmds)
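For example, a minimal sketch of launching a separate script in its own process (worker.py is just a placeholder name):

import subprocess
import sys

# Run another Python script as a separate process
proc = subprocess.Popen([sys.executable, 'worker.py', 'arg1'])
# ... do other work while it runs ...
proc.wait()    # or proc.poll() to check without blocking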
If you need to run some long running process, look into threads or the multiprocessing module. Here are some links:
http://docs.python.org/2/library/multiprocessing.html
http://wiki.wxpython.org/LongRunningTasks
http://www.blog.pythonlibrary.org/2012/08/03/python-concurrency-porting-from-a-queue-to-multiprocessing/
http://www.blog.pythonlibrary.org/2010/05/22/wxpython-and-threads/

Related

Shared pool map between processes with object-oriented python

(python2.7)
I'm trying to build a kind of scanner that has to walk through CFG nodes and split into different processes on branching, for parallelism purposes.
The scanner is represented by an object of class Scanner. This class has one method traverse that walks through the graph and splits if necessary.
Here how it looks:
class Scanner(object):
    def __init__(self, atrb1, ...):
        self.attribute1 = atrb1
        self.process_pool = Pool(processes=4)

    def traverse(self, ...):
        [...]
        if branch:
            self.process_pool.map(my_func, todo_list)
My problem is the following:
How do I create an instance of multiprocessing.Pool that is shared between all of my processes? I want it to be shared because, since a path can be split again, I do not want to end up with a kind of fork bomb, and having the same Pool will help me limit the number of processes running at the same time.
The above code does not work, since a Pool cannot be pickled. As a consequence, I have tried this:
class Scanner(object):
    def __getstate__(self):
        self_dict = self.__dict__.copy()
        del self_dict['process_pool']
        return self_dict
    [...]
[...]
But obviously, it results in having self.process_pool not defined in the created processes.
Then, I tried to create a Pool as a module attribute:
process_pool = Pool(processes=4)

def my_func(x):
    [...]

class Scanner(object):
    def __init__(self, atrb1, ...):
        self.attribute1 = atrb1

    def traverse(self, ...):
        [...]
        if branch:
            process_pool.map(my_func, todo_list)
It does not work, and this answer explains why.
But here comes the thing: wherever I create my Pool, something is missing. If I create this Pool at the end of my file, it does not see self.attribute1 (the same problem as in the linked answer) and fails with an AttributeError.
I'm not even trying to share it yet, and I'm already stuck with multiprocessing's way of doing things.
I don't know if I have been thinking about the whole thing correctly, but I cannot believe it's so complicated to handle something as simple as "having a worker pool and giving it tasks".
Thank you,
EDIT:
I resolved my first problem (the AttributeError): my class had a callback as an attribute, and this callback was defined in the main script file, after the import of the scanner module... But the concurrency and "do not fork bomb" thing is still a problem.
What you want to do can't be done safely. Think about what happens if you somehow had a single shared Pool shared across parent and worker processes, with, say, two worker processes. The parent runs a map that tries to perform two tasks, and each task needs to map two more tasks. The two parent-dispatched tasks go to each worker, and the parent blocks. Each worker sends two more tasks to the shared pool and blocks for them to complete. But all workers are now occupied, waiting for a worker to become free; you've deadlocked.
A safer approach would be to have the workers return enough information to dispatch additional tasks in the parent. Then you could do something like:
import collections
import multiprocessing

class MoreWork(object):
    def __init__(self, func, *args):
        self.func = func
        self.args = args

pool = multiprocessing.Pool()
try:
    base_task = somefunc, someargs
    outstanding = collections.deque([pool.apply_async(*base_task)])
    while outstanding:
        result = outstanding.popleft().get()
        if isinstance(result, MoreWork):
            outstanding.append(pool.apply_async(result.func, result.args))
        else:
            ...  # do something with a "final" result, maybe breaking the loop
finally:
    pool.terminate()
What the functions are is up to you; they'd just return information in a MoreWork when there was more to do, not launch a task directly. The point is to ensure that by having the parent be solely responsible for task dispatch, and the workers solely responsible for task completion, you can't deadlock due to all workers being blocked waiting for tasks that are in the queue but not being processed.
This is also not at all optimized; ideally, you wouldn't block waiting on the first item in the queue if other items in the queue were complete; it's a lot easier to do this with the concurrent.futures module, specifically with concurrent.futures.wait to wait on the first available result from an arbitrary number of outstanding tasks, but you'd need a third party PyPI package to get concurrent.futures on Python 2.7.
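A rough sketch of that idea with concurrent.futures (on 2.7 this needs the futures backport from PyPI; the task function below is made up purely for illustration):

import concurrent.futures

def task(n):
    # Return either more work to schedule or a final value
    if n < 3:
        return ('more', [n + 1, n + 2])
    return ('done', n)

if __name__ == '__main__':
    with concurrent.futures.ProcessPoolExecutor() as pool:
        pending = {pool.submit(task, 0)}
        while pending:
            done, pending = concurrent.futures.wait(
                pending, return_when=concurrent.futures.FIRST_COMPLETED)
            for future in done:
                kind, payload = future.result()
                if kind == 'more':
                    pending |= {pool.submit(task, n) for n in payload}
                else:
                    print(payload)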

Multiprocessing in Python with shared memory

I have an object that connects to a remote WebSocket server. I need to run another task in parallel at the same time. However, I don't want to create a new connection to the server. Since threads are the easier way to do this, that is what I have been using so far. However, I have been getting huge latency because of the GIL. Can I achieve the same thing as threads, but with multiple processes in parallel?
This is the code that I have:
class WebSocketApp(object):
    def on_open(self):
        # Create another thread to make sure the commands are always being read
        print "Creating thread..."
        try:
            thread.start_new_thread(self.read_commands, ())
        except:
            print "Error: Unable to start thread"
Is there an equivalent way to do this with multiprocesses?
Thanks!
The direct equivalent is
import multiprocessing

class WebSocketApp(object):
    def on_open(self):
        # Create another process to make sure the commands are always being read
        print "Creating process..."
        try:
            multiprocessing.Process(target=self.read_commands).start()
        except:
            print "Error: Unable to start process"
However, this doesn't address the "shared memory" aspect, which has to be handled a little differently than it is with threads, where you can just use global variables. You haven't really specified what objects you need to share between processes, so it's hard to say exactly what approach you should take. The multiprocessing documentation does cover ways to deal with shared state, however. Do note that in general it's better to avoid shared state if possible, and just explicitly pass state between the processes, either as an argument to the Process constructor or via something like a Queue.
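For example, a minimal sketch of passing data back through a Queue instead of sharing the object (read_commands here is assumed to take the queue as an argument):

import multiprocessing

def read_commands(queue):
    # Child process: push whatever it reads onto the queue
    queue.put('some command')

if __name__ == '__main__':
    queue = multiprocessing.Queue()
    p = multiprocessing.Process(target=read_commands, args=(queue,))
    p.start()
    print(queue.get())   # the parent receives 'some command'
    p.join()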
You sure can; use something along the lines of:
from multiprocessing import Process

class WebSocketApp(object):
    def on_open(self):
        # Create another process to make sure the commands are always being read
        print "Creating process..."
        try:
            p = Process(target=WebSocketApp.read_commands, args=(self,))  # Add other arguments to this tuple
            p.start()
        except:
            print "Error: Unable to start process"
It is important to note, however, that as soon as the object is sent to the other process, the two self objects in the different processes diverge and represent different objects. If you wish to communicate, you will need to use something like the Queue or Pipe included in the multiprocessing module.
You may need to keep a reference to all the processes (p in this case) in your main thread in order to be able to communicate that your program is terminating (a still-running child process will appear to hang the parent when the parent exits), but that depends on the nature of your program.
If you wish to keep the object the same, you can do one of a few things:
Make all of your object properties either single values or arrays and then do something similar to this:
from multiprocessing import Process, Value, Array

class WebSocketApp(object):
    def __init__(self):
        self.my_value = Value('d', 0.3)
        self.my_array = Array('i', [4, 10, 4])
    # -- Snip --
And then these values should work as shared memory. The types are very restrictive, though (you must specify their types).
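A minimal sketch of how the child would then use them, passing the shared objects to the process directly rather than the whole WebSocketApp:

from multiprocessing import Process, Value, Array

def worker(val, arr):
    val.value = 2.7     # the change is visible in the parent
    arr[0] = 42

if __name__ == '__main__':
    v = Value('d', 0.3)
    a = Array('i', [4, 10, 4])
    p = Process(target=worker, args=(v, a))
    p.start()
    p.join()
    print(v.value, a[:])   # 2.7 [42, 10, 4]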
A different answer is to use a manager:
from multiprocessing import Process, Manager

class WebSocketApp(object):
    def __init__(self):
        self.my_manager = Manager()
        self.my_list = self.my_manager.list()
        self.my_dict = self.my_manager.dict()
    # -- Snip --
And then self.my_list and self.my_dict act as a shared-memory list and dictionary respectively.
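A minimal sketch of the Manager variant, again passing the proxies to the child process:

from multiprocessing import Process, Manager

def worker(shared):
    shared['status'] = 'connected'   # visible in the parent via the manager

if __name__ == '__main__':
    manager = Manager()
    shared = manager.dict()
    p = Process(target=worker, args=(shared,))
    p.start()
    p.join()
    print(shared['status'])   # 'connected'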
However, the types for both of these approaches can be restrictive, so you may have to roll your own technique with a Queue and a Semaphore. But it depends on what you're doing.
Check out the multiprocessing documentation for more information.

Timeout on a function

Let us say we have a Python function magical_attack(energy) which may or may not take more than a second; it could even be an infinite loop. How would I run it, but terminate it if it goes over a second and tell the rest of the program? I am looking for a sleek module to do this. Example:
import timeout
try:
    timeout.run(magical_attack(5), 1)
except timeout.timeouterror:
    blow_up_in_face(wizard)
Note: It is impossible to modify the function. It comes from the outside during runtime.
The simplest way to do this is to run the background code in a thread:
import threading

t = threading.Thread(target=magical_attack, args=(5,))
t.start()
t.join(1)
if t.isAlive():
    blow_up_in_face(wizard)
However, note that this will not cancel the magical_attack function; it could still keep spinning along in the background for as long as it wants even though you no longer care about the results.
Canceling threads safely is inherently hard to do, and different on each platform, so Python doesn't attempt to provide a way to do it. If you need that, there are three alternatives:
If you can edit the code of magical_attack to check a flag every so often, you can cancel it cooperatively by just setting that flag.
You can use a child process instead of a thread, which you can then kill safely (see the sketch after this list).
You can use ctypes, pywin32, PyObjC, etc. to access platform-specific routines to kill the thread. But you have to really know what you're doing to make sure you do it safely, and don't confuse Python in doing it.
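A rough sketch of the child-process approach for this timeout case (magical_attack and blow_up_in_face are the question's own placeholders):

import multiprocessing

p = multiprocessing.Process(target=magical_attack, args=(5,))
p.start()
p.join(1)              # wait at most one second
if p.is_alive():       # still running: it overran the timeout
    p.terminate()      # unlike a thread, a process can be killed safely
    p.join()
    blow_up_in_face(wizard)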
As Chris Pak pointed out, the futures module in Python 3.2+ makes this even easier. For example, you can throw off thousands of jobs without having thousands of threads; you can apply timeouts to a whole group of jobs as if they were a single job; etc. Plus, you can switch from threads to processes with a trivial one-liner change. Unfortunately, Python 2.7 does not have this module—but there is a quasi-official backport that you can install and use just as easily.
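For example, a minimal sketch with futures (the worker thread itself still keeps running after the timeout, just as in the plain-thread version):

import concurrent.futures

executor = concurrent.futures.ThreadPoolExecutor(max_workers=1)
future = executor.submit(magical_attack, 5)
try:
    future.result(timeout=1)            # raises TimeoutError after one second
except concurrent.futures.TimeoutError:
    blow_up_in_face(wizard)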
Abamert beat me to the answer I was preparing, except for this detail:
If, and only if, the outside function is executed through the Python interpreter, even though you can't change it (for example, from a compiled module), you might be able to use the technique described in this other question to kill the thread that calls that function using an exception.
Is there any way to kill a Thread in Python?
Of course, if you did have control over the function you were calling, the StoppableThread class from that answer works well for this:
import threading

class StoppableThread(threading.Thread):
    """Thread class with a stop() method. The thread itself has to check
    regularly for the stopped() condition."""

    def __init__(self):
        super(StoppableThread, self).__init__()
        self._stop = threading.Event()

    def stop(self):
        self._stop.set()

    def stopped(self):
        return self._stop.isSet()

class Magical_Attack(StoppableThread):
    def __init__(self, enval):
        self._energy = enval
        super(Magical_Attack, self).__init__()

    def run(self):
        while True and not self.stopped():
            print self._energy

if __name__ == "__main__":
    a = Magical_Attack(5)
    a.start()
    a.join(5.0)
    a.stop()

A couple of questions about calling methods outside of QThread from within the QThread - is my design flawed?

I have an application that has a GUI thread and many different worker threads. In this application, I have a functions.py module, which contains a lot of different "utility" functions that are used all over the application.
Yesterday the application was released, and some users (a minority, but still) have reported problems with the application crashing. I looked over my code and noticed a possible design flaw, and would like to check with the lovely people of SO and see if I am right and if this is indeed a flaw.
Suppose I have this defined in my functions.py module:
class Functions:
    solveComputationSignal = Signal(str)
    updateStatusSignal = Signal(int, str)
    text = None

    @classmethod
    def setResultText(self, text):
        self.text = text

    @classmethod
    def solveComputation(cls, platform, computation, param=None):
        # Not the entirety of the method is listed here
        result = urllib.urlopen(COMPUTATION_URL).read()
        if param is None:
            cls.solveComputationSignal.emit(result)
        else:
            cls.solveAlternateComputation(platform, computation)
        while not self.text:
            time.sleep(3)
        return self.text if self.text else False

    @classmethod
    def updateCurrentStatus(cls, platform, statusText):
        cls.updateStatusSignal.emit(platform, statusText)
I think these methods in themselves are fine. The two signals defined here are connected to in the GUI thread. The first signal pops-up a dialog in which the computation is presented. The GUI thread calls the setResultText() method and sets the resulting string as entered by the user (if anyone knows of a better way to wait until the user has inputted the text other than sleeping and waiting for self.text to become True, please let me know). The solveAlternateComputation is another method in the same class that solves the computation automatically, however, it too calls the setResultText() method that sets the resulting text.
The second signal updates the statusBar text of the main GUI as well.
What's worse is that I think the above design, while perhaps flawed, is not the problem.
The problem lies, I believe, in the way I call these methods, which is from the worker threads (note that I have multiple similar workers, all of which are different "platforms").
Assume I have this (and I do):
class WorkerPlatform1(QThread):
    # Init and other methods are here

    def run(self):
        # Thread does its job here, but then when it needs to present the
        # computation, instead of emitting a signal, this is what I do
        self.f = functions.Functions
        result = self.f.solveComputation(platform, computation)
        if result:
            pass  # Go on with the task
        else:
            self.f.updateCurrentStatus(platform, "Error grabbing computation!")
In this case I think that my flaw is that the thread itself is not emitting any signals, but rather calling callables residing outside of that thread directly. Am I right in thinking that this could cause my application to crash? (The faulty module is reported as QtGui4.dll.)
One more thing: both of these methods in the Functions class are accessed by many threads almost simultaneously. Is this even advisable - having methods residing outside of a thread accessed by many threads at the same time? Can it happen that I "confuse" my program? The reason I am asking is that people who say the application is not crashing report that, very often, solveComputation() returns the incorrect text - not all the time, but very often. Since that COMPUTATION_URL's server can take some time to respond (even 10+ seconds), is it possible that, while the urllib library is still waiting for the server's response to one thread's call, another thread can call the method and cause it to use a different COMPUTATION_URL, resulting in an incorrect value being returned in some cases?
Finally, I am thinking of solutions: for my first (crashing) problem, do you think the proper solution would be to directly emit a Signal from the thread itself, and then connect it in the GUI thread? Is that the right way to go about it?
Secondly, for the solveComputation returning incorrect values, would I solve it by moving that method (and accompanying methods) to every Worker class? Then I could call them directly and hopefully get the correct response - or dozens of different responses (since I have that many threads) - one for every thread?
Thank you all and I apologize for the wall of text.
EDIT: I would like to add that when running in console with some users, this error appears QObject: Cannot create children for a parent that is in a different thread.
(Parent is QLabel(0x4795500), parent's thread is QThread(0x2d3fd90), current thread is WordpressCreator(0x49f0548)
Your design is flawed if you really are using your Functions class like this with classmethods storing results on class attributes, being shared amongst multiple workers. It should be using all instance methods, and each thread should be using an instance of this class:
class Functions(QObject):
    solveComputationSignal = pyqtSignal(str)
    updateStatusSignal = pyqtSignal(int, str)

    def __init__(self, parent=None):
        super(Functions, self).__init__(parent)
        self.text = ""

    def setResultText(self, text):
        self.text = text

    def solveComputation(self, platform, computation, param=None):
        result = urllib.urlopen(COMPUTATION_URL).read()
        if param is None:
            self.solveComputationSignal.emit(result)
        else:
            self.solveAlternateComputation(platform, computation)
        while not self.text:
            time.sleep(3)
        return self.text if self.text else False

    def updateCurrentStatus(self, platform, statusText):
        self.updateStatusSignal.emit(platform, statusText)

# worker_A
def run(self):
    ...
    f = Functions()

# worker_B
def run(self):
    ...
    f = Functions()
Also, for doing your urlopen, instead of doing sleeps to check for when it is ready, you can make use of the QNetworkAccessManager to make your requests and use signals to be notified when results are ready.
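A rough sketch of what that could look like, extending the Functions class above (PyQt4-style imports assumed; COMPUTATION_URL is the question's own constant):

from PyQt4.QtCore import QObject, QUrl
from PyQt4.QtNetwork import QNetworkAccessManager, QNetworkRequest

class Functions(QObject):
    # ... same signals as above ...

    def __init__(self, parent=None):
        super(Functions, self).__init__(parent)
        self.manager = QNetworkAccessManager(self)
        self.manager.finished.connect(self.on_reply)

    def solveComputation(self, platform, computation, param=None):
        # Non-blocking: the result arrives later through the finished signal
        self.manager.get(QNetworkRequest(QUrl(COMPUTATION_URL)))

    def on_reply(self, reply):
        result = str(reply.readAll())
        self.solveComputationSignal.emit(result)
        reply.deleteLater()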

Python: Explanation of example... Why does this work?

I've been mucking around with Python for a little while and I have recently come up with something involving multithreading... without further ado... here's what I have...
import pythoncom
import wmi
import threading

class Info(threading.Thread):
    def __init__(self):
        threading.Thread.__init__(self)

    def run(self):
        pythoncom.CoInitialize()
        c = wmi.WMI()
        detect = c.Win32_ComputerShutdownEvent.watch_for()
        detect()
        return

if __name__ == '__main__':
    Info().start()
    for process in c.Win32_Process(Name="something.exe"):
        result = process.Terminate()
So my question is... why does this work? It may be an overall question regarding the inheritance behaviour of threading.Thread... but there is no start() def in the class Info(), so why does the run def begin?
This is actually a pretty handy application I needed in order to stop an application that always seems to hang when Windows shuts down... finding out when the Windows shutdown event happens was a bit of a headache, but luckily Tim Golden's masterpiece saves the day!
Because it's defined in the parent. Parent classes are checked for attributes if they're not found (or handled) in the child class.
Subclasses of Thread automatically call their run() when you call start() on them. start() is defined in Thread.
From the docs
There are two ways to specify the activity: by passing a callable object to the constructor, or by overriding the run() method in a subclass.
and from docs on start()
It arranges for the object’s run() method to be invoked in a separate thread of control.
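Both ways side by side, as a minimal sketch:

import threading

# 1. Pass a callable to the constructor
def work():
    print('running in a thread')

t = threading.Thread(target=work)
t.start()     # start() arranges for work() to run in a new thread

# 2. Override run() in a subclass
class Worker(threading.Thread):
    def run(self):
        print('running in a thread')

Worker().start()   # start() arranges for self.run() to be invoked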
Don't you intend to wait until the thread has ended before killing the process?
If so:
if __name__ == '__main__':
    info = Info()
    info.start()
    info.join()
    for process in c.Win32_Process(Name="something.exe"):
        result = process.Terminate()
