I have a program that calls an API every minute and does some operations. When a certain condition is met, I want to create a new process that will call another API every second and do its own operations. The parent process doesn't care about the results the child produces; the child will run on its own until everything is done. This way the parent process can continue calling its API every minute and doing its operations without interruption.
I looked into multiprocessing. However, I can't get it to work outside of main. I tried passing a callback function, but that produced unexpected results (at some point the parent process started running again in parallel).
Another solution I can think of is to create a separate project and just make a request to it. However, then I would have a lot of repeated code.
What is the best approach to my problem?
example code:
class Main:
    [...]
    foo = Foo()
    child = Child()
    foo.Run(child.Process)
class Foo:
    [...]
    def Run(self, callbackfunction):
        while True:
            x = self.dataServices.GetDataApi()
            if x == 1020:
                callbackfunction()
            # start next loop after a minute
class Child:
    [...]
    def Compute(self):
        while True:
            self.dataServices.GetDataApiTwo()
            # do stuff
            # start next loop after a second

    def Process(self):
        self.Compute()  # I want this function to run from a new process, so it won't interfere
Edit 2: added my multiprocessing attempt:
from multiprocessing import Process

class Main:
    def CreateNewProcess(self, callBack):
        if __name__ == '__main__':
            p = Process(target=callBack)
            p.start()
            p.join()

    foo = Foo()
    child = Child(CreateNewProcess)
    foo.Run(child.Process)
class Foo:
    def Run(self, callbackfunction):
        while True:
            x = dataServices.GetDataApi()
            if x == 1020:
                callbackfunction()
            # start next loop after a minute
class Child:
    _CreateNewProcess = None

    def __init__(self, CreateNewProcess):
        self._CreateNewProcess = CreateNewProcess

    def Compute(self):
        while True:
            dataServices.GetDataApiTwo()
            # do stuff
            # start next loop after a second

    def Process(self):
        self._CreateNewProcess(self.Compute)  # I want this function to run from a new process, so it won't interfere
I had to reorganize a few things. Among others:
The guard if __name__ == '__main__': should enclose the creation of objects and especially the calls to functions and methods. Usually it is placed at the global level at the end of the code.
Child objects shouldn't be created in the main process. In theory you can do that to use them as containers for the data the child process needs, and then send them as a parameter, but I think a separate class should be used for this if it proves necessary. Here I used a simple data parameter, which can be anything pickleable.
It is cleaner to have a function at the global level as the process target (in my opinion).
Finally it looks like:
from multiprocessing import Process

class Main:

    @staticmethod
    def CreateNewProcess(data):
        p = Process(target=run_child, args=(data,))
        p.start()
        p.join()  # note: join() waits for the child; omit or defer it if the
                  # parent should keep polling while the child runs

class Foo:

    def Run(self, callbackfunction):
        while True:
            x = dataServices.GetDataApi()
            if x == 1020:
                callbackfunction(x)  # pass along whatever pickleable data the child needs
            # start next loop after a minute

class Child:

    def __init__(self, data):
        self._data = data

    def Compute(self):
        while True:
            dataServices.GetDataApiTwo()
            # do stuff
            # start next loop after a second

# Target for the new process. It is cleaner to have a function outside of a
# class for this.
def run_child(data):  # "data" represents one or more pickleable parameters
                      # passed from parent to child that the child needs to
                      # run. It can be omitted if unnecessary.
    global child
    child = Child(data)
    child.Compute()

if __name__ == '__main__':
    foo = Foo()
    foo.Run(Main.CreateNewProcess)
I am following a preceding question here: how to add more items to a multiprocessing queue while script in motion.
The code I am working with now:
import multiprocessing

class MyFancyClass:

    def __init__(self, name):
        self.name = name

    def do_something(self):
        proc_name = multiprocessing.current_process().name
        print('Doing something fancy in {} for {}!'.format(proc_name, self.name))

def worker(q):
    while True:
        obj = q.get()
        if obj is None:
            break
        obj.do_something()

if __name__ == '__main__':
    queue = multiprocessing.Queue()

    p = multiprocessing.Process(target=worker, args=(queue,))
    p.start()

    queue.put(MyFancyClass('Fancy Dan'))
    queue.put(MyFancyClass('Frankie'))
    # print(queue.qsize())
    queue.put(None)

    # Wait for the worker to finish
    queue.close()
    queue.join_thread()
    p.join()
Right now, there are two items in the queue. If I replace those two lines with a list of, say, 50 items, how do I set up a Pool so a fixed number of processes are available to work through them? For example:
p = multiprocessing.Pool(processes=4)
Where does that go? I'd like to be able to run multiple items at once, especially if the items each run for a bit.
Thanks!
As a rule, you either use Pool or Process(es) plus Queues. Mixing both is a misuse; the Pool already uses Queues (or a similar mechanism) behind the scenes.
If you want to do this with a Pool, change your code to the following (moving the code into a main function, for performance and for better resource cleanup than running in the global scope):
def main():
    myfancyclasses = [MyFancyClass('Fancy Dan'), ...]  # define your MyFancyClass instances here
    with multiprocessing.Pool(processes=4) as p:
        # Submit all the work
        futures = [p.apply_async(fancy.do_something) for fancy in myfancyclasses]

        # Done submitting, let workers exit as they run out of work
        p.close()

        # Wait until all the work is finished
        for f in futures:
            f.wait()

if __name__ == '__main__':
    main()
This could be simplified further, at the expense of purity, with the map family of Pool methods. For example, to minimize memory usage, redefine main as:
def main():
    myfancyclasses = [MyFancyClass('Fancy Dan'), ...]  # define your MyFancyClass instances here
    with multiprocessing.Pool(processes=4) as p:
        # There is no return value, so we ignore it, but we need to run out
        # the iterator or the work won't be done
        for _ in p.imap_unordered(MyFancyClass.do_something, myfancyclasses):
            pass
Yes, technically, either approach has slightly higher overhead, since the return value you're not using still has to be serialized and sent back to the parent process. But in practice the cost is quite low (since your function has no return, it returns None, which serializes to almost nothing). An advantage of this approach: you generally don't want to print to the screen from the child processes (they'll end up interleaving their output), so you can replace the prints with returns and let the parent do the printing, e.g.:
import multiprocessing

class MyFancyClass:

    def __init__(self, name):
        self.name = name

    def do_something(self):
        proc_name = multiprocessing.current_process().name
        # Changed from print to return
        return 'Doing something fancy in {} for {}!'.format(proc_name, self.name)

def main():
    myfancyclasses = [MyFancyClass('Fancy Dan'), ...]  # define your MyFancyClass instances here
    with multiprocessing.Pool(processes=4) as p:
        # Use the return value now to avoid interleaved output
        for res in p.imap_unordered(MyFancyClass.do_something, myfancyclasses):
            print(res)

if __name__ == '__main__':
    main()
Note how all of these solutions remove the need to write your own worker function, or manually manage Queues, because Pools do that grunt work for you.
An alternate approach uses concurrent.futures to efficiently process results as they become available, while letting you choose to submit new work (based either on the results or on external information) as you go:
import concurrent.futures
from concurrent.futures import FIRST_COMPLETED

def main():
    allow_new_work = True  # Set to False to indicate we'll no longer allow new work
    myfancyclasses = [MyFancyClass('Fancy Dan'), ...]  # define your initial MyFancyClass instances here
    with concurrent.futures.ProcessPoolExecutor() as executor:
        remaining_futures = {executor.submit(fancy.do_something)
                             for fancy in myfancyclasses}
        while remaining_futures:
            done, remaining_futures = concurrent.futures.wait(remaining_futures,
                                                              return_when=FIRST_COMPLETED)
            for fut in done:
                result = fut.result()
                # Do stuff with result, maybe submit new work in response

            if allow_new_work:
                if should_stop_checking_for_new_work():
                    allow_new_work = False
                    # Let the workers exit when all remaining tasks are done,
                    # and reject submitting more work from now on
                    executor.shutdown(wait=False)
                elif has_more_work():
                    # Assumed to return a collection of new MyFancyClass instances
                    new_fanciness = get_more_fanciness()
                    remaining_futures |= {executor.submit(fancy.do_something)
                                          for fancy in new_fanciness}
                    myfancyclasses.extend(new_fanciness)
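The hooks should_stop_checking_for_new_work, has_more_work, and get_more_fanciness are left undefined above; they stand in for whatever signals new work in your application. Purely hypothetical stand-ins, just to make the sketch self-contained, might look like:

def should_stop_checking_for_new_work():
    # Hypothetical placeholder: report whether an external "stop" signal has arrived
    return True

def has_more_work():
    # Hypothetical placeholder: report whether new tasks are waiting externally
    return False

def get_more_fanciness():
    # Hypothetical placeholder: return an iterable of new MyFancyClass instances
    return []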
I have 4 different custom Python objects and an events queue. Each object has a method that allows it to retrieve an event from the shared events queue, process it if the type is the desired one, and then put a new event on the same events queue, allowing other processes to process it.
Here's an example.
import multiprocessing as mp

class CustomObject:

    def __init__(self, events_queue: mp.Queue) -> None:
        self.events_queue = events_queue

    def process_events_queue(self) -> None:
        event = self.events_queue.get()
        if type(event) == SpecificEventDataTypeForThisClass:
            # do something and create a new_event
            self.events_queue.put(new_event)
        else:
            self.events_queue.put(event)

    # there are other methods specific to each object
These 4 objects have specific tasks to do, but they all share this same structure. Since I need to simulate the production conditions, I want them all to run at the same time, independently of each other.
Here's just an example of what I want to do, if possible.
import multiprocessing as mp
import CustomObject

if __name__ == '__main__':
    events_queue = mp.Queue()

    data_provider = mp.Process(target=CustomObject, args=(events_queue,))
    portfolio = mp.Process(target=CustomObject, args=(events_queue,))
    engine = mp.Process(target=CustomObject, args=(events_queue,))
    broker = mp.Process(target=CustomObject, args=(events_queue,))

    while True:
        data_provider.process_events_queue()
        portfolio.process_events_queue()
        engine.process_events_queue()
        broker.process_events_queue()
My idea is to run each object in a separate process, allowing them to communicate through events shared via the events_queue. So my question is: how can I do that?
The problem is that obj = mp.Process(target=CustomObject, args=(events_queue,)) returns a Process instance, and I can't access the CustomObject methods through it. Also, is there a smarter way to achieve what I want?
Processes require a function to run, which defines what the process is actually doing. Once this function exits (and there are no non-daemon threads) the process is done. This is similar to how Python itself always executes a __main__ script.
If you do mp.Process(target=CustomObject, args=(events_queue,)) that just tells the process to call CustomObject - which instantiates it once and then is done. This is not what you want, unless the class actually performs work when instantiated - which is a bad idea for other reasons.
Instead, you must define a main function or method that handles what you need: "communicate with events shared through the events_queue". This function should listen to the queue and take action depending on the events received.
A simple implementation looks like this:
import os, time
from multiprocessing import Queue, Process

class Worker:

    # separate input and output for simplicity
    def __init__(self, commands: Queue, results: Queue):
        self.commands = commands
        self.results = results

    # our main function to be run by a process
    def main(self):
        # each process should handle more than one command
        while True:
            value = self.commands.get()
            # pick a well-defined signal to detect "no more work"
            if value is None:
                self.results.put(None)
                break
            # do whatever needs doing
            result = self.do_stuff(value)
            print(os.getpid(), ':', self, 'got', value, 'put', result)
            time.sleep(0.2)  # pretend we do something
            # pass on more work if required
            self.results.put(result)

    # placeholder for what needs doing
    def do_stuff(self, value):
        raise NotImplementedError
This is a template for a class that just keeps on processing events. The do_stuff method must be overridden to define what actually happens.
class AddTwo(Worker):
    def do_stuff(self, value):
        return value + 2

class TimesThree(Worker):
    def do_stuff(self, value):
        return value * 3

class Printer(Worker):
    def do_stuff(self, value):
        print(value)
This already defines fully working process payloads: Process(target=TimesThree(in_queue, out_queue).main) schedules the main method in a process, listening for and responding to commands.
Running this mainly requires connecting the individual components:
if __name__ == '__main__':
    # bookkeeping of resources we create
    processes = []
    start_queue = Queue()

    # connect our workers via queues
    queue = start_queue
    for element in (AddTwo, TimesThree, Printer):
        instance = element(queue, Queue())
        # we run the main method in processes
        processes.append(Process(target=instance.main))
        queue = instance.results

    # start all processes
    for process in processes:
        process.start()

    # send input, but do not wait for output
    start_queue.put(1)
    start_queue.put(248124)
    start_queue.put(-256)
    # send shutdown signal
    start_queue.put(None)

    # wait for processes to shut down
    for process in processes:
        process.join()
Note that you do not need classes for this. You can also compose functions for a similar effect, as long as everything is pickleable:
import os, time
from multiprocessing import Queue, Process

def main(commands, results, do_stuff):
    while True:
        value = commands.get()
        if value is None:
            results.put(None)
            break
        result = do_stuff(value)
        print(os.getpid(), ':', do_stuff, 'got', value, 'put', result)
        time.sleep(0.2)
        results.put(result)

def times_two(value):
    return value * 2

if __name__ == '__main__':
    in_queue, out_queue = Queue(), Queue()
    worker = Process(target=main, args=(in_queue, out_queue, times_two))
    worker.start()
    for message in (1, 3, 5, None):
        in_queue.put(message)
    while True:
        reply = out_queue.get()
        if reply is None:
            break
        print('result:', reply)
I have a module testrun.py which runs all the tests. One of the tests is SWStatus, defined like this:
class SWStatus(myTest):
    check = []

    def __init__(self):
        super(SWStatus, self).__init__()

    def setup(self):
        return

    def work(self):
        """
        some functionality to calculate the value of i
        i is either 10 or 20
        """
        if i == 10:
            status = True
        else:
            status = False
        self.check.append(status)
To run this test I do python testrun.py SWStatus and it gives me the results.
I have created an HWStatus test that runs the SWStatus test 10 times:
class HWStatus(myTest):

    def __init__(self):
        super(HWStatus, self).__init__()

    def setup(self):
        return

    def work(self):
        for i in xrange(10):
            args = ['python', 'testrun.py', 'SWStatus']
            p = subprocess.Popen(args)
            while p.poll() != 0:
                time.sleep(amount_of_time)
When I do python testrun.py HWStatus, it runs SWStatus 10 times.
I'm facing 2 problems here.
I wanted check to end up as a list of 10 values, with either True or False appended each time depending on the logic. But because I'm running SWStatus from HWStatus, check gets initialized to an empty list each time, so even though I'm doing self.check.append(status), I get just one value. How should I tackle this problem?
My 2nd question is: is there any way I can access the check list from the work method of my HWStatus, even though HWStatus doesn't inherit from SWStatus?
Can I do something like:
class HWStatus(myTest):

    def __init__(self):
        super(HWStatus, self).__init__()

    def setup(self):
        return

    def work(self):
        for i in xrange(10):
            args = ['python', 'testrun.py', 'SWStatus']
            p = subprocess.Popen(args)
            while p.poll() != 0:
                time.sleep(amount_of_time)
        print "List of 10", check
Inheritance doesn't affect member visibility in Python; all variables are visible as long as they're within lexical scope.
The way you're running your tests, though (in separate processes), creates different copies of SWStatus.check. When you start a new process, you create a separate memory area for it to run in. So 11 copies of the SWStatus.check variable get created by your code, and none of them can see the others.
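A minimal demonstration of that isolation (the Demo class here is hypothetical, not part of your code):

import multiprocessing

class Demo:
    check = []

def child():
    Demo.check.append(True)
    print 'child sees:', Demo.check    # child sees: [True]

if __name__ == '__main__':
    p = multiprocessing.Process(target=child)
    p.start()
    p.join()
    print 'parent sees:', Demo.check   # parent sees: [] - the child's copy is gone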
I suspect what you want to do is run the tests in parallel, in which case it's better to have each test report its status through its exit code, e.g. at the bottom of testrun.py:
import sys

if __name__ == '__main__':
    t = SWStatus()
    # assuming work() is changed to return True on success,
    # exit status 0 then means the test passed
    sys.exit(not t.work())
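HWStatus.work could then collect those exit statuses itself; a sketch under that assumption (wait() blocks until the child exits, so the polling loop is no longer needed):

import subprocess

class HWStatus(myTest):

    def work(self):
        check = []
        for i in xrange(10):
            p = subprocess.Popen(['python', 'testrun.py', 'SWStatus'])
            # wait() blocks until the child exits and returns its exit status;
            # with the sys.exit(...) wrapper above, 0 means the test passed
            check.append(p.wait() == 0)
        print "List of 10", check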
However, if you absolutely need all of the tests to run in the same address space, you can use threads instead of processes. You'll then need something like a Queue to coordinate concurrent access to memory.
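A sketch of that threaded variant, assuming SWStatus.work is changed to return its status rather than append to the class-level check:

import threading
from Queue import Queue  # Python 2; the module is named "queue" in Python 3

def run_test(results):
    results.put(SWStatus().work())

if __name__ == '__main__':
    results = Queue()
    threads = [threading.Thread(target=run_test, args=(results,)) for _ in xrange(10)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    check = [results.get() for _ in xrange(10)]
    print "List of 10", check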
I have a thread class; inside it, I want to create a thread function that does its job concurrently with the thread instance. Is this possible, and if so, how?
The run function of the thread class does a job exactly every x seconds. I want to create a thread function that does another job in parallel with the run function.
class Concurrent(threading.Thread):
    def __init__(self, consType, consTemp):
        # something

    def run(self):
        # make foo run as a thread

    def foo(self):
        # something
If not, consider the case below. Is that possible, and how?
class Concurrent(threading.Thread):
    def __init__(self, consType, consTemp):
        # something

    def run(self):
        # make foo run as a thread

    def foo():
        # something
If this is unclear, please tell me and I will try to re-edit.
Just launch another thread. You already know how to create them and start them, so simply write another subclass of Thread and start() it alongside the ones you already have.
Replace def foo() with a Thread subclass whose run() method does what foo() did.
First of all, I suggest you reconsider using threads. In most cases in Python you should use multiprocessing instead, because of Python's GIL (unless you are using Jython or IronPython).
If I understood you correctly, just open another thread inside the thread you already opened:
import threading

class FooThread(threading.Thread):

    def __init__(self, consType, consTemp):
        super(FooThread, self).__init__()
        self.consType = consType
        self.consTemp = consTemp

    def run(self):
        print 'FooThread - I just started'
        # here will be the implementation of the foo function

class Concurrent(threading.Thread):

    def __init__(self, consType, consTemp):
        super(Concurrent, self).__init__()
        self.consType = consType
        self.consTemp = consTemp

    def run(self):
        print 'Concurrent - I just started'
        threadFoo = FooThread('consType', 'consTemp')
        threadFoo.start()
        # do something every X seconds

if __name__ == '__main__':
    thread = Concurrent('consType', 'consTemp')
    thread.start()
The output of the program will be:
Concurrent - I just startedFooThread - I just started
(the two messages can run together like this because both threads write to stdout at the same time).
Here is the code sample:
class RunGui(QtGui.QMainWindow):

    def __init__(self, parent=None):
        ...
        QtCore.QObject.connect(self.ui.actionNew, QtCore.SIGNAL("triggered()"), self.new_select)
        ...

    def normal_output_written(self, qprocess):
        self.ui.text_edit.append("caught outputReady signal")  # works
        self.ui.text_edit.append(str(qprocess.readAllStandardOutput()))  # doesn't work
    def new_select(self):
        ...
        dialog_np = NewProjectDialog()
        dialog_np.exec_()
        if dialog_np.is_OK:
            section = dialog_np.get_section()
            project = dialog_np.get_project()
            ...
            np = NewProject()
            np.outputReady.connect(lambda: self.normal_output_written(np.qprocess))
            np.errorReady.connect(lambda: self.error_output_written(np.qprocess))
            np.inputNeeded.connect(lambda: self.input_from_line_edit(np.qprocess))
            np.params = partial(np.create_new_project, section, project, otherargs)
            np.start()
class NewProject(QtCore.QThread):
    outputReady = QtCore.pyqtSignal(object)
    errorReady = QtCore.pyqtSignal(object)
    inputNeeded = QtCore.pyqtSignal(object)
    params = None
    message = ""

    def __init__(self):
        super(NewProject, self).__init__()
        self.qprocess = QtCore.QProcess()
        self.qprocess.moveToThread(self)
        self._inputQueue = Queue()

    def run(self):
        self.params()

    def create_new_project(self, section, project, otherargs):
        ...
        # PyDev for some reason skips the breakpoints inside the thread
        self.qprocess.start(command)
        self.qprocess.waitForReadyRead()
        self.outputReady.emit(self.qprocess)  # works - I'm getting the signal in RunGui.normal_output_written()
        print(str(self.qprocess.readAllStandardOutput()))  # prints an empty line
        ....  # other actions inside the method requiring "command" to finish properly
The idea has been beaten to death: get the GUI to run scripts and communicate with the processes. The challenge in this particular example is that the script started in QProcess as command runs an app that requires user input (confirmation) along the way. Therefore I have to be able to start the script, get all the output and parse it, wait for the question to appear in the output, communicate back the answer, allow the script to finish, and only then proceed with the other actions inside create_new_project().
I don't know if this will fix your overall issue, but there are a few design issues I see here.
You are passing the QProcess around between threads instead of just emitting your custom signals with the results of the QProcess.
You are using class-level attributes that should probably be instance attributes.
Technically you don't even need the QProcess, since you are running it in your thread and actively using blocking calls; it could easily be a subprocess.Popen. But anyway, I might suggest changes like this:
class RunGui(QtGui.QMainWindow):
    ...
    def normal_output_written(self, msg):
        self.ui.text_edit.append(msg)

    def new_select(self):
        ...
        np = NewProject()
        np.outputReady.connect(self.normal_output_written)
        np.params = partial(np.create_new_project, section, project, otherargs)
        np.start()

class NewProject(QtCore.QThread):
    outputReady = QtCore.pyqtSignal(object)
    errorReady = QtCore.pyqtSignal(object)
    inputNeeded = QtCore.pyqtSignal(object)

    def __init__(self):
        super(NewProject, self).__init__()
        self._inputQueue = Queue()
        self.params = None

    def run(self):
        self.params()

    def create_new_project(self, section, project, otherargs):
        ...
        qprocess = QtCore.QProcess()
        qprocess.start(command)
        if not qprocess.waitForStarted():
            # handle a failed command here
            return

        if not qprocess.waitForReadyRead():
            # handle a timeout or error here
            return

        msg = str(qprocess.readAllStandardOutput())
        self.outputReady.emit(msg)
Don't pass the QProcess around. Just emit the data. And create the QProcess from within the thread's method so that it is automatically owned by that thread. Your outside classes should really not have any knowledge of that QProcess object; it doesn't even need to be a member attribute, since it's only needed during the operation.
Also make sure you are properly checking that your command both successfully started and is running and outputting data.
Update
To clarify some problems you might be having (per the comments), I wanted to suggest that QProcess might not be the best option if you need interactive control over processes that expect periodic user input. It should work fine for running scripts that just produce output from start to finish, though really subprocess would be much easier for that. For scripts that need user input over time, your best bet may be pexpect. It allows you to spawn a process and then watch for patterns that you know indicate the need for input:
foo.py
import time

i = raw_input("Please enter something: ")
print "Output:", i
time.sleep(.1)
print "Another line"
time.sleep(.1)
print "Done"
test.py
import pexpect
import time

child = pexpect.spawn("python foo.py")
child.setecho(False)

ret = -1
while ret < 0:
    time.sleep(.05)
    ret = child.expect("Please enter something: ")

child.sendline('FOO')

while True:
    line = child.readline()
    if not line:
        break
    print line.strip()

# Output: FOO
# Another line
# Done