I am trying to create some multiprocessing code for my project. I have created a snippet of the things that I want to do. However its not working as per my expectations. Can you please let me know what is wrong with this.
from multiprocessing import Process, Pipe
import time
class A:
def __init__(self,rpipe,spipe):
print "In the function fun()"
def run(self):
print"in run method"
time.sleep(5)
message = rpipe.recv()
message = str(message).swapcase()
spipe.send(message)
workers = []
my_pipe_1 = Pipe(False)
my_pipe_2 = Pipe(False)
proc_handle = Process(target = A, args=(my_pipe_1[0], my_pipe_2[1],))
workers.append(proc_handle)
proc_handle.run()
my_pipe_1[1].send("hello")
message = my_pipe_2[0].recv()
print message
print "Back in the main function now"
The trace back displayed when i press ctrl-c:
^CTraceback (most recent call last):
File "sim.py", line 22, in <module>
message = my_pipe_2[0].recv()
KeyboardInterrupt
When I run this above code, the main process does not continue after calling "proc_handle.run". Why is this?
You've misunderstood how to use Process. You're creating a Process object, and passing it a class as target, but target is meant to be passed a callable (usually a function) that Process.run then executes. So in your case it's just instantiating A inside Process.run, and that's it.
You should instead make your A class a Process subclass, and just instantiate it directly:
#!/usr/bin/python
from multiprocessing import Process, Pipe
import time
class A(Process):
def __init__(self,rpipe,spipe):
print "In the function fun()"
super(A, self).__init__()
self.rpipe = rpipe
self.spipe = spipe
def run(self):
print"in run method"
time.sleep(5)
message = self.rpipe.recv()
message = str(message).swapcase()
self.spipe.send(message)
if __name__ == "__main__":
workers = []
my_pipe_1 = Pipe(False)
my_pipe_2 = Pipe(False)
proc_handle = A(my_pipe_1[0], my_pipe_2[1])
workers.append(proc_handle)
proc_handle.start()
my_pipe_1[1].send("hello")
message = my_pipe_2[0].recv()
print message
print "Back in the main function now"
mgilson was right, though. You should call start(), not run(), to make A.run execute in a child process.
With these changes, the program works fine for me:
dan#dantop:~> ./mult.py
In the function fun()
in run method
HELLO
Back in the main function now
Taking a stab at this one, I think it's because you're calling proc_handle.run() instead of proc_handle.start().
The former is the activity that the process is going to do -- the latter actually arranges for run to be called on a separate process. In other words, you're never forking the process, so there's no other process for my_pipe_1[1] to communicate with so it hangs.
Related
I'm using the multiprocessing library to launch a Process in parallel with the main one. I use the target argument at the initialisation to specify a function to execute. But the function is not executed approximatively 1 out of 3 times.
After digging into the multiprocessing library and using monkey patches to debug, I found out that the method _bootstrap of BaseProcess (the Process class inherits from BaseProcess), that is supposed to call the function specified in the target parameters at the initialisation, was not called when the method start() of the Process was called.
As my OS is Ubuntu 18.04, the default method to start the process is fork. So the Popen used to launch the process is in the file popen_fork.py of the multiprocessing library. And in this Popen class, the method _launch is calling os.fork() and then calling the Process's _bootstrap method.
With a monkey patch, I found out that the code supposed to be executed in the child process is not executed at all, and this is why the function specified in the target parameter when initializing the process was not executed when the method start() was called.
It is not possible to reproduce the problem in a simpler environment than the one I am working on. But here is some code that represents what I am doing, and what is my problem :
import time
from multiprocessing import Process
from multiprocessing.managers import BaseManager
class A:
def __init__(self, manager):
# manager is an object created by registering it in
# multiprocessing.managers.BaseManager, so it is made for interprocess
# communication
self.manager = manager
self.p = Process(target=self.process_method, args=(self.manager, ))
def start(self):
self.p.start()
def process_method(self, manager):
# This is the method that is not executed 2 out of 3 times
print("(A.process_method) Entering method")
c = 0
while True:
print(f"(A.process_method) Sending message : c = {c}")
manager.on_reception(f"c = {c}")
time.sleep(5)
class Manager:
def __init__(self):
self.msg = None
self.unread_msg = False
def on_reception(self, msg):
self.msg = msg
self.unread_msg = True
def get_last_msg(self):
if self.unread_msg:
self.unread_msg = False
return self.msg
else:
return None
if __name__ == "__main__":
BaseManager.register("Manager", Manager)
bm = BaseManager()
bm.start()
manager = bm.Manager()
a = A(manager)
a.start()
while True:
msg = manager.get_last_msg()
if msg is not None:
print(msg)
The method that should be executed every time is A.process_method. In this example, it is executed every time, but in my environment, it is not.
Does anyone ever had this problem and knows how to fix it ?
After digging more, I found out that a flask server was launched in Thread and not in a Process. I changed it to run in a Process instead of a Thread, and now everything is running as it is supposed to.
Both Flask and my Process are using the logging package. And this can cause a deadlock when launching a new Process.
I want all of my threads to start at the same time, but in my code, it waits for the previous thread to finish it's process before starting a new one. I want all of the threads to start in parallel.
My Code:
class Main(object):
start = True
config = True
givenName = True
def obscure(self, i):
i = i
while self.start:
Config.userInfo(i)
break
while self.config:
Config.open()
break
while self.givenName:
Browser.openSession()
break
Main = Main()
while __name__=='__main__':
Config.userInfo()
Config.open()
for i in range(len(Config.names)):
Task = Thread(target=Main.obscure(i))
Task.start()
break
This line is the main problem:
Task = Thread(target=Main.obscure(i))
target is passed the result of calling Main.obscure(i), not the function to be run in the thread. You are currently running the function in the main thread then passing, essentially, target=None.
You want:
Task = Thread(target=Main.obscure, args=(i,))
Then, Thread will arrange to call Main.obscure with the listed arguments inside the thread.
Also, Main = Main() overwrites the class Main declaration with an instance of Main...but you'll never be able to make another instance since you've lost the reference to the class. Use another name, such as main = Main().
Often there is a need for the program to wait for a function to complete its work. Sometimes it is opposite: there is no need for a main program to wait.
I've put a simple example. There are four buttons. Clicking each will call the same calculate() function. The only difference is the way the function is called.
"Call Directly" button calls calculate() function directly. Since there is a 'Function End' print out it is evident that the program is waiting for the calculate function to complete its job.
"Call via Threading" calls the same function this time using threading mechanism. Since the program prints out ': Function End' message immidiately after the button is presses I can conclude the program doesn't wait for calculate() function to complete. How to override this behavior? How to make program wait till calculate() function is finished?
"Call via Multiprocessing" buttons utilizes multiprocessing to call calculate() function.
Just like with threading multiprocessing doesn't wait for function completion. What statement we have to put in order to make it wait?
"Call via Subprocess" buttons doesn't do anything since I didn't figure out the way to hook subprocess to run internal script function or method. It would be interesting to see how to do it...
Example:
from PyQt4 import QtCore, QtGui
app = QtGui.QApplication(sys.argv)
def calculate(listArg=None):
print '\n\t Starting calculation...'
m=0
for i in range(50000000):
m+=i
print '\t ...calculation completed\n'
class Dialog_01(QtGui.QMainWindow):
def __init__(self):
super(Dialog_01, self).__init__()
myQWidget = QtGui.QWidget()
myBoxLayout = QtGui.QVBoxLayout()
directCall_button = QtGui.QPushButton("Call Directly")
directCall_button.clicked.connect(self.callDirectly)
myBoxLayout.addWidget(directCall_button)
Button_01 = QtGui.QPushButton("Call via Threading")
Button_01.clicked.connect(self.callUsingThreads)
myBoxLayout.addWidget(Button_01)
Button_02 = QtGui.QPushButton("Call via Multiprocessing")
Button_02.clicked.connect(self.callUsingMultiprocessing)
myBoxLayout.addWidget(Button_02)
Button_03 = QtGui.QPushButton("Call via Subprocess")
Button_03.clicked.connect(self.callUsingSubprocess)
myBoxLayout.addWidget(Button_03)
myQWidget.setLayout(myBoxLayout)
self.setCentralWidget(myQWidget)
self.setWindowTitle('Dialog 01')
def callUsingThreads(self):
print '------------------------------- callUsingThreads() ----------------------------------'
import threading
self.myEvent=threading.Event()
self.c_thread=threading.Thread(target=calculate)
self.c_thread.start()
print "\n\t\t : Function End"
def callUsingMultiprocessing(self):
print '------------------------------- callUsingMultiprocessing() ----------------------------------'
from multiprocessing import Pool
pool = Pool(processes=3)
try: pool.map_async( calculate, ['some'])
except Exception, e: print e
print "\n\t\t : Function End"
def callDirectly(self):
print '------------------------------- callDirectly() ----------------------------------'
calculate()
print "\n\t\t : Function End"
def callUsingSubprocess(self):
print '------------------------------- callUsingSubprocess() ----------------------------------'
import subprocess
print '-missing code solution'
print "\n\t\t : Function End"
if __name__ == '__main__':
dialog_1 = Dialog_01()
dialog_1.show()
dialog_1.resize(480,320)
sys.exit(app.exec_())
Use a queue: each thread when completed puts the result on the queue and then you just need to read the appropriate number of results and ignore the remainder:
#!python3.3
import queue # For Python 2.x use 'import Queue as queue'
import threading, time, random
def func(id, result_queue):
print("Thread", id)
time.sleep(random.random() * 5)
result_queue.put((id, 'done'))
def main():
q = queue.Queue()
threads = [ threading.Thread(target=func, args=(i, q)) for i in range(5) ]
for th in threads:
th.daemon = True
th.start()
result1 = q.get()
result2 = q.get()
print("Second result: {}".format(result2))
if __name__=='__main__':
main()
Documentation for Queue.get() (with no arguments it is equivalent to Queue.get(True, None):
Queue.get([block[, timeout]])
Remove and return an item from the queue. If optional args block is true and timeout is None (the default), block if necessary until an item is available. If timeout is a positive number, it blocks at most timeout seconds and raises the Empty exception if no item was available within that time. Otherwise (block is false), return an item if one is immediately available, else raise the Empty exception (timeout is ignored in that case).
How to wait until only the first thread is finished in Python
You can to use .join() method too.
what is the use of join() in python threading
I find that using the "pool" submodule within "multiprocessing" works amazingly for executing multiple processes at once within a Python Script.
See Section: Using a pool of workers
Look carefully at "# launching multiple evaluations asynchronously may use more processes" in the example. Once you understand what those lines are doing, the following example I constructed will make a lot of sense.
import numpy as np
from multiprocessing import Pool
def desired_function(option, processes, data, etc...):
# your code will go here. option allows you to make choices within your script
# to execute desired sections of code for each pool or subprocess.
return result_array # "for example"
result_array = np.zeros("some shape") # This is normally populated by 1 loop, lets try 4.
processes = 4
pool = Pool(processes=processes)
args = (processes, data, etc...) # Arguments to be passed into desired function.
multiple_results = []
for i in range(processes): # Executes each pool w/ option (1-4 in this case).
multiple_results.append(pool.apply_async(param_process, (i+1,)+args)) # Syncs each.
results = np.array(res.get() for res in multiple_results) # Retrieves results after
# every pool is finished!
for i in range(processes):
result_array = result_array + results[i] # Combines all datasets!
The code will basically run the desired function for a set number of processes. You will have to carefully make you're function can distinguish between each process (hence why I added the variable "option".) Additionally, it doesn't have to be an array that is being populated in the end, but for my example thats how I used it. Hope this simplifies or helps you better understand the power of multiprocessing in Python!
Im trying to understand threading (im new to it) to get my code better. For now, i have a class in a .py file with some functions.
In my main, i initialize an object for this class in each program i have. But, with threads, i would like to be able to create all this objects in one program and call the function with thread.
def inicializa():
clientList = list()
thread_list = list()
config = configparser.ConfigParser()
config.read("accounts.ini")
for section in config.sections(): #define a section da conta que vou usar
email = config.get(section,'email')
password = config.get(section,'password')
the_hash = config.get(section,'hash')
authdata = config.get(section,'authdata')
authdata = eval(authdata)
client = MyClient(email,password,the_hash,authdata)
clientList.append(client)
for client in clientList:
t = threading.Thread(target=client.getBla()) # this function is inside of my class, its work OK outside of the thread if i put client.getBla.
thread_list.append(t)
for thread in thread_list:
thread.start()
return clientList
the error i get when i try to use the thread to start the function client.getBla is:
Exception in thread Thread-1:
TypeError: int object is not callable.
My function dosnt take any arguments, so i don't know whats going on, because i client.getBla() outside of the threads works ok.
Thank you all.
t = threading.Thread(target=client.getBla())
What this line does is evaluate client.getBla() (which returns a int) and pass it as a named argument to the Thread. The target argument takes a callable, so you should do this instead:
t = threading.Thread(target=client.getBla)
Doing this, you pass the function itself, not the function result
Here is the code sample:
class RunGui (QtGui.QMainWindow)
def __init__(self, parent=None):
...
QtCore.Qobject.connect(self.ui.actionNew, QtCore.SIGNAL("triggered()"), self.new_select)
...
def normal_output_written(self, qprocess):
self.ui.text_edit.append("caught outputReady signal") #works
self.ui.text_edit.append(str(qprocess.readAllStandardOutput())) # doesn't work
def new_select(self):
...
dialog_np = NewProjectDialog()
dialog_np.exec_()
if dialog_np.is_OK:
section = dialog_np.get_section()
project = dialog_np.get_project()
...
np = NewProject()
np.outputReady.connect(lambda: self.normal_output_written(np.qprocess))
np.errorReady.connect(lambda: self.error_output_written(np.qprocess))
np.inputNeeded.connect(lambda: self.input_from_line_edit(np.qprocess))
np.params = partial(np.create_new_project, section, project, otherargs)
np.start()
class NewProject(QtCore.QThread):
outputReady = QtCore.pyqtSignal(object)
errorReady = QtCore.pyqtSignal(object)
inputNeeded = QtCore.pyqtSignal(object)
params = None
message = ""
def __init__(self):
super(NewProject, self).__init__()
self.qprocess = QtCore.QProcess()
self.qprocess.moveToThread(self)
self._inputQueue = Queue()
def run(self):
self.params()
def create_new_project(self, section, project, otherargs):
...
# PyDev for some reason skips the breakpoints inside the thread
self.qprocess.start(command)
self.qprocess.waitForReadyRead()
self.outputReady.emit(self.qprocess) # works - I'm getting signal in RunGui.normal_output_written()
print(str(self.qprocess.readAllStandardOutput())) # prints empty line
.... # other actions inside the method requiring "command" to finish properly.
The idea is beaten to death - get the GUI to run scripts and communicate with the processes. The challenge in this particular example is that the script started in QProcess as command runs an app, that requires user input (confirmation) along the way. Therefore I have to be able to start the script, get all output and parse it, wait for the question to appear in the output and then communicate back the answer, allow it to finish and only then to proceed further with other actions inside create_new_project()
I don't know if this will fix your overall issue, but there are a few design issues I see here.
You are passing around the qprocess between threads instead of just emitting your custom signals with the results of the qprocess
You are using class-level attributes that should probably be instance attributes
Technically you don't even need the QProcess, since you are running it in your thread and actively using blocking calls. It could easily be a subprocess.Popen...but anyways, I might suggest changes like this:
class RunGui (QtGui.QMainWindow)
...
def normal_output_written(self, msg):
self.ui.text_edit.append(msg)
def new_select(self):
...
np = NewProject()
np.outputReady.connect(self.normal_output_written)
np.params = partial(np.create_new_project, section, project, otherargs)
np.start()
class NewProject(QtCore.QThread):
outputReady = QtCore.pyqtSignal(object)
errorReady = QtCore.pyqtSignal(object)
inputNeeded = QtCore.pyqtSignal(object)
def __init__(self):
super(NewProject, self).__init__()
self._inputQueue = Queue()
self.params = None
def run(self):
self.params()
def create_new_project(self, section, project, otherargs):
...
qprocess = QtCore.QProcess()
qprocess.start(command)
if not qprocess.waitForStarted():
# handle a failed command here
return
if not qprocess.waitForReadyRead():
# handle a timeout or error here
return
msg = str(self.qprocess.readAllStandardOutput())
self.outputReady.emit(msg)
Don't pass around the QProcess. Just emit the data. And create it from within the threads method so that it is automatically owned by that thread. Your outside classes should really not have any knowledge of that QProcess object. It doesn't even need to be a member attribute since its only needed during the operation.
Also make sure you are properly checking that your command both successfully started, and is running and outputting data.
Update
To clarify some problems you might be having (per the comments), I wanted to suggest that QProcess might not be the best option if you need to have interactive control with processes that expect periodic user input. It should work find for running scripts that just produce output from start to finish, though really using subprocess would be much easier. For scripts that need user input over time, your best bet may be to use pexpect. It allows you to spawn a process, and then watch for various patterns that you know will indicate the need for input:
foo.py
import time
i = raw_input("Please enter something: ")
print "Output:", i
time.sleep(.1)
print "Another line"
time.sleep(.1)
print "Done"
test.py
import pexpect
import time
child = pexpect.spawn("python foo.py")
child.setecho(False)
ret = -1
while ret < 0:
time.sleep(.05)
ret = child.expect("Please enter something: ")
child.sendline('FOO')
while True:
line = child.readline()
if not line:
break
print line.strip()
# Output: FOO
# Another line
# Done