Python multiprocessing creates sub-process using wrong function - python

I'm trying to write code that create sub-process using another module(demo_2.py),
and exit program if i get wanted value on sub-processes.
But result looks like this.
It seems that demo_1 makes two sub-process that run demo_1 and load demo_2.
I want to make sub-process only runs demo_2.
What did i missed?
demo_1.py
#!/usr/bin/env python
# -*- coding: utf-8 -*-
from multiprocessing import Process,Queue
import sys
import demo_2 as A
def multi_process():
print ("Function multi_process called!")
process_status_A = Queue()
process_status_B = Queue()
A_Process = Process(target = A.process_A, args = (process_status_A,))
B_Process = Process(target = A.process_A, args = (process_status_B,))
A_Process.start()
B_Process.start()
while True:
process_status_output_A = process_status_A.get()
process_status_output_B = process_status_B.get()
if process_status_output_A == 'exit' and process_status_output_B == 'exit':
print ("Success!")
break
process_status_A.close()
process_status_B.close()
A_Process.join()
B_Process.join()
sys.exit()
print ("demo_1 started")
if __name__ == "__main__":
multi_process()
demo_2.py
class process_A(object):
def __init__(self, process_status):
print ("demo_2 called!")
process_status.put('exit')
def call_exit(self):
pass

if process_status_A == 'exit' and process_status_B == 'exit':
should be
if process_status_A_output == 'exit' and process_status_B_output == 'exit':
Conclusion: The naming of variables is important.
Avoid long variable names which are almost the same (such as process_status_A and process_status_A_output).
Placing the distinguishing part of the variable name first helps clarify the meaning of the variable.
So instead of
process_status_A_output
process_status_B_output
perhaps use
output_A
output_B
Because Windows lacks os.fork,
on Windows every time a new subprocess is spawned, a new Python interpreter is started and the calling module is imported.
Therefore, code that you do not wish to be run in the spawned subprocess must be "protected" inside the if-statement (see in particular the section entitled "Safe importing of main module"):
Thus use
if __name__ == "__main__":
print ("demo_1 started")
multi_process()
to avoid printing the extra "demo_1 started" messages.

Related

Python Multiprocessing workflow troubleshooting [duplicate]

I am trying my very first formal python program using Threading and Multiprocessing on a windows machine. I am unable to launch the processes though, with python giving the following message. The thing is, I am not launching my threads in the main module. The threads are handled in a separate module inside a class.
EDIT: By the way this code runs fine on ubuntu. Not quite on windows
RuntimeError:
Attempt to start a new process before the current process
has finished its bootstrapping phase.
This probably means that you are on Windows and you have
forgotten to use the proper idiom in the main module:
if __name__ == '__main__':
freeze_support()
...
The "freeze_support()" line can be omitted if the program
is not going to be frozen to produce a Windows executable.
My original code is pretty long, but I was able to reproduce the error in an abridged version of the code. It is split in two files, the first is the main module and does very little other than import the module which handles processes/threads and calls a method. The second module is where the meat of the code is.
testMain.py:
import parallelTestModule
extractor = parallelTestModule.ParallelExtractor()
extractor.runInParallel(numProcesses=2, numThreads=4)
parallelTestModule.py:
import multiprocessing
from multiprocessing import Process
import threading
class ThreadRunner(threading.Thread):
""" This class represents a single instance of a running thread"""
def __init__(self, name):
threading.Thread.__init__(self)
self.name = name
def run(self):
print self.name,'\n'
class ProcessRunner:
""" This class represents a single instance of a running process """
def runp(self, pid, numThreads):
mythreads = []
for tid in range(numThreads):
name = "Proc-"+str(pid)+"-Thread-"+str(tid)
th = ThreadRunner(name)
mythreads.append(th)
for i in mythreads:
i.start()
for i in mythreads:
i.join()
class ParallelExtractor:
def runInParallel(self, numProcesses, numThreads):
myprocs = []
prunner = ProcessRunner()
for pid in range(numProcesses):
pr = Process(target=prunner.runp, args=(pid, numThreads))
myprocs.append(pr)
# if __name__ == 'parallelTestModule': #This didnt work
# if __name__ == '__main__': #This obviously doesnt work
# multiprocessing.freeze_support() #added after seeing error to no avail
for i in myprocs:
i.start()
for i in myprocs:
i.join()
On Windows the subprocesses will import (i.e. execute) the main module at start. You need to insert an if __name__ == '__main__': guard in the main module to avoid creating subprocesses recursively.
Modified testMain.py:
import parallelTestModule
if __name__ == '__main__':
extractor = parallelTestModule.ParallelExtractor()
extractor.runInParallel(numProcesses=2, numThreads=4)
Try putting your code inside a main function in testMain.py
import parallelTestModule
if __name__ == '__main__':
extractor = parallelTestModule.ParallelExtractor()
extractor.runInParallel(numProcesses=2, numThreads=4)
See the docs:
"For an explanation of why (on Windows) the if __name__ == '__main__'
part is necessary, see Programming guidelines."
which say
"Make sure that the main module can be safely imported by a new Python
interpreter without causing unintended side effects (such a starting a
new process)."
... by using if __name__ == '__main__'
Though the earlier answers are correct, there's a small complication it would help to remark on.
In case your main module imports another module in which global variables or class member variables are defined and initialized to (or using) some new objects, you may have to condition that import in the same way:
if __name__ == '__main__':
import my_module
As #Ofer said, when you are using another libraries or modules, you should import all of them inside the if __name__ == '__main__':
So, in my case, ended like this:
if __name__ == '__main__':
import librosa
import os
import pandas as pd
run_my_program()
hello here is my structure for multi process
from multiprocessing import Process
import time
start = time.perf_counter()
def do_something(time_for_sleep):
print(f'Sleeping {time_for_sleep} second...')
time.sleep(time_for_sleep)
print('Done Sleeping...')
p1 = Process(target=do_something, args=[1])
p2 = Process(target=do_something, args=[2])
if __name__ == '__main__':
p1.start()
p2.start()
p1.join()
p2.join()
finish = time.perf_counter()
print(f'Finished in {round(finish-start,2 )} second(s)')
you don't have to put imports in the if __name__ == '__main__':, just running the program you wish to running inside
In yolo v5 with python 3.8.5
if __name__ == '__main__':
from yolov5 import train
train.run()
In my case it was a simple bug in the code, using a variable before it was created. Worth checking that out before trying the above solutions. Why I got this particular error message, Lord knows.
The below solution should work for both python multiprocessing and pytorch multiprocessing.
As other answers mentioned that the fix is to have if __name__ == '__main__': but I faced several issues in identifying where to start because I am using several scripts and modules. When I can call my first function inside main then everything before it started to create multiple processes (not sure why).
Putting it at the very first line (even before the import) worked. Only calling the first function return timeout error. The below is the first file of my code and multiprocessing is used after calling several functions but putting main in the first seems to be the only fix here.
if __name__ == '__main__':
from mjrl.utils.gym_env import GymEnv
from mjrl.policies.gaussian_mlp import MLP
from mjrl.baselines.quadratic_baseline import QuadraticBaseline
from mjrl.baselines.mlp_baseline import MLPBaseline
from mjrl.algos.npg_cg import NPG
from mjrl.algos.dapg import DAPG
from mjrl.algos.behavior_cloning import BC
from mjrl.utils.train_agent import train_agent
from mjrl.samplers.core import sample_paths
import os
import json
import mjrl.envs
import mj_envs
import time as timer
import pickle
import argparse
import numpy as np
# ===============================================================================
# Get command line arguments
# ===============================================================================
parser = argparse.ArgumentParser(description='Policy gradient algorithms with demonstration data.')
parser.add_argument('--output', type=str, required=True, help='location to store results')
parser.add_argument('--config', type=str, required=True, help='path to config file with exp params')
args = parser.parse_args()
JOB_DIR = args.output
if not os.path.exists(JOB_DIR):
os.mkdir(JOB_DIR)
with open(args.config, 'r') as f:
job_data = eval(f.read())
assert 'algorithm' in job_data.keys()
assert any([job_data['algorithm'] == a for a in ['NPG', 'BCRL', 'DAPG']])
job_data['lam_0'] = 0.0 if 'lam_0' not in job_data.keys() else job_data['lam_0']
job_data['lam_1'] = 0.0 if 'lam_1' not in job_data.keys() else job_data['lam_1']
EXP_FILE = JOB_DIR + '/job_config.json'
with open(EXP_FILE, 'w') as f:
json.dump(job_data, f, indent=4)
# ===============================================================================
# Train Loop
# ===============================================================================
e = GymEnv(job_data['env'])
policy = MLP(e.spec, hidden_sizes=job_data['policy_size'], seed=job_data['seed'])
baseline = MLPBaseline(e.spec, reg_coef=1e-3, batch_size=job_data['vf_batch_size'],
epochs=job_data['vf_epochs'], learn_rate=job_data['vf_learn_rate'])
# Get demonstration data if necessary and behavior clone
if job_data['algorithm'] != 'NPG':
print("========================================")
print("Collecting expert demonstrations")
print("========================================")
demo_paths = pickle.load(open(job_data['demo_file'], 'rb'))
########################################################################################
demo_paths = demo_paths[0:3]
print (job_data['demo_file'], len(demo_paths))
for d in range(len(demo_paths)):
feats = demo_paths[d]['features']
feats = np.vstack(feats)
demo_paths[d]['observations'] = feats
########################################################################################
bc_agent = BC(demo_paths, policy=policy, epochs=job_data['bc_epochs'], batch_size=job_data['bc_batch_size'],
lr=job_data['bc_learn_rate'], loss_type='MSE', set_transforms=False)
in_shift, in_scale, out_shift, out_scale = bc_agent.compute_transformations()
bc_agent.set_transformations(in_shift, in_scale, out_shift, out_scale)
bc_agent.set_variance_with_data(out_scale)
ts = timer.time()
print("========================================")
print("Running BC with expert demonstrations")
print("========================================")
bc_agent.train()
print("========================================")
print("BC training complete !!!")
print("time taken = %f" % (timer.time() - ts))
print("========================================")
# if job_data['eval_rollouts'] >= 1:
# score = e.evaluate_policy(policy, num_episodes=job_data['eval_rollouts'], mean_action=True)
# print("Score with behavior cloning = %f" % score[0][0])
if job_data['algorithm'] != 'DAPG':
# We throw away the demo data when training from scratch or fine-tuning with RL without explicit augmentation
demo_paths = None
# ===============================================================================
# RL Loop
# ===============================================================================
rl_agent = DAPG(e, policy, baseline, demo_paths,
normalized_step_size=job_data['rl_step_size'],
lam_0=job_data['lam_0'], lam_1=job_data['lam_1'],
seed=job_data['seed'], save_logs=True
)
print("========================================")
print("Starting reinforcement learning phase")
print("========================================")
ts = timer.time()
train_agent(job_name=JOB_DIR,
agent=rl_agent,
seed=job_data['seed'],
niter=job_data['rl_num_iter'],
gamma=job_data['rl_gamma'],
gae_lambda=job_data['rl_gae'],
num_cpu=job_data['num_cpu'],
sample_mode='trajectories',
num_traj=job_data['rl_num_traj'],
num_samples= job_data['rl_num_samples'],
save_freq=job_data['save_freq'],
evaluation_rollouts=job_data['eval_rollouts'])
print("time taken = %f" % (timer.time()-ts))
I ran into the same problem. #ofter method is correct because there are some details to pay attention to. The following is the successful debugging code I modified for your reference:
if __name__ == '__main__':
import matplotlib.pyplot as plt
import numpy as np
def imgshow(img):
img = img / 2 + 0.5
np_img = img.numpy()
plt.imshow(np.transpose(np_img, (1, 2, 0)))
plt.show()
dataiter = iter(train_loader)
images, labels = dataiter.next()
imgshow(torchvision.utils.make_grid(images))
print(' '.join('%5s' % classes[labels[i]] for i in range(4)))
For the record, I don't have a subroutine, I just have a main program, but I have the same problem as you. This demonstrates that when importing a Python library file in the middle of a program segment, we should add:
if __name__ == '__main__':
I tried the tricks mentioned above on the following very simple code. but I still cannot stop it from resetting on any of my Window machines with Python 3.8/3.10. I would very much appreciate it if you could tell me where I am wrong.
print('script reset')
def do_something(inp):
print('Done!')
if __name__ == '__main__':
from multiprocessing import Process, get_start_method
print('main reset')
print(get_start_method())
Process(target=do_something, args=[1]).start()
print('Finished')
output displays:
script reset
main reset
spawn
Finished
script reset
Done!
Update:
As far as I understand, you guys are not preventing either the script containing the __main__ or the .start() from resetting (which doesn't happen in Linux), rather you are suggesting workarounds so that we don't see the reset. One has to make all imports minimal and put them in each function separately, but it is still, relative to Linux, slow.

Newbie can't get concurrent.futures to work at all [duplicate]

I am trying my very first formal python program using Threading and Multiprocessing on a windows machine. I am unable to launch the processes though, with python giving the following message. The thing is, I am not launching my threads in the main module. The threads are handled in a separate module inside a class.
EDIT: By the way this code runs fine on ubuntu. Not quite on windows
RuntimeError:
Attempt to start a new process before the current process
has finished its bootstrapping phase.
This probably means that you are on Windows and you have
forgotten to use the proper idiom in the main module:
if __name__ == '__main__':
freeze_support()
...
The "freeze_support()" line can be omitted if the program
is not going to be frozen to produce a Windows executable.
My original code is pretty long, but I was able to reproduce the error in an abridged version of the code. It is split in two files, the first is the main module and does very little other than import the module which handles processes/threads and calls a method. The second module is where the meat of the code is.
testMain.py:
import parallelTestModule
extractor = parallelTestModule.ParallelExtractor()
extractor.runInParallel(numProcesses=2, numThreads=4)
parallelTestModule.py:
import multiprocessing
from multiprocessing import Process
import threading
class ThreadRunner(threading.Thread):
""" This class represents a single instance of a running thread"""
def __init__(self, name):
threading.Thread.__init__(self)
self.name = name
def run(self):
print self.name,'\n'
class ProcessRunner:
""" This class represents a single instance of a running process """
def runp(self, pid, numThreads):
mythreads = []
for tid in range(numThreads):
name = "Proc-"+str(pid)+"-Thread-"+str(tid)
th = ThreadRunner(name)
mythreads.append(th)
for i in mythreads:
i.start()
for i in mythreads:
i.join()
class ParallelExtractor:
def runInParallel(self, numProcesses, numThreads):
myprocs = []
prunner = ProcessRunner()
for pid in range(numProcesses):
pr = Process(target=prunner.runp, args=(pid, numThreads))
myprocs.append(pr)
# if __name__ == 'parallelTestModule': #This didnt work
# if __name__ == '__main__': #This obviously doesnt work
# multiprocessing.freeze_support() #added after seeing error to no avail
for i in myprocs:
i.start()
for i in myprocs:
i.join()
On Windows the subprocesses will import (i.e. execute) the main module at start. You need to insert an if __name__ == '__main__': guard in the main module to avoid creating subprocesses recursively.
Modified testMain.py:
import parallelTestModule
if __name__ == '__main__':
extractor = parallelTestModule.ParallelExtractor()
extractor.runInParallel(numProcesses=2, numThreads=4)
Try putting your code inside a main function in testMain.py
import parallelTestModule
if __name__ == '__main__':
extractor = parallelTestModule.ParallelExtractor()
extractor.runInParallel(numProcesses=2, numThreads=4)
See the docs:
"For an explanation of why (on Windows) the if __name__ == '__main__'
part is necessary, see Programming guidelines."
which say
"Make sure that the main module can be safely imported by a new Python
interpreter without causing unintended side effects (such a starting a
new process)."
... by using if __name__ == '__main__'
Though the earlier answers are correct, there's a small complication it would help to remark on.
In case your main module imports another module in which global variables or class member variables are defined and initialized to (or using) some new objects, you may have to condition that import in the same way:
if __name__ == '__main__':
import my_module
As #Ofer said, when you are using another libraries or modules, you should import all of them inside the if __name__ == '__main__':
So, in my case, ended like this:
if __name__ == '__main__':
import librosa
import os
import pandas as pd
run_my_program()
hello here is my structure for multi process
from multiprocessing import Process
import time
start = time.perf_counter()
def do_something(time_for_sleep):
print(f'Sleeping {time_for_sleep} second...')
time.sleep(time_for_sleep)
print('Done Sleeping...')
p1 = Process(target=do_something, args=[1])
p2 = Process(target=do_something, args=[2])
if __name__ == '__main__':
p1.start()
p2.start()
p1.join()
p2.join()
finish = time.perf_counter()
print(f'Finished in {round(finish-start,2 )} second(s)')
you don't have to put imports in the if __name__ == '__main__':, just running the program you wish to running inside
In yolo v5 with python 3.8.5
if __name__ == '__main__':
from yolov5 import train
train.run()
In my case it was a simple bug in the code, using a variable before it was created. Worth checking that out before trying the above solutions. Why I got this particular error message, Lord knows.
The below solution should work for both python multiprocessing and pytorch multiprocessing.
As other answers mentioned that the fix is to have if __name__ == '__main__': but I faced several issues in identifying where to start because I am using several scripts and modules. When I can call my first function inside main then everything before it started to create multiple processes (not sure why).
Putting it at the very first line (even before the import) worked. Only calling the first function return timeout error. The below is the first file of my code and multiprocessing is used after calling several functions but putting main in the first seems to be the only fix here.
if __name__ == '__main__':
from mjrl.utils.gym_env import GymEnv
from mjrl.policies.gaussian_mlp import MLP
from mjrl.baselines.quadratic_baseline import QuadraticBaseline
from mjrl.baselines.mlp_baseline import MLPBaseline
from mjrl.algos.npg_cg import NPG
from mjrl.algos.dapg import DAPG
from mjrl.algos.behavior_cloning import BC
from mjrl.utils.train_agent import train_agent
from mjrl.samplers.core import sample_paths
import os
import json
import mjrl.envs
import mj_envs
import time as timer
import pickle
import argparse
import numpy as np
# ===============================================================================
# Get command line arguments
# ===============================================================================
parser = argparse.ArgumentParser(description='Policy gradient algorithms with demonstration data.')
parser.add_argument('--output', type=str, required=True, help='location to store results')
parser.add_argument('--config', type=str, required=True, help='path to config file with exp params')
args = parser.parse_args()
JOB_DIR = args.output
if not os.path.exists(JOB_DIR):
os.mkdir(JOB_DIR)
with open(args.config, 'r') as f:
job_data = eval(f.read())
assert 'algorithm' in job_data.keys()
assert any([job_data['algorithm'] == a for a in ['NPG', 'BCRL', 'DAPG']])
job_data['lam_0'] = 0.0 if 'lam_0' not in job_data.keys() else job_data['lam_0']
job_data['lam_1'] = 0.0 if 'lam_1' not in job_data.keys() else job_data['lam_1']
EXP_FILE = JOB_DIR + '/job_config.json'
with open(EXP_FILE, 'w') as f:
json.dump(job_data, f, indent=4)
# ===============================================================================
# Train Loop
# ===============================================================================
e = GymEnv(job_data['env'])
policy = MLP(e.spec, hidden_sizes=job_data['policy_size'], seed=job_data['seed'])
baseline = MLPBaseline(e.spec, reg_coef=1e-3, batch_size=job_data['vf_batch_size'],
epochs=job_data['vf_epochs'], learn_rate=job_data['vf_learn_rate'])
# Get demonstration data if necessary and behavior clone
if job_data['algorithm'] != 'NPG':
print("========================================")
print("Collecting expert demonstrations")
print("========================================")
demo_paths = pickle.load(open(job_data['demo_file'], 'rb'))
########################################################################################
demo_paths = demo_paths[0:3]
print (job_data['demo_file'], len(demo_paths))
for d in range(len(demo_paths)):
feats = demo_paths[d]['features']
feats = np.vstack(feats)
demo_paths[d]['observations'] = feats
########################################################################################
bc_agent = BC(demo_paths, policy=policy, epochs=job_data['bc_epochs'], batch_size=job_data['bc_batch_size'],
lr=job_data['bc_learn_rate'], loss_type='MSE', set_transforms=False)
in_shift, in_scale, out_shift, out_scale = bc_agent.compute_transformations()
bc_agent.set_transformations(in_shift, in_scale, out_shift, out_scale)
bc_agent.set_variance_with_data(out_scale)
ts = timer.time()
print("========================================")
print("Running BC with expert demonstrations")
print("========================================")
bc_agent.train()
print("========================================")
print("BC training complete !!!")
print("time taken = %f" % (timer.time() - ts))
print("========================================")
# if job_data['eval_rollouts'] >= 1:
# score = e.evaluate_policy(policy, num_episodes=job_data['eval_rollouts'], mean_action=True)
# print("Score with behavior cloning = %f" % score[0][0])
if job_data['algorithm'] != 'DAPG':
# We throw away the demo data when training from scratch or fine-tuning with RL without explicit augmentation
demo_paths = None
# ===============================================================================
# RL Loop
# ===============================================================================
rl_agent = DAPG(e, policy, baseline, demo_paths,
normalized_step_size=job_data['rl_step_size'],
lam_0=job_data['lam_0'], lam_1=job_data['lam_1'],
seed=job_data['seed'], save_logs=True
)
print("========================================")
print("Starting reinforcement learning phase")
print("========================================")
ts = timer.time()
train_agent(job_name=JOB_DIR,
agent=rl_agent,
seed=job_data['seed'],
niter=job_data['rl_num_iter'],
gamma=job_data['rl_gamma'],
gae_lambda=job_data['rl_gae'],
num_cpu=job_data['num_cpu'],
sample_mode='trajectories',
num_traj=job_data['rl_num_traj'],
num_samples= job_data['rl_num_samples'],
save_freq=job_data['save_freq'],
evaluation_rollouts=job_data['eval_rollouts'])
print("time taken = %f" % (timer.time()-ts))
I ran into the same problem. #ofter method is correct because there are some details to pay attention to. The following is the successful debugging code I modified for your reference:
if __name__ == '__main__':
import matplotlib.pyplot as plt
import numpy as np
def imgshow(img):
img = img / 2 + 0.5
np_img = img.numpy()
plt.imshow(np.transpose(np_img, (1, 2, 0)))
plt.show()
dataiter = iter(train_loader)
images, labels = dataiter.next()
imgshow(torchvision.utils.make_grid(images))
print(' '.join('%5s' % classes[labels[i]] for i in range(4)))
For the record, I don't have a subroutine, I just have a main program, but I have the same problem as you. This demonstrates that when importing a Python library file in the middle of a program segment, we should add:
if __name__ == '__main__':
I tried the tricks mentioned above on the following very simple code. but I still cannot stop it from resetting on any of my Window machines with Python 3.8/3.10. I would very much appreciate it if you could tell me where I am wrong.
print('script reset')
def do_something(inp):
print('Done!')
if __name__ == '__main__':
from multiprocessing import Process, get_start_method
print('main reset')
print(get_start_method())
Process(target=do_something, args=[1]).start()
print('Finished')
output displays:
script reset
main reset
spawn
Finished
script reset
Done!
Update:
As far as I understand, you guys are not preventing either the script containing the __main__ or the .start() from resetting (which doesn't happen in Linux), rather you are suggesting workarounds so that we don't see the reset. One has to make all imports minimal and put them in each function separately, but it is still, relative to Linux, slow.

User input prevents multiprocessing segment of code to work on windows

The problem resolves around my multiprocessing segment not working when I have an input question.
I have tried many workaround to the problem but cannot find a solution, except eliminating the input, however I need it to allow others to interact with my tool.
import time
from multiprocessing import Pool
import collections
choice = input("Do you wish to start program? \n")
print("hello")
start_time = time.time()
value = collections.namedtuple('value',['vectx','vecty'])
Values = (value(vectx=0,vecty=5),value(vectx=5,vecty=10),value(vectx=10,vecty=15),value(vectx=15,vecty=20))#,value(vectx=200,vecty=300),value(vectx=300,vecty=400),value(vectx=400,vecty=500),value(vectx=500,vecty=600),value(vectx=600,vecty=700),value(vectx=700,vecty=800),value(vectx=800,vecty=900),value(vectx=900,vecty=1000),value(vectx=1000,vecty=1100),value(vectx=1100,vecty=1200))
print("Start")
def Alter(x):
vectx=x.vectx
vecty=x.vecty
Z=(vectx+vecty)
return(Z)
if choice == "Yes":
print(1)
if __name__ == '__main__':
with Pool(10) as p:
result=p.map(Alter, Values)
new = []
print("end")
print("result Done")
for i in result:
new.append(i)
print( "My program took " +str(time.time() - start_time)+ " to run")
Expected result is program completes.
Your problem is that windows doesn't have fork like Unix based machines. So each process of the Pool running on Windows imports the main file on creation.
So what happens in your program is that each new process asks for input and your programs tangles up with itself. It was a little unclear to me the location of the if __name__ == '__main__':, but the point here is that you need to keep everything that needs to run once there. Put outside of it only important stuff shared between all processes. For example, a working code on windows could be:
import time
from multiprocessing import Pool
import collections
def Alter(x):
vectx = x.vectx
vecty = x.vecty
Z = (vectx + vecty)
return Z
value = collections.namedtuple('value', ['vectx', 'vecty'])
if __name__ == '__main__':
choice = input("Do you wish to start program? \n")
Values = (value(vectx=0,vecty=5),value(vectx=5,vecty=10),value(vectx=10,vecty=15))
if choice == "Yes":
print("Start")
start_time = time.time()
with Pool(10) as p:
result = p.map(Alter, Values)
print("My program took " + str(time.time() - start_time) + " to run")
Gives:
Do you wish to start program?
Yes
Start
My program took 1.9328622817993164 to run
From the docs, under programming guidelines, section Safe importing of main module:
Make sure that the main module can be safely imported by a new Python
interpreter without causing unintended side effects (such a starting a
new process).

Killing a process launched from a process that has ended - Python

I am trying to kill a process in Python, that is being launched from another process and I am unable to find the correct place to place my ".terminate()".
To explain myself better I will post some example code:
from multiprocessing import Process
import time
def function():
print "Here is where I am creating the function I need to kill"
ProcessToKill = Process(target = killMe)
ProcessToKill.start()
def killMe():
while True:
print "kill me"
time.sleep(0.5)
if __name__ == '__main__':
Process1 = Process(target = function)
Process1.start()
My question is, where can I place ProcessToKill.terminate(), ideally without having to change the overall structure of the code?
You can hold onto the ProcessToKill object so that you can kill it later:
from multiprocessing import Process
import time
def function():
print "Here is where I am creating the function I need to kill"
ProcessToKill = Process(target = killMe)
ProcessToKill.start()
return ProcessToKill
def killMe():
while True:
print "kill me"
time.sleep(0.5)
if __name__ == '__main__':
Process1 = function()
time.sleep(5)
Process1.terminate()
Here, I've removed your wrapping of function in another Process object, because for the example it seems redundant, but you should be able to do the same thing with a Process that runs another Process.

Multiprocessing beside a main loop

I'm struggling with a issue for some time now.
I'm building a little script which uses a main loop. This is a process that needs some attention from the users. The user responds on the steps and than some magic happens with use of some functions
Beside this I want to spawn another process which monitors the computer system for some specific events like pressing specif keys. If these events occur then it will launch the same functions as when the user gives in the right values.
So I need to make two processes:
-The main loop (which allows user interaction)
-The background "event scanner", which searches for specific events and then reacts on it.
I try this by launching a main loop and a daemon multiprocessing process. The problem is that when I launch the background process it starts, but after that I does not launch the main loop.
I simplified everything a little to make it more clear:
import multiprocessing, sys, time
def main_loop():
while 1:
input = input('What kind of food do you like?')
print(input)
def test():
while 1:
time.sleep(1)
print('this should run in the background')
if __name__ == '__main__':
try:
print('hello!')
mProcess = multiprocessing.Process(target=test())
mProcess.daemon = True
mProcess.start()
#after starting main loop does not start while it prints out the test loop fine.
main_loop()
except:
sys.exit(0)
You should do
mProcess = multiprocessing.Process(target=test)
instead of
mProcess = multiprocessing.Process(target=test())
Your code actually calls test in the parent process, and that call never returns.
You can use the locking synchronization to have a better control over your program's flow. Curiously, the input function raise an EOF error, but I'm sure you can find a workaround.
import multiprocessing, sys, time
def main_loop(l):
time.sleep(4)
l.acquire()
# raise an EOFError, I don't know why .
#_input = input('What kind of food do you like?')
print(" raw input at 4 sec ")
l.release()
return
def test(l):
i=0
while i<8:
time.sleep(1)
l.acquire()
print('this should run in the background : ', i+1, 'sec')
l.release()
i+=1
return
if __name__ == '__main__':
lock = multiprocessing.Lock()
#try:
print('hello!')
mProcess = multiprocessing.Process(target=test, args = (lock, ) ).start()
inputProcess = multiprocessing.Process(target=main_loop, args = (lock,)).start()
#except:
#sys.exit(0)

Categories

Resources