I am debugging a multiprocess program with Anaconda 2 in PyCharm Community Edition.
It has several background worker processes. Each worker checks the input Queue in a loop, without sleeping, until it receives a task. I am really only interested in the main process, but the PyCharm debugger always steps into the subprocess; it looks as if the main process never runs, so the task is never sent out. How can I keep the debugger out of the subprocess?
The worker subprocess looks like this:
import logging
import multiprocessing as mp
import ConfigParser
import Queue

class ILSVRC_worker:
    ...
    def run(self):
        cfg_parser = ConfigParser.ConfigParser()
        cfg_parser.read(self.cfg_path)
        data_factory = ILSVRC_DataFactory(cfg_parser)
        logger = mp.log_to_stderr(logging.INFO)
        while True:
            try:
                # block for at most 0.1 s waiting for a task
                annotation_path = self.que_in.get(True, 0.1)
            except Queue.Empty:
                continue
            if annotation_path is None:
                # None is the sentinel used to exit the subprocess
                logger.info('exit the worker process')
                break
            ...
I can think of two ways to achieve this, but unfortunately I don't think either is possible with the Community Edition.
If you have the PID of the process, you could try attaching to it via Tools > Attach to Process... (I don't know whether that is available in the Community Edition). This is difficult if you use a Pool, because you don't know which process the job is assigned to.
Another way would be to run a remote debug server and connect to it from the dispatched Python process. This is only available in the Professional Edition.
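For reference, the remote-debugger route in the Professional Edition boils down to calling pydevd's settrace from inside the worker process so that it connects back to a debug server waiting in the IDE. A minimal sketch; the host and port are placeholders, and the pydevd-pycharm package version has to match your PyCharm build:
import pydevd_pycharm

def run(self):
    # connect this worker process to the PyCharm debug server
    pydevd_pycharm.settrace('localhost', port=12345,
                            stdoutToServer=True, stderrToServer=True)
    # ... rest of the worker loop as before ...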
I ended up testing my code without any multiprocessing.
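For anyone in the same situation, here is a rough sketch of what that can look like for the worker from the question; the constructor arguments and queue contents are made up, the point is simply to drive run() synchronously in the main process so the debugger has only one process to follow:
import multiprocessing as mp

que_in = mp.Queue()
que_in.put('/path/to/annotation.xml')  # one sample task (hypothetical path)
que_in.put(None)                       # the None sentinel makes run() return

worker = ILSVRC_worker(...)            # construct as in the real code
worker.que_in = que_in
worker.run()                           # runs in the main process, easy to step through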
Related
I'm trying to communicate with an interactive Windows console application, to send commands to it and receive their output, but the subprocess that create_subprocess_exec() returns seems to be the wrong process, since the application creates two processes.
When the subprocess starts, two processes appear: JoyShockMapper.exe and conhost.exe. As far as I understand, conhost.exe is the actual console I should be communicating with (it in turn talks to JoyShockMapper.exe), because I fail to communicate properly with proc.
By getting the PID and searching for it, I can see that proc is actually the JoyShockMapper.exe process and not conhost.exe. How can I access the conhost.exe process instead?
EDIT: After entering commands into the console application and watching Task Manager, the memory usage of the JoyShockMapper.exe process changes, but conhost.exe doesn't change at all. Would this mean that JoyShockMapper.exe is the console I'm looking for? If so, am I communicating with it the wrong way, and how do I talk to it properly?
EDIT 2: No, conhost.exe does seem to be the thing I need to talk to. How do I connect to it?
import asyncio
from time import sleep

async def main():
    proc = await asyncio.subprocess.create_subprocess_exec(
        "JoyShockMapper.exe",
        stdin=asyncio.subprocess.PIPE,
        stdout=asyncio.subprocess.PIPE)
    sleep(3)
    print(proc.pid)  # Outputs JoyShockMapper.exe's PID
    proc.stdin.write(b"HIDE_MINIMIZED = OFF\n")  # Doesn't do anything, no change in the console
    print(await proc.stdout.read(1024))  # Outputs b''
    await proc.wait()

asyncio.run(main())
There is no fix.
After I posted this question I did some more research, and I'm fairly sure I saw Microsoft state somewhere that it's impossible to connect to a running application's stdin and stdout from outside the code that launched it (JSM's own code in this case).
I decided instead to find the PID of the console host by matching its parent PID, and to do some hacky, awful stuff to send keys to its window for commands. Getting output from it is unfortunately impossible on Windows.
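For reference, the parent-PID matching can be done with psutil; a rough sketch, assuming conhost.exe shows up as a child (or grandchild) of the launched process, with the process names taken from the question:
import psutil

def find_conhost(parent_pid):
    # look for a conhost.exe among the descendants of the process we launched
    for child in psutil.Process(parent_pid).children(recursive=True):
        if child.name().lower() == 'conhost.exe':
            return child.pid
    return None

# conhost_pid = find_conhost(proc.pid)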
I'm trying to communicate between multiple threading.Thread(s) doing I/O-bound tasks and multiple multiprocessing.Process(es) doing CPU-bound tasks. Whenever a thread finds work for a process, it puts the work on a multiprocessing.Queue, together with the sending end of a multiprocessing.Pipe(duplex=False). The processes then do their part and send results back to the threads via the Pipe. This procedure seems to work in roughly 70% of the cases; in the other 30% I receive an AttributeError: Can't get attribute 'DupFd' on <module 'multiprocessing.resource_sharer' from '/usr/lib/python3.5/multiprocessing/resource_sharer.py'>
To reproduce:
import multiprocessing
import threading
import time

def thread_work(work_queue, pipe):
    while True:
        work_queue.put((threading.current_thread().name, pipe[1]))
        received = pipe[0].recv()
        print("{}: {}".format(threading.current_thread().name,
                              threading.current_thread().name == received))
        time.sleep(0.3)

def process_work(work_queue):
    while True:
        thread, pipe = work_queue.get()
        pipe.send(thread)

work_queue = multiprocessing.Queue()

for i in range(0, 3):
    receive, send = multiprocessing.Pipe(duplex=False)
    t = threading.Thread(target=thread_work, args=[work_queue, (receive, send)])
    t.daemon = True
    t.start()

for i in range(0, 2):
    p = multiprocessing.Process(target=process_work, args=[work_queue])
    p.daemon = True
    p.start()

time.sleep(5)
I had a look at the multiprocessing source code but couldn't understand why this error occurs.
I tried using queue.Queue, and a Pipe with duplex=True (the default), but couldn't find a pattern in the error. Does anyone have a clue how to debug this?
You are forking an already multi-threaded main process here. That is known to be problematic in general.
It is in fact problem-prone (and not just in Python). The rule is "thread after you fork, not before". Otherwise, the locks used by the thread executor will get duplicated across processes. If one of those processes dies while it has the lock, all of the other processes using that lock will deadlock. (Raymond Hettinger)
The trigger for the error you get is apparently that duplicating the file descriptor for the pipe fails in the child process.
To resolve this, either create your child processes while your main process is still single-threaded, or use another start_method for creating new processes, such as 'spawn' (the default on Windows) or 'forkserver', if available.
forkserver
When the program starts and selects the forkserver start method, a server process is started. From then on, whenever a new process is needed, the parent process connects to the server and requests that it fork a new process. The fork server process is single threaded so it is safe for it to use os.fork(). No unnecessary resources are inherited.
Available on Unix platforms which support passing file descriptors over Unix pipes. docs
You can specify another start_method with:
multiprocessing.set_start_method(method)
Set the method which should be used to start child processes. method can be 'fork', 'spawn' or 'forkserver'.
Note that this should be called at most once, and it should be protected inside the if __name__ == '__main__' clause of the main module. docs
For a benchmark of the specific start_methods (on Ubuntu 18.04) look here.
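As an illustration, here is a minimal sketch of both suggestions applied to the code from the question: the start method is switched to 'forkserver' (fall back to 'spawn' if it is not available on your platform), and the processes are started before the threads, while the main process is still single-threaded. thread_work and process_work are unchanged from the question:
import multiprocessing
import threading
import time

# thread_work and process_work exactly as defined in the question

if __name__ == '__main__':
    # must be called at most once, before any queues, pipes,
    # threads or processes are created
    multiprocessing.set_start_method('forkserver')  # or 'spawn'

    work_queue = multiprocessing.Queue()

    # start the worker processes while the main process is still single-threaded ...
    for i in range(0, 2):
        p = multiprocessing.Process(target=process_work, args=[work_queue])
        p.daemon = True
        p.start()

    # ... and only then start the threads
    for i in range(0, 3):
        receive, send = multiprocessing.Pipe(duplex=False)
        t = threading.Thread(target=thread_work, args=[work_queue, (receive, send)])
        t.daemon = True
        t.start()

    time.sleep(5)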
I am trying to start a Java process that is meant to run for a long time, using Python's subprocess module.
What I am actually doing is using the multiprocessing module to start a new Process, and inside that Process using the subprocess module to run java -jar.
This works fine, but when I start the new process, the Java process seems to replace the Python process backing the multiprocessing.Process. I would like Java to run as a child process, so that when the process that started the new multiprocessing.Process dies, the process running Java dies too.
Is this possible?
Thanks.
Edit: here's some code to clarify my question:
from multiprocessing import Process
from subprocess import Popen

def run_task():
    # pass the command as a list of separate arguments
    pargs = ["java", "-jar", "app.jar"]
    p = Popen(pargs)
    p.communicate()[0]
    return p

while True:
    a = a_blocking_call()  # defined elsewhere in the real application
    process = Process(target=run_task)
    process.start()
    if not a:
        break
I want the process running run_task to be killed along with the process running java when the process executing the while loop reaches the break line. Is this possible?
I think you should show some code; it's not clear how you are using subprocess and multiprocessing together.
From the documentation it looks like subprocess should spawn a child and not replace your Process-started process. Are you sure that isn't happening? A test case showing that it doesn't would be good.
You may get some hints out of Detach a subprocess started using python multiprocessing module
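If the requirement is just that java dies when the loop breaks, one sketch (not the only way; a_blocking_call and app.jar are the placeholders from the question) is to drop the intermediate multiprocessing.Process, keep the Popen handles in the parent, and terminate them explicitly when leaving the loop:
from subprocess import Popen

children = []

while True:
    a = a_blocking_call()
    # Popen returns immediately; java runs as a direct child of this process
    children.append(Popen(["java", "-jar", "app.jar"]))
    if not a:
        break

# kill every java child when the loop ends
for p in children:
    if p.poll() is None:  # still running?
        p.terminate()
        p.wait()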
My application creates subprocesses. Usually, these processes run and terminate without any problems. However, sometimes they crash.
I am currently using the Python subprocess module to create these subprocesses. I check whether a subprocess has crashed by calling the Popen.poll() method. Unfortunately, since my debugger is activated at the time of a crash, polling doesn't return the expected output.
I'd like to be able to see the debugging window (not terminate it) and still be able to detect in the Python code whether a process has crashed.
Is there a way to do this?
When your debugger opens, the process isn't finished yet, and subprocess only knows whether a process is running or has finished. So no, there is no way to do this via subprocess.
I found a workaround for this problem. I used the solution given in another question Can the "Application Error" dialog box be disabled?
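For reference, that linked solution is based on suppressing the Windows error dialog before spawning the children, since the error mode is inherited by child processes. A rough sketch with ctypes; SEM_NOGPFAULTERRORBOX is the documented flag value, while the child command line is just a placeholder:
import ctypes
import subprocess

SEM_NOGPFAULTERRORBOX = 0x0002

# children inherit the parent's error mode, so set it before spawning them
ctypes.windll.kernel32.SetErrorMode(SEM_NOGPFAULTERRORBOX)

child = subprocess.Popen(["my_child.exe"])  # hypothetical child process
# ... later ...
if child.poll() is not None and child.returncode != 0:
    print("child crashed with code {0}".format(child.returncode))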
Items of consideration:
subprocess.check_output() for your child processes' return codes
psutil for process & child analysis (and much more)
the threading library, to monitor these child states in your script as well, once you've decided how you want to handle the crashing, if desired
import psutil

# you can find your process id in various ways of your choosing
myprocess = psutil.Process(process_id)

for child in myprocess.children():
    print("Status of child process is: {0}".format(child.status()))
You can also use the threading library to run your subprocess from a separate thread, and then perform the above psutil analysis concurrently with your other work; see the sketch below.
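A rough sketch of such a watcher thread; the helper name, the polling interval, and the reuse of process_id from the snippet above are all assumptions:
import threading
import time

import psutil

def monitor_children(pid, interval=1.0):
    # poll the children of `pid` until none of them are left
    parent = psutil.Process(pid)
    while parent.children():
        for child in parent.children():
            print("Status of child process is: {0}".format(child.status()))
        time.sleep(interval)

watcher = threading.Thread(target=monitor_children, args=(process_id,))
watcher.daemon = True
watcher.start()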
If you find more, let me know; it's no coincidence that I've found this post.
I have a Python web application that needs to launch a long-running process. The catch is that I don't want it to wait around for the process to finish; just launch and finish.
I'm running on Windows XP, and the web app is running under IIS (if that matters).
So far I tried Popen, but it didn't seem to work: it waited until the child process finished.
OK, I finally figured this out! This seems to work:
from subprocess import Popen
from win32process import DETACHED_PROCESS

pid = Popen(["C:\python24\python.exe", "long_run.py"],
            creationflags=DETACHED_PROCESS,
            shell=True).pid
print pid
print 'done'
# I can now close the console or anything I want and long_run.py continues!
Note: I added shell=True. Otherwise calling print in the child process gave me the error "IOError: [Errno 9] Bad file descriptor"
DETACHED_PROCESS is a Process Creation Flag that is passed to the underlying WINAPI CreateProcess function.
Instead of directly starting processes from your web app, you could write jobs into a message queue. A separate service reads from the message queue and runs the jobs. Have a look at Celery, a Distributed Task Queue written in Python.
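A minimal sketch of that pattern with Celery; the broker URL and task name are placeholders, and it assumes a broker such as Redis is running:
# tasks.py
from celery import Celery

app = Celery('tasks', broker='redis://localhost:6379/0')  # placeholder broker URL

@app.task
def long_run():
    # the long-running work goes here instead of a separate script
    pass
The web app then only calls long_run.delay(), which returns immediately, and a worker started with celery -A tasks worker picks the job up outside of IIS.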
This almost works (from here):
from subprocess import Popen
pid = Popen(["C:\python24\python.exe", "long_run.py"]).pid
print pid
print 'done'
'done' will get printed right away. The problem is that the process above keeps running until long_run.py returns, and if I close that process it kills long_run.py's process too.
Surely there is some way to make a process completely independent of the parent process.
subprocess.Popen does that.
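For completeness, on current Python 3 (3.7+) the detachment flag used in the accepted snippet is exposed directly on the subprocess module, so win32process is no longer needed; a Windows-only sketch with a placeholder script path:
import subprocess

p = subprocess.Popen(
    ["python", "long_run.py"],
    creationflags=subprocess.DETACHED_PROCESS | subprocess.CREATE_NEW_PROCESS_GROUP,
)
print(p.pid)
print("done")  # returns immediately; the child keeps running after this script exits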