Python Thread.start blocking in a subprocess?

I have a Python class which starts a subprocess that runs another Python program; the first thing that program does is spawn a threading.Thread to do some other work. However, regardless of what the thread's target is, the Thread.start call blocks, and the rest of the program is not executed.
What could possibly be causing this problem? Is it some general problem regarding the Python global interpreter lock?
EDIT: For some more background, the subprocess runs a single PyTest unit test; inside the unit test, a thread is started to create a server (the server creation isn't the problem; the threading.Thread.start problem occurs regardless).
The subprocess call is
result = subprocess.Popen(['/usr/local/bin/py.test', '-v', test_path, '-k',
                           test_name],
                          cwd=grading_path,
                          env=env,
                          stdout=subprocess.PIPE,
                          stderr=subprocess.PIPE)
stdout, stderr = result.communicate()
and the threading call is
def test_case_1(self):
    t = Thread(target=time.sleep, args=(1,))  # fake target, doesn't work regardless
    t.start()  # it blocks here
    # ... other logic ...
There is communication between the subprocess and the parent process; the subprocess basically just sends back some result data.

Related

How to asynchronously call a shell script from Python?

I have a shell script which does some processing on the string passed to it and then writes it to a file. However, I don't want my function foo() to wait for it to complete the operation. How do I call process(msg) and then move on with the execution of {code block 2} without waiting for process(msg) to complete execution?
def process(msg):
    subprocess.call(['sh', './process.sh', msg])

def foo():
    # {code block 1}
    process(msg)
    # {code block 2}
foo() will be called from another function, about once or twice per second.
Just for completeness: Python's asyncio offers a high level interface for doing just that:
https://docs.python.org/3.9/library/asyncio-subprocess.html#subprocesses
Example from documentation:
import asyncio

async def run(cmd):
    proc = await asyncio.create_subprocess_shell(
        cmd,
        stdout=asyncio.subprocess.PIPE,
        stderr=asyncio.subprocess.PIPE)

    stdout, stderr = await proc.communicate()

    print(f'[{cmd!r} exited with {proc.returncode}]')
    if stdout:
        print(f'[stdout]\n{stdout.decode()}')
    if stderr:
        print(f'[stderr]\n{stderr.decode()}')

asyncio.run(run('ls /zzz'))
subprocess.run() creates a process, waits for it to finish, and returns a CompletedProcess object; subprocess.call() does the same but returns only the exit code.
subprocess.Popen() creates a process and returns it; it is used under the hood of the previous functions. You can then wait for the process to finish, send it messages, or do whatever else you want with it. The arguments are mostly the same as for call or run.
https://docs.python.org/3/library/subprocess.html
As a bit of elaboration, Popen is the Python wrapper around asking the OS to start a new process. os.fork() is a lower-level call that doesn't actually do what we want here: it spawns another instance of the Python interpreter with the same memory state as the current one. If you wanted a lower-level syscall, the os.spawn* family (e.g. os.spawnv) is closer to subprocess.run than os.fork is.
To verify that Popen is doing what you want, this test program will print "p returncode = None" immediately, then wait 5 seconds and print "p returncode = 0":
from subprocess import Popen
p = Popen(["sleep", "5"])
print("started the proc") # this will print immediately
p.poll() # this checks if the process is done but does not block
print(f"p returncode = {p.returncode}")
p.wait() # this blocks until the process exits
print(f"p returncode = {p.returncode}")
What you need is os.fork() (https://docs.python.org/3/library/os.html#os.fork): that way you can spawn a child which can outlive the parent process and later be reparented to systemd on Linux. I have no clue about Windows; note that os.fork is not available there.

subprocess.Popen doesn't work with shell=False

I am trying to run a simple script on Windows in the same shell.
When I run
subprocess.call(["python.exe", "a.py"], shell=False)
It works fine.
But when I run
subprocess.Popen(["python.exe", "a.py"], shell=False)
It opens a new shell, and shell=False has no effect.
a.py just prints a message to the screen.
First, calling Popen with shell=False doesn't mean that the underlying program won't try to open a window/console. It just means that the current Python instance executes python.exe directly rather than in a system shell (cmd or sh).
Second, Popen returns a handle on the process, and you have to perform a wait() on this handle for it to end properly, or you could end up with a defunct (zombie) process, depending on the platform you're running on. I suggest that you try
p = subprocess.Popen(["python.exe", "a.py"], shell=False)
return_code = p.wait()
to wait for process termination and get return code.
Note that Popen is a very bad way to run processes in the background. The best way would be to use a separate thread:
import subprocess
import threading

def run_it():
    subprocess.call(["python.exe", "a.py"], shell=False)

t = threading.Thread(target=run_it)
t.start()
# do your stuff
# in the end
t.join()

How to kill a subprocess started in a thread?

I am trying to run the Robocopy command (but I am curious about any subprocess) from Python on Windows. The code is pretty simple and works well:
def copy(media_path, destination_path, log_path):
    with Popen(['Robocopy', media_path, destination_path, '/E', '/mir', '/TEE',
                '/log+:' + log_path], stdout=PIPE, bufsize=1,
               universal_newlines=True) as Robocopy:
        Robocopy.wait()
        returncode = Robocopy.returncode
Additionally I am running it in a separate thread with the following:
threading.Thread(target=copy, args=(media_path, destination_path, log_path,), daemon=True)
However, there are certain instances where I want to stop the robocopy (akin to closing the CMD window if it was run from the command line)
Is there a good way to do this in Python?
We fought with reliably killing subprocesses on Windows for a while and eventually came across this:
https://github.com/andreisavu/python-process/blob/master/killableprocess.py
It implements a kill() method for killing your subprocess. We've had really good results with it.
You will need to somehow pass the process object out of the thread and call kill() from another thread, or poll in your thread with wait() using a timeout while monitoring some kind of global-ish flag.
If the process doesn't start other processes then process.kill() should work:
import subprocess

class InterruptableProcess:
    def __init__(self, *args):
        self._process = subprocess.Popen(args)

    def interrupt(self):
        self._process.kill()
I don't see why you would need it on Windows, but you could run Thread(target=self._process.wait, daemon=True).start() if you'd like.
If there is a possibility that the process may start other processes in turn, then you might need a Job object to kill all the descendant processes. It seems killableprocess.py, suggested by @rrauenza above, uses this approach (I haven't tested it). See Python: how to kill child process(es) when parent dies?

Polling subprocess object without blocking

I'm writing a python script that launches programs in the background and then monitors to see if they encounter an error. I am using the subprocess module to start the process and keep a list of running programs.
processes.append((subprocess.Popen(command, stdin=subprocess.PIPE, stdout=subprocess.PIPE, stderr=subprocess.PIPE), command))
I have found that when I try to monitor the programs by calling communicate on the subprocess object, the main program waits for the program to finish. I have tried to use poll(), but that doesn't give me access to the error code that caused the crash and I would like to address the issue and retry opening the process.
runningProcesses is a list of tuples containing the subprocess object and the command associated with it.
def monitorPrograms(runningProcesses):
    for program in runningProcesses:
        temp = program[0].communicate()
        if program[0].returncode:
            if program[0].returncode == 1:
                print "Program exited successfully."
            else:
                print "Whoops, something went wrong. Program %s crashed." % program[0].pid
When I tried to get the return code without using communicate, the crash of the program didn't register.
Do I have to use threads to run the communication in parallel or is there a simpler way that I am missing?
No need to use threads to monitor multiple processes, especially if you don't use their output (use DEVNULL instead of PIPE to hide the output); see Python threading multiple bash subprocesses?
Your main issue is incorrect Popen.poll() usage. If it returns None, it means that the process is still running; you should call it until you get a non-None value. Here's a code example similar to your case that prints ping processes' statuses.
If you do want to get the subprocess' stdout/stderr as a string, then you could use threads or asyncio.
If you are on Unix and you control all the code that may spawn subprocesses, then you could avoid polling and handle SIGCHLD yourself. The asyncio stdlib library can handle SIGCHLD for you. You could also implement it manually, though that might be complicated.
Based on my research, the best way to do this is with threads. Here's an article that I referenced when creating my own package to solve this problem.
The basic method used here is to spin off threads that constantly request log output (and finally the exit status) of the subprocess call.
Here's an example of my own "receiver" which listens for logs:
class Receiver(threading.Thread):
    def __init__(self, stream, stream_type=None, callback=None):
        super(Receiver, self).__init__()
        self.stream = stream
        self.stream_type = stream_type
        self.callback = callback
        self.complete = False
        self.text = ''

    def run(self):
        for line in iter(self.stream.readline, ''):
            line = line.rstrip()
            if self.callback:
                line = self.callback(line, msg_type=self.stream_type)
            self.text += line + "\n"
        self.complete = True
And now the code that spins the receiver off:
def _execute(self, command):
    process = subprocess.Popen(command, stdout=subprocess.PIPE, stderr=subprocess.PIPE,
                               shell=True, preexec_fn=os.setsid)
    out = Receiver(process.stdout, stream_type='out', callback=self.handle_log)
    err = Receiver(process.stderr, stream_type='err', callback=self.handle_log)
    out.start()
    err.start()
    try:
        self.wait_for_complete(out)
    except CommandTimeout:
        os.killpg(process.pid, signal.SIGTERM)
        raise
    else:
        status = process.poll()
        output = CommandOutput(status=status, stdout=out.text, stderr=err.text)
        return output
    finally:
        out.join(timeout=1)
        err.join(timeout=1)
CommandOutput is simply a named tuple that makes it easy to reference the data I care about.
You'll notice I have a method wait_for_complete which waits for the receiver to set complete = True. Once complete, the execute method calls process.poll() to get the exit code. We now have all the stdout/stderr and the status code of the process.

Python subprocess.Popen not working

I've been reading up on a lot of documentation but am still not sure what I'm doing wrong.
So I have a shell script that fires up a server separate from the one I'm working on. Once the server is connected, I want to run ls, and that's it. However, for some reason stdin=subprocess.PIPE is preventing the Popen call from terminating so that the next line can execute. For example, because the code is stuck I'll press Ctrl+C, and I'll get an error saying that wait() got a keyboard interrupt. Here's an example:
import subprocess
from time import sleep

p1 = subprocess.Popen("run_server",
                      stdout=subprocess.PIPE,
                      stdin=subprocess.PIPE)
#sleep(1)
p1.wait()
p1.communicate(input="ls")[0]
If I replace p1.wait() with sleep(1), the communicate call does run and displays ls, but then the script that runs the server detects EOF on its tty and terminates itself. I must have some kind of wait between Popen and communicate, because otherwise the server script terminates for the same reason.
p.wait() does not return until the child process is dead. While the parent script is stuck on the p.wait() call, your child process expects input at the same time: deadlock. Then you press Ctrl+C in the shell; it sends SIGINT to all processes in the foreground process group, killing both your parent Python script and the run_server subprocess.
You should drop the .wait() call:
#!/usr/bin/env python
from subprocess import Popen, PIPE
p = Popen(["run_server"], stdout=PIPE, stdin=PIPE)
output = p.communicate(b"ls")[0]
Or in Python 3.4+:
#!/usr/bin/env python3
from subprocess import check_output
output = check_output(["run_server"], input=b"ls")
If you want to run several commands then pass them all at once:
input = "\n".join(["ls", "cmd2", "etc"]) # with universal_newlines=True
As you know from reading the subprocess docs, p.communicate() waits for the child process to exit, and therefore it should be called at most once. As with .wait(), the child process is dead after .communicate() has returned.
The fact that your traceback says you were stuck in wait() when you pressed Ctrl+C means the next line was indeed executing; that next line is wait(). wait() won't return until your p1 process returns, but it seems your p1 process won't return until you send it a command, 'ls' in your case. Try sending the command and then calling wait():
import subprocess
from time import sleep

p1 = subprocess.Popen("run_server",
                      stdout=subprocess.PIPE,
                      stdin=subprocess.PIPE)
#sleep(1)
p1.communicate(input="ls")[0]
p1.wait()
Otherwise, make sure your "run_server" script terminates on its own so your script can advance past p1.wait().
