I am trying to write a script that has to make a lot of calls to bash commands, parse and process the outputs, and finally produce some output.
I was using subprocess.Popen and subprocess.call
If I understand correctly, these methods spawn a bash process, run the command, get the output, and then kill the process.
Is there a way to have a bash process running continuously in the background so that the Python calls could go directly to that process? This would be something like bash running as a server with Python calls going to it.
I feel this would optimize the calls a bit as there is no bash process setup and teardown. Or will it give no performance advantage?
I feel this would optimize the calls a bit as there is no bash process setup and teardown.
subprocess never runs the shell unless you ask for it explicitly, e.g.:
#!/usr/bin/env python
import subprocess
subprocess.check_call(['ls', '-l'])
This call runs the ls program directly, without invoking /bin/sh.
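If you do want the shell, you have to opt in with shell=True. A minimal sketch of explicitly asking for it, e.g., to use pipeline syntax:

#!/usr/bin/env python
import subprocess
# the string is passed to /bin/sh, so shell features such as | are available
subprocess.check_call('ls -l | wc -l', shell=True)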
Or will it give no performance advantage?
If your subprocess calls actually use the shell, e.g., to specify a pipeline concisely, or you use bash process substitution that would be verbose and error-prone to express with the subprocess module directly, then it is unlikely that invoking bash is the performance bottleneck -- measure it first.
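As a rough way to measure it, you could time the per-call process startup overhead directly; a sketch using only the standard library ('true' is just a command that does nothing):

#!/usr/bin/env python
import timeit
# time 100 spawns of a trivial external command to estimate per-call overhead
print(timeit.timeit("subprocess.check_call(['true'])",
                    setup='import subprocess', number=100))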
There are Python packages that also allow you to specify such commands concisely, e.g., plumbum can be used to emulate a shell pipeline.
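For instance, a minimal plumbum sketch of the pipeline ls -l | grep py | wc -l (assuming the plumbum package is installed):

from plumbum.cmd import ls, grep, wc
# the | operator builds a pipeline object; calling it runs the commands
# (note: this raises an error if grep matches nothing, i.e. exits nonzero)
chain = ls['-l'] | grep['py'] | wc['-l']
print(chain())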
If you want to use bash as a server process, then pexpect is useful for dialog-based interactions with an external process -- though it is unlikely to affect time performance. fabric allows you to run both local and remote commands (over ssh).
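A minimal pexpect sketch of such a dialog (the prompt pattern here is an assumption; a real script should set a known, unambiguous prompt first):

import pexpect
# start bash in a pty and wait for a prompt before each command
child = pexpect.spawn('bash --norc')
child.expect(r'\$')
child.sendline('echo hello')
child.expect(r'\$')
print(child.before.decode())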
There are other subprocess wrappers, such as sarge, which can parse a pipeline specified in a string without invoking the shell, e.g., it enables cross-platform support for bash-like syntax (&&, ||, & in command lines), or sh -- a complete subprocess replacement on Unix that provides a TTY by default (it seems full-featured, but the shell-like piping is less straightforward). You can even use Python-ish, BASHwards-looking syntax to run commands with the xonsh shell.
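For example, a minimal sketch with the sh package (Unix only; assuming sh is installed):

import sh
# each command is a function; passing one command's output to another
# emulates the pipeline ls -l | wc -l
print(sh.wc(sh.ls('-l'), '-l'))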
Again, it is unlikely that it affects performance in a meaningful way in most cases.
The problem of starting and communicating with external processes in a portable manner is complex -- the interaction between processes, pipes, ttys, signals, threading, async I/O, and buffering in various places has rough edges. Introducing a new package may complicate things if you don't know how a specific package solves the numerous issues related to running shell commands.
If I understand correct these methods spawn a bah process, run the command, get the output and then kill the process.
subprocess.Popen is a bit more involved; its communicate() method does extra work (threads or a select loop, depending on the platform) to avoid deadlocks. See https://www.python.org/dev/peps/pep-0324/:
A communicate() method, which makes it easy to send stdin data and read stdout and stderr data, without risking deadlocks. Most people are aware of the flow control issues involved with child process communication, but not all have the patience or skills to write a fully correct and deadlock-free select loop. This means that many Python applications contain race conditions. A communicate() method in the standard library solves this problem.
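A minimal sketch of using communicate() to read both streams safely:

import subprocess
p = subprocess.Popen(['ls', '-l'],
                     stdout=subprocess.PIPE, stderr=subprocess.PIPE)
out, err = p.communicate()  # drains stdout and stderr without deadlocking
print(out.decode())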
Is there a way to have a bash process running in the background continuously and then the python calls could just go directly to that process?
Sure, you can still use subprocess.Popen and send messages to your subprocess and receive messages back without terminating it. In the simplest case your messages can be lines.
This allows for request-response style protocols as well as publish-subscribe, where the subprocess keeps sending you messages back whenever an event of interest happens.
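A minimal sketch of bash as a long-running "server" process, using a sentinel line to delimit each response (the __DONE__ marker is an arbitrary choice, and stderr is left alone for simplicity):

import subprocess

# keep a single bash process alive and feed it commands line by line
proc = subprocess.Popen(['bash'], stdin=subprocess.PIPE,
                        stdout=subprocess.PIPE,
                        universal_newlines=True, bufsize=1)

def run(command):
    # a sentinel after each command marks the end of its output
    proc.stdin.write(command + '; echo __DONE__\n')
    proc.stdin.flush()
    lines = []
    while True:
        line = proc.stdout.readline()
        if line.strip() == '__DONE__':
            break
        lines.append(line)
    return ''.join(lines)

print(run('ls -l'))
print(run('echo hello'))
proc.stdin.close()
proc.wait()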
Related
I'm not sure if what I want to do is possible, but:
I have a Python script (let's call it PY) that calls a batch script to start a tool in terminal mode (let's call it A). This tool gets passed a starting script (a Tcl script) that sets up its environment and launches a second tool (let's call it B). The two tools communicate over a local TCP connection.
My question is: with these two programs running (A and B), can I switch back to the Python script to run commands in either A or B's Tcl interface?
The scripts look sort of like this:
# python PY
import subprocess

def ReadConfigAndSetup():
    # read some data
    ...

# run bat
subprocess.run("./some_bat.bat some_data_args")
# bat: start program A and pass it a startup script
some_program_A -mode tcl -source ./some_source.tcl
#tcl some_source.tcl
setup environment
open TCP port
start program B
#program B setup tcl
some more setup
After program B has started, I'd like to be able to run more commands in program B from Python, as parsing some of the config files is much easier in the Python environment.
The answer is “it depends on the details”.
There's no reason in principle why the program being called can't work fine this way, provided the subprocess relinquishes control back (which it might or might not), but launching complex programs via a BAT file adds an extra layer of complexity, so you might want to think about whether you can simplify there.
If the program running the Tcl code doesn't terminate, things get trickier. This is an area where the details are critical: Tcl code can be written to loop indefinitely (it's a programming language, so of course it can be told to be annoying if you insist), and the program being controlled could also decide to loop indefinitely of its own accord. This happens particularly with GUI applications, where the event loop is how the user interacts with the GUI. On Windows, many GUI applications run disconnected from the terminal (whether they do is a compile-time option), and waiting for them to finish can be quite annoying.
It's possible to run multiple subprocesses at once using subprocess.Popen. Be very careful if you do this. It's possible to get into deadlocks (though that depends a lot on what the subprocesses are doing). It's probably easier to just launch each subprocess from its own thread… but then you're dealing with threads and that's also complicated.
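A minimal sketch of the one-thread-per-subprocess approach (the tool names are placeholders from the question; assumes Python 3.7+ for capture_output):

import subprocess
import threading

def run_tool(cmd):
    # capturing output keeps the child from blocking on a full pipe
    result = subprocess.run(cmd, capture_output=True, text=True)
    print(cmd[0], 'exited with code', result.returncode)

threads = [threading.Thread(target=run_tool, args=(cmd,))
           for cmd in (['some_program_A', '-mode', 'tcl',
                        '-source', './some_source.tcl'],
                       ['some_program_B'])]
for t in threads:
    t.start()
for t in threads:
    t.join()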
I am using Python multiprocessing for a project, and sometimes a process freezes. Apparently the cause is this process, which I find when running ps aux:
python -c from multiprocessing.semaphore_tracker import main;main(39)
Some more info:
If I kill the process, everything runs fine.
This problem is not frequent, meaning there can be days of everything running fine without it happening.
I am using PyCharm.
I am running this Python code on a server using the PyCharm remote interpreter, and sometimes over SSH.
Questions:
What is happening that this process is appearing?
Why isn't it finishing by itself?
What does it do that freezes other processes?
How to avoid this situation?
According to the documentation:
On Unix using the spawn or forkserver start methods will also start a semaphore tracker process which tracks the unlinked named semaphores created by processes of the program.
Why one would want to use the spawn start method escapes me. It is a (very clever) bodge necessary on ms-windows because that OS doesn't have the fork system call.
So I suspect that PyCharm imposes the use of the forkserver start method because it uses multiple threads internally, and the standard UNIX fork start method doesn't deal well with multithreaded programs.
Try running your project from a shell. On UNIX-like operating systems, that should default to the fork start method, which does not require the semaphore tracker process.
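A minimal sketch of selecting the start method explicitly (POSIX only; 'fork' is unavailable on Windows):

import multiprocessing as mp

def work(x):
    return x * x

if __name__ == '__main__':
    # 'fork' avoids the semaphore tracker that spawn/forkserver start
    mp.set_start_method('fork')
    with mp.Pool(4) as pool:
        print(pool.map(work, range(10)))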
Assume we are using Linux:
In Perl, the exec function executes an external program and immediately exits itself, leaving the external program in the same shell session.
A very close answer using Python is https://stackoverflow.com/a/13256908
However, the Python solution using start_new_session=True starts the external program via setsid, which means that solution is suitable for making a daemon, not an interactive program.
Here is a simple example using Perl:
perl -e '$para=qq(-X --cmd ":vsp");exec "vim $para"'
After vim is started, the original Perl program has exited, and vim is still in the same shell session (vim is not sent to a new session group).
How can I get the same behavior with Python?
Perl is just wrapping the exec* system call functions here. Python has the same wrappers in the os module; see the os.exec* documentation:
These functions all execute a new program, replacing the current process; they do not return. On Unix, the new executable is loaded into the current process, and will have the same process id as the caller.
To do the same in Python:
python -c 'import os, shlex; para = shlex.split("-X --cmd \":vsp\""); os.execlp("vim", "vim", *para)'
os.execlp looks up the binary on $PATH using its first argument. The remaining arguments become the new process's argv, so the second argument should repeat the program name, as it becomes argv[0]; shlex.split handles the quoted ":vsp" the same way the shell does in the Perl example.
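The same thing as a standalone script rather than a one-liner:

#!/usr/bin/env python
import os

# replace the current Python process with vim; nothing after this line runs
os.execlp('vim', 'vim', '-X', '--cmd', ':vsp')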
The subprocess module is only ever suitable for running processes next to the Python process, not for replacing the Python process. On POSIX systems, the subprocess module uses the same low-level exec* functions to implement its functionality: a fork of the Python process is replaced with the command you want to run with subprocess.
I'm using subprocess to start a process and let it run in the background, it's a server application. The process itself is a java program with a thin wrapper (which among other things, means that I can just launch it as an executable without having to call java explicitly).
I'm using Popen to run the process, and when I set shell=False, it runs but spawns two processes instead of one. The first process has init as its parent, and when I inspect it via ps, it just displays the raw command. However, the second process displays the expanded java arguments (-D and -X flags) -- this is what I expect to see and how the process looks when I run the command manually.
Interestingly, when I set shell=True, the command fails. The command does have a help message but it doesn't seem to indicate that there's a problem with my argument list (there shouldn't be). Everything is the same except the shell named argument to Popen.
I'm using Python 2.7 on Ubuntu. Not really sure what's going on here, any help is appreciated. I suppose it's possible that the java command is doing an exec/fork and for some reason, the parent process isn't dying when I start it through Python.
I saw this SO question which looked promising but doesn't change the behavior that I'm experiencing.
This is actually more of a question about the wrapper than about Python -- you would get the same behavior running it from any other language.
To get the behavior you want, the wrapper would want to have the line where it invokes the JVM look as follows:
exec java -D... -cp ... main.class.here "$@"
...as opposed to lacking the exec in front:
java -D... -cp ... main.class.here "$@"
In the former case, the process image of the wrapper is replaced with that of the JVM it invokes; in the latter, the wrapper waits for the JVM to exit, and then continues to run.
If the wrapper does any cleanup after the JVM exits, using exec will prevent that from happening and would thus be the Wrong Thing -- in that case, you would want the wrapper to stay alive while the JVM runs, as otherwise it would be unable to perform the cleanup afterwards.
Be aware that if the wrapper is responsible for detaching the subprocess, it needs to be able to close open file handles for this to happen correctly. Consider passing close_fds=True to your Popen call if your parent process has more file descriptors than only stdin, stdout and stderr open.
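A minimal sketch of such a Popen call (the wrapper name is hypothetical; on Python 2.7, close_fds is not the default, so pass it explicitly):

import subprocess

# launch the wrapper, which exec's the JVM; close_fds=True prevents extra
# inherited file descriptors from keeping pipes open in the child
proc = subprocess.Popen(['./my-java-wrapper'], close_fds=True)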
I know how to launch a command in a Linux terminal as a subprocess, something like the following:
import subprocess
subprocess.Popen(['ifconfig', '-a'])
But this launches a separate process; how can I run it in a thread instead?
I know "thread.start_new_thread", while this should call a function. Within the function, I still have to use subprocess. And this just to open a process again..
A command like ifconfig always runs in a separate process. There is no way to run that command within only a "thread" of your application.
Perhaps you could provide more detail about why you believe this is necessary, and we may be able to suggest a different approach. For example, if you need to capture the output of the ifconfig command, there are certainly ways of doing that within Python.
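For instance, a minimal sketch of capturing that output:

import subprocess

# run ifconfig and read its output in the parent process
output = subprocess.check_output(['ifconfig', '-a'], universal_newlines=True)
for line in output.splitlines():
    print(line)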
Since you are calling another program outside of your Python application, I think there is no way to make it run inside the Python interpreter.