Python subprocess stdout truncated by env variable $COLUMNS

I am calling a bash script in python (3.4.3) using subprocess:
import subprocess as sp
res = sp.check_output("myscript", shell=True)
and myscript contains a line:
ps -ef | egrep somecommand
It was not giving the same result as when myscript is directly called in a bash shell window. After much tinkering, I realized that when myscript is called in python, the stdout of "ps -ef" was truncated by the current $COLUMNS value of the shell window before being piped to "egrep". To me, this is crazy as simply by resizing the shell window, the command can give different results!
I managed to "solve" the problem by passing env argument to the subprocess call to specify a wide enough COLUMNS:
res = sp.check_output("myscript", shell=True, env={'COLUMNS':'100'})
However, this looks very dirty to me, and I don't understand why the truncation happens only in a Python subprocess and not in a bash shell. Frankly, I'm amazed that this behavior isn't documented in the official Python docs, unless it's in fact a bug (I am using Python 3.4.3). What is the proper way of avoiding this strange behavior?

You should use -ww; from man ps:
-w
Wide output. Use this option twice for unlimited width.
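For example, a minimal sketch assuming myscript can be edited: change the pipeline inside myscript to

ps -efww | egrep somecommand

and the Python side no longer needs the COLUMNS workaround:

import subprocess as sp
res = sp.check_output("myscript", shell=True)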

Related

Executing python module through popen from a python script/shell

I have a python module that is executed by the following command:
python3 -m moduleName args
I am trying to execute it from a script using subprocess.Popen:
subprocess.Popen(command, shell=True, text=True, stdout=subprocess.PIPE)
Based on the subprocess documentation, passing a sequence is recommended over a string. So when I try to pass the command below as the argument,
command = ['python3','-m','moduleName','args']
I end up getting a Python shell instead of the module being executed. If I pass it as a string, things work as expected. I'm not able to find documentation or references for this.
Can someone please shed some light on this behavior?
What would be the best way to make this work?
Thanks!
This behavior is caused by the shell=True option. When Popen runs in shell mode (under POSIX), the command is appended to the shell command after a "-c" option (subprocess.py, Python 3.9):
args = [unix_shell, "-c"] + args
When the list of arguments is expanded, the first argument after '-c' (in your case, 'python3') is treated as the parameter to '-c', i.e. the command string. The remaining elements are not passed to python3 at all; they become the shell's own positional parameters ($0, $1, ...). The shell therefore runs a bare 'python3' with no arguments, which is why you get an interactive Python prompt.
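You can see the same effect with a harmless stand-in (a minimal sketch; echo plays the role of python3):

import subprocess

# Effectively runs ['/bin/sh', '-c', 'echo', 'hello']: the shell executes
# the bare command 'echo', while 'hello' merely becomes the shell's $0,
# so an empty line is printed instead of 'hello'.
subprocess.Popen(['echo', 'hello'], shell=True).wait()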
The solution is to either
pass the command as a single string, as you did, or
do not set the shell option for Popen, which is a good idea anyway, as it is lighter on resources and avoids pitfalls like the one you encountered. (Both variants are sketched below.)
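A minimal sketch of both working variants, using the names from the question:

import subprocess

# Variant 1: a single string, parsed by the shell
proc = subprocess.Popen("python3 -m moduleName args",
                        shell=True, text=True, stdout=subprocess.PIPE)

# Variant 2 (preferred): a sequence, no shell involved
proc = subprocess.Popen(["python3", "-m", "moduleName", "args"],
                        text=True, stdout=subprocess.PIPE)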

Running a GNU parallel command using python subprocess

Here is a simple GNU parallel command that creates a file called "example_i.txt" inside an existing directory called "example_i". It does this four times, for i from 1 to 4, with one job per core:
parallel -j 4 'cd example_{} && touch example_{}.txt' ::: {1..4}
Not very exciting, I know. The problem appears when I try to run this via python (v3.9) using the subprocess module as follows:
import subprocess
cmd = "parallel -j 4 'cd example_{} && touch example_{}.txt' ::: {1..4}"
subprocess.run(cmd, shell=True)
When doing so I get this error:
/bin/sh: 1: cd: can't cd to example_{1..4}
It looks like, when run through the Python subprocess call, the command is not being handled by bash as a GNU parallel command: the {1..4} is passed through literally rather than being expanded into four separate jobs.
I also tried this with the less advisable os.system(cmd) syntax and got back the same error.
PS: For context, this question stems from me trying to use UQpy (the RunModel module in particular) for uncertainty quantification of a Fortran code that was handed to me. Although this is not directly related to the question, it is relevant because I would like to know how to get this working using these tools as I am not at liberty to change them.
Following @Mark Setchell's comment, it appears that bash is indeed not used by default on POSIX (the default is /bin/sh), as can be seen in the documentation for subprocess. This is solved by explicitly telling subprocess to use bash by rewriting my Python code snippet as:
import subprocess
cmd = "parallel -j 4 'cd example_{} && touch example_{}.txt' ::: {1..4}"
subprocess.run(cmd, shell=True, executable='/bin/bash')
It should be noted that although the executable argument is used here in the subprocess.run() call, it is not directly a parameter of that function. The executable argument actually belongs to the subprocess.Popen() constructor, but it is passed through to it by subprocess.run() via **other_popen_kwargs.
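Alternatively, since {1..4} is bash brace expansion rather than GNU parallel syntax, a sketch that sidesteps the bash requirement is to spell the arguments out, so the default /bin/sh works as-is:

import subprocess

# no brace expansion needed, so the default /bin/sh suffices
cmd = "parallel -j 4 'cd example_{} && touch example_{}.txt' ::: 1 2 3 4"
subprocess.run(cmd, shell=True)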

Launching subprocesses on resource limited machine

Edit:
The original intent of this question was to find a way to launch an interactive ssh session via a Python script. I'd tried subprocess.call() before and had gotten a Killed response before anything was output onto the terminal. I just assumed this was an issue/limitation with the subprocess module instead of an issue somewhere else. This was found not to be the case when I ran the script on a non-resource-limited machine and it worked fine.
This then turned the question into: How can I run an interactive ssh session with whatever resource limitations were preventing it from running?
Shoutout to Charles Duffy, who was a huge help in diagnosing all of this.
Below is the original question:
Background:
So I have a script that is currently written in bash. It parses the output of a few console functions and then opens up an ssh session based on those parsed outputs.
It currently works fine, but I'd like to expand its capabilities a bit by adding some flag arguments to it. I've worked with argparse before and thoroughly enjoyed it. I tried to do some flag work in bash, and let's just say it leaves much to be desired.
The Actual Question:
Is it possible to have Python do stuff in a console and then put the user in that console?
Something like using subprocess to run a series of commands in the currently viewed console? This is in contrast to how subprocess normally runs, where it runs commands and then shuts the intermediate console down.
Specific Example because I'm not sure if what I'm describing makes sense:
So here's a basic run down of the functionality I was wanting:
Run a python script
Have that script run some console command and parse the output
Run the following command:
ssh -t $correctnode "cd /local_scratch/pbs.$jobid; bash -l"
This command will ssh to the $correctnode, change directory, and then leave a bash window in that node open.
I already know how to do parts 1 and 2. It's part three that I can't figure out. Any help would be appreciated.
Edit: Unlike this question, I am not simply trying to run a command. I'm trying to display a shell that is created by a command. Specifically, I want to display a bash shell created through an ssh command.
Context For Readers
The OP is operating on a very resource-constrained (particularly, it appears, process-constrained) jumphost box, where starting an ssh process as a subprocess of python goes over a relevant limit (on number of processes, perhaps?)
Approach A: Replacing The Python Interpreter With Your Interactive Process
Using the exec*() family of system calls causes your original process to no longer be in memory (unlike the fork()+exec*() combination used to start a subprocess while leaving the parent process running), so it doesn't count against the account's limits.
import argparse
import os

try:
    from shlex import quote  # Python 3
except ImportError:
    from pipes import quote  # Python 2

parser = argparse.ArgumentParser()
parser.add_argument('node')
parser.add_argument('jobid')
args = parser.parse_args()

remote_cmd_str = 'cd /local_scratch/pbs.%s && exec bash -i' % (quote(args.jobid))
local_cmd = [
    '/usr/bin/env', 'ssh', '-tt', args.node, remote_cmd_str,
]
os.execv("/usr/bin/env", local_cmd)
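Saved as, say, launch_ssh.py (a hypothetical name), this would be invoked as

python launch_ssh.py "$correctnode" "$jobid"

and since exec*() replaces the interpreter in place, only the ssh process ever counts against the account's limits.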
Approach B: Generating Shell Commands From Python
If we use Python to generate a shell command, the shell can invoke that command only after the Python process has exited, so we stay under our externally enforced process limit.
First, a slightly more robust approach at generating eval-able output:
import argparse

try:
    from shlex import quote  # Python 3
except ImportError:
    from pipes import quote  # Python 2

parser = argparse.ArgumentParser()
parser.add_argument('node')
parser.add_argument('jobid')
args = parser.parse_args()

remoteCmd = ['cd', '/local_scratch/pbs.%s' % (args.jobid)]
remoteCmdStr = ' '.join(quote(x) for x in remoteCmd) + ' && bash -l'
cmd = ['ssh', '-t', args.node, remoteCmdStr]
print(' '.join(quote(x) for x in cmd))
To run this from a shell, if the above is named as genSshCmd:
#!/bin/sh
eval "$(genSshCmd "$@")"
Note that there are two separate layers of quoting here: One for the local shell running eval, and the second for the remote shell started by SSH. This is critical -- you don't want a jobid of $(rm -rf ~) to actually invoke rm.
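To see what the quoting buys you, here is a minimal illustration with a hostile jobid:

from shlex import quote

jobid = '$(rm -rf ~)'
print('cd /local_scratch/pbs.%s && bash -l' % quote(jobid))
# prints: cd /local_scratch/pbs.'$(rm -rf ~)' && bash -l
# the single quotes ensure the remote shell never evaluates the substitution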
This is in no way a real answer, just an illustration of my comment.
Let's say you have a Python script, test.py:
import argparse
if __name__ == '__main__':
    parser = argparse.ArgumentParser()
    parser.add_argument('myarg', nargs="*")
    args = parser.parse_args()
    print("echo Hello world! My arguments are: " + " ".join(args.myarg))
So, you create a bash wrapper around it, test.sh
set -e
$(python test.py $*)
and this is what you get:
$ bash test.sh
Hello world! My arguments are:
$ bash test.sh one two
Hello world! My arguments are: one two
What is going on here:
the Python script does not execute commands; instead, it outputs the commands the bash script will run (echo in this example). In your case, the last command will be the ssh one
bash executes the output of the python script (the $(...) part), passing on all its arguments (the $* part)
you can use argparse inside the Python script; if anything is wrong with the arguments, the message will be written to stderr and will not be executed by bash; the bash script will stop because of the set -e flag

Python, subprocess.check_call() and pipes redirection

Why am I getting a list of files when executing this command?
subprocess.check_call("time ls &>/dev/null", shell=True)
If I will paste
time ls &>/dev/null
into the console, I will just get the timings.
OS is Linux Ubuntu.
On Debian-like systems, the default /bin/sh is dash, not bash. Dash does not support the &> redirection shortcut. To get only the subprocess return code, try:
subprocess.check_call("time ls >/dev/null 2>&1", shell=True)
To get subprocess return code and the timing information but not the directory listing, use:
subprocess.check_call("time ls >/dev/null", shell=True)
Minus, of course, the subprocess return code, this is the same behavior that you would see on the dash command prompt.
The Python version is running under sh, but the console version is running in whatever your default shell is, which is probably either bash or dash. (Your sh may actually be a different shell running in POSIX-compliant mode, but that doesn't make any difference.)
Both bash and dash have builtin time functions, but sh doesn't, so you get /usr/bin/time, which is a normal program. The most important difference this makes is that the time builtin is not running as a subprocess with its own independent stdout and stderr.
Also, sh, bash, and dash all have different redirection syntax.
But what you're trying to do seems wrong in the first place, and you're just getting lucky on the console because two mistakes are canceling out.
You want to get rid of the stdout of ls but keep the stderr of time, but that's not what you asked for. You're trying to redirect both stdout and stderr: that's what >& means on any shell that actually supports it.
So why are you still getting the time stderr? Either (a) your default shell doesn't support >&, or (b) you're using the builtin instead of the program, and you're not redirecting the stderr of the shell itself, or maybe (c) both of the above.
If you really want to do exactly the same thing in Python, with the exact same bugs canceling out in the exact same way, you can run your default shell manually instead of using shell=True. Depending on which reason it was working, that would be either this:
subprocess.check_call([os.environ['SHELL'], '-c', 'time ls &> /dev/null'])
or this:
subprocess.check_call('{} -c "time ls &> /dev/null"'.format(os.environ['SHELL']), shell=True)
But really, why are you doing this at all? If you want to redirect stdout and not stderr, write that:
subprocess.check_call('time ls > /dev/null', shell=True)
Or, better yet, why are you even using the shell in the first place?
subprocess.check_call(['time', 'ls'], stdout=subprocess.DEVNULL)

pexpect need to use pipe without starting the bash shell since the bash shell does not understand the command

I want to do something like this using pexpect:
import pexpect

child = pexpect.spawn('/bin/bash -c "sysinfo -v | grep SCM"')
fout = open('/home/kiva/release_file.txt', 'w+')
child.logfile = fout
The problem is that I want the output of that command in a text file, but I have to start a shell since we cannot use a pipe in spawn(). The bash shell does not understand sysinfo -v and complains about it.
Do you guys have any idea or know of a way in which I can get the desired output into the file without opening the bash terminal? I can solve the issue by just using the spawn() method without grepping it but I want the exact match and hence grep is necessary.
Thank you
From your short example I do not see why you specifically need pexpect to achieve this. I would go for the Popen way. Here's a link that might prove useful:
Replacing shell pipeline - Popen
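For reference, a minimal sketch of that Popen pipeline (assuming a sysinfo binary is on the PATH; the file path is taken from the question):

import subprocess

p1 = subprocess.Popen(['sysinfo', '-v'], stdout=subprocess.PIPE)
with open('/home/kiva/release_file.txt', 'w+') as fout:
    p2 = subprocess.Popen(['grep', 'SCM'], stdin=p1.stdout, stdout=fout)
    p1.stdout.close()  # let sysinfo see SIGPIPE if grep exits early
    p2.wait()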
