Execute df | grep -w "/" not parsing output correctly - python

I am trying to run the shell command df -h | grep -w "/" using Python to watch the root partition usage, and I wanted to avoid the shell=True option for security.
The code I tried is as follows:
import subprocess
p1 = subprocess.Popen(['df', '-h'], stdout=subprocess.PIPE)
p2 = subprocess.Popen(['grep', '-w', '"/"'], stdin=p1.stdout, stdout=subprocess.PIPE)
output = p2.communicate()[0]
print(output)
The output I get is:
$ ./subprocess_df_check.py
b''
Expected output is:
$ df -h | grep -w "/"
/dev/sdd 251G 4.9G 234G 3% /

The immediate problem is the unnecessary quotes being added.
p2 = subprocess.Popen(['grep', '-w', '"/"'], stdin=p1.stdout, stdout=subprocess.PIPE)
is not equivalent to the shell command grep -w "/". Instead, it's equivalent to the shell command grep -w '"/"' (or grep -w \"/\", or any other way of writing an argument vector that passes literal double-quote characters in the final element of grep's argument vector), and it is wrong for the same reason: grep then searches for a pattern that includes the quote characters themselves, which never appear in df's output.
Use '/', not '"/"'.
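One way to see the difference is to let shlex tokenize the two spellings the way a POSIX shell would (shlex is used here purely for illustration):
import shlex

print(shlex.split('grep -w "/"'))      # ['grep', '-w', '/']   -- the shell consumes the quotes
print(shlex.split('grep -w \'"/"\''))  # ['grep', '-w', '"/"'] -- literal quotes reach grep
The first vector is what the shell pipeline actually runs; the second is what the original code built by hand.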

Don't use subprocess with df and/or grep. If you are already using Python, you can use the os.statvfs function instead:
import os
import time

path = "/"
while True:
    info = os.statvfs(path)
    print("Block size [%d] Free blocks [%d] Free inodes [%d]"
          % (info.f_bsize, info.f_bfree, info.f_ffree))
    time.sleep(15)
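If you specifically want the usage percentage the way df reports it, it can be derived from the same statvfs numbers. A minimal sketch (df's exact rounding may differ slightly):
import os

def root_usage_percent(path="/"):
    st = os.statvfs(path)
    used = st.f_blocks - st.f_bfree   # blocks in use
    avail = st.f_bavail               # blocks available to non-root users
    return 100.0 * used / (used + avail)  # roughly df's Use% column

print("Root partition is at %.0f%% usage now" % root_usage_percent())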

Running grep in a separate subprocess is certainly unnecessary. If you are using Python, you already have excellent tools for examining strings.
import subprocess

df = subprocess.run(['df', '-h'],
                    capture_output=True, text=True, check=True)
for line in df.stdout.split('\n')[1:]:
    # Whole-word match on the fields, like grep -w "/"
    if '/' in line.split():
        print(line)
Notice also how you basically always want to prefer subprocess.run over Popen when you can, and how you want text=True to get text rather than bytes. Usually you also want check=True to ensure that the subprocess completed successfully.
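As a quick illustration of what check=True buys you, here is a sketch of the failure path (this error handling is one option, not the only one):
import subprocess

try:
    df = subprocess.run(['df', '-h'],
                        capture_output=True, text=True, check=True)
except subprocess.CalledProcessError as exc:
    # check=True raises instead of silently returning an empty result
    print('df failed with status', exc.returncode, exc.stderr)
else:
    print(df.stdout)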

OK, figured out the whole thing.
import subprocess

p1 = subprocess.Popen(['df', '-h'], stdout=subprocess.PIPE)
p2 = subprocess.Popen(['grep', '-w', '/'], stdin=p1.stdout, stdout=subprocess.PIPE)
output = p2.communicate()[0].split()[4]
print("Root partition is of", output.decode(), "usage now")
Removed the unnecessary double quotes, i.e. changed subprocess.Popen(['grep', '-w', '"/"']) to subprocess.Popen(['grep', '-w', '/']). The double quotes are for the shell, not for grep. When you have no shell, you need no shell syntax.
In output = p2.communicate()[0].split()[4], the [0] picks only stdout, not stderr, which is None if there is no error. Then split()[4] cuts out the fifth whitespace-separated column (index 4), which is the disk usage percentage from df's output.
In output.decode(), the decode() converts the bytes object to a str, which avoids the b prefix being printed in front of the result.
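A quick illustration of the bytes-versus-str difference:
>>> raw = b'3%'
>>> print(raw)
b'3%'
>>> print(raw.decode())
3%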
So the output of the script is:
$ ./subprocess_df_check.py
Root partition is of 3% usage now

subprocess.call() throws error "FileNotFoundError: [Errno 2] No such file or directory" when redirecting stdout to file

I want to redirect the console output to a textfile for further inspection.
The task is to extract TIFF-TAGs from a raster file (TIFF) and filter the results.
In order to achieve this, I have several tools at hand. Some of them are not Python libraries, but command-line tools, such as "identify" of ImageMagick.
My example command-string passed to subprocess.check_call() was:
cmd_str = 'identify -verbose /home/andylu/Desktop/Models_Master/AERSURFACE/Input/Images/Denia_CORINE_CODE_18_reclass_NLCD92_reproj_ADAPTED_Europe_AEA.tif | grep -i "274"'
Here, among the TIFF tags produced by "identify", all lines containing information about tag number "274" shall either be displayed in the console or written to a file.
Error-type 1: Displaying in the console
subprocess.check_call(bash_str, shell=True)
subprocess.CalledProcessError: Command 'identify -verbose /home/andylu/Desktop/Models_Master/AERSURFACE/Input/Images/Denia_CORINE_CODE_18_reclass_NLCD92_reproj_ADAPTED_Europe_AEA.tif | grep -i "274"' returned non-zero exit status 1.
Error-type 2: Redirecting the output to textfile
subprocess.call(bash_str, stdout=filehandle_dummy, stderr=filehandle_dummy)
FileNotFoundError: [Errno 2] No such file or directory: 'identify -verbose /home/andylu/Desktop/Models_Master/AERSURFACE/Input/Images/Denia_CORINE_CODE_18_reclass_NLCD92_reproj_ADAPTED_Europe_AEA.tif | grep -i "274"': 'identify -verbose /home/andylu/Desktop/Models_Master/AERSURFACE/Input/Images/Denia_CORINE_CODE_18_reclass_NLCD92_reproj_ADAPTED_Europe_AEA.tif | grep -i "274"'
CODE
These subprocess.check_call() functions were executed by the following convenience function:
def subprocess_stdout_to_console_or_file(bash_str, filehandle=None):
    """Function documentation:\n
    Convenience tool which either prints out directly in the provided shell, i.e. console,
    or redirects the output to a given file.
    NOTE on file redirection: it must not be the filepath, but the FILEHANDLE,
    which can be achieved via the open(filepath, "w") function, e.g. like so:
    filehandle = open('out.txt', 'w')
    print(filehandle): <_io.TextIOWrapper name='bla_dummy.txt' mode='w' encoding='UTF-8'>
    """
    # Check whether a filehandle has been passed or not
    if filehandle is None:
        # i) If not, just direct the output to the BASH (shell), i.e. the console
        subprocess.check_call(bash_str, shell=True)
    else:
        # ii) Otherwise, write to the provided file via its filehandle
        subprocess.check_call(bash_str, stdout=filehandle)
The code piece where everything takes place is already redirecting the output of print() to a textfile. The aforementioned function is called within the function print_out_all_TIFF_Tags_n_filter_for_desired_TAGs().
As the subprocess-outputs are not redirected automatically along with the print()-outputs, it is necessary to pass the filehandle to the subprocess.check_call(bash_str, stdout=filehandle) via its keyword-argument stdout.
Nevertheless, the above-mentioned error would also happen outside this redirection zone of stdout created by contextlib.redirect_stdout().
dummy_filename = "/home/andylu/bla_dummy.txt"  # will be saved temporarily in the user's home folder

# NOTE on scope: redirect sys.stdout for python 3.4x according to the following website:
# https://stackoverflow.com/questions/14197009/how-can-i-redirect-print-output-of-a-function-in-python
with open(dummy_filename, 'w') as f:
    with contextlib.redirect_stdout(f):
        print_out_all_TIFF_Tags_n_filter_for_desired_TAGs(TIFF_filepath)
EDIT:
For more security, the piping process should be split up as mentioned in the following, but this didn't really work out for me.
If you have an explanation for why a split-up piping process like
p1 = subprocess.Popen(['gdalinfo', 'TIFF_filepath'], stdout=PIPE)
p2 = subprocess.Popen(['grep', "'Pixel Size =' > 'path_to_textfile'"], stdin=p1.stdout, stdout=PIPE)
p1.stdout.close() # Allow p1 to receive a SIGPIPE if p2 exits.
output = p2.communicate()[0]
doesn't produce the output-textfile while still exiting successfully, I'd be delighted to learn about the reasons.
OS and Python versions
OS:
NAME="Ubuntu"
VERSION="18.04.3 LTS (Bionic Beaver)"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu 18.04.3 LTS"
VERSION_ID="18.04"
Python:
Python 3.7.6 (default, Jan 8 2020, 19:59:22)
[GCC 7.3.0] :: Anaconda, Inc. on linux
As for the initial error mentioned in the question:
The comments answered it: I needed to pass the kwarg shell=True in all calls of subprocess.check_call() if I wanted to pass on a prepared shell-command string like
gdalinfo TIFF_filepath | grep 'Pixel Size =' > path_to_textfile
As a sidenote, I noticed that it doesn't make a difference whether I quote the paths or not. I'm not sure whether it makes a difference using single (') or double (") quotes.
Furthermore, for the sake of security outlined in the comments to my question, I followed the docs about piping safely while avoiding the shell, and consequently changed from my previous standard approach
subprocess.check_call(shell_str, shell=True)
to the (somewhat cumbersome) piping steps delineated hereafter:
p1 = subprocess.Popen(['gdalinfo', 'TIFF_filepath'], stdout=PIPE)
p2 = subprocess.Popen(['grep', "'Pixel Size =' > 'path_to_textfile'"], stdin=p1.stdout, stdout=PIPE)
p1.stdout.close() # Allow p1 to receive a SIGPIPE if p2 exits.
output = p2.communicate()[0]
In order to get these sequences of command strings from the initial entire shell string, I had to write custom string-manipulation functions and play around with them to get strings (like filepaths) quoted while avoiding quoting other functional parameters, flags etc. (like -i, >, ...).
This quite complex approach was necessary since the shlex.split() function just split my shell-command strings at every whitespace character, which led to problems when recombining the pieces in the pipes.
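Note that shlex.split() does keep quoted substrings together; what it cannot do is give operators like | or > their shell meaning, since only a shell interprets those:
>>> import shlex
>>> shlex.split("gdalinfo 'my file.tif' | grep 'Pixel Size =' > out.txt")
['gdalinfo', 'my file.tif', '|', 'grep', 'Pixel Size =', '>', 'out.txt']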
Yet in spite of all these apparent improvements, no output textfile is generated, although the process seemingly doesn't produce any errors and finishes "correctly" after the last line of the piping process:
output = p2.communicate()[0]
As a consequence, I'm still forced to use the old and insecure, but at least well-working, approach via the shell:
subprocess.check_call(shell_str, shell=True)
At least it works now employing this former approach, even though I didn't manage to implement the more secure piping procedure where several commands can be glued/piped together.
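For reference, a minimal sketch of how the split-up pipeline could produce the output file (TIFF_filepath and path_to_textfile are placeholders here): the > redirection is shell syntax, so without a shell the target file has to be opened in Python and passed as stdout, and grep has to receive only its pattern as a single argument:
import subprocess

with open('path_to_textfile', 'w') as out:
    p1 = subprocess.Popen(['gdalinfo', 'TIFF_filepath'], stdout=subprocess.PIPE)
    p2 = subprocess.Popen(['grep', 'Pixel Size ='], stdin=p1.stdout, stdout=out)
    p1.stdout.close()  # Allow p1 to receive a SIGPIPE if p2 exits.
    p2.wait()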
I once ran into a similar issue, and this fixed it:
cmd_str.split(' ')
My code:
# >>>>>>>>>>>>>>>>>>>>>>> UNZIP THE FILE AND RETURN THE FILE ARGUMENTS <<<<<<<<<<<<<<<<<<<<<<<<<<<<
# NOTE: assumes sys, Regex (the re module), Tlog and outputDir are defined elsewhere in the script
def unzipFile(zipFile_):
    # INITIALIZE THE UNZIP COMMAND HERE
    cmd = "unzip -o " + zipFile_ + " -d " + outputDir
    Tlog("UNZIPPING FILE " + zipFile_)
    # GET THE PROCESS OUTPUT AND PIPE IT TO VARIABLE
    log = subprocess.Popen(cmd.split(' '), stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    # GET BOTH THE ERROR LOG AND OUTPUT LOG FOR IT
    stdout, stderr = log.communicate()
    # DECODE THE OUTPUT FROM BYTES TO 'UTF-8' STRINGS
    stdout = stdout.decode('utf-8')
    stderr = stderr.decode('utf-8')
    if stderr != "":
        Tlog("ERROR WHILE UNZIPPING FILE \n\n\t" + stderr + '\n')
        sys.exit(0)
    # INITIALIZE THE TOTAL UNZIPPED ITEMS
    unzipped_items = []
    # PARSE THE STDOUT LINE BY LINE
    for line in stdout.split('\n'):
        # CHECK IF THE LINE CONTAINS KEYWORD 'inflating'
        if Regex.search(r"inflating", line) is not None:
            # FIND ALL THE MATCHED STRINGS WITH REGEX
            Matched = Regex.findall(r"inflating: " + outputDir + "(.*)", line)[0]
            # STRIP LEADING/TRAILING WHITESPACE
            Matched = Regex.sub(r'^\s+|\s+$', '', Matched)
            # APPEND THE OUTPUTS TO LIST
            unzipped_items.append(outputDir + Matched)
    # RETURN THE OUTPUT
    return unzipped_items

python subprocess sends backslash before a quote

I have a string, which is a framed command that should be executed in the command line:
cmdToExecute = "TRAPTOOL -a STRING "ABC" -o STRING 'XYZ'"
I am considering the string to have the entire command that should be triggered from the command prompt. If you take a closer look at the string cmdToExecute, you can see the option o with value XYZ enclosed in SINGLE QUOTES. There is a reason that this needs to be given in single quotes, or else my tool TRAPTOOL will not be able to process the command.
I am using subprocess.Popen to execute the entire command. Before executing my command in a shell, I am printing the content
print "Cmd to be exectued: %r" % cmdToExecute
myProcess = subprocess.Popen(cmdToExecute, stdout=subprocess.PIPE, stderr=subprocess.PIPE, shell=False)
(stdOut, stdErr) = myProcess.communicate()
The output of the above command is,
Cmd to be executed: TRAPTOOL -a STRING "ABC" -o \'XYZ\'.
You can see that the output shows a BACKWARD SLASH added automatically while printing. Actually, the \ is not there in the string, which I tested using a regex. But when the script is run on my box, TRAPTOOL truncates part of the string XYZ on the receiving server. I manually copy-pasted the printed output and tried sending it, and I saw the same error on the receiving server. However, when I removed the backward slash, it sent the trap without any truncation.
Can anyone say why this happens?
Is there any way we can see what command is actually executed by subprocess.Popen?
Is there any other way I can execute my command, other than subprocess.Popen, that might solve this problem?
Try using shlex to split your command string:
>>> import shlex
>>> argv = shlex.split("TRAPTOOL -a STRING \"ABC\" -o STRING 'XYZ'")
>>> myProcess = subprocess.Popen(argv, stdout=subprocess.PIPE, stderr=subprocess.PIPE, shell=False)
>>> (stdOut, stdErr) = myProcess.communicate()
The first parameter to the Popen constructor can be an argument list for your shell command or a string, but an argument list might be easier to work with because of all the quotes involved. (See the Python subprocess documentation.)
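For what it's worth, here is what shlex.split produces for the command above. Note that both kinds of quotes are consumed during tokenization, exactly as a POSIX shell would consume them, so if TRAPTOOL really must see literal single quotes around XYZ, that list element would have to be written as "'XYZ'":
>>> shlex.split("TRAPTOOL -a STRING \"ABC\" -o STRING 'XYZ'")
['TRAPTOOL', '-a', 'STRING', 'ABC', '-o', 'STRING', 'XYZ']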
If you want to see the commands being written, you could probably do something like:
>>> argv = shlex.split("bash -x -c 'TRAPTOOL -a STRING \"ABC\" -o STRING \'XYZ\''")
This makes bash echo the commands to the shell by means of the -x option.
You asked for the repr representation of the string, not the str representation. Basically, what would you have to type at the Python interactive interpreter to get the same output? That's what %r displays. Change that to %s to see the value as it's actually stored:
print "Cmd to be exectued: %s" % cmdToExecute

Python subprocess argument with quotes [duplicate]

I am trying to run the http://mediaarea.net/en/MediaInfo command-line utility from Python.
It accepts arguments like this.
Simple Usage:
# verbose all info
MediaInfo.exe test.mp4
Template Usage:
# verbose selected info from csv
MediaInfo.exe --inform="file://D:\path\to\csv\template.csv" test.mp4
I am trying to run it with the template argument. I can use the above command successfully from CMD: it works, and I can see my selected output fine in the DOS window.
But when I try to run it from Python, it outputs all info, ignoring the CSV which I give as an argument.
Can anyone explain why? Is it because of the quotes?
NOTE: If the path to the CSV is not correct or the CSV is invalid, MediaInfo outputs all info, which is exactly what is happening here.
# App variable is full path to MediaInfo.exe
# filename variable is full path to media file
proc = subprocess.Popen([App, '--inform="file://D:\path\to\csv\template.csv"', filename],
                        shell=True, stderr=subprocess.PIPE, stdout=subprocess.PIPE)
return_code = proc.wait()
for line in proc.stdout:
    print line
On Windows, you could pass the command as a string, i.e., as is:
from subprocess import check_output
cmd = r'MediaInfo.exe --inform="file://D:\path\to\csv\template.csv" test.mp4'
out = check_output(cmd)
Notice: the raw-string literal r'' is used so that '\t' is interpreted as two characters (backslash and t) rather than as a single tab character.
Unrelated: if you have specified stdout=PIPE, stderr=PIPE, then you should read both streams concurrently, and before p.wait() is called; otherwise a deadlock is possible if the command generates enough output.
If passing the command as a string works, then you could try a list argument:
from subprocess import check_output
from urllib import pathname2url
cmd = [app, '--inform']
cmd += ['file:' + pathname2url(r'D:\path\to\csv\template.csv')]
cmd += [filename]
out = check_output(cmd)
Also, can you write an example for the p.wait() deadlock you mentioned?
It is easy. Just produce large output in the child process:
import sys
from subprocess import Popen, PIPE
#XXX DO NOT USE, IT DEADLOCKS
p = Popen([sys.executable, "-c", "print('.' * (1 << 23))"], stdout=PIPE)
p.wait() # <-- this never returns unless the pipe buffer is larger than (1<<23)
assert 0 # unreachable
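For contrast, a sketch of the non-deadlocking version of the same child process (reusing the imports above): communicate() keeps draining the pipe while it waits, so the child can finish no matter how much it prints:
p = Popen([sys.executable, "-c", "print('.' * (1 << 23))"], stdout=PIPE)
out, _ = p.communicate()  # reads stdout to EOF while waiting, then reaps the child
assert len(out) >= (1 << 23)  # all 8 MiB of dots arrived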
If you print your arguments, you might see what is going wrong:
>>> print '--inform="file://D:\path\to\csv\template.csv"'
--inform="file://D:\path o\csv emplate.csv"
The problem is that \ introduces escape sequences. If you put the r prefix in front of your string (a raw-string literal), these escape sequences are not interpreted:
>>> print r'--inform="file://D:\path\to\csv\template.csv"'
--inform="file://D:\path\to\csv\template.csv"

Passing double quote shell commands in python to subprocess.Popen()?

I've been trying to pass a command that works only with literal double quotes on the command line, around the "concat:file1|file2" argument for ffmpeg.
I can't however make this work from Python with subprocess.Popen(). Anyone have an idea how one passes quotes into subprocess.Popen?
Here is the code:
command = "ffmpeg -i "concat:1.ts|2.ts" -vcodec copy -acodec copy temp.mp4"
output, error = subprocess.Popen(command, universal_newlines=True,
                                 stdout=subprocess.PIPE,
                                 stderr=subprocess.PIPE).communicate()
When I do this, ffmpeg won't take it any way other than with quotes around the concat segment. Is there a way to successfully pass this line to the subprocess.Popen command?
I'd suggest using the list form of invocation rather than the quoted string version:
command = ["ffmpeg", "-i", "concat:1.ts|2.ts", "-vcodec", "copy",
"-acodec", "copy", "temp.mp4"]
output,error = subprocess.Popen(
command, universal_newlines=True,
stdout=subprocess.PIPE, stderr=subprocess.PIPE).communicate()
This more accurately represents the exact set of parameters that are going to be passed to the end process and eliminates the need to mess around with shell quoting.
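If you are starting from the correctly quoted string anyway, shlex.split can derive that list for you; note how the quotes disappear and the concat argument survives as a single element:
>>> import shlex
>>> shlex.split('ffmpeg -i "concat:1.ts|2.ts" -vcodec copy -acodec copy temp.mp4')
['ffmpeg', '-i', 'concat:1.ts|2.ts', '-vcodec', 'copy', '-acodec', 'copy', 'temp.mp4']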
That said, if you absolutely want to use the plain string version, just use different quotes (and shell=True):
command = 'ffmpeg -i "concat:1.ts|2.ts" -vcodec copy -acodec copy temp.mp4'
output,error = subprocess.Popen(
command, universal_newlines=True, shell=True,
stdout=subprocess.PIPE, stderr=subprocess.PIPE).communicate()
Either use single quotes 'around the "whole pattern"' to automatically escape the doubles or explicitly "escape the \"double quotes\"". Your problem has nothing to do with Popen as such.
Just for the record, I had a problem particularly with a list-based command passed to Popen that would not preserve proper double quotes around a glob pattern (i.e. what was suggested in the accepted answer) under Windows. Joining the list into a string with ' '.join(cmd) before passing it to Popen solved the problem.
This works with Python 2.7.3. The way to pipe stderr to stdout has changed since older versions of Python:
Put this in a file called test.py:
#!/usr/bin/python
import subprocess
command = 'php -r "echo gethostname();"'
p = subprocess.Popen(command, universal_newlines=True, shell=True,
                     stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
text = p.stdout.read()
retcode = p.wait()
print text
Invoke it:
python test.py
It prints my hostname, which is apollo:
apollo
Read up on the manual for subprocess: http://docs.python.org/2/library/subprocess.html
I have been working with a similar issue: running a relatively complex command over ssh. It also had multiple double quotes and single quotes, because I was piping the command through Python, ssh, PowerShell, etc.
If you can instead just convert the command into a shell script, and run the
shell script through subprocess.call/Popen/run, these issues will go away.
So, depending on whether you are on Windows, Linux, or Mac, put the
following in a shell script (script.sh or script.bat):
ffmpeg -i "concat:1.ts|2.ts" -vcodec copy -acodec copy temp.mp4
Then you can run
import subprocess; subprocess.call('./script.sh', shell=True)
Without having to worry about single quotes, etc.
This line of code in your question isn't valid Python syntax:
command = "ffmpeg -i "concat:1.ts|2.ts" -vcodec copy -acodec copy temp.mp4"
If you had a Python file with just this line in it, you would get a syntax error. A string literal surrounded with double quotes can't have double quotes in them unless they are escaped with a backslash. So you could fix that line by replacing it with:
command = "ffmpeg -i \"concat:1.ts|2.ts\" -vcodec copy -acodec copy temp.mp4"
Another way to fix this line is to use single quotes for the string literal in Python, that way Python is not confused when the string itself contains a double quote:
command = 'ffmpeg -i "concat:1.ts|2.ts" -vcodec copy -acodec copy temp.mp4'
Once you have fixed the syntax error, you can then tackle the issue with using subprocess, as explained in this answer. I also wrote this answer to explain a helpful mental model for subprocess in general.
I was also struggling with a string argument containing spaces while not wanting to use shell=True.
The solution was to use double quotes for the inside strings.
args = ['salt', '-G', 'environment:DEV', 'grains.setvals', '{"man_version": "man-dev-2.3"}']
try:
    p = subprocess.Popen(args, stdin=subprocess.PIPE,
                         stdout=subprocess.PIPE,
                         stderr=subprocess.PIPE)
    (stdout, stderr) = p.communicate()
except (subprocess.CalledProcessError, OSError) as err:
    exit(1)
if p.returncode != 0:
    print("Failure in returncode of command:")
For anybody suffering from this pain: it also works with params enclosed in quotation marks.
params = ["ls", "-la"]
subprocess.check_output(" ".join(params), shell=True)

Problems capturing Python subprocess output on Mac OS X

I'm running Python 3.3 on Mac OS 10.6.8. I am writing a script that runs several subprocesses, and I want to capture the output of each one and record it in a file. I'm having trouble with this.
I first tried the following:
import subprocess
logFile = open("log.txt", 'w')
proc = subprocess.Popen(args, stdout=logFile, stderr=logFile)
proc.wait()
This produced an empty log.txt. After poking around on the internet for a bit, I tried this instead
import subprocess
proc = subprocess.Popen(args, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
output, err = proc.communicate()
logFile = open("log.txt", 'w')
logFile.write(output)
This, too, produced an empty log.txt. So instead of writing to the file, I tried to just print the output to the command line:
output, err = proc.communicate()
print(output)
print(err)
That produced this:
b''
b''
The process I'm trying to run is fastq_quality_trimmer. It takes an input file, filters it, and saves the result to a new file. It only writes a few lines to stdout, like so
Minimum Quality Threshold: 20
Minimum Length: 20
Input: 750000 reads.
Output: 750000 reads.
discarded 0 (0%) too-short reads.
If I run it from the command line and redirect the output like this
fastq_quality_trimmer -Q 33 -v -t 50 -l 20 -i in.fq -o in_trimmed.fq > log.txt
the output is successfully written to log.txt.
I thought perhaps that fastq_quality_trimmer was somehow failing to run when I called it with Popen, but my script produces a filtered file that is identical to the one produced when I run fastq_quality_trimmer from the command line. So it's working; I just can't capture the output. To make matters more confusing, I can successfully capture the output of other processes (echo, other Python scripts) using code that is essentially identical to what I've posted.
Any thoughts? Am I missing something blindingly obvious?
You forgot a comma:
["fastq_quality_trimmer", "-Q", "33" "-v", "-t", "50", "-l", "20", "-i", leftInitial, "-o", leftTrimmed]
add it between "33" and "-v".
You are essentially passing in the arguments -Q 33-v instead of -Q 33 -v.
Python will concatenate two adjacent strings if there is only whitespace between them:
>>> "33", "-v"
('33', '-v')
>>> "33" "-v"
'33-v'
Since -v is the verbose switch that is required to make fastq_quality_trimmer produce output at all, it'll remain silent with it missing.
Whenever you encounter problems with calling a subprocess, triple-check the command line created. Prepending args with ['echo'] can help with that:
proc = subprocess.Popen(['echo'] + args, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
output, err = proc.communicate()
print(output)
