Python - Getting rid of unnecessary output

I am logging into a remote node using SSH, getting the status of a service and want to print it.
Running the bash command on my remote node yields:
[root@redis-1 ~]# redis-cli -a '!t3bmjEJss' info replication | grep role | cut -d':' -f2
slave
The Python code that I've written is:
def serviceDetails(ip, svc):
    if svc == 'redis-server':
        ssh = subprocess.Popen(["ssh", "%s" % ip, "redis-cli -a '!t3Z9LJt2_wmUDbmjEJss' info replication | grep role | cut -d':' -f2"], shell=False, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
        result = ssh.stdout.readlines()
        print(result)
    else:
        print("Redis service is not running on this node")
The output that I am getting from the result variable is:
[b'slave\r\n']
Why do all these extra characters appear, and how can I get rid of them?

The entire process of calling subprocess.Popen and then manually reading from its stdout attribute can be condensed into one call, which also automatically performs the bytes-to-string conversion:
subprocess.check_output([arg0, arg1, ...], encoding='utf-8')
If you also want to capture stderr, include stderr=subprocess.STDOUT in the call.
You can find the docs for subprocess.check_output in the Python subprocess documentation.
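Applied to the question's ssh call, a minimal sketch might look like this (ip here is the variable from the question's function):

import subprocess

# Sketch only: condenses the Popen/readlines pair from the question into one
# check_output() call; encoding='utf-8' makes it return str instead of bytes,
# and .strip() drops the trailing '\r\n'.
result = subprocess.check_output(
    ["ssh", ip, "redis-cli -a '!t3Z9LJt2_wmUDbmjEJss' info replication"
                " | grep role | cut -d':' -f2"],
    stderr=subprocess.STDOUT, encoding='utf-8').strip()
print(result)  # slave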

When you use .readlines(), it will return a list of lines. You can use .read() if you want it all in one string. It has the b there because it is a byte string. To get it to a normal string, you can use .decode('utf-8') in most cases. It may be a different encoding, but utf-8 will probably work. Then to get rid of the new line, you can use .strip(). Putting it all together, either of these would work:
result = ssh.stdout.read().decode('utf-8').strip()
print(result)
# slave
or
result = [line.decode('utf-8').strip() for line in ssh.stdout.readlines()]
print(result)
# ['slave']
Either one will work when you have only one line. If you have more than one line, the first will not work properly; it will have \r\n in the middle of the string.
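For the multi-line case, a splitlines()-based variant of the first form keeps each line clean (the second output line below is made up, purely to illustrate the shape):

result = ssh.stdout.read().decode('utf-8').splitlines()
print(result)
# e.g. ['slave', 'master_host:10.0.0.5']  # second entry is hypothetical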

Related

subprocess.call with command having embedded spaces and quotes

I would like to retrieve output from a shell command that contains spaces and quotes. It looks like this:
import subprocess
cmd = "docker logs nc1 2>&1 |grep mortality| awk '{print $1}'|sort|uniq"
subprocess.check_output(cmd)
This fails with "No such file or directory". What is the best/easiest way to pass commands such as these to subprocess?
The best solution here is to refactor the code to replace the entire tail of the pipeline with native Python code:
import subprocess
from collections import Counter

s = subprocess.run(
    ["docker", "logs", "nc1"],
    text=True, capture_output=True, check=True)
count = Counter()
for line in s.stdout.splitlines():
    if "mortality" in line:
        count[line.split()[0]] += 1
for word, freq in count.most_common():
    print(freq, word)
There are minor differences in how Counter objects resolve ties (if two words have the same count, the one which was seen first is returned first, rather than by sort order), but I'm guessing that's unimportant here.
I am also ignoring standard error from the subprocess; if you genuinely want to include output from error messages too, just include s.stderr in the loop driver as well.
However, my hunch is that you didn't realize your code was doing that, which drives home the point nicely: mixing shell script and Python raises the maintainability burden, because now you have to understand both shell script and Python to understand the code.
(And in terms of shell script style, I would definitely get rid of the useless grep by refactoring it into the Awk script, and probably also fold in the sort | uniq which has a trivial and more efficient replacement in Awk. But here, we are replacing all of that with Python code anyway.)
If you really want to stick to a pipeline, then you need to add shell=True to use shell features like redirection, pipes, and quoting. Without shell=True, Python looks for a command whose file name is the entire string you passed in, which of course doesn't exist.
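For completeness, a sketch of the shell=True variant with the question's pipeline kept intact:

import subprocess

# shell=True hands the whole string to /bin/sh, which performs the pipes,
# quoting and the 2>&1 redirection; the command string is the one from the question.
output = subprocess.check_output(
    "docker logs nc1 2>&1 | grep mortality | awk '{print $1}' | sort | uniq",
    shell=True, text=True)
print(output)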

subprocess.call() throws error "FileNotFoundError: [Errno 2] No such file or directory" when redirecting stdout to file

I want to redirect the console output to a textfile for further inspection.
The task is to extract TIFF-TAGs from a raster file (TIFF) and filter the results.
In order to achieve this, I have several tools at hand. Some of them are not python libraries, but command-line tools, such as "identify" of ImageMagick.
My example command-string passed to subprocess.check_call() was:
cmd_str = 'identify -verbose /home/andylu/Desktop/Models_Master/AERSURFACE/Input/Images/Denia_CORINE_CODE_18_reclass_NLCD92_reproj_ADAPTED_Europe_AEA.tif | grep -i "274"'
Here, in the output of the TIFF-TAGs produced by "identify", all lines which contain information about the TAG number "274" shall either be displayed in the console or written to a file.
Error-type 1: Displaying in the console
subprocess.check_call(bash_str, shell=True)
subprocess.CalledProcessError: Command 'identify -verbose /home/andylu/Desktop/Models_Master/AERSURFACE/Input/Images/Denia_CORINE_CODE_18_reclass_NLCD92_reproj_ADAPTED_Europe_AEA.tif | grep -i "274"' returned non-zero exit status 1.
Error-type 2: Redirecting the output to textfile
subprocess.call(bash_str, stdout=filehandle_dummy, stderr=filehandle_dummy)
FileNotFoundError: [Errno 2] No such file or directory: 'identify -verbose /home/andylu/Desktop/Models_Master/AERSURFACE/Input/Images/Denia_CORINE_CODE_18_reclass_NLCD92_reproj_ADAPTED_Europe_AEA.tif | grep -i "274"': 'identify -verbose /home/andylu/Desktop/Models_Master/AERSURFACE/Input/Images/Denia_CORINE_CODE_18_reclass_NLCD92_reproj_ADAPTED_Europe_AEA.tif | grep -i "274"'
CODE
These subprocess.check_call() functions were executed by the following convenience function:
def subprocess_stdout_to_console_or_file(bash_str, filehandle=None):
    """Function documentation:\n
    Convenience tool which either prints out directly in the provided shell, i.e. console,
    or redirects the output to a given file.
    NOTE on file redirection: it must not be the filepath, but the FILEHANDLE,
    which can be achieved via the open(filepath, "w")-function, e.g. like so:
    filehandle = open('out.txt', 'w')
    print(filehandle): <_io.TextIOWrapper name='bla_dummy.txt' mode='w' encoding='UTF-8'>
    """
    # Check whether a filehandle has been passed or not
    if filehandle is None:
        # i) If not, just direct the output to the BASH (shell), i.e. the console
        subprocess.check_call(bash_str, shell=True)
    else:
        # ii) Otherwise, write to the provided file via its filehandle
        subprocess.check_call(bash_str, stdout=filehandle)
The code piece where everything takes place is already redirecting the output of print() to a textfile. The aforementioned function is called within the function print_out_all_TIFF_Tags_n_filter_for_desired_TAGs().
As the subprocess-outputs are not redirected automatically along with the print()-outputs, it is necessary to pass the filehandle to the subprocess.check_call(bash_str, stdout=filehandle) via its keyword-argument stdout.
Nevertheless, the above-mentioned error would also happen outside this redirection zone of stdout created by contextlib.redirect_stdout().
dummy_filename = "/home/andylu/bla_dummy.txt"  # will be saved temporarily in the user's home folder

# NOTE on scope: redirect sys.stdout for Python 3.4.x according to the following website:
# https://stackoverflow.com/questions/14197009/how-can-i-redirect-print-output-of-a-function-in-python
with open(dummy_filename, 'w') as f:
    with contextlib.redirect_stdout(f):
        print_out_all_TIFF_Tags_n_filter_for_desired_TAGs(TIFF_filepath)
EDIT:
For more security, the piping-process should be split up as mentioned in the following, but this didn't really work out for me.
If you have an explanation for why a split-up piping process like
p1 = subprocess.Popen(['gdalinfo', 'TIFF_filepath'], stdout=PIPE)
p2 = subprocess.Popen(['grep', "'Pixel Size =' > 'path_to_textfile'"], stdin=p1.stdout, stdout=PIPE)
p1.stdout.close() # Allow p1 to receive a SIGPIPE if p2 exits.
output = p2.communicate()[0]
doesn't produce the output-textfile while still exiting successfully, I'd be delighted to learn about the reasons.
OS and Python versions
OS:
NAME="Ubuntu"
VERSION="18.04.3 LTS (Bionic Beaver)"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu 18.04.3 LTS"
VERSION_ID="18.04"
Python:
Python 3.7.6 (default, Jan 8 2020, 19:59:22)
[GCC 7.3.0] :: Anaconda, Inc. on linux
As for the initial error mentioned in the question:
The comments answered it: I needed to add the kwarg shell=True to every call of subprocess.check_call() to which I pass a prepared shell-command string like
gdalinfo TIFF_filepath | grep 'Pixel Size =' > path_to_textfile
As a side note, I noticed that it doesn't make a difference whether I quote the paths or not. I'm not sure whether it makes a difference using single (') or double (") quotes.
Furthermore, for the sake of the security concerns outlined in the comments to my question, I followed the docs about replacing shell pipelines safely, avoiding the shell, and consequently changed from my previous standard approach
subprocess.check_call(shell_str, shell=True)
to the (somewhat cumbersome) piping steps delineated hereafter:
p1 = subprocess.Popen(['gdalinfo', 'TIFF_filepath'], stdout=PIPE)
p2 = subprocess.Popen(['grep', "'Pixel Size =' > 'path_to_textfile'"], stdin=p1.stdout, stdout=PIPE)
p1.stdout.close() # Allow p1 to receive a SIGPIPE if p2 exits.
output = p2.communicate()[0]
In order to get these sequences of command strings from the initial entire shell string, I had to write custom string-manipulation functions and play around with them to get the strings (like filepaths) quoted while avoiding quoting other functional parameters, flags, etc. (like -i, >, ...).
This rather complex approach was necessary since the shlex.split() function just split my shell-command strings at every whitespace character, which led to problems when recombining them in the pipes.
Yet in spite of all these apparent improvements, no output textfile is generated, even though the process seemingly doesn't produce any errors and finishes "correctly" after the last line of the piping process:
output = p2.communicate()[0]
As a consequence, I'm still forced to use the old and insecure, but at least well-working, approach via the shell:
subprocess.check_call(shell_str, shell=True)
At least it works now employing this former approach, even though I didn't manage to implement the more secure piping procedure where several commands can be glued/piped together.
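A likely reason the split-up pipeline never produced the textfile: output redirection with > is a shell feature, so when "'Pixel Size =' > 'path_to_textfile'" is passed to grep as one argument, grep treats the whole thing (quotes, > and all) as its search pattern; it matches nothing, and no file is ever created. A minimal sketch of the shell-free version, using the question's placeholder paths:

import subprocess

# Sketch only, with the question's placeholder paths: the redirection becomes
# an ordinary file handle passed as stdout, and the grep pattern is one plain argument.
with open('path_to_textfile', 'w') as out:
    p1 = subprocess.Popen(['gdalinfo', 'TIFF_filepath'], stdout=subprocess.PIPE)
    p2 = subprocess.Popen(['grep', 'Pixel Size ='], stdin=p1.stdout, stdout=out)
    p1.stdout.close()  # allow p1 to receive a SIGPIPE if p2 exits first
    p2.wait()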
I once ran into a similar issue, and this fixed it:
cmd_str.split(' ')
My code:
# >>>>>>>>>>>>>>>>>>>>>>> UNZIP THE FILE AND RETURN THE FILE ARGUMENTS <<<<<<<<<<<<<<<<<<<<<<<<<<<<
def unzipFile(zipFile_):
    # INITIALIZE THE UNZIP COMMAND HERE
    cmd = "unzip -o " + zipFile_ + " -d " + outputDir
    Tlog("UNZIPPING FILE " + zipFile_)

    # GET THE PROCESS OUTPUT AND PIPE IT TO VARIABLE
    log = subprocess.Popen(cmd.split(' '), stdout=subprocess.PIPE, stderr=subprocess.PIPE)

    # GET BOTH THE ERROR LOG AND OUTPUT LOG FOR IT
    stdout, stderr = log.communicate()

    # FORMAT THE OUTPUT
    stdout = stdout.decode('utf-8')
    stderr = stderr.decode('utf-8')
    if stderr != "":
        Tlog("ERROR WHILE UNZIPPING FILE \n\n\t" + stderr + '\n')
        sys.exit(0)

    # INITIALIZE THE TOTAL UNZIPPED ITEMS
    unzipped_items = []

    # DECODE THE STDOUT TO 'UTF-8' FORMAT AND PARSE LINE BY LINE
    for line in stdout.split('\n'):
        # CHECK IF THE LINE CONTAINS THE KEYWORD 'inflating'
        if Regex.search(r"inflating", line) is not None:
            # FIND ALL THE MATCHED STRINGS WITH REGEX
            Matched = Regex.findall(r"inflating: " + outputDir + "(.*)", line)[0]
            # SUBSTITUTE THE OUTPUT BY REMOVING BEGIN/END WHITESPACES
            Matched = Regex.sub(r'^\s+|\s+$', '', Matched)
            # APPEND THE OUTPUTS TO THE LIST
            unzipped_items.append(outputDir + Matched)

    # RETURN THE OUTPUT
    return unzipped_items

python subprocess sends backslash before a quote

I have a string, which is a framed command that should be executed in the command line:
cmdToExecute = "TRAPTOOL -a STRING \"ABC\" -o STRING 'XYZ'"
I am considering the string to contain the entire command that should be triggered from the command prompt. If you take a closer look at the string cmdToExecute, you can see the option -o with value XYZ enclosed in SINGLE QUOTES. There is a reason this needs to be given in single quotes, or else my tool TRAPTOOL will not be able to process the command.
I am using subprocess.Popen to execute the entire command. Before executing my command in a shell, I am printing the content
print "Cmd to be exectued: %r" % cmdToExecute
myProcess = subprocess.Popen(cmdToExecute, stdout=subprocess.PIPE, stderr=subprocess.PIPE, shell=False)
(stdOut, stdErr) = myProcess.communicate()
The output of the above command is,
Cmd to be executed: TRAPTOOL -a STRING "ABC" -o \'XYZ\'.
You can see that the output shows a BACKSLASH added automatically while printing. Actually, the \ is not there in the string, which I tested using a regex. But when the script is run on my box, TRAPTOOL truncates part of the string XYZ on the receiving server. I manually copy-pasted the print output and tried sending it, and I saw the same error on the receiving server. However, when I removed the backslash, it sent the trap without any truncation.
Can anyone say why this happens?
Is there any way to see what command is actually executed by subprocess.Popen?
Is there any other way I can execute my command, other than subprocess.Popen, that might solve this problem?
Try using shlex to split your command string:
>>> import shlex
>>> argv = shlex.split("TRAPTOOL -a STRING \"ABC\" -o STRING 'XYZ'")
>>> myProcess = subprocess.Popen(argv, stdout=subprocess.PIPE, stderr=subprocess.PIPE, shell=False)
>>> (stdOut, stdErr) = myProcess.communicate()
The first parameter to the Popen constructor can be an argument list for your shell command or a string, but an argument list might be easier to work with because of all the quotes involved. (See the Python subprocess documentation.)
If you want to see the commands being written, you could probably do something like:
>>> argv = shlex.split("bash -x -c 'TRAPTOOL -a STRING \"ABC\" -o STRING \'XYZ\''")
This makes bash echo the commands to the shell by means of the -x option.
You asked for the repr representation of the string, not the str representation. Basically, what would you have to type at the Python interactive interpreter to get the same output? That's what %r displays. Change that to %s to see the value as it's actually stored:
print "Cmd to be exectued: %s" % cmdToExecute

Python3 subprocess output

I want to run the Linux word count utility wc to determine the number of lines currently in /var/log/syslog, so that I can detect that it's growing. I've tried various tests, and while I get the results back from wc, they include both the line count and the filename (e.g., /var/log/syslog).
So it's returning:
1338 /var/log/syslog
But I only want the line count, so I want to strip off the /var/log/syslog portion, and just keep 1338.
I have tried converting it to string from bytestring, and then stripping the result, but no joy. Same story for converting to string and stripping, decoding, etc - all fail to produce the output I'm looking for.
These are some examples of what I get, with 1338 lines in syslog:
b'1338 /var/log/syslog\n'
1338 /var/log/syslog
Here's some test code I've written to try and crack this nut, but no solution:
import subprocess
#check_output returns byte string
stdoutdata = subprocess.check_output("wc --lines /var/log/syslog", shell=True)
print("2A stdoutdata: " + str(stdoutdata))
stdoutdata = stdoutdata.decode("utf-8")
print("2B stdoutdata: " + str(stdoutdata))
stdoutdata=stdoutdata.strip()
print("2C stdoutdata: " + str(stdoutdata))
The output from this is:
2A stdoutdata: b'1338 /var/log/syslog\n'
2B stdoutdata: 1338 /var/log/syslog
2C stdoutdata: 1338 /var/log/syslog
I suggest that you use subprocess.getoutput(), as it does exactly what you want: run a command in a shell and get its string output (as opposed to byte-string output). Then you can split on whitespace and grab the first element from the returned list of strings.
Try this:
import subprocess
stdoutdata = subprocess.getoutput("wc --lines /var/log/syslog")
print("stdoutdata: " + stdoutdata.split()[0])
Since Python 3.6 you can make check_output() return a str instead of bytes by giving it an encoding parameter:
check_output(['wc', '--lines', '/var/log/syslog'], encoding='UTF-8')
But since you just want the count, and both split() and int() are usable with bytes, you don't need to bother with the encoding:
linecount = int(check_output(['wc', '-l', '/var/log/syslog']).split()[0])
While some things might be easier with an external program (e.g., counting log line entries printed by journalctl), in this particular case you don't need to use an external program. The simplest Python-only solution is:
with open('/var/log/syslog', 'rt') as f:
    linecount = len(f.readlines())
This does have the disadvantage that it reads the entire file into memory; if it's a huge file, instead initialize linecount = 0 before you open the file and use a for line in f: linecount += 1 loop instead of readlines() so that only a small part of the file is in memory as you count.
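A minimal sketch of that counting loop:

# Iterating over the file object reads one line at a time, so even a huge
# syslog never sits in memory all at once.
linecount = 0
with open('/var/log/syslog', 'rt') as f:
    for line in f:
        linecount += 1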
To avoid invoking a shell and decoding filenames that might be an arbitrary byte sequence (except '\0') on *nix, you could pass the file as stdin:
import subprocess
with open(b'/var/log/syslog', 'rb') as file:
    nlines = int(subprocess.check_output(['wc', '-l'], stdin=file))
print(nlines)
Or you could ignore any decoding errors:
import subprocess
stdoutdata = subprocess.check_output(['wc', '-l', '/var/log/syslog'])
nlines = int(stdoutdata.decode('ascii', 'ignore').partition(' ')[0])
print(nlines)
Equivalent to Curt J. Sampson's answer is this one (it returns a string):
subprocess.check_output('wc -l /path/to/your/file | cut -d " " -f1', universal_newlines=True, shell=True)
From the docs:
If encoding or errors are specified, or text is true, file objects for
stdin, stdout and stderr are opened in text mode using the specified
encoding and errors or the io.TextIOWrapper default. The
universal_newlines argument is equivalent to text and is provided for
backwards compatibility. By default, file objects are opened in binary
mode.
Something similar, but a bit more complex using subprocess.run():
subprocess.run(command, shell=True, check=True, universal_newlines=True, stdout=subprocess.PIPE).stdout
since subprocess.check_output() is roughly equivalent to subprocess.run(command, check=True, stdout=subprocess.PIPE).stdout.
getoutput (and the closer replacement getstatusoutput) is not a direct replacement for check_output; there are security changes in 3.x that prevent some previous commands from working that way (my script was attempting to work with iptables and failing with the new commands). It is better to adapt to the new Python 3 output and add the argument universal_newlines=True:
check_output(command, universal_newlines=True)
This call behaves as you expect from check_output, but returns string output instead of bytes. It's a direct replacement.

Python subprocess to call Unix commands, a question about how output is stored

I am writing a python script that reads a line/string, calls Unix, uses grep to search a query file for lines that contain the string, and then prints the results.
from subprocess import call

for line in infilelines:
    output = call(["grep", line, "path/to/query/file"])
    print output
    print line
When I look at my results printed to the screen, I will get a list of matching strings from the query file, but I will also get "1" and "0" integers as output, and line is never printed to the screen. I expect to get the lines from the query file that match my string, followed by the string that I used in my search.
call returns the process return code, not its output: grep exits with 0 when it finds a match and 1 when it doesn't, which is where your "1" and "0" integers come from, while grep's matching lines go straight to the terminal rather than into your variable.
If using Python 2.7, use check_output.
from subprocess import check_output
output = check_output(["grep", line, "path/to/query/file"])
If using anything before that, use communicate.
import subprocess
process = subprocess.Popen(["grep", line, "path/to/query/file"], stdout=subprocess.PIPE)
output = process.communicate()[0]
This will open a pipe for stdout that you can read with communicate. If you want stderr too, you need to add "stderr=subprocess.PIPE" too.
This will return the full output. If you want to parse it into separate lines, use split.
output.split('\n')
I believe Python takes care of line-ending conversions for you, but since you're using grep I'm going to assume you're on Unix where the line-ending is \n anyway.
http://docs.python.org/library/subprocess.html#subprocess.check_output
The following code works with Python >= 2.5 (note that the commands module was removed in Python 3):
from commands import getoutput
output = getoutput('grep %s path/to/query/file' % line)
output_list = output.splitlines()
Why would you want to execute a call to an external grep when Python itself can do it? This is extra overhead, and your code will then depend on grep being installed. This is how you do a simple grep in Python with the "in" operator:
query = open("/path/to/query/file").readlines()
query = [i.rstrip() for i in query]
f = open("file")
for line in f:
    if line.rstrip() in query:
        print line.rstrip()
f.close()
