Python subprocess argument with quotes [duplicate]

This question already has answers here:
Using subprocess.run with arguments containing quotes
(3 answers)
Closed 1 year ago.
I am trying to run the MediaInfo command-line utility (http://mediaarea.net/en/MediaInfo) from Python.
It accepts arguments like this.
Simple Usage:
# verbose all info
MediaInfo.exe test.mp4
Template Usage:
# verbose selected info from csv
MediaInfo.exe --inform="file://D:\path\to\csv\template.csv" test.mp4
I am trying to run it with the template argument. The command above works from CMD: I can see my selected output in the console window.
But when I run it from Python, it outputs all info, ignoring the CSV I pass as an argument.
Can anyone explain why? Is it because of the quotes?
NOTE: If the path to the CSV is wrong or the CSV is invalid, MediaInfo outputs all info, which is exactly what happens here.
# App variable is the full path to MediaInfo.exe
# filename variable is the full path to the media file
proc = subprocess.Popen([App, '--inform="file://D:\path\to\csv\template.csv"', filename], shell=True, stderr=subprocess.PIPE, stdout=subprocess.PIPE)
return_code = proc.wait()
for line in proc.stdout:
    print line

On Windows, you can pass the command as a string, i.e., as is:
from subprocess import check_output
cmd = r'MediaInfo.exe --inform="file://D:\path\to\csv\template.csv" test.mp4'
out = check_output(cmd)
Notice the r'' prefix: the raw-string literal prevents '\t' from being interpreted as a single tab character rather than the two characters backslash and t.
Unrelated: if you specify stdout=PIPE, stderr=PIPE, you should read both streams concurrently, and before p.wait() is called; otherwise a deadlock is possible if the command generates enough output.
If passing the command as a string works, then you could try a list argument:
from subprocess import check_output
from urllib import pathname2url  # Python 2; on Python 3: from urllib.request import pathname2url
cmd = [app, '--inform']
cmd += ['file:' + pathname2url(r'D:\path\to\csv\template.csv')]
cmd += [filename]
out = check_output(cmd)
Could you also write an example of the p.wait() deadlock you mentioned?
It is easy. Just produce large output in the child process:
import sys
from subprocess import Popen, PIPE
#XXX DO NOT USE, IT DEADLOCKS
p = Popen([sys.executable, "-c", "print('.' * (1 << 23))"], stdout=PIPE)
p.wait() # <-- this never returns unless the pipe buffer is larger than (1<<23)
assert 0 # unreachable
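For contrast, here is a minimal sketch of a non-deadlocking variant, written for Python 3. It relies on communicate(), which drains stdout and stderr while the child runs, so the pipe buffer never fills up. The helper name run_and_capture is made up for this illustration.

```python
import sys
from subprocess import Popen, PIPE


def run_and_capture(argv):
    """Run argv and return (returncode, stdout_bytes, stderr_bytes)."""
    p = Popen(argv, stdout=PIPE, stderr=PIPE)
    # communicate() reads both pipes until EOF and then waits for the child,
    # so the child can never block on a full pipe.
    out, err = p.communicate()
    return p.returncode, out, err


# Same large-output child as in the deadlocking example.
returncode, out, err = run_and_capture(
    [sys.executable, "-c", "print('.' * (1 << 23))"])
print(returncode, len(out))
```

The trade-off is that communicate() buffers the whole output in memory, so it suits commands whose output is bounded.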

If you print your arguments, you might see what is going wrong:
>>> print '--inform="file://D:\path\to\csv\template.csv"'
--inform="file://D:\path o\csv emplate.csv"
The problem is that \ introduces an escape sequence, so \t becomes a tab character. If you put the r prefix in front of your string (a raw string literal), backslashes are kept as-is:
>>> print r'--inform="file://D:\path\to\csv\template.csv"'
--inform="file://D:\path\to\csv\template.csv"
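The same effect can be checked without printing, by comparing the two literals directly. This is a small Python 3 sketch; the path D:\temp\new.csv is a made-up example chosen because both \t and \n are valid escape sequences.

```python
# Hypothetical path, chosen so that every backslash starts a real escape.
plain = 'D:\temp\new.csv'    # \t -> tab, \n -> newline
raw = r'D:\temp\new.csv'     # raw literal: backslashes are kept

# repr() shows what is actually stored in each string.
print(repr(plain))
print(repr(raw))

has_tab = '\t' in plain
has_newline = '\n' in plain
length_difference = len(raw) - len(plain)  # each escape collapses 2 chars into 1
```

An equivalent fix is to double every backslash ('D:\\temp\\new.csv'), which yields the same string as the raw literal.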

Related

How can I save the os commands outputs in a text file? [duplicate]

This question already has answers here:
Save output of os.system to text file
(4 answers)
Closed 2 years ago.
I'm trying to write a script that runs an OS command (on Linux) and saves its output to a text file. But when I run this code, the output of the command is not saved in the text file.
#!/usr/bin/python
import sys
import os
target = raw_input('Enter the website : ')
ping_it = os.system('ping ' + target)
string_it = str(ping_it)
with open("Output.txt", "w+") as fo:
    fo.write(string_it)
    fo.close()
After running the script, the only thing I find in Output.txt is the number 2.

Welcome to Stack Overflow.
The main issue here is that os.system is not designed to produce the output from the command - it simply runs it, and the process sends its output to whatever it inherits from its parent (your program).
To capture output it's easiest to use the subprocess module, which allows you to capture the process's outputs.
Here's a fairly simple program that will get you started:
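import subprocess
target = 'google.com'
# shell=True lets a command string run on Linux too. "-c 4" (Linux ping) stops
# after four packets; without it ping runs until interrupted.
ping_it = subprocess.Popen('ping -c 4 ' + target,
                           shell=True,
                           stdout=subprocess.PIPE,
                           stderr=subprocess.PIPE)
out, err = ping_it.communicate()
with open("Output.txt", "w+") as fo:
    # On Python 3, out is bytes: decode it instead of writing its repr.
    fo.write(out.decode())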
If you want to read output as it is produced rather than waiting for the subprocess to terminate you can use a single subprocess.PIPE channel and read from that, which is conveniently expressed in forms like this:
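from subprocess import Popen, PIPE
with Popen(["ping", "google.com"], stdout=PIPE) as proc:
    for line in proc.stdout:  # yields each line as the subprocess writes it
        print(line.decode(), end="")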
In this example I chose to give the command as a list of arguments rather than as a simple string. This avoids having to join arguments into a string if they are already in list form.
Note that when interacting with subprocesses in this way it's possible for the subprocess to get in a blocked state because either stdout or stderr has filled up its output buffer space. If your program then tries to read from the other channel that will create a deadlock, where each process is waiting for the other to do something. To avoid this you can make stderr a temporary file, then verify after subprocess completion that the file contains nothing of significance (and, ideally, remove it).
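The temporary-file approach described above could look roughly like this in Python 3. It is a sketch, not the only way to do it: the function name run_with_stderr_file is invented here, and the child process is a small Python one-liner standing in for a real command such as ping.

```python
import subprocess
import sys
import tempfile


def run_with_stderr_file(argv):
    """Capture stdout through a pipe and stderr through a temporary file."""
    # TemporaryFile is deleted automatically when the with block ends.
    with tempfile.TemporaryFile() as err_file:
        proc = subprocess.Popen(argv, stdout=subprocess.PIPE, stderr=err_file)
        # Only one pipe is in use, so reading it to EOF cannot deadlock:
        # stderr goes to disk and never blocks the child.
        out = proc.stdout.read()
        proc.stdout.close()
        returncode = proc.wait()
        err_file.seek(0)
        err = err_file.read()
    return returncode, out, err


# Stand-in child that writes to both streams.
child = [sys.executable, "-c",
         "import sys; sys.stdout.write('out'); sys.stderr.write('err')"]
returncode, out, err = run_with_stderr_file(child)
print(returncode, out, err)
```

If both streams are small enough to hold in memory, communicate() with two pipes is simpler and achieves the same safety.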

From docs you can use os.popen to assign output of any command to a variable.
import os
target = raw_input('Enter the website : ')
output = os.popen('ping ' + target).read() # Saving the output
with open('output.txt', 'w+') as f:
    f.write(output)

Exactly what are you trying to save in the file? You did save the return value of os.system, which is nothing more than the exit status of the command. This is exactly what the documentation tells you is the return value of that function.
If you want the output of the ping command, you need to capture what ping writes, not what os.system returns. The simple way is to add UNIX redirection:
os.system('ping ' + target + ' > Output.txt 2>&1')
If you feel a need to pass the results through Python, use a separate process and receive the command results; see here.
You can also spawn a separate process and examine the results as they are produced, line by line. You don't seem to need that, but just in case, see my own question here.

Python2: Writing to stdin of interactive process, using Popen.communicate(), without trailing newline

I am trying to write what I thought would be a simple utility script to call a different command, but Popen.communicate() seems to append a newline. I imagine this is to terminate input, and it works with a basic script that takes an input and prints it out, but it's causing problems when the other program is interactive (such as e.g. bc).
Minimal code to reproduce, using bc in lieu of the other program (since both are interactive, getting it to work with bc should solve the problem):
#!/usr/bin/env python
from subprocess import Popen, PIPE
command = "bc"
p = Popen(command, stdin=PIPE, stdout=PIPE, stderr=PIPE)
stdout_data = p.communicate(input="2+2")
print(stdout_data)
This prints ('', '(standard_in) 1: syntax error\n'), presumably caused by the appended newline character, as piping the same string to bc in a shell, echo "2+2" | bc, prints 4 just fine.
Is it possible to use Popen.communicate() without appending the newline, or would I need to use a different method?

I guess I'm an idiot, because the solution was the opposite of what I thought: adding a newline to the input: stdout_data = p.communicate(input="2+2\n") makes the script print ('4\n', '') as it should, rather than give an error.

python subprocess sends backslash before a quote

I have a string containing a complete command that should be executed on the command line:
cmdToExecute = "TRAPTOOL -a STRING \"ABC\" -o STRING 'XYZ'"
The string holds the entire command that should be triggered from the command prompt. If you take a closer look at cmdToExecute, you can see the option -o with the value XYZ enclosed in SINGLE QUOTES. The value has to be given in single quotes, or else my tool TRAPTOOL will not be able to process the command.
I am using subprocess.Popen to execute the entire command. Before executing my command in a shell, I am printing the content
print "Cmd to be executed: %r" % cmdToExecute
myProcess = subprocess.Popen(cmdToExecute, stdout=subprocess.PIPE, stderr=subprocess.PIPE, shell=False)
(stdOut, stdErr) = myProcess.communicate()
The output of the above command is,
Cmd to be executed: TRAPTOOL -a STRING "ABC" -o \'XYZ\'.
You can see that the output shows a BACKWARD SLASH added automatically while printing. Actually, the \ is not there in the string, which I tested using a regex. But, when the script is run on my box, the TRAPTOOL truncates the part of the string XYZ on the receiving server. I manually copy pasted the print output and tried sending it, I saw the same error on the receiving server. However, when I removed the backward slash, it sent the trap without any truncation.
Can anyone say why this happens?
Is there any way to see what command is actually executed by subprocess.Popen?
Is there any other way to execute my command, other than subprocess.Popen, that might solve this problem?

Try using shlex to split your command string:
>>> import shlex, subprocess
>>> argv = shlex.split("TRAPTOOL -a STRING \"ABC\" -o STRING 'XYZ'")
>>> myProcess = subprocess.Popen(argv, stdout=subprocess.PIPE, stderr=subprocess.PIPE, shell=False)
>>> (stdOut, stdErr) = myProcess.communicate()
The first parameter to the Popen constructor can be an argument list for your shell command or a string, but an argument list might be easier to work with because of all the quotes involved. (See the Python subprocess documentation.)
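To make the splitting concrete, this Python 3 sketch shows the argument list that shlex.split produces for the command above. Note that shlex follows shell rules and consumes the quotes, so the program receives XYZ without them. If a tool really needs literal quote characters in its argument (an assumption about TRAPTOOL that the question does not settle), they have to be protected by an outer pair of quotes, as in the second call.

```python
import shlex

# Quotes group words and are then removed, exactly as a POSIX shell would do.
argv = shlex.split("TRAPTOOL -a STRING \"ABC\" -o STRING 'XYZ'")
print(argv)

# Wrapping the single quotes in double quotes keeps them as literal characters.
argv_literal = shlex.split('''TRAPTOOL -o STRING "'XYZ'"''')
print(argv_literal)
```

Printing the list before handing it to Popen is a cheap way to see exactly which arguments the child process will receive.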
If you want to see the command that is actually executed, you could wrap it in bash -x and build the argument list by hand, so that the inner quotes survive:
>>> argv = ["bash", "-x", "-c", "TRAPTOOL -a STRING \"ABC\" -o STRING 'XYZ'"]
The -x option makes bash print each command to stderr before running it.

You asked for the repr representation of the string, not the str representation. Basically, what would you have to type at the Python interactive interpreter to get the same output? That's what %r displays. Change that to %s to see the value as it's actually stored:
print "Cmd to be executed: %s" % cmdToExecute
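The difference between the two format codes can be verified directly. A short Python 3 sketch, using the command string from the question:

```python
cmd = "TRAPTOOL -a STRING \"ABC\" -o STRING 'XYZ'"

shown_repr = "%r" % cmd  # what you would type to recreate the string
shown_str = "%s" % cmd   # the characters the string actually contains

print(shown_repr)
print(shown_str)

# The backslashes exist only in the repr() form, never in the data itself.
backslash_in_repr = "\\" in shown_repr
backslash_in_data = "\\" in cmd
```

Because the string contains both quote styles, repr() has to escape one of them, which is where the backslashes in the printed output come from.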

Python3 subprocess output

I want to run the Linux word count utility wc to determine the number of lines currently in /var/log/syslog, so that I can detect that it's growing. I've tried various tests, and while I get the results back from wc, they include both the line count and the file name (e.g., /var/log/syslog).
So it's returning:
1338 /var/log/syslog
But I only want the line count, so I want to strip off the /var/log/syslog portion, and just keep 1338.
I have tried converting it to string from bytestring, and then stripping the result, but no joy. Same story for converting to string and stripping, decoding, etc - all fail to produce the output I'm looking for.
These are some examples of what I get, with 1338 lines in syslog:
b'1338 /var/log/syslog\n'
1338 /var/log/syslog
Here's some test code I've written to try and crack this nut, but no solution:
import subprocess
#check_output returns byte string
stdoutdata = subprocess.check_output("wc --lines /var/log/syslog", shell=True)
print("2A stdoutdata: " + str(stdoutdata))
stdoutdata = stdoutdata.decode("utf-8")
print("2B stdoutdata: " + str(stdoutdata))
stdoutdata=stdoutdata.strip()
print("2C stdoutdata: " + str(stdoutdata))
The output from this is:
2A stdoutdata: b'1338 /var/log/syslog\n'
2B stdoutdata: 1338 /var/log/syslog
2C stdoutdata: 1338 /var/log/syslog
2D stdoutdata: 1338 /var/log/syslog

I suggest that you use subprocess.getoutput() as it does exactly what you want—run a command in a shell and get its string output (as opposed to byte string output). Then you can split on whitespace and grab the first element from the returned list of strings.
Try this:
import subprocess
stdoutdata = subprocess.getoutput("wc --lines /var/log/syslog")
print("stdoutdata: " + stdoutdata.split()[0])

Since Python 3.6 you can make check_output() return a str instead of bytes by giving it an encoding parameter:
check_output(['wc', '--lines', '/var/log/syslog'], encoding='UTF-8')
But since you just want the count, and both split() and int() are usable with bytes, you don't need to bother with the encoding:
linecount = int(check_output(['wc', '-l', '/var/log/syslog']).split()[0])
While some things might be easier with an external program (e.g., counting log line entries printed by journalctl), in this particular case you don't need to use an external program. The simplest Python-only solution is:
with open('/var/log/syslog', 'rt') as f:
    linecount = len(f.readlines())
This does have the disadvantage that it reads the entire file into memory; if it's a huge file instead initialize linecount = 0 before you open the file and use a for line in f: linecount += 1 loop instead of readlines() to have only a small part of the file in memory as you count.
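The loop-based variant described above might be written as follows in Python 3. The function name count_lines is invented for this sketch, and a throwaway temporary file stands in for /var/log/syslog so the example can run anywhere.

```python
import os
import tempfile


def count_lines(path):
    """Count lines while holding only one line in memory at a time."""
    linecount = 0
    # Binary mode avoids decoding errors on log files with odd bytes.
    with open(path, 'rb') as f:
        for _line in f:
            linecount += 1
    return linecount


# Demonstration on a temporary file (a stand-in for /var/log/syslog).
with tempfile.NamedTemporaryFile('w', suffix='.log', delete=False) as tmp:
    tmp.write('first\nsecond\nthird\n')
demo_count = count_lines(tmp.name)
os.remove(tmp.name)
print(demo_count)
```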

To avoid invoking a shell and decoding filenames that might be an arbitrary byte sequence (except '\0') on *nix, you could pass the file as stdin:
import subprocess
with open(b'/var/log/syslog', 'rb') as file:
    nlines = int(subprocess.check_output(['wc', '-l'], stdin=file))
print(nlines)
Or you could ignore any decoding errors:
import subprocess
stdoutdata = subprocess.check_output(['wc', '-l', '/var/log/syslog'])
nlines = int(stdoutdata.decode('ascii', 'ignore').partition(' ')[0])
print(nlines)
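One caveat with partition(' ') is that some wc implementations pad the count with leading spaces, in which case the first field would be empty. A slightly more tolerant parser, sketched here in Python 3 with the made-up name parse_wc_count, splits on any whitespace instead and works on the raw bytes:

```python
def parse_wc_count(output):
    """Extract the leading integer from wc output given as bytes."""
    # bytes.split() with no argument ignores leading and repeated whitespace,
    # and int() accepts ASCII digits in a bytes object.
    return int(output.split()[0])


count_with_name = parse_wc_count(b'1338 /var/log/syslog\n')
count_padded = parse_wc_count(b'    1338 /var/log/syslog\n')
count_from_stdin = parse_wc_count(b'1338\n')  # wc reading stdin prints no name
print(count_with_name, count_padded, count_from_stdin)
```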

Equivalent to Curt J. Sampson's answer is also this one (it's returning a string):
subprocess.check_output('wc -l /path/to/your/file | cut -d " " -f1', universal_newlines=True, shell=True)
from docs:
If encoding or errors are specified, or text is true, file objects for
stdin, stdout and stderr are opened in text mode using the specified
encoding and errors or the io.TextIOWrapper default. The
universal_newlines argument is equivalent to text and is provided for
backwards compatibility. By default, file objects are opened in binary
mode.
Something similar, but a bit more verbose, using subprocess.run():
subprocess.run(command, shell=True, check=True, universal_newlines=True, stdout=subprocess.PIPE).stdout
This works because subprocess.check_output(...) is equivalent to subprocess.run(..., check=True, stdout=subprocess.PIPE).stdout.
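The effect of universal_newlines can be seen side by side in this Python 3 sketch. To keep it runnable on any system, the child is a Python one-liner that prints a line shaped like wc output; that stand-in is an assumption of the example, not part of the original answers.

```python
import subprocess
import sys

# Stand-in for: wc --lines /var/log/syslog
argv = [sys.executable, "-c", "print('1338 /var/log/syslog')"]

as_bytes = subprocess.check_output(argv)                          # bytes
as_text = subprocess.check_output(argv, universal_newlines=True)  # str
via_run = subprocess.run(argv, check=True, universal_newlines=True,
                         stdout=subprocess.PIPE).stdout           # str

linecount = int(as_text.split()[0])
print(type(as_bytes).__name__, type(as_text).__name__, linecount)
```

On Python 3.7 and later, text=True is the more readable spelling of universal_newlines=True.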

getoutput (and the closer replacement getstatusoutput) are not a direct replacement of check_output - there are security changes in 3.x that prevent some previous commands from working that way (my script was attempting to work with iptables and failing with the new commands). Better to adapt to the new python3 output and add the argument universal_newlines=True:
check_output(command, universal_newlines=True)
This behaves like check_output as you would expect, but returns string output instead of bytes. It's a direct replacement.

Python on Windows: path as subprocess argument gets modified and generating error

I am using subprocess on Windows with Python 2.6. I am trying to parse a text file using a legacy parser application (assume parser.py) as follows:
import subprocess
k = subprocess.Popen(['python', 'parser.py', '-f C:\Report1\2011-03-14.txt'],
shell=True, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
print k.communicate()
The issue here is with the way filename gets passed to the legacy application where I cannot change the code but only can access it using Python.
It generates with the following error:
IOError: [Errno 22] invalid mode (\'r\') or filename: C:\\Report1\\2011-03-14.txt
When I copy the modified filename (with double backslashes) from the traceback to check whether it exists, the system is not able to find it.
Question: How can I pass the path as an argument so that it is not changed to double backslashes and the system can read the file?
NOTE: os.sep also does not resolve the issue.
EDIT: Executing using os.system works perfectly, but then I cannot grab the output for later use. I am currently using os.system in one module (run_parser.py) and then using subprocess in another module (get_parse_status.py) that Popens run_parser.py to grab the output. I would appreciate anything better than this.
Thanks for the time.

Change your parameter list to encode the path as a raw string:
k = subprocess.Popen(['python', 'parser.py', '-f', r'C:\Report1\2011-03-14.txt'],
shell=True, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
Here is a simple program that reads a file and reports its length:
import sys
import os
userinput = sys.argv[1]
data = open(userinput, 'rb').read()
datalength = len(data)
fname = os.path.basename(userinput)
print "%s datasize = %s" % (fname, datalength)
Then to call it through the interpreter:
>>> k = subprocess.Popen(['python', 'test2.py', 'w:\bin\test2.py'], shell=True, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
>>> k.communicate()
5: ('Traceback (most recent call last):\r\n File "w:\\bin\\test2.py", line 4, in <module>
data = open(userinput, \'rb\').read()
IOError: [Errno 22] invalid mode (\'rb\') or filename: 'w:\\x08in\\test2.py', None)
>>> k = subprocess.Popen(['python', r'w:\bin\test2.py', r'w:\bin\test2.py'], shell=True, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
>>> k.communicate()
6: ('test2.py datasize = 194\n', None)
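The \x08 in the traceback above is the backspace character that Python produced from \b in 'w:\bin\test2.py'. The same mechanism affects the path from the question, where \201 is read as an octal escape. A Python 3 sketch that makes both substitutions visible (the first backslash of the question's path is doubled here only to avoid an invalid-escape warning for \R):

```python
# \b is the backspace escape, so the non-raw literal loses a character.
plain_b = 'w:\bin\test2.py'
raw_b = r'w:\bin\test2.py'

# \201 is an octal escape for the character with code 0x81.
plain_octal = "C:\\Report1\2011-03-14.txt"
raw_octal = r"C:\Report1\2011-03-14.txt"

print(repr(plain_b))
print(repr(plain_octal))

# A doubled backslash in repr() output is still a single character.
single_backslash_length = len("\\")
```

Raw literals, doubled backslashes, or forward slashes all avoid the problem; which one to use is mostly a matter of taste.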

"C:\Report1\2011-03-14.txt" isn't the same as the path C:\Report1\2011-03-14.txt. It's actually some bytestring, 'C:\\Report1\x811-03-14.txt'. Strangely enough it doesn't sound like this is your issue, but it might be related. r"C:\Report1\2011-03-14.txt" fixes this.
But be aware that double backslashes in the printed representation don't necessarily mean that there are actually two backslashes. '\\' is a Python string of length 1.

Have you tried:
from subprocess import Popen, PIPE, STDOUT
k = Popen(r'python parser.py -f "C:\Report1\2011-03-14.txt"',
          shell=True,
          stdout=PIPE,
          stderr=STDOUT)
print k.communicate()
I find that often when passing args on the command line via Popen, enclosing the parameters in double-quotes is the only reliable way to get it to behave. I also don't always trust the list method of calling Popen and usually form the command myself. Notice also the raw indicator (r'').
