Python argparse argument with quotes - python

Is there any way I can tell argparse to not eat quotation marks?
For example, when I give an argument with quotes, argparse only takes what's inside the quotes as the argument. I want to capture the quotation marks as well (without having to escape them on the command line).
pbsnodes -x | xmlparse -t "interactive-00"
produces
interactive-00
I want
"interactive-00"

I think it is the shell that eats them, so python will actually never see them. Escaping them on the command line may be your only option.
If it's the \"backslash\" style escaping you don't like for some reason, then this way should work instead:
pbsnodes -x | xmlparse -t '"interactive-00"'
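To check that argparse itself leaves quotes alone, a minimal sketch can hand it argument lists directly. Here shlex.split stands in for a POSIX shell's word splitting (an approximation only), and the parser with its single -t option is a hypothetical stand-in for xmlparse:

```python
import argparse
import shlex

# Hypothetical stand-in for xmlparse's option parser.
parser = argparse.ArgumentParser()
parser.add_argument("-t")

# shlex.split approximates how a POSIX shell splits a command line.
plain_argv = shlex.split('xmlparse -t "interactive-00"')[1:]
quoted_argv = shlex.split("""xmlparse -t '"interactive-00"'""")[1:]

plain = parser.parse_args(plain_argv).t
quoted = parser.parse_args(quoted_argv).t

print(plain)   # the quotes were removed before argparse ever ran
print(quoted)  # the double quotes survive inside the single quotes
```

Whatever reaches sys.argv is what argparse stores, so the fix has to happen on the command line, not in the parser.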

On Windows, the command line is parsed into an argument vector by the Python process itself (on Unix-like systems the shell does the splitting before Python starts). Depending on how Python is built, that is done by some run-time library; for a Windows build, it is most likely the MS Visual C++ runtime library. More details about how it parses the command line can be found in the Visual C++ documentation: Parsing C++ Command-Line Arguments.
In particular:
A string surrounded by double quotation marks ("string") is interpreted as a single argument, regardless of white space contained within. A quoted string can be embedded in an argument.
A double quotation mark preceded by a backslash (\") is interpreted as a literal double quotation mark character (").
If you want to see the unprocessed command line on Windows, you can do this (win32api comes from the pywin32 package):
import win32api
print(win32api.GetCommandLine())

Related

Python Subprocess.Run inserting escape characters into program arguments

subprocess.run([program,'-force-gfx-jobs native -token='+PHPSESSID+ ' -config={"BackendURL":"https://prod.app.com","Version","live"}'])
When this command is run and you look in Task Manager at the command-line values passed to the program, the double quotes are prefixed by backslashes, like this:
subprocess.run([program,'-force-gfx-jobs native -token='+PHPSESSID+ ' -config={\"BackendURL\":\"https://prod.app.com\",\"Version\",\"live\"}'])
You are passing a single string where you should be passing a list of strings.
subprocess.run([program,'-force-gfx-jobs', 'native',
'-token='+PHPSESSID, '-config={"BackendURL":"https://prod.app.com","Version","live"}'])
I don't think the displayed backslashes are a problem; they are just for disambiguation (but this is Windows, so I'm probably underestimating the amount of crazy).
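One way to confirm that the backslashes are harmless is to launch a child process that reports what it actually received. The sketch below uses a small Python child in place of the real program; the PHPSESSID value is a hypothetical placeholder, and the -config string is copied as written in the question:

```python
import json
import subprocess
import sys

PHPSESSID = "abc123"  # hypothetical placeholder token
config = '-config={"BackendURL":"https://prod.app.com","Version","live"}'

# A tiny child program that prints the arguments it received as JSON.
echo_argv = "import json, sys; print(json.dumps(sys.argv[1:]))"

args = ["-force-gfx-jobs", "native", "-token=" + PHPSESSID, config]
result = subprocess.run(
    [sys.executable, "-c", echo_argv] + args,
    capture_output=True,
    text=True,
    check=True,
)

# Compare what the child saw with what we passed in.
received = json.loads(result.stdout)
print(received == args)
```

If the comparison holds, each list element arrived as one argument with its double quotes intact, regardless of how Task Manager displays the command line. A program that parses its own command line with different rules may still behave differently.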

Why is there a difference between using a list or a string with subprocess.Popen and quotes on the commandline

When running the following script:
import os
import sys
import subprocess
if len(sys.argv) > 1:
    print(sys.argv[1])
    sys.exit(0)
commandline = [sys.executable]
commandline.append(os.path.realpath(__file__))
commandline.append('"test"')
p = subprocess.Popen(commandline)
p.wait()
p = subprocess.Popen(" ".join(commandline))
p.wait()
It returns the following output
"test"
test
Why is there a difference between providing a list of arguments or one string?
This is run on a Windows machine, and you will see backslashes before the quotes in the command line shown in Task Manager.
I expected the same result in both runs.
Edit:
The problem is not so much the automatic escaping of spaces (I find that is the programmer's responsibility), but rather whether my quotes are escaped or not in the process command line.
These are the two subprocesses taken from the windows task manager:
A different non-python process parses the first commandline with the backslashes, which brings unexpected behaviour. How can I have it so that I can use a list and not have the quotes escaped on the commandline?
Edit2:
The backslashes before the quotes are definitely added by Python. If you run the following:
import subprocess
commandline = ['echo']
commandline.append('"test"')
commandline.append('>')
commandline.append(r'D:\test1.txt')
p = subprocess.Popen(commandline, shell=True)
p.wait()
commandline = 'echo "test" > D:\\test2.txt'
p = subprocess.Popen(commandline, shell=True)
p.wait()
Then you will see that the outputs are
D:\test1.txt:
\"test\"
D:\test2.txt:
"test"
The string API is dangerous since it might change the meaning of arguments. Example: You want to execute C:\Program Files\App\app.exe. If you use the string version of Popen(), you get an error:
C:\Program: Error 13
What happens is that, with the string API, the input is split on spaces, so the system tries to start the command C:\Program with the single argument Files\App\app.exe. To fix this, you need to quote properly, which lands you in quote hell when you have quotes in your arguments (i.e. when you really want to pass "test" as an argument, with the quotes).
To solve this (and other subtle bugs), there is the list API, where each element of the list becomes a single item passed to the OS without any modifications. With the list API, you get what you see. If you quote an argument with the list API, it will be passed on with the quotes. If there are spaces, they won't split your argument. If there are arbitrary other special characters (like * or %), they will all be passed on.
[EDIT] As usual, things are much more complex on Windows. From the Python documentation for the subprocess module:
17.1.5.1. Converting an argument sequence to a string on Windows
On Windows, an args sequence is converted to a string that can be parsed using the following rules (which correspond to the rules used by the MS C runtime):
1. Arguments are delimited by white space, which is either a space or a tab.
2. A string surrounded by double quotation marks is interpreted as a single argument, regardless of white space contained within. A quoted string can be embedded in an argument.
3. A double quotation mark preceded by a backslash is interpreted as a literal double quotation mark.
4. Backslashes are interpreted literally, unless they immediately precede a double quotation mark.
5. If backslashes immediately precede a double quotation mark, every pair of backslashes is interpreted as a literal backslash. If the number of backslashes is odd, the last backslash escapes the next double quotation mark as described in rule 3.
So the backslashes are there because MS C runtime wants it that way.
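The rules quoted above can be sketched as a small parser. This is a simplified illustration written for this thread, not the actual MS C runtime code; the function name split_msvc_cmdline is hypothetical, and special cases of the real runtime (such as doubled quotes inside a quoted string) are ignored:

```python
def split_msvc_cmdline(cmdline):
    """Split a command line following the five quoted rules (simplified)."""
    args, current = [], []
    in_quotes = False
    have_arg = False
    i, n = 0, len(cmdline)
    while i < n:
        c = cmdline[i]
        if c == "\\":
            # Count the whole run of backslashes.
            j = i
            while j < n and cmdline[j] == "\\":
                j += 1
            count = j - i
            if j < n and cmdline[j] == '"':
                # Each pair becomes one literal backslash.
                current.append("\\" * (count // 2))
                if count % 2:
                    # An odd backslash escapes the quote.
                    current.append('"')
                    j += 1
            else:
                # Not followed by a quote: backslashes are literal.
                current.append("\\" * count)
            have_arg = True
            i = j
        elif c == '"':
            # An unescaped quote toggles quoted mode and is dropped.
            in_quotes = not in_quotes
            have_arg = True
            i += 1
        elif c in " \t" and not in_quotes:
            # Unquoted white space ends the current argument.
            if have_arg:
                args.append("".join(current))
                current, have_arg = [], False
            i += 1
        else:
            current.append(c)
            have_arg = True
            i += 1
    if have_arg:
        args.append("".join(current))
    return args


print(split_msvc_cmdline(r'prog \"test\"'))
print(split_msvc_cmdline(r'"C:\Program Files\App\app.exe" "test"'))
```

Under these rules, \"test\" on the command line comes back as "test" with its quotes, while a bare "test" loses them, which matches the two outputs in the question.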

Why do I need 4 backslashes in a Python path?

When I'm using Python 3 to launch a program via subprocess.call(), why do I need 4 backslashes in paths?
This is my code:
cmd = 'C:\\\\Windows\\\\System32\\\\cmd.exe'
cmd = shlex.split(cmd)
subprocess.call(cmd)
When I examine the command line of the launched cmd.exe instance with Task Manager, it shows the path correctly with only one backslash separating each path.
Because of this, I need this on Windows to make the paths work:
if platform.platform().startswith('Windows'):
    cmd = cmd.replace(os.sep, os.sep + os.sep)
Is there a more elegant solution?
Part of the problem is that you're using shlex, which implements escaping rules used by Unix-ish shells. But you're running on Windows, whose command shells use different rules. That accounts for one level of needing to double backslashes (i.e., to worm around something shlex does that you didn't need to begin with).
That you're using a regular string instead of a raw string (r"...") accounts for the other level of needing to double backslashes, and 2*2 = 4. QED ;-)
This works fine on Windows:
cmd = subprocess.call(r"C:\Windows\System32\cmd.exe")
By the way, read the docs for subprocess.Popen() carefully: the Windows CreateProcess() API call requires a string for an argument. When you pass a sequence instead, Python tries to turn that sequence into a string, via rules explained in the docs. When feasible, it's better - on Windows - to pass the string you want directly.
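The shlex half of the doubling can be seen without launching anything. A small sketch, using the path from the question as a raw string:

```python
import shlex

# Raw string: one backslash per path separator.
path = r"C:\Windows\System32\cmd.exe"

# POSIX mode (the default) treats each backslash as an escape and drops it.
posix_split = shlex.split(path)

# Doubling the backslashes first compensates for that.
doubled_split = shlex.split(path.replace("\\", "\\\\"))

# Non-POSIX mode leaves backslashes alone.
windows_split = shlex.split(path, posix=False)

print(posix_split)
print(doubled_split)
print(windows_split)
```

Skipping shlex entirely, as suggested above, avoids the problem. Passing posix=False is another option if splitting is still wanted, although non-POSIX mode keeps quote characters in the tokens, so it is not a drop-in replacement.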
When you are creating the string, you need to double each backslash for escaping, and then when the string is passed to your shell, you need to double each backslash again. You can cut the backslashes in half by using a raw string:
cmd = r'C:\\Windows\\System32\\cmd.exe'
\ has special meaning - you're using it as part of an escape sequence. Double up the backslashes, and you have a literal backslash \.
The caveat is that, with only one pair of escaped backslashes, you still have only one literal backslash. You need to escape that backslash, too.
Alternatively, why not just use os.sep instead? You'll be able to ensure your code is more portable (since it'll use the system-specific separator), and you won't have to deal [directly] with escaping backslashes.
As John points out, 4 backslashes aren't necessary when accessing files locally.
One place where 4 backslashes are necessary is when connecting to (generally Windows) servers over SMB or CIFS.
Normally you would just use \\servername\share\
But each one of those backslashes needs to be escaped, hence the 4 backslashes before server names.
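As a short illustration (servername and share are the placeholder names from the text above), the escaped literal and a raw literal describe the same string:

```python
# Four backslashes in the source code, two in the actual string.
share = "\\\\servername\\share\\"

# A raw literal cannot end in a single backslash, so it is appended separately.
raw_share = r"\\servername\share" + "\\"

print(share)
print(share == raw_share)
```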
You could also use subprocess.call():
import subprocess as sp
sp.call(['c:\\program files\\<path>'])

Weird issue during parsing a path in Python

Given these variables:
cardIP="00.00.00.00"
dir="D:\\TestingScript"
mainScriptPath='"\\\\XX\\XX\\XX\\Testing\\SNMP Tests\\Python Script\\MainScript.py"'
When using subprocess.call("cmd /c "+mainScriptPath+" "+dir+" "+cardIP) and print(mainScriptPath+" "+dir+" "+cardIP) I get this:
"\\XX\XX\XX\Testing\SNMP Tests\Python Script\MainScript.py" D:\TestingScript 00.00.00.00
which is what I wanted, OK.
But now, I want the 'dir' variable to be also inside "" because I am going to use dir names with spaces.
So, I do the same thing I did with 'mainScriptPath':
cardIP="00.00.00.00"
dir='"D:\\Testing Script"'
mainScriptPath='"\\\\XX\\XX\\XX\\Testing\\SNMP Tests\\Python Script\\MainScript.py"'
But now, when I'm doing print(mainScriptPath+" "+dir+" "+cardIP) I get:
"\\XX\XX\XX\Testing\SNMP Tests\Python Script\MainScript.py" "D:\Testing Script" 00.00.00.00
Which is great, but when executed in subprocess.call("cmd /c "+mainScriptPath+" "+dir+" "+cardIP) there is a failure with 'mainScriptPath' variable:
'\\XX\XX\XX\Testing\SNMP' is not recognized as an internal or external command...
It doesn't make sense to me.
Why does it fail?
In addition, I tried also:
dir="\""+"D:\\Testing Script"+"\""
This behaves well with print, but raises the same problem in subprocess.call.
(Windows XP, Python3.3)
Use proper string formatting, use single quotes for the formatting string and simply include the quotes:
subprocess.call('cmd /c "{}" "{}" "{}"'.format(mainScriptPath, dir, cardIP))
The alternative is to pass in a list of arguments and have Python take care of quoting for you:
subprocess.call(['cmd', '/c', mainScriptPath, dir, cardIP])
When the first argument to .call() is a list, Python uses the process described under the section Converting an argument sequence to a string on Windows.
On Windows, an args sequence is converted to a string that can be parsed using the following rules (which correspond to the rules used by the MS C runtime):
1. Arguments are delimited by white space, which is either a space or a tab.
2. A string surrounded by double quotation marks is interpreted as a single argument, regardless of white space contained within. A quoted string can be embedded in an argument.
3. A double quotation mark preceded by a backslash is interpreted as a literal double quotation mark.
4. Backslashes are interpreted literally, unless they immediately precede a double quotation mark.
5. If backslashes immediately precede a double quotation mark, every pair of backslashes is interpreted as a literal backslash. If the number of backslashes is odd, the last backslash escapes the next double quotation mark as described in rule 3.
This means that passing in your arguments as a sequence makes Python worry about all the nitty gritty details of escaping your arguments properly, including handling embedded backslashes and double quotes.
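To preview the string Python would build from a list, one option is subprocess.list2cmdline. It is an undocumented helper in CPython's subprocess module, so treat its availability as an assumption. The sketch below also assumes the variables hold bare values with no embedded quotes (the names work_dir, main_script and card_ip are renamed for illustration):

```python
import subprocess

# Bare values: no embedded quotes, raw strings for the backslashes.
card_ip = "00.00.00.00"
work_dir = r"D:\Testing Script"
main_script = r"\\XX\XX\XX\Testing\SNMP Tests\Python Script\MainScript.py"

# Arguments containing spaces are wrapped in double quotes automatically.
cmdline = subprocess.list2cmdline(["cmd", "/c", main_script, work_dir, card_ip])
print(cmdline)

# If a value already carries its own quotes, they are escaped as literal characters.
pre_quoted = subprocess.list2cmdline(['"D:\\Testing Script"'])
print(pre_quoted)
```

The second call shows why pre-quoted values and the list form should not be mixed: the embedded quotes become \" and are passed on as part of the argument.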

Passing a command line argument to Python whose string contains a metacharacter

I am attempting to pass in a string as an input argument to a Python program from the command line, i.e. $python parser_prog.py <pos1> <pos2> --opt1 --opt2, and interpreting these using argparse. Of course, if the string contains any metacharacters, these are first interpreted by the shell, so it needs to be quoted.
This seems to work, strings are passed through literally, preserving the \*?! characters:
$ python parser_prog.py 'str\1*?' 'str2!'
However, when I attempt to pass through a '-' (hyphen) character, I cannot seem to mask it. It is interpreted as an invalid option.
$ python parser_prog.py 'str\1*?' '-str2!'
I have tried single and double quotes, is there a way to make sure Python interprets this as a raw string? (I'm not in the interpreter yet, this is on the shell command line, so I can't use pythonic expressions such as r'str1')
Thank you for any hints!
As you said yourself, Python only sees the strings after they have been processed by the shell. The command-line arguments '-f' and -f look identical to the called program, and there is no way to distinguish them. That said, argparse supports a -- argument to denote the end of the options, and everything after it is treated as a positional argument.
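A minimal sketch of the -- separator, with a parser that mirrors the <pos1> <pos2> --opt1 --opt2 layout from the question (the store_true flags are an assumption about how the options are defined):

```python
import argparse

parser = argparse.ArgumentParser(prog="parser_prog.py")
parser.add_argument("pos1")
parser.add_argument("pos2")
parser.add_argument("--opt1", action="store_true")
parser.add_argument("--opt2", action="store_true")

# Equivalent to: python parser_prog.py --opt1 -- 'str\1*?' '-str2!'
argv = ["--opt1", "--", "str\\1*?", "-str2!"]
args = parser.parse_args(argv)

print(args.pos1, args.pos2, args.opt1, args.opt2)
```

Options have to come before the --, since everything after it is taken as positional.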
