ValueError: No escaped character for python shlex.split - python

I have a string as follows
mystring1=xcopy /Q /Y d:\\Program Files\\TestData\\*.* c:\\Program Files\\TestData\\Company name\\
mystring2=xcopy '/Q' '/Y' 'd:\tj\tjData\\' "c:\Program Files\TestData\\Company name\\"
I used shlex module as follows
mylist1=shlex.split(mystring1)
mylist2=shlex.split(mystring2)
but I am getting an error:
ValueError: No escaped character
mylist1 value should be [xcopy,/Q,/Y,d:\Program Files\TestData\,c:\Program Files\TestData\Company name\]
and
mylist2 value should be [xcopy,/Q,/Y,d:\tj\tjData\,c:\Program Files\TestData\Company name\]

Well, I'm not sure I understand what you want to do, but on the one hand I see a Windows user, and on the other hand I see a posix option in the manual.
So I thought: "posix=False" is for him.
And here is what it gives:
>>> mystring1
'xcopy /Q /Y d:\\Program Files\\TestData\\*.* c:\\Program Files\\TestData\\Company name\\'
>>> split(mystring1, posix=False)
['xcopy', '/Q', '/Y', 'd:\\Program', 'Files\\TestData\\*.*', 'c:\\Program', 'Files\\TestData\\Company', 'name\\']
>>> mystring2
'xcopy \'/Q\' \'/Y\' \'d:\tj\tjData\\\' "c:\\Program Files\\TestData\\Company name"'
>>> split(mystring2, posix=False)
['xcopy', "'/Q'", "'/Y'", "'d:\tj\tjData\\'", '"c:\\Program Files\\TestData\\Company name"']
Character escaping may not be exactly what you need, but as I rarely use Windows, I will venture no further on this point.
Edit: since I know it is not always easy to navigate the documentation when you start on a subject, here are some links:
shlex <= you should always RTFM. At least twice.
Python Lexical Analysis <= It may not be obvious, but it will change your mind.
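To make the behaviour above concrete, here is a minimal sketch. The two command strings are shortened, hypothetical stand-ins for the ones in the question. In POSIX mode (the default) a backslash starts an escape, so a command that ends in a backslash has nothing left to escape and shlex raises the ValueError. With posix=False the backslash is an ordinary character, but quotes stay attached to the tokens, so the sketch strips them with a small helper (strip_quotes is an illustrative name, not part of shlex).

```python
import shlex

# Hypothetical commands, shortened from the question.
unquoted = 'xcopy /Q /Y d:\\tj\\tjData\\'                 # ends with one backslash
quoted = 'xcopy /Q "c:\\Program Files\\TestData\\"'       # quoted path with a space

# POSIX mode: the trailing backslash is an escape with no character after it.
try:
    shlex.split(unquoted)
    posix_error = None
except ValueError as exc:
    posix_error = str(exc)

# Non-POSIX mode: backslashes are kept, and so are the surrounding quotes.
tokens = shlex.split(quoted, posix=False)


def strip_quotes(token):
    """Remove one pair of matching quotes around a token, if present."""
    if len(token) >= 2 and token[0] == token[-1] and token[0] in "\"'":
        return token[1:-1]
    return token


cleaned = [strip_quotes(t) for t in tokens]

print(posix_error)
for t in cleaned:
    print(t)
```

Note that an unquoted path containing spaces is still split at every space in either mode, so the first string in the question cannot be recovered without adding quotes around each path.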

The formatting of the input values is really bad.
Consider reading the formatting help.
Which string causes an error?
A first look at your input: The backslash character has a special meaning in Python strings.
So when the path is:
s = 'C:\MSDOS'
you have to write:
s = 'C:\\MSDOS'
The first backslash says: "Attention! The next character is not meant to have a special function", the second backslash is the character itself.
Have a look at http://docs.python.org/release/2.5.2/ref/strings.html
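A small sketch of the point about backslashes, using a path fragment taken from the question. It compares a plain literal, a literal with doubled backslashes, and a raw literal; the variable names are only for illustration.

```python
plain = 'd:\tj'      # \t is read as a single TAB character
escaped = 'd:\\tj'   # doubled backslash: one real backslash in the string
raw = r'd:\tj'       # raw literal: the backslash is kept as typed

print(repr(plain))               # shows the TAB as \t
print(escaped == raw)            # both hold the same five characters
print(len(plain), len(raw))
```

This is why 'd:\tj\tjData' in the question shows up with tabs in it, while the doubled or raw forms keep the path intact.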

Related

In Python, with subprocess.Popen, is it possible to pass literal quotes to the command to be run, when Popen's command line parameter is in list form?

In Python, with subprocess.Popen, is it possible to pass literal quotes as an argument, when the command and its parameters are in list form?
I'll explain further what I mean. Some commands can have literal quotes in their arguments e.g. I'm trying "C:\Program Files (x86)\Google\Chrome\Application\chrome.exe" --profile-directory="Profile 1" Some might even require them.
Note that one answer points out that technically it is possible to get Chrome from the command line to launch whatever profile, without passing a literal quote C:\Users\User>"C:\Program Files....\chrome.exe" "--profile-directory=Profile 2"
Nevertheless, i'm asking about passing literal quotes so "C:\Program Files (x86)\Google\Chrome\Application\chrome.exe" --profile-directory="Profile 1"
For simplicity's sake i'll use calc.exe since it's in the path.
import time
import subprocess
proc=subprocess.Popen("calc.exe"+" "+'--profile-directory="Profile 3"')
proc2=subprocess.Popen(["calc.exe",'--profile-directory="Profile 4"'])
time.sleep(3)
proc.wait()
proc2.wait()
Now look at the difference in the command line as visible in task manager or via wmic.
C:\Users\User>wmic process where caption="calc.exe" get commandline | findstr calc
c:\windows\system32\calc.exe --profile-directory="Profile 3"
c:\windows\system32\calc.exe "--profile-directory=\"Profile 4\""
C:\Users\User>
You can see this from the python interpreter
>>> subprocess.Popen(["c:/windows/system32/calc.exe","abc"+'"'+"def"])
...
>>>
>>> subprocess.run("C:\Windows\System32\wbem\WMIC.exe process where caption=\"calc.exe\" get commandline")
...
c:/windows/system32/calc.exe abc\"def
....
>>>
You see it's sticking a backslash in there.
Some comments regarding some suggestions given.
One suggestion assumes that --profile-directory="Profile 1" is the same as --profile-directory "Profile 1", i.e. that you can replace the = with a space and Chrome will work the same. That isn't the case. Writing subprocess.Popen(["C:\...\chrome.exe", "--profile-directory", "Profile 3"]) will indeed produce "C:\....\chrome.exe" --profile-directory "Profile 3", but that won't work: Chrome either does not open at all, or opens a browser window that offers profiles to click on. The equals sign is necessary.
Another suggestion does
subprocess.Popen(
    " ".join(
        [
            "C:\\Program Files\\Google\\Chrome\\Application\\chrome.exe",
            '--profile-directory="Person 1"',
        ]
    )
)
That's not passing a list to Popen, that's passing a list to join, and join is converting it to a string.
Another suggestion is
subprocess.Popen('C:\\Program Files (x86)\\Google\\Chrome\\Application\\chrome.exe --profile-directory="Profile 3"')
That's using a string. But as you see from my question, I managed it using a string. I'm asking about using a list.
Another suggestion suggested "--profile-directory='Profile 1'"
If I run chrome with --profile-directory="Profile 1" I get a particular profile that I use sometimes. But if running chrome with "--profile-directory='Profile 1'" Then it doesn't load up that profile. It loads up a blank profile. And going to chrome://version shows "'profile 1'" rather than "profile 1" It's like a different profile, like you may as well have said chrome.exe --profile-directory="profile A". And it also creates directories starting with ' like C:\Users\User\AppData\Local\Google\Chrome\User Data\'Profile 1234' that should be removed.
Another suggestion suggested
subprocess.Popen(
    [
        "C:\\Program Files (x86)\\Google\\Chrome\\Application\\chrome.exe",
        "--profile-directory=Profile 1",
    ]
)
That is interesting because it does "C:\...chrome.exe" "--profile-directory=Profile 1"
And it does in fact load Chrome with the specified profile, though it doesn't try to pass literal quotes!
My question asks about passing literal quotes. It's as if it assumes a Linux shell and inserts a backslash before the quote, which on Linux would ensure the quote makes it past the shell and to the program being run. Though I'm not sure it would even go through the shell on Linux. E.g. on Windows, if I stick a cmd escape character like ^ in there, so "--pro^file-directory=Profile 1", then the ^ just gets passed literally. So the cmd shell doesn't intervene.
Why is it that on Windows, subprocess.Popen calls list2cmdline when passed a list, which (and here's the big 'why') then adds a backslash before any literal double quote within a string? It means that when passing a list to Popen rather than a string, you can't pass a literal double quote. So why does it add that backslash?
I did hear a suggestion that looking at argv on Windows vs Linux might show a difference. I'm not sure that it would, since both implement C.
I don't see why Popen in any situation should behave as though Windows needs a backslash inserted more than Linux does.
$ cat ./testargs.py
#!/usr/bin/env python3
import sys
print(sys.argv)
C:\blah>type .\testargsw.py
import sys
print(sys.argv)
in both cases
C:\blah>.\testargsw.py abc\^"def
['C:\\Users\\User\\testargsw.py', 'abc"def']
>.\testargsw.py abc\"def
['C:\\Users\\User\\testargsw.py', 'abc"def']
C:\blah>
$ ./testargs.py abc\"def
['./testargs.py', 'abc"def']
Maybe Windows, specifically the MS C Runtime (the code responsible for taking a program's arguments as received from the shell and passing them to main in argv), requires an extra backslash, in the sense that a backslash is needed to escape the double quote (and here it is put in by the user).
That said, I have heard though that looking at what a shell does on Linux is basically misleading, because a major part of the purpose of the subprocess module is to ensure that you can avoid using a shell entirely.
The script example is perhaps not that relevant (it was just something somebody suggested I check). My issue is that Popen, when passed a list, adds a backslash, as shown by the WMIC output (also visible in Task Manager in the command line column).
added
I spoke to a person that has used python for a long time. They said subprocess was added somewhere in 2.x They still use os.popen(). That takes a string not a list. There have been moves to shift people from os.popen to subprocess.Popen https://docs.python.org/3/library/subprocess.html#replacing-os-popen-os-popen2-os-popen3
An issue with subprocess.Popen in Windows, is it has this list feature, that I think behaves funny.
The easy workaround to that is to not use the list feature of it. To not pass it a list. It's a new feature and not necessary. You can pass it a string.
The question includes an example from the python interpreter and shows how (on windows at least), python adds a backslash to the literal quote.
The person I spoke to pointed out to me two documents that relate to that.
A string is a sequence. A sequence could be a string or list or tuple, though in this document they use the term sequence to just mean list or tuple, and they don't mean string when they say sequence.
https://peps.python.org/pep-0324/
"class Popen(args........."
"args should be a string, or a sequence of program arguments"
It mentions about on unix, shell=True and shell=False
And then it says
"On Windows: the Popen class uses CreateProcess() to execute the child program, which operates on strings. If args is a sequence, it will be converted to a string using the list2cmdline method. Please note that not all MS Windows applications interpret the command line the same way: The list2cmdline is designed for applications using the same rules as the MS C runtime."
Technically a string is a sequence, but that document uses the term sequence in a funny way. But what it means is it's that on Windows, if args is not given a string, but is given a list or tuple, then it uses the list2cmdline method.
Be sure to use print otherwise it uses repr() of the string
>>> print(subprocess.list2cmdline(['a', '"b c"']))
a "\"b c\""
>>>
so that's the function that it's using behind the scenes, on windows, that is inserting a backslash in there.
The guy I spoke to pointed me to this document too
https://bugs.python.org/issue11827
a technical user comments, "list2cmdline() in subprocess is publicly accessible (doesn't begin with underscores) but it isn't documented."
And the point is made there that, let's say they made list2cmdline() private, the fact is that what Popen is doing to the list, in Windows, to get the command line, is undocumented.
So the question then becomes: what is the design trying to do, and what is the justification for inserting the backslash? If a programmer wanted to insert a backslash they could do so. It seems to me to make more sense, then, to avoid passing a list to subprocess.Popen.
Windows cmd doesn't even use backslash as an escape character!!!! It uses caret.
C:\Users\User>echo \\
\\
C:\Users\User>echo ^\
\
C:\Users\User>
it's linux eg bash, that uses backslash as an escape character
$ echo \\
\
$
Some executables in windows might want a quote escaped and with a backslash, but then the technical user can do that just as a technical linux user does.
So given that they haven't even documented the "feature" (or bug), how they would justify it, I don't know, but they could start by documenting it!
So I don't understand why passing a list to subprocess.Popen adds a backslash.
I could take the list join it with space and pass it as a string to popen, so it won't add a backslash, but as mentioned, that'd be avoiding the question.
In Python, with subprocess.Popen, is it possible to pass literal
quotes as an argument, when the command and its parameters are in list
form?
I think yes, but to elements of argv, not to the command line.
I'll explain further what I mean. Some commands can have literal
quotes in their arguments e.g. I'm trying "C:\Program Files (x86)\Google\Chrome\Application\chrome.exe" --profile-directory="Profile 1" Some might even require them.
Note that one answer points out that technically it is possible to get
Chrome from the command line to launch whatever profile, without
passing a literal quote C:\Users\User>"C:\Program Files....\chrome.exe" "--profile-directory=Profile 2"
Nevertheless, i'm asking about passing literal quotes so
"C:\Program Files (x86)\Google\Chrome\Application\chrome.exe" --profile-directory="Profile 1"
I think there was some confusion, that I might have resolved now.
When running a binary executable in windows(so, an exe, not a bat file) and giving it arguments
People often speak of how linux and windows behave differently..
Windows doesn't separate arguments out, it just dumps stuff to the command line. And there's no concept of space separating out arguments.
Apparently in Linux, the command line is parsed by the shell and split into arguments, and the arguments are passed by the shell to a POSIX function called execv, which then get passed to main's argv.
Whereas in Windows, there's an aspect of C pertaining to Windows called the MS C Runtime, which parses the command line and splits it into arguments.
So when looking at the command line in Task Manager or from WMIC, it's just a dump when it comes to quotes. At that level the quotes aren't divided into special ones and literal ones. But when the MS C Runtime parses the line, each quote is either special or literal: a special one will not go to argv, and a literal one will. Usually they are all special and none are literal.
>calc """"""""""^G
>wmic process where caption="calc.exe" get commandline | findstr calc
calc """"""""""G
Once we see what's in the command line, we can consider what the arguments are, i.e. what would go to argv.
In the case of the two Chrome examples, there's actually no literal quote, in the sense of a quote that would go to argv, in either example.
Example 1
"C:\Program Files(x86)\Google\Chrome\Application\chrome.exe" --profile-directory="Profile 1"
Example 2
"C:\Program Files....\chrome.exe" "--profile-directory=Profile 2"
Looking at "Example 1"
It's not that there's an argument --profile-directory="Profile 1" that has two literal quotes.
The quotes when read by MS CRT(MS C runtime), will just preserve space keeping that "1" as part of the same argument.
So a program that displays argv will show
C:\blah>type w.c
#include <stdio.h>
int main(int argc, char *argv[]) {
    int i = 0;
    while (argv[i]) {
        printf("argv[%d] = %s\n", i, argv[i]);
        i++;
    }
    return 0;
}
C:\blah>w.exe --profile-directory="Profile 1"
argv[0] = w.exe
argv[1] = --profile-directory=Profile 1
C:\blah>
See, no literal quotes there
In fact, both of these different command lines produce the same thing in argv:
>w.exe "--profile-directory=Profile 1"
argv[0] = w.exe
argv[1] = --profile-directory=Profile 1
>w.exe --profile-directory="Profile 1"
argv[0] = w.exe
argv[1] = --profile-directory=Profile 1
>
So how would you even get a literal quote in there? This link mentions it: https://learn.microsoft.com/en-us/cpp/c-language/parsing-c-command-line-arguments?view=msvc-170
The find command might have funny parsing, but if you are using find or a Windows implementation of grep and wanted to search for a quote, then you'd want one of the argv entries to contain a quote.
And that's when you'd put a backslash in there
>w.exe \"
argv[0] = w.exe
argv[1] = "
>
The command line would have a backslash.. So that the argv would have a literal quote.
And that's what passing a list to subprocess.Popen is all about..
You are putting in what you want to be in the argv.
It is then producing the command line that would result in those things being in the argv.
That's why it's inserting a backslash before the double quotes!
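As a sketch of that conclusion, the conversion can be inspected directly with subprocess.list2cmdline (the undocumented helper mentioned above, which can be called on any platform). The argument lists reuse the calc.exe examples from the question; nothing is launched here.

```python
import subprocess

# Each list element is what the child should see as one argv entry.
argv_plain = ["calc.exe", "--profile-directory=Profile 1"]
argv_quoted = ["calc.exe", '--profile-directory="Profile 4"']

# Build the command line the MS C runtime would parse back into those entries.
line_plain = subprocess.list2cmdline(argv_plain)
line_quoted = subprocess.list2cmdline(argv_quoted)

print(line_plain)
print(line_quoted)
```

The first line gets outer quotes only because the argument contains a space. The second also gets backslashes, because the list asked for quote characters inside the argument itself, which matches the WMIC output shown in the question.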
So..
When I was trying to use a list, I was under the mistaken impression that I needed to get literal quotes into the arguments, and I thought I was forming the command line. Both of those premises were wrong.
I suppose a Python script might have two options, one to run an executable for Linux users and one for Windows users. The command line might differ between them: on the one hand there's the Windows command line (what's typed, or what Windows dumps as it), and on the other hand what's typed on the Linux command line.
With a list, where each element is an element of argv, you let the library do the work. On Windows it uses the list2cmdline function to produce whatever command line is needed for the MS C Runtime to yield the result you want in argv.
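A minimal way to check this end to end, assuming only that a Python interpreter is available as sys.executable: pass a list containing a literal double quote to a child process and have the child report what it received in argv. The child here is a one-line script given with -c, chosen purely for illustration.

```python
import subprocess
import sys

wanted = 'abc"def'   # the exact string the child should find in sys.argv[1]
child_code = "import sys; sys.stdout.write(sys.argv[1])"

# A list is passed, so each element is meant to arrive as one argv entry.
result = subprocess.run(
    [sys.executable, "-c", child_code, wanted],
    capture_output=True,
    text=True,
    check=True,
)
received = result.stdout

print(received)
```

The child sees the quote without any backslash. On Windows the backslash exists only in the intermediate command line, and the child's own argument parsing removes it again.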
side note
In the testargs.py file used in the question, the output would not look as expected (see https://learn.microsoft.com/en-us/cpp/c-language/parsing-c-command-line-arguments?view=msvc-170) for some strings, e.g. `"ab\"c" "\\" d`, because print(sys.argv) shows the repr of each string. Adjust it as follows.
user#comp:~# cat ./testargs.py
#!/usr/bin/env python3
import sys
print(sys.argv)
user#comp:~# cat ./testargs_corrected.py
#!/usr/bin/env python3
import sys
# print(sys.argv) behaves unexpectedly for "ab\"c" "\\" d
# (it shows the repr of each string, with escaping),
# so print each argument on its own line instead.
# The '\\' it shows means a single \, because '\\' is one character in Python,
# but it is much clearer to show it outside the repr form.
for n, a in enumerate(sys.argv):
    print(f"argv[{n}] = {a}")
user#comp:~# ./testargs.py "ab\"c" "\\" d
['./testargs.py', 'ab"c', '\\', 'd']
user#comp:~#
user#comp:~# ./testargs_corrected.py "ab\"c" "\\" d
argv[0] = ./testargs_corrected.py
argv[1] = ab"c
argv[2] = \
argv[3] = d
user#comp:~#
Should do the trick (might wanna pass shell=True to Popen if it doesn't):
subprocess.Popen([r"C:\Program Files (x86)\Google\Chrome\Application\chrome.exe", "--profile-directory", "Profile 3"])
This is possible because --some-flag="some value" is the same as --some-flag "some value"
ChRomE solution (working, omg):
import subprocess
subprocess.Popen(
    [
        "C:\\Program Files (x86)\\Google\\Chrome\\Application\\chrome.exe",
        "--profile-directory=Profile 1",
    ]
)

Replace double backslash in string literal with single backslash

I'm trying to print a string that contains a double backslash (one to escape the other) such that only one of the backslashes is printed. I thought this would happen automatically, but I must be missing some detail.
I have this little snippet:
for path in self.tokenized:
    pdb.set_trace()
    print(self.tokenized[path])
When I debug with that pdb.set_trace() I can see that my strings have double backslashes, and then I enter continue to print the remainder and it prints that same thing.
> /home/kendall/Development/path-parser/tokenize_custom.py(82)print_tokens()
-> print(self.tokenized[path])
(Pdb) self.tokenized[path]
['c:', '\\home', '\\kendall', '\\Desktop', '\\home\\kendall\\Desktop']
(Pdb) c
['c:', '\\home', '\\kendall', '\\Desktop', '\\home\\kendall\\Desktop']
Note that I'm writing a parser that parses Windows file paths -- thus the backslashes.
This is what it looks like to run the program:
kendall#kendall-XPS-8500:~/Development/path-parser$ python main.py -f c:\\home\\kendall\\Desktop
The issue you are having is that you're printing a list, which only knows one way to stringify its contents: repr. repr is only designed for debugging use. Idiomatically, when possible (classes are a notable exception), it outputs a syntactically valid Python expression that can be fed directly into the interpreter to reproduce the original object, hence the escaped backslashes.
Instead, you need to loop through each list, and print each string individually.
You can use str.join() to do this for you.
To get the exact same output, minus the doubled backslashes, you'd need to do something like:
print("[{0}]".format(", ".join(self.tokenized[path])))
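A short sketch of the difference, using a plain list in place of self.tokenized[path] (the values are copied from the debugger output in the question). Each string holds a single backslash; only the list's repr doubles it.

```python
tokens = ['c:', '\\home', '\\kendall']

as_list = str(tokens)                         # uses repr() on every element
joined = "[{0}]".format(", ".join(tokens))    # uses the strings themselves

print(as_list)
print(joined)
print(len(tokens[1]))   # one backslash plus the four letters of "home"
```

The joined form drops the quotes around each element as well, since those are also part of repr.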

Ignore evaluation of '\n' character in sys.argv

Oddly enough, when running this program with the arguments of
program.py "(lp0\nS'cat'\np1\naI5\na."
With program.py being:
import sys,pickle
print sys.argv[1]=="(lp0\nS'cat'\np1\naI5\na."
False is printed... I have narrowed the difference in evaluation to the \n character however I can find no way of ignoring such.
Why is this and how can I fix it?
You need to use raw string literal like this:
sys.argv[1] == r"(lp0\nS'cat'\np1\naI5\na."
Also, you can use a string in the parameters without quotes.
It is because the syntax of strings in Python and in the shell (presumably Bash) is different.
You may want to run the program as
echo $'(lp0\nS\'cat\'\np1\naI5\na.'
program.py $'(lp0\nS\'cat\'\np1\naI5\na.'
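The mismatch can be reproduced without a shell. This sketch assumes, as the answers do, that the shell hands over the text between the double quotes unchanged, so the argument contains a backslash followed by the letter n rather than a newline. The variable names are illustrative.

```python
from_shell = r"(lp0\nS'cat'\np1\naI5\na."   # what sys.argv[1] would hold
in_source = "(lp0\nS'cat'\np1\naI5\na."     # the non-raw literal in program.py

same_before = from_shell == in_source

# Assumption: \n is the only escape that needs translating in this input.
converted = from_shell.replace("\\n", "\n")
same_after = converted == in_source

print(same_before, same_after)
print(len(from_shell) - len(in_source))   # one extra character per \n
```

Either side can be adapted: compare against a raw literal in the program, or have the shell produce real newlines with the $'...' form.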

Quote POSIX shell special characters in Python output

There are times that I automagically create small shell scripts from Python, and I want to make sure that the filename arguments do not contain non-escaped special characters. I've rolled my own solution, that I will provide as an answer, but I am almost certain I've seen such a function lost somewhere in the standard library. By “lost” I mean I didn't find it in an obvious module like shlex, cmd or subprocess.
Do you know of such a function in the stdlib? If yes, where is it?
Even a negative (but definite and correct :) answer will be accepted.
pipes.quote():
>>> from pipes import quote
>>> quote("""some'horrible"string\with lots of junk!$$!""")
'"some\'horrible\\"string\\\\with lots of junk!\\$\\$!"'
Although note that it's arguably got a bug where a zero-length arg will return nothing:
>>> quote("")
''
Probably it would be better if it returned '""'.
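On current Python versions the same job is done by shlex.quote (available since Python 3.3; the pipes module itself was removed in Python 3.13). The sketch below quotes the awkward string from above, with its backslash doubled so the literal is unambiguous, and checks the result by parsing it back with shlex.split. The quoting style differs from the older output shown here: shlex.quote wraps the value in single quotes.

```python
import shlex

awkward = """some'horrible"string\\with lots of junk!$$!"""

quoted = shlex.quote(awkward)
empty = shlex.quote("")          # an empty argument is quoted explicitly

# Build a command line and parse it back using POSIX shell rules.
round_trip = shlex.split(f"echo {quoted} {empty}")

print(quoted)
print(repr(empty))
print(round_trip == ["echo", awkward, ""])
```

The empty-string case mentioned above is handled: shlex.quote("") returns a pair of single quotes rather than nothing.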
The function I use is:
def quote_filename(filename):
    return '"%s"' % (
        filename
        .replace('\\', '\\\\')
        .replace('"', '\\"')
        .replace('$', '\\$')
        .replace('`', '\\`')
    )
that is: I always enclose the filename in double quotes, and then escape the only characters that are special inside double quotes.

Dealing with BACKSLASH character in non-string literals in Python

I have the following string read from an XML element, and it is assigned to a variable called filename. I don't know how to make this any clearer than saying filename = the following string, without leading someone to think that I have a string literal.
\\server\data\uploads\0224.1307.Varallo.mov
when I try and pass this to
os.path.basename(filename)
I get the following
\\server\\data\\uploads\x124.1307.Varallo.mov
I tried filename.replace('\\','\\\\') but that doesn't work either. os.path.basename(filename) then returns the following.
\\\\server\\data\\uploads\\0224.1307.Varallo.mov
Notice that the \0 is now not being converted to \x but now it doesn't process the string at all.
what can I do to my filename variable to get this String in a proper state so that os.path.basename() will actually give me back the basename. I am on OSX so the uncpath stuff is not available.
All attempts to replace the \ with \\ manually fail because of the \0 getting converted to \x in the beginning of the basename.
NOTE: this is NOT a string literal so r'' doesn't work.
We need more information. What exactly is in the variable filename? To answer, use print repr(filename) and add the results to your question above.
Wild guess
DISCLAIMER: This is a guess - try:
import ntpath
print ntpath.basename(filename)
All the downvoting in the world won't change the fact that you're doing it wrong. os.path is for native paths. \\foo\bar\baz is not an OS X path, it's a Windows UNC. posixpath is not equipped to handle UNCs; ntpath is.
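A minimal sketch of the guess above, written for Python 3 and using a raw literal only so the example is self-contained (in the question the value comes from XML). It compares ntpath and posixpath on the same UNC path. The last line is a side observation and an assumption about the cause: the \x12 in the question's output is what an octal escape in a non-raw literal produces.

```python
import ntpath
import posixpath

filename = r'\\server\data\uploads\0224.1307.Varallo.mov'

windows_view = ntpath.basename(filename)     # understands backslash separators
posix_view = posixpath.basename(filename)    # only splits on forward slashes

print(windows_view)
print(posix_view == filename)

# In a non-raw literal, \022 is an octal escape for the character \x12.
mangled = 'uploads\0224'
print(repr(mangled))
```

ntpath can be imported on any platform, so this works on OS X even though os.path there is posixpath.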
