Why does PyCharm use double backslash to indicate escaping?

Why does PyCharm use double backslash to indicate escaping? - python

For instance, I write a normal string and another "abnormal" string like this:
Now I debug it, finding that in the debug tool, the "abnormal" string will be shown like this:
Here's the question:
Why does PyCharm show double backslashes instead of a single backslash? As is known to all, \' means '. Is there any trick?

What I believe is happening is the ' in your c variable string needs to be escaped and PyCharm knows this at runtime, given you have surrounded the full string in " (You'll notice in the debugger, your c string is now surrounded by '). To escape the single quote it changes it to \', but now, there is a \ in your string that needs escaping, and to escape \ in Python, you type \\.
EDIT
Let me see if I can explain the order of escaping going on here.
"u' this is not normal" is assigned to c
PyCharm converts the string in c to 'u' this is not normal' at runtime. See how, without escaping the 2nd ', your string is now closed off right after u.
PyCharm escapes the ' automatically for you by adding a slash before it. The string is now 'u\' this is not normal'. At this point, everything should be fine but PyCharm may be taking an additional step for safety.
PyCharm then escapes the slash it just added to your string, leaving the string as: 'u\\' this is not normal'.
It is likely a setting inside PyCharm. Does it cause an actual issue with your code?

Related

re.escape returns unusable directory

using re.escape() on this directory:
C:\Users\admin\code
Should theoratically return this, right?
C:\\Users\\admin\\code
However, what I actually get is this:
C\:\\Users\\admin\\code
Notice the backslash immediately after C. This makes the string unusable, and trying to use directory.replace('\', '') just bugs out Python because it can't deal with a single backslash string, and treats everything after it as string.
Any ideas?
Update
This was a dumb question :p

No it should not. It's help says "Escape all the characters in pattern except ASCII letters, numbers and '_'"
What you are reporting you are getting is after calling the print function on the resulting string. In console, if you type directory and press enter, it would give something like: C\\:\\\\Users\\\\admin\\\\code. When using directory.replace('\\','') it would replace all backslashes. For example: directory.replace('\\','x') gives Cx:xxUsersxxadminxxcode. What might work in this case is replacing both the backslash and colon with ':' i.e. directory.replace('\\:',':'). This will work.
However, I will suggest doing something else. A neat way to work with Windows directories in Python is to use forward slash. Python and the OS will work out a way to understand your paths with forward slashes. Further, if you aren't using absolute paths, as far as the paths are concerned, your code will be portable to Unix-style OSes.
It also seems to me that you are calling re.escape unnecessarily. If the printing the directory is giving you C:\Users\admin\code then it's a perfectly fine directory to use already. And you don't need to escape it. It's already done. If it wasn't escaped print('C:\Users\admin\code') would give something like C:\Usersdmin\code since \a has special meaning (beep).

Replace double backslash in string literal with single backslash

I'm trying to print a string that contains double backslash (one to escape the other) such that only one of the backslashes are printed. I thought this would happen automatically, but I must be missing some detail.
I have this little snippet:
for path in self.tokenized:
pdb.set_trace()
print(self.tokenized[path])
When I debug with that pdb.set_trace() I can see that my strings have double backslashes, and then I enter continue to print the remainder and it prints that same thing.
> /home/kendall/Development/path-parser/tokenize_custom.py(82)print_tokens()
-> print(self.tokenized[path])
(Pdb) self.tokenized[path]
['c:', '\\home', '\\kendall', '\\Desktop', '\\home\\kendall\\Desktop']
(Pdb) c
['c:', '\\home', '\\kendall', '\\Desktop', '\\home\\kendall\\Desktop']
Note that I'm writing a parser that parses Windows file paths -- thus the backslashes.
This is what it looks like to run the program:
kendall#kendall-XPS-8500:~/Development/path-parser$ python main.py -f c:\\home\\kendall\\Desktop

The issue you are having is that you're printing a list, which only knows one way to stringify its contents: repr. repr is only designed for debugging use. Idiomatically, when possible (classes are a notable exception), it outputs a syntactically valid python expression that can be directly fed into the interpretter to reproduce the original object - hence the escaped backslashes.
Instead, you need to loop through each list, and print each string individually.
You can use str.join() to do this for you.
To get the exact same output, minus the doubled backslashes, you'd need to do something like:
print("[{0}]".format(", ".join(self.tokenized[path])))

removing weird double quotes (from excel file) in python string

I'm loading in an excel file to python3 using xlrd. They are basically lines of text in a spreadsheet. On some of these lines are quotation marks. For example, one line can be:
She said, "My name is Jennifer."
When I'm reading them into python and making them into strings, the double quotes are read in as a weird double quote character that looks like a double quote in italics. I'm assuming that somewhere along the way, python read in the character as some foreign character rather than actual double quotes due to some encoding issue or something. So in the above example, if I assign that line as "text", then we'll have something like the following (although not exactly since I don't actually type out the line, so imagine "text" was already assigned beforehand):
text = 'She said, “My name is Jennifer.”'
text[10] == '"'
The second line will spit out a False because it doesn't seem to recognize it as a normal double quote character. I'm working within the Mac terminal if that makes a difference.
My questions are:
1. Is there a way to easily strip these weird double quotes?
2. Is there a way when I read in the file to get python to recognize them as double quotes properly?

I'm assuming that somewhere along the way, python read in the character as some foreign character
Yes; it read that in because that's what the file data actually represents.
rather than actual double quotes due to some encoding issue or something.
There's no issue with the encoding. The actual character is not an "actual double quote".
Is there a way to easily strip these weird double quotes?
You can use the .replace method of strings as you would normally, to either replace them with an "actual double quote" or with nothing.
Is there a way when I read in the file to get python to recognize them as double quotes properly?
If you're looking for them, you can compare them to the character they actually are.
As noted in the comment, they are most likely U+201C LEFT DOUBLE QUOTATION MARK and U+201D RIGHT DOUBLE QUOTATION MARK. They're used so that opening and closing quotes can look different (by curving in different directions), which pretty typography normally does (as opposed to using " which is simply more convenient for programmers). You represent them in Python with a Unicode escape, thus:
text[10] == '\u201c'
You could also have directly asked Python for this info, by asking for text[10] at the Python command line (which would evaluate that and show you the representation), or explicitly in a script with e.g. print(repr(text[10])).

Weird issue during parsing a path in Python

Given this variables:
cardIP="00.00.00.00"
dir="D:\\TestingScript"
mainScriptPath='"\\\\XX\\XX\\XX\\Testing\\SNMP Tests\\Python Script\\MainScript.py"'
When using subprocess.call("cmd /c "+mainScriptPath+" "+dir+" "+cardIP) and print(mainScriptPath+" "+dir+" "+cardIP) I get this:
"\\XX\XX\XX\Testing\SNMP Tests\Python Script\MainScript.py" D:\TestingScript 00.00.00.00
which is what I wanted, OK.
But now, I want the 'dir' variable to be also inside "" because I am going to use dir names with spaces.
So, I do the same thing I did with 'mainScriptPath':
cardIP="00.00.00.00"
dir='"D:\\Testing Script"'
mainScriptPath='"\\XX\\XX\\XX\\Testing\\SNMP Tests\\Python Script\\MainScript.py"'
But now, when I'm doing print(mainScriptPath+" "+dir+" "+cardIP) I get:
"\\XX\XX\XX\Testing\SNMP Tests\Python Script\MainScript.py" "D:\Testing Script" 00.00.00.00
Which is great, but when executed in subprocess.call("cmd /c "+mainScriptPath+" "+dir+" "+cardIP) there is a failure with 'mainScriptPath' variable:
'\\XX\XX\XX\Testing\SNMP' is not recognized as an internal or external command...
It doesn't make sense to me.
Why does it fail?
In addition, I tried also:
dir="\""+"D:\\Testing Script"+"\""
Which in 'print' acts well but in 'subprocess.call' raise the same problem.
(Windows XP, Python3.3)

Use proper string formatting, use single quotes for the formatting string and simply include the quotes:
subprocess.call('cmd /c "{}" "{}" "{}"'.format(mainScriptPath, dir, cardIP))
The alternative is to pass in a list of arguments and have Python take care of quoting for you:
subprocess.call(['cmd', '/c', mainScriptPath, dir, cardIP])
When the first argument to .call() is a list, Python uses the process described under the section Converting an argument sequence to a string on Windows.
On Windows, an args sequence is converted to a string that can be
parsed using the following rules (which correspond to the rules used
by the MS C runtime):
Arguments are delimited by white space, which is either a space or a tab.
A string surrounded by double quotation marks is interpreted as a single argument, regardless of white space contained within. A quoted
string can be embedded in an argument.
A double quotation mark preceded by a backslash is interpreted as a literal double quotation mark.
Backslashes are interpreted literally, unless they immediately precede a double quotation mark.
If backslashes immediately precede a double quotation mark, every pair of backslashes is interpreted as a literal backslash. If the
number of backslashes is odd, the last backslash escapes the next
double quotation mark as described in rule 3.
This means that passing in your arguments as a sequence makes Python worry about all the nitty gritty details of escaping your arguments properly, including handling embedded backslashes and double quotes.

Python, trying to run a program from the command prompt

I am trying to run a program from the command prompt in windows. I am having some issues. The code is below:
commandString = "'C:\Program Files\WebShot\webshotcmd.exe' //url '" + columns[3] + "' //out '"+columns[1]+"~"+columns[2]+".jpg'"
os.system(commandString)
time.sleep(10)
So with the single quotes I get "The filename, directory name, or volume label syntax is incorrect." If I replace the single quotes with \" then it says something to the effect of "'C:\Program' is not a valid executable."
I realize it is a syntax error, but I am not quite sure how to fix this....
column[3] contains a full url copy pasted from a web browser (so it should be url encoded). column[1] will only contain numbers and periods. column[2] contains some text, double quotes and colons are replaced. Mentioning just in case...
Thanks!

Windows requires double quotes in this situation, and you used single quotes.
Use the subprocess module rather than os.system, which is more robust and avoids calling the shell directly, making you not have to worry about confusing escaping issues.
Dont use + to put together long strings. Use string formatting (string %s" % (formatting,)), which is more readable, efficient, and idiomatic.
In this case, don't form a long string as a shell command anyhow, make a list and pass it to subprocess.call.
As best as I can tell you are escaping your forward slash but not your backslashes, which is backwards. A string literal with // has both slashes in the string it makes. In any event, rather than either you should use the os.path module which avoids any confusion from parsing escapes and often makes scripts more portable.

Use the subprocess module for calling system commands. Also ,try removing the single quotes and use double quotes.

Develop Reference

Python is a programming language that lets you work quickly and integrate systems more effectively.