Shouldn't r'\' be a valid string value in python? [duplicate] - python

Given that r'\\' is equivalent to '\\\\', why isn't r'\' equivalent to '\\'?
What I got on Python 3.2 was:
print(r'\')
File "<stdin>", line 1
print(r'\')
^
SyntaxError: EOL while scanning string literal

You cannot have a backslash as the last character in a raw string unless it is part of an even number of backslashes; a lone trailing backslash escapes the closing quote.
Compare this to:
>>> r'\ '
'\\ '
From the string literal documentation:
When an 'r' or 'R' prefix is present, a character following a backslash is included in the string without change, and all backslashes are left in the string. For example, the string literal r"\n" consists of two characters: a backslash and a lowercase 'n'. String quotes can be escaped with a backslash, but the backslash remains in the string; for example, r"\"" is a valid string literal consisting of two characters: a backslash and a double quote; r"\" is not a valid string literal (even a raw string cannot end in an odd number of backslashes). Specifically, a raw string cannot end in a single backslash (since the backslash would escape the following quote character).
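One way to check these rules without putting an invalid literal in your own file is to hand the literal to compile() as source text. This is only an illustrative sketch; the helper name is_valid_literal is made up for this example.

```python
def is_valid_literal(source):
    """Return True if `source` compiles as a Python expression."""
    try:
        compile(source, "<demo>", "eval")
        return True
    except SyntaxError:
        return False

# An even number of trailing backslashes is fine: r'\\' holds two backslashes.
assert r'\\' == '\\\\'
assert len(r'\\') == 2

# A backslash followed by another character is kept as-is.
assert r'\ ' == '\\ '

# An escaped quote stays in the string together with its backslash.
assert r"\"" == '\\"'

# The source text r'\' (built here from ordinary literals) does not compile.
lone_backslash_source = "r'" + "\\" + "'"
print(is_valid_literal(lone_backslash_source))  # False

# The source text r'\\' does compile.
print(is_valid_literal("r'\\\\'"))              # True
```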

Related

Why is the 'r' before strings in python so important? [duplicate]

I first saw it used in building regular expressions across multiple lines as a method argument to re.compile(), so I assumed that r stands for RegEx.
For example:
import re

regex = re.compile(
    r'^[A-Z]'
    r'[A-Z0-9-]'
    r'[A-Z]$', re.IGNORECASE
)
So what does r mean in this case? Why do we need it?
The r means that the string is to be treated as a raw string: backslash escape sequences in it are not processed. (It does not stand for RegEx, although raw strings are commonly used for regular expressions.)
For example:
'\n' is treated as a single newline character, while r'\n' is treated as two characters, \ followed by n.
When an 'r' or 'R' prefix is present, a character following a backslash is included in the string without change, and all backslashes are left in the string. For example, the string literal r"\n" consists of two characters: a backslash and a lowercase 'n'. String quotes can be escaped with a backslash, but the backslash remains in the string; for example, r"\"" is a valid string literal consisting of two characters: a backslash and a double quote; r"\" is not a valid string literal (even a raw string cannot end in an odd number of backslashes). Specifically, a raw string cannot end in a single backslash (since the backslash would escape the following quote character). Note also that a single backslash followed by a newline is interpreted as those two characters as part of the string, not as a line continuation.
Source: Python string literals
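As a small sketch of the difference, the snippet below compares the two spellings of \n and then rebuilds the pattern from the question. Note that the question's pattern contains no backslashes at all, so in that particular case the r prefix changes nothing; the name pattern_text is introduced here only for illustration.

```python
import re

# Without the prefix, \n is one character (a newline).
print(len('\n'))        # 1
# With the prefix, the backslash and the n are both kept.
print(len(r'\n'))       # 2
print(r'\n' == '\\n')   # True

# Adjacent string literals are joined into one string, so the three raw
# pieces from the question form a single pattern.
pattern_text = (
    r'^[A-Z]'
    r'[A-Z0-9-]'
    r'[A-Z]$'
)
print(pattern_text)     # ^[A-Z][A-Z0-9-][A-Z]$

regex = re.compile(pattern_text, re.IGNORECASE)
print(bool(regex.match('a-z')))  # True
```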
It means that escapes won’t be translated. For example:
r'\n'
is a string with a backslash followed by the letter n. (Without the r it would be a newline.)
b does stand for byte-string and is used in Python 3, where strings are Unicode by default. In Python 2.x strings were byte-strings by default and you’d use u to indicate Unicode.
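A minimal Python 3 sketch of the b prefix, and of combining it with r, might look like this. The sample word is arbitrary and the variable names are just for the example.

```python
text = 'caf\u00e9'       # str: a sequence of Unicode code points
data = b'caf\xc3\xa9'    # bytes: the same word encoded as UTF-8

print(type(text).__name__)            # str
print(type(data).__name__)            # bytes
print(data.decode('utf-8') == text)   # True

# The r prefix can be combined with b; escapes are then left untouched.
print(len(b'\n'))    # 1
print(len(br'\n'))   # 2
```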

Unexpected behaviour in interactive python

>>> r'\'
SyntaxError: EOL while scanning string literal
I expected '\\' as output.
As per Python documentation https://docs.python.org/3/reference/lexical_analysis.html#string-and-bytes-literals:
shortstring ::= "'" shortstringitem* "'" | '"' shortstringitem* '"'
shortstringitem ::= shortstringchar | stringescapeseq
shortstringchar ::= <any source character except "\" or newline or the quote>
stringescapeseq ::= "\" <any source character>
Looking at the definition of the lexical element shortstringchar, it is clear that even in a raw string r'' or r"", a backslash followed by any character is consumed as a stringescapeseq and becomes part of the string content. The single quote that follows the backslash is therefore not read as the end of the raw string.
While not intuitive, the lexical analyzer is designed this way. This is decided while the source is being split into tokens, before the parser runs (CPython used an LL(1) parser through 3.8 and has used a PEG parser since 3.9).
If you really want a string with only one single backslash, use '\\' or "\\" (no r prefix).
See the last paragraph of the definition of string literals:
Even in a raw literal, quotes can be escaped with a backslash, but the backslash remains in the result; for example, r"\"" is a valid string literal consisting of two characters: a backslash and a double quote; r"\" is not a valid string literal (even a raw string cannot end in an odd number of backslashes). Specifically, a raw literal cannot end in a single backslash (since the backslash would escape the following quote character). Note also that a single backslash followed by a newline is interpreted as those two characters as part of the literal, not as a line continuation.
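If you want the convenience of a raw literal but need the value to end in a backslash, a couple of common workarounds are sketched below. The Windows-style path is a made-up example, not something from the question.

```python
single = '\\'                    # ordinary literal with an escaped backslash
print(len(single))               # 1

# Raw literal for the body, ordinary literal for the trailing backslash.
# Adjacent literals are concatenated, and the two kinds can be mixed.
folder = r'C:\temp' '\\'
print(folder)                    # C:\temp\

# Alternative: write the raw literal with an extra character and slice it off.
folder_sliced = r'C:\temp\ '[:-1]
print(folder_sliced == folder)   # True
```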

Python: Why do raw strings require backslash to be escaped? [duplicate]

This explanation is from the python documentation:
Both string and bytes literals may optionally be prefixed with a letter 'r' or 'R'; such strings are called raw strings and treat backslashes as literal characters. As a result, in string literals, '\U' and '\u' escapes in raw strings are not treated specially. Given that Python 2.x’s raw unicode literals behave differently than Python 3.x’s the 'ur' syntax is not supported.
If raw strings treat backslashes as char literals, why does the backslash need to be escaped in the expression:
re.compile(r"'\\'")
Instead of just being able to write:
re.compile(r"'\'")
To capture a single backslash when using the re module?
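Because '\' has a special meaning in re itself: in the language you use to define a regular expression, it escapes the character after it. So if you want to match '+' as a literal character, your pattern is '\+'. In the same way, to match a literal backslash the pattern must contain two backslashes, which is what r'\\' gives you. The raw string only stops Python from interpreting the backslashes; the regex engine still does.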
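The two layers of escaping can be seen in a short sketch like the following, which uses only the standard re module. The sample text is arbitrary.

```python
import re

text = 'a\\b'    # three characters: a, backslash, b

# The pattern must contain two backslashes: the regex engine reads them
# as "match one literal backslash".
raw_pattern = re.compile(r'\\')
plain_pattern = re.compile('\\\\')   # same pattern without the r prefix

print(raw_pattern.pattern == plain_pattern.pattern)  # True
print(raw_pattern.findall(text))     # ['\\']  (one match: a single backslash)

# re.escape produces the escaped form for you.
print(re.escape('\\') == r'\\')      # True

# A pattern holding only a single backslash is rejected by the re module.
try:
    re.compile('\\')
except re.error as exc:
    print('re.error:', exc)
```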

Replace \\ with \ in python [duplicate]

I have a string containing a folder path, like below:
>>> path
'\\\\sdgte\\ssdfdaa\\asfdsf'
I want to replace \\ with \ . I tried str.replace, but it does not work, as shown below:
>>> path.replace('\\','\')
File "<input>", line 1
path.replace('\\','\')
^
SyntaxError: EOL while scanning string literal
Any help will be highly appreciated.
There is no "\\" in the string. If you print it instead of looking at its representation you'll see the value that the string actually contains.
>>> print path
\\sdgte\ssdfdaa\asfdsf
You should use the escape character '\' to escape each \ in your string:
path.replace('\\\\','\\')
You probably don't need to replace anything. \ is a special character in Python string literals: it starts an escape sequence. That is, if you want a string containing a backslash, you would type "\\":
>>> len('\\')
1
>>> print '\\'
\
>>> print '\\\\foo\\bar'
\\foo\bar
>>>
The reason you're getting that SyntaxError is the same reason you're seeing the doubled backslashes to begin with: backslash is the "escape" character, used to indicate the start of a special sequence, like "\n" for line feed, which would otherwise be difficult to represent in a string. The backslash character itself therefore has to be represented by a double backslash.
On the other hand, if you don't need to use any escape sequences within a string, you can preface the string with "r" instead of doubling the backslashes:
path.replace(r'\\', '\\')
"r" indicates a "raw" string. The second argument is written as '\\' rather than r'\' because a raw string cannot end in a single backslash.
The problem you are running into is that \ is an escape character. Instead of reading your call as
replace '\\' with '\'
Python reads '\\' as a single backslash character, and then reads \' as an escaped single quotation mark that belongs to the second string. You get the error because that second string literal is never closed: the quotation mark you meant as its end was consumed by the escape.
What you want is:
path.replace('\\\\', '\\')
You have to escape all backslashes because they are special.
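To see the difference between what the string contains and how the interpreter displays it, here is a short sketch using the path from the question (written for Python 3, so print is a function).

```python
path = '\\\\sdgte\\ssdfdaa\\asfdsf'

print(repr(path))   # '\\\\sdgte\\ssdfdaa\\asfdsf'  (escaped representation)
print(path)         # \\sdgte\ssdfdaa\asfdsf        (actual contents)

# The same value written as a raw literal.
print(path == r'\\sdgte\ssdfdaa\asfdsf')   # True

# Replaces every pair of real backslashes with one; only the leading
# pair qualifies in this path.
shortened = path.replace('\\\\', '\\')
print(shortened)    # \sdgte\ssdfdaa\asfdsf
```

If the goal was only to get rid of the doubled backslashes seen at the prompt, no replacement is needed, since they exist only in the repr.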

Escape characters in raw Python string [duplicate]

I was under the impression that in Python a raw string, written as r'this is a raw string' would omit any escape characters, and print EXACTLY what is between the quotes. My problem is that when I try print r'\' I get SyntaxError: EOL while scanning string literal. print r'\n' correctly prints \n, though.
Quoting the documentation:
When an 'r' or 'R' prefix is present, a character following a backslash is included in the string without change, and all backslashes are left in the string. For example, the string literal r"\n" consists of two characters: a backslash and a lowercase 'n'. String quotes can be escaped with a backslash, but the backslash remains in the string; for example, r"\"" is a valid string literal consisting of two characters: a backslash and a double quote; r"\" is not a valid string literal (even a raw string cannot end in an odd number of backslashes). Specifically, a raw string cannot end in a single backslash (since the backslash would escape the following quote character).
Added emphasis mine.
Raw strings thus do attach some meaning to a backslash, but only where quotes are concerned.
From the Python docs:
If we make the string literal a “raw” string, \n sequences are not converted to newlines, but the backslash at the end of the line, and the newline character in the source, are both included in the string as data.
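The behaviour described in that passage can be sketched by comparing an ordinary and a raw literal that both end a source line with a backslash. Triple quotes are used here purely to keep the example simple, and the names plain and raw are illustrative.

```python
# Ordinary literal: backslash + newline is a line continuation and vanishes.
plain = """one\
two"""

# Raw literal: the backslash and the newline are both kept as data.
raw = r"""one\
two"""

print(repr(plain))   # 'onetwo'
print(repr(raw))     # 'one\\\ntwo'
print(len(plain), len(raw))   # 6 8
```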
You have the wrong idea about raw strings.
