python replaces single backlash with double when using extend [duplicate]

python replaces single backlash with double when using extend [duplicate] - python

This question already has answers here:
Why do backslashes appear twice?
(2 answers)
How should I write a Windows path in a Python string literal?
(5 answers)
Closed 4 years ago.
I have a dictionary:
my_dictionary = {"058498":"table", "064165":"pen", "055123":"pencil"}
I iterate over it:
for item in my_dictionary:
PDF = r'C:\Users\user\Desktop\File_%s.pdf' %item
doIt(PDF)
def doIt(PDF):
part = MIMEBase('application', "octet-stream")
part.set_payload( open(PDF,"rb").read() )
But I get this error:
IOError: [Errno 2] No such file or directory: 'C:\\Users\\user\\Desktop\\File_055123.pdf'
It can't find my file. Why does it think there are double backslashes in file path?

The double backslash is not wrong, python represents it way that to the user. In each double backslash \\, the first one escapes the second to imply an actual backslash. If a = r'raw s\tring' and b = 'raw s\\tring' (no 'r' and explicit double slash) then they are both represented as 'raw s\\tring'.
>>> a = r'raw s\tring'
>>> b = 'raw s\\tring'
>>> a
'raw s\\tring'
>>> b
'raw s\\tring'
For clarification, when you print the string, you'd see it as it would get used, like in a path - with just one backslash:
>>> print(a)
raw s\tring
>>> print(b)
raw s\tring
And in this printed string case, the \t doesn't imply a tab, it's a backslash \ followed by the letter 't'.
Otherwise, a string with no 'r' prefix and a single backslash would escape the character after it, making it evaluate the 't' following it == tab:
>>> t = 'not raw s\tring' # here '\t' = tab
>>> t
'not raw s\tring'
>>> print(t) # will print a tab (and no letter 't' in 's\tring')
not raw s ring
So in the PDF path+name:
>>> item = 'xyz'
>>> PDF = r'C:\Users\user\Desktop\File_%s.pdf' % item
>>> PDF # the representation of the string, also in error messages
'C:\\Users\\user\\Desktop\\File_xyz.pdf'
>>> print(PDF) # "as used"
C:\Users\user\Desktop\File_xyz.pdf
More info about escape sequences in the table here. Also see __str__ vs __repr__.

Double backslashes are due to r, raw string:
r'C:\Users\user\Desktop\File_%s.pdf' ,
It is used because the \ might escape some of the characters.
>>> strs = "c:\desktop\notebook"
>>> print strs #here print thinks that \n in \notebook is the newline char
c:\desktop
otebook
>>> strs = r"c:\desktop\notebook" #using r'' escapes the \
>>> print strs
c:\desktop\notebook
>>> print repr(strs) #actual content of strs
'c:\\desktop\\notebook'

save yourself from getting a headache you can use other slashes as well.
if you know what I saying. the opposite looking slashes.
you're using now
PDF = 'C:\Users\user\Desktop\File_%s.pdf' %item
try to use
**
PDF = 'C:/Users/user/Desktop/File_%s.pdf' %item
**
it won't be treated as a escaping character .

It doesn't. Double backslash is just the way of the computer of saying backslash. Yes, I know this sounds weird, but think of it this way - in order to represent special characters, backslash was chosen as an escaping character (e.g. \n means newline, and not the backslash character followed by the n character). But what happens if you actually want to print (or use) a backslash (possibly followed by more characters), but you don't want the computer to treat it as an escaping character? In that case we escape the backslash itself, meaning we use a double backslash so the computer will understand it's a single backslash.
It's done automatically in your case because of the r you added before the string.

alwbtc #
I dare say: "I found the bug..."
replace
PDF = r'C:\Users\user\Desktop\File_%s.pdf' %item
doIt(PDF)`
with
for item in my_dictionary:
PDF = r'C:\Users\user\Desktop\File_%s.pdf' % mydictionary[item]
doIt(PDF)`
in fact you were really looking for File_pencil.pdf (not File_055123.pdf).
You were sliding the index dictionary not its contents.
This forum topic maybe a side-effect.

Related

Python： How do string variables prevent escape?

>>>m = "\frac{7x+5}{1+y^2}"
>>>print(m)
rac{7x+5}{1+y^2}
>>>print(r""+m)
rac{7x+5}{1+y^2}
>>>print(r"{}".format(m))
rac{7x+5}{1+y^2}
>>>print(repr(m))
'\x0crac{7x+5}{1+y^2}'
I want the result:"\frac{7x+5}{1+y^2}"
Must be a string variable!!!

You need the string literal that contains the slash to be a raw string.
m = r"\frac{7x+5}{1+y^2}"
Raw strings are just another way of writing strings. They aren't a different type. For example r"" is exactly the same as "" because there are no characters to escape, it doesn't produce some kind of raw empty string and adding it to another string changes nothing.

Another option is to add the escape sign to the escape sign to signify that it is a string literal
m = "\\frac{7x+5}{1+y^2}"
print(m)
print(r""+m)
print(r"{}".format(m))
print(repr(m))

A good place to start is to read the docs here. So you can use either the escape character "\" as here
>>> m = "\\frac{7x+5}{1+y^2}"
>>> print(m)
\frac{7x+5}{1+y^2}
or use string literals, which takes the string to be as is
>>> m = r"\frac{7x+5}{1+y^2}"
>>> print(m)
\frac{7x+5}{1+y^2}

Removing escape characters from a string

How can i remove the escape chars in Python 2.7 and python 3 ?
Example:
a = "\u00E7a\u00E7a\u00E7a=http\://\u00E1\u00E9\u00ED\u00F3\u00FA\u00E7/()\=)(){[]}"
decoded = a.decode('unicode_escape')
print decoded
Result:
çaçaça=http\://áéíóúç/()\=)(){[]}
Expected result
çaçaça=http://áéíóúç/()=)(){[]}
EDIT: In order to avoid unnecessary downvotes. using .replace isn't our primary focus since this problem was raised by a legacy solution from other teams ( db table with reference data with contains portuguese chars and regular expressions).

You're looking for a simple str.replace
>>> print decoded.replace('\\', '')
çaçaça=http://áéíóúç/()=)(){[]}
The remaining \ is actually a literal backslash, not an escape sequence.

You can simply remove the unnecessary the escape character in your string, i.e.
>>> a = "\u00E7a\u00E7a\u00E7a=http://\u00E1\u00E9\u00ED\u00F3\u00FA\u00E7/()=)(){[]}"
>>> decoded = a.decode('unicode_escape')
>>> print decoded
çaçaça=http://áéíóúç/()=)(){[]}

Hexadecimal file is loading with 2 back slashes before each byte instead of one

I have a hex file in this format: \xda\xd8\xb8\x7d
When I load the file with Python, it loads with two back slashes instead of one.
with open('shellcode.txt', 'r') as file:
shellcode = file.read().replace('\n', '')
Like this: \\xda\\xd8\\xb8\\x7d
I've tried using hex.replace("\\", "\"), but I'm getting an error
EOL while scanning string literal
What is the proper way to replace \\ with \?

Here is an example
>>> h = "\\x123"
>>> h
'\\x123'
>>> print h
\x123
>>>
The two backslashes are needed because \ is an escape character, and so it needs to be escaped. When you print h, it shows what you want

Backshlash (\) is an escape character. It is used for changing the meaning of the character(s) following it.
For example, if you want to create a string which contains a quote, you have to escape it:
s = "abc\"def"
print s # prints: abc"def
If there was no backslash, the first quote would be interpreted as the end of the string.
Now, if you really wanted that backslash in the string, you would have to escape the bacsklash using another backslash:
s = "abc\\def"
print s # prints: abc\def
However, if you look at the representation of the string, it will be shown with the escape characters:
print repr(s) # prints: 'abc\\def'
Therefore, this line should include escapes for each backslash:
hex.replace("\\", "\") # wrong
hex.replace("\\\\", "\\") # correct
But that is not the solution to the problem!
There is no way that file.read().replace('\n', '') introduced additional backslashes. What probably happened is that OP printed the representation of the string with backslashes (\) which ended up printing escaped backslashes (\\).

You can make a bytes object with a utf-8 encoding, and then decode as unicode-escape.
>>> x = "\\x61\\x62\\x63"
>>> y = bytes(x, "utf-8").decode("unicode-escape")
>>> print(x)
\x61\x62\x63
>>> print(y)
abc

double backslash while take input [duplicate]

This question already has answers here:
Why do backslashes appear twice?
(2 answers)
How should I write a Windows path in a Python string literal?
(5 answers)
Closed 4 years ago.
I have a dictionary:
my_dictionary = {"058498":"table", "064165":"pen", "055123":"pencil"}
I iterate over it:
for item in my_dictionary:
PDF = r'C:\Users\user\Desktop\File_%s.pdf' %item
doIt(PDF)
def doIt(PDF):
part = MIMEBase('application', "octet-stream")
part.set_payload( open(PDF,"rb").read() )
But I get this error:
IOError: [Errno 2] No such file or directory: 'C:\\Users\\user\\Desktop\\File_055123.pdf'
It can't find my file. Why does it think there are double backslashes in file path?

The double backslash is not wrong, python represents it way that to the user. In each double backslash \\, the first one escapes the second to imply an actual backslash. If a = r'raw s\tring' and b = 'raw s\\tring' (no 'r' and explicit double slash) then they are both represented as 'raw s\\tring'.
>>> a = r'raw s\tring'
>>> b = 'raw s\\tring'
>>> a
'raw s\\tring'
>>> b
'raw s\\tring'
For clarification, when you print the string, you'd see it as it would get used, like in a path - with just one backslash:
>>> print(a)
raw s\tring
>>> print(b)
raw s\tring
And in this printed string case, the \t doesn't imply a tab, it's a backslash \ followed by the letter 't'.
Otherwise, a string with no 'r' prefix and a single backslash would escape the character after it, making it evaluate the 't' following it == tab:
>>> t = 'not raw s\tring' # here '\t' = tab
>>> t
'not raw s\tring'
>>> print(t) # will print a tab (and no letter 't' in 's\tring')
not raw s ring
So in the PDF path+name:
>>> item = 'xyz'
>>> PDF = r'C:\Users\user\Desktop\File_%s.pdf' % item
>>> PDF # the representation of the string, also in error messages
'C:\\Users\\user\\Desktop\\File_xyz.pdf'
>>> print(PDF) # "as used"
C:\Users\user\Desktop\File_xyz.pdf
More info about escape sequences in the table here. Also see __str__ vs __repr__.

Double backslashes are due to r, raw string:
r'C:\Users\user\Desktop\File_%s.pdf' ,
It is used because the \ might escape some of the characters.
>>> strs = "c:\desktop\notebook"
>>> print strs #here print thinks that \n in \notebook is the newline char
c:\desktop
otebook
>>> strs = r"c:\desktop\notebook" #using r'' escapes the \
>>> print strs
c:\desktop\notebook
>>> print repr(strs) #actual content of strs
'c:\\desktop\\notebook'

save yourself from getting a headache you can use other slashes as well.
if you know what I saying. the opposite looking slashes.
you're using now
PDF = 'C:\Users\user\Desktop\File_%s.pdf' %item
try to use
**
PDF = 'C:/Users/user/Desktop/File_%s.pdf' %item
**
it won't be treated as a escaping character .

It doesn't. Double backslash is just the way of the computer of saying backslash. Yes, I know this sounds weird, but think of it this way - in order to represent special characters, backslash was chosen as an escaping character (e.g. \n means newline, and not the backslash character followed by the n character). But what happens if you actually want to print (or use) a backslash (possibly followed by more characters), but you don't want the computer to treat it as an escaping character? In that case we escape the backslash itself, meaning we use a double backslash so the computer will understand it's a single backslash.
It's done automatically in your case because of the r you added before the string.

alwbtc #
I dare say: "I found the bug..."
replace
PDF = r'C:\Users\user\Desktop\File_%s.pdf' %item
doIt(PDF)`
with
for item in my_dictionary:
PDF = r'C:\Users\user\Desktop\File_%s.pdf' % mydictionary[item]
doIt(PDF)`
in fact you were really looking for File_pencil.pdf (not File_055123.pdf).
You were sliding the index dictionary not its contents.
This forum topic maybe a side-effect.

Similar C string format in Python

I need to read a file with some strange string lines like : \x72\xFE\x20TEST_STRING\0\0\0
but when I do a print of this string (with repr()) it prints this : r\xfe TEST_STRING\x00\x00\x00
Example :
>>> test = '\x72\xFE\x20TEST_STRING\0\0\0'
>>> print test
r? TEST_STRING
>>> print repr(test)
'r\xfe TEST_STRING\x00\x00\x00'
How can I get the same line from a file in Python and my editor ?
Is python changing encoding during string manipulation ?

You should use python's raw strings, like this (note the 'r' in front of the string)
test = r'\x72\xFE\x20TEST_STRING\0\0\0'
Then it won't try to interpret the escapes as special characters.
When reading from a text file python shouldn't be trying to interpret the string as having multi-byte unicode characters. You should get a exactly what's in the file:
In [22]: fp = open("test.txt", "r")
In [23]: s = fp.read()
In [24]: s
Out[24]: '\\x72\\xFE\\x20TEST_STRING\\0\\0\\0\n\n'
In [25]: print s
\x72\xFE\x20TEST_STRING\0\0\0

\x20 is a space. When you put that into a Python string it is stored exactly the same way as a space.
If you have printable characters in a string it does not matter whether they were typed as the actual character or some escape sequence, they will be represented the same way because they are in fact the same value.
Consider the following examples:
>>> ' ' == '\x20'
True
>>> hex(ord('a'))
'0x61'
>>> '\x61'
'a'

Python did not change the encoding:
When printing Python just resolved the printable chars in your string: chr(0x72) is a "r", chr(0xfe) is not printable, so you get the "?", chr(0x20) is chr(32) that is a space " ", and zero bytes are not printed at all.
repr() resolves the "r", leaves the chr(0xfe), and prints the chr(0) in full hexadecimal notation for chr(0x00).
So if you want the same line in your editor and for repr(), you have to type your string in your editor in the same notation repr() does, that is you write
test='r\xfe TEST_STRING\x00\x00\x00'
and repr(test) should print the same string:

To avoid having python interpret the backslashes as escaped characters, prefix your string with an "r" character:
>>> test = r'\x72\xFE\x20TEST_STRING\0\0\0'
>>> print test
\x72\xFE\x20TEST_STRING\0\0\0`

Develop Reference

Python is a programming language that lets you work quickly and integrate systems more effectively.

python replaces single backlash with double when using extend [duplicate] - python

save yourself from getting a headache you can use other slashes as well. if you know what I saying. the opposite looking slashes. you're using now PDF = 'C:\Users\user\Desktop\File_%s.pdf' %item try to use PDF = 'C:/Users/user/Desktop/File_%s.pdf' %item it won't be treated as a escaping character .

Related

Python： How do string variables prevent escape?

Removing escape characters from a string

Hexadecimal file is loading with 2 back slashes before each byte instead of one

double backslash while take input [duplicate]

Similar C string format in Python

Categories

Resources