I have the string
a = 'ddd\ttt\nnn'
I want to remove the '\' from the string. and It will be
a = 'dddtttnnn'
how to do that in python since '\t' and '\n' has special meaning in python
Assuming you want to remove \t and \n type characters (with those representing tab and newline in this case and remove the meaning of \ in the string in general) you can do:
>>> a = 'ddd\ttt\nnn'
>>> print a
ddd tt
nn
>>> repr(a)[1:-1].replace('\\','')
'dddtttnnn'
>>> print repr(a)[1:-1].replace('\\','')
dddtttnnn
If it is a raw string (i.e., the \ is not interpolated to a single character), you do not need the repr:
>>> a = r'ddd\ttt\nnn'
>>> a.replace('\\','')
'dddtttnnn'
Related
This question already has answers here:
Why do backslashes appear twice?
(2 answers)
How should I write a Windows path in a Python string literal?
(5 answers)
Closed 4 years ago.
I have a dictionary:
my_dictionary = {"058498":"table", "064165":"pen", "055123":"pencil"}
I iterate over it:
for item in my_dictionary:
PDF = r'C:\Users\user\Desktop\File_%s.pdf' %item
doIt(PDF)
def doIt(PDF):
part = MIMEBase('application', "octet-stream")
part.set_payload( open(PDF,"rb").read() )
But I get this error:
IOError: [Errno 2] No such file or directory: 'C:\\Users\\user\\Desktop\\File_055123.pdf'
It can't find my file. Why does it think there are double backslashes in file path?
The double backslash is not wrong, python represents it way that to the user. In each double backslash \\, the first one escapes the second to imply an actual backslash. If a = r'raw s\tring' and b = 'raw s\\tring' (no 'r' and explicit double slash) then they are both represented as 'raw s\\tring'.
>>> a = r'raw s\tring'
>>> b = 'raw s\\tring'
>>> a
'raw s\\tring'
>>> b
'raw s\\tring'
For clarification, when you print the string, you'd see it as it would get used, like in a path - with just one backslash:
>>> print(a)
raw s\tring
>>> print(b)
raw s\tring
And in this printed string case, the \t doesn't imply a tab, it's a backslash \ followed by the letter 't'.
Otherwise, a string with no 'r' prefix and a single backslash would escape the character after it, making it evaluate the 't' following it == tab:
>>> t = 'not raw s\tring' # here '\t' = tab
>>> t
'not raw s\tring'
>>> print(t) # will print a tab (and no letter 't' in 's\tring')
not raw s ring
So in the PDF path+name:
>>> item = 'xyz'
>>> PDF = r'C:\Users\user\Desktop\File_%s.pdf' % item
>>> PDF # the representation of the string, also in error messages
'C:\\Users\\user\\Desktop\\File_xyz.pdf'
>>> print(PDF) # "as used"
C:\Users\user\Desktop\File_xyz.pdf
More info about escape sequences in the table here. Also see __str__ vs __repr__.
Double backslashes are due to r, raw string:
r'C:\Users\user\Desktop\File_%s.pdf' ,
It is used because the \ might escape some of the characters.
>>> strs = "c:\desktop\notebook"
>>> print strs #here print thinks that \n in \notebook is the newline char
c:\desktop
otebook
>>> strs = r"c:\desktop\notebook" #using r'' escapes the \
>>> print strs
c:\desktop\notebook
>>> print repr(strs) #actual content of strs
'c:\\desktop\\notebook'
save yourself from getting a headache you can use other slashes as well.
if you know what I saying. the opposite looking slashes.
you're using now
PDF = 'C:\Users\user\Desktop\File_%s.pdf' %item
try to use
**
PDF = 'C:/Users/user/Desktop/File_%s.pdf' %item
**
it won't be treated as a escaping character .
It doesn't. Double backslash is just the way of the computer of saying backslash. Yes, I know this sounds weird, but think of it this way - in order to represent special characters, backslash was chosen as an escaping character (e.g. \n means newline, and not the backslash character followed by the n character). But what happens if you actually want to print (or use) a backslash (possibly followed by more characters), but you don't want the computer to treat it as an escaping character? In that case we escape the backslash itself, meaning we use a double backslash so the computer will understand it's a single backslash.
It's done automatically in your case because of the r you added before the string.
alwbtc #
I dare say: "I found the bug..."
replace
PDF = r'C:\Users\user\Desktop\File_%s.pdf' %item
doIt(PDF)`
with
for item in my_dictionary:
PDF = r'C:\Users\user\Desktop\File_%s.pdf' % mydictionary[item]
doIt(PDF)`
in fact you were really looking for File_pencil.pdf (not File_055123.pdf).
You were sliding the index dictionary not its contents.
This forum topic maybe a side-effect.
I have a hex file in this format: \xda\xd8\xb8\x7d
When I load the file with Python, it loads with two back slashes instead of one.
with open('shellcode.txt', 'r') as file:
shellcode = file.read().replace('\n', '')
Like this: \\xda\\xd8\\xb8\\x7d
I've tried using hex.replace("\\", "\"), but I'm getting an error
EOL while scanning string literal
What is the proper way to replace \\ with \?
Here is an example
>>> h = "\\x123"
>>> h
'\\x123'
>>> print h
\x123
>>>
The two backslashes are needed because \ is an escape character, and so it needs to be escaped. When you print h, it shows what you want
Backshlash (\) is an escape character. It is used for changing the meaning of the character(s) following it.
For example, if you want to create a string which contains a quote, you have to escape it:
s = "abc\"def"
print s # prints: abc"def
If there was no backslash, the first quote would be interpreted as the end of the string.
Now, if you really wanted that backslash in the string, you would have to escape the bacsklash using another backslash:
s = "abc\\def"
print s # prints: abc\def
However, if you look at the representation of the string, it will be shown with the escape characters:
print repr(s) # prints: 'abc\\def'
Therefore, this line should include escapes for each backslash:
hex.replace("\\", "\") # wrong
hex.replace("\\\\", "\\") # correct
But that is not the solution to the problem!
There is no way that file.read().replace('\n', '') introduced additional backslashes. What probably happened is that OP printed the representation of the string with backslashes (\) which ended up printing escaped backslashes (\\).
You can make a bytes object with a utf-8 encoding, and then decode as unicode-escape.
>>> x = "\\x61\\x62\\x63"
>>> y = bytes(x, "utf-8").decode("unicode-escape")
>>> print(x)
\x61\x62\x63
>>> print(y)
abc
This question already has answers here:
Why do backslashes appear twice?
(2 answers)
How should I write a Windows path in a Python string literal?
(5 answers)
Closed 4 years ago.
I have a dictionary:
my_dictionary = {"058498":"table", "064165":"pen", "055123":"pencil"}
I iterate over it:
for item in my_dictionary:
PDF = r'C:\Users\user\Desktop\File_%s.pdf' %item
doIt(PDF)
def doIt(PDF):
part = MIMEBase('application', "octet-stream")
part.set_payload( open(PDF,"rb").read() )
But I get this error:
IOError: [Errno 2] No such file or directory: 'C:\\Users\\user\\Desktop\\File_055123.pdf'
It can't find my file. Why does it think there are double backslashes in file path?
The double backslash is not wrong, python represents it way that to the user. In each double backslash \\, the first one escapes the second to imply an actual backslash. If a = r'raw s\tring' and b = 'raw s\\tring' (no 'r' and explicit double slash) then they are both represented as 'raw s\\tring'.
>>> a = r'raw s\tring'
>>> b = 'raw s\\tring'
>>> a
'raw s\\tring'
>>> b
'raw s\\tring'
For clarification, when you print the string, you'd see it as it would get used, like in a path - with just one backslash:
>>> print(a)
raw s\tring
>>> print(b)
raw s\tring
And in this printed string case, the \t doesn't imply a tab, it's a backslash \ followed by the letter 't'.
Otherwise, a string with no 'r' prefix and a single backslash would escape the character after it, making it evaluate the 't' following it == tab:
>>> t = 'not raw s\tring' # here '\t' = tab
>>> t
'not raw s\tring'
>>> print(t) # will print a tab (and no letter 't' in 's\tring')
not raw s ring
So in the PDF path+name:
>>> item = 'xyz'
>>> PDF = r'C:\Users\user\Desktop\File_%s.pdf' % item
>>> PDF # the representation of the string, also in error messages
'C:\\Users\\user\\Desktop\\File_xyz.pdf'
>>> print(PDF) # "as used"
C:\Users\user\Desktop\File_xyz.pdf
More info about escape sequences in the table here. Also see __str__ vs __repr__.
Double backslashes are due to r, raw string:
r'C:\Users\user\Desktop\File_%s.pdf' ,
It is used because the \ might escape some of the characters.
>>> strs = "c:\desktop\notebook"
>>> print strs #here print thinks that \n in \notebook is the newline char
c:\desktop
otebook
>>> strs = r"c:\desktop\notebook" #using r'' escapes the \
>>> print strs
c:\desktop\notebook
>>> print repr(strs) #actual content of strs
'c:\\desktop\\notebook'
save yourself from getting a headache you can use other slashes as well.
if you know what I saying. the opposite looking slashes.
you're using now
PDF = 'C:\Users\user\Desktop\File_%s.pdf' %item
try to use
**
PDF = 'C:/Users/user/Desktop/File_%s.pdf' %item
**
it won't be treated as a escaping character .
It doesn't. Double backslash is just the way of the computer of saying backslash. Yes, I know this sounds weird, but think of it this way - in order to represent special characters, backslash was chosen as an escaping character (e.g. \n means newline, and not the backslash character followed by the n character). But what happens if you actually want to print (or use) a backslash (possibly followed by more characters), but you don't want the computer to treat it as an escaping character? In that case we escape the backslash itself, meaning we use a double backslash so the computer will understand it's a single backslash.
It's done automatically in your case because of the r you added before the string.
alwbtc #
I dare say: "I found the bug..."
replace
PDF = r'C:\Users\user\Desktop\File_%s.pdf' %item
doIt(PDF)`
with
for item in my_dictionary:
PDF = r'C:\Users\user\Desktop\File_%s.pdf' % mydictionary[item]
doIt(PDF)`
in fact you were really looking for File_pencil.pdf (not File_055123.pdf).
You were sliding the index dictionary not its contents.
This forum topic maybe a side-effect.
How can I effectively split multiline string containing backslashes resulting in unwanted escape characters into separate lines?
Here is an example input I'm dealing with:
strInput = '''signalArr(0)="ASCB D\axx\bxx\fxx\nxx"
signalArr(1)="root\rxx\txx\vxx"'''
I've tried this (to transform single backslash into double. So backslash escape would have precedence and following character would be treated "normally"):
def doubleBackslash(inputString):
inputString.replace('\\','\\\\')
inputString.replace('\a','\\a')
inputString.replace('\b','\\b')
inputString.replace('\f','\\f')
inputString.replace('\n','\\n')
inputString.replace('\r','\\r')
inputString.replace('\t','\\t')
inputString.replace('\v','\\v')
return inputString
strInputProcessed = doubleBackslash(strInput)
I'd like to get:
lineList = strInputProcessed.splitlines()
>> ['signalArr(0)="ASCB D\axx\bxx\fxx\nxx"','signalArr(1)="root\rxx\txx\vxx"']
What I got:
>> ['signalArr(0)="ASCB D\x07xx\x08xx', 'xx', 'xx"', 'signalArr(1)="root', 'xx\txx', 'xx"']
Try storing your input as a raw string, then all '\n' characters will automatically be escaped:
>>> var = r'''abc\n
... cba'''
>>> print var
abc\n
cba
>>> var.splitlines()
['abc\\n', 'bca']
(Note the r before the '. This denotes the string is raw)
As an extra, if you wish to escape an existing string, instead of the replace commands you did above, you can use encode with 'string-escape'.
>>> s = 'abc\nabc\nabc'
>>> s.encode('string-escape')
'abc\\nabc\\nabc'
and similarly if needed, you can undo the string-escaping of a string.
>>> s.decode('string-escape')
Finally, thought I would add in your context:
>>> strInput = r'''signalArr(0)="ASCB D\axx\bxx\fxx\nxx"
... signalArr(1)="root\rxx\txx\vxx"'''
>>> strInput.splitlines()
['signalArr(0)="ASCB D\\axx\\bxx\\fxx\\nxx"', 'signalArr(1)="root\\rxx\\txx\\vxx"']
Even though the extra \ are present on the printed string, they don't really exist in memory. Iterating the string will prove this, as it does not give you an extra \ character that is used to escape.
>>> s = r'\a\b\c'
>>>
>>> for c in s:
... print c
\
a
\
b
\
c
>>> list(s)
['\\', 'a', '\\', 'b', '\\', 'c']
I'm so confused... why/how is a different from b?! Why don't they print the same thing?
>>> a = '"'
>>> a
'"'
>>> b = "'"
>>> b
"'"
The strings are not presented differently. Their presentation is just adjusted to avoid having to quote the contained quote. Both ' and " are legal string literal delimiters.
Note that the contents of the string are very different. " is not the same string as '; a == b is (patently) False.
Python would have to use a \ backslash for the " or ' character otherwise. If you use both characters in a string, then python is forced to use quoting:
>>> '\'"'
'\'"'
>>> """Tripple quoted means you can use both without escaping them: "'"""
'Tripple quoted means you can use both without escaping them: "\''
As you can see, the string representation used by Python still uses single quotes and a backslash to represent that last string.