This question already has answers here:
Process escape sequences in a string in Python
(8 answers)
Closed 9 years ago.
I have a string that looks like this:
>>> st = 'aaaaa\x12bbbbb'
I can convert it to an escaped string via:
>>> escaped_st = st.encode('string-escape')
>>> escaped_st
'aaaaa\\x12bbbbb'
How can I convert the escaped string back to the original string? I was trying to do something like this:
escaped_st.replace('\\\\', '\\')
Decode the escaped string with the same codec (string-escape is a Python 2 codec and does not exist in Python 3):
>>> st = 'aaaaa\x12bbbbb'
>>> escaped_st = st.encode('string-escape')
>>> escaped_st
'aaaaa\\x12bbbbb'
>>> escaped_st.decode('string-escape')
'aaaaa\x12bbbbb'
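On Python 3, where the string-escape codec is not available, a similar round trip can be sketched with the unicode_escape codec. This is only an illustration under that assumption, not the original answer's method; the variable name restored is introduced here.

```python
# Sketch for Python 3: 'unicode_escape' stands in for the Python 2 'string-escape' codec.
st = 'aaaaa\x12bbbbb'

# Escape: encode to bytes containing backslash escapes, then view those ASCII bytes as text.
escaped_st = st.encode('unicode_escape').decode('ascii')

# Unescape: turn the text back into bytes and let the same codec interpret the escapes.
restored = escaped_st.encode('ascii').decode('unicode_escape')

print(repr(escaped_st))
print(restored == st)
```

The unescape step assumes the escaped text is pure ASCII, which is what the escape step produces; text that mixes escapes with raw non-ASCII characters would need different handling.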
This question already has answers here:
Decode Hex String in Python 3
(3 answers)
Closed 3 years ago.
I have lots of unicode characters codes stored as strings in Python3, e.g.
unicode = '3077'
where U+3077 is ぷ. How do I print this as human-readable text? I.e. how do I convert the string unicode to unicode_as_text such that:
>>> print(unicode_as_text)
ぷ
Your string is the Unicode code point written in hexadecimal, so you can render the character by converting the string to an integer with int(..., 16) and passing the result to chr.
>>> print(chr(int('3077', 16)))
ぷ
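Since the question mentions many such codes, the same chr(int(..., 16)) call can be applied to a whole list. A minimal sketch; the helper name codepoints_to_text and the sample list are made up for illustration:

```python
def codepoints_to_text(hex_codes):
    """Join hex code point strings such as '3077' into one text string."""
    # int(code, 16) parses the hex digits; chr() maps the integer to its character.
    return ''.join(chr(int(code, 16)) for code in hex_codes)


codes = ['3077', '0222']  # hypothetical sample input
text = codepoints_to_text(codes)
print(text)
```

int() raises ValueError on anything that is not valid hexadecimal, so malformed entries fail loudly rather than producing a wrong character.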
This question already has answers here:
Process escape sequences in a string in Python
(8 answers)
Closed 7 months ago.
I want to create a raw unicode character from a string hex representation. That is, I have a string s = '\u0222' which will be the 'Ȣ' character.
Now, this works if I do
>>> s = '\u0222'
>>> print(s)
Ȣ
but, if I try to do concatenation, it comes out as
>>> h = '0222'
>>> s = r'\u' + h
>>> print(s)
\u0222
>>> s
'\\u0222'
because, as can be seen, what's actually in the string is '\\u', not '\u'. How can I create the Unicode character from a hex string, or how can I enter a true single backslash?
This was a lot harder to solve than I initially expected:
code = '0222'
uni_code = r'\u' + code
s = uni_code.encode().decode('unicode_escape')
print(s)
Or
code = b'0222'
uni_code = rb'\u' + code  # raw bytes literal: \u is not a valid escape in a bytes literal
s = uni_code.decode('unicode_escape')
print(s)
The \u0222 syntax is only for string literals: Python turns it into a single Unicode code point when it reads the source code. It's not meant to be assembled at run time. Use the chr() function to generate a character from a code point. The following works for hex strings or integers:
>>> chr(int('0222',16)) # convert string to int base 16
'Ȣ'
>>> chr(0x222) # or just pass an integer.
'Ȣ'
And FYI ord() is the complementary function:
>>> hex(ord('Ȣ'))
'0x222'
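To see why the concatenated string prints as \u0222, it can help to check its length: the doubled backslash appears only in the repr, and the string itself holds one backslash. A small sketch; the variable names are chosen here for illustration:

```python
h = '0222'

# The raw-string concatenation yields six characters: \ u 0 2 2 2
s = r'\u' + h
length = len(s)

# chr() builds the actual character from the parsed code point.
ch = chr(int(h, 16))

# ord() goes the other way; format() pads the hex digits back to four places.
back = format(ord(ch), '04x')

print(length, ch, back)
```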
This question already has answers here:
How can I put an actual backslash in a string literal (not use it for an escape sequence)?
(4 answers)
Closed 6 years ago.
I need to strip off "domain\" from "domain\name" to extract name, which can be any name or literally the word name:
>>> s="domain\name"
>>> x=s[5:]
>>> print(x)
n
ame
>>> s="domain\bh16"
>>> x=s[5:]
>>> print(x)
h16
>>> x=s[4:]
>>> print(x)
ih16
You can write it as a raw string literal, so that \n and \b are not treated as escape sequences, and use replace as normal:
s = r"domain\bh16"
print(s.replace("domain\\", '')) #bh16
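If the domain part varies, splitting on the backslash avoids hard-coding it. A minimal sketch, assuming the input contains a single literal backslash between domain and name; the variable names are illustrative:

```python
# Raw string literals keep the backslash as an ordinary character.
s = r"domain\name"

# partition() splits at the first backslash into (before, separator, after).
prefix, sep, name = s.partition("\\")
print(name)

# The same call works when the name starts with a letter that would form an escape.
name2 = r"domain\bh16".partition("\\")[2]
print(name2)
```

When the string has no backslash, partition() returns an empty third element, so the result is an empty string rather than an error.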
This question already has answers here:
Remove specific characters from a string in Python
(26 answers)
Removing numbers from string [closed]
(8 answers)
Closed 8 years ago.
I’d like to eliminate numbers in a string in Python.
str = "aaaa22222111111kkkkk"
I want this to be "aaaakkkkk".
I use re.sub to replace, but it doesn't work:
str = "aaaa22222111111kkkkk"
str = re.sub(r'^[0-9]+$',"",str)
Maybe this only replaces a string that consists entirely of digits with "".
How should I do this?
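Your regex is wrong: the ^ and $ anchors make it match only when the whole string consists of digits. Without them,
re.sub(r'[0-9]',"",str)
should work:
>>> import re
>>> str="aaaa22222111111kkkkk"
>>> re.sub(r'[0-9]',"",str)
'aaaakkkkk'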
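The difference between the anchored and unanchored patterns can be checked side by side. The sketch below also shows a regex-free alternative with str.translate; the variable is named text here to avoid shadowing the built-in str:

```python
import re

text = "aaaa22222111111kkkkk"

# Anchored: matches only if the entire string is digits, so nothing is replaced here.
anchored = re.sub(r'^[0-9]+$', '', text)

# Unanchored: every run of digits is removed wherever it occurs.
unanchored = re.sub(r'[0-9]+', '', text)

# Alternative without regex: delete each ASCII digit via a translation table.
translated = text.translate(str.maketrans('', '', '0123456789'))

print(anchored, unanchored, translated)
```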
This question already has an answer here:
How can I convert strings like "\u5c0f\u738b\u5b50\u003a\u6c49\u6cd5\u82f1\u5bf9\u7167" to Chinese characters
(1 answer)
Closed 9 years ago.
I have a Unicode string. I'm sure that it's UTF-8, but I can't decode it. The string is '\u041b\u0435\u0433\u043a\u043e\u0432\u044b\u0435'. How do I decode it?
In Python 2 you can use aString.decode('unicode_escape'); it converts a byte string of Unicode escape sequences to a unicode object:
>>> u'\u041b\u0435\u0433\u043a\u043e\u0432\u044b\u0435'
u'\u041b\u0435\u0433\u043a\u043e\u0432\u044b\u0435'
>>> '\u041b\u0435\u0433\u043a\u043e\u0432\u044b\u0435'.decode('unicode_escape')
u'\u041b\u0435\u0433\u043a\u043e\u0432\u044b\u0435'
>>>
In your case
>>> print '\u041b\u0435\u0433\u043a\u043e\u0432\u044b\u0435'.decode('unicode_escape')
Легковые
>>>
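The transcript above is Python 2, where str has a decode method. On Python 3 a comparable result can be sketched by encoding the text to bytes first; this assumes the input holds literal backslash-u sequences and nothing but ASCII:

```python
# A raw literal stands in for text that arrived with literal \uXXXX sequences.
escaped = r'\u041b\u0435\u0433\u043a\u043e\u0432\u044b\u0435'

# Python 3 str has no decode(), so go through ASCII bytes and let the codec expand the escapes.
decoded = escaped.encode('ascii').decode('unicode_escape')

print(decoded)
```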