This question already has answers here:
Decode escaped characters in URL
(5 answers)
Closed 5 years ago.
How to make this string readable in Python 2.7?
%D0%9A%D0%BE%D0%BD%D1%86%D0%B5%D0%BF%D1%86%D0%B8%D1%8F_%D0%A4%D0%B5%D0%B4%D0%B5%D1%80%D0%B0%D0%BB%D1%8C%D0%BD%D0%BE%D0%B9_%D1%86%D0%B5%D0%BB%D0%B5%D0%B2%D0%BE%D0%B9_%D0%BF%D1%80%D0%BE%D0%B3%D1%80%D0%B0%D0%BC%D0%BC%D1%8B_%D1%80%D0%B0%D0%B7%D0%B2%D0%B8%D1%82%D0%B8%D1%8F_%D0%BE%D0%B1%D1%80%D0%B0%D0%B7%D0%BE%D0%B2%D0%B0%D0%BD%D0%B8%D1%8F_%D0%BD%D0%B0_2016-2020_%D0%B3%D0%B3
This string contains Cyrillic characters and is part of a URL (a query string parameter).
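Use urllib.unquote from the standard library. In Python 2.7, for a str input, the result is a byte string holding the UTF-8 encoded text; call .decode('utf-8') on it if you need a unicode object. From the documentation: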
urllib.unquote(string)
Replace %xx escapes by their single-character equivalent.
Example: unquote('/%7Econnolly/') yields '/~connolly/'.
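For readers on Python 3, where the function lives in urllib.parse, a minimal sketch might look like the following. The input is only the first word of the string from the question plus its trailing years, shortened as an assumption to keep the example compact, and the variable names are illustrative.

```python
from urllib.parse import unquote

# Shortened version of the percent-encoded string from the question.
encoded = "%D0%9A%D0%BE%D0%BD%D1%86%D0%B5%D0%BF%D1%86%D0%B8%D1%8F_2016-2020"

# In Python 3, unquote decodes the escaped bytes as UTF-8 by default
# and returns a regular str.
decoded = unquote(encoded)
print(decoded)  # Концепция_2016-2020
```

Characters that are not percent-escaped, such as the underscore and the digits, pass through unchanged.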
This question already has an answer here:
How to encode Python 3 string using \u escape code?
(1 answer)
Closed 1 year ago.
Let's say I have a string,
"Hello–World"
How would I convert it to something like this:
"Hello\u2013World"
where "\u2013" is the Unicode escape for "–"?
Use str.encode with unicode_escape:
>>> s = 'Hello–World'
>>> print(s.encode('unicode_escape'))
b'Hello\\u2013World'
If you want a string (and not a byte string like above):
>>> print(s.encode('unicode_escape').decode())
Hello\u2013World
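As a small sketch of the same idea outside the interactive prompt, the snippet below escapes the string and then reverses the step with the unicode_escape codec to confirm nothing is lost. The variable names are illustrative, and the round trip is an addition that the original answer does not cover.

```python
s = "Hello\u2013World"  # the same text as "Hello–World"

# Escape: non-ASCII characters become \uXXXX sequences (as bytes), then
# decode those ASCII bytes to get an ordinary str.
escaped = s.encode("unicode_escape").decode("ascii")
print(escaped)  # Hello\u2013World

# Reverse: interpret the escape sequences again to recover the original.
restored = escaped.encode("ascii").decode("unicode_escape")
print(restored == s)  # True
```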
This question already has answers here:
How do I url unencode in Python?
(3 answers)
Closed 5 years ago.
I'm trying to find a Python package or sample code that can convert the input "why+don%27t+you+want+to+talk+to+me" to "why+don't+you+want+to+talk+to+me".
That is, it should convert hex codes like %27 to the character they stand for ('). I could hardcode the whole set of hex codes and swap each one for its symbol, but I want a simple and scalable solution.
Thanks for helping
You can use urllib's unquote function.
import urllib.parse
urllib.parse.unquote('why+don%27t+you+want+to+talk+to+me')
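A short sketch of what the call returns for the input from the question is shown below. It also shows urllib.parse.unquote_plus, which the answer does not mention; it is included here only as a related option for cases where the plus signs are meant to be spaces, as in form-encoded query strings.

```python
from urllib.parse import unquote, unquote_plus

raw = "why+don%27t+you+want+to+talk+to+me"

# unquote only replaces %xx escapes; plus signs are left alone.
kept_plus = unquote(raw)
print(kept_plus)  # why+don't+you+want+to+talk+to+me

# unquote_plus additionally turns each '+' into a space.
with_spaces = unquote_plus(raw)
print(with_spaces)  # why don't you want to talk to me
```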
This question already has answers here:
How to write string literals in Python without having to escape them?
(6 answers)
Closed 6 years ago.
\201 is interpreted by Python as an octal escape sequence inside a string literal. What is the best way to keep it from being interpreted?
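s = '\2016'               # '\201' is read as one escaped character, followed by '6'
s = s.replace('\\', '/')  # no effect: s contains no backslash at this point
print s #6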
If you have a string literal with a backslash in it, you can escape the backslash:
s = '\\2016'
or you can use a "raw" string:
s = r'\2016'
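The difference between the three spellings can be checked with a quick sketch like the one below. The variable names are illustrative and not from the original post.

```python
interpreted = '\2016'   # octal escape \201 followed by the character '6'
doubled = '\\2016'      # escaped backslash followed by 2016
raw = r'\2016'          # raw literal: the backslash is kept as written

print(len(interpreted))  # 2
print(len(doubled))      # 5
print(doubled == raw)    # True
```

Both the doubled backslash and the raw literal produce the same five-character string, so the choice between them is mostly a matter of readability.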
This question already has answers here:
Convert a Unicode string to a string in Python (containing extra symbols)
(12 answers)
Closed 7 years ago.
Here is an example from Turkish: "şğüı" becomes "sgui".
I'm sure each language has its own conversion rules, and sometimes a character might be converted to multiple ASCII characters, like "alpha"/"phi" etc.
I'm wondering whether there is a library or method that achieves this conversion.
What you are asking is called transliteration.
Try the Unidecode library.
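Unidecode is a third-party package. For cases where installing it is not an option, a rough standard-library approximation is sketched below. Note that this is a different and much more limited technique (Unicode decomposition plus dropping combining marks), not what Unidecode does internally. The function name strip_diacritics and the FALLBACK table are hypothetical, and the table would need extending for other letters that have no decomposition.

```python
import unicodedata

# Hypothetical fallback table for letters that Unicode does not decompose,
# such as the Turkish dotless i.
FALLBACK = {"ı": "i"}

def strip_diacritics(text):
    # Swap in fallbacks first, then split each remaining letter into
    # base letter + combining marks and drop the marks.
    replaced = "".join(FALLBACK.get(ch, ch) for ch in text)
    decomposed = unicodedata.normalize("NFKD", replaced)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

print(strip_diacritics("şğüı"))  # sgui
```

This only removes accents from Latin-based letters; it does not transliterate other scripts (for example Greek letters to "alpha"/"phi"), which is where a dedicated transliteration library is the better fit.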
This question already has answers here:
Saving UTF-8 texts with json.dumps as UTF-8, not as a \u escape sequence
(12 answers)
Closed 7 months ago.
For example:
>>> import json
>>> print(json.dumps('růže'))
"r\u016f\u017ee"
(Of course, in the real program it's not just a single string, and the escapes also end up in the file when using json.dump().) I'd like the output to be simply "růže" instead. How can I do that?
Pass the ensure_ascii=False argument to json.dumps:
>>> print(json.dumps('růže', ensure_ascii=False))
"růže"