This question already has answers here:
opencv imread() on Windows for non-ASCII file names
(4 answers)
How do I read an image from a path with Unicode characters?
(5 answers)
Closed 5 years ago.
It seems that if my file path or file name contains Unicode characters, OpenCV's imwrite method fails silently: no error is raised, but the output folder stays empty.
I'm using Windows 7 and Python 3.6.
If the path contains only ASCII characters, imwrite works fine.
The Unicode characters in question are Greek letters:
C:\Users\Moondra\Desktop\TEMP\LRF Spinning Ένα διασκεδαστικό ψάρεμαδιάφορα ψάρια \image_459.jpg
Is there an argument I can pass so that imwrite accepts paths with Unicode characters?
Thank you.
This question already has answers here:
Decode HTML entities in Python string?
(6 answers)
Closed 2 years ago.
I am scraping a site and ran into an issue: some text shows up as '' and ''. I searched the web but found no answer on how to decode it, so I am asking here. Is there any way to decode it?
These are called "HTML entities". Searching for that name turns up many ways to parse them in Python (see "Decode HTML entities in Python string?").
import html
print(html.unescape(''))
P.S. The Unicode code points U+E091 and U+E3C4 are in the Private Use Area, so they have no meaning unless someone defines one (e.g. a web font).
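As a rough illustration of the above: the exact entity text from the question did not survive, so the sketch below assumes it was written as hexadecimal numeric references (`&#xE091;` and `&#xE3C4;`). The helper name `describe` is made up for this example. It decodes the entities with html.unescape and reports each resulting code point together with its Unicode general category.

```python
import html
import unicodedata

def describe(text):
    """Decode HTML entities, then list (code point, general category) pairs."""
    decoded = html.unescape(text)
    return [(f"U+{ord(ch):04X}", unicodedata.category(ch)) for ch in decoded]

# Assumed input: the original entities were lost, so hex references are used here.
result = describe("&#xE091;&#xE3C4;")
print(result)  # [('U+E091', 'Co'), ('U+E3C4', 'Co')]
```

The category "Co" stands for "Other, private use", which matches the P.S.: decoding works, but the resulting characters only render as something meaningful with the font that defines them.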
This question already has answers here:
How to get rid of double backslash in python windows file path string? [duplicate]
(5 answers)
Why do backslashes appear twice?
(2 answers)
Closed 2 years ago.
I am using Spyder. When I print the module search path using:
import sys
print(sys.path)
I get the following:
['C:\\Users\\james', 'C:\\Python\\Anaconda3\\python37.zip', 'C:\\Python\\Anaconda3\\DLLs', 'C:\\Python\\Anaconda3\\lib', 'C:\\Python\\Anaconda3', '', 'C:\\Python\\Anaconda3\\lib\\site-packages', 'C:\\Python\\Anaconda3\\lib\\site-packages\\win32', 'C:\\Python\\Anaconda3\\lib\\site-packages\\win32\\lib', 'C:\\Python\\Anaconda3\\lib\\site-packages\\Pythonwin', 'C:\\Python\\Anaconda3\\lib\\site-packages\\IPython\\extensions', 'C:\\Users\\james\\.ipython']
I am wondering why I get double backslashes, while the tutorial I am watching displays the paths with single forward slashes, e.g.
['C:/Users/james', ...
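The difference is that your tutorial runs on a non-Windows system, where paths look like this/is/path. On Windows, paths look like this\is\path. But in Python, as in most programming languages, \ (backslash) introduces escape sequences, so to write this\is\path in a string literal you need two backslashes. Printing a list shows the repr of each string, which is why every backslash appears doubled; the strings themselves contain single backslashes.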
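A small sketch of the point about escapes, using one of the paths from the question. It only shows that the doubled backslash belongs to the displayed form (repr), not to the string's contents.

```python
path = "C:\\Users\\james"   # two backslashes in the literal, one in the actual string

print(repr(path))   # 'C:\\Users\\james'  (what printing a list shows)
print(path)         # C:\Users\james      (the real contents)
print(len(path))    # 14

# A raw string literal spells the same value without doubling.
same = r"C:\Users\james"
print(path == same)  # True
```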
This question already has an answer here:
text with unicode escape sequences to unicode in python [duplicate]
(1 answer)
Closed 2 years ago.
I have a tab separated file written as following:
col_name cnt
\u7834\u6653\u5fae\u660e 8
\u9ed8\u8ba4 12
I use pandas.read_excel to read it into Python, and it displays the same escaped text.
How can I read data and derive the following? Thanks!
col_name cnt
破晓微明 8
默认 12
I am using Python 3.7.7 and pandas 1.0.4.
You need to decode the text with an appropriate codec. In this case we can use unicode-escape. But to decode the text, you first have to turn it into bytes.
col_name = r'\u7834\u6653\u5fae\u660e'
print(bytes(col_name, 'ascii').decode('unicode-escape'))
This will give you 破晓微明.
I don't think this can be done during the call to pandas.read_excel, but I'm no pandas expert. You might have to change the content of the column after reading the file.
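To show the "change the column after reading" idea without depending on pandas, here is a minimal sketch that uses only the standard library. The in-memory `raw` text stands in for the tab-separated file from the question, and `decode_escapes` is a helper name chosen for this example.

```python
import csv
import io

def decode_escapes(text):
    """Turn literal \\uXXXX sequences into the characters they name."""
    return text.encode("ascii").decode("unicode-escape")

# Stand-in for the file from the question: the backslashes are literal characters.
raw = "col_name\tcnt\n\\u7834\\u6653\\u5fae\\u660e\t8\n\\u9ed8\\u8ba4\t12\n"

rows = list(csv.DictReader(io.StringIO(raw), delimiter="\t"))
for row in rows:
    # Post-process the column after the data has been loaded.
    row["col_name"] = decode_escapes(row["col_name"])

for row in rows:
    print(row["col_name"], row["cnt"])
```

With pandas, the same function could presumably be applied to the loaded column, for example through Series.map. Note that encoding to ASCII assumes the column holds only escape sequences and plain ASCII text.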
This question already has answers here:
Decode escaped characters in URL
(5 answers)
Closed 5 years ago.
How to make this string readable in Python 2.7?
%D0%9A%D0%BE%D0%BD%D1%86%D0%B5%D0%BF%D1%86%D0%B8%D1%8F_%D0%A4%D0%B5%D0%B4%D0%B5%D1%80%D0%B0%D0%BB%D1%8C%D0%BD%D0%BE%D0%B9_%D1%86%D0%B5%D0%BB%D0%B5%D0%B2%D0%BE%D0%B9_%D0%BF%D1%80%D0%BE%D0%B3%D1%80%D0%B0%D0%BC%D0%BC%D1%8B_%D1%80%D0%B0%D0%B7%D0%B2%D0%B8%D1%82%D0%B8%D1%8F_%D0%BE%D0%B1%D1%80%D0%B0%D0%B7%D0%BE%D0%B2%D0%B0%D0%BD%D0%B8%D1%8F_%D0%BD%D0%B0_2016-2020_%D0%B3%D0%B3
This string contains Cyrillic characters and is part of a URL (a query string parameter).
Use urllib.unquote from the standard library.
urllib.unquote(string)
Replace %xx escapes by their single-character equivalent.
Example: unquote('/%7Econnolly/') yields '/~connolly/'.
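For readers on Python 3, the function lives in urllib.parse instead. The sketch below is not the Python 2.7 call from the answer but its Python 3 counterpart, which decodes the percent-escaped bytes as UTF-8 by default. Only the first word of the string from the question is used, to keep the example short.

```python
from urllib.parse import unquote

# First word of the string from the question.
encoded = "%D0%9A%D0%BE%D0%BD%D1%86%D0%B5%D0%BF%D1%86%D0%B8%D1%8F"

decoded = unquote(encoded)   # percent-escapes are interpreted as UTF-8 bytes
print(decoded)               # Концепция

print(unquote("/%7Econnolly/"))  # /~connolly/
```

In Python 2.7, urllib.unquote on a byte string returns bytes, so an additional .decode('utf-8') is typically needed to get a unicode object.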
This question already has answers here:
Convert a Unicode string to a string in Python (containing extra symbols)
(12 answers)
Closed 7 years ago.
I will give an example from Turkish: "şğüı" becomes "sgui".
I'm sure each language has its own conversion rules, and sometimes a character might be converted to multiple ASCII characters, like "alpha"/"phi" etc.
Is there a library or method that performs this conversion?
What you are asking is called transliteration.
Try the Unidecode library.
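For the Turkish example specifically, a standard-library approximation is possible. This is not how Unidecode works, just a sketch under the assumption that stripping diacritics is enough: it decomposes each character with NFKD and drops the combining marks. The `EXTRA` table and the function name are made up for this example; the table is needed because the dotless ı has no Unicode decomposition.

```python
import unicodedata

# Hypothetical fallback table for letters with no Unicode decomposition.
EXTRA = {"ı": "i"}

def strip_diacritics(text):
    """Approximate transliteration: decompose, then drop combining marks."""
    out = []
    for ch in text:
        ch = EXTRA.get(ch, ch)
        decomposed = unicodedata.normalize("NFKD", ch)
        out.append("".join(c for c in decomposed if not unicodedata.combining(c)))
    return "".join(out)

print(strip_diacritics("şğüı"))  # sgui
```

Characters from other scripts, such as Greek letters, pass through unchanged here, so a dedicated transliteration library is still the better fit for the general case.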