When I open a file with multilingual contents, the Arabic text does not render correctly. I set the encoding to UTF-8, but it did not help. How do you solve this?
On Windows, the process for getting code to display correctly includes all of the following:
Include a comment of the form # coding: utf-8 on the second line of each source file; the first line should really be the shebang, #!/usr/bin/env python.
Ensure that you have a font installed with good Unicode support, including the character range you are going to be using. Consolas (the default for Wing IDE) is usually a good choice but, AFAIK, does not include the full Arabic character ranges; Lucida Console should provide this.
Ensure that Wing IDE has Edit -> Preferences -> User Interface -> Fonts -> Editor Font/Size set to the selected font, i.e. Lucida Console.
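The first two lines of each source file would then look like this (a minimal sketch; the file itself must actually be saved as UTF-8 for the declaration to be honest):

```python
#!/usr/bin/env python
# -*- coding: utf-8 -*-

# Arabic text in a source file that is actually saved as UTF-8.
label = u'مرحبا بالعالم'
print(label)
```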
I am making a program involving ANSI escape codes, and they were all working fine on replit until I switched back to IDLE (3.9 on both). But it doesn't work:
it should have looked like this:
I have seen several posts before that complain that the IDLE doesn't support these escape sequences because it isn't a terminal, so I tried to do it directly from the cmd but the beastly symbol still appeared, this time as a boxed question mark:
I know that it won't work straight from IDLE, so I wonder: can you import software like mintty into Python?
Powershell works though...
P.S. please don't tell me to import colorama or something! I really want this to be the way. I also don't have immediate access to iPython (even though I would like to) so it's not really an option for me... unless I have to :D
EDIT: here is the code I used in all of the Python programs:
import sys, os
os.system("")
CSI = f"{chr(0x1B)}["
print(f"""{CSI}3m{CSI}1m{CSI}4m{CSI}31m
look at this""")
sys.stdout.flush()
# I put the sys.stdout.flush() and os.system("") to try and fix the problem...
The IDLE shell is not a terminal emulator and is not intended to be a production environment. The decision so far has been that it should show program developers their program's output, without interpretation. This may change in the future, but no final decision has been made yet.
If you are on Windows, I believe its console can be put into ANSI mode, but Python does not do that for you. I don't know whether program code can do so.
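For what it's worth, one documented Win32 route is the SetConsoleMode call with the ENABLE_VIRTUAL_TERMINAL_PROCESSING flag, reachable from Python via ctypes. A sketch (only meaningful on Windows; it is a no-op elsewhere):

```python
import ctypes
import os

# Documented Win32 console-mode flag for ANSI/VT escape processing.
ENABLE_VIRTUAL_TERMINAL_PROCESSING = 0x0004

def enable_vt_mode():
    """Try to switch the Windows console into ANSI/VT mode.

    Returns True on success, and trivially True on non-Windows
    systems, where ANSI escapes normally work already.
    """
    if os.name != 'nt':
        return True
    kernel32 = ctypes.windll.kernel32
    handle = kernel32.GetStdHandle(-11)  # STD_OUTPUT_HANDLE
    mode = ctypes.c_uint32()
    if not kernel32.GetConsoleMode(handle, ctypes.byref(mode)):
        return False
    return bool(kernel32.SetConsoleMode(
        handle, mode.value | ENABLE_VIRTUAL_TERMINAL_PROCESSING))

if enable_vt_mode():
    print('\x1b[31mred text\x1b[0m')
```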
As near as I can tell, there is no wrapper program that turns mintty into an importable Python module, and it would not make much sense. Rather, you would want to open mintty or a mintty-based terminal emulator, such as Git Bash, and run Python inside that terminal instead of Command Prompt.
"ANSI" is also a broad term for the Windows code pages, and you have to specify which code page you are using. For example, my Windows is Chinese Simplified, so if I want to escape Python's UTF-8 default I put # coding: cp936 on the first or second line of a script; Python then reads the script itself in that encoding (for reading and writing text files, pass encoding='cp936' to open instead).
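A minimal sketch of the file-I/O side of this, writing and reading a Simplified-Chinese text file explicitly as cp936/GBK (the file name and temp directory are arbitrary):

```python
import os
import tempfile

text = '接口文档'  # "interface documentation" in Simplified Chinese

# Write and then read back a text file explicitly encoded as cp936/GBK.
path = os.path.join(tempfile.mkdtemp(), 'doc.txt')
with open(path, 'w', encoding='cp936') as f:
    f.write(text)
with open(path, encoding='cp936') as f:
    round_tripped = f.read()
print(round_tripped)
```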
Second Question:
Could I make a red/green/etc. font for every character and use it with print('...', file=...)? It should be possible, because colored emoji exist.
It should work, but I would like to know how I could (if it is possible) automate this with a program that generates a file containing those characters and displays them with the previous print statement.
Cheers!
Hello, I just started to learn Python yesterday, and today the guy in the video started with
print("hello")
Then he executed it without a problem, but when I try to do it, I get the following error message:
SyntaxError: Non-UTF-8 code starting with '\xff' in file
c:/Users/user/desktop/anan_bruh/bune.py on line 1, but no encoding
declared; see http://python.org/dev/peps/pep-0263/ for details
You need to ensure that your file is saved in an encoding Python can read, such as UTF-8.
If you are using Windows, try a program such as Notepad++, PFE, or any other text editor, and make sure the file is saved as ASCII, ANSI, Latin-1, or UTF-8.
If you are using Windows Notepad, do not save the file as UTF-16; you can set the encoding in the "Save as..." dialog.
If you are using Microsoft Word, WordPad, or some other kind of word processor, the easiest solution is: don't do that! Instead, use Notepad, PFE, Notepad++, or any of the hundreds of other text editors available online.
I hope this helps you.
I just saw the additional comment about using Visual Studio. I do not have that setup, but nevertheless the answer is the same. If I save my little "hello world" script using UTF-16, then I get the exact same error as you.
If I save my file as ANSI or UTF-8, the error is resolved. No doubt there is a way to save the files with the correct encoding within Visual Studio, but maybe try my answer first, verify that the solution works and then try to figure out how to do the same thing in Visual Studio.
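Incidentally, the '\xff' in the error message is the first byte of a UTF-16 little-endian byte-order mark, which is exactly why UTF-16 files trigger it. A small hypothetical helper to check which BOM, if any, a file starts with:

```python
# Map of common byte-order marks to codec names.
BOMS = {
    b'\xef\xbb\xbf': 'utf-8-sig',
    b'\xff\xfe': 'utf-16-le',
    b'\xfe\xff': 'utf-16-be',
}

def sniff_bom(path):
    """Return the encoding implied by the file's BOM, or None."""
    with open(path, 'rb') as f:
        head = f.read(3)
    for bom, name in BOMS.items():
        if head.startswith(bom):
            return name
    return None
```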
I believe this is a common issue with the default character encodings on Linux and Windows. However, after searching the internet I have not found any easy way to fix it automatically, so I am about to write a script to do it.
Here is the scenario:
I created some files on a Windows system, some with non-English names (Chinese, specifically, in my case), and compressed them into a zip file using 7-Zip. After that I downloaded the zip file to a Linux system (Ubuntu 16.04 LTS) and extracted the files with the default archive program. As I might have guessed, all the non-English file names are now displayed as corrupted characters! At first I thought it would be easy with convmv, but...
I tried convmv, and it says: "Skipping, already utf8". Nothing got changed.
Then I decided to write a tool in Python to do the dirty job, but after some testing I came to a point where I could not associate the original file names with the corrupted ones (except by hashing the contents).
Here is an example. I set up a web server to list the file names on Windows, and one file name, after being encoded with "gbk" in Python, is displayed as
u'j\u63a5\u53e3\u6587\u6863'
And I can query the file names on my Linux system. If I create a file directly with the name shown above, the name is CORRECT. I can also encode the Unicode GBK string to UTF-8 and create a file, and that name is also CORRECT. (Thus I cannot do both at the same time, since they are indeed the same name.) But when I read the name of the file I extracted earlier, which should be the same file, the file name is completely different:
'j\xe2\x95\x9c\xe2\x95\x99.....'
Decoding it with UTF-8 gives something like u'j\u255c\u2559...'. Decoding it with GBK raises a UnicodeDecodeError, and I also tried to decode it with UTF-8 and then encode with GBK, but the result is still something else.
To summarize: I cannot recover the original file name by decoding or encoding it after it was extracted on the Linux system. If I really want a program to do the job, I have to either re-create the archive, possibly with some encoding options, or go with my script but use a content hash (like MD5 or SHA-1) to determine each file's original name on Windows.
Do I still have any chance to infer the original name from a Python script in the above case, other than comparing file contents between the two systems?
With a little experimentation with common encodings, I was able to reverse your mojibake:
>>> bad = 'j\xe2\x95\x9c\xe2\x95\x99\xe2\x94\x90\xe2\x94\x8c\xe2\x95\xac\xe2\x94\x80\xe2\x95\xa1\xe2\x95\xa1'
>>> good = bad.decode('utf8').encode('cp437').decode('gbk')
>>> good
u'j\u63a5\u53e3\u6587\u6863'  # u'j接口文档'
gbk - common Chinese Windows encoding
cp437 - common US Windows OEM console encoding
utf8 - common Linux encoding
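For reference, the same round trip on Python 3, where the raw file name has to be handled as bytes rather than str:

```python
# The corrupted file name as raw bytes (Python 3).
bad = (b'j\xe2\x95\x9c\xe2\x95\x99\xe2\x94\x90\xe2\x94\x8c'
       b'\xe2\x95\xac\xe2\x94\x80\xe2\x95\xa1\xe2\x95\xa1')

# Undo the mojibake: the original GBK bytes were misread as cp437,
# and that misreading was then stored as UTF-8.
good = bad.decode('utf-8').encode('cp437').decode('gbk')
print(good)  # j接口文档
```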
I have a Logitech 360 keyboard, with which I am trying to code Python on the Raspberry Pi B. The apostrophe key produces a slanted quote, instead of the 'vertical' single quote, and this causes syntax errors in code (the same code runs perfectly when I paste in a snippet from the browser, which is the only way I can find to produce the correct flavor of apostrophe).
The syntax error is "Non-ASCII character '\xc2' in file '---' on line X, but no encoding declared; see http://www.python.org/peps/pep-0263.html for details." The webpage suggests declaring a character encoding at the beginning of the script, but it didn't work for me, and in any case, I would rather not have to have it at the beginning of every script - I just want the keyboard to produce the correct character to begin with. I have fiddled with the keyboard config international settings; nothing works. It's driving me nuts.
You cannot use "\xc2" (the first byte of a UTF-8-encoded slanted quote) as a quote character without redefining the quote character in the Python source grammar and recompiling Python (really, your problems extend beyond even this).
You probably can change which character your Logitech keyboard produces as a quote.
You may want to check your Internationalisation Options by running
sudo raspi-config
Choose Option 4 - Internationalization Options
and then Option I3 - Change Keyboard Layout
Go through and check your settings and then try your keyboard again.
I'm relatively new to programming and recently I've started playing around with pygame (set of modules to write games in python). I'm looking to create a program/game in which some of the labels, strings, buttons, etc are in Arabic. I'm guessing that pygame has to support Arabic letters and it probably doesn't? Or could I possibly use another GUI library that does support Arabic and use that in unison with pygame? Any direction would be much appreciated!
Well, Python itself uses Unicode for everything, so that's not the problem. Quick googling also shows that PyGame should be able to render Unicode fonts just fine. So I assume the problem is rather that it can't find fonts for the specific language to use for rendering.
Here is a short example for PyGame, and especially this link should be useful.
That is the important part: specifying a font that can render your language and using it to render the text should work fine. It is probably a good idea to write a small wrapper.
NB: I haven't used PyGame myself, so this is based on speculation and a quick search about how PyGame renders fonts.
PS: If you want the game to work reliably for all of your users, it is probably a good idea to include an open-source font in your release; otherwise you need some way to check whether the user has a suitable font installed - probably a non-trivial problem if you want cross-platform support.
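A minimal sketch along those lines (assumes pygame is installed; the font name "dejavusans" is a guess, and SysFont falls back to a default font if it is missing). Note one caveat: pygame's built-in font module does not perform Arabic shaping or right-to-left layout, so for properly joined Arabic you would typically also run the string through libraries such as arabic_reshaper and python-bidi:

```python
import pygame

pygame.font.init()  # the font module works without opening a window

# "dejavusans" is an assumption -- substitute any font with Arabic glyphs.
font = pygame.font.SysFont('dejavusans', 32)

# render() accepts any Unicode string and returns a Surface.
surface = font.render('مرحبا', True, (255, 255, 255))
print(surface.get_size())
```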
Python does support unicode-coded source.
Set the coding of your source file to the right type with a line of the form # coding: [yourCoding] at the very beginning of your file. I think # coding: utf-8 works for Arabic.
Then prepend your string literals with a u, like so:
u'アク'
(Sorry if you don't have a Japanese font installed, it's the only one I had handy!)
This makes python treat them as unicode characters. There's further information specific to Arabic on this site.
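Putting the coding line and the u prefix together for Arabic (a minimal sketch; the file must actually be saved in the declared encoding):

```python
# -*- coding: utf-8 -*-

# The u prefix marks a Unicode literal (required on Python 2,
# harmless on Python 3).
greeting = u'مرحبا'
print(greeting)
```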
Both previous answers are great.
There is also a handy built-in Python function called unicode (Python 2 only; in Python 3, str is already Unicode). It is very easy to use.
I wanted to write Hebrew text, so I wrote a function:
def hebrew(text):
    # Encoding that supports Hebrew + punctuation marks
    return unicode(text, "Windows-1255")
Then you can call it using:
hebrew("<Hebrew text>")
And it will return your text Hebrew encoded.
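On Python 3 the unicode builtin is gone; the equivalent is to decode bytes. A sketch, where the sample bytes spell "shalom" in Windows-1255:

```python
def hebrew(raw_bytes):
    # Windows-1255 covers Hebrew letters and punctuation marks.
    return raw_bytes.decode('windows-1255')

print(hebrew(b'\xf9\xec\xe5\xed'))  # שלום
```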