How to detect the right font to use depending on the language

For a program of mine, I have a database full of street names (from GIS data) in Unicode. The user selects any part of the world he wants to see (using OpenStreetMap, Google Maps or whatever), and my program displays every selected street, using a nice font to show its name. As you may know, not every font can display non-Latin characters, and it gives me headaches. I wonder how to tell my program "if this word is written in Chinese, then use a Chinese font".
EDIT: I forgot to mention that I want to use non-standard fonts. Arial, Courier and some others can display non-Latin words, but I want to use other fonts (I have a specific font for Chinese, another one for Japanese, another one for Arabic...). I just need to know which font to choose depending on the word I want to write.

You need information about the language of the text.
Once you have decided which fonts you want, you map each language to a font.
If you try to pick the font automatically from the characters alone, it does not work. The fonts for Japanese, Traditional Chinese, and Simplified Chinese look different even for the same character. The text might be intelligible, but a native reader would be able to tell (OK, complain) that the font is wrong.
Plus, if you do anything algorithmically, there is no way to take the aesthetic side into account (for instance, the fact that you don't like Arial :-)
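A minimal sketch of that mapping, assuming each street name in the database carries a language tag. The font paths, the tag names, and both function names are hypothetical placeholders. The second helper shows how much can be inferred from the characters alone, and why that is not enough: Han characters look the same to the code whether the text is Chinese or Japanese.

```python
import unicodedata

# Hypothetical font files; substitute the fonts you actually ship.
FONT_BY_LANGUAGE = {
    "zh-Hans": "fonts/MySimplifiedChinese.ttf",
    "zh-Hant": "fonts/MyTraditionalChinese.ttf",
    "ja": "fonts/MyJapanese.ttf",
    "ar": "fonts/MyArabic.ttf",
}
DEFAULT_FONT = "fonts/MyLatin.ttf"


def font_for_language(language_tag):
    """Return the font mapped to a language tag, or the default font."""
    return FONT_BY_LANGUAGE.get(language_tag, DEFAULT_FONT)


def rough_script(text):
    """Guess a script from Unicode character names (a heuristic, not a language detector)."""
    names = [unicodedata.name(ch, "") for ch in text]
    if any(n.startswith(("HIRAGANA", "KATAKANA")) for n in names):
        return "kana"    # kana implies Japanese
    if any(n.startswith("ARABIC") for n in names):
        return "arabic"  # the script, not necessarily the Arabic language
    if any(n.startswith("CJK UNIFIED IDEOGRAPH") for n in names):
        return "han"     # Chinese or Japanese; the characters alone cannot say
    return "other"
```

A street name written only in Han characters comes back as "han", so the language tag stored with the data still has to decide between the Chinese and Japanese fonts.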

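Use UTF-8 text and a font with broad glyph coverage. Arial and Verdana on Windows often appear to handle everything because the system can substitute glyphs from other installed fonts when a character is missing; no single font actually contains glyphs for every character. Where that fallback is available, it bypasses the detection problem.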

Related

Editing text in PyCharm

I started using PyCharm 2 this year and it's working well for me. The only problem is that when I add comments, they get lost in the sheer amount of code.
Is there a way to add any text formatting to only certain parts of my programs? Like increasing the font of the comments that separate different practice questions, or even just bolding some of my comments to make the sections stand out more?
I know you can change the text for the entire file, but I want some diversity so it's a bit easier for other people (and sometimes myself) to read.

Go to Preferences->Editing->Color Scheme. From there you can change the way all the different types of program elements are displayed. You can use colors, bold, and italic to highlight things differently. I don't think you can assign different fonts or sizes, though.

Go to File -> Settings -> Editor -> Color scheme
Comments are under "Language Defaults"

Python: How to specify and view high-numbered Unicode characters?

The Unicode character U+1d134 is the musical symbol for "common time"; it looks like a capital 'C'.
But using Python 3.6, when I specify '\U0001d134' I get a glyph that seems to indicate an unknown symbol. On my Mac, it looks like a square with a question-mark in it.
Is the inability to display the corresponding glyph simply a font limitation, or is it something else? (Like maybe something I'm doing wrong....)
For clarity, I want to use this and other such symbols in an app I'm writing, and would like to find out if there's a way to do this.

The problem lies not in your code but in your local system. You don't have any font installed that contains the character 𝄴 "MUSICAL SYMBOL COMMON TIME".
That is also the reason none of your browsers can display it. Browsers are usually quite good at hunting down any installed font that can display a given character; they all fail here because, as said above, no such font is installed.
But, as it happened,
>>> print('\U0001d134')
𝄴
worked for me and displayed the glyph.
I pasted it into UnicodeChecker, which helpfully listed 'all' fonts that contain this character: only one, Bravura. It's an Open Source font so go ahead and download it. (Be careful to follow proper procedures if you want to distribute it along with your app.)
To think that I only had that font installed because of an earlier SO question.
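To check that the code itself is fine regardless of which fonts are installed, you can inspect the character instead of looking at the rendered glyph. This is a small sketch using only the standard library; the helper name describe is made up for illustration.

```python
import unicodedata

# Three equivalent ways to write U+1D134 in Python 3.
from_escape = "\U0001d134"                    # 8-digit \U escape
from_chr = chr(0x1D134)                       # code point as an integer
from_name = "\N{MUSICAL SYMBOL COMMON TIME}"  # official Unicode character name


def describe(ch):
    """Return 'U+XXXX NAME' for a single character."""
    return "U+{:04X} {}".format(ord(ch), unicodedata.name(ch, "<unnamed>"))


print(describe(from_escape))
```

If the name and code point come out right, the string is correct and the missing glyph is purely a font problem.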

How to change the characters used for QTextOption.ShowTabsAndSpaces?

Is there a way to change which character is used for Qt's QTextOption.ShowTabsAndSpaces flag?
I find that the default character that's used for viewing whitespace (specifically spaces) stands out a little too much. I'd like to change the font or character used so that it's less distinct.
It looks like the character used is unicode "Middle Dot", · (U+00B7) and I'd like to use, say, U+02D1 ˑ.
Ideally I'd like to be able to set it to whatever the user wants.
I've been searching through the Qt docs and have only been able to find how to turn this flag on.
EDIT:
I guess I should show some code... Here's how I'm currently adding the whitespace indicators:
opts = self.document().defaultTextOption()
opts.setFlags(opts.flags() | QTextOption.ShowTabsAndSpaces)
self.document().setDefaultTextOption(opts)
Running Python 3.4 and PyQt4, but should be able to port C++ code over.
EDIT2:
Thanks to Andrei Shikalev's answer below, I've posted a feature request for this on the QT tracker: https://bugreports.qt.io/browse/QTBUG-46072

Currently you cannot change the characters used for tabs and spaces. These characters are hardcoded in the Qt source for QTextLayout:
QChar visualTab(0x2192);
...
QChar visualSpace((ushort)0xb7);
More information is in the source for QTextLayout on GitHub.
You can file a feature request for custom tab and space characters. IMHO this feature would be useful for custom-looking editors based on Qt.

Efficiently applying text widget tags in tkinter text widgets

I'm trying to implement syntax highlighting for text in a Text widget.
I use an external library to lexically analyse the text and give me the tokenised text. After that I go over all the words in the text and apply a tag to each word's position in the text widget so that I can style it.
My concern now is how to deal with changes. Every time the user presses a key, do I tokenise the entire text again and add style tags to the text widget for the entire text? This is proving to be quite slow.
I then switched to running the highlighting process only for the line containing the insert cursor, to make it faster, but this gives faulty results and the highlighting is no longer perfect.
What would be an ideal compromise between fast and perfect? What is the best way to do this?

One possible answer is to do something like Idle does. As a user hits each key, its custom incremental parser tags identifiers that are a keyword, builtin, or def/class name. It also tags delimited char sequences that are a string or comment. It does what can be done quickly enough.
For example, if one types printer not in a string or comment, Idle checks if the word is a keyword or builtin name after each key. After t is hit, print is tagged. After e (or any other identifier char) is entered, printe is untagged.
I believe some of the code is in idlelib/Hyperparser.py and some in ColorDelegator.py. You are free to copy and adapt code, but please do not use it directly, as the API may change. I presume the parser does the minimum needed according to the current state (after def/class, in an identifier, comment, string, etc.)
Idle has an re-based function to retag the entire file. I 'think' this is separate from the incremental colorizer, but I have not read all the relevant code. If one edits a long enough file, such as idlelib/EditorWindow.py (about 3000 lines), and changes font size, Idle retags the file (I am not sure why). There is a noticeable delay between the file turning all black and its being recolorized. You definitely would not want that delay with each keystroke.
Class/function names with non-ASCII chars are not yet properly recognized in 3.x, but should be. The patch is stuck on deciding on the fastest accurate method. Recognizing an ASCII-only (2.x) identifier is trivial by comparison.
PS Am I correct in guessing that you are interested in tagging something other than Python code?
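To make the per-keystroke idea concrete, here is a rough sketch of retagging only the current line. It is not Idle's code: the function names and the "keyword" tag name are made up, and it ignores strings and comments, which a real highlighter has to track across lines. The widget argument is assumed to be a tkinter Text widget; only its get, tag_remove and tag_add methods are used.

```python
import builtins
import keyword
import re

IDENTIFIER = re.compile(r"[A-Za-z_]\w*")
BUILTIN_NAMES = set(dir(builtins))


def keyword_spans(line):
    """Return (start, end) column pairs for keywords and builtin names in one line."""
    return [
        m.span()
        for m in IDENTIFIER.finditer(line)
        if keyword.iskeyword(m.group()) or m.group() in BUILTIN_NAMES
    ]


def retag_line(text_widget, line_no, tag="keyword"):
    """Clear and reapply `tag` on a single line of a tkinter Text widget."""
    start, end = "{}.0".format(line_no), "{}.end".format(line_no)
    text_widget.tag_remove(tag, start, end)
    for col_start, col_end in keyword_spans(text_widget.get(start, end)):
        text_widget.tag_add(tag,
                            "{}.{}".format(line_no, col_start),
                            "{}.{}".format(line_no, col_end))
```

One way to wire it up is to bind a "<KeyRelease>" handler that reads the current line number from text_widget.index("insert") and calls retag_line for that line. Because whole identifiers are matched, print is tagged and printe is not, which mirrors the behaviour described above.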

how to use a non-european language with a python library

I'm relatively new to programming and recently I've started playing around with pygame (set of modules to write games in python). I'm looking to create a program/game in which some of the labels, strings, buttons, etc are in Arabic. I'm guessing that pygame has to support Arabic letters and it probably doesn't? Or could I possibly use another GUI library that does support Arabic and use that in unison with pygame? Any direction would be much appreciated!

Well, Python itself uses Unicode for everything, so that's not the problem. A quick googling also shows that PyGame should be able to render Unicode fonts just fine. So I assume the problem is more that it can't find a font for the specific language to use for rendering.
Here is a short example for PyGame and especially this link should be useful.
This is the important library, so specifying a font that can render your language and using it to render the text should work fine. It is probably a good idea to write a small wrapper.
NB: I haven't used PyGame myself, so this is based on speculation and some quick searching about how PyGame renders fonts.
PS: If you want the game to work reliably for all of your users, it's probably a good idea to include an Open Source font in your release. Otherwise you need some way to check whether the user has a font installed that will work, which is probably a non-trivial problem if you want cross-platform support.

Python does support unicode-coded source.
Set the coding of your source file to the right type with a line of the form # coding: [yourCoding] at the very beginning of your file. I think # coding: utf-8 works for Arabic.
Then prepend your string literals with a u, like so:
u'アク'
(Sorry if you don't have a Japanese font installed, it's the only one I had handy!)
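This makes Python treat them as Unicode strings rather than byte strings. (In Python 3 every string literal is already Unicode, so the u prefix is optional.) There's further information specific to Arabic on this site.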

Both previous answers are great.
Python 2 also has a great built-in function called unicode. It's very easy to use.
I wanted to write Hebrew text, so I wrote a function:
def hebrew(text):
    # Python 2 only: decode a Windows-1255 byte string
    # (Hebrew + punctuation marks) into a unicode object
    return unicode(text, "Windows-1255")
Then you can call it using:
hebrew("<Hebrew text>")
And it will return your Hebrew text as a unicode string.
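Python 3 has no unicode built-in. A minimal sketch of the equivalent, assuming the input arrives as Windows-1255 bytes, is to call decode on the bytes object. The name hebrew_py3 is just illustrative.

```python
def hebrew_py3(raw_bytes):
    """Decode Windows-1255 bytes (Hebrew + punctuation marks) into a str."""
    return raw_bytes.decode("windows-1255")


# Round trip: encode a Hebrew str to Windows-1255 bytes, then decode it back.
sample = "שלום"
encoded = sample.encode("windows-1255")
decoded = hebrew_py3(encoded)
```

If the text is already a str (for example a literal in a UTF-8 source file), no conversion is needed at all.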
