How to determine if a Glyph can be displayed?

I have a large list of Unicode icons that I want to display. However, I would like to hide/skip any icon that I cannot display (because I don't have the correct font installed). Is there a programmatic way to determine this?

There's nothing built into Python for this. However, you can use the fonttools module to inspect the character maps of the installed fonts, e.g. as follows (tested on Windows 10):
# ToDo: find fallback font
# ToDo: reverse algorithm (font => characters) instead of (character => fonts)
# ToDo: check/print merely basic font (omit variants like Bold, Light, Condensed, …)
import unicodedata
import sys
import os
from fontTools.ttLib import TTFont, TTCollection
fontsPaths = []
fontcPaths = []
fontsdirs = [ os.path.join( os.getenv('SystemRoot'), 'Fonts') # r"c:\Windows\Fonts"
            , r"D:\Downloads\MathJax-TeX-fonts-otf"
          # , os.path.join( os.getenv('LOCALAPPDATA'), r'Microsoft\Windows\Fonts')
            ]
print(fontsdirs, file=sys.stderr)
for fontsdir in fontsdirs:
    for root,dirs,files in os.walk( fontsdir ):
        for file in files:
            if file.endswith(".ttf") or file.endswith(".otf") or file.endswith(".ttc"):
                tfile = os.path.join(root,file)
                if file.endswith(".ttc"):
                    fontcPaths.append(tfile)
                else:
                    fontsPaths.append(tfile)
# print( len(fonts), "fonts", fontsdir)
def char_in_font(unicode_char, font):
    for cmap in font['cmap'].tables:
        if cmap.isUnicode() or cmap.getEncoding() == 'utf_16_be':
            if ord(unicode_char) in cmap.cmap:
                # print(type(cmap))
                auxcn = cmap.cmap[ord(unicode_char)]
                # print(auxcn, type(auxcn))
                return auxcn if auxcn != '' else '<nil>'
    return ''
def checkfont(char,font,fontdict,fontpath):
    nameID_index = 1 # works generally (not always)
    for i,f in enumerate(font['name'].names):
        # An Introduction to TrueType Fonts: A look inside the TTF format
        # https://scripts.sil.org/cms/scripts/page.php?site_id=nrsi&id=IWS-Chapter08
        # 1 = Font Family name, 2 = Font SubFamily name, 4 = Full font name
        if f.nameID == 1:
            nameID_index = i
            break
    fontname = font['name'].names[nameID_index].toStr()
    if fontname not in fontdict.keys():
        aux = char_in_font(char, font)
        if aux != '':
            fontdict[fontname] = "{} ({}) [{}] '{}' \t {} {}".format(
                char,
                '0x{:04x}'.format(ord(char)),
                aux,
                fontname, # string.decode('unicode-escape'),
                # '', ''
                'in', fontpath.split('\\')[-1]
                )
def testfont(char):
    fontdict = {}
    for fontpath in fontsPaths:
        font = TTFont(fontpath) # specify the path to the font
        checkfont(char,font,fontdict,fontpath)
    for fontpath in fontcPaths: # specify the path to the font collection
        fonts = TTCollection(fontpath)
        for ii in range(len(fonts)):
            font = TTFont(fontpath, fontNumber=ii) # fontfile and index
            checkfont(char,font,fontdict,fontpath)
    return fontdict.values()
def testprint(char):
    print('') # empty line for better readability
    print(char, ' 0x{:04x}'.format(ord(char)), unicodedata.name(char, '???'))
    fontarray = testfont(char)
    for x in fontarray:
        print(x)
if len(sys.argv) == 1:
    # sample output
    testprint(u"अ") # 0x0905 Devanagari Letter A
else:
    for i in range( 1, len(sys.argv) ):
        if len(sys.argv[i]) >=2:
            try:
                chars = chr(int(sys.argv[i])) # 0x042F or 1071
            except:
                try:
                    chars = chr(int(sys.argv[i],16)) # 042F
                except:
                    chars = (sys.argv[i].
                        encode('raw_unicode_escape').
                        decode('unicode_escape')) # ➕🐈\U00010A30\u042F\xFE
        else:
            chars = sys.argv[i] # Я (Cyrillic Capital Letter Ya)
        for char in chars:
            testprint(char)
Sample output (if called without arguments): .\FontGlyphs.py
['C:\\WINDOWS\\Fonts', 'D:\\Downloads\\MathJax-TeX-fonts-otf']
अ 0x0905 DEVANAGARI LETTER A
अ (0x0905) [uni0905] 'Nirmala UI' in Nirmala.ttf
अ (0x0905) [uni0905] 'Nirmala UI Semilight' in NirmalaS.ttf
अ (0x0905) [uni0905] 'Unifont' in unifont-8.0.01.ttf
अ (0x0905) [uni0905] 'Unifont CSUR' in unifont_csur-8.0.01.ttf
Another example: .\FontGlyphs.py 🐈
['C:\\WINDOWS\\Fonts', 'D:\\Downloads\\MathJax-TeX-fonts-otf']
🐈 0x1f408 CAT
🐈 (0x1f408) [u1F408] 'EmojiOne Color' in EmojiOneColor-SVGinOT.ttf
🐈 (0x1f408) [u1F408] 'Segoe UI Emoji' in seguiemj.ttf
🐈 (0x1f408) [u1F408] 'Segoe UI Symbol' in seguisym.ttf
FYI, I have written similar script that shows output (glyphs) rendered using appropriate fonts (using default browser…
Limitation: the script does not recognize emoji sequences; each code point of a sequence is looked up separately. For instance:
.\FontGlyphs.py 👍🏽
['C:\\WINDOWS\\Fonts', 'D:\\Downloads\\MathJax-TeX-fonts-otf']
👍 0x1f44d THUMBS UP SIGN
👍 (0x1f44d) [u1F44D] 'EmojiOne Color' in EmojiOneColor-SVGinOT.ttf
👍 (0x1f44d) [u1F44D] 'Segoe UI Emoji' in seguiemj.ttf
👍 (0x1f44d) [u1F44D] 'Segoe UI Symbol' in seguisym.ttf
🏽 0x1f3fd EMOJI MODIFIER FITZPATRICK TYPE-4
🏽 (0x1f3fd) [u1F3FD] 'EmojiOne Color' in EmojiOneColor-SVGinOT.ttf
🏽 (0x1f3fd) [u1F3FD] 'Segoe UI Emoji' in seguiemj.ttf
🏽 (0x1f3fd) [u1F3FD] 'Segoe UI Symbol' in seguisym.ttf
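For the original goal (hiding icons that no installed font covers), the per-font lookup above can be reduced to a set-membership test. The following is only a minimal sketch and not part of the script above: `load_codepoints` and `filter_displayable` are hypothetical helper names, and it assumes that fontTools' `getBestCmap()` is an acceptable source for a font's Unicode coverage. It checks code point coverage only; whether an emoji sequence is shaped into a single glyph depends on the font's substitution tables and on the renderer.

```python
def load_codepoints(font_path, font_number=0):
    """Return the set of code points mapped by one font file.

    Requires the third-party fontTools package (pip install fonttools).
    font_number selects a face inside a .ttc collection; as far as I know
    it is ignored for plain .ttf/.otf files.
    """
    from fontTools.ttLib import TTFont  # imported lazily: third-party
    font = TTFont(font_path, fontNumber=font_number)
    try:
        best = font["cmap"].getBestCmap()  # {code point: glyph name} or None
        return set(best or ())
    finally:
        font.close()


def filter_displayable(icons, codepoints):
    """Keep only the strings whose every code point is in `codepoints`."""
    return [s for s in icons if all(ord(ch) in codepoints for ch in s)]


# Usage with a hand-made coverage set (stands in for load_codepoints(...)).
coverage = {0x41, 0x1F408}
kept = filter_displayable(["A", "\U0001F408", "\u0905"], coverage)
```

In practice the coverage set would be the union of `load_codepoints(path)` over the fonts your display toolkit actually uses, which is an assumption you would need to check for your environment.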

You can use pywin32 to check for the required fonts.
import win32gui
def fontFamProc(font, tm, fonttype, names):
    names.append(font.lfFaceName)
    return True
fonts = []
deviceContext = win32gui.GetDC(None)
win32gui.EnumFontFamilies(deviceContext, None, fontFamProc, fonts)
win32gui.ReleaseDC(None, deviceContext) # ReleaseDC takes (hWnd, hDC)
print(fonts)

Well, you could simply print all of Unicode and find out that way. E.g. (I can print most, if not all, of it):
import io
with io.open("all_utf-8.txt", "w", encoding="utf8") as f:
    for n in range(150000):
        try:
            i = chr(n)
            if i.isprintable():
                print(f"{i}", end="", file=f)
            if n % 200 == 0:
                print(file=f)
        except UnicodeError:
            pass
(note the use of the built-in Python str method isprintable())

Related

How to display non Latin character on output text console correctly?

I'm new to Python. I'm experimenting with making an "IDE" that runs Python scripts which display non-Latin characters, especially Indonesian local scripts (Javanese, Balinese, Buginese, etc.) that are included in the Unicode Standard. Please consider installing the Indonesian Unicode fonts to render the characters correctly.
Here is the code:
from tkinter import *
from tkinter.filedialog import asksaveasfilename, askopenfilename
import subprocess
compiler = Tk()
compiler.title ('ᮊᮧᮙ᮪ᮕᮤᮜᮨᮁ ᮞᮥᮔ᮪ᮓ')
file_path = ''
def set_file_path(path):
    global file_path
    file_path = path
def save_as():
    if file_path == '':
        path = asksaveasfilename(filetypes=[('Python Files', '*.py')])
    else:
        path = file_path
    with open(path, 'w') as file:
        code = editor.get('1.0', END)
        file.write(code)
        set_file_path(path)
def open_file():
    path = askopenfilename(filetypes=[('Python Files', '*.py')])
    with open(path, 'r') as file:
        code =file.read()
        editor.delete('1.0', END)
        editor.insert('1.0', code)
        set_file_path(path)
def run():
    if file_path == '':
        save_prompt = Toplevel()
        text = Label(save_prompt, text='ᮞᮤᮙ᮪ᮕᮨᮔ᮪ ᮠᮩᮜ ᮊᮧᮓᮩ ᮃᮔ᮪ᮏᮩᮔ᮪')
        text.pack()
        return
    command = f'python {file_path}'
    process = subprocess.Popen(command, stdout=subprocess.PIPE, stderr=subprocess.PIPE, shell=True)
    output, error = process.communicate()
    code_output.insert('1.0', output)
    code_output.insert('1.0', error)
menu_bar = Menu(compiler)
file_menu = Menu(menu_bar, tearoff=0)
file_menu.add_command(label='ᮘᮥᮊ', command=open_file)
file_menu.add_command(label='ᮞᮤᮙ᮪ᮕᮨᮔ᮪', command=save_as)
file_menu.add_command(label='ᮞᮤᮙ᮪ᮕᮨᮔ᮪ ᮔᮥ ᮞᮦᮏᮦᮔ᮪', command=save_as)
file_menu.add_command(label='ᮊᮜᮥᮃᮁ', command=exit)
menu_bar.add_cascade(label='ᮘᮨᮁᮊᮞ᮪', menu=file_menu)
run_bar = Menu(menu_bar, tearoff=0)
run_bar.add_command(label='ᮏᮜᮔ᮪ᮊᮩᮔ᮪', command=run)
menu_bar.add_cascade(label='ᮏᮜᮔ᮪ᮊᮩᮔ᮪', menu=run_bar)
compiler.config(menu=menu_bar)
editor = Text()
editor.pack()
code_output = Text(height=10)
code_output.pack()
compiler.mainloop()
I want to print the non-Latin characters, for example:
print('Sundanese script')
print('ᮃᮊ᮪ᮞᮛ ᮞᮥᮔ᮪ᮓ')
unfortunately the result shows this:
Sundanese script
á®ƒá®Šá®ªá®žá®› á®žá®¥á®”á®ªá®“
or the code like this:
import unicodedata
u = chr(233) + chr(0x03B2) + chr(0xA991) + chr(0x1B11) + chr(0x1B91) + chr(0x1BE1)
for i, c in enumerate(u):
print(i, '%04x' % ord(c), unicodedata.category(c), u[i], end=" ")
print(unicodedata.name(c))
will display this:
0 00e9 Ll Ã© LATIN SMALL LETTER E WITH ACUTE
1 03b2 Ll Î² GREEK SMALL LETTER BETA
2 a991 Lo ê¦‘ JAVANESE LETTER KA MURDA
3 1b11 Lo á¬‘ BALINESE LETTER OKARA
4 1b91 Lo á®‘ SUNDANESE LETTER NYA
5 1be1 Lo á¯¡ BATAK LETTER CA
But the same unicodedata script displays correctly when run in the terminal:
>>> import unicodedata
>>>
>>> u = chr(233) + chr(0x03B2) + chr(0xA991) + chr(0x1B11) + chr(0x1B91) + chr(0x1BE1)
>>>
>>> for i, c in enumerate(u):
... print(i, '%04x' % ord(c), unicodedata.category(c), u[i], end=" ")
... print(unicodedata.name(c))
...
0 00e9 Ll é LATIN SMALL LETTER E WITH ACUTE
1 03b2 Ll β GREEK SMALL LETTER BETA
2 a991 Lo ꦑ JAVANESE LETTER KA MURDA
3 1b11 Lo ᬑ BALINESE LETTER OKARA
4 1b91 Lo ᮑ SUNDANESE LETTER NYA
5 1be1 Lo ᯡ BATAK LETTER CA
Can anyone point out the problem?
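The garbled output has the typical shape of UTF-8 bytes being interpreted with a single-byte encoding: `é` is two bytes in UTF-8 (0xC3 0xA9), which read as cp1252 come out as `Ã©`. In the IDE code above, `process.communicate()` returns bytes, and they are inserted into the Text widget without being decoded. What follows is a minimal sketch of one possible fix, assuming the child script should emit UTF-8: `run_script` is a hypothetical helper, and `PYTHONIOENCODING` is set so that the child's stdout is UTF-8 regardless of the system locale.

```python
import os
import subprocess
import sys


def run_script(path):
    """Run a Python script and return (stdout, stderr) as decoded str."""
    env = dict(os.environ, PYTHONIOENCODING="utf-8")  # force UTF-8 in the child
    result = subprocess.run(
        [sys.executable, path],   # argument list, so no shell quoting issues
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        encoding="utf-8",         # decode both pipes as UTF-8 text
        env=env,
    )
    return result.stdout, result.stderr


# How the mojibake arises: UTF-8 bytes decoded with the wrong codec.
garbled = "\u00e9".encode("utf-8").decode("cp1252")
```

With something like this, `run()` in the IDE could pass the returned strings to `code_output.insert(...)` instead of raw bytes.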

Run python script to replace betacode with greek letters LaTeX

I want to convert the betacode in an existing .tex-File to normal greek letters.
For example: I want to replace:
\bcode{lo/gos}
with simple:
λόγος
And so on for all other glyphs. Fortunately, there seems to be a Python script that is supposed to do just that. But, being completely inexperienced, I simply don't know how to run it.
Here is the code of the Python script:
# beta2unicode.py
#
# Version 2004-11-23
#
# James Tauber
# http://jtauber.com/
#
# You are free to redistribute this, but please inform me of any errors
#
# USAGE:
#
# trie = beta2unicodeTrie()
# beta = "LO/GOS\n";
# unicode, remainder = trie.convert(beta)
#
# - to get final sigma, string must end in \n
# - remainder will contain rest of beta if not all can be converted
class Trie:
    def __init__(self):
        self.root = [None, {}]
    def add(self, key, value):
        curr_node = self.root
        for ch in key:
            curr_node = curr_node[1].setdefault(ch, [None, {}])
        curr_node[0] = value
    def find(self, key):
        curr_node = self.root
        for ch in key:
            try:
                curr_node = curr_node[1][ch]
            except KeyError:
                return None
        return curr_node[0]
    def findp(self, key):
        curr_node = self.root
        remainder = key
        for ch in key:
            try:
                curr_node = curr_node[1][ch]
            except KeyError:
                return (curr_node[0], remainder)
            remainder = remainder[1:]
        return (curr_node[0], remainder)
    def convert(self, keystring):
        valuestring = ""
        key = keystring
        while key:
            value, key = self.findp(key)
            if not value:
                return (valuestring, key)
            valuestring += value
        return (valuestring, key)
def beta2unicodeTrie():
t = Trie()
t.add("*A", u"\u0391")
t.add("*B", u"\u0392")
t.add("*G", u"\u0393")
t.add("*D", u"\u0394")
t.add("*E", u"\u0395")
t.add("*Z", u"\u0396")
t.add("*H", u"\u0397")
t.add("*Q", u"\u0398")
t.add("*I", u"\u0399")
t.add("*K", u"\u039A")
t.add("*L", u"\u039B")
t.add("*M", u"\u039C")
t.add("*N", u"\u039D")
t.add("*C", u"\u039E")
t.add("*O", u"\u039F")
t.add("*P", u"\u03A0")
t.add("*R", u"\u03A1")
t.add("*S", u"\u03A3")
t.add("*T", u"\u03A4")
t.add("*U", u"\u03A5")
t.add("*F", u"\u03A6")
t.add("*X", u"\u03A7")
t.add("*Y", u"\u03A8")
t.add("*W", u"\u03A9")
t.add("A", u"\u03B1")
t.add("B", u"\u03B2")
t.add("G", u"\u03B3")
t.add("D", u"\u03B4")
t.add("E", u"\u03B5")
t.add("Z", u"\u03B6")
t.add("H", u"\u03B7")
t.add("Q", u"\u03B8")
t.add("I", u"\u03B9")
t.add("K", u"\u03BA")
t.add("L", u"\u03BB")
t.add("M", u"\u03BC")
t.add("N", u"\u03BD")
t.add("C", u"\u03BE")
t.add("O", u"\u03BF")
t.add("P", u"\u03C0")
t.add("R", u"\u03C1")
t.add("S\n", u"\u03C2")
t.add("S,", u"\u03C2,")
t.add("S.", u"\u03C2.")
t.add("S:", u"\u03C2:")
t.add("S;", u"\u03C2;")
t.add("S]", u"\u03C2]")
t.add("S#", u"\u03C2#")
t.add("S_", u"\u03C2_")
t.add("S", u"\u03C3")
t.add("T", u"\u03C4")
t.add("U", u"\u03C5")
t.add("F", u"\u03C6")
t.add("X", u"\u03C7")
t.add("Y", u"\u03C8")
t.add("W", u"\u03C9")
t.add("I+", U"\u03CA")
t.add("U+", U"\u03CB")
t.add("A)", u"\u1F00")
t.add("A(", u"\u1F01")
t.add("A)\\", u"\u1F02")
t.add("A(\\", u"\u1F03")
t.add("A)/", u"\u1F04")
t.add("A(/", u"\u1F05")
t.add("E)", u"\u1F10")
t.add("E(", u"\u1F11")
t.add("E)\\", u"\u1F12")
t.add("E(\\", u"\u1F13")
t.add("E)/", u"\u1F14")
t.add("E(/", u"\u1F15")
t.add("H)", u"\u1F20")
t.add("H(", u"\u1F21")
t.add("H)\\", u"\u1F22")
t.add("H(\\", u"\u1F23")
t.add("H)/", u"\u1F24")
t.add("H(/", u"\u1F25")
t.add("I)", u"\u1F30")
t.add("I(", u"\u1F31")
t.add("I)\\", u"\u1F32")
t.add("I(\\", u"\u1F33")
t.add("I)/", u"\u1F34")
t.add("I(/", u"\u1F35")
t.add("O)", u"\u1F40")
t.add("O(", u"\u1F41")
t.add("O)\\", u"\u1F42")
t.add("O(\\", u"\u1F43")
t.add("O)/", u"\u1F44")
t.add("O(/", u"\u1F45")
t.add("U)", u"\u1F50")
t.add("U(", u"\u1F51")
t.add("U)\\", u"\u1F52")
t.add("U(\\", u"\u1F53")
t.add("U)/", u"\u1F54")
t.add("U(/", u"\u1F55")
t.add("W)", u"\u1F60")
t.add("W(", u"\u1F61")
t.add("W)\\", u"\u1F62")
t.add("W(\\", u"\u1F63")
t.add("W)/", u"\u1F64")
t.add("W(/", u"\u1F65")
t.add("A)=", u"\u1F06")
t.add("A(=", u"\u1F07")
t.add("H)=", u"\u1F26")
t.add("H(=", u"\u1F27")
t.add("I)=", u"\u1F36")
t.add("I(=", u"\u1F37")
t.add("U)=", u"\u1F56")
t.add("U(=", u"\u1F57")
t.add("W)=", u"\u1F66")
t.add("W(=", u"\u1F67")
t.add("*A)", u"\u1F08")
t.add("*)A", u"\u1F08")
t.add("*A(", u"\u1F09")
t.add("*(A", u"\u1F09")
#
t.add("*(\A", u"\u1F0B")
t.add("*A)/", u"\u1F0C")
t.add("*)/A", u"\u1F0C")
t.add("*A(/", u"\u1F0F")
t.add("*(/A", u"\u1F0F")
t.add("*E)", u"\u1F18")
t.add("*)E", u"\u1F18")
t.add("*E(", u"\u1F19")
t.add("*(E", u"\u1F19")
#
t.add("*(\E", u"\u1F1B")
t.add("*E)/", u"\u1F1C")
t.add("*)/E", u"\u1F1C")
t.add("*E(/", u"\u1F1D")
t.add("*(/E", u"\u1F1D")
t.add("*H)", u"\u1F28")
t.add("*)H", u"\u1F28")
t.add("*H(", u"\u1F29")
t.add("*(H", u"\u1F29")
t.add("*H)\\", u"\u1F2A")
t.add(")\\*H", u"\u1F2A")
t.add("*)\\H", u"\u1F2A")
#
t.add("*H)/", u"\u1F2C")
t.add("*)/H", u"\u1F2C")
#
t.add("*)=H", u"\u1F2E")
t.add("(/*H", u"\u1F2F")
t.add("*(/H", u"\u1F2F")
t.add("*I)", u"\u1F38")
t.add("*)I", u"\u1F38")
t.add("*I(", u"\u1F39")
t.add("*(I", u"\u1F39")
#
#
t.add("*I)/", u"\u1F3C")
t.add("*)/I", u"\u1F3C")
#
#
t.add("*I(/", u"\u1F3F")
t.add("*(/I", u"\u1F3F")
#
t.add("*O)", u"\u1F48")
t.add("*)O", u"\u1F48")
t.add("*O(", u"\u1F49")
t.add("*(O", u"\u1F49")
#
#
t.add("*(\O", u"\u1F4B")
t.add("*O)/", u"\u1F4C")
t.add("*)/O", u"\u1F4C")
t.add("*O(/", u"\u1F4F")
t.add("*(/O", u"\u1F4F")
#
t.add("*U(", u"\u1F59")
t.add("*(U", u"\u1F59")
#
t.add("*(/U", u"\u1F5D")
#
t.add("*(=U", u"\u1F5F")
t.add("*W)", u"\u1F68")
t.add("*W(", u"\u1F69")
t.add("*(W", u"\u1F69")
#
#
t.add("*W)/", u"\u1F6C")
t.add("*)/W", u"\u1F6C")
t.add("*W(/", u"\u1F6F")
t.add("*(/W", u"\u1F6F")
t.add("*A)=", u"\u1F0E")
t.add("*)=A", u"\u1F0E")
t.add("*A(=", u"\u1F0F")
t.add("*W)=", u"\u1F6E")
t.add("*)=W", u"\u1F6E")
t.add("*W(=", u"\u1F6F")
t.add("*(=W", u"\u1F6F")
t.add("A\\", u"\u1F70")
t.add("A/", u"\u1F71")
t.add("E\\", u"\u1F72")
t.add("E/", u"\u1F73")
t.add("H\\", u"\u1F74")
t.add("H/", u"\u1F75")
t.add("I\\", u"\u1F76")
t.add("I/", u"\u1F77")
t.add("O\\", u"\u1F78")
t.add("O/", u"\u1F79")
t.add("U\\", u"\u1F7A")
t.add("U/", u"\u1F7B")
t.add("W\\", u"\u1F7C")
t.add("W/", u"\u1F7D")
t.add("A)/|", u"\u1F84")
t.add("A(/|", u"\u1F85")
t.add("H)|", u"\u1F90")
t.add("H(|", u"\u1F91")
t.add("H)/|", u"\u1F94")
t.add("H)=|", u"\u1F96")
t.add("H(=|", u"\u1F97")
t.add("W)|", u"\u1FA0")
t.add("W(=|", u"\u1FA7")
t.add("A=", u"\u1FB6")
t.add("H=", u"\u1FC6")
t.add("I=", u"\u1FD6")
t.add("U=", u"\u1FE6")
t.add("W=", u"\u1FF6")
t.add("I\\+", u"\u1FD2")
t.add("I/+", u"\u1FD3")
t.add("I+/", u"\u1FD3")
t.add("U\\+", u"\u1FE2")
t.add("U/+", u"\u1FE3")
t.add("A|", u"\u1FB3")
t.add("A/|", u"\u1FB4")
t.add("H|", u"\u1FC3")
t.add("H/|", u"\u1FC4")
t.add("W|", u"\u1FF3")
t.add("W|/", u"\u1FF4")
t.add("W/|", u"\u1FF4")
t.add("A=|", u"\u1FB7")
t.add("H=|", u"\u1FC7")
t.add("W=|", u"\u1FF7")
t.add("R(", u"\u1FE4")
t.add("*R(", u"\u1FEC")
t.add("*(R", u"\u1FEC")
# t.add("~", u"~")
# t.add("-", u"-")
# t.add("(null)", u"(null)")
# t.add("&", "&")
t.add("0", u"0")
t.add("1", u"1")
t.add("2", u"2")
t.add("3", u"3")
t.add("4", u"4")
t.add("5", u"5")
t.add("6", u"6")
t.add("7", u"7")
t.add("8", u"8")
t.add("9", u"9")
t.add("#", u"#")
t.add("$", u"$")
t.add(" ", u" ")
t.add(".", u".")
t.add(",", u",")
t.add("'", u"'")
t.add(":", u":")
t.add(";", u";")
t.add("_", u"_")
t.add("[", u"[")
t.add("]", u"]")
t.add("\n", u"")
return t
t = beta2unicodeTrie()
import sys
for line in file(sys.argv[1]):
    a, b = t.convert(line)
    if b:
        print a.encode("utf-8"), b
        raise Exception
    print a.encode("utf-8")
And here is a little .tex-file with which it should work.
\documentclass[12pt]{scrbook}
\usepackage[polutonikogreek, ngerman]{babel}
\usepackage[ngerman]{betababel}
\usepackage{fontspec}
%\defaultfontfeatures{Ligatures=TeX}
%\newfontfeature{Microtype}{protrusion=default;expansion=default;}
\begin{document}
\bcode{lo/gos}
\end{document}
In case the script does not work: would it be possible to convert all the strings within the \bcode macro with something like a regex? For example, the "o/" to ό and so on? What would be the weapon of choice here?

Do I have python installed?
Try python -V at a shell prompt. Your code is Python 2 code, so you will need a Python 2 version.
I need to install Python
The most straightforward way, if you don't need a complex environment (and you don't for this problem), is just to go to python.org. Don't forget that you need Python 2.
Running the program
Generally it will be as simple as:
python beta2unicode.py myfile.tex-file
And to capture the output:
python beta2unicode.py myfile.tex-file > myfile.not-tex-file
Does the script work?
Almost. You will need to replace the code at the end of the script (everything from t = beta2unicodeTrie() onward) with this:
import re
import sys
t = beta2unicodeTrie()
BCODE = re.compile(r'\\bcode{[^}]*}')
for line in open(sys.argv[1]):
    for match in BCODE.findall(line):
        bcode = match[7:-1]
        # the trailing "\n" lets the trie emit a final sigma (see the USAGE notes in the script)
        a, b = t.convert(bcode.upper() + "\n")
        if b:
            raise IOError("failed conversion '%s' in '%s'" % (b, line))
        converted = a.encode("utf-8")
        line = line.replace(match, converted)
    print(line.rstrip())
Results
\documentclass[12pt]{scrbook}
\usepackage[polutonikogreek, ngerman]{babel}
\usepackage[ngerman]{betababel}
\usepackage{fontspec}
%\defaultfontfeatures{Ligatures=TeX}
%\newfontfeature{Microtype}{protrusion=default;expansion=default;}
\begin{document}
λόγος
\end{document}
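On the regex part of the question: `re.sub` accepts a function as the replacement, so each `\bcode{...}` macro can be converted in a single pass. The sketch below is written for Python 3 and is kept self-contained, so it uses a deliberately tiny mapping table (`BETA`, a hypothetical stand-in that covers only the letters of the example) with longest-match lookup instead of the full trie. A real run would use the complete table from the script above.

```python
import re

# Tiny illustrative table; the full script above covers the whole alphabet.
BETA = {"L": "\u03bb", "O/": "\u1f79", "O": "\u03bf", "G": "\u03b3", "S": "\u03c3"}
FINAL_SIGMA = "\u03c2"
BCODE = re.compile(r"\\bcode\{([^}]*)\}")


def beta_to_greek(beta):
    """Greedy longest-match conversion of one betacode word."""
    beta, out, i = beta.upper(), [], 0
    while i < len(beta):
        for size in (2, 1):                      # try the longer key first
            key = beta[i:i + size]
            if key in BETA:
                out.append(BETA[key])
                i += len(key)
                break
        else:
            raise ValueError("cannot convert %r" % beta[i:])
    if out and out[-1] == BETA["S"]:             # word-final sigma
        out[-1] = FINAL_SIGMA
    return "".join(out)


def convert_line(line):
    """Replace every \\bcode{...} macro in a line with its Greek text."""
    return BCODE.sub(lambda m: beta_to_greek(m.group(1)), line)


result = convert_line(r"\bcode{lo/gos}")
```

The callback form keeps the surrounding LaTeX untouched and fails loudly on any betacode sequence the table does not know, which seems preferable to silently leaving half-converted text in the file.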

QtGui.QTextEdit set line color based on what text the line contains

This is my first time using Stack Overflow to find an answer to a problem.
I'm using a QtGui.QTextEdit to display text similar to the sample below, and I would like to change the color of some lines based on whether they contain certain text.
Lines that start with --[ should be blue, and lines that contain [ERROR] should be red.
I currently have something like the following:
from PyQt4 import QtCore, QtGui, uic
import sys
class Log(QtGui.QWidget):
    def __init__(self, path=None, parent=None):
        # initialise the actual base class (QWidget, not QMainWindow)
        QtGui.QWidget.__init__(self, parent)
        self.taskLog = QtGui.QTextEdit()
        self.taskLog.setLineWrapMode(False)
        vbox = QtGui.QVBoxLayout()
        vbox.addWidget(self.taskLog)
        self.setLayout(vbox)
        log = open("/net/test.log", 'r')
        self.taskLog.setText(log.read())
        log.close()
app = QtGui.QApplication(sys.argv)
wnd = Log()
wnd.show()
sys.exit(app.exec_())
The text looks something like this at the moment
--[ Begin
this is a test
[ERROR] this test failed.
--[ Command returned exit code 1
Hopefully you will be able to help me work this out a lot faster than I could on my own.
Thanks,
Mark

This can be done quite easily with QSyntaxHighlighter. Here's a simple demo:
from PyQt4 import QtCore, QtGui
sample = """
--[ Begin
this is a test
[ERROR] this test failed.
--[ Command returned exit code 1
"""
class Highlighter(QtGui.QSyntaxHighlighter):
    def __init__(self, parent):
        super(Highlighter, self).__init__(parent)
        self.sectionFormat = QtGui.QTextCharFormat()
        self.sectionFormat.setForeground(QtCore.Qt.blue)
        self.errorFormat = QtGui.QTextCharFormat()
        self.errorFormat.setForeground(QtCore.Qt.red)
    def highlightBlock(self, text):
        # uncomment this line for Python2
        # text = unicode(text)
        if text.startswith('--['):
            self.setFormat(0, len(text), self.sectionFormat)
        elif text.startswith('[ERROR]'):
            self.setFormat(0, len(text), self.errorFormat)
class Window(QtGui.QWidget):
    def __init__(self):
        super(Window, self).__init__()
        self.editor = QtGui.QTextEdit(self)
        self.highlighter = Highlighter(self.editor.document())
        self.editor.setText(sample)
        layout = QtGui.QVBoxLayout(self)
        layout.addWidget(self.editor)
if __name__ == '__main__':
    import sys
    app = QtGui.QApplication(sys.argv)
    window = Window()
    window.setGeometry(500, 150, 300, 300)
    window.show()
    sys.exit(app.exec_())

You can achieve this using HTML format
textEdit.setHtml(text);
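A minimal sketch of the HTML route, assuming the colouring rules from the question (lines starting with `--[` in blue, lines containing `[ERROR]` in red). `log_to_html` is a hypothetical helper; the string it returns would be passed to `setHtml`.

```python
import html


def log_to_html(text):
    """Wrap each log line in a coloured <span>; escape the text itself."""
    rows = []
    for line in text.splitlines():
        escaped = html.escape(line)   # keep <, > and & from being read as markup
        if line.startswith("--["):
            escaped = '<span style="color:blue">%s</span>' % escaped
        elif "[ERROR]" in line:
            escaped = '<span style="color:red">%s</span>' % escaped
        rows.append(escaped)
    return "<br>".join(rows)


markup = log_to_html("--[ Begin\nthis is a test\n[ERROR] this test failed.")
```

This only colours the text once, at load time; text typed or appended later is not recoloured.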
But even better is the QSyntaxHighlighter class:
Doc: http://doc.qt.io/qt-5/qsyntaxhighlighter.html
Python example: https://wiki.python.org/moin/PyQt/Python%20syntax%20highlighting
Here is an example with a code editor.
import sys
from PyQt4.QtCore import QRegExp
from PyQt4.QtGui import QColor, QTextCharFormat, QFont, QSyntaxHighlighter
def format(color, style=''):
"""Return a QTextCharFormat with the given attributes.
"""
_color = QColor()
_color.setNamedColor(color)
_format = QTextCharFormat()
_format.setForeground(_color)
if 'bold' in style:
_format.setFontWeight(QFont.Bold)
if 'italic' in style:
_format.setFontItalic(True)
return _format
# Syntax styles that can be shared by all languages
STYLES = {
'keyword': format('blue'),
'operator': format('red'),
'brace': format('darkGray'),
'defclass': format('black', 'bold'),
'string': format('magenta'),
'string2': format('darkMagenta'),
'comment': format('darkGreen', 'italic'),
'self': format('black', 'italic'),
'numbers': format('brown'),
}
class PythonHighlighter (QSyntaxHighlighter):
"""Syntax highlighter for the Python language.
"""
# Python keywords
keywords = [
'and', 'assert', 'break', 'class', 'continue', 'def',
'del', 'elif', 'else', 'except', 'exec', 'finally',
'for', 'from', 'global', 'if', 'import', 'in',
'is', 'lambda', 'not', 'or', 'pass', 'print',
'raise', 'return', 'try', 'while', 'yield',
'None', 'True', 'False',
]
# Python operators
operators = [
'=',
# Comparison
'==', '!=', '<', '<=', '>', '>=',
# Arithmetic
'\+', '-', '\*', '/', '//', '\%', '\*\*',
# In-place
'\+=', '-=', '\*=', '/=', '\%=',
# Bitwise
'\^', '\|', '\&', '\~', '>>', '<<',
]
# Python braces
braces = [
'\{', '\}', '\(', '\)', '\[', '\]',
]
def __init__(self, document):
QSyntaxHighlighter.__init__(self, document)
# Multi-line strings (expression, flag, style)
# FIXME: The triple-quotes in these two lines will mess up the
# syntax highlighting from this point onward
self.tri_single = (QRegExp("'''"), 1, STYLES['string2'])
self.tri_double = (QRegExp('"""'), 2, STYLES['string2'])
rules = []
# Keyword, operator, and brace rules
rules += [(r'\b%s\b' % w, 0, STYLES['keyword'])
for w in PythonHighlighter.keywords]
rules += [(r'%s' % o, 0, STYLES['operator'])
for o in PythonHighlighter.operators]
rules += [(r'%s' % b, 0, STYLES['brace'])
for b in PythonHighlighter.braces]
# All other rules
rules += [
# 'self'
(r'\bself\b', 0, STYLES['self']),
# Double-quoted string, possibly containing escape sequences
(r'"[^"\\]*(\\.[^"\\]*)*"', 0, STYLES['string']),
# Single-quoted string, possibly containing escape sequences
(r"'[^'\\]*(\\.[^'\\]*)*'", 0, STYLES['string']),
# 'def' followed by an identifier
(r'\bdef\b\s*(\w+)', 1, STYLES['defclass']),
# 'class' followed by an identifier
(r'\bclass\b\s*(\w+)', 1, STYLES['defclass']),
# From '#' until a newline
(r'#[^\n]*', 0, STYLES['comment']),
# Numeric literals
(r'\b[+-]?[0-9]+[lL]?\b', 0, STYLES['numbers']),
(r'\b[+-]?0[xX][0-9A-Fa-f]+[lL]?\b', 0, STYLES['numbers']),
(r'\b[+-]?[0-9]+(?:\.[0-9]+)?(?:[eE][+-]?[0-9]+)?\b', 0, STYLES['numbers']),
]
# Build a QRegExp for each pattern
self.rules = [(QRegExp(pat), index, fmt)
for (pat, index, fmt) in rules]
def highlightBlock(self, text):
"""Apply syntax highlighting to the given block of text.
"""
# Do other syntax formatting
for expression, nth, format in self.rules:
index = expression.indexIn(text, 0)
while index >= 0:
# We actually want the index of the nth match
index = expression.pos(nth)
length = expression.cap(nth).length()
self.setFormat(index, length, format)
index = expression.indexIn(text, index + length)
self.setCurrentBlockState(0)
# Do multi-line strings
in_multiline = self.match_multiline(text, *self.tri_single)
if not in_multiline:
in_multiline = self.match_multiline(text, *self.tri_double)
def match_multiline(self, text, delimiter, in_state, style):
"""Do highlighting of multi-line strings. ``delimiter`` should be a
``QRegExp`` for triple-single-quotes or triple-double-quotes, and
``in_state`` should be a unique integer to represent the corresponding
state changes when inside those strings. Returns True if we're still
inside a multi-line string when this function is finished.
"""
# If inside triple-single quotes, start at 0
if self.previousBlockState() == in_state:
start = 0
add = 0
# Otherwise, look for the delimiter on this line
else:
start = delimiter.indexIn(text)
# Move past this match
add = delimiter.matchedLength()
# As long as there's a delimiter match on this line...
while start >= 0:
# Look for the ending delimiter
end = delimiter.indexIn(text, start + add)
# Ending delimiter on this line?
if end >= add:
length = end - start + add + delimiter.matchedLength()
self.setCurrentBlockState(0)
# No; multi-line string
else:
self.setCurrentBlockState(in_state)
length = text.length() - start + add
# Apply formatting
self.setFormat(start, length, style)
# Look for the next match
start = delimiter.indexIn(text, start + length)
# Return True if still inside a multi-line string, False otherwise
if self.currentBlockState() == in_state:
return True
else:
return False

Two issues with the Python OpenOPC library

Issues description and environments
The OpenOPC library is friendly and easy to use, and its API is simple too, but I have found two issues while developing a tool that records real-time OPC item data.
The development environment is: Windows 8.1, Python 2.7.6, wxPython 2.8 unicode
The testing environment is: Windows XP SP3, Python 2.7.6, wxPython 2.8 unicode, Rockwell's SoftLogix as OPC server
The deployment environment is: Windows XP SP3, connected to a real Rockwell PLC, with RSLogix 5000 and RSLinx Classic Gateway installed
Questions
The opc.list function doesn't list all the items of a specified node, in both the testing and workstation environments. The question is how to list the item 't' from the OPC server.
An int array 'dint100' and a dint 't' were added with RSLogix 5000 in the scope of soft_1.
The default OPC client test tool from Rockwell does list the newly added 't'.
With the OpenOPC library, I couldn't find out how to list the item 't', but I could read its value by its tag with opc.read('[soft_1]t').
If 't' could be listed, it could be added to the IO tree of my tool.
The opc.servers function raises an OPCError in the deployment environment, but the client can connect to the 'RSLinx OPC Server' directly using the server name. Does the opc.servers function depend on some special DLL or service?
Any suggestions will be appreciated! Thanks in advance!

Consider that the browsing problems ("opc.list") may not be on your side. RSLinx is notorious for its broken OPC browsing. Try some test/simulation server from a different vendor, to test this hypothesis.

I realize that I'm really late to this game. I found what was causing this issue. OpenOPC.py assumes that there cannot be both a "Leaf" and a "Branch" on the same level. Replace the function ilist with this:
def ilist(self, paths='*', recursive=False, flat=False, include_type=False):
"""Iterable version of list()"""
try:
self._update_tx_time()
pythoncom.CoInitialize()
try:
browser = self._opc.CreateBrowser()
# For OPC servers that don't support browsing
except:
return
paths, single, valid = type_check(paths)
if not valid:
raise TypeError("list(): 'paths' parameter must be a string or a list of strings")
if len(paths) == 0: paths = ['*']
nodes = {}
for path in paths:
if flat:
browser.MoveToRoot()
browser.Filter = ''
browser.ShowLeafs(True)
pattern = re.compile('^%s$' % wild2regex(path) , re.IGNORECASE)
matches = filter(pattern.search, browser)
if include_type: matches = [(x, node_type) for x in matches]
for node in matches: yield node
continue
queue = []
queue.append(path)
while len(queue) > 0:
tag = queue.pop(0)
browser.MoveToRoot()
browser.Filter = ''
pattern = None
path_str = '/'
path_list = tag.replace('.','/').split('/')
path_list = [p for p in path_list if len(p) > 0]
found_filter = False
path_postfix = '/'
for i, p in enumerate(path_list):
if found_filter:
path_postfix += p + '/'
elif p.find('*') >= 0:
pattern = re.compile('^%s$' % wild2regex(p) , re.IGNORECASE)
found_filter = True
elif len(p) != 0:
pattern = re.compile('^.*$')
browser.ShowBranches()
# Branch node, so move down
if len(browser) > 0:
try:
browser.MoveDown(p)
path_str += p + '/'
except:
if i < len(path_list)-1: return
pattern = re.compile('^%s$' % wild2regex(p) , re.IGNORECASE)
# Leaf node, so append all remaining path parts together
# to form a single search expression
else:
###################################### JG Edit - Flip the next two rows comment/uncommented
p = '.'.join(path_list[i:])
# p = string.join(path_list[i:], '.')
pattern = re.compile('^%s$' % wild2regex(p) , re.IGNORECASE)
break
###################################### JG Edit - Comment this to return to original
browser.ShowBranches()
node_types = ['Branch','Leaf']
if len(browser) == 0:
lowest_level = True
node_types.pop(0)
else:
lowest_level = False
for node_type in node_types:
if node_type=='Leaf':
browser.ShowLeafs(False)
matches = filter(pattern.search, browser)
if not lowest_level and recursive:
queue += [path_str + x + path_postfix for x in matches]
else:
###################################### JG Edit - Flip the next two rows comment/uncommented
if lowest_level or node_type=='Leaf': matches = [exceptional(browser.GetItemID,x)(x) for x in matches]
# if lowest_level: matches = [exceptional(browser.GetItemID,x)(x) for x in matches]
if include_type: matches = [(x, node_type) for x in matches]
for node in matches:
if not node in nodes: yield node
nodes[node] = True
###################################### Uncomment this to return to original
# browser.ShowBranches()
# if len(browser) == 0:
# browser.ShowLeafs(False)
# lowest_level = True
# node_type = 'Leaf'
# else:
# lowest_level = False
# node_type = 'Branch'
# matches = filter(pattern.search, browser)
# if not lowest_level and recursive:
# queue += [path_str + x + path_postfix for x in matches]
# else:
# if lowest_level: matches = [exceptional(browser.GetItemID,x)(x) for x in matches]
# if include_type: matches = [(x, node_type) for x in matches]
# for node in matches:
# if not node in nodes: yield node
# nodes[node] = True
except pythoncom.com_error as err:
error_msg = 'list: %s' % self._get_error_str(err)
raise OPCError(error_msg)

ANSI graphic codes and Python

I was browsing the Django source code and I saw this function:
def colorize(text='', opts=(), **kwargs):
    """
    Returns your text, enclosed in ANSI graphics codes.
    Depends on the keyword arguments 'fg' and 'bg', and the contents of
    the opts tuple/list.
    Returns the RESET code if no parameters are given.
    Valid colors:
        'black', 'red', 'green', 'yellow', 'blue', 'magenta', 'cyan', 'white'
    Valid options:
        'bold'
        'underscore'
        'blink'
        'reverse'
        'conceal'
        'noreset' - string will not be auto-terminated with the RESET code
    Examples:
        colorize('hello', fg='red', bg='blue', opts=('blink',))
        colorize()
        colorize('goodbye', opts=('underscore',))
        print colorize('first line', fg='red', opts=('noreset',))
        print 'this should be red too'
        print colorize('and so should this')
        print 'this should not be red'
    """
    code_list = []
    if text == '' and len(opts) == 1 and opts[0] == 'reset':
        return '\x1b[%sm' % RESET
    for k, v in kwargs.iteritems():
        if k == 'fg':
            code_list.append(foreground[v])
        elif k == 'bg':
            code_list.append(background[v])
    for o in opts:
        if o in opt_dict:
            code_list.append(opt_dict[o])
    if 'noreset' not in opts:
        text = text + '\x1b[%sm' % RESET
    return ('\x1b[%sm' % ';'.join(code_list)) + text
I took it out of its context and placed it in another file just to try it, but it doesn't seem to colour the text I pass to it. Maybe I don't understand it correctly, but isn't it supposed to just return the text surrounded by ANSI graphics codes, which the terminal then converts to actual colours?
I tried all the given example calls, but it just returned the text I passed as the argument.
I'm using Ubuntu, so I think the terminal should support colours.

The problem is that many names are undefined: the function relies on several variables defined outside of it (foreground, background, opt_dict and RESET).
Instead, just import it:
import django.utils.termcolors as termcolors
red_hello = termcolors.colorize("Hello", fg='red') # '\x1b[31mHello\x1b[0m'
print red_hello
Or also copy the first few lines of django/utils/termcolors.py, specifically:
color_names = ('black', 'red', 'green', 'yellow', 'blue', 'magenta', 'cyan', 'white')
foreground = dict([(color_names[x], '3%s' % x) for x in range(8)])
background = dict([(color_names[x], '4%s' % x) for x in range(8)])
RESET = '0'
opt_dict = {'bold': '1', 'underscore': '4', 'blink': '5', 'reverse': '7', 'conceal': '8'}
def colorize( ... ):
    ...
print colorize("Hello", fg='red') # '\x1b[31mHello\x1b[0m'
Also note:
>>> from django.utils.termcolors import colorize
>>> red_hello = colorize("Hello", fg="red")
>>> red_hello # by not printing; it will not appear red; special characters are escaped
'\x1b[31mHello\x1b[0m'
>>> print red_hello # by print it will appear red; special characters are not escaped
Hello
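For readers on Python 3, here is a minimal self-contained sketch of the same idea. It is not Django's implementation: `simple_colorize` is a hypothetical, reduced helper that handles only foreground and background colours, using the same code tables as the lines copied above.

```python
COLOR_NAMES = ("black", "red", "green", "yellow", "blue", "magenta", "cyan", "white")
FOREGROUND = {name: "3%d" % i for i, name in enumerate(COLOR_NAMES)}
BACKGROUND = {name: "4%d" % i for i, name in enumerate(COLOR_NAMES)}
RESET = "0"


def simple_colorize(text, fg=None, bg=None):
    """Wrap text in ANSI SGR codes for the given foreground/background."""
    codes = []
    if fg is not None:
        codes.append(FOREGROUND[fg])
    if bg is not None:
        codes.append(BACKGROUND[bg])
    if not codes:
        return text               # nothing requested: return the text unchanged
    return "\x1b[%sm%s\x1b[%sm" % (";".join(codes), text, RESET)


red_hello = simple_colorize("Hello", fg="red")
print(repr(red_hello))  # repr shows the escape codes: '\x1b[31mHello\x1b[0m'
```

Printing `red_hello` itself (without `repr`) is what makes a colour-capable terminal render the text in red.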
