How to display non-English fonts in matplotlib and networkx? - python

This is a follow-up to this question. Since it addresses a more general issue, I am posting it as a new question.
I have a network whose node labels are in Farsi (Arabic alphabet). When I use networkx to display the network, it shows blank squares instead of Arabic letters. Below I copy a good example provided in the answers here.
from bidi.algorithm import get_display
import matplotlib.pyplot as plt
import arabic_reshaper
import networkx as nx
# Arabic text preprocessing
reshaped_text = arabic_reshaper.reshape(u'زبان فارسی')
artext = get_display(reshaped_text)
# constructing the sample graph
G=nx.Graph()
G.add_edge('a', artext ,weight=0.6)
pos=nx.spring_layout(G)
nx.draw_networkx_nodes(G,pos,node_size=700)
nx.draw_networkx_edges(G,pos,edgelist=G.edges(data=True),width=6)
# Drawing Arabic text
# Just make sure your version of the font 'Times New Roman' has Arabic in it.
# You can use any Arabic font here.
nx.draw_networkx_labels(G,pos,font_size=20, font_family='Times New Roman')
# showing the graph
plt.axis('off')
plt.show()
which generates the following image:
I tried to set the needed font with the following commands in the Python interpreter, but I get the same result.
>>> import matplotlib.pyplot
>>> matplotlib.rcParams.update({'font.family': 'TraditionalArabic'})
Here is the warning message, to be more specific:
/usr/local/anaconda3/lib/python3.5/site-packages/matplotlib/font_manager.py:1288: UserWarning: findfont: Font family ['TraditionalArabic'] not found. Falling back to Bitstream Vera Sans
(prop.get_family(), self.defaultFamily[fontext])
I am also investigating ways to install the needed fonts from the Ubuntu command line, if possible, so that I can put the step in my Dockerfile and have the fonts installed every time I spin up a run.
Best regards, s.
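The warning above means Matplotlib's font manager has no family named 'TraditionalArabic'. Here is a minimal sketch of how one might list the families Matplotlib knows and register a font file directly, assuming Matplotlib 3.2 or newer (for fontManager.addfont). The helper names known_families and register_font are hypothetical, and the bundled DejaVu Sans file is only a stand-in for an Arabic-capable font file you would supply yourself.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend, e.g. inside a Docker container
from matplotlib import font_manager as fm


def known_families():
    """Return the sorted set of font family names Matplotlib can find."""
    return sorted({entry.name for entry in fm.fontManager.ttflist})


def register_font(font_path):
    """Register one font file with Matplotlib and return its family name.

    Hypothetical helper: font_path is any .ttf/.otf file on disk.
    """
    fm.fontManager.addfont(font_path)
    return fm.FontProperties(fname=font_path).get_name()


# Stand-in path: the DejaVu Sans file that ships with Matplotlib.
# Replace it with the path of a font that contains Arabic glyphs.
example_path = fm.findfont("DejaVu Sans")
family = register_font(example_path)
print(family, family in known_families())
```

The returned family name is what would go into font_family in nx.draw_networkx_labels, in place of 'Times New Roman'.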

Related

stylecloud does not show underline words

As far as I know, and from what I have read, a word cloud is defined as follows:
Wordcloud is a popular technique that helps us identify the keywords in a text.
In a wordcloud, more frequent words have a larger and bolder font, while less frequent words have smaller or thinner fonts.
In Python, you can make simple wordclouds with the wordcloud library and nice-looking wordclouds with the stylecloud library.
I have the following code to plot those underlined words and keywords from the text:
import numpy as np
import matplotlib.pyplot as plt
import stylecloud
stylecloud.gen_stylecloud(file_path='SJ-Speech.txt',
icon_name= "fas fa-apple-alt")
plt.show()
The expected output should be this:
but the result is nothing:
C:\Users\User\PycharmProjects\AI_Project\venv\Scripts\python.exe C:/Users/User/PycharmProjects/AI_Project/Word_Clous_Example.py
Process finished with exit code 0
Did I miss something? Please help me.

matplotlib + latex + custom ttf font

I have to make a figure in python. I need it to use the font Palatino. I downloaded the font here. I placed it under *\matplotlib\mpl-data\fonts\ttf (which turned out to be useless since I had to provide full path to make it work).
Using the following lines allows me to use the font:
prop = fm.FontProperties(fname='C:/Users/MyPC/pyApp/venv/Lib/site-packages/matplotlib/mpl-data/fonts/ttf/Palatino-Roman.ttf')
mpl.rcParams['font.family'] = prop.get_name()
Yay.
Now when I want to use Latex in matplotlib,
rc('text',usetex=True)
the font is now not the one I want. I tried to follow the official page about that and instead use:
rc('font',**{'family':'serif','serif':['Palatino']})
rc('text', usetex=True)
but I cannot see any difference. I tried all possibilities and it looks like the same font.
What am I doing wrong? Perhaps it's the LaTeX side that's lacking the required font package...
You can load any LaTeX package when using rc('text',usetex=True).
You can add this to your code:
plt.rcParams['text.latex.preamble'] = [r'\usepackage{palatino, mathpazo}']
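In recent Matplotlib releases (3.3 and later, as far as I know) text.latex.preamble is expected to be a single string rather than a list. Below is a sketch of the same idea under that assumption. It only sets the rc parameters; actually drawing text still requires a working LaTeX installation that provides the mathpazo package, which should select Palatino for both text and math.

```python
import matplotlib

# Route all text through LaTeX and load a Palatino package in the preamble.
# The preamble is given as one string (assumption: Matplotlib >= 3.3).
matplotlib.rcParams.update({
    "text.usetex": True,
    "font.family": "serif",
    "text.latex.preamble": r"\usepackage{mathpazo}",
})
```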

How should I use Arial font while creating word-cloud while using Python3 in Google-Colab?

I have a data-set of twitter texts which is a mixture of English, Arabic, and Farsi. I wanted to create a word-cloud out of it. Unfortunately, my word-cloud shows empty squares for Arabic and Persian words in the photo. I happened to hear about three ways of tackling this problem:
Using different encodings: I tried "UTF-8","UTF-16","UTF-32" and "ISO-8859-1" which didn't fix the problem
Using arabic_reshaper: didn't work
Using a font which simultaneously supports the three languages like "Arial" font: while trying to change the font to Arial in word-cloud I receive the following error:
input
wordcloud = WordCloud(font_path = 'arial',stopwords = stopwords, background_color = "white", max_font_size = 50, max_words = 100).generate(reshaped_text)
plt.imshow(wordcloud, interpolation='bilinear')
plt.axis("off")
plt.show()
output
cannot open resource
This code works well in Anaconda but not in Google Colab. The only thing left to solve is which path I should enter for font_path in Google Colab.
With Persian text you have three problems to solve:
Persian characters don't show correctly. This is solved with either the right encoding or a suitable font, which I think you have already done.
Persian characters appear but are separated. In this case you should use arabic_reshaper's reshape function. Keep in mind that this doesn't solve your problem completely; you also need step 3.
Persian words are written left to right. You should solve this problem with the python-bidi library.
For example, I successfully created a word cloud with the following code:
import matplotlib.pyplot as plt
import arabic_reshaper
from bidi.algorithm import get_display
from wordcloud import WordCloud, STOPWORDS, ImageColorGenerator
txt = '''I would love to try or hear the sample audio your app can produce. I do not want to purchase, because I've purchased so many apps that say they do something and do not deliver.
Can you please add audio samples with text you've converted? I'd love to see the end results.
Thanks!
سلام حال سلام سلام سلام حال شما چطوره است نیست
'''
word_cloud = WordCloud(font_path='arial', stopwords=STOPWORDS, background_color="white", max_font_size=50, max_words=100)
word_cloud = word_cloud.generate_from_text(get_display(arabic_reshaper.reshape(txt)))
plt.imshow(word_cloud, interpolation='bilinear')
plt.axis("off")
plt.show()
I uploaded the font to my google-drive and used this code which worked:
wordcloud = WordCloud(font_path='/content/drive/My Drive/ARIAL.TTF',stopwords=stopwords, background_color="white", max_font_size=50, max_words=100).generate(get_display(arabic_reshaper.reshape(all_tweets)))
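The working Colab call above passes font_path a full path to a font file, which suggests the bare name 'arial' fails because no such file can be found. This is a sketch of one way to look up the file path of an installed family through Matplotlib's font manager. The helper name font_file_for is hypothetical, the family used here is only an example, and whether a font covering Arabic and Persian is installed in a given Colab runtime is something you would need to check.

```python
import os
from matplotlib import font_manager as fm


def font_file_for(family):
    """Return the file path of the best match for a font family.

    Raises ValueError instead of silently substituting the default font.
    """
    return fm.findfont(fm.FontProperties(family=family),
                       fallback_to_default=False)


# Example family only; DejaVu Sans ships with Matplotlib itself.
path = font_file_for("DejaVu Sans")
print(path, os.path.exists(path))
# The path could then be given to WordCloud(font_path=path, ...).
```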
You may want to test these specific word cloud libraries for Persian.
persian_wordcloud
wordcloud-fa

Latex font in matplotlib - Script-r

In matplotlib, one can easily use latex script to label axes, or write legends or any other text. But is there a way to use new fonts such as 'script-r' in matplotlib? In the following code, I am labelling the axes using latex fonts.
import numpy as np
import matplotlib.pyplot as plt
tmax=10
h=0.01
number_of_realizations=6
for n in range(number_of_realizations):
    xpos1=0
    xvel1=0
    xlist=[]
    tlist=[]
    t=0
    while t<tmax:
        xlist.append(xpos1)
        tlist.append(t)
        xvel1=np.random.normal(loc=0.0, scale=1.0, size=None)
        xpos2=xpos1+(h**0.5)*xvel1 # update position at time t
        xpos1=xpos2
        t=t+h
    plt.plot(tlist, xlist)
plt.xlabel(r'$ t$', fontsize=50)
plt.ylabel(r'$r$', fontsize=50)
plt.title('Brownian motion', fontsize=20)
plt.show()
It produces the following figure
But I want 'script-r' in place of normal 'r'.
In latex one has to add the following lines in preamble to render 'script-r'
\DeclareFontFamily{T1}{calligra}{}
\DeclareFontShape{T1}{calligra}{m}{n}{<->s*[2.2]callig15}{}
\DeclareRobustCommand{\sr}{%
\mspace{-2mu}%
\text{\usefont{T1}{calligra}{m}{n}r\/}%
\mspace{2mu}%
}
I don't understand how to do this in matplotlib. Any help is appreciated.
Matplotlib uses its own hand-rolled (pure Python) implementation of TeX to do all of the math text stuff, so you absolutely cannot assume that what works in standard LaTeX will work with Matplotlib. That being said, here's how you do it:
Install the calligra font so that Matplotlib can see it, then rebuild the font cache.
Lots of other threads deal with how to do this, so I'm not going to go into detail, but here are some references:
Use a font installed in a random spot on your filesystem.
How to install a new font into the Matplotlib managed font cache.
List all fonts currently known to your install of Matplotlib.
Replace one of Matplotlib's TeX font families with your font of choice.
Here's a function I wrote a while ago that reliably does that:
import matplotlib
def setMathtextFont(fontName='Helvetica', texFontFamilies=None):
    texFontFamilies = ['it','rm','tt','bf','cal','sf'] if texFontFamilies is None else texFontFamilies
    matplotlib.rcParams.update({'mathtext.fontset': 'custom'})
    for texFontFamily in texFontFamilies:
        matplotlib.rcParams.update({('mathtext.%s' % texFontFamily): fontName})
For you, a good way to use the function would be to replace the font used by \mathcal with calligra:
setMathtextFont('calligra', ['cal'])
Label your plots, for example, r'$\mathcal{foo}$', and the contents of the \math<whatever> macro should show up in the desired font.
Here's how you'd change your label-making code:
plt.ylabel(r'$\mathcal{r}$', fontsize=50)
and that should do it.

Matplotlib PDF export uses wrong font

I want to generate high-quality diagrams for a presentation. I’m using Python’s matplotlib to generate the graphics. Unfortunately, the PDF export seems to ignore my font settings.
I tried setting the font both by passing a FontProperties object to the text drawing functions and by setting the option globally. For the record, here is a MWE to reproduce the problem:
import scipy
import matplotlib
matplotlib.use('cairo')
import matplotlib.pylab as pylab
import matplotlib.font_manager as fm
data = scipy.arange(5)
for font in ['Helvetica', 'Gill Sans']:
    fig = pylab.figure()
    ax = fig.add_subplot(111)
    ax.bar(data, data)
    ax.set_xticks(data)
    ax.set_xticklabels(data, fontproperties = fm.FontProperties(family = font))
    pylab.savefig('foo-%s.pdf' % font)
In both cases, the produced output is identical and uses Helvetica (and yes, I do have both fonts installed).
Just to be sure, the following doesn’t help either:
matplotlib.rc('font', family = 'Gill Sans')
Finally, if I replace the backend, instead using the native viewer:
matplotlib.use('MacOSX')
I do get the correct font displayed – but only in the viewer GUI. The PDF output is once again wrong.
To be sure – I can set other fonts – but only other classes of font families: I can set serif fonts or fantasy or monospace. But all sans-serif fonts seem to default to Helvetica.
Basically, @Jouni’s is the right answer but since I still had some trouble getting it to work, here’s my final solution:
#!/usr/bin/env python2.6
import scipy
import matplotlib
matplotlib.use('cairo')
import matplotlib.pylab as pylab
import matplotlib.font_manager as fm
font = fm.FontProperties(
family = 'Gill Sans', fname = '/Library/Fonts/GillSans.ttc')
data = scipy.arange(5)
fig = pylab.figure()
ax = fig.add_subplot(111)
ax.bar(data, data)
ax.set_yticklabels(ax.get_yticks(), fontproperties = font)
ax.set_xticklabels(ax.get_xticks(), fontproperties = font)
pylab.savefig('foo.pdf')
Notice that the font has to be set explicitly using the fontproperties key. Apparently, there’s no rc setting for the fname property (at least I didn’t find it).
Giving a family key in the instantiation of font isn’t strictly necessary here, it will be ignored by the PDF backend.
This code works with the cairo backend only. Using MacOSX won’t work.
The "family" argument and the corresponding rc parameter are not meant to specify the name of the font, although they can actually be used this way. There's an (arguably baroque) CSS-like font selection system that helps the same script work on different computers, selecting the closest font available. The usually recommended way to use e.g. Gill Sans is to add it to the front of the value of the rc parameter font.sans-serif (see sample rc file), and then set font.family to sans-serif.
This can be annoying if the font manager decides for some obscure reason that Gill Sans is not the closest match to your specification. A way to bypass the font selection logic is to use FontProperties(fname='/path/to/font.ttf') (docstring).
In your case, I suspect that the MacOSX backend uses fonts via the operating system's mechanisms and so automatically supports all kinds of fonts, but the pdf backend has its own font support code that doesn't support your version of Gill Sans.
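The recommended route described above (put the font at the front of font.sans-serif, then select the sans-serif family) can be sketched as follows. This is only an illustration of the rc parameters named in the answer; whether the font manager then actually resolves to Gill Sans still depends on the font being installed and supported by the backend.

```python
import matplotlib

preferred = "Gill Sans"

# Move the preferred font to the front of the sans-serif candidate list.
candidates = [name for name in matplotlib.rcParams["font.sans-serif"]
              if name != preferred]
matplotlib.rcParams["font.sans-serif"] = [preferred] + candidates

# Ask for the generic family; Matplotlib walks the list above in order.
matplotlib.rcParams["font.family"] = "sans-serif"
```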
This is an addition to the answers above if you came here for a non-cairo backend.
The PDF backend of matplotlib does not yet support TrueType font collections (saved as .ttc files). See this issue.
The currently suggested workaround is to extract the font-of-interest from a .ttc file and save it as a .ttf file. And then use that font in the way described by Konrad Rudolph.
You can use the python-package fonttools to achieve this:
from fontTools.ttLib import TTFont
font = TTFont("/System/Library/Fonts/Helvetica.ttc", fontNumber=0)
font.save("Helvetica-regular.ttf")
As far as I can see, it is not possible to make this setting "global" by passing the path to this new .ttf file to the rc. If you are really desperate, you could try to extract all fonts from a .ttc into separate .ttf files, uninstall the .ttc and install the ttfs separately. To have the extracted font side-by-side with the original font from the .ttc, you need to change the font name with tools like FontForge. I haven't tested this, though.
Check if you are rendering the text with LaTeX, i.e., if text.usetex is set to True. Because LaTeX rendering only supports a few fonts, it largely ignores/overwrites your other fonts settings. This might be the cause.
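A quick way to run that check, as a small sketch: inspect the text.usetex rc parameter and switch it off for the current session before setting fonts.

```python
import matplotlib

# Show whether LaTeX rendering is currently enabled.
print("text.usetex =", matplotlib.rcParams["text.usetex"])

# Disable it so the regular font settings take effect again.
matplotlib.rcParams["text.usetex"] = False
```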
