I have a problem displaying non-ASCII characters in Matplotlib: these characters are rendered as small boxes instead of proper glyphs. It looks like this (I filled the boxes with red paint to highlight them):
How do I fix it?
A related question is Accented characters in Matplotlib.
This problem may actually have a couple of different causes:
The default font does not include these glyphs
You may change the default font using the following (before any plotting is done!)
matplotlib.rc('font', family='Arial')
In some versions of matplotlib you'll have to set the family:
matplotlib.rc('font', **{'sans-serif' : 'Arial',
'family' : 'sans-serif'})
(Note that because sans-serif contains a hyphen, it cannot be written as a keyword argument, so the **{} syntax is actually necessary.)
The first command changes the sans-serif font family to contain only one font (in my case it was Arial), the second sets the default font family to sans-serif.
Other options are included in the documentation.
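Before changing the default family, it can help to check that Matplotlib can actually locate the font you name. The following is a minimal sketch, not part of the original answer: the helper name first_available_font and the candidate list are hypothetical, and the Agg backend is chosen only so the snippet runs without a display. Note that finding a font does not guarantee it covers every glyph you need.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so the sketch runs anywhere
from matplotlib import font_manager


def first_available_font(candidates):
    """Return the first family name Matplotlib can locate, or None."""
    for name in candidates:
        try:
            # With fallback disabled, findfont raises ValueError on a miss
            # instead of silently returning the default font.
            font_manager.findfont(name, fallback_to_default=False)
        except ValueError:
            continue
        return name
    return None


# Illustrative candidates; DejaVu Sans ships with Matplotlib itself.
family = first_available_font(["Arial", "DejaVu Sans"])
if family is not None:
    matplotlib.rc("font", family=family)
```

As in the answer above, this has to run before any plotting is done.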
You have improperly created/passed string objects into Matplotlib
Even if the font contains the proper glyphs, Matplotlib will show the same behaviour if you forget the u prefix that creates Unicode constants (this matters in Python 2, where plain string literals are byte strings):
plt.xlabel("Średnia odległość między stacjami wsparcia a modelowaną [km]")
So you need to add u:
plt.xlabel(u"Średnia odległość między stacjami wsparcia a modelowaną [km]")
Another cause is that you forgot to put a UTF-8 magic comment on top of the file (I read that this might be the source of the problem):
# -*- coding: utf-8 -*-
Related
I have been trying to extract text from PDF files and most of the files seem to work fine. However, one particular document has text in this unusual font: ｉｎ ｓｏｌｉｄ
I have tried extraction using PHP and then Python, and neither was able to fix this font. I tried copying the text to see if I could fix it in text editing tools, but couldn't do much. Please note that the original PDF document looks fine, but when the text is copied and pasted into a text editing tool, gaps appear between the characters. I am completely clueless about what to do. Please suggest a solution to fix this in PHP/Python (preferably PHP).
Pre-Unicode, some character encodings allowed you to compose Japanese/Korean/Chinese characters either as two half-width characters or as one full-width character. In that case, Latin characters could be full width so that they mixed evenly with the other characters. You have Full Width Latin characters on your hands, and that's why they space out oddly.
You can normalize the string with NFKC compatibility normalization (NFKD would also work here) to get regular Latin characters. This will also change any half/full width Japanese/Korean/Chinese characters; I'm not sure of the details, but I think they are folded to their standard-width forms, possibly built from multiple code points.
>>> import unicodedata
>>> t="ｉｎ ｓｏｌｉｄ"
>>> unicodedata.normalize("NFKC", t)
'in solid'
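The same normalization can be wrapped in a small helper. This is a sketch: the function name to_halfwidth is made up for illustration, and the sample string is written with escape sequences so that the full-width characters survive copy and paste. Be aware that NFKC folds other compatibility characters too (ligatures, superscripts and so on), which may or may not be what you want for extracted PDF text.

```python
import unicodedata


def to_halfwidth(text):
    """Fold full-width (and other compatibility) characters to plain forms."""
    return unicodedata.normalize("NFKC", text)


# Full-width "in solid" separated by an ideographic space (U+3000).
sample = "\uff49\uff4e\u3000\uff53\uff4f\uff4c\uff49\uff44"
print(to_halfwidth(sample))  # in solid
```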
I want to display a sentence with words in different colors within the same frame. But all the code I've seen just changes the color of the stimulus as a whole, not a part of it.
Here's the code I tried, but it failed:
#!/usr/bin/env python2
# -*- coding: utf-8 -*-
from psychopy import visual,core
win = visual.Window([400,400])
sent=[u'先生',u'を呼んだ',u'学生が',u'教室に',u'入った。']
sent[0].color=[1,1,1]
sent[1].color=[1.0,-1,-1]
sentence=visual.TextStim(win,text=sent[0]+sent[1])
sentence.setAutoDraw(True)
win.flip()
I am wondering whether there's a way for me to change the text color before it becomes a visual.TextStim?
No, the TextStim applies formatting (colour, italic, etc.) to its entire contents. If you want words with different colours, they unfortunately each need to be in their own TextStim.
An alternative is to use the TextBox class, which I think allows per-character formatting, but only for monospace fonts:
http://www.psychopy.org/api/visual/textbox.html#psychopy.visual.TextBox
Having said that, I've found it doesn't work reliably at present, at least on Mac OS.
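A rough sketch of the one-TextStim-per-word approach follows. The helper plan_colored_words is hypothetical, and so are the char_width and start_x values: it simply assumes every character is about the same width (roughly true for full-width Japanese text) and computes a centre position for each word. You would need to tune those numbers for your window units and font.

```python
# -*- coding: utf-8 -*-


def plan_colored_words(words, colors, char_width=0.1, start_x=-0.3):
    """Return one dict of TextStim keyword arguments per word, left to right.

    char_width and start_x are guesses in window units, not PsychoPy defaults.
    """
    specs = []
    x = start_x
    for word, color in zip(words, colors):
        width = len(word) * char_width
        # pos is the centre of the word, so shift by half its width.
        specs.append({"text": word, "color": color, "pos": (x + width / 2.0, 0.0)})
        x += width
    return specs


specs = plan_colored_words([u"先生", u"を呼んだ"], [[1, 1, 1], [1.0, -1, -1]])

# Usage with PsychoPy (left commented out because it opens a window):
# from psychopy import visual
# win = visual.Window([400, 400])
# for spec in specs:
#     visual.TextStim(win, **spec).draw()
# win.flip()
```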
I have seen this issue pop up here and there but have yet to find a suitable answer.
When making a plot in matplotlib, the only way to insert symbols and math functions (like fractions, exponents, etc...) is to use TeX formatting. However, by default TeX formatting uses a different font AND italicizes the text. So for example, if I wanted an axis label to say the following:
photons/cm^2/s/Angstrom
I have to do the following:
ax1.set_ylabel(r'Photons/$cm^2$/s/$\AA$')
This produces a very ugly label that uses 2 different fonts and has bits and pieces italicized.
How do I permanently change the font of TeX (Not the other way around) so that it matches the default font used by matplotlib?
I have seen other solutions that tell the user to manually make all text the same in a plot by using \mathrm{} for example but this is ridiculously tedious. I have also seen solutions which change the default font of matplotlib to match TeX which seem utterly backwards to me.
It turns out the solution was rather simple and a colleague of mine had the solution.
If I were to use this line of code to create a title:
fig.suptitle(r'$H_2$ Emission from GJ832')
The result would be "H2 Emission from GJ832" with the "H" in italics, which is an illustration of the problem I was having. It turns out that anything between the dollar signs is typeset as math, and is therefore italicized.
If we change that line of code to the following:
fig.suptitle(r'H$_2$ Emission from GJ832')
Then the result is "H2 Emission from GJ832" without the italics. So this is an example of where we can constrain the math type to include only the math parts of the text, namely creating the subscript of 2.
However, if I were to change the code to the following:
fig.suptitle(r'H$_{two}$ Emission from GJ832')
the result would be "Htwo Emission from GJ832" which introduces the italics again. In this case, and for any case where you must have text (or are creating unit symbols) inside the dollar signs, you can easily remove the italics the following way:
fig.suptitle(r'H$_{\rm two}$ Emission from GJ832')
or in the case of creating a symbol:
ax2.set_xlabel(r'Wavelength ($\rm \AA$)')
The former results in "Htwo Emission from GJ832"
and the latter in "Wavelength (A)"
where A is the Angstrom symbol.
Both of these produce the desired result with nothing italicized by calling \rm before the text or symbol in the dollar signs. The result is nothing italicized INCLUDING the Angstrom symbol created by \AA.
While this doesn't change the default of the TeX formatting, it IS a simple solution to the problem and doesn't require any new packages. Thank you Roland Smith for the suggestions anyway. I hope this helps others who have been struggling with the same issue.
For typesetting units, use the siunitx package (with mode=text) rather than math mode.
Update: The above is only valid when you have defined text.usetex : True in your rc settings.
From the matplotlib docs:
Note that you do not need to have TeX installed, since matplotlib ships its own TeX expression parser, layout engine and fonts.
And:
Regular text and mathtext can be interleaved within the same string. Mathtext can use the Computer Modern fonts (from (La)TeX), STIX fonts (which are designed to blend well with Times) or a Unicode font that you provide. The mathtext font can be selected with the customization variable mathtext.fontset
Reading this, it sounds as if setting mathtext.fontset to match the regular font that matplotlib uses would solve the problem, provided you don't use TeX.
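One way to try this, offered as a sketch rather than a confirmed fix: besides mathtext.fontset, Matplotlib has an rc setting called mathtext.default, and the value 'regular' asks mathtext to use the same font as the surrounding non-math text. This only affects the built-in mathtext renderer, that is, when text.usetex is False. The Agg backend below is an assumption made so the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend for a self-contained run
import matplotlib.pyplot as plt

# Ask mathtext to follow the regular text font instead of math italics.
plt.rcParams["mathtext.default"] = "regular"

fig, ax = plt.subplots()
ax.set_ylabel(r"Photons/$cm^2$/s/$\AA$")
fig.suptitle(r"$H_2$ Emission from GJ832")
```

Putting the same setting in your matplotlibrc file would make it apply to every plot, which is closer to the "permanent" change the question asks for.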
I searched for ways to create aligned strings in Python and found some relevant stuff, but it didn't work for me. Here's one example:
for line in [[1, 128, 1298039], [123388, 0, 2]]:
print('{:>8} {:>8} {:>8}'.format(*line))
Output:
       1      128  1298039
  123388        0        2
This is what I see in the shell:
As you can see, the alignment didn't happen. The same problem arises when using \t.
What can I do to align the strings in a neat, tabular format?
You have configured your IDLE shell to use a proportional font, one that uses different widths for different characters. Notice how the () pair takes almost the same amount of horizontal space as the > character above it.
Your code is otherwise entirely correct; with a fixed-width font the numbers will line up correctly.
Switch to a fixed-width font instead. Courier is a good default choice, but Windows has various other fixed-width fonts installed, including Consolas.
Configure the font in the Options -> Configure IDLE menu. Pick a different font from the Font Face list. The sample characters in the panel below should line up (except for the k at the end of the second line, which should stick out).
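Once the shell uses a fixed-width font, you can also let the code work out the column widths instead of hard-coding 8. This is a small sketch; the helper name format_table and the gap parameter are made up for illustration.

```python
def format_table(rows, gap=1):
    """Right-align each column to the width of its widest entry."""
    widths = [max(len(str(value)) for value in column) for column in zip(*rows)]
    lines = []
    for row in rows:
        cells = ["{:>{w}}".format(value, w=w) for value, w in zip(row, widths)]
        lines.append((" " * gap).join(cells))
    return lines


rows = [[1, 128, 1298039], [123388, 0, 2]]
for line in format_table(rows):
    print(line)
```

The alignment still depends on the display font: in a proportional font these lines will look just as ragged as before.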
I'm trying to put a filename, that includes underscores, as the title of a plot. This gets rendered as defining a subscript character, since by default I have LaTeX interpretation on. I'd like to prevent matplotlib from applying LaTeX to this string, while leaving my default text.usetex as True in my matplotlib configuration file.
How do I do this?
In my version, 1.3.1 (Ubuntu 14), I do not have an option to pass in a usetex keyword argument, as indicated in the documentation.
Probably something like this (untested):
title = 'I_hate_subscripts'
title = title.replace('_', r'\_')
plt.title(title)
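A slightly more defensive variant of the same idea, still a sketch: the helper name escape_underscores is hypothetical, and the regular expression skips underscores that already have a backslash in front of them, so calling it twice does no harm.

```python
import re


def escape_underscores(text):
    """Escape bare underscores so LaTeX does not read them as subscripts."""
    # (?<!\\) leaves underscores that are already escaped untouched.
    return re.sub(r"(?<!\\)_", r"\\_", text)


title = escape_underscores("I_hate_subscripts")
print(title)  # I\_hate\_subscripts
```

The result can then be passed to plt.title(title) as in the snippet above.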