font_scale vs font.size in seaborn rc - python

I have just started using Seaborn and am having trouble understanding one call in particular:
sns.set_context('paper', rc={"font.size": 1000, 'axes.labelsize': 5})
What does "font.size" mean? I have tried values from 0 up to 1000, but saw no effect on my plots.

The rc argument of seaborn.set_context is passed into plotting_context, which contains:
def plotting_context(context=None, font_scale=1, rc=None):
    # ...
    # Now independently scale the fonts
    font_keys = ["axes.labelsize", "axes.titlesize", "legend.fontsize",
                 "xtick.labelsize", "ytick.labelsize", "font.size"]
    font_dict = {k: context_dict[k] * font_scale for k in font_keys}
    context_dict.update(font_dict)
    # ...
This piece of code sets the text sizes for the axes labels and titles, the legend, and the x and y tick labels. Since those sizes are set explicitly, they ignore the font.size parameter, which only sets the default and is used only when a size has not been explicitly set, as noted in the comments of matplotlib's matplotlibrc file:
## note that font.size controls default text sizes. To configure
## special text sizes tick labels, axes, labels, title, etc, see the rc
## settings for axes and ticks [...]
So in order to see the effect of 'font.size': x, you need to create some text that is not among those whose sizes are explicitly set by plotting_context, e.g. a matplotlib.text.Text instance added with matplotlib.axes.Axes.text.
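A minimal sketch of the point above. It uses plain matplotlib rather than seaborn (an assumption made to keep it self-contained), and the Agg backend and the specific sizes are illustrative choices. With both font.size and axes.labelsize set, the axis label follows axes.labelsize, while free text added to the axes falls back to font.size:

```python
import matplotlib
matplotlib.use("Agg")  # headless backend, so no display is needed
import matplotlib.pyplot as plt

# Same rc keys as in the question, with a less extreme font.size.
rc = {"font.size": 30, "axes.labelsize": 5}

with plt.rc_context(rc):
    fig, ax = plt.subplots()
    ax.set_xlabel("x label")          # sized by axes.labelsize
    note = ax.text(0.5, 0.5, "note")  # no explicit size: falls back to font.size

label_size = ax.xaxis.label.get_fontsize()
note_size = note.get_fontsize()
print(label_size, note_size)
plt.close(fig)
```

Passing the same dictionary through sns.set_context(..., rc=...) should behave the same way, since seaborn ultimately writes these keys into matplotlib's rcParams.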

Related

Adjust label size/wrap and -position automatically for pcolormesh/pcolor in pyplot

When creating 2d-images with pyplot by using pcolor/pcolormesh I am using custom labels in my project. Unfortunately, some of those labels are too long to fit into their allocated space, and therefore spill over the next label, as shown below for a 2x2-matrix:
and a 4x4-matrix:
I tried to find solutions for that issue, and until now have found two possible approaches: Replacing all spaces with newlines (as proposed by https://stackoverflow.com/a/62521738/2546099), or using textwrap.wrap(), which needs a fixed text width (as proposed by https://stackoverflow.com/a/15740730/2546099 or https://medium.com/dunder-data/automatically-wrap-graph-labels-in-matplotlib-and-seaborn-a48740bc9ce)
I tested both approaches for both test matrices shown above, resulting in (when replacing space with newlines) for the 2x2-matrix:
and for the 4x4-matrix:
Similarly, I tested the second approach for both matrices, the 2x2-matrix, resulting in
and the 4x4-matrix, resulting in
Still, both approaches share the same issues:
When wrapping the label, a part of the label will extend into the image. Setting a fixed pad size will only work if the label size is known beforehand, to adjust the distance properly. Else, some labels might be spaced too far apart from the image, while others are still inside the image.
Wrapping the label with a fixed size might leave too much white space around. The best case might be figure 5 and 6: I used a wrap at 10 characters for both figures, but while the text placement works out nicely (from my point of view) for four columns, there is enough space to increase the wrap limit to 20 characters for two columns. For even more columns 10 characters might even be too much.
Therefore, are there other solutions which might solve my problem dynamically, depending on the size of the figure and the available space?
The code I used for generating the pictures above is:
import textwrap
import numpy as np
import matplotlib.pyplot as plt
imshow_size = 4
x_vec, y_vec = (
    np.linspace(0, imshow_size - 1, imshow_size),
    np.linspace(0, imshow_size - 1, imshow_size),
)
labels_to_plot = np.linspace(0, imshow_size, imshow_size + 1)
categories = ["This is a very long long label serving its job as place holder"] * (
    imshow_size
)
## Idea 1: Replace space with \n, similar to https://stackoverflow.com/a/62521738/2546099
# categories = [category.replace(" ", "\n") for category in categories]
## Idea 2: Wrap labels at certain text length, similar to https://stackoverflow.com/a/15740730/2546099 or
## https://medium.com/dunder-data/automatically-wrap-graph-labels-in-matplotlib-and-seaborn-a48740bc9ce
categories = [textwrap.fill(category, 10) for category in categories]
X, Y = np.meshgrid(x_vec, y_vec)
Z = np.random.rand(imshow_size, imshow_size)
fig, ax = plt.subplots()
ax.pcolormesh(X, Y, Z)
ax.set_xticks(labels_to_plot[:-1])
ax.set_xticklabels(categories, va="center")
ax.set_yticks(labels_to_plot[:-1])
ax.set_yticklabels(categories, rotation=90, va="center")
fig.tight_layout()
plt.savefig("with_wrap_at_10_with_four_labels.png", bbox_inches="tight")
# plt.show()

Formatting Axes of Binned Temporal (Continuous) Bar Graphs

I'm having some issues formatting the x-axis of binned temporal bar graphs.
Here is the data:
import pandas as pd
import numpy as np
import altair as alt
c1 = pd.date_range(start="2021-01-01",end="2021-02-15")
c2 = np.random.randint(1,6, size=len(c1))
df = pd.DataFrame({"day": c1, "value": c2})
df = df.drop(np.random.choice(df.index, size=12, replace=False))
Here is one approach, let's call this A. This uses a timeUnit to bin the data, but behind the scenes it uses x/x2 encoding.
c1 = alt.Chart(df).mark_bar().encode\
( alt.X("monthdate(day):T")
, alt.Y("value")
)
t1 = alt.Chart(df).mark_text(baseline="bottom").encode\
( alt.X("monthdate(day):T")
, alt.Y("value")
, alt.Text("value")
)
(c1 + t1).interactive(bind_y=False).properties(width=800)
Here is another approach, let's call this B. This uses x/x2 encoding directly.
c2 = alt.Chart(df).transform_calculate\
( day = "toDate(datum.day)"
, start = "datum.day - 12*60*60*1000"
, end = "datum.day + 12*60*60*1000"
).mark_bar().encode\
( alt.X("start:T")
, alt.X2("end:T")
, alt.Y("value")
)
t2 = alt.Chart(df).mark_text(baseline="bottom").encode\
( alt.X("day:T")
, alt.Y("value")
, alt.Text("value")
)
(c2 + t2).interactive(bind_y=False).properties(width=800)
Each has its share of formatting issues, with much overlap.
A (timeUnit):
1. Align the text marks, axis labels, and axis ticks to the center of the bar marks. Is it possible to offset either the axis or the data by a data-dependent or zoom-dependent amount? For example axis offset is in pixels and therefore unsuitable. Vega offers xOffset encoding, but that might not be available to Altair. It's a pity that band or bandPosition fields don't work with continuous domains.
2. There should be just one tick, one grid, and one label per bar. I can solve this in vega-lite using "axis": {"tickCount": "day"} but this generates a schema error in Altair 4.1.0.
3. Unique axis label formatting on year and month transitions, including linebreaks within axis labels. I believe a registered custom format type will allow me to use a custom JavaScript function, but I could use an example.
4. Change the axis labels based on how many bars are visible at the current zoom level. For example, to add the weekday at high zoom levels. According to this it is possible. I'm not sure how, but it probably involves an alt.Condition on the axis format field. This format field is then passed to the custom format function (above, A.3).
B (x/x2):
1. Half-sized grid every now and then. See screenshots. This is pretty strange. edit: Apparently also an issue with A, judging by the screenshots below. Never noticed it till now.
2. Missing or inconsistent dividers between bars. A possible solution is to add alt.X(..., bin="binned"), but this removes the grid verticals and I haven't looked into adding them back. I don't want to reduce the x1-x2 width because that leads to an undesirable zoom-dependent gap.
3. Inconsistent axis labels. I've seen four separate label formats at once on the x-axis, and it's even more confusing than it appears: %a %d, %Y, %b %d, %B. It's weird that this isn't an issue with code version A. Hopefully this is fixed with a custom format function as in A.3. I see that vega's scale/encode/labels/update/text ties a timeFormat to the signal handler. Is that the same thing as a custom format function?
4. Same as A.2.
5. Same as A.3.
6. Same as A.4.
Here are the images documenting all these problems:
A (timeUnit):
B (x/x2):

How do the x and y parameters in the Label object work for bokeh?

I've read the documentation for the Label class in Bokeh but the x and y parameters are quite confusing. Their behavior seems to change if you pass something to the x_units and y_units parameters but I don't understand what the units are supposed to be by default.
More specifically, I have a list of strings that I'm using for my x-axis:
xlab = [
'COREPCE2',
'COREPCE3',
'COREPCE4',
'COREPCE5',
'COREPCE6',
'',
'T5YIE'
]
p = figure(..., y_range = (0,.04), x_range = xlab)
If I wanted to draw basically anything else on the plot, I could just use those strings. For example I drew some lines like this:
p.line(['COREPCE2', 'T5YIE'], [.02,.02], color = 'black', line_dash = 'dashed')
p.line(['', ''], [0,.04], color = 'black')
And that works fine, this is the full chart.
Here's the issue though. I want to put a text label on the "COREPCE4" location of the x axis. If I try just passing the string for the x parameter in the Label class it just doesn't work:
section = Label(x = 'COREPCE4', y = .03, text = 'Survey of Professional Forecasters: August 9, 2019')
p.add_layout(section)
It throws an error: ValueError: expected a value of type Real, got COREPCE4 of type str. I don't really know what units it's expecting. Is there a way to make Bokeh recognize that I want to use the x-axis label as my x parameter in the same way I've done with the other glyphs?
The properties x_units and y_units refer to screen (pixel) vs. data-space (axis) units. As of Bokeh 1.3.4 the x and y properties of Label can only be set from floating point numbers, so they cannot be used directly with categorical coordinates. For now you should use LabelSet, even if you are only showing a single label, since it can work with categorical coordinates.
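A minimal sketch of the LabelSet approach suggested above, reusing the x-axis factors, y position, and label text from the question. The column names 'x', 'y', and 'text' in the data source are arbitrary choices made for this illustration:

```python
from bokeh.models import ColumnDataSource, LabelSet
from bokeh.plotting import figure

xlab = ['COREPCE2', 'COREPCE3', 'COREPCE4', 'COREPCE5', 'COREPCE6', '', 'T5YIE']
p = figure(y_range=(0, .04), x_range=xlab)

# LabelSet reads its coordinates from columns of a data source, so a
# categorical factor such as 'COREPCE4' can be used for x.
source = ColumnDataSource(data=dict(
    x=['COREPCE4'],
    y=[.03],
    text=['Survey of Professional Forecasters: August 9, 2019'],
))
labels = LabelSet(x='x', y='y', text='text', source=source)
p.add_layout(labels)
```

Even with a single label, every column in the source is a one-element list, because LabelSet is designed to draw one label per row.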

Get colors from matplotlib style [duplicate]

What I would like to achieve:
I want to create several pie charts on one figure. They all share some categories but sometimes have different ones. Obviously I want all of the same categories to have the same colors.
That is why I created a dictionary that links the categories (= labels) to colors. With it I can specify the colors of each pie chart. But I would like to use the ggplot colors (which come with matplotlib.style.use('ggplot')). How can I get those colors to feed them into my dictionary?
# set colors for labels
color_dict = {}
for i in range(0, len(data_categories)):
    color_dict[data_categories[i]] = ???
# apply colors
ind_label = 0
for pie_wedge in pie[0]:
    leg = ax[ind].get_legend()
    pie_wedge.set_facecolor(color_dict[labels_0[ind_label]])
    leg.legendHandles[ind_label].set_color(color_dict[labels_0[ind_label]])
    ind_label += 1
Short answer
To access the colors used in the ggplot style, you can do as follows
In [37]: import matplotlib.pyplot as plt
In [38]: plt.style.use('ggplot')
In [39]: colors = plt.rcParams['axes.prop_cycle'].by_key()['color']
In [40]: print('\n'.join(color for color in colors))
#E24A33
#348ABD
#988ED5
#777777
#FBC15E
#8EBA42
#FFB5B8
In the above example the colors, as hex RGB strings, are contained in the list colors.
Remember to call plt.style.use(...) before accessing the color list, otherwise you'll find the standard colors.
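To connect this back to the question, here is one possible way to fill the category-to-color dictionary from that list. The category names are hypothetical placeholders for data_categories, and wrapping the palette with itertools.cycle is an assumption about how to handle more categories than colors:

```python
import matplotlib
matplotlib.use("Agg")  # headless backend, so no display is needed
import matplotlib.pyplot as plt
from itertools import cycle

plt.style.use('ggplot')
colors = plt.rcParams['axes.prop_cycle'].by_key()['color']

# Hypothetical category names standing in for data_categories.
data_categories = ['rent', 'food', 'travel', 'other']

# cycle() repeats the palette if there are more categories than colors.
color_dict = dict(zip(data_categories, cycle(colors)))
print(color_dict)
```

Because the dictionary is built once from the full list of categories, each category keeps the same color in every pie chart, regardless of which subset a given chart shows.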
More detailed explanation
The answer above is tailored for modern releases of Matplotlib, where the plot colors and possibly other plot properties, like line widths and dashes (see this answer of mine) are stored in the rcParams dictionary with the key 'axes.prop_cycle' and are contained in a new kind of object, a cycler (another explanation of a cycler is contained in my answer referenced above).
To get the list of colors, we have to get the cycler from rcParams and then use its .by_key() method
Signature: c.by_key()
Docstring: Values by key
This returns the transposed values of the cycler. Iterating
over a `Cycler` yields dicts with a single value for each key,
this method returns a `dict` of `list` which are the values
for the given key.
The returned value can be used to create an equivalent `Cycler`
using only `+`.
Returns
-------
transpose : dict
dict of lists of the values for each key.
to have a dictionary of values that, at last, we index using the key 'color'.
Addendum
Updated, 2023-01-01.
It is not strictly necessary to call plt.style.use('a_style') to access a style's colors; the colors are (possibly) defined in a matplotlib.RcParams object that is stored in the dictionary matplotlib.style.library.
E.g., let's print all the color sequences defined in the different styles
In [23]: for style in sorted(plt.style.library):
    ...:     the_rc = plt.style.library[style]
    ...:     if 'axes.prop_cycle' in the_rc:
    ...:         colors = the_rc['axes.prop_cycle'].by_key()['color']
    ...:         print('%25s: %s'%(style, ', '.join(color for color in colors)))
    ...:     else:
    ...:         print('%25s: this style does not modify colors'%style)
Solarize_Light2: #268BD2, #2AA198, #859900, #B58900, #CB4B16, #DC322F, #D33682, #6C71C4
_classic_test_patch: this style does not modify colors
_mpl-gallery: this style does not modify colors
_mpl-gallery-nogrid: this style does not modify colors
bmh: #348ABD, #A60628, #7A68A6, #467821, #D55E00, #CC79A7, #56B4E9, #009E73, #F0E442, #0072B2
classic: b, g, r, c, m, y, k
dark_background: #8dd3c7, #feffb3, #bfbbd9, #fa8174, #81b1d2, #fdb462, #b3de69, #bc82bd, #ccebc4, #ffed6f
fast: this style does not modify colors
fivethirtyeight: #008fd5, #fc4f30, #e5ae38, #6d904f, #8b8b8b, #810f7c
ggplot: #E24A33, #348ABD, #988ED5, #777777, #FBC15E, #8EBA42, #FFB5B8
grayscale: 0.00, 0.40, 0.60, 0.70
seaborn: #4C72B0, #55A868, #C44E52, #8172B2, #CCB974, #64B5CD
seaborn-bright: #003FFF, #03ED3A, #E8000B, #8A2BE2, #FFC400, #00D7FF
seaborn-colorblind: #0072B2, #009E73, #D55E00, #CC79A7, #F0E442, #56B4E9
seaborn-dark: this style does not modify colors
seaborn-dark-palette: #001C7F, #017517, #8C0900, #7600A1, #B8860B, #006374
seaborn-darkgrid: this style does not modify colors
seaborn-deep: #4C72B0, #55A868, #C44E52, #8172B2, #CCB974, #64B5CD
seaborn-muted: #4878CF, #6ACC65, #D65F5F, #B47CC7, #C4AD66, #77BEDB
seaborn-notebook: this style does not modify colors
seaborn-paper: this style does not modify colors
seaborn-pastel: #92C6FF, #97F0AA, #FF9F9A, #D0BBFF, #FFFEA3, #B0E0E6
seaborn-poster: this style does not modify colors
seaborn-talk: this style does not modify colors
seaborn-ticks: this style does not modify colors
seaborn-white: this style does not modify colors
seaborn-whitegrid: this style does not modify colors
tableau-colorblind10: #006BA4, #FF800E, #ABABAB, #595959, #5F9ED1, #C85200, #898989, #A2C8EC, #FFBC79, #CFCFCF
In my understanding
the seaborn-xxx styles that do not modify colors are to be used as the last step in a sequence of styles, e.g., plt.style.use(['seaborn', 'seaborn-poster']) or plt.style.use(['seaborn', 'seaborn-muted', 'seaborn-poster'])
also the _ starting styles are meant to modify other styles, and
the only other style that does not modify the colors, fast, is all about tweaking the rendering parameters for faster rendering.

How to set the line width of error bar caps

How can the line width of the error bar caps in Matplotlib be changed?
I tried the following code:
(_, caplines, _) = matplotlib.pyplot.errorbar(
    data['distance'], data['energy'], yerr=data['energy sigma'],
    capsize=10, elinewidth=3)
for capline in caplines:
    capline.set_linewidth(10)
    capline.set_color('red')
matplotlib.pyplot.draw()
Unfortunately, this updates the color of the caps, but does not update the line width of the caps!
The resulting effect is similar to the "fat error bar lines / thin caps" in the following image:
It would be nice to have "fat" bar caps in this case; how can this be done in Matplotlib? Drawing the bar caps "manually", one by one with plot(), would work, but a simpler alternative would be best.
EOL, you were very close:
import matplotlib.pyplot as plt
distance = [1,3,7,9]
energy = [10,20,30,40]
sigma = [1,3,2,5]
(_, caps, _) = plt.errorbar(distance, energy, sigma, capsize=20, elinewidth=3)
for cap in caps:
    cap.set_color('red')
    cap.set_markeredgewidth(10)
plt.show()
set_markeredgewidth sets the width of the cap lines.
Matplotlib objects have so many attributes that it is often difficult to remember the right ones for a given object. IPython is a very useful tool for introspecting matplotlib. I used it to inspect the properties of the Line2D objects corresponding to the error cap lines, and that is how I found this and other marker properties.
Cheers
This is based on @joaquin's answer, but a little more concise (if you just want plain error caps with no special styling):
distance = [1,3,7,9]
energy = [10,20,30,40]
sigma = [1,3,2,5]
plt.errorbar(distance,
             energy,
             sigma,
             capsize=5,
             elinewidth=2,
             markeredgewidth=2)
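As a further variation not taken from the answers above, errorbar also accepts a capthick keyword that sets the cap thickness directly. The sketch below reuses the sample data from the first answer; the Agg backend and the final check of the cap widths are additions for illustration only:

```python
import matplotlib
matplotlib.use("Agg")  # headless backend, so no display is needed
import matplotlib.pyplot as plt

distance = [1, 3, 7, 9]
energy = [10, 20, 30, 40]
sigma = [1, 3, 2, 5]

fig, ax = plt.subplots()
# capsize sets the cap length, capthick the cap line thickness (both in points).
_, caps, _ = ax.errorbar(distance, energy, yerr=sigma,
                         capsize=20, elinewidth=3, capthick=10)

# The caps are Line2D markers, so their thickness is the marker edge width.
cap_widths = [cap.get_markeredgewidth() for cap in caps]
print(cap_widths)
plt.close(fig)
```

This avoids the loop over the cap lines when the only goal is thicker caps; the loop is still needed for per-cap styling such as a different color.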
