What I would like to achieve:
I want to create several pie charts on one figure. They all share some categories but sometimes have different ones. Obviously I want all of the same categories to have the same colors.
That is why I created a dictionary which links the categories (= labels) to the colors. With that I can specify the colors of the pie chart. But I would like to use the ggplot colors (which come with matplotlib.style.use('ggplot')). How can I get those colors to feed them into my dictionary?
# set colors for labels
color_dict = {}
for i in range(0, len(data_categories)):
    color_dict[data_categories[i]] = ???
# apply colors
ind_label = 0
for pie_wedge in pie[0]:
    leg = ax[ind].get_legend()
    pie_wedge.set_facecolor(color_dict[labels_0[ind_label]])
    leg.legendHandles[ind_label].set_color(color_dict[labels_0[ind_label]])
    ind_label += 1
Short answer
To access the colors used in the ggplot style, you can do as follows
In [37]: import matplotlib.pyplot as plt
In [38]: plt.style.use('ggplot')
In [39]: colors = plt.rcParams['axes.prop_cycle'].by_key()['color']
In [40]: print('\n'.join(color for color in colors))
#E24A33
#348ABD
#988ED5
#777777
#FBC15E
#8EBA42
#FFB5B8
In the above example the colors, as hexadecimal RGB strings, are contained in the list colors.
Remember to call plt.style.use(...) before accessing the color list, otherwise you'll find the standard colors.
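Applied to the question, one way to fill the dictionary is to pair each category with the next color of the cycle. This is only a minimal sketch: the category names are made up for illustration, plt.style.context is used here as a temporary stand-in for plt.style.use, and itertools.cycle is an addition so that the colors wrap around if there are more categories than the seven ggplot colors.

```python
from itertools import cycle

import matplotlib.pyplot as plt

# Hypothetical category labels; replace with your own data_categories.
data_categories = ["rent", "food", "travel", "books"]

# Read the ggplot colors without changing the global style.
with plt.style.context("ggplot"):
    colors = plt.rcParams["axes.prop_cycle"].by_key()["color"]

# Pair each category with a color; cycle() wraps around
# if the categories outnumber the colors.
color_dict = dict(zip(data_categories, cycle(colors)))
```

Every pie chart can then look up color_dict[label] for its wedges, so a category shared between charts gets the same color everywhere.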
More detailed explanation
The answer above is tailored for modern releases of Matplotlib, where the plot colors, and possibly other plot properties such as line widths and dashes (see this answer of mine), are stored in the rcParams dictionary under the key 'axes.prop_cycle'. They are contained in a new kind of object, a cycler (another explanation of a cycler is given in my answer referenced above).
To get the list of colors, we have to get the cycler from rcParams and then use its .by_key() method
Signature: c.by_key()
Docstring: Values by key

This returns the transposed values of the cycler. Iterating
over a `Cycler` yields dicts with a single value for each key,
this method returns a `dict` of `list` which are the values
for the given key.

The returned value can be used to create an equivalent `Cycler`
using only `+`.

Returns
-------
transpose : dict
    dict of lists of the values for each key.
to have a dictionary of values that, at last, we index using the key 'color'.
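To make the transposition concrete, here is a small sketch built on the cycler package that Matplotlib depends on. The property values are chosen purely for illustration and are not taken from any style.

```python
from cycler import cycler

# A cycler that varies two properties together (illustrative values).
c = cycler(color=["r", "g", "b"]) + cycler(linewidth=[1, 2, 3])

# Iterating yields one dict per step of the cycle.
steps = list(c)

# by_key() transposes that into one list per property.
transposed = c.by_key()
colors = transposed["color"]
```

Here steps holds three dicts, each with a 'color' and a 'linewidth' entry, while transposed holds two lists of three values each.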
Addendum
Updated, 2023-01-01.
It is not strictly necessary to call plt.style.use('a_style') to access its colors: the colors are (possibly) defined in a matplotlib.RcParams object that is stored in the dictionary matplotlib.style.library.
E.g., let's print all the color sequences defined in the different styles:
In [23]: for style in sorted(plt.style.library):
    ...:     the_rc = plt.style.library[style]
    ...:     if 'axes.prop_cycle' in the_rc:
    ...:         colors = the_rc['axes.prop_cycle'].by_key()['color']
    ...:         print('%25s: %s'%(style, ', '.join(color for color in colors)))
    ...:     else:
    ...:         print('%25s: this style does not modify colors'%style)
Solarize_Light2: #268BD2, #2AA198, #859900, #B58900, #CB4B16, #DC322F, #D33682, #6C71C4
_classic_test_patch: this style does not modify colors
_mpl-gallery: this style does not modify colors
_mpl-gallery-nogrid: this style does not modify colors
bmh: #348ABD, #A60628, #7A68A6, #467821, #D55E00, #CC79A7, #56B4E9, #009E73, #F0E442, #0072B2
classic: b, g, r, c, m, y, k
dark_background: #8dd3c7, #feffb3, #bfbbd9, #fa8174, #81b1d2, #fdb462, #b3de69, #bc82bd, #ccebc4, #ffed6f
fast: this style does not modify colors
fivethirtyeight: #008fd5, #fc4f30, #e5ae38, #6d904f, #8b8b8b, #810f7c
ggplot: #E24A33, #348ABD, #988ED5, #777777, #FBC15E, #8EBA42, #FFB5B8
grayscale: 0.00, 0.40, 0.60, 0.70
seaborn: #4C72B0, #55A868, #C44E52, #8172B2, #CCB974, #64B5CD
seaborn-bright: #003FFF, #03ED3A, #E8000B, #8A2BE2, #FFC400, #00D7FF
seaborn-colorblind: #0072B2, #009E73, #D55E00, #CC79A7, #F0E442, #56B4E9
seaborn-dark: this style does not modify colors
seaborn-dark-palette: #001C7F, #017517, #8C0900, #7600A1, #B8860B, #006374
seaborn-darkgrid: this style does not modify colors
seaborn-deep: #4C72B0, #55A868, #C44E52, #8172B2, #CCB974, #64B5CD
seaborn-muted: #4878CF, #6ACC65, #D65F5F, #B47CC7, #C4AD66, #77BEDB
seaborn-notebook: this style does not modify colors
seaborn-paper: this style does not modify colors
seaborn-pastel: #92C6FF, #97F0AA, #FF9F9A, #D0BBFF, #FFFEA3, #B0E0E6
seaborn-poster: this style does not modify colors
seaborn-talk: this style does not modify colors
seaborn-ticks: this style does not modify colors
seaborn-white: this style does not modify colors
seaborn-whitegrid: this style does not modify colors
tableau-colorblind10: #006BA4, #FF800E, #ABABAB, #595959, #5F9ED1, #C85200, #898989, #A2C8EC, #FFBC79, #CFCFCF
In my understanding:
the seaborn-xxx styles that do not modify colors are to be used as the last step in a sequence of styles, e.g., plt.style.use(['seaborn', 'seaborn-poster']) or plt.style.use(['seaborn', 'seaborn-muted', 'seaborn-poster']);
the styles whose names start with _ are likewise meant to modify other styles; and
the only other style that does not modify the colors, fast, is all about tweaking the rendering parameters to have a faster rendering.
Related
I'd like to style a Pandas DataFrame display with a background color that is based on the logarithm (base 10) of a value, rather than the data frame value itself. The numeric display should show the original values (along with specified numeric formatting), rather than the log of the values.
I've seen many solutions involving the apply and applymap methods, but am not really clear on how to use these, especially since I don't want to change the underlying dataframe.
Here is an example of the type of data I have. Using the "gradient" to highlight is not satisfactory, but highlighting based on the log base 10 would be really useful.
import pandas as pd
import numpy as np
E = np.array([1.26528431e-03, 2.03866202e-04, 6.64793821e-05, 1.88018687e-05,
4.80967314e-06, 1.22584958e-06, 3.09260354e-07, 7.76751705e-08])
df = pd.DataFrame(E,columns=['Error'])
df.style.format('{:.2e}'.format).background_gradient(cmap='Blues')
Since pandas 1.3.0, background_gradient now has a gmap (gradient map) argument that allows you to set the values that determine the background colors.
See the examples here (this link is to the dev docs - can be replaced once 1.3.0 is released) https://pandas.pydata.org/pandas-docs/dev/reference/api/pandas.io.formats.style.Styler.background_gradient.html#pandas.io.formats.style.Styler.background_gradient
I figured out how to use the apply function to do exactly what I want. And also, I discovered a few more features in Matplotlib's colors module, including LogNorm which normalizes using a log. So in the end, this was relatively easy.
What I learned:
Do not use background_gradient, but rather supply your own function that maps DataFrame values to colors. The argument to the function is the dataframe to be displayed. The return value should be a dataframe with the same index and columns, but with the values replaced by CSS strings such as background-color:#ffaa44.
Pass this function as an argument to apply.
import pandas as pd
import numpy as np
from matplotlib import colors
import seaborn as sns

def color_log(x):
    # x is the whole DataFrame (axis=None); return a same-shaped frame of CSS strings
    df = x.copy()
    cmap = sns.color_palette("spring", as_cmap=True).reversed()
    evals = df['Error'].values
    norm = colors.LogNorm(vmin=1e-10, vmax=1)  # log-scale normalization to [0, 1]
    normed = norm(evals)
    cstr = "background-color: {:s}".format
    # cmap is already a Colormap object, so it can be called directly
    c = [cstr(colors.rgb2hex(rgba)) for rgba in cmap(normed)]
    df['Error'] = c
    return df

E = np.array([1.26528431e-03, 2.03866202e-04, 6.64793821e-05, 1.88018687e-05,
              4.80967314e-06, 1.22584958e-06, 3.09260354e-07, 7.76751705e-08])
df = pd.DataFrame(E, columns=['Error'])
df.style.format('{:.2e}'.format).apply(color_log, axis=None)
Note (1) The second argument to the apply function is an "axis". By supplying axis=None, the entire data frame will be passed to color_log. Passing axis=0 will pass in each column of the data frame as a Series. In this case, the code supplied above will not work. However, this would be useful for dataframes in which each column should be handled separately.
Note (2) If axis=None is used, and the DataFrame has more than one column, the color mapping function passed to apply should set colors for all columns in the DataFrame. For example,
df.loc[:, :] = 'background-color:#eeeeee'
would set all columns to grey. Selected columns could then be overwritten with other color choices.
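For the axis=0 case mentioned in Note (1), the function receives one column at a time as a Series and has to return one CSS string per entry. A minimal sketch of such a per-column variant follows. The name color_log_column, the 'Blues' colormap and the default limits are choices made for this illustration, not part of the original solution.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from matplotlib import colors

def color_log_column(col, cmap_name="Blues", vmin=1e-10, vmax=1.0):
    """Return one CSS string per entry of the Series `col`, colored on a log scale."""
    norm = colors.LogNorm(vmin=vmin, vmax=vmax)
    cmap = plt.get_cmap(cmap_name)
    rgba = cmap(norm(col.to_numpy(dtype=float)))  # array of shape (n, 4)
    return [f"background-color: {colors.to_hex(c)}" for c in rgba]

E = np.array([1.26528431e-03, 2.03866202e-04, 6.64793821e-05, 1.88018687e-05,
              4.80967314e-06, 1.22584958e-06, 3.09260354e-07, 7.76751705e-08])
df = pd.DataFrame(E, columns=["Error"])

# Call the function directly on one column to inspect the CSS it produces.
css = color_log_column(df["Error"])
```

The same function should then plug into the Styler as df.style.format('{:.2e}').apply(color_log_column, axis=0), which colors every column independently.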
I would be happy to know if there is yet a simpler way to do this.
I would like to move the x-axis to the top of my plot and manually fill the colors. However, the usual method in ggplot does not work in plotnine. When I provide the position='top' in my scale_x_continuous() I receive the warning: PlotnineWarning: scale_x_continuous could not recognize parameter 'position'. I understand position is not in plotnine's scale_x_continuous, but what is the replacement? Also, scale_fill_manual() results in an Invalid RGBA argument: 'color' error. Specifically, the value requires an array-like object. Thus I provided the array of colors, but still had an issue. How do I manually set the colors for a scale_fill object?
import numpy as np
import pandas as pd
from plotnine import *
lst = [[1,1,'a'],[2,2,'a'],[3,3,'a'],[4,4,'b'],[5,5,'b']]
df = pd.DataFrame(lst, columns =['xx', 'yy','lbls'])
fill_clrs = {'a': 'goldenrod1',
             'b': 'darkslategray3'}
ggplot()+\
    geom_tile(aes(x='xx', y='yy', fill = 'lbls'), df) +\
    geom_text(aes(x='xx', y='yy', label='lbls'),df, color='white')+\
    scale_x_continuous(expand=(0,0), position = "top")+\
    scale_fill_manual(values = np.array(list(fill_clrs.values())))
Plotnine does not support changing the position of any axis.
You can pass a list or a dict of colour values to scale_fill_manual provided they are recognisable colour names. The colours you have (goldenrod1 and darkslategray3) are not among the names matplotlib recognises. To see that it works, try 'red' and 'green'; see https://matplotlib.org/gallery/color/named_colors.html for all the named colors. Otherwise, you can also use hex colors, e.g. #ff00cc.
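Since the "Invalid RGBA argument" error is raised by matplotlib, colour names can be checked up front with matplotlib.colors.is_color_like. The sketch below assumes plotnine hands the values to matplotlib unchanged. The replacement names are simply the closest matplotlib names, not exact equivalents of the original colours.

```python
from matplotlib.colors import is_color_like

fill_clrs = {"a": "goldenrod1", "b": "darkslategray3"}

# Collect the names matplotlib cannot parse.
unrecognised = [name for name in fill_clrs.values() if not is_color_like(name)]

# Stand-in values: nearest matplotlib names, not exact matches.
fixed_clrs = {"a": "goldenrod", "b": "darkslategray"}
all_valid = all(is_color_like(name) for name in fixed_clrs.values())
```

A dict such as fixed_clrs can then be passed as scale_fill_manual(values=fixed_clrs).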
I have just embarked on Seaborn and encountered some obstacles to get familiar with it. Specifically, with this method:
sns.set_context('paper',rc={"font.size":1000,'axes.labelsize':5})
What is the meaning of "font.size"? I have tried tweaking that parameter several times, from 0 up to a number as large as 1000. Unfortunately, I saw no effect in my experiments.
The rc argument of seaborn.set_context is passed into plotting_context which contains
def plotting_context(context=None, font_scale=1, rc=None):
    # ...
    # Now independently scale the fonts
    font_keys = ["axes.labelsize", "axes.titlesize", "legend.fontsize",
                 "xtick.labelsize", "ytick.labelsize", "font.size"]
    font_dict = {k: context_dict[k] * font_scale for k in font_keys}
    context_dict.update(font_dict)
    # ...
This piece of code sets the values for text sizing for axes, legend, xtick, and ytick. Since those sizes are explicitly set they will ignore the font.size parameter, which sets only the default - and is only used if a value has not been explicitly set, as is noted here
## note that font.size controls default text sizes. To configure
## special text sizes tick labels, axes, labels, title, etc, see the rc
## settings for axes and ticks [...]
So in order to see the effects of 'font.size':x you will need to create some text which is not included in those which have their sizes explicitly set by plotting_context, e.g. a matplotlib.text.Text instance created with Axes.text.
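The difference can be reproduced with matplotlib alone. The sketch below uses plt.rc_context in place of sns.set_context, so it only mimics the rc values involved. The sizes 30 and 5 are arbitrary.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt

# A large default size together with a small, explicitly set label size.
with plt.rc_context({"font.size": 30, "axes.labelsize": 5}):
    fig, ax = plt.subplots()
    ax.set_xlabel("x label")                # sized by axes.labelsize
    note = ax.text(0.5, 0.5, "free text")   # no explicit size: falls back to font.size
    label_size = ax.xaxis.label.get_fontsize()
    text_size = note.get_fontsize()
```

The axis label keeps its explicit size, while the free text picks up font.size.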
How to detach the height of the stacked bars from the colors of the fill?
I have multiple categories which I want to present in a stacked bar chart so that the height represents the value and the color is conditionally defined by another variable (something like fill= in ggplot).
I am new to bokeh and struggling with the stacked bar chart mechanics. I tried to construct this type of chart, but I haven't got anything except all sorts of errors. The examples of stacked bar charts in the bokeh documentation are very limited.
My Data is stored in pandas dataframe:
data = [
    ['A', 1, 15, 1],
    ['A', 2, 14, 2],
    ['A', 3, 60, 1],
    ['B', 1, 15, 2],
    ['B', 2, 25, 2],
    ['B', 3, 20, 1],
    ['C', 1, 15, 1],
    ['C', 2, 25, 1],
    ['C', 3, 55, 2],
    ...
]
Columns represent Category, Regime, Value, State.
I want to plot Category on the x axis and Regimes stacked on the y axis, where bar length represents Value and color represents State.
Is this achievable in bokeh?
Can anybody demonstrate, please?
I think this problem becomes much easier if you transform your data to the following form:
from bokeh.plotting import figure
from bokeh.io import show
from bokeh.transform import stack, factor_cmap
import pandas as pd
df = pd.DataFrame({
    "Category": ["a", "b"],
    "Regime1_Value": [1, 4],
    "Regime1_State": ["A", "B"],
    "Regime2_Value": [2, 5],
    "Regime2_State": ["B", "B"],
    "Regime3_Value": [3, 6],
    "Regime3_State": ["B", "A"]})
p = figure(x_range=["a", "b"])
p.vbar_stack(["Regime1_Value", "Regime2_Value", "Regime3_Value"],
             x="Category",
             fill_color=[
                 factor_cmap(state, palette=["red", "green"], factors=["A", "B"])
                 for state in ["Regime1_State", "Regime2_State", "Regime3_State"]],
             line_color="black",
             width=0.9,
             source=df)
show(p)
This is a bit strange, because vbar_stack behaves unlike a "normal glyph". Normally you have three options for attributes of a renderer (assume we want to plot n dots/rectangles/shapes/things):
Give a single value that is used for all n glyphs
Give a column name that is looked up in the source (source[column_name] must produce an "array" of length n)
Give an array of length n of data
But vbar_stack does not create one renderer; it creates as many as there are elements in the first array you give. Let's call this number k. Then, to make sense of the attributes, you again have three options:
Give a single value that is used for all glyphs
Give an array of k things that are used as columns names in the source (each lookup must produce an array of length n).
Give an array of length n of data (so all k glyphs share the same data).
So p.vbar(x=[a,b,c]) and p.vbar_stack(x=[a,b,c]) actually do different things (the first gives literal data, the second gives column names), which is confusing and not clear from the documentation.
But why do we have to transform your data so strangely? Let's unroll vbar_stack and write it on our own (details left out for brevity):
plotted_regimes = []
for regime in regimes:
    if not plotted_regimes:
        bottom = 0
    else:
        bottom = stack(*plotted_regimes)
    p.vbar(bottom=bottom, top=stack(*plotted_regimes, regime))
    plotted_regimes.append(regime)
So for each regime we have a separate vbar that has its bottom where the sum of the other regimes ended. Now with the original data structure this is not really possible, because there need not be a value for every regime in every category. In the transformed data we are forced to set such missing values to 0 explicitly.
Because the stacked values correspond to column names, we have to put these values in one dataframe. The vbar_stack call in the beginning could also be written with stack (basically because vbar_stack is a convenience wrapper around stack).
The factor_cmap is used so that we don't have to manually assign colors. We could also simply add a Regime1_Color column, but this way the mapping is done automatically (and client side).
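The reshaping from the question's long layout to the wide layout used at the start of this answer can be sketched with pandas. This is one possible approach, using only the rows shown in the question. The column naming ("Regime1_Value" and so on) follows the example above, and converting State to strings is an assumption made so that the values can serve as factors for factor_cmap.

```python
import pandas as pd

# The rows shown in the question: Category, Regime, Value, State.
rows = [("A", 1, 15, 1), ("A", 2, 14, 2), ("A", 3, 60, 1),
        ("B", 1, 15, 2), ("B", 2, 25, 2), ("B", 3, 20, 1),
        ("C", 1, 15, 1), ("C", 2, 25, 1), ("C", 3, 55, 2)]
long_df = pd.DataFrame(rows, columns=["Category", "Regime", "Value", "State"])
long_df["State"] = long_df["State"].astype(str)  # categorical factors as strings

# One row per Category; columns become (field, regime) pairs.
wide = long_df.set_index(["Category", "Regime"]).unstack("Regime")

# Flatten the two-level columns into names like "Regime1_Value".
wide.columns = [f"Regime{regime}_{field}" for field, regime in wide.columns]
wide = wide.reset_index()
```

If some Category/Regime pairs were missing, the corresponding cells would come out as NaN and the value columns would need to be filled with 0 before plotting, as described above.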
I have a series of lines that each need to be plotted with a separate colour. Each line is actually made up of several data sets (positive, negative regions etc.) and so I'd like to be able to create a generator that will feed one colour at a time across a spectrum, for example the gist_rainbow map shown here.
I have found the following works but it seems very complicated and more importantly difficult to remember,
from pylab import *
NUM_COLORS = 22
mp = cm.datad['gist_rainbow']
get_color = matplotlib.colors.LinearSegmentedColormap.from_list(mp, colors=['r', 'b'], N=NUM_COLORS)
...
# Then in a for loop
this_color = get_color(float(i)/NUM_COLORS)
Moreover, it does not cover the range of colours in the gist_rainbow map, I have to redefine a map.
Maybe a generator is not the best way to do this, if so what is the accepted way?
To index colors from a specific colormap you can use:
import pylab
NUM_COLORS = 22
cm = pylab.get_cmap('gist_rainbow')
for i in range(NUM_COLORS):
    color = cm(1.*i/NUM_COLORS)  # color will now be an RGBA tuple
# or if you really want a generator:
cgen = (cm(1.*i/NUM_COLORS) for i in range(NUM_COLORS))
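If the goal is simply that each new line takes the next colour, the sampled colours can also be installed as the colour cycle of the axes, so no colour has to be passed by hand. This is a sketch using Axes.set_prop_cycle. The plotted data are dummy values.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt

NUM_COLORS = 22
cmap = plt.get_cmap("gist_rainbow")
colors = [cmap(i / NUM_COLORS) for i in range(NUM_COLORS)]  # RGBA tuples

fig, ax = plt.subplots()
ax.set_prop_cycle(color=colors)  # each new line takes the next colour

lines = []
for i in range(NUM_COLORS):
    (line,) = ax.plot([0, 1], [i, i + 1])  # dummy data
    lines.append(line)
```

When one logical line is made of several data sets, the colour of the first piece can be reused for the others via color=line.get_color().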