python facetgrid with sns.barplot and map; target no overlapping group bars

I am implementing a seaborn FacetGrid whose subplots are bar plots, each containing two groups (column 'type'). I want the bars of the different groups to sit side by side, neither stacked nor overlapping. I am using the following code:
g = sns.FacetGrid(data,
col='C',
hue = 'type',
sharex=False,
sharey=False,
height=7, # this parameter was named size before seaborn 0.9.0
palette=sns.color_palette(['red','green']),
)
g = g.map(sns.barplot, 'A', 'B').add_legend()
The data is a long-format pandas DataFrame with the following example structure:
data=pd.DataFrame({'A':['X','X','Y','Y','X','X','Y','Y'],
'B':[0,1,2,3,4,5,6,7],
'C':[1,1,1,1,2,2,2,2],
'type':['ctrl','cond1','ctrl','cond1','ctrl','cond1','ctrl','cond1']}
)
In the resulting bar plots, the bars of the two groups fully overlap, so ctrl is hidden (see below). What I want instead is neighbouring, non-overlapping bars for each group. How can I achieve that? My real code has more bars per plot, where you can see the overlapping colors (here they are fully covered).

This answer shows how to use FacetGrid directly.
However, if you have seaborn 0.9.0 installed, I would recommend the new catplot() function, which should produce the right plot (at least I think so). Note that this function returns a FacetGrid object. You can pass kwargs to the call to customize the resulting FacetGrid, or modify its properties afterwards.
g = sns.catplot(data=data, x='A', y='B', hue='type', col='C', kind='bar')

I think you want to provide the hue argument to the barplot, not to the FacetGrid, because the grouping takes place within each (single) barplot, not at the facet level.
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns
data=pd.DataFrame({'A':['X','X','Y','Y','X','X','Y','Y'],
'B':[0,1,2,3,4,5,6,7],
'C':[1,1,1,1,2,2,2,2],
'type':['ctrl','cond1','ctrl','cond1','ctrl','cond1','ctrl','cond1']})
g = sns.FacetGrid(data,
col='C',
sharex=False,
sharey=False,
height=4)
g = g.map(sns.barplot, 'A', 'B', "type",
hue_order=np.unique(data["type"]),
order=["X", "Y"],
palette=sns.color_palette(['red','green']))
g.add_legend()
plt.show()

Related

How to reduce the blank area in a grouped boxplot with many missing hue categories

I have an issue when plotting a categorical grouped boxplot by seaborn in Python, especially using 'hue'.
My raw data is shown in the figure below. I want to plot the values in column 8, grouped by columns 1 and 4.
I used seaborn and my code is shown below:
ax = sns.boxplot(x=output[:,1], y=output[:,8], hue=output[:,4])
ax.set_xticklabels(ax.get_xticklabels(), rotation=90)
plt.legend([],[])
However, the generated plot always contains a large blank area, as shown in the upper figure below. I tried adding 'dodge=False' to sns.boxplot, following a post here (https://stackoverflow.com/questions/53641287/off-center-x-axis-in-seaborn), but that gives the lower figure below.
What I actually want Python to plot is a boxplot like the one I generated using JMP below.
It seems that if one of the second-level categories is empty, seaborn still reserves space for it within each first-level category, which causes the observed offset and blank area.
So I wonder whether there is any way to solve this issue, for example by using another Python package?

Seaborn reserves a spot for each individual hue value, even when some of these values are missing. When many hue values are missing, this leads to annoying open spots. (If there were only one box per x-value, dodge=False would solve the problem.)
A workaround is to generate a separate subplot for each individual x-label.
Reproducible example for default boxplot with missing hue values
import matplotlib.pyplot as plt
import seaborn as sns
import pandas as pd
import numpy as np
np.random.seed(20230206)
df = pd.DataFrame({'label': np.repeat(['label1', 'label2', 'label3', 'label4'], 250),
'cat': np.repeat(np.random.choice([*'abcdefghijklmnopqrst'], 40), 25),
'value': np.random.randn(1000).cumsum()})
df['cat'] = pd.Categorical(df['cat'], [*'abcdefghijklmnopqrst'])
sns.set_style('white')
plt.figure(figsize=(15, 5))
ax = sns.boxplot(df, x='label', y='value', hue='cat', palette='turbo')
sns.move_legend(ax, loc='upper left', bbox_to_anchor=(1, 1), ncol=2)
sns.despine()
plt.tight_layout()
plt.show()
Individual subplots per x value
A FacetGrid is generated with a subplot ("facet") for each x value.
The original hue is used as the x-value within each subplot. To avoid empty spots, the hue column should be of string type: if it were pd.Categorical, seaborn would still reserve a spot for each of the categories.
df['cat'] = df['cat'].astype(str) # the column should be of string type, not pd.Categorical
g = sns.FacetGrid(df, col='label', sharex=False)
g.map_dataframe(sns.boxplot, x='cat', y='value')
for label, ax in g.axes_dict.items():
    ax.set_title('') # remove the title generated by sns.FacetGrid
    ax.set_xlabel(label) # use the label from the dataframe as xlabel
plt.tight_layout()
plt.show()
Adding consistent coloring
A dictionary palette can color the boxes such that corresponding boxes in different subplots have the same color. hue= with the same column as the x= will do the coloring, and dodge=False will remove the empty spots.
df['cat'] = df['cat'].astype(str) # the column should be of string type, not pd.Categorical
cats = np.sort(df['cat'].unique())
palette_dict = {cat: color for cat, color in zip(cats, sns.color_palette('turbo', len(cats)))}
g = sns.FacetGrid(df, col='label', sharex=False)
g.map_dataframe(sns.boxplot, x='cat', y='value',
hue='cat', dodge=False, palette=palette_dict)
for label, ax in g.axes_dict.items():
    ax.set_title('') # remove the title generated by sns.FacetGrid
    ax.set_xlabel(label) # use the label from the dataframe as xlabel
    # ax.tick_params(axis='x', labelrotation=90) # optionally rotate the tick labels
plt.tight_layout()
plt.show()

How to affect a list of colors to histogram index bar in matplotlib?

I have the following dataframe "freqs2" with an index (SD to SD17) and associated values (frequencies):
freqs
SD 101
SD2 128
...
SD17 65
I would like to assign a specific list of colors (in order) to the index entries, one color per bar. I've tried the following code:
colors=['#e5243b','#DDA63A', '#4C9F38','#C5192D','#FF3A21','#26BDE2','#FCC30B','#A21942','#FD6925','#DD1367','#FD9D24','#BF8B2E','#3F7E44','#0A97D9','#56C02B','#00689D','#19486A']
freqs2.plot.bar(freqs2.index, legend=False,rot=45,width=0.85, figsize=(12, 6),fontsize=(14),color=colors )
plt.ylabel('Frequency',fontsize=(17))
As a result, all bars in my chart are red (the first color of the list).
Based on similar questions, I tried passing "freqs2.index" to indicate that the colors apply to the index, but the problem stays the same.

It looks like a bug in pandas. Plotting directly with matplotlib or using seaborn (which I recommend) works:
import seaborn as sns
colors=['#e5243b','#dda63a', '#4C9F38','#C5192D','#FF3A21','#26BDE2','#FCC30B','#A21942','#FD6925','#DD1367','#FD9D24','#BF8B2E','#3F7E44','#0A97D9','#56C02B','#00689D','#19486A']
# # plotting directly with matplotlib works too:
# fig = plt.figure()
# ax = fig.add_axes([0,0,1,1])
# ax.bar(x=df.index, height=df['freqs'], color=colors)
ax = sns.barplot(data=df, x= df.index, y='freqs', palette=colors)
ax.tick_params(axis='x', labelrotation=45)
plt.ylabel('Frequency',fontsize=17)
plt.show()
Edit: an issue already exists on Github
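As a runnable version of the commented-out matplotlib route above, here is a minimal sketch. The three-row frame `freqs_demo` and the shortened color list are hypothetical stand-ins for the real data, not values taken from the question beyond the rows it shows.

```python
import matplotlib.pyplot as plt
import pandas as pd
from matplotlib.colors import to_hex

# Hypothetical stand-in for the frequency table in the question
freqs_demo = pd.DataFrame({"freqs": [101, 128, 65]}, index=["SD", "SD2", "SD17"])
bar_colors = ["#e5243b", "#DDA63A", "#19486A"]  # one color per index entry, in order

fig, ax = plt.subplots(figsize=(6, 3))
# Axes.bar accepts a sequence of colors and applies them bar by bar
bars = ax.bar(freqs_demo.index, freqs_demo["freqs"], color=bar_colors, width=0.85)
ax.tick_params(axis="x", labelrotation=45)
ax.set_ylabel("Frequency", fontsize=17)

# Read back the face colors that were actually applied, as lowercase hex strings
used_colors = [to_hex(bar.get_facecolor()) for bar in bars]
```

Calling plt.show() afterwards displays the chart; `used_colors` is only there to confirm that each bar received its own color.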

is it possible to combine 2 differents styles in Matplotlib or seaborn in one plot?

I don't know whether it's possible, with Matplotlib, seaborn, or another tool, to plot one line and one bar series (candlestick style) together in one figure, like the image below (made in Excel):
The x-axis and y-axis are shared.
Following the response below, I chose mplfinance.
I have the following dataframe (daily),
and with the following function we can plot it:
import mplfinance as mpf

def ploting_chart(daily):
    # Take marketcolors from 'yahoo'
    mc = mpf.make_marketcolors(base_mpf_style='yahoo',up='#ff3300',down='#009900',inherit=True)
    # Create a style based on `seaborn` using those market colors:
    s = mpf.make_mpf_style(base_mpl_style='seaborn',marketcolors=mc,y_on_right=True,
                           gridstyle = 'solid' , mavcolors = ['#4d79ff','#d24dff']
                           )
    # **kwargs
    kwargs = dict(
        type='candle',mav=(7,15),volume=True, figratio=(11,8),figscale=2,
        title = 'Covid-19 Madagascar en traitement',ylabel = 'Total en traitement',
        update_width_config=dict(candle_linewidth=0.5,candle_width=0.5),
        ylabel_lower = 'Total'
    )
    # Plot my new custom mpf style:
    mpf.plot(daily,**kwargs,style=s,scale_width_adjustment=dict(volume=0.4))
I get the final result

Yes. plt.figure or plt.subplots gives you a figure object, and you can then draw as many plots on it as you want, for example:
import matplotlib.pyplot as plt
import seaborn as sns
fmri = sns.load_dataset("fmri")
f, ax = plt.subplots(1, 1, figsize=(10, 7)) # one figure holding a single Axes
g1 = sns.lineplot(x="timepoint", y="signal", data=fmri, ax=ax) # passing ax=ax is required
# placeholders: any other axes-level seaborn or matplotlib plot can target the same ax
# g2 = sns.some_other_chart(your_data, ax=ax)
# g3 = ax.some_matplotlib_chart(your_data) # Axes methods need no ax=ax
Seaborn does not support candlestick charts, but you can draw one with matplotlib on the same axis. (The matplotlib.finance module has been removed from matplotlib; candlestick_ohlc is now provided by the mplfinance package.)
from mplfinance.original_flavor import candlestick_ohlc
candlestick_ohlc(ax, data.values, width=0.6, colorup='g', colordown='r') # dummy code for illustration; note the ax object passed as the first argument
You can even use pandas, df.plot(kind='bar', ax=ax, **kwargs), to plot within the same axis object.
Note: figure-level seaborn functions such as relplot do not support plotting on an existing ax, because they create their own grid of subplots.
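To make the shared-axis idea concrete, here is a minimal sketch using only matplotlib. The numbers are made up for illustration and the variable names are hypothetical; the point is only that a bar series and a line drawn on the same Axes share one x-axis and one y-axis.

```python
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical daily values, made up for illustration only
days = np.arange(10)
daily_change = np.array([3, -1, 4, 2, -2, 5, 1, -3, 2, 4])
running_total = daily_change.cumsum()

fig, ax = plt.subplots(figsize=(8, 4))
# Both calls target the same Axes, so the bars and the line share x and y
bar_container = ax.bar(days, daily_change, width=0.6, color="tab:green",
                       label="daily change")
(line,) = ax.plot(days, running_total, color="tab:blue", marker="o",
                  label="running total")
ax.set_xlabel("day")
ax.set_ylabel("value")
ax.legend()
```

If the two series had very different scales, a second y-axis created with ax.twinx() would be the usual alternative, at the cost of no longer sharing the y-axis.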

Yes, mplfinance allows you to plot multiple data sets, on the same plot, or on multiple subplots, where each one can be any of candlestick, ohlc-bars, line, scatter, or bar chart.
For more information, see for example:
Adding Your Own Technical Studies to Plots
Subplots: Multiple Plots on a Single Figure, including:
The Panels Method
External Axes Method
Note, as a general rule, it is recommended to not use the "External Axes Method" if what you are trying to accomplish can be done otherwise with mplfinance in panels mode.

How to put the legend on first subplot of seaborn.FacetGrid?

I have a pandas DataFrame df which I visualize with subplots of a seaborn.barplot. My problem is that I want to move my legend inside one of the subplots.
To create subplots based on a condition (in my case Area), I use seaborn.FacetGrid. This is the code I use:
import matplotlib.pyplot as plt
import matplotlib
import seaborn as sns
# .. load data
grid = sns.FacetGrid(df, col="Area", col_order=['F1','F2','F3'])
bp = grid.map(sns.barplot,'Param','Time','Method')
bp.add_legend()
bp.set_titles("{col_name}")
bp.set_ylabels("Time (s)")
bp.set_xlabels("Number")
plt.show()
Which generates this plot:
You see that the legend here is totally at the right, but I would like to have it inside one of the plots (for example the left one) since my original data labels are quite long and the legend occupies too much space. This is the example for only 1 plot where the legend is inside the plot:
and the code:
mask = df['Area']=='F3'
ax=sns.barplot(x='Param',y='Time',hue='Method',data=df[mask])
plt.show()
Test 1:
I tried the example of an answer where they have the legend in one of the subplots:
grid = sns.FacetGrid(df, col="Area", col_order=['F1','F2','F3'])
bp = grid.map(sns.barplot,'Param','Time','Method')
Ax = bp.axes[0]
Boxes = [item for item in Ax.get_children()
if isinstance(item, matplotlib.patches.Rectangle)][:-1]
legend_labels = ['So1', 'So2', 'So3', 'So4', 'So5']
# Create the legend patches
legend_patches = [matplotlib.patches.Patch(color=C, label=L) for
C, L in zip([item.get_facecolor() for item in Boxes],
legend_labels)]
# Plot the legend
plt.legend(legend_patches)
plt.show()
Note that plt.legend(handles=legend_patches) did not work for me, so I use plt.legend(legend_patches) instead, as suggested in a comment on this answer. The result, however, is:
As you see the legend is in the third subplot and neither the colors nor labels match.
Test 2:
Finally I tried to create a subplot with a column wrap of 2 (col_wrap=2) with the idea of having the legend in the right-bottom square:
grid = sns.FacetGrid(df, col="MapPubName", col_order=['F1','F2','F3'],col_wrap=2)
but this also results in the legend being at the right:
Question: How can I get the legend inside the first subplot? Or how can I move the legend to anywhere in the grid?

You can set the legend on the specific axes you want, by using grid.axes[i][j].legend()
For your case of a 1 row, 3 column grid, you want to set grid.axes[0][0].legend() to plot on the left hand side.
Here's a simple example derived from your code, but changed to account for the sample dataset.
import matplotlib.pyplot as plt
import matplotlib
import seaborn as sns
df = sns.load_dataset("tips")
grid = sns.FacetGrid(df, col="day")
bp = grid.map(sns.barplot,"time",'total_bill','sex')
grid.axes[0][0].legend()
bp.set_titles("{col_name}")
bp.set_ylabels("Time (s)")
bp.set_xlabels("Number")
plt.show()

Use the legend_out=False option.
If you are making a faceted bar plot, you should use catplot (called factorplot before seaborn 0.9.0) with kind="bar". Otherwise, if you don't explicitly specify the order for each facet, your plot may end up being wrong.
import seaborn as sns
tips = sns.load_dataset("tips")
sns.catplot(x="sex", y="total_bill", hue="smoker", col="day",
            data=tips, kind="bar", aspect=.7, legend_out=False)

Plotting correlation heatmaps with Seaborn FacetGrid

I am trying to create a single image with heatmaps representing the correlation of features of data points for each label separately. With seaborn I can create a heatmap for a single class like so
grouped = df.groupby('target')
sns.heatmap(grouped.get_group('Class_1').corr())
And I get this, which makes sense:
But then I try to make a list of all the labels like so:
g = sns.FacetGrid(df, col='target')
g.map(lambda grp: sns.heatmap(grp.corr()))
And sadly I get this which makes no sense to me:

Turns out you can do it pretty concisely with just seaborn if you use map_dataframe instead of map:
g = sns.FacetGrid(df, col='target')
g.map_dataframe(lambda data, color: sns.heatmap(data.corr(), linewidths=0))
@mwaskom points out in his comment that it might be a good idea to explicitly set the limits of the colormap so that the different facets can be more directly compared. The documentation describes the relevant heatmap parameters:
vmin, vmax : floats, optional
Values to anchor the colormap, otherwise they are inferred from the data and other keyword arguments.
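A minimal sketch of fixing the color limits across facets might look as follows. The frame `demo`, its column names, and the helper `corr_heatmap` are hypothetical, introduced only for illustration. The helper drops non-numeric columns before calling corr(), since recent pandas versions raise an error when a string column such as the label is included.

```python
import numpy as np
import pandas as pd
import seaborn as sns

# Hypothetical data: four numeric features plus a 'target' label column
rng = np.random.default_rng(0)
demo = pd.DataFrame(rng.normal(size=(90, 4)), columns=list("abcd"))
demo["target"] = np.repeat(["Class_1", "Class_2", "Class_3"], 30)

def corr_heatmap(data, color=None, **heatmap_kwargs):
    # map_dataframe passes the rows of one facet as `data`, plus a `color`;
    # the color is ignored because the heatmap uses a colormap instead
    corr = data.select_dtypes("number").corr()
    sns.heatmap(corr, **heatmap_kwargs)

g = sns.FacetGrid(demo, col="target")
# Correlations lie in [-1, 1], so fixed limits make the facets comparable
g.map_dataframe(corr_heatmap, vmin=-1, vmax=1, cmap="coolwarm",
                cbar=False, linewidths=0)
```

With identical vmin and vmax in every facet, the same color always denotes the same correlation value, so a single colorbar would be enough for the whole grid.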

Without FacetGrid, but making a corr heatmap for each group in a column:
import pandas as pd
import seaborn as sns
from numpy.random import randint
import matplotlib.pyplot as plt
df = pd.DataFrame(randint(0,10,(200,12)),columns=list('abcdefghijkl'))
grouped = df.groupby('a')
rowlength = (grouped.ngroups + 1) // 2 # integer column count, rounded up for an odd number of groups
fig, axs = plt.subplots(figsize=(9,4), nrows=2, ncols=rowlength)
targets = zip(grouped.groups.keys(), axs.flatten())
for i, (key, ax) in enumerate(targets):
    sns.heatmap(grouped.get_group(key).corr(), ax=ax,
                xticklabels=(i >= rowlength),
                yticklabels=(i%rowlength==0),
                cbar=False) # Use cbar_ax into single side axis
    ax.set_title('a=%d'%key)
plt.show()
Maybe there's a way to set up a lambda to correctly pass the data from the g.facet_data() generator through corr before going to heatmap.
