Hue and shared x-axis not working Seaborn facet grid - python

I'm aiming to plot a stacked chart that displays normalised values from a pandas df. Using the code below, each unique value in Item has its own row. Within each row I want a stacked chart of normalised counts for each Label, with Num along the x-axis.
However, hue seems to assign a different set of colours in each Item facet. They aren't consistent: for example, A in Up is blue, while A in Right is green.
I'm also hoping to share the x-axis so that Num is consistent across Items. At the moment the bars aren't aligned with their respective x-axis values.
import pandas as pd
import numpy as np
import seaborn as sns
df = pd.DataFrame({
'Num' : [1,2,1,2,3,2,1,3,2,2,1,2,3,3,1,3],
'Label' : ['A','B','C','B','B','C','C','B','B','A','C','A','B','A','C','A'],
'Item' : ['Up','Left','Up','Left','Down','Right','Up','Down','Right','Down','Right','Up','Up','Right','Down','Left'],
})
g = sns.FacetGrid(df,
row = 'Item',
row_order = ['Up','Right','Down','Left'],
aspect = 2,
height = 4,
sharex = True,
legend_out = True
)
g.map_dataframe(sns.histplot, x = 'Num', hue = 'Label', multiple = 'fill', shrink = 0.8, binwidth = 1)
g.add_legend()

Using FacetGrid directly can be tricky; it is basically doing a group-by and a for loop over the axes, and it does not track any function-specific state that would be needed to make sure that the answer to questions like "what order should be used for each hue level" is the same in each facet. So you would need to supply that information somehow (e.g. with hue_order or by passing a palette dictionary). In fact, there is a warning in the documentation to this effect.
But you generally don't need to use FacetGrid directly; you can use one of the figure-level functions, which do all of the bookkeeping for you to make sure that information is aligned across facets. Here you would use displot:
sns.displot(
data=df, x="Num", hue="Label",
row='Item', row_order=['Up','Right','Down','Left'],
multiple="fill", shrink=.8, discrete=True,
aspect=4, height=2,
)
Note that I've made one other change to your code here, which is to use discrete=True instead of binwidth=1, which is what I think you want.

Related

Plotting 3 variables on dataframe [duplicate]

data = {'name': ['A', 'B', 'C', 'D'],
'score': [-9.5, -8.3, -8.1, -7.0],
'color': [4, 3, 2, 1]}
df = pd.DataFrame(data)
I have my data in a dataframe like above and I am plotting it to a seaborn swarmplot like the one below. The points are plotted based on their score, and depending on how that falls between the 3 dotted lines, I want to color the points differently. I use the 'color' column of the df to assign a key based on where the 'score' values fall that corresponds to colors and a dictionary.
colors = {1:'pink', 2:'orange', 3:'red', 4:'green'}
I then create the swarmplot with the below code and map the color dictionary to the colors column of my df.
ax = sns.swarmplot(data=df, y='score', s=10, c=df['color'].map(colors))
When I do this I don't generate any errors, but no colors are applied, and the points remain their default blue (image below, left). So, how can I assign colors to points in a seaborn swarmplot based on my df['color'] column?
Final note: When I try to use palette=df['color'].map(colors) instead of c=df['color'].map(colors), the graph just changes everything to the last color in my colors dictionary (image below, right)
Edit: Thank you for your suggestion @Trenton McKinney. It is somewhat successful in that the colors do properly map, but when I include the 'name' column (it's actually Title) for x, as below, my points are plotted like a scatter plot instead of a swarm plot. But I get an error if I try to remove x='Title' from my parameters.
ax = sns.swarmplot(data=data, x='Title', y='score', s=10, hue='color', palette=colors)
As per seaborn issue 941, this is the expected behavior.
Seems like the API doesn't play well when specifying only x or only y.
The issue is resolved by passing a list of the same strings based on the length of the dataframe: ['']*len(df) or ['text']*len(df)
ax = sns.swarmplot(data=df, x=['']*len(df), y='score', hue='color', palette=colors)

seaborn/matplotlib change number of columns in legend object

I've seen Creating multi column legend in python seaborn plot but I think my question is a bit different. In short, I've got a dataframe that I'm plotting in seaborn's lmplot and getting a FacetGrid. Trouble is, there are tons of values for hue so I get a super long, single column legend. Code example below:
ers = sns.lmplot(
data=emorb,
x="Pb",
y="Nd",
row="Ridge Sys",
hue="Seg Name",
scatter=True,
fit_reg=False,
scatter_kws={"alpha":0.7, "edgecolor": "w"},
palette=sns.color_palette("bright", 20),
legend=True
)
ers.set(ylim=(0.5122,0.5134))
I can access the legend object that is created by calling ers._legend and this returns an object with type Legend (basically, a matplotlib object). However, I can't then call to this legend object to change the number of columns, e.g., with:
l = ers._legend
l(ncols=9)
Any suggestions, or am I missing something perhaps more obvious, such as a way to redraw the legend and specify any parameters?
Thanks.
Whoops, figured it out:
The FacetGrid object has an attribute fig, i.e.
g = sns.lmplot()
parent_mpl_figure = g.fig
And so if I set legend=False in sns.lmplot(), I can then specify parent_mpl_figure.legend(labels=[], ncol=9, bbox_to_anchor=(1,1)).
Written cleanly:
g = sns.lmplot(legend = False)
parent_mpl_figure = g.fig
parent_mpl_figure.legend(labels = [], ncol = 9, bbox_to_anchor = (1,1))
Hope this is instructive for someone else / now to figure out how to have each Facet span the full color palette so that different hue groups within each Facet group are easier to distinguish...
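Another possible route, sketched here on the assumption that your seaborn version forwards extra keyword arguments from FacetGrid.add_legend to matplotlib's legend: suppress the automatic legend and add it back with the column count you want. The random data below is made up purely so the snippet runs on its own; only the column names are borrowed from the question.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import numpy as np
import pandas as pd
import seaborn as sns

# Hypothetical stand-in for the emorb dataframe
rng = np.random.default_rng(0)
n = 60
demo = pd.DataFrame({
    "Pb": rng.normal(size=n),
    "Nd": rng.normal(size=n),
    "Ridge Sys": np.tile(["ridge 1", "ridge 2"], n // 2),
    "Seg Name": np.tile([f"seg {i}" for i in range(6)], n // 6),
})

g = sns.lmplot(
    data=demo, x="Pb", y="Nd",
    row="Ridge Sys", hue="Seg Name",
    fit_reg=False, legend=False,  # skip the default single-column legend
)
# Extra keyword arguments are handed to the matplotlib legend call
g.add_legend(ncol=3)
```

This keeps the handles and labels that seaborn collected for the hue levels, so there is no need to rebuild them by hand.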

set axis limits on individual facets of seaborn facetgrid

I'm trying to set the x-axis limits to different values for each facet of a Seaborn FacetGrid distplot. I understand that I can get access to all the axes within the subplots through g.axes, so I've tried to iterate over them and set the xlim with:
g = sns.FacetGrid(
mapping,
col=options.facetCol,
row=options.facetRow,
col_order=sorted(cols),
hue=options.group,
)
g = g.map(sns.distplot, options.axis)
for i, ax in enumerate(g.axes.flat):  # set every other axis for testing purposes
    if i % 2 == 0:
        ax.set_xlim(-400, 500)
    else:
        ax.set_xlim(-200, 200)
However, when I do this, all axes get set to (-200, 200) not just every other facet.
What am I doing wrong?
mwaskom had the solution; posting here for completeness - just had to change the following line to:
g = sns.FacetGrid(
mapping,
col=options.facetCol,
row=options.facetRow,
col_order=sorted(cols),
hue=options.group,
sharex=False, # <- This option solved the problem!
)
As suggested by mwaskom you can simply use FacetGrid's sharex (respectively sharey) to allow plots to have independent axis scales:
share{x,y} : bool, ‘col’, or ‘row’ optional
If true, the facets will share y axes across columns and/or x axes across rows.
For example, with:
sharex=False each plot has its own axis
sharex='col' each column has its own axis
sharex='row' each row has its own axis (even if this one doesn't make too much sense to me)
sns.FacetGrid(data, ..., sharex='col')
If you use FacetGrid indirectly, for example via displot or relplot, you will have to use the facet_kws keyword argument:
sns.displot(data, ..., facet_kws={'sharex': 'col'})

Is there a way to make matplotlib scatter plot marker or color according to a discrete variable in a different column?

I'm making scatterplots out of a DF using matplotlib. In order to get different colors for each data set, I'm making two separate calls to plt.scatter:
plt.scatter(zzz['HFmV'], zzz['LFmV'], label = dut_groups[0], color = 'r' )
plt.scatter(qqq['HFmV'], qqq['LFmV'], label = dut_groups[1], color = 'b' )
plt.legend()
plt.show()
This gives me the desired color dependence but really what would be ideal is if I could just get pandas to give me the scatterplot with several datasets on the same plot by something like
df.plot(kind = scatter(x,y, color = df.Group, marker = df.Head)
Apparently there is no such animal (at least that I could find). So the next best thing in my mind would be to put the plt.scatter calls into a loop where I could make the color or marker vary according to one of the columns (not x or y, but some other column). If the column I want to use were a continuous variable, it looks like I could use a colormap, but in my case the column I need to use for this is a string (a categorical type of variable, not a number).
Any help much appreciated.
What you're doing will almost work, but you have to pass color a vector of colors, not just a vector of variables. So you could do:
color = df.Group.map({dut_groups[0]: "r", dut_groups[1]: "b"})
plt.scatter(x, y, color=color)
Same goes for the marker style
You could also use seaborn to do the color-mapping the way you expect (as discussed here), although it doesn't do marker style mapping:
import seaborn as sns
import pandas as pd
from numpy.random import randn
data = pd.DataFrame(dict(x=randn(40), y=randn(40), g=["a", "b"] * 20))
sns.lmplot(data=data, x="x", y="y", hue="g", fit_reg=False)
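For the marker side, as far as I know plt.scatter takes a single marker per call, so one workable sketch is to loop over the groups and issue one scatter call per combination. The column names follow the question, but the values and the two lookup dictionaries (color_map, marker_map) are made up for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical data using the column names from the question
df = pd.DataFrame({
    "HFmV": [1, 2, 3, 4, 5, 6],
    "LFmV": [2, 1, 4, 3, 6, 5],
    "Group": ["g1", "g1", "g2", "g2", "g1", "g2"],
    "Head": ["h1", "h2", "h1", "h2", "h1", "h2"],
})

# Assumed lookups from category to colour and marker
color_map = {"g1": "r", "g2": "b"}
marker_map = {"h1": "o", "h2": "s"}

fig, ax = plt.subplots()
# One scatter call per (Group, Head) combination
for (group, head), sub in df.groupby(["Group", "Head"]):
    ax.scatter(
        sub["HFmV"], sub["LFmV"],
        color=color_map[group], marker=marker_map[head],
        label=f"{group} / {head}",
    )
ax.legend()
```

Each combination gets its own legend entry, which is usually what you want when both colour and marker carry meaning.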

seaborn FacetGrid , ranked barplot separated on row and col

Given some data:
pt = pd.DataFrame({'alrmV':[000,000,000,101,101,111,111],
'he':['e','e','e','e','h','e','e'],
'inc':[0,0,0,0,0,1,1]})
I would like to create a bar plot separated on row and col.
g = sns.FacetGrid(pt, row='inc', col='he', margin_titles=True)
g.map( sns.barplot(pt['alrmV']), color='steelblue')
This works, but how do I also add:
an ordered x-axis
only display the top-two-by-count alrmV types
To get an ordered x-axis that displays the top 2 count types, I played around with this grouping, but was unable to get it into a FacetGrid:
grouped = pt.groupby( ['he','inc'] )
grw= grouped['alrmV'].value_counts().fillna(0.) #.unstack().fillna(0.)
grw[:2].plot(kind='bar')
Using FacetGrid, slicing limits the total count displayed
g.map(sns.barplot(pt['alrmV'][:10]), color='steelblue')
So how can I get a bar graph, that is separated on row and col, and is ordered and displays only top 2 counts?
I couldn't get the example to work with the data you provided, so I'll use one of the example datasets to demonstrate:
import seaborn as sns
tips = sns.load_dataset("tips")
We'll make a plot with sex in the columns, smoker in the rows, using day as the x variable for the barplot. To get the top two days in order, we could do
top_two_ordered = tips.day.value_counts().order().index[-2:]
Then you can pass this list to the x_order argument of barplot.
Although you can use FacetGrid directly here, it's probably easier to use the factorplot function:
g = sns.factorplot("day", col="sex", row="smoker",
data=tips, margin_titles=True, size=3,
x_order=top_two_ordered)
Which draws:
While I wouldn't recommend doing exactly what you proposed (plotting bars for different x values in each facet), it could be accomplished by doing something like
g = sns.FacetGrid(tips, col="sex", row="smoker", sharex=False)
def ordered_barplot(data, **kws):
    x_order = data.day.value_counts().order().index[-2:]
    sns.barplot(data.day, x_order=x_order)
g.map_dataframe(ordered_barplot)
to make
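The snippets in this answer use names from older releases. As far as I'm aware, newer pandas replaced Series.order() with sort_values(), and newer seaborn renamed factorplot to catplot, size to height, and x_order to order. A rough equivalent of the first approach under those assumptions is sketched below; it builds a small made-up tips-like table instead of downloading the example dataset.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import pandas as pd
import seaborn as sns

# Hypothetical stand-in for the tips dataset: per (sex, smoker) pair,
# three Sat rows, two Sun rows and one Thur row
rows = []
for sex in ["Male", "Female"]:
    for smoker in ["Yes", "No"]:
        rows += [(sex, smoker, "Sat")] * 3
        rows += [(sex, smoker, "Sun")] * 2
        rows += [(sex, smoker, "Thur")] * 1
tips_like = pd.DataFrame(rows, columns=["sex", "smoker", "day"])

# Two most frequent days, in ascending order of count
top_two_ordered = tips_like["day"].value_counts().sort_values().index[-2:]

# kind="count" draws one bar per category, restricted to the chosen order
g = sns.catplot(
    data=tips_like, x="day", col="sex", row="smoker",
    kind="count", order=list(top_two_ordered),
    margin_titles=True, height=3,
)
```

The per-facet variant with map_dataframe would need the same renames inside ordered_barplot.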
