creating plot matrix with relplot in seaborn - python

I am trying to add multiple plots and create a matrix plot with seaborn. Unfortunately, Python gives me the following warning:
"relplot is a figure-level function and does not accept target axes. You may wish to try scatterplot"
fig, axes = plt.subplots(nrows=5, ncols=5, figsize=(20, 20), sharex=True, sharey=True)
for i in range(5):
    for j in range(5):
        axes[i][j] = seaborn.relplot(x=col[i+2], y=col[j+2], data=df, ax=axes[i][j])
I would like to know if there's any method with which I can combine all the plots plotted with relplot.

Hi Kinto, welcome to StackOverflow!
relplot works differently than for example scatterplot. With relplot you don't need to define subplots and loop over them. Instead you can say what you would like to vary on each row or column of a graph.
For an example from the documentation:
import seaborn as sns
sns.set(style="ticks")
tips = sns.load_dataset("tips")
g = sns.relplot(
x="total_bill", y="tip", hue="day",
col="time", row="sex", data=tips
)
This says: on each subplot, plot the total bill on the x-axis and the tip on the y-axis, and vary the hue within a subplot by day. Then, for each column of subplots, plot the data for one unique value of the "time" column of the tips dataset. In this case there are two unique times: "Lunch" and "Dinner". Finally, vary "sex" across the subplot rows. In this case there are two values of "sex": "Male" and "Female", so the first row shows the male tipping behavior and the second the female tipping behavior.
I'm not sure what your data looks like, but hopefully this explanation helps you.
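If you do want to keep the explicit grid and loop from the question, the warning itself points at the way out: scatterplot is an axes-level function and accepts ax=. Here is a minimal sketch of that approach. The DataFrame and its column names are made up for illustration, since the original data is not shown. (For a full matrix of pairwise scatter plots, sns.pairplot(df) is another option.)

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

# Hypothetical stand-in for the asker's data: five numeric columns.
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.normal(size=(50, 5)), columns=list("abcde"))
cols = df.columns
n = len(cols)

fig, axes = plt.subplots(nrows=n, ncols=n, figsize=(12, 12),
                         sharex=True, sharey=True)
for i in range(n):
    for j in range(n):
        # scatterplot is axes-level, so it draws onto the axes passed via ax=
        sns.scatterplot(x=cols[i], y=cols[j], data=df, ax=axes[i][j])
```

Note that nothing is assigned back into axes[i][j]: the function draws onto the existing axes, so the array from plt.subplots stays intact.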

Related

Show scatter plot title from column value

I'm automatically generating scatter plots with regression lines from my example dataframe.
I want to plot the correlation of Column2 with Column3 and of Column2 with Column4 in separate scatter plots grouped by Column1. For example, there will be 3 scatter plots of Column2 against Column3, titled A, B, and C.
For the plotting I'm using the pandas scatter plot, for example:
df.groupby('Column1').plot.scatter(x='Column2',y='Column3')
The scatter plot call returns exactly 3 plots, but I want to know how to add a chart title based on the grouping from Column1, and also how to add the regression line. I haven't used seaborn or matplotlib yet because they're quite confusing for me, so I'd appreciate it if you could explain more :))
EDIT 1
Sorry for not being clear before. My code was previously running fine, but with output like this.
It's OK, but it's hard to see which plot belongs to which group. Hence my intention is to create something like this
EDIT 2
Shoutout to everyone who is kindly helping. Caina especially helped me a lot, so a shoutout to him as well.
This is the code he wrote, based on several additions in the comments.
fig, axes = plt.subplots(1, df.Column1.nunique(), figsize=(12,8))
groups = df.groupby('Column1')
fig.tight_layout(pad=3)
# If `fig.tight_layout(pad=3)` does not work:
# plt.subplots_adjust(wspace=0.5)
for i, (gname, gdata) in enumerate(groups):
    sns.regplot(x='Column2', y='Column3', data=gdata, ax=axes[i])
    axes[i].set_title(gname)
    axes[i].set_ylim(0,)
And this is my result plot
The titles and the axes work beautifully. This thread can be considered closed, as I got the help I needed for the plot.
But as you can see, the bottom plot looks odd: its x-axis runs from 0 to 600 and its y-axis from 0 to 25, although all of the plots should have the same y-axis format. The other charts are also stretched horizontally.
I'm trying to use the method here, but without much success using the parameters square or equal.
Can I ask for further help on how to adjust the axes so that each plot is square? Thank you!
You can iterate over your groups, specifying the title:
fig, axes = plt.subplots(1, df.Column1.nunique(), figsize=(12,8))
groups = df.groupby('Column1')
for i, (gname, gdata) in enumerate(groups):
    sns.regplot(x='Column2', y='Column3', data=gdata, ax=axes[i]).set_title(gname)
You can also use seaborn.FacetGrid directly:
g = sns.FacetGrid(df, col='Column1')
g.map(sns.regplot, 'Column2', 'Column3')
Edit (further customization based on new requirements in comments):
fig, axes = plt.subplots(1, df.Column1.nunique(), figsize=(12,8))
groups = df.groupby('Column1')
fig.tight_layout(pad=3)
# If `fig.tight_layout(pad=3)` does not work:
# plt.subplots_adjust(wspace=0.5)
for i, (gname, gdata) in enumerate(groups):
    sns.regplot(x='Column2', y='Column3', data=gdata, ax=axes[i])
    axes[i].set_title(gname)
    axes[i].set_ylim(0,)
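For the follow-up about square panels with a common y-axis, here is one possible sketch. It assumes matplotlib 3.3 or newer (for Axes.set_box_aspect) and uses made-up data with the column names from the question. sharey=True makes all panels use the same y-limits, and set_box_aspect(1) makes each plotting area square regardless of the data ranges.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

# Hypothetical data using the column names from the question.
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "Column1": np.repeat(["A", "B", "C"], 20),
    "Column2": rng.uniform(0, 600, size=60),
    "Column3": rng.uniform(0, 25, size=60),
})

groups = df.groupby("Column1")
fig, axes = plt.subplots(1, groups.ngroups, figsize=(12, 4), sharey=True)
for ax, (gname, gdata) in zip(axes, groups):
    sns.regplot(x="Column2", y="Column3", data=gdata, ax=ax)
    ax.set_title(gname)
    ax.set_box_aspect(1)  # square plotting area, independent of data limits

# The y-axis is shared, so setting the lower limit once applies to every panel.
axes[0].set_ylim(bottom=0)
fig.tight_layout()
```

Setting the limit after the loop keeps autoscaling active while the groups are drawn, so the shared upper limit still covers the largest group.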
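You can also use seaborn's built-in functionality. If you want to see the correlations, you can just call df.corr() and plot the result as a heatmap.
import seaborn as sns
sns.heatmap(df.corr(numeric_only=True), annot=True)  # heatmap of pairwise correlations; numeric_only skips the text column Column1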
You can also use pairplot.
sns.set_style('whitegrid')
sns.set(rc={'figure.figsize':(13.7,10.27)})
sns.set(font_scale=1.3)
sns.set_palette("cubehelix",8)
sns.pairplot(df)
Here's what I did. You can try something like this.
fig, axes = plt.subplots(ncols=3, figsize=(12, 6))
plt.subplots_adjust(wspace=0.5, hspace=0.3)
# Select each group explicitly so that every subplot shows only its own rows
df[df['Column1'] == 'A'].plot.scatter(x='Column2', y='Column3', color="DarkBlue", ax=axes[0])
axes[0].set_title("A")
df[df['Column1'] == 'B'].plot.scatter(x='Column2', y='Column3', color="DarkBlue", ax=axes[1])
axes[1].set_title("B")
df[df['Column1'] == 'C'].plot.scatter(x='Column2', y='Column3', color="DarkBlue", ax=axes[2])
axes[2].set_title("C")
plt.show()
The output of this is:
If you want to see the answers vertically, then change ncols to nrows and fix the figsize.
fig, axes = plt.subplots(nrows=3, figsize=(3, 12))
plt.subplots_adjust(wspace=0.2, hspace=.5)
# Select each group explicitly so that every subplot shows only its own rows
df[df['Column1'] == 'A'].plot.scatter(x='Column2', y='Column3', color="DarkBlue", ax=axes[0])
axes[0].set_title("A")
df[df['Column1'] == 'B'].plot.scatter(x='Column2', y='Column3', color="DarkBlue", ax=axes[1])
axes[1].set_title("B")
df[df['Column1'] == 'C'].plot.scatter(x='Column2', y='Column3', color="DarkBlue", ax=axes[2])
axes[2].set_title("C")
plt.show()
This will give you:

Seaborn lineplot without lines between points

How can I use the lineplot plotting function in seaborn to create a plot with no lines connecting the points? I know the function is called lineplot, but it has the useful feature of merging all datapoints with the same x value and plotting a single mean and confidence interval.
tips = sns.load_dataset('tips')
sns.lineplot(x='size', y='total_bill', data=tips, marker='o', err_style='bars')
How do I plot without the line? I'm not sure of a better way to phrase my question. How can I plot points only? Lineless lineplot?
I know that seaborn has a pointplot function, but that is for categorical data. In some cases, my x-values are continuous values, so pointplot would not work.
I realize one could get into the matplotlib figure artists and delete the line, but that gets more complicated as the amount of stuff on the plot increases. I was wondering if there are some sort of arguments that can be passed to the lineplot function.
To get error bars without the connecting lines, you can set the linestyle parameter to '':
import seaborn as sns
tips = sns.load_dataset('tips')
sns.lineplot(x='size', y='total_bill', data=tips, marker='o', linestyle='', err_style='bars')
Other types of linestyle could also be interesting, for example "a loosely dotted line": sns.lineplot(..., linestyle=(0, (1, 10)))
I recommend using pointplot and setting join=False.
For me only join = True works.
sns.pointplot(data=df, x = "x_attribute", y = "y_attribute", ci= 95, join=False)

Seaborn PairPlot rotate x tick labels. Categorical data labels are overlapping

I'm trying to create plots which show the correlation of the "value" parameter to different categorical parameters. Here's what I have so far:
plot = sns.pairplot(df, x_vars=['country', 'tier_code', 'industry', 'company_size', 'region'], y_vars=['value'], height=10)
Which produces the following set of plots:
As you can see, the x axis is extremely crowded for the "country" and "industry" plots. I would like to rotate the category labels 90 degrees so that they wouldn't overlap.
All the examples for rotating I could find were for other kinds of plots and didn't work for the pairplot. I could probably get it to work if I made each plot separately using catplot, but I would like to make them all at once. Is that possible?
I am using Google Colab in case it makes any difference. My seaborn version number is 0.10.0.
Manish's answer uses the get_xticklabels method, which doesn't always play well with the higher level seaborn functions in my experience. So here's a solution avoiding that. Since I don't have your data, I'm using seaborn's tips dataset for an example.
I'm naming the object returned by sns.pairplot() grid, just to remind us that it is a PairGrid instance. In general, its axes attribute yields a two-dimensional array of axes objects, corresponding to the subplot grid. So I'm using the flat method to turn this into a one-dimensional array, although it wouldn't be necessary in your special case with only one row of subplots.
In my example I don't want to rotate the labels for the third subplot, as they are single digits, so I slice the axes array accordingly with [:2].
import seaborn as sns
sns.set()
tips = sns.load_dataset("tips")
grid = sns.pairplot(tips, x_vars=['sex', 'day', 'size'], y_vars=['tip'])
for ax in grid.axes.flat[:2]:
    ax.tick_params(axis='x', labelrotation=90)
You can rotate x-axis labels as:
plot = sns.pairplot(df, x_vars=['country', 'tier_code', 'industry', 'company_size', 'region'],
                    y_vars=['value'], height=10)
rotation = 90
for axis in plot.fig.axes:  # get all the axes
    axis.set_xticklabels(axis.get_xticklabels(), rotation=rotation)
plot.fig.show()
Hope it helps.

Matplotlib Subplot axes sharing: Apply to every other plot?

I am trying to find a way to apply the shared axes parameters of subplot() to every other plot in a series of subplots.
I've got the following code, which uses data from RPM4, based on rows in fpD
fig, ax = plt.subplots(2*(fpD['name'].count()), sharex=True, figsize=(6,fpD['name'].count()*2),
                       gridspec_kw={'height_ratios':[5,1]*fpD['name'].count()})
for i, r in fpD.iterrows():
    RPM4[RPM4['name'] == RPM3.iloc[i,0]].plot(x='date', y='RPM', ax=ax[(2*i)], legend=False)
    RPM4[RPM4['name'] == RPM3.iloc[i,0]].plot(kind='area', color='lightgrey', x='date', y='total', ax=ax[(2*i)+1],
                                              legend=False,)
    ax[2*i].set_title('test', fontsize=12)
plt.tight_layout()
Which produces an output that is very close to what I need. It loops through the 'name' column in a table and produces two plots for each, and displays them as subplots:
As you can see, the sharex parameter works fine for me here, since I want all the plots to share the same axis.
However, what I'd really like is for all the even-numbered (bigger) plots to share the same y axis, and for the odd-numbered (small grey) plots to all share a different y axis.
Any help on accomplishing this is much appreciated, thanks!
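One way to get this, sketched under assumptions: keep sharex=True in plt.subplots, then link the y-axes by hand with Axes.sharey (available in matplotlib 3.3 and newer), tying every even-numbered axes to the first one and every odd-numbered axes to the second. The data below is random and purely illustrative. It stands in for the RPM4 and fpD tables, which are not shown in the question.

```python
import numpy as np
import matplotlib.pyplot as plt

n_groups = 3  # hypothetical stand-in for fpD['name'].count()
fig, ax = plt.subplots(2 * n_groups, sharex=True, figsize=(6, 2 * n_groups),
                       gridspec_kw={'height_ratios': [5, 1] * n_groups})

# Link the y-axes before plotting: big plots follow ax[0], small ones ax[1].
for i in range(1, n_groups):
    ax[2 * i].sharey(ax[0])
    ax[2 * i + 1].sharey(ax[1])

rng = np.random.default_rng(0)
x = np.arange(20)
for i in range(n_groups):
    # Illustrative data with a different scale per group.
    ax[2 * i].plot(x, rng.normal(size=20) * (i + 1))
    ax[2 * i + 1].fill_between(x, rng.uniform(0, 10 * (i + 1), size=20),
                               color='lightgrey')
    ax[2 * i].set_title('test', fontsize=12)
fig.tight_layout()
```

Because sharey is called once per axes and only within its own family, the two families keep independent y-limits while all axes still share x.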

seaborn barplot with labels for x values (and no hue)

My dataframe contains two columns, and I would like to plot their values in a barplot, like this:
import seaborn as sns
# load sample data and drop all but two columns
tips = sns.load_dataset("tips")
tips= tips[["day", "total_bill"]]
sns.set(style="whitegrid")
ax = sns.barplot(x="day", y="total_bill", data=tips)
On top of this barplot, I would also like to add a legend with labels for each x value. Seaborn supports this, but as far as I can see, it works only when you specify a hue argument. Each label in the legend then corresponds to a hue value.
Can I create a legend with explanations for the x values?
This might be a confusing question. I don't want to rename the label for the axis or the ticks along the axis. Instead, I would like to have a separate legend with additional explanations. My bars give me some nice space to put this legend and the explanations would be too long to have them as ticks.
Is this what you want?
sns.set(style="whitegrid")
ax = sns.barplot(x="day", y="total_bill", data=tips)
ax.legend(ax.patches, ['1','2','3','Something that I can\'t say'], loc=[1.01,0.5])
Output:
