How can I use the lineplot plotting function in seaborn to create a plot with no lines connecting between the points. I know the function is called lineplot, but it has the useful feature of merging all datapoints with the same x value and plotting a single mean and confidence interval.
tips = sns.load_dataset('tips')
sns.lineplot(x='size', y='total_bill', data=tips, marker='o', err_style='bars')
How do I plot without the line? I'm not sure of a better way to phrase my question. How can I plot points only? Lineless lineplot?
I know that seaborn has a pointplot function, but that is for categorical data. In some cases, my x-values are continuous values, so pointplot would not work.
I realize one could get into the matplotlib figure artists and delete the line, but that gets more complicated as the amount of stuff on the plot increases. I was wondering if there are some sort of arguments that can be passed to the lineplot function.
To get error bars without the connecting lines, you can set the linestyle parameter to '':
import seaborn as sns
tips = sns.load_dataset('tips')
sns.lineplot(x='size', y='total_bill', data=tips, marker='o', linestyle='', err_style='bars')
Other types of linestyle could also be interesting, for example "a loosely dotted line": sns.lineplot(..., linestyle=(0, (1, 10)))
I recommend setting join=False.
For me only join = True works.
sns.pointplot(data=df, x = "x_attribute", y = "y_attribute", ci= 95, join=False)
Related
I try to figure out how to create scatter plot in matplotlib with two different y-axis values.
Now i have one and need to add second with index column values on y.
points1 = plt.scatter(r3_load["TimeUTC"], r3_load["r3_load_MW"],
c=r3_load["r3_load_MW"], s=50, cmap="rainbow", alpha=1) #set style options
plt.rcParams['figure.figsize'] = [20,10]
#plt.colorbar(points)
plt.title("timeUTC vs Load")
#plt.xlim(0, 400)
#plt.ylim(0, 300)
plt.xlabel('timeUTC')
plt.ylabel('Load_MW')
cbar = plt.colorbar(points1)
cbar.set_label('Load')
Result i expect is like this:
So second scatter set should be for TimeUTC vs index. Colors are not the subject;) also in excel y-axes are different sites, but doesnt matter.
Appriciate your help! Thanks, Paulina
Continuing after the suggestions in the comments.
There are two ways of using matplotlib.
Via the matplotlib.pyplot interface, like you were doing in your original code snippet with .plt
The object-oriented way. This is the suggested way to use matplotlib, especially when you need more customisation like in your case. In your code, ax1 is an Axes instance.
From an Axes instance, you can plot your data using the Axes.plot and Axes.scatter methods, very similar to what you did through the pyplot interface. This means, you can write a Axes.scatter call instead of .plot and use the same parameters as in your original code:
fig, ax1 = plt.subplots()
ax2 = ax1.twinx()
ax1.scatter(r3_load["TimeUTC"], r3_load["r3_load_MW"],
c=r3_load["r3_load_MW"], s=50, cmap="rainbow", alpha=1)
ax2.plot(r3_dda249["TimeUTC"], r3_dda249.index, c='b', linestyle='-')
ax1.set_xlabel('TimeUTC')
ax1.set_ylabel('r3_load_MW', color='g')
ax2.set_ylabel('index', color='b')
plt.show()
I have a box plot that I create using the following command:
sns.boxplot(y='points_per_block', x='block', data=data, hue='habit_trial')
So the different colors represent whether the trial was a habit trial or not (0,1). I want to also plot the individual data points, which I tried to achieve using:
sns.stripplot(y='points_per_block', x='block', data=data, hue='habit_trial')
The result was the following
I want the individual points to display over the corresponding box plots. Is there a way to do this without resorting to hacking their positions in some manner? The problem comes from the fact that the separation of data using hue works differently for stripplot and boxplot but I would have thought that these would be easily combinable.
Thanks in advance.
Seaborn functions working with categorical data usually have a dodge= parameter indicating whether data with different hue should be separated a bit. For a boxplot, dodge defaults to True, as it usually would look bad without dodging. For a stripplot defaults to dodge=False.
The following example also shows how the legend can be updated (matplotlib 3.4 is needed for HandlerTuple):
import seaborn as sns
from matplotlib.legend_handler import HandlerTuple
tips = sns.load_dataset("tips")
ax = sns.boxplot(data=tips, x="day", y="total_bill",
hue="smoker", hue_order=['Yes', 'No'], boxprops={'alpha': 0.4})
sns.stripplot(data=tips, x="day", y="total_bill",
hue="smoker", hue_order=['Yes', 'No'], dodge=True, ax=ax)
handles, labels = ax.get_legend_handles_labels()
ax.legend(handles=[(handles[0], handles[2]), (handles[1], handles[3])],
labels=['Smoker', 'Non-smoker'],
loc='upper left', handlelength=4,
handler_map={tuple: HandlerTuple(ndivide=None)})
I use Seaborn/Matplotlib to display different outputs (time and distance for example) for different parameters. I would like to associate the two outputs on the same plot, thus I use seaborn's satplot and barplot.
My problem is I don't get the expected display. The graph is here but some noisy extra axis appear.
I'm running the following code
ax = plt.subplot(311)
ax2 = ax.twinx()
data = sns.load_dataset("tips")
sns.barplot(ax=ax, x="day",y="total_bill", hue="size" , data=data, ci=None)
ax.set_yscale("log")
sns.catplot(data=data, x="day", y="tip", ax=ax2, hue="size", kind="swarm", palette="bright")
And I have the following result :
Could you help me to remove this extra axis ? It is especially inconvenient when having multiple subplots.
The extra axis you see is the one returned by the catplot. To get rid of it, you can add the following line after the sns.catplot(...) where the index 2 refers to the count of the figure.
plt.close(2)
To test that, if you use plt.close(1), it will remove the main figure containing bar chart
The extra axes you see is the catplot you create. catplot is a figure-level function (i.e. it creates its own figure); and hence does not really have an ax argument. One could see it as bug that it still allows for it. What you would probably like to do is to create a sns.swarmplot instead, which does have the ax argument.
I would like to use sns.jointplot to visualise the association between X and Y in the presence of two groups. However, in
tips = sns.load_dataset("tips")
sns.jointplot("total_bill", "tip", data=tips)
there is no "hue" option as in other sns plots such as sns.scatterplot. How could one assign different colours for different groups (e.g. hue="smoker") in both the scatter plot, as well as the two overlapping density plots.
In R this could be done by creating a scatter plot with two marginal density plots as shown in here.
What is the equivalent in sns? If this is not possible in sns, is there another python package that can be used for this?
jointplot is a simple wrapper around sns.JointGrid. If you create a JointGrid object and add plots to it manually, you will have much more control over the individual plots.
In this case, your desired jointplot is simply a scatterplot combined with a kdeplot, and what you want to do is pass hue='smoker' (for example) to scatterplot.
The kdeplot is more complex; seaborn doesn't really support one KDE for each class, AFAIK, so I was forced to plot them individually (you could use a for loop with more classes).
Accordingly, you can do this:
import seaborn as sns
tips = sns.load_dataset('tips')
grid = sns.JointGrid(x='total_bill', y='tip', data=tips)
g = grid.plot_joint(sns.scatterplot, hue='smoker', data=tips)
sns.kdeplot(tips.loc[tips['smoker']=='Yes', 'total_bill'], ax=g.ax_marg_x, legend=False)
sns.kdeplot(tips.loc[tips['smoker']=='No', 'total_bill'], ax=g.ax_marg_x, legend=False)
sns.kdeplot(tips.loc[tips['smoker']=='Yes', 'tip'], ax=g.ax_marg_y, vertical=True, legend=False)
sns.kdeplot(tips.loc[tips['smoker']=='No', 'tip'], ax=g.ax_marg_y, vertical=True, legend=False)
I'm making some EDA using pandas and seaborn, this is the code I have to plot the histograms of a group of features:
skewed_data = pd.DataFrame.skew(data)
skewed_features =skewed_data.index
fig, axs = plt.subplots(ncols=len(skewed_features))
plt.ticklabel_format(style='sci', axis='both', scilimits=(0,0))
for i,skewed_feature in enumerate(skewed_features):
g = sns.distplot(data[column])
sns.distplot(data[skewed_feature], ax=axs[i])
This is the result I'm getting:
Is not readable, how can I avoid that issue?
I know you are concerning about the layout of the figures. However, you need to first decide how to represent your data. Here are two choices for your case
(1) Multiple lines in one figure and
(2) Multiple subplots 2x2, each subplot draws one line.
I am not quite familiar with searborn, but the plotting of searborn is based on matplotlib. I could give you some basic ideas.
To archive (1), you can first declare the figure and ax, then add all line to this ax. Example codes:
import matplotlib.pyplot as plt
fig, ax = plt.subplots()
# YOUR LOOP, use the ax parameter
for i in range(3)
sns.distplot(data[i], ax=ax)
To archive (2), same as above, but with different number subplots, and put your line in the different subplot.
# Four subplots, 2x2
fig, axarr = plt.subplots(2,2)
# YOUR LOOP, use different cell
You may check matplotlib subplots demo. To do a good visualization is a very tough work. There are so many documents to read. Check the gallery of matplotlib or seaborn is a good and quick way to understand how some kinds of visualization are implemented.
Thanks.