This question already has answers here:
Seaborn displot facetgrid do not share y axis
(1 answer)
Prevent Sharing of Y Axes in a relplot
(1 answer)
Closed 6 months ago.
I have some data on widely different scales. I want to create a displot showing all the features in one figure. I thought facet_kws={'sharex': False} was the relevant parameter, but it doesn't appear to be working. What am I doing wrong?
import numpy as np
import pandas as pd
import seaborn as sns
import random
# Sample Dataframe
df = pd.DataFrame(np.random.randint(0,200,size=(200, 3)), columns=list('ABC'))
D= np.random.randint(0,10,size=(200, 1))
df['D']= D
# reshape dataframe
df2 = df.stack().reset_index(level=1).reset_index(drop=True).\
rename(columns={'level_1': 'Name', 0: 'Value'})
# plot
g = sns.displot(data=df2,
x='Value', col='Name',
col_wrap=3, kde=True,
facet_kws={'sharex': False})
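The author of seaborn (mwaskom) already answered in a comment on the question, but I'll answer in more detail.
Set common_bins=False as well, as shown below. By default, all facets share one set of bins computed from the full dataset, so a feature with a much smaller range ends up in very few bins even when the x axes are independent. The option is documented under histplot().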
g = sns.displot(data=df2,
x='Value', col='Name',
col_wrap=3, kde=True,
common_bins=False,
facet_kws={'sharex': False, 'sharey': False})
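To see why common_bins matters here, the following NumPy-only sketch compares bin edges computed from pooled values with edges computed from one narrow feature alone. It illustrates the idea and is not seaborn's internal code; the variable names, the seed, and the choice of 10 bins are assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
wide = rng.integers(0, 200, size=200)    # stands in for columns A, B, C
narrow = rng.integers(0, 10, size=200)   # stands in for column D

# Common bins: one set of edges computed from all values pooled together.
common_edges = np.histogram_bin_edges(np.concatenate([wide, narrow]), bins=10)
narrow_common, _ = np.histogram(narrow, bins=common_edges)

# Independent bins: edges computed from the narrow feature alone.
narrow_own, own_edges = np.histogram(narrow, bins=10)

# Count how many bins actually contain values in each case.
occupied_common = int((narrow_common > 0).sum())
occupied_own = int((narrow_own > 0).sum())
print(occupied_common, occupied_own)
```

With pooled edges, every value of the narrow feature lands in the first bin, which is why that panel collapses into a single bar until common_bins=False is set.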
I also suggest correcting your example code as follows, for the benefit of other readers, because the original raises a ValueError on duplicate labels.
df2 = df.stack().reset_index(level=1).reset_index(drop=True).\
rename(columns={'level_1': 'Name', 0: 'Value'})
This question already has answers here:
Passing datetime-like object to seaborn.lmplot
(2 answers)
format x-axis (dates) in sns.lmplot()
(1 answer)
How to plot int to datetime on x axis using seaborn?
(1 answer)
Closed 10 months ago.
I would really appreciate it if you could point me in the right direction. I have been trying to do this for 3 days and still can't find the right approach. I need to draw a chart that looks like the one in the first picture, with the dates displayed on the x-axis as in the second chart. I am a complete beginner with seaborn, Python and everything else. I used lineplot first, which met only one criterion: it displays the dates on the x-axis. But the lines are sharp, as in the second picture, rather than smooth, as in the first. Then I kept digging and found lmplot. With that, I could get the chart design I wanted (a smoothed chart). But when I tried to display the dates on the x-axis, it didn't work. I got the error could not convert string to float: '2022-07-27T13:31:00Z'.
Here is the code for lmplot. It gives the plot design I want, but the dates can't be displayed on the x-axis:
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
T = np.array([ "2022-07-27T13:31:00Z",
"2022-08-28T13:31:00Z",
"2022-09-29T13:31:00Z",
"2022-10-30T13:31:00Z",])
power = np.array([10,25,60,42])
df = pd.DataFrame(data = {'T': T, 'power': power})
sns.lmplot(x='T', y='power', data=df, ci=None, order=4, truncate=False)
If I use numbers instead of dates, the output is this, exactly as I need.
Here is the code that displays all the data correctly, but the plot is not smoothed.
import seaborn as sns
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
years = ["2022-03-22T13:30:00Z",
"2022-03-23T13:31:00Z",
"2022-04-24T19:27:00Z",
"2022-05-25T13:31:00Z",
"2022-06-26T13:31:00Z",
"2022-07-27T13:31:00Z",
"2022-08-28T13:31:00Z",
"2022-09-29T13:31:00Z",
"2022-10-30T13:31:00Z",
]
feature_1 =[0,
6,
1,
5,
9,
15,
21,
4,
1,
]
data_preproc = pd.DataFrame({
'Period': years,
# 'Feature 1': feature_1,
# 'Feature 2': feature_2,
# 'Feature 3': feature_3,
# 'Feature 4': feature_4,
"Feature 1" :feature_1
})
# The format must describe the whole string, including the time part and the trailing Z
data_preproc['Period'] = pd.to_datetime(data_preproc['Period'],
format="%Y-%m-%dT%H:%M:%SZ",errors='coerce')
data_preproc['Period'] = data_preproc['Period'].dt.strftime('%b')
# aiAlertPlot =sns.lineplot(x='Period', y='value', hue='variable',ci=None,
# data=pd.melt(data_preproc, ['Period']))
sns.lineplot(x="Period",y="Feature 1",data=data_preproc)
# plt.xticks(np.linspace(start=0, stop=21, num=52))
plt.xticks(rotation=90)
plt.legend(title="features")
plt.ylabel("Alerts")
plt.legend(loc='upper right')
plt.show()
The output is this. Correct data, wrong chart design.
lmplot is a model-based method, so it requires a numeric x. If you can treat the dates as evenly spaced, create a numeric variable range, fit lmplot on that variable, and then replace the xtick labels with the dates.
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
T = np.array([ "2022-07-27T13:31:00Z",
"2022-08-28T13:31:00Z",
"2022-09-29T13:31:00Z",
"2022-10-30T13:31:00Z",])
power = np.array([10,25,60,42])
df = pd.DataFrame(data = {'T': T, 'power': power})
df['range'] = np.arange(df.shape[0])
sns.lmplot(x='range', y='power', data=df, ci=None, order=4, truncate=False)
plt.xticks(df['range'], df['T'], rotation = 45);
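If the dates are not evenly spaced, one alternative is to fit on elapsed days and plot the fitted curve against real timestamps. The sketch below is an illustration under assumptions, not what lmplot does internally: np.polyfit stands in for the polynomial fit, the degree is lowered to 2 because four points cannot determine a fourth-order polynomial, and the names days, grid and grid_dates are made up for the example.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Parse the ISO strings, then drop the UTC marker (values stay in UTC).
T = pd.to_datetime(["2022-07-27T13:31:00Z",
                    "2022-08-28T13:31:00Z",
                    "2022-09-29T13:31:00Z",
                    "2022-10-30T13:31:00Z"], utc=True).tz_localize(None)
power = np.array([10, 25, 60, 42])

# Elapsed days since the first timestamp: a numeric x that keeps the real spacing.
days = ((T - T[0]) / pd.Timedelta(days=1)).to_numpy()

# Assumption: degree 2 instead of order=4, to keep the fit well determined.
coeffs = np.polyfit(days, power, deg=2)
grid = np.linspace(days.min(), days.max(), 50)
smooth = np.polyval(coeffs, grid)

# Map the numeric grid back to timestamps so matplotlib draws a date axis.
grid_dates = T[0] + pd.to_timedelta(grid, unit="D")

fig, ax = plt.subplots()
ax.scatter(T, power)          # original observations
ax.plot(grid_dates, smooth)   # smoothed curve
fig.autofmt_xdate()           # rotate the date labels
```

Because the x values are real timestamps, matplotlib places and formats the date ticks itself, so no manual xticks relabeling is needed.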
This question already has answers here:
Line plot with data points in pandas
(2 answers)
Closed 1 year ago.
Hi I am trying to get a line plot for a dataframe:
i = [0.01,0.02,0.03,....,0.98,0.99,1.00]
values= [76,98,22,.....,32,98,100]
but there is also an index running 0, 1, ..., 99, and when I plot, the index line gets plotted as well. How do I avoid plotting the index? I used the following code:
plt.plot(df,color= 'blue', label= 'values')
plt.title('values for corresponding i')
plt.legend(loc= 'upper right')
plt.xlabel("i")
plt.ylabel("values")
plt.show()
You could call plot.line directly on the pandas DataFrame. It's a wrapper around matplotlib that makes this easier.
Example:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
# Generate random DataFrame
i = np.arange(0, 1, 0.01)
values = np.random.randint(1, 100, 100)
df = pd.DataFrame({"i": i, "values": values})
# Plot
df.plot.line(x="i", y="values", color="blue", label="values")
plt.title("values for corresponding i")
plt.legend(loc="upper right")
plt.xlabel("i")
plt.ylabel("values")
Result:
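If you would rather stay with plain matplotlib, passing the two columns explicitly has the same effect. This is a minimal sketch that assumes the DataFrame has columns named i and values, as in the question; the random data is only a stand-in.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Stand-in data: i runs 0.01, 0.02, ..., 1.00 as in the question.
df = pd.DataFrame({"i": np.arange(1, 101) / 100,
                   "values": np.random.randint(1, 100, 100)})

fig, ax = plt.subplots()
# plt.plot(df) draws every column against the index;
# naming x and y explicitly draws a single line.
ax.plot(df["i"], df["values"], color="blue", label="values")
ax.set_title("values for corresponding i")
ax.legend(loc="upper right")
ax.set_xlabel("i")
ax.set_ylabel("values")
```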
This question already has answers here:
How to change the color of a single bar if condition is True
(2 answers)
Closed 2 years ago.
I have the following dataframe producing the following plot:
# Import pandas library
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
# initialize data
data = [['tom', 10,1,'a'], ['matt', 15,5,'a'], ['Nick', 14,1,'a']]
# Create the pandas DataFrame
df = pd.DataFrame(data, columns = ['Name', 'Attempts','Score','Category'])
print(df.head(3))
Name Attempts Score Category
0 tom 10 1 a
1 matt 15 5 a
2 Nick 14 1 a
# Initialize the matplotlib figure
sns.set()
sns.set_context("paper")
sns.axes_style({'axes.spines.left': True})
f, ax = plt.subplots(nrows=3,figsize=(8.27,11.7))
# Plot
sns.set_color_codes("muted")
sns.barplot(x="Attempts", y='Name', data=df,
label="Total", color="b", ax=ax[0])
sns.scatterplot(x='Score',y='Name',data=df,zorder=10,color='k',edgecolor='k',ax=ax[0],legend=False)
ax[0].set_title("title")
plt.show()
I want to highlight just the bar Nick in a different color (eg red). Is there an easy way to do this?
In barplot, you can pass palette instead of color and build the list of colors with a list comprehension that checks each name for the value you want to highlight.
sns.barplot(x="Attempts", y='Name', data=df,
label="Total", palette=["b" if x!='Nick' else 'r' for x in df.Name], ax=ax[0])
and you get
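For comparison, the same per-bar color idea can be sketched in plain matplotlib, which accepts one color per bar. This is an alternative illustration rather than part of the seaborn answer; the color names are arbitrary choices.

```python
import pandas as pd
import matplotlib.pyplot as plt

data = [['tom', 10, 1, 'a'], ['matt', 15, 5, 'a'], ['Nick', 14, 1, 'a']]
df = pd.DataFrame(data, columns=['Name', 'Attempts', 'Score', 'Category'])

# One color per row, in row order; only 'Nick' gets the highlight color.
colors = ['red' if name == 'Nick' else 'tab:blue' for name in df['Name']]

fig, ax = plt.subplots()
bars = ax.barh(df['Name'], df['Attempts'], color=colors)
ax.invert_yaxis()  # put the first row at the top
```

The list of colors must follow the row order of the DataFrame, since each entry is applied to the bar at the same position.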
This question already has answers here:
Adding labels in x y scatter plot with seaborn
(6 answers)
Closed 4 years ago.
I have a Seaborn scatterplot using data from a dataframe. I would like to add data labels to the plot, using other values in the df associated with that observation (row). Please see below - is there a way to add at least one of the column values (A or B) to the plot? Even better, is there a way to add two labels (in this case, both the values in column A and B?)
I have tried to use a for loop using functions like the below per my searches, but have not had success with this scatterplot.
Thank you for your help.
df_so = pd.DataFrame(np.random.randint(0,100,size=(100, 4)), columns=list('ABCD'))
scatter_so=sns.lmplot(x='C', y='D', data=df_so,
fit_reg=False,y_jitter=0, scatter_kws={'alpha':0.2})
fig, ax = plt.subplots() #stuff like this does not work
Use:
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
df_so = pd.DataFrame(np.random.randint(0,100,size=(20, 4)), columns=list('ABCD'))
scatter_so=sns.lmplot(x='C', y='D', data=df_so,
    fit_reg=False,y_jitter=0, scatter_kws={'alpha':0.2})
def label_point(x, y, val, ax):
    # Collect coordinates and label text row by row, then draw each label
    # slightly to the right of its point.
    a = pd.concat({'x': x, 'y': y, 'val': val}, axis=1)
    for i, point in a.iterrows():
        ax.text(point['x']+.02, point['y'], str(point['val']))
label_point(df_so['C'], df_so['D'], '('+df_so['A'].astype(str)+', '+df_so['B'].astype(str)+')', plt.gca())
Output:
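A variation on the same idea uses ax.annotate with an offset given in points, so the gap between marker and label does not depend on the data scale. This is a sketch in plain matplotlib rather than lmplot; the 3-point offset and the seed are arbitrary assumptions.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
df_so = pd.DataFrame(rng.integers(0, 100, size=(20, 4)), columns=list('ABCD'))

fig, ax = plt.subplots()
ax.scatter(df_so['C'], df_so['D'], alpha=0.2)

# Label every point with its (A, B) values, shifted 3 points up and to the right.
for row in df_so.itertuples(index=False):
    ax.annotate(f"({row.A}, {row.B})", xy=(row.C, row.D),
                xytext=(3, 3), textcoords="offset points")
```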
This question already has answers here:
How to scale Seaborn's y-axis with a bar plot
(4 answers)
Closed 5 years ago.
My question is fairly simple: I would like to visualize multiple histograms using the Seaborn module, however, as a number of bins contain very few counts, I would like to visualize the vertical axis using a logarithmic scale.
My code so far:
import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
df = pd.DataFrame(np.random.rand(100,2), columns=['A','B'])
df = pd.melt(df, var_name='Category')
g = sns.FacetGrid(df, col='Category', sharex=True, sharey=False, aspect=1.5)
g = g.map(plt.hist, "value", color="r")
, which gives me the following image:
How do I change the vertical axis to a logarithmic scale (in the most 'pythonic'/'seabornic' way)? I've looked around on various answers, but wasn't satisfied with the answers I found so far.
Update:
Adding the following code, following the answer here, makes my bars vanish:
g.fig.get_axes()[0].set_yscale('log')
Update II:
The following code fixed my problem:
df = pd.DataFrame(np.random.rand(100,2), columns=['A','B'])
df = pd.melt(df, var_name='Category')
g = sns.FacetGrid(df, col='Category', sharex=True, sharey=False, aspect=1.5)
g = g.map(plt.hist, "value", color="r", log=True)
I just added the last two lines, which set a log scale on the y axis of each facet:
import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
df = pd.DataFrame(np.random.rand(100,2), columns=['A','B'])
df = pd.melt(df, var_name='Category')
g = sns.FacetGrid(df, col='Category', col_wrap=2, sharex=True, sharey=False, aspect=1.5)
g = g.map(plt.hist, "value", color="r")
g.axes[0].set_yscale('log')
g.axes[1].set_yscale('log')