I am trying to create subplots inside a for loop for several columns of a dataset. I am using the California housing dataset from sklearn. There are 4 columns of interest, and I want to display three plots for each column in a grid of subplots. Below is the code I have tried. Can somebody help me with this? Can it also be made dynamic, so that more plots can be added easily, each with a title?
from sklearn.datasets import fetch_california_housing
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
california_housing = fetch_california_housing(as_frame=True)
# california_housing.frame.head()
features_of_interest = ["AveRooms", "AveBedrms", "AveOccup", "Population"]
california_housing.frame[features_of_interest]
fig, axes = plt.subplots(4, 3)
for cols in features_of_interest:
    # scatterplot
    sns.scatterplot(x=california_housing.frame[cols], y=california_housing.target)
    # histogram
    sns.histplot(x=california_housing.frame[cols], y=california_housing.target)
    # qqplot
    sm.qqplot(california_housing.frame[cols], line='45')
    plt.show()
There are some problems with your code:
You need to import statsmodels.api as sm.
You need to pass the ax parameter to scatterplot, histplot, and qqplot to indicate which subplot each plot is drawn in.
The way you load the data is not allowing matplotlib and seaborn to use it, so I made some changes to that part and work with the .frame DataFrame directly.
You do not need to call plt.show() on each iteration, only once at the end.
from sklearn.datasets import fetch_california_housing
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
import statsmodels.api as sm
california_housing = fetch_california_housing(as_frame=True).frame
features_of_interest = ["AveRooms", "AveBedrms", "AveOccup", "Population"]
fig, axes = plt.subplots(len(features_of_interest), 3)
for i, cols in enumerate(features_of_interest):
    # scatterplot
    sns.scatterplot(x=california_housing[cols], y=california_housing['MedHouseVal'], ax=axes[i,0])
    # histogram
    sns.histplot(x=california_housing[cols], y=california_housing['MedHouseVal'], ax=axes[i,1])
    # qqplot
    sm.qqplot(california_housing[cols], line='45', ax=axes[i,2])
plt.show()
PS: I used len(features_of_interest) so that the number of subplot rows adapts automatically to the number of features.
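On the "dynamic, with titles" part of the question, one possible approach is to keep the plot types in a dictionary and size the grid from it. This is only a sketch under assumptions: the data is a synthetic stand-in for the housing frame, the helper functions and the `plotters` registry are hypothetical names, and plain matplotlib calls are used so the block runs without seaborn or statsmodels. Each helper could just as well wrap `sns.scatterplot(..., ax=ax)`, `sns.histplot(..., ax=ax)`, or `sm.qqplot(..., ax=ax)`.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere; drop this in a notebook
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Synthetic stand-in for the housing frame (assumption: any numeric DataFrame works)
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.lognormal(size=(200, 3)), columns=["a", "b", "target"])

# Hypothetical helpers: each one draws a single plot type on the given Axes
def scatter(ax, x, y):
    ax.scatter(x, y, s=5)

def histogram(ax, x, y):
    ax.hist(x, bins=30)

def box(ax, x, y):
    ax.boxplot(x.to_numpy())

# Hypothetical registry: title -> plotting function; add an entry to add a column of plots
plotters = {"scatter": scatter, "histogram": histogram, "box": box}
features = ["a", "b"]

# Grid size follows the number of features (rows) and plot types (columns)
fig, axes = plt.subplots(
    len(features), len(plotters),
    figsize=(4 * len(plotters), 3 * len(features)),
    squeeze=False,  # always return a 2D array, even for a single row or column
)
for i, col in enumerate(features):
    for j, (title, draw) in enumerate(plotters.items()):
        draw(axes[i, j], df[col], df["target"])
        axes[i, j].set_title(f"{col}: {title}")
fig.tight_layout()
```

Adding another plot type then only means adding one entry to `plotters`; the grid and the titles follow from it.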
I am trying to connect lines based on a specific relationship associated with the points. In this example the lines would connect the players by which court they played in. I can create the basic structure but haven't figured out a reasonably simple way to create this added feature.
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
df_dict={'court':[1,1,2,2,3,3,4,4],
'player':['Bob','Ian','Bob','Ian','Bob','Ian','Ian','Bob'],
'score':[6,8,12,15,8,16,11,13],
'win':['no','yes','no','yes','no','yes','no','yes']}
df=pd.DataFrame.from_dict(df_dict)
ax = sns.boxplot(x='score',y='player',data=df)
ax = sns.swarmplot(x='score',y='player',hue='win',data=df,s=10,palette=['red','green'])
plt.show()
This code generates the following plot minus the gray lines that I am after.
You can use lineplot here:
sns.lineplot(
    data=df, x="score", y="player", units="court",
    color=".7", estimator=None
)
Alternatively, map each player name to an integer flag that serves as the y-axis value, then loop over the courts and draw a line between the points belonging to each court.
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
df_dict={'court':[1,1,2,2,3,3,4,4],
'player':['Bob','Ian','Bob','Ian','Bob','Ian','Ian','Bob'],
'score':[6,8,12,15,8,16,11,13],
'win':['no','yes','no','yes','no','yes','no','yes']}
df=pd.DataFrame.from_dict(df_dict)
ax = sns.boxplot(x='score',y='player',data=df)
ax = sns.swarmplot(x='score',y='player',hue='win',data=df,s=10,palette=['red','green'])
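# Map each player to the integer position of its category on the y-axis
df['flg'] = df['player'].apply(lambda x: 0 if x == 'Bob' else 1)
for i in df.court.unique():
    # @i refers to the loop variable from inside the query string
    dfq = df.query('court == @i').reset_index()
    ax.plot(dfq['score'], dfq['flg'], 'g-')
plt.show()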
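The hard-coded Bob/Ian flag can be generalised to any number of players. The sketch below is one way to do it, assuming the categories sit at positions 0, 1, ... in order of first appearance (which is how seaborn appears to order non-categorical string data by default). It uses plain matplotlib so it runs on its own; `order` and `ypos` are names introduced here for illustration. The same `ax.plot` loop could be drawn on top of the boxplot and swarmplot axes.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere; drop this in a notebook
import matplotlib.pyplot as plt
import pandas as pd

df = pd.DataFrame({
    "court": [1, 1, 2, 2, 3, 3, 4, 4],
    "player": ["Bob", "Ian", "Bob", "Ian", "Bob", "Ian", "Ian", "Bob"],
    "score": [6, 8, 12, 15, 8, 16, 11, 13],
})

# Assumed category order: first appearance in the data
order = list(pd.unique(df["player"]))
# Integer y position for every row, derived from that order
ypos = df["player"].map({name: k for k, name in enumerate(order)})

fig, ax = plt.subplots()
ax.scatter(df["score"], ypos)

# One gray line per court, connecting the players who met on that court
for court, grp in df.groupby("court"):
    ax.plot(grp["score"], ypos.loc[grp.index], color="0.7", zorder=0)

ax.set_yticks(range(len(order)))
ax.set_yticklabels(order)
```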
I have seen many questions on SO about changing the tick frequency, and they helped when I was building a line chart, but I have been struggling to do the same for a bar chart. Here is my code:
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
df = pd.DataFrame(np.random.randint(1,10,(90,1)),columns=['Values'])
df.plot(kind='bar')
plt.show()
and that's the output I see. How do I change the tick frequency?
(To be clearer: I want a tick every 5 values on the x-axis.)
Using Pandas plot function you can do:
import numpy as np
import pandas as pd
df = pd.DataFrame(np.random.randint(1,10,(90,1)),columns=['Values'])
df.plot(kind='bar', xticks=np.arange(0,90,5))
Or better:
df.plot(kind='bar', xticks=list(df.index[0::5]))
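Note that the xticks values are positions along the axis: in a pandas bar plot, bar k sits at position k. With the default RangeIndex the positions and the labels coincide, which is why both variants above work. For another kind of index, one option is to set the tick positions and the labels explicitly after plotting. The following is a sketch with a made-up date index; the `positions` and `labels` names are only illustrative.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere; drop this in a notebook
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Made-up data: 90 daily values with a date index instead of 0..89
rng = np.random.default_rng(0)
df = pd.DataFrame(
    {"Values": rng.integers(1, 10, 90)},
    index=pd.date_range("2024-01-01", periods=90, freq="D"),
)

ax = df.plot(kind="bar")

# Every 5th bar: positions are 0, 5, 10, ...; labels come from the index
positions = np.arange(0, len(df), 5)
labels = df.index[positions].strftime("%m-%d")
ax.set_xticks(positions)
ax.set_xticklabels(labels, rotation=45)
```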
I have a lot of features in my data and I want to make a box plot for each feature. So for that
import pandas as pd
import seaborn as sns
plt.figure(figsize=(25,20))
for data in train_df.columns:
    plt.subplot(7,4,i+1)
    plt.subplots_adjust(hspace = 0.5, wspace = 0.5)
    ax =sns.boxplot(train_df[data])
I did this, but in the output all the plots end up on one image. I want a grid of separate plots instead (with box plots, not skew graphs).
What changes do I need to make?
In your code, I cannot see where the i is coming from and also it's not clear how ax was assigned.
Maybe try something like this, first an example data frame:
import pandas as pd
import numpy as np
import seaborn as sns
from matplotlib import pyplot as plt
train_df = pd.concat([pd.Series(np.random.normal(i,1,100)) for i in range(12)],axis=1)
Set up fig and a flattened ax for each subplot:
fig,ax = plt.subplots(4,3,figsize=(10,10))
ax = ax.flatten()
The most basic approach is to call sns.boxplot and pass the target ax to it:
for i,data in enumerate(train_df.columns):
    sns.boxplot(train_df[data],ax=ax[i])
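If the number of features is not known in advance, the grid can be sized from the data instead of hard-coding 4 by 3. This is a sketch under assumptions: the frame is random stand-in data with made-up column names, the column count `ncols = 4` is an arbitrary choice, and `ax.boxplot` from matplotlib is used so the block runs without seaborn (a call like `sns.boxplot(x=train_df[col], ax=ax)` could be swapped in).

```python
import math

import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere; drop this in a notebook
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Stand-in data: 10 numeric features with made-up names
rng = np.random.default_rng(0)
train_df = pd.DataFrame(
    rng.normal(size=(100, 10)),
    columns=[f"f{k}" for k in range(10)],
)

ncols = 4                                          # arbitrary choice
nrows = math.ceil(train_df.shape[1] / ncols)       # enough rows for all features
fig, axes = plt.subplots(nrows, ncols, figsize=(4 * ncols, 3 * nrows), squeeze=False)
axes = axes.flatten()

# One box plot per feature, each on its own Axes
for ax, col in zip(axes, train_df.columns):
    ax.boxplot(train_df[col].to_numpy())
    ax.set_title(col)

# Hide the Axes left over when the grid has more cells than features
for ax in axes[train_df.shape[1]:]:
    ax.set_visible(False)

fig.subplots_adjust(hspace=0.5, wspace=0.5)
```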
I have a problem using Seaborn relplot when trying to make an animation. I have recreated the issue below using one of the datasets that come with Seaborn.
I suspect it has something to do with the plt.figure() not being the same figure as the one sns.relplot draws on. Any ideas on how to make this work would be greatly appreciated. Thanks.
%matplotlib inline
import seaborn as sns
import pandas as pd
from matplotlib import pyplot as plt
from celluloid import Camera
from IPython.display import HTML
import ffmpeg
df = sns.load_dataset('car_crashes')
f = plt.figure(figsize=(3,3))
camera = Camera(f)
# This might seem a little bit unnecessary, but its emulating the way I am plotting my other data source:
for i in range(0, len(df), 10):
    g = sns.relplot(x='total', y='abbrev', hue='abbrev', data=df.iloc[i:i+10 , [0,7]])
    plt.axis('off')
    plt.title(f'THIS IS THE TITLE OF {i}')
    plt.gca().set_aspect('equal')
    camera.snap()
animation = camera.animate()
HTML(animation.to_html5_video())
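The suspicion about the figures looks plausible: relplot is a figure-level function that creates its own figure on every call, so nothing is drawn on the figure handed to Camera. A possible direction is to draw every frame on one fixed Axes, for example with an axes-level function such as sns.scatterplot(..., ax=ax). The sketch below illustrates that idea with plain matplotlib and ArtistAnimation rather than celluloid, on made-up data standing in for car_crashes, with numeric y positions in place of the state abbreviations. It is an assumption-laden illustration, not a tested fix for the code above.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere; drop this in a notebook
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from matplotlib.animation import ArtistAnimation

# Made-up stand-in for the car_crashes columns used in the question
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "total": rng.uniform(5, 25, 50),
    "abbrev": [f"S{k:02d}" for k in range(50)],
})

# A single figure and Axes that every frame is drawn on
fig, ax = plt.subplots(figsize=(3, 3))
frames = []
for i in range(0, len(df), 10):
    chunk = df.iloc[i:i + 10]
    # Draw this chunk on the shared Axes and keep the artists as one frame
    points = ax.scatter(chunk["total"], np.arange(len(chunk)), color="C0")
    label = ax.text(0.5, 1.02, f"rows {i} to {i + len(chunk) - 1}",
                    transform=ax.transAxes, ha="center")
    frames.append([points, label])

# Each inner list is shown as one frame; the other artists are hidden meanwhile
animation = ArtistAnimation(fig, frames, interval=500, blit=False)
```

In a notebook, `HTML(animation.to_html5_video())` could then be used as in the question, provided ffmpeg is available.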