I am trying to connect lines based on a specific relationship associated with the points. In this example the lines would connect the players by which court they played in. I can create the basic structure but haven't figured out a reasonably simple way to create this added feature.
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
df_dict={'court':[1,1,2,2,3,3,4,4],
'player':['Bob','Ian','Bob','Ian','Bob','Ian','Ian','Bob'],
'score':[6,8,12,15,8,16,11,13],
'win':['no','yes','no','yes','no','yes','no','yes']}
df=pd.DataFrame.from_dict(df_dict)
ax = sns.boxplot(x='score',y='player',data=df)
ax = sns.swarmplot(x='score',y='player',hue='win',data=df,s=10,palette=['red','green'])
plt.show()
This code generates the following plot minus the gray lines that I am after.
You can use lineplot here:
sns.lineplot(
data=df, x="score", y="player", units="court",
color=".7", estimator=None
)
Alternatively, map each player name to an integer flag that serves as the y-axis position, then loop over the courts and draw a line between the two players on each court.
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
df_dict={'court':[1,1,2,2,3,3,4,4],
'player':['Bob','Ian','Bob','Ian','Bob','Ian','Ian','Bob'],
'score':[6,8,12,15,8,16,11,13],
'win':['no','yes','no','yes','no','yes','no','yes']}
df=pd.DataFrame.from_dict(df_dict)
ax = sns.boxplot(x='score',y='player',data=df)
ax = sns.swarmplot(x='score',y='player',hue='win',data=df,s=10,palette=['red','green'])
df['flg'] = df['player'].apply(lambda x: 0 if x == 'Bob' else 1)
for i in df.court.unique():
    # select the rows for this court; @i refers to the Python loop variable
    dfq = df.query('court == @i').reset_index()
    ax.plot(dfq['score'], dfq['flg'], 'g-')
plt.show()
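The flag above is hard-coded for two players. As a minimal sketch of the same idea for any number of players, the mapping from name to y position can be built from the data. This uses plain Matplotlib and assumes the categories are placed in order of first appearance, which is how seaborn is expected to order an unordered string column; the names positions and segments are made up for this illustration.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

df = pd.DataFrame({
    "court": [1, 1, 2, 2, 3, 3, 4, 4],
    "player": ["Bob", "Ian", "Bob", "Ian", "Bob", "Ian", "Ian", "Bob"],
    "score": [6, 8, 12, 15, 8, 16, 11, 13],
})

# Map each player to an integer y position, in order of first appearance
# (assumption: this matches the order of the categorical axis being drawn on).
positions = {name: pos for pos, name in enumerate(df["player"].unique())}

fig, ax = plt.subplots()
segments = []
for court, group in df.groupby("court"):
    xs = group["score"].tolist()
    ys = group["player"].map(positions).tolist()
    ax.plot(xs, ys, color="0.7", zorder=0)  # one gray line per court
    segments.append((court, xs, ys))

# Label the integer positions with the player names
ax.set_yticks(list(positions.values()))
ax.set_yticklabels(list(positions.keys()))
```

When drawing on top of an existing boxplot or swarmplot, pass that plot's ax to the loop instead of creating a new figure.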
I'm making a jointplot with a basemap. The problem is that when I add the basemap, the main plot no longer has the same size as the marginal plots. I've tried different parameters without luck. Does anyone have an idea?
import seaborn as sns
import matplotlib.pyplot as plt
import contextily as ctx
import pandas as pd
# example of the data
coords={'longitud':[-62.2037376443, -62.1263309099, -62.1111660957, -62.2094232682, -62.2373117384, -62.4837603464,
-62.4030570833, -62.3975699059, -62.7017114116, -62.7830883096, -62.7786038141, -62.7683234105, -62.7490101452,
-62.7709656745, -63.1002199219, -63.1890252191, -63.1183018549, -63.069960016, -62.7957745659, -63.1715687622,
-63.2156105034, -63.0634381954, -63.2243260588, -63.1153871895, -63.1068292891, -63.103945266, -63.046202785,
-63.1002257551, -63.2076065143, -62.9766391316, -62.9639256604, -62.9911452446, -62.9819984159, -62.9693649898,
-63.066770885, -62.9867441519, -62.9566360192, -62.962616287, -62.835080907, -63.0704805194, -62.8796906301,
-63.0725050601, -63.2224345145, -63.1609069526, -63.0614466072, -62.8847887504, -63.1093652381, -62.822694115,
-63.211982035, -63.1689040153],
'latitud':[8.54644405234, 8.54344899107, 8.54223724187, 8.54290207992, 8.49122679072, 8.48386575122, 8.46450360179,
8.46404720757, 8.35310083084, 8.31701565261, 8.30258604829, 8.29974870902, 8.29281679496, 8.28939264064, 8.28785272804,
8.28221439317, 8.27978694565, 8.27864159366, 8.27634987807, 8.27619269053, 8.27236343925, 8.27258932351, 8.26833993531,
8.267530064, 8.26446669791, 8.26266392333, 8.2641092051, 8.26208837315, 8.26034269744, 8.26123972942, 8.25789799656,
8.25825378832, 8.25833002805, 8.25914612933, 8.2540499893, 8.25347956867, 8.2540932736, 8.25405171513, 8.2478564527,
8.24561857662, 8.2440865055, 8.24256528837, 8.24089278, 8.23877286416, 8.23782626443, 8.23865421655, 8.23733824299,
8.23477115627, 8.23552604027, 8.24327920905]}
df = pd.DataFrame(coords)
OSM_C = 'http://c.tile.openstreetmap.org/{z}/{x}/{y}.png'
joint_axes = sns.jointplot(
x='longitud', y='latitud', data=df, ec="r", s=5)
ctx.add_basemap(joint_axes.ax_joint,crs=4326,attribution=False,url=OSM_C)
plt.subplots_adjust(hspace=0, wspace=0)
#plt.tight_layout()
plt.show()
Here is an approach that:
- removes the axes sharing in the y-direction, so that the aspect can be changed to 'datalim';
- sets the aspect to 'equal', 'datalim';
- sets the y data limits of the marginal plot to match the joint plot (this seems to need a redraw first).
The following code shows the idea (using imshow, as I don't have contextily installed):
import matplotlib.pyplot as plt
import seaborn as sns
import pandas as pd
import numpy as np
coords = {'longitud' : [-62.2037376443, -62.1263309099, -62.1111660957, -62.2094232682, -62.2373117384, -62.4837603464, -62.4030570833, -62.3975699059, -62.7017114116, -62.7830883096, -62.7786038141, -62.7683234105, -62.7490101452, -62.7709656745, -63.1002199219, -63.1890252191, -63.1183018549, -63.069960016, -62.7957745659, -63.1715687622, -63.2156105034, -63.0634381954, -63.2243260588, -63.1153871895, -63.1068292891, -63.103945266, -63.046202785, -63.1002257551, -63.2076065143, -62.9766391316, -62.9639256604, -62.9911452446, -62.9819984159, -62.9693649898, -63.066770885, -62.9867441519, -62.9566360192, -62.962616287, -62.835080907, -63.0704805194, -62.8796906301, -63.0725050601, -63.2224345145, -63.1609069526, -63.0614466072, -62.8847887504, -63.1093652381, -62.822694115, -63.211982035, -63.1689040153],
'latitud' : [8.54644405234, 8.54344899107, 8.54223724187, 8.54290207992, 8.49122679072, 8.48386575122, 8.46450360179, 8.46404720757, 8.35310083084, 8.31701565261, 8.30258604829, 8.29974870902, 8.29281679496, 8.28939264064, 8.28785272804, 8.28221439317, 8.27978694565, 8.27864159366, 8.27634987807, 8.27619269053, 8.27236343925, 8.27258932351, 8.26833993531, 8.267530064, 8.26446669791, 8.26266392333, 8.2641092051, 8.26208837315, 8.26034269744, 8.26123972942, 8.25789799656, 8.25825378832, 8.25833002805, 8.25914612933, 8.2540499893, 8.25347956867, 8.2540932736, 8.25405171513, 8.2478564527, 8.24561857662, 8.2440865055, 8.24256528837, 8.24089278, 8.23877286416, 8.23782626443, 8.23865421655, 8.23733824299, 8.23477115627, 8.23552604027, 8.24327920905]}
df = pd.DataFrame(coords)
g = sns.jointplot(data=df, x='longitud', y='latitud')
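# ctx.add_basemap(g.ax_joint,crs=4326,attribution=False,url=OSM_C)
# stand-in for the basemap: a random image covering the data extent
g.ax_joint.imshow(np.random.rand(20, 10), cmap='spring', interpolation='bicubic',
                  extent=[df['longitud'].min(), df['longitud'].max(), df['latitud'].min(), df['latitud'].max()])
# remove the y-axis sharing between the joint and marginal axes
# (this relies on the shared-axes grouper being mutable, which may not hold on newer Matplotlib releases)
for axes in g.ax_joint.get_shared_y_axes():
    for ax in axes:
        g.ax_joint.get_shared_y_axes().remove(ax)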
g.ax_joint.set_aspect('equal', 'datalim')
g.fig.canvas.draw()
g.ax_marg_y.set_ylim(g.ax_joint.get_ylim())
plt.show()
You can still combine this approach with changing the figure's width or height, or adding more whitespace on top or below.
I have a lot of features in my data and I want to make a box plot for each feature. This is what I did:
import pandas as pd
import seaborn as sns
plt.figure(figsize=(25,20))
for data in train_df.columns:
    plt.subplot(7,4,i+1)
    plt.subplots_adjust(hspace = 0.5, wspace = 0.5)
    ax =sns.boxplot(train_df[data])
and the output is
All the plots end up in one image. I want something like
(not with skew graphs, but with box plots)
What changes do I need to make?
In your code, I cannot see where the i is coming from and also it's not clear how ax was assigned.
Maybe try something like this, first an example data frame:
import pandas as pd
import numpy as np
import seaborn as sns
from matplotlib import pyplot as plt
train_df = pd.concat([pd.Series(np.random.normal(i,1,100)) for i in range(12)],axis=1)
Set up fig and a flattened ax for each subplot:
fig,ax = plt.subplots(4,3,figsize=(10,10))
ax = ax.flatten()
The most basic would be to call sns.boxplot assigning ax inside the function:
for i,data in enumerate(train_df.columns):
    # pass the column as x explicitly and draw into the i-th subplot
    sns.boxplot(x=train_df[data],ax=ax[i])
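When the number of features does not fill the grid exactly, the leftover subplots stay empty. A minimal sketch of one way to handle that is to derive the number of rows from the column count and hide the unused axes. The column names f0 to f9 and the choice of four columns per row are made up for this illustration.

```python
import math

import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Hypothetical data: 10 numeric features, 100 rows each
rng = np.random.default_rng(0)
train_df = pd.DataFrame(rng.normal(size=(100, 10)),
                        columns=[f"f{i}" for i in range(10)])

ncols = 4
nrows = math.ceil(len(train_df.columns) / ncols)
fig, axes = plt.subplots(nrows, ncols, figsize=(4 * ncols, 3 * nrows))
axes = axes.flatten()

# One box plot per feature, each in its own subplot
for ax, col in zip(axes, train_df.columns):
    sns.boxplot(x=train_df[col], ax=ax)
    ax.set_title(col)

# Hide the grid cells that have no feature assigned
for ax in axes[len(train_df.columns):]:
    ax.set_visible(False)

fig.tight_layout()
```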
I am creating a JointGrid plot using seaborn.
import seaborn as sns
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
mydataset=pd.DataFrame(data=np.random.rand(50,2),columns=['a','b'])
g = sns.JointGrid(x=mydataset['a'], y=mydataset['b'])
g=g.plot_marginals(sns.distplot,color='black',kde=True,hist=False,rug=True,bins=20,label='X')
g=g.plot_joint(plt.scatter,label='X')
legend_properties = {'weight':'bold','size':8}
legendMain=g.ax_joint.legend(prop=legend_properties,loc='upper right')
legendSide=g.ax_marg_x.legend(prop=legend_properties,loc='upper right')
I get this:
I would like to get rid of the legend within the vertical marginal plot (the one on the right side) but keep the one for the horizontal margin.
How can I achieve that?
EDIT: The solution from @ImportanceOfBeingErnest works fine for one plot. However, if I repeat it in a for loop, something unexpected happens.
I still get a legend in the upper plot, which is unexpected.
How can I get rid of it?
The following code:
import seaborn as sns
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
mydataset=pd.DataFrame(data=np.random.rand(50,2),columns=['a','b'])
g = sns.JointGrid(x=mydataset['a'], y=mydataset['b'])
LABEL_LIST=['x','Y','Z']
for n in range(0,3):
    g=g.plot_marginals(sns.distplot,color='black',kde=True,hist=False,rug=True,bins=20,label=LABEL_LIST[n])
    g=g.plot_joint(plt.scatter,label=LABEL_LIST[n])
    legend_properties = {'weight':'bold','size':8}
    legendMain=g.ax_joint.legend(prop=legend_properties,loc='upper right')
    legendSide=g.ax_marg_y.legend(labels=[LABEL_LIST[n]],prop=legend_properties,loc='upper right')
gives:
which is almost perfect, but I need to get rid of the last legend entry in the plot on the right.
You may decide not to give any label to the marginals, but instead add the label when creating the legend inside the top marginal axes.
import seaborn as sns
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
mydataset=pd.DataFrame(data=np.random.rand(50,2),columns=['a','b'])
g = sns.JointGrid(x=mydataset['a'], y=mydataset['b'])
g=g.plot_marginals(sns.distplot,color='black',
                   kde=True,hist=False,rug=True,bins=20)
g=g.plot_joint(plt.scatter,label='X')
legend_properties = {'weight':'bold','size':8}
legendMain=g.ax_joint.legend(prop=legend_properties,loc='upper right')
legendSide=g.ax_marg_x.legend(labels=["x"],
                              prop=legend_properties,loc='upper right')
plt.show()
The solution is the same for a plot in a loop.
import seaborn as sns
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
mydataset=pd.DataFrame(data=np.random.rand(50,2),columns=['a','b'])
g = sns.JointGrid(x=mydataset['a'], y=mydataset['b'])
LABEL_LIST=['x','Y','Z']
for n in range(0,3):
    g=g.plot_marginals(sns.distplot,color='black',kde=True,hist=False,rug=True,bins=20)
    g=g.plot_joint(plt.scatter,label=LABEL_LIST[n])
legend_properties = {'weight':'bold','size':8}
legendMain=g.ax_joint.legend(prop=legend_properties,loc='upper right')
legendSide=g.ax_marg_x.legend(labels=LABEL_LIST,prop=legend_properties,loc='upper right')
plt.show()
I tried to save multiple plots in a loop, but they are drawn over each other. What should I do?
sample code:
import pandas as pd
import seaborn as sns
data=pd.DataFrame({'a':[1,2,3,4,5,6],'b':[0,1,1,0,1,1],'c':[0,0,0,1,1,1]})
for i in ['b','c']:
    img=sns.boxplot(data.a, groupby=data[i])
    fig = img.get_figure()
    fig.savefig(i)
You need to clear the figure at the end of each iteration; otherwise the data from the previous plot carries over into the next one. This should work; note the fig.clf() at the end of the loop body:
import pandas as pd
import seaborn as sns
data=pd.DataFrame({'a':[1,2,3,4,5,6],'b':[0,1,1,0,1,1],'c':[0,0,0,1,1,1]})
for i in ['b','c']:
    img=sns.boxplot(data.a, groupby=data[i])
    fig = img.get_figure()
    fig.savefig(i)
    fig.clf()
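The groupby argument used above belongs to an old seaborn API. As a hedged sketch of the same loop on a recent seaborn, an alternative to clearing a shared figure is to create a fresh figure per iteration and close it after saving. The temporary output directory and the .png file names are assumptions made for this illustration.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

data = pd.DataFrame({'a': [1, 2, 3, 4, 5, 6],
                     'b': [0, 1, 1, 0, 1, 1],
                     'c': [0, 0, 0, 1, 1, 1]})

out_dir = tempfile.mkdtemp()  # hypothetical output location
saved = []
for col in ['b', 'c']:
    fig, ax = plt.subplots()              # new, empty figure for every plot
    sns.boxplot(x=data[col], y=data['a'], ax=ax)
    path = os.path.join(out_dir, f"{col}.png")
    fig.savefig(path)
    plt.close(fig)                        # release the figure so nothing carries over
    saved.append(path)
```

Closing each figure also keeps memory use flat when the loop produces many plots.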