Facet box plots by additional category in Python - python

I have created a box plot in Python with the Seaborn package (sns) as shown in the code below where the x-axis is age ranges and the y-axis is a dollar value number ('AveMonthSpend').
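import matplotlib.pyplot as plt
import seaborn as sns

def plot_box(df, col, col_y='AveMonthSpend'):
    sns.set_style("whitegrid")
    # Keyword arguments: recent seaborn releases no longer accept x and y positionally
    sns.boxplot(x=col, y=col_y, data=df)
    plt.xlabel(col)    # Set text for the x axis
    plt.ylabel(col_y)  # Set text for the y axis
    plt.show()

plot_box(df_3, 'age_range')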
What I would like to do is facet this box plot by a third column - 'Gender' with the two values being ['M', 'F']. I have read this site and have tried many options, but none seem to work.
Following is one of the options I tried:
g = sns.FacetGrid(pd.melt(df, id_vars='Gender'), col='Gender')
g.map(sns.boxplot, 'age_range', 'AveMonthSpend')
For this I got the following error:
KeyError: "['age_range' 'AveMonthSpend'] not in index"
Any suggestions would be greatly appreciated. Thank you!
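One possible approach, as a minimal sketch: the KeyError most likely comes from pd.melt, which folds every column other than 'Gender' into generic 'variable' and 'value' columns, so 'age_range' and 'AveMonthSpend' no longer exist when FacetGrid looks them up. Skipping the melt and letting seaborn's catplot do the faceting (kind="box", col="Gender") should give one box-plot panel per gender. The toy data below is invented purely for illustration and stands in for df_3; the Agg backend line is only there so the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line in a notebook
import pandas as pd
import seaborn as sns

# Invented toy data standing in for df_3 (same column names as in the question)
df_3 = pd.DataFrame({
    "age_range": ["18-30", "18-30", "31-50", "31-50", "51+", "51+"] * 2,
    "AveMonthSpend": [50, 62, 71, 80, 66, 90, 45, 58, 77, 85, 60, 72],
    "Gender": ["M"] * 6 + ["F"] * 6,
})

# One panel per value of 'Gender', each panel a box plot of spend by age range
g = sns.catplot(data=df_3, x="age_range", y="AveMonthSpend",
                col="Gender", kind="box")
g.set_axis_labels("age_range", "AveMonthSpend")
```

catplot returns a FacetGrid, so the usual FacetGrid methods (set_titles, savefig, and so on) remain available on g.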

Related

Matplotlib Label issue

I have a DataFrame and would like to plot a histogram of two variables. The syntax below gives the required plot, but the labels do not appear. Any help would be appreciated.
colors = ['tan', 'lime']
labels = ['Score 1','Score 2']
plt.hist([test.score_1,test.score_2],bins = 10,histtype='bar',color=colors, label=labels)
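A likely cause, sketched under the assumption that nothing else in the script draws a legend: the label argument only attaches names to the bars, and Matplotlib does not display them until legend() is called. The scores below are randomly generated stand-ins for test.score_1 and test.score_2.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line in a notebook
import matplotlib.pyplot as plt
import numpy as np

# Invented data standing in for test.score_1 and test.score_2
rng = np.random.default_rng(0)
score_1 = rng.normal(60, 10, 200)
score_2 = rng.normal(70, 8, 200)

colors = ['tan', 'lime']
labels = ['Score 1', 'Score 2']

fig, ax = plt.subplots()
ax.hist([score_1, score_2], bins=10, histtype='bar', color=colors, label=labels)
legend = ax.legend()  # without this call the labels are stored but never shown
```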

How to plot the legend for a set of data with different color labels in Matplotlib

I have a 1:1 plot in which the dot colours differ based on a condition (A-F) that comes from a single data frame column.
df is a data frame with data for every 1 min. df60 is a data frame with data for every 1 hour.
plt.figure()
colors = {'A':'green', 'B':'aqua', 'C':'blue','D':'black','E':'yellow','F':'red'}
x = df['Method1'].loc['2020-01-01 00:00':'2020-01-15 23:59'].resample('h').mean()
y = df['Method2'].loc['2020-01-01 00:00':'2020-01-15 23:59'].resample('h').mean()
plt.scatter(x, y, c=df60['Method1'].loc['2020-01-01 00:00':'2020-01-15 23:59'].map(colors))
plt.show()
I have tried to plot a legend showing which colour corresponds to which condition (A-F). However, since the data come from the same column, the legend does not show what I expect. Is there a way to show the legend properly without splitting the column into several columns?
You can define the legend manually by, for instance:
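from matplotlib.lines import Line2D
# One proxy marker per condition, coloured to match the scatter points
handles = [Line2D([0], [0], label=k, marker="o", markerfacecolor=v, markeredgecolor=v, linestyle="None") for k, v in colors.items()]
plt.legend(handles=handles)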
I hope this helps. Not really sure if there is a more elegant solution, though...
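An alternative that may feel a little more natural, as a sketch only: group the rows by condition and call scatter once per group with label set, so the legend is built from the plotted data itself. The column is still a single column; it is only split temporarily inside the loop. The frame and its column names ('x', 'y', 'condition') are invented for illustration and are not the asker's df or df60.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line in a notebook
import matplotlib.pyplot as plt
import pandas as pd

colors = {'A': 'green', 'B': 'aqua', 'C': 'blue', 'D': 'black', 'E': 'yellow', 'F': 'red'}

# Invented hourly data: two measurements plus one condition label per row
hourly = pd.DataFrame({
    "x": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    "y": [1.1, 1.9, 3.2, 3.8, 5.1, 6.3],
    "condition": ["A", "B", "A", "C", "F", "B"],
})

fig, ax = plt.subplots()
# One scatter call per condition, so each gets its own legend entry
for condition, group in hourly.groupby("condition"):
    ax.scatter(group["x"], group["y"], color=colors[condition], label=condition)
legend = ax.legend(title="Condition")
```

Only conditions that actually occur in the data appear in the legend here, whereas the manual handles approach lists every key of colors.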

How to use column values for x axis labels in matplotlib

I have a basic DataFrame in pandas and am using matplotlib to create a chart.
I have followed advice found on SO and in the docs for labelling the values on the x-axis, but the labels won't change from the indices.
I have this,
Presc_df_asc = Presc_df.sort_values('Total Items',ascending=True)
Presc_df_asc['Total Items'].plot.bar(x="Practice", ylim=[Presc_df_asc['Total Items'].min(), Presc_df_asc['Total Items'].max()])
plt.xlabel('Practice')
plt.ylabel('Total Items')
plt.title('practice total items')
plt.legend(('Items',),loc='upper center')
From what I have found, plot.bar(x="Practice") should set the x-axis to show the values in the Practice column under each bar.
But no matter what I try, the x-axis ticks are labelled with the indices, and only the main axis label says Practice.
In order for the plotting command to be able to access the "Practice" column, you need to apply the plot function to the entire DataFrame (or a sub-DataFrame that contains at least these two columns). Selecting Presc_df_asc['Total Items'] first yields a Series, which no longer carries the "Practice" column. The code below places the corresponding Practice label below each bar. The rot=0 argument prevents the labels from being rotated by 90°.
Presc_df_asc.plot.bar(x="Practice", y="Total Items",
                      ylim=[Presc_df_asc['Total Items'].min(),
                            Presc_df_asc['Total Items'].max()],
                      rot=0)
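If you would rather keep plotting the single column, another option (a sketch, not the only way) is to make "Practice" the index first, since a Series bar plot takes its tick labels from the index. The practice names and item counts below are invented for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line in a notebook
import pandas as pd

# Invented data standing in for Presc_df
Presc_df = pd.DataFrame({"Practice": ["P1", "P2", "P3"],
                         "Total Items": [120, 80, 150]})
Presc_df_asc = Presc_df.sort_values("Total Items", ascending=True)

# With Practice as the index, the Series plot uses it for the x tick labels
ax = Presc_df_asc.set_index("Practice")["Total Items"].plot.bar(rot=0)
tick_labels = [t.get_text() for t in ax.get_xticklabels()]
```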

python plot how to adjust a lengthy legend [duplicate]

I have a data file which consists of 131 columns and 4 rows. I am plotting it in Python as follows:
df = pd.read_csv('data.csv')
df.plot(figsize = (15,10))
Once it is plotted, all 131 legend entries pile up like a huge tower over the line plots.
Please see the image of what I got:
Link to image (clipped after v82 for readability)
I have found some solutions on Stack Overflow (SO) for moving the legend anywhere in the plot, but I could not find one that breaks this legend tower into several smaller pieces stacked side by side.
I would like my plot to look something like this:
My desired plot
Any help would be appreciated. Thank you.
You can specify the position of the legend in relative coordinates using loc, and use the ncol parameter to split the single legend column into multiple columns. To do so, you need the Axes object returned by df.plot:
df = pd.read_csv('data.csv')
ax = df.plot(figsize = (10,7))
ax.legend(loc=(1.01, 0.01), ncol=4)
plt.tight_layout()
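For anyone without the original data.csv, here is a self-contained variation to experiment with. The frame is an invented stand-in (4 rows, 131 columns named v1 to v131), and the particular choices of bbox_to_anchor, ncol=6 and a small font are assumptions to tune, not recommended values.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line in a notebook
import numpy as np
import pandas as pd

# Invented stand-in for data.csv: 4 rows, 131 columns named v1..v131
df = pd.DataFrame(np.arange(4 * 131).reshape(4, 131),
                  columns=[f"v{i}" for i in range(1, 132)])

ax = df.plot(figsize=(15, 10))
# Anchor the legend's upper-left corner just outside the right edge of the axes
# and spread the 131 entries over 6 columns instead of one tall column
legend = ax.legend(loc="upper left", bbox_to_anchor=(1.01, 1.0),
                   ncol=6, fontsize="small")
```

Because the legend sits outside the axes, passing bbox_inches="tight" to savefig is one way to keep it from being cut off in the saved image.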

python matplotlib: add labels and set their colours correctly for a stackplot

I am creating a stacked plot. I understand, from experimenting myself and from researching online, that adding labels to a stacked plot is messy, but I have managed to pull it off with the code below.
My question is: how do I retrieve the color cycle used to create the stacked plot, so that I can assign the right colours to the legend?
Right now field 1 is blueish, field 2 greenish, but both labels appear in the first colour. I can force specific colours to both the plot and the legends, but I quite like the default colour cycle and would like to keep using it.
df=pd.DataFrame(np.ones((10,2)),columns=['field1','field2'])
fig,ax=plt.subplots()
plt.suptitle('This is a plot of ABCDEF')
ax.stackplot(df.index,df.field1,df.field2)
patch1=matplotlib.patches.Patch(color='red',label= 'field 1')
patch2=matplotlib.patches.Patch(color='blue', label ='field 2')
plt.legend(handles=[patch1,patch2])
The closest to a solution I have found is: Get matplotlib color cycle state but, if I understand correctly, the order of the colours is not preserved. The problem is that
ax._get_lines.color_cycle
returns an iterator, not a list, so I can't easily do something like
colour of patch 1 = ax._get_lines.color_cycle[0]
Thanks!
You can get the colors from the PolyCollection objects returned by stackplot:
import matplotlib as mpl
fields = ax.stackplot(df.index, df.field1, df.field2)
# One PolyCollection per stacked series; take its first face colour
colors = [field.get_facecolor()[0] for field in fields]
patch1 = mpl.patches.Patch(color=colors[0], label='field 1')
patch2 = mpl.patches.Patch(color=colors[1], label='field 2')
plt.legend(handles=[patch1, patch2])
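It may also be possible to skip the proxy patches entirely: stackplot accepts a labels argument, and a plain legend() call then picks up both the labels and the default cycle colours. A minimal sketch using the same toy frame as in the question:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line in a notebook
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

df = pd.DataFrame(np.ones((10, 2)), columns=['field1', 'field2'])

fig, ax = plt.subplots()
fig.suptitle('This is a plot of ABCDEF')
# labels are attached to the stacked areas in the order the series are passed
ax.stackplot(df.index, df.field1, df.field2, labels=['field 1', 'field 2'])
legend = ax.legend()
```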
