Overplot the mean line in Python

I'd like to get two lines (red and green): the average of my green data points in green and the average of my red data points in red. I'm using the following code, but it's not working. It only shows the red and green data points, without the red average line.
sns.set(rc={"figure.figsize": (16, 8)})
ax = events_all_metrics[["event_name","kambi_payback"]].plot(x="event_name", style='.',use_index=False, color ='green')
events_all_metrics[["event_name","pinny_payback"]].plot(x="event_name",style='.', color='red', ax=ax)
plt.tick_params(
    axis='x',          # changes apply to the x-axis
    which='both',      # both major and minor ticks are affected
    bottom='off',      # ticks along the bottom edge are off
    top='off',         # ticks along the top edge are off
    labelbottom='off')
plt.legend(loc=4, prop={'size': 15})
pinny_mean = events_all_metrics["pinny_payback"].mean()
ax.plot(pinny_mean, label='Pinny Mean', linestyle='--', color='red')
plt.show()

This is not working because pinny_mean is a single y value, while plot needs a sequence of points with both x and y coordinates. A single point drawn with a line style and no marker produces nothing visible. In this case I recommend you use plt.axhline instead of plot. It draws a line at constant y that spans the whole x range. For your example:
plt.axhline(y=pinny_mean, label='Pinny Mean', linestyle='--', color='red')
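
To get both mean lines, the same call can be repeated for each series. The following is a minimal sketch rather than the asker's actual setup: the payback values are made up, plain NumPy arrays stand in for the DataFrame columns, and the Agg backend is selected only so the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend (assumption: no display available)
import matplotlib.pyplot as plt
import numpy as np

# Made-up values standing in for the kambi_payback / pinny_payback columns.
kambi_payback = np.array([0.94, 0.95, 0.93, 0.96])
pinny_payback = np.array([0.97, 0.98, 0.96, 0.97])
x = np.arange(len(kambi_payback))

fig, ax = plt.subplots(figsize=(8, 4))
ax.plot(x, kambi_payback, ".", color="green", label="kambi_payback")
ax.plot(x, pinny_payback, ".", color="red", label="pinny_payback")

# One horizontal line per series, each at that series' mean.
kambi_line = ax.axhline(y=kambi_payback.mean(), linestyle="--",
                        color="green", label="Kambi Mean")
pinny_line = ax.axhline(y=pinny_payback.mean(), linestyle="--",
                        color="red", label="Pinny Mean")

# Build the legend after all artists exist so the mean lines are listed too.
ax.legend(loc=4)
fig.canvas.draw()
```

Note that the legend is created after the mean lines are drawn. A legend built earlier only lists the artists that exist at that moment.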

Related

Bar plot with multi level X axis and Twin axis in matplotlib

I want to create a bar plot with a two-level X axis and a twin axis as well. The plot I managed to get is shown below.
I want to make two changes to that plot:
Remove the Y labels from the middle Y axes
Have the bars side by side rather than stacked one above the other.
My code is as below:
stresses = df_rootg_multilevel.index.levels[0] # Get the stress categories (3 in this case)
nplots = stresses.size # Number of subplots (3 in this case, one per stress)
plots_width_ratios = [df_rootg_multilevel.xs(stress).index.size for stress in stresses] # Relative width of each plot: 4 in this case, as there are 4 strains in each stress
fig, axes = plt.subplots(nrows=1, ncols=nplots,sharey=True, figsize=(10, 4),
                         gridspec_kw = dict(width_ratios=plots_width_ratios, wspace=0))
alpha = 0.3 # used for grid lines, bottom spine and separation lines between zones
for stress, ax in zip(stresses, axes):
    ax1=ax.twinx()
    #df_rootg_multilevel.xs(stress).plot.bar(ax=ax, legend=None, zorder=2)
    # Create bar chart with grid lines and no spines except bottom one
    df_rootg_multilevel.xs(stress)['rl_mean'].plot.bar(ax=ax, legend=None, zorder=2,color='red')
    df_rootg_multilevel.xs(stress)['rfw_mean'].plot.bar(ax=ax1, legend=None, zorder=2,color='green')
    #df_rootg_multilevel.xs(stress)['rdw_mean'].plot.bar(ax=ax, legend=None, zorder=2,color='blue')
    ax.grid(axis='y', zorder=1, color='black', alpha=alpha)
    for spine in ['top', 'left', 'right']:
        ax.spines[spine].set_visible(False)
    ax.spines['bottom'].set_alpha(alpha)
    # Set and place x labels for factory zones
    ax.set_xlabel(stress)
    ax.xaxis.set_label_coords(x=0.5, y=-0.2)
    # Format major tick labels for factory names: note that because this figure is
    # only about 10 inches wide, I choose to rewrite the long names on two lines.
    ticklabels = [name.replace(' ', '\n') if len(name) > 10 else name
                  for name in df_rootg_multilevel.xs(stress).index]
    ax.set_xticklabels(ticklabels, rotation=0, ha='center')
    ax.tick_params(axis='both', length=0, pad=7)
    # Set and format minor tick marks for separation lines between zones: note
    # that except for the first subplot, only the right tick mark is drawn to avoid
    # duplicate overlapping lines so that when an alpha different from 1 is chosen
    # (like in this example) all the lines look the same
    if ax.is_first_col():
        ax.set_xticks([*ax.get_xlim()], minor=True)
    else:
        ax.set_xticks([ax.get_xlim()[1]], minor=True)
    ax.tick_params(which='minor', length=55, width=0.8, color=[0, 0, 0, alpha])
    #ax1.get_yaxis().set_visible(False)
# Add legend using the labels and handles from the last subplot
fig.legend(*ax.get_legend_handles_labels(), frameon=False, loc=(0.08, 0.77))
My dataframe looks like below:
How can I make the required modifications?

Giving distinctive color for each curve in my plot

I was wondering whether I can give a distinct color or marker to each curve in my plot. The plot has 20 different curves, and it is hard to tell them apart because some of them share the same color. Is there a way to control the color of each curve, or to give each curve a different marker, so that they become distinguishable?
Here is my plotting code:
plt.figure(figsize=(15,15))
for cluster_index in [0,1,2]:
    plt.subplot(3,1,cluster_index + 1)
    for index, row in data1.iterrows():
        if row.iloc[-1] == cluster_index:
            plt.plot(row.iloc[1:-1] ,marker='v', alpha=1)
            plt.legend(loc="right")
    plt.plot(kmeans.cluster_centers_[cluster_index], color='k' ,marker='o', alpha=1)
    ax = plt.gca()
    ax.tick_params(axis = 'x', which = 'major', labelsize = 8)
    plt.xticks(rotation='vertical')
    plt.ylabel('Consumption in Winter-2019')
    plt.title(f'Cluster {cluster_index}', fontsize=20)
    plt.tight_layout()
plt.show()
plt.close()
And here is a picture of the specific output whose curve colors I want to change:
The output for the second cluster has 20 curves, and I want to give them 20 different markers or control their colors somehow so that they are easier to tell apart. Any help would be appreciated.
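
One way to approach this, as a rough sketch: take 20 colors from a qualitative colormap such as matplotlib's "tab20" and cycle through a list of markers, so each curve gets its own color and marker combination. The random `curves` array below is a hypothetical stand-in for the rows of data1, and the Agg backend is only there so the snippet runs without a display.

```python
import itertools

import matplotlib
matplotlib.use("Agg")  # non-interactive backend (assumption: no display available)
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical data: 20 curves with 12 points each.
rng = np.random.default_rng(0)
curves = rng.random((20, 12)).cumsum(axis=1)

cmap = plt.get_cmap("tab20")  # qualitative colormap with 20 distinct colors
markers = itertools.cycle(["o", "v", "^", "s", "D", "P", "X", "*", "<", ">"])

fig, ax = plt.subplots(figsize=(10, 5))
for i, (curve, marker) in enumerate(zip(curves, markers)):
    # Integer index i selects the i-th color of the colormap.
    ax.plot(curve, color=cmap(i), marker=marker, label=f"curve {i}")

# Put the legend outside the axes so it does not cover the curves.
ax.legend(loc="center left", bbox_to_anchor=(1.0, 0.5), ncol=2, fontsize=8)
fig.canvas.draw()
```

In the original loop, the same idea would mean passing `color=` and `marker=` to the inner `plt.plot` call, with a counter that advances once per curve plotted in the cluster.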

Plotting multiple horizontal lines for each distribution in strip plot subplots Matplotlib

I'm trying to plot the average calculated values as a line through the center of each plotted distribution for my data set.
My code looks like this:
for plot, var in zip(range(1, plot_num+1), var_list):
    ax = fig.add_subplot(2, 2, plot)
    # calculate averages
    sns.stripplot(x=cluster_index_sample[cluster_type], y=cluster_index_sample[var],
                  jitter=jitter, linewidth=line_width, alpha=alpha, cmap=RS_colorwheel,
                  size=marker_size, ax=ax)
    # create average lines
    ax.axhline(y=cluster_index_sample['Average_'+(var)].iloc[0],
               linewidth=3, xmin=0.2, xmax=0.5)
    ax.set_ylabel(str(var), fontsize=y_lab)
    ax.set_xlabel('')
    ax.tick_params(axis='both', which='major', pad=10)
But when I plot this the horizontal lines only appear once per cluster_type (x-axis category).
How can I get it so that each set of numbered categorical values gets their own respective averages?
Since you did not provide an MCVE, I can't run your code. Nevertheless, you can try adding a second for loop that iterates through all the variables and plots a horizontal average line for each one, as follows. You will also have to modify xmin and xmax for each line. I leave that up to you.
for plot, var in zip(range(1, plot_num+1), var_list):
    ax = fig.add_subplot(2, 2, plot)
    sns.stripplot(x=cluster_index_sample[cluster_type], y=cluster_index_sample[var],
                  jitter=jitter, linewidth=line_width, alpha=alpha, cmap=RS_colorwheel,
                  size=marker_size, ax=ax)
    for v in var_list: # <--- Added here
        ax.axhline(y=cluster_index_sample['Average_'+(v)].iloc[0],
                   linewidth=3, xmin=0.2, xmax=0.5) # <--- Added here
    ax.set_ylabel(str(var), fontsize=y_lab)
    ax.set_xlabel('')
    ax.tick_params(axis='both', which='major', pad=10)
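
On the xmin and xmax adjustment: axhline takes these as fractions of the axes width (0 to 1), which makes it awkward to center a line on one category. An alternative sketch, not the answerer's code, is ax.hlines, which takes xmin and xmax in data coordinates; a strip plot places its categories at integer positions 0, 1, 2, and so on. The example below uses a plain matplotlib scatter with manual jitter instead of sns.stripplot, and the cluster labels, values, and `half_width` are made up for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend (assumption: no display available)
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(1)
categories = [0, 1, 2]  # hypothetical cluster labels
values = {c: rng.normal(loc=10 * (c + 1), scale=1.0, size=30) for c in categories}

fig, ax = plt.subplots()
means = []
for pos, c in enumerate(categories):
    # Jittered points for category c, centered on x = pos (like a strip plot).
    jitter = rng.uniform(-0.1, 0.1, size=values[c].size)
    ax.scatter(pos + jitter, values[c], alpha=0.5)
    means.append(values[c].mean())

# One short horizontal segment per category, centered on its x position.
half_width = 0.3
positions = np.arange(len(categories))
mean_lines = ax.hlines(y=means, xmin=positions - half_width,
                       xmax=positions + half_width, linewidth=3, color="black")

ax.set_xticks(positions)
ax.set_xticklabels(categories)
fig.canvas.draw()
```

Each segment is drawn at the mean of its own category, so every distribution gets its own average line rather than one line per subplot.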

Setting the X Axes Limit in Matplotlib 1.4.3

I am trying to zoom in on a section of my plot. I used the following code to produce the high level plot below.
fig = poll_df.plot('Start Date', 'Difference',figsize=(12,4),marker='o',linestyle='-',color='purple')
# Now add the debate markers
plt.axvline(x=403+2, linewidth=4, color='grey')
plt.axvline(x=403+10, linewidth=4, color ='grey')
plt.axvline(x=403+21, linewidth=4, color='grey')
plt.show()
The vertical grey bars are in the right locations and I want to zoom in on the plot (basically to show the month of October). I modified the plot to add the xlim parameters as below.
fig = poll_df.plot('Start Date', 'Difference',figsize=(12,4), marker='o',linestyle='-',color='purple',xlim=(403,433))
# Now add the debate markers
plt.axvline(x=403+2, linewidth=4, color='grey')
plt.axvline(x=403+10, linewidth=4, color ='grey')
plt.axvline(x=403+21, linewidth=4, color='grey')
plt.show()
However, this gives me a totally different plot (see below). I have tried all sorts of variations and still can't seem to get it to work. It looks as if the vertical bars would be in the right places if the axis labels reflected the month of October.
Why did the plot not rescale the x labels?

Major and minor grid lines and ticks using matplotlib

I have two big integers
min=round(raw_min,-5) # a negative number
max=round(raw_max,-5)
from which I get a range of interesting ticks:
xticks=np.arange(min,max,500000)
On the x-axis, I want to have minor ticks (including labels) for the xticks range. Furthermore, I want to have a major tick and grid line at the value 0. I tried to add:
minorLocator = FixedLocator(xticks)
majorLocator = FixedLocator([0])
ax.xaxis.set_major_locator(majorLocator)
ax.xaxis.set_major_formatter(FormatStrFormatter('%d'))
ax.xaxis.set_minor_locator(minorLocator)
plt.tick_params(which='both', width=1)
plt.tick_params(which='major', length=7, color='b')
plt.tick_params(which='minor', length=4, color='r')
ax.yaxis.grid(True)
ax.xaxis.grid(b=True,which='major', color='b', linestyle='-')
but it doesn't work...
No ticks for the minors and no grid line for the major.
Any ideas?
Seems like I was missing the following line:
plt.grid(b=True,which='both')
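
For reference, a self-contained sketch of the same setup follows. It rests on a few assumptions: the bounds `lo` and `hi` are made-up stand-ins for the rounded min and max, the minor tick at 0 is dropped because the major tick already sits there, and the grid is switched on by passing True positionally, since the `b` keyword was renamed to `visible` in newer matplotlib releases. Minor tick labels only show up once a minor formatter is set, because the default minor formatter prints nothing.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend (assumption: no display available)
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.ticker import FixedLocator, FormatStrFormatter

lo, hi = -1_500_000, 2_000_000  # hypothetical rounded bounds
xticks = np.arange(lo, hi + 1, 500_000)
minor_ticks = xticks[xticks != 0]  # 0 is reserved for the major tick

fig, ax = plt.subplots(figsize=(10, 3))
ax.plot([lo, hi], [0, 1])

# Single major tick at 0, minor ticks everywhere else in the range.
ax.xaxis.set_major_locator(FixedLocator([0]))
ax.xaxis.set_major_formatter(FormatStrFormatter("%d"))
ax.xaxis.set_minor_locator(FixedLocator(minor_ticks))
ax.xaxis.set_minor_formatter(FormatStrFormatter("%d"))  # label the minor ticks

ax.tick_params(which="major", length=7, color="b")
ax.tick_params(which="minor", length=4, color="r", labelsize=7)

# Grid lines: a solid blue line at the major tick, faint lines at minor ticks.
ax.grid(True, which="major", axis="x", color="b", linestyle="-")
ax.grid(True, which="minor", axis="x", alpha=0.3)
ax.yaxis.grid(True)
fig.canvas.draw()
```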
