I'm using Matplotlib to create 2 side-by-side horizontal bar charts showing regression coefficient importance across several words. I'd like to label the y-axes with each word in the list.
Every other word is appended to the y-axis when I try this:
# plot word importance bar graphs
fig, axes = plt.subplots(1,2,figsize=(5,10))
plt.subplots_adjust(wspace = 1)
axes[0].set_title('Low revenue')
axes[0].invert_yaxis()
axes[0].barh(np.arange(len(lowrev_topten)), lowrev_topten['Coefficient'])
axes[0].set_yticklabels(list(lowrev_topten['Word']))
axes[0].set_xlabel('Coefficient')
axes[1].set_title('High revenue')
axes[1].invert_yaxis()
axes[1].barh(np.arange(len(highrev_topten)), highrev_topten['Coefficient'])
axes[1].set_yticklabels(list(highrev_topten['Word']))
axes[1].set_xlabel('Coefficient')
However, when I remind it that I'd like to have 10 ticks for 10 words (plt.yticks(np.arange(0,10))), it fixes the second subplot:
# plot word importance bar graphs
fig, axes = plt.subplots(1,2,figsize=(5,10))
plt.subplots_adjust(wspace = 1)
plt.yticks(np.arange(0,10))
axes[0].set_title('Low revenue')
axes[0].invert_yaxis()
axes[0].barh(np.arange(len(lowrev_topten)), lowrev_topten['Coefficient'])
axes[0].set_yticklabels(list(lowrev_topten['Word']))
axes[0].set_xlabel('Coefficient')
axes[1].set_title('High revenue')
axes[1].invert_yaxis()
axes[1].barh(np.arange(len(highrev_topten)), highrev_topten['Coefficient'])
axes[1].set_yticklabels(list(highrev_topten['Word']))
axes[1].set_xlabel('Coefficient')
How do I get both subplots to have the proper y-tick labels?
Seems like you just need to set_yticks for each subplot.
fig, axes = plt.subplots(1,2,figsize=(5,10))
...
axes[0].set_yticks(np.arange(0,10))
axes[1].set_yticks(np.arange(0,10))
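As a fuller sketch of that fix: plt.yticks only acts on the current axes (here the last subplot created), which would explain why only the second chart was corrected. The word lists and coefficients below are made-up stand-ins for lowrev_topten and highrev_topten, and the Agg backend is only assumed so the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend; drop this for interactive use
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical stand-ins for lowrev_topten / highrev_topten
words_low = [f"low{i}" for i in range(10)]
words_high = [f"high{i}" for i in range(10)]
coefs_low = np.linspace(-1.0, -0.1, 10)
coefs_high = np.linspace(1.0, 0.1, 10)

fig, axes = plt.subplots(1, 2, figsize=(5, 10))
fig.subplots_adjust(wspace=1)

panels = [
    (axes[0], "Low revenue", words_low, coefs_low),
    (axes[1], "High revenue", words_high, coefs_high),
]
for ax, title, words, coefs in panels:
    positions = np.arange(len(words))
    ax.barh(positions, coefs)
    ax.set_yticks(positions)    # one tick per bar, set before the labels
    ax.set_yticklabels(words)   # now every label has a tick to attach to
    ax.invert_yaxis()
    ax.set_title(title)
    ax.set_xlabel("Coefficient")

fig.canvas.draw()
```

Setting the ticks per axes keeps the two subplots independent of whichever one happens to be current.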
Related
I have the following code, which almost does what I need it to do. I am graphing the importance of each feature for two different models on the same graph for comparison. I can't seem to get them to show side by side as two separate bars. I am fairly new to Python and brand new to this forum. Here is the code:
def plot_importances1(model1, feature_names1, label1, model2=None, feature_names2=None, label2=None):
    if model2 is None:
        importances1 = model1.feature_importances_
        indices1 = np.argsort(importances1)
        plt.figure(figsize=(8, 8))  # Set figure size
        # plot the first list of feature importances as a horizontal bar chart
        plt.barh(range(len(indices1)), importances1[indices1], color="violet", align="center", label=label1)
        # set the y-axis tick labels to be the feature names
        plt.yticks(range(len(indices1)), [feature_names1[i] for i in indices1])
    else:
        importances1 = model1.feature_importances_
        indices1 = np.argsort(importances1)
        importances2 = model2.feature_importances_
        indices2 = np.argsort(importances2)
        plt.figure(figsize=(8, 8))  # Set figure size
        # plot the first list of feature importances as a horizontal bar chart
        plt.barh(range(len(indices1)), importances1[indices1], color="violet", align="center", label=label1)
        # plot the second list of feature importances as a horizontal bar chart
        plt.barh(range(len(indices2)), importances2[indices2], color="orange", align="center", label=label2)
        # set the y-axis tick labels to be the feature names
        plt.yticks(range(len(indices1)), [feature_names1[i] for i in indices1])
    # add a title and x- and y-axis labels
    plt.title("Feature Importances")
    plt.xlabel("Relative Importance")
    plt.ylabel("Feature")
    # add a legend to the plot
    plt.legend()
    # set the tick locations and labels for the first bar graph
    plt.gca().tick_params(axis='x', which='both', length=0)
    plt.gca().xaxis.set_ticks_position('top')
    plt.gca().xaxis.set_label_position('top')
    # set the tick locations and labels for the second bar graph
    plt.twinx()
    plt.gca().tick_params(axis='x', which='both', length=0)
    plt.gca().xaxis.set_ticks_position('bottom')
    plt.gca().xaxis.set_label_position('bottom')
    plt.show()
Then I call the function:
plot_importances1(
    dTree_treat_out,
    list(X1_train),
    "Outliers present",
    dTree,
    list(X_train),
    "No outliers",
)
The two bars are both showing, but I can't get them to separate completely and I am getting this error:
[Image: output for the code]
I have run several versions of this, including one that does not return the Matplotlib error. The problem with the other function definitions is that the bars are stacked and I can't see both of them. Is there a way to make one less opaque? I am super stuck. I do not want them stacked: I need the first one to be its own graph with the second one NEXT to it, not overlaid or stacked on top. It should look similar to the image I uploaded, but the bars need to be completely separated.
Any input to fix this issue will be greatly appreciated.
I'm attempting to plot a few subplots. The issue that I'm running into is in labeling the x-axis for each plot since they're all different.
The variables relHazardRate and relHazardFICO are dataframes of size 50 x 2.
When I attempt to plot the below, I'm unable to show the x-axis tick marks (relHazardRate is a variable ranging from 3% to 6%, and relHazardFICO is a variable ranging from 300 to 850). Each figure in the subplot will have its own x-axis/ticker (there are 8 such plots), and I have provided my logic for 2 of them below.
fig, ((ax1, ax2), (ax3, ax4), (ax5, ax6), (ax7, ax8)) = plt.subplots(4, 2,figsize=(12,8))
ax1.plot(relHazardRate['orig_coupon'],relHazardRate['Hazard Multiplier']);
ax1.title.set_text('Original Interest Rate');
ax1.set_xticks(range(len(relHazardRate['orig_coupon'])));
ax1.set_xticklabels(relHazardRate['orig_coupon'].to_list())
ax2.plot(relHazardFICO['orig_FICO'],relHazardFICO['Hazard Multiplier'], 'tab:orange');
ax2.title.set_text('Original FICO');
ax2.set_xticks(range(len(relHazardRate['orig_FICO'])));
ax2.set_xticklabels(relHazardRate['orig_FICO'].to_list())
ax3 through ax8 follow a similar declaration to the one described above.
for ax in fig.get_axes():
    ax.label_outer()
The subplot that I get is as follows. I want to label each plot with its own x-axis; as shown, this is not happening.
Remove the lines with label_outer.
From the docs:
label_outer()
Only show "outer" labels and tick labels.
x-labels are only kept for subplots on the last row; y-labels only for subplots on the first column
Clearly this is what causes the behaviour you see in your plot.
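A minimal sketch of the same layout without label_outer, so each subplot keeps its own x tick labels. The arrays are made-up stand-ins for relHazardRate and relHazardFICO, only two of the eight panels are shown, tick locations are left at Matplotlib's defaults, and the Agg backend is assumed only so the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend; drop this for interactive use
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical data standing in for the two 50 x 2 dataframes
rate = np.linspace(3.0, 6.0, 50)
fico = np.linspace(300, 850, 50)
multiplier = np.linspace(0.5, 1.5, 50)

fig, (ax1, ax2) = plt.subplots(2, 1, figsize=(6, 6))

ax1.plot(rate, multiplier)
ax1.set_title("Original Interest Rate")

ax2.plot(fico, multiplier, "tab:orange")
ax2.set_title("Original FICO")

# No label_outer() loop here: every subplot keeps its own x tick labels.
fig.tight_layout()
fig.canvas.draw()
```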
I have a data frame that relates each major to a salary.
I am trying to create horizontal bar charts of the majors sorted by salary.
My code looks like this:
fig, ax = plt.subplots()
topTenMajor = df[['Major','Salary']].sort_values('Salary', ascending=False).set_index('Major')
topTenMajor.sort_values('Salary', ascending=True).plot.barh(figsize=(5,10))
ax.set_title('Majors by Salary')
ax.set_xlabel('Salary')
ax.set_ylabel('Majors')
However, my chart shows one empty plot on top with the title, x label and y label,
and then a horizontal bar chart under the empty plot without a title or labels.
Why is this happening?
Thanks for any help!
plot.barh on a DataFrame will plot in a new figure / axes by default.
Either you need to tell it to plot on the fig, ax you created before (by passing ax=ax),
or you can set the title and labels on the active figure that it creates automatically:
topTenMajor = df[['Major','Salary']].sort_values('Salary', ascending=False).set_index('Major')
topTenMajor.sort_values('Salary', ascending=True).plot.barh(figsize=(5,10))
plt.title('Majors by Salary')
plt.xlabel('Salary')
plt.ylabel('Majors')
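The first option (plotting on the axes you already created) could look roughly like this. The small data frame is a made-up stand-in for df, and the Agg backend is assumed only so the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend; drop this for interactive use
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical data standing in for df
df = pd.DataFrame({"Major": ["A", "B", "C"], "Salary": [50000, 70000, 60000]})

fig, ax = plt.subplots(figsize=(5, 10))
by_salary = df[["Major", "Salary"]].set_index("Major").sort_values("Salary", ascending=True)

# ax=ax makes pandas draw on the existing axes instead of opening a new figure
by_salary.plot.barh(ax=ax)

ax.set_title("Majors by Salary")
ax.set_xlabel("Salary")
ax.set_ylabel("Majors")
```

With this variant the figure size is set once in plt.subplots, and the title and labels end up on the same axes as the bars.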
I have a dataframe with ~120 features that I would like to examine by year. I am plotting each feature, x = year, y = feature value within a loop. Whilst these plot successfully, the charts are illegible as they are totally squashed.
I have tried using plt.tight_layout() and adjusting the figure size using plt.rcParams['figure.figsize'], but sadly to no avail.
for i in range(len(roll_df.columns)):
    plt.subplot(len(roll_df.columns), 1, i+1)
    name = roll_df.columns[i]
    plt.plot(roll_df[name])
    plt.title(name, y=0)
    plt.yticks([])
    plt.xticks([])
plt.tight_layout()
plt.show()
The loop runs but all plots are so squashed on the y-axis as to become illegible:
Matplotlib will not automatically adjust the size of your figure. So if you add more subplots below each other, it will split the available space instead of extending the figure. That's why your y axes are so narrow.
You could try to define the figure size beforehand, or determine the figure size based on how many subplots you have:
n_plots = roll_df.shape[1]
fig, axes = plt.subplots(n_plots, 1, figsize=(8, 4 * n_plots), tight_layout=True)
# Then your usual part, but plot on the created axes
for i in range(n_plots):
    name = roll_df.columns[i]
    axes[i].plot(roll_df[name])
    # Axes objects use set_* methods rather than the pyplot function names
    axes[i].set_title(name, y=0)
    axes[i].set_yticks([])
    axes[i].set_xticks([])
plt.show()
I'm looking to add vertical space between plotted graphs so that an x-axis label can show.
Each graph needs space to show the day. Currently the last 2 graphs are the only ones that show it, simply because the other graphs overlap their labels.
I'm also curious whether I could remove the x-axis tick labels for the graphs above the ones marked Thursday/Friday, i.e. so that only the bottom x-axis shows them. The same goes for the y-axis, with only the graphs on the left showing the scale.
*Unfortunately I can't post an image to show this since I don't have enough rep.
Code snippet:
import matplotlib.pyplot as pyplot
fig = pyplot.figure()
ax1 = fig.add_subplot(4,2,1)
ax1.set_yscale('log')
ax2 = fig.add_subplot(4,2,2, sharex=ax1, sharey=ax1)
ax3 = fig.add_subplot(4,2,3, sharex=ax2, sharey=ax2)
ax4 = fig.add_subplot(4,2,4, sharex=ax3, sharey=ax3)
ax5 = fig.add_subplot(4,2,5, sharex=ax4, sharey=ax4)
ax6 = fig.add_subplot(4,2,6, sharex=ax5, sharey=ax5)
ax7 = fig.add_subplot(4,2,7, sharex=ax6, sharey=ax6)
ax1.plot(no_dict["Saturday"],'k.-',label='Saturday')
ax1.set_xlabel('Saturday')
ax1.axis([0,24,0,10000])
pyplot.suptitle('Title')
pyplot.xlabel('Hour in 24 Hour Format')
ax2.plot(no_dict["Sunday"],'b.-',label='Sunday')
ax2.set_xlabel('Sunday')
...
Use subplots_adjust. In your case this looks good:
fig.subplots_adjust(hspace=.5)
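To remove the tick labels, do this:
ax1.set_xticklabels([])
The same works for the y tick labels with set_yticklabels. However, this will not work if the x-axis is shared with plots that should keep their tick labels, because the labels are removed on every axes that shares it.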
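One alternative that should work with shared axes is tick_params, which hides the labels on a single axes without touching the others. The sketch below assumes the same 4 x 2 grid with 7 subplots as in the question, leaves out the data, and assumes the Agg backend only so the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend; drop this for interactive use
import matplotlib.pyplot as plt

fig = plt.figure(figsize=(8, 10))
ax1 = fig.add_subplot(4, 2, 1)
axes = [ax1] + [
    fig.add_subplot(4, 2, i, sharex=ax1, sharey=ax1) for i in range(2, 8)
]
fig.subplots_adjust(hspace=0.5)

for index, ax in enumerate(axes, start=1):
    if index not in (6, 7):
        # subplots 6 and 7 are the bottom of each column; hide x labels elsewhere
        ax.tick_params(labelbottom=False)
    if index % 2 == 0:
        # even-numbered subplots sit in the right column; hide their y labels
        ax.tick_params(labelleft=False)

fig.canvas.draw()
```

Only the label visibility changes here; the tick marks and the shared limits stay the same on every subplot.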
To change the spacing around a certain subplot, instead of all of them, you can adjust the position of the axes of that subplot using:
bbox=plt.gca().get_position()
offset=-.03
plt.gca().set_position([bbox.x0, bbox.y0 + offset, bbox.x1-bbox.x0, bbox.y1 - bbox.y0])
If offset < 0, the subplot is moved down. If offset > 0, the subplot is moved up.
Note that the subplot will disappear if offset is so big that the new position of the subplot overlaps with another subplot.