Matplotlib - overlaying line chart on bar chart and aligning yticks - python

I'm trying to plot a line chart over a bar chart, but neither the ticks nor the actual locations of the points line up. I would like them to be aligned. (Note: I'm going to plot another set of data the same way, but mirrored, on the other side, hence the subplots.)
Here's what I have so far:
import matplotlib.pyplot as plt
import numpy as np
group = [0, 5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55]
amount1 = [967, 975, 1149, 1022, 852, 975, 1025, 1134, 994, 1057, 647, 1058]
amount2 = [286, 364, 111, 372, 333, 456, 258, 152, 400, 181, 221, 441]
f, (ax1, ax2) = plt.subplots(nrows = 1, ncols = 2, sharey=True, figsize = (17,8))
ax1_2 = ax1.twinx()
# y_pos
y_pos = np.arange(len(group))
# plot men
ax1.barh(y_pos, amount1, align = 'center')
ax1_2.plot(amount2, group, color = 'black', marker = 'o')
# ticks
ax1.set_yticks(y_pos)
ax1.set_yticklabels(group)
ax1.invert_xaxis()
ax1.yaxis.tick_right()
# padding
plt.subplots_adjust(left=None, bottom=None, right=None, top=None, wspace=0.05, hspace=None)
plt.show()
plt.close()
I've tried setting the ticks, but the bar graph and line graph seem to have very different notions of that. I've also tried graphing both on ax1, but then the line graph goes way beyond the bar graph and they don't line up at all. I've also tried ax1_2.set_yticks(ax1.get_yticks()) but this has a similar problem.
Any help would be appreciated!

You can plot both on ax1 and drop y_pos, because both series share the group values as their y coordinates.
Then give the barh plot an explicit height, since the bar height is now measured in the units of group.
Here is the code:
import matplotlib.pyplot as plt
group = [0, 5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55]
amount1 = [967, 975, 1149, 1022, 852, 975, 1025, 1134, 994, 1057, 647, 1058]
amount2 = [286, 364, 111, 372, 333, 456, 258, 152, 400, 181, 221, 441]
f, (ax1, ax2) = plt.subplots(nrows=1, ncols=2, sharey=True, figsize=(17, 8))
# plot men
a = ax1.barh(group, amount1, height=4)
ax1.plot(amount2, group, color='black', marker='o')
# ticks
ax1.set_yticks(group)
ax1.set_yticklabels(group)
ax1.invert_xaxis()
ax1.yaxis.tick_right()
# padding
plt.subplots_adjust(left=None, bottom=None, right=None, top=None, wspace=0.05,
hspace=None)
plt.show()
plt.close()
[Figure: the resulting plot, with the line drawn over the bars on ax1.]

The main problem is that the y limits of the two axes aren't aligned. The y values of the barh plot run 0, 1, 2, 3, ... up to 11, while the y values of the line plot run from 0 to 55 in steps of 5. To align them, you could just do ax1_2.set_ylim([y * 5 for y in ax1.get_ylim()]).
An alternative would be to also use y_pos for the line graph. Then the limits can simply be copied: ax1_2.set_ylim(ax1.get_ylim()).
Here is the sample code, with the second graph left out for simplicity:
import matplotlib.pyplot as plt
import numpy as np
group = [0, 5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55]
amount1 = [967, 975, 1149, 1022, 852, 975, 1025, 1134, 994, 1057, 647, 1058]
amount2 = [286, 364, 111, 372, 333, 456, 258, 152, 400, 181, 221, 441]
f, ax1 = plt.subplots()
ax1_2 = ax1.twinx()
# y_pos
y_pos = np.arange(len(group))
# plot men
ax1.barh(y_pos, amount1, align='center', color='turquoise')
ax1_2.plot(amount2, group, color='crimson', marker='o')
ax1_2.set_ylim([y * 5 for y in ax1.get_ylim()])
# ticks
ax1.set_yticks(y_pos)
ax1.set_yticklabels(group)
ax1.yaxis.tick_right()
ax1.invert_xaxis()
plt.show()
The plot now shows the 0, 10, 20 ticks darker, as they also come from ax1_2. Just call ax1_2.set_yticks([]) to remove those.
PS: Yet another way is to forget about y_pos and use group for the y-axis of ax1 as well. Then the height of the bars needs to be adapted, e.g. to 4.5, as it is now measured in the units of group.
In code:
ax1.barh(group, amount1, align='center', color='turquoise', height=4.5)
ax1_2.plot(amount2, group, color='crimson', marker='o')
ax1_2.set_ylim(ax1.get_ylim()) # no need for labels as the ticks have the same value
ax1.set_yticks(group)
ax1_2.set_yticks([])
ax1.yaxis.tick_right()
ax1.invert_xaxis()
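The alternative mentioned above (plotting the line against y_pos as well, then copying the limits) is not spelled out in code. A minimal sketch, assuming the same data and a single subplot as in the sample above, might look like this. The Agg backend line is only there so the sketch runs without a display; drop it for interactive use.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend (assumption: no display available)
import matplotlib.pyplot as plt
import numpy as np

group = [0, 5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55]
amount1 = [967, 975, 1149, 1022, 852, 975, 1025, 1134, 994, 1057, 647, 1058]
amount2 = [286, 364, 111, 372, 333, 456, 258, 152, 400, 181, 221, 441]

fig, ax1 = plt.subplots()
ax1_2 = ax1.twinx()

# Both plots use the same integer positions 0..11 on their y axes
y_pos = np.arange(len(group))
ax1.barh(y_pos, amount1, align='center', color='turquoise')
ax1_2.plot(amount2, y_pos, color='crimson', marker='o')

# Label the bar positions with the group values and hide the twin's ticks
ax1.set_yticks(y_pos)
ax1.set_yticklabels(group)
ax1_2.set_yticks([])
ax1.yaxis.tick_right()
ax1.invert_xaxis()

# Same units on both axes, so the limits can be copied directly
ax1_2.set_ylim(ax1.get_ylim())
```

Because both y axes are in the same units here, no scaling factor is needed when copying the limits.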

Related

Python: Image Background on a plot

I have this plot, in which I can adapt the curve as I want. My problem is that I need to draw it on top of an image, and I don't know how to put the two together.
import matplotlib.pyplot as plt
from matplotlib.patches import Polygon
#theta = np.arange(0, 2*np.pi, 0.1)
#r = 1.5
#xs = r*np.cos(theta)
#ys = r*np.sin(theta)
xs = (921, 951, 993, 1035, 1065, 1045, 993, 945)
ys = (1181, 1230, 1243, 1230, 1181, 1130, 1130, 1130)
poly = Polygon(list(zip(xs, ys)), animated=True)
fig, ax = plt.subplots()
ax.add_patch(poly)
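# PolygonInteractor is not part of matplotlib's API; it is presumably the class from the gallery's Poly Editor example and must be defined first
p = PolygonInteractor(ax, poly, visible=False)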
ax.set_title('Click and drag a point to move it')
ax.set_xlim((800, 1300))
ax.set_ylim((1000, 1300))
plt.show()
Try calling ax.imshow before drawing the polygon, like this:
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.patches import Polygon
from scipy import datasets  # scipy.misc.face was removed in recent SciPy releases; datasets.face needs the pooch package
xs = (21, 51, 93, 135, 100, 90, 21, 10)
ys = (111, 130, 143, 230, 501, 530, 530, 513)
poly = Polygon(list(zip(xs, ys)), color='r')
fig, ax = plt.subplots()
ax.imshow(datasets.face(), origin='lower')
ax.add_patch(poly)
# ax.set_xlim([0,2000])
# ax.set_ylim([0,2000])
plt.show()  # blocks until the window is closed; fig.show() may return immediately in a script
By the way, your xlim and ylim are also off. The image spans roughly y = 0 to 700, but your polygon sits at y = 1000 to 1300. You need at least ax.set_ylim([0, 1400]) for the image and the polygon to be shown together.
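To make the point about limits concrete, here is a minimal sketch using the polygon coordinates from the question. The image is a placeholder gradient (an assumption, standing in for the real picture) that is large enough to contain the polygon, and the Agg backend line is only there so the sketch runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend (assumption: no display available)
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.patches import Polygon

# Placeholder image (assumption): a 1400x1400 horizontal gradient
img = np.tile(np.linspace(0, 1, 1400), (1400, 1))

xs = (921, 951, 993, 1035, 1065, 1045, 993, 945)
ys = (1181, 1230, 1243, 1230, 1181, 1130, 1130, 1130)
poly = Polygon(list(zip(xs, ys)), facecolor='none', edgecolor='r')

fig, ax = plt.subplots()
im = ax.imshow(img, origin='lower', cmap='gray')  # image first
ax.add_patch(poly)                                 # polygon on top

# Zoom to the region of interest; it lies inside the image extent
ax.set_xlim(800, 1300)
ax.set_ylim(1000, 1300)
```

The limits from the question work here only because the placeholder image covers that region. With a smaller image, the view would show the polygon over an empty background.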

3D data contour plotting using a KDE

I have two arrays of positional data (X, Y) and a corresponding 1D array of integers (Z) that weights the positional data. So my data set looks like this:
X = [ 507, 1100, 1105, 1080, 378, 398, 373]
Y = [1047, 838, 821, 838, 644, 644, 659]
Z = [ 300, 55, 15, 15, 55, 15, 15]
I want to use that data to create a KDE that is equivalent to one that takes only X and Y as input but receives each (X, Y) pair Z times. I then want to apply that KDE to an np.mgrid to create a contour plot.
I already got it working by iterating over the arrays in a for loop and adding each X and Y Z times, but that looks like a rather inelegant solution, and I hope you can help me find a better way of doing this.
You could use the weights= parameter of scipy.stats.gaussian_kde:
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import axes3d
import numpy as np
from scipy import stats
X = [ 507, 1100, 1105, 1080, 378, 398, 373]
Y = [1047, 838, 821, 838, 644, 644, 659]
Z = [ 300, 55, 15, 15, 55, 15, 15]
kernel = stats.gaussian_kde(np.array([X, Y]), weights=Z)
fig = plt.figure()
ax = fig.add_subplot(111, projection="3d")
xs, ys = np.mgrid[0:1500:30j, 0:1500:30j]
zs = kernel(np.array([xs.ravel(), ys.ravel()])).reshape(xs.shape)
ax.plot_surface(xs, ys, zs, cmap="hot_r", lw=0.5, rstride=1, cstride=1, ec='k')
plt.show()
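If the goal is to reproduce the for-loop result exactly, np.repeat can expand the points without an explicit loop. The sketch below builds both estimators side by side. As far as I can tell, the two need not be numerically identical, because gaussian_kde derives its default bandwidth from an effective sample size when weights are given, so treat the comparison as illustrative only.

```python
import numpy as np
from scipy import stats

X = [507, 1100, 1105, 1080, 378, 398, 373]
Y = [1047, 838, 821, 838, 644, 644, 659]
Z = [300, 55, 15, 15, 55, 15, 15]

# Repeat each (x, y) column Z[i] times, replacing the explicit for loop
points = np.repeat(np.array([X, Y]), Z, axis=1)

kernel_repeated = stats.gaussian_kde(points)
kernel_weighted = stats.gaussian_kde(np.array([X, Y]), weights=Z)

# Evaluate both estimators on the same grid
xs, ys = np.mgrid[0:1500:30j, 0:1500:30j]
grid = np.array([xs.ravel(), ys.ravel()])
zs_repeated = kernel_repeated(grid).reshape(xs.shape)
zs_weighted = kernel_weighted(grid).reshape(xs.shape)
```

Either zs array can be passed to plt.contour(xs, ys, zs) or plot_surface as in the answer above.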

Why can't I smooth this curve with a B-spline in Python?

I've tried several different methods, but my curve isn't smoothed the way it is for others. Here is my code and image.
from scipy.interpolate import splrep, splev
import matplotlib.pyplot as plt
list_x = [296, 297, 425, 460, 510, 532, 597, 601, 602, 611]
list_y = [2, 12, 67, 15, 21, 2037, 1995, 9, 39, 3]
bspl = splrep(list_x,list_y)
bspl_y = splev(list_x,bspl)
plt.figure()
plt.plot(list_x, bspl_y)
plt.xticks(fontsize = 10)
plt.yticks(fontsize = 10)
plt.show()
You don't see the interpolation, because you give matplotlib the same 10 data points for the interpolated curve that you use for your original data presentation. We have to create a higher resolution curve:
from scipy.interpolate import splrep, splev
import matplotlib.pyplot as plt
import numpy as np
list_x = [296, 297, 425, 460, 510, 521, 597, 601, 602, 611]
list_y = [2, 12, 67, 15, 21, 2037, 1995, 9, 39, 3]
bspl = splrep(list_x,list_y, s=0)
#values for the x axis
x_smooth = np.linspace(min(list_x), max(list_x), 1000)
#get y values from interpolated curve
bspl_y = splev(x_smooth, bspl)
plt.figure()
#original data points
plt.plot(list_x, list_y, 'rx-')
#and interpolated curve
plt.plot(x_smooth, bspl_y, 'b')
plt.xticks(fontsize = 10)
plt.yticks(fontsize = 10)
plt.show()
The resulting plot shows the original data points (red crosses) with the interpolated curve (blue) drawn through them.
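A quick numerical check of the explanation above, as a sketch: with s=0 the spline interpolates, so evaluating it at the original x values just returns the original y values, which is why plotting those ten points alone looks unsmoothed. The variable names y_at_points and y_dense are illustrative.

```python
import numpy as np
from scipy.interpolate import splrep, splev

list_x = [296, 297, 425, 460, 510, 521, 597, 601, 602, 611]
list_y = [2, 12, 67, 15, 21, 2037, 1995, 9, 39, 3]

bspl = splrep(list_x, list_y, s=0)

# Evaluating at the original x values reproduces the input data
y_at_points = splev(list_x, bspl)

# Evaluating on a dense grid is what reveals the curve between the points
x_dense = np.linspace(min(list_x), max(list_x), 1000)
y_dense = splev(x_dense, bspl)
```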

Mean line on top of bar plot with pandas and matplotlib

I'm trying to plot a Pandas DataFrame, and add a line to show the mean and median. As you can see below, I'm adding a red line for the mean, but it doesn't show.
If I try to draw a green line at 5, it shows at x=190. So apparently the x values are treated as 0, 1, 2, ... rather than 160, 165, 170, ...
How can I draw lines so that their x values match those of the x axis?
Full code (run in Jupyter):
%matplotlib inline
from pandas import Series
import matplotlib.pyplot as plt
heights = Series(
[165, 170, 195, 190, 170,
170, 185, 160, 170, 165,
185, 195, 185, 195, 200,
195, 185, 180, 185, 195],
name='Heights'
)
freq = heights.value_counts().sort_index()
freq_frame = freq.to_frame()
mean = heights.mean()
median = heights.median()
freq_frame.plot.bar(legend=False)
plt.xlabel('Height (cm)')
plt.ylabel('Count')
plt.axvline(mean, color='r', linestyle='--')
plt.axvline(5, color='g', linestyle='--')
plt.show()
Use plt.bar(freq_frame.index, freq_frame['Heights']) to plot your bar plot. Then the bars will be at the freq_frame.index positions. As far as I can tell, pandas' built-in bar function does not allow specifying the positions of the bars.
%matplotlib inline
from pandas import Series
import matplotlib.pyplot as plt
heights = Series(
[165, 170, 195, 190, 170,
170, 185, 160, 170, 165,
185, 195, 185, 195, 200,
195, 185, 180, 185, 195],
name='Heights'
)
freq = heights.value_counts().sort_index()
freq_frame = freq.to_frame()
mean = heights.mean()
median = heights.median()
# freq.values avoids depending on the column name, which is 'count' rather than 'Heights' in pandas >= 2.0
plt.bar(freq.index, freq.values,
width=3, align='center')
plt.xlabel('Height (cm)')
plt.ylabel('Count')
plt.axvline(mean, color='r', linestyle='--')
plt.axvline(median, color='g', linestyle='--')
plt.show()
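To see why the original axvline calls landed in unexpected places, it can help to inspect where pandas actually puts the bars. The sketch below reads the bar centers back from the Axes; the name positions is illustrative, and the Agg backend line is only there so the sketch runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend (assumption: no display available)
import matplotlib.pyplot as plt
from pandas import Series

heights = Series(
    [165, 170, 195, 190, 170,
     170, 185, 160, 170, 165,
     185, 195, 185, 195, 200,
     195, 185, 180, 185, 195],
    name='Heights'
)
freq = heights.value_counts().sort_index()

# The pandas bar plot treats the index as categories
ax = freq.plot.bar()

# Center of each bar in data coordinates
positions = [p.get_x() + p.get_width() / 2 for p in ax.patches]
print(positions)
print(list(freq.index))
```

The bar centers come out as consecutive integers rather than the height values, so axvline(5) lands on the sixth bar, whichever height that bar happens to represent.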

Multiple Broken Axis On A Histogram in Matplotlib

So I've got some data which I wish to plot via a frequency density (unequal class width) histogram, and via some searching online, I've created this to allow me to do this.
import numpy as np
import matplotlib.pyplot as plt
plt.xkcd()
freqs = np.array([3221, 1890, 866, 529, 434, 494, 382, 92, 32, 7, 7])
bins = np.array([0, 5, 10, 15, 20, 30, 50, 100, 200, 500, 1000, 1500])
widths = bins[1:] - bins[:-1]
heights = freqs.astype(float)/widths  # np.float was removed in NumPy 1.24; the builtin float is equivalent here
plt.xlabel('Cost in Pounds')
plt.ylabel('Frequency Density')
plt.fill_between(bins.repeat(2)[1:-1], heights.repeat(2), facecolor='steelblue')
plt.show()
As you may see, however, the data stretches into the thousands on the x axis, and on the y axis (density) it ranges from tiny values (<1) to very large ones (>100). To handle this I will need to break both axes. The closest thing to help I've found so far is this, which I've found hard to use. Would you be able to help?
Thanks, Aj.
You could just use a bar plot, setting the xtick labels to represent the bin ranges.
With a logarithmic y scale:
import numpy as np
import matplotlib.pyplot as plt
plt.xkcd()
fig, ax = plt.subplots()
freqs = np.array([3221, 1890, 866, 529, 434, 494, 382, 92, 32, 7, 7])
freqs = np.log10(freqs)
bins = np.array([0, 5, 10, 15, 20, 30, 50, 100, 200, 500, 1000, 1500])
width = 0.35
ind = np.arange(len(freqs))
rects1 = ax.bar(ind, freqs, width)
plt.xlabel('Cost in Pounds')
plt.ylabel('Frequency Density')
tick_labels = [ '{0} - {1}'.format(*bin) for bin in zip(bins[:-1], bins[1:])]
ax.set_xticks(ind)  # bars are centered on ind by default in Matplotlib 2.0+, so no width offset is needed
ax.set_xticklabels(tick_labels)
fig.autofmt_xdate()
plt.show()
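A different approach from the answer above, offered only as a sketch: keep the frequency density from the question, draw the bars at the real bin edges with their real widths, and let Matplotlib apply the log scale with set_yscale instead of transforming the data with np.log10. The Agg backend line is only there so the sketch runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend (assumption: no display available)
import matplotlib.pyplot as plt
import numpy as np

freqs = np.array([3221, 1890, 866, 529, 434, 494, 382, 92, 32, 7, 7])
bins = np.array([0, 5, 10, 15, 20, 30, 50, 100, 200, 500, 1000, 1500])

# Frequency density = frequency / class width
widths = bins[1:] - bins[:-1]
heights = freqs.astype(float) / widths

fig, ax = plt.subplots()
# align='edge' starts each bar at its left bin edge
bars = ax.bar(bins[:-1], heights, width=widths, align='edge',
              color='steelblue', edgecolor='k')
ax.set_yscale('log')  # the axis is log-scaled, the data stays untouched
ax.set_xlabel('Cost in Pounds')
ax.set_ylabel('Frequency Density')
```

The x axis stays linear in this sketch, so the narrow low-cost bins remain thin. Whether that is acceptable depends on what the plot needs to show.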
