Multiple Broken Axis On A Histogram in Matplotlib - python

So I've got some data which I wish to plot via a frequency density (unequal class width) histogram, and via some searching online, I've created this to allow me to do this.
import numpy as np
import matplotlib.pyplot as plt
plt.xkcd()
freqs = np.array([3221, 1890, 866, 529, 434, 494, 382, 92, 32, 7, 7])
bins = np.array([0, 5, 10, 15, 20, 30, 50, 100, 200, 500, 1000, 1500])
widths = bins[1:] - bins[:-1]
heights = freqs.astype(float)/widths  # np.float was removed from recent NumPy releases; the builtin float is equivalent
plt.xlabel('Cost in Pounds')
plt.ylabel('Frequency Density')
plt.fill_between(bins.repeat(2)[1:-1], heights.repeat(2), facecolor='steelblue')
plt.show()
As you can see, however, the data stretches into the thousands on the x axis, and on the y axis (density) it ranges from tiny values (<1) to very large ones (>100). To solve this I will need to break both axes. The closest help I've found so far is this, which I've found hard to use. Would you be able to help?
Thanks, Aj.

You could just use a bar plot, setting the xtick labels to represent the bin values.
With logarithmic y scale
import numpy as np
import matplotlib.pyplot as plt
plt.xkcd()
fig, ax = plt.subplots()
freqs = np.array([3221, 1890, 866, 529, 434, 494, 382, 92, 32, 7, 7])
freqs = np.log10(freqs)
bins = np.array([0, 5, 10, 15, 20, 30, 50, 100, 200, 500, 1000, 1500])
width = 0.35
ind = np.arange(len(freqs))
rects1 = ax.bar(ind, freqs, width)
plt.xlabel('Cost in Pounds')
plt.ylabel('Frequency Density')
tick_labels = [ '{0} - {1}'.format(*bin) for bin in zip(bins[:-1], bins[1:])]
ax.set_xticks(ind+width)
ax.set_xticklabels(tick_labels)
fig.autofmt_xdate()
plt.show()
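If you would rather keep the actual frequency density on the y axis instead of the log of the raw counts, one option is to compute the densities as in the question and let matplotlib apply the logarithmic scale itself. This is only a minimal sketch: the bars are drawn at equal widths as in the answer above, the variable name density is my own, and the Agg backend is an assumption so that the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend; drop this line and call plt.show() for interactive use
import matplotlib.pyplot as plt
import numpy as np

freqs = np.array([3221, 1890, 866, 529, 434, 494, 382, 92, 32, 7, 7])
bins = np.array([0, 5, 10, 15, 20, 30, 50, 100, 200, 500, 1000, 1500])

widths = np.diff(bins)      # class widths, same as bins[1:] - bins[:-1]
density = freqs / widths    # frequency density (true division gives floats)

fig, ax = plt.subplots()
ind = np.arange(len(density))
ax.bar(ind, density, width=0.8)
ax.set_yscale("log")        # the axis is logarithmic, the data stays untouched

# Label each equal-width bar with the bin range it stands for
ax.set_xticks(ind)
ax.set_xticklabels(
    [f"{lo} - {hi}" for lo, hi in zip(bins[:-1], bins[1:])],
    rotation=45, ha="right",
)
ax.set_xlabel("Cost in Pounds")
ax.set_ylabel("Frequency Density")
fig.tight_layout()
```

The tick labels on the y axis then show real density values, at the cost of the bars no longer having widths proportional to the class widths.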

Related

Connecting a non-linear axis in matplotlib with spatial coordinates

I am hoping to graph data that looks something like:
import matplotlib.pyplot as plt
x = [0, 350, 40, 55, 60]
y = [0, 20, 40, 10, 20]
plt.scatter(x,y);
Gives something like this:
However, I would like to change this so the axes run from 180 to 360 and then from 0 to 180, all in the same figure. Essentially I want to connect 360 to 0 in the center of the figure.
There might be something creative you can do with matplotlib.units, but I often find that interface to be quite clunky.
I'm not 100% certain of the result you want, but from your description it sounds like you want a plot in cartesian coordinates with an x-axis that goes from 180 → 360 → 180. Unfortunately this is not directly doable with a single Axes in matplotlib (without playing around with the units above).
Thankfully, you can stitch together 2 plots to get the desired result:
import matplotlib.pyplot as plt
x = [0, 350, 40, 55, 60]
y = [0, 20, 40, 10, 20]
fig, (ax1, ax2) = plt.subplots(1, 2, sharey=True, gridspec_kw={"wspace": 0})
ax1.scatter(x, y, clip_on=False)
ax2.scatter(x, y, clip_on=False)
ax1.set_xlim(180, 360)
ax1.set_xticks([180, 240, 300, 360])
ax1.spines["right"].set_visible(False)
ax2.set_xlim(0, 180)
ax2.set_xticks([60, 120, 180])
ax2.yaxis.set_visible(False)
ax2.spines["left"].set_visible(False)
plt.show()
The trick for the above is that I actually plotted all of the data twice (.scatter(...)), laid those plots out next to each other ({'wspace': 0}) and then limited their data view (.set_xlim) to make it appear as a seamless plot that goes from 180 → 360 → 180.
You may also be asking for a plot not in cartesian coordinates, but in polar coordinates. In that case you can use the following code:
import matplotlib.pyplot as plt
from numpy import deg2rad
x = [0, 350, 40, 55, 60]
y = [0, 20, 40, 10, 20]
fig, ax = plt.subplots(subplot_kw={"projection": "polar"})
ax.scatter(deg2rad(x), y)
ax.set_yticks([0, 20, 40, 60])
plt.show()
Most people would plot that as -180 to 180?
import matplotlib.pyplot as plt
import numpy as np
fig, ax = plt.subplots(2, 1)
x = np.arange(0, 360, 10)
y = x * 1
y[x>180] = y[x>180] - 360
ax[0].scatter(x, np.abs(y), c=x)
ax[1].scatter(y, np.abs(y), c=x)
plt.show()
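The remapping in the snippet above can also be written as a small helper using modular arithmetic. This is just a sketch; wrap_to_180 is a name I made up, not a NumPy or matplotlib function. Note one difference from the snippet above: this version maps exactly 180 to -180, because the target interval is half-open.

```python
import numpy as np

def wrap_to_180(deg):
    """Map angles in degrees onto the half-open interval [-180, 180)."""
    deg = np.asarray(deg, dtype=float)
    return (deg + 180.0) % 360.0 - 180.0

x = [0, 350, 40, 55, 60]     # the x values from the question
wrapped = wrap_to_180(x)     # 350 becomes -10, the rest are unchanged
print(wrapped)
```

Plotting wrapped instead of x on a single Axes puts 350 right next to 0, which may be all that is needed if a -180 to 180 axis is acceptable.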

Matplotlib - overlaying line chart on bar chart and aligning yticks

I'm trying to plot a line chart over a bar chart, but both the ticks and the actual locations of the points aren't aligned. I would like them to be aligned. (Just a note I'm going to be plotting another set of data similarly (but reversed) on the other side, hence the subplots.)
Here's what I have so far
import matplotlib.pyplot as plt
import numpy as np
group = [0, 5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55]
amount1 = [967, 975, 1149, 1022, 852, 975, 1025, 1134, 994, 1057, 647, 1058]
amount2 = [286, 364, 111, 372, 333, 456, 258, 152, 400, 181, 221, 441]
f, (ax1, ax2) = plt.subplots(nrows = 1, ncols = 2, sharey=True, figsize = (17,8))
ax1_2 = ax1.twinx()
# y_pos
y_pos = np.arange(len(group))
# plot men
ax1.barh(y_pos, amount1, align = 'center')
ax1_2.plot(amount2, group, color = 'black', marker = 'o')
# ticks
ax1.set_yticks(y_pos)
ax1.set_yticklabels(group)
ax1.invert_xaxis()
ax1.yaxis.tick_right()
# padding
plt.subplots_adjust(left=None, bottom=None, right=None, top=None, wspace=0.05, hspace=None)
plt.show()
plt.close()
I've tried setting the ticks, but the bar graph and line graph seem to have very different notions of that. I've also tried graphing both on ax1, but then the line graph goes way beyond the bar graph and they don't line up at all. I've also tried ax1_2.set_yticks(ax1.get_yticks()) but this has a similar problem.
Any help would be appreciated!
You can plot both on ax1 and drop y_pos, because both plots can use the group variable as the y coordinate.
Then set a height for the barh plot, since the bar heights are now measured in the units of group.
Here is the code:
import matplotlib.pyplot as plt
group = [0, 5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55]
amount1 = [967, 975, 1149, 1022, 852, 975, 1025, 1134, 994, 1057, 647, 1058]
amount2 = [286, 364, 111, 372, 333, 456, 258, 152, 400, 181, 221, 441]
f, (ax1, ax2) = plt.subplots(nrows=1, ncols=2, sharey=True, figsize=(17, 8))
# plot men
a = ax1.barh(group, amount1, height=4)
ax1.plot(amount2, group, color='black', marker='o')
# ticks
ax1.set_yticks(group)
ax1.set_yticklabels(group)
ax1.invert_xaxis()
ax1.yaxis.tick_right()
# padding
plt.subplots_adjust(left=None, bottom=None, right=None, top=None, wspace=0.05,
hspace=None)
plt.show()
plt.close()
And an image with the result:
The main problem is that the ylims of both axes aren't aligned. The y of the barh plot goes like 0, 1, 2, 3 till 11. The y of the line plot goes from 0 to 55 in steps of 5. To align them, you could just do ax1_2.set_ylim([y * 5 for y in ax1.get_ylim()]).
An alternative would be to also use y_pos for the line graph. Then the limits could simply be copied: ax1_2.set_ylim(ax1.get_ylim()).
Here is the sample code, with the second graph left out for simplicity:
import matplotlib.pyplot as plt
import numpy as np
group = [0, 5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55]
amount1 = [967, 975, 1149, 1022, 852, 975, 1025, 1134, 994, 1057, 647, 1058]
amount2 = [286, 364, 111, 372, 333, 456, 258, 152, 400, 181, 221, 441]
f, ax1 = plt.subplots()
ax1_2 = ax1.twinx()
# y_pos
y_pos = np.arange(len(group))
# plot men
ax1.barh(y_pos, amount1, align='center', color='turquoise')
ax1_2.plot(amount2, group, color='crimson', marker='o')
ax1_2.set_ylim([y * 5 for y in ax1.get_ylim()])
# ticks
ax1.set_yticks(y_pos)
ax1.set_yticklabels(group)
ax1.yaxis.tick_right()
ax1.invert_xaxis()
plt.show()
The plot now has the 0, 10, 20 ticks darker as they come from ax1_2. Just call ax1_2.set_yticks([]) to remove those.
PS: Yet another way is to forget about y_pos and use group for the y-axis of ax1 as well. Then the height of the bars needs to be adapted, e.g. to 4.5, as it is now measured in the units of group.
In code:
ax1.barh(group, amount1, align='center', color='turquoise', height=4.5)
ax1_2.plot(amount2, group, color='crimson', marker='o')
ax1_2.set_ylim(ax1.get_ylim()) # no need for labels as the ticks have the same value
ax1.set_yticks(group)
ax1_2.set_yticks([])
ax1.yaxis.tick_right()
ax1.invert_xaxis()
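If you want to convince yourself that the two y-axes really line up, you can compare where each Axes puts the same data value on screen. The following is a rough check, not part of the original answers: it assumes the Agg backend so it runs without a display, and the names points, bar_px, line_px and aligned are my own.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend; drop this line for interactive use
import matplotlib.pyplot as plt
import numpy as np

group = [0, 5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55]
amount1 = [967, 975, 1149, 1022, 852, 975, 1025, 1134, 994, 1057, 647, 1058]
amount2 = [286, 364, 111, 372, 333, 456, 258, 152, 400, 181, 221, 441]

fig, ax1 = plt.subplots()
ax1_2 = ax1.twinx()
ax1.barh(group, amount1, height=4.5, color="turquoise")
ax1_2.plot(amount2, group, color="crimson", marker="o")
ax1_2.set_ylim(ax1.get_ylim())   # copy the limits so both y-axes use the same scale
fig.canvas.draw()                # make sure limits and layout are final

# Display (pixel) y-coordinate of every group value on each Axes
points = np.column_stack([np.zeros(len(group)), group])
bar_px = ax1.transData.transform(points)[:, 1]
line_px = ax1_2.transData.transform(points)[:, 1]

aligned = np.allclose(bar_px, line_px)
print(aligned)
```

If the set_ylim line is removed, the two Axes autoscale independently and the pixel positions will generally differ, which is the misalignment described in the question.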

Is it possible to change the frequency of ticks on a pyplot INDEPENDENT of length of data set and zoom?

When I plot data using matplotlib I always have 5-9 ticks on my x-axis independent of the range I plot, and if I zoom on the x-axis the tick spacing decreases, so I still see 5-9 ticks.
However, I would like 20-30 ticks on my x-axis!
I can achieve this with the following:
from matplotlib import pyplot as plt
import numpy as np
x = [5, 10, 15, 20, 25, 30, 35, 40, 45, 50]
y = [1, 4, 3, 2, 7, 6, 9, 8, 10, 5]
number_of_ticks_on_x_axis = 20
plt.plot(x, y)
plt.xticks(np.arange(min(x), max(x)+1, (max(x) - min(x))/number_of_ticks_on_x_axis))
plt.show()
If I now zoom on the x-axis, no new ticks appear between the existing ones. I would like to still have ~20 ticks however much I zoom.
Assuming that you want to fix the no. of ticks on the X axis
...
from matplotlib.ticker import MaxNLocator
...
fig, ax = plt.subplots()
ax.xaxis.set_major_locator(MaxNLocator(15, min_n_ticks=15))
...
Please look at the docs for MaxNLocator
Example
In [36]: import numpy as np
...: import matplotlib.pyplot as plt
In [37]: from matplotlib.ticker import MaxNLocator
In [38]: fig, ax = plt.subplots(figsize=(10,4))
In [39]: ax.grid()
In [40]: ax.xaxis.set_major_locator(MaxNLocator(min_n_ticks=15))
In [41]: x = np.linspace(0, 1, 51)
In [42]: y = x*(1-x)
In [43]: plt.plot(x, y)
Out[43]: [<matplotlib.lines.Line2D at 0x7f9eab409e10>]
gives
and when I zoom into the maximum of the curve I get
You can link a callback function to an event in the canvas. In your case you can trigger a function that updates the axis when a redraw occurs.
from matplotlib import pyplot as plt
import numpy as np
x = [5, 10, 15, 20, 25, 30, 35, 40, 45, 50]
y = [1, 4, 3, 2, 7, 6, 9, 8, 10, 5]
n = 20
plt.plot(x, y)
plt.xticks(np.arange(min(x), max(x)+1, (max(x) - min(x))/n), rotation=90)
def on_zoom(event):
    ax = plt.gca()
    fig = plt.gcf()
    x_min, x_max = ax.get_xlim()
    ax.set_xticks(np.linspace(x_min, x_max, n))
    # had to add flush_events to get the ticks to redraw on the last update.
    fig.canvas.flush_events()
fig = plt.gcf()
# connect the callback; keep cid so it can be removed later with fig.canvas.mpl_disconnect(cid)
cid = fig.canvas.mpl_connect('draw_event', on_zoom)
plt.show()
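A variation on the same idea, offered only as a sketch: instead of reacting to every redraw, you can listen for the Axes' own xlim_changed callback, which fires whenever the x limits change. This is a different hook from the draw_event used above, and retick is a helper name I made up. The Agg backend line is an assumption so the snippet runs without a display; the set_xlim call stands in for an interactive zoom.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend; drop this line and call plt.show() for interactive use
import matplotlib.pyplot as plt
import numpy as np

x = [5, 10, 15, 20, 25, 30, 35, 40, 45, 50]
y = [1, 4, 3, 2, 7, 6, 9, 8, 10, 5]
n = 20  # desired number of ticks

fig, ax = plt.subplots()
ax.plot(x, y)

def retick(axes):
    """Place n evenly spaced ticks across the current x-range."""
    lo, hi = axes.get_xlim()
    axes.set_xticks(np.linspace(lo, hi, n))

ax.callbacks.connect("xlim_changed", retick)  # called with the Axes whenever its x limits change
retick(ax)             # set the initial ticks
ax.set_xlim(10, 20)    # stands in for a zoom; triggers retick
ticks = ax.get_xticks()
```

The evenly spaced ticks will usually land on awkward decimal values, so you may want to rotate or round the labels; MaxNLocator picks nicer numbers but does not give an exact tick count.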

Why I can't smooth this curve by B-spline in python?

I have tried several different methods, but why can't my curve be smoothed the way others' are? Here is my code and image.
from scipy.interpolate import splrep, splev
import matplotlib.pyplot as plt
list_x = [296, 297, 425, 460, 510, 532, 597, 601, 602, 611]
list_y = [2, 12, 67, 15, 21, 2037, 1995, 9, 39, 3]
bspl = splrep(list_x,list_y)
bspl_y = splev(list_x,bspl)
plt.figure()
plt.plot(list_x, bspl_y)
plt.xticks(fontsize = 10)
plt.yticks(fontsize = 10)
plt.show()
You don't see the interpolation, because you give matplotlib the same 10 data points for the interpolated curve that you use for your original data presentation. We have to create a higher resolution curve:
from scipy.interpolate import splrep, splev
import matplotlib.pyplot as plt
import numpy as np
list_x = [296, 297, 425, 460, 510, 521, 597, 601, 602, 611]
list_y = [2, 12, 67, 15, 21, 2037, 1995, 9, 39, 3]
bspl = splrep(list_x,list_y, s=0)
#values for the x axis
x_smooth = np.linspace(min(list_x), max(list_x), 1000)
#get y values from interpolated curve
bspl_y = splev(x_smooth, bspl)
plt.figure()
#original data points
plt.plot(list_x, list_y, 'rx-')
#and interpolated curve
plt.plot(x_smooth, bspl_y, 'b')
plt.xticks(fontsize = 10)
plt.yticks(fontsize = 10)
plt.show()
And this is the output we get:
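One thing worth keeping in mind with s=0: splrep then builds an interpolating spline, so the curve is forced through every data point rather than being smoothed in the least-squares sense. The following sketch checks this numerically using the data from the answer above; the names tck, x_dense, y_at_data and passes_through are my own.

```python
import numpy as np
from scipy.interpolate import splrep, splev

list_x = [296, 297, 425, 460, 510, 521, 597, 601, 602, 611]
list_y = [2, 12, 67, 15, 21, 2037, 1995, 9, 39, 3]

# s=0 requests interpolation: no smoothing, the spline hits every point
tck = splrep(list_x, list_y, s=0)

# Evaluate on a dense grid to get a curve that looks smooth when plotted
x_dense = np.linspace(min(list_x), max(list_x), 1000)
y_dense = splev(x_dense, tck)

# Evaluating at the original x values should reproduce the original y values
y_at_data = splev(list_x, tck)
passes_through = np.allclose(y_at_data, list_y, atol=1e-6)
print(passes_through)
```

Because the data jumps sharply (from 21 to 2037 and back down), an interpolating cubic spline may overshoot between the points. A positive s trades exactness at the data points for a smoother curve.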

Add a string to an x-axis of integers

I'm plotting a graph of solution concentration (x) against efficiency (y). I have this set up to display x between 0 and 100, but I want to add another data point as a control, without any solution at all. I'm having issues as this doesn't really fit anywhere on the concentration axis, but I'd like to add it either before 0 or after 100, potentially with a break in the axis to separate them. So my x-axis would look like ['control', 0, 20, 40, 60, 80, 100]
MWE:
x_array = ['control', 0, 20, 40, 50, 100]
y_array = [1, 2, 3, 4, 5, 6]
plt.plot(x_array, y_array)
Trying this, I get an error of:
ValueError: could not convert string to float: 'control'
Any ideas how I could make something like this work? I've looked at xticks, but that would plot the x axis as strings, losing the continuity of the axis, which would mess up the plot because the data points are not equidistant.
You can add a single point to your graph as a separate call to plot, then adjust the x-axis labels.
import matplotlib.pyplot as plt
x_array = [0, 20, 40, 50, 100]
y_array = [2, 3, 4, 5, 6]
x_con = -20
y_con = 1
x_ticks = [-20, 0, 20, 40, 60, 80, 100]
x_labels = ['control', 0, 20, 40, 60, 80, 100]
fig, ax = plt.subplots(1,1)
ax.plot(x_array, y_array)
ax.plot(x_con, y_con, 'ro') # add a single red dot
# set tick positions, adjust label text
ax.xaxis.set_ticks(x_ticks)
ax.xaxis.set_ticklabels(x_labels)
ax.set_xlim(x_con-10, max(x_array)+3)
ax.set_ylim(0,7)
plt.show()
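If you would rather not maintain a separate list of label strings, a formatter function can relabel just the control position and leave all other ticks numeric. This is a sketch of that variation using matplotlib.ticker.FuncFormatter; CONTROL_X and label_control are names I made up, -20 is the same arbitrary control position as above, and the Agg backend line is an assumption so the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend; drop this line and call plt.show() for interactive use
import matplotlib.pyplot as plt
from matplotlib.ticker import FuncFormatter

CONTROL_X = -20  # arbitrary x position reserved for the control point

def label_control(value, pos):
    """Return 'control' for the reserved position, the plain number otherwise."""
    if value == CONTROL_X:
        return "control"
    return f"{value:g}"

x_array = [0, 20, 40, 50, 100]
y_array = [2, 3, 4, 5, 6]

fig, ax = plt.subplots()
ax.plot(x_array, y_array)
ax.plot(CONTROL_X, 1, "ro")  # the control measurement as a single red dot
ax.set_xticks([CONTROL_X, 0, 20, 40, 60, 80, 100])
ax.xaxis.set_major_formatter(FuncFormatter(label_control))
ax.set_xlim(CONTROL_X - 10, max(x_array) + 3)
```

The numeric axis stays continuous, so the unevenly spaced data points keep their true positions.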
