I'm trying to plot a Pandas DataFrame, and add a line to show the mean and median. As you can see below, I'm adding a red line for the mean, but it doesn't show.
If I try to draw a green line at 5, it shows at x=190. So apparently the x values are treated as 0, 1, 2, ... rather than 160, 165, 170, ...
How can I draw lines so that their x values match those of the x axis?
From Jupyter:
Full code:
%matplotlib inline
from pandas import Series
import matplotlib.pyplot as plt
heights = Series(
[165, 170, 195, 190, 170,
170, 185, 160, 170, 165,
185, 195, 185, 195, 200,
195, 185, 180, 185, 195],
name='Heights'
)
freq = heights.value_counts().sort_index()
freq_frame = freq.to_frame()
mean = heights.mean()
median = heights.median()
freq_frame.plot.bar(legend=False)
plt.xlabel('Height (cm)')
plt.ylabel('Count')
plt.axvline(mean, color='r', linestyle='--')
plt.axvline(5, color='g', linestyle='--')
plt.show()
Use plt.bar(freq_frame.index, freq_frame['Heights']) to draw the bar plot. The bars are then placed at the positions given by freq_frame.index. As far as I can tell, pandas' built-in bar function does not let you specify the positions of the bars.
%matplotlib inline
from pandas import Series
import matplotlib.pyplot as plt
heights = Series(
[165, 170, 195, 190, 170,
170, 185, 160, 170, 165,
185, 195, 185, 195, 200,
195, 185, 180, 185, 195],
name='Heights'
)
freq = heights.value_counts().sort_index()
freq_frame = freq.to_frame()
mean = heights.mean()
median = heights.median()
# The first column holds the counts (named 'Heights' in older pandas, 'count' in pandas >= 2.0)
plt.bar(freq_frame.index, freq_frame.iloc[:, 0],
        width=3, align='center')
plt.xlabel('Height (cm)')
plt.ylabel('Count')
plt.axvline(mean, color='r', linestyle='--')
plt.axvline(median, color='g', linestyle='--')
plt.show()
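To see why the lines in the question ended up in the wrong place, here is a small check of where the pandas bar plot actually puts its bars. It is only a sketch: the Agg backend and the names bar_centers and index_values are my own choices for illustration, not part of the original code.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so this runs outside Jupyter
import matplotlib.pyplot as plt
from pandas import Series

heights = Series(
    [165, 170, 195, 190, 170,
     170, 185, 160, 170, 165,
     185, 195, 185, 195, 200,
     195, 185, 180, 185, 195],
    name='Heights'
)
freq = heights.value_counts().sort_index()

ax = freq.plot.bar(legend=False)

# Centre of every bar in data coordinates
bar_centers = [round(p.get_x() + p.get_width() / 2) for p in ax.patches]
# The heights that are only shown as tick labels
index_values = list(freq.index)

print(bar_centers)   # positions 0, 1, 2, ... one per bar
print(index_values)  # the actual heights in cm
```

The bars sit at 0, 1, 2, ... and the heights are just labels, so plt.axvline(mean) is drawn far to the right of the last bar.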
What's causing these errors?
-----------Code-----------
from matplotlib import pyplot as plt
plt.style.use('bmh')
ages_x = [15, 19, 24, 29, 29, 34, 39, 44, 49, 54, 59, 64]
py_dev_y = [60, 560, 2110, 2760, 1930, 1190, 620, 340, 200, 120, 100]
plt.plot(ages_x, py_dev_y, '#5a7d9a', marker='o', linewidth=3, label='Python')
plt.title("Fuck it, here it is")
plt.legend(['All Devs', 'Python'])
plt.tight_layout()
plt.ylabel('Amount of devs who took the survey')
plt.xlabel('Ages')
plt.show()
----------Error------------
Traceback (most recent call last):
File "C:/Users/usr/PycharmProjects/pythonProject5/main.py", line 10, in <module>
plt.plot(ages_x, py_dev_y, '#5a7d9a', marker='o', linewidth=3, label='Python')
File "C:\Users\usr\PycharmProjects\pythonProject5\venv\lib\site-packages\matplotlib\pyplot.py", line 2840, in plot
return gca().plot(
File "C:\Users\usr\PycharmProjects\pythonProject5\venv\lib\site-packages\matplotlib\axes\_axes.py", line 1743, in plot
lines = [*self._get_lines(*args, data=data, **kwargs)]
File "C:\Users\usr\PycharmProjects\pythonProject5\venv\lib\site-packages\matplotlib\axes\_base.py", line 273, in __call__
yield from self._plot_args(this, kwargs)
File "C:\Users\usr\PycharmProjects\pythonProject5\venv\lib\site-packages\matplotlib\axes\_base.py", line 399, in _plot_args
raise ValueError(f"x and y must have same first dimension, but "
ValueError: x and y must have same first dimension, but have shapes (12,) and (1,)
-----------What I want it to do---------
I'm trying to get it to display a graph that shows ages and the amount of people that took the survey in a line graph
ages_x and py_dev_y have different lengths (12 values vs. 11 values). I just added an extra element to py_dev_y (90 as the last value) and the plot works.
from matplotlib import pyplot as plt
plt.style.use('bmh')
ages_x = [15, 19, 24, 29, 29, 34, 39, 44, 49, 54, 59, 64]
py_dev_y = [60, 560, 2110, 2760, 1930, 1190, 620, 340, 200, 120, 100, 90]
plt.plot(ages_x, py_dev_y, '#5a7d9a', marker='o', linewidth=3, label='Python')
plt.title("Fuck it, here it is")
plt.legend(['All Devs', 'Python'])
plt.tight_layout()
plt.ylabel('Amount of devs who took the survey')
plt.xlabel('Ages')
plt.show()
Resulting plot:
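A quick way to catch this kind of mismatch before calling plt.plot is to compare the lengths yourself. The sketch below uses only plain Python; the helper name same_length and the duplicate check are my own additions for illustration. Note that ages_x contains 29 twice, so it may be that the extra x value, rather than a missing y value, is the real typo.

```python
ages_x = [15, 19, 24, 29, 29, 34, 39, 44, 49, 54, 59, 64]
py_dev_y = [60, 560, 2110, 2760, 1930, 1190, 620, 340, 200, 120, 100]


def same_length(x, y):
    """Return True when x and y have the same number of elements."""
    return len(x) == len(y)


lengths = (len(ages_x), len(py_dev_y))

# Values that appear more than once in ages_x (a hint at a possible typo)
duplicates = sorted({a for a in ages_x if ages_x.count(a) > 1})

if not same_length(ages_x, py_dev_y):
    print(f"Length mismatch: x has {lengths[0]} values, y has {lengths[1]}")
    print(f"Duplicated x values: {duplicates}")
```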
I'm trying to plot a line chart over a bar chart, but both the ticks and the actual locations of the points aren't aligned. I would like them to be aligned. (Just a note I'm going to be plotting another set of data similarly (but reversed) on the other side, hence the subplots.)
Here's what I have so far
import matplotlib.pyplot as plt
import numpy as np
group = [0, 5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55]
amount1 = [967, 975, 1149, 1022, 852, 975, 1025, 1134, 994, 1057, 647, 1058]
amount2 = [286, 364, 111, 372, 333, 456, 258, 152, 400, 181, 221, 441]
f, (ax1, ax2) = plt.subplots(nrows = 1, ncols = 2, sharey=True, figsize = (17,8))
ax1_2 = ax1.twinx()
# y_pos
y_pos = np.arange(len(group))
# plot men
ax1.barh(y_pos, amount1, align = 'center')
ax1_2.plot(amount2, group, color = 'black', marker = 'o')
# ticks
ax1.set_yticks(y_pos)
ax1.set_yticklabels(group)
ax1.invert_xaxis()
ax1.yaxis.tick_right()
# padding
plt.subplots_adjust(left=None, bottom=None, right=None, top=None, wspace=0.05, hspace=None)
plt.show()
plt.close()
I've tried setting the ticks, but the bar graph and line graph seem to have very different notions of that. I've also tried graphing both on ax1, but then the line graph goes way beyond the bar graph and they don't line up at all. I've also tried ax1_2.set_yticks(ax1.get_yticks()) but this has a similar problem.
Any help would be appreciated!
You can plot both on ax1 and drop y_pos, because in the end both plots share the group variable as their y coordinate.
Then you can give the barh plot an explicit height.
Here is the code:
import matplotlib.pyplot as plt
group = [0, 5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55]
amount1 = [967, 975, 1149, 1022, 852, 975, 1025, 1134, 994, 1057, 647, 1058]
amount2 = [286, 364, 111, 372, 333, 456, 258, 152, 400, 181, 221, 441]
f, (ax1, ax2) = plt.subplots(nrows=1, ncols=2, sharey=True, figsize=(17, 8))
# plot men
a = ax1.barh(group, amount1, height=4)
ax1.plot(amount2, group, color='black', marker='o')
# ticks
ax1.set_yticks(group)
ax1.set_yticklabels(group)
ax1.invert_xaxis()
ax1.yaxis.tick_right()
# padding
plt.subplots_adjust(left=None, bottom=None, right=None, top=None, wspace=0.05,
hspace=None)
plt.show()
plt.close()
And an image with the result:
The main problem is that the ylims of both axes aren't aligned. The y of the barh plot goes like 0, 1, 2, 3 till 11. The y of the line plot goes from 0 to 55 in steps of 5. To align them, you could just do ax1_2.set_ylim([y * 5 for y in ax1.get_ylim()]).
An alternative would be to use y_pos for the line graph as well. Then the limits can simply be copied: ax1_2.set_ylim(ax1.get_ylim()).
Here is the sample code, with the second graph left out for simplicity:
import matplotlib.pyplot as plt
import numpy as np
group = [0, 5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55]
amount1 = [967, 975, 1149, 1022, 852, 975, 1025, 1134, 994, 1057, 647, 1058]
amount2 = [286, 364, 111, 372, 333, 456, 258, 152, 400, 181, 221, 441]
f, ax1 = plt.subplots()
ax1_2 = ax1.twinx()
# y_pos
y_pos = np.arange(len(group))
# plot men
ax1.barh(y_pos, amount1, align='center', color='turquoise')
ax1_2.plot(amount2, group, color='crimson', marker='o')
ax1_2.set_ylim([y * 5 for y in ax1.get_ylim()])
# ticks
ax1.set_yticks(y_pos)
ax1.set_yticklabels(group)
ax1.yaxis.tick_right()
ax1.invert_xaxis()
plt.show()
The plot now has the 0, 10, 20 ticks darker as they come from ax1_2. Just call ax1_2.set_yticks([]) to remove those.
PS: Yet another way is to forget about y_pos and use group for the y-axis of ax1 as well. Then the height of the bars needs to be adapted, e.g. to 4.5, as it is now measured in the units of group.
In code:
ax1.barh(group, amount1, align='center', color='turquoise', height=4.5)
ax1_2.plot(amount2, group, color='crimson', marker='o')
ax1_2.set_ylim(ax1.get_ylim()) # no need for labels as the ticks have the same value
ax1.set_yticks(group)
ax1_2.set_yticks([])
ax1.yaxis.tick_right()
ax1.invert_xaxis()
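The alternative mentioned above, plotting the line against y_pos as well and copying the limits, could look roughly like this. It is a sketch with the second subplot left out; the Agg backend is my own assumption so it runs without a display (drop that line and add plt.show() when working interactively).

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend
import matplotlib.pyplot as plt
import numpy as np

group = [0, 5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55]
amount1 = [967, 975, 1149, 1022, 852, 975, 1025, 1134, 994, 1057, 647, 1058]
amount2 = [286, 364, 111, 372, 333, 456, 258, 152, 400, 181, 221, 441]

f, ax1 = plt.subplots()
ax1_2 = ax1.twinx()

# Both plots use the same positional y coordinates 0, 1, 2, ...
y_pos = np.arange(len(group))
ax1.barh(y_pos, amount1, align='center', color='turquoise')
ax1_2.plot(amount2, y_pos, color='crimson', marker='o')

# Same units on both axes, so the limits can be copied directly
ax1_2.set_ylim(ax1.get_ylim())

# Show the group values as labels and hide the duplicate ticks of the twin axis
ax1.set_yticks(y_pos)
ax1.set_yticklabels(group)
ax1_2.set_yticks([])
ax1.yaxis.tick_right()
ax1.invert_xaxis()
```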
I'm new to Python and learning it by doing. I want to make two plots with matplotlib. The second plot keeps the limits of the first one. How can I give each new plot its own limits instead of inheriting them from the previous one? What is the recommended method? Any help is appreciated.
X1 = [80, 100, 120, 140, 160, 180, 200, 220, 240, 260]
Y1 = [70, 65, 90, 95, 110, 115, 120, 140, 155, 150]
from matplotlib import pyplot as plt
plt.plot(
X1
, Y1
, color = "green"
, marker = "o"
, linestyle = "solid"
)
plt.show()
X2 = [80, 100, 120, 140, 160, 180, 200]
Y2 = [70, 65, 90, 95, 110, 115, 120]
plt.plot(
X2
, Y2
, color = "green"
, marker = "o"
, linestyle = "solid"
)
plt.show()
There are two ways:
The quick and easy way: set the x and y limits in each plot to what you want, for example:
plt.xlim(60,200)
plt.ylim(60,200)
Paste those two lines just before each plt.show() and both plots will have the same limits.
The harder but better way is to use subplots.
# create a figure object
fig = plt.figure()
# create two axes within the figure, arranged on a 1x2 grid
ax1 = fig.add_subplot(121)
# ax2 is the second set of axes: 1x2 grid, 2nd plot (hence 122)
# The two axes are separate objects, so each gets its own limits;
# in your example the same object is re-purposed for every plot.
ax2 = fig.add_subplot(122)
ax1.plot(X1, Y1)
ax2.plot(X2, Y2)
plt.show()
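For what it's worth, here is a small self-contained check of that claim, using plt.subplots as a shorthand for creating the figure and both axes in one call. The Agg backend and the variable names auto_xlim_1 and auto_xlim_2 are my own assumptions for the sketch.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so the sketch runs anywhere
from matplotlib import pyplot as plt

X1 = [80, 100, 120, 140, 160, 180, 200, 220, 240, 260]
Y1 = [70, 65, 90, 95, 110, 115, 120, 140, 155, 150]
X2 = [80, 100, 120, 140, 160, 180, 200]
Y2 = [70, 65, 90, 95, 110, 115, 120]

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4))
ax1.plot(X1, Y1, color="green", marker="o", linestyle="solid")
ax2.plot(X2, Y2, color="green", marker="o", linestyle="solid")

# Each Axes autoscales to its own data, so the limits differ
auto_xlim_1 = ax1.get_xlim()
auto_xlim_2 = ax2.get_xlim()
print(auto_xlim_1, auto_xlim_2)

# Limits can still be set explicitly on one Axes without touching the other
ax2.set_xlim(60, 200)
ax2.set_ylim(60, 200)
```

Setting limits through the Axes object (ax2.set_xlim) rather than plt.xlim makes it explicit which plot is being changed.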
Here is one way for you using subplot where plt.subplot(1, 2, 1) means a figure with 1 row (first value) and 2 columns (second value) and 1st subfigure (third value in the bracket, meaning left column in this case). plt.subplot(1, 2, 2) means subplot in the 2nd column (right column in this case).
This way, each subplot adjusts its x- and y-limits according to its own data. There are other ways to do the same thing. Here is a SO link for you.
from matplotlib import pyplot as plt
fig = plt.figure(figsize=(10, 4))
plt.subplot(1, 2, 1)
X1 = [80, 100, 120, 140, 160, 180, 200, 220, 240, 260]
Y1 = [70, 65, 90, 95, 110, 115, 120, 140, 155, 150]
plt.plot(X1, Y1, color = "green", marker = "o", linestyle = "solid")
# plt.plot(X1, Y1, '-go') Another alternative to plot in the same style
plt.subplot(1, 2, 2)
X2 = [80, 100, 120, 140, 160, 180, 200]
Y2 = [70, 65, 90, 95, 110, 115, 120]
plt.plot(X2, Y2, color = "green", marker = "o", linestyle = "solid")
# plt.plot(X2, Y2, '-go') Another alternative to plot in the same style
Output
I have tried several different methods, but why can't my curve be smoothed the way others have done it? Here is my code and image.
from scipy.interpolate import splrep, splev
import matplotlib.pyplot as plt
list_x = [296, 297, 425, 460, 510, 532, 597, 601, 602, 611]
list_y = [2, 12, 67, 15, 21, 2037, 1995, 9, 39, 3]
bspl = splrep(list_x,list_y)
bspl_y = splev(list_x,bspl)
plt.figure()
plt.plot(list_x, bspl_y)
plt.xticks(fontsize = 10)
plt.yticks(fontsize = 10)
plt.show()
You don't see the interpolation, because you give matplotlib the same 10 data points for the interpolated curve that you use for your original data presentation. We have to create a higher resolution curve:
from scipy.interpolate import splrep, splev
import matplotlib.pyplot as plt
import numpy as np
list_x = [296, 297, 425, 460, 510, 521, 597, 601, 602, 611]
list_y = [2, 12, 67, 15, 21, 2037, 1995, 9, 39, 3]
bspl = splrep(list_x,list_y, s=0)
#values for the x axis
x_smooth = np.linspace(min(list_x), max(list_x), 1000)
#get y values from interpolated curve
bspl_y = splev(x_smooth, bspl)
plt.figure()
#original data points
plt.plot(list_x, list_y, 'rx-')
#and interpolated curve
plt.plot(x_smooth, bspl_y, 'b')
plt.xticks(fontsize = 10)
plt.yticks(fontsize = 10)
plt.show()
And this is the output we get:
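To make the point about resolution concrete, here is a quick numerical check (no plotting). With s=0 the spline should pass through the data points, so evaluating it only at list_x gives back list_y and the plot looks unchanged. The tolerance passed to np.allclose is my own assumption.

```python
import numpy as np
from scipy.interpolate import splrep, splev

list_x = [296, 297, 425, 460, 510, 521, 597, 601, 602, 611]
list_y = [2, 12, 67, 15, 21, 2037, 1995, 9, 39, 3]

# s=0 asks for an interpolating spline (no smoothing)
bspl = splrep(list_x, list_y, s=0)

# Evaluated at the original x values, the spline reproduces the original y values
y_at_data = splev(list_x, bspl)
reproduces_data = np.allclose(y_at_data, list_y, atol=1e-3)

# Evaluated on a dense grid, the curve between the points becomes visible
x_smooth = np.linspace(min(list_x), max(list_x), 1000)
y_smooth = splev(x_smooth, bspl)

print(reproduces_data, len(y_at_data), len(y_smooth))
```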
I'm having a really strange issue with matplotlib. Plotting some points looks like this:
When I switch to a log scale on the y-axis, some of the points are not connected:
Is this a bug? Am I missing something? Code is below. Comment out the log scale line to see the first graph.
import matplotlib.pyplot as plt
fig = plt.figure()
ax1 = fig.add_subplot(111)
x = [1.0, 2.0, 3.01, 4.01, 5.01, 6.01, 7.04, 8.04, 9.04, 10.05,
11.05, 12.09, 13.17, 14.18, 15.73, 16.74, 17.74, 18.9, 19.91,
20.94, 22.05, 23.15, 24.33, 25.48, 26.51, 27.58, 28.86, 29.93,
30.93, 32.23, 33.25, 34.26, 35.27, 36.29, 37.33, 38.35, 39.36,
40.37, 41.37]
y = [552427, 464338, 446687, 201960, 227238, 265140, 148903, 134851,
172234, 120263, 115385, 100671, 164542, 171176, 28, 356, 0, 0,
195, 313, 9, 0, 132, 0, 249, 242, 81, 217, 159, 140, 203, 215,
171, 141, 154, 114, 99, 97, 97]
ax1.plot(x, y, c='b', marker='o')
ax1.set_yscale('log')
plt.ylim((-50000, 600000))
plt.show()
log(0) is undefined, and your y data contains several zeros. I'm guessing matplotlib just ignores the NaNs which crop up here, which leaves gaps in the line.
You can try to use ax1.set_yscale('symlog')
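A rough sketch of both points: finding the zeros that a log axis cannot show, and switching to symlog. The simplified x values, the Agg backend and the name zero_positions are my own assumptions for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend
import matplotlib.pyplot as plt

y = [552427, 464338, 446687, 201960, 227238, 265140, 148903, 134851,
     172234, 120263, 115385, 100671, 164542, 171176, 28, 356, 0, 0,
     195, 313, 9, 0, 132, 0, 249, 242, 81, 217, 159, 140, 203, 215,
     171, 141, 154, 114, 99, 97, 97]
# Assumption: evenly spaced x values instead of the original measurements
x = list(range(1, len(y) + 1))

# Indices of the values that have no logarithm
zero_positions = [i for i, value in enumerate(y) if value <= 0]
print(zero_positions)

fig = plt.figure()
ax1 = fig.add_subplot(111)
ax1.plot(x, y, c='b', marker='o')
# symlog is linear around zero and logarithmic further out, so zeros can be drawn
ax1.set_yscale('symlog')
yscale_name = ax1.get_yscale()
```

Every line segment that touches one of those positions is the one that goes missing on a plain log scale.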