Python Matplotlib fill from fill_between() method shifted - python

I am trying to fill the area between the curve and x=0 with Matplotlib fill_between() method. For some reason, the shading appears to be shifted to the left, as can be seen from the picture. I should mention that this is one of 3 subplots I plot next to each other horizontally.
This is the code I use
import matplotlib as mpl
import numpy as np
from matplotlib import pyplot as plt
mpl.rcParams['figure.dpi'] = 300
fig, axs = plt.subplots(1, 3)
fig.tight_layout()
ticks_x1 = [-400,1200,4]
major_ticks_x1 = np.linspace(ticks_x1[0],ticks_x1[1],ticks_x1[2]+1)
minor_ticks_x1 = np.linspace(ticks_x1[0],ticks_x1[1],5*ticks_x1[2]+1)
axs[1].plot(V, -z, 'tab:cyan', linewidth=2, solid_capstyle='round')
axs[1].set_title('Querkraft')
axs[1].set(xlabel='V [kN]')
axs[1].set_xticks(major_ticks_x1)
axs[1].set_xticks(minor_ticks_x1,minor=True);
axs[1].fill_between(
x= V,
y1= -z,
color= "cyan",
alpha= 0.2)
How could I fix that? Thank you.

I believe it's because the start of your function is not zero (I think that would be V[0] in your case).
Here's how I can reproduce your problem and (sort of) fix it (using a simple function):
import matplotlib as mpl
import numpy as np
from matplotlib import pyplot as plt
mpl.rcParams['figure.dpi'] = 300
fig, axs = plt.subplots(1, 3)
fig.tight_layout()
ticks_x1 = [-400,1200,4]
major_ticks_x1 = np.linspace(ticks_x1[0],ticks_x1[1],ticks_x1[2]+1)
minor_ticks_x1 = np.linspace(ticks_x1[0],ticks_x1[1],5*ticks_x1[2]+1)
# V[0] will not be zero
z = np.linspace(-3,3, 1000)
V= np.sin(z)
axs[0].plot(V, -z, 'tab:cyan', linewidth=2, solid_capstyle='round')
axs[0].set_title('Querkraft')
axs[0].set(xlabel='V [kN]')
axs[0].set_xticks(major_ticks_x1)
axs[0].set_xticks(minor_ticks_x1,minor=True);
axs[0].grid()
axs[0].fill_between(
x= V,
y1= -z,
color= "cyan",
alpha= 0.2)
print(V[0])
Okay - that creates the graph with the "offset" fill_between (see below)
Now construct V so that V[0] will be zero
z = np.linspace(-np.pi,np.pi, 1000)
V= np.sin(z)
axs[1].plot(V, -z, 'tab:cyan', linewidth=2, solid_capstyle='round')
axs[1].set_title('Querkraft')
axs[1].set(xlabel='V [kN]')
axs[1].set_xticks(major_ticks_x1)
axs[1].set_xticks(minor_ticks_x1,minor=True);
axs[1].grid()
axs[1].fill_between(
x= V,
y1= -z,
color= "cyan",
alpha= 0.2)
This is what those two graphs look like:
I believe the reason is that fill_between is meant for curves given as y over x: it fills vertically between two y values (y1 and y2, where y2 defaults to 0) at each x.
When the curve runs horizontally, its x values plainly define where the fill starts and stops. When you use fill_between on a vertical curve like this, that is much less obvious.
Here's a more typical (horizontal) use of fill_between, which makes it clearer that the beginning and end of the curve define where the fill starts and stops:
Now of course, I only 'sort of' fixed your problem, because I changed the range of my function so that V[0] was zero - you might be constrained from doing that. But you can at least see why the fill_between was acting as it was.
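For reference, the typical horizontal use mentioned above can be sketched as follows. The sine curve is assumed sample data, and the Agg backend is selected only so the sketch runs without a display:

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for interactive use
import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(0, 2 * np.pi, 200)  # assumed sample data
y = np.sin(x)

fig, ax = plt.subplots()
ax.plot(x, y, color="tab:cyan")
# y2 defaults to 0, so the shading spans from the curve to the line y = 0
poly = ax.fill_between(x, y, color="cyan", alpha=0.2)
fig.canvas.draw()
```

Here the fill starts at the first x value and ends at the last one, and at every x in between it spans the vertical distance between the curve and y = 0.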

Ok, I think I have found a solution. Matplotlib has a method fill_betweenx().
Now the code looks like this
import matplotlib as mpl
import numpy as np
from matplotlib import pyplot as plt
mpl.rcParams['figure.dpi'] = 300
fig, axs = plt.subplots(1, 3)
fig.tight_layout()
ticks_x1 = [-400,1200,4]
major_ticks_x1 = np.linspace(ticks_x1[0],ticks_x1[1],ticks_x1[2]+1)
minor_ticks_x1 = np.linspace(ticks_x1[0],ticks_x1[1],5*ticks_x1[2]+1)
axs[1].plot(V, -z, 'tab:cyan', linewidth=2, solid_capstyle='round')
axs[1].set_title('Querkraft')
axs[1].set(xlabel='V [kN]')
axs[1].set_xticks(major_ticks_x1)
axs[1].set_xticks(minor_ticks_x1,minor=True);
axs[1].fill_betweenx(
y= -z,
x1= V,
color= "cyan",
alpha= 0.2)
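and produces the desired result. I guess it did not work with the first method because the graph is vertical instead of horizontal: fill_between fills between y values along x, whereas fill_betweenx fills between x values (x1 and x2, with x2 defaulting to 0) along y.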
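Since V and z are not shown in the question, here is a self-contained sketch of the same idea with made-up data. The arrays z and V below are hypothetical stand-ins, chosen so that V[0] is not zero:

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for interactive use
import numpy as np
import matplotlib.pyplot as plt

z = np.linspace(0, 10, 200)        # hypothetical depth values
V = 800 * np.cos(z / 3) + 200      # hypothetical shear force, V[0] = 1000

fig, ax = plt.subplots()
ax.plot(V, -z, color="tab:cyan", linewidth=2)
# x2 defaults to 0, so the shading runs horizontally from x = 0 to x = V
band = ax.fill_betweenx(y=-z, x1=V, color="cyan", alpha=0.2)
fig.canvas.draw()
```

The shaded region is now bounded by the vertical line x = 0 on one side and the curve on the other, regardless of the value of V[0].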

Related

Overlapping y axis label in matplotlib

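I have this code to create an xgboost feature importance plot with more than 40 variables:
import matplotlib.pyplot as plt
from xgboost import plot_importance
plot_importance(xgb_model)  # xgb_model is the already fitted model
plt.show()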
However, I got a plot with overlapping y-axis labels and it was hard to read. The figsize=() argument did not seem to work.
Is there a way to make this plot readable?
Definitely go with figsize. You can see that it helps because the tick labels stop overlapping when you interactively enlarge the window.
You can also change the font properties, see https://stackoverflow.com/a/11386056/13636407.
import numpy as np
import matplotlib.pyplot as plt
def plot_sin(figsize):
    x = np.linspace(0, 4 * np.pi)
    y = np.sin(x)
    fig, ax = plt.subplots(figsize=figsize)
    ax.plot(x, y)
    ax.set_yticks(np.arange(-1.15, 1.15, 0.05))
    ax.set_title(f"{figsize = }")
plot_sin(figsize=(12, 4))
plot_sin(figsize=(12, 10))
plt.show()
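One possible reason figsize appears to have no effect is that the size must be applied to the figure the plot is actually drawn on. A minimal sketch, using a plain horizontal bar chart with made-up feature names and scores in place of the xgboost plot, that scales the figure height with the number of labels (the 0.3 inch per label is a guess to tune):

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for interactive use
import numpy as np
import matplotlib.pyplot as plt

n_features = 40                                       # assumed number of features
names = [f"feature_{i}" for i in range(n_features)]   # hypothetical labels
scores = np.arange(1, n_features + 1)                 # hypothetical importances

# Scale the figure height with the number of labels
fig, ax = plt.subplots(figsize=(8, 0.3 * n_features))
ax.barh(names, scores)
ax.tick_params(axis="y", labelsize=8)  # smaller font as a second lever
fig.tight_layout()
```

xgboost's plot_importance takes an ax argument, so the same axes could be handed to it as plot_importance(xgb_model, ax=ax) instead of calling barh.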

Matplotlib: How can I show only exponents in the y tick labels of a semi-log plot with secondary_yaxis()?

I've been working on matplotlib's secondary-yaxis and I can't figure out how I should set "functions" parameter in order to get the result that I want.
I want to make a semi-log plot and set the y-tick labels in the following two formats:
ordinary format such as "10^1, 10^2, 10^3, ..., 10^(exponent), ..."
the exponents only: "1, 2, 3, ..."
I want the former style on the left y-axis and the latter on the right.
What I want to do can be done by using twinx() like this:
import matplotlib.pyplot as plt
import numpy as np
x = np.linspace(1, 3, 41)
y = 10**x
fig, ax1 = plt.subplots()
ax1.set_yscale('log')
ax1.plot(x, y)
ax2 = ax1.twinx()
ymin, ymax = ax1.get_ylim()
ax2.set_ylim(np.log10(ymin), np.log10(ymax))
plt.show()
You would see that i=(1, 2, 3) in the right label is located at the same height as 10^i in the left label.
However, I want to know how to do the same thing by secondary_yaxis. I've tried this but it didn't work.
import matplotlib.pyplot as plt
import numpy as np
x = np.linspace(1, 3, 41)
y = 10**x
fig, ax = plt.subplots()
ax.set_yscale('log')
ax.plot(x, y)
def forward(x):
    return np.log10(x)
def backward(x):
    return 10**x
secax = ax.secondary_yaxis('right', functions=(forward, backward))
plt.show()
It resulted in this:
You can see right-side tick labels are broken. I suspect that my way of setting the parameter "functions" of secondary_yaxis() might be invalid. I would appreciate it if you tell me how to do it.
I get the broken figure on matplotlib 3.1.0; updating to 3.3.0 solved the problem. With 3.3.0, the second code block of the question produces correct right-side tick labels.
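The functions argument expects a (forward, inverse) pair, so one quick sanity check that does not depend on the Matplotlib version is that the two functions undo each other. A minimal sketch using the question's own pair (the names y and round_trip are illustrative):

```python
import numpy as np

def forward(y):
    # data value -> exponent shown on the secondary axis
    return np.log10(y)

def backward(e):
    # exponent -> data value; must undo forward()
    return 10.0 ** e

y = np.logspace(1, 3, 41)          # same value range as the question's data
round_trip = backward(forward(y))  # should reproduce y up to rounding
```

If np.allclose(round_trip, y) holds, the pair itself is consistent, which fits the observation that the broken labels came from the library version rather than from the functions.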

Python quiver plot without head

I'd like to make a quiver plot without the heads of the arrows. I also want to have borders so that the arrows could stand out of the background color plot. Here is the main part of my code trying to produce such a plot:
plt.quiver(phia[sl1,sl2], R0a[sl1,sl2],u,v, color='white', headlength=0, headwidth = 1, pivot = 'middle', scale = scale, width=width, linewidth = 0.5)
The plot is in polar axis if this matters. This works for most of the lines except for those that are very short. Some artificial tails from the border are produced after the lines in those cases. One of the plots I generated that suffers the most from this is the following:
Any solutions to this problem or suggestions to bypass it will be greatly appreciated! Thanks!
Specifying the headaxislength parameter for the arrows to be zero does the trick:
import numpy as np
import matplotlib.pyplot as plt
theta = np.linspace(0, 2*np.pi, 16)
r = np.linspace(0, 1, 6)
x = np.cos(theta)[:,np.newaxis]*r
y = np.sin(theta)[:,np.newaxis]*r
quiveropts = dict(color='white', headlength=0, pivot='middle', scale=3,
                  linewidth=.5, units='xy', width=.05, headwidth=1) # common options
f, (ax1, ax2) = plt.subplots(1,2, sharex=True, sharey=True)
ax1.quiver(x,y, -y, x, headaxislength=4.5, **quiveropts) # the default
ax2.quiver(x,y, -y, x, headaxislength=0, **quiveropts)
plt.show()
The code above results in the following quiverplots, without arrowheads.

How to draw axis in the middle of the figure?

I want to draw a figure in Matplotlib where the axes are displayed within the plot itself, not on the side.
I have tried the following code from here:
import math
import numpy as np
import matplotlib.pyplot as plt
def sigmoid(x):
    a = []
    for item in x:
        a.append(1/(1+math.exp(-item)))
    return a
x = np.arange(-10., 10., 0.2)
sig = sigmoid(x)
plt.plot(x,sig)
plt.show()
The above code displays the figure like this:
What I would like to draw is something as follows (image from Wikipedia)
This question describes a similar problem, but it draws a reference line in the middle but no axis.
One way to do it is using spines:
import math
import numpy as np
import matplotlib.pyplot as plt
def sigmoid(x):
    a = []
    for item in x:
        a.append(1/(1+math.exp(-item)))
    return a
x = np.arange(-10., 10., 0.2)
sig = sigmoid(x)
fig = plt.figure()
ax = fig.add_subplot(1, 1, 1)
# Move left y-axis and bottom x-axis to centre, passing through (0,0)
ax.spines['left'].set_position('center')
ax.spines['bottom'].set_position('center')
# Eliminate upper and right axes
ax.spines['right'].set_color('none')
ax.spines['top'].set_color('none')
# Show ticks in the left and lower axes only
ax.xaxis.set_ticks_position('bottom')
ax.yaxis.set_ticks_position('left')
plt.plot(x,sig)
plt.show()
shows:
Basically, I want to comment on the accepted answer (but my rep doesn't allow that).
The use of
ax.spines['bottom'].set_position('center')
draws the x-axis so that it intersects the y-axis at its center. With an asymmetric ylim this means the x-axis does NOT pass through y=0. Jblasco's answer has this drawback: the intersection is at y=0.5 (the center between ymin=0.0 and ymax=1.0).
However, the reference plot of the original question has axes that intersect each other at 0.0 (which is somehow conventional or at least common).
To achieve this behaviour,
ax.spines['bottom'].set_position('zero')
has to be used.
See the following example, where 'zero' makes the axes intersect at 0.0 despite asymmetric ranges in both x and y.
import numpy as np
import matplotlib.pyplot as plt
#data generation
x = np.arange(-10,20,0.2)
y = 1.0/(1.0+np.exp(-x)) # NumPy does the calculation elementwise for you
fig, [ax0, ax1] = plt.subplots(ncols=2, figsize=(8,4))
# Eliminate upper and right axes
ax0.spines['top'].set_visible(False)
ax0.spines['right'].set_visible(False)
# Show ticks on the left and lower axes only
ax0.xaxis.set_tick_params(bottom=True, top=False)
ax0.yaxis.set_tick_params(left=True, right=False)
# Move remaining spines to the center
ax0.set_title('center')
ax0.spines['bottom'].set_position('center') # spine for xaxis
# - will pass through the center of the y-range (about 0.5 here)
ax0.spines['left'].set_position('center') # spine for yaxis
# - will pass through the center of the x-range (about 5 here)
ax0.plot(x,y)
# Eliminate upper and right axes
ax1.spines['top'].set_visible(False)
ax1.spines['right'].set_visible(False)
# Show ticks on the left and lower axes only (and let them protrude in both directions)
ax1.xaxis.set_tick_params(bottom=True, top=False, direction='inout')
ax1.yaxis.set_tick_params(left=True, right=False, direction='inout')
# Make spines pass through zero of the other axis
ax1.set_title('zero')
ax1.spines['bottom'].set_position('zero')
ax1.spines['left'].set_position('zero')
ax1.set_ylim(-0.4,1.0)
# No ticklabels at zero
ax1.set_xticks([-10,-5,5,10,15,20])
ax1.set_yticks([-0.4,-0.2,0.2,0.4,0.6,0.8,1.0])
ax1.plot(x,y)
plt.show()
Final remark: If ax.spines['bottom'].set_position('zero') is used but zero is not within the plotted y-range, then the axis is shown at the boundary of the plot closer to zero.
The title of this question asks how to draw the spine in the middle, and the accepted answer does exactly that. But what you are all drawing is the sigmoid function, which passes through y=0.5 at x=0, so I think what you want is the spine positioned according to your data. Matplotlib offers the 'data' spine position for that (see documentation).
import numpy as np
import matplotlib.pyplot as plt
def sigmoid(x):
    return 1 / (1 + np.exp(-x))
sigmoid = np.vectorize(sigmoid) #vectorize function
values=np.linspace(-10, 10) #generate values between -10 and 10
fig = plt.figure()
ax = fig.add_subplot(1, 1, 1)
#spine placement data centered
ax.spines['left'].set_position(('data', 0.0))
ax.spines['bottom'].set_position(('data', 0.0))
ax.spines['right'].set_color('none')
ax.spines['top'].set_color('none')
plt.plot(values, sigmoid(values))
plt.show()
Looks like this (Github):
You can simply add:
plt.axhline()
plt.axvline()
It's not fixed to the center, but it does the job very easily.
Working example:
import matplotlib.pyplot as plt
import numpy as np
def f(x):
    return np.sin(x) / (x/100)
delte = 100
Xs = np.arange(-delte, +delte +1, step=0.01)
Ys = np.array([f(x) for x in Xs])
plt.axhline(color='black', lw=0.5)
plt.axvline(color='black', lw=0.5)
plt.plot(Xs, Ys)
plt.show()
If you use matplotlib >= 3.4.2, you can use Pandas syntax and do it in only one line:
plt.gca().spines[:].set_position('center')
You might find it cleaner to do it in 3 lines:
ax = plt.gca()
ax.spines[['top', 'right']].set_visible(False)
ax.spines[['left', 'bottom']].set_position('center')
See documentation here.
Check your matplotlib version with pip freeze and update it with pip install -U matplotlib.
According to latest MPL Documentation:
ax = plt.axes()
ax.spines.left.set_position('zero')
ax.spines.bottom.set_position('zero')

Plot two histograms on single chart with matplotlib

I created a histogram plot using data from a file without any problem. Now I want to superpose data from another file in the same histogram, so I do something like this
n,bins,patchs = ax.hist(mydata1,100)
n,bins,patchs = ax.hist(mydata2,100)
but the problem is that for each interval, only the bar with the highest value appears, and the other is hidden. I wonder how could I plot both histograms at the same time with different colors.
Here you have a working example:
import random
import numpy
from matplotlib import pyplot
x = [random.gauss(3,1) for _ in range(400)]
y = [random.gauss(4,2) for _ in range(400)]
bins = numpy.linspace(-10, 10, 100)
pyplot.hist(x, bins, alpha=0.5, label='x')
pyplot.hist(y, bins, alpha=0.5, label='y')
pyplot.legend(loc='upper right')
pyplot.show()
The accepted answer gives the code for a histogram with overlapping bars, but in case you want the bars side by side (as I did), try the variation below:
import numpy as np
import matplotlib.pyplot as plt
plt.style.use('seaborn-deep')  # renamed 'seaborn-v0_8-deep' in Matplotlib 3.6+
x = np.random.normal(1, 2, 5000)
y = np.random.normal(-1, 3, 2000)
bins = np.linspace(-10, 10, 30)
plt.hist([x, y], bins, label=['x', 'y'])
plt.legend(loc='upper right')
plt.show()
Reference: http://matplotlib.org/examples/statistics/histogram_demo_multihist.html
EDIT [2018/03/16]: Updated to allow plotting of arrays of different sizes, as suggested by @stochastic_zeitgeist
In the case you have different sample sizes, it may be difficult to compare the distributions with a single y-axis. For example:
import numpy as np
import matplotlib.pyplot as plt
#makes the data
y1 = np.random.normal(-2, 2, 1000)
y2 = np.random.normal(2, 2, 5000)
colors = ['b','g']
#plots the histogram
fig, ax1 = plt.subplots()
ax1.hist([y1,y2],color=colors)
ax1.set_xlim(-10,10)
ax1.set_ylabel("Count")
plt.tight_layout()
plt.show()
In this case, you can plot your two data sets on different axes. To do so, you can get your histogram data using matplotlib, clear the axis, and then re-plot it on two separate axes (shifting the bin edges so that they don't overlap):
#sets up the axis and gets histogram data
fig, ax1 = plt.subplots()
ax2 = ax1.twinx()
n, bins, patches = ax1.hist([y1, y2], color=colors)
ax1.cla() #clear the axis
#plots the histogram data
width = (bins[1] - bins[0]) * 0.4
bins_shifted = bins + width
ax1.bar(bins[:-1], n[0], width, align='edge', color=colors[0])
ax2.bar(bins_shifted[:-1], n[1], width, align='edge', color=colors[1])
#finishes the plot
ax1.set_ylabel("Count", color=colors[0])
ax2.set_ylabel("Count", color=colors[1])
ax1.tick_params('y', colors=colors[0])
ax2.tick_params('y', colors=colors[1])
plt.tight_layout()
plt.show()
As a completion to Gustavo Bezerra's answer:
If you want each histogram to be normalized (normed in older Matplotlib, density from 2.1 on; normed was removed in 3.1), you cannot just use normed/density=True; you need to set the weights for each value instead:
import numpy as np
import matplotlib.pyplot as plt
x = np.random.normal(1, 2, 5000)
y = np.random.normal(-1, 3, 2000)
x_w = np.empty(x.shape)
x_w.fill(1/x.shape[0])
y_w = np.empty(y.shape)
y_w.fill(1/y.shape[0])
bins = np.linspace(-10, 10, 30)
plt.hist([x, y], bins, weights=[x_w, y_w], label=['x', 'y'])
plt.legend(loc='upper right')
plt.show()
As a comparison, the exact same x and y vectors with default weights and density=True:
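The difference between the two normalizations can also be checked numerically. A sketch using np.histogram, which plt.hist relies on for the binning, with a fixed seed added only for reproducibility:

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed, an assumption of this sketch
x = rng.normal(1, 2, 5000)
y = rng.normal(-1, 3, 2000)
bins = np.linspace(-10, 10, 30)

# Each sample carries weight 1/N, so bar heights are fractions of the sample
x_frac, _ = np.histogram(x, bins, weights=np.full(x.shape, 1 / x.size))
y_frac, _ = np.histogram(y, bins, weights=np.full(y.shape, 1 / y.size))

# density=True instead rescales so that the bar areas sum to 1
x_dens, _ = np.histogram(x, bins, density=True)
area = np.sum(x_dens * np.diff(bins))
```

With weights, the heights of each dataset add up to (at most) 1, independent of the bin width; with density=True it is the area that equals 1, so the heights change when the bin width changes.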
You should use bins from the values returned by hist:
import numpy as np
import matplotlib.pyplot as plt
foo = np.random.normal(loc=1, size=100) # a normal distribution
bar = np.random.normal(loc=-1, size=10000) # a normal distribution
_, bins, _ = plt.hist(foo, bins=50, range=[-6, 6], density=True)  # 'normed' was removed in Matplotlib 3.1
_ = plt.hist(bar, bins=bins, alpha=0.5, density=True)
plt.show()
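If neither histogram should dictate the range, the shared bin edges can also be computed up front from both datasets. A sketch using np.histogram_bin_edges (NumPy 1.15 or newer); pooling the data and the fixed seed are assumptions of this example, not part of the answer above:

```python
import numpy as np

rng = np.random.default_rng(1)  # fixed seed, only for reproducibility
foo = rng.normal(loc=1, size=100)
bar = rng.normal(loc=-1, size=10000)

# Edges computed once from the pooled data, then reused for both histograms
edges = np.histogram_bin_edges(np.concatenate([foo, bar]), bins=50)
foo_counts, _ = np.histogram(foo, bins=edges)
bar_counts, _ = np.histogram(bar, bins=edges)
```

The resulting edges array can be passed as bins=edges to both plt.hist calls, so the bars of the two histograms line up exactly.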
Here is a simple method to plot two histograms, with their bars side-by-side, on the same plot when the data has different sizes:
import matplotlib.pyplot as plt
def plotHistogram(p, o):
    """
    p and o are iterables with the values you want to
    plot the histogram of
    """
    plt.hist([p, o], color=['g','r'], alpha=0.8, bins=50)
    plt.show()
Plotting two overlapping histograms (or more) can lead to a rather cluttered plot. I find that using step histograms (aka hollow histograms) improves the readability quite a bit. The only downside is that in matplotlib the default legend for a step histogram is not properly formatted, so it can be edited like in the following example:
import numpy as np # v 1.19.2
import matplotlib.pyplot as plt # v 3.3.2
from matplotlib.lines import Line2D
rng = np.random.default_rng(seed=123)
# Create two normally distributed random variables of different sizes
# and with different shapes
data1 = rng.normal(loc=30, scale=10, size=500)
data2 = rng.normal(loc=50, scale=10, size=1000)
# Create figure with 'step' type of histogram to improve plot readability
fig, ax = plt.subplots(figsize=(9,5))
ax.hist([data1, data2], bins=15, histtype='step', linewidth=2,
        alpha=0.7, label=['data1','data2'])
# Edit legend to get lines as legend keys instead of the default polygons
# and sort the legend entries in alphanumeric order
handles, labels = ax.get_legend_handles_labels()
leg_entries = {}
for h, label in zip(handles, labels):
    leg_entries[label] = Line2D([0], [0], color=h.get_facecolor()[:-1],
                                alpha=h.get_alpha(), lw=h.get_linewidth())
labels_sorted, lines = zip(*sorted(leg_entries.items()))
ax.legend(lines, labels_sorted, frameon=False)
# Remove spines
ax.spines['top'].set_visible(False)
ax.spines['right'].set_visible(False)
# Add annotations
plt.ylabel('Frequency', labelpad=15)
plt.title('Matplotlib step histogram', fontsize=14, pad=20)
plt.show()
As you can see, the result looks quite clean. This is especially useful when overlapping even more than two histograms. Depending on how the variables are distributed, this can work for up to around 5 overlapping distributions. More than that would require the use of another type of plot, such as one of those presented here.
It sounds like you might want just a bar graph:
http://matplotlib.sourceforge.net/examples/pylab_examples/bar_stacked.html
http://matplotlib.sourceforge.net/examples/pylab_examples/barchart_demo.html
Alternatively, you can use subplots.
There is one caveat when you want to plot the histogram from a 2-d numpy array. You need to swap the 2 axes.
import numpy as np
import matplotlib.pyplot as plt
data = np.random.normal(size=(2, 300))
# swapped_data.shape == (300, 2)
swapped_data = np.swapaxes(data, axis1=0, axis2=1)
plt.hist(swapped_data, bins=30, label=['x', 'y'])
plt.legend()
plt.show()
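For a 2-D array the axis swap is just a transpose, so data.T should give the same result. A small check with assumed random data:

```python
import numpy as np

rng = np.random.default_rng(0)         # fixed seed, only for reproducibility
data = rng.normal(size=(2, 300))       # two datasets stored as rows
swapped = np.swapaxes(data, 0, 1)      # shape (300, 2): one column per dataset
same = np.array_equal(swapped, data.T) # the transpose is an equivalent spelling
```

Either form gives hist one column per dataset, which is the layout it expects for a 2-D input.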
Another option, quite similar to joaquin's answer:
import random
from matplotlib import pyplot
#random data
x = [random.gauss(3,1) for _ in range(400)]
y = [random.gauss(4,2) for _ in range(400)]
#plot both histograms(range from -10 to 10), bins set to 100
pyplot.hist([x,y], bins= 100, range=[-10,10], alpha=0.5, label=['x', 'y'])
#plot legend
pyplot.legend(loc='upper right')
#show it
pyplot.show()
Gives the following output:
Just in case you have pandas or are OK with using it:
import random
import matplotlib.pyplot as plt
import pandas as pd
test = pd.DataFrame([[random.gauss(3,1) for _ in range(400)],
                     [random.gauss(4,2) for _ in range(400)]])
plt.hist(test.values.T)
plt.show()
This question has been answered before, but wanted to add another quick/easy workaround that might help other visitors to this question.
import seaborn as sns
sns.kdeplot(mydata1)
sns.kdeplot(mydata2)
Some helpful examples are here for kde vs histogram comparison.
Inspired by Solomon's answer, but to stick with the question, which is related to histogram, a clean solution is:
sns.distplot(bar)
sns.distplot(foo)
plt.show()
Make sure to plot the taller one first, otherwise you would need to set plt.ylim(0,0.45) so that the taller histogram is not chopped off.
