I am trying to fill the area between the curve and x=0 with Matplotlib fill_between() method. For some reason, the shading appears to be shifted to the left, as can be seen from the picture. I should mention that this is one of 3 subplots I plot next to each other horizontally.
This is the code I use
import numpy as np
import matplotlib as mpl
from matplotlib import pyplot as plt
mpl.rcParams['figure.dpi'] = 300
# V and z are my data arrays (not shown here)
fig, axs = plt.subplots(1, 3)
fig.tight_layout()
ticks_x1 = [-400,1200,4]
major_ticks_x1 = np.linspace(ticks_x1[0],ticks_x1[1],ticks_x1[2]+1)
minor_ticks_x1 = np.linspace(ticks_x1[0],ticks_x1[1],5*ticks_x1[2]+1)
axs[1].plot(V, -z, 'tab:cyan', linewidth=2, solid_capstyle='round')
axs[1].set_title('Querkraft')
axs[1].set(xlabel='V [kN]')
axs[1].set_xticks(major_ticks_x1)
axs[1].set_xticks(minor_ticks_x1,minor=True);
axs[1].fill_between(
x= V,
y1= -z,
color= "cyan",
alpha= 0.2)
How could I fix that? Thank you.
I believe it's because the start of your function is not zero (I think that would be V[0] in your case).
Here's how I can reproduce your problem and (sort of) fix it (using a simple function):
import matplotlib as mpl
import numpy as np
from matplotlib import pyplot as plt
mpl.rcParams['figure.dpi'] = 300
fig, axs = plt.subplots(1, 3)
fig.tight_layout()
ticks_x1 = [-400,1200,4]
major_ticks_x1 = np.linspace(ticks_x1[0],ticks_x1[1],ticks_x1[2]+1)
minor_ticks_x1 = np.linspace(ticks_x1[0],ticks_x1[1],5*ticks_x1[2]+1)
# V[0] will not be zero
z = np.linspace(-3,3, 1000)
V= np.sin(z)
axs[0].plot(V, -z, 'tab:cyan', linewidth=2, solid_capstyle='round')
axs[0].set_title('Querkraft')
axs[0].set(xlabel='V [kN]')
axs[0].set_xticks(major_ticks_x1)
axs[0].set_xticks(minor_ticks_x1,minor=True);
axs[0].grid()
axs[0].fill_between(
x= V,
y1= -z,
color= "cyan",
alpha= 0.2)
print(V[0])
Okay - that creates the graph with the "offset" fill_between (see below)
Now construct V so that V[0] will be zero
z = np.linspace(-np.pi,np.pi, 1000)
V= np.sin(z)
axs[1].plot(V, -z, 'tab:cyan', linewidth=2, solid_capstyle='round')
axs[1].set_title('Querkraft')
axs[1].set(xlabel='V [kN]')
axs[1].set_xticks(major_ticks_x1)
axs[1].set_xticks(minor_ticks_x1,minor=True);
axs[1].grid()
axs[1].fill_between(
x= V,
y1= -z,
color= "cyan",
alpha= 0.2)
This is what those two graphs look like:
I believe the reason for this is that fill_between is intended for horizontal curves: at each x it fills between two y values, y1 and y2, and y2 defaults to 0. The fill is therefore anchored to the line y=0, not to x=0.
When you work horizontally, the x values of your curve obviously define where the fill should start and stop. When you use fill_between on a vertical curve like this, that is much less obvious.
Here's a more typical (horizontal) use of fill_between, that makes it more clear that the beginning and end of the curve are defining where to start and stop the fill:
Now of course, I only 'sort of' fixed your problem, because I changed the range of my function so that V[0] was zero - you might be constrained from doing that. But you can at least see why the fill_between was acting as it was.
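For reference, here is a minimal sketch of that typical horizontal use. The data (a sine curve) and the variable names are made up for illustration and are not from the question; the only point is that y2 defaults to 0, so the fill runs between the curve and the line y=0.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend for this sketch; drop this line for interactive use
import numpy as np
import matplotlib.pyplot as plt

# Hypothetical data: y is a function of x, the usual orientation for fill_between
x = np.linspace(0, 2 * np.pi, 200)
y = np.sin(x)

fig, ax = plt.subplots()
ax.plot(x, y, color="tab:cyan")

# y2 is not given, so it defaults to 0: the fill lies between the curve and y = 0
poly = ax.fill_between(x, y, color="cyan", alpha=0.2)

# The filled polygon runs along the curve and back along the line y = 0
verts = poly.get_paths()[0].vertices
```

The fill spans exactly the x range of the data, which is why the start and end of the curve matter so much in the vertical case.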
Ok, I think I have found a solution. Matplotlib has a method fill_betweenx().
Now the code looks like this
import numpy as np
import matplotlib as mpl
from matplotlib import pyplot as plt
mpl.rcParams['figure.dpi'] = 300
# V and z are my data arrays (not shown here)
fig, axs = plt.subplots(1, 3)
fig.tight_layout()
ticks_x1 = [-400,1200,4]
major_ticks_x1 = np.linspace(ticks_x1[0],ticks_x1[1],ticks_x1[2]+1)
minor_ticks_x1 = np.linspace(ticks_x1[0],ticks_x1[1],5*ticks_x1[2]+1)
axs[1].plot(V, -z, 'tab:cyan', linewidth=2, solid_capstyle='round')
axs[1].set_title('Querkraft')
axs[1].set(xlabel='V [kN]')
axs[1].set_xticks(major_ticks_x1)
axs[1].set_xticks(minor_ticks_x1,minor=True);
axs[1].fill_betweenx(
y= -z,
x1= V,
color= "cyan",
alpha= 0.2)
and produces the desired result. I guess it did not work with the first method because the graph is vertical instead of horizontal.
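Since V and z are not shown in the question, here is a self-contained sketch with made-up stand-in data (the cosine curve and its scaling are assumptions, chosen only so that V[0] is not zero). It suggests that fill_betweenx anchors the fill to x=0 regardless of where the curve starts.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend for this sketch; drop this line for interactive use
import numpy as np
import matplotlib.pyplot as plt

# Hypothetical stand-ins for the question's data: depth z and shear force V
z = np.linspace(0, 10, 200)
V = 800 * np.cos(z / 2) + 200  # V[0] is 1000, i.e. deliberately not zero

fig, ax = plt.subplots()
ax.plot(V, -z, color="tab:cyan", linewidth=2)

# Fill horizontally between the curve x1 = V and the vertical line x2 = 0
poly = ax.fill_betweenx(y=-z, x1=V, x2=0, color="cyan", alpha=0.2)

# The polygon follows the curve and comes back along the line x = 0
verts = poly.get_paths()[0].vertices
```

Here x2=0 is written out explicitly, although it is also the default.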
I would like to create a figure consisting of a scatter plot and 2 histograms, one to the right of and one below the scatter plot. I have the following requirements:
1.) In the scatter plot, the aspect ratio is equal so that the circle does not look like an ellipse.
2.) In the graphic, the subplots should be exactly as wide or high as the axes of the scatter plot.
This also works to a limited extent. However, I can't make the lower histogram as wide as the x axis of the scatter plot. How do I do that?
import matplotlib
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.gridspec as gridspec
import random
#create some demo data
x = [random.uniform(-2.0, 2.0) for i in range(100)]
y = [random.uniform(-2.0, 2.0) for i in range(100)]
#create figure
fig = plt.figure()
gs = gridspec.GridSpec(2, 2, width_ratios = [3, 1], height_ratios = [3, 1])
ax = plt.subplot(gs[0])
# Axis labels
plt.xlabel('pos error X [mm]')
plt.ylabel('pos error Y [mm]')
ax.grid(True)
ax.axhline(color="#000000")
ax.axvline(color="#000000")
ax.set_aspect('equal')
radius = 1.0
xc = radius*np.cos(np.linspace(0,np.pi*2))
yc = radius*np.sin(np.linspace(0,np.pi*2))
plt.plot(xc, yc, "k")
ax.scatter(x,y)
hist_x = plt.subplot(gs[1],sharey=ax)
hist_y = plt.subplot(gs[2],sharex=ax)
plt.tight_layout() #needed. without no xlabel visible
plt.show()
What I want is:
Many thanks for your help!
The easiest (but not necessarily most elegant) solution is to manually position the lower histogram after applying the tight layout:
ax_pos = ax.get_position()
hist_y_pos = hist_y.get_position()
hist_y.set_position((ax_pos.x0, hist_y_pos.y0, ax_pos.width, hist_y_pos.height))
This output was produced by matplotlib version 3.4.3. For your example output, you're obviously using a different version, as I get a much wider lower histogram than you.
(I retained the histogram names as in your example although I guess the lower one should be hist_x instead of hist_y).
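Put together as a runnable sketch, this could look as follows. The names hist_right and hist_below, the random data and the bin count are my own choices, not from the question; the explicit fig.canvas.draw() is there as a precaution so that the equal-aspect box of the scatter axes has been applied before its position is read.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend for this sketch; drop this line for interactive use
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.gridspec import GridSpec

# Hypothetical demo data
rng = np.random.default_rng(0)
x = rng.uniform(-2.0, 2.0, 100)
y = rng.uniform(-2.0, 2.0, 100)

fig = plt.figure()
gs = GridSpec(2, 2, width_ratios=[3, 1], height_ratios=[3, 1])
ax = fig.add_subplot(gs[0, 0])
ax.scatter(x, y)
ax.set_aspect("equal")

hist_right = fig.add_subplot(gs[0, 1], sharey=ax)
hist_below = fig.add_subplot(gs[1, 0], sharex=ax)
hist_right.hist(y, bins=20, orientation="horizontal")
hist_below.hist(x, bins=20)

fig.tight_layout()
fig.canvas.draw()  # make sure the equal-aspect box of ax has been applied

# Copy the horizontal extent of the scatter axes onto the lower histogram
ax_pos = ax.get_position()
below_pos = hist_below.get_position()
hist_below.set_position([ax_pos.x0, below_pos.y0, ax_pos.width, below_pos.height])
```

Note that the manual position is a one-off: if the figure is resized or tight_layout() is called again, the alignment has to be recomputed.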
For some reason when I use a zorder with my scatter plot the edges of the points overlap the axis. I tried some of the solutions from "matplotlib axis tick labels covered by scatterplot (using spines)" but they didn't work for me. Is there a way to prevent this from happening?
I understand I could also add an ax.axvline() at my boundaries but that would be an annoying workaround for lots of plots.
xval = np.array([0,0,0,3,3,3,0,2,3,0])
yval = np.array([0,2,3,5,1,0,1,0,4,5])
zval = yval**2-4
fig = plt.figure(figsize=(6,6))
ax = plt.subplot(111)
ax.scatter(xval,yval,cmap=plt.cm.rainbow,c=zval,s=550,zorder=20)
ax.set_ylim(0,5)
ax.set_xlim(0,3)
#These don't work
ax.tick_params(labelcolor='k', zorder=100)
ax.tick_params(direction='out', length=4, color='k', zorder=100)
#This will work but I don't want to have to do this for the plot edges every time
ax.axvline(0,c='k',zorder=100)
plt.show()
For me the solution you linked to works; that is, setting the z-order of the scatter plot to a negative number. E.g.
xval = np.array([0,0,0,3,3,3,0,2,3,0])
yval = np.array([0,2,3,5,1,0,1,0,4,5])
zval = yval**2-4
fig = plt.figure(figsize=(6,6))
ax = plt.subplot(111)
ax.scatter(xval,yval,cmap=plt.cm.rainbow,c=zval,s=550,zorder=-1)
ax.set_ylim(0,5)
ax.set_xlim(0,3)
plt.show()
You can fix the overlap using the following code with a large number for the zorder. This will work on both the x- and y-axis.
for k,spine in ax.spines.items():
    spine.set_zorder(1000)
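As a complete sketch, using the data from the question (the variable name points is my own addition), that would be something like:

```python
import matplotlib
matplotlib.use("Agg")  # headless backend for this sketch; drop this line for interactive use
import numpy as np
import matplotlib.pyplot as plt

xval = np.array([0, 0, 0, 3, 3, 3, 0, 2, 3, 0])
yval = np.array([0, 2, 3, 5, 1, 0, 1, 0, 4, 5])
zval = yval**2 - 4

fig, ax = plt.subplots(figsize=(6, 6))
points = ax.scatter(xval, yval, c=zval, cmap=plt.cm.rainbow, s=550, zorder=20)
ax.set_xlim(0, 3)
ax.set_ylim(0, 5)

# Raise all four spines above the scatter points (zorder 20)
for spine in ax.spines.values():
    spine.set_zorder(1000)
```

Any spine zorder larger than the scatter's zorder should do; 1000 is just a comfortably large value.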
This works for me
import numpy as np
import matplotlib.pyplot as plt
xval = np.array([0,0,0,3,3,3,0,2,3,0])
yval = np.array([0,2,3,5,1,0,1,0,4,5])
zval = yval**2-4
fig = plt.figure(figsize=(6,6))
ax = plt.subplot(111)
ax.scatter(xval,yval,cmap=plt.cm.rainbow,c=zval,s=550,zorder=20)
ax.set_ylim(-1,6)
ax.set_xlim(-1,4)
#These don't work
ax.tick_params(labelcolor='k', zorder=100)
ax.tick_params(direction='out', length=4, color='k', zorder=100)
#This will work but I don't want to have to do this for the plot edges every time
ax.axvline(0,c='k',zorder=100)
plt.show()
Your markers are large enough that they extend beyond the axis limits, so we simply widen ylim and xlim.
Changed
ax.set_ylim(0,5)
ax.set_xlim(0,3)
to
ax.set_ylim(-1,6)
ax.set_xlim(-1,4)
Also, zorder doesn't play a role in pushing the points to edges.
With matplotlib, I want to plot two graphs with the same x-axis scale, but I want to show different sized sections. How can I accomplish that?
So far I can plot differently sized subplots with GridSpec, or same-sized ones that share the x-axis. When I try both at once, the smaller subplot shows the same axis range squeezed into a smaller width, while I want the same scaling and a different range, so sharing the axis might be the wrong idea.
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.gridspec import GridSpec
x=np.linspace(0,10,100)
y=np.sin(x)
x2=np.linspace(0,5,60)
y2=np.cos(x2)
fig=plt.figure()
gs=GridSpec(2,3)
ax1 = fig.add_subplot(gs[0, :])
ax1.plot(x,y)
ax2 = fig.add_subplot(gs[1,:-1])
#using sharex=ax1 here decreases the scaling of ax2 too much
ax2.plot(x2,y2)
plt.show()
I want the x-axes to have the same scaling, i.e. the same x values are always exactly on top of each other; this should give you an idea. The smaller plot's frame could be expanded or fit the plot, that doesn't matter. As it is now, the scales don't match.
Thanks in advance.
This is still a bit rough. I'm sure there's a slightly more elegant way to do this, but you can create a custom transformation (see Transformations Tutorial) between the Axes coordinates of ax2 and the data coordinates of ax1. In other words, you're calculating the data value (according to ax1) at the positions corresponding to the left and right edges of ax2, and then adjusting the xlim of ax2 accordingly.
Here is a demonstration showing that it works even if the second subplot is not aligned in any particular way with the first.
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.gridspec import GridSpec
x=np.linspace(0,25,100)
y=np.sin(x)
x2=np.linspace(10,30,60)
y2=np.cos(x2)
fig=plt.figure()
gs=GridSpec(2,6)
ax1 = fig.add_subplot(gs[0, :])
ax1.plot(x,y)
ax2 = fig.add_subplot(gs[1,3:-1])
ax2.plot(x2,y2)
# here is where the magic happens
trans = ax2.transAxes + ax1.transData.inverted()
((xmin,_),(xmax,_)) = trans.transform([[0,1],[1,1]])
ax2.set_xlim(xmin,xmax)
# for demonstration, show that the vertical lines end up aligned
for ax in [ax1,ax2]:
    for pos in [15,20]:
        ax.axvline(pos)
plt.show()
EDIT: One possible refinement would be to do the transform in the xlim_changed event callback. That way, the axes stay in sync even when zooming/panning in the first axes.
There is also a slight issue with tight_layout() as you noted, but that is easily fixed by calling the callback function directly.
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.gridspec import GridSpec
def on_xlim_changed(event):
    # here is where the magic happens
    trans = ax2.transAxes + ax1.transData.inverted()
    ((xmin, _), (xmax, _)) = trans.transform([[0, 1], [1, 1]])
    ax2.set_xlim(xmin, xmax)
x = np.linspace(0, 25, 100)
y = np.sin(x)
x2 = np.linspace(10, 30, 60)
y2 = np.cos(x2)
fig = plt.figure()
gs = GridSpec(2, 6)
ax1 = fig.add_subplot(gs[0, :])
ax1.plot(x, y)
ax2 = fig.add_subplot(gs[1, 3:-1])
ax2.plot(x2, y2)
# for demonstration, show that the vertical lines end up aligned
for ax in [ax1, ax2]:
    for pos in [15, 20]:
        ax.axvline(pos)
# tight_layout() messes up the axes xlim
# but can be fixed by calling on_xlim_changed()
fig.tight_layout()
on_xlim_changed(None)
ax1.callbacks.connect('xlim_changed', on_xlim_changed)
plt.show()
I suggest setting limits of the second axis based on the limits of ax1.
Try this!
ax2 = fig.add_subplot(gs[1,:-1])
ax2.plot(x2,y2)
lb, ub = ax1.get_xlim()
# Default margin is 0.05, which would be used for auto-scaling, hence reduce that here
# Set lower bound and upper bound based on the grid size, which you choose for second plot
ax2.set_xlim(lb, ub *(2/3) -0.5)
plt.show()
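Instead of hand-tuning the factor, the limits could also be derived from the axes positions in the figure. This is only a sketch under the assumption that both axes use a linear x scale; the helper name units_per_width is my own.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend for this sketch; drop this line for interactive use
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.gridspec import GridSpec

x = np.linspace(0, 10, 100)
x2 = np.linspace(0, 5, 60)

fig = plt.figure()
gs = GridSpec(2, 3)
ax1 = fig.add_subplot(gs[0, :])
ax1.plot(x, np.sin(x))
ax2 = fig.add_subplot(gs[1, :-1])
ax2.plot(x2, np.cos(x2))

# Positions are in figure fractions; convert them to ax1 data units
pos1, pos2 = ax1.get_position(), ax2.get_position()
lb, ub = ax1.get_xlim()
units_per_width = (ub - lb) / pos1.width  # data units per unit of figure width

# Data values of ax1 at the left and right edges of ax2
left = lb + (pos2.x0 - pos1.x0) * units_per_width
right = lb + (pos2.x1 - pos1.x0) * units_per_width
ax2.set_xlim(left, right)
```

Like the fixed-factor version, this has to be redone whenever the layout or the limits of ax1 change.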
I created a histogram plot using data from a file and no problem. Now I wanted to superpose data from another file in the same histogram, so I do something like this
n,bins,patchs = ax.hist(mydata1,100)
n,bins,patchs = ax.hist(mydata2,100)
but the problem is that for each interval, only the bar with the highest value appears, and the other is hidden. I wonder how could I plot both histograms at the same time with different colors.
Here you have a working example:
import random
import numpy
from matplotlib import pyplot
x = [random.gauss(3,1) for _ in range(400)]
y = [random.gauss(4,2) for _ in range(400)]
bins = numpy.linspace(-10, 10, 100)
pyplot.hist(x, bins, alpha=0.5, label='x')
pyplot.hist(y, bins, alpha=0.5, label='y')
pyplot.legend(loc='upper right')
pyplot.show()
The accepted answer gives the code for a histogram with overlapping bars, but in case you want the bars side-by-side (as I did), try the variation below:
import numpy as np
import matplotlib.pyplot as plt
plt.style.use('seaborn-deep')  # named 'seaborn-v0_8-deep' in matplotlib >= 3.6
x = np.random.normal(1, 2, 5000)
y = np.random.normal(-1, 3, 2000)
bins = np.linspace(-10, 10, 30)
plt.hist([x, y], bins, label=['x', 'y'])
plt.legend(loc='upper right')
plt.show()
Reference: http://matplotlib.org/examples/statistics/histogram_demo_multihist.html
EDIT [2018/03/16]: Updated to allow plotting of arrays of different sizes, as suggested by #stochastic_zeitgeist
In the case you have different sample sizes, it may be difficult to compare the distributions with a single y-axis. For example:
import numpy as np
import matplotlib.pyplot as plt
#makes the data
y1 = np.random.normal(-2, 2, 1000)
y2 = np.random.normal(2, 2, 5000)
colors = ['b','g']
#plots the histogram
fig, ax1 = plt.subplots()
ax1.hist([y1,y2],color=colors)
ax1.set_xlim(-10,10)
ax1.set_ylabel("Count")
plt.tight_layout()
plt.show()
In this case, you can plot your two data sets on different axes. To do so, you can get your histogram data using matplotlib, clear the axis, and then re-plot it on two separate axes (shifting the bin edges so that they don't overlap):
#sets up the axis and gets histogram data
fig, ax1 = plt.subplots()
ax2 = ax1.twinx()
# a single hist call is enough to obtain the counts and bin edges
n, bins, patches = ax1.hist([y1, y2], color=colors)
ax1.cla() #clear the axis
#plots the histogram data
width = (bins[1] - bins[0]) * 0.4
bins_shifted = bins + width
ax1.bar(bins[:-1], n[0], width, align='edge', color=colors[0])
ax2.bar(bins_shifted[:-1], n[1], width, align='edge', color=colors[1])
#finishes the plot
ax1.set_ylabel("Count", color=colors[0])
ax2.set_ylabel("Count", color=colors[1])
ax1.tick_params('y', colors=colors[0])
ax2.tick_params('y', colors=colors[1])
plt.tight_layout()
plt.show()
As a completion to Gustavo Bezerra's answer:
If you want each histogram to be normalized (normed for mpl<=2.1 and density for mpl>=3.1) you cannot just use normed/density=True, you need to set the weights for each value instead:
import numpy as np
import matplotlib.pyplot as plt
x = np.random.normal(1, 2, 5000)
y = np.random.normal(-1, 3, 2000)
x_w = np.empty(x.shape)
x_w.fill(1/x.shape[0])
y_w = np.empty(y.shape)
y_w.fill(1/y.shape[0])
bins = np.linspace(-10, 10, 30)
plt.hist([x, y], bins, weights=[x_w, y_w], label=['x', 'y'])
plt.legend(loc='upper right')
plt.show()
As a comparison, the exact same x and y vectors with default weights and density=True:
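A minimal numeric sketch of that comparison, using np.histogram instead of a plot. The seed and the bin range are my own assumptions (the bins are chosen to cover all samples). With weights of 1/N the bar heights are fractions of the sample and sum to 1, while with density=True it is the area under the bars that equals 1.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(1, 2, 5000)
y = rng.normal(-1, 3, 2000)

# Bin edges that cover every sample of both data sets
bins = np.linspace(min(x.min(), y.min()), max(x.max(), y.max()), 31)

# Weights of 1/N: each bar height is the fraction of samples in that bin
x_w = np.full(x.shape, 1 / x.size)
frac_x, _ = np.histogram(x, bins=bins, weights=x_w)

# density=True: bar heights are scaled so that the total area is 1
dens_x, _ = np.histogram(x, bins=bins, density=True)

print(frac_x.sum())                     # sum of the bar heights
print((dens_x * np.diff(bins)).sum())   # area under the bars
```

Both printed values should be 1 up to floating-point rounding, but the individual bar heights differ by a factor of the bin width.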
You should use bins from the values returned by hist:
import numpy as np
import matplotlib.pyplot as plt
foo = np.random.normal(loc=1, size=100) # a normal distribution
bar = np.random.normal(loc=-1, size=10000) # a normal distribution
# density=True replaces normed=True, which was removed in matplotlib 3.1
_, bins, _ = plt.hist(foo, bins=50, range=[-6, 6], density=True)
_ = plt.hist(bar, bins=bins, alpha=0.5, density=True)
plt.show()
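If you would rather fix the bin edges before plotting anything, one option (my own variation, not part of the code above) is to compute shared edges from both data sets with np.histogram_bin_edges and pass them to every hist call:

```python
import numpy as np

rng = np.random.default_rng(0)
foo = rng.normal(loc=1, size=100)
bar = rng.normal(loc=-1, size=10000)

# 50 equal-width bins spanning the combined range of both data sets
edges = np.histogram_bin_edges(np.concatenate([foo, bar]), bins=50)

# The same edges can be passed as bins=edges to plt.hist for each data set
counts_foo, _ = np.histogram(foo, bins=edges)
counts_bar, _ = np.histogram(bar, bins=edges)
```

Because the edges cover the combined range, no sample of either data set falls outside the bins.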
Here is a simple method to plot two histograms, with their bars side-by-side, on the same plot when the data has different sizes:
import matplotlib.pyplot as plt

def plotHistogram(p, o):
    """
    p and o are iterables with the values you want to
    plot the histogram of
    """
    plt.hist([p, o], color=['g','r'], alpha=0.8, bins=50)
    plt.show()
Plotting two overlapping histograms (or more) can lead to a rather cluttered plot. I find that using step histograms (aka hollow histograms) improves the readability quite a bit. The only downside is that in matplotlib the default legend for a step histogram is not properly formatted, so it can be edited like in the following example:
import numpy as np # v 1.19.2
import matplotlib.pyplot as plt # v 3.3.2
from matplotlib.lines import Line2D
rng = np.random.default_rng(seed=123)
# Create two normally distributed random variables of different sizes
# and with different shapes
data1 = rng.normal(loc=30, scale=10, size=500)
data2 = rng.normal(loc=50, scale=10, size=1000)
# Create figure with 'step' type of histogram to improve plot readability
fig, ax = plt.subplots(figsize=(9,5))
ax.hist([data1, data2], bins=15, histtype='step', linewidth=2,
alpha=0.7, label=['data1','data2'])
# Edit legend to get lines as legend keys instead of the default polygons
# and sort the legend entries in alphanumeric order
handles, labels = ax.get_legend_handles_labels()
leg_entries = {}
for h, label in zip(handles, labels):
    leg_entries[label] = Line2D([0], [0], color=h.get_facecolor()[:-1],
                                alpha=h.get_alpha(), lw=h.get_linewidth())
labels_sorted, lines = zip(*sorted(leg_entries.items()))
ax.legend(lines, labels_sorted, frameon=False)
# Remove spines
ax.spines['top'].set_visible(False)
ax.spines['right'].set_visible(False)
# Add annotations
plt.ylabel('Frequency', labelpad=15)
plt.title('Matplotlib step histogram', fontsize=14, pad=20)
plt.show()
As you can see, the result looks quite clean. This is especially useful when overlapping even more than two histograms. Depending on how the variables are distributed, this can work for up to around 5 overlapping distributions. More than that would require the use of another type of plot, such as one of those presented here.
It sounds like you might want just a bar graph:
http://matplotlib.sourceforge.net/examples/pylab_examples/bar_stacked.html
http://matplotlib.sourceforge.net/examples/pylab_examples/barchart_demo.html
Alternatively, you can use subplots.
There is one caveat when you want to plot the histogram from a 2-d numpy array. You need to swap the 2 axes.
import numpy as np
import matplotlib.pyplot as plt
data = np.random.normal(size=(2, 300))
# swapped_data.shape == (300, 2)
swapped_data = np.swapaxes(data, axis1=0, axis2=1)
plt.hist(swapped_data, bins=30, label=['x', 'y'])
plt.legend()
plt.show()
Another option, which is quite similar to joaquin's answer:
import random
from matplotlib import pyplot
#random data
x = [random.gauss(3,1) for _ in range(400)]
y = [random.gauss(4,2) for _ in range(400)]
#plot both histograms(range from -10 to 10), bins set to 100
pyplot.hist([x,y], bins= 100, range=[-10,10], alpha=0.5, label=['x', 'y'])
#plot legend
pyplot.legend(loc='upper right')
#show it
pyplot.show()
Gives the following output:
Just in case you have pandas (import pandas as pd) or are ok with using it:
import random
import pandas as pd
import matplotlib.pyplot as plt
test = pd.DataFrame([[random.gauss(3,1) for _ in range(400)],
                     [random.gauss(4,2) for _ in range(400)]])
plt.hist(test.values.T)
plt.show()
This question has been answered before, but wanted to add another quick/easy workaround that might help other visitors to this question.
import seaborn as sns
sns.kdeplot(mydata1)
sns.kdeplot(mydata2)
Some helpful examples are here for kde vs histogram comparison.
Inspired by Solomon's answer, but to stick with the question, which is related to histogram, a clean solution is:
# Note: distplot is deprecated since seaborn 0.11; histplot is its replacement
sns.distplot(bar)
sns.distplot(foo)
plt.show()
Make sure to plot the taller one first, otherwise you would need to set plt.ylim(0,0.45) so that the taller histogram is not chopped off.