I've created a plot of normal distribution like this:
import math
import numpy as np
import matplotlib.pyplot as plt
fig, ax = plt.subplots()
ax.set_title('Probability density')
ax.set_xlabel('x')
ax.set_ylabel('f(x)')
x = np.linspace(148, 200, 100)  # x from 148 to 200
y = (1 / (5 * math.sqrt(2*math.pi))) * np.exp((-(x-178)**2) / (2*5**2))
ax.plot(x, y)
plt.show()
But I also need to add vertical lines that stay inside the area under the curve, color the inner segments, and add tick marks along the axis at y = 0, as in the picture.
How can I do it in python using matplotlib?
I've tried to use plt.axvline, but the vertical lines go outside of my main plot:
plt.axvline(x = 178, color = 'g', label = 'axvline - full height')
plt.axvline(x = 178+5, color = 'b', label = 'axvline - full height')
plt.axvline(x = 178-5, color = 'b', label = 'axvline - full height')
plt.axvline(x = 178+5*2, color = 'r', label = 'axvline - full height')
plt.axvline(x = 178-5*2, color = 'r', label = 'axvline - full height')
The line version can be implemented using vlines, but note that your reference figure can be better reproduced using fill_between.
Line version
Instead of axvline, use vlines which supports ymin and ymax bounds.
Change your y into a lambda f(x, mu, sd) and use that to define the ymax bounds:
# define y as a lambda f(x, mu, sd)
f = lambda x, mu, sd: (1 / (sd * (2*np.pi)**0.5)) * np.exp((-(x-mu)**2) / (2*sd**2))
fig, ax = plt.subplots(figsize=(8, 3))
x = np.linspace(148, 200, 200)
mu = 178
sd = 5
ax.plot(x, f(x, mu, sd))
# define 68/95/99 locations and colors
xs = mu + sd*np.arange(-3, 4)
colors = [*'yrbgbry']
# draw lines at 68/95/99 points from 0 to the curve
ax.vlines(xs, ymin=0, ymax=[f(x, mu, sd) for x in xs], color=colors)
# relabel x ticks
plt.xticks(xs, [rf'${n}\sigma$' if n else '0' for n in range(-3, 4)])
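As a quick check on the ymax values passed to vlines, the height of the curve at mu ± k·sd can be computed directly. This is only a minimal sketch using the standard library; `normal_pdf` and `line_heights` are names made up for this illustration and are not part of the answer's code.

```python
import math

def normal_pdf(x, mu, sd):
    """Density of the normal distribution N(mu, sd**2) at x."""
    return math.exp(-((x - mu) ** 2) / (2 * sd ** 2)) / (sd * math.sqrt(2 * math.pi))

mu, sd = 178, 5  # same parameters as in the question

# x positions of the seven lines: mu - 3*sd, ..., mu, ..., mu + 3*sd
xs = [mu + k * sd for k in range(-3, 4)]

# Height of the curve at each position; each line runs from 0 up to this value
line_heights = [normal_pdf(x, mu, sd) for x in xs]

for x, h in zip(xs, line_heights):
    print(f"x = {x}: line from 0 to {h:.5f}")
```

The heights are symmetric about mu and largest at x = mu, which is why the lines never extend above the curve.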
Shaded version
Use fill_between to better recreate the sample figure. Define the shaded bounds using the where parameter:
fig, ax = plt.subplots(figsize=(8, 3))
x = np.linspace(148, 200, 200)
mu = 178
sd = 5
y = (1 / (sd * (2*np.pi)**0.5)) * np.exp((-(x-mu)**2) / (2*sd**2))
ax.plot(x, y)
# use `where` condition to shade bounded regions
bounds = mu + sd*np.array([-np.inf] + list(range(-3, 4)) + [np.inf])
alphas = [0.1, 0.2, 0.5, 0.8, 0.8, 0.5, 0.2, 0.1]
for left, right, alpha in zip(bounds, bounds[1:], alphas):
    ax.fill_between(x, y, where=(x >= left) & (x < right), color='b', alpha=alpha)
# relabel x ticks
plt.xticks(bounds[1:-1], [rf'${n}\sigma$' if n else '0' for n in range(-3, 4)])
To label the region percentages, add text objects at the midpoints of the bounded regions:
midpoints = mu + sd*np.arange(-3.5, 4)
percents = [0.1, 2.1, 13.6, 34.1, 34.1, 13.6, 2.1, 0.1]
colors = [*'kkwwwwkk']
for m, p, c in zip(
    midpoints,  # midpoints of bounded regions
    percents,   # percents captured by bounded regions
    colors,     # colors of text labels
):
    ax.text(m, 0.01, f'{p}%', color=c, ha='center', va='bottom')
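The hard-coded percentages can be derived from the standard normal CDF rather than typed in by hand. The following is a sketch under that assumption, using only `math.erf`; `normal_cdf` is a helper name introduced here for illustration.

```python
import math

def normal_cdf(z):
    """Cumulative distribution function of the standard normal."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Region boundaries in units of sigma: (-inf, -3), (-3, -2), ..., (3, inf)
bounds = [-math.inf, -3, -2, -1, 0, 1, 2, 3, math.inf]

# Probability mass of each region, as a percentage rounded to one decimal
percents = [
    round(100 * (normal_cdf(right) - normal_cdf(left)), 1)
    for left, right in zip(bounds, bounds[1:])
]
print(percents)
```

Because the boundaries are expressed in units of sigma, the result does not depend on the particular mu and sd used in the plot.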
I created a simple heatmap with Matplotlib on top of an already existing image. Now I'm trying to show the values in the cells, but the values don't land inside the heatmap; they end up scattered around the image instead. Here is a screenshot.
I think this happens because I'm generating the heatmap on top of an image, but I don't know how to fix that. Here is my code:
fig,ax = plt.subplots(1)
ax.imshow(im)
a = np.array([[0.0233188, 0.0232844, 0.0233099, 0.0242786],
              [0.0233158, 0.023217, 0.02370096, 0.02434176],
              [0.02328474, 0.02319508, 0.02433976, 0.02290478],
              [0.02320107, 0.02345002, 0.02484117, 0.02355316],
              [0.02317872, 0.02374418, 0.02374605, 0.02157998]])
ax1 = fig.add_subplot(111)
bounds1 = sorted([0.023, np.amin(a), np.amax(a)])
norm1 = matplotlib.colors.TwoSlopeNorm(vcenter=bounds1[1], vmin=bounds1[0], vmax=bounds1[2])
Map = ax1.imshow(a, interpolation='none', norm=norm1, extent=[0, 1.15, 0, 0.85])
x1 = [1, 2, 3, 4]
y1 = [1, 2, 3, 4, 5]
for i in range(len(y1)):
    for j in range(len(x1)):
        text = ax1.text(j, i, a[i, j],
                        ha="center", va="center", color="r")
extent=[x0, x1, y0, y1] changes the x and y coordinates of the image. When there are N cells between x0 and x1, the cell centers can be found by taking 2N+1 evenly spaced points from x0 to x1 and keeping the ones at indices 1, 3, 5, ... (zero-based).
Note that because imshow(a, ...) was not called with origin='lower', row 0 is drawn at the top. So the y-positions need to be traversed in reverse order.
from matplotlib import pyplot as plt
import matplotlib
import numpy as np
fig, ax = plt.subplots()
ax.axis('off')
a = np.array([[0.0233188, 0.0232844, 0.0233099, 0.0242786],
[0.0233158, 0.023217, 0.02370096, 0.02434176],
[0.02328474, 0.02319508, 0.02433976, 0.02290478],
[0.02320107, 0.02345002, 0.02484117, 0.02355316],
[0.02317872, 0.02374418, 0.02374605, 0.02157998]])
ax1 = fig.add_subplot(111)
bounds1 = sorted([0.023, np.amin(a), np.amax(a)])
norm1 = matplotlib.colors.TwoSlopeNorm(vcenter=bounds1[1], vmin=bounds1[0], vmax=bounds1[2])
x0, x1, y0, y1 = 0, 1.15, 0, 0.85
Map = ax1.imshow(a, interpolation='none', norm=norm1, extent=[x0, x1, y0, y1])
for i, yi in enumerate(np.linspace(y0, y1, 2 * a.shape[0] + 1)[-2::-2]):
    for j, xj in enumerate(np.linspace(x0, x1, 2 * a.shape[1] + 1)[1::2]):
        text = ax1.text(xj, yi, f'{a[i, j]:.6f}',
                        ha="center", va="center", color='darkred' if a[i, j] > bounds1[1] else 'white', fontsize=10)
plt.show()
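The cell-center rule can also be written as a small formula: with N equal cells on [lo, hi], the k-th center sits at lo + (k + 0.5)·(hi − lo)/N. A minimal sketch, assuming the same extent and array shape as above (`cell_centers` is a hypothetical helper, not a matplotlib function):

```python
def cell_centers(lo, hi, n_cells):
    """Centers of n_cells equal-width cells spanning [lo, hi]."""
    width = (hi - lo) / n_cells
    return [lo + (k + 0.5) * width for k in range(n_cells)]

x0, x1, y0, y1 = 0, 1.15, 0, 0.85  # extent passed to imshow
n_rows, n_cols = 5, 4              # shape of the array `a`

x_centers = cell_centers(x0, x1, n_cols)
# Row 0 is drawn at the top with the default origin, so reverse the y centers
y_centers = cell_centers(y0, y1, n_rows)[::-1]

print(x_centers)
print(y_centers)
```

These are the same positions that the linspace slicing selects; the text for a[i, j] goes at (x_centers[j], y_centers[i]).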
I have N samples, each of which has n values in it. For each sample, I plot the histogram of its n values, so in total I have N histograms. Now I would like to have a plot that shows the mean of these histograms with the 5-95% quantile region shaded. An example of such a plot would look like this (please ignore the dashed line and the black line; only the shaded area matters).
So far, I have plotted all N histograms on top of each other.
I realized that the mean of these histograms is the histogram of all N × n values together (every sample has the same size n). Sample code for this would look like:
import numpy as np
import matplotlib.pyplot as plt
samples = []
N = 20
n = 250
for i in range(N):
    samples.append(np.random.normal(loc=np.random.rand(1,)[0]/5-0.1, scale=1., size=n))
values_all = None
for i in range(len(samples)):
    values = samples[i]
    print(values)
    weights = np.ones_like(values) / float(len(values))
    plt.hist(values, range=[-4, 4], density=False, histtype='step', color='red', bins=15, weights=weights)
    if values_all is None:
        values_all = values
    else:
        values_all = np.concatenate(([values_all, values]), axis=0)
weights = np.ones_like(values_all) / float(len(values_all))
plt.hist(values_all, range=[-4, 4], density=False, histtype='step', color='black', bins=15, weights=weights)
plt.show()
Any suggestions on how to find and plot the 5-95% quantiles would be appreciated.
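The claim that the mean of the per-sample histograms equals the histogram of all values pooled together relies on every sample having the same size n. A minimal standard-library sketch that checks this numerically (`histogram_fractions` is a helper invented for this illustration; the seed is an arbitrary choice):

```python
import random

def histogram_fractions(values, lo, hi, n_bins):
    """Fraction of `values` falling in each of n_bins equal-width bins on [lo, hi)."""
    counts = [0] * n_bins
    width = (hi - lo) / n_bins
    for v in values:
        if lo <= v < hi:
            # Clamp guards against rounding pushing a value into a bin past the last
            counts[min(int((v - lo) / width), n_bins - 1)] += 1
    return [c / len(values) for c in counts]

random.seed(0)
N, n = 20, 250
samples = [
    [random.gauss(random.random() / 5 - 0.1, 1.0) for _ in range(n)]
    for _ in range(N)
]

# Mean over the N per-sample histograms, bin by bin
per_sample = [histogram_fractions(s, -4, 4, 15) for s in samples]
mean_hist = [sum(col) / N for col in zip(*per_sample)]

# Histogram of all N*n values pooled together
pooled = histogram_fractions([v for s in samples for v in s], -4, 4, 15)

print(max(abs(m - p) for m, p in zip(mean_hist, pooled)))
```

With unequal sample sizes the two would generally differ, because the pooled histogram weights larger samples more heavily.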
You should get the probabilities for each bin and compute the quantiles of those probabilities across the N histograms. Here is some sample code:
import numpy as np
import matplotlib.pyplot as plt
# generate data
samples = []
N = 20
n = 250
for i in range(N):
    samples.append(np.random.normal(loc=np.random.rand(1,)[0]/5-0.1, scale=1., size=n))
prob_all = None
for i in range(len(samples)):
    values = samples[i]
    weights = np.ones_like(values) / float(len(values))
    n, bins, patches = plt.hist(values, range=[-4, 4], density=False, histtype='step', color='red', bins=15, weights=weights, alpha=0.5)
    # concatenate bin probabilities
    if prob_all is None:
        prob_all = n.reshape(-1, 1)
    else:
        prob_all = np.concatenate(([prob_all, n.reshape(-1, 1)]), axis=1)
plt.close()  # don't plot previous histograms
# find quantiles for each bin
quant = np.quantile(prob_all, [0.05, 0.5, 0.95], axis=1)
# plot histogram from bins and probabilities
def plt_hist(bins, quant, clr, alph, lw):
    for j in range(len(n)):
        plt.plot([bins[j], bins[j + 1]], [quant[j], quant[j]], color=clr, linewidth=lw, alpha=alph)
    plt.plot([bins[0], bins[0]], [0., quant[0]], color=clr, linewidth=lw, alpha=alph)
    plt.plot([bins[len(n)], bins[len(n)]], [quant[len(n) - 1], 0.], color=clr, linewidth=lw, alpha=alph)
    for j in range(len(n) - 1):
        plt.plot([bins[j + 1], bins[j + 1]], [quant[j], quant[j + 1]], color=clr, linewidth=lw, alpha=alph)
fig, ax = plt.subplots()
ax.set_ylim([0., 0.3])
ax.set_xlim([-4.5, 4.5])
# plot 50% quantile (median)
plt_hist(bins, quant[1], clr='blue', alph=1., lw=1.)
# shade between quantiles
for i in range(len(n)):
    x = np.arange(bins[i], bins[i+1], 0.0001)
    y1 = quant[0, i]
    y2 = quant[2, i]
    ax.fill_between(x, y1, y2, facecolor='red', alpha=0.4)
# border for shadings
plt_hist(bins, quant[0], clr='black', alph=.2, lw=1.)
plt_hist(bins, quant[2], clr='black', alph=.2, lw=1.)
plt.show()
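To make the per-bin quantile step concrete, here is a small standard-library sketch. The `quantile` helper is written for this illustration and is intended to mirror the default linear interpolation of np.quantile; the bin probabilities in `prob_rows` are made-up numbers, not output of the code above.

```python
import math

def quantile(values, q):
    """q-quantile with linear interpolation between order statistics."""
    ordered = sorted(values)
    pos = q * (len(ordered) - 1)
    lo = math.floor(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (pos - lo) * (ordered[hi] - ordered[lo])

# Hypothetical bin probabilities: one row per histogram, one column per bin
prob_rows = [
    [0.10, 0.50, 0.40],
    [0.20, 0.45, 0.35],
    [0.15, 0.55, 0.30],
    [0.25, 0.40, 0.35],
    [0.30, 0.50, 0.20],
]

# Transpose so that each entry holds one bin's values across all histograms
per_bin = list(zip(*prob_rows))

lower = [quantile(b, 0.05) for b in per_bin]
median = [quantile(b, 0.50) for b in per_bin]
upper = [quantile(b, 0.95) for b in per_bin]

for k, (lo_q, med, up_q) in enumerate(zip(lower, median, upper)):
    print(f"bin {k}: 5% = {lo_q:.3f}, 50% = {med:.3f}, 95% = {up_q:.3f}")
```

The shaded band in the plot is the region between `lower` and `upper` in each bin, and the central line is `median`.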
I have to create a plot that has axes suppressed and tangents drawn at regular intervals as shown in figure presented below.
Using R-programming, I know how to suppress tick marks and create the plot.
But I don't know how to suppress the whole axes.
Here, I need to omit the whole a-axis as well as the other axes, such as the top and right axes.
My initial try is this:
tau <- seq(-5,5,0.01)
a <- 0.4 # a is a constant parameter
sigma <- a*tau # tau is a variable, sigma = a*tau
x <- 1/a*cosh(sigma)
y <- 1/a*sinh(sigma)
# plot
plot(x,y,type="l",xaxt="n",yaxt="n")
abline(h=0,lty=1)
The plot also requires dots and tangents at points where a*tau = -1,-0.5, 0, 0.5 and 1.
I followed these links:
Drawing a Tangent to the Plot and Finding the X-Intercept using R
Lines between certain points in a plot, based on the data? (with R)
The required plot looks like below:
Any suggestions in either Python or R are truly appreciated!
Using Python,
import numpy as np
import matplotlib.pyplot as plt
tau = np.arange(-5, 5, 0.01)
a = 0.4
sigma = a*tau
x = 1/a*np.cosh(sigma)
y = 1/a*np.sinh(sigma)
fig, ax = plt.subplots()
ax.plot(x, y, c='black')
# approximate the curve by a cubic
dxds = np.poly1d(np.polyfit(sigma, x, 3)).deriv()
dyds = np.poly1d(np.polyfit(sigma, y, 3)).deriv()
xs, ys, dxs, dys = [], [], [], []
for s in np.linspace(-1, 1, 5):
    # evaluate the derivative at s
    dx = np.polyval(dxds, s)
    dy = np.polyval(dyds, s)
    # record the x, y location and dx, dy tangent vector associated with s
    xi = 1/a*np.cosh(s)
    yi = 1/a*np.sinh(s)
    xs.append(xi)
    ys.append(yi)
    dxs.append(dx)
    dys.append(dy)
    if s == 0:
        ax.text(xi-0.75, yi+1.5, '$u$', transform=ax.transData)
        ax.annotate('$a^{-1}$',
                    xy=(xi, yi), xycoords='data',
                    xytext=(25, -5), textcoords='offset points',
                    verticalalignment='top', horizontalalignment='left',
                    arrowprops=dict(arrowstyle='-', shrinkB=7))
ax.quiver(xs, ys, dxs, dys, scale=1.8, color='black', scale_units='xy', angles='xy',
          width=0.01)
ax.plot(xs, ys, 'ko')
# http://stackoverflow.com/a/13430772/190597 (lucasg)
ax.set_xlim(-0.1, x.max())
left, right = ax.get_xlim()
low, high = ax.get_ylim()
ax.arrow(0, 0, right, 0, length_includes_head=True, head_width=0.15 )
ax.arrow(0, low, 0, high-low, length_includes_head=True, head_width=0.15 )
ax.text(0.03, 1, '$t$', transform=ax.transAxes)
ax.text(1, 0.47, '$x$', transform=ax.transAxes)
plt.axis('off')
ax.set_aspect('equal')
plt.show()
yields
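The code above estimates the tangent directions from a cubic fit. For this particular curve the derivatives are also available in closed form, since d/ds cosh(s) = sinh(s) and d/ds sinh(s) = cosh(s). The sketch below is an alternative to the polynomial approximation, not the method used above; `point_and_tangent` is a name introduced for illustration.

```python
import math

a = 0.4  # same constant as in the question

def point_and_tangent(s):
    """Point on the curve and its derivative with respect to s = a*tau."""
    x, y = math.cosh(s) / a, math.sinh(s) / a
    dx, dy = math.sinh(s) / a, math.cosh(s) / a
    return (x, y), (dx, dy)

for s in (-1.0, -0.5, 0.0, 0.5, 1.0):
    (x, y), (dx, dy) = point_and_tangent(s)
    print(f"s = {s:+.1f}: point = ({x:.3f}, {y:.3f}), tangent = ({dx:.3f}, {dy:.3f})")
```

The resulting (dx, dy) pairs could be passed to ax.quiver in place of the values obtained from the polynomial fit.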
Are there any colormaps, or is there a simple way to transform a matplotlib colormap, to provide a much bigger color range near 0.5 and a smaller one at the extremes? I am creating a bunch of subplots, one of which has color values about 10 times those of the others, so its values dominate and the rest of the plots all look the same. For a simple example, say we have:
import numpy as np
import matplotlib.pyplot as plt
x = np.linspace(1,10,10)
y = np.linspace(1,10,10)
t1 = np.random.normal(2,0.3,10)
t2 = np.random.normal(9,0.01,10)
t2_max = max(t2)
plt.figure(figsize=(22.0, 15.50))
p = plt.subplot(1,2,1)
colors = plt.cm.Accent(t1/t2_max)
p.scatter(x, y, edgecolors=colors, s=15, linewidths=4)
p = plt.subplot(1,2,2)
colors = plt.cm.Accent(t2/t2_max)
p.scatter(x, y, edgecolors=colors, s=15, linewidths=4)
plt.subplots_adjust(left=0.2)
cbar_ax = plt.axes([0.10, 0.15, 0.05, 0.7])
sm = plt.cm.ScalarMappable(cmap=plt.cm.Accent, norm=plt.Normalize(vmin=0, vmax=t2_max))
sm._A = []
cbar = plt.colorbar(sm,cax=cbar_ax)
plt.show()
There is much more variation in t1 than in t2; however, the variation cannot be seen because of the high values of t2. What I want is a map that will provide a larger color gradient around the mean of t1 without transforming the data itself. I have found one solution here http://protracted-matter.blogspot.co.nz/2012/08/nonlinear-colormap-in-matplotlib.html but can't get it to work for my scatter plots.
EDIT:
Following the answer below, the class can be modified to handle negative numbers and fixed boundaries.
import numpy as np
import matplotlib.pyplot as plt
x = y = np.linspace(1, 10, 10)
t1mean, t2mean = -6, 9
sigma1, sigma2 = .3, .01
t1 = np.random.normal(t1mean, sigma1, 10)
t2 = np.random.normal(t2mean, sigma2, 10)
class nlcmap(object):
    def __init__(self, cmap, levels):
        self.cmap = cmap
        self.N = cmap.N
        self.monochrome = self.cmap.monochrome
        self.levels = np.asarray(levels, dtype='float64')
        self._x = self.levels
        self.levmax = self.levels.max()
        self.levmin = self.levels.min()
        self.transformed_levels = np.linspace(self.levmin, self.levmax,
                                              len(self.levels))
    def __call__(self, xi, alpha=1.0, **kw):
        yi = np.interp(xi, self._x, self.transformed_levels)
        return self.cmap(yi / (self.levmax-self.levmin)+0.5, alpha)
tmax = 10
tmin = -10
#the choice of the levels depends on the data:
levels = np.concatenate((
    [tmin, tmax],
    np.linspace(t1mean - 2 * sigma1, t1mean + 2 * sigma1, 5),
    np.linspace(t2mean - 2 * sigma2, t2mean + 2 * sigma2, 5),
))
levels = levels[levels <= tmax]
levels.sort()
print(levels)
cmap_nonlin = nlcmap(plt.cm.jet, levels)
fig, (ax1, ax2) = plt.subplots(1, 2)
ax1.scatter(x, y, edgecolors=cmap_nonlin(t1), s=15, linewidths=4)
ax2.scatter(x, y, edgecolors=cmap_nonlin(t2), s=15, linewidths=4)
fig.subplots_adjust(left=.25)
cbar_ax = fig.add_axes([0.10, 0.15, 0.05, 0.7])
#for the colorbar we map the original colormap, not the nonlinear one:
sm = plt.cm.ScalarMappable(cmap=plt.cm.jet,
                           norm=plt.Normalize(vmin=tmin, vmax=tmax))
sm._A = []
cbar = fig.colorbar(sm, cax=cbar_ax)
#here we are relabel the linear colorbar ticks to match the nonlinear ticks
cbar.set_ticks(cmap_nonlin.transformed_levels)
cbar.set_ticklabels(["%.2f" % lev for lev in levels])
plt.show()
Your link provides quite a good solution for the colormap. I edited it a bit, but it contained everything necessary. You need to pick some sensible levels for your nonlinear colormap. I used two ranges centered on the mean values, each spanning ±4 standard deviations of the corresponding sample. Changing that to another number gives a different local color gradient around the two mean values.
For the colorbar, you can either leave the colors nonlinearly spaced with linearly spaced labels, or have linearly spaced colors with nonlinearly spaced labels.
The second allows greater resolution when looking at the data, looks nicer and is implemented below:
import numpy as np
import matplotlib.pyplot as plt
x = y = np.linspace(1, 10, 10)
t1mean, t2mean = 2, 9
sigma1, sigma2 = .3, .01
t1 = np.random.normal(t1mean, sigma1, 10)
t2 = np.random.normal(t2mean, sigma2, 10)
class nlcmap(object):
    def __init__(self, cmap, levels):
        self.cmap = cmap
        self.N = cmap.N
        self.monochrome = self.cmap.monochrome
        self.levels = np.asarray(levels, dtype='float64')
        self._x = self.levels
        self.levmax = self.levels.max()
        self.transformed_levels = np.linspace(0.0, self.levmax,
                                              len(self.levels))
    def __call__(self, xi, alpha=1.0, **kw):
        yi = np.interp(xi, self._x, self.transformed_levels)
        return self.cmap(yi / self.levmax, alpha)
tmax = max(t1.max(), t2.max())
#the choice of the levels depends on the data:
levels = np.concatenate((
    [0, tmax],
    np.linspace(t1mean - 4 * sigma1, t1mean + 4 * sigma1, 5),
    np.linspace(t2mean - 4 * sigma2, t2mean + 4 * sigma2, 5),
))
levels = levels[levels <= tmax]
levels.sort()
cmap_nonlin = nlcmap(plt.cm.jet, levels)
fig, (ax1, ax2) = plt.subplots(1, 2)
ax1.scatter(x, y, edgecolors=cmap_nonlin(t1), s=15, linewidths=4)
ax2.scatter(x, y, edgecolors=cmap_nonlin(t2), s=15, linewidths=4)
fig.subplots_adjust(left=.25)
cbar_ax = fig.add_axes([0.10, 0.15, 0.05, 0.7])
#for the colorbar we map the original colormap, not the nonlinear one:
sm = plt.cm.ScalarMappable(cmap=plt.cm.jet,
                           norm=plt.Normalize(vmin=0, vmax=tmax))
sm._A = []
cbar = fig.colorbar(sm, cax=cbar_ax)
#here we are relabel the linear colorbar ticks to match the nonlinear ticks
cbar.set_ticks(cmap_nonlin.transformed_levels)
cbar.set_ticklabels(["%.2f" % lev for lev in levels])
plt.show()
In the result, notice that the ticks of the colorbar are NOT equispaced:
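The core of the nonlinear colormap is the interpolation from the unevenly spaced levels onto evenly spaced positions. The sketch below reproduces that step without numpy or matplotlib; `interp` is a simplified stand-in for np.interp, and the levels are hypothetical values chosen to be dense around 2.

```python
def interp(x, xp, fp):
    """Piecewise-linear interpolation; xp must be increasing (like np.interp)."""
    if x <= xp[0]:
        return fp[0]
    if x >= xp[-1]:
        return fp[-1]
    for i in range(len(xp) - 1):
        if xp[i] <= x <= xp[i + 1]:
            frac = (x - xp[i]) / (xp[i + 1] - xp[i])
            return fp[i] + frac * (fp[i + 1] - fp[i])

# Hypothetical levels: dense around 2, sparse elsewhere
levels = [0.0, 1.8, 2.0, 2.2, 10.0]
levmax = levels[-1]
# Evenly spaced positions, one per level (what np.linspace(0, levmax, 5) gives)
transformed = [levmax * i / (len(levels) - 1) for i in range(len(levels))]

def colormap_position(value):
    """Position in [0, 1] at which the underlying colormap is sampled."""
    return interp(value, levels, transformed) / levmax

for v in (1.0, 1.8, 1.9, 2.0, 2.2, 6.1):
    print(f"value {v:4.1f} -> colormap position {colormap_position(v):.3f}")
```

With these levels, the data values between 1.8 and 2.2 cover half of the colormap, while the much wider range from 2.2 to 10 shares the last quarter.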
You could use LinearSegmentedColormap:
With this, you need to set up a color lookup table within a dictionary e.g. 'cdict' below.
cdict = {'red':   [(0.0, 0.0, 0.0),
                   (0.15, 0.01, 0.01),
                   (0.35, 1.0, 1.0),
                   (1.0, 1.0, 1.0)],
         'green': [(0.0, 0.0, 0.0),
                   (1.0, 0.0, 1.0)],
         'blue':  [(0.0, 0.0, 1.0),
                   (0.9, 0.01, 0.01),
                   (1.0, 0.0, 1.0)]}
This shows the transitions between values. I have set red to vary a lot around the values of t1/t2_max (0.15 to 0.35) and blue to vary a lot around the values of t2/t2_max (0.9 to 1.0). Green does nothing. I'd recommend reading the docs to see how this works. (Note that this could be automated so that the ranges follow your values.) I then tweaked your code to show the graph:
import matplotlib.colors as col
my_cmap = col.LinearSegmentedColormap('my_colormap', cdict)
plt.figure(figsize=(22.0, 15.50))
p = plt.subplot(1,2,1)
colors = my_cmap(t1/t2_max)
p.scatter(x, y, edgecolors=colors, s=15, linewidths=4)
p = plt.subplot(1,2,2)
colors = my_cmap(t2/t2_max)
p.scatter(x, y, edgecolors=colors, s=15, linewidths=4)
plt.subplots_adjust(left=0.2)
cbar_ax = plt.axes([0.10, 0.15, 0.05, 0.7])
sm = plt.cm.ScalarMappable(cmap=my_cmap, norm=plt.Normalize(vmin=0, vmax=t2_max))
sm._A = []
cbar = plt.colorbar(sm,cax=cbar_ax)
plt.show()
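To see what the cdict entries mean, recall that each tuple is (x, y0, y1) and that between two neighbouring entries the channel is interpolated linearly from the left entry's y1 to the right entry's y0. The sketch below evaluates a channel by hand under that rule; it ignores the fact that matplotlib samples the result into a finite lookup table, and `channel_value` is a helper written for this illustration.

```python
def channel_value(segments, x):
    """Evaluate one color channel of a segment-data list at x in [0, 1].

    Each entry is (x_i, y0_i, y1_i); between x_i and x_{i+1} the value is
    interpolated linearly from y1_i to y0_{i+1}.
    """
    for (xa, _, ya), (xb, yb, _) in zip(segments, segments[1:]):
        if xa <= x <= xb:
            return ya + (x - xa) / (xb - xa) * (yb - ya)
    raise ValueError("x must lie within [0, 1]")

# Red and blue channels copied from the cdict above
red = [(0.0, 0.0, 0.0), (0.15, 0.01, 0.01), (0.35, 1.0, 1.0), (1.0, 1.0, 1.0)]
blue = [(0.0, 0.0, 1.0), (0.9, 0.01, 0.01), (1.0, 0.0, 1.0)]

for x in (0.1, 0.25, 0.5, 0.95):
    print(f"x = {x}: red = {channel_value(red, x):.3f}, blue = {channel_value(blue, x):.3f}")
```

Red climbs from 0.01 to 1.0 across the narrow interval 0.15 to 0.35, which is what gives the values of t1/t2_max their visible spread.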