I currently work with an instrument that provides data in Wavenumber, but most of my community works in wavelength. Because of this I would like to create plots that display Wavenumber in cm^-1 along the bottom x-axis and wavelength in µm along the top. However the spacing doesn't quite match up between the two units of measurement to display a single spectrum. How do I create a different spacing for wavelength?
Here is an example of how a portion of one spectrum looks when plotted as a function of wavenumber against when it's plotted as a function of wavelength. Below is the code I'm currently implementing.
wn = wn_tot[425:3175] #range of 250 to 3000 cm-1
wl = 10000/wn #wavelength in microns
fig = plt.figure(1)
ax1 = plt.subplot(1,1,1)
ax2 = ax1.twiny()
ax1.plot(wn, spc[45], 'c', label='Wavenumber')
ax2.plot(wl, spc[45], 'm', label='Wavelength')
ax1.set_xlabel('Wavenumber (cm$^{-1}$)')
ax2.set_xlabel('Wavelength ($\mu$m)')
ax1.set_ylabel('Relative Intensity')
ax2.invert_xaxis()
fig.legend(loc=2, bbox_to_anchor=(0,1), bbox_transform=ax1.transAxes)
As said in the comment on the OP, both scales cannot be simultaneously linear, since one cannot be obtained from the other via a linear transformation. You must hence accept that one (or both) have ticks at non-regular intervals.
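To see this concretely, here is a small sketch (the three wavenumber values are picked only for illustration, they are not from the question's data). Evenly spaced wavenumbers map to unevenly spaced wavelengths:

```python
# Evenly spaced wavenumbers (cm^-1); values chosen only for illustration
wavenumbers = [1000.0, 2000.0, 3000.0]

# Convert to wavelength in microns: lambda = 1e4 / wavenumber
wavelengths = [1.0e4 / wn for wn in wavenumbers]

# Spacing between neighbouring points in each unit
wn_gaps = [b - a for a, b in zip(wavenumbers, wavenumbers[1:])]
wl_gaps = [a - b for a, b in zip(wavelengths, wavelengths[1:])]

print(wn_gaps)  # equal steps in wavenumber
print(wl_gaps)  # unequal steps in wavelength
```

The first wavelength step is several times larger than the second, so regular ticks in one unit cannot be regular in the other.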
The correct way to do it
Apply a transformation to the scale, which causes matplotlib to have a non-homogeneous scale.
The doc for Axes.set_yscale leads to an example that demonstrates the syntax ax1.set_xscale('function', functions=(forward, inverse)). In this case, the transformation functions are simply
def forward(wn):
    # cm^{-1} to μm
    return 1.0e4 / wn

def reverse(lam):
    # μm to cm^{-1}
    return 1.0e4 / lam
However, my matplotlib is stuck on version 2.2.2 which does not have that feature, so I cannot give a working example.
The hacky way that works with older versions
Give tick positions and labels by hand, performing the calculations yourself.
# -*- coding: utf-8 -*-
import numpy as np
import matplotlib.pyplot as plt
def lambda_to_wave(lam):
    # μm to cm^{-1}
    return 1.0e4 / lam
x_wave = np.linspace(2000.0, 3000.0)
y_arb = np.linspace(0.0, 1.0e6)
ticks_wavelength_values = np.linspace(3.5, 5.5, num=5)
ticks_labels = [str(lam) for lam in ticks_wavelength_values]
ticks_wavenumber_positions = lambda_to_wave(ticks_wavelength_values)
print(ticks_wavelength_values)
print(ticks_wavenumber_positions)
fig = plt.figure(1)
ax1 = plt.subplot(1,1,1) # wavenumber
ax2 = ax1.twiny() # wavelength
ax2.get_shared_x_axes().join(ax1, ax2) # https://stackoverflow.com/questions/42973223/how-share-x-axis-of-two-subplots-after-they-are-created
ax1.plot(x_wave, y_arb, 'c', label='Data')
ax1.set_xlabel('Wavenumber (cm$^{-1}$)')
ax1.set_ylabel('Relative Intensity')
ax2.set_xticks(ticks_wavenumber_positions)
ax2.set_xticklabels(ticks_labels)
ax2.set_xlabel('Wavelength ($\mu$m)')
ax1.set_xlim(left=1800.0, right=3000.0)
fig.legend(loc=2, bbox_to_anchor=(0,1), bbox_transform=ax1.transAxes)
plt.show()
You can do without the second call to plot if you prefer: https://matplotlib.org/gallery/subplots_axes_and_figures/secondary_axis.html#sphx-glr-gallery-subplots-axes-and-figures-secondary-axis-py
wn = wn_tot[425:3175] #range of 250 to 3000 cm-1
fig = plt.figure(1)
ax1 = plt.subplot(1,1,1)
ax1.plot(wn, spc[45], 'c', label='Wavenumber')
def forward(x):
    # cm^-1 to microns
    return 10000 / x
def inverse(x):
    # microns to cm^-1
    return 10000 / x
secax = ax1.secondary_xaxis('top', functions=(forward, inverse))
ax1.set_xlabel('Wavenumber (cm$^{-1}$)')
secax.set_xlabel('Wavelength ($\mu$m)')
ax1.set_ylabel('Relative Intensity')
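For anyone who wants to try this without the original data, here is a self-contained sketch. The Gaussian "spectrum" is a synthetic stand-in, the function names wn_to_wl and wl_to_wn are my own, and I am assuming matplotlib 3.1 or newer, where secondary_xaxis is available:

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for an interactive window
import matplotlib.pyplot as plt
import numpy as np

# Synthetic stand-in for the spectrum (assumption: not the asker's data)
wn = np.linspace(250.0, 3000.0, 500)  # wavenumber in cm^-1
intensity = np.exp(-((wn - 1500.0) / 300.0) ** 2)

def wn_to_wl(x):
    # cm^-1 to microns
    return 1.0e4 / x

def wl_to_wn(x):
    # microns to cm^-1
    return 1.0e4 / x

fig, ax1 = plt.subplots()
ax1.plot(wn, intensity, 'c')
ax1.set_xlim(250.0, 3000.0)  # keep 0 out of range: 1e4 / 0 is undefined
ax1.set_xlabel(r'Wavenumber (cm$^{-1}$)')
ax1.set_ylabel('Relative Intensity')

# The top axis is derived from the bottom one, so only one plot call is needed
secax = ax1.secondary_xaxis('top', functions=(wn_to_wl, wl_to_wn))
secax.set_xlabel(r'Wavelength ($\mu$m)')

fig.canvas.draw()  # render once so the secondary ticks are computed
```

Because the conversion is its own inverse, the same expression serves as both the forward and the inverse function.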
Related
I am trying to label the intersection of two lines in a plot I have made. The code/MWE is:
import matplotlib.pyplot as plt
import numpy as np
#ignore my gross code, first time ever using Python :-)
#parameters
d = 0.02
s = 0.50 #absurd, but dynamics robust to 1>s>0
A = 0.90
u = 0.90
#variables
kt = np.arange(0, 50, 1)
invest = (1 - np.exp(-d*kt))*kt
output = A*u*kt
saving = s*output
#plot
plt.plot(kt, invest, 'r', label='Investment')
plt.plot(kt, output, 'b', label='Output')
plt.plot(kt, saving, label='Saving')
plt.xlabel('$K_t$')
plt.ylabel('$Y_t$, $S_t$, $I_t$')
plt.legend(loc="upper left")
#Steady State; changes with parameters
Kbar = np.log(1-s*A*u)/-d
x, y = [Kbar, Kbar], [0, s*A*u*Kbar]
plt.plot(x, y, 'k--')
#custom axes (no top and right)
ax = plt.gca()
right_side = ax.spines["right"]
right_side.set_visible(False)
top_side = ax.spines["top"]
top_side.set_visible(False)
#ax.grid(True) #uncomment for gridlines
plt.xlim(xmin=0) #no margins; preference
plt.ylim(ymin=0)
plt.show()
which creates:
I am trying to create a little label at the bottom of the dotted black line that says "$K^*$". I want it to coincide with Kbar so that, like the black line, it moves along with the parameters. Any tips or suggestions here?
I don't quite understand what you mean by "under the black dotted line", but you can already use the coordinate data of the dotted line to annotate it. I put it above the intersection point, but if you want to put it near the x-axis, you can set y=0.
plt.text(max(x), max(y)+1.5, '$K^*$', transform=ax.transData)
baseTicks=list(plt.xticks()[0]) #for better control, replace with a range or arange
ax.set_xticks(baseTicks+[np.log(1-A*u*s)/(-d)])
ax.set_xticklabels(baseTicks+['$K^*$'])
I am learning to make color bars, and thus learning to make good use of plt.Normalize. I managed to make it work with scipy.stats.norm, but when trying to use plt.Normalize, I found that I have to do two things to make it work well:
setting vmin and vmax to -1.96 and 1.96 respectively. I guess that is because they are the z values for a 95% confidence interval, but I still don't know precisely why we have to set vmin and vmax to those values
dividing the standard deviation by sqrt(number of elements)
I don't understand why those two points are important for using the norm. Any help is welcome! Thank you in advance.
# Use the following data for this assignment:
%matplotlib notebook
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import scipy.stats as st
df = pd.DataFrame([np.random.normal(33500,150000,3650),
np.random.normal(41000,90000,3650),
np.random.normal(41000,120000,3650),
np.random.normal(48000,55000,3650)],
index=[1992,1993,1994,1995])
new_df = pd.DataFrame()
new_df['mean'] = df.mean(axis =1)
new_df['std'] = df.std(axis =1)
new_df['se'] = df.sem(axis= 1)
new_df['C_low'] = new_df['mean'] - 1.96 * new_df['se']
new_df['C_high'] = new_df['mean'] + 1.96 * new_df['se']
from scipy.stats import norm
import numpy as np
# First, Define a figure
fig = plt.figure()
# next define its the axis and create a plot
ax = fig.add_subplot(1,1,1)
# change the ticks
xticks = np.array(new_df.index,dtype= 'str')
# remove the top and right borders
ax.spines['top'].set_visible(False)
ax.spines['right'].set_visible(False)
# draw the bars in the axis
bars = ax.bar(xticks,new_df['mean'].values,
yerr = (1.96*new_df['se'],1.96*new_df['se']),
capsize= 10)
# define labels
plt.xlabel('YEARS',size = 14)
plt.ylabel('FREQUENCY',size = 14)
# Define color map
cmap = plt.cm.get_cmap('coolwarm')
# define scalar mappable
sm = plt.cm.ScalarMappable(cmap = cmap)
# draw the color bar
cbar = plt.colorbar(cmap = cmap, mappable =sm)
# define norm (will be used later to turn y to a value from 0 to 1 )
# define the events
class Cursor(object):
    def __init__(self,ax):
        self.ax = ax
        self.lx = ax.axhline(color = 'c')
        self.txt = ax.text(1,50000,'')
    def mouse_movemnt(self,event):
        #behaviour outside of the plot
        if not event.inaxes:
            return
        #behavior inside the plot
        y = event.ydata
        self.lx.set_ydata(y)
        for idx,bar in zip(new_df.index, bars):
            norm = plt.Normalize(vmin =-1.96,vmax = 1.96)
            mean = new_df.loc[idx,'mean']
            err = new_df.loc[idx, 'se']
            std = new_df.loc[idx,'std']/ np.sqrt(df.shape[1]) # not sure why we re dividing by np.sqrt(df.shape[1])
            self.txt.set_text(f'Y = {round(y,2)} \n')
            color_prob = norm( (mean - y)/std)
            #color_prob = norm.cdf(y,loc = mean, scale = err) # you can also use this
            bar.set_color( cmap(color_prob))
# connect the events to the plot
cursor = Cursor(ax)
plt.connect('motion_notify_event', cursor.mouse_movemnt)
None
After a few hours of thinking, an explanation came to me and I was able to answer both of my questions.
Before answering the first point, I will answer the second one: the standard deviation was divided by sqrt(number of elements) because the resulting value is the standard error.
I will now move on to the first part:
(I can't embed images for now and I can't use LaTeX either, so I have to link to the image instead.) But here is the conclusion in advance: for all values within that confidence interval, the function (y-mean)/se returns a value within the range [−1.96, 1.96]. Setting vmin and vmax to those bounds therefore maps the whole confidence interval linearly onto [0, 1], which is the range the colormap expects.
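A small numeric check of that conclusion, as a sketch only: the mean and standard error below are made-up values, not taken from the data frame in the question.

```python
import numpy as np
from matplotlib.colors import Normalize

# Hypothetical sample statistics, chosen only for illustration
mean, se = 40000.0, 2000.0

# Lower edge, centre and upper edge of the 95% confidence interval
y_values = np.array([mean - 1.96 * se, mean, mean + 1.96 * se])

# Standardise: distance from the mean in units of the standard error
z = (mean - y_values) / se

# Map z linearly from [-1.96, 1.96] onto [0, 1] for the colormap
norm = Normalize(vmin=-1.96, vmax=1.96)
colour_position = np.asarray(norm(z))

print(z)                # roughly 1.96, 0 and -1.96
print(colour_position)  # roughly 1, 0.5 and 0
```

A y value outside the interval gives a position outside [0, 1], which the colormap then draws with its end colour.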
answer of first part
Please, if I left something out or you have a better answer, share it with me.
I'm implementing a Naive Bayes classifier.
I have the following figure showing me my classification boundaries:
I want to make the axes equally scaled for the figure, because I think it would help me better understand what is going on. However, I haven't found any way to do this. The plot is generated by a function not written by me:
%matplotlib inline
plotBoundary(BayesClassifier(), dataset='iris',split=0.7)
# ## Plotting the decision boundary
#
# This is some code that you can use for plotting the decision boundary
# boundary in the last part of the lab.
def plotBoundary(classifier, dataset='iris', split=0.7):
    X,y,pcadim = fetchDataset(dataset)
    xTr,yTr,xTe,yTe,trIdx,teIdx = trteSplitEven(X,y,split,1)
    classes = np.unique(y)
    pca = decomposition.PCA(n_components=2)
    pca.fit(xTr)
    xTr = pca.transform(xTr)
    xTe = pca.transform(xTe)
    pX = np.vstack((xTr, xTe))
    py = np.hstack((yTr, yTe))
    # Train
    trained_classifier = classifier.trainClassifier(xTr, yTr)
    xRange = np.arange(np.min(pX[:,0]),np.max(pX[:,0]),np.abs(np.max(pX[:,0])-np.min(pX[:,0]))/100.0)
    yRange = np.arange(np.min(pX[:,1]),np.max(pX[:,1]),np.abs(np.max(pX[:,1])-np.min(pX[:,1]))/100.0)
    grid = np.zeros((yRange.size, xRange.size))
    for (xi, xx) in enumerate(xRange):
        for (yi, yy) in enumerate(yRange):
            # Predict
            grid[yi,xi] = trained_classifier.classify(np.array([[xx, yy]]))
    ys = [i+xx+(i*xx)**2 for i in range(len(classes))]
    colormap = cm.rainbow(np.linspace(0, 1, len(ys)))
    fig = plt.figure()
    # plt.hold(True)
    conv = ColorConverter()
    for (color, c) in zip(colormap, classes):
        try:
            CS = plt.contour(xRange,yRange,(grid==c).astype(float),15,linewidths=0.25,colors=conv.to_rgba_array(color))
        except ValueError:
            pass
        trClIdx = np.where(y[trIdx] == c)[0]
        teClIdx = np.where(y[teIdx] == c)[0]
        plt.scatter(xTr[trClIdx,0],xTr[trClIdx,1],marker='o',c=color,s=40,alpha=0.5, label="Class "+str(c)+" Train")
        plt.scatter(xTe[teClIdx,0],xTe[teClIdx,1],marker='*',c=color,s=50,alpha=0.8, label="Class "+str(c)+" Test")
    plt.legend(bbox_to_anchor=(1., 1), loc=2, borderaxespad=0.)
    fig.subplots_adjust(right=0.7)
    plt.axis("equal") # <------- TRIED TO INJECT axis("equal") here
    plt.show()
I've tried injecting plt.axis("equal") into this function (1 line from the bottom of the code) but it doesn't make my axes equal. How can I achieve this?
EDIT: I also tried injecting plt.gca().set_aspect('equal', adjustable='box'). It didn't change anything.
The equal keyword scales x and y to be on the same scale. However, if you meant that you want square axes, you can try plt.axis('square')
You can set the limits manually:
xmin, xmax = plt.xlim()
ymin, ymax = plt.ylim()
fmin = min(xmin, ymin)
fmax = max(xmax, ymax)
plt.xlim(fmin, fmax)
plt.ylim(fmin, fmax)
Then make sure you have a 1:1 aspect ratio
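Put together, the two steps could look like the sketch below. The scatter points are placeholder data, and using set_aspect for the 1:1 ratio is my assumption of what the last step means:

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for an interactive window
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
ax.scatter([0.0, 1.0, 4.0], [10.0, 12.0, 11.0])  # placeholder data

# Use the same limits on both axes
xmin, xmax = ax.get_xlim()
ymin, ymax = ax.get_ylim()
fmin, fmax = min(xmin, ymin), max(xmax, ymax)
ax.set_xlim(fmin, fmax)
ax.set_ylim(fmin, fmax)

# One data unit in x now takes the same length on screen as one unit in y
ax.set_aspect('equal', adjustable='box')
```

With identical limits and an equal aspect, the plotting area comes out square.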
Plotting Differences between bar and hist
Given some data in a pandas.Series , rv, there is a difference between
Calling hist directly on the data to plot
Calculating the histogram results (with numpy.histogram) then plotting with bar
Example Data Generation
%matplotlib inline
import numpy as np
import pandas as pd
import scipy.stats as stats
import matplotlib
matplotlib.rcParams['figure.figsize'] = (12.0, 8.0)
matplotlib.style.use('ggplot')
# Setup size and distribution
size = 50000
distribution = stats.norm()
# Create random data
rv = pd.Series(distribution.rvs(size=size))
# Get sane start and end points of distribution
start = distribution.ppf(0.01)
end = distribution.ppf(0.99)
# Build PDF and turn into pandas Series
x = np.linspace(start, end, size)
y = distribution.pdf(x)
pdf = pd.Series(y, x)
# Get histogram of random data
y, x = np.histogram(rv, bins=50, density=True)  # density replaces the removed normed argument
# Correct bin edge placement
x = [(a+x[i+1])/2.0 for i,a in enumerate(x[0:-1])]
hist = pd.Series(y, x)
hist() Plotting
ax = pdf.plot(lw=2, label='PDF', legend=True)
rv.plot(kind='hist', bins=50, density=True, alpha=0.5, label='Random Samples', legend=True, ax=ax)
bar() Plotting
ax = pdf.plot(lw=2, label='PDF', legend=True)
hist.plot(kind='bar', alpha=0.5, label='Random Samples', legend=True, ax=ax)
How can the bar plot be made to look like the hist plot?
The use case for this is needing to save only the histogrammed data to use and plot later (it is typically smaller in size than the original data).
Bar plotting differences
Obtaining a bar plot that looks like the hist plot requires some manipulating of default behavior for bar.
Force bar to use actual x data for plotting range by passing both x (hist.index) and y (hist.values). The default bar behavior is to plot the y data against an arbitrary range and put the x data as the label.
Set the width parameter to be related to actual step size of x data (The default is 0.8)
Set the align parameter to 'center'.
Manually set the axis legend.
These changes need to be made via matplotlib's bar() called on the axis (ax) instead of pandas's bar() called on the data (hist).
Example Plotting
%matplotlib inline
import numpy as np
import pandas as pd
import scipy.stats as stats
import matplotlib
matplotlib.rcParams['figure.figsize'] = (12.0, 8.0)
matplotlib.style.use('ggplot')
# Setup size and distribution
size = 50000
distribution = stats.norm()
# Create random data
rv = pd.Series(distribution.rvs(size=size))
# Get sane start and end points of distribution
start = distribution.ppf(0.01)
end = distribution.ppf(0.99)
# Build PDF and turn into pandas Series
x = np.linspace(start, end, size)
y = distribution.pdf(x)
pdf = pd.Series(y, x)
# Get histogram of random data
y, x = np.histogram(rv, bins=50, density=True)  # density replaces the removed normed argument
# Correct bin edge placement
x = [(a+x[i+1])/2.0 for i,a in enumerate(x[0:-1])]
hist = pd.Series(y, x)
# Plot previously histogrammed data
ax = pdf.plot(lw=2, label='PDF', legend=True)
w = abs(hist.index[1] - hist.index[0])  # bin width: take abs of the difference, not the difference of abs
ax.bar(hist.index, hist.values, width=w, alpha=0.5, align='center')
ax.legend(['PDF', 'Random Samples'])
Another, simpler solution is to create fake samples that reproduce the same histogram and then simply use hist().
I.e., after retrieving bins and counts from stored data, do
fake = np.array([])
for i in range(len(counts)):
    a, b = bins[i], bins[i+1]
    sample = a + (b-a)*np.random.rand(counts[i])
    fake = np.append(fake, sample)
plt.hist(fake, bins=bins)
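A different route, which is not what the answer above does, is to skip the fake samples and pass the stored counts as weights. The sketch below checks the idea with np.histogram; the counts and bins are generated here only as a stand-in for stored data:

```python
import numpy as np

# Stand-in for stored histogram data (assumption: one count per bin)
rng = np.random.default_rng(0)
counts, bins = np.histogram(rng.normal(size=1000), bins=20)

# Put one point at the left edge of each bin and weight it by that bin's count
rebuilt, _ = np.histogram(bins[:-1], bins=bins, weights=counts)

print(np.array_equal(rebuilt, counts))
```

The same arguments can be given to plt.hist(bins[:-1], bins=bins, weights=counts), which also works when the stored values are densities rather than integer counts.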
Is it possible to clip an image generated by imshow() to the area under a line/multiple lines? I think Clip an image using several patches in matplotlib may have the solution, but I'm not sure how to apply it here.
I just want the coloring (from imshow()) under the lines in this plot:
Here is my plotting code:
from __future__ import division
from matplotlib.pyplot import *
from numpy import *
# wavelength array
lambd = logspace(-3.8, -7.2, 1000)
# temperatures
T_earth = 300
T_sun = 6000
# planck's law constants
h = 6.626069e-34
c = 2.997925e8
k = 1.380648e-23
# compute power using planck's law
power_earth = 2*h*c**2/lambd**5 * 1/(exp(h*c/(lambd*k*T_earth)) - 1)
power_sun = 2*h*c**2/lambd**5 * 1/(exp(h*c/(lambd*k*T_sun)) - 1)
# set up color array based on "spectrum" colormap
colors = zeros((1000,1000))
colors[:,:1000-764] = 0.03
for x,i in enumerate(range(701,765)):
colors[:,1000-i] = 1-x/(765-701)
colors[:,1000-701:] = 0.98
figure(1,(4,3),dpi=100)
# plot normalized planck's law graphs
semilogx(lambd, power_earth/max(power_earth), 'b-', lw=4, zorder=5)
semilogx(lambd, power_sun/max(power_sun), 'r-', lw=4, zorder=5)
# remove ticks (for now)
yticks([]); xticks([])
# set axis to contain lines nicely
axis([min(lambd), max(lambd), 0, 1.1])
# plot colors, shift extent to match graph
imshow(colors, cmap="nipy_spectral", extent=[min(lambd), max(lambd), 0, 1.1])  # "spectral" was renamed "nipy_spectral"
# reverse x-axis (longer wavelengths to the left)
ax = gca(); ax.set_xlim(ax.get_xlim()[::-1])
tight_layout()
show()
What you can do in this case is use the area under the curve as a Patch and pass it to set_clip_path. All you have to do is call fill_between and extract the corresponding path, like this:
semilogx(lambd, power_earth/max(power_earth), 'b-', lw=4, zorder=5)
# Area under the curve
fillb_earth = fill_between(lambd, power_earth/max(power_earth), color='none', lw=0)
# Get the path
path_earth, = fillb_earth.get_paths()
# Create a Patch (PathPatch lives in matplotlib.patches)
from matplotlib.patches import PathPatch
mask_earth = PathPatch(path_earth, fc='none')
# Add it to the current axes
gca().add_patch(mask_earth)
# Add the image
im_earth = imshow(colors, cmap="nipy_spectral", extent=[min(lambd), max(lambd), 0, 1.1])  # "spectral" was renamed "nipy_spectral"
# Clip the image with the Patch
im_earth.set_clip_path(mask_earth)
And then repeat the same lines for the Sun. Here is the result.
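For reference, the same technique in a standalone form. This is only a sketch: the sine curve, the gradient image and the viridis colormap are placeholders I picked, not the Planck spectra from the question.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for an interactive window
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.patches import PathPatch

# Placeholder curve and image (assumptions, not the data from the question)
x = np.linspace(0.0, 1.0, 200)
y = np.sin(np.pi * x)
gradient = np.tile(np.linspace(0.0, 1.0, 256), (2, 1))

fig, ax = plt.subplots()
ax.plot(x, y, 'k-', lw=2, zorder=5)

# Invisible filled region under the curve; its outline becomes the clip shape
fill = ax.fill_between(x, y, color='none', lw=0)
path, = fill.get_paths()
mask = PathPatch(path, fc='none', ec='none')
ax.add_patch(mask)

# Draw the image over the full extent, then clip it to the region under the curve
im = ax.imshow(gradient, cmap='viridis', extent=[0.0, 1.0, 0.0, 1.1], aspect='auto')
im.set_clip_path(mask)
```

For several curves, each one gets its own fill_between, patch and imshow call, as described above for the Earth and Sun spectra.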