The following code is creating an artefact when shifting images by Fourier phase shift:
The code of the phase shift itself is:
import numpy as np

def phase_shift(fimage, dx, dy):
    # Shift the phase of the fourier transform of an image
    dims = fimage.shape
    x, y = np.meshgrid(np.arange(-dims[1] / 2, dims[1] / 2), np.arange(-dims[0] / 2, dims[0] / 2))
    kx = -1j * 2 * np.pi * x / dims[1]
    ky = -1j * 2 * np.pi * y / dims[0]
    shifted_fimage = fimage * np.exp(-(kx * dx + ky * dy))
    return shifted_fimage
Usage to actually shift the image and get the shifted image:
def translate_by_phase_shift(image, dx, dy):
    # Get the fourier transform
    fimage = np.fft.fftshift(np.fft.fftn(image))
    # Phase shift
    shifted_fimage = phase_shift(fimage, dx, dy)
    # Inverse transform -> translated image
    shifted_image = np.real(np.fft.ifftn(np.fft.ifftshift(shifted_fimage)))
    return shifted_image
The artifact is shown in the images below (the image has even dimensions). The top row is context (entire image), the bottom row is the close-up of the red rectangle. Left: reference image. Middle: shifted with the above code and subject to the artifact. Right: what it looks like when using cv2.warpAffine() with the same shifts.
What am I doing wrong in the code above that creates this artifact?
[UPDATE] One of the comments suggested using scipy.ndimage.fourier.fourier_shift(). So I did just that:
fourier_shifted_image = fourier_shift(np.fft.fftn(image), shift)
shifted_image = np.fft.ifftn(fourier_shifted_image)
and plotted the real part (shifted_image.real)
In fact, it produces the exact same artifact (see image below, right-hand side), which I guess rules out a mistake in my custom phase_shift() code above?
[UPDATE] Now that we have ruled out my phase_shift() function, here is a reproducible example, provided that you download the image array from here:
https://www.dropbox.com/s/dmbv56xfqkv8qqz/image.npy?dl=0
import os
import numpy as np
import matplotlib
matplotlib.use('TKAgg')
import matplotlib.pyplot as plt
from scipy.ndimage import fourier_shift  # the scipy.ndimage.fourier namespace is deprecated
# Load the image (update path according to your case)
image = np.load(os.path.expanduser('~/DDS/46P_Wirtanen/image.npy'))
# Shift vector
shift = np.array([-3.75, -7.5 ])
# Phase-shift
fourier_shifted_image = fourier_shift(np.fft.fftn(image), shift)
shifted_image = np.fft.ifftn(fourier_shifted_image)
interp_method = 'hanning'
zoomfov = [1525, 1750, 1010, 1225]
vmin = np.percentile(image, 0.1)
vmax = np.percentile(image, 99.8)
fig, ax = plt.subplots(1,2, figsize=(14, 6), sharex=True,sharey=True)
ax[0].imshow(image, origin='lower', cmap='gray', vmin=vmin, vmax=vmax, interpolation=interp_method)
ax[0].set_title('Original image')
ax[1].imshow(shifted_image.real, origin='lower', cmap='gray', vmin=vmin, vmax=vmax, interpolation=interp_method)
ax[1].set_title('with scipy.ndimage.fourier.fourier_shift()')
plt.axis(zoomfov)
plt.tight_layout()
plt.show()
And the output looks like this:
[UPDATE]
Following the reply from Cris, I played with other interpolation methods from opencv, using a logarithmic scaling of the intensity. I arrive at similar conclusions: the artifact is indeed also present with the Lanczos flag in cv2.warpAffine(), although very faint, and the cubic one clearly works better for this case of undersampled objects (here, stars):
The code to get to this:
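# Compare interpolation methods
import cv2
from matplotlib.colors import LogNorm
# Note: add_rectangle(), used below, is a plotting helper that is not shown in this post.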
# Fourier phase shift.
fourier_shifted = fourier_shift(np.fft.fftn(image), shift)
fourier_shifted_image = np.fft.ifftn(fourier_shifted).real
# Use opencv
Mtrans = np.float32([[1,0,shift[1]],[0,1, shift[0]]])
shifted_image_cubic = cv2.warpAffine(image, Mtrans, image.shape[::-1], flags=cv2.INTER_CUBIC)
shifted_image_lanczos = cv2.warpAffine(image, Mtrans, image.shape[::-1], flags=cv2.INTER_LANCZOS4)
zoomfov = [1525, 1750, 1010, 1225]
pmin = 2
pmax = 99.999
fig, ax = plt.subplots(1,3, figsize=(19, 7), sharex=True,sharey=True)
# The limits go into LogNorm itself: recent matplotlib rejects vmin/vmax passed alongside norm.
ax[0].imshow(fourier_shifted_image, origin='lower', cmap='gray',
             norm=LogNorm(vmin=np.percentile(fourier_shifted_image, pmin),
                          vmax=np.percentile(fourier_shifted_image, pmax)),
             interpolation=interp_method)
add_rectangle(zoomfov, ax[0])
ax[0].set_title('shifted with Fourier phase shift')
ax[1].imshow(shifted_image_cubic, origin='lower', cmap='gray',
             norm=LogNorm(vmin=np.percentile(shifted_image_cubic, pmin),
                          vmax=np.percentile(shifted_image_cubic, pmax)),
             interpolation=interp_method)
add_rectangle(zoomfov, ax[1])
ax[1].set_title('with cv2.warpAffine(...,flags=cv2.INTER_CUBIC)')
ax[2].imshow(shifted_image_lanczos, origin='lower', cmap='gray',
             norm=LogNorm(vmin=np.percentile(shifted_image_lanczos, pmin),
                          vmax=np.percentile(shifted_image_lanczos, pmax)),
             interpolation=interp_method)
#ax[2].imshow(shifted_image.real, origin='lower', cmap='gray', vmin=np.percentile(Llights_prep[frame], pmin), vmax=np.percentile(Llights_prep[frame], pmax), interpolation=interp_method)
add_rectangle(zoomfov, ax[2])
ax[2].set_title('with cv2.warpAffine(...,flags=cv2.INTER_LANCZOS4) ')
plt.axis(zoomfov)
plt.tight_layout()
plt.subplots_adjust(left=None, bottom=None, right=None, top=None, wspace=0.1, hspace=None)
plt.show()
And to reply to Cris' questions: undersampled stars are indeed inescapable with our modest amateur imaging systems (a poor 130 mm diameter), and I naively applied the same algorithm that I use for professional, bigger instruments, where this problem did not show.
The issue here is related to the way that the image is displayed, and to undersampling of the image. The code is correct, but inappropriate for the image.
1. Undersampling
The image has some very sharp transitions. Some stars show only in one single pixel. This is the hallmark of undersampling. In a properly sampled image, a single point of light (no matter how small) appears as an Airy disk (in the case of an ideal lens) in the image, and should occupy several pixels to prevent aliasing.
I'm assuming that the imaging cannot be changed, and is optimized for the application.
However, it is important to note how the image is sampled in order to choose appropriate image processing tools.
In this case, the undersampled transitions mean that Fourier-based interpolation is not ideal.
2. Fourier-based interpolation
When shifting or scaling the image through the Fourier domain, a sinc interpolator is used. This is the ideal interpolator, and corresponds to a rectangular window in the Fourier domain. The sinc interpolator extends infinitely (or at least to the edges of the image), and decays with 1/x, which is quite slow. It is therefore not ideal in the case of undersampled images.
Because the undersampled image has sharp transitions, the sinc interpolator causes ringing (as do many other interpolators). And because of the slow decay of the sinc function, this ringing carries very far.
For example, the artificial sharp transition in this figure (blue), when interpolated through the Fourier domain (red), shows strong ringing that carries very far. This figure contrasts that with other interpolators that carry the ringing to different distances.
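The slow decay of the ringing can also be reproduced numerically. The following is a minimal sketch, assuming a 1-D signal of odd length with a single bright pixel as a stand-in for an undersampled star; the helper name fourier_shift_1d is made up for this illustration and is not part of any library.

```python
import numpy as np

def fourier_shift_1d(signal, shift):
    """Shift a 1-D signal by `shift` samples using the Fourier shift theorem."""
    n = signal.size
    freqs = np.fft.fftfreq(n)  # frequencies in cycles per sample, in FFT order
    phase = np.exp(-2j * np.pi * freqs * shift)
    return np.fft.ifft(np.fft.fft(signal) * phase).real

# A single bright pixel: the extreme case of an undersampled star (assumption).
n = 101
star = np.zeros(n)
star[50] = 1.0

whole = fourier_shift_1d(star, 3.0)  # integer shift: the pixel simply moves
half = fourier_shift_1d(star, 0.5)   # sub-pixel shift: sinc interpolation rings

# Amplitude of the ringing more than 40 samples away from the star.
ringing_far_away = np.abs(half[:10]).max()
print(whole.argmax(), half.min(), ringing_far_away)
```

With an integer shift nothing rings, because the sinc is sampled exactly at its zeros. With a half-pixel shift the result goes negative next to the star and the oscillation is still present at the far end of the array, which is what a strong contrast stretch then reveals.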
3. Image display
The image is displayed in the question by stretching the contrast very strongly. This is meant to allow observation of dim stars, but also strongly enhances the ringing caused by the sharp transitions at those stars. In the plot above, imagine stretching and clipping the y-axis so you only see the region y=[0,0.01]. The ringing will look like a black-and-white pattern.
4. Alternative interpolators
The plot above shows the effect of different interpolators on a sharp transition. When applied to shift the image in the question, this is the result:
For the three methods on the bottom row, the ringing is not observable because it happens in a region that is fully saturated in the image display. Using a different range of grey-values in the display might show some ringing here too.
All these interpolators are designed to approximate the ideal sinc interpolator, but with a shorter spatial footprint so that they are cheaper to compute. Therefore, they all show some ringing at undersampled transitions.
The only interpolators that do not cause ringing at sharp edges are linear interpolation and nearest neighbor interpolation. Whether these are suitable for your application depends on the application, I cannot say.
This is the code I used to make the graph above:
a = double((0:99)<50);
b = resample(a,20,0,'ft');
c = resample(a,20,0,'3-cubic');
d = resample(a,20,0,'lanczos8');
a = resample(a,20,0,'nn');
plot(a)
hold on
plot(b)
plot(c)
plot(d)
legend({'input','sinc','cubic','Lanczos (8)'})
set(gca,'xlim',[600,1400],'ylim',[-0.2,1.2])
set(gca,'fontsize',16)
set(gca,'linewidth',1)
set(get(gca,'children'),'linewidth',2)
set(gca,'Position',[0.07,0.11,0.9,0.815])
The function resample is in DIPimage, you could use imresize instead, except for the 'ft' method, which simply pads the frequency domain with zeros, leading to a sinc interpolation.
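The statement that linear interpolation does not ring at a sharp edge while higher-order interpolators do can be checked in Python as well. This is only a sketch under assumptions: it uses scipy.ndimage.shift, whose order=3 mode is a cubic B-spline interpolator and therefore not identical to the cubic kernel of OpenCV or DIPimage, and the step signal mirrors the one used for the plot.

```python
import numpy as np
from scipy import ndimage

# The same kind of sharp transition as in the plot: ones followed by zeros.
step = (np.arange(100) < 50).astype(float)

# Shift by half a pixel with linear and with cubic spline interpolation.
linear = ndimage.shift(step, 0.5, order=1, mode="nearest")
cubic = ndimage.shift(step, 0.5, order=3, mode="nearest")

def overshoot(values):
    """How far the values leave the input range [0, 1]."""
    return max(values.max() - 1.0, 0.0 - values.min(), 0.0)

linear_overshoot = overshoot(linear)
cubic_overshoot = overshoot(cubic)
print(linear_overshoot, cubic_overshoot)
```

The linear result is a weighted average of two neighbours and so stays inside the input range, whereas the cubic result leaves it near the edge, which is the ringing discussed above.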
Take a look at ndimage.fourier_shift, as far as I know that does not create any artefacts.
I have a simulated signal which is displayed as an histogram. I want to emulate the real measured signal using a convolution with a Gaussian with a specific width, since in the real experiment a detector has a certain uncertainty in the measured channels.
I have tried to do a convolution using np.convolve as well as scipy.signal.convolve, but can't seem to get the filtering right. Not only is the shape different from what I expect, which would be a slightly smeared version of the histogram, but the x-axis (i.e. the energy scale) is off as well.
I tried defining my Gaussian with a width of 20 keV as:
gauss = np.random.normal(0, 20000, len(coincidence['esum']))
hist_gauss = plt.hist(gauss, bins=100)[0]
where len(coincidence['esum']) is the length of my coincidence dataframe column. I bin this column using:
counts = plt.hist(coincidence['esum'], bins=100)[0]
Besides this approach to generate a suitable Gaussian I tried scipy.signal.gaussian(50, 30000) which unfortunately generates a parabolic looking curve and does not exhibit the characteristic tails.
I tried doing the convolution using both coincidence['esum'] and counts with both Gaussian approaches. Note that when doing a simple convolution with the standard example according to Finding the convolution of two histograms it works without problems.
Would anyone know how to do such a convolution in python? I exported the column coincidence['esum'] that I use for my histogram to a pastebin, in case anyone is interested and wants to recreate it with the specific data https://pastebin.com/WFiSBFa6
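One way to read the problem above: the kernel has to be expressed in units of histogram bins, not keV. The std argument of scipy.signal.gaussian is given in samples, so a value of 30000 on a 50-sample window yields an almost flat curve. Below is a minimal sketch of the smearing under stated assumptions: the energies are synthetic placeholders in eV (not the pastebin data), and the 20 keV is interpreted as the Gaussian sigma.

```python
import numpy as np

# Hypothetical stand-in for coincidence['esum'], energies in eV (assumption).
rng = np.random.default_rng(0)
energies = rng.uniform(1.0e6, 1.2e6, 50_000)

# Bin with some empty margin so the smeared tails have room on both sides.
counts, edges = np.histogram(energies, bins=100, range=(0.8e6, 1.4e6))
bin_width = edges[1] - edges[0]

# Detector resolution: 20 keV sigma (assumption), converted to bin units.
sigma_bins = 20_000 / bin_width
half_width = int(np.ceil(4 * sigma_bins))
offsets = np.arange(-half_width, half_width + 1)
kernel = np.exp(-0.5 * (offsets / sigma_bins) ** 2)
kernel /= kernel.sum()  # normalise so the total number of counts is preserved

# mode="same" keeps the output aligned with the original bin edges.
smeared = np.convolve(counts, kernel, mode="same")
```

Because the output has the same length as counts, it can be plotted against the original edges, so the energy axis stays where it was.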
As you may be aware, convolving two histograms that have the same bin size gives the histogram of the result of adding each element of one sample to each element of the other sample.
I cannot see exactly what you are doing. One important thing that you seem to not be doing is to make sure that the bins of the histograms have the same width, and you have to take care of the position of the edges of the second bin.
In code we have
def hist_of_addition(A, B, bins=10, plot=False):
    A_heights, A_edges = np.histogram(A, bins=bins)
    # make sure the histogram is equally spaced
    assert(np.allclose(np.diff(A_edges), A_edges[1] - A_edges[0]))
    # make sure to use the same interval
    step = A_edges[1] - A_edges[0]

    # specify parameters to make sure the histogram of B will
    # have the same bin size as the histogram of A
    nBbin = int(np.ceil((np.max(B) - np.min(B))/step))
    left = np.min(B)
    B_heights, B_edges = np.histogram(B, range=(left, left + step * nBbin), bins=nBbin)
    # check that the bins for the second histogram matches the first
    assert(np.allclose(np.diff(B_edges), step))

    C_heights = np.convolve(A_heights, B_heights)
    C_edges = B_edges[0] + A_edges[0] + np.arange(0, len(C_heights) + 1) * step

    if plot:
        plt.figure(figsize=(12, 4))
        plt.subplot(131)
        plt.bar(A_edges[:-1], A_heights, step)
        plt.title('A')
        plt.subplot(132)
        plt.bar(B_edges[:-1], B_heights, step)
        plt.title('B')
        plt.subplot(133)
        plt.bar(C_edges[:-1], C_heights, step)
        plt.title('A+B')
    return C_edges, C_heights
Then
A = -np.cos(np.random.rand(10**6))
B = np.random.normal(1.5, 0.025, 10**5)
hist_of_addition(A, B, bins=100, plot=True);
Gives
I am trying to come up with a generalised way in Python to identify pitch rotations occurring during a set of planned spacecraft manoeuvres. You could think of it as a particular case of a shift detection problem.
Let's consider the solar_elevation_angle variable in my set of measurements, identifying the elevation angle of the sun measured from the spacecraft's instrument. For those who might want to play with the data, I saved the solar_elevation_angle.txt file here.
import numpy as np
import matplotlib.pyplot as plt
from matplotlib import gridspec
from scipy.signal import argrelmax
from scipy.ndimage import gaussian_filter1d  # the scipy.ndimage.filters namespace is deprecated
solar_elevation_angle = np.loadtxt("solar_elevation_angle.txt", dtype=np.float32)
fig, ax = plt.subplots()
ax.set_title('Solar elevation angle')
ax.set_xlabel('Scanline')
ax.set_ylabel('Solar elevation angle [deg]')
ax.plot(solar_elevation_angle)
plt.show()
The scanline is my time dimension. The four points where the slope changes identify the spacecraft pitch rotations.
As you can see, the solar elevation angle evolution outside the spacecraft manoeuvres regions is pretty much linear as a function of time, and that should always be the case for this particular spacecraft (except for major failures).
Note that during each spacecraft manoeuvre, the slope change is obviously continuous, although discretised in my set of angle values. That means: for each manoeuvre, it does not really make sense to try to locate a single scanline where a manoeuvre has taken place. My goal is rather to identify, for each manoeuvre, a "representative" scanline in the range of scanlines defining the interval of time where the manoeuvre occurred (e.g. middle value, or left boundary).
Once I get a set of "representative" scanline indexes where all manoeuvres have taken place, I could then use those indexes for rough estimations of manoeuvres durations, or to automatically place labels on the plot.
My solution so far has been to:
1. Compute the 2nd derivative of the solar elevation angle using np.gradient.
2. Compute absolute value and clipping of resulting curve. The clipping is necessary because of what I assume to be discretisation noise in the linear segments, which would then severely affect the identification of the "real" local maxima in point 4.
3. Apply smoothing to the resulting curve, to get rid of multiple peaks. I'm using scipy's 1d gaussian filter with a trial-and-error sigma value for that.
4. Identify local maxima.
Here's my code:
fig = plt.figure(figsize=(8,12))
gs = gridspec.GridSpec(5, 1)
ax0 = plt.subplot(gs[0])
ax0.set_title('Solar elevation angle')
ax0.plot(solar_elevation_angle)
solar_elevation_angle_1stdev = np.gradient(solar_elevation_angle)
ax1 = plt.subplot(gs[1])
ax1.set_title('1st derivative')
ax1.plot(solar_elevation_angle_1stdev)
solar_elevation_angle_2nddev = np.gradient(solar_elevation_angle_1stdev)
ax2 = plt.subplot(gs[2])
ax2.set_title('2nd derivative')
ax2.plot(solar_elevation_angle_2nddev)
solar_elevation_angle_2nddev_clipped = np.clip(np.abs(np.gradient(solar_elevation_angle_2nddev)), 0.0001, 2)
ax3 = plt.subplot(gs[3])
ax3.set_title('absolute value + clipping')
ax3.plot(solar_elevation_angle_2nddev_clipped)
smoothed_signal = gaussian_filter1d(solar_elevation_angle_2nddev_clipped, 20)
ax4 = plt.subplot(gs[4])
ax4.set_title('Smoothing applied')
ax4.plot(smoothed_signal)
plt.tight_layout()
plt.show()
I can then easily identify the local maxima by using scipy's argrelmax function:
max_idx = argrelmax(smoothed_signal)[0]
print(max_idx)
# [ 689 1019 2356 2685]
Which correctly identifies the scanline indexes I was looking for:
fig, ax = plt.subplots()
ax.set_title('Solar elevation angle')
ax.set_xlabel('Scanline')
ax.set_ylabel('Solar elevation angle [deg]')
ax.plot(solar_elevation_angle)
ax.scatter(max_idx, solar_elevation_angle[max_idx], marker='x', color='red')
plt.show()
My question is: Is there a better way to approach this problem?
I find that having to manually specify the clipping thresholds (to get rid of the noise) and the sigma of the gaussian filter weakens this approach considerably, preventing it from being applied to other similar cases.
First improvement would be to use a Savitzky-Golay filter to find the derivative in a less noisy way. For example, it can fit a parabola (in the sense of least squares) to each data slice of certain size, and then take the second derivative of that parabola. The result is much nicer than just taking 2nd order difference with gradient. Here it is with window size 101:
savgol_filter(solar_elevation_angle, window_length=window, polyorder=2, deriv=2)
Second, instead of looking for points of maximum with argrelmax it is better to look for places where the second derivative is large; for example, at least half its maximal size. This will of course return many indexes, but we can then look at the gaps between those indexes to identify where each peak begins and ends. The midpoint of the peak is then easily found.
Here is the complete code. The only parameter is window size, which is set to 101. The approach is robust; the size 21 or 201 gives essentially the same outcome (it must be odd).
from scipy.signal import savgol_filter
window = 101
der2 = savgol_filter(solar_elevation_angle, window_length=window, polyorder=2, deriv=2)
max_der2 = np.max(np.abs(der2))
large = np.where(np.abs(der2) > max_der2/2)[0]
gaps = np.diff(large) > window
begins = np.insert(large[1:][gaps], 0, large[0])
ends = np.append(large[:-1][gaps], large[-1])
changes = ((begins+ends)/2).astype(int)  # np.int was removed from recent NumPy releases
plt.plot(solar_elevation_angle)
plt.plot(changes, solar_elevation_angle[changes], 'ro')
plt.show()
The fuss with insert and append is because the first index with large derivative should qualify as "peak begins" and the last such index should qualify as "peak ends", even though they don't have a suitable gap next to them (the gap is infinite).
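Since the data file may not be at hand, here is a self-contained sketch of the same idea on synthetic data. The signal (three linear segments with kinks near samples 1000 and 2000 plus a little noise) and the function name find_slope_changes are assumptions made for this illustration; the detection logic follows the code above.

```python
import numpy as np
from scipy.signal import savgol_filter

def find_slope_changes(signal, window=101):
    """Return one representative index per region of large second derivative."""
    der2 = savgol_filter(signal, window_length=window, polyorder=2, deriv=2)
    large = np.where(np.abs(der2) > np.max(np.abs(der2)) / 2)[0]
    # A jump of more than one window between consecutive indexes separates peaks.
    gaps = np.diff(large) > window
    begins = np.insert(large[1:][gaps], 0, large[0])
    ends = np.append(large[:-1][gaps], large[-1])
    return ((begins + ends) / 2).astype(int)

# Synthetic stand-in for the solar elevation angle (assumption, not the real data).
rng = np.random.default_rng(0)
n = 3000
index = np.arange(n)
slopes = np.where(index < 1000, 0.01, np.where(index < 2000, -0.02, 0.01))
signal = np.cumsum(slopes) + rng.normal(0.0, 0.001, n)

changes = find_slope_changes(signal)
print(changes)
```

On this toy signal the two kinks have the same magnitude, so both pass the half-maximum threshold. With manoeuvres of very different strength the threshold fraction would probably need adjusting.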
Piecewise linear fit
This is an alternative (not necessarily better) approach, which does not use derivatives: fit a smoothing spline of degree 1 (i.e., a piecewise linear curve), and notice where its knots are.
First, normalize the data (which I call y instead of solar_elevation_angle) to have standard deviation 1.
y /= np.std(y)
The first step is to build a piecewise linear curve that deviates from the data by at most the given threshold, arbitrarily set to 0.1 (no units here because y was normalized). This is done by calling UnivariateSpline repeatedly, starting with a large smoothing parameter and gradually reducing it until the curve fits. (Unfortunately, one can't simply pass in the desired uniform error bound).
from scipy.interpolate import UnivariateSpline
threshold = 0.1

m = y.size
x = np.arange(m)
s = m
max_error = 1
while max_error > threshold:
    spl = UnivariateSpline(x, y, k=1, s=s)
    interp_y = spl(x)
    max_error = np.max(np.abs(interp_y - y))
    s /= 2
knots = spl.get_knots()
values = spl(knots)
So far we found the knots, and noted the values of the spline at those knots. But not all of these knots are really important. To test the importance of each knot, I remove it and interpolate without it. If the new interpolant is substantially different from the old (doubling the error), the knot is considered important and is added to the list of found slope changes.
ts = knots.size
idx = np.arange(ts)
changes = []
for j in range(1, ts-1):
    spl = UnivariateSpline(knots[idx != j], values[idx != j], k=1, s=0)
    if np.max(np.abs(spl(x) - interp_y)) > 2*threshold:
        changes.append(knots[j])
plt.plot(y)
plt.plot(changes, y[np.array(changes, dtype=int)], 'ro')
plt.show()
Ideally, one would fit piecewise linear functions to given data, increasing the number of knots until adding one more does not bring "substantial" improvement. The above is a crude approximation of that with SciPy tools, but far from best possible. I don't know of any off-the-shelf piecewise linear model selection tool in Python.
I have the following code to generate a streamplot based on an interp1d-Interpolation of discrete data:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.colors as colors
from scipy.interpolate import interp1d
# CSV Import
a1array=pd.read_csv('a1.csv', sep=',',header=None).values
rv=a1array[:,0]
a1v=a1array[:,1]
da1vM=a1array[:,2]
a1 = interp1d(rv, a1v)
da1M = interp1d(rv, da1vM)
# Bx and By vector components
def bx(x, y):
    rad = np.sqrt(x**2+y**2)
    if rad == 0:
        return 0
    else:
        return x*y/rad**4*(-2*a1(rad)+rad*da1M(rad))/2.87445E-19*1E-12

def by(x, y):
    rad = np.sqrt(x**2+y**2)
    if rad == 0:
        return 4.02995937E-04/2.87445E-19*1E-12
    else:
        return -1/rad**4*(2*a1(rad)*y**2+rad*da1M(rad)*x**2)/2.87445E-19*1E-12

# np.float was removed from recent NumPy releases; the builtin float is equivalent
Bx = np.vectorize(bx, otypes=[float])
By = np.vectorize(by, otypes=[float])
# Grid
num_steps = 11
Y, X = np.mgrid[-25:25:(num_steps * 1j), 0:25:(num_steps * 1j)]
Vx = Bx(X, Y)
Vy = By(X, Y)
speed = np.sqrt(Bx(X, Y)**2+By(X, Y)**2)
lw = 2*speed / speed.max()+.5
# Star Radius
circle3 = plt.Circle((0, 0), 16.3473140, color='black', fill=False)
# Plot
fig0, ax0 = plt.subplots(num=None, figsize=(11,9), dpi=80, facecolor='w', edgecolor='k')
strm = ax0.streamplot(X, Y, Vx, Vy, color=speed, linewidth=lw,density=[1,2], cmap=plt.cm.jet)
ax0.streamplot(-X, Y, -Vx, Vy, color=speed, linewidth=lw,density=[1,2], cmap=plt.cm.jet)
ax0.add_artist(circle3)
cbar=fig0.colorbar(strm.lines,fraction=0.046, pad=0.04)
cbar.set_label('B[GT]', rotation=270, labelpad=8)
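# Colorbar.set_clim() and draw_all() are gone from recent matplotlib;
# setting the limits on the mappable updates the colorbar as well.
strm.lines.set_clim(0, 1500)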
ax0.set_ylim([-25,25])
ax0.set_xlim([-25,25])
ax0.set_xlabel('x [km]')
ax0.set_ylabel('z [km]')
ax0.set_aspect(1)
plt.title('polyEos(0.05,2), M/R=0.2, B_r(0,0)=1402GT', y=1.01)
plt.savefig('MR02Br1402.pdf',bbox_inches=0)
plt.show()
I uploaded the csv-file here if you want to try some stuff https://www.dropbox.com/s/4t7jixpglt0mkl5/a1.csv?dl=0.
Which generates the following plot:
I am actually pretty happy with the result except for one small detail, which I cannot figure out: if one looks closely, the linewidth and the color change in rather big steps, which is especially visible at the center:
Is there some way/option with which I can decrease the size of these steps, especially to make the colormap smoother?
I had another look at this and it wasn't as painful as I thought it might be.
Add:
subdiv = 15
points = np.arange(len(t[0]))
interp_points = np.linspace(0, len(t[0]), subdiv * len(t[0]))
tgx = np.interp(interp_points, points, tgx)
tgy = np.interp(interp_points, points, tgy)
tx = np.interp(interp_points, points, tx)
ty = np.interp(interp_points, points, ty)
after ty is initialised in the trajectories loop (line 164 in my version). Just substitute whatever number of subdivisions you want for subdiv = 15. All the segments in the streamplot will be subdivided into as many equally sized segments as you choose. The colors and linewidths for each will still be properly obtained from interpolating the data.
It's not as neat as changing the integration step, but it does plot exactly the same trajectories.
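The subdivision itself is plain linear interpolation along the point index, and can be tried outside of streamplot.py. This is a minimal sketch with a made-up helper name (subdivide) and made-up coordinates. Unlike the snippet above, it stops the fine index at the last point (len - 1) so that the end point is not repeated.

```python
import numpy as np

def subdivide(values, subdiv):
    """Split every segment of a polyline coordinate array into `subdiv` equal parts."""
    values = np.asarray(values, dtype=float)
    n = values.size
    points = np.arange(n)
    # Fine index running from the first to the last original point, inclusive.
    fine = np.linspace(0, n - 1, subdiv * (n - 1) + 1)
    return np.interp(fine, points, values)

# Hypothetical x coordinates of a short trajectory.
tx = np.array([0.0, 1.0, 3.0])
fine_tx = subdivide(tx, 4)
print(fine_tx)
```

Every subdiv-th value of the result is one of the original points, so the trajectory itself is unchanged and only gains extra vertices at which color and linewidth can be evaluated.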
If you don't mind changing the streamplot code (matplotlib/streamplot.py), you could simply decrease the size of the integration steps. Inside _integrate_rk12() the maximum step size is defined as:
maxds = min(1. / dmap.mask.nx, 1. / dmap.mask.ny, 0.1)
If you decrease that, let's say:
maxds = 0.1 * min(1. / dmap.mask.nx, 1. / dmap.mask.ny, 0.1)
I get this result (left = new, right = original):
Of course, this makes the code about 10x slower, and I haven't thoroughly tested it, but it seems to work (as a quick hack) for this example.
About the density (mentioned in the comments): I personally don't see the problem with that. It's not like we are trying to visualize the actual path line of (e.g.) a particle; the density is already some arbitrary (controllable) choice, and yes, it is influenced by choices in the integration, but I don't think that it changes the (not quite sure what to call this) required visualization we're after.
The results (density) do seem to converge a bit for decreasing step sizes, this shows the results for decreasing the integration step with a factor {1,5,10,20}:
You could increase the density parameter to get more smooth color transitions,
but then use the start_points parameter to reduce your overall clutter.
The start_points parameter allows you to explicitly choose the location and
number of trajectories to draw. It overrides the default, which is to plot
as many as possible to fill up the entire plot.
But first you need one little fix to your existing code:
According to the streamplot documentation, the X and Y args should be 1d arrays, not 2d arrays as produced by mgrid.
It looks like passing in 2d arrays is supported, but it is undocumented
and it is currently not compatible with the start_points parameter.
Here is how I revised your X, Y, Vx, Vy and speed:
# Grid
num_steps = 11
Y = np.linspace(-25, 25, num_steps)
X = np.linspace(0, 25, num_steps)
Ygrid, Xgrid = np.mgrid[-25:25:(num_steps * 1j), 0:25:(num_steps * 1j)]
Vx = Bx(Xgrid, Ygrid)
Vy = By(Xgrid, Ygrid)
speed = np.hypot(Vx, Vy)
lw = 3*speed / speed.max()+.5
Now you can explicitly set your start_points parameter. The start points are actually
"seed" points. Any given stream trajectory will grow in both directions
from the seed point. So if you put a seed point right in the center of
the example plot, it will grow both up and down to produce a vertical
stream line.
Besides controlling the number of trajectories, using the
start_points parameter also controls the order they are
drawn. This is important when considering how trajectories terminate.
They will either hit the border of the plot, or they will terminate if
they hit a cell of the plot that already has a trajectory. That means
your first seeds will tend to grow longer and your later seeds will tend
to get limited by previous ones. Some of the later seeds may not grow
at all. The default seeding strategy is to plant a seed at every cell,
which is pretty obnoxious if you have a high density. It also orders
them by planting seeds first along the plot borders and spiraling inward.
This may not be ideal for your particular case. I found a very simple
strategy for your example was to just plant a few seeds between those
two points of zero velocity, y=0 and x from -10 to 10. Those trajectories
grow to their fullest and fill in most of the plot without clutter.
Here is how I create the seed points and set the density:
num_streams = 8
stptsy = np.zeros((num_streams,), float)  # np.float was removed from recent NumPy releases
stptsx_left = np.linspace(0, -10.0, num_streams)
stptsx_right = np.linspace(0, 10.0, num_streams)
stpts_left = np.column_stack((stptsx_left, stptsy))
stpts_right = np.column_stack((stptsx_right, stptsy))
density = (3,6)
And here is how I modify the calls to streamplot:
strm = ax0.streamplot(X, Y, Vx, Vy, color=speed, linewidth=lw, density=density,
cmap=plt.cm.jet, start_points=stpts_right)
ax0.streamplot(-X, Y, -Vx, Vy, color=speed, linewidth=lw,density=density,
cmap=plt.cm.jet, start_points=stpts_left)
The result basically looks like the original, but with smoother color transitions and only 15 stream lines. (sorry no reputation to inline the image)
I think your best bet is to use a colormap other than jet. Perhaps cmap=plt.cm.plasma.
Weird looking graphs obscure understanding of the data.
For data which is ordered in some way, like by the speed vector magnitude in this case, uniform sequential colormaps will always look smoother. The brightness of sequential maps varies monotonically over the color range, removing large perceived color changes over small ranges of data. The uniform maps vary linearly over their whole range, which makes the main features in the data much more visually apparent.
(source: matplotlib.org)
The jet colormap spans a very wide variety of brightnesses over its range, with an inflexion in the middle. This is responsible for the particularly egregious red to blue transition around the center region of your graph.
(source: matplotlib.org)
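The difference in brightness behaviour can be checked numerically rather than by eye. The sketch below is an approximation under assumptions: it estimates CIELAB lightness from the sRGB values of each colormap (matplotlib designed its uniform maps in a different, more elaborate colour space), and the helper names lightness and count_drops are made up for this example.

```python
import numpy as np
from matplotlib import cm

def lightness(cmap, n=256):
    """Approximate CIELAB lightness L* along a colormap, from its sRGB values."""
    rgb = cmap(np.linspace(0.0, 1.0, n))[:, :3]
    # Undo the sRGB transfer curve, then take the relative luminance Y.
    linear = np.where(rgb <= 0.04045, rgb / 12.92, ((rgb + 0.055) / 1.055) ** 2.4)
    y = linear @ np.array([0.2126, 0.7152, 0.0722])
    return np.where(y > 0.008856, 116.0 * np.cbrt(y) - 16.0, 903.3 * y)

def count_drops(values):
    """Number of steps along the map where the lightness decreases."""
    return int(np.sum(np.diff(values) < 0))

jet_drops = count_drops(lightness(cm.jet))
plasma_drops = count_drops(lightness(cm.plasma))
print(jet_drops, plasma_drops)
```

A map whose lightness keeps increasing has few or no drops, while jet brightens towards the middle and then darkens again towards red, so neighbouring data values can end up with very different perceived brightness.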
The matplotlib user guide on choosing a color map has a few recommendations for selecting an appropriate map for a given data set.
I don't think there is much else you can do to improve this by just changing parameters in your plot.
The streamplot divides the graph into cells with 30*density[x,y] in each direction, at most one streamline goes through each cell. The only setting which directly increases the number of segments is the density of the grid matplotlib uses. Increasing the Y density will decrease the segment length so that the middle region may transition more smoothly. The cost of this is an inevitable cluttering of the graph in regions where the streamlines are horizontal.
You could also try to normalise the speeds differently so that the change is artificially lowered near the center. At the end of the day, though, that seems to defeat the point of the graph. The graph should provide a useful view of the data for a human to understand. Using a colormap with strange inflexions, or warping the data so that it looks nicer, removes some understanding which could otherwise be obtained from looking at the graph.
A more detailed discussion about the issues with colormaps like jet can be found on this blog.
I need to create a spline or polyline representation of a vascular tree model (see below).
The model is in a STL format, thus I have the x-y-z coordinates of all vertices. The lines should run through the center of the vessel mesh thus I thought that the best approach would be a spline regression through the vertex cloud. In addition it would be great if I can have the radius of the vessel at given points, e.g. the coordinates of the polyline.
I looked through this forum and the VTK website (assuming they have a straightforward implementation for this sort of thing) but so far I haven't found something I can use. Does anyone know of a Python module or VTK class (which I would call from Python) that can do this? The python modules I found on this are all for 2D data.
Thanks a lot!
EDIT:
I came across this library called VMTK that deals almost exclusively with vessel segmentation and has functionality for what they call 'centerline calculation'. However, they usually require the vessels to be 'cut' at their ends and 'source points' to be defined. In the case of my model, however, one can see that the end points are 'capped' which makes matters more complicated. If I find a solution I'll post here
I don't know any software or python classes exactly on your problem.
Maybe scipy's interpolate.splprep and interpolate.splev will help you with a single vessel.
You may try the following code as an example:
from scipy import interpolate
from mpl_toolkits.mplot3d import Axes3D
import matplotlib.pyplot as plt
import numpy as np
# 3D example
total_rad = 10
z_factor = 3
noise = 0.1
num_true_pts = 200
s_true = np.linspace(0, total_rad, num_true_pts)
x_true = np.cos(s_true)
y_true = np.sin(s_true)
z_true = s_true/z_factor
num_sample_pts = 100
s_sample = np.linspace(0, total_rad, num_sample_pts)
x_sample = np.cos(s_sample) + noise * np.random.randn(num_sample_pts)
y_sample = np.sin(s_sample) + noise * np.random.randn(num_sample_pts)
z_sample = s_sample/z_factor + noise * np.random.randn(num_sample_pts)
tck, u = interpolate.splprep([x_sample,y_sample,z_sample], s=2)
x_knots, y_knots, z_knots = interpolate.splev(tck[0], tck)
u_fine = np.linspace(0,1,num_true_pts)
x_fine, y_fine, z_fine = interpolate.splev(u_fine, tck)
fig2 = plt.figure(2)
ax3d = fig2.add_subplot(111, projection='3d')
# blue line shows true helix
ax3d.plot(x_true, y_true, z_true, 'b')
# red stars show distorted sample around a blue line
ax3d.plot(x_sample, y_sample, z_sample, 'r*')
# green line and dots show fitted curve
ax3d.plot(x_knots, y_knots, z_knots, 'go')
ax3d.plot(x_fine, y_fine, z_fine, 'g')
plt.show()
This code takes a noisy centerline path of a single vessel and fits it with a smooth curve (see the result below):
interpolation result
Usually, two user seeds are used to mark centerline ends, in the case of centerline representation as in VMTK.
The other way to get centerlines automatically is to voxelize your stl mesh, costruct a voxel skeleton, and separate skeletal segment to represent each vessel. Then you can interpolate each centerline to get the smooth curves. Unprocessed skeletal segments usualy have zigzags.
I'm trying to get python to return, as close as possible, the center of the most obvious clustering in an image like the one below:
In my previous question I asked how to get the global maximum and the local maxima of a 2D array, and the answers given worked perfectly. The issue is that the center estimate I get by averaging the global maximum obtained with different bin sizes is always slightly off from the one I would set by eye, because I'm only accounting for the biggest bin instead of a group of the biggest bins (as one does by eye).
I tried adapting the answer to this question to my problem, but it turns out my image is too noisy for that algorithm to work. Here's my code implementing that answer:
import numpy as np
from scipy.ndimage import maximum_filter, generate_binary_structure, binary_erosion
import matplotlib.pyplot as pp
from os import getcwd
from os.path import join, realpath, dirname
# Save path to dir where this code exists.
mypath = realpath(join(getcwd(), dirname(__file__)))
myfile = 'data_file.dat'
x, y = np.loadtxt(join(mypath,myfile), usecols=(1, 2), unpack=True)
xmin, xmax = min(x), max(x)
ymin, ymax = min(y), max(y)
rang = [[xmin, xmax], [ymin, ymax]]
paws = []
for d_b in range(25, 110, 25):
    # Number of bins in x,y given the bin width 'd_b'
    binsxy = [int((xmax - xmin) / d_b), int((ymax - ymin) / d_b)]
    H, xedges, yedges = np.histogram2d(x, y, range=rang, bins=binsxy)
    paws.append(H)
def detect_peaks(image):
    """
    Takes an image and detects the peaks using the local maximum filter.
    Returns a boolean mask of the peaks (i.e. 1 when
    the pixel's value is the neighborhood maximum, 0 otherwise)
    """
    # define an 8-connected neighborhood
    neighborhood = generate_binary_structure(2, 2)
    # apply the local maximum filter; all pixels of maximal value
    # in their neighborhood are set to 1
    local_max = maximum_filter(image, footprint=neighborhood) == image
    # local_max is a mask that contains the peaks we are
    # looking for, but also the background.
    # In order to isolate the peaks we must remove the background from the mask.
    # we create the mask of the background
    background = (image == 0)
    # a little technicality: we must erode the background in order to
    # successfully subtract it from local_max, otherwise a line will
    # appear along the background border (artifact of the local maximum filter)
    eroded_background = binary_erosion(background, structure=neighborhood, border_value=1)
    # we obtain the final mask, containing only peaks,
    # by removing the background from the local_max mask
    # (XOR, because NumPy does not support '-' on boolean arrays)
    detected_peaks = local_max ^ eroded_background
    return detected_peaks
#applying the detection and plotting results
for i, paw in enumerate(paws):
    detected_peaks = detect_peaks(paw)
    pp.subplot(4, 2, (2 * i + 1))
    pp.imshow(paw)
    pp.subplot(4, 2, (2 * i + 2))
    pp.imshow(detected_peaks)
pp.show()
and here's the result of that (varying the bin size):
Clearly my background is too noisy for that algorithm to work, so the question is: how can I make that algorithm less sensitive? If an alternative solution exists then please let me know.
EDIT
Following Bi Rico's advice I tried smoothing my 2D array before passing it on to the local maximum finder, like so:
from scipy.ndimage import gaussian_filter
H, xedges, yedges = np.histogram2d(x, y, range=rang, bins=binsxy)
H1 = gaussian_filter(H, 2, mode='nearest')
paws.append(H1)
These were the results with a sigma of 2, 4 and 8:
EDIT 2
mode='constant' seems to work much better than 'nearest'. With sigma=2 it converges to the right center for the largest bin size:
So, how do I get the coordinates of the maximum that shows in the last image?
To answer the last part of your question: whenever you have points in an image, you can find their coordinates by searching, in some order, for the local maxima of the image. If your data is not a point source, you can mask out each peak once it is found, so that its neighborhood is not picked up as a maximum in later searches. I propose the following code:
import matplotlib.image as mpimg
import matplotlib.pyplot as plt
import numpy as np
import copy
def get_std(image):
    return np.std(image)
def get_max(image, sigma, alpha=20, size=10):
    i_out = []
    j_out = []
    image_temp = copy.deepcopy(image)
    while True:
        # position of the current global maximum
        k = np.argmax(image_temp)
        j, i = np.unravel_index(k, image_temp.shape)
        if image_temp[j, i] >= alpha * sigma:
            i_out.append(i)
            j_out.append(j)
            # zero out a square around the peak so it is not found again
            x = np.arange(i - size, i + size)
            y = np.arange(j - size, j + size)
            xv, yv = np.meshgrid(x, y)
            image_temp[yv.clip(0, image_temp.shape[0] - 1),
                       xv.clip(0, image_temp.shape[1] - 1)] = 0
        else:
            break
    return i_out, j_out
#reading the image
image = mpimg.imread('ggd4.jpg')
#computing the standard deviation of the image
sigma = get_std(image)
#getting the peaks
i,j = get_max(image[:,:,0],sigma, alpha=10, size=10)
#let's see the results
plt.imshow(image, origin='lower')
plt.plot(i,j,'ro', markersize=10, alpha=0.5)
plt.show()
The image ggd4 for the test can be downloaded from:
http://www.ipac.caltech.edu/2mass/gallery/spr99/ggd4.jpg
The first step is to get some information about the noise in the image. I did it by computing the standard deviation of the full image (it is actually better to select a small rectangle without signal). This tells us how much noise is present in the image.
The idea for getting the peaks is to ask for successive maxima that are above a certain threshold (say 3, 4, 5, 10, or 20 times the noise). This is what the function get_max does: it keeps searching for maxima until one of them falls below the threshold imposed by the noise. To avoid finding the same maximum many times, each peak has to be removed from the image once it is found. In general, the shape of the mask used for this depends strongly on the problem one wants to solve. For stars, it would be good to remove the star using a Gaussian function or something similar. For simplicity I have chosen a square mask, and the variable "size" sets its half-width in pixels.
I think anybody can improve on this example by making the code more general.
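The remark above about measuring the noise in a small rectangle without signal can be illustrated with a minimal sketch. The helper name estimate_noise, the patch location and the synthetic image are assumptions made up for this illustration; they are not part of the code above.

```python
import numpy as np


def estimate_noise(image, rows=slice(0, 40), cols=slice(0, 40)):
    """Standard deviation of a patch that is assumed to contain no signal."""
    return float(np.std(image[rows, cols]))


# Synthetic test image (illustrative only): Gaussian noise with sigma = 2
# plus one bright, star-like blob far away from the top-left corner.
rng = np.random.default_rng(1)
image = rng.normal(loc=0.0, scale=2.0, size=(200, 200))
yy, xx = np.mgrid[0:200, 0:200]
image += 100.0 * np.exp(-((xx - 120) ** 2 + (yy - 90) ** 2) / (2 * 5.0 ** 2))

sigma_full = float(np.std(image))    # inflated by the bright blob
sigma_patch = estimate_noise(image)  # close to the true noise level
print(f"full image: {sigma_full:.2f}, empty patch: {sigma_patch:.2f}")
```

Because the full-image value is inflated by the signal itself, a threshold of alpha times that value is stricter than intended. Passing the patch estimate as sigma to get_max should give a threshold that reflects the background only.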
EDIT:
The original image looks like:
While the image after identifying the luminous points looks like this:
Too much of a n00b on Stack Overflow to comment on Alejandro's answer elsewhere here. I would refine his code a bit to use a preallocated numpy array for output:
def get_max(image, sigma, alpha=3, size=10):
    from copy import deepcopy
    import numpy as np
    # preallocate a lot of peak storage
    k_arr = np.zeros((10000, 2))
    image_temp = deepcopy(image)
    peak_ct = 0
    while True:
        k = np.argmax(image_temp)
        j, i = np.unravel_index(k, image_temp.shape)
        if image_temp[j, i] >= alpha * sigma:
            k_arr[peak_ct] = [j, i]
            # this is the part that masks already-found peaks.
            x = np.arange(i - size, i + size)
            y = np.arange(j - size, j + size)
            xv, yv = np.meshgrid(x, y)
            # the clip here handles edge cases where the peak is near the
            # image edge
            image_temp[yv.clip(0, image_temp.shape[0] - 1),
                       xv.clip(0, image_temp.shape[1] - 1)] = 0
            peak_ct += 1
        else:
            break
    # trim the output for only what we've actually found
    return k_arr[:peak_ct]
Profiling this and Alejandro's code on his example image, this code is about 33% faster (0.03 sec for Alejandro's code, 0.02 sec for mine). I expect it would be even faster on images with larger numbers of peaks, since appending the output to a list will get slower and slower as more peaks are found.
I think the first step needed here is to express the values in H in terms of the standard deviation of the field:
import numpy as np
H = H / np.std(H)
Now you can put a threshold on the values of this H. If the noise is assumed to be Gaussian, then with a threshold of 3 you can be quite sure (99.7%) that a pixel above it is associated with a real peak and not noise. See here.
Now the further selection can start. It is not exactly clear to me what exactly you want to find. Do you want the exact location of peak values? Or do you want one location for a cluster of peaks which is in the middle of this cluster?
Anyway, starting from this point with all pixel values expressed in standard deviations of the field, you should be able to get what you want. If you want to find clusters you could perform a nearest neighbour search on the >3-sigma gridpoints and put a threshold on the distance. I.e. only connect them when they are close enough to each other. If several gridpoints are connected you can define this as a group/cluster and calculate some (sigma-weighted?) center of the cluster.
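As a rough sketch of the grouping step described above: the code below swaps the nearest-neighbour search for connected-component labelling (scipy.ndimage.label), which treats bins as "close enough" when they touch, diagonals included. That substitution, the function name cluster_centers and the synthetic data are assumptions for illustration only.

```python
import numpy as np
from scipy import ndimage


def cluster_centers(H, n_sigma=3.0):
    """Group bins above n_sigma and return (centers, totals).

    centers are fractional bin indices, strongest group first.
    """
    Hs = H / np.std(H)                   # H in units of its standard deviation
    mask = Hs > n_sigma                  # keep only the significant bins
    structure = ndimage.generate_binary_structure(2, 2)  # 8-connectivity
    labels, n = ndimage.label(mask, structure=structure)
    if n == 0:
        return np.empty((0, 2)), np.empty(0)
    index = np.arange(1, n + 1)
    # Sigma-weighted center of each connected group of bins.
    centers = np.array(ndimage.center_of_mass(Hs, labels, index))
    # Summed significance per group (label 0 is the background).
    totals = np.bincount(labels.ravel(), weights=Hs.ravel(), minlength=n + 1)[1:]
    order = np.argsort(totals)[::-1]
    return centers[order], totals[order]


# Synthetic data: one cluster at (300, 700) on a uniform background.
rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(300, 30, 2000), rng.uniform(0, 1000, 3000)])
y = np.concatenate([rng.normal(700, 30, 2000), rng.uniform(0, 1000, 3000)])
H, xedges, yedges = np.histogram2d(x, y, bins=40, range=[[0, 1000], [0, 1000]])

centers, totals = cluster_centers(H)
# Convert the strongest group's fractional bin index to data coordinates.
ix, iy = centers[0]
x_center = xedges[0] + (ix + 0.5) * (xedges[1] - xedges[0])
y_center = yedges[0] + (iy + 0.5) * (yedges[1] - yedges[0])
print(x_center, y_center)
```

A true distance threshold (for example with a k-d tree) would also merge groups that are near each other but not touching; labelling is just the shortest way to get the same kind of grouping on a regular grid.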
Hope my first contribution on Stackoverflow is useful for you!
The way I would do it:
1) normalize H between 0 and 1.
2) pick a threshold value, as tcaswell suggests. It could be between .9 and .99 for example
3) use masked arrays to keep only the x,y coordinates with H above threshold:
import numpy.ma as ma
x_masked=ma.masked_array(x, mask= H < threshold)
y_masked=ma.masked_array(y, mask= H < threshold)
4) now you can compute a weighted average of the masked coordinates, with a weight like (H-threshold)^2, or any other power greater than or equal to one, depending on your taste/tests.
Comment:
1) This is not robust with respect to the type of peaks you have, since you may have to adapt the threshold. This is the minor problem;
2) This DOES NOT work with two peaks as it is, and will give wrong results if the 2nd peak is above the threshold.
Nonetheless, it will always give you an answer without crashing (with the pros and cons that implies).
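The four steps above could be sketched as follows. This version works on the bin centers of the 2D histogram rather than on masked arrays, which is an assumption about the intent; the function name weighted_center, the threshold value and the synthetic data are also made up for illustration.

```python
import numpy as np


def weighted_center(H, xedges, yedges, threshold=0.8, power=2):
    """Weighted average of the bin centers whose normalized count >= threshold."""
    if H.max() == H.min():
        raise ValueError("H is constant; there is no peak to locate")
    # Step 1: normalize H between 0 and 1.
    Hn = (H - H.min()) / (H.max() - H.min())
    # Bin centers (np.histogram2d returns the bin edges).
    xc = 0.5 * (xedges[:-1] + xedges[1:])
    yc = 0.5 * (yedges[:-1] + yedges[1:])
    X, Y = np.meshgrid(xc, yc, indexing="ij")  # same shape as H
    # Steps 2 and 3: keep only the bins above the threshold.
    above = Hn >= threshold
    # Step 4: weights (H - threshold)^power, zero for the discarded bins.
    w = np.where(above, (Hn - threshold) ** power, 0.0)
    return float(np.sum(w * X) / np.sum(w)), float(np.sum(w * Y) / np.sum(w))


# Synthetic data: one cluster at (300, 700) on a uniform background.
rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(300, 30, 2000), rng.uniform(0, 1000, 3000)])
y = np.concatenate([rng.normal(700, 30, 2000), rng.uniform(0, 1000, 3000)])
H, xedges, yedges = np.histogram2d(x, y, bins=40, range=[[0, 1000], [0, 1000]])

x_center, y_center = weighted_center(H, xedges, yedges)
print(x_center, y_center)
```

As noted in the comments above, a second peak above the threshold would pull this average toward a point between the two peaks.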
I'm adding this answer because it's the solution I ended up using. It's a combination of Bi Rico's comment here (May 30 at 18:54) and the answer given in this question: Find peak of 2d histogram.
As it turns out, using the peak detection algorithm from the question "Peak detection in a 2D array" only complicates matters. After applying the Gaussian filter to the image, all that needs to be done is to ask for the maximum bin (as Bi Rico pointed out) and then convert that bin to x,y coordinates.
So instead of using the detect-peaks function as I did above, I simply add the following code after the Gaussian 2D histogram is obtained:
# Get 2D histogram.
H, xedges, yedges = np.histogram2d(x, y, range=rang, bins=binsxy)
# Get Gaussian filtered 2D histogram.
H1 = gaussian_filter(H, 2, mode='nearest')
# Get center of maximum in bin coordinates.
x_cent_bin, y_cent_bin = np.unravel_index(H1.argmax(), H1.shape)
# Get center in x,y coordinates.
x_cent_coord = np.average(xedges[x_cent_bin:x_cent_bin + 2])
y_cent_coord = np.average(yedges[y_cent_bin:y_cent_bin + 2])
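For reference, here is a self-contained sketch of the same procedure that can be run without the data file. The function name densest_bin_center and the synthetic data are assumptions for illustration; mode defaults to 'constant' only because the question's second edit reports that it worked better than 'nearest'.

```python
import numpy as np
from scipy.ndimage import gaussian_filter


def densest_bin_center(x, y, bin_width, sigma=2, mode="constant"):
    """Smooth the 2D histogram and return the x,y center of its maximum bin."""
    rang = [[x.min(), x.max()], [y.min(), y.max()]]
    binsxy = [int((x.max() - x.min()) / bin_width),
              int((y.max() - y.min()) / bin_width)]
    # 2D histogram of the points.
    H, xedges, yedges = np.histogram2d(x, y, range=rang, bins=binsxy)
    # Gaussian-filtered histogram (sigma is given in bins, not data units).
    H1 = gaussian_filter(H, sigma, mode=mode)
    # Maximum in bin coordinates.
    ix, iy = np.unravel_index(H1.argmax(), H1.shape)
    # Center of that bin in x,y coordinates (mean of its two edges).
    x_cent = 0.5 * (xedges[ix] + xedges[ix + 1])
    y_cent = 0.5 * (yedges[iy] + yedges[iy + 1])
    return float(x_cent), float(y_cent)


# Synthetic data: one cluster at (300, 700) on a uniform background.
rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(300, 30, 2000), rng.uniform(0, 1000, 3000)])
y = np.concatenate([rng.normal(700, 30, 2000), rng.uniform(0, 1000, 3000)])

x_cent, y_cent = densest_bin_center(x, y, bin_width=25)
print(x_cent, y_cent)
```

The result is quantized to the bin grid, so its precision is limited to roughly one bin width.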