I computed derivatives using different methods such as :
convolution with an array [[-1, 1]].
Using the fourier theorem by computing DFT of the image and the array mentioned above, multiplying them and performing IDFT.
Directly through the derivative formula (Computing Fourier, multiplying by index and a constant and computing the inverse).
All methods seem to work almost identically, but have slight differences.
An explanation why they end up with slightly different results would be appreciated.
After computing those I started playing with the result to learn about it, and I found out something that confused me:
The main thing that baffles me is that when I try computing the median of this derivative, it's ALWAYS 0.0.
Why is that?
I added the code I used to compute this (the first method at least) because maybe I'm doing something wrong.
from scipy.signal import convolve2d
im = sl.read_image(r'C:\Users\ahhal\Desktop\Essentials\Uni\year3\SemesterA\ImageProcessing\Exercises\Ex2\external\monkey.jpg', 1)
b = [[-1, 1]]
print(np.median(convolve2d(im, b)))
output: 0.0
The read_image function is my own and this is the implementation:
from imageio import imread
from skimage.color import rgb2gray
import numpy as np
def read_image(filename, representation):
    """
    Receives an image file and converts it into one of two given representations.
    :param filename: The file name of an image on disk (could be grayscale or RGB).
    :param representation: representation code, either 1 or 2, defining whether the output
    should be a grayscale image (1) or an RGB image (2). If the input image is grayscale,
    we won't call it with representation = 2.
    :return: An image, represented by a matrix of type (np.float64) with intensities
    normalized to the range [0,1].
    """
    assert representation in [1, 2]
    # reads the image
    im = imread(filename)
    if representation == 1:  # If the user specified they need a grayscale image,
        if len(im.shape) == 3:  # AND the image is not grayscale yet
            im = rgb2gray(im)  # convert to grayscale (assuming it's RGB and not a different format)
    im_float = im.astype(np.float64)  # Convert the image type to one we can work with.
    if im_float.max() > 1:  # If image values are out of bound, normalize them.
        im_float = im_float / 255
    return im_float
Edit 2:
I tried it on several different images, and got 0.0 for all of them.
The image I'm using in the example is:
I computed derivatives using different methods such as :
convolution with an array [[-1, 1]].
Using the fourier theorem by computing DFT of the image and the array mentioned above, multiplying them and performing IDFT.
Directly through the derivative formula (Computing Fourier, multiplying by index and a constant and computing the inverse).
These derivative methods are all approximate and make different assumptions:
Convolution by [[-1, 1]] computes differences between adjacent elements,
derivative ~= data[n+1] - data[n]
You can interpret this like interpolating the data with a line segment, then taking the derivative of that interpolant:
I(x) = data[n] + (data[n+1] - data[n]) * (x - n)
So the approximation assumes the underlying function is locally linear. You can analyze the error by Taylor expansion to find that the error comes from the ignored higher-order terms. In other words, the approximation is accurate provided the function doesn't have strong nonlinear terms. This is a simple case of finite differences.
This is the same as 1, except with different boundary handling to handle convolution of samples near the edges of the image. By default, scipy.signal.convolve2d does zero padding (though you can use the boundary option to choose some other methods). However when computing the convolution through the DFT, then implicitly the boundary handling is periodic, wrapping around at the image edges. So the results of 1 and 2 differ for a margin of pixels near the edge because of the different boundary handling.
Computing the derivative by multiplying by iω in the DFT representation can be interpreted as evaluating the derivative of the sinc interpolation of the data. Sinc interpolation assumes the data is band-limited. The error comes from spectral content beyond the Nyquist frequency. In particular, if there is a hard jump discontinuity at an object boundary, then the image is not band-limited and the DFT-based derivative will have substantial error in the vicinity of the jump, appearing as ringing artifacts.
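To make the three cases concrete, here is a minimal sketch on arbitrary random data (not the image from the question). The array sizes and variable names are my own choices for illustration. Note that a true convolution flips the kernel, so the interior value from [[-1, 1]] is data[n-1] - data[n].

```python
import numpy as np
from scipy.signal import convolve2d

rng = np.random.default_rng(0)  # arbitrary test data, not the image from the question
im = rng.random((6, 8))
n_cols = im.shape[1]
kernel = np.array([[-1.0, 1.0]])

# Method 1: spatial convolution with zero padding (the scipy default),
# cropped to the image width. Column n holds im[:, n-1] - im[:, n].
d_zero = convolve2d(im, kernel)[:, :n_cols]

# Method 2: convolution theorem. Zero-pad the kernel to the image size,
# multiply the DFTs and invert. This is a circular (periodic) convolution.
kernel_padded = np.zeros_like(im)
kernel_padded[0, :2] = kernel[0]
d_fft = np.fft.ifft2(np.fft.fft2(im) * np.fft.fft2(kernel_padded)).real

# Methods 1 and 2 agree everywhere except the first column, where
# zero padding and periodic wrap-around see different neighbours.
interior_match = np.allclose(d_zero[:, 1:], d_fft[:, 1:])
edge_match = np.allclose(d_zero[:, 0], d_fft[:, 0])

# Method 3: multiply by i*omega in the DFT domain. For a band-limited
# signal (one full sine period) this reproduces the analytic derivative,
# while the finite difference is only an approximation.
omega = 2 * np.pi * np.fft.fftfreq(n_cols)  # radians per sample
n = np.arange(n_cols)
wave = np.sin(2 * np.pi * n / n_cols)
d_spectral = np.fft.ifft(1j * omega * np.fft.fft(wave)).real
d_exact = (2 * np.pi / n_cols) * np.cos(2 * np.pi * n / n_cols)
d_finite = np.roll(wave, -1) - wave  # periodic forward difference
```

Here interior_match is True and edge_match is False, which is the boundary-handling difference between 1 and 2; d_spectral matches d_exact to rounding error while d_finite does not.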
The main thing that baffles me is that when I try computing the median of this derivative, it's ALWAYS 0.0.
I don't know why this happened here, but it shouldn't always be the case. For instance, if each image row is the unit ramp data[n] = n, then the convolution by [[-1, 1]] has magnitude 1 everywhere (the sign depends on whether the kernel is flipped, as it is in a true convolution), except possibly at the edges depending on boundary handling, so the median is nonzero.
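A quick check of the ramp example, as a sketch with an arbitrary 4x10 ramp image of my own choosing. Since scipy.signal.convolve2d performs a true convolution (it flips the kernel), the interior value comes out as -1 rather than +1; either way the median is not zero.

```python
import numpy as np
from scipy.signal import convolve2d

# Each row is the unit ramp data[n] = n (the size 4x10 is arbitrary).
ramp = np.tile(np.arange(10, dtype=float), (4, 1))

# Default settings: full output, zero padding at the borders.
deriv = convolve2d(ramp, [[-1, 1]])

# Each output row is [0, -1, -1, ..., -1, 9]: the two end values come
# from the zero padding, the interior is the constant slope (sign flipped).
median = np.median(deriv)
```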
Pascal already gave a wonderful explanation of the differences between the various approximations to the derivative. So I'll focus here on the "why always 0.0?" question.
The median of the derivative is 0.0 only by approximation. When I compute it, based on the finite difference approximation (method #1), I get -5.15e-5 as the median. Close to zero, but not exactly zero.
The derivative is 0 in uniform (flat) regions of the image such as the out-of-focus background. Other features in the image tend to have both a positive and a negative edge, making the histogram of the derivative image very symmetric:
This symmetry causes the median (as well as the mean) to be close to zero for such an image. However, this is not always the case. For example, if the image is brighter on the left edge than the right edge (or the other way around), then there must be a net gradient across the image, causing the mean or median to be different from zero.
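The net-gradient remark can be checked with a small sketch. The synthetic image below (a left-to-right brightness ramp plus noise) is an assumption for illustration only. Because the sum of adjacent differences telescopes, the mean of the finite-difference derivative is fixed entirely by the left and right edge columns.

```python
import numpy as np

rng = np.random.default_rng(1)
height, width = 32, 48

# Synthetic image: bright on the left, dark on the right, plus noise.
ramp = np.linspace(1.0, 0.0, width)
im = ramp + 0.05 * rng.standard_normal((height, width))

# Finite difference along rows: im[:, n+1] - im[:, n].
deriv = np.diff(im, axis=1)

# The differences in each row sum to (last column - first column),
# so the mean derivative depends only on the two edge columns.
mean_deriv = deriv.mean()
edge_term = (im[:, -1] - im[:, 0]).mean() / (width - 1)
```

For this image mean_deriv equals edge_term and is clearly negative, so the derivative is not centered on zero.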
I am trying to get the phase distribution of a 2D aperture using FFT.
The input is a circle, where everything inside the circle has value 1, outside it has value 0.
In order to make a good transform, I use an input array that is 200x as large as the radius of the circle, and make a 5000x5000 grid out of it. This ensures that the circle is actually circular and there is enough room around in order that no Nyquist things happen.
I need to 2D Fourier transform the aperture and then calculate the phase of the Fourier transform at every point.
The function I use for creating the input (aperture):
creating the input aperture
Next do the numpy fft2 2D fourier transform:
Fourier transforming aperture
And the result of this is a 2D complex array (as expected!), BUT with the imaginary parts so much much much smaller than the real parts (17 orders of magnitude difference imaginary parts ~10E-17).
This is not expected and most probably wrong. What went wrong?
The FFT of a perfectly symmetric real input is strictly real (imaginary components all zero, except for rounding noise), so the phase atan2(Im, Re) is 0 wherever the real part is positive and pi wherever it is negative.
(even symmetry with respect to (0,0) circularly, or to (n/2,n/2))
The phase will become non-zero (thus a non-zero imaginary component in the FFT result) when the input is moved off center or otherwise made non-symmetric.
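This is easy to reproduce on a small grid. The sketch below uses a 64x64 grid and a radius of 5 samples, both arbitrary choices of mine, and builds the disk so that it is circularly symmetric about index (0, 0).

```python
import numpy as np

n = 64
# Integer coordinates in FFT order: 0, 1, ..., 31, -32, ..., -1.
coords = np.fft.fftfreq(n) * n
xx, yy = np.meshgrid(coords, coords)

# Disk of radius 5 centered on index (0, 0), wrapping around the edges.
aperture = (xx**2 + yy**2 <= 5.0**2).astype(float)

# Symmetric input: the imaginary part is only rounding noise.
spectrum_centered = np.fft.fft2(aperture)
max_imag_centered = np.abs(spectrum_centered.imag).max()

# Moving the disk off center by 3 samples introduces a linear phase,
# so the imaginary part is no longer negligible.
spectrum_shifted = np.fft.fft2(np.roll(aperture, 3, axis=0))
max_imag_shifted = np.abs(spectrum_shifted.imag).max()
```

So a tiny imaginary part is what you should expect for a centered, symmetric aperture; it is not a sign that the transform went wrong.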
I'm having a little trouble with doing a Fourier Deconvolution using numpy. I'm currently attempting this with a test case of 3 Gaussians so I know exactly what to expect at each end.
What I'm trying to recover is the input signal given the exact form of the filter and the output.
Here, I have used a naive constraint to suppress the high frequency ends setting it to zero (because the signals are all Gaussians in fourier space as well). I expected to recover my original input with a tiny bit of ringing due to this constraint.
import numpy as np
import matplotlib.pyplot as plt

# Dummy case: Gaussian convolved with Gaussian
N = 128
x = np.arange(-5, 5, 10./(2 * N))
epsilon = 1e-18
def gaus(x,sigma):
    return 1./np.sqrt(2*np.pi)/sigma * np.exp(-(x * x)/(2 * sigma**2))
y_g = gaus(x,0.3) #output gaussian blurred signal
y_b = gaus(x,0.1) #gaussian blur filter
y_i = gaus(x,np.sqrt(0.3**2 - 0.1**2)) #og gaussian input
f_yg = np.fft.fft(y_g) #fft the blur
f_yb = np.fft.fft(y_b) #fft the filter
f_yi = np.fft.fft(y_i)
r_f = (np.fft.fftshift(f_yg)+epsilon)/(np.fft.fftshift(f_yb)+epsilon) #deconvolve by division in fourier space
r_f[np.abs(x)>0.5] = 0 #naive constraint to remove the artifacts by knowing final form is gaussian
r_f = np.fft.ifftshift(r_f)
r_if = np.fft.ifft(r_f)
y_gf = np.fft.ifft(f_yg)
y_bf = np.fft.ifft(f_yb)
y_if = np.fft.ifft(f_yi)
plt.plot(x,y_if, label='fft true input')
plt.plot(x,r_if, label='fft recv. input')
plt.legend(framealpha=0.)
plt.show()
Here the orange is the recovered input signal using the deconvolution of the output and the blur.
There are a few questions I have with this:
There is clearly a scaling issue. The only area where I can think that this may come in is when I applied the naive constraint. Should I renormalize in this step, knowing that 1/sqrt(N)*integral over my fourier space is equal to 1?
It looks like the position of the recovered Gaussian is messed up with half of the curve at either sides of the plot. Is this due to the division in Fourier space? How do I recover the original position (or have I done this completely wrong to begin with)
I've attached the script used to generate the two curves, the original input and recovered input in physical space.
Cheers,
Keven
EDIT: I should add I have no problem restoring the image using scipy.deconvolve + some small edits. This must mean my method here is somehow wrong?
1) As you correctly understood, the need for scaling comes from the Discrete Fourier Transform. The easiest way to find the factor is to consider the deconvolution of two uniform unit signals. Their DFT is n 0 0 0 ..., where n is the number of points of the DFT. Hence the ratio r_f is 1 0 0 0 ... and its inverse FFT computed by np.fft.ifft() is 1/n 1/n 1/n ...
The correct signal resulting from the deconvolution should have been 1/T 1/T 1/T ..., where T=10. is the length of the frame.
As a consequence, the correct scaling to perform the deconvolution is n/T= len(r_f)/10.
r_if=r_if*len(r_if)/10.
2) The deconvoluted signal is translated by half a period. This is due to the fact that the gaussian kernel is centered on the middle of the frame. Simply shift the kernel by half a period and the problem is solved. The function np.fft.fftshift() can be applied to this end:
f_yb = np.fft.fft(np.fft.fftshift(y_b)) #fft the filter
EDIT: To investigate the reason for the translation, let's focus on the case of the deconvolution kernel being a very narrow gaussian distribution, nearly corresponding to Dirac distribution. Your input signal is a gaussian curve, centered at zero, the frame being sampled between -5 and 5. Similarly, the deconvolution kernel is a Dirac centered at zero. As a consequence, the deconvoluted signal must be identical to the input signal: a gaussian curve centered at zero. Nevertheless, the DFT as implemented in FFTW and consequently np.fft.fft() is computed as that of a frame starting at 0 and ending at 10, sampled at points 10j/n where j is in [0..n-1], the frequencies in the Fourier space being k/10 where k in [0..n/2,-n/2+1..-1]. As a consequence, this DFT sees your signal as a gaussian centered at 5 and the deconvolution kernel as a Dirac centered at 5. The convolution of a function f(t) with a Dirac delta(t-t0) centered at t0 is simply the translated function f(t-t0). Hence, the result of the deconvolution as computed by np.fft.fft() is the input signal translated by half a period. Since the input signal is centered at 0 in the [-5,5] frame, the output signal computed by np.fft.fft() is centered at -5 (or equivalently 5 due to periodicity). Shifting the kernel resolves the mismatch between us thinking of the frame as [-5 5] and np.fft.ifft() handling it as if it were [0 10].
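Putting both fixes together, here is a minimal sketch of the corrected deconvolution. The frequency cutoff of 4 cycles per unit length is my own assumption, replacing the index-based mask in the question, and I use ifftshift for the kernel (identical to fftshift when n is even).

```python
import numpy as np

n, T = 256, 10.0  # number of samples and frame length
x = np.linspace(-T / 2, T / 2, n, endpoint=False)

def gaus(x, sigma):
    """Unit-area Gaussian centered at zero."""
    return np.exp(-x**2 / (2 * sigma**2)) / (np.sqrt(2 * np.pi) * sigma)

y_g = gaus(x, 0.3)                          # blurred output
y_b = gaus(x, 0.1)                          # blur kernel
y_i = gaus(x, np.sqrt(0.3**2 - 0.1**2))     # expected input

f_yg = np.fft.fft(y_g)
# Fix 2: move the kernel peak to index 0 before transforming it.
f_yb = np.fft.fft(np.fft.ifftshift(y_b))

# Divide only at low frequencies, where the kernel spectrum is well
# above rounding noise (assumed cutoff, in cycles per unit length).
freqs = np.fft.fftfreq(n, d=T / n)
keep = np.abs(freqs) <= 4.0
r_f = np.zeros(n, dtype=complex)
r_f[keep] = f_yg[keep] / f_yb[keep]

# Fix 1: rescale by n/T after the inverse transform.
r_if = np.fft.ifft(r_f).real * n / T

max_error = np.abs(r_if - y_i).max()
```

With these two changes the recovered curve r_if sits on top of y_i, centered at zero and with the right amplitude.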
Filters are often designed to reduce the effect of high-frequency noise. Deconvolving therefore tends to magnify high-frequency noise. Screening the frequencies as you did is one possible remedy. Notice that it is exactly equivalent to convolving the signal with a particular filter!
In tomographic reconstruction, the filtered backprojection algorithm requires applying a ramp filter, which dramatically inflates the high-frequency noise. A Wiener filter is a common alternative: this kind of filter can be designed to minimize the mean square error of the deconvolved signal, given the SNR of the convolved signal. It nevertheless requires some assumptions regarding the power spectral densities of the signal and noise.
I want to combine phase spectrum of one image and magnitude spectrum of different image into one image.
I have got phase spectrum and magnitude spectrum of image A and image B.
Here is the code.
f = np.fft.fft2(grayA)
fshift1 = np.fft.fftshift(f)
phase_spectrumA = np.angle(fshift1)
magnitude_spectrumA = 20*np.log(np.abs(fshift1))
f2 = np.fft.fft2(grayB)
fshift2 = np.fft.fftshift(f2)
phase_spectrumB = np.angle(fshift2)
magnitude_spectrumB = 20*np.log(np.abs(fshift2))
I am trying to figure it out, but I still do not know how to do that.
Below is my test code.
imgCombined = abs(f) * math.exp(1j*np.angle(f2))
I wish it would come out just like this:
Here are the few things that you would need to fix for your code to work as intended:
The math.exp function only supports scalar exponentiation (and only of real numbers). For element-wise exponentiation of a complex array you should use numpy.exp instead.
Similarly, make sure the multiplication is element-wise. For NumPy ndarrays the * operator is already element-wise (it only means matrix multiplication for np.matrix objects); np.multiply makes the intent explicit and is element-wise in either case.
With these fixes you should get the frequency-domain combined matrix as follows:
combined = np.multiply(np.abs(f), np.exp(1j*np.angle(f2)))
To obtain the corresponding spatial-domain image, you would then need to compute the inverse transform (and take the real part, since there may be small residual imaginary parts due to numerical errors) with:
imgCombined = np.real(np.fft.ifft2(combined))
Finally the result can be shown with:
import matplotlib.pyplot as plt
plt.imshow(imgCombined, cmap='gray')
Note that imgCombined may contain values outside the [0,1] range. You would then need to decide how you want to rescale the values to fit the expected [0,1] range.
The default scaling (resulting in the image shown above) is to linearly scale the values such that the minimum value is set to 0, and the maximum value is set to 1.
Another way could be to limit the values to that range (i.e. forcing all negative values to 0 and all values greater than 1 to 1).
Finally another approach, which seems to provide a result closer to the screenshot provided, would be to take the absolute value with imgCombined = np.abs(imgCombined)
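For reference, the whole pipeline and the three rescaling options can be sketched end to end as follows. The two random arrays are stand-ins for grayA and grayB (I don't have the original images), and the names scaled, clipped and absolute are mine.

```python
import numpy as np

rng = np.random.default_rng(2)
grayA = rng.random((32, 32))  # stand-in for image A (magnitude source)
grayB = rng.random((32, 32))  # stand-in for image B (phase source)

f = np.fft.fft2(grayA)
f2 = np.fft.fft2(grayB)

# Magnitude of A combined with the phase of B, element by element.
combined = np.abs(f) * np.exp(1j * np.angle(f2))

# Back to the spatial domain; the imaginary part is only rounding noise
# because both inputs are real.
spatial = np.fft.ifft2(combined)
imgCombined = spatial.real

# Option 1: linear rescale so that min -> 0 and max -> 1.
scaled = (imgCombined - imgCombined.min()) / (imgCombined.max() - imgCombined.min())
# Option 2: clip everything to [0, 1].
clipped = np.clip(imgCombined, 0.0, 1.0)
# Option 3: absolute value.
absolute = np.abs(imgCombined)
```

Which of the three looks best depends on the images, so it is worth displaying all of them with plt.imshow(..., cmap='gray') and comparing.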
I am trying to implement an algorithm in python, but I am not sure when I should use fftshift(fft(fftshift(x))) and when only fft(x) (from numpy). Is there a rule of thumb based on the shape of input data?
I am using fftshift instead of ifftshift due to the even number of values in the vector x.
It really just depends on what you want. The DFT (and hence the FFT) is periodic in the frequency domain with period equal to 2pi.
The fft() function returns the DFT, i.e. samples of the DTFT, with omega (radians/sample) from 0 to 2pi (i.e. 0 to fs, where fs is the sampling frequency). All fftshift() does is swap the two halves of the output vector of fft() right down the middle. So the output of fftshift(fft()) now runs from -pi to pi (i.e. -fs/2 to fs/2).
Usually, people like to plot a good approximation of the DTFT (or maybe even the CTFT) using the FFT, so they zero-pad the input with a large number of zeros (fft() does this on its own when asked for an output length n longer than the input) and then they use fftshift() to plot between -pi and pi.
In other words, use fftshift(fft()) for plotting, and fft() for the math!
fft(fftshift(x)) rotates the input vector so that the phase of the complex FFT result is relative to the center of the original data window. If the input waveform is not exactly integer periodic in the FFT width, phase relative to the center of the original window of data may make more sense than the phase relative to some averaging between the discontinuous beginning and end. fft(fftshift(x)) also has the property that the imaginary component of a result will always be positive for a positive zero crossing at the center of the window of any antisymmetric waveform component.
fftshift(fft(y)) rotates the FFT results so that the DC bin is in the center of the result, halfway between -Fs/2 and Fs/2, which is a common spectrum display format.
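A small sketch of both uses, with an 8-sample Gaussian chosen only for illustration. It also shows why fftshift and ifftshift are interchangeable only for even lengths.

```python
import numpy as np

n = 8
# Output-side shift: reorder the frequency axis for display.
freqs = np.fft.fftfreq(n)                 # 0, 0.125, ..., 0.375, -0.5, ..., -0.125
freqs_centered = np.fft.fftshift(freqs)   # -0.5, ..., 0, ..., 0.375

# Input-side shift: a symmetric pulse centered in the window.
t = np.arange(n) - n // 2
x = np.exp(-t**2 / 2.0)
spec_plain = np.fft.fft(x)                      # same magnitudes, sign alternates per bin
spec_centered = np.fft.fft(np.fft.ifftshift(x)) # peak moved to index 0: real spectrum

# For odd lengths the two shifts differ: ifftshift undoes fftshift,
# but applying fftshift twice does not return the original order.
odd = np.arange(5)
round_trip = np.fft.ifftshift(np.fft.fftshift(odd))
double_shift = np.fft.fftshift(np.fft.fftshift(odd))
```

As a rule of thumb from this: shift the input with ifftshift when the data's natural origin is the middle of the window, and shift the output with fftshift when you want zero frequency in the middle of a plot.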
As part of a digital image processing class, we have been assigned the Inverse Filter for image restoration. I'm using numpy. The variable names below try to follow the names in Digital Image Processing Gonzalez+Woods, 3e.
A zoom of the original image.
Gaussian kernel "zz.tif" same size as original image.
Zoom of the gaussian smoothed image with no noise added
import sys
import numpy as np
import imtools  # my own PIL+numpy wrapper (see below)
f = imtools.load_image( sys.argv[1], mode="L", dtype="float" )
zz = imtools.load_image( "zz.tif", mode="L", dtype="float" )
F = np.fft.fft2( f )
F2 = np.fft.fftshift( F )
# normalize to [0,1]
H = zz/255.
# calculate the damaged image
G = H * F2
# Inverse Filter
F_hat = G / H
# cheat? replace division by zero (NaN) with zeroes
a = np.nan_to_num(F_hat)
f_hat = np.fft.ifft2( np.fft.ifftshift(a) )
imtools.save_image( np.abs(f_hat), "out.tif" )
imtools is just my wrapper using PIL+numpy to load/store images. (Can post that src, too.)
Zoom of the inverse filtered image.
Am I calculating the Inverse Filter correctly? Am I using numpy correctly?
Is the ringing in the final image expected or am I doing something wrong?
Generally, yes you seem to be doing things correctly, as far as I know.
The ringing is due to an overly "sharp" high pass filter, but that's what the method you're using does.
However, you might consider using numpy.fft.rfft2 ("real fft") and numpy.fft.irfft2 instead of numpy.fft.fft2 and numpy.fft.ifft2 because you're dealing purely with real values. It should be slightly faster.
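As a sketch of what that swap looks like (on an arbitrary random array, since I don't have your image): rfft2 keeps only the non-redundant half of the spectrum along the last axis, and irfft2 needs the original shape passed back via s.

```python
import numpy as np

rng = np.random.default_rng(3)
image = rng.random((16, 20))  # arbitrary real-valued test image

full = np.fft.fft2(image)     # shape (16, 20), complex
half = np.fft.rfft2(image)    # shape (16, 11): last axis is 20 // 2 + 1

# Pass the original shape so irfft2 knows the length of the last axis.
restored = np.fft.irfft2(half, s=image.shape)
```

Note that a filter laid out for the full, fftshift-ed spectrum (like H in the question) would have to be rearranged to match the half-spectrum layout before multiplying.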
I don't know much about Python but the 'ringing' is normal for the inverse filter. The Gibbs phenomenon lies at the basis of the ringing. Since the input is not entirely smooth but has some discontinuities, an infinite number of Fourier components is in principle needed to represent it completely. A finite number of components is sufficient here since the display resolution is finite: the image is pixelated. However, some information is lost in the recorded image because of the multiplication by zeros in H. As a consequence, the restored image approximates the input image with components covering a finite bandwidth, lower than that of the display, revealing the Gibbs oscillations.
To mitigate this, use proper regularization as with a 2D Wiener filter: F_hat = G * H.conjugate() / (abs(H)**2 + NSR**2), where NSR is an estimate of the noise to signal ratio, e.g. linearly increasing from 0 to 10 at the highest spatial frequency. This will account for the finite signal to noise ratio and when the NSR estimate is close enough you should see little 'ringing' after restoration.
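In numpy that could look roughly like the sketch below. Everything about the test setup is an assumption for illustration: a synthetic square image, a Gaussian transfer function built directly in the unshifted FFT layout, a little added noise, and a constant NSR of 0.01 instead of a frequency-dependent estimate.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 64

# Synthetic test image: a bright square on a dark background.
f = np.zeros((n, n))
f[20:44, 20:44] = 1.0

# Gaussian transfer function in the unshifted FFT layout
# (zero frequency at index [0, 0]); the width 0.05 is arbitrary.
fy = np.fft.fftfreq(n)[:, None]
fx = np.fft.fftfreq(n)[None, :]
H = np.exp(-(fx**2 + fy**2) / (2 * 0.05**2))

# Blurred image with a small amount of additive noise.
g = np.fft.ifft2(H * np.fft.fft2(f)).real + 1e-3 * rng.standard_normal((n, n))
G = np.fft.fft2(g)

# Plain inverse filter: the noise is divided by tiny values of H.
F_inverse = G / H

# Wiener-style regularized filter with an assumed constant NSR.
NSR = 0.01
F_wiener = G * np.conj(H) / (np.abs(H)**2 + NSR**2)

f_inverse = np.fft.ifft2(F_inverse).real
f_wiener = np.fft.ifft2(F_wiener).real
err_inverse = np.abs(f_inverse - f).max()
err_wiener = np.abs(f_wiener - f).max()
```

With noise present the plain inverse filter blows up at the frequencies where H is nearly zero, while the regularized version stays bounded; tuning NSR trades residual blur against noise and ringing.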