I am trying to deblur an image in Python but have run into some problems. Here is what I've tried, but keep in mind that I am not an expert on this topic. According to my understanding, if you know the point spread function, you should be able to deblur the image quite simply by performing a deconvolution. However, this doesn't seem to work and I don't know if I'm doing something stupid or if I just don't understand things correctly. In Mark Newman's Computational Physics book (using Python), he touches on this subject in problem 7.9. In this problem he supplies an image that he deliberately blurred using a Gaussian point spread function (psf), and the objective of the problem is to deblur the image using a Gaussian. This is accomplished by dividing the 2D FFT of the blurred image by the 2D FFT of the psf and then taking the inverse transform. This works reasonably well.
To extend this problem, I wanted to deblur a real image taken with a camera that was deliberately out of focus. So I set up a camera and took two sets of pictures. The first set of pictures were in focus. The first was of a very small LED light in a completely darkened room and the second was of a piece of paper with text on it (using the flash). Then, without changing any of the distances or anything, I changed the focus setting on the camera so that the text was very out of focus. I then took a picture of the text using the flash and took a second picture of the LED (without the flash). Here are the blurred images.
Now, according to my understanding, the image of the blurred point light source should be the point spread function, and as such I should be able to use it to deblur my image. The problem is that when I do so I get an image that just looks like noise. After doing a little research, it seems as though noise can be a big problem when using deconvolution techniques. However, given that I have measured what I believe to be the exact point spread function, I am surprised that noise would be an issue here.
One thing I did try was to replace small values (less than epsilon) in the psf transform with either 1 or with epsilon, and I tried this with a huge range of values for epsilon. This yielded an image that was not just noise, but is also not a deblurred version of the image; it looks like a weird, blurry version of the original (non-blurred) image. Here is an image from my program (you can ignore the value of sigma, which was not used in this program).
I believe I am dealing with a noise issue, but I don't know why and I don't know what to do about it. Any advice would be much appreciated (keeping in mind that I am no expert in this area).
Note that I have deliberately not posted the code because I think that is somewhat irrelevant at this point. But I would be happy to do so if anyone thinks that would be useful. I don't think it's a programming issue because I used the same technique and it works fine when I have the known point spread function (such as when I divide the FFT of the original in-focus image by the FFT of the out-of-focus image and then inverse transform). I just don't understand why I can't seem to use my experimentally measured point spread function.
The problem you have sought to solve is, unfortunately, more difficult than you might expect. Let me explain it in four parts. The first section assumes that you are comfortable with the Fourier transform.
Why you cannot solve this problem with a simple deconvolution.
An outline to how image deblurring can be performed.
Deconvolution by FFT and why it is a bad idea
An alternative method to perform deconvolution
But first, some notation:
I use I to represent an image and K to represent a convolution kernel (for a blur, K is also called the point spread function, or PSF). I * K is the convolution of the image I with the kernel K. F(I) is the (n-dimensional) Fourier transform of the image I and F(K) is the Fourier transform of the convolution kernel K. Similarly, Fi is the inverse Fourier transform.
Why you cannot solve this problem with a simple deconvolution:
You are correct when you say that we can recover the original image from a blurred image Ib = I * K by dividing the Fourier transform of Ib by the Fourier transform of K and taking the inverse transform. However, lens blur is not a pure convolution blurring operation. It is a modified convolution blurring operation where the blurring kernel K is dependent on the distance to the object you have photographed. Thus, the kernel changes from pixel to pixel.
You might think that this is not an issue with your image, as you have measured the correct kernel at the position of the image. However, this might not be the case, as the part of the image that is far away can influence the part of the image that is close. One way to fix this problem is to crop the image so that it is only the paper that is visible.
Why deconvolution by FFT is a bad idea:
The Convolution Theorem states that I * K = Fi(F(I)F(K)). This theorem leads to the reasonable assumption that if we have an image, Ib = I * K that is blurred by a convolution kernel K, then we can recover the deblurred image by computing I = Fi(F(Ib)/F(K)).
Before we look at why this is a bad idea, I want to get some intuition for what the Convolution Theorem means. When we convolve an image with a kernel, then that is the same as taking the frequency components of the image and multiplying it elementwise with the frequency components of the kernel.
Now, let me explain why it is difficult to deconvolve an image with the FFT. Blurring, by default, removes high-frequency information. Thus, the high frequencies of K must go towards zero. The reason for this is that the high-frequency information of I is lost when it is blurred -- thus, the high-frequency components of Ib must go towards zero. For that to happen, the high-frequency components of K must also go towards zero.
As a result of the high-frequency components of K being almost zero, we see that the high-frequency components of Ib are amplified significantly (as we almost divide by zero) when we deconvolve with the FFT. This is not a problem in the noise-free case.
In the noisy case, however, this is a problem. The reason for this is that noise typically contains a lot of high-frequency information. So when we try to deconvolve Ib, the noise is amplified to an almost infinite extent. This is the reason that deconvolution by the FFT is a bad idea.
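To make the noise amplification concrete, here is a minimal sketch on synthetic data (not your photographs). The Gaussian PSF, the image size, and the noise level are all assumptions chosen for illustration. It blurs a random image with periodic boundaries, then deconvolves by plain division in the Fourier domain, once without and once with a small amount of added noise.

```python
import numpy as np

def gaussian_kernel(shape, sigma):
    """Periodic Gaussian PSF centred on pixel (0, 0), normalised to sum to 1."""
    ys = np.fft.fftfreq(shape[0]) * shape[0]   # signed integer pixel offsets
    xs = np.fft.fftfreq(shape[1]) * shape[1]
    yy, xx = np.meshgrid(ys, xs, indexing="ij")
    k = np.exp(-(yy**2 + xx**2) / (2 * sigma**2))
    return k / k.sum()

def fft_blur(image, kernel):
    # Convolution theorem: I * K = Fi(F(I) F(K)), with periodic boundaries
    return np.real(np.fft.ifft2(np.fft.fft2(image) * np.fft.fft2(kernel)))

def naive_fft_deconvolve(blurred, kernel):
    # I = Fi(F(Ib) / F(K)); divides by near-zero values at high frequencies
    return np.real(np.fft.ifft2(np.fft.fft2(blurred) / np.fft.fft2(kernel)))

rng = np.random.default_rng(0)
image = rng.random((64, 64))
kernel = gaussian_kernel(image.shape, sigma=1.0)

blurred = fft_blur(image, kernel)
noisy = blurred + 1e-3 * rng.standard_normal(image.shape)  # assumed noise level

err_clean = np.abs(naive_fft_deconvolve(blurred, kernel) - image).max()
err_noisy = np.abs(naive_fft_deconvolve(noisy, kernel) - image).max()
print(f"max error without noise: {err_clean:.2e}")
print(f"max error with noise:    {err_noisy:.2e}")
```

In this sketch the noise-free reconstruction is accurate to rounding error, while the noisy one has errors far larger than the noise that was added. Wider kernels make the gap worse.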
Furthermore, you need to consider how the FFT based convolution algorithm deals with boundary conditions. Normally, when we convolve images, the resolution decreases somewhat. This is unwanted behaviour, so we introduce boundary conditions that specify the pixel values of pixels outside the image. Examples of such boundary conditions are
Pixels outside the image have the same value as the closest pixel inside the image
Pixels outside the image have a constant value (e.g. 0)
The image is part of a periodic signal, thus the row of pixels above the topmost row is equal to the bottom row of pixels.
The final boundary condition often makes sense for 1D signals. For images, however, it makes little sense. Unfortunately, the convolution theorem specifies that periodic boundary conditions are used.
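As a small illustration, the three boundary conditions listed above correspond to the `mode` values "nearest", "constant" and "wrap" of `scipy.ndimage.convolve`. The sketch below uses a random test image and kernel (both just assumptions for the demo) and checks that multiplication in the Fourier domain reproduces the periodic ("wrap") result.

```python
import numpy as np
from scipy.ndimage import convolve

rng = np.random.default_rng(0)
image = rng.random((8, 8))
kernel = rng.random((3, 3))

nearest = convolve(image, kernel, mode="nearest")              # closest pixel
constant = convolve(image, kernel, mode="constant", cval=0.0)  # constant value
periodic = convolve(image, kernel, mode="wrap")                # periodic signal

# FFT route: embed the kernel in an image-sized array and shift it so that
# the kernel centre sits at index (0, 0)
padded = np.zeros_like(image)
padded[:3, :3] = kernel
padded = np.roll(padded, shift=(-1, -1), axis=(0, 1))
via_fft = np.real(np.fft.ifft2(np.fft.fft2(image) * np.fft.fft2(padded)))

print(np.allclose(via_fft, periodic))  # FFT convolution is the periodic one
print(np.allclose(nearest[1:-1, 1:-1], periodic[1:-1, 1:-1]))  # interior agrees
```

The modes only differ in a border as wide as half the kernel, so the wider the PSF, the more of the image is affected by the choice.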
In addition to this, it seems that the FFT based inversion method is significantly more sensitive to erroneous kernels than iterative methods (e.g. Gradient descent and FISTA).
An alternative method to perform deconvolution
It might seem like all hope is lost now, as all images are noisy, and deconvolving will increase the noise. However, this is not the case, as we have iterative methods to perform deconvolution. Let me start by showing you the simplest iterative method.
Let || I ||² be the sum of the squares of all of I's pixels. Solving the equation
Ib = I * K
with respect to I is then equivalent to solving the following optimisation problem:
min L(I) = min 0.5 * ||I * K - Ib||²
with respect to I. This can be done using gradient descent, as the gradient of L is given by
DL = Q * (I * K - Ib)
where Q is the kernel you get by flipping K along both axes, i.e. rotating it by 180 degrees (convolving with Q is also called the matched filter in the signal processing literature). For a symmetric kernel, Q is the same as K.
Thus, you can get the following iterative algorithm that will deblur an image.
import numpy as np
from scipy.ndimage import convolve

blurred_image = ...  # Load image (2D float array)
kernel = ...  # Load kernel/psf (2D float array)
# You need to find this yourself, do a logarithmic line search. Small rate will
# always converge, but slowly. Start with 0.4 and divide by 2 every time it fails.
learning_rate = 0.4
maxit = 100

def loss(image):
    return 0.5 * np.sum((convolve(image, kernel) - blurred_image)**2)

def gradient(image):
    # Q is K flipped along both axes
    return convolve(convolve(image, kernel) - blurred_image, kernel[::-1, ::-1])

deblurred = blurred_image.copy()
for _ in range(maxit):
    deblurred -= learning_rate * gradient(deblurred)
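Before using this on a photograph, it is worth checking it on a synthetically blurred image. Here is a minimal self-contained sketch of that test. The rectangle image, the 5x5 box kernel, the noise level, and the fixed step size are all assumptions made for the demo, not values taken from your setup.

```python
import numpy as np
from scipy.ndimage import convolve

rng = np.random.default_rng(0)

# Synthetic ground truth: a bright rectangle on a dark background
true_image = np.zeros((32, 32))
true_image[8:24, 12:20] = 1.0

kernel = np.ones((5, 5)) / 25.0  # hypothetical box-blur PSF, sums to 1
blurred = convolve(true_image, kernel) + 0.01 * rng.standard_normal(true_image.shape)

def loss(image):
    return 0.5 * np.sum((convolve(image, kernel) - blurred) ** 2)

def gradient(image):
    residual = convolve(image, kernel) - blurred
    # Convolve the residual with the kernel flipped along both axes
    return convolve(residual, kernel[::-1, ::-1])

learning_rate = 0.4  # starting value suggested above; halve it if the loss grows
estimate = blurred.copy()
losses = [loss(estimate)]
for _ in range(200):
    estimate -= learning_rate * gradient(estimate)
    losses.append(loss(estimate))

print(f"loss before: {losses[0]:.4f}, loss after: {losses[-1]:.4f}")
```

If the loss ever increases from one iteration to the next, the step size is too large for your kernel.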
The above method is perhaps the simplest of the iterative deconvolution algorithms. In practice, these are used in the form of so-called regularised deconvolution algorithms. These algorithms work by first specifying a function that measures the amount of noise in an image, e.g. TV(I) (the total variation of I). Then the optimisation procedure is performed on L(I) + w * TV(I), where the weight w sets how strongly noise is penalised. If you are interested in such algorithms, I recommend reading the FISTA paper by Amir Beck and Marc Teboulle. The paper is quite maths heavy, but you don't need to understand most of it -- only how to implement the TV deblurring algorithm.
In addition to using a regulariser, we use accelerated methods to minimise the loss L(I). One such example is Nesterov accelerated gradient descent. See Adaptive Restart for Accelerated Gradient Schemes by Brendan O'Donoghue, Emmanuel Candes for information about such methods.
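As a rough sketch of what acceleration buys you, the snippet below compares plain gradient descent with a basic Nesterov scheme on a synthetic problem. The test image, the box kernel, the step size of 1.0 (reasonable only because this kernel is non-negative and sums to 1), and the momentum schedule (k - 1) / (k + 2) are assumptions for the demo. It does not include the adaptive restart from the paper mentioned above, nor any regulariser.

```python
import numpy as np
from scipy.ndimage import convolve

rng = np.random.default_rng(0)
true_image = np.zeros((32, 32))
true_image[8:24, 12:20] = 1.0
kernel = np.ones((5, 5)) / 25.0  # hypothetical box-blur PSF
blurred = convolve(true_image, kernel) + 0.01 * rng.standard_normal(true_image.shape)

def loss(image):
    return 0.5 * np.sum((convolve(image, kernel) - blurred) ** 2)

def gradient(image):
    residual = convolve(image, kernel) - blurred
    return convolve(residual, kernel[::-1, ::-1])

def plain_descent(start, step, iterations):
    x = start.copy()
    for _ in range(iterations):
        x = x - step * gradient(x)
    return x

def nesterov_descent(start, step, iterations):
    x = start.copy()
    y = start.copy()  # extrapolated point where the gradient is evaluated
    for k in range(1, iterations + 1):
        x_next = y - step * gradient(y)
        y = x_next + (k - 1) / (k + 2) * (x_next - x)  # momentum step
        x = x_next
    return x

plain = plain_descent(blurred, step=1.0, iterations=100)
accelerated = nesterov_descent(blurred, step=1.0, iterations=100)
print(f"plain: {loss(plain):.3e}, accelerated: {loss(accelerated):.3e}")
```

Note that a lower data-fit loss is not the same as a better image: without a regulariser, driving the loss towards zero eventually starts fitting the noise as well.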
An outline to how image deblurring can be performed.
Crop your image so that everything is at the same distance from the camera
Find the convolution kernel the same way you did now (test your deconvolution algorithm on synthetically blurred images first)
Implement an iterative method to compute the deconvolution
Deconvolve the image.
I want to transform and align a detected face (320x240 Size) from a CelebA image (1024x1024 Size) using OpenCV's cv2.warpAffine function but the quality of the transformed image is significantly lower than when I try to align it by hand in Photoshop: (Left Image Is Transformed By Photoshop & Right Image Is Transformed in OpenCV)
I have used all of the interpolation techniques of OpenCV but none of them came close in quality to Photoshop.
The code I'm using is:
warped = cv2.warpAffine(image, TRANSFORM_MATRIX, (240, 320), flags=cv2.INTER_AREA)
What could be wrong that made the transformed image have such low quality?
Here's a Link to the original 1024x1024 image if needed.
Problem and general solution
You are down-sampling a signal.
The approach is always the same:
lowpass to remove high frequency components
resample/decimate
What not to do
If you don't do the lowpass, you'll get aliasing. You noticed that. Aliasing means the sampling step can completely miss some high frequency component (edge/corner/point/...), giving those strange artefacts. A properly resampled image would not completely lose such high frequency features.
If you do the lowpass after resampling, it won't fix the issue, only hide it. The damage has already been done.
You can convince yourself of both these aspects if you downsample some regular grid of strongly contrasting lines. Try alternating single-pixel lines of black and white for most effect.
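Here is a minimal sketch of that experiment with NumPy and SciPy. The 8x8 stripe image and the 2x2 box filter used as the lowpass are assumptions chosen to keep the arithmetic obvious.

```python
import numpy as np
from scipy.ndimage import uniform_filter

# Alternating single-pixel white/black columns; the true average brightness is 0.5
stripes = np.zeros((8, 8))
stripes[:, ::2] = 1.0

# Decimation without lowpass: every kept sample lands on a white column
decimated = stripes[::2, ::2]

# Lowpass first (2x2 box average), then decimate
lowpassed = uniform_filter(stripes, size=2, mode="wrap")[::2, ::2]

print(decimated)   # all ones: the black lines have vanished entirely
print(lowpassed)   # all 0.5: the average brightness is preserved
```

Blurring `decimated` afterwards would still give an all-white image, which is the "damage has already been done" case.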
Implementations
Libraries such as PIL do the lowpass implicitly before resampling.
OpenCV does not (kinda, in general). Not even with Lanczos interpolation (in OpenCV) will you be able to skip the lowpassing, because OpenCV's Lanczos has a fixed coefficient.
OpenCV has INTER_AREA, which is a linear interpolation, but it additionally sums over all pixels that are in the area between the corner samples (instead of just sampling those four corners). This can spare you the extra lowpass step.
Here's the result of cv.resize(im, (240, 240), interpolation=cv.INTER_AREA):
Here's the result of cv.warpAffine(im, M[:2], (240, 240), flags=cv.INTER_AREA) with M = np.eye(3) * 0.25 (equivalent scaling; note that warpAffine takes the interpolation mode through its flags argument):
It appears that warpAffine can't do INTER_AREA. That sucks for you :/
If you need to downsample with OpenCV, and it's a power of two, you can use pyrDown. That does the lowpass and decimation... for a factor of two. Repeated application gives you higher powers.
If you need arbitrary downsampling and you don't like INTER_AREA for some reason, you'd have to apply a GaussianBlur to the input. Sigma needs to be (inversely) proportional to the scale factor. There is some relation between the gaussian filter's sigma and the resulting cutoff frequency. You'll want to investigate that some more, if you don't want to pick a value arbitrarily. Check out the kernel for pyrDown, and what gaussian sigma it matches best. That's probably a good value for a scale factor of 0.5, and other factors should be (inversely) proportional.
For simple downscaling, one gaussian blur would be fine. For affine warps and higher transformations, you'd need to apply lowpassing that respects the different scale for every single pixel that is looked up, because their "support" in the source image isn't square any longer, maybe not even rectangular, but an arbitrary quad!
What am I not saying?
This goes for down-sampling. If you up-sample, do not lowpass.
I am trying to approximate different shapes of a weld bead geometry cross section in additive manufacturing with a graph or ideally (but not necessarily) a function. The regions are the outer shape as well as the individual layers. (see following images)
Therefore, I applied some pre-processing methods to extract the relevant pixels which represent the geometry of a weld bead which are shown as white pixels. (see third image)
I derived this image with Canny edge detection, after first converting the image to grey-scale and applying multiple morphological operations such as closing, erosion and dilation.
The "noisy" areas are the transition areas between individual layers of metal and only show up in this way, so in general there is no "better" or "sharper" transition with less "noise". Pictures 3 and 4 are an example of some of the image pre-processing methods I used.
My main approach to treat the inner geometry so far was to split up the image in several sub-images and perform least squares regression on each individual one by interpreting the white pixels as data points. Afterwards I've stitched all those little approximation functions back together to form the image of the original size. I've tried it with different sizes of those sub-images. (see pictures 5 and 6)
However, this approach produces jumps between the functions as well as functions next to each other where the pixels or data points in my case should only be approximated with one function (see attached image). My next approach would be to use multivariate adaptive regression on the sub-images.
Thus, I'm asking if anybody knows a better solution for my problem, maybe even for an approximation on a global scale without splitting the image into sub-images. The approximation does not need to be a polynomial function; piecewise linear but connected functions are totally sufficient. I would be thankful if anybody knows a method that is at least capable of achieving what I want to do, even if it is a pure non-linear regression method. Unfortunately I don't have many images (only 64), hence I don't think I can use an ANN. (please correct me if I'm wrong)
If you need to take a look at my code, just let me know. Thank you! :)
The best I could obtain is with bilateral filtering for denoising, then adaptive binarization.
And on a reduced image:
I'm using BrainWeb, a simulated dataset for normal brain MR images. I want to validate the MyDenoise function, which calls denoise_nl_means of the skimage.restoration package. To do so, I downloaded two sets of images from BrainWeb: an original image with 0% noise and 0% intensity non-uniformity, and a noisy image with the same options but 9% noise and 40% intensity non-uniformity. I calculate the signal-to-noise ratio (SNR) based on a deprecated version of scipy.stats as follows:
import numpy as np

def signaltonoise(a, axis=0, ddof=0):
    a = np.asanyarray(a)
    m = a.mean(axis)
    sd = a.std(axis=axis, ddof=ddof)
    return np.where(sd == 0, 0, m/sd)
I assume, after denoising, we should have a higher SNR which is always true. However, when comparing to the original image, we have more SNR in the noisy image. I guess it's because the total mean of the image has increased more significantly than the standard deviation. So, it seems SNR cannot be a good measurement to validate whether my denoised image is closer to the original images or not since noisy images have already a higher SNR than the original images. I want to know if there are better measurements for validating denoising functions in images.
Here is my result:
Original image SNR: 1.23
Noisy image SNR: 1.41
Denoised image SNR: 1.44
Thank you.
This is not how you calculate SNR.
The core concept is that, for any one given image, you don’t know what is noise and what is signal. If we did, denoising wouldn’t be a problem. Therefore, it is impossible to measure the noise level from one image (it is possible to estimate it, but we cannot compute it).
The solution is to use that noise-free image. This is the ground truth, the objective of the denoise operation. We can thus estimate the noise by comparing any one image to this ground truth. The difference is the noise:
noise = image - ground_truth
You can now compute the mean square error (MSE):
mse = np.mean(noise**2)
Or the signal to noise ratio:
snr = np.mean(ground_truth) / np.std(noise)
(Note that this is one of many possible different definitions of signal to noise ratio. Often we use the power of the signals rather than just their means, and often it is measured in dB.)
In general, MSE is a really good way to talk about the error in denoising. You’ll see most scientific papers in the field using peak signal to noise ratio (PSNR) instead, which is just a scaling and logarithmic mapping of the MSE. Therefore it is pointless to use both.
You can also look at the mean absolute error (MAE), which is less sensitive than the MSE to individual pixels with a large error.
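Put together, these measures could be sketched as follows. The function names and the `data_range` default of 255 are assumptions; use the actual intensity range of your data for PSNR (it is defined as 10 * log10(MAX² / MSE)). The toy arrays only serve to show the arithmetic.

```python
import numpy as np

def mse(image, ground_truth):
    diff = image.astype(float) - ground_truth.astype(float)
    return np.mean(diff ** 2)

def mae(image, ground_truth):
    diff = image.astype(float) - ground_truth.astype(float)
    return np.mean(np.abs(diff))

def psnr(image, ground_truth, data_range=255.0):
    # Logarithmic rescaling of the MSE, in dB
    return 10.0 * np.log10(data_range ** 2 / mse(image, ground_truth))

# Toy example: a flat 4x4 image where a single pixel is off by 8
ground_truth = np.full((4, 4), 100.0)
noisy = ground_truth.copy()
noisy[0, 0] += 8.0

print(mse(noisy, ground_truth))   # 64 / 16 = 4.0
print(mae(noisy, ground_truth))   # 8 / 16 = 0.5
print(psnr(noisy, ground_truth))
```

To validate a denoiser, compute the same measure for the noisy image and for the denoised image, both against the noise-free one, and check that the error went down.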
Given two images of the same scene with potential alignment, focus, lighting differences and noise, I am looking for an operation that I can run on these images that produces another image of the difference between them that minimizes these differences or is more sensitive to the structural differences between them than the global differences. My initial thought was a comparison between the corresponding neighborhoods around a pixel in image A and the same pixel in image B might work.
Is this function already implemented in OpenCV or some other Python library (scipy, numpy etc)?
My musings:
A simple frame difference would tell me where absolute differences occur but is very brittle to alignment, lighting differences. Maybe there is a way to find the standard deviation over a pixel's neighborhood. Numpy's std only works by axis...
This seems like I want the correlation between two signals but I don't know how to extend this to a non-repeating 2D world. scipy.signal.correlate2d seems like it may work if there was an efficient way to just pass corresponding neighborhoods to it. However, I don't have a good feel for what is going on under the hood.
A convolution of one image where the kernel comes from corresponding locations in the other images would give a comparison that would handle noise and focus issues well but I don't know how to use a dynamic kernel for a convolution.
If I had a library of basically identical images (not an original assumption but doable) to compare one image to, I could use a mean difference or mixture of gaussians. But I don't think this would help with alignment or lighting. I could align the image first and then do the comparison.
Per the comment below, I looked up the skimage SSIM (Structural Similarity Index) method that is used to measure image degradation due to things like lossy compression and decompression. It actually expects two copies of the same image - one a truth source, one in a potentially degraded state due to lossy compression and decompression. This method is soft on global bias (lighting), which is good, but sensitive to noise (by design) and especially sensitive to misalignment.
The comment led me to MSE which acts globally, but if iterated over an image by neighborhood, it gives a good result - insensitive to bias, noise but not structural differences. However, it is fairly sensitive to alignment differences and very slow in python...
import cv2
import numpy as np
# scipy.misc.face() was removed in SciPy 1.12; newer versions provide scipy.datasets.face()
from scipy import misc

# Mean Squared Error
def mse(imageA, imageB):
    return np.mean((imageA - imageB)**2)

face = misc.face()
nface = np.array(face)
nface[295:305,395:415] = face[195:205,495:515] #discontinuous region
nface = cv2.blur(nface, (3,3)) # focus effects
nface = nface + (np.random.randn(*face.shape) * 10 - 5) # noise
nface = (nface * .9 + 20).astype(int) #lighting
n = m = 3
output = np.zeros(face.shape[:2])
for i in range(face.shape[0]):
    for j in range(face.shape[1]):
        if i > n and i < face.shape[0]-n and j > m and j < face.shape[1]-m:
            output[i,j] = mse(nface[i-n:i+n, j-m:j+m], face[i-n:i+n, j-m:j+m])
Is this a common image processing technique? Is there a name for this or an optimized implementation in OpenCV or NumPy?
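For reference, the neighbourhood loop above amounts to a box filter applied to the squared difference, so one possible vectorised form is sketched below. It assumes single-channel images and uses `scipy.ndimage.uniform_filter`. The function name `local_mse` and the zero padding at the border are my choices, and the same window half-width n is used in both directions.

```python
import numpy as np
from scipy.ndimage import uniform_filter

def local_mse(image_a, image_b, n=3):
    """MSE over the window [i-n, i+n) x [j-n, j+n) around every pixel."""
    squared_diff = (image_a.astype(float) - image_b.astype(float)) ** 2
    # A box filter of even size 2n covers offsets -n .. n-1, like the slices
    # in the loop; border pixels are padded with zeros.
    return uniform_filter(squared_diff, size=2 * n, mode="constant")

rng = np.random.default_rng(0)
a = rng.random((20, 24))
b = rng.random((20, 24))
fast = local_mse(a, b)

# Compare one interior pixel against the explicit slice from the loop
i, j, n = 10, 12, 3
slow = np.mean((a[i - n:i + n, j - n:j + n] - b[i - n:i + n, j - n:j + n]) ** 2)
print(np.isclose(fast[i, j], slow))
```

For a colour image the loop also averages over the channels, so you would additionally take the mean over the last axis of the squared difference before filtering.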
Here is my question:
My optical system is made of a camera plus a circular plexiglass "lens" that changes its curvature depending on pressure (radial bending).
This curvature induces a deformation of the image captured by the camera.
To correct this deformation, images need to be calibrated.
Calibration can be made with a grid (chessboard, dots, lines), pressure range has to be discretized with a certain step.
For each pressure step, an image of the grid has to be taken.
Then each image has to be compared to the reference one (P=0), and a transformation matrix has to be computed and stored.
Finally, each image taken during the experiment for a specific pressure has to be corrected by the transformation matrix.
The deformation is non-linear (not only a combination of rotations and translations), but most likely Barrel distortion. (again not induced by the camera)
Which looks like that:
http://en.wikipedia.org/wiki/Distortion_%28optics%29#mediaviewer/File:Barrel_distortion.svg
I found a plugin in ImageJ called BunwarpJ, http://biocomp.cnb.csic.es/~iarganda/bUnwarpJ/
and I basically want to know if there is an equivalent way to produce the same result in Opencv.
(CalibrateCamera won't do the trick)
OpenCV has an undistort function that takes an image, a camera matrix and distortion coefficients, and produces a new image corrected for that distortion. It can optionally take a new camera matrix as well (if you need to do other transformations on the new image).
I have not used it before, so I can't say exactly what the camera or distortion coefficients are, but as the manual describes:
The function transforms an image to compensate radial and tangential
lens distortion. The function is simply a combination of
initUndistortRectifyMap() (with unity R ) and remap() (with bilinear
interpolation).
So checking out those two functions is a good way to find out.
I believe you misunderstood the manual, perhaps because you seem to think that calibrateCamera does this for you. Instead, calibrateCamera returns the camera matrix and distortion coefficients, which you need to undistort your image.
Each lens has its own constant coefficients, which in your case means that you'll have to run calibrateCamera for a range of pressures (I assume you control that experimentally?) and then call undistort with the parameters you got for the pressure at which each image was taken.
A matrix can only capture a linear transformation (or possibly a linear transformation in homogeneous space), not a general distortion.
In my experience any attempt to use a single global transformation formula wouldn't be very accurate (it's not trivial to get just 99.9% accuracy). Even just correcting camera lens distortion this way is difficult if you want high accuracy.
In the past I got good enough results using a sparse global RBF interpolation, but later I moved to an interpolating 2d spline approach; if you can choose your calibration points to be on a regular grid this is the solution I would suggest.
In the end the mapping could be a 2-valued 3d interpolating spline on a regular grid (XY for the image, Z for the pressure; values UV are the pixel coordinates).
Straightening the image once pressure is known is just texture mapping.
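A rough sketch of that idea with SciPy follows. Two simplifications are mine, not part of the approach described above: it uses `RegularGridInterpolator` with its default linear interpolation where the answer uses an interpolating spline, and it does the texture mapping with `map_coordinates`. The function names, array layout and identity test data are all hypothetical.

```python
import numpy as np
from scipy.interpolate import RegularGridInterpolator
from scipy.ndimage import map_coordinates

def build_mapping(pressures, grid_y, grid_x, src_y, src_x):
    """src_y and src_x have shape (len(pressures), len(grid_y), len(grid_x)).

    For each calibration node (pressure, y, x) of the corrected image they
    store where that node was observed in the distorted image.
    """
    axes = (pressures, grid_y, grid_x)
    return RegularGridInterpolator(axes, src_y), RegularGridInterpolator(axes, src_x)

def straighten(image, mapping, pressure):
    interp_y, interp_x = mapping
    h, w = image.shape
    yy, xx = np.mgrid[0:h, 0:w].astype(float)
    query = np.stack([np.full(yy.shape, float(pressure)), yy, xx], axis=-1)
    # Texture mapping: look up every output pixel in the distorted image
    coords = [interp_y(query), interp_x(query)]
    return map_coordinates(image, coords, order=1, mode="nearest")

# Sanity check with an identity mapping (no distortion at any pressure)
pressures = np.array([0.0, 1.0])
grid_y = np.array([0.0, 4.0, 8.0])
grid_x = np.array([0.0, 4.0, 8.0])
node_y, node_x = np.meshgrid(grid_y, grid_x, indexing="ij")
src_y = np.stack([node_y, node_y])
src_x = np.stack([node_x, node_x])

mapping = build_mapping(pressures, grid_y, grid_x, src_y, src_x)
image = np.random.default_rng(0).random((9, 9))
corrected = straighten(image, mapping, pressure=0.5)
print(np.allclose(corrected, image))
```

The calibration grid has to span the whole image (and the whole pressure range), since the interpolator refuses queries outside its grid by default.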