Pytorch: Normalize Image data set

Pytorch: Normalize Image data set - python

I want to normalize custom dataset of images. For that i need to compute mean and standard deviation by iterating over the dataset. How can I normalize my entire dataset before creating the data set?

Well, let's take this image as an example:
The first thing you need to do is decide which library you want to use: Pillow or OpenCV. In this example I'll use Pillow:
from PIL import Image
import numpy as np
img = Image.open("test.jpg")
pix = np.asarray(img.convert("RGB")) # Open the image as RGB
Rchan = pix[:,:,0] # Red color channel
Gchan = pix[:,:,1] # Green color channel
Bchan = pix[:,:,2] # Blue color channel
Rchan_mean = Rchan.mean()
Gchan_mean = Gchan.mean()
Bchan_mean = Bchan.mean()
Rchan_var = Rchan.var()
Gchan_var = Gchan.var()
Bchan_var = Bchan.var()
And the results are:
Red Channel Mean: 134.80585625
Red Channel Variance: 3211.35843945
Green Channel Mean: 81.0884125
Green Channel Variance: 1672.63200823
Blue Channel Mean: 68.1831375
Blue Channel Variance: 1166.20433566
Hope it helps for your needs.

What normalization tries to do is mantain the overall information on your dataset, even when there exists differences in the values, in the case of images it tries to set apart some issues like brightness and contrast that in certain case does not contribute to the general information that the image has. There are several ways to do this, each one with pros and cons, depending on the image set you have and the processing effort you want to do on them, just to name a few:
Linear Histogram stetching: where you do a linear map on the current
range of values in your image and stetch it to match the 0 and 255
values in RGB
Nonlinear Histogram stetching: Where you use a
nonlinear function to map the input pixels to a new image. Commonly
used functions are logarithms and exponentials. My favorite function
is the cumulative probability function of the original histogram, it
works pretty well.
Adaptive Histogram equalization: Where you do a linear
histogram stretching in certain places of your image to avoid doing
an identity mapping where you have the max range of values in your original
image.

Related

matplotlib imshow gray colormap additional preprocess

I have bunch of images, randomly I figured out that best preprocessing for my images is using matplotlib imshow with cmap=gray. This is my RGB image (I can't publish the original images, this is a sample that I created to make my point. So the original images are not noiseless and perfect like this):
When I use plt.imshow(img, cmap='gray') the image will be:
I wanted to implement this process in Opencv. I tried to use OpenCV colormaps but there wasn't any gray one there. I used these solutions but the result is like the first image not the second one. (result here)
So I was wondering besides changing colormaps, what preprocessing does matplotlib apply on images when we call imshow?
P.S: You might suggest binarization, I've tested both techniques but on my data binarization will ruin some of the samples which this method (matplotlib) won't.

cv::normalize with NORM_MINMAX should help you. it can map intensity values so the darkest becomes black and the lightest becomes white, regardless of what the absolute values were.
this section of OpenCV docs contains example code. it's a permalink.
or so that minIdst(I)=alpha, maxIdst(I)=beta when normType=NORM_MINMAX (for dense arrays only)
that means, for NORM_MINMAX, alpha=0, beta=255. these two params have different meanings for different normTypes. for NORM_MINMAX it seems that the code automatically swaps them so the lower value of either is used as the lower bound etc.
further, the range for uint8 type data is 0 .. 255. giving 1 only makes sense for float data.
example:
import numpy as np
import cv2 as cv
im = cv.imread("m78xj.jpg")
normalized = cv.normalize(im, dst=None, alpha=0, beta=255, norm_type=cv.NORM_MINMAX)
cv.imshow("normalized", normalized)
cv.waitKey(-1)
cv.destroyAllWindows()
apply a median blur to remove noisy pixels (which go beyond the average gray of the text):
blurred = cv.medianBlur(im, ksize=5)
# ...normalize...
or do the scaling manually. apply the median blur, find the maximum value in that version, then apply it to the original image.
output = im.astype(np.uint16) * 255 / blurred.max()
output = np.clip(output, 0, 255).astype(np.uint8)
# ...

How to downscale an image without losing discrete values?

I have an image of a city with discrete colors (Green=meadow, black=buildings, white/yellow=roads). Using Pillow, I import the picture in my (Python) program and convert it to a Numpy array with discrete values for the colors (i.e. green pixels become 1's, black pixels become 2's, etc).
I want to downscale the resolution of the image (for computational purposes) while retaining as much information as possible. However, using Pillow's resize() method, colors deviate from these discrete values. How can I downscale this image while (most importantly) retaining the discrete colors and (also important) with losing as little information as possible?
Here an example of the image: https://i.imgur.com/6Tef55H.png
EDIT: per request, some code:
from PIL import Image
import Numpy as np
picture = Image.open(some_image.png)
width, height = picture.size
pic_array = np.zeros(width,height)
# Turn the image into discrete values
for i in range(0,width):
for j in range(0,height):
red, green, blue = picture.getpixel((i,j))
if red == a and green == b and blue == c:
#An example of how discrete colors are converted to values
pic_array[i][j] = 1
Scaling can be done in two ways:
1) Scaling the original image using Pillow's resize library or
2) rescaling the final array using something like:
scaled_array = pic_array[0:width:5, 0:height,5]
Option 1 is "well" in terms of retaining information but loses discrete values, while option 2 does it the other way around.

I was interested in this question and wrote some code to try out some ideas - specifically the "mode" filter suggested by #jasonharper in the comments. So, I programmed it up.
First of all the input image is not 4 nicely defined classes, but actually has 6,504 different colours, so I made a palette of 4 colours using ImageMagick like this:
magick xc:black xc:white xc:yellow xc:green +append palette.png
Here it is enlarged - in reality is 4x1 pixels:
Then I mapped the colours in the image to the palette of 4 discrete colours:
magick map.png +dither -remap palette.png start.png
Then I tried this code to calculate the median and the mode of each 3x3 window:
#!/usr/bin/env python3
from PIL import Image
import numpy as np
from scipy import stats
from skimage.util import view_as_blocks
# Open image and make into Numpy array
im = Image.open('start.png')
na = np.array(im)
# Make a view as 3x3 blocks - crop anything not a multiple of 3
block_shape=(3,3)
view = view_as_blocks(na[:747,:], block_shape)
flatView = view.reshape(view.shape[0], view.shape[1], -1) # now (249,303,9)
# Get median of each 3x3 block
resMedian = np.median(flatView, axis=2).astype(np.uint8)
Image.fromarray(resMedian*60).save('resMedian.png') # arbitrary scaling by 60 for contrast
# Get mode of each 3x3 block
resMode = stats.mode(flatView, axis=2)[0].reshape((249,303)).astype(np.uint8)
Image.fromarray(resMode*60).save('resMode.png') # arbitrary scaling by 60 for contrast
Here is the result of the median filter:
And here is the result of the "mode" filter which is indeed better IMHO:
Here is animated comparison:
If anyone wants to take the code and adapt it to try new ideas, please feel free!

How does cv2.merge((r,g,b)) works?

I am trying to do a linear filter on an image with RGB colors. I found a way to do that is by splitting the image to different color layers and then merge them.
i.e.:
cv2.split(img)
Sobel(b...)
Sobel(g...)
Sobel(r...)
cv2.merge((b,g,r))
I want to find out how cv2.merge((b,g,r)) works and how the final image will be constructed.

cv2.merge takes single channel images and combines them to make a multi-channel image. You've run the Sobel edge detection algorithm on each channel on its own. You are then combining the results together into a final output image. If you combine the results together, it may not make sense visually at first but what you would be displaying are the edge detection results of all three planes combined into a single image.
Ideally, hues of red will tell you the strength of the edge detection in the red channel, hues of green giving the strength of the detection for the green channel, and finally blue hues for the strength of detection in the blue.
Sometimes this is a good debugging tool so that you can semantically see all of the edge information for each channel in a single image. However, this will most likely be very hard to interpret for very highly complicated images with lots of texture and activity.
What is more usually done is to actually do an edge detection using a colour edge detection algorithm, or convert the image to grayscale and do the detection on that image instead.
As an example of the former, one can decompose the RGB image into HSV and use the colour information in this space to do a better edge detection. See this answer by Micka: OpenCV Edge/Border detection based on color.

This is my understanding. In OpenCV the function split() will take in the paced image input (being a multi-channel array) and split it into several separate single-channel arrays.
Within an image, each pixel has a spot sequentially within an array with each pixel having its own array to denote (r,g and b) hence the term multi channel. This set up allows any type of image such as bgr, rgb, or hsv to be split using the same function.
As Example (pretend these are separate examples so no variables are being overwritten)
b,g,r = cv2.split(bgrImage)
r,g,b = cv2.split(rgbImage)
h,s,v = cv2.split(hsvImage)
Take b,g,r arrayts for example. Each is a single channel array contains a portion of the split rgb image.
This means the image is being split out into three separate arrays:
rgbImage[0] = [234,28,19]
r[0] = 234
g[0] = 28
b[0] = 19
rgbImage[41] = [119,240,45]
r[41] = 119
g[14] = 240
b[14] = 45
Merge does the reverse by taking several single channel arrays and merging them together:
newRGBImage = cv2.merge((r,g,b))
the order in which the separated channels are passed through become important with this function.
Pseudo code:
cv2.merge((r,g,b)) != cv2.merge((b,g,r))
As an aside: Cv2.split() is an expensive function and the use of numpy indexing is must more efficient.
For more information check out opencv python tutorials

Comparing and plotting regions of the same color over a dataset of a few hundred images

A chem student asked me for help with plotting image segmenetation:
A stationary camera takes a picture of the experimental setup every second over a period of a few minutes, so like 300 images yield.
The relevant parts in the setup are two adjacent layers of differently-colored foams observed from the side, a 2-color sandwich shrinking from both sides, basically, except one of the foams evaporates a bit faster.
I'd like to segment each of the images in the way that would let me plot both foam regions' "width" against time.
Here is a "diagram" :)
I want to go from here --> To here
Ideally, given a few hundred of such shots, in which only the widths change, I get an array of scalars back that I can plot. (Going to look like a harmonic series on either side of the x-axis)
I have a bit of python and matlab experience, but have never used OpenCV or Image Processing toolbox in matlab, or actually never dealt with any computer vision in general. Could you guys throw like a roadmap of what packages/functions to use or steps one should take and i'll take it from there?
I'm not sure how to address these things:
-selecting at which slice along the length of the slice the algorithm measures the width(i.e. if the foams are a bit uneven), although this can be ignored.
-which library to use to segment regions of the image based on their color, (some k-means shenanigans probably), and selectively store the spatial parameters of the resulting segments?
-how to iterate that above over a number of files.
Thank you kindly in advance!

Assume your Intensity will be different after converting into gray scale ( if not, just convert to other color space like HSV or LAB, then just use one of the components)
img = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
First, Threshold your grayscaled input into a few bands
ret,thresh1 = cv.threshold(img,128,255,cv.THRESH_BINARY)
ret,thresh2 = cv.threshold(img,27,255,cv.THRESH_BINARY_INV)
ret,thresh3 = cv.threshold(img,77,255,cv.THRESH_TRUNC)
ret,thresh4 = cv.threshold(img,97,255,cv.THRESH_TOZERO)
ret,thresh5 = cv.threshold(img,227,255,cv.THRESH_TOZERO_INV)
The value should be tested out by your actual data. Here Im just give a example
Clean up the segmented image using median filter with a radius larger than 9. I do expect some noise. You can also use ROI here to help remove part of noise. But personally I`m lazy, I just wrote program to handle all cases and angle
threshholed_images_aftersmoothing = cv2.medianBlur(threshholed_images,9)
Each band will be corresponding to one color (layer). Now you should have N segmented image from one source. where N is the number of layers you wish to track
Second use opencv function bounding rect to find location and width/height of each Layer AKA each threshholed_images_aftersmoothing. Eg. boundingrect on each sub-segmented images.
C++: Rect boundingRect(InputArray points)
Python: cv2.boundingRect(points) → retval¶
Last, the rect have x,y, height and width property. You can use a simple sorting order to sort from top to bottom layer based on rect attribute x. Run though all vieo to obtain the x(layer id) , height vs time graph.
Rect API
Public Attributes
_Tp **height** // this is what you are looking for
_Tp width
_Tp **x** // this tells you the position of the band
_Tp y
By plot the corresponding heights (|AB| or |CD|) over time, you can obtain the graph you needed.
The more correct way is to use Kalman filter to track the position and height graph as I would expect some sort of bubble will occur and will interfere with the height of the layers.
To be honest, i didnt expect a chem student to be good at this. Haha good luck
Anything wrong you can find me here or Email me if i`m not watching stackoverflow

You can select a region of interest straight down the middle of the foams, a few pixels wide. If you stack these regions for each image it will show the shrink over time.
If for example you use 3 pixel width for the roi, the result of 300 images will be a 900 pixel wide image, where the left is the start of the experiment and the right is the end. The following image can help you understand:
Though I have not fully tested it, this code should work. Note that there must only be images in the folder you reference.
import cv2
import numpy as np
import os
# path to folder that holds the images
path = '.'
# dimensions of roi
x = 0
y = 0
w = 3
h = 100
# store references to all images
all_images = os.listdir(path)
# sort images
all_images.sort()
# create empty result array
result = np.empty([h,0,3],dtype=np.uint8)
for image in all_images:
# load image
img = cv2.imread(path+'/'+image)
# get the region of interest
roi = img[y:y+h,x:x+w]
# add the roi to previous results
result = np.hstack((result,roi))
# optinal: save result as image
# cv2.imwrite('result.png',result)
# display result - can also plot with matplotlib
cv2.imshow('Result', result)
cv2.waitKey(0)
cv2.destroyAllWindows()
Update after question edit:
If the foams have different colors, your can use easily separate them by color by converting the image you hsv and using inrange (example). This creates a mask (=2D array with values from 0-255, one for each pixel) that you can use to calculate average height and extract the parameters and area of the image.
You can find a script to help you find the HSV colors for separation on this GitHub

Counting particles using image processing in python

Is there any good algorithm for detecting particles on a changing background intensity?
For example, if I have the following image:
Is there a way to count the small white particles, even with the clearly different background that appears towards the lower left?
To be a little more clear, I would like to label the image and count the particles with an algorithm that finds these particles to be significant:
I have tried many things with the PIL, cv , scipy , numpy , etc. modules.
I got some hints from this very similar SO question, and it appears at first glance that you could take a simple threshold like so:
im = mahotas.imread('particles.jpg')
T = mahotas.thresholding.otsu(im)
labeled, nr_objects = ndimage.label(im>T)
print nr_objects
pylab.imshow(labeled)
but because of the changing background you get this:
I have also tried other ideas, such as a technique I found for measuring paws, which I implemented in this way:
import numpy as np
import scipy
import pylab
import pymorph
import mahotas
from scipy import ndimage
import cv
def detect_peaks(image):
"""
Takes an image and detect the peaks usingthe local maximum filter.
Returns a boolean mask of the peaks (i.e. 1 when
the pixel's value is the neighborhood maximum, 0 otherwise)
"""
# define an 8-connected neighborhood
neighborhood = ndimage.morphology.generate_binary_structure(2,2)
#apply the local maximum filter; all pixel of maximal value
#in their neighborhood are set to 1
local_max = ndimage.filters.maximum_filter(image, footprint=neighborhood)==image
#local_max is a mask that contains the peaks we are
#looking for, but also the background.
#In order to isolate the peaks we must remove the background from the mask.
#we create the mask of the background
background = (image==0)
#a little technicality: we must erode the background in order to
#successfully subtract it form local_max, otherwise a line will
#appear along the background border (artifact of the local maximum filter)
eroded_background = ndimage.morphology.binary_erosion(background, structure=neighborhood, border_value=1)
#we obtain the final mask, containing only peaks,
#by removing the background from the local_max mask
detected_peaks = local_max - eroded_background
return detected_peaks
im = mahotas.imread('particles.jpg')
imf = ndimage.gaussian_filter(im, 3)
#rmax = pymorph.regmax(imf)
detected_peaks = detect_peaks(imf)
pylab.imshow(pymorph.overlay(im, detected_peaks))
pylab.show()
but this gives no luck either, showing this result:
Using the regional max function, I get images which almost appear to be giving correct particle identification, but there are either too many, or too few particles in the wrong spots depending on my gaussian filtering (images have gaussian filter of 2,3, & 4):
Also, it would need to work on images similar to this as well:
This is the same type of image above, just at a much higher density of particles.
EDIT: Solved solution: I was able to get a decent working solution to this problem using the following code:
import cv2
import pylab
from scipy import ndimage
im = cv2.imread('particles.jpg')
pylab.figure(0)
pylab.imshow(im)
gray = cv2.cvtColor(im, cv2.COLOR_BGR2GRAY)
gray = cv2.GaussianBlur(gray, (5,5), 0)
maxValue = 255
adaptiveMethod = cv2.ADAPTIVE_THRESH_GAUSSIAN_C#cv2.ADAPTIVE_THRESH_MEAN_C #cv2.ADAPTIVE_THRESH_GAUSSIAN_C
thresholdType = cv2.THRESH_BINARY#cv2.THRESH_BINARY #cv2.THRESH_BINARY_INV
blockSize = 5 #odd number like 3,5,7,9,11
C = -3 # constant to be subtracted
im_thresholded = cv2.adaptiveThreshold(gray, maxValue, adaptiveMethod, thresholdType, blockSize, C)
labelarray, particle_count = ndimage.measurements.label(im_thresholded)
print particle_count
pylab.figure(1)
pylab.imshow(im_thresholded)
pylab.show()
This will show the images like this:
(which is the given image)
and
(which is the counted particles)
and calculate the particle count as 60.

I had solved the "variable brightness in background" by using a tuned difference threshold with a technique called Adaptive Contrast. It works by performing a linear combination (a difference, in the case) of a grayscale image with a blurred version of itself, then applying a threshold to it.
Convolve the image with a suitable statistical operator.
Subtract the original from the convolved image, correcting intensity scale/gamma if necessary.
Threshold the difference image with a constant.
(original paper)
I did this very successfully with scipy.ndimage, in the floating-point domain (way better results than integer image processing), like this:
original_grayscale = numpy.asarray(some_PIL_image.convert('L'), dtype=float)
blurred_grayscale = scipy.ndimage.filters.gaussian_filter(original_grayscale, blur_parameter)
difference_image = original_grayscale - (multiplier * blurred_grayscale);
image_to_be_labeled = ((difference_image > threshold) * 255).astype('uint8') # not sure if it is necessary
labelarray, particle_count = scipy.ndimage.measurements.label(image_to_be_labeled)
Hope this helps!!

I cannot really give a definite answer, but here are a few pointers:
The function mahotas.morph.regmax might be better than the maximum filter as it removes pseudo-maxima. Perhaps combine this with a global threshold, with a local threshold (such as the mean over a window) or both.
If you have several images and the same uneven background, then maybe you can compute an average background and normalize against that, or use empty images as your estimate of background. This would be the case if you have a microscope, and like every microscope I've seen, the illumination is uneven.
Something like:
average = average_of_many(images)
# smooth it
average = mahotas.gaussian_filter(average,24)
Now you preprocess your images, like:
preproc = image/average
or something like that.

Develop Reference

Python is a programming language that lets you work quickly and integrate systems more effectively.