I am working with a dataset that contains .mha files. I want to convert these files to either png/tiff for some work. I am using the MedPy library for the conversion.
from medpy.io import load, save

image_data, image_header = load('image_path/c0001.mha')
save(image_data, 'image_save_path/new_image.png', image_header)
I can actually convert the image into png/tiff format, but the converted image comes out very dark. How can I convert the images successfully?
Your data is clearly limited to 12 bits (white is 2**12-1, i.e., 4095), while a PNG image in this context is 16 bits (white is 2**16-1, i.e., 65535). For this reason your PNG image is so dark that it appears almost black (but if you look closely it isn't).
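As a quick sanity check (my own addition, not part of the original answer), you can print the value range after loading; for 12-bit data the maximum should be at most 4095:

from medpy.io import load

image_data, image_header = load('image_path/c0001.mha')
# 12-bit data tops out at 4095 (2**12 - 1), far below the 65535
# that a 16-bit PNG viewer treats as white.
print(image_data.dtype, image_data.min(), image_data.max())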
The most precise transformation you can apply is the following:
import numpy as np
from medpy.io import load, save

def convert_to_uint16(data, source_max):
    target_max = 65535  # 2 ** 16 - 1
    # build a linear lookup table (LUT) indexed from 0 to source_max
    source_range = np.arange(source_max + 1)
    lut = np.round(source_range * target_max / source_max).astype(np.uint16)
    # apply it
    return lut[data]
image_data, image_header = load('c0001.mha')
new_image_data = convert_to_uint16(image_data, 4095) # 2 ** 12 - 1
save(new_image_data, 'new_image.png', image_header)
N.B.: new_image_data = image_data * 16 corresponds to replacing 65535 with 65520 (4095 * 16) in convert_to_uint16
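To see the (small) difference between the two mappings, here is a quick check of my own using the convert_to_uint16 defined above:

sample = np.array([0, 2048, 4095])
print(convert_to_uint16(sample, 4095))  # [    0 32776 65535]
print(sample * 16)                      # [    0 32768 65520]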
You may apply "contrast stretching".
The dynamic range of image_data is about [0, 4095] - the minimum value is about 0, and the maximum value is about 4095 (2^12-1).
You are saving the image as a 16-bit PNG.
When you display the PNG file, the viewer assumes the maximum value is 2^16-1 (the dynamic range of 16 bits is [0, 65535]).
The viewer treats 0 as black and 2^16-1 as white, and scales the values in between linearly.
In your case the white pixels have a value of about 4095, which is only about 6% of 65535, so they translate to a very dark gray in the [0, 65535] range.
The simplest solution is to multiply image_data by 16:
from medpy.io import load, save
image_data, image_header = load('image_path/c0001.mha')
save(image_data*16, 'image_save_path/new_image.png', image_header)
A more complicated solution is applying linear "contrast stretching".
We may map the lower 1% of all pixels to 0, the upper 1% of the pixels to 2^16-1, and scale the pixels in between linearly.
import numpy as np
from medpy.io import load, save
image_data, image_header = load('image_path/c0001.mha')
tmp = image_data.copy()
tmp[tmp == 0] = np.median(tmp) # Ignore zero values by replacing them with median value (there are a lot of zeros in the margins).
tmp = tmp.astype(np.float32) # Convert to float32
# Get the value of lower and upper 1% of all pixels
lo_val, up_val = np.percentile(tmp, (1, 99)) # (for current sample: lo_val = 796, up_val = 3607)
# Linear stretching: Lower 1% goes to 0, upper 1% goes to 2^16-1, other values are scaled linearly
# Clip to range [0, 2^16-1], round and convert to uint16
# https://stackoverflow.com/questions/49656244/fast-imadjust-in-opencv-and-python
img = np.round(((tmp - lo_val)*(65535/(up_val - lo_val))).clip(0, 65535)).astype(np.uint16) # (for current sample: subtract 796 and scale by 23.31)
img[image_data == 0] = 0 # Restore the original zeros.
save(img, 'image_save_path/new_image.png', image_header)
The above method enhances the contrast but loses some of the original information.
In case you want even higher contrast, you may use non-linear methods, which improve visibility at the cost of some "integrity".
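For instance, here is a minimal sketch of one such non-linear method (gamma correction; the specific choice is my assumption, since the answer doesn't name one), applied to the stretched img from the code above:

# Raise normalized intensities to a power < 1 to brighten dark regions.
gamma = 0.5
normalized = img.astype(np.float32) / 65535
img_gamma = np.round((normalized ** gamma) * 65535).astype(np.uint16)
save(img_gamma, 'image_save_path/new_image_gamma.png', image_header)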
Here is the "linear stretching" result (downscaled):
Related
After training a model (image classification), I would like to see how differently it performs when I evaluate a proper image versus various noised versions of it.
The type of noise I'm thinking of is a random change in pixel values. I tried this approach:
# --Inside the generator function that I provide to model.predict_generator--
# dataset is a numpy array with denoised images path
dt = tf.data.Dataset.from_generator(lambda: image_generator(dataset), output_types=(tf.float32))

def image_generator(image_paths):
    for path in image_paths:
        # im is keras.preprocessing image
        img = im.load_img(path,
                          color_mode='rgb',
                          target_size=(224, 224))
        img_to_numpy = np.array(img)
        for _ in range(0, 5):
            tmp_numpy_image = img_to_numpy.copy()
            for i in range(tmp_numpy_image.shape[0]):
                for j in range(tmp_numpy_image.shape[1]):
                    # add noise
                    tmp_numpy_image[i][j] = ...
            yield tmp_numpy_image
This process works fine, but it is very slow. I also use dataset.batch and dataset.prefetch on dt, and I didn't find a combination of their values that reduces the running time.
Is there a smarter way to do it? I tried yielding the images without noise and adding the noise later inside dataset.map. The problem is that inside map I have to manipulate tensors, and I didn't find a way to change each pixel value.
SOLUTION
I used @Marat's approach and it worked like a charm; the whole process went from 20-30 hours to minutes. My noise was a simple ±1, but I didn't want to overflow (255 + 1 = 0 in uint8), and therefore I only had to add numpy masks:
...
tmp_numpy_image = img_to_numpy.copy()
# randint's upper bound is exclusive, so use 2 to draw from {-1, 0, 1}
noise = np.random.randint(-1, 2, img_to_numpy.shape)
# tmp_numpy_image will become of type int32
tmp_numpy_image = tmp_numpy_image + noise
np.putmask(tmp_numpy_image, tmp_numpy_image < 0, 0)
np.putmask(tmp_numpy_image, tmp_numpy_image > 255, 255)
tmp_numpy_image = tmp_numpy_image.astype('uint8')
yield tmp_numpy_image
The biggest overhead here is pixel operations (double for loop). Vectorizing it should result in substantial speedup:
noise_magnitude = 10
...
img_max_value = img_to_numpy.max() * np.ones(img_to_numpy.shape)
for _ in range(0, 5):
    # depending on range of values, you might want to adjust noise magnitude
    noise = np.random.randint(0, noise_magnitude, img_to_numpy.shape)
    # after adding noise, cap values exceeding the original max
    # (np.minimum, so the image max acts as an upper bound)
    yield np.minimum(img_to_numpy + noise, img_max_value)
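For reference, the asker's other idea (adding the noise inside dataset.map) can also work once the noise is applied to the whole tensor at once instead of per pixel. A sketch of my own, assuming uint8 images in [0, 255]:

import tensorflow as tf

def add_noise(image):
    # minval is inclusive, maxval is exclusive, so this draws from {-1, 0, 1}
    noise = tf.random.uniform(tf.shape(image), minval=-1, maxval=2, dtype=tf.int32)
    noisy = tf.cast(image, tf.int32) + noise
    # clip back into the valid range before converting to uint8
    return tf.cast(tf.clip_by_value(noisy, 0, 255), tf.uint8)

dt = dt.map(add_noise, num_parallel_calls=tf.data.AUTOTUNE)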
I want to create a salt and pepper noise function.
The input is noise_density, i.e. the amount of pixels that should be noise in the output image, and the return value should be the noisy image data source:
def salt_pepper(noise_density):
    noisesource = ColumnDataSource(data={'image': [noiseImage]})
    return noisesource
This function returns an image that is [density]x[density] pixels, using numpy to generate a random array and using PIL to generate the image itself from the array.
def salt_pepper(density):
    imarray = numpy.random.rand(density, density, 3) * 255
    return Image.fromarray(imarray.astype('uint8')).convert('L')
Now, for example, you could run
salt_pepper(500)
to generate an image that is 500x500 px.
Of course, make sure to
import numpy
from PIL import Image
I came up with a vectorized solution which I'm sure can be improved/simplified. Although the interface is not exactly as the requested one, the code is pretty straightforward (and fast 😬) and I'm sure it can be easily adapted.
import numpy as np
from PIL import Image

def salt_and_pepper(image, prob=0.05):
    # If the specified `prob` is negative or zero, we don't need to do anything.
    if prob <= 0:
        return image

    arr = np.asarray(image)
    original_dtype = arr.dtype

    # Derive the number of intensity levels from the array datatype.
    intensity_levels = 2 ** (arr[0, 0].nbytes * 8)

    min_intensity = 0
    max_intensity = intensity_levels - 1

    # Generate an array with the same shape as the image's:
    # Each entry will have:
    #   1 with probability: 1 - prob
    #   0 or np.nan (50% each) with probability: prob
    random_image_arr = np.random.choice(
        [min_intensity, 1, np.nan], p=[prob / 2, 1 - prob, prob / 2], size=arr.shape
    )

    # This results in an image array with the following properties:
    # - With probability 1 - prob: the pixel KEEPS ITS VALUE (it was multiplied by 1)
    # - With probability prob/2: the pixel has value zero (it was multiplied by 0)
    # - With probability prob/2: the pixel has value np.nan (it was multiplied by np.nan)
    # We need `arr.astype(np.float64)` to make sure np.nan is a valid value.
    salt_and_peppered_arr = arr.astype(np.float64) * random_image_arr

    # Since we want SALT instead of NaN, we replace it.
    # We cast the array back to its original dtype so we can pass it to PIL.
    salt_and_peppered_arr = np.nan_to_num(
        salt_and_peppered_arr, nan=max_intensity
    ).astype(original_dtype)

    return Image.fromarray(salt_and_peppered_arr)
You can load a black and white version of Lena like so:
lena = Image.open("lena.ppm")
bwlena = Image.fromarray(np.asarray(lena).mean(axis=2).astype(np.uint8))
Finally, you can save a couple of examples:
salt_and_pepper(bwlena, prob=0.1).save("sp01lena.png", "PNG")
salt_and_pepper(bwlena, prob=0.3).save("sp03lena.png", "PNG")
Results:
https://i.ibb.co/J2y9HXS/sp01lena.png
https://i.ibb.co/VTm5Vy2/sp03lena.png
I am working with large NetCDF4 files (about 1 GB and up, but less than my 8 GB of memory for now). 99% of the time the data type will be a float32. I want to map these values to an array of RGB colors, which I will then write to a binary file to be read by another application for viewing. Because I only need 1 byte for each of R, G, and B, I want an array of np.uint8 to represent this. In the end the array will take up 25% less space than the floats. However, as the original data is big, I don't want to keep both the original data and the color data in memory at the same time. For now I provide a color for the low value and a color for the high value. The problem is that, for a short period of time, my program holds the color data as floats instead of np.uint8, which takes up 3 times as much memory as the original data. Is there a way to skip the float conversion, or at least only have one float array in memory at a time, so that I don't use this much memory? I have provided the relevant code below:
from netCDF4 import Dataset
import numpy as np
import dask.array as da
import gc
import time
import sys
# Read file path
file_path = sys.argv[1]
# Default colors is blue for low and red for high
lowColor = np.array([0, 0, 255], dtype=int)
highColor = np.array([255, 0, 0], dtype=int)
data = Dataset(file_path)
allVariables = data.variables
# Sometimes we have time_bnds, lat_bnds, etc.
# Keep anything that doesn't have 'bnds'
varNames = list(filter(lambda x: 'bnds' not in x, list(allVariables.keys())))
# Remove the dimensions
varNames = list(filter(lambda x: x not in data.dimensions, varNames))
var = varNames[0]
flattened = allVariables[var][:].flatten()
origShape = allVariables[var].shape
if isinstance(flattened, np.ma.core.MaskedArray):
    flattened = flattened.filled(np.nan)
# Find the minimum value and the range of values.
# Using these two we can make a percentage of how
# far 'up' each value and simply convert colors
# based on that. Because there's a chance of the data
# having NaNs, I can't use ptp().
lowVal = np.nanmin(flattened)
ptp = np.nanmax(flattened) - lowVal
# Subtract the min from each value and divide by ptp
# and add a dimension for dot product later.
percents = ((flattened - lowVal) / ptp)[np.newaxis, :]
# Remove flattened from memory as it is not needed anymore
flattened = None
gc.collect()
# Calculate the color difference
diff = (highColor - lowColor)[np.newaxis, :].T
# Do the dot product to create a list of colors
# Transpose so each color is each row. Also
# add the low color
colors = lowColor + np.dot(diff, percents).T # All floats here
# Round each value and cast to uint8 and finally reshape to
# the original data
colors = np.round(colors).astype(np.uint8)
colors = colors.reshape(origShape + (3,))
colors.tofile('colors_' + allVariables[var].name + '.bin')
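One way to cap the float overhead (a sketch of my own, not from the original post; chunk_size is arbitrary and NaN handling is left out) is to process the flattened data in fixed-size chunks, so that only one chunk's worth of floats exists at any time:

def colors_in_chunks(flattened, low_val, ptp, low_color, high_color,
                     chunk_size=1_000_000):
    # Preallocate the final uint8 output (N x 3) up front.
    out = np.empty((flattened.shape[0], 3), dtype=np.uint8)
    diff = (high_color - low_color).astype(np.float64)
    for start in range(0, flattened.shape[0], chunk_size):
        chunk = flattened[start:start + chunk_size]
        # Only this chunk is ever held as floats.
        percents = (chunk - low_val) / ptp
        colors = low_color + percents[:, np.newaxis] * diff
        out[start:start + chunk_size] = np.round(colors).astype(np.uint8)
    return out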
I already achieved the goal described in the title but I was wondering if there was a more efficient (or generally better) way to do it. First of all let me introduce the problem.
I have a set of images of different sizes, but with a width/height ratio less than or equal to 2 (it could be anything, but let's say 2 for now). I want to normalize them, meaning I want all of them to have the same size. Specifically, I am going to do it like this:
Extract the max height above all images
Zoom the image so that each image reaches the max height keeping its ratio
Add a padding to the right with just white pixels until the image has a width/height ratio of 2
Keep in mind the images are represented as numpy matrices of grey scale values [0,255].
This is how I'm doing it now in Python:
max_height = numpy.max([len(obs) for obs in data if len(obs[0])/len(obs) <= 2])

for obs in data:
    if len(obs[0])/len(obs) <= 2:
        new_img = ndimage.zoom(obs, round(max_height/len(obs), 2), order=3)
        missing_cols = max_height * 2 - len(new_img[0])
        norm_img = []
        for row in new_img:
            norm_img.append(np.pad(row, (0, missing_cols), mode='constant', constant_values=255))
        norm_img = np.resize(norm_img, (max_height, max_height*2))
There's a note about this code:
I'm rounding the zoom ratio because that makes the final height equal to max_height. I'm sure this is not the best approach, but it works (any suggestion is appreciated here). What I'd like to do is expand the image, keeping its ratio, until it reaches a height equal to max_height. This is the only solution I found so far, and it worked right away; the interpolation works pretty well.
So my final questions are:
Is there a better approach to achieve what is explained above (image normalization)? Do you think I could have done this differently? Is there a common good practice I'm not following?
Thanks in advance for your time.
Instead of ndimage.zoom you could use scipy.misc.imresize. This function allows you to specify the target size as a tuple, instead of by zoom factor. Thus you won't have to call np.resize later to get the size exactly as desired.
Note that scipy.misc.imresize calls PIL.Image.resize under the hood, so PIL (or Pillow) is a dependency.
Instead of using np.pad in a for-loop, you could allocate space for the desired array, norm_arr, first:
norm_arr = np.full((max_height, max_width), fill_value=255)
and then copy the resized image, new_arr into norm_arr:
nh, nw = new_arr.shape
norm_arr[:nh, :nw] = new_arr
For example,
from __future__ import division
import numpy as np
from scipy import misc

data = [np.linspace(255, 0, i*10).reshape(i, 10)
        for i in range(5, 100, 11)]

max_height = np.max([len(obs) for obs in data if len(obs[0])/len(obs) <= 2])
max_width = 2*max_height

result = []
for obs in data:
    norm_arr = obs
    h, w = obs.shape
    if float(w)/h <= 2:
        scale_factor = max_height/float(h)
        target_size = (max_height, int(round(w*scale_factor)))
        new_arr = misc.imresize(obs, target_size, interp='bicubic')
        norm_arr = np.full((max_height, max_width), fill_value=255)
        # check the shapes
        # print(obs.shape, new_arr.shape, norm_arr.shape)
        nh, nw = new_arr.shape
        norm_arr[:nh, :nw] = new_arr
    result.append(norm_arr)
    # visually check the result
    # misc.toimage(norm_arr).show()
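Note that scipy.misc.imresize has since been removed from SciPy. A rough stand-in using Pillow directly (a sketch of my own; unlike imresize it does not rescale the output to uint8 [0, 255]):

import numpy as np
from PIL import Image

def imresize(arr, target_size, interp='bicubic'):
    # Map the interp names used above onto Pillow resampling filters.
    resample = {'nearest': Image.NEAREST, 'bilinear': Image.BILINEAR,
                'bicubic': Image.BICUBIC}[interp]
    # Pillow expects (width, height); target_size above is (height, width).
    img = Image.fromarray(np.asarray(arr, dtype=np.float32))
    return np.asarray(img.resize(target_size[::-1], resample=resample))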
I'm using the ImageHash library to generate the perceptual hash of an image. The library claims to be able to generate hashes of different sizes (64, 128, 256), but I can't figure out how to get a 128-bit hash.
The hash size is determined by the image size when the library rescales it, for example:
def average_hash(image, hash_size=8):
    image = image.convert("L").resize((hash_size, hash_size), Image.ANTIALIAS)
here the default value is 8 (8x8 image = 64 pixels -> grayscale -> 64 bits).
However, how is a 128-bit hash created?
Second thing: the default size of a pHash is 32, as explained here, but later only the DCT of the top-left 8x8 section is calculated, so it's again 64 bits. The DCT is calculated through scipy.fftpack:
def phash(image, hash_size=32):
    image = image.convert("L").resize((hash_size, hash_size), Image.ANTIALIAS)
    pixels = numpy.array(image.getdata(), dtype=numpy.float).reshape((hash_size, hash_size))
    dct = scipy.fftpack.dct(pixels)
    dctlowfreq = dct[:8, 1:9]
    avg = dctlowfreq.mean()
    diff = dctlowfreq > avg
    return ImageHash(diff)
How can the hash size be changed?
Whichever value is used, the calculation will always be based on the top-left 8x8 section, so it will always be 64 bits!
A strange thing happens if I start with a pHash of size 8 (resizing the image from the beginning): I get a final hash of 56 bits (namely, the hash of a 7x8 image). I don't understand why this happens in the DCT computation, but I really know little about it.
It looks like that was a bug in the library that has since been fixed. The current implementation of phash looks like this:
def phash(image, hash_size=8, highfreq_factor=4):
    """
    Perceptual Hash computation.

    Implementation follows http://www.hackerfactor.com/blog/index.php?/archives/432-Looks-Like-It.html

    @image must be a PIL instance.
    """
    if hash_size < 2:
        raise ValueError("Hash size must be greater than or equal to 2")

    import scipy.fftpack
    img_size = hash_size * highfreq_factor
    image = image.convert("L").resize((img_size, img_size), Image.ANTIALIAS)
    pixels = numpy.asarray(image)
    dct = scipy.fftpack.dct(scipy.fftpack.dct(pixels, axis=0), axis=1)
    dctlowfreq = dct[:hash_size, :hash_size]
    med = numpy.median(dctlowfreq)
    diff = dctlowfreq > med
    return ImageHash(diff)
You'll notice this correctly uses hash_size rather than hardcoded values, so dctlowfreq is hash_size x hash_size and the resulting hash has hash_size² bits. The old hardcoded slice dct[:8, 1:9] also explains the 56-bit result above: with hash_size=8 the DCT array is only 8 columns wide, so the column slice 1:9 clips to 7 columns, leaving an 8x7 block, i.e. 56 bits.
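So with the fixed version the hash length is controlled by hash_size squared; for example (a quick sketch of my own, file name hypothetical):

from PIL import Image
import imagehash

img = Image.open('some_image.png')         # hypothetical file
h64 = imagehash.phash(img)                 # hash_size=8  ->  8*8 = 64 bits
h256 = imagehash.phash(img, hash_size=16)  # 16*16 = 256 bits

Since the low-frequency block is square, the bit count is always a perfect square, which may be why exactly 128 bits is not directly reachable this way.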