I am working with a dataset that contains .mha files, and I want to convert these files to PNG or TIFF for further work. I am using the MedPy library for the conversion.
from medpy.io import load, save
image_data, image_header = load('image_path/c0001.mha')
save(image_data, 'image_save_path/new_image.png', image_header)
I can actually convert the image into png/tiff format, but the converted image turns dark after the conversion. I am attaching the screenshot below. How can I convert the images successfully?
Your data is clearly limited to 12 bits (white is 2**12-1, i.e., 4095), while a PNG image in this context is 16 bits (white is 2**16-1, i.e., 65535). For this reason your PNG image is so dark that it appears almost black (but if you look closely it isn't).
The most precise transformation you can apply is the following:
import numpy as np
from medpy.io import load, save

def convert_to_uint16(data, source_max):
    target_max = 65535  # 2 ** 16 - 1
    # build a linear lookup table (LUT) indexed from 0 to source_max
    source_range = np.arange(source_max + 1)
    lut = np.round(source_range * target_max / source_max).astype(np.uint16)
    # apply it
    return lut[data]

image_data, image_header = load('c0001.mha')
new_image_data = convert_to_uint16(image_data, 4095)  # 2 ** 12 - 1
save(new_image_data, 'new_image.png', image_header)
N.B.: new_image_data = image_data * 16 corresponds to replacing 65535 with 65520 (4095 * 16) in convert_to_uint16
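A quick way to see the difference between the two scalings is to run both on a few 12-bit values. This is only a sketch: the sample array is made up for illustration and stands in for image_data, and the LUT function is repeated here so the snippet runs on its own.

```python
import numpy as np

def convert_to_uint16(data, source_max):
    target_max = 65535  # 2 ** 16 - 1
    lut = np.round(np.arange(source_max + 1) * target_max / source_max).astype(np.uint16)
    return lut[data]

# Synthetic 12-bit samples (hypothetical stand-in for image_data)
sample = np.array([0, 1, 2048, 4095], dtype=np.uint16)

via_lut = convert_to_uint16(sample, 4095)  # full-range mapping: 4095 -> 65535
via_mul = sample * 16                      # plain multiplication: 4095 -> 65520

print(via_lut[-1], via_mul[-1])
```

Both variants keep black at 0; they differ only in whether the brightest 12-bit value reaches the very top of the 16-bit range.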
You may apply "contrast stretching".
The dynamic range of image_data is about [0, 4095] - the minimum value is about 0, and the maximum value is about 4095 (2^12-1).
You are saving the image as a 16-bit PNG.
When you display the PNG file, the viewer assumes the maximum value is 2^16-1 (the dynamic range of 16 bits is [0, 65535]).
The viewer assumes 0 is black, 2^16-1 is white, and values in between scale linearly.
In your case the white pixels have a value of about 4095, which translates to a very dark gray in the [0, 65535] range.
The simplest solution is to multiply image_data by 16:
from medpy.io import load, save
image_data, image_header = load('image_path/c0001.mha')
save(image_data*16, 'image_save_path/new_image.png', image_header)
A more complicated solution is applying linear "contrast stretching".
We may map the lowest 1% of all pixels to 0, the highest 1% of the pixels to 2^16-1, and scale the pixels in between linearly.
import numpy as np
from medpy.io import load, save
image_data, image_header = load('image_path/c0001.mha')
tmp = image_data.copy()
tmp[tmp == 0] = np.median(tmp) # Ignore zero values by replacing them with median value (there are a lot of zeros in the margins).
tmp = tmp.astype(np.float32) # Convert to float32
# Get the value of lower and upper 1% of all pixels
lo_val, up_val = np.percentile(tmp, (1, 99)) # (for current sample: lo_val = 796, up_val = 3607)
# Linear stretching: Lower 1% goes to 0, upper 1% goes to 2^16-1, other values are scaled linearly
# Clip to range [0, 2^16-1], round and convert to uint16
img = np.round(((tmp - lo_val)*(65535/(up_val - lo_val))).clip(0, 65535)).astype(np.uint16) # (for current sample: subtract 796 and scale by 23.31)
img[image_data == 0] = 0 # Restore the original zeros.
save(img, 'image_save_path/new_image.png', image_header)
The above method enhances the contrast, but loses some of the original information.
In case you want higher contrast, you may use non-linear methods, which improve visibility but lose some "integrity".
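One example of such a non-linear method is gamma correction, sketched below. This is an assumption about what "non-linear" could mean here, not something prescribed above; the function name apply_gamma, the gamma value of 0.5 and the synthetic input are all hypothetical.

```python
import numpy as np

def apply_gamma(data, source_max, gamma=0.5):
    """Map [0, source_max] to uint16 using a power-law (gamma) curve."""
    normalized = np.clip(data.astype(np.float64) / source_max, 0.0, 1.0)
    return np.round(normalized ** gamma * 65535).astype(np.uint16)

# Synthetic 12-bit samples (hypothetical stand-in for image_data)
sample = np.array([0, 1024, 4095], dtype=np.uint16)
stretched = apply_gamma(sample, 4095)
print(stretched)
```

A gamma below 1 brightens dark and mid tones more than a linear scaling would, at the cost of compressing the bright end.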
Here is the "linear stretching" result (downscaled):
Given a list of images, I'd like to create a new image where each pixel contains the values (R,G,B) that occurred most frequently in the input list at that location.
Input: A list L that has length >=2. Each image/object in the list is a float32 numpy array with dimensions (288, 512, 3) where 3 represents the R/G/B color channels.
Output: A numpy array with the same shape (288,512,3). If there is no pixel that occurred most frequently, any of the pixels for that location can be returned.
image = stats.mode(L)[0][0]
The problem with this approach is that it looks at each R/G/B value of a pixel individually. But I want a pixel to only be considered the same as another pixel if all the color channels match (i.e. R1=R2, G1=G2, B1=B2).
Try this:
import numpy as np
import scipy.stats

def packRGB(RGB):
    # bit shifts need an integer dtype: R in bits 0-7, G in bits 8-15, B in bits 16-23
    return np.left_shift(RGB, [0, 8, 16]).sum(-1)

def unpackRGB(i24):
    B = np.right_shift(i24, 16)
    G = np.right_shift(i24, 8) - np.left_shift(B, 8)
    R = i24 - np.left_shift(G, 8) - np.left_shift(B, 16)
    # stack on the last axis so non-square images keep their (height, width) order
    return np.stack([R, G, B], axis=-1)

def img_mode(imgs_list, average_singles=True):
    imgs = np.array(imgs_list)  # (10, 100, 100, 3)
    imgs24 = packRGB(imgs)  # (10, 100, 100)
    mode, count = scipy.stats.mode(imgs24, axis=0)  # (1, 100, 100)
    mode, count = mode.squeeze(), count.squeeze()  # (100, 100)
    if average_singles:
        out = np.empty(imgs.shape[1:])
        out[count == 1] = np.rint(np.average(imgs[:, count == 1], axis=0))
        out[count > 1] = unpackRGB(mode[count > 1])
    else:
        out = unpackRGB(mode)
    return out
EDIT: fixed an error and added the option from your other question: return any value in the set if there is no mode, which should be faster since it needs no division or rounding. scipy.stats.mode returns the lowest value, which in this case will be the pixel with the lowest blue value. You might also want to try the median, as the mode is going to be unstable to very small differences in the inputs (especially if there are only ten).
This will also be a lot slower than, for instance, Photoshop's statistics function (I assume you're trying to do something like this); you'd want to parallelize the function as well to make it time-efficient.
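The packing step is what makes whole pixels comparable: each (R, G, B) triple becomes a single 24-bit integer, so two pixels are equal only when all three channels match. Below is a minimal round-trip check, assuming integer channel values in 0..255. The sample pixels are made up, and the helper names pack_rgb and unpack_rgb are hypothetical; the unpacking uses bit masks rather than subtractions.

```python
import numpy as np

def pack_rgb(rgb):
    # R in bits 0-7, G in bits 8-15, B in bits 16-23
    return np.left_shift(rgb, [0, 8, 16]).sum(-1)

def unpack_rgb(i24):
    r = i24 & 0xFF
    g = (i24 >> 8) & 0xFF
    b = (i24 >> 16) & 0xFF
    return np.stack([r, g, b], axis=-1)

# Synthetic 2x2 image; two of the four pixels are identical
pixels = np.array([[[255, 0, 0], [1, 2, 3]],
                   [[1, 2, 3], [0, 0, 255]]], dtype=np.int64)

packed = pack_rgb(pixels)      # shape (2, 2), one integer per pixel
restored = unpack_rgb(packed)  # shape (2, 2, 3) again
```

Identical pixels end up with identical packed values, which is why a mode taken over the packed array treats a pixel as one unit.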
import vtk
import pickle
from numpy import *
data_matrix = Ilog
dataImporter = vtk.vtkImageImport()
# The previously created array is converted to a string of chars and imported.
data_string = data_matrix.tostring()
dataImporter.CopyImportVoidPointer(data_string, len(data_string))
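# The type of the newly imported data is set to unsigned char (uint8).
dataImporter.SetDataScalarTypeToUnsignedChar()
# The data holds a single value per voxel (as opposed to RGB etc.), and the importer
# must be told this is the case.
dataImporter.SetNumberOfScalarComponents(1)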
# The following two functions describe how the data is stored and the dimensions of the array it is stored in. For this
# simple case, all axes are of length 75 and begins with the first element. For other data, this is probably not the case.
# I have to admit however, that I honestly don't know the difference between SetDataExtent() and SetWholeExtent() although
# VTK complains if not both are used.
dataImporter.SetDataExtent(0, 727, 0, 727, 0, 24)
dataImporter.SetWholeExtent(0, 727, 0, 727, 0, 24)
# The following class is used to store transparency-values for later retrieval. In our case, we want the value 0 to be
# completely opaque whereas the three different cubes are given different transparency-values to show how it works.
alphaChannelFunc = vtk.vtkPiecewiseFunction()
alphaChannelFunc.AddPoint(0, 1)
alphaChannelFunc.AddPoint(0.5, 0.9)
alphaChannelFunc.AddPoint(0.7, 0.9)
alphaChannelFunc.AddPoint(1, 0)
colorFunc = vtk.vtkColorTransferFunction()
colorFunc.AddRGBPoint(0, 1,1,1)
colorFunc.AddRGBPoint(0.2, 0.9, 0.9, 0.9)
colorFunc.AddRGBPoint(0.7, 0.2, 0.2, 0.2)
colorFunc.AddRGBPoint(1, 0, 0, 0)
# The previous two classes stored properties. Because we want to apply these properties to the volume we want to render,
# we have to store them in a class that stores volume properties.
volumeProperty = vtk.vtkVolumeProperty()
volumeProperty.SetColor(colorFunc)
volumeProperty.SetScalarOpacity(alphaChannelFunc)
# This class describes how the volume is rendered (through ray tracing).
compositeFunction = vtk.vtkVolumeRayCastCompositeFunction()
# We can finally create our volume. We also have to specify the data for it, as well as how the data will be rendered.
volumeMapper = vtk.vtkVolumeRayCastMapper()
volumeMapper.SetVolumeRayCastFunction(compositeFunction)
volumeMapper.SetInputConnection(dataImporter.GetOutputPort())
# The class vtkVolume is used to pair the previously declared volume as well as the properties to be used when rendering that volume.
volume = vtk.vtkVolume()
volume.SetMapper(volumeMapper)
volume.SetProperty(volumeProperty)
# With almost everything else ready, it's time to initialize the renderer and window, as well as creating a method for exiting the application
renderer = vtk.vtkRenderer()
renderWin = vtk.vtkRenderWindow()
renderWin.AddRenderer(renderer)
renderInteractor = vtk.vtkRenderWindowInteractor()
renderInteractor.SetRenderWindow(renderWin)
# We add the volume to the renderer ...
renderer.AddVolume(volume)
# ... set background color to white ...
renderer.SetBackground(1, 1, 1)
# ... and set window size.
# A simple function to be called when the user decides to quit the application.
def exitCheck(obj, event):
    if obj.GetEventPending() != 0:
        obj.SetAbortRender(1)
# Tell the application to use the function as an exit check.
renderWin.AddObserver("AbortCheckEvent", exitCheck)
# Because nothing will be rendered without any input, we order the first render manually before control is handed over to the main-loop.
renderInteractor.Initialize()
renderWin.Render()
renderInteractor.Start()
Ilog is a logical matrix of size 728x728x25 whose cross-section looks like this:
In this image the red color signifies the value 1 and the blue color signifies the value 0.
But when the above code is run, the output is always a box like this:
The matrix contains just zeros and ones. Following that logic, the zeros should be given full transparency and the ones full opacity.
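One way to check what the opacity function in the code above actually assigns to each voxel value is to reproduce its control points outside VTK. The sketch below assumes that vtkPiecewiseFunction interpolates linearly between the points added with AddPoint, and uses NumPy's np.interp as a stand-in; it does not call into VTK, and the helper name opacity is hypothetical.

```python
import numpy as np

# (value, opacity) control points copied from the AddPoint calls in the posted code
points_value = [0.0, 0.5, 0.7, 1.0]
points_opacity = [1.0, 0.9, 0.9, 0.0]

def opacity(value):
    """Linearly interpolated opacity for a scalar voxel value."""
    return float(np.interp(value, points_value, points_opacity))

opacity_of_zero = opacity(0)
opacity_of_one = opacity(1)
print(opacity_of_zero, opacity_of_one)
```

If that assumption holds, these points make the value 0 fully opaque and the value 1 fully transparent. That is the reverse of the intent described above, and would be consistent with the volume showing up as a solid box.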
I have a problem in which I have a bunch of images for which I have to generate histograms. But I have to generate a histogram for each pixel, i.e., for a collection of n images, I have to count the values that pixel (0,0) assumed and generate a histogram, then do the same for (0,1), (0,2) and so on. I coded the following method to do this:
import math

class ImageData:
    def generate_pixel_histogram(self, images, bins):
        """
        Generate a histogram of the image for each pixel, counting
        the values assumed for each pixel in a specified bins
        """
        # NOTE: the name of the attribute holding the pixel matrix was lost from this post; ``data`` is assumed.
        max_value = 0.0
        min_value = 0.0
        for i in range(len(images)):
            image = images[i]
            max_entry = max(max(p[1:]) for p in image.data)
            min_entry = min(min(p[1:]) for p in image.data)
            if max_entry > max_value:
                max_value = max_entry
            if min_entry < min_value:
                min_value = min_entry

        interval_size = (math.fabs(min_value) + math.fabs(max_value))/bins

        for x in range(self.width):
            for y in range(self.height):
                pixel_histogram = {}
                for i in range(bins+1):
                    key = round(min_value+(i*interval_size), 2)
                    pixel_histogram[key] = 0.0
                for i in range(len(images)):
                    image = images[i]
                    value = round(Utils.get_bin(image.data[x][y], interval_size), 2)
                    pixel_histogram[value] += 1.0/len(images)
                self.data[x][y] = pixel_histogram
Each position of the matrix stores a dictionary representing a histogram. Since I do this for every pixel and the calculation takes a considerable time, this seems to me a good problem to parallelize. But I don't have experience with this and I don't know how to do it.
I tried what @Eelco Hoogendoorn told me and it works perfectly. But when I apply it to my code, where the data are a large number of images generated with this constructor (after the values are calculated and no longer just 0), I just get as h an array of zeros [0 0 0]. What I pass to the histogram method is an array of ImageData.
import numpy as np

class ImageData(object):
    def __init__(self, width=5, height=5, range_min=-1, range_max=1):
        """
        The ImageData constructor
        """
        self.width = width
        self.height = height
        # The range of values each pixel can assume
        self.range_min = range_min
        self.range_max = range_max
        # NOTE: the attribute name was lost from this post; ``data`` is assumed.
        self.data = np.arange(width*height).reshape(height, width)

# Another class, just the method here
def generate_pixel_histogram(realizations, bins):
    """
    Generate a histogram of the image for each pixel, counting
    the values assumed for each pixel in a specified bins
    """
    data = np.array([image.data for image in realizations])
    min_max_range = data.min(), data.max()+1

    bin_boundaries = np.empty(bins+1)

    # Function to wrap np.histogram, passing on only the first return value
    def hist(pixel):
        h, b = np.histogram(pixel, bins=bins, range=min_max_range)
        bin_boundaries[:] = b
        return h

    # Apply this for each pixel
    hist_data = np.apply_along_axis(hist, 0, data)
    print(hist_data)
    print(bin_boundaries)
Now I get:
hist_data = np.apply_along_axis(hist, 0, data)
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/numpy/lib/", line 104, in apply_along_axis
outshape[axis] = len(res)
TypeError: object of type 'NoneType' has no len()
Any help would be appreciated.
Thanks in advance.
As noted by john, the most obvious solution to this is to look for library functionality that will do this for you. It exists, and it will be orders of magnitude more efficient than what you are doing here.
Standard numpy has a histogram function that can be used for this purpose. If you have only few values per pixel, it will be relatively inefficient; and it creates a dense histogram vector rather than the sparse one you produce here. Still, chances are good the code below solves your problem efficiently.
import numpy as np

# some example data; 128 images of 4x4 pixels
voxeldata = np.random.randint(0, 100, (128, 4, 4))
# we need to apply the same binning range to each pixel to get sensible output
globalminmax = voxeldata.min(), voxeldata.max()+1
# number of output bins
bins = 20
bin_boundaries = np.empty(bins+1)

# function to wrap np.histogram, passing on only the first return value
def hist(pixel):
    h, b = np.histogram(pixel, bins=bins, range=globalminmax)
    bin_boundaries[:] = b  # simply overwrite; result should be identical each time
    return h

# apply this for each pixel
histdata = np.apply_along_axis(hist, 0, voxeldata)
print(bin_boundaries)
print(histdata[:, 0, 0])  # print the histogram of an arbitrary pixel
But the more general message I'd like to convey, looking at your code sample and the type of problem you are working on: do yourself a favor and learn numpy.
Parallelization certainly would not be my first port of call in optimizing this kind of thing. Your main problem is that you're doing lots of looping at the Python level. Python is inherently slow at this kind of thing.
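To illustrate the point about Python-level loops, here is a sketch of a per-pixel histogram that never loops over individual pixels. It is not taken from any of the answers here: the function name pixel_histograms and the random test stack are hypothetical, and it assumes all images share one global binning range.

```python
import numpy as np

def pixel_histograms(stack, bins):
    """Histogram every pixel of an (n_images, H, W) stack over a shared range.

    Returns (counts, edges), where counts has shape (bins, H, W).
    """
    edges = np.linspace(stack.min(), stack.max(), bins + 1)
    # Bin index of every sample; the global maximum is folded into the last bin
    idx = np.clip(np.digitize(stack, edges) - 1, 0, bins - 1)
    counts = np.zeros((bins,) + stack.shape[1:], dtype=np.int64)
    rows, cols = np.indices(stack.shape[1:])
    for image_idx in idx:  # loops over images only, never over pixels
        np.add.at(counts, (image_idx, rows, cols), 1)
    return counts, edges

rng = np.random.default_rng(0)
stack = rng.integers(0, 100, size=(128, 4, 4))  # 128 synthetic 4x4 images
counts, edges = pixel_histograms(stack, bins=20)
print(counts[:, 0, 0])  # histogram of one arbitrary pixel
```

The remaining loop runs once per image, so the per-pixel work happens inside NumPy rather than in the interpreter.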
One option would be to learn how to write Cython extensions and write the histogram bit in Cython. This might take you a while.
Actually, taking a histogram of pixel values is a very common task in computer vision and it has already been efficiently implemented in OpenCV (which has python wrappers). There are also several functions for taking histograms in the numpy python package (though they are slower than the OpenCV implementations).
I'm using ctypes to access the image acquisition API from National Instruments (NI-IMAQ). In it, there's a function called imgBayerColorDecode() which I'm using on a Bayer encoded image returned from the imgSnap() function. I would like to compare the decoded output (that is an RGB image) to some numpy ndarrays that I will create based on the raw data, which is what imgSnap returns.
However, there are 2 problems.
The first is simple: passing the imgbuffer returned by imgSnap into a numpy array. Now first of all there's a catch: if your machine is 64-bit and you have more than 3GB of RAM, you cannot create the array with numpy and pass it as a pointer to imgSnap. That's why you have to implement a workaround, which is described on NI's forums (NI ref - first 2 posts): disable an error message (line 125 in the code attached below: imaq.niimaquDisable32bitPhysMemLimitEnforcement) and ensure that it is the IMAQ library that creates the memory required for the image (imaq.imgCreateBuffer). After that, this recipe on SO should be able to convert the buffer into a numpy array again. But I'm unsure if I made the correct changes to the datatypes: the camera has 1020x1368 pixels, each pixel intensity is recorded with 10 bits of precision. It returns the image over a CameraLink and I'm assuming it does this with 2 bytes per pixel, for ease of data transportation. Does this mean I have to adapt the recipe given in the other SO question:
buffer = numpy.core.multiarray.int_asbuffer(ctypes.addressof(y.contents), 8*array_length)
a = numpy.frombuffer(buffer, float)
to this:
bufsize = 1020*1368*2
buffer = numpy.core.multiarray.int_asbuffer(ctypes.addressof(y.contents), bufsize)
a = numpy.frombuffer(buffer, numpy.int16)
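The size arithmetic can be checked without a camera. The sketch below assumes 2 bytes per pixel as described above. In place of the int_asbuffer recipe it uses np.ctypeslib.as_array, a NumPy helper that wraps a ctypes pointer without copying. The locally allocated ctypes array is a hypothetical stand-in for the buffer that imgCreateBuffer would provide, and which of the two dimensions is the height is also an assumption.

```python
import ctypes
import numpy as np

height, width = 1020, 1368  # camera resolution quoted above (orientation assumed)

# Stand-in for the IMAQ-owned buffer: one unsigned 16-bit value per pixel
fake_buffer = (ctypes.c_uint16 * (height * width))()
y = ctypes.cast(fake_buffer, ctypes.POINTER(ctypes.c_uint16))

# Wrap the pointer as a NumPy array; the memory is shared, not copied
a = np.ctypeslib.as_array(y, shape=(height, width))

fake_buffer[1] = 1023  # largest 10-bit value, written through ctypes
print(a.nbytes, a[0, 1])
```

An unsigned 16-bit dtype matches the c_uint16 pointer type; since 10-bit values never reach the sign bit, int16 would read the same numbers.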
The second problem is that imgBayerColorDecode() does not give me an output I'm expecting.
Below are 2 images, the first being the output of imgSnap, saved with imgSessionSaveBufferEx(). The second is the output of imgSnap after it has gone through the demosaicing of imgBayerColorDecode().
raw data:
bayer decoded:
As you can see, the Bayer decoded image is still grayscale and moreover it does not resemble the original image (small remark here: the images were scaled for upload with imagemagick). The original image was taken with a red color filter in front of some mask. From it (and 2 other color filters), I know that the Bayer color filter looks like this in the top left corner:
I believe I'm doing something wrong in passing the correct type of pointer to imgBayerColorDecode; my code is appended below.
#!/usr/bin/env python
from __future__ import division
import ctypes as C
import ctypes.util as Cutil
import time
# useful references:
# location of the niimaq.h: C:\Program Files (x86)\National Instruments\NI-IMAQ\Include
# location of the camera files: C:\Users\Public\Documents\National Instruments\NI-IMAQ\Data
# check it C:\Users\Public\Documents\National Instruments\NI-IMAQ\Examples\MSVC\Color\BayerDecode
class IMAQError(Exception):
    """A class for errors produced during the calling of National Instruments' IMAQ functions.
    It will also produce the textual error message that corresponds to a specific code."""

    def __init__(self, code):
        self.code = code
        text = C.c_char_p('')
        imaq.imgShowError(code, text)
        self.message = "{}: {}".format(self.code, text.value)
        # Call the base class constructor with the parameters it needs
        Exception.__init__(self, self.message)


def imaq_error_handler(code):
    """Print the textual error message that is associated with the error code."""
    if code < 0:
        raise IMAQError(code)
        free_associated_resources = 1
        imaq.imgClose(sid, free_associated_resources)
        imaq.imgClose(iid, free_associated_resources)
    return code
if __name__ == '__main__':
    imaqlib_path = Cutil.find_library('imaq')
    imaq = C.windll.LoadLibrary(imaqlib_path)

    imaq_function_list = [  # this is not an exhaustive list, merely the ones used in this program
        imaq.niimaquDisable32bitPhysMemLimitEnforcement,  # because we're running on a 64-bit machine with over 3GB of RAM
        imaq.imgBayerColorDecode ]

    # for all imaq functions we're going to call, we should specify that if they
    # produce an error (a number), we want to see the error message (textually)
    for func in imaq_function_list:
        func.restype = imaq_error_handler

    INTERFACE_ID = C.c_uint32
    SESSION_ID = C.c_uint32
    BUFLIST_ID = C.c_uint32
    iid = INTERFACE_ID(0)
    sid = SESSION_ID(0)
    bid = BUFLIST_ID(0)
    array_16bit = 2**16 * C.c_uint32
    redLUT, greenLUT, blueLUT = [ array_16bit() for _ in range(3) ]
    red_gain, blue_gain, green_gain = [ C.c_double(val) for val in (1., 1., 1.) ]

    # our camera has been given its proper name in Measurement & Automation Explorer (MAX)
    lcp_cam = 'JAI CV-M7+CL'
    imaq.imgInterfaceOpen(lcp_cam, C.byref(iid))
    imaq.imgSessionOpen(iid, C.byref(sid))

    # define some C preprocessor macros (these are all defined in the niimaq.h file)
    _IMG_BASE = 0x3FF60000
    IMG_BUFF_ADDRESS = _IMG_BASE + 0x007E  # void *
    IMG_BUFF_COMMAND = _IMG_BASE + 0x007F  # uInt32
    IMG_BUFF_SIZE = _IMG_BASE + 0x0082  # uInt32
    IMG_CMD_STOP = 0x08  # single shot acquisition
    IMG_ATTR_COLOR = _IMG_BASE + 0x0003  # true = supports color
    IMG_ATTR_PIXDEPTH = _IMG_BASE + 0x0002  # pix depth in bits
    IMG_ATTR_BITSPERPIXEL = _IMG_BASE + 0x0066  # aka the bit depth

    width, height = C.c_uint32(), C.c_uint32()
    has_color, pixdepth, bitsperpixel, bytes_per_pixel = [ C.c_uint8() for _ in range(4) ]

    # poll the camera (or is it the camera file (icd)?) for these attributes and store them in the variables
    for var, macro in [ (width, IMG_ATTR_ROI_WIDTH),
                        (bytes_per_pixel, IMG_ATTR_BYTESPERPIXEL),
                        (pixdepth, IMG_ATTR_PIXDEPTH),
                        (has_color, IMG_ATTR_COLOR),
                        (bitsperpixel, IMG_ATTR_BITSPERPIXEL) ]:
        imaq.imgGetAttribute(sid, macro, C.byref(var))

    print("Image ROI size: {} x {}".format(width.value, height.value))
    print("Pixel depth: {}\nBits per pixel: {} -> {} bytes per pixel".format(
        pixdepth.value, bitsperpixel.value, bytes_per_pixel.value))

    bufsize = width.value*height.value*bytes_per_pixel.value

    # create the buffer (in a list)
    imaq.imgCreateBufList(1, C.byref(bid))  # Creates a buffer list with one buffer
    imgbuffer = C.POINTER(C.c_uint16)()  # create a null pointer
    RGBbuffer = C.POINTER(C.c_uint32)()  # placeholder for the Bayer decoded imgbuffer (i.e. demosaiced imgbuffer)
    imaq.imgCreateBuffer(sid, 0, bufsize, C.byref(imgbuffer))  # allocate memory (the buffer) on the host machine (param2==0)
    imaq.imgCreateBuffer(sid, 0, width.value*height.value * 4, C.byref(RGBbuffer))
    imaq.imgSetBufferElement(bid, 0, IMG_BUFF_ADDRESS, C.cast(imgbuffer, C.POINTER(C.c_uint32)))  # my guess is that the cast to an uint32 is necessary to prevent 64-bit callable memory addresses
    imaq.imgSetBufferElement(bid, 0, IMG_BUFF_SIZE, bufsize)
    imaq.imgSetBufferElement(bid, 0, IMG_BUFF_COMMAND, IMG_CMD_STOP)

    imaq.imgCalculateBayerColorLUT(red_gain, green_gain, blue_gain, redLUT, greenLUT, blueLUT, bitsperpixel)

    imgbuffer_vpp = C.cast(C.byref(imgbuffer), C.POINTER(C.c_void_p))
    imaq.imgSnap(sid, imgbuffer_vpp)
    #imaq.imgSnap(sid, imgbuffer) # <- doesn't work (img produced is entirely black). The above 2 lines are required
    imaq.imgSessionSaveBufferEx(sid, imgbuffer, "bayer_mosaic.png")
    print('1 taken')

    imaq.imgBayerColorDecode(RGBbuffer, imgbuffer, height, width, width, width, redLUT, greenLUT, blueLUT, IMG_BAYER_PATTERN_BGBG_GRGR, bitsperpixel, 0)

    free_associated_resources = 1
    imaq.imgClose(sid, free_associated_resources)
    imaq.imgClose(iid, free_associated_resources)
    print("Finished")
Follow-up: after a discussion with an NI representative, I am becoming convinced that the second issue is due to imgBayerColorDecode being limited to 8-bit input images prior to its 2012 release (we are working with the 2010 release). However, I would like to confirm this: if I cast the 10-bit image to an 8-bit image, keeping only the most significant bits, and pass this cast version to imgBayerColorDecode, I expect to see an RGB image.
To do so, I am casting the imgbuffer to a numpy array and shifting the 10-bit data by 2 bits:
np_buffer = np.core.multiarray.int_asbuffer(
    ctypes.addressof(imgbuffer.contents), bufsize)
flat_data = np.frombuffer(np_buffer, np.uint16)
# from 10 bit to 8 bit, keeping only the non-empty bytes
Z = (flat_data>>2).view(dtype='uint8')[::2]
Z2 = Z.copy()  # just in case
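The bit shift itself is easy to verify on synthetic data. In the sketch below the sample values are made up. astype(np.uint8) is shown as a byte-order-independent alternative to the view(...)[::2] approach, which relies on a little-endian layout.

```python
import numpy as np

# Synthetic 10-bit samples stored in 16-bit words (hypothetical stand-in for flat_data)
flat_data = np.array([0, 3, 4, 512, 1023], dtype=np.uint16)

# Drop the two least significant bits, then narrow to 8 bits
as_8bit = (flat_data >> 2).astype(np.uint8)

# The approach from the post: reinterpret as bytes and keep every other one
# (picks the low byte of each word on little-endian machines)
via_view = (flat_data >> 2).view(dtype='uint8')[::2]

print(as_8bit)
```

After the shift every value fits in 8 bits, so narrowing the dtype loses nothing further.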
Now I pass the ndarray Z2 to imgBayerColorDecode:
bitsperpixel = 8
imaq.imgBayerColorDecode(RGBbuffer, Z2.ctypes.data_as(
    ctypes.POINTER(ctypes.c_uint8)), height, width,
    width, width, redLUT, greenLUT, blueLUT,
    IMG_BAYER_PATTERN_BGBG_GRGR, bitsperpixel, 0)
Note that the original code (shown way above) has been altered slightly, such that redLUT, greenLUT and blueLUT are now only 256-element arrays.
And finally I call imaq.imgSessionSaveBufferEx(sid, RGBbuffer, save_path). But the result is still grayscale and the image shape is not preserved, so I am still doing something terribly wrong. Any ideas?
After a bit of playing around, it turns out that the RGBbuffer mentioned must hold the correct data, but imgSessionSaveBufferEx is doing something odd at that point.
When I pass the data from RGBbuffer back to numpy, reshape this 1D array into the dimensions of the image and then split it into color channels by masking and using bit-shift operations (e.g. red_channel = (np_RGB & 0x00FF0000) >> 16), I can then save it as a nice color image in PNG format with PIL or pypng.
I haven't found out why imgSessionSaveBufferEx behaves oddly though, but the solution above works (even though speed-wise it's really inefficient).
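A sketch of that channel split is shown below. The byte layout (blue in the lowest byte, then green, then red, with the top byte unused) is an assumption about the decoded buffer, and the tiny np_RGB array is synthetic stand-in data rather than real camera output.

```python
import numpy as np

height, width = 2, 2
# Four packed 32-bit pixels: pure red, pure green, pure blue, and a mixed color
np_RGB = np.array([0x00FF0000, 0x0000FF00, 0x000000FF, 0x00102030], dtype=np.uint32)

pixels = np_RGB.reshape(height, width)
red = ((pixels & 0x00FF0000) >> 16).astype(np.uint8)
green = ((pixels & 0x0000FF00) >> 8).astype(np.uint8)
blue = (pixels & 0x000000FF).astype(np.uint8)

# Shape (height, width, 3), the layout PIL and pypng expect for RGB data
rgb_image = np.dstack([red, green, blue])
print(rgb_image[1, 1])
```

Because the masks and shifts operate on whole arrays, this should not need any per-pixel Python loop.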