I am trying my hand at a segmentation task. The images are 3D volumes; since I cannot process them at once because of GPU memory constraints, I am extracting patches of the image and performing operations on them.
To extract the patches I am doing:
import os
import numpy as np
import tqdm
from numpy.lib import stride_tricks

def cutup(data, blck, strd):
    sh = np.array(data.shape)
    blck = np.asanyarray(blck)
    strd = np.asanyarray(strd)
    nbl = (sh - blck) // strd + 1
    strides = np.r_[data.strides * strd, data.strides]
    dims = np.r_[nbl, blck]
    data6 = stride_tricks.as_strided(data, strides=strides, shape=dims)
    return data6.reshape(-1, *blck)
def make_patches(image_folder, mask_folder):
    '''
    Given nii.gz image and mask files, will create numpy patch files.
    '''
    for image, mask in tqdm.tqdm(zip(os.listdir(image_folder), os.listdir(mask_folder))):
        mask_ = mask
        mask = mask.split('_')
        image = mask[0]
        image_name = mask[0]
        mask_name = mask[0]
        image, mask = read_image_and_seg(os.path.join(image_folder, image), os.path.join(mask_folder, mask_))
        if image.shape[1] > 600:
            image = image[:, :600, :]
        # pad up to multiples of the patch size: 896 = 7*128, 640 = 5*128
        desired_size_w = 896
        desired_size_h = 640
        desired_size_z = 640
        delta_w = desired_size_w - image.shape[0]
        delta_h = desired_size_h - image.shape[1]
        delta_z = desired_size_z - image.shape[2]
        padded_image = np.pad(image, ((0, delta_w), (0, delta_h), (0, delta_z)), 'constant')
        padded_mask = np.pad(mask, ((0, delta_w), (0, delta_h), (0, delta_z)), 'constant')
        y = cutup(padded_image, (128, 128, 128), (128, 128, 128))  # could extract more patches by reducing the stride
        y_ = cutup(padded_mask, (128, 128, 128), (128, 128, 128))
        print(image_name)
        for index, (im, label) in enumerate(zip(y, y_)):
            if len(np.unique(im)) == 1:
                continue
            else:
                if not os.path.exists(os.path.join('../data/patches/images/', image_name.split('.')[0] + str(index))):
                    np.save(os.path.join('../data/patches/images/', image_name.split('.')[0] + str(index)), im)
                    np.save(os.path.join('../data/patches/masks/', image_name.split('.')[0] + str(index)), label)
Now this will extract non-overlapping patches and give me the patches as numpy arrays. As an aside, I am converting the image (padding with 0) to shape 896, 640, 640 so I can extract all patches.
The problem is, I don't know if the above code works! To test it I wanted to extract the patches and then take those patches and reconstruct the image, and now I am not exactly sure how to go about this.
For now this is what I have:
def reconstruct_image(folder_path_of_npy_files):
    slice_shape = len(os.listdir(folder_path_of_npy_files))
    recon_image = np.array([])
    for index, file in enumerate(os.listdir(folder_path_of_npy_files)):
        read_image = np.load(os.path.join(folder_path_of_npy_files, file))
        recon_image = np.append(recon_image, read_image)
    return recon_image
But this does not work, as it makes an array of (x, 128, 128, 128) and keeps filling up the 0th dimension.
So my question is: how do I reconstruct the image? Or is there just a plain better way to extract and reconstruct patches?
Thanks in advance.
If things are reasonably simple (not sliding window) then you could possibly use skimage.util.shape.view_as_blocks. For example:
import numpy as np
import skimage

# Create example
data = np.random.random((200, 200, 200))
blocks = skimage.util.shape.view_as_blocks(data, (10, 10, 10))

# Do the processing on the blocks here.
processed_blocks = blocks

# To undo the blocking, interleave the block-grid axes with the within-block
# axes before reshaping; a plain reshape would scramble the voxel order.
new_data = processed_blocks.transpose(0, 3, 1, 4, 2, 5).reshape(200, 200, 200)
But, if you are having memory constraint issues this may not be the best way to go as you are going to be duplicating the original data several times (data, blocks, new_data) etc and you might have to look at doing it a little smarter than my example here.
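As for the reconstruction half of the question: with non-overlapping blocks, the same transpose trick inverts your cutup output. Here is a minimal sketch (reassemble is my own hypothetical helper; it assumes stride == block size and a padded shape that is an exact multiple of the block size, as in your 896 x 640 x 640 case):

import numpy as np

def reassemble(patches, padded_shape, blck):
    # patches: (n_patches, b0, b1, b2), as returned by cutup with stride == block size
    grid = tuple(np.array(padded_shape) // np.array(blck))  # blocks per axis
    out = patches.reshape(grid + tuple(blck))    # (g0, g1, g2, b0, b1, b2)
    out = out.transpose(0, 3, 1, 4, 2, 5)        # re-interleave grid and block axes
    return out.reshape(padded_shape)

# e.g. for the padded volumes in the question:
# recon = reassemble(y, (896, 640, 640), (128, 128, 128))
# assert np.array_equal(recon, padded_image)

Note this only works because the patches are still in the order cutup produced them; if you reload them from .npy files, sort the filenames by their numeric index first.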
If you are having memory issues, the other thing you can do, very carefully, is to change the underlying data type of your data. For example, when I was doing MRI data, most original data was integer-ish but Python would represent it as float64. If you can accept some rounding on the data then you could do something like:
import numpy as np
import skimage

# Create example
data = 200 * np.random.random((200, 200, 200)).astype(np.float16)  # 2-byte float
blocks = skimage.util.shape.view_as_blocks(data, (10, 10, 10))

# Do the processing on the blocks here.
new_data = blocks.transpose(0, 3, 1, 4, 2, 5).reshape(200, 200, 200)
This version uses:
In [2]: whos
Variable Type Data/Info
-------------------------------
blocks ndarray 20x20x20x10x10x10: 8000000 elems, type `float16`, 16000000 bytes (15.2587890625 Mb)
data ndarray 200x200x200: 8000000 elems, type `float16`, 16000000 bytes (15.2587890625 Mb)
new_data ndarray 200x200x200: 8000000 elems, type `float16`, 16000000 bytes (15.2587890625 Mb)
vs the first version:
In [2]: whos
Variable Type Data/Info
-------------------------------
blocks ndarray 20x20x20x10x10x10: 8000000 elems, type `float64`, 64000000 bytes (61.03515625 Mb)
data ndarray 200x200x200: 8000000 elems, type `float64`, 64000000 bytes (61.03515625 Mb)
new_data ndarray 200x200x200: 8000000 elems, type `float64`, 64000000 bytes (61.03515625 Mb)
So, doing the np.float16 saves you about a factor of 4 in RAM.
But, making this type of change puts assumptions on the data and algorithm (possible rounding issues etc).
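A cheap way to test that assumption on your own data (my addition, not part of the discussion above) is a round-trip comparison before committing to the smaller dtype:

import numpy as np

data = np.random.randint(0, 4096, size=(200, 200, 200)).astype(np.float64)

# Round-trip through the candidate dtype and measure the worst-case error.
roundtrip = data.astype(np.float16).astype(np.float64)
max_err = np.abs(roundtrip - data).max()
print(max_err)  # if this exceeds your tolerance, float16 is too lossy for this data

(float16 only has an 11-bit significand, so integer intensities above 2048 will already round.)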
Is it possible to save numpy arrays on disk in boolean format where it takes only 1 bit per element? This answer suggests using packbits and unpackbits; however, from the documentation, it seems that this may not support memory mapping. Is there a way to store 1-bit arrays on disk with memmap support?
Reason for the memmap requirement: I'm training my neural network on a database of full-HD (1920x1080) images, but I crop out a random 256x256 patch for each iteration. Since reading the full image is time consuming, I use memmap to read only the required patch. Now I want to use a binary mask along with my images, hence this requirement.
numpy does not support 1 bit per element arrays, and I doubt memmap has such a feature.
However, there is a simple workaround using packbits.
Since your case does not need bitwise random access, you can read it as a 1 byte per element array.
import numpy as np

# A binary mask represented as a 1 byte per element array.
full_size_mask = np.random.randint(0, 2, size=[1920, 1080], dtype=np.uint8)

# Pack mask vertically.
packed_mask = np.packbits(full_size_mask, axis=0)

# Save as a memmap compatible file.
buffer = np.memmap("./temp.bin", mode='w+',
                   dtype=packed_mask.dtype, shape=packed_mask.shape)
buffer[:] = packed_mask
buffer.flush()
del buffer

# Open as a memmap file.
packed_mask = np.memmap("./temp.bin", mode='r',
                        dtype=packed_mask.dtype, shape=packed_mask.shape)

# Rect where you want to crop.
top = 555
left = 777
width = 256
height = 256

# Read the area containing the rect.
packed_top = top // 8
packed_bottom = (top + height) // 8 + 1
packed_patch = packed_mask[packed_top:packed_bottom, left:left + width]

# Unpack and crop the actual area.
patch_top = top - packed_top * 8
patch_mask = np.unpackbits(packed_patch, axis=0)[patch_top:patch_top + height]

# Check that the mask is cropped from the correct area.
print(np.all(patch_mask == full_size_mask[top:top + height, left:left + width]))
Note that this solution could (and likely will) read extra bits.
To be specific, 7 bits maximum at both ends.
In your case, it will be 7x2x256 bits, but this is only about 5% of the patch, so I believe it is negligible.
By the way, this is not an answer to your question, but when you are dealing with binary masks such as labels for image segmentation, compressing with zip may drastically reduce the file size.
It is possible that it could be reduced to less than 8 KB per image (not per patch).
You might want to consider this option as well.
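As a concrete sketch of that suggestion, numpy's own compressed container (np.savez_compressed, which is zip/DEFLATE under the hood) is one option; note it gives up memmap access, so it suits whole-image storage rather than random patch reads:

import numpy as np

mask = np.random.randint(0, 2, size=[1920, 1080], dtype=np.uint8)

# Real segmentation masks (large uniform regions) compress far better
# than this random example does.
np.savez_compressed("mask.npz", mask=mask)

restored = np.load("mask.npz")["mask"]
assert np.array_equal(restored, mask)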
I am working with a dataset that contains .mha files. I want to convert these files to either png/tiff for some work. I am using the MedPy library for converting:

from medpy.io import load, save

image_data, image_header = load('image_path/c0001.mha')
save(image_data, 'image_save_path/new_image.png', image_header)

I can actually convert the image into png/tiff format, but the converted image turns dark after the conversion (screenshot omitted here). How can I convert the images successfully?
Your data is clearly limited to 12 bits (white is 2**12-1, i.e., 4095), while a PNG image in this context is 16 bits (white is 2**16-1, i.e., 65535). For this reason your PNG image is so dark that it appears almost black (but if you look closely it isn't).
The most precise transformation you can apply is the following:
import numpy as np
from medpy.io import load, save

def convert_to_uint16(data, source_max):
    target_max = 65535  # 2 ** 16 - 1
    # build a linear lookup table (LUT) indexed from 0 to source_max
    source_range = np.arange(source_max + 1)
    lut = np.round(source_range * target_max / source_max).astype(np.uint16)
    # apply it
    return lut[data]

image_data, image_header = load('c0001.mha')
new_image_data = convert_to_uint16(image_data, 4095)  # 2 ** 12 - 1
save(new_image_data, 'new_image.png', image_header)
Output: (converted image omitted)
N.B.: new_image_data = image_data * 16 corresponds to replacing 65535 with 65520 (4095 * 16) in convert_to_uint16
You may apply "contrast stretching".
The dynamic range of image_data is about [0, 4095] - the minimum value is about 0, and the maximum value is about 4095 (2^12-1).
You are saving the image as a 16-bit PNG.
When you display the PNG file, the viewer assumes the maximum value is 2^16-1 (the dynamic range of 16 bits is [0, 65535]).
The viewer assumes 0 is black, 2^16-1 is white, and values in between scale linearly.
In your case the white pixel value is about 4095, so it translates to a very dark gray in the [0, 65535] range.
The simplest solution is to multiply image_data by 16:
from medpy.io import load, save
image_data, image_header = load('image_path/c0001.mha')
save(image_data*16, 'image_save_path/new_image.png', image_header)
A more complicated solution is applying linear "contrast stretching".
We may transform the lower 1% of all pixels to 0, the upper 1% of the pixels to 2^16-1, and scale the pixels in between linearly.
import numpy as np
from medpy.io import load, save
image_data, image_header = load('image_path/c0001.mha')
tmp = image_data.copy()
tmp[tmp == 0] = np.median(tmp) # Ignore zero values by replacing them with median value (there are a lot of zeros in the margins).
tmp = tmp.astype(np.float32) # Convert to float32
# Get the value of lower and upper 1% of all pixels
lo_val, up_val = np.percentile(tmp, (1, 99)) # (for current sample: lo_val = 796, up_val = 3607)
# Linear stretching: Lower 1% goes to 0, upper 1% goes to 2^16-1, other values are scaled linearly
# Clip to range [0, 2^16-1], round and convert to uint16
# https://stackoverflow.com/questions/49656244/fast-imadjust-in-opencv-and-python
img = np.round(((tmp - lo_val)*(65535/(up_val - lo_val))).clip(0, 65535)).astype(np.uint16) # (for current sample: subtract 796 and scale by 23.31)
img[image_data == 0] = 0 # Restore the original zeros.
save(img, 'image_save_path/new_image.png', image_header)
The above method enhances the contrast, but loses some of the original information.
In case you want higher contrast, you may use non-linear methods, improving the visibility but losing some "integrity".
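For example, one simple non-linear option (a sketch of my own, reusing tmp, lo_val, up_val and image_data from the snippet above) is gamma correction applied on top of the same linear stretch:

norm = ((tmp - lo_val) / (up_val - lo_val)).clip(0, 1)  # stretch to [0, 1]
gamma = 0.5  # gamma < 1 brightens dark regions and mid-tones
img_gamma = np.round(65535 * norm**gamma).astype(np.uint16)
img_gamma[image_data == 0] = 0  # restore the original zeros, as before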
Here is the "linear stretching" result (downscaled; image omitted).
I have one big numpy array A of shape (2_000_000, 2000) of dtype float64, which takes 32 GB.
(Or alternatively the same data split into 10 arrays of shape (200_000, 2000); that may be easier for serialization?)
How can we serialize it to disk such that we can have fast random read access to any part of the data?
More precisely I need to be able to read ten thousand windows of shape (16, 2000) from A at random starting indexes i:
L = []
for _ in range(10_000):
    i = random.randint(0, 2_000_000 - 16)
    window = A[i:i+16, :]  # window of A of shape (16, 2000) starting at a random index i
    L.append(window)
WINS = np.concatenate(L)  # shape (10_000, 16, 2000) of float64, i.e. ~ 2.4 GB
Let's say I only have 8 GB of RAM available for this task; it's totally impossible to load the whole 32 GB of A into RAM.
How can we read such windows from a numpy array serialized on disk? (.h5 format or any other)
Note: The fact the reading is done at randomized starting indexes is important.
This example shows how you can use an HDF5 file for the process you describe.
First, create an HDF5 file with a dataset of shape (2_000_000, 2000) and dtype=float64 values. I used variables for the dimensions so you can tinker with them.
import numpy as np
import h5py
import random

h5_a0, h5_a1 = 2_000_000, 2_000

with h5py.File('SO_68206763.h5', 'w') as h5f:
    dset = h5f.create_dataset('test', shape=(h5_a0, h5_a1), dtype='float64')
    incr = 1_000
    a0 = h5_a0 // incr
    for i in range(incr):
        arr = np.random.random(a0*h5_a1).reshape(a0, h5_a1)
        dset[i*a0:i*a0+a0, :] = arr
    print(dset[-1, 0:10])  # quick dataset check of values in last row
Next, open the file in read mode, read 10_000 random array slices of shape (16,2_000) and append to the list L. At the end, convert the list to the array WINS. Note, by default the array will have 2 axes -- you need to use .reshape() if you want 3 axes per your comment (reshape also shown).
with h5py.File('SO_68206763.h5', 'r') as h5f:
    dset = h5f['test']
    L = []
    ds0, ds1 = dset.shape[0], dset.shape[1]
    for i in range(10_000):
        ir = random.randint(0, ds0 - 16)
        window = dset[ir:ir+16, :]  # window from dset of shape (16, 2000) starting at a random index ir
        L.append(window)
    WINS = np.concatenate(L)  # shape (160_000, 2_000) of float64
    print(WINS.shape, WINS.dtype)
    WINS = np.concatenate(L).reshape(10_000, 16, ds1)  # reshaped to (10_000, 16, 2_000) of float64
    print(WINS.shape, WINS.dtype)
The procedure above is not memory efficient. You wind up with 2 copies of the randomly sliced data: in both list L and array WINS. If memory is limited, this could be a problem. To avoid the intermediate copy, read the random slice of data directly into an array. Doing this simplifies the code and reduces the memory footprint. That method is shown below (WINS2 is a 2-axis array, and WINS3 is a 3-axis array).
with h5py.File('SO_68206763.h5', 'r') as h5f:
    dset = h5f['test']
    ds0, ds1 = dset.shape[0], dset.shape[1]
    WINS2 = np.empty((10_000*16, ds1))
    WINS3 = np.empty((10_000, 16, ds1))
    for i in range(10_000):
        ir = random.randint(0, ds0 - 16)
        WINS2[i*16:(i+1)*16, :] = dset[ir:ir+16, :]
        WINS3[i, :, :] = dset[ir:ir+16, :]
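One extra knob worth knowing about (my addition, not benchmarked here): HDF5 reads whole chunks, so for this access pattern the dataset's chunk shape matters. A hypothetical variant of the dataset creation above, chunked so that each (16, 2000) window touches at most two chunks:

with h5py.File('SO_68206763.h5', 'w') as h5f:
    dset = h5f.create_dataset('test', shape=(h5_a0, h5_a1),
                              dtype='float64', chunks=(16, h5_a1))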
An alternative solution to h5py datasets that I tried and that works is using memmap, as suggested in @RyanPepper's comment.
Write the data as binary
import numpy as np

with open('a.bin', 'wb') as A:
    for f in range(1000):
        x = np.random.randn(10*2000).astype('float32').reshape(10, 2000)
        A.write(x.tobytes())
    A.flush()
Open later as memmap
A = np.memmap('a.bin', dtype='float32', mode='r').reshape((-1, 2000))
print(A.shape) # (10000, 2000)
print(A[1234:1234+16, :]) # window
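For completeness, the question's random-window loop run against this memmap would look like the sketch below; only the rows actually sliced are read from disk, and np.array() forces the copy into RAM:

import random

L = []
for _ in range(10_000):
    i = random.randint(0, A.shape[0] - 16)
    L.append(np.array(A[i:i+16, :]))  # copy the (16, 2000) window into memory

WINS = np.stack(L)  # shape (10_000, 16, 2000)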
How can I speed up reading 12 bit little endian packed data in Python?
The following code, based on https://stackoverflow.com/a/37798391/11687201, works but takes far too long:
import bitstring
import numpy as np

# byte_string read from file contains 12 bit little endian packed image data
# b'\xAB\xCD\xEF' -> pixel 1 = 0x0DAB, pixel 2 = 0x0EFC
# width, height equal the width and height of the image read
image = np.empty(width*height, np.uint16)
ic = 0
ii = np.empty(width*height, np.uint16)
for oo in range(0, len(byte_string)-2, 3):
    aa = bitstring.BitString(byte_string[oo:oo+3])
    aa.byteswap()
    ii[ic+1], ii[ic] = aa.unpack('uint:12,uint:12')
    ic = ic + 2
This should work a bit better:
import struct

for oo in range(0, len(byte_string)-2, 3):
    (word,) = struct.unpack('<L', byte_string[oo:oo+3] + b'\x00')
    ii[ic+1], ii[ic] = (word >> 12) & 0xfff, word & 0xfff
    ic += 2
It's very similar, but instead of using bitstring, which is quite slow, it uses a single call to struct.unpack to extract 24 bits at a time (padding with zeroes so that they can be read as a long) and then does some bit masking to extract the two different 12-bit parts.
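A further micro-optimisation in the same spirit (my addition, untested on the asker's data) is to precompile the format with struct.Struct so the format string is only parsed once:

import struct

unpack_u32 = struct.Struct('<L').unpack  # compiled once, reused in the loop

for oo in range(0, len(byte_string) - 2, 3):
    (word,) = unpack_u32(byte_string[oo:oo+3] + b'\x00')
    ii[ic+1], ii[ic] = (word >> 12) & 0xfff, word & 0xfff
    ic += 2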
I found a solution that executes much faster on my system than the solution mentioned above (https://stackoverflow.com/a/65851364/11687201), which already was a great improvement (2 seconds instead of 2 minutes using the code in the question).
Loading one of my image files using the code below takes approximately 45 milliseconds, instead of approximately 2 seconds with the above-mentioned solution.
import numpy as np
import math

image = np.frombuffer(byte_string, np.uint8)
num_bytes = math.ceil((width*height)*1.5)
num_3b = math.ceil(num_bytes / 3)
last = num_3b * 3
image = image[:last]
image = image.reshape(-1, 3)  # each row holds 3 bytes = 2 packed 12-bit pixels
image = np.hstack((image, np.zeros((image.shape[0], 1), dtype=np.uint8)))
image.dtype = '<u4'  # 'u' for unsigned int; view each zero-padded row as one little endian 32-bit word
image = np.hstack((image, np.zeros((image.shape[0], 1), dtype=np.uint8)))
image[:, 1] = (image[:, 0] >> 12) & 0xfff  # second pixel: upper 12 bits
image[:, 0] = image[:, 0] & 0xfff  # first pixel: lower 12 bits
image = image.astype(np.uint16)
image = image.reshape(height, width)
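A quick way to sanity-check this kind of bit-twiddling (a test sketch of mine; unpack12 just wraps the snippet above) is to decode the known 3-byte sample from the question and compare against the expected pixel values:

import math
import numpy as np

def unpack12(byte_string, width, height):
    image = np.frombuffer(byte_string, np.uint8)
    num_3b = math.ceil(math.ceil((width * height) * 1.5) / 3)
    image = image[:num_3b * 3].reshape(-1, 3)
    image = np.hstack((image, np.zeros((image.shape[0], 1), dtype=np.uint8)))
    image.dtype = '<u4'
    image = np.hstack((image, np.zeros((image.shape[0], 1), dtype=np.uint8)))
    image[:, 1] = (image[:, 0] >> 12) & 0xfff
    image[:, 0] = image[:, 0] & 0xfff
    return image.astype(np.uint16).reshape(height, width)

# b'\xAB\xCD\xEF' -> pixel 1 = 0x0DAB, pixel 2 = 0x0EFC (see the question)
assert unpack12(b'\xAB\xCD\xEF', 2, 1).tolist() == [[0x0DAB, 0x0EFC]]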
I'm using ctypes to access the image acquisition API from National Instruments (NI-IMAQ). In it, there's a function called imgBayerColorDecode() which I'm using on a Bayer encoded image returned from the imgSnap() function. I would like to compare the decoded output (that is an RGB image) to some numpy ndarrays that I will create based on the raw data, which is what imgSnap returns.
However, there are 2 problems.
The first is simple: passing the imgbuffer returned by imgSnap into a numpy array. Now first of all there's a catch: if your machine is 64-bit and you have more than 3GB of RAM, you cannot create the array with numpy and pass it as a pointer to imgSnap. That's why you have to implement a workaround, which is described on NI's forums (NI ref - first 2 posts): disable an error message (line 125 in the code attached below: imaq.niimaquDisable32bitPhysMemLimitEnforcement) and ensure that it is the IMAQ library that creates the memory required for the image (imaq.imgCreateBuffer). After that, this recipe on SO should be able to convert the buffer into a numpy array again. But I'm unsure if I made the correct changes to the datatypes: the camera has 1020x1368 pixels, each pixel intensity is recorded with 10 bits of precision. It returns the image over a CameraLink and I'm assuming it does this with 2 bytes per pixel, for ease of data transportation. Does this mean I have to adapt the recipe given in the other SO question:
buffer = numpy.core.multiarray.int_asbuffer(ctypes.addressof(y.contents), 8*array_length)
a = numpy.frombuffer(buffer, float)
to this:
bufsize = 1020*1368*2
buffer = numpy.core.multiarray.int_asbuffer(ctypes.addressof(y.contents), bufsize)
a = numpy.frombuffer(buffer, numpy.int16)
The second problem is that imgBayerColorDecode() does not give me an output I'm expecting.
Below are 2 images, the first being the output of imgSnap, saved with imgSessionSaveBufferEx(). The second is the output of imgSnap after it has gone through the demosaicing of imgBayerColorDecode().
raw data: i42.tinypic.com/znpr38.jpg
bayer decoded: i39.tinypic.com/n12nmq.jpg
As you can see, the bayer decoded image is still a grayscale and moreover it does not resemble the original image (small remark here, the images were scaled for upload with imagemagick). The original image was taken with a red color filter in front of some mask. From it (and 2 other color filters), I know that the Bayer color filter looks like this in the top left corner:
BGBG
GRGR
I believe I'm doing something wrong in passing the correct type of pointer to imgBayerDecode, my code is appended below.
#!/usr/bin/env python
from __future__ import division

import ctypes as C
import ctypes.util as Cutil
import time

# useful references:
# location of the niimaq.h: C:\Program Files (x86)\National Instruments\NI-IMAQ\Include
# location of the camera files: C:\Users\Public\Documents\National Instruments\NI-IMAQ\Data
# check C:\Users\Public\Documents\National Instruments\NI-IMAQ\Examples\MSVC\Color\BayerDecode

class IMAQError(Exception):
    """A class for errors produced during the calling of National Instruments' IMAQ functions.
    It will also produce the textual error message that corresponds to a specific code."""
    def __init__(self, code):
        self.code = code
        text = C.c_char_p('')
        imaq.imgShowError(code, text)
        self.message = "{}: {}".format(self.code, text.value)
        # Call the base class constructor with the parameters it needs
        Exception.__init__(self, self.message)
def imaq_error_handler(code):
    """Print the textual error message that is associated with the error code."""
    if code < 0:
        # release the session and interface before raising
        free_associated_resources = 1
        imaq.imgSessionStopAcquisition(sid)
        imaq.imgClose(sid, free_associated_resources)
        imaq.imgClose(iid, free_associated_resources)
        raise IMAQError(code)
    else:
        return code
if __name__ == '__main__':
    imaqlib_path = Cutil.find_library('imaq')
    imaq = C.windll.LoadLibrary(imaqlib_path)

    imaq_function_list = [  # this is not an exhaustive list, merely the ones used in this program
        imaq.imgGetAttribute,
        imaq.imgInterfaceOpen,
        imaq.imgSessionOpen,
        imaq.niimaquDisable32bitPhysMemLimitEnforcement,  # because we're running on a 64-bit machine with over 3GB of RAM
        imaq.imgCreateBufList,
        imaq.imgCreateBuffer,
        imaq.imgSetBufferElement,
        imaq.imgSnap,
        imaq.imgSessionSaveBufferEx,
        imaq.imgSessionStopAcquisition,
        imaq.imgClose,
        imaq.imgCalculateBayerColorLUT,
        imaq.imgBayerColorDecode ]

    # for all imaq functions we're going to call, we should specify that if they
    # produce an error (a number), we want to see the error message (textually)
    for func in imaq_function_list:
        func.restype = imaq_error_handler
    INTERFACE_ID = C.c_uint32
    SESSION_ID = C.c_uint32
    BUFLIST_ID = C.c_uint32
    iid = INTERFACE_ID(0)
    sid = SESSION_ID(0)
    bid = BUFLIST_ID(0)

    array_16bit = 2**16 * C.c_uint32
    redLUT, greenLUT, blueLUT = [array_16bit() for _ in range(3)]
    red_gain, blue_gain, green_gain = [C.c_double(val) for val in (1., 1., 1.)]

    # OPEN A COMMUNICATION CHANNEL WITH THE CAMERA
    # our camera has been given its proper name in Measurement & Automation Explorer (MAX)
    lcp_cam = 'JAI CV-M7+CL'
    imaq.imgInterfaceOpen(lcp_cam, C.byref(iid))
    imaq.imgSessionOpen(iid, C.byref(sid))
    # START C MACROS DEFINITIONS
    # define some C preprocessor macros (these are all defined in the niimaq.h file)
    _IMG_BASE = 0x3FF60000
    IMG_BUFF_ADDRESS = _IMG_BASE + 0x007E  # void *
    IMG_BUFF_COMMAND = _IMG_BASE + 0x007F  # uInt32
    IMG_BUFF_SIZE = _IMG_BASE + 0x0082  # uInt32
    IMG_CMD_STOP = 0x08  # single shot acquisition
    IMG_ATTR_ROI_WIDTH = _IMG_BASE + 0x01A6
    IMG_ATTR_ROI_HEIGHT = _IMG_BASE + 0x01A7
    IMG_ATTR_BYTESPERPIXEL = _IMG_BASE + 0x0067
    IMG_ATTR_COLOR = _IMG_BASE + 0x0003  # true = supports color
    IMG_ATTR_PIXDEPTH = _IMG_BASE + 0x0002  # pix depth in bits
    IMG_ATTR_BITSPERPIXEL = _IMG_BASE + 0x0066  # aka the bit depth
    IMG_BAYER_PATTERN_GBGB_RGRG = 0
    IMG_BAYER_PATTERN_GRGR_BGBG = 1
    IMG_BAYER_PATTERN_BGBG_GRGR = 2
    IMG_BAYER_PATTERN_RGRG_GBGB = 3
    # END C MACROS DEFINITIONS
    width, height = C.c_uint32(), C.c_uint32()
    has_color, pixdepth, bitsperpixel, bytes_per_pixel = [C.c_uint8() for _ in range(4)]

    # poll the camera (or is it the camera file (icd)?) for these attributes and store them in the variables
    for var, macro in [ (width, IMG_ATTR_ROI_WIDTH),
                        (height, IMG_ATTR_ROI_HEIGHT),
                        (bytes_per_pixel, IMG_ATTR_BYTESPERPIXEL),
                        (pixdepth, IMG_ATTR_PIXDEPTH),
                        (has_color, IMG_ATTR_COLOR),
                        (bitsperpixel, IMG_ATTR_BITSPERPIXEL) ]:
        imaq.imgGetAttribute(sid, macro, C.byref(var))

    print("Image ROI size: {} x {}".format(width.value, height.value))
    print("Pixel depth: {}\nBits per pixel: {} -> {} bytes per pixel".format(
        pixdepth.value,
        bitsperpixel.value,
        bytes_per_pixel.value))
    bufsize = width.value * height.value * bytes_per_pixel.value
    imaq.niimaquDisable32bitPhysMemLimitEnforcement(sid)

    # create the buffer (in a list)
    imaq.imgCreateBufList(1, C.byref(bid))  # creates a buffer list with one buffer

    # CONFIGURE THE PROPERTIES OF THE BUFFER
    imgbuffer = C.POINTER(C.c_uint16)()  # create a null pointer
    RGBbuffer = C.POINTER(C.c_uint32)()  # placeholder for the Bayer decoded imgbuffer (i.e. demosaiced imgbuffer)
    imaq.imgCreateBuffer(sid, 0, bufsize, C.byref(imgbuffer))  # allocate memory (the buffer) on the host machine (param2==0)
    imaq.imgCreateBuffer(sid, 0, width.value * height.value * 4, C.byref(RGBbuffer))
    imaq.imgSetBufferElement(bid, 0, IMG_BUFF_ADDRESS, C.cast(imgbuffer, C.POINTER(C.c_uint32)))  # my guess is that the cast to an uint32 is necessary to prevent 64-bit callable memory addresses
    imaq.imgSetBufferElement(bid, 0, IMG_BUFF_SIZE, bufsize)
    imaq.imgSetBufferElement(bid, 0, IMG_BUFF_COMMAND, IMG_CMD_STOP)
    # CALCULATE THE LOOKUP TABLES TO CONVERT THE BAYER ENCODED IMAGE TO RGB (=DEMOSAICING)
    imaq.imgCalculateBayerColorLUT(red_gain, green_gain, blue_gain, redLUT, greenLUT, blueLUT, bitsperpixel)

    # CAPTURE THE RAW DATA
    imgbuffer_vpp = C.cast(C.byref(imgbuffer), C.POINTER(C.c_void_p))
    imaq.imgSnap(sid, imgbuffer_vpp)
    #imaq.imgSnap(sid, imgbuffer)  # <- doesn't work (img produced is entirely black). The above 2 lines are required

    imaq.imgSessionSaveBufferEx(sid, imgbuffer, "bayer_mosaic.png")
    print('1 taken')
    imaq.imgBayerColorDecode(RGBbuffer, imgbuffer, height, width, width, width, redLUT, greenLUT, blueLUT, IMG_BAYER_PATTERN_BGBG_GRGR, bitsperpixel, 0)
    imaq.imgSessionSaveBufferEx(sid, RGBbuffer, "snapshot_decoded.png")

    free_associated_resources = 1
    imaq.imgSessionStopAcquisition(sid)
    imaq.imgClose(sid, free_associated_resources)
    imaq.imgClose(iid, free_associated_resources)
    print('Finished')
Follow-up: after a discussion with an NI representative, I am getting convinced that the second issue is due to imgBayerColorDecode being limited to 8-bit input images prior to its 2012 release (we are working with the 2010 version). However, I would like to confirm this: if I cast the 10-bit image to an 8-bit image, keeping only the most significant bits, and pass this cast version to imgBayerColorDecode, I'm expecting to see an RGB image.
To do so, I am casting the imgbuffer to a numpy array and shifting the 10-bit data by 2 bits:
np_buffer = np.core.multiarray.int_asbuffer(
    ctypes.addressof(imgbuffer.contents), bufsize)
flat_data = np.frombuffer(np_buffer, np.uint16)

# from 10 bit to 8 bit, keeping only the non-empty bytes
Z = (flat_data >> 2).view(dtype='uint8')[::2]
Z2 = Z.copy()  # just in case
Now I pass the ndarray Z2 to imgBayerColorDecode:
bitsperpixel = 8
imaq.imgBayerColorDecode(RGBbuffer,
                         Z2.ctypes.data_as(ctypes.POINTER(ctypes.c_uint8)),
                         height, width, width, width,
                         redLUT, greenLUT, blueLUT,
                         IMG_BAYER_PATTERN_BGBG_GRGR, bitsperpixel, 0)
Remark that the original code (shown way above) has been altered slightly, such that redLUT, greenLUT and blueLUT are now only 256-element arrays.
And finally I call imaq.imgSessionSaveBufferEx(sid, RGBbuffer, save_path). But it is still a grayscale and the image shape is not preserved, so I am still doing something terribly wrong. Any ideas?
After a bit of playing around, it turns out that the RGBbuffer mentioned must hold the correct data, but imgSessionSaveBufferEx is doing something odd at that point.
When I pass the data from RGBbuffer back to numpy, reshape this 1D array into the dimensions of the image and then split it into color channels by masking and using bitshift operations (e.g. red_channel = (np_RGB & 0xFF000000) >> 16), I can then save it as a nice color image in png format with PIL or pypng.
I haven't found out why imgSessionSaveBufferEx behaves oddly though, but the solution above works (even though speed-wise it's really inefficient).
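A rough sketch of that workaround (my reconstruction of the description above; the masks and shift amounts are assumptions, since the answer's own example uses (np_RGB & 0xFF000000) >> 16, so adjust them to however imgBayerColorDecode actually packs its 32-bit pixels):

import numpy as np

# Wrap the decoded ctypes buffer in a numpy array and reshape to the image.
np_RGB = np.ctypeslib.as_array(RGBbuffer, shape=(height.value * width.value,))
np_RGB = np_RGB.reshape(height.value, width.value)

# Split the packed 32-bit pixels into 8-bit channels by masking and shifting
# (assuming a 0x00RRGGBB layout here).
red   = ((np_RGB & 0x00FF0000) >> 16).astype(np.uint8)
green = ((np_RGB & 0x0000FF00) >> 8).astype(np.uint8)
blue  =  (np_RGB & 0x000000FF).astype(np.uint8)

rgb = np.dstack((red, green, blue))  # (height, width, 3); save with PIL or pypng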