What would be the fastest and most memory-efficient way to average many frames of a 16-bit TIFF image into a NumPy array?
What I have come up with so far is the code below. To my surprise, method2 was faster than method1.
But when profiling, never assume; test it! So I want to test more.
Is Wand worth trying? I did not include it here because, after installing ImageMagick-6.8.9-Q16 and setting the MAGICK_HOME environment variable, it still does not import... Is there any other library for multipage TIFF in Python? GDAL may be a little too much for this.
(edit) I included libtiff. method2 is still the fastest and quite memory efficient.
from time import time
#import cv2 ## no multi page tiff support
import numpy as np
from PIL import Image
#from scipy.misc import imread ## no multi page tiff support
import tifffile # http://www.lfd.uci.edu/~gohlke/code/tifffile.py.html
from libtiff import TIFF # https://code.google.com/p/pylibtiff/
fp = r"path/2/1000frames-timelapse-image.tif"
def method1(fp):
    '''
    using tifffile.py by Christoph (Version: 2014.02.05)
    (http://www.lfd.uci.edu/~gohlke/code/tifffile.py.html)
    '''
    with tifffile.TIFFfile(fp) as imfile:
        return imfile.asarray().mean(axis=0)

def method2(fp):
    'primitive peak memory friendly way with tifffile.py'
    with tifffile.TIFFfile(fp) as imfile:
        nframe, h, w = imfile.series[0]['shape']
        temp = np.zeros( (h,w), dtype=np.float64 )
        for n in range(nframe):
            curframe = imfile.asarray(n)
            temp += curframe
    return (temp / nframe)

def method3(fp):
    ' like method2 but using pillow 2.3.0 '
    im = Image.open(fp)
    w, h = im.size
    temp = np.zeros( (h,w), dtype=np.float64 )
    n = 0
    while True:
        curframe = np.array(im.getdata()).reshape(h,w)
        temp += curframe
        n += 1
        try:
            im.seek(n)
        except EOFError:  # seek past the last frame ends the loop
            break
    return (temp / n)
def method4(fp):
    '''
    https://code.google.com/p/pylibtiff/
    documentation seems outdated.
    '''
    tif = TIFF.open(fp)
    header = tif.info()
    meta = dict() # extracting meta
    for l in header.splitlines():
        if l:
            if l.find(':')>0:
                parts = l.split(':')
                key = parts[0]
                value = ':'.join(parts[1:])
            elif l.find('=')>0:
                key, value =l.split('=')
            meta[key] = value
    nframes = int(meta['frames'])
    h = int(meta['ImageLength'])
    w = int(meta['ImageWidth'])
    temp = np.zeros( (h,w), dtype=np.float64 )
    for frame in tif.iter_images():
        temp += frame
    return (temp / nframes)
t0 = time()
avgimg1 = method1(fp)
print time() - t0
# 1.17-1.33 s
t0 = time()
avgimg2 = method2(fp)
print time() - t0
# 0.90-1.53 s usually faster than method1 by 20%
t0 = time()
avgimg3 = method3(fp)
print time() - t0
# 21 s
t0 = time()
avgimg4 = method4(fp)
print time() - t0
# 1.96 - 2.21 s # may not be accurate. I got warning for every frame with the tiff file I tested.
np.testing.assert_allclose(avgimg1, avgimg2)
np.testing.assert_allclose(avgimg1, avgimg3)
np.testing.assert_allclose(avgimg1, avgimg4)
Simple logic would make me bet my money on method 1 or 3, since methods 2 and 4 have for-loops in them, and Python-level for-loops usually slow your code down as the input grows.
I would definitely go for method 1: it is neat and clear to read.
To be really sure, I would just test them. If you don't feel like testing, I would go for method 1.
Kind regards,
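The frame-by-frame idea behind method2 can be sketched independently of the TIFF reader. This is only a minimal sketch: `mean_of_frames` is a hypothetical helper name, and it assumes the frames arrive one at a time from any iterable (for example the per-page arrays that `imfile.asarray(n)` returns in method2), so only one frame plus a single float64 accumulator is held in memory.

```python
import numpy as np

def mean_of_frames(frames):
    """Average an iterable of equally shaped 2D frames without stacking them.

    Hypothetical helper: `frames` can be any iterable of array-likes.
    """
    total = None
    count = 0
    for frame in frames:
        if total is None:
            # float64 accumulator avoids overflow when summing uint16 data
            total = np.zeros(np.shape(frame), dtype=np.float64)
        total += frame
        count += 1
    if count == 0:
        raise ValueError("no frames given")
    return total / count
```

Compared with loading the whole stack and calling `.mean(axis=0)`, peak memory here stays at roughly one frame plus the accumulator, whatever the number of frames.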
I want to run through a large TIFF stack (1500+ frames) and extract the coordinates of the local maxima in each frame. The code below does the job, but it is extremely slow for large files. When running on a small subset (e.g. 20 frames), each frame is processed almost instantly; when running on the whole dataset, each frame takes seconds.
Is there a way to make this code run faster? I suspect the slowdown comes from loading the large TIFF file, but shouldn't that only be necessary once, at the start?
I have the following code:
import numpy as np
import pandas as pd
from pims import ImageSequence
from skimage.feature import peak_local_max

def cmask(index,array):
    radius = 3
    a,b = index
    nx,ny = array.shape
    y,x = np.ogrid[-a:nx-a,-b:ny-b]
    mask = x*x + y*y <= radius*radius
    return(sum(array[mask])) # summed intensity of the pixels inside the mask

images = ImageSequence('tryhard_red_small.tif')
frame_list = []
x = []
y = []
int_liposome = []
BG_liposome = []
for i in range(len(images[0])):
    tmp_frame = images[0][i]
    xy = pd.DataFrame(peak_local_max(tmp_frame, min_distance=8,threshold_abs=3000))
    x.extend(xy[0].tolist())
    y.extend(xy[1].tolist())
    for j in range(len(xy)):
        # use this frame's peaks; x[j], y[j] would index the start of the accumulated lists
        index = xy[0][j],xy[1][j]
        int_liposome.append(cmask(index,tmp_frame))
    frame_list.extend([i]*len(xy))
    print "Frame: ", i, "of ",len(images[0])
features = pd.DataFrame(
    {'lip_int':int_liposome,
     'y' : y,
     'x' : x,
     'frame' : frame_list})
Have you tried profiling the code, say with %prun or %lprun in IPython? That will tell you exactly where your slowdowns are occurring.
I can't make my own version of this without the TIFF stack, but I suspect the problem is that you're using lists to store everything. Every time you append or extend, Python may have to allocate more memory. You could try getting the total count of maxima first, then allocating your output arrays, then rerunning to fill the arrays. Something like this:
# run through once to get the count of local maxima
npeaks = (len(peak_local_max(f, min_distance=8, threshold_abs=3000))
          for f in images[0])
total_peaks = sum(npeaks)
# allocate storage arrays and rerun
x = np.zeros(total_peaks, float)  # np.float was removed from recent NumPy
y = np.zeros_like(x)
int_liposome = np.zeros_like(x)
BG_liposome = np.zeros_like(x)
frame_list = np.zeros(total_peaks, int)
index_0 = 0
for frame_ind, tmp_frame in enumerate(images[0]):
    peaks = pd.DataFrame(peak_local_max(tmp_frame, min_distance=8,threshold_abs=3000))
    index_1 = index_0 + len(peaks)
    # copy the data from the DataFrame's underlying numpy array
    x[index_0:index_1] = peaks[0].values
    y[index_0:index_1] = peaks[1].values
    # iterate over the rows; iterating the DataFrame itself yields column labels
    for i, peak in enumerate(peaks.values, index_0):
        int_liposome[i] = cmask(peak, tmp_frame)
    frame_list[index_0:index_1] = frame_ind
    # update the starting index
    index_0 = index_1
    print "Frame: ", frame_ind, "of ",len(images[0])
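A single-pass alternative to counting first and filling second is to keep one small array per frame and concatenate once at the end. This is a different technique from the two-pass preallocation described in the answer, shown only as a sketch: `find_peaks` is a hypothetical stand-in for `peak_local_max` (a plain threshold, so the block runs with NumPy alone), and `collect_peaks` is an assumed helper name.

```python
import numpy as np

def find_peaks(frame, threshold):
    """Hypothetical stand-in for peak_local_max: (row, col) of pixels above threshold."""
    return np.argwhere(frame > threshold)

def collect_peaks(frames, threshold):
    """Gather peak coordinates and frame numbers with one concatenation at the end."""
    coords, frame_ids = [], []
    for i, frame in enumerate(frames):
        rc = find_peaks(frame, threshold)            # shape (n_peaks, 2)
        coords.append(rc)
        frame_ids.append(np.full(len(rc), i, dtype=np.int64))
    if not coords:                                   # no frames at all
        empty = np.empty(0, dtype=np.int64)
        return empty, empty, empty
    coords = np.concatenate(coords)                  # one allocation for all frames
    return coords[:, 0], coords[:, 1], np.concatenate(frame_ids)
```

The detector runs only once per frame here, which matters if peak finding rather than list growth turns out to be the expensive step; profiling should settle that.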
Hello I am using Python to try to read the digit data provided by MNIST into a data structure I can use to train a neural network. I am testing to ensure the data was read properly by creating an image using PIL. The image that is being created is horribly wrong, and I am not sure if it is because I am using PIL incorrectly or my data structures and methods are not right.
The format of the two data files is described here:
http://yann.lecun.com/exdb/mnist/
Here are the applicable functions:
read_image_data reads the pixel data, organizing it into a list of 2D NumPy arrays
def read_image_data():
    fd = open("train-images.idx3-ubyte", "rb")
    images_bin_string = fd.read()
    num_images = struct.unpack(">i", images_bin_string[4:8])[0]
    image_data_bank = []
    uint32_num_bytes = 4
    current_index = 8
    num_rows = struct.unpack(">I", \
        images_bin_string[current_index: current_index + uint32_num_bytes])[0]
    num_cols = struct.unpack(">I", \
        images_bin_string[current_index + uint32_num_bytes: \
        current_index + uint32_num_bytes * 2])[0]
    current_index += 8
    i = 0
    while i < num_images:
        image_data = np.zeros([num_rows, num_cols])
        for j in range(num_rows - 1):
            for k in range(num_cols - 1):
                image_data[j][k] = images_bin_string[current_index + j * k]
        current_index += num_rows * num_cols
        i += 1
        image_data_bank.append(image_data)
    return image_data_bank
read_label_data reads the corresponding labels into a list
def read_label_data():
    fd = open("train-labels.idx1-ubyte", "rb")
    images_bin_string = fd.read()
    num_images = struct.unpack(">i", images_bin_string[4:8])[0]
    image_data_bank = []
    current_index = 8
    i = 0
    while i < num_images:
        image_data_bank.append(images_bin_string[current_index])
        current_index += 1
        i += 1
    return image_data_bank
collect_data zips the structures together
def collect_data():
    print("Reading image data...")
    image_data = read_image_data()
    print("Reading label data...")
    label_data = read_label_data()
    print("Zipping data sets...")
    all_data = np.array(list(zip(image_data, label_data)))
    return all_data
lastly run_test uses PIL to print the pixels from the first 28x28 np structure created by read_image_data
def run_test(data):
    example = data[0]
    pixel_data = example[0]
    number = example[1]
    print(number)
    im = Image.fromarray(pixel_data)
    im.show()
When I run the script:
Collecting data... Reading image data... Reading label data... Zipping
data sets... 5
I must be messing something up with the PIL library, but I do not know what.
That is a really weird looking 5. I am guessing that I went wrong somewhere in my organization of the data. The directions did say "Pixels are organized row-wise.", but I think I covered that by having my outer loop as the row loop then the inner as the column loop
UPDATE
I reversed the order of the row and column index in the np.arrays in read_image_data and it is making no difference.
image_data[k][j] = images_bin_string[current_index + j * k]
UPDATE
Ran quick test with matplotlib
import matplotlib.image as mpimg
import matplotlib.pyplot as plt
imgplot = plt.imshow(pixel_data)
plt.show()
Here is what I got from matplotlib
That means it is definitely a problem with my code and not the library. The question is whether it is the way I am passing the pixels to the imaging libraries or how I structured the data. If anyone can find the mistake, I would greatly appreciate it.
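One thing worth checking in read_image_data is the offset `current_index + j * k`: in a row-wise (row-major) layout, pixel (j, k) sits at `current_index + j * num_cols + k`, and `range(num_rows - 1)` / `range(num_cols - 1)` also skip the last row and column. A minimal sketch of a loop-free alternative follows, assuming the IDX3 layout described on the MNIST page (four big-endian 32-bit header fields followed by one unsigned byte per pixel); `parse_idx3_images` is a hypothetical helper name, not part of any library.

```python
import struct
import numpy as np

def parse_idx3_images(raw):
    """Parse the bytes of an IDX3 image file into a (n, rows, cols) uint8 array.

    Hypothetical helper; assumes a 16-byte header (magic, count, rows, cols)
    followed by row-major unsigned-byte pixels.
    """
    magic, num_images, num_rows, num_cols = struct.unpack(">IIII", raw[:16])
    pixels = np.frombuffer(raw, dtype=np.uint8, offset=16)
    n_pixels = num_images * num_rows * num_cols
    # reshape relies on the row-major order: pixel (j, k) is at j * num_cols + k
    return pixels[:n_pixels].reshape(num_images, num_rows, num_cols)
```

With the real file this would be called as `parse_idx3_images(open("train-images.idx3-ubyte", "rb").read())`. Each `images[i]` is then a uint8 array, which `Image.fromarray` treats as an 8-bit grayscale image.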
As the title suggests, I have an image whose pixel coordinates I want to change using a mathematical function. So far, I have the following code, which works but is very time consuming because of the nested loop. Do you have any suggestions to make it faster? To be quantitative, it takes about 2-2.5 minutes to complete the process on a 12-megapixel image.
imgcor = np.zeros(img.shape, dtype=img.dtype)
for f in range(rowc):
    for k in range(colc):
        offX = k + (f*b*c*(math.sin(math.radians(a))))
        offY = f + (f*b*d*(math.cos(math.radians(a))))
        imgcor[f, k] = img[int(offY)%rowc, int(offX)%colc]
P.S. I am using opencv 2.4.13 and python 2.7
There may be a way to get numpy to do some vectorized work for you, but one easy speedup is to not re-calculate some of the values every time you loop (I'm assuming a,b,c, and d are not changing in the loop). I'm curious what the speedup would be, can you report back?
imgcor = np.zeros(img.shape, dtype=img.dtype)
offX_precalc = b*c*(math.sin(math.radians(a)))
offY_precalc = b*d*(math.cos(math.radians(a)))
for f in range(rowc):
    for k in range(colc):
        offX = k + (f*offX_precalc)
        offY = f + (f*offY_precalc)
        imgcor[f, k] = img[int(offY)%rowc, int(offX)%colc]
OK, since the above was too slow, I added a bit of vectorization and I'm curious if it's faster:
imgcor = np.zeros(img.shape, dtype=img.dtype)
offX_precalc = b*c*(math.sin(math.radians(a)))
# the Y offset uses cos, as in the original loop: offY = f + f*b*d*cos(a)
offY_precalc = 1 + b*d*(math.cos(math.radians(a)))
for f in range(rowc):
    offY = int(f*offY_precalc)%rowc
    offXs = [int(k + (f*offX_precalc))%colc for k in range(colc)]
    imgcor[f,:] = img[offY, offXs]
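Both loops can also be pushed into NumPy with broadcasting and integer-array indexing. The sketch below is only an illustration of that idea for the formulas in the question (`offX = k + f*b*c*sin(a)`, `offY = f + f*b*d*cos(a)`); `remap_pixels` is a hypothetical function name, and `a`, `b`, `c`, `d` are assumed to be constant scalars.

```python
import math
import numpy as np

def remap_pixels(img, a, b, c, d):
    """Vectorized version of the nested-loop remapping (hypothetical helper)."""
    rowc, colc = img.shape[:2]
    sx = b * c * math.sin(math.radians(a))
    sy = b * d * math.cos(math.radians(a))
    f = np.arange(rowc)                      # destination row indices
    k = np.arange(colc)                      # destination column indices
    # astype truncates toward zero like int(); % wraps around like the loop
    src_rows = (f + f * sy).astype(np.int64) % rowc
    src_cols = (k[None, :] + (f * sx)[:, None]).astype(np.int64) % colc
    # fancy indexing gathers img[src_rows[f], src_cols[f, k]] for every (f, k)
    return img[src_rows[:, None], src_cols]
```

The index array `src_cols` holds one int64 per pixel, so a 12-megapixel image needs roughly 96 MB for it; processing the image in horizontal bands would reduce that if memory is tight.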
I open a TIFF LAB image and return a big numpy array (4928x3264x3 float64) using python with this function:
def readTIFFLAB(filename):
    """Read TIFF LAB and return a float matrix
    read 16 bit (2 byte) each time without any multiprocessing
    about 260 sec"""
    import numpy as np
    ....
    ....
    # Data read
    # Matrix creation
    dim = (int(ImageLength), int(ImageWidth), int(SamplePerPixel))
    Image = np.empty(dim, np.float64)
    contatore = 0
    for address in range(0, len(StripOffsets)):
        offset = StripOffsets[address]
        f.seek(offset)
        for lung in range(0, (StripByteCounts[address]/SamplePerPixel/2)):
            v = np.array(f.read(2))
            v.dtype = np.uint16
            v1 = np.array(f.read(2))
            v1.dtype = np.int16
            v2 = np.array(f.read(2))
            v2.dtype = np.int16
            v = np.array([v/65535.0*100])
            v1 = np.array([v1/32768.0*128])
            v2 = np.array([v2/32768.0*128])
            v = np.append(v, [v1, v2])
            riga = contatore // ImageWidth
            colonna = contatore % ImageWidth
            # print(contatore, riga, colonna)
            Image[riga, colonna, :] = v
            contatore += 1
    return(Image)
but this routine needs about 270 seconds to do all the work and return a NumPy array.
I tried to use multiprocessing, but it is not possible to share an array or to use a queue to pass it, and sharedmem is not usable on Windows (at home I use openSUSE, but at work I must use Windows).
Could someone help me reduce the processing time? I have read about threading and about writing some parts in C, but I don't understand which is the best (and easiest) solution... I'm a food technologist, not a real programmer :-)
Thanks
Wow, your method is really slow indeed. Try the tifffile library. It will open your file very quickly; then you just need to do the proper conversion. Here is a simple usage example:
import numpy as np
import tifffile
from skimage import color
import time
import matplotlib.pyplot as plt
def convert_to_tifflab(image):
    # divide the color channel
    L = image[:, :, 0]
    a = image[:, :, 1]
    b = image[:, :, 2]
    # correct interpretation of a/b channel
    a.dtype = np.int16
    b.dtype = np.int16
    # scale the result
    L = L / 65535.0 * 100
    a = a / 32768.0 * 128
    b = b / 32768.0 * 128
    # join the result
    lab = np.dstack([L, a, b])
    # view the image
    start = time.time()
    rgb = color.lab2rgb(lab)
    print "Lab2Rgb: {0}".format(time.time() - start)
    return rgb

if __name__ == "__main__":
    filename = '/home/cilladani1/FERRERO/Immagini Digi Eye/Test Lettura CIELAB/TestLetturaCIELAB (LAB).tif'
    start = time.time()
    I = tifffile.imread(filename)
    end = time.time()
    print "Image fetching: {0}".format(end - start)
    rgb = convert_to_tifflab(I)
    print "Image conversion: {0}".format(time.time() - end)
    plt.imshow(rgb)
    plt.show()
The benchmark gives this data:
Image fetching: 0.0929999351501
Lab2Rgb: 12.9520001411
Image conversion: 13.5920000076
As you can see, the bottleneck in this case is lab2rgb, which converts from Lab to RGB space (going through XYZ); reading the file itself takes under 0.1 s. I'd recommend reporting an issue to the author of tifffile requesting support for reading your file format; I'm sure he'll be able to speed this up directly in the C code.
After doing what BPL suggested, I modified the resulting array as follows:
# divide the color channel
L = I[:, :, 0]
a = I[:, :, 1]
b = I[:, :, 2]
# correct interpretation of a/b channel
a.dtype = np.int16
b.dtype = np.int16
# scale the result
L = L / 65535.0 * 100
a = a / 32768.0 * 128
b = b / 32768.0 * 128
# join the result
lab = np.dstack([L, a, b])
# view the image
from skimage import color
rgb = color.lab2rgb(lab)
plt.imshow(rgb)
So now it is easier to read a TIFF LAB image.
Thanks, BPL
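The reinterpretation of the a/b channels can also be written with `ndarray.view`, which leaves the input array untouched instead of assigning to `.dtype` in place. This is a minimal sketch under the same assumptions as the code above (an `(H, W, 3)` uint16 array with L stored unsigned and a/b stored as signed 16-bit values); `tifflab_to_float` is a hypothetical helper name.

```python
import numpy as np

def tifflab_to_float(raw):
    """Convert a raw (H, W, 3) uint16 TIFF LAB array to float64 CIELAB values.

    Hypothetical helper; scaling factors are the ones used in the thread.
    """
    raw = np.ascontiguousarray(raw, dtype=np.uint16)
    L = raw[..., 0] / 65535.0 * 100.0
    # same bytes reinterpreted as signed 16-bit; no copy, input left unchanged
    ab = raw[..., 1:].view(np.int16) / 32768.0 * 128.0
    return np.dstack([L, ab])
```

The result can be passed to `color.lab2rgb` exactly like `lab` in the snippet above.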
I often need to stack 2d numpy arrays (tiff images). For that, I first append them in a list and use np.dstack. This seems to be the fastest way to get 3D array stacking images. But, is there a faster/memory-efficient way?
from time import time
import numpy as np
# Create 100 images of the same dimension 256x512 (8-bit).
# In reality, each image comes from a different file
img = np.random.randint(0,255,(256, 512, 100))
t0 = time()
temp = []
for n in range(100):
    temp.append(img[:,:,n])
stacked = np.dstack(temp)
#stacked = np.array(temp) # much slower 3.5 s for 100
print time()-t0 # 0.58 s for 100 frames
print stacked.shape
# dstack in each loop is slower
t0 = time()
temp = img[:,:,0]
for n in range(1, 100):
    temp = np.dstack((temp, img[:,:,n]))
print time()-t0 # 3.13 s for 100 frames
print temp.shape
# counter-intuitive but preallocation is slightly slower
stacked = np.empty((256, 512, 100))
t0 = time()
for n in range(100):
    stacked[:,:,n] = img[:,:,n]
print time()-t0 # 0.651 s for 100 frames
print stacked.shape
# (Edit) As in the accepted answer, re-arranging axis to mainly use
# the first axis to access data improved the speed significantly.
img = np.random.randint(0,255,(100, 256, 512))
stacked = np.empty((100, 256, 512))
t0 = time()
for n in range(100):
    stacked[n,:,:] = img[n,:,:]
print time()-t0 # 0.08 s for 100 frames
print stacked.shape
After some joint effort with otterb, we concluded that preallocating the array is the way to go. Apparently the performance-killing bottleneck was the array layout, with the image number (n) being the fastest-changing index. If we make n the first index of the array (which defaults to "C" ordering: the first index changes slowest, the last index changes fastest), we get the best performance:
from time import time
import numpy as np
# Create 100 images of the same dimension 256x512 (8-bit).
# In reality, each image comes from a different file
img = np.random.randint(0,255,(100, 256, 512))
# preallocate with the image number as the first (slowest-changing) axis
stacked = np.empty((100, 256, 512))
t0 = time()
for n in range(100):
    stacked[n] = img[n]
print time()-t0
print stacked.shape
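The same layout idea can be wrapped in a small function for the real case where every frame comes from a different file. This is a sketch only: `stack_frames` is a hypothetical helper, the frame count and shape are assumed to be known in advance, and the dtype is passed explicitly because `np.empty` defaults to float64, which takes eight times the memory of 8-bit data.

```python
import numpy as np

def stack_frames(frames, n_frames, frame_shape, dtype=np.uint8):
    """Fill a preallocated (n_frames, H, W) array from an iterable of 2D frames."""
    stacked = np.empty((n_frames,) + tuple(frame_shape), dtype=dtype)
    for n, frame in enumerate(frames):
        # with C ordering, stacked[n] is one contiguous block of memory
        stacked[n] = frame
    return stacked
```

When all frames are already in a list, `np.stack(frames, axis=0)` produces the same layout in a single call.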