get a numpy array from a sequence of images in a folder - python

I have a folder, say video1, with a bunch of images in order: frame_00.png, frame_01.png, ...
What I want is a 4D numpy array in the format (number of frames, w, h, 3).
This is what I did, but I think it is quite slow. Is there any faster or more efficient method to achieve the same thing?
folder = "video1/"
import os
images = sorted(os.listdir(folder))  # ["frame_00.png", "frame_01.png", ...]
from PIL import Image
import numpy as np

video_array = []
for image in images:
    im = Image.open(folder + image)
    video_array.append(np.asarray(im))  # .transpose(1, 0, 2))
video_array = np.array(video_array)
print(video_array.shape)
# (75, 50, 100, 3)

There's an older SO thread that goes into a great deal of detail (perhaps even a bit too much) on this very topic. Rather than vote to close this question as a dup, I'm going to give a quick rundown of that thread's top bullet points:
The fastest commonly available image reading function is imread from the cv2 package.
Reading the images in and then adding them to a plain Python list (as you are already doing) is the fastest approach for reading in a large number of images.
However, given that you are eventually converting the list of images to an array of images, every possible method of building up an array of images is almost exactly as fast as any other.
Although, interestingly enough, if you take the approach of assigning images directly to a preallocated array, it actually matters which indices (i.e. which dimension) you assign to in terms of getting optimal performance.
So basically, you're not going to be able to get much faster while working in pure, single-threaded Python. You might get a boost from switching to cv2.imread (in place of PIL.Image.open).
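For reference, here's a minimal sketch of that preallocated-array approach with cv2.imread, assuming every frame has the same, known size:

import cv2
import numpy as np

def read_frames_preallocated(filenames, h, w):
    # Assign along the first axis; per the benchmarks above, which
    # dimension you assign into affects performance.
    video = np.empty((len(filenames), h, w, 3), dtype=np.uint8)
    for i, name in enumerate(filenames):
        video[i] = cv2.imread(name)  # note: cv2 loads BGR, not RGB
    return video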

PNG is an extremely slow format, so if you can use almost anything else, you'll see a big speedup.
For example, here's an opencv version of your program that gets the filenames from command-line args:
#!/usr/bin/python3
import sys
import cv2
import numpy as np

video_array = []
for filename in sys.argv[1:]:
    im = cv2.imread(filename)
    video_array.append(np.asarray(im))
video_array = np.array(video_array)
print(video_array.shape)
I can run it like this:
$ mkdir sample
$ for i in {1..100}; do cp ~/pics/k2.png sample/$i.png; done
$ time ./readframes.py sample/*.png
(100, 2048, 1450, 3)
real 0m6.063s
user 0m5.758s
sys 0m0.839s
So 6s to read 100 PNG images. If I try with TIFF instead:
$ for i in {1..100}; do cp ~/pics/k2.tif sample/$i.tif; done
$ time ./readframes.py sample/*.tif
(100, 2048, 1450, 3)
real 0m1.532s
user 0m1.060s
sys 0m0.843s
1.5s, so four times faster.
You might get a small speedup with pyvips:
#!/usr/bin/python3
import sys
import pyvips
import numpy as np

# map vips formats to np dtypes
format_to_dtype = {
    'uchar': np.uint8,
    'char': np.int8,
    'ushort': np.uint16,
    'short': np.int16,
    'uint': np.uint32,
    'int': np.int32,
    'float': np.float32,
    'double': np.float64,
    'complex': np.complex64,
    'dpcomplex': np.complex128,
}

# vips image to numpy array
def vips2numpy(vi):
    return np.ndarray(buffer=vi.write_to_memory(),
                      dtype=format_to_dtype[vi.format],
                      shape=[vi.height, vi.width, vi.bands])

video_array = []
for filename in sys.argv[1:]:
    vi = pyvips.Image.new_from_file(filename, access='sequential')
    video_array.append(vips2numpy(vi))
video_array = np.array(video_array)
print(video_array.shape)
I see:
$ time ./readframes.py sample/*.tif
(100, 2048, 1450, 3)
real 0m1.360s
user 0m1.629s
sys 0m2.153s
Another 10% or so.
Finally, as other posters have said, you could load frames in parallel. That wouldn't help TIFF much, but it would certainly boost PNG.
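One possible shape for that is a thread pool around cv2.imread; OpenCV releases the GIL while decoding, so plain threads usually scale here. A sketch:

#!/usr/bin/python3
import sys
from concurrent.futures import ThreadPoolExecutor

import cv2
import numpy as np

# Decode frames in parallel; map() preserves the input order.
with ThreadPoolExecutor() as pool:
    frames = list(pool.map(cv2.imread, sys.argv[1:]))
video_array = np.array(frames)
print(video_array.shape)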

Related

High performance (python) library for reading tiff files?

I am using code to read a .tiff file in order to calculate a fractal dimension. My code looks like this:
import matplotlib.pyplot as plt

raster = plt.imread('xyz.tif')
for i in range(x1, x2):
    for j in range(y1, y2):
        pixel = raster[i][j]
This works, but I have to read a lot of pixels, so I would like it to be fast (and ideally to minimize electricity usage, given current events). Is there a better library than matplotlib for this purpose? For example, could using a library specialized for matrix operations, such as pandas, help? Additionally, would another language such as C have better performance than Python?
Edit: @cgohlke in the comments and others have found that cv2 is slower than tifffile for large and/or compressed images. It is best to test the different options on realistic data for your application.
I have found cv2 to be the fastest library for this. Using 5000 128x128 uint16 tif images gives the following result:
import time
import matplotlib.pyplot as plt

t0 = time.time()
for file in files:  # files is the list of the 5000 tif paths
    raster = plt.imread(file)
print(f'{time.time()-t0:.2f} s')

1.52 s

import time
import numpy as np
from PIL import Image

t0 = time.time()
for file in files:
    im = np.array(Image.open(file))
print(f'{time.time()-t0:.2f} s')

1.42 s

import time
import tifffile

t0 = time.time()
for file in files:
    im = tifffile.imread(file)
print(f'{time.time()-t0:.2f} s')

1.25 s

import time
import cv2

t0 = time.time()
for file in files:
    im = cv2.imread(file, cv2.IMREAD_UNCHANGED)
print(f'{time.time()-t0:.2f} s')

0.20 s
cv2 is a computer vision library written in C++, which, as the other commenter mentioned, is much faster than pure Python. Note the cv2.IMREAD_UNCHANGED flag; otherwise cv2 will convert monochrome images to 8-bit RGB.
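To see what the flag changes, a quick check (a sketch; 'img_0001.tif' stands in for one of the 16-bit files):

import cv2

im_default = cv2.imread('img_0001.tif')                    # converted to 3-channel 8-bit
im_raw = cv2.imread('img_0001.tif', cv2.IMREAD_UNCHANGED)  # stays single-channel uint16
print(im_default.dtype, im_default.shape)  # uint8 (128, 128, 3)
print(im_raw.dtype, im_raw.shape)          # uint16 (128, 128)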
I am not sure which library is the fastest, but I have had very good experience with Pillow:
from PIL import Image
raster = Image.open('xyz.tif')
then you could convert it to a numpy array:
import numpy
pixels = numpy.array(raster)
I would need to see the rest of the code to be able to recommend any other libraries. As for the language, C or C++ would have better performance, as they are low-level languages; depending on how complex your operations are and how much data you need to process, C++ code has been shown to be 10-200x faster (increasing with the complexity of the calculations). Hope this helps; if you have any further questions, just ask.

Images generated from the same array are different during image compression

In this code sample, the assertion in the function fails.
from pathlib import Path

import numpy as np
import PIL.Image

def make_images(tmp_path):
    np.random.seed(0)
    shape = (4, 6, 3)
    rgb = np.random.randint(0, 256, shape, dtype=np.uint8)
    test_image = PIL.Image.fromarray(rgb)
    image_path = tmp_path / 'test_image.jpg'
    test_image.save(image_path)
    return image_path, rgb

def test_Image_load_rgb(tmp_path):
    image_path, original_rgb = make_images(tmp_path)
    rgb2 = np.array(PIL.Image.open(image_path))
    assert np.array_equal(rgb2, original_rgb)

if __name__ == '__main__':
    test_Image_load_rgb(Path('.'))  # pytest injects tmp_path; use the cwd when run as a script
When I look at the two arrays, original_rgb and rgb2, they have different values, so of course it is failing, but I don't understand why their values differ.
Opening them both as images using PIL.Image.fromarray(), they look similar but not the same: the brightness values are slightly altered.
I don't understand why this is.
Note: This fails the same way both under pytest and when run as a script.
It occurred to me to test this with BMP and PNG images, and this problem does not happen with them.
So it occurs to me that the JPG compression process somehow alters the data slightly, since it is lossy compression.
But I was surprised that it would have an effect on such a small and light image.
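A quick way to confirm that the lossy compression is the culprit is a lossless round trip; this minimal sketch passes:

import numpy as np
import PIL.Image

rgb = np.random.randint(0, 256, (4, 6, 3), dtype=np.uint8)
PIL.Image.fromarray(rgb).save('test_image.png')  # PNG is lossless
rgb2 = np.array(PIL.Image.open('test_image.png'))
assert np.array_equal(rgb2, rgb)  # passes; the same test with .jpg fails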
I am leaving this question up in case someone else stumbles onto this.
Anyone offering a more detailed explanation would be great!
UPDATE: I noticed the colors in BMP/PNG are much different from the JPG. Any reason why?

how can I load a single tif image in parts into numpy array without loading the whole image into memory?

There is a 4GB .TIF image that needs to be processed. Due to a memory constraint, I can't load the whole image into a numpy array, so I need to load it lazily, in parts, from the hard disk.
Basically, I need to read and process the image a piece at a time, and it needs to be done in Python as a project requirement. I also tried looking at the tifffile library on PyPI, but I found nothing useful. Please help.
pyvips can do this. For example:
import sys

import numpy as np
import pyvips

image = pyvips.Image.new_from_file(sys.argv[1], access="sequential")
for y in range(0, image.height, 100):
    area_height = min(image.height - y, 100)
    area = image.crop(0, y, image.width, area_height)
    array = np.ndarray(buffer=area.write_to_memory(),
                       dtype=np.uint8,
                       shape=[area.height, area.width, area.bands])
The access option to new_from_file turns on sequential mode: pyvips will only load pixels from the file on demand, with the restriction that you must read pixels out top to bottom.
The loop runs down the image in blocks of 100 scanlines. You can tune this, of course.
I can run it like this:
$ vipsheader eso1242a-pyr.tif
eso1242a-pyr.tif: 108199x81503 uchar, 3 bands, srgb, tiffload_stream
$ /usr/bin/time -f %M:%e ./sections.py ~/pics/eso1242a-pyr.tif
273388:479.50
So on this sad old laptop it took 8 minutes to scan a 108,000 x 82,000 pixel image and needed a peak of 270MB of memory.
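If the processing does have to happen in numpy, here is a sketch of per-block work in the same style: it accumulates a per-channel mean without ever holding more than one block in memory:

import sys

import numpy as np
import pyvips

image = pyvips.Image.new_from_file(sys.argv[1], access="sequential")
channel_sum = np.zeros(image.bands, dtype=np.float64)
for y in range(0, image.height, 100):
    area = image.crop(0, y, image.width, min(image.height - y, 100))
    block = np.ndarray(buffer=area.write_to_memory(),
                       dtype=np.uint8,
                       shape=[area.height, area.width, area.bands])
    channel_sum += block.sum(axis=(0, 1))
print(channel_sum / (image.width * image.height))  # per-channel mean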
What processing are you doing? You might be able to do the whole thing in pyvips. It's quite a bit quicker than numpy.
import pyvips

img = pyvips.Image.new_from_file("space.tif", access='sequential')
out = img.resize(0.01, kernel="linear")
out.write_to_file("resized_image.jpg")
If you want to convert the file to another format with a smaller size, this code will be enough, and it will do it without any memory spike and in very little time.

Fast way to import and crop a jpeg in python lib

I have a python app that imports 200k+ images, crops them, and presents the cropped image to pyzbar to interpret a barcode. Cropping helps because there are multiple barcodes on the image and, presumably, pyzbar is a little faster when given smaller images.
Currently I am using Pillow to import and crop the image.
On average, importing and cropping an image takes 262 msecs and pyzbar takes 8 msecs.
A typical run is about 21 hours.
I wonder if a library other than Pillow might offer substantial improvements in loading/cropping. Ideally the library should be available for macOS, but I could also run the whole thing in a virtual Ubuntu machine.
I am working on a version that can run in parallel processes, which will be a big improvement, but if I could get a 25% or greater speed increase from a different library, I would also add that.
As you didn't provide a sample image, I made a dummy file with dimensions 2544x4200 and around 1.1MB in size. I made 1,000 copies of that image and processed all 1,000 images for each benchmark.
As you only gave your code in the comments area, I took it, formatted it and made the best I could of it. I also put it in a loop so it can process many files for just one invocation of the Python interpreter - this becomes important when you have 20,000 files.
That looks like this:
#!/usr/bin/env python3
import sys
from PIL import Image

# Process all input files so we only incur Python startup overhead once
for filename in sys.argv[1:]:
    print(f'Processing: {filename}')
    imgc = Image.open(filename).crop((0, 150, 270, 1050))
My suspicion is that I can make that faster using:
GNU Parallel, and/or
pyvips
Here is a pyvips version of your code:
#!/usr/bin/env python3
import sys
import pyvips
import numpy as np

# Process all input files so we only incur Python startup overhead once
for filename in sys.argv[1:]:
    print(f'Processing: {filename}')
    img = pyvips.Image.new_from_file(filename, access='sequential')
    roi = img.crop(0, 150, 270, 900)
    mem_img = roi.write_to_memory()
    # Make a numpy array from that buffer object
    nparr = np.ndarray(buffer=mem_img, dtype=np.uint8,
                       shape=[roi.height, roi.width, roi.bands])
Here are the results:
Sequential original code
./orig.py bc*jpg
224 seconds, i.e. 224 ms per image, same as you
Parallel original code
parallel ./orig.py ::: bc*jpg
55 seconds
Parallel original code but passing as many filenames as possible
parallel -X ./orig.py ::: bc*jpg
42 seconds
Sequential pyvips
./vipsversion bc*
30 seconds, i.e. 7x as fast as PIL which was 224 seconds
Parallel pyvips
parallel ./vipsversion ::: bc*
32 seconds
Parallel pyvips but passing as many filenames as possible
parallel -X ./vipsversion ::: bc*
5.2 seconds, i.e. this is the way to go :-)
Note that you can install GNU Parallel on macOS with homebrew:
brew install parallel
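If you'd rather keep the parallelism inside one Python program instead of using GNU Parallel, a multiprocessing sketch of the same idea (reusing the crop box from above) could look like this:

#!/usr/bin/env python3
import sys
from multiprocessing import Pool
from PIL import Image

def crop_one(filename):
    # Load and crop in the worker; in the real app the cropped image
    # would be handed to pyzbar here as well.
    with Image.open(filename) as img:
        return img.crop((0, 150, 270, 1050))

if __name__ == '__main__':
    with Pool() as pool:
        cropped = pool.map(crop_one, sys.argv[1:])
    print(f'Cropped {len(cropped)} images')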
You might take a look at PyTurboJPEG, which is a Python wrapper for libjpeg-turbo with insanely fast rescaling (1/2, 1/4, 1/8) while decoding large JPEG images; the returned numpy.ndarray is handy for image cropping. Moreover, JPEG image encoding speed is also remarkable.
from turbojpeg import TurboJPEG

# specifying library path explicitly
# jpeg = TurboJPEG(r'D:\turbojpeg.dll')
# jpeg = TurboJPEG('/usr/lib64/libturbojpeg.so')
# jpeg = TurboJPEG('/usr/local/lib/libturbojpeg.dylib')

# using default library installation
jpeg = TurboJPEG()

# direct rescaling 1/2 while decoding input.jpg to BGR array
in_file = open('input.jpg', 'rb')
bgr_array_half = jpeg.decode(in_file.read(), scaling_factor=(1, 2))
in_file.close()

# encoding the BGR array to output.jpg with default settings
out_file = open('output.jpg', 'wb')
out_file.write(jpeg.encode(bgr_array_half))
out_file.close()
libjpeg-turbo prebuilt binaries for macOS and Linux are also available here.
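Since jpeg.decode() already returns a numpy array, the crop itself is just a slice. A sketch (decoding at full size so the crop box from the question applies unchanged):

from turbojpeg import TurboJPEG

jpeg = TurboJPEG()
with open('input.jpg', 'rb') as f:
    bgr_array = jpeg.decode(f.read())
# PIL's crop((0, 150, 270, 1050)) becomes rows 150:1050, columns 0:270
barcode_region = bgr_array[150:1050, 0:270]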

Save 1 bit deep binary image in Python

I have a binary image in Python and I want to save it to my PC.
I need it to be a 1-bit-deep PNG image once stored on my computer.
How can I do that? I tried with both PIL and cv2, but I'm not able to save it with 1-bit depth.
I found myself in a situation where I needed to create a lot of binary images, and was frustrated with the available info online. Thanks to the answers and comments here and elsewhere on SO, I was able to find an acceptable solution. The comment from @Jimbo was the best so far. Here is some code to reproduce my exploration of some ways to save binary images in Python:
Load libraries and data:
from skimage import data, io, util #'0.16.2'
import matplotlib.pyplot as plt #'3.0.3'
import PIL #'6.2.1'
import cv2 #'4.1.1'
check = util.img_as_bool(data.checkerboard())
The checkerboard image from skimage has dimensions of 200x200. Without compression, as a 1-bit image it should be represented by 200*200/8 = 5000 bytes.
To save with skimage, note that the package will complain if the data is not uint, hence the conversion. Saving the image takes an average of 2.8ms and produces a 408 byte file:
io.imsave('bw_skimage.png', util.img_as_uint(check), plugin='pil', optimize=True, bits=1)
Using matplotlib, 4.2ms and 693 byte file size
plt.imsave('bw_mpl.png', check, cmap='gray')
Using PIL, 0.5ms and 164 byte file size
img = PIL.Image.fromarray(check)
img.save('bw_pil.png', bits=1, optimize=True)
cv2 also complains about a bool input. The following command takes 0.4ms and results in a 2566 byte file, despite the PNG compression...
_ = cv2.imwrite('bw_cv2.png', check.astype(int), [cv2.IMWRITE_PNG_BILEVEL, 1])
PIL was clearly the best for speed and file size.
I certainly missed some optimizations, comments welcome!
Use:
cv2.imwrite(<image_name>, img, [cv2.IMWRITE_PNG_BILEVEL, 1])
(this will still use compression, so in practice it will most likely have less than 1 bit per pixel)
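As a quick sanity check that the output really is bilevel, a sketch that writes a random binary image and inspects it with PIL (the 0/1 input mirrors the astype(int) call earlier):

import cv2
import numpy as np
from PIL import Image

binary = (np.random.rand(64, 64) > 0.5).astype(np.uint8)
cv2.imwrite('bilevel.png', binary, [cv2.IMWRITE_PNG_BILEVEL, 1])
print(Image.open('bilevel.png').mode)  # '1' means one bit per pixel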
If you're not loading PNGs or anything, the format behaves pretty reasonably if you just write it yourself. Then your code doesn't need PIL or any of the headaches of various imports and imports on imports, etc.
import struct
import zlib
from math import ceil

def write_png_1bit(buf, width, height, stride=None):
    if stride is None:
        stride = int(ceil(width / 8))
    # One filter-type byte (0 = None) before each of the height scanlines
    raw_data = b"".join(
        b'\x00' + buf[span:span + stride]
        for span in range(0, height * stride, stride))

    def png_pack(png_tag, data):
        chunk_head = png_tag + data
        return (struct.pack("!I", len(data)) + chunk_head +
                struct.pack("!I", 0xFFFFFFFF & zlib.crc32(chunk_head)))

    return b"".join([
        b'\x89PNG\r\n\x1a\n',
        # IHDR: width, height, bit depth 1, color type 0 (grayscale),
        # compression 0, filter 0, interlace 0
        png_pack(b'IHDR', struct.pack("!2I5B", width, height, 1, 0, 0, 0, 0)),
        png_pack(b'IDAT', zlib.compress(raw_data, 9)),
        png_pack(b'IEND', b'')])
Adapted from:
http://code.activestate.com/recipes/577443-write-a-png-image-in-native-python/ (MIT)
by reading the png spec:
https://www.w3.org/TR/PNG-Chunks.html
Keep in mind that the 1-bit data in buf should be written left to right, as the PNG spec wants in normal non-interlaced mode (which we declared). Excess bits pad out the final byte of each scanline if needed, and stride is the number of bytes needed to encode one scanline. Also, if you want those 1-bit values to have palette colors, you'll have to write a PLTE chunk and switch the color type to 3 rather than 0. Etc.
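A usage sketch, pairing the writer with numpy's packbits, which produces exactly this MSB-first, row-padded layout:

import numpy as np

# Assumes write_png_1bit() from above is in scope.
check = np.random.rand(200, 200) > 0.5      # a random bool image
buf = np.packbits(check, axis=1).tobytes()  # MSB-first, rows padded to whole bytes
with open('bw_native.png', 'wb') as f:
    f.write(write_png_1bit(buf, 200, 200))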
