Decoding a YUV image in Python OpenCV - python

I have a YUV420_SP_NV21 image represented as a byte array (no headers), taken from an Android preview frame, and I need to decode it into an RGB image.
I've done this in Android apps before, using Java and OpenCV4Android:
convert_mYuv = new Mat(height + height / 2, width, CvType.CV_8UC1);
convert_mYuv.put( 0, 0, data );
Imgproc.cvtColor( convert_mYuv, convert_mRgba, type, channels );
I've tried doing the same in Python:
nmp = np.array(byteArray, dtype=np.ubyte)
RGBMatrix = cv2.cvtColor(nmp, cv2.COLOR_YUV420P2RGB)
but RGBMatrix remains None.
I'm aware that I could do this myself, and I'm familiar with the formula, but I would be really happy to do this the OpenCV way.
How can this be done?
I've also tried cv2.imdecode(), but it failed too, possibly because I misused it.

It's been a while since I solved this issue, but I hadn't had the time to update this question.
I ended up expanding the YUV420_SP_NV21 byte array into a full-size YUV image (1280x720 in my case) using numpy, then converting it to RGB using OpenCV.
import cv2
import numpy as np
def YUVtoRGB(byteArray):
    # Assumes byteArray is a 1-D uint8 NumPy array holding one 1280x720 NV21 frame
    e = 1280*720
    # Luma plane: the first width*height bytes
    Y = byteArray[0:e]
    Y = np.reshape(Y, (720,1280))
    s = e
    # Chroma follows as interleaved V,U pairs at half resolution in both directions
    V = byteArray[s::2]
    V = np.repeat(V, 2, 0)
    V = np.reshape(V, (360,1280))
    V = np.repeat(V, 2, 0)
    U = byteArray[s+1::2]
    U = np.repeat(U, 2, 0)
    U = np.reshape(U, (360,1280))
    U = np.repeat(U, 2, 0)
    RGBMatrix = (np.dstack([Y,U,V])).astype(np.uint8)
    RGBMatrix = cv2.cvtColor(RGBMatrix, cv2.COLOR_YUV2RGB)
    return RGBMatrix
My raw data was YUV420_SP_NV21, which is laid out like this: YY...YVUVUVU...VU,
but other YUV formats will require subtle changes to the code.
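The same unpacking can be written without the hard-coded 1280x720 size. The following is only a sketch: nv21_to_yuv444 and its width/height parameters are names made up for illustration, and it assumes even dimensions and a buffer with no row padding.

```python
import numpy as np

def nv21_to_yuv444(buf, width, height):
    """Unpack a raw NV21 buffer into a (height, width, 3) array of Y, U, V.

    Hypothetical helper: assumes even width/height and no row padding.
    """
    data = np.frombuffer(buf, dtype=np.uint8)
    n = width * height
    # Luma plane: one byte per pixel
    y = data[:n].reshape(height, width)
    # Chroma plane: interleaved V,U pairs, one pair per 2x2 pixel block
    vu = data[n:n + n // 2].reshape(height // 2, width // 2, 2)
    # Upsample each chroma sample to cover its 2x2 block
    v = vu[:, :, 0].repeat(2, axis=0).repeat(2, axis=1)
    u = vu[:, :, 1].repeat(2, axis=0).repeat(2, axis=1)
    return np.dstack([y, u, v])
```

The result can be passed to cv2.cvtColor with cv2.COLOR_YUV2RGB as in the answer above. OpenCV also defines a cv2.COLOR_YUV2RGB_NV21 code that takes the single-channel (height * 3 / 2, width) layout directly, which may be closer to "the OpenCV way" the question asks for.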

Related

Why do I get different results, when I export data to *.vtk-files with MATLAB or Python, respectively?

I have some trouble exporting vector data to the *.vtk file format for later use in ParaView. Until now I was using MATLAB, and in particular a script called vtkwrite.m, which can be found here. This works fine so far, but I want to switch to Python using tvtk for licensing reasons.
I managed to export my vector data with tvtk and Python to the *.vtk format, but compared to the MATLAB-exported data, the files are quite different! First of all, the MATLAB version is almost one and a half times as big as the Python version (67.2MB vs. 46.2MB). Also, when I visualize streamlines in ParaView, the two data sets look quite different. The MATLAB data is much smoother than the Python version.
What is the reason for these differences?
Here is the code I used to export the data. Consider vx, vy, vz to be the 3D velocity components I want to process.
1) MATLAB:
[x,y,z]=ndgrid(1:size(vx,1),1:size(vx,2),1:size(vx,3));
vtkwrite('/pathToFile/filename.vtk','structured_grid',x,y,z,'vectors','velocity',vx,vy,vz);
2) Python
import numpy as np
from tvtk.api import tvtk, write_data
dim=vx.shape
xx,yy,zz=np.mgrid[0:dim[0],0:dim[1],0:dim[2]]
pts = np.empty(dim + (3,), dtype=int)
pts[..., 0] = xx
pts[..., 1] = yy
pts[..., 2] = zz
vectors = np.empty(dim + (3,), dtype=float)
vectors[..., 0] = vx
vectors[..., 1] = vy
vectors[..., 2] = vz
pts = pts.transpose(2, 1, 0, 3).copy()
pts.shape = pts.size // 3, 3
vectors = vectors.transpose(2, 1, 0, 3).copy()
vectors.shape = vectors.size // 3, 3
sg = tvtk.StructuredGrid(dimensions=xx.shape, points=pts)
sg.point_data.vectors = vectors
sg.point_data.vectors.name = 'velocity'
write_data(sg, '/pathToFile/filename.vtk')
As you can see, the workflow in Python is way more difficult, so maybe I made a mistake here?!
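One thing that can be checked independently of tvtk is the point ordering produced by the transpose-and-reshape step. As far as I know, VTK structured grids expect the points flattened with x varying fastest, which is what transpose(2, 1, 0, 3) followed by a reshape gives. A minimal NumPy-only sketch (flatten_for_vtk is a hypothetical name, not part of tvtk):

```python
import numpy as np

def flatten_for_vtk(field):
    """Flatten an (nx, ny, nz, ncomp) array so that x varies fastest.

    Hypothetical helper mirroring the transpose/reshape in the code above.
    """
    ncomp = field.shape[-1]
    # Reverse the spatial axes, then flatten in C order
    return field.transpose(2, 1, 0, 3).reshape(-1, ncomp)

# Small grid of integer coordinates to inspect the resulting order
pts = np.stack(np.mgrid[0:2, 0:3, 0:4], -1)
flat = flatten_for_vtk(pts)
```

Printing the first few rows of flat shows the x coordinate changing first, then y, then z.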
If the Python code is working for you, I would wrap it in a function and simplify it as follows:
import numpy as np
from tvtk.api import tvtk, write_data
def vtk_save(
        filepath,
        v_arr,
        x_arr=None,
        label=None,
        ndim=3):
    # Accept either a stacked array or a sequence of components (vx, vy, vz)
    if not isinstance(v_arr, np.ndarray):
        v_arr = np.stack(v_arr, -1)
    base_shape = v_arr.shape[:ndim]
    if x_arr is None:
        # Default to integer grid coordinates, as np.mgrid gives in the question
        x_arr = np.stack(
            np.mgrid[tuple(slice(0, dim) for dim in base_shape)], -1)
    elif not isinstance(x_arr, np.ndarray):
        x_arr = np.stack(x_arr, -1)
    # Reverse the spatial axes before flattening, as in the question's code
    axes = tuple(range(ndim))[::-1] + (ndim,)
    v_arr = v_arr.transpose(axes).reshape(-1, ndim)
    x_arr = x_arr.transpose(axes).reshape(-1, ndim)
    sg = tvtk.StructuredGrid(
        dimensions=base_shape, points=x_arr)
    sg.point_data.vectors = v_arr
    sg.point_data.vectors.name = label
    write_data(sg, filepath)
which could be used like:
vtk_save('/pathToFile/filename.vtk', [vx, vy, vz], label='velocity')
Note that this code was written on the fly and is untested, so it may still contain silly bugs.

Find average colour of each section of an image

I am looking for the best way to achieve the following using Python:
Import an image.
Add a grid of n sections (4 shown in this example below).
For each section find the dominant colour.
Desired output
Output an array, list, dict or similar capturing these dominant colour values.
Maybe even a Matplotlib graph showing the colours (like pixel art).
What have I tried?
The image could be sliced using image slicer:
import image_slicer
image_slicer.slice('image_so_grid.png', 4)
I could then potentially use something like this to get the average colour, but I'm sure there are better ways to do this.
What are the best ways to do this with Python?
This works for 4 sections, but you'll need to figure out how to make it work for 'n' sections:
import cv2
img = cv2.imread('image.png')
def fourSectionAvgColor(image):
    rows, cols, ch = image.shape
    colsMid = int(cols/2)
    rowsMid = int(rows/2)
    # Split the image into four quadrants
    section0 = image[0:rowsMid, 0:colsMid]
    section1 = image[0:rowsMid, colsMid:cols]
    section2 = image[rowsMid: rows, 0:colsMid]
    section3 = image[rowsMid:rows, colsMid:cols]
    sectionsList = [section0, section1, section2, section3]
    sectionAvgColorList = []
    for i in sectionsList:
        yRows, xCols, chs = i.shape
        pixelCount = yRows*xCols
        totRed = 0
        totBlue = 0
        totGreen = 0
        for x in range(xCols):
            for y in range(yRows):
                bgr = i[y,x]
                # Cast to int so the running totals cannot overflow uint8
                b = int(bgr[0])
                g = int(bgr[1])
                r = int(bgr[2])
                totBlue = totBlue+b
                totGreen = totGreen+g
                totRed = totRed+r
        avgBlue = int(totBlue/pixelCount)
        avgGreen = int(totGreen/pixelCount)
        avgRed = int(totRed/pixelCount)
        # OpenCV stores pixels in BGR order
        avgPixel = (avgBlue, avgGreen, avgRed)
        sectionAvgColorList.append(avgPixel)
    return sectionAvgColorList
print(fourSectionAvgColor(img))
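One possible way to extend this to an arbitrary grid is to let NumPy do the splitting and averaging. This is a sketch only: grid_mean_colors, n_rows and n_cols are names chosen for illustration, and the image is assumed to be an (H, W, C) array such as the one cv2.imread returns.

```python
import numpy as np

def grid_mean_colors(image, n_rows, n_cols):
    """Return an (n_rows, n_cols, C) array of per-cell mean colours.

    Hypothetical helper; cells may differ by one pixel in size when the
    image dimensions are not divisible by the grid size.
    """
    channels = image.shape[2]
    means = np.empty((n_rows, n_cols, channels), dtype=np.float64)
    for i, band in enumerate(np.array_split(image, n_rows, axis=0)):
        for j, cell in enumerate(np.array_split(band, n_cols, axis=1)):
            # Average over all pixels of the cell, one value per channel
            means[i, j] = cell.reshape(-1, channels).mean(axis=0)
    return means
```

Calling grid_mean_colors(img, 2, 2) should give the same four sections as the function above, arranged as a 2x2 grid in the same BGR channel order.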
You can use scikit-image's view_as_blocks together with numpy.mean. You specify the block size instead of the number of blocks:
import numpy as np
from skimage import data, util
import matplotlib.pyplot as plt
astro = data.astronaut()
blocks = util.view_as_blocks(astro, (8, 8, 3))
print(astro.shape)
print(blocks.shape)
mean_color = np.mean(blocks, axis=(2, 3, 4))
fig, ax = plt.subplots()
ax.imshow(mean_color.astype(np.uint8))
Output:
(512, 512, 3)
(64, 64, 1, 8, 8, 3)
Don't forget the cast to uint8 because matplotlib and scikit-image expect floating point images to be in [0, 1], not [0, 255]. See the scikit-image documentation on data types for more info.
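If scikit-image is not available, a similar block average can be sketched with a plain NumPy reshape. block_mean is a hypothetical helper, and unlike view_as_blocks it silently crops any rows or columns that do not fill a whole block.

```python
import numpy as np

def block_mean(image, block):
    """Mean colour of each block x block tile of an (H, W, C) image.

    Hypothetical helper; trailing pixels that do not fill a tile are cropped.
    """
    h, w, c = image.shape
    h_crop, w_crop = h - h % block, w - w % block
    cropped = image[:h_crop, :w_crop]
    # Split each spatial axis into (tile index, offset within tile)
    tiles = cropped.reshape(h_crop // block, block, w_crop // block, block, c)
    return tiles.mean(axis=(1, 3))
```

As with the scikit-image version, the result is floating point, so it needs the same cast to uint8 before being shown with imshow.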

Processing a very large array in Python

I open a TIFF LAB image and return a big numpy array (4928x3264x3 float64) using python with this function:
def readTIFFLAB(filename):
    """Read TIFF LAB and return a float matrix
    read 16 bit (2 byte) each time without any multiprocessing
    about 260 sec"""
    import numpy as np
    ....
    ....
    # Data read
    # Matrix creation
    dim = (int(ImageLength), int(ImageWidth), int(SamplePerPixel))
    Image = np.empty(dim, np.float64)
    contatore = 0
    for address in range(0, len(StripOffsets)):
        offset = StripOffsets[address]
        f.seek(offset)
        # Integer division so that range() gets an int
        for lung in range(0, (StripByteCounts[address]//SamplePerPixel//2)):
            v = np.array(f.read(2))
            v.dtype = np.uint16
            v1 = np.array(f.read(2))
            v1.dtype = np.int16
            v2 = np.array(f.read(2))
            v2.dtype = np.int16
            v = np.array([v/65535.0*100])
            v1 = np.array([v1/32768.0*128])
            v2 = np.array([v2/32768.0*128])
            v = np.append(v, [v1, v2])
            riga = contatore // ImageWidth
            colonna = contatore % ImageWidth
            # print(contatore, riga, colonna)
            Image[riga, colonna, :] = v
            contatore += 1
    return(Image)
but this routine needs about 270 seconds to do all the work and return a numpy array.
I tried to use multiprocessing, but it is not possible to share an array or to use a queue to pass it, and sharedmem is not usable on Windows (at home I use openSUSE, but at work I must use Windows).
Could someone help me reduce the processing time? I have read about threading and about writing some parts in C, but I don't understand which is the best (and easiest) solution... I'm a food technologist, not a real programmer :-)
Thanks
Your method is indeed really slow. Try the tifffile library (you can find it here). It will open your file very quickly; then you just need to make the proper conversion. Here is a simple usage example:
import numpy as np
import tifffile
from skimage import color
import time
import matplotlib.pyplot as plt
def convert_to_tifflab(image):
    # divide the color channel
    L = image[:, :, 0]
    # correct interpretation of a/b channel: reinterpret the bits as signed 16-bit
    a = image[:, :, 1].view(np.int16)
    b = image[:, :, 2].view(np.int16)
    # scale the result
    L = L / 65535.0 * 100
    a = a / 32768.0 * 128
    b = b / 32768.0 * 128
    # join the result
    lab = np.dstack([L, a, b])
    # view the image
    start = time.time()
    rgb = color.lab2rgb(lab)
    print("Lab2Rgb: {0}".format(time.time() - start))
    return rgb
if __name__ == "__main__":
    filename = '/home/cilladani1/FERRERO/Immagini Digi Eye/Test Lettura CIELAB/TestLetturaCIELAB (LAB).tif'
    start = time.time()
    I = tifffile.imread(filename)
    end = time.time()
    print("Image fetching: {0}".format(end - start))
    rgb = convert_to_tifflab(I)
    print("Image conversion: {0}".format(time.time() - end))
    plt.imshow(rgb)
    plt.show()
The benchmark gives this data:
Image fetching: 0.0929999351501
Lab2Rgb: 12.9520001411
Image conversion: 13.5920000076
As you can see, the bottleneck in this case is lab2rgb, which converts from Lab to RGB (via XYZ). I'd recommend opening an issue with the author of tifffile requesting support for your file format; I'm sure he'll be able to speed things up directly in the C code.
After doing what BPL suggested, I modified the resulting array as follows:
# divide the color channel
L = I[:, :, 0]
# correct interpretation of a/b channel: reinterpret the bits as signed 16-bit
a = I[:, :, 1].view(np.int16)
b = I[:, :, 2].view(np.int16)
# scale the result
L = L / 65535.0 * 100
a = a / 32768.0 * 128
b = b / 32768.0 * 128
# join the result
lab = np.dstack([L, a, b])
# view the image
from skimage import color
rgb = color.lab2rgb(lab)
plt.imshow(rgb)
So now it is easier to read a TIFF LAB image.
Thanks, BPL
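For comparison, the per-sample loop in the question can also be vectorised with plain NumPy by decoding a whole strip at once. This is only a sketch: decode_lab_strip is a hypothetical helper, and it assumes the strip holds uncompressed 16-bit samples in native byte order, interleaved as L, a, b, with the same scaling as the code above.

```python
import numpy as np

def decode_lab_strip(raw_bytes, samples_per_pixel=3):
    """Decode one strip of raw 16-bit L, a, b samples into float64 values.

    Hypothetical helper; assumes native byte order and no compression.
    """
    data = np.frombuffer(raw_bytes, dtype=np.uint16).reshape(-1, samples_per_pixel)
    out = np.empty(data.shape, dtype=np.float64)
    # L is unsigned: 0..65535 maps to 0..100
    out[:, 0] = data[:, 0] / 65535.0 * 100
    # a and b are signed: reinterpret the same bits as int16, then scale
    out[:, 1:] = data[:, 1:].view(np.int16) / 32768.0 * 128
    return out
```

Each strip would be read with a single f.read(StripByteCounts[address]) and the decoded rows concatenated, which avoids creating three tiny arrays per pixel.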

Numpy Error - Only On Linux

The following bit of Python takes two images and performs an 'alpha composite' of them, or in other words, sticks one on top of the other, and returns a single image. The code isn't really something I quite grasp, as it came from another Stack Overflow answer.
import numpy as np
from PIL import Image
def alpha_composite(src, dst):
    src = np.asarray(src)
    dst = np.asarray(dst)
    out = np.empty(src.shape, dtype = 'float')
    alpha = np.index_exp[:, :, 3:]
    rgb = np.index_exp[:, :, :3]
    src_a = src[alpha]/255.0
    dst_a = dst[alpha]/255.0
    out[alpha] = src_a+dst_a*(1-src_a)
    old_setting = np.seterr(invalid = 'ignore')
    out[rgb] = (src[rgb]*src_a + dst[rgb]*dst_a*(1-src_a))/out[alpha]
    np.seterr(**old_setting)
    out[alpha] *= 255
    # np.clip returns a new array, so keep the result
    out = np.clip(out,0,255)
    # astype('uint8') maps np.nan (and np.inf) to 0
    out = out.astype('uint8')
    out = Image.fromarray(out, 'RGBA')
    return out
It works great on Windows, but as soon as I move it over to Ubuntu Server, it gives me the following error:
File "ImageStitcher.py", line 21, in alpha_composite
src_a = src[alpha]/255.0
IndexError: 0-d arrays can only use a single () or a list of newaxes (and a single ...) as an index
I'm using the same version of PIL and the same version of numpy on both.
Any idea what might be going on here?
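The traceback suggests that np.asarray(src) produced a 0-d array, which is what NumPy returns when it cannot interpret an object as pixel data and simply wraps the object itself. A small guard like the one below might help narrow down where the two machines differ. It is a sketch only: to_rgba_array is a hypothetical helper, not part of PIL or NumPy.

```python
import numpy as np

def to_rgba_array(img):
    """Convert an image-like object to an (H, W, 4) array or fail loudly.

    Hypothetical helper for diagnosing the 0-d array case.
    """
    arr = np.asarray(img)
    if arr.ndim != 3 or arr.shape[2] != 4:
        raise ValueError(
            "expected an (H, W, 4) RGBA array, got shape %r and dtype %s"
            % (arr.shape, arr.dtype))
    return arr
```

If this raises on Ubuntu but not on Windows for the same input file, the image object itself is not being converted to an array there, which points at how the image was loaded rather than at the compositing code.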

Using numpy and pil to convert 565(16bit-color) to 888(24bit-color)

I must preface this by saying that I have a working method using bit shifts and putpixel, but it is incredibly slow, and I am looking to leverage numpy to speed up the process. I believe I am close, but not quite there. Having timed what I think should work, I'm seeing a 0.3 second improvement in time, hence my motivation.
The current working code:
buff # a binary set of data
im = Image.new("RGBA",(xdim,ydim))
for y in range(ydim):
    for x in range(xdim):
        # row-major index into the pixel data (assumed layout)
        px = buff[y*xdim + x]
        # the 255 is for the alpha channel which I plan to use later
        im.putpixel((x,y),((px&0xF800) >> 8, (px&0x07E0) >> 3, (px&0x001F) <<3, 255))
return im
The code I'm trying to get to work looks like this:
im16 = numpy.fromstring(buff,dtype=numpy.uint16) #read data as shorts
im16 = numpy.array(im16,dtype=numpy.uint32) #now that it's in the correct order, convert to 32 bit so there is room to do shifting
r = numpy.right_shift(8, im16.copy() & 0xF800)
g = numpy.right_shift(3, im16.copy() & 0x07E0)
b = numpy.left_shift( 3, im16 & 0x001F)
pA = numpy.append(r,g)
pB = numpy.append(b,numpy.ones((xdim,ydim),dtype=numpy.uint32) * 0xFF) #this is a black alpha channel
img = numpy.left_shift(img,8) #gives me green channel
im24 = Image.fromstring("RGBA",(xdim,ydim),img)
return im24
So the final problem is that the channels are not combining, and I don't believe I should have to do that final bit shift (note that I get the red channel if I don't bit-shift by 8). Assistance on how to combine everything correctly would be much appreciated.
SOLUTION
import numpy as np
# np.frombuffer replaces the deprecated binary mode of np.fromstring
arr = np.frombuffer(buff,dtype=np.uint16).astype(np.uint32)
arr = 0xFF000000 + ((arr & 0xF800) >> 8) + ((arr & 0x07E0) << 5) + ((arr & 0x001F) << 19)
return Image.frombuffer('RGBA', (xdim,ydim), arr, 'raw', 'RGBA', 0, 1)
The difference is that you need to pack it as MSB (ALPHA, B, G, R) LSB, which is counterintuitive coming from putpixel, but it works, and works well.
Warning: the following code hasn't been checked, but I think that this will do what you want (if I'm understanding everything correctly):
import numpy as np
arr = np.frombuffer(buff,dtype=np.uint16).astype(np.uint32)
arr = ((arr & 0xF800) << 16) + ((arr & 0x07E0) << 13) + ((arr & 0x001F) << 11) + 0xFF
# store big-endian so that the bytes in memory are ordered R, G, B, A
arr = arr.astype('>u4')
return Image.frombuffer('RGBA', (xdim,ydim), arr, 'raw', 'RGBA', 0, 1)
I'm combining all of the channels together into 32 bits on the line that does all of the bit shifting. The leftmost 8 bits are the red, the next 8 are the green, the next 8 blue, and the last 8 alpha. The result is stored big-endian so that the bytes in memory follow that same R, G, B, A order. The shifting numbers may seem a little strange because I incorporated the shifts from the 16-bit format. Also, I'm using frombuffer because we want to take advantage of the buffer being used by Numpy rather than converting to a string first.
It might help to look at this page. It's not super great in my opinion, but that's how things go with PIL in my experience. The documentation is really not very user-friendly, in fact I often find it confusing, but I'm not about to volunteer to rewrite it because I don't use PIL much.
If you want to do the scaling appropriately, here is a more PIL-ish way to solve your problem.
import numpy as np
from PIL import Image
# Lookup tables that map 5-bit and 6-bit values onto the full 0..255 range
FROM_5 = ((np.arange(32, dtype=np.uint16) * 255 + 15) // 31).astype(np.ubyte)
FROM_6 = ((np.arange(64, dtype=np.uint16) * 255 + 31) // 63).astype(np.ubyte)
data = np.frombuffer(buff, dtype=np.uint16)
shape = (xdim, ydim)  # image size, using the names from the question
r = Image.frombuffer('L', shape, FROM_5[data >> 11], 'raw', 'L', 0, 1)
g = Image.frombuffer('L', shape, FROM_6[(data >> 5) & 0x3F], 'raw', 'L', 0, 1)
b = Image.frombuffer('L', shape, FROM_5[data & 0x1F], 'raw', 'L', 0, 1)
return Image.merge('RGB', (r, g, b))
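The same scaling can be checked without PIL by building the 24-bit pixels as a NumPy array. This is a sketch under stated assumptions: rgb565_to_rgb888 is a hypothetical helper, and the buffer is assumed to hold width * height RGB565 values in native byte order, row by row.

```python
import numpy as np

def rgb565_to_rgb888(buff, width, height):
    """Convert a raw RGB565 buffer to a (height, width, 3) uint8 array.

    Hypothetical helper; assumes native byte order and row-major pixels.
    """
    data = np.frombuffer(buff, dtype=np.uint16).reshape(height, width)
    # Extract the 5-, 6- and 5-bit fields
    r5 = (data >> 11) & 0x1F
    g6 = (data >> 5) & 0x3F
    b5 = data & 0x1F
    # Scale each field to 0..255 with rounding, as in the lookup tables above
    r = (r5 * 255 + 15) // 31
    g = (g6 * 255 + 31) // 63
    b = (b5 * 255 + 15) // 31
    return np.dstack([r, g, b]).astype(np.uint8)
```

The resulting array could then be handed to Image.fromarray if a PIL image is needed.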
