Below is the mask showing the object detected using histogram back projection.
The image has type float32, which comes from the algorithm's output. I want to detect contours using the cv2.findContours function.
As you know, this function accepts only a certain image type, uint8; otherwise it raises an error. Therefore, I converted the image from float32 to uint8 using imageFloat.astype(np.uint8).
When I display the newly converted binary image (the new uint8 one), it shows up black, meaning the detected object is no longer visible (the mask is all zeros).
So my question is: does anyone know why this happens and what I'm doing wrong?
Thanks in advance
Khaled
You are not scaling up the pixel values before converting to int; that is why the converted mask comes out black.
Do this:
imageFloat *= 255
imageFloat = imageFloat.astype(np.uint8)  # astype returns a new array; reassign it
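Put together, a minimal sketch of the whole step (this assumes the float mask is normalized to [0.0, 1.0]; if it is already in [0, 255], skip the multiplication):

import numpy as np
import cv2

# imageFloat: the float32 back-projection mask, assumed normalized to [0.0, 1.0]
mask8 = (imageFloat * 255).astype(np.uint8)

# OpenCV 4.x returns (contours, hierarchy); OpenCV 3.x returns three values,
# with contours and hierarchy as the last two
contours, hierarchy = cv2.findContours(mask8, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)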
I'm currently trying to start with an original RGB image, convert it to LUV, perform some operations (namely, rotate the hues), then convert it back to RGB for display purposes. However, I'm encountering a vexing issue where the RGB-to-LUV conversion (and vice versa) seems to change the image. Specifically, if I begin with an LUV image, convert it to RGB, and then convert it back to LUV without changing anything else, the result differs from the original. This has happened with both the Python (cv2) and Matlab (open source) implementations of the color conversion algorithms, as well as my own hand-coded ones. Here is an example:
import numpy as np
import cv2

luv1 = np.array([[[100, 6.12, 0]]]).astype('float32')
rgb1 = cv2.cvtColor(luv1,cv2.COLOR_Luv2RGB)
luv2 = cv2.cvtColor(rgb1,cv2.COLOR_RGB2Luv)
print(luv2)
[[[99.36293 1.3064307 -1.0494182]]]
As you can see, the LUV coordinates have changed from the input. Is this because certain LUV coordinates have no direct match in RGB space?
Yes, remove the astype('uint8') bit in your code, and the difference should disappear if the conversion is implemented correctly.
You can see the equations for the conversion on Wikipedia. Nothing there is irreversible; the conversions are exact inverses of each other.
However, the conversion contains a third power, which stretches some values significantly. Rounding the converted values to integers can therefore introduce a noticeable shift in color.
Also, the Luv domain is highly irregular, and it is not easy to verify that a given set of Luv values leads to a valid RGB value. Your statement "I've verified that luv1 has entries that all fall in the allowable input ranges" makes me believe that you think the Luv domain is a box. It is not: the ranges for u and v change with L. One good exercise is to start with a sampling of the RGB cube, map those points to Luv, then plot them to see the shape of the Luv domain. Wikipedia has an example of what this looks like for the sRGB gamut.
The OpenCV cvtColor function will clamp RGB values to the [0,1] range (if of type float32), leading to irreversible changes of color if the input is out of gamut.
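To see that clamping in action, here is a hedged sketch (the specific Luv triple is just a guess at an out-of-gamut color; any nonzero chroma at L = 100 falls outside sRGB):

import numpy as np
import cv2

# A strongly saturated color: L = 100 with large u is outside the sRGB gamut
luv = np.array([[[100.0, 150.0, 0.0]]], 'float32')
rgb = cv2.cvtColor(luv, cv2.COLOR_Luv2RGB)
print(rgb)  # components sitting at the [0, 1] bounds indicate clamping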
Here is an example that shows that the conversion is reversible. I start with RGB values because these are easy to verify as valid:
import numpy as np
import cv2
rgb1 = np.array([[[1.0,1.0,1.0],[0.5,1.0,0.5],[0.0,0.5,0.5],[0.0,0.0,0.0]]], 'float32')
luv1 = cv2.cvtColor(rgb1, cv2.COLOR_RGB2Luv)
rgb2 = cv2.cvtColor(luv1, cv2.COLOR_Luv2RGB)
print(np.max(np.abs(rgb2 - rgb1)))
This prints 2.8897537e-06, which is within numerical precision for 32-bit floats.
In OpenCV (Python), to convert RGB to YCbCr we use:
imgYCC = cv2.cvtColor(img, cv2.COLOR_BGR2YCR_CB)
What if I want to convert back to RGB?
Check the docs for color conversions. You can see all of the available color conversion codes here: Conversion Color Codes.
For the colorspaces available, you can generally transform both ways: COLOR_BGR2YCrCb (i.e. BGR-to-YCrCb) and COLOR_YCrCb2BGR (i.e. YCrCb-to-BGR). Also note that OpenCV uses BGR ordering, not RGB ordering. Regardless, to answer the specific question at hand, simply convert back using the opposite order of the colorspaces:
img_bgr = cv2.cvtColor(imgYCC, cv2.COLOR_YCrCb2BGR)
Note: cv2.COLOR_YCrCb2BGR is equivalent to cv2.COLOR_YCR_CB2BGR; I just find the first variant easier to read. Since these transformations involve rounding (on uint8 images especially), you won't necessarily get the exact same image going back and forth, but the values shouldn't be off by more than about 1 at a few locations.
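A quick way to check the round-trip error on your own image (a minimal sketch; the filename is a placeholder):

import cv2
import numpy as np

img = cv2.imread('input.jpg')  # hypothetical input file
imgYCC = cv2.cvtColor(img, cv2.COLOR_BGR2YCrCb)
img_bgr = cv2.cvtColor(imgYCC, cv2.COLOR_YCrCb2BGR)

# Largest per-pixel difference after the round trip; expect 0 or 1
print(np.max(cv2.absdiff(img, img_bgr)))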
I'm an R user, new to both Python and TensorFlow, and have been struggling to get my retrained image classifier to actually make predictions when modifying label_image.py for use with Mobilenets. I've identified the problem and know I need to implement the last line from this tutorial, but I can't figure out how.
If you're going to be using the Mobilenet models in label_image or your own programs, you'll need to feed in an image of the specified size converted to a float range into the 'input' tensor. Typically 24-bit images are in the range [0,255], and you must convert them to the [-1,1] float range expected by the model with the formula (image - 128.)/128.
In R I'm used to dealing with JPEGs as 3 dimensional arrays. If it were in that format I would know what to do, but the image type returned from tf.gfile.FastGFile("fileName.jpg", 'rb').read() is bytes. I don't really understand what this is. Directly applying the formula they give to the image object returns TypeError: unsupported operand type(s) for -: 'bytes' and 'float'. I assume that after I change the range I'll still need it to be in bytes format to feed it into the network, but I'm not 100% clear on that either. Any clarifications on what this object type is and how to work with it would be much appreciated.
tf.gfile.FastGFile(...).read() returns a binary string; to get the image values, you have to decode it:
import tensorflow as tf

imagedata = tf.gfile.FastGFile("fileName.jpg", 'rb').read()
img_decoded = tf.image.decode_jpeg(imagedata, dct_method="INTEGER_ACCURATE")
image_standardized = tf.image.per_image_standardization(img_decoded)  # zero mean, unit variance
image_standardized = tf.clip_by_value(image_standardized, -1.0, 1.0)  # clip into [-1, 1]
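Note that per_image_standardization normalizes to zero mean and unit variance rather than applying the tutorial's exact formula. If you want to match (image - 128.)/128. directly, here is a minimal sketch (using the asker's placeholder filename):

import tensorflow as tf

imagedata = tf.gfile.FastGFile("fileName.jpg", 'rb').read()
img_decoded = tf.image.decode_jpeg(imagedata)   # uint8 tensor with values in [0, 255]
img_float = tf.cast(img_decoded, tf.float32)    # cast before doing float arithmetic
img_scaled = (img_float - 128.0) / 128.0        # now in [-1, 1], as the model expects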
bytes is literally a buffer of raw byte values, each in the range [0, 255]. You can get the int values out by iterating over it, and then normalize. (For a JPEG file, though, these are the compressed file bytes, not pixel values, so you would still need to decode the image first, as above.)
image = b'\x20\x30\x40'
normalized = [(x-128.)/128 for x in image]
print(normalized) # [-0.75, -0.625, -0.5]
I have a .png image that contains three grayscale values. It contains black (0), white (255) and gray (128) blobs. I want to resize this image to a smaller size while preserving only these three grayscale values.
Currently, I am using scipy.misc.imresize to do it but I noticed that when I reduce the size, the edges get blurred and now contains more than 3 grayscale values.
Does anyone know how to do this in Python?
From the docs for imresize, note the interp keyword argument:
interp : str, optional
Interpolation to use for re-sizing
(‘nearest’, ‘lanczos’, ‘bilinear’, ‘bicubic’ or ‘cubic’).
The default is bilinear filtering; switch to nearest and it will instead use the exact color of the nearest existing pixel, which will preserve your precise grayscale values rather than trying to linearly interpolate between them.
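For example, a minimal sketch (the filename is a placeholder and img stands for your loaded grayscale array):

import scipy.misc

img = scipy.misc.imread('blobs.png')  # hypothetical input file
# Nearest-neighbor sampling keeps only the pixel values already present (0, 128, 255)
smaller = scipy.misc.imresize(img, 0.5, interp='nearest')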
I believe that PIL.Image.resize does exactly what you want. Take a look at the docs.
Basically what you need is:
from PIL import Image
im = Image.open('old.png')
# Use Image.NEAREST explicitly; the default resampling filter varies across PIL/Pillow versions
im = im.resize((im.size[0] // 2, im.size[1] // 2), Image.NEAREST)
im.save('new.png')
Actually, you can do that with scipy.misc.imresize itself. Take a look at its docs.
The interp parameter is what you need: set it to 'nearest' and the image's gray values won't be altered.
I am using OpenCV to read and display an image. I am trying to do a scalar multiplication but it is being displayed very differently for two similar approaches:
img = cv2.imread('C:/Python27/user_scripts/images/g1.jpg', -1)
cv2.imshow('img_scaled1', 0.5*img)
cv2.waitKey(0)
cv2.imshow('img_scaled2', img/2)
cv2.waitKey(0)
In the first case, hardly anything is displayed; the second case works fine.
It seems to me that imshow() does not support numpy arrays of floats.
I want to use the first method. Can somebody help?
There are a lot of pitfalls when working with images, and this one looks like a type issue.
imshow accepts uint8 arrays with values in [0, 255] and float arrays with values in [0.0, 1.0]. When you do a = a * .5, you get a float array with values outside that range, so there is no guarantee about the result.
A solution is to cast the array to uint8:
imshow((a*.5).astype(np.uint8))
or
imshow((a*.5).astype('uint8'))
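Applied to the code from the question, either of these should display correctly (a minimal sketch; the file path is the asker's own):

import cv2
import numpy as np

img = cv2.imread('C:/Python27/user_scripts/images/g1.jpg', -1)

# Option 1: scale, then cast back to uint8 so values stay in [0, 255]
cv2.imshow('img_scaled1', (0.5 * img).astype(np.uint8))
cv2.waitKey(0)

# Option 2: keep floats, but normalize into the [0.0, 1.0] range imshow expects
cv2.imshow('img_scaled2', 0.5 * img / 255.0)
cv2.waitKey(0)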