I am reading an image from a camera, which I convert using cv2.COLOR_RGB2BGR. Below is a temporary workaround for what I am trying to achieve:
import cv2
from skimage import transform, io
...
_, img = cam.read()
img = cv2.cvtColor(img, cv2.COLOR_RGB2BGR)
cv2.imwrite("temp.png", img)
img = io.imread("temp.png", as_gray=True)
img = transform.resize(img, (320, 240), mode='symmetric', preserve_range=True)
I found one way to do this conversion from this post; however, it seems that the image data is not the same as when I read the same image from a path.
I've also found from this documentation that I can use img_as_float(cv2_img), but this conversion does not produce the same result as what is returned by io.imread("temp.png", as_gray=True).
What is the proper way to do this conversion efficiently? Should I first convert the image back to RGB and then use img_as_float()?
I guess the basic problem you encounter is the different luma calculation used by OpenCV and scikit-image:
OpenCV uses:
Y = 0.299 * R + 0.587 * G + 0.114 * B
scikit-image uses:
Y = 0.2125 * R + 0.7154 * G + 0.0721 * B
Let's run some tests, using the following image for example:
import cv2
import numpy as np
from skimage import io
# Assuming we have some kind of "OpenCV image", i.e. BGR color ordering
cv2_bgr = cv2.imread('paddington.png')
# Convert to grayscale
cv2_gray = cv2.cvtColor(cv2_bgr, cv2.COLOR_BGR2GRAY)
# Save BGR image
cv2.imwrite('cv2_bgr.png', cv2_bgr)
# Save grayscale image
cv2.imwrite('cv2_gray.png', cv2_gray)
# Convert to grayscale with custom luma
cv2_custom_luma = np.uint8(0.2125 * cv2_bgr[..., 2] + 0.7154 * cv2_bgr[..., 1] + 0.0721 * cv2_bgr[..., 0])
# Load BGR saved image using scikit-image with as_gray; becomes np.float64
sc_bgr_w = io.imread('cv2_bgr.png', as_gray=True)
# Load grayscale saved image using scikit-image without as_gray; remains np.uint8
sc_gray_wo = io.imread('cv2_gray.png')
# Load grayscale saved image using scikit-image with as_gray; remains np.uint8
sc_gray_w = io.imread('cv2_gray.png', as_gray=True)
# OpenCV grayscale = scikit-image grayscale loaded image without as_gray? Yes.
print('Pixel mismatches:', cv2.countNonZero(cv2.absdiff(cv2_gray, sc_gray_wo)))
# Pixel mismatches: 0
# OpenCV grayscale = scikit-image grayscale loaded image with as_gray? Yes.
print('Pixel mismatches:', cv2.countNonZero(cv2.absdiff(cv2_gray, sc_gray_w)))
# Pixel mismatches: 0
# OpenCV grayscale = scikit-image BGR loaded (and scaled) image with as_gray? No.
print('Pixel mismatches:', cv2.countNonZero(cv2.absdiff(cv2_gray, np.uint8(sc_bgr_w * 255))))
# Pixel mismatches: 131244
# OpenCV grayscale with custom luma = scikit-image BGR loaded (and scaled) image with as_gray? Almost.
print('Pixel mismatches:', cv2.countNonZero(cv2.absdiff(cv2_custom_luma, np.uint8(sc_bgr_w * 255))))
# Pixel mismatches: 1
You see:
When opening the grayscale image, scikit-image simply uses the np.uint8 values, regardless of whether as_gray=True is used or not.
When opening the color image with as_gray=True, scikit-image applies rgb2gray and scales all values to 0.0 ... 1.0, thus using np.float64. Even scaling back to 0 ... 255 and np.uint8 yields a lot of pixel mismatches between this image and the OpenCV grayscale image – due to the different luma calculations.
When calculating the luma manually according to rgb2gray, the OpenCV grayscale image is almost identical. The one pixel mismatch might be due to floating point inaccuracies.
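So, to get the same values as your temp-file workaround directly from the camera frame, a minimal sketch could look like this (cam stands for the question's camera object; the cv2.VideoCapture(0) line is just an assumption to make it runnable):
import cv2
from skimage import color, transform
cam = cv2.VideoCapture(0)  # assumption: default camera, standing in for the question's cam
_, img = cam.read()  # BGR uint8 frame
# Reorder BGR to RGB, then let scikit-image do its own luma conversion;
# rgb2gray returns np.float64 values in 0.0 ... 1.0, just like as_gray=True
img = color.rgb2gray(cv2.cvtColor(img, cv2.COLOR_BGR2RGB))
img = transform.resize(img, (320, 240), mode='symmetric', preserve_range=True)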
----------------------------------------
System information
----------------------------------------
Platform: Windows-10-10.0.16299-SP0
Python: 3.9.1
NumPy: 1.20.1
OpenCV: 4.5.1
scikit-image: 0.18.1
----------------------------------------
Related
I have a transparent image with RGBA code (0, 0, 0, 0), on which I have added some pictures and text. Now I am trying to paste it onto a GIF image, but doing so completely ruins it.
Here is my transparent image:
Here is my GIF:
And, this is what I get:
This is my code:
from PIL import Image, ImageSequence
im = Image.open('newav.gif')
frames = []
for frame in ImageSequence.Iterator(im):
    frame = frame.copy()
    # card is the transparent overlay image shown above
    frame.paste(card, (0, 0), card)
    frames.append(frame)
frames[0].save('rank_card_gif.gif', save_all=True, append_images=frames[1:], loop=0)
Combining existing, animated GIFs with static PNGs having transparency doesn't work that easily – at least not solely using Pillow. Your GIF can only store up to 256 different colors using some color palette, and thus has mode P (or PA) when opened using Pillow. Now, your PNG probably has a lot more colors. When pasting the PNG onto the GIF, the color palette of the GIF is used to convert some of the PNG's colors, which gives unexpected or unwanted results, cf. your output.
My idea would be, since you're already iterating each frame:
Convert the frame to RGB, to get the "explicit" colors from the palette.
Convert the frame to some NumPy array, and manually alpha blend the frame and the PNG using its alpha channel.
Convert the resulting frame back to a Pillow Image object.
Thus, all frames are stored as RGB, all colors are the same for all frames. So, when now saving a new GIF, the new color palette is determined from this set of images.
Here's my code for the described procedure:
import cv2
from PIL import Image, ImageSequence
import numpy as np
# Read gif using Pillow
gif = Image.open('gif.gif')
# Read png using OpenCV
pngg = cv2.imread('png.png', cv2.IMREAD_UNCHANGED)
# Extract alpha channel, repeat for later alpha blending
alpha = np.repeat(pngg[..., 3, np.newaxis], 3, axis=2) / 255
frames = []
for frame in ImageSequence.Iterator(gif):
    frame = frame.copy()
    # Convert frame to RGB
    frame = frame.convert('RGB')
    # Convert frame to NumPy array; convert RGB to BGR for OpenCV
    frame = cv2.cvtColor(np.asarray(frame), cv2.COLOR_RGB2BGR)
    # Manual alpha blending
    frame = np.uint8(pngg[..., :3] * alpha + frame * (1 - alpha))
    # Convert BGR to RGB for Pillow; convert frame to Image object
    frame = Image.fromarray(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
    frames.append(frame)
frames[0].save('output.gif', append_images=frames[1:], save_all=True,
               loop=0, duration=gif.info['duration'])
And, this is the result:
----------------------------------------
System information
----------------------------------------
Platform: Windows-10-10.0.16299-SP0
Python: 3.9.1
NumPy: 1.20.1
OpenCV: 4.5.1
Pillow: 8.1.0
----------------------------------------
I have two images, img1 and img2; img2 is transparent except for one part of the image.
Using Pillow, how can I crop the non-transparent part of img2 out of img1? As a result, I would like to get img1 with a transparent part where img2 is non-transparent.
img1 and img2 are the same size.
You can convert your Pillow images to NumPy arrays and make use of vectorized operations to speed up your processing.
Having img1.png (fully opaque random pixels)
and img2.png (fully transparent background pixels, fully opaque red pixels)
one could use this approach to achieve the described behaviour:
import numpy as np
from PIL import Image
# Open images via Pillow
img1 = Image.open('img1.png')
img2 = Image.open('img2.png')
# Convert images to NumPy arrays
img1_np = np.array(img1)
img2_np = np.array(img2)
# Get (only full) opaque pixels in img2 as mask
mask = img2_np[:, :, 3] == 255
# Make pixels in img1 within mask transparent
img1_np[mask, 3] = 0
# Convert image back to Pillow
img1 = Image.fromarray(img1_np)
# Save image
img1.save('img1_mod.png')
The modified img1_mod.png would look like this (fully opaque random background pixels, transparent pixels where there's the red square in img2.png):
If you have "smooth" transparency, i.e. your alpha channel has values from the whole range [0 ... 255], the code can be modified. Having such an img2_smooth.png,
this would be the modified code:
import numpy as np
from PIL import Image
# Open images via Pillow
img1 = Image.open('img1.png')
img2 = Image.open('img2_smooth.png')
# Convert images to NumPy arrays
img1_np = np.array(img1)
img2_np = np.array(img2)
# Get (also partially) opaque pixels in img2 as mask # <--
mask = img2_np[:, :, 3] > 0 # <--
# Make pixels in img1 within mask (partially) transparent # <--
img1_np[mask, 3] = 255 - img2_np[mask, 3] # <--
# Convert image back to Pillow
img1 = Image.fromarray(img1_np)
# Save image
img1.save('img1_smooth_mod.png')
And this would be the new output img1_smooth_mod.png:
Hope that helps!
----------------------------------------
System information
----------------------------------------
Platform: Windows-10-10.0.16299-SP0
Python: 3.8.1
NumPy: 1.18.1
Pillow: 7.0.0
----------------------------------------
Observe the following image:
Observe the following Python code:
import cv2
img = cv2.imread("rainbow.png", cv2.IMREAD_COLOR)
img = cv2.cvtColor(img, cv2.COLOR_BGR2HSV) # convert it to hsv
img = cv2.cvtColor(img, cv2.COLOR_HSV2BGR) # convert back to BGR
cv2.imwrite("out.png", img)
Here's the output image:
If you can't see it, there's a clear loss of visual fidelity in the image here. For comparison's sake, here's the original next to the output image zoomed in around the yellows:
What's going on here? Is there any way to prevent these blocky artifacts from appearing? I need to convert to the HSV color space to rotate the hue, but I can't do that if I'm going to get these kinds of artifacts.
As a note, the output image does not have the artifacts when I don't do the two conversions; the conversions themselves are indeed the cause.
Back at a computer now - try like this:
#!/usr/bin/env python3
import numpy as np
import cv2
img = cv2.imread("rainbow.png", cv2.IMREAD_COLOR)
img = img.astype(np.float32)/255 # go to 32-bit float on 0..1
img = cv2.cvtColor(img, cv2.COLOR_BGR2HSV) # convert it to hsv
img = cv2.cvtColor(img, cv2.COLOR_HSV2BGR) # convert back to BGR
cv2.imwrite("output.png", (img*255).astype(np.uint8))
I think the problem is that when you use an unsigned 8-bit representation, the Hue gets "squished" from a range of 0..360 into a range of 0..180, in 2-degree increments, in order to stay within the 8-bit unsigned range of 0..255, causing steps between nearby values. A solution is to move to 32-bit floats and scale to the range 0..1.
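For the actual goal of rotating the hue, a minimal sketch of doing it entirely in 32-bit float could look like this (the 30-degree shift and the file names are just example assumptions; for float input, OpenCV keeps Hue on the full 0..360 range):
#!/usr/bin/env python3
import numpy as np
import cv2
img = cv2.imread("rainbow.png", cv2.IMREAD_COLOR)
img = img.astype(np.float32)/255  # go to 32-bit float on 0..1
hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)  # float HSV: H on 0..360, S and V on 0..1
hsv[..., 0] = (hsv[..., 0] + 30) % 360  # rotate hue by 30 degrees (example value)
img = cv2.cvtColor(hsv, cv2.COLOR_HSV2BGR)  # back to BGR, still float
cv2.imwrite("rotated.png", (img*255).astype(np.uint8))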
I am testing a segmentation algorithm on several VHSR satellite images, which originally come in 16-bit format, but when I convert them to 8-bit images, the produced images show a striped appearance.
I've been trying different Python libraries (skimage, cv2, scipy), getting similar results.
1) The original 16-bit image is a 4-band image (NIR, B, G, R), so you need to choose the right bands (4, 3, 2) to create a true-color RGB image. Thanks in advance. It can be downloaded from this link:
16bit image
2) I use this code to convert each pixel value from a 16-bit integer so that it fits within the 8-bit range:
import numpy as np
import matplotlib.pyplot as plt
from skimage import io
from scipy.misc import bytescale
SS = io.imread('Imag16bit.tif')
SS = bytescale(SS)
SS = np.asarray(SS)
plt.imshow(SS)
This is my result of the above code:
bytescale works for me. I think the asarray step messes up something.
import cv2
from skimage import io
from scipy.misc import bytescale
image = io.imread('SkySat_16bit.tif')
cv2.imshow('Original', image)
print(image.dtype)
image = bytescale(image)
print(image.dtype)
cv2.imshow('Converted', image)
cv2.waitKey(0)
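A side note: scipy.misc.bytescale was removed in SciPy 1.2, so on newer SciPy versions you'd need a replacement. A minimal NumPy sketch of the same idea – linearly mapping the image's own min..max range onto 0..255 (assuming the image is not constant) – could be:
import numpy as np
def bytescale_np(image):
    # Map the image's min..max range linearly onto 0..255, like bytescale's defaults
    lo, hi = image.min(), image.max()
    scaled = (image.astype(np.float64) - lo) / (hi - lo) * 255
    return scaled.astype(np.uint8)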
I think this is a way to do it:
#!/usr/local/bin/python3
import numpy as np
from PIL import Image
from tifffile import imread
# Load image
im = imread('SkySat_16bit.tif')
# Extract Red, Green and Blue bands into separate 8-bit arrays
R = (im[:,:,3]/256).astype(np.uint8)
G = (im[:,:,2]/256).astype(np.uint8)
B = (im[:,:,1]/256).astype(np.uint8)
# Combine bands into RGB array
RGB = np.dstack((R,G,B))
# Save to disk
Image.fromarray(RGB).save('result.png')
You may want to adjust the contrast a bit, and check that I selected the correct bands.
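Regarding the contrast: a simple percentile stretch per band usually looks better on satellite imagery than a plain division by 256. This is only a sketch of mine; the 2/98 percentiles are example values:
import numpy as np
def stretch_to_uint8(band, lo_pct=2, hi_pct=98):
    # Clip the band between two percentiles, then scale that range to 0..255
    lo, hi = np.percentile(band, (lo_pct, hi_pct))
    band = np.clip((band.astype(np.float64) - lo) / (hi - lo), 0, 1)
    return (band * 255).astype(np.uint8)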
I have a very simple program in Python with OpenCV and GDAL. In this program I read a GeoTIFF image with the following line:
image = cv2.imread(sys.argv[1], cv2.IMREAD_LOAD_GDAL | cv2.IMREAD_COLOR)
The problem is that for a specific image imread returns None. I am using images from: https://www.sensefly.com/drones/example-datasets.html
The image in Assessing crops with RGB imagery (eBee SQ) > Map (orthomosaic) works well. Its size is 19428 x 19784 with 4 bands.
The image in Urban mapping (eBee Plus/senseFly S.O.D.A.) > Map (orthomosaic) doesn't work. Its size is 26747 x 25388 with 4 bands.
Any help to figure out what the problem is?
Edit: I tried the solution suggested by #en_lorithai and it works; the problem is that I then need to do some image processing with OpenCV, and the image loaded by GDAL has several issues:
GDAL loads images as RGB instead of BGR (the default in OpenCV)
OpenCV expects an image shape of (height, width, channels), while GDAL returns an image with shape (channels, height, width)
The image returned by GDAL is flipped along the Y axis and rotated clockwise by 90 degrees
The image loaded by OpenCV is (resized to 700x700):
The image loaded by GDAL (after changing the shape, of course) is (resized to 700x700):
Finally, if I try to convert this image from BGR to RGB with
image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
I get (resized to 700x700)
I can convert from the GDAL format to the OpenCV format with the following code:
image = ds.ReadAsArray()  # Load image with GDAL; shape is (channels, height, width)
tmp = image.copy()
image[0] = tmp[2,:,:]  # swap red channel and blue channel
image[2] = tmp[0,:,:]
image = np.swapaxes(image, 2, 0)  # (channels, height, width) -> (width, height, channels)
image = cv2.flip(image, 0)  # flip in Y-axis
image = cv2.transpose(image)  # rotate by 90 degrees (clockwise)
image = cv2.flip(image, 1)
The problem is that I think this is a very slow process, and I want to know whether there is an automatic conversion process.
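For what it's worth, the whole swap/flip/transpose sequence above appears to be equivalent to a single NumPy transpose plus a channel reordering, which avoids most of the intermediate copies. A sketch, assuming a 4-band file in RGBA order (the path is a placeholder):
import numpy as np
from osgeo import gdal
ds = gdal.Open('image.tif')  # placeholder path; any 4-band RGBA GeoTIFF
image = ds.ReadAsArray()  # shape (channels, height, width)
image = np.transpose(image, (1, 2, 0))  # -> (height, width, channels)
image = image[:, :, [2, 1, 0, 3]]  # reorder RGBA to BGRA for OpenCV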
You can try and open the image in gdal instead
from osgeo import gdal
g_image = gdal.Open('161104_hq_transparent_mosaic_group1.tif')
a_image = g_image.ReadAsArray()
I can't test, as I don't have enough available memory to open that image.
Edit: equivalent operation on another image
from osgeo import gdal
import numpy as np
import cv2
import matplotlib.pyplot as plt
g_image = gdal.Open('Water-scenes-014.jpg')  # 3-channel RGB image
a_image = g_image.ReadAsArray()
s_image = np.dstack((a_image[0], a_image[1], a_image[2]))
plt.imshow(s_image)  # show image in matplotlib (no need for color swap)
s_image = cv2.cvtColor(s_image, cv2.COLOR_RGB2BGR)  # color swap for OpenCV
cv2.imshow('name', s_image)
cv2.waitKey(0)  # needed for the OpenCV window to actually render
Another method of getting individual bands from gdal
g_image = gdal.Open('image_name.PNG')
band1 = g_image.GetRasterBand(1).ReadAsArray()
You can then do a NumPy dstack of each of the bands, as in the sketch below.
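A minimal sketch of that band-by-band approach (the file name is a placeholder; GDAL band indices start at 1, and the band order depends on the file):
import numpy as np
from osgeo import gdal
g_image = gdal.Open('image_name.PNG')
# Read the first three bands and stack them into a (height, width, 3) array
bands = [g_image.GetRasterBand(i).ReadAsArray() for i in (1, 2, 3)]
rgb = np.dstack(bands)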