I'm trying to merge two RGBA images (with a shape of (h,w,4)), taking into account their alpha channels.
Example :
What I've tried
I tried to do this using opencv for that, but I getting some strange pixels on the output image.
Images Used:
and
import cv2
import numpy as np
import matplotlib.pyplot as plt
image1 = cv2.imread("image1.png", cv2.IMREAD_UNCHANGED)
image2 = cv2.imread("image2.png", cv2.IMREAD_UNCHANGED)
mask1 = image1[:,:,3]
mask2 = image2[:,:,3]
mask2_inv = cv2.bitwise_not(mask2)
mask2_bgra = cv2.cvtColor(mask2, cv2.COLOR_GRAY2BGRA)
mask2_inv_bgra = cv2.cvtColor(mask2_inv, cv2.COLOR_GRAY2BGRA)
# output = image2*mask2_bgra + image1
output = cv2.bitwise_or(cv2.bitwise_and(image2, mask2_bgra), cv2.bitwise_and(image1, mask2_inv_bgra))
output[:,:,3] = cv2.bitwise_or(mask1, mask2)
plt.figure(figsize=(12,12))
plt.imshow(cv2.cvtColor(output, cv2.COLOR_BGRA2RGBA))
plt.axis('off')
Output :
So what I figured out is that I'm getting those weird pixels because I used cv2.bitwise_and function (Which btw works perfectly with binary alpha channels).
I tried using different approaches
Question
Is there an approach to do this (While keeping the output image as an 8bit image).
I was able to obtain the expected result in 2 stages.
# Read both images preserving the alpha channel
hh1 = cv2.imread(r'C:\Users\524316\Desktop\Stack\house.png', cv2.IMREAD_UNCHANGED)
hh2 = cv2.imread(r'C:\Users\524316\Desktop\Stack\memo.png', cv2.IMREAD_UNCHANGED)
# store the alpha channels only
m1 = hh1[:,:,3]
m2 = hh2[:,:,3]
# invert the alpha channel and obtain 3-channel mask of float data type
m1i = cv2.bitwise_not(m1)
alpha1i = cv2.cvtColor(m1i, cv2.COLOR_GRAY2BGRA)/255.0
m2i = cv2.bitwise_not(m2)
alpha2i = cv2.cvtColor(m2i, cv2.COLOR_GRAY2BGRA)/255.0
# Perform blending and limit pixel values to 0-255 (convert to 8-bit)
b1i = cv2.convertScaleAbs(hh2*(1-alpha2i) + hh1*alpha2i)
Note: In the b=above the we are using only the inverse alpha channel of the memo image
But I guess this is not the expected result. So moving on ....
# Finding common ground between both the inverted alpha channels
mul = cv2.multiply(alpha1i,alpha2i)
# converting to 8-bit
mulint = cv2.normalize(mul, dst=None, alpha=0, beta=255,norm_type=cv2.NORM_MINMAX, dtype=cv2.CV_8U)
# again create 3-channel mask of float data type
alpha = cv2.cvtColor(mulint[:,:,2], cv2.COLOR_GRAY2BGRA)/255.0
# perform blending using previous output and multiplied result
final = cv2.convertScaleAbs(b1i*(1-alpha) + mulint*alpha)
Sorry for the weird variable names. I would request you to analyze the result in each line. I hope this is the expected output.
You could use PIL library to achieve this
from PIL import Image
def merge_images(im1, im2):
bg = Image.open(im1).convert("RGBA")
fg = Image.open(im2).convert("RGBA")
x, y = ((bg.width - fg.width) // 2 , (bg.height - fg.height) // 2)
bg.paste(fg, (x, y), fg)
# convert to 8 bits (pallete mode)
return bg.convert("P")
we can test it using the provided images:
result_image = merge_images("image1.png", "image2.png")
result_image.save("image3.png")
Here's the result:
Related
I was trying to combine 3 gray scale images into a single overlapping image with three different colors for each.
For that, I added each into a 3 channel numpy array.
But when plotting with im.show I don't get a colourful image. Till adding 2nd channel it works, but when I add the third channel, it doesn't work. The final image has only red and blue colour.
It is supposed to be red, green and blue for corresponding to the overlapping images.
Why would it be?
image1 = Image.open("E:/imaging/04102022_Bronze/Copper_4_2/10.tif") #openingimage1
image1_norm =(np.array(image1)-np.array(image1).min() ) / (np.array(image1).max() -
np.array(image1).min()) #normalisingimage1
image2 = Image.open("E:/imaging/04102022_Bronze/Oxygen_1_2/10.tif")#openingimage2
image2_norm = (np.array(image2)-np.array(image2).min()) / (np.array(image2).max() -
np.array(image2).min())#normalisingimage2
image3 = Image.open("E:/imaging/04102022_Bronze/Oxygen_1_2/10.tif")#openingimage3
image3_norm = (np.array(image3)-np.array(image3).min()) / (np.array(image3).max() -
np.array(image3).min())#normalisingimage3
im=np.array(image2)
new_image = np.zeros(im.shape + (3,)) #creating an empty 3 channel numpy array .shape of this
array is (255, 1024, 3)
new_image[:,:,0] = image1_norm #adding the three images into three channels
new_image[:,:,1] = image2_norm
new_image[:,:,2] = image3_norm
new_image1=new_image*255.999
new_image2= new_image1.astype(np.uint8)
final_image=final_image=Image.fromarray(new_image2, mode='RGB')
A few possible issues...
When you open an image in PIL, if you want to be sure it is single-channel greyscale, and not accidentally 3-channel RGB, or a palette image, force it to greyscale:
im = Image.open('image.png').convert('L')
Try not to repeat complicated calculations or expressions several times - it just makes for a maintenance nightmare. Maybe use a function instead:
def normalize(im):
# Normalise image to range 0..1
min, max = im.min(), im.max()
return (im.astype(float)-min)/(max-min)
You can use Numpy's dstack() to merge channels - it means "depth"-stack, as opposed to np.vstack() which stacks images vertically above/below each other and np.hstack() which stacks images side-by-side horizontally. It is a lot simpler than creating an image of the right size and individually pushing each channel into it.
result = np.dstack((im1, im2, im3))
That would make the overall code more like this:
#!/usr/bin/env python3
from PIL import Image
import numpy as np
def normalize(im):
# Normalise image to range 0..1
min, max = im.min(), im.max()
return (im.astype(float)-min)/(max-min)
# Load images as single channel Numpy arrays
im1 = np.array(Image.open('ch1.png').convert('L'))
im2 = np.array(Image.open('ch2.png').convert('L'))
im3 = np.array(Image.open('ch3.png').convert('L'))
# Normalize and scale
n1 = normalize(im1) * 255.999
n2 = normalize(im2) * 255.999
n3 = normalize(im3) * 255.999
# Merge channels to RGB
result = np.dstack((n1,n2,n3))
result = Image.fromarray(result.astype(np.uint8))
result.save('result.png')
That makes these three input images:
into this merged image:
I have an image of a human body showing skin. How can I change the color of the skin assuming I have another skin color and assuming I have a mask of the exposed skin in the body image ?
Here is one way to do that in Python/OpenCV. I am not sure how robust it is.
Basically, we get the average color of the face. The get the difference color (in each channel) between that and the desired color. Then we add the difference to the input image. Then we use the mask to combine the original and new images.
Input:
Facemask:
import cv2
import numpy as np
import skimage.exposure
# specify desired bgr color for new face and make into array
desired_color = (180, 128, 200)
desired_color = np.asarray(desired_color, dtype=np.float64)
# create swatch
swatch = np.full((200,200,3), desired_color, dtype=np.uint8)
# read image
img = cv2.imread("zelda1.jpg")
# read face mask as grayscale and threshold to binary
facemask = cv2.imread("zelda1_facemask.png", cv2.IMREAD_GRAYSCALE)
facemask = cv2.threshold(facemask, 128, 255, cv2.THRESH_BINARY)[1]
# get average bgr color of face
ave_color = cv2.mean(img, mask=facemask)[:3]
print(ave_color)
# compute difference colors and make into an image the same size as input
diff_color = desired_color - ave_color
diff_color = np.full_like(img, diff_color, dtype=np.uint8)
# shift input image color
# cv2.add clips automatically
new_img = cv2.add(img, diff_color)
# antialias mask, convert to float in range 0 to 1 and make 3-channels
facemask = cv2.GaussianBlur(facemask, (0,0), sigmaX=3, sigmaY=3, borderType = cv2.BORDER_DEFAULT)
facemask = skimage.exposure.rescale_intensity(facemask, in_range=(100,150), out_range=(0,1)).astype(np.float32)
facemask = cv2.merge([facemask,facemask,facemask])
# combine img and new_img using mask
result = (img * (1 - facemask) + new_img * facemask)
result = result.clip(0,255).astype(np.uint8)
# save result
cv2.imwrite('zelda1_swatch.png', swatch)
cv2.imwrite('zelda1_recolor.png', result)
cv2.imshow('swatch', swatch)
cv2.imshow('result', result)
cv2.waitKey(0)
cv2.destroyAllWindows()
Desired color swatch:
Result:
import cv2
import numpy as np
import skimage.exposure
#usage
#put this script and the image face.jpg in the same directory /dir
#run these 2 commands inside bash
#cd /dir
#python change_skin_v1.py
#script_name= change_skin_v1.py
#you can change the 3 parameters: alpha, skincolor_low, skincolor_high
#path file
path_face="./face.jpg"
result_partial="./result_partial.png"
result_final="./result_partial.png"
#blending parameter
alpha = 0.7
# Define lower and uppper limits of what we call "skin color"
skincolor_low=np.array([0,10,60])
skincolor_high=np.array([180,150,255])
#specify desired bgr color (brown) for the new face.
#this value is approximated
desired_color_brg = (2, 70, 140)
# read face
img_main_face = cv2.imread(path_face)
# face.jpg has by default the BGR format, convert BGR to HSV
hsv=cv2.cvtColor(img_main_face,cv2.COLOR_BGR2HSV)
#create the HSV mask
mask=cv2.inRange(hsv,skincolor_low,skincolor_high)
# Change image to brown where we found pink
img_main_face[mask>0]=desired_color_brg
cv2.imwrite(result_partial,img_main_face)
#blending block start
#alpha range for blending is 0-1
# load images for blending
src1 = cv2.imread(result_partial)
src2 = cv2.imread(path_face)
if src1 is None:
print("Error loading src1")
exit(-1)
elif src2 is None:
print("Error loading src2")
exit(-1)
# actually blend_images
result_final = cv2.addWeighted(src1, alpha, src2, 1-alpha, 0.0)
cv2.imwrite('./result_final.png', result_final)
#blending block end
The code below is intended to take an infrared image (B&W) and convert it to RGB. It does so successfully, but with significant noise. I have included a few lines for noise reduction but they don't seem to help. I've included the starting/resulting photos below. Any advice/corrections are welcome and thank you in advance!
from skimage import io
import numpy as np
import glob, os
from tkinter import Tk
from tkinter.filedialog import askdirectory
import cv2
path = askdirectory(title='Select PNG Folder') # shows dialog box and return the path
outpath = askdirectory(title='Select SAVE Folder')
# wavelength in microns
MWIR = 4.5
R = .642
G = .532
B = .44
vector = [R, G, B]
vectorsum = np.sum(vector)
vector = vector / vectorsum #normalize
vector = vector*255 / MWIR #changing this value changes the outcome significantly so I
#have been messing with it in the hopes of fixing it but no luck so far.
vector = np.power(vector, 4)
for file in os.listdir(path):
if file.endswith(".png"):
imIn = io.imread(os.path.join(path, file))
imOut = imIn * vector
ret,thresh = cv2.threshold(imOut,64,255,cv2.THRESH_BINARY)
kernel = np.ones((5, 5), np.uint8)
erode = cv2.erode(thresh, kernel, iterations = 1)
result = cv2.bitwise_or(imOut, erode)
io.imsave(os.path.join(outpath, file) + '_RGB.png',imOut.astype(np.uint8))
Your noise looks like completely random values, so I suspect you have an error in your conversion from float to uint8. But instead of rolling everything for yourself, why don't you just use:
imOut = cv2.cvtColor(imIn,cv2.COLOR_GRAY2BGR)
Here is one way to do that in Python/OpenCV.
Your issue is likely that your channel values are exceeding the 8-bit range.
Sorry, I do not understand the relationship between your R,G,B weights and your MWIR. Dividing by MWIR will do nothing if your weights are properly normalized.
Input:
import cv2
import numpy as np
# read image
img = cv2.imread('car.jpg')
# convert to gray
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
# make color channels
red = gray.copy()
green = gray.copy()
blue = gray.copy()
# set weights
R = .642
G = .532
B = .44
MWIR = 4.5
# get sum of weights and normalize them by the sum
R = R**4
G = G**4
B = B**4
sum = R + G + B
R = R/sum
G = G/sum
B = B/sum
print(R,G,B)
# combine channels with weights
red = (R*red)
green = (G*green)
blue = (B*blue)
result = cv2.merge([red,green,blue])
# scale by ratio of 255/max to increase to fully dynamic range
max=np.amax(result)
result = ((255/max)*result).clip(0,255).astype(np.uint8)
# write result to disk
cv2.imwrite("car_colored.png", result)
# display it
cv2.imshow("RESULT", result)
cv2.waitKey(0)
Result
If the noise is coming from the sensor itself, like a grainy noise, you'll need to look into denoising algorithms. scikit-image and opencv provide some denoising algorithms you can try. Maybe take a look at this and this.
I recently learned about matplotlib.cm, which handles colormaps. I've been using those to artificially color IR images, and made a brief example using the same black & white car image used above. Basically, I create a colormap .csv file locally, then refer to it for RGB weights. You may have to pick and choose which colormap you prefer, but that's up to personal preference.
Input image:
Python:
import os
import numpy as np
import cv2
from matplotlib import cm
# Multiple colormap options are available- I've hardcoded viridis for this example.
colormaps = ["viridis", "plasma", "inferno", "magma", "cividis"]
def CreateColormap():
if not os.path.exists("viridis_colormap.csv"):
# Get 256 entries from "viridis" or any other Matplotlib colormap
colormap = cm.get_cmap("viridis", 256)
# Make a Numpy array of the 256 RGB values
# Each line corresponds to an RGB colour for a greyscale level
np.savetxt("viridis_colormap.csv", (colormap.colors[...,0:3]*255).astype(np.uint8), fmt='%d', delimiter=',')
def RecolorInfraredImageToRGB(ir_image):
# Load RGB lookup table from CSV file
lookup_table = np.loadtxt("viridis_colormap.csv", dtype=np.uint8, delimiter=",")
# Make output image, same height and width as IR image, but 3-channel RGB
result = np.zeros((*ir_image.shape, 3), dtype=np.uint8)
# Take entries from RGB LUT according to greyscale values in image
np.take(lookup_table, ir_image, axis=0, out=result)
return result
if __name__ == "__main__":
CreateColormap()
img = cv2.imread("bwcar.jpg")
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
recolored = RecolorInfraredImageToRGB(gray)
cv2.imwrite("car_recolored.png", recolored)
cv2.imshow("Viridis recolor", recolored)
cv2.waitKey(0)
Output:
I am using python and opencv to cut an image using a mask. The mask itself is quite jagged and so the resulting image becomes a bit jagged around the edges like below
Jagged image
Is there a way I can smooth out the edges so they look more like this without affecting the rest of the image?
Smoothed edge
Thanks
SoS
** UPDATE **
Added the original jagged image without the annotation
Original Jagged image
Here is one way using OpenCV, Numpy and Skimage. I assume you actually have an image with a transparent background and not just checkerboard pattern.
Input:
import cv2
import numpy as np
import skimage.exposure
# load image with alpha channel
img = cv2.imread('lena_circle.png', cv2.IMREAD_UNCHANGED)
# extract only bgr channels
bgr = img[:, :, 0:3]
# extract alpha channel
a = img[:, :, 3]
# blur alpha channel
ab = cv2.GaussianBlur(a, (0,0), sigmaX=2, sigmaY=2, borderType = cv2.BORDER_DEFAULT)
# stretch so that 255 -> 255 and 127.5 -> 0
aa = skimage.exposure.rescale_intensity(ab, in_range=(127.5,255), out_range=(0,255))
# replace alpha channel in input with new alpha channel
out = img.copy()
out[:, :, 3] = aa
# save output
cv2.imwrite('lena_circle_antialias.png', out)
# Display various images to see the steps
# NOTE: In and Out show heavy aliasing. This seems to be an artifact of imshow(), which did not display transparency for me. However, the saved image looks fine
cv2.imshow('In',img)
cv2.imshow('BGR', bgr)
cv2.imshow('A', a)
cv2.imshow('AB', ab)
cv2.imshow('AA', aa)
cv2.imshow('Out', out)
cv2.waitKey(0)
cv2.destroyAllWindows()
I am by no means an expert with OpenCV. I looked at cv2.normalize(), but it did not look like I could provide my own sets of input and output values. So I also tried using the following adding the clipping to be sure there were no over-flows or under-flows:
aa = a*2.0 - 255.0
aa[aa<0] = 0
aa[aa>0] = 255
where I computed that from solving simultaneous equations such that in=255 becomes out=255 and in=127.5 becomes out=0 and doing a linear stretch between:
C = A*X+B
255 = A*255+B
0 = A*127.5+B
Thus A=2 and B=-127.5
But that does not work nearly as well as skimage rescale_intensity.
These are some effects you can do with the PIL image library:
from PIL import Image, ImageFilter
im_1 = Image.open("/constr/pics1/russian_doll.png")
im_2 = im_1.filter(ImageFilter.BLUR)
im_3 = im_1.filter(ImageFilter.CONTOUR)
im_4 = im_1.filter(ImageFilter.DETAIL)
im_5 = im_1.filter(ImageFilter.EDGE_ENHANCE)
im_6 = im_1.filter(ImageFilter.EDGE_ENHANCE_MORE)
im_7 = im_1.filter(ImageFilter.EMBOSS)
im_8 = im_1.filter(ImageFilter.FIND_EDGES)
im_9 = im_1.filter(ImageFilter.SMOOTH)
im_10 = im_1.filter(ImageFilter.SMOOTH_MORE)
im_11 = im_1.filter(ImageFilter.SHARPEN)
# now save the images
im_2.save("/constr/picsx/russian_dol_BLUR.png")
im_3.save("/constr/picsx/russian_doll_CONTOUR.png")
im_4.save("/constr/picsx/russian_doll_DETAIL.png")
im_5.save("/constr/picsx/russian_doll_EDGE_ENHANCE.png")
im_6.save("/constr/picsx/russian_doll_EDGE_ENHANCE_MORE.png")
im_7.save("/constr/picsx/russian_doll_EMBOSS.png")
im_8.save("/constr/picsx/russian_doll_FIND_EDGES.png")
im_9.save("/constr/picsx/russian_doll_SMOOTH.png")
im_10.save("/constr/picsx/russian_doll_SMOOTH_MORE.png")
im_11.save("/constr/picsx/russian_doll_SHARPEN.png")
Like the image above suggests, how can I convert the image to the left into an array that represent the darkness of the image between 0 for white and decimals for darker colours closer to 1? as shown in the image usingpython 3`?
Update:
I have tried to work abit more on this. There are good answers below too.
# Load image
filename = tf.constant("one.png")
image_file = tf.read_file(filename)
# Show Image
Image("one.png")
#convert method
def convertRgbToWeight(rgbArray):
arrayWithPixelWeight = []
for i in range(int(rgbArray.size / rgbArray[0].size)):
for j in range(int(rgbArray[0].size / 3)):
lum = 255-((rgbArray[i][j][0]+rgbArray[i][j][1]+rgbArray[i][j][2])/3) # Reversed luminosity
arrayWithPixelWeight.append(lum/255) # Map values from range 0-255 to 0-1
return arrayWithPixelWeight
# Convert image to numbers and print them
image_decoded_png = tf.image.decode_png(image_file,channels=3)
image_as_float32 = tf.cast(image_decoded_png, tf.float32)
numpy.set_printoptions(threshold=numpy.nan)
sess = tf.Session()
squeezedArray = sess.run(image_as_float32)
convertedList = convertRgbToWeight(squeezedArray)
print(convertedList) # This will give me an array of numbers.
I would recommend to read in images with opencv. The biggest advantage of opencv is that it supports multiple image formats and it automatically transforms the image into a numpy array. For example:
import cv2
import numpy as np
img_path = '/YOUR/PATH/IMAGE.png'
img = cv2.imread(img_path, 0) # read image as grayscale. Set second parameter to 1 if rgb is required
Now img is a numpy array with values between 0 - 255. By default 0 equals black and 255 equals white. To change this you can use the opencv built in function bitwise_not:
img_reverted= cv2.bitwise_not(img)
We can now scale the array with:
new_img = img_reverted / 255.0 // now all values are ranging from 0 to 1, where white equlas 0.0 and black equals 1.0
Load the image and then just invert and divide by 255.
Here is the image ('Untitled.png') that I used for this example: https://ufile.io/h8ncw
import numpy as np
import cv2
import matplotlib.pyplot as plt
my_img = cv2.imread('Untitled.png')
inverted_img = (255.0 - my_img)
final = inverted_img / 255.0
# Visualize the result
plt.imshow(final)
plt.show()
print(final.shape)
(661, 667, 3)
Results (final object represented as image):
You can use PIL package to manage images. Here's example how it can be done.
from PIL import Image
image = Image.open('sample.png')
width, height = image.size
pixels = image.load()
# Check if has alpha, to avoid "too many values to unpack" error
has_alpha = len(pixels[0,0]) == 4
# Create empty 2D list
fill = 1
array = [[fill for x in range(width)] for y in range(height)]
for y in range(height):
for x in range(width):
if has_alpha:
r, g, b, a = pixels[x,y]
else:
r, g, b = pixels[x,y]
lum = 255-((r+g+b)/3) # Reversed luminosity
array[y][x] = lum/255 # Map values from range 0-255 to 0-1
I think it works but please note that the only test I did was if values are in desired range:
# Test max and min values
h, l = 0,1
for row in array:
h = max([max(row), h])
l = min([min(row), l])
print(h, l)
You have to load the image from the path and then transform it to a numpy array.
The values of the image will be between 0 and 255. The next step is to standardize the numpy array.
Hope it helps.