How to combine two RGB images without white space between them - Python

I am trying to combine some parts of two images while keeping other parts unchanged.
This is the first image.
This is the code that produces the first image. The input parameter img is the original image, already colorized green, while jawline, eyebrows, eyes and mouth are arrays of (x, y) coordinates used to cut those parts from the image:
import cv2
import numpy as np
from PIL import Image, ImageDraw

def getmask(img, jawline, eyebrows, eyes, mouth):
    img = cv2.cvtColor(img, cv2.COLOR_RGB2GRAY)
    imArray = np.asarray(img)
    # create mask from the jawline polygon
    polygon = jawline.flatten().tolist()
    maskIm = Image.new('L', (imArray.shape[1], imArray.shape[0]), 0)
    ImageDraw.Draw(maskIm).polygon(polygon, outline=1, fill='white')
    # draw eyes
    righteyes = eyes[0:6].flatten().tolist()
    ImageDraw.Draw(maskIm).polygon(righteyes, outline=1, fill='black')
    lefteyes = eyes[6:].flatten().tolist()
    ImageDraw.Draw(maskIm).polygon(lefteyes, outline=1, fill='black')
    # draw eyebrows
    rightbrows = eyebrows[0:6].flatten().tolist()
    ImageDraw.Draw(maskIm).polygon(rightbrows, outline=2, fill='black')
    leftbrows = eyebrows[6:].flatten().tolist()
    ImageDraw.Draw(maskIm).polygon(leftbrows, outline=2, fill='black')
    # draw mouth
    mouth = mouth.flatten().tolist()
    ImageDraw.Draw(maskIm).polygon(mouth, outline=1, fill='black')
    mask = np.array(maskIm)
    # L and P are assumed to be the image height and width
    # (they were not defined in the original snippet)
    L, P = imArray.shape[:2]
    mask = np.multiply(img, mask) + np.multiply((1 - mask), np.ones((L, P, 3)))
    return mask
This is the second image, which will fill the white blanks inside the first image.
I used this code to cut out the parts; it is very similar to the code for the first image.
def getface(img, eyebrows, eyes, mouth):
    im = img.copy()
    img = cv2.cvtColor(img, cv2.COLOR_RGB2GRAY)
    imArray = np.asarray(img)
    # create mask
    maskIm = Image.new('L', (imArray.shape[1], imArray.shape[0]), 0)
    # draw eyes
    righteyes = eyes[0:6].flatten().tolist()
    ImageDraw.Draw(maskIm).polygon(righteyes, outline=1, fill='white')
    lefteyes = eyes[6:].flatten().tolist()
    ImageDraw.Draw(maskIm).polygon(lefteyes, outline=1, fill='white')
    # draw eyebrows
    rightbrows = eyebrows[0:6].flatten().tolist()
    ImageDraw.Draw(maskIm).polygon(rightbrows, outline=2, fill='white')
    leftbrows = eyebrows[6:].flatten().tolist()
    ImageDraw.Draw(maskIm).polygon(leftbrows, outline=2, fill='white')
    # draw mouth
    mouth = mouth.flatten().tolist()
    ImageDraw.Draw(maskIm).polygon(mouth, outline=1, fill='white')
    # keep only the masked parts of the original image
    # (the original snippet passed an undefined `mask` to bitwise_or)
    mask = np.array(maskIm)
    cutted_part = cv2.bitwise_or(im, im, mask=mask)
    return cutted_part
So far I have tried to combine the two images by first inverting the second image, so that the black background becomes white, and then multiplying the first and second images. But the result isn't satisfactory.
As you can see, there are white gaps between the combined areas, and I notice that some parts from the second image become smaller or go missing, which I suspect creates those gaps when combined (please disregard the slightly different color in the result). Can someone share how to resolve this problem, or a better way to combine two images?

If you provide your results as actual pictures instead of cropped screenshots, we can reproduce your problem. For now I would recommend:
Invert the background of your cutout (black to white) and then simply combine both pictures, either by adding them (they need to have the same dimensions, which I presume is the case) or by overlaying them with OpenCV's addWeighted function to adjust opacity.
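For example, a minimal sketch of both options, assuming 'first.png' and 'second.png' are hypothetical file names for your two same-sized images, and assuming the cutout keeps its black background so that plain addition leaves the first image untouched there:
import cv2

first = cv2.imread('first.png')
second = cv2.imread('second.png')
# cv2.add saturates at 255 instead of wrapping around like numpy's +;
# black (0) pixels in `second` leave `first` unchanged
combined = cv2.add(first, second)
# or overlay with adjustable opacity via addWeighted
blended = cv2.addWeighted(first, 0.7, second, 0.3, 0)
cv2.imwrite('combined.png', combined)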

Related

How to efficiently and correctly overlay PNGs taking into account transparency?

When I was trying to overlay one image over another, one image had a transparent rounded-rectangle filling and the other was just a normal image. The result looked either like this (just putting the yellow over the pink without taking the rounded corners into account at all) or like this (it looks just like the rounded rectangle without adding anything; it even kept the transparency).
This is how it should look:
Here are the 2 example images: (pink.png) and (yellow.png)
Here is the code used for this:
import cv2
import numpy as np

layer0 = cv2.imread(r'yellow.png', cv2.IMREAD_UNCHANGED)
h0, w0 = layer0.shape[:2]
layer4 = cv2.imread(r"pink.png", cv2.IMREAD_UNCHANGED)
# just a way to help the image look more transparent in the opencv imshow,
# because imshow always ignores the transparency and pretends that the
# image has no alpha channel
for y in range(layer4.shape[0]):
    for x in range(layer4.shape[1]):
        if layer4[y, x][3] < 255:
            layer4[y, x][:] = 0, 0, 0, 0
# Create a new np array
shapes = np.zeros_like(layer4, np.uint8)
shapes = cv2.cvtColor(shapes, cv2.COLOR_BGR2BGRA)
# the start position of the yellow image on the pink
gridpos = (497, 419)
shapes[gridpos[1]:gridpos[1] + h0, gridpos[0]:gridpos[0] + w0] = layer0
# Change this into bool to use it as mask
mask = shapes.astype(bool)
# We'll create a loop to change the alpha
# value i.e. transparency of the overlay
for alpha in np.arange(0, 1.1, 0.1)[::-1]:
    # Create a copy of the image to work with
    bg_img = layer4.copy()
    # Create the overlay
    bg_img[mask] = cv2.addWeighted(bg_img, 1 - alpha, shapes, alpha, 0)[mask]
    # print the alpha value on the image
    cv2.putText(bg_img, f'Alpha: {round(alpha, 1)}', (50, 200),
                cv2.FONT_HERSHEY_PLAIN, 8, (200, 200, 200), 7)
    # resize the image before displaying
    bg_img = cv2.resize(bg_img, (700, 600))
    cv2.imwrite("out.png", bg_img)
    cv2.imshow('Final Overlay', bg_img)
    cv2.waitKey(0)
You can test different alpha combinations by pressing a key on the keyboard.
OpenCV Version
Took me some time, but basically you have to mask both images and then combine them. The code below is commented and should be self-explanatory. I think the hardest part to grasp is that your pink image actually represents the foreground and the yellow image is your background. The trickiest part is to not let anything through from your background, which is why you have to mask both images.
import cv2
import numpy as np
pink = cv2.imread("pink.png", cv2.IMREAD_UNCHANGED)
# We now have to use an image that has the same size as the pink "foreground"
# and create a black image with numpy's zeros_like (gives same size as input)
background = np.zeros_like(pink)
# We then split the pink image into 4 channels:
# b, g, r and alpha, we only need the alpha as mask
_, _, _, mask = cv2.split(pink)
yellow = cv2.imread("yellow.png", cv2.IMREAD_UNCHANGED)
# we need the x and y dimensions for pasting the image later
h_yellow, w_yellow = yellow.shape[:2]
# Assuming format is (x, y)
gridpos = (497, 419)
# We paste the yellow image onto our black background
# IMPORTANT: if any of the dimensions of yellow plus the gridpos is
# larger than the background width or height, this will give you an
# error! Also, this only works with the same number of input channels.
# If you are loading a jpg image without alpha channel, you can adjust
# the number of channels, the last input param, e.g. with :3 to only use
# the first 3 channels
background[gridpos[1]:gridpos[1] + h_yellow, gridpos[0]:gridpos[0] + w_yellow, :] = yellow
# This step was not intuitive for me in the first run, since the
# pink img should already be masked, but for some reason, it is not
pink_masked = cv2.bitwise_and(pink, pink, mask=mask)
# In this step, we mask the positioned yellow image with the inverse
# mask from the pink image, achieved by bitwise_not
background = cv2.bitwise_and(background, background, mask=cv2.bitwise_not(mask))
# We combine the pink masked image with the background
img = cv2.convertScaleAbs(pink_masked + background)
cv2.imshow("img", img), cv2.waitKey(0), cv2.destroyAllWindows()
Cheers!
Old Answer:
It looks like you are setting the whole image as a mask, which is why the rounded corners of your pink background have no effect at all. I myself struggled a lot with this task as well and ended up using Pillow instead of OpenCV. I don't know if it is more performant, but I got it running.
Here the code that works for your example:
from PIL import Image
# load images
background = Image.open(r"pink.png")
# load image and scale it to the same size as the background
foreground = Image.open(r"yellow.png").resize(background.size)
# split gives you the r, g, b and alpha channel of the image.
# For the mask we only need alpha channel, indexed at 3
mask = background.split()[3]
# we combine the two images and provide the mask that is applied to the foreground.
im = Image.composite(background, foreground, mask)
im.show()
If your background is not monochrome as in your example, and you want to use the version where you paste your original image, you have to create an empty image with the same size as the background, then paste your foreground at its position (your gridpos), e.g. like this:
canvas = Image.new('RGBA', background.size)
canvas.paste(foreground, gridpos)
foreground = canvas
Hope this helps!

Change background color for thresholded image

I have been trying to write code to extract cracks from an image using thresholding. However, I wanted to keep the background black. What would be a good solution to keep the outer boundary visible and the background black? Attached below are the original image, the threshold image, and the code used to produce it.
import cv2
import numpy as np
from matplotlib import pyplot as plt

# Read image
img = cv2.imread('Original.png')
# Convert into gray scale
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
# Image processing ( smoothing )
# Averaging
blur = cv2.blur(gray,(3,3))
ret,th1 = cv2.threshold(blur,145,255,cv2.THRESH_BINARY)
inverted = np.invert(th1)
plt.figure(figsize = (20,20))
plt.subplot(121),plt.imshow(img)
plt.title('Original'),plt.xticks([]), plt.yticks([])
plt.subplot(122),plt.imshow(inverted,cmap='gray')
plt.title('Threshold'),plt.xticks([]), plt.yticks([])
Method 1
Assuming the circle in your images stays in one spot throughout your image set, you can manually create a black 'mask' image with a white hole in the middle, then overlay it on the final inverted image.
You can easily make the mask image using your favorite image editor's magic wand tool.
I made this [1] by also expanding the circle inwards by one pixel to take into account some of the pixels the magic wand tool couldn't catch.
You would then use the mask image like this:
# read the mask as a single-channel image, since an OpenCV mask
# must be 8-bit single channel
mask = cv2.imread('/path/to/mask.png', cv2.IMREAD_GRAYSCALE)
masked = cv2.bitwise_and(inverted, inverted, mask=mask)
Method 2
If the circle does NOT stay in the same spot throughout your entire image set, you can try to make the mask from all the fully black pixels in your original image. This assumes that the 'sample' itself (the thing with the cracks) does not contain fully black pixels. Note, though, that this will leave the text on the bottom left white.
# make all the non black pixels white
_,mask = cv2.threshold(gray,1,255,cv2.THRESH_BINARY)
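You would then apply it exactly as in Method 1:
masked = cv2.bitwise_and(inverted, inverted, mask=mask)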
[1] The original is not the same size as your inverted image, and thus the mask I made won't actually fit; you're going to have to make it yourself.

Extracting only specific color from image with scanner artifacts

I have the following problem:
I want to extract only the color of a blue pen from scanned images that also contain grayscale and black printed areas on a white page background.
I'm okay with disregarding any kind of grayscale (not colored) pixel values and only keeping the blue parts, there won't be any dominant color other than blue on the images.
It sounds like a simple task, but the problem is that through the scanning process the entire image contains colored pixels, including blue ones, even in the grayscale or black parts. So I'm not sure how to go about isolating those parts and keeping only the blue ones. Here is a closeup to show what I mean:
Here is what an image would look like for reference:
I would like the output to be a new image, containing only the parts drawn / written in blue pen, in this case the drawing of the hedgehog / eye.
So I've tried to isolate an HSV range for blue-ish colors in the image using this code:
import cv2 as cv
import numpy as np

img = cv.imread("./data/scan_611a720bcd70bafe7beb502d.jpg")
img_hsv = cv.cvtColor(img, cv.COLOR_BGR2HSV)
# accepted color range for blue pen
lower_blue = np.array([90, 35, 140])
upper_blue = np.array([150, 255, 255])
# preparing the mask to overlay
mask = cv.inRange(img_hsv, lower_blue, upper_blue)
inverted_mask = cv.bitwise_not(mask)
mask_blur = cv.GaussianBlur(inverted_mask, (5, 5), 0)
ret, mask_thresh = cv.threshold(mask_blur, 0, 255, cv.THRESH_BINARY + cv.THRESH_OTSU)
# The black region in the mask has the value of 0,
# so when multiplied with original image removes all non-blue regions
result = cv.bitwise_and(img, img, mask=mask)
cv.imshow("Result", mask_thresh)
k = cv.waitKey(0)
However the result is this:
Many parts of the picture that are drawn in black, such as the cloud, are not removed since, as mentioned, they contain blue/colored pixels due to the scanning process.
Is there any method that would allow for a clean isolation of those blue parts of the image even with those artifacts present?
The solution would need to work for any kind of image like this, the one given is just an example, but as mentioned the only color present would be the blue pen apart from the grey/black areas.
Maybe try the opposite: search for the black parts first, then grow (dilate) this black mask a bit and remove everything it covers before you search for the blue. The "main" color in the cloud is still black, so you can play around with this.
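For illustration, a rough sketch of this idea, reusing img, img_hsv, lower_blue and upper_blue from the question (the value threshold and kernel size are guesses to tune):
import cv2 as cv
import numpy as np

# find the dark (printed black) parts via a low V range in HSV
black_mask = cv.inRange(img_hsv, np.array([0, 0, 0]), np.array([180, 255, 80]))
# grow the black mask so it also swallows the colored fringe around the print
black_mask = cv.dilate(black_mask, np.ones((7, 7), np.uint8))
# search for blue only outside the grown black mask
blue_mask = cv.inRange(img_hsv, lower_blue, upper_blue)
blue_mask = cv.bitwise_and(blue_mask, cv.bitwise_not(black_mask))
result = cv.bitwise_and(img, img, mask=blue_mask)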
You should realign the color planes of your scan. Then you're at least rid of those color fringes. I'd recommend scanning a sheet of graph paper to calibrate.
This is done using OpenCV's findTransformECC.
Complete examples can be found here:
https://docs.opencv.org/master/dd/d93/samples_2cpp_2image_alignment_8cpp-example.html
https://learnopencv.com/image-alignment-ecc-in-opencv-c-python/
And here's specific code to align the color planes of the picture given in the question:
https://gist.github.com/crackwitz/b8867b46f320eae17f4b2684416c79ea
(all it does is split the color planes, call findTransformECC and warpPerspective, and merge the color planes back together)
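For reference, a condensed sketch of that approach; 'scan.jpg' is a hypothetical file name, and the homography motion model, termination criteria and gaussFiltSize are assumptions you may need to tune for your scans:
import cv2
import numpy as np

img = cv2.imread('scan.jpg')
b, g, r = cv2.split(img)

def align_to(plane, reference):
    warp = np.eye(3, 3, dtype=np.float32)  # 3x3 matrix for MOTION_HOMOGRAPHY
    criteria = (cv2.TERM_CRITERIA_EPS | cv2.TERM_CRITERIA_COUNT, 100, 1e-6)
    # estimate the warp that maps `plane` onto `reference`; on older
    # OpenCV versions you may need to drop the last two arguments
    _, warp = cv2.findTransformECC(reference, plane, warp,
                                   cv2.MOTION_HOMOGRAPHY, criteria, None, 5)
    return cv2.warpPerspective(plane, warp, (plane.shape[1], plane.shape[0]),
                               flags=cv2.INTER_LINEAR + cv2.WARP_INVERSE_MAP)

# align the blue and red planes onto the green one, then merge back
aligned = cv2.merge([align_to(b, g), g, align_to(r, g)])
cv2.imwrite('aligned.jpg', aligned)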

Detecting a horizontal line in an image

Problem:
I'm working with a dataset that contains many images that look something like this:
Now I need all these images to be oriented horizontally or vertically, such that the color palette is either at the bottom or the right side of the image. This can be done by simply rotating the image, but the tricky part is figuring out which images should be rotated and which shouldn't.
What I have tried:
I thought that the best way to do this is by detecting the white line that separates the color palette from the image. I decided to rotate all images that have the palette at the bottom so that they have it on the right side.
# yes, I am mixing PIL and OpenCV (I like the PIL resizing more)
import cv2
import numpy as np
import PIL.Image

# resize image to be 128 by 128 pixels
img = img.resize((128, 128), PIL.Image.BILINEAR)
img = np.array(img)
# perform edge detection; not sure if these are the best parameters for Canny
edges = cv2.Canny(img, 30, 50, apertureSize=3)
has_line = False
# take a numpy slice of the area where the white line usually is
# (it is not always in exactly the same spot, which probably has to do
# with the way I resize my image)
for line in edges[75:80]:
    # check if most of one of the lines contains white pixels
    counts = np.bincount(line)
    if np.argmax(counts) == 255:
        has_line = True
# rotate if we found such a line
if has_line:
    img = np.rot90(img)
An example of it working correctly:
An example of it working incorrectly:
This works on maybe 98% of the images, but there are some cases where it rotates images that shouldn't be rotated, or doesn't rotate images that should be. Maybe there is an easier way to do this, or maybe a more elaborate way that is more consistent? I could do it manually, but I'm dealing with a lot of images. Thanks for any help and/or comments.
Here are some images where my code fails for testing purposes:
You can start by thresholding your image with a very high threshold, like 250, to take advantage of the property that your lines are white. This will make the entire background black. Now create a special horizontal kernel with a shape like (1, 15) and erode your image with it. This removes the vertical lines from the image, leaving only the horizontal lines.
import cv2
import numpy as np
img = cv2.imread('horizontal2.jpg')
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
_, thresh = cv2.threshold(gray, 250, 255, cv2.THRESH_BINARY)
kernel_hor = np.ones((1, 15), dtype=np.uint8)
erode = cv2.erode(thresh, kernel_hor)
As stated in the question, the color palettes can only be on the right or at the bottom, so we can test how many contours the right region has. For this, just divide the image in half and take the right part. Before finding contours, dilate the result with a normal (3, 3) kernel to fill in any gaps. Using cv2.RETR_EXTERNAL, find the contours and count how many there are; if the count is greater than a certain number, the image is the correct side up and there is no need to rotate.
right = erode[:, erode.shape[1]//2:]
kernel = np.ones((3, 3), dtype=np.uint8)
right = cv2.dilate(right, kernel)
cnts, _ = cv2.findContours(right, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
if len(cnts) > 3:
    print('No need to rotate')
else:
    print('rotate')
    # ADD YOUR ROTATE CODE HERE
P.S. I tested all four images you provided and it worked well. If it does not work for any image, let me know.

Bulk removing unwanted parts of images

I have downloaded a number of images (1000) from a website, but they each have a black-and-white ruler running along 1 or 2 edges, and some have catalogue number tickets. I need these elements removed, the ruler at the very least.
Example images of coins:
The images all have the ruler in slightly different places, so I can't just perform the same crop on them.
So I tried to remove the black and replace it with white using this code:
from PIL import Image
import numpy as np
import matplotlib.pyplot as plt
im = Image.open('image-0.jpg')
im = im.convert('RGBA')
data = np.array(im) # "data" is a height x width x 4 numpy array
red, green, blue, alpha = data.T # Temporarily unpack the bands for readability
# Replace black with white
black_areas = (red < 150) & (blue < 150) & (green < 150)
data[..., :-1][black_areas.T] = (255, 255, 255) # Transpose back needed
im2 = Image.fromarray(data)
im2.show()
but it pretty much just removed half the coin as well:
I was having a read of some posts on OpenCV, but thought I'd first see if there was a simpler way I'd missed.
So I have taken a look at your problem and I have found a solution for the two images you provided. I hope it works for your other images as well, but it is always hard to tell, as results can differ on an individual basis. This solution uses OpenCV for preprocessing and contour detection to get the 2nd and 3rd largest elements in your picture (the largest is the bounding box around the edges), which should be your coins. Then I create a box around those two items and add some padding before cropping to size.
So we start off with preprocessing:
import numpy as np
import cv2
img = cv2.imread(r'<PATH TO YOUR IMAGE>')
img = cv2.resize(img, None, fx=3, fy=3)
imgray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
blur = cv2.GaussianBlur(imgray, (5, 5), 0)
ret, thresh = cv2.threshold(blur, 0, 255, cv2.THRESH_BINARY+cv2.THRESH_OTSU)
Still rather basic: we make the image bigger so it is easier to detect contours, then we turn it into grayscale, blur it, and apply thresholding so that all grey values become either white or black. This gives us the following image:
We now do contour detection, get the areas around our contours, and sort them by the biggest area. Then we drop the biggest one, as it is the box around the whole image, and take the 2nd and 3rd biggest. From those we get the x, y, w, h values we are interested in.
contours, hierarchy = cv2.findContours(
    thresh, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)
areas = []
for cnt in contours:
    area = cv2.contourArea(cnt)
    areas.append((area, cnt))
areas.sort(key=lambda x: x[0], reverse=True)
areas.pop(0)
x, y, w, h = cv2.boundingRect(areas[0][1])
x2, y2, w2, h2 = cv2.boundingRect(areas[1][1])
If we draw a rectangle around those contours:
Now we take those coordinates and create a box around both of them. This might need some minor adjusting, as I just quickly took the bigger width of the two and not the corresponding one for the right coin, but since I added extra padding it should be fine in most cases. And finally we crop to size:
pad = 15
img = img[(min(y, y2) - pad):(max(y, y2) + max(h, h2) + pad),
          (min(x, x2) - pad):(max(x, x2) + max(w, w2) + pad)]
I hope this helps you to understand how you could achieve what you want. I tried it on both your images and it worked well for them. It might need some adjustments, and depending on how your other images look, the simple approach of taking the two biggest objects (apart from the image bounding box) might have to be turned into something more sophisticated that detects the circular shapes, or something along those lines. Alternatively, you could try to detect the rulers and crop from their position inwards, as in the rough sketch below. You will have to decide after you have tried this on more example images from your dataset.
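Here is a hedged sketch of that ruler idea: since the rulers are black-and-white strips hugging the border, rows and columns with a high share of dark pixels can be treated as ruler and cropped away (the dark and frac thresholds are guesses to tune):
import cv2
import numpy as np

def crop_rulers(img, dark=100, frac=0.25):
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    dark_rows = (gray < dark).mean(axis=1)  # share of dark pixels per row
    dark_cols = (gray < dark).mean(axis=0)  # share of dark pixels per column
    keep_rows = np.where(dark_rows < frac)[0]
    keep_cols = np.where(dark_cols < frac)[0]
    # crop to the bounding range of rows/columns that are not ruler-dominated
    return img[keep_rows.min():keep_rows.max() + 1,
               keep_cols.min():keep_cols.max() + 1]

cropped = crop_rulers(cv2.imread('image-0.jpg'))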
If you're looking for a robust solution, you should try something like Max Kaha's response, since it'll provide you with greater fine tuning.
Since the rulers tend to be left with just a little bit of text after your "black to white" filter, a quick solution is to use erosion followed by a dilation to create a mask for your images, and then apply the mask to the original image.
Pillow offers that with the ImageFilter class. Here's your code with a few modifications that'll achieve that:
from PIL import Image, ImageFilter
import numpy as np
import matplotlib.pyplot as plt
WHITE = 255, 255, 255
input_image = Image.open('image.png')
input_image = input_image.convert('RGBA')
input_data = np.array(input_image) # "data" is a height x width x 4 numpy array
red, green, blue, alpha = input_data.T # Temporarily unpack the bands for readability
# Replace black with white
thresh = 30
black_areas = (red < thresh) & (blue < thresh) & (green < thresh)
input_data[..., :-1][black_areas.T] = WHITE # Transpose back needed
erosion_factor = 5
# dilation is bigger to avoid cropping the objects of interest
dilation_factor = 11
erosion_filter = ImageFilter.MaxFilter(erosion_factor)
dilation_filter = ImageFilter.MinFilter(dilation_factor)
eroded = Image.fromarray(input_data).filter(erosion_filter)
dilated = eroded.filter(dilation_filter)
mask_threshold = 220
# the mask is black on regions to be hidden
mask = dilated.convert('L').point(lambda x: 255 if x < mask_threshold else 0)
# create base image
output_image = Image.new('RGBA', input_image.size, WHITE)
# paste only the desired regions
output_image.paste(input_image, mask=mask)
output_image.show()
You should also play around with the black to white threshold and the erosion/dilation factors to try and find the best fit for most of your images.
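If it helps, here is a small hedged helper that repeats the pipeline above for each combination of values, so you can compare the outputs side by side (the file name and value ranges are only examples):
from PIL import Image, ImageFilter
import numpy as np

def process(path, black_thresh, erosion_factor, dilation_factor,
            mask_threshold=220):
    image = Image.open(path).convert('RGBA')
    data = np.array(image)
    red, green, blue, alpha = data.T
    black = (red < black_thresh) & (green < black_thresh) & (blue < black_thresh)
    data[..., :3][black.T] = (255, 255, 255)
    eroded = Image.fromarray(data).filter(ImageFilter.MaxFilter(erosion_factor))
    dilated = eroded.filter(ImageFilter.MinFilter(dilation_factor))
    mask = dilated.convert('L').point(lambda v: 255 if v < mask_threshold else 0)
    output = Image.new('RGBA', image.size, (255, 255, 255, 255))
    output.paste(image, mask=mask)
    return output

for thresh in (20, 30, 40):
    for ero, dil in ((3, 7), (5, 11), (7, 15)):  # filter sizes must be odd
        process('image.png', thresh, ero, dil).save(f'out_t{thresh}_e{ero}_d{dil}.png')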
