Has anyone successfully used OpenCV template matching to detect objects with a transparent template?
I'm trying to use template matching to detect multiple boxes that make up a table. The template used is a box of the same size but with a transparent background because the contents of the cell can be in the form of text with any font size.
Here is the image:
Here is template 1:
Here is template 2:
import cv2 as cv
import numpy as np
# Assumption: the post does not show this import; imutils provides a function with this name
from imutils.object_detection import non_max_suppression

img = cv.imread('image.png', -1)
template1 = cv.imread('template 1.png', -1)
template2 = cv.imread('template 2.png', -1)
h, w, _ = template1.shape
method = cv.TM_CCOEFF_NORMED
res = cv.matchTemplate(img, template1, method)
# res = cv.matchTemplate(img, template2, method)
loc = np.where(res >= 0.8)
boxes = list()
for pt in zip(*loc[::-1]):
    boxes.append((pt[0], pt[1], pt[0] + w, pt[1] + h))
boxes = non_max_suppression(np.array(boxes))
for (x1, y1, x2, y2) in boxes:
    cv.rectangle(img, (x1, y1), (x2, y2), (23, 255, 255), 1)
When running the match template module using template1, the results of the process fail to detect the existing boxes, whereas when using template2, the results of the process are able to detect boxes but only 1 out of 4 boxes can be detected.
The 4 boxes can be detected if the threshold used is lowered to 0.5, but when tested using other images it fails to detect the boxes correctly (sometimes the detection results are more than they should be, and sometimes the coordinates don't match).
From the result, I think the result appears because the match template process for transparent image failed to run and the frequency of appearance of black color in each box still affects the method.
If we look at the tutorial it shows that using a mask can solve this problem, but I am confused about which mask to use. In this example, the mask used is defined and there is no mention of how to create the mask. Anyone can help?
Following @fmw42's example in the link, I managed to detect the boxes as needed.
However, the implementation needed an adjustment: the template's border is black and the transparent background is also stored as black, which caused the detection to fail.
I solved it by changing the black border in the template to red, so that the template consists of a red border on a transparent black background. The colors can be adjusted to your liking.
import cv2 as cv
import numpy as np
# Assumption: the post does not show this import; imutils provides a function with this name
from imutils.object_detection import non_max_suppression

img = cv.imread('image.png')
img[:,:,:2] = 255 # set the blue and green channels to 255 (OpenCV stores pixels as BGR)
template_all = cv.imread('template 1.png',cv.IMREAD_UNCHANGED)
template = template_all[:,:,0:3]
template[:,:,2] = 255 # set the red channel to 255 so the black border becomes red
alpha = template_all[:,:,3]
alpha = cv.merge([alpha,alpha,alpha]) # 3-channel mask built from the alpha channel
h,w = template.shape[:2]
method = cv.TM_CCORR_NORMED
res = cv.matchTemplate(img,template,method,mask=alpha)
loc = np.where(res >= 0.9)
result = img.copy()
boxes = list()
for pt in zip(*loc[::-1]):
    boxes.append((pt[0],pt[1],pt[0]+w,pt[1]+h))
boxes = non_max_suppression(np.array(boxes))
for (x1, y1, x2, y2) in boxes:
    cv.rectangle(result, (x1, y1), (x2, y2),(23,255,255), 1)
cv.imshow('result',result)
cv.waitKey(0)
cv.destroyAllWindows()
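The post never shows where non_max_suppression comes from. For readers who don't have such a helper, here is a minimal NumPy sketch of greedy non-maximum suppression over (x1, y1, x2, y2) boxes. The signature and the 0.3 overlap threshold are my own illustrative choices, not necessarily what the poster used.

```python
import numpy as np

def non_max_suppression(boxes, overlap_thresh=0.3):
    """Greedy NMS over (x1, y1, x2, y2) boxes; returns the kept boxes as ints."""
    if len(boxes) == 0:
        return np.empty((0, 4), dtype=int)
    boxes = np.asarray(boxes, dtype=float)
    x1, y1, x2, y2 = boxes.T
    area = (x2 - x1 + 1) * (y2 - y1 + 1)
    order = np.argsort(y2)  # candidates sorted by bottom edge
    keep = []
    while order.size > 0:
        i = order[-1]        # take the box with the largest y2
        keep.append(i)
        rest = order[:-1]
        # Intersection of box i with every remaining box
        xx1 = np.maximum(x1[i], x1[rest])
        yy1 = np.maximum(y1[i], y1[rest])
        xx2 = np.minimum(x2[i], x2[rest])
        yy2 = np.minimum(y2[i], y2[rest])
        w = np.maximum(0, xx2 - xx1 + 1)
        h = np.maximum(0, yy2 - yy1 + 1)
        overlap = (w * h) / area[rest]
        # Drop every remaining box that overlaps box i too much
        order = rest[overlap <= overlap_thresh]
    return boxes[keep].astype(int)

# Three near-duplicate hits on one cell plus one hit on another cell
boxes = [(10, 10, 50, 50), (11, 11, 51, 51), (12, 10, 52, 50), (100, 100, 140, 140)]
kept = non_max_suppression(np.array(boxes))
```

Thresholding the match result usually yields a cluster of hits around each true location, which is why some suppression step like this is needed before drawing rectangles.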
Related
I have two images of a person standing up: one original RGB image with the person and background, and the mask/alpha matte for that image showing only the silhouette of the person. So far I have been able to remove the excess padding from the masked image by cropping with the function below.
def crop_excess(image):
    y_nonzero, x_nonzero = np.nonzero(image)
    return image[np.min(y_nonzero):np.max(y_nonzero), np.min(x_nonzero):np.max(x_nonzero)]
Now I would like to use the cropped mask and impose it on the original RGB image so that the excess background is removed.
Example images
Any ideas on how this could be done?
Compute the bounding-box coordinates from the mask once, then use them to crop both images (here image is the mask):
y_nonzero, x_nonzero = np.nonzero(image)
y1 = np.min(y_nonzero)
y2 = np.max(y_nonzero)
x1 = np.min(x_nonzero)
x2 = np.max(x_nonzero)
cropped_image = image[y1:y2, x1:x2]
cropped_original_image = original_image[y1:y2, x1:x2]
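As a small sketch, the same idea can be wrapped in one helper. The name crop_to_mask and the synthetic arrays are made up for illustration. Note the + 1 on the upper bounds: slice ends are exclusive, so using np.max directly as the end index leaves out the last non-zero row and column.

```python
import numpy as np

def crop_to_mask(mask, *images):
    """Crop every image to the bounding box of the non-zero pixels in `mask`."""
    rows, cols = np.nonzero(mask)[:2]  # [:2] also copes with a 3-channel mask
    if rows.size == 0:
        raise ValueError("mask has no non-zero pixels")
    y1, y2 = rows.min(), rows.max() + 1  # +1 because slice ends are exclusive
    x1, x2 = cols.min(), cols.max() + 1
    return [img[y1:y2, x1:x2] for img in images]

# Synthetic stand-ins for the mask and the original RGB image
mask = np.zeros((6, 8), dtype=np.uint8)
mask[2:5, 3:7] = 255
original_image = np.arange(6 * 8 * 3, dtype=np.uint8).reshape(6, 8, 3)

cropped_mask, cropped_original = crop_to_mask(mask, mask, original_image)
```

Both outputs share the same height and width, so the cropped mask can be applied to the cropped original directly.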
I have downloaded a number of images (1000) from a website but they each have a black and white ruler running along 1 or 2 edges and some have these catalogue number tickets. I need these elements removed, the ruler at the very least.
Example images of coins:
The images all have the ruler in slightly different places, so I can't just perform the same crop on all of them.
So I tried to remove the black and replace it with white using this code:
from PIL import Image
import numpy as np
import matplotlib.pyplot as plt
im = Image.open('image-0.jpg')
im = im.convert('RGBA')
data = np.array(im) # "data" is a height x width x 4 numpy array
red, green, blue, alpha = data.T # Temporarily unpack the bands for readability
# Replace black with white
black_areas = (red < 150) & (blue < 150) & (green < 150)
data[..., :-1][black_areas.T] = (255, 255, 255) # Transpose back needed
im2 = Image.fromarray(data)
im2.show()
but it pretty much just removed half the coin as well:
I was reading some posts on OpenCV, but thought I'd first check whether there was a simpler way I'd missed.
I took a look at your problem and found a solution that works for the two images you provided. I hope it works for your other images as well, but that is hard to guarantee since they can differ from image to image. The solution uses OpenCV for preprocessing and contour detection to get the 2nd and 3rd largest elements in the picture (the largest is the bounding box around the edges), which should be your coins. I then create a box around those two items and add some padding before cropping to size.
So we start off with preprocessing:
import numpy as np
import cv2
img = cv2.imread(r'<PATH TO YOUR IMAGE>')
img = cv2.resize(img, None, fx=3, fy=3)
imgray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
blur = cv2.GaussianBlur(imgray, (5, 5), 0)
ret, thresh = cv2.threshold(blur, 0, 255, cv2.THRESH_BINARY+cv2.THRESH_OTSU)
Still rather basic, we make the image bigger so it is easier to detect contours, then we turn it into grayscale, blur it and apply thresholding to it so we turn all grey values either white or black. This then gives us the following image:
We now do contour detection, get the areas around our contours and sort them by the biggest area. Then we drop the biggest one as it is the box around the whole image and take the 2nd and 3rd biggest. And then get the x,y,w,h values we are interested in.
contours, hierarchy = cv2.findContours(
    thresh, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)
areas = []
for cnt in contours:
    area = cv2.contourArea(cnt)
    areas.append((area, cnt))
areas.sort(key=lambda x: x[0], reverse=True)
areas.pop(0)
x, y, w, h = cv2.boundingRect(areas[0][1])
x2, y2, w2, h2 = cv2.boundingRect(areas[1][1])
If we draw a rectangle around those contours:
Now we take those coordinates and create a box around both of them. This might need some minor adjusting as I just quickly took the bigger width of the two and not the corresponding one for the right coin but since I added extra padding it should be fine in most cases. And finally crop to size:
pad = 15
img = img[(min(y, y2) - pad) : (max(y, y2) + max(h, h2) + pad),
(min(x, x2) - pad) : (max(x, x2) + max(w, w2) + pad)]
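One thing to watch in the crop above: if min(y, y2) - pad or min(x, x2) - pad goes negative, NumPy treats it as an index from the end and the slice will not be what you expect. A hedged sketch of a helper that takes the true union of the boxes and clips it to the image; the name padded_union_box and the sample numbers are invented for illustration:

```python
def padded_union_box(boxes, pad, width, height):
    """Union of (x, y, w, h) boxes, grown by `pad` and clipped to the image.

    Returns (x1, y1, x2, y2) with exclusive end coordinates.
    """
    left = min(x for x, y, w, h in boxes)
    top = min(y for x, y, w, h in boxes)
    right = max(x + w for x, y, w, h in boxes)    # each box's own right edge
    bottom = max(y + h for x, y, w, h in boxes)   # each box's own bottom edge
    return (max(left - pad, 0), max(top - pad, 0),
            min(right + pad, width), min(bottom + pad, height))

# Two made-up coin boxes in a 240 x 200 image
coin_boxes = [(5, 20, 100, 90), (150, 30, 80, 85)]
x1, y1, x2, y2 = padded_union_box(coin_boxes, pad=15, width=240, height=200)
```

With a NumPy image the crop is then img[y1:y2, x1:x2], where width and height would come from img.shape[1] and img.shape[0].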
I hope this helps you understand how you could achieve what you want. I tried it on both your images and it worked well for them. It might need some adjustments: depending on how your other images look, the simple approach of taking the two biggest objects (apart from the image bounding box) might have to become something more sophisticated, such as detecting the circular shapes. Alternatively, you could try to detect the rulers and crop from their position inwards. You will have to decide after you have tried this on more example images from your dataset.
If you're looking for a robust solution, you should try something like Max Kaha's response, since it'll provide you with greater fine tuning.
Since the rulers tend to be left with just a little bit of text after your "black to white" filter, a quick solution is to use erosion followed by a dilation to create a mask for your images, and then apply the mask to the original image.
Pillow offers that through its ImageFilter module. Here's your code with a few modifications that achieve that:
from PIL import Image, ImageFilter
import numpy as np
import matplotlib.pyplot as plt
WHITE = 255, 255, 255
input_image = Image.open('image.png')
input_image = input_image.convert('RGBA')
input_data = np.array(input_image) # "data" is a height x width x 4 numpy array
red, green, blue, alpha = input_data.T # Temporarily unpack the bands for readability
# Replace black with white
thresh = 30
black_areas = (red < thresh) & (blue < thresh) & (green < thresh)
input_data[..., :-1][black_areas.T] = WHITE # Transpose back needed
erosion_factor = 5
# dilation is bigger to avoid cropping the objects of interest
dilation_factor = 11
erosion_filter = ImageFilter.MaxFilter(erosion_factor)
dilation_filter = ImageFilter.MinFilter(dilation_factor)
eroded = Image.fromarray(input_data).filter(erosion_filter)
dilated = eroded.filter(dilation_filter)
mask_threshold = 220
# the mask is black on regions to be hidden
mask = dilated.convert('L').point(lambda x: 255 if x < mask_threshold else 0)
# create base image
output_image = Image.new('RGBA', input_image.size, WHITE)
# paste only the desired regions
output_image.paste(input_image, mask=mask)
output_image.show()
You should also play around with the black to white threshold and the erosion/dilation factors to try and find the best fit for most of your images.
I want a check that treats an image with "12" in the left corner as equal to an image with "12" in the right corner, but not equal to an image with "21".
I need a fast way to determine this, because there are many pictures and they keep refreshing.
I tried counting the non-black pixels of each image:
result = np.count_nonzero(np.all(original > (0,0,0), axis=2))
(Why do I use > (0,0,0) instead of == (255,255,255)? There are grey shadows near the white symbols that the eye can't see.)
This approach can't tell 12 from 21.
My second attempt was to compare new images against templates, but that reports a huge difference between a 12 in the left corner and a 12 in the right corner!
original = cv2.imread('auto/5or.png')
template = cv2.imread('auto/5t.png')
res = cv2.matchTemplate(original, template, cv2.TM_CCOEFF_NORMED)
I haven't tried more complex digit-recognition methods yet, because I think they would be too slow, even on my small pictures (I may be wrong).
The digits only range from 0 to 30, and I have templates and examples for all of them; they differ only in where the digits sit inside the black square.
Any thoughts? Thanks in advance.
If you don't want the position of the digits in the image to make a difference, you can threshold the image to black and white and find the bounding box and crop to it so your digits are always in the same place - then just difference the images or use what you were using before:
#!/usr/local/bin/python3
import numpy as np
from PIL import Image
# Open image, greyscale and threshold
im=np.array(Image.open('21.png').convert('L'))
# Mask of white pixels
mask = im.copy()
mask[mask<128] = 0 # Threshold pixels < 128 down to black
# Coordinates of white pixels
coords = np.argwhere(mask)
# Bounding box of white pixels
x0, y0 = coords.min(axis=0)
x1, y1 = coords.max(axis=0) + 1
# Crop to bbox
cropped = im[x0:x1, y0:y1]
# Save
Image.fromarray(cropped).save('result.png')
That gives you this:
Obviously crop your template images as well.
I am less familiar with OpenCV in Python, but it would look something like this:
import cv2
# Load image
img = cv2.imread('21.png',0)
# Threshold at 127
ret,thresh = cv2.threshold(img,127,255,0)
# Get contours ([-2:] works with both the 3-value return of OpenCV 3 and the 2-value return of OpenCV 4)
contours, hierarchy = cv2.findContours(thresh, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)[-2:]
# Get bounding box
cnt = contours[0]
x,y,w,h = cv2.boundingRect(cnt)
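To make the "crop, then difference" step concrete, here is a rough NumPy-only sketch on tiny synthetic images. The helper names, the threshold of 128, and the max_mean_diff tolerance are all assumptions for illustration; requiring identical crop shapes is a simplification that real, slightly noisy images may need relaxed.

```python
import numpy as np

def crop_to_content(gray, thresh=128):
    """Crop a greyscale array to the bounding box of pixels >= thresh."""
    rows, cols = np.nonzero(gray >= thresh)
    if rows.size == 0:
        return gray[:0, :0]
    return gray[rows.min():rows.max() + 1, cols.min():cols.max() + 1]

def same_content(a, b, thresh=128, max_mean_diff=10.0):
    """True if the cropped contents match, regardless of where they sit."""
    ca, cb = crop_to_content(a, thresh), crop_to_content(b, thresh)
    if ca.shape != cb.shape:
        return False
    if ca.size == 0:
        return True  # both images are empty
    diff = np.abs(ca.astype(np.int16) - cb.astype(np.int16))
    return float(diff.mean()) <= max_mean_diff

def place(glyph, top, left, shape=(10, 12)):
    """Put a small glyph on a black canvas at the given position."""
    canvas = np.zeros(shape, dtype=np.uint8)
    canvas[top:top + glyph.shape[0], left:left + glyph.shape[1]] = glyph
    return canvas

# An asymmetric stand-in for "12"; its mirror image stands in for "21"
glyph = np.array([[255, 0, 255, 255],
                  [255, 0, 0, 255],
                  [255, 0, 255, 0]], dtype=np.uint8)
left_img = place(glyph, 1, 1)
right_img = place(glyph, 5, 7)
mirrored = place(glyph[:, ::-1], 1, 1)

match_shifted = same_content(left_img, right_img)
match_mirrored = same_content(left_img, mirrored)
```

Unlike counting non-zero pixels, the per-pixel difference after cropping depends on the arrangement of the pixels, so a reordered pair of digits no longer looks identical.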
I have an image like so:
I would like to automatically identify the dense white box area in the top left and then fill it and black out the rest of image. Producing something like this:
Essentially, I just want to return the coordinates of the densest cluster. I have tried ad-hoc methods such as erosion, dilation and binary closing, but they do not quite suit my needs. I'm not sure whether I could use k-means here. I'm looking for an efficient method; any help is appreciated.
You could erode the image a little bit more, to remove more of the noise, and then find the contours and filter them by area. Here is what I would use (not tested):
kernel = np.ones((2, 2), np.uint8)
img = cv2.erode(img, kernel, iterations=2)
# Find contours of the white square ([-2:] handles both the OpenCV 3 and OpenCV 4 return values)
conts, hierarchy = cv2.findContours(img, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)[-2:]
for cnt in conts:
    area = cv2.contourArea(cnt)
    # filter more noise
    if area > 200:  # optimize this number
        x1, y1, w, h = cv2.boundingRect(cnt)
        x2 = x1 + w  # (x1, y1) = top-left vertex
        y2 = y1 + h  # (x2, y2) = bottom-right vertex
        rect = cv2.rectangle(img, (x1, y1), (x2, y2), (255, 0, 0), 2)
One approach here would be to apply a large square averaging filter. If you know approximately the size of the box you're looking for, then match the filter size to it. After applying this filter, the largest pixel value in the image will be at the middle of the densest region. Let's call this point p.
Next, apply segmentation and connected component labeling to your original image. From your example image, it seems that the box you're looking for is connected. You might want to apply some morphological operations to make sure it's connected. You can also paint a reasonably sized blob centered at point p; it will connect lots of small regions that together form a dense area.
Next, remove all connected components except the one containing point p. You can do this by finding the label at pixel p, and comparing all pixels in the labeled image for equality with that label.
This should leave you a connected, compact region. You can find the bounding box of this region, and paint it on your image, if you really want to enforce that the found area be a box.
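Here is a rough NumPy-only sketch of those steps, under a few assumptions of mine: the input is already a binary image, the averaging filter is implemented as a box sum via an integral image, and instead of labeling every component I flood fill outward from p (which selects the same component that contains p). The function names and the k and blob parameters are illustrative.

```python
from collections import deque
import numpy as np

def box_sum(binary, k):
    """Sum over a k x k window centred on each pixel (zero padded, k odd)."""
    assert k % 2 == 1, "k must be odd"
    r = k // 2
    padded = np.pad(binary.astype(np.int64), r)
    integral = np.pad(padded, ((1, 0), (1, 0))).cumsum(0).cumsum(1)
    h, w = binary.shape
    return (integral[k:k + h, k:k + w] - integral[:h, k:k + w]
            - integral[k:k + h, :w] + integral[:h, :w])

def densest_region_box(binary, k=15, blob=3):
    """Bounding box (x1, y1, x2, y2), ends exclusive, of the component at p."""
    binary = binary.astype(bool)
    density = box_sum(binary, k)
    py, px = np.unravel_index(np.argmax(density), density.shape)  # point p
    work = binary.copy()
    # Paint a small blob at p so nearby fragments join into one region
    work[max(py - blob, 0):py + blob + 1, max(px - blob, 0):px + blob + 1] = True
    # Flood fill (4-connectivity) from p to collect its connected component
    seen = np.zeros_like(work)
    seen[py, px] = True
    queue = deque([(py, px)])
    h, w = work.shape
    while queue:
        y, x = queue.popleft()
        for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
            if 0 <= ny < h and 0 <= nx < w and work[ny, nx] and not seen[ny, nx]:
                seen[ny, nx] = True
                queue.append((ny, nx))
    rows, cols = np.nonzero(seen)
    return (int(cols.min()), int(rows.min()),
            int(cols.max()) + 1, int(rows.max()) + 1)

# Synthetic test image: one dense 10 x 10 block plus isolated noise pixels
img = np.zeros((40, 40), dtype=bool)
img[5:15, 5:15] = True
img[30, 30] = img[25, 10] = img[35, 20] = True
box = densest_region_box(img, k=9, blob=1)
```

For real images, cv2.blur or scipy.ndimage would be the more usual tools for the filtering and labeling; the pure-NumPy version is only meant to show the logic.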
I have a transparent png image foo.png and I've opened another image with:
im = Image.open("foo2.png")
Now what I need is to merge foo.png with foo2.png.
(foo.png contains some text and I want to print that text on foo2.png)
from PIL import Image
background = Image.open("test1.png")
foreground = Image.open("test2.png")
background.paste(foreground, (0, 0), foreground)
background.show()
The first parameter to .paste() is the image to paste. The second is the coordinates, and the secret sauce is the third parameter: it indicates a mask that will be used to paste the image. If you pass an image with transparency, its alpha channel is used as the mask.
Check the docs.
Image.paste does not work as expected when the background image also contains transparency. You need to use real Alpha Compositing.
Pillow 2.0 contains an alpha_composite function that does this.
background = Image.open("test1.png")
foreground = Image.open("test2.png")
Image.alpha_composite(background, foreground).save("test3.png")
EDIT: Both images need to be of type RGBA, so you need to call convert('RGBA') if they are paletted, etc. If the background does not have an alpha channel, then you can use the regular paste method (which should be faster).
As olt already pointed out, Image.paste doesn't work properly when source and destination both contain alpha.
Consider the following scenario:
Two test images, both contain alpha:
layer1 = Image.open("layer1.png")
layer2 = Image.open("layer2.png")
Compositing image using Image.paste like so:
final1 = Image.new("RGBA", layer1.size)
final1.paste(layer1, (0,0), layer1)
final1.paste(layer2, (0,0), layer2)
produces the following image (the alpha part of the overlaid red pixels is taken completely from the 2nd layer, so the pixels are not blended correctly):
Compositing image using Image.alpha_composite like so:
final2 = Image.new("RGBA", layer1.size)
final2 = Image.alpha_composite(final2, layer1)
final2 = Image.alpha_composite(final2, layer2)
produces the following (correct) image:
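For intuition about what "blended correctly" means, this is the textbook Porter-Duff "over" operator written out in NumPy for RGBA values in the 0 to 1 range. It is an illustration of the formula that alpha compositing conventionally refers to, not Pillow's actual implementation; the function name over is my own.

```python
import numpy as np

def over(fg, bg):
    """Porter-Duff 'over' for RGBA pixels given as floats in [0, 1]."""
    fg, bg = np.asarray(fg, dtype=float), np.asarray(bg, dtype=float)
    fa, ba = fg[..., 3:4], bg[..., 3:4]
    out_a = fa + ba * (1.0 - fa)
    # Avoid dividing by zero where both pixels are fully transparent
    safe_a = np.where(out_a == 0, 1.0, out_a)
    out_rgb = (fg[..., :3] * fa + bg[..., :3] * ba * (1.0 - fa)) / safe_a
    return np.concatenate([out_rgb, out_a], axis=-1)

red_half = [1.0, 0.0, 0.0, 0.5]   # 50% opaque red on top
blue_half = [0.0, 0.0, 1.0, 0.5]  # 50% opaque blue underneath
result = over(red_half, blue_half)
```

The resulting alpha combines both layers (0.5 + 0.5 * 0.5 = 0.75) rather than being copied from the top layer, and the colour is a weighted mix of both.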
One can also use blending:
im1 = Image.open("im1.png")
im2 = Image.open("im2.png")
blended = Image.blend(im1, im2, alpha=0.5)
blended.save("blended.png")
Had a similar question and had difficulty finding an answer. The following function allows you to paste an image with a transparency parameter over another image at a specific offset.
from PIL import Image

def trans_paste(fg_img, bg_img, alpha=1.0, box=(0, 0)):
    # fg_img must be RGBA, because Image.blend needs both images in the same mode
    fg_img_trans = Image.new("RGBA", fg_img.size)
    fg_img_trans = Image.blend(fg_img_trans, fg_img, alpha)
    bg_img.paste(fg_img_trans, box, fg_img_trans)
    return bg_img

bg_img = Image.open("bg.png")
fg_img = Image.open("fg.png")
p = trans_paste(fg_img, bg_img, .7, (250, 100))
p.show()
def trans_paste(bg_img, fg_img, box=(0, 0)):
    fg_img_trans = Image.new("RGBA", bg_img.size)
    fg_img_trans.paste(fg_img, box, mask=fg_img)
    new_img = Image.alpha_composite(bg_img, fg_img_trans)
    return new_img
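A quick way to try this second variant without any files on disk is to build two small RGBA images in memory. The sizes, colours, and the box offset below are arbitrary, and both inputs are converted to RGBA first since Image.alpha_composite requires that.

```python
from PIL import Image

def trans_paste(bg_img, fg_img, box=(0, 0)):
    # Place the foreground on a transparent layer the size of the background,
    # then alpha-composite that layer over the background.
    layer = Image.new("RGBA", bg_img.size)
    layer.paste(fg_img, box, mask=fg_img)
    return Image.alpha_composite(bg_img, layer)

# In-memory stand-ins for the background and foreground images
bg = Image.new("RGBA", (4, 4), (0, 0, 255, 255))   # opaque blue
fg = Image.new("RGBA", (2, 2), (255, 0, 0, 255))   # opaque red
merged = trans_paste(bg.convert("RGBA"), fg.convert("RGBA"), box=(1, 1))
```

Because the foreground is first placed on a full-size transparent layer, it can be smaller than the background and offset freely, while the final merge still goes through alpha_composite.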
Here is my code to merge 2 images of different sizes, each with transparency and with offset:
from PIL import Image
background = Image.open('image1.png')
foreground = Image.open("image2.png")
x = background.size[0]//2
y = background.size[1]//2
background = Image.alpha_composite(
    Image.new("RGBA", background.size),
    background.convert('RGBA')
)
background.paste(
    foreground,
    (x, y),
    foreground
)
background.show()
This snippet is a mix of the previous answers, blending elements with offset while handling images with different sizes, each with transparency.
The key code is:
_, _, _, alpha = image_element_copy.split()
image_bg_copy.paste(image_element_copy, box=(x0, y0, x1, y1), mask=alpha)
The full function is:
def paste_image(image_bg, image_element, cx, cy, w, h, rotate=0, h_flip=False):
    image_bg_copy = image_bg.copy()
    image_element_copy = image_element.copy()
    image_element_copy = image_element_copy.resize(size=(w, h))
    if h_flip:
        image_element_copy = image_element_copy.transpose(Image.FLIP_LEFT_RIGHT)
    image_element_copy = image_element_copy.rotate(rotate, expand=True)
    _, _, _, alpha = image_element_copy.split()
    # image_element_copy's width and height will change after rotation
    w = image_element_copy.width
    h = image_element_copy.height
    x0 = cx - w // 2
    y0 = cy - h // 2
    x1 = x0 + w
    y1 = y0 + h
    image_bg_copy.paste(image_element_copy, box=(x0, y0, x1, y1), mask=alpha)
    return image_bg_copy
The function above supports:
position(cx, cy)
auto resize image_element to (w, h)
rotate image_element without cropping it
horizontal flip