I am trying to perform a perspective transformation with a homography using the opencv-python package.
I have a background and a foreground image and would like to perform the perspective transform and stitch the foreground image onto the background image given four (x, y) coordinates, as follows:
import cv2
import numpy as np

bgImg = cv2.imread(BACK_PATH, cv2.IMREAD_COLOR)
fgImg = cv2.imread(FORE_PATH, cv2.IMREAD_COLOR)
bgHeight, bgWidth, dpt = bgImg.shape
origImageCoords = np.array([(0, 0),
                            (0, bgHeight),
                            (bgWidth, bgHeight),
                            (bgWidth, 0)])
stitchingCoords = []

def transformPerspective():
    y0 = 285
    y1 = 400
    x0 = 447
    x1 = 600
    stitchingCoords.append((x0, y0))
    stitchingCoords.append((x0, y1))
    stitchingCoords.append((x1, y1))
    stitchingCoords.append((x1, y0))
    homography = cv2.findHomography(origImageCoords, np.array(stitchingCoords))
    dst_corners = cv2.warpPerspective(src=fgImg, M=homography[0], dsize=(bgWidth, bgHeight))
    showFinal(bgImg, dst_corners)
After the homography is found using cv2.findHomography() and the foreground is warped, I mask the foreground and background images using appropriate masks and add them together, as follows:
def showFinal(src1, src2):
    grayed = cv2.cvtColor(src2, cv2.COLOR_BGR2GRAY)
    _, grayed = cv2.threshold(grayed, 0, 255, cv2.THRESH_BINARY)
    grayedInv = cv2.bitwise_not(grayed)
    src1final = cv2.bitwise_and(src1, src1, mask=grayedInv)
    src2final = cv2.bitwise_and(src2, src2, mask=grayed)
    finalImage = cv2.add(src1final, src2final)
    cv2.namedWindow("output", cv2.WINDOW_AUTOSIZE)
    cv2.imshow("output", finalImage)
    cv2.waitKey(0)  # keep the window open until a key is pressed
Problem
The problem is that the final result is wrong: the transformed foreground image is not stitched inside the four coordinates that I used for finding the homography.
Could anyone guide me as to why this error is occurring?
Expected Output
Actual Output
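For reference, a likely cause (an assumption based on the code above, not a confirmed fix): the source points passed to cv2.findHomography are the background's corners, but cv2.warpPerspective is applied to fgImg, so the source corners should be taken from the foreground image's own dimensions. A minimal sketch:

# Sketch: build the source corners from the foreground image,
# since fgImg is the image that gets warped (fg and bg sizes may differ)
fgHeight, fgWidth = fgImg.shape[:2]
origImageCoords = np.array([(0, 0),
                            (0, fgHeight),
                            (fgWidth, fgHeight),
                            (fgWidth, 0)])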
I'm using the code below to blur the background.
from pixellib.change_bg import alter_bg

change_bg = alter_bg(model_type="pb")
change_bg.load_pascalvoc_model("xception_pascalvoc.pb")
change_bg.blur_bg(filename, low=True, output_image_name=output, detect="car")
It works, but I need to keep only the car that is close, at the front of the photo, in focus. If there are cars far away in the background, I want them blurred as well. But I'm not finding any way to do that.
Most likely this is not the best solution, but I was able to blur all the background and keep only the main car in the center of the image by drawing a rectangle over the small cars in the background. By doing that, the algorithm is not able to recognize those cars. Then, when loading the photo again to blur the background, I load the original photo.
So, first of all, I use instance_segmentation to find all the cars, like this:
from pixellib.instance import instance_segmentation

segment_image = instance_segmentation()
segment_image.load_model("mask_rcnn_coco.h5", confidence=0.5)
target_classes = segment_image.select_target_classes(car=True)
results, op = segment_image.segmentImage(filename, segment_target_classes=target_classes, show_bboxes=True, extract_segmented_objects=True, output_image_name="image_new.jpg")
other_cars = []
for i in range(0, len(results["rois"])):
    roi = results["rois"][i]
    w = roi[3] - roi[1]
    h = roi[2] - roi[0]
    if max(h, w) < 350:
        other_cars.append(roi)
Then, I hide the small cars in the image I'll use in the next step.
image = cv2.imread(filename)
for roi in other_cars:
    y1, x1, y2, x2 = roi
    cv2.rectangle(image, (x1, y1), (x2, y2), (46, 221, 31), cv2.FILLED)
Then I use the alter_bg:
change_bg = alter_bg(model_type="pb")
change_bg.load_pascalvoc_model("xception_pascalvoc.pb")
target_class = change_bg.target_obj("car")
But I rewrite the part of blur_bg where it blurs the background so that it loads the original image:
# seg_image is the segmentation result computed inside blur_bg
ori_img = cv2.imread(filename)
blur_img = cv2.blur(ori_img, (16, 16))
out = np.where(seg_image[1], ori_img, blur_img)
cv2.imwrite(output, out)
Done. It works for my case.
The goal is to find the line that represents the distance between the "hole" and the outer edge (transition from black to white). I was able to successfully binarize this photo and get a very clean black-and-white image. The next step would be to find the (almost) vertical line in it and calculate the perpendicular distance between the midpoint of this vertical line and the hole.
original picture
hole - zoomed in
ps: what I call a "hole" is a shadow. I am shooting a laser into a hole, so the lines we can see are on steel and the black part without a line is the hole. The two white lines serve as a reference to measure the distance.
Is Canny edge detection the best approach? If so, what are good values for the threshold and aperture parameters? I can't tune it; I'm getting too much noise.
This is not complete; you will have to take some time to reach the final result. But this idea might help you.
Imports:
import os
import cv2
import numpy as np
Main code:
# Read original image
dir = os.path.abspath(os.path.dirname(__file__))
im = cv2.imread(dir+'/'+'im.jpg')
h, w = im.shape[:2]
print(w, h)
# Convert image to Grayscale
imGray = cv2.cvtColor(im, cv2.COLOR_BGR2GRAY)
cv2.imwrite(dir+'/im_1_grayscale.jpg', imGray)
# Eliminate noise and display laser light better
imHLine = imGray.copy()
imHLine = cv2.GaussianBlur(imHLine, (0, 9), 21) # 5, 51
cv2.imwrite(dir+'/im_2_h_line.jpg', imHLine)
# Make a BW mask to find the ROI of laser array
imHLineBW = cv2.threshold(imHLine, 22, 255, cv2.THRESH_BINARY)[1]
cv2.imwrite(dir+'/im_3_h_line_bw.jpg', imHLineBW)
# Remove noise with mask and extract just needed area
imHLineROI = imGray.copy()
imHLineROI[np.where(imHLineBW == 0)] = 0
imHLineROI = cv2.GaussianBlur(imHLineROI, (0, 3), 6)
imHLineROI = cv2.threshold(imHLineROI, 25, 255, cv2.THRESH_BINARY)[1] # 22
cv2.imwrite(dir+'/im_4_ROI.jpg', imHLineROI)
# Find the laser array and draw boxes around it
cnts, _ = cv2.findContours(
    imHLineROI, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
cnts = sorted(cnts, key=lambda x: cv2.boundingRect(x)[0])
pts = []
for cnt in cnts:
    x2, y2, w2, h2 = cv2.boundingRect(cnt)
    if h2 < h/10:
        cv2.rectangle(im, (x2, y2), (x2+w2, y2+h2), (0, 255, 0), 1)
        pts.append({'x': x2, 'y': y2, 'w': w2, 'h': h2})
circle = {
    'left': (pts[0]['x']+pts[0]['w'], pts[0]['y']+pts[0]['h']/2),
    'right': (pts[1]['x'], pts[1]['y']+pts[1]['h']/2),
}
circle['center'] = calculateMiddlePoint(circle['left'], circle['right'])
circle['radius'] = (circle['right'][0]-circle['left'][0])//2
# Draw line and Circle inside it
im = drawLine(im, circle['left'], circle['right'], color=(27, 50, 120))
im = cv2.circle(im, circle['center'], circle['radius'], (255, 25, 25), 3)
# Remove pepper/salt noise to find metal edge
imVLine = imGray.copy()
imVLine = cv2.medianBlur(imVLine, 17)
cv2.imwrite(dir+'/im_6_v_line.jpg', imVLine)
# Remove the shadows to find the metal edge
imVLineBW = cv2.threshold(imVLine, 50, 255, cv2.THRESH_BINARY)[1]
cv2.imwrite(dir+'/im_7_v_bw.jpg', imVLineBW)
# Finding the right vertical edge of metal
y1, y2 = h/5, h-h/5
x1 = horizontalDistance(imVLineBW, y1)
x2 = horizontalDistance(imVLineBW, y2)
pt1, pt2 = (x1, y1), (x2, y2)
imVLineBW = drawLine(imVLineBW, pt1, pt2)
cv2.imwrite(dir+'/im_8_v_bw.jpg', imVLineBW)
# Draw lines
im = drawLine(im, pt1, pt2)
im = drawLine(im, calculateMiddlePoint(pt1, pt2), circle['center'])
# Save the final image
cv2.imwrite(dir+'/im_8_output.jpg', im)
Extra functions:
Find the first white pixel from the right in one row of the picture:
# This function only processes a single horizontal line of the image.
# Its job is to examine the pixels one by one from the right and
# report the x position of the first white pixel it finds.
def horizontalDistance(im, y):
    y = int(y)
    h, w = im.shape[:2]
    for i in range(0, w):
        x = w-i-1
        if im[y][x] == 255:
            return x
    return -1
To draw a line with OpenCV:
def drawLine(im, pt1, pt2, color=(128, 0, 200), thickness=2):
    return cv2.line(
        im,
        pt1=(int(pt1[0]), int(pt1[1])),
        pt2=(int(pt2[0]), int(pt2[1])),
        color=color,
        thickness=thickness,
        lineType=cv2.LINE_AA  # anti-aliased
    )
To calculate the middle point of two 2D points:
def calculateMiddlePoint(p1, p2):
    return (int((p1[0]+p2[0])/2), int((p1[1]+p2[1])/2))
Output:
Original image:
Eliminate noise and process to see the laser array better:
Find the laser area to extract the hole:
Work on another copy of image to find the right side of metal object:
Remove the shadows to better see the right edge:
The final output:
I initially defined an ROI area. I changed the code later but did not rename the variables, in case you were wondering.
You can try either binarized Canny or the straight binarized image (Otsu). In both cases, you will obtain a quasi-straight line from which you can extract the right-most white pixel on every row or so.
Then fit a line to these points (depending on image quality, the fitting may or may not need to be robust).
It would be wise to calibrate the method against ground truth, because the edge you are looking for borders a gradient area, and its location might be offset by several pixels.
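A minimal sketch of that idea (the file name and parameter values are assumptions): binarize with Otsu, collect the right-most white pixel on each row, then fit a robust line through those points with cv2.fitLine.

import cv2
import numpy as np

img = cv2.imread('im.jpg', cv2.IMREAD_GRAYSCALE)
# Otsu picks the threshold automatically; cv2.Canny could be swapped in here
_, bw = cv2.threshold(img, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)

# Right-most white pixel on every row
pts = []
for y in range(bw.shape[0]):
    xs = np.flatnonzero(bw[y] == 255)
    if xs.size:
        pts.append((xs[-1], y))
pts = np.array(pts, dtype=np.float32)

# A robust distance (Huber) tolerates the occasional outlier row;
# (vx, vy) is the line direction, (x0, y0) a point on the line
vx, vy, x0, y0 = cv2.fitLine(pts, cv2.DIST_HUBER, 0, 0.01, 0.01).ravel()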
I am trying to crop a face using the facial landmarks identified by dlib. The right eyebrow is causing problems - the crop goes flat across rather than following the eyebrow arc.
What am I doing wrong here?
from imutils import face_utils
import imutils
import numpy as np
import collections
import dlib
import cv2
def face_remap(shape):
    remapped_image = shape.copy()
    # left eye brow
    remapped_image[17] = shape[26]
    remapped_image[18] = shape[25]
    remapped_image[19] = shape[24]
    remapped_image[20] = shape[23]
    remapped_image[21] = shape[22]
    # right eye brow
    remapped_image[22] = shape[21]
    remapped_image[23] = shape[20]
    remapped_image[24] = shape[19]
    remapped_image[25] = shape[18]
    remapped_image[26] = shape[17]
    # neatening
    remapped_image[27] = shape[0]
    return remapped_image
"""
MAIN CODE STARTS HERE
"""
# load the input image, resize it, and convert it to grayscale
image = cv2.imread("images/faceCM1.jpg")
image = imutils.resize(image, width=500)
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
out_face = np.zeros_like(image)
# initialize dlib's face detector (HOG-based) and then create the facial landmark predictor
detector = dlib.get_frontal_face_detector()
predictor = dlib.shape_predictor(SHAPE_PREDICTOR)
# detect faces in the grayscale image
rects = detector(gray, 1)
# loop over the face detections
for (i, rect) in enumerate(rects):
    """
    Determine the facial landmarks for the face region, then convert the facial landmark (x, y)-coordinates to a NumPy array
    """
    shape = predictor(gray, rect)
    shape = face_utils.shape_to_np(shape)

    # initialize mask array
    remapped_shape = np.zeros_like(shape)
    feature_mask = np.zeros((image.shape[0], image.shape[1]))

    # we extract the face
    remapped_shape = face_remap(shape)
    cv2.fillConvexPoly(feature_mask, remapped_shape[0:27], 1)
    feature_mask = feature_mask.astype(bool)
    out_face[feature_mask] = image[feature_mask]
    cv2.imshow("mask_inv", out_face)
    cv2.imwrite("out_face.png", out_face)
sample image of cropped face showing the issue
Using the convex hull formed by the 68 landmarks didn't exactly achieve the desired output, so I took the following approach to this problem, using scikit-image instead of OpenCV.
1. Load image and predict 68 landmarks
import dlib
import numpy as np
import skimage.draw

detector = dlib.get_frontal_face_detector()
predictor = dlib.shape_predictor('shape_predictor_68_face_landmarks.dat')

img = dlib.load_rgb_image('mean.jpg')
rect = detector(img)[0]
sp = predictor(img, rect)
landmarks = np.array([[p.x, p.y] for p in sp.parts()])
2. Select the landmarks that represent the shape of the face
(I had to reverse the order of the eyebrow landmarks because the 68 landmarks aren't ordered to describe the face outline)
outline = landmarks[[*range(17), *range(26,16,-1)]]
3. Draw a polygon from these landmarks using scikit-image
Y, X = skimage.draw.polygon(outline[:,1], outline[:,0])
4. Create a canvas of zeros and use the polygon as a mask on the original image
cropped_img = np.zeros(img.shape, dtype=np.uint8)
cropped_img[Y, X] = img[Y, X]
For the sake of completeness, I provide below a solution using scipy.spatial.ConvexHull, if this option is still preferred:
from scipy.spatial import ConvexHull

vertices = ConvexHull(landmarks).vertices
Y, X = skimage.draw.polygon(landmarks[vertices, 1], landmarks[vertices, 0])
cropped_img = np.zeros(img.shape, dtype=np.uint8)
cropped_img[Y, X] = img[Y, X]
It's because the face shape you are providing is not convex.
fillConvexPoly works perfectly on convex shapes only; in this case there is a concave corner (at point #27), and hence the results are messed up.
To fix this, modify the function as follows:
def face_remap(shape):
    remapped_image = cv2.convexHull(shape)
    return remapped_image
This would give you a result which looks like this:
Now you may write some more code to remove the triangular section on the forehead (if you want it that way); one possible approach is sketched below.
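One possible sketch (an assumption, reusing the face-outline reordering idea from the scikit-image answer above, since cv2.fillPoly handles concave polygons, unlike cv2.fillConvexPoly):

# Sketch: fill the concave face outline directly instead of the convex hull.
# `shape` is the 68x2 landmark array produced by face_utils.shape_to_np above.
outline = shape[[*range(17), *range(26, 16, -1)]].astype(np.int32)
feature_mask = np.zeros(image.shape[:2], dtype=np.uint8)
cv2.fillPoly(feature_mask, [outline], 255)
out_face = cv2.bitwise_and(image, image, mask=feature_mask)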
Can someone tell me how to rotate only part of an image like this:
How do I find the coordinates / center of this image:
I can rotate the whole picture using this:
from PIL import Image

def rotate_image():
    img = Image.open("nime1.png")
    img.rotate(45).save("plus45.png")
    img.rotate(-45).save("minus45.png")
    img.rotate(90).save("90.png")
    img.transpose(Image.ROTATE_90).save("90_trans.png")
    img.rotate(180).save("180.png")

if __name__ == '__main__':
    rotate_image()
You can crop an area of the picture as a new variable. In this case, I cropped a 120x120 pixel box out of the original image. It is rotated by 90 degrees and then pasted back onto the original.
from PIL import Image
img = Image.open('./image.jpg')
sub_image = img.crop(box=(200,0,320,120)).rotate(90)
img.paste(sub_image, box=(200,0))
So I thought about this a bit more and crafted a function that applies a circular mask to the cropped image before rotating. This allows an arbitrary angle without weird effects.
import numpy
from PIL import Image

def circle_rotate(image, x, y, radius, degree):
    # crop a square box around the rotation center
    box = (x-radius, y-radius, x+radius+1, y+radius+1)
    crop = image.crop(box=box)
    crop_arr = numpy.asarray(crop)
    # build the circle mask
    mask = numpy.zeros((2*radius+1, 2*radius+1))
    for i in range(crop_arr.shape[0]):
        for j in range(crop_arr.shape[1]):
            if (i-radius)**2 + (j-radius)**2 <= radius**2:
                mask[i, j] = 1
    # create the new circular image (assumes an RGBA source image)
    sub_img_arr = numpy.empty(crop_arr.shape, dtype='uint8')
    sub_img_arr[:, :, :3] = crop_arr[:, :, :3]
    sub_img_arr[:, :, 3] = mask*255
    sub_img = Image.fromarray(sub_img_arr, "RGBA").rotate(degree)
    i2 = image.copy()
    i2.paste(sub_img, box[:2], sub_img.convert('RGBA'))
    return i2

i2 = circle_rotate(img, 260, 60, 60, 45)
i2.show()
You can solve this problem as follows. Say you have img = Image.open("nime1.png").
Create a copy of the image using img2 = img.copy()
Create a crop of img2 at the desired location using img2.crop()
Paste img2 back onto img at the appropriate location using img.paste()
Notes:
To find the center coordinate, you can divide the width and height by 2 :) A minimal sketch of these steps follows.
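A minimal sketch of these steps (the box size and angle are made-up values):

from PIL import Image

img = Image.open("nime1.png")
img2 = img.copy()

# Center of the image, per the note above
cx, cy = img.width // 2, img.height // 2

# Crop a square around the center, rotate it, and paste it back.
# rotate() without expand=True keeps the crop size, so it fits the same box.
box = (cx - 60, cy - 60, cx + 60, cy + 60)
region = img2.crop(box).rotate(45)
img.paste(region, box[:2])
img.save("partial_rotate.png")

The corners of the rotated square will show black artifacts; the circular-mask function above avoids exactly that.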
I have a transparent png image foo.png and I've opened another image with:
im = Image.open("foo2.png")
Now what I need is to merge foo.png with foo2.png.
(foo.png contains some text and I want to print that text on foo2.png)
from PIL import Image
background = Image.open("test1.png")
foreground = Image.open("test2.png")
background.paste(foreground, (0, 0), foreground)
background.show()
The first parameter to .paste() is the image to paste. The second is the coordinates, and the secret sauce is the third parameter. It indicates a mask that will be used to paste the image. If you pass an image with transparency, then the alpha channel is used as the mask.
Check the docs.
Image.paste does not work as expected when the background image also contains transparency. You need to use real Alpha Compositing.
Pillow 2.0 contains an alpha_composite function that does this.
background = Image.open("test1.png")
foreground = Image.open("test2.png")
Image.alpha_composite(background, foreground).save("test3.png")
EDIT: Both images need to be of type RGBA, so you need to call convert('RGBA') if they are paletted, etc. If the background does not have an alpha channel, then you can use the regular paste method (which should be faster).
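A minimal guard along those lines (same file names as above):

background = Image.open("test1.png").convert("RGBA")
foreground = Image.open("test2.png").convert("RGBA")
Image.alpha_composite(background, foreground).save("test3.png")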
As olt already pointed out, Image.paste doesn't work properly when source and destination both contain alpha.
Consider the following scenario:
Two test images, both contain alpha:
layer1 = Image.open("layer1.png")
layer2 = Image.open("layer2.png")
Compositing image using Image.paste like so:
final1 = Image.new("RGBA", layer1.size)
final1.paste(layer1, (0,0), layer1)
final1.paste(layer2, (0,0), layer2)
produces the following image (the alpha part of the overlaid red pixels is completely taken from the 2nd layer; the pixels are not blended correctly):
Compositing image using Image.alpha_composite like so:
final2 = Image.new("RGBA", layer1.size)
final2 = Image.alpha_composite(final2, layer1)
final2 = Image.alpha_composite(final2, layer2)
produces the following (correct) image:
One can also use blending (note that Image.blend requires both images to have the same size and mode):
im1 = Image.open("im1.png")
im2 = Image.open("im2.png")
blended = Image.blend(im1, im2, alpha=0.5)
blended.save("blended.png")
Had a similar question and had difficulty finding an answer. The following function allows you to paste an image with a transparency parameter over another image at a specific offset.
from PIL import Image

def trans_paste(fg_img, bg_img, alpha=1.0, box=(0, 0)):
    fg_img_trans = Image.new("RGBA", fg_img.size)
    fg_img_trans = Image.blend(fg_img_trans, fg_img, alpha)
    bg_img.paste(fg_img_trans, box, fg_img_trans)
    return bg_img
bg_img = Image.open("bg.png")
fg_img = Image.open("fg.png")
p = trans_paste(fg_img,bg_img,.7,(250,100))
p.show()
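A variant of the same helper that uses Image.alpha_composite, so partially transparent foreground pixels blend with the background instead of overwriting its alpha: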
def trans_paste(bg_img, fg_img, box=(0, 0)):
    fg_img_trans = Image.new("RGBA", bg_img.size)
    fg_img_trans.paste(fg_img, box, mask=fg_img)
    new_img = Image.alpha_composite(bg_img, fg_img_trans)
    return new_img
Here is my code to merge 2 images of different sizes, each with transparency and with offset:
from PIL import Image
background = Image.open('image1.png')
foreground = Image.open("image2.png")
x = background.size[0]//2
y = background.size[1]//2
background = Image.alpha_composite(
Image.new("RGBA", background.size),
background.convert('RGBA')
)
background.paste(
foreground,
(x, y),
foreground
)
background.show()
This snippet is a mix of the previous answers, blending elements with offset while handling images with different sizes, each with transparency.
the key code is:
_, _, _, alpha = image_element_copy.split()
image_bg_copy.paste(image_element_copy, box=(x0, y0, x1, y1), mask=alpha)
the full function is:
def paste_image(image_bg, image_element, cx, cy, w, h, rotate=0, h_flip=False):
    image_bg_copy = image_bg.copy()
    image_element_copy = image_element.copy()
    image_element_copy = image_element_copy.resize(size=(w, h))
    if h_flip:
        image_element_copy = image_element_copy.transpose(Image.FLIP_LEFT_RIGHT)
    image_element_copy = image_element_copy.rotate(rotate, expand=True)
    _, _, _, alpha = image_element_copy.split()
    # image_element_copy's width and height will change after rotation
    w = image_element_copy.width
    h = image_element_copy.height
    x0 = cx - w // 2
    y0 = cy - h // 2
    x1 = x0 + w
    y1 = y0 + h
    image_bg_copy.paste(image_element_copy, box=(x0, y0, x1, y1), mask=alpha)
    return image_bg_copy
the above function supports:
position(cx, cy)
auto resize image_element to (w, h)
rotate image_element without cropping it
horizontal flip
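A hypothetical call (the file names and numbers are made up):

from PIL import Image

bg = Image.open("background.png").convert("RGBA")
logo = Image.open("logo.png").convert("RGBA")

# Center a 200x120 copy of the logo at (400, 300), rotated 15 degrees
result = paste_image(bg, logo, cx=400, cy=300, w=200, h=120, rotate=15)
result.save("composited.png")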