How to crop images to remove excess background using image mask? - python

I have two images of a person standing up: one original RGB image with the person and background, and the mask/alpha matte for that image displaying only the silhouette of the person. So far I have been able to remove the excess padding from the masked image by cropping with the function below.
def crop_excess(image):
    y_nonzero, x_nonzero = np.nonzero(image)
    # +1 so the slices include the last nonzero row/column
    return image[np.min(y_nonzero):np.max(y_nonzero) + 1, np.min(x_nonzero):np.max(x_nonzero) + 1]
Now I would like to take the cropped mask and apply it to the original RGB image so that the excess background is removed.
Example images
Any ideas on how this could be done?

You should get the crop bounds from the mask and use them on both images:
y_nonzero, x_nonzero = np.nonzero(image)  # image is the mask here
y1 = np.min(y_nonzero)
y2 = np.max(y_nonzero) + 1  # +1 so the last nonzero row is included
x1 = np.min(x_nonzero)
x2 = np.max(x_nonzero) + 1  # +1 so the last nonzero column is included
cropped_image = image[y1:y2, x1:x2]
cropped_original_image = original_image[y1:y2, x1:x2]
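If you also want the background inside the crop blanked out, not just the padding removed, you can zero it with the mask. A minimal sketch, assuming the mask is a single-channel 0/255 array with the same height and width as the RGB image (the stand-in arrays below are only for illustration):
import numpy as np

# Stand-ins for the mask and the original RGB image
mask = np.zeros((100, 100), np.uint8)
mask[20:80, 30:70] = 255
original_image = np.full((100, 100, 3), 128, np.uint8)

y_nonzero, x_nonzero = np.nonzero(mask)
y1, y2 = np.min(y_nonzero), np.max(y_nonzero) + 1
x1, x2 = np.min(x_nonzero), np.max(x_nonzero) + 1

cropped_mask = mask[y1:y2, x1:x2]
cropped_rgb = original_image[y1:y2, x1:x2]
# keep the person's pixels, zero out the background inside the crop
result = cropped_rgb * (cropped_mask[..., None] > 0)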

Convert an image area to white and the rest to black in Python

I'm making a script that copies an "anomaly" image and pastes it at random places on an original image. Like this:
Original image:
Anomaly image:
Output image example:
But at the same time that the image with the anomaly is generated, I need to generate a mask where the area of the anomaly I pasted is white and the rest of the image is black. Like this (I did it manually in GIMP):
Output image mask example:
How can I do this automatically at the same time as the anomaly image is generated? Below is the code I'm using:
from PIL import Image
import random

anomaly = Image.open("anomaly_transp.png")  # anomaly image with transparent background
image_number = 1  # number of images to generate
w, h = anomaly.size
offset_x, offset_y = 480-w, 512-h  # offsets so the anomaly is pasted fully inside the original image
for i in range(image_number):
    original = Image.open("Crop_120.png")  # original good image
    x1 = random.randint(0, offset_x)
    y1 = random.randint(0, offset_y)
    area = (x1, y1, x1+w, y1+h)
    original.paste(anomaly, area, anomaly)
    original.save("output_" + str(i) + ".png")  # save output image
    original.close()
You can use
alpha = anomaly.split()[-1]
to fetch the alpha plane of your transparent image. You can then paste that into an all-black image of the right size to get your mask.
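A minimal sketch of how that could slot into the loop from the question (the mask file name is my own choice; original, anomaly and area are the names used above):
from PIL import Image

# Inside the loop, after pasting the anomaly:
mask = Image.new("L", original.size, 0)  # all-black image, same size as the original
alpha = anomaly.split()[-1]              # alpha plane of the transparent anomaly
mask.paste(alpha, area)                  # paste at the same coordinates as the anomaly
mask.save("mask_" + str(i) + ".png")
If the anomaly's alpha has soft edges, you may want to threshold the result, e.g. mask.point(lambda p: 255 if p > 0 else 0).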

Merging each instance mask back to the original image Python

I have a bunch of masks (object is white, non-object is black), each bounded by its bounding box as a separate image, and I'm trying to put them back in their original positions on the original image. What I have in mind right now is:
Create a black image of the same size as the original image.
Add each mask's values onto the black image at the coordinates of its bounding box.
Could anyone tell me if I am heading down the right path, or is there a better way to do this?
Below is roughly my implementation:
import cv2
import numpy as np

black_img = np.zeros((height, width))  # an image the size of the original, but all black
mask = cv2.imread("mask.png", cv2.IMREAD_GRAYSCALE)  # single channel, to match black_img
bbox = [x1, y1, x2, y2]  # pretend this is a valid bounding box on the original image
black_img[y1:y2, x1:x2] += mask
For example:
I have the first image, which is one of my masks. It is the same size as its bounding box on the original image. I'm trying to merge the masks back together so that I achieve something like the second image.
One of the masks:
After merging all the masks:
I am assuming the masks contain 0's and 1's and your image is grayscale. Also, for each small_mask you have a corresponding bbox.
mask = np.zeros((height, width))
for small_mask, bbox in zip(masks, bboxes):
    x1, y1, x2, y2 = bbox
    mask[y1:y2, x1:x2] += small_mask
mask = ((mask >= 1) * 255.0).astype(np.uint8)
Now you have combined all the small masks together.
About the last line:
My assumption was that two masks may somehow intersect, so the intersections may have values greater than 1. mask >= 1 turns on every pixel that is greater than 0.
I multiplied that by 255.0 because I wanted to make it white; you won't be able to see 1's in a grayscale image.
(mask >= 1) * 255.0 expands the range from [0, 1] to [0, 255]. But this value is float, which is not an image type.
.astype(np.uint8) converts the float to uint8. Now you can do all the image operations without any problem; while the image is float, plotting, saving and similar operations will cause issues.
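For instance, a quick check with two tiny overlapping masks (made-up values) shows that the intersection, which sums to 2, still ends up as plain white:
import numpy as np

mask = np.zeros((4, 4))
mask[0:3, 0:3] += np.ones((3, 3))  # first small mask
mask[1:4, 1:4] += np.ones((3, 3))  # second small mask, overlapping the first
mask = ((mask >= 1) * 255.0).astype(np.uint8)
print(mask)  # overlapping pixels are 255, not 510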

Resizing non-uniform images with precise face location

I work at a studio that does school photos, and we are trying to make a script to eliminate the job of cropping each photo to a template. The photos we work with are fairly uniform, but they vary a bit in resolution and head position. I took up the mantle of trying to write the script with my fairly limited Python knowledge, and through a lot of trial and error and online resources I think I have got most of the way there.
At the moment I am trying to figure out the best way to crop the image from the NumPy array with the head where I want it, and I just can't find a good flexible solution. The head needs to be positioned slightly differently for pose 1 and pose 2, so it needs to be easy to change on the fly (I'm probably going to implement some sort of simple GUI to input stuff like that, but for now I can just change the code).
I also need to be able to change the output resolution of the photo so they are all uniform (2000x2500). Anyone have any ideas?
At the moment this is my current code, it just saves the detected face square:
import cv2
import os.path
import glob

# Cascade path
cascPath = 'haarcascade_frontalface_default.xml'
# Create the haar cascade
faceCascade = cv2.CascadeClassifier(cascPath)

# Check for output folder and create it if it's not there
if not os.path.exists('output'):
    os.makedirs('output')

# Read images
images = glob.glob('*.jpg')
for c, i in enumerate(images):
    image = cv2.imread(i, 1)
    # Convert to grayscale
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    # Find face(s) using the cascade
    faces = faceCascade.detectMultiScale(
        gray,
        scaleFactor=1.1,    # how much the image is shrunk at each pyramid scale
        minNeighbors=5,     # how many neighbouring detections a candidate needs to be valid
        minSize=(500, 500)  # minimum face size in pixels
    )
    # Output the number of faces found in the image
    print('Found {0} faces!'.format(len(faces)))
    # Crop each detected face (only the last one survives if there are several)
    for (x, y, w, h) in faces:
        imgCrop = image[y:y+h, x:x+w]
    if len(faces) > 0:
        # Save images to the output folder with the original name
        cv2.imwrite('output/' + i, imgCrop)
I can crop using it like this:
# Crop padding
left = 300
right = 300
top = 400
bottom = 1000
for (x, y, w, h) in faces:
    imgCrop = image[y-top:y+h+bottom, x-left:x+w+right]
But that outputs pretty random resolutions, which change based on the image resolution.
TL;DR
To set a new resolution, you can use cv2.resize. Resizing loses pixels, so pick a suitable interpolation method.
face_recognition loads images in RGB order while OpenCV works in BGR, so you may need to convert before saving.
crop = cv2.resize(src=crop, dsize=(2000, 2500), interpolation=cv2.INTER_LANCZOS4)
crop = cv2.cvtColor(crop, cv2.COLOR_RGB2BGR)  # cv2.imwrite expects BGR
cv2.imwrite("image-1.png", crop)
Suggestion:
One approach is to use Python's face_recognition library:
Use two sample images for training.
Predict the faces in the next image based on the training images.
For instance, the following are the training images:
We want to predict the faces in the image below:
When we get the facial encodings of the training images and apply them to the next image:
import face_recognition
import numpy as np
import matplotlib.pyplot as plt
from PIL import Image, ImageDraw

# Load a sample picture and learn how to recognize it.
first_image = face_recognition.load_image_file("images/ex.jpg")
first_face_encoding = face_recognition.face_encodings(first_image)[0]

# Load a second sample picture and learn how to recognize it.
second_image = face_recognition.load_image_file("images/index.jpg")
sec_face_encoding = face_recognition.face_encodings(second_image)[0]

# Create arrays of known face encodings and their names
known_face_encodings = [
    first_face_encoding,
    sec_face_encoding
]
print('Learned encoding for', len(known_face_encodings), 'images.')

# Load an image with an unknown face
unknown_image = face_recognition.load_image_file("images/babes.jpg")

# Find all the faces and face encodings in the unknown image
face_locations = face_recognition.face_locations(unknown_image)
face_encodings = face_recognition.face_encodings(unknown_image, face_locations)

# Convert the image to a PIL-format image so that we can draw on top of it with the Pillow library
# See http://pillow.readthedocs.io/ for more about PIL/Pillow
pil_image = Image.fromarray(unknown_image)
# Create a Pillow ImageDraw Draw instance to draw with
draw = ImageDraw.Draw(pil_image)

# Loop through each face found in the unknown image
for (top, right, bottom, left), face_encoding in zip(face_locations, face_encodings):
    matches = face_recognition.compare_faces(known_face_encodings, face_encoding)
    face_distances = face_recognition.face_distance(known_face_encodings, face_encoding)
    best_match_index = np.argmin(face_distances)
    draw.rectangle(((left, top), (right, bottom)), outline=(0, 0, 255), width=5)

# Remove the drawing library from memory as per the Pillow docs
del draw

# Display the resulting image
plt.imshow(pil_image)
plt.show()
The output will be:
The above is my suggestion. When you resize the current image to a new resolution, there will be pixel loss, therefore you need to use an interpolation method.
For instance, after finding the face locations, select the coordinates in the original image:
# Add after draw.rectangle function.
crop = unknown_image[top:bottom, left:right]
Set the new resolution to 2000 x 2500 and interpolate with cv2.INTER_LANCZOS4.
Possible question: why cv2.INTER_LANCZOS4?
Of course, you can select whatever you like, but in this post cv2.INTER_LANCZOS4 was suggested.
crop = cv2.resize(src=crop, dsize=(2000, 2500), interpolation=cv2.INTER_LANCZOS4)
Save the image:
crop = cv2.cvtColor(crop, cv2.COLOR_RGB2BGR)  # cv2.imwrite expects BGR; face_recognition loads RGB
cv2.imwrite("image-1.png", crop)
The outputs are around 4.3 MB, therefore I can't display them here.
From the final result, we can clearly see and identify the faces; the library precisely finds the faces in the image.
Here is what you can do:
Either use training images from your own set, or use the example above.
Apply the face recognition functions to each image, crop using the detected face locations, and save the results to a directory, as sketched below.
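A minimal sketch of that loop, under the same assumptions as above (the directory and file names are my own):
import glob
import os
import cv2
import face_recognition

os.makedirs("output", exist_ok=True)
for path in glob.glob("images/*.jpg"):
    image = face_recognition.load_image_file(path)  # loaded in RGB order
    for top, right, bottom, left in face_recognition.face_locations(image):
        crop = image[top:bottom, left:right]
        crop = cv2.resize(src=crop, dsize=(2000, 2500), interpolation=cv2.INTER_LANCZOS4)
        crop = cv2.cvtColor(crop, cv2.COLOR_RGB2BGR)  # cv2.imwrite expects BGR
        cv2.imwrite(os.path.join("output", os.path.basename(path)), crop)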
Here is how I got it to crop how I wanted; this is added right below the print that outputs the number of faces:
# Get the face position and output the values into variables
for (x, y, w, h) in faces:
    xdis = x
    ydis = y

# Get the scale value by dividing the wanted head height by the detected head height
ws = 600 / w
hs = 600 / h
# Scale the image to get the head to the right size (bilinear interpolation by default);
# fx scales width, fy height (equal here, since the detected face is square)
scale = cv2.resize(image, (0, 0), fx=ws, fy=hs)
# Calculate the head position for the given values
sxdis = int(xdis * ws)  # applying the scale to the x distance and turning it into an integer
sydis = int(ydis * hs)  # applying the scale to the y distance and turning it into an integer
sycent = sydis + 300    # adding half the head height to get the centre
ystart = sycent - 700   # subtract where you want the head centre to be in pixels (vertical)
yend = ystart + 2500    # add whatever you want the vertical resolution to be
xcent = sxdis + 300     # adding half the head width to get the centre
xstart = xcent - 1000   # subtract where you want the head centre to be in pixels (horizontal)
xend = xstart + 2000    # add whatever you want the horizontal resolution to be
# Crop the image
cropped = scale[ystart:yend, xstart:xend]
It's a mess, but it works exactly how I wanted it to.
I ended up staying with OpenCV instead of switching to face_recognition because of speed, but I might switch over if I can get multithreading to work in face_recognition.

How to combine two RGB images without having white space between them

I am trying to combine some parts of the image together while keeping other parts unchanged.
This is first image
This is the code to get the first image. The input parameter img is the original image, already colorized with green, while jawline, eyebrows, etc. are (x, y) coordinates used to cut those parts from the image:
import cv2
import numpy as np
from PIL import Image, ImageDraw

def getmask(img, jawline, eyebrows, eyes, mouth):
    img = cv2.cvtColor(img, cv2.COLOR_RGB2GRAY)
    imArray = np.asarray(img)
    # create mask
    polygon = jawline.flatten().tolist()
    maskIm = Image.new('L', (imArray.shape[1], imArray.shape[0]), 0)
    ImageDraw.Draw(maskIm).polygon(polygon, outline=1, fill='white')
    #ImageDraw.Draw(maskIm).polygon(polygon, outline=(1))
    # draw eyes
    righteyes = eyes[0:6].flatten().tolist()
    ImageDraw.Draw(maskIm).polygon(righteyes, outline=1, fill='black')
    lefteyes = eyes[6:].flatten().tolist()
    ImageDraw.Draw(maskIm).polygon(lefteyes, outline=1, fill='black')
    # draw eyebrows
    rightbrows = eyebrows[0:6].flatten().tolist()
    ImageDraw.Draw(maskIm).polygon(rightbrows, outline=2, fill='black')
    leftbrows = eyebrows[6:].flatten().tolist()
    ImageDraw.Draw(maskIm).polygon(leftbrows, outline=2, fill='black')
    # draw mouth
    mouth = mouth.flatten().tolist()
    ImageDraw.Draw(maskIm).polygon(mouth, outline=1, fill='black')
    mask = np.array(maskIm)
    # L and P are presumably the image height and width, defined elsewhere
    mask = np.multiply(img, mask) + np.multiply((1 - mask), np.ones((L, P, 3)))
    return mask
This is the second image, which will fill the blank white areas inside the first image.
I used the code below to cut out the parts; it is very similar to the code for the first image.
def getface(img, eyebrows, eyes, mouth):
    im = img.copy()
    img = cv2.cvtColor(img, cv2.COLOR_RGB2GRAY)
    imArray = np.asarray(img)
    # create mask
    maskIm = Image.new('L', (imArray.shape[1], imArray.shape[0]), 0)
    righteyes = eyes[0:6].flatten().tolist()
    ImageDraw.Draw(maskIm).polygon(righteyes, outline=1, fill='white')
    lefteyes = eyes[6:].flatten().tolist()
    ImageDraw.Draw(maskIm).polygon(lefteyes, outline=1, fill='white')
    # draw eyebrows
    rightbrows = eyebrows[0:6].flatten().tolist()
    ImageDraw.Draw(maskIm).polygon(rightbrows, outline=2, fill='white')
    leftbrows = eyebrows[6:].flatten().tolist()
    ImageDraw.Draw(maskIm).polygon(leftbrows, outline=2, fill='white')
    # draw mouth
    mouth = mouth.flatten().tolist()
    ImageDraw.Draw(maskIm).polygon(mouth, outline=1, fill='white')
    mask = np.array(maskIm)
    cutted_part = cv2.bitwise_or(im, im, mask=mask)
    return cutted_part
So far I have tried to combine the two images by first inverting the second image, so that the black background becomes white, and then multiplying the first and second images. But the result isn't satisfactory.
As you can see, there is some white space between the combined areas, and I notice that some parts from the second image become smaller or go missing, which I suspect creates that white space when combined (please don't mind the slightly different colour of the result). Can someone share how to resolve this problem, or a better way to combine two images?
If you provide your results as actual pictures instead of cropped screenshots, we can reproduce your problem. So far I would recommend:
Invert the background of your cutout (black to white) and then simply combine both pictures, either by adding them (they need to have the same dimensions, which I presume is the case) or by overlaying them with OpenCV's addWeighted function to adjust the opacity, as sketched below.
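A minimal sketch of the overlay idea (the file names are hypothetical; both images are assumed to be same-size BGR arrays):
import cv2

base = cv2.imread("first_image.png")   # image with the blank white face areas
cutout = cv2.imread("face_parts.png")  # cutout whose background was inverted to white
# 50/50 blend; adjust the weights to change the opacity
combined = cv2.addWeighted(base, 0.5, cutout, 0.5, 0)
cv2.imwrite("combined.png", combined)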

How to copy a cropped image onto the original one, given the coordinates of the center of the crop

I'm cropping an image like this:
self.rst = self.img_color[self.param_a_y:self.param_b_y,
                          self.param_a_x:self.param_b_x]
How do I copy this image back onto the original one? The data I have available are the coordinates in the original image that mark the center of the crop.
It seems like there's no copy_to() function for Python.
I failed to get copy_to() working myself a few days ago, but I came up with a different solution: you can use masks for this task.
I have an example at hand which shows how to create a mask from a defined colour range using inRange. With that mask, you create two partial images (= masks), one for the old content and one for the new content; the unused area in both images is black. Finally, a simple bitwise_or combines both images.
This works for arbitrary shapes, so you can easily adapt it to rectangular ROIs.
import cv2
import numpy as np

img = cv2.imread('image.png')
rows, cols, bands = img.shape
print(rows, cols, bands)

# Create an image with the new colour for the replacement
new_colour_image = np.zeros((rows, cols, 3), np.uint8)
new_colour_image[:, :] = (255, 0, 0)

# Define the range of colours to be exchanged (here a single colour, but it could be a range)
lower_limit = np.array([0, 0, 0])
upper_limit = np.array([0, 0, 0])

# Generate the mask for the pixels to be exchanged
new_colour_mask = cv2.inRange(img, lower_limit, upper_limit)
# Generate the mask for the pixels to be kept
old_image_mask = cv2.bitwise_not(new_colour_mask)

# Part of the image which is kept
img2 = cv2.bitwise_and(img, img, mask=old_image_mask)
# Part of the image which is replaced
new_colour_image = cv2.bitwise_and(new_colour_image, new_colour_image, mask=new_colour_mask)

# Combination of the two parts
result = cv2.bitwise_or(img2, new_colour_image)

cv2.imshow('image', img)
cv2.imshow('mask', new_colour_mask)
cv2.imshow('r', result)
cv2.waitKey(0)
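For the purely rectangular case in the question, a plain NumPy slice assignment also works (a minimal sketch with stand-in arrays; crop, cy and cx are hypothetical names for the cropped patch and its centre in the original):
import numpy as np

img_color = np.zeros((100, 100, 3), np.uint8)  # stand-in for the original image
crop = np.full((20, 30, 3), 255, np.uint8)     # stand-in for the cropped patch
cy, cx = 50, 50                                # centre of the crop in the original

h, w = crop.shape[:2]
y1, x1 = cy - h // 2, cx - w // 2              # top-left corner from the centre
img_color[y1:y1 + h, x1:x1 + w] = crop         # slice assignment copies the patch back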
