I have many x-ray scans and need to crop the scanned object from its background noise.
The files are in .png format, and I am planning to use OpenCV in Python for this task. I have seen some approaches that use findContours(), but I am unsure whether thresholding will work in this case.
Before Image:
After/Cropped Image:
Any suggested solution/code is appreciated.
Here is one way to do that in Python/OpenCV. It assumes you have the same excess border in all your images so that one can sort contours by area and skip the largest contour to get the second largest one.
import cv2
import numpy as np
# load image
img = cv2.imread("table_xray.jpg")
hh, ww = img.shape[:2]
# convert to gray
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
# median filter
filt = cv2.medianBlur(gray, 15)
# threshold the filtered image and invert
thresh = cv2.threshold(filt, 64, 255, cv2.THRESH_BINARY)[1]
thresh = 255 - thresh
# find contours and store index with area in list
cntrs_info = []
contours = cv2.findContours(thresh, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
contours = contours[0] if len(contours) == 2 else contours[1]
index = 0
for cntr in contours:
    area = cv2.contourArea(cntr)
    cntrs_info.append((index, area))
    index = index + 1
# sort contours by area
def takeSecond(elem):
    return elem[1]
cntrs_info.sort(key=takeSecond, reverse=True)
# get bounding box of second largest contour skipping large border
index_second = cntrs_info[1][0]
x,y,w,h = cv2.boundingRect(contours[index_second])
# crop input image
results = img[y:y+h,x:x+w]
# write result to disk
cv2.imwrite("table_xray_thresholded.png", thresh)
cv2.imwrite("table_xray_extracted.png", results)
cv2.imshow("THRESH", thresh)
cv2.imshow("RESULTS", results)
# keep the windows open until a key is pressed
cv2.waitKey(0)
cv2.destroyAllWindows()
Filtered and Thresholded Image:
Cropped Result:
This is another possible solution. It uses the K-Channel of your input image, once converted to the CMYK color-space. The K (or Key) channel has most of the information of the black color, so it should be useful for segmenting the input image. After that, you can apply a heavy morphological chain to produce a good mask of the object. After that, cropping the object is very straightforward. Let's see the code:
# Imports
import cv2
import numpy as np
# Read image
imagePath = "D://opencvImages//"
inputImage = cv2.imread(imagePath+"jU6QA.jpg")
# Convert to float and divide by 255:
imgFloat = inputImage.astype(np.float64) / 255.  # np.float was removed in NumPy 1.24
# Calculate channel K:
kChannel = 1 - np.max(imgFloat, axis=2)
# Convert back to uint 8:
kChannel = (255*kChannel).astype(np.uint8)
The first bit of the program converts your image to the CMYK color-space and extracts the K channel. OpenCV has no direct conversion to this color-space, so a manual conversion is necessary. We need to be careful with the data types because there are float operations involved. The resulting image is this:
Pixels with black information are assigned an intensity close to 255. Now, let's threshold this image to get a binary mask. The threshold level is fixed:
# Threshold the image with a fixed thresh level
thresholdLevel = 200
_, binaryImage = cv2.threshold(kChannel, thresholdLevel, 255, cv2.THRESH_BINARY)
This produces the following binary image:
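To see the K-channel arithmetic and the fixed threshold on concrete numbers, here is a small NumPy-only sketch. The three pixel values are made up for illustration, and np.where is used here only to mimic the rule that cv2.THRESH_BINARY applies (values strictly above the level become 255).

```python
import numpy as np

# Three BGR pixels: pure black, mid gray, pure red (illustrative values)
pixels = np.array([[[0, 0, 0], [128, 128, 128], [0, 0, 255]]], dtype=np.uint8)

img_float = pixels.astype(np.float64) / 255.0
k_channel = 1.0 - np.max(img_float, axis=2)   # K = 1 - max(B, G, R)
k_uint8 = (255 * k_channel).astype(np.uint8)

# Same rule as cv2.THRESH_BINARY: strictly above the level -> 255, else 0
threshold_level = 200
binary = np.where(k_uint8 > threshold_level, 255, 0).astype(np.uint8)

print(k_uint8)   # dark pixels get high K values, bright or saturated pixels get low ones
print(binary)    # only the black pixel survives the threshold
```

Note that a saturated color such as pure red has K = 0, so it is treated like a bright pixel by this mask.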
Alright. We need to isolate the object, but we have both the lines of the background and the "frame" around the image. Let's get rid of the lines first. We will apply a morphological Erosion. Then, we will remove the frame by Flood-Filling with black color at two locations: the upper left and bottom right of the image. After that, we will apply a Dilation to restore the object's original size. I wrapped these OpenCV functions inside custom functions that save me the typing of a couple of lines. These helper functions are presented at the end of the post. This is the approach:
# Perform Small Erosion:
binaryImage = morphoOperation(binaryImage, 3, 5, "Erode")
# Flood-Fill at two locations: Top left corner and bottom right:
(imageHeight, imageWidth) = binaryImage.shape[:2]
floodPositions = [(0, 0),(imageWidth-1, imageHeight-1)]
binaryImage = floodFill(binaryImage, floodPositions, 0)
# Perform Small Dilate:
binaryImage = morphoOperation(binaryImage, 3, 5, "Dilate")
This is the result:
Nice. We can improve the mask by applying a second morphological chain, this time with more iterations. Let's apply a Dilation to try and join the "holes" of the object, followed by an Erosion to, once again, restore the object's original size:
# Perform Big Dilate:
binaryImage = morphoOperation(binaryImage, 3, 10, "Dilate")
# Perform Big Erode:
binaryImage = morphoOperation(binaryImage, 3, 10, "Erode")
This yields the following result:
The gaps inside the object have been filled. Now, let's retrieve the contours on this mask to find the object's contour. I've additionally included an area filter. The mask is pretty clean by this point, so maybe this filter is not too necessary. Once the contour is located, we can crop the object from the original image:
# Find the contours on the binary image:
contours, hierarchy = cv2.findContours(binaryImage, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
# BGR image for drawing results:
binaryBGR = cv2.cvtColor(binaryImage, cv2.COLOR_GRAY2BGR)
# Look for the outer bounding boxes (no children):
for _, c in enumerate(contours):
    # Get blob area:
    currentArea = cv2.contourArea(c)
    # Set a min area value:
    minArea = 10000
    if minArea < currentArea:
        # Get the contour's bounding rectangle:
        boundRect = cv2.boundingRect(c)
        # Get the dimensions of the bounding rect:
        rectX = boundRect[0]
        rectY = boundRect[1]
        rectWidth = boundRect[2]
        rectHeight = boundRect[3]
        # Set bounding rect:
        color = (0, 255, 0)
        cv2.rectangle(binaryBGR, (int(rectX), int(rectY)),
                      (int(rectX + rectWidth), int(rectY + rectHeight)), color, 5)
        cv2.imshow("Rects", binaryBGR)
        # Crop original input:
        currentCrop = inputImage[rectY:rectY + rectHeight, rectX:rectX + rectWidth]
        cv2.imshow("Cropped", currentCrop)
        # Wait for a key press so the windows are actually displayed:
        cv2.waitKey(0)
The last step produces the following two images. The first is the object enclosed by a rectangle, the second one is the actual crop:
I also tested the algorithm with your second image. These are the final results:
Wow. Somebody brought a gun to the airport? That's not OK. These are the helper functions used earlier. This first function performs the morphological operations:
def morphoOperation(binaryImage, kernelSize, opIterations, opString):
    # Get the structuring element:
    morphKernel = cv2.getStructuringElement(cv2.MORPH_RECT, (kernelSize, kernelSize))
    # Select the operation:
    if opString == "Dilate":
        op = cv2.MORPH_DILATE
    elif opString == "Erode":
        op = cv2.MORPH_ERODE
    else:
        raise ValueError("Unsupported operation: " + opString)
    # Perform Operation:
    outImage = cv2.morphologyEx(binaryImage, op, morphKernel, None, None, opIterations,
                                cv2.BORDER_REFLECT101)
    return outImage
The second function performs Flood-Filling given a list of seed-points:
def floodFill(binaryImage, positions, color):
    # Loop thru the positions list of tuples:
    for p in range(len(positions)):
        currentSeed = positions[p]
        x = int(currentSeed[0])
        y = int(currentSeed[1])
        # Apply flood-fill:
        cv2.floodFill(binaryImage, mask=None, seedPoint=(x, y), newVal=(color))
    return binaryImage
I want to retrieve all contours of the image below, but ignore text.
When I try to find the contours of the current image I get the following:
I have no idea how to go about this, as I am new to OpenCV and image processing. I want to ignore the text; how can I achieve this? If ignoring it is not possible, but drawing a single bounding box around the text is, then that would be good too.
Criteria that I need to match:
The contours may vary in size and shape.
The colors from the image may differ.
The colors and size of the text inside the image may differ.
Here is one way to do that in Python/OpenCV.
Read the input
Convert to grayscale
Get Canny edges
Apply morphology close to ensure they are closed
Get all contour hierarchy
Filter contours to keep only those above threshold in perimeter
Draw contours on input
Draw each contour on a black background
Save results
import numpy as np
import cv2
# read input
img = cv2.imread('short_title.png')
# convert to gray
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
# get canny edges
edges = cv2.Canny(gray, 1, 50)
# apply morphology close to ensure they are closed
kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (3,3))
edges = cv2.morphologyEx(edges, cv2.MORPH_CLOSE, kernel)
# get contours
contours = cv2.findContours(edges, cv2.RETR_TREE, cv2.CHAIN_APPROX_NONE)
contours = contours[0] if len(contours) == 2 else contours[1]
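# filter contours to keep only large ones
result = img.copy()
i = 1
for c in contours:
    perimeter = cv2.arcLength(c, True)
    if perimeter > 500:
        # draw the contour on a copy of the input
        cv2.drawContours(result, c, -1, (0,0,255), 1)
        # draw the same contour alone on a black background
        contour_img = np.zeros_like(img, dtype=np.uint8)
        cv2.drawContours(contour_img, c, -1, (0,0,255), 1)
        # save it (this file name is an assumption; the original write call is missing)
        cv2.imwrite("short_title_contour_{0}.jpg".format(i), contour_img)
        i = i + 1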
# save results
cv2.imwrite("short_title_gray.jpg", gray)
cv2.imwrite("short_title_edges.jpg", edges)
cv2.imwrite("short_title_contours.jpg", result)
# show images
cv2.imshow("gray", gray)
cv2.imshow("edges", edges)
cv2.imshow("result", result)
cv2.waitKey(0)
cv2.destroyAllWindows()
All contours on input:
Contour 1:
Contour 2:
Contour 3:
Contour 4:
Here are two options for erasing the text:
Using pytesseract OCR.
Finding white (and small) connected components.
Both solutions build a mask, dilate the mask, and use cv2.inpaint to erase the text.
Using pytesseract:
Find text boxes using pytesseract.image_to_boxes.
Fill the boxes in the mask with 255.
Code sample:
import cv2
import numpy as np
from pytesseract import pytesseract, Output
# Tesseract path
pytesseract.tesseract_cmd = "C:\\Program Files\\Tesseract-OCR\\tesseract.exe"
img = cv2.imread('ShortAndInteresting.png')
# https://stackoverflow.com/questions/20831612/getting-the-bounding-box-of-the-recognized-words-using-python-tesseract
boxes = pytesseract.image_to_boxes(img, lang='eng', config=' --psm 6') # Run tesseract, returning the bounding boxes
h, w, _ = img.shape # assumes color image
mask = np.zeros((h, w), np.uint8)
# Fill the bounding boxes on the image
for b in boxes.splitlines():
    b = b.split(' ')
    mask = cv2.rectangle(mask, (int(b[1]), h - int(b[2])), (int(b[3]), h - int(b[4])), 255, -1)
mask = cv2.dilate(mask, np.ones((5, 5), np.uint8)) # Dilate the boxes in the mask
clean_img = cv2.inpaint(img, mask, 2, cv2.INPAINT_NS) # Remove the text using inpaint (replace the masked pixels with the neighbor pixels).
# Show mask and clean_img for testing
cv2.imshow('mask', mask)
cv2.imshow('clean_img', clean_img)
cv2.waitKey(0)
cv2.destroyAllWindows()
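The h - int(...) terms in the loop above are there because Tesseract's box output measures y from the bottom of the image, while OpenCV measures it from the top. The helper below is a hypothetical, pure-Python sketch of that conversion (box_line_to_rect is a name invented for this example).

```python
def box_line_to_rect(line, image_height):
    """Convert one line of Tesseract box output to two OpenCV corner points."""
    # Box line layout: <char> <left> <bottom> <right> <top> <page>, origin at bottom-left
    _, left, bottom, right, top, *_ = line.split(' ')
    top_left = (int(left), image_height - int(top))
    bottom_right = (int(right), image_height - int(bottom))
    return top_left, bottom_right

# Illustrative box line for a 100-pixel-high image
print(box_line_to_rect("S 10 20 30 50 0", image_height=100))
```

The answer's code passes the bottom-left and top-right corners instead; cv2.rectangle draws the same rectangle for either pair of opposite corners.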
Finding white (and small) connected components:
Use mask = cv2.inRange(img, (230, 230, 230), (255, 255, 255)) for finding the text (assume the text is white).
Finding connected components in the mask using cv2.connectedComponentsWithStats(mask, 4)
Remove large components from the mask - fill components with large area with zeros.
Code sample:
import cv2
import numpy as np
img = cv2.imread('ShortAndInteresting.png')
mask = cv2.inRange(img, (230, 230, 230), (255, 255, 255))
nlabel, labels, stats, centroids = cv2.connectedComponentsWithStats(mask, 4) # Finding connected components with statistics
# Remove large components from the mask (fill components with large area with zeros).
for i in range(1, nlabel):
    area = stats[i, cv2.CC_STAT_AREA] # Get area
    if area > 1000:
        mask[labels == i] = 0 # Remove large connected components from the mask (fill with zero)
mask = cv2.dilate(mask, np.ones((5, 5), np.uint8)) # Dilate the text in the mask
cv2.imwrite('mask2.png', mask)
clean_img = cv2.inpaint(img, mask, 2, cv2.INPAINT_NS) # Remove the text using inpaint (replace the masked pixels with the neighbor pixels).
# Show mask and clean_img for testing
cv2.imshow('mask', mask)
cv2.imshow('clean_img', clean_img)
cv2.waitKey(0)
cv2.destroyAllWindows()
Clean image:
My assumption is that you know how to split the image into contours, and the only issue is the presence of the text.
I would recommend using flood fill: find the seed point for each color region and flood fill it so that the text values within are ignored. Hope that helps!
Refer to example of using floodfill here: https://www.programcreek.com/python/example/89425/cv2.floodFill
Example below copied from link above
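import cv2 as cv
import numpy as np
def fillhole(input_image):
    """
    input gray binary image, get the filled image by floodfill method
    Note: only holes surrounded in the connected regions will be filled.
    :param input_image:
    :return: the input with its enclosed holes filled
    """
    im_flood_fill = input_image.copy()
    h, w = input_image.shape[:2]
    # floodFill requires a mask 2 pixels larger than the image
    mask = np.zeros((h + 2, w + 2), np.uint8)
    im_flood_fill = im_flood_fill.astype("uint8")
    # fill the outer background, starting from the top-left corner
    cv.floodFill(im_flood_fill, mask, (0, 0), 255)
    # pixels that are still black are holes; invert to select them
    im_flood_fill_inv = cv.bitwise_not(im_flood_fill)
    img_out = input_image | im_flood_fill_inv
    return img_out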
I have an image that I am attempting to split into its separate components. I have successfully created a mask of the objects in the image using k-means clustering (I have included the results and mask below).
I am then trying to crop each individual part of the original image and save it to a new image. Is this possible?
import numpy as np
import cv2
path = 'software (1).jpg'
img = cv2.imread(path)
img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
twoDimage = img.reshape((-1,3))
twoDimage = np.float32(twoDimage)
criteria = (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 10, 1.0)
K = 2
attempts = 10  # assumed value: attempts is not defined anywhere in the snippet as posted
ret,label,center = cv2.kmeans(twoDimage,K,None,criteria,attempts,cv2.KMEANS_PP_CENTERS)
center = np.uint8(center)
res = center[label.flatten()]
result_image = res.reshape((img.shape))
Original image
Result of k-means
My solution involves creating a binary object mask where all the objects are colored in white and the background in black. I then extract each object based on area, from largest to smallest. I use this "isolated object" mask to segment each object in the original image. I then write the result to disk. These are the steps:
Resize the image (your original input is gigantic)
Convert to grayscale
Extract each object based on area from largest to smallest
Create a binary mask of the isolated object
Apply a little bit of morphology to enhance the mask
Mask the original BGR image with the binary mask
Apply flood-fill to color the background with white
Save image to disk
Repeat the process for all the objects in the image
Let's see the code. Through the script I use two helper functions: writeImage and findBiggestBlob. The first function is pretty self-explanatory. The second function creates a binary mask of the biggest blob in a binary input image. Both functions are presented here:
# Writes a PNG image:
def writeImage(imagePath, inputImage):
    imagePath = imagePath + ".png"
    cv2.imwrite(imagePath, inputImage, [cv2.IMWRITE_PNG_COMPRESSION, 0])
    print("Wrote Image: " + imagePath)
# Returns a binary mask that keeps only the biggest blob:
def findBiggestBlob(inputImage):
    # Store a copy of the input image:
    biggestBlob = inputImage.copy()
    # Set initial values for the
    # largest contour:
    largestArea = 0
    largestContourIndex = 0
    # Find the contours on the binary image:
    contours, hierarchy = cv2.findContours(inputImage, cv2.RETR_CCOMP, cv2.CHAIN_APPROX_SIMPLE)
    # Get the largest contour in the contours list:
    for i, cc in enumerate(contours):
        # Find the area of the contour:
        area = cv2.contourArea(cc)
        # Store the index of the largest contour:
        if area > largestArea:
            largestArea = area
            largestContourIndex = i
    # Once we get the biggest blob, paint it black:
    tempMat = inputImage.copy()
    cv2.drawContours(tempMat, contours, largestContourIndex, (0, 0, 0), -1, 8, hierarchy)
    # Erase smaller blobs:
    biggestBlob = biggestBlob - tempMat
    return biggestBlob
Now, let's check out the main script. Let's read the image and get the initial binary mask:
# Imports
import cv2
import numpy as np
# Read image
imagePath = "D://opencvImages//"
inputImage = cv2.imread(imagePath + "L85Bu.jpg")
# Get image dimensions
originalImageHeight, originalImageWidth = inputImage.shape[:2]
# Resize at a fixed scale:
resizePercent = 30
resizedWidth = int(originalImageWidth * resizePercent / 100)
resizedHeight = int(originalImageHeight * resizePercent / 100)
# resize image
inputImage = cv2.resize(inputImage, (resizedWidth, resizedHeight), interpolation=cv2.INTER_LINEAR)
writeImage(imagePath+"objectInput", inputImage)
# Deep BGR copy:
colorCopy = inputImage.copy()
# Convert BGR to grayscale:
grayscaleImage = cv2.cvtColor(inputImage, cv2.COLOR_BGR2GRAY)
# Threshold with a fixed level of 250 (inverted):
_, binaryImage = cv2.threshold(grayscaleImage, 250, 255, cv2.THRESH_BINARY_INV)
This is the input resized to 30% of its original size, according to resizePercent:
And this is the binary mask created with a fixed threshold of 250:
Now, I'm gonna run this mask through a while loop. With each iteration I'll extract the biggest blob until there's no blobs left. Each step will create a new binary mask where the only thing present is one object at a time. This will be the key to isolating the objects in the original (resized) BGR image:
# Image counter to write pngs to disk:
imageCounter = 0
# Segmentation flag to stop the processing loop:
segmentObjects = True
while (segmentObjects):
    # Get biggest object on the mask:
    currentBiggest = findBiggestBlob(binaryImage)
    # Use a little bit of morphology to "widen" the mask:
    kernelSize = 3
    opIterations = 2
    morphKernel = cv2.getStructuringElement(cv2.MORPH_RECT, (kernelSize, kernelSize))
    # Perform Dilate:
    binaryMask = cv2.morphologyEx(currentBiggest, cv2.MORPH_DILATE, morphKernel, None, None, opIterations, cv2.BORDER_REFLECT101)
    # Mask the original BGR (resized) image:
    blobMask = cv2.bitwise_and(colorCopy, colorCopy, mask=binaryMask)
    # Flood-fill at the top left corner:
    fillPosition = (0, 0)
    # Use white color:
    fillColor = (255, 255, 255)
    colorTolerance = (0,0,0)
    cv2.floodFill(blobMask, None, fillPosition, fillColor, colorTolerance, colorTolerance)
    # Write file to disk:
    writeImage(imagePath+"object-"+str(imageCounter), blobMask)
    imageCounter += 1
    # Subtract current biggest blob from the
    # original binary mask:
    binaryImage = binaryImage - currentBiggest
    # Check for stop condition - all pixels
    # in the binary mask should be black:
    whitePixels = cv2.countNonZero(binaryImage)
    # Compare against a threshold - 1% of
    # the resized pixel count (0.01 factor):
    whitePixelThreshold = 0.01 * (resizedWidth * resizedHeight)
    if (whitePixels < whitePixelThreshold):
        segmentObjects = False
There are some things worth noting here. This is the first isolated mask created for the first object:
Nice. A simple mask with the BGR image will do. However, I can improve the quality of the mask if I apply a dilate morphological operation. This will "widen" the blob, covering the original outline by a few pixels. (The operation actually searches for the maximum intensity pixel within a Neighborhood of pixels). Next, the masking will produce a BGR image where there's only the object blob and a black background. I don't want that black background, I want it white. I flood-fill at the top left corner to get the first BGR mask:
I save each mask as a new file on disk. Very cool. Now, the condition to break from the loop is pretty simple - stop when all the blobs have been processed. To achieve this I subtract the current biggest blob from the original binary mask and count the number of white pixels. When the count is below a certain threshold (in this case 1% of the resized image's pixel count, the 0.01 factor in the code), the loop stops.
Check out this gif of every object isolated. Each frame is saved to disk as a png file:
1. Detect external contours.
2. For each contour:
2.1 Create a black mask image.
2.2 Draw the i-th filled contour on the mask image.
2.3 Create a black result image for the i-th part.
2.4 Copy from the source image to the result image, using the mask just created.
2.5 Save the result image for the i-th part.
I tried the code provided below to segment each digit in this image, put a contour around it, and then crop it out, but it's giving me bad results. I'm not sure what I need to change or work on.
The best idea I can think of right now is filtering the 4 largest contours in the image except the image contour itself.
The code I'm working with:
import sys
import numpy as np
import cv2
im = cv2.imread('marks/mark28.png')
im3 = im.copy()
gray = cv2.cvtColor(im, cv2.COLOR_BGR2GRAY)
blur = cv2.GaussianBlur(gray, (5, 5), 0)
thresh = cv2.adaptiveThreshold(blur, 255, 1, 1, 11, 2)
################# Now finding Contours ###################
contours, hierarchy = cv2.findContours(thresh, cv2.RETR_LIST, cv2.CHAIN_APPROX_SIMPLE)
samples = np.empty((0, 100))
responses = []
keys = [i for i in range(48, 58)]
for cnt in contours:
    if cv2.contourArea(cnt) > 50:
        [x, y, w, h] = cv2.boundingRect(cnt)
        if h > 28:
            cv2.rectangle(im, (x, y), (x + w, y + h), (0, 0, 255), 2)
            roi = thresh[y:y + h, x:x + w]
            roismall = cv2.resize(roi, (10, 10))
            cv2.imshow('norm', im)
            key = cv2.waitKey(0)
            if key == 27: # (escape to quit)
                sys.exit()
            elif key in keys:
                # store the typed digit as the label for this sample
                responses.append(int(chr(key)))
                sample = roismall.reshape((1, 100))
                samples = np.append(samples, sample, 0)
responses = np.array(responses, np.float32)
responses = responses.reshape((responses.size, 1))
print("training complete")
np.savetxt('generalsamples.data', samples)
np.savetxt('generalresponses.data', responses)
I need to change the if condition on height probably but more importantly I need if conditions to get the 4 largest contours on the image. Sadly, I haven't managed to find what I'm supposed to be filtering.
This is the kind of results I get, I'm trying to escape getting those inner contours on the digit "zero"
Unprocessed images as requested: example 1 example 2
All I need is an idea on what I should filter for, don't write code please. Thank you community.
You almost have it. You have multiple bounding rectangles on each digit because you are retrieving every contour (external and internal). You are using cv2.findContours in RETR_LIST mode, which retrieves all the contours, but doesn't create any parent-child relationship. The parent-child relationship is what discriminates between inner (child) and outer (parent) contours; OpenCV calls this "Contour Hierarchy". Check out the docs for an overview of all hierarchy modes. Of particular interest is RETR_EXTERNAL mode. This mode fetches only external contours - so you don't get multiple contours and (by extension) multiple bounding boxes for each digit!
Also, it seems that your images have a red border. This will introduce noise while thresholding the image, and this border might be recognized as the top-level outer contour - thus, every other contour (the children of this parent contour) will not be fetched in RETR_EXTERNAL mode. Fortunately, the border position seems constant and we can eliminate it with a simple flood-fill, which pretty much fills a blob of a target color with a substitute color.
Let's check out the reworked code:
# Imports:
import cv2
import numpy as np
# Set image path
path = "D://opencvImages//"
fileName = "rhWM3.png"
# Read Input image
inputImage = cv2.imread(path+fileName)
# Deep copy for results:
inputImageCopy = inputImage.copy()
# Convert BGR to grayscale:
grayscaleImage = cv2.cvtColor(inputImage, cv2.COLOR_BGR2GRAY)
# Threshold via Otsu:
threshValue, binaryImage = cv2.threshold(grayscaleImage, 0, 255, cv2.THRESH_BINARY_INV+cv2.THRESH_OTSU)
The first step is to get the binary image with all the target blobs/contours. This is the result so far:
Notice the border is white. We have to delete this, a simple flood-filling at position (x=0,y=0) with black color will suffice:
# Flood-fill border, seed at (0,0) and use black (0) color:
cv2.floodFill(binaryImage, None, (0, 0), 0)
This is the filled image, no more border!
Now we can retrieve the external, outermost contours in RETR_EXTERNAL mode:
# Get each bounding box
# Find the big contours/blobs on the filtered image:
contours, hierarchy = cv2.findContours(binaryImage, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
Notice you also get each contour's hierarchy as second return value. This is useful if you want to check out if the current contour is a parent or a child. Alright, let's loop through the contours and get their bounding boxes. If you want to ignore contours below a minimum area threshold, you can also implement an area filter:
# Look for the outer bounding boxes (no children):
for _, c in enumerate(contours):
    # Get the bounding rectangle of the current contour:
    boundRect = cv2.boundingRect(c)
    # Get the bounding rectangle data:
    rectX = boundRect[0]
    rectY = boundRect[1]
    rectWidth = boundRect[2]
    rectHeight = boundRect[3]
    # Estimate the bounding rect area:
    rectArea = rectWidth * rectHeight
    # Set a min area threshold
    minArea = 10
    # Filter blobs by area:
    if rectArea > minArea:
        # Draw bounding box:
        color = (0, 255, 0)
        cv2.rectangle(inputImageCopy, (int(rectX), int(rectY)),
                      (int(rectX + rectWidth), int(rectY + rectHeight)), color, 2)
        cv2.imshow("Bounding Boxes", inputImageCopy)
        # Crop bounding box:
        currentCrop = inputImage[rectY:rectY+rectHeight,rectX:rectX+rectWidth]
        cv2.imshow("Current Crop", currentCrop)
        cv2.waitKey(0)
The last three lines of the above snippet crop and show the current digit. This is the result of detected bounding boxes for both of your images (the bounding boxes are colored in green, the red border is part of the input images):
The task I'm trying to accomplish is isolating certain objects in an image: find the contours in the image's mask, select each contour (based on area) and isolate it, and then use that contour to crop the same region in the original image in order to get the pixel values of the region.
Here is the code I wrote to get just one contour and isolate it with the original pixel values:
import cv2
import matplotlib.pyplot as plt
import numpy as np
image = cv2.imread("./xxxx/xx.png")
mask = cv2.imread("./xxxx/xxx.png")
# making them the same size (function I wrote)
image, mask = resize_two_images(image,mask)
#grayscalling the mask (using cv2.cvtCOLOR)
mask = to_gray(mask)
# a function I wrote to display images using plt
display(image,"image: original image")
display(mask,"mask: mask of the image")
th, mask = cv2.threshold(mask, 0, 255, cv2.THRESH_BINARY | cv2.THRESH_OTSU)
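# NOTE: the call was truncated in the original post; the arguments below are assumed
contours, hierarchy = cv2.findContours(mask, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)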
for i in range(len(contours)):
    hier = hierarchy[0][i][3]
    cnt = contours[i]
    cntArea = cv2.contourArea(cnt)
    if 1000 < cntArea < 2000:
        # breaking because I'm just keeping the first contour that fulfills the condition
        break
# Creating two zero numpy array
img1 = np.zeros(image.shape, dtype=np.uint8)
img2 = img1.copy()
# drawing the contour, which will basically give the edges of the object
cv2.drawContours(img1, [cnt], -1, (255,255,255), 2)
# drawing inside the edges
cv2.fillPoly(img2, [cnt], (255,255,255))
# adding the filled poly to the drawn contour gives a bigger object that
# contains both the edges and the insides of the object
img1 = img1 + img2
display(img1,"img1: final")
res = np.bitwise_and(img1,image)
display(res,"res : the ROI with original pixel values")
#cropping the ROI (the object we want)
x,y,w,h = cv2.boundingRect(cnt)
# (de)increased values in order to get nonzero borders or lost pixels
res1 = res[y-1:y+h+1,x-1:x+w+1]
display(res1,"res1: cropped ROI")
The problem is that yes I found a way to do it for just one contour, but is there another way where I can do it more efficiently because per image there could be hundreds of contours.
It's not clear if you want to have just one image with all selected contours as the output, or one individual image per selected contour.
You could get one image with all selected contours in an efficient manner.
First select all the contours you want to work with, then, plot all the contours filling them with white color so you can use this as a mask, and then mask the original image:
selected_contours = [c for c in contours if cv2.contourArea(c) >= 2000]
# the last parameter, negative line thickness, fills the contour
mask = cv2.drawContours(img1, selected_contours, -1, (255,255,255), -1)
res = np.bitwise_and(mask,image)
I'm trying to identify a list of objects that appeared newly in a photo. The plan is to get multiple cropped images from the original image and feed them to a neural network for object detection. Right now, I'm having trouble in extracting objects that appeared in a frame.
import cv2 as cv
import matplotlib.pyplot as plt
def mdisp(image):
im1 = cv.imread('images/litter-before.jpg')
im2 = cv.imread('images/litter-after.jpg')
fgmask = backsub1.apply(im1)
fgmask = backsub1.apply(im2)
new_image = im2 * (fgmask[:,:,None].astype(im2.dtype))
Ideally, I would like to get a cropped picture of the item within the red circle. How can I do it with OpenCV?
Here's an approach, subtracting the two frames directly. The idea is that you first convert your images to grayscale, then blur a little bit to ignore the noise. Subtract the two frames, threshold the difference and look for the largest blob that is above a certain area threshold value.
Let's see:
import cv2
import numpy as np
# image path
path = "C:/opencvImages/"
fileName01 = "01.jpg"
fileName02 = "02.jpg"
# Read the 2 images in default mode:
image01 = cv2.imread(path + fileName01)
image02 = cv2.imread(path + fileName02)
# Store a copy of the last frame for results drawing:
inputCopy = image02.copy()
# Convert BGR images to grayscale:
grayscaleImage01 = cv2.cvtColor(image01, cv2.COLOR_BGR2GRAY)
grayscaleImage02 = cv2.cvtColor(image02, cv2.COLOR_BGR2GRAY)
# Apply a median blur to suppress noise:
filterSize = 5
imageMedian01 = cv2.medianBlur(grayscaleImage01, filterSize)
imageMedian02 = cv2.medianBlur(grayscaleImage02, filterSize)
Now you have the grayscale, blurred frames. Next, we need to calculate the difference between these frames. I don't wanna lose data, so I have to be careful with the data type here. Remember that these are grayscale, uint8 matrices, but the difference could potentially yield negative values. Let's convert the matrices to floats, take the difference, clip it to the [0, 255] range, and convert this matrix back to uint8:
# uint8 to float32 conversion:
imageMedian01 = imageMedian01.astype('float32')
imageMedian02 = imageMedian02.astype('float32')
# Take the difference and convert back to uint8
imageDifference = np.clip(imageMedian01 - imageMedian02, 0, 255)
imageDifference = imageDifference.astype('uint8')
This gives you the frames difference:
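To make the data-type concern concrete, here is a tiny NumPy-only sketch with made-up pixel values. It compares a direct uint8 subtraction, the float-and-clip approach used above, and an absolute difference, which is shown only as an alternative and is not part of the original answer.

```python
import numpy as np

# Two tiny "frames" with illustrative pixel values
frame_a = np.array([[10, 200]], dtype=np.uint8)
frame_b = np.array([[50, 120]], dtype=np.uint8)

# Direct uint8 subtraction wraps around: 10 - 50 becomes 216
wrapped = frame_a - frame_b

# Float difference, then clip: negative values become 0
diff = frame_a.astype(np.float32) - frame_b.astype(np.float32)
clipped = np.clip(diff, 0, 255).astype(np.uint8)

# Absolute difference keeps changes in both directions
absolute = np.abs(diff).astype(np.uint8)

print(wrapped, clipped, absolute)
```

Clipping keeps only pixels that got darker from the first frame to the second, so swapping the operands (or using the absolute difference, for which OpenCV also offers cv2.absdiff) may be needed when the new object is brighter than the background.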
Let's threshold this to get a binary image. I'm using a threshold value of 127, as it is the center of the 8-bit range:
threshValue = 127
_, binaryImage = cv2.threshold(imageDifference, threshValue, 255, cv2.THRESH_BINARY)
This is the binary image:
We are looking for the biggest blob here, let's find blob/contours and filter the small ones. Let's set a minimum area of 10 pixels:
# Perform an area filter on the binary blobs:
componentsNumber, labeledImage, componentStats, componentCentroids = \
cv2.connectedComponentsWithStats(binaryImage, connectivity=4)
# Set the minimum pixels for the area filter:
minArea = 10
# Get the indices/labels of the remaining components based on the area stat
# (skip the background component at index 0)
remainingComponentLabels = [i for i in range(1, componentsNumber) if componentStats[i][4] >= minArea]
# Filter the labeled pixels based on the remaining labels,
# assign pixel intensity to 255 (uint8) for the remaining pixels
filteredImage = np.where(np.isin(labeledImage, remainingComponentLabels) == True, 255, 0).astype('uint8')
# Find the big contours/blobs on the filtered image:
contours, hierarchy = cv2.findContours(filteredImage, cv2.RETR_CCOMP, cv2.CHAIN_APPROX_SIMPLE)
contours_poly = [None] * len(contours)
# The bounding rectangles will be stored here:
boundRect = []
# Alright, just look for the outer bounding boxes:
for i, c in enumerate(contours):
    if hierarchy[0][i][3] == -1:
        contours_poly[i] = cv2.approxPolyDP(c, 3, True)
        boundRect.append(cv2.boundingRect(contours_poly[i]))
# Draw the bounding boxes on the (copied) input image:
for i in range(len(boundRect)):
    color = (0, 255, 0)
    cv2.rectangle(inputCopy, (int(boundRect[i][0]), int(boundRect[i][1])),
                  (int(boundRect[i][0] + boundRect[i][2]), int(boundRect[i][1] + boundRect[i][3])), color, 1)
Check out the results: