Bounding Box selecting entire image instead of polygons - python

I am trying to draw bounding boxes around the polygons in my images. I originally annotated the images and created the polygon masks using a JSON file.
I use the JSON file to keep the file names consistent.
This is my code:
import cv2
import numpy as np
import json
import matplotlib.pyplot as plt
jsonFile ='/directory..../.json'
with open(jsonFile) as file:
    annotations = json.load(file)
for key in annotations:
    regions = annotations[key]['regions']
    for region in regions:
        print(annotations[key]['filename'],"\n")
        image = cv2.imread('/directory to mask images.png' + annotations[key]['filename'])
        original = image.copy()
        gray=cv2.cvtColor(image,cv2.COLOR_BGR2GRAY)
        thresh = cv2.threshold(gray, 0,255,cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)[1]
        # Find contours, obtain bounding box, extract and save ROI
        ROI_number = 0
        cnts = cv2.findContours(thresh, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
        cnts = cnts[0] if len(cnts) == 2 else cnts[1]
        for c in cnts:
            x,y,w,h = cv2.boundingRect(c)
            cv2.rectangle(image, (x, y), (x + w, y + h), (36,255,12), 2)
            ROI = original[y:y+h, x:x+w]
            cv2.imwrite('ROI_{}.png'.format(annotations[key]['filename']), ROI)
            ROI_number += 1
        cv2.imshow('image', image)
        cv2.waitKey()
My problem is that instead of drawing bounding boxes around the polygons in the picture, the code just outlines the border of the full image.
If it matters, the image is black and the polygon masks are white.

I think the problem might come from the threshold operation. If the original image is black with white polygons, then cv2.THRESH_BINARY_INV will transform it into a white image with black polygons. cv2.findContours traces the outlines of white regions, and with cv2.RETR_EXTERNAL only the outermost outline is returned, so the single contour you get is the one around the white background, which spans the whole image. To fix this, just use cv2.THRESH_BINARY instead of cv2.THRESH_BINARY_INV.

Related

How to split an image into multiple small images based on the blank area or edge detection Python OpenCV?

I have a lot of images similar to it. The size of each text / roman number may be different since it's handwriting.
How do I save each text/roman number to .png format individually?
There are 9 texts with one dot, so the output should be 10 or 9 images respectively. The space between each text/roman number is different. Should I crop them based on the Canny edges, or is there a better way?
I am not sure how difficult it is, since I am a beginner in CV. But I am planning to do this for my project.
The main idea is to use dilation to merge the individual contours that belong together and then crop each merged region individually. Here's a simple method:
Obtain binary image. Load the image, grayscale, Gaussian blur, Otsu's threshold, then dilate to obtain a binary black/white image.
Extract ROI. Find contours, obtain bounding boxes, extract ROI using Numpy slicing, and save each ROI
Detected ROIs highlighted in green
Extracted and saved ROIs
import cv2
# Load image, grayscale, Gaussian blur, Otsu's threshold, dilate
image = cv2.imread('1.jpg')
original = image.copy()
gray = cv2.cvtColor(image,cv2.COLOR_BGR2GRAY)
blur = cv2.GaussianBlur(gray, (5,5), 0)
thresh = cv2.threshold(blur, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)[1]
kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (15,15))
dilate = cv2.dilate(thresh, kernel, iterations=2)
# Find contours, obtain bounding box coordinates, and extract ROI
cnts = cv2.findContours(dilate, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
cnts = cnts[0] if len(cnts) == 2 else cnts[1]
image_number = 0
for c in cnts:
    x,y,w,h = cv2.boundingRect(c)
    cv2.rectangle(image, (x, y), (x + w, y + h), (36,255,12), 3)
    ROI = original[y:y+h, x:x+w]
    cv2.imwrite("ROI_{}.png".format(image_number), ROI)
    image_number += 1
cv2.imshow('image', image)
cv2.imshow('thresh', thresh)
cv2.imshow('dilate', dilate)
cv2.waitKey()

Cropping vertical contours in OpenCV, getting bounding box borders instead of area

I am trying to extract columns of a table in an image using opencv. I have managed to successfully identify the vertical regions of interest as shown in the image below:
My problem is that when I try to extract and save those regions of interest, I get 6 vertical lines (the borders of the bounding rectangles) instead of the regions between them.
This is the code I am using to achieve this:
import cv2
import numpy as np
image = cv2.imread('x.png')
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
blur = cv2.GaussianBlur(gray, (3,3), 0)
thresh = cv2.threshold(blur, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)[1]
vertical_kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (1,50))
vertical_mask = cv2.morphologyEx(thresh, cv2.MORPH_OPEN, vertical_kernel, iterations=1)
cnts = cv2.findContours(vertical_mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)[0]
for c in cnts:
    cv2.drawContours(image, [c], -1, (36,255,12), -1)
idx = 0
for c in cnts:
    x,y,w,h = cv2.boundingRect(c)
    idx+=1
    new_img=image[y:y+h,x:x+w]
    cv2.imwrite(str(idx) + '.png', new_img)
cv2.imshow("im.png",image)
cv2.waitKey(0)
cv2.destroyAllWindows()
Any ideas as to how I can solve this problem?
Edit: This is the image of the rightmost border; as you can see, there is some text.
The bounding boxes you are drawing come from each line, not from the columns. What you should do is take the two end points of one line and the two end points of the line next to it; these four points define the rectangle that is your column.
All of this assumes the lines are sorted from left to right by x-coordinate, but OpenCV doesn't guarantee that the contours come back in that order.
So sort them first, and then build a rectangle from each pair of consecutive lines.
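A minimal sketch of that idea, assuming each detected line is already available as an (x, y, w, h) tuple from cv2.boundingRect. The function name columns_between_lines and the sample boxes are hypothetical.

```python
def columns_between_lines(line_boxes):
    """Turn bounding boxes of vertical lines into boxes of the columns between them.

    line_boxes: iterable of (x, y, w, h) tuples, one per detected line, in any order.
    Returns a list of (x, y, w, h) tuples, one per gap between consecutive lines.
    """
    ordered = sorted(line_boxes, key=lambda b: b[0])  # left to right by x
    columns = []
    for left, right in zip(ordered, ordered[1:]):
        lx, ly, lw, lh = left
        rx, ry, rw, rh = right
        x0 = lx + lw                    # right edge of the left line
        x1 = rx                         # left edge of the right line
        y0 = min(ly, ry)
        y1 = max(ly + lh, ry + rh)
        if x1 > x0:                     # skip touching or overlapping lines
            columns.append((x0, y0, x1 - x0, y1 - y0))
    return columns

# Three made-up line boxes, deliberately out of order
lines = [(100, 0, 2, 50), (0, 0, 2, 50), (50, 0, 2, 50)]
cols = columns_between_lines(lines)
print(cols)  # [(2, 0, 48, 50), (52, 0, 48, 50)]
```

Each returned box can then be cropped with image[y:y+h, x:x+w]. Cropping from an untouched copy of the input avoids picking up anything drawn on the image with cv2.drawContours.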

Extracting and saving characters from an image

I am following up from this post: How to extract only characters from image?
This solution works perfectly for me (with some tweaking) for its desired purpose. However, I am attempting to take it one step further by saving each of the characters. So in the example of this post, I wish to save the characters K, N, and M as their own individual images. I tried calling cv2.imwrite with the rect object inside the nested if block, but the final output is 7 images, each containing the overall image with one additional rectangle highlighting the next contour.
Here's a simple approach:
Obtain binary image. Load image, grayscale, Otsu's threshold
Extract ROIs. Find contours and sort them from left to right with imutils.contours.sort_contours to ensure we have the contours in the correct order. We filter using contour area, then extract and save each ROI using Numpy slicing.
Input
Binary image
Detected characters highlighted in green
Extracted ROIs
Code
import cv2
from imutils import contours
# Load image, grayscale, Otsu's threshold
image = cv2.imread('1.png')
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
thresh = cv2.threshold(gray, 0, 255, cv2.THRESH_OTSU + cv2.THRESH_BINARY_INV)[1]
# Find contours, sort from left-to-right, then crop
cnts = cv2.findContours(thresh, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
cnts = cnts[0] if len(cnts) == 2 else cnts[1]
cnts, _ = contours.sort_contours(cnts, method="left-to-right")
# Filter using contour area and extract ROI
ROI_number = 0
for c in cnts:
    area = cv2.contourArea(c)
    if area > 10:
        x,y,w,h = cv2.boundingRect(c)
        ROI = image[y:y+h, x:x+w]
        cv2.imwrite('ROI_{}.png'.format(ROI_number), ROI)
        cv2.rectangle(image, (x, y), (x + w, y + h), (36,255,12), 2)
        ROI_number += 1
cv2.imshow('thresh', thresh)
cv2.imshow('image', image)
cv2.waitKey()

How to detect diagram region and extract(crop) it from a research paper's Image

I want to extract all the figures/diagrams from this research paper image. How can I do that?
I just want the figure part. Is there any way I could crop out that part?
To extract only the figures/diagram here's an approach using OpenCV:
1. Obtain binary image. Load the image, convert to grayscale, and apply Otsu's threshold to get a binary image.
2. Connect text contours. We take advantage of the observation that text is structured in paragraphs, so we can dilate with a horizontal kernel to connect individual words into a single contour.
3. Remove non-diagram contours. We find contours and filter using aspect ratio and contour area. We effectively remove non-diagram contours by filling in the contour.
4. Form a single bounding box. Iterate through the remaining contours and determine the bounding box coordinates.
5. Extract ROI. Crop/extract the diagram using Numpy slicing.
Here's a visualization of each step:
Threshold image
Dilate with horizontal kernel
Filtering to remove non-diagram contours
Detected diagram bounding box
Extracted ROI
Note: This approach is with the assumption that there is only one diagram in the image. If there are multiple, then you can remove step #4 to get multiple ROIs and save each individual ROI as a separate image. I'm sure it would be an easy change :)
Code
import cv2
import numpy as np
# Load image, grayscale, Otsu's threshold
image = cv2.imread('1.png')
original = image.copy()
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
thresh = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)[1]
# Dilate with horizontal kernel
kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (20,10))
dilate = cv2.dilate(thresh, kernel, iterations=2)
# Find contours and remove non-diagram contours
cnts = cv2.findContours(dilate, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
cnts = cnts[0] if len(cnts) == 2 else cnts[1]
for c in cnts:
    x,y,w,h = cv2.boundingRect(c)
    area = cv2.contourArea(c)
    if w/h > 2 and area > 10000:
        cv2.drawContours(dilate, [c], -1, (0,0,0), -1)
# Iterate through diagram contours and form single bounding box
boxes = []
cnts = cv2.findContours(dilate, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
cnts = cnts[0] if len(cnts) == 2 else cnts[1]
for c in cnts:
    x, y, w, h = cv2.boundingRect(c)
    boxes.append([x,y, x+w,y+h])
boxes = np.asarray(boxes)
x = np.min(boxes[:,0])
y = np.min(boxes[:,1])
w = np.max(boxes[:,2]) - x
h = np.max(boxes[:,3]) - y
# Extract ROI
cv2.rectangle(image, (x,y), (x + w,y + h), (36,255,12), 3)
ROI = original[y:y+h, x:x+w]
cv2.imshow('image', image)
cv2.imshow('thresh', thresh)
cv2.imshow('dilate', dilate)
cv2.imshow('ROI', ROI)
cv2.waitKey()
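For the multi-diagram case mentioned in the note, one possible sketch is to skip the merge and crop each entry of boxes separately. It assumes the boxes are stored as [x0, y0, x1, y1], as in the loop above. The helper name crop_each_box and the dummy page array are hypothetical.

```python
import numpy as np

def crop_each_box(image, boxes):
    """Return one cropped copy of image per [x0, y0, x1, y1] box."""
    return [image[y0:y1, x0:x1].copy() for x0, y0, x1, y1 in boxes]

# Dummy "page" and two made-up diagram boxes
page = np.arange(100 * 80).reshape(100, 80)
boxes = [[5, 10, 25, 40], [30, 50, 70, 90]]

rois = crop_each_box(page, boxes)
shapes = [roi.shape for roi in rois]
print(shapes)  # [(30, 20), (40, 40)]

# Each ROI could then be saved under its own name, for example
# cv2.imwrite("ROI_{}.png".format(i), roi) inside an enumerate() loop.
```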

Extract all bounding boxes using OpenCV Python

I have an image that contains more than one bounding box.
I need to extract everything that has bounding boxes in them. So far, from this site I've gotten this answer:
y = img[by:by+bh, bx:bx+bw]
cv2.imwrite(string + '.png', y)
It works, however, it only gets one. How should I modify the code? I tried putting it in the loop for contours but it still spews out one image instead of multiple ones.
Thank you so much in advance.
There you go:
import cv2
im = cv2.imread('c:/data/ph.jpg')
gray=cv2.cvtColor(im,cv2.COLOR_BGR2GRAY)
contours, hierarchy = cv2.findContours(gray,cv2.RETR_LIST,cv2.CHAIN_APPROX_SIMPLE)[-2:]
idx =0
for cnt in contours:
    idx += 1
    x,y,w,h = cv2.boundingRect(cnt)
    roi=im[y:y+h,x:x+w]
    cv2.imwrite(str(idx) + '.jpg', roi)
    #cv2.rectangle(im,(x,y),(x+w,y+h),(200,0,0),2)
cv2.imshow('img',im)
cv2.waitKey(0)
A simple approach is to find contours, obtain the bounding rectangle coordinates using cv2.boundingRect(), then extract the ROI using Numpy slicing. We can keep a counter to give each ROI a unique file name, then save it with cv2.imwrite(). Here's a working example:
Input image:
Detected ROIs to extract highlighted in green
Saved ROIs
Code
import cv2
import numpy as np
# Load image, grayscale, Otsu's threshold
image = cv2.imread('1.png')
original = image.copy()
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
thresh = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)[1]
# Find contours, obtain bounding box, extract and save ROI
ROI_number = 0
cnts = cv2.findContours(thresh, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
cnts = cnts[0] if len(cnts) == 2 else cnts[1]
for c in cnts:
    x,y,w,h = cv2.boundingRect(c)
    cv2.rectangle(image, (x, y), (x + w, y + h), (36,255,12), 2)
    ROI = original[y:y+h, x:x+w]
    cv2.imwrite('ROI_{}.png'.format(ROI_number), ROI)
    ROI_number += 1
cv2.imshow('image', image)
cv2.waitKey()
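One likely reason for ending up with a single output file (a guess, since the asker's loop is not shown) is that every iteration writes to the same file name, so each ROI overwrites the previous one. The counter in the code above avoids that. Here is a small sketch of building unique names from a source file name and an index; the helper roi_filename and the sample name scan.png are hypothetical.

```python
import os

def roi_filename(source_name, index):
    """Build a distinct output name such as ROI_scan_0.png for each ROI."""
    stem, _ = os.path.splitext(os.path.basename(source_name))
    return "ROI_{}_{}.png".format(stem, index)

names = [roi_filename("scan.png", i) for i in range(3)]
print(names)  # ['ROI_scan_0.png', 'ROI_scan_1.png', 'ROI_scan_2.png']
```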
