I have an input image of a fully transparent object:
I need to detect the 42 rectangles in this image. This is an example of the output image I need (I marked 6 rectangles for better understanding):
The problem is that the rectangles look really different. I have to use this input image.
How can I achieve this?
Edit 1: Here is the input image as a PNG:
If you calculate the variance down the rows and across the columns, using:
import cv2
import numpy as np
im = cv2.imread('YOURIMAGE', cv2.IMREAD_GRAYSCALE)
# Calculate horizontal and vertical variance
h = np.var(im, axis=1)
v = np.var(im, axis=0)
You can plot them and hopefully locate the peaks of variance which should be your objects:
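For reference, a minimal plotting sketch (this assumes matplotlib, which is not part of the original answer):
import matplotlib.pyplot as plt
# plot both variance profiles so the peaks (your objects) stand out
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4))
ax1.plot(h)
ax1.set_title('variance of each row')
ax2.plot(v)
ax2.set_title('variance of each column')
plt.show()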
Mark Setchell's idea is out-of-the-box. Here is a more traditional approach.
Approach:
The image contains boxes whose intensity fades away in the lower rows. Global equalization would fail here, since it takes the intensity changes of the entire image into account. I opted for a local equalization approach; in OpenCV this is available as CLAHE (Contrast Limited Adaptive Histogram Equalization).
Using CLAHE:
Equalization is applied on individual regions of the image whose size can be predefined.
To avoid over-amplification, contrast limiting is applied (hence the name).
Let's see how to use it in our problem:
Code:
import cv2

# read image and store green channel
img = cv2.imread('image.png')   # placeholder path; use your input image
green_channel = img[:,:,1]
# grid-size for CLAHE
ts = 8
# initialize CLAHE function with parameters
clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(ts, ts))
# apply the function
cl = clahe.apply(green_channel)
Notice in the image above that the boxes in the lower regions appear slightly darker, as expected. This will help us later on.
# apply Otsu threshold
r,th_cl = cv2.threshold(cl, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)
# dilation performed using vertical kernels to connect disjoined boxes
vertical_kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (1, 3))
dilate = cv2.dilate(th_cl, vertical_kernel, iterations=1)
# find contours and draw bounding boxes
contours, hierarchy = cv2.findContours(dilate, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)
img2 = img.copy()
for c in contours:
    area = cv2.contourArea(c)
    if area > 100:
        x, y, w, h = cv2.boundingRect(c)
        img2 = cv2.rectangle(img2, (x, y), (x + w, y + h), (0,255,255), 1)
(The top-rightmost box isn't covered properly. You would need to tweak the various parameters to get an accurate result)
Other pre-processing approaches you can try (rough sketches follow the list):
Global equalization
Contrast stretching
Normalization
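As a hedged, untested sketch of those three (reusing green_channel from the code above; the percentile values are only starting points):
import cv2
import numpy as np
# global histogram equalization
eq = cv2.equalizeHist(green_channel)
# contrast stretching: map the 2nd..98th intensity percentiles to 0..255
lo, hi = np.percentile(green_channel, (2, 98))
stretched = np.clip((green_channel - lo) * 255.0 / (hi - lo), 0, 255).astype(np.uint8)
# min-max normalization to the full 0..255 range
norm = cv2.normalize(green_channel, None, 0, 255, cv2.NORM_MINMAX)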
I have an image like this:
after I applied some processings e.g. cv2.Canny(), it looks like this now:
As you can see that the black lines become hollow.
I have tried erosion and dilation, but if I do them many times, the 2 entrances will be closed (i.e. they become connected lines or a closed contour).
How could I make those lines solid like the image below while keeping the 2 entrances unaffected?
Update 1
I have tested the following answers on a few more photos, but the code seems tailored to handle only this one particular picture. Due to Stack Overflow's restrictions, I cannot upload photos larger than 2MB, so I uploaded them to my Microsoft OneDrive folder for your convenience to test.
https://1drv.ms/u/s!Asflam6BEzhjgbIhgkL4rt1NLSjsZg?e=OXXKBK
Update 2
I accepted fmw42's post as the answer, as it is the most detailed one. It doesn't answer my question directly, but it points out the correct way to process a maze, which is my ultimate goal. I like his approach to answering questions: first he tells you what each step should do so that you have a clear idea of how to do the task, then he provides the full code example from beginning to end. Very helpful.
Due to the limitation of Stack Overflow, I can only accept one answer. If multiple answers were allowed, I would also accept Shamshirsaz.Navid's answer. His answer not only points in the correct direction to solve the issue, but the explanation with visualization also works really well for me! I guess it works equally well for anyone trying to understand why each line of code is needed. He also follows up on my questions in the comments, which makes the exchange a bit interactive :)
The threshold track bar in Ann Zen's answer is also a very useful tip for quickly finding an optimal value.
Here is one way to process the maze and rectify it in Python/OpenCV.
Read the input
Convert to gray
Threshold
Use morphology close to remove the thinnest (extraneous) black lines
Invert the threshold
Get the external contours
Keep only those contours that are larger than 1/4 of both the width and height of the input
Draw those contours as white lines on black background
Get the convex hull from the white contour lines image
Draw the convex hull as white lines on black background
Use GoodFeaturesToTrack to get the 4 corners from the white hull lines image
Sort the 4 corners by angle relative to the centroid so that they are ordered clockwise: top-left, top-right, bottom-right, bottom-left
Set these points as the array of conjugate control points for the input
Use 1/2 the dimensions of the input to define the array of conjugate control points for the output
Compute the perspective transformation matrix
Warp the input image using the perspective matrix
Save the results
Input:
import cv2
import numpy as np
import math
# load image
img = cv2.imread('maze.jpg')
hh, ww = img.shape[:2]
# convert to gray
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
# threshold
thresh = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY+cv2.THRESH_OTSU)[1]
# use morphology to remove the thin lines
kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (5,1))
thresh = cv2.morphologyEx(thresh, cv2.MORPH_CLOSE, kernel)
# invert so that lines are white so that we can get contours for them
thresh_inv = 255 - thresh
# get external contours
contours = cv2.findContours(thresh_inv, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
contours = contours[0] if len(contours) == 2 else contours[1]
# keep contours whose bounding boxes are greater than 1/4 in each dimension
# draw them as white on black background
contour = np.zeros((hh,ww), dtype=np.uint8)
for cntr in contours:
    x,y,w,h = cv2.boundingRect(cntr)
    if w > ww/4 and h > hh/4:
        cv2.drawContours(contour, [cntr], 0, 255, 1)
# get convex hull from contour image white pixels
points = np.column_stack(np.where(contour.transpose() > 0))
hull_pts = cv2.convexHull(points)
# draw hull on copy of input and on black background
hull = img.copy()
cv2.drawContours(hull, [hull_pts], 0, (0,255,0), 2)
hull2 = np.zeros((hh,ww), dtype=np.uint8)
cv2.drawContours(hull2, [hull_pts], 0, 255, 2)
# get 4 corners from white hull points on black background
num = 4
quality = 0.001
mindist = max(ww,hh) // 4
corners = cv2.goodFeaturesToTrack(hull2, num, quality, mindist)
corners = np.int0(corners)
for corner in corners:
    px,py = corner.ravel()
    cv2.circle(hull, (px,py), 5, (0,0,255), -1)
# get angles to each corner relative to centroid and store with x,y values in list
# angles are clockwise between -180 and +180 with zero along positive X axis (to right)
corner_info = []
center = np.mean(corners, axis=0)
centx = center.ravel()[0]
centy = center.ravel()[1]
for corner in corners:
    px,py = corner.ravel()
    dx = px - centx
    dy = py - centy
    angle = (180/math.pi) * math.atan2(dy,dx)
    corner_info.append([px,py,angle])
# function to define sort key as element 2 (i.e. angle)
def takeThird(elem):
    return elem[2]
# sort corner_info on angle so result will be TL, TR, BR, BL order
corner_info.sort(key=takeThird)
# make conjugate control points
# get input points from corners
corner_list = []
for x, y, angle in corner_info:
    corner_list.append([x,y])
print(corner_list)
# define input points from (sorted) corner_list
input = np.float32(corner_list)
# define output points from dimensions of image, say half of input image
width = ww // 2
height = hh // 2
output = np.float32([[0,0], [width-1,0], [width-1,height-1], [0,height-1]])
# compute perspective matrix
matrix = cv2.getPerspectiveTransform(input,output)
# do perspective transformation setting area outside input to black
result = cv2.warpPerspective(img, matrix, (width,height), flags=cv2.INTER_LINEAR, borderMode=cv2.BORDER_CONSTANT, borderValue=(0,0,0))
# save output
cv2.imwrite('maze_thresh.jpg', thresh)
cv2.imwrite('maze_contour.jpg', contour)
cv2.imwrite('maze_hull.jpg', hull)
cv2.imwrite('maze_rectified.jpg', result)
# Display various images to see the steps
cv2.imshow('thresh', thresh)
cv2.imshow('contour', contour)
cv2.imshow('hull', hull)
cv2.imshow('result', result)
cv2.waitKey(0)
cv2.destroyAllWindows()
Thresholded Image after morphology:
Filtered Contours on black background:
Convex hull and 4 corners on input image:
Result from perspective warp:
You can try a simple threshold to detect the lines of the maze, as they are conveniently black:
import cv2
img = cv2.imread("maze.jpg")
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
_, thresh = cv2.threshold(gray, 60, 255, cv2.THRESH_BINARY)
cv2.imshow("Image", thresh)
cv2.waitKey(0)
Output:
You can adjust the threshold yourself with trackbars:
import cv2
cv2.namedWindow("threshold")
cv2.createTrackbar("", "threshold", 0, 255, id)  # built-in id() serves as a no-op callback
img = cv2.imread("maze.jpg")
while True:
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    t = cv2.getTrackbarPos("", "threshold")
    _, thresh = cv2.threshold(gray, t, 255, cv2.THRESH_BINARY)
    cv2.imshow("Image", thresh)
    if cv2.waitKey(1) & 0xFF == ord("q"):  # If you press the q key
        break
Canny is an edge detector. It detects the lines along which color changes. A line in your input image has two such transitions, one on each side. Therefore you see two parallel lines in the output, one on each side of each line in the image. This answer of mine explains the difference between edges and lines.
So, you shouldn’t be using an edge detector to detect lines in an image.
If a simple threshold doesn't properly binarize this image, try using a local threshold ("adaptive threshold" in OpenCV). Another thing that works well for images like these is applying a top hat filter (for this image, it would be a closing(img) - img), where the structuring element is adjusted to the width of the lines you want to find. This will result in an image that is easy to threshold and will preserve all lines thinner than the structuring element.
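A hedged sketch of both ideas (the block size, offset, and structuring-element size are guesses to be tuned to your line width):
import cv2
img = cv2.imread('maze.jpg', cv2.IMREAD_GRAYSCALE)
# local ("adaptive") threshold instead of a single global value
local = cv2.adaptiveThreshold(img, 255, cv2.ADAPTIVE_THRESH_MEAN_C,
                              cv2.THRESH_BINARY, 51, 10)
# top hat for dark lines: closing(img) - img, which OpenCV exposes as the black hat
se = cv2.getStructuringElement(cv2.MORPH_RECT, (15, 15))
tophat = cv2.morphologyEx(img, cv2.MORPH_BLACKHAT, se)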
Check this:
import cv2
import numpy as np
im = cv2.imread("test2.jpg", 1)
# convert to gray
mask = cv2.cvtColor(im, cv2.COLOR_BGR2GRAY)
# convert to black and white
mask = cv2.threshold(mask, 127, 255, cv2.THRESH_BINARY)[1]
# remove thin lines and texts, then remake the main lines
mask = cv2.dilate(mask, np.ones((5, 5), 'uint8'))
mask = cv2.erode(mask, np.ones((4, 4), 'uint8'))
# smooth lines
mask = cv2.medianBlur(mask, 3)
# write output mask
cv2.imwrite("mask2.jpg", mask)
From here on, everything else is possible: you can delete extra blobs, extract the lines from the original image according to the mask, and things like that.
Median:
The median filter does not change much for this project and can be safely removed, but I prefer it because it slightly rounds the ends of the lines (you have to zoom in a lot to see the pixels). This technique is usually used to remove salt-and-pepper noise.
Erode Kernel:
As for the kernel, the larger the size, the thicker the lines. That is not always good, because it causes the path lines to stick to the arrow, and later it becomes difficult to separate the paths from the arrow.
Update:
It does not matter if part of the maze is cleared. The important thing is that from this mask you can draw a rectangle around the shape and create a new mask for this image.
Make a white rectangle around these paths in a new mask, and completely whiten the inside of the mask with FloodFill or any other technique. Then you have a new mask that can take the whole shape out of the original image, and in the next step you can correct the perspective.
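A rough sketch of that last step (assuming mask is the output above, with black maze lines on a white background; a filled rectangle stands in for the FloodFill step):
import cv2
import numpy as np
ys, xs = np.where(mask == 0)                 # coordinates of the black maze pixels
x0, y0 = int(xs.min()), int(ys.min())
x1, y1 = int(xs.max()), int(ys.max())
new_mask = np.zeros(mask.shape, np.uint8)
cv2.rectangle(new_mask, (x0, y0), (x1, y1), 255, -1)   # filled white rectangle
shape = cv2.bitwise_and(im, im, mask=new_mask)         # take the whole shape out of the original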
I am trying to do some image segmentation with watershed to detect all the balls in an image. I have followed a PyImageSearch tutorial, but I am getting very poor results. My guess is that the reflections are the problem. Still, the image is pretty clean and the instances look quite separable.
Am I using the correct approach here? Did I miss somethings?
I tested cellpose and I get almost perfect results. It's not the same approach of course, and I was hoping to get something with "classical" computer-vision techniques.
Following are the code I have, the original image, and the current result. I have tried changing the parameters, but I am not sure what I am doing here. I also looked at inRange, but I am afraid the balls are never the same color.
original image: https://i.stack.imgur.com/7595R.jpg
import numpy as np
from scipy import ndimage
import cv2
from skimage.feature import peak_local_max
from skimage.segmentation import watershed
import imutils
from matplotlib import pyplot as plt
img = cv2.imread('balls.jpg')
gray = - cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)  # note: unary minus on uint8 wraps around, effectively inverting the image
plt.imshow(gray)
# Things I tried...
# gray = cv2.dilate(gray,kernel,iterations = 1)
# hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)
# h, s, v = cv2.split(hsv)
thresh = cv2.threshold(gray, 250, 255, cv2.THRESH_BINARY_INV | cv2.THRESH_OTSU)[1]
# compute the exact Euclidean distance from every binary
# pixel to the nearest zero pixel, then find peaks in this
# distance map
D = ndimage.distance_transform_edt(thresh)
localMax = peak_local_max(D, indices=False, min_distance=30, labels=thresh)
# perform a connected component analysis on the local peaks,
# using 8-connectivity, then apply the Watershed algorithm
markers = ndimage.label(localMax, structure=np.ones((3, 3)))[0]
labels = watershed(-D, markers, mask=thresh)
# draw on mask
for label in np.unique(labels):
    # if the label is zero -> 'background'
    if label == 0:
        continue
    mask = np.zeros(gray.shape, dtype="uint8")
    mask[labels == label] = 255
    # detect contours in the mask and grab the largest one
    cnts = cv2.findContours(mask.copy(), cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    cnts = imutils.grab_contours(cnts)
    c = max(cnts, key=cv2.contourArea)
    # draw a circle enclosing the object
    ((x, y), r) = cv2.minEnclosingCircle(c)
    cv2.circle(img, (int(x), int(y)), int(r), (0, 255, 0), 2)
    cv2.putText(img, "#{}".format(label), (int(x) - 10, int(y)),
                cv2.FONT_HERSHEY_SIMPLEX, 0.6, (0, 0, 255), 2)
plt.imshow(img)
The labels : https://i.stack.imgur.com/M6hZb.png
matchTemplate "solution" with opencv/samples/mouse_and_match.py
use whatever you like to find the peaks.
yes, with this approach you gotta pick a template manually.
to fix that, there could be approaches exploiting self-similarity (auto-correlation). exercise to the reader.
you can't pick whole balls because their sizes vary, so that's a huge downside to template matching already, but also a rectangle around a circle contains a significant amount of non-object pixels, which drives down the correlation score wherever that part varies.
picking the reflection works (off of a medium ball) because the reflection shows an environment with nice strong contrast.
notice the one small ball near the top, slightly to the right? that's not doing so well for a bunch of reasons.
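a minimal sketch of the idea (the template patch coordinates here are made up; pick your own with the mouse as in the sample):
import cv2
import numpy as np
img = cv2.imread('balls.jpg')
templ = img[300:330, 400:430]                    # hand-picked reflection patch (made-up coords)
scores = cv2.matchTemplate(img, templ, cv2.TM_CCOEFF_NORMED)
ys, xs = np.where(scores >= 0.8)                 # crude peak picking by thresholding
for x, y in zip(xs, ys):
    cv2.rectangle(img, (int(x), int(y)), (int(x) + 30, int(y) + 30), (0, 255, 0), 1)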
I'm trying to find the contours of this image, but the method findContours only returns 1 contour, which is highlighted in image 2. I'm trying to find all external contours, like the circles the numbers are inside. What am I doing wrong? What can I do to accomplish this?
image 1:
image 2:
Below is the relevant portion of my code.
thresh = cv2.threshold(image, 0, 255, cv2.THRESH_BINARY_INV | cv2.THRESH_OTSU)[1]
cnts = cv2.findContours(thresh.copy(), cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
When I change cv2.RETR_EXTERNAL to cv2.RETR_LIST, it seems to detect the same contour twice, or something like that. Image 3 shows the border of a circle being detected first, and then it is detected again, as shown in image 4. I'm trying to find only the outer borders of these circles. How can I accomplish that?
image 3
image 4
The problem is the flag cv2.RETR_EXTERNAL that you used in the function call. As described in the OpenCV documentation, this retrieves only the extreme outer contours.
Using the flag cv2.RETR_LIST you get all contours in the image. Since you try to detect rings, this list will contain the inner and the outer contour of these rings.
To filter the outer boundary of the circles, you could use cv2.contourArea() to find the larger of two overlapping contours.
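A rough sketch of that idea (assuming thresh is the inverted binary image from the question, and OpenCV 4.x's two-value findContours return):
import cv2
cnts, _ = cv2.findContours(thresh, cv2.RETR_LIST, cv2.CHAIN_APPROX_SIMPLE)
outer, kept = [], []
for c in sorted(cnts, key=cv2.contourArea, reverse=True):   # largest first
    (x, y), r = cv2.minEnclosingCircle(c)
    # skip inner rings: a larger contour with (nearly) the same center was already kept
    if any((x - ox) ** 2 + (y - oy) ** 2 < orr ** 2 for ox, oy, orr in kept):
        continue
    outer.append(c)
    kept.append((x, y, r))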
I am not sure this is really what you expect; nevertheless, in cases like this there are many ways to help findContours do its job.
Here is a way I use frequently.
Convert your image to gray
Ig = cv2.cvtColor(I,cv2.COLOR_BGR2GRAY)
Thresholding
The background and foreground values look quite uniform in terms of colour, but locally they are not, so I apply a thresholding based on Otsu's method in order to binarize the intensities.
_,It = cv2.threshold(Ig,0,255,cv2.THRESH_OTSU)
Sobel magnitude
In order to extract only the contours, I compute the magnitude of the Sobel edge detector.
sx = cv2.Sobel(It,cv2.CV_32F,1,0)
sy = cv2.Sobel(It,cv2.CV_32F,0,1)
m = cv2.magnitude(sx,sy)
m = cv2.normalize(m,None,0.,255.,cv2.NORM_MINMAX,cv2.CV_8U)
Thinning (optional)
I use the thinning function implemented in ximgproc.
The point of thinning is to reduce the contour thickness to as few pixels as possible.
m = cv2.ximgproc.thinning(m,None,cv2.ximgproc.THINNING_GUOHALL)
Final step: findContours
_,contours,hierarchy = cv2.findContours(m,cv2.RETR_TREE,cv2.CHAIN_APPROX_SIMPLE)
disp = cv2.merge((m,m,m))
disp = cv2.drawContours(disp,contours,-1,hierarchy=hierarchy,color=(255,0,0))
Hope it helps.
I think an approach based on SVM or a CNN might be more robust.
You can find an example here.
This one may also be interesting.
-EDIT-
I found a, let's say, easier way to reach your goal.
As before, after loading the image, apply a threshold to ensure that the image is binary.
By reversing the image using a bitwise-not operation, the contours become white over a black background.
Applying cv2.connectedComponentsWithStats returns (among other things) a label matrix in which each connected white region in the source has been assigned a unique label.
Then, applying findContours per label, it is possible to give the external contour for every area.
import numpy as np
import cv2
from matplotlib import pyplot as plt
I = cv2.imread('/home/smile/Downloads/ext_contours.png',cv2.IMREAD_GRAYSCALE)
_,I = cv2.threshold(I,0.,255.,cv2.THRESH_OTSU)
I = cv2.bitwise_not(I)
_,labels,stats,centroid = cv2.connectedComponentsWithStats(I)
result = np.zeros((I.shape[0],I.shape[1],3),np.uint8)
for i in range(0,labels.max()+1):
    mask = cv2.compare(labels,i,cv2.CMP_EQ)
    _,ctrs,_ = cv2.findContours(mask,cv2.RETR_EXTERNAL,cv2.CHAIN_APPROX_SIMPLE)
    result = cv2.drawContours(result,ctrs,-1,(0xFF,0,0))
plt.figure()
plt.imshow(result)
P.S. Among the outputs returned by the function findContours there is a hierarchy matrix. It is possible to reach the same result by analyzing that matrix, however it is a little bit more complex, as explained here.
Instead of finding contours, I would suggest applying the Hough circle transform using the appropriate parameters.
Finding contours poses a challenge. Once you invert the binary image the circles are in white. OpenCV finds contours both along the outside and the inside of the circle. Moreover since there are letters such as 'A' and 'B', contours will again be found along the outside of the letters and within the holes. You can find contours using the appropriate hierarchy criterion but it is still tedious.
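For reference, a hedged sketch of the Hough-circle route (all parameter values are guesses that need tuning):
import cv2
import numpy as np
img = cv2.imread('keypad.png')
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
gray = cv2.medianBlur(gray, 5)                   # HoughCircles works better on smoothed input
circles = cv2.HoughCircles(gray, cv2.HOUGH_GRADIENT, dp=1, minDist=40,
                           param1=100, param2=30, minRadius=15, maxRadius=50)
if circles is not None:
    for x, y, r in np.round(circles[0]).astype(int):
        cv2.circle(img, (int(x), int(y)), int(r), (0, 255, 0), 2)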
Here is what I tried by finding contours and using hierarchy:
Code:
#--- read the image, convert to gray and obtain inverse binary image ---
img = cv2.imread('keypad.png', 1)
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
_, binary = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY_INV|cv2.THRESH_OTSU)
#--- find contours ---
_, contours, hierarchy = cv2.findContours(binary, cv2.RETR_CCOMP, cv2.CHAIN_APPROX_NONE)
#--- copy of original image ---
img2 = img.copy()
#--- select contours having a parent contour and append them to a list ---
l = []
for h in hierarchy[0]:
    if h[0] > -1 and h[2] > -1:
        l.append(h[2])
#--- draw those contours ---
for cnt in l:
    if cnt > 0:
        cv2.drawContours(img2, [contours[cnt]], 0, (0,255,0), 2)
cv2.imshow('img2', img2)
For more info on contours and their hierarchical relationship, please refer to this.
UPDATE
I have a rather crude way to ignore unwanted contours. Find the average area of all the contours in list l and draw those that are above the average:
Code:
img3 = img.copy()
a = 0
for j, i in enumerate(l):
    a = a + cv2.contourArea(contours[i])
mean_area = int(a/len(l))
for cnt in l:
    if (cnt > 0) & (cv2.contourArea(contours[cnt]) > mean_area):
        cv2.drawContours(img3, [contours[cnt]], 0, (0,255,0), 2)
cv2.imshow('img3', img3)
You can select only the outer borders by this function:
def _select_contours(contours, hierarchy):
    """select contours of the second level"""
    # find the border of the image, which has no parent
    father_i = None
    for i, h in enumerate(hierarchy):
        if h[3] == -1:
            father_i = i
            break
    # collect its children
    new_contours = []
    for c, h in zip(contours, hierarchy):
        if h[3] == father_i:
            new_contours.append(c)
    return new_contours
Note that you should use cv2.RETR_TREE in cv2.findContours() to get the contours and hierarchy.
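For example (a sketch assuming OpenCV 4.x and that binary is your inverse-binarized image):
contours, hierarchy = cv2.findContours(binary, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)
hierarchy = hierarchy[0]          # rows are [next, previous, first_child, parent]
outer_borders = _select_contours(contours, hierarchy)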
I have been searching for a technique to remove the background from any given image. The idea is to detect a face and remove the background around the detected face. I have finished the face part; the background removal part remains.
I used this code.
import cv2
import numpy as np
#== Parameters
BLUR = 21
CANNY_THRESH_1 = 10
CANNY_THRESH_2 = 200
MASK_DILATE_ITER = 10
MASK_ERODE_ITER = 10
MASK_COLOR = (0.0,0.0,1.0) # In BGR format
#-- Read image
img = cv2.imread('SYxmp.jpg')
gray = cv2.cvtColor(img,cv2.COLOR_BGR2GRAY)
#-- Edge detection
edges = cv2.Canny(gray, CANNY_THRESH_1, CANNY_THRESH_2)
edges = cv2.dilate(edges, None)
edges = cv2.erode(edges, None)
#-- Find contours in edges, sort by area
contour_info = []
contours, _ = cv2.findContours(edges, cv2.RETR_LIST, cv2.CHAIN_APPROX_NONE)
for c in contours:
    contour_info.append((
        c,
        cv2.isContourConvex(c),
        cv2.contourArea(c),
    ))
contour_info = sorted(contour_info, key=lambda c: c[2], reverse=True)
max_contour = contour_info[0]
#-- Create empty mask, draw filled polygon on it corresponding to largest contour ----
# Mask is black, polygon is white
mask = np.zeros(edges.shape)
cv2.fillConvexPoly(mask, max_contour[0], (255))
#-- Smooth mask, then blur it
mask = cv2.dilate(mask, None, iterations=MASK_DILATE_ITER)
mask = cv2.erode(mask, None, iterations=MASK_ERODE_ITER)
mask = cv2.GaussianBlur(mask, (BLUR, BLUR), 0)
mask_stack = np.dstack([mask]*3) # Create 3-channel alpha mask
#-- Blend masked img into MASK_COLOR background
mask_stack = mask_stack.astype('float32') / 255.0
img = img.astype('float32') / 255.0
masked = (mask_stack * img) + ((1-mask_stack) * MASK_COLOR)
masked = (masked * 255).astype('uint8')
cv2.imshow('img', masked) # Display
cv2.waitKey()
cv2.imwrite("WTF.jpg",masked)
But this code only works for this image.
What should be changed in the code to make it work for different images?
Local Optimal Solution
# Original Code
CANNY_THRESH_2 = 200
# Change to
CANNY_THRESH_2 = 100
####### The change below is worth trying but not necessary
# Original Code
mask = np.zeros(edges.shape)
cv2.fillConvexPoly(mask, max_contour[0], (255))
# Change to
for c in contour_info:
    cv2.fillConvexPoly(mask, c[0], (255))
Effects
Test Image
Similar color of background, hair and skin
Original Output
original output
original edges
Apply all contour rather than max contour with same edge threshold
slightly better
Canny Thresh 2 set as 100, apply all contour
much better
stronger edges
Canny Thresh 2 set as 40, apply all contour
edges starts to become not so sharp
Reasoning
Program Behavior
The program searches for edges and builds contours, takes the max contour, recognizes it as the human face, and then applies the mask.
Problem
It is not easy to deal with similar colors between the background and the human face. Blond hair and skin color make it hard to find correct edges with the original threshold.
Using only the max contour means that when an image has strong, large protrusions, like the scarf in the test image, it is easy to lose track of some areas. But it really depends on what kind of image comes after your face recognition process.
I'm trying to develop a script using Python and OpenCV to detect some highlighted regions on a scanned instrumentation diagram and output text using Tesseract's OCR function. My workflow is first to detect the general vicinity of the region of interest, and then apply processing steps to remove everything aside from the blocks of text (lines, borders, noise). The processed image is then feed into Tesseract's OCR engine.
This workflow works on about half of the images but fails on the rest, due to the text touching the borders. I'll show some examples of what I mean below:
Step 1: Find regions of interest by creating a mask using InRange with the color range of the highlighter.
Step 2: Contour regions of interest, crop and save to file.
--- Referenced code begins here ---
Step 3: Threshold image and apply Canny Edge Detection
Step 4: Contour the edges and filter them to circular shapes using cv2.approxPolyDP, keeping the ones with more than 8 vertices. The first or second largest contour usually corresponds to the inner edge.
Step 5: Using masks and bitwise operations, everything inside the contour is transferred to a white background image. Dilation and erosion are applied to de-noise the image and create the final image that gets fed into the OCR engine.
import cv2
import numpy as np
import pytesseract
pytesseract.pytesseract.tesseract_cmd = 'C:/Program Files (x86)/Tesseract-OCR/tesseract'
d_path = "Test images\\"
img_name = "cropped_12.jpg"
img = cv2.imread(d_path + img_name) # Reads the image
## Resize image before calculating contour
height, width = img.shape[:2]
img = cv2.resize(img,(2*width,2*height),interpolation = cv2.INTER_CUBIC)
img_orig = img.copy() # Makes copy of original image
img = cv2.cvtColor(img,cv2.COLOR_BGR2GRAY) # Convert to grayscale
# Apply threshold to get binary image and write to file
_, img = cv2.threshold(img,0,255,cv2.THRESH_BINARY+cv2.THRESH_OTSU)
# Edge detection
edges = cv2.Canny(img,100,200)
# Find contours of mask threshold
_, contours, hierarchy = cv2.findContours(edges, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)
# Find contours associated w/ polygons with 8 sides or more
cnt_list = []
area_list = [cv2.contourArea(c) for c in contours]
for j in contours:
    poly_pts = cv2.approxPolyDP(j,0.01*cv2.arcLength(j,True),True)
    area = cv2.contourArea(j)
    if (len(poly_pts) > 8) & (area == max(area_list)):
        cnt_list.append(j)
cv2.drawContours(img_orig, cnt_list, -1, (255,0,0), 2)
# Show contours
cv2.namedWindow('Show',cv2.WINDOW_NORMAL)
cv2.imshow("Show",img_orig)
cv2.waitKey()
cv2.destroyAllWindows()
# Zero pixels outside circle
mask = np.zeros(img.shape).astype(img.dtype)
cv2.fillPoly(mask, cnt_list, (255,255,255))
mask_inv = cv2.bitwise_not(mask)
a = cv2.bitwise_and(img,img,mask = mask)
wh_back = np.ones(img.shape).astype(img.dtype)*255
b = cv2.bitwise_and(wh_back,wh_back,mask = mask_inv)
res = cv2.add(a,b)
# Get rid of noise
kernel = np.ones((2, 2), np.uint8)
res = cv2.dilate(res, kernel, iterations=1)
res = cv2.erode(res, kernel, iterations=1)
# Show final image
cv2.namedWindow('result',cv2.WINDOW_NORMAL)
cv2.imshow("result",res)
cv2.waitKey()
cv2.destroyAllWindows()
When the code works, these are the images that get output:
Working
However, in the instances where the text touches the circular border, the code assumes part of the text is part of the larger contour and ignores the last letter. For example:
Not working
Are there any processing steps that can help me bypass this problem? Or perhaps a different approach? I've tried using Hough circle transforms to detect the borders, but they're quite finicky and don't work as well as contouring.
I'm quite new to OpenCV and Python so any help would be appreciated.
If the Hough circle transform didn't work for you, I think your best option will be to approximate the border shape. The best method I know for that is the Douglas-Peucker algorithm, which simplifies your contour by reducing the number of points.
You can check this reference file from OpenCV to see the type of post-processing you can apply to your border. They also mention Douglas-Peucker:
OpenCV border processing
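In OpenCV, Douglas-Peucker is exposed as cv2.approxPolyDP; a minimal sketch, applied to one contour cnt from your existing findContours output (the epsilon fraction is a guess to tune):
# epsilon = max allowed deviation from the original contour, here 1% of its perimeter
eps = 0.01 * cv2.arcLength(cnt, True)
approx = cv2.approxPolyDP(cnt, eps, True)    # True -> treat the contour as closed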
Just a hunch: after Otsu thresholding, erode and dilate the image. This will make very thin joints vanish. The code for this is below.
kernel = np.ones((5,5),np.uint8)
th3 = cv2.erode(th3, kernel,iterations=1)
th3 = cv2.dilate(th3, kernel,iterations=1)
Let me know how it goes. I have a couple more ideas if this does not work.