Sorting contours based on precedence in Python, OpenCV [duplicate] - python

This question already has answers here:
Python: Sorting items from top left to bottom right with OpenCV
(2 answers)
Closed 10 months ago.
I am trying to sort contours by their position, left-to-right and top-to-bottom, just like the order in which you would write: start at the top left and take whichever contour comes next.
This is what and how I have achieved up to now:
import os
import cv2
from PIL import Image

def get_contour_precedence(contour, cols):
    tolerance_factor = 61
    origin = cv2.boundingRect(contour)
    return ((origin[1] // tolerance_factor) * tolerance_factor) * cols + origin[0]

image = cv2.imread("C:/Users/XXXX/PycharmProjects/OCR/raw_dataset/23.png", 0)
ret, thresh1 = cv2.threshold(image, 130, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)
contours, h = cv2.findContours(thresh1.copy(), cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
# sort the resulting contours top-to-bottom, then left-to-right
contours.sort(key=lambda x: get_contour_precedence(x, thresh1.shape[1]))
# initialize the list of contour bounding boxes and associated
# characters that we'll be OCR'ing
chars = []
inc = 0
# loop over the contours
for c in contours:
    inc += 1
    # compute the bounding box of the contour
    (x, y, w, h) = cv2.boundingRect(c)
    label = str(inc)
    cv2.rectangle(image, (x, y), (x + w, y + h), (0, 255, 0), 2)
    cv2.putText(image, label, (x - 2, y - 2),
                cv2.FONT_HERSHEY_SIMPLEX, 0.5, (0, 255, 0), 2)
    print('x=', x)
    print('y=', y)
    print('x+w=', x + w)
    print('y+h=', y + h)
    crop_img = image[y + 2:y + h - 1, x + 2:x + w - 1]
    name = os.path.join("bounding boxes", 'Image_%d.png' % inc)
    cv2.imshow("cropped", crop_img)
    print(name)
    crop_img = Image.fromarray(crop_img)
    crop_img.save(name)
    cv2.waitKey(0)
cv2.imshow('mat', image)
cv2.waitKey(0)
Input Image:
Output Image 1:
Input Image 2:
Output for Image 2:
Input Image 3:
Output Image 3:
As you can see, the numbering 1, 2, 3, 4 is not what I was expecting in each image, as displayed in Image Number 3.
How do I adjust this to make it work, or even write a custom function?
NOTE: I have multiple variants of the input image provided in my question. The content is the same, but the text varies, so the tolerance factor does not work for every one of them. Manually adjusting it would not be a good idea.

This is my take on the problem. I'll give you the general gist of it, and then my implementation in C++. The main idea is that I want to process the image from left to right, top to bottom. I'll process each blob (or contour) as I find it; however, I need a couple of intermediate steps to achieve a successful (and ordered) segmentation.
Vertical sort using rows
The first step is trying to sort the blobs by rows – this means that each row holds a set of (still unordered) horizontal blobs. That's ok: if we compute this kind of vertical sorting and process each row from top to bottom, we achieve just that.
After the blobs are (vertically) sorted by rows, I can check out their centroids (or centers of mass) and sort them horizontally. The idea is that I will process row by row and, for each row, sort the blob centroids. Let's see an example of what I'm trying to achieve here.
This is your input image:
This is what I call the Row Mask:
This last image contains white areas that each represent a "row". Each row has a number (e.g., Row1, Row2, etc.) and each row holds a set of blobs (or characters, in this case). By processing each row, top to bottom, you are already sorting the blobs on the vertical axis.
If I number each row from top to bottom, I get this image:
The Row Mask is a way of creating "rows of blobs", and this mask can be computed morphologically. Check out the 2 images overlaid to give you a better view of the processing order:
What we are trying to do here is, first, a vertical ordering (blue arrow) and then we will take care of the horizontal (red arrow) ordering. You can see that by processing each row we can (possibly) overcome the sorting problem!
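If you want to prototype the Row Mask in Python first, the morphology is just a couple of calls. A minimal sketch; the input file name and the kernel width of 100 are assumptions you would tune to your image and character spacing:

import cv2

binary = cv2.imread("input.png", cv2.IMREAD_GRAYSCALE)
binary = cv2.threshold(binary, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)[1]

# fuse the characters horizontally into one rectangle per text row
se = cv2.getStructuringElement(cv2.MORPH_RECT, (100, 1))
row_mask = cv2.morphologyEx(binary, cv2.MORPH_DILATE, se, iterations=2)
row_mask = cv2.morphologyEx(row_mask, cv2.MORPH_ERODE, se, iterations=1)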
Horizontal sort using centroids
Let's see now how we can sort the blobs horizontally. If we create a simpler image, with a width equal to the input image and a height equal to the number of rows in our Row Mask, we can simply overlay every horizontal coordinate (x coordinate) of each blob centroid. Check out this example:
This is a Row Table. Each of its rows corresponds to a row found in the Row Mask, and is also read from top to bottom. The width of the table is the same as the width of your input image, and corresponds spatially to the horizontal axis. Each square is a pixel in your input image, mapped to the Row Table using only the horizontal coordinate (as our simplification of rows is pretty straightforward). The actual value of each pixel in the row table is a label, labeling each of the blobs on your input image. Note that the labels are not ordered!
So, for instance, this table shows that in row 1 (you already know what row 1 is – it's the first white area on the Row Mask), at position (1,4), there's blob number 3. At position (1,6) there's blob number 2, and so on. What's cool (I think) about this table is that you can loop through it, and for every value different from 0, horizontal ordering becomes very trivial. This is the row table, now ordered left to right:
Mapping blob information with centroids
We are going to use the blob centroids to map the information between our two representations (Row Mask/Row Table). Suppose you already have both "helper" images and you process each blob (or contour) on the input image one at a time. For example, you have this as a start:
Alright, there's a blob here. How can we map it to the Row Mask and to the Row Table? Using its centroid. If we compute the centroid (shown in the figure as the green dot), we can construct a dictionary of centroids and labels. For example, for this blob, the centroid is located at (271,193). Ok, let's assign the label = 1. So we now have this dictionary:
Now, we find the row in which this blob is placed using the same centroid on the Row Mask. Something like this:
rowNumber = rowMask.at( 193, 271 ) // note: cv::Mat::at takes (row, col), i.e. (y, x)
This operation should return rowNumber = 3. Nice! We know which row our blob is placed in, and so it is now vertically ordered. Now, let's store its horizontal coordinate in the Row Table:
rowTable.at( rowNumber, 271 ) = 1 // (row, x coordinate of the centroid)
Now, rowTable holds (in its row and column) the label of the processed blob. The Row Table should look something like this:
The table is a lot wider, because its horizontal dimension has to be the same as your input image. In this image, the label 1 is placed in Column 271, Row 3. If this were the only blob in your image, the blobs would already be sorted. But what happens if you add another blob in, say, Column 2, Row 1? That's why you need to traverse this table again after you have processed all the blobs – to properly correct their labels.
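Before diving into the C++, note that the whole ordering idea can be condensed into a Python sort key: look up each centroid's row label in the (labeled) Row Mask and sort by (row, x). A sketch, assuming row_mask is an integer-labeled row image like the one described here, and contours come from cv2.findContours:

def row_then_x(contour):
    m = cv2.moments(contour)
    cx = int(m["m10"] / m["m00"])
    cy = int(m["m01"] / m["m00"])
    # the row label at the centroid gives the vertical order,
    # the centroid's x coordinate orders blobs inside the row
    return (int(row_mask[cy, cx]), cx)

ordered = sorted(contours, key=row_then_x)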
Implementation in C++
Alright, hopefully the algorithm should be a little bit clear (if not, just ask, my man). I'll try to implement these ideas in OpenCV using C++. First, I need a binary image of your input. Computation is trivial using Otsu’s thresholding method:
//Read the input image:
std::string imageName = "C://opencvImages//yFX3M.png";
cv::Mat testImage = cv::imread( imageName );
//Compute grayscale image (cv::imread loads BGR):
cv::Mat grayImage;
cv::cvtColor( testImage, grayImage, cv::COLOR_BGR2GRAY );
//Get binary image via Otsu:
cv::Mat binImage;
cv::threshold( grayImage, binImage, 0, 255, cv::THRESH_OTSU );
//Invert image:
binImage = 255 - binImage;
This is the resulting binary image, nothing fancy, just what we need to start working:
The first step is to get the Row Mask. This can be achieved using morphology. Just apply a dilation + erosion with a VERY big horizontal structuring element. The idea is you want to turn those blobs into rectangles, "fusing" them together horizontally:
//Create a hard copy of the binary mask:
cv::Mat rowMask = binImage.clone();
//horizontal dilation + erosion:
int horizontalSize = 100; // a very big horizontal structuring element
cv::Mat SE = cv::getStructuringElement( cv::MORPH_RECT, cv::Size(horizontalSize,1) );
cv::morphologyEx( rowMask, rowMask, cv::MORPH_DILATE, SE, cv::Point(-1,-1), 2 );
cv::morphologyEx( rowMask, rowMask, cv::MORPH_ERODE, SE, cv::Point(-1,-1), 1 );
This results in the following Row Mask:
That's very cool, now that we have our Row Mask, we must number the rows, ok? There are a lot of ways of doing this, but right now I'm interested in the simplest one: loop through this image and get every single pixel. If a pixel is white, use a Flood Fill operation to label that portion of the image as a unique blob (or row, in this case). This can be done as follows:
//Label the row mask:
int rowCount = 0; //This will count our rows
//Loop thru the mask:
for( int y = 0; y < rowMask.rows; y++ ){
    for( int x = 0; x < rowMask.cols; x++ ){
        //Get the current pixel:
        uchar currentPixel = rowMask.at<uchar>( y, x );
        //If the pixel is white, this is an unlabeled blob:
        if ( currentPixel == 255 ) {
            //Create new label (different from zero):
            rowCount++;
            //Flood fill on this point:
            cv::floodFill( rowMask, cv::Point( x, y ), rowCount, (cv::Rect*)0, cv::Scalar(), 0 );
        }
    }
}
This process will label all the rows from 1 to r. That's what we wanted. If you check out the image, you'll faintly see the rows; that's because our labels correspond to very low intensity values of grayscale pixels.
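If you port this part to Python, cv2.connectedComponents does the same labeling in one call. A sketch, assuming rowMask is a binary uint8 image:

# num_labels counts the background as label 0;
# the row blobs get labels 1 .. num_labels - 1
num_labels, labeled_rows = cv2.connectedComponents(rowMask)
rowCount = num_labels - 1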
Ok, now let's prepare the Row Table. This "table" really is just another image, remember: same width as the input and height as the number of rows you counted on the Row Mask:
//create rows image:
cv::Mat rowTable = cv::Mat::zeros( cv::Size(binImage.cols, rowCount), CV_8UC1 );
//Just for convenience:
rowTable = 255 - rowTable;
Here, I just inverted the final image for convenience. Because I want to actually see how the table is populated with (very low intensity) pixels and be sure that everything is working as intended.
Now comes the fun part. We have both images (or data containers) prepared. We need to process each blob independently. The idea is that you have to extract each blob/contour/character from the binary image, compute its centroid, and assign a new label. Again, there are a lot of ways of doing this. Here, I'm using the following approach:
I'll loop through the binary mask. I'll get the current biggest blob from this binary input. I'll compute its centroid and store its data in every container needed, and then I'll delete that blob from the mask. I'll repeat the process until no more blobs are left. This is my way of doing it, especially because I already have functions written for that. This is the approach:
//Prepare a couple of dictionaries for data storing:
std::map< int, cv::Point > blobMap; //holds label, gives centroid
std::map< int, cv::Rect > boundingBoxMap; //holds label, gives bounding box
First, two dictionaries. One receives a blob label and returns the centroid. The other one receives the same label and returns the bounding box.
//Extract each individual blob:
cv::Mat bobFilterInput = binImage.clone();
//The new blob label:
int blobLabel = 0;
//Some control variables:
bool extractBlobs = true; //Controls loop
int currentBlob = 0; //Counter of blobs
while ( extractBlobs ){
    //Get the biggest blob:
    cv::Mat biggestBlob = findBiggestBlob( bobFilterInput );
    //Compute the centroid/center of mass:
    cv::Moments momentStructure = cv::moments( biggestBlob, true );
    float cx = momentStructure.m10 / momentStructure.m00;
    float cy = momentStructure.m01 / momentStructure.m00;
    //Centroid point:
    cv::Point blobCentroid;
    blobCentroid.x = cx;
    blobCentroid.y = cy;
    //Compute bounding box:
    boundingBox boxData;
    computeBoundingBox( biggestBlob, boxData );
    //Convert boundingBox data into opencv rect data:
    cv::Rect cropBox = boundingBox2Rect( boxData );
    //Label blob:
    blobLabel++;
    blobMap.emplace( blobLabel, blobCentroid );
    boundingBoxMap.emplace( blobLabel, cropBox );
    //Get the row for this centroid:
    int blobRow = rowMask.at<uchar>( cy, cx );
    blobRow--;
    //Place centroid on rowed image:
    rowTable.at<uchar>( blobRow, cx ) = blobLabel;
    //Resume blob flow control:
    cv::Mat blobDifference = bobFilterInput - biggestBlob;
    //How many pixels are left on the new mask?
    int pixelsLeft = cv::countNonZero( blobDifference );
    bobFilterInput = blobDifference;
    //Done extracting blobs?
    if ( pixelsLeft <= 0 ){
        extractBlobs = false;
    }
    //Increment blob counter:
    currentBlob++;
}
Check out a nice animation of how this processing goes through each blob, processes it and deletes it until there’s nothing left:
Now, some notes on the above snippet. I have some helper functions: findBiggestBlob, computeBoundingBox and boundingBox2Rect. The first computes the biggest blob in a binary image, the second computes a custom bounding box structure, and the third converts that structure into OpenCV's Rect.
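For reference, an equivalent of findBiggestBlob can be sketched in Python with connected components. This is my assumption of what the helper does, not the author's actual C++ implementation:

import numpy as np
import cv2

def find_biggest_blob(binary):
    # label all blobs, then keep only the one with the largest area
    # (row 0 of stats is the background, so skip it)
    n, labels, stats, _ = cv2.connectedComponentsWithStats(binary, connectivity=8)
    biggest = 1 + int(np.argmax(stats[1:, cv2.CC_STAT_AREA]))
    return np.where(labels == biggest, 255, 0).astype(np.uint8)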
The "meat" of the snippet is this: Once you have an isolated blob, compute its centroid (I actually compute the center of mass via central moments). Generate a new label. Store this label and centroid in a dictionary, in my case, the blobMap dictionary. Additionally compute the bounding box and store it in another dictionary, boundingBoxMap:
//Label blob:
blobLabel++;
blobMap.emplace( blobLabel, blobCentroid );
boundingBoxMap.emplace( blobLabel, cropBox );
Now, using the centroid data, fetch the corresponding row of that blob. Once you get the row, store this number into your row table:
//Get the row for this centroid
int blobRow = rowMask.at<uchar>( cy, cx );
blobRow--;
//Place centroid on rowed image:
rowTable.at<uchar>( blobRow, cx ) = blobLabel;
Excellent. At this point you have the Row Table ready. Let’s loop through it and actually, and finally, order those damn blobs:
int blobCounter = 1; //The ORDERED label, starting at 1
for( int y = 0; y < rowTable.rows; y++ ){
    for( int x = 0; x < rowTable.cols; x++ ){
        //Get current label:
        uchar currentLabel = rowTable.at<uchar>( y, x );
        //Is it a valid label?
        if ( currentLabel != 255 ){
            //Get the bounding box for this label:
            cv::Rect currentBoundingBox = boundingBoxMap[ currentLabel ];
            cv::rectangle( testImage, currentBoundingBox, cv::Scalar(0,255,0), 2, 8, 0 );
            //The blob counter to string:
            std::string counterString = std::to_string( blobCounter );
            cv::putText( testImage, counterString, cv::Point( currentBoundingBox.x, currentBoundingBox.y-1 ),
                         cv::FONT_HERSHEY_SIMPLEX, 0.7, cv::Scalar(255,0,0), 1, cv::LINE_8, false );
            blobCounter++; //Increment the blob/label
        }
    }
}
Nothing fancy, just a regular nested for loop, looping through each pixel on the row table. If the pixel is different from white, use the label to retrieve the corresponding bounding box, and just change the label to an increasing number. For result display, I just draw the bounding boxes and the new label on the original image.
Check out the ordered processing in this animation:
Very cool, here's a bonus animation, the Row Table getting populated with horizontal coordinates:

I would even say use image moments (cv2.moments), which tend to give a better estimate of a polygon's center point
than the plain coordinate center of its bounding rectangle, so the function could be:
def get_contour_precedence(contour, cols):
    tolerance_factor = 61
    M = cv2.moments(contour)
    # calculate x,y coordinate of centroid
    if M["m00"] != 0:
        cX = int(M["m10"] / M["m00"])
        cY = int(M["m01"] / M["m00"])
    else:
        # set values as what you need in the situation
        cX, cY = 0, 0
    return ((cY // tolerance_factor) * tolerance_factor) * cols + cX
A good mathematical explanation of what image moments are can be found here.
Maybe you should think about getting rid of this tolerance_factor altogether by using a clustering algorithm like k-means to cluster your centers into rows and columns. OpenCV has a k-means implementation, which you can find here. A sketch of the idea follows.
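A minimal sketch of that idea: cluster the centroid y coordinates into k rows with cv2.kmeans. The centroids list and k (the number of text rows) are assumptions you must supply or estimate:

import numpy as np
import cv2

# centroids: list of (cx, cy) per contour, computed with cv2.moments as above
ys = np.float32([[cy] for (cx, cy) in centroids])
k = 4  # assumed number of text rows
criteria = (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 10, 1.0)
_, row_label, centers = cv2.kmeans(ys, k, None, criteria, 10, cv2.KMEANS_RANDOM_CENTERS)
# rank the k rows by their mean y, then sort contours by (row rank, centroid x)
rank = {int(old): new for new, old in enumerate(np.argsort(centers.ravel()))}
order = sorted(range(len(centroids)),
               key=lambda i: (rank[int(row_label[i])], centroids[i][0]))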
I do not know exactly what your goal is, but another idea could be to split every line into a Region of Interest (ROI)
for further processing; afterwards you could easily count the letters
using the x values of each contour and the line number.
import cv2
import numpy as np

## (1) read
img = cv2.imread("yFX3M.png")
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

## (2) threshold
th, threshed = cv2.threshold(gray, 127, 255, cv2.THRESH_BINARY_INV | cv2.THRESH_OTSU)

## (3) minAreaRect on the nonzeros
pts = cv2.findNonZero(threshed)
ret = cv2.minAreaRect(pts)
(cx, cy), (w, h), ang = ret
if w > h:
    w, h = h, w

## (4) find rotation matrix, do rotation
M = cv2.getRotationMatrix2D((cx, cy), ang, 1.0)
rotated = cv2.warpAffine(threshed, M, (img.shape[1], img.shape[0]))

## (5) find and draw the upper and lower boundary of each line
hist = cv2.reduce(rotated, 1, cv2.REDUCE_AVG).reshape(-1)
th = 2
H, W = img.shape[:2]

## (6) using histogram with threshold
uppers = [y for y in range(H - 1) if hist[y] <= th and hist[y + 1] > th]
lowers = [y for y in range(H - 1) if hist[y] > th and hist[y + 1] <= th]
rotated = cv2.cvtColor(rotated, cv2.COLOR_GRAY2BGR)
for y in uppers:
    cv2.line(rotated, (0, y), (W, y), (255, 0, 0), 1)
for y in lowers:
    cv2.line(rotated, (0, y), (W, y), (0, 255, 0), 1)
cv2.imshow('pic', rotated)

## (7) we iterate all ROIs and count
for i in range(len(uppers)):
    print('line=', i)
    roi = rotated[uppers[i]:lowers[i], 0:W]
    cv2.imshow('line', roi)
    cv2.waitKey(0)
    # here again calc threshold and contours
I found an old post with this code here

Instead of taking the upper left corner of the contour, I'd rather use the centroid or at least the bounding box center.
def get_contour_precedence(contour, cols):
    tolerance_factor = 4
    origin = cv2.boundingRect(contour)  # (x, y, w, h)
    # use the bounding box center (x + w/2, y + h/2) instead of the top-left corner
    return (((origin[1] + origin[3] / 2) // tolerance_factor) * tolerance_factor) * cols + (origin[0] + origin[2] / 2)
But it might be hard to find a tolerance value that works in all cases.
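Usage would be the same as in the question, e.g. (a sketch, assuming OpenCV 4's two-value findContours return):

contours, _ = cv2.findContours(thresh1, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
contours = sorted(contours, key=lambda c: get_contour_precedence(c, thresh1.shape[1]))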

Here is one way in Python/OpenCV by processing by rows first then characters.
Read the input
Convert to grayscale
Threshold and invert
Use a long horizontal kernel and apply morphology close to form rows
Get the contours of the rows and their bounding boxes
Save the row boxes and sort on Y
Loop over each sorted row box and extract the row from the thresholded image
Get the contours of each character in the row and save the bounding boxes of the characters.
Sort the contours for a given row on X
Draw the bounding boxes on the input and the index number as text on the image
Increment the index
Save the results
Input:
import cv2
import numpy as np

# read input image
img = cv2.imread('vision78.png')

# convert img to grayscale
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

# otsu threshold
thresh = cv2.threshold(gray, 0, 255, cv2.THRESH_OTSU)[1]
thresh = 255 - thresh

# apply morphology close to form rows
kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (51, 1))
morph = cv2.morphologyEx(thresh, cv2.MORPH_CLOSE, kernel)

# find contours and bounding boxes of rows
rows_img = img.copy()
boxes_img = img.copy()
rowboxes = []
rowcontours = cv2.findContours(morph, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
rowcontours = rowcontours[0] if len(rowcontours) == 2 else rowcontours[1]
index = 1
for rowcntr in rowcontours:
    xr, yr, wr, hr = cv2.boundingRect(rowcntr)
    cv2.rectangle(rows_img, (xr, yr), (xr + wr, yr + hr), (0, 0, 255), 1)
    rowboxes.append((xr, yr, wr, hr))

# sort rowboxes on y coordinate
def takeSecond(elem):
    return elem[1]

rowboxes.sort(key=takeSecond)

# loop over each row
for rowbox in rowboxes:
    # crop the image for a given row
    xr = rowbox[0]
    yr = rowbox[1]
    wr = rowbox[2]
    hr = rowbox[3]
    row = thresh[yr:yr + hr, xr:xr + wr]
    bboxes = []
    # find contours of each character in the row
    contours = cv2.findContours(row, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    contours = contours[0] if len(contours) == 2 else contours[1]
    for cntr in contours:
        x, y, w, h = cv2.boundingRect(cntr)
        bboxes.append((x + xr, y + yr, w, h))

    # sort bboxes on x coordinate
    def takeFirst(elem):
        return elem[0]

    bboxes.sort(key=takeFirst)

    # draw sorted boxes
    for box in bboxes:
        xb = box[0]
        yb = box[1]
        wb = box[2]
        hb = box[3]
        cv2.rectangle(boxes_img, (xb, yb), (xb + wb, yb + hb), (0, 0, 255), 1)
        cv2.putText(boxes_img, str(index), (xb, yb), cv2.FONT_HERSHEY_COMPLEX_SMALL, 0.75, (0, 255, 0), 1)
        index = index + 1

# save result
cv2.imwrite("vision78_thresh.jpg", thresh)
cv2.imwrite("vision78_morph.jpg", morph)
cv2.imwrite("vision78_rows.jpg", rows_img)
cv2.imwrite("vision78_boxes.jpg", boxes_img)

# show images
cv2.imshow("thresh", thresh)
cv2.imshow("morph", morph)
cv2.imshow("rows_img", rows_img)
cv2.imshow("boxes_img", boxes_img)
cv2.waitKey(0)
Threshold image:
Morphology image of rows:
Row contours image:
Character contours image:

Related

Extract most central area in a Binary Image

I am processing binary images, and was previously using this code to find the largest area in the binary image:
# Use the hue value to convert to binary
# (h is the hue channel, extracted earlier from an HSV conversion)
thresh = 20
thresh, thresh_img = cv2.threshold(h, thresh, 255, cv2.THRESH_BINARY)
cv2.imshow('thresh', thresh_img)
cv2.waitKey(0)
cv2.destroyAllWindows()

# Finding Contours
# Use a copy of the image since findContours alters the image
contours, _ = cv2.findContours(thresh_img.copy(), cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)

# Extract the largest area
c = max(contours, key=cv2.contourArea)
This code isn't really doing what I need it to do; now I think it would be better to extract the most central area in the binary image.
Binary Image
Largest Image
This is currently what the code is extracting, but I am hoping to get the central circle in the first binary image extracted.
OpenCV comes with a point-polygon test function (for contours). It even gives a signed distance, if you ask for that.
I'll find the contour that is closest to the center of the picture. That may be a contour actually overlapping the center of the picture.
Timings, on my quadcore from 2012, give or take a millisecond:
findContours: ~1 millisecond
all pointPolygonTests and argmax: ~1 millisecond
import numpy as np
import cv2 as cv

mask = cv.imread("fkljm.png", cv.IMREAD_GRAYSCALE)
(height, width) = mask.shape
ret, mask = cv.threshold(mask, 128, 255, cv.THRESH_BINARY) # required because the sample picture isn't exactly clean

# get contours (RETR_LIST; retrieval modes are not bit flags, so OR-ing them is not meaningful)
contours, hierarchy = cv.findContours(mask, cv.RETR_LIST, cv.CHAIN_APPROX_SIMPLE)

center = (np.array([width, height]) - 1) / 2

# find contour closest to center of picture
distances = [
    cv.pointPolygonTest(contour, center, True) # looking for most positive (inside); negative is outside
    for contour in contours
]
iclosest = np.argmax(distances)
print("closest contour is", iclosest, "with distance", distances[iclosest])

# draw closest contour
canvas = cv.cvtColor(mask, cv.COLOR_GRAY2BGR)
cv.drawContours(image=canvas, contours=[contours[iclosest]], contourIdx=-1, color=(0, 255, 0), thickness=5)
closest contour is 45 with distance 65.19202405202648
A cv.floodFill() on the center point can also quickly yield a labeling of that blob... assuming the mask is positive there. Otherwise, there needs to be a search.
(cx, cy) = center.astype(int)
assert mask[cy, cx], "floodFill not applicable"

# trying cv.floodFill on the image center
mask2 = mask >> 1 # turns everything else gray
cv.floodFill(image=mask2, mask=None, seedPoint=(cx, cy), newVal=255)

# use (mask2 == 255) to identify that blob
This also takes less than a millisecond.
Some practically faster approaches might involve a pyramid scheme (low-res versions of the mask) to quickly identify areas of the picture that are candidates for an exact test (distance/intersection):
Test the target pixel. Hit (positive)? Done.
Calculate a low-res mask. Per block, if any pixel is positive, the block is positive.
Find positive blocks, sort by distance, and examine closely all those within sqrt(2) * blocksize of the best distance (a sketch of the low-res test follows).
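A minimal sketch of that low-res candidate test; the block size of 16 is an assumption, and mask, cx, cy come from the snippets above:

import numpy as np
import cv2 as cv

block = 16
# area interpolation averages each block; any positive pixel keeps the block > 0
small = cv.resize(mask, None, fx=1 / block, fy=1 / block, interpolation=cv.INTER_AREA)
candidates = np.argwhere(small > 0)      # (row, col) of blocks containing positive pixels
# distance from each block center to the target point, in full-res pixels
d = np.linalg.norm((candidates + 0.5) * block - np.array([cy, cx]), axis=1)
candidates = candidates[np.argsort(d)]   # examine the nearest blocks first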
There are several ways you could define "most central." I chose to define it as the region with the closest distance to the point you're searching for. If the point is inside the region, then that distance will be zero.
I also chose to do this with a pixel-based approach rather than a polygon-based approach, like you're doing with findContours().
Here's a step-by-step breakdown of what this code is doing.
Load the image, put it into grayscale, and threshold it. You're already doing these things.
Identify connected components of the image. Connected components are places where there are white pixels which are directly connected to other white pixels. This breaks up the image into regions.
Using np.argwhere(), convert a true/false mask into an array of coordinates.
For each coordinate, compute the Euclidean distance between that point and search_point.
Find the minimum within each region.
Across all regions, find the smallest distance.
import cv2
import numpy as np

img = cv2.imread('test197_img.png')
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
_, thresh_img = cv2.threshold(gray, 127, 255, cv2.THRESH_BINARY)

n_groups, comp_grouped = cv2.connectedComponents(thresh_img)

components = []
search_point = [600, 150]
for i in range(1, n_groups):
    mask = (comp_grouped == i)
    # argwhere gives (row, col); reverse to (x, y) to match search_point
    component_coords = np.argwhere(mask)[:, ::-1]
    min_distance = np.sqrt(((component_coords - search_point) ** 2).sum(axis=1)).min()
    components.append({
        'mask': mask,
        'min_distance': min_distance,
    })
closest = min(components, key=lambda x: x['min_distance'])['mask']
Output:

Remove and measure a line openCV

Links to all images at the bottom
I have drawn a line over an arrow which captures the angle of that arrow. I would like to then remove the arrow, keep only the line, and use cv2.minAreaRect to determine the angle. So far I've got everything to work except removing the original arrow, which results in an incorrect angle generated by the cv2.minAreaRect bounding box.
Really, I just want the bold black line running through the arrow to measure the angle with, not the arrow itself. If anyone has an idea to make this work, or knows a simpler way, please let me know. Thanks.
Code:
import numpy as np
import cv2

image = cv2.imread("templates/a_15.png")
image = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
ret, thresh = cv2.threshold(image, 127, 255, 0)
contours, hierarchy = cv2.findContours(thresh, 1, 2)
cont = contours[0]
rows, cols = image.shape[:2]
[vx, vy, x, y] = cv2.fitLine(cont, cv2.DIST_L2, 0, 0.01, 0.01)
leftish = int((-x * vy / vx) + y)
rightish = int(((cols - x) * vy / vx) + y)
line = cv2.line(image, (cols - 1, rightish), (0, leftish), (0, 255, 0), 10)

# thresholding
thresh = cv2.threshold(line, 0, 255, cv2.THRESH_BINARY | cv2.THRESH_OTSU)[1]

# compute rotated bounding box based on all pixel values > 0 and
# use coordinates to compute a rotated bounding box of those coordinates
coordinates = np.column_stack(np.where(thresh > 0))
w = coordinates[0]
h = coordinates[1]

# Compute minimum rotated angle that contains entire image.
# Return angle values in the range [-90, 0).
# As the rectangle rotates clockwise, angle values increase towards 0.
# Once 0 is reached, angle is set back to -90 degrees.
angle = cv2.minAreaRect(coordinates)[-1]

# for angles less than -45 degrees, add 90 degrees to angle to take the inverse.
if angle < -45:
    angle = -(90 + angle)
else:
    angle = -angle

# rotate image
(h, w) = image.shape[:2]
center = (w // 2, h // 2)  # image center
RM = cv2.getRotationMatrix2D(center, angle, 1.0)
rotated = cv2.warpAffine(image, RM, (w, h),
                         flags=cv2.INTER_CUBIC, borderMode=cv2.BORDER_REPLICATE)

# correction angle for validation
cv2.putText(rotated, "Angle {:.2f} degrees".format(angle),
            (10, 30), cv2.FONT_HERSHEY_DUPLEX, 0.9, (0, 255, 0), 2)

# output
print("[INFO] angle: {:.3f}".format(angle))
cv2.imshow("Line", line)
cv2.imshow("Input", image)
cv2.imshow("Rotated", rotated)
cv2.waitKey(0)
Images
original
current results
goal
Here's a possible solution. The main idea is to identify the "tip" and the "tail" of the arrow by approximating some key points. After you have identified both ends, you can draw a line joining them. It is also an advantage to know which of the endpoints is the tip, because that way you can measure the angle from a constant point.
There's more than one way to achieve this. I chose something that I have applied in the past: I will use this approach to identify the endpoints of the overall shape. My assumption is that the tip will yield more points than the tail. After that, I'll cluster all the endpoints into two groups: tip and tail. I can use K-Means for that, as it will return the mean centers for both clusters. After that, we have our tip and tail points, which can be joined easily with a line. These are the steps:
Convert the image to grayscale
Get the skeleton of the image, to normalize the shape to a width of 1 pixel
Apply the method described in the link to get the arrow's endpoints
Divide the endpoints in two clusters and use K-Means to get their centers
Join both endpoints with a line
Let's see the code:
# imports:
import cv2
import numpy as np
# image path
path = "D://opencvImages//"
fileName = "CoXeb.png"
# Reading an image in default mode:
inputImage = cv2.imread(path + fileName)
# Grayscale conversion:
grayscaleImage = cv2.cvtColor(inputImage, cv2.COLOR_BGR2GRAY)
grayscaleImage = 255 - grayscaleImage
# Extend the borders for the skeleton:
extendedImg = cv2.copyMakeBorder(grayscaleImage, 5, 5, 5, 5, cv2.BORDER_CONSTANT)
# Store a deep copy of the crop for results:
grayscaleImageCopy = cv2.cvtColor(extendedImg, cv2.COLOR_GRAY2BGR)
# Compute the skeleton:
skeleton = cv2.ximgproc.thinning(extendedImg, None, 1)
The first step is to get the skeleton of the arrow. As I said, this step is needed prior to the convolution-based method that identifies the endpoints of a shape. Computing the skeleton normalizes the shape to a one pixel width. However, sometimes, if the shape is too close to the "canvas" borders, the skeleton could show some artifacts. This is avoided with a border extension. The skeleton of the arrow is this:
Check that image out. If we identify the endpoints, the tip will exhibit at least 3 points, while the tail at least 1. That's handy - the tip will always have more points than the tail. If only we could detect those points... Luckily, we can:
# Threshold the image so that skeleton (white) pixels get a value
# of 10 and background pixels a value of 0:
_, binaryImage = cv2.threshold(skeleton, 128, 10, cv2.THRESH_BINARY)
# Set the end-points kernel:
h = np.array([[1, 1, 1],
[1, 10, 1],
[1, 1, 1]])
# Convolve the image with the kernel:
imgFiltered = cv2.filter2D(binaryImage, -1, h)
# Extract only the end-points pixels, those with
# an intensity value of 110:
binaryImage = np.where(imgFiltered == 110, 255, 0)
# The above operation converted the image to 32-bit float,
# convert back to 8-bit uint
binaryImage = binaryImage.astype(np.uint8)
This endpoint detecting method convolves the skeleton with a special kernel that identifies endpoints: skeleton pixels have a value of 10, so at an endpoint the center contributes 10 x 10 and exactly one neighbor contributes 10 x 1, giving a response of 110. It returns an image where all the endpoints have the value 110. After thresholding this mid-result, we get this image, which represents the arrow endpoints:
Nice, as you see, we can group the points in two clusters and get their cluster centers. Sounds like a job for K-Means, because that's exactly what it does. We first need to treat our data, though, because K-Means operates on defined-shaped arrays of float data:
# Find the X, Y location of all the end-points
# pixels:
Y, X = binaryImage.nonzero()
# Reshape the arrays for K-means
Y = Y.reshape(-1,1)
X = X.reshape(-1,1)
Z = np.hstack((X, Y))
# K-means operates on 32-bit float data:
floatPoints = np.float32(Z)
# Set the convergence criteria and call K-means:
criteria = (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 10, 1.0)
ret, label, center = cv2.kmeans(floatPoints, 2, None, criteria, 10, cv2.KMEANS_RANDOM_CENTERS)
# Set the cluster count, find the points belonging
# to cluster 0 and cluster 1:
cluster1Count = np.count_nonzero(label)
cluster0Count = np.shape(label)[0] - cluster1Count
print("Elements of Cluster 0: "+str(cluster0Count))
print("Elements of Cluster 1: " + str(cluster1Count))
The last two lines print the number of endpoints assigned to Cluster 0 and Cluster 1, respectively. That outputs this:
Elements of Cluster 0: 3
Elements of Cluster 1: 2
Just as expected – well, kinda. It seems that Cluster 0 is the tip and Cluster 1 the tail! But the tail actually got 2 points. If you look at the image of the skeleton closely, you'll see there's a small bifurcation at the tail. That's why we, in reality, got two points instead of just one. Alright, let's get the center points and draw them on the original input:
# Look for the cluster of max number of points
# That cluster will be the tip of the arrow:
maxCluster = 0
if cluster1Count > cluster0Count:
    maxCluster = 1

# Check out the centers of each cluster:
matRows, matCols = center.shape

# Store the ordered end-points here:
orderedPoints = [None] * 2

# Let's identify and draw the two end-points
# of the arrow:
for b in range(matRows):
    # Get cluster center:
    pointX = int(center[b][0])
    pointY = int(center[b][1])
    # Get the "tip"
    if b == maxCluster:
        color = (0, 0, 255)
        orderedPoints[0] = (pointX, pointY)
    # Get the "tail"
    else:
        color = (255, 0, 0)
        orderedPoints[1] = (pointX, pointY)
    # Draw it:
    cv2.circle(grayscaleImageCopy, (pointX, pointY), 3, color, -1)

cv2.imshow("End-Points", grayscaleImageCopy)
cv2.waitKey(0)
This is the resulting image:
The tip always gets drawn in red while the tail is drawn in blue. Very cool, let's store these points in the orderedPoints list and draw the final line on a new "canvas", with the same dimensions as the original image:
# Store the tip and tail points:
p0x = orderedPoints[1][0]
p0y = orderedPoints[1][1]
p1x = orderedPoints[0][0]
p1y = orderedPoints[0][1]
# Create a new "canvas" (image) using the input dimensions:
imageHeight, imageWidth = binaryImage.shape[:2]
newImage = np.zeros((imageHeight, imageWidth), np.uint8)
newImage = 255 - newImage
# Draw a line using the detected points:
(x1, y1) = orderedPoints[0]
(x2, y2) = orderedPoints[1]
lineColor = (0, 0, 0)
cv2.line(newImage , (x1, y1), (x2, y2), lineColor, thickness=2)
cv2.imshow("Detected Line", newImage)
cv2.waitKey(0)
The line overlaid on the original image and the new image containing only the line:
It sounds like you want to measure the angle of the line, but because you are measuring a line you drew in the original image, you must now filter out the original image to get an accurate measure of the line... which you drew with coordinates whose endpoints you already know?
I guess:
make a better filter?
draw the line in a blank image and detect the angle there?
determine the angle from the known coordinates? (a sketch of this option follows)
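For the third option, the angle falls out of atan2 directly. A sketch using the names leftish, rightish and cols from the question's code:

import math

# the fitted line runs from (0, leftish) to (cols - 1, rightish);
# note that in image coordinates y grows downward
angle = math.degrees(math.atan2(rightish - leftish, cols - 1))
print("[INFO] angle: {:.3f}".format(angle))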
Since you were asking for just a line, I tried that...just made a blank image, drew your detected line on it and then used that downstream...
# height, width, channels come from the original image's shape
blankIm = np.ones((height, width, channels), dtype=np.uint8)
blankIm.fill(255)
line = cv2.line(blankIm, (cols - 1, rightish), (0, leftish), (0, 255, 0), 10)

How to infer the state of a shape from colors

I have Lego cubes forming a 4x4 shape, and I'm trying to infer the status of a zone inside the image:
empty/full, and the color, whether yellow or blue.
To simplify my work I have added red markers to define the borders of the shape, since the camera is shaking sometimes.
Here is a clear image of the shape I'm trying to detect, taken by my phone camera
(EDIT: note that this image is not my input image; it is used just to demonstrate the required shape clearly).
The shape from the side camera that I'm supposed to use looks like this:
(EDIT : Now this is my input image)
To focus my work on the working zone, I have created a mask:
What I have tried so far is to locate the red markers by color (a simple threshold, without HSV color-space), as follows:
import numpy as np
import matplotlib.pyplot as plt
import cv2
img = cv2.imread('sample.png')
RGB = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
mask = cv2.imread('mask.png')
masked = np.minimum(RGB, mask)
masked[masked[...,1]>25] = 0
masked[masked[...,2]>25] = 0
masked = masked[..., 0]
masked = cv2.medianBlur(masked,5)
plt.imshow(masked, cmap='gray')
plt.show()
and I have spotted the markers so far:
But I'm still confused:
how do I precisely detect the external borders of the desired zone, and the internal borders (each Lego cube's (yellow-blue-green) borders) inside the red markers?
Thanks in advance for your kind advice.
I tested this approach using your undistorted image. Suppose you have the rectified camera image, so you see the Lego bricks through a "bird's eye" perspective. Now, the idea is to use the red markers to estimate a center rectangle and crop that portion of the image. Then, as you know each brick's dimensions (and they are constant), you can trace a grid and extract each cell of the grid. You can compute some HSV-based masks to estimate the dominant color in each grid cell, and that way you know if the space is occupied by a yellow or blue brick, or if it is empty.
These are the steps:
Get an HSV mask of the red markers
Use each marker to estimate the center rectangle through each marker's coordinates
Crop the center rectangle
Divide the rectangle into cells - this is the grid
Run a series of HSV-based masks on each cell and compute the dominant color
Label each cell with the dominant color
Let's see the code:
# Importing cv2 and numpy:
import numpy as np
import cv2
# image path
path = "D://opencvImages//"
fileName = "Bg9iB.jpg"
# Reading an image in default mode:
inputImage = cv2.imread(path + fileName)
# Store a deep copy for results:
inputCopy = inputImage.copy()
# Convert the image to HSV:
hsvImage = cv2.cvtColor(inputImage, cv2.COLOR_BGR2HSV)
# The HSV mask values (Red):
lowerValues = np.array([127, 0, 95])
upperValues = np.array([179, 255, 255])
# Create the HSV mask
mask = cv2.inRange(hsvImage, lowerValues, upperValues)
The first part is very straightforward. You set the HSV range and use cv2.inRange to get a binary mask of the target color. This is the result:
We can further improve the binary mask using some morphology. Let's apply a closing with a somewhat big structuring element and 10 iterations. We want those markers as clearly defined as possible:
# Set kernel (structuring element) size:
kernelSize = 5
# Set operation iterations:
opIterations = 10
# Get the structuring element:
maxKernel = cv2.getStructuringElement(cv2.MORPH_RECT, (kernelSize, kernelSize))
# Perform closing:
mask = cv2.morphologyEx(mask, cv2.MORPH_CLOSE, maxKernel, None, None, opIterations, cv2.BORDER_REFLECT101)
Which yields:
Very nice. Now, let's detect contours on this mask. We will approximate each contour to a bounding box and store its starting point and dimensions. The idea being that, while we will detect every contour, we are not sure of their order. We can sort this list later and get each bounding box from left to right, top to bottom to better estimate the central rectangle. Let's detect contours:
# Create a deep copy, convert it to BGR for results:
maskCopy = mask.copy()
maskCopy = cv2.cvtColor(maskCopy, cv2.COLOR_GRAY2BGR)
# Find the big contours/blobs on the filtered image:
contours, hierarchy = cv2.findContours(mask, cv2.RETR_CCOMP, cv2.CHAIN_APPROX_SIMPLE)

# Bounding Rects are stored here:
boundRectsList = []

# Process each contour 1-1:
for i, c in enumerate(contours):
    # Approximate the contour to a polygon:
    contoursPoly = cv2.approxPolyDP(c, 3, True)
    # Convert the polygon to a bounding rectangle:
    boundRect = cv2.boundingRect(contoursPoly)
    # Get the bounding rect's data:
    rectX = boundRect[0]
    rectY = boundRect[1]
    rectWidth = boundRect[2]
    rectHeight = boundRect[3]
    # Estimate the bounding rect area:
    rectArea = rectWidth * rectHeight
    # Set a min area threshold
    minArea = 100
    # Filter blobs by area:
    if rectArea > minArea:
        # Store the rect:
        boundRectsList.append(boundRect)
I also created a deep copy of the mask image for further use. Mainly to create this image, which is the result of the contour detection and bounding box approximation:
Notice that I have included a minimum area condition. I want to ignore noise below a certain threshold defined by minArea. Alright, now we have the bounding boxes in the boundRectsList variable. Let's sort these boxes using the Y coordinate:
# Sort the list based on ascending y values:
boundRectsSorted = sorted(boundRectsList, key=lambda x: x[1])
The list is now sorted and we can enumerate the boxes from left to right, top to bottom. Like this: First "row" -> 0, 1, Second "Row" -> 2, 3. Now, we can define the big, central rectangle using this info. I call these "inner points". Notice the rectangle is defined as a function of all the bounding boxes. For example, its top left starting point is defined by bounding box 0's bottom right ending point (both x and y). Its width is defined by bounding box 1's bottom left x coordinate, and its height is defined by bounding box 2's rightmost y coordinate. I'm gonna loop through each bounding box and extract their relevant dimensions to construct the center rectangle in the following way: (top left x, top left y, width, height). There's more than one way to achieve this. I prefer to use a dictionary to get the relevant data. Let's see:
# Rectangle dictionary:
# Each entry is an index of the currentRect list
# 0 - X, 1 - Y, 2 - Width, 3 - Height
# Additionally: -1 is 0 (no dimension):
pointsDictionary = {0: (2, 3),
                    1: (-1, 3),
                    2: (2, -1),
                    3: (-1, -1)}

# Store center rectangle coordinates here:
centerRectangle = [None] * 4

# Process the sorted rects:
rectCounter = 0
for i in range(len(boundRectsSorted)):
    # Get sorted rect:
    currentRect = boundRectsSorted[i]
    # Get the bounding rect's data:
    rectX = currentRect[0]
    rectY = currentRect[1]
    rectWidth = currentRect[2]
    rectHeight = currentRect[3]
    # Draw sorted rect:
    cv2.rectangle(maskCopy, (int(rectX), int(rectY)), (int(rectX + rectWidth),
                  int(rectY + rectHeight)), (0, 255, 0), 5)
    # Get the inner points:
    currentInnerPoint = pointsDictionary[i]
    borderPoint = [None] * 2
    # Check coordinates:
    for p in range(2):
        # Check for '0' index:
        idx = currentInnerPoint[p]
        if idx == -1:
            borderPoint[p] = 0
        else:
            borderPoint[p] = currentRect[idx]
    # Draw the border points:
    color = (0, 0, 255)
    thickness = -1
    centerX = rectX + borderPoint[0]
    centerY = rectY + borderPoint[1]
    radius = 50
    cv2.circle(maskCopy, (centerX, centerY), radius, color, thickness)
    # Mark the circle
    org = (centerX - 20, centerY + 20)
    font = cv2.FONT_HERSHEY_SIMPLEX
    cv2.putText(maskCopy, str(rectCounter), org, font,
                2, (0, 0, 0), 5, cv2.LINE_8)
    # Show the circle:
    cv2.imshow("Sorted Rects", maskCopy)
    cv2.waitKey(0)
    # Store the coordinates into list
    if rectCounter == 0:
        centerRectangle[0] = centerX
        centerRectangle[1] = centerY
    elif rectCounter == 1:
        centerRectangle[2] = centerX - centerRectangle[0]
    elif rectCounter == 2:
        centerRectangle[3] = centerY - centerRectangle[1]
    # Increase rectCounter:
    rectCounter += 1
This image shows each inner point with a red circle. Each circle is enumerated from left to right, top to bottom. The inner points are stored in the centerRectangle list:
If you join each inner point you get the center rectangle we have been looking for:
# Check out the big rectangle at the center:
bigRectX = centerRectangle[0]
bigRectY = centerRectangle[1]
bigRectWidth = centerRectangle[2]
bigRectHeight = centerRectangle[3]

# Draw the big rectangle:
cv2.rectangle(maskCopy, (int(bigRectX), int(bigRectY)), (int(bigRectX + bigRectWidth),
              int(bigRectY + bigRectHeight)), (0, 0, 255), 5)
cv2.imshow("Big Rectangle", maskCopy)
cv2.waitKey(0)
Check it out:
Now, just crop this portion of the original image:
# Crop the center portion:
centerPortion = inputCopy[bigRectY:bigRectY + bigRectHeight, bigRectX:bigRectX + bigRectWidth]
# Store a deep copy for results:
centerPortionCopy = centerPortion.copy()
This is the central portion of the image:
Cool, now let's create the grid. You know that there must be 4 bricks per width and 4 bricks per height. We can divide the image using this info. I'm storing each sub-image, or cell, in a list. I'm also estimating each cell's center, for additional processing. These are stored in a list too. Let's see the procedure:
# Divide the image into a grid:
verticalCells = 4
horizontalCells = 4

# Cell dimensions
cellWidth = bigRectWidth / verticalCells
cellHeight = bigRectHeight / horizontalCells

# Store the cells here:
cellList = []

# Store cell centers here:
cellCenters = []

# Loop thru vertical dimension:
for j in range(verticalCells):
    # Cell starting y position:
    yo = j * cellHeight
    # Loop thru horizontal dimension:
    for i in range(horizontalCells):
        # Cell starting x position:
        xo = i * cellWidth
        # Cell Dimensions:
        cX = int(xo)
        cY = int(yo)
        cWidth = int(cellWidth)
        cHeight = int(cellHeight)
        # Crop current cell:
        currentCell = centerPortion[cY:cY + cHeight, cX:cX + cWidth]
        # into the cell list:
        cellList.append(currentCell)
        # Store cell center:
        cellCenters.append((cX + 0.5 * cWidth, cY + 0.5 * cHeight))
        # Draw Cell
        cv2.rectangle(centerPortionCopy, (cX, cY), (cX + cWidth, cY + cHeight), (255, 255, 0), 5)

cv2.imshow("Grid", centerPortionCopy)
cv2.waitKey(0)
This is the grid:
Let's now process each cell individually. Of course, you could process each cell in the last loop, but I'm not currently looking for optimization; clarity is my priority. We need to generate a series of HSV masks with the target colors: yellow, blue and green (empty). I prefer, again, to implement a dictionary with the target colors. I'll generate a mask for each color and count the number of white pixels using cv2.countNonZero. Again, I set a minimum threshold, this time of 10. With this info I can determine which mask generated the maximum number of white pixels, thus giving me the dominant color:
# HSV dictionary - color ranges and color name:
colorDictionary = {0: ([93, 64, 21], [121, 255, 255], "blue"),
                   1: ([20, 64, 21], [30, 255, 255], "yellow"),
                   2: ([55, 64, 21], [92, 255, 255], "green")}

# Cell counter:
cellCounter = 0

for c in range(len(cellList)):
    # Get current Cell:
    currentCell = cellList[c]
    # Convert to HSV:
    hsvCell = cv2.cvtColor(currentCell, cv2.COLOR_BGR2HSV)
    # Some additional info:
    (h, w) = currentCell.shape[:2]
    # Process masks:
    maxCount = 10
    cellColor = "None"
    for m in range(len(colorDictionary)):
        # Get current lower and upper range values:
        currentLowRange = np.array(colorDictionary[m][0])
        currentUppRange = np.array(colorDictionary[m][1])
        # Create the HSV mask
        mask = cv2.inRange(hsvCell, currentLowRange, currentUppRange)
        # Get max number of target pixels
        targetPixelCount = cv2.countNonZero(mask)
        if targetPixelCount > maxCount:
            maxCount = targetPixelCount
            # Get color name from dictionary:
            cellColor = colorDictionary[m][2]
    # Get cell center, add an x offset:
    textX = int(cellCenters[cellCounter][0]) - 100
    textY = int(cellCenters[cellCounter][1])
    # Draw text on cell's center:
    font = cv2.FONT_HERSHEY_SIMPLEX
    cv2.putText(centerPortion, cellColor, (textX, textY), font,
                2, (0, 0, 255), 5, cv2.LINE_8)
    # Increase cellCounter:
    cellCounter += 1

cv2.imshow("centerPortion", centerPortion)
cv2.waitKey(0)
This is the result:
From here it is easy to identify the empty spaces on the grid. What I didn't cover was the perspective rectification of your distorted image, but there's plenty of info on how to do that. Hope this helps you out!
Edit:
If you want to apply this approach to your distorted image you need to undo the fish-eye and the perspective distortion. Your rectified image should look like this:
You probably will have to tweak some values because some of the distortion still remains, even after rectification.

remove demarcation from text image - image processing

Hi, I need to write a program that removes demarcation from a grayscale image (an image with text in it).
I read about thresholding and blurring, but I still don't see how I can do it.
My image is an image of Hebrew text, like this:
and I need to remove the demarcation (assuming that the demarcation is the smallest element in the image). The output needs to be something like this:
I want to write the code in Python using OpenCV. What topics do I need to learn to be able to do that, and how?
Thank you.
Edit:
I can use only cv2 functions
The symbols you want to remove are significantly smaller than all other shapes; you can use that to determine which ones to remove.
First use threshold to convert the image to binary. Next, you can use findContours to detect the shapes and then contourArea to determine if a shape is larger than a threshold.
Finally you can create a mask to remove the unwanted shapes, draw the larger symbols on a new image, or draw the smaller symbols in white over the original symbols in the original image – making them disappear. I used that last technique in the code below.
Result:
Code:
import cv2

# load image as grayscale
img = cv2.imread('1MioS.png', 0)

# convert to binary. Inverted, so you get white symbols on black background
_, thres = cv2.threshold(img, 200, 255, cv2.THRESH_BINARY_INV)

# find contours in the thresholded image (this gives all symbols)
contours, hierarchy = cv2.findContours(thres, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)

# loop through the contours, if the size of the contour is below a threshold,
# draw a white shape over it in the input image
for cnt in contours:
    if cv2.contourArea(cnt) < 250:
        cv2.drawContours(img, [cnt], 0, (255), -1)

# display result
cv2.imshow('res', img)
cv2.waitKey(0)
cv2.destroyAllWindows()
Update
To find the largest contour, you can loop through them and keep track of the largest value:
maxArea = 0
for cnt in contours:
    currArea = cv2.contourArea(cnt)
    if currArea > maxArea:
        maxArea = currArea
print(maxArea)
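Equivalently, Python's built-in max with a key function does this in one line:

largest = max(contours, key=cv2.contourArea)  # contour with the biggest area
maxArea = cv2.contourArea(largest)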
I also whipped up a little more complex version, that creates a sorted list of the indexes and sizes of the contours. Then it looks for the largest relative difference in size of all contours, so you know which contours are 'small' and 'large'. I do not know if this works for all letters / fonts.
# create a list of the indexes of the contours and their sizes
contour_sizes = []
for index, cnt in enumerate(contours):
    contour_sizes.append([index, cv2.contourArea(cnt)])

# sort the list based on the contour size.
# this changes the order of the elements in the list
contour_sizes.sort(key=lambda x: x[1])

# loop through the list and determine the largest relative difference
indexOfMaxDifference = 0
currentMaxDifference = 0
for i in range(1, len(contour_sizes)):
    sizeDifference = contour_sizes[i][1] / contour_sizes[i - 1][1]
    if sizeDifference > currentMaxDifference:
        currentMaxDifference = sizeDifference
        indexOfMaxDifference = i

# loop through the list again, ending (or starting) at the indexOfMaxDifference, to draw the contours
for i in range(0, indexOfMaxDifference):
    cv2.drawContours(img, contours, contour_sizes[i][0], (255), -1)
To get the background color you can use minMaxLoc. This returns the lowest color value and its position in an image (also the max value, but you don't need that). If you apply it to the thresholded image – where the background is black – it will return the location of a background pixel (big odds it will be (0,0)). You can then look up this pixel in the original color image.
# get the location of a pixel with background color
min_val, _, min_loc, _ = cv2.minMaxLoc(thres)

# load color image
img_color = cv2.imread('1MioS.png')

# get bgr values of background
# (min_loc is (x, y), but numpy indexes (row, col), so swap)
b, g, r = img_color[min_loc[1], min_loc[0]]

# convert from numpy object
background_color = (int(b), int(g), int(r))
and then to draw the contours
cv2.drawContours(img_color,contours,contour_sizes[i][0],background_color,-1)
and of course
cv2.imshow('res', img_color)
This looks like a problem for template matching since you have what looks like a known font and can easily understand what the characters and/or demarcations are. Check out https://opencv-python-tutroals.readthedocs.io/en/latest/py_tutorials/py_imgproc/py_template_matching/py_template_matching.html
Admittedly, the tutorial talks about finding the match; modification is up to you. In that case, you know the exact shape of the template itself, so using that information along with the location of the match, just overwrite the image data with the appropriate background color (based on the examples above, 255).
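A minimal sketch of that idea; the template file name and the 0.8 match threshold are assumptions, not values from the tutorial:

import cv2
import numpy as np

img = cv2.imread('1MioS.png', 0)
template = cv2.imread('demarcation_template.png', 0)  # hypothetical crop of one mark
th, tw = template.shape

# normalized cross-correlation gives a score map over all positions
res = cv2.matchTemplate(img, template, cv2.TM_CCOEFF_NORMED)

# overwrite every strong match with the background color (255)
for y, x in zip(*np.where(res >= 0.8)):
    img[y:y + th, x:x + tw] = 255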
You can solve it by removing all the small clusters.
I found a Python solution (using OpenCV) here.
For supporting smaller fonts, I added the following heuristic:
"The largest size of the demarcation cluster is 1/500 of the largest letter cluster".
The heuristic can be refined by statistical analysis (or improved by other heuristics, such as demarcation locations relative to the letters).
import numpy as np
import cv2

I = cv2.imread('Goodluck.png', cv2.IMREAD_GRAYSCALE)
J = 255 - I  # Invert I
img = cv2.threshold(J, 127, 255, cv2.THRESH_BINARY)[1]  # Convert to binary

# https://answers.opencv.org/question/194566/removing-noise-using-connected-components/
nlabel, labels, stats, centroids = cv2.connectedComponentsWithStats(img, connectivity=8)

labels_small = []
areas_small = []

# Find largest cluster (skip index 0, which is the background component):
max_size = np.max(stats[1:, cv2.CC_STAT_AREA])

thresh_size = max_size / 500  # Set the threshold to maximum cluster size divided by 500.

for i in range(1, nlabel):
    if stats[i, cv2.CC_STAT_AREA] < thresh_size:
        labels_small.append(i)
        areas_small.append(stats[i, cv2.CC_STAT_AREA])

mask = np.ones_like(labels, dtype=np.uint8)

for i in labels_small:
    I[labels == i] = 255

cv2.imshow('I', I)
cv2.waitKey(0)
Here is a MATLAB code sample (kept threshold = 200):
clear
I = imbinarize(rgb2gray(imread('בהצלחה.png')));
figure;imshow(I);
J = ~I;
%Clustering
CC = bwconncomp(J);
%Cover all small clusters with zewros.
for i = 1:CC.NumObjects
C = CC.PixelIdxList{i}; %Cluster coordinates.
%Fill small clusters with zeros.
if numel(C) < 200
J(C) = 0;
end
end
J = ~J;
figure;imshow(J);
Result:

Point detection and circle area selection

I'm working with a particular type of image. After applying the FFT to obtain the spectrum, I get the following picture:
So I want to select one of those "points" (called orders of the spectrum), in the following way:
I mean, "draw" a circle around it, select the pixels inside, and then center those pixels (without the "border circle"):
How can I perform this using OpenCV? Does any function exist for it?
EDIT: As per discussion below, to 'select' a circle, a mask can be used:
# Build mask
mask = np.zeros(image_array.shape, dtype=np.uint8)
cv2.circle(mask, max_loc, circle_radius, (255, 255, 255), -1, 8, 0)

# Apply mask (using bitwise & operator)
result_array = image_array & mask

# Crop/center result (assuming max_loc is of the form (x, y))
result_array = result_array[max_loc[1] - circle_radius:max_loc[1] + circle_radius,
                            max_loc[0] - circle_radius:max_loc[0] + circle_radius, :]
This leaves me with something like:
Another edit:
This might be useful for finding your peaks.
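For completeness, max_loc in the snippet above can come from cv2.minMaxLoc on a single-channel magnitude image. A sketch; the blur kernel size is an assumption to suppress single-pixel noise before picking the peak:

mag = cv2.cvtColor(image_array, cv2.COLOR_BGR2GRAY) if image_array.ndim == 3 else image_array
mag = cv2.GaussianBlur(mag, (9, 9), 0)   # smooth so isolated noisy pixels don't win
_, _, _, max_loc = cv2.minMaxLoc(mag)    # max_loc is the (x, y) of the brightest peak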
Not sure if that's what you asked, but if you just want to center around such a point, you can do it with subregions:
cv::Point center(yourCenterX_FromLeft, yourCenterY_fromTop);
int nWidth = yourDesiredWidthAfterCentering; // or 2* circle radius
int nHeight= yourDesiredHeightAfterCentering; // or 2* circle radius
// specify the subregion: top-left position and width/height
cv::Rect subImage = cv::Rect(center.x-nWidth/2, center.y-nHeight/2, nWidth, nHeight);
// extract the subregion out of the original image. remark that no data is copied but original data is referenced
cv::Mat subImageCentered = originalImage(subImage);
cv::imshow("subimage", subImageCentered);
Didn't test it, but that should be OK.
EDIT: sorry, it's C++, but subregions work similarly in Python (see the numpy slicing sketch below).
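In Python, the same subregion crop is just numpy slicing. A sketch with hypothetical names mirroring the C++ snippet above; note the (y, x) index order:

n_width = 2 * circle_radius
n_height = 2 * circle_radius
sub_image = original_image[center_y - n_height // 2 : center_y + n_height // 2,
                           center_x - n_width // 2 : center_x + n_width // 2]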
