OpenCV get centers of multiple objects - python

I'm trying to build a simple image analyzing tool that will find items that fit in a color range and then finds the centers of said objects.
As an example, after masking, I'm analyzing an image like this:
What I'm doing so far code-wise is rather simple:
import cv2
import numpy
bound = 30
inc = numpy.array([225,225,225])
lower = inc - bound
upper = inc + bound
img = cv2.imread("input.tiff")
cv2.imshow("Original", img)
mask = cv2.inRange(img, lower, upper)
cv2.imshow("Range", mask)
contours = cv2.findContours(mask, cv2.RETR_CCOMP, cv2.CHAIN_APPROX_TC89_L1)
print contours
This, however, gives me a countless number of contours. I'm somewhat at a loss while reading the corresponding manpage. Can I use moments to reasonably analyze the contours? Are contours even the right tool for this?
I found this question, that vaguely covers finding the center of one object, but how would I modify this approach when there are multiple items?
How do I find the centers of the objects in the image? For example, in the above sample image I'm looking to find three points (the centers of the rectangle and the two circles).

Try print len(contours). That will give you around the expected answer. The output you see is the full representation of the contours which could be thousands of points.
Try this code:
import cv2
import numpy
img = cv2.imread('inp.png', 0)
_, contours, _ = cv2.findContours(img.copy(), cv2.RETR_CCOMP, cv2.CHAIN_APPROX_TC89_L1)
print len(contours)
centres = []
for i in range(len(contours)):
moments = cv2.moments(contours[i])
centres.append((int(moments['m10']/moments['m00']), int(moments['m01']/moments['m00'])))
cv2.circle(img, centres[-1], 3, (0, 0, 0), -1)
print centres
cv2.imshow('image', img)
cv2.imwrite('output.png',img)
cv2.waitKey(0)
This gives me 4 centres:
[(474, 411), (96, 345), (58, 214), (396, 145)]
The obvious thing to do here is to also check for the area of the contours and if it is too small as a percentage of the image, don't count it as a real contour, it will just be noise. Just add something like this to the top of the for loop:
if cv2.contourArea(contours[i]) < 100:
continue
For the return values from findContours, I'm not sure what the first value is for as it is not present in the C++ version of OpenCV (which is what I use). The second value is obviously just the contours (an array of arrays) and the third value is a hierarchy holding information on the nesting of contours, which can be very handy.

You can use the opencv minEnclosingCircle() function on your contours to get the center of each object.
Check out this example which is in c++ but you can adapt the logic Example

Related

Extract most central area in a Binary Image

I am processing binary images, and was previously using this code to find the largest area in the binary image:
# Use the hue value to convert to binary
thresh = 20
thresh, thresh_img = cv2.threshold(h, thresh, 255, cv2.THRESH_BINARY)
cv2.imshow('thresh', thresh_img)
cv2.waitKey(0)
cv2.destroyAllWindows()
# Finding Contours
# Use a copy of the image since findContours alters the image
contours, _ = cv2.findContours(thresh_img.copy(), cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)
#Extract the largest area
c = max(contours, key=cv2.contourArea)
This code isn't really doing what I need it to do, now I think it would better to extract the most central area in the binary image.
Binary Image
Largest Image
This is currently what the code is extracting, but I am hoping to get the central circle in the first binary image extracted.
OpenCV comes with a point-polygon test function (for contours). It even gives a signed distance, if you ask for that.
I'll find the contour that is closest to the center of the picture. That may be a contour actually overlapping the center of the picture.
Timings, on my quadcore from 2012, give or take a millisecond:
findContours: ~1 millisecond
all pointPolygonTests and argmax: ~1 millisecond
mask = cv.imread("fkljm.png", cv.IMREAD_GRAYSCALE)
(height, width) = mask.shape
ret, mask = cv.threshold(mask, 128, 255, cv.THRESH_BINARY) # required because the sample picture isn't exactly clean
# get contours
contours, hierarchy = cv.findContours(mask, cv.RETR_LIST | cv.RETR_EXTERNAL, cv.CHAIN_APPROX_SIMPLE)
center = (np.array([width, height]) - 1) / 2
# find contour closest to center of picture
distances = [
cv.pointPolygonTest(contour, center, True) # looking for most positive (inside); negative is outside
for contour in contours
]
iclosest = np.argmax(distances)
print("closest contour is", iclosest, "with distance", distances[iclosest])
# draw closest contour
canvas = cv.cvtColor(mask, cv.COLOR_GRAY2BGR)
cv.drawContours(image=canvas, contours=[contours[iclosest]], contourIdx=-1, color=(0, 255, 0), thickness=5)
closest contour is 45 with distance 65.19202405202648
a cv.floodFill() on the center point can also quickly yield a labeling on that blob... assuming the mask is positive there. Otherwise, there needs to be search.
(cx, cy) = center.astype(int)
assert mask[cy,cx], "floodFill not applicable"
# trying cv.floodFill on the image center
mask2 = mask >> 1 # turns everything else gray
cv.floodFill(image=mask2, mask=None, seedPoint=center.astype(int), newVal=255)
# use (mask2 == 255) to identify that blob
This also takes less than a millisecond.
Some practically faster approaches might involve a pyramid scheme (low-res versions of the mask) to quickly identify areas of the picture that are candidates for an exact test (distance/intersection).
Test target pixel. Hit (positive)? Done.
Calculate low-res mask. Per block, if any pixel is positive, block is positive.
Find positive blocks, sort by distance, examine closer all those that are within sqrt(2) * blocksize of the best distance.
There are several ways you define "most central." I chose to define it as the region with the closest distance to the point you're searching for. If the point is inside the region, then that distance will be zero.
I also chose to do this with a pixel-based approach rather than a polygon-based approach, like you're doing with findContours().
Here's a step-by-step breakdown of what this code is doing.
Load the image, put it into grayscale, and threshold it. You're already doing these things.
Identify connected components of the image. Connected components are places where there are white pixels which are directly connected to other white pixels. This breaks up the image into regions.
Using np.argwhere(), convert a true/false mask into an array of coordinates.
For each coordinate, compute the Euclidean distance between that point and search_point.
Find the minimum within each region.
Across all regions, find the smallest distance.
import cv2
import numpy as np
img = cv2.imread('test197_img.png')
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
_, thresh_img = cv2.threshold(gray,127,255,cv2.THRESH_BINARY)
n_groups, comp_grouped = cv2.connectedComponents(thresh_img)
components = []
search_point = [600, 150]
for i in range(1, n_groups):
mask = (comp_grouped == i)
component_coords = np.argwhere(mask)[:, ::-1]
min_distance = np.sqrt(((component_coords - search_point) ** 2).sum(axis=1)).min()
components.append({
'mask': mask,
'min_distance': min_distance,
})
closest = min(components, key=lambda x: x['min_distance'])['mask']
Output:

How to crop the largest circle out of an image (shooting target) using OpenCV

as the title states, I'm trying to crop the largest circle out of an image. I'm using OpenCV in python. To be exact, it's a shooting target, which always has the same format, but the picture of it can be taken with any mobile device and in different lighting conditions (I will include some examples lower).
I'm completely new to image recognition, so I have been trying out many different ways of doing this, but couldn't figure out a universal solution, that would work on all of my target images.
Why I'm trying to do this:
My assignment is to calculate score of one or multiple shots on the given target image. I have tried color segmentation to find the shots, but since the shots can be on different backgrounds, this wouldn't work properly. So now I'm trying to see the difference between the empty shooting target image and the already shot on target image. Also, I need to be able to tell, which target it was shot on (there are two target types). So I'm trying to crop out only the target from image to get rid of the background interference and then continue with the shot identifications.
What I have tried so far:
1) Finding the largest circle with HoughCircles. My next step would be to somehow remove the outer part of that found circle. I have played with the configuration of HoughCircles method for quite some time, but always one of the example images wasn't highlighting the most outer circle correctly or wasn't highlighting any of the circles :/.
My final configuration looked something like this:
img = cv2.GaussianBlur(img, (3, 3), 0)
cv2.HoughCircles(img, cv2.HOUGH_GRADIENT, 2, 10000, param1=50, param2=100, minRadius=200, maxRadius=0)
It seemed like using HoughCircles wouldn't be the right way to do this, so I moved on to another possible solution I found on the internet.
2) Finding all the countours by filtering the 'black' color range in which the circles seem to be on the pictures and than finding the largest one. The problem with this solution seemed to be that sometimes the pictures had a shadow that destroyed the outer circle and therefore it seemed impossible to crop by it.
My code looked like this:
# black color boundaries [B, G, R]
lower = [0, 0, 0]
upper = [150, 150, 150]
# create NumPy arrays from the boundaries
lower = np.array(lower, dtype="uint8")
upper = np.array(upper, dtype="uint8")
# find the colors within the specified boundaries and apply the mask
mask = cv2.inRange(img, lower, upper)
output = cv2.bitwise_and(img, img, mask=mask)
ret, thresh = cv2.threshold(mask, 40, 255, 0)
contours, hierarchy = cv2.findContours(thresh, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)
if len(contours) != 0:
# draw in blue the contours that were founded
cv2.drawContours(output, contours, -1, 255, 3)
# find the biggest countour (c) by the area
c = max(contours, key=cv2.contourArea)
x, y, w, h = cv2.boundingRect(c)
After that, I would try to draw a circle by the largest found contour (c) and crop by it. But I have already seen that the drawn circles weren't complete (probably due to some shadow on the picture) and therefore this wouldn't work anyway.
After those failures, I have tried so many solutions from other questions on here, but none would work for my problem.
Example images:
Target example 1
Target example 2
Target to calc score 1
Target to calc score 2
To be completely honest with you, I'm really lost on how to go about this. I would appreciate any help, advice, anything.
There are two different types of target in your samples. You may want to process them separately or ask the user for the input, what kind of target it is. Basically, you want to know how large the black part of the target, does it cover 7-10 or 4-10.
Binarize your image. Build a histogram along X and Y -- you'll find the position of the black part of your target as (x_left, x_right, y_top, y_bottom). Once you know that, you can calculate the center ((top+bottom)/2, (left+right)/2). After that you can easily calculate the score for every pixel of the image, since you know the center, the black spot size and the number of different score areas within.

remove demarcation from text image - image processing

Hi I need to write a program that remove demarcation from gray scale image(image with text in it)
i read about thresholding and blurring but still i dont see how can i do it.
my image is an image of a hebrew text like that:
and i need to remove the demarcation(assuming that the demarcation is the smallest element in the image) the output need to be something like that
I want to write the code in python using opencv, what topics do i need to learn to be able to do that, and how?
thank you.
Edit:
I can use only cv2 functions
The symbols you want to remove are significantly smaller than all other shapes, you can use that to determine witch ones to remove.
First use threshold to convert the image to binary. Next, you can use findContours to detect the shapes and then contourArea to determine if the shape is larger than a threshold.
Finally you can can create a mask to remove the unwanted shapes, draw the larger symbols on a new image or draw the smaller symbols in white over the original symbols in the original image - making them disappear. I used that last technique in the code below.
Result:
Code:
import cv2
# load image as grayscale
img = cv2.imread('1MioS.png',0)
# convert to binary. Inverted, so you get white symbols on black background
_ , thres = cv2.threshold(img, 200, 255, cv2.THRESH_BINARY_INV)
# find contours in the thresholded image (this gives all symbols)
contours, hierarchy = cv2.findContours(thres, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
# loop through the contours, if the size of the contour is below a threshold,
# draw a white shape over it in the input image
for cnt in contours:
if cv2.contourArea(cnt) < 250:
cv2.drawContours(img,[cnt],0,(255),-1)
# display result
cv2.imshow('res', img)
cv2.waitKey(0)
cv2.destroyAllWindows()
Update
To find the largest contour, you can loop through them and keep track of the largest value:
maxArea = 0
for cnt in contours:
currArea = cv2.contourArea(cnt)
if currArea > maxArea:
maxArea = currArea
print(maxArea)
I also whipped up a little more complex version, that creates a sorted list of the indexes and sizes of the contours. Then it looks for the largest relative difference in size of all contours, so you know which contours are 'small' and 'large'. I do not know if this works for all letters / fonts.
# create a list of the indexes of the contours and their sizes
contour_sizes = []
for index,cnt in enumerate(contours):
contour_sizes.append([index,cv2.contourArea(cnt)])
# sort the list based on the contour size.
# this changes the order of the elements in the list
contour_sizes.sort(key=lambda x:x[1])
# loop through the list and determine the largest relative distance
indexOfMaxDifference = 0
currentMaxDifference = 0
for i in range(1,len(contour_sizes)):
sizeDifference = contour_sizes[i][1] / contour_sizes[i-1][1]
if sizeDifference > currentMaxDifference:
currentMaxDifference = sizeDifference
indexOfMaxDifference = i
# loop through the list again, ending (or starting) at the indexOfMaxDifference, to draw the contour
for i in range(0, indexOfMaxDifference):
cv2.drawContours(img,contours,contour_sizes[i][0] ,(255),-1)
To get the background color you can do use minMaxLoc. This returns the lowest color value and it's position of an image (also the max value, but you don't need that). If you apply it to the thresholded image - where the background is black -, it will return the location of a background pixel (big odds it will be (0,0) ). You can then look up this pixel in the original color image.
# get the location of a pixel with background color
min_val, _, min_loc, _ = cv2.minMaxLoc(thres)
# load color image
img_color = cv2.imread('1MioS.png')
# get bgr values of background
b,g,r = img_color[min_loc]
# convert from numpy object
background_color = (int(b),int(g),int(r))
and then to draw the contours
cv2.drawContours(img_color,contours,contour_sizes[i][0],background_color,-1)
and of course
cv2.imshow('res', img_color)
This looks like a problem for template matching since you have what looks like a known font and can easily understand what the characters and/or demarcations are. Check out https://opencv-python-tutroals.readthedocs.io/en/latest/py_tutorials/py_imgproc/py_template_matching/py_template_matching.html
Admittedly, the tutorial talks about finding the match; modification is up to you. In that case, you know the exact shape of the template itself, so using that information along with the location of the match, just overwrite the image data with the appropriate background color (based on the examples above, 255).
You can solve it by removing all the small clusters.
I found a Python solution (using OpenCV) here.
For supporting smaller fonts, I added the following heuristic:
"The largest size of the demarcation cluster is 1/500 of the largest letter cluster".
The heuristic can be refined, by statistical analysts (or improved by other heuristics, such as demarcation locations relative to the letters).
import numpy as np
import cv2
I = cv2.imread('Goodluck.png', cv2.IMREAD_GRAYSCALE)
J = 255 - I # Invert I
img = cv2.threshold(J, 127, 255, cv2.THRESH_BINARY)[1] # Convert to binary
# https://answers.opencv.org/question/194566/removing-noise-using-connected-components/
nlabel,labels,stats,centroids = cv2.connectedComponentsWithStats(img, connectivity=8)
labels_small = []
areas_small = []
# Find largest cluster:
max_size = np.max(stats[:, cv2.CC_STAT_AREA])
thresh_size = max_size / 500 # Set the threshold to maximum cluster size divided by 500.
for i in range(1, nlabel):
if stats[i, cv2.CC_STAT_AREA] < thresh_size:
labels_small.append(i)
areas_small.append(stats[i, cv2.CC_STAT_AREA])
mask = np.ones_like(labels, dtype=np.uint8)
for i in labels_small:
I[labels == i] = 255
cv2.imshow('I', I)
cv2.waitKey(0)
Here is a MATLAB code sample (kept threshold = 200):
clear
I = imbinarize(rgb2gray(imread('בהצלחה.png')));
figure;imshow(I);
J = ~I;
%Clustering
CC = bwconncomp(J);
%Cover all small clusters with zewros.
for i = 1:CC.NumObjects
C = CC.PixelIdxList{i}; %Cluster coordinates.
%Fill small clusters with zeros.
if numel(C) < 200
J(C) = 0;
end
end
J = ~J;
figure;imshow(J);
Result:

Finding the boundary points of a contour in opencv numpy

complete noob at open cv and numpy here. here is the image: here is my code:
import numpy as np
import cv2
im = cv2.imread('test.jpg')
imgray = cv2.cvtColor(im, cv2.COLOR_BGR2GRAY)
imgray = cv2.medianBlur(imgray, ksize=7)
ret, thresh = cv2.threshold(imgray, 0, 255, cv2.THRESH_BINARY+cv2.THRESH_OTSU)
_, contours, _ = cv2.findContours(thresh, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)
print ("number of countours detected before filtering %d -> "%len(contours))
new = np.zeros(imgray.shape)
new = cv2.drawContours(im,contours,len(contours)-1,(0,0,255),18)
cv2.namedWindow('Display',cv2.WINDOW_NORMAL)
cv2.imshow('Display',new)
cv2.waitKey()
mask = np.zeros(imgray.shape,np.uint8)
cv2.drawContours(mask,[contours[len(contours)-1]],0,255,-1)
pixelpoints = cv2.findNonZero(mask)
cv2.imwrite("masked_image.jpg",mask)
print(len(pixelpoints))
print("type of pixelpoints is %s" %type(pixelpoints))
the length of pixelpoints is nearly 2 million since it contains all the point covered by the contours. But i only require the bordering point of that contour. How do I do it? I have tried several methods from opencv documentation but always errors with tuples and sorting operations. please...help?
I only require the border points of the contour :(
Is this what you mean by border points of a contour?
The white lines you see are points that I have marked out in white against the blue drawn contours. There's a little spot at the bottom right because I think its most likely that your black background isn't really black and so when I did thresholding and a floodfill to get this,
there was a tiny white speck at the same spot. But if you play around with the parameters and do a more proper thresholding and floodfill it shouldn't be an issue.
In openCV's drawContours function, the cnts would contain lists of contours and each contour will contain an array of points. Each point is also of type numpy.ndarray. If you want to place all points of each contour in one place so it returns you a set of points of boundary points (like the white dots outline in the image above), you might want to append them all into a list. You can try this:
#rgb is brg instead
contoured=cv2.drawContours(black, cnts, -1, (255,0,0), 3)
#list of ALL points of ALL contours
all_pixels=[]
for i in range(0, len(cnts)):
for j in range(0,len(cnts[i])):
all_pixels.append(cnts[i][j])
When I tried to
print(len(all_pixels))
it returned me 2139 points.
Do this if you want to mark out the points for visualization purposes (e.g. like my white points):
#contouredC is a copy of the contoured image above
contouredC[x_val, y_val]=[255,255,255]
If you want less points, just use a step function when iterating through to draw the white points out. Something like this:
In python, for loops are slow so I think there's better ways of replacing the nested for loops with a np.where() function or something instead. Will update this if/when I figure it out. Also, this needs better thresholding and binarization techniques. Floodfill technique referenced from: Python 2.7: Area opening and closing binary image in Python not so accurate.
Hope it helps.

Merge MSER detected objetcs (OpenCV, Python)

I am working on this image as source:
Applying the next code...
import cv2
import numpy as np
mser = cv2.MSER_create()
img = cv2.imread('C:\\Users\\Link\\Desktop\\test2.png')
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
vis = img.copy()
regions, _ = mser.detectRegions(gray)
hulls = [cv2.convexHull(p.reshape(-1, 1, 2)) for p in regions]
cv2.polylines(vis, hulls, 1, (0, 255, 0))
mask = np.zeros((img.shape[0], img.shape[1], 1), dtype=np.uint8)
for contour in hulls:
cv2.drawContours(mask, [contour], -1, (255, 255, 255), -1)
text_only = cv2.bitwise_and(img, img, mask=mask)
cv2.imshow('img', vis)
cv2.waitKey(0)
cv2.imshow('img', mask)
cv2.waitKey(0)
cv2.imshow('img', text_only)
cv2.waitKey(0)
cv2.imwrite('C:\\Users\\Link\\Desktop\\test_o\\1.png', text_only)
...I am obtaining this as result (mask):
The question is this:
how to merge into a single object the number 5 in the number series (157661546) as long as it is divided in the mask image ?
Thanks
Have a look here, it seems like the exact answer.
Here instead there is my version of the above code fine tuned for text extraction (with masking too).
Below there is the original code from the previous article, "ported" to python 3, opencv 3, added mser and bounding boxes. The main difference with my version is how the grouping distance is defined: mine is text-oriented while the one below is a free geometrical distance.
import sys
import cv2
import numpy as np
def find_if_close(cnt1,cnt2):
row1,row2 = cnt1.shape[0],cnt2.shape[0]
for i in range(row1):
for j in range(row2):
dist = np.linalg.norm(cnt1[i]-cnt2[j])
if abs(dist) < 25: # <-- threshold
return True
elif i==row1-1 and j==row2-1:
return False
img = cv2.imread(sys.argv[1])
gray = cv2.cvtColor(img,cv2.COLOR_BGR2GRAY)
cv2.imshow('input', img)
ret,thresh = cv2.threshold(gray,127,255,0)
mser=False
if mser:
mser = cv2.MSER_create()
regions = mser.detectRegions(thresh)
hulls = [cv2.convexHull(p.reshape(-1, 1, 2)) for p in regions[0]]
contours = hulls
else:
thresh = cv2.bitwise_not(thresh) # wants black bg
im2,contours,hier = cv2.findContours(thresh,cv2.RETR_EXTERNAL,2)
cv2.drawContours(img, contours, -1, (0,0,255), 1)
cv2.imshow('base contours', img)
LENGTH = len(contours)
status = np.zeros((LENGTH,1))
print("Elements:", len(contours))
for i,cnt1 in enumerate(contours):
x = i
if i != LENGTH-1:
for j,cnt2 in enumerate(contours[i+1:]):
x = x+1
dist = find_if_close(cnt1,cnt2)
if dist == True:
val = min(status[i],status[x])
status[x] = status[i] = val
else:
if status[x]==status[i]:
status[x] = i+1
unified = []
maximum = int(status.max())+1
for i in range(maximum):
pos = np.where(status==i)[0]
if pos.size != 0:
cont = np.vstack(contours[i] for i in pos)
hull = cv2.convexHull(cont)
unified.append(hull)
cv2.drawContours(img,contours,-1,(0,0,255),1)
cv2.drawContours(img,unified,-1,(0,255,0),2)
#cv2.drawContours(thresh,unified,-1,255,-1)
for c in unified:
(x,y,w,h) = cv2.boundingRect(c)
cv2.rectangle(img, (x,y), (x+w,y+h), (255, 0, 0), 2)
cv2.imshow('result', img)
cv2.waitKey(0)
cv2.destroyAllWindows()
Sample output (the yellow blob is below the binary threshold conversion so it's ignored). Red: original contours, green: unified ones, blue: bounding boxes.
Probably there is no need to use MSER as a simple findContours may work fine.
------------------------
Starting from here there is my old answer, before I found the above code. I'm leaving it anyway as it describe a couple of different approaches that may be easier/more appropriate for some scenarios.
A quick and dirty trick is to add a small gaussian blur and a high threshold before the MSER (or some dilute/erode if you prefer fancy things). In practice you just make the text bolder so that it fills small gaps. Obviously you can later discard this version and crop from the original one.
Otherwise, if your text is in lines, you may try to detect the average line center (make an histogram of Y coordinates and find the peaks for example). Then, for each line, look for fragments with a close average X. Quite fragile if text is noisy/complex.
If you do not need to split each letter, getting the bounding box for the whole word, may be easier: just split in groups based on a maximum horizontal distance between fragments (using the leftmost/rightmost points of the contour). Then use the leftmost and rightmost boxes within each group to find the whole bounding box. For multiline text first group by centroids Y coordinate.
Implementation notes:
Opencv allows you to create histograms but you probably can get away with something like this (worked for me on a similar task):
def histogram(vals, th=4, bins=400):
hist = np.zeros(bins)
for y_center in vals:
bucket = int(round(y_center / 2.)) <-- change this "2."
hist[bucket-1] += 1
print("hist: ", hist)
hist = np.where(hist > th, hist, 0)
return hist
Here my histogram is just an array with 400 buckets (my image was 800px high so each bucket catches two pixels, that is where the "2." comes from). Vals are the Y coordinates of the centroids of each fragment (you may want to ignore very small elements when you build this list). The th threshold is there just to remove some noise. You should get something like this:
0,0,0,5,22,0,0,0,0,43,7,0,0,0
This list describes, moving top to bottom, how many fragments are at each location.
Now I ran another pass to merge the peaks into a single value (just scan the array and sum while it is non-zero and reset the count on first zero) getting something like this {y:count}:
{9:27, 20:50}
Now I know I have two text rows at y=9 and y=20. Now, or before, you assign each fragment to on line (with again an 8px threshold in my case). Now you can process each line on its own, finding "words". BTW, I have your identical problem with broken letters that's why I came here looking for MSER :). Notice that if you find the whole bounding box for the word this problem happens only on the first/last letters: the other broken letters just falls inside the word box anyway.
Here is a reference for the erode/dilate thing, but gaussian blur/th worked for me.
UPDATE: I've noticed that there is something wrong in this line:
regions = mser.detectRegions(thresh)
I pass in the already thresholded image(!?). This is not relevant for the aggregation part but keep in mind that the mser part is not being used as expected.

Categories

Resources