Removing small annotations from image

I am trying to remove small images from graphs using Python. As an example, I attach a graph with some '+' and '-' annotating it. I don't want them there, but don't want to manually remove them as there are quite a few to go through. Any easy way to detect and remove them?

I'll give you a solution using blob analysis since I had it almost ready at hand, but would ask you to do the reading and explanation yourself, since you have not spent too much time on your own code. Maybe it helps anyway.
import numpy as np
import cv2
# Read the image as a single-channel grayscale image
imgray = cv2.imread('image.png', cv2.IMREAD_GRAYSCALE)
#### Blob analysis
# SimpleBlobDetector will find black blobs on white surface
ret,imthresh = cv2.threshold(imgray,160, 255,type=cv2.THRESH_BINARY)
# Remove small breaks in lines
kernel = np.ones((3,3),np.uint8)
imthresh=cv2.erode(imthresh,kernel, iterations=1)
# Setup SimpleBlobDetector parameters.
params = cv2.SimpleBlobDetector_Params()
# Filter by Area.
params.filterByArea = True
params.minArea = 0
params.maxArea =350
# Don't filter by Circularity
params.filterByCircularity = False
# Don't filter by Convexity
params.filterByConvexity = False
# Don't filter by Inertia
params.filterByInertia = False
# Create a detector with the parameters
detector = cv2.SimpleBlobDetector_create(params)
# Detect blobs.
keypoints = detector.detect(imthresh)
# Draw detected blobs as red circles.
# cv2.DRAW_MATCHES_FLAGS_DRAW_RICH_KEYPOINTS ensures
# the size of the circle corresponds to the size of blob
im_with_keypoints = cv2.drawKeypoints(imthresh, keypoints, np.array([]), (0,0,255), cv2.DRAW_MATCHES_FLAGS_DRAW_RICH_KEYPOINTS)
# Show blobs
cv2.imshow("Keypoints", im_with_keypoints)
cv2.imshow('threshold',imthresh)
cv2.waitKey(0)
cv2.destroyAllWindows()

Related

Bounding box detection for characters / digits

I have images, which look like the following:
I want to find the bounding boxes for the 8 digits. My first try was to use cv2 with the following code:
import cv2
import matplotlib.pyplot as plt
import cvlib as cv
from cvlib.object_detection import draw_bbox
im = cv2.imread('31197402.png')
bbox, label, conf = cv.detect_common_objects(im)
output_image = draw_bbox(im, bbox, label, conf)
plt.imshow(output_image)
plt.show()
Unfortunately that doesn't work. Does anyone have an idea?

Part of the problem is the input image, which is very poor in quality: there's hardly any contrast between the characters and the background. In addition, cvlib's detect_common_objects is a general-purpose object detector (it runs a pretrained YOLO model for everyday objects such as people and cars), not a character or blob detector, so it is unlikely to return boxes for digits. Let's try to solve this using purely OpenCV.
I propose the following steps:
1. Apply adaptive threshold to get a reasonably good binary mask.
2. Clean the binary mask from blob noise using an area filter.
3. Improve the quality of the binary image using morphology.
4. Get the outer contours of each character and fit a bounding rectangle to each character blob.
5. Crop each character using the previously calculated bounding rectangle.
Let’s see the code:
# importing cv2 & numpy:
import numpy as np
import cv2
# Set image path
path = "C:/opencvImages/"
fileName = "mrrm9.png"
# Read input image:
inputImage = cv2.imread(path+fileName)
inputCopy = inputImage.copy()
# Convert BGR to grayscale:
grayscaleImage = cv2.cvtColor(inputImage, cv2.COLOR_BGR2GRAY)
From here there's not much to discuss, just reading the BGR image and converting it to grayscale. Now, let's apply an adaptive threshold using the Gaussian method. This is the tricky part, as the parameters are adjusted manually depending on the quality of the input. For every pixel, the method computes a local threshold from the Gaussian-weighted mean of the surrounding windowSize x windowSize neighborhood, so foreground and background are separated locally instead of with one global value. An additional constant, windowConstant, is subtracted from that local mean to fine-tune the output (a negative value, as used here, raises the threshold):
# Set the adaptive thresholding (Gaussian) parameters:
windowSize = 31
windowConstant = -1
# Apply the threshold:
binaryImage = cv2.adaptiveThreshold(grayscaleImage, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C, cv2.THRESH_BINARY, windowSize, windowConstant)
You get this nice binary image:
Now, as you can see, the image has some blob noise. Let's apply an area filter to get rid of the noise. The noise is smaller than the target blobs of interest, so we can easily filter it out based on area, like this:
# Perform an area filter on the binary blobs:
componentsNumber, labeledImage, componentStats, componentCentroids = \
    cv2.connectedComponentsWithStats(binaryImage, connectivity=4)
# Set the minimum pixels for the area filter:
minArea = 20
# Get the indices/labels of the remaining components based on the area stat
# (skip the background component at index 0)
remainingComponentLabels = [i for i in range(1, componentsNumber) if componentStats[i][4] >= minArea]
# Filter the labeled pixels based on the remaining labels,
# assign pixel intensity to 255 (uint8) for the remaining pixels
filteredImage = np.where(np.isin(labeledImage, remainingComponentLabels) == True, 255, 0).astype('uint8')
This is the filtered image:
We can improve the quality of this image with some morphology. Some of the characters seem to be broken (check out the first 3: it is broken into two separate blobs). We can join them by applying a closing operation:
# Set kernel (structuring element) size:
kernelSize = 3
# Set operation iterations:
opIterations = 1
# Get the structuring element:
maxKernel = cv2.getStructuringElement(cv2.MORPH_RECT, (kernelSize, kernelSize))
# Perform closing:
closingImage = cv2.morphologyEx(filteredImage, cv2.MORPH_CLOSE, maxKernel, None, None, opIterations, cv2.BORDER_REFLECT101)
This is the "closed" image:
Now, you want to get the bounding boxes for each character. Let’s detect the outer contour of each blob and fit a nice rectangle around it:
# Get each bounding box
# Find the big contours/blobs on the filtered image:
contours, hierarchy = cv2.findContours(closingImage, cv2.RETR_CCOMP, cv2.CHAIN_APPROX_SIMPLE)
contours_poly = [None] * len(contours)
# The Bounding Rectangles will be stored here:
boundRect = []
# Alright, just look for the outer bounding boxes:
for i, c in enumerate(contours):
    if hierarchy[0][i][3] == -1:
        contours_poly[i] = cv2.approxPolyDP(c, 3, True)
        boundRect.append(cv2.boundingRect(contours_poly[i]))
# Draw the bounding boxes on the (copied) input image:
for i in range(len(boundRect)):
    color = (0, 255, 0)
    cv2.rectangle(inputCopy, (int(boundRect[i][0]), int(boundRect[i][1])), \
                  (int(boundRect[i][0] + boundRect[i][2]), int(boundRect[i][1] + boundRect[i][3])), color, 2)
The last for loop is pretty much optional. It fetches each bounding rectangle from the list and draws it on the input image, so you can see each individual rectangle, like this:
Let's visualize that on the binary image:
Additionally, if you want to crop each character using the bounding boxes we just got, you do it like this:
# Crop the characters:
for i in range(len(boundRect)):
    # Get the roi for each bounding rectangle:
    x, y, w, h = boundRect[i]
    # Crop the roi:
    croppedImg = closingImage[y:y + h, x:x + w]
    cv2.imshow("Cropped Character: "+str(i), croppedImg)
    cv2.waitKey(0)
This is how you can get the individual bounding boxes. Now, maybe you are trying to pass these images to an OCR. I tried passing the filtered binary image (after the closing operation) to pyocr (That’s the OCR I’m using) and I get this as output string: 31197402
The code I used to get the OCR of the closed image is this:
# Set the OCR libraries:
from PIL import Image
import pyocr
import pyocr.builders
# Set pyocr tools:
tools = pyocr.get_available_tools()
# The tools are returned in the recommended order of usage
tool = tools[0]
# Set OCR language:
langs = tool.get_available_languages()
lang = langs[0]
# Get string from image:
# Note: "closingImage.png" must already exist on disk; the code above
# does not write it
txt = tool.image_to_string(
    Image.open(path + "closingImage.png"),
    lang=lang,
    builder=pyocr.builders.TextBuilder()
)
print("Text is:"+txt)
Be aware that the OCR receives black characters on white background, so you must invert the image first.

Looking for suggestions to replace openCV's blob detector

I have some Python code that cleans up an image and then uses a blob detector to identify blobs in the cleaned-up image. This works efficiently for small pictures, for example 549x549 pixels, but it gets very slow on large pictures such as 14790x13856 pixels. Does anyone have suggestions on how to make the OpenCV blob detector faster, or know of a replacement library that is faster?
Edit: here is an example of a picture. The real pictures look like this one, but are about 667x bigger.
I already have the code written, and on a small scale there are no errors:
params = cv2.SimpleBlobDetector_Params()
# change thresholds
params.minThreshold = 0
params.maxThreshold = 255
# Filter by Area.
params.filterByArea = True
params.minArea = 0
params.maxArea = 35
# Filter by Circularity
params.filterByCircularity = True
params.minCircularity = 0
# Filter by Convexity
params.filterByConvexity = True
params.minConvexity = 0
# Filter by Inertia
params.filterByInertia = True
params.minInertiaRatio = 0
# Create a detector with the parameters
detector = cv2.SimpleBlobDetector_create(params)
keypoints = detector.detect(third)
ImgPoints = cv2.drawKeypoints(third, keypoints, np.array([]), (0, 0, 255), cv2.DRAW_MATCHES_FLAGS_DRAW_RICH_KEYPOINTS)
The result is a picture (tiff) that I save and there are no error messages.

How to calculate how circular the blob is using Open CV - SimpleBlobDetector()?

I have two images of moles. One is relatively round but the other isn't. I want to find out how circular each mole is, say with -1 being not at all circular, 0 being elliptical, and 1 being circular. I first converted the raw image into binary and then tried using the code below. The code draws a circle around the circular mole but doesn't give information on inertia. The non-circular mole is not even detected as a blob. Am I misunderstanding this concept? How should I go about solving this problem?
# Standard imports
import cv2
import numpy as np
# Setup SimpleBlobDetector parameters.
params = cv2.SimpleBlobDetector_Params()
# Change thresholds
params.minThreshold = 10
params.maxThreshold = 200
# Filter by Area.
params.filterByArea = True
params.minArea = 1500
# Filter by Circularity
params.filterByCircularity = True
params.minCircularity = 0.1
# Filter by Convexity
params.filterByConvexity = True
params.minConvexity = 0.87
# Filter by Inertia
params.filterByInertia = True
params.minInertiaRatio = 0.01
# Create a detector with the parameters
detector = cv2.SimpleBlobDetector_create(params)
# Read image
im = cv2.imread("mole_torezo.png", cv2.IMREAD_GRAYSCALE)
# Detect blobs.
keypoints = detector.detect(im)
# Draw detected blobs as red circles.
# cv2.DRAW_MATCHES_FLAGS_DRAW_RICH_KEYPOINTS ensures the size of the circle corresponds to the size of blob
im_with_keypoints = cv2.drawKeypoints(im, keypoints, np.array([]), (0,0,255), cv2.DRAW_MATCHES_FLAGS_DRAW_RICH_KEYPOINTS)
# Show keypoints
cv2.imshow("Keypoints", im_with_keypoints)
cv2.waitKey(0)
Thanks to Blob Detection Using OpenCV ( Python, C++ ) for the code

Using findCirclesGrid() in large images

I am using OpenCV 3 in Python 2.7 to calibrate different cameras. I use the findCirclesGrid() function, which successfully finds a 4 by 11 circle pattern in a 1 megapixel image. However, when I try to detect the pattern up close in an image with a higher resolution, the function fails. When the object is farther away in the image, it is still detected. I use the function as follows:
ret, corners = cv2.findCirclesGrid(image, (4, 11), flags=cv2.CALIB_CB_ASYMMETRIC_GRID)
With larger images, it returns False, None. It seems that the function can't handle circles whose area is too large. I tried adding cv2.CALIB_CB_CLUSTERING, but this doesn't seem to make a difference. Also, it seems that in C++ the user can specify which blob detector to use, but not in Python. Details: http://docs.opencv.org/2.4/modules/calib3d/doc/camera_calibration_and_3d_reconstruction.html#findcirclesgrid
Can I increase the maximum detection size somehow or make the function detect the pattern in another way?
Edit: I found out how to edit parameters of the blobDetector by using
params = cv2.SimpleBlobDetector_Params()
params.maxArea = 100000
detector = cv2.SimpleBlobDetector_create(params)
ret, corners = cv2.findCirclesGrid(self.gray, (horsq, versq), None,
                                   flags=cv2.CALIB_CB_ASYMMETRIC_GRID, blobDetector=detector)
Still the same issue, though.
Edit2:
Now adding cv2.CALIB_CB_CLUSTERING resolves the issue!

The main thing you probably need to do is tweak the minArea and maxArea of the blob detector.
Create a blob detector with params (don't use the default parameters), and adjust the minArea and maxArea that the detector will accept. You can first display all the found blobs before you pass the detector you created into the findCirclesGrid function.
Python Sample code
params = cv2.SimpleBlobDetector_Params()
# Setup SimpleBlobDetector parameters.
print('params')
print(params)
print(type(params))
# Filter by Area.
params.filterByArea = True
params.minArea = 200
params.maxArea = 18000
params.minDistBetweenBlobs = 20
params.filterByColor = True
params.filterByConvexity = False
# tweak these as you see fit
# Filter by Circularity
# params.filterByCircularity = False
params.minCircularity = 0.2
# # # Filter by Convexity
# params.filterByConvexity = True
# params.minConvexity = 0.87
# Filter by Inertia
params.filterByInertia = True
# params.filterByInertia = False
params.minInertiaRatio = 0.01
detector = cv2.SimpleBlobDetector_create(params)
# Detect blobs.
keypoints = detector.detect(gray)
fig = plt.figure()
im_with_keypoints = cv2.drawKeypoints(gray, keypoints, np.array([]), (0, 0, 255), cv2.DRAW_MATCHES_FLAGS_DRAW_RICH_KEYPOINTS)
plt.imshow(cv2.cvtColor(im_with_keypoints, cv2.COLOR_BGR2RGB),
           interpolation='bicubic')
titlestr = '%s found %d keypoints' % (fname, len(keypoints))
plt.title(titlestr)
# Recent matplotlib releases expose the window title on the canvas manager
fig.canvas.manager.set_window_title(titlestr)
ret, corners = cv2.findCirclesGrid(gray, (cbcol, cbrow), flags=(cv2.CALIB_CB_ASYMMETRIC_GRID + cv2.CALIB_CB_CLUSTERING ), blobDetector=detector )

Blob ID tagging for OpenCV python

I am currently writing Python code for counting people, including their direction of movement. I have used the 'moments' method to gather the coordinates, and when a blob crosses a certain line the counter increments. But this method is proving to be very inefficient. My questions regarding blob detection are:
Is there any blob detection technique for Python OpenCV? Or could it be done with cv2.findContours?
I'm working on a Raspberry Pi, so could anyone suggest how to get a blob library on Debian Linux?
Even if there is one, how could I get a unique ID for each blob? Is there any algorithm that tags blobs with unique IDs?
If there's any better method to do this, kindly suggest an algorithm.
Thanks in advance.

For blob detection you can use SimpleBlobDetector from OpenCV:
# Setup SimpleBlobDetector parameters.
params = cv2.SimpleBlobDetector_Params()
# Filter by Area.
params.filterByArea = True
params.minArea = 100
params.maxArea =100000
# Don't filter by Circularity
params.filterByCircularity = False
# Don't filter by Convexity
params.filterByConvexity = False
# Don't filter by Inertia
params.filterByInertia = False
# Create a detector with the parameters
detector = cv2.SimpleBlobDetector_create(params)
# Detect blobs.
keypoints = detector.detect(imthresh)
# Draw detected blobs as red circles.
# cv2.DRAW_MATCHES_FLAGS_DRAW_RICH_KEYPOINTS ensures
# the size of the circle corresponds to the size of blob
im_with_keypoints = cv2.drawKeypoints(imthresh, keypoints, np.array([]), (0,0,255), cv2.DRAW_MATCHES_FLAGS_DRAW_RICH_KEYPOINTS)
For labelling, using scipy.ndimage.label is usually a better idea:
from scipy import ndimage
# mask is a binary image in which non-zero pixels belong to blobs
label_im, nb_labels = ndimage.label(mask)
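As a rough sketch of what the labelling gives you, the example below labels a small synthetic mask and reads one centroid per label. The mask contents and the variable names ids and centroids are assumptions for illustration:

```python
import numpy as np
from scipy import ndimage

# Binary mask with three separate blobs (non-zero pixels are foreground).
mask = np.zeros((60, 60), dtype=np.uint8)
mask[5:15, 5:15] = 1
mask[30:40, 10:25] = 1
mask[45:55, 40:55] = 1

# label_im holds 0 for background and 1..nb_labels for the blobs.
label_im, nb_labels = ndimage.label(mask)

# One (row, column) centroid per label, usable as the blob position.
ids = list(range(1, nb_labels + 1))
centroids = ndimage.center_of_mass(mask, label_im, ids)
for blob_id, (row, col) in zip(ids, centroids):
    print(f"blob {blob_id}: centre at row {row:.1f}, column {col:.1f}")
```

Note that these labels are only unique within one image. To keep the same ID for a person across video frames, the centroids of consecutive frames still have to be matched, for example by nearest distance, which this sketch does not attempt.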
