We print 500 bubble surveys, get them back, and scan them in a giant batch, giving us 500 PNG images.
Each image has slight variations in alignment but identical size and resolution. We need to register the images so they're all perfectly aligned (with the next step being semi-automated scoring of the bubbles).
If these were 3D MRI images, I could accomplish this with a single command-line utility, but I'm not seeing any such tool for aligning scanned text documents.
I've played around with OpenCV as described in Image Alignment (Feature Based) using OpenCV, and it produces dynamite results when it works, but it often fails spectacularly. That approach is designed to find documents hidden within natural scenes, a much harder problem than our case, where the images are merely rotated and translated in 2D, not 3D.
I've also explored imreg_dft, which runs consistently but does a very poor job -- presumably the DFT approach works better on photographs than on text documents.
Does a solution for image registration of scanned forms already exist? If not, what's the correct approach: OpenCV, imreg_dft, or something else?
Similar prior question: How to find blank field on scanned document image
What you can try is using the red outline of the answer boxes to create a mask from which you can select the outline. I created a sample below. You can also remove the blue letters by creating a mask for the letters, inverting it, and then applying it as a mask. I didn't do that, because the image from the publisher is low-res and it caused issues. I expect your scans to work better.
When you have the contours of the boxes, you can transform/compare them individually (as the boxes have different sizes), or you can use the biggest contour to create a transform for the entire document.
You can then use minAreaRect to find the corner points of the contours, and threshold the contourArea to exclude noise / non-answer areas.
import cv2
import numpy as np

# load image
img = cv2.imread('Untitled.png')
# convert to HSV colorspace
hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)
# define the range of the image background in HSV
lower_val = np.array([0, 0, 0])
upper_val = np.array([179, 255, 237])
# threshold the HSV image
mask = cv2.inRange(hsv, lower_val, upper_val)
# find external contours in the mask
contours, hier = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
# draw the contours on the original image
for cnt in contours:
    cv2.drawContours(img, [cnt], 0, (0, 255, 0), 3)
# display the result
cv2.imshow('Result', img)
cv2.waitKey(0)
cv2.destroyAllWindows()
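Building on the code above, here is a minimal sketch of the whole-document alignment step described earlier: take the biggest contour in each scan, get its minAreaRect, and rotate/translate the page so the rect matches a reference scan. The helper names are mine, and minAreaRect angle conventions differ between OpenCV versions, so verify the rotation sign on your own data.

import cv2

def biggest_rect(mask):
    # largest external contour as a rotated rect: ((cx, cy), (w, h), angle)
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    return cv2.minAreaRect(max(contours, key=cv2.contourArea))

def align_to_reference(img, mask, ref_rect):
    (cx, cy), _, angle = biggest_rect(mask)
    (rx, ry), _, ref_angle = ref_rect
    # rotate about the rect centre so the angles match (check the angle sign
    # for your OpenCV version), then shift the centre onto the reference centre
    M = cv2.getRotationMatrix2D((cx, cy), ref_angle - angle, 1.0)
    M[0, 2] += rx - cx
    M[1, 2] += ry - cy
    h, w = img.shape[:2]
    return cv2.warpAffine(img, M, (w, h))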
I am very new to OpenCV (and to Stack Overflow). I'm writing a program with OpenCV which takes a picture with an object (e.g. a pen, rice, or a phone placed on paper) and calculates what percentage of the picture the object takes up.
The problem I'm facing is that when I threshold the image (I tried adaptive and Otsu), the photo has a bit of shadow around the edges:
Original image
Resulting picture
And here's my code:
import cv2

img = cv2.imread("image.png")
# grayscale version (not used below)
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
# threshold the blue channel with Otsu's method
b, g, r = cv2.split(img)
th, thresh = cv2.threshold(b, 100, 255, cv2.THRESH_BINARY | cv2.THRESH_OTSU)
cv2.imwrite("image_bl_wh.png", thresh)
I tried blurring and morphology, but couldn't fix it.
How can I make my program count those black parts around the picture as background, and is there a better and easier way to do this?
P.S. Sorry for my English grammar mistakes.
This is not a programmatic solution, but when you do automatic visual inspection it is the first thing you should try: improve your set-up. The image is simply darker around the edges, so increasing the brightness when recording the images should help.
If that's not an option, you could consider having an empty image for comparison. What you are trying to do is background segmentation, and there are better ways than simple color thresholding; however, they usually require at least one image of the background, or multiple images.
If you want a software-only solution, you should try an edge detector combined with morphological operators.
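As an illustration of the empty-image comparison idea, here is a minimal sketch assuming you can photograph the bare paper under the same lighting; the filenames, the threshold of 30, and the kernel size are placeholders to tune, not part of this answer.

import cv2

img = cv2.imread("image.png")
background = cv2.imread("background.png")  # assumed shot of the empty paper

# pixelwise difference between the scene and the empty background
diff = cv2.absdiff(img, background)
gray = cv2.cvtColor(diff, cv2.COLOR_BGR2GRAY)

# anything that changed noticeably belongs to the object
_, mask = cv2.threshold(gray, 30, 255, cv2.THRESH_BINARY)

# morphological opening to remove speckle
kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (5, 5))
mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, kernel)

# percentage of the picture the object occupies
percent = 100.0 * cv2.countNonZero(mask) / mask.size
print("object covers %.1f%% of the image" % percent)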
So I have an image processing task at hand which requires me to crop a certain portion of an image. I have no prior experience with OpenCV, and I would like to know what approach I should take.
Sample Input Image:
Sample Output Image:
What I initially thought was to convert the image to a bitmap and remove pixels that are below or above a certain threshold. Since I am free to use OpenCV and Python, I would like to know of any automated algorithm that does this, and if there isn't one, what the right approach for such a problem would be. Thank you.
Applying a simple threshold should get rid of the background, provided it's always darker than the foreground. If you use the Otsu thresholding algorithm, it should choose a good partition for you. Using your example as input, this gives:
Next you could compute the bounding box to select the region of the foreground. Provided the background is distinct enough and there are no holes, this gives you the resulting rect:
[619 x 96 from (0, 113)]
You can then use this rect to crop the original, to produce the desired result:
I wrote the code to solve this in C++. A rough translation into Python would look something like this:
import sys
import cv2 as cv

img = cv.imread(sys.argv[1])
grayscale = cv.cvtColor(img, cv.COLOR_BGR2GRAY)
# Otsu picks the threshold automatically; threshold() returns (value, image)
_, thresholded = cv.threshold(grayscale, 0, 255, cv.THRESH_OTSU)
cv.imwrite("otsu.png", thresholded)

# bounding box of all non-zero (foreground) pixels
bbox = cv.boundingRect(thresholded)
x, y, w, h = bbox
print(bbox)

# crop the original to the foreground region
foreground = img[y:y+h, x:x+w]
cv.imwrite("foreground.png", foreground)
This method is fast and simple. If you find you have some white holes in your background which enlarge the bounding box, try applying an erosion operator.
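For example, a small erosion pass before computing the bounding box might look like this (the 3x3 kernel and single iteration are guesses to tune, not part of the original code):

kernel = cv.getStructuringElement(cv.MORPH_RECT, (3, 3))
thresholded = cv.erode(thresholded, kernel, iterations=1)
bbox = cv.boundingRect(thresholded)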
FWIW I very much doubt you would get results like this as predictably or reliably using NNs.
The thresholding seems like a good approach. A neural network would be overkill, and you probably don't have enough data to train one (:D). Anyway, check out this link.
You should be able to do something like:
import numpy as np
import cv2 as cv
from matplotlib import pyplot as plt

img = cv.imread('img.png')
gray = cv.cvtColor(img, cv.COLOR_BGR2GRAY)
# inverted binary threshold combined with Otsu's method
ret, thresh = cv.threshold(gray, 0, 255, cv.THRESH_BINARY_INV + cv.THRESH_OTSU)
A NN would be overkill! You can do edge detection and take the extreme horizontal lines as boundaries, then crop only the ROI within these two lines.
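As a rough sketch of that idea, assuming the foreground is bounded above and below by strong horizontal edges (the filename and Canny thresholds are placeholders):

import cv2
import numpy as np

img = cv2.imread('img.png')
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
edges = cv2.Canny(gray, 50, 150)

# rows that contain at least one edge pixel
rows = np.where(edges.sum(axis=1) > 0)[0]
top, bottom = rows.min(), rows.max()

# crop the ROI between the two extreme horizontal boundaries
roi = img[top:bottom + 1, :]
cv2.imwrite('roi.png', roi)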
I need to identify the pixels where there is a change in colour. I googled edge detection and line detection techniques, but I am not sure how or in what way these can be applied.
Here are my very naive attempts:
Applying Canny edge detection:
edges = cv2.Canny(img, 0, 10)
with various parameters, but it didn't work.
Applying the Hough line transform to detect lines in the document.
The intent behind this exercise is that I have an ill-formed table of values in a PDF document with the background I have attached. If I am able to identify the row boundaries using colour matching, as in this question, my problem will be reduced to identifying the columns in the data.
Welcome to image processing. What you're trying to do here is basically find the places where the change in color between neighboring pixels is large, i.e. where the derivative of pixel intensities in the y direction is substantial. In signal processing, these are called high frequencies. The most common detector for high frequencies in images is the Canny edge detector, and you can find a very nice tutorial for it on the OpenCV website.
The algorithm is very easy to implement and requires just a few simple steps:
import cv2

# load the image
img = cv2.imread("sample.png")
# convert to grayscale
img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
# resize for visualization purposes
img = cv2.resize(img, None, fx=0.4, fy=0.4)
# find edges with Canny
edges = cv2.Canny(img, 10, 20, apertureSize=3)
# show and save the result
cv2.imshow("edges", edges)
cv2.waitKey(0)
cv2.imwrite("result.png", edges)
Since your case is very straightforward, you don't have to worry about the parameters in the Canny() function call. But if you choose to find out what they do, I recommend checking out how to implement a trackbar and using it for experimenting. The result:
Good luck.
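For the trackbar experiment mentioned above, a minimal harness might look like this (the window and trackbar names are mine, not from the tutorial):

import cv2

img = cv2.imread("sample.png", cv2.IMREAD_GRAYSCALE)
cv2.namedWindow("edges")

def update(_=0):
    # re-run Canny with the current slider values
    lo = cv2.getTrackbarPos("low", "edges")
    hi = cv2.getTrackbarPos("high", "edges")
    cv2.imshow("edges", cv2.Canny(img, lo, hi))

cv2.createTrackbar("low", "edges", 10, 255, update)
cv2.createTrackbar("high", "edges", 20, 255, update)
update()
cv2.waitKey(0)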
I want to convert the picture into a black and white image, where the seeds are accurately represented by white and the background by black. I would like to have it in Python OpenCV code. Please help me out.
I got a good result for the above picture using the code given below. Now I have another picture for which thresholding doesn't seem to work. How can I tackle this problem? The output I got is shown in the following picture:
Also, there are some dents in the seeds, which the program takes as part of the seed boundary, giving poor results like in the picture below. How can I make the program ignore the dents? Is masking the seeds a good option in this case?
I converted the image from BGR color space to HSV color space.
Then I extracted the hue channel:
Then I applied a threshold to it:
Note:
Whenever you face difficulty in certain areas, try working in a different color space, the HSV color space being the most prominent.
UPDATE:
Here is the code:
import cv2
import numpy as np

filename = 'seed.jpg'
img = cv2.imread(filename)  #---Reading image file---
hsv_img = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)  #---Converting BGR image to HSV---
hue, saturation, value = cv2.split(hsv_img)  #---Splitting HSV image into 3 channels---
blur = cv2.GaussianBlur(hue, (3, 3), 0)  #---Blur to smooth the edges---
ret, th = cv2.threshold(blur, 38, 255, cv2.THRESH_BINARY)  #---Binary threshold---
cv2.imshow('th.jpg', th)
cv2.waitKey(0)
Now you can perform contour operations to highlight your regions of interest also. Try it out!! :)
ANOTHER UPDATE:
I kept the contours whose area was above a certain constraint to get this:
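A possible sketch of that contour-filtering step, continuing from the thresholded image th above (the area cutoff of 100 is a placeholder, not the exact constraint used):

contours, hier = cv2.findContours(th, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
for cnt in contours:
    if cv2.contourArea(cnt) > 100:  # keep only contours above the constraint
        cv2.drawContours(img, [cnt], -1, (0, 255, 0), 2)
cv2.imshow('contours', img)
cv2.waitKey(0)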
There are countless ways to do image segmentation.
The simplest one is a global threshold operation. If you want to know more about other methods, you should read some books, which I recommend anyway before you do any further image processing. It doesn't make much sense to start image processing if you don't know the most basic tools.
Just to show you how this could be achieved:
I converted the image from RGB to HSB. I then applied separate global thresholds to the hue and brightness channels to get the best segmentation result for both images.
Both binary images were then combined using a pixelwise AND operation. I did this because both channels gave sub-optimal results, but their overlap was pretty good.
I also applied some morphological operators to clean up the results.
Of course you can just invert the image to get the desired black background...
Thresholds and the channels used of course depend on the image you have and what you want to achieve. This is a very case-specific process that can be dynamically adapted only to a limited extent.
This could be followed by labeling or whatever else you need:
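By way of illustration only, the pipeline described above might be sketched like this; the channel choices follow the description, but the threshold values and kernel size are placeholders to tune per image:

import cv2

img = cv2.imread('seed.jpg')
hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)
hue, _, brightness = cv2.split(hsv)

# separate global thresholds on the hue and brightness channels
ret1, hue_mask = cv2.threshold(hue, 80, 255, cv2.THRESH_BINARY_INV)
ret2, bright_mask = cv2.threshold(brightness, 100, 255, cv2.THRESH_BINARY)

# combine with a pixelwise AND, since each channel alone is sub-optimal
mask = cv2.bitwise_and(hue_mask, bright_mask)

# morphological opening to clean up the result
kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (5, 5))
mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, kernel)

# seeds white on black; label the connected components
count, labels = cv2.connectedComponents(mask)
print(count - 1, 'seeds found')  # label 0 is the background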
I'm working with the following input image:
I want to extract all the boxes inside the original image as individual images, together with their positions, so that I can reconstruct the image after doing some operations on them. Currently I'm trying to detect contours in the image using OpenCV, but the problem is that it also extracts all the words inside the boxes. The output looks something like this:
Is there any way I can set the dimensions of the boxes to be extracted, or is something else required for this?
Fairly simple approach:
Convert to grayscale.
Invert the image (to avoid the top-level contour being detected around the whole image -- we want the lines white and the background black).
Find external contours only (we don't have any nested boxes).
Filter contours by area, discard the small ones.
You could possibly also filter by bounding box dimensions, etc. Feel free to experiment.
Example Script
Note: for OpenCV 2.4.x
import cv2

img = cv2.imread('cnt2.png')
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
gray = 255 - gray  # invert so lines are white on a black background
contours, hierarchy = cv2.findContours(gray, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)
for contour in contours:
    area = cv2.contourArea(contour)
    if area > 500.0:  # discard small contours (words, noise)
        cv2.drawContours(img, [contour], -1, (0, 255, 0), 1)
cv2.imwrite('cnt2_out.png', img)
Example Output
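Not part of the original answer, but to address the "with position" part of the question: each filtered box could be cropped along with its location, then pasted back after processing. A sketch continuing from the variables above (in practice, crop from a clean copy of the image before the contours are drawn on it):

boxes = []
for contour in contours:
    if cv2.contourArea(contour) > 500.0:
        x, y, w, h = cv2.boundingRect(contour)
        boxes.append(((x, y), img[y:y+h, x:x+w].copy()))

# reconstruct by pasting each (possibly modified) crop back at its position
for (x, y), crop in boxes:
    img[y:y+crop.shape[0], x:x+crop.shape[1]] = crop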