I am doing handwritten digit recognition using scikit-learn. For that I need to crop a photographed picture, so I have prepared a template in Word.
What I want now is to crop the image along the border, so that I can crop it further to extract the digits.
A sample image is given below:
For cropping the image I am using this code.
Below is the parent image from which the above rectangle has been cropped:
Note: The parent image has a border too (which is not visible in the image), so trimming the white space might help produce a modified parent image in which the predefined (height, width) is almost the same for the various crops to be done on the image.
You could apply this pipeline: convert to grayscale -> apply thresholding (convert to white & black) -> find contours -> choose the contours of the right shape.
Here is example code:
#!/usr/bin/env python
import cv2
BLACK_THRESHOLD = 200
THIN_THRESHOLD = 10
ANNOTATION_COLOUR = (222,0,222)
img = cv2.imread('template.png')
orig = img.copy()
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
thresh = cv2.threshold(gray, thresh=BLACK_THRESHOLD, maxval=255, type=cv2.THRESH_BINARY_INV)[1]
# Optional: save thresholded image
cv2.imwrite("temp_thres.png", thresh)
# Find contours on the thresholded image
# ([-2] selects the contour list in both OpenCV 3.x and 4.x)
contours = cv2.findContours(thresh, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)[-2]
for cont in contours:
    # Find bounding rectangle of a contour
    x, y, w, h = cv2.boundingRect(cont)
    # Skip thin contours (vertical and horizontal lines)
    if h < THIN_THRESHOLD or w < THIN_THRESHOLD:
        continue
    # Does the contour have the right shape (roughly four times longer than high)?
    if 3*h < w < 5*h:
        roi = orig[y:y+h, x:x+w]
        cv2.imwrite("four_letters.png", roi)
    # Optional: draw annotations
    cv2.rectangle(img, (x, y), (x+w, y+h), ANNOTATION_COLOUR, 3)
# Optional: save annotated image
cv2.imwrite("temp_cont.png", img)
(You can delete the three optional steps. They are just for generating images temp_thres.png and temp_cont.png.)
Input image template.png:
Thresholded image temp_thres.png:
Found contours temp_cont.png:
Four letter space four_letters.png:
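To crop the extracted rectangle further into individual digits, one option is to slice it into equal-width cells. This is only a sketch: it assumes the rectangle holds four cells of equal width, and `split_into_cells`, `n_cells` and `margin` are names made up for this illustration. The `margin` trims a few pixels on each side so the printed border lines are less likely to end up in the digit images.

```python
import numpy as np

def split_into_cells(roi, n_cells=4, margin=0):
    """Split an image of shape (h, w[, channels]) into n_cells side-by-side tiles."""
    h, w = roi.shape[:2]
    cell_w = w // n_cells
    cells = []
    for i in range(n_cells):
        # Horizontal extent of cell i, shrunk by the margin on both sides
        x0 = i * cell_w + margin
        x1 = (i + 1) * cell_w - margin
        cells.append(roi[margin:h - margin, x0:x1])
    return cells

# Synthetic stand-in for four_letters.png: 40 px high, 160 px wide, 3 channels
roi = np.zeros((40, 160, 3), dtype=np.uint8)
cells = split_into_cells(roi, n_cells=4, margin=2)
print([c.shape for c in cells])
```

With a real photo the cells are rarely exactly equal, so the margin (or a second contour search inside each cell) would need tuning.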
I have a program that uses PIL to automatically crop a series of images to a certain area. However, when I run the program with screenshots from a different sized display, the area cropped is in the wrong place. Is there a way to use OpenCV or PIL to automatically find the rectangle that I want to crop to (for example the main viewer of a Youtube video) and crop to it, while leaving the image in color and then saving the image to a new folder?
My code for cropping images:
import os, random
from PIL import Image
files = []
for x in os.listdir():
    if '.jpg' in f'{x}' or '.png' in f'{x}' or '.jpeg' in f'{x}':
        files.append(x)
    else:
        print(f'Passed by {x}! It is not an image!')
for x in files:
    y = hex(random.randint(100000,500000))
    image = Image.open(f'{x}')
    newimage = image.crop((48,367,1626,1256))
    newimage.save(f'newdir/{y}.png')
The example image (this works with the PIL cropper):
The image I want:
Another image from another computer that needs to be cropped to the same viewer:
Here is one way to do that using Python/OpenCV.
Basically, threshold the image, then get contours, then get the bounding box of the largest contour and crop using the bounding box.
Input:
import cv2
import numpy as np
# load image
img = cv2.imread("screen.jpg")
# get color bounds of white region
lower =(180,180,180) # lower bound for each channel
upper = (255,255,255) # upper bound for each channel
# threshold
threshold = cv2.inRange(img, lower, upper)
# get the largest contour
contours = cv2.findContours(threshold, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
contours = contours[0] if len(contours) == 2 else contours[1]
big_contour = max(contours, key=cv2.contourArea)
# get bounding box
x,y,w,h = cv2.boundingRect(big_contour)
print(x,y,w,h)
# crop the image at the bounds
crop = img[y:y+h, x:x+w]
# write result to disk
cv2.imwrite("screen_threshold.jpg", threshold)
cv2.imwrite("screen_cropped.jpg", crop)
# display it
cv2.imshow("threshold", threshold)
cv2.imshow("crop", crop)
cv2.waitKey(0)
Threshold Image:
Cropped Result:
Cropped (x,y,w,h):
48 368 1578 801
What is an efficient way to construct bounding boxes for an object, given the encoded pixels of that object? I am trying to convert a task from segmentation to object detection in YOLO. Any suggestions are welcome.
If you have the points of the segmentation, then you can find the extents of the points by iterating through them and recording the lowest and highest values of [x,y]. This will give you the top-left and bottom-right corners.
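A small sketch of the min/max idea with NumPy, assuming the segmentation has already been decoded into a list of (x, y) points (decoding the encoded pixels is not shown). `bbox_from_points` and `to_yolo` are names invented for this example; `to_yolo` assumes the usual YOLO label layout of normalized (x_center, y_center, width, height).

```python
import numpy as np

def bbox_from_points(points):
    """Return (x_min, y_min, x_max, y_max) for an iterable of (x, y) points."""
    pts = np.asarray(points)
    x_min, y_min = pts.min(axis=0)
    x_max, y_max = pts.max(axis=0)
    return int(x_min), int(y_min), int(x_max), int(y_max)

def to_yolo(box, img_w, img_h):
    """Convert corner coordinates to normalized (x_center, y_center, width, height)."""
    x_min, y_min, x_max, y_max = box
    return (
        (x_min + x_max) / 2 / img_w,
        (y_min + y_max) / 2 / img_h,
        (x_max - x_min) / img_w,
        (y_max - y_min) / img_h,
    )

points = [(12, 40), (30, 22), (55, 48), (41, 60)]
box = bbox_from_points(points)
print(box)  # (12, 22, 55, 60)
print(to_yolo(box, img_w=100, img_h=100))
```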
If you're trying to work backwards from a colored, segmented image then you can use opencv to threshold the colors and get boxes from the blobs like this:
import cv2
import numpy as np
# load image
img = cv2.imread("mask.png")
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
# threshold
thresh = cv2.inRange(gray, 100, 255)
# contours ([-2] selects the contour list in both OpenCV 3.x and 4.x)
contours = cv2.findContours(thresh, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)[-2]
# draw a rectangle around each blob
for cnt in contours:
    x, y, w, h = cv2.boundingRect(cnt)
    cv2.rectangle(img, (x, y), (x+w, y+h), (0, 255, 0), 2)
# show
cv2.imshow("box", img)
cv2.imshow("Thresh", thresh)
cv2.waitKey(0)
# save
cv2.imwrite("boxed.png", img)
I want to remove the background from an X-ray to extract the actual area. My original
image looks like the left image, and I want to crop it to look like the right image.
A solution with Python and OpenCV is appreciated.
There are multiple files, so we don't know the height and width to crop in advance; they need to be computed.
Here is one way to do that in Python/OpenCV.
Read the input
Convert to gray
Threshold
Blacken the bottom two white rows
Find where all white pixels are in the image
Get the bounds of those pixels
Crop the image at the bounds
Save the results
Input:
import cv2
import numpy as np
# load image and convert to grayscale
img = cv2.imread('xray_chest.jpg')
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
# threshold
thresh = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY+cv2.THRESH_OTSU)[1]
hh, ww = thresh.shape
# make bottom 2 rows black where they are white the full width of the image
thresh[hh-3:hh, 0:ww] = 0
# get bounds of white pixels
white = np.where(thresh==255)
xmin, ymin, xmax, ymax = np.min(white[1]), np.min(white[0]), np.max(white[1]), np.max(white[0])
print(xmin,xmax,ymin,ymax)
# crop the image at the bounds adding back the two blackened rows at the bottom
crop = img[ymin:ymax+3, xmin:xmax]
# save resulting masked image
cv2.imwrite('xray_chest_thresh.jpg', thresh)
cv2.imwrite('xray_chest_crop.jpg', crop)
# display result
cv2.imshow("thresh", thresh)
cv2.imshow("crop", crop)
cv2.waitKey(0)
cv2.destroyAllWindows()
Thresholded image with bottom two rows blackened:
Cropped input:
An alternate method would be to get the external contour of the white region from the thresholded image. Get the bounds of the contour. Then crop to those bounds.
I'm working with images and I'm finding it difficult to crop binary images automatically. I'm new to image processing.
Example images are shown below,
Original image:
Needed output (manually edited with the GIMP image editor):
I need to crop the image by finding the edges of the white pixels (mask) in the image, but it is hard to find the approximate edges. Please help me figure this out.
Thanks in advance!
You can use findContours to find the boundary of the object, then use minAreaRect to draw your needed output (1st image). Or you can just draw the boundary of the object (2nd image).
import cv2
import numpy as np
img = cv2.imread('1.jpg')
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
_,thresh = cv2.threshold(gray,128,255,cv2.THRESH_BINARY)
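# [-2] selects the contour list in both OpenCV 3.x and 4.x
contours = cv2.findContours(thresh,cv2.RETR_TREE,cv2.CHAIN_APPROX_SIMPLE)[-2]
mask = np.zeros(img.shape, dtype=np.uint8)
cv2.drawContours(mask, contours, -1 , (255,255,255), 1)
rect = cv2.minAreaRect(contours[0])
box = cv2.boxPoints(rect)
# np.int0 was removed in NumPy 2.0; cast to int32 instead
box = box.astype(np.int32)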
cv2.drawContours(img,[box],0,(255,255,255),1)
cv2.imshow("img",img)
cv2.imshow("mask",mask)
cv2.waitKey(0)
cv2.destroyAllWindows()
I'm trying to develop a script using Python and OpenCV to detect some highlighted regions on a scanned instrumentation diagram and output text using Tesseract's OCR function. My workflow is first to detect the general vicinity of the region of interest, and then apply processing steps to remove everything aside from the blocks of text (lines, borders, noise). The processed image is then fed into Tesseract's OCR engine.
This workflow works on about half of the images, but fails on the rest due to the text touching the borders. I'll show some examples of what I mean below:
Step 1: Find regions of interest by creating a mask using InRange with the color range of the highlighter.
Step 2: Contour regions of interest, crop and save to file.
--- Referenced code begins here ---
Step 3: Threshold image and apply Canny Edge Detection
Step 4: Contour the edges and filter them into circular shape using cv2.approxPolyDP and looking at ones with vertices greater than 8. Taking the first or second largest contour usually corresponds to the inner edge.
Step 5: Using masks and bitwise operations, everything inside contour is transferred to a white background image. Dilation and erosion is applied to de-noise the image and create the final image that gets fed into the OCR engine.
import cv2
import numpy as np
import pytesseract
pytesseract.pytesseract.tesseract_cmd = 'C:/Program Files (x86)/Tesseract-OCR/tesseract'
d_path = "Test images\\"
img_name = "cropped_12.jpg"
img = cv2.imread(d_path + img_name) # Reads the image
## Resize image before calculating contour
height, width = img.shape[:2]
img = cv2.resize(img,(2*width,2*height),interpolation = cv2.INTER_CUBIC)
img_orig = img.copy() # Makes copy of original image
img = cv2.cvtColor(img,cv2.COLOR_BGR2GRAY) # Convert to grayscale
# Apply threshold to get binary image and write to file
_, img = cv2.threshold(img,0,255,cv2.THRESH_BINARY+cv2.THRESH_OTSU)
# Edge detection
edges = cv2.Canny(img,100,200)
# Find contours of mask threshold
# ([-2:] yields (contours, hierarchy) in both OpenCV 3.x and 4.x)
contours, hierarchy = cv2.findContours(edges, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)[-2:]
# Find contours associated w/ polygons with more than 8 vertices
cnt_list = []
area_list = [cv2.contourArea(c) for c in contours]
for j in contours:
    poly_pts = cv2.approxPolyDP(j,0.01*cv2.arcLength(j,True),True)
    area = cv2.contourArea(j)
    if (len(poly_pts) > 8) and (area == max(area_list)):
        cnt_list.append(j)
cv2.drawContours(img_orig, cnt_list, -1, (255,0,0), 2)
# Show contours
cv2.namedWindow('Show',cv2.WINDOW_NORMAL)
cv2.imshow("Show",img_orig)
cv2.waitKey()
cv2.destroyAllWindows()
# Zero pixels outside circle
mask = np.zeros(img.shape).astype(img.dtype)
cv2.fillPoly(mask, cnt_list, (255,255,255))
mask_inv = cv2.bitwise_not(mask)
a = cv2.bitwise_and(img,img,mask = mask)
wh_back = np.ones(img.shape).astype(img.dtype)*255
b = cv2.bitwise_and(wh_back,wh_back,mask = mask_inv)
res = cv2.add(a,b)
# Get rid of noise
kernel = np.ones((2, 2), np.uint8)
res = cv2.dilate(res, kernel, iterations=1)
res = cv2.erode(res, kernel, iterations=1)
# Show final image
cv2.namedWindow('result',cv2.WINDOW_NORMAL)
cv2.imshow("result",res)
cv2.waitKey()
cv2.destroyAllWindows()
When code works, these are the images that get outputted:
Working
However, in the instances where the text touches the circular border, the code assumes part of the text is part of the larger contour and ignores the last letter. For example:
Not working
Are there any processing steps that can help me bypass this problem? Or perhaps a different approach? I've tried using Hough Circle Transforms to detect the borders, but they're quite finicky and don't work as well as contouring.
I'm quite new to OpenCV and Python so any help would be appreciated.
If the Hough circle transform didn't work for you, I think your best option will be to approximate the border shape. The best method I know for that is the Douglas-Peucker algorithm, which simplifies your contour by reducing the number of points on it.
You can check this reference from OpenCV to see the type of post-processing you can apply to your border. It also mentions Douglas-Peucker:
OpenCV border processing
Just a hunch: after Otsu thresholding, erode and then dilate the image. This should make very thin joints vanish. The code for this is below.
kernel = np.ones((5,5),np.uint8)
th3 = cv2.erode(th3, kernel,iterations=1)
th3 = cv2.dilate(th3, kernel,iterations=1)
Let me know how it goes. I have a couple more ideas if this does not work.