I want to count cardboard boxes and read a specific label (it will only contain 3 words on a white background) on a conveyor belt using OpenCV and Python. Attached is the image I am using for experiments. The problem so far is that I am unable to detect the complete box due to noise, and if I try to filter on w and h in `x, y, w, h = cv2.boundingRect(cnt)`, it simply filters out the text. In this case "ABC" is written on the box. Also, the detected box has spikes on both top and bottom, which I am not sure how to filter.
Below is the code I am using:
import cv2
# read the image
image = cv2.imread('img002.jpg')
# convert the image to grayscale
img_gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
# apply binary thresholding
ret, thresh = cv2.threshold(img_gray, 150, 255, cv2.THRESH_BINARY)
# visualize the binary image
cv2.imshow('Binary image', thresh)
# collect contours
contours, hierarchy = cv2.findContours(thresh, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)
# loop through contours and draw bounding boxes
for cnt in contours:
    x, y, w, h = cv2.boundingRect(cnt)
    cv2.rectangle(image, (x, y), (x + w, y + h), (0, 215, 255), 2)
cv2.imshow('img', image)
cv2.waitKey(0)
cv2.destroyAllWindows()
Also, please suggest how to crop the text "ABC" and then apply OCR on it to read the text.
Many Thanks.
EDIT 2: Many thanks for your answer. Based on your suggestion I changed the code so that it can check for boxes in a video. It worked like a charm, except that it failed to identify one box for a long time. Below are my code and a link to the video I used. I have a couple of questions around this, as I am new to OpenCV, if you can find some time to answer.
import cv2
import numpy as np
from time import time as timer

def get_region(image):
    contours, hierarchy = cv2.findContours(image, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)
    c = max(contours, key=cv2.contourArea)
    black = np.zeros((image.shape[0], image.shape[1]), np.uint8)
    mask = cv2.drawContours(black, [c], 0, 255, -1)
    return mask

cap = cv2.VideoCapture("Resources/box.mp4")
ret, frame = cap.read()
fps = 60
fps /= 1000
framerate = timer()
elapsed = int()

while(1):
    start = timer()
    ret, frame = cap.read()
    # stop when the video ends (checking here avoids passing None to cvtColor)
    if not ret:
        break
    # convert the frame to HSV
    hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)
    # perform threshold on the hue channel `hsv[:,:,0]`
    thresh = cv2.threshold(hsv[:,:,0], 127, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)[1]
    mask = get_region(thresh)
    masked_img = cv2.bitwise_and(frame, frame, mask=mask)
    newImg = cv2.cvtColor(masked_img, cv2.COLOR_BGR2GRAY)
    # collect contours
    cnts, hierarchy = cv2.findContours(newImg, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)
    cont_sorted = sorted(cnts, key=cv2.contourArea, reverse=True)[:5]
    x, y, w, h = cv2.boundingRect(cont_sorted[0])
    cv2.rectangle(frame, (x, y), (x + w, y + h), (255, 0, 0), 5)
    #cv2.imshow('frame', masked_img)
    cv2.imshow('Out', frame)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break
    diff = timer() - start
    while diff < fps:
        diff = timer() - start

cap.release()
cv2.destroyAllWindows()
Link to video: https://www.storyblocks.com/video/stock/boxes-and-packages-move-along-a-conveyor-belt-in-a-shipment-factory-a-few-blank-boxes-for-your-custom-graphics-lmgxtwq
Questions:
How can we be 100% sure that the rectangle drawn is actually on top of a box and not on the belt or somewhere else?
Can you please tell me how I can use the function you provided in the original answer for the other boxes in this new video code?
Is it correct to convert the masked frame to grey again and find contours a second time to draw a rectangle? Or is there a more efficient way to do it?
The final version of this code is intended to run on a Raspberry Pi. What can we do to optimize the code's performance?
Many thanks again for your time.
There are 2 steps to be followed:
1. Box segmentation
We can assume there will be no background change, since the conveyor belt is present. We can segment the box using a different color space; in the following I have used the HSV color space:
import cv2
import numpy as np

img = cv2.imread('box.jpg')
hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)
# performing threshold on the hue channel `hsv[:,:,0]`
th = cv2.threshold(hsv[:,:,0], 127, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)[1]
Masking the largest contour in the binary image:
def get_region(image):
    contours, hierarchy = cv2.findContours(image, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)
    c = max(contours, key=cv2.contourArea)
    black = np.zeros((image.shape[0], image.shape[1]), np.uint8)
    mask = cv2.drawContours(black, [c], 0, 255, -1)
    return mask
mask = get_region(th)
Applying the mask on the original image:
masked_img = cv2.bitwise_and(img, img, mask = mask)
2. Text Detection:
The text region is enclosed in white, which can be isolated again by applying a suitable threshold. (You might want to apply some statistical measure to calculate the threshold)
# Applying threshold at 220 on green channel of 'masked_img'
result = cv2.threshold(masked_img[:,:,1],220,255,cv2.THRESH_BINARY)[1]
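To answer the crop-and-OCR part of the question, a minimal sketch (my addition, assuming pytesseract and the Tesseract engine are installed, and that the label is the largest white blob in `result`):
import pytesseract

# locate the white label blob in the thresholded image
cnts, _ = cv2.findContours(result, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
label = max(cnts, key=cv2.contourArea)  # assume the label is the largest white blob
x, y, w, h = cv2.boundingRect(label)
crop = result[y:y+h, x:x+w]  # black text on white background, good input for OCR
# run OCR on the cropped label
text = pytesseract.image_to_string(crop)
print(text.strip())  # e.g. 'ABC'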
Note:
The code is written for the shared image. For boxes of different sizes you can filter contours with approximately 4 vertices/sides.
# Function to extract rectangular contours above a certain area
# (draws the accepted quadrilaterals on the image that is passed in)
def extract_rect(contours, area_threshold, image):
    rect_contours = []
    for c in contours:
        if cv2.contourArea(c) > area_threshold:
            perimeter = cv2.arcLength(c, True)
            approx = cv2.approxPolyDP(c, 0.02 * perimeter, True)
            if len(approx) == 4:
                cv2.drawContours(image, [approx], 0, (0, 255, 0), 2)
                rect_contours.append(c)
    return rect_contours
Experiment using a statistical value (mean, median, etc.) to find optimal threshold to detect text region.
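For example, one possible heuristic (my own suggestion, not something fixed) is to derive the threshold from the intensity statistics inside the box mask instead of hard-coding 220:
# derive a threshold from the statistics of the box region only
green = masked_img[:, :, 1]
vals = green[mask == 255]  # green-channel values inside the box mask
t = min(int(vals.mean() + 2 * vals.std()), 250)  # mean + 2*std, capped below pure white
result = cv2.threshold(green, t, 255, cv2.THRESH_BINARY)[1]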
Your additional questions warranted a separate answer:
1. How can we be 100% sure if the rectangle drawn is actually on top of a box and not on belt or somewhere else?
PRO: For this very purpose I chose the hue channel of the HSV color space. Shades of grey, white and black (on the conveyor belt) are neutral in this channel. The brown color of the box is contrasting and can easily be segmented using an Otsu threshold. Otsu's algorithm finds the optimal threshold value without user input.
CON: You might face problems when boxes are the same color as the conveyor belt.
2. Can you please tell me how can I use the function you have provided in original answer to use for other boxes in this new code for video.
PRO: In case you want to find boxes using edge detection and without using color information, there is a high chance of getting many unwanted edges. By using the extract_rect() function, you can filter contours that:
have approximately 4 sides (quadrilateral)
are above certain area
CON: If you have parcels/packages/bags with more than 4 sides, you might need to change this.
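For question 2, a rough sketch (my own, not part of the original answer) of wiring extract_rect() into the video loop; the area threshold of 10000 is an arbitrary value to tune:
# inside the video loop, after newImg = cv2.cvtColor(masked_img, cv2.COLOR_BGR2GRAY)
contours, _ = cv2.findContours(newImg, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)
# draw and collect every roughly 4-sided contour above the area threshold
boxes = extract_rect(contours, area_threshold=10000, image=frame)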
3. Is it correct to convert the masked frame to grey again and find contours a second time to draw a rectangle? Or is there a more efficient way to do it?
I felt this is the best way, because all that remains is the textual region enclosed in white. Applying a high threshold was the simplest idea in my mind. There might be a better way :)
(I am not in the position to answer the 4th question :) )
I'm using OpenCV (4.x) on the Anime Sketch dataset from Kaggle to get each image's silhouette. What I found to be the hardest part was detecting the empty areas inside that silhouette: the areas between arm and body, legs and hair. The tutorials I followed always use "fully filled" objects like a ball, a head or cars, and I ended up tuning that code to make it work, but the tuning is so specific that it only works on one image.
Playing around in online-image-editor.com I noticed that I can use the tool called Transparency to change one color, just like cv2.inRange() does.
Original image
The code:
import cv2

image = cv2.imread("2.png", cv2.IMREAD_UNCHANGED)
crop_img = image[:, 0:512]
fuzz_factor = 0.97
maxColor = (crop_img[1, 1] * 1).astype(int)
minColor = (maxColor * fuzz_factor).astype(int)
mask = cv2.inRange(crop_img, minColor, maxColor)
cv2.imshow("mask", mask)
cv2.waitKey()
and it outputs this (not that bad...)
BUT when trying it with another image it doesn't work anymore; output:
So, question(s):
Is there some "magic rule" from which I can extract a specific fuzz_factor for each image?
How could I use the image's right half to get that silhouette/contour?
Thanks guys
I'm posting this to close the question.
Thanks to Micka I made some progress. There are two variables that have a high impact on the output's quality:
fuzz_factor: which sets the color range for cv2.inRange()
max_contours: number of contours to draw (sorted by size)
Higher numbers are better, until white zones appear that are not background; the next step could be to discard those.
import numpy as np
import cv2

# constants
fuzz_factor = 1
max_contours = -10
image_path = "9.png"

image = cv2.imread(image_path)
image = image[:, 0:512]
# background color boundaries
color = image[3, 3]
upper = (color).astype(int)
lower = (color * (100 - fuzz_factor / 2.0) / 100).astype(int)
# create mask with specific colors
mask = cv2.inRange(image, lower, upper)
# get all contours
contours, _ = cv2.findContours(mask, mode=cv2.RETR_EXTERNAL, method=cv2.CHAIN_APPROX_NONE)
if len(contours) > 1:
    # get the [max_contours] biggest areas
    contours = sorted(contours, key=cv2.contourArea)[max_contours:]
# mask where contours are filled
mask = np.zeros_like(image)
# draw contours and fill
cv2.drawContours(mask, contours, -1, color=[255, 255, 255], thickness=-1)
cv2.drawContours(image, contours, -1, 255, 2)
cv2.imshow("Result", np.hstack([image, mask]))
cv2.waitKey(0)
I'm trying out a process from a research paper that I've read and came across this procedure.
I have tried reading about the processes involved, but I can't seem to wrap my head around it.
1. Take the binary image I.
2. Create a marker image F which has gray value 255 in all pixels, except for those pixels along the boundary which are not object pixels in the cell image, where it is 0.
3. Dilate F by B, a 5×5 mask that has gray value 0 in all pixels. Let this dilated image be F ⊕ B.
4. Take the intersection of the complement of I and F ⊕ B. Let this be H.
5. Make F equal to H.
6. Repeat steps 3 to 5 t times (experimentally, t is taken as 1000).
7. Take the intersection of the complement of I and the complement of H. This gives us the image of holes. Let this be G.
8. Take the union of I and G to get the final image, which is free of non-peripheral holes.
This is the result of their process:
I wanted to have the same result using this binary image:
Can someone please explain the process thoroughly and how to achieve the same result?
This is where I'm currently at:
import cv2

# LOAD IMAGE
img = cv2.imread('resources/rbc2.png')
# CONVERT TO GRAYSCALE
imgGray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
# APPLY MEDIAN BLUR
medianImg = cv2.medianBlur(imgGray, 9)
# OTSU THRESHOLDING
ret, otsu = cv2.threshold(medianImg, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
# COMPLEMENT OF I
complimentI = cv2.bitwise_not(otsu)
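For reference, a sketch of the iterative procedure itself, reading steps 2-6 as the standard border-based geodesic dilation (I invert the marker convention so the reconstruction grows inward from the border; the function name and variables are my own):
import numpy as np

def fill_non_peripheral_holes(I, t=1000):
    # I: binary image, objects = 255, background = 0
    not_I = cv2.bitwise_not(I)  # complement of I
    # marker F: background pixels on the image border (step 2, inverted)
    F = np.zeros_like(I)
    F[0, :] = F[-1, :] = F[:, 0] = F[:, -1] = 255
    F = cv2.bitwise_and(F, not_I)
    B = np.ones((5, 5), np.uint8)  # the 5x5 structuring element from step 3
    for _ in range(t):  # steps 3-6
        H = cv2.bitwise_and(cv2.dilate(F, B), not_I)  # geodesic dilation
        if np.array_equal(H, F):  # converged before t iterations
            break
        F = H
    G = cv2.bitwise_and(not_I, cv2.bitwise_not(F))  # step 7: the holes
    return cv2.bitwise_or(I, G)  # step 8: union fills the holes

# e.g. filled = fill_non_peripheral_holes(otsu)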
If all you're looking to do is fill holes in a mask, we can do that much more simply by using OpenCV's findContours. We can filter for small contours and fill those contours in on the mask.
Edit: I am using OpenCV 3.4. If you are using OpenCV 2.* or 4.*, then findContours returns 2 values and the call should look like this:
contours, _ = cv2.findContours(mask, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)
Filled mask
import cv2

# load image
gray = cv2.imread("blobs.png", cv2.IMREAD_GRAYSCALE)
# mask with otsu
_, mask = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
# find contours
_, contours, _ = cv2.findContours(mask, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)
# filter out contours by size
small_cntrs = []
for con in contours:
    area = cv2.contourArea(con)
    if area < 1000:  # size threshold
        small_cntrs.append(con)
cv2.drawContours(mask, small_cntrs, -1, (0), -1)
# show
cv2.imshow("mask", mask)
cv2.waitKey(0)
So I decided to get started learning OpenCV and Python together!
My first project is to detect moving objects on a relatively still background and then detect their average color to sort them. There are at least 10 objects to detect and I am processing a colored video.
So far I have managed to remove the background and identify the contours (optionally getting the center of each contour), but now I am struggling to get the average or mean color inside each contour. There are some topics about this kind of question, but most of them are written in C. Apparently I could use cv2.mean(), but I can't get a working mask to feed into this function. I guess it's not so difficult, but I am stuck here... Cheers!
import numpy as np
import cv2

video_path = 'test.h264'
cap = cv2.VideoCapture(video_path)
fgbg = cv2.createBackgroundSubtractorMOG2()

while cap.isOpened():
    ret, frame = cap.read()
    if ret == True:
        fgmask = fgbg.apply(frame)
        (contours, hierarchy) = cv2.findContours(fgmask, cv2.RETR_TREE, cv2.CHAIN_APPROX_NONE)
        for c in contours:
            if cv2.contourArea(c) > 2000:
                cv2.drawContours(frame, [c], -1, (255, 0, 0), 3)
        cv2.imshow('foreground and background', fgmask)
        cv2.imshow('rgb', frame)
    key = cv2.waitKey(1) & 0xFF
    if key == ord("q"):
        break

cap.release()
cv2.destroyAllWindows()
You can create a mask by first creating a new single-channel image with the same height and width as your input image and pixel values set to zero.
You then draw the contour(s) onto this image with pixel value 255. The resulting image can be used as a mask.
mask = np.zeros(frame.shape[:2], np.uint8)  # single-channel, as cv2.mean expects
cv2.drawContours(mask, [c], -1, 255, -1)
mask can then be used as a parameter to cv2.mean like
mean = cv2.mean(frame, mask=mask)
Just one word of caution: the mean of RGB colors does not always make sense. Maybe try converting to HSV color space and solely use the H channel for detecting the color of your objects.
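A minimal sketch of that HSV idea, assuming frame and a contour c from the loop above (note it naively averages hue, ignoring that hue wraps around at 180 in OpenCV):
hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)
hue_mask = np.zeros(frame.shape[:2], np.uint8)
cv2.drawContours(hue_mask, [c], -1, 255, -1)
mean_hue = cv2.mean(hsv, mask=hue_mask)[0]  # OpenCV hue range is 0-179
print('mean hue:', mean_hue)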
Solution on an image
1) find the contour (in this case a rectangle; a contour that is not a rectangle is much harder to handle)
2) find the coordinates of the contour
3) cut the image at the contour
4) sum the individual channels and divide by the number of pixels in the cut (or use the mean function)
import numpy as np
import cv2

img = cv2.imread('my_image.jpg', 1)
cp = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
ret, thresh = cv2.threshold(cp, 150, 255, 0)
cv2.imshow('img', thresh)
cv2.waitKey(0)

im2, contours, hierarchy = cv2.findContours(thresh.astype(np.uint8), cv2.RETR_LIST, cv2.CHAIN_APPROX_SIMPLE)
cnts = contours
for cnt in cnts:
    if cv2.contourArea(cnt) > 800:  # filter small contours
        x, y, w, h = cv2.boundingRect(cnt)  # offsets - with this you get 'mask'
        # crop and measure before drawing, so the rectangle border does not pollute the mean
        crop = img[y:y+h, x:x+w].copy()
        print('Average color (BGR): ', np.array(cv2.mean(crop)).astype(np.uint8))
        cv2.rectangle(img, (x, y), (x + w, y + h), (0, 255, 0), 2)
        cv2.imshow('cutted contour', crop)
        cv2.waitKey(0)

cv2.imshow('img', img)
cv2.waitKey(0)
cv2.destroyAllWindows()
To remove noise, you can just take the center of the contour and examine a smaller rectangle.
For a non-rectangular contour, look at the cv2.fillPoly function -> cropping non-rectangular contours. But it is a bit slow (though nothing limiting).
If you are interested in a non-rectangular contour, you will have to be careful about taking the mean, because you need a mask, and the mask/background is always rectangular, so you would be averaging over pixels you don't want, as sketched below.
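Assuming a contour cnt from the loop above, a minimal sketch of the masked mean using cv2.fillPoly, so that background pixels inside the bounding rectangle are excluded:
# build a filled polygon mask for the contour, then average only inside it
mask = np.zeros(img.shape[:2], np.uint8)
cv2.fillPoly(mask, [cnt], 255)
mean_bgr = cv2.mean(img, mask=mask)[:3]  # BGR means over masked pixels only
print('Average color (BGR):', mean_bgr)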
I am trying to extract the account number from an image of a cheque. My logic is to find the rectangle that contains the account number, slice out the bounding rectangle, and then feed the slice into an OCR to get the text out of it.
The problem I am facing is that when the rectangle is not very prominent and is light in colour, I am not able to get the rectangle contour, since the edges are not totally connected.
How do I overcome this?
Things I tried that did not work:
I cannot increase the erosion iterations to erode more, because then the edges connect with the surrounding black pixels and form a different shape.
Reducing the threshold offset might help, but it seems inefficient, since the code has to work with several types of images. I could start with offset 10 and keep incrementing it, checking each time whether I found the rectangle. That would add a lot of time for cheques with prominent rectangles that already work well at offset 20 or more, and since I have no condition to check whether the rectangle's edges are prominent, the loop would have to run on every cheque.
Keeping the above points in mind, can someone help me out with a solution to this problem?
Libraries used and versions
scikit-image==0.13.1
opencv-python==3.3.0.10
Code
from skimage.filters import threshold_adaptive, threshold_local
import cv2
import numpy as np
Step 1:
image = cv2.imread('cropped.png')
Step 2:
Using adaptive threshold from skimage to remove the background, so that I can get the account number rectangle box. This works fine for cheques where the rectangle is more pronounced, but when the rectangle edges are thin or lighter in colour, the threshold results in unconnected edges, because of which I am not able to find the contours. I have attached examples of this further down in the question.
account_number_block = cv2.cvtColor(image, cv2.COLOR_RGB2GRAY)
account_number_block = threshold_adaptive(account_number_block, 251, offset=20)
account_number_block = account_number_block.astype("uint8") * 255
Step 3:
Erode the image a bit to try to connect small disconnections in the edges
kernel = np.ones((3,3), np.uint8)
account_number_block = cv2.erode(account_number_block, kernel, iterations=5)
Find the contours
(_, cnts, _) = cv2.findContours(account_number_block.copy(), cv2.RETR_LIST, cv2.CHAIN_APPROX_SIMPLE)
# cnts = sorted(cnts, key=cv2.contourArea)[:3]
rect_cnts = []  # Rectangular contours
for cnt in cnts:
    approx = cv2.approxPolyDP(cnt, 0.01 * cv2.arcLength(cnt, True), True)
    if len(approx) == 4:
        rect_cnts.append(cnt)
rect_cnts = sorted(rect_cnts, key=cv2.contourArea, reverse=True)[:1]
Working Example
Step 1: Original Image
Step 2: After thresholding to remove the background.
Step 3: Finding contours to find rectangle box of the account number.
Failing example - light rectangular boundary.
Step 1: Read original image
Step 2: After thresholding to remove the background. Notice that the edges of the rectangle are not connected, because of which I am not able to get the contour out of it.
Step 3: Finding contours to find rectangle box of the account number.
import numpy as np
import cv2
import pytesseract as pt
from PIL import Image

# Run Main
if __name__ == "__main__":
    image = cv2.imread("image.jpg", -1)
    # resize image to speed up computation
    rows, cols, _ = image.shape
    image = cv2.resize(image, (np.int32(cols/2), np.int32(rows/2)))
    # convert to gray and binarize
    gray_img = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    binary_img = cv2.adaptiveThreshold(gray_img, 255, cv2.ADAPTIVE_THRESH_MEAN_C, cv2.THRESH_BINARY, 9, 9)
    # note: erosion and dilation work on a white foreground
    binary_img = cv2.bitwise_not(binary_img)
    # dilate the image to fill the gaps
    kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (3, 3))
    dilated_img = cv2.morphologyEx(binary_img, cv2.MORPH_DILATE, kernel, iterations=2)
    # find contours, discard contours which do not belong to a rectangle
    (_, cnts, _) = cv2.findContours(dilated_img, cv2.RETR_LIST, cv2.CHAIN_APPROX_SIMPLE)
    rect_cnts = []  # Rectangular contours
    for cnt in cnts:
        approx = cv2.approxPolyDP(cnt, 0.01 * cv2.arcLength(cnt, True), True)
        if len(approx) == 4:
            rect_cnts.append(cnt)
    # sort contours based on area
    rect_cnts = sorted(rect_cnts, key=cv2.contourArea, reverse=True)[:1]
    # find bounding rectangle of biggest contour
    box = cv2.boundingRect(rect_cnts[0])
    x, y, w, h = box[:]
    # extract rectangle from the original image
    newimg = image[y:y+h, x:x+w]
    # use 'pytesseract' to get the text in the new image
    text = pt.image_to_string(Image.fromarray(newimg))
    print(text)
    cv2.namedWindow('Image', cv2.WINDOW_NORMAL)
    cv2.imshow('Image', newimg)
    cv2.waitKey(0)
    cv2.destroyAllWindows()
result: 03541140011724
result: 34785736216
I have been searching for a technique to remove the background of any given image. The idea is to detect a face and remove the background around the detected face. I have finished the face part; now the background removal part remains.
I used this code:
import cv2
import numpy as np
#== Parameters
BLUR = 21
CANNY_THRESH_1 = 10
CANNY_THRESH_2 = 200
MASK_DILATE_ITER = 10
MASK_ERODE_ITER = 10
MASK_COLOR = (0.0,0.0,1.0) # In BGR format
#-- Read image
img = cv2.imread('SYxmp.jpg')
gray = cv2.cvtColor(img,cv2.COLOR_BGR2GRAY)
#-- Edge detection
edges = cv2.Canny(gray, CANNY_THRESH_1, CANNY_THRESH_2)
edges = cv2.dilate(edges, None)
edges = cv2.erode(edges, None)
#-- Find contours in edges, sort by area
contour_info = []
contours, _ = cv2.findContours(edges, cv2.RETR_LIST, cv2.CHAIN_APPROX_NONE)
for c in contours:
    contour_info.append((
        c,
        cv2.isContourConvex(c),
        cv2.contourArea(c),
    ))
contour_info = sorted(contour_info, key=lambda c: c[2], reverse=True)
max_contour = contour_info[0]
#-- Create empty mask, draw filled polygon on it corresponding to largest contour ----
# Mask is black, polygon is white
mask = np.zeros(edges.shape)
cv2.fillConvexPoly(mask, max_contour[0], (255))
#-- Smooth mask, then blur it
mask = cv2.dilate(mask, None, iterations=MASK_DILATE_ITER)
mask = cv2.erode(mask, None, iterations=MASK_ERODE_ITER)
mask = cv2.GaussianBlur(mask, (BLUR, BLUR), 0)
mask_stack = np.dstack([mask]*3) # Create 3-channel alpha mask
#-- Blend masked img into MASK_COLOR background
mask_stack = mask_stack.astype('float32') / 255.0
img = img.astype('float32') / 255.0
masked = (mask_stack * img) + ((1-mask_stack) * MASK_COLOR)
masked = (masked * 255).astype('uint8')
cv2.imshow('img', masked) # Display
cv2.waitKey()
cv2.imwrite("WTF.jpg",masked)
But this code only works for this image.
What should be changed in the code to make it work for different images?
Local Optimal Solution
# Original Code
CANNY_THRESH_2 = 200
# Change to
CANNY_THRESH_2 = 100
####### The change below is worth trying but not necessary
# Original Code
mask = np.zeros(edges.shape)
cv2.fillConvexPoly(mask, max_contour[0], (255))
# Change to
for c in contour_info:
    cv2.fillConvexPoly(mask, c[0], (255))
Effects
Test image: similar color of background, hair and skin.
Original output: [original output and original edges images]
Apply all contours rather than the max contour, with the same edge threshold: slightly better.
CANNY_THRESH_2 set to 100, apply all contours: much better, stronger edges.
CANNY_THRESH_2 set to 40, apply all contours: edges start to become not so sharp.
Reasoning
Program Behavior
The program searches for edges and builds contours. It takes the max contour and recognizes it as the human face, then applies the mask.
Problem
It is not easy to deal with similar colors between the background and the human face; blond hair and skin color make it hard to find the correct edges with the original threshold.
Using only the max contour means that when an image has large, pronounced features, like the scarf in the test image, it is easy to lose track of some areas. But it really depends on what kind of image you have after your face recognition process.