I am new to OpenCV and have a few questions. I need to detect a bottle or a can based on their shape. For this I am using a raspberry pi board and pi camera. The background is always black and does not change. I have tried many possible solutions to this problem but could not get satisfactory results. The things I have tried include edge detection, morphological transformations, matchShapes(), matchTemplate(). Please let me know if I can do this task efficiently and with maximum accuracy.
A sample image:
I came up with an approach that may help! If you know more about the can, e.g. its width-to-height ratio, you can make it more robust by adjusting the rectangle size.
Approach
Convert the image to HSV color space. Multiply V by a factor of 2 to make the content more visible.
Find the Sobel derivatives in the x and y directions. Compute the gradient magnitude with equal weight for both directions.
Threshold your image using Otsu's method.
Apply closing to your image.
Apply Canny edge detector.
Apply the probabilistic Hough line transform.
Find the bounding rectangle of your line image.
Superimpose it onto your image. (Finally done :P)
Code
import cv2
import numpy as np
import matplotlib.pyplot as plt

image = cv2.imread('image3.jpg', cv2.IMREAD_COLOR)
if image is None:
    print('Can not read/find the image.')
    exit(-1)

original = np.copy(image)
hsv_image = cv2.cvtColor(image, cv2.COLOR_BGR2HSV)
H,S,V = hsv_image[:,:,0], hsv_image[:,:,1], hsv_image[:,:,2]
V = np.clip(V.astype(np.int32) * 2, 0, 255).astype(np.uint8)  # avoid uint8 wrap-around when doubling V
hsv_image = cv2.merge([H,S,V])
image = cv2.cvtColor(hsv_image, cv2.COLOR_HSV2RGB)
image = cv2.cvtColor(image, cv2.COLOR_RGB2GRAY)
# plt.figure(), plt.imshow(image)
Dx = cv2.Sobel(image,cv2.CV_8UC1,1,0)
Dy = cv2.Sobel(image,cv2.CV_8UC1,0,1)
M = cv2.addWeighted(Dx, 1, Dy,1,0)
# plt.subplot(1,3,1), plt.imshow(Dx, 'gray'), plt.title('Dx')
# plt.subplot(1,3,2), plt.imshow(Dy, 'gray'), plt.title('Dy')
# plt.subplot(1,3,3), plt.imshow(M, 'gray'), plt.title('Magnitude')
ret, binary = cv2.threshold(M,10,255, cv2.THRESH_BINARY | cv2.THRESH_OTSU)
# plt.figure(), plt.imshow(binary, 'gray')
binary = binary.astype(np.uint8)
binary = cv2.morphologyEx(binary, cv2.MORPH_CLOSE, cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (20, 20)))
edges = cv2.Canny(binary, 50, 100)
# plt.figure(), plt.imshow(edges, 'gray')
lines = cv2.HoughLinesP(edges, 1, np.pi / 180, 50, minLineLength=20, maxLineGap=10)
output = np.zeros_like(M, dtype=np.uint8)
for line in lines:
    x1, y1, x2, y2 = line[0]
    cv2.line(output, (x1, y1), (x2, y2), (100, 200, 50), thickness=2)
# plt.figure(), plt.imshow(output, 'gray')
points = np.array([np.transpose(np.where(output != 0))], dtype=np.float32)
rect = cv2.boundingRect(points)
cv2.rectangle(original,(rect[1],rect[0]), (rect[1]+rect[3], rect[0]+rect[2]),(255,255,255),thickness=2)
original = cv2.cvtColor(original,cv2.COLOR_BGR2RGB)
plt.figure(), plt.imshow(original,'gray')
plt.show()
NOTE: you can uncomment the plotting lines to see the result of each step! I commented them out for the sake of readability.
Result
NOTE: If you know the aspect ratio of your can, you can fit the rectangle even better!
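For example, here is a rough sketch of how you could snap the detected box to a known width-to-height ratio (the 0.5 value below is just a made-up ratio for illustration):

# rect = (row_min, col_min, height, width) because the points above are (row, col)
row0, col0, box_h, box_w = rect
aspect = 0.5                                   # assumed width-to-height ratio of the can
new_w = int(box_h * aspect)
cx = col0 + box_w // 2                         # keep the box centred on the detection
cv2.rectangle(original, (cx - new_w // 2, row0), (cx + new_w // 2, row0 + box_h),
              (255, 255, 255), thickness=2)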
I hope that will help. Good Luck :)
I'm trying to clean the following image in order to perform edge and then polygon detection, so that I can extract the key buildings/features. Ideally I want to end up with an image where the contours around houses and buildings are extracted correctly. The input image is (the full sized image is given here)
Currently my processing has worked as follows:
Read in image and perform canny edge detection
Apply Gaussian and median blurs
Perform a probabilistic Hough transform
Draw the lines given from the Hough transform in red
Remove non-red lines, apply blurs to the red and perform contour detection
Which is done with the following code:
from PIL import Image
import numpy as np
import cv2

def read_tif(PATH):
    img = Image.open(PATH)
    # Turn into np array
    img = np.array(img, dtype=np.uint8) * 255
    return img

# Read in image (`here` and `params` are defined elsewhere in my setup)
img = read_tif(here("Images/" + params['img']))
dst = cv2.Canny(img, 50, 200, None, 3)
# Apply blurs
dst = cv2.GaussianBlur(dst, (5, 5), 0)
dst = cv2.medianBlur(dst, 3)
cdst = cv2.cvtColor(dst, cv2.COLOR_GRAY2BGR)
cdstP = np.copy(cdst)
# Probabilistic Hough Transform
linesP = cv2.HoughLinesP(dst, 1, np.pi / 180, 50, None, 50, 10)
# Draw lines from Hough
if linesP is not None:
    for i in range(0, len(linesP)):
        l = linesP[i][0]
        cv2.line(cdstP, (l[0], l[1]), (l[2], l[3]), (0, 0, 255), 6, cv2.LINE_AA)
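The last step (keeping only the red lines, blurring them and running contour detection) is not in the snippet above; a rough sketch of it, assuming the lines were drawn as pure red:

# keep only the pure-red Hough lines, blur them, then run contour detection
red_mask = cv2.inRange(cdstP, (0, 0, 254), (0, 0, 255))
red_mask = cv2.GaussianBlur(red_mask, (5, 5), 0)
red_mask = cv2.medianBlur(red_mask, 3)
# [-2] keeps this working on both OpenCV 3.x and 4.x return signatures
contours = cv2.findContours(red_mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)[-2]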
Unfortunately this method has not had much success: although the lines are detected well, many houses are returned as several irregular polygons:
(see full image here). I have tried playing around with various other methods like dilation (to increase the width of the house boundary lines), but these don't seem to improve the results and they amplify some of the noise in the image.
Any advice or help on methods/approaches that can help in improving these results is much appreciated, TIA!!
Problem:
I'm working with a dataset that contains many images that look something like this:
Now I need all these images to be oriented horizontally or vertically, such that the color palette is either at the bottom or the right side of the image. This can be done by simply rotating the image, but the tricky part is figuring out which images should be rotated and which shouldn't.
What I have tried:
I thought that the best way to do this is by detecting the white line that separates the color palette from the image. I decided to rotate all images that have the palette at the bottom so that they have it at the right side.
# yes I am mixing PIL and opencv (I like the PIL resizing more)
import PIL.Image
import numpy as np
import cv2

# resize image to be 128 by 128 pixels
img = img.resize((128, 128), PIL.Image.BILINEAR)
img = np.array(img)

# perform edge detection, not sure if these are the best parameters for Canny
edges = cv2.Canny(img, 30, 50, apertureSize=3)

has_line = False

# take numpy slice of the area where the white line usually is
# (not always exactly in the same spot, which probably has to do with the way I resize my image)
for line in edges[75:80]:
    # check if most of one of the lines contains white pixels
    counts = np.bincount(line)
    if np.argmax(counts) == 255:
        has_line = True

# rotate if we found such a line
if has_line == True:
    s = np.rot90(s)
An example of it working correctly:
An example of it working incorrectly:
This works maybe on 98% of images but there are some cases where it will rotate images that shouldn't be rotated or not rotate images that should be rotated. Maybe there is an easier way to do this, or maybe a more elaborate way that is more consistent? I could do it manually but I'm dealing with a lot of images. Thanks for any help and/or comments.
Here are some images where my code fails for testing purposes:
You can start by thresholding your image with a very high threshold like 250, to take advantage of the fact that your lines are white. This will make the whole background black. Now create a special horizontal kernel with a shape like (1, 15) and erode your image with it. What this will do is remove the vertical lines from the image so that only the horizontal lines are left.
import cv2
import numpy as np
img = cv2.imread('horizontal2.jpg')
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
_, thresh = cv2.threshold(gray, 250, 255, cv2.THRESH_BINARY)
kernel_hor = np.ones((1, 15), dtype=np.uint8)
erode = cv2.erode(thresh, kernel_hor)
As stated in the question, the color palettes can only be on the right or at the bottom. So we can check how many contours the right region has. For this, just divide the image in half and take the right part. Before finding contours, dilate the result with a normal (3, 3) kernel to fill in any gaps. Using cv2.RETR_EXTERNAL, find the contours and count how many we have found; if the count is greater than a certain number, the image is the correct side up and there is no need to rotate.
right = erode[:, erode.shape[1]//2:]
kernel = np.ones((3, 3), dtype=np.uint8)
right = cv2.dilate(right, kernel)
cnts, _ = cv2.findContours(right, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
if len(cnts) > 3:
    print('No need to rotate')
else:
    print('rotate')
    # ADD YOUR ROTATE CODE HERE
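For the rotation itself, the np.rot90 call from the question should be enough, for example:

img = np.rot90(img).copy()          # rotate by 90 degrees, as in the question's code
cv2.imwrite('rotated.jpg', img)     # placeholder output path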
P.S. I tested it on all four images you provided and it worked well. If it does not work for any image, let me know.
I have downloaded a number of images (1000) from a website but they each have a black and white ruler running along 1 or 2 edges and some have these catalogue number tickets. I need these elements removed, the ruler at the very least.
Example images of coins:
The images all have the ruler in slightly different places, so I can't just perform the same crop on them.
So I tried to remove the black and replace it with white using this code
from PIL import Image
import numpy as np
import matplotlib.pyplot as plt
im = Image.open('image-0.jpg')
im = im.convert('RGBA')
data = np.array(im) # "data" is a height x width x 4 numpy array
red, green, blue, alpha = data.T # Temporarily unpack the bands for readability
# Replace black with white
black_areas = (red < 150) & (blue < 150) & (green < 150)
data[..., :-1][black_areas.T] = (255, 255, 255) # Transpose back needed
im2 = Image.fromarray(data)
im2.show()
but it pretty much just removed half the coin as well:
I have been reading some posts on OpenCV, but thought I'd first see if there was a simpler way I'd missed.
So I have taken a look at your problem and found a solution for the two images you provided. I hope it works for your other images as well, but it is always hard to tell, as every image can be a bit different. This solution uses OpenCV for preprocessing and contour detection to get the 2nd and 3rd largest elements in your picture (the largest is the bounding box around the edges), which should be your coins. Then I create a box around those two items and add some padding before cropping to size.
So we start off with preprocessing:
import numpy as np
import cv2
img = cv2.imread(r'<PATH TO YOUR IMAGE>')
img = cv2.resize(img, None, fx=3, fy=3)
imgray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
blur = cv2.GaussianBlur(imgray, (5, 5), 0)
ret, thresh = cv2.threshold(blur, 0, 255, cv2.THRESH_BINARY+cv2.THRESH_OTSU)
Still rather basic: we make the image bigger so it is easier to detect contours, then we convert it to grayscale, blur it and apply thresholding so that all grey values become either white or black. This gives us the following image:
We now do contour detection, get the areas of our contours and sort them by area, largest first. Then we drop the biggest one, as it is the box around the whole image, and take the 2nd and 3rd biggest. Then we get the x, y, w, h values we are interested in.
contours, hierarchy = cv2.findContours(
    thresh, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)

areas = []
for cnt in contours:
    area = cv2.contourArea(cnt)
    areas.append((area, cnt))

areas.sort(key=lambda x: x[0], reverse=True)
areas.pop(0)

x, y, w, h = cv2.boundingRect(areas[0][1])
x2, y2, w2, h2 = cv2.boundingRect(areas[1][1])
If we draw a rectangle around those contours:
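The drawing is only for checking, but if you want to reproduce it, something like this works:

# quick visual check: draw the two bounding boxes we just found
check = img.copy()
cv2.rectangle(check, (x, y), (x + w, y + h), (0, 255, 0), 2)
cv2.rectangle(check, (x2, y2), (x2 + w2, y2 + h2), (0, 255, 0), 2)
cv2.imshow('boxes', check)
cv2.waitKey(0)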
Now we take those coordinates and create a box around both of them. This might need some minor adjusting, as I just quickly took the bigger width of the two rather than the one corresponding to the right coin, but since I added extra padding it should be fine in most cases. Finally, we crop to size:
pad = 15
img = img[(min(y, y2) - pad):(max(y, y2) + max(h, h2) + pad),
          (min(x, x2) - pad):(max(x, x2) + max(w, w2) + pad)]
I hope this helps you understand how you could achieve what you want. I tried it on both of your images and it worked well for them. It might need some adjustments, and depending on how your other images look, the simple approach of taking the two biggest objects (apart from the image bounding box) might need to be turned into something more sophisticated that detects the circular shapes, or something along those lines. Alternatively, you could try to detect the rulers and crop inwards from their position. You will have to decide after you have tried this on more example images from your dataset.
If you're looking for a robust solution, you should try something like Max Kaha's response, since it'll provide you with greater fine tuning.
Since the rulers tend to be left with just a little bit of text after your "black to white" filter, a quick solution is to use erosion followed by a dilation to create a mask for your images, and then apply the mask to the original image.
Pillow offers that with the ImageFilter class. Here's your code with a few modifications that'll achieve that:
from PIL import Image, ImageFilter
import numpy as np
import matplotlib.pyplot as plt
WHITE = 255, 255, 255
input_image = Image.open('image.png')
input_image = input_image.convert('RGBA')
input_data = np.array(input_image) # "input_data" is a height x width x 4 numpy array
red, green, blue, alpha = input_data.T # Temporarily unpack the bands for readability
# Replace black with white
thresh = 30
black_areas = (red < thresh) & (blue < thresh) & (green < thresh)
input_data[..., :-1][black_areas.T] = WHITE # Transpose back needed
erosion_factor = 5
# dilation is bigger to avoid cropping the objects of interest
dilation_factor = 11
erosion_filter = ImageFilter.MaxFilter(erosion_factor)
dilation_filter = ImageFilter.MinFilter(dilation_factor)
eroded = Image.fromarray(input_data).filter(erosion_filter)
dilated = eroded.filter(dilation_filter)
mask_threshold = 220
# the mask is black on regions to be hidden
mask = dilated.convert('L').point(lambda x: 255 if x < mask_threshold else 0)
# create base image
output_image = Image.new('RGBA', input_image.size, WHITE)
# paste only the desired regions
output_image.paste(input_image, mask=mask)
output_image.show()
You should also play around with the black to white threshold and the erosion/dilation factors to try and find the best fit for most of your images.
I need to detect the vein junctions of bee wings (the image is just one example). I am using opencv-python.
PS: the image may have lost a little bit of quality, but the lines are all connected and one pixel wide.
This is an interesting question. The result I got is not perfect, but it might be a good start. I filtered the image with a kernel that only looks at its own border pixels. The idea is that a junction has at least 3 lines crossing the kernel border, whereas a regular line only has 2. This means that when the kernel is over a junction, the resulting value will be higher, so a threshold will reveal them.
Due to the nature of the lines there are some false positives and some false negatives. A single junction will most likely be found several times, so you'll have to account for that. You can make the detections unique by drawing small dots and detecting those dots (a sketch of this is included after the code below).
Result:
Code:
import cv2
import numpy as np
# load the image as grayscale
img = cv2.imread('xqXid.png',0)
# make a copy to display result
im_or = img.copy()
# no explicit cast is needed here: filter2D below is told to output CV_16S (ddepth=3),
# so filtered values above 255 are preserved
# create kernel
kernel = np.ones((7,7))
kernel[2:5,2:5] = 0
print(kernel)
# apply kernel (ddepth=3 is CV_16S, so the filtered values can exceed 255)
res = cv2.filter2D(img, 3, kernel)
# filter results
loc = np.where(res > 2800)
print(len(loc[0]))
#draw circles on found locations
for x in range(len(loc[0])):
    cv2.circle(im_or, (loc[1][x], loc[0][x]), 10, (127), 5)
#display result
cv2.imshow('Result',im_or)
cv2.waitKey(0)
cv2.destroyAllWindows()
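As mentioned above, a single junction is usually found several times. A rough sketch of one way to merge nearby detections, by drawing filled dots and keeping one centroid per blob (the dot radius is arbitrary):

# draw a filled dot at every raw detection so overlapping hits merge into one blob
mask = np.zeros(img.shape, dtype=np.uint8)
for x in range(len(loc[0])):
    cv2.circle(mask, (loc[1][x], loc[0][x]), 10, 255, -1)
# one centroid per blob = one unique junction
num, labels, stats, centroids = cv2.connectedComponentsWithStats(mask)
for cx, cy in centroids[1:]:        # skip label 0, the background
    cv2.circle(im_or, (int(cx), int(cy)), 10, (127), 5)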
Note: you can try to tweak the kernel and the threshold. For example, with the code above I got 126 matches. But when I use
kernel = np.ones((5,5))
kernel[1:4,1:4] = 0
with threshold
loc = np.where(res > 1550)
I got 33 matches in these locations:
You can use the Harris corner detector algorithm to detect the vein junctions in the above image. Compared to earlier techniques, the Harris corner detector takes the differential of the corner score with respect to direction into account directly, instead of using shifting patches for every 45-degree angle, and has been shown to be more accurate in distinguishing between edges and corners (source: Wikipedia).
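For reference, the corner response that cv2.cornerHarris computes at each pixel is R = det(M) - k * (trace(M))^2, where M is the local structure tensor built from the Sobel derivatives over the blockSize neighbourhood and k is the free parameter passed to the function.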
code:
import cv2
import numpy as np

img = cv2.imread('wings-bee.png')
# convert image to grayscale
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
gray = np.float32(gray)
'''
args:
img - Input image, it should be grayscale and float32 type.
blockSize - It is the size of neighbourhood considered for corner detection
ksize - Aperture parameter of Sobel derivative used.
k - Harris detector free parameter in the equation.
'''
dst = cv2.cornerHarris(gray, 9, 5, 0.04)
# result is dilated for marking the corners
dst = cv2.dilate(dst,None)
# Threshold for an optimal value, it may vary depending on the image.
img_thresh = cv2.threshold(dst, 0.32*dst.max(), 255, 0)[1]
img_thresh = np.uint8(img_thresh)
# get the matrix with the x and y locations of each centroid
centroids = cv2.connectedComponentsWithStats(img_thresh)[3]
stop_criteria = (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 30, 0.001)
# refine corner coordinates to subpixel accuracy
corners = cv2.cornerSubPix(gray, np.float32(centroids), (5,5), (-1,-1), stop_criteria)
for i in range(1, len(corners)):
    #print(corners[i])
    cv2.circle(img, (int(corners[i,0]), int(corners[i,1])), 5, (0,255,0), 2)
cv2.imshow('img', img)
cv2.waitKey(0)
cv2.destroyAllWindows()
output:
You can check the theory behind Harris Corner detector algorithm from here.
I am new to OpenCV, and started learning it by extracting characters from a simple captcha.
After some effort, I got findContours working along with some methods to clean the image; it sometimes works, but not very often.
For example:
I have an original image (already scaled to a large size):
convert to grayscale and clean it with cv2.threshold:
use cv2.findContours to get bounding boxes:
The box around W only covers half of it, and b is not detected at all.
My code:
from StringIO import StringIO
import string
from PIL import Image
import requests
import cv2
import numpy as np
import matplotlib.pyplot as plt
def get_ysdm_captcha():
    url = 'http://www.ysdm.net/common/CleintCaptcha'
    r = requests.get(url)
    img = Image.open(StringIO(r.content))
    return img

def scale_image(img, ratio):
    return img.resize((int(img.width*ratio), int(img.height*ratio)))

def draw_rect(im):
    im = np.array(im)
    if len(im.shape) == 3 and im.shape[2] == 3:
        imgray = cv2.cvtColor(im, cv2.COLOR_BGR2GRAY)
    else:
        imgray = im

    #plt.imshow(Image.fromarray(imgray), 'gray')
    pilimg = Image.fromarray(imgray)
    ret, thresh = cv2.threshold(imgray, 127, 255, 0)

    threimg = Image.fromarray(thresh)

    plt.figure(figsize=(4,3))
    plt.imshow(threimg, 'gray')
    plt.xticks([]), plt.yticks([])

    contours, hierarchy = cv2.findContours(np.array(thresh), cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)
    areas = []

    for c in contours:
        rect = cv2.boundingRect(c)
        area = cv2.contourArea(c)
        areas.append(area)
        x, y, w, h = rect

        if area > 2000 or area < 200: continue

        cv2.rectangle(thresh, (x,y), (x+w,y+h), (0,255,0), 1)
        plt.figure(figsize=(1,1))
        plt.imshow(threimg.crop((x,y,x+w,y+h)), 'gray')
        plt.xticks([]), plt.yticks([])

    plt.figure(figsize=(10,10))
    plt.figure()
    plt.imshow(Image.fromarray(thresh), 'gray')
    plt.xticks([]), plt.yticks([])
image = get_ysdm_captcha()
im = scale_image(image, 3)
im = np.array(im)
imgray = cv2.cvtColor(im, cv2.COLOR_BGR2GRAY)
imgray = cv2.GaussianBlur(imgray,(5,5),0)
# im = cv2.medianBlur(imgray,9)
# im = cv2.bilateralFilter(imgray,9,75,75)
draw_rect(imgray)
I tried my best to write the code above.
The solutions I imagine are:
find a way to tell cv2.findContours that I need 4 bounding boxes of roughly a certain size
try some different parameters (I tried everything from http://docs.opencv.org/2.4/modules/imgproc/doc/structural_analysis_and_shape_descriptors.html?highlight=findcontours#findcontours , but it still does not work)
Now I'm stuck and have no idea how to improve the cv2.findContours results...
You can use morphological operations to modify your image and fill the gaps, for example erosion and dilation.
See here:
http://docs.opencv.org/2.4/doc/tutorials/imgproc/erosion_dilatation/erosion_dilatation.html
Original:
Dilated:
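A rough sketch of that idea, reusing the grayscale image from the question (kernel size and iteration count are guesses):

ret, thresh = cv2.threshold(imgray, 127, 255, cv2.THRESH_BINARY)
kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (3, 3))
# the characters are black on white here, so erode the white background to thicken them;
# use cv2.dilate instead if your foreground is white on black
grown = cv2.erode(thresh, kernel, iterations=2)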
By the way: I would implement an HSV separation step on the original image, removing all the 'white/grey/black' content (low saturation). This will reduce the number of specks. Do this before converting to grayscale.
Here is the result of filtering on saturation > 90:
Final result (with a blur step added beforehand):
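A rough sketch of that saturation filter (the 90 threshold is the one mentioned above; the filename and blur size are placeholders):

import cv2
import numpy as np

img = cv2.imread('captcha.png')                   # placeholder filename
hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)
mask = hsv[:, :, 1] > 90                          # keep only saturated (colored) pixels
cleaned = np.full_like(img, 255)                  # white background
cleaned[mask] = img[mask]
cleaned = cv2.GaussianBlur(cleaned, (3, 3), 0)    # blur step before grayscale/threshold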
Also, if there is always a gradient, you could detect it and filter out even more colors. But that is a bit much if you have just started image processing ;)
findContours works properly, as it finds all connected components of your image. Your area condition is probably what is preventing you from getting a bounding box around the letter b, for instance. Of course, if you put a bounding box around each connected component you will not end up with a bounding box around each character, because you have many holes in your letters.
If you want to segment the letters, I would first try to play with opening operations (because your letters are black on a white background; it would be closing if it were the opposite) in order to fill the holes that you have in your letters. Then I would project the pixels vertically and analyze the shape that you get. If you find the valley points in this projected shape, you get the vertical limits between characters. You can do the same horizontally to get the upper and lower limits of your chars. This approach will only work if the text is horizontal. If it is not, you should find the main axis angle of your string and rotate the image accordingly. To find the main axis angle you could fit an ellipse to your text and find its main axis angle, or you could keep rotating your image by a certain angle until your horizontal projection is maximal. A sketch of the projection idea follows below.
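A minimal sketch of the vertical projection idea, assuming the binary image from the question (black characters on a white background) and an arbitrary valley threshold:

import cv2
import numpy as np

# `thresh` is the binary image from the question: black characters on white
# opening removes the small white holes inside the letters
kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (3, 3))
opened = cv2.morphologyEx(thresh, cv2.MORPH_OPEN, kernel)

# vertical projection: number of character (black) pixels in each column
projection = (opened == 0).sum(axis=0)

# valleys (columns with almost no character pixels) separate the characters
valley = projection < 2                            # assumed threshold
cuts = np.where(np.diff(valley.astype(int)) != 0)[0]
print(cuts)                                        # x positions of character boundaries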