Is it possible to detect the upper side of a dice? While this will be an easy task if you look from the top, from many perspectives multiple sides are visible.
Here is an example of a dice, feel free to take your own pictures:
You usually want to know the score you have achieved. It is easy for me to extract ALL dots, but how do I extract only those on the top? In this special case, the top side is the largest, but this might not always be true. I am looking for something which evaluates the distortion of the top square (or circle in this case, which I can extract) in relation to the perspective given by the grid at the bottom.
Example program with some results is given below.
import numpy as np
import cv2
img = cv2.imread('dice.jpg')
# Colour range to be extracted
lower_blue = np.array([0,0,0])
upper_blue = np.array([24,24,24])
# Threshold the BGR image
dots = cv2.inRange(img, lower_blue, upper_blue)
# Colour range to be extracted
lower_blue = np.array([0,0,0])
upper_blue = np.array([226,122,154])
# Threshold the BGR image
upper_side_shape = cv2.inRange(img, lower_blue, upper_blue)
cv2.imshow('Upper side shape',upper_side_shape)
cv2.imshow('Dots',dots)
cv2.waitKey(0)
cv2.destroyAllWindows()
Some resulting images:
The best solution is dot size, which I mentioned in the comment. You find the largest dot, consider it as max, and then create a tolerance level.
But what if all dots are nearly equal (viewing from an edge-on angle that makes things look equidistant), or even too small? The best solution for that is creating a boundary to capture the dots. This requires analysis of the dice's edges (edge detection, basically), but once you define the boundary, you're solid.
All you need is to capture the edges of the dice from the perspective you're seeing.
Here's a visual example:
Since you have a virtual boundary set, you'll simply measure dots above a specific point on the y-axis.
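Going back to the dot-size idea from the first paragraph, a minimal sketch could look like this (assuming `dots` is the binary pip mask from the question, the OpenCV 4.x findContours return convention, and a 70% tolerance picked arbitrarily to tune):
import cv2

# Keep only pips whose area is within tolerance of the largest pip
contours, _ = cv2.findContours(dots, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
areas = [cv2.contourArea(c) for c in contours]
biggest = max(areas)
top_pips = [c for c, a in zip(contours, areas) if a > 0.7 * biggest]
print('Estimated score:', len(top_pips))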
The dot size is a good heuristic, but I would also add the dot roundness: if you compute the second-order image moments of the binarized dots, the more similar the x and y moments are, the rounder the figure. This will of course fail, like the size cue, for a side view, but then what does "top side" really mean if you can't sense gravity?
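As a rough illustration of that roundness cue (again assuming `dots` is the binarized pip mask from the question; names and thresholds are mine):
import cv2

# Roundness from second-order central moments of each pip contour
contours, _ = cv2.findContours(dots, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
for c in contours:
    m = cv2.moments(c)
    if m['m00'] == 0:
        continue
    mu20, mu02 = m['mu20'] / m['m00'], m['mu02'] / m['m00']
    roundness = min(mu20, mu02) / max(mu20, mu02)   # ~1.0 when x and y moments agree
    print(roundness)
Pips on the top face should stay close to 1.0, while strongly foreshortened pips score lower (an ellipse tilted 45° can fool this ratio; checking mu11 as well would catch that).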
Why try to chop up the image at all? Based on which numbers you see on the sides, you can infer what number is on top. The side numbers can also be used as a check to validate your guess.
Note that you'll have to be careful about handedness (see http://mathworld.wolfram.com/Dice.html).
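Purely as an illustration of that reasoning (not part of the original answer): opposite faces of a standard die sum to 7, so two readable side faces already narrow the top face down to one of two values, and the remaining ambiguity is exactly the handedness issue above.
def top_bottom_candidates(side1, side2):
    # The four side faces are {side1, 7-side1, side2, 7-side2};
    # whatever is left over is the top/bottom pair.
    side_faces = {side1, 7 - side1, side2, 7 - side2}
    return [v for v in range(1, 7) if v not in side_faces]

print(top_bottom_candidates(3, 5))   # -> [1, 6]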
Related
I'm trying to create a mask. I have a database of images similar to this one.
INPUT IMAGE
CODE
import cv2
import numpy as np
img = cv2.imread('sample1.png', cv2.IMREAD_UNCHANGED)
gray = cv2.cvtColor(img,cv2.COLOR_RGB2GRAY)
#img_ = cv2.threshold(gray,100,225,cv2.THRESH_BINARY)
edges = cv2.Canny(gray, 250, 250)
cv2.imwrite('output.png',edges)
OUTPUT
How can I remove the inner border and fill it with white?
Result I Want
Well, there are many ways to do that. All of them need some tuning, depending on your image.
There is, for example, a floodfill function in opencv.
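For reference, the floodFill variant is roughly this kind of sketch (not the method developed below; it assumes `edges` is the Canny output from the question and that (0, 0) lies outside the character, and gaps in the edges must be closed first, e.g. by a dilation, or the fill leaks inside):
import cv2
import numpy as np

# Flood the outside from a corner seed; anything the fill cannot reach is "inside"
flood = edges.copy()
mask = np.zeros((edges.shape[0] + 2, edges.shape[1] + 2), np.uint8)
cv2.floodFill(flood, mask, (0, 0), 255)
filled = cv2.bitwise_or(edges, cv2.bitwise_not(flood))
cv2.imwrite('filled_floodfill.png', filled)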
But the easiest is probably to use some mathematical morphology and then connected components, because from the connected components it is easier to adjust the result if needed.
We can start by having a binary version of your edges
binedge=(edges>0).astype(np.uint8)
Once this is done, since there are "holes" in it, we need to fill those holes so that the edges strictly separate the inside from the outside. This can be done with a dilation:
ker=np.ones((3,3))
fatedge=cv2.dilate(binedge, ker)
Then we want to find the inside. That is not easy, because there might be many parts to that inside. So the easiest way is probably to find the outside and invert it, though there could also be several outside parts if the character touches the border in different places.
So, let's start to find all connected black parts of this picture.
n,comp=cv2.connectedComponents((fatedge==0).astype(np.uint8))
comp here is an image whose values are the indices of the connected components, shown here with random colors for each index.
Let's assume that the outside is connected and that (0,0) is in it (it is almost always the case, and it is here; if not, you'll have to find a more complex criterion, such as "the biggest component", or even merge different parts). The component we are interested in is the one that contains (0,0), that is, the pixels of comp that have the same value as comp[0,0]. And in fact, what we are interested in is the opposite of that: what is inside. We compute the outside only because it is easier. The inside is what is not outside, that is, pixels where comp != comp[0,0].
filled=(comp!=comp[0,0]).astype(np.uint8)
Last stage (not really necessary from an aesthetic point of view, but strictly speaking it is needed): since we dilated the edges at the beginning, this picture is a few pixels bigger than it should be. We can erode it back now that we have what we want:
output=cv2.erode(filled, ker)*255
cv2.imwrite('output.png',output)
So, all together
import cv2
import numpy as np
img = cv2.imread('Downloads/93Lwd.png', cv2.IMREAD_UNCHANGED)
gray = cv2.cvtColor(img,cv2.COLOR_RGB2GRAY)
#img_ = cv2.threshold(gray,100,225,cv2.THRESH_BINARY)
edges = cv2.Canny(gray, 250, 250)
# Binarize edges
binedge=(edges>0).astype(np.uint8)
# Remove edges too close to the left and right borders
binedge[:,:20]=0
binedge[:,-20:]=0
# Fatten them so that there is no hole
ker=np.ones((3,3))
fatedge=cv2.dilate(binedge, ker)
# Find connected black areas
n,comp=cv2.connectedComponents((fatedge==0).astype(np.uint8))
# comp is an image where each value is the index of a connected component
# Assuming that point (0,0) is in the border, not inside the character, the border is where comp == comp[0,0]
# So the character is where it is not
# Or, as a variant for the new image: consider as "outside" any part that touches the left, right, or top border
# Note: this is redundant with the earlier zeroing of the left and right borders
# Set of all components touching the left, right or top border
listOutside=set(comp[:,0]).union(comp[:,-1]).union(comp[0,:])
if 0 in listOutside: listOutside.remove(0) # 0 is the edge lines, i.e. what is False in fatedge==0
filled=(~np.isin(comp, list(listOutside))).astype(np.uint8) # isin needs an array or list, not a set
# Just to be extra accurate, since we dilated the edges earlier, we can now erode the result
output=cv2.erode(filled, ker)
cv2.imwrite('output.png',output*255)
I need to find the largest empty area in the document and display its coordinates, center point, and area, using Python, so that I can put a QR code there.
I think OpenCV and Numpy should be enough for this task.
What kind of THRESH should I use? There are a lot of types of scans: gray, BW, and with color. And how do I find the contour properly? How can this be implemented in the fastest way? An example using the first scan from Google is attached, where you can see that the code should find the largest empty square area.
@Mark Setchell Thanks! This code works perfectly for all docs with a white background, but when I use something with color in the background it finds a completely different area. Also, to keep thin lines in the docs I used erode after thresholding. I have tried changing the thresholding and erode parameters, but it is still not working properly.
I have edited the post and added color pictures.
Here's a possible approach:
#!/usr/bin/env python3
import cv2
import numpy as np
def largestSquare(im):
    # Make image square of 100x100 to simplify and speed up
    s = 100
    work = cv2.resize(im, (s,s), interpolation=cv2.INTER_NEAREST)
    # Make output accumulator - uint16 is ok because...
    # ... max value is 100x100, i.e. 10,000 which is less than 65,535
    # ... and you can make a PNG of it too
    p = np.zeros((s,s), np.uint16)
    # Find largest square
    for i in range(1, s):
        for j in range(1, s):
            if (work[i][j] > 0):
                p[i][j] = min(p[i][j-1], p[i-1][j], p[i-1][j-1]) + 1
            else:
                p[i][j] = 0
    # Save result - just for illustration purposes
    cv2.imwrite("result.png", p)
    # Work out what the actual answer is
    ind = np.unravel_index(np.argmax(p, axis=None), p.shape)
    print(f'Location: {ind}')
    print(f'Length of side: {p[ind]}')
# Load image and threshold
im = cv2.imread('page.png', cv2.IMREAD_GRAYSCALE)
_, thr = cv2.threshold(im,127,255,cv2.THRESH_BINARY | cv2.THRESH_OTSU)
# Get largest white square
largestSquare(thr)
Output
Location: (21, 77)
Length of side: 18
Notes:
I edited out your red annotation so it didn't interfere with my algorithm.
I did Otsu thresholding to get pure black and white - that may or may not be appropriate to your use case. It will depend on your scans and paper background etc.
I scaled the image down to 100x100 so it doesn't take all day to run. You will need to scale the results back up to the size of your original image but I assume you can do that easily enough.
Keywords: Image processing, image, Python, OpenCV, largest white square, largest empty space.
A chem student asked me for help with image segmentation and plotting:
A stationary camera takes a picture of the experimental setup every second over a period of a few minutes, so it yields around 300 images.
The relevant parts of the setup are two adjacent layers of differently colored foams observed from the side: basically a two-color sandwich shrinking from both sides, except that one of the foams evaporates a bit faster.
I'd like to segment each of the images in the way that would let me plot both foam regions' "width" against time.
Here is a "diagram" :)
I want to go from here --> To here
Ideally, given a few hundred of such shots, in which only the widths change, I get an array of scalars back that I can plot. (Going to look like a harmonic series on either side of the x-axis)
I have a bit of Python and MATLAB experience, but I have never used OpenCV or the Image Processing Toolbox in MATLAB, and actually have never dealt with any computer vision at all. Could you throw me a rough roadmap of what packages/functions to use or steps to take, and I'll take it from there?
I'm not sure how to address these things:
- selecting at which position along the length the algorithm measures the width (i.e. if the foams are a bit uneven), although this can be ignored;
- which library to use to segment regions of the image based on their color (some k-means shenanigans, probably), and how to selectively store the spatial parameters of the resulting segments;
- how to iterate the above over a number of files.
Thank you kindly in advance!
Assuming your intensities will be different after converting to grayscale (if not, just convert to another color space like HSV or Lab, and then just use one of the components):
img = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
First, threshold your grayscale input into a few bands:
ret,thresh1 = cv2.threshold(img,128,255,cv2.THRESH_BINARY)
ret,thresh2 = cv2.threshold(img,27,255,cv2.THRESH_BINARY_INV)
ret,thresh3 = cv2.threshold(img,77,255,cv2.THRESH_TRUNC)
ret,thresh4 = cv2.threshold(img,97,255,cv2.THRESH_TOZERO)
ret,thresh5 = cv2.threshold(img,227,255,cv2.THRESH_TOZERO_INV)
The values should be tested against your actual data; the ones here are just an example.
Clean up the segmented images using a median filter with a kernel of 9 or larger, as I do expect some noise. You can also use an ROI here to help remove part of the noise, but personally I'm lazy and just wrote the program to handle all cases and angles.
thresholded_images_aftersmoothing = cv2.medianBlur(thresholded_images, 9)
Each band corresponds to one color (layer). Now you should have N segmented images from one source, where N is the number of layers you wish to track.
Second, use the OpenCV function boundingRect to find the location and width/height of each layer, i.e. of each thresholded_images_aftersmoothing. E.g. run boundingRect on each sub-segmented image.
C++: Rect boundingRect(InputArray points)
Python: cv2.boundingRect(points) → retval
Last, the rect has x, y, height and width properties. You can use a simple sort to order the layers from top to bottom based on the rect attribute x. Run through the whole video to obtain the x (layer id) and height vs. time graph.
Rect API
Public Attributes
_Tp **height** // this is what you are looking for
_Tp width
_Tp **x** // this tells you the position of the band
_Tp y
By plotting the corresponding heights (|AB| or |CD|) over time, you can obtain the graph you need.
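Something like this minimal sketch for one frame (variable names such as `band_masks` are mine, not part of the answer):
import cv2

def layer_rects(band_masks):
    """band_masks: one smoothed binary mask per foam layer for a single frame."""
    rects = []
    for mask in band_masks:
        contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
        biggest = max(contours, key=cv2.contourArea)     # ignore leftover noise blobs
        rects.append(cv2.boundingRect(biggest))          # (x, y, w, h)
    # Sort by x so each layer keeps a consistent id from frame to frame
    return sorted(rects, key=lambda r: r[0])
Calling this once per frame and collecting the height (the fourth element of each rect) per layer gives the series to plot.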
A more correct way would be to use a Kalman filter to track the position and height, as I would expect some bubbles to occur and interfere with the height of the layers.
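If you go that route, a bare-bones version could look like the sketch below: a 1-D constant-velocity Kalman filter smoothing the measured height of one layer across frames. The noise parameters are guesses you would tune, and `measured_heights` is an assumed list of per-frame heights from boundingRect.
import numpy as np
import cv2

kf = cv2.KalmanFilter(2, 1)                        # state = [height, height_rate]
kf.transitionMatrix = np.array([[1, 1], [0, 1]], np.float32)
kf.measurementMatrix = np.array([[1, 0]], np.float32)
kf.processNoiseCov = np.eye(2, dtype=np.float32) * 1e-3
kf.measurementNoiseCov = np.array([[5.0]], np.float32)
kf.errorCovPost = np.eye(2, dtype=np.float32)

smoothed = []
for h in measured_heights:                         # one raw height per frame
    kf.predict()
    est = kf.correct(np.array([[h]], np.float32))
    smoothed.append(float(est[0, 0]))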
To be honest, I didn't expect a chem student to be doing this. Haha, good luck!
If anything goes wrong you can find me here, or email me if I'm not watching Stack Overflow.
You can select a region of interest straight down the middle of the foams, a few pixels wide. If you stack these regions for each image, it will show the shrinkage over time.
If, for example, you use a 3-pixel-wide ROI, the result for 300 images will be a 900-pixel-wide image, where the left is the start of the experiment and the right is the end. The following image can help you understand:
Though I have not fully tested it, this code should work. Note that there must only be images in the folder you reference.
import cv2
import numpy as np
import os
# path to folder that holds the images
path = '.'
# dimensions of roi
x = 0
y = 0
w = 3
h = 100
# store references to all images
all_images = os.listdir(path)
# sort images
all_images.sort()
# create empty result array
result = np.empty([h,0,3],dtype=np.uint8)
for image in all_images:
    # load image
    img = cv2.imread(path+'/'+image)
    # get the region of interest
    roi = img[y:y+h,x:x+w]
    # add the roi to previous results
    result = np.hstack((result,roi))

# optional: save result as image
# cv2.imwrite('result.png',result)
# display result - can also plot with matplotlib
cv2.imshow('Result', result)
cv2.waitKey(0)
cv2.destroyAllWindows()
Update after question edit:
If the foams have different colors, you can easily separate them by color by converting the image to HSV and using inRange (example). This creates a mask (a 2D array with values from 0-255, one for each pixel) that you can use to calculate the average height and extract the parameters and area of the foam in the image.
You can find a script to help you find the HSV colors for separation in this GitHub repository.
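For instance, a rough sketch of that masking step (the file name and HSV bounds are placeholders you would replace with values from the helper script):
import cv2
import numpy as np

img = cv2.imread('frame.png')                       # placeholder file name
hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)
# Placeholder bounds for one foam color - pick real values with the helper script
mask = cv2.inRange(hsv, np.array([0, 50, 50]), np.array([10, 255, 255]))
# Mean height of that foam: count masked pixels per column, then average
heights = (mask > 0).sum(axis=0)
print('mean height in pixels:', heights.mean())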
I have found a plethora of questions about finding "things" in images using OpenCV et al. in Python, but so far I have been unable to piece them together into a reliable solution to my problem.
I am attempting to use computer vision to help count tiny surface mount electronics parts. The idea is for me to dump parts onto a solid color piece of paper, snap a picture, and have the software tell me how many items are in it.
The "things" differ from one picture to the next but will always be identical in any one image. I seem to be able to manually tune the parameters for things like hue/saturation for a particular part but it tends to require tweaking every time I change to a new part.
My current, semi-functioning code is posted below:
import imutils
import numpy
import cv2
import sys
def part_area(contours, round=10):
    """Finds the mode of the contour area. The idea is that most of the parts in an image will be separated and that
    finding the most common area in the list of areas should provide a reasonable value to approximate by. The areas
    are rounded to the nearest multiple of 200 to reduce the list of options."""
    # Start with a list of all of the areas for the provided contours.
    areas = [cv2.contourArea(contour) for contour in contours]
    # Determine a threshold for the minimum amount of area as 1% of the overall range.
    threshold = (max(areas) - min(areas)) / 100
    # Trim the list of areas down to only those that exceed the threshold.
    thresholded = [area for area in areas if area > threshold]
    # Round the areas to the nearest value set by the round argument.
    rounded = [int((area + (round / 2)) / round) * round for area in thresholded]
    # Remove any areas that rounded down to zero.
    cleaned = [area for area in rounded if area != 0]
    # Count the areas with the same values.
    counts = {}
    for area in cleaned:
        if area not in counts:
            counts[area] = 0
        counts[area] += 1
    # Reduce the areas down to only those that are in groups of three or more with the same area.
    above = []
    for area, count in counts.iteritems():
        if count > 2:
            for _ in range(count):
                above.append(area)
    # Take the mean of the areas as the average part size.
    average = sum(above) / len(above)
    return average
def find_hue_mode(hsv):
    """Given an HSV image as an input, compute the mode of the list of hue values to find the most common hue in the
    image. This is used to determine the center for the background color filter."""
    pixels = {}
    for row in hsv:
        for pixel in row:
            hue = pixel[0]
            if hue not in pixels:
                pixels[hue] = 0
            pixels[hue] += 1
    counts = sorted(pixels.keys(), key=lambda key: pixels[key], reverse=True)
    return counts[0]
if __name__ == "__main__":
# load the image and resize it to a smaller factor so that the shapes can be approximated better
image = cv2.imread(sys.argv[1])
# define range of blue color in HSV
hsv = cv2.cvtColor(image, cv2.COLOR_BGR2HSV)
center = find_hue_mode(hsv)
print 'Center Hue:', center
lower = numpy.array([center - 10, 50, 50])
upper = numpy.array([center + 10, 255, 255])
# Threshold the HSV image to get only blue colors
mask = cv2.inRange(hsv, lower, upper)
inverted = cv2.bitwise_not(mask)
blurred = cv2.GaussianBlur(inverted, (5, 5), 0)
edged = cv2.Canny(blurred, 50, 100)
dilated = cv2.dilate(edged, None, iterations=1)
eroded = cv2.erode(dilated, None, iterations=1)
# find contours in the thresholded image and initialize the shape detector
contours = cv2.findContours(eroded.copy(), cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
contours = contours[0] if imutils.is_cv2() else contours[1]
# Compute the area for a single part to use when setting the threshold and calculating the number of parts within
# a contour area.
part_area = part_area(contours)
# The threshold for a part's area - can't be too much smaller than the part itself.
threshold = part_area * 0.5
part_count = 0
for contour in contours:
if cv2.contourArea(contour) < threshold:
continue
# Sometimes parts are close enough together that they become one in the image. To battle this, the total area
# of the contour is divided by the area of a part (derived earlier).
part_count += int((cv2.contourArea(contour) / part_area) + 0.1) # this 0.1 "rounds up" slightly and was determined empirically
# Draw an approximate contour around each detected part to give the user an idea of what the tool has computed.
epsilon = 0.1 * cv2.arcLength(contour, True)
approx = cv2.approxPolyDP(contour, epsilon, True)
cv2.drawContours(image, [approx], -1, (0, 255, 0), 2)
# Print the part count and show off the processed image.
print 'Part Count:', part_count
cv2.imshow("Image", image)
cv2.waitKey(0)
Here's an example of the type of input image I am using:
or this:
And I'm currently getting results like this:
The results clearly show that the script is having trouble identifying some parts, and its true Achilles heel seems to be when parts touch one another.
So my question/challenge is, what can I do to improve the reliability of this script?
The script is to be integrated into an existing Python tool so I am searching for a solution using Python. The solution does not need to be pure Python as I am willing to install whatever 3rd party libraries might be needed.
If the objects are all of similar types, you might have more success isolating a single example in the image and then using feature matching to detect them.
A full solution would be out of scope for Stack Overflow, but my suggestion for progress would be to first somehow find one or more "correct" examples using your current rectangle retrieval method. You could probably look for all your samples that are of the expected size, or that are accurate rectangles.
Once you have isolated a few positive examples, use some feature matching techniques to find the others. There is a lot of reading up you probably need to do on it but that is a potential solution.
A general summary is that you use your positive examples to find "features" of the object you want to detect. These "features" are generally things like corners or changes in gradient. OpenCV contains many methods you can use.
Once you have the features, there are several algorithms in OpenCV you can look at that will search the image for all matching features. You'll want one that is rotation invariant (can detect the same features at different rotations), but you probably don't need scale invariance (can detect the same features at multiple scales).
My one concern with this method is that the items you are searching for in your images are quite small. It might be difficult to find good, consistent features to match on.
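To make the idea concrete, here is a hedged ORB-based sketch (the file names are placeholders; ORB is rotation invariant and its binary descriptors are quick to match, though, as said, very small parts may not yield stable features):
import cv2

template = cv2.imread('template.png', cv2.IMREAD_GRAYSCALE)   # one isolated "good" part
scene = cv2.imread('parts.png', cv2.IMREAD_GRAYSCALE)         # full photo of the paper

orb = cv2.ORB_create(nfeatures=2000)
kp1, des1 = orb.detectAndCompute(template, None)
kp2, des2 = orb.detectAndCompute(scene, None)

matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
matches = sorted(matcher.match(des1, des2), key=lambda m: m.distance)

vis = cv2.drawMatches(template, kp1, scene, kp2, matches[:30], None, flags=2)
cv2.imshow('matches', vis)
cv2.waitKey(0)
Counting instances from the matches would still take some clustering of the matched keypoint locations, so treat this purely as a starting point.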
You're tackling a 2D object recognition problem, for which there are many possible approaches. You've gone about it using background/foreground segmentation, which is OK as you have control over the scene (laying down the background paper sheet). However this will always have fundamental limitations when the objects touch. A simple solution to your problem can be this:
1) You assume that touching objects are rare events (which is a fine assumption in your problem). Therefore you can compute the area of each segmented region and take the median of these, which will give a robust estimate of the object's area. Let's call this robust estimate A (in square pixels). This will be fine if fewer than 50% of the regions correspond to touching objects.
2) You then proceed to estimate the number of objects in each segmented region. Let Ai be the area of the i-th region. Compute the number of objects in each region as Ni = round(Ai / A), then sum the Ni to give the total number of objects.
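A short sketch of those two steps, assuming `contours` already holds the segmented regions from the existing pipeline (names are mine):
import numpy as np
import cv2

areas = [cv2.contourArea(c) for c in contours]   # one area per segmented region
A = np.median(areas)                             # robust single-object area estimate
counts = [int(round(a / A)) for a in areas]      # objects per region
print('Estimated object count:', sum(counts))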
This approach will be fine as long as the following conditions are met:
A) The touching objects do not significantly overlap
B) You do not have objects lying on their sides. If you do you might be able to deal with this using two area estimates (side and flat). Better to eliminate this scenario if you can for simplicity.
C) The objects are all roughly the same distance to the camera. If this is not the case then the areas of the objects (in pixels) cannot be modelled well by a single value.
D) There are not partially visible objects at the borders of the image.
E) You ensure that only the same type of object is visible in each image.
I have a code:
def compare_frames(frame1, frame2):
    # cropping ranges of two images
    frame1, frame2 = similize(frame1, frame2)
    sc = 0
    h = numpy.zeros((300,256,3))
    frame1 = cv2.cvtColor(frame1,cv2.COLOR_BGR2HSV)
    frame2 = cv2.cvtColor(frame2,cv2.COLOR_BGR2HSV)
    bins = numpy.arange(256).reshape(256,1)
    color = [ (255,0,0),(0,255,0),(0,0,255) ]
    for ch, col in enumerate(color):
        hist_item1 = cv2.calcHist([frame1],[ch],None,[256],[0,255])
        hist_item2 = cv2.calcHist([frame2],[ch],None,[256],[0,255])
        cv2.normalize(hist_item1,hist_item1,0,255,cv2.NORM_MINMAX)
        cv2.normalize(hist_item2,hist_item2,0,255,cv2.NORM_MINMAX)
        sc = sc + (cv2.compareHist(hist_item1, hist_item2, cv2.cv.CV_COMP_CORREL)/len(color))
    return sc
It works, but if an image has color noise (a darker or lighter tint) it fails and gives a similarity of 0.5 (I need 0.8).
Image 2 is darker than image 1.
Can you suggest a FAST comparison algorithm that ignores lighting, blur, and noise in the images, or a way to modify this one?
Note:
I have a template matching algorithm too:
But it works more slowly than I need, although the similarity is 0.95.
def match_frames(frame1, frame2):
    # cropping ranges of two images
    frame1, frame2 = similize(frame1, frame2)
    result = cv2.matchTemplate(frame1,frame2,cv2.TM_CCOEFF_NORMED)
    return numpy.amax(result)
Thanks
Your question is one of the classics in computer vision and image processing. Many doctoral theses have been written on it, along with scores of papers in conferences and journals.
In short, direct pixel comparisons will not work in this case. A transformation of some kind is needed to take you to a different feature space. You could do something simple or complex depending on the requirements you have in mind. You could compute edges or corners. One suggestion already mentioned is FAST corner detection. This would be a good choice, as would SIFT, etc. There are many others you could use, but it will depend on how much the two images can vary and in what ways.
For example, if there are only going to be global color changes, tint, etc., the approach would be different than if the images could be rotated or the objects could change in size (i.e. camera zoom).
Strictly speaking, for the case you mention, features such as FAST, SIFT, or even edges would work reasonably well. Check http://en.wikipedia.org/wiki/Feature_detection_%28computer_vision%29 for more information.
Image patch descriptors (SIFT, SURF...) are usually monochromatic and expect black-and-white images. Thus, for any approach (point matching, frame matching...) I would advise you to change the color space to Lab or YUV first and then work on the luminance plane.
FAST is a (fast) corner detection algorithm. A corner is obviously insensitive to noise and contrast, but may be affected by blur (bad position, bad corner response for example). FAST does not include a descriptor part however, so your matching should then rely on geometric proximity. If you need a descriptor part, then you need to switch to one of the many other keypoint descriptors (SIFT, SURF, FAST + BRIEF/BRISK/ORB/FREAK...).
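Putting the two suggestions together, a minimal sketch (assuming OpenCV 4.x; the parameter values are arbitrary) could detect FAST corners on the Lab luminance plane and attach ORB descriptors so the keypoints can actually be matched:
import cv2

def keypoints_and_descriptors(bgr_frame):
    # Work on the luminance plane of Lab, as suggested above
    luma = cv2.cvtColor(bgr_frame, cv2.COLOR_BGR2Lab)[:, :, 0]
    # FAST gives the corners, ORB adds a BRIEF-like binary descriptor to each
    fast = cv2.FastFeatureDetector_create(threshold=25)
    kps = fast.detect(luma, None)
    orb = cv2.ORB_create()
    kps, des = orb.compute(luma, kps)
    return kps, des
You could then match the two frames' descriptors with a Hamming-distance BFMatcher and use the fraction of good matches as a lighting-insensitive similarity score.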