I'm trying to draw a horizontal line across a shape (an ellipse in this instance, with only the centroid and the boundary of the ellipse on a black background), starting from the centroid of the shape. I started off checking every pixel along the +x and -x axes from the centroid and replacing each non-green pixel with a white pixel (essentially drawing a line pixel by pixel), stopping the conversion as soon as I reach the first green pixel (the boundary). The code is given at the end.
According to my logic, the line (created using points) should stop as soon as it reaches the boundary, i.e. the first green pixel along a particular axis, but there is a slight offset in the detected boundary. In the given image, you can clearly see that the rightmost and leftmost points calculated by checking every pixel are slightly off-center from the actual line.
Images are enlarged for a better view.
I checked my code multiple times, and I drew a fresh ellipse every time to make sure there were no stray green pixels left on the image, but the offset is consistent on every try.
So my question is: how do I get rid of this offset and make my line land exactly on the boundary? Is this a visual glitch, or am I doing something wrong?
Note: I know there are rectFitting and minAreaRect functions which I could use to draw perfect bounding boxes and get these points, but I want to know why this is happening. I'm not looking for the optimal method; I'm looking for the cause of, and a solution for, this issue.
If you can suggest a better/more accurate title, it's much appreciated. I think I have explained everything for the time being.
Code:
import cv2
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline
#Function to plot images
def display_img(img,name):
    fig = plt.figure(figsize=(4,4))
    ax = fig.add_subplot(111)
    ax.imshow(img,cmap="gray")
    plt.title(name)
#A black canvas
canvas = np.zeros((1600,1200,3),np.uint8)
#Value obtained after ellipse fitting an object
val = ((654, 664),(264, 266),80)
centroid = (val[0][0],val[0][1])
#Drawing the ellipse on the canvas(green)
ell = cv2.ellipse(canvas,val,(0,255,0),1)
centroid_ = cv2.circle(canvas,centroid,1,(255,0,0),10) #High thickness to see it visibly (Red)
display_img(canvas,"Canvas w/ ellipse and centroid")
#variables for centers
y_center = centroid[1]
#Variables which iterate over time
right_pt = centroid[0]
left_pt = centroid[0]
#Using while loops to find the distance from the center to the
#nearby first green pixel (leftmost and rightmost boundary)
while(np.any(canvas[right_pt,y_center] != [0,255,0])):
    cv2.circle(canvas,(right_pt,y_center),1,(255,255,255),1)
    right_pt += 1
while(np.any(canvas[left_pt,y_center] != [0,255,0])):
    cv2.circle(canvas,(left_pt,y_center),1,(255,255,255),1)
    left_pt -= 1
#Drawing the obtained points
canvas = cv2.circle(canvas,(right_pt,y_center),1,(0,255,0),2)
canvas = cv2.circle(canvas,(left_pt,y_center),1,(0,255,0),2)
display_img(canvas,"Finale")
There are a couple of problems, one hiding neatly behind another.
The first issue is evident in this snippet of code extracted from your script:
# ...
val = ((654, 664),(264, 266),80)
centroid = (val[0][0],val[0][1])
y_center = centroid[1]
right_pt = centroid[0]
left_pt = centroid[0]
while(np.any(canvas[right_pt,y_center] != [0,255,0])):
    cv2.circle(canvas,(right_pt,y_center),1,(255,255,255),1)
    right_pt += 1
# ...
Notice that you use the X and Y coordinates of the point you want to process
(represented by right_pt and y_center respectively) in the same order
to do both of the following:
index a numpy array: canvas[right_pt,y_center]
specify point coordinate to an OpenCV function: (right_pt,y_center)
That is a problem, because each of those libraries expects a different order:
numpy indexing is by default row-major, i.e. img[y,x]
points and sizes in OpenCV are specified in (x,y) order
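To make the difference concrete, here is a tiny self-contained sketch (a hypothetical 3x5 image of my own, not from the question), addressing the same pixel in both orders:
import cv2
import numpy as np

img = np.zeros((3, 5, 3), np.uint8)             # 3 rows (Y) by 5 columns (X)
cv2.circle(img, (4, 2), 0, (255, 255, 255), 1)  # OpenCV point order: (x=4, y=2)
print(img[2, 4])                                # numpy index order: [y, x] -> [255 255 255]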
In this particular case, the error is in the order of indexes for the numpy array canvas.
To fix it, just switch them around:
while(np.any(canvas[y_center,right_pt] != [0,255,0])):
    cv2.circle(canvas,(right_pt,y_center),1,(255,255,255),1)
    right_pt += 1
# ditto for the second loop
Once you fix that and run your script, it will crash with an error like:
while(np.any(canvas[y_center,right_pt] != [0,255,0])):
IndexError: index 1200 is out of bounds for axis 1 with size 1200
Why didn't this happen before? Since the centroid was (654, 664)
and you had the coordinates swapped, you were looking 10 rows away
from where you were drawing.
The second problem lies in the fact that you're drawing white circles into the same image you're searching for green pixels in, combined with a perhaps mistaken interpretation of what the radius parameter of cv2.circle does. I suppose the best way to show this is with an image (representing 5 rows of 13 pixels):
The red dots are centers of respective circles,
white squares are the pixels drawn,
black squares are the pixels left untouched
and the yellow arrows indicate the direction of iteration along the row.
On the left side, you can see a circle with radius 1; on the right, one with radius 0.
Let's say we're approaching the green area we want to detect:
And make another iteration:
Oops: with a radius of 1, we just changed the green pixel we're looking for to white.
Hence we can never find any green pixels (except for the very first point tested, since at that point we haven't drawn anything yet, and only in the first loop), and the loop runs out of the bounds of the image.
There are several options for resolving this problem. The simplest one, if you're fine with a thinner line, is to change the radius to 0 in both calls to cv2.circle. Another possibility would be to cache a copy of the "row of interest", so that any drawing you do on canvas won't affect the search:
target_row = canvas[y_center].copy()
while(np.any(target_row[right_pt] != [0,255,0])):
    cv2.circle(canvas,(right_pt,y_center),1,(255,255,255),1)
    right_pt += 1
or
target_row = canvas[y_center] != [0,255,0]
while(np.any(target_row[right_pt])):
    cv2.circle(canvas,(right_pt,y_center),1,(255,255,255),1)
    right_pt += 1
or even better
target_row = np.any(canvas[y_center] != [0,255,0], axis=1)
while(target_row[right_pt]):
    cv2.circle(canvas,(right_pt,y_center),1,(255,255,255),1)
    right_pt += 1
Finally, you could skip the drawing in the loops, and just use a single function call to draw a line connecting the two endpoints you found.
target_row = np.any(canvas[y_center] != [0,255,0], axis=1)
while(target_row[right_pt]):
    right_pt += 1
while(target_row[left_pt]):
    left_pt -= 1
#Drawing the obtained points
cv2.line(canvas, (left_pt,y_center), (right_pt,y_center), (255,255,255), 2)
cv2.circle(canvas, (right_pt,y_center), 1, (0, 255, 0), 2)
cv2.circle(canvas, (left_pt,y_center), 1, (0, 255, 0), 2)
Bonus: Let's get rid of the explicit loops.
left_pt, right_pt = np.where(np.all(canvas[y_center] == [0,255,0], axis=1))[0]
This will (obviously) work only if there are exactly two matching pixels on the row of interest. However, it is trivial to extend this to find the first one from the ellipse's center in each direction, since you get an array of all X coordinates/columns that contain a green pixel on that row; a sketch of that extension follows.
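For example, a minimal sketch of that extension (my own addition, assuming the row contains at least one green pixel on each side of the centroid):
green_cols = np.where(np.all(canvas[y_center] == [0,255,0], axis=1))[0]
x_center = centroid[0]
left_pt = green_cols[green_cols < x_center].max()   # nearest green column to the left
right_pt = green_cols[green_cols > x_center].min()  # nearest green column to the right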
Cropped output (generated by cv2.imshow) of that implementation can be seen in the following image (the centroid is blue, since you used (255,0,0) to draw it, and OpenCV uses BGR order by default):
I have a video file and I need to circle all moving objects in a certain frame I select. My idea of a solution to this problem is:
Circle all moving objects (white areas) in a video to which a motion detector has been applied, and circle the same areas on the original frame.
I am using BackgroundSubtractorGMG() from cv2 to detect movement.
Below I show the way I expect this program to work (I made the mock-up in Paint, so I'm not sure it's correct, but I hope it's good enough to demonstrate the concept).
As others have said in comments:
Get the mask from your background subtraction algorithm
use cv.findContours(mask, ...) to find the contours
(optional) select which contours you want to keep, e.g. via ((x, y), radius) = cv.minEnclosingCircle(contour), keeping only those with radius > 5, or via a, b, w, h = cv.boundingRect(c)
use drawing functions like cv.rectangle or similar to draw a shape around each contour (like so: cv.rectangle(img, (a, b), (a + w, b + h), (0, 255, 0), 2)); a sketch tying these steps together follows
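A rough, untested sketch of those steps using the modern cv2 Python API (the video file name and the radius threshold of 5 are assumptions; BackgroundSubtractorGMG lives in the opencv-contrib bgsegm module):
import cv2

fgbg = cv2.bgsegm.createBackgroundSubtractorGMG()  # requires opencv-contrib-python
cap = cv2.VideoCapture('video.mp4')                # hypothetical input file

while True:
    ok, frame = cap.read()
    if not ok:
        break
    mask = fgbg.apply(frame)                       # white areas = motion
    # OpenCV 4.x signature; 3.x returns (image, contours, hierarchy)
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    for c in contours:
        (x, y), radius = cv2.minEnclosingCircle(c)
        if radius > 5:                             # drop tiny noise blobs
            cv2.circle(frame, (int(x), int(y)), int(radius), (0, 255, 0), 2)
    cv2.imshow('moving objects', frame)
    if cv2.waitKey(30) & 0xFF == 27:               # Esc to quit
        break

cap.release()
cv2.destroyAllWindows()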
After converting a piece of code (that animates a pattern of rectangles) from Java to Python, I noticed that the animation that the code produced seemed quite glitchy. I managed to reproduce the problem with a minimal example as follows:
import pygame
SIZE = 200
pygame.init()
DISPLAYSURF = pygame.display.set_mode((SIZE, SIZE))
D = 70.9
xT = 0.3
yT = 0
#pygame.draw.rect(DISPLAYSURF, (255,0,0), (0, 0, SIZE, SIZE))
pygame.draw.rect(DISPLAYSURF, (255,255,255), (xT, yT, D, D))
pygame.draw.rect(DISPLAYSURF, (255,255,255), (xT+D, yT+D, D, D))
pygame.draw.rect(DISPLAYSURF, (0,0,0), (xT, yT+D, D, D))
pygame.draw.rect(DISPLAYSURF, (0,0,0), (xT+D, yT, D, D))
pygame.display.update()
This code generates the following image:
Notice that the squares don't line up perfectly in the middle. Uncommenting the commented line in the code above results in the following image, which serves to illuminate the problem further:
It seems that there are pixel-wide gaps in the black and white pattern, even though it can be seen in the code (by the data that is passed in the calls to pygame.draw.rect()) that this shouldn't be the case. What is the reason for this behaviour, and how can I fix it?
(This didn't happen in Java, here is a piece of Java code corresponding to the Python code above).
Looking at the rendered picture in an image editor, the pixel distances can be confirmed as such:
Expanding the function calls (i.e. performing the additions manually), one can see that the input arguments to draw the white rectangles are of the form
pygame.draw.rect(DISPLAYSURF, (255,255,255), ( 0.3, 0, 70.9, 70.9))
pygame.draw.rect(DISPLAYSURF, (255,255,255), (71.2, 70.9, 70.9, 70.9))
Since fractions of pixels do not make sense screen-wise, the input must be discretized in some way. Pygame (or SDL, as mentioned in the comments to the question) seems to choose truncation, which in practice transforms the drawing commands into:
pygame.draw.rect(DISPLAYSURF, (255,255,255), ( 0, 0, 70, 70))
pygame.draw.rect(DISPLAYSURF, (255,255,255), (71, 70, 70, 70))
which corresponds to the dimensions in the rendered image. If AWT draws it differently, my guess is that it uses rounding (of some sort) instead of truncation. This could be investigated by trying different rendering inputs, or by digging into the documentation.
If one wants pixel-perfect rendering, using floating-point input is not well defined. If one sticks to integers, though, the result should be independent of the renderer.
EDIT: I'll expand a bit in case anyone else finds this, since I couldn't find much information on this behaviour apart from the source code.
The function call in question takes the following input arguments (documentation):
pygame.draw.rect(Surface, color, Rect, width=0)
where Rect is a specific object defined by a top-left coordinate, a width and a height. By design it only handles integer attributes, since it is meant as a low-level "this is what you see on the screen" data type. The data type handles floats by truncating:
>>> import pygame
>>> r = pygame.Rect((1, 1, 8, 12))
>>> r.bottomright
(9, 13)
>>> r.bottomright = (9.9, 13.5)
>>> r.bottomright
(9, 13)
>>> r.bottomright = (11.9, 13.5)
>>> r.bottomright
(11, 13)
i.e., a regular (int) cast is done.
The Rect object is not meant as a "store the coordinates for my sprite" object, but as a "this is what the screen will represent" object. Floating points are certainly useful for the former purpose, and the designer would probably want to keep an internal list of floats to store this information. Otherwise, incrementing a screen position by e.g. r.left += 0.8 (where r is the Rect object) would never move r at all.
The problem in the question comes from (quite reasonably) assuming that the right x coordinate of the rectangle will at least be calculated as something like x₂ = int(x₁ + width), but since the function call implicitly transforms the input tuple to a Rect object before proceeding, and since Rect will truncate its input arguments, it will instead calculate it as x₂ = int(x₁) + int(width), which is not always the same for float input.
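A quick interactive demonstration of that difference (my own example, using the numbers from the question):
>>> x1, width = 0.3, 70.9
>>> int(x1 + width)       # discretizing the sum keeps the right edge at 71
71
>>> int(x1) + int(width)  # what the implicit Rect conversion effectively does
70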
To create a Rect using rounding rules, one could e.g. define a wrapper like:
def rect_round(x1, y1, w, h):
    """Returns pygame.Rect object after applying sane rounding rules.

    Args:
        x1, y1, w, h:
            (x1, y1) is the top-left coordinate of the rectangle,
            w is width,
            h is height.

    Returns:
        pygame.Rect object.
    """
    r_x1 = round(x1)
    r_y1 = round(y1)
    r_w = round(x1 - r_x1 + w)
    r_h = round(y1 - r_y1 + h)
    return pygame.Rect(int(r_x1), int(r_y1), int(r_w), int(r_h))
(or modified for other rounding rules) and then call the draw function as e.g.
pygame.draw.rect(DISPLAYSURF, (255,255,255), rect_round(71.2, 70.9, 70.9, 70.9))
One will never bypass the fact that a pixel is, by definition, the smallest addressable unit on the screen, though, so this solution might also have its quirks.
Related thread on the Pygame mailing list from 2005: Suggestion: make Rect use float coordinates
I'm trying to detect contiguous areas of close-enough colors in Python. I independently stumbled across the 8-way recursive flood fill algorithm (terminating when the Euclidean distance between the found and desired RGB colors exceeds a threshold), which works great at small scale, but causes a stack overflow on a 2-megapixel image.
Stack Overflow and Wikipedia point to scanline fill as the answer, but every explanation I've found is either in C++ or about filling a polygon with known vertices. Can someone point me to a good pseudocode explanation for a situation analogous to recursive flood fill?
I'm hitting a wall on researching image segmentation due to a lack of formal mathematics (I'm in high school.) If there is a plain-English explanation of K-Means or something like it, that'd be great too. OpenCV looked promising but it appears all I get is a color-flattened image; all I care about is a list of pixels in the object at x,y.
The idea of scanline flood fill is the following:
1. You are given the initial point (seed) (x, y).
2. Go left as far as possible, until the pixel (x-1, y) is not to be filled or you reach x=0.
3. The x you reached is the start of the scanline; keep two flags, "look for caverns above" and "look for caverns below", both initialized to true.
4. This is the beginning of the scanline loop. Check the pixel above, (x, y-1): considering whether it is to be filled together with the look-above flag, there are 4 cases:
if "look above" is true and the pixel is to be filled, then you have found a "new cavern": store (x, y-1) in the to-do list and set "look above" to false;
if "look above" is false and the pixel is NOT to be filled, then the current cavern is complete and you need to look for another one, so just set "look above" back to true;
in the other two cases ("look above" true and pixel not to be filled, or "look above" false and pixel to be filled), do nothing.
5. Repeat the same reasoning with the "look below" flag and the pixel (x, y+1).
6. Paint the pixel (x, y) and move to (x+1, y), repeating from step 4, unless the pixel you moved to is not to be painted.
7. If there is anything in the to-do list, pick an entry out and go back to step 2, using the coordinates you found in the to-do list as the new (x, y).
This is the version for a 4-connected flood fill. For an 8-connected fill you also need to check for caverns at (x-1, y-1) and (x-1, y+1) when starting the scanline, and for caverns at (x+1, y-1) and (x+1, y+1) at the scanline end (if the corresponding flags are true).
When moving along the scanline, what you want to do is add the green points in the picture to your to-do list:
Note that the number of seeds will not be "minimal" (for example, the first two "above seeds" in the example will end up in the same cavern, so only one of them is really needed). Still, the amount of stack needed to store them will normally be much smaller than what a pixel-by-pixel recursive approach requires. A Python sketch of the algorithm follows.
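Here is a rough, untested Python sketch of the 4-connected version described above. The inside(x, y) and paint(x, y) helpers are hypothetical (e.g. a color-distance test against the seed color, and a pixel write); paint must make inside return False for that pixel, or the fill would never terminate:
def scanline_fill(seed_x, seed_y, inside, paint, width, height):
    todo = [(seed_x, seed_y)]                        # seeds still to process
    while todo:
        x, y = todo.pop()
        if not inside(x, y):                         # may have been painted already
            continue
        while x > 0 and inside(x - 1, y):            # step 2: walk left
            x -= 1
        look_above = look_below = True               # step 3
        while x < width and inside(x, y):            # steps 4-6: sweep right
            if y > 0 and look_above and inside(x, y - 1):
                todo.append((x, y - 1))              # found a new cavern above
                look_above = False
            elif y > 0 and not look_above and not inside(x, y - 1):
                look_above = True                    # the cavern above has ended
            if y < height - 1 and look_below and inside(x, y + 1):
                todo.append((x, y + 1))              # found a new cavern below
                look_below = False
            elif y < height - 1 and not look_below and not inside(x, y + 1):
                look_below = True                    # the cavern below has ended
            paint(x, y)
            x += 1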
Another possible approach to limiting the amount of memory needed is a frontier painting algorithm (sketched in code below):
1. Put the initial seed (x, y) in the current_active list and paint it.
2. Initialize a next_active list to empty.
3. For every pixel in the current_active list, check for neighbors that need to be painted: when you find one, paint it and add it to the next_active list.
4. When you're done, set the current_active list to the next_active list and repeat from step 2, unless the list is empty.
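Again a rough, untested sketch, with the same hypothetical inside/paint helpers as above:
def frontier_fill(seed_x, seed_y, inside, paint, width, height):
    current_active = [(seed_x, seed_y)]
    paint(seed_x, seed_y)
    while current_active:
        next_active = []
        for cx, cy in current_active:
            # 4-connected neighbors; add the diagonals for an 8-connected fill
            for nx, ny in ((cx - 1, cy), (cx + 1, cy), (cx, cy - 1), (cx, cy + 1)):
                if 0 <= nx < width and 0 <= ny < height and inside(nx, ny):
                    paint(nx, ny)
                    next_active.append((nx, ny))
        current_active = next_active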
You can see an example of the two algorithms in action in this video.
I've been reading about the subject but cannot get the idea in "plain English" of the usage and parameters for HoughCircles (especially the ones after CV_HOUGH_GRADIENT).
What's an accumulator threshold? Are 100 "votes" the right value?
I could find and "mask" the pupil, and I worked my way through the Canny function, but I'm struggling beyond that, and my problem is the HoughCircles function. It seems to be failing at finding the iris' circle and I don't know why.
And this is the function I'm working on:
import math
import cv

def getRadius(area):
    r = 1.0
    r = math.sqrt(area/3.14)
    return (r)

def getIris(frame):
    grayImg = cv.CreateImage(cv.GetSize(frame), 8, 1)
    cv.CvtColor(frame, grayImg, cv.CV_BGR2GRAY)
    cv.Smooth(grayImg, grayImg, cv.CV_GAUSSIAN, 9, 9)
    cv.Canny(grayImg, grayImg, 32, 2)
    storage = cv.CreateMat(grayImg.width, 1, cv.CV_32FC3)
    minRad = int(getRadius(pupilArea))  # pupilArea is computed elsewhere in my program
    circles = cv.HoughCircles(grayImg, storage, cv.CV_HOUGH_GRADIENT, 2, 10, 32, 200, minRad, minRad*2)
    cv.ShowImage("output", grayImg)
    while circles:
        cv.DrawContours(frame, circles, (0,0,0), (0,0,0), 2)
        # this message is never shown, therefore I'm not detecting circles
        print "circle!"
        circles = circles.h_next()
    return (frame)
HoughCircles can be kind of tricky; I suggest looking through this thread, where a bunch of people, including me ;), discuss how to use it. The key parameter is param2, the so-called accumulator threshold. Basically, the higher it is, the fewer circles you get, and those circles have a higher probability of being correct. The best value is different for every image. I think the best approach is a parameter search on param2, i.e. keep trying values until your criteria are met (such as: there are exactly 2 circles, or the maximum number of non-overlapping circles, etc.). I have some code that does a binary search on param2, so it meets the criteria quickly; a sketch of the idea is below.
The other crucial factor is pre-processing: try to reduce noise and simplify the image. Some combination of blurring/thresholding/Canny is good for this.
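This isn't my original search code, just a minimal sketch of the idea using the modern cv2 API (the dp/minDist/param1 values are placeholders you would tune, and the search assumes that lower thresholds yield more circles, which holds approximately):
import cv2

def search_param2(gray, target=2, lo=1, hi=300):
    """Binary-search the accumulator threshold so that at most
    `target` circles are detected (higher param2 -> fewer circles).
    `gray` is an 8-bit single-channel image."""
    while lo < hi:
        mid = (lo + hi) // 2
        circles = cv2.HoughCircles(gray, cv2.HOUGH_GRADIENT, dp=2,
                                   minDist=32, param1=30, param2=mid)
        found = 0 if circles is None else circles.shape[1]
        if found > target:
            lo = mid + 1     # too many circles: raise the threshold
        else:
            hi = mid         # few enough: try lowering it
    return lo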
Anyhow, I get this:
From your uploaded image, using this code:
import cv
import numpy as np
def draw_circles(storage, output):
    circles = np.asarray(storage)
    for circle in circles:
        Radius, x, y = int(circle[0][2]), int(circle[0][0]), int(circle[0][1])
        cv.Circle(output, (x, y), 1, cv.CV_RGB(0, 255, 0), -1, 8, 0)
        cv.Circle(output, (x, y), Radius, cv.CV_RGB(255, 0, 0), 3, 8, 0)
orig = cv.LoadImage('eyez.png')
processed = cv.LoadImage('eyez.png',cv.CV_LOAD_IMAGE_GRAYSCALE)
storage = cv.CreateMat(orig.width, 1, cv.CV_32FC3)
#use canny, as HoughCircles seems to prefer ring like circles to filled ones.
cv.Canny(processed, processed, 5, 70, 3)
#smooth to reduce noise a bit more
cv.Smooth(processed, processed, cv.CV_GAUSSIAN, 7, 7)
cv.HoughCircles(processed, storage, cv.CV_HOUGH_GRADIENT, 2, 32.0, 30, 550)
draw_circles(storage, orig)
cv.ShowImage("original with circles", orig)
cv.WaitKey(0)
Update
I realise I somewhat misread your question! You actually want to find the iris edges. They are not as clearly defined as the pupils, so we need to help HoughCircles as much as possible. We can do this by:
Specifying a size range for the iris (we can work out a plausible range from the pupil size).
Increasing the minimum distance between circle centres (we know two irises can never overlap, so we can safely set this to our minimum iris size)
And then we need to do a param search on param2 again. Replacing the 'HoughCircles' line in the above code with this:
cv.HoughCircles(processed, storage, cv.CV_HOUGH_GRADIENT, 2, 100.0, 30, 150,100,140)
Gets us this:
Which isn't too bad.
My alternative suggestion is to use thresholding and blob analysis. It is simpler to detect the iris this way than with Canny edges and a Hough transform.
My way is... First you threshold the image. Pick any threshold value that leaves only the iris and eyelashes (in black) in the black-and-white image.
Then separate the iris from the eyelashes by setting the blob-analysis minimum length to XX and minimum width to YY, where XX and YY are the length and width of the iris. A rough sketch of the idea follows.
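This is not a dedicated blob-analysis tool, just a minimal sketch of the same idea using the modern cv2 API; the file name, the threshold value, and the XX/YY size limits are all assumptions you would tune for your images:
import cv2

gray = cv2.imread('eye.png', cv2.IMREAD_GRAYSCALE)  # hypothetical input image
# THRESH_BINARY_INV makes the dark iris/eyelash pixels white in the mask
_, mask = cv2.threshold(gray, 60, 255, cv2.THRESH_BINARY_INV)
# OpenCV 4.x signature; 3.x returns (image, contours, hierarchy)
contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
for c in contours:
    x, y, w, h = cv2.boundingRect(c)
    if w > 40 and h > 40:  # keep iris-sized blobs (the XX/YY limits), drop thin eyelashes
        cv2.rectangle(gray, (x, y), (x + w, y + h), 255, 2)
cv2.imshow('iris candidates', gray)
cv2.waitKey(0)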