I'm building a shape analysis algorithm and one of the attributes we would like to add is whether the shape is open or closed. For example, the left circle is closed, the middle circle is open and the right circle is more open.
I tried to do it via the contours but I search for a more robust (and hopefully easy) way to achieve it. I know it can be solved with ANN but I don't want to go in that direction.
(I'm working with Python/OpenCV 4)
Any ideas?
There is a great way to do it using ImageDraw.floodfill function.
Attached is a working code:
from PIL import Image, ImageDraw
def openOrClose(img2):
isCLosed = False
img = cv2.cvtColor(img2, cv2.COLOR_GRAY2BGR)
target_pixel = (0,0) #Corner of the image
target_color = (255,255,0) #Yellow
im_pil = Image.fromarray(img)
ImageDraw.floodfill(im_pil,target_pixel,target_color)
im = np.asarray(im_pil)
count =0
for i in range(im.shape[0]):
for j in range(im.shape[1]):
if ((im[i][j] == [255,255,255]).all() == True):
count+=1
if count != 0:
isCLosed = True
return isCLosed
and this is the result:
If the object is the only thing in the array you are analyzing you could flood fill from any of the corners. If the area occupied by your flood fill value is less than the image area with the drawing's area subtracted it would be a closed object.
Related
When I try to make an inverse polar transformation to my image, the output is outside of the output image. There are also some weird white patterns on the top. I tried to make the output image larger but the circle is on the left side so it didn't help.
I am trying to make a line circle using warpPolar function, for that first I'm flipping the line and giving it a black area as shown on the image, then using the cv2.warpPolar function with WARP_INVERSE_MAP flag.
How can I fully draw the circle, and get its bounding box is my question.
line = np.ones(shape=(20,475),dtype=np.uint8)*255
flipped = cv2.rotate(line,cv2.ROTATE_90_CLOCKWISE)
cv2.imshow('flipped',flipped)
h,w = flipped.shape
radius = int(h / (2*np.pi))
new_image = np.zeros(shape=(h,radius+w),dtype=np.uint8)
h2,w2 = new_image.shape
new_image[: ,w2-w:w2] = flipped
cv2.imshow('polar',new_image)
h,w = new_image.shape
center = (w/2,h)
output= cv2.warpPolar(new_image,center=center,maxRadius=radius,dsize=(1500,1500),flags=cv2.WARP_INVERSE_MAP + cv2.WARP_POLAR_LINEAR)
cv2.imshow('output',output)
cv2.waitKey(0)
Note: I am not getting the same result as you showed above when I tried the same code. You may miss some code lines to add ?
If I didn't misunderstand your problem,you are trying to get this result: (If I am wrong, I will update the answer accordingly)
The only point you are missing is that defining the center and radius. You are making inverse transform here, the input is created by you not warpPolar. Since you are defining size as (1500,1500), you need to update center and radius accordingly. Here is my code giving this result:
import cv2
import numpy as np
line = np.ones(shape=(20,475),dtype=np.uint8)*255
flipped = cv2.rotate(line,cv2.ROTATE_90_CLOCKWISE)
cv2.imshow('flipped',flipped)
h,w = flipped.shape
radius = int(h / (2*np.pi))
new_image = np.zeros(shape=(h,radius+w),dtype=np.uint8)
h2,w2 = new_image.shape
new_image[: ,w2-w:w2] = flipped
cv2.imshow('polar',new_image)
h,w = new_image.shape
center = (750,750)
maxRadius = 750
output= cv2.warpPolar(new_image,center=center,maxRadius=radius,dsize=(1500,1500),flags=cv2.WARP_INVERSE_MAP + cv2.WARP_POLAR_LINEAR)
cv2.imshow('output',output)
cv2.waitKey(0)
I have some hundreds of images (scanned documents), most of them are skewed. I wanted to de-skew them using Python.
Here is the code I used:
import numpy as np
import cv2
from skimage.transform import radon
filename = 'path_to_filename'
# Load file, converting to grayscale
img = cv2.imread(filename)
I = cv2.cvtColor(img, COLOR_BGR2GRAY)
h, w = I.shape
# If the resolution is high, resize the image to reduce processing time.
if (w > 640):
I = cv2.resize(I, (640, int((h / w) * 640)))
I = I - np.mean(I) # Demean; make the brightness extend above and below zero
# Do the radon transform
sinogram = radon(I)
# Find the RMS value of each row and find "busiest" rotation,
# where the transform is lined up perfectly with the alternating dark
# text and white lines
r = np.array([np.sqrt(np.mean(np.abs(line) ** 2)) for line in sinogram.transpose()])
rotation = np.argmax(r)
print('Rotation: {:.2f} degrees'.format(90 - rotation))
# Rotate and save with the original resolution
M = cv2.getRotationMatrix2D((w/2,h/2),90 - rotation,1)
dst = cv2.warpAffine(img,M,(w,h))
cv2.imwrite('rotated.jpg', dst)
This code works well with most of the documents, except with some angles: (180 and 0) and (90 and 270) are often detected as the same angle (i.e it does not make difference between (180 and 0) and (90 and 270)). So I get a lot of upside-down documents.
Here is an example:
The resulted image that I get is the same as the input image.
Is there any suggestion to detect if an image is upside down using Opencv and Python?
PS: I tried to check the orientation using EXIF data, but it didn't lead to any solution.
EDIT:
It is possible to detect the orientation using Tesseract (pytesseract for Python), but it is only possible when the image contains a lot of characters.
For anyone who may need this:
import cv2
import pytesseract
print(pytesseract.image_to_osd(cv2.imread(file_name)))
If the document contains enough characters, it is possible for Tesseract to detect the orientation. However, when the image has few lines, the orientation angle suggested by Tesseract is usually wrong. So this can not be a 100% solution.
Python3/OpenCV4 script to align scanned documents.
Rotate the document and sum the rows. When the document has 0 and 180 degrees of rotation, there will be a lot of black pixels in the image:
Use a score keeping method. Score each image for it's likeness to a zebra pattern. The image with the best score has the correct rotation. The image you linked to was off by 0.5 degrees. I omitted some functions for readability, the full code can be found here.
# Rotate the image around in a circle
angle = 0
while angle <= 360:
# Rotate the source image
img = rotate(src, angle)
# Crop the center 1/3rd of the image (roi is filled with text)
h,w = img.shape
buffer = min(h, w) - int(min(h,w)/1.15)
roi = img[int(h/2-buffer):int(h/2+buffer), int(w/2-buffer):int(w/2+buffer)]
# Create background to draw transform on
bg = np.zeros((buffer*2, buffer*2), np.uint8)
# Compute the sums of the rows
row_sums = sum_rows(roi)
# High score --> Zebra stripes
score = np.count_nonzero(row_sums)
scores.append(score)
# Image has best rotation
if score <= min(scores):
# Save the rotatied image
print('found optimal rotation')
best_rotation = img.copy()
k = display_data(roi, row_sums, buffer)
if k == 27: break
# Increment angle and try again
angle += .75
cv2.destroyAllWindows()
How to tell if the document is upside down? Fill in the area from the top of the document to the first non-black pixel in the image. Measure the area in yellow. The image that has the smallest area will be the one that is right-side-up:
# Find the area from the top of page to top of image
_, bg = area_to_top_of_text(best_rotation.copy())
right_side_up = sum(sum(bg))
# Flip image and try again
best_rotation_flipped = rotate(best_rotation, 180)
_, bg = area_to_top_of_text(best_rotation_flipped.copy())
upside_down = sum(sum(bg))
# Check which area is larger
if right_side_up < upside_down: aligned_image = best_rotation
else: aligned_image = best_rotation_flipped
# Save aligned image
cv2.imwrite('/home/stephen/Desktop/best_rotation.png', 255-aligned_image)
cv2.destroyAllWindows()
Assuming you did run the angle-correction already on the image, you can try the following to find out if it is flipped:
Project the corrected image to the y-axis, so that you get a 'peak' for each line. Important: There are actually almost always two sub-peaks!
Smooth this projection by convolving with a gaussian in order to get rid of fine structure, noise, etc.
For each peak, check if the stronger sub-peak is on top or at the bottom.
Calculate the fraction of peaks that have sub-peaks on the bottom side. This is your scalar value that gives you the confidence that the image is oriented correctly.
The peak finding in step 3 is done by finding sections with above average values. The sub-peaks are then found via argmax.
Here's a figure to illustrate the approach; A few lines of you example image
Blue: Original projection
Orange: smoothed projection
Horizontal line: average of the smoothed projection for the whole image.
here's some code that does this:
import cv2
import numpy as np
# load image, convert to grayscale, threshold it at 127 and invert.
page = cv2.imread('Page.jpg')
page = cv2.cvtColor(page, cv2.COLOR_BGR2GRAY)
page = cv2.threshold(page, 127, 255, cv2.THRESH_BINARY_INV)[1]
# project the page to the side and smooth it with a gaussian
projection = np.sum(page, 1)
gaussian_filter = np.exp(-(np.arange(-3, 3, 0.1)**2))
gaussian_filter /= np.sum(gaussian_filter)
smooth = np.convolve(projection, gaussian_filter)
# find the pixel values where we expect lines to start and end
mask = smooth > np.average(smooth)
edges = np.convolve(mask, [1, -1])
line_starts = np.where(edges == 1)[0]
line_endings = np.where(edges == -1)[0]
# count lines with peaks on the lower side
lower_peaks = 0
for start, end in zip(line_starts, line_endings):
line = smooth[start:end]
if np.argmax(line) < len(line)/2:
lower_peaks += 1
print(lower_peaks / len(line_starts))
this prints 0.125 for the given image, so this is not oriented correctly and must be flipped.
Note that this approach might break badly if there are images or anything not organized in lines in the image (maybe math or pictures). Another problem would be too few lines, resulting in bad statistics.
Also different fonts might result in different distributions. You can try this on a few images and see if the approach works. I don't have enough data.
You can use the Alyn module. To install it:
pip install alyn
Then to use it to deskew images(Taken from the homepage):
from alyn import Deskew
d = Deskew(
input_file='path_to_file',
display_image='preview the image on screen',
output_file='path_for_deskewed image',
r_angle='offest_angle_in_degrees_to_control_orientation')`
d.run()
Note that Alyn is only for deskewing text.
I am using dblib to get the eyes of a face. Below are some examples of the results.
I have tried several methods to accomplish the objective. For instance, I tried to detect the center of the eye based on this project; from that, it would be easy to detect the pupil and the iris, however, I did not achieve good results. I also have tried to use Hough Circles but in some cases the results are quite bad.
My best bet is to detect the pupil, which is the only part of the eye with a common color (black) for every eye. I would like to get some ideas to do so.
My first idea is to set a region (between 20 and 60 in the x axis), then, in gray-scale, make the dark pixels (less than 25, for instance) black, and the rest, white. That would create a mask, that can be blurred to use Hough Circles and detect the region of the pupil. Finally, I can set a radius for the iris.
Any idea would be appreciated.
Thanks.
Actually your idea of detecting the shape of the pupil is good but your pictures are not good enough to do it directly. An easy way is to pre-process those to remove all useless data.
I did some example with one of your original pics to show you (on Gimp)
Go to grey scale
Do a high pass filter to remove all small color fluctuations (you have very distinct colors so it should enhance borders very well)
Link to example filtered pic
Apply a threshold on your picture to remove remaining fluctuations (you can calculate the reference threshold value by analyzing your grey scale image color histogram)
Link to example thresholded pic
After those three steps you should have enough data to run your shape detection.
Most of the answers I have read till now say to use the Hough circle method to detect the iris region, but it doesn't really work on all images.
So my approach is pretty simple, which involves following steps
Detect face from the image
Find eye region from the face
Get the RGB values just below the pupil region(thereby getting the iris region RGB values)
And pass the obtained RGB values to find_color function
NOTE: Pass High-resolution image as the input for better results. If you pass low-resolution images such as 480x620, 320x240, you might end up getting poor results.
Below is the code for the same
import cv2
import imutils
from imutils import face_utils
import dlib
import numpy as np
import webcolors
flag=0
detector = dlib.get_frontal_face_detector()
predictor = dlib.shape_predictor("shape_predictor_68_face_landmarks.dat")
img= cv2.imread('blue2.jpg')
img_rgb= cv2.cvtColor(img,cv2.COLOR_BGR2RGB) #convert to RGB
#cap = cv2.VideoCapture(0) #turns on the webcam
(left_Start, left_End) = face_utils.FACIAL_LANDMARKS_IDXS["left_eye"]
#points for left eye and right eye
(right_Start, right_End) = face_utils.FACIAL_LANDMARKS_IDXS["right_eye"]
def find_color(requested_colour): #finds the color name from RGB values
min_colours = {}
for name, key in webcolors.CSS3_HEX_TO_NAMES.items():
r_c, g_c, b_c = webcolors.hex_to_rgb(name)
rd = (r_c - requested_colour[0]) ** 2
gd = (g_c - requested_colour[1]) ** 2
bd = (b_c - requested_colour[2]) ** 2
min_colours[(rd + gd + bd)] = key
closest_name = min_colours[min(min_colours.keys())]
return closest_name
#ret, frame=cap.read()
#frame = cv2.flip(frame, 1)
#cv2.imshow(winname='face',mat=frame)
gray = cv2.cvtColor(img_rgb, cv2.COLOR_RGB2GRAY)
# detect dlib face rectangles in the grayscale frame
dlib_faces = detector(gray, 0)
for face in dlib_faces:
eyes = [] # store 2 eyes
# convert dlib rect to a bounding box
(x,y,w,h) = face_utils.rect_to_bb(face)
cv2.rectangle(img_rgb,(x,y),(x+w,y+h),(255,0,0),1) #draws blue box over face
shape = predictor(gray, face)
shape = face_utils.shape_to_np(shape)
leftEye = shape[left_Start:left_End]
# indexes for left eye key points
rightEye = shape[right_Start:right_End]
eyes.append(leftEye) # wrap in a list
eyes.append(rightEye)
for index, eye in enumerate(eyes):
flag+=1
left_side_eye = eye[0] # left edge of eye
right_side_eye = eye[3] # right edge of eye
top_side_eye = eye[1] # top side of eye
bottom_side_eye = eye[4] # bottom side of eye
# calculate height and width of dlib eye keypoints
eye_width = right_side_eye[0] - left_side_eye[0]
eye_height = bottom_side_eye[1] - top_side_eye[1]
# create bounding box with buffer around keypoints
eye_x1 = int(left_side_eye[0] - 0 * eye_width)
eye_x2 = int(right_side_eye[0] + 0 * eye_width)
eye_y1 = int(top_side_eye[1] - 1 * eye_height)
eye_y2 = int(bottom_side_eye[1] + 0.75 * eye_height)
# draw bounding box around eye roi
#cv2.rectangle(img_rgb,(eye_x1, eye_y1), (eye_x2, eye_y2),(0,255,0),2)
roi_eye = img_rgb[eye_y1:eye_y2 ,eye_x1:eye_x2] # desired EYE Region(RGB)
if flag==1:
break
x=roi_eye.shape
row=x[0]
col=x[1]
# this is the main part,
# where you pick RGB values from the area just below pupil
array1=roi_eye[row//2:(row//2)+1,int((col//3)+3):int((col//3))+6]
array1=array1[0][2]
array1=tuple(array1) #store it in tuple and pass this tuple to "find_color" Funtion
print(find_color(array1))
cv2.imshow("frame",roi_eye)
cv2.waitKey(0)
cv2.destroyAllWindows()
Below are some examples.
An actress with blue eyes
Now this is the output of our code when the above image is given as the input: lightsteelblue
An actress with brown eyes
The output of our code when the above image is given as the input: saddlebrown
Mila kunis (one brown eye and other is green)
The output of our code when the above image is given as the input: sienna(shade of brown)
An actress with grey eyes
The output of our code when the above image is given as the input: darkgrey
So, you can see how close the results are to the actual eye color. This works pretty well with high-resolution images as I already mentioned.
PS: Correct me if am wrong, open to suggestions.
I have a webcam feed using OpenCV, and I am trying to fit an ellipse in real time.
The code I am using at the moment works, but it fails to fit an ellipse to the image a lot of the time. What other methods of ellipse fitting to an image can I pursue?
Current code:
def find_ellipses(img): #img is grayscale image of what I want to fit
ret,thresh = cv2.threshold(img,127,255,0)
_,contours,hierarchy = cv2.findContours(thresh, 1, 2)
if len(contours) != 0:
for cont in contours:
if len(cont) < 5:
break
elps = cv2.fitEllipse(cont)
return elps #only returns one ellipse for now
return None
Where elps is of the form (x_centre,y_centre),(minor_axis,major_axis),angle
Here is an example of what I want to successfully fit an ellipse to. My current code fails with this image when I don't want it to.
Turns out I was wrong is just getting the first ellipse from the function. While I thought the first calculated ellipse was the most correct one, what I actually had to do was go through all the ellipses - and choose the most suitable one that bounded the object in the image.
I would define my contours outside of the function, as you don't need to keep re-defining them in this image.
def create_ellipse(thresh,cnt):
ellipse = cv2.fitEllipse(cnt)
thresh = cv2.ellipse(thresh,ellipse,(0,255,0),2)
return thresh
What this code is doing is im taking my thresh image stream and adding an ellipse on top of it. Later on in my code when I want to call it I use the line
thresh = create_ellipse(thresh,cnt)
I'm using PIL.
im = im.rotate(angle=-90, expand = True)
When I do this, it adds a greyish border to my image.
Why?
Here's m full code. Note that if I don't rotate, it adds no borders
def fixRotation(f, quality=96, image_type="JPEG"):
#http://sylvana.net/jpegcrop/exif_orientation.html
d =getEXIF(f)
if d:
orientation = int(d['Orientation'])
im = Image.open(StringIO(f))
if orientation == 6:
im = im.rotate(angle=-90, expand = True)
elif orientation == 3:
im = im.rotate(angle=-180, expand=True)
elif orientation == 8:
im = im.rotate(angle=-270, expand=True)
else:
#It doesn't add a border here.
im = im.rotate(0, expand=True)
res = StringIO()
im.save(res, image_type, quality=quality)
res.seek(0)
return res
else:
return StringIO(f)
I made some experiment and indeed image size is altered but I didn't understad the exact behavior. To me looks like a bug in PIL... you should report it.
If you only need k*90 degrees then to do the rotation you can also use numpy...
img = Image.fromarray(numpy.rot90(numpy.array(img), n))
n is the number of times to rotate by 90 degrees.
Note the following (CPython 2.6, PIL 1.1.7):
import Image
i= Image.open("someimage.jpg")
ir0= i.rotate(-90)
ir1= i.rotate(-90, expand=1)
for img in i, ir0, ir1:
print(img.size)
# output follows
(720, 400)
(400, 720)
(401, 721)
When the angle is a multiple of 90, expand is not needed.
expand in the case of right-angle multiples seems to activate a bug
So, if all you care about is 90°-180°-270° rotations, just drop the expand=1 argument; even better, use the transpose method (see Geometrical Transforms in the PIL tutorial )
When rotating, the algorithm (essentially) averages each pixel with the "next" pixel value. For all pixels "inside" the image, the next pixel is defined. For all pixels at the edge of the image, that pixel is undefined.
So the grey is the average between the known perimeter pixel and an undefined exterior pixel.