Need to detect books via OpenCV Python - python

I have been trying to detect books in a bookshelf:
I used contours for bounding boxes. However, I just want to capture the actual book objects. If I lower the Canny threshold, it doesn't detect the book edges themselves; it detects the book titles or some images on the spines instead.
I used HoughLines and it worked well for detecting the book edges. How can I apply bounding boxes using HoughLines instead of contours?
Code I used for contour finding:
edges = cv2.Canny(blur,thresh,thresh*2)
drawing = np.zeros(img.shape,np.uint8)
contours,hierarchy = cv2.findContours(edges,cv2.RETR_TREE,cv2.CHAIN_APPROX_SIMPLE)
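for cnt in contours:
    # axis-aligned bounding box, drawn in green
    x,y,w,h = cv2.boundingRect(cnt)
    cv2.rectangle(img,(x,y),(x+w,y+h),(0,255,0),2)
    # rotated (minimum-area) bounding box
    rect = cv2.minAreaRect(cnt)
    # cv2.cv.BoxPoints is the OpenCV 2.x name; OpenCV 3+ calls it cv2.boxPoints
    box = cv2.cv.BoxPoints(rect)
    box = np.int0(box)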
where:
img = cv2.imread('books3.jpg')
gray = cv2.cvtColor(img,cv2.COLOR_BGR2GRAY)
blur = cv2.GaussianBlur(gray,(5,5),0)
For the houghlines:
lines = cv2.HoughLines(edges,1,np.pi/180,120)
for rho,theta in lines[0]:
    a = np.cos(theta)
    b = np.sin(theta)
    x0 = a*rho
    y0 = b*rho
    x1 = int(x0 + 1000*(-b))
    y1 = int(y0 + 1000*(a))
    x2 = int(x0 - 1000*(-b))
    y2 = int(y0 - 1000*(a))
where:
im = cv2.imread('books2.jpg')
gray = cv2.cvtColor(im,cv2.COLOR_BGR2GRAY)
edges = cv2.Canny(gray,100,300,apertureSize = 3)
Thank you so much in advance.

I am actually working on something similar myself, trying to segment the books from one another in a bookshelf. What progress have you made so far?
I have yet to try the contour method. What I did try was to pre-process the image and run Canny on it before using HoughLines. The image below shows a rough result.
I admit I have yet to perfectly segment the books either. As you can see in the image, there are more lines than I actually wanted, due to the nature of the book spines. I am looking into preprocessing methods that can help me get rid of this problem.
I noticed you mentioned that "If I lessen the threshold from Canny, it won't detect the book edges themselves but it detects the book titles or some images from the spine." Maybe you can adjust theta in the HoughLines parameters, for instance to 90 degrees, so that the book titles etc. will not be detected.
You can also try HoughLinesP, which is the Probabilistic Hough Line Transform. More details can be found at:
http://docs.opencv.org/doc/tutorials/imgproc/imgtrans/hough_lines/hough_lines.html
Hope my methods give you some ideas. I also hope to hear updates on your contour method. Hope we can share tips and work together, as we have a common goal (: Hope to hear from you soon.
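One possible way to get boxes out of Hough lines, as a rough sketch only: keep the near-vertical lines, sort them by where they cross the middle row, and treat each pair of neighbouring lines as the left and right edge of one book. The helper names (`vertical_line_x`, `book_boxes`) and the `max_tilt_deg` and `min_width` values are assumptions made for illustration, and the boxes span the full image height because Hough lines carry no top or bottom.

```python
import math

def vertical_line_x(rho, theta, y):
    """x where the line x*cos(theta) + y*sin(theta) = rho crosses row y."""
    return (rho - y * math.sin(theta)) / math.cos(theta)

def book_boxes(lines, height, max_tilt_deg=10.0, min_width=5):
    """Turn near-vertical (rho, theta) lines into (x, y, w, h) boxes."""
    tilt = math.radians(max_tilt_deg)
    xs = []
    for rho, theta in lines:
        # a vertical line has a horizontal normal, i.e. theta near 0 or near pi
        if theta < tilt or theta > math.pi - tilt:
            xs.append(vertical_line_x(rho, theta, height / 2.0))
    xs.sort()
    boxes = []
    for left, right in zip(xs, xs[1:]):
        # skip slivers produced by two detections of the same edge
        if right - left >= min_width:
            boxes.append((int(round(left)), 0, int(round(right - left)), height))
    return boxes
```

With OpenCV 3 or later, `cv2.HoughLines` returns an array of shape (N, 1, 2), so `book_boxes(lines[:, 0], img.shape[0])` would feed it the (rho, theta) pairs.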

Related

How can I prevent HoughLines from detecting certain lines multiple times

So I'm working on this piece of code to extract data from some graphs in images. These images are all scanned from a book. Since we're talking about 100+ images here, I would like to automate the process of course. My first step was to make sure that all images are aligned. Because the pages of the book were scanned by hand, the scans are all slightly shifted or rotated in regards to each other. Luckily there are some dotted lines on the images, which can be used as a reference point to align them. Afterwards I can then divide the image into smaller subimages, by slicing the image on these dotted lines. In that way, all subimages will be equal for all scanned images.
So, first step of course is to detect these dotted lines. My strategy can be described in 4 steps:
turn the dotted lines into solid lines, using Morphological Transformation
detect all edges, using Canny Edge Detection
identify the lines, using HoughLines
draw these lines on a mask for further usage
Now there are several problems which may occur. Sometimes HoughLines will detect a wrong line (such as the fold of the next page in the book), but this could potentially be fixed by cropping the image a little on the right side (better solutions are always welcome). The second (and biggest) problem is that HoughLines sometimes tends to identify a single line as multiple lines. I think this has something to do with Canny Edge Detection being too rough or vague about the edges, so that HoughLines actually sees multiple lines. Is there a way I could "smooth" the output from Canny so that HoughLines detects each line exactly once?
In the case of this specific image, the vertical dotted lines in the middle didn't get identified, whereas the fold of the next page in the book did. Furthermore the vertical dotted lines got identified as multiple lines. (left source image, middle edges detected, right lines detected)
# load image
img_large = cv2.imread("image.png")
# resize for ease of use
img_ori = cv2.resize(img_large, None, fx=0.2, fy=0.2, interpolation=cv2.INTER_CUBIC)
# create grayscale
img = cv2.cvtColor(img_ori, cv2.COLOR_BGR2GRAY)
# create mask for image size
mask = np.zeros((img.shape[:2]), dtype=np.uint8)
# morphological opening; on a light page this grows the dark dots so they merge into a solid line
kernel = np.ones((8, 8), dtype=np.uint8)
res = cv2.morphologyEx(img, cv2.MORPH_OPEN, kernel)
# detect edges for houghlines
edges = cv2.Canny(res, 50, 50)
# detect lines
lines = cv2.HoughLines(edges, 1, np.pi/180, 200)
# draw detected lines
for line in lines:
    rho, theta = line[0]
    a = np.cos(theta)
    b = np.sin(theta)
    x0 = a*rho
    y0 = b*rho
    x1 = int(x0 + 1000*(-b))
    y1 = int(y0 + 1000*a)
    x2 = int(x0 - 1000*(-b))
    y2 = int(y0 - 1000*a)
    cv2.line(mask, (x1, y1), (x2, y2), 255, 2)
    cv2.line(img, (x1, y1), (x2, y2), 127, 2)
In your script, the pixel-bins and the rotation bins are too fine for the threshold you've set:
lines = cv2.HoughLines(edges, 1, np.pi/180, 200)
So you can tune the threshold parameter (200) to get only one line, or tune the rho (1) and theta (np.pi/180) parameters, or tune all of them. You can select a set of images that each contain only one horizontal or vertical line. Then do a grid search to find the parameters that detect exactly one line in every image of that test set.
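If tuning alone still leaves a few near-duplicates, another option (not part of the suggestion above, just a common post-processing step) is to merge lines whose rho and theta are almost equal. The sketch below assumes plain (rho, theta) pairs as input; `merge_similar_lines`, `rho_tol` and `theta_tol` are made-up names and values that would need adjusting to the image scale.

```python
import math

def merge_similar_lines(lines, rho_tol=10.0, theta_tol=math.radians(5)):
    """Average (rho, theta) pairs that describe almost the same line."""
    clusters = []  # each entry: [sum of rho, sum of theta, count]
    for rho, theta in lines:
        # (rho, theta) and (-rho, theta - pi) are the same line; flip so that
        # near-vertical lines reported with negative rho compare correctly
        if rho < 0:
            rho, theta = -rho, theta - math.pi
        for c in clusters:
            mean_rho, mean_theta = c[0] / c[2], c[1] / c[2]
            if abs(rho - mean_rho) <= rho_tol and abs(theta - mean_theta) <= theta_tol:
                c[0] += rho
                c[1] += theta
                c[2] += 1
                break
        else:
            # no existing cluster is close enough, so start a new one
            clusters.append([rho, theta, 1])
    return [(c[0] / c[2], c[1] / c[2]) for c in clusters]
```

In the script above this could be called as `merge_similar_lines(line[0] for line in lines)` before drawing, so each dotted line ends up on the mask once.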

OpenCV Python measuring distance with HoughLinesP() algorithm to determine water level

I'm trying to measure water level in a glass channel using OpenCV and Python. I've decided to use HoughLinesP in a selected ROI and find the midpoints of the detected lines, so I can calculate the difference between the ones that I want and multiply it by a set reference size that I'll get later on. So far the part where I find the lines looks like this:
import cv2
import numpy as np
def midpoint(ptA, ptB, ptC, ptD):
    return ((ptA + ptC) * 0.5, (ptB + ptD) * 0.5)
img = cv2.imread("b2924.JPG")
img = cv2.resize(img, None, fx=3/10, fy=3/10)
r = cv2.selectROI("main", img, False, False)
cropped = img[r[1]:(r[1]+r[3]), r[0]:(r[0]+r[2])]
cv2.destroyWindow("main")
imgray = cv2.cvtColor(cropped, cv2.COLOR_BGR2GRAY)
edges = cv2.Canny(imgray, 35, 75)
lines = cv2.HoughLinesP(edges, 1, np.pi/180, 75, maxLineGap=1000)
midPoint = []
for line in lines:
    x1, y1, x2, y2 = line[0]
    cv2.line(cropped, (x1, y1), (x2, y2), (0, 0, 255), 1)
    mP = midpoint(x1, y1, x2, y2)
    midPoint.append(mP)
midPoint.sort(key = lambda x: x[1])
img[r[1]:(r[1]+r[3]), r[0]:(r[0]+r[2])] = cropped
print(lines)
print(midPoint)
cv2.imshow("img", img)
cv2.waitKey()
cv2.destroyAllWindows()
Depending on the image and the ROI I select I find inconsistent results. Image examples and where I select the ROIs:
Note that the base of the channel starts where the duct tape reaches. It looks like I can almost never find that exact line because of how noisy it is at the base. Right now these threshold values with no morphology seem to give the best results. I tried using a Sobel derivative instead of Canny as well, but got worse results.
Is it even possible to get exact measurements in this environment? Is it a matter of coding, of changing the way I take the pictures, or both? In the future I will possibly need to map the water profile during heavy turbulence; should I simply move away from OpenCV for that, since there is too much noise? Any help is appreciated.
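For reference, the measurement step described in the question could be sketched roughly as follows. This is only an illustration: it keeps near-horizontal segments, takes their midpoints, and converts a pixel difference into a length. The names `horizontal_midpoints`, `level_between`, `max_tilt_deg` and `units_per_pixel` are hypothetical, and the scale factor would have to come from a reference object of known size in the same image plane.

```python
import math

def segment_midpoint(x1, y1, x2, y2):
    """Midpoint of a line segment given by its two end points."""
    return ((x1 + x2) * 0.5, (y1 + y2) * 0.5)

def horizontal_midpoints(segments, max_tilt_deg=5.0):
    """Midpoints of the near-horizontal segments, sorted top to bottom."""
    mids = []
    for x1, y1, x2, y2 in segments:
        angle = abs(math.degrees(math.atan2(y2 - y1, x2 - x1)))
        angle = min(angle, 180.0 - angle)  # ignore the segment's direction
        if angle <= max_tilt_deg:
            mids.append(segment_midpoint(x1, y1, x2, y2))
    mids.sort(key=lambda p: p[1])
    return mids

def level_between(surface_y, base_y, units_per_pixel):
    """Water depth from the surface row and the channel-base row."""
    return (base_y - surface_y) * units_per_pixel
```

Filtering by tilt first means stray vertical edges, such as the channel frame, do not end up in the sorted midpoint list.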
I would not invest in any image processing with that setup. If you are only interested in the level at a few positions, you might be better off using conventional level sensors.
If you insist on image processing:
Add LED panels or any other kind of homogeneous background illumination to the back of the basin. Add dye to the water to get some contrast.
Get rid of the window reflections. Clean the glass.
Alternatively make the background dark and add something to the water that makes it stray light or fluorescent.
You could also add stuff that floats on the surface and is either retroreflective or self-illuminated. That way you would get a bright surface level indicator that is easily detected in an image.

Python: Calculating the angle between two bones in an x-ray

I am trying to write a script to calculate the angle between two bones given an x-ray.
A sample x-ray would look like the following:
I am trying to calculate the midline of each bone, essentially a line following the midpoints of the two sides of a bone, and then compare the angle between the two midlines.
I have tried using OpenCV to get the outline of the bones, but it does not seem accurate enough and picks up lots of extra data. I am stuck on how to proceed and how I would calculate the midline. I am quite new to image processing but have experience with Python.
Getting edges using OpenCV results:
Code for OpenCV:
import cv2
# Load the image
img = cv2.imread("xray-3.jpg")
# Find the contours
imgray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
edges = cv2.Canny(img,60,200)
im2, contours, hierarchy = cv2.findContours(edges, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)
hierarchy = hierarchy[0] # get the actual inner list of hierarchy descriptions
# For each contour, find the bounding rectangle and draw it
cv2.drawContours(img, contours, -1, (0,255,0), 3)
# Finally show the image
cv2.imshow('img',img)
cv2.waitKey(0)
cv2.destroyAllWindows()
If it's not cheating, I'd recommend cropping the image to exclude as much of the labels and scales as possible without removing any areas of interest.
That being said, I think your method of getting the contours will be usable if you do some preprocessing on the image. One algorithm that might do the trick is a Difference of Gaussians (DoG) filter, which will bring out the edges a little more. I slightly modified this code, which computes the DoG filter using a few different sigma and k values.
from skimage import io, feature, color, filters, img_as_float
from matplotlib import pyplot as plt
raw_img = io.imread('xray-3.jpg')
original_image = img_as_float(raw_img)
img = color.rgb2gray(original_image)
k = 1.6
plt.subplot(2,3,1)
plt.imshow(original_image)
plt.title('Original Image')
for idx,sigma in enumerate([4.0, 8.0, 16.0, 32.0]):
    s1 = filters.gaussian(img, k*sigma)
    s2 = filters.gaussian(img, sigma)
    # difference of Gaussians; multiplying by sigma would give scale invariance, but is not done here
    dog = s1 - s2
    plt.subplot(2,3,idx+2)
    print("min: {} max: {}".format(dog.min(), dog.max()))
    plt.imshow(dog, cmap='RdBu')
    plt.title('DoG with sigma=' + str(sigma) + ', k=' + str(k))
ax = plt.subplot(2, 3, 6)
blobs_dog = [(x[0], x[1], x[2]) for x in feature.blob_dog(img, min_sigma=4, max_sigma=32, threshold=0.5, overlap=1.0)]
# skimage has a bug in my version where only maxima were returned by the above
blobs_dog += [(x[0], x[1], x[2]) for x in feature.blob_dog(-img, min_sigma=4, max_sigma=32, threshold=0.5, overlap=1.0)]
#remove duplicates
blobs_dog = set(blobs_dog)
img_blobs = color.gray2rgb(img)
for blob in blobs_dog:
    y, x, r = blob
    c = plt.Circle((x, y), r, color='red', linewidth=2, fill=False)
    ax.add_patch(c)
plt.imshow(img_blobs)
plt.title('Detected DoG Maxima')
plt.show()
At first glance, it appears that sigma=8.0, k=1.6 might be your best bet, as this seems to best exaggerate the edges of the lower leg while getting rid of the noise across it, particularly on the subject's left (image right) leg. Give your edge detection another go, play around with k and sigma, and let me know what you get :)
If the results look good, you should be able to get a center point between the detected edges of each leg in every row of the image. Then just find the line of best fit through the midpoints for each leg and you should be good to go. You will also need to isolate one leg from the other, so again, if it's not cheating, maybe crop the image down the middle into two images.
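A minimal sketch of that midline idea, assuming you already have, for each image row, the x position of the left and right edge of one bone (the edge extraction itself is not shown). The function names and the `(y, x_left, x_right)` input layout are assumptions for illustration. Because the bones are closer to vertical than horizontal, the fit is x as a function of y.

```python
import math

def row_midpoints(edge_rows):
    """edge_rows: list of (y, x_left, x_right); returns (y, x_mid) pairs."""
    return [(y, (x_left + x_right) / 2.0) for y, x_left, x_right in edge_rows]

def fit_midline(points):
    """Least-squares fit of x = slope * y + intercept through (y, x) points."""
    n = len(points)
    sum_y = sum(y for y, _ in points)
    sum_x = sum(x for _, x in points)
    sum_yy = sum(y * y for y, _ in points)
    sum_yx = sum(y * x for y, x in points)
    denom = n * sum_yy - sum_y ** 2
    if denom == 0:
        raise ValueError("need midpoints from at least two different rows")
    slope = (n * sum_yx - sum_y * sum_x) / denom
    intercept = (sum_x - slope * sum_y) / n
    return slope, intercept

def angle_between(slope_a, slope_b):
    """Acute angle in degrees between two midlines given as dx/dy slopes."""
    diff = abs(math.degrees(math.atan(slope_a) - math.atan(slope_b)))
    return min(diff, 180.0 - diff)
```

Running `fit_midline(row_midpoints(...))` once per bone and passing the two slopes to `angle_between` would give the angle the question asks for.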

Road Lane Detection program failing to detect lane properly

I am trying to develop a program that can detect lanes on the road. I have experimented with both the Hough Line Transform and the Probabilistic Hough Line Transform. However, neither gives the results that I want.
Original Image:
Hough Line Transform
Probabilistic Hough Line Transform
It seems that with the Hough Line Transform I can at least detect the entire lane, but unfortunately the lines go on infinitely (until they move off the picture), to the point where they intersect each other, which does not make a good graphical lane marker.
I also tried the Probabilistic Hough Line Transform. The green line used for lane detection does not run off to infinity like the other one, but it fails to mark and detect the entire lane.
I am trying to replicate results here (by writing it in Python)
http://www.transistor.io/revisiting-lane-detection-using-opencv.html
What can I do to fix this problem?
Code:
import numpy as np
import cv2
from matplotlib import pyplot as plt
from PIL import Image
import imutils
def invert_img(img):
    img = (255-img)
    return img
def canny(imgray):
    imgray = cv2.GaussianBlur(imgray, (5,5), 200)
    canny_low = 5
    canny_high = 150
    thresh = cv2.Canny(imgray,canny_low,canny_high)
    return thresh
def filtering(imgray):
    thresh = canny(imgray)
    minLineLength = 1
    maxLineGap = 1
    lines = cv2.HoughLines(thresh,1,np.pi/180,0)
    #lines = cv2.HoughLinesP(thresh,2,np.pi/180,100,minLineLength,maxLineGap)
    print(lines.shape)
    # Code for HoughLinesP
    '''
    for i in range(0,lines.shape[0]):
        for x1,y1,x2,y2 in lines[i]:
            cv2.line(img,(x1,y1),(x2,y2),(0,255,0),2)
    '''
    # Code for HoughLines
    for i in range(0,5):
        for rho,theta in lines[i]:
            a = np.cos(theta)
            b = np.sin(theta)
            x0 = a*rho
            y0 = b*rho
            x1 = int(x0 + 1000*(-b))
            y1 = int(y0 + 1000*(a))
            x2 = int(x0 - 1000*(-b))
            y2 = int(y0 - 1000*(a))
            cv2.line(img,(x1,y1),(x2,y2),(0,0,255),2)
    return thresh
img = cv2.imread('images/road_0.bmp')
imgray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
img = imutils.resize(img, height = 500)
imgray = imutils.resize(imgray, height = 500)
thresh = filtering(imgray)
cv2.imshow('original', img)
cv2.imshow('result', thresh)
cv2.waitKey(0)
Cool topic! First of all, why did you add the Gaussian blur? Your source article doesn't mention that at all. If I remove that, I instantly get extra crazy lines, which I can tone down with the canny_low and canny_high. About the best I could find was low=100 and high=180.
Second, you did quite a good job translating the article to Python. However, I think you left out a crucial detail. The author writes:
// Canny algorithm
Mat contours;
Canny(image,contours,50,350);
Mat contoursInv;
threshold(contours,contoursInv,128,255,THRESH_BINARY_INV);
You call the Canny function (cv2.Canny()), but you don't call the threshold function. According to the documentation I found, this function "applies a fixed-level threshold to each array element." I experimented with your code and came up with the following.
#thresh = canny(imgray) # original
edges = canny(imgray) # docs refer to return value as "edges"
retval, dst = cv2.threshold(edges, 128, 255,cv2.THRESH_BINARY_INV)
Two values are returned - retval isn't particularly important for us right now. dst is the destination 2D array of image data after thresholding. You would then update your call to cv2.HoughLines and cv2.HoughLinesP replacing "thresh" with "dst." When I did this I got a lot more interesting behavior, though I was not able to find the correct tuning values to make the lines work well.
So, hopefully that gives you some pointers. Try my tips, and also read the article once or twice more to double check that you have the same program flow as the author. This seems like a fun project, have fun!
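On the separate problem of the standard Hough lines running off the picture and crossing each other: one option, sketched here as an assumption rather than as what the linked article does, is to draw each (rho, theta) line only between two image rows, for example from the bottom of the frame up to a chosen horizon row. `clip_line_to_rows`, `y_top` and `y_bottom` are hypothetical names.

```python
import math

def clip_line_to_rows(rho, theta, y_top, y_bottom):
    """End points of the line x*cos(theta) + y*sin(theta) = rho between two rows.

    Returns ((x_top, y_top), (x_bottom, y_bottom)), or None for a
    (near-)horizontal line, which never crosses both rows at one point each.
    """
    sin_t, cos_t = math.sin(theta), math.cos(theta)
    if abs(cos_t) < 1e-6:
        return None
    x_top = (rho - y_top * sin_t) / cos_t
    x_bottom = (rho - y_bottom * sin_t) / cos_t
    return (int(round(x_top)), y_top), (int(round(x_bottom)), y_bottom)
```

The two returned points can be passed straight to `cv2.line` in place of the `x0 ± 1000*...` end points used in the question's drawing loop.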

Detecting edges of lasers/lights in images using Python

I am writing a program in Python to loop through images extracted from the frames of a video and detect lines within them. The images are of fairly poor quality and vary significantly in their content. Here are two examples:
Sample Image 1 | Sample Image 2
I am trying to detect the lasers in each image and look at their angles. Eventually I would like to look at the distribution of these angles and output a sample of three of them.
In order to detect the lines in the images, I have looked at various combinations of the following:
Hough Lines
Canny Edge Detection
Bilateral / Gaussian Filtering
Denoising
Histogram Equalising
Morphological Transformations
Thresholding
I have tried lots of combinations of lots of different methods and I can't seem to come up with anything that really works. What I have been trying is along these lines:
import cv2
import numpy as np
img = cv2.imread('testimg.jpg')
grey = cv2.cvtColor(img,cv2.COLOR_BGR2GRAY)
clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))
equal = clahe.apply(grey)
denoise = cv2.fastNlMeansDenoising(equal, 10, 10, 7, 21)
blurred = cv2.GaussianBlur(denoise, (3, 3), 0)
blurred = cv2.medianBlur(blurred, 9)
(mu, sigma) = cv2.meanStdDev(blurred)
edge = cv2.Canny(blurred, mu - sigma, mu + sigma)
lines = cv2.HoughLines(edge, 1, np.pi/180, 50)
if lines is not None:
    print(len(lines[0]))
    for rho,theta in lines[0]:
        a = np.cos(theta)
        b = np.sin(theta)
        x0 = a*rho
        y0 = b*rho
        x1 = int(x0 + 1000*(-b))
        y1 = int(y0 + 1000*(a))
        x2 = int(x0 - 1000*(-b))
        y2 = int(y0 - 1000*(a))
        cv2.line(img, (x1, y1), (x2, y2), (0, 0, 255), 2)
cv2.imshow("preview", img)
cv2.waitKey(0)
This is just one of many different attempts. Even if I can find a method that works slightly better for one of the images, it proves to be much worse for another one. I am not expecting completely perfect results, but I'm sure that they could be better than I've managed so far!
Could anyone suggest a tactic to help me move forward?
Here is one answer. It is an answer that would help you if your camera is in a fixed position and so are your lasers...and your lasers emit from coordinates that you can determine. So, if you have many experiments that happen concurrently with the same setup, this can be a starting point.
The question "image information along a polar coordinate system" was helpful for getting a polar transform. I chose not to use OpenCV because not everybody can get it going (Windows). I took the code from the linked question and played around a bit. If you add his code to mine (without the imports or main method) then you'll have the required functions.
import numpy as np
import scipy as sp
import scipy.ndimage
import matplotlib.pyplot as plt
import sys
from PIL import Image  # the bare "import Image" only works with the old PIL package
def main():
    data = np.array(Image.open('crop1.jpg').convert('LA').convert('RGB'))
    origin = (188, -30)
    polar_grid, r, theta = reproject_image_into_polar(data, origin=origin)
    means, angs = mean_move(polar_grid, 10, 5)
    means = np.array(means)
    means -= np.mean(means)
    means[means<0] = 0
    means *= means
    plt.figure()
    plt.bar(angs, means)
    plt.show()
def mean_move(data, width, stride):
    means = []
    angs = []
    x = 0
    while True:
        if x + width > data.shape[1]:
            break
        d = data[:,x:x+width]
        m = np.mean(d[d!=0])
        means.append(m)
        ang = 180./data.shape[1] * float(x + x+width)/2.
        angs.append(ang)
        x += stride
    return means, angs
# copy-paste Joe Kington code here
Image around the upper source.
Notice that I chose one laser and cropped a region around its source. This can be done automatically and repeated for each image. I also estimated the source coordinates (188, -30) (in x,y form) based on where I thought it emitted from. The following image (a GIMP screenshot!) shows my reasoning (it appeared that there was a very faint ray that I traced back to, and I took the intersection)...it also shows the measurement of the angle, ~140 degrees.
Polar transform of the image (notice the vertical band of intensity...it is vertical because we chose the correct origin for the laser)
And using a very hastily created moving-window mean function and a rough mapping to degree angles, along with a diff from the mean + zeroing + squaring.
So your task becomes grabbing these peaks. Oh look ~140! Who's your daddy!
In recap, if the setup is fixed, then this may help you! I really need to get back to work and stop procrastinating.
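Grabbing the peaks from the windowed means could look something like the sketch below. It is a plain local-maximum search over the `angs` and `means` lists produced above; `find_peaks` and `min_height` are names made up for this example, and a real signal would probably need some smoothing or a higher `min_height` first.

```python
def find_peaks(angles, values, min_height=0.0):
    """Return (angle, value) pairs at local maxima, strongest first."""
    peaks = []
    for i in range(1, len(values) - 1):
        higher_than_left = values[i] > values[i - 1]
        not_lower_than_right = values[i] >= values[i + 1]
        if values[i] > min_height and higher_than_left and not_lower_than_right:
            peaks.append((angles[i], values[i]))
    # strongest response first, so peaks[0] is the most likely laser angle
    peaks.sort(key=lambda p: p[1], reverse=True)
    return peaks
```

Taking the first few entries of the result would give the sample of angles asked for in the question.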
