Python: Fast way to remove horizontal black lines in an image

I would like to remove horizontal black lines from an image:
To do this, I interpolate the RGB values of each column of pixels.
The black lines disappear, but I think this function can be optimized:
import time

import cv2
import numpy as np
from scipy.interpolate import interp1d
from tqdm import tqdm

def fillMissingValue(img_in):
    img_out = np.copy(img_in)

    # Proceed column by column
    for i in tqdm(range(img_in.shape[1])):
        col = img_in[:,i]

        col_r = col[:,0]
        col_g = col[:,1]
        col_b = col[:,2]

        r = interpolate(col_r)
        g = interpolate(col_g)
        b = interpolate(col_b)

        img_out[:,i,0] = r
        img_out[:,i,1] = g
        img_out[:,i,2] = b

    return img_out

def interpolate(y):
    x = np.arange(len(y))
    idx = np.nonzero(y)
    interp = interp1d(x[idx], y[idx], fill_value="extrapolate")
    return interp(x)

if __name__ == "__main__":
    img = cv2.imread("lena.png")
    img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
    img = cv2.resize(img, (1024, 1024))

    start = time.time()
    img2 = fillMissingValue(img)
    end = time.time()

    print("Process time: {}".format(np.round(end - start, 3)))
Do you have any ideas?
I thought of adding a preprocessing step to identify the position of the black lines first, and then only interpolate the neighboring pixels. But I don't think it would be faster.
Current result:

interp1d is not very efficient.
As proposed by @ChristophRackwitz in the comments, you can detect the location of the lines and use the inpainting method provided by OpenCV:
img = cv2.imread('lena.jpg')

# Locate the relatively black lines
threshold = 25
lineIdx = np.where(np.mean(img, axis=(1,2)) < threshold)

# Build the inpainting mask on the located lines
mask = np.zeros(img.shape[:2], dtype=np.uint8)
mask[lineIdx] = 255

# Actual inpainting.
# Note: using 2 or 1 instead of 3 makes the computation
# respectively ~2 and ~4 times faster on my machine, but
# the result is not as good as with 3.
img2 = cv2.inpaint(img, mask, 3, cv2.INPAINT_NS)
The computational part takes 87 ms on my machine, while your code takes 342 ms. Note that because of JPEG compression, the result is not so great. You can also inpaint the neighbouring lines (e.g. lineIdx-1 and lineIdx+1) to get a much better result, at the expense of a slower computation (about 2.5× slower on my machine).
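For example, one way to widen the mask to the neighbouring lines is to work directly on the indices. This is only a sketch, not part of the measured code; np.clip guards against lines sitting at the image border:

rows = lineIdx[0]
# Include the line above and below each detected line in the mask,
# clipping the indices so border lines stay inside the image.
neighbours = np.clip(np.concatenate([rows - 1, rows, rows + 1]), 0, img.shape[0] - 1)
mask[neighbours] = 255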
An alternative solution is to perform the interpolation yourself in Numpy:
%%time

# Locate the relatively black lines
threshold = 25
lineIdx = np.where(np.mean(img, axis=(1,2)) < threshold)[0]
lineIdxSet = set(lineIdx)

img2 = img.copy()
start, end = None, None

for i in range(img.shape[0]+1):
    if i in lineIdxSet:
        if start is None:
            start = i
        end = i
    else:
        if start is not None:
            assert end is not None

            # The first lines are black
            if start <= 0:
                i0, i1 = end+1, end+1
            # The last lines are black
            elif end >= img.shape[0]-1:
                i0, i1 = start-1, start-1
            # Usual case
            else:
                i0, i1 = start-1, end+1

            # The full image is black
            if i0 < 0 or i1 >= img.shape[0]:
                continue

            end = min(end, img.shape[0]-1)

            # Actual linear interpolation (of a block of lines):
            # blend from the line above (l0) to the line below (l1).
            coef = np.linspace(0, 1, end-start+3)[1:-1].reshape(-1, 1)
            l0 = img[i0].reshape(-1)
            l1 = img[i1].reshape(-1)
            img2[start:end+1] = ((1.0-coef) * l0 + coef * l1).reshape(-1, img.shape[1], 3)

            start, end = None, None
This code takes only 5 ms on my machine. The result should be similar to that of your original code, except that it works line by line rather than column by column, and the detection is not independent for each colour channel. Note that the inpainting method gives better results when large blocks of lines are black.
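As a side note, if you prefer to keep the original column-wise approach, numpy's np.interp is a much lighter replacement for building an interp1d object per channel. A sketch, with the caveat that np.interp clamps at the boundaries instead of extrapolating like fill_value="extrapolate":

import numpy as np

def interpolate_fast(y):
    # Same idea as interpolate() above, but without constructing a scipy
    # interpolator per call. Zero pixels are treated as missing, as before.
    x = np.arange(len(y))
    idx = np.nonzero(y)[0]
    return np.interp(x, idx, y[idx])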

Related

How do you efficiently remove rows and columns from a 3d numpy array?

My goal is to remove dark horizontal and vertical lines from an image after it has been converted into a numpy array. I didn't want to use a predefined image module for this since I wanted fine control over parameters such as threshold values.
My logic was as follows:
1. Convert a color image to a 3D numpy array image (BGR) using cv2.imread.
2. Iterate over row indices and extract each row using row = image[row_index,:,:].
3. In each row, count how many pixels are "black pixels", based on whether all 3 channel values are below the defined threshold.
4. If a large enough number (or ratio) of pixels in a row meets the above criterion, store this row index in the list remove_rows.
5. After all iterations, determine the rows to be preserved, stored in preserve_rows, based on the list remove_rows.
6. The new image after row deletion can be obtained by image = image[preserve_rows,:,:].
7. Repeat the process for columns as well.
The program works, but it takes a very long time. I think the time complexity is O(rows * columns * 3), because every value has to be visited and compared with the threshold. The program takes around 9 seconds per image, which is unacceptable, since I eventually plan to use this function for preprocessing in Keras's ImageDataGenerator, and I'm not sure whether that function uses the GPU during neural network training. The full code is below:
def edge_removal(image, threshold=50, max_black_ratio=0.7):
    num_rows, _, _ = image.shape
    remove_rows = []
    threshold_times = []
    start_time = time.time()
    for row_index in range(num_rows):
        row = image[row_index,:,:]
        pixel_count = 0
        black_pixel_count = 0
        for pixel in row:
            pixel_count += 1
            b, g, r = pixel
            pre_threshold_time = time.time()
            if all([x <= threshold for x in [b, g, r]]):
                black_pixel_count += 1
            threshold_times.append(time.time() - pre_threshold_time)
        if pixel_count > 0 and (black_pixel_count/pixel_count) > max_black_ratio:
            remove_rows.append(row_index)
    time_taken = time.time() - start_time
    print(f"Time taken for thresholding = {sum(threshold_times)}")
    print(f"Time taken till row for loop = {time_taken}")
    preserve_rows = [x for x in range(num_rows) if x not in remove_rows]
    image = image[preserve_rows,:,:]
    _, num_cols, _ = image.shape
    remove_cols = []
    for col_index in range(num_cols):
        col = image[:,col_index,:]
        pixel_count = 0
        black_pixel_count = 0
        for pixel in col:
            pixel_count += 1
            b, g, r = pixel
            if all([x <= threshold for x in [b, g, r]]):
                black_pixel_count += 1
        if pixel_count > 0 and (black_pixel_count/pixel_count) > max_black_ratio:
            remove_cols.append(col_index)
    preserve_cols = [x for x in range(num_cols) if x not in remove_cols]
    image = image[:,preserve_cols,:]
    time_taken = time.time() - start_time
    print(f"Total time taken = {time_taken}")
    return image
And the output of the code is:
Time taken for thresholding = 3.586946487426758
Time taken till row for loop = 4.530229091644287
Total time taken = 8.74315094947815
I've tried the following:
Using multithreading to replace the outer for loop, where the argument to the threaded function is the thread number (number of threads = number of rows in the image). However, this did not speed up the program, probably because the for loop is a CPU-bound process, which cannot be sped up because of the Global Interpreter Lock, as described by this SO answer.
Looking for other suggestions on how to reduce the time complexity of the program. This answer did not help me much, since it's not the deletion that is the bottleneck, as can be seen in the output. The number of comparisons needed for thresholding is what's slowing this program down.
Any suggestions or heuristics to reduce the amount of computation and thereby the processing time of the program?
Since your code consists of two parts that do the same job, just on different dimensions of the image, I moved all that logic into a single function that tells you whether the provided "series of pixels" (row or column, it does not matter) is above or below the threshold.
I replaced all the manual counts with len calls.
The various generators (b, g, r = pixel; x <= threshold for x in (b, g, r)) are replaced with a direct numpy array comparison such as pixels <= threshold, and Python's all is replaced by numpy's .all().
The old and new code process my test image in 5.9 s and 37 ms respectively, with the added benefit of readability.
def edge_removal(image, threshold=50, max_black_ratio=0.7):
    def pixels_should_be_conserved(pixels) -> bool:
        black_pixel_count = (pixels <= threshold).all(axis=1).sum()
        pixel_count = len(pixels)
        return pixel_count > 0 and black_pixel_count/pixel_count <= max_black_ratio

    num_rows, num_columns, _ = image.shape
    preserved_rows = [r for r in range(num_rows) if pixels_should_be_conserved(image[r, :, :])]
    preserved_columns = [c for c in range(num_columns) if pixels_should_be_conserved(image[:, c, :])]
    image = image[preserved_rows,:,:]
    image = image[:,preserved_columns,:]
    return image
To explain further the change that saved the most time (counting black pixels), let's take a look at a simplified example.
red = np.array([255, 0, 0])
black = np.array([0, 0, 0])
pixels = np.array([red, red, red, black, red])  # Simple line of 5 pixels.
threshold = 50

pixels <= threshold
# >>> array([[False,  True,  True],
#            [False,  True,  True],
#            [False,  True,  True],
#            [ True,  True,  True],
#            [False,  True,  True]])

(pixels <= threshold).all(axis=1)
# >>> array([False, False, False,  True, False])
# We successfully detect that the fourth pixel has all its rgb values
# below the threshold.

(pixels <= threshold).all(axis=1).sum()
# >>> 1
# Summing a boolean array is a handy way of counting how many elements
# in the array are True, i.e. dark enough in our case.
Alternative 1: HSV.
Another thing we can consider is using the HSV colour system, since you only care about brightness in your problem. That allows us to check whether v <= threshold (the value channel) for each pixel: one comparison instead of three.
The image is processed in 18 ms instead of 37 ms, despite the two conversions to and from HSV.
def edge_removal(image, threshold=50, max_black_ratio=0.7):
    def pixels_should_be_conserved(pixels) -> bool:
        black_pixel_count = (pixels[:,2] <= threshold).sum()  # Notice the change here.
        pixel_count = len(pixels)
        return pixel_count > 0 and black_pixel_count/pixel_count <= max_black_ratio

    image = cv2.cvtColor(image, cv2.COLOR_BGR2HSV)
    num_rows, num_columns, _ = image.shape
    preserved_rows = [r for r in range(num_rows) if pixels_should_be_conserved(image[r, :, :])]
    preserved_columns = [c for c in range(num_columns) if pixels_should_be_conserved(image[:, c, :])]
    image = image[preserved_rows,:,:]
    image = image[:,preserved_columns,:]
    image = cv2.cvtColor(image, cv2.COLOR_HSV2BGR)
    return image
Alternative 2: grayscale.
We can also work in grayscale mode and compare each "pixel" (a single value now) directly against the threshold. We save one conversion compared to the HSV alternative, but use a little more memory.
It runs in 14 ms.
def edge_removal(image, threshold=50, max_black_ratio=0.7):
    def pixels_should_be_conserved(pixels) -> bool:
        black_pixel_count = (pixels <= threshold).sum()
        pixel_count = len(pixels)
        return pixel_count > 0 and black_pixel_count/pixel_count <= max_black_ratio

    image_grayscale = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    num_rows, num_columns, _ = image.shape
    preserved_rows = [r for r in range(num_rows) if pixels_should_be_conserved(image_grayscale[r, :])]
    preserved_columns = [c for c in range(num_columns) if pixels_should_be_conserved(image_grayscale[:, c])]
    image = image[preserved_rows,:,:]
    image = image[:,preserved_columns,:]
    return image
Benchmarking:

RGB (OP)    RGB      HSV      Gray
5900 ms     37 ms    18 ms    14 ms
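For reference, the per-row and per-column list comprehensions can themselves be replaced by two vectorized reductions over a single boolean mask. This is only a sketch of that idea (it is not one of the benchmarked versions), keeping the same BGR semantics as the original code:

import numpy as np

def edge_removal_vectorized(image, threshold=50, max_black_ratio=0.7):
    # A pixel is "black" when all three channels are <= threshold.
    black = (image <= threshold).all(axis=2)            # shape (rows, cols)
    keep_rows = black.mean(axis=1) <= max_black_ratio   # black-pixel ratio per row
    image = image[keep_rows]
    # Recompute the column ratios on the row-filtered image, as the original does.
    keep_cols = black[keep_rows].mean(axis=0) <= max_black_ratio
    return image[:, keep_cols]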

Pixels intensity values between two lines

I have created an algorithm that detects the edges of an extruded collagen casing and draws a centerline between these edges on an image: Casing with a centerline.
Here is my code:
import cv2
import numpy as np
import matplotlib.pyplot as plt
plt.style.use('fivethirtyeight')

img = cv2.imread("C:/Users/5.jpg", cv2.IMREAD_GRAYSCALE)
img = cv2.resize(img, (1500, 1200))

# ROI
fromCenter = False
r = cv2.selectROI(img, fromCenter)
imCrop = img[int(r[1]):int(r[1]+r[3]), int(r[0]):int(r[0]+r[2])]

# Operations on the image
_, thresh = cv2.threshold(imCrop, 100, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
kernel = np.ones((5,5), np.uint8)
opening = cv2.morphologyEx(thresh, cv2.MORPH_OPEN, kernel)
blur = cv2.GaussianBlur(opening, (7,7), 0)
edges = cv2.Canny(blur, 0, 20)

# Edge localization, packing the coords into a list
indices = np.where(edges != [0])
coordinates = list(zip(indices[1], indices[0]))
num = len(coordinates)

# Separating into top and bottom edges
bot_cor = coordinates[:int(num/2)]
top_cor = coordinates[-int(num/2):]

# Converting to arrays, sorting
a, b = np.array(top_cor), np.array(bot_cor)
a, b = a[a[:,0].argsort()], b[b[:,0].argsort()]

# Approximating the edges by 5th-degree polynomials
min_a_x, max_a_x = np.min(a[:,0]), np.max(a[:,0])
new_a_x = np.linspace(min_a_x, max_a_x, imCrop.shape[1])
a_coefs = np.polyfit(a[:,0], a[:,1], 5)
new_a_y = np.polyval(a_coefs, new_a_x)

min_b_x, max_b_x = np.min(b[:,0]), np.max(b[:,0])
new_b_x = np.linspace(min_b_x, max_b_x, imCrop.shape[1])
b_coefs = np.polyfit(b[:,0], b[:,1], 5)
new_b_y = np.polyval(b_coefs, new_b_x)

# Defining the centerline
midx = [np.average([new_a_x[i], new_b_x[i]], axis=0) for i in range(imCrop.shape[1])]
midy = [np.average([new_a_y[i], new_b_y[i]], axis=0) for i in range(imCrop.shape[1])]

plt.figure(figsize=(16,8))
plt.title('Cross section')
plt.xlabel('Length of the casing', fontsize=18)
plt.ylabel('Width of the casing', fontsize=18)
plt.plot(new_a_x, new_a_y, c='black')
plt.plot(new_b_x, new_b_y, c='black')
plt.plot(midx, midy, '-', c='blue')
plt.show()

# Converting the coords to a list (plotting purposes)
coords = list(zip(midx, midy))
points = list(np.int_(coords))

mask = np.zeros((imCrop.shape[:2]), np.uint8)
mask = edges

# Plotting
for point in points:
    cv2.circle(mask, tuple(point), 1, (255,255,255), -1)
for point in points:
    cv2.circle(imCrop, tuple(point), 1, (255,255,255), -1)

cv2.imshow('imCrop', imCrop)
cv2.imshow('mask', mask)
cv2.waitKey(0)
cv2.destroyAllWindows()
Now I would like to sum up the intensities of each pixel in the region between the top edge and the centerline (and the same for the region between the centerline and the bottom edge).
Is there any way to limit the ROI to the region between the detected edges and split it into two regions based on the calculated centerline?
Or is there any way to access the pixels contained between an edge and the centerline based on their coordinates?
(It's my very first post here, sorry in advance for all the mistakes)
I wrote a somewhat naïve code to get masks for the upper and lower parts. My code assumes that the source image will always be like yours: with horizontal stripes.
After applying Canny I get this:
Then I run some loops through the image array to fill the unwanted areas of the image. This is done separately for the upper and lower parts, creating the masks. The results are:
Then you can use these masks to sum only the elements you're interested in, using cv.sumElems.
import cv2 as cv

# Open as a grayscale image
src = cv.imread("colagen.png", cv.IMREAD_GRAYSCALE)

# Apply Canny
threshold = 100
canny_output = cv.Canny(src, threshold, threshold * 2)

# Find the mask for the upper part
mask1 = canny_output.copy()
x, y = canny_output.shape
area = 0
for j in range(y):
    area = 0
    for i in range(x):
        if area == 0:
            if mask1[i][j] > 0:
                area = 1
                continue
            else:
                mask1[i][j] = 255
        elif area == 1:
            if mask1[i][j] > 0:
                area = 2
            else:
                continue
        else:
            mask1[i][j] = 255
mask1 = cv.bitwise_not(mask1)

# Find the mask for the lower part
mask2 = canny_output.copy()
x, y = canny_output.shape
area = 0
for j in range(y):
    area = 0
    for i in range(x):
        if area == 0:
            if mask2[-i][j] > 0:
                area = 1
                continue
            else:
                mask2[-i][j] = 255
        elif area == 1:
            if mask2[-i][j] > 0:
                area = 2
            else:
                continue
        else:
            mask2[-i][j] = 255
mask2 = cv.bitwise_not(mask2)

# Apply the masks and calculate the sum of elements in the upper and lower parts
sums = [0, 0]
(sums[0], _, _, _) = cv.sumElems(cv.bitwise_and(src, mask1))
(sums[1], _, _, _) = cv.sumElems(cv.bitwise_and(src, mask2))

cv.imshow('src', src)
cv.imshow('canny', canny_output)
cv.imshow('mask1', mask1)
cv.imshow('mask2', mask2)
cv.imshow('masked1', cv.bitwise_and(src, mask1))
cv.imshow('masked2', cv.bitwise_and(src, mask2))
cv.waitKey()
Alternatives...
Probably there exists some function that fills the areas of the Canny result. I tried cv.fillPoly and cv.floodFill, but didn't manage to make them work easily... But maybe someone else can help you with that...
Edit
I found another way to get the masks with cleaner code, using numpy's np.add.accumulate, then np.clip, and then a modulo operation:
# First divide canny_output by 255 to get 0's and 1's, then perform
# an accumulated addition down each column. Thus you get +1 at every
# edge line, "painting" the areas with 1, 2, 3...
a = np.add.accumulate(canny_output/255, 0)

# Clip the values: anything greater than 2 becomes 2
a = np.clip(a, 0, 2)

# Perform a modulo to get areas alternating between 0 and 1, then multiply by 255
a = a % 2 * 255

# Convert to uint8
mask1 = cv.convertScaleAbs(a)

# To get mask2 (the lower mask), flip the array, then do the same as above
a = np.add.accumulate(np.flip(canny_output, 0)/255, 0)
a = np.clip(a, 0, 2)
a = a % 2 * 255
mask2 = cv.convertScaleAbs(np.flip(a, 0))
This returns almost the same result. The border of the mask is a little bit different...
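If you also need per-region averages rather than raw sums (not part of the original answer), you can divide each masked sum by the number of pixels the mask selects, reusing src, mask1 and mask2 from above:

# Sketch: mean intensity of a region = masked sum / mask pixel count.
# Assumes each mask selects at least one pixel.
mean_upper = cv.sumElems(cv.bitwise_and(src, mask1))[0] / cv.countNonZero(mask1)
mean_lower = cv.sumElems(cv.bitwise_and(src, mask2))[0] / cv.countNonZero(mask2)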

Image stitching

I recorded a video while the bottle was rotated. Then I extracted frames from the video and cut the central block from all the images.
So for all frames I got the following images:
I tried to stitch them into a panorama, but I got bad results.
I used the following program:
import glob
import sys

import cv2
import imutils
import numpy

# from panorama import Panorama

def readImages(imageString):
    images = []
    # Get images from arguments.
    for i in range(0, len(imageString)):
        img = cv2.imread(imageString[i])
        images.append(img)
    return images

def findAndDescribeFeatures(image):
    # Getting the gray image
    grayImage = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

    # Find and describe the features.
    # Fast: sift = cv2.xfeatures2d.SURF_create()
    sift = cv2.xfeatures2d.SIFT_create()

    # Find interest points.
    keypoints = sift.detect(grayImage, None)

    # Computing features.
    keypoints, features = sift.compute(grayImage, keypoints)

    # Converting keypoints to numbers.
    keypoints = numpy.float32([kp.pt for kp in keypoints])
    return keypoints, features

def matchFeatures(featuresA, featuresB):
    # Slow: featureMatcher = cv2.DescriptorMatcher_create("BruteForce")
    featureMatcher = cv2.DescriptorMatcher_create("FlannBased")
    matches = featureMatcher.knnMatch(featuresA, featuresB, k=2)
    return matches

def generateHomography(allMatches, keypointsA, keypointsB, ratio, ransacRep):
    if not allMatches:
        return None
    matches = []

    for match in allMatches:
        # Lowe's ratio test
        if len(match) == 2 and (match[0].distance / match[1].distance) < ratio:
            matches.append(match[0])

    pointsA = numpy.float32([keypointsA[m.queryIdx] for m in matches])
    pointsB = numpy.float32([keypointsB[m.trainIdx] for m in matches])

    if len(pointsA) > 4:
        H, status = cv2.findHomography(pointsA, pointsB, cv2.RANSAC, ransacRep)
        return matches, H, status
    else:
        return None

paths = glob.glob("C:/Users/andre/Desktop/Panorama-master/frames/*.jpg")
images = readImages(paths[::-1])

while len(images) > 1:
    imgR = images.pop()
    imgL = images.pop()
    interestsR, featuresR = findAndDescribeFeatures(imgR)
    interestsL, featuresL = findAndDescribeFeatures(imgL)
    try:
        try:
            allMatches = matchFeatures(featuresR, featuresL)
            _, H, _ = generateHomography(allMatches, interestsR, interestsL, 0.75, 4.0)
            result = cv2.warpPerspective(imgR, H,
                                         (imgR.shape[1] + imgL.shape[1], imgR.shape[0]))
            result[0:imgL.shape[0], 0:imgL.shape[1]] = imgL
            images.append(result)
        except TypeError:
            pass
    except cv2.error:
        pass

result = imutils.resize(images[0], height=260)
cv2.imshow("Result", result)
cv2.imwrite("Result.jpg", result)
cv2.waitKey(0)
My result was:
Maybe someone knows how to do it better? I think that using small blocks from each frame should remove the roundness... But...
Data: https://1drv.ms/f/s!ArcAdXhy6TxPho0FLKxyRCL-808Y9g
I managed to achieve a nice result. I rewrote your code just a little bit, here is the changed part:
def generateTransformation(allMatches, keypointsA, keypointsB, ratio):
    if not allMatches:
        return None
    matches = []

    for match in allMatches:
        # Lowe's ratio test
        if len(match) == 2 and (match[0].distance / match[1].distance) < ratio:
            matches.append(match[0])

    pointsA = numpy.float32([keypointsA[m.queryIdx] for m in matches])
    pointsB = numpy.float32([keypointsB[m.trainIdx] for m in matches])

    if len(pointsA) > 2:
        transformation = cv2.estimateRigidTransform(pointsA, pointsB, True)
        if transformation is None or transformation.shape[1] < 1 or transformation.shape[0] < 1:
            return None
        return transformation
    else:
        return None

paths = glob.glob("a*.jpg")
images = readImages(paths[::-1])
result = images[0]

while len(images) > 1:
    imgR = images.pop()
    imgL = images.pop()
    interestsR, featuresR = findAndDescribeFeatures(imgR)
    interestsL, featuresL = findAndDescribeFeatures(imgL)
    allMatches = matchFeatures(featuresR, featuresL)
    transformation = generateTransformation(allMatches, interestsR, interestsL, 0.75)
    if transformation is None or transformation[0, 2] < 0:
        images.append(imgR)
        continue

    # Keep only the x translation: reset scaling, rotation and y translation
    # so the matrix acts as an identity plus a horizontal shift.
    transformation[0, 0] = 1
    transformation[1, 1] = 1
    transformation[0, 1] = 0
    transformation[1, 0] = 0
    transformation[1, 2] = 0
    result = cv2.warpAffine(imgR, transformation, (imgR.shape[1] +
                            int(transformation[0, 2] + 1), imgR.shape[0]))
    result[:, :imgL.shape[1]] = imgL
    cv2.imshow("R", result)
    images.append(result)
    cv2.waitKey(1)

cv2.imshow("Result", result)
So the key thing I changed is the transformation of the images. I use estimateRigidTransform() instead of findHomography() to calculate the transformation of the image. From that transformation matrix I only extract the x coordinate translation, which is in the [0, 2] cell of the resulting affine transformation matrix transformation. I set the other transformation matrix elements as if it were an identity transformation (no scaling, no perspective, no rotation or y translation). Then I pass it to warpAffine() to transform imgR the same way you did with warpPerspective().
You can do this because you have stable camera and spinning object positions, and you capture with a straight front view of the object. It means that you don't have to do any perspective / scaling / rotation corrections and can just "glue" the images together along the x axis.
I think your approach fails because you actually observe the bottle with a slightly tilted-down camera view, or the bottle is not in the middle of the screen. I'll try to describe that with an image. I depict some text on the bottle in red. For example, the algorithm finds a matching point pair (green) on the bottom of the captured round object. Note that the point moves not only right, but diagonally up too. The program then calculates the transformation taking into account the points, which move up slightly. This gets worse frame by frame.
The recognition of matching image points may also be slightly inaccurate, so extracting only the x translation is even better, because you give the algorithm "a clue" about the actual situation you have. This makes it less applicable to other conditions, but in your case it improves the result a lot.
I also filter out some incorrect results with the if transformation[0, 2] < 0 check (the bottle can rotate in only one direction, and the code won't work if that value is negative anyway).
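One caveat if you are on a recent OpenCV: estimateRigidTransform() was removed in OpenCV 4. A sketch of the closest replacement for the fullAffine=True call used above (not part of the original answer):

# OpenCV >= 4: estimateAffine2D replaces estimateRigidTransform(..., True).
# It returns a 2x3 affine matrix and an inlier mask.
transformation, inliers = cv2.estimateAffine2D(pointsA, pointsB)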

Detect the green lines in this image and calculate their lengths

Sample Images
The image can be noisier at times, when more objects intervene from the background. Right now I am using various techniques based on the RGB colour space to detect the lines, but they fail when the colour changes because of intervening obstacles from the background. I am using opencv and python.
I have read that HSV is better for colour detection, and I have tried it, but haven't been successful yet.
I am not able to find a generic solution to this problem. Any hints or clues in this direction would be of great help.
STILL IN PROGRESS
First of all, an RGB image consists of 3 grayscale images. Since you need the green colour, you will deal with only one channel: the green one. To do so, you can split the image with b, g, r = cv2.split(img). You will get an output like this if you show the green channel:
After that you should threshold the image using your desired method. I prefer Otsu's thresholding in this case. The output after thresholding is:
It's obvious that the thresholded image is extremely noisy, so performing erosion will reduce the noise a little bit. The noise-reduced image will be similar to the following:
I tried using closing instead of dilation, but closing preserves some unwanted noise. So I separately performed erosion followed by dilation. After dilation the output is:
Note: you can do the morphological operations your own way. You can use opening instead of what I did. The results are subjective from one person to another.
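A minimal sketch of the preprocessing steps described above (the file name and kernel size are placeholders, not from the original answer):

import cv2
import numpy as np

img = cv2.imread('green_lines.png')   # placeholder file name
b, g, r = cv2.split(img)              # keep only the green channel

# Otsu's thresholding on the green channel.
_, thresh = cv2.threshold(g, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)

# Erosion to knock out small noise, then dilation to restore line thickness.
kernel = np.ones((3, 3), np.uint8)
cleaned = cv2.dilate(cv2.erode(thresh, kernel), kernel)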
Now you can try one of these two methods:
1. Blob Detection.
2. HoughLine Transform.
TODO
Try out both methods and choose the best.
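As a rough sketch of the second option, the probabilistic Hough transform returns segment endpoints directly, so each length is just a Euclidean distance. The thresholds below are placeholder values to tune, and `cleaned` is the thresholded, denoised image from the sketch above:

import cv2
import numpy as np

segments = cv2.HoughLinesP(cleaned, 1, np.pi/180, threshold=80,
                           minLineLength=50, maxLineGap=10)
if segments is not None:
    for x1, y1, x2, y2 in segments[:, 0]:
        # Length of each detected segment from its two endpoints.
        length = np.hypot(x2 - x1, y2 - y1)
        print(f"({x1},{y1})-({x2},{y2}): length {length:.1f} px")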
You should use the fact that you know you are trying to detect a line, by using the line Hough transform.
http://docs.opencv.org/2.4/doc/tutorials/imgproc/imgtrans/hough_lines/hough_lines.html
When an obstacle also looks like a line, use the fact that you know the approximate orientation of the green lines.
If you don't know the orientation of the lines, use the fact that there are several green lines with the same orientation and only one line that is the obstacle.
Here is code for what I meant:
import cv2
import numpy as np

# Params
minLineCount = 300  # min number of points along a line with a specific orientation
minArea = 100

# Read img
img = cv2.imread('i.png')
greenChannel = img[:,:,1]

# Do noise reduction
iFilter = cv2.bilateralFilter(greenChannel, 5, 5, 5)

# Threshold data
#ret,iThresh = cv2.threshold(iFilter,0,255,cv2.THRESH_BINARY+cv2.THRESH_OTSU)
iThresh = (greenChannel > 4).astype(np.uint8)*255

# Remove small areas
se1 = cv2.getStructuringElement(cv2.MORPH_RECT, (5,5))
iThreshRemove = cv2.morphologyEx(iThresh, cv2.MORPH_OPEN, se1)

# Find edges
iEdge = cv2.Canny(iThreshRemove, 50, 100)

# Hough line transform
lines = cv2.HoughLines(iEdge, 1, 3.14/180, 75)

# Find the theta with the most lines
thetaCounter = dict()
for line in lines:
    theta = line[0, 1]
    if theta in thetaCounter:
        thetaCounter[theta] += 1
    else:
        thetaCounter[theta] = 1

maxThetaCount = 0
maxTheta = 0
for theta in thetaCounter:
    if thetaCounter[theta] > maxThetaCount:
        maxThetaCount = thetaCounter[theta]
        maxTheta = theta

# Find the rhos that correspond to the max theta
rhoValues = []
for line in lines:
    rho = line[0, 0]
    theta = line[0, 1]
    if theta == maxTheta:
        rhoValues.append(rho)

# Go over all the lines with the specific orientation and count the number
# of pixels on each line; if the number is bigger than minLineCount, draw
# the pixels into lineImage
lineImage = np.zeros_like(iThresh, np.uint8)
for rho in range(int(min(rhoValues)), int(max(rhoValues)) + 1):
    a = np.cos(maxTheta)
    b = np.sin(maxTheta)
    x0 = round(a*rho)
    y0 = round(b*rho)
    lineCount = 0
    pixelList = []
    for jump in range(-1000, 1000, 1):
        x1 = int(x0 + jump * (-b))
        y1 = int(y0 + jump * (a))
        if x1 < 0 or y1 < 0 or x1 >= lineImage.shape[1] or y1 >= lineImage.shape[0]:
            continue
        if iThreshRemove[y1, x1] == 255:
            pixelList.append((y1, x1))
            lineCount += 1
    if lineCount > minLineCount:
        for y, x in pixelList:
            lineImage[y, x] = 255

# Remove small areas
## OpenCV 2.4
im2, contours, hierarchy = cv2.findContours(lineImage, cv2.RETR_CCOMP, cv2.CHAIN_APPROX_NONE)
finalImage = np.zeros_like(lineImage)
finalShapes = []
for contour in contours:
    if contour.size > minArea:
        finalShapes.append(contour)
cv2.fillPoly(finalImage, finalShapes, 255)

## OpenCV 3.0
# output = cv2.connectedComponentsWithStats(lineImage, 8, cv2.CV_32S)
#
# finalImage = np.zeros_like(output[1])
# finalImage = output[1]
# stat = output[2]
# for label in range(output[0]):
#     if label == 0:
#         continue
#     cc = stat[label,:]
#     if cc[cv2.CC_STAT_AREA] < minArea:
#         finalImage[finalImage == label] = 0
#     else:
#         finalImage[finalImage == label] = 255

# Show image
#cv2.imwrite('finalImage2.jpg',finalImage)
cv2.imshow('a', finalImage.astype(np.uint8))
cv2.waitKey(0)
and the result for the images:

Copy a part of an image in opencv and python

I'm trying to split an image into several sub-images with opencv, by identifying templates in the original image and then copying the regions where I matched those templates. I'm a TOTAL newbie to opencv! I've identified the sub-images using:
result = cv2.matchTemplate(img, template, cv2.TM_CCORR_NORMED)
After some cleanup I get a list of tuples called points, over which I iterate to show the rectangles. tw and th are the template width and height respectively.
for pt in points:
    re = cv2.rectangle(img, pt, (pt[0] + tw, pt[1] + th), 0, 2)
    print('%s, %s' % (str(pt[0]), str(pt[1])))
    count += 1
What I would like to accomplish is to save the octagons (https://dl.dropbox.com/u/239592/region01.png) into separate files.
How can I do this? I've read something about contours, but I'm not sure how to use them. Ideally I would like to contour the octagon.
Thanks a lot for your help!
If template matching is working for you, stick to it. For instance, I considered the following template:
Then, we can pre-process the input to make it binary and discard small components. After this step, the template matching is performed. Then it is a matter of filtering the matches by discarding close ones (I've used a dummy method for that, so if there are too many matches you could see it taking some time). After we decide which points are far apart (and thus identify different hexagons), we can make minor adjustments to them in the following manner:
Sort by y-coordinate;
If two adjacent items start at a y-coordinate that is too close, then set them both to the same y-coord.
Now you can sort this point list in an appropriate order such that the crops are done in raster order. The cropping part is easily achieved using the slicing provided by numpy.
import sys
import cv2
import numpy

outbasename = 'hexagon_%02d.png'

img = cv2.imread(sys.argv[1])
template = cv2.cvtColor(cv2.imread(sys.argv[2]), cv2.COLOR_BGR2GRAY)
theight, twidth = template.shape[:2]

# Binarize the input based on the saturation and value.
hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)
saturation = hsv[:,:,1]
value = hsv[:,:,2]
value[saturation > 35] = 255
value = cv2.threshold(value, 0, 255, cv2.THRESH_OTSU)[1]
# Pad the image.
value = cv2.copyMakeBorder(255 - value, 3, 3, 3, 3, cv2.BORDER_CONSTANT, value=0)

# Discard small components.
img_clean = numpy.zeros(value.shape, dtype=numpy.uint8)
contours, _ = cv2.findContours(value, cv2.RETR_LIST, cv2.CHAIN_APPROX_SIMPLE)
for i, c in enumerate(contours):
    area = cv2.contourArea(c)
    if area > 500:
        cv2.drawContours(img_clean, contours, i, 255, 2)

def closest_pt(a, pt):
    if not len(a):
        return (float('inf'), float('inf'))
    d = a - pt
    return a[numpy.argmin((d * d).sum(1))]

match = cv2.matchTemplate(img_clean, template, cv2.TM_CCORR_NORMED)

# Filter matches.
threshold = 0.8
dist_threshold = twidth / 1.5
loc = numpy.where(match > threshold)
ptlist = numpy.zeros((len(loc[0]), 2), dtype=int)
count = 0
print("%d matches" % len(loc[0]))
for pt in zip(*loc[::-1]):
    cpt = closest_pt(ptlist[:count], pt)
    dist = ((cpt[0] - pt[0]) ** 2 + (cpt[1] - pt[1]) ** 2) ** 0.5
    if dist > dist_threshold:
        ptlist[count] = pt
        count += 1

# Adjust points (could do the same for the x coords too).
ptlist = ptlist[:count]
view = ptlist.ravel().view([('x', int), ('y', int)])
view.sort(order=['y', 'x'])
for i in range(1, ptlist.shape[0]):
    prev, curr = ptlist[i - 1], ptlist[i]
    if abs(curr[1] - prev[1]) < 5:
        y = min(curr[1], prev[1])
        curr[1], prev[1] = y, y

# Crop in raster order.
view.sort(order=['y', 'x'])
for i, pt in enumerate(ptlist, start=1):
    cv2.imwrite(outbasename % i,
                img[pt[1]-2:pt[1]+theight-2, pt[0]-2:pt[0]+twidth-2])
    print('Wrote %s' % (outbasename % i))
If you want only the contours of the hexagons, then crop on img_clean instead of img (but then it is pointless to sort the hexagons in raster order).
Here is a representation of the different regions that would be cut for your two examples without modifying the code above:
I am sorry, I didn't understand from your question how you relate matchTemplate and contours.
Anyway, below is a small technique using contours. It is based on the assumption that your other images are like the one you provided. I am not sure whether it works with your other images, but I think it will help you get started. Try it yourself and make the necessary adjustments and modifications.
What I did:
1 - I needed the edges of the octagons, so I thresholded the image using Otsu and applied dilation and erosion (or use any method you like that works well for all your images; beware of the edges at the left edge of the image).
2 - Then found contours (more about contours: http://goo.gl/r0ID0).
3 - For each contour, find its convex hull, then find its area (A) and perimeter (P).
4 - For a perfect octagon, P*P/A = 13.25 approximately (see the short derivation after this list). I used that here: contours which fit are cropped and saved.
5 - You can see that cropping also removes some edges of the octagons. If you want them, adjust the cropping dimensions.
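For the curious, the 13.25 constant can be derived. A regular octagon with side length $s$ has perimeter $P = 8s$ and area $A = 2(1+\sqrt{2})s^2$, so

$$\frac{P^2}{A} = \frac{(8s)^2}{2(1+\sqrt{2})s^2} = \frac{32}{1+\sqrt{2}} = 32(\sqrt{2}-1) \approx 13.25,$$

which is why the code below accepts contours with P**2/area between 13 and 14.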
Code:
import cv2
import numpy as np

img = cv2.imread('region01.png')
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
ret, thresh = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)
thresh = cv2.dilate(thresh, None, iterations=2)
thresh = cv2.erode(thresh, None)
contours, hierarchy = cv2.findContours(thresh, cv2.RETR_LIST, cv2.CHAIN_APPROX_SIMPLE)

number = 0
for cnt in contours:
    hull = cv2.convexHull(cnt)
    area = cv2.contourArea(hull)
    P = cv2.arcLength(hull, True)

    if (area != 0) and (13 <= P**2/area <= 14):
        #cv2.drawContours(img,[hull],0,255,3)
        x, y, w, h = cv2.boundingRect(hull)
        number = number + 1
        roi = img[y:y+h, x:x+w]
        cv2.imshow(str(number), roi)
        cv2.imwrite("1"+str(number)+".jpg", roi)

cv2.imshow('img', img)
cv2.waitKey(0)
cv2.destroyAllWindows()
Those 6 octagons will be stored as separate files.
Hope it helps!
