I'm trying to take the output of a yolov5s.onnx model and run NMSBoxes on it, but I keep getting this error:
Traceback (most recent call last):
File "python_detection.py", line 132, in <module>
class_ids, confidences, boxes = wrap_detection(inputImage, outs[0])
File "python_detection.py", line 88, in wrap_detection
indexes = cv2.dnn.NMSBoxes(boxes, confidences, 0.25, 0.45)
TypeError: Can't convert vector element for 'scores', index=0
Everywhere I look, people are using the exact same code as me, which makes sense, since this code was mostly copied from a tutorial. So I don't know what I'm doing wrong that keeps producing this error.
Here's the full function:
def wrap_detection(input_image, output_data):
    class_ids = []
    confidences = []
    boxes = []

    rows = output_data.shape[0]

    image_width, image_height, _ = input_image.shape

    x_factor = image_width / INPUT_WIDTH
    y_factor = image_height / INPUT_HEIGHT

    for r in range(rows):
        row = output_data[r]
        confidence = row[4]
        if confidence >= 0.4:
            classes_scores = row[5:]
            _, _, _, max_indx = cv2.minMaxLoc(classes_scores)
            class_id = max_indx[1]
            if (classes_scores[class_id] > .25):
                confidences.append(confidence)
                class_ids.append(class_id)
                x, y, w, h = row[0].item(), row[1].item(), row[2].item(), row[3].item()
                left = int((x - 0.5 * w) * x_factor)
                top = int((y - 0.5 * h) * y_factor)
                width = int(w * x_factor)
                height = int(h * y_factor)
                box = np.array([left, top, width, height])
                boxes.append(box)

    '''
    Print the raw output
    '''
    # Save output
    np.set_printoptions(threshold=sys.maxsize)
    file = open("python_raw_model_output.txt", "w+")
    for i in range(len(boxes)):
        file.write(str(boxes[i]) + " " + str(confidences[i]) + " " + str(class_ids[i]))
        file.write("\n")
    file.close()

    # NMS on the lists
    indexes = cv2.dnn.NMSBoxes(boxes, confidences, 0.25, 0.45)

    result_class_ids = []
    result_confidences = []
    result_boxes = []

    for i in indexes:
        result_confidences.append(confidences[i])
        result_class_ids.append(class_ids[i])
        result_boxes.append(boxes[i])

    return result_class_ids, result_confidences, result_boxes
I had the same issue. It seemed to be related to the CUDA configuration, as it works fine on the CPU. I never figured out exactly what was wrong, but I worked around the issue by using fastNMS instead.
I don't know if you still need help, but for anyone who comes across this problem like I did: the fix is to make sure that the Python version you're using is >= 3.8 and the OpenCV version is at least 4.5.4.
Using pip install opencv-python==4.5.5.64 fixed my problem.
For others who come across this issue, I was able to get around the same error (on Python 3.8.10 and OpenCV 4.5.3) by making confidences a NumPy array (instead of a list). The answers pointing at CUDA or Python/OpenCV versions are probably still right, but this may be a simpler solution for some situations.
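For illustration, a minimal sketch of that change just before the NMS call in wrap_detection (the explicit float32 dtype is my own precaution, not something the error message demands):

# Convert the list of (possibly NumPy-scalar) scores into a float32
# array so OpenCV can convert it to a vector of floats internally.
confidences = np.array(confidences, dtype=np.float32)
indexes = cv2.dnn.NMSBoxes(boxes, confidences, 0.25, 0.45)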
TLDR:
Need help trying to calculate overlap region between 2 graphs.
So I'm trying to stitch these 2 images:
Since I know that the images I will be stitching definitely come from the same image, I feel that I should be able to code this up myself. Using libraries like OpenCV feels a little like overkill for me for this task.
My current idea is that I can simplify this task by doing the following steps for each image:
Load image using PIL
Convert image to black and white (PIL image mode “L”)
[Optional: crop images to overlapping region by inspection by eye]
Create vector row_sum, which is a sum of each row
[Optional: log row_sum, to reduce the size of values we're working with]
Plot row_sum.
This reduces the (potentially) (3*2)-dimensional problem (3 RGB channels for each pixel of the 2D image) to a (1*2)-dimensional problem (a single black-and-white value per pixel); summing across the rows then reduces it to a 1D problem.
I used the following code to implement the above:
import matplotlib.pyplot as plt
import numpy as np
from PIL import Image


class Stitcher():
    def combine_2(self, img1, img2):
        # thr1, thr2 = self.get_cropped_bw(img1, 115, img2, 80)
        thr1, thr2 = self.get_cropped_bw(img1, 0, img2, 0)

        row_sum1 = np.log(thr1.sum(1))
        row_sum2 = np.log(thr2.sum(1))

        self.plot_4x4(thr1, thr2, row_sum1, row_sum2)

    def get_cropped_bw(self, img1, img1_keep_from, img2, img2_keep_till):
        im1 = Image.open(img1).convert("L")
        im2 = Image.open(img2).convert("L")

        data1 = (np.array(im1)[img1_keep_from:]
                 if img1_keep_from != 0 else np.array(im1))
        data2 = (np.array(im2)[:img2_keep_till]
                 if img2_keep_till != 0 else np.array(im2))

        return data1, data2

    def plot_4x4(self, thr1, thr2, row_sum1, row_sum2):
        fig, ax = plt.subplots(2, 2, sharey="row", constrained_layout=True)

        ax[0, 0].imshow(thr1, cmap="Greys")
        ax[0, 1].imshow(thr2, cmap="Greys")

        ax[1, 0].plot(row_sum1, "k.")
        ax[1, 1].plot(row_sum2, "r.")

        ax[1, 0].set(
            xlabel="Index Value",
            ylabel="Row Sum",
        )

        plt.show()


imgs = (r"combine\imgs\test_image_part_1.jpg",
        r"combine\imgs\test_image_part_2.jpg")

s = Stitcher()
s.combine_2(*imgs)
This gave me this graph:
(I've added in those yellow boxes, to indicate the overlap regions.)
This is the bit I'm stuck at. I want to find exactly:
the index value of the left-side of the yellow box for the 1st image and
the index value of the right-side of the yellow box for the 2nd image.
I define the overlap region as the longest range for which the end of the 1st graph 'matches' the start of the 2nd graph. For the method to find the overlap region, what should I do if the row sum values aren't exactly the same (for instance, if one is the other scaled by some factor)?
I feel like this is a problem where dot products could measure the similarity between the 2 graphs, but I can't think of how to implement this.
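One way to make that dot-product idea concrete (a minimal, untested sketch; best_overlap is a hypothetical helper, not part of the code above): mean-center each candidate window and score it with a normalized dot product. Since the row sums are logged, a constant brightness scale factor becomes a constant offset, which the mean-centering removes:

# Score every candidate overlap length n by comparing the last n row
# sums of image 1 against the first n row sums of image 2 with a
# normalized (mean-centered) dot product, and keep the best.
def best_overlap(row_sum1, row_sum2, min_len=10):
    best_n, best_score = None, -1.0
    for n in range(min_len, min(len(row_sum1), len(row_sum2)) + 1):
        a = row_sum1[-n:] - row_sum1[-n:].mean()
        b = row_sum2[:n] - row_sum2[:n].mean()
        denom = np.linalg.norm(a) * np.linalg.norm(b)
        if denom > 0:
            score = np.dot(a, b) / denom
            if score > best_score:
                best_n, best_score = n, score
    return best_n, best_score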
I had a lot more fun with this than I expected. I wrote this using opencv, but that's just to load and show the image. Everything else is done with numpy so swapping this to PIL shouldn't be too difficult.
I'm using a brute-force matcher. I also wrote a random-start hillclimber that runs in much less time, but I can't guarantee it'll find the correct answer since the gradient space isn't smooth. I won't include it in my code since it's long and janky, but if you really need the time efficiency I can add it back in later.
I added a random crop and some salt and pepper noise to the images to test for robustness.
The brute-force matcher operates on the idea that we don't know which section of the two images overlap, so we need to convolve the smaller image over the larger image from left to right, top to bottom. This means our search space is:
horizontal = small_width + big_width
vertical = small_height + big_height
area = horizontal * vertical
This will grow very quickly with image size. I motivate the algorithm by giving it points for having a larger overlap, but it loses more points for having differences in color for the overlapped area.
Here are some pictures from an execution of this program
import cv2
import numpy as np
import random

# randomly snips edges
def randCrop(image, maxMargin):
    # use at least 1 so the negative slices below never become -0
    c = [random.randint(1, maxMargin) for a in range(4)];
    return image[c[0]:-c[1], c[2]:-c[3]];

# adds noise to image
def saltPepper(image, minNoise, maxNoise):
    h,w = image.shape;
    randNum = random.randint(minNoise, maxNoise);
    for a in range(randNum):
        x = random.randint(0, w-1);
        y = random.randint(0, h-1);
        image[y,x] = random.randint(0, 255);
    return image;

# evaluate layout
def getScore(one, two):
    # do raw subtraction
    # (uint8 wraparound makes min(a-b, b-a) equal |a-b|)
    left = one - two;
    right = two - one;
    sub = np.minimum(left, right);
    return np.count_nonzero(sub);

# return 2d random position within range
def randPos(img, big_shape):
    th,tw = big_shape;
    h,w = img.shape;
    x = random.randint(0, tw - w);
    y = random.randint(0, th - h);
    return [x,y];

# overlays small image onto big image
def overlay(small, big, pos):
    # unpack
    h,w = small.shape;
    x,y = pos;

    # copy and place
    copy = big.copy();
    copy[y:y+h, x:x+w] = small;
    return copy;

# calculates overlap region
def overlap(one, two, pos_one, pos_two):
    # unpack
    h1,w1 = one.shape;
    h2,w2 = two.shape;
    x1,y1 = pos_one;
    x2,y2 = pos_two;

    # set edges
    l1 = x1;
    l2 = x2;
    r1 = x1 + w1;
    r2 = x2 + w2;
    t1 = y1;
    t2 = y2;
    b1 = y1 + h1;
    b2 = y2 + h2;

    # go
    left = max(l1, l2);
    right = min(r1, r2);
    top = max(t1, t2);
    bottom = min(b1, b2);
    return [left, right, top, bottom];

# wrapper for overlay + getScore
def fullScore(one, two, pos_one, pos_two, big_empty):
    # check positions
    x,y = pos_two;
    h,w = two.shape;
    th,tw = big_empty.shape;
    if y+h > th or x+w > tw or x < 0 or y < 0:
        return -99999999;

    # overlay
    temp_one = overlay(one, big_empty, pos_one);
    temp_two = overlay(two, big_empty, pos_two);

    # get overlap
    l,r,t,b = overlap(one, two, pos_one, pos_two);
    temp_one = temp_one[t:b, l:r];
    temp_two = temp_two[t:b, l:r];

    # score
    diff = getScore(temp_one, temp_two);
    score = (r-l) * (b-t);
    score -= diff*2;
    return score;

# do brute force
def bruteForce(one, two):
    # calculate search space
    # unpack size
    h,w = one.shape;
    one_size = h*w;
    h,w = two.shape;
    two_size = h*w;

    # small and big
    if one_size < two_size:
        small = one;
        big = two;
    else:
        small = two;
        big = one;

    # unpack size
    sh, sw = small.shape;
    bh, bw = big.shape;
    total_width = bw + sw * 2;
    total_height = bh + sh * 2;

    # set up empty images
    empty = np.zeros((total_height, total_width), np.uint8);

    # set global best
    best_score = -999999;
    best_pos = None;

    # start scrolling
    ybound = total_height - sh;
    xbound = total_width - sw;
    for y in range(ybound):
        print("y: " + str(y) + " || " + str(empty.shape));
        for x in range(xbound):
            # get score
            score = fullScore(big, small, [sw,sh], [x,y], empty);

            # show
            # prog = overlay(big, empty, [sw,sh]);
            # prog = overlay(small, prog, [x,y]);
            # cv2.imshow("prog", prog);
            # cv2.waitKey(1);

            # compare
            if score > best_score:
                best_score = score;
                best_pos = [x,y];
                print("best_score: " + str(best_score));
    return best_pos, [sw,sh], small, big, empty;

# do a step of hill climber
def hillStep(one, two, best_pos, big_empty, step):
    # make a step
    new_pos = best_pos[1][:];
    new_pos[0] += step[0];
    new_pos[1] += step[1];

    # get score
    return fullScore(one, two, best_pos[0], new_pos, big_empty), new_pos;

# hunt around for good position
# let's do a random-start hillclimber
def randHill(one, two, shape):
    # set up empty images
    big_empty = np.zeros(shape, np.uint8);

    # set global best
    g_best_score = -999999;
    g_best_pos = None;

    # lets do 200 iterations
    iters = 200;
    for a in range(iters):
        # progress check
        print(str(a) + " of " + str(iters));

        # start with random position
        h,w = two.shape[:2];
        pos_one = [w,h];
        pos_two = randPos(two, shape);

        # get score
        best_score = fullScore(one, two, pos_one, pos_two, big_empty);
        best_pos = [pos_one, pos_two];

        # hill climb (only on second image)
        while True:
            # end condition: no step improves score
            end_flag = True;

            # 8-way
            for y in range(-1, 1+1):
                for x in range(-1, 1+1):
                    if x != 0 or y != 0:
                        # get score and update
                        score, new_pos = hillStep(one, two, best_pos, big_empty, [x,y]);
                        if score > best_score:
                            best_score = score;
                            best_pos[1] = new_pos[:];
                            end_flag = False;

            # end
            if end_flag:
                break;
            else:
                # show
                # prog = overlay(one, big_empty, best_pos[0]);
                # prog = overlay(two, prog, best_pos[1]);
                # cv2.imshow("prog", prog);
                # cv2.waitKey(1);
                pass;

        # check for new global best
        if best_score > g_best_score:
            g_best_score = best_score;
            g_best_pos = best_pos[:];
            print("top score: " + str(g_best_score));
    return g_best_score, g_best_pos;

# load both images
top = cv2.imread("top.jpg");
bottom = cv2.imread("bottom.jpg");
top = cv2.cvtColor(top, cv2.COLOR_BGR2GRAY);
bottom = cv2.cvtColor(bottom, cv2.COLOR_BGR2GRAY);

# randomly crop
top = randCrop(top, 20);
bottom = randCrop(bottom, 20);

# randomly add noise
saltPepper(top, 200, 1000);
saltPepper(bottom, 200, 1000);

# set up max image (assume no overlap whatsoever)
tw = 0;
th = 0;
h, w = top.shape;
tw += w;
th += h;
h, w = bottom.shape;
tw += w*2;
th += h*2;

# do random-start hill climb
_, best_pos = randHill(top, bottom, (th, tw));

# show
empty = np.zeros((th, tw), np.uint8);
pos1, pos2 = best_pos;
image = overlay(top, empty, pos1);
image = overlay(bottom, image, pos2);

# do brute force
# small_pos, big_pos, small, big, empty = bruteForce(top, bottom);
# image = overlay(big, empty, big_pos);
# image = overlay(small, image, small_pos);

# recolor overlap
h,w = empty.shape;
color = np.zeros((h,w,3), np.uint8);
l,r,t,b = overlap(top, bottom, pos1, pos2);
color[:,:,0] = image;
color[:,:,1] = image;
color[:,:,2] = image;
color[t:b, l:r, 0] += 100;

# show images
cv2.imshow("top", top);
cv2.imshow("bottom", bottom);
cv2.imshow("overlayed", image);
cv2.imshow("Color", color);
cv2.waitKey(0);
Edit: I added in the random-start hillclimber
I am a newbie here. I am trying to extract a single line along the edge of a 2D flame so that I can calculate the actual area, the 3D flame area. The first step is getting the edge. The 2D flame is a sort of side-viewed concave flame, so the flame base (the flat part) is brighter than the concave segment. I use the code below to find the edge; my method is finding the maximum pixel value along the y-axis. The result doesn't achieve what I want; could you please help me figure out why? Thanks very much in advance.
[Original image] In the code I rotate the image.
from PIL import Image
import numpy as np
import cv2


def initialization_rotate(path):
    global h, w, img
    img4 = np.array(Image.open(path).convert('L'))
    img3 = img4.transpose(1, 0)
    img2 = img3[::-1, ::1]
    img = img2[400:1000, 1:248]
    h, w = img.shape


path = 'D:\\20190520\\14\\14\\1767.jpg'
# build img, h and w before the edge scan below
initialization_rotate(path)


# Noise cancellation
def opening(binary):
    opened = np.zeros_like(binary)
    for j in range(1, w-1):
        for i in range(1, h-1):
            if binary[i][j] > 100:
                n1 = binary[i-1][j-1]
                n2 = binary[i-1][j]
                n3 = binary[i-1][j+1]
                n4 = binary[i][j-1]
                n5 = binary[i][j+1]
                n6 = binary[i+1][j-1]
                n7 = binary[i+1][j]
                n8 = binary[i+1][j+1]
                sum8 = int(n1) + int(n2) + int(n3) + int(n4) + int(n5) + int(n6) + int(n7) + int(n8)
                if sum8 < 1000:
                    opened[i][j] = 0
                else:
                    opened[i][j] = 255
            else:
                pass
    return opened


edge = np.zeros_like(img)

# Find the max pixel value and extract the position
for j in range(w-1):
    ys = [0]
    ymax = []
    for i in range(h-1):
        if img[i][j] > 100:
            ys.append(i)
        else:
            pass
    ymax = np.amax(ys)
    edge[ymax][j] = 255

cv2.namedWindow('edge')

while(True):
    cv2.imshow('edge', edge)
    k = cv2.waitKey(1) & 0xFF
    if k == 27:
        break

cv2.destroyAllWindows()
I have done some very quick coding, from the ground up (without looking into established or state-of-the-art edge-detection algorithms). Not very surprisingly, the results are very poor. The code I have pasted below will only work for RGB (i.e. only for three channels, not for images that are CMYK, grey-scale, RGBA or anything else). Also, I have only tested it on a single, very simplistic image; real-life images are more complicated, and I don't think it will fare very well there yet. It needs a lot of work. However I am, hesitatingly, sharing it since it was requested by @Gia Tri.
Here is what I did. For every column I calculated the mean and stddev of the intensities. I hoped that at the edge there would be a change in intensity away from the average +- the stddev (multiplied by a factor). If I mark the first and last such pixel in each column, I will have an edge point for every column and hopefully, once stitched together, they will form an edge. The code and the attached image show how I fared.
from scipy import ndimage
import numpy as np
import matplotlib.pyplot as plt

UppperStdBoundaryMultiplier = 1.0
LowerStdBoundaryMultiplier = 1.0
NegativeSelection = False


def SumSquareRGBintensityOfPixel(Pixel):
    return np.sum(np.power(Pixel, 2), axis=0)


def GetTheContinousStretchForAcolumn(Column):
    global UppperStdBoundaryMultiplier
    global LowerStdBoundaryMultiplier
    global NegativeSelection
    SumSquaresIntensityOfColumn = np.apply_along_axis(SumSquareRGBintensityOfPixel, 1, Column)
    Mean = np.mean(SumSquaresIntensityOfColumn)
    StdDev = np.std(SumSquaresIntensityOfColumn)
    LowerThreshold = Mean - LowerStdBoundaryMultiplier*StdDev
    UpperThreshold = Mean + UppperStdBoundaryMultiplier*StdDev
    if NegativeSelection:
        Index = np.where(SumSquaresIntensityOfColumn < LowerThreshold)
        Column[Index,:] = np.array([255,255,255])
    else:
        Index = np.where(SumSquaresIntensityOfColumn >= LowerThreshold)
        LeastIndex = Index[Index==True][0]
        LastIndex = Index[Index==True][-1]
        Column[[LeastIndex,LastIndex],:] = np.array([255,0,0])
    return Column


def DoEdgeDetection(ImageFilePath):
    FileHandle = ndimage.imread(ImageFilePath)
    for Column in range(FileHandle.shape[1]):
        FileHandle[:,Column,:] = GetTheContinousStretchForAcolumn(FileHandle[:,Column,:])
    plt.imshow(FileHandle)
    plt.show()


DoEdgeDetection("/PathToImage/Image_1.jpg")
And below is the result. On the left is the query image whose edge had to be detected, and on the right is the edge-detected image, with edge points marked as red dots. As you can see it fared poorly, but with some investment of time and thinking it might do far better ... or maybe not. Maybe it is a good start but far from a finish. You, please, be the judge!
***** Edit after clarification on requirement from GiaTri ***************
So I did manage to change the program; the idea remained the same. However, this time the problem is overly simplified to the case where you want to detect only a blue flame. I actually went ahead and made it functional for all three color channels, though I doubt it will be useful to you beyond the blue channel.
**How to use the program below**
If your flame is vertical, then choose edges = "horizontal" in the class assignment. If your edges are horizontal, then choose edges = "vertical". This might be a little confusing, but for the time being please use it this way. Later either you can change it or I can change it.
So first let me convince you that the edge detection is working much better than yesterday. See the two images below; I have taken these two flame images from the internet. As before, the image whose edge has to be detected is on the left, and on the right is the edge-detected image, with the edges marked as red dots.
First, a horizontal flame,
and then a vertical flame.
There is still a lot of work left in this. However if you are a little more convinced than yesterday, then below is the code.
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.image import imread


class DetectEdges():
    def __init__(self, ImagePath, Channel = ["blue"], edges="vertical"):
        self.Channel = Channel
        self.edges = edges
        self.Image_ = imread(ImagePath)
        self.Image = np.copy(self.Image_)
        self.Dimensions_X, self.Dimensions_Y, self.Channels = self.Image.shape
        self.BackGroundSamplingPercentage = 0.5

    def ShowTheImage(self):
        plt.imshow(self.Image)
        plt.show()

    def GetTheBackGroundPixels(self):
        NumberOfPoints = int(self.BackGroundSamplingPercentage*min(self.Dimensions_X, self.Dimensions_Y))
        Random_X = np.random.choice(self.Dimensions_X, size=NumberOfPoints, replace=False)
        Random_Y = np.random.choice(self.Dimensions_Y, size=NumberOfPoints, replace=False)
        Random_Pixels = np.array(list(zip(Random_X,Random_Y)))
        return Random_Pixels

    def GetTheChannelEdge(self):
        BackGroundPixels = self.GetTheBackGroundPixels()
        if self.edges == "vertical":
            if self.Channel == ["blue"]:
                MeanBackGroundInensity = np.mean(self.Image[BackGroundPixels[:,0],BackGroundPixels[:,1],2])
                for column in range(self.Dimensions_Y):
                    PixelsAboveBackGround = np.where(self.Image[:,column,2]>MeanBackGroundInensity)
                    if PixelsAboveBackGround[PixelsAboveBackGround==True].shape[0] > 0:
                        TopPixel = PixelsAboveBackGround[PixelsAboveBackGround==True][0]
                        BottomPixel = PixelsAboveBackGround[PixelsAboveBackGround==True][-1]
                        self.Image[[TopPixel,BottomPixel],column,:] = [255,0,0]
            if self.Channel == ["red"]:
                MeanBackGroundInensity = np.mean(self.Image[BackGroundPixels[:,0],BackGroundPixels[:,1],0])
                for column in range(self.Dimensions_Y):
                    PixelsAboveBackGround = np.where(self.Image[:,column,0]>MeanBackGroundInensity)
                    if PixelsAboveBackGround[PixelsAboveBackGround==True].shape[0] > 0:
                        TopPixel = PixelsAboveBackGround[PixelsAboveBackGround==True][0]
                        BottomPixel = PixelsAboveBackGround[PixelsAboveBackGround==True][-1]
                        self.Image[[TopPixel,BottomPixel],column,:] = [0,255,0]
            if self.Channel == ["green"]:
                MeanBackGroundInensity = np.mean(self.Image[BackGroundPixels[:,0],BackGroundPixels[:,1],1])
                for column in range(self.Dimensions_Y):
                    PixelsAboveBackGround = np.where(self.Image[:,column,1]>MeanBackGroundInensity)
                    if PixelsAboveBackGround[PixelsAboveBackGround==True].shape[0] > 0:
                        TopPixel = PixelsAboveBackGround[PixelsAboveBackGround==True][0]
                        BottomPixel = PixelsAboveBackGround[PixelsAboveBackGround==True][-1]
                        self.Image[[TopPixel,BottomPixel],column,:] = [255,0,0]
        elif self.edges=="horizontal":
            if self.Channel == ["blue"]:
                MeanBackGroundInensity = np.mean(self.Image[BackGroundPixels[:,0],BackGroundPixels[:,1],2])
                for row in range(self.Dimensions_X):
                    PixelsAboveBackGround = np.where(self.Image[row,:,2]>MeanBackGroundInensity)
                    if PixelsAboveBackGround[PixelsAboveBackGround==True].shape[0] > 0:
                        LeftPixel = PixelsAboveBackGround[PixelsAboveBackGround==True][0]
                        RightPixel = PixelsAboveBackGround[PixelsAboveBackGround==True][-1]
                        self.Image[row,[LeftPixel,RightPixel],:] = [255,0,0]
            if self.Channel == ["red"]:
                MeanBackGroundInensity = np.mean(self.Image[BackGroundPixels[:,0],BackGroundPixels[:,1],0])
                for row in range(self.Dimensions_X):
                    PixelsAboveBackGround = np.where(self.Image[row,:,0]>MeanBackGroundInensity)
                    if PixelsAboveBackGround[PixelsAboveBackGround==True].shape[0] > 0:
                        LeftPixel = PixelsAboveBackGround[PixelsAboveBackGround==True][0]
                        RightPixel = PixelsAboveBackGround[PixelsAboveBackGround==True][-1]
                        self.Image[row,[LeftPixel,RightPixel],:] = [0,255,0]
            if self.Channel == ["green"]:
                MeanBackGroundInensity = np.mean(self.Image[BackGroundPixels[:,0],BackGroundPixels[:,1],1])
                for row in range(self.Dimensions_X):
                    PixelsAboveBackGround = np.where(self.Image[row,:,1]>MeanBackGroundInensity)
                    if PixelsAboveBackGround[PixelsAboveBackGround==True].shape[0] > 0:
                        LeftPixel = PixelsAboveBackGround[PixelsAboveBackGround==True][0]
                        RightPixel = PixelsAboveBackGround[PixelsAboveBackGround==True][-1]
                        self.Image[row,[LeftPixel,RightPixel],:] = [255,0,0]


Test = DetectEdges("FlameImagePath", Channel = ["blue"], edges="vertical")
Test.GetTheChannelEdge()
Test.ShowTheImage()
Please let me know if this was of any "more" help, or if I missed some salient requirements.
Best wishes,
By the way, Amit, I would like to show my code using the idea of thresholding the pixel value. I would love to discuss it with you.
import cv2
import matplotlib.pyplot as plt

if __name__ == '__main__':
    path = 'D:\\20181229__\\7\\Area 7\\1767.jpg'
    img1 = cv2.imread(path)
    b, g, r = cv2.split(img1)
    img3 = b[94:223, 600:700]
    img4 = cv2.flip(img3, 1)
    h, w = img3.shape

    data = []
    th_val = 20
    for i in range(h):
        for j in range(w):
            val = img3[i, -j]
            if (val >= th_val):
                data.append(j)
                break

    x = range(len(data))

    plt.figure(figsize = (10, 7))
    plt.subplot(121)
    plt.imshow(img4)
    plt.plot(data, x)
    plt.subplot(122)  # second panel (was a duplicated 121)
    plt.plot(data, x)
    plt.show()
Please see the link for the result. The thing is, the method still doesn't totally fit what I want. I hope to discuss it with you.
Link: https://imgur.com/QtNk7c7
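As a side note, the double loop above can be vectorized with NumPy. A rough sketch of my own (untested against the image; it assumes img3 and th_val as defined above, scans each row right-to-left like img3[i, -j], and ignores the j = 0 quirk where -0 indexes column 0):

# For each row, find the first column (scanning right-to-left) whose
# value reaches th_val; rows that never reach it are dropped.
mask = img3[:, ::-1] >= th_val      # flip columns so index 0 is the right edge
has_edge = mask.any(axis=1)         # rows that cross the threshold at all
first_hit = mask.argmax(axis=1)     # index of the first True per row
data = first_hit[has_edge]          # roughly the values the loop collects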
Question:
How can I programmatically return a raster that is the difference of two (differently sized) red bands?
i.e.
gdal_calc.py -A 'WARPED.tif' -B 'DSC_1636.tif' --outfile = 'dif.tif' --calc = "A-B"
QGIS raster calculator performs this function just fine. However, the previous code returns the following error.
Exception: Error! Dimensions of file DSC_1636.tif (7380, 4928) are different from other files (7743, 5507). Cannot proceed
I am currently under the impression I should read in the rasters using a defined extent, created by finding the overlap as shown below, but I am still not able to make this work.
# Subtract two rasters of different dimensions
# Pixel coordinates define overlap
import os, sys
from PIL import Image
from osgeo import gdal, ogr, osr
gdal.UseExceptions()
# Use PIL to get information from images
im1 = Image.open('DSC_0934-warped.tif')
print('warped image size is %s ' % str(im1.size))
im2 = Image.open('DSC_1636.png')
print('initial image (image 2) size is %s' % str(im2.size))
warped image size is (7743, 5507)
initial image (image 2) size is (7380, 4928)
# Use GDAL to get information about images
def get_extent(fn):
    '''Returns min_x, max_y, max_x, min_y'''
    ds = gdal.Open(fn)
    gt = ds.GetGeoTransform()
    return (gt[0], gt[3], gt[0] + gt[1] * ds.RasterXSize,
            gt[3] + gt[5] * ds.RasterYSize)
print('extent of warped.tif is %s' % str(get_extent('DSC_0934-warped.tif')))
print('extent of 1636.png is %s' % str(get_extent('DSC_1636.png')))
extent of warped.tif is (-375.3831214210602, 692.5167764068751, 7991.3588371542955, -5258.102875649754)
extent of 1636.png is (0.0, 0.0, 7380.0, 4928.0)
r1 = get_extent('DSC_0934-warped.tif')
r2 = get_extent('DSC_1636.png')
# Get left, top, right, bottom of dataset's bounds in pixel coordinates
intersection = [max(r1[0], r2[0]),
                min(r1[1], r2[1]),
                min(r1[2], r2[2]),
                max(r1[3], r2[3])]

print('checking for overlap')
if (intersection[2] < intersection[0]) or (intersection[1] > intersection[3]):
    intersection = None
    print('no overlap')
else:
    print('intersection overlaps at: %s' % intersection)
checking for overlap
intersection overlaps at: [0.0, 0.0, 7380.0, 4928.0]
The most straightforward answer is to read in the images as arrays of defined dimensions.
Without reposting the code above that checks where the overlap is, the solution can be had with the following additions. (Thank you @Val.)
from osgeo import gdal
from osgeo.gdalconst import GDT_Byte
from osgeo.gdal_array import BandWriteArray, CopyDatasetInfo

# Get the data
ds1_src = gdal.Open("DSC_1636.png")
ds2_src = gdal.Open("DSC_0934-warped.tif")
ds1_bnd = ds1_src.GetRasterBand(1).ReadAsArray(xoff=0, yoff=0, win_xsize=7380, win_ysize=4928)
ds2_bnd = ds2_src.GetRasterBand(1).ReadAsArray(xoff=0, yoff=0, win_xsize=7380, win_ysize=4928)

# Do the maths...
data_out = ds2_bnd - ds1_bnd

# Write the out file
driver = gdal.GetDriverByName("GTiff")
dsOut = driver.Create("out.tiff", 7380, 4928, 1, GDT_Byte)
CopyDatasetInfo(ds1_src, dsOut)
bandOut = dsOut.GetRasterBand(1)
BandWriteArray(bandOut, data_out)

# Close the datasets
ds1_src = None
ds2_src = None
ds1_bnd = None
ds2_bnd = None
bandOut = None
dsOut = None
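As a further note, the hard-coded offsets above only work because the overlap happens to start at pixel (0, 0) of both rasters. A sketch of how the intersection computed earlier could be turned into a per-dataset read window via its geotransform (extent_to_window is my own hypothetical helper, assuming the same (min_x, max_y, max_x, min_y) extent convention as get_extent; run it before the datasets are set to None):

# Convert an intersection extent into an (xoff, yoff, xsize, ysize)
# window for ReadAsArray, using this dataset's geotransform.
def extent_to_window(ds, intersection):
    gt = ds.GetGeoTransform()
    xoff = int(round((intersection[0] - gt[0]) / gt[1]))
    yoff = int(round((intersection[1] - gt[3]) / gt[5]))
    xsize = int(round((intersection[2] - intersection[0]) / gt[1]))
    ysize = int(round((intersection[3] - intersection[1]) / gt[5]))
    return xoff, yoff, xsize, ysize

# e.g. read only the overlapping window from one source
xoff, yoff, xs, ys = extent_to_window(ds2_src, intersection)
ds2_bnd = ds2_src.GetRasterBand(1).ReadAsArray(xoff, yoff, xs, ys)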
I have read an image and converted it to an HSV image.
I want to apply threshold limits for the hue, saturation, and value components separately: hue threshold 0 to 1, saturation threshold 0.28 to 1, and value threshold 0 to 0.55.
I want to use this for color masking!
How do I apply these limits to my image?
import cv2
import matplotlib.pyplot as plt

image_read = cv2.imread('tryimage.jpg')
# note: cv2.imread returns BGR, so convert from BGR (not RGB) to HSV
im = cv2.cvtColor(image_read, cv2.COLOR_BGR2HSV)
im_hue = im[:,:,0]
im_sat = im[:,:,1]
im_val = im[:,:,2]

# how to apply threshold?

fig, ax = plt.subplots(nrows=1, ncols=3)
ax[0].imshow(im_hue)
ax[1].imshow(im_sat)
ax[2].imshow(im_val)
plt.show()
I have done the same in Matlab: I take only the pixels of interest in each band and then merge them back to get the pixels of my interest.
Here is my Matlab code snippet, which I want to reproduce in Python.
color.hueThresholdLow = 0;
color.hueThresholdHigh = 1;
color.saturationThresholdLow = 0;
color.saturationThresholdHigh = 0.28;
color.valueThresholdLow = 0.38;
color.valueThresholdHigh = 0.97;
maskedRGBImage = color_masking(rgbImage, color);

function color_masking(rgbImage, color)
    hsvimage = rgb2hsv(rgbImage);
    hImage = hsvimage(:,:,1);
    sImage = hsvimage(:,:,2);
    vImage = hsvimage(:,:,3);
    hMask = (hImage >= color.hueThresholdLow) & (hImage <= color.hueThresholdHigh);
    sMask = (sImage >= color.saturationThresholdLow) & (sImage <= color.saturationThresholdHigh);
    vMask = (vImage >= color.valueThresholdLow) & (vImage <= color.valueThresholdHigh);
    ObjectsMask = uint8(hMask & sMask & vMask);
    .....
In Python you can write it very similarly to Matlab. It is usually a good idea to create a function for methods you might use more than once, but feel free to remove the function declaration if it doesn't suit your needs.
def threshold_hsv(im_hsv, hlow, hhigh, slow, shigh, vlow, vhigh):
    im_hue = im_hsv[:,:,0]
    im_sat = im_hsv[:,:,1]
    im_val = im_hsv[:,:,2]
    h_mask = (im_hue >= hlow) & (im_hue <= hhigh)
    s_mask = (im_sat >= slow) & (im_sat <= shigh)
    v_mask = (im_val >= vlow) & (im_val <= vhigh)
    return h_mask & s_mask & v_mask
And then you can call the function with your data as:
>>> object_mask = threshold_hsv(hsvimage, 0, 1, 0, 0.28, 0.38, 0.97)
As you can see, the syntax is pretty similar (if not identical) to that of Matlab. This holds as long as your hsvimage is a NumPy array, which is what OpenCV produces in Python.
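One caveat worth noting: Matlab's rgb2hsv returns all components in [0, 1], whereas OpenCV's 8-bit HSV images use hue in [0, 179] and saturation/value in [0, 255]. If you feed Matlab-style thresholds to the function above on an OpenCV image, almost nothing will match; scaling them first should work. A small sketch, reusing the thresholds from the Matlab snippet:

# Scale the Matlab-style [0, 1] thresholds to OpenCV's 8-bit HSV
# ranges: hue 0-179, saturation and value 0-255.
object_mask = threshold_hsv(hsvimage,
                            0 * 179, 1 * 179,        # hue
                            0 * 255, 0.28 * 255,     # saturation
                            0.38 * 255, 0.97 * 255)  # value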
To select the values that satisfy your limits (and discard the ones outside them), you can use a list comprehension over the flattened channels (the .ravel() calls are needed because the channels are 2D arrays, so iterating them directly yields rows, not pixels):

# filtered_pixels is a list of tuples, which are ordered as (h, s, v)
# i.e. filtered_pixels[0][0] = h, filtered_pixels[0][1] = s and
# filtered_pixels[0][2] = v
hue, sat, val = im_hue.ravel(), im_sat.ravel(), im_val.ravel()
filtered_pixels = [(hue[i], sat[i], val[i])
                   for i in range(len(hue))
                   if satisfies_limits(hue[i], sat[i], val[i])]

satisfies_limits is a function that checks whether the passed hue, saturation and value are within the required limits. You can unwrap the above list comprehension into a for loop if you wish.
To clamp all values to the given limits, np.clip does this directly on the 2D arrays (I've swapped it in for the original map()/lambda approach, which iterates a 2D NumPy array row by row rather than pixel by pixel):

clamped_hue = np.clip(im_hue, hue_min, hue_max)
# And so on for saturation and value