Image Cropping with transformation matrix? - python

I am training a neural network to do Human Single Pose Estimation on the MPII dataset. Many of its images contain more than one person, and I need to crop the image in order to extract each single person.
For each person I have the position (or center) and the scale with respect to a 200 px height.
This code does just what I need:
import numpy as np
import cv2

def get_transform(center, scale, res, rot=0):
    # Generate transformation matrix
    h = 200 * scale
    t = np.zeros((3, 3))
    t[0, 0] = float(res[1]) / h
    t[1, 1] = float(res[0]) / h
    t[0, 2] = res[1] * (-float(center[0]) / h + .5)
    t[1, 2] = res[0] * (-float(center[1]) / h + .5)
    t[2, 2] = 1
    if not rot == 0:
        rot = -rot  # To match direction of rotation from cropping
        rot_mat = np.zeros((3, 3))
        rot_rad = rot * np.pi / 180
        sn, cs = np.sin(rot_rad), np.cos(rot_rad)
        rot_mat[0, :2] = [cs, -sn]
        rot_mat[1, :2] = [sn, cs]
        rot_mat[2, 2] = 1
        # Need to rotate around center
        t_mat = np.eye(3)
        t_mat[0, 2] = -res[1] / 2
        t_mat[1, 2] = -res[0] / 2
        t_inv = t_mat.copy()
        t_inv[:2, 2] *= -1
        t = np.dot(t_inv, np.dot(rot_mat, np.dot(t_mat, t)))
    return t

def transform(pt, center, scale, res, invert=0, rot=0):
    # Transform pixel location to different reference
    t = get_transform(center, scale, res, rot=rot)
    if invert:
        t = np.linalg.inv(t)
    new_pt = np.array([pt[0], pt[1], 1.]).T
    new_pt = np.dot(t, new_pt)
    return new_pt[:2].astype(int)

def crop(img, center, scale, res, rot=0):
    # Upper left point
    ul = np.array(transform([0, 0], center, scale, res, invert=1))
    # Bottom right point
    br = np.array(transform(res, center, scale, res, invert=1))
    new_shape = [br[1] - ul[1], br[0] - ul[0]]
    if len(img.shape) > 2:
        new_shape += [img.shape[2]]
    new_img = np.zeros(new_shape)
    # Range to fill new array
    new_x = max(0, -ul[0]), min(br[0], len(img[0])) - ul[0]
    new_y = max(0, -ul[1]), min(br[1], len(img)) - ul[1]
    # Range to sample from original image
    old_x = max(0, ul[0]), min(len(img[0]), br[0])
    old_y = max(0, ul[1]), min(len(img), br[1])
    new_img[new_y[0]:new_y[1], new_x[0]:new_x[1]] = img[old_y[0]:old_y[1], old_x[0]:old_x[1]]
    return cv2.resize(new_img, res)
However, I haven't figured out what kind of transformation matrix it is (the one created to derive ul and br).
Could someone explain to me what happens in these functions?
Thank you
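For what it's worth, a quick numeric check (with hypothetical values) makes the matrix's role visible: for rot=0, t is a pure scale-plus-translation from original-image coordinates to output-crop coordinates, and by construction it sends the person's center to the center of the crop:

import numpy as np

center, scale, res = [320, 240], 1.5, (256, 256)  # hypothetical person and output size
t = get_transform(center, scale, res)
# The person's center lands exactly at the center of the output crop:
print(np.dot(t, [center[0], center[1], 1.0]))  # -> [128. 128.   1.]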

Related

OpenCV: Remap in reverse order

I want to project an image from spherical to cubemap. From what I understood studying the maths, I need to create a theta, phi distribution for each pixel and then convert it into the Cartesian system to get a normalized pixel map.
I used the following code to do so
theta = 0
phi = np.pi/2
squareLength = 2048
# theta phi distribution for X-positive face
t = np.linspace(theta + np.pi/4, theta - np.pi/4, squareLength)
p = np.linspace(phi + np.pi/4, phi - np.pi/4, squareLength)
x, y = np.meshgrid(t, p)
# converting into the Cartesian system for the X-positive face (where r is the distance from the sphere center to the cube plane and X is constantly 0.5 in the Cartesian system)
X = np.zeros_like(y)
X[:,:] = 0.5
r = X / (np.cos(x) * np.sin(y))
Y = r * np.sin(x) * np.sin(y)
Z = r * np.cos(y)
XYZ = np.stack((X, Y, Z), axis=2)
# shifting pixels from the negative side
XYZ = XYZ + [0, 0.5, 0.5]
# since i want to project on X-positive face my map should be
x_map = -XYZ[:, :, 1] * squareLength
y_map = XYZ[:,:, 2] * squareLength
The map created above should give me my desired result with cv2.remap(), but it doesn't. I then tried looping through the pixels and implementing my own remap without interpolation or extrapolation. By trial and error I deduced the following formula, which gives me the correct result:
for i in range(2048):
    for j in range(2048):
        try:
            image[int(y_map[i, j]), int(x_map[i, j])] = im[i, j]
        except IndexError:
            pass
which is the reverse of the actual cv2.remap() convention, which says dst(x, y) = src(map_x(x, y), map_y(x, y)).
I do not understand whether I did the math all wrong, or whether there is a way to convert x_map and y_map to the correct form so that cv2.remap() gives the desired result.
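For a tiny illustration of that convention (a standalone sketch, independent of the projection math): the maps must hold, for every destination pixel, the source coordinate to sample from, so a horizontal flip is written as an inverse map:

import numpy as np
import cv2

src = np.arange(16, dtype=np.float32).reshape(4, 4)
ys, xs = np.mgrid[0:4, 0:4]
map_x = (3 - xs).astype(np.float32)  # for destination pixel (x, y), sample the source at (3 - x, y)
map_y = ys.astype(np.float32)
dst = cv2.remap(src, map_x, map_y, cv2.INTER_NEAREST)
assert np.array_equal(dst, src[:, ::-1])  # horizontal flip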
INPUT IMAGE
DESIRED RESULT (this one is without interpolation using loops)
CURRENT RESULT (using cv2.remap())
I'm quite new to OpenCV and I haven't worked with such difficult math before, but I tried to do this. I rewrote your code a bit and here it is:
import numpy as np
import cv2

src = cv2.imread("data/pink_sq.png")

def make_map():
    theta = 0
    phi = np.pi / 2
    squareLength = 4000
    # theta phi distribution for X-positive face
    t = np.linspace((theta - np.pi / 4), (theta + np.pi / 4), squareLength)
    p = np.linspace((phi + np.pi / 4), (phi - np.pi / 4), squareLength)
    x, y = np.meshgrid(t, p)
    x_res = np.zeros_like(y)
    x_res[:, :] = 0.5
    r = x_res * (np.cos(x))
    r /= np.amax(r)
    y_res = r * x
    z_res = r * np.cos(y)
    xyz = np.stack((x_res, y_res, z_res), axis=2)
    # shifting pixels from the negative side
    xyz = xyz + [0, 0.5, 0.5]
    # since I want to project on the X-positive face my map should be
    x_map = xyz[:, :, 1] * squareLength
    y_map = xyz[:, :, 2] * squareLength
    map_x = y_map.astype("float32")
    map_y = x_map.astype("float32")
    return map_x, map_y

map_x, map_y = make_map()
dst = cv2.remap(src, map_y, map_x, cv2.INTER_LINEAR)
cv2.imwrite("res.png", dst)
I don't understand the math in this code at all, but I rewrote it a bit and I should say that it works quite well. Here is the result image:
And yes, there is a slight difference between my result picture and yours, but I hope it is OK :) If I'm wrong somewhere, of course downvote this answer, because I'm not sure that it is correct.
I'm almost certain the issue has to do with the orientation of the reference frame in space. Maybe if you clarify the math a bit we can help.

Centroidal Voronoi tessellation

I am trying to build a bounded Voronoi diagram using the scipy package. In each iteration I compute the centroids of the Voronoi cells, move the generator points some small delta towards their centroids, and recompute the Voronoi diagram with the updated generator points. When I try to plot the updated points, something odd happens: the points I plot are not where they are expected to be.
Here's the code
import matplotlib.pyplot as pl
import numpy as np
import scipy as sp
import scipy.spatial
import sys
np.random.seed(1)
eps = sys.float_info.epsilon
n_robots = 10
robots = np.random.rand(n_robots, 2)
#print(robots)
bounding_box = np.array([0., 1., 0., 1.])
def in_box(robots, bounding_box):
    return np.logical_and(np.logical_and(bounding_box[0] <= robots[:, 0],
                                         robots[:, 0] <= bounding_box[1]),
                          np.logical_and(bounding_box[2] <= robots[:, 1],
                                         robots[:, 1] <= bounding_box[3]))
def voronoi(robots, bounding_box):
    i = in_box(robots, bounding_box)
    points_center = robots[i, :]
    # Mirror the interior points across each side of the box so that the
    # Voronoi cells of the interior points are bounded
    points_left = np.copy(points_center)
    points_left[:, 0] = bounding_box[0] - (points_left[:, 0] - bounding_box[0])
    points_right = np.copy(points_center)
    points_right[:, 0] = bounding_box[1] + (bounding_box[1] - points_right[:, 0])
    points_down = np.copy(points_center)
    points_down[:, 1] = bounding_box[2] - (points_down[:, 1] - bounding_box[2])
    points_up = np.copy(points_center)
    points_up[:, 1] = bounding_box[3] + (bounding_box[3] - points_up[:, 1])
    points = np.append(points_center,
                       np.append(np.append(points_left, points_right, axis=0),
                                 np.append(points_down, points_up, axis=0),
                                 axis=0),
                       axis=0)
    # Compute Voronoi
    vor = sp.spatial.Voronoi(points)
    # Filter regions: keep only regions whose vertices all lie inside the box
    regions = []
    ind = np.arange(points.shape[0])
    ind = np.expand_dims(ind, axis=1)
    for region in vor.regions:
        flag = True
        for index in region:
            if index == -1:
                flag = False
                break
            else:
                x = vor.vertices[index, 0]
                y = vor.vertices[index, 1]
                if not (bounding_box[0] - eps <= x and x <= bounding_box[1] + eps and
                        bounding_box[2] - eps <= y and y <= bounding_box[3] + eps):
                    flag = False
                    break
        if region != [] and flag:
            regions.append(region)
    vor.filtered_points = points_center
    vor.filtered_regions = regions
    return vor
def centroid_region(vertices):
    # Polygon area and centroid via the shoelace formula; expects a closed
    # polygon (first vertex repeated at the end)
    A = 0
    C_x = 0
    C_y = 0
    for i in range(0, len(vertices) - 1):
        s = (vertices[i, 0] * vertices[i + 1, 1] - vertices[i + 1, 0] * vertices[i, 1])
        A = A + s
        C_x = C_x + (vertices[i, 0] + vertices[i + 1, 0]) * s
        C_y = C_y + (vertices[i, 1] + vertices[i + 1, 1]) * s
    A = 0.5 * A
    C_x = (1.0 / (6.0 * A)) * C_x
    C_y = (1.0 / (6.0 * A)) * C_y
    return np.array([[C_x, C_y]])
def plot(r, index):
    vor = voronoi(r, bounding_box)
    fig = pl.figure()
    ax = fig.gca()
    # Plot initial points
    ax.plot(vor.filtered_points[:, 0], vor.filtered_points[:, 1], 'b.')
    print("initial", vor.filtered_points)
    # Plot ridge points
    for region in vor.filtered_regions:
        vertices = vor.vertices[region, :]
        ax.plot(vertices[:, 0], vertices[:, 1], 'go')
    # Plot ridges
    for region in vor.filtered_regions:
        vertices = vor.vertices[region + [region[0]], :]
        ax.plot(vertices[:, 0], vertices[:, 1], 'k-')
    # Compute and plot centroids
    centroids = []
    for region in vor.filtered_regions:
        vertices = vor.vertices[region + [region[0]], :]
        centroid = centroid_region(vertices)
        centroids.append(list(centroid[0, :]))
        ax.plot(centroid[:, 0], centroid[:, 1], 'r.')
    centroids = np.asarray(centroids)
    rob = np.copy(vor.filtered_points)
    # The code below is only for plotting; the actual update happens in update()
    interim_x = np.asarray(centroids[:, 0] - rob[:, 0])
    interim_y = np.asarray(centroids[:, 1] - rob[:, 1])
    magn = [np.linalg.norm(centroids[i, :] - rob[i, :]) for i in range(rob.shape[0])]
    x = np.asarray([interim_x[i] / magn[i] for i in range(interim_x.shape[0])])
    y = np.asarray([interim_y[i] / magn[i] for i in range(interim_y.shape[0])])
    nor = np.copy(rob)
    for i in range(x.shape[0]):
        nor[i, 0] = x[i]
        nor[i, 1] = y[i]
    temp = np.copy(rob)
    temp[:, 0] = [rob[i, 0] + 0.5 * interim_x[i] for i in range(rob.shape[0])]
    temp[:, 1] = [rob[i, 1] + 0.5 * interim_y[i] for i in range(rob.shape[0])]
    ax.plot(temp[:, 0], temp[:, 1], 'y.')
    ax.set_xlim([-0.1, 1.1])
    ax.set_ylim([-0.1, 1.1])
    pl.savefig("voronoi" + str(index) + ".png")
    return centroids
def update(rob, centroids):
    interim_x = np.asarray(centroids[:, 0] - rob[:, 0])
    interim_y = np.asarray(centroids[:, 1] - rob[:, 1])
    magn = [np.linalg.norm(centroids[i, :] - rob[i, :]) for i in range(rob.shape[0])]
    x = np.asarray([interim_x[i] / magn[i] for i in range(interim_x.shape[0])])
    y = np.asarray([interim_y[i] / magn[i] for i in range(interim_y.shape[0])])
    nor = [np.linalg.norm([x[i], y[i]]) for i in range(x.shape[0])]
    temp = np.copy(rob)
    temp[:, 0] = [rob[i, 0] + 0.5 * interim_x[i] for i in range(rob.shape[0])]
    temp[:, 1] = [rob[i, 1] + 0.5 * interim_y[i] for i in range(rob.shape[0])]
    return np.asarray(temp)
if __name__ == '__main__':
    for i in range(1):
        centroids = plot(robots, i)
        robots = update(robots, centroids)
Also here is an image of what the code does. The blue points are the generator points, red are the centroids and yellow are supposed to be the midway points between the blue and red points. But as you can see the yellow points are not in between the blue and red points.
The problem is that the points you feed to Voronoi get inflated during the construction of the tessellation (you mirror them across each side of the box), and when you later filter the regions, the regions you keep are not in the same order as your points. Consequently, when you set vor.filtered_points = points_center in voronoi(), the points are shuffled relative to the order of the regions. So while you're computing the midpoints correctly, you're using the wrong pairs of points.
I circled two correct pairings in green and an incorrect one in red here:
As you can see from the red circle, the basis point in an edge cell is paired with the centroid of an adjacent cell.
The solution is simple: when you're filtering the regions and find a region to keep, you need to gather the point which falls inside the corresponding region. You can do this by matching vor.points to vor.point_region and finding the corresponding region, for which you'll need to enumerate your regions:
# Compute Voronoi
vor = sp.spatial.Voronoi(points)
# Filter regions and select corresponding points
regions = []
points_to_filter = []  # we'll need to gather points too
ind = np.arange(points.shape[0])
ind = np.expand_dims(ind, axis=1)
for i, region in enumerate(vor.regions):  # enumerate the regions
    if not region:  # nicer to skip the empty region altogether
        continue
    flag = True
    for index in region:
        if index == -1:
            flag = False
            break
        else:
            x = vor.vertices[index, 0]
            y = vor.vertices[index, 1]
            if not (bounding_box[0] - eps <= x and x <= bounding_box[1] + eps and
                    bounding_box[2] - eps <= y and y <= bounding_box[3] + eps):
                flag = False
                break
    if flag:
        regions.append(region)
        # find the point which lies inside
        points_to_filter.append(vor.points[vor.point_region == i][0, :])
vor.filtered_points = np.array(points_to_filter)
vor.filtered_regions = regions
With these modifications the averaging works fine:
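For reference, a tiny standalone sketch (with made-up points) of the vor.point_region bookkeeping used above; entry i of vor.point_region is the index into vor.regions of the region that belongs to input point i:

import numpy as np
import scipy.spatial

pts = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.5]])
vor = scipy.spatial.Voronoi(pts)
for i in range(len(vor.points)):
    # vor.point_region[i] indexes vor.regions for input point i
    print(vor.points[i], "->", vor.regions[vor.point_region[i]])
# and the inverse lookup used in the filter above:
i_region = vor.point_region[0]
print(vor.points[vor.point_region == i_region][0, :])  # recovers point 0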

How do I transform the values of an accumulator [Hough Transformation] back to a line on a canvas?

I am trying to detect the lines within an image using the Hough Transformation. Therefore I first create the accumulator like this:
from math import hypot, pi, cos, sin
from PIL import Image
import numpy as np
import cv2 as cv
import math

def hough(img):
    thetaAxisSize = 460  # width of the hough space image
    rAxisSize = 360  # height of the hough space image
    rAxisSize = int(rAxisSize / 2) * 2  # make sure this number is even
    img = im.load()
    w, h = im.size
    houghed_img = Image.new("L", (thetaAxisSize, rAxisSize), 0)  # sets the image size
    pixel_houghed_img = houghed_img.load()
    max_radius = hypot(w, h)
    d_theta = pi / thetaAxisSize
    d_rho = max_radius / (rAxisSize / 2)
    # Accumulator
    for x in range(0, w):
        for y in range(0, h):
            threshold = 255
            col = img[x, y]
            if col >= threshold:  # determines for each pixel at (x, y) if there is enough evidence of a straight line at that pixel
                for vx in range(0, thetaAxisSize):
                    theta = d_theta * vx  # angle between the x axis and the line connecting the origin with the closest point
                    rho = x * cos(theta) + y * sin(theta)  # distance from the origin to the closest point on the straight line
                    vy = rAxisSize / 2 + int(rho / d_rho + 0.5)  # compute the y value in the hough space image
                    pixel_houghed_img[vx, vy] += 1  # voting
    return houghed_img
And then call the function like this:
im = Image.open("img3.pgm").convert("L")
houghed_img = hough(im)
houghed_img.save("ho.bmp")
houghed_img.show()
The result seems to be okay:
So here comes the problem. I now want to find the top 3 values in the Hough space and transform them back into 3 lines. The highest values should correspond to the strongest lines.
Therefore I first look for the highest values within the pixel array and take the X and Y values of the maxima I found. From my understanding, these X and Y values are my rho and theta. I find the maxima like this:
def find_maxima(houghed_img):
    w, h = houghed_img.size
    max_radius = hypot(w, h)
    pixel_houghed_img = houghed_img.load()
    max1, max2, max3 = 0, 0, 0
    x1position, x2position, x3position = 0, 0, 0
    y1position, y2position, y3position = 0, 0, 0
    rho1, rho2, rho3 = 0, 0, 0
    theta1, theta2, theta3 = 0, 0, 0
    for x in range(1, w):
        for y in range(1, h):
            value = pixel_houghed_img[x, y]
            if value > max1:
                max1 = value
                x1position = x
                y1position = y
                rho1 = x
                theta1 = y
            elif value > max2:
                max2 = value
                x2position = x
                y2position = y
                rho2 = x
                theta2 = y
            elif value > max3:
                max3 = value
                x3position = x
                y3position = y
                rho3 = x
                theta3 = y
    print('max', max1, max2, max3)
    print('rho', rho1, rho2, rho3)
    print('theta', theta1, theta2, theta3)
    # Results of the print:
    # ('max', 255, 255, 255)
    # ('rho', 1, 1, 1)
    # ('theta', 183, 184, 186)
    return rho1, theta1, rho2, theta2, rho3, theta3
And now I want to use these rho and theta values to draw the detected lines. I do this with the following code:
img_copy = np.ones(im.size)
rho1, theta1, rho2, theta2, rho3, theta3 = find_maxima(houghed_img)
a1 = math.cos(theta1)
b1 = math.sin(theta1)
x01 = a1 * rho1
y01 = b1 * rho1
pt11 = (int(x01 + 1000*(-b1)), int(y01 + 1000*(a1)))
pt21 = (int(x01 - 1000*(-b1)), int(y01 - 1000*(a1)))
cv.line(img_copy, pt11, pt21, (0,0,255), 3, cv.LINE_AA)
a2 = math.cos(theta2)
b2 = math.sin(theta2)
x02 = a2 * rho2
y02 = b2 * rho2
pt12 = (int(x02 + 1000*(-b2)), int(y02 + 1000*(a2)))
pt22 = (int(x02 - 1000*(-b2)), int(y02 - 1000*(a2)))
cv.line(img_copy, pt12, pt22, (0,0,255), 3, cv.LINE_AA)
a3 = math.cos(theta3)
b3 = math.sin(theta3)
x03 = a3 * rho3
y03 = b3 * rho3
pt13 = (int(x03 + 1000*(-b3)), int(y03 + 1000*(a3)))
pt23 = (int(x03 - 1000*(-b3)), int(y03 - 1000*(a3)))
cv.line(img_copy, pt13, pt23, (0,0,255), 3, cv.LINE_AA)
cv.imshow('lines', img_copy)
cv.waitKey(0)
cv.destroyAllWindows()
However, the result seems to be wrong:
So my assumption is that I either do something wrong when declaring the rho and theta values in the find_maxima() function, meaning that something is wrong with this:
max1 = value
x1position = x
y1position = y
rho1 = x
theta1 = y
OR that I am doing something wrong when translating the rho and theta value back to a line.
I would be very thankful if someone can help me with that!
Edit 1: As requested, please find below the original image in which I want to find the lines:
Edit 2:
Thanks to the input of @Alessandro Jacopson and @Cris Luengo I was able to make some changes that definitely give me some hope!
In my def hough(img): I was setting the threshold to 255, which means that I only voted for white pixels. That is wrong, since I want to look at the black pixels, because those pixels indicate lines, not the white background of my image. So the calculation of the accumulator in def hough(img): now looks like this:
# Accumulator
for x in range(0, w):
    for y in range(0, h):
        threshold = 0
        col = img[x, y]
        if col <= threshold:  # determines for each pixel at (x, y) if there is enough evidence of a straight line at that pixel
            for vx in range(0, thetaAxisSize):
                theta = d_theta * vx  # angle between the x axis and the line connecting the origin with the closest point
                rho = x * cos(theta) + y * sin(theta)  # distance from the origin to the closest point on the straight line
                vy = rAxisSize / 2 + int(rho / d_rho + 0.5)  # compute the y value in the hough space image
                pixel_houghed_img[vx, vy] += 1  # voting
return houghed_img
This leads to the following accumulator and to the following rho and theta values when using the find_maxima() function:
# Results of the prints: (now top 8 instead of top 3)
# ('max', 155, 144, 142, 119, 119, 104, 103, 98)
# ('rho', 120, 264, 157, 121, 119, 198, 197, 197)
# ('theta', 416, 31, 458, 414, 417, 288, 291, 292)
The lines that I can draw from these values look like this:
So these results are much better, but something still seems to be wrong. I have a strong suspicion that the problem is still here:
for x in range(1, w):
    for y in range(1, h):
        value = pixel_houghed_img[x, y]
        if value > max1:
            max1 = value
            x1position = x
            y1position = y
            rho1 = value
            theta1 = x
Here I am setting rho and theta to values in [0...w] and [0...h] respectively. I think this is wrong, since in the Hough space the X and Y values are not 0, 1, 2, 3..., because we are in a different space. So I assume that I have to multiply X and Y by something to bring them back into Hough space. But this is just an assumption; maybe you can think of something else?
Again thank you very much to Alessandro and Cris for helping me out here!
Edit 3: Working code, thanks to @Cris Luengo
from math import hypot, pi, cos, sin
from PIL import Image
import numpy as np
import cv2 as cv
import math

def hough(img):
    img = im.load()
    w, h = im.size
    thetaAxisSize = w  # width of the hough space image
    rAxisSize = h  # height of the hough space image
    rAxisSize = int(rAxisSize / 2) * 2  # make sure this number is even
    houghed_img = Image.new("L", (thetaAxisSize, rAxisSize), 0)  # sets the image size
    pixel_houghed_img = houghed_img.load()
    max_radius = hypot(w, h)
    d_theta = pi / thetaAxisSize
    d_rho = max_radius / (rAxisSize / 2)
    # Accumulator
    for x in range(0, w):
        for y in range(0, h):
            threshold = 0
            col = img[x, y]
            if col <= threshold:  # determines for each pixel at (x, y) if there is enough evidence of a straight line at that pixel
                for vx in range(0, thetaAxisSize):
                    theta = d_theta * vx  # angle between the x axis and the line connecting the origin with the closest point
                    rho = x * cos(theta) + y * sin(theta)  # distance from the origin to the closest point on the straight line
                    vy = rAxisSize / 2 + int(rho / d_rho + 0.5)  # compute the y value in the hough space image
                    pixel_houghed_img[vx, vy] += 1  # voting
    return houghed_img, rAxisSize, d_rho, d_theta

def find_maxima(houghed_img, rAxisSize, d_rho, d_theta):
    w, h = houghed_img.size
    pixel_houghed_img = houghed_img.load()
    maxNumbers = 9
    ignoreRadius = 10
    maxima = [0] * maxNumbers
    rhos = [0] * maxNumbers
    thetas = [0] * maxNumbers
    for u in range(0, maxNumbers):
        print('u:', u)
        value = 0
        xposition = 0
        yposition = 0
        # find the maximum in the image
        for x in range(0, w):
            for y in range(0, h):
                if pixel_houghed_img[x, y] > value:
                    value = pixel_houghed_img[x, y]
                    xposition = x
                    yposition = y
        # save the maximum, rho and theta
        maxima[u] = value
        rhos[u] = (yposition - rAxisSize / 2) * d_rho
        thetas[u] = xposition * d_theta
        pixel_houghed_img[xposition, yposition] = 0
        # delete the values around the found maximum
        radius = ignoreRadius
        for vx2 in range(-radius, radius):  # checks the values around the center
            for vy2 in range(-radius, radius):
                x2 = xposition + vx2  # inspected position, shifted by the offset
                y2 = yposition + vy2
                if not (x2 < 0 or x2 >= w):
                    if not (y2 < 0 or y2 >= h):
                        pixel_houghed_img[x2, y2] = 0
                        print(pixel_houghed_img[x2, y2])
    print('max', maxima)
    print('rho', rhos)
    print('theta', thetas)
    return maxima, rhos, thetas

im = Image.open("img5.pgm").convert("L")
houghed_img, rAxisSize, d_rho, d_theta = hough(im)
houghed_img.save("houghspace.bmp")
houghed_img.show()

img_copy = np.ones(im.size)
maxima, rhos, thetas = find_maxima(houghed_img, rAxisSize, d_rho, d_theta)
for t in range(0, len(maxima)):
    a = math.cos(thetas[t])
    b = math.sin(thetas[t])
    x = a * rhos[t]
    y = b * rhos[t]
    pt1 = (int(x + 1000 * (-b)), int(y + 1000 * (a)))
    pt2 = (int(x - 1000 * (-b)), int(y - 1000 * (a)))
    cv.line(img_copy, pt1, pt2, (0, 0, 255), 3, cv.LINE_AA)
cv.imshow('lines', img_copy)
cv.waitKey(0)
cv.destroyAllWindows()
Original Image:
Accumulator:
Successful Line Detection:
This part of your code doesn't seem right:
max1 = value
x1position = x
y1position = y
rho1 = value
theta1 = x
If x and y are the two coordinates in the parameter space, they will correspond to rho and theta. Setting rho equal to the value makes no sense. I also don't know why you store x1position and y1position, since you don't use these variables.
Next, you need to transform these coordinates back to actual rho and theta values, inverting the transform you do when writing:
theta = d_theta * vx  # angle between the x axis and the line connecting the origin with the closest point
rho = x*cos(theta) + y*sin(theta)  # distance from the origin to the closest point on the straight line
vy = rAxisSize/2 + int(rho/d_rho+0.5)  # compute the y value in the hough space image
The inverse would be:
rho = (y - rAxisSize/2) * d_rho
theta = x * d_theta
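As a quick round-trip check of these formulas (a sketch with hypothetical sizes; the int(... + 0.5) rounding of the forward step is omitted so the algebra is exact):

import numpy as np

w, h = 400, 300                      # hypothetical image size
thetaAxisSize, rAxisSize = 460, 360  # accumulator size, as in the question
max_radius = np.hypot(w, h)
d_theta = np.pi / thetaAxisSize
d_rho = max_radius / (rAxisSize / 2)

rho, theta = 100.0, 1.2              # arbitrary line parameters
vx = theta / d_theta                 # forward: theta -> accumulator column
vy = rAxisSize / 2 + rho / d_rho     # forward: rho -> accumulator row
assert np.isclose(vx * d_theta, theta)                # inverse recovers theta
assert np.isclose((vy - rAxisSize / 2) * d_rho, rho)  # inverse recovers rho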
First of all, following How to create a Minimal, Complete, and Verifiable example you should post or give a link to your image img3.pgm, if possible.
Then, you wrote that:
# Results of the print:
# ('max', 255, 255, 255)
# ('rho', 1, 1, 1)
# ('theta', 183, 184, 186)
so rho is the same for the three lines, and theta does not differ much, varying between 183 and 186; so the three lines are almost identical to each other, and this fact does not depend on the method you use to get the line equation and draw it.
According to the Hough Line Transform tutorial, it seems to me that your method for finding two points on a line is correct. That is what the tutorial suggests, and it seems to me equivalent to your code:
lines = cv2.HoughLines(edges, 1, np.pi / 180, 200)
for rho, theta in lines[0]:
    a = np.cos(theta)
    b = np.sin(theta)
    x0 = a * rho
    y0 = b * rho
    x1 = int(x0 + 1000 * (-b))
    y1 = int(y0 + 1000 * (a))
    x2 = int(x0 - 1000 * (-b))
    y2 = int(y0 - 1000 * (a))
    cv2.line(img, (x1, y1), (x2, y2), (0, 0, 255), 2)
I suspect the peak finding algorithm may not be correct. Your peak finding algorithm finds the location of the largest peak and then the two locations very close to that maximum.
For simplicity, look at what happens in just one dimension: a peak finding algorithm would be expected to find three peak locations, at x=-1, x=0 and x=1, with peak values close to .25, .5 and 1.
import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(-2, 2, 1000)
y = np.exp(-(x-1)**2/0.01) + .5*np.exp(-(x)**2/0.01) + .25*np.exp(-(x+1)**2/0.01)

max1, max2, max3 = 0, 0, 0
m1 = np.zeros(1000)
m2 = np.zeros(1000)
m3 = np.zeros(1000)
x1position, x2position, x3position = 0, 0, 0
for i in range(0, 1000):
    value = y[i]
    if value > max1:
        max1 = value
        x1position = x[i]
    elif value > max2:
        max2 = value
        x2position = x[i]
    elif value > max3:
        max3 = value
        x3position = x[i]
    m1[i] = max1
    m2[i] = max2
    m3[i] = max3

print('xposition', x1position, x2position, x3position)
print('max', max1, max2, max3)

plt.figure()
plt.subplot(4, 1, 1)
plt.plot(x, y)
plt.ylabel('$y$')
plt.subplot(4, 1, 2)
plt.plot(x, m1)
plt.ylabel('$max_1$')
plt.subplot(4, 1, 3)
plt.plot(x, m2)
plt.ylabel('$max_2$')
plt.subplot(4, 1, 4)
plt.plot(x, m3)
plt.xlabel('$x$')
plt.ylabel('$max_3$')
plt.show()
the output is
('xposition', 0.99899899899899891, 1.0030030030030028, 1.0070070070070072)
('max', 0.99989980471948192, 0.99909860379824966, 0.99510221871862647)
and it is not what was expected.
Here you have a visual trace of the program:
To detect multiple peaks in a 2D field you should have a look, for example, at Peak detection in a 2D array.
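For instance, a minimal sketch along those lines, assuming scipy is available (the size and threshold values here are placeholders): a pixel counts as a peak if it equals the maximum of its neighborhood and its vote count exceeds a threshold.

import numpy as np
from scipy.ndimage import maximum_filter

def find_peaks_2d(acc, size=10, threshold=100):
    # keep only pixels that equal their local size-x-size maximum
    is_local_max = maximum_filter(acc, size=size) == acc
    return np.argwhere(is_local_max & (acc > threshold))  # (row, col) pairs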

resize image with bilinear interpolation in python

I want to resize an image with bilinear interpolation. I found the new intensity value, but I do not know how to use it. The code I have written is below:
def resizeImageBI(im, width, height):
    temp = np.zeros((height, width), dtype=np.uint8)
    ratio_1 = float(im.size[0] - 1) / float(width - 1)
    ratio_0 = float(im.size[1] - 1) / float(height - 1)
    xx, yy = np.mgrid[:height, :width]
    xmap = np.around(xx * ratio_0)
    ymap = np.around(yy * ratio_1)
    for i in range(0, height):
        for j in range(0, width):
            temp[i][j] = im.getpixel((ymap[i][j], xmap[i][j])) * getNewIntensity(i, j, ratio_1, ratio_0)
    return Image.fromarray(temp)
First I get the image width and height ratios; for example, lena.png with a width ratio of 0.5 and a height ratio of 1.
The original image is here:
This is the output according to the written code:
I just had to do this for a class and I haven't been graded yet, so you should check this out before using.
Basic Interpolation function
def interpolation(y0, x0, y1, x1, x):
    # linear interpolation between (x0, y0) and (x1, y1), evaluated at x
    frac = (x - x0) / (x1 - x0)
    return y0 * (1 - frac) + y1 * frac
Step 1: Map the original coordinates to the newly resized image
def get_coords(im, W, H):
    h, w = im.shape
    x = np.arange(0, w + 1, 1) * W / w
    y = np.arange(0, h + 1, 1) * H / h
    return x, y
Step 2: Create a function to interpolate in the x-direction on all rows.
def im_interp(im, H, W):
    X = np.zeros(shape=(W, H))
    x, y = get_coords(im, W, H)
    for i, v in enumerate(X):
        y0_idx = np.argmax(y > i) - 1
        for j, _ in enumerate(v):
            # subtracting 1 because this is the first val
            # that is greater than j, want the idx before that
            x0_idx = np.argmax(x > j) - 1
            x1_idx = np.argmax(j < x)
            x0 = x[x0_idx]
            x1 = x[x1_idx]
            y0 = im[y0_idx, x0_idx - 1]
            y1 = im[y0_idx, x1_idx - 1]
            X[i, j] = interpolation(y0, x0, y1, x1, j)
    return X
Step 3: Use the function from the step above to interpolate twice: first on the image in the x-direction, then on the transpose of the newly created image (y-direction).
def im_resize(im, H, W):
    X_lin = im_interp(im, H, W)
    X = im_interp(X_lin.T, H, W)
    return X_lin, X.T
I return both images just to look at the difference.
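A hypothetical usage sketch (the array and sizes are made up, just to show the call):

import numpy as np

im = np.random.randint(0, 256, size=(64, 64)).astype(np.float64)  # fake grayscale image
X_lin, X = im_resize(im, H=128, W=128)
print(X.shape)  # (128, 128)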
I'm not sure if you want to do this manually as an exercise...
If not, there is scipy.misc.imresize, which can do what you want.

How to find the Gaussian Weighted-Average and Standard deviation of a structural element

I'm trying to implement an Intensity Normalization algorithm that is described by this formula:
x' = (x - gaussian_weighted_average) / std_deviation
The paper I'm following describes that I have to find the Gaussian weighted average and the standard deviation over each pixel x's neighbors, using a 7x7 kernel.
PS: x' is the new pixel value.
So, my question is: how can I compute a gaussian weighted average and the standard deviation for each pixel in image using a 7x7 kernel?
Does OpenCV provide any method to solve this?
import cv2
import numpy as np

img = cv2.imread("b.png", 0)
width = img.shape[0]
height = img.shape[1]
new_image = np.zeros((height, width, 1), np.uint8)
for i in range(width):
    for j in range(height):
        new_image[i][j] = img[i][j] - ...
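For reference, one common vectorized way to get both quantities with OpenCV (a sketch, not necessarily the paper's exact method): GaussianBlur gives the Gaussian-weighted local mean, and the weighted standard deviation follows from the identity var = E[x^2] - E[x]^2 applied to the blurred image and the blurred squared image.

import cv2
import numpy as np

img = cv2.imread("b.png", 0).astype(np.float64)
mean = cv2.GaussianBlur(img, (7, 7), 0)              # Gaussian-weighted local mean
mean_sq = cv2.GaussianBlur(img * img, (7, 7), 0)     # Gaussian-weighted mean of squares
std = np.sqrt(np.maximum(mean_sq - mean * mean, 0))  # weighted standard deviation
x_new = (img - mean) / np.maximum(std, 1e-8)         # guard against division by zero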
The original implementation (C++) of the author can be found here: see GenerateIntensityNormalizedDatabase().
This has been re-implemented by another student in python. The python implementation is:
import cv2
import numpy as np

def StdDev(img, meanPoint, point, kSize):
    kSizeX, kSizeY = kSize // 2, kSize // 2  # integer half-size so the slice indices stay ints
    ystart = point[1] - kSizeY if 0 < point[1] - kSizeY < img.shape[0] else 0
    yend = point[1] + kSizeY + 1 if 0 < point[1] + kSizeY + 1 < img.shape[0] else img.shape[0] - 1
    xstart = point[0] - kSizeX if 0 < point[0] - kSizeX < img.shape[1] else 0
    xend = point[0] + kSizeX + 1 if 0 < point[0] + kSizeX + 1 < img.shape[1] else img.shape[1] - 1
    patch = (img[ystart:yend, xstart:xend] - meanPoint) ** 2
    total = np.sum(patch)
    n = patch.size
    return 1 if total == 0 or n == 0 else np.sqrt(total / float(n))

def IntensityNormalization(img, kSize):
    blur = cv2.GaussianBlur(img, (kSize, kSize), 0, 0).astype(np.float64)
    newImg = np.ones(img.shape, dtype=np.float64) * 127
    for x in range(img.shape[1]):
        for y in range(img.shape[0]):
            original = img[y, x]
            gauss = blur[y, x]
            desvio = StdDev(img, gauss, [x, y], kSize)
            novoPixel = 127
            if desvio > 0:
                novoPixel = (original - gauss) / float(desvio)
            newVal = np.clip((novoPixel * 127 / float(2.0)) + 127, 0, 255)
            newImg[y, x] = newVal
    return newImg
To use the intensity normalization, you could do this:
kSize = 7
img = cv2.imread('{IMG_FILENAME}', cv2.IMREAD_GRAYSCALE).astype(np.float64)
out = IntensityNormalization(img, kSize)
To visualize the resulting image, don't forget to convert out back to np.uint8 (why?). I'd recommend you to use the original implementation in C++ if you want to reproduce his results.
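For example (IntensityNormalization() already clips to [0, 255], so a cast suffices; the extra clip here is just defensive):

out_u8 = np.clip(out, 0, 255).astype(np.uint8)
cv2.imshow('normalized', out_u8)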
Disclaimer: I'm from the same lab of the author of this paper.
