I am using Python v2.7 for this work.
As an input I have a mostly white image with a clear black line on it. The line is always straight, never a polynomial of second or higher order, and it can be anywhere on the image.
I am trying to determine the equation of this line in the form y = ax + b.
Currently my approach would be to find which pixels belong to the line and then run a linear regression to get the equation. I am trying to find out which Python function I should use to achieve this, and that is where I need some help.
Or maybe you have an even simpler way of doing it.
Adding one image as an example:
Okay, so I found a way to do what I wanted quite simply in the end:
import numpy as np

def estimate_coef(x, y):
    # number of observations/points
    n = np.size(x)
    # mean of x and y vector
    m_x, m_y = np.mean(x), np.mean(y)
    # calculating cross-deviation and deviation about x
    SS_xy = np.sum(y*x) - n*m_y*m_x
    SS_xx = np.sum(x*x) - n*m_x*m_x
    # calculating regression coefficients
    a = SS_xy / SS_xx
    b = m_y - a*m_x
    return (a, b)
# MAIN CODE
# 1. Read the image
# 2. Find where the pixels belonging to the line are located
# 3. Perform linear regression to get the coefficients
from scipy.misc import imread  # scipy.misc.imread supports the mode="L" greyscale conversion

image = []  # will contain the image read

# for all images to analyze (dut.images is assumed to hold the list of image file paths)
for i in range(len(dut.images)):
    print "\n\nimage ", i, dut.images[i]
    # read image (convert to greyscale)
    image = imread(dut.images[i], mode="L")
    height = image.shape[0] - 1
    threshold = (np.min(image) + np.max(image)) / 2
    line = np.where(image < threshold)  # coordinates of the pixels belonging to the line
    x = line[1]           # store the x positions
    y = height - line[0]  # store the y positions; inverted because the image origin is the top-left corner instead of the bottom-left
    #position = np.array([x,y])
    a, b = estimate_coef(x, y)
    print("Estimated coefficients:\na = %.6f\nb = %.6f" % (a, b))
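For what it's worth, the same straight-line fit can be done in a single call with np.polyfit (degree 1), which could replace estimate_coef entirely; a minimal sketch, assuming x and y are the pixel coordinate arrays built above:
# np.polyfit returns the polynomial coefficients with the highest power first,
# so for degree 1 this is (slope a, intercept b)
a, b = np.polyfit(x, y, 1)
print("Estimated coefficients: a = %.6f, b = %.6f" % (a, b))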
I am trying to rotate an image clockwise by 45 degrees and then translate it by (-50, -50).
The rotation works fine (I followed this page: How do I rotate an image manually without using cv2.getRotationMatrix2D):
import numpy as np
import math
from scipy import ndimage
from PIL import Image

# inputs
img = ndimage.imread("A.png")
rotation_amount_degree = 45

# convert rotation amount to radians
rotation_amount_rad = rotation_amount_degree * np.pi / 180.0

# get dimension info
height, width, num_channels = img.shape

# create output image, for worst case size (45 degrees)
max_len = int(math.sqrt(height*height + width*width))
rotated_image = np.zeros((max_len, max_len, num_channels))
#rotated_image = np.zeros((img.shape))

rotated_height, rotated_width, _ = rotated_image.shape
mid_row = int( (rotated_height+1)/2 )
mid_col = int( (rotated_width+1)/2 )

# for each pixel in the output image, find which pixel
# it corresponds to in the input image
for r in range(rotated_height):
    for c in range(rotated_width):
        # apply the rotation matrix, the other way around (output -> input)
        y = (r-mid_col)*math.cos(rotation_amount_rad) + (c-mid_row)*math.sin(rotation_amount_rad)
        x = -(r-mid_col)*math.sin(rotation_amount_rad) + (c-mid_row)*math.cos(rotation_amount_rad)

        # add offset
        y += mid_col
        x += mid_row

        # get the nearest index
        # a better way would be linear interpolation
        x = int(round(x))
        y = int(round(y))

        #print(r, " ", c, " corresponds to-> " , y, " ", x)

        # check if x/y corresponds to a valid pixel in the input image
        if (x >= 0 and y >= 0 and x < width and y < height):
            rotated_image[r][c][:] = img[y][x][:]

# save output image
output_image = Image.fromarray(rotated_image.astype("uint8"))
output_image.save("rotated_image.png")
However, when I try to translate the image, things go wrong. I edited the above code to this:
        if (x >= 0 and y >= 0 and x < width and y < height):
            rotated_image[r-50][c-50][:] = img[y][x][:]
But I got something like this:
It seems the right and bottom edges do not show the right pixels. How can I solve this?
Any suggestions would be highly appreciated.
The translation needs to be handled as a wholly separate step. Trying to translate the values from the source image doesn't account for the (0, 0, 0)-valued (if RGB) pixels newly created by the rotation.
Further, simply subtracting 50 from the rotated array indices, without validating them for positivity at that stage, allows negative indices, which Python fully supports (they index from the other end of the array). That is why you are getting a "wrap" effect instead of a translation.
You said your script rotates the image as intended, so, while perhaps not the most efficient approach, the most intuitive one is to simply shift the values of the image assembled after you rotate. You can either check that the destination indices stay valid after the shift and only keep those, or accept that, since you are shifting the values by 50, a 50-pixel band will be discarded, and you get (using the block you said was functional):
translated_image = np.zeros((max_len, max_len, num_channels))

for i in range(0, rotated_height-50):     # range(start, stop[, step])
    for j in range(0, rotated_width-50):
        translated_image[i+50][j+50][:] = rotated_image[i][j][:]

# save output image
output_image = Image.fromarray(translated_image.astype("uint8"))
output_image.save("rotated_translated_image.png")
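If a second pass over the image feels wasteful, the shift can also be folded into the rotation loop itself by offsetting the destination indices and bounds-checking that destination as well. A rough sketch under the same variable names as the rotation code above (the dr/dc_shift offsets are hypothetical names, matching the 50-pixel shift used above):
rotated_translated = np.zeros((max_len, max_len, num_channels))
dr, dc_shift = 50, 50   # row/column shift applied to the destination

for r in range(rotated_height):
    for c in range(rotated_width):
        # same inverse rotation mapping as before
        y = (r-mid_col)*math.cos(rotation_amount_rad) + (c-mid_row)*math.sin(rotation_amount_rad)
        x = -(r-mid_col)*math.sin(rotation_amount_rad) + (c-mid_row)*math.cos(rotation_amount_rad)
        x = int(round(x + mid_row))
        y = int(round(y + mid_col))

        # shift the destination, and validate both source and destination
        rt, ct = r + dr, c + dc_shift
        if (0 <= x < width and 0 <= y < height and
                0 <= rt < rotated_height and 0 <= ct < rotated_width):
            rotated_translated[rt][ct][:] = img[y][x][:]

output_image = Image.fromarray(rotated_translated.astype("uint8"))
output_image.save("rotated_translated_one_pass.png")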
I need to draw slanted lines like this programmatically using opencv-python, and it has to be similar in terms of the slant angle and the distance between the lines:
If I use OpenCV's cv2.line() I need to supply the function with the line's start and end points.
Following this accepted StackOverflow answer, I think I will be able to determine those two points, but first I need to calculate the line equation itself.
So what I have done is measure the slant angle of the line with the measure tool in Adobe Illustrator (the actual image was given to me by the graphic designer as an .ai file); I got 67 degrees and solved for the gradient of the line. The problem is that I don't know how to get the horizontal spacing/distance between the lines, which I need so I can supply the start.X of each line. I tried measuring the distance between the lines in Illustrator, but how do I map that to OpenCV coordinates?
Overall, is my idea feasible, or is there a better way to achieve this?
Update 1:
I managed to draw this experimental image:
And this is the code:
import math
import numpy as np
import cv2

def show_image_scaled(window_name, image, height, width):
    cv2.namedWindow(window_name, cv2.WINDOW_NORMAL)
    cv2.resizeWindow(window_name, width, height)
    cv2.imshow(window_name, image)
    cv2.waitKey(0)
    cv2.destroyAllWindows()

def slanted_lines_background():
    canvas = np.ones((200, 300)) * 255
    start_y = 0
    end_x = 0
    m = 2.35
    for x in range(0, canvas.shape[1], 10):
        start_x = x
        end_y = start_y + compute_length(m, start_x, start_y, end_x)
        cv2.line(canvas, (start_x, start_y), (end_x, end_y), (0, 0, 0), 2)
    show_image_scaled("Slant", canvas, 200, 300)

def compute_length(m, start_x, start_y, end_x=0):
    c = start_y - (m * start_x)
    length_square = (end_x - start_x)**2 + ((m * end_x) + c - start_y)**2
    length = math.sqrt(length_square)
    return int(length)
Still working on filling the left part of the rectangle.
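In case it helps, one way to cover the left part as well is to treat every line as having a top-edge intercept x0, step that intercept past the right side of the canvas, and let cv2.line clip each segment to the image. A rough sketch with a hypothetical slanted_lines_full helper (the slope here is the slope in image coordinates, not necessarily the m used above):
import math
import numpy as np
import cv2

def slanted_lines_full(height=200, width=300, slope=2.5, spacing=10):
    canvas = np.ones((height, width), dtype=np.uint8) * 255
    # each line runs down-left from its top-edge intercept x0:
    #   y = -slope * (x - x0)   (image coordinates, y grows downward)
    # stepping x0 from 0 to past the right edge by height/slope guarantees that
    # lines entering from the right edge still reach the bottom-right corner,
    # and cv2.line clips every segment to the canvas automatically
    x0_max = int(width + height / slope) + spacing
    for x0 in range(0, x0_max, spacing):
        pt_top = (x0, 0)
        pt_bottom = (int(round(x0 - (height - 1) / slope)), height - 1)
        cv2.line(canvas, pt_top, pt_bottom, 0, 2)
    return canvas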
This code "shades" every pixel in a given image to produce your hatched pattern. Don't worry about the math. It's mostly correct. I've checked the edge cases for small and wide lines. The sampling isn't exactly correct but nobody's gonna notice anyway because the imperfection amounts to small fractions of a pixel. And I've used numba to make it fast.
import numpy as np
from numba import njit, prange

@njit(parallel=True)
def hatch(im, angle=45, stride=10, dc=None):
    stride = float(stride)
    if dc is None:
        dc = stride * 0.5
    assert 0 <= dc <= stride
    stride2 = stride / 2
    dc2 = dc / 2

    angle = angle / 180 * np.pi
    c = np.cos(angle)
    s = np.sin(angle)

    (height, width) = im.shape[:2]

    for y in prange(height):
        for x in range(width):
            # distance to origin along normal
            dist_origin = c*x - s*y
            # distance to center of nearest line
            dist_center = stride2 - abs((dist_origin % stride) - stride2)
            # distance to edge of nearest line
            dist_edge = dist_center - dc2

            # shade pixel, with antialiasing
            # use edge-0.5 to edge+0.5 as "gradient" <=> 1-sized pixel straddles edge
            # for thick/thin lines, needs hairline handling
            # thin line -> gradient hits far edge of line / pixel may span both edges of line
            # thick line -> gradient hits edge of adjacent line / pixel may span adjacent line
            if dist_edge > 0.5:  # background
                val = 0
            else:  # pixel starts covering line
                val = 0.5 - dist_edge
                if dc < 1:  # thin line, clipped to line width
                    val = min(val, dc)
                elif stride - dc < 1:  # thick line, little background
                    val = max(val, 1 - (stride - dc))

            im[y,x] = val

canvas = np.zeros((128, 512), 'f4')
hatch(canvas, angle=-23, stride=5, dc=2.5)
# mind the gamma mapping before imshow
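For example, a minimal way to handle that gamma mapping before viewing or saving the result (assuming the canvas from above holds linear coverage values in [0, 1]; the 1/2.2 exponent is a rough sRGB approximation):
import cv2

# rough gamma of 1/2.2 before quantizing to 8 bit, so the antialiased
# line edges are not displayed too dark
out = (np.clip(canvas, 0, 1) ** (1 / 2.2) * 255).astype(np.uint8)
cv2.imwrite("hatched.png", out)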
I need to undistort the pixel coordinates of an image and have the corrected coordinates returned. I do not want an undistorted image returned -- just the corrected coordinates of the pixels. The camera is calibrated, and I have the camera intrinsic parameters and the distortion coefficients. I am using OpenCV in Python 3.
I have read as much of the theory as I can find, as well as related questions here. The key info is:
https://docs.opencv.org/2.4/doc/tutorials/calib3d/camera_calibration/camera_calibration.html
This pretty clearly describes the radial distortion and tangential distortion that need to be considered.
Radial:
$x_{corrected} = x(1 + k_1 r^2 + k_2 r^4 + k_3 r^6)$
$y_{corrected} = y(1 + k_1 r^2 + k_2 r^4 + k_3 r^6)$
Tangential:
$x_{corrected} = x + [2 p_1 x y + p_2(r^2 + 2x^2)]$
$y_{corrected} = y + [p_1(r^2 + 2y^2) + 2 p_2 x y]$
I suspect that I can't simply apply these corrections sequentially. Perhaps there is a function that does what I want directly anyway -- I'd love to hear about it.
I can't simply run the normal undistort procedure on the image, because I am attempting to apply an IR camera's distortion correction to the depth data from the same camera. If you undistort a depth image like that, you split pixels across coordinates and the result makes no sense. Hopefully I am on the right track with this.
The code so far:
import numpy as np
import cv2

imgIR = cv2.imread("/20190529-150017-305-1235-depth.png", 0)
# you could try this on any image...

h, w = imgIR.shape[:2]

# pixel coordinate grids
X = np.array([i for i in range(0, w)] * h)
X = X.reshape(h, w)
Y = np.array([[i] * w for i in range(0, h)])

fx = 483.0        # x focal length
fy = 490.2
CentreX = 361.4   # optical centre of the image - x axis
CentreY = 275.6

# relative to the optical centre, determine the coordinates of each pixel
# (done without loops using a scalar subtraction)
Xref = X - CentreX
Yref = Y - CentreY

# "scaling factor" refers to the relation between depth units and meters
scalingFactor = 18.0/36.0  # 18 pixels / 36 mm;
# I'm not sure what it should be yet -- whether [pixels at the shelf]/mm
# or mm/[pixels at the shelf]
Z = imgIR / scalingFactor

# using numpy
Xcoord = np.multiply(Xref, Z/fx)
Ycoord = np.multiply(Yref, Z/fy)

# how to correct these coords for the radial and tangential distortion?

# distortion coefficients as returned by cv2.calibrateCamera
dstvec = np.array([[-0.1225, -0.0159, 0.001616, -0.0018924, -0.00120696]])
What I am looking for is a new matrix of undistorted X coordinates (radial and tangential distortion removed) and a matrix of undistorted Y coordinates, with each matrix element representing one of the original pixels.
Thanks for your help!
I think you are looking for OpenCV's undistortPoints (https://amroamroamro.github.io/mexopencv/matlab/cv.undistortPoints.html).
px_distorted = np.zeros((1, 1, 2), dtype=np.float32)
px_distorted[0][0][0] = x_coordinate
px_distorted[0][0][1] = y_coordinate
px_undistorted = cv2.undistortPoints(px_distorted, intrinsics_mat, dist_coefficients)
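If you want to undistort every pixel coordinate of the image at once rather than a single point, the same call accepts an N x 1 x 2 array. A sketch assuming the X, Y grids, fx/fy, CentreX/CentreY and dstvec from the question (the P=K argument makes undistortPoints return pixel coordinates again instead of normalized camera coordinates):
import numpy as np
import cv2

# camera matrix built from the question's intrinsics
K = np.array([[fx, 0.0, CentreX],
              [0.0, fy, CentreY],
              [0.0, 0.0, 1.0]])

# stack every (x, y) pixel coordinate into the N x 1 x 2 shape undistortPoints expects
pts = np.stack([X.ravel(), Y.ravel()], axis=-1).astype(np.float32).reshape(-1, 1, 2)
und = cv2.undistortPoints(pts, K, dstvec, P=K)

# reshape back into per-pixel coordinate matrices
X_undist = und[:, 0, 0].reshape(X.shape)
Y_undist = und[:, 0, 1].reshape(Y.shape)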
I have detected a line contour and I want to fit a line equation to describe it.
I tried least-squares fitting, but due to perspective distortion one end of the line is thicker, so the fitted line drifts to one side at that end.
I have also considered using the Zhang-Suen thinning method, but such an algorithm seems like overkill for a simple line.
A simple and effective method is to compute the first principal component of the points on the line. Here is the code in MATLAB:
% Read image
im = imread('https://i.stack.imgur.com/pJ5Si.png');
% Binarize image and extract indices of line pixels
imbw = imbinarize(rgb2gray(im), 'global'); % Threshold with Otsu's method
[y, x] = ind2sub(size(imbw), find(imbw)); % Get indices of line pixels
% Extract first principal component
C = cov(x, y); % Compute covariance of x and y
coeff = pcacov(C); % Compute eigenvectors of C
vector_xy = [coeff(1,1), coeff(2,1)]; % Get first principal component
% Plot
figure; imshow(im); hold on
xx = vector_xy(1) * [-1 1] * size(imbw,2) + mean(x(:));
yy = vector_xy(2) * [-1 1] * size(imbw,2) + mean(y(:));
plot(xx,yy,'c','LineWidth',2)
axis on, legend('Principal Axis','Location','NorthWest')
You can obtain the coefficients of the line equation y = a*x + b with
a = vector_xy(2) / vector_xy(1);
b = mean(y(:)) - a * mean(x(:));
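For a Python/OpenCV pipeline, roughly the same principal-component fit can be written with NumPy; a sketch, assuming x and y are 1-D arrays holding the coordinates of the line pixels (e.g. from np.where on the binarized image):
import numpy as np

pts = np.stack([x, y]).astype(float)   # 2 x N array of line-pixel coordinates
C = np.cov(pts)                        # 2 x 2 covariance matrix
eigvals, eigvecs = np.linalg.eigh(C)   # eigh sorts eigenvalues in ascending order
vector_xy = eigvecs[:, -1]             # first principal component = last column

a = vector_xy[1] / vector_xy[0]        # slope of y = a*x + b
b = y.mean() - a * x.mean()            # the fitted line passes through the centroid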
I was reading the paper "Self-Invertible 2D Log-Gabor Wavelets", which defines the 2D log-Gabor filter as follows:
The paper also states that the filter only covers one side of the frequency space and shows that in this image:
In my attempt to implement the filter I get results that do not match what is stated in the paper. Let me start with my implementation, then I will state the problems.
Implementation:
I created a 2D array that contains the filter and transformed each index so that the origin of the frequency domain is at the center of the array, with the positive x-axis going right and the positive y-axis going up.
import math
import numpy as np

number_scales = 5        # scale resolution
number_orientations = 9  # orientation resolution
N = constantDim          # image dimensions

def getLogGaborKernal(scale, angle, logfun=math.log2, norm=True):
    # set up the filter configuration
    center_scale = logfun(N) - scale
    center_angle = ((np.pi/number_orientations) * angle) if (scale % 2) \
                    else ((np.pi/number_orientations) * (angle+0.5))
    scale_bandwidth = 0.996 * math.sqrt(2/3)
    angle_bandwidth = 0.996 * (1/math.sqrt(2)) * (np.pi/number_orientations)

    # 2d array that will hold the filter
    kernel = np.zeros((N, N))
    # get the center of the 2d array so we can shift the origin
    middle = math.ceil((N/2)+0.1)-1

    # calculate the filter
    for x in range(0, constantDim):
        for y in range(0, constantDim):
            # get the transformed x and y where the origin is at the center
            # and the positive x-axis goes right while the positive y-axis goes up
            x_t, y_t = (x-middle), -(y-middle)
            # calculate the filter value at the given index
            kernel[y,x] = logGaborValue(x_t, y_t, center_scale, center_angle,
                                        scale_bandwidth, angle_bandwidth, logfun)

    # normalize the filter energy
    if norm:
        kernel = kernel / np.sum(kernel**2)
    return kernel
To calculate the filter value at each index, another transform is made to go to log-polar space:
def logGaborValue(x, y, center_scale, center_angle, scale_bandwidth,
                  angle_bandwidth, logfun):
    # transform to polar coordinates
    raw, theta = getPolar(x, y)
    # if we are at the center, return 0, as zero is not
    # defined in the log space
    if raw == 0:
        return 0

    # go to log-polar coordinates
    raw = logfun(raw)

    # calculate (theta-center_theta): we calculate cos(theta-center_theta)
    # and sin(theta-center_theta), then use atan to get the required value;
    # this way we can eliminate the angular-distance wrap-around problem
    costheta, sintheta = math.cos(theta), math.sin(theta)
    ds = sintheta * math.cos(center_angle) - costheta * math.sin(center_angle)
    dc = costheta * math.cos(center_angle) + sintheta * math.sin(center_angle)
    dtheta = math.atan2(ds, dc)

    # final value: multiply the radial component by the angular one
    return math.exp(-0.5 * ((raw-center_scale) / scale_bandwidth)**2) * \
           math.exp(-0.5 * (dtheta/angle_bandwidth)**2)
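(getPolar is not shown in the question; a plausible implementation consistent with how it is used above, i.e. returning the radius and the angle of the already y-flipped point, would be:)
import math

def getPolar(x, y):
    # radius and angle of the point; the caller already flipped the y-axis,
    # so atan2 gives the usual counter-clockwise angle convention
    raw = math.hypot(x, y)
    theta = math.atan2(y, x)
    return raw, theta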
Problems:
The angle: the paper states that indexing the angles from 1->8 should produce good coverage of the orientations, but in my implementation angles from 1->n only cover half of the orientations. Even the vertical orientation is not correctly covered. This can be seen in this figure, which contains sets of filters of scale 3 and orientations ranging from 1->8:
The coverage: from the filters above it is clear that each filter covers both sides of the frequency space, which is not what the paper says. This can be made more explicit by using 9 orientations ranging from -4 -> 4. The following image combines all the filters to show how they cover both sides of the spectrum (created by taking the maximum at each location over all filters):
Middle column (orientation $\pi/2$): in the first figure, for orientations from 3 -> 8, it can be seen that the filter vanishes at orientation $\pi/2$. Is this normal? This can also be seen when I combine all the filters (all 5 scales and 9 orientations) in one image:
Update:
Adding the impulse response of the filter in the spatial domain; as you can see, there is an obvious distortion in the -4 and 4 orientations:
After a lot of code analysis, I found that my implementation was correct but the getPolar function was messed up, so the code above should work just fine. This is the new code, without the getPolar function, in case anyone is looking for it:
import math
import numpy as np

number_scales = 5        # scale resolution
number_orientations = 8  # orientation resolution
N = 128                  # image dimensions

def getFilter(f_0, theta_0):
    # filter configuration
    scale_bandwidth = 0.996 * math.sqrt(2/3)
    angle_bandwidth = 0.996 * (1/math.sqrt(2)) * (np.pi/number_orientations)

    # x,y grid
    extent = np.arange(-N/2, N/2 + N%2)
    x, y = np.meshgrid(extent, extent)
    mid = int(N/2)

    ## orientation component ##
    theta = np.arctan2(y, x)
    center_angle = ((np.pi/number_orientations) * theta_0) if (f_0 % 2) \
                    else ((np.pi/number_orientations) * (theta_0+0.5))

    # calculate (theta-center_theta): we calculate cos(theta-center_theta)
    # and sin(theta-center_theta), then use atan to get the required value;
    # this way we can eliminate the angular-distance wrap-around problem
    costheta = np.cos(theta)
    sintheta = np.sin(theta)
    ds = sintheta * math.cos(center_angle) - costheta * math.sin(center_angle)
    dc = costheta * math.cos(center_angle) + sintheta * math.sin(center_angle)
    dtheta = np.arctan2(ds, dc)

    orientation_component = np.exp(-0.5 * (dtheta/angle_bandwidth)**2)

    ## frequency component ##
    # go to polar space
    raw = np.sqrt(x**2 + y**2)
    # set the origin to 1, as zero is not defined in log space
    raw[mid, mid] = 1
    # go to log space
    raw = np.log2(raw)

    center_scale = math.log2(N) - f_0
    draw = raw - center_scale
    frequency_component = np.exp(-0.5 * (draw / scale_bandwidth)**2)

    # explicitly zero the DC component, which the radial term above does not handle
    frequency_component[mid, mid] = 0

    return frequency_component * orientation_component
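As a quick sanity check, the whole bank can be built and its frequency coverage inspected (a sketch, assuming scales 1..number_scales and orientations 1..number_orientations as indexed in the question):
import numpy as np

# build the full filter bank and take the maximum response at each
# frequency-domain location to visualise the overall coverage
bank = [getFilter(f_0, theta_0)
        for f_0 in range(1, number_scales + 1)
        for theta_0 in range(1, number_orientations + 1)]
coverage = np.max(np.stack(bank), axis=0)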