I am trying to draw the bounding box of text on an image. The image
is perspective-transformed with a given set of coefficients. The coordinates of the text before the transformation are known, and I want to calculate the coordinates of the text after the transformation.
To my understanding, if I apply the same perspective transformation that was used on the image to the text coordinates, I should get the resulting coordinates of the text after the transformation. However, the text does not appear in the place it is supposed to be.
See the following graphs.
In the first, the smaller white box bounds the text well, because the coordinates of the text are known.
In the second, the smaller white box does not bound the text, because of some error introduced while transforming the coordinates.
I followed the documentation reference for the coefficients of a perspective transformation
and found the coefficients of the image transformation using the following code (the origin of the code is this answer):
def find_coeffs(pa, pb):
    '''
    Find the coefficients for a perspective transform.
    Parameters:
        pa : vertices in the resulting plane
        pb : vertices in the current plane
    Returns:
        coeffs : 8-tuple of coefficients for the PIL perspective transform
    '''
    matrix = []
    for p1, p2 in zip(pa, pb):
        matrix.append([p1[0], p1[1], 1, 0, 0, 0, -p2[0]*p1[0], -p2[0]*p1[1]])
        matrix.append([0, 0, 0, p1[0], p1[1], 1, -p2[1]*p1[0], -p2[1]*p1[1]])
    A = np.matrix(matrix, dtype=float)
    B = np.array(pb).reshape(8)
    res = np.dot(np.linalg.inv(A.T * A) * A.T, B)
    return np.array(res).reshape(8)
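For context, this 8-tuple is what Pillow's Image.transform takes as its data argument; Pillow uses the coefficients to map each output pixel back into the source image. A minimal usage sketch (image, width and height are placeholders):

# warp `image` with the coefficients from find_coeffs
img = image.transform((width, height), Image.PERSPECTIVE, data=coeffs)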
My code for text bounding box transformation:
# perspective transformation
a, b, c, d, e, f, g, h = coeffs
# compute the two vertices defining the bounding box;
# (x0, y0) and (x1, y1) are the vertices before the transformation
new_x0 = float(a * x0 + b * y0 + c) / float(g * x0 + h * y0 + 1)
new_y0 = float(d * x0 + e * y0 + f) / float(g * x0 + h * y0 + 1)
new_x1 = float(a * x1 + b * y1 + c) / float(g * x1 + h * y1 + 1)
new_y1 = float(d * x1 + e * y1 + f) / float(g * x1 + h * y1 + 1)
I also looked in the Pillow GitHub repository, but I could not find the source code where the perspective transformation is defined.
Some more information about the math of perspective transformation: The Geometry of Perspective Drawing on the Computer
Thanks.
To compute the new point after a transformation, you should get the coefficients for the direct mapping A -> B, not for B -> A, which is the convention the PIL library uses. As an example:
# A1, B1 ... are points
# direct transform
coefs = find_coefs([B1, B2, B3, B4], [A1, A2, A3, A4])
# inverse transform
coefs_inv = find_coefs([A1, A2, A3, A4], [B1, B2, B3, B4])
You call the image.transform() function using coefs_inv, but you calculate the new point using coefs, to get something like this:
img = image.transform((1500, 800),
                      method=Image.PERSPECTIVE,
                      data=coefs_inv)

a, b, c, d, e, f, g, h = coefs

old_p1 = [50, 100]
x, y = old_p1
new_x = (a * x + b * y + c) / (g * x + h * y + 1)
new_y = (d * x + e * y + f) / (g * x + h * y + 1)
new_p1 = (int(new_x), int(new_y))

old_p2 = [400, 500]
x, y = old_p2
new_x = (a * x + b * y + c) / (g * x + h * y + 1)
new_y = (d * x + e * y + f) / (g * x + h * y + 1)
new_p2 = (int(new_x), int(new_y))
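The repeated point mapping can also be factored into a small helper (map_point is a name introduced here, not part of PIL):

def map_point(coefs, point):
    # apply the forward perspective mapping to a single (x, y) point
    a, b, c, d, e, f, g, h = coefs
    x, y = point
    den = g * x + h * y + 1
    return (a * x + b * y + c) / den, (d * x + e * y + f) / den

new_p1 = tuple(int(v) for v in map_point(coefs, [50, 100]))
new_p2 = tuple(int(v) for v in map_point(coefs, [400, 500]))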
Full code below:
from PIL import Image
import numpy as np
import matplotlib.pyplot as plt

def find_coefs(original_coords, warped_coords):
    matrix = []
    for p1, p2 in zip(original_coords, warped_coords):
        matrix.append([p1[0], p1[1], 1, 0, 0, 0, -p2[0]*p1[0], -p2[0]*p1[1]])
        matrix.append([0, 0, 0, p1[0], p1[1], 1, -p2[1]*p1[0], -p2[1]*p1[1]])
    A = np.matrix(matrix, dtype=float)
    B = np.array(warped_coords).reshape(8)
    res = np.dot(np.linalg.inv(A.T * A) * A.T, B)
    return np.array(res).reshape(8)

coefs = find_coefs(
    [(867, 652), (1020, 580), (1206, 666), (1057, 757)],
    [(700, 732), (869, 754), (906, 916), (712, 906)]
)

coefs_inv = find_coefs(
    [(700, 732), (869, 754), (906, 916), (712, 906)],
    [(867, 652), (1020, 580), (1206, 666), (1057, 757)]
)

image = Image.open('sample.png')
img = image.transform((1500, 800),
                      method=Image.PERSPECTIVE,
                      data=coefs_inv)

a, b, c, d, e, f, g, h = coefs

old_p1 = [50, 100]
x, y = old_p1
new_x = (a * x + b * y + c) / (g * x + h * y + 1)
new_y = (d * x + e * y + f) / (g * x + h * y + 1)
new_p1 = (int(new_x), int(new_y))

old_p2 = [400, 500]
x, y = old_p2
new_x = (a * x + b * y + c) / (g * x + h * y + 1)
new_y = (d * x + e * y + f) / (g * x + h * y + 1)
new_p2 = (int(new_x), int(new_y))

plt.figure()
plt.imshow(image)
plt.scatter([old_p1[0], old_p2[0]], [old_p1[1], old_p2[1]], s=150, marker='.', c='b')
plt.show()

plt.figure()
plt.imshow(img)
plt.scatter([new_p1[0], new_p2[0]], [new_p1[1], new_p2[1]], s=150, marker='.', c='r')
plt.show()
I want to color a point cloud by its intensity.
Currently I use the following for-loop to apply a colormap function to the intensity (the 4th dimension) of the points:
import numpy as np

points = np.random.random([128*1200, 4])
points_colors = np.zeros([points.shape[0], 3])

for idx, p_c in enumerate(points[:, 3]):
    points_colors[idx, :] = color_map(p_c)

points_colors /= 255.0
An example of a color-mapping function:
def color_map(value, minimum=0, maximum=255):
    minimum, maximum = float(minimum), float(maximum)
    ratio = 2 * (value - minimum) / (maximum - minimum)
    b = int(max(0, 255 * (1 - ratio)))
    r = int(max(0, 255 * (ratio - 1)))
    g = 255 - b - r
    return r, g, b
Coloring the point cloud this way consumes much more time than directly using Open3D's original colormap (i.e., coloring by the points' x, y, z position).
How can I accelerate the process of color-mapping point clouds by their intensity?
A solution that does not convert the xyzi point cloud to an xyzrgb point cloud is also welcome.
P.S. The color_map I am actually using is a bit more complicated, but it has the same kind of output:
def rainbow_color_map(
    val,
    minval=0,
    maxval=256,
    normalize=False,
    colors=[(1, 1, 255), (1, 255, 1), (255, 1, 1)] * 10,
):
    i_f = float(val - minval) / float(maxval - minval) * (len(colors) - 1)
    i, f = int(i_f // 1), i_f % 1  # Split into whole & fractional parts.
    (r1, g1, b1), (r2, g2, b2) = colors[i], colors[i + 1]
    if normalize:
        return (
            (r1 + f * (r2 - r1)) / maxval,
            (g1 + f * (g2 - g1)) / maxval,
            (b1 + f * (b2 - b1)) / maxval,
        )
    else:
        return r1 + f * (r2 - r1), g1 + f * (g2 - g1), b1 + f * (b2 - b1)
You can modify the function to calculate the array as a whole without using a loop:
def color_map(minimum, maximum, value):
    minimum, maximum = float(minimum), float(maximum)
    ratio = 2 * (value - minimum) / (maximum - minimum)
    b = 255 * (1 - ratio)
    b[b < 0] = 0
    b = b.astype(int)
    r = 255 * (ratio - 1)
    r[r < 0] = 0
    r = r.astype(int)
    g = 255 - b - r
    points_colors = np.c_[r, g, b]
    return points_colors
Then call the function like this:
import numpy as np

points = np.random.random([128*1200, 4])
minimum, maximum = np.min(points[:, 3]), np.max(points[:, 3])
points_colors = color_map(minimum, maximum, points[:, 3])
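The rainbow_color_map from the P.S. can be vectorized the same way with NumPy fancy indexing. A sketch (rainbow_color_map_vec is a name introduced here, covering the non-normalized case):

import numpy as np

def rainbow_color_map_vec(vals, minval=0, maxval=256,
                          colors=np.array([(1, 1, 255), (1, 255, 1), (255, 1, 1)] * 10, dtype=float)):
    # fractional index of every value into the color table, all at once
    i_f = (vals - minval) / float(maxval - minval) * (len(colors) - 1)
    i = np.clip(i_f.astype(int), 0, len(colors) - 2)  # whole part, clamped so i + 1 stays valid
    f = (i_f - i)[:, None]                            # fractional part, as a column for broadcasting
    # linear interpolation between neighboring table entries
    return colors[i] + f * (colors[i + 1] - colors[i])

points = np.random.random([128*1200, 4])
points_colors = rainbow_color_map_vec(points[:, 3], minval=0, maxval=1) / 255.0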
I want to find the (x, y) coordinates of an object moving along a Bezier curve after it has moved for a certain amount of time.
I have included a simple graph of the object's trajectory here.
Consider the following known attributes of the object:
startingXY = [900, 450]
destinationXY = [-300, -600]
innerXY = [100, -50]
move_time = 15 # Moving duration in seconds
I tried using the formula below to calculate (x, y), but it does not appear to be correct.
def calculate(a, b, c, d, t):
    return math.pow(1 - t, 3) * a + 3 * t * math.pow(1 - t, 2) * b + 3 * math.pow(t, 2) * (1 - t) * c + math.pow(t, 3) * d

t = (now - start_time) * 0.1
x = calculate(0, startx, innerx, destinationx, t)
y = calculate(0, starty, innery, destinationy, t)
I somewhat solved my problem with the following formula:
def the_math(a, b, c, t, p, f):
    return (math.pow((1 - t), 2) * a + 2 * math.pow((1 - t), t) * b + math.pow(t, 2) * c) + p * f

elapsed_time = round(now - start)
t = elapsed_time / self.config.move_time[fish.fish_type]
x = the_math(path[0][0], path[1][0], path[2][0], t, 667, 0.85)
y = the_math(path[0][1], path[1][1], path[2][1], t, 375, 0.8)
print(x, y)
What would be a better formula to calculate the (x,y) coordinates?
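For reference, with only three control points (startingXY, innerXY, destinationXY) the standard curve is a quadratic Bezier, B(t) = (1 - t)^2 * P0 + 2 * t * (1 - t) * P1 + t^2 * P2, evaluated with t normalized to [0, 1] by the move duration; note that math.pow((1 - t), t) in the_math computes (1 - t)^t, not the standard 2 * t * (1 - t) weight. A minimal sketch (the elapsed time value is illustrative):

def bezier_quadratic(p0, p1, p2, t):
    # B(t) = (1 - t)^2 * P0 + 2 * t * (1 - t) * P1 + t^2 * P2
    x = (1 - t) ** 2 * p0[0] + 2 * t * (1 - t) * p1[0] + t ** 2 * p2[0]
    y = (1 - t) ** 2 * p0[1] + 2 * t * (1 - t) * p1[1] + t ** 2 * p2[1]
    return x, y

startingXY = [900, 450]
innerXY = [100, -50]
destinationXY = [-300, -600]
move_time = 15  # seconds

elapsed = 7.5  # seconds since the movement started
t = min(max(elapsed / move_time, 0.0), 1.0)  # normalize and clamp to [0, 1]
x, y = bezier_quadratic(startingXY, innerXY, destinationXY, t)
print(x, y)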
I am seeking a finite difference solution to the 1D nonlinear PDE
u_t = u_xx + u(u_x)^2
Code:
import numpy as np
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D
from matplotlib import cm
import math

'''
We explore three different numerical methods for solving the PDE, with solution u(x, t),
    u_t = u_xx + u(u_x)^2
for (x, t) in (0, 1) × (0, 1/5), with
    u(x, 0) = 40 * x^2 * (1 - x) / 3
    u(0, t) = u(1, t) = 0
'''

M = 30
dx = 1 / M
r = 0.25
dt = r * dx**2
N = math.floor(0.2 / dt)

x = np.linspace(0, 1, M + 1)
t = np.linspace(0, 0.2, N + 1)

U = np.zeros((M + 1, N + 1))       # Initial array for solution u(x, t)
U[:, 0] = 40 * x**2 * (1 - x) / 3  # Initial condition (: for the whole of that array)
U[0, :] = 0                        # Boundary condition at x = 0
U[-1, :] = 0                       # Boundary condition at x = 1 (-1 means end of the array)

'''
Explicit Scheme - Simple Forward Difference Scheme
'''
for q in range(0, N - 1):
    for p in range(0, M - 1):
        b = 1 / (1 - 2 * r)
        C = r * U[p, q] * (U[p + 1, q] - U[p, q])**2
        U[p, q + 1] = b * (U[p, q] + r * (U[p + 1, q + 1] + U[p - 1, q + 1]) - C)

T, X = np.meshgrid(t, x)

fig = plt.figure()
ax = fig.gca(projection='3d')
surf = ax.plot_surface(T, X, U)
# fig.colorbar(surf, shrink=0.5, aspect=5)  # colour bar for reference
ax.set_xlabel('t')
ax.set_ylabel('x')
ax.set_zlabel('u(x, t)')
plt.tight_layout()
plt.savefig('FDExplSol.png', bbox_inches='tight')
plt.show()
The code I use produces the following warnings:

overflow encountered in double_scalars
  C = r * U[p, q] * (U[p + 1, q] - U[p, q])**2
invalid value encountered in double_scalars
  U[p, q + 1] = b * (U[p, q] + r * (U[p + 1, q + 1] + U[p - 1, q + 1]) - C)
invalid value encountered in double_scalars
  C = r * U[p, q] * (U[p + 1, q] - U[p, q])**2
Z contains NaN values. This may result in rendering artifacts.
  surf = ax.plot_surface(T, X, U)
I've looked up these warnings, and I assume the squared term generates values outside the range of the dtype. However, when I try changing the dtype to one that accounts for a larger range of numbers (np.complex128), I get the same errors.
The resulting plot is obviously missing most of its contents. So, my question is: what do I do?
The discretisation expression was incorrect: the original update amplifies the solution at every step (b = 1/(1 - 2r) = 2 for r = 0.25) and reads values at time level q + 1 that have not been computed yet, which is what drives the overflow. The explicit forward-difference update should be:

for q in range(0, N - 1):
    for p in range(1, M):
        U[p, q + 1] = U[p, q] + r * (U[p + 1, q] - 2 * U[p, q] + U[p - 1, q]) + r * U[p, q] * (U[p + 1, q] - U[p, q])**2
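The same update can be vectorized over the interior points, which also removes the inner loop; a sketch, assuming U, r, M and N are defined as in the question:

for q in range(N):
    # update all interior points 1..M-1 at once
    U[1:M, q + 1] = (U[1:M, q]
                     + r * (U[2:, q] - 2 * U[1:M, q] + U[:M-1, q])
                     + r * U[1:M, q] * (U[2:, q] - U[1:M, q])**2)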
I have implemented equalization for HSI-based color images, using the numpy and math modules.
First, I convert an RGB image into HSI using these functions:
import math
import numpy as np

def rgb2hsi_px(px):
    eps = 0.00000001
    r, g, b = float(px[0]) / 255, float(px[1]) / 255, float(px[2]) / 255
    # Hue component
    numerator = 0.5 * ((r - g) + (r - b))
    denominator = math.sqrt((r - g) ** 2 + (r - b) * (g - b))
    theta = math.acos(numerator / (denominator + eps))
    h = theta
    if b > g:
        h = 2 * math.pi - h
    # Saturation component
    num = min(r, g, b)
    den = r + g + b
    if den == 0:
        den = eps
    s = 1 - 3 * num / den
    if s == 0:
        h = 0
    # Intensity component
    i = (r + g + b) / 3
    return h, s, i

def rgb2hsi(image):
    hsi_image = np.zeros_like(image).astype('float')
    height, width, _ = image.shape
    for x in range(height):
        for y in range(width):
            px = rgb2hsi_px(image[x, y])
            hsi_image[x, y] = px
    return np.array(hsi_image)
Then I equalize the intensity value of the converted image. The equalize function was implemented using this article; the idea is to raise the intensity channel to the power theta = log(0.5) / log(mean), so that the mean intensity is pulled toward 0.5:
import math
import numpy as np

def equalize(img):
    eps = 0.000000000001
    h, w, _ = img.shape
    num_of_pxs = h * w
    mean = 0.0
    new_img = np.array(img)
    while not abs(mean - 0.5) < eps:
        for i in range(h):
            for j in range(w):
                mean += new_img[i, j, 2]
        mean /= num_of_pxs
        if mean != 0.5:
            theta = math.log(0.5, math.e) / math.log(mean, math.e)
            for x in range(h):
                for y in range(w):
                    px = list(new_img[x, y])
                    px[2] = (px[2] ** theta)
                    new_img[x, y] = px
    return new_img
Afterwards, I convert the HSI image back to RGB using the following code:
import math
import numpy as np

def hsi2rgb_px(px):
    h, s, i = float(px[0]), float(px[1]), float(px[2]) * 255
    if 0 <= h < 2 * math.pi / 3:
        b = i * (1 - s)
        r = i * (1 + (s * math.cos(h)) / math.cos(math.pi / 3 - h))
        g = 3 * i - (r + b)
    elif 2 * math.pi / 3 <= h < 4 * math.pi / 3:
        r = i * (1 - s)
        g = i * (1 + (s * math.cos(h - 2 * math.pi / 3) / math.cos(math.pi / 3 - (h - 2 * math.pi / 3))))
        b = 3 * i - (r + g)
    elif 4 * math.pi / 3 <= h <= 2 * math.pi:
        g = i * (1 - s)
        b = i * (1 + (s * math.cos(h - 4 * math.pi / 3) / math.cos(math.pi / 3 - (h - 4 * math.pi / 3))))
        r = 3 * i - (g + b)
    else:
        raise IndexError('h is out of range: {}'.format(h))
    return round(r), round(g), round(b)

def hsi2rgb(image):
    rgb_image = np.zeros_like(image).astype(np.uint8)
    height, width, _ = image.shape
    for x in range(height):
        for y in range(width):
            px = hsi2rgb_px(image[x, y])
            rgb_image[x, y] = px
    return np.array(rgb_image)
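As a side note, a quick round-trip sanity check of the two conversion functions on a single pixel (a snippet of my own, with an arbitrary test value):

# convert one RGB pixel to HSI and back; the result should match the input
px = (120, 64, 200)
print(hsi2rgb_px(rgb2hsi_px(px)))  # expected: (120, 64, 200), up to rounding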
But the equalization gives an incorrect result. The size (in megabytes) of the equalized image is larger than that of the original. I'm not sure if that's normal, but if it is, please let me know. The other problem is that the output image has worse quality.
Here is the original image:
And the equalized image:
Can someone help me fix my code, or point me to a similar article/question?
[UPDATE]
Driver program to test the algorithm:
import matplotlib.image as mp_img
input_img = mp_img.imread('input.bmp')
hsi_img = rgb2hsi(input_img)
equalized_img = equalize(hsi_img)
out_img = hsi2rgb(equalized_img)
mp_img.imsave('out.bmp', out_img)
I am trying to rotate and translate arbitrary planes around some arbitrary axis.
For testing purposes I have written a simple Python program that rotates a random plane around the X axis by 90 degrees.
Unfortunately, when checking the angle between the planes, I get inconsistent results. This is the code:
def angle_between_planes(plane1, plane2):
    plane1 = (plane1 / np.linalg.norm(plane1))[:3]
    plane2 = (plane2 / np.linalg.norm(plane2))[:3]
    cos_a = np.dot(plane1.T, plane2) / (np.linalg.norm(plane1) * np.linalg.norm(plane2))
    print(np.arccos(cos_a)[0, 0])

def test():
    axis = np.array([1, 0, 0])
    theta = np.pi / 2
    translation = np.array([0, 0, 0])
    T = get_transformation(translation, axis * theta)
    for i in range(1, 10):
        source = np.append(np.random.randint(1, 20, size=3), 0).reshape(4, 1)
        target = np.dot(T, source)
        angle_between_planes(source, target)
It prints:
1.21297144225
1.1614420953
1.48042948278
1.10098697889
0.992418096794
1.16954303911
1.04180591409
1.08015300394
1.51949177153
When debugging this code I can see that the transformation matrix itself is correct.
I'm not sure what's wrong and would love any assistance here.
The code that generates the transformation matrix is:
def get_transformation(translation_vec, rotation_vec):
    r_4 = np.array([0, 0, 0, 1]).reshape(1, 4)
    rotation_vec = rotation_vec.reshape(3, 1)
    theta = np.linalg.norm(rotation_vec)
    axis = rotation_vec / theta
    R = get_rotation_mat_from_axis_and_angle(axis, theta)
    T = translation_vec.reshape(3, 1)
    R_T = np.append(R, T, axis=1)
    return np.append(R_T, r_4, axis=0)

def get_rotation_mat_from_axis_and_angle(axis, theta):
    axis = axis / np.linalg.norm(axis)
    a, b, c = axis
    omct = 1 - np.cos(theta)
    ct = np.cos(theta)
    st = np.sin(theta)
    rotation_matrix = np.array([a * a * omct + ct,     a * b * omct - c * st, a * c * omct + b * st,
                                a * b * omct + c * st, b * b * omct + ct,     b * c * omct - a * st,
                                a * c * omct - b * st, b * c * omct + a * st, c * c * omct + ct]).reshape(3, 3)
    rotation_matrix[abs(rotation_matrix) < 1e-8] = 0  # zero out numerical noise
    return rotation_matrix
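(For what it's worth, a quick sanity check of the rotation matrix for the X axis and theta = pi / 2; this snippet is my addition:)

# rotation by pi/2 about the X axis should print
# [[ 1.  0.  0.]
#  [ 0.  0. -1.]
#  [ 0.  1.  0.]]
print(get_rotation_mat_from_axis_and_angle(np.array([1, 0, 0]), np.pi / 2))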
The source you generate is not a vector: in homogeneous coordinates, a vector (as opposed to a point) must have its fourth coordinate equal to zero, so that the translation part of the transformation does not affect it.
You could generate valid ones with:
source = np.append(np.random.randint(1, 20, size=3), 0).reshape(4, 1)
Note that your code can't be tested as you pasted it in your question: for example, vec = vec.reshape(3, 1) in get_transformation uses vec, which hasn't been defined anywhere before...
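The distinction matters because the fourth (homogeneous) coordinate controls whether the translation part of a 4x4 transform applies; a small illustration with made-up values:

import numpy as np

T = np.eye(4)
T[:3, 3] = [10, 0, 0]  # pure translation along x

point  = np.array([5, 7, 2, 1]).reshape(4, 1)  # w = 1: a point, translation applies
vector = np.array([5, 7, 2, 0]).reshape(4, 1)  # w = 0: a direction/normal, translation is ignored

print(np.dot(T, point).ravel())   # [15.  7.  2.  1.]
print(np.dot(T, vector).ravel())  # [ 5.  7.  2.  0.]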