OpenCV / Python related:
Given a photo of a round object, how can you output that object flattened, while adjusting for surface area? Here is an example image of an input:
Soccer ball
It is similar to adjusting for camera distortion (turning a round object into a flat one), but in this case the distortion comes from the object itself and not the camera.
Distorted image:
Undistorted image:
Any suggestions would help. Thank you!
Edit: The package squircle is just what I needed, thank you fmw42!
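For reference, here is a minimal sketch of how squircle might be called; the function names and the method string below are assumptions, so check the package docs before relying on them:
import cv2
# assumed API of the PyPI package "squircle": to_square() / to_circle()
from squircle import to_square

# read the round-object photo (file name is just an example)
circle = cv2.imread("soccerball.jpg")

# "elliptical" is assumed to select the Elliptical Grid Mapping method
square = to_square(circle, method="elliptical")

cv2.imwrite("soccerball_flattened.jpg", square)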
Here is a solution in Python/OpenCV. It creates transformation maps that define the equations from output back to input and applies them using cv2.remap(). The equations come from https://arxiv.org/pdf/1509.06344.pdf for the Elliptical Grid Mapping approach.
Input:
import numpy as np
import cv2
import math
# References:
# https://arxiv.org/pdf/1509.06344.pdf
# http://squircular.blogspot.com/2015/09/mapping-circle-to-square.html
# Evaluate:
# u = x*sqrt(1-y**2/2)
# v = y*sqrt(1-x**2/2)
# u,v are input circle coordinates and x,y are output square coordinates
# read input
img = cv2.imread("rings.png")
# get dimensions and center
h, w = img.shape[:2]
xcent = w / 2
ycent = h / 2
# set up the maps as float32 from output square (x,y) to input circle (u,v)
map_u = np.zeros((h, w), np.float32)
map_v = np.zeros((h, w), np.float32)
# create u and v maps where x,y is measured from the center and scaled from -1 to 1
for y in range(h):
    Y = (y - ycent)/ycent
    for x in range(w):
        X = (x - xcent)/xcent
        map_u[y, x] = xcent * X * math.sqrt(1 - 0.5*Y**2) + xcent
        map_v[y, x] = ycent * Y * math.sqrt(1 - 0.5*X**2) + ycent
# do the remap
result = cv2.remap(img, map_u, map_v, cv2.INTER_LINEAR, borderMode = cv2.BORDER_REFLECT_101, borderValue=(0,0,0))
# save results
cv2.imwrite("rings_circle2square.png", result)
# display images
cv2.imshow('img', img)
cv2.imshow('result', result)
cv2.waitKey(0)
cv2.destroyAllWindows()
Result:
Here is another example:
Input:
Result:
And here is a 3rd example:
Input:
Result:
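As an aside, the nested Python loops above get slow for large images. The same maps can be built with vectorized NumPy; here is a minimal sketch of the same Elliptical Grid Mapping (same input file as above):
import numpy as np
import cv2

img = cv2.imread("rings.png")
h, w = img.shape[:2]
xcent, ycent = w / 2, h / 2

# normalized coordinates in [-1, 1]; Y varies down the rows, X across the columns
Y, X = np.meshgrid((np.arange(h) - ycent) / ycent,
                   (np.arange(w) - xcent) / xcent, indexing='ij')

# same equations as the loop version, evaluated on whole arrays
map_u = (xcent * X * np.sqrt(1 - 0.5 * Y**2) + xcent).astype(np.float32)
map_v = (ycent * Y * np.sqrt(1 - 0.5 * X**2) + ycent).astype(np.float32)

result = cv2.remap(img, map_u, map_v, cv2.INTER_LINEAR,
                   borderMode=cv2.BORDER_REFLECT_101)
cv2.imwrite("rings_circle2square.png", result)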
ADDITION
Here is an alternate approach based upon the Simple Stretch equations in the reference above:
import numpy as np
import cv2
import math
# References:
# https://arxiv.org/pdf/1509.06344.pdf
# Simple stretch equations
# read input
img = cv2.imread("rings.png")
#img = cv2.imread("ICM.png")
#img = cv2.imread("soccerball_small.jpg")
# get dimensions and center
h, w = img.shape[:2]
xcent = w / 2
ycent = h / 2
# set up the maps as float32 from output square (x,y) to input circle (u,v)
map_u = np.zeros((h, w), np.float32)
map_v = np.zeros((h, w), np.float32)
# create u and v maps where x,y is measured from the center and scaled from -1 to 1
# note: math.copysign(1, x) acts like signum(x), returning 1.0 or -1.0 depending upon the sign of x (it never returns 0)
for y in range(h):
    Y = (y - ycent)/ycent
    for x in range(w):
        X = (x - xcent)/xcent
        X2 = X*X
        Y2 = Y*Y
        XY = X*Y
        R = math.sqrt(X2+Y2)
        if R == 0:
            map_u[y, x] = xcent
            map_v[y, x] = ycent
        elif X2 >= Y2:
            map_u[y, x] = xcent * math.copysign(1, X) * X2/R + xcent
            map_v[y, x] = ycent * math.copysign(1, X) * XY/R + ycent
        else:
            map_u[y, x] = xcent * math.copysign(1, Y) * XY/R + xcent
            map_v[y, x] = ycent * math.copysign(1, Y) * Y2/R + ycent
# do the remap
result = cv2.remap(img, map_u, map_v, cv2.INTER_LINEAR, borderMode = cv2.BORDER_REFLECT_101, borderValue=(0,0,0))
# save results
cv2.imwrite("rings_circle2square2.png", result)
#cv2.imwrite("ICM_circle2square2.png", result)
#cv2.imwrite("soccerball_small_circle2square2.png", result)
# display images
cv2.imshow('img', img)
cv2.imshow('result', result)
cv2.waitKey(0)
cv2.destroyAllWindows()
Input:
Result:
Input:
Result:
Input:
Result:
I am trying to convert KITTI label format to YOLO, but after converting, the bounding box is misplaced.
This is the KITTI bounding box:
This is the conversion code:
def convertToYoloBBox(bbox, size):
    # Yolo uses bounding box coordinates and size relative to the image size.
    # This is taken from https://pjreddie.com/media/files/voc_label.py .
    dw = 1. / size[0]
    dh = 1. / size[1]
    x = (bbox[0] + bbox[1]) / 2.0
    y = (bbox[2] + bbox[3]) / 2.0
    w = bbox[1] - bbox[0]
    h = bbox[3] - bbox[2]
    x = x * dw
    w = w * dw
    y = y * dh
    h = h * dh
    return (x, y, w, h)
convert = convertToYoloBBox([kitti_bbox[0], kitti_bbox[1], kitti_bbox[2], kitti_bbox[3]], image.shape[:2])
The function does some normalization, which is essential for YOLO, and outputs the following:
(0.14763590391908976,
0.3397063758389261,
0.20452591656131477,
0.01810402684563757)
But when I try to check whether the normalization is being done correctly with this code:
x = int(convert[0] * image.shape[0])
y = int(convert[1] * image.shape[1])
width = x+int(convert[2] * image.shape[0])
height = y+ int(convert[3] * image.shape[1])
cv.rectangle(image, (int(x), int(y)), (int(width), int(height)), (255,0,0), 2 )
the bounding box is misplaced:
Any suggestions? Is the conversion function correct, or is the problem in the checking code?
You got the centroid calculation wrong.
Kitti labels are given in the order of left, top, right, and bottom.
To get the centroid you have to do (left + right)/2 and (top + bottom)/2,
so your code becomes:
x = (bbox[0] + bbox[2]) / 2.0
y = (bbox[1] + bbox[3]) / 2.0
w = bbox[2] - bbox[0]
h = bbox[3] - bbox[1]
So, I need to build a homomorphic filter, but my code seems to be wrong. I don't know if it's the execution or some detail of Python I'm not aware of, but I do know that it's wrong. I'd love some insights on what I can do to improve it.
I'm using this image as the input reference because it's in Rafael C. Gonzalez's book on digital image processing and I know what the output should look like. I'm even using the same parameters the book used in its filter, but it isn't working.
Gonzalez's input and output, respectively:
My output:
My code is as follows:
# coding: utf-8
import cv2
import numpy as np
from matplotlib import pyplot as plt

tss = cv2.imread("The_Seventh_Seal_1.jpg", 0)
mc = cv2.imread("mussels_cave_050.JPG", 0)
sh = cv2.imread("shelter_homomorphic.bmp", 0)
pet = cv2.imread("pet.png", 0)

def filtro_gaussiano_livro(img, gl, gh, inc, Dz):
    im = np.copy(img)
    P = im.shape[0] / 2
    Q = im.shape[1] / 2
    h = np.zeros(im.shape)
    U, V = np.meshgrid(range(im.shape[0]), range(im.shape[1]), sparse=False, indexing='ij')
    d = ((U - P) ** 2 + (V - Q) ** 2).astype(float)
    d0 = Dz
    c = inc
    h = (gh - gl) * (1 - (np.exp(-c * (d / (d0 ** 2))))) + gl
    return h

def filtro_gaussiano(img, Dz):
    im = np.copy(img)
    P = im.shape[0] / 2
    Q = im.shape[1] / 2
    h = np.zeros(im.shape)
    U, V = np.meshgrid(range(im.shape[0]), range(im.shape[1]), sparse=False, indexing='ij')
    d = (((U - P) ** 2) + ((V - Q) ** 2)).astype(float)
    h = 1 - np.exp(-(d / (2 * (Dz ** 2))))
    return h

def uint8_conv(img):
    mat = np.copy(img)
    for i in range(mat.shape[0]):
        for j in range(mat.shape[1]):
            if mat[i, j] < 0:
                mat[i, j] = 0
            elif mat[i, j] > 255:
                mat[i, j] = 255
            else:
                mat[i, j] = mat[i, j]
    return np.uint8(mat)

def reescalona(img, min, max):
    mat = np.copy(img)
    ph = cv2.add(min, (
        cv2.divide((cv2.multiply((cv2.subtract(mat, np.min(mat))), (max - min))), (np.max(mat) - np.min(mat)))))
    rtn = np.uint8(ph)
    return rtn

def homomorfica(img, l, s):
    im = np.float64(np.copy(img))
    cv2.imshow("BORDER", im)
    if s == 0:
        f = filtro_gaussiano(im, l)
    elif s == 1:
        f = filtro_gaussiano_livro(im, 0.05, 3.5, 1, l)
    cv2.imshow("gauss " + str(s), f)
    im_log = np.log1p(im)
    Im_shift = np.fft.fftshift(np.fft.fft2(im_log))
    Im_fft_filt = np.multiply(f, Im_shift)
    cv2.imshow("FFT Shift", uint8_conv(np.real(Im_shift)))
    Im_filt = np.real(np.fft.ifft2(np.fft.ifftshift(Im_fft_filt)))
    Im = np.exp(Im_filt) - 1
    Im = reescalona(Im, 0, 255)
    return uint8_conv(Im)

# def notch(img):

raio = 2500
i = pet
a = homomorfica(i, raio, 0)
b = homomorfica(i, raio, 1)
cv2.imshow("Original image", i)
cv2.imshow("Common homomorphic filter", a)
cv2.imshow("Book's homomorphic filter", b)
k = 0
while k != 27:
    k = cv2.waitKey(0)
cv2.destroyAllWindows()
Here is one way to do homomorphic filtering in the frequency domain using Python/Numpy/OpenCV.
I believe your issue is just your filtering. I will show two different filters below that vary in radius of the circle and Gaussian filtering.
Read the input as grayscale
Take the natural log of the input
Do FFT to real/imaginary components
Shift the FFT so DC point is in the center
Create a black circular mask on a white background of small radius
Apply Gaussian blur to the mask
Shift the FFT so DC point is at the top left corner
Do IFFT and convert to a simple real image
Take the exponential of the IFFT
Stretch that to the range 0 to 255
Save the result
import numpy as np
import cv2
# read input and convert to grayscale
img = cv2.imread('person.png', cv2.IMREAD_GRAYSCALE)
hh, ww = img.shape[:2]
# take ln of image
img_log = np.log(np.float64(img), dtype=np.float64)
# do dft saving as complex output
dft = np.fft.fft2(img_log, axes=(0,1))
# apply shift of origin to center of image
dft_shift = np.fft.fftshift(dft)
# create black circle on white background for high pass filter
#radius = 3
radius = 13
mask = np.zeros_like(img, dtype=np.float64)
cy = mask.shape[0] // 2
cx = mask.shape[1] // 2
cv2.circle(mask, (cx,cy), radius, 1, -1)
mask = 1 - mask
# antialias mask via blurring
#mask = cv2.GaussianBlur(mask, (7,7), 0)
mask = cv2.GaussianBlur(mask, (47,47), 0)
# apply mask to dft_shift
dft_shift_filtered = np.multiply(dft_shift,mask)
# shift origin from center to upper left corner
back_ishift = np.fft.ifftshift(dft_shift_filtered)
# do idft saving as complex
img_back = np.fft.ifft2(back_ishift, axes=(0,1))
# combine complex real and imaginary components to form (the magnitude for) the original image again
img_back = np.abs(img_back)
# apply exp to reverse the earlier log
img_homomorphic = np.exp(img_back, dtype=np.float64)
# scale result
img_homomorphic = cv2.normalize(img_homomorphic, None, alpha=0, beta=255, norm_type=cv2.NORM_MINMAX, dtype=cv2.CV_8U)
# write result to disk
cv2.imwrite("person_dft_numpy_mask.png", (255*mask).astype(np.uint8))
cv2.imwrite("person_dft_numpy_homomorphic.png", img_homomorphic)
cv2.imshow("ORIGINAL", img)
cv2.imshow("MASK", mask)
cv2.imshow("FILTERED DFT/IFT ROUND TRIP", img_back)
cv2.imshow("HOMOMORPHIC", img_homomorphic)
cv2.waitKey(0)
cv2.destroyAllWindows()
High Pass Filter Mask and Homomorphic Result for radius=3 and blur=7:
High Pass Filter Mask and Homomorphic Result for radius=13 and blur=47:
There are a lot of questions about removing radial (or barrel) distortion, but how would I add it?
Visually, I want to take my input, which is presumed to be image (a), and distort it, to be like image (b):
And ideally I'd like a tunable parameter of some "radius" to control "how much distortion" I get. Based on what I want to do, it looks like I'd just need one parameter to control the 'radius of distortion' or whatever it would be called (correct me if I'm wrong).
How can I achieve this with OpenCV? I figure it must be possible because a lot of people try going the other way for things like this. I'm just not as familiar with the proper math operations and library calls to do it.
Any help much appreciated, cheers.
The program below creates barrel distortion
from wand.image import Image
import numpy as np
import cv2
with Image(filename='Path/to/Img/') as img:
    print(img.size)
    img.virtual_pixel = 'transparent'
    img.distort('barrel', (0.1, 0.0, 0.0, 1.0))  # play around with these values to create distortion
    img.save(filename='filname.png')
    # convert to opencv/numpy array format
    img_opencv = np.array(img)
# display result with opencv
cv2.imshow("BARREL", img_opencv)
cv2.waitKey(0)
cv2.destroyAllWindows()
Here is one way to produce barrel or pincushion distortion in Python/OpenCV by creating the X and Y distortion maps and then using cv.remap() to do the warping.
Input:
import numpy as np
import cv2 as cv
import math
img = cv.imread('lena.jpg')
# grab the dimensions of the image
(h, w, _) = img.shape
# set up the x and y maps as float32
map_x = np.zeros((h, w), np.float32)
map_y = np.zeros((h, w), np.float32)
scale_x = 1
scale_y = 1
center_x = w/2
center_y = h/2
radius = w/2
#amount = -0.75 # negative values produce pincushion
amount = 0.75 # positive values produce barrel
# create map with the barrel pincushion distortion formula
for y in range(h):
    delta_y = scale_y * (y - center_y)
    for x in range(w):
        # determine if pixel is within an ellipse
        delta_x = scale_x * (x - center_x)
        distance = delta_x * delta_x + delta_y * delta_y
        if distance >= (radius * radius):
            map_x[y, x] = x
            map_y[y, x] = y
        else:
            factor = 1.0
            if distance > 0.0:
                factor = math.pow(math.sin(math.pi * math.sqrt(distance) / radius / 2), amount)
            map_x[y, x] = factor * delta_x / scale_x + center_x
            map_y[y, x] = factor * delta_y / scale_y + center_y
# do the remap
dst = cv.remap(img, map_x, map_y, cv.INTER_LINEAR)
# save the result
#cv.imwrite('lena_pincushion.jpg',dst)
cv.imwrite('lena_barrel.jpg',dst)
# show the result
cv.imshow('src', img)
cv.imshow('dst', dst)
cv.waitKey(0)
cv.destroyAllWindows()
Barrel (positive amount):
Pincushion (negative amount):
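As a side note, the double loop above can be vectorized with NumPy. Here is a sketch of the same mapping, assuming scale_x = scale_y = 1 as in the code above:
import numpy as np
import cv2 as cv

img = cv.imread('lena.jpg')
h, w = img.shape[:2]
center_x, center_y = w / 2, h / 2
radius = w / 2
amount = 0.75  # positive produces barrel, negative produces pincushion

# pixel offsets from the center
y, x = np.mgrid[0:h, 0:w].astype(np.float32)
delta_x = x - center_x
delta_y = y - center_y
distance = delta_x**2 + delta_y**2

# same distortion factor as the loop version, computed only inside the circle
factor = np.ones_like(distance)
inside = (distance > 0) & (distance < radius**2)
factor[inside] = np.sin(np.pi * np.sqrt(distance[inside]) / radius / 2) ** amount

# identity outside the circle, distorted inside
map_x = np.where(distance < radius**2, factor * delta_x + center_x, x).astype(np.float32)
map_y = np.where(distance < radius**2, factor * delta_y + center_y, y).astype(np.float32)

dst = cv.remap(img, map_x, map_y, cv.INTER_LINEAR)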
Description:
I have this data represented in a Cartesian coordinate system with 256 columns and 640 rows. Each column represents an angle, theta, from -65 deg to 65 deg. Each row represents a range, r, from 0 to 20 m.
An example is given below:
With the following code I try to make a grid and transform each pixel location to the location it would have on a polar grid:
def polar_image(image, bearings):
    (h, w) = image.shape
    x_max = (np.ceil(np.sin(np.deg2rad(np.max(bearings)))*h)*2+1).astype(int)
    y_max = (np.ceil(np.cos(np.deg2rad(np.min(np.abs(bearings))))*h)+1).astype(int)
    blank = np.zeros((y_max, x_max, 1), np.uint8)
    for i in range(w):
        for j in range(h):
            X = (np.sin(np.deg2rad(bearings[i]))*j)
            Y = (-np.cos(np.deg2rad(bearings[i]))*j)
            blank[(Y+h).astype(int), (X+562).astype(int)] = image[h-1-j, w-1-i]
    return blank
This returns an image as below:
Questions:
This is sort of what I actually want to achieve, except for two things:
1) There seem to be some artifacts in the new image, and the mapping seems a bit coarse.
Does someone have a suggestion on how to interpolate to get rid of this?
2) The image remains in a Cartesian representation, meaning that I don't have any polar gridlines, nor can I visualize intervals of range/angle.
Does anybody know how to visualize the polar grid with axis ticks in theta and range?
You can use pyplot.pcolormesh() to plot the converted mesh:
import numpy as np
import pylab as pl
img = pl.imread("c:/tmp/Wnov4.png")
angle_max = np.deg2rad(65)
h, w = img.shape
angle, r = np.mgrid[-angle_max:angle_max:h*1j, 0:20:w*1j]
x = r * np.sin(angle)
y = r * np.cos(angle)
fig, ax = pl.subplots()
ax.set_aspect("equal")
pl.pcolormesh(x, y, img, cmap="gray");
Or you can use remap() in OpenCV to convert it to a new image:
import cv2
import numpy as np
from PIL import Image
img = cv2.imread(r"c:/tmp/Wnov4.png", cv2.IMREAD_GRAYSCALE)
angle_max = np.deg2rad(65)
r_max = 20
x = np.linspace(-20, 20, 800)
y = np.linspace(20, 0, 400)
y, x = np.ix_(y, x)
r = np.hypot(x, y)
a = np.arctan2(x, y)
map_x = r / r_max * img.shape[1]
map_y = a / (2 * angle_max) * img.shape[0] + img.shape[0] * 0.5
img2 = cv2.remap(img, map_x.astype(np.float32), map_y.astype(np.float32), cv2.INTER_CUBIC)
Image.fromarray(img2)
I need to synthesize many fisheye images with different intrinsic matrices based on normal pictures. I am following the method mentioned in this paper.
Ideally, if the algorithm is correct, the fisheye effect should look like this:
But when I use my algorithm to convert a picture, it looks like this:
So below is my code's flow:
1. First, I read the raw image:
import numpy as np
from scipy import ndimage  # note: ndimage.imread was removed in newer SciPy releases

def read_img(image):
    img = ndimage.imread(image)  # this returns an array with 4 channels: [R, G, B, 255]
    img_shape = img.shape
    print(img_shape)
    # get the pixel coordinates
    w = img_shape[1]  # the width
    h = img_shape[0]  # the height
    uv_coord = []
    for u in range(w):
        for v in range(h):
            uv_coord.append([float(u), float(v)])  # this records the coords in the fashion of [x1,y1], [x1,y2], [x1,y3]...
    return np.array(uv_coord)
Then, based on the paper:

r(θ) = k1*θ + k2*θ^3 + k3*θ^5 + k4*θ^7,   (1)

where the k's are the distortion coefficients.

Given pixel coordinates (x, y) in the pinhole projection image, the corresponding image coordinates (x', y') in the fisheye can be computed as:

x' = r(θ)*cos(φ), y' = r(θ)*sin(φ),   (2)

where φ = arctan((y − y0)/(x − x0)) and (x0, y0) are the coordinates of the principal point in the pinhole projection image.

The image coordinates (x', y') are then converted into pixel coordinates (xf, yf):

xf = mu*x' + u0, yf = mv*y' + v0,   (3)

where (u0, v0) are the coordinates of the principal point in the fisheye image, and mu, mv denote the number of pixels per unit distance in the horizontal and vertical directions. So I am guessing mu, mv are just fx, fy from the intrinsic matrix, and u0, v0 are cx, cy.
def add_distortion(sourceUV, dmatrix, Kmatrix):
    '''This function is programmed to remap the pixels of the given original image coords.
    input arguments:
    dmatrix -- the distortion coefficients [k1, k2, k3, k4] for tweaking purposes
    Kmatrix -- [fx, fy, cx, cy, s]'''
    u = sourceUV[:, 0]  # width in x
    v = sourceUV[:, 1]  # height in y
    rho = np.sqrt(u**2 + v**2)
    # get theta
    theta = np.arctan(rho, np.full_like(u, 1))
    # rho_mat = np.array([rho, rho**3, rho**5, rho**7])
    rho_mat = np.array([theta, theta**3, theta**5, theta**7])
    # get: rho(theta) = k1*theta + k2*theta**3 + k3*theta**5 + k4*theta**7
    rho_d = np.dot(dmatrix, rho_mat)
    # get phi
    phi = np.arctan2((v - Kmatrix[3]), (u - Kmatrix[2]))
    xd = rho_d * np.cos(phi)
    yd = rho_d * np.sin(phi)
    # convert the coords from the image plane back to pixel coords
    ud = Kmatrix[0] * (xd + Kmatrix[4] * yd) + Kmatrix[2]
    vd = Kmatrix[1] * yd + Kmatrix[3]
    return np.column_stack((ud, vd))
Then, after obtaining the distorted coordinates, I move the pixels in this way, which is where I think the problem might be:
import cv2
import numpy as np
from PIL import Image  # imports added; the snippet uses cv2, numpy and PIL

def main():
    image_name = "original.png"
    img = cv2.imread(image_name)
    img = cv2.cvtColor(img, cv2.COLOR_RGB2BGR)  # cv2 reads the image as BGR
    w = img.shape[1]
    h = img.shape[0]
    uv_coord = read_img(image_name)
    # for adding distortion
    dmatrix = [-0.391942708316175, 0.012746418822063, -0.001374061848026, 0.005349692659231]
    # the intrinsic matrix of the original picture
    Kmatrix = np.array([9.842439e+02, 9.808141e+02, 1392/2, 2.331966e+02, 0.000000e+00])
    # Kmatrix = np.array([2234.23470710156, 2223.78349134123, 947.511596277837, 647.103139639432, -3.20443253476976])  # the distorted intrinsics
    uv = add_distortion(uv_coord, dmatrix, Kmatrix)
    i = 0
    dstimg = np.zeros_like(img)
    for x in range(w):
        for y in range(h):
            if i > (512 * 1392 - 1):
                break
            xu = uv[i][0]
            yu = uv[i][1]
            i += 1
            # if the new pixel is in bounds, copy from source pixel to destination pixel
            if 0 <= xu and xu < img.shape[1] and 0 <= yu and yu < img.shape[0]:
                dstimg[int(yu)][int(xu)] = img[int(y)][int(x)]
    img = Image.fromarray(dstimg, 'RGB')
    img.save('my.png')
    img.show()
However, this code does not perform the way I want. Could you please help me debug it? I've spent 3 days on it and still can't see the problem. Thanks!!