I'm trying to randomly rotate some images that have annotations, and I need to work out where each annotated point ends up after the rotation.
My images have different shapes.
Here is the function I use to calculate the new position, based on this answer (here):
import math

def new_pixel(x,y,theta,X,Y):
    sin = math.sin(theta)
    cos = math.cos(theta)
    x_new = (x-X/2)*cos + (y-X/2)*sin + X/2
    y_new = -(x-X/2)*sin + (y-Y/2)*cos + Y/2
    return int(x_new),int(y_new)
The code that opens the original image:
img = cv2.imread('D://ubun/1.jpg')
print(img.shape)
X, Y, c = img.shape
p1 = (124,291)
p2 = (168,291)
p3 = (169,391)
p4 = (125,391)
img1 = img.copy()
cv2.circle(img1, p1, 10, color=(255,0,0), thickness=2)
plt.imshow(img1)
The red dot marks the point.
The code for the rotation:
rotated = ndimage.rotate(img, 45)
print(rotated.shape)
p11 = new_pixel(p1[0],p1[1],45,X,Y)
p22 = new_pixel(p2[0],p2[1],45,X,Y)
p33 = new_pixel(p3[0],p3[1],45,X,Y)
p44 = new_pixel(p4[0],p4[1],45,X,Y)
cv2.circle(rotated, p11, 10, color=(255,0,0), thickness=2)
plt.imshow(rotated)
Here is the image after rotation; you can see the point is not in the correct position:
I noticed that the image shape is different after rotation. Does this affect the calculations?
This reply is really late, but I faced the same issue, so here it is. There are basically two issues with your code:
math.sin/math.cos take radians as input, not degrees, so just convert the angle to radians to get the right result.
X, Y: you are taking the height and width of the initial image, which was not rotated. That would be right if the resulting image were the same size as the initial one, but since the output is resized to fit the complete image after rotation, you need to take X, Y as the new width and height. However, you only need the new width and height in the term that is added back after the transformation, so the formula becomes:
x_new = (x - old_width/2)*cos + (y - old_height/2)*sin + new_width/2
y_new = -(x - old_width/2)*sin + (y - old_height/2)*cos + new_height/2
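A minimal sketch of the function with both fixes applied could look like this (the parameter names are just for illustration; note that img.shape returns (height, width, channels), so be careful which value you pass where):

import math

def new_pixel(x, y, theta_deg, old_w, old_h, new_w, new_h):
    # math.sin/math.cos expect radians, so convert the angle first
    theta = math.radians(theta_deg)
    sin, cos = math.sin(theta), math.cos(theta)
    # Rotate about the old image center, then shift to the new image center
    x_new = (x - old_w / 2) * cos + (y - old_h / 2) * sin + new_w / 2
    y_new = -(x - old_w / 2) * sin + (y - old_h / 2) * cos + new_h / 2
    return int(x_new), int(y_new)

The new width and height can be read from the rotated image, e.g. new_h, new_w = rotated.shape[:2].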
I know this post is late, but I hope this helps anyone who faces the same issue.
I want to apply an affine transformation and then use nearest-neighbor interpolation while keeping the same dimensions for the input and output images. As an example I use the scaling transformation T = [[2,0,0],[0,2,0],[0,0,1]]. Any idea how I can fill the black pixels using nearest neighbors? I tried giving them the minimum of their neighbors' intensities; for example, if a pixel has neighbors [55,22,44,11,22,55,23,231], I give it the minimum intensity, 11. But the result is not at all clear.
import numpy as np
from matplotlib import pyplot as plt
#Importing the original image and init the output image
img = plt.imread('/home/left/Desktop/computerVision/SET1/brain0030slice150_101x101.png',0)
outImg = np.zeros_like(img)
# Dimensions of the input image and output image (the same dimensions)
(width , height) = (img.shape[0], img.shape[1])
# Initialize the transformation matrix
T = np.array([[2,0,0], [0,2,0], [0,0,1]])
# Make an array with the input image's (x, y) coordinates and add a [0 0 ... 1] row
coords = np.indices((width, height), 'uint8').reshape(2, -1)
coords = np.vstack((coords, np.zeros(coords.shape[1], 'uint8')))
output = T @ coords
# Arrays of the output image's x and y coordinates
x_array, y_array = output[0], output[1]
# Keep only coordinates that fall within the image dimensions
indices = np.where((x_array >= 0) & (x_array < width) & (y_array >= 0) & (y_array < height))
# Final coordinates of the output image
fx, fy = x_array[indices], y_array[indices]
# Final output image after the affine transformation
outImg[fx, fy] = img[fx, fy]
The input image is:
The output image after scaling is:
Well, you could simply use the OpenCV resize function:

import cv2
# use cv2.INTER_NEAREST instead if you specifically want nearest-neighbour interpolation
new_image = cv2.resize(image, new_dim, interpolation=cv2.INTER_AREA)

It will do the resize and fill in the empty pixels in one go.
More on cv2.resize.
If you need to do it manually, you could simply detect the dark pixels in the resized image and change their value to, for example, the mean of the 4 neighbouring pixels (it depends on the algorithm you need).
See: nearest neighbour, bilinear, bicubic, etc.
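If you do want the manual route, here is a rough sketch of that idea (assuming the empty pixels are exactly zero and the image is a 2-D grayscale array; the function name is just for illustration):

import numpy as np

def fill_holes_with_neighbour_mean(img):
    # Replace zero-valued pixels with the mean of their non-zero 4-neighbours
    out = img.copy()
    h, w = img.shape
    ys, xs = np.where(img == 0)
    for y, x in zip(ys, xs):
        neighbours = [img[ny, nx]
                      for ny, nx in ((y-1, x), (y+1, x), (y, x-1), (y, x+1))
                      if 0 <= ny < h and 0 <= nx < w and img[ny, nx] != 0]
        if neighbours:
            out[y, x] = np.mean(neighbours)
    return out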
I have been working on a script which calculates the rotational shift between two images using cv2's phaseCorrelate method.
I have two images, the second is a 90 degree rotated version of the first image. After loading in the images, I convert them to log-polar before passing them into the phaseCorrelate function.
From what I have read, I believe that this should yield a rotational shift between two images.
The code below describes the implementation.
# bitwise right binary shift function
def rshift(val, n): return (val % 0x100000000) >> n
base_img = cv2.imread('img1.jpg')
cur_img = cv2.imread('dataa//t_sv_1.jpg')
curr_img = rotateImage(cur_img, 90)
rows,cols,chan = base_img.shape
x, y, c = curr_img.shape
#convert images to valid type
ref32 = np.float32(cv2.cvtColor(base_img, cv2.COLOR_BGR2GRAY))
curr32 = np.float32(cv2.cvtColor(curr_img, cv2.COLOR_BGR2GRAY))
value = np.sqrt(((rows/2.0)**2.0)+((cols/2.0)**2.0))
value2 = np.sqrt(((x/2.0)**2.0)+((y/2.0)**2.0))
polar_image = cv2.linearPolar(ref32,(rows/2, cols/2), value, cv2.WARP_FILL_OUTLIERS)
log_img = cv2.linearPolar(curr32,(x/2, y/2), value2, cv2.WARP_FILL_OUTLIERS)
shift = cv2.phaseCorrelate(polar_image, log_img)
sx = shift[0][0]
sy = shift[0][1]
sf = shift[1]
polar_image = polar_image.astype(np.uint8)
log_img = log_img.astype(np.uint8)
cv2.imshow("Polar Image", polar_image)
cv2.imshow('polar', log_img)
#get rotation from shift along y axis
rotation = sy * 180 / (rshift(y, 1));
print(rotation)
cv2.waitKey(0)
cv2.destroyAllWindows()
I am unsure how to interpret the results of this function. The expected outcome is a value similar to 90 degrees, however, I get the value below.
Output: -0.00717516014538333
How can I make the output correct?
A method, typically referred to as the Fourier Mellin transform, and published as:
B. Srinivasa Reddy and B.N. Chatterji, "An FFT-Based Technique for Translation, Rotation, and Scale-Invariant Image Registration", IEEE Trans. on Image Proc. 5(8):1266-1271, 1996
uses the FFT and the log-polar transform to obtain the translation, rotation and scaling of one image to match the other. I find this tutorial to be very clear and informative; I will give a summary here:
Compute the magnitude of the FFT of the two images (apply a windowing function first to avoid issues with periodicity of the FFT).
Compute the log-polar transform of the magnitude of the frequency-domain images (typically a high-pass filter is applied first, but I have not seen its usefulness).
Compute the cross-correlation (actually phase correlation) between the two. This leads to a knowledge of scale and rotation.
Apply the scaling and rotation to one of the original input images.
Compute the cross-correlation (actually phase correlation) of the original input images, after correction for scaling and rotation. This leads to knowledge of the translation.
This works because:
Because the magnitude of the FFT is translation-invariant, we can focus solely on scaling and rotation without worrying about translation. Note that rotating the image rotates its FFT by the same angle, and that scaling the image scales the FFT inversely.
The log-polar transform converts rotation into a vertical translation and scaling into a horizontal translation. Phase correlation allows us to determine these translations. Converting them back to a rotation and a scaling is non-trivial (the scaling in particular is hard to get right, but a bit of math shows the way; a rough sketch follows below).
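As a quick illustration of that conversion (assuming cv2.linearPolar/cv2.logPolar were used, which map the 0-360 degree angle range onto the image height and the (log-)radius onto the image width; the sign conventions depend on which image is passed first to phaseCorrelate, so treat this as a sketch only):

import math

def polar_shift_to_rotation_and_scale(sx, sy, h, M):
    # sx, sy: translation returned by cv2.phaseCorrelate on the two (log-)polar images
    # h:      height of the polar images (the angle axis spans 0..360 degrees)
    # M:      magnitude-scale parameter that was passed to cv2.logPolar
    rotation_deg = -sy / h * 360.0   # vertical shift -> rotation angle
    scale = math.exp(sx / M)         # horizontal shift -> scale factor (log-polar only)
    return rotation_deg, scale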
If the tutorial linked above is not clear enough, one can look at the C++ code that comes with it, or at this other Python code.
OP is interested only in the rotation aspect of the method above. If we can assume that the translation is 0 (meaning we know around which point the rotation was made; if we don't know the origin, we need to estimate it as a translation), then we don't need to compute the magnitude of the FFT (remember, it is used to make the problem translation-invariant), and we can apply the log-polar transform directly to the images. But note that we need to use the center of rotation as the origin for the log-polar transform. If we additionally assume that the scaling is 1, we can simplify things further by taking the linear-polar transform; that is, the logarithmic scaling of the radius axis is only necessary to estimate scaling.
OP is doing this more or less correctly, I believe. Where OP's code goes wrong is in the extent of the radius axis in the polar transform. By going all the way to the extreme corners of the image, OpenCV needs to fill in parts of the transformed image with zeros. These parts are dictated by the shape of the image, not by the contents of the image. That is, both polar images contain exactly the same sharp, high-contrast curve between image content and filled-in zeros. The phase correlation is aligning these curves, leading to an estimate of 0 degree rotation. The image content is more or less ignored because its contrast is much lower.
Instead, make the extent of the radius axis that of the largest circle that fits completely inside the image. This way, no parts of the output need to be filled with zeros, and the phase correlation can focus on the actual image content. Furthermore, considering the two images are rotated versions of each other, it is likely that the data in the corners of the images do not match; there is no need to take that into account at all!
Here is code I implemented quickly based on OP's code. I read in Lena, rotated the image by 38 degrees, computed the linear-polar transform of the original and rotated images, then the phase correlation between these two, and then determined a rotation angle based on the vertical translation. The result was 37.99560, plenty close to 38.
import cv2
import numpy as np
base_img = cv2.imread('lena512color.tif')
base_img = np.float32(cv2.cvtColor(base_img, cv2.COLOR_BGR2GRAY)) / 255.0
(h, w) = base_img.shape
(cX, cY) = (w // 2, h // 2)
angle = 38
M = cv2.getRotationMatrix2D((cX, cY), angle, 1.0)
curr_img = cv2.warpAffine(base_img, M, (w, h))
cv2.imshow("base_img", base_img)
cv2.imshow("curr_img", curr_img)
base_polar = cv2.linearPolar(base_img,(cX, cY), min(cX, cY), 0)
curr_polar = cv2.linearPolar(curr_img,(cX, cY), min(cX, cY), 0)
cv2.imshow("base_polar", base_polar)
cv2.imshow("curr_polar", curr_polar)
(sx, sy), sf = cv2.phaseCorrelate(base_polar, curr_polar)
rotation = -sy / h * 360
print(rotation)
cv2.waitKey(0)
cv2.destroyAllWindows()
These are the four image windows shown by the code:
I created a figure that shows the phase correlation values for multiple rotations. This has been edited to reflect Cris Luengo's comment. The image is cropped to get rid of the edges of the square insert.
import cv2
import numpy as np
import os

os.chdir('/home/stephen/Desktop/rotations/')
paths = ["lena.png", "rotate45.png", "rotate90.png", "rotate135.png", "rotate180.png"]

images, rotations, polar = [], [], []
for image_path in paths:
    alignedImage = cv2.imread('lena.png')
    rotatedImage = cv2.imread(image_path)
    rows, cols, chan = alignedImage.shape
    x, y, c = rotatedImage.shape
    x, y, w, h = 220, 220, 360, 360
    alignedImage = alignedImage[y:y+h, x:x+h].copy()
    rotatedImage = rotatedImage[y:y+h, x:x+h].copy()
    # convert images to valid type
    ref32 = np.float32(cv2.cvtColor(alignedImage, cv2.COLOR_BGR2GRAY))
    curr32 = np.float32(cv2.cvtColor(rotatedImage, cv2.COLOR_BGR2GRAY))
    value = np.sqrt(((rows/2.0)**2.0)+((cols/2.0)**2.0))
    value2 = np.sqrt(((x/2.0)**2.0)+((y/2.0)**2.0))
    polar_image = cv2.linearPolar(ref32, (rows/2, cols/2), value, cv2.WARP_FILL_OUTLIERS)
    log_img = cv2.linearPolar(curr32, (x/2, y/2), value2, cv2.WARP_FILL_OUTLIERS)
    shift = cv2.phaseCorrelate(polar_image, log_img)
    (sx, sy), sf = shift
    polar_image = polar_image.astype(np.uint8)
    log_img = log_img.astype(np.uint8)
    sx, sy, sf = round(sx, 4), round(sy, 4), round(sf, 4)
    text = image_path + "\n" + "sx: " + str(sx) + " \nsy: " + str(sy) + " \nsf: " + str(sf)
    images.append(rotatedImage)
    rotations.append(text)
    polar.append(polar_image)
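The loop above only collects the cropped images, their polar transforms and the result strings; here is a minimal sketch of how the figure itself could be put together with Matplotlib (the layout is just one possible choice):

import cv2
from matplotlib import pyplot as plt

# One column per rotation: the cropped image on top, its polar transform below,
# labelled with the phase-correlation results collected in the loop above
fig, axes = plt.subplots(2, len(images), figsize=(4 * len(images), 8))
for i, (im, pol, label) in enumerate(zip(images, polar, rotations)):
    axes[0, i].imshow(cv2.cvtColor(im, cv2.COLOR_BGR2RGB))
    axes[0, i].set_title(label, fontsize=8)
    axes[0, i].axis('off')
    axes[1, i].imshow(pol, cmap='gray')
    axes[1, i].axis('off')
plt.tight_layout()
plt.show()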
Here's an approach to determine the rotational shift between two images in degrees. The idea is to find the skew angle of each image in relation to a horizontal line. If we can find this skew angle, then we can calculate the angle difference between the two images. Here are some example images to illustrate this concept.
Original unrotated image
Rotated counterclockwise by 10 degrees (neg_10) and counterclockwise by 35 degrees (neg_35)
Rotated clockwise by 7.9 degrees (pos_7_9) and clockwise by 21 degrees (pos_21)
For each image, we want to determine the skew angle in relation to a horizontal line with negative being rotated counterclockwise and positive being rotated clockwise
Here's the helper function to determine this skew angle
def compute_angle(image):
    # Convert to grayscale, invert, and Otsu's threshold
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    gray = 255 - gray
    thresh = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)[1]

    # Find coordinates of all pixel values greater than zero
    # then compute minimum rotated bounding box of all coordinates
    coords = np.column_stack(np.where(thresh > 0))
    angle = cv2.minAreaRect(coords)[-1]

    # The cv2.minAreaRect() function returns values in the range
    # [-90, 0) so need to correct angle
    if angle < -45:
        angle = -(90 + angle)
    else:
        angle = -angle

    # Rotate image to horizontal position
    (h, w) = image.shape[:2]
    center = (w // 2, h // 2)
    M = cv2.getRotationMatrix2D(center, angle, 1.0)
    rotated = cv2.warpAffine(image, M, (w, h), flags=cv2.INTER_CUBIC,
                             borderMode=cv2.BORDER_REPLICATE)

    return (angle, rotated)
After determining the skew angle for each image, we can simply calculate the difference
angle1, rotated1 = compute_angle(image1)
angle2, rotated2 = compute_angle(image2)

# Both angles are positive
if angle1 >= 0 and angle2 >= 0:
    difference_angle = abs(angle1 - angle2)
# One positive, one negative
elif (angle1 < 0 and angle2 > 0) or (angle1 > 0 and angle2 < 0):
    difference_angle = abs(angle1) + abs(angle2)
# Both negative
elif angle1 < 0 and angle2 < 0:
    angle1 = abs(angle1)
    angle2 = abs(angle2)
    difference_angle = max(angle1, angle2) - min(angle1, angle2)
Here's a step-by-step walkthrough of what's going on. Using pos_21 and neg_10, the compute_angle() function will return the skew angle and the normalized image.
For pos_21, we normalize the image and determine the skew angle. Left (before) -> right (after)
20.99871826171875
Similarly for neg_10, we also normalize the image and determine the skew angle. Left (before) -> right (after)
-10.007980346679688
Now that we have both angles, we can compute the difference angle. Here's the result
31.006698608398438
Here's results with other combinations. With neg_10 and neg_35 we get
24.984039306640625
With pos_7_9 and pos_21,
13.09155559539795
Finally with pos_7_9 and neg_35,
42.89918231964111
Here's the full code
import cv2
import numpy as np

def rotational_shift(image1, image2):
    def compute_angle(image):
        # Convert to grayscale, invert, and Otsu's threshold
        gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
        gray = 255 - gray
        thresh = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)[1]

        # Find coordinates of all pixel values greater than zero
        # then compute minimum rotated bounding box of all coordinates
        coords = np.column_stack(np.where(thresh > 0))
        angle = cv2.minAreaRect(coords)[-1]

        # The cv2.minAreaRect() function returns values in the range
        # [-90, 0) so need to correct angle
        if angle < -45:
            angle = -(90 + angle)
        else:
            angle = -angle

        # Rotate image to horizontal position
        (h, w) = image.shape[:2]
        center = (w // 2, h // 2)
        M = cv2.getRotationMatrix2D(center, angle, 1.0)
        rotated = cv2.warpAffine(image, M, (w, h), flags=cv2.INTER_CUBIC,
                                 borderMode=cv2.BORDER_REPLICATE)

        return (angle, rotated)

    angle1, rotated1 = compute_angle(image1)
    angle2, rotated2 = compute_angle(image2)

    # Both angles are positive
    if angle1 >= 0 and angle2 >= 0:
        difference_angle = abs(angle1 - angle2)
    # One positive, one negative
    elif (angle1 < 0 and angle2 > 0) or (angle1 > 0 and angle2 < 0):
        difference_angle = abs(angle1) + abs(angle2)
    # Both negative
    elif angle1 < 0 and angle2 < 0:
        angle1 = abs(angle1)
        angle2 = abs(angle2)
        difference_angle = max(angle1, angle2) - min(angle1, angle2)

    return (difference_angle, rotated1, rotated2)

if __name__ == '__main__':
    image1 = cv2.imread('pos_7_9.png')
    image2 = cv2.imread('neg_35.png')
    angle, rotated1, rotated2 = rotational_shift(image1, image2)
    print(angle)
I've been trying to draw an elliptical arc in OpenCV using the ellipse function (https://docs.opencv.org/3.0-beta/modules/imgproc/doc/drawing_functions.html); however, for high radius values the arcs look segmented.
Do you know how I can increase the arc's resolution so that it looks better for high radius values?
I tried drawing an arc with a small radius and it looked smooth, and I also tried increasing the image resolution, but no difference was noticed.
My code is as follows:
A[0] = round(A[0]*dpm - Xmin + margin) #Normalize CenterX
A[1] = round(A[1]*dpm - Ymin + margin) #Normalize CenterY
A[2] = round(A[2]*dpm) #Normalize Radius
startAng = A[3]
endAng = A[4]
A=A.astype(int)
cv2.ellipse(Blank,(A[0],A[1]),(A[2],A[2]), 0, startAng, endAng, 0 ,1)
where:
Blank is the image I want to draw the arc on (an np array of size (398, 847))
(A[0],A[1]) is the center point
(A[2],A[2]) ellipse axes
0 is the angle
startAng is the starting angle of the arc
endAng is the ending angle of the arc
0 is the line color (Black)
1 is the line thickness
The code should produce a smooth arc but it looks segmented as if it is made of 4 lines.
I ended up writing a function to plot an arc on an input image:
import numpy as np
import cv2

blank = np.ones((500,500))

def DrawArc(image, center, radius, startAng, endAng, color, resolution):
    '''Draws an arc with a specific resolution and color on an input image
    Args:
        image      - The input image to draw the arc on
        center     - Arc's center
        radius     - Arc's radius
        startAng   - The starting angle of the arc
        endAng     - The ending angle of the arc
        color      - Arc's color on the input image
        resolution - Number of points for calculation
    Output:
        image - Updated image with the plotted arc'''
    startAng += 90
    endAng += 90
    theta = np.linspace(startAng, endAng, resolution)
    x = np.round(radius*np.cos(np.deg2rad(theta))) + center[0]
    y = np.round(radius*np.sin(np.deg2rad(theta))) + center[1]
    x = x.astype(int)
    y = y.astype(int)
    for k in range(np.size(theta)):
        image[x[k]][y[k]] = color
    return image

image = DrawArc(blank, (250,250), 200, 0, 90, 0, 1000)
cv2.imshow("Arc", image)
cv2.waitKey()
The output image is
Output
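As an aside, before writing a manual loop like this, it may also be worth letting OpenCV sample the arc as a polyline itself: cv2.ellipse2Poly takes an explicit angular step in degrees, and the resulting points can be drawn with cv2.polylines. A rough, untested sketch with hypothetical center/radius values:

import cv2
import numpy as np

blank = np.full((398, 847), 255, dtype=np.uint8)   # white canvas (hypothetical size)
center = (400, 200)                                 # hypothetical arc center
axes = (300, 300)                                   # equal axes -> circular arc
# Sample the arc every 1 degree (the last argument); smaller steps give a smoother curve
pts = cv2.ellipse2Poly(center, axes, 0, 0, 90, 1)
cv2.polylines(blank, [pts], False, 0, 1)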
I have a bunch of images (say 10) that I have generated, both as arrays and as PIL objects.
I need to arrange them in a circular fashion for display, and it should adjust itself to the resolution of the screen. Is there anything in Python that can do this?
I have tried using paste, but figuring out the canvas resolution and the positions to paste at is painful; I'm wondering if there is an easier solution.
We can say that points are arranged evenly in a circle when there is a constant angle theta between neighboring points. theta can be calculated as 2*pi radians divided by the number of points. The first point is at angle 0 with respect to the x axis, the second point at angle theta*1, the third point at angle theta*2, etc.
Using simple trigonometry, you can find the X and Y coordinates of any point that lies on the edge of a circle. For a point at angle ohm lying on a circle with radius r:
xFromCenter = r*cos(ohm)
yFromCenter = r*sin(ohm)
Using this math, it is possible to arrange your images evenly on a circle:
import math
from PIL import Image

def arrangeImagesInCircle(masterImage, imagesToArrange):
    imgWidth, imgHeight = masterImage.size

    # We want the circle to be as large as possible,
    # but the circle shouldn't extend all the way to the edge of the image.
    # If we did that, then when we paste images onto the circle, those images would partially fall over the edge.
    # So we reduce the diameter of the circle by the width/height of the widest/tallest image.
    diameter = min(
        imgWidth - max(img.size[0] for img in imagesToArrange),
        imgHeight - max(img.size[1] for img in imagesToArrange)
    )
    radius = diameter / 2

    circleCenterX = imgWidth / 2
    circleCenterY = imgHeight / 2
    theta = 2*math.pi / len(imagesToArrange)
    for i, curImg in enumerate(imagesToArrange):
        angle = i * theta
        dx = int(radius * math.cos(angle))
        dy = int(radius * math.sin(angle))

        # dx and dy give the coordinates of where the center of our images would go.
        # So we must subtract half the height/width of the image to find where their top-left corners should be.
        pos = (
            int(circleCenterX + dx - curImg.size[0]/2),
            int(circleCenterY + dy - curImg.size[1]/2)
        )
        masterImage.paste(curImg, pos)

img = Image.new("RGB", (500,500), (255,255,255))

# red.png, blue.png, green.png are simple 50x50 pngs of solid color
imageFilenames = ["red.png", "blue.png", "green.png"] * 5
images = [Image.open(filename) for filename in imageFilenames]

arrangeImagesInCircle(img, images)

img.save("output.png")
Result: