I want to apply rigid body transformations to a large set of 2D image matrices. Ideally, I'd like to be able to just supply an affine transformation matrix specifying both the translation and rotation, apply this in one go, then do cubic spline interpolation on the output.
Unfortunately it seems that affine_transform in scipy.ndimage.interpolation doesn't do translation. I know I could use a combination of shift and rotate, but this is kind of messy and involves interpolating the output multiple times.
I've also tried using the generic geometric_transform like this:
import numpy as np
from scipy.ndimage.interpolation import geometric_transform

# make the affine matrix
def maketmat(xshift, yshift, rotation, dimin=(0, 0)):
    # centre on the origin
    in2orig = np.identity(3)
    in2orig[:2, 2] = -dimin[0]/2., -dimin[1]/2.
    # rotate about the origin
    theta = np.deg2rad(rotation)
    rotmat = np.identity(3)
    rotmat[:2, :2] = [np.cos(theta), np.sin(theta)], [-np.sin(theta), np.cos(theta)]
    # translate to new position
    orig2out = np.identity(3)
    orig2out[:2, 2] = xshift, yshift
    # the final affine matrix is just the product
    tmat = np.dot(orig2out, np.dot(rotmat, in2orig))
    return tmat

# function that maps output space to input space
def out2in(outcoords, affinemat):
    outcoords = np.asarray(outcoords)
    outcoords = np.concatenate((outcoords, (1.,)))
    incoords = np.dot(affinemat, outcoords)
    incoords = tuple(incoords[0:2])
    return incoords

def rbtransform(source, xshift, yshift, rotation, outdims):
    # source --> target
    forward = maketmat(xshift, yshift, rotation, source.shape)
    # target --> source
    backward = np.linalg.inv(forward)
    # now we can use geometric_transform to do the interpolation etc.
    tformed = geometric_transform(source, out2in, output_shape=outdims, extra_arguments=(backward,))
    return tformed
This works, but it's horribly slow, since it's essentially looping over pixel coordinates! What's a good way to do this?
Can you use scikit-image?
If so, you could try applying a homography. A homography can be used to represent both translation and rotation at the same time through a single 3x3 matrix.
You can use the skimage.transform.fast_homography function.
import numpy as np
import scipy
import skimage.transform
im = scipy.misc.lena()
H = np.asarray([[1, 0, 10], [0, 1, 20], [0, 0, 1]])
skimage.transform.fast_homography(im, H)
The transform took about 30 ms on my old Core 2 Duo.
About homography: http://en.wikipedia.org/wiki/Homography
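Note that fast_homography dates from an early scikit-image release and seems to have been removed in later versions. A rough sketch of the same translation with the current API (skimage.transform.warp plus AffineTransform, and the bundled camera test image in place of lena) might look like:

import skimage.transform
from skimage import data

im = data.camera()
# same translation as above, expressed as an affine transform
tform = skimage.transform.AffineTransform(translation=(10, 20))
# warp expects the inverse (output -> input) mapping, hence tform.inverse
shifted = skimage.transform.warp(im, tform.inverse)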
I think affine_transform does do translation: there's the offset parameter.
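For completeness, here is a minimal sketch of how that could look for the original rigid-body question (one interpolation pass, rotation about the image centre followed by a shift; the function name rigid_transform and the (row, col) shift convention are my own choices, not from the thread):

import numpy as np
from scipy import ndimage

def rigid_transform(image, row_shift, col_shift, rotation_deg, order=3):
    # affine_transform computes input_coord = matrix @ output_coord + offset,
    # so we pass the inverse (output -> input) rotation
    theta = np.deg2rad(rotation_deg)
    rot_inv = np.array([[np.cos(theta), np.sin(theta)],
                        [-np.sin(theta), np.cos(theta)]])
    centre = (np.array(image.shape) - 1) / 2.0
    shift = np.array([row_shift, col_shift])
    # offset chosen so the rotation is about the centre and the shift is applied afterwards
    offset = centre - rot_inv @ (centre + shift)
    return ndimage.affine_transform(image, rot_inv, offset=offset, order=order)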
Related
I am trying to rotate 3D point cloud data by theta degrees using a rotation matrix.
The shape of the point cloud data (called 'xyz' in the code) is (64, 2048, 3), so there are 131,072 points, each with x, y, z.
I have tried a 2D rotation matrix (since I want to rotate in the bird's eye view).
The 2D rotation matrix that I used is the standard [[cos(theta), -sin(theta)], [sin(theta), cos(theta)]] (the same one that appears in the code below).
And this is my code for the rotation:
def rotation(xyz):
    original_x = []
    original_y = []
    for i in range(64):
        for j in range(2048):
            original_x.append(xyz[i][j][0])
            original_y.append(xyz[i][j][1])
    original = np.matrix.transpose(np.array([original_x, original_y]))
    rotation_2d = [[np.cos(theta), -np.sin(theta)], [np.sin(theta), np.cos(theta)]]
    rotated_points = np.matmul(original, rotation_2d)
    return rotated_points
The result looks successful from the bird's eye view, but not from the front view; it is very messy, so it is not possible to understand the data.
I have also tried a 3D rotation matrix acting on [x, y, z], which is:
However, the result was exactly the same as when the 2D rotation matrix was used.
What could be the possible reason? Is there any mistake in the rotation method?
Here's a function that applies the rotation you're after and a demonstration of this code.
Note that the rotation is applied clockwise. For a counterclockwise rotation, use the transpose of the rotation matrix.
import numpy as np
import matplotlib.pyplot as plt
from scipy.linalg import block_diag
def rotation(xyz):
    # flatten (64, 2048, 3) into an (N, 3) array of points
    before = xyz.reshape((-1, 3))
    # 3x3 rotation about the z-axis: the 2D rotation block plus a 1 for z
    rot_3d = block_diag([[np.cos(theta), -np.sin(theta)], [np.sin(theta), np.cos(theta)]], 1)
    after = before @ rot_3d
    return after
theta = 3*np.pi/2
xyz_test = np.random.rand(8,16)
xyz_test = np.stack([xyz_test]*3,axis = -1)
xyz_test += np.random.randn(8,16,3)*.1
original = xyz_test.reshape([8*16,3])
rotated = rotation(xyz_test)
# plt.plot(1,2,3,projection = '3d')
fig,ax = plt.subplots(figsize = (10,10), subplot_kw = {"projection":"3d"})
ax.plot3D(*original.T, 'x', ms = 20)
ax.plot3D(*rotated.T, 'x', ms = 20)
Sample resulting plot (of cloud and rotated cloud):
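As a side note, the same bird's-eye-view rotation can also be written with scipy.spatial.transform.Rotation instead of building the matrix by hand. This is a sketch of that alternative (counterclockwise about the z-axis, theta in radians), not part of the answer above:

import numpy as np
from scipy.spatial.transform import Rotation

def rotate_cloud_z(xyz, theta):
    # xyz has shape (64, 2048, 3); Rotation.apply expects an (N, 3) array
    points = xyz.reshape(-1, 3)
    rot = Rotation.from_euler('z', theta)   # counterclockwise rotation about z, theta in radians
    return rot.apply(points).reshape(xyz.shape)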
I have a gray scale image that I want to rotate. However, I need to do optimization on it. Therefore, I cannot use pillow or opencv.
I want to reshape this image using numpy.reshape into a one-dimensional vector (using the default C-style reshape).
And thereafter, I want to rotate this image around a point using matrix multiplication and addition, i.e. it should be something like
rotated_image_vector = A @ vector + b (or the equivalent in homogeneous coordinates).
After this operation I want to reshape the outcome back to two dimensions and have the rotated image.
Ideally it would also use linear interpolation for pixels that do not map exactly onto another pixel.
The mathematical theory says this is possible, and I believe there is a very elegant solution to this problem, but I do not see how to create this matrix. Has anyone already had this problem, or does anyone see an immediate solution?
Thanks a lot,
Eike
I like your approach but there is a slight misconception in it. What you want to transform are not the pixel values themselves but the coordinates. So you don't reshape your image but rather do a np.indices on it to obtain coordinates to each pixel. For those a rotation around a point looks like
rotation_matrix @ (coordinates - fixed_point) + fixed_point
except that I have to transpose a bit to get the dimensions to align. The code below is a slight adaptation of my code in this answer.
As an example I am going to use the Wikipedia-logo-v2 by Nohat. It is licensed under the Creative Commons Attribution-Share Alike 3.0 Unported license.
First I read in the picture, swap x and y axis to not get mad and rotate the coordinates as described above.
import numpy as np
import matplotlib.pyplot as plt
import itertools
image = plt.imread('wikipedia.jpg')
image = np.swapaxes(image,0,1)/255
fixed_point = np.array(image.shape[:2], dtype='float')/2
points = np.moveaxis(np.indices(image.shape[:2]),0,-1).reshape(-1,2)
a = 2*np.pi/8
A = np.array([[np.cos(a),-np.sin(a)],[np.sin(a),np.cos(a)]])
rotated_coordinates = (A @ (points - fixed_point.reshape(1,2)).T).T + fixed_point.reshape(1,2)
Now I set up a little class to interpolate between the pixels that do not fit exactly to an other pixel. And finally I swap the axis back and plot it.
class Image_knn():
    def fit(self, image):
        self.image = image.astype('float')

    def predict(self, x, y):
        image = self.image
        weights_x = [(1-(x % 1)).reshape(*x.shape,1), (x % 1).reshape(*x.shape,1)]
        weights_y = [(1-(y % 1)).reshape(*x.shape,1), (y % 1).reshape(*x.shape,1)]
        start_x = np.floor(x)
        start_y = np.floor(y)
        return sum([image[np.clip(np.floor(start_x + x), 0, image.shape[0]-1).astype('int'),
                          np.clip(np.floor(start_y + y), 0, image.shape[1]-1).astype('int')] * weights_x[x]*weights_y[y]
                    for x,y in itertools.product(range(2),range(2))])

image_model = Image_knn()
image_model.fit(image)
transformed_image = image_model.predict(*rotated_coordinates.T).reshape(*image.shape)
plt.imshow(np.swapaxes(transformed_image,0,1))
And I get a result like this
Possible Issue
The artifact in the bottom left, which looks like someone needs to clean the screen, comes from the following problem: when we rotate, it can happen that we don't have enough pixels to paint the lower left. What we do by default in Image_knn is clip the coordinates to an area where we have information. That means when we ask Image_knn for pixels coming from outside the image, it gives us the pixels at the boundary of the image. This looks fine if there is a background, but if an object touches the edge of the picture it looks odd, like here. Just something to keep in mind when using this.
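One possible workaround for that artifact, as a sketch reusing the variables from the code above (rotated_coordinates, image_model, image), is to detect which output pixels map outside the source image and overwrite them with a fill colour:

# coordinates in source space, one pair per output pixel
x_src, y_src = rotated_coordinates.T
inside = ((x_src >= 0) & (x_src <= image.shape[0] - 1) &
          (y_src >= 0) & (y_src <= image.shape[1] - 1))
filled = image_model.predict(x_src, y_src)
filled[~inside] = 1.0                      # white background; the image was scaled to [0, 1]
transformed_image = filled.reshape(*image.shape)
plt.imshow(np.swapaxes(transformed_image, 0, 1))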
Thank you for your answer!
But actually it is not a misconception: this rotation can indeed be represented by a matrix multiplication with the reshaped vector.
I used your code to generate such a matrix (it's surely not the most efficient way, but it works; most likely you see a more efficient implementation immediately XD. You see, I really need it as a matrix multiplication :-D).
What I basically did is generate the representation matrix of the linear transformation, by computing how each of the 100*100 basis images (i.e. the image with zeros everywhere and a single one) is mapped by your transformation.
import sys
import numpy as np
import matplotlib.pyplot as plt
import itertools
angle = 2*np.pi/6
image_expl = plt.imread('wikipedia.jpg')
image_expl = image_expl[:,:,0]
plt.imshow(image_expl)
plt.title("Image")
plt.show()
image_shape = image_expl.shape
pixel_number = image_shape[0]*image_shape[1]
rot_mat = np.zeros((pixel_number,pixel_number))
for i in range(pixel_number):
    vector = np.zeros(pixel_number)
    vector[i] = 1
    image = vector.reshape(*image_shape)
    fixed_point = np.array(image.shape, dtype='float')/2
    points = np.moveaxis(np.indices(image.shape),0,-1).reshape(-1,2)
    a = -angle
    A = np.array([[np.cos(a),-np.sin(a)],[np.sin(a),np.cos(a)]])
    rotated_coordinates = (A @ (points-fixed_point.reshape(1,2)).T).T+fixed_point.reshape(1,2)
    x,y = rotated_coordinates.T
    image = image.astype('float')
    weights_x = [(1-(x % 1)).reshape(*x.shape), (x % 1).reshape(*x.shape)]
    weights_y = [(1-(y % 1)).reshape(*x.shape), (y % 1).reshape(*x.shape)]
    start_x = np.floor(x)
    start_y = np.floor(y)
    transformed_image_returned = sum([image[np.clip(np.floor(start_x + x), 0, image.shape[0]-1).astype('int'),
                                            np.clip(np.floor(start_y + y), 0, image.shape[1]-1).astype('int')] * weights_x[x]*weights_y[y]
                                      for x,y in itertools.product(range(2),range(2))])
    rot_mat[:,i] = transformed_image_returned
    if i%100 == 0: print(int(100*i/pixel_number), "% finished")

plt.imshow((rot_mat @ image_expl.reshape(-1)).reshape(image_shape))
Thank you again :-)
I am using python opencv version 4.5.
import cv2
import numpy as np
rigidRect = np.float32([[50,-50],[50,50],[-50,50]])
shiftRect = np.float32([[50,-30],[50,70],[-50,70]])
M = cv2.getAffineTransform(rigidRect, shiftRect)  # this returns [[1, 0, 0], [0, 1, 20]]
validateRect = cv2.warpAffine(rigidRect, M, (2,3))
and validateRect comes back as a 3-by-2 matrix of zeros.
I thought validateRect would equal shiftRect?
warpAffine is used to transform an image using the affine transform matrix. What you are trying to do is transform the given points, which is achieved by the transform function. The documentation of getAffineTransform points to related functions in its "See also" section.
validateRect = cv2.transform(rigidRect[None,:,:], M)
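As a quick sanity check with the same points as in the question, the transformed points should now match shiftRect:

import cv2
import numpy as np

rigidRect = np.float32([[50, -50], [50, 50], [-50, 50]])
shiftRect = np.float32([[50, -30], [50, 70], [-50, 70]])
M = cv2.getAffineTransform(rigidRect, shiftRect)
# cv2.transform expects an extra leading dimension, hence rigidRect[None, :, :]
validateRect = cv2.transform(rigidRect[None, :, :], M)
print(np.allclose(validateRect[0], shiftRect))  # expected: True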
I have a simple 2x2 transformation matrix, s, which encodes some linear transformation of coordinates such that X' = sX.
I have generated a set of uniformly distributed coordinates on a grid using the np.meshgrid() function, and at the moment I traverse each coordinate and apply the transformation coordinate by coordinate. Unfortunately, this is very slow for large arrays. Are there any fast ways of doing this? Thanks!
import numpy as np
image_dimension = 1024
image_index = np.arange(0,image_dimension,1)
xx, yy = np.meshgrid(image_index,image_index)
# Pre-calculated Transformation Matrix.
s = np.array([[ -2.45963439e+04, -2.54997726e-01], [ 3.55680731e-02, -2.48005486e+04]])
xx_f = xx.flatten()
yy_f = yy.flatten()
for x_t in range(0, image_dimension*image_dimension):
    # Get the current (x,y) coordinate.
    x_y_in = np.matrix([[xx_f[x_t]],[yy_f[x_t]]])
    # Perform the transformation with x.
    optout = s * x_y_in
    # Store the new coordinate.
    xx_f[x_t] = np.array(optout)[0][0]
    yy_f[x_t] = np.array(optout)[1][0]
# Reshape Output
xx_t = xx_f.reshape((image_dimension, image_dimension))
yy_t = yy_f.reshape((image_dimension, image_dimension))
You can use the numpy dot function to get the dot product of your matrices as:
xx_tn,yy_tn = np.dot(s,[xx.flatten(),yy.flatten()])
xx_t = xx_tn.reshape((image_dimension, image_dimension))
yy_t = yy_tn.reshape((image_dimension, image_dimension))
This is much faster.
Loops are slow in Python. It is better to use vectorization.
In a nutshell, the idea is to let numpy do the loops in C, which is much faster.
You can express your problem as a matrix multiplication X' = sX, where you put all the points in X and transform them all with just one call to numpy's dot product:
xy = np.vstack([xx.ravel(), yy.ravel()])
xy_t = np.dot(s, xy)
xx_t, yy_t = xy_t.reshape((2, image_dimension, image_dimension))
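A quick way to convince yourself that the vectorised version matches a per-point loop (a small self-contained sketch; variable names follow the question):

import numpy as np

s = np.array([[-2.45963439e+04, -2.54997726e-01],
              [ 3.55680731e-02, -2.48005486e+04]])
n = 16                                   # small grid so the loop stays fast
xx, yy = np.meshgrid(np.arange(n), np.arange(n))

# vectorised: one matrix product for all coordinates
xy_t = s @ np.vstack([xx.ravel(), yy.ravel()])

# loop version for comparison
ref = np.array([s @ np.array([x, y]) for x, y in zip(xx.ravel(), yy.ravel())]).T

print(np.allclose(xy_t, ref))            # expected: True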
The image (test.tif) is attached.
The np.nan values are the whitest region.
How to fill those whitest region using some gap filling algorithms that uses values from the neighbours?
from scipy import ndimage
data = ndimage.imread('test.tif')
As others have suggested, scipy.interpolate can be used. However, it requires fairly extensive index manipulation to get this to work.
Complete example:
from pylab import *
import numpy
import scipy.ndimage
import scipy.interpolate
import pdb
data = scipy.ndimage.imread('data.png')
# a boolean array of shape (height, width) which is False where there are missing values and True where there are valid (non-missing) values
mask = ~( (data[:,:,0] == 255) & (data[:,:,1] == 255) & (data[:,:,2] == 255) )
# array of (number of points, 2) containing the x,y coordinates of the valid values only
xx, yy = numpy.meshgrid(numpy.arange(data.shape[1]), numpy.arange(data.shape[0]))
xym = numpy.vstack( (numpy.ravel(xx[mask]), numpy.ravel(yy[mask])) ).T
# the valid values in the first, second, third color channel, as 1D arrays (in the same order as their coordinates in xym)
data0 = numpy.ravel( data[:,:,0][mask] )
data1 = numpy.ravel( data[:,:,1][mask] )
data2 = numpy.ravel( data[:,:,2][mask] )
# three separate interpolators for the separate color channels
interp0 = scipy.interpolate.NearestNDInterpolator( xym, data0 )
interp1 = scipy.interpolate.NearestNDInterpolator( xym, data1 )
interp2 = scipy.interpolate.NearestNDInterpolator( xym, data2 )
# interpolate the whole image, one color channel at a time
result0 = interp0(numpy.ravel(xx), numpy.ravel(yy)).reshape( xx.shape )
result1 = interp1(numpy.ravel(xx), numpy.ravel(yy)).reshape( xx.shape )
result2 = interp2(numpy.ravel(xx), numpy.ravel(yy)).reshape( xx.shape )
# combine them into an output image
result = numpy.dstack( (result0, result1, result2) )
imshow(result)
show()
Output:
This passes to the interpolator all values we have, not just the ones next to the missing values (which may be somewhat inefficient). It also interpolates every point in the output, not just the missing values (which is extremely inefficient). A better way is to interpolate just the missing values, and then patch them into the original image. This is just a quick working example to get started :)
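A rough sketch of that faster variant, reusing mask, xym, xx, yy and the per-channel arrays defined above and interpolating only at the missing pixels:

# coordinates of the missing pixels only
missing = ~mask
xy_missing = numpy.vstack((xx[missing], yy[missing])).T

patched = data.copy()
for channel, values in enumerate((data0, data1, data2)):
    interp = scipy.interpolate.NearestNDInterpolator(xym, values)
    # fill in only the gaps; valid pixels stay untouched
    patched[:, :, channel][missing] = interp(xy_missing)

imshow(patched)
show()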
I think viena's question is more related to an inpainting problem.
Here are some ideas:
In order to fill the gaps in B/W images you can use some filling algorithm like scipy.ndimage.morphology.binary_fill_holes. But you have a gray level image, so you can't use it.
I suppose that you don't want to use a complex inpainting algorithm. My first suggestion is: don't use the nearest gray value (you don't know the real value of the NaN pixels); using the nearest value will give a dirty result. Instead, I would suggest filling the gaps with some other value (e.g. the column mean, as in the example below). You can do it without coding by using scikit-learn:
>>> from sklearn.preprocessing import Imputer
>>> imp = Imputer(strategy="mean")
>>> a = np.random.random((5,5))
>>> a[(1,4,0,3),(2,4,2,0)] = np.nan
>>> a
array([[ 0.77473361, 0.62987193, nan, 0.11367791, 0.17633671],
[ 0.68555944, 0.54680378, nan, 0.64186838, 0.15563309],
[ 0.37784422, 0.59678177, 0.08103329, 0.60760487, 0.65288022],
[ nan, 0.54097945, 0.30680838, 0.82303869, 0.22784574],
[ 0.21223024, 0.06426663, 0.34254093, 0.22115931, nan]])
>>> a = imp.fit_transform(a)
>>> a
array([[ 0.77473361, 0.62987193, 0.24346087, 0.11367791, 0.17633671],
[ 0.68555944, 0.54680378, 0.24346087, 0.64186838, 0.15563309],
[ 0.37784422, 0.59678177, 0.08103329, 0.60760487, 0.65288022],
[ 0.51259188, 0.54097945, 0.30680838, 0.82303869, 0.22784574],
[ 0.21223024, 0.06426663, 0.34254093, 0.22115931, 0.30317394]])
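Note that Imputer has since been removed from scikit-learn; in current releases the closest equivalent, as far as I know, is sklearn.impute.SimpleImputer:

>>> import numpy as np
>>> from sklearn.impute import SimpleImputer
>>> imp = SimpleImputer(strategy="mean")   # fills each NaN with its column mean, as above
>>> a = np.random.random((5, 5))
>>> a[(1, 4, 0, 3), (2, 4, 2, 0)] = np.nan
>>> a = imp.fit_transform(a)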
The dirty solution that uses the nearest values could be this (a sketch implementing it follows the list):
1) Find the perimeter points of the NaN regions
2) Compute all the distances between the NaN points and the perimeter
3) Replace the NaNs with the nearest point's gray value
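A compact way to get that nearest-value fill without computing distances by hand, as a sketch assuming data is a 2D array with np.nan marking the gaps, is scipy.ndimage.distance_transform_edt with return_indices=True, which gives the index of the nearest non-NaN pixel for every pixel:

import numpy as np
from scipy import ndimage

def fill_nearest(data):
    # for each NaN pixel, find the index of the nearest valid (non-NaN) pixel...
    mask = np.isnan(data)
    idx = ndimage.distance_transform_edt(mask, return_distances=False, return_indices=True)
    # ...and copy its value; valid pixels map to themselves and stay unchanged
    return data[tuple(idx)]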
If you want values from the nearest neighbours, you could use NearestNDInterpolator from scipy.interpolate. There are other interpolators you could consider as well.
You can locate the X,Y index values for the NaN values with:
import numpy as np
nan_locs = np.where(np.isnan(data))
There are some other options for the interpolation as well. One option is to replace NaN values with the results of a median filter (but your areas are kind of large for this). Another option might be grayscale dilation. The correct interpolation depends on your end domain.
If you haven't used a SciPy ND interpolator before, you'll need to provide X, Y, and value data to fit the interpolator, and then X and Y data for the points to interpolate at. You can do this using the where example above as a template.
OpenCV has some image in-painting algorithms that you could use. You just need to provide a binary mask which indicates which pixels should be in-painted.
import cv2
import numpy as np
from scipy import ndimage

data = ndimage.imread("test.tif")
# cv2.inpaint needs an 8-bit image and an 8-bit mask whose non-zero
# pixels mark the regions to be in-painted
img = np.nan_to_num(data).astype(np.uint8)
mask = np.isnan(data).astype(np.uint8)
inpainted_img = cv2.inpaint(img, mask, inpaintRadius=3, flags=cv2.INPAINT_TELEA)