I have a grayscale image that I want to rotate. However, I need to run an optimization over it, so I cannot use Pillow or OpenCV.
I want to reshape this image in Python with numpy.reshape into a one-dimensional vector (using the default C-style order).
After that, I want to rotate the image around a point using matrix multiplication and addition, i.e. it should be something like
rotated_image_vector = A @ vector + b # (or the equivalent in homogeneous coordinates).
After this operation I want to reshape the result back to two dimensions and have the rotated image.
Ideally it would also use linear interpolation for the pixels that do not map exactly onto another pixel.
The mathematical theory says this is possible, and I believe there is a very elegant solution to this problem, but I do not see how to construct this matrix. Has anyone already had this problem, or does anyone see an immediate solution?
Thanks a lot,
I like your approach, but there is a slight misconception in it. What you want to transform are not the pixel values themselves but the coordinates. So you don't reshape your image but rather call np.indices on it to obtain the coordinates of each pixel. For those, a rotation by the matrix A around a fixed point p looks like
x' = A (x - p) + p
except that I have to transpose a bit to get the dimensions to align. The code below is a slight adaptation of my code in this answer.
As an example I am going to use the Wikipedia-logo-v2 by Nohat. It is licensed under the Creative Commons Attribution-Share Alike 3.0 Unported license.
First I read in the picture, swap x and y axis to not get mad and rotate the coordinates as described above.
import numpy as np
import matplotlib.pyplot as plt
import itertools
image = plt.imread('wikipedia.jpg')
image = np.swapaxes(image,0,1)/255
fixed_point = np.array(image.shape[:2], dtype='float')/2
points = np.moveaxis(np.indices(image.shape[:2]),0,-1).reshape(-1,2)
a = 2*np.pi/8
A = np.array([[np.cos(a),-np.sin(a)],[np.sin(a),np.cos(a)]])
rotated_coordinates = (A@(points-fixed_point.reshape(1,2)).T).T+fixed_point.reshape(1,2)
Now I set up a little class to interpolate between the pixels that do not map exactly onto another pixel. And finally I swap the axes back and plot it.
class Image_knn():
    def fit(self, image):
        self.image = image.astype('float')
    def predict(self, x, y):
        image = self.image
        # bilinear weights of the two neighbours along each axis
        weights_x = [(1-(x % 1)).reshape(*x.shape,1), (x % 1).reshape(*x.shape,1)]
        weights_y = [(1-(y % 1)).reshape(*x.shape,1), (y % 1).reshape(*x.shape,1)]
        start_x = np.floor(x)
        start_y = np.floor(y)
        # sum the four neighbouring pixels; indices outside the image are clipped to the border
        return sum([image[np.clip(start_x + dx, 0, image.shape[0]-1).astype('int'),
                          np.clip(start_y + dy, 0, image.shape[1]-1).astype('int')] * weights_x[dx]*weights_y[dy]
                    for dx,dy in itertools.product(range(2),range(2))])
image_model = Image_knn()
image_model.fit(image)  # the image has to be stored before predict can be called
transformed_image = image_model.predict(*rotated_coordinates.T).reshape(*image.shape)
# swap the axes back and plot
plt.imshow(np.swapaxes(transformed_image,0,1))
plt.show()
And I get a result like this
Possible Issue
The artifact in the bottom left, which looks as if the screen needed cleaning, comes from the following problem: when we rotate, it can happen that we don't have enough pixels to paint the lower left. What Image_knn does by default is clip the coordinates to the area where we have information. That means that when we ask Image_knn for pixels from outside the image, it gives us the pixels at the boundary of the image. This looks fine if there is a uniform background, but if an object touches the edge of the picture it looks odd, as it does here. Just something to keep in mind when using this.
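One way to avoid the smeared border is to mark coordinates that fall outside the image and paint them with a constant instead of the clipped edge value. The following is only a sketch of that idea, not part of the code above: the function name sample_bilinear and the fill_value parameter are my own, and it assumes the same axis convention as Image_knn (x indexes axis 0, y indexes axis 1).

```python
import numpy as np

def sample_bilinear(image, x, y, fill_value=0.0):
    """Bilinear lookup at fractional coordinates; outside points get fill_value."""
    image = np.asarray(image, dtype=float)
    h, w = image.shape[:2]
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    x0 = np.floor(x).astype(int)
    y0 = np.floor(y).astype(int)
    fx = x - x0
    fy = y - y0
    # True where the requested coordinate lies inside the image
    inside = (x >= 0) & (x <= h - 1) & (y >= 0) & (y <= w - 1)
    out = np.zeros(x.shape + image.shape[2:])
    for dx, dy in ((0, 0), (0, 1), (1, 0), (1, 1)):
        xi = np.clip(x0 + dx, 0, h - 1)
        yi = np.clip(y0 + dy, 0, w - 1)
        weight = (fx if dx else 1 - fx) * (fy if dy else 1 - fy)
        # add trailing axes so the weight broadcasts over colour channels
        weight = weight.reshape(weight.shape + (1,) * (image.ndim - 2))
        out += image[xi, yi] * weight
    # overwrite everything that was sampled from outside the image
    out[~inside] = fill_value
    return out

demo = np.arange(12, dtype=float).reshape(3, 4)
values = sample_bilinear(demo, np.array([0.5, 1.0, -1.0]), np.array([0.5, 2.0, 0.0]), fill_value=-1.0)
```

Whether a constant fill looks better than edge clipping depends on the picture; for an image with a plain background the clipped version may still be preferable.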
Thank you for your answer!
But actually it is not a misconception: this rotation can be represented by a matrix multiplication with the reshaped vector.
I used your code to generate such a matrix (it's surely not the most efficient way, but it works; most likely you will see a more efficient implementation immediately XD. You see, I really need it as a matrix multiplication :-D).
What I basically did was generate the representation matrix of the linear transformation by computing how each of the 100*100 basis images (i.e. the images with zeros everywhere and a single one) is mapped by your transformation.
import numpy as np
import matplotlib.pyplot as plt
import itertools
angle = 2*np.pi/6
image_expl = plt.imread('wikipedia.jpg')
image_expl = image_expl[:,:,0]
image_shape = image_expl.shape
pixel_number = image_shape[0]*image_shape[1]
rot_mat = np.zeros((pixel_number,pixel_number))
for i in range(pixel_number):
    # i-th basis image: zeros everywhere except a single one
    vector = np.zeros(pixel_number)
    vector[i] = 1
    image = vector.reshape(*image_shape)
    fixed_point = np.array(image.shape, dtype='float')/2
    points = np.moveaxis(np.indices(image.shape),0,-1).reshape(-1,2)
    a = -angle
    A = np.array([[np.cos(a),-np.sin(a)],[np.sin(a),np.cos(a)]])
    rotated_coordinates = (A@(points-fixed_point.reshape(1,2)).T).T+fixed_point.reshape(1,2)
    x,y = rotated_coordinates.T
    image = image.astype('float')
    weights_x = [(1-(x % 1)).reshape(*x.shape), (x % 1).reshape(*x.shape)]
    weights_y = [(1-(y % 1)).reshape(*x.shape), (y % 1).reshape(*x.shape)]
    start_x = np.floor(x)
    start_y = np.floor(y)
    transformed_image_returned = sum([image[np.clip(start_x + dx, 0, image.shape[0]-1).astype('int'),
                                            np.clip(start_y + dy, 0, image.shape[1]-1).astype('int')] * weights_x[dx]*weights_y[dy]
                                      for dx,dy in itertools.product(range(2),range(2))])
    # the transformed basis image is the i-th column of the matrix
    rot_mat[:,i] = transformed_image_returned
    if i%100 == 0: print(int(100*i/pixel_number), "% finished")
plt.imshow((rot_mat @ image_expl.reshape(-1)).reshape(image_shape))
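The same matrix can probably be filled without looping over basis images, because each output pixel only depends on its four neighbouring source pixels. The sketch below is my own variant under that assumption: the function name rotation_matrix_for_flat_image and its arguments are hypothetical, the rotation sign follows the coordinate formula from the answer (negate the angle to match the loop above), and borders are clipped as in Image_knn.

```python
import numpy as np

def rotation_matrix_for_flat_image(shape, angle, center=None):
    """Matrix M with (M @ img.reshape(-1)).reshape(shape) equal to the resampled image."""
    h, w = shape
    if center is None:
        center = np.array([h, w], dtype=float) / 2
    # coordinates of every output pixel in C order, shape (h*w, 2)
    out = np.moveaxis(np.indices(shape), 0, -1).reshape(-1, 2).astype(float)
    c, s = np.cos(angle), np.sin(angle)
    A = np.array([[c, -s], [s, c]])
    # source location that each output pixel samples from
    src = (out - center) @ A.T + center
    base = np.floor(src)
    frac = src - base
    M = np.zeros((h * w, h * w))
    rows = np.arange(h * w)
    for dr in (0, 1):
        for dc in (0, 1):
            r = np.clip(base[:, 0] + dr, 0, h - 1).astype(int)
            col = np.clip(base[:, 1] + dc, 0, w - 1).astype(int)
            weight = (frac[:, 0] if dr else 1 - frac[:, 0]) * (frac[:, 1] if dc else 1 - frac[:, 1])
            # accumulate, since clipping can send two neighbours to the same pixel
            np.add.at(M, (rows, r * w + col), weight)
    return M

img = np.arange(12, dtype=float).reshape(3, 4)
M_identity = rotation_matrix_for_flat_image(img.shape, 0.0)
M_rotated = rotation_matrix_for_flat_image(img.shape, np.pi / 6)
rotated = (M_rotated @ img.reshape(-1)).reshape(img.shape)
```

The dense matrix has (height*width)^2 entries, so for a 100x100 image it already holds 10^8 floats; since each row has at most four non-zero entries, a sparse format would likely be the better fit for larger images.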
Thank you again :-)
I am trying (hard) to find an efficient way to visualize range-bearing data in a fan-shaped image.
So far I have come up with the method shown below using OpenCV's cv2.linearPolar function. However, since my data is only for a sensor opening angle of 120 deg, and not a full circle, it requires some hacks that are not very efficient in order to get the correct results from cv2.linearPolar. I.e. I need to insert empty blocks of data to represent a 360 deg image, in order to transform it, and then crop it afterwards to obtain only the region I am interested in.
Calculating the coordinates directly and mapping from A-->B leaves non-interpolated, empty values between the "bearing lines" and does not look good.
Since this needs to run in real time with a fast sensor, I would be interested in learning about another way to achieve this, perhaps using OpenGL or something else.
Current inefficient method that produces the desired result:
import numpy as np
import cv2
import matplotlib.pyplot as plt
image = np.random.randint(256, size=(256,1000),dtype=np.uint8)
n_beams, n_ranges = image.shape[:2]
opening_angle_deg = 120
angle_res_deg = opening_angle_deg/n_beams
start_v = np.zeros((int(opening_angle_deg/angle_res_deg),n_ranges),dtype=np.uint8)
end_v = np.zeros((int((360-opening_angle_deg*2)/angle_res_deg),n_ranges),dtype=np.uint8)
polar = cv2.vconcat([start_v,image])
polar = cv2.vconcat([polar,end_v])
center = (n_ranges,polar.shape[0]/2)
if n_beams > n_ranges:
    dst = cv2.linearPolar(polar,(center),polar.shape[1]/2,cv2.INTER_LINEAR+cv2.WARP_INVERSE_MAP + cv2.WARP_FILL_OUTLIERS)
    x_max = np.ceil(np.sin(np.deg2rad(opening_angle_deg/2))*polar.shape[1]/2)*2
    dst = dst[int(center[1]-x_max/2):int(center[1]+x_max/2),int(dst.shape[1]-polar.shape[1]/2):]
else:
    dst = cv2.linearPolar(polar,(center),polar.shape[0]/2,cv2.INTER_LINEAR+cv2.WARP_INVERSE_MAP + cv2.WARP_FILL_OUTLIERS)
    x_max = np.ceil(np.sin(np.deg2rad(opening_angle_deg/2))*polar.shape[0]/2)*2
    dst = dst[int(center[1]-x_max/2):int(center[1]+x_max/2),int(dst.shape[1]-polar.shape[0]/2):]
dst = cv2.transpose(dst)
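A possible alternative to padding the data to 360 deg is to go the other way round: for every pixel of the fan-shaped output, compute which (beam, range) position it samples and interpolate there. That lookup only depends on the geometry, so it can be built once and reused for every frame. The sketch below uses plain NumPy; the function names, the out_height parameter and the choice to put the sensor at the bottom centre of the output are my assumptions, not something taken from the question.

```python
import numpy as np

def build_fan_lookup(n_beams, n_ranges, opening_angle_deg, out_height):
    """For each output pixel, the fractional (beam, range) index it samples."""
    half = np.deg2rad(opening_angle_deg) / 2
    out_width = int(np.ceil(2 * out_height * np.sin(half)))
    rows, cols = np.indices((out_height, out_width), dtype=float)
    forward = out_height - 1 - rows            # distance ahead of the sensor
    sideways = cols - (out_width - 1) / 2      # lateral offset from the centre line
    rng = np.hypot(forward, sideways) * (n_ranges - 1) / (out_height - 1)
    bearing = np.arctan2(sideways, forward)    # 0 means straight ahead
    beam = (bearing + half) / (2 * half) * (n_beams - 1)
    valid = (np.abs(bearing) <= half) & (rng <= n_ranges - 1)
    return beam, rng, valid

def render_fan(image, beam, rng, valid, fill_value=0):
    """Bilinear lookup of image[beam, range] for every output pixel."""
    b0 = np.clip(np.floor(beam).astype(int), 0, image.shape[0] - 2)
    r0 = np.clip(np.floor(rng).astype(int), 0, image.shape[1] - 2)
    fb = np.clip(beam - b0, 0, 1)
    fr = np.clip(rng - r0, 0, 1)
    img = image.astype(float)
    out = (img[b0, r0] * (1 - fb) * (1 - fr) + img[b0 + 1, r0] * fb * (1 - fr)
           + img[b0, r0 + 1] * (1 - fb) * fr + img[b0 + 1, r0 + 1] * fb * fr)
    # pixels outside the sensor opening or beyond the maximum range
    out[~valid] = fill_value
    return out

beam, rng, valid = build_fan_lookup(n_beams=16, n_ranges=50, opening_angle_deg=120, out_height=40)
ramp = np.tile(np.arange(50, dtype=float), (16, 1))   # value equals the range index
fan = render_fan(ramp, beam, rng, valid)
```

With the lookup precomputed, each frame costs four gathers and a few multiplications. If OpenCV is available, the same beam and rng arrays (converted to float32) could presumably be handed to cv2.remap instead of render_fan, but I have not measured whether that is faster for this case.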
I'm using OpenCV+Python+Numpy and I have three points in the image, I know the exact locations of those points.
(P1, P2);
I am going to transform the image to another view, (for example I am transforming the perspective view to side view). If I do so I will not have the exact location of those three points in the image plane.
I should write the code in a way that I can get new coordinates of those points.
cv2.imshow('Image',Image1)
cv2.imshow('Tran',result)
My question is: How can I determine the new locations of those 3 points?
Easy, you can look in the documentation how warpPerspective works. To transform the location of a point you can use the following transformation:
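Written out, the mapping that the code further down implements is the following. The primed symbols x', y' for the transformed point are my notation, and the indices are 1-based, so M_{11} corresponds to matrix[0][0]:

```latex
x' = \frac{M_{11}\,x + M_{12}\,y + M_{13}}{M_{31}\,x + M_{32}\,y + M_{33}},
\qquad
y' = \frac{M_{21}\,x + M_{22}\,y + M_{23}}{M_{31}\,x + M_{32}\,y + M_{33}}
```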
Where [x, y] is the original point, and M is your perspective matrix
Implementing this in python you can use the following code:
p = (50,100) # your original point
px = (matrix[0][0]*p[0] + matrix[0][1]*p[1] + matrix[0][2]) / ((matrix[2][0]*p[0] + matrix[2][1]*p[1] + matrix[2][2]))
py = (matrix[1][0]*p[0] + matrix[1][1]*p[1] + matrix[1][2]) / ((matrix[2][0]*p[0] + matrix[2][1]*p[1] + matrix[2][2]))
p_after = (int(px), int(py)) # after transformation
You can see the result in the code below. The red dot is your original point. The second figure shows where it went after the perspective transform. The blue circle is the point you calculated with the formula above.
You can have a look in my Jupyter Notebook here or here.
The code:
import numpy as np
import cv2
import matplotlib.pyplot as plt
# load the image
image = cv2.imread('sample.png')
# Draw the point
p = (50,100)
cv2.circle(image,p, 20, (255,0,0), -1)
# Put in perspective
# Show images
# Here you can transform your point
p = (50,100)
px = (matrix[0][0]*p[0] + matrix[0][1]*p[1] + matrix[0][2]) / ((matrix[2][0]*p[0] + matrix[2][1]*p[1] + matrix[2][2]))
py = (matrix[1][0]*p[0] + matrix[1][1]*p[1] + matrix[1][2]) / ((matrix[2][0]*p[0] + matrix[2][1]*p[1] + matrix[2][2]))
p_after = (int(px), int(py))
# Draw the new point
cv2.circle(result,p_after, 20, (0,0,255), 12)
# Show the result
plt.title('Predicted position of your point in blue')
Have a look at the documentation, but in general:
cv2.perspectiveTransform(points, matrix)
For example:
# note you need to add a new axis, to match the supposed format
cv2.perspectiveTransform(pts1[np.newaxis, ...], matrix)
# returns array equal to pts2
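If it helps to see what happens to the points, the same computation can be sketched in plain NumPy with homogeneous coordinates. The function name transform_points is hypothetical and the matrices below are made-up examples, not the matrix from the question.

```python
import numpy as np

def transform_points(points, matrix):
    """Apply a 3x3 perspective matrix to an (N, 2) array of [x, y] points."""
    pts = np.asarray(points, dtype=float).reshape(-1, 2)
    # append a 1 to every point: [x, y] -> [x, y, 1]
    homogeneous = np.hstack([pts, np.ones((len(pts), 1))])
    mapped = homogeneous @ np.asarray(matrix, dtype=float).T
    # divide by the third component to get back to image coordinates
    return mapped[:, :2] / mapped[:, 2:3]

shift = np.array([[1, 0, 10], [0, 1, 20], [0, 0, 1]])
scaled = np.array([[1, 0, 0], [0, 1, 0], [0, 0, 2]])
moved = transform_points([[50, 100], [0, 0]], shift)
halved = transform_points([[50, 100]], scaled)
```

This vectorised form handles all three points in one call; rounding to integer pixel positions, as in p_after above, would be a separate step.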
I already achieved the goal described in the title but I was wondering if there was a more efficient (or generally better) way to do it. First of all let me introduce the problem.
I have a set of images of different sizes but with a width/height ratio less than or equal to 2 (it could be anything, but let's say 2 for now). I want to normalize each one, meaning I want all of them to have the same size. Specifically, I am going to do it like this:
Extract the max height across all images
Zoom each image so that it reaches the max height while keeping its ratio
Add padding of white pixels to the right until the image has a width/height ratio of 2
Keep in mind the images are represented as numpy matrices of grey scale values [0,255].
This is how I'm doing it now in Python:
import numpy as np
from scipy import ndimage
max_height = np.max([len(obs) for obs in data if len(obs[0])/len(obs) <= 2])
for obs in data:
    if len(obs[0])/len(obs) <= 2:
        new_img = ndimage.zoom(obs, round(max_height/len(obs), 2), order=3)
        missing_cols = max_height * 2 - len(new_img[0])
        norm_img = []
        for row in new_img:
            norm_img.append(np.pad(row, (0, missing_cols), mode='constant', constant_values=255))
        norm_img = np.resize(norm_img, (max_height, max_height*2))
There's a note about this code:
I'm rounding the zoom ratio because it makes the final height equal to max_height, I'm sure this is not the best approach but it's working (any suggestion is appreciated here). What I'd like to do is to expand the image keeping the ratio until it reaches a height equal to max_height. This is the only solution I found so far and it worked right away, the interpolation works pretty good.
So my final questions are:
Is there a better approach to achieve what is explained above (image normalization)? Do you think I could have done this differently? Is there a common good practice I'm not following?
Thanks in advance for your time.
Instead of ndimage.zoom you could use scipy.misc.imresize. This function allows you to specify the target size as a tuple, instead of by zoom factor. Thus you won't have to call np.resize later to get the size exactly as desired.
Note that scipy.misc.imresize calls PIL.Image.resize under the hood, so PIL (or Pillow) is a dependency.
Instead of using np.pad in a for-loop, you could allocate space for the desired array, norm_arr, first:
norm_arr = np.full((max_height, max_width), fill_value=255)
and then copy the resized image, new_arr into norm_arr:
nh, nw = new_arr.shape
norm_arr[:nh, :nw] = new_arr
For example,
from __future__ import division
import numpy as np
from scipy import misc
data = [np.linspace(255, 0, i*10).reshape(i,10)
        for i in range(5, 100, 11)]
max_height = np.max([len(obs) for obs in data if len(obs[0])/len(obs) <= 2])
max_width = 2*max_height
result = []
for obs in data:
    norm_arr = obs
    h, w = obs.shape
    if float(w)/h <= 2:
        scale_factor = max_height/float(h)
        target_size = (max_height, int(round(w*scale_factor)))
        new_arr = misc.imresize(obs, target_size, interp='bicubic')
        norm_arr = np.full((max_height, max_width), fill_value=255)
        # check the shapes
        # print(obs.shape, new_arr.shape, norm_arr.shape)
        nh, nw = new_arr.shape
        norm_arr[:nh, :nw] = new_arr
    # collect the normalized image (images with ratio > 2 are kept unchanged)
    result.append(norm_arr)
    # visually check the result
    # misc.toimage(norm_arr).show()
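The size computation and the padding step do not depend on which library does the resizing, so they can be pulled out and checked on their own. The helpers below are only a sketch with names of my choosing (target_size, pad_right); the resize call itself is deliberately left out, since as far as I know scipy.misc.imresize is no longer available in recent SciPy releases and another resize function would have to be substituted.

```python
import numpy as np

def target_size(height, width, max_height):
    """(rows, cols) that scale an image to exactly max_height rows, keeping its ratio."""
    scale = max_height / float(height)
    return max_height, int(round(width * scale))

def pad_right(resized, max_height, max_width, fill_value=255):
    """Place the resized image at the top left of a canvas filled with fill_value."""
    canvas = np.full((max_height, max_width), fill_value, dtype=resized.dtype)
    nh, nw = resized.shape
    canvas[:nh, :nw] = resized
    return canvas

size = target_size(50, 80, max_height=100)
resized = np.zeros(size, dtype=np.uint8)      # stand-in for the output of a resize call
padded = pad_right(resized, max_height=100, max_width=200)
```

Because the target height is passed as an exact integer rather than as a rounded zoom factor, the result always has max_height rows, which addresses the rounding workaround mentioned in the question.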
I want to apply rigid body transformations to a large set of 2D image matrices. Ideally, I'd like to be able to just supply an affine transformation matrix specifying both the translation and rotation, apply this in one go, then do cubic spline interpolation on the output.
Unfortunately it seems that affine_transform in scipy.ndimage.interpolation doesn't do translation. I know I could use a combination of shift and rotate, but this is kind of messy and involves interpolating the output multiple times.
I've also tried using the generic geometric_transform like this:
import numpy as np
from scipy.ndimage import geometric_transform
# make the affine matrix
def maketmat(xshift,yshift,rotation,dimin=(0,0)):
    # centre on the origin
    in2orig = np.identity(3)
    in2orig[:2,2] = -dimin[0]/2.,-dimin[1]/2.
    # rotate about the origin
    theta = np.deg2rad(rotation)
    rotmat = np.identity(3)
    rotmat[:2,:2] = [np.cos(theta),np.sin(theta)],[-np.sin(theta),np.cos(theta)]
    # translate to new position
    orig2out = np.identity(3)
    orig2out[:2,2] = xshift,yshift
    # the final affine matrix is just the product
    tmat = np.dot(orig2out,np.dot(rotmat,in2orig))
    return tmat
# function that maps output space to input space
def out2in(outcoords,affinemat):
    outcoords = np.asarray(outcoords)
    outcoords = np.concatenate((outcoords,(1.,)))
    incoords = np.dot(affinemat,outcoords)
    incoords = tuple(incoords[0:2])
    return incoords
def rbtransform(source,xshift,yshift,rotation,outdims):
    # source --> target
    forward = maketmat(xshift,yshift,rotation,source.shape)
    # target --> source
    backward = np.linalg.inv(forward)
    # now we can use geometric_transform to do the interpolation etc.
    tformed = geometric_transform(source,out2in,output_shape=outdims,extra_arguments=(backward,))
    return tformed
This works, but it's horribly slow, since it's essentially looping over pixel coordinates! What's a good way to do this?
Can you use scikit-image?
If so, you could try to apply a homography. A homography can be used to represent both translation and rotation at the same time through a 3x3 matrix.
You can use the skimage.transform.fast_homography function.
import numpy as np
import scipy
import skimage.transform
im = scipy.misc.lena()
H = np.asarray([[1, 0, 10], [0, 1, 20], [0, 0, 1]])
skimage.transform.fast_homography(im, H)
The transform took about 30 ms on my old Core 2 Duo.
About homography : http://en.wikipedia.org/wiki/Homography
I think affine_transform does do translation --- there's the offset parameter.
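To make that concrete: as I read the SciPy documentation, affine_transform looks up each output index o in the input at position matrix @ o + offset, so a rotation about a centre plus a shift can be folded into one matrix and one offset. The helper below is a sketch under that reading; the name rigid_matrix_and_offset is hypothetical, and coordinates are in array index order (row, column), not (x, y).

```python
import numpy as np

def rigid_matrix_and_offset(rotation_deg, shift, center):
    """matrix, offset for an output-to-input lookup of 'rotate about center, then shift'."""
    theta = np.deg2rad(rotation_deg)
    c, s = np.cos(theta), np.sin(theta)
    forward = np.array([[c, -s], [s, c]])
    center = np.asarray(center, dtype=float)
    shift = np.asarray(shift, dtype=float)
    # forward map:  out = R (in - center) + center + shift
    # inverse map:  in  = R^T (out - center - shift) + center
    matrix = forward.T
    offset = center - matrix @ (center + shift)
    return matrix, offset

center = np.array([50.0, 40.0])
shift = np.array([3.0, -5.0])
matrix, offset = rigid_matrix_and_offset(30.0, shift, center)

# check on one point: push it forward, then map it back with matrix/offset
theta = np.deg2rad(30.0)
R = np.array([[np.cos(theta), -np.sin(theta)], [np.sin(theta), np.cos(theta)]])
p_in = np.array([12.0, 7.0])
p_out = R @ (p_in - center) + center + shift
p_back = matrix @ p_out + offset
```

These two values could then be passed as scipy.ndimage.affine_transform(source, matrix, offset=offset, order=3), which would interpolate only once; I have not timed it against geometric_transform on the data from the question.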
I am trying to extract data from a binary mask. All goes well, but moving the data over to Python causes it to shift by a few pixels. That is enough that I cannot find the center. Oddly enough, however, saving the image displays the pixels at the correct location.
Here is my code. I basically create a normal mat to use as output. However, a matND is output according to the docs.
Am I extracting the data properly? If so, tell me. I am trying to find the center given points along the center. I would rather not have my data shifted.
import cv2.cv as cv
import numpy as np
def main():
    imgColor = cv.LoadImage(OPTICIMAGE, cv.CV_LOAD_IMAGE_COLOR)
    center, radius = centerandradus(imgColor)
def centerandradus(cvImg, ColorLower=None,ColorUpper=None):
    lowerBound = cv.Scalar(130, 0, 130);
    upperBound = cv.Scalar(171, 80, 171);
    size = cv.GetSize(cvImg)
    output = cv.CreateMat(size[0],size[1],cv.CV_8UC1)
    cv.InRangeS(cvImg, lowerBound, upperBound,output)
    mask = np.asarray( output[:,:] )
    x,y = np.nonzero(mask)
    x, y = np.array(x),np.array(y)
    h,k = centerEstimate(x,y)
    # NOTE: radius is not defined anywhere in the snippet as posted
    return np.array([h,k]), radius
def centerEstimate(xList,yList):
    x_m = np.mean( np.r_[xList])
    y_m = np.mean( np.r_[yList])
    return x_m, y_m
Edit: I think it is a problem with matND, since I notice the data is already shifted when I try to print it out. If you need any more information, please ask.
Thank You for your time
It seems there is no longer any difference between Mat and MatND. MatND is now obsolete.
By looking at opencv2/core.hpp (version 2.4.8):
typedef Mat MatND;
I learned that the orientation of the data is different when I use findcontours than when I use this matrix.
This matrix uses height x width, while the contour puts it as width x height. I hate reading APIs.
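A tiny NumPy example of that ordering, using a made-up mask rather than the image from the question: np.nonzero returns the row (height) indices first and the column (width) indices second, so they have to be swapped to get an (x, y) center.

```python
import numpy as np

# 4 rows (height) by 6 columns (width), with a small blob near the right edge
mask = np.zeros((4, 6), dtype=np.uint8)
mask[1:3, 4:6] = 255

rows, cols = np.nonzero(mask)             # row indices first, then column indices
center_xy = (cols.mean(), rows.mean())    # x runs along the columns, y along the rows
center_rowcol = (rows.mean(), cols.mean())
```

Here center_xy is the form that point-based OpenCV functions expect, while center_rowcol is what can be used directly to index the array.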