I'm trying to perform a rigid + scale transformation on a 3D volume with pytorch, but I can't seem to understand how the theta required for torch.nn.functional.affine_grid works.
I have a transformation matrix of size (1,4,4) generated by multiplying the matrices Translation * Scale * Rotation. If I use this matrix in, for example, scipy.ndimage.affine_transform, it works with no issues. However, the same matrix (cropped to size (1,3,4)) fails completely with torch.nn.functional.affine_grid.
I have managed to understand how the translation works (range -1 to 1) and I have confirmed that the Translation matrix works by simply normalizing the values to that range. As for the other two, I am lost.
I tried using a basic scaling matrix alone (below) as the simplest comparison, but the results in pytorch differ from those of scipy:
Scaling =
[[0.75, 0, 0, 0],
[0, 0.75, 0, 0],
[0, 0, 0.75, 0],
[0, 0, 0, 1]]
How can I convert the (1,4,4) affine matrix to work the same with torch.nn.functional.affine_grid? Alternatively, is there a way to generate the correct matrix based on the transformation parameters (shift, euler angles, scaling)?
For anyone who comes across a similar issue in the future: the difference between scipy and pytorch affine transforms is that scipy applies the transform around (0, 0, 0), while pytorch applies it around the middle of the image/volume.
For example, let's take the parameters:
euler_angles = [ea0, ea1, ea2]
translation = [tr0, tr1, tr2]
scale = [sc0, sc1, sc2]
and create the following transformation matrices:
# Rotation matrix
R_x(ea0, ea1, ea2) = np.array([[1, 0, 0, 0],
[0, math.cos(ea0), -math.sin(ea0), 0],
[0, math.sin(ea0), math.cos(ea0), 0],
[0, 0, 0, 1]])
R_y(ea0, ea1, ea2) = np.array([[math.cos(ea1), 0, math.sin(ea1), 0],
[0, 1, 0, 0],
[-math.sin(ea1), 0, math.cos(ea1), 0],
[0, 0, 0, 1]])
R_z(ea0, ea1, ea2) = np.array([[math.cos(ea2), -math.sin(ea2), 0, 0],
[math.sin(ea2), math.cos(ea2), 0, 0],
[0, 0, 1, 0],
[0, 0, 0, 1]])
R = R_x.dot(R_y).dot(R_z)
# Translation matrix
T(tr0, tr1, tr2) = np.array([[1, 0, 0, -tr0],
[0, 1, 0, -tr1],
[0, 0, 1, -tr2],
[0, 0, 0, 1]])
# Scaling matrix
S(sc0, sc1, sc2) = np.array([[1/sc0, 0, 0, 0],
[0, 1/sc1, 0, 0],
[0, 0, 1/sc2, 0],
[0, 0, 0, 1]])
If you have a volume of size (100, 100, 100), the scipy transform around the centre of the volume requires moving the centre of the volume to (0, 0, 0) first, and then moving it back to (50, 50, 50) after S, T, and R have been applied. Defining:
T_zero = np.array([[1, 0, 0, 50],
[0, 1, 0, 50],
[0, 0, 1, 50],
[0, 0, 0, 1]])
T_centre = np.array([[1, 0, 0, -50],
[0, 1, 0, -50],
[0, 0, 1, -50],
[0, 0, 0, 1]])
The scipy transform around the centre is then:
transform_scipy_centre = T_zero.dot(T).dot(S).dot(R).dot(T_centre)
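The centring step can be checked numerically. The following is a minimal sketch, assuming a (100, 100, 100) volume and a scale of 0.75; the helper name `centred` is made up for illustration and is not part of scipy or pytorch. It shows that wrapping a matrix between the two centre translations leaves the centre voxel fixed while other points move.

```python
import numpy as np

def centred(matrix, centre):
    """Conjugate a 4x4 homogeneous matrix so that it acts around `centre`."""
    to_origin = np.eye(4)
    to_origin[:3, 3] = -np.asarray(centre, dtype=float)  # same role as T_centre
    back = np.eye(4)
    back[:3, 3] = np.asarray(centre, dtype=float)        # same role as T_zero
    return back @ matrix @ to_origin

# Scaling matrix in the same inverse convention as S above (1 / scale)
S = np.diag([1 / 0.75, 1 / 0.75, 1 / 0.75, 1.0])
M = centred(S, (50, 50, 50))

centre_h = np.array([50.0, 50.0, 50.0, 1.0])
mapped_centre = M @ centre_h                            # stays at (50, 50, 50)
mapped_corner = M @ np.array([0.0, 0.0, 0.0, 1.0])      # moves away from (0, 0, 0)
print(mapped_centre[:3], mapped_corner[:3])
```

Without the two extra translations, the same scaling matrix would keep (0, 0, 0) fixed instead, which is the behaviour described for scipy.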
In pytorch, there are some slight differences to the parameters.
The translation is defined between -1 and 1. Their order is also different. Using the same (100, 100, 100) volume as an example, the translation parameters in pytorch are given by:
# Note the order difference
translation_pytorch = [tr0_p, tr1_p, tr2_p] = [tr0/50, tr2/50, tr1/50]
T_p = T(tr0_p, tr1_p, tr2_p)
The scale parameters are in a different order:
scale_pytorch = [sc0_p, sc1_p, sc2_p] = [sc2, sc0, sc1]
S_p = S(sc0_p, sc1_p, sc2_p)
The euler angles are the biggest difference. To get the equivalent transform, first the parameters are negated and reordered:
# Note the order difference
euler_angles_pytorch = [ea0_p, ea1_p, ea2_p] = [-ea0, -ea2, -ea1]
R_x_p = R_x(ea0_p, ea1_p, ea2_p)
R_y_p = R_y(ea0_p, ea1_p, ea2_p)
R_z_p = R_z(ea0_p, ea1_p, ea2_p)
The order in which the rotation matrix is calculated is also different:
# Note the order difference
R_p = R_x_p.dot(R_z_p).dot(R_y_p)
With all these considerations, the scipy transform with:
transform_scipy_centre = T_zero.dot(T).dot(S).dot(R).dot(T_centre)
is equivalent to the pytorch transform with:
transform_pytorch = T_p.dot(S_p).dot(R_p)
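The recipe above is written as pseudocode, so here is a runnable transcription in NumPy. It is only a sketch: the function names (`rot_x`, `theta_from_params`, and so on) and the `half_size` parameter are made up for illustration, the axis and angle reordering is copied from this answer rather than checked independently against torch.nn.functional.affine_grid, and a cubic volume is assumed.

```python
import numpy as np

def rot_x(a):
    c, s = np.cos(a), np.sin(a)
    return np.array([[1, 0, 0, 0], [0, c, -s, 0], [0, s, c, 0], [0, 0, 0, 1]], dtype=float)

def rot_y(a):
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, 0, s, 0], [0, 1, 0, 0], [-s, 0, c, 0], [0, 0, 0, 1]], dtype=float)

def rot_z(a):
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, -s, 0, 0], [s, c, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]], dtype=float)

def translation_matrix(t):
    m = np.eye(4)
    m[:3, 3] = -np.asarray(t, dtype=float)  # negative sign as in T above
    return m

def scaling_matrix(s):
    return np.diag([1 / s[0], 1 / s[1], 1 / s[2], 1.0])  # inverse scale as in S above

def theta_from_params(euler_angles, translation, scale, half_size=50.0):
    """Build a (3, 4) matrix following the reordering described in the answer."""
    ea0, ea1, ea2 = euler_angles
    tr0, tr1, tr2 = translation
    sc0, sc1, sc2 = scale
    T_p = translation_matrix([tr0 / half_size, tr2 / half_size, tr1 / half_size])
    S_p = scaling_matrix([sc2, sc0, sc1])
    ea0_p, ea1_p, ea2_p = -ea0, -ea2, -ea1
    R_p = rot_x(ea0_p) @ rot_z(ea2_p) @ rot_y(ea1_p)
    return (T_p @ S_p @ R_p)[:3, :]

theta_identity = theta_from_params([0, 0, 0], [0, 0, 0], [1, 1, 1])
print(theta_identity)
```

The result would still need a leading batch dimension and conversion to a tensor before being passed to affine_grid.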
I hope this helps!
How can I place a limited number of random values, within a given range, in a numpy matrix?
That is, instead of:
random_matrix = np.random.rand(5, 5)
[[0.38555213 0.96454126 0.91586422 0.92638243 0.85516641]
[0.64717218 0.2716665 0.70945594 0.74754943 0.48870502]
[0.23381316 0.01992578 0.86749684 0.85797792 0.19308509]
[0.63565231 0.7056163 0.69110815 0.73506642 0.804646 ]
[0.35512519 0.54900446 0.66311323 0.04899527 0.49349834]]
the desired result is, for example, 3 random integers in the range 1-5
placed in an otherwise zero matrix:
0,0,0,4,0
0,0,0,0,0
0,1,0,0,0
0,0,0,3,0
0,0,0,0,0
Thanks in advance
If I understand the question correctly, you want to create a matrix that is zero everywhere except at 3 random indices, each of which holds a random value in the range 1-5.
For this I would suggest doing:
import numpy as np
null_matrix = np.zeros((5,5), dtype=np.int32)
rng = np.random.default_rng()
x = rng.choice(5, size=3, replace=False)
y = rng.choice(5, size=3, replace=False)
# integers() excludes the upper bound, so 6 is needed to include 5
null_matrix[x,y] = rng.integers(1, 6, size=3)
print(null_matrix)
Output:
array([[0, 0, 0, 0, 0],
[0, 1, 0, 0, 0],
[4, 0, 0, 0, 0],
[0, 0, 0, 0, 0],
[0, 0, 0, 0, 2]], dtype=int32)
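One thing to be aware of: drawing the row indices and the column indices separately without replacement means no two values can ever share a row or a column. If any 3 distinct cells should be possible, a variation is to sample flat positions instead. This is a sketch; the function name `sparse_random_matrix` and its parameters are made up for illustration.

```python
import numpy as np

def sparse_random_matrix(shape=(5, 5), count=3, low=1, high=5, seed=None):
    """Zero matrix with `count` random integers in [low, high] at distinct cells."""
    rng = np.random.default_rng(seed)
    out = np.zeros(shape, dtype=np.int32)
    # Sample distinct positions from the flattened matrix
    flat_idx = rng.choice(out.size, size=count, replace=False)
    # The upper bound of integers() is exclusive, hence high + 1
    out.flat[flat_idx] = rng.integers(low, high + 1, size=count)
    return out

m = sparse_random_matrix(seed=0)
print(m)
```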
Suppose I have original_image with shape (451, 521, 3), and it contains [0,0,0] RGB values at some locations.
I would like to replace all [0,0,0] with [0,255,0].
What I tried:
I created a mask that is True wherever [0,0,0] occurs in original_image.
The mask has shape (451, 521).
I thought I could use the following:
new_original_image=original_image[mask]
But new_original_image turned out to be just an array (with a shape like (18, 3)) containing the elements of original_image selected by the True entries of the mask (for example, [[ 97 68 108],[127 99 139],[156 130 170],...]).
Here is one way
idx=np.all(np.vstack(a)==np.array([0,0,5]),1)
a1=np.vstack(a)
a1[idx]=[0,0,0]
yourary=a1.reshape(2,-1,3)
Out[150]:
array([[[0, 0, 0],
[0, 0, 1],
[0, 0, 0],
[0, 0, 0]],
[[0, 0, 0],
[0, 0, 1],
[0, 0, 0],
[0, 0, 0]]])
Data input
a
Out[133]:
array([[[0, 0, 0],
[0, 0, 1],
[0, 0, 5],
[0, 0, 5]],
[[0, 0, 0],
[0, 0, 1],
[0, 0, 5],
[0, 0, 5]]])
I would like to replace all [0,0,0] with [0,255,0]
import cv2
import numpy as np
img = cv2.imread("test.jpg")
rows, cols, channels = img.shape
for r in range(rows):
    for c in range(cols):
        # Compare all three channels of the pixel, not just the first one
        if np.all(img[r, c] == [0, 0, 0]):
            img[r, c] = [0, 255, 0]
Based on the solution in Wen-Ben's reply, here is a detailed code snippet of what I wanted to implement:
# original_image which contains [0,0,0] at several location
# in 2 (last) axis from (451, 521, 3) shape image
# Stack original_image or using original_image.reshape((-1,3)) is also working
stacked=np.vstack(original_image)
# print(stacked.shape)
# (234971, 3)
# Create mask array which has True where [0,0,0] are located in stacked array
idx=np.all(stacked==[0,0,0],1)
# print(idx.shape)
# (234971,)
# Replace existing values which are filtered by idx with [0,255,0]
stacked[idx]=[0,255,0]
# Back to original image shape
original_image_new=stacked.reshape(original_image.shape[0],original_image.shape[1],3)
# print(original_image_new.shape)
# (451, 521, 3)
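The stacking and reshaping can also be skipped, because a 2D boolean mask can be used on the left-hand side of an assignment to a 3D image. The sketch below uses a tiny made-up array in place of the real (451, 521, 3) image; the name `recoloured` is only for illustration.

```python
import numpy as np

# Small stand-in for the real image: 2 x 2 pixels, 3 channels
original_image = np.array([[[0, 0, 0], [10, 20, 30]],
                           [[0, 0, 5], [0, 0, 0]]], dtype=np.uint8)

# True where all three channels are zero; shape equals the image's first two axes
mask = np.all(original_image == 0, axis=-1)

# Indexing with the mask on the left-hand side assigns in place,
# whereas original_image[mask] on its own only returns the selected pixels
recoloured = original_image.copy()
recoloured[mask] = [0, 255, 0]
print(recoloured)
```

This is why `original_image[mask]` alone returned an (N, 3) array: reading with a mask selects pixels, while assigning with a mask modifies them.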
I've been working to improve the speed of my code by replacing for loops over arrays with appropriate NumPy functions.
The function aims to get the end points of a line, which are the only two points that have exactly one neighbouring pixel with value 255.
Is there a way to get the two points from np.where with conditions, or is there some NumPy function I'm not familiar with that will do the job?
def get_end_points(image):
    x1=-1
    y1=-1
    x2=-1
    y2=-1
    for i in range(image.shape[0]):
        for j in range(image.shape[1]):
            if image[i][j]==255 and neighbours_sum(i,j,image) == 255:
                if x1==-1:
                    x1 = j
                    y1 = i
                else:
                    x2=j
                    y2=i
    return x1,y1,x2,y2
Here is a solution with convolution:
import numpy as np
import scipy.signal
def find_endpoints(img):
    # Kernel to sum the neighbours
    kernel = [[1, 1, 1],
              [1, 0, 1],
              [1, 1, 1]]
    # 2D convolution (cast image to int32 to avoid overflow)
    img_conv = scipy.signal.convolve2d(img.astype(np.int32), kernel, mode='same')
    # Pick points where pixel is 255 and neighbours sum 255
    endpoints = np.stack(np.where((img == 255) & (img_conv == 255)), axis=1)
    return endpoints
# Test
img = np.zeros((1000, 1000), dtype=np.uint8)
# Draw a line from (200, 130) to (800, 370)
for i in range(200, 801):
    j = round(i * 0.4 + 50)
    img[i, j] = 255
print(find_endpoints(img))
# [[200 130]
# [800 370]]
EDIT:
You may also consider using Numba for this. The code would be pretty much what you already have, so maybe not particularly "elegant", but much faster. For example, something like this:
import numpy as np
import numba as nb
@nb.njit
def find_endpoints_nb(img):
    endpoints = []
    # Iterate through every row and column
    for i in range(img.shape[0]):
        for j in range(img.shape[1]):
            # Check current pixel is white
            if img[i, j] != 255:
                continue
            # Sum neighbours
            s = 0
            for ii in range(max(i - 1, 0), min(i + 2, img.shape[0])):
                for jj in range(max(j - 1, 0), min(j + 2, img.shape[1])):
                    s += img[ii, jj]
            # Sum including self pixel for simplicity, check for two white pixels
            if s == 255 * 2:
                endpoints.append((i, j))
                if len(endpoints) >= 2:
                    break
        if len(endpoints) >= 2:
            break
    return np.array(endpoints)
print(find_endpoints_nb(img))
# [[200 130]
# [800 370]]
This runs considerably faster on my computer:
%timeit find_endpoints(img)
# 34.4 ms ± 64.4 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
%timeit find_endpoints_nb(img)
# 552 µs ± 4 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
Also, it should use less memory. The code above assumes there will be only two endpoints. You may be able to make it even faster if you add parallelization (although you would have to make some changes, because you would not be able to modify the list endpoints from parallel threads).
Edit: I didn't notice you have a grayscale image, but the idea stays the same.
I cannot give you an exact solution, but I can give you a faster way to find what you want.
1) a) Find the indices (pixels) that are white [255,255,255]:
indices = np.where(np.all(image==255, axis=2))
1) b) Run your loops only around these points.
This is faster because you are not doing useless loop iterations.
2) This solution should be very fast, but it will be harder to program.
a) Find the indices as in 1):
indices = np.where(np.all(image==255, axis=2))
b) Shift the indices array by +1 along the x axis and add the result to the image:
indices = np.where(np.all(image==255, axis=2))
indices_up = # somehow add +1 to all indices in the x dimension (simply move it up)
add_up = image[indices]+image[indices_up]
# if add_up contains an (RGB) entry [510,510,510] (255+255), then that pixel has a neighbour at x+1
# Note that you can't do this with an image of dtype uint8: 255 is the maximum, so the sum overflows and wraps around (255+255 gives 254)
You have to do this for all neighbours, though: x+1, x-1, y+1, y-1, x+1,y+1, ...
It will be extra fast, though.
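The shifting idea in 2) can be sketched for a grayscale image as follows. This is one possible reading of the idea, not a tested drop-in replacement: the function names are made up for illustration, and the image is zero-padded so that shifted views never fall outside it. The neighbour counts are accumulated in int32, which avoids the uint8 overflow mentioned above.

```python
import numpy as np

def count_white_neighbours(image):
    """For every pixel, count how many of its 8 neighbours equal 255."""
    white = (image == 255).astype(np.int32)
    padded = np.pad(white, 1)  # zero border so every shift stays in bounds
    h, w = white.shape
    counts = np.zeros_like(white)
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            if dy == 0 and dx == 0:
                continue  # skip the pixel itself
            # View of the padded array shifted by (dy, dx)
            counts += padded[1 + dy:1 + dy + h, 1 + dx:1 + dx + w]
    return counts

def endpoints_by_shifting(image):
    """White pixels that have exactly one white neighbour."""
    counts = count_white_neighbours(image)
    return np.argwhere((image == 255) & (counts == 1))

# Horizontal 3-pixel line in a 5 x 5 image
demo = np.zeros((5, 5), dtype=np.uint8)
demo[2, 1:4] = 255
found = endpoints_by_shifting(demo)
print(found)
```

Only eight whole-array additions are needed regardless of image size, so no Python loop runs over individual pixels.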
EDIT2: I was able to make a script that should do it, but you should test it first
import numpy as np
image = np.array([[0, 0, 0, 0, 0, 0, 0,0,0],
[0, 0, 255, 0, 0, 0, 0,0,0],
[0, 0, 255, 0, 255, 0, 0,0,0],
[0, 0, 0, 255,0, 255, 0,0,0],
[0, 0, 0, 0, 0, 255, 0,0,0],
[0, 0, 0, 0, 0, 0, 0,0,0],
[0, 0, 0, 0, 0, 0, 0,0,0]])
image_f = image[1:-1,1:-1] # cut image
i = np.where(image_f==255) # find 255 in the cut image
x = i[0]+1 # calibrate x indexes for original image
y = i[1]+1 # calibrate y indexes for original image
# this is done so you dont search in get_indexes() out of image
def get_indexes(xx,yy,image):
    for i in np.where(image[xx,yy]==255):
        for a in i:
            yield xx[a],yy[a]
# Search for horizontal and vertical duplicates(neighbours)
for neighbours_index in get_indexes(x+1,y,image):
    print(neighbours_index )
for neighbours_index in get_indexes(x-1,y,image):
    print(neighbours_index )
for neighbours_index in get_indexes(x,y+1,image):
    print(neighbours_index )
for neighbours_index in get_indexes(x,y-1,image):
    print(neighbours_index )
I think I can at least provide an elegant solution using convolutions.
We can look for the amount of neighbouring pixels by convolving the original image with a 3x3 ring. Then we can determine if the line end was there if the center pixel also had a white pixel in it.
>>> import numpy as np
>>> from scipy.signal import convolve2d
>>> a = np.array([[0, 0, 0, 0, 0, 0, 0], [0, 1, 0, 0, 0, 0, 0], [0, 1, 0, 0, 0, 0, 0], [0, 0, 1, 0, 0, 0, 0], [0, 0, 0, 1, 0, 0, 0], [0, 0, 0, 0, 1, 0, 0], [0, 0, 0, 0, 1, 1, 0]])
>>> a
array([[0, 0, 0, 0, 0, 0, 0],
[0, 1, 0, 0, 0, 0, 0],
[0, 1, 0, 0, 0, 0, 0],
[0, 0, 1, 0, 0, 0, 0],
[0, 0, 0, 1, 0, 0, 0],
[0, 0, 0, 0, 1, 0, 0],
[0, 0, 0, 0, 1, 1, 0]])
>>> c = np.full((3, 3), 1)
>>> c[1, 1] = 0
>>> c
array([[1, 1, 1],
[1, 0, 1],
[1, 1, 1]])
>>> np.logical_and(convolve2d(a, c, mode='same') == 1, a == 1).astype(int)
array([[0, 0, 0, 0, 0, 0, 0],
[0, 1, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0]])
Feel free to see what the individual components produce, but for the sake of brevity I didn't include them here. And as you might have noticed, it does correctly reject cases where the line ends with two neighbouring pixels.
You can of course convert this mask (call it result) into the indices of however many line endings there are with np.where:
np.array(np.where(result))
I am working on a piece of Python code that takes in a greyscale image, scales it, and outputs a 3D model in which the height of each pixel is determined by its grey value. I have everything working except the output of the 3D model. I am using numpy-stl to create it from an array of values derived from the image. Using the numpy-stl library, I create a box and then copy it as many times as I need for the image. Then I translate each copy to the position and height corresponding to the image. This all works. The problem comes when I try to save it all as one .stl file: I can't figure out how to combine the individual cube meshes into one.
Here is just the code dealing with the creation of the 3d array. I can plot the created meshes but not save them.
from stl import mesh
import math
import numpy
test = [[1,2],[2,1]]
a = [[1,2,3,4],
[5,6,7,8],
[9,10,11,12],
[13,14,15,16]]
# Create 6 faces of a cube, 2 triangles per face
data = numpy.zeros(12, dtype=mesh.Mesh.dtype)
#cube defined in stl format
# Top of the cube
data['vectors'][0] = numpy.array([[0, 1, 1],
[1, 0, 1],
[0, 0, 1]])
data['vectors'][1] = numpy.array([[1, 0, 1],
[0, 1, 1],
[1, 1, 1]])
# Right face
data['vectors'][2] = numpy.array([[1, 0, 0],
[1, 0, 1],
[1, 1, 0]])
data['vectors'][3] = numpy.array([[1, 1, 1],
[1, 0, 1],
[1, 1, 0]])
# Left face
data['vectors'][4] = numpy.array([[0, 0, 0],
[1, 0, 0],
[1, 0, 1]])
data['vectors'][5] = numpy.array([[0, 0, 0],
[0, 0, 1],
[1, 0, 1]])
# Bottom of the cube
data['vectors'][6] = numpy.array([[0, 1, 0],
[1, 0, 0],
[0, 0, 0]])
data['vectors'][7] = numpy.array([[1, 0, 0],
[0, 1, 0],
[1, 1, 0]])
# Right back
data['vectors'][8] = numpy.array([[0, 0, 0],
[0, 0, 1],
[0, 1, 0]])
data['vectors'][9] = numpy.array([[0, 1, 1],
[0, 0, 1],
[0, 1, 0]])
# Left back
data['vectors'][10] = numpy.array([[0, 1, 0],
[1, 1, 0],
[1, 1, 1]])
data['vectors'][11] = numpy.array([[0, 1, 0],
[0, 1, 1],
[1, 1, 1]])
# Generate 16 separate meshes, one cube per value in the 4x4 array
meshes = [mesh.Mesh(data.copy()) for _ in range(16)]
# Iterates through the array and translates each cube in the x and y direction according
# to its position in the array, and in the z direction according to the value stored in the array
def ArrayToSTL(array, STLmesh):
    y_count = 0
    x_count = 0
    count = 0
    for row in array:
        x_count = 0
        for item in row:
            # Use the list passed in rather than the global meshes
            STLmesh[count].x += x_count
            STLmesh[count].y += y_count
            STLmesh[count].z += item
            x_count +=1
            count += 1
        y_count += 1
ArrayToSTL(a, meshes)
# Optionally render the rotated cube faces
from matplotlib import pyplot
from mpl_toolkits import mplot3d
# Create a new plot
figure = pyplot.figure()
axes = mplot3d.Axes3D(figure)
# Render the cube faces
for m in meshes:
axes.add_collection3d(mplot3d.art3d.Poly3DCollection(m.vectors))
# Auto scale to the mesh size
scale = numpy.concatenate([m.points for m in meshes]).flatten()
axes.auto_scale_xyz(scale, scale, scale)
# Show the plot to the screen
pyplot.show()
This works well:
import numpy as np
import stl
from stl import mesh
import os
def combined_stl(meshes, save_path="./combined.stl"):
    combined = mesh.Mesh(np.concatenate([m.data for m in meshes]))
    combined.save(save_path, mode=stl.Mode.ASCII)
To load stored STL files and combine them, use this:
direc = "path_of_directory"
paths = [os.path.join(direc, i) for i in os.listdir(direc)]
meshes = [mesh.Mesh.from_file(path) for path in paths]
combined_stl(meshes)
I have a m x n matrix where each row is a sample and each column is a class. Each row contains the soft-max probabilities of each class. I want to replace the maximum value in each row with 1 and others with 0. How can I do it efficiently in Python?
Some made up data:
>>> a = np.random.rand(5, 5)
>>> a
array([[ 0.06922196, 0.66444783, 0.2582146 , 0.03886282, 0.75403153],
[ 0.74530361, 0.36357237, 0.3689877 , 0.71927017, 0.55944165],
[ 0.84674582, 0.2834574 , 0.11472191, 0.29572721, 0.03846353],
[ 0.10322931, 0.90932896, 0.03913152, 0.50660894, 0.45083403],
[ 0.55196367, 0.92418942, 0.38171512, 0.01016748, 0.04845774]])
In one line:
>>> (a == a.max(axis=1)[:, None]).astype(int)
array([[0, 0, 0, 0, 1],
[1, 0, 0, 0, 0],
[1, 0, 0, 0, 0],
[0, 1, 0, 0, 0],
[0, 1, 0, 0, 0]])
A more efficient (and verbose) approach:
>>> b = np.zeros_like(a, dtype=int)
>>> b[np.arange(a.shape[0]), np.argmax(a, axis=1)] = 1
>>> b
array([[0, 0, 0, 0, 1],
[1, 0, 0, 0, 0],
[1, 0, 0, 0, 0],
[0, 1, 0, 0, 0],
[0, 1, 0, 0, 0]])
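The two approaches differ when a row contains a tie for the maximum: the equality comparison marks every tied entry, while argmax marks only the first one. A small sketch with made-up probabilities (the variable names are only for illustration):

```python
import numpy as np

# First row has a tie between columns 1 and 2
probs = np.array([[0.2, 0.4, 0.4],
                  [0.7, 0.2, 0.1]])

# Equality with the row maximum: every tied entry becomes 1
by_equality = (probs == probs.max(axis=1, keepdims=True)).astype(int)

# argmax: exactly one 1 per row (the first maximum)
by_argmax = np.zeros_like(probs, dtype=int)
by_argmax[np.arange(probs.shape[0]), probs.argmax(axis=1)] = 1

print(by_equality)
print(by_argmax)
```

If each row must contain exactly one 1, the argmax version is the safer choice.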
I think the best answer to your particular question is to use a matrix type object.
A sparse matrix should be the most performant in terms of storing large numbers of these matrices of large sizes in a memory friendly way, given that most of the matrix is populated with zeroes. This should be superior to using numpy arrays directly especially for very large matrices in both dimensions, if not in terms of speed of computation, in terms of memory.
import numpy as np
import scipy.sparse  # importing the submodule explicitly also works on older versions
matrix = np.matrix(np.random.randn(10, 5))
maxes = matrix.argmax(axis=1).A1
# was .A[:,0], slightly faster, but .A1 seems more readable
n_rows = len(matrix) # could do matrix.shape[0], but that's slower
data = np.ones(n_rows)
row = np.arange(n_rows)
sparse_matrix = scipy.sparse.coo_matrix((data, (row, maxes)),
shape=matrix.shape,
dtype=np.int8)
This sparse_matrix object should be very lightweight relative to a regular matrix object, which would needlessly track each and every zero in it. To materialize it as a normal matrix:
sparse_matrix.todense()
returns:
matrix([[0, 0, 0, 0, 1],
[0, 0, 1, 0, 0],
[0, 0, 1, 0, 0],
[0, 0, 0, 0, 1],
[1, 0, 0, 0, 0],
[0, 0, 1, 0, 0],
[0, 0, 0, 1, 0],
[0, 1, 0, 0, 0],
[1, 0, 0, 0, 0],
[0, 0, 0, 1, 0]], dtype=int8)
Which we can compare to matrix:
matrix([[ 1.41049496, 0.24737968, -0.70849012, 0.24794031, 1.9231408 ],
[-0.08323096, -0.32134873, 2.14154425, -1.30430663, 0.64934781],
[ 0.56249379, 0.07851507, 0.63024234, -0.38683508, -1.75887624],
[-0.41063182, 0.15657594, 0.11175805, 0.37646245, 1.58261556],
[ 1.10421356, -0.26151637, 0.64442885, -1.23544526, -0.91119517],
[ 0.51384883, 1.5901419 , 1.92496778, -1.23541699, 1.00231508],
[-2.42759787, -0.23592018, -0.33534536, 0.17577329, -1.14793293],
[-0.06051458, 1.24004714, 1.23588228, -0.11727146, -0.02627196],
[ 1.66071534, -0.07734444, 1.40305686, -1.02098911, -1.10752638],
[ 0.12466003, -1.60874191, 1.81127175, 2.26257234, -1.26008476]])
This approach using basic numpy and list comprehensions works, but is the least performant. I'm leaving this answer here as it may be somewhat instructive. First we create a numpy matrix:
matrix = np.matrix(np.random.randn(2,2))
matrix is, e.g.:
matrix([[-0.84558168, 0.08836042],
[-0.01963479, 0.35331933]])
Now map 1 to a new matrix if the element is max, else 0:
newmatrix = np.matrix([[1 if i == row.max() else 0 for i in row]
for row in np.array(matrix)])
newmatrix is now:
matrix([[0, 1],
[0, 1]])
Y = np.random.rand(10,10)
X=np.zeros ((5,5))
y_insert=2
x_insert=3
offset = (1,2)
for index_x, row in enumerate(X):
    for index_y, e in enumerate(row):
        Y[index_x + offset[0]][index_y + offset[1]] = e