I'm new to numpy and I want the value of the dot product of A[i] and B[i] stored in a new matrix. I have written code that does that, but it's using a nested loop and I know there has to be a more numpy way of doing it. Can anyone help please?
import numpy as np

A = np.array([[0, 1, 1, 1, 0],
              [0, 0, 0, 1, 1],
              [0, 1, 0, 1, 1],
              [0, 0, 0, 0, 0],
              [1, 0, 0, 1, 0]])
B = np.transpose(A)
l = []
for i in A:
    for j in B:
        l.append(np.dot(i, j))
np.array(l).reshape(5,5)
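One possible vectorized form, offered as a sketch rather than a definitive answer: the nested loop stores at position [i, j] the dot product of row i of A with row j of B, which is the matrix product of A with the transpose of B. The names `vectorized` and `looped` are introduced here only for illustration.

```python
import numpy as np

A = np.array([[0, 1, 1, 1, 0],
              [0, 0, 0, 1, 1],
              [0, 1, 0, 1, 1],
              [0, 0, 0, 0, 0],
              [1, 0, 0, 1, 0]])
B = np.transpose(A)

# Entry [i, j] is the dot product of row i of A with row j of B,
# i.e. the matrix product of A with B transposed.
vectorized = A @ B.T

# Reference result built the same way as the nested loop in the question.
looped = np.array([np.dot(i, j) for i in A for j in B]).reshape(5, 5)

print(np.array_equal(vectorized, looped))  # True
```

Since B is itself the transpose of A here, `A @ B.T` is the same as `A @ A` for this particular input.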
So here is what I can get with torch.eye(3,4) now
The matrix I get:
[[1, 0, 0, 0],
[0, 1, 0, 0],
[0, 0, 1, 0]]
Is there any (easy) way to transform it, or to make such a mask in this format:
The matrix I want:
[[0, 1, 0, 0],
[0, 0, 1, 0],
[0, 0, 0, 1]]
You can do it by using torch.diag and specifying the diagonal you want:
>>> torch.diag(torch.tensor([1,1,1]), diagonal=1)[:-1]
tensor([[0, 1, 0, 0],
[0, 0, 1, 0],
[0, 0, 0, 1]])
If diagonal = 0, it is the main diagonal.
If diagonal > 0, it is above the main diagonal.
If diagonal < 0, it is below the main diagonal.
Here is another solution using torch.diagflat(), and using a positive offset for shifting/moving the diagonal above the main diagonal.
# diagonal values to fill
In [253]: diagonal_vals = torch.ones(3, dtype=torch.long)
# desired tensor but ...
In [254]: torch.diagflat(diagonal_vals, offset=1)
Out[254]:
tensor([[0, 1, 0, 0],
[0, 0, 1, 0],
[0, 0, 0, 1],
[0, 0, 0, 0]])
The above operation gives us a square matrix; however, we need a non-square matrix of shape (3,4). So, we'll just ignore the last row with simple indexing:
# shape (3, 4) with 1's above the main diagonal
In [255]: torch.diagflat(diagonal_vals, offset=1)[:-1]
Out[255]:
tensor([[0, 1, 0, 0],
[0, 0, 1, 0],
[0, 0, 0, 1]])
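For comparison, here is a minimal sketch of the same mask built in NumPy, assuming a NumPy array is acceptable as an intermediate (it could then be converted with torch.from_numpy). np.eye takes an offset argument k, so no extra row needs to be sliced away. The name `mask` is only illustrative.

```python
import numpy as np

# k=1 places the ones on the first diagonal above the main diagonal,
# and the (3, 4) shape is requested directly.
mask = np.eye(3, 4, k=1, dtype=np.int64)
print(mask)
# [[0 1 0 0]
#  [0 0 1 0]
#  [0 0 0 1]]
```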
I'm trying to analyse map terrain given by the StarCraft 2 bot API.
A beginner's task for this analysis was finding cliffs for reapers, which are special units in SC2 that can jump up and down cliffs.
To solve this, I analyse points where the point itself is not pathable (=cliff) and the northern and southern two points are pathable. Pathable points are marked as 1 and not pathable as 0 in the array.
The terrain map exists as a 2D numpy array. The following is a small excerpt from a larger 200x200 array:
import numpy as np
example = np.array([[0, 0, 0, 0],
[0, 1, 1, 0],
[0, 0, 0, 0],
[0, 1, 1, 0],
[0, 0, 0, 0]])
Here, the points [2, 1] and [2, 2] would match the criteria where the points themselves are not pathable (=0) and the points above and below them are pathable (=1).
This can be achieved by the following code:
above = np.roll(example, 1, axis=0) # Shift rows downwards
below = np.roll(example, -1, axis=0) # Shift rows upwards
result = np.zeros_like(example) # Create array with zeros
result[(example == 0) & (above == 1) & (below == 1)] = 1 # Set cells to 1 that match condition
print(repr(result))
# array([[0, 0, 0, 0],
#        [0, 0, 0, 0],
#        [0, 1, 1, 0],
#        [0, 0, 0, 0],
#        [0, 0, 0, 0]])
Now my question is if the same can be achieved with less code?
The np.roll function creates a new np.array object each time, so analysing hundreds of nearby points could result in around 100 lines of unnecessary code and high memory usage.
I'm trying to find something similar to
result = np.zeros_like(example)
result[(example == 0) & (example[-1, 0] == 1) & (example[1, 0] == 1)] = 1
# or
result[(example == 0) & (example[-1:2, 0].sum() == 2)] = 1
Here the numbers in the brackets display the relative position to the currently analysed point, but I don't know if there is a way to get this to work with numpy.
Also the result for the zeroth row wouldn't be clear when checking the point "above" it: It could access either the last row, result in an error or return a default value (0 or 1).
Edit:
I found this post and it pointed me towards the scipy convolve2d function which can be applied here, which might be what I am looking for:
import numpy as np
from scipy import signal
example = np.array([[0, 0, 0, 0],
[0, 1, 1, 0],
[0, 0, 0, 0],
[0, 1, 1, 0],
[0, 0, 0, 0]])
kernel = np.zeros((3, 3), dtype=int)
kernel[::2, 1] = 1
print(repr(kernel))
# array([[0, 1, 0],
# [0, 0, 0],
# [0, 1, 0]])
result2 = signal.convolve2d(example, kernel, mode="same")
print(repr(result2))
# array([[0, 1, 1, 0],
# [0, 0, 0, 0],
# [0, 2, 2, 0],
# [0, 0, 0, 0],
# [0, 1, 1, 0]])
result2[result2 < 2] = 0
result2[result2 == 2] = 1
print(repr(result2))
# array([[0, 0, 0, 0],
# [0, 0, 0, 0],
# [0, 1, 1, 0],
# [0, 0, 0, 0],
# [0, 0, 0, 0]])
Edit2:
Another solution may be scipy.ndimage.minimum_filter which seems to work similarly:
import numpy as np
from scipy import ndimage
example = np.array([[0, 0, 0, 0],
[0, 1, 1, 0],
[0, 0, 0, 0],
[0, 1, 1, 0],
[0, 0, 0, 0]])
kernel = np.zeros((3, 3), dtype=int)
kernel[::2, 1] = 1
print(repr(kernel))
# array([[0, 1, 0],
# [0, 0, 0],
# [0, 1, 0]])
result3 = ndimage.minimum_filter(example, footprint=kernel, mode="constant")
print(repr(result3))
# array([[0, 0, 0, 0],
# [0, 0, 0, 0],
# [0, 1, 1, 0],
# [0, 0, 0, 0],
# [0, 0, 0, 0]])
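As a further option, the relative-position idea from the question can be sketched with plain slicing. This is only one possible approach, and it assumes that border rows should never match (they have no row above or below), which also avoids the wrap-around that np.roll produces for the first and last row. The names `centre`, `above` and `below` are illustrative.

```python
import numpy as np

example = np.array([[0, 0, 0, 0],
                    [0, 1, 1, 0],
                    [0, 0, 0, 0],
                    [0, 1, 1, 0],
                    [0, 0, 0, 0]])

result = np.zeros_like(example)

# Slices are views into the same array, so no shifted copies are made.
centre = example[1:-1]   # rows 1 .. n-2
above = example[:-2]     # the row directly above each centre row
below = example[2:]      # the row directly below each centre row

# Border rows keep their initial value of 0 (assumption stated above).
result[1:-1] = (centre == 0) & (above == 1) & (below == 1)
print(repr(result))
```

The comparisons still allocate temporary boolean arrays, but further neighbours can be checked by adding more slices with different offsets instead of more np.roll calls.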
I have a raster with a set of unique ID patches/regions which I've converted into a two-dimensional Python numpy array. I would like to calculate pairwise Euclidean distances between all regions to obtain the minimum distance separating the nearest edges of each raster patch. As the array was originally a raster, a solution needs to account for diagonal distances across cells (I can always convert any distances measured in cells back to metres by multiplying by the raster resolution).
I've experimented with the cdist function from scipy.spatial.distance as suggested in this answer to a related question, but so far I've been unable to solve my problem using the available documentation. As an end result I would ideally have a 3 by X array in the form of "from ID, to ID, distance", including distances between all possible combinations of regions.
Here's a sample dataset resembling my input data:
import numpy as np
import matplotlib.pyplot as plt
# Sample study area array
example_array = np.array([[0, 0, 0, 2, 2, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 2, 0, 2, 2, 0, 6, 0, 3, 3, 3],
[0, 0, 0, 0, 2, 2, 0, 0, 0, 3, 3, 3],
[0, 0, 0, 0, 0, 0, 0, 0, 3, 0, 3, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 3, 3],
[1, 1, 0, 0, 0, 0, 0, 0, 3, 3, 3, 3],
[1, 1, 1, 0, 0, 0, 3, 3, 3, 0, 0, 3],
[1, 1, 1, 0, 0, 0, 3, 3, 3, 0, 0, 0],
[1, 1, 1, 0, 0, 0, 3, 3, 3, 0, 0, 0],
[1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[1, 0, 1, 0, 0, 0, 0, 5, 5, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 4]])
# Plot array
plt.imshow(example_array, cmap="nipy_spectral", interpolation='nearest')
Distances between labeled regions of an image can be calculated with the following code,
import itertools
from scipy.spatial.distance import cdist
# making sure that IDs are integer
example_array = np.asarray(example_array, dtype=int)
# we assume that IDs start from 1; range(1, n) below collects the n-1 IDs from 1 to n-1
n = example_array.max()
indexes = []
for k in range(1, n):
    tmp = np.nonzero(example_array == k)
    tmp = np.asarray(tmp).T
    indexes.append(tmp)
# calculating the distance matrix
distance_matrix = np.zeros((n-1, n-1), dtype=float)
for i, j in itertools.combinations(range(n-1), 2):
    # use squared Euclidean distance (more efficient), and take the square root only of the single element we are interested in.
    d2 = cdist(indexes[i], indexes[j], metric='sqeuclidean')
    distance_matrix[i, j] = distance_matrix[j, i] = d2.min()**0.5
# mapping the distance matrix to labeled IDs (could be improved/extended)
labels_i, labels_j = np.meshgrid(range(1, n), range(1, n))
results = np.dstack((labels_i, labels_j, distance_matrix)).reshape((-1, 3))
print(distance_matrix)
print(results)
This assumes integer IDs, and would need to be extended if that is not the case. For instance, with the test data above, the calculated distance matrix is,
# From 1 2 3 4 5 # To
[[ 0. 4.12310563 4. 9.05538514 5. ] # 1
[ 4.12310563 0. 3.16227766 10.81665383 8.24621125] # 2
[ 4. 3.16227766 0. 4.24264069 2. ] # 3
[ 9.05538514 10.81665383 4.24264069 0. 3.16227766] # 4
[ 5. 8.24621125 2. 3.16227766 0. ]] # 5
while the full output can be found here. Note that this takes the Euclidean distance from the center of each pixel. For instance, the distance between zones 3 and 5 is 2.0, while they are separated by 1 pixel.
This is a brute-force approach, where we calculate all the pairwise distances between pixels of different regions. This should be sufficient for most applications. Still, if you need better performance, have a look at scipy.spatial.cKDTree which would be more efficient in computing the minimum distance between two regions, when compared to cdist.
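A minimal sketch of the cKDTree idea for a single pair of regions follows. It is an illustration under assumptions, not a tuned implementation: the helper names `region_coords` and `min_distance` are hypothetical, and distances are again measured between pixel centers.

```python
import numpy as np
from scipy.spatial import cKDTree

example_array = np.array([[0, 0, 0, 2, 2, 0, 0, 0, 0, 0, 0, 0],
                          [0, 0, 2, 0, 2, 2, 0, 6, 0, 3, 3, 3],
                          [0, 0, 0, 0, 2, 2, 0, 0, 0, 3, 3, 3],
                          [0, 0, 0, 0, 0, 0, 0, 0, 3, 0, 3, 0],
                          [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 3, 3],
                          [1, 1, 0, 0, 0, 0, 0, 0, 3, 3, 3, 3],
                          [1, 1, 1, 0, 0, 0, 3, 3, 3, 0, 0, 3],
                          [1, 1, 1, 0, 0, 0, 3, 3, 3, 0, 0, 0],
                          [1, 1, 1, 0, 0, 0, 3, 3, 3, 0, 0, 0],
                          [1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0],
                          [1, 0, 1, 0, 0, 0, 0, 5, 5, 0, 0, 0],
                          [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 4]])


def region_coords(labels, region_id):
    """Return the (row, col) coordinates of all cells with the given ID."""
    return np.argwhere(labels == region_id)


def min_distance(labels, id_a, id_b):
    """Smallest center-to-center distance between two labeled regions."""
    # Build a tree on one region and query the nearest neighbour
    # of every cell of the other region.
    tree = cKDTree(region_coords(labels, id_a))
    distances, _ = tree.query(region_coords(labels, id_b), k=1)
    return distances.min()


print(min_distance(example_array, 3, 5))  # 2.0
```

For many regions, building one tree per region once and reusing it across pairs would avoid repeated construction.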
Using theano tensor operations, how can I toggle one cell on each row of a matrix based on an integer position indicator at the corresponding row index of a vector (i.e. |v| = rows of the matrix)? For example, given a 100x5 matrix of zeros
M = [
[0, 0, 0, 0, 0],
[0, 0, 0, 0, 0],
...
[0, 0, 0, 0, 0],
[0, 0, 0, 0, 0]
] # |M| = 100x5
and a 100-element vector of integers in the range [0, 4],
V = [2, 4, ..., 0, 2] # |V| = 100, max(V) = 4, min(V) = 0
update (or create another) matrix M to
M = [
[0, 0, 1, 0, 0],
[0, 0, 0, 0, 1],
...
[1, 0, 0, 0, 0],
[0, 0, 1, 0, 0]
] # |M| = 100x5
(I know how to do this iteratively using conventional code, but I want to run it as part of an algorithm on GPU without complicating my input, which is currently vector V, so a direct theano implementation would be great.)
I figured out the answer myself. This operation is known as one-hot and it is supported as the "to_one_hot" in Theano's extra_ops package. Code:
M_one_hot = theano.tensor.extra_ops.to_one_hot(V, 5, dtype='int32')
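To make the expected result concrete, here is a small NumPy sketch of the same one-hot encoding. It is a CPU analogue for checking the output, not the Theano call itself, and it uses a hypothetical four-element V instead of the 100-element vector from the question.

```python
import numpy as np

V = np.array([2, 4, 0, 2])   # hypothetical short position vector
num_classes = 5              # number of columns, as in the 100x5 example

# Start from zeros and set one cell per row, at the column given by V.
M = np.zeros((V.size, num_classes), dtype=np.int32)
M[np.arange(V.size), V] = 1
print(M)
# [[0 0 1 0 0]
#  [0 0 0 0 1]
#  [1 0 0 0 0]
#  [0 0 1 0 0]]
```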