Say I have one matrix and one vector as follows:
import torch
x = torch.tensor([[1, 2, 3],
[4, 5, 6],
[7, 8, 9]])
y = torch.tensor([0, 2, 1])
Is there a way to index x with y so that the result is:
res = [1, 6, 8]
So basically, for each row i of x, I want the element in the column given by y[i]: the first element of y selects the column in the first row, the second element selects the column in the second row, and so on.
You can specify the corresponding row index as:
import torch
x = torch.tensor([[1, 2, 3],
[4, 5, 6],
[7, 8, 9]])
y = torch.tensor([0, 2, 1])
x[range(x.shape[0]), y]
tensor([1, 6, 8])
Advanced indexing in PyTorch works just like NumPy's, i.e. the indexing arrays are broadcast together across the axes. So you could do as in FBruzzesi's answer.
Though, similarly to np.take_along_axis, PyTorch also has torch.gather, which takes values along a specific axis:
x.gather(1, y.view(-1,1)).view(-1)
# tensor([1, 6, 8])
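For comparison, here is a minimal NumPy sketch of the same two approaches, assuming the data is held in NumPy arrays rather than torch tensors (the names x_np, y_np, picked and picked_along are only for illustration). It first pairs each row index with its column from y_np, then does the same lookup with np.take_along_axis.

```python
import numpy as np

x_np = np.array([[1, 2, 3],
                 [4, 5, 6],
                 [7, 8, 9]])
y_np = np.array([0, 2, 1])

# Advanced indexing: row i is paired with column y_np[i].
picked = x_np[np.arange(x_np.shape[0]), y_np]

# take_along_axis needs an index array with the same number of
# dimensions as x_np, hence the extra axis on y_np.
picked_along = np.take_along_axis(x_np, y_np[:, None], axis=1).ravel()

print(picked)        # [1 6 8]
print(picked_along)  # [1 6 8]
```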
I have the following matrix:
import numpy as np
A:
matrix([[ 1, 2, 3, 4],
[ 3, 4, 10, 8]])
The question is: how do I apply the following restriction? If any number in a column of the matrix A is less than or equal to (<=) K (here K = 3), then change the last number of that column to the minimum of the last entry of the column and 5. So basically, my matrix should transform to this:
A:
matrix([[ 1, 2, 3, 4],
[ 3, 4, 5, 8]])
I tried this function:
A[-1][np.any(A <= 3, axis=0)] = np.maximum(A[-1], 5)
But I have the following error:
TypeError: NumPy boolean array indexing assignment requires a 0 or 1-dimensional input, input has 2 dimensions
You should be using np.minimum here. Create a mask, and index, setting values accordingly.
B = np.array(A)
m = (B <= 3).any(0)
A[-1, m] = np.minimum(A[-1, m], 5)
A
matrix([[1, 2, 3, 4],
[3, 4, 5, 8]])
Here is one way:
A[-1][np.logical_and(A[-1] > 5, np.any(A <= 3, axis=0))] = 5
# matrix([[1, 2, 3, 4],
# [3, 4, 5, 8]])
This takes advantage of the fact that you only need to change a number if it is greater than 5. Therefore, the minimum criterion is taken care of by the A[-1] > 5 condition.
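As a rough check that the two answers agree, here is a small sketch assuming a plain ndarray instead of np.matrix (the names K, cap, cols, B and C are made up for this illustration). It builds the column mask once and applies both update rules to separate copies.

```python
import numpy as np

K, cap = 3, 5  # threshold and upper bound taken from the question
A = np.array([[1, 2, 3, 4],
              [3, 4, 10, 8]])

# Columns that contain at least one value <= K.
cols = (A <= K).any(axis=0)

# Variant 1: clip the last row of those columns with np.minimum.
B = A.copy()
B[-1, cols] = np.minimum(B[-1, cols], cap)

# Variant 2: only touch last-row entries that actually exceed the cap.
C = A.copy()
C[-1, cols & (C[-1] > cap)] = cap

print(B.tolist())            # [[1, 2, 3, 4], [3, 4, 5, 8]]
print(np.array_equal(B, C))  # True
```

With an ndarray, A[-1] is one-dimensional, so the boolean mask lines up with it directly.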
I am struggling with a pretty easy thing, but unfortunately I cannot solve it. I have a matrix of 64x64 elements, as you can see in the image, where the reds are zeros and the greens are the values I am interested in.
I would like to end up with only the lower triangular part below the diagonal (the green values) in one array.
I use Python 2.7
Thank you a lot,
Michael
Assuming you can pull your data into a numpy array, use the tril_indices function. It looks like your data doesn't include the main diagonal, so you can shift by -1:
data = np.arange(4096).reshape(64, 64)
inds = np.tril_indices(64, -1)
vals = data[inds]
You can use np.tril_indices, which returns the indices of the lower triangular part of a matrix with the given shape. The indices can then be used to extract values from the matrix. Suppose your matrix is called arr:
arr[np.tril_indices(n=64,m=64)]
You can provide an extra offset parameter if you want to exclude the diagonal:
arr[np.tril_indices(n = 64, m = 64, k = -1)]
An example:
arr = np.array([list(range(i, 5+i)) for i in range(5)])
arr
#array([[0, 1, 2, 3, 4],
# [1, 2, 3, 4, 5],
# [2, 3, 4, 5, 6],
# [3, 4, 5, 6, 7],
# [4, 5, 6, 7, 8]])
arr[np.tril_indices(n = 5, m = 5)]
# array([0, 1, 2, 2, 3, 4, 3, 4, 5, 6, 4, 5, 6, 7, 8])
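Two times faster than the index-based approach on this example (here n is the number of rows of arr, and only the values strictly below the diagonal are returned):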
np.concatenate([arr[i,:i] for i in range(1,n)])
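To make the relationship between the two approaches concrete, here is a small sketch on the 5x5 example array from above, assuming n = 5 (the names below_idx and below_cat are only illustrative). Both expressions collect the values strictly below the main diagonal, row by row.

```python
import numpy as np

n = 5
arr = np.array([list(range(i, n + i)) for i in range(n)])

# Index-based: k=-1 excludes the main diagonal.
below_idx = arr[np.tril_indices(n, k=-1)]

# Slice-based: row i contributes its first i entries.
below_cat = np.concatenate([arr[i, :i] for i in range(1, n)])

print(below_idx)                             # [1 2 3 3 4 5 4 5 6 7]
print(np.array_equal(below_idx, below_cat))  # True
```

No timing is shown here; which variant is faster may depend on the array size and NumPy version.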
I have a 3-dimensional numpy array. Its dimensions can go up to 128 x 64 x 8192. What I want to do is change the order along the first dimension by swapping entries pairwise.
The only idea I had so far is to create a list of the indices in the correct order.
order = [1,0,3,2...127,126]
data_new = data[order]
I fear that this is not very efficient, but I have no better idea so far.
You could reshape to split the first axis into two axes, such that the latter of those axes has length 2, then flip the array along that axis with [::-1], and finally reshape back to the original shape.
Thus, we would have an implementation like so -
a.reshape(-1,2,*a.shape[1:])[:,::-1].reshape(a.shape)
Sample run -
In [170]: a = np.random.randint(0,9,(6,3))
In [171]: order = [1,0,3,2,5,4]
In [172]: a[order]
Out[172]:
array([[0, 8, 5],
[4, 5, 6],
[0, 0, 2],
[7, 3, 8],
[1, 6, 3],
[2, 4, 4]])
In [173]: a.reshape(-1,2,*a.shape[1:])[:,::-1].reshape(a.shape)
Out[173]:
array([[0, 8, 5],
[4, 5, 6],
[0, 0, 2],
[7, 3, 8],
[1, 6, 3],
[2, 4, 4]])
Alternatively, if you want to efficiently create that pairwise-flipped index order, you could do something like this -
order = np.arange(data.shape[0]).reshape(-1,2)[:,::-1].ravel()
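The following sketch checks both ideas against each other on a small 3-D array, assuming the first axis has even length (the array contents and the name swapped are placeholders for illustration). The reshape-and-flip result should match indexing with the generated order.

```python
import numpy as np

# Small stand-in for the real (128, 64, 8192) array.
data = np.arange(6 * 2 * 3).reshape(6, 2, 3)

# Group the first axis into pairs, reverse each pair, restore the shape.
swapped = data.reshape(-1, 2, *data.shape[1:])[:, ::-1].reshape(data.shape)

# The same permutation expressed as an explicit index array.
order = np.arange(data.shape[0]).reshape(-1, 2)[:, ::-1].ravel()

print(order)                                 # [1 0 3 2 5 4]
print(np.array_equal(swapped, data[order]))  # True
```

Both variants rely on the first axis having an even number of entries; otherwise the reshape into pairs raises an error.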
Is it possible to calculate the mean of multiple arrays, when they may have different lengths? I am using numpy. So let's say I have:
numpy.array([[1, 2, 3, 4, 8], [3, 4, 5, 6, 0]])
numpy.array([[5, 6, 7, 8, 7, 8], [7, 8, 9, 10, 11, 12]])
numpy.array([[1, 2, 3, 4], [5, 6, 7, 8]])
Now I want to calculate the mean, but ignoring elements that are 'missing' (Naturally, I can not just append zeros as this would mess up the mean)
Is there a way to do this without iterating through the arrays?
PS. These arrays are all 2-D, and within a given array every row has the same length, i.e. the rows of the 1st array have lengths 5 and 5, the 2nd 6 and 6, the 3rd 4 and 4.
An example:
np.array([[1, 2], [3, 4]])
np.array([[1, 2, 3], [3, 4, 5]])
np.array([[7], [8]])
This must give
(1+1+7)/3 (2+2)/2 3/1
(3+3+8)/3 (4+4)/2 5/1
And graphically:
[1, 2] [1, 2, 3] [7]
[3, 4] [3, 4, 5] [8]
Now imagine that these 2-D arrays are placed on top of each other with coordinates overlapping contributing to that coordinate's mean.
I often needed this for plotting the mean of performance curves with different lengths.
Solved it with a simple function (based on the answer of @unutbu):
def tolerant_mean(arrs):
    lens = [len(i) for i in arrs]
    arr = np.ma.empty((np.max(lens),len(arrs)))
    arr.mask = True
    for idx, l in enumerate(arrs):
        arr[:len(l),idx] = l
    return arr.mean(axis = -1), arr.std(axis=-1)
y, error = tolerant_mean(list_of_ys_diff_len)
ax.plot(np.arange(len(y))+1, y, color='green')
So applying that function to the list of above-plotted curves yields the following:
numpy.ma.mean allows you to compute the mean of non-masked array elements. However, to use numpy.ma.mean, you have to first combine your three numpy arrays into one masked array:
import numpy as np
x = np.array([[1, 2], [3, 4]])
y = np.array([[1, 2, 3], [3, 4, 5]])
z = np.array([[7], [8]])
arr = np.ma.empty((2,3,3))
arr.mask = True
arr[:x.shape[0],:x.shape[1],0] = x
arr[:y.shape[0],:y.shape[1],1] = y
arr[:z.shape[0],:z.shape[1],2] = z
print(arr.mean(axis = 2))
yields
[[3.0 2.0 3.0]
[4.66666666667 4.0 5.0]]
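The same masked-array idea can be wrapped in a helper so the target shape is not hard-coded. This is only a sketch under the assumption that every input is a 2-D array; the function name masked_mean_2d is hypothetical and not part of NumPy. It still loops over the arrays themselves, though not over their elements.

```python
import numpy as np

def masked_mean_2d(arrays):
    """Elementwise mean of 2-D arrays, ignoring positions an array lacks."""
    rows = max(a.shape[0] for a in arrays)
    cols = max(a.shape[1] for a in arrays)
    # Start fully masked; assigning data unmasks just those positions.
    stacked = np.ma.masked_all((rows, cols, len(arrays)), dtype=float)
    for k, a in enumerate(arrays):
        stacked[:a.shape[0], :a.shape[1], k] = a
    return stacked.mean(axis=2)

result = masked_mean_2d([
    np.array([[1, 2], [3, 4]]),
    np.array([[1, 2, 3], [3, 4, 5]]),
    np.array([[7], [8]]),
])
print(result)
```

For the three example arrays this should reproduce the means printed above, for instance (1+1+7)/3 in the top-left position.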
The below function also works by adding columns of arrays of different lengths:
def avgNestedLists(nested_vals):
    """
    Averages a 2-D array and returns a 1-D array of all of the columns
    averaged together, regardless of their dimensions.
    """
    output = []
    maximum = 0
    for lst in nested_vals:
        if len(lst) > maximum:
            maximum = len(lst)
    for index in range(maximum): # Go through each index of longest list
        temp = []
        for lst in nested_vals: # Go through each list
            if index < len(lst): # If not an index error
                temp.append(lst[index])
        output.append(np.nanmean(temp))
    return output
Going off of your first example:
avgNestedLists([[1, 2, 3, 4, 8], [5, 6, 7, 8, 7, 8], [1, 2, 3, 4]])
Outputs:
[2.3333333333333335,
3.3333333333333335,
4.333333333333333,
5.333333333333333,
7.5,
8.0]
The reason np.amax(nested_vals) or np.max(nested_vals) was not used at the beginning to find the maximum length is that it will return an array if the nested lists are of different sizes.
OP, I know you were looking for a non-iterative built-in solution, but the following really only takes 3 lines (2 if you combine transpose and means but then it just gets messy):
arrays = [
    np.array([[1,2], [3,4]]),
    np.array([[1,2,3], [3,4,5]]),
    np.array([[7], [8]])
]
mean = lambda x: sum(x)/float(len(x))
transpose = [[item[i] for item in arrays] for i in range(len(arrays[0]))]
means = [[mean([j[i] for j in t if i < len(j)]) for i in range(len(max(t, key = len)))] for t in transpose]
Outputs:
>>> means
[[3.0, 2.0, 3.0], [4.666666666666667, 4.0, 5.0]]
For example, I have a ndarray that is:
a = np.array([1, 3, 5, 7, 2, 4, 6, 8])
Now I want to split a into two parts: one with all numbers <5 and the other with all numbers >=5:
[array([1,3,2,4]), array([5,7,6,8])]
Certainly I can traverse a and create two new arrays, but I want to know whether numpy provides a better way.
Similarly, for multidimensional array, e.g.
array([[1, 2, 3],
[4, 5, 6],
[7, 8, 9],
[2, 4, 7]])
I want to split it according to whether the first column is <3 or >=3, so that the result is:
[array([[1, 2, 3],
[2, 4, 7]]),
array([[4, 5, 6],
[7, 8, 9]])]
Is there a better way than traversing it? Thanks.
import numpy as np
def split(arr, cond):
    # cond is a boolean mask; ~cond selects everything the mask rejects
    return [arr[cond], arr[~cond]]
a = np.array([1,3,5,7,2,4,6,8])
print(split(a, a<5))
a = np.array([[1,2,3],[4,5,6],[7,8,9],[2,4,7]])
print(split(a, a[:,0]<3))
This produces the following output:
[array([1, 3, 2, 4]), array([5, 7, 6, 8])]
[array([[1, 2, 3],
[2, 4, 7]]), array([[4, 5, 6],
[7, 8, 9]])]
This might be a quick solution:
a = np.array([1,3,5,7])
b = a >= 3 # variable with condition
a[b] # to slice the array
len(a[b]) # count the elements in sliced array
1-D array:
a = numpy.array([2,3,4,...])
a_new = a[(a < 5)] # to get elements less than 5
2-D array, filtered on one column (suppose the value in column i should be less than 5):
a = numpy.array([[1,2],[5,6],...])
a = a[(a[:,i] < 5)]
If your condition is based on multiple columns, you can build a new array by applying the conditions to those columns. Then you can compare that new array with the value 5 (according to my assumption) to get the indices and follow the code above.
Note that whatever I have written inside () evaluates to a boolean array, which is then used as the index.
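For a condition that spans several columns, one possible sketch is to combine the per-column boolean arrays into a single mask and index with it. The sample data, the thresholds and the names mask, selected and rejected below are assumptions made up for illustration.

```python
import numpy as np

a = np.array([[1, 2],
              [5, 6],
              [3, 9],
              [4, 1]])

# Combine per-column conditions with & (and), | (or), ~ (not).
# Each comparison needs its own parentheses because & binds more
# tightly than < and >.
mask = (a[:, 0] < 5) & (a[:, 1] > 1)

selected = a[mask]    # rows where both conditions hold
rejected = a[~mask]   # all remaining rows

print(selected.tolist())  # [[1, 2], [3, 9]]
print(rejected.tolist())  # [[5, 6], [4, 1]]
```

The mask is a boolean array with one entry per row, so mask.sum() gives the number of selected rows without building the sliced array.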