I have learned how to slice a tensor on one dimension here.
I have learned how to slice a 2D tensor giving a 1D tensor of specific values here.
Both use tf.gather() but I'm pretty sure I need tf.gather_nd() though I'm obviously using it wrong.
In NumPy, I have a 5x5 2D array, and I can slice a 2x2 array out of it by using np.ix_() with row and column indices (I always need the same indices for rows and columns, resulting in a square matrix):
import numpy as np
a = np.array([[1,2,3,4,5],[2,1,6,7,8],[3,6,1,9,10],[4,7,9,1,11],[5,8,10,11,1]])
a
array([[ 1, 2, 3, 4, 5],
[ 2, 1, 6, 7, 8],
[ 3, 6, 1, 9, 10],
[ 4, 7, 9, 1, 11],
[ 5, 8, 10, 11, 1]])
a[np.ix_([1,3], [1,3])]
array([[1, 7],
[7, 1]])
Reading over the tf.gather_nd() docs I assumed this is the way to do it in TF, but I'm using it wrong:
import tensorflow as tf
a = tf.constant([[1,2,3,4,5],[2,1,6,7,8],[3,6,1,9,10],[4,7,9,1,11],[5,8,10,11,1]])
tf.gather_nd(a, [[1,3], [1,3]])
<tf.Tensor: shape=(2,), dtype=int32, numpy=array([7, 7])>
I would have to do something like:
tf.gather_nd(a, [[[1,1], [1,3]],[[3,1],[3,3]]])
<tf.Tensor: shape=(2, 2), dtype=int32, numpy=
array([[1, 7],
[7, 1]])>
Which leads me down another rabbit hole I'm not keen on. My indices vector is a lot longer of course.
My indices, BTW, are 1D integer tensors themselves. So, bottom line, I want to slice a with the same indices for rows and columns as I do with np.ix_(), and my indices are something like:
idx = tf.constant([1, 3])
# tf.gather_nd(a, indices = "something with idx")
To slice an n x n 2D array with a 1D tensor of length d, giving a d x d 2D array at the specified indices, you can use tf.repeat, tf.tile and then tf.stack:
n = 5
a = tf.constant(np.arange(n * n).reshape(n, n)) # 2D nxn array
idx = [1,2,4] # 1D tensor with length d
d = tf.shape(idx)[0]
ix_ = tf.reshape(tf.stack([tf.repeat(idx,d),tf.tile(idx,[d])],1),[d,d,2])
target = tf.gather_nd(a,ix_) # 2D dxd array
print(a)
print(target)
Expected outputs:
tf.Tensor(
[[ 0 1 2 3 4]
[ 5 6 7 8 9]
[10 11 12 13 14]
[15 16 17 18 19]
[20 21 22 23 24]], shape=(5, 5), dtype=int64)
tf.Tensor(
[[ 6 7 9]
[11 12 14]
[21 22 24]], shape=(3, 3), dtype=int64)
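As a cross-check of the index construction above, here is a minimal sketch that builds the same (d, d, 2) grid of (row, column) pairs and compares the result with np.ix_. NumPy stands in for TensorFlow here (my substitution, not part of the answer), and the names rows, cols and pairs are illustrative only.

```python
import numpy as np

n = 5
a = np.arange(n * n).reshape(n, n)   # same 5x5 array as above
idx = np.array([1, 2, 4])
d = idx.shape[0]

# Every row index repeated d times, paired with the full index list tiled d times
rows = np.repeat(idx, d)             # [1 1 1 2 2 2 4 4 4]
cols = np.tile(idx, d)               # [1 2 4 1 2 4 1 2 4]
pairs = np.stack([rows, cols], axis=1).reshape(d, d, 2)

# pairs[i, j] is the (row, column) pair for output cell (i, j)
target = a[pairs[..., 0], pairs[..., 1]]
expected = a[np.ix_(idx, idx)]

print(target)
print(np.array_equal(target, expected))
```

In TensorFlow, chaining two gathers, tf.gather(tf.gather(a, idx, axis=0), idx, axis=1), should produce the same d x d result without building the index grid.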
I am trying to find a numpy operation that gives me the scalar products between all vectors of a 2d array at index i of a 3d array and the vector at index i of a 2d array. Let me give you an example to explain what I am thinking of:
x = np.array([[[1,2,3],
[2,3,4]],
[[11,12,13],
[12,13,14]]])
y = np.array([[1,1,1],
[2,2,2]])
np.?operation?(x,y.T)
output:
[[[1*1 + 1*2 + 1*3],
[1*2 + 1*3 + 1*4]],
[[2*11 + 2*12 + 2*13],
[2*12 + 2*13 + 2*14]]]
= [[[6],
[9]],
[[72],
[78]]]
As you can see, I am basically looking for a reduced dot product operation. The dot product of x and y would yield the following:
np.dot(x, y.T)
output:
[[[ 6 12]
[ 9 18]]
[[36 72]
[39 78]]]
Or is there a way to extract the results I need from the dot product result?
I have also tried np.tensordot(x, y, axes) but I was not able to figure out which tuples I should pass for axes.
I have also come across the np.einsum() operation but couldn't wrap my head around how it could help me with my problem.
It should be doable with np.einsum or np.matmul/@, which has a "batch" operation on the leading dimension. But sorting out your dimensions, and getting the (2,2,1) shape, is a bit tricky.
Your np.dot(x, y.T) gives the numbers you want, but you have to extract a kind of diagonal over two of its axes, while retaining the remaining one.
Here's one way of doing this. It isn't the fastest or the most succinct, but it should help me wrap my mind around the dimensions.
In [432]: y[:,None,:]
Out[432]:
array([[[1, 1, 1]],
[[2, 2, 2]]])
In [433]: y[:,None,:].repeat(2,1)
Out[433]:
array([[[1, 1, 1],
[1, 1, 1]],
[[2, 2, 2],
[2, 2, 2]]])
In [435]: x*y[:,None,:].repeat(2,1)
Out[435]:
array([[[ 1, 2, 3],
[ 2, 3, 4]],
[[22, 24, 26],
[24, 26, 28]]])
In [436]: (x*y[:,None,:].repeat(2,1)).sum(axis=-1, keepdims=True)
Out[436]:
array([[[ 6],
[ 9]],
[[72],
[78]]])
We don't need the repeat, broadcasting will take its place:
(x*y[:,None,:]).sum(axis=-1, keepdims=True)
This einsum does the same as the dot/@:
In [441]: np.einsum('ijk,lk->ijl',x,y)
Out[441]:
array([[[ 6, 12],
[ 9, 18]],
[[36, 72],
[39, 78]]])
Change the indices a bit to get the "diagonal" (i in all terms)
In [442]: np.einsum('ijk,ik->ij',x,y)
Out[442]:
array([[ 6, 9],
[72, 78]])
and add a trailing dimension:
In [443]: np.einsum('ijk,ik->ij',x,y)[:,:,None]
Out[443]:
array([[[ 6],
[ 9]],
[[72],
[78]]])
Now that I have the einsum I can visualize the matmul/@ dimensions. I need to treat the first dimension of both as the 'batch', and add a new trailing dimension to y, making it (2,3,1). (2,2,3) with (2,3,1) => (2,2,1) with sum-of-products on the 3.
In [445]: x@y[:,:,None]
Out[445]:
array([[[ 6],
[ 9]],
[[72],
[78]]])
If x and y were (4,2,3) and (4,3) shaped, this dimension matching would have been more obvious.
In [446]: X=x.repeat(2,0)
In [447]: Y=y.repeat(2,0)
In [448]: X.shape
Out[448]: (4, 2, 3)
In [449]: Y.shape
Out[449]: (4, 3)
In [450]: X@Y[:,:,None] # (4,2,1)
Out[450]:
array([[[ 6],
[ 9]],
[[ 6],
[ 9]],
[[72],
[78]],
[[72],
[78]]])
With these shapes it's more obvious that 4 is the batch, and 3 is the sum-of-products.
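To tie the variants above together, the following sketch checks that the broadcasting, einsum and matmul versions agree, and also shows one way to pull the wanted values straight out of the full np.dot result, which the question asked about. The via_* names are mine and purely illustrative.

```python
import numpy as np

x = np.array([[[1, 2, 3], [2, 3, 4]],
              [[11, 12, 13], [12, 13, 14]]])   # shape (2, 2, 3)
y = np.array([[1, 1, 1], [2, 2, 2]])           # shape (2, 3)

# Broadcast y over the middle axis, then sum over the last axis
via_broadcast = (x * y[:, None, :]).sum(axis=-1, keepdims=True)

# i is the shared batch index, k is summed away
via_einsum = np.einsum('ijk,ik->ij', x, y)[:, :, None]

# Batched matrix product: (2, 2, 3) @ (2, 3, 1) -> (2, 2, 1)
via_matmul = x @ y[:, :, None]

# "Diagonal" of the full dot product: keep only entries [i, j, i]
full = np.dot(x, y.T)                          # shape (2, 2, 2)
i = np.arange(x.shape[0])
via_dot = full[i, :, i][:, :, None]

print(via_matmul)
```

The np.dot route computes every pairing and then discards most of them, so for large batches the einsum or matmul forms are likely the better choice.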
I have three numpy arrays. The shape of the first is (413, 2), the shape of the second is (176, 2), and the shape of the third is (589,). If you'll notice, 413 + 176 = 589. What I want to accomplish is to use the 589 values of the third np array and make the first two arrays of shapes (413, 3) and (176, 3) respectively.
So, what I want is to take the values in the third np array and append them to the columns of the first and second np arrays. I can do the logic for applying to the first and then using the offset of the length of the first to continue appending to the second with the correct values. I suppose I could also combine np arrays 1 and 2, they are separated for a reason though because of my data preprocessing.
To put it visually if that helps, what I have is like this:
Array 1:
[[1 2]
[3 4]
[4 5]]
Array 2:
[[6 7]
[8 9]
[10 11]]
Array 3:
[1 2 3 4 5 6]
And what I want to have is:
Array 1:
[[1 2 1]
[3 4 2]
[4 5 3]]
Array 2:
[[6 7 4]
[8 9 5]
[10 11 6]]
I've tried using np.append, np.concatenate, and np.vstack but have not been able to achieve what I am looking for. I am relatively new to using numpy, and Python in general, so I imagine I am just using these tools incorrectly.
Many thanks for any help that can be offered! This is my first time asking a question here so if I did anything wrong or left anything out please let me know.
Split the third array using the length of array1, then horizontally stack them. You need to use either np.newaxis or array.reshape to change the dimensionality of the slice of array3.
import numpy as np
array1 = np.array(
[[1, 2],
[3, 4],
[4, 5]]
)
array2 = np.array(
[[6, 7],
[8, 9],
[10, 11]]
)
array3 = np.array([1, 2, 3, 4, 5, 6])
array13 = np.hstack([array1, array3[:len(array1), np.newaxis]])
array23 = np.hstack([array2, array3[len(array1):, np.newaxis]])
Outputs:
array13
array([[1, 2, 1],
[3, 4, 2],
[4, 5, 3]])
array23
array([[ 6, 7, 4],
[ 8, 9, 5],
[10, 11, 6]])
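If there are more than two arrays, the same idea generalizes. This is a minimal sketch, assuming the values are ordered to match the arrays one after another; append_column_chunks is a hypothetical helper name, not a NumPy function.

```python
import numpy as np

def append_column_chunks(arrays, values):
    """Split `values` to match the row counts of `arrays` and
    append each chunk to its array as a new last column."""
    lengths = [len(arr) for arr in arrays]
    if sum(lengths) != len(values):
        raise ValueError("values must have one entry per row of the arrays")
    # Split positions are the cumulative row counts, excluding the total
    split_points = np.cumsum(lengths)[:-1]
    chunks = np.split(values, split_points)
    # column_stack treats each 1D chunk as a column
    return [np.column_stack([arr, chunk]) for arr, chunk in zip(arrays, chunks)]

array1 = np.array([[1, 2], [3, 4], [4, 5]])
array2 = np.array([[6, 7], [8, 9], [10, 11]])
array3 = np.array([1, 2, 3, 4, 5, 6])

array13, array23 = append_column_chunks([array1, array2], array3)
print(array13)
print(array23)
```

np.column_stack avoids the explicit np.newaxis, since it converts 1D inputs into columns itself.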
Let's say I have a matrix:
>> a = np.arange(25).reshape(5, 5)
>> a
[[ 0 1 2 3 4]
[ 5 6 7 8 9]
[10 11 12 13 14]
[15 16 17 18 19]
[20 21 22 23 24]]
and two vectors of indices that define a span of matrix elements that I want to extract:
>> indices1 = np.array([0, 1, 1, 0, 0])
>> indices2 = np.array([2, 3, 3, 2, 2])
As you can see, the difference between each pair of corresponding indices is equal to 2.
I would like to do something like this to extract a part of the matrix:
>> submatrix = a[indices1:indices2, :]
so that the result would be 2x5 matrix:
>> submatrix
[[ 0 6 7 3 4],
[ 5 11 12 8 9]]
As far as I know, NumPy allows slice boundaries to be given as integers, e.g. a[0:2], but not as arrays.
Note that what I want to extract is not a contiguous submatrix.
Do you know of some other way of indexing a NumPy matrix so that spans can be defined by arrays? So far I have only managed to do it with for loops.
For reference, the most obvious loop (still took several experimental steps):
In [87]: np.concatenate([a[i:j,n] for n,(i,j) in enumerate(zip(indices1,indices2))], ).reshape(-1,2).T
Out[87]:
array([[ 0, 6, 7, 3, 4],
[ 5, 11, 12, 8, 9]])
Broadcasted indices taking advantage of the constant length:
In [88]: indices1+np.arange(2)[:,None]
Out[88]:
array([[0, 1, 1, 0, 0],
[1, 2, 2, 1, 1]])
In [89]: a[indices1+np.arange(2)[:,None],np.arange(5)]
Out[89]:
array([[ 0, 6, 7, 3, 4],
[ 5, 11, 12, 8, 9]])
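The broadcasted indexing above can be wrapped in a small function. This is a sketch under the same assumption as the answer, namely that every span has the same length; extract_spans is an illustrative name.

```python
import numpy as np

def extract_spans(a, starts, length):
    """Return an array of shape (length, n_cols) whose column n
    holds a[starts[n]:starts[n] + length, n]."""
    starts = np.asarray(starts)
    rows = starts + np.arange(length)[:, None]   # shape (length, n_cols)
    cols = np.arange(a.shape[1])                 # shape (n_cols,)
    return a[rows, cols]

a = np.arange(25).reshape(5, 5)
indices1 = np.array([0, 1, 1, 0, 0])
indices2 = np.array([2, 3, 3, 2, 2])

# All spans must share one length for the result to be rectangular
length = int(indices2[0] - indices1[0])
submatrix = extract_spans(a, indices1, length)
print(submatrix)

# Equivalent formulation: pick rows per column with take_along_axis
rows = indices1 + np.arange(length)[:, None]
same = np.take_along_axis(a, rows, axis=0)
```

If the spans had different lengths, the result would be ragged and a loop (or padding plus a mask) would be needed instead.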
I have two arrays A and B. In NumPy you can use A as an index to B e.g.
A = np.array([[1,2,3,1,7,3,1,2,3],[4,5,6,4,5,6,4,5,6],[7,8,9,7,8,9,7,8,9]])
B= np.array([1,2,3,4,5,6,7,8,9,0])
c = B[A]
Which produces:
[[2 3 4 2 8 4 2 3 4] [5 6 7 5 6 7 5 6 7] [8 9 0 8 9 0 8 9 0]]
However, in my case the arrays A and B are SciPy CSR sparse arrays and they don't seem to support indexing.
A_sparse = sparse.csr_matrix(A)
B_sparse = sparse.csr_matrix(B)
c = B_sparse[A_sparse]
This results in:
IndexError: Indexing with sparse matrices is not supported except boolean indexing where matrix and index are equal shapes.
I've come up with the function below to replicate NumPy's behavior with the sparse arrays:
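from scipy import sparse

def index_sparse(A, B):
    # COO format exposes the row, col and data arrays of the stored entries
    A_sparse = sparse.coo_matrix(A)
    B_sparse = sparse.csr_matrix(B)
    res = sparse.csr_matrix(A_sparse)
    # Replace each stored value v of A with B[v]
    for i, j, v in zip(A_sparse.row, A_sparse.col, A_sparse.data):
        res[i, j] = B_sparse[0, v]
    return res

res = index_sparse(A, B)
print(res.todense())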
Looping over the array and having to create a new array in Python isn't ideal. Is there a better way of doing this using built-in functions from SciPy/ NumPy?
Sparse indexing is less developed. coo format for example doesn't implement it at all.
I haven't tried to implement this problem, though I have answered others that involve working with the sparse format attributes. So I'll just make some general observations.
B_sparse is a matrix, so its shape is (1,10). So the equivalent to B[A] is
In [294]: B_sparse[0,A]
Out[294]:
<3x9 sparse matrix of type '<class 'numpy.int32'>'
with 24 stored elements in Compressed Sparse Row format>
In [295]: _.A
Out[295]:
array([[2, 3, 4, 2, 8, 4, 2, 3, 4],
[5, 6, 7, 5, 6, 7, 5, 6, 7],
[8, 9, 0, 8, 9, 0, 8, 9, 0]], dtype=int32)
B_sparse[A,:] or B_sparse[:,A] gives a 3d warning, since it would be trying to create a matrix version of:
In [298]: B[None,:][:,A]
Out[298]:
array([[[2, 3, 4, 2, 8, 4, 2, 3, 4],
[5, 6, 7, 5, 6, 7, 5, 6, 7],
[8, 9, 0, 8, 9, 0, 8, 9, 0]]])
As to your function:
A_sparse.nonzero() does A_sparse.tocoo() and returns its row and col. Effectively the same as what you do.
Here's something that should be faster, though I haven't tested it enough to be sure it is robust:
In [342]: Ac=A_sparse.tocoo()
In [343]: res=Ac.copy()
In [344]: res.data[:]=B_sparse[0, Ac.data].A[0]
In [345]: res
Out[345]:
<3x9 sparse matrix of type '<class 'numpy.int32'>'
with 27 stored elements in COOrdinate format>
In [346]: res.A
Out[346]:
array([[2, 3, 4, 2, 8, 4, 2, 3, 4],
[5, 6, 7, 5, 6, 7, 5, 6, 7],
[8, 9, 0, 8, 9, 0, 8, 9, 0]], dtype=int32)
In this example there are 3 explicit zeros that could be cleaned up as well (look at res.nonzero()).
Since you are setting each res[i,j] with values from Ac.row and Ac.col, res has the same row,col values as Ac, so I initialize it as a copy. Then it's just a matter of updating the res.data attribute. It would be faster to index B_sparse.data directly, but that doesn't account for its sparsity.
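Putting the pieces above into one function might look like the sketch below. It assumes B is small enough to densify into a 1D lookup array, which is my assumption and not something the question states. Note also that entries where A is 0 are not stored in the sparse format, so they stay 0 in the result instead of becoming B[0].

```python
import numpy as np
from scipy import sparse

A = np.array([[1, 2, 3, 1, 7, 3, 1, 2, 3],
              [4, 5, 6, 4, 5, 6, 4, 5, 6],
              [7, 8, 9, 7, 8, 9, 7, 8, 9]])
B = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 0])

A_sparse = sparse.csr_matrix(A)
B_sparse = sparse.csr_matrix(B)              # shape (1, 10)

def index_sparse(A_sparse, B_sparse):
    """Map each stored value v of A_sparse to B[v], keeping A's sparsity pattern."""
    Ac = A_sparse.tocoo()
    lookup = B_sparse.toarray().ravel()      # dense 1D lookup (assumes B is small)
    values = lookup[Ac.data]                 # vectorized replacement of the loop
    res = sparse.coo_matrix((values, (Ac.row, Ac.col)), shape=Ac.shape).tocsr()
    res.eliminate_zeros()                    # drop entries that were mapped to 0
    return res

res = index_sparse(A_sparse, B_sparse)
print(res.toarray())
print(res.nnz)
```

With these inputs every 9 in A maps to B[9] = 0, so three entries are eliminated and 24 stored values remain.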
How can I find the dimensions of a matrix in Python? len(A) returns only one value.
Edit:
close = dataobj.get_data(timestamps, symbols, closefield)
Is (I assume) generating a matrix of integers (less likely strings). I need to find the size of that matrix, so I can run some tests without having to iterate through all of the elements. As far as the data type goes, I assume it's an array of arrays (or list of lists).
The number of rows of a list of lists would be: len(A) and the number of columns len(A[0]) given that all rows have the same number of columns, i.e. all lists in each index are of the same size.
If you are using NumPy arrays, shape can be used.
For example
>>> a = numpy.array([[[1,2,3],[1,2,3]],[[12,3,4],[2,1,3]]])
>>> a
array([[[ 1, 2, 3],
[ 1, 2, 3]],
[[12, 3, 4],
[ 2, 1, 3]]])
>>> a.shape
(2, 2, 3)
As Ayman Farhat mentioned,
you can use len(matrix) to get the number of rows, and the length of the first row, len(matrix[0]), to get the number of columns:
>>> a=[[1,5,6,8],[1,2,5,9],[7,5,6,2]]
>>> len(a)
3
>>> len(a[0])
4
You can also use NumPy, a library that helps you work with matrices:
>>> import numpy
>>> numpy.shape(a)
(3, 4)
To get just the number of dimensions in NumPy:
len(a.shape)
In the first case:
import numpy as np
a = np.array([[[1,2,3],[1,2,3]],[[12,3,4],[2,1,3]]])
print("shape = ",np.shape(a))
print("dimensions = ",len(a.shape))
The output will be:
shape = (2, 2, 3)
dimensions = 3
m = [[1, 1, 1, 0],[0, 5, 0, 1],[2, 1, 3, 10]]
print(len(m),len(m[0]))
Output
3 4
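The len(m), len(m[0]) approach silently assumes that every row has the same length. A small pure-Python sketch that checks this assumption could look as follows; list_shape is a hypothetical helper name.

```python
def list_shape(matrix):
    """Return (rows, cols) for a rectangular list of lists.
    Raises ValueError if the rows have different lengths."""
    rows = len(matrix)
    if rows == 0:
        return (0, 0)
    cols = len(matrix[0])
    for row in matrix:
        if len(row) != cols:
            raise ValueError("rows have different lengths")
    return (rows, cols)

m = [[1, 1, 1, 0], [0, 5, 0, 1], [2, 1, 3, 10]]
print(list_shape(m))
```

For the list above this gives (3, 4); a ragged input such as [[1, 2], [3]] raises an error instead of reporting a misleading column count.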
The correct answer is the following:
import numpy
numpy.shape(a)
Suppose a is an array. To get its dimensions, use shape:
import numpy as np
a = np.array([[3,20,99],[-13,4.5,26],[0,-1,20],[5,78,-19]])
a.shape
The output of this will be
(4, 3)
You can get the height and width of a NumPy array as follows:
height = arr.shape[0]
width = arr.shape[1]
If your array has multiple dimensions, you can increase the index to access them.
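For instance, with a 3D array the shape tuple can be indexed further or unpacked in one step. The array and the names depth, height and width below are illustrative only.

```python
import numpy as np

arr = np.zeros((2, 4, 3))            # illustrative 3D array

# Unpack all axis lengths at once
depth, height, width = arr.shape
print(depth, height, width)

# Negative indices count from the last axis
last_axis = arr.shape[-1]
print(last_axis)
```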
You can find the number of dimensions of a matrix with NumPy:
import numpy as np
x = np.arange(24).reshape((6, 4))
x.ndim
output will be:
2
It means this matrix is a 2 dimensional matrix.
x.shape
Will show you the size of each dimension. The shape for x is equal to:
(6, 4)
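The attributes ndim, shape and size are related: ndim is the length of shape, and size is the product of the entries of shape. A short sketch, reusing x from above, with x3 as an illustrative reshaped copy:

```python
import math
import numpy as np

x = np.arange(24).reshape((6, 4))
print(x.ndim, x.shape, x.size)

# ndim is the number of axes, size the total number of elements
ndim_matches = x.ndim == len(x.shape)
size_matches = x.size == math.prod(x.shape)

# Reshaping changes ndim and shape but not size
x3 = x.reshape(2, 3, 4)
print(x3.ndim, x3.shape, x3.size)
```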
A simple way I look at it:
example:
h=np.array([[[[1,2,3],[3,4,5]],[[5,6,7],[7,8,9]],[[9,10,11],[12,13,14]]]])
h.ndim
4
h
array([[[[ 1, 2, 3],
[ 3, 4, 5]],
[[ 5, 6, 7],
[ 7, 8, 9]],
[[ 9, 10, 11],
[12, 13, 14]]]])
If you closely observe, the number of opening square brackets at the beginning is what defines the dimension of the array.
In the above array to access 7, the below indexing is used,
h[0,1,1,0]
However if we change the array to 3 dimensions as below,
h=np.array([[[1,2,3],[3,4,5]],[[5,6,7],[7,8,9]],[[9,10,11],[12,13,14]]])
h.ndim
3
h
array([[[ 1, 2, 3],
[ 3, 4, 5]],
[[ 5, 6, 7],
[ 7, 8, 9]],
[[ 9, 10, 11],
[12, 13, 14]]])
To access element 7 in the above array, the index is h[1,1,0]
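To find such an index without counting brackets by hand, np.argwhere can list every position of a value. In the 3D array above the value 7 actually occurs twice, which this sketch makes visible; the variable name positions is mine.

```python
import numpy as np

h = np.array([[[1, 2, 3], [3, 4, 5]],
              [[5, 6, 7], [7, 8, 9]],
              [[9, 10, 11], [12, 13, 14]]])

# One row of indices per occurrence of the value 7
positions = np.argwhere(h == 7)
print(positions)

# Tuple indexing and chained indexing reach the same element
print(h[1, 1, 0], h[1][1][0])
```

Each row of positions has h.ndim entries, one index per axis.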