Operations with Numpy arrays with zero dimensions - python

Is a numpy array of shape (0,10) the same as a numpy array of shape (10,)? I'm writing a very simple function that will alternate between 2 and 3 dimensions, and I am wondering whether the output of something like this:
def Pick(F, R, N=0, Choice=[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]):
    if N==0:
        return np.array(np.random.choice(Choice,size=(F,R)))
    else:
        return np.array(np.random.choice(Choice,size=(N,F,R)))
will behave the same as the output of:
def Pick(F, R, N=0, Choice=[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]):
    return np.array(np.random.choice(Choice,size=(N,F,R)))
In theory these should be the same, but when I try
a =np.full((10,10,10),1)
then
a+a
I get a (10,10,10) np.array of 2's. But if I try
b=np.full((0,10,10,10),1)
then
b+b
This is the only result I receive
array([], shape=(0, 10, 10, 10), dtype=int64)
Any ideas as to why this is?

Abstractly, an array of shape (N,M,L) can be represented identically by an array of shape (<>,N,<>,M,<>,L,<>), where <> can be substituted for a sequence of 1s with arbitrary finite length. Consider the set of indexes corresponding to each data point — if one dimension is of length 0, what index corresponding to that dimension can data points bear? This should explain why defining a numpy array as you have yields the [] result — because you have defined an empty array. Defining
a = np.full((10,10,10),1)
b = np.full((10,10,10,1),1)
the
a+b
operation broadcasts the two shapes to (10,10,10,10) and yields an array of 2s, because a length-1 axis can be stretched to match the other operand.

A 0 dimension has the same meaning as a 1, 2 or other positive integer:
In [437]: np.ones((2,3),int)
Out[437]:
array([[1, 1, 1], # 2*3 elements
       [1, 1, 1]])
In [438]: np.ones((1,3),int)
Out[438]: array([[1, 1, 1]]) # 1*3 elements
In [439]: np.ones((0,3),int)
Out[439]: array([], shape=(0, 3), dtype=int64) # 0*3 elements
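As a minimal sketch of the point above: a zero-length axis leaves no elements at all, so to get a 2-D result when N is 0 the leading axis has to be dropped from the shape rather than set to 0. The name `pick`, its argument order, and the use of `np.random.default_rng` are illustrative assumptions, not part of the question.

```python
import numpy as np

def pick(F, R, N=0, choices=(1, 2, 3, 4, 5, 6, 7, 8, 9, 10), rng=None):
    """Draw random values with shape (F, R) if N == 0, else (N, F, R)."""
    rng = np.random.default_rng() if rng is None else rng
    # Drop the leading axis entirely instead of giving it length 0.
    shape = (F, R) if N == 0 else (N, F, R)
    return rng.choice(choices, size=shape)

empty = np.full((0, 10, 10, 10), 1)
print(empty.size)              # number of elements: 0 * 10 * 10 * 10
print((empty + empty).shape)   # arithmetic on an empty array keeps its shape
print(pick(2, 3).shape, pick(2, 3, N=4).shape)
```

Building the shape tuple first keeps the two cases in one code path, which may be easier to maintain than two separate return statements.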

Related

Numpy: for each element in one dimension, find coordinates of maximum of sub-array

I've seen variations of this question asked a few times but so far haven't seen any answers that get to the heart of this general case. I have an n-dimensional array of shape [a, b, c, ...] . For some dimension x, I want to look at each sub-array and find the coordinates of the maximum.
For example, say b = 2, and that's the dimension I'm interested in. I want the coordinates of the maximum of [:, 0, :, ...] and [:, 1, :, ...] in the form a_max = [a_max_b0, a_max_b1], c_max = [c_max_b0, c_max_b1], etc.
I've tried to do this by reshaping my input matrix to a 2d array [b, a*c*d*...], using argmax along axis 0, and unraveling the indices, but the output coordinates don't wind up giving the maxima in my dataset. In this case, n = 3 and I'm interested in axis 1.
shape = gains_3d.shape
idx = gains_3d.reshape(shape[1], -1)
idx = idx.argmax(axis = 1)
a1, a2 = np.unravel_index(idx, [shape[0], shape[2]])
Obviously I could use a loop, but that's not very pythonic.
For a concrete example, I randomly generated a 4x2x3 array. I'm interested in axis 1, so the output should be two arrays of length 2.
testarray = np.array([[[0.17028444, 0.38504759, 0.64852725],
[0.8344524 , 0.54964746, 0.86628204]],
[[0.77089997, 0.25876277, 0.45092835],
[0.6119848 , 0.10096425, 0.627054 ]],
[[0.8466859 , 0.82011746, 0.51123959],
[0.26681694, 0.12952723, 0.94956865]],
[[0.28123628, 0.30465068, 0.29498136],
[0.6624998 , 0.42748154, 0.83362323]]])
testarray[:,0,:] is
array([[0.17028444, 0.38504759, 0.64852725],
[0.77089997, 0.25876277, 0.45092835],
[0.8466859 , 0.82011746, 0.51123959],
[0.28123628, 0.30465068, 0.29498136]])
So the first element of the first output array will be 2, and the first element of the other will be 0, pointing to 0.8466859. The second elements of the two arrays will be 2 and 2, pointing to 0.94956865 in testarray[:,1,:].
Let's first try to get a clear idea of what you are trying to do:
Sample 3d array:
In [136]: arr = np.random.randint(0,10,(2,3,4))
In [137]: arr
Out[137]:
array([[[1, 7, 6, 2],
[1, 5, 7, 1],
[2, 2, 5, *6*]],
[[*9*, 1, 2, 9],
[2, *9*, 3, 9],
[0, 2, 0, 6]]])
After fiddling around a bit I came up with this iteration, showing the coordinates for each middle dimension, and the max value
In [151]: [(i,np.unravel_index(np.argmax(arr[:,i,:]),(2,4)),np.max(arr[:,i,:])) for i in range(3)]
Out[151]: [(0, (1, 0), 9), (1, (1, 1), 9), (2, (0, 3), 6)]
I can move the unravel outside the iteration:
In [153]: np.unravel_index([np.argmax(arr[:,i,:]) for i in range(3)],(2,4))
Out[153]: (array([1, 1, 0]), array([0, 1, 3]))
Your reshape approach does avoid this loop:
In [154]: arr1 = arr.transpose(1,0,2) # move our axis first
In [155]: arr1 = arr1.reshape(3,-1)
In [156]: arr1
Out[156]:
array([[1, 7, 6, 2, 9, 1, 2, 9],
[1, 5, 7, 1, 2, 9, 3, 9],
[2, 2, 5, 6, 0, 2, 0, 6]])
In [158]: np.argmax(arr1,axis=1)
Out[158]: array([4, 5, 3])
In [159]: np.unravel_index(_,(2,4))
Out[159]: (array([1, 1, 0]), array([0, 1, 3]))
argmax takes only one axis value, whereas you want the equivalent of taking it along all but one axis. Some reductions, np.max among them, accept a tuple of axes, but argmax does not. The transpose and reshape may be the only way. The reshaped array also gives the max values directly:
In [163]: np.max(arr1,axis=1)
Out[163]: array([9, 9, 6])
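The transpose-and-reshape steps above can be wrapped up for any axis. This is only a sketch: the helper name `argmax_per_slice` is made up for illustration, and it uses `np.moveaxis` in place of an explicit `transpose`. It is run here on the 4x2x3 `testarray` from the question.

```python
import numpy as np

def argmax_per_slice(arr, axis):
    """For each index along `axis`, return the coordinates (in the
    remaining axes) of the maximum of that sub-array."""
    moved = np.moveaxis(arr, axis, 0)         # bring the axis of interest to the front
    rest_shape = moved.shape[1:]              # shape of each sub-array
    flat = moved.reshape(moved.shape[0], -1)  # one row per sub-array
    flat_idx = flat.argmax(axis=1)            # flat position of each row's maximum
    return np.unravel_index(flat_idx, rest_shape)

testarray = np.array([[[0.17028444, 0.38504759, 0.64852725],
                       [0.8344524 , 0.54964746, 0.86628204]],
                      [[0.77089997, 0.25876277, 0.45092835],
                       [0.6119848 , 0.10096425, 0.627054  ]],
                      [[0.8466859 , 0.82011746, 0.51123959],
                       [0.26681694, 0.12952723, 0.94956865]],
                      [[0.28123628, 0.30465068, 0.29498136],
                       [0.6624998 , 0.42748154, 0.83362323]]])

a_max, c_max = argmax_per_slice(testarray, axis=1)
print(a_max, c_max)  # [2 2] [0 2]
```

The key detail is that the axis of interest must be moved to the front before reshaping; reshaping the original array directly mixes elements from different sub-arrays into each row.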

NumPy - Excluding all zero 2D arrays from a 3D array

I have multiple 3D arrays with different shapes but I'm going to assume I have an array named A with shape (53, 768, 768) for an example. It consists of 53 2D arrays and some of them may be empty images. Those empty images have only 0 pixel values.
If there are N slices with all 0 values, I want to slice A into a (53 - N, 768, 768) 3D array. Is this possible with indexing?
I tried something like this a[:, ~np.all(a == 0)], but it returns an array with shape (53, 1, 768, 768).
Let's assume your data is something like this:
z = np.array([
[[1, 2, 3], [4, 5, 6]],
[[7, 8, 9], [10, 11, 12]],
[[0, 0, 0], [0, 0, 0]],
[[1, 1, 1], [1, 1, 1]]
])
The shape of z is (4, 2, 3). We therefore need a boolean vector of shape (4,), aggregating over the other dimensions. We can use the axis= parameter in most Numpy functions for this:
mask = np.any(z != 0, axis=(1, 2))
z[mask]
In this example, mask will be array([ True,  True, False,  True]), so the all-zero third slice is dropped.
Axes are numbered 0, 1, 2, etc. So we use 1 and 2 to refer to the 2nd and 3rd axes.
You can also use negative numbers as in the other answer; if you write axis=(-2, -1) that refers to the last and 2nd-to-last axes, i.e. axes 1 and 2 in this example.
In general, use axis= to specify which axes are to be collapsed by aggregating. Any axis not specified in axis= will not be aggregated.
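A small runnable sketch of the axis= idea, assuming the goal is to drop only slices that are entirely zero. It uses `np.any` so that a slice is kept as soon as it has at least one nonzero value; the sample data is made up and includes a slice with a single zero to show that it survives.

```python
import numpy as np

z = np.array([
    [[1, 2, 3], [4, 5, 6]],
    [[7, 8, 9], [10, 11, 12]],
    [[0, 0, 0], [0, 0, 0]],    # empty slice: should be removed
    [[1, 0, 1], [1, 1, 1]],    # contains one zero but is not empty
])

# Collapse axes 1 and 2, leaving one boolean per slice along axis 0.
keep = np.any(z != 0, axis=(1, 2))
filtered = z[keep]

print(keep)            # [ True  True False  True]
print(filtered.shape)  # (3, 2, 3)
```

Indexing with a 1-D boolean mask on the first axis keeps the remaining axes intact, which is why the result is (53 - N, 768, 768) for the array in the question.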
Use:
import numpy as np
A = np.array(A) # if A is not a NumPy array
result = A[np.sum(A, axis = (-1, -2)) != 0]
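This will do, provided the values are non-negative (as with pixel data); otherwise positive and negative values could cancel to a zero sum.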

get maximum of absolute along axis

I have a couple of ndarrays with same shape, and I would like to get one array (of same shape) with the maximum of the absolute values for each element. So I decided to stack all arrays, and then pick the values along the new stacked axis. But how to do this?
Example
Say we have two 1-D arrays with 4 elements each, so my stacked array looks like
>>> stack
array([[ 4, 1, 2, 3],
[ 0, -5, 6, 7]])
If I would just be interested in the maximum I could just do
>>> numpy.amax(stack, axis=0)
array([4, 1, 6, 7])
But I need to consider negative values as well, so I was going for
>>> ind = numpy.argmax(numpy.absolute(stack), axis=0)
>>> ind
array([0, 1, 1, 1])
So now I have the indices I need, but how to apply this to the stacked array? If I just index stack by ind, numpy does some broadcasting that I don't need:
>>> stack[ind]
array([[ 4, 1, 2, 3],
[ 0, -5, 6, 7],
[ 0, -5, 6, 7],
[ 0, -5, 6, 7]])
What I want to get is array([4, -5, 6, 7])
Or to ask from a slightly different perspective: How do I get the array numpy.amax(stack, axis=0) based on the indices returned by numpy.argmax(stack, axis=0)?
The stacking operation would be inefficient. We can simply use np.where to do the choosing based on the absolute valued comparisons -
In [198]: a
Out[198]: array([4, 1, 2, 3])
In [199]: b
Out[199]: array([ 0, -5, 6, 7])
In [200]: np.where(np.abs(a) > np.abs(b), a, b)
Out[200]: array([ 4, -5, 6, 7])
This works on generic n-dim arrays without any modification.
With a 2D numpy ndarray, indexing by a single index array selects whole rows. To pick one element per column instead, and avoid the broadcasting, you have to index both axes at once, pairing each row index with its column index:
>>> stack[ind, np.arange(stack.shape[1])]
array([ 4, -5, 6, 7])
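An alternative sketch for applying argmax indices, assuming NumPy 1.15 or later where `np.take_along_axis` is available. It picks, for each column, the entry of `stack` whose absolute value is largest, and generalizes to more than two stacked arrays.

```python
import numpy as np

stack = np.array([[4, 1, 2, 3],
                  [0, -5, 6, 7]])

# Row index of the largest absolute value in each column.
ind = np.argmax(np.abs(stack), axis=0)

# take_along_axis needs an index array with the same number of
# dimensions as `stack`, so add a leading axis and strip it afterwards.
picked = np.take_along_axis(stack, ind[np.newaxis, :], axis=0)[0]
print(picked)  # [ 4 -5  6  7]
```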
For 'normal' Python:
>>> a=[[1,2],[3,4]]
>>> b=[0,1]
>>> [x[y] for x,y in zip(a,b)]
[1, 4]
Perhaps it can be applied to arrays too, I am not familiar enough with Numpy.
Find array of max and min and combine using where
maxs = np.amax(stack, axis=0)
mins = np.amin(stack, axis=0)
max_abs = np.where(np.abs(maxs) > np.abs(mins), maxs, mins)

numpy column arrays and strange results

I am trying to write a function where its arguments are arrays with different shapes. I am having some troubles to understand column arrays and to make my function work for all shapes of arrays, here are the problems I found:
Transposing:
If the argument array A is not a vector then I can transpose it nicely using A.T however if A is a row vector this will NOT turn A into a column vector. If A is a column vector this will (strangely) turn it into a row vector. Is there a way to transpose an array independently of its shape?
Dot Product
The dot Product of a column vector with a scalar is a column vector (yeahh!). The dot Product of a column vector with a 1 element numpy array is a row vector (nayyy).
A = array((1,2)).reshape(2,1) #this is how I make a column vector (is there a better looking way?)
print dot(A,3) #column vector
b = dot(array((2,4)),a) #array with shape (1,)
print dot(A,b) #row vector..(bah)
Inversion
linalg.inv(array(2)) #gives an error, shouldn't it return 1/2 ?
Thanks for all the help!
P.S. Sorry for being noob I am used to Matlab this way of writing things is very confusing for me ..
P.S.2 I don't want to use matrices because arrays are more general
If you're used to Matlab, Numpy's way of dealing with "column" and "row" vectors is a little strange. The thing to realize is that a 1-d array is neither a column nor a row vector. To be a column or row vector, an array has to be a 2-d array with one dimension set to one. You can tell the difference between a 1-d array and a 2-d array with one row by looking at how many braces there are:
>>> a = numpy.arange(15)
>>> a
array([ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14])
>>> b = a.reshape(1, -1)
>>> b
array([[ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14]])
Now you can see that when you transpose these two, a stays the same, but b becomes a column vector:
>>> a.T
array([ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14])
>>> b.T
array([[ 0],
[ 1],
[ 2],
[ 3],
[ 4],
[ 5],
[ 6],
[ 7],
[ 8],
[ 9],
[10],
[11],
[12],
[13],
[14]])
Again, this may seem a little strange -- but as you say, "arrays are more general." To achieve that generality, Numpy distinguishes strictly between arrays of different dimensions; a 1-d array simply can't be a "column" or "row" vector in any meaningful sense. The second dimension isn't defined at all!
The answers to your other questions follow from this observation. Your code example above generates an error for me, so I'll do something slightly different... which also generates an error, but a more informative one:
>>> A
array([[1],
[2]])
>>> B
array([2, 4])
>>> numpy.dot(A, B)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ValueError: objects are not aligned
Numpy complains that the objects are not aligned. That's because B is a 1-d array! Let's make it a true row vector:
>>> B = B.reshape(1, -1)
>>> B
array([[2, 4]])
>>> numpy.dot(A, B)
array([[2, 4],
[4, 8]])
>>> numpy.dot(B, A)
array([[10]])
Now everything makes sense. Dot simply performs matrix multiplication here; in one order the operation produces a 2x2 array; in the other, it produces a 1x1 array. Note the number of braces! Both of these are 2-d arrays. In turn, 10, [10], and [[10]] would all be different results.
Similarly, consider these three values:
>>> numpy.array(2)
array(2)
>>> numpy.array((2,))
array([2])
>>> numpy.array((2,)).reshape(1,-1)
array([[2]])
If you pass these to numpy.linalg.inv, you'll get errors for all but the last -- you can't take the matrix inverse of something that isn't a matrix! If you pass the last, the result is also a matrix:
>>> numpy.linalg.inv(numpy.array((2,)).reshape(1,-1))
array([[ 0.5]])
Transposing
It is important to distinguish between 1D arrays and 2D arrays. The row vector you are referring to is 1D, while the column vector is 2D. To demonstrate the difference, have a look at the following example.
First we demonstrate the default behavior of transposing a 2D array (even the column vector is a simple 2D array):
import numpy as np
print(np.ones((3, 4)).T.shape)
print(np.ones((3, 1)).T.shape)
The output is - as expected:
(4, 3)
(1, 3)
A 1D vector, however, does not change its size:
print(np.ones((3,)).T.shape)
Output:
(3,)
To quickly convert it into a 2D array, use [:,None]:
print(np.ones((3,))[:,None].T.shape)
Output:
(1, 3)
Dot product
To obtain the desired result, you are better off working with 2D arrays:
A = np.ones((2, 1)) # column vector
b = np.ones((1, 1)) # scalar
print(np.dot(A, b)) # column vector (as expected)
Output:
[[ 1.]
[ 1.]]
Yeah! :)
Inversion
Again, you need to make sure to work with 2D arrays. This can be done using the ndmin argument:
print(np.linalg.inv(np.array(2,ndmin=2)))
Output:
[[ 0.5]]
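To tie the three points together, here is a short sketch in Python 3 syntax. The variable names are illustrative; it shows that a 1D array has no row or column orientation until it is reshaped into 2D, and how the 2D versions behave under transpose, matrix product, and inversion.

```python
import numpy as np

v = np.array([1, 2])       # 1D: neither a row nor a column

col = v.reshape(-1, 1)     # shape (2, 1): column vector
row = v.reshape(1, -1)     # shape (1, 2): row vector

print(v.T.shape)                  # transposing a 1D array changes nothing
print(col.T.shape, row.T.shape)   # 2D vectors swap orientation
print((col @ row).shape)          # (2, 1) @ (1, 2) gives a 2x2 matrix
print((row @ col).shape)          # (1, 2) @ (2, 1) gives a 1x1 matrix

# np.atleast_2d promotes a scalar to a 1x1 matrix so it can be inverted.
inverse = np.linalg.inv(np.atleast_2d(2.0))
print(inverse)
```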

How can I find the dimensions of a matrix in Python?

How can I find the dimensions of a matrix in Python? len(A) returns only one value.
Edit:
close = dataobj.get_data(timestamps, symbols, closefield)
Is (I assume) generating a matrix of integers (less likely strings). I need to find the size of that matrix, so I can run some tests without having to iterate through all of the elements. As far as the data type goes, I assume it's an array of arrays (or list of lists).
The number of rows of a list of lists would be: len(A) and the number of columns len(A[0]) given that all rows have the same number of columns, i.e. all lists in each index are of the same size.
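A small sketch of the list-of-lists approach, with a check for the assumption that all rows have the same length. The helper name `list_matrix_shape` is made up for illustration.

```python
def list_matrix_shape(rows):
    """Return (n_rows, n_cols) for a list of lists, checking that
    every row has the same length."""
    n_rows = len(rows)
    n_cols = len(rows[0]) if n_rows else 0
    if any(len(r) != n_cols for r in rows):
        raise ValueError("rows have different lengths")
    return n_rows, n_cols

print(list_matrix_shape([[1, 5, 6, 8], [1, 2, 5, 9], [7, 5, 6, 2]]))  # (3, 4)
```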
If you are using NumPy arrays, shape can be used.
For example
>>> a = numpy.array([[[1,2,3],[1,2,3]],[[12,3,4],[2,1,3]]])
>>> a
array([[[ 1, 2, 3],
[ 1, 2, 3]],
[[12, 3, 4],
[ 2, 1, 3]]])
>>> a.shape
(2, 2, 3)
As Ayman Farhat mentioned, you can use the simple method len(matrix) to get the number of rows, and the length of the first row, len(matrix[0]), to get the number of columns:
>>> a=[[1,5,6,8],[1,2,5,9],[7,5,6,2]]
>>> len(a)
3
>>> len(a[0])
4
Also you can use a library that helps you with matrices "numpy":
>>> import numpy
>>> numpy.shape(a)
(3, 4)
To get just the number of dimensions in NumPy:
len(a.shape)
In the first case:
import numpy as np
a = np.array([[[1,2,3],[1,2,3]],[[12,3,4],[2,1,3]]])
print("shape = ",np.shape(a))
print("dimensions = ",len(a.shape))
The output will be:
shape = (2, 2, 3)
dimensions = 3
m = [[1, 1, 1, 0],[0, 5, 0, 1],[2, 1, 3, 10]]
print(len(m),len(m[0]))
Output
3 4
The correct answer is the following:
import numpy
numpy.shape(a)
Suppose you have an array named a. To get its dimensions, use shape.
import numpy as np
a = np.array([[3,20,99],[-13,4.5,26],[0,-1,20],[5,78,-19]])
a.shape
The output of this will be
(4, 3)
You may use the following to get the height and width of a NumPy array:
height = arr.shape[0]
width = arr.shape[1]
If your array has multiple dimensions, you can increase the index to access them.
You simply can find a matrix dimension by using Numpy:
import numpy as np
x = np.arange(24).reshape((6, 4))
x.ndim
output will be:
2
It means this matrix is a 2 dimensional matrix.
x.shape
Will show you the size of each dimension. The shape for x is equal to:
(6, 4)
A simple way I look at it:
example:
h=np.array([[[[1,2,3],[3,4,5]],[[5,6,7],[7,8,9]],[[9,10,11],[12,13,14]]]])
h.ndim
4
h
array([[[[ 1, 2, 3],
[ 3, 4, 5]],
[[ 5, 6, 7],
[ 7, 8, 9]],
[[ 9, 10, 11],
[12, 13, 14]]]])
If you closely observe, the number of opening square brackets at the beginning is what defines the dimension of the array.
In the above array to access 7, the below indexing is used,
h[0,1,1,0]
However if we change the array to 3 dimensions as below,
h=np.array([[[1,2,3],[3,4,5]],[[5,6,7],[7,8,9]],[[9,10,11],[12,13,14]]])
h.ndim
3
h
array([[[ 1, 2, 3],
[ 3, 4, 5]],
[[ 5, 6, 7],
[ 7, 8, 9]],
[[ 9, 10, 11],
[12, 13, 14]]])
To access element 7 in the above array, the index is h[1,1,0]
