Get average of the numpy ndarray - python

I have a numpy.ndarray A with shape (8, 64, 64, 64, 1). We can use np.mean or np.average to calculate the mean of a numpy array. But I want to get the means of the 8 (64, 64, 64) sub-arrays. That is, I only want 8 values, each calculated as the mean of one (64, 64, 64) block. Of course I can use a for loop, or use [np.mean(A[i]) for i in range(A.shape[0])]. I am wondering if there is a numpy method to do this.

You can use np.mean's axis kwarg:
np.mean(A, axis=(1, 2, 3, 4))
The same works with np.average, too.
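For example, a minimal sketch (with a random array standing in for A) showing that this yields exactly 8 values:
import numpy as np

A = np.random.rand(8, 64, 64, 64, 1)  # stand-in for the array in the question

# One mean per leading entry: average over every axis except the first.
means = np.mean(A, axis=(1, 2, 3, 4))
print(means.shape)  # (8,)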

Related

Is there a fast way to multiply one axis of a 4D array by elements in a vector of the same length as that axis?

I have two lists of shape (130, 64, 2048), call these dimensions (s, f, b), and one vector of length 64, call this v. I need to append these two lists together to make an array of shape (130, 2, 64, 2048) and multiply all 2048 values in f[i] by the i-th value of v.
The output array also needs to have shape (130, 2, 64, 2048)
Obviously these two steps can be done in either order. I want to know the most Pythonic way of doing something like this.
My main issue is that my code takes forever to turn the list into a numpy array, which is necessary for some of my calculations. I have:
new_prof = np.asarray( new_prof )
but this seems to take too long for the size and shape of my list. Any thoughts as to how I could initialise this better?
The problem outlined above is shown by my attempt:
# Converted data should have shape (130, 2, 64, 2048)
converted_data = IQUV_to_AABB( data, basis = "cartesian" )
new_converted = np.array((130, 2, 64, 2048))
# I think s.shape is (2, 64, 2048) and cal_fa has length 64
for i, s in enumerate( converted_data ):
    aa = np.dot( s[0], cal_fa )
    bb = np.dot( s[1], cal_fb )
    new_converted[i].append( (aa, bb) )
However, this code doesn't work and I think it's got something to do with the dot product. Maybe??
I would also love to know why the process of changing my list to a numpy array is taking so long.
Try to start small and look at the results in the console:
import numpy as np
x = np.arange(36)
print(x)
y = np.reshape(x, (3, 4, 3))
print(y)
# this is a vector of the same size as dimension 1
a = np.arange(4)
print(a)
# expand and let numpy's broadcasting do the rest
# https://docs.scipy.org/doc/numpy/user/basics.broadcasting.html
# https://scipy.github.io/old-wiki/pages/EricsBroadcastingDoc
b = a[np.newaxis, :, np.newaxis]
print(b)
c = y * b
print(c)
You can read more about np.newaxis in the numpy documentation.
Using numpy.append is rather slow because it has to allocate a new array and copy the whole contents each time; a numpy array is a contiguous block of memory.
You might have to use it if you run out of computer memory. But in that case, try to iterate over appropriately sized chunks, as big as your computer can still handle. Re-arranging the dimensions is sometimes a way to speed up calculations.
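Applied to the shapes in the question, the same broadcasting idea might look roughly like this (the arrays below are random stand-ins for the asker's data):
import numpy as np

aa = np.random.rand(130, 64, 2048)  # stand-in for the first list
bb = np.random.rand(130, 64, 2048)  # stand-in for the second list
v = np.random.rand(64)              # stand-in for the length-64 vector

# Stack along a new second axis -> shape (130, 2, 64, 2048).
combined = np.stack([aa, bb], axis=1)

# Expand v to (1, 1, 64, 1) so broadcasting scales each 2048-long row
# by the matching element of v.
result = combined * v[np.newaxis, np.newaxis, :, np.newaxis]
print(result.shape)  # (130, 2, 64, 2048)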

How to calculate the sums of squares along one dimension of an ndarray?

A multidimensional array with shape (2, 50, 25, 3):
xx = np.random.randn(2, 50, 25, 3)
I want to calculate the sum of squares of the last dimension. The result should be a matrix with a shape (2, 50, 25, 1).
[np.sum(x) for x in np.square(features_displacement[0][0][:])[:]]
This code successfully calculates one dimension, outputting a list of shape (25, 1), but how can I calculate all the dimensions as described above?
You can apply the numpy functions along the axis you want, for example:
np.sum(np.square(xx), axis=3)
This will produce an array of shape (2, 50, 25). Not exactly sure this is what you want; if not, please be more specific :-)
Use this:
sum_of_squares = np.sum(np.square(features_displacement), axis=-1)
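Both of the above return shape (2, 50, 25); if the (2, 50, 25, 1) shape from the question matters, passing keepdims=True retains the reduced axis as a length-1 dimension:
import numpy as np

xx = np.random.randn(2, 50, 25, 3)

# Sum of squares over the last axis, keeping it as a length-1 dimension.
sum_of_squares = np.sum(np.square(xx), axis=-1, keepdims=True)
print(sum_of_squares.shape)  # (2, 50, 25, 1)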

Iterating over a 3D numpy array using one dimension as the iterator and the remaining dimensions in the loop

Despite there being a number of similar questions related to iterating over a 3D array, and after trying out some functions like numpy's nditer, I am still confused about how the following can be achieved:
I have a signal of dimensions (30, 11, 300) which is 30 trials of 11 signals containing 300 signal points.
Let this signal be denoted by the variable x_
I have another function which takes as input a (11, 300) matrix and plots it on 1 graph (11 signals containing 300 signal points plotted on a single graph). Let this function be sliding_window_plot.
Currently, I can get it to do this:
x_plot = x_[0,:,:]
for i in range(x_.shape[0]):
    sliding_window_plot(x_plot[:,:])
which plots THE SAME (first trial) 11 signals containing 300 points on 1 plot, 30 times.
I want it to plot the i-th set of signals, not the first (0th) trial of signals every time. Any hints on how to attempt this?
You should be able to iterate over the first dimension with a for loop:
for s in x_:
    sliding_window_plot(s)
with each iteration s will be the next array of shape (11, 300).
In general for all nD-arrays where n>1, you can iterate over the very first dimension of the array as if you're iterating over any other iterable. For checking whether an array is an iterable, you can use np.iterable(arr). Here is an example:
In [9]: arr = np.arange(3 * 4 * 5).reshape(3, 4, 5)
In [10]: arr.shape
Out[10]: (3, 4, 5)
In [11]: np.iterable(arr)
Out[11]: True
In [12]: for a in arr:
    ...:     print(a.shape)
    ...:
(4, 5)
(4, 5)
(4, 5)
So, in each iteration we get a matrix of shape (4, 5) as output. In total, 3 such outputs constitute the 3D array of shape (3, 4, 5).
If, for some reason, you want to iterate over another dimension, you can use numpy.rollaxis to move the desired axis to the first position and then iterate over it, as mentioned in iterating-over-arbitrary-dimension-of-numpy-array.
NOTE: That said, numpy.rollaxis is only maintained for backwards compatibility, so it is recommended to use numpy.moveaxis instead for moving the desired axis to the first dimension.
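As a small sketch of that idea, iterating over the second axis of an example array:
import numpy as np

arr = np.arange(3 * 4 * 5).reshape(3, 4, 5)

# Move axis 1 to the front, then iterate as usual; each slice has shape (3, 5).
for sub in np.moveaxis(arr, 1, 0):
    print(sub.shape)  # prints (3, 5) four times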
You are hardcoding the 0th slice outside the for loop. You need to create x_plot inside the loop. In fact, you can simplify your code by not using x_plot at all:
for i in range(x_.shape[0]):
    sliding_window_plot(x_[i])

Splitting a numpy array into two subsets of different sizes

I have a numpy array of shape (400, 3, 3, 3) and I want to split it into two parts, so I would get arrays of shape (100, 3, 3, 3) and (300, 3, 3, 3).
I was playing with numpy split methods, e.g.:
subsets = np.array_split(arr, 2)
which is close to what I want, but it divides the original array into two halves of the same size and I don't know how to specify these sizes. It's probably easy with some indexing (I guess) but I'm not sure how to do it.
As mentioned in my comment, you can use the Ellipsis notation to specify all axes:
x, y = arr[:100, ...], arr[100:, ...]
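Alternatively, np.split accepts explicit split indices instead of a number of equal parts, which avoids the equal-halves behaviour of np.array_split(arr, 2):
import numpy as np

arr = np.random.rand(400, 3, 3, 3)  # stand-in for the array in the question

# Split along axis 0 at index 100 -> shapes (100, 3, 3, 3) and (300, 3, 3, 3).
x, y = np.split(arr, [100])
print(x.shape, y.shape)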

Keeping track of index changes in numpy.reshape

While using numpy.reshape in Python, is there a way to keep track of the change in indices?
For example, if a numpy array with the shape (m,n,l,k) is reshaped into an array with the shape (m*n,k*l); is there a way to get the initial index ([x,y,w,z]) for the current [X,Y] index and vice versa?
Yes there is: it's called raveling and unraveling the index. For example, say you have two arrays:
import numpy as np
arr1 = np.arange(10000).reshape(20, 10, 50)
arr2 = arr1.reshape(20, 500)
say you want to index the (10, 52) element (equivalent to arr2[10, 52]) but in arr1:
>>> np.unravel_index(np.ravel_multi_index((10, 52), arr2.shape), arr1.shape)
(10, 1, 2)
or in the other direction:
>>> np.unravel_index(np.ravel_multi_index((10, 1, 2), arr1.shape), arr2.shape)
(10, 52)
You don't keep track of it, but you can calculate it. The original m x n axes are mapped onto the new m*n dimension, e.g. n*x + y == X. We can verify this with a couple of multidimensional ravel/unravel functions (as answered by @MSeifert).
In [671]: m,n,l,k=2,3,4,5
In [672]: np.ravel_multi_index((1,2,3,4), (m,n,l,k))
Out[672]: 119
In [673]: np.unravel_index(52, (m*n,l*k))
Out[673]: (2, 12)
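As a sketch for the question's (m, n, l, k) -> (m*n, l*k) case, converting a single index in both directions:
import numpy as np

m, n, l, k = 2, 3, 4, 5
old_shape = (m, n, l, k)
new_shape = (m * n, l * k)

# Original index (x, y, w, z) -> flat position -> reshaped index (X, Y).
X, Y = np.unravel_index(np.ravel_multi_index((1, 2, 3, 4), old_shape), new_shape)
print(X, Y)  # 5 19

# And back again: recovers the original index (1, 2, 3, 4).
print(np.unravel_index(np.ravel_multi_index((X, Y), new_shape), old_shape))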
