I have been using NumPy for a while, but there are still instances in which broadcasting is impenetrable to me. Please see the code below:
import numpy as np
np.random.seed(123)
# Number of measurements for the x variable
nx = 7
# Number of measurements for the y variable
ny = 11
# Number of items for which we run the simulation
nc = 23
# Fake some data
x = np.random.uniform(0, 1, size=(nx, ))
y = np.random.uniform(0, 1, size=(ny, ))
# histogram_2d represents the 2d frequency of the x, y measurements
histogram_2d = np.random.randint(0, 20, size=(nx, ny))
# c is the actual simulation results, size=(nx*ny, nc)
c = np.random.uniform(0, 9, size=(nx*ny, nc))
# Try broadcasting
c_3d = c.reshape((nc, nx, ny))
numpy_sum = (c_3d * histogram_2d).sum()
# Attempt possible replacement with a simple loop
partial_sum = 0.0
for i in range(nc):
    c_2d = np.reshape(c[:, i], (nx, ny))
    partial_sum += (c_2d * histogram_2d).sum()
print('Numpy broadcasting: ', numpy_sum)
print('Actual loop : ', partial_sum)
In my naivete, I was expecting the two approaches to give the same results (up to some multiple of machine precision). But on my system I get this:
Numpy broadcasting: 74331.4423599
Actual loop : 73599.8596346
My ignorance is showing: given that histogram_2d is a 2D array and c_3d is a 3D array, I was simply thinking that NumPy would magically expand histogram_2d with nc copies of itself along the first axis and do the multiplication. But it appears I am not quite correct.
I would like to know how to replace the condensed, broadcast multiplication + sum with a proper for loop. I am looking at some Fortran code that is supposed to do the same thing, and this Fortran code:
hist_3d = spread(histogram_2d, 1, nc)
c_3d = reshape(c, [nc, nx, ny])
partial_sum = sum(c_3d*hist_3d)
does not do what the NumPy broadcasting does, which means I am doing something fundamentally wrong somewhere and/or my understanding of broadcasting is still very limited.
In [3]: c.shape
Out[3]: (77, 23)
This isn't a good reshape; it works, but will mess up the layout
In [5]: c_3d = c.reshape((nc, nx, ny)); c_3d.shape
Out[5]: (23, 7, 11)
This is good - splitting the 77 into 7 and 11:
In [6]: c_3d = c.reshape((nx, ny, nc)); c_3d.shape
Out[6]: (7, 11, 23)
To multiply with:
In [7]: histogram_2d.shape
Out[7]: (7, 11)
Use:
In [8]: (histogram_2d[:,:,None]*c_3d).shape
Out[8]: (7, 11, 23)
In [9]: (histogram_2d[:,:,None]*c_3d).sum()
Out[9]: 73599.85963455029
With this broadcasting
(7,11,1) and (7,11,23) => (7,11,23)
The 2 key rules are:
add leading dimensions as needed to match the total ndim
change all size 1 dimensions to match
(I used change, because the 1 may actually be changed to 0. That's not a common case, but illustrates the generality of broadcasting. )
New trailing dimensions have to be explicit. This avoids some ambiguities, as when trying to add a (2,) and (3,). One of those can be expanded to (1,2) or (1,3), but which? If one is (3,1), then expanding the other to (1,2) is unambiguous.
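To make those two rules concrete, here is a minimal sketch (assuming NumPy 1.20+ for np.broadcast_shapes):

import numpy as np

# Rule 1: leading dimensions are added as needed,
# so (7, 11) is treated as (1, 7, 11) against (23, 7, 11).
print(np.broadcast_shapes((7, 11), (23, 7, 11)))     # (23, 7, 11)

# Rule 2: size-1 dimensions are stretched to match.
print(np.broadcast_shapes((7, 11, 1), (7, 11, 23)))  # (7, 11, 23)

# Trailing dimensions are never added implicitly, so this fails:
try:
    np.broadcast_shapes((2,), (3,))
except ValueError as e:
    print(e)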
I have an empty list: x = [].
I have a numpy array, y, of shape: (180, 161). I can't necessarily define x to be an np.empty of a particular shape, because I won't know the shape of y ahead of time.
I want to append y to x so that x will have a .shape of (1, 180, 161).
Then if I append more, I want it to be (n, 180, 161)
I tried .append and .stack, but I've had a variety of errors:
TypeError: only size-1 arrays can be converted to Python scalars
ValueError: all the input arrays must have same number of dimensions, but the array at index 0 has 3 dimension(s) and the array at index 1 has 2 dimension(s)
And so on. It seems that this should be simple, but it's strangely difficult.
Assuming all items in x have the same shape, you can first construct a list and then construct the NumPy array from the list.
There, you have two options:
np.array(), which is faster but not flexible
np.stack(), which is slower but lets you choose the axis along which to stack (it is roughly equivalent to np.array().transpose(...).copy())
The code would look like:
import numpy as np
n = 100
x = [np.random.randint(0, 10, (10, 20)) for _ in range(n)]
# same as: y = np.stack(x, 0)
y = np.array(x)
print(y.shape)
# (100, 10, 20)
Of course this line:
x = [np.random.randint(0, 10, (10, 20)) for _ in range(n)]
can be replaced with:
x = []
for _ in range(n):
    x.append(np.random.randint(0, 10, (10, 20)))
You could also use np.append(), e.g.:
def stacker(arrs):
    result = arrs[0][None, ...]
    for arr in arrs[1:]:
        result = np.append(result, arr[None, ...], 0)
    return result
but with horrific performance:
n = 1000
shape = (100, 100)
x = [np.random.randint(0, n, shape) for _ in range(n)]
%timeit np.array(x)
# 10 loops, best of 3: 21.1 ms per loop
%timeit np.stack(x)
# 10 loops, best of 3: 21.6 ms per loop
%timeit stacker(x)
# 1 loop, best of 3: 11 s per loop
and, as you can see, performance-wise, the list-based method is way faster.
You can reshape y to be (1, *y.shape).
Then for appending an array you can say:
y_1 = np.vstack((y, new_arr))
where y_1 is a (2, *y.shape) numpy array.
To avoid keeping both arrays around, you can rebind the name: y = np.vstack((y, new_arr)) (note that vstack still allocates a new array; it does not grow y in place).
You might have to reshape your array to (1, *y.shape) however.
This is a very basic example:
import numpy as np
a = np.ones((1,2,3))
b = np.ones((1,2,3))
np.vstack((a,b)).shape # (2,2,3)
Let me know if this helps!
If you keep x as a list, then just appending to it maintains the shape you want:
>>> import numpy as np
>>> x = []
>>> y = np.arange(12).reshape(3,4)
>>> x.append(y)
>>> np.shape(x)
(1, 3, 4)
>>> x.append(y)
>>> np.shape(x)
(2, 3, 4)
>>> for i in range(10):
... x.append(y)
>>> np.shape(x)
(12, 3, 4)
But considering you are dealing with np.arrays, it may not be convenient for you to keep x as a list, so you may try this:
>>> x = np.array(x)
>>> x.shape
(12, 3, 4)
>>> y[None,...].shape
(1, 3, 4)
>>> np.append(x, y[None,...],axis=0).shape
(13, 3, 4)
Word of caution:
As pointed out by @hpaulj:
np.append should be avoided, as it is extremely slow, probably only faster than:
x = np.array([*x, y])
The correct usage would be:
x = np.concatenate([x, y[None,...]], axis=0)
Either way, concatenating or appending is generally a speed bump in numpy. So unless you absolutely need to build the array this way, you should work with lists. Also, most functions that can be applied to an np.array work on lists as well; note that this means functions applied to the list, not methods of an np.array object. For example:
>>> x = list((1, 2, 3, 4))
>>> np.shape(x)
(4,)
>>> x.shape
Traceback (most recent call last):
File "<ipython-input-100-9f2b259887ef>", line 1, in <module>
x.shape
AttributeError: 'list' object has no attribute 'shape'
So I would suggest appending to list, and then after you have done appending all the arrays, convert the list to np.array if you require.
I have a 4D numpy array temperature of data with the measured temperature at points x, y, z and time t. Assuming I have an array indices with the indices where the first instance of a condition is met, say temperature < 0, how do I extract a 3D array with the first temperatures satisfying this condition? That is, I'm looking for the equivalent of NumPy's 1D version (import numpy as np tacitly assumed):
>>> temperatures = np.arange(10,-10,-1)
>>> ind = np.argmax(temperatures < 0)
>>> T = temperatures[ind]
I have tried the analogous
In [1]: temperatures = np.random.random((11,8,5,200)) * 1000
In [2]: temperatures.shape
Out[2]: (11, 8, 5, 200)
In [3]: indices= np.argmax(temperatures > 900,axis=3)
In [4]: indices.shape
Out[4]: (11, 8, 5)
In [5]: T = temperatures[:,:,:,indices]
In [6]: T.shape
Out[6]: (11, 8, 5, 11, 8, 5)
However, the number of dimensions of T is 6.
I could of course do it with a for loop:
indices = np.argmax(temperatures > 900,axis=3)
x,y,z = temperatures.shape[:-1]
T = np.zeros((x,y,z))
for indx in range(x):
    for indy in range(y):
        for indz in range(z):
            T[indx,indy,indz] = temperatures[indx,indy,indz,indices[indx,indy,indz]]
but I'm looking for something more elegant and more pythonic. Is there someone more skilled with numpy out there who can help me out on this?
P.S. For the sake of clarity, I'm not just looking for the temperature at these points given by indices; I'm also looking for other quantities in arrays of the same shape as temperature, e.g. the time derivative. Also, in reality the arrays are much larger than this minimal example.
NumPy advanced indexing always works:
import numpy as np
temperatures = np.random.random((11,8,5, 200)) * 1000
indices = np.argmax(temperatures > 900, axis=3)
x, y, z = temperatures.shape[:-1]
T = temperatures[np.arange(x)[:, np.newaxis, np.newaxis],
np.arange(y)[np.newaxis, :, np.newaxis],
np.arange(z)[np.newaxis, np.newaxis, :],
indices]
As jdehesa pointed out, this can be made more concise:
x, y, z = np.ogrid[:x, :y, :z]
T = temperatures[x, y, z, indices]
I think you need:
axis = 3
indices = np.argmax(temperatures > 900, axis=axis)
result = np.take_along_axis(temperatures, np.expand_dims(indices, axis), axis)
result = result.squeeze(axis)
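The two answers above can be sanity-checked against each other; here is a quick sketch (not part of either original answer) showing that they agree:

import numpy as np

temperatures = np.random.random((11, 8, 5, 200)) * 1000
indices = np.argmax(temperatures > 900, axis=3)

# ogrid-based advanced indexing
gx, gy, gz = np.ogrid[:11, :8, :5]
T1 = temperatures[gx, gy, gz, indices]

# take_along_axis along the time axis
T2 = np.take_along_axis(temperatures, np.expand_dims(indices, 3), 3).squeeze(3)

print(T1.shape, np.array_equal(T1, T2))  # (11, 8, 5) True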
I have an issue using Python with matrix multiplication and reshape. For example, I have a column S of size (16,1) and another matrix H of size (4,4). I need to reshape the column S into (4,4) in order to multiply it with H, and then reshape it again into (16,1). I did that in Matlab as below:
clear all; clc; clear
H = randn(4,4,16) + 1j.*randn(4,4,16);
S = randn(16,1) + 1j.*randn(16,1);
for ij = 1 : 16
    y(:,:,ij) = reshape(H(:,:,ij)*reshape(S,4,[]),[],1);
end
y = mean(y,3);
Coming to Python:
import numpy as np
H = np.random.randn(4,4,16) + 1j * np.random.randn(4,4,16)
S = np.random.randn(16,) + 1j * np.random.randn(16,)
y = np.zeros((4,4,16),dtype=complex)
for ij in range(16):
    y[:,:,ij] = np.reshape(H[:,:,ij] @ S.reshape(4,4), 16, 1)
But I get an error here that we can't reshape the matrix y of size 256 into 16x1.
Does anyone have an idea about how to solve this problem?
Simply do this:
S.shape = (4,4)
for ij in range(16):
    y[:,:,ij] = H[:,:,ij] @ S
S.shape = -1 # equivalent to 16
np.dot operates over the last and second-to-last axis of the two operands if they have two or more axes. You can move your axes around to use this.
Keep in mind that reshape(S, 4, 4) in Matlab is likely equivalent to S.reshape(4, 4).T in Python.
So given H of shape (4, 4, 16) and S of shape (16,), you can multiply each channel of H by a reshaped S using
np.moveaxis(np.dot(np.moveaxis(H, -1, 0), S.reshape(4, 4).T), 0, -1)
The inner moveaxis call makes H into (16, 4, 4) for easy multiplication. The outer one reverses the effect.
Alternatively, you could use the fact that S will be transposed to write
np.transpose(S.reshape(4, 4) @ np.transpose(H))
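As a quick check (a sketch added here, not part of the original answer), both one-liners match a per-channel loop that mimics Matlab's column-major reshape via order='F':

import numpy as np

H = np.random.randn(4, 4, 16) + 1j * np.random.randn(4, 4, 16)
S = np.random.randn(16) + 1j * np.random.randn(16)

# per-channel reference, with Matlab-style reshape
S_mat = S.reshape((4, 4), order='F')
ref = np.empty((4, 4, 16), dtype=complex)
for ij in range(16):
    ref[:, :, ij] = H[:, :, ij] @ S_mat

out1 = np.moveaxis(np.dot(np.moveaxis(H, -1, 0), S.reshape(4, 4).T), 0, -1)
out2 = np.transpose(S.reshape(4, 4) @ np.transpose(H))
print(np.allclose(out1, ref), np.allclose(out2, ref))  # True True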
There are two issues in your solution:
1) The np.reshape function takes the new shape as a single tuple argument, not as multiple arguments.
2) The shape of your y array should be 16x1x16, not 4x4x16. In Matlab there is no issue, since it automatically reshapes y as you update it.
The correct version would be the following:
import numpy as np
H = np.random.randn(4,4,16) + 1j * np.random.randn(4,4,16)
S = np.random.randn(16,) + 1j * np.random.randn(16,)
y = np.zeros((16,1,16),dtype=complex)
for ij in range(16):
    y[:,:,ij] = np.reshape(H[:,:,ij] @ S.reshape((4,4)), (16,1))
Take two ndarrays: A of shape (n, *m) and B of shape (n,). Is there a way to sort A in-place using the order that would sort B?
Sorting A with B is easy using np.argsort, but this is not done in-place:
A = A[np.argsort(B)]
Comments:
A and B have different dtypes, and A can have more than two dimensions. Hence they can’t be stacked to use ndarray.sort().
A takes up a lot of space, which is why it needs to be sorted in-place. Any solution requiring twice the space occupied by A would therefore defeat this purpose.
The title of this question “Re-arranging numpy array in place” may sound related, but the question itself is not very clear, and the answers do not match my question.
Here is a solution that works by following cycles in the index array. It can optionally be compiled using pythran giving a significant speedup if rows are small (80x for 10 elements) and a small speedup if rows are large (30% for 1000 elements).
To keep it pythran compatible I had to simplify it a bit, so it only accepts 2D arrays and it only sorts along axis 0.
Code:
import numpy as np
#pythran export take_inplace(float[:, :] or int[:, :], int[:])
def take_inplace(a, idx):
    n, m = a.shape
    been_there = np.zeros(n, bool)
    keep = np.empty(m, a.dtype)
    for i in range(n):
        if been_there[i]:
            continue
        keep[:] = a[i]
        been_there[i] = True
        j = i
        k = idx[i]
        while not been_there[k]:
            a[j] = a[k]
            been_there[k] = True
            j = k
            k = idx[k]
        a[j] = keep
Sample run using the compiled version. As indicated above, compilation is only required for small rows; for larger rows pure Python should be fast enough.
>>> from timeit import timeit
>>> import numpy as np
>>> import take_inplace
>>>
>>> a = np.random.random((1000, 10))
>>> idx = a[:, 4].argsort()
>>>
>>> take_inplace.take_inplace(a, idx)
>>>
# correct
>>> np.all(np.arange(1000) == a[:, 4].argsort())
True
>>>
# speed
>>> timeit(lambda: take_inplace.take_inplace(a, idx), number=1000)
0.011950935004279017
>>>
# for comparison
>>> timeit(lambda: a[idx], number=1000)
0.02985276997787878
If you can set A beforehand as a structured array whose datatype is composed of a subarray of shape (m, ) and a scalar of the same type (e.g., np.int32), then you can sort it in-place with respect to B. For example:
import numpy as np
B = np.array([3, 1, 2])
A = np.array([[10, 11], [20, 21], [30, 31]])
(n, m) = A.shape
dt = np.dtype([('a', np.int32, (m, )), ('b', int)])
A2 = np.array([(a, b) for a, b in zip(A, B)], dtype=dt)
A2.sort(order='b')
print(A2)
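If you then need the sorted rows back as a plain array, you can read them from the 'a' field (a small usage sketch; field access returns a view into A2, not a copy):

A_sorted = A2['a']  # rows of A ordered by B
B_sorted = A2['b']  # array([1, 2, 3])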
I want to extract multiple slices from the same 1D numpy array, where the slice indices are drawn from a random distribution. Basically, I want to achieve the following:
import numpy as np
import numpy.random
# generate some 1D data
data = np.random.randn(500)
# window size (slices are 2*winsize long)
winsize = 60
# number of slices to take from the data
inds_size = (100, 200)
# get random integers that function as indices into the data
inds = np.random.randint(low=winsize, high=len(data)-winsize, size=inds_size)
# now I want to extract slices of data, running from inds[0,0]-60 to inds[0,0]+60
sliced_data = np.zeros( (winsize*2,) + inds_size )
for k in range(inds_size[0]):
    for l in range(inds_size[1]):
        sliced_data[:,k,l] = data[inds[k,l]-winsize:inds[k,l]+winsize]
# sliced_data.shape is now (120, 100, 200)
The above nested loop works fine, but is very slow. In my real code, I will need to do this thousands of times, for data arrays a lot bigger than these. Is there any way to do this more efficiently?
Note that inds will always be 2D in my case, but after getting the slices I will always be summing over one of these two dimensions, so an approach that only accumulates the sum across the one dimension would be fine.
I found this question and this answer which seem almost the same. However, the question is only about a 1D indexing vector (as opposed to my 2D). Also, the answer lacks a bit of context, as I don't really understand how the suggested as_strided works. Since my problem does not seem uncommon, I thought I'd ask again in the hope of a more explanatory answer rather than just code.
Using as_strided in this way appears to be somewhat faster than Divakar's approach (20 ms vs 35 ms here), although memory usage might be an issue.
from numpy.lib.stride_tricks import as_strided

data_wins = as_strided(data, shape=(data.size - 2*winsize + 1, 2*winsize), strides=(8, 8))
inds = np.random.randint(low=0, high=data.size - 2*winsize, size=inds_size)
sliced = data_wins[inds]
sliced = sliced.transpose((2, 0, 1)) # to use the same index order as before
Strides are the steps in bytes for the index in each dimension. For example, with an array of shape (x, y, z) and a data type of size d (8 for float64), the strides will ordinarily be (y*z*d, z*d, d), so that the second index steps over whole rows of z items. Setting both values to 8, data_wins[i, j] and data_wins[j, i] will refer to the same memory location.
>>> import numpy as np
>>> from numpy.lib.stride_tricks import as_strided
>>> a = np.arange(10, dtype=np.int8)
>>> as_strided(a, shape=(3, 10 - 2), strides=(1, 1))
array([[0, 1, 2, 3, 4, 5, 6, 7],
[1, 2, 3, 4, 5, 6, 7, 8],
[2, 3, 4, 5, 6, 7, 8, 9]], dtype=int8)
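For what it's worth, newer NumPy (1.20+) ships sliding_window_view, a bounds-checked wrapper around the same strides trick; this sketch is an addition, not part of the original answer:

from numpy.lib.stride_tricks import sliding_window_view

data_wins = sliding_window_view(data, 2 * winsize)  # shape (data.size - 2*winsize + 1, 2*winsize)
sliced = data_wins[inds]  # same fancy indexing as above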
Here's a vectorized approach using broadcasting -
# Get 3D offsetting array and add to inds for all indices
allinds = inds + np.arange(-60,60)[:,None,None]
# Index into data with all indices for desired output
sliced_dataout = data[allinds]
Runtime test -
In [20]: # generate some 1D data
...: data = np.random.randn(500)
...:
...: # window size (slices are 2*winsize long)
...: winsize = 60
...:
...: # number of slices to take from the data
...: inds_size = (100, 200)
...:
...: # get random integers that function as indices into the data
...: inds=np.random.randint(low=winsize,high=len(data)-winsize, size=inds_size)
...:
In [21]: %%timeit
...: sliced_data = np.zeros( (winsize*2,) + inds_size )
...: for k in range(inds_size[0]):
...: for l in range(inds_size[1]):
...: sliced_data[:,k,l] = data[inds[k,l]-winsize:inds[k,l]+winsize]
...:
10 loops, best of 3: 66.9 ms per loop
In [22]: %%timeit
...: allinds = inds + np.arange(-60,60)[:,None,None]
...: sliced_dataout = data[allinds]
...:
10 loops, best of 3: 24.1 ms per loop
Memory consumption : Compromise solution
If memory consumption is an issue, here's a compromise solution with one loop -
sliced_dataout = np.zeros( (winsize*2,) + inds_size )
for k in range(sliced_dataout.shape[0]):
    sliced_dataout[k] = data[inds-winsize+k]
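As a quick sanity check (a sketch, not part of the original answer), the one-loop version yields exactly the same array as the fully broadcast one:

allinds = inds + np.arange(-winsize, winsize)[:, None, None]
assert np.array_equal(data[allinds], sliced_dataout)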