I need to perform some calculations on consecutive columns in a 2D array. For simplicity's sake, let's say subtraction.
I currently do this in the following way:
c = np.array([(a[i, j + 1] - a[i, j]) for j in range(a.shape[1] - 1) for i in range(a.shape[0])]).reshape(a.shape[0], a.shape[1] - 1)
But I suspect there must be a better way using NumPy's vector operations without iteration over 2 values and a reshape.
First of all, I don't think that what you wrote achieves what you try to achieve.
I ran:
>>> a
array([[4, 6, 1, 1, 4],
[7, 1, 7, 0, 6],
[2, 0, 0, 1, 2],
[0, 6, 3, 2, 8]])
>>> c = np.array([(a[i, j + 1] - a[i, j]) for j in range(a.shape[1] - 1) for i in range(a.shape[0])])
>>> c
array([ 2, -6, -2, 6, -5, 6, 0, -3, 0, -7, 1, -1, 3, 6, 1, 6])
>>> c = np.array([(a[i, j + 1] - a[i, j]) for j in range(a.shape[1] - 1) for i in range(a.shape[0])]).reshape(a.shape[0], a.shape[1] - 1)
>>> c
array([[ 2, -6, -2, 6],
[-5, 6, 0, -3],
[ 0, -7, 1, -1],
[ 3, 6, 1, 6]])
The function np.diff receives a vector and returns its array of differences, so:
>>> np.diff([1, 2, 3, 5])
array([1, 1, 2])
But most NumPy functions can operate on whole arrays, not just on one-dimensional vectors, and a good keyword to know is axis. Passing axis=0 or axis=1 makes the function do the same thing as before, but along the chosen dimension of a higher-dimensional array: instead of subtracting two numbers, it subtracts two vectors. For np.diff, axis=0 gives the differences between consecutive rows and axis=1 the differences between consecutive columns.
Final Answer:
So the final answer is: np.diff(a, axis=1).
Example:
>>> a
array([[4, 6, 1, 1, 4],
[7, 1, 7, 0, 6],
[2, 0, 0, 1, 2],
[0, 6, 3, 2, 8]])
>>> np.diff(a, axis=1)
array([[ 2, -5, 0, 3],
[-6, 6, -7, 6],
[-2, 0, 1, 1],
[ 6, -3, -1, 6]])
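To make the effect of axis concrete, here is a small sketch (not part of the original answer) that computes the differences along both axes and checks them against explicit slicing. The variable names are only illustrative.

```python
import numpy as np

a = np.array([[4, 6, 1, 1, 4],
              [7, 1, 7, 0, 6],
              [2, 0, 0, 1, 2],
              [0, 6, 3, 2, 8]])

col_diff = np.diff(a, axis=1)   # a[:, j + 1] - a[:, j], shape (4, 4)
row_diff = np.diff(a, axis=0)   # a[i + 1, :] - a[i, :], shape (3, 5)

# The same results written with explicit slices
same_cols = np.array_equal(col_diff, a[:, 1:] - a[:, :-1])
same_rows = np.array_equal(row_diff, a[1:, :] - a[:-1, :])
print(col_diff.shape, row_diff.shape, same_cols, same_rows)
```

Note that the result is one element shorter along the axis that was differenced.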
First of all, the order of loops in the question differs from what would seem to do the obvious thing. I am going to guess here that you meant to have the i and j loops the other way round.
Given an example:
a = np.arange(8).reshape(2,4) ** 2
i.e.
array([[ 0, 1, 4, 9],
[16, 25, 36, 49]])
Swapping the order of loops gives:
c = np.array([(a[i, j + 1] - a[i, j]) for i in range(a.shape[0]) for j in range(a.shape[1] - 1)]).reshape(a.shape[0], a.shape[1] - 1)
i.e.
array([[ 1, 3, 5],
[ 9, 11, 13]])
So now proceeding to answer the question on that basis, you can do this simply using:
a[:,1:] - a[:,:-1]
Here, a[:,1:] is the array without the first column, a[:,:-1] is the array without the last column, and then the element-by-element difference between the two is calculated.
Replace - with whatever other operator you want. Your question implies that subtraction is just an example, but other operators (e.g. * or whatever) will also similarly output element-by-element results.
Your actual operation does not have to be a single basic operation; provided that it is some combination of basic operations, then you ought to be able to operate on these two subarrays in the same way that you would operate on scalars.
For example, if you have:
def mycalc(right, left):
    return 2 * right + left
then
mycalc(a[:,1:], a[:,:-1])
gives:
array([[ 2, 9, 22],
[ 66, 97, 134]])
which is the same as you get when calling mycalc in place of just doing a subtraction in the original example:
np.array([mycalc(a[i, j + 1], a[i, j]) for i in range(a.shape[0]) for j in range(a.shape[1] - 1)]).reshape(a.shape[0], a.shape[1] - 1)
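As a further illustration of swapping in other operators, the sketch below applies a product and np.maximum to the same two slices. The names left, right, products and larger are just illustrative, and np.maximum is my own choice of example rather than something from the question.

```python
import numpy as np

a = np.arange(8).reshape(2, 4) ** 2
left, right = a[:, :-1], a[:, 1:]   # basic slices are views, no data is copied

products = right * left             # element-by-element product of neighbours
larger = np.maximum(right, left)    # element-by-element maximum of neighbours
print(products)
print(larger)
```

Anything that broadcasts element-by-element (arithmetic operators, ufuncs, comparisons) should behave the same way on the two slices.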
I have an n*n array, and I want to find the min in the array, and get the index of the min in [x,y] format
Of course, this can be done using for loops and using temporary variables, but I am looking for a more sophisticated process to do this.
Example -
[[1,2,8],
[7,4,2],
[9,1,7],
[0,1,5],
[6,-4,3]]
I should get the following output -
Output-
Min = -4
Index = [4,1]
Can I implement something similar?
TIA.
Global minimum value and index
Flatten the array, get the argmin index. Get the corresponding row-col indices from it with np.unravel_index. Also, index into the flattened array with the earlier obtained flattened argmin index for the minimum value.
def smallest_val_index(a):
    idx = a.ravel().argmin()
    return a.ravel()[idx], np.unravel_index(idx, a.shape)
Sample run -
In [182]: a
Out[182]:
array([[ 1, 2, 8],
[ 7, 4, 2],
[ 9, 1, 7],
[ 0, 1, 5],
[ 6, -4, 3]])
In [183]: val, indx = smallest_val_index(a)
In [184]: val
Out[184]: -4
In [185]: indx
Out[185]: (4, 1)
Global maximum value and index
Similarly, to get the global maximum value, use argmax -
def largest_val_index(a):
    idx = a.ravel().argmax()
    return a.ravel()[idx], np.unravel_index(idx, a.shape)
Sample run -
In [187]: a
Out[187]:
array([[ 1, 2, 8],
[ 7, 4, 2],
[ 9, 1, 7],
[ 0, 1, 5],
[ 6, -4, 3]])
In [188]: largest_val_index(a)
Out[188]: (9, (2, 0))
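A slightly shorter variant, offered as a sketch rather than as a replacement for the functions above: argmin() with no axis argument already works on the flattened array, so the explicit ravel() can be dropped. The np.argwhere line is an addition of mine for the case where the minimum occurs more than once.

```python
import numpy as np

a = np.array([[1, 2, 8],
              [7, 4, 2],
              [9, 1, 7],
              [0, 1, 5],
              [6, -4, 3]])

# argmin() with no axis returns an index into the flattened array
row, col = np.unravel_index(a.argmin(), a.shape)
min_val = a[row, col]

# All positions that hold the minimum, one [row, col] pair per match
all_min_positions = np.argwhere(a == a.min())
print(min_val, [int(row), int(col)], all_min_positions.tolist())
```

argmin only reports the first occurrence of the minimum, which is why the argwhere form can be useful when ties matter.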
I am using python 3
I would like to start from a list of nodes in 3 dimensions and build a grid.
I would like to avoid the construct
import numpy as np
l = np.zeros((len(xv), len(yv), len(zv)))
for (i,x) in zip(range(len(xv)),xv):
    for (j,y) in zip(range(len(yv)),yv):
        for (k,z) in zip(range(len(zv)),zv):
            l[i,j,k] = func(x,y,z)
I am looking for a more compact version of the above lines: an iterator like zip, but one that would iterate over all possible tuples in the grid.
You can use something like np.meshgrid to construct your grid. Assuming that func is properly vectorized, that should be good enough to construct l
# indexing='ij' keeps the (x, y, z) axis order of the loop version
X, Y, Z = np.meshgrid(xv, yv, zv, indexing='ij')
l = func(X, Y, Z)
If func isn't vectorized, you can construct a vectorized version using np.vectorize.
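Here is a small check of the meshgrid approach against explicit loops. It is only a sketch: func below is a hypothetical stand-in that happens to work on arrays, and the grid sizes are arbitrary. Note that np.meshgrid defaults to indexing='xy', which swaps the first two axes, so indexing='ij' is what matches the l[i,j,k] layout.

```python
import numpy as np

def func(x, y, z):
    # hypothetical stand-in for the real function; works on scalars and arrays
    return x + 10 * y + 100 * z

xv, yv, zv = np.arange(2), np.arange(3), np.arange(4)

# indexing='ij' makes the result shape (len(xv), len(yv), len(zv))
X, Y, Z = np.meshgrid(xv, yv, zv, indexing='ij')
grid = func(X, Y, Z)

# Reference result from explicit loops
expected = np.zeros((len(xv), len(yv), len(zv)), dtype=grid.dtype)
for i, x in enumerate(xv):
    for j, y in enumerate(yv):
        for k, z in enumerate(zv):
            expected[i, j, k] = func(x, y, z)

print(grid.shape, np.array_equal(grid, expected))
```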
Also note that you might even be able to get away without using np.meshgrid through judicious use of np.newaxis:
>>> x
array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
>>> y
array([0, 1, 2])
>>> z
array([0, 1])
>>> def func(x, y, z):
...     return x + y + z
...
>>> vfunc = np.vectorize(func)
>>> vfunc(x[:, np.newaxis, np.newaxis], y[np.newaxis, :, np.newaxis], z[np.newaxis, np.newaxis, :])
array([[[ 0, 1],
[ 1, 2],
[ 2, 3]],
[[ 1, 2],
[ 2, 3],
[ 3, 4]],
[[ 2, 3],
[ 3, 4],
[ 4, 5]],
[[ 3, 4],
[ 4, 5],
[ 5, 6]],
[[ 4, 5],
[ 5, 6],
[ 6, 7]],
[[ 5, 6],
[ 6, 7],
[ 7, 8]],
[[ 6, 7],
[ 7, 8],
[ 8, 9]],
[[ 7, 8],
[ 8, 9],
[ 9, 10]],
[[ 8, 9],
[ 9, 10],
[10, 11]],
[[ 9, 10],
[10, 11],
[11, 12]]])
As pointed out in the comments, np.ix_ can be used as a shortcut instead of np.newaxis:
vfunc(*np.ix_(xv, yv, zv))
Also note that with a function as simple as this one, np.vectorize isn't necessary and will actually hurt performance a lot, since it loops in Python rather than in compiled code.
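To illustrate that last point, a minimal sketch: calling the plain function on the np.ix_ arrays gives the same grid as the np.vectorize version, because + already broadcasts. The names direct and looped are only illustrative.

```python
import numpy as np

def func(x, y, z):
    return x + y + z

x, y, z = np.arange(10), np.arange(3), np.arange(2)

# np.ix_ reshapes the inputs to (10, 1, 1), (1, 3, 1) and (1, 1, 2)
X, Y, Z = np.ix_(x, y, z)

direct = func(X, Y, Z)                  # plain broadcasting
looped = np.vectorize(func)(X, Y, Z)    # calls func once per element

print(direct.shape, np.array_equal(direct, looped))
```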
Say your func is something like
def func(x,y,z,indices):
    xv, yv, zv = [i[j] for i,j in zip((x,y,z),indices)]
    #do a calc with the value for the specific x,y,z points
Hook the lists you want to it using partial by doing
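import itertools
from functools import partial
f = partial(func, x=xv, y=yv, z=zv)
Now just do a map supplying the indices and you're set!
# func expects index tuples, so iterate over index ranges rather than values
index_ranges = [range(len(v)) for v in (xv, yv, zv)]
l = list(map(lambda idx: f(indices=idx), itertools.product(*index_ranges)))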
With a simple function:
def foo(x,y,z):
return x**2 + y*2 + z
and space defined by:
In [328]: xv, yv, zv = [np.arange(i) for i in [2,3,4]]
This iteration is as fast as any, even if it is a bit wordy:
In [329]: res = np.zeros((xv.shape[0], yv.shape[0], zv.shape[0]), dtype=int)
In [330]: for i,x in enumerate(xv):
     ...:     for j,y in enumerate(yv):
     ...:         for k,z in enumerate(zv):
     ...:             res[i,j,k] = foo(x,y,z)
In [331]: res
Out[331]:
array([[[0, 1, 2, 3],
[2, 3, 4, 5],
[4, 5, 6, 7]],
[[1, 2, 3, 4],
[3, 4, 5, 6],
[5, 6, 7, 8]]])
As @mgilson explains, you can generate 3 arrays that define the 3d space with:
In [332]: I,J,K = np.meshgrid(xv,yv,zv,indexing='ij',sparse=True)
In [333]: I.shape
Out[333]: (2, 1, 1)
In [334]: J.shape
Out[334]: (1, 3, 1)
In [335]: I,J,K = np.ix_(xv,yv,zv) # equivalently
In [336]: I.shape
Out[336]: (2, 1, 1)
foo was written so it works with arrays just as well as with scalars, so:
In [337]: res1 = foo(I,J,K)
In [338]: res1
Out[338]:
array([[[0, 1, 2, 3],
...
[5, 6, 7, 8]]])
So if your function fits this pattern, use it. Look at those I,J,K arrays, with and without sparse.
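For anyone who wants to see the difference quickly, this sketch compares the shapes produced with and without sparse and confirms that foo gives the same result either way. The list names are only illustrative.

```python
import numpy as np

def foo(x, y, z):
    return x**2 + y*2 + z

xv, yv, zv = [np.arange(n) for n in (2, 3, 4)]

sparse = np.meshgrid(xv, yv, zv, indexing='ij', sparse=True)
dense = np.meshgrid(xv, yv, zv, indexing='ij', sparse=False)

sparse_shapes = [g.shape for g in sparse]   # one non-trivial axis each
dense_shapes = [g.shape for g in dense]     # every array is the full grid

same = np.array_equal(foo(*sparse), foo(*dense))
print(sparse_shapes, dense_shapes, same)
```

The sparse arrays hold 2 + 3 + 4 values in total, while the dense ones hold 3 * 24, which is why the sparse form is usually preferable when the function broadcasts.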
There are other tools for generating the i,j,k sets. For example:
for i,j,k in np.ndindex(res.shape):
    res[i,j,k] = foo(xv[i], yv[j], zv[k])
for i,j,k in itertools.product(range(2),range(3),range(4)):
    res[i,j,k] = foo(xv[i], yv[j], zv[k])
itertools.product is fast, especially when used as list(product(...)). But the iteration mechanism isn't that important. It's the repeated call to foo that takes up most of the time.
ndindex actually uses nditer, which can be used directly in:
it = np.nditer([I,J,K,None],flags=['external_loop','buffered'])
for x,y,z,r in it:
    r[...] = foo(x,y,z)
it.operands[-1]
nditer is best described in:
https://docs.scipy.org/doc/numpy/reference/arrays.nditer.html. It is best used as a stepping stone toward a cython version. Otherwise it doesn't have any speed advantages. (Though with this foo, and 'external_loop' it is as fast as foo(I,J,K)). Note that this doesn't need the indices (but see 'multi_index').
And yes, there's vectorize. Convenient, but not a speedy solution.
vfoo=np.vectorize(foo, otypes=['int'])
vfoo(I,J,K)
I have an array of values that I want to replace with values from an array of choices, based on which choice is linearly closest.
The catch is the size of the choices is defined at runtime.
import numpy as np
a = np.array([[0, 0, 0], [4, 4, 4], [9, 9, 9]])
choices = np.array([1, 5, 10])
If choices was static in size, I would simply use np.where
d = np.where(np.abs(a - choices[0]) < np.abs(a - choices[1]),
np.where(np.abs(a - choices[0]) < np.abs(a - choices[2]), choices[0], choices[2]),
np.where(np.abs(a - choices[1]) < np.abs(a - choices[2]), choices[1], choices[2]))
To get the output:
>>d
>>[[1, 1, 1], [5, 5, 5], [10, 10, 10]]
Is there a way to do this more dynamically while still preserving the vectorization?
Subtract choices from a, find the index of the minimum of the result, substitute.
a = np.array([[0, 0, 0], [4, 4, 4], [9, 9, 9]])
choices = np.array([1, 5, 10])
b = a[:,:,None] - choices
np.absolute(b,b)
i = np.argmin(b, axis = -1)
a = choices[i]
print(a)
>>>
[[ 1 1 1]
[ 5 5 5]
[10 10 10]]
a = np.array([[0, 3, 0], [4, 8, 4], [9, 1, 9]])
choices = np.array([1, 5, 10])
b = a[:,:,None] - choices
np.absolute(b,b)
i = np.argmin(b, axis = -1)
a = choices[i]
print(a)
>>>
[[ 1 1 1]
[ 5 10 5]
[10 1 10]]
>>>
The extra dimension was added to a so that each element of choices would be subtracted from each element of a. choices was broadcast against a in the third dimension, so b.shape is (3,3,3). This link has a decent graphic. EricsBroadcastingDoc is a pretty good explanation and has a graphic 3-d example at the end.
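The shapes involved can be checked directly. The sketch below uses np.abs on a new array instead of the in-place np.absolute(b,b) call, purely for readability; the names expanded, r, c and k are illustrative.

```python
import numpy as np

a = np.array([[0, 3, 0], [4, 8, 4], [9, 1, 9]])
choices = np.array([1, 5, 10])

expanded = a[:, :, None]           # shape (3, 3, 1)
b = np.abs(expanded - choices)     # (3, 3, 1) against (3,) -> (3, 3, 3)

# b[r, c, k] is the distance between a[r, c] and choices[k]
r, c, k = 1, 1, 2
check = b[r, c, k] == abs(a[r, c] - choices[k])
print(expanded.shape, b.shape, check)
```

The last axis of b has one entry per choice, which is why argmin along axis=-1 picks the nearest choice for each element.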
For the second example:
>>> print(b)
[[[ 1 5 10]
[ 2 2 7]
[ 1 5 10]]
[[ 3 1 6]
[ 7 3 2]
[ 3 1 6]]
[[ 8 4 1]
[ 0 4 9]
[ 8 4 1]]]
>>> print(i)
[[0 0 0]
[1 2 1]
[2 0 2]]
>>>
The final assignment uses an Index Array or Integer Array Indexing.
In the second example, notice that there was a tie for element a[0,1]: either 1 or 5 could have been substituted. np.argmin returns the index of the first minimum it finds, so 1 was chosen.
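The tie behaviour is easy to see in isolation. This is just an illustrative sketch for the value 3; np.flatnonzero is one way (my choice, not from the answer) to list every tied index if you need to detect such cases.

```python
import numpy as np

choices = np.array([1, 5, 10])
distances = np.abs(3 - choices)        # array([2, 2, 7]): 1 and 5 tie

first = np.argmin(distances)           # argmin returns the first minimum
tied = np.flatnonzero(distances == distances.min())  # every tied index
print(int(first), tied.tolist())
```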
To explain wwii's excellent answer in a little more detail:
The idea is to create a new dimension which does the job of comparing each element of a to each element in choices using numpy broadcasting. This is easily done for an arbitrary number of dimensions in a using the ellipsis syntax:
>>> b = np.abs(a[..., np.newaxis] - choices)
>>> b
array([[[ 1, 5, 10],
[ 1, 5, 10],
[ 1, 5, 10]],
[[ 3, 1, 6],
[ 3, 1, 6],
[ 3, 1, 6]],
[[ 8, 4, 1],
[ 8, 4, 1],
[ 8, 4, 1]]])
Taking argmin along the axis you just created (the last axis, with label -1) gives you the desired index in choices that you want to substitute:
>>> np.argmin(b, axis=-1)
array([[0, 0, 0],
[1, 1, 1],
[2, 2, 2]])
Which finally allows you to choose those elements from choices:
>>> d = choices[np.argmin(b, axis=-1)]
>>> d
array([[ 1, 1, 1],
[ 5, 5, 5],
[10, 10, 10]])
For a non-symmetric shape:
Let's say a had shape (2, 5):
>>> a = np.arange(10).reshape((2, 5))
>>> a
array([[0, 1, 2, 3, 4],
[5, 6, 7, 8, 9]])
Then you'd get:
>>> b = np.abs(a[..., np.newaxis] - choices)
>>> b
array([[[ 1, 5, 10],
[ 0, 4, 9],
[ 1, 3, 8],
[ 2, 2, 7],
[ 3, 1, 6]],
[[ 4, 0, 5],
[ 5, 1, 4],
[ 6, 2, 3],
[ 7, 3, 2],
[ 8, 4, 1]]])
This is hard to read, but what it's saying is, b has shape:
>>> b.shape
(2, 5, 3)
The first two dimensions came from the shape of a, which is also (2, 5). The last dimension is the one you just created. To get a better idea:
>>> b[:, :, 0] # = abs(a - 1)
array([[1, 0, 1, 2, 3],
[4, 5, 6, 7, 8]])
>>> b[:, :, 1] # = abs(a - 5)
array([[5, 4, 3, 2, 1],
[0, 1, 2, 3, 4]])
>>> b[:, :, 2] # = abs(a - 10)
array([[10, 9, 8, 7, 6],
[ 5, 4, 3, 2, 1]])
Note how b[:, :, i] is the absolute difference between a and choices[i], for each i = 0, 1, 2.
Hope that helps explain this a little more clearly.
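Putting the pieces together, one possible way to wrap this up for arrays of any dimensionality is sketched below. nearest_choice is a hypothetical helper name of mine, not something from NumPy or from the answers above.

```python
import numpy as np

def nearest_choice(a, choices):
    """Replace each element of `a` with the closest value in `choices`.

    Hypothetical helper: ties go to the choice that appears first.
    """
    a = np.asarray(a)
    choices = np.asarray(choices)
    # New trailing axis: distances[..., k] == abs(a - choices[k])
    distances = np.abs(a[..., np.newaxis] - choices)
    return choices[np.argmin(distances, axis=-1)]

choices = np.array([1, 5, 10])
one_d = nearest_choice(np.array([0, 4, 9]), choices)
three_d = nearest_choice(np.arange(8).reshape(2, 2, 2), choices)
print(one_d)
print(three_d.shape)
```

Keep in mind that the intermediate distances array has a.size * choices.size elements, so this trades memory for simplicity.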
I love broadcasting and would have gone that way myself too. But, with large arrays, I would like to suggest another approach with np.searchsorted that keeps it memory efficient and thus achieves performance benefits, like so -
def searchsorted_app(a, choices):
    lidx = np.searchsorted(choices, a, 'left').clip(max=choices.size-1)
    ridx = (np.searchsorted(choices, a, 'right')-1).clip(min=0)
    cl = np.take(choices,lidx) # Or choices[lidx]
    cr = np.take(choices,ridx) # Or choices[ridx]
    mask = np.abs(a - cl) > np.abs(a - cr)
    cl[mask] = cr[mask]
    return cl
Please note that if the elements in choices are not sorted, we need to add in the additional argument sorter with np.searchsorted.
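For the unsorted case, a sketch of how sorter might be wired in is shown below. nearest_unsorted is a hypothetical name, and the key assumption is that the indices returned by np.searchsorted with sorter refer to the sorted order, so they have to be mapped back through the argsort result before indexing choices.

```python
import numpy as np

def nearest_unsorted(a, choices):
    # Hypothetical variant of the searchsorted approach for unsorted choices
    order = np.argsort(choices)
    lidx = np.searchsorted(choices, a, 'left', sorter=order).clip(max=choices.size - 1)
    ridx = (np.searchsorted(choices, a, 'right', sorter=order) - 1).clip(min=0)
    cl = choices[order[lidx]]   # nearest choice at or above each element
    cr = choices[order[ridx]]   # nearest choice at or below each element
    mask = np.abs(a - cl) > np.abs(a - cr)
    cl[mask] = cr[mask]
    return cl

a = np.array([[0, 2, 0], [4, 8, 4], [9, 1, 9]])
choices = np.array([10, 1, 5])          # deliberately not sorted
result = nearest_unsorted(a, choices)
print(result)
```

One difference worth knowing: on an exact tie this approach keeps the larger neighbour, whereas the argmin-based broadcasting version keeps whichever tied choice comes first in choices.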
Runtime test -
In [160]: # Setup inputs
...: a = np.random.rand(100,100)
...: choices = np.sort(np.random.rand(100))
...:
In [161]: def broadcasting_app(a, choices): # @wwii's solution
     ...:     return choices[np.argmin(np.abs(a[:,:,None] - choices),-1)]
...:
In [162]: np.allclose(broadcasting_app(a,choices),searchsorted_app(a,choices))
Out[162]: True
In [163]: %timeit broadcasting_app(a, choices)
100 loops, best of 3: 9.3 ms per loop
In [164]: %timeit searchsorted_app(a, choices)
1000 loops, best of 3: 1.78 ms per loop
Related post : Find elements of array one nearest to elements of array two
Here is my code. What I want it to return is an array of matrices
[[1,1],[1,1]], [[2,4],[8,16]], [[3,9],[27,81]]
I know I can probably do it using for loop and looping through my vector k, but I was wondering if there is a simple way that I am missing. Thanks!
import numpy as np
k = np.arange(1, 4, 1)
print(k)
def exam(p):
    return np.array([[p, p**2], [p**3, p**4]])
print(exam(k))
The output:
[1 2 3]
[[[ 1 2 3]
[ 1 4 9]]
[[ 1 8 27]
[ 1 16 81]]]
The key is to play with the shapes and broadcasting.
b = np.arange(1,4) # the base
e = np.arange(1,5) # the exponent
b[:,np.newaxis] ** e
=>
array([[ 1, 1, 1, 1],
[ 2, 4, 8, 16],
[ 3, 9, 27, 81]])
(b[:,None] ** e).reshape(-1,2,2)
=>
array([[[ 1, 1],
[ 1, 1]],
[[ 2, 4],
[ 8, 16]],
[[ 3, 9],
[27, 81]]])
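If you would rather keep the original exam function as it is, another option (not from the answer above, just a sketch) is to move the last axis of its output to the front with np.moveaxis. The names stacked and matrices are illustrative.

```python
import numpy as np

def exam(p):
    return np.array([[p, p**2], [p**3, p**4]])

k = np.arange(1, 4)
stacked = exam(k)                        # shape (2, 2, 3): values of k vary along the last axis
matrices = np.moveaxis(stacked, -1, 0)   # shape (3, 2, 2): one 2x2 matrix per value of k
print(matrices)
```

This gives the same array as the reshape of b[:,None] ** e, and it does not depend on the matrix entries being consecutive powers.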
If you must have the output as a list of matrices, do:
m = (b[:,None] ** e).reshape(-1,2,2)
[ np.asmatrix(a) for a in m ]  # np.mat was removed in NumPy 2.0; np.asmatrix is the equivalent
=>
[matrix([[1, 1],
[1, 1]]),
matrix([[ 2, 4],
[ 8, 16]]),
matrix([[ 3, 9],
[27, 81]])]