I have 4D numpy array a and 3D array b. Also I have 2D arrays of indices i0, j0, k0. Suppose I want to use the following construction:
np.add.at(a, (slice(None), k0, i0, j0), b)
As far as I understand, a[slice(None), k0, i0, j0] += b is not equivalent to this np.add.at call.
The question is: how can this np.add.at line be properly replaced with a simple numpy addition of the form a[...] += b[...]?
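To illustrate the difference mentioned above, here is a minimal 1-D sketch. The names idx, vals, buffered and unbuffered are made up for the example and are not part of the original problem. It shows how the two forms diverge once an index repeats:

```python
import numpy as np

idx = np.array([0, 0, 1])        # index 0 appears twice
vals = np.array([10, 20, 30])

# Fancy-indexed += reads a[idx], adds vals, then writes back,
# so for a repeated index only the last write survives.
buffered = np.zeros(3, dtype=int)
buffered[idx] += vals

# np.add.at is unbuffered: every contribution to a repeated index is accumulated.
unbuffered = np.zeros(3, dtype=int)
np.add.at(unbuffered, idx, vals)

print(buffered)
print(unbuffered)
```

When no index is repeated, the two forms should give the same result.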
You wrote that i0, j0 and k0 are 2-D arrays.
This is why I assumed that each row in these arrays defines a 3-D slice
in b array (each array in the respective dimension).
For the test I defined these arrays as:
k0 = np.array([[0,1],[1,2]])
i0 = np.array([[1,3],[2,4]])
j0 = np.array([[0,2],[2,4]])
so that the first 3-D slice from b is: b[0:1, 1:3, 0:2].
Because k0, i0 and j0 (taken "by row") define consecutive slices,
you cannot combine them into a single instruction.
My proposal is to perform your addition in the following loop:
for sl in np.dstack([k0, i0, j0]):
    idx = tuple(np.insert(np.r_[np.apply_along_axis(
        lambda row: slice(*row), 0, sl)], 0, slice(None)))
    a[tuple(idx)] += b[tuple(idx[1:])]
idx is a 4-tuple used to index array "a". Its first element is slice(None),
which means: operate on all elements in the first dimension.
The remaining elements are the slices for the subsequent dimensions.
To index array b, the same idx is used, but without its first
element.
To test the above code, I defined both arrays as:
aDim = (2, 2, 4, 4)
bDim = aDim[1:]
a = np.arange(1, np.array(aDim).prod() + 1).reshape(aDim)
b = np.arange(501, np.array(bDim).prod() + 501).reshape(bDim)
(print them to see the initial content).
After my code was executed, a contains:
array([[[[ 1, 2, 3, 4],
[510, 512, 7, 8],
[518, 520, 11, 12],
[ 13, 14, 15, 16]],
[[ 17, 18, 19, 20],
[ 21, 22, 23, 24],
[ 25, 26, 554, 556],
[ 29, 30, 562, 564]]],
[[[ 33, 34, 35, 36],
[542, 544, 39, 40],
[550, 552, 43, 44],
[ 45, 46, 47, 48]],
[[ 49, 50, 51, 52],
[ 53, 54, 55, 56],
[ 57, 58, 586, 588],
[ 61, 62, 594, 596]]]])
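For what it's worth, the same loop can probably be written without np.apply_along_axis. This is only a sketch under the same assumption as above, namely that each row of k0, i0 and j0 holds a (start, stop) pair for one dimension. It builds the slices directly with zip:

```python
import numpy as np

k0 = np.array([[0, 1], [1, 2]])
i0 = np.array([[1, 3], [2, 4]])
j0 = np.array([[0, 2], [2, 4]])

aDim = (2, 2, 4, 4)
bDim = aDim[1:]
a = np.arange(1, np.prod(aDim) + 1).reshape(aDim)
b = np.arange(501, np.prod(bDim) + 501).reshape(bDim)

# One iteration per row of the index arrays; each row is a (start, stop) pair.
for k, i, j in zip(k0, i0, j0):
    idx = (slice(*k), slice(*i), slice(*j))
    # Prepend slice(None) so that the whole first axis of a is updated.
    a[(slice(None),) + idx] += b[idx]

print(a)
```

Since the two blocks do not overlap here, plain += gives the same result as accumulating the contributions one by one.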
I am trying to solve the following problem in a more numpy-friendly way (without loops):
G is an NxM matrix filled with 0, 1 or 2
D is a 3xNxM matrix
We want an NxM matrix R with R[i,j] = D[k,i,j], where k = G[i,j]
A loop-based solution is:
def getVals(g, d):
    arr = np.zeros(g.shape)
    for row in range(g.shape[0]):
        for column in range(g.shape[1]):
            arr[row, column] = d[g[row, column], row, column]
    return arr
Try with ogrid and advanced indexing:
x,y = np.ogrid[:N,:M]
out = D[G, x[None], y[None]]
Test:
N,M=4,5
G = np.random.randint(0,3, (N,M))
D = np.random.rand(3,N,M)
np.allclose(getVals(G,D), D[G, x[None], y[None]])
# True
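One detail worth checking is the shape of the result. In the sketch below (the seeded generator and the names out_3d and out_2d are my own additions), x[None] and y[None] have three dimensions, so the indexed result carries a leading axis of length 1. Indexing with x and y directly appears to give the N x M result:

```python
import numpy as np

N, M = 4, 5
rng = np.random.default_rng(0)
G = rng.integers(0, 3, size=(N, M))
D = rng.random((3, N, M))

x, y = np.ogrid[:N, :M]           # x has shape (N, 1), y has shape (1, M)

out_3d = D[G, x[None], y[None]]   # index shapes broadcast to (1, N, M)
out_2d = D[G, x, y]               # index shapes broadcast to (N, M)

print(out_3d.shape, out_2d.shape)
```

np.allclose still returns True for the 3-D version because the comparison itself broadcasts.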
You could also use np.take_along_axis
Then you can simply extract your values along one specific axis:
# Example input data:
G = np.random.randint(0,3,(4,5)) # 4x5 array
D = np.random.randint(0,9,(3,4,5)) # 3x4x5 array
# Get the results:
R = np.take_along_axis(D,G[None,:],axis=0)
Since G must have the same number of dimensions as D, we simply add a new leading dimension to G with G[None,:].
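Note that the result then keeps that extra leading axis. This small check (the seeded generator and the name expected are assumptions for the example) drops it with [0] and compares against the explicit double loop:

```python
import numpy as np

rng = np.random.default_rng(0)
G = rng.integers(0, 3, size=(4, 5))      # 4x5 array of 0, 1 or 2
D = rng.integers(0, 9, size=(3, 4, 5))   # 3x4x5 array

# take_along_axis returns shape (1, 4, 5); [0] drops the leading axis.
R = np.take_along_axis(D, G[None, :, :], axis=0)[0]

# Reference result from the explicit double loop.
expected = np.empty_like(G)
for i in range(G.shape[0]):
    for j in range(G.shape[1]):
        expected[i, j] = D[G[i, j], i, j]

print(np.array_equal(R, expected))
```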
Here's my try (I assume g and d are Numpy Ndarrays):
def getVals(g, d):
    m, n = g.shape
    indexes = g.flatten()*m*n + np.arange(m*n)
    arr = d.flatten()[indexes].reshape(m, n)
    return arr
So if
d = [[[96, 89, 51, 40, 51],
[31, 72, 39, 77, 33]],
[[34, 11, 54, 86, 73],
[12, 21, 74, 39, 14]],
[[14, 91, 38, 77, 97],
[44, 55, 93, 88, 55]]]
and
g = [[2, 1, 2, 1, 1],
[0, 2, 0, 0, 2]]
then you are going to get
arr = [[14, 11, 38, 86, 73],
[31, 55, 39, 77, 55]]
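The flat-index arithmetic above can also be delegated to np.ravel_multi_index. This is a sketch of the same idea on the example data, not a claim that it is faster. The names rows, cols and flat are introduced only for illustration:

```python
import numpy as np

d = np.array([[[96, 89, 51, 40, 51],
               [31, 72, 39, 77, 33]],
              [[34, 11, 54, 86, 73],
               [12, 21, 74, 39, 14]],
              [[14, 91, 38, 77, 97],
               [44, 55, 93, 88, 55]]])
g = np.array([[2, 1, 2, 1, 1],
              [0, 2, 0, 0, 2]])

# Row and column index grids with the same shape as g.
rows, cols = np.indices(g.shape)

# Convert each (g[i, j], i, j) triple into a position in the flattened d.
flat = np.ravel_multi_index((g, rows, cols), d.shape)
arr = d.ravel()[flat]

print(arr)
```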
I have a numpy array a which is n x 1. In addition, I have a function F(x,y) which takes two values and returns a single value. I want to construct an n x n matrix b where b_ij = F(a_i, a_j) (with a_i and a_j taken from the array a). Is there any way to do this without looping over both arrays?
Assume that your function is:
def F(a_i, a_j):
    return (a_i + a_j) if a_i % 2 == 0 else (a_i + a_j + 1)
To call it on 2 arrays in 1 go, define the vectorized version of
this function:
FF = np.vectorize(F)
Then call it:
result = FF(a, a.T)
As the source array I used:
a = np.array([[1], [5], [10], [50], [80]])
so its shape is (5, 1) (a single-column array) and got:
array([[ 3, 7, 12, 52, 82],
[ 7, 11, 16, 56, 86],
[ 11, 15, 20, 60, 90],
[ 51, 55, 60, 100, 130],
[ 81, 85, 90, 130, 160]])
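Keep in mind that np.vectorize is essentially a convenience wrapper around a Python-level loop. When F consists only of array-friendly operations, as in this example, plain broadcasting should give the same matrix. A sketch for this particular F (the name pair_sum is made up here):

```python
import numpy as np

a = np.array([[1], [5], [10], [50], [80]])   # shape (5, 1)

# a + a.T broadcasts (5, 1) against (1, 5) into the full 5 x 5 table of sums.
pair_sum = a + a.T

# Apply the rule from F: add 1 in every row where a_i is odd.
result = np.where(a % 2 == 0, pair_sum, pair_sum + 1)

print(result)
```

This only works because this F can be expressed with array operations; for an arbitrary Python function np.vectorize remains the simpler option.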
I was wondering what the use of the comma was when slicing Python arrays - I have an example that appears to work, but the line that looks weird to me is
p = 20*numpy.log10(numpy.abs(numpy.fft.rfft(data[:2048, 0])))
Now, I know that when slicing an array, the first number is start, the next is end, and the last is step, but what does the comma after the end number designate? Thanks.
It is being used to extract a specific column from a 2D array.
So your example would extract column 0 (the first column) from the first 2048 rows (0 to 2047). Note, however, that this syntax only works for numpy arrays and not for general Python lists.
Empirically - create an array using numpy
m = np.fromfunction(lambda i, j: (i +1)* 10 + j + 1, (9, 4), dtype=int)
which assigns an array like below to m
array(
[[11, 12, 13, 14],
[21, 22, 23, 24],
[31, 32, 33, 34],
[41, 42, 43, 44],
[51, 52, 53, 54],
[61, 62, 63, 64],
[71, 72, 73, 74],
[81, 82, 83, 84],
[91, 92, 93, 94]])
Now for the slice
m[:,0]
giving us
array([11, 21, 31, 41, 51, 61, 71, 81, 91])
I may have misinterpreted Khan Academy (so take this with a grain of salt):
In linear algebra terms, m[:,n] takes the nth column vector of
the matrix m
See Abhranil's note on how this specific interpretation only applies to numpy
It slices with a tuple. What exactly the tuple means depends on the object being sliced. In NumPy arrays, it performs an m-dimensional slice on an n-dimensional array.
>>> class C(object):
...     def __getitem__(self, val):
...         print(val)
...
>>> c = C()
>>> c[1:2,3:4]
(slice(1, 2, None), slice(3, 4, None))
>>> c[5:6,7]
(slice(5, 6, None), 7)
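Applied to the expression from the question, this means data[:2048, 0] is just data indexed with the tuple (slice(None, 2048), 0). A small sketch on a made-up 4 x 3 array (data, col, same and nested are illustrative names) shows the equivalence, and that a plain list rejects such a tuple:

```python
import numpy as np

data = np.arange(12).reshape(4, 3)

col = data[:2, 0]                    # comma syntax: first 2 rows, column 0
same = data[(slice(None, 2), 0)]     # the explicit tuple that the comma builds

print(col, same)

# A plain nested list does not accept a tuple as an index.
nested = data.tolist()
error = None
try:
    nested[:2, 0]
except TypeError as exc:
    error = exc
print(type(error).__name__)
```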
I'm working with numpy 1.6.2 and python 2.7.
I have an N x M x D matrix A and a matrix I that contains a list of indices.
I have to fill a zeros matrix ACopy with sums of elements of A according to the indices found in I (see code).
Here is my code:
ACopy = zeros(A.shape)
for j in xrange(0, size(A, 0)):
    i = I[j]
    ACopy[j, i, :] = A[j, i, :] + A[j, i + 1, :]
Indices matrix:
I = array([2, 0, 3, 2, 1])
A matrix:
A = array([[[ 0, 1, 2],
[ 3, 4, 5],
[ 6, 7, 8],
[ 9, 10, 11],
[12, 13, 14]],
[[15, 16, 17],
[18, 19, 20],
[21, 22, 23],
[24, 25, 26],
[27, 28, 29]],
[[30, 31, 32],
[33, 34, 35],
[36, 37, 38],
[39, 40, 41],
[42, 43, 44]],
[[45, 46, 47],
[48, 49, 50],
[51, 52, 53],
[54, 55, 56],
[57, 58, 59]],
[[60, 61, 62],
[63, 64, 65],
[66, 67, 68],
[69, 70, 71],
[72, 73, 74]]])
I tried to improve my code by avoiding the for loop, like this:
r = r_[0:len(I)]
ACopy[r, I, :] = A[r, I, :] + A[r, I + 1, :]
I noticed that the output matrices ACopy are different and I can't understand why. Any idea?
Thank you all!
EDIT: I'm computing a lot of matrices and I compare them with np.array_equal(ACopy1, ACopy2), where ACopy1 is the output of the first method and ACopy2 the output of the second method. Sometimes the matrices are the same, but not every time. Should the two methods always give the same output, or are there any borderline cases?
EDIT2: I noticed that this strange behaviour happens only when matrix height is bigger than 256.
Here is my test suite:
from numpy import *
w = 5
h = 257
for i in xrange(1000):
    Z = random.rand(w, h, 5)
    I = (random.rand(w) * h - 1).astype(uint8)
    r = r_[0:w]
    ZCopy = zeros(Z.shape)
    ZCopy2 = zeros(Z.shape)
    for j in xrange(0, size(Z, 0)):
        i = I[j]
        ZCopy[j, i, :] = Z[j, i, :] + Z[j, i + 1, :]
    ZCopy2[r, I, :] = Z[r, I, :] + Z[r, I + 1, :]
    if (ZCopy - ZCopy2).any() > 0:
        print(ZCopy, ZCopy2, I)
        raise ValueError
I found the problem!
I cast the matrix I to uint8, so its elements are limited to the range 0 to 255.
I resolved it by using I = (random.rand(w) * h - 1).astype(uint32)
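A likely mechanism behind the mismatch is that arithmetic on a uint8 array stays in uint8, so I + 1 wraps around from 255 to 0. How a single uint8 scalar plus 1 behaves seems to depend on the NumPy version, which would explain why the loop and the vectorized version could disagree. The sketch below only demonstrates the array case; the names narrow and wide are made up:

```python
import numpy as np

narrow = np.array([254, 255], dtype=np.uint8)
wide = narrow.astype(np.uint32)

# Adding 1 to a uint8 array keeps the uint8 dtype, so 255 + 1 wraps to 0.
print(narrow + 1, (narrow + 1).dtype)

# With a wider dtype the index after 255 is 256, as intended.
print(wide + 1, (wide + 1).dtype)
```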
In Python 2.7, using numpy or any other means, if I had an array of any size and wanted to exclude certain values and output the new array, how would I do that? Here is what I would like:
[(1,2,3),
(4,5,6), then exclude [4,2,9] to make the array[(1,5,3),
(7,8,9)] (7,8,6)]
I would always be excluding data of the same length as the row length, and always only one entry per column. [(1,5,3)] would be another example of data I would want to exclude. So every time I loop the function, it reduces the array row count by one. I imagine I have to use a masked array, or convert my mask to a masked array and subtract the two, then maybe condense the output, but I have no idea how. Thanks for your time.
You can do it very efficiently if you transform your 2-D array into an unraveled 1-D array. Then you repeat the array of elements to be excluded, called e, in order to do an element-wise comparison:
import numpy as np
a = np.array([[1, 2, 3],
[4, 5, 6],
[7, 8, 9]])
e = [1, 5, 3]
ar = a.T.ravel()
er = np.repeat(e, a.shape[0])
ans = ar[er != ar].reshape(a.shape[1], a.shape[0]-1).T
But it will only work if each element in e matches exactly one row of a.
EDIT:
as suggested by @Jaime, you can avoid the ravel() and get the same result directly with:
ans = a.T[(a != e).T].reshape(a.shape[1], a.shape[0]-1).T
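Because the reshape silently depends on exactly one match per column, it may be worth guarding against other inputs. Here is a sketch of such a wrapper. The function name exclude_per_column and the choice of raising ValueError are my own assumptions, not part of the answer above:

```python
import numpy as np

def exclude_per_column(a, e):
    """Remove e[j] from column j of a, requiring exactly one match per column."""
    a = np.asarray(a)
    mask = a != np.asarray(e)            # True where the entry is kept
    # Every column must keep exactly rows - 1 entries, otherwise the reshape breaks.
    if not (mask.sum(axis=0) == a.shape[0] - 1).all():
        raise ValueError("each value in e must match exactly one entry in its column")
    return a.T[mask.T].reshape(a.shape[1], a.shape[0] - 1).T

a = np.array([[1, 2, 3],
              [4, 5, 6],
              [7, 8, 9]])
print(exclude_per_column(a, [4, 2, 9]))
```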
To exclude vector e from matrix a:
import numpy as np
a = np.array([(1,2,3), (4,5,6), (7,8,9)])
e = [4,2,9]
print(np.array([[i for i in a.transpose()[j] if i != e[j]]
                for j in range(len(e))]).transpose())
This would take some work to generalize, but here's something that can handle 2-d cases of the kind you describe. If passed unexpected input, this won't notice and will generate strange results, but it's at least a starting point:
import numpy

def columnwise_compress(a, values):
    a_shape = a.shape
    a_trans_flat = a.transpose().reshape(-1)
    compressed = a_trans_flat[~numpy.in1d(a_trans_flat, values)]
    return compressed.reshape(a_shape[:-1] + ((a_shape[0] - 1),)).transpose()
Tested:
>>> columnwise_compress(numpy.arange(9).reshape(3, 3) + 1, [4, 2, 9])
array([[1, 5, 3],
[7, 8, 6]])
>>> columnwise_compress(numpy.arange(9).reshape(3, 3) + 1, [1, 5, 3])
array([[4, 2, 6],
[7, 8, 9]])
The difficulty is that you're asking for "compression" of a kind that numpy.compress doesn't do (removing different values for each column or row) and you're asking for compression along columns instead of rows. Compressing along rows is easier because it moves along the natural order of the values in memory; you might consider working with transposed arrays for that reason. If you want to do that, things become a bit simpler:
>>> a = numpy.array([[1, 4, 7],
... [2, 5, 8],
... [3, 6, 9]])
>>> a[~numpy.in1d(a, [4, 2, 9]).reshape(3, 3)].reshape(3, 2)
array([[1, 7],
[5, 8],
[3, 6]])
You'll still need to handle shape parameters intelligently if you do it this way, but it will still be simpler. Also, this assumes there are no duplicates in the original array; if there are, this could generate wrong results. Saullo's excellent answer partially avoids the problem, but any value-based approach isn't guaranteed to work unless you're certain that there aren't duplicate values in the columns.
In the spirit of @SaulloCastro's answer, but handling multiple occurrences of items, you can remove the first occurrence in each column by doing the following:
def delete_skew_row(a, b):
    rows, cols = a.shape
    row_to_remove = np.argmax(a == b, axis=0)
    items_to_remove = np.ravel_multi_index((row_to_remove,
                                            np.arange(cols)),
                                           a.shape, order='F')
    ret = np.delete(a.T, items_to_remove)
    return np.ascontiguousarray(ret.reshape(cols, rows-1).T)
rows, cols = 5, 10
a = np.random.randint(100, size=(rows, cols))
b = np.random.randint(rows, size=(cols,))
b = a[b, np.arange(cols)]
>>> a
array([[50, 46, 85, 82, 27, 41, 45, 27, 17, 26],
[92, 35, 14, 34, 48, 27, 63, 58, 14, 18],
[90, 91, 39, 19, 90, 29, 67, 52, 68, 69],
[10, 99, 33, 58, 46, 71, 43, 23, 58, 49],
[92, 81, 64, 77, 61, 99, 40, 49, 49, 87]])
>>> b
array([92, 81, 14, 82, 46, 29, 67, 58, 14, 69])
>>> delete_skew_row(a, b)
array([[50, 46, 85, 34, 27, 41, 45, 27, 17, 26],
[90, 35, 39, 19, 48, 27, 63, 52, 68, 18],
[10, 91, 33, 58, 90, 71, 43, 23, 58, 49],
[92, 99, 64, 77, 61, 99, 40, 49, 49, 87]])