store two for loop output in a 2D matrix

I have a 3D matrix 'DATA' with dimensions 100 (L) x 200 (B) x 50 (H). The values are random for each grid point.
I want to find the number of points in each vertical column where the values are between 10 and 20. The output will be a 2D matrix.
For this I used the following code:
out = []
for i in range(np.shape(DATA)[0]):
    for j in range(np.shape(DATA)[1]):
        a = DATA[i,j,:]
        b = a[(a>25) & (a<30)]
        c = len(b)
        out.append(c)
But I am not getting a 2D matrix. Instead I am getting a flat, one-dimensional list.
Please help.

If you want to leverage NumPy functionality:
import numpy as np
data = np.random.randint(0, 50, size=(100,200,50))
range_sum = np.sum(np.logical_and(np.less_equal(data, 20),
                                  np.greater_equal(data, 10)
                                  ), axis=-1)
range_sum.shape
Out[6]: (100, 200)
range_sum
Out[7]:
array([[11, 12, 12, ..., 13, 9, 10],
[ 6, 12, 11, ..., 10, 14, 5],
[11, 11, 16, ..., 10, 12, 15],
...,
[11, 17, 9, ..., 12, 12, 11],
[ 9, 8, 10, ..., 7, 15, 12],
[12, 10, 11, ..., 12, 11, 19]])
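As a quick sanity check, the vectorized count can be compared against an explicit double loop. The sketch below assumes the inclusive bounds 10 and 20 used in the answer above and a smaller random array; the names `vec_count` and `loop_count` are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(0)
data = rng.integers(0, 50, size=(10, 20, 50))  # smaller stand-in for DATA

# Vectorized: count values in [10, 20] along the last (vertical) axis.
vec_count = np.sum((data >= 10) & (data <= 20), axis=-1)

# Explicit loops writing into a preallocated 2D result.
loop_count = np.zeros(data.shape[:2], dtype=int)
for i in range(data.shape[0]):
    for j in range(data.shape[1]):
        column = data[i, j, :]
        loop_count[i, j] = np.count_nonzero((column >= 10) & (column <= 20))

print(vec_count.shape)                        # (10, 20)
print(np.array_equal(vec_count, loop_count))  # True
```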

You're using out as a list, and appending each value. Here's a quick modification to your code that should give you the desired result:
out = []
for i in range(np.shape(DATA)[0]):
    out.append([]) # make a second dim for each i
    for j in range(np.shape(DATA)[1]):
        a = DATA[i,j,:]
        b = a[(a>25) & (a<30)]
        c = len(b)
        out[i].append(c)
The change is that I made out a list of lists. In each iteration over i, we append a new list. Then in the inner loop, we append values to the list at index i.
Update
If you want a numpy.ndarray instead, you can modify your code as follows:
import numpy as np
out = np.zeros(np.shape(DATA)[:2], dtype=int) # initialize to the desired 2D shape
for i in range(np.shape(DATA)[0]):
    for j in range(np.shape(DATA)[1]):
        a = DATA[i,j,:]
        b = a[(a>25) & (a<30)]
        c = len(b)
        out[i, j] = c

Related

Numpy filter matrix based on column

I have a matrix with several different values for each row:
arr1 = np.array([[1,2,3,4,5,6,7,8,9],[10,11,12,13,14,15,16,17,18],[19,20,21,22,23,24,25,26,27]])
arr2 = np.array([["A"],["B"],["C"]])
This produces the following matrices:
array([[ 1, 2, 3, 4, 5, 6, 7, 8, 9],
[10, 11, 12, 13, 14, 15, 16, 17, 18],
[19, 20, 21, 22, 23, 24, 25, 26, 27]])
array([['A'],
['B'],
['C']])
A represents the first 3 columns, B represents the next 3 columns, and C represents the last 3 columns. So the result I'd like here is:
array([[1,2,3],
[13,14,15],
[25,26,27]])
I was thinking about converting arr2 to a mask array, but I'm not even sure how to do this. If it were a 1D array I could do something like this:
arr[[0,1,2]]
but for a 2D array I'm not even sure how to mask like this. I tried this and got errors:
arr[[0,1,2],[3,4,5],[6,7,8]]
What's the best way to do this?
Thanks.

You could use string.ascii_uppercase to look up each letter's position in the alphabet, and reshape arr1 into chunks of 3 columns:
from string import ascii_uppercase
reshaped = np.reshape(arr1, (len(arr1), -1, 3))
reshaped[np.arange(len(arr1)), np.vectorize(ascii_uppercase.index)(arr2).ravel()]
Or just directly map A to 0 and so on...
reshaped = np.reshape(arr1, (len(arr1), -1, 3))
reshaped[np.arange(len(arr1)), np.vectorize(['A', 'B', 'C'].index)(arr2).ravel()]
Both produce:
array([[ 1, 2, 3],
[13, 14, 15],
[25, 26, 27]])
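A similar result can be sketched without reshaping, by turning each label into a starting column and gathering with np.take_along_axis. This assumes the labels are exactly 'A', 'B', 'C' in sorted order and that every block is 3 columns wide; `labels`, `block` and `cols` are illustrative names, not part of the answer above.

```python
import numpy as np

arr1 = np.arange(1, 28).reshape(3, 9)
arr2 = np.array([["A"], ["B"], ["C"]])

labels = np.array(["A", "B", "C"])              # assumed sorted label set
block = np.searchsorted(labels, arr2.ravel())   # label -> block number 0, 1, 2
cols = block[:, None] * 3 + np.arange(3)        # column indices for each row

result = np.take_along_axis(arr1, cols, axis=1)
print(result)
# [[ 1  2  3]
#  [13 14 15]
#  [25 26 27]]
```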

If the shape of arr1 is fixed at (3,9) as shown above, it can be done in a single line of code:
arr2 = np.array([arr1[0][0:3],arr1[1][3:6],arr1[2][6:9]])
The output will be as follows:
[[ 1 2 3]
[13 14 15]
[25 26 27]]

You can use 'advanced indexing', which indexes the target array by coordinate arrays.
rows = np.array([[0,0,0],[1,1,1],[2,2,2]])
cols = np.array([[0,1,2],[3,4,5],[6,7,8]])
arr1[rows, cols]
>>> array([[ 1, 2, 3],
[13, 14, 15],
[25, 26, 27]])
And you can wrap this in a function like:
def diagonal(arr, step):
    rows = np.array([[x]*step for x in range(step)])
    cols = np.array([[y for y in range(x, x+step)] for x in range(0, step**2, step)])
    return arr[rows, cols]
diagonal(arr1, 3)
>>> array([[ 1, 2, 3],
[13, 14, 15],
[25, 26, 27]])
reference: https://numpy.org/devdocs/user/basics.indexing.html
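The index arrays do not have to be written out in full: broadcasting a column of row indices against a 2D array of column indices gives the same selection. A minimal sketch, assuming the same block layout as in the example (the function name `block_diagonal` is hypothetical):

```python
import numpy as np

def block_diagonal(arr, step):
    """Pick columns [i*step, (i+1)*step) from row i of arr."""
    rows = np.arange(arr.shape[0])[:, None]   # shape (n_rows, 1)
    cols = rows * step + np.arange(step)      # shape (n_rows, step)
    return arr[rows, cols]                    # rows broadcasts against cols

arr1 = np.arange(1, 28).reshape(3, 9)
picked = block_diagonal(arr1, 3)
print(picked)
# [[ 1  2  3]
#  [13 14 15]
#  [25 26 27]]
```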

Replacing chunks of elements in numpy array

I have an np.array like this one:
x = [1,2,3,4,5,6,7,8,9,10 ... N]. I need to replace the first n chunks with a certain element, like so:
for i in np.arange(0,125):
    x[i] = x[0]
for i in np.arange(125,250):
    x[i] = x[125]
for i in np.arange(250,375):
    x[i] = x[250]
This is obviously not the way to go, but I just wrote it like this so I can show you what I need to achieve.

One way would be -
In [47]: x
Out[47]: array([10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21])
In [49]: n = 5
In [50]: x[::n][np.arange(len(x))//n]
Out[50]: array([10, 10, 10, 10, 10, 15, 15, 15, 15, 15, 20, 20])
Another with np.repeat -
In [67]: np.repeat(x[::n], n)[:len(x)]
Out[67]: array([10, 10, 10, 10, 10, 15, 15, 15, 15, 15, 20, 20])
For in-situ edit, we can reshape and assign in a broadcasted-manner, like so -
m = (len(x)-1)//n
x[:n*m].reshape(-1,n)[:] = x[:n*m:n,None]
x[n*m:] = x[n*m]
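To see the in-place variant at work, the three lines can be wrapped in a small helper and compared with the np.repeat result. This is only a sketch on the sample array from above; the name `fill_chunks_inplace` is made up for illustration.

```python
import numpy as np

def fill_chunks_inplace(x, n):
    """Overwrite every chunk of length n with that chunk's first element."""
    m = (len(x) - 1) // n                           # number of leading full chunks
    x[:n*m].reshape(-1, n)[:] = x[:n*m:n, None]     # broadcast each chunk's first value
    x[n*m:] = x[n*m]                                # last (possibly shorter) chunk
    return x

x = np.arange(10, 22)
expected = np.repeat(x[::5], 5)[:len(x)]   # out-of-place reference result
fill_chunks_inplace(x, 5)
print(x)   # [10 10 10 10 10 15 15 15 15 15 20 20]
```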

import numpy as np
x = np.arange(0,1000)
a = x[0]
b = x[125]
c = x[250]
x[0:125] = a
x[125:250] = b
x[250:375] = c
There is no need to write loops; you can replace a bunch of values at once using slicing.
If the splits are equal, you can loop to calculate the start and end positions instead of hard-coding them.
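The equal-split case mentioned above can be sketched as follows, assuming a chunk length of 125 and three chunks as in the question (`chunk` and `n_chunks` are illustrative names):

```python
import numpy as np

x = np.arange(0, 1000)
chunk = 125      # assumed chunk length
n_chunks = 3     # assumed number of chunks to overwrite

# Compute each chunk's start; the end is simply start + chunk.
for start in range(0, chunk * n_chunks, chunk):
    x[start:start + chunk] = x[start]

print(x[120:130])   # [  0   0   0   0   0 125 125 125 125 125]
```

Elements from index 375 onward are left untouched.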

To keep flexibility in the number of slice/value pairs you can write something like:
def chunk_replace(array, slice_list, value_list):
    for s,v in zip(slice_list, value_list):
        array[s] = v
    return array
array = np.arange(1000)
slice_list = [slice(0,125), slice(125, 250), slice(250, 375)]
value_list = [array[0], array[125], array[250]]
result = chunk_replace(array, slice_list, value_list)

What is the easiest way in NumPy to index vectors of a matrix excluding one index in each row? [duplicate]

Suppose I have a NumPy ndarray M with the following content at M[0,:]:
[2, 3.9, 7, 9, 0, 1, 8.1, 3.2]
and I am given an integer, k, at runtime between 0 and 7. I want to produce the vector consisting of all items in this row except at column k. (Example: if k=3, then the desired vector is [2,3.9,7,0,1,8.1,3.2])
Is there an easy way to do this?
What if I have a vector of indices k, one for each row of M, representing the column I want to exclude from the row?
I'm kind of lost, other than a non-vectorized loop that mutates a result matrix:
nrows = M.shape[0]
result = np.zeros((nrows, M.shape[1]-1))
for irow in range(nrows):
    result[irow,:k[irow]] = M[irow,:k[irow]] # content before the split point
    result[irow,k[irow]:] = M[irow,k[irow]+1:] # content after the split point

One approach would be with masking/boolean-indexing -
mask = np.ones(M.shape,dtype=bool)
mask[np.arange(len(k)),k] = 0
out = M[mask].reshape(len(M),-1)
Alternatively, we could use broadcasting to get that mask -
np.not_equal.outer(k,np.arange(M.shape[1]))
# or k[:,None]!=np.arange(M.shape[1])
Thus, giving us a one-liner/compact version -
out = M[k[:,None]!=np.arange(M.shape[1])].reshape(len(M),-1)
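To make the mask-based selection concrete, here is a small worked example. It assumes a 3x4 matrix and one excluded column per row; the values of `M` and `k` are chosen only for illustration.

```python
import numpy as np

M = np.arange(12, dtype=float).reshape(3, 4)
k = np.array([0, 3, 1])          # column to drop in rows 0, 1, 2

# True everywhere except at (row, k[row]).
mask = k[:, None] != np.arange(M.shape[1])
out = M[mask].reshape(len(M), -1)
print(out)
# [[ 1.  2.  3.]
#  [ 4.  5.  6.]
#  [ 8. 10. 11.]]

# For a single row and a single index, np.delete does the same job.
single = np.delete(M[0], 3)
print(single)   # [0. 1. 2.]
```

The reshape works because exactly one element is removed from every row, so all rows keep the same length.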
To exclude multiple ones per row, edit the advanced-indexing part for the first method -
def exclude_multiple(M,*klist):
    k = np.stack(klist).T
    mask = np.ones(M.shape,dtype=bool)
    mask[np.arange(len(k))[:,None],k] = 0
    out = M[mask].reshape(len(M),-1)
    return out
Sample run -
In [185]: M = np.arange(40).reshape(4,10)
In [186]: exclude_multiple(M,[1,3,2,0],[4,5,8,1])
Out[186]:
array([[ 0, 2, 3, 5, 6, 7, 8, 9],
[10, 11, 12, 14, 16, 17, 18, 19],
[20, 21, 23, 24, 25, 26, 27, 29],
[32, 33, 34, 35, 36, 37, 38, 39]])

Improvement on @Divakar's answer to extend this to zero or more excluded indices per row:
def excluding(A, *klist):
    """
    excludes column k from each row of A, for each k in klist
    (make sure the index vectors have no common elements)
    """
    mask = np.ones(A.shape,dtype=bool)
    for k in klist:
        mask[np.arange(len(k)),k] = 0
    return A[mask].reshape(len(A),-1)
Test:
M = np.arange(40).reshape(4,10)
excluding(M,[1,3,2,0],[4,5,8,1])
returns
array([[ 0, 2, 3, 5, 6, 7, 8, 9],
[10, 11, 12, 14, 16, 17, 18, 19],
[20, 21, 23, 24, 25, 26, 27, 29],
[32, 33, 34, 35, 36, 37, 38, 39]])

Merging rows in numpy to form new array

This is a sample of what I am trying to accomplish. I am very new to Python and have searched for hours to find out what I am doing wrong. I haven't been able to find what my issue is. I am still new enough that I may be searching for the wrong phrases. If so, could you please point me in the right direction?
I want to combine n arrays into one array. I want the first row of x as the first row of combined, the first row of y as the second row of combined, the first row of z as the third row of combined, the second row of x as the fourth row of combined, etc.
So it would look something like this:
x = [x1 x2 x3]
    [x4 x5 x6]
    [x7 x8 x9]
y = [y1 y2 y3]
    [y4 y5 y6]
    [y7 y8 y9]
z = [z1 z2 z3]
    [z4 z5 z6]
    [z7 z8 z9]
combined = [x1 x2 x3]
           [y1 y2 y3]
           [z1 z2 z3]
           [x4 x5 x6]
           [...]
           [z7 z8 z9]
The best I can come up with is this:
import numpy as np
x = np.random.rand(6,3)
y = np.random.rand(6,3)
z = np.random.rand(6,3)
combined = np.zeros((9,3))
for rows in range(len(x)):
    combined[0::3] = x[rows,:]
    combined[1::3] = y[rows,:]
    combined[2::3] = z[rows,:]
print(combined)
All this does is write the last value of the input array to every third row in the output array instead of what I wanted. I am not sure if this is even the best way to do this. Any advice would help out.
I just figured out that this works, but if someone knows a higher-performance method, please let me know.
import numpy as np
x = np.random.rand(6,3)
y = np.random.rand(6,3)
z = np.random.rand(6,3)
combined = np.zeros((18,3))
for rows in range(6):
    combined[rows*3,:] = x[rows,:]
    combined[rows*3+1,:] = y[rows,:]
    combined[rows*3+2,:] = z[rows,:]
print(combined)

You can do this using a list comprehension and zip:
combined = np.array([row for row_group in zip(x, y, z) for row in row_group])

Using vectorised operations only:
A = np.vstack((x, y, z))
idx = np.arange(A.shape[0]).reshape(-1, x.shape[0]).T.flatten()
A = A[idx]
Here's a demo:
import numpy as np
x, y, z = np.random.rand(3,3), np.random.rand(3,3), np.random.rand(3,3)
print(x, y, z)
[[ 0.88259564 0.17609363 0.01067734]
[ 0.50299357 0.35075811 0.47230915]
[ 0.751129 0.81839586 0.80554345]]
[[ 0.09469396 0.33848691 0.51550685]
[ 0.38233976 0.05280427 0.37778962]
[ 0.7169351 0.17752571 0.49581777]]
[[ 0.06056544 0.70273453 0.60681583]
[ 0.57830566 0.71375038 0.14446909]
[ 0.23799775 0.03571076 0.26917939]]
A = np.vstack((x, y, z))
idx = np.arange(A.shape[0]).reshape(-1, x.shape[0]).T.flatten()
print(idx) # [0 3 6 1 4 7 2 5 8]
A = A[idx]
print(A)
[[ 0.88259564 0.17609363 0.01067734]
[ 0.09469396 0.33848691 0.51550685]
[ 0.06056544 0.70273453 0.60681583]
[ 0.50299357 0.35075811 0.47230915]
[ 0.38233976 0.05280427 0.37778962]
[ 0.57830566 0.71375038 0.14446909]
[ 0.751129 0.81839586 0.80554345]
[ 0.7169351 0.17752571 0.49581777]
[ 0.23799775 0.03571076 0.26917939]]

I have changed your code a little bit to get the desired output:
import numpy as np
x = np.random.rand(6,3)
y = np.random.rand(6,3)
z = np.random.rand(6,3)
combined = np.zeros((18,3))
combined[0::3] = x
combined[1::3] = y
combined[2::3] = z
print(combined)
You had the shape of the combined matrix wrong and there is no real need for the for loop.
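Since the question asks about combining n arrays, the strided assignment can be generalized. The following is a sketch under the assumption that all inputs are 2D arrays of the same shape; the helper name `interleave_rows` is hypothetical.

```python
import numpy as np

def interleave_rows(*arrays):
    """Interleave the rows of equally shaped 2D arrays: a0, b0, c0, a1, b1, ..."""
    n = len(arrays)
    rows, cols = arrays[0].shape
    combined = np.empty((rows * n, cols), dtype=np.result_type(*arrays))
    for i, arr in enumerate(arrays):
        combined[i::n] = arr       # every n-th row, starting at offset i
    return combined

x = np.arange(9).reshape(3, 3)
y = x + 10
z = x + 100
combined = interleave_rows(x, y, z)
print(combined[:4])
# [[  0   1   2]
#  [ 10  11  12]
#  [100 101 102]
#  [  3   4   5]]
```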

This might not be the most Pythonic way to do it, but you could fill combined one block of three rows at a time:
for block in range(len(combined)//3):
    combined[block*3+0] = x[block,:]
    combined[block*3+1] = y[block,:]
    combined[block*3+2] = z[block,:]

A simple numpy solution is to stack the arrays on a new middle axis, and reshape the result to 2d:
In [5]: x = np.arange(9).reshape(3,3)
In [6]: y = np.arange(9).reshape(3,3)+10
In [7]: z = np.arange(9).reshape(3,3)+100
In [8]: np.stack((x,y,z),axis=1).reshape(-1,3)
Out[8]:
array([[ 0, 1, 2],
[ 10, 11, 12],
[100, 101, 102],
[ 3, 4, 5],
[ 13, 14, 15],
[103, 104, 105],
[ 6, 7, 8],
[ 16, 17, 18],
[106, 107, 108]])
It may be easier to see what's happening if we give each dimension a different size; e.g. two 3x4 arrays:
In [9]: x = np.arange(12).reshape(3,4)
In [10]: y = np.arange(12).reshape(3,4)+10
np.array combines them on a new 1st axis, making a 2x3x4 array. To get the interleaving you want, we can transpose the first 2 dimensions, producing a 3x2x4. Then reshape to a 6x4.
In [13]: np.array((x,y))
Out[13]:
array([[[ 0, 1, 2, 3],
[ 4, 5, 6, 7],
[ 8, 9, 10, 11]],
[[10, 11, 12, 13],
[14, 15, 16, 17],
[18, 19, 20, 21]]])
In [14]: np.array((x,y)).transpose(1,0,2)
Out[14]:
array([[[ 0, 1, 2, 3],
[10, 11, 12, 13]],
[[ 4, 5, 6, 7],
[14, 15, 16, 17]],
[[ 8, 9, 10, 11],
[18, 19, 20, 21]]])
In [15]: np.array((x,y)).transpose(1,0,2).reshape(-1,4)
Out[15]:
array([[ 0, 1, 2, 3],
[10, 11, 12, 13],
[ 4, 5, 6, 7],
[14, 15, 16, 17],
[ 8, 9, 10, 11],
[18, 19, 20, 21]])
np.vstack produces a 6x4, but with the wrong order. We can't transpose that directly.
np.stack with default axis behaves just like np.array. But with axis=1, it creates a 3x2x4, which we can reshape:
In [16]: np.stack((x,y), 1)
Out[16]:
array([[[ 0, 1, 2, 3],
[10, 11, 12, 13]],
[[ 4, 5, 6, 7],
[14, 15, 16, 17]],
[[ 8, 9, 10, 11],
[18, 19, 20, 21]]])
The list zip in the accepted answer is a list version of transpose, creating a list of 3 2-element tuples.
In [17]: list(zip(x,y))
Out[17]:
[(array([0, 1, 2, 3]), array([10, 11, 12, 13])),
(array([4, 5, 6, 7]), array([14, 15, 16, 17])),
(array([ 8, 9, 10, 11]), array([18, 19, 20, 21]))]
np.array(list(zip(x,y))) produces the same thing as the stack, a 3x2x4 array.
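The equivalence of the three constructions discussed above can be checked directly. A short sketch using the same 3x4 example arrays (the names `a`, `b` and `c` are illustrative):

```python
import numpy as np

x = np.arange(12).reshape(3, 4)
y = np.arange(12).reshape(3, 4) + 10

a = np.array((x, y)).transpose(1, 0, 2)   # stack on axis 0, then swap first two axes
b = np.stack((x, y), axis=1)              # stack directly on a new middle axis
c = np.array(list(zip(x, y)))             # list-based "transpose" via zip

print(a.shape, b.shape, c.shape)                      # (3, 2, 4) (3, 2, 4) (3, 2, 4)
print(np.array_equal(a, b) and np.array_equal(b, c))  # True
```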
As for speed, I suspect the allocate and assign (as in Ash's answer) is fastest:
In [27]: z = np.zeros((6,4),int)
    ...: for i, arr in enumerate((x,y)):
    ...:     z[i::2,:] = arr
    ...:
In [28]: z
Out[28]:
array([[ 0, 1, 2, 3],
[10, 11, 12, 13],
[ 4, 5, 6, 7],
[14, 15, 16, 17],
[ 8, 9, 10, 11],
[18, 19, 20, 21]])
For serious timings, use much larger examples than this.
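One way to run such a comparison is with the standard timeit module. The sketch below is only a starting point: the array sizes and repeat count are arbitrary assumptions, the function names are made up, and the numbers will vary by machine, so no timings are shown.

```python
import timeit
import numpy as np

rng = np.random.default_rng(0)
x, y, z = (rng.random((2000, 30)) for _ in range(3))

def via_stack():
    # Stack on a new middle axis, then flatten the first two axes.
    return np.stack((x, y, z), axis=1).reshape(-1, x.shape[1])

def via_assign():
    # Allocate the result once and fill it with strided assignments.
    out = np.empty((3 * len(x), x.shape[1]))
    out[0::3] = x
    out[1::3] = y
    out[2::3] = z
    return out

def via_zip():
    # Pure-Python interleaving with zip and a list comprehension.
    return np.array([row for group in zip(x, y, z) for row in group])

for fn in (via_stack, via_assign, via_zip):
    seconds = timeit.timeit(fn, number=20)
    print(f"{fn.__name__}: {seconds:.4f} s for 20 runs")
```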

Efficient numpy array random views with dropped dimensions

For computer vision training purposes, random cropping is often used as a data augmentation technique. At each iteration, a batch of random crops is generated and fed to the network being trained. This needs to be efficient, as it is done at each training iteration.
If the data has too many dimensions, random dimension selection might also be needed. Random frames can be selected in a video for example. The data can even have 4 dimensions (3 in space + time), or more.
How can one write an efficient generator of random views of lower dimension?
A very naïve version for getting 2D views from 3D data, and only one by one, could be:
import numpy as np
import numpy.random as nr
def views():
    # suppose `data` comes from elsewhere
    # data.shape is (n1, n2, n3)
    shape = data.shape
    while True:
        drop_dim = nr.randint(0, 3)
        drop_dim_keep = nr.randint(0, shape[drop_dim])
        selector = np.zeros(shape, dtype=bool)
        if drop_dim == 0:
            selector[drop_dim_keep, :, :] = 1
        elif drop_dim == 1:
            selector[:, drop_dim_keep, :] = 1
        else:
            selector[:, :, drop_dim_keep] = 1
        yield np.squeeze(data[selector])
A more elegant solution probably exists, where at least:
there is no ugly if/else on the randomly chosen dimension
views can take a batch_size integer argument and generate several views at once without a loop
the dimension of input/output data is not specified (e.g. can do 3D -> 2D as well as 4D -> 2D)

I tweaked your function to clarify what it's doing:
def views():
    # suppose `data` comes from elsewhere
    # data.shape is (n1, n2, n3)
    while True:
        drop_dim = nr.randint(0, 3)
        dropshape = list(shape[:])
        dropshape[drop_dim] -= 1
        drop_dim_keep = nr.randint(0, shape[drop_dim])
        print(drop_dim, drop_dim_keep)
        selector = np.ones(shape, dtype=bool)
        if drop_dim == 0:
            selector[drop_dim_keep, :, :] = 0
        elif drop_dim == 1:
            selector[:, drop_dim_keep, :] = 0
        else:
            selector[:, :, drop_dim_keep] = 0
        yield data[selector].reshape(dropshape)
A small sample run:
In [534]: data = np.arange(24).reshape(shape)
In [535]: data
Out[535]:
array([[[ 0, 1, 2, 3],
[ 4, 5, 6, 7],
[ 8, 9, 10, 11]],
[[12, 13, 14, 15],
[16, 17, 18, 19],
[20, 21, 22, 23]]])
In [536]: v = views()
In [537]: next(v)
2 1
Out[537]:
array([[[ 0, 2, 3],
[ 4, 6, 7],
[ 8, 10, 11]],
[[12, 14, 15],
[16, 18, 19],
[20, 22, 23]]])
In [538]: next(v)
0 0
Out[538]:
array([[[12, 13, 14, 15],
[16, 17, 18, 19],
[20, 21, 22, 23]]])
So it's picking one of the dimensions, and for that dimension dropping one 'column'.
The main efficiency issue is whether it's returning a view or a copy. In this case it has to return a copy.
You are using a boolean mask to select the return, exactly the same as what np.delete does in this case.
In [544]: np.delete(data,1,2).shape
Out[544]: (2, 3, 3)
In [545]: np.delete(data,0,0).shape
Out[545]: (1, 3, 4)
So you could replace much of your internals with delete, letting it take care of generalizing the dimensions. Look at its code to see how it handles those details (it isn't short and sweet!).
def rand_delete():
    # suppose `data` comes from elsewhere
    # data.shape is (n1, n2, n3)
    while True:
        drop_dim = nr.randint(0, 3)
        drop_dim_keep = nr.randint(0, shape[drop_dim])
        print(drop_dim, drop_dim_keep)
        yield np.delete(data, drop_dim_keep, drop_dim)
In [547]: v1=rand_delete()
In [548]: next(v1)
0 1
Out[548]:
array([[[ 0, 1, 2, 3],
[ 4, 5, 6, 7],
[ 8, 9, 10, 11]]])
In [549]: next(v1)
2 0
Out[549]:
array([[[ 1, 2, 3],
[ 5, 6, 7],
[ 9, 10, 11]],
[[13, 14, 15],
[17, 18, 19],
[21, 22, 23]]])
Replace the delete with take:
def rand_take():
    while True:
        take_dim = nr.randint(0, 3)
        take_keep = nr.randint(0, shape[take_dim])
        print(take_dim, take_keep)
        yield np.take(data, take_keep, axis=take_dim)
In [580]: t = rand_take()
In [581]: next(t)
0 0
Out[581]:
array([[ 0, 1, 2, 3],
[ 4, 5, 6, 7],
[ 8, 9, 10, 11]])
In [582]: next(t)
2 3
Out[582]:
array([[ 3, 7, 11],
[15, 19, 23]])
np.take returns a copy, but the equivalent slicing does not:
In [601]: data.__array_interface__['data']
Out[601]: (182632568, False)
In [602]: np.take(data,0,1).__array_interface__['data']
Out[602]: (180099120, False)
In [603]: data[:,0,:].__array_interface__['data']
Out[603]: (182632568, False)
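The same copy-versus-view distinction can be checked without comparing raw addresses, using np.shares_memory. A small sketch on the 2x3x4 sample array (the names `taken` and `sliced` are illustrative):

```python
import numpy as np

data = np.arange(24).reshape(2, 3, 4)

taken = np.take(data, 0, axis=1)   # new array with its own buffer
sliced = data[:, 0, :]             # basic indexing: a view on data

print(np.shares_memory(data, taken))    # False
print(np.shares_memory(data, sliced))   # True
print(np.array_equal(taken, sliced))    # True
```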
A slicing tuple can be generated with expressions like
In [604]: idx = [slice(None)]*data.ndim
In [605]: idx[1] = 0
In [606]: data[tuple(idx)]
Out[606]:
array([[ 0, 1, 2, 3],
[12, 13, 14, 15]])
Various numpy functions that take an axis parameter construct an indexing tuple like this (for example, one or more of the apply... functions).
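Putting the slicing-tuple idea into a generator gives views rather than copies, works for any number of dimensions, and needs no if/else on the chosen axis. This is a sketch, not a tuned implementation: the name `random_views` and the use of np.random.default_rng are assumptions, and each view drops exactly one dimension.

```python
from itertools import islice
import numpy as np

def random_views(data, rng=None):
    """Yield views of `data` with one randomly chosen axis fixed at a random index."""
    rng = np.random.default_rng() if rng is None else rng
    while True:
        axis = rng.integers(data.ndim)          # which dimension to drop
        index = rng.integers(data.shape[axis])  # which position to keep on that axis
        idx = [slice(None)] * data.ndim
        idx[axis] = index
        yield data[tuple(idx)]                  # basic indexing, so no copy is made

data = np.arange(24).reshape(2, 3, 4)
batch = list(islice(random_views(data, np.random.default_rng(0)), 4))
for view in batch:
    print(view.shape, np.shares_memory(view, data))
```

Because the possible view shapes differ, a batch is collected here as a list; stacking into a single array would only work when all views share a shape, and it would copy the data.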
