Changed value in numpy array based on index and on criteria - python

I have a numpy array:
arr = np.array([[1, 2, 3, 4, 5], [6, 7, 8, 9, 10]])
>> arr
[[ 1 2 3 4 5]
[ 6 7 8 9 10]]
I want to take a portion of the array based on indices (not slices):
ix = np.ix_([0, 1], [0, 2])
>> arr[ix]
[[1 3]
[6 8]]
And I want to modify those elements in the original array, which would work if I did this:
arr[ix] = 0
>> arr
[[ 0 2 0 4 5]
[ 0 7 0 9 10]]
But I only want to change them if they satisfy a specific condition, for example if they are less than 5. I am trying this:
subarr = arr[ix]
subarr[subarr < 5] = 0
But it doesn't modify the original one.
>> arr
[[ 1 2 3 4 5]
[ 6 7 8 9 10]]
>> subarr
[[0 0]
[6 8]]
I am not sure why this is not working, since both accessing the array by indices with np.ix_ and using a mask subarr < 5 should return a view of the array, not a copy.

Fancy (advanced) indexing returns a copy, not a view, so modifying subarr leaves the original array unchanged. You can use numpy.where together with an indexed assignment to update the values in arr itself:
arr[ix] = np.where(arr[ix] < 5, 0, arr[ix])
array([[ 0, 2, 0, 4, 5],
[ 6, 7, 8, 9, 10]])

When you do:
arr[ix] = 0
The Python interpreter calls arr.__setitem__(ix, 0), which modifies the original object.
In the second case, subarr is independent of arr: it is a copy of the selected subset of arr, and you then modify only that copy.
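The copy-versus-view distinction described above can be checked directly. The following is a minimal sketch, reusing arr and ix from the question; the names sub, view, shares_fancy and shares_slice are introduced here only for illustration.

```python
import numpy as np

arr = np.array([[1, 2, 3, 4, 5], [6, 7, 8, 9, 10]])
ix = np.ix_([0, 1], [0, 2])

# Reading with fancy indexing produces a new array (a copy).
sub = arr[ix]
shares_fancy = np.shares_memory(arr, sub)

# Reading with a basic slice produces a view onto the same memory.
view = arr[:, 0:3:2]  # columns 0 and 2, expressed as a slice
shares_slice = np.shares_memory(arr, view)

print(shares_fancy, shares_slice)

# Assigning through the index (arr[ix] = ...) goes via __setitem__
# and therefore writes into arr itself.
arr[ix] = np.where(arr[ix] < 5, 0, arr[ix])
print(arr)
```

When the wanted rows and columns happen to be expressible as a slice, as they are here, modifying the view would also change arr; for arbitrary index lists the indexed assignment is the safer route.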

Related

How can I generate a matrix with random values based from a larger matrix in Python?

I would like to know whether there is a way to generate a matrix with values drawn from a larger matrix. For example, if I have
larger_matrix = np.random.randint(10, size=(10,5))
Out[1]:
array([[0, 9, 0, 0, 3],
[9, 4, 7, 7, 0],
[9, 4, 5, 6, 9],
[6, 3, 1, 7, 3],
[8, 4, 6, 9, 7],
[8, 1, 5, 8, 8],
[9, 9, 6, 0, 9],
[9, 9, 6, 8, 7],
[5, 5, 6, 6, 4],
[4, 4, 7, 0, 7]])
and I want to create smaller_matrix of size (4, 5), with values randomly sampled from larger_matrix, how should I go about this? I'm aware that the function np.random.choice() exists, but I'm quite unsure if it would be helpful for my problem because I'm dealing with matrices instead of lists. Thank you.
Use flatten to convert the 2D larger_matrix to 1D.
Then use np.random.choice to draw a random sample from it (by default it samples with replacement, so the same element can be picked more than once).
Finally, use reshape to convert the 1D sample to a 2D matrix.
code:
import numpy as np
larger_matrix = np.random.randint(10, size=(10,5))
print(larger_matrix)
n = 4
m = 5
print(np.reshape(np.random.choice(larger_matrix.flatten(),size = n*m),(n,m)))
result:
[[7 4 4 6 0]
[5 7 0 6 8]
[9 9 0 0 5]
[9 8 0 6 7]
[0 9 8 8 1]
[3 7 1 0 0]
[8 9 2 3 8]
[6 3 7 2 9]
[9 7 5 9 3]
[8 8 3 5 8]]
[[0 0 8 0 9]
[6 9 2 7 0]
[8 7 6 0 7]
[7 4 9 3 7]]
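If each element of larger_matrix should be used at most once, one option is to sample without replacement. This is a sketch rather than part of the original answer: it assumes the newer Generator API (np.random.default_rng), and the seed value is arbitrary and only there to make the run repeatable.

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed, for repeatability only
larger_matrix = rng.integers(10, size=(10, 5))

n, m = 4, 5
# ravel() gives a 1D view; replace=False draws each position at most once.
smaller_matrix = rng.choice(larger_matrix.ravel(), size=(n, m), replace=False)

print(larger_matrix)
print(smaller_matrix)
```

Values can still repeat in smaller_matrix when larger_matrix itself contains repeated values; what replace=False guarantees is that no single position of the larger matrix is drawn twice.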
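You can run a nested for loop over the smaller matrix and fill it with elements taken at random indices of the larger matrix (this assumes smaller_matrix has already been allocated, e.g. with np.zeros((4, 5), dtype=int)):
for i in range(len(smaller_matrix)):
    for j in range(len(smaller_matrix[0])):
        rand1 = np.random.randint(larger_matrix.shape[0])  # random row index
        rand2 = np.random.randint(larger_matrix.shape[1])  # random column index
        smaller_matrix[i][j] = larger_matrix[rand1][rand2]
That should cover it. Just make sure you generate two new random numbers for each element.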
You could do it like this, but bear in mind that random.choice(m) picks whole rows of the large array, and the chosen rows may be duplicated:
import numpy as np
import random
R1 = 10
R2 = 4
C = 5
m = np.random.randint(R1, size=(R1, C))
print(m)
print()
n = []
for _ in range(R2):
    n.append(random.choice(m))
print(np.array(n))
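To avoid the duplicated rows mentioned above, the row indices can be drawn without replacement and then used to index the array. This is a sketch under the assumption that picking distinct rows is acceptable; rng and rows are names introduced here, and the seed is arbitrary.

```python
import numpy as np

R1, R2, C = 10, 4, 5
rng = np.random.default_rng(0)  # arbitrary seed, for repeatability only
m = rng.integers(R1, size=(R1, C))

# Draw R2 distinct row indices, then select those rows in one step.
rows = rng.choice(m.shape[0], size=R2, replace=False)
n = m[rows]

print(m)
print()
print(n)
```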

How in numpy get elements of matrix between two indices arrays?

Let's say I have a matrix:
>> a = np.arange(25).reshape(5, 5)
>> a
[[ 0 1 2 3 4]
[ 5 6 7 8 9]
[10 11 12 13 14]
[15 16 17 18 19]
[20 21 22 23 24]]
and two vectors of indices that define a span of matrix elements that I want to extract:
>> indices1 = np.array([0, 1, 1, 0, 0])
>> indices2 = np.array([2, 3, 3, 2, 2])
As you can see, the difference between each pair of corresponding indices is equal to 2.
I would like to do something like this to extract a part of the matrix:
>> submatrix = a[indices1:indices2, :]
so that the result would be a 2x5 matrix:
>> submatrix
[[ 0 6 7 3 4],
[ 5 11 12 8 9]]
As far as I know, numpy allows slice boundaries to be given only as integers, e.g. a[0:2], not as arrays.
Note that what I want to extract is not a contiguous submatrix.
Do you know of some other way of indexing a numpy matrix that makes it possible to provide arrays defining the spans? For now I have managed to do it only with for loops.
For reference, the most obvious loop (still took several experimental steps):
In [87]: np.concatenate([a[i:j,n] for n,(i,j) in enumerate(zip(indices1,indices2))], ).reshape(-1,2).T
Out[87]:
array([[ 0, 6, 7, 3, 4],
[ 5, 11, 12, 8, 9]])
Broadcasted indices taking advantage of the constant length:
In [88]: indices1+np.arange(2)[:,None]
Out[88]:
array([[0, 1, 1, 0, 0],
[1, 2, 2, 1, 1]])
In [89]: a[indices1+np.arange(2)[:,None],np.arange(5)]
Out[89]:
array([[ 0, 6, 7, 3, 4],
[ 5, 11, 12, 8, 9]])
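The broadcasted version can be written without hard-coding the span length or the number of columns. This is a sketch that assumes, as in the question, that indices2 - indices1 is the same for every column; span, rows and cols are names introduced here.

```python
import numpy as np

a = np.arange(25).reshape(5, 5)
indices1 = np.array([0, 1, 1, 0, 0])
indices2 = np.array([2, 3, 3, 2, 2])

# Assumption: every span has the same length, so take it from the first pair.
span = int(indices2[0] - indices1[0])
assert np.all(indices2 - indices1 == span)

# rows has shape (span, 5): row r holds the start index of each column plus r.
rows = indices1 + np.arange(span)[:, None]
# cols has shape (5,) and broadcasts against rows, pairing each column
# with its own run of row indices.
cols = np.arange(a.shape[1])

submatrix = a[rows, cols]
print(submatrix)
```

If the spans had different lengths, the result would no longer be rectangular, and a loop or a masked approach would be needed instead.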

Using numpy.ones as indices of an array

I would like to translate a matlab code into a python one. The matlab code is equivalent to the following toy example:
a = [1 2 3; 4 5 6; 7 8 9]
b = a(:, ones(1,3))
It returns
a =
1 2 3
4 5 6
7 8 9
b =
1 1 1
4 4 4
7 7 7
I tried to translate it like this:
from numpy import array
from numpy import ones
a = array([ [1,2,3], [4,5,6], [7,8,9] ])
b = a[:, ones((1,3))]
but it returns the following error message:
Traceback (most recent call last):
File "example_slice.py", line 6, in <module>
b =a[:, ones((1,3))]
IndexError: arrays used as indices must be of integer (or boolean) type
EDIT: maybe ones should be replaced by zeros in this particular case but it is not the problem here. The question deals with the problem of giving a list containing the same index many times to the array a in order to get the same array b as the one computed with Matlab.
The MATLAB code can also be written (more idiomatically and more clearly) as:
b = repmat(a(:,1),1,3);
In NumPy you'd write:
b = np.tile(a[:,None,0],(1,3))
(Note the None needed to preserve the orientation of the vector extracted).
You could use a list comprehension with np.full() to create arrays filled with a given value.
import numpy as np
a = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
b = np.array([np.full(len(i), i[0]) for i in a])
print(b)
Output:
[[1 1 1]
[4 4 4]
[7 7 7]]
In [568]: a = np.array([ [1,2,3], [4,5,6], [7,8,9] ])
In [569]: a[:,0]
Out[569]: array([1, 4, 7])
In [570]: a[:,[0,0,0]]
Out[570]:
array([[1, 1, 1],
[4, 4, 4],
[7, 7, 7]])
In [571]: a[:, np.zeros(3, dtype=int)] # int dtype to avoid your error
Out[571]:
array([[1, 1, 1],
[4, 4, 4],
[7, 7, 7]])
====
In [572]: np.zeros(3)
Out[572]: array([0., 0., 0.])
In [573]: np.zeros(3, int)
Out[573]: array([0, 0, 0])
Earlier numpy versions allowed float indices, but newer ones have tightened the requirement.
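Putting the points above together: the MATLAB index ones(1,3) is 1-based and floating point, so a direct translation needs both an integer dtype and a shift to 0-based indexing. The sketch below is illustrative; matlab_idx, numpy_idx and b_view are names introduced here.

```python
import numpy as np

a = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

# np.ones(3) is a float array holding the MATLAB-style (1-based) index 1.
matlab_idx = np.ones(3)
# Cast to int and shift to NumPy's 0-based indexing before using it.
numpy_idx = matlab_idx.astype(int) - 1

b = a[:, numpy_idx]
print(b)

# Alternative without index arrays: broadcast the first column.
# broadcast_to returns a read-only view, so copy it before modifying.
b_view = np.broadcast_to(a[:, :1], (3, 3))
print(b_view)
```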

Getting the index of the minimum value in each slice of `ndarray`

I am trying to do something that should be straightforward and can be accomplished in a for-loop but I am trying to avoid that.
I would like to get the index of the minimum value in each slice along a certain axis of a numpy.ndarray, a. I am more interested in the index than the value itself. I use the index to get a value from another 2D array with shape equal to the first 2 dimensions of a.
Here is a naive implementation using a for-loop:
a = np.random.randint(0, 10, 60).reshape(3, 4, 5)
print(a)
for i in range(a.shape[-1]):
    idx = a[..., i].argmin()
    print('Slice:', i, '| Index:', idx, '| min value:',
          a[..., i].flat[idx])
Out:
[[[1 9 4 0 7]
[6 3 1 6 8]
[7 8 2 0 2]
[8 6 1 6 5]]
[[8 7 0 6 9]
[7 2 6 4 5]
[3 4 9 2 9]
[1 4 8 0 7]]
[[1 4 6 6 2]
[9 9 5 6 7]
[6 2 8 9 9]
[3 9 8 5 4]]]
Slice: 0 | Index: 0 | min value: 1
Slice: 1 | Index: 5 | min value: 2
Slice: 2 | Index: 4 | min value: 0
Slice: 3 | Index: 0 | min value: 0
Slice: 4 | Index: 2 | min value: 2
I realise I can pass an axis keyword argument to argmin but that does not produce the result I am looking for.
For the specific case given in your question, you can reshape your array, then use argmin:
>>> import numpy as np
>>> a = np.array([[[1, 9, 4, 0, 7],
... [6, 3, 1, 6, 8],
... [7, 8, 2, 0, 2],
... [8, 6, 1, 6, 5]],
...
... [[8, 7, 0, 6, 9],
... [7, 2, 6, 4, 5],
... [3, 4, 9, 2, 9],
... [1, 4, 8, 0, 7]],
...
... [[1, 4, 6, 6, 2],
... [9, 9, 5, 6, 7],
... [6, 2, 8, 9, 9],
... [3, 9, 8, 5, 4]]])
>>> a.reshape(-1, a.shape[2]).min(axis=0)
array([1, 2, 0, 0, 2])
>>> a.reshape(-1, a.shape[2]).argmin(axis=0)
array([0, 5, 4, 0, 2])
>>>
Here a.shape[2] is the size of the last axis, the one you do not want to reduce over: the reshape merges the first two dimensions into one, so argmin(axis=0) finds the minimum across both of them for each slice. The returned indices are positions along that merged axis.
You also need the slice number, which is just the second index of the reshaped array. That is easy, since it is sequential:
slices = np.arange(a.shape[2])
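Since the question mentions using the index to look up a value in another 2D array shaped like the first two dimensions of a, the flat indices can be converted back to (row, column) pairs with np.unravel_index. This is a sketch; the array other is a hypothetical stand-in for that second array.

```python
import numpy as np

a = np.array([[[1, 9, 4, 0, 7], [6, 3, 1, 6, 8], [7, 8, 2, 0, 2], [8, 6, 1, 6, 5]],
              [[8, 7, 0, 6, 9], [7, 2, 6, 4, 5], [3, 4, 9, 2, 9], [1, 4, 8, 0, 7]],
              [[1, 4, 6, 6, 2], [9, 9, 5, 6, 7], [6, 2, 8, 9, 9], [3, 9, 8, 5, 4]]])

# Flat index of the minimum within each slice a[..., i].
flat_idx = a.reshape(-1, a.shape[2]).argmin(axis=0)

# Convert flat indices into indices along the first two dimensions.
rows, cols = np.unravel_index(flat_idx, a.shape[:2])
slices = np.arange(a.shape[2])

# Sanity check: indexing a with all three recovers the minima.
minima = a[rows, cols, slices]

# Hypothetical second array with the shape of a's first two dimensions.
other = np.arange(12).reshape(a.shape[:2]) * 10
picked = other[rows, cols]

print(minima)
print(picked)
```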

How to specify columns when using repeated indices with numpy [for use with np.add.at()]

I'm trying to apply an addition operator to an array where I want repeated indices to indicate repeated addition operations. From a Python Data Science Book (https://jakevdp.github.io/PythonDataScienceHandbook/02.07-fancy-indexing.html), it seems that this is possible using np.add.at(original matrix, indices, thing to add), but I can't figure out how to specify the indices to operate on columns, not rows.
e.g. Dummy Example
# Create Array
A = np.arange(12)
A = A.reshape(4,3)
print(A)
gives
[[ 0 1 2]
[ 3 4 5]
[ 6 7 8]
[ 9 10 11]]
and
# Create columns to add to A (in reality, all values won't be the same)
B = np.ones_like(A[:, [0,0]])
print(B)
gives
[[1 1]
[1 1]
[1 1]
[1 1]]
I want to perform the operation A[:, [0, 0]] += B but using the system where repeated indices indicate repeated operations (so in this case, both columns of B get added to column 0). The result should thus be:
[[ 2 1 2]
[ 5 4 5]
[ 8 7 8]
[ 11 10 11]]
This can be done using np.add.at(A, I, B) I believe, but how do I specify the indices I to correspond to [:, [0,0]] as this gives a syntax error (it seems that the indices matrix can't contain the : character?).
Thanks
In [12]: A = np.arange(12).reshape(4,3)
In [13]: np.add.at(A, (slice(None), [0,0]), 1)
In [14]: A
Out[14]:
array([[ 2, 1, 2],
[ 5, 4, 5],
[ 8, 7, 8],
[11, 10, 11]])
This could also be written with s_ as
np.add.at(A, np.s_[:, [0,0]], 1)
np.s_ is a helper object that lets us use indexing notation to create the necessary tuple. In an indexing context, the Python interpreter converts : into a slice object.
In [19]: np.s_[:, [0,0]]
Out[19]: (slice(None, None, None), [0, 0])
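The same call also accepts the array B from the question instead of the scalar 1. The sketch below contrasts it with the plain += form, which applies the update to column 0 only once; buffered and unbuffered are names introduced here for the comparison.

```python
import numpy as np

A = np.arange(12).reshape(4, 3)
B = np.ones_like(A[:, [0, 0]])  # shape (4, 2), all ones

# Plain in-place add: the repeated index 0 is written once,
# so column 0 only increases by 1.
buffered = A.copy()
buffered[:, [0, 0]] += B

# np.add.at accumulates over repeated indices,
# so both columns of B are added to column 0.
unbuffered = A.copy()
np.add.at(unbuffered, np.s_[:, [0, 0]], B)

print(buffered[:, 0])
print(unbuffered[:, 0])
```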
