numpy multiple slicing booleans - python

I'm having trouble editing values in a numpy array
import numpy as np
foo = np.ones((10, 10, 2))
foo[row_criteria, col_criteria, 0] += 5
foo[row_criteria,:,0][:,col_criteria] += 5
row_criteria and col_criteria are 1D boolean arrays. In the first case I get a
"shape mismatch: objects cannot be broadcast to a single shape" error.
In the second case, the += 5 doesn't get applied at all. When I do
foo[row_criteria,:,0][:,col_criteria] + 5
I get a modified return value but modifying the value in place doesn't seem to work...
Can someone explain how to fix this? Thanks!

You want:
foo[np.ix_(row_criteria, col_criteria, [0])] += 5
To understand how this works take this example:
import numpy as np
A = np.arange(25).reshape([5, 5])
print(A[[0, 2, 4], [0, 2, 4]])
# [ 0 12 24]
# The above example gives the elements A[0, 0], A[2, 2], A[4, 4].
# But what if I want the "outer product"? I.e., for [[0, 2, 4], [1, 3]] I want
# A[0, 1], A[0, 3], A[2, 1], A[2, 3], A[4, 1], A[4, 3]
print(A[np.ix_([0, 2, 4], [1, 3])])
# [[ 1  3]
#  [11 13]
#  [21 23]]
The same thing works with boolean indexing. Also, np.ix_ doesn't do anything really amazing; it just reshapes its arguments so they can be broadcast against each other:
i, j = np.ix_([0, 2, 4], [1, 3])
print(i.shape)
# (3, 1)
print(j.shape)
# (1, 2)
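To tie this back to the original question, here is a minimal sketch of both the failing chained form and the np.ix_ form. The two masks are hypothetical, chosen only for illustration; the original post does not say how row_criteria and col_criteria were built.

```python
import numpy as np

foo = np.ones((10, 10, 2))

# Hypothetical masks (assumptions, not from the original question)
row_criteria = np.arange(10) % 2 == 0   # rows 0, 2, 4, 6, 8
col_criteria = np.arange(10) < 3        # columns 0, 1, 2

# Chained indexing: the first boolean index returns a copy, so the
# in-place add lands on that temporary copy and foo is left untouched.
foo[row_criteria, :, 0][:, col_criteria] += 5
sum_after_chained = foo.sum()           # still 200.0

# np.ix_ builds one combined index, so the add is applied to foo itself.
foo[np.ix_(row_criteria, col_criteria, [0])] += 5
sum_after_ix = foo.sum()                # 200 + 5 rows * 3 cols * 5 = 275.0

print(sum_after_chained, sum_after_ix)
```

The second form works because the whole selection happens in a single indexing operation on foo, rather than on an intermediate result.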

Related

How to extract the upper values from the whole 3D array in python

Is it possible to extract the upper values from the whole 3D array?
A simple example of a 3D array is below:
import numpy as np
a = np.array([[[7, 4, 2], [5, 0, 4], [0, 0, 5]],
[[7, 6, 1], [3, 9, 5], [0, 8, 7]],
[[8, 10, 3], [1, 2, 15], [9, 0, 1]]])
You can use NumPy's matrix-building functions such as numpy.triu (upper triangle) or numpy.tril (lower triangle) to return a copy of a matrix with the elements below (triu) or above (tril) the k-th diagonal zeroed.
If, on the other hand, you are only interested in the values above or below the diagonal (without making a copy of the matrix), you can use numpy.triu_indices and numpy.tril_indices, as follows:
upper_index = np.triu_indices(n=3, k=1)
where n is the size of the arrays for which the returned indices will be valid, and k is the diagonal offset.
This returns the indices for the triangle. The returned tuple contains two arrays, each with the indices along one dimension of the array:
(array([0, 0, 1], dtype=int64), array([1, 2, 2], dtype=int64))
Now you can use these indices to index your array:
a[upper_index]
which gives:
array([[5, 0, 4],
[0, 0, 5],
[0, 8, 7]])
Similarly you can find the part under the diagonal using numpy.tril_indices.
IIUC, you could use triu_indices:
result = a[np.triu_indices(3)]
print(result)
Output
[[7 4 2]
[5 0 4]
[0 0 5]
[3 9 5]
[0 8 7]
[9 0 1]]
If you want those strictly above the diagonal, you can pass an offset value:
result = a[np.triu_indices(3, 1)]
print(result)
Output
[[5 0 4]
[0 0 5]
[0 8 7]]
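Note that both answers apply the triangle indices to the first two axes of the 3D array, so each selected item is a length-3 vector along the last axis. The sketch below makes that explicit. It also shows a variant that takes the upper triangle of every 2D slice; whether that is what the question intended is an assumption.

```python
import numpy as np

a = np.array([[[7, 4, 2], [5, 0, 4], [0, 0, 5]],
              [[7, 6, 1], [3, 9, 5], [0, 8, 7]],
              [[8, 10, 3], [1, 2, 15], [9, 0, 1]]])

# Index pairs strictly above the diagonal of a 3x3 grid: (0,1), (0,2), (1,2)
rows, cols = np.triu_indices(3, k=1)

# Applied to axes 0 and 1: picks a[0, 1], a[0, 2], a[1, 2]
strict_upper = a[rows, cols]

# Assumed alternative: applied to axes 1 and 2, one row of values per 2D slice
per_slice = a[:, rows, cols]

print(strict_upper)
print(per_slice)
```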

How to set individual indices in Numpy arrays

I am trying to use arrays to set values in other arrays. Unfortunately instead of setting a value it is somehow overwriting a bunch of values. What is going on, and how can I achieve what I want?
>>> target = np.array( [ [0,1],[1,2],[2,3] ])
>>> target
array([[0, 1],
[1, 2],
[2, 3]])
>>> actions = np.array([0,0,0])
>>> target[actions] #The first row, 3 times
array([[0, 1],
[0, 1],
[0, 1]])
>>> target[:,actions] #The first column, 3 times
array([[0, 0, 0],
[1, 1, 1],
[2, 2, 2]])
>>> values = np.array([7,8,9])
>>> target[:,actions] = values #why isnt this working?
>>> target
array([[9, 1],
[9, 2],
[9, 3]])
#Actually want
#array([[7, 1],
# [8, 2],
# [9, 3]])
>>> target = np.array( [ [0,1],[1,2],[2,3] ]) #reset to original value
>>> actions = np.array([0,1,0])
>>> target[:,actions] = values.reshape(3, 1)
>>> target
array([[7, 7],
[8, 8],
[9, 9]])
#Actually want
#array([[7, 1],
# [1, 8],
# [9, 3]])
target[:,actions] selects the same column of target thrice.
When you say target[:,actions] = values, what you are doing is:
Assign 7 to all the values in the column, three times.
Assign 8 to all the values in the column, three times.
Assign 9 to all the values in the column, three times.
So you end up with 9 in all the values in the column.
If you insist on this awkward triple-writing of data, you can fix it by transposing the write:
target[:,actions] = values.reshape(3, 1)
This will write [7,8,9] to the column, three times. Obviously that's wasteful, and you could do this instead:
target[:,actions[-1]] = values
The effect should be the same, and it saves computation.
2 ways to write [7,8,9] to the first column:
basic indexing (with slice):
In [396]: target[:,0] = [7,8,9] # all rows, 1st column
In [397]: target
Out[397]:
array([[7, 1],
[8, 2],
[9, 3]])
Advanced indexing (with 2 lists)
In [398]: target[[0,1,2],[0,0,0]] = [7,8,9] # pair [0,0],[1,0],[2,0]
In [399]: target
Out[399]:
array([[7, 1],
[8, 2],
[9, 3]])
The 2nd method also works for a mix of columns:
In [400]: target = np.array( [ [0,1],[1,2],[2,3] ])
In [401]: target[[0,1,2],[0,1,0]] = [7,8,9]
In [402]: target
Out[402]:
array([[7, 1],
[1, 8],
[9, 3]])
Broadcasting comes into play. In a case like this there are 3 arrays that may need to broadcast: the 2 index arrays and the source array.
Advanced indexing like this produces a 1d array. So the source array has to match:
In [403]: target[[0,1,2],[0,1,0]]
Out[403]: array([7, 8, 9])
A (1,3) can broadcast to (3,), but a (3,1) can't:
In [404]: target[[0,1,2],[0,1,0]] = np.array([[7,8,9]])
In [405]: target[[0,1,2],[0,1,0]] = np.array([[7,8,9]]).T
...
ValueError: shape mismatch: value array of shape (3,1) could not be broadcast to indexing result of shape (3,)
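The shape rule above can be checked with a short sketch. The variable names here (rows, cols, column, raised) are illustrative only.

```python
import numpy as np

target = np.array([[0, 1], [1, 2], [2, 3]])
rows, cols = [0, 1, 2], [0, 1, 0]          # pairs (0,0), (1,1), (2,0)

# A (1, 3) source is accepted for the (3,) indexing result
target[rows, cols] = np.array([[7, 8, 9]])

# A (3, 1) source is rejected
column = np.array([[7], [8], [9]])
try:
    target[rows, cols] = column
    raised = False
except ValueError:
    raised = True

# Flattening the column to shape (3,) makes the assignment valid again
target[rows, cols] = column.ravel()
print(raised)
print(target)
```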
This sort of indexing is unusual. Note that the result is (3,3).
In [412]: target[:,[0,0,0]]
Out[412]:
array([[0, 0, 0],
[1, 1, 1],
[2, 2, 2]])
A (3,1) source:
In [413]: np.array([[7,8,9]]).T
Out[413]:
array([[7],
[8],
[9]])
In [414]: target[:,[0,0,0]] = _
In [415]: target
Out[415]:
array([[7, 1],
[8, 2],
[9, 3]])
The (3,1) can broadcast to (3,3). It works, but ends up assigning [7,8,9] 3 times, all to the same 0 column.
Another way of assigning the 1st column:
In [423]: target[np.ix_([0,1,2],[0,0,0])]
Out[423]:
array([[0, 0, 0],
[1, 1, 1],
[2, 2, 2]])
Again a (3,3), which accepts a (3,1):
In [424]: target[np.ix_([0,1,2],[0,0,0])] = np.array([[7,8,9]]).T
In [425]: target
Out[425]:
array([[7, 1],
[8, 2],
[9, 3]])
ix_ makes 2 arrays that can broadcast against each other, in this case a column vector and a row one:
In [426]: np.ix_([0,1,2],[0,0,0])
Out[426]:
(array([[0],
[1],
[2]]), array([[0, 0, 0]]))
I can select all elements of target with:
In [430]: target[np.ix_([0,1,2],[0,1])]
Out[430]:
array([[0, 1],
[1, 2],
[2, 3]])
and in a jumbled order:
In [431]: target[np.ix_([2,0,1],[1,0])]
Out[431]:
array([[3, 2],
[1, 0],
[2, 1]])
I couldn't get it to work using : indexing; however, the following works by using an array of row indices. I'm not sure why the : method doesn't work. If someone can come up with a way to fix that, I will accept it instead.
>>> target = np.array( [ [0,1],[1,2],[2,3] ])
>>> rows = np.arange(target.shape[0])
>>> actions = np.array([0,1,0])
>>> values = np.array([7,8,9])
>>> target[rows,actions] = values
>>> target
array([[7, 1],
[1, 8],
[9, 3]])
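The : form fails because a slice selects every row for each entry of actions, giving a (3, 3) selection, whereas two index arrays are paired up element by element. As an alternative sketch (not something the answers above use), np.put_along_axis and np.take_along_axis express the same "one column per row" idea without building the row index by hand:

```python
import numpy as np

target = np.array([[0, 1], [1, 2], [2, 3]])
actions = np.array([0, 1, 0])
values = np.array([7, 8, 9])

# For each row i, write values[i] into column actions[i] (modifies target in place).
# The index and value arrays need the same number of dimensions as target,
# hence the [:, None] to make them (3, 1).
np.put_along_axis(target, actions[:, None], values[:, None], axis=1)

# Read the same positions back as a flat array
picked = np.take_along_axis(target, actions[:, None], axis=1).ravel()

print(target)
print(picked)
```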

Numpy array indexing with indices

I wrote the following:
arr3=np.array([[[1,2,3],[1,2,3],[1,2,3],[1,2,3]],[[2,2,3],[4,2,3],[4,2,2],[2,2,2]],[[1,1,1],[1,1,1],[1,1,1],[1,1,1]]])
As I expected,
arr3[0:3,1] should return the same result as
arr3[0:3][1]:array([[2, 2, 3],[4, 2, 3],[4, 2, 2],[2, 2, 2]])
But it returns:array([[1, 2, 3],[4, 2, 3],[1, 1, 1]]).
BTW, I am using python3 in Jupyter notebook
When doing arr3[0:3,1], you are taking elements 0:3 along the first axis and then, for each of those, taking the element at index 1 (the second one).
This gives a different result from taking 0:3 along the first axis with arr3[0:3] and then taking the second array (index 1) along that same first axis.
In this case, the 0:3 part selects everything either way, since the array's shape is (3, 4, 3), so taking the first 3 just gives you back the same array. It does absolutely nothing in the second case, but in the first case it serves as a placeholder so that you can access the second axis; for that you should just use a colon: [:, some_index].
See how it's the same array?
>>> arr3[0:3]
array([[[1, 2, 3],
[1, 2, 3],
[1, 2, 3],
[1, 2, 3]],
[[2, 2, 3],
[4, 2, 3],
[4, 2, 2],
[2, 2, 2]],
[[1, 1, 1],
[1, 1, 1],
[1, 1, 1],
[1, 1, 1]]])
But then when you do arr3[:, 1] you are taking the second element from the second axis of the array so that will give you:
array([[1, 2, 3],
[4, 2, 3],
[1, 1, 1]])
whereas in the other case, you are taking the second element from the first axis of the array, so:
array([[2, 2, 3],
[4, 2, 3],
[4, 2, 2],
[2, 2, 2]])
To read further about numpy indexing, take a look at this page on scipy.
Take note of this specific description which applies directly to your problem:
When there is at least one slice (:), ellipsis (...) or np.newaxis in the index (or the array has more dimensions than there are advanced indexes), then the behaviour can be more complicated. It is like concatenating the indexing result for each advanced index element
Let's look at our multidimensional numpy array:
import numpy as np
arr3=np.array([
[
[1,2,3],[1,2,3],[1,2,3],[1,2,3]
],[
[2,2,3],[4,2,3],[4,2,2],[2,2,2]
],[
[1,1,1],[1,1,1],[1,1,1],[1,1,1]
]
])
print(arr3[0:3,1])
That returns:
[[1 2 3]
[4 2 3]
[1 1 1]]
Which makes sense because we are fetching the blocks at indices 0 through 2 along the first axis and grabbing only the row at index 1 (the second row) from each.
However, arr3[0:3][1] first returns the blocks at indices 0 through 2 and then selects the second block (index 1).
Observe:
print(arr3[0:3])
Returns:
[[[1 2 3]
[1 2 3]
[1 2 3]
[1 2 3]]
[[2 2 3]
[4 2 3]
[4 2 2]
[2 2 2]]
[[1 1 1]
[1 1 1]
[1 1 1]
[1 1 1]]]
It returns a new array (which happens to have the same contents as our current array, because we just asked for all rows in the array). Then we ask for the second row:
print(arr3[0:3][1])
Returns:
[[2 2 3]
[4 2 3]
[4 2 2]
[2 2 2]]
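The difference is easiest to see from the shapes. A short sketch, where one_step and two_step are illustrative names:

```python
import numpy as np

arr3 = np.array([
    [[1, 2, 3], [1, 2, 3], [1, 2, 3], [1, 2, 3]],
    [[2, 2, 3], [4, 2, 3], [4, 2, 2], [2, 2, 2]],
    [[1, 1, 1], [1, 1, 1], [1, 1, 1], [1, 1, 1]],
])                                  # shape (3, 4, 3)

one_step = arr3[0:3, 1]             # index 1 along axis 1 -> shape (3, 3)
two_step = arr3[0:3][1]             # index 1 along axis 0 -> shape (4, 3)

print(one_step.shape, two_step.shape)
print(np.array_equal(one_step, arr3[:, 1]))   # same as using a plain colon
print(np.array_equal(two_step, arr3[1]))      # the slice contributes nothing
```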

How to return some column items in a NumPy array?

I want to print some items in a 2D NumPy array.
For example:
a = [[1, 2, 3, 4],
[5, 6, 7, 8]]
a = numpy.array(a)
My questions:
How can I return just (1 and 2)? As well as (5 and 6)?
And how can I keep the dimension as [2, 2]
The following:
a[:, [0, 1]]
will select only the first two columns (with index 0 and 1). The result will be:
array([[1, 2],
[5, 6]])
You can use slicing to get necessary parts of the numpy array.
To get 1 and 2 you need to select row 0 and the first two columns, i.e.
>>> a[0, 0:2]
array([1, 2])
Similarly for 5 and 6
>>> a[1, 0:2]
array([5, 6])
You can also select a 2x2 subarray, e.g.
>>> a[:,0:2]
array([[1, 2],
[5, 6]])
You can also do it like this:
In [44]: a[:, :2]
Out[44]:
array([[1, 2],
[5, 6]])
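The answers above differ in two ways that may matter: whether the result keeps two dimensions, and whether it shares memory with the original array. A small sketch comparing them (the variable names are illustrative):

```python
import numpy as np

a = np.array([[1, 2, 3, 4],
              [5, 6, 7, 8]])

row_1d = a[0, 0:2]          # integer row index drops that axis -> shape (2,)
row_2d = a[0:1, 0:2]        # slicing the row instead keeps it  -> shape (1, 2)

by_slice = a[:, :2]         # basic slicing: a view onto a
by_list = a[:, [0, 1]]      # list of columns: advanced indexing, a copy

print(row_1d.shape, row_2d.shape, by_slice.shape, by_list.shape)
print(np.shares_memory(by_slice, a), np.shares_memory(by_list, a))
```

Writing into by_slice would change a, while writing into by_list would not.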

Search Numpy array with multiple values

I have a 2D NumPy array with duplicate values.
I am searching the array like this.
In [104]: import numpy as np
In [105]: array = np.array
In [106]: a = array([[1, 2, 3],
...: [1, 2, 3],
...: [2, 5, 6],
...: [3, 8, 9],
...: [4, 8, 9],
...: [4, 2, 3],
...: [5, 2, 3]])
In [107]: num_list = [1, 4, 5]
In [108]: for i in num_list:
...: print(a[np.where(a[:,0] == i)])
...:
[[1 2 3]
[1 2 3]]
[[4 8 9]
[4 2 3]]
[[5 2 3]]
The input is a list of numbers to match against the values in column 0.
The end result I want is the resulting rows in any format like array, list or tuple for example
array([[1, 2, 3],
[1, 2, 3],
[4, 8, 9],
[4, 2, 3],
[5, 2, 3]])
My code works fine but doesn't seem pythonic. Is there any better searching strategy with multiple values?
like a[np.where(a[:,0] == l)] where only one time lookup is done to get all the values.
my real array is large
Approach #1 : Using np.in1d -
a[np.in1d(a[:,0], num_list)]
Approach #2 : Using np.searchsorted -
num_arr = np.sort(num_list) # Sort num_list and get as array
# Get indices of occurrences of first column in num_list
idx = np.searchsorted(num_arr, a[:,0])
# Take care of out of bounds cases
idx[idx==len(num_arr)] = 0
out = a[a[:,0] == num_arr[idx]]
You can do
a[numpy.in1d(a[:, 0], num_list), :]
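Recent NumPy releases deprecate np.in1d in favor of np.isin, which performs the same membership test. A minimal sketch of the lookup with np.isin, using the data from the question:

```python
import numpy as np

a = np.array([[1, 2, 3],
              [1, 2, 3],
              [2, 5, 6],
              [3, 8, 9],
              [4, 8, 9],
              [4, 2, 3],
              [5, 2, 3]])
num_list = [1, 4, 5]

# Boolean mask: True where the first-column value appears in num_list
mask = np.isin(a[:, 0], num_list)
result = a[mask]            # one vectorized lookup, row order preserved

print(result)
```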
