Insert into numpy array with condition from another array - python

I have a numpy array arr containing 0s and 1s,
arr = np.random.randint(2, size=(800,800))
Then I cast it with astype(np.float32) and inserted various float numbers at various positions. What I actually want is to insert those float numbers only where the original array had 1; where the original array had 0, I want to keep 0.
My thought was to take a copy of the array (with .copy()) and reinsert from that later. So now I have arr above (1s and 0s), and a same-shaped array arr2 with numerical elements. I want to replace the elements in arr2 with those in arr only where (and everywhere where) the element in arr is 0. How can I do this?
Small example:
arr = np.array([[1, 0],
                [0, 1]])
arr2 = np.array([[2.43, 5.25],
                 [1.54, 2.59]])
Desired output:
arr2 = np.array([[2.43, 0],
                 [0, 2.59]])
N.B. should be as fast as possible on arrays of around 800x800

Simply do:
arr2[arr == 0] = 0
or
arr2 = arr2 * arr
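The two one-liners above are not fully interchangeable. A minimal sketch of the difference, assuming arr2 may contain NaN (the NaN value here is a made-up illustration, not part of the question): multiplication builds a new array and leaves NaN as NaN, while boolean-mask assignment writes into the existing array and overwrites whatever was there.

```python
import numpy as np

arr = np.array([[1, 0],
                [0, 1]])
# Hypothetical data: one NaN placed where arr is 0.
arr2 = np.array([[2.43, np.nan],
                 [1.54, 2.59]])

# Multiplication allocates a new array; NaN * 0 is still NaN.
product = arr2 * arr

# Mask assignment modifies the array in place, so work on a copy here.
masked = arr2.copy()
masked[arr == 0] = 0

print(product)
print(masked)
```

If the data can hold NaN or inf at the positions to be cleared, the mask assignment is the safer choice.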

@swag2198 is correct; an alternative is below.
NumPy has a function called 'where' which lets you choose values based on a condition from another array. This is essentially masking.
The code below will achieve what you want: it returns an array with the same dimensions as arr2, except that wherever there is a zero in arr, the value is replaced with zero.
arr = np.array([[1,0],[0,1]])
arr2 = np.array([[2.43, 5.25],
[1.54, 2.59]])
arr_out = np.where(arr, arr2, 0)
The advantage of this approach is that you can pick values from two arrays if you wish. Say you wanted to mask an image, for instance, and replace the background.
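As a rough sketch of the two-array case mentioned above, the third argument of np.where can be another array instead of a scalar. The names mask, foreground and background, and the background values, are made up for illustration.

```python
import numpy as np

mask = np.array([[1, 0],
                 [0, 1]], dtype=bool)
foreground = np.array([[2.43, 5.25],
                       [1.54, 2.59]])
# Hypothetical replacement values for the masked-out positions.
background = np.array([[-1.0, -2.0],
                       [-3.0, -4.0]])

# Take foreground where mask is True, background elsewhere.
composite = np.where(mask, foreground, background)
print(composite)
```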

Related

Order one array based on the order of the other array in Python

I have two numpy arrays that contain exactly the same values.
arr1 = np.array([1,2,3,4,5])
arr2 = np.array([2,3,1,5,4])
How can I order the values in the first array the way they are ordered in the second array?
Try using np.vectorize:
map_dict = {x[0]:x[1] for x in zip(arr1, arr2)}
mapped_array = np.vectorize(map_dict.get)(arr1)

How to get elements and indices into original array with mask

I am trying to get both the elements and indices from two arrays where the elements match. I think I am overthinking this but I have tried the where function and intersection and cannot get it to work. My actual array is much longer but here two simple arrays to demonstrate what I want:
import numpy as np
arr1 = np.array([0.00, 0.016, 0.033, 0.050, 0.067])
arr2 = np.array([0.016, 0.033, 0.050, 0.067, 0.083])
ind = np.intersect1d(np.where(arr1 >= 0.01), np.where(arr2 >= 0.01))
Printing ind shows array([1, 2, 3, 4]). Technically, I want the elements 1, 2, 3, 4 from arr1 and elements 0, 1, 2, 3 from arr2, which gives the elements 0.016, 0.033, 0.050, 0.067, which match in both arrays.
np.where converts a boolean mask like arr1 >= 0.01 into an index. You can select with the mask directly, but it won't be invertible. You need to invert the indices because you want to intersect from the original array, not the selection. Make sure to set return_indices=True to get indices from intersect1d:
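# np.nonzero returns a tuple of arrays; [0] takes the index array for the only axis
index1 = np.nonzero(arr1 >= 0.01)[0]
index2 = np.nonzero(arr2 >= 0.01)[0]
selection1 = arr1[index1]
selection2 = arr2[index2]
elements, ind1, ind2 = np.intersect1d(selection1, selection2, return_indices=True)
# map positions within the selections back to positions in the original arrays
index1 = index1[ind1]
index2 = index2[ind2]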
While you get elements directly from the intersection, the indices ind1 and ind2 are referencing the masked selections. Since index1 is the original index of each element in selection1, index1[ind1] converts ind1 back into the arr1 reference frame.
Your original expression was actually meaningless. You were intersecting the indices in each array that met your condition. That has nothing to do with the values at those indices (which wouldn't have to match at all). The seemingly correct result is purely a coincidence based on a fortuitous array construction.
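To make the coincidence point concrete, here is a small sketch with made-up arrays a and b that share no values at all. Intersecting the index sets still returns positions, because it only records where both conditions hold, not whether the values agree.

```python
import numpy as np

# Hypothetical arrays with no values in common.
a = np.array([0.5, 0.2, 0.0])
b = np.array([0.3, 0.9, 0.7])

# Positions where each array individually meets the condition.
pos_a = np.nonzero(a >= 0.01)[0]
pos_b = np.nonzero(b >= 0.01)[0]

# Intersecting positions says nothing about the values at those positions.
shared_positions = np.intersect1d(pos_a, pos_b)

# Intersecting the values themselves shows that nothing matches.
shared_values = np.intersect1d(a, b)

print(shared_positions, shared_values)
```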

How to get indices of values of numpy array which are zero

Let's say I have a numpy array as below:
array([ 1. , 2. , 0. , 3.4, 0. , 1.1])
Now I want to get the indices of all the elements which are zero. The reason is that I want to take those indices and, in a different array, set the elements at the same indices to zero.
To get the indices of the nonzero elements, I know we can use nonzero or argwhere.
np.nonzero(a)
(array([0, 1, 3, 5]),)
Then, let's say I have an array b. I can use the indices above to set all the elements of b at those indices to zero, like below:
b[np.nonzero(a)]=0
This is fine, but it works on the nonzero indices. How do I get the indices of the zeros?
If you just want to use the result for indexing purposes, it's more efficient to make a boolean array than an array of indices:
a == 0
You can index with that the same way you would have used the array of indices:
b[a == 0] = 0
If you really want an array of indices, you can call np.nonzero on the boolean array:
np.nonzero(a == 0)
np.where also works.
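Putting the pieces of this answer together, a short sketch using the array from the question; the contents of b are made up for illustration.

```python
import numpy as np

a = np.array([1.0, 2.0, 0.0, 3.4, 0.0, 1.1])
# Hypothetical second array of the same length: 10.0, 11.0, ..., 15.0
b = np.arange(10.0, 16.0)

zero_mask = a == 0                     # boolean array, True where a is zero
zero_idx = np.nonzero(zero_mask)[0]    # integer indices of the zeros
same_idx = np.where(zero_mask)[0]      # np.where with one argument gives the same

# Index with the mask directly; no index array is needed.
b[zero_mask] = 0

print(zero_idx, b)
```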

Shapes of the np.arrays, unexpected additional dimension

I'm dealing with arrays in python, and this generated a lot of doubts...
1) I build a list of lists by reading 4 columns from N files, storing 4 elements N times. I then convert this list into a numpy array:
s = np.array(s)
and I ask for the shape of this array. The answer is correct:
print(s.shape)
# (N, 4)
I then produce the mean of this Nx4 array:
s_m = sum(s)/len(s)
print(s_m.shape)
# (4,)
which I guess means that this array is a 1D array. Is this correct?
2) If I subtract the mean vector s_m from the rows of the array s, I can proceed in two ways:
residuals_s = s - s_m
or:
residuals_s = []
for i in range(len(s)):
    residuals_s.append([])
    tmp = s[i] - s_m
    residuals_s.append(tmp)
If I now ask for the shape of residuals_s in the two cases, I get two different answers. In the first case I get:
(N,4)
in the second:
(N,1,4)
Can someone explain why there is an additional dimension?
You can get the mean using the numpy method (producing the same (4,) shape):
s_m = s.mean(axis=0)
s - s_m works because s_m is broadcast to the dimensions of s.
If I run your second residuals_s I get a list containing empty lists and arrays:
[[],
array([ 1.02649662, 0.43613824, 0.66276758, 2.0082684 ]),
[],
array([ 1.13000227, -0.94129685, 0.63411801, -0.383982 ]),
...
]
That does not convert to a (N,1,4) array, but rather a (M,) array with dtype=object. Did you copy and paste correctly?
A corrected iteration is:
for i in range(len(s)):
    residuals_s.append(s[i]-s_m)
This produces a simpler list of arrays:
[array([ 1.02649662, 0.43613824, 0.66276758, 2.0082684 ]),
array([ 1.13000227, -0.94129685, 0.63411801, -0.383982 ]),
...]
which converts to a (N,4) array.
Iteration like this is usually not needed. But if it is, appending to a list like this is one way to go. Another is to preallocate an array and assign rows:
residuals_s = np.zeros_like(s)
for i in range(s.shape[0]):
    residuals_s[i,:] = s[i]-s_m
I get your (N,1,4) with:
In [39]: residuals_s=[]
In [40]: for i in range(len(s)):
   ....:     residuals_s.append([])
   ....:     tmp = s[i] - s_m
   ....:     residuals_s[-1].append(tmp)
In [41]: residuals_s
Out[41]:
[[array([ 1.02649662, 0.43613824, 0.66276758, 2.0082684 ])],
[array([ 1.13000227, -0.94129685, 0.63411801, -0.383982 ])],
...]
In [43]: np.array(residuals_s).shape
Out[43]: (10, 1, 4)
Here the s[i]-s_m array is appended to an empty list, which has been appended to the main list. So it's an array within a list within a list. It's this intermediate list that produces the middle 1 dimension.
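The effect of that intermediate list can be reproduced without the original data. A minimal sketch with two placeholder rows (the values are arbitrary): wrapping each array in its own list adds a length-1 axis, and squeeze removes it again.

```python
import numpy as np

# Placeholder rows standing in for the s[i] - s_m results.
rows = [np.zeros(4), np.ones(4)]

flat_list = [row for row in rows]       # list of arrays
nested_list = [[row] for row in rows]   # each array wrapped in its own list

flat_shape = np.array(flat_list).shape       # (2, 4)
nested_shape = np.array(nested_list).shape   # (2, 1, 4)

# Dropping the length-1 middle axis recovers the 2D shape.
squeezed_shape = np.array(nested_list).squeeze(axis=1).shape

print(flat_shape, nested_shape, squeezed_shape)
```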
You are using a NumPy ndarray without using NumPy's own functions. sum() is a Python builtin; you should use numpy.sum() (with axis=0) instead.
I suggest you change your code to:
import numpy as np
np.random.seed(0)
s = np.random.randn(10, 4)
s_m = np.mean(s, axis=0, keepdims=True)
residuals_s = s - s_m
print(s.shape, s_m.shape, residuals_s.shape)
Using the mean() function with the axis and keepdims arguments will give you the correct result.
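For comparison, a small sketch of what keepdims changes, using random data as a stand-in for the real measurements. The mean has shape (4,) without keepdims and (1, 4) with it; both broadcast against the (10, 4) array and give the same residuals.

```python
import numpy as np

# Random stand-in data; the real s comes from the N files.
rng = np.random.default_rng(0)
s = rng.standard_normal((10, 4))

mean_flat = s.mean(axis=0)                  # shape (4,)
mean_kept = s.mean(axis=0, keepdims=True)   # shape (1, 4)

# Both forms broadcast across the rows of s.
res_flat = s - mean_flat
res_kept = s - mean_kept

print(mean_flat.shape, mean_kept.shape, res_flat.shape, res_kept.shape)
```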

Numpy multidimensional array slicing

Suppose I have defined a 3x3x3 numpy array with
x = numpy.arange(27).reshape((3, 3, 3))
Now, I can get an array containing the (0,1) element of each 3x3 subarray with x[:, 0, 1], which returns array([ 1, 10, 19]). What if the indices are stored in a tuple (m,n) and I want to retrieve the (m,n) element of each subarray?
For example, suppose that I have t = (0, 1). I tried x[:, t], but it doesn't have the right behaviour - it returns rows 0 and 1 of each subarray. The simplest solution I have found is
x.transpose()[tuple(reversed(t))].transpose()
but I am sure there must be a better way. Of course, in this case, I could do x[:, t[0], t[1]], but that can't be generalised to the case where I don't know how many dimensions x and t have.
You can create the index tuple first:
index = (numpy.s_[:],)+t
x[index]
HYRY's solution is correct, but I have always found numpy's r_, c_ and s_ index tricks to be a bit strange looking. So here is the equivalent using a slice object:
x[(slice(None),) + t]
The single argument to slice is the stop position; None means "all", in the same way that x[:] is equivalent to x[None:None].
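The slice-object form generalises to index tuples of any length. A minimal sketch, where pick_from_each is a hypothetical helper name and not part of NumPy:

```python
import numpy as np

def pick_from_each(x, t):
    """Hypothetical helper: keep axis 0 whole and index the remaining axes with t."""
    return x[(slice(None),) + tuple(t)]

x = np.arange(27).reshape((3, 3, 3))

picked = pick_from_each(x, (0, 1))   # same as x[:, 0, 1]
partial = pick_from_each(x, (2,))    # same as x[:, 2]

print(picked)
print(partial.shape)
```

Because t is concatenated onto a one-element tuple, the same call works whether t has one entry or one entry per remaining axis.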
