Related
I would like to check if a numpy tuple is present in a numpy array of tuples.
When I run the following code:
import numpy as np
myarray=np.array([[0,1],[0,2],[0,3],[4,4]])
test1=np.array([0,3])
test2=np.array([4,0])
myarraylst=myarray.tolist()
test1lst=test1.tolist()
test2lst=test2.tolist()
print(test1lst in myarraylst)
print(test2lst in myarraylst)
I get "True" for the first test and "False" for the second test as it should be.
Is there a way to do this without converting the numpy arrays to python lists ?
Many Thanks !
For lists, the in tests for the identity/equality of the sublists. For arrays in, or np.isin the evaluation goes all the down, to the numeric elements.
In [181]: myarray = np.array([[0, 1], [0, 2], [0, 3], [4, 4]])
...: test1 = np.array([0, 3])
...: test2 = np.array([4, 0])
But we can do an elementwise test:
In [183]: test1 == myarray
Out[183]:
array([[ True, False],
[ True, False],
[ True, True],
[False, False]])
Here one array is (4,2) and the other (2,) shape, which broadcast together just fine. For other cases we may need to tweak dimensions.
You just want the rows where both elements match:
In [184]: (test1 == myarray).all(axis=1)
Out[184]: array([False, False, True, False])
In [185]: (test2 == myarray).all(axis=1)
Out[185]: array([False, False, False, False])
and reduce those arrays to one value with any:
In [187]: (test1 == myarray).all(axis=1).any()
Out[187]: True
In [188]: (test2 == myarray).all(axis=1).any()
Out[188]: False
I want to convert a 2d matrix of dates to boolean matrix based on dates in a 1d matrix. i.e.,
[[20030102, 20030102, 20070102],
[20040102, 20040102, 20040102].,
[20050102, 20050102, 20050102]]
should become
[[True, True, False],
[False, False, False].,
[True, True, True]]
if I provide a 1d array [20010203, 20030102, 20030501, 20050102, 20060101]
import numpy as np
dateValues = np.array(
[[20030102, 20030102, 20030102],
[20040102, 20040102, 20040102],
[20050102, 20050102, 20050102]])
requestedDates = [20010203, 20030102, 20030501, 20050102, 20060101]
ix = np.in1d(dateValues.ravel(), requestedDates).reshape(dateValues.shape)
print(ix)
Returns:
[[ True True True]
[False False False]
[ True True True]]
Refer to numpy.in1d for more information (documentation):
http://docs.scipy.org/doc/numpy/reference/generated/numpy.in1d.html
a = np.array([[20030102, 20030102, 20070102],
[20040102, 20040102, 20040102],
[20050102, 20050102, 20050102]])
b = np.array([20010203, 20030102, 20030501, 20050102, 20060101])
>>> a.shape
(3, 3)
>>> b.shape
(5,)
>>>
For the comparison, you need to broadcast b onto a by adding an axis to a. - this compares each element of a with each element of b
>>> mask = a[...,None] == b
>>> mask.shape
(3, 3, 5)
>>>
Then use np.any() to see if there are any matches
>>> np.any(mask, axis = 2, keepdims = False)
array([[ True, True, False],
[False, False, False],
[ True, True, True]], dtype=bool)
timeit.Timer comparison with in1d:
>>>
>>> t = Timer("np.any(a[...,None] == b, axis = 2)","from __main__ import np, a, b")
>>> t.timeit(10000)
0.13268041338812964
>>> t = Timer("np.in1d(a.ravel(), b).reshape(a.shape)","from __main__ import np, a, b")
>>> t.timeit(10000)
0.26060646913566643
>>>
I'm working with numpy masked arrays and there is a trivial operation I cannot figure out how to do it in a simple way. If I have two masked arrays, how can I get them aggregated into another array that contains only the unmasked values?
In [1]: import numpy as np
In [2]: np.ma.array([1, 2, 3], mask = [0,1,1])
Out[2]:
masked_array(data = [1 -- --],
mask = [False True True],
fill_value = 999999)
In [3]: np.ma.array([4, 5, 6], mask = [1,1,0])
Out[3]:
masked_array(data = [-- -- 6],
mask = [ True True False],
fill_value = 999999)
Which operation should I apply to the previous arrays if I want to get:
masked_array(data = [1 -- 6],
mask = [False True False],
fill_value = 999999)
Stack the masks and the arrays using numpy.dstack and create a new masked array and then you can get the required output using numpy.prod:
>>> a1 = np.ma.array([1, 2, 3], mask = [0,1,1])
>>> a2 = np.ma.array([7, 8, 9], mask = [1,1,0])
>>> arr = np.ma.array(np.dstack((a1, a2)), mask=np.dstack((a1.mask, a2.mask)))
>>> np.prod(arr[0], axis=1)
masked_array(data = [1 -- 9],
mask = [False True False],
fill_value = 999999)
I'm trying to rewrite a function using numpy which is originally in MATLAB. There's a logical indexing part which is as follows in MATLAB:
X = reshape(1:16, 4, 4).';
idx = [true, false, false, true];
X(idx, idx)
ans =
1 4
13 16
When I try to make it in numpy, I can't get the correct indexing:
X = np.arange(1, 17).reshape(4, 4)
idx = [True, False, False, True]
X[idx, idx]
# Output: array([6, 1, 1, 6])
What's the proper way of getting a grid from the matrix via logical indexing?
You could also write:
>>> X[np.ix_(idx,idx)]
array([[ 1, 4],
[13, 16]])
In [1]: X = np.arange(1, 17).reshape(4, 4)
In [2]: idx = np.array([True, False, False, True]) # note that here idx has to
# be an array (not a list)
# or boolean values will be
# interpreted as integers
In [3]: X[idx][:,idx]
Out[3]:
array([[ 1, 4],
[13, 16]])
In numpy this is called fancy indexing. To get the items you want you should use a 2D array of indices.
You can use an outer to make from your 1D idx a proper 2D array of indices. The outers, when applied to two 1D sequences, compare each element of one sequence to each element of the other. Recalling that True*True=True and False*True=False, the np.multiply.outer(), which is the same as np.outer(), can give you the 2D indices:
idx_2D = np.outer(idx,idx)
#array([[ True, False, False, True],
# [False, False, False, False],
# [False, False, False, False],
# [ True, False, False, True]], dtype=bool)
Which you can use:
x[ idx_2D ]
array([ 1, 4, 13, 16])
In your real code you can use x=[np.outer(idx,idx)] but it does not save memory, working the same as if you included a del idx_2D after doing the slice.
I want to check if a NumPyArray has values in it that are in a set, and if so set that area in an array = 1. If not set a keepRaster = 2.
numpyArray = #some imported array
repeatSet= ([3, 5, 6, 8])
confusedRaster = numpyArray[numpy.where(numpyArray in repeatSet)]= 1
Yields:
<type 'exceptions.TypeError'>: unhashable type: 'numpy.ndarray'
Is there a way to loop through it?
for numpyArray
if numpyArray in repeatSet
confusedRaster = 1
else
keepRaster = 2
To clarify and ask for a bit further help:
What I am trying to get at, and am currently doing, is putting a raster input into an array. I need to read values in the 2-d array and create another array based on those values. If the array value is in a set then the value will be 1. If it is not in a set then the value will be derived from another input, but I'll say 77 for now. This is what I'm currently using. My test input has about 1500 rows and 3500 columns. It always freezes at around row 350.
for rowd in range(0, width):
for cold in range (0, height):
if numpyarray.item(rowd,cold) in repeatSet:
confusedArray[rowd][cold] = 1
else:
if numpyarray.item(rowd,cold) == 0:
confusedArray[rowd][cold] = 0
else:
confusedArray[rowd][cold] = 2
In versions 1.4 and higher, numpy provides the in1d function.
>>> test = np.array([0, 1, 2, 5, 0])
>>> states = [0, 2]
>>> np.in1d(test, states)
array([ True, False, True, False, True], dtype=bool)
You can use that as a mask for assignment.
>>> test[np.in1d(test, states)] = 1
>>> test
array([1, 1, 1, 5, 1])
Here are some more sophisticated uses of numpy's indexing and assignment syntax that I think will apply to your problem. Note the use of bitwise operators to replace if-based logic:
>>> numpy_array = numpy.arange(9).reshape((3, 3))
>>> confused_array = numpy.arange(9).reshape((3, 3)) % 2
>>> mask = numpy.in1d(numpy_array, repeat_set).reshape(numpy_array.shape)
>>> mask
array([[False, False, False],
[ True, False, True],
[ True, False, True]], dtype=bool)
>>> ~mask
array([[ True, True, True],
[False, True, False],
[False, True, False]], dtype=bool)
>>> numpy_array == 0
array([[ True, False, False],
[False, False, False],
[False, False, False]], dtype=bool)
>>> numpy_array != 0
array([[False, True, True],
[ True, True, True],
[ True, True, True]], dtype=bool)
>>> confused_array[mask] = 1
>>> confused_array[~mask & (numpy_array == 0)] = 0
>>> confused_array[~mask & (numpy_array != 0)] = 2
>>> confused_array
array([[0, 2, 2],
[1, 2, 1],
[1, 2, 1]])
Another approach would be to use numpy.where, which creates a brand new array, using values from the second argument where mask is true, and values from the third argument where mask is false. (As with assignment, the argument can be a scalar or an array of the same shape as mask.) This might be a bit more efficient than the above, and it's certainly more terse:
>>> numpy.where(mask, 1, numpy.where(numpy_array == 0, 0, 2))
array([[0, 2, 2],
[1, 2, 1],
[1, 2, 1]])
Here is one possible way of doing what you whant:
numpyArray = np.array([1, 8, 35, 343, 23, 3, 8]) # could be n-Dimensional array
repeatSet = np.array([3, 5, 6, 8])
mask = (numpyArray[...,None] == repeatSet[None,...]).any(axis=-1)
print mask
>>> [False True False False False True True]
In recent numpy you could use a combination of np.isin and np.where to achieve this result. The first method outputs a boolean numpy array that evaluates to True where its vlaues are equal to an array-like specified test element (see doc), while with the second you could create a new array that set some a value where the specified confition evaluates to True and another value where False.
Example
I'll make an example with a random array but using the specific values you provided.
import numpy as np
repeatSet = ([2, 5, 6, 8])
arr = np.array([[1,5,1],
[0,1,0],
[0,0,0],
[2,2,2]])
out = np.where(np.isin(arr, repeatSet), 1, 77)
> out
array([[77, 1, 77],
[77, 77, 77],
[77, 77, 77],
[ 1, 1, 1]])