Adding numpy array elements to new array only if conditions met - python

I need to copy elements from one numpy array to another, but only if a condition is met. Let's say I have two arrays:
x = ([1,2,3,4,5,6,7,8,9])
y = ([])
I want to add numbers from x to y, but only if they meet a condition, let's say being divisible by two. I know I can do the following:
y = x%2 == 0
which makes y an array of True and False values. This is not what I am trying to accomplish, however. I want the actual values (2,4,6,8), and only those for which the condition evaluates to True.

You can get the values you want like this:
import numpy as np
x = np.array([1,2,3,4,5,6,7,8,9])
# array([1, 2, 3, 4, 5, 6, 7, 8, 9])
y = x[x%2==0]
# y is now: array([2, 4, 6, 8])
And, you can sum them like this:
np.sum(x[x%2==0])
# 20
Explanation: As you noticed, x%2==0 gives you a boolean array array([False, True, False, True, False, True, False, True, False], dtype=bool). You can use this as a "mask" on your original array, by indexing it with x[x%2==0], returning the values of x where your "mask" is True. Take a look at the numpy indexing documentation for more info.
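As a small sketch of the same masking idea, the mask can be stored in a variable, inverted, or combined with other conditions. The names is_even, evens, odds and even_and_large are made up for this illustration.

```python
import numpy as np

x = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9])

# Build the boolean mask once so it can be reused.
is_even = x % 2 == 0

evens = x[is_even]   # values where the mask is True
odds = x[~is_even]   # ~ inverts a boolean mask

# Combine conditions with & (and) or | (or). The parentheses are needed
# because & and | bind more tightly than the comparison operators.
even_and_large = x[(x % 2 == 0) & (x > 4)]

print(evens)           # [2 4 6 8]
print(odds)            # [1 3 5 7 9]
print(even_and_large)  # [6 8]
```

Boolean indexing returns a new array, so modifying evens afterwards leaves x untouched.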

Related

I want to create an array without if() or for(), using where() instead

I'm working through some exercises, and one of them says:
"In Python, solve these problems with where() in NumPy without using for() or if()."
There are two arrays. The first one is
[1,2,3,5,3,4,3,6,9,7,0,8,7,10]
The second one is
[7,2,10,5,7,4,9,1,8,0,3,7,6]
The expected results are, first, "the values that are the same in both arrays: [2 5 4 9 0 7]" and, second, "the indices at which both arrays have the same value: (array([1, 3, 5, 8, 10, 12]))".
So I need to find the matching values together with their indices.
I tried to solve it, but I can't figure out how to get both. I found the indices but couldn't get the values as well.
First, make sure the arrays are of equal length. Then you can use == to do an elementwise comparison:
a == b
# array([False, True, False, True, False, True, False, False, False,
# False, False, False, False])
Get the equal values using
b[a == b]
# array([2, 5, 4])
and the indices using
np.where(a == b)[0]
# array([1, 3, 5])
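Putting the steps together, here is a minimal runnable sketch. It assumes that "equal length" is achieved by trimming both arrays to the shorter one, which is only one possible reading of the exercise; the names first, second and n are made up for the example.

```python
import numpy as np

first = np.array([1, 2, 3, 5, 3, 4, 3, 6, 9, 7, 0, 8, 7, 10])
second = np.array([7, 2, 10, 5, 7, 4, 9, 1, 8, 0, 3, 7, 6])

# Assumption: only the overlapping prefix of the two arrays is compared.
n = min(len(first), len(second))
a, b = first[:n], second[:n]

same = a == b                # elementwise comparison, boolean array
indices = np.where(same)[0]  # positions where the mask is True
values = a[same]             # the values at those positions

print(indices)  # [1 3 5]
print(values)   # [2 5 4]
```

np.where called with only a condition returns a tuple with one index array per dimension, which is why the [0] is needed for a 1-D input.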

numpy multiple boolean index arrays

I have an array which I want to use boolean indexing on, with multiple index arrays, each producing a different array. Example:
w = np.array([1,2,3])
b = np.array([[False, True, True], [True, False, False]])
Should return something along the lines of:
[[2,3], [1]]
I assume that since the number of cells containing True can vary between masks, I cannot expect the result to reside in a 2d numpy array, but I'm still hoping for something more elegant than iterating over the masks and appending the result of indexing w by the i-th mask in b to a list.
Am I missing a better option?
Edit: The next step I want to do afterwards is to sum each of the arrays returned by w[b], returning a list of scalars. If that somehow makes the problem easier, I'd love to know as well.
Assuming you want a list of numpy arrays you can simply use a comprehension:
w = np.array([1,2,3])
b = np.array([[False, True, True], [True, False, False]])
[w[mask] for mask in b]  # "mask" avoids shadowing the built-in bool
# [array([2, 3]), array([1])]
If your goal is just a sum of the masked values, you can use:
np.sum(w*b) # 6
or
np.sum(w*b, axis=1) # array([5, 1])
# or b @ w
…since False times your number will be 0 and therefore won't affect the sum.
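A hedged variant of the same per-mask sum uses np.where to zero out the unselected entries instead of multiplying. This is a sketch, not part of the original answer; the names masked, row_sums and row_sums_matmul are made up for the example.

```python
import numpy as np

w = np.array([1, 2, 3])
b = np.array([[False, True, True],
              [True, False, False]])

# np.where keeps w where the mask is True and substitutes 0 elsewhere.
# w is broadcast across the rows of b.
masked = np.where(b, w, 0)
row_sums = masked.sum(axis=1)

# The matrix product of the mask and the values gives the same row sums.
row_sums_matmul = b @ w

print(masked)           # [[0 2 3]
                        #  [1 0 0]]
print(row_sums)         # [5 1]
print(row_sums_matmul)  # [5 1]
```

One reason to prefer np.where over multiplication is floating-point data containing NaN: NaN times False is still NaN, whereas np.where replaces the unselected entry outright.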
Try this:
[w[x] for x in b]
Hope this helps.

Looping through a Truth array in python and replacing true values with components from another array

Let's say I have a NumPy truth array that looks something like the following:
truths = [True, False, False, False, True, True]
and I have another array of values that looks something like:
nums = [1, 2, 3]
I want to create a loop that will replace all the truth values in the truths array with the next number from the nums array and replace all the False values with 0.
I want to end up with something that looks like:
array = [1, 0, 0, 0, 2, 3]
I would recommend numpy.putmask(). Since we're converting from type bool to int64, we need to do some conversions first.
First, initialization:
import numpy as np
truths = np.array([ True, False, False, False, True, True])
nums = np.array([1, 2, 3])
Then we convert and replace based on our mask (wherever an element of truths is True):
truths = truths.astype('int64') # False becomes 0, True becomes 1
np.putmask(truths, truths, nums)
The end result:
>>> truths
array([1, 0, 0, 0, 2, 3])
Note that we just pass truths itself as the "mask" argument of numpy.putmask(). This simply checks whether each element of truths is truthy; since we converted the array to type int64, only the elements that are NOT 0 get replaced, as required. Be aware that putmask picks the replacement for position n from nums by position (repeating nums to the length of the array), not as "the next unused number"; for this particular input the two happen to coincide.
If we wanted to be more pedantic, or needed to replace some arbitrary value, we would need numpy.putmask(truths, truths==<value we want to replace>, nums) instead.
If we want to be even more pedantic and not assume that the types convert easily (as they do from bool to int64), then, as far as I'm aware, we'd need some sort of mapping to a different numpy.array where that conversion is possible. Personally, I'd convert my numpy.array into a boolean array first, where the conversion is easy, but there may be a better way.
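To make the difference between positional replacement and "next number" replacement concrete, here is a small sketch that compares np.putmask with plain boolean-mask assignment. The input is deliberately different from the question's, and the names via_putmask and via_mask are made up for the example.

```python
import numpy as np

truths = np.array([False, True, True])
nums = np.array([1, 2])

# putmask takes the value for position n from nums repeated to the
# target's length, here [1, 2, 1]: position 1 gets 2, position 2 gets 1.
via_putmask = np.zeros(truths.shape, dtype=nums.dtype)
np.putmask(via_putmask, truths, nums)

# Boolean-mask assignment hands out nums in order to the True slots.
# It requires len(nums) to equal the number of True values.
via_mask = np.zeros(truths.shape, dtype=nums.dtype)
via_mask[truths] = nums

print(via_putmask)  # [0 2 1]
print(via_mask)     # [0 1 2]

# With the question's data both approaches agree.
q_truths = np.array([True, False, False, False, True, True])
q_out = np.zeros(q_truths.shape, dtype=int)
q_out[q_truths] = [1, 2, 3]
print(q_out)        # [1 0 0 0 2 3]
```

If "replace each True with the next number from nums" is the goal, the mask assignment matches that wording more directly, under the assumption that nums has exactly one entry per True.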
You can use cycle from itertools to cycle through your nums list. Then just zip it with your booleans and use a ternary list comprehension.
from itertools import cycle
>>> [num if boolean else 0 for boolean, num in zip(truths, cycle(nums))]
[1, 0, 0, 0, 2, 3]
You could use itertools here as you said you want a loop.
from itertools import cycle, chain, repeat
import numpy as np
truths = np.array([True, False, False, False, True, True])
nums = np.array([1, 2, 3])
# You have two options here.
# Either cycle over nums repeatedly...
iter_nums = cycle(nums)
# ...or, once nums is exhausted,
# put a default value in its place (this assignment overrides the one above)
iter_nums = chain(nums, repeat(0))
masked = np.array([next(iter_nums) if v else 0 for v in truths])
print(masked)
# [1 0 0 0 2 3]

Replace values in specific columns of a numpy array

I have an N x M numpy array (matrix). Here is an example with a 3 x 6 array:
x = numpy.array([[0,1,2,3,4,5],[0,-1,2,3,-4,-5],[0,-1,-2,-3,4,5]])
I'd like to scan all the columns of x and replace the values of each column if they are equal to a specific value.
This code, for example, aims to replace each negative value with 100 wherever it equals the negated column number:
for i in range(1,6):
    x[:,i == -(i)] = 100
This code obtains this warning:
DeprecationWarning: using a boolean instead of an integer will result in an error in the future
I'm using numpy 1.8.2. How can I avoid this warning without downgrading numpy?
I don't follow what your code is trying to do. The expression
i == -(i)
is a plain boolean, so the indexing evaluates to something like this:
x[:, True]
x[:, False]
I don't think this is what you want. You should try something like this:
for i in range(1, 6):
    mask = x[:, i] == -i
    x[:, i][mask] = 100
Create a mask over the whole column, and use that to change the values.
Even without the warning, the code you have there will not do what you want. i is the loop index and equals minus itself only if i == 0, which never happens in range(1, 6). Your test therefore always evaluates to False, which is cast to 0. In other words, your code replaces the first element of each row with 100.
To get this to work I would do
for i in range(1, 6):
    col = x[:,i]
    col[col == -i] = 100
Notice that the mask is built from the column array itself (col), and that the conventional indexing has to be done separately from the masking.
If you are worried about the warning spewing out text, then ignore it as a Warning/Exception:
import numpy
import warnings
warnings.simplefilter('default') # this enables DeprecationWarnings to be shown
x = numpy.array([[0,1,2,3,4,5],[0,-1,2,3,-4,-5],[0,-1,-2,-3,4,5]])
with warnings.catch_warnings():
    warnings.simplefilter("ignore") # and this ignores them
    for i in range(1,6):
        x[:,i == -(i)] = 100
print(x) # just to show that you are actually changing the content
As you can see in the comments, some people are not getting the DeprecationWarning. That is probably because Python has suppressed developer-only warnings by default since 2.7.
As others have said, your loop isn't doing what you think it is doing. I would propose you change your code to use numpy's fancy indexing.
# First, create the "test values" (column index):
>>> test_values = numpy.arange(6)
# test_values is array([0, 1, 2, 3, 4, 5])
#
# Now, we want to check which columns have value == -test_values:
#
>>> mask = (x == -test_values) & (x < 0)
# mask is True wherever a value in the i-th column of x is negative i
>>> mask
array([[False, False, False, False, False, False],
[False, True, False, False, True, True],
[False, True, True, True, False, False]], dtype=bool)
#
# Now, set those values to 100
>>> x[mask] = 100
>>> x
array([[ 0, 1, 2, 3, 4, 5],
[ 0, 100, 2, 3, 100, 100],
[ 0, 100, 100, 100, 4, 5]])
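If the original array should stay unchanged, the same mask can be fed to np.where, which builds a new array instead of assigning in place. This is a sketch under that assumption; the names cols and result are made up for the example.

```python
import numpy as np

x = np.array([[0, 1, 2, 3, 4, 5],
              [0, -1, 2, 3, -4, -5],
              [0, -1, -2, -3, 4, 5]])

# Column indices; broadcasting compares each row of x against them.
cols = np.arange(x.shape[1])

# Skip column 0 explicitly, because -0 == 0 would otherwise match.
mask = (x == -cols) & (cols > 0)

# np.where returns a new array: 100 where the mask is True, x elsewhere.
result = np.where(mask, 100, x)

print(result)
# [[  0   1   2   3   4   5]
#  [  0 100   2   3 100 100]
#  [  0 100 100 100   4   5]]
```

x itself still holds the negative values after this, which can be useful when the raw data is needed later.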

ValueError: too many boolean indices for a n=600 array (float)

I am getting an issue where I am trying to run (on Python):
#Loading in the text file in need of analysis
x,y=loadtxt('2.8k to 293k 15102014_rerun 47_0K.txt',skiprows=1,unpack=True,dtype=float,delimiter=",")
C=-1.0 #Need to flip my voltage axis
yone=C*y #Actually flipping the array
plot(x,yone)#Test
origin=600.0#Where is the origin? i.e V=0, taking the 0 to 1V elements of array
xorg=x[origin:1201]# Array from the origin to the final point (n)
xfit=xorg[(x>0.85)==True] # Taking the array from the origin and shortening it further to get relevant area
It raises the ValueError. I have tried this process with a much smaller array of 10 elements, and the xfit=xorg[(x>0.85)==True] command works fine. What the program is trying to do is narrow the data down to the relevant region so that I can fit a line of best fit to a linear part of the data.
I apologise for the messy formatting; this is the first question I have asked on this website, as I could not find anything in a search that helped me understand where I am going wrong.
This answer is for people that don't know about numpy arrays (like me), thanks MrE for the pointers to numpy docs.
Numpy arrays have this nice feature of boolean masks.
For numpy arrays, most operators return an array with the operation applied to every element, instead of a single result as with plain Python lists (the list comparison below shows Python 2 behaviour; Python 3 raises a TypeError for it):
>>> alist = range(10)
>>> alist
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
>>> alist > 5
True
>>> anarray = np.array(alist)
>>> anarray
array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
>>> anarray > 5
array([False, False, False, False, False, False, True, True, True, True], dtype=bool)
You can use an array of bool as the index for a numpy array. In this case you get a filtered array containing the positions where the corresponding bool array element is True.
>>> mask = anarray > 5
>>> anarray[mask]
array([6, 7, 8, 9])
The mask must not be bigger than the array:
>>> anotherarray = anarray[mask]
>>> anotherarray
array([6, 7, 8, 9])
>>> anotherarray[mask]
ValueError: too many boolean indices
So you can't use a mask longer than the array you are masking:
>>> anotherarray[anarray > 7]
ValueError: too many boolean indices
>>> anotherarray[anotherarray > 7]
array([8, 9])
Since xorg is smaller than x, a mask based on x will be longer than xorg and you get the ValueError exception.
Change
xfit=xorg[x>0.85]
to
xfit=xorg[xorg>0.85]
x is larger than xorg, so x > 0.85 has more elements than xorg.
Try the following:
replace your code
xorg=x[origin:1201]
xfit=xorg[(x>0.85)==True]
with
mask = x > 0.85
xfit = xorg[mask[origin:1201]]
This works when x is a numpy.ndarray. Note that the basic slice x[origin:1201] returns a view of x, whereas boolean (advanced) indexing returns a copy; see the SciPy/NumPy documentation.
I'm unsure whether you like to use numpy, but when trying to fit data, numpy/scipy is a good choice anyway...
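The fixes above can be sketched end to end as follows. The data here is a hypothetical stand-in generated with np.linspace, not the contents of the text file from the question, and origin is written as an integer because current NumPy rejects float slice bounds such as 600.0.

```python
import numpy as np

# Hypothetical stand-in for the loaded data: 1201 evenly spaced points.
x = np.linspace(-1.0, 1.0, 1201)

origin = 600              # integer slice bound (not 600.0)
xorg = x[origin:1201]     # from the origin to the final point

# Build the mask from the array being indexed so the lengths match.
mask = xorg > 0.85
assert mask.shape == xorg.shape
xfit = xorg[mask]

# Equivalent: build the mask on the full array and slice it the same way.
xfit_alt = xorg[(x > 0.85)[origin:1201]]

print(len(x), len(xorg))              # 1201 601
print(np.array_equal(xfit, xfit_alt)) # True
```

Checking mask.shape against the shape of the array being indexed is a cheap way to catch this class of error before the indexing itself fails.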
