I would like to slice a numpy array so that I can exclude a single element from it.
For example, like this:
a = numpy.array([1,2,3,4,5])
b = a[0:1::3:4]
b = [1 2 4 5]
Only that this does not work, so either I am doing something wrong, or it isn't possible.
If you are going to repeatedly 'delete' one item at a time, I'd suggest using a boolean mask:
In [493]: a = np.arange(100)
In [494]: mask = np.ones(a.shape, dtype=bool)
In [495]: for i in [2,5,9,20,3,26,40,60]:
     ...:     mask[i]=0
     ...: a1 = a[mask]
In [496]: a1.shape
Out[496]: (92,)
That's effectively what np.delete does when given a list or array of indices to delete:
In [497]: a2 = np.delete(a, [2,5,9,20,3,26,40,60])
In [498]: np.allclose(a1,a2)
Out[498]: True
For a single element, np.delete joins two slices, either with concatenate or by copying them into a result array of the right size. In all cases a new array has to be made.
Whether you exclude one element or many, you are asking for a discontinuous selection of the elements of the original. That can't be produced as a view, which uses shape and strides to select a regular subset of the original.
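To make the view-versus-copy point concrete, here is a minimal sketch using np.shares_memory. The variable names are only for illustration:

```python
import numpy as np

a = np.arange(10)

# A plain slice is a regular (strided) selection, so NumPy can return a view.
view = a[2:8:2]
print(np.shares_memory(a, view))        # True

# Dropping one element leaves an irregular selection, so a new array is built.
mask = np.ones(a.shape, dtype=bool)
mask[3] = False
masked = a[mask]
deleted = np.delete(a, 3)
print(np.shares_memory(a, masked))      # False
print(np.shares_memory(a, deleted))     # False
print(np.array_equal(masked, deleted))  # True
```

Both the mask and np.delete give the same values, and neither result shares memory with a.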
You need to do something like the following:
a = np.array([1,2,3,4,5])
b = a[:2]
c = a[3:]
print(b)
print(c)
z = np.concatenate((b, c), axis=None)
print(z)
Output:
[1 2]
[4 5]
[1 2 4 5]
Hence everything other than 3 ends up in the new numpy.ndarray z.
Another way is to use the np.delete function, as shown in one of the other answers. You pass it a list of the indices to delete, written as comma-separated values inside [].
a = np.array([15,14,13,12,11])
a4=np.delete(a,[1,4])
print(a4)
Output:
[15 13 12]
import numpy as np
a = np.array([1,2,3,4,5])
result = np.delete(a,2)
# result is array([1, 2, 4, 5])
If a is a plain Python list, you could always join two slices with +:
b = a[:2]+a[3:]
This will return [1, 2, 4, 5].
For a NumPy return value, you could take two slices and concatenate the results:
b = a[3:]
c = a[:2]
numpy.concatenate([c,b])
Will return
array([1, 2, 4, 5])
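The difference between the two snippets above matters because + means different things for lists and arrays. A small sketch (lst, arr and joined are just illustrative names):

```python
import numpy as np

lst = [1, 2, 3, 4, 5]
arr = np.array(lst)

# For Python lists, + concatenates the two slices.
print(lst[:2] + lst[3:])     # [1, 2, 4, 5]

# For NumPy arrays, + adds element-wise: [1, 2] + [4, 5].
print(arr[:2] + arr[3:])     # [5 7]

# Concatenation is what actually drops the element from an array.
joined = np.concatenate([arr[:2], arr[3:]])
print(joined)                # [1 2 4 5]
```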
Related
Looking for a pythonic way to sum values from multiple lists:
I have got the following list of lists:
a = [0,5,2]
b = [2,1,1]
c = [1,1,1]
d = [5,3,4]
my_list = [a,b,c,d]
I am looking for the output:
[8,10,8]
I've used:
print ([sum(x) for x in zip(*my_list )])
but zip only works when I have 2 elements in my_list.
Any idea?
zip works for an arbitrary number of iterables:
>>> list(map(sum, zip(*my_list)))
[8, 10, 8]
which is, of course, roughly equivalent to your comprehension which also works:
>>> [sum(x) for x in zip(*my_list)]
[8, 10, 8]
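If it helps to see why this works for four lists, here is a short sketch that prints the intermediate tuples. The names columns and totals are mine, not from the question:

```python
a = [0, 5, 2]
b = [2, 1, 1]
c = [1, 1, 1]
d = [5, 3, 4]
my_list = [a, b, c, d]

# zip(*my_list) is the same as zip(a, b, c, d): one tuple per column position.
columns = list(zip(*my_list))
print(columns)   # [(0, 2, 1, 5), (5, 1, 1, 3), (2, 1, 1, 4)]

totals = [sum(col) for col in columns]
print(totals)    # [8, 10, 8]
```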
NumPy has a nice way of doing this, and it can also handle very large arrays. First, create my_list as a NumPy array:
import numpy as np
a = [0,5,2]
b = [2,1,1]
c = [1,1,1]
d = [5,3,4]
my_list = np.array([a,b,c,d])
To get the sum over the columns, you can do the following
np.sum(my_list, axis=0)
Alternatively, the sum over the rows can be retrieved by
np.sum(my_list, axis=1)
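For reference, a quick sketch of what the two calls return for the data in the question (col_sums and row_sums are illustrative names):

```python
import numpy as np

my_list = np.array([[0, 5, 2],
                    [2, 1, 1],
                    [1, 1, 1],
                    [5, 3, 4]])

col_sums = np.sum(my_list, axis=0)  # collapses the 4 rows: one value per column
row_sums = np.sum(my_list, axis=1)  # collapses the 3 columns: one value per row
print(col_sums)  # [ 8 10  8]
print(row_sums)  # [ 7  4  3 12]
```

axis=0 is the one that matches the desired [8,10,8].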
I'd make it a numpy array and then sum along axis 0:
my_list = numpy.array([a,b,c,d])
my_list.sum(axis=0)
Output:
[ 8 10 8]
How can I sum only the values at a list of indices in a NumPy array? For example, if I have an array a = [1,2,3,4] and a list of indices to sum, indices = [0, 2], I want a fast operation that gives me the answer 4, because the values at index 0 and index 2 in a sum to 4.
You can use sum directly after indexing with indices:
a = np.array([1,2,3,4])
indices = [0, 2]
a[indices].sum()
The accepted a[indices].sum() approach copies data and creates a new array, which might cause problems if the array is large. np.sum actually has a where argument to mask out elements, so you can just do:
np.sum(a, where=[True, False, True, False])
This avoids copying the selected elements into a new array.
The mask array can be built with:
mask = np.full(4, False)
mask[np.array([0,2])] = True
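Putting the pieces together, here is a rough sketch comparing the two approaches. As far as I know the where argument of np.sum needs NumPy 1.17 or later, so treat that as an assumption about your install. The second half shows one difference to watch for when an index is repeated:

```python
import numpy as np

a = np.array([1, 2, 3, 4])
indices = [0, 2]

# Build a boolean mask that is True only at the wanted positions.
mask = np.zeros(a.shape, dtype=bool)
mask[indices] = True

masked_total = np.sum(a, where=mask)
indexed_total = a[indices].sum()
print(masked_total, indexed_total)   # 4 4

# Caveat: a repeated index is counted twice by indexing, but only once by the mask.
repeated = [0, 2, 2]
mask2 = np.zeros(a.shape, dtype=bool)
mask2[repeated] = True
print(a[repeated].sum(), np.sum(a, where=mask2))   # 7 4
```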
Try:
>>> a = [1,2,3,4]
>>> indices = [0, 2]
>>> sum(a[i] for i in indices)
4
Faster
If you have a lot of numbers and you want high speed, then you need to use numpy:
>>> import numpy as np
>>> a = np.array([1,2,3,4])
>>> a[indices]
array([1, 3])
>>> np.sum(a[indices])
4
But I don't have the index values, I just have ones in those same indices in a different array. For example, I have
a = array([3,4,5,6])
b = array([0,1,0,1])
Is there some NumPy method that can quickly look at both of these and extract all values from a whose indices match the indices of all 1's in b? I want it to result in:
array([4,6])
It is probably worth mentioning that my a array is multidimensional, while my b array will always have values of either 0 or 1. I tried using NumPy's logical_and function, though this returns ValueError with a and b having different dimensions:
a = numpy.array([[3,2], [4,5], [6,1]])
b = numpy.array([0, 1, 0])
print(numpy.logical_and(a,b))
ValueError: operands could not be broadcast together with shapes (3,2) (3,)
Though this method does seem to work if a is flat. Either way, the return type of numpy.logical_and() is a boolean, which I do not want. Is there another way? Again, in the second example above, the desired return would be
array([[4,5]])
Obviously I could write a simple loop to accomplish this, I'm just looking for something a bit more concise.
Edit:
This introduces more constraints: I should also mention that each element of the multidimensional array a may have an arbitrary length that does not match its neighbour's.
You can simply use fancy indexing.
b == 1
will give you a boolean array:
>>> from numpy import array
>>> a = array([3,4,5,6])
>>> b = array([0,1,0,1])
>>> b==1
array([False, True, False, True], dtype=bool)
which you can pass as an index to a.
>>> a[b==1]
array([4, 6])
Demo for your second example:
>>> a = array([[3,2], [4,5], [6,1]])
>>> b = array([0, 1, 0])
>>> a[b==1]
array([[4, 5]])
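One thing worth spelling out is why the comparison b == 1 is needed at all. A small sketch of what happens if you index with the 0/1 integer array directly:

```python
import numpy as np

a = np.array([3, 4, 5, 6])
b = np.array([0, 1, 0, 1])

# Integer array: values are treated as positions, so this picks a[0], a[1], a[0], a[1].
print(a[b])               # [3 4 3 4]

# Boolean array: values are treated as a keep/drop mask.
print(a[b == 1])          # [4 6]
print(a[b.astype(bool)])  # [4 6]
```

So the index has to be of boolean dtype for it to act as a mask; b == 1 and b.astype(bool) both do that here.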
You could use compress:
>>> a = np.array([3,4,5,6])
>>> b = np.array([0,1,0,1])
>>> a.compress(b)
array([4, 6])
You can provide an axis argument for multi-dimensional cases:
>>> a2 = np.array([[3,2], [4,5], [6,1]])
>>> b2 = np.array([0, 1, 0])
>>> a2.compress(b2, axis=0)
array([[4, 5]])
This method will also work when b is shorter than the axis of a you're indexing against; the output is then truncated to the length of b.
Say I have a 3 dimensional numpy array:
np.random.seed(1145)
A = np.random.random((5,5,5))
and I have two lists of indices corresponding to the 2nd and 3rd dimensions:
second = [1,2]
third = [3,4]
and I want to select the elements in the numpy array corresponding to
A[:][second][third]
so the shape of the sliced array would be (5,2,2) and
A[:][second][third].flatten()
would be equivalent to:
In [226]:
for i in range(5):
    for j in second:
        for k in third:
            print(A[i][j][k])
0.556091074129
0.622016249651
0.622530505868
0.914954716368
0.729005532319
0.253214472335
0.892869371179
0.98279375528
0.814240066639
0.986060321906
0.829987410941
0.776715489939
0.404772469431
0.204696635072
0.190891168574
0.869554447412
0.364076117846
0.04760811817
0.440210532601
0.981601369658
Is there a way to slice a numpy array in this way? So far when I try A[:][second][third] I get IndexError: index 3 is out of bounds for axis 0 with size 2 because the [:] for the first dimension seems to be ignored.
Numpy uses multiple indexing, so instead of A[1][2][3], you can--and should--use A[1,2,3].
You might then think you could do A[:, second, third], but the numpy indices are broadcast, and broadcasting second and third (two one-dimensional sequences) ends up being the numpy equivalent of zip, so the result has shape (5, 2).
What you really want is to index with, in effect, the outer product of second and third. You can do this with broadcasting by making one of them, say second into a two-dimensional array with shape (2,1). Then the shape that results from broadcasting second and third together is (2,2).
For example:
In [8]: import numpy as np
In [9]: a = np.arange(125).reshape(5,5,5)
In [10]: second = [1,2]
In [11]: third = [3,4]
In [12]: s = a[:, np.array(second).reshape(-1,1), third]
In [13]: s.shape
Out[13]: (5, 2, 2)
Note that, in this specific example, the values in second and third are sequential. If that is typical, you can simply use slices:
In [14]: s2 = a[:, 1:3, 3:5]
In [15]: s2.shape
Out[15]: (5, 2, 2)
In [16]: np.all(s == s2)
Out[16]: True
There are a couple of very important differences between those two methods.
The first method would also work with indices that are not equivalent to slices. For example, it would work if second = [0, 2, 3]. (Sometimes you'll see this style of indexing referred to as "fancy indexing".)
In the first method (using broadcasting and "fancy indexing"), the data is a copy of the original array. In the second method (using only slices), the array s2 is a view into the same block of memory used by a. An in-place change in one will change them both.
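Here is a small sketch that checks the claims above: the zip-like pairing, the outer-product shape, and the copy versus view behaviour. It uses np.shares_memory for the last part, and the name paired is only for illustration:

```python
import numpy as np

a = np.arange(125).reshape(5, 5, 5)
second = [1, 2]
third = [3, 4]

# Two 1-D index lists are paired up like zip: (1, 3) and (2, 4).
paired = a[:, second, third]
print(paired.shape)                 # (5, 2)

# A (2, 1) column against a (2,) row broadcasts to every (j, k) combination.
s = a[:, np.array(second)[:, None], third]
s2 = a[:, 1:3, 3:5]
print(s.shape, s2.shape)            # (5, 2, 2) (5, 2, 2)

print(np.shares_memory(a, s))       # False: fancy indexing copied the data
print(np.shares_memory(a, s2))      # True: the slice is a view

s2[0, 0, 0] = -1                    # writes through to a
print(a[0, 1, 3])                   # -1
```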
One way would be to use np.ix_:
>>> out = A[np.ix_(range(A.shape[0]),second, third)]
>>> out.shape
(5, 2, 2)
>>> manual = [A[i,j,k] for i in range(5) for j in second for k in third]
>>> (out.ravel() == manual).all()
True
Downside is that you have to specify the missing coordinate ranges explicitly, but you could wrap that into a function.
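A possible wrapper might look like the sketch below. select_axes is a hypothetical helper name of my own, not something NumPy provides, and it assumes a 3-D array where the first axis is kept whole:

```python
import numpy as np

def select_axes(arr, second, third):
    """Hypothetical helper: keep all of axis 0, pick every (second, third) combination."""
    return arr[np.ix_(range(arr.shape[0]), second, third)]

A = np.arange(125).reshape(5, 5, 5)
out = select_axes(A, [1, 2], [3, 4])
print(out.shape)        # (5, 2, 2)
print(out[0].tolist())  # [[8, 9], [13, 14]]
```

Because it goes through np.ix_, the index lists don't have to be contiguous, so something like select_axes(A, [0, 2, 3], [1, 4]) should work too.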
I think there are three problems with your approach:
Both second and third should be slices
Since the 'to' index is exclusive, they should go from 1 to 3 and from 3 to 5
Instead of A[:][second][third], you should use A[:,second,third]
Try this:
>>> np.random.seed(1145)
>>> A = np.random.random((5,5,5))
>>> second = slice(1,3)
>>> third = slice(3,5)
>>> A[:,second,third].shape
(5, 2, 2)
>>> A[:,second,third].flatten()
array([ 0.43285482, 0.80820122, 0.64878266, 0.62689481, 0.01298507,
0.42112921, 0.23104051, 0.34601169, 0.24838564, 0.66162209,
0.96115751, 0.07338851, 0.33109539, 0.55168356, 0.33925748,
0.2353348 , 0.91254398, 0.44692211, 0.60975602, 0.64610556])
Let's say I have one array
a = numpy.arange(8*6*3).reshape((8, 6, 3))
#and another:
l = numpy.array([[0,0],[0,1],[1,1]]) #an array of indexes to array "a"
#and yet another:
b = numpy.array([[0,0,5],[0,1,0],[1,1,3]])
where "l" and "b" are of equal length,
and I want to say
a[l] = b
such that a[0][0] becomes [0,0,5], a[0][1] becomes [0,1,0], etc.
It seems to work fine when I've got one-dimensional arrays, but it gives me the error
ValueError: array is not broadcastable to correct shape
when I try it with a 3-dimensional array.
import numpy as np
a = np.arange(8*6*3).reshape((8, 6, 3))
l = np.array([[0,0],[0,1],[1,1]]) #an array of indexes to array "a"
b = np.array([[0,0,5],[0,1,0],[1,1,3]])
a[tuple(l.T)] = b
print(a[0,0])
# [0 0 5]
print(a[0,1])
# [0 1 0]
print(a[1,1])
# [1 1 3]
Anne Archibald says,
When you are supplying arrays in all index slots, what you get back
has the same shape as the arrays you put in; so if you supply
one-dimensional lists, like
A[[1,2,3],[1,4,5],[7,6,2]]
what you get is
[A[1,1,7], A[2,4,6], A[3,5,2]]
When you compare that with your example, you see that
a[l] = b tells NumPy to set
a[0,0,1] = [0,0,5]
a[0,1,1] = [0,1,0]
and leaves the third element of b unassigned. This is why you get the error
ValueError: array is not broadcastable to correct shape
The solution is to transpose the array l into the correct shape:
In [50]: tuple(l.T)
Out[50]: (array([0, 0, 1]), array([0, 1, 1]))
(You could also use zip(*l), wrapped in tuple() on Python 3 where zip returns an iterator, but tuple(l.T) is a bit quicker.)
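To see the shape problem directly, here is a short sketch. It assumes a recent NumPy, where a single ndarray used as an index is applied to the first axis only. rows and cols are just names for the two halves of l.T:

```python
import numpy as np

a = np.arange(8 * 6 * 3).reshape((8, 6, 3))
l = np.array([[0, 0], [0, 1], [1, 1]])
b = np.array([[0, 0, 5], [0, 1, 0], [1, 1, 3]])

# A single (3, 2) index array is applied to axis 0 only.
print(a[l].shape)            # (3, 2, 6, 3), which b cannot be broadcast to

# A tuple of two (3,) arrays indexes axes 0 and 1 pairwise.
rows, cols = l.T
print(a[rows, cols].shape)   # (3, 3), the same shape as b

a[rows, cols] = b
print(a[0, 0], a[0, 1], a[1, 1])   # [0 0 5] [0 1 0] [1 1 3]
```

a[rows, cols] is the same indexing operation as a[tuple(l.T)], just written out.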
Or with your same arrays you can use:
for i in range(len(l)):
    a[l[i][0]][l[i][1]] = b[i]