NumPy slicing: All except one array entry - python

What is the best way to exclude exactly one NumPy array entry from an operation?
I have an array x containing n values and want to exclude the i-th entry when I call numpy.prod(x). I know about MaskedArray, but is there another/better way?

I think the simplest would be
np.prod(x[:i]) * np.prod(x[i+1:])
This should be fast and also works when you don't want to or can't modify x.
And in case x is multidimensional and i is a tuple:
x_f = x.ravel()
i_f = np.ravel_multi_index(i, x.shape)
np.prod(x_f[:i_f]) * np.prod(x_f[i_f+1:])
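The slicing approach above can be wrapped in a small helper. This is only a sketch: `prod_except` is a hypothetical name (not a NumPy function), and it assumes `i` is a non-negative index or index tuple.

```python
import numpy as np

def prod_except(x, i):
    """Product of all entries of x except the one at index i.

    Assumes i is a non-negative int (1-D x) or a tuple of ints (n-D x).
    """
    x = np.asarray(x)
    flat = x.ravel()
    # Convert a multidimensional index into a position in the flattened array.
    i_flat = np.ravel_multi_index(i, x.shape) if x.ndim > 1 else int(i)
    return np.prod(flat[:i_flat]) * np.prod(flat[i_flat + 1:])

x = np.arange(1, 5)                  # [1, 2, 3, 4]
print(prod_except(x, 2))             # 1 * 2 * 4 = 8

m = np.arange(1, 7).reshape(2, 3)    # [[1, 2, 3], [4, 5, 6]]
print(prod_except(m, (1, 0)))        # skips the 4: 1 * 2 * 3 * 5 * 6 = 180
```

Neither call modifies its input, which is the main advantage over overwriting x[i].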

You could use np.delete, which returns a copy of a one-dimensional array with the given element removed:
import numpy as np
x = np.arange(1, 5)
i = 2
y = np.prod(np.delete(x, i)) # gives 8

I don't think there is any better way, honestly. Even without knowing the NumPy functions, I would do it like:
#I assume x is array of len n
temp = x[i] #where i is the index of the value you don't want to change
x = x * 5
#...do whatever with the array...
x[i] = temp
If I understand correctly, your problem is one dimensional? Even if not, you can do this the same way.
EDIT:
I checked the prod function, and in this case I think you can just replace the value you don't want to use with 1 (using the temp approach I've given above) and put the original value back afterwards. It is just an in-place change, so it's fairly efficient. The second way is to divide the result by the x[i] value (assuming it's not 0, as commenters said).
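A minimal sketch of the temporary-replacement idea, assuming x is a writable 1-D array. The try/finally is an addition, not part of the answer above; it only ensures the original value is restored. The example data is chosen so that x[i] is 0, which is exactly the case where dividing the full product by x[i] breaks down.

```python
import numpy as np

x = np.array([3, 0, 5, 2])
i = 1  # index to leave out of the product

# Temporarily neutralise x[i] with 1, then restore it.
saved = x[i]
x[i] = 1
try:
    result = np.prod(x)
finally:
    x[i] = saved  # x is back to its original contents

print(result)  # 3 * 5 * 2 = 30

# Dividing the full product by x[i] only works when x[i] != 0.
full = np.prod(x)  # 0 here, because x[1] == 0, so full / x[i] is undefined
```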

As np.prod is taking the product of all the elements in an array, if we want to exclude one element from the solution, we can set that element to 1 first in order to ignore it (as p * 1 = p).
So:
>>> n = 10
>>> x = np.arange(10)
>>> i = 0
>>> x[i] = 1
>>> np.prod(x)
362880
which, we can see, works:
>>> 1 * 2 * 3 * 4 * 5 * 6 * 7 * 8 * 9
362880

You could use a list comprehension to index all the points but 1:
i = 2
np.prod(x[[val for val in range(len(x)) if val != i]])
or use a set difference:
np.prod(x[list(set(range(len(x))) - {i})])
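A related variant, not from the answers above, builds the "everything but i" selection as a boolean mask instead of a Python list of indices. This is a sketch assuming a 1-D array; `keep` is just an illustrative name.

```python
import numpy as np

x = np.arange(1, 6)   # [1, 2, 3, 4, 5]
i = 2

keep = np.ones(len(x), dtype=bool)
keep[i] = False        # True everywhere except position i

print(np.prod(x[keep]))   # 1 * 2 * 4 * 5 = 40
```

The mask stays in NumPy throughout, so no Python-level loop over the indices is needed.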

Related

Can itertools be used for an unspecified number of dimensions in this case?

This is my code, working with dim=3, but I would like it to work for any dimensionality without having to manually edit code.
Eventually I would like to be able to vary the dimensionality between 3 and 20 without having to add for-loops manually.
I was looking at itertools, but don't know how to select the correct values from the tuples created by itertools.product() to square and add up for my if statement.
arrayshape = (width * 2 + 1,) * dim
funcspace = np.zeros(shape=arrayshape, dtype='b')
x1 = list(range(-int(width), int(width + 1)))
x2 = x1
x3 = x1
for i in range(len(x1)):
    for j in range(len(x2)):
        for k in range(len(x3)):
            if round(np.sqrt(x1[i] ** 2 + x2[j] ** 2 + x3[k] ** 2)) in ranges:
                funcspace[i][j][k] = 1
You can use product on enumerate of your vectors, which will yield the value and the index:
for ((i,v1),(j,v2),(k,v3)) in itertools.product(enumerate(x1),enumerate(x2),enumerate(x3)):
    if round(np.sqrt(v1**2+v2**2+v3**2)) in ranges:
        funcspace[i][j][k]=1
As a bonus, you get rid of the unpythonic range(len()) construct.
I've cooked up a more general version for when you have a vector of vectors. It's a little harder to read because the unpacking isn't done in the for loop.
The sum of squares is computed with sum over element 1 of each pair (the values), and if the condition matches, we descend through the nested dimensions until we reach the innermost one and set the value to 1.
for t in itertools.product(*(enumerate(x) for x in x_list)):
    # compute the squared sum of values
    sqsum = sum(v[1]**2 for v in t)
    # take the square root so the test matches the original 3-D condition
    if round(np.sqrt(sqsum)) in ranges:
        # traverse the dimensions except the last one
        deeper_list = funcspace
        for i in range(len(t)-1):
            deeper_list = deeper_list[t[i][0]]
        # set the flag using the last dimension list
        deeper_list[t[-1][0]] = 1
As noted in the comments, since x1 seems to be repeated, you can replace the first statement with:
for t in itertools.product(enumerate(x1), repeat=dim):
Another comment notes that since funcspace is a NumPy ndarray, we can simplify the "set to 1" loop by indexing with a tuple of the indices (a list would instead trigger fancy indexing along the first axis):
funcspace[tuple(x[0] for x in t)] = 1
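Putting the pieces together, here is a self-contained sketch for an arbitrary number of dimensions. The function names and the sample values for width, dim and ranges are assumptions for illustration. The second function is a vectorized alternative using np.meshgrid that was not part of the answer; it is included as a cross-check.

```python
import itertools
import numpy as np

def fill_funcspace(width, dim, ranges):
    """Mark grid points whose rounded distance from the centre is in `ranges`."""
    coords = list(range(-width, width + 1))
    funcspace = np.zeros((2 * width + 1,) * dim, dtype='b')
    for t in itertools.product(enumerate(coords), repeat=dim):
        idx = tuple(i for i, _ in t)          # position in the array
        sqsum = sum(v ** 2 for _, v in t)     # squared distance from the centre
        if round(np.sqrt(sqsum)) in ranges:
            funcspace[idx] = 1
    return funcspace

def fill_funcspace_vectorized(width, dim, ranges):
    """Same result without a Python-level loop over the grid points."""
    coords = np.arange(-width, width + 1)
    # Sparse grids broadcast against each other to the full dim-dimensional shape.
    grids = np.meshgrid(*([coords] * dim), indexing='ij', sparse=True)
    dist = np.sqrt(sum(g ** 2 for g in grids))
    return np.isin(np.round(dist), list(ranges)).astype('b')

a = fill_funcspace(3, 3, {2, 3})
b = fill_funcspace_vectorized(3, 3, {2, 3})
print(np.array_equal(a, b))
```

Either way the array has (2 * width + 1) ** dim entries, so memory rather than loop syntax becomes the limiting factor as dim grows toward 20.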

Efficient Numpy search in a non-monotonic array

I am trying to conduct something similar to searchsorted, but in the case where the array is not completely monotonic. Say I have a scalar, c and a 1D array x, I want to find the indices i of all elements such that x[i] < c <= x[i + 1]. Importantly, x is not completely monotonic.
The following code works, but I would like to know whether this is the most efficient way to do it, or whether there is a simpler way:
x = np.array([1,2,3,1,2,3,1,2,3])
c = 2.5
t = c > x[:-1]
u = c <= x[1:]
v = t*u
i = v.nonzero()[0]
Or in one line of code:
i = ((c > x[:-1]) * (c <= x[1:])).nonzero()[0]
Is this the most efficient way to recover these indices?
Two additional questions.
1. Is there an easy way to extend this to the case where c is a 1D array and x is a 2D array, where c has as many elements as "rows" in x, and I perform this search for each element of c in the corresponding "row" of x?
2. My ultimate goal is to do this with a three dimensional case. That is, suppose c is still a 1D vector with n elements. Now, let x be a 3D array, with dimensions j by n by k. Is there a way to do #1 above for each "submatrix" in x? Basically, performing #1 above j times.
For example:
x1 = np.array([[1,2,3,1,2,3],[1,2,3,1,2,3],[1,2,3,1,2,3]])
x2 = x1 + 1
x = np.array([x1,x2])
c = np.array([1.5,2.5,3.5])
Under #1 above, when we compare c and x1, we would get: [[0,3],[1,4],[]]
When we compare c and x2, we would get: [[],[0,3],[1,4]]
Finally, under #2, I would like to get:
[[[0,3],[1,4],[]],
[[],[0,3],[1,4]]]
We could compare once to give us the boolean mask and re-use it with negation to get the other comparison array and also use slicing -
m = c > x
i = np.flatnonzero( m[:-1] & ~m[1:] )
We can extend it to the case of a 2D x and a 1D c with a loop, while keeping the work inside the loop minimal by computing the masks up front in a vectorized manner, like so -
m = c[:,None] > x
m2 = m[:,:-1] & ~m[:,1:]
i = [np.flatnonzero( mi ) for mi in m2]
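The same mask trick appears to carry over to the 3D case by broadcasting c along the middle axis. The sketch below uses the example data from the question; the names `crossings` and `result` are illustrative.

```python
import numpy as np

x1 = np.array([[1, 2, 3, 1, 2, 3]] * 3)   # shape (3, 6)
x = np.array([x1, x1 + 1])                # shape (2, 3, 6)
c = np.array([1.5, 2.5, 3.5])             # one threshold per row

# c[None, :, None] broadcasts each c[r] across row r of every submatrix.
m = c[None, :, None] > x
crossings = m[..., :-1] & ~m[..., 1:]      # True where x[i] < c <= x[i + 1]

# Rows can hold different numbers of hits, so the result is a nested list.
result = [[np.flatnonzero(row).tolist() for row in sub] for sub in crossings]
print(result)
# [[[0, 3], [1, 4], []], [[], [0, 3], [1, 4]]]
```

The comparisons are fully vectorized; only the final extraction of ragged index lists uses Python loops.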
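For a task like this, NumPy performs more comparisons than necessary. You can gain about a 5x speedup with Numba, and adapting it to three dimensions is not difficult.
import numba
import numpy as np

@numba.njit
def ind(x, c):
    # room for at most x.size indices; integer dtype because we store positions
    res = np.empty(x.size, dtype=np.int64)
    j = 0
    for i in range(x.size - 1):
        if x[i] < c and c <= x[i + 1]:
            res[j] = i
            j += 1
    # keep only the part of the buffer that was filled
    return res[:j]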

Speed up nested for-loops in python / going through numpy array

Say I have 4 numpy arrays A,B,C,D , each the size of (256,256,1792).
I want to go through each element of those arrays and do something to it, but I need to do it in chunks of 256x256x256-cubes.
My code looks like this:
for l in range(7):
    x, y, z, t = 0,0,0,0
    for m in range(a.shape[0]):
        for n in range(a.shape[1]):
            for o in range(256*l,256*(l+1)):
                t += D[m,n,o] * constant
                x += A[m,n,o] * D[m,n,o] * constant
                y += B[m,n,o] * D[m,n,o] * constant
                z += C[m,n,o] * D[m,n,o] * constant
    final = (x+y+z)/t
    doOutput(final)
The code works and outputs exactly what I want, but it's awfully slow. I've read online that this kind of nested for-loop should be avoided in Python. What is the cleanest solution? (Right now I'm trying to write this part of my code in C and import it via Cython or other tools, but I'd love a pure Python solution.)
Thanks
Add on
Willem Van Onsem's Solution to the first part seems to work just fine and I think I comprehend it. But now I want to modify my values before summing them. It looks like
(within the outer l loop)
for m in range(a.shape[0]):
    for n in range(a.shape[1]):
        for o in range(256*l,256*(l+1)):
            R += (D[m,n,o] * constant * (A[m,n,o]**2
                  + B[m,n,o]**2 + C[m,n,o]**2)/t - final**2)
doOutput(R)
I obviously can't just square the sum x = (A[:a.shape[0],:a.shape[1],256*l:256*(l+1)]*Dsub).sum()**2*constant since (A²+B²) != (A+B)²
How can I rewrite these last for-loops?
Since you update t with every element of m in range(a.shape[0]), n in range(a.shape[1]) and o in range(256*l,256*(l+1)), you can substitute:
for m in range(a.shape[0]):
    for n in range(a.shape[1]):
        for o in range(256*l,256*(l+1)):
            t += D[m,n,o]
With:
t += D[:a.shape[0],:a.shape[1],256*l:256*(l+1)].sum()
The same for the other assignments. So you can rewrite your code to:
for l in range(7):
    Dsub = D[:a.shape[0],:a.shape[1],256*l:256*(l+1)]
    x = (A[:a.shape[0],:a.shape[1],256*l:256*(l+1)]*Dsub).sum()*constant
    y = (B[:a.shape[0],:a.shape[1],256*l:256*(l+1)]*Dsub).sum()*constant
    z = (C[:a.shape[0],:a.shape[1],256*l:256*(l+1)]*Dsub).sum()*constant
    t = Dsub.sum()*constant
    final = (x+y+z)/t
    doOutput(final)
Note that * in NumPy is element-wise multiplication, not the matrix product. You could multiply by the constant before summing, but since a sum of products with a constant equals the constant times the sum, I think it is more efficient to multiply by the constant once, after the sum.
If a.shape[0] is equal to D.shape[0], and so on, you can use : instead of :a.shape[0]. Based on your question, that seems to be the case, so:
# only when `a.shape[0] == D.shape[0], a.shape[1] == D.shape[1] (and so for A, B and C)`
for l in range(7):
    Dsub = D[:,:,256*l:256*(l+1)]
    x = (A[:,:,256*l:256*(l+1)]*Dsub).sum()*constant
    y = (B[:,:,256*l:256*(l+1)]*Dsub).sum()*constant
    z = (C[:,:,256*l:256*(l+1)]*Dsub).sum()*constant
    t = Dsub.sum()*constant
    final = (x+y+z)/t
    doOutput(final)
Doing the .sum() at the NumPy level boosts performance because values are not converted back and forth between NumPy and Python objects, and .sum() runs as a tight compiled loop.
EDIT:
Your updated question does not change much. You can simply use:
m, n, *_ = a.shape
lo, hi = 256*l, 256*(l+1)
R = (D[:m,:n,lo:hi]*constant*(A[:m,:n,lo:hi]**2+B[:m,:n,lo:hi]**2+C[:m,:n,lo:hi]**2)/t-final**2).sum()
doOutput(R)
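One way to gain confidence in the vectorized rewrite is to compare it with the explicit loops on small random arrays. The sketch below does that; the function names, the tiny array shape and the chunk size of 2 (standing in for 256) are assumptions made for illustration, and the arrays are assumed to share one shape.

```python
import numpy as np

def chunk_stats_loops(A, B, C, D, constant, lo, hi):
    """Reference version: element-by-element loops over one chunk."""
    x = y = z = t = 0.0
    for m in range(D.shape[0]):
        for n in range(D.shape[1]):
            for o in range(lo, hi):
                t += D[m, n, o] * constant
                x += A[m, n, o] * D[m, n, o] * constant
                y += B[m, n, o] * D[m, n, o] * constant
                z += C[m, n, o] * D[m, n, o] * constant
    final = (x + y + z) / t
    R = 0.0
    for m in range(D.shape[0]):
        for n in range(D.shape[1]):
            for o in range(lo, hi):
                R += (D[m, n, o] * constant
                      * (A[m, n, o]**2 + B[m, n, o]**2 + C[m, n, o]**2) / t
                      - final**2)
    return final, R

def chunk_stats_vectorized(A, B, C, D, constant, lo, hi):
    """Same quantities using slices and .sum()."""
    sl = (slice(None), slice(None), slice(lo, hi))
    Dsub = D[sl]
    t = Dsub.sum() * constant
    final = ((A[sl] + B[sl] + C[sl]) * Dsub).sum() * constant / t
    R = (Dsub * constant * (A[sl]**2 + B[sl]**2 + C[sl]**2) / t - final**2).sum()
    return final, R

rng = np.random.default_rng(0)
A, B, C, D = (rng.random((4, 4, 6)) for _ in range(4))
for l in range(3):                      # three chunks of depth 2
    lo, hi = 2 * l, 2 * (l + 1)
    print(np.allclose(chunk_stats_loops(A, B, C, D, 0.5, lo, hi),
                      chunk_stats_vectorized(A, B, C, D, 0.5, lo, hi)))
```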

Filter one array according to data in another array

I have two arrays of random numbers, X and Y. X represents x-coordinates and Y represents y-coordinates. I want to filter X such that I only keep indices i of X where:
X[i]^2 + Y[i]^2 < 1
I know how to filter with values in 1 array but since I need to use 2, I am not sure what to do. I am not allowed to use loops of any kind.
This will do:
X_filtered = X[X**2 + Y**2 < 1]
X**2 + Y**2 < 1 returns a boolean array, and indexing X with this array returns only the elements of X at positions where the array is True.
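A small self-contained sketch of the boolean-mask filter, assuming X and Y are NumPy arrays of equal length. The random data and the name `inside` are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(42)
X = rng.uniform(-1, 1, size=1000)
Y = rng.uniform(-1, 1, size=1000)

inside = X**2 + Y**2 < 1       # boolean array, one entry per point
X_filtered = X[inside]
Y_filtered = Y[inside]         # the same mask keeps the (x, y) pairs aligned

print(len(X_filtered), "of", len(X), "points lie inside the unit circle")
```

Reusing one mask for both arrays is what keeps the filtered coordinates paired up.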
for ind, (a, b) in enumerate(zip(X, Y)):
    if (a**2 + b**2) < 1:
        print(ind)
X = [X[i] for i in range(len(X)) if X[i]**2 + Y[i]**2 < 1]
This will filter X so that it only contains the values that match your filtering criterion.
Note that this does loop, via the comprehension, so I am not quite sure how this is to be done without looping.
So I think these arrays are too large to loop over? If you only want to keep the indices for later, try a generator:
def X_indices_filtered(X, Y):
    for i in range(len(X)):
        if X[i] ** 2 + Y[i] ** 2 < 1:
            yield i
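If only the indices are needed and loops are off the table, np.flatnonzero on the boolean condition gives them directly. This is a sketch with made-up sample values, not part of the answer above.

```python
import numpy as np

X = np.array([0.1, 0.9, -0.5, 0.8])
Y = np.array([0.2, 0.9, 0.5, -0.7])

# Positions where the point (X[i], Y[i]) lies inside the unit circle.
indices = np.flatnonzero(X**2 + Y**2 < 1)
print(indices)        # [0 2]
print(X[indices])     # the corresponding x-coordinates
```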

Aesthetic way of appending to a list in Python?

When appending longer statements to a list, I feel append becomes awkward to read. I would like a method that would work for dynamic list creation (i.e. don't need to initialize with zeros first, etc.), but I cannot seem to come up with another way of doing what I want.
Example:
import math
mylist = list()
phi = [1,2,3,4] # let's pretend this is of unknown/varying length
i, num, radius = 0, 4, 6
while i < num:
    mylist.append(2*math.pi*radius*math.cos(phi[i]))
    i = i + 1
Though append works just fine, I feel it is less clear than:
mylist[i] = 2*math.pi*radius*math.cos(phi[i])
But this does not work, as that element does not exist in the list yet, yielding:
IndexError: list assignment index out of range
I could just assign the resulting value to temporary variable, and append that, but that seems ugly and inefficient.
You don't need to create a list first and append to it later. Just use a list comprehension.
A list comprehension is fast, easy to comprehend, and can easily be ported to a generator expression:
>>> import math
>>> phi = [1,2,3,4]
>>> i, num, radius = 0, 4, 6
>>> circum = 2*math.pi*radius
>>> mylist = [circum * math.cos(p) for p in phi]
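To illustrate the point about generator expressions, here is a brief sketch: swapping the brackets for parentheses yields values lazily instead of building the whole list. The name `lazy` and the use of sum are illustrative choices.

```python
import math

phi = [1, 2, 3, 4]
radius = 6
circum = 2 * math.pi * radius

# List comprehension: builds the full list in memory.
mylist = [circum * math.cos(p) for p in phi]

# Generator expression: same values, produced one at a time on demand.
lazy = (circum * math.cos(p) for p in phi)
total = sum(lazy)   # consumes the generator without creating a list

print(mylist)
print(total)
```

A generator can only be iterated once, so keep the list form if the values are needed again later.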
Reviewing your code, here are some generic suggestions.
Do not recompute a known constant on every iteration:
while i < num:
    mylist.append(2*math.pi*radius*math.cos(phi[i]))
    i = i + 1
should be written as
circum = 2*math.pi*radius
while i < num:
    mylist.append(circum*math.cos(phi[i]))
    i = i + 1
Instead of while, use a for-each construct:
for p in phi:
    mylist.append(circum*math.cos(p))
If an expression is not readable, break it into multiple statements. After all, readability counts in Python.
In this particular case you could use a list comprehension:
mylist = [2*math.pi*radius*math.cos(phi[i]) for i in range(num)]
Or, if you do this sort of computation a lot, you could move away from lists and use NumPy instead:
In [78]: import numpy as np
In [79]: phi = np.array([1, 2, 3, 4])
In [80]: radius = 6
In [81]: 2 * np.pi * radius * np.cos(phi)
Out[81]: array([ 20.36891706, -15.68836613, -37.32183785, -24.64178397])
I find this last version to be the most aesthetically pleasing of all. For longer phi it will also be more performant than using lists.
mylist += [2*math.pi*radius*math.cos(phi[i])]
You can use list concatenation, although append is reportedly about twice as fast:
import math
mylist = list()
phi = [1,2,3,4] # let's pretend this is of unknown/varying length
i, num, radius = 0, 4, 6
while i < num:
    mylist += [(2*math.pi*radius*math.cos(phi[i]))]
    i = i + 1
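The relative speed of append and += can be measured with timeit. The sketch below is one way to do so; the function names and the list size are arbitrary choices, and the timings will vary with the machine and Python version, so no particular ratio is claimed here.

```python
import math
import timeit

phi = [1, 2, 3, 4] * 250
radius = 6

def with_append():
    out = []
    for p in phi:
        out.append(2 * math.pi * radius * math.cos(p))
    return out

def with_concat():
    out = []
    for p in phi:
        # Builds a one-element list on every iteration before extending.
        out += [2 * math.pi * radius * math.cos(p)]
    return out

print("append:", timeit.timeit(with_append, number=200))
print("concat:", timeit.timeit(with_concat, number=200))
```

Both functions produce the same list, so the choice between them comes down to speed and readability.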
