I am trying to figure out the best way to do leave-one-out indexing with numpy. This is the desired behaviour:
import numpy as np
a = np.random.randint(0,10,size=10)
print(a)
def fun(x, xs):
    print(x, xs)  # do some stuff
for i in range(a.shape[0]):
    fun(a[i], a[np.arange(a.shape[0]) != i])  # this is all I can think of, but it's horrid!
Is there a nicer, more efficient way to do this?
EDIT: To clarify, a question that is hopefully a bit clearer:
I have an array and I want a view that has one or more elements missing in the middle, e.g. a = [1,2,3,4,5,...] becomes a = [1,2,4,5,...]. From what I have read, fancy indexing / masking makes a copy of the array. I want to avoid that, and also avoid creating a large index array. Thanks in advance for the help!
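One possible workaround, offered as a minimal sketch rather than a definitive answer: keep a single working copy of the array and swap the held-out element into position 0, so that `buf[1:]` is a basic-slice view of everything else. The helper name `leave_one_out_views` is hypothetical, the sketch assumes a 1-D array, and it only covers the case of a single missing element.

```python
import numpy as np

def leave_one_out_views(a):
    """Yield (a[i], view of all other elements) for a 1-D array.

    Hypothetical helper: makes one copy up front, then only swaps two
    elements per step, so no index or mask array is built per iteration.
    """
    buf = np.array(a, copy=True)  # single working copy; `a` is untouched
    for i in range(buf.shape[0]):
        if i > 0:
            # Move a[i] to the front; buf[1:i+1] then holds a[0:i] in order.
            buf[0], buf[i] = buf[i], buf[0]
        # buf[1:] is a view into buf, valid only until the next iteration.
        yield buf[0], buf[1:]

a = np.array([1, 2, 3, 4, 5])
results = [(x, xs.copy()) for x, xs in leave_one_out_views(a)]
```

Each yielded view has to be consumed (or copied, as in the last line) before the next iteration, because the buffer is modified in place.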
Thanks in advance for your help! I would like to do the following, but I am new to Python and unsure how to do it efficiently.
I have a 2d array, for example A=[[1,1],[2,3]].
Each value in the above 2d array corresponds to the index in another 1d array, for example: B=[0.1,0.2,0.8,0.9].
The end result should look like C=[[0.2,0.2],[0.8,0.9]]. That is, C[i,j] = B[A[i,j]].
The above is a simple example, but in practice A can be a huge array, so I would really appreciate a way to do this efficiently. Thank you!
Judging by your post, you almost have the answer already. If you are looking for a one-liner, you can do this:
c = B[A]
c
Out[24]:
array([[0.2, 0.2],
       [0.8, 0.9]])
The code above is for numpy arrays. If A and B are plain lists instead, a list comprehension is required.
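If converting the lists is acceptable, a minimal sketch of the same lookup is to wrap them with `np.asarray` first and then index as above. The variable names follow the question; `tolist()` is only there to get nested lists back.

```python
import numpy as np

A = [[1, 1], [2, 3]]
B = [0.1, 0.2, 0.8, 0.9]

# Convert once, then let integer-array indexing do the lookup C[i, j] = B[A[i, j]].
C = np.asarray(B)[np.asarray(A)]
C_as_lists = C.tolist()  # back to plain nested lists, if needed
```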
First work out how each index in the first list maps to a value in the result list.
A = [[1,1],[2,3]]
B=[0.1,0.2,0.8,0.9]
C = [[B[i] for i in j] for j in A]
print(C)
Based on your comments on the answer by @PAUL ANDY DE LA CRUZ YANAC, I see that you are trying to use numpy and avoid a for loop, but as far as I know, you need to use a for loop at least once.
import numpy as np
for x, y in np.ndindex(np.array(A).shape):
    A[x][y] = B[A[x][y]]
Note: This approach changes the original list A. If you want to create a new list instead, look at the solution by @Paul Dlc.
I am having a small issue understanding indexing in Numpy arrays. I think a simplified example is best to get an idea of what I am trying to do.
So first I create an array of zeros of the size I want to fill:
x = range(0,10,2)
y = range(0,10,2)
a = zeros(len(x),len(y))
That should give me a 5x5 array of zeros. Now I want to fill the array with a rather complicated function that I can't get to work with grids. My problem is that I'd like to iterate as:
for i in xrange(0,10,2):
    for j in xrange(0,10,2):
        .........
        "do function and fill the array corresponding to (i,j)"
However, the value I would like to think of as a[2,10], a function of 2 and 10, actually ends up at a[1,4] or whatever the array index happens to be.
Again, maybe this is elementary, I've gone over the docs and find myself at a loss.
EDIT:
In the end I vectorized as much as possible and wrote the simulation loops that I could not vectorize in Cython. I also used Joblib to parallelize the operation. I stored the results in a list because an array was not being filled correctly when running in parallel. I then used itertools to split the list into individual results and pandas to organize them.
Thank you for all the help
Some tips to get this done while keeping good performance:
- avoid Python `for` loops
- create a function that can deal with vectorized inputs
Example:
def f(xs, ys):
    return xs**2 + ys**2 + xs*ys
where you can pass xs and ys as arrays and the operation will be done element-wise:
xs = np.random.random((100,200))
ys = np.random.random((100,200))
f(xs,ys)
You should read more about numpy broadcasting to better understand how array operations work. This will help you design a function that handles arrays properly.
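As a rough illustration of how broadcasting applies to the 5x5 grid in the question, the sketch below evaluates a vectorized function on every (x, y) pair without a Python loop. The function body is just the example above, standing in for the real computation.

```python
import numpy as np

def f(xs, ys):
    # Stand-in for the real computation; works element-wise on arrays.
    return xs**2 + ys**2 + xs*ys

x = np.arange(0, 10, 2)  # 0, 2, 4, 6, 8
y = np.arange(0, 10, 2)

# x[:, None] has shape (5, 1) and y[None, :] has shape (1, 5);
# broadcasting expands both to (5, 5), so a[i, j] == f(x[i], y[j]).
a = f(x[:, None], y[None, :])
```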
First, you are missing a pair of parentheses in the call to zeros; the first argument should be a tuple:
a = zeros((len(x),len(y)))
Then, the corresponding indices for your table are i//2 and j//2 (integer division):
for i in xrange(0,10,2):
    for j in xrange(0,10,2):
        # do function and fill the array corresponding to (i,j)
        a[i//2, j//2] = 1
But I second Saullo Castro: you should try to vectorize your computations.
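If the loop has to stay, one way to avoid the index arithmetic is to let `enumerate` supply the array position alongside the coordinate value. This is only a sketch: `g` is a hypothetical placeholder for the real function.

```python
import numpy as np

def g(i, j):
    # Hypothetical placeholder for the "rather complicated function".
    return i * 10 + j

x = range(0, 10, 2)
y = range(0, 10, 2)
a = np.zeros((len(x), len(y)))

# row/col are positions in the array; i/j are the coordinate values.
for row, i in enumerate(x):
    for col, j in enumerate(y):
        a[row, col] = g(i, j)
```

This also keeps working if the step changes or the coordinates are not evenly spaced.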
I'm trying to understand how to translate a filter, in this case a notch (stopband) filter, to Python, but I don't know how.
x(n)=-2*x(n)/(-0.9*x(n) -0.9*x(n-1))
Can anyone help me please?
Thanks in advance.
If you're using numpy arrays, this should work:
x[1:]=-2*x[1:]/(-0.9*x[1:]-0.9*x[:-1])
This changes your array in place, but you could just as easily assign the result to a new array:
y=-2*x[1:]/(-0.9*x[1:]-0.9*x[:-1])
Note that your algorithm isn't really well defined for the 0th element, so my translation leaves x[0] unchanged.
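To make the slicing concrete, here is a small sketch on made-up sample values (the input array is an assumption, not data from the question). It shows that `x[1:]` lines up with x(n) and `x[:-1]` with x(n-1), and that the in-place and new-array versions compute the same numbers.

```python
import numpy as np

x = np.array([1.0, 2.0, 4.0])  # made-up sample signal

# New array: one output per n >= 1, so y has len(x) - 1 elements.
y = -2 * x[1:] / (-0.9 * x[1:] - 0.9 * x[:-1])

# In place: the right-hand side is evaluated in full before the assignment,
# so it still uses the original samples. x_inplace[0] is left unchanged.
x_inplace = x.copy()
x_inplace[1:] = -2 * x_inplace[1:] / (-0.9 * x_inplace[1:] - 0.9 * x_inplace[:-1])
```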
EDIT
To change an iterable to a numpy array:
import numpy as np
x=np.array(iterable) #pretty easy :) although there could be more efficient ways depending on where "iterable" comes from.
result = []
#prime your result, that is, add the initial values to handle indexing
lower_bound = #
upper_bound = #
for n in range(lower_bound,upper_bound):
    result.append( 2*result[n]/(-0.9*result[n] -0.9*result[n-1]) )
A toy-case for my problem:
I have a numpy array of size, say, 1000:
import numpy as np
a = np.arange(1000)
I also have a "projection array" p which is a mapping from a to another array b:
p = np.random.randint(0,1000,(1000,1000))
It is easy to get b from a using "fancy indexing":
b = a[p]
But b is not a view, as noted by several previous questions/answers and the numpy documentation.
Unfortunately, in my case only the values in a change over the course of a long simulation and using fancy indexing at each iteration to obtain b becomes very costly. I only read from b and do not modify it.
I understand it is not possible (yet) to solve this with fancy indexing.
I was wondering if anyone had a similar problem/bottleneck and came up with some other workaround?
What you're asking for isn't practical, and that's why the numpy folks haven't implemented it. You could do it yourself with something like:
class FancyView(object):
    def __init__(self, array, index):
        self._array = array
        self._index = index.copy()
    def __array__(self):
        return self._array[self._index]
    def __getitem__(self, index):
        return self._array[self._index[index]]
b = FancyView(a, p)
But notice that the expensive a[p] operation will get called every time you use b as an array. There is no other practical way of making a 'view' of this kind. Numpy can get away with using views for basic slicing because it can manipulate the strides, but there is no way to do something like this using strides.
If you only need parts of b you might be able to get some time savings by indexing the fancy view instead of using it as an array.
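A small usage sketch of that idea, with much smaller arrays than in the question so it runs quickly. The class is repeated so the example is self-contained; the extra `dtype` and `copy` parameters on `__array__` are an addition here, included because newer NumPy versions may pass them.

```python
import numpy as np

class FancyView(object):
    def __init__(self, array, index):
        self._array = array          # kept by reference, so later changes to it are seen
        self._index = index.copy()

    def __array__(self, dtype=None, copy=None):
        # Full (expensive) fancy-indexing pass; always produces a new array.
        out = self._array[self._index]
        return out if dtype is None else out.astype(dtype)

    def __getitem__(self, index):
        # Index the projection first, so only the requested part is gathered.
        return self._array[self._index[index]]

a = np.arange(10)
p = np.random.randint(0, 10, (4, 4))
b = FancyView(a, p)

a[:] = a * 2           # simulate `a` changing during the simulation
row = b[0]             # gathers only 4 values
full = np.asarray(b)   # gathers all 16 values
```

Here `b[0]` touches one row of `p`, while `np.asarray(b)` still pays for the full `a[p]` lookup.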
I would like to apply a function to a one-dimensional array 3 elements at a time, and output a single element for each group.
For example, I have an array of 13 elements:
a = np.arange(13)**2
and I want to apply a function, let's say np.std as an example.
Here is the equivalent list comprehension:
[np.std(a[i:i+3]) for i in range(0, len(a),3)]
[1.6996731711975948,
6.5489609014628334,
11.440668201153674,
16.336734339790461,
0.0]
Does anyone know a more efficient way using numpy functions?
The simplest way is to reshape it and apply the function along an axis.
import numpy as np
a = np.arange(12)**2
b = a.reshape(4,3)
print(np.std(b, axis=1))
If you need a little better performance than that, you could try stride_tricks. Below is the same as above, except using stride_tricks. I was wrong about the performance gain, though: as you can see below, b becomes exactly the same view as b above. I wouldn't be surprised if they compiled to exactly the same thing.
import numpy as np
a = np.arange(12)**2
b = np.lib.stride_tricks.as_strided(a, shape=(4,3), strides=(a.itemsize*3, a.itemsize))
print(np.std(b, axis=1))
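A quick way to check the claim that both approaches give the same view is to compare strides and memory sharing. This sketch only uses `np.shares_memory` and the `strides` attribute; it says nothing about timing.

```python
import numpy as np

a = np.arange(12) ** 2
b_reshape = a.reshape(4, 3)
b_strided = np.lib.stride_tricks.as_strided(
    a, shape=(4, 3), strides=(a.itemsize * 3, a.itemsize)
)

# Both results should describe the same memory layout over the same buffer.
same_strides = b_reshape.strides == b_strided.strides
both_are_views = np.shares_memory(a, b_reshape) and np.shares_memory(a, b_strided)
same_values = np.array_equal(b_reshape, b_strided)
```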
Are you talking about something like vectorize? http://docs.scipy.org/doc/numpy/reference/generated/numpy.vectorize.html
You can reshape it, but that requires the length to be a multiple of 3. If you can tack on some bogus entries at the end, you can do this:
[np.std(s) for s in a.reshape(-1,3)]
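One hedged way to choose the bogus entries so they do not distort the result: pad with NaN and use `np.nanstd`, which ignores NaN values. This is a sketch for the 13-element example from the question; the names `k`, `pad`, and `padded` are illustrative.

```python
import numpy as np

a = np.arange(13) ** 2
k = 3                                  # group size
pad = (-len(a)) % k                    # entries needed to reach a multiple of k
padded = np.concatenate([a.astype(float), np.full(pad, np.nan)])

# Each row is one group of k; the NaN padding is skipped by nanstd.
result = np.nanstd(padded.reshape(-1, k), axis=1)
```

The conversion to float is needed because an integer array cannot hold NaN.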