Assume I have a 1d array. I want to sample it with a moving window and, within each window, divide every element by the window's first element.
For example if I have [2, 5, 8, 9, 6] and a window size of 3, the result will be
[[1, 2.5, 4],
[1, 1.6, 1.8],
[1, 1.125, 0.75]].
What I'm doing now is basically a for loop:
import numpy as np
arr = np.array([2., 5., 8., 9., 6.])
window_size = 3
result = []  # one normalized window per start position
for i in range(len(arr) - window_size + 1):
    result.append(arr[i : i + window_size] / arr[i])
When the array is large this is quite slow. Is there a better way? I guess there is no way around the O(n^2) complexity, but perhaps numpy has some optimizations that I don't know of.
Here's a vectorized approach using broadcasting -
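N = 3 # Window size
nrows = a.size-N+1 # Number of windows (a is the input 1d array)
a2D = a[np.arange(nrows)[:,None] + np.arange(N)] # Row i holds a[i:i+N]
out = a2D/a[:nrows,None].astype(float) # Divide each row by its first element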
We can also use NumPy strides for a more efficient extraction of sliding windows, like so -
n = a.strides[0]
a2D = np.lib.stride_tricks.as_strided(a,shape=(nrows,N),strides=(n,n))
Sample run -
In [73]: a
Out[73]: array([4, 9, 3, 6, 5, 7, 2])
In [74]: N = 3
...: nrows = a.size-N+1
...: a2D = a[np.arange(nrows)[:,None] + np.arange(N)]
...: out = a2D/a[:nrows,None].astype(float)
...:
In [75]: out
Out[75]:
array([[ 1. , 2.25 , 0.75 ],
[ 1. , 0.33333333, 0.66666667],
[ 1. , 2. , 1.66666667],
[ 1. , 0.83333333, 1.16666667],
[ 1. , 1.4 , 0.4 ]])
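As a hedged alternative to calling as_strided by hand, newer NumPy releases (1.20 and later, as far as I know) provide np.lib.stride_tricks.sliding_window_view, which builds the same kind of window view with bounds checking. A minimal sketch using the array from the question; the helper name normalized_windows is made up for illustration:

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def normalized_windows(a, window_size):
    """Return every length-`window_size` window of `a`, divided by its first element."""
    a = np.asarray(a, dtype=float)
    # Shape (len(a) - window_size + 1, window_size); this is a view, no data is copied.
    windows = sliding_window_view(a, window_size)
    # windows[:, :1] keeps a trailing axis, so each row is divided by its own first value.
    return windows / windows[:, :1]


arr = np.array([2., 5., 8., 9., 6.])
out = normalized_windows(arr, 3)
print(out)
```

The division still produces one value per window element, so the work grows with the array length times the window size, but it runs in compiled code rather than in a Python loop.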
I'm trying to do some basic classification of numpy arrays...
I want to compare a 2d array against a 3d array, along the 3rd dimension, and make a classification based on the corresponding z-axis values.
so given 3 arrays that are stacked into a 3d array:
import numpy as np
a1 = np.array([[1,1,1],[1,1,1],[1,1,1]])
a2 = np.array([[3,3,3],[3,3,3],[3,3,3]])
a3 = np.array([[5,5,5],[5,5,5],[5,5,5]])
a3d = np.dstack((a1,a2,a3))
and another 2d array
a2d = np.array([[1,2,4],[5,5,2],[2,3,3]])
I want to be able to compare a2d against a3d, and return a 2d array of which level of a3d is closest. (Or, I suppose, any custom function that can compare each value along the z-axis and return a value based on that comparison.)
EDIT
I modified my arrays to more closely match my data. a1 would be the minimum values, a2 the average values, and a3 the maximum values. So I want to output whether each a2d value is closer to a1 (classed "1"), a2 (classed "2") or a3 (classed "3"). I'm doing this as a 3d array because in the real data it won't be a simple 3-array choice, but for SO purposes it helps to keep it simple. We can assume that in the case of a tie we'll take the lower level, so 2 would be classed as level "1" and 4 as level "2".
You can use the following list comprehension:
>>> [sum(sum(abs(i-j)) for i,j in z) for z in [zip(i,a2d) for i in a3d]]
[30.0, 22.5, 30.0]
The preceding code first uses zip to pair each row of every sub-array of your 3d array with the corresponding row of the 2d array. It then sums the absolute differences within each pair and sums those again, giving one total per sub-array. The zipped pairs look like this:
>>> [zip(i,a2d) for i in a3d]
[[(array([ 1., 3., 1.]), array([1, 2, 1])), (array([ 2., 2., 1.]), array([5, 5, 4])), (array([ 3., 1., 1.]), array([9, 8, 8]))], [(array([ 4., 6., 4.]), array([1, 2, 1])), (array([ 5. , 6.5, 4. ]), array([5, 5, 4])), (array([ 6., 4., 4.]), array([9, 8, 8]))], [(array([ 7., 9., 7.]), array([1, 2, 1])), (array([ 8., 8., 7.]), array([5, 5, 4])), (array([ 9., 7., 7.]), array([9, 8, 8]))]]
Then, for all of your sub-arrays, you'll have the following list of totals:
[30.0, 22.5, 30.0]
Each entry shows how far the corresponding sub-array is from the 2d array. You can then get the closest sub-array from a3d as follows (here l is the list of totals above):
>>> a3d[l.index(min(l))]
array([[ 4. , 6. , 4. ],
[ 5. , 6.5, 4. ],
[ 6. , 4. , 4. ]])
Also you can put it in a function:
>>> def find_nearest(sub,main):
...     l=[sum(sum(abs(i-j)) for i,j in z) for z in [zip(i,sub) for i in main]]
...     return main[l.index(min(l))]
...
>>> find_nearest(a2d,a3d)
array([[ 4. , 6. , 4. ],
[ 5. , 6.5, 4. ],
[ 6. , 4. , 4. ]])
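The same totals can probably be computed without zip by letting NumPy broadcast the subtraction. The sketch below reuses the arrays visible in the output above and assumes the candidate arrays are stacked along the first axis; find_nearest_layer is a hypothetical name, not part of the answer:

```python
import numpy as np

# Three candidate 3x3 arrays stacked along axis 0 (values taken from the output above).
a3d = np.array([
    [[1., 3., 1.], [2., 2., 1.], [3., 1., 1.]],
    [[4., 6., 4.], [5., 6.5, 4.], [6., 4., 4.]],
    [[7., 9., 7.], [8., 8., 7.], [9., 7., 7.]],
])
a2d = np.array([[1, 2, 1], [5, 5, 4], [9, 8, 8]])


def find_nearest_layer(sub, main):
    """Return the layer of `main` with the smallest summed absolute difference to `sub`."""
    # Broadcasting subtracts `sub` from every layer; summing over the last
    # two axes leaves one total per layer.
    totals = np.abs(main - sub).sum(axis=(1, 2))
    return main[totals.argmin()], totals


nearest, totals = find_nearest_layer(a2d, a3d)
print(totals)
print(nearest)
```

Note that this, like the list comprehension, picks one whole layer for the entire 2d array rather than classifying each element separately.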
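You might consider a different approach using numpy.vectorize, which lets you apply a Python function to each element of your array. Note that it is mainly a convenience: under the hood it is essentially a for loop, so it does not make the function itself faster.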
In this case, your python function could just classify each pixel with whatever breaks you define:
import numpy as np
a2d = np.array([[1,2,4],[5,5,2],[2,3,3]])
def classify(x):
    if x >= 4:
        return 3
    elif x >= 2:
        return 2
    elif x > 0:
        return 1
    else:
        return 0
vclassify = np.vectorize(classify)
result = vclassify(a2d)
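Because np.vectorize still calls the Python function once per element, the same breaks could also be expressed with np.select, which works on whole-array comparisons. A sketch assuming the thresholds from classify above; the variable names conditions and labels are only illustrative:

```python
import numpy as np

a2d = np.array([[1, 2, 4], [5, 5, 2], [2, 3, 3]])

# Conditions are checked in order; the first one that is True picks the label.
conditions = [a2d >= 4, a2d >= 2, a2d > 0]
labels = [3, 2, 1]
result = np.select(conditions, labels, default=0)
print(result)
```

The order of the conditions matters here in the same way the order of the if/elif branches does.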
Thanks to @perrygeo and @Kasra - they got me thinking in a good direction.
Since I want a classification of the closest 3d array's z value, I couldn't do simple math - I needed the (z)index of the closest value.
I did it by enumerating both axes of the 2d array, and doing a proximity compare against the corresponding (z)index of the 3d array.
There might be a way to do this without iterating the 2d array, but at least I'm avoiding iterating the 3d.
import numpy as np
a1 = np.array([[1,1,1],[1,1,1],[1,1,1]])
a2 = np.array([[3,3,3],[3,3,3],[3,3,3]])
a3 = np.array([[5,5,5],[5,5,5],[5,5,5]])
a3d = np.dstack((a1,a2,a3))
a2d = np.array([[1,2,4],[5,5,2],[2,3,3]])
classOut = np.empty_like(a2d)
def find_nearest_idx(array,value):
    # index of the entry in array closest to value (the first one wins on ties)
    idx = (np.abs(array-value)).argmin()
    return idx
# enumerate to get indices
for i,a in enumerate(a2d):
    for ii,v in enumerate(a):
        valStack = a3d[i,ii]  # the z-axis values at this 2d position
        nearest = find_nearest_idx(valStack,v)
        classOut[i,ii] = nearest
print(classOut)
which gets me
[[0 0 1]
[2 2 0]
[0 1 1]]
This tells me that (for example) a2d[0,0] is closest to the 0-index of a3d[0,0], which in my case means it is closest to the min value for that 2d position. a2d[1,1] is closest to the 2-index, which in my case means closer to the max value for that 2d position.
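The loop over the 2d array can likely be avoided as well. A sketch, assuming the same arrays as above: adding a trailing axis to a2d lets it broadcast against every z-level, and argmin along that axis returns the index of the closest level (the lowest index on ties). The names distances, class_out and classes are illustrative:

```python
import numpy as np

a1 = np.array([[1, 1, 1], [1, 1, 1], [1, 1, 1]])
a2 = np.array([[3, 3, 3], [3, 3, 3], [3, 3, 3]])
a3 = np.array([[5, 5, 5], [5, 5, 5], [5, 5, 5]])
a3d = np.dstack((a1, a2, a3))  # shape (3, 3, 3), levels along axis 2
a2d = np.array([[1, 2, 4], [5, 5, 2], [2, 3, 3]])

# a2d[:, :, None] has shape (3, 3, 1) and broadcasts against all levels.
distances = np.abs(a3d - a2d[:, :, None])
class_out = distances.argmin(axis=2)  # 0-based index of the closest level
classes = class_out + 1               # 1-based labels as described in the question
print(class_out)
```

This scales to more than three levels without changes, since the number of levels is just the length of axis 2.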
I have a list:
somelist = [500, 600, 200, 1000]
I want to generate the rank order of that list:
rankorderofsomelist = [3, 2, 4, 1]
There are some complex solutions, but does anyone have any simple methods?
Since you've tagged this question scipy, you could use scipy.stats.rankdata:
>>> from scipy.stats import rankdata
>>> rankdata(somelist)
array([ 2., 3., 1., 4.])
>>> len(somelist) - rankdata(somelist)
array([ 2., 1., 3., 0.])
>>> len(somelist) - rankdata(somelist) + 1
array([ 3., 2., 4., 1.])
The real advantage is that you can specify how you want the corner cases to be treated:
>>> rankdata([0,1,1,2])
array([ 1. , 2.5, 2.5, 4. ])
>>> rankdata([0,1,1,2], method='min')
array([ 1, 2, 2, 4])
>>> rankdata([0,1,1,2], method='dense')
array([ 1, 2, 2, 3])
Simplest I can think of:
rankorder = sorted(range(len(thelist)), key=thelist.__getitem__)
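Note that for somelist above this produces [2, 0, 1, 3], which is the order of indices that sorts the list (smallest value first) rather than the rank of each element, and it is zero-based because Python indexing is always zero-based. If for some absolutely weird reason you need to add one to each index you can of course easily do so: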
rankorder_weird = [1+x for x in rankorder]
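To turn a sort order like this into the ranks asked for in the question, one option is to invert it: walk the indices from the largest value to the smallest and record the position at which each index appears. A sketch, with rank_descending as an illustrative name and ties broken by original position:

```python
def rank_descending(values):
    """Return 1-based ranks, where the largest value gets rank 1."""
    # Indices ordered from the largest value to the smallest.
    order = sorted(range(len(values)), key=values.__getitem__, reverse=True)
    ranks = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        ranks[index] = rank
    return ranks


somelist = [500, 600, 200, 1000]
ranks = rank_descending(somelist)
print(ranks)  # [3, 2, 4, 1]
```

This sorts once and then does a single pass, so it avoids calling index repeatedly on a sorted copy.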
Try this one-liner:
rankorderofsomelist = [sorted(somelist).index(x) for x in somelist]
Note that it'll behave as expected for a list with multiple entries of the same value (e.g. four instances of the same value, all of them the second-largest in the list, will all be ranked 2). Also note that Python's sorting is ascending (smallest to largest) and its indexing is zero-based, so you may have to apply a final pass over the list to increment the ranks, reverse them, etc.
You can include that pass in the one-liner. To yield your desired result, just use:
rankorderofsomelist = [len(somelist)-(sorted(somelist).index(x)) for x in somelist]
my stack is something like this
array([[[1, 2, 3],
[4, 5, 6],
[7, 8, 9]],
[[2, 2, 2],
[2, 2, 2],
[2, 2, 2]]])
I want this result:
array([[ 1.5, 2. , 2.5],
[ 3. , 3.5, 4. ],
[ 4.5, 5. , 5.5]])
I updated my question; I think it's clearer now.
Well, first, you don't have a stack of 2D arrays, you have three separate variables.
Fortunately, most functions in NumPy take an array_like argument. And the tuple (a, b, c) is "array-like" enough—it'll be converted into the 3D array that you should have had in the first place.
Anyway, the obvious function to take the mean is np.mean. As the docs say:
The average is taken over the flattened array by default, otherwise over the specified axis.
So just specify the axis you want—the newly-created axis 0.
np.mean((a,b,c), axis=0)
In your updated question, you now have a single 2x3x3 array, a, instead of three 2x2 arrays, a, b, and c, and you want the mean across the first axis (the one with dimension 2). This is the same thing, but slightly easier:
np.mean(a, axis=0)
Of course the mean of 4, 7, and 3 is 4.666666666666667, not 4. In your updated question, that seems to be what you want; in your original question… I'm not sure if you wanted to truncate or round, or if you wanted the median or something else rather than the mean, or anything else, but those are all easy (add dtype=int64 to the call, call .round() on the result, call median instead of mean, etc.).
>>> a = np.array([[1,2],[3,4]])
>>> b = np.array([[1,5],[6,7]])
>>> c = np.array([[1,8],[8,3]])
>>> np.mean((a,b,c), axis=0)
array([[ 1. , 5. ],
[ 5.66666667, 4.66666667]])
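For the 2x3x3 stack in the updated question, a minimal sketch of the same call might look like this. The variable names stack, mean_layer, median_layer and rounded are only illustrative:

```python
import numpy as np

stack = np.array([
    [[1, 2, 3], [4, 5, 6], [7, 8, 9]],
    [[2, 2, 2], [2, 2, 2], [2, 2, 2]],
])

mean_layer = np.mean(stack, axis=0)      # element-wise mean of the two layers
median_layer = np.median(stack, axis=0)  # equals the mean when there are only two layers
rounded = mean_layer.round()             # NumPy rounds halves to the nearest even value
print(mean_layer)
```

With axis=0 the result has the shape of a single layer, here 3x3.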
As per your output it seems you are looking for median rather than mean.
>>> np.median((a,b,c), axis=0)
array([[ 1., 5.],
[ 6., 4.]])
Let's say your numpy array is:
A = np.array([1,1,2,3,4])
You can simply do:
A + .1
to add a number to every element of that numpy array.
I am looking for a way to add a number to just the odd- or even-indexed elements (something like A[::2] + 1) while keeping the rest of the array intact.
Is it possible to add a number to all the odd or even indexed elements without any loops?
In [43]: A = np.array([1,1,2,3,4], dtype = 'float')
In [44]: A[::2] += 0.1
In [45]: A
Out[45]: array([ 1.1, 1. , 2.1, 3. , 4.1])
Note that this modifies A. If you wish to leave A unmodified, copy A first:
In [46]: A = np.array([1,1,2,3,4], dtype = 'float')
In [47]: B = A.copy()
In [48]: B[::2] += 0.1
In [49]: B
Out[49]: array([ 1.1, 1. , 2.1, 3. , 4.1])
In [50]: A
Out[50]: array([ 1., 1., 2., 3., 4.])
In addition to previous answers, to modify numbers with odd indices you should use A[1::2] instead of A[::2]
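Putting the two slices side by side, a short sketch (the names evens and odds are made up for illustration; copies are used so that A itself stays unchanged):

```python
import numpy as np

A = np.array([1, 1, 2, 3, 4], dtype=float)

evens = A.copy()
evens[::2] += 0.1   # indices 0, 2, 4

odds = A.copy()
odds[1::2] += 0.1   # indices 1, 3

print(evens)
print(odds)
```

The float dtype matters here: an integer array cannot hold the result of adding 0.1 in place.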
Something with list comprehension could work.
A = [1,1,2,3,4]
A = [A[i] + (0 if (i%2 == 0) else .1) for i in range(len(A))]
Just quick and dirty with a ternary (a conditional expression). These were added in Python 2.5, so it will not work in older versions.
Checked in Python 2.7.3 and Python 3.2.3, output is the same:
>>> A = [1,1,2,3,4]
>>> A
[1, 1, 2, 3, 4]
>>> A = [A[i] + (0 if (i%2 == 0) else .1) for i in range(len(A))]
>>> A
[1, 1.1, 2, 3.1, 4]
If the array didn't start with two 1s and you wanted to add to all the even numbers, you could use:
A[1::2] += 0.1
or
A[::-2][::-1] += 0.1
In the latter case, [::-1] is used to reverse the array back to normal order. Note that both forms need A to be a NumPy float array; they do not work on a plain Python list.
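If the goal is really to add to the even values wherever they sit, rather than to particular positions, a boolean mask is a different technique that may be simpler than slicing. A sketch with a made-up array; the name even_values is illustrative:

```python
import numpy as np

A = np.array([1, 2, 3, 4, 6, 7], dtype=float)

# True where the value itself is even, regardless of its index.
even_values = A % 2 == 0
A[even_values] += 0.1  # modifies A in place
print(A)
```

Unlike A[1::2], this does not depend on the even numbers falling on every second position.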