Rolling array around - python

Let's say I have
arr = np.arange(6)
arr
array([0, 1, 2, 3, 4, 5])
and I decide that I want to treat an array "like a circle": When I run out of material at the end, I want to start at index 0 again. That is, I want a convenient way of selecting x elements, starting at index i.
Now, if x == 6, I can simply do
i = 3
np.hstack((arr[i:], arr[:i]))
Out[9]: array([3, 4, 5, 0, 1, 2])
But is there a convenient way of doing this in general, even if x > 6, without having to manually break the array apart and think through the logic?
For example:
print(roll_array_arround(arr)[2:17])
should return.
array([2, 3, 4, 5, 0, 1, 2, 3, 4, 5, 0, 1, 2, 3, 4, 5, 0])

See mode='wrap' in ndarray.take:
https://docs.scipy.org/doc/numpy/reference/generated/numpy.take.html
Taking your hypothetical function:
print(roll_array_arround(arr)[2:17])
If the implication is that you want a true slice (a view) of the original array, that is not possible: a wrapped-around array cannot be expressed as a strided view of the original. Any function that maps an ndarray to an ndarray here will therefore necessarily copy your data.
That is, efficiency-wise, you shouldn't expect to find a solution that differs significantly in performance from the expression below.
print(arr.take(np.arange(2,17), mode='wrap'))
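As a minimal sketch of the take-based approach, the call can be wrapped in a small helper. The name take_wrapped and its start/count parameters are hypothetical, introduced here only for illustration; the underlying call is the same ndarray.take with mode='wrap'.

```python
import numpy as np

def take_wrapped(arr, start, count):
    """Return `count` elements of `arr` beginning at `start`, wrapping around."""
    # Hypothetical helper: indices past the end are wrapped by mode='wrap'.
    indices = np.arange(start, start + count)
    return arr.take(indices, mode='wrap')

arr = np.arange(6)
result = take_wrapped(arr, 2, 17)
print(result)
```

Like the expression above, this copies the selected data rather than returning a view.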

Modulus operation seems like the best fit here -
def rolling_array(n, x, i):
    # n is rolling period
    # x is length of array
    # i is starting number
    return np.mod(np.arange(i,i+x),n)
Sample runs -
In [61]: rolling_array(n=6, x=6, i=3)
Out[61]: array([3, 4, 5, 0, 1, 2])
In [62]: rolling_array(n=6, x=17, i=2)
Out[62]: array([2, 3, 4, 5, 0, 1, 2, 3, 4, 5, 0, 1, 2, 3, 4, 5, 0])
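Note that rolling_array returns indices, which coincide with the values here only because arr is np.arange(6). For an arbitrary array, one would presumably index the array with the result. A short sketch with made-up data (the data array is an assumption for illustration):

```python
import numpy as np

def rolling_array(n, x, i):
    # Wrapped indices: x positions starting at i, with period n.
    return np.mod(np.arange(i, i + x), n)

data = np.array([10, 20, 30, 40, 50, 60])  # hypothetical example values
idx = rolling_array(n=len(data), x=8, i=2)
wrapped = data[idx]                         # fancy indexing makes a copy
print(wrapped)
```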

A solution you can look into would probably be :
from itertools import cycle
list_to_rotate = np.array([1,2,3,4,5])
rotatable_list = cycle(list_to_rotate)
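The cycle object is an endless iterator, so it cannot be sliced directly. One possible way to pull a window out of it, sketched here as an assumption about how this answer might be completed, is itertools.islice:

```python
from itertools import cycle, islice

import numpy as np

arr = np.arange(6)
start, count = 2, 17

# islice skips `start` items of the endless cycle, then yields `count` items.
window = np.array(list(islice(cycle(arr), start, start + count)))
print(window)
```

This iterates in pure Python, so for large windows the NumPy-based approaches are likely to be faster.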

You need to roll your array.
>>> x = np.arange(10)
>>> np.roll(x, 2)
array([8, 9, 0, 1, 2, 3, 4, 5, 6, 7])
See numpy documentation for more details.
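On its own, np.roll only rotates the array and keeps its length, so it covers the x == 6 case but not x > 6. A rough sketch that combines it with np.tile to get longer windows follows; the helper name wrapped_window and the use of np.tile are assumptions, not part of the answer above.

```python
import numpy as np

def wrapped_window(arr, start, count):
    """Hypothetical helper: `count` elements from `start`, wrapping around."""
    n = len(arr)
    rolled = np.roll(arr, -start)   # negative shift puts arr[start] first
    reps = -(-count // n)           # ceiling division: enough repetitions
    return np.tile(rolled, reps)[:count]

arr = np.arange(6)
print(wrapped_window(arr, 3, 6))
print(wrapped_window(arr, 2, 17))
```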

Related

numpy, "cleaning" an index and value array

I have an index array, and an associated value array.
delta = np.array([0,3,4,1,1,4,4,5,7,10], dtype = int)
theta = np.random.normal(size = (12, 5))
I want to "clean" the index array such that indices that never occur are dropped, and higher indices are moved down to take their place. In this case, the result will be:
delta == np.array([0,2,3,1,1,3,3,4,5,6], dtype = int)
theta == theta[np.array([0,1,3,4,5,7,10, 2,6,8,9,11], dtype = int)]
and the associated entries are moved up in the theta array such that their indices match the new indices in the delta vector. How do I go about this?
Let's ask unique for all of the optional values.
In [799]: np.unique(delta,return_index=True,return_inverse=True, return_counts=True)
Out[799]:
(array([ 0, 1, 3, 4, 5, 7, 10]),
array([0, 3, 1, 2, 7, 8, 9]),
array([0, 2, 3, 1, 1, 3, 3, 4, 5, 6]),
array([1, 2, 1, 3, 1, 1, 1]))
Looks like the 'inverse' is what you want.
Review np.unique docs for more details.
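Putting this together for both arrays from the question, a sketch might look as follows. The reordering of theta (used rows first, unused rows afterwards) follows the expected result stated in the question; the use of np.setdiff1d and the variable names are assumptions.

```python
import numpy as np

delta = np.array([0, 3, 4, 1, 1, 4, 4, 5, 7, 10], dtype=int)
theta = np.random.normal(size=(12, 5))

# `used` holds the sorted indices that occur; `new_delta` renumbers them 0..k-1.
used, new_delta = np.unique(delta, return_inverse=True)

# Rows of theta that are never referenced go to the end, in sorted order.
unused = np.setdiff1d(np.arange(theta.shape[0]), used)
order = np.concatenate((used, unused))
new_theta = theta[order]

print(new_delta)
print(order)
```

With this ordering, new_theta[new_delta] selects the same rows as theta[delta] did.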

Numpy: vectorize matrix creation

If I want to create a matrix, I simply call
m = np.matrix([[x00, x01],
[x10, x11]])
, where x00, x01, x10 and x11 are numbers. However, I would like to vectorize this process. For example, if the x's are one-dimensional arrays with length l, then I would like m to become an array of matrices, or a lx2x2-dimensional array. Unfortunately,
zeros = np.zeros(10)
ones = np.ones(10)
m = np.matrix([[zeros, ones],
[zeros, ones]])
raises an error ("matrix must be 2-dimensional") and
m = np.array([[zeros, ones],
[zeros, ones]])
gives a 2x2xl-dimensional array instead. In order to solve this, I could call np.moveaxis(m, 2, 0), but I am looking for a direct solution that doesn't need to change the order of axes of a (potentially huge) array. This also only sets the axis order right if I'm passing one-dimensional arrays as values for my matrix, not if they're higher-dimensional.
Is there a general and efficient way of vectorizing the creation of matrices?
Let's try a 2d (4d after joining) case:
In [374]: ones = np.ones((3,4),int)
In [375]: arr = np.array([[ones*0, ones],[ones*2, ones*3]])
In [376]: arr
Out[376]:
array([[[[0, 0, 0, 0],
[0, 0, 0, 0],
[0, 0, 0, 0]],
[[1, 1, 1, 1],
[1, 1, 1, 1],
[1, 1, 1, 1]]],
[[[2, 2, 2, 2],
[2, 2, 2, 2],
[2, 2, 2, 2]],
[[3, 3, 3, 3],
[3, 3, 3, 3],
[3, 3, 3, 3]]]])
In [377]: arr.shape
Out[377]: (2, 2, 3, 4)
Notice that the original array elements are 'together'. arr has its own databuffer, with copies of the original arrays, but it was made with relatively efficient block copies.
We can easily transpose axes:
In [378]: arr.transpose(2,3,0,1)
Out[378]:
array([[[[0, 1],
[2, 3]],
[[0, 1],
[2, 3]],
...
[[0, 1],
[2, 3]]]])
Now it's 12 (2,2) arrays. It is a view, using arr's databuffer. It just has a different shape and strides. Doing this transpose is quite efficient, and isn't any slower when arr is very big. And a lot of math on the transposed array will be nearly as efficient as on the original arr (because of strided iteration). If there are differences in speed it will be because of caching at a deep level.
But some actions will require a copy. For example the transposed array can't be raveled without a copy. The original 0s,1s etc are no longer together.
In [379]: arr.transpose(2,3,0,1).ravel()
Out[379]:
array([0, 1, 2, 3, 0, 1, 2, 3, 0, 1, 2, 3, 0, 1, 2, 3, 0, 1, 2, 3, 0, 1,
2, 3, 0, 1, 2, 3, 0, 1, 2, 3, 0, 1, 2, 3, 0, 1, 2, 3, 0, 1, 2, 3,
0, 1, 2, 3])
I could construct the same 1d array with
In [380]: tarr = np.empty((3,4,2,2), int)
In [381]: tarr[...,0,0] = ones*0
In [382]: tarr[...,0,1] = ones*1
In [383]: tarr[...,1,0] = ones*2
In [384]: tarr[...,1,1] = ones*3
In [385]: tarr.ravel()
Out[385]:
array([0, 1, 2, 3, 0, 1, 2, 3, 0, 1, 2, 3, 0, 1, 2, 3, 0, 1, 2, 3, 0, 1,
2, 3, 0, 1, 2, 3, 0, 1, 2, 3, 0, 1, 2, 3, 0, 1, 2, 3, 0, 1, 2, 3,
0, 1, 2, 3])
This tarr is effectively what you are trying to produce 'directly'.
Another way to look at this construction, is to assign the values to the array's .flat with strides - insert 0s at every 4th slot, 1s at the adjacent ones, etc.:
In [386]: tarr.flat[0::4] = ones*0
In [387]: tarr.flat[1::4] = ones*1
In [388]: tarr.flat[2::4] = ones*2
In [389]: tarr.flat[3::4] = ones*3
Here's another 'direct' way - use np.stack (a version of concatenate) to create a (3,4,4) array, which can then be reshaped:
np.stack((ones*0,ones*1,ones*2,ones*3),2).reshape(3,4,2,2)
That stack is, in essence:
In [397]: ones1 = ones[...,None]
In [398]: np.concatenate((ones1*0, ones1*1, ones1*2, ones1*3),axis=2)
Notice that this target (3,4,2,2) could be reshaped to (12,4) (and vice versa) at no cost. So the original problem becomes: is it easier to construct a (4,12) and transpose, or construct the (12,4) first? It's really a 2d problem, not a (m+n)d one.
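The stack-then-reshape idea can be generalized to entries of any common shape. A minimal sketch, assuming a hypothetical helper named stack_matrices (the name and the use of np.broadcast_arrays to align the inputs are assumptions):

```python
import numpy as np

def stack_matrices(x00, x01, x10, x11):
    """Hypothetical helper: build an array of shape (..., 2, 2) from four entries."""
    # Broadcast so scalars and arrays can be mixed, then stack on a new last axis.
    parts = np.broadcast_arrays(x00, x01, x10, x11)
    stacked = np.stack(parts, axis=-1)               # shape (..., 4)
    return stacked.reshape(stacked.shape[:-1] + (2, 2))

ones = np.ones(10)
m = stack_matrices(ones * 0, ones, ones * 2, ones * 3)
print(m.shape)

ones2d = np.ones((3, 4), int)
m2 = stack_matrices(ones2d * 0, ones2d, ones2d * 2, ones2d * 3)
print(m2.shape)
```

The matrix axes end up last without any transpose, at the cost of one copy when stacking.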
np.matrix must be a 2D array. From numpy documentation of np.matrix
Returns a matrix from an array-like object, or from a string of data.
A matrix is a specialized 2-D array that retains its 2-D nature
through operations. It has certain special operators, such as *
(matrix multiplication) and ** (matrix power).
Note
It is no longer recommended to use this class, even for linear
algebra. Instead use regular arrays. The class may be removed in the
future.
Is there any reason you want np.matrix? Most numpy operations should be doable in the array object as the matrix class is quasi-deprecated.
From your example I tried using the transpose (.T) method:
zeros = np.zeros(10)
ones = np.ones(10)
twos = np.ones(10) * 2
threes = np.ones(10) * 3
m = np.array([[zeros, ones], [twos, threes]]).T
>> array([[0,2],[1,3]],...)
or
m = np.transpose(np.array([[zeros, ones], [twos, threes]]), (2,0,1))
>> array([[0,1],[2,3]],...)
This yields a (10, 2, 2) array

How to get N maximum values in a multi dimensional numpy array along a given axis(say 2)?

Since argmax only gives one maximum value, how can we find at least 2 or 3 elements instead of just one?
Currently my call has the form np.argmax(array, axis=2), which gives only one maximum, and I have to extract at least 2 or 3 from the array, which is N-dimensional.
I would try to use the function called argpartition(). To get the indices of the two largest elements, do:
import numpy as np
a = np.array([9, 4, 4, 3, 3, 9, 0, 4, 6, 0])
ind = np.argpartition(a, -2)[-2:]
ind
Out[13]: array([5, 0], dtype=int64)
a[ind]
Out[14]: array([9, 9])
Using numpy.argsort. Data from @CarlesSansFuentes.
import numpy as np
a = np.array([9, 4, 4, 3, 3, 9, 0, 4, 6, 0])
args = np.argsort(-a)[:2]
array([0, 5], dtype=int64)
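Both answers use a 1-D array, while the question asks about axis 2 of an N-dimensional array. A sketch of how the argpartition approach might extend to that case, assuming a NumPy version that provides np.take_along_axis; the random test data and variable names are made up for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
a = rng.integers(0, 100, size=(2, 3, 5))   # hypothetical 3-D input
k = 2

# Indices of the k largest values along axis 2 (in no particular order).
idx = np.argpartition(a, -k, axis=2)[..., -k:]
top = np.take_along_axis(a, idx, axis=2)

# Optionally sort those k values in descending order.
order = np.argsort(-top, axis=2)
idx_sorted = np.take_along_axis(idx, order, axis=2)
top_sorted = np.take_along_axis(top, order, axis=2)

print(top_sorted.shape)
```

argpartition avoids a full sort along the axis, which may matter when that axis is long and k is small.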

numpy append to an indexed array (after np.where)

I want to append values to a selection of an array without having to go through a for loop.
i.e. if I want to add 0 values to certain locations of an array:
a=np.array([[1,2,3,4,5],[1,2,3,4,5]])
condition=np.where(a>2)
a[condition]=np.append(a[condition],np.array([0]*len(condition[0])))
-> ValueError: shape mismatch: value array of shape (12,) could not be broadcast to indexing result of shape (6,)
Edit for clarification:
I need to add values (and a dimension if needed) to selected array locations. The loop looks like this:
for t in range(len(ind)):
    c = cols[t]
    r = rows[t]
    if data1[r, c] > 2:
        data2[r,c]=np.append(data2[r,c],t)
Is there any way to remove this loop (~100 000 iterations)? Thanks
Let's look at the pieces:
In [92]: a=np.array([[1,2,3,4,5],[1,2,3,4,5]])
...: condition=np.where(a>2)
...:
In [93]: a
Out[93]:
array([[1, 2, 3, 4, 5],
[1, 2, 3, 4, 5]])
In [94]: condition
Out[94]:
(array([0, 0, 0, 1, 1, 1], dtype=int32),
array([2, 3, 4, 2, 3, 4], dtype=int32))
In [95]: a[condition]
Out[95]: array([3, 4, 5, 3, 4, 5])
In [96]: np.append(a[condition],np.array([0]*len(condition[0])))
Out[96]: array([3, 4, 5, 3, 4, 5, 0, 0, 0, 0, 0, 0])
You are trying to put 12 values into 6 slots. No can do!
What are you expecting? I don't think I should even speculate. Go ahead show us the loop.

Fill zero values of 1d numpy array with last non-zero values

Let's say we have a 1d numpy array filled with some int values. And let's say that some of them are 0.
Is there any way, using numpy array's power, to fill all the 0 values with the last non-zero values found?
for example:
arr = np.array([1, 0, 0, 2, 0, 4, 6, 8, 0, 0, 0, 0, 2])
fill_zeros_with_last(arr)
print(arr)
[1 1 1 2 2 4 6 8 8 8 8 8 2]
A way to do it would be with this function:
def fill_zeros_with_last(arr):
    last_val = None # I don't really care about the initial value
    for i in range(arr.size):
        if arr[i]:
            last_val = arr[i]
        elif last_val is not None:
            arr[i] = last_val
However, this is using a raw python for loop instead of taking advantage of the numpy and scipy power.
If we knew that a reasonably small number of consecutive zeros are possible, we could use something based on numpy.roll. The problem is that the number of consecutive zeros is potentially large...
Any ideas? or should we go straight to Cython?
Disclaimer:
I believe I came across a question on Stack Overflow long ago asking this or something very similar, but I wasn't able to find it again. :-(
Maybe I missed the right search terms, sorry for the duplicate then. Maybe it was just my imagination...
Here's a solution using np.maximum.accumulate:
def fill_zeros_with_last(arr):
    prev = np.arange(len(arr))
    prev[arr == 0] = 0
    prev = np.maximum.accumulate(prev)
    return arr[prev]
We construct an array prev which has the same length as arr, and such that prev[i] is the index of the last non-zero entry at or before the i-th entry of arr (or 0 if there is none). For example, if:
>>> arr = np.array([1, 0, 0, 2, 0, 4, 6, 8, 0, 0, 0, 0, 2])
Then prev looks like:
array([ 0, 0, 0, 3, 3, 5, 6, 7, 7, 7, 7, 7, 12])
Then we just index into arr with prev and we obtain our result. A test:
>>> arr = np.array([1, 0, 0, 2, 0, 4, 6, 8, 0, 0, 0, 0, 2])
>>> fill_zeros_with_last(arr)
array([1, 1, 1, 2, 2, 4, 6, 8, 8, 8, 8, 8, 2])
Note: Be careful to understand what this does when the first entry of your array is zero:
>>> fill_zeros_with_last(np.array([0,0,1,0,0]))
array([0, 0, 1, 1, 1])
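As a sanity check, the vectorized version can be compared against the loop from the question on random data. This is only a sketch: the function names fill_loop and fill_accumulate are hypothetical, and the loop here works on a copy instead of modifying its argument in place.

```python
import numpy as np

def fill_loop(arr):
    # Loop version from the question, applied to a copy.
    out = arr.copy()
    last_val = None
    for i in range(out.size):
        if out[i]:
            last_val = out[i]
        elif last_val is not None:
            out[i] = last_val
    return out

def fill_accumulate(arr):
    # Vectorized version using np.maximum.accumulate.
    prev = np.arange(len(arr))
    prev[arr == 0] = 0
    prev = np.maximum.accumulate(prev)
    return arr[prev]

rng = np.random.default_rng(0)
data = rng.integers(0, 4, size=1000)   # made-up data with many zeros
same = np.array_equal(fill_loop(data), fill_accumulate(data))
print(same)
```

Both versions leave leading zeros untouched, so they agree on that edge case as well.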
Inspired by jme's answer here and by Bas Swinckels' (in the linked question) I came up with a different combination of numpy functions:
def fill_zeros_with_last(arr, initial=0):
    ind = np.nonzero(arr)[0]
    cnt = np.cumsum(np.array(arr, dtype=bool))
    return np.where(cnt, arr[ind[cnt-1]], initial)
I think it's succinct and also works, so I'm posting it here for the record. Still, jme's is also succinct and easy to follow and seems to be faster, so I'm accepting it :-)
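The initial parameter in this variant controls what leading zeros are replaced with. A short sketch of its effect (the value -1 is an arbitrary choice for illustration):

```python
import numpy as np

def fill_zeros_with_last(arr, initial=0):
    ind = np.nonzero(arr)[0]                    # positions of non-zero entries
    cnt = np.cumsum(np.array(arr, dtype=bool))  # non-zero entries seen so far
    # Where cnt == 0 there is no previous non-zero value, so use `initial`.
    return np.where(cnt, arr[ind[cnt - 1]], initial)

arr = np.array([0, 0, 1, 0, 0])
default_fill = fill_zeros_with_last(arr)
custom_fill = fill_zeros_with_last(arr, initial=-1)
print(default_fill)
print(custom_fill)
```

As written, an array with no non-zero entries at all would raise an IndexError, because ind is empty in that case.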
If the 0s only come in strings of 1, this use of nonzero might work:
In [266]: arr=np.array([1,0,2,3,0,4,0,5])
In [267]: I=np.nonzero(arr==0)[0]
In [268]: arr[I] = arr[I-1]
In [269]: arr
Out[269]: array([1, 1, 2, 3, 3, 4, 4, 5])
I can handle your arr by applying this repeatedly until I is empty.
In [286]: arr = np.array([1, 0, 0, 2, 0, 4, 6, 8, 0, 0, 0, 0, 2])
In [287]: while True:
   .....:     I=np.nonzero(arr==0)[0]
   .....:     if len(I)==0: break
   .....:     arr[I] = arr[I-1]
   .....:
In [288]: arr
Out[288]: array([1, 1, 1, 2, 2, 4, 6, 8, 8, 8, 8, 8, 2])
If the strings of 0s are long it might be better to look for those strings and handle them as a block. But if most strings are short, this repeated application may be the fastest route.
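One caveat with the repeated-application loop is that a zero at index 0 makes I-1 equal to -1, which wraps around to the last element, and an all-zero array would never terminate. A guarded variant might look like the sketch below; the function name fill_zeros_iteratively and the extra filtering steps are assumptions, not part of the answer above.

```python
import numpy as np

def fill_zeros_iteratively(arr):
    """Hypothetical guarded variant of the repeated-application loop."""
    out = arr.copy()
    while True:
        I = np.nonzero(out == 0)[0]
        I = I[I > 0]               # skip index 0, where I-1 would wrap around
        I = I[out[I - 1] != 0]     # keep zeros that have a non-zero left neighbour
        if len(I) == 0:
            break
        out[I] = out[I - 1]
    return out

arr = np.array([1, 0, 0, 2, 0, 4, 6, 8, 0, 0, 0, 0, 2])
print(fill_zeros_iteratively(arr))
print(fill_zeros_iteratively(np.array([0, 0, 1, 0, 0])))
```

Each pass fills one element per run of zeros, so the number of passes equals the length of the longest run.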
