How to re-index 3d numpy array - python

Let's say we have a 3D array like:
array = np.arange(8).reshape(2,2, 2)
new_array = np.zeros((2, 2, 2))
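and let's assume we have some new x, y, z index arrays for it (built here from np.arange(2) so that every index stays in bounds):
x_coord, y_coord, z_coord = np.meshgrid(np.arange(2), np.arange(2), np.arange(2))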
What is the fastest way to re-index our array?
A simple solution given here:
for i in range(2):
    for j in range(2):
        for k in range(2):
            new_x = x_coord[i,j,k]
            new_y = y_coord[i,j,k]
            new_z = z_coord[i,j,k]
            new_array[i,j,k] = array[new_x, new_y, new_z]
Is there a one-liner for this that I am not aware of?
EDIT
Yes, there is... very easy:
vol = np.arange(8).reshape(2,2, 2)
arr = np.arange(2)
x,y,z = np.meshgrid(arr, arr, arr)
print(vol)
print(vol[y, x, z]) ### ---> You have to swap the axes here tho. Does anyone know why?
[[[0 1]
  [2 3]]
 [[4 5]
  [6 7]]]
[[[0 1]
  [2 3]]
 [[4 5]
  [6 7]]]
Also, it is very slow. Any ideas how to improve the performance?
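On the axis swap asked about in the EDIT: np.meshgrid defaults to indexing="xy" (Cartesian convention), which exchanges the first two output axes relative to the inputs. Passing indexing="ij" keeps them in input order. The following is a minimal sketch of that behaviour using the same vol and arr as above; the names swapped, restored and identity are only illustrative.

```python
import numpy as np

vol = np.arange(8).reshape(2, 2, 2)
arr = np.arange(2)

# Default indexing="xy": x varies along axis 1 and y along axis 0.
x, y, z = np.meshgrid(arr, arr, arr)
swapped = vol[x, y, z]    # vol with axes 0 and 1 exchanged
restored = vol[y, x, z]   # swapping x and y undoes the exchange

# indexing="ij": output axes follow the input order, so no swap is needed.
xi, yi, zi = np.meshgrid(arr, arr, arr, indexing="ij")
identity = vol[xi, yi, zi]
```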

Setup:
In [54]: arr = np.arange(9).reshape(3,3)
In [55]: x = np.random.randint(0,3,(3,3))
In [56]: y = np.random.randint(0,3,(3,3))
In [57]: x
Out[57]:
array([[2, 0, 1],
[0, 2, 1],
[0, 0, 1]])
In [58]: y
Out[58]:
array([[0, 0, 0],
[0, 1, 1],
[0, 1, 0]])
The simplest application of these indexing arrays:
In [59]: arr[x,y]
Out[59]:
array([[6, 0, 3],
[0, 7, 4],
[0, 1, 3]])
The iterative equivalent:
In [60]: out = np.empty_like(arr)
In [61]: for i in range(3):
    ...:     for j in range(3):
    ...:         out[i,j] = arr[x[i,j], y[i,j]]
    ...:
In [62]: out
Out[62]:
array([[6, 0, 3],
[0, 7, 4],
[0, 1, 3]])
Your code isn't the same, because it is modifying the source array as it iterates:
In [63]: arr1 = arr.copy()
In [64]: for i in range(3):
    ...:     for j in range(3):
    ...:         arr1[i,j] = arr1[x[i,j], y[i,j]]
    ...:
In [65]: arr1
Out[65]:
array([[6, 6, 3],
[6, 7, 7],
[6, 6, 6]])
There isn't a simple vectorized equivalent of that in-place version.
You can index with arr[x_coord,y_coord,z_coord] as long as the indexing arrays broadcast together. When they all have the same shape, that is trivial.
In [68]: x1 = np.random.randint(0,3,(2,4))
In [69]: x1
Out[69]:
array([[2, 0, 2, 0],
[0, 0, 0, 2]])
In [70]: arr[x1,x1]
Out[70]:
array([[8, 0, 8, 0],
[0, 0, 0, 8]])
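Carried over to the 3D case from the question, the same idea can be sketched as follows. The seeded generator, the 3x3x3 size and the names vol, new_vol and expected are assumptions made for illustration; the loop is there only to check the one-line indexing against the iterative version.

```python
import numpy as np

rng = np.random.default_rng(0)  # seeded only to make the sketch repeatable
vol = np.arange(27).reshape(3, 3, 3)

# One random index array per axis, each with the shape of the desired output.
x_coord = rng.integers(0, 3, size=vol.shape)
y_coord = rng.integers(0, 3, size=vol.shape)
z_coord = rng.integers(0, 3, size=vol.shape)

# Vectorized re-indexing: a single advanced-indexing expression.
new_vol = vol[x_coord, y_coord, z_coord]

# Iterative equivalent that writes into a separate output array.
expected = np.empty_like(vol)
for i, j, k in np.ndindex(vol.shape):
    expected[i, j, k] = vol[x_coord[i, j, k], y_coord[i, j, k], z_coord[i, j, k]]
```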
A simpler way of picking random values from an array is to create random row and column selectors, and use ix_ to create arrays that broadcast together:
In [71]: x1 = np.random.randint(0,3,(3))
In [72]: y1 = np.random.randint(0,3,(3))
In [75]: np.ix_(x1,y1)
Out[75]:
(array([[2],
[1],
[1]]), array([[2, 2, 1]]))
In [76]: arr[np.ix_(x1,y1)]
Out[76]:
array([[8, 8, 7],
[5, 5, 4],
[5, 5, 4]])
Almost sounds like you just want to shuffle the values of the array, like:
In [95]: arr
Out[95]:
array([[0, 1, 2],
[3, 4, 5],
[6, 7, 8]])
In [96]: np.random.shuffle(arr.ravel())
In [97]: arr
Out[97]:
array([[0, 1, 2],
[7, 4, 3],
[6, 5, 8]])

Related

numpy filling the diagonal 0 for 3D array

Suppose I have a 3D array. How can I set the diagonal of the first two dimensions to zero? For example:
a = np.random.rand(2,2,3)
for i in range(3):
    np.fill_diagonal(a[:,:,i], 0)
Is there a way to replace the for loop?
One solution is the following:
a = np.random.rand(2,2,3)
np.einsum('iij->ij',a)[...] = 0
The np.diag function returns a 2D diagonal matrix.
a[:,:,0] = np.diag((1,1))
In [6]: a = np.random.randint(1,10,(2,2,3))
   ...: for i in range(3):
   ...:     np.fill_diagonal(a[:,:,i], 0)
In [7]: a
Out[7]:
array([[[0, 0, 0],
        [7, 4, 4]],
       [[8, 2, 7],
        [0, 0, 0]]])
Indexing a diagonal is easy - just use the same index array for both dimensions. Thus the 0s we just set are:
In [8]: idx=np.arange(2)
In [9]: a[idx,idx,:]
Out[9]:
array([[0, 0, 0],
[0, 0, 0]])
and used to set a value:
In [10]: a[idx,idx,:] = 10
In [11]: a
Out[11]:
array([[[10, 10, 10],
[ 7, 4, 4]],
[[ 8, 2, 7],
[10, 10, 10]]])
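As a quick check that the approaches in this thread agree, here is a small sketch that applies the loop, the index-array assignment and the einsum view to copies of the same array. The seeded generator and the names by_loop, by_index and by_einsum are assumptions for illustration; writing through the einsum result relies on it being a writable view, which holds in reasonably recent NumPy versions.

```python
import numpy as np

rng = np.random.default_rng(0)
a = rng.random((2, 2, 3))

# 1) Loop over the last axis, as in the question.
by_loop = a.copy()
for i in range(by_loop.shape[2]):
    np.fill_diagonal(by_loop[:, :, i], 0)

# 2) Same index array for the first two axes selects the diagonal.
by_index = a.copy()
idx = np.arange(by_index.shape[0])
by_index[idx, idx, :] = 0

# 3) einsum returns a view of the diagonal; assigning into it writes through.
by_einsum = a.copy()
np.einsum('iij->ij', by_einsum)[...] = 0
```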

How to get the blocks back from a scipy sparse block matrix?

After some vectorized calculations, I get a sparse block matrix with all my results stacked in blocks of same size.
>>> A = [[1, 1],
... [1, 1]]
>>> B = [[2, 2],
... [2, 2]]
>>> C = [[3, 3],
... [3, 3]]
>>> results = scipy.sparse.block_diag([A, B, C])
>>> print(results.toarray())
[[1 1 0 0 0 0]
[1 1 0 0 0 0]
[0 0 2 2 0 0]
[0 0 2 2 0 0]
[0 0 0 0 3 3]
[0 0 0 0 3 3]]
How can I get back these arrays A, B, C in an efficient way, if necessary by providing their shape (2,2)?
In [177]: A = [[1, 1], [1, 1]]
     ...: B = [[2, 2], [2, 2]]
     ...: C = [[3, 3], [3, 3]]
     ...: results = sparse.block_diag([A, B, C])
     ...:
In [178]: results
Out[178]:
<6x6 sparse matrix of type '<class 'numpy.int64'>'
with 12 stored elements in COOrdinate format>
block_diag does not preserve the inputs; rather, it creates a coo format matrix that represents the whole matrix, not the pieces.
In [194]: results.data
Out[194]: array([1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3], dtype=int64)
In [195]: results.row
Out[195]: array([0, 0, 1, 1, 2, 2, 3, 3, 4, 4, 5, 5], dtype=int32)
In [196]: results.col
Out[196]: array([0, 1, 0, 1, 2, 3, 2, 3, 4, 5, 4, 5], dtype=int32)
In [179]: results.A
Out[179]:
array([[1, 1, 0, 0, 0, 0],
[1, 1, 0, 0, 0, 0],
[0, 0, 2, 2, 0, 0],
[0, 0, 2, 2, 0, 0],
[0, 0, 0, 0, 3, 3],
[0, 0, 0, 0, 3, 3]], dtype=int64)
block_diag passes the arrays to sparse.bmat. That in turn makes a coo matrix from each, and then merges the coo attributes into 3 arrays, which are inputs to the global sparse matrix.
There is another sparse format, bsr, that may preserve the blocks (until conversion to csr for calculation), but I'll have to experiment to see whether that's the case.
Let's make a bsr from that results coo:
In [186]: bresults = sparse.bsr_matrix(results)
In [187]: bresults
Out[187]:
<6x6 sparse matrix of type '<class 'numpy.int64'>'
with 12 stored elements (blocksize = 2x2) in Block Sparse Row format>
In [188]: bresults.blocksize
Out[188]: (2, 2)
In [189]: bresults.data
Out[189]:
array([[[1, 1],
[1, 1]],
[[2, 2],
[2, 2]],
[[3, 3],
[3, 3]]], dtype=int64)
So it deduces that there are blocks, just as you desired.
In [191]: bresults.indices
Out[191]: array([0, 1, 2], dtype=int32)
In [192]: bresults.indptr
Out[192]: array([0, 1, 2, 3], dtype=int32)
So it's a csr like storage, but with the data arranged in blocks.
It may be possible to construct this from your A,B,C without the block_diag intermediary, but I'd have to look at the docs more.
That's a funny little problem.
I don't think there is a function that solves this in one line, but there's a way to do it programmatically.
Check what res.data prints out; I use it here.
This works only when all blocks have the same shape.
from scipy.sparse import block_diag
a = [[1, 2, 4],
[3, 4, 4]]
b = [[2, 2, 1],
[2, 2, 1]]
c = [[3, 3, 6],
[3, 3, 6]]
res = block_diag((a, b, c))
def goBack(res, shape):
    s = shape[0]*shape[1]  # number of elements per block
    num = len(res.data) // s  # number of blocks
    for i in range(num):
        mat = res.data[i*s:(i+1)*s].reshape(shape)
        print(mat)
goBack(res, [2,3])
Output:
[[1 2 4]
[3 4 4]]
[[2 2 1]
[2 2 1]]
[[3 3 6]
[3 3 6]]
Edit:
Okay, this does not work when any element of the provided matrices is zero, because zeros are not stored in res.data.
Also, forget it, the link provided by cleb should help you.
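One way around the zero-element problem is to slice the blocks out by position instead of reading res.data. The sketch below is only an illustration under stated assumptions: the helper name diagonal_blocks is hypothetical, all blocks are assumed to share one shape, and the matrix is converted to CSR first because that format supports slicing.

```python
import numpy as np
from scipy import sparse

def diagonal_blocks(mat, block_shape):
    """Return the diagonal blocks of a block-diagonal sparse matrix as dense arrays."""
    rows, cols = block_shape
    csr = sparse.csr_matrix(mat)       # CSR supports row/column slicing
    n_blocks = csr.shape[0] // rows    # assumes equally sized blocks
    return [csr[i*rows:(i+1)*rows, i*cols:(i+1)*cols].toarray()
            for i in range(n_blocks)]

# Blocks that contain zeros, including one all-zero row.
a = [[1, 0, 4], [3, 4, 0]]
b = [[2, 2, 1], [0, 0, 0]]
c = [[3, 3, 6], [3, 3, 6]]
res = sparse.block_diag((a, b, c))

blocks = diagonal_blocks(res, (2, 3))
```

Only one small block is densified at a time, so the full matrix never has to be converted with toarray().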

Converting list to array and reshaping?

Attempting to make an array of 2d points.
import random
import numpy as np
world_x=500
world_y=500
num_points = 7
points_list = []
def fill_points():
    for i in range(num_points):
        points_list.append(random.randrange(-1*world_x,world_x+1))
        points_list.append(random.randrange(-1*world_y,world_y+1))
    points_array = np.array(points_list)
    points_array.reshape((num_points,2))
    print(points_array)
    print (points_array[0,])
    print (points_array[2,])
    print (points_array[4,])
fill_points()
Returns
[ -70 -491 -326 -35 -408 407 94 -330 -493 499 -61 -12 62 -357]
-70
-326
-408
I was expecting [-70,-491], [-408,407], and [-493,499]. I've also tried to do this by setting shape instead of calling reshape, and got similar results. Am I converting the list incorrectly, or using reshape incorrectly?
The .reshape method returns a new array object, that is, it doesn't work in-place. You can either reassign the results of reshape back to the same variable, or modify the .shape attribute directly, which does work in-place:
In [1]: import numpy as np
In [2]: arr = np.arange(10)
In [3]: arr
Out[3]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
In [4]: arr.reshape(2, 5)
Out[4]:
array([[0, 1, 2, 3, 4],
[5, 6, 7, 8, 9]])
In [5]: arr
Out[5]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
On the other hand:
In [6]: arr.shape = 2, 5
In [7]: arr
Out[7]:
array([[0, 1, 2, 3, 4],
[5, 6, 7, 8, 9]])
Or use the .resize method for in-place modifications:
In [8]: arr = np.arange(4)
In [9]: arr
Out[9]: array([0, 1, 2, 3])
In [10]: arr.resize(2, 2)
In [11]: arr
Out[11]:
array([[0, 1],
[2, 3]])
Note: the different array objects can share the same underlying buffer, so be aware that this happens:
In [12]: arr = np.arange(10)
In [13]: arr
Out[13]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
In [14]: arr2 = arr.reshape(2, 5)
In [15]: arr
Out[15]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
In [16]: arr2
Out[16]:
array([[0, 1, 2, 3, 4],
[5, 6, 7, 8, 9]])
In [17]: arr[0] = 99
In [18]: arr
Out[18]: array([99, 1, 2, 3, 4, 5, 6, 7, 8, 9])
In [19]: arr2
Out[19]:
array([[99, 1, 2, 3, 4],
[ 5, 6, 7, 8, 9]])
So, this makes the re-assigning approach relatively cheap:
In [20]: arr = arr.reshape(2, 5)
In [21]: arr
Out[21]:
array([[99, 1, 2, 3, 4],
[ 5, 6, 7, 8, 9]])
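Reshape is only cheap when NumPy can describe the new shape as a view of the existing buffer; otherwise it silently returns a copy. A minimal sketch of both cases, using np.shares_memory to tell them apart (the names view, transposed and flat are illustrative):

```python
import numpy as np

arr = np.arange(10)

# Contiguous array: reshape returns a view onto the same buffer.
view = arr.reshape(2, 5)
is_view = np.shares_memory(arr, view)

# Transposed array: flattening it in C order cannot be expressed as a view,
# so reshape has to copy the data.
transposed = view.T
flat = transposed.reshape(-1)
is_copy = not np.shares_memory(transposed, flat)
```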
Note, I tend to avoid .resize, because you can accidentally do:
In [33]: arr = np.arange(4)
In [34]: arr.resize(4,4)
In [35]: arr
Out[35]:
array([[0, 1, 2, 3],
[0, 0, 0, 0],
[0, 0, 0, 0],
[0, 0, 0, 0]])
But it will at least warn you... if other arrays are referenced:
In [36]: arr = np.arange(4)
In [37]: arr2 = arr.reshape(2,2)
In [38]: arr
Out[38]: array([0, 1, 2, 3])
In [39]: arr2
Out[39]:
array([[0, 1],
[2, 3]])
In [40]: arr.resize(4,4)
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-40-c4464d98ed0e> in <module>()
----> 1 arr.resize(4,4)
ValueError: cannot resize an array that references or is referenced
by another array in this way. Use the resize function
However, you can override that behavior at your own peril:
In [41]: arr.resize(4,4, refcheck=False)
In [42]: arr
Out[42]:
array([[0, 1, 2, 3],
[0, 0, 0, 0],
[0, 0, 0, 0],
[0, 0, 0, 0]])
In [43]: arr2
Out[43]:
array([[4611686018427387904, 4611686018427387904],
[ 6, 0]])
It looks like you want to keep the x and y coordinates together. Try appending each pair as a list, e.g.:
import random
import numpy as np
world_x=500
world_y=500
num_points = 7
points_list = []
def fill_points():
    for i in range(num_points):
        points_list.append([random.randrange(-1*world_x,world_x+1),
                            random.randrange(-1*world_y,world_y+1)])
    points_array = np.array(points_list)
    print(points_array)
    print (points_array[0,0])
    print (points_array[2,0])
    print (points_array[4,0])
fill_points()
Outputs:
[[ 354 -147]
[ 193 288]
[ 157 -319]
[ 133 426]
[-109 -54]
[ -61 224]
[-251 -411]]
354
157
-109
Or, if you want to use reshape, remember that reshape returns a new array; it doesn't change the array you call it on:
def fill_points():
    for i in range(num_points):
        points_list.append(random.randrange(-1*world_x,world_x+1))
        points_list.append(random.randrange(-1*world_y,world_y+1))
    points_array = np.array(points_list).reshape((num_points,2))
    print(points_array)
    print (points_array[0,0])
    print (points_array[2,0])
    print (points_array[4,0])
fill_points()
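If the intermediate Python list is not needed, the points can also be drawn directly as a (num_points, 2) array. This is a sketch under the assumption that NumPy's Generator API (np.random.default_rng) is available; it swaps the random module for NumPy's generator, so the values differ from the runs above, but the bounds match randrange (upper bound excluded, hence the + 1).

```python
import numpy as np

world_x = 500
world_y = 500
num_points = 7

rng = np.random.default_rng()

# One draw per coordinate axis; the high end is exclusive, like random.randrange.
xs = rng.integers(-world_x, world_x + 1, size=num_points)
ys = rng.integers(-world_y, world_y + 1, size=num_points)

# Stack as columns so that each row is one (x, y) point.
points_array = np.column_stack((xs, ys))
```

Here points_array[2] is the third point as an [x, y] pair, which is the access pattern the question was after.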

Numpy switch numbering from columns to rows

I need to change the numbering scheme of a matrix. Say,
import numpy as np
a = np.arange(6).reshape(3,2)
array([[0, 1],
[2, 3],
[4, 5]])
And I want to switch it to
b = np.array([[0,3],[1,4],[2,5]])
array([[0, 3],
[1, 4],
[2, 5]])
So basically I want to number the matrix down the rows first. I am sure there is a nice way to do this in numpy.
>>> import numpy as np
>>> np.arange(6).reshape(3,2, order = 'F')
array([[0, 3],
       [1, 4],
       [2, 5]])
From the doc:
http://docs.scipy.org/doc/numpy/reference/generated/numpy.reshape.html
Use order='F' to specify the Fortran traditional representation and order='C' (the default) to use the traditional C representation.
To create a view of the same data with the new shape, you can use a.T.reshape(3, 2, order='F'):
In [35]: a = np.arange(6).reshape(3,2)
In [36]: a
Out[36]:
array([[0, 1],
[2, 3],
[4, 5]])
In [37]: b = a.T.reshape(3, 2, order='F')
In [38]: b
Out[38]:
array([[0, 3],
[1, 4],
[2, 5]])
Verify a and b are views of the same data, by changing a and checking b:
In [39]: a[1, 0] = 99
In [40]: a
Out[40]:
array([[ 0, 1],
[99, 3],
[ 4, 5]])
In [41]: b
Out[41]:
array([[ 0, 3],
[ 1, 4],
[99, 5]])
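The same renumbering can be applied to an existing array whose values are not simply 0..n-1. A small sketch, assuming a is C-contiguous (the values 10..15 are made up for illustration): flatten in row-major order, then refill the same shape in column-major order.

```python
import numpy as np

a = np.array([[10, 11],
              [12, 13],
              [14, 15]])

# Read the values row by row, then lay them out column by column.
b = a.ravel().reshape(a.shape, order='F')

# The transpose-based spelling from the answer above gives the same result.
b_via_transpose = a.T.reshape(a.shape, order='F')
```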

Inserting rows and columns into a numpy array

I would like to insert multiple rows and columns into a NumPy array.
If I have a square array of length n_a, e.g.: n_a = 3
a = np.array([[1, 2, 3],
[4, 5, 6],
[7, 8, 9]])
and I would like to get a new array with size n_b, which contains array a and zeros (or any other 1D array of length n_b) on certain rows and columns with indices, e.g.
index = [1, 3]
so n_b = n_a + len(index). Then the new array is:
b = np.array([[1, 0, 2, 0, 3],
[0, 0, 0, 0, 0],
[4, 0, 5, 0, 6],
[0, 0, 0, 0, 0],
[7, 0, 8, 0, 9]])
My question is how to do this efficiently, assuming that for bigger arrays n_a is much larger than len(index).
EDIT
The results for:
import numpy as np
import random
n_a = 5000
n_index = 100
a=np.random.rand(n_a, n_a)
index = random.sample(range(n_a), n_index)
Warren Weckesser's solution: 0.208 s
wim's solution: 0.980 s
Ashwini Chaudhary's solution: 0.955 s
Thank you to all!
Here's one way to do it. It has some overlap with @wim's answer, but it uses index broadcasting to copy a into b with a single assignment.
import numpy as np
a = np.array([[1, 2, 3],
[4, 5, 6],
[7, 8, 9]])
index = [1, 3]
n_b = a.shape[0] + len(index)
not_index = np.array([k for k in range(n_b) if k not in index])
b = np.zeros((n_b, n_b), dtype=a.dtype)
b[not_index.reshape(-1,1), not_index] = a
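A variation on the same idea, sketched with two substitutions that are not part of the answer above: np.setdiff1d builds the list of kept positions without a Python-level membership test, and np.ix_ builds the broadcastable index pair in place of the explicit reshape(-1,1). The name keep is illustrative.

```python
import numpy as np

a = np.array([[1, 2, 3],
              [4, 5, 6],
              [7, 8, 9]])
index = [1, 3]
n_b = a.shape[0] + len(index)

# Positions in b that receive values from a (everything not listed in index).
keep = np.setdiff1d(np.arange(n_b), index)

b = np.zeros((n_b, n_b), dtype=a.dtype)
b[np.ix_(keep, keep)] = a   # open mesh: rows in keep x columns in keep
```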
You can do this by applying two numpy.insert calls on a:
>>> a = np.array([[1, 2, 3],
[4, 5, 6],
[7, 8, 9]])
>>> indices = np.array([1, 3])
>>> i = indices - np.arange(len(indices))
>>> np.insert(np.insert(a, i, 0, axis=1), i, 0, axis=0)
array([[1, 0, 2, 0, 3],
[0, 0, 0, 0, 0],
[4, 0, 5, 0, 6],
[0, 0, 0, 0, 0],
[7, 0, 8, 0, 9]])
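The subtraction of np.arange(len(indices)) is needed because np.insert interprets its indices as positions in the original array, whereas the question gives positions in the result. A one-dimensional sketch of that shift (the names row, shifted and result are illustrative):

```python
import numpy as np

row = np.array([1, 2, 3])
indices = np.array([1, 3])   # where the zeros should sit in the result

# Each earlier insertion pushes later targets one slot to the right,
# so the k-th index is moved back by k to refer to the original array.
shifted = indices - np.arange(len(indices))

result = np.insert(row, shifted, 0)
```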
Since fancy indexing returns a copy instead of a view, I can only think of how to do it in a two-step process. Maybe a NumPy wizard knows a better way...
Here you go:
import numpy as np
a = np.array([[1, 2, 3],
[4, 5, 6],
[7, 8, 9]])
index = [1, 3]
n = a.shape[0]
N = n + len(index)
non_index = [x for x in range(N) if x not in index]
b = np.zeros((N,n), a.dtype)
b[non_index] = a
a = np.zeros((N,N), a.dtype)
a[:, non_index] = b
Why can't you just slice? This needs no loops, though it only covers the case in the example, where the inserted zero rows and columns alternate with the original ones.
xlen = a.shape[1]
ylen = a.shape[0]
b = np.zeros((ylen * 2 - ylen % 2, xlen * 2 - xlen % 2)) # accommodates both odd and even shapes
b[0::2,0::2] = a
