I am new to numerical computation with numpy, and I am having a hard time understanding arrays with more than two dimensions. Is there any way to interpret a multidimensional array?
For example:
>>> import numpy as np
>>> arr1 = np.arange(24).reshape(2,3,2,2)
>>> arr1
array([[[[ 0, 1],
[ 2, 3]],
[[ 4, 5],
[ 6, 7]],
[[ 8, 9],
[10, 11]]],
[[[12, 13],
[14, 15]],
[[16, 17],
[18, 19]],
[[20, 21],
[22, 23]]]])
Any explanation or reference to help build intuition?
Edited:
I wanted to know how to relate the output of .shape to the printed array, i.e. in the above example with shape (2,3,2,2), what does the rightmost 2 refer to, or the 3, or the other 2s? How does numpy handle this?
This isn't a direct answer, but when I started working with multidimensional arrays, my biggest difficulty was visualizing what the long streaming list of numbers and brackets was all about. I had a picture in my mind of what a 3D or 4D array looked like, but the printed representations didn't match what I 'pictured'. To help me view the data structures, I wrote a couple of functions to rearrange the structure into a form I could understand.
My question to you, then, is: do any of the presentations below help you understand or visualize the structure any better? I can provide support code in an edit if any of these are useful.
arr1 = np.arange(24).reshape(2,3,2,2)
Sample array...
-shape (2, 3, 2, 2), ndim 4
-------------------------
-(0, + (3, 2, 2)
. 0 1 4 5 8 9
. 2 3 6 7 10 11
-------------------------
-(1, + (3, 2, 2)
. 12 13 16 17 20 21
. 14 15 18 19 22 23
Or presentation option 2
Alternate format
Main array...
shape: (2, 3, 2, 2)
[0,...] (3, 2, 2)
.[[[ 0 1]
. [ 2 3]]
. [[ 4 5]
. [ 6 7]]
. [[ 8 9]
. [10 11]]]
[1,...] (3, 2, 2)
.[[[12 13]
. [14 15]]
. [[16 17]
. [18 19]]
. [[20 21]
. [22 23]]]
Sorry not to be able to answer your question directly, but often the 'direct' answer isn't what is really needed to get to the root problem.
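To tie the shape tuple back to the printed brackets, here is a small sketch (not part of the original answer; it only assumes the arr1 from the question). It shows that each index you supply strips the leftmost entry of the shape, and that, for the default C order, the four indices map to a position in the flat 0..23 sequence:

```python
import numpy as np

arr1 = np.arange(24).reshape(2, 3, 2, 2)

# Each index consumes the leftmost remaining axis of the shape.
print(arr1[0].shape)        # (3, 2, 2): one of the 2 outermost blocks
print(arr1[0, 1].shape)     # (2, 2):    one of the 3 matrices in that block
print(arr1[0, 1, 0].shape)  # (2,):      one row of that matrix

# In C order the rightmost index moves fastest, so element [i, j, k, l]
# sits at flat position i*(3*2*2) + j*(2*2) + k*2 + l.
i, j, k, l = 1, 2, 0, 1
flat_position = i * 12 + j * 4 + k * 2 + l
print(arr1[i, j, k, l], flat_position)  # 21 21
```

So the rightmost 2 in (2, 3, 2, 2) is the length of the innermost bracket pair, and the leftmost 2 counts the outermost blocks.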
I have code in Matlab that I need to translate to Python. Shapes and indexes are really important here since the code works with tensors. I'm a little confused, because it seemed that it would be enough to use order='F' in Python's reshape(), but when I work with 3D data I noticed that it does not work. For example, if A is an array from 1 to 27 in Python
array([[[ 1, 2, 3],
[ 4, 5, 6],
[ 7, 8, 9]],
[[10, 11, 12],
[13, 14, 15],
[16, 17, 18]],
[[19, 20, 21],
[22, 23, 24],
[25, 26, 27]]])
If I perform A.reshape(3, 9, order='F'), I get
[[ 1 4 7 2 5 8 3 6 9]
[10 13 16 11 14 17 12 15 18]
[19 22 25 20 23 26 21 24 27]]
In Matlab, for A = 1:27 reshaped to [3, 3, 3] and then to [3, 9], I get a different array:
1 4 7 10 13 16 19 22 25
2 5 8 11 14 17 20 23 26
3 6 9 12 15 18 21 24 27
As a result, SVD in Matlab and in Python gives different results. Is there a way to fix this?
More generally, what is the correct way to carry multidimensional array operations from Matlab over to Python? For instance, should I get the same SVD for arange(1, 13).reshape(3, 4) in Python and for 1:12 -> reshape(_, [3, 4]) in Matlab? Can I swap axes somehow in Python to get the same results as in Matlab, or change the order of the axes in reshape(x1, x2, x3,...)?
I was having the same issues until I found this Wikipedia article: row- and column-major order
NumPy (like C) stores arrays in row-major order by default. As you can see in your first example, the elements first increase along the columns:
array([[[ 1, 2, 3],
- - - -> increasing
Then along the rows:
array([[[ 1, 2, 3],
[ 4, <--- new element
When all columns and rows are full, it moves to the next page.
array([[[ 1, 2, 3],
[ 4, 5, 6],
[ 7, 8, 9]],
[[10, <-- new element in next page
Matlab (like Fortran) increases first along the rows, then along the columns, and so on.
For N-dimensional arrays it looks like:
Python (row major -> last dimension is contiguous): [dim1,dim2,...,dimN]
Matlab (column major -> first dimension is contiguous): the same tensor in memory would look the other way around .. [dimN,...,dim2,dim1]
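The two traversal orders can be inspected from NumPy itself. The following is a minimal sketch (not from the original answer) that flattens the same 3x3x3 array in both orders; the variable names are only illustrative:

```python
import numpy as np

A = np.arange(1, 28).reshape(3, 3, 3)

# C order (NumPy default): the last index varies fastest.
c_order = A.ravel(order='C')[:5].tolist()
# Fortran order (Matlab's convention): the first index varies fastest.
f_order = A.ravel(order='F')[:5].tolist()

print(c_order)  # [1, 2, 3, 4, 5]
print(f_order)  # [1, 10, 19, 4, 13]
```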
If you want to export n-dim. arrays from python to matlab, the easiest way is to permute the dimensions first:
(in python)
import numpy as np
import scipy.io as sio
A=np.reshape(range(1,28),[3,3,3])
sio.savemat('A',{'A':A})
(in matlab)
load('A.mat')
A=permute(A,[3 2 1]);%dimensions in reverse ordering
reshape(A,9,3)' %gives the same result as A.reshape([3,9]) in python
Note that the (9,3) and the (3,9) are intentionally in reverse order.
In Matlab
A = 1:27;
A = reshape(A,3,3,3);
B = reshape(A,9,3)'
B =
1 2 3 4 5 6 7 8 9
10 11 12 13 14 15 16 17 18
19 20 21 22 23 24 25 26 27
size(B)
ans =
3 9
In Python
A = np.array(range(1,28))
A = A.reshape(3,3,3)
B = A.reshape(3,9)
B
array([[ 1, 2, 3, 4, 5, 6, 7, 8, 9],
[10, 11, 12, 13, 14, 15, 16, 17, 18],
[19, 20, 21, 22, 23, 24, 25, 26, 27]])
np.shape(B)
(3, 9)
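If the goal is instead to reproduce in Python exactly what Matlab computes in the question (1:27 reshaped to [3, 3, 3] and then to [3, 9]), one option is to use order='F' for both reshapes, so the array is also filled in column-major order. This is a sketch under that reading of the question; A_f and B_f are illustrative names:

```python
import numpy as np

# Mimic Matlab: A = reshape(1:27, [3 3 3]); B = reshape(A, [3 9]);
A_f = np.arange(1, 28).reshape((3, 3, 3), order='F')
B_f = A_f.reshape((3, 9), order='F')

print(B_f[0].tolist())  # [1, 4, 7, 10, 13, 16, 19, 22, 25]
print(B_f[1].tolist())  # [2, 5, 8, 11, 14, 17, 20, 23, 26]
print(B_f[2].tolist())  # [3, 6, 9, 12, 15, 18, 21, 24, 27]
```

The mismatch in the question comes from building A in C order and only then reshaping with order='F'; the order used when filling the array has to match as well.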
I know this question might be trivial, but I am still learning. Given a 2D numpy array, I want to take a block of rows using slicing. For instance, from the following matrix, I want to extract only the first three rows, so from:
[[ 1 2 3 4]
[ 5 6 7 8]
[ 9 10 11 12]
[ 28 9 203 102]
[577 902 11 101]]
I want:
[[ 1 2 3 4]
[ 5 6 7 8]
[ 9 10 11 12]]
My code below is still missing something. I would appreciate any hint.
X = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [28, 9, 203, 102], [577, 902, 11, 101]]
X = np.array(X)
X_sliced = X[3,:]
print(X_sliced)
A 2D numpy array can be thought of as a nested list of lists: element 0 is the first row, element 1 is the second row, and so on.
You can pull out a single row with x[n], where n is the (zero-based) index of the row you want.
You can pull out a range of rows with x[n:m], where n is the first row and m is one past the final row (the stop index is excluded).
If you leave out n or m and write x[n:] or x[:m], Python fills in the blank with the start or the end of the array. For example, x[n:] returns all rows from n to the end, and x[:m] returns all rows from the start up to, but not including, m.
You can accomplish what you want with x[:3], which is equivalent to x[0:3].
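Applied to the matrix from the question, the rules above look like this (a short sketch; the variable names are only illustrative):

```python
import numpy as np

X = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12],
              [28, 9, 203, 102], [577, 902, 11, 101]])

single_row = X[3]        # what X[3, :] in the question returns: only row 3
first_three = X[:3]      # rows 0, 1 and 2; the stop index 3 is excluded
same_thing = X[0:3, :]   # explicit form of the same slice

print(first_three)
```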
As a disclaimer, I'm very new to Python and numpy arrays. Reading the answers to similar questions (for example, Reshaping 3D Numpy Array to a 2D array) and trying their solutions on my own data hasn't been very helpful, so I thought I'd post my own question. It's entirely believable, though, that I've just implemented those other solutions wrong.
I have a 3D numpy array "C"
C = np.reshape(np.arange(3*3*4),(3,3,4))
print(C)
[[[ 0 1 2 3]
[ 4 5 6 7]
[ 8 9 10 11]]
[[12 13 14 15]
[16 17 18 19]
[20 21 22 23]]
[[24 25 26 27]
[28 29 30 31]
[32 33 34 35]]]
I would like to reshape it into something like:
[0,12,24], [1,13,25], [2,14,26] ..... etc
where the first elements of each of the 3 arrays get put into their own array, then the second elements of each array get put into a new array, and so on.
It seems trivial, but I'm stumped. I've tried different combinations of .reshape, for example,
output=C.reshape(12,3)
I've tried changing the order from "C" to "F" and playing around with different .reshape() parameters, but I can't seem to get the final result in the desired structure.
Any tips would be much appreciated.
I think this is what you want:
C = np.reshape(np.arange(3*3*4),(3,3,4))
C.reshape(3,12).T
array([[ 0, 12, 24],
[ 1, 13, 25],
[ 2, 14, 26],
[ 3, 15, 27],
[ 4, 16, 28],
[ 5, 17, 29],
[ 6, 18, 30],
[ 7, 19, 31],
[ 8, 20, 32],
[ 9, 21, 33],
[10, 22, 34],
[11, 23, 35]])
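To see why this works, note that reshape(3, 12) flattens each of the 3 blocks into one row of 12 values, and .T then turns "position within a block" into the row index. A small sketch (not from the original answer) checks one row and shows an equivalent spelling with transpose:

```python
import numpy as np

C = np.arange(3 * 3 * 4).reshape(3, 3, 4)
out = C.reshape(3, 12).T

# Row n of the result collects element (r, c) from each of the 3 blocks,
# where n = 4*r + c (4 is the length of the last axis).
r, c = 1, 2
row_from_out = out[4 * r + c].tolist()
row_from_C = C[:, r, c].tolist()
print(row_from_out, row_from_C)  # [6, 18, 30] [6, 18, 30]

# Equivalent: move the block axis to the end, then flatten the first two axes.
alt = C.transpose(1, 2, 0).reshape(12, 3)
print(np.array_equal(out, alt))  # True
```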
I have written a Python program that sorts a two-dimensional array by its second column and, where elements in the second column are equal, by the first column. I solved the problem with my rudimentary Python knowledge, but
I think it can be improved. Can anyone help me optimize it?
Please also suggest whether using another data type for sorting would be a good option.
#created a two dimensional array
two_dim_array=[[2, 5], [9, 1], [4, 8], [10, 0], [50, 32], [33, 31],[1, 5], [12, 5], [22, 5], [32, 5], [9, 5],[3, 31], [91, 32] ]
#get the length of the array
n_ship=len(two_dim_array)
#sorting two dimensional array by using second column
sort_by_second_column=sorted(two_dim_array, key=lambda x: x[1], reverse=False)
#declared new variable for storing second sorted array
new_final_data=[]
#parameter used to slice the two dimensional column
first_slice=0
#tmp=[]
index=[0]
for m in range(1, n_ship):
    #print('m is: '+str(m)+'final_data[m-1][1] is: '+str(final_data[m-1][1])+'final_data[m][1] is: '+str(final_data[m][1]))
    #subtracting second column elements to detect changes and saved to array
    if(abs(sort_by_second_column[m-1][1]-sort_by_second_column[m][1])!=0):
        index.append(m)
# print(index)
l=1
# used the above generated index to slice the data
for z in range(len(index)):
    tmp=[]
    if(l==1):
        first_slice=0
        last=index[z+1]
        mid_start=index[z]
        # print('l is start'+ 'first is '+str(first_slice)+'last is'+str(last))
        v=sort_by_second_column[:last]
    elif l==len(index):
        first_slice=index[z]
        # print('l is last'+str(1)+ 'first is '+str(first_slice)+'last is'+str(last))
        v=sort_by_second_column[first_slice:]
    else:
        first_slice=index[z]
        last=index[z+1]
        #print('l is middle'+str(1)+ 'first is '+str(first_slice)+'last is'+str(last))
        v=sort_by_second_column[first_slice:last]
    tmp.extend(v)
    tmp=sorted(tmp, key=lambda x: x[0], reverse=False)
    #print(tmp)
    new_final_data.extend(tmp)
    # print(new_final_data)
    l+=1
for l in range(n_ship):
    print(str(new_final_data[l][0])+' '+str(new_final_data[l][1]))
''' Input
2 5
9 1
4 8
10 0
50 32
33 31
1 5
12 5
22 5
32 5
9 5
3 31
91 32
Output
10 0
9 1
1 5
2 5
9 5
12 5
22 5
32 5
4 8
3 31
33 31
50 32
91 32'''
You should read the documentation on sorted(), as this is exactly what you need to use:
https://docs.python.org/3/library/functions.html#sorted
newarray=sorted(two_dim_array, key=lambda x:(x[1],x[0]))
Outputs:
[10, 0]
[9, 1]
[1, 5]
[2, 5]
[9, 5]
[12, 5]
[22, 5]
[32, 5]
[4, 8]
[3, 31]
[33, 31]
[50, 32]
[91, 32]
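On the question of other data types: if the data is already in a NumPy array, np.lexsort may be worth a look. The sketch below is an alternative, not part of the answer above, and reuses the data from the question. Note that np.lexsort treats the last key in the tuple as the primary sort key:

```python
import numpy as np

two_dim_array = [[2, 5], [9, 1], [4, 8], [10, 0], [50, 32], [33, 31], [1, 5],
                 [12, 5], [22, 5], [32, 5], [9, 5], [3, 31], [91, 32]]
arr = np.array(two_dim_array)

# Keys are given from least to most significant:
# column 0 breaks ties, column 1 is the primary key.
order = np.lexsort((arr[:, 0], arr[:, 1]))
sorted_arr = arr[order]

print(sorted_arr)
```

For a plain list of lists, the sorted() call with a tuple key remains the simpler choice.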
Suppose I am working with numpy in Python and I have a two-dimensional array of arbitrary size. For convenience, let's say I have a 5 x 5 array. The specific numbers are not particularly important to my question; they're just an example.
a = numpy.arange(25).reshape(5,5)
This yields:
[[0, 1, 2, 3, 4 ],
[5, 6, 7, 8, 9 ],
[10,11,12,13,14],
[15,16,17,18,19],
[20,21,22,23,24]]
Now, let's say I wanted to take a 2D slice of this array. In normal conditions, this would be easy. To get the cells immediately adjacent to 2,2 I would simply use a[1:4,1:4] which would yield the expected
[[6, 7, 8 ],
[11, 12, 13],
[16, 17, 18]]
But what if I want to take a slice that wraps around the edges of the array? For example a[-1:2,-1:2] would yield:
[24, 20, 21],
[4, 0, 1 ],
[9, 5, 6 ]
This would be useful in several situations where the edges don't matter, for example game graphics that wrap around a screen. I realize this can be done with a lot of if statements and bounds-checking, but I was wondering if there was a cleaner, more idiomatic way to accomplish this.
Looking around, I have found several answers such as this: https://stackoverflow.com/questions/17739543/wrapping-around-slices-in-python-numpy that work for 1-dimensional arrays, but I have yet to figure out how to apply this logic to a 2D slice.
So essentially, the question is: how do I take a 2D slice of a 2D array in numpy that wraps around the edges of the array?
Thank you in advance to anyone who can help.
This will work with numpy >= 1.7.
a = np.arange(25).reshape(5,5)
array([[ 0, 1, 2, 3, 4],
[ 5, 6, 7, 8, 9],
[10, 11, 12, 13, 14],
[15, 16, 17, 18, 19],
[20, 21, 22, 23, 24]])
The pad routine has a 'wrap' mode...
b = np.pad(a, 1, mode='wrap')
array([[24, 20, 21, 22, 23, 24, 20],
[ 4, 0, 1, 2, 3, 4, 0],
[ 9, 5, 6, 7, 8, 9, 5],
[14, 10, 11, 12, 13, 14, 10],
[19, 15, 16, 17, 18, 19, 15],
[24, 20, 21, 22, 23, 24, 20],
[ 4, 0, 1, 2, 3, 4, 0]])
Depending on the situation you may have to add 1 to each term of any slice in order to account for the padding around b.
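As a sketch of that offset (assuming a pad width of 1, as above), the wrapped slice a[-1:2, -1:2] from the question becomes an ordinary slice of b once 1 is added to every bound:

```python
import numpy as np

a = np.arange(25).reshape(5, 5)
pad = 1
b = np.pad(a, pad, mode='wrap')

# Shift every slice bound by the pad width: -1:2 becomes 0:3.
window = b[-1 + pad:2 + pad, -1 + pad:2 + pad]
print(window.tolist())  # [[24, 20, 21], [4, 0, 1], [9, 5, 6]]
```

The pad width limits how far a slice can reach past the edge, so a larger wrap-around needs a larger pad.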
After playing around with various methods for a while, I just came to a fairly simple solution that works using ndarray.take. Using the example I provided in the question:
a.take(range(-1,2),mode='wrap', axis=0).take(range(-1,2),mode='wrap',axis=1)
Provides the desired output of
[[24 20 21]
[4 0 1]
[9 5 6]]
It turns out to be a lot simpler than I thought it would be. This solution also works if you reverse the two axes.
This is similar to the previous answers I've seen using take, but I haven't seen anyone explain how it'd be used with a 2D array before, so I'm posting this in the hopes it helps someone with the same question in the future.
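For reuse, the two chained take calls can be wrapped in a small function. wrapped_window and its radius parameter are hypothetical names introduced here for illustration, not part of NumPy:

```python
import numpy as np

def wrapped_window(arr, i, j, radius=1):
    """Return the (2*radius+1)-square window centred on (i, j), wrapping at the edges.

    Hypothetical helper: it only chains ndarray.take with mode='wrap' on each axis.
    """
    rows = range(i - radius, i + radius + 1)
    cols = range(j - radius, j + radius + 1)
    return arr.take(rows, mode='wrap', axis=0).take(cols, mode='wrap', axis=1)

a = np.arange(25).reshape(5, 5)
print(wrapped_window(a, 0, 0).tolist())  # [[24, 20, 21], [4, 0, 1], [9, 5, 6]]
print(wrapped_window(a, 4, 4).tolist())  # [[18, 19, 15], [23, 24, 20], [3, 4, 0]]
```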
You can also use roll, to roll the array and then take your slice:
b = np.roll(np.roll(a, 1, axis=0), 1, axis=1)[:3,:3]
gives
array([[24, 20, 21],
[ 4, 0, 1],
[ 9, 5, 6]])
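The shift of 1 above is specific to a window centred on (0, 0). For a 3x3 window centred on an arbitrary cell (i, j), a shift of 1 - i and 1 - j brings the window to the top-left corner. This is a sketch, and passing tuples for shift and axis assumes NumPy 1.12 or later:

```python
import numpy as np

a = np.arange(25).reshape(5, 5)
i, j = 4, 4  # centre of the window (illustrative values)

# Roll so that row i-1 and column j-1 land at index 0, then slice normally.
rolled = np.roll(a, shift=(1 - i, 1 - j), axis=(0, 1))
window = rolled[:3, :3]
print(window.tolist())  # [[18, 19, 15], [23, 24, 20], [3, 4, 0]]
```

Keep in mind that np.roll returns a copy of the whole array, so this can be wasteful when the array is large and the window is small.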
I had a similar challenge working with wrap-around indexing, only in my case I needed to set values in the original matrix. I solved this with 'fancy indexing' and the meshgrid function:
import numpy as np
A = np.arange(25).reshape((5,5)) # destination matrix
print('A:', A, sep='\n')
k = -1 * np.arange(9).reshape(3,3) # test kernel, all zero or negative
print('Kernel:', k, sep='\n')
ix, iy = np.meshgrid(np.arange(3), np.arange(3)) # create x and y basis indices
pos = (0,-1) # insertion position
# create insertion indices, wrapped with modulo
x = (ix+pos[0]) % A.shape[0]
y = (iy+pos[1]) % A.shape[1]
A[x,y] = k # set values
print('Result:', A, sep='\n')
The output:
A:
[[ 0 1 2 3 4]
[ 5 6 7 8 9]
[10 11 12 13 14]
[15 16 17 18 19]
[20 21 22 23 24]]
Kernel:
[[ 0 -1 -2]
[-3 -4 -5]
[-6 -7 -8]]
Result:
[[-3 -6 2 3 0]
[-4 -7 7 8 -1]
[-5 -8 12 13 -2]
[15 16 17 18 19]
[20 21 22 23 24]]
As I mentioned in the comments, there is a good answer at How do I select a window from a numpy array with periodic boundary conditions?
Here is another simple way to do this
# First some setup
import numpy as np
A = np.arange(25).reshape((5, 5))
m, n = A.shape
and then
A[np.arange(i-1, i+2)%m].reshape((3, -1))[:,np.arange(j-1, j+2)%n]
It is somewhat harder to obtain something that you can assign to.
Here is a somewhat slower version.
In order to get a similar slice of values I would have to do
A.flat[np.array([np.arange(j-1,j+2)%n+(a%m)*n for a in range(i-1, i+2)]).ravel()].reshape((3,3))
In order to assign to this I would have to avoid the call to reshape and work directly with the flattened version returned by the fancy indexing.
Here is an example:
n = 7
A = np.zeros((n, n))
for i in range(n-2, 0, -1):
    A.flat[np.array([np.arange(i-1,i+2)%n+a*n for a in range(i-1, i+2)]).ravel()] = i+1
print(A)
which returns
[[ 2. 2. 2. 0. 0. 0. 0.]
[ 2. 2. 2. 3. 0. 0. 0.]
[ 2. 2. 2. 3. 4. 0. 0.]
[ 0. 3. 3. 3. 4. 5. 0.]
[ 0. 0. 4. 4. 4. 5. 6.]
[ 0. 0. 0. 5. 5. 5. 6.]
[ 0. 0. 0. 0. 6. 6. 6.]]