I have an array and need to normalize it in a way that the results will be numbers between 0 and 1. I already normalized the entire array as follows:
C = A / A.max(axis=0)
print(C)
____________________________________________________________________
[[0. 0.05263158 0.1 0.14285714 0.18181818 0.2173913 ]
[0.33333333 0.36842105 0.4 0.42857143 0.45454545 0.47826087]
[0.66666667 0.68421053 0.7 0.71428571 0.72727273 0.73913043]
[1. 1. 1. 1. 1. 1. ]]
But now I have to normalize by column and by row. How can I do that with axis reduction? If there is a better way than what I did, please suggest changes.
My expected result is two normalized arrays: one normalized along the columns and the other along the rows.
This is my data
A = [[ 0 1 2 3 4 5]
[ 6 7 8 9 10 11]
[12 13 14 15 16 17]
[18 19 20 21 22 23]]
You skipped the minimum. Normally, 0-1 normalization subtracts the minimum value in both the numerator and the denominator, i.e. (x - min) / (max - min).
https://stats.stackexchange.com/questions/70801/how-to-normalize-data-to-0-1-range
import numpy as np
A = np.matrix([[ 0, 1, 2, 3, 4, 5],
[ 6, 7, 8, 9, 10, 11],
[12, 13, 14, 15, 16, 17],
[18, 19, 20, 21, 22, 23]])
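# normalize each row (min and max taken along axis=1)
(A-A.min(axis=1))/(A.max(axis=1)-A.min(axis=1))
# normalize each column (min and max taken along axis=0)
(A-A.min(axis=0))/(A.max(axis=0)-A.min(axis=0))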
My expected result is two arrays with the values normalized. One considering the columns and the other by the lines
a = np.array([[ 0, 1, 2, 3, 4, 5],
[ 6, 7, 8, 9, 10, 11],
[12, 13, 14, 15, 16, 17],
[18, 19, 20, 21, 22, 23]])
If
c = a / a.max(axis=0)
gives you what you want for the columns then
d = a / a.max(axis=1)[:,None]
will suffice for the rows.
>>> d.round(4)
array([[0. , 0.2 , 0.4 , 0.6 , 0.8 , 1. ],
[0.5455, 0.6364, 0.7273, 0.8182, 0.9091, 1. ],
[0.7059, 0.7647, 0.8235, 0.8824, 0.9412, 1. ],
[0.7826, 0.8261, 0.8696, 0.913 , 0.9565, 1. ]])
https://numpy.org/doc/stable/user/basics.broadcasting.html
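As a rough sketch combining the two answers above: with a plain ndarray, `keepdims=True` keeps the reduced axis, so the result broadcasts against the original array without needing `[:, None]`. The helper name `minmax_normalize` is made up for illustration, and the sketch assumes no row or column is constant (otherwise the denominator is zero).

```python
import numpy as np

def minmax_normalize(a, axis):
    """Scale values to [0, 1] along `axis` (hypothetical helper, not a NumPy function)."""
    a = np.asarray(a, dtype=float)
    lo = a.min(axis=axis, keepdims=True)   # keepdims keeps the shape broadcastable
    hi = a.max(axis=axis, keepdims=True)
    return (a - lo) / (hi - lo)            # assumes hi != lo along the axis

a = np.arange(24).reshape(4, 6)
by_column = minmax_normalize(a, axis=0)    # each column spans 0..1
by_row = minmax_normalize(a, axis=1)       # each row spans 0..1
```

Unlike dividing by the maximum alone, this maps the smallest value of every column (or row) to exactly 0.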
I have code in Matlab that I need to translate to Python. Shapes and indices are really important here, since the code works with tensors. I'm a little confused: it seemed that using order='F' in Python's reshape() would be enough, but when I work with 3D data I noticed that it does not work. For example, if A is an array from 1 to 27 in Python
array([[[ 1, 2, 3],
[ 4, 5, 6],
[ 7, 8, 9]],
[[10, 11, 12],
[13, 14, 15],
[16, 17, 18]],
[[19, 20, 21],
[22, 23, 24],
[25, 26, 27]]])
if I perform A.reshape(3, 9, order='F') I get
[[ 1 4 7 2 5 8 3 6 9]
[10 13 16 11 14 17 12 15 18]
[19 22 25 20 23 26 21 24 27]]
In Matlab for A = 1:27 reshaped to [3, 3, 3] and then to [3, 9] it seems that I get another array:
1 4 7 10 13 16 19 22 25
2 5 8 11 14 17 20 23 26
3 6 9 12 15 18 21 24 27
The SVD in Matlab and Python also gives different results. So, is there a way to fix this?
More generally, what is the correct way to handle multidimensional arrays when going from Matlab to Python? For example, should I get the same SVD for arange(1, 13).reshape(3, 4) in Python and for 1:12 reshaped to [3, 4] in Matlab? Can I swap axes somehow in Python to get the same results as in Matlab, or change the order of the axes in reshape(x1, x2, x3,...) in Python?
I was having the same issues until I found this Wikipedia article: row- and column-major order
NumPy (like C) stores arrays in row-major order by default. As you can see in your first example, the elements first increase along the columns:
array([[[ 1, 2, 3],
- - - -> increasing
Then along the rows:
array([[[ 1, 2, 3],
[ 4, <--- new element
When all columns and rows are full, it moves to the next page.
array([[[ 1, 2, 3],
[ 4, 5, 6],
[ 7, 8, 9]],
[[10, <-- new element in next page
Matlab (like Fortran) increases the row index first, then the column index, and so on.
For N-dimensional arrays it looks like:
Python (row major -> last dimension is contiguous): [dim1,dim2,...,dimN]
Matlab (column major -> first dimension is contiguous): the same tensor in memory would look the other way around .. [dimN,...,dim2,dim1]
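To make the two layouts concrete, here is a small sketch (my own example, not part of the original answer) that inspects strides and flattening order in NumPy. The dtype is fixed to int64 so the byte counts are the same on every platform.

```python
import numpy as np

c = np.arange(6, dtype=np.int64).reshape(2, 3)   # C (row-major) layout by default
f = np.asfortranarray(c)                          # same values, column-major layout

print(c.strides)            # (24, 8): a step along the last axis moves 8 bytes
print(f.strides)            # (8, 16): a step along the first axis moves 8 bytes
print(c.ravel(order='C'))   # [0 1 2 3 4 5]
print(c.ravel(order='F'))   # [0 3 1 4 2 5]
```

Both arrays compare equal element by element; only the order in which the elements sit in memory differs.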
If you want to export n-dim. arrays from python to matlab, the easiest way is to permute the dimensions first:
(in python)
import numpy as np
import scipy.io as sio
A=np.reshape(range(1,28),[3,3,3])
sio.savemat('A',{'A':A})
(in matlab)
load('A.mat')
A=permute(A,[3 2 1]);%dimensions in reverse ordering
reshape(A,9,3)' %gives the same result as A.reshape([3,9]) in python
Note that the (9,3) and the (3,9) are intentionally in reverse order.
In Matlab
A = 1:27;
A = reshape(A,3,3,3);
B = reshape(A,9,3)'
B =
1 2 3 4 5 6 7 8 9
10 11 12 13 14 15 16 17 18
19 20 21 22 23 24 25 26 27
size(B)
ans =
3 9
In Python
A = np.array(range(1,28))
A = A.reshape(3,3,3)
B = A.reshape(3,9)
B
array([[ 1, 2, 3, 4, 5, 6, 7, 8, 9],
[10, 11, 12, 13, 14, 15, 16, 17, 18],
[19, 20, 21, 22, 23, 24, 25, 26, 27]])
np.shape(B)
(3, 9)
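Going the other way, if the goal is to reproduce Matlab's result from the question inside NumPy, one option (a sketch, assuming the data really is meant to be filled column-major as in Matlab) is to use order='F' for both reshapes, not only the second one:

```python
import numpy as np

# Build the 3x3x3 array the way Matlab's reshape(1:27, [3 3 3]) fills it
A = np.arange(1, 28).reshape((3, 3, 3), order='F')
# Reshape again in column-major order, like Matlab's reshape(A, [3 9])
B = A.reshape((3, 9), order='F')
print(B)
# [[ 1  4  7 10 13 16 19 22 25]
#  [ 2  5  8 11 14 17 20 23 26]
#  [ 3  6  9 12 15 18 21 24 27]]
```

The mismatch in the question comes from creating A in row-major order and then reshaping it in column-major order.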
I have a 2-d numpy array of this form:
[[ 0. 1. 2. 3. 4.]
[ 5. 6. 7. 8. 9.]
[ 10. 11. 12. 13. 14.]
[ 15. 16. 17. 18. 19.]
[ 20. 21. 22. 23. 24.]
[ 25. 26. 27. 28. 29.]
[ 30. 31. 32. 33. 34.]
[ 35. 36. 37. 38. 39.]
[ 40. 41. 42. 43. 44.]
[ 45. 46. 47. 48. 49.]]
I want to construct a view of the array, grouping its elements in a moving window (of size 4 in my example). My result should be of shape (7, 4, 5) and I can construct it as follows:
res = []
mem = 4
for i in range(mem, X.shape[0] + 1):
    res.append(X[i - mem:i, :])
res = np.asarray(res)
print(res.shape)
I want to avoid reallocation, so I wonder if I can construct a view to give this result, with as_strided for example.
An explanation of the process is very welcome.
Thanks
Here's an approach with requested np.lib.stride_tricks.as_strided -
def strided_axis0(a, L):
    # INPUTS :
    # a is array
    # L is length of array along axis=0 to be cut for forming each subarray
    # Length of 3D output array along its axis=0
    nd0 = a.shape[0] - L + 1
    # Store shape and strides info
    m, n = a.shape
    s0, s1 = a.strides
    # Finally use strides to get the 3D array view
    return np.lib.stride_tricks.as_strided(a, shape=(nd0, L, n), strides=(s0, s0, s1))
Sample run -
In [48]: X = np.arange(35).reshape(-1,5)
In [49]: X
Out[49]:
array([[ 0, 1, 2, 3, 4],
[ 5, 6, 7, 8, 9],
[10, 11, 12, 13, 14],
[15, 16, 17, 18, 19],
[20, 21, 22, 23, 24],
[25, 26, 27, 28, 29],
[30, 31, 32, 33, 34]])
In [50]: strided_axis0(X, L=4)
Out[50]:
array([[[ 0, 1, 2, 3, 4],
[ 5, 6, 7, 8, 9],
[10, 11, 12, 13, 14],
[15, 16, 17, 18, 19]],
[[ 5, 6, 7, 8, 9],
[10, 11, 12, 13, 14],
[15, 16, 17, 18, 19],
[20, 21, 22, 23, 24]],
[[10, 11, 12, 13, 14],
[15, 16, 17, 18, 19],
[20, 21, 22, 23, 24],
[25, 26, 27, 28, 29]],
[[15, 16, 17, 18, 19],
[20, 21, 22, 23, 24],
[25, 26, 27, 28, 29],
[30, 31, 32, 33, 34]]])
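On newer NumPy versions (1.20 or later, which is an assumption about your environment), `sliding_window_view` should give the same view without computing strides by hand. This is only a sketch of that alternative; note that it appends the window axis last, so it has to be moved to match the layout above.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view  # needs NumPy >= 1.20

X = np.arange(35).reshape(-1, 5)
# The window axis is appended last, giving shape (4, 5, 4); move it to position 1
windows = np.moveaxis(sliding_window_view(X, 4, axis=0), -1, 1)
print(windows.shape)                  # (4, 4, 5)
print(np.shares_memory(windows, X))   # True: still a view, no data copied
```

The returned view is read-only by default, which guards against accidentally writing through overlapping windows.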
I wrote this function:
import numpy as np
def indices_moving_window(arr, win):
    win_h = win[0]
    win_w = win[1]
    fh = arr.shape[0] - win_h + 1 # Final height
    fw = arr.shape[1] - win_w + 1 # Final width
    # Generate indices needed to iterate through the array with the moving window
    ir = np.repeat(np.arange(fh), win_w).reshape(1, -1, win_w)
    ir = np.repeat(ir, win_h, axis=1).reshape(-1, win_h, win_w)
    ir = np.add(ir, np.arange(win_h).reshape(-1, win_h, 1))
    ir = np.repeat(ir, fw, axis=0).reshape(fh, fw, win_h, win_w)
    ic = np.repeat(np.arange(fw), win_h).reshape(1, -1, win_h)
    ic = np.repeat(ic, win_w, axis=1).reshape(-1, win_h, win_w)
    ic = np.add(ic, np.arange(win_w))
    ic = ic.reshape(-1, win_w)
    ic = np.tile(ic, (fh, 1))
    ic = ic.reshape(fh, fw, win_h, win_w)
    return ir, ic # Return indices for rows and columns
Example:
arr = np.arange(1,21).reshape(4,5)
rows, cols = indices_moving_window(arr, (3,4))
print(arr)
print(arr[rows,cols])
Output:
[[ 1 2 3 4 5]
[ 6 7 8 9 10]
[11 12 13 14 15]
[16 17 18 19 20]]
[[[[ 1 2 3 4]
[ 6 7 8 9]
[11 12 13 14]]
[[ 2 3 4 5]
[ 7 8 9 10]
[12 13 14 15]]]
[[[ 6 7 8 9]
[11 12 13 14]
[16 17 18 19]]
[[ 7 8 9 10]
[12 13 14 15]
[17 18 19 20]]]]
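The same index arrays can probably be built more compactly with broadcasting: each index is just "window start + offset inside the window". The following is a sketch under that assumption; `window_indices` is a hypothetical name, not an existing NumPy function.

```python
import numpy as np

def window_indices(shape, win):
    """Row/column index arrays for all win-sized windows (hypothetical helper)."""
    fh = shape[0] - win[0] + 1   # number of window positions vertically
    fw = shape[1] - win[1] + 1   # number of window positions horizontally
    # Broadcasting: window start + offset inside the window
    rows = np.arange(fh)[:, None, None, None] + np.arange(win[0])[None, None, :, None]
    cols = np.arange(fw)[None, :, None, None] + np.arange(win[1])[None, None, None, :]
    return rows, cols

arr = np.arange(1, 21).reshape(4, 5)
rows, cols = window_indices(arr.shape, (3, 4))
windows = arr[rows, cols]        # index arrays broadcast to shape (2, 2, 3, 4)
```

As with the function above, indexing with these arrays produces a copy rather than a view.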
Suppose I am working with numpy in Python and I have a two-dimensional array of arbitrary size. For convenience, let's say I have a 5 x 5 array. The specific numbers are not particularly important to my question; they're just an example.
a = numpy.arange(25).reshape(5,5)
This yields:
[[0, 1, 2, 3, 4 ],
[5, 6, 7, 8, 9 ],
[10,11,12,13,14],
[15,16,17,18,19],
[20,21,22,23,24]]
Now, let's say I wanted to take a 2D slice of this array. In normal conditions, this would be easy. To get the cells immediately adjacent to 2,2 I would simply use a[1:4,1:4] which would yield the expected
[[6, 7, 8 ],
[11, 12, 13],
[16, 17, 18]]
But what if I want to take a slice that wraps around the edges of the array? For example a[-1:2,-1:2] would yield:
[24, 20, 21],
[4, 0, 1 ],
[9, 5, 6 ]
This would be useful in several situations where the edges don't matter, for example game graphics that wrap around a screen. I realize this can be done with a lot of if statements and bounds-checking, but I was wondering if there was a cleaner, more idiomatic way to accomplish this.
Looking around, I have found several answers such as this: https://stackoverflow.com/questions/17739543/wrapping-around-slices-in-python-numpy that work for 1-dimensional arrays, but I have yet to figure out how to apply this logic to a 2D slice.
So essentially, the question is: how do I take a 2D slice of a 2D array in numpy that wraps around the edges of the array?
Thank you in advance to anyone who can help.
This will work with numpy >= 1.7.
a = np.arange(25).reshape(5,5)
array([[ 0, 1, 2, 3, 4],
[ 5, 6, 7, 8, 9],
[10, 11, 12, 13, 14],
[15, 16, 17, 18, 19],
[20, 21, 22, 23, 24]])
The pad routine has a 'wrap' mode...
b = np.pad(a, 1, mode='wrap')
array([[24, 20, 21, 22, 23, 24, 20],
[ 4, 0, 1, 2, 3, 4, 0],
[ 9, 5, 6, 7, 8, 9, 5],
[14, 10, 11, 12, 13, 14, 10],
[19, 15, 16, 17, 18, 19, 15],
[24, 20, 21, 22, 23, 24, 20],
[ 4, 0, 1, 2, 3, 4, 0]])
Depending on the situation you may have to add 1 to each term of any slice in order to account for the padding around b.
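To illustrate the index shift, here is a short sketch using the 5x5 array from the question; the variable names `i`, `j` and `window` are mine. With a pad width of 1, `a[i, j]` sits at `b[i + 1, j + 1]`, so the 3x3 neighbourhood around `(i, j)` starts at `b[i, j]`.

```python
import numpy as np

a = np.arange(25).reshape(5, 5)
b = np.pad(a, 1, mode='wrap')    # one wrapped cell on every side

i, j = 0, 0                      # centre cell in the original array
# a[i, j] lives at b[i + 1, j + 1], so the 3x3 neighbourhood is b[i:i+3, j:j+3]
window = b[i:i + 3, j:j + 3]
print(window)
# [[24 20 21]
#  [ 4  0  1]
#  [ 9  5  6]]
```

The padded array is a copy, so this suits reading wrapped windows but not writing back into `a`.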
After playing around with various methods for a while, I just came to a fairly simple solution that works using ndarray.take. Using the example I provided in the question:
a.take(range(-1,2),mode='wrap', axis=0).take(range(-1,2),mode='wrap',axis=1)
Provides the desired output of
[[24 20 21]
[4 0 1]
[9 5 6]]
It turns out to be a lot simpler than I thought it would be. This solution also works if you reverse the two axes.
This is similar to the previous answers I've seen using take, but I haven't seen anyone explain how it'd be used with a 2D array before, so I'm posting this in the hopes it helps someone with the same question in the future.
You can also use roll, to roll the array and then take your slice:
b = np.roll(np.roll(a, 1, axis=0), 1, axis=1)[:3,:3]
gives
array([[24, 20, 21],
[ 4, 0, 1],
[ 9, 5, 6]])
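If your NumPy is recent enough to accept tuples for the shift and axis arguments (1.12 or later, as far as I know), the two nested calls can be collapsed into one. A minimal sketch:

```python
import numpy as np

a = np.arange(25).reshape(5, 5)
# Shift by one along both axes in a single call, then slice the corner
b = np.roll(a, shift=(1, 1), axis=(0, 1))[:3, :3]
print(b)
# [[24 20 21]
#  [ 4  0  1]
#  [ 9  5  6]]
```

Either way, np.roll copies the whole array, so this may be wasteful if only a small window of a large array is needed.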
I had a similar challenge working with wrap-around indexing, only in my case I needed to set values in the original matrix. I've solved this by 'fancy indexing' and making use of meshgrid function:
import numpy as np
A = np.arange(25).reshape((5, 5))  # destination matrix
print('A:', A, sep='\n')
k = -1 * np.arange(9).reshape(3, 3)  # test kernel, all negative
print('Kernel:', k, sep='\n')
ix, iy = np.meshgrid(np.arange(3), np.arange(3))  # create x and y basis indices
pos = (0, -1)  # insertion position
# create insertion indices
x = (ix + pos[0]) % A.shape[0]
y = (iy + pos[1]) % A.shape[1]
A[x, y] = k  # set values
print('Result:', A, sep='\n')
The output:
A:
[[ 0 1 2 3 4]
[ 5 6 7 8 9]
[10 11 12 13 14]
[15 16 17 18 19]
[20 21 22 23 24]]
Kernel:
[[ 0 -1 -2]
[-3 -4 -5]
[-6 -7 -8]]
Result:
[[-3 -6 2 3 0]
[-4 -7 7 8 -1]
[-5 -8 12 13 -2]
[15 16 17 18 19]
[20 21 22 23 24]]
As I mentioned in the comments, there is a good answer at How do I select a window from a numpy array with periodic boundary conditions?
Here is another simple way to do this
# First some setup
import numpy as np
A = np.arange(25).reshape((5, 5))
m, n = A.shape
and then, for the 3x3 window centred on (i, j):
A[np.arange(i-1, i+2)%m].reshape((3, -1))[:,np.arange(j-1, j+2)%n]
It is somewhat harder to obtain something that you can assign to.
Here is a somewhat slower version.
In order to get a similar slice of values I would have to do
A.flat[np.array([np.arange(j-1,j+2)%n+(a%m)*n for a in range(i-1, i+2)]).ravel()].reshape((3,3))
In order to assign to this I would have to avoid the call to reshape and work directly with the flattened version returned by the fancy indexing.
Here is an example:
n = 7
A = np.zeros((n, n))
for i in range(n-2, 0, -1):
    A.flat[np.array([np.arange(i-1,i+2)%n+a*n for a in range(i-1, i+2)]).ravel()] = i+1
print(A)
which returns
[[ 2. 2. 2. 0. 0. 0. 0.]
[ 2. 2. 2. 3. 0. 0. 0.]
[ 2. 2. 2. 3. 4. 0. 0.]
[ 0. 3. 3. 3. 4. 5. 0.]
[ 0. 0. 4. 4. 4. 5. 6.]
[ 0. 0. 0. 5. 5. 5. 6.]
[ 0. 0. 0. 0. 6. 6. 6.]]
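For what it's worth, `np.ix_` together with modulo indices seems to cover both reading and assigning in one pattern. This is a sketch with example values for `i` and `j`, not a benchmarked replacement for the versions above:

```python
import numpy as np

A = np.arange(25).reshape(5, 5)
m, n = A.shape
i, j = 0, 0                                  # example centre of the window

rows = np.arange(i - 1, i + 2) % m           # [4, 0, 1]
cols = np.arange(j - 1, j + 2) % n           # [4, 0, 1]
window = A[np.ix_(rows, cols)]               # 3x3 copy of the wrapped window
A[np.ix_(rows, cols)] = -1                   # assignment writes into A itself
```

Reading gives a copy (this is fancy indexing), but using the same index expression on the left-hand side of an assignment modifies `A` in place.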
I have the following code that iterates along the diagonals that are orthogonal to the diagonals normally returned by np.diagonal. It starts at position (0, 0) and works its way towards the lower right coordinate.
The code works as intended but is not very numpy with all its loops and inefficient in having to create many arrays to do the trick.
So I wonder if there is a nicer way to do this, because I don't see how I would stride my array or use the diagonal-methods of numpy to do it in a nicer way (though I expect there are some tricks I fail to see).
import numpy as np
A = np.zeros((4,5))
#Construct a distance array of the same size that uses (0, 0) as origin
#and evaluates distances along first and second dimensions slightly
#differently so that no two values in the array are the same
#NOTE: grid_shape is not defined in this snippet; the output below is consistent with grid_shape[0] == 16
D = np.zeros(A.shape)
for i in range(D.shape[0]):
    for j in range(D.shape[1]):
        D[i, j] = i * (1 + 1.0 / (grid_shape[0] + 1)) + j
print(D)
#[[ 0. 1. 2. 3. 4. ]
# [ 1.05882353 2.05882353 3.05882353 4.05882353 5.05882353]
# [ 2.11764706 3.11764706 4.11764706 5.11764706 6.11764706]
# [ 3.17647059 4.17647059 5.17647059 6.17647059 7.17647059]]
#Make a flat sorted copy
rD = D.ravel().copy()
rD.sort()
#Just to show how it works, assigning incrementing values
#iterating along the 'orthogonal' diagonals starting at (0, 0) position
for i, v in enumerate(rD):
    A[D == v] = i
print(A)
#[[ 0 1 3 6 10]
# [ 2 4 7 11 14]
# [ 5 8 12 15 17]
# [ 9 13 16 18 19]]
Edit
To clarify, I want to iterate element-wise through the entire A, but in the order that the code above produces (which is displayed in the final print).
It is not important which direction the iteration goes along the diagonals (1 and 2 could switch places, likewise 3 and 5, etc. in A), only that the diagonals are orthogonal to the main diagonal of A (the one produced by np.diag(A)).
The application/reason for this question is in my previous question (in the solution part at the bottom of that question): Constructing a 2D grid from potentially incomplete list of candidates
Here is a way that avoids Python for-loops.
First, let's look at our addition tables:
import numpy as np
grid_shape = (4,5)
N = np.prod(grid_shape)
y = np.add.outer(np.arange(grid_shape[0]),np.arange(grid_shape[1]))
print(y)
# [[0 1 2 3 4]
# [1 2 3 4 5]
# [2 3 4 5 6]
# [3 4 5 6 7]]
The key idea is that if we visit the sums in the addition table in order, we would be iterating through the array in the desired order.
We can find out the indices associated with that order using np.argsort:
idx = np.argsort(y.ravel(), kind='stable')  # stable sort keeps equal sums in row-major order
print(idx)
# [ 0 1 5 2 6 10 3 7 11 15 4 8 12 16 9 13 17 14 18 19]
idx is golden. It is essentially everything you need to iterate through any 2D array of shape (4,5), since a 2D array is just a 1D array reshaped.
If your ultimate goal is to generate the array A that you show above at the end of your post, then you could use argsort again:
print(np.argsort(idx).reshape(grid_shape[0],-1))
# [[ 0 1 3 6 10]
# [ 2 4 7 11 14]
# [ 5 8 12 15 17]
# [ 9 13 16 18 19]]
Or, alternatively, if you need to assign other values to A, perhaps this would be more useful:
A = np.zeros(grid_shape)
A1d = A.ravel()
A1d[idx] = np.arange(N) # you can change np.arange(N) to any 1D array of shape (N,)
print(A)
# [[ 0. 1. 3. 6. 10.]
# [ 2. 4. 7. 11. 14.]
# [ 5. 8. 12. 15. 17.]
# [ 9. 13. 16. 18. 19.]]
I know you asked for a way to iterate through your array, but I wanted to show the above because generating arrays through whole-array assignment or numpy function calls (like np.argsort) as done above will probably be faster than using a Python loop. But if you need to use a Python loop, then:
for i, j in enumerate(idx):
    A1d[j] = i
print(A)
# [[ 0. 1. 3. 6. 10.]
# [ 2. 4. 7. 11. 14.]
# [ 5. 8. 12. 15. 17.]
# [ 9. 13. 16. 18. 19.]]
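If what you need in the loop is the (row, column) pair rather than the flat index, `np.unravel_index` can convert the sorted flat indices back. This is a small sketch along the lines of the approach above; the names `sums` and `visited` are mine, and a stable sort is requested so that ties within a diagonal come out in a predictable order.

```python
import numpy as np

grid_shape = (4, 5)
sums = np.add.outer(np.arange(grid_shape[0]), np.arange(grid_shape[1]))
idx = np.argsort(sums.ravel(), kind="stable")       # flat indices, diagonal by diagonal
rows, cols = np.unravel_index(idx, grid_shape)      # back to 2D coordinates

visited = list(zip(rows.tolist(), cols.tolist()))
print(visited[:6])
# [(0, 0), (0, 1), (1, 0), (0, 2), (1, 1), (2, 0)]
```

Iterating over `visited` then walks any array of shape (4, 5) along the diagonals, starting at (0, 0).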
>>> D
array([[ 0, 1, 2, 3, 4],
[ 5, 6, 7, 8, 9],
[10, 11, 12, 13, 14],
[15, 16, 17, 18, 19]])
>>> D[::-1].diagonal(offset=1)
array([16, 12, 8, 4])
>>> D[::-1].diagonal(offset=-3)
array([0])
>>> np.hstack([D[::-1].diagonal(offset=-x) for x in np.arange(-4,4)])[::-1]
array([ 0, 1, 5, 2, 6, 10, 3, 7, 11, 15, 4, 8, 12, 16, 9, 13, 17,
14, 18, 19])
Simpler as long as it is not a large matrix.
I'm not sure if this is what you really want, but maybe:
>>> import numpy as np
>>> ar = np.random.random((4,4))
>>> ar
array([[ 0.04844116, 0.10543146, 0.30506354, 0.4813217 ],
[ 0.59962641, 0.44428831, 0.16629692, 0.65330539],
[ 0.61854927, 0.6385717 , 0.71615447, 0.13172049],
[ 0.05001291, 0.41577457, 0.5579213 , 0.7791656 ]])
>>> ar.diagonal()
array([ 0.04844116, 0.44428831, 0.71615447, 0.7791656 ])
>>> ar[::-1].diagonal()
array([ 0.05001291, 0.6385717 , 0.16629692, 0.4813217 ])
Edit
As a general solution, for arbitrarily shaped arrays, you can use
import numpy as np
shape = tuple([np.random.randint(3,10) for i in range(2)])
ar = np.arange(np.prod(shape)).reshape(shape)
# the upper bound is ar.shape[1] so that the last diagonal (the final corner element) is included
out = np.hstack([ar[::-1].diagonal(offset=x)
                 for x in np.arange(-ar.shape[0]+1, ar.shape[1])])
print(ar)
print(out)
giving, for example
[[ 0 1 2 3 4]
[ 5 6 7 8 9]
[10 11 12 13 14]
[15 16 17 18 19]
[20 21 22 23 24]]
[ 0 5 1 10 6 2 15 11 7 3 20 16 12 8 4 21 17 13 9 22 18 14 23 19 24]