I'm trying to understand some Python code, and one specific line has troubled me a bit:
mean = np.average(data[:,index])
I understand that this calculates the average of data, which is declared earlier, but what does [:,index] indicate?
I apologise if this question is a duplicate, but please link me to a solution before you mark it down. This is my first day with Python, so please excuse my ignorance. I appreciate any kind advice!
Below is part of the original code:
data = np.genfromtxt(args.inputfile)
def doBlocking(data,index):
    ndata = data.shape[0]
    ncols = data.shape[1]-1
    #things unimportant
    mean = np.average(data[:,index])
    #more unimportance
This is called slicing. In your case, it calculates the average of one column of a 2-dimensional array: the column whose number is stored in the variable named "index".
In this case data is a two-dimensional numpy.array. NumPy supports slicing similar to that of Matlab:
In [1]: import numpy as np
In [2]: data = np.arange(15)
In [3]: data
Out[3]: array([ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14])
In [4]: data = data.reshape([5,3])
In [5]: data
Out[5]:
array([[ 0,  1,  2],
       [ 3,  4,  5],
       [ 6,  7,  8],
       [ 9, 10, 11],
       [12, 13, 14]])
In [6]: data[:, 1]
Out[6]: array([ 1, 4, 7, 10, 13])
As you can see, it selects the second column (column index 1, since counting starts at 0).
Your code above will get the mean of column index. It basically says "compute the mean of the data in every row of column index".
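To make the answers above concrete, here is a minimal sketch. The 5x3 array and the value index = 1 are made up for illustration; in the original code, data comes from np.genfromtxt and index is passed in by the caller.

```python
import numpy as np

# Hypothetical stand-in for the array loaded with np.genfromtxt.
data = np.arange(15, dtype=float).reshape(5, 3)
index = 1  # assumed column number, for illustration only

column = data[:, index]    # ":" keeps every row; "index" picks one column
mean = np.average(column)  # average of that single column

print(column.shape)  # (5,)
print(mean)          # 7.0
```

The same number should come out of data.mean(axis=0)[index], which averages every column at once and then picks one.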
I was going through the documentation of the NumPy module and came across something like: "If a is an N-D array and b is an M-D array (where M>=2), it is a sum product over the last axis of a and the second-to-last axis of b". I'm a beginner with NumPy and thought there were only 2 axes, 0 (rows) and 1 (columns). Could someone please explain what this means? If I have an N-D array such as n=np.arange(16).reshape(4,4), which is the second-to-last axis?
When you first think of it as a simple data structure, you can think of a 2-dimensional array as rows and columns. But an array can have any number of axes, one per dimension: axis 0 is the first (outermost) dimension, axis 1 is the second, and so on.
In other words, count the axes from the shape: an array with N dimensions has axes 0 to N-1, so the last axis is N-1 and the second-to-last axis is N-2.
np.arange(16).reshape(4,4)
array([[ 0, 1, 2, 3],
[ 4, 5, 6, 7],
[ 8, 9, 10, 11],
[12, 13, 14, 15]])
Here, we get a 2-dimensional array of shape (4, 4): 4 entries along axis 0, 4 along axis 1, and 16 values in total.
Below, we obtain another 2-dimensional array containing the same 16 values, this time with shape (8, 2).
np.arange(16).reshape(8,2)
array([[ 0, 1],
[ 2, 3],
[ 4, 5],
[ 6, 7],
[ 8, 9],
[10, 11],
[12, 13],
[14, 15]])
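As for the question you asked:
a=np.arange(16).reshape(4,4)
print(a[:,-2])
[ 2  6 10 14]
The above expression returns the second-to-last column, that is, index -2 along the last axis (axis 1). For this 2-dimensional array the second-to-last axis itself is axis 0, the rows.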
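The sentence quoted from the documentation is easier to follow with arrays that have more than two axes. The sketch below uses made-up shapes (2, 3, 4) and (2, 4, 5); np.dot then sums over axis 2 of a (its last axis) and axis 1 of b (its second-to-last axis), and np.tensordot with those axes spelled out gives the same contraction.

```python
import numpy as np

a = np.arange(24).reshape(2, 3, 4)  # axes 0, 1, 2; the last axis has length 4
b = np.arange(40).reshape(2, 4, 5)  # the second-to-last axis (axis 1) has length 4

c = np.dot(a, b)                         # contracts a's axis 2 with b's axis 1
d = np.tensordot(a, b, axes=([2], [1]))  # same contraction, axes named explicitly

print(c.shape)               # (2, 3, 2, 5)
print(np.array_equal(c, d))  # True

# For the 2-D array from the question, the second-to-last axis is axis 0.
n = np.arange(16).reshape(4, 4)
print(n.ndim - 2)            # 0
```

The two contracted axes must have the same length (4 here); the remaining axes of a and b make up the shape of the result.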
I have a problem, which seems to be easy but it is causing me a lot of headache.
The problem is that I'm programming in Python (I'm relatively new to it) and I'm looking for an equivalent of MATLAB's max (min) function for a matrix, but using NumPy.
What I want to do is to get the minimum value and its index in a matrix.
To keep it as simple as possible with an example, let's say this is the matrix:
arr2D = np.array([[11, 12, 13, 34],
                  [14, 15, 16, 3],
                  [17, 15, 11, 1],
                  [7, 5, 11, 4],
                  [1, 12, 4, 4],
                  [12, 14, 15,-3]])
In MATLAB I would do:
[local_max, index] = min(arr2D)
and I would get the min value and its index for every column in the matrix.
Trying to repeat the same in python (after looking here and here) with the following code:
print(np.where(arr2D == np.amin(arr2D, axis = 0))) # axis 0 is for columns
I get the following output:
(array([3, 4, 4, 5]), array([1, 0, 2, 3]))
which is not really what I want to get!
The expected output should be:
[1, 4] # Meaning the minimum value is 1 and it is in row 4 for the first column
[5, 3] # Meaning the minimum value is 5 and it is in row 3 for the second column
[4, 4] # Meaning the minimum value is 4 and it is in row 4 for the third column
[-3, 5] # Meaning the minimum value is -3 and it is in row 5 for the last column
I cannot use the output I get by:
print(np.where(arr2D == np.amin(arr2D, axis = 0)))
Either I don't understand the output, or that's not the right way to get the equivalent of MATLAB's max (min) function.
Could you help me?
UPDATE:
I forgot to say that the matrix holds floats, not integers. I used integers just for the example.
np.amin or np.min returns the min values along an axis
np.amin(arr2D, axis=0)
Out:
array([ 1, 5, 4, -3])
np.argmin returns the indices
np.argmin(arr2D, axis=0)
Out:
array([4, 3, 4, 5])
To get the desired output you can use np.vstack and transpose the array
np.vstack([np.amin(arr2D, axis=0), np.argmin(arr2D, axis=0)]).T
Out:
array([[ 1, 4],
[ 5, 3],
[ 4, 4],
[-3, 5]])
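Because the real matrix holds floats, stacking values and indices into one array turns the indices into floats as well. A possible alternative, sketched here with a hypothetical helper name min_per_column, is to return the two arrays separately, much like MATLAB's two-output min. Rows are counted from 0, unlike in MATLAB.

```python
import numpy as np

def min_per_column(arr):
    """Return (minimum values, row indices) for each column of a 2-D array."""
    idx = np.argmin(arr, axis=0)              # row of the minimum in each column
    vals = arr[idx, np.arange(arr.shape[1])]  # pick arr[idx[j], j] for every column j
    return vals, idx

arr2D = np.array([[11, 12, 13, 34],
                  [14, 15, 16, 3],
                  [17, 15, 11, 1],
                  [7, 5, 11, 4],
                  [1, 12, 4, 4],
                  [12, 14, 15, -3]], dtype=float)

vals, idx = min_per_column(arr2D)
print(vals)  # [ 1.  5.  4. -3.]
print(idx)   # [4 3 4 5]
```

When a column contains its minimum more than once, np.argmin reports the first occurrence.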
Use this code (you can simply make a function out of it):
import numpy as np
arr2D = np.array([[11, 12, 13, 34],
                  [14, 15, 16, 3],
                  [17, 15, 11, 1],
                  [7, 5, 11, 4],
                  [1, 12, 4, 4],
                  [12, 14, 15,-3]])
flat = arr2D.flatten()
# position of the overall minimum in the flattened (row-major) array
minIndex = flat.tolist().index(min(flat))
# results: the number of columns converts the flat position back to row and column
rowIndex = minIndex // arr2D.shape[1]
columnIndex = minIndex % arr2D.shape[1]
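If only the position of the single smallest element is needed, a shorter route that avoids the list conversion might be np.argmin combined with np.unravel_index, which converts a flat index back into (row, column). This is a sketch on the example matrix and finds one overall minimum, not one minimum per column.

```python
import numpy as np

arr2D = np.array([[11, 12, 13, 34],
                  [14, 15, 16, 3],
                  [17, 15, 11, 1],
                  [7, 5, 11, 4],
                  [1, 12, 4, 4],
                  [12, 14, 15, -3]])

flat_index = np.argmin(arr2D)  # index into the flattened array
row, col = np.unravel_index(flat_index, arr2D.shape)

print(row, col)         # 5 3
print(arr2D[row, col])  # -3
```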
I'm writing code that includes the algorithm to find local maximum/minimum values in array. But I failed to find the proper function.
At first, I used argrelextrema in scipy.signal.
b = [6, 1, 3, 5, 5, 3, 1, 2, 2, 3, 2, 1, 1, 9, 10, 10, 9, 8, 7, 7, 13, 10]
scipy.signal.argrelextrema(np.array(b), np.greater)
scipy.signal.argrelextrema(np.array(b), np.greater_equal)
scipy.signal.argrelextrema(np.array(b), np.greater_equal, order=2)
The result is
(array([ 9, 20], dtype=int64),)
(array([ 0, 3, 4, 7, 9, 14, 15, 20], dtype=int64),)
(array([ 0, 3, 4, 9, 14, 15, 20], dtype=int64),)
The first one didn't catch b[3] (or b[4]), so I switched to the second one, using np.greater_equal. In that case, however, the first value b[0] is also treated as a local maximum, and the value 2 at b[7] is included. With the third one I could discard b[7], but order=2 still has a problem when the data looks like [1, 3, 1, 4, 1] (it can't catch the 3).
My expected result is
[3(or 4), 9, 14(or 15), 20]
I want to catch only one of b[3] and b[4] (they have the same value), and I want the problems with argrelextrema mentioned above to be solved. The code below succeeded:
scipy.signal.find_peaks(b)
The result is [3, 9, 14, 20].
The code I'm writing works with pairs of local maxima and local minima, so I want to find the local minima in the same way. Is there a function like scipy.signal.find_peaks for finding local minima?
You could simply apply find_peaks to the negated version of your array. Note that find_peaks returns a tuple of (indices, properties), so unpack the first element:
from scipy.signal import find_peaks
min_idx, _ = find_peaks([-x for x in b])
Even more convenient when using numpy arrays:
import numpy as np
b = np.array(b)
min_idx, _ = find_peaks(-b)
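As a sketch of how the maxima and minima could be collected together, the snippet below runs find_peaks on b and on -b, using the data from the question. Only the index array from each returned tuple is kept. The first and last samples have just one neighbour and are not reported as peaks, so an extremum at either end of the data will not appear.

```python
import numpy as np
from scipy.signal import find_peaks

b = np.array([6, 1, 3, 5, 5, 3, 1, 2, 2, 3, 2, 1, 1,
              9, 10, 10, 9, 8, 7, 7, 13, 10])

max_idx, _ = find_peaks(b)   # local maxima
min_idx, _ = find_peaks(-b)  # local minima: peaks of the negated signal

print(max_idx)  # [ 3  9 14 20]
print(min_idx)  # [ 1  6 11 18]
```

Flat stretches such as b[11] and b[12] are reported once, at the middle of the plateau (rounded down), which matches the "only one among equal values" behaviour asked for.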
For a 2*N x 2*N array x, I'd like to swap rows [0:N] with rows [N:2*N]. Specifically, my question is whether there is a 'built-in' way of 'adding / joining' slice objects to achieve this, i.e. something like:
x[N:2*N + 0:N,:]
although the preceding does something different.
Certainly I could do things like vstack((x[N:2*N,:],x[0:N,:])), which is not really what I'm looking for, or x[[i for i in range(N)]+[i for i in range(N,2*N)],:], which probably is slow.
I think you're looking for numpy.r_, which "translates slice objects to concatenation along the first axis". It allows you to perform more complex slices along the first axis - you can concatenate multiple slices with commas: np.r_[5:10, 100:200:10, 15, 20, 0:5].
For example:
>>> import numpy as np
>>> N = 2
>>> x = np.arange(16).reshape(4, 4)
>>> x[np.r_[N:2*N, 0:N]]
array([[ 8, 9, 10, 11],
[12, 13, 14, 15],
[ 0, 1, 2, 3],
[ 4, 5, 6, 7]])
And in this specific case, you could also just np.roll it:
>>> np.roll(x, N, axis=0)
array([[ 8, 9, 10, 11],
[12, 13, 14, 15],
[ 0, 1, 2, 3],
[ 4, 5, 6, 7]])
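To see what np.r_ actually builds, the sketch below prints the index array and checks the result against np.roll. Both approaches return a new array rather than a view, so x itself is left unchanged; the sizes are the same small ones used in the example above.

```python
import numpy as np

N = 2
x = np.arange(16).reshape(4, 4)

idx = np.r_[N:2*N, 0:N]  # the slices are expanded and concatenated
print(idx)               # [2 3 0 1]

swapped = x[idx]                 # integer-array indexing returns a copy
rolled = np.roll(x, N, axis=0)   # shifts the rows by N, wrapping around

print(np.array_equal(swapped, rolled))  # True
print(np.shares_memory(swapped, x))     # False
```

np.roll only matches here because the two blocks have equal size; np.r_ is the more general tool when the slices are arbitrary.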
Given a numpy array such as:
x = array([[ 0,  1,  2,  3],
           [ 4,  5,  6,  7],
           [ 8,  9, 10, 11],
           [12, 13, 14, 15]])
How do I form a new array composed of the first and third columns?
To extract the first and third columns from the array use the following syntax:
x[:,[0,2]]
This means: take all rows, selecting only columns 0 and 2. Note that indexing in NumPy arrays starts at zero.
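A short sketch of the same selection, with one detail that is easy to miss: indexing with a list of columns returns a copy, whereas a plain slice such as x[:, 0:3:2] (which happens to pick the same two columns here) returns a view onto the original data.

```python
import numpy as np

x = np.arange(16).reshape(4, 4)

cols = x[:, [0, 2]]  # all rows, columns 0 and 2 (copy)
view = x[:, 0:3:2]   # same columns via a slice (view)

print(cols.shape)                 # (4, 2)
print(np.shares_memory(cols, x))  # False
print(np.shares_memory(view, x))  # True
```

The list form works for any set of columns in any order, for example x[:, [2, 0]], while the slice form only covers evenly spaced columns.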