Mean subtraction of patches in python nympy scipy - python

I have a numpy array of 3 dimension, it's a grid of patches of 8x8 images.
What is the best way to subtract from each patch it's average, in other words each patch has a unique mean and I want to subtract it. I tried the following with no success obviously because both arrays are not equal in shape
patches=- patches.mean(axis = 2).mean(axis = 1)
I thought of using the repeat function, something like:
patches=- np.repeat(np.repeat(patches.mean(axis =2).mean(axis =1).reshape((n_patches, 8, 8)), 1, 1))
Put I think that following this route would lead to an inefficient solution. Any thoughts or solution on this?

import numpy as np
a = np.random.rand(10,8,8)
mean = a.mean(axis=2).mean(axis=1)
b = a - mean[:, np.newaxis, np.newaxis] # reshape the mean as (10, 1, 1)

I think you are looking for broadcasting:
http://docs.scipy.org/doc/numpy/user/basics.broadcasting.html

Related

Repmat operation in python

I want to calculate the mean of a 3D array along two axes and subtract this mean from the array.
In Matlab I use the repmat function to achieve this as follows
% A is an array of size 100x50x100
mean_A = mean(mean(A,3),1); % mean_A is 1D of length 50
Am = repmat(mean_A,[100,1,100]) % Am is 3D 100x50x100
flc_A = A - Am % flc_A is 3D 100x50x100
Now, I am trying to do the same with python.
mean_A = numpy.mean(numpy.mean(A,axis=2),axis=0);
gives me the 1D array. However, I cannot find a way to copy this to form a 3D array using numpy.tile().
Am I missing something or is there another way to do this in python?
You could set keepdims to True in both cases so the resulting shape is broadcastable and use np.broadcast_to to broadcast to the shape of A:
np.broadcast_to(np.mean(np.mean(A,2,keepdims=True),axis=0,keepdims=True), A.shape)
Note that you can also specify a tuple of axes along which to take the successive means:
np.broadcast_to(np.mean(A,axis=tuple([2,0]), keepdims=True), A.shape)
numpy.tile is not the same with Matlab repmat. You could refer to this question. However, there is an easy way to repeat the work you have done in Matlab. And you don't really have to understand how numpy.tile works in Python.
import numpy as np
A = np.random.rand(100, 50, 100)
# keep the dims of the array when calculating mean values
B = np.mean(A, axis=2, keepdims=True)
C = np.mean(B, axis=0, keepdims=True) # now the shape of C is (1, 50, 1)
# then simply duplicate C in the first and the third dimensions
D = np.repeat(C, 100, axis=0)
D = np.repeat(D, 100, axis=2)
D is the 3D array you want.

Selecting last column of a Numpy array while maintaining then umber of dimensions? [duplicate]

I'm using numpy and want to index a row without losing the dimension information.
import numpy as np
X = np.zeros((100,10))
X.shape # >> (100, 10)
xslice = X[10,:]
xslice.shape # >> (10,)
In this example xslice is now 1 dimension, but I want it to be (1,10).
In R, I would use X[10,:,drop=F]. Is there something similar in numpy. I couldn't find it in the documentation and didn't see a similar question asked.
Thanks!
Another solution is to do
X[[10],:]
or
I = array([10])
X[I,:]
The dimensionality of an array is preserved when indexing is performed by a list (or an array) of indexes. This is nice because it leaves you with the choice between keeping the dimension and squeezing.
It's probably easiest to do x[None, 10, :] or equivalently (but more readable) x[np.newaxis, 10, :]. None or np.newaxis increases the dimension of the array by 1, so that you're back to the original after the slicing eliminates a dimension.
As far as why it's not the default, personally, I find that constantly having arrays with singleton dimensions gets annoying very quickly. I'd guess the numpy devs felt the same way.
Also, numpy handle broadcasting arrays very well, so there's usually little reason to retain the dimension of the array the slice came from. If you did, then things like:
a = np.zeros((100,100,10))
b = np.zeros(100,10)
a[0,:,:] = b
either wouldn't work or would be much more difficult to implement.
(Or at least that's my guess at the numpy dev's reasoning behind dropping dimension info when slicing)
I found a few reasonable solutions.
1) use numpy.take(X,[10],0)
2) use this strange indexing X[10:11:, :]
Ideally, this should be the default. I never understood why dimensions are ever dropped. But that's a discussion for numpy...
Here's an alternative I like better. Instead of indexing with a single number, index with a range. That is, use X[10:11,:]. (Note that 10:11 does not include 11).
import numpy as np
X = np.zeros((100,10))
X.shape # >> (100, 10)
xslice = X[10:11,:]
xslice.shape # >> (1,10)
This makes it easy to understand with more dimensions too, no None juggling and figuring out which axis to use which index. Also no need to do extra bookkeeping regarding array size, just i:i+1 for any i that you would have used in regular indexing.
b = np.ones((2, 3, 4))
b.shape # >> (2, 3, 4)
b[1:2,:,:].shape # >> (1, 3, 4)
b[:, 2:3, :].shape . # >> (2, 1, 4)
To add to the solution involving indexing by lists or arrays by gnebehay, it is also possible to use tuples:
X[(10,),:]
This is especially annoying if you're indexing by an array that might be length 1 at runtime. For that case, there's np.ix_:
some_array[np.ix_(row_index,column_index)]
I've been using np.reshape to achieve the same as shown below
import numpy as np
X = np.zeros((100,10))
X.shape # >> (100, 10)
xslice = X[10,:].reshape(1, -1)
xslice.shape # >> (1, 10)

Mean over multiple axis in NumPy

I Want to write the code below as Pythonic way, applying mean over two axis. What the best way to do this?
import numpy as np
m = np.random.rand(30, 10, 10)
m_mean = np.zeros((30, 1))
for j in range(30):
m_mean[j, 0] = m[j, :, :].mean()
If you have a sufficiently recent NumPy, you can do
m_mean = m.mean(axis=(1, 2))
I believe this was introduced in 1.7, though I'm not sure. The documentation was only updated to reflect this in 1.10, but it worked earlier than that.
If your NumPy is too old, you can take the mean a bit more manually:
m_mean = m.sum(axis=2).sum(axis=1) / np.prod(m.shape[1:3])
These will both produce 1-dimensional results. If you really want that extra length-1 axis, you can do something like m_mean = m_mean[:, np.newaxis] to put the extra axis there.
You can also use the numpy.mean() ufunc and pass the output array as an argument to out= as in:
np.mean(m, axis=(1, 2), out=m_mean)

Efficient two dimensional numpy array statistics

I have many 100x100 grids, is there an efficient way using numpy to calculate the median for every grid point and return just one 100x100 grid with the median values? Presently, I'm using a for loop to run through each grid point, calculating the median and then combining them into one grid at the end. I'm sure there's a better way to do this using numpy. Any help would be appreciated! Thanks!
Create as 100x100xN array (or stack together if that's not possible) and use np.median with the correct axis to do it in one go:
import numpy as np
a = np.random.rand(100,100)
b = np.random.rand(100,100)
c = np.random.rand(100,100)
d = np.dstack((a,b,c))
result = np.median(d,axis=2)
How many grids are there?
One option would be to create a 3D array that is 100x100xnumGrids and compute the median across the 3rd dimension.
use axis parameter of median:
import numpy as np
data = np.random.rand(100, 5, 5)
print np.median(data, axis=0)
print np.median(data[:, 0, 0])
print np.median(data[:, 1, 0])

Numpy index slice without losing dimension information

I'm using numpy and want to index a row without losing the dimension information.
import numpy as np
X = np.zeros((100,10))
X.shape # >> (100, 10)
xslice = X[10,:]
xslice.shape # >> (10,)
In this example xslice is now 1 dimension, but I want it to be (1,10).
In R, I would use X[10,:,drop=F]. Is there something similar in numpy. I couldn't find it in the documentation and didn't see a similar question asked.
Thanks!
Another solution is to do
X[[10],:]
or
I = array([10])
X[I,:]
The dimensionality of an array is preserved when indexing is performed by a list (or an array) of indexes. This is nice because it leaves you with the choice between keeping the dimension and squeezing.
It's probably easiest to do x[None, 10, :] or equivalently (but more readable) x[np.newaxis, 10, :]. None or np.newaxis increases the dimension of the array by 1, so that you're back to the original after the slicing eliminates a dimension.
As far as why it's not the default, personally, I find that constantly having arrays with singleton dimensions gets annoying very quickly. I'd guess the numpy devs felt the same way.
Also, numpy handle broadcasting arrays very well, so there's usually little reason to retain the dimension of the array the slice came from. If you did, then things like:
a = np.zeros((100,100,10))
b = np.zeros(100,10)
a[0,:,:] = b
either wouldn't work or would be much more difficult to implement.
(Or at least that's my guess at the numpy dev's reasoning behind dropping dimension info when slicing)
I found a few reasonable solutions.
1) use numpy.take(X,[10],0)
2) use this strange indexing X[10:11:, :]
Ideally, this should be the default. I never understood why dimensions are ever dropped. But that's a discussion for numpy...
Here's an alternative I like better. Instead of indexing with a single number, index with a range. That is, use X[10:11,:]. (Note that 10:11 does not include 11).
import numpy as np
X = np.zeros((100,10))
X.shape # >> (100, 10)
xslice = X[10:11,:]
xslice.shape # >> (1,10)
This makes it easy to understand with more dimensions too, no None juggling and figuring out which axis to use which index. Also no need to do extra bookkeeping regarding array size, just i:i+1 for any i that you would have used in regular indexing.
b = np.ones((2, 3, 4))
b.shape # >> (2, 3, 4)
b[1:2,:,:].shape # >> (1, 3, 4)
b[:, 2:3, :].shape . # >> (2, 1, 4)
To add to the solution involving indexing by lists or arrays by gnebehay, it is also possible to use tuples:
X[(10,),:]
This is especially annoying if you're indexing by an array that might be length 1 at runtime. For that case, there's np.ix_:
some_array[np.ix_(row_index,column_index)]
I've been using np.reshape to achieve the same as shown below
import numpy as np
X = np.zeros((100,10))
X.shape # >> (100, 10)
xslice = X[10,:].reshape(1, -1)
xslice.shape # >> (1, 10)

Categories

Resources