I have a segmentation map with 10 classes: a NumPy array of shape (m,n,1) in which every element is a number from 1 to 10 specifying the class that the pixel belongs to. I want to convert it to an array of shape (m,n,10) where each channel is a mask for the elements of that specific class. I can do it using a for loop like this:
for i in range(10):
    mask[:,:,i] = (seg_map==i+1)[:,:,0]
but I need a faster way to do this. The for loop takes too much time. Is there any built-in function that can outperform the for loop?
Thanks in advance.
One approach:
import numpy as np
np.random.seed(42)
# toy data
data = np.random.randint(0, 10, 20).reshape((5, 4, 1))
# https://stackoverflow.com/a/37323404/4001592
n_values = 10
# index the rows of the identity matrix by class label to one-hot encode
encoded = np.eye(n_values)[data.ravel()].reshape((5, 4, n_values))
One way to verify that the output is correct is to check that the argmax of the one-hot encoding recovers the original labels, as below:
match = np.allclose(data.reshape(5, 4), encoded.argmax(-1))
print(match)
Output
True
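The toy data above uses labels 0 to 9, while the question uses labels 1 to 10. A minimal sketch of an alternative that handles the 1-based labels directly is to compare the map against a vector of class labels and let broadcasting build all ten masks at once. The names `seg_map`, `classes` and `mask` are illustrative, and the random map only stands in for real data:

```python
import numpy as np

# toy segmentation map with labels 1..10 and shape (m, n, 1)
rng = np.random.default_rng(0)
seg_map = rng.integers(1, 11, size=(5, 4, 1))

# (m, n, 1) == (10,) broadcasts to (m, n, 10): one boolean mask per class
classes = np.arange(1, 11)
mask = seg_map == classes

print(mask.shape)
```

Each pixel is `True` in exactly one channel, and `mask.astype(np.uint8)` gives a numeric mask if booleans are not wanted.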
Is there a way to generate a bootstrap sample on an N-dimensional array? I am limited to using numpy==1.19.4
I have already tried using a for loop on the other dimensions to no avail, but the following works for 1-dimensional arrays.
import numpy as np
# Set random state and number of resamples
random_state = 42
np.random.seed(random_state)
n_resamples = 9999
# Generate data
data_1d = np.arange(2, 3, 0.1)
data_nd = np.random.default_rng(42).random((2,3,2))
data = data_1d.copy()
# Resample the data with replacement, computing the test statistic for each set of resamples
bs_samples = [np.std(np.random.choice(data, size=len(data))) for _ in range(n_resamples)]
If I understand your problem correctly, this is the method I usually use.
Suppose you have this multi-dimensional array:
data_nd = np.random.rand(100, 3, 2)
data_nd.shape #(100, 3, 2)
You can draw bootstrap samples from it in this way:
n_resamples = 99
data_nd[np.random.randint(len(data_nd), size=len(data_nd)*n_resamples)].reshape(n_resamples, *data_nd.shape).shape
What I'm doing here is randomly drawing indices (randint) with replacement and then reshaping the sample to obtain 99 bootstrapped datasets with the same dimensions as the original one.
Note that with this procedure the "elements" being sampled are the sub-arrays along the first axis, so each sampled element has shape (3,2).
I hope that is clear, but if you have any doubts, please let me know.
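As a sketch of how this connects back to the statistic in the question, the indices can be drawn as a 2-D array so that no reshape is needed, and the standard deviation is then taken over the resampled axis. This assumes that the first axis holds the observations and uses `default_rng`, which is available in NumPy 1.19; the names `idx`, `resamples` and `bs_std` are illustrative:

```python
import numpy as np

rng = np.random.default_rng(42)
data_nd = rng.random((100, 3, 2))
n_resamples = 99

# one row of indices per resample, drawn with replacement along axis 0
idx = rng.integers(0, len(data_nd), size=(n_resamples, len(data_nd)))

# fancy indexing yields shape (n_resamples, 100, 3, 2)
resamples = data_nd[idx]

# statistic per resample, computed over the resampled axis
bs_std = resamples.std(axis=1)
print(resamples.shape, bs_std.shape)
```

With 9999 resamples the intermediate array can get large, so for big inputs it may be preferable to process the resamples in chunks.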
I have a list of NumPy arrays, and I want to apply the rot90 and flip functions to it randomly, so that in the end I have a list where some arrays are left as they are and some are modified (with those two functions).
When I pass that list of arrays directly to numpy.random.choice, it gives me the following error: ValueError: a must be 1-dimensional.
Thanks in advance.
One approach is to create a population of functions and, using random.choice, randomly pick the one to apply to each image:
import random
import numpy as np
# for reproducibility
random.seed(42)
np.random.seed(42)
# toy data representing the list of images
images = [np.random.randint(255, size=(128, 128)) for _ in range(10)]
functions = [lambda x: x, np.rot90, np.flip]
# pick a function at random and apply it to each image
res = [random.choice(functions)(image) for image in images]
You can just sample indices and apply the transformation to the array at each sampled index. Here is an example of the basic idea:
import numpy as np
# generate some random list of arrays
l = [np.random.randint(0,10,(4,4)) for _ in range(10)]
# sample indices and apply rotation and flip
indices = np.random.choice(np.arange(len(l)),int(len(l)/2),replace=False)
new_l = [np.flip(np.rot90(l[i])) if i in indices else l[i] for i in range(len(l))]
Why don't you sample a list of indices that need to be modified?
In the following example, I have set:
transformations: a list of functions that can be applied as transformations
apply_only_once: whether a row can be transformed at most once (apply_only_once=True) or several times (apply_only_once=False)
n_lines_to_modify: the number of rows to modify. Clearly, if apply_only_once=True, n_lines_to_modify must be less than or equal to the number of rows in the array; if apply_only_once=False, n_lines_to_modify is not constrained, because multiple transformations can be applied to the same row (corner case: all the transformations are applied to a single row!)
arrays: just a test input
In code:
import random
import numpy as np
transformations = [lambda x: x**2, lambda x: x+2]
apply_only_once = True
n_lines_to_modify = 2
arrays = np.array([np.array([1,2,3]), np.array([1,2,3]), np.array([3,4,5])])
if apply_only_once:
    to_be_modified = random.sample(range(len(arrays)), n_lines_to_modify)
else:
    to_be_modified = [random.choice(range(len(arrays))) for _ in range(n_lines_to_modify)]
for i in to_be_modified:
    arrays[i] = random.choice(transformations)(arrays[i])
print(arrays)
I am trying to get a 2d array by randomly generating its rows and appending them:
import numpy as np
my_nums = np.array([])
for i in range(100):
    x = np.random.rand(2, 1)
    my_nums = np.append(my_nums, np.array(x))
But instead of what I want, I get a 1d array.
What is wrong?
Transposing x did not help either.
You could do this by using np.append(axis=0) or np.vstack. This however requires the rows appended to have the same length as the rows already in the array.
You cannot use the same code to append a row with two values to an empty array, and to append a row to an already existing 2D array: numpy will throw a
ValueError: all the input arrays must have same number of dimensions.
You could initialize my_nums to work around this:
my_nums = np.random.rand(1, 2)
for i in range(99):
    x = np.random.rand(1, 2)
    my_nums = np.append(my_nums, x, axis=0)
Note the decrease in the range by one due to the initialization row. Also note that I changed the dimensions to (1, 2) to get actual row vectors.
Of course, it is much easier to create the array in its final shape directly than to append row by row:
my_nums = np.random.rand(100, 2)
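If the rows really do have to be produced one at a time (for example because each one comes from a separate computation), a common pattern is to collect them in a Python list and stack once at the end, which avoids copying the whole array on every append. A minimal sketch, where `rows` is an illustrative name and the random rows stand in for whatever actually generates them:

```python
import numpy as np

# collect each row as a 1-D array of length 2
rows = []
for _ in range(100):
    rows.append(np.random.rand(2))

# stack all rows in one call; the result has shape (100, 2)
my_nums = np.vstack(rows)
print(my_nums.shape)
```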
So I am a little new to using matrices in Python, and I am looking for the best way to perform the following operation.
Say I have a vector of an arbitrary length, like this:
data = np.array(range(255))
And I want to fit this data inside a matrix with a shape like so:
concept = np.zeros((3, 9, 6))
Now, obviously this will not fit, and results in an error:
ValueError: cannot reshape array of size 255 into shape (3,9,6)
What would be the best way to go about fitting as much of the data vector inside the first matrix with the shape (3, 9, 6) while making sure any "overflow" is stored in a second (or third, fourth, etc.) matrix?
Does this make sense?
Basically, I want to be able to take a vector of any size and produce an arbitrary amount of matrices that have the data shaped according to the 3, 9, 6 dimensions.
Thank you for your help.
import numpy as np
def each_matrix(a, dims):
    size = dims.prod()
    # pad with size-1 zeros so that the floor division below rounds up
    padded = np.concatenate([a, np.zeros(size - 1)])
    # integer division: range() needs an int in Python 3
    for i in range(len(padded) // size):
        yield padded[i*size : (i+1)*size].reshape(dims)
for matrix in each_matrix(np.array(range(255)),
                          dims=np.array([3, 9, 6])):
    print(str(matrix) + '\n\n-------\n')
This will fill the last matrix with zeros.
Here is a rough solution to your problem.
import numpy as np
def split_padded(a, n):
    # zeros needed to make len(a) a multiple of n (0 if it already is)
    padding = -len(a) % n
    numOfsplit = (len(a) + padding) // n
    print(padding, numOfsplit)
    return np.split(np.concatenate((a, np.zeros(padding))), numOfsplit)
data = np.array(range(255))
splitnum = 3*9*6
splitdata = split_padded(data, splitnum)
for mat in splitdata:
    print(mat.reshape(3, 9, 6))
It is very rough and only works for 1D input arrays.
It first calculates the number of zeros needed for padding (padding), then the number of matrices that the input data fills (numOfsplit), and does the splitting in the last line of the function.
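Both answers boil down to padding the vector up to a multiple of 3*9*6 and then cutting it into blocks. As a compact sketch of the same idea, the padded vector can be reshaped in one step into a stack of matrices, with `-1` letting NumPy infer how many blocks are needed. The helper name `stack_padded` is made up for illustration:

```python
import numpy as np

def stack_padded(a, shape):
    """Zero-pad a 1-D array and reshape it to (num_blocks, *shape)."""
    size = int(np.prod(shape))
    # zeros needed to reach the next multiple of size (0 if already a multiple)
    padding = -len(a) % size
    padded = np.pad(a, (0, padding), mode="constant")
    return padded.reshape((-1,) + tuple(shape))

blocks = stack_padded(np.arange(255), (3, 9, 6))
print(blocks.shape)
```

Here `blocks[0]` holds the first 162 values and `blocks[1]` holds the remaining 93 followed by zeros.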
I would like to use a generic filter to calculate the mean of values within a given window (or kernel), for values that fulfill a couple of conditions. I expected the following code to produce a mean filter of the first array in a 3-layer window, using the other two arrays to mask values from the mean calculation.
from scipy import ndimage
import numpy as np
#some test data
tstArr = np.random.rand(3,7,7)
tstArr = tstArr*10
tstArr = np.int_(tstArr)
tstArr[1] = tstArr[1]*100
tstArr[2] = tstArr[2] *1000
#mean function
def testFun(tstData,processLayer,nLayers,kernelSize):
    funData= tstData.reshape((nLayers,kernelSize,kernelSize))
    meanLayer = funData[processLayer]
    maskedData = meanLayer[(funData[1]>1)&(funData[2]<9000)]
    returnMean = np.mean(maskedData)
    return returnMean
#number of layers in the array
nLayers = np.shape(tstArr)[0]
#window size
kernelSize = 5
#create a sampling window of 5x5 elements from each array
footprnt = np.ones((nLayers,kernelSize,kernelSize),dtype = int)
# calculate the mean of the first layer in the array (other two are for masking)
processLayer = 0
tstOut = ndimage.generic_filter(tstArr, testFun, footprint=footprnt, extra_arguments = (processLayer,nLayers,kernelSize))
I thought this would yield a 7x7 array of masked mean values from the first layer in the input array. The output is a 3x7x7 array, and I don't understand what the values represent. I'm not sure how to produce the "masked" mean-filtered array, or how to interpret the output as given.
Your code does produce a mean filter of the first array in a 3-layer window, using the other two arrays to mask values from the mean calculation. You will find the result in tstOut[1].
What is going on? When you call ndimage.generic_filter with tstArr of shape (3, 7, 7) and footprint=np.ones((3, 5, 5)), then for every i from 0 to 2, every j from 0 to 6 and every k from 0 to 6, testFun is called with the subarray of tstArr of shape (3, 5, 5) centered at (i, j, k) (the array is reflected at the boundary to supply missing values).
In the end:
tstOut[0] is the mean filter of tstArr[0] with tstArr[0] and tstArr[1] as masks
tstOut[1] is the mean filter of tstArr[0] with tstArr[1] and tstArr[2] as masks
tstOut[2] is the mean filter of tstArr[1] with tstArr[2] and tstArr[2] as masks
Again, the wanted result is in tstOut[1].
I hope this will help you.
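To see where the three cases above come from, the layer indices seen by each window along the first axis can be reproduced with plain NumPy. This is only an illustrative sketch: it assumes the default boundary mode of generic_filter (reflect), which repeats the edge element and corresponds to `mode="symmetric"` in np.pad.

```python
import numpy as np

# indices of the three layers of tstArr
layers = np.arange(3)

# pad one layer on each side, repeating the edge element
padded = np.pad(layers, 1, mode="symmetric")

# the 3-layer window centered on layer i covers padded[i:i+3]
windows = [padded[i:i + 3].tolist() for i in range(3)]
for i, w in enumerate(windows):
    print(i, w)
```

Only the window centered on layer 1 contains layers 0, 1 and 2 in their original order, which is why tstOut[1] is the slice that matches the intended calculation.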