I have a question about the Keras Dropout layer and its noise_shape argument.
Question 1:
What is the meaning of "if your inputs have shape (batch_size, timesteps, features) and you want the dropout mask to be the same for all timesteps, you can use noise_shape=(batch_size, 1, features)", and what is the benefit of adding this argument?
Does it mean the number of neurons that will be dropped out is the same along the timesteps, i.e. at every timestep t there would be the same n neurons dropped?
Question 2:
Do I have to include 'batch_size' in noise_shape when creating models? --> see the following example.
Suppose I have a multivariate time series data in the shape of (10000, 1, 100, 2) --> (number of data, channel, timestep, number of features).
Then I create batches with batch size of 64 --> (64, 1, 100, 2)
If I want to create a CNN model with dropout, I use the Keras functional API:
inp = Input([1, 100, 2])
conv1 = Conv2D(64, kernel_size=(11, 2), strides=(1, 1), data_format='channels_first')(inp)
max1 = MaxPooling2D((2, 1))(conv1)
max1_shape = max1._keras_shape
drop1 = Dropout(0.1, noise_shape=[?, max1._keras_shape[1], 1, 1])
The output shape of layer max1 should be (None, 64, 50, 1), and I cannot assign None to the question mark (which corresponds to batch_size).
I wonder how I should cope with this. Should I just use (64, 1, 1) as noise_shape? Or should I define a variable called 'batch_size' and then pass it to this argument like (batch_size, 64, 1, 1)?
Question 1:
It's kind of like a numpy broadcast I think.
Imagine you have a batch of 2 samples with 3 timesteps and 4 features (a small example to make it easier to show):
(2, 3, 4)
If you use a noise_shape of (2, 1, 4), each sample in the batch gets its own
dropout mask, and that mask is applied to all of its timesteps.
So let's say these are the input values of shape (2, 3, 4):
array([[[ 1, 2, 3, 4],
[ 5, 6, 7, 8],
[ 10, 11, 12, 13]],
[[ 14, 15, 16, 17],
[ 18, 19, 20, 21],
[ 22, 23, 24, 25]]])
And this would be the random dropout mask of shape (2, 1, 4)
(1 means keep and 0 means drop):
array([[[ 1, 1, 1, 0]],
[[ 1, 0, 0, 1]]])
So you have these two masks (one for each sample).
Each mask is then broadcast along the timestep axis:
array([[[ 1, 1, 1, 0],
[ 1, 1, 1, 0],
[ 1, 1, 1, 0]],
[[ 1, 0, 0, 1],
[ 1, 0, 0, 1],
[ 1, 0, 0, 1]]])
and applied to the values:
array([[[ 1, 2, 3, 0],
[ 5, 6, 7, 0],
[ 10, 11, 12, 0]],
[[ 14, 0, 0, 17],
[ 18, 0, 0, 21],
[ 22, 0, 0, 25]]])
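To make the broadcasting concrete, here is a small numpy sketch (just an illustration, not the actual Keras internals; real dropout would also rescale the kept values by 1/(1 - rate), which is omitted here) that reproduces the masked result above:
import numpy as np

# the (2, 3, 4) values from above
x = np.array([[[ 1,  2,  3,  4],
               [ 5,  6,  7,  8],
               [10, 11, 12, 13]],
              [[14, 15, 16, 17],
               [18, 19, 20, 21],
               [22, 23, 24, 25]]])

# the (2, 1, 4) mask: one mask per sample, broadcast over the timestep axis
mask = np.array([[[1, 1, 1, 0]],
                 [[1, 0, 0, 1]]])

print(x * mask)   # matches the masked array shown above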
Question 2:
I'm not sure about your second question to be honest.
Edit:
What you can do is take the first dimension of the shape of the input,
which should be the batch_size, as proposed in this github issue:
import tensorflow as tf
...
batch_size = tf.shape(inp)[0]
drop1 = Dropout(0.1, noise_shape=[batch_size, max1._keras_shape[1], 1, 1])
As you can see, I'm on the TensorFlow backend. I don't know whether Theano
has the same problem, but if it does you might be able to solve it with
the Theano shape equivalent.
Below is sample code to see exactly what is happening.
The output log is self-explanatory.
If you are worried about a dynamic batch_size, just set the first element of noise_shape to None, i.e. change
dl1 = tk.layers.Dropout(0.2, noise_shape=[_batch_size, 1, _num_features])
to
dl1 = tk.layers.Dropout(0.2, noise_shape=[None, 1, _num_features])
import tensorflow as tf
import tensorflow.keras as tk
import numpy as np
_batch_size = 5
_time_steps = 2
_num_features = 3
input = np.random.random((_batch_size, _time_steps, _num_features))
dl = tk.layers.Dropout(0.2)
dl1 = tk.layers.Dropout(0.2, noise_shape=[_batch_size, 1, _num_features])
out = dl(input, training=True).numpy()
out1 = dl1(input, training=True).numpy()
for i in range(_batch_size):
    print(">>>>>>>>>>>>>>>>>>>>>>>>>>>>", i)
    print("input")
    print(input[i])
    print("out")
    print(out[i])
    print("out1")
    print(out1[i])
The output is:
>>>>>>>>>>>>>>>>>>>>>>>>>>>> 0
input
[[0.53853024 0.80089701 0.64374258]
[0.06481775 0.31187039 0.5029061 ]]
out
[[0.6731628 1.0011213 0. ]
[0.08102219 0.38983798 0.6286326 ]]
out1
[[0.6731628 0. 0.8046782 ]
[0.08102219 0. 0.6286326 ]]
>>>>>>>>>>>>>>>>>>>>>>>>>>>> 1
input
[[0.70746014 0.08990712 0.58195288]
[0.75798534 0.50140453 0.04914242]]
out
[[0.8843252 0.11238389 0. ]
[0.9474817 0.62675565 0. ]]
out1
[[0. 0.11238389 0. ]
[0. 0.62675565 0. ]]
>>>>>>>>>>>>>>>>>>>>>>>>>>>> 2
input
[[0.85253707 0.55813084 0.70741476]
[0.98812977 0.21565134 0.67909392]]
out
[[1.0656713 0.69766355 0.8842684 ]
[0. 0.26956415 0. ]]
out1
[[1.0656713 0.69766355 0.8842684 ]
[1.2351623 0.26956415 0.84886736]]
>>>>>>>>>>>>>>>>>>>>>>>>>>>> 3
input
[[0.9837272 0.3504008 0.37425778]
[0.67648931 0.74456052 0.6229444 ]]
out
[[1.2296591 0.438001 0. ]
[0.84561163 0.93070066 0.7786805 ]]
out1
[[0. 0.438001 0.46782222]
[0. 0.93070066 0.7786805 ]]
>>>>>>>>>>>>>>>>>>>>>>>>>>>> 4
input
[[0.45599217 0.80992091 0.04458478]
[0.12214568 0.09821599 0.51525869]]
out
[[0.5699902 1.0124011 0. ]
[0.1526821 0. 0.64407337]]
out1
[[0.5699902 1.0124011 0.05573097]
[0.1526821 0.12276999 0.64407337]]
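As a quick sanity check of the None variant mentioned above (a minimal sketch, assuming TF 2.x eager execution), the leading None stands in for the dynamic batch dimension and the mask is still shared across timesteps:
import numpy as np
import tensorflow.keras as tk

x = np.random.random((5, 2, 3))
dl_none = tk.layers.Dropout(0.2, noise_shape=[None, 1, 3])
out_none = dl_none(x, training=True).numpy()

# positions dropped at timestep 0 are the same as at timestep 1 for every sample
print(np.array_equal(out_none[:, 0] == 0, out_none[:, 1] == 0))  # True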
Related
So I have a fine-tuned model that returns, among other features, the predicted images. I want to get the predicted classes from those images, but I haven't been able to. The goal is to compute a confusion matrix, either manually or using scikit-learn. I can get the class of each image in my original dataset, but I'm struggling to get the predicted class of each image. So far this is a snippet of my code:
predlist = torch.zeros(0, dtype=torch.long, device='cpu')
lbllist = torch.zeros(0, dtype=torch.long, device='cpu')
with torch.no_grad():
    for i, (inputs, classes) in enumerate(val_loader):
        inputs = inputs.to(device)  # [1, 13, 224, 224]
        classes = classes.to(device)
        outputs = model_ft(inputs)[1]  # 0 is LOSS, 1 is [1, 196, 3328] is PRED, 2 is [1, 196] is MASK,
                                       # 3 is [1, 13, 224, 224] is TARGET
                                       # !! return loss, pred, mask, target_2d
        #outputs = outputs[1]
        model = models_mae_mod.__dict__['mae_vit_small_patch16'](in_chans=13, feature='raw')
        #loss, pred, mask, target = outputs
        #print(loss, pred.shape, mask.shape)
        #outputs = torch.Tensor(np.stack((loss, pred, mask, target), -1))
        #model = models_mae_mod.__dict__['mae_vit_small_patch16'](in_chans=13)
        outputs = model.unpatchify(outputs)  # [1, 13, 224, 224]
        #lab = torch.argmax(outputs, 1)
        _, preds = torch.max(outputs, 1)
        #pred_c = torch.argmax(preds)
        predlist = torch.cat([predlist, preds.view(-1).cpu()])
        lbllist = torch.cat([lbllist, classes.view(-1).cpu()])
        if i > 10:
            break
After debugging I got these values:
a = torch.max(outputs, 1)
a
torch.return_types.max(
values=tensor([[[0.5419, 0.3766, 1.0952, ..., 0.9223, 0.7693, 1.0980],
[1.9111, 1.4176, 0.9902, ..., 1.3873, 0.9266, 0.6857],
[0.8174, 0.5505, 0.8097, ..., 0.8501, 0.1761, 1.0284],
...,
[0.5996, 0.4945, 0.8258, ..., 0.8206, 1.3554, 1.1564],
[0.3814, 0.7084, 0.8026, ..., 0.6130, 1.1291, 1.3241],
[1.4426, 1.3198, 0.9262, ..., 0.9011, 0.7266, 0.8977]]],
device='cuda:0'),
indices=tensor([[[ 2, 3, 4, ..., 0, 1, 10],
[ 7, 9, 3, ..., 6, 9, 4],
[ 2, 4, 3, ..., 9, 7, 1],
...,
[10, 0, 1, ..., 11, 9, 3],
[ 6, 7, 10, ..., 7, 11, 8],
[ 6, 7, 8, ..., 7, 4, 4]]], device='cuda:0'))
_.shape
torch.Size([1, 224, 224])
preds.shape
torch.Size([1, 224, 224])
This is just for one image. I know that each index in indices corresponds to a value in the values tensor, but I cannot understand how to get the probability or the prediction for the image as a whole rather than the entire tensor representing the image. Do you have any idea how to get that information? How could I get the predicted class for each image?
P.S.: The indices tensor looks like the prediction, but I'm not sure, since it has values from 0 to 12 while the dataset only has 10 classes, so it seems to be something else.
I am receiving the following error when I run a convolution operation inside a torch.no_grad() context:
RuntimeError: Only Tensors of floating-point and complex dtype can require gradients.
import torch.nn as nn
import torch

with torch.no_grad():
    ker_t = torch.tensor([[1, -1], [-1, 1]])
    in_t = torch.tensor([[14, 7, 6, 2], [4, 8, 11, 1], [3, 5, 9, 10], [12, 15, 16, 13]])
    print(in_t.shape)
    in_t = torch.unsqueeze(in_t, 0)
    in_t = torch.unsqueeze(in_t, 0)
    print(in_t.shape)
    conv = nn.Conv2d(1, 1, kernel_size=2, stride=2, dtype=torch.long)
    conv.weight[:] = ker_t
    conv(in_t)
Now, I am sure that if I turn my input into floats this message will go away, but I want to work with integers.
But I was under the impression that being inside a torch.no_grad() context should turn off the need for gradients.
The need for gradients comes from nn.Conv2d when it registers the weights of the convolution layer.
However, if you are only after the forward pass, you do not need to use a convolution layer: you can use the underlying convolution function:
import torch
import torch.nn.functional as nnf

ker_t = torch.tensor([[1, -1], [-1, 1]])[None, None, ...]
in_t = torch.tensor([[14, 7, 6, 2], [4, 8, 11, 1], [3, 5, 9, 10], [12, 15, 16, 13]])[None, None, ...]
out = nnf.conv2d(in_t, ker_t, stride=2)
Will give you this output:
tensor([[[[11, -6],
[ 1, -4]]]])
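For completeness, a quick check (a small sketch built on the snippet above, assuming your PyTorch build supports integer convolution on the CPU as shown) that the functional call stays in integer dtype and tracks no gradients, so the torch.no_grad() context is not even needed here:
import torch
import torch.nn.functional as nnf

ker_t = torch.tensor([[1, -1], [-1, 1]])[None, None, ...]
in_t = torch.tensor([[14, 7, 6, 2], [4, 8, 11, 1],
                     [3, 5, 9, 10], [12, 15, 16, 13]])[None, None, ...]

out = nnf.conv2d(in_t, ker_t, stride=2)
print(out.dtype)          # torch.int64 -- the computation stays in integers
print(out.requires_grad)  # False -- plain tensors do not track gradients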
I would like to raise a vector to ascending powers from 0 to 4:
import numpy as np
a = np.array([1, 2, 3]) # list of 11 components
b = np.array([0, 1, 2, 3, 4]) # power
c = np.power(a,b)
The desired result is:
c = [[1**0, 1**1, 1**2, 1**3, 1**4], [2**0, 2**1, ...], ...]
I keep getting this error:
ValueError: operands could not be broadcast together with shapes (3,) (5,)
One solution is to add a new dimension to your array a:
c = a[:,None]**b

# Using broadcasting:
# (3,1)**(5,) --> (3,5)
#
#      [[1],
# c =   [2],  ** [0,1,2,3,4]
#       [3]]
For more information check the numpy broadcasting documentation
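As a quick check, here is the broadcasting solution run end to end on the arrays from the question:
import numpy as np

a = np.array([1, 2, 3])
b = np.array([0, 1, 2, 3, 4])

c = a[:, None] ** b   # (3, 1) ** (5,) broadcasts to (3, 5)
print(c)
# [[ 1  1  1  1  1]
#  [ 1  2  4  8 16]
#  [ 1  3  9 27 81]]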
Here's a solution:
import numpy as np

num_of_powers = 5
num_of_components = 11
a = []
for i in range(1, num_of_components + 1):
    a.append(np.repeat(i, num_of_powers))
b = list(range(num_of_powers))
c = np.power(a, b)
The output c would look like:
array([[ 1, 1, 1, 1, 1],
[ 1, 2, 4, 8, 16],
[ 1, 3, 9, 27, 81],
[ 1, 4, 16, 64, 256],
[ 1, 5, 25, 125, 625],
[ 1, 6, 36, 216, 1296],
[ 1, 7, 49, 343, 2401],
[ 1, 8, 64, 512, 4096],
[ 1, 9, 81, 729, 6561],
[ 1, 10, 100, 1000, 10000],
[ 1, 11, 121, 1331, 14641]], dtype=int32)
Your solution shows a broadcast error because, as per the documentation:
If x1.shape != x2.shape, they must be broadcastable to a common shape (which becomes the shape of the output).
A plain list-comprehension alternative:
c = [[x**y for y in b] for x in a]
Or, using map:
c = np.asarray(list(map(lambda x: np.power(a, x), b))).transpose()
You need to first create a matrix where the rows are repetitions of each number. This can be done with np.tile:
mat = np.tile(a, (len(b), 1)).transpose()
And then raise that to the power of b elementwise:
np.power(mat, b)
All together:
import numpy as np
nums = np.array([1, 2, 3]) # list of 11 components
powers = np.array([0, 1, 2, 3, 4]) # power
print(np.power(np.tile(nums, (len(powers), 1)).transpose(), powers))
Which will give:
[[ 1 1 1 1 1] # == [1**0, 1**1, 1**2, 1**3, 1**4]
[ 1 2 4 8 16] # == [2**0, 2**1, 2**2, 2**3, 2**4]
[ 1 3 9 27 81]] # == [3**0, 3**1, 3**2, 3**3, 3**4]
I want to ask you about calculating the histogram in Python using OpenCV. I used this code:
hist = cv2.calcHist(im, [0, 1, 2], None, [8, 8, 8], [0, 256, 0, 256, 0, 256])
The result gave me the histogram of each color channel with 8 bins, but what I want to get is:
1st bin (R=0-31, G=0-31, B=0-31),
2nd bin (R=32-63, G=0-31, B=0-31),
and so on,
so I will have 512 bins in total.
From my point of view, your cv2.calcHist call isn't correct:
hist = cv2.calcHist(im, [0, 1, 2], None, [8, 8, 8], [0, 256, 0, 256, 0, 256])
The first parameter should be a list of images:
hist = cv2.calcHist([im], [0, 1, 2], None, [8, 8, 8], [0, 256, 0, 256, 0, 256])
Let's see this small example:
import cv2
import numpy as np
# Red blue square of size [4, 4], i.e. eight pixels (255, 0, 0) and eight pixels (0, 0, 255); Attention: BGR ordering!
image = np.zeros((4, 4, 3), dtype=np.uint8)
image[:, 0:2, 2] = 255
image[:, 2:4, 0] = 255
# Calculate histogram with two bins [0 - 127] and [128 - 255] per channel:
# Result should be hist["bin 0", "bin 0", "bin 1"] = 8 (red) and hist["bin 1", "bin 0", "bin 0"] = 8 (blue)
# Original cv2.calcHist call with two bins [0 - 127] and [128 - 255]
hist = cv2.calcHist(image, [0, 1, 2], None, [2, 2, 2], [0, 256, 0, 256, 0, 256])
print(hist, '\n') # Not correct
# Correct cv2.calcHist call
hist = cv2.calcHist([image], [0, 1, 2], None, [2, 2, 2], [0, 256, 0, 256, 0, 256])
print(hist, '\n') # Correct
[[[8. 0.]
[0. 0.]]
[[0. 0.]
[0. 4.]]]
[[[0. 8.]
[0. 0.]]
[[8. 0.]
[0. 0.]]]
As you can see, your version only accounts for 12 pixels in total, whereas there are 16 pixels in the image! Also, it's not clear what "bins" (if any) are represented.
So, with the proper cv2.calcHist call, your general idea/approach is correct! Maybe you just need a little hint on "how to read" the resulting hist:
import cv2
import numpy as np
# Colored rectangle of size [32, 16] with one "color" per bin for eight bins per channel,
# i.e. 512 pixels, such that each of the resulting 512 bins has value 1
x = np.linspace(16, 240, 8, dtype=np.uint8)
image = np.reshape(np.moveaxis(np.array(np.meshgrid(x, x, x)), [0, 1, 2, 3], [3, 0, 1, 2]), (32, 16, 3))
# Correct cv2.calcHist call
hist = cv2.calcHist([image], [0, 1, 2], None, [8, 8, 8], [0, 256, 0, 256, 0, 256])
# Lengthy output of each histogram bin
for B in np.arange(hist.shape[0]):
    for G in np.arange(hist.shape[1]):
        for R in np.arange(hist.shape[2]):
            r = 'R=' + str(R*32).zfill(3) + '-' + str((R+1)*32-1).zfill(3)
            g = 'G=' + str(G*32).zfill(3) + '-' + str((G+1)*32-1).zfill(3)
            b = 'B=' + str(B*32).zfill(3) + '-' + str((B+1)*32-1).zfill(3)
            print('(' + r + ', ' + g + ', ' + b + '): ', int(hist[B, G, R]))
(R=000-031, G=000-031, B=000-031): 1
(R=032-063, G=000-031, B=000-031): 1
(R=064-095, G=000-031, B=000-031): 1
[... 506 more lines ...]
(R=160-191, G=224-255, B=224-255): 1
(R=192-223, G=224-255, B=224-255): 1
(R=224-255, G=224-255, B=224-255): 1
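If what you are ultimately after is the 512 bins as a single flat feature vector, you can simply flatten the (8, 8, 8) result. A small sketch, assuming im is the BGR image from your question (e.g. loaded with cv2.imread):
import cv2

# im: your BGR image, e.g. im = cv2.imread('image.png')
hist = cv2.calcHist([im], [0, 1, 2], None, [8, 8, 8], [0, 256, 0, 256, 0, 256])
features = hist.flatten()   # shape (512,), one entry per (B, G, R) bin combination
print(features.shape)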
Hope that helps!
Alright, here is the given data:
There are three numpy arrays of the shapes:
(i, 4, 2), (i, 4, 3), (i, 4, 2)
the i is shared among them but is variable.
The dtype is float32 for everything.
The goal is to interweave them in a particular order. Let's look at the data at index 0 for these arrays:
[[-208. -16.]
[-192. -16.]
[-192. 0.]
[-208. 0.]]
[[ 1. 1. 1.]
[ 1. 1. 1.]
[ 1. 1. 1.]
[ 1. 1. 1.]]
[[ 0.49609375 0.984375 ]
[ 0.25390625 0.984375 ]
[ 0.25390625 0.015625 ]
[ 0.49609375 0.015625 ]]
In this case, the concatenated target array would look something like this:
[-208, -16, 1, 1, 1, 0.496, 0.984, -192, -16, 1, 1, 1, ...]
And then continue on with index 1.
I don't know how to achieve this, as the concatenate function just keeps telling me that the shapes don't match. The shape of the target array does not matter much, just that its memoryview must be in the given order for upload to a GPU shader.
Edit: I could achieve this with a few Python for loops, but the performance impact would be a problem in this program.
Use np.dstack and flatten with np.ravel() -
np.dstack((a,b,c)).ravel()
Now, np.dstack is basically stacking along the third axis. So, alternatively we can use np.concatenate too along that axis, like so -
np.concatenate((a,b,c),axis=2).ravel()
Sample run -
1) Setup Input arrays :
In [613]: np.random.seed(1234)
...: n = 3
...: m = 2
...: a = np.random.randint(0,9,(n,m,2))
...: b = np.random.randint(11,99,(n,m,2))
...: c = np.random.randint(101,999,(n,m,2))
...:
2) Check input values :
In [614]: a
Out[614]:
array([[[3, 6],
[5, 4]],
[[8, 1],
[7, 6]],
[[8, 0],
[5, 0]]])
In [615]: b
Out[615]:
array([[[84, 58],
[61, 87]],
[[48, 45],
[49, 78]],
[[22, 11],
[86, 91]]])
In [616]: c
Out[616]:
array([[[104, 359],
[376, 560]],
[[472, 720],
[566, 115]],
[[344, 556],
[929, 591]]])
3) Output :
In [617]: np.dstack((a,b,c)).ravel()
Out[617]:
array([ 3, 6, 84, 58, 104, 359, 5, 4, 61, 87, 376, 560, 8,
1, 48, 45, 472, 720, 7, 6, 49, 78, 566, 115, 8, 0,
22, 11, 344, 556, 5, 0, 86, 91, 929, 591])
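Applied to the shapes from the question (a small sketch with made-up values; pos, col and uv are just placeholder names for the three arrays), the last-axis concatenation gives exactly the per-vertex interleaving asked for:
import numpy as np

i = 2
pos = np.zeros((i, 4, 2), dtype=np.float32)   # e.g. vertex positions
col = np.ones((i, 4, 3), dtype=np.float32)    # e.g. colors
uv = np.zeros((i, 4, 2), dtype=np.float32)    # e.g. texture coordinates

interleaved = np.concatenate((pos, col, uv), axis=2).ravel()
print(interleaved.shape)   # (56,) == i * 4 * (2 + 3 + 2), laid out pos, col, uv per vertex
print(interleaved.dtype)   # float32, ready for a GPU vertex buffer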
What I would do is stack along the last axis and then flatten:
np.concatenate([a, b, c], axis=-1).flatten()
assuming a, b, c are the three arrays. (np.hstack would join along axis 1, which fails here because the last dimensions differ: 2, 3 and 2.)