How to use Skimage to resize training data? - python

I am trying to understand how the resize function in skimage works. My x_train data is the CIFAR10 dataset with shape (50000, 32, 32, 3), and I want each image resized to (224, 224, 3) so it can be used in a VGG19 model. I tried the following code to change the data, but I am not sure whether the results are correct, as plt.imshow(resized_xtrain[0]) gives me an odd image.
from skimage import transform
for i in range(50000):
    resized_xtrain = transform.resize(x_train[i], (224, 224, 3), order=1, mode='reflect')
When I do resized_xtrain.shape I get (224, 224, 3). I am confused about how to set this up so that all 50,000 training images are resized and stored in resized_xtrain. Basically, how can I change the resize loop to put all of the training data into one array?

You forgot to index resized_xtrain, so each iteration overwrites the previous result. Preallocate an output array and assign into it like this:
import numpy as np
from skimage import transform

resized_xtrain = np.zeros((50000, 224, 224, 3))
for i in range(50000):
    resized_xtrain[i] = transform.resize(x_train[i], (224, 224, 3), order=1, mode='reflect')
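One caveat worth adding (my note, not part of the original answer): np.zeros defaults to float64, so a (50000, 224, 224, 3) buffer needs roughly 60 GB of RAM. If that is too much, a float32 buffer is a possible sketch:
import numpy as np
from skimage import transform

# float32 roughly halves the memory footprint (~30 GB instead of ~60 GB)
resized_xtrain = np.zeros((50000, 224, 224, 3), dtype=np.float32)
for i in range(50000):
    # transform.resize returns float64 in [0, 1]; assigning into the float32 array casts it down
    resized_xtrain[i] = transform.resize(x_train[i], (224, 224, 3), order=1, mode='reflect')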

Related

What is the use of expand_dims in image processing?

I saw a face detection model which contains the function below, but I could not understand the purpose of the expand_dims function. Can anyone explain what it does and why we are using it?
def get_embedding(model, face_pixels):
    face_pixels = face_pixels.astype('float32')
    mean, std = face_pixels.mean(), face_pixels.std()
    face_pixels = (face_pixels - mean) / std
    samples = expand_dims(face_pixels, axis=0)
    yhat = model.predict(samples)
    return yhat[0]
tf.keras.layers.Conv2D layers expect input with a 4D shape:
(n_samples, height, width, channels)
Most libraries that load images will load in 3D like this:
(height, width, channels)
By using np.expand_dims(image, axis=0) or tf.expand_dims(image, axis=0), you add a batch dimension at the beginning, effectively turning your data into the 4D format that Keras needs for Conv2D layers. For instance:
(224, 224, 3)
to:
(1, 224, 224, 3)
If you give Conv2D 3D data, it will raise an error like this:
ValueError: Error when checking input: expected conv2d_19_input to have 4 dimensions, but got array with shape (60000, 28, 28)
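A quick way to see this in action (my illustration, not from the original answer):
import numpy as np

image = np.zeros((224, 224, 3))        # a single image: (height, width, channels)
batch = np.expand_dims(image, axis=0)  # add a leading batch dimension

print(image.shape)  # (224, 224, 3)
print(batch.shape)  # (1, 224, 224, 3) -- the 4D layout Conv2D expects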

Reshaping 2D Grayscale into 4D for Keras Model Inference

I have a pre-trained Keras model that I need to use to classify a 512x512 image that is originally in grayscale format. The input to the Keras model should be of shape (None, 512, 512, 1).
I executed the following code:
model = load_model('model.h5')
img = Image.open('img.jpg')
img_array = np.array(img)
img_array = img_array / 255
model.predict(img_array)
However, I get the following error
Error when checking input: expected input_1 to have 4 dimensions, but
got array with shape (512, 512)
I know that I need to reshape my grayscale image into 4D to match the desired input shape; however, I am not sure how to do this in a way that keeps the image's original features. How can I properly reshape the grayscale image into 4D?
Thanks.
Try reshaping the array:
img_array = img_array.reshape((1, 512, 512, 1))
Here the first and last dimensions are the batch size and the number of channels, respectively.
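Putting this together with the question's code, a minimal sketch (mine, using the question's file names and explicitly converting the JPEG to a single grayscale channel in case it isn't already):
import numpy as np
from PIL import Image
from keras.models import load_model

model = load_model('model.h5')

img = Image.open('img.jpg').convert('L')           # force a single grayscale channel
img_array = np.array(img) / 255.0                  # (512, 512), scaled to [0, 1]
img_array = img_array.reshape((1, 512, 512, 1))    # (batch, height, width, channels)

pred = model.predict(img_array)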

How to change the shape of a layer in pre-trained Keras CNN model? [duplicate]

I have a dataset containing grayscale images and I want to train a state-of-the-art CNN on them. I'd very much like to fine-tune a pre-trained model (like the ones here).
The problem is that almost all models I can find the weights for have been trained on the ImageNet dataset, which contains RGB images.
I can't use one of those models because their input layer expects a batch of shape (batch_size, height, width, 3), or (64, 224, 224, 3) in my case, but my image batches are (64, 224, 224).
Is there any way that I can use one of those models? I've thought of dropping the input layer after I've loaded the weights and adding my own (like we do for the top layers). Is this approach correct?
The model's architecture cannot be changed because the weights have been trained for a specific input configuration. Replacing the first layer with your own would pretty much render the rest of the weights useless.
-- Edit: elaboration suggested by Prune--
CNNs are built so that as they go deeper, they can extract high-level features derived from the lower-level features that the previous layers extracted. By removing the initial layers of a CNN, you are destroying that hierarchy of features because the subsequent layers won't receive the features that they are supposed to as their input. In your case the second layer has been trained to expect the features of the first layer. By replacing your first layer with random weights, you are essentially throwing away any training that has been done on the subsequent layers, as they would need to be retrained. I doubt that they could retain any of the knowledge learned during the initial training.
--- end edit ---
There is an easy way, though, to make your model work with grayscale images: you just need to make the image appear to be RGB. The easiest way to do so is to repeat the image array 3 times along a new dimension. Because you will have the same image across all 3 channels, the performance of the model should be the same as it was on RGB images.
In numpy this can be easily done like this:
print(grayscale_batch.shape) # (64, 224, 224)
rgb_batch = np.repeat(grayscale_batch[..., np.newaxis], 3, -1)
print(rgb_batch.shape) # (64, 224, 224, 3)
The way this works is that it first creates a new dimension (to place the channels) and then it repeats the existing array 3 times on this new dimension.
I'm also pretty sure that keras' ImageDataGenerator can load grayscale images as RGB.
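For example (my sketch, with a hypothetical data/train directory of grayscale files), flow_from_directory with color_mode='rgb' should yield 3-channel arrays:
from keras.preprocessing.image import ImageDataGenerator

datagen = ImageDataGenerator(rescale=1.0 / 255)

# color_mode='rgb' makes the loader return 3-channel arrays even for grayscale files
train_gen = datagen.flow_from_directory(
    'data/train',
    target_size=(224, 224),
    color_mode='rgb',
    class_mode='categorical',
    batch_size=64,
)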
Converting grayscale images to RGB as per the currently accepted answer is one approach to this problem, but not the most efficient. You most certainly can modify the weights of the model's first convolutional layer and achieve the stated goal. The modified model will both work out of the box (with reduced accuracy) and be finetunable. Modifying the weights of the first layer does not render the rest of the weights useless as suggested by others.
To do this, you'll have to add some code where the pretrained weights are loaded. In your framework of choice, you need to figure out how to grab the weights of the first convolutional layer in your network and modify them before assigning to your 1-channel model. The required modification is to sum the weight tensor over the dimension of the input channels. The way the weights tensor is organized varies from framework to framework. The PyTorch default is [out_channels, in_channels, kernel_height, kernel_width]. In Tensorflow I believe it is [kernel_height, kernel_width, in_channels, out_channels].
Using PyTorch as an example, in a ResNet50 model from Torchvision (https://github.com/pytorch/vision/blob/master/torchvision/models/resnet.py), the shape of the weights for conv1 is [64, 3, 7, 7]. Summing over dimension 1 results in a tensor of shape [64, 1, 7, 7]. At the bottom I've included a snippet of code that would work with the ResNet models in Torchvision assuming that an argument (inchans) was added to specify a different number of input channels for the model.
To prove this works I did three runs of ImageNet validation on ResNet50 with pretrained weights. There is a slight difference in the numbers for run 2 & 3, but it's minimal and should be irrelevant once finetuned.
Unmodified ResNet50 w/ RGB Images : Prec #1: 75.6, Prec #5: 92.8
Unmodified ResNet50 w/ 3-chan Grayscale Images: Prec #1: 64.6, Prec #5: 86.4
Modified 1-chan ResNet50 w/ 1-chan Grayscale Images: Prec #1: 63.8, Prec #5: 86.1
def _load_pretrained(model, url, inchans=3):
    state_dict = model_zoo.load_url(url)
    if inchans == 1:
        conv1_weight = state_dict['conv1.weight']
        state_dict['conv1.weight'] = conv1_weight.sum(dim=1, keepdim=True)
    elif inchans != 3:
        assert False, "Invalid number of inchans for pretrained weights"
    model.load_state_dict(state_dict)


def resnet50(pretrained=False, inchans=3):
    """Constructs a ResNet-50 model.

    Args:
        pretrained (bool): If True, returns a model pre-trained on ImageNet
    """
    model = ResNet(Bottleneck, [3, 4, 6, 3], inchans=inchans)
    if pretrained:
        _load_pretrained(model, model_urls['resnet50'], inchans=inchans)
    return model
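Usage would then look roughly like this (my sketch; it assumes the ResNet constructor in that file has been modified to accept inchans and to build conv1 with that many input channels):
import torch

model = resnet50(pretrained=True, inchans=1)  # 1-channel ResNet-50 with summed conv1 weights
model.eval()

x = torch.randn(8, 1, 224, 224)  # dummy batch of 1-channel images
with torch.no_grad():
    logits = model(x)
print(logits.shape)  # torch.Size([8, 1000])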
A simple way to do this is to add a convolution layer before the base model and then feed the output to the base model. Like this:
from keras.models import Model
from keras.layers import Input, Conv2D
from keras.applications.resnet50 import ResNet50

resnet = ResNet50(weights='imagenet', include_top=True)
input_tensor = Input(shape=(IMG_SIZE, IMG_SIZE, 1))
x = Conv2D(3, (3, 3), padding='same')(input_tensor)  # x has shape (IMG_SIZE, IMG_SIZE, 3)
out = resnet(x)
model = Model(inputs=input_tensor, outputs=out)
Why not try to convert a grayscale image to a fake "RGB" image?
tf.image.grayscale_to_rgb(
    images,
    name=None
)
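A small usage sketch (mine, not part of the original answer): grayscale_to_rgb expects a trailing channel dimension of size 1, so a plain (H, W) image needs an extra axis first:
import tensorflow as tf

gray = tf.random.uniform((64, 224, 224))  # batch of single-channel images without a channel axis
gray = tf.expand_dims(gray, axis=-1)      # -> (64, 224, 224, 1)
rgb = tf.image.grayscale_to_rgb(gray)     # -> (64, 224, 224, 3)
print(rgb.shape)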
Dropping the input layer will not work out; it will cause all of the following layers to suffer.
What you can do is concatenate 3 copies of the grayscale input to expand the color dimension:
img_input = tf.keras.layers.Input(shape=(img_size_target, img_size_target,1))
img_conc = tf.keras.layers.Concatenate()([img_input, img_input, img_input])
model = ResNet50(include_top=True, weights='imagenet', input_tensor=img_conc)
I faced the same problem while working with VGG16 and grayscale images. I solved it as follows:
Let's say our training images are in train_gray_images, with dimensions [no_of_examples, height, width, 1]. If we pass this directly to the fit function, it will raise an error because fit expects a 3-channel (RGB) image dataset rather than a grayscale one. So before calling fit, do the following:
Create a dummy RGB image dataset (here dummy_RGB_images) with the same shape as the grayscale dataset, except that the number of channels is 3.
dummy_RGB_images = np.ndarray(shape=(train_gray_images.shape[0], train_gray_images.shape[1], train_gray_images.shape[2], 3), dtype=np.uint8)
Then copy the grayscale channel into each of the 3 channels of dummy_RGB_images (here the dimensions are [no_of_examples, height, width, channel]):
dummy_RGB_images[:, :, :, 0] = train_gray_images[:, :, :, 0]
dummy_RGB_images[:, :, :, 1] = train_gray_images[:, :, :, 0]
dummy_RGB_images[:, :, :, 2] = train_gray_images[:, :, :, 0]
Finally, pass dummy_RGB_images instead of the grayscale dataset, like:
model.fit(dummy_RGB_images,...)
numpy's depth-stack function, np.dstack((img, img, img)) is a natural way to go.
If you're already using scikit-image, you can get the desired result by using gray2rgb.
from skimage.color import gray2rgb
rgb_img = gray2rgb(gray_img)
I believe you can use a pretrained ResNet with 1-channel grayscale images without repeating the image 3 times.
What I have done is replace the first layer (this is PyTorch, not Keras, but the idea should be similar):
(conv1): Conv2d(3, 64, kernel_size=(7, 7), stride=(2, 2), padding=(3, 3), bias=False)
With the following layer:
(conv1): Conv2d(1, 64, kernel_size=(7, 7), stride=(2, 2), padding=(3, 3), bias=False)
Then copy the sum of the weights (over the channel axis) into the new layer. For example, the shape of the original weights was:
torch.Size([64, 3, 7, 7])
So I did:
resnet18.conv1.weight.data = resnet18.conv1.weight.data.sum(axis=1).reshape(64, 1, 7, 7)
Then check that the output of the new model matches the output of the original model on the 3-channel grayscale image:
y_1 = model_resnet_1(input_image_1)
y_3 = model_resnet_3(input_image_3)
print(torch.abs(y_1).sum(), torch.abs(y_3).sum())
(tensor(710.8860, grad_fn=<SumBackward0>),
tensor(710.8861, grad_fn=<SumBackward0>))
input_image_1: one channel image
input_image_3: 3 channel image (gray scale - all channels equal)
model_resnet_1: modified model
model_resnet_3: Original resnet model
It's really easy! Here is an example for resnet50. Before doing anything, you should have:
resnet_50 = torchvision.models.resnet50(pretrained=True)
print(resnet_50.conv1)
Conv2d(3, 64, kernel_size=(7, 7), stride=(2, 2), padding=(3, 3), bias=False)
Just do this: sum the pretrained weights over the input-channel dimension, replace the layer, and copy the summed weights into it.
conv1_weight = resnet_50.conv1.weight.data.sum(dim=1, keepdim=True)
resnet_50.conv1 = nn.Conv2d(1, 64, kernel_size=(7, 7), stride=(2, 2), padding=(3, 3), bias=False)
resnet_50.conv1.weight.data = conv1_weight
Now if you run:
print(resnet_50.conv1)
the result is:
Conv2d(1, 64, kernel_size=(7, 7), stride=(2, 2), padding=(3, 3), bias=False)
As you can see, the input channel count now matches the grayscale images.
What I did is simply expand the grayscale images into RGB by using the following transform stage:
import torchvision as tv

tv.transforms.Compose([
    tv.transforms.ToTensor(),
    tv.transforms.Lambda(lambda x: x.broadcast_to(3, x.shape[1], x.shape[2])),
])
When you add the ResNet to your model, you should pass the input_shape in the ResNet definition, like:
model = ResNet50(include_top=True, input_shape=(256, 256, 1))

How to modify the tensor data dimension in pytorch, thanks

I want to convert the tensor data to numpy and save it with OpenCV, but OpenCV requires the data dimensions to look like [1, something, something, something], while my tensor is a batch with size [30, something, something, something]. How can I modify the data dimensions in PyTorch?
PS: Is there any function in PyTorch that can save data as a binary picture? I use the save_image command to save my tensor data, whose values are all 1 or 0, to a picture, but the picture still shows up in grayscale. If there is any other way to save tensor data as a binary picture, please tell me.
import cv2
import torch

def save_image_tensor2cv2(input_tensor, filename):
    assert len(input_tensor.shape) == 4 and input_tensor.shape[0] == 1
    input_tensor = input_tensor.clone().detach()
    input_tensor = input_tensor.to(torch.device('cpu'))
    input_tensor = input_tensor.squeeze()
    input_tensor = input_tensor.mul_(255).add_(0.5).clamp_(0, 255).permute(1, 2, 0).type(torch.uint8).numpy()
    cv2.imwrite(filename, input_tensor)
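To handle a whole batch such as the [30, ...] tensor from the question, one straightforward sketch (mine, not from the original answer) is to slice out each sample while keeping the leading dimension:
# batch is assumed to have shape (30, channels, height, width)
for i in range(batch.shape[0]):
    # batch[i:i + 1] keeps the batch dimension, giving shape (1, C, H, W)
    save_image_tensor2cv2(batch[i:i + 1], f"frame_{i}.png")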
batch = next(iter(dataloader_test))
batch.shape
torch.Size([4, 3, 160, 160])
np.transpose(batch.numpy(), (0,2,3,1)).shape
(4, 160, 160, 3)
image = np.transpose(batch.numpy(), (0,2,3,1))
cv2.imwrite("image.png", image[0])
You might have to un-normalize the data before saving it, though.
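For example (my sketch, assuming the data was normalized with the common ImageNet mean and std; substitute whatever values your dataloader actually used):
import cv2
import numpy as np

# assumed per-channel normalization statistics; replace with your own
mean = np.array([0.485, 0.456, 0.406])
std = np.array([0.229, 0.224, 0.225])

image = np.transpose(batch.numpy(), (0, 2, 3, 1))  # (N, H, W, C)
image = (image * std + mean) * 255.0               # undo normalization, scale back to [0, 255]
image = np.clip(image, 0, 255).astype(np.uint8)

# note: cv2.imwrite expects BGR channel order, so you may also want image[0][..., ::-1]
cv2.imwrite("image.png", image[0])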

keras reshape input image to work with CNN

There are other posts with similar questions, but none of the answers are helping me. I'm new to this CNN world.
I followed this tutorial for training a CNN with Keras, using Theano as the backend, on the MNIST dataset. Now I want to pass my own jpg image to the CNN, but I don't know how to reshape it. Can you help me, please? I'm super new at this.
So far, I tried this to reshape
image = np.expand_dims(image, axis=0)
image = preprocess_input(image)
but get the following error when predicting:
ValueError: Error when checking : expected conv2d_1_input to have shape (None, 1, 28, 28) but got array with shape (1, 3, 28, 28)
As you can see, my CNN uses width = 28, height = 28 and depth =1.
Try using NumPy for reshaping. Since you are using a 2D convolutional model with channels-first input, reshape the array (after converting the image to a single channel) as:
image = np.reshape(image, (1, 1, 28, 28))
The error message shows that the network expects an image shape of 1*28*28, but your input is 3*28*28. I guess the image you are passing in is a color image with 3 channels (RGB), while the network expects a grayscale image with one channel.
When you call OpenCV to read the image, please use the code below:
img = cv2.imread(imgfile, cv2.IMREAD_GRAYSCALE)
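Combining that with the reshaping step, a minimal end-to-end sketch (mine; the file name is hypothetical, and it assumes the channels-first (1, 1, 28, 28) layout the error message implies):
import cv2
import numpy as np

img = cv2.imread('digit.jpg', cv2.IMREAD_GRAYSCALE)         # (H, W) grayscale array

img = cv2.resize(img, (28, 28)).astype('float32') / 255.0   # match the 28x28 training input
img = img.reshape(1, 1, 28, 28)                              # (batch, channels, height, width)

prediction = model.predict(img)  # `model` is the trained CNN from the tutorial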
Simply use:
image = np.reshape(image, (28, 28, 1))
