I'm new to autoencoders. I have built a simple convolutional autoencoder, as shown below:
# ENCODER
input_img = Input(shape=(64, 64, 1))
encode1 = Conv2D(32, (3, 3), activation=tf.nn.leaky_relu, padding='same')(input_img)
encode2 = MaxPooling2D((2, 2), padding='same')(encode1)
l = Flatten()(encode2)
l = Dense(100, activation='linear')(l)
# DECODER
d = Dense(1024, activation='linear')(l)
d = Reshape((32,32,1))(d)
decode3 = Conv2D(64, (3, 3), activation=tf.nn.leaky_relu, padding='same')(d)
decode4 = UpSampling2D((2, 2))(decode3)
model = models.Model(input_img, decode4)
model.compile(optimizer='adam', loss='mse')
# Train it by providing training images
model.fit(x, y, epochs=20, batch_size=16)
Now, after training this model, I want to get the output of the bottleneck layer, i.e. the Dense layer. That means if I feed an array of shape (1000, 64, 64) to the model, I want back a compressed array of shape (1000, 100).
I have tried one method, as shown below, but it gives me an error.
model = Model(inputs=[x], outputs=[l])
err:
ValueError: Input tensors to a Functional must come from `tf.keras.Input`.
I have also tried some other methods, but they don't work either. Can someone tell me how I can get the compressed array back after training the model?
You need to create a separate model for the encoder. After you train the whole encoder-decoder system, you can use the encoder alone for prediction. Code example:
# ENCODER
input_img = layers.Input(shape=(64, 64, 1))
encode1 = layers.Conv2D(32, (3, 3), activation=tf.nn.leaky_relu, padding='same')(input_img)
encode2 = layers.MaxPooling2D((2, 2), padding='same')(encode1)
l = layers.Flatten()(encode2)
encoder_output = layers.Dense(100, activation='linear')(l)
# DECODER
d = layers.Dense(1024, activation='linear')(encoder_output)
d = layers.Reshape((32,32,1))(d)
decode3 = layers.Conv2D(64, (3, 3), activation=tf.nn.leaky_relu, padding='same')(d)
decode4 = layers.UpSampling2D((2, 2))(decode3)
model_encoder = Model(input_img, encoder_output)
model = Model(input_img, decode4)
model.compile(optimizer='adam', loss='mse')
model.fit(X, y, epochs=20, batch_size=16)
After training, model_encoder.predict(X) should return one 100-dimensional vector for each image.
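For example (a minimal sketch with random data, just to illustrate the shapes):
import numpy as np
# hypothetical input: 1000 grayscale 64x64 images, with the trailing
# channel axis the model expects; each image maps to a 100-dim code
X = np.random.rand(1000, 64, 64, 1).astype('float32')
codes = model_encoder.predict(X)
print(codes.shape)  # -> (1000, 100)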
Getting the output of an intermediate layer (the bottleneck layer):
# ENCODER
input_img = Input(shape=(64, 64, 1))
encode1 = Conv2D(32, (3, 3), activation=tf.nn.leaky_relu, padding='same')(input_img)
encode2 = MaxPooling2D((2, 2), padding='same')(encode1)
l = Flatten()(encode2)
bottleneck = Dense(100, activation='linear', name='bottleneck_layer')(l)
# DECODER
d = Dense(1024, activation='linear')(bottleneck)
d = Reshape((32,32,1))(d)
decode3 = Conv2D(64, (3, 3), activation=tf.nn.leaky_relu, padding='same')(d)
decode4 = UpSampling2D((2, 2))(decode3)
# full model
model_full = models.Model(input_img, decode4)
model_full.compile(optimizer='adam', loss='mse')
model_full.fit(x, y, epochs=20, batch_size=16)
# bottleneck model
bottleneck_output = model_full.get_layer('bottleneck_layer').output
model_bottleneck = models.Model(inputs=model_full.input, outputs=bottleneck_output)
bottleneck_predictions = model_bottleneck.predict(X_test)
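As a side note, if you prefer not to name the layer, get_layer also accepts an index (a sketch; index 4 assumes the layer order Input, Conv2D, MaxPooling2D, Flatten, Dense in the model above):
# equivalent bottleneck model, selected by layer index instead of name
bottleneck_output = model_full.get_layer(index=4).output
model_bottleneck = models.Model(inputs=model_full.input, outputs=bottleneck_output)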
I am trying to construct a denoising autoencoder for a facial recognition project. For the initial tests I am using the cropped Yale Faces dataset, with the training (noisy) images in one folder (with separate subfolders for each class/person) and the testing (regular) images in another folder with the same structure. But every run fails with the following error:
InvalidArgumentError: logits and labels must have the same first
dimension, got logits shape [986624,38] and labels shape [32] [[node
sparse_categorical_crossentropy/SparseSoftmaxCrossEntropyWithLogits/SparseSoftmaxCrossEntropyWithLogits
(defined at \Desktop\projetos\conv autoencoder teste.py:75) ]]
[Op:__inference_train_function_8691]
Function call stack: train_function
I am using Keras 2.6.0.
batch_size = 32
height = 168
width = 192
depth = 1
chanDim = -1
inputShape = (height, width, depth)
data_dir='C:\\Users\\User\\Desktop\\projetos\\Test1\\Data'
train_data_dir='C:\\Users\\User\\Desktop\\projetos\\Test1\\Test_Images\\sp_noise'
images_noisy = tf.keras.preprocessing.image_dataset_from_directory(
    directory=train_data_dir, labels='inferred', label_mode='int',
    class_names=None, color_mode='grayscale', batch_size=batch_size,
    image_size=(height, width), shuffle=True, seed=2457, interpolation='bilinear')
images_regular = tf.keras.preprocessing.image_dataset_from_directory(
    directory=data_dir, labels='inferred', label_mode='int',
    class_names=None, color_mode='grayscale', batch_size=batch_size,
    image_size=(height, width), shuffle=True, seed=2457, interpolation='bilinear')
datagen = tf.keras.preprocessing.image.ImageDataGenerator()
train_it = datagen.flow_from_directory(train_data_dir, class_mode='sparse', batch_size=32,target_size=(height, width),color_mode='grayscale')
val_it = datagen.flow_from_directory(data_dir, class_mode='sparse', batch_size=32,target_size=(height, width),color_mode='grayscale')
#input = tf.keras.layers.Input(shape=(inputShape))
Input_img = Input(shape=(168,192,1))
#Input_img = Input(shape=(None))
#encoding architecture
#x1 = tf.keras.layers.Reshape((168, 192, 1), input_shape=(None, 168, 192, 1))(Input_img)
x1 = tf.keras.layers.Conv2D(64, (3, 3), activation='relu', padding='same')(Input_img)
x1 = tf.keras.layers.MaxPooling2D( (2, 2), padding='same')(x1)
x2 = tf.keras.layers.Conv2D(32, (3, 3), activation='relu', padding='same')(x1)
x2 = tf.keras.layers.MaxPooling2D( (2, 2), padding='same')(x2)
x3 = tf.keras.layers.Conv2D(16, (3, 3), activation='relu', padding='same')(x2)
encoded = tf.keras.layers.MaxPooling2D( (2, 2), padding='same')(x3)
# decoding architecture
x3 = tf.keras.layers.Conv2D(16, (3, 3), activation='relu', padding='same')(encoded)
x3 = tf.keras.layers.UpSampling2D((2, 2))(x3)
x2 = tf.keras.layers.Conv2D(32, (3, 3), activation='relu', padding='same')(x3)
x2 = tf.keras.layers.UpSampling2D((2, 2))(x2)
x1 = tf.keras.layers.Conv2D(64, (3, 3), activation='relu')(x2)
x1 = tf.keras.layers.UpSampling2D((2, 2))(x1)
decoded = tf.keras.layers.Conv2D(38, (3, 3), activation='sigmoid', padding='same')(x1)
autoencoder = Model(Input_img, decoded)
autoencoder.compile(optimizer='adam', loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=False))
history = autoencoder.fit(
    images_noisy,
    epochs=20,
    batch_size=32,
    shuffle=True,
    validation_data=images_regular)
autoencoder.summary()
At this point I frankly have no idea what is causing the issue. I used the same dataset, loaded with image_dataset_from_directory, in a facial recognition/classification CNN with no issues, but here nothing seems to work.
I was able to reproduce the error. The input and output dimensions need to be the same in an autoencoder. Changing the architecture of the decoder as follows will help:
#decoding architecture
x3 = tf.keras.layers.Conv2D(16, (3, 3), activation='relu', padding='same')(encoded)
x3 = tf.keras.layers.UpSampling2D((2, 2))(x3)
x2 = tf.keras.layers.Conv2D(1, (3, 3), activation='relu', padding='same')(x3)
x1 = tf.keras.layers.UpSampling2D((2, 2))(x2)
decoded = tf.keras.layers.UpSampling2D((2, 2))(x1)
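Note also that image_dataset_from_directory yields (image, label) batches, while a denoising autoencoder needs (noisy image, clean image) pairs and a reconstruction loss. A minimal sketch, assuming both datasets are built with shuffle=False so corresponding files line up:
# strip the integer labels and zip noisy inputs with clean targets
noisy = images_noisy.map(lambda img, label: img)
clean = images_regular.map(lambda img, label: img)
train_pairs = tf.data.Dataset.zip((noisy, clean))
autoencoder.compile(optimizer='adam', loss='mse')  # reconstruction loss, not sparse CE
history = autoencoder.fit(train_pairs, epochs=20)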
I am a beginner in TensorFlow, and I am trying to create a simple autoencoder for images to detect anomalies. First, I created a simple autoencoder using dog images; now I want to use this model to reconstruct my test images and compare the results using some metrics. How can I do this in TensorFlow?
(I found the same idea implemented on numerical datasets, and also on the MNIST dataset.)
This is my code:
import numpy as np  # used below when collecting batches into arrays
from tensorflow.keras.models import Model
from tensorflow.keras.layers import Input, Dense, Conv2D, MaxPooling2D, UpSampling2D
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras.callbacks import LearningRateScheduler
BATCH_SIZE = 256
EPOCHS = 2
train_datagen = ImageDataGenerator(rescale=1./255)
train_batches = train_datagen.flow_from_directory(
    'C:/MyPath/PetImages1', target_size=(64, 64), shuffle=True,
    class_mode='input', batch_size=BATCH_SIZE)
input_img = Input(shape=(64, 64, 3))
x = Conv2D(48, (3, 3), activation='relu', padding='same')(input_img)
x = MaxPooling2D((2, 2), padding='same')(x)
x = Conv2D(96, (3, 3), activation='relu', padding='same')(x)
x = MaxPooling2D((2, 2), padding='same')(x)
x = Conv2D(192, (3, 3), activation='relu', padding='same')(x)
x = MaxPooling2D((2, 2), padding='same')(x)
encoded = Conv2D(32, (1, 1), activation='relu', padding='same')(x)
latentSize = (8,8,32)
# DECODER
direct_input = Input(shape=latentSize)
x = Conv2D(192, (1, 1), activation='relu', padding='same')(direct_input)
x = UpSampling2D((2, 2))(x)
x = Conv2D(192, (3, 3), activation='relu', padding='same')(x)
x = UpSampling2D((2, 2))(x)
x = Conv2D(96, (3, 3), activation='relu', padding='same')(x)
x = UpSampling2D((2, 2))(x)
x = Conv2D(48, (3, 3), activation='relu', padding='same')(x)
decoded = Conv2D(3, (3, 3), activation='sigmoid', padding='same')(x)
# COMPILE
encoder = Model(input_img, encoded)
decoder = Model(direct_input, decoded)
autoencoder = Model(input_img, decoder(encoded))
autoencoder.compile(optimizer='Adam', loss='binary_crossentropy')
autoencoder.save_weights('autoencoder_DogsAuto.h5')
history = autoencoder.fit_generator(train_batches, steps_per_epoch=10, epochs=EPOCHS)
#Images for tests
testGene = train_datagen.flow_from_directory(
    'C:/PetImages/', target_size=(64, 64), shuffle=True,
    class_mode='input', batch_size=BATCH_SIZE)
restored = autoencoder.predict_generator(testGene, steps=testGene.n / BATCH_SIZE)
image_height=64
image_width=64
image_channels=3
x_train = np.zeros((0, image_height, image_width, image_channels), dtype=float)
for x, _ in train_batches:
    if train_batches.total_batches_seen > train_batches.n / BATCH_SIZE:
        break
    else:
        x_train = np.r_[x_train, x]
pred=autoencoder.predict(train_batches, steps=train_batches.n/BATCH_SIZE)
from sklearn import metrics
score1=np.sqrt(metrics.mean_squared_error(pred,x_train ))
print(score1)
And I got this error:
Traceback (most recent call last):
File "c:\autoencoder_anomaly.py", line 196, in
score1=np.sqrt(metrics.mean_squared_error(pred,x_train ))
File "C:\Users\AppData\Local\Programs\Python\Python36\lib\site-packages\sklearn\metrics_regression.py", line 252, in mean_squared_error
y_true, y_pred, multioutput)
File "C:\Users\AppData\Local\Programs\Python\Python36\lib\site-packages\sklearn\metrics_regression.py", line 84, in _check_reg_targets
check_consistent_length(y_true, y_pred)
ValueError: Found input variables with inconsistent numbers of samples: [6, 0]
Note that I am using only 6 images.
So how can I calculate the error of the reconstructed images using metrics and the autoencoder model in TensorFlow?
This is simply a shape mismatch.
When you calculate the mean squared error, it computes the element-wise error between the ground-truth values and the estimated values, so the prediction array and the ground-truth array must have the same shape. Check the input data shapes and make sure they are equal.
Step 1: collect all the test images from the generator into one array.
x_test = np.zeros((0, image_height, image_width, image_channels), dtype=float)
for x, _ in testGene:
    if testGene.total_batches_seen > testGene.n / BATCH_SIZE:
        break
    else:
        x_test = np.r_[x_test, x]
Step 2: prediction.
pred = autoencoder.predict(testGene, steps=testGene.n / BATCH_SIZE)
Step 3: calculate the error. Note that sklearn's mean_squared_error expects 2-D arrays, and the comparison must be against the x_test array built in step 1 (not the generator itself):
score1 = np.sqrt(metrics.mean_squared_error(pred.reshape(len(pred), -1), x_test.reshape(len(x_test), -1)))
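For anomaly detection you typically want a per-image error rather than one global score. A sketch, assuming pred and x_test are aligned arrays of shape (n, 64, 64, 3):
# per-image RMSE: average the squared pixel error over each image separately
per_image_err = np.sqrt(np.mean((pred - x_test) ** 2, axis=(1, 2, 3)))
# hypothetical thresholding: flag images with unusually large error
threshold = np.percentile(per_image_err, 95)
anomalies = per_image_err > threshold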
I've trained my model and got the .hdf5 file (training and validation accuracy are both about 0.9).
Below is my accuracy curve.
[image: training curve]
Because of the imbalance of my data, I used SMOTE to oversample my data and then split it into training and validation data.
sm = SMOTE(random_state=42)
X_resampled, y_resampled = sm.fit_resample(X, Y)
X_resampled = X_resampled.reshape(X_resampled.shape[0],128,128,3)
X_tr, X_tst, y_tr, y_tst = train_test_split(X_resampled, y_resampled, test_size=0.33,random_state=22)
And below is my model structure.
image_input = Input(shape=(img_size, img_size, 3))
conv_1 = Conv2D(64, (5, 5), padding='same',
                input_shape=(img_size, img_size, 3), activation='relu')(image_input)
drop_2 = Dropout(0.4)(conv_1)
conv_3 = Conv2D(64, (3, 3), padding='same', activation='relu')(drop_2)
drop_4 = Dropout(0.4)(conv_3)
max_5 = MaxPooling2D(pool_size=(2, 2))(drop_4)
conv_6 = Conv2D(32, (5, 5), padding='same', activation='relu')(max_5)
drop_7 = Dropout(0.4)(conv_6)
conv_8 = Conv2D(32, (3, 3), padding='same', activation='relu')(drop_7)
drop_9 = Dropout(0.4)(conv_8)
max_10= MaxPooling2D(pool_size=(2, 2))(drop_9)
conv_11 = Conv2D(32, (5, 5), padding='same', activation='relu')(max_10)
drop_12 = Dropout(0.4)(conv_11)
conv_13 = Conv2D(32, (3, 3), padding='same', activation='relu')(drop_12)
drop_14 = Dropout(0.4)(conv_13)
max_15= MaxPooling2D(pool_size=(2, 2))(drop_14)
flat_16 = Flatten()(max_15)
den_17= Dense(8,activation='relu')(flat_16)
output = Dense(nb_classes, activation='softmax')(den_17)
img_size = 128
nb_classes = 6
batch_size = 256
nb_epoch=1000
savedModelName = 'M.hdf5'
lr = 0.00001
After I finished training my model, I saved it (via ModelCheckpoint with save_best_only, keyed on validation accuracy).
Then I used it to predict on the "same" data (same random_state):
sm = SMOTE(random_state=42)
X_resampled, y_resampled = sm.fit_resample(X, Y)
X_resampled = X_resampled.reshape(X_resampled.shape[0], 128, 128, 3)
X_tr, X_tst, y_tr, y_tst = train_test_split(X_resampled, y_resampled, test_size=0.33,random_state=22)
But I get a prediction accuracy of only about 0.3.
Why? Shouldn't it be around 0.9?
Could you provide the code you use to fit the model?
Also, what happens if you predict the first test set with your model? Could you provide a confusion matrix or the precision/recall values?
My first guess would be that your model is overfitting or not really learning.
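For reference, a quick way to get those numbers (a sketch, assuming your trained model is loaded as model and y_tst holds integer class labels; argmax the labels first if they are one-hot):
import numpy as np
from sklearn.metrics import confusion_matrix, classification_report
y_pred = np.argmax(model.predict(X_tst), axis=1)  # predicted class per sample
print(confusion_matrix(y_tst, y_pred))
print(classification_report(y_tst, y_pred))  # per-class precision/recall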
I am building a convolutional autoencoder to deal with some 128x128 images.
I have been testing the structure with an image set of 500 images. The resulting decoded images are basically entirely black (not what I want!).
I was hoping to get some advice on here as I think I am making some obvious mistakes.
A small group of images can be downloaded here -> https://www.dropbox.com/sh/0oj1p6sqal32cvx/AAAYQJSK2SPfynD8wYMSo9bPa?dl=0
Current Code
################################# SETUP #######################################
import glob
import cv2  # needed for cv2.imread below
import pandas as pd
import numpy as np
import sys
from sklearn.model_selection import train_test_split
import matplotlib.pyplot as plt
import random
np.set_printoptions(threshold=sys.maxsize)  # newer NumPy rejects threshold=np.nan
######################### DATA PREPARATION #####################################
# create a list of image files within the raw data folder
image_list = glob.glob("Images/test_images/*.jpeg")
print(image_list)
l = []
for i in image_list:
    img = np.array(cv2.imread(i, 0))
    l.append(img)
T = np.array(l)
# split into training and testing sets
labels = image_list
data_train, data_test, labels_train, labels_test = train_test_split(T, labels, test_size=0.5, random_state=42)
# convert to 0-1 floats (reconversion by * 255)
data_train = data_train.astype('float32') / 255.
data_test = data_test.astype('float32') / 255.
print(data_train.shape)
# reshape from channels first to channels last
data_train = np.rollaxis(data_train, 0, 3)
data_test = np.rollaxis(data_test, 0, 3)
######################### ENCODER MODELING #####################################
from keras.layers import Input, Dense, Conv2D, MaxPooling2D, UpSampling2D
from keras.models import Model
from keras import backend as K
input_img = Input(shape=(128, 128, 1)) # adapt this if using `channels_first` image data format
x = Conv2D(64, (3, 3), activation='relu', padding='same')(input_img)
x = MaxPooling2D((2, 2), padding='same')(x)
x = Conv2D(16, (3, 3), activation='relu', padding='same')(x)
x = MaxPooling2D((2, 2), padding='same')(x)
x = Conv2D(8, (3, 3), activation='relu', padding='same')(x)
x = MaxPooling2D((2, 2), padding='same')(x)
x = Conv2D(8, (3, 3), activation='relu', padding='same')(x)
encoded = MaxPooling2D((2, 2), padding='same')(x)
x = Conv2D(8, (3, 3), activation='relu', padding='same')(encoded)
x = UpSampling2D((2, 2))(x)
x = Conv2D(8, (3, 3), activation='relu', padding='same')(x)
x = UpSampling2D((2, 2))(x)
x = Conv2D(16, (3, 3), activation='relu', padding='same')(x)
x = UpSampling2D((2, 2))(x)
x = Conv2D(64, (3, 3), activation='relu', padding='same')(x)
x = UpSampling2D((2, 2))(x)
decoded = Conv2D(1, (3, 3), activation='sigmoid', padding='same')(x)
# create the model
autoencoder = Model(input_img, decoded)
autoencoder.compile(optimizer='adadelta', loss='mse')
# reference for reordering
data_train_dimensions = data_train.shape
data_test_dimensions = data_test.shape
# reshape the data sets
data_train = np.reshape(data_train, (data_train_dimensions[2], 128, 128, 1)) # adapt this if using `channels_first` image data format
data_test = np.reshape(data_test, (data_test_dimensions[2], 128, 128, 1))
from keras.callbacks import TensorBoard
autoencoder.fit(data_train, data_test,
                epochs=10,
                batch_size=128,
                shuffle=True,
                validation_data=(data_train, data_test),
                callbacks=[TensorBoard(log_dir='/tmp/autoencoder')])
# create decoded images from model
decoded_imgs = autoencoder.predict(data_test)
# reorder columns
decoded_imgs = np.rollaxis(decoded_imgs, 3, 1)
# reshape from channels first to channels last
data_train = np.rollaxis(data_train, 0, 3)
data_test = np.rollaxis(data_test, 0, 3)
# convert back to the 0-255 range
data_train = data_train.astype('float32') * 255.
data_test = data_test.astype('float32') * 255.
I think the main problem is that you fit on (data_train, data_test) instead of (inputs, targets); that is, you should fit your model on samples and their corresponding outputs, but you are training it only on inputs, which happen to have compatible shapes purely by coincidence because of the 50/50 split.
If the intention of the model is to reproduce an image from its compressed representation, then you should train with fit(data_train, data_train, ..., validation_data=(data_test, data_test)).
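Concretely, the fit call would become (keeping your other arguments unchanged):
autoencoder.fit(data_train, data_train,
                epochs=10,
                batch_size=128,
                shuffle=True,
                validation_data=(data_test, data_test),
                callbacks=[TensorBoard(log_dir='/tmp/autoencoder')])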
Hi, I am building an image classifier for one-class classification, in which I've used an autoencoder. While running this model, the line autoencoder_model.fit raises this error: ValueError: Error when checking target: expected model_2 to have shape (None, 252, 252, 1) but got array with shape (300, 128, 128, 3).
num_of_samples = img_data.shape[0]
labels = np.ones((num_of_samples,),dtype='int64')
labels[0:376]=0
names = ['cats']
input_shape=img_data[0].shape
X_train, X_test = train_test_split(img_data, test_size=0.2, random_state=2)
inputTensor = Input(input_shape)
x = Conv2D(16, (3, 3), activation='relu', padding='same')(inputTensor)
x = Conv2D(8, (3, 3), activation='relu', padding='same')(x)
x = MaxPooling2D((2, 2), padding='same')(x)
x = Conv2D(8, (3, 3), activation='relu', padding='same')(x)
encoded_data = MaxPooling2D((2, 2), padding='same')(x)
encoder_model = Model(inputTensor,encoded_data)
# at this point the representation is (4, 4, 8) i.e. 128-dimensional
encoded_input = Input((4,4,8))
x = Conv2D(8, (3, 3), activation='relu', padding='same')(encoded_input)
x = UpSampling2D((2, 2))(x)
x = Conv2D(8, (3, 3), activation='relu', padding='same')(x)
x = UpSampling2D((2, 2))(x)
x = Conv2D(16, (3, 3), activation='relu',padding='same')(x)
x = UpSampling2D((2, 2))(x)
decoded_data = Conv2D(1, (3, 3), activation='sigmoid', padding='same')(x)
decoder_model = Model(encoded_input,decoded_data)
autoencoder_input = Input(input_shape)
encoded = encoder_model(autoencoder_input)
decoded = decoder_model(encoded)
autoencoder_model = Model(autoencoder_input, decoded)
autoencoder_model.compile(optimizer='adadelta', loss='binary_crossentropy')
autoencoder_model.fit(X_train, X_train,
                      epochs=50,
                      batch_size=32,
                      validation_data=(X_test, X_test),
                      callbacks=[TensorBoard(log_dir='/tmp/autoencoder')])
It's a simple incompatibility between the output shape of the decoder and the shape of your training data ("target" means output).
I see you've got two MaxPooling layers (dividing your image size by 4) and three UpSampling layers (multiplying the decoder's input by 8).
The final output of the autoencoder is therefore too big and doesn't match your data. You simply need to rework the model so that the output shape matches your training data.
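For example, one way to make the shapes line up (a sketch, assuming the 128x128x3 inputs and the two-pool encoder above): use two UpSampling2D layers, one per MaxPooling2D, and restore the 3 input channels at the end.
# hypothetical shape-matched decoder: (32, 32, 8) -> (128, 128, 3)
encoded_input = Input((32, 32, 8))  # 128 / 2 / 2 = 32 after two (2, 2) poolings
x = Conv2D(8, (3, 3), activation='relu', padding='same')(encoded_input)
x = UpSampling2D((2, 2))(x)  # 32 -> 64
x = Conv2D(16, (3, 3), activation='relu', padding='same')(x)
x = UpSampling2D((2, 2))(x)  # 64 -> 128
decoded_data = Conv2D(3, (3, 3), activation='sigmoid', padding='same')(x)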
You're using the wrong API:
autoencoder_model.fit(X_train, X_train,  # <--- this one is wrong
                      epochs=50,
                      batch_size=32,
                      validation_data=(X_test, X_test),
                      callbacks=[TensorBoard(log_dir='/tmp/autoencoder')])
Take a look at the .fit method source code, from https://github.com/keras-team/keras/blob/master/keras/models.py:
def fit(self,
        x=None,
        y=None,
        batch_size=None,
        epochs=1,
        verbose=1,
        callbacks=None,
        validation_split=0.,
        validation_data=None,
        shuffle=True,
        class_weight=None,
        sample_weight=None,
        initial_epoch=0,
        steps_per_epoch=None,
        validation_steps=None,
        **kwargs):
"""Trains the model for a fixed number of epochs (iterations on a dataset).
# Arguments
x: Numpy array of training data.
If the input layer in the model is named, you can also pass a
dictionary mapping the input name to a Numpy array.
`x` can be `None` (default) if feeding from
framework-native tensors (e.g. TensorFlow data tensors).
y: Numpy array of target (label) data.
If the output layer in the model is named, you can also pass a
dictionary mapping the output name to a Numpy array.
`y` can be `None` (default) if feeding from
framework-native tensors (e.g. TensorFlow data tensors).
So x should be the data, and y should be the labels for the data. Hope that helps.