I am trying to get the encoded values as a simple (flat) vector using an autoencoder.
Here is my code:
input_img = Input(shape=(28, 28, 1))
x = Conv2D(32, (3, 3), activation='relu', padding='same')(input_img)
x = MaxPooling2D((2, 2), padding='same')(x)
x = Conv2D(32, (3, 3), activation='relu', padding='same')(x)
encoded = MaxPooling2D((2, 2), padding='same')(x)
Here I need a Flatten layer.
encoder = Model(input_img, encoded)
And then make it convolutional again:
encoderOutputShape = encoded._keras_shape[1:]
# unflatten here
decoder_input = Input(encoderOutputShape)
decoder = Conv2D(32, (3, 3), activation='relu', padding='same')(decoder_input)
x = UpSampling2D((2, 2))(decoder)
x = Conv2D(32, (3, 3), activation='relu', padding='same')(x)
x = UpSampling2D((2, 2))(x)
decoded = Conv2D(1, (3, 3), activation='sigmoid', padding='same')(x)
decoder = Model(decoder_input, decoded)
auto_input = Input(shape=(28,28,1))
encoded = encoder(auto_input)
decoded = decoder(encoded)
auto_encoder = Model(auto_input, decoded)
How do I do this the right way?
In other words, I want to take the output of the encoder (or use random data), change it, put it into the decoder, and get the decoded result.
There is a question here: why flatten the tensor if you don't use any Dense layers?
But you can do it like this:
encoder_output = Flatten()(encoded)
decoder_input = Reshape((7, 7, 32))(encoder_output)
decoder = Conv2D(32, (3, 3), activation='relu', padding='same')(decoder_input)
That is because you need to reshape the flat vector back into a 4D tensor before passing it to a Conv2D layer.
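Putting it together, here is a minimal sketch of the full flatten/unflatten round trip (assuming tf.keras imports and the 28x28x1 input above; the +0.1 perturbation is only a placeholder for whatever change you want to make to the vector):

from tensorflow.keras.layers import (Input, Conv2D, MaxPooling2D,
                                     UpSampling2D, Flatten, Reshape)
from tensorflow.keras.models import Model
import numpy as np

# Encoder: (28, 28, 1) -> (7, 7, 32) -> flat vector of 7*7*32 = 1568 values
input_img = Input(shape=(28, 28, 1))
x = Conv2D(32, (3, 3), activation='relu', padding='same')(input_img)
x = MaxPooling2D((2, 2), padding='same')(x)
x = Conv2D(32, (3, 3), activation='relu', padding='same')(x)
x = MaxPooling2D((2, 2), padding='same')(x)
encoded = Flatten()(x)
encoder = Model(input_img, encoded)

# Decoder: flat vector -> (7, 7, 32) -> (28, 28, 1)
decoder_input = Input(shape=(7 * 7 * 32,))
x = Reshape((7, 7, 32))(decoder_input)
x = Conv2D(32, (3, 3), activation='relu', padding='same')(x)
x = UpSampling2D((2, 2))(x)
x = Conv2D(32, (3, 3), activation='relu', padding='same')(x)
x = UpSampling2D((2, 2))(x)
decoded = Conv2D(1, (3, 3), activation='sigmoid', padding='same')(x)
decoder = Model(decoder_input, decoded)

# Full autoencoder for training
auto_input = Input(shape=(28, 28, 1))
auto_encoder = Model(auto_input, decoder(encoder(auto_input)))
auto_encoder.compile(optimizer='adam', loss='binary_crossentropy')

# After training: encode, change the vector, and decode it again
z = encoder.predict(np.random.rand(1, 28, 28, 1))  # or a real image batch
z = z + 0.1                                        # any modification you like
reconstruction = decoder.predict(z)                # shape (1, 28, 28, 1)

Training auto_encoder trains the shared encoder and decoder weights, so afterwards each sub-model can be used on its own.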
Related
I am implementing an autoencoder and I want to calculate the dimensions of the latent space.
Let's say that I want a 3D latent space. Given my code, how do I find the current latent space?
Thank you.
My current code:
x = Input(shape=(28, 28,1))
# Encoder
conv1_1 = Conv2D(16, (3, 3), activation='relu', padding='same')(x)
pool1 = MaxPooling2D((2, 2))(conv1_1)
conv1_2 = Conv2D(8, (3, 3), activation='relu', padding='same')(pool1)
pool2 = MaxPooling2D((2, 2))(conv1_2)
conv1_3 = Conv2D(8, (3, 3), activation='relu', padding='same')(pool2)
h = MaxPooling2D((2, 2))(conv1_3)
# Decoder
conv2_1 = Conv2D(8, (3, 3), activation='relu', padding='same')(h)
up1 = UpSampling2D((2, 2))(conv2_1)
conv2_2 = Conv2D(8, (3, 3), activation='relu', padding='same')(up1)
up2 = UpSampling2D((2, 2))(conv2_2)
conv2_3 = Conv2D(16, (3, 3), activation='relu')(up2)
up3 = UpSampling2D((2, 2))(conv2_3)
r = Conv2D(1, (3, 3), activation='sigmoid', padding='same')(up3)
autoencoder = Model(inputs=x, outputs=r)
autoencoder.compile(optimizer='adadelta', loss='binary_crossentropy')
The architecture of an autoencoder is shaped like a funnel: the number of nodes decreases as we move from the input layer to a layer known as the latent space. From the latent space, the number of nodes increases again until it reaches the output layer, where it equals the number of nodes in the input layer.
From the model summary we can see that the 7th layer (the last MaxPooling2D, h in the code) is the latent space, which is the compressed form of the input data.
tf.keras.backend.ndim(autoencoder.layers[6].output)
The above call returns the number of dimensions (the rank) of that layer's output. To see the actual shape of the latent space, use autoencoder.layers[6].output_shape. Thank you!
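A hedged illustration (assuming the model above has been built and import tensorflow as tf; note that the pooling layers use the default padding='valid', so 28 -> 14 -> 7 -> 3):

autoencoder.summary()  # the row for the third MaxPooling2D shows the latent shape

latent = autoencoder.layers[6].output      # this is h in the code above
print(tf.keras.backend.ndim(latent))       # 4 -> (batch, height, width, channels)
print(autoencoder.layers[6].output_shape)  # (None, 3, 3, 8) -> 3*3*8 = 72 values

If you literally want a 3-dimensional latent space, you would need to Flatten h and add a Dense(3) bottleneck, then mirror it (Dense plus Reshape) in the decoder.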
I am trying to build an autoencoder in Keras with an input shape of (470, 470, 3), but the output never seems to match, even when I try switching the padding around. This is my code; can you please help? As currently written, my model summary shows an output of (472, 472, 3).
from tensorflow.keras.layers import Conv2D, MaxPooling2D, UpSampling2D
from tensorflow.keras import Input, Model
input_image = Input(shape=(470, 470, 3))
x = Conv2D(32, (3, 3), activation='relu', padding='same')(input_image)
x = MaxPooling2D((2, 2), padding='same')(x)
x = Conv2D(32, (3, 3), activation='relu', padding='same')(x)
encoded = MaxPooling2D((2, 2), padding='same')(x)
x = Conv2D(32, (3, 3), activation='relu', padding='same')(encoded)
x = UpSampling2D((2, 2))(x)
x = Conv2D(32, (3, 3), activation='relu', padding='same')(x)
x = UpSampling2D((2, 2))(x)
decoded_image = Conv2D(3, (3, 3), activation='sigmoid', padding='same')(x)
autoencoder = Model(input_image, decoded_image)
autoencoder.compile(optimizer='adadelta', loss='binary_crossentropy')
Thank you!
Change your last padding to 'valid':
decoded_image = Conv2D(3, (3, 3), activation='sigmoid', padding='valid')(x)
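Why this works, as a quick sanity check (the arithmetic below is only an illustration): 470 is not divisible by 4, so the two 'same' poolings round up and the upsampled decoder overshoots by 2 pixels, which is exactly what a final 3x3 'valid' convolution trims off.

import math

size = 470
size = math.ceil(size / 2)  # first MaxPooling2D, padding='same' -> 235
size = math.ceil(size / 2)  # second MaxPooling2D, padding='same' -> 118
size = size * 2 * 2         # two UpSampling2D((2, 2)) layers -> 472
size = size - 3 + 1         # final 3x3 Conv2D with padding='valid' -> 470
print(size)                 # 470, matching the input again

For input sizes where the overshoot is not exactly 2, a Cropping2D layer is the more general fix.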
I'm having trouble with this autoencoder I'm building using Keras. The input's shape depends on the screen size, and the output is going to be a prediction of the next screen. However, there is an error that I cannot figure out. Please excuse my awful formatting on this website.
Code:
def model_build():
    input_img = Input(shape=(1, env_size()[1], env_size()[0]))
    x = Conv2D(32, (3, 3), activation='relu', padding='same')(input_img)
    x = MaxPooling2D((2, 2), padding='same')(x)
    x = Conv2D(16, (3, 3), activation='relu', padding='same')(x)
    x = MaxPooling2D((2, 2), padding='same')(x)
    x = Conv2D(8, (3, 3), activation='relu', padding='same')(x)
    encoded = MaxPooling2D((2, 2), padding='same')(x)
    x = Conv2D(8, (3, 3), activation='relu', padding='same')(encoded)
    x = UpSampling2D((2, 2))(x)
    x = Conv2D(16, (3, 3), activation='relu', padding='same')(x)
    x = UpSampling2D((2, 2))(x)
    x = Conv2D(32, (3, 3), activation='relu')(x)
    x = UpSampling2D((2, 2))(x)
    decoded = Conv2D(1, (3, 3), activation='sigmoid', padding='same')(x)
    model = Model(input_img, decoded)
    return model

if __name__ == '__main__':
    model = model_build()
    model.compile('adam', 'mean_squared_error')
    y = np.array([env()])
    print(y.shape)
    print(y.ndim)
    debug = model.fit(np.array([[env()]]), np.array([[env()]]))
Error:
Traceback (most recent call last):
  File "/home/ai/Desktop/algernon-test/rewarders.py", line 46, in <module>
    debug = model.fit(np.array([[env()]]), np.array([[env()]]))
  File "/home/ai/.local/lib/python3.6/site-packages/keras/engine/training.py", line 952, in fit
    batch_size=batch_size)
  File "/home/ai/.local/lib/python3.6/site-packages/keras/engine/training.py", line 789, in _standardize_user_data
    exception_prefix='target')
  File "/home/ai/.local/lib/python3.6/site-packages/keras/engine/training_utils.py", line 138, in standardize_input_data
    str(data_shape))
ValueError: Error when checking target: expected conv2d_7 to have shape (4, 268, 1) but got array with shape (1, 270, 480)
EDIT:
Code for get_screen imported as env():
def get_screen():
    img = screen.grab()
    img = img.resize(screen_size())
    img = img.convert('L')
    img = np.array(img)
    return img
You have three 2x downsampling steps and three 2x upsampling steps. These steps have no knowledge of the original image size, so they will round the size up to the nearest multiple of 8 = 2^3.
cropX = 7 - ((size[0]+7) % 8)
cropY = 7 - ((size[1]+7) % 8)
It ought to work if you add a new final layer:
decoded = layers.Cropping2D(((0,cropY),(0,cropX)))(x)
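For the 270x480 screen in this question, a worked example (under this answer's simplification that every Conv2D keeps its spatial size, i.e. uses padding='same'):

# height: 270 -> 135 -> 68 -> 34 after pooling, then 68 -> 136 -> 272 after
# upsampling, so two extra rows; width 480 is already a multiple of 8
cropY = 7 - ((270 + 7) % 8)  # = 2
cropX = 7 - ((480 + 7) % 8)  # = 0

decoded = layers.Cropping2D(((0, cropY), (0, cropX)))(x)  # (272, 480) -> (270, 480)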
It looks like env_size() and env() mess up the image dimensions somehow. Consider this example:
image1 = np.random.rand(1, 1, 270, 480) #First dimension is batch size for test purpose
image2 = np.random.rand(1, 4, 268, 1) #Or any other arbitrary dimensions
input_img = layers.Input(shape=image1[0].shape)
x = layers.Conv2D(32, (3, 3), activation='relu', padding='same')(input_img)
x = layers.MaxPooling2D((2, 2), padding='same')(x)
x = layers.Conv2D(16, (3, 3), activation='relu', padding='same')(x)
x = layers.MaxPooling2D((2, 2), padding='same')(x)
x = layers.Conv2D(8, (3, 3), activation='relu', padding='same')(x)
encoded = layers.MaxPooling2D((2, 2), padding='same')(x)
x = layers.Conv2D(8, (3, 3), activation='relu', padding='same')(encoded)
x = layers.UpSampling2D((2, 2))(x)
x = layers.Conv2D(16, (3, 3), activation='relu', padding='same')(x)
x = layers.UpSampling2D((2, 2))(x)
x = layers.Conv2D(32, (3, 3), activation='relu')(x)
x = layers.UpSampling2D((2, 2))(x)
decoded = layers.Conv2D(1, (3, 3), activation='sigmoid', padding='same')(x)
model = tf.keras.Model(input_img, decoded)
model.compile('adam', 'mean_squared_error')
model.summary()
This line will work (note that fit needs a target as well, and tf.keras uses epochs rather than nb_epoch):
model.fit(image1, image1, epochs=1, batch_size=1)
But this doesn't:
model.fit(image2, image2, epochs=1, batch_size=1)
Edit:
In order to get an output of the same size as the input, you need to calculate the convolution kernel sizes carefully.
image1 = np.random.rand(1, 1920, 1080, 1)
input_img = layers.Input(shape=image1[0].shape)
x = layers.Conv2D(32, 3, activation='relu', padding='same')(input_img)
x = layers.MaxPooling2D((2, 2), padding='same')(x)
x = layers.Conv2D(16, 3, activation='relu', padding='same')(x)
x = layers.MaxPooling2D((2, 2), padding='same')(x)
x = layers.Conv2D(8, 3, activation='relu', padding='same')(x)
encoded = layers.MaxPooling2D((2, 2), padding='same')(x)
x = layers.Conv2D(8, 3, activation='relu', padding='same')(encoded)
x = layers.UpSampling2D((2, 2))(x)
x = layers.Conv2D(16, 3, activation='relu', padding='same')(x)
x = layers.UpSampling2D((2, 2))(x)
x = layers.Conv2D(32, 1, activation='relu')(x) # set kernel size to 1 for example
x = layers.UpSampling2D((2, 2))(x)
decoded = layers.Conv2D(1, 3, activation='sigmoid', padding='same')(x)
model = tf.keras.Model(input_img, decoded)
model.compile('adam', 'mean_squared_error')
model.summary()
This outputs the same dimensions.
As per this guide, http://cs231n.github.io/convolutional-networks/:
We can compute the spatial size of the output volume as a function of the input volume size (W), the receptive field size of the Conv Layer neurons (F), the stride with which they are applied (S), and the amount of zero padding used (P) on the border. You can convince yourself that the correct formula for calculating how many neurons "fit" is given by (W−F+2P)/S+1. For example, for a 7x7 input and a 3x3 filter with stride 1 and pad 0 we would get a 5x5 output. With stride 2 we would get a 3x3 output.
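As a quick check of that formula (a hypothetical helper, not part of the guide):

def conv_output_size(W, F, S=1, P=0):
    """Spatial output size of a convolution: (W - F + 2P) / S + 1."""
    return (W - F + 2 * P) // S + 1

print(conv_output_size(7, 3))       # 5 -> the 7x7 input, 3x3 filter example
print(conv_output_size(7, 3, S=2))  # 3 -> same filter with stride 2
print(conv_output_size(960, 1))     # 960 -> the kernel-size-1 layer above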
I'm simply trying to do what the title says. Here's my code:
def ConvAutoEncoder(train_data, test_data, n_epochs=50, batchSize=128, data_shape=(IMAGE_SIZE, IMAGE_SIZE, 3)):
    print('Training Neural Network')
    input_img = Input(shape=data_shape)
    x = Conv2D(32, (3, 3), activation='relu', padding='same')(input_img)
    x = MaxPooling2D((2, 2), padding='same')(x)
    print(x.shape)
    x = Conv2D(16, (3, 3), activation='relu', padding='same')(x)
    x = MaxPooling2D((2, 2), padding='same')(x)
    print(x.shape)
    x = Conv2D(8, (3, 3), activation='relu', padding='same')(x)
    x = MaxPooling2D((2, 2), padding='same')(x)
    print(x.shape)
    x = Conv2D(4, (3, 3), activation='relu', padding='same')(x)
    encoded = MaxPooling2D((2, 2), padding='same')(x)
    print(encoded.shape)

    # at this point the representation is (6, 6, 4), i.e. 144-dimensional
    encoded = Flatten()(encoded)
    encoded = Dense(6 * 6 * 4, activation='relu')(encoded)
    print(encoded.shape)
    endoded = Reshape((6, 6, 4))(encoded)
    print(encoded.shape)

    x = Conv2D(4, (3, 3), activation='relu', padding='same')(encoded)
    x = UpSampling2D((2, 2))(x)
    print(x.shape)
    x = Conv2D(8, (3, 3), activation='relu', padding='same')(x)
    x = UpSampling2D((2, 2))(x)
    print(x.shape)
    x = Conv2D(16, (3, 3), activation='relu', padding='same')(x)
    x = UpSampling2D((2, 2))(x)
    print(x.shape)
    x = Conv2D(32, (3, 3), activation='relu', padding='same')(x)
    x = UpSampling2D((2, 2))(x)
    print(x.shape)
    decoded = Conv2D(3, (3, 3), activation='sigmoid', padding='same')(x)

    autoencoder = Model(input_img, decoded)
    autoencoder.compile(optimizer='adadelta', loss='binary_crossentropy')
    autoencoder.fit(train_data, train_data,
                    epochs=n_epochs,
                    batch_size=batchSize,
                    shuffle=True,
                    verbose=2,
                    validation_data=(test_data, test_data),
                    callbacks=[TensorBoard(log_dir='/tmp/autoencoder')])
    return autoencoder
However, when I run it, the Reshape layer doesn't do anything at all: the shape of the output before the reshape is (?, 144) and the shape after is also (?, 144). Am I using Reshape wrong, or is there some other way to connect a Dense layer to a Conv2D layer?
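No answer is attached above, but a likely cause is visible in the code as posted: the result of the Reshape is assigned to endoded (a typo), so the next line still uses encoded, which is the flat (?, 144) Dense output, and the reshaped tensor is silently discarded. With the typo fixed, the printed shape becomes (?, 6, 6, 4):

encoded = Reshape((6, 6, 4))(encoded)  # 'encoded', not 'endoded'
print(encoded.shape)                   # (?, 6, 6, 4)
x = Conv2D(4, (3, 3), activation='relu', padding='same')(encoded)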
I set up the convolutional and pooling layers, then deconvolve and un-pool. The input is 256 * 256 * 3 images, but at the end there is a shape error:
def build_auto_encode_model(shape=(256, 256, 3)):
    input_img = Input(shape=shape)
    x = Convolution2D(16, (3, 3), activation='relu', padding='same')(input_img)
    x = MaxPooling2D((2, 2), padding='same')(x)
    x = Convolution2D(8, (3, 3), activation='relu', padding='same')(x)
    x = MaxPooling2D((2, 2), padding='same')(x)
    x = Convolution2D(8, (3, 3), activation='relu', padding='same')(x)
    encoded = MaxPooling2D((2, 2), padding='same')(x)

    x = Convolution2D(8, (3, 3), activation='relu', padding='same')(encoded)
    x = UpSampling2D((2, 2))(x)
    x = Convolution2D(8, (3, 3), activation='relu', padding='same')(x)
    x = UpSampling2D((2, 2))(x)
    x = Convolution2D(16, (3, 3), activation='relu')(x)
    x = UpSampling2D((2, 2))(x)
    decoded = Convolution2D(3, (3, 3), activation='sigmoid', padding='same')(x)

    encoder = Model(inputs=input_img, outputs=encoded)
    autoencoder = Model(inputs=input_img, outputs=decoded)
    autoencoder.compile(optimizer='adadelta', loss='binary_crossentropy')
    return encoder, autoencoder

def train_auto_encode_model(encoder_model_path="./data/encoder.h5"):
    X = np.load("data/train.npy")
    X_train = X[int(round(X.shape[0] * 0.2)):, :]
    X_test = X[0:int(round(X.shape[0] * 0.2)), :]
    encoder, autoencoder = build_auto_encode_model()
    autoencoder.fit(X_train, X_train, epochs=10, batch_size=64, shuffle=True, validation_data=(X_test, X_test))
    encoder.save(encoder_model_path)
Here is the error I get:
Error when checking target: expected conv2d_7 to have shape (252, 252, 3) but got array with shape (256, 256, 3)
By using autoencoder.summary() you would see that the output shape of the last Conv2D layer is (None, 252, 252, 3), so labels of shape (256, 256, 3) are not compatible. The cause is that you have forgotten to set the padding argument of the third Conv2D layer in the decoder (the one with 16 filters). Setting it to 'same' resolves the problem:
x = Convolution2D(16, (3, 3), activation='relu', padding='same')(x)
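As a sanity check of the shapes after this fix (arithmetic only; the summary call assumes the functions above):

# Encoder:  256 -> 128 -> 64 -> 32   (three poolings with padding='same')
# Decoder:  32 -> 64 -> 128 -> 256   (three upsamplings, all convs now 'same')
# Before the fix, the 'valid' 3x3 conv turned 128 into 126, and 126 * 2 = 252.
encoder, autoencoder = build_auto_encode_model()
autoencoder.summary()  # the last layer should now show (None, 256, 256, 3)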