Understanding of Basic Neural Network Structure - python

Let's say I want to code a basic neural network in Keras that has 10 units in the input layer and 3 units in the output layer.
Now, if I give an input_shape of more than 10, how will the model handle it?
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
model = Sequential()
model.add(Dense(10, activation = 'relu', input_shape = (64,)))
model.add(Dense(3, activation = 'sigmoid'))
model.summary()
You see, here input_shape has size 64, but how will that fit a model whose first layer has 10 units? From what I have learned, the size of the input shape/vector should equal the number of units in the input layer.
Or am I not implementing this neural network correctly?

That would not be a problem. A weight matrix of shape (64, 10) connects the input to the first Dense layer (Keras stores Dense kernels as (input_dim, units)): your input has 64 features, the first hidden layer has 10 units, and the output layer has 3 units. Seems fine to me.
But your input layer itself has size 64, so what you are getting is a 3-layer network with a hidden layer of 10 units.

If the shape of your input vector is 64, then you really do need an input layer of size 64. The input layer of a neural network doesn't perform any computations; it just passes the inputs forward to the first hidden layer. That layer, on the other hand, performs the computations for all the neurons it contains (a linear combination of the input vector and the weights, which is then fed into the activation function, ReLU in your case).
In your code, you are building a neural net with 64 input neurons (which again don't perform any computations), 10 neurons in the first (and only) hidden layer and 3 neurons in the output layer.
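A quick way to convince yourself is to inspect the kernel shapes and parameter counts; this is a minimal sketch of the model from the question:
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

model = Sequential()
model.add(Dense(10, activation='relu', input_shape=(64,)))
model.add(Dense(3, activation='sigmoid'))

# Keras stores Dense kernels as (input_dim, units):
print(model.layers[0].kernel.shape)  # (64, 10): 64*10 + 10 biases = 650 parameters
print(model.layers[1].kernel.shape)  # (10, 3):  10*3  + 3 biases  =  33 parameters
model.summary()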

Related

Keras Flatten and GRU input_shape difference receiving same inputs

I'm looking at an example from a book. The input is of shape (samples=128, timesteps=24, features=13). When defining two different networks both receiving the same input they have different input_shape on flatten and GRU layers.
model 1:
model = Sequential()
model.add(layers.Flatten(input_shape=(24, 13)))
model.add(layers.Dense(32, activation='relu'))
model.add(layers.Dense(1))
model 2:
model = Sequential()
model.add(layers.GRU(32, input_shape=(None, 13)))
model.add(layers.Dense(1))
I understand that input_shape represents the shape of a single input (not counting the batch size), so to my understanding the input_shape in both cases should be (24, 13).
Why are the input_shapes different between model 1 and model 2?
GRU is a recurrent unit (RNN), which takes a sequence of data as input. The expected input shape for GRU is (batch size, sequence length, feature size). In your case the sequence length is 24 and feature size is 13.
As usual, you don't need to specify a batch size in the input_shape argument. Additionally, for recurrent units like GRU or LSTM you can use None instead of the sequence length, so that the layer accepts sequences of any length. This is why input_shape=(None, 13) is allowed here.
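As a rough sketch using the 13-feature setup from the question, a GRU declared with a None sequence length accepts batches with different numbers of timesteps:
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import GRU, Dense

model = Sequential()
model.add(GRU(32, input_shape=(None, 13)))   # only the feature size is fixed
model.add(Dense(1))

# Sequences of 24 and of 50 timesteps both work with the same model:
print(model.predict(np.random.rand(4, 24, 13)).shape)  # (4, 1)
print(model.predict(np.random.rand(4, 50, 13)).shape)  # (4, 1)
Flatten, by contrast, has no weights of its own but must know the full (24, 13) shape, so that the following Dense(32) layer can be built with 24 * 13 = 312 inputs.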

Introduction part in Deep Learning with Keras using Python

The exercise instructions:
Instantiate a Sequential model.
Add a Dense layer of 50 neurons with an input shape of 1 neuron.
Add two Dense layers of 50 neurons each and 'relu' activation.
End your model with a Dense layer with a single neuron and no activation.
Below is my code:
# Instantiate a Sequential model
model = Sequential()
# Add a Dense layer with 50 neurons and an input of 1 neuron
model.add(Dense(50, input_shape=(2,), activation='relu'))
# Add two Dense layers with 50 neurons and relu activation
model.add(Dense(____,____=____))
model.____
# End your model with a Dense layer and no activation
model.____
I am confused about the part
model.add(Dense(____,____=____))
In the documentation, the only required parameter for the Dense layer is units, which is the number of neurons. The default activation function is None, so if you want it to be "relu", do activation="relu".
In conclusion, this is that piece of code that creates a Dense layer with 50 neurons and activation as relu:
model.add(Dense(50, activation="relu"))
In model.add(Dense(____, ____=____)) you have three blanks: the first is the number of neurons, the second is the name of the argument you want to set (activation), and the third is the value you set it to ('relu').
So you get model.add(Dense(50, activation='relu'))
More information can be found in the Dense layer documentation.
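Put together, the full exercise might look like this (a sketch only; note the instructions ask for an input shape of 1 neuron, so input_shape=(1,) rather than the (2,) in your code):
from keras.models import Sequential
from keras.layers import Dense

# Instantiate a Sequential model
model = Sequential()
# Add a Dense layer of 50 neurons with an input shape of 1 neuron
model.add(Dense(50, input_shape=(1,), activation='relu'))
# Add two Dense layers of 50 neurons each with 'relu' activation
model.add(Dense(50, activation='relu'))
model.add(Dense(50, activation='relu'))
# End with a Dense layer with a single neuron and no activation
model.add(Dense(1))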

Replicated Convolutional Bi-Directional LSTM implementation in Keras diverging

This is the model I am trying to replicate (more information in linked paper):
In our models, we adopted one dropout layer between LSTM models and the first fully-connected layer and another dropout layer between the first fully-connected layer and the second fully-connected layer. Their masking probabilities are both set to 0.5.
...
For our proposed CBLSTM, one-layer CNN is firstly designed, whose filter number, filter size and pooling size are set to 150, 10 and 5. Therefore, the shape of the raw sensory sequence is changed from 100 x 12 to 19 x 150 after CNN. Then, a two-layer bi-directional LSTM is built on top of the CNN.
Backward and forward LSTMs share the same layer sizes as [150, 200]. Therefore, the output of the LSTM module is the concatenated vector of the representations learned by backward and forward LSTMs, and its dimensionality is 400. Then, before feeding the representation into the linear regression layer, two fully-connected layers with a size of [500, 600] are adopted. The nonlinearity activation functions in our proposed CBLSTM are all set to ReLu.
Source: Zhao, R., Yan, R., Wang, J., & Mao, K. (2017). Learning to monitor machine health with convolutional bi-directional LSTM networks. Sensors, 17(2), 273. link to paper
The input is 630 samples x 100 timesteps x 12 features.
How my model looks at the moment:
from keras.models import Sequential
from keras.layers import Conv1D, MaxPooling1D, Bidirectional, LSTM, Dropout, Dense

model = Sequential()
model.add(Conv1D(filters=150, kernel_size=10, activation='relu', input_shape=(100,12)))
model.add(MaxPooling1D(pool_size=5, strides=None, padding='valid'))
model.add(Bidirectional(LSTM(150, return_sequences=True), merge_mode='concat'))
model.add(Bidirectional(LSTM(200, return_sequences=False), merge_mode='concat'))
model.add(Dropout(0.5))
model.add(Dense(500, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(600, activation='relu'))
model.add(Dense(1))
model.compile(loss='mean_squared_error', optimizer='rmsprop', metrics=['mae'])
While the training loss steadily decreases per epoch, the validation loss does not; it diverges pretty quickly. This indicates there is a mistake in my model which I have not yet been able to find. Any ideas as to what is wrong?
Side note: I am using the same data as input as the authors.

Keras confusion about number of layers

I'm a bit confused about the number of layers that are used in Keras models. The documentation is rather opaque on the matter.
According to Jason Brownlee the first layer technically consists of two layers, the input layer, specified by input_dim and a hidden layer. See the first questions on his blog.
In all of the Keras documentation the first layer is generally specified as
model.add(Dense(number_of_neurons, input_dim=number_of_cols_in_input, activation=some_activation_function))
The most basic model we could make would therefore be:
model = Sequential()
model.add(Dense(1, input_dim = 100, activation = None))
Does this model consist of a single layer, where 100 dimensional input is passed through a single input neuron, or does it consist of two layers, first a 100 dimensional input layer and second a 1 dimensional hidden layer?
Further, if I were to specify a model like this, how many layers does it have?
model = Sequential()
model.add(Dense(32, input_dim = 100, activation = 'sigmoid'))
model.add(Dense(1))
Is this a model with 1 input layer, 1 hidden layer, and 1 output layer or is this a model with 1 input layer and 1 output layer?
Your first model consists of a 100-neuron input layer connected to a single output neuron.
Your second model consists of a 100-neuron input layer, one hidden layer of 32 neurons and one output layer with a single neuron.
You have to think of your first Dense layer as your input layer (with as many neurons as the input dimension, so 100 for you) connected to another layer with as many neurons as you specify (1 in your first case, 32 in the second).
In Keras, a useful command here is:
model.summary()
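For example, a quick sketch of how model.summary() confirms this for the second model (the exact output formatting depends on your Keras version):
from keras.models import Sequential
from keras.layers import Dense

model = Sequential()
model.add(Dense(32, input_dim=100, activation='sigmoid'))
model.add(Dense(1))
model.summary()
# The summary lists two Dense layers:
#   Dense(32): 100 * 32 + 32 = 3,232 parameters
#   Dense(1):   32 * 1  +  1 =    33 parameters
# The 100-dimensional input appears only as the input shape, not as a separate trainable layer.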
For your first question, the model is:
1 input layer and 1 output layer.
For the second question:
1 input layer
1 hidden layer
1 activation layer (the sigmoid one)
1 output layer
The input layer is abstracted by Keras through the input_dim or input_shape argument, but you can find this layer in:
from keras.layers import Input
Same for the activation layer.
from keras.layers import Activation
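As an illustration (a sketch only, not required in practice), the second model from the question can be rewritten with the input and activation as explicit layers, which makes the counting easier:
import tensorflow as tf

# The same 100 -> 32 -> 1 model with the input and activation written as explicit layers:
model = tf.keras.models.Sequential()
model.add(tf.keras.Input(shape=(100,)))           # input "layer": holds no weights, does no computation
model.add(tf.keras.layers.Dense(32))              # hidden layer (linear part)
model.add(tf.keras.layers.Activation('sigmoid'))  # the activation, split out as its own layer
model.add(tf.keras.layers.Dense(1))               # output layer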
# Create a `Sequential` model and add a Dense layer as the first layer.
model = tf.keras.models.Sequential()
model.add(tf.keras.Input(shape=(16,)))
model.add(tf.keras.layers.Dense(32, activation='relu'))
# Now the model will take as input arrays of shape (None, 16)
# and output arrays of shape (None, 32).
# Note that after the first layer, you don't need to specify
# the size of the input anymore:
model.add(tf.keras.layers.Dense(32))
model.output_shape
(None, 32)
model.layers
[<keras.layers.core.dense.Dense at 0x7f494062e950>,
<keras.layers.core.dense.Dense at 0x7f4944048d90>]
model.summary()
The printed summary lists the two Dense layers with their output shapes and parameter counts; it may help you understand this clearly.

Keras word embedding in four gram model

I am following the Coursera neural network class and I am trying to do the assignments using Python + Keras instead of Octave.
I want to predict the fourth word given the previous three ones. My input documents total 250 unique words.
The model should have an embedding layer that maps each word to a 50-dimensional vector, a hidden layer of 200 neurons with a sigmoid activation function, and an output layer of 250 units giving, through a softmax activation, the probability that the fourth word is each word in my vocabulary.
I am having troubles with dimensions. Here is my code:
from keras.models import Sequential
from keras.layers import Dense, Activation, Embedding
model = Sequential([
    Embedding(250, 50),
    Dense(200, activation='sigmoid'),
    Dense(250, activation='softmax')
])
model.compile(optimizer='rmsprop',
              loss='categorical_crossentropy',
              metrics=['accuracy'])
Yet I never get to compile the model since I am encountering the following error:
Exception: Input 0 is incompatible with layer dense_1: expected ndim=2, found ndim=3
Any hint will be much appreciated. Thanks in advance
From https://blog.keras.io/using-pre-trained-word-embeddings-in-a-keras-model.html
"All that the Embedding layer does is to map the integer inputs to the vectors found at the corresponding index in the embedding matrix, i.e. the sequence [1, 2] would be converted to [embeddings[1], embeddings[2]]. This means that the output of the Embedding layer will be a 3D tensor of shape (samples, sequence_length, embedding_dim)."
Your Embedding layer outputs a 3D tensor, while the Dense layers expect 2D input.
You can follow the linked tutorial; with some modifications it will fit your problem.
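One way to resolve the mismatch (a sketch, assuming each sample is a sequence of exactly three word indices, so the Keras 2-style input_length argument applies) is to fix the sequence length and flatten the embeddings before the Dense layers:
from keras.models import Sequential
from keras.layers import Dense, Embedding, Flatten

model = Sequential([
    Embedding(250, 50, input_length=3),  # (batch, 3) -> (batch, 3, 50)
    Flatten(),                           # (batch, 3, 50) -> (batch, 150)
    Dense(200, activation='sigmoid'),
    Dense(250, activation='softmax')
])
model.compile(optimizer='rmsprop',
              loss='categorical_crossentropy',
              metrics=['accuracy'])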
