Keras sequential model with multiple inputs, Tensorflow 1.9.0 - python

I try creating a neural network, having two inputs of a particular size (here four) each and one output of the same size size (so also four). Unfortunately, I always get this error when running my code:
ValueError: Error when checking model input: the list of Numpy arrays that you are passing to your model is not
the size the model expected. Expected to see 2 array(s), but instead got the following list of 1 arrays:
[array([[[-1.07920336, 1.16782929, 1.40131554, -0.30052492],
[-0.50067655, 0.54517916, -0.87033621, -0.22922157]],
[[-0.53766128, -0.03527806, -0.14637072, 2.32319071],
[ 0...
I think, the problem lies in the fact, that once I pass the data for training, the input shape is either incorrect or I have a datatype issue. Hence, there is an extra list bracket around the array.
I'm using Tensorflow 1.9.0 (due to project restrictions). I already checked the search function and tried solutions provided here. Following is an example code for reproducting the error of mine:
import numpy as np
import tensorflow as tf
from tensorflow import keras
import keras.backend as K
from tensorflow.keras import layers, models
def main():
ip1 = keras.layers.Input(shape=(4,))
ip2 = keras.layers.Input(shape=(4,))
dense = layers.Dense(3, activation='sigmoid', input_dim=4) # Passing the value in a weighted manner
merge_layer = layers.Concatenate()([ip1, ip2]) # Concatenating the outputs of the first network
y = layers.Dense(6, activation='sigmoid')(merge_layer) # Three fully connected layers
y = layers.Dense(4, activation='sigmoid')(y)
model = keras.Model(inputs=[ip1, ip2], outputs=y)
model.compile(optimizer='adam',
loss='mean_squared_error')
model.summary()
# dataset shape: 800 samples, 2 inputs for sequential model, 4 input size
X_train = np.random.randn(800, 2, 4)
y_train = np.random.randn(800, 4)
X_test = np.random.randn(200, 2, 4)
y_test = np.random.randn(200, 4)
history = model.fit(X_train, y_train, validation_data=(X_test, y_test), epochs=1000, batch_size=32)
if __name__ == '__main__':
main()

When there is multiple inputs keras expects list of multiple arrays. The size of the list corresponds to number of inputs you have for the model.
So basically you need to pass a list of 2 array each with shape (X,4)
X_train1 = np.random.randn(800, 4)
X_train2=np.random.randn(800,4)
y_train = np.random.randn(800, 4)
X_test1 = np.random.randn(200, 4)
X_test2 = np.random.randn(200, 4)
y_test = np.random.randn(200, 4)
history = model.fit([X_train1,X_train2], y_train, validation_data=([X_test1,X_test2], y_test), epochs=1000, batch_size=32)

Related

Input to the Neural Network using an array

I am writing a neural network to take the Mel frequency coefficients as inputs and then run the model. My dataset contains 100 samples - each sample is an array of 12 values corresponding to the coefficients. After splitting this data into train and test sets, I have created the X input corresponding to the array and the y input corresponding to the label.
Data array containing the coefficients
Here is a small sample of my data containing 5 elements in the X_train array:
['[107.59366 -14.153783 24.799461 -8.244417 20.95272\n -4.375943 12.77285 -0.92922235 3.9418116 7.3581047\n -0.30066165 5.441765 ]'
'[ 96.49664 2.0689797 21.557552 -32.827045 7.348135 -23.513977\n 7.9406714 -16.218931 10.594619 -21.4381 0.5903044 -10.569035 ]'
'[105.98041 -2.0483367 12.276348 -27.334534 6.8239 -23.019623\n 7.5176797 -21.884727 11.349695 -22.734652 3.0335162 -11.142375 ]'
'[ 7.73094559e+01 1.91073620e+00 6.72225571e+00 -2.74525508e-02\n 6.60858107e+00 5.99264860e-01 1.96265772e-01 -3.94772577e+00\n 7.46383286e+00 5.42239428e+00 1.21432066e-01 2.44894314e+00]']
When I create the Neural network, I want to use the 12 coefficients as an input for the network. In order to do this, I need to use each row of my X_train dataset that contains these arrays as the input. However, when I try to consider the array index as an input it gives me shape errors when trying to fit the model. My model is as follows:
def build_model_graph():
model = Sequential()
model.add(Input(shape=(12,)))
model.add(Dense(12))
model.add(Activation('relu'))
model.add(Dense(10))
model.add(Activation('relu'))
model.add(Dense(num_labels))
model.add(Activation('softmax'))
# Compile the model
model.compile(loss='categorical_crossentropy', metrics=['accuracy'], optimizer='adam')
return model
Here, I want to use every row of the X_train array as an input which would correspond to the shape(12,). When I use something like this:
num_epochs = 50
num_batch_size = 32
model.fit(x_train, y_train, batch_size=num_batch_size, epochs=num_epochs,
validation_data=(x_test, y_test), verbose=1)
I get an error for the shape which makes sense to me.
For reference, the error is as follows:
ValueError: Exception encountered when calling layer "sequential_20" (type Sequential).
Input 0 of layer "dense_54" is incompatible with the layer: expected min_ndim=2, found ndim=1. Full shape received: (None,)
But I am not exactly sure how I can extract the array of 12 coefficients present at each index of the X_train and then use it in the model input. Indexing the x_train and y_train did not work either. If anyone could point me in a relevant direction, it would be extremely helpful. Thanks!
Edit: My code for the dataframe is as follows:
clapdf = pd.read_csv("clapsdf.csv")
clapdf.drop('Unnamed: 0', inplace=True, axis=1)
clapdf.head()
nonclapdf = pd.read_csv("nonclapsdf.csv")
nonclapdf.drop('Unnamed: 0', inplace=True, axis=1)
sound_df = clapdf.append(nonclapdf)
sound_df.head()
d=sound_data.tolist()
df=pd.DataFrame(data=d)
data = df[0].to_numpy()
print("Before-->", data.shape)
dat = np.array([np.array(d) for d in data])
print('After-->', dat.shape)
Here, the shape remains the same as the values of each of the 80 samples are not in a comma separated format but instead in the form of a series.
If your data looks like this:
samples = 2
features = 12
x_train = tf.random.normal((samples, 1, features))
tf.Tensor(
[[[-2.5988803 -0.629626 -0.8306641 -0.78226614 0.88989156
-0.3851106 -0.66053045 1.0571191 -0.59061646 -1.1602987
0.69124466 -0.04354193]]
[[-0.86917496 2.2923143 -0.05498986 -0.09578358 0.85037625
-0.54679644 -1.2213608 -1.3766612 0.35416105 -0.57801914
-0.3699728 0.7884727 ]]], shape=(2, 1, 12), dtype=float32)
You will have to reshape it to (2, 12) in order to fit your model with the input shape (batch_size, 12):
import tensorflow as tf
def build_model_graph():
model = tf.keras.Sequential()
model.add(tf.keras.layers.Input(shape=(12,)))
model.add(tf.keras.layers.Dense(12))
model.add(tf.keras.layers.Activation('relu'))
model.add(tf.keras.layers.Dense(10))
model.add(tf.keras.layers.Activation('relu'))
model.add(tf.keras.layers.Dense(2))
model.add(tf.keras.layers.Activation('softmax'))
# Compile the model
model.compile(loss='categorical_crossentropy', metrics=['accuracy'], optimizer='adam')
return model
model = build_model_graph()
samples = 2
features = 12
x_train = tf.random.normal((samples, 1, features))
x_train = tf.reshape(x_train, (samples, features))
y = tf.random.uniform((samples, 1), maxval=2, dtype=tf.int32)
y_train = tf.keras.utils.to_categorical(y, 2)
model.fit(x_train, y_train, batch_size=1, epochs=2)
Also, you usually need to convert your labels to one-hot encoded vectors if you plan to use categorical_crossentropy.
y_train looks like this:
[[0. 1.]
[1. 0.]]
Update 1:
If your data is coming from a dataframe, try something like this:
import numpy as np
import pandas as pd
d = {'features': [[0.18525402, 0.92130125, 0.2296906, 0.75818471, 0.69813222, 0.47147329,
0.03560711, 0.06583931, 0.90921289, 0.76002148, 0.50413995, 0.36099004],
[0.18525402, 0.92130125, 0.2296906, 0.75818471, 0.69813222, 0.47147329,
0.03560711, 0.06583931, 0.90921289, 0.76002148, 0.50413995, 0.36099004]]}
df = pd.DataFrame(data=d)
data = df['features'].to_numpy()
print('Before -->', data.shape)
data = np.array([np.array(d) for d in data])
print('After -->', data.shape)
Before --> (2,)
After --> (2, 12)

Keras Normalization for a 2d input array

I am new to machine learning and trying to apply it to my problem.
I have a training dataset with 44000 rows of features with shape 6, 25. I want to build a sequential model. I was wondering if there is a way to use the features without flattening it. Currently, I flatten the features to 1d array and normalize for training (see the code below). I could not find a way to normalize 2d features.
dataset2d = dataset2d.reshape(dataset2d.shape[0],
dataset2d.shape[1]*dataset2d.shape[2])
normalizer = preprocessing.Normalization()
normalizer.adapt(dataset2d)
print(normalizer.mean.numpy())
x_train, x_test, y_train, y_test = train_test_split(dataset2d, flux_val,
test_size=0.2)
# %% DNN regression multiple parameter
def build_and_compile_model(norm):
inputs = Input(shape=(x_test.shape[1],))
x = norm(inputs)
x = layers.Dense(128, activation="selu")(x)
x = layers.Dense(64, activation="relu")(x)
x = layers.Dense(32, activation="relu")(x)
x = layers.Dense(1, activation="linear")(x)
model = Model(inputs, x)
model.compile(loss='mean_squared_error',
optimizer=keras.optimizers.Adam(learning_rate=1e-3))
return model
dnn_model = build_and_compile_model(normalizer)
dnn_model.summary()
# interrupt training when model is no longer imporving
path_checkpoint = "model_checkpoint.h5"
modelckpt_callback = keras.callbacks.ModelCheckpoint(monitor="val_loss",
filepath=path_checkpoint,
verbose=1,
save_weights_only=True,
save_best_only=True)
es_callback = keras.callbacks.EarlyStopping(monitor="val_loss",
min_delta=0, patience=10)
history = dnn_model.fit(x_train, y_train, validation_split=0.2,
epochs=120, callbacks=[es_callback, modelckpt_callback])
I also tried to modify my model input layer to the following, such that I do not need to reshape my input
inputs = Input(shape=(x_test.shape[-1], x_test.shape[-2], ))
and modify the normalization to the following
normalizer = preprocessing.Normalization(axis=1)
normalizer.adapt(dataset2d)
print(normalizer.mean.numpy())
But this does not seem to help. The normalization adapts to a 1d array of length 6, while I want it to adapt to a 2d array of shape 25, 6.
Sorry for the long question. You help will be much appreciated.
I'm not sure if I understood your issue. The normalizer layer can take N-D tensor and it produces an output with the same shape, for example:
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
import numpy as np
t = tf.constant(np.arange(2*3*4).reshape(2,3,4) , dtype=tf.float32)
tf.print("\n",t)
normalizer_layer = tf.keras.layers.LayerNormalization(axis=1)
output = normalizer_layer(t)
tf.print("\n",output)

LSTM: Input 0 of layer sequential is incompatible with the layer

I know there are several questions about this here, but I haven't found one which fits exactly my problem.
I'm trying to fit an LSTM with data from Pandas DataFrames but getting confused about the format I have to provide them.
I created a small code snipped which shall show you what I try to do:
import pandas as pd, tensorflow as tf, random
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense
targets = pd.DataFrame(index=pd.date_range(start='2019-01-01', periods=300, freq='D'))
targets['A'] = [random.random() for _ in range(len(targets))]
targets['B'] = [random.random() for _ in range(len(targets))]
features = pd.DataFrame(index=targets.index)
for i in range(len(features)) :
features[str(i)] = [random.random() for _ in range(len(features))]
model = Sequential()
model.add(LSTM(units=targets.shape[1], input_shape=features.shape))
model.compile(optimizer='adam', loss='mae')
model.fit(features, targets, batch_size=10, epochs=10)
this results to:
ValueError: Input 0 of layer sequential is incompatible with the layer: expected ndim=3, found ndim=2. Full shape received: [10, 300]
which I expect relates to the dimensions of the features DataFrame provided. I guess that once fixed this the next error would mention the targets DataFrame.
As far as I understand, 'units' parameter of my first layer defines the output dimensionality of this model. The inputs have to have a 3D shape, but I don't know how to create them out of the 2D world of the Data Frames.
I hope you can help me understanding the reshape mechanism in Python and how to use them in combination with Pandas DataFrames. (I'm quite new to Python and came from R)
Thankls in advance
Lets looks at the few popular ways in LSTMs are used.
Many to Many
Example: You have a sentence (composed of words in sequence). Give these sequence of words you would like to predict the Parts of speech (POS) of each word.
So you have n words and you feed each word per timestep to the LSTM. Each LSTM timestep (also called LSTM unwrapping) will produce and output. The word is represented by a a set of features normally word embeddings. So the input to LSTM is of size bath_size X time_steps X features
Keras code:
inputs = keras.Input(shape=(10,3))
lstm = keras.layers.LSTM(8, input_shape = (10, 3), return_sequences = True)(inputs)
outputs = keras.layers.TimeDistributed(keras.layers.Dense(5, activation='softmax'))(lstm)
model = keras.Model(inputs=inputs, outputs=outputs)
model.compile(loss='categorical_crossentropy', optimizer='adam')
X = np.random.randn(4,10,3)
y = np.random.randint(0,2, size=(4,10,5))
model.fit(X, y, epochs=2)
print (model.predict(X).shape)
Many to One
Example: Again you have a sentence (composed of words in sequence). Give these sequence of words you would like to predict sentiment of the sentence if it is positive or negative.
Keras code
inputs = keras.Input(shape=(10,3))
lstm = keras.layers.LSTM(8, input_shape = (10, 3), return_sequences = False)(inputs)
outputs =keras.layers.Dense(5, activation='softmax')(lstm)
model = keras.Model(inputs=inputs, outputs=outputs)
model.compile(loss='categorical_crossentropy', optimizer='adam')
X = np.random.randn(4,10,3)
y = np.random.randint(0,2, size=(4,5))
model.fit(X, y, epochs=2)
print (model.predict(X).shape)
Many to multi-headed
Example: You have a sentence (composed of words in sequence). Give these sequence of words you would like to predict sentiment of the sentence as well the author of the sentence.
This is multi-headed model where one head will predict the sentiment and another head will predict the author. Both the heads share the same LSTM backbone.
Keras code
inputs = keras.Input(shape=(10,3))
lstm = keras.layers.LSTM(8, input_shape = (10, 3), return_sequences = False)(inputs)
output_A = keras.layers.Dense(5, activation='softmax')(lstm)
output_B = keras.layers.Dense(5, activation='softmax')(lstm)
model = keras.Model(inputs=inputs, outputs=[output_A, output_B])
model.compile(loss='categorical_crossentropy', optimizer='adam')
X = np.random.randn(4,10,3)
y_A = np.random.randint(0,2, size=(4,5))
y_B = np.random.randint(0,2, size=(4,5))
model.fit(X, [y_A, y_B], epochs=2)
y_hat_A, y_hat_B = model.predict(X)
print (y_hat_A.shape, y_hat_B.shape)
What you are looking for is Many to Multi head model where your predictions for A will be made by one head and another head will make predictions for B
The input data for the LSTM has to be 3D.
If you print the shapes of your DataFrames you get:
targets : (300, 2)
features : (300, 300)
The input data has to be reshaped into (samples, time steps, features). This means that targets and features must have the same shape.
You need to set a number of time steps for your problem, in other words, how many samples will be used to make a prediction.
For example, if you have 300 days and 2 features the time step can be 3. So that three days will be used to make one prediction (you can choose this arbitrarily). Here is the code for reshaping your data (with a few more changes):
import pandas as pd
import numpy as np
import tensorflow as tf
import random
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense
data = pd.DataFrame(index=pd.date_range(start='2019-01-01', periods=300, freq='D'))
data['A'] = [random.random() for _ in range(len(data))]
data['B'] = [random.random() for _ in range(len(data))]
# Choose the time_step size.
time_steps = 3
# Use numpy for the 3D array as it is easier to handle.
data = np.array(data)
def make_x_y(ts, data):
"""
Parameters
ts : int
data : numpy array
This function creates two arrays, x and y.
x is the input data and y is the target data.
"""
x, y = [], []
offset = 0
for i in data:
if offset < len(data)-ts:
x.append(data[offset:ts+offset])
y.append(data[ts+offset])
offset += 1
return np.array(x), np.array(y)
x, y = make_x_y(time_steps, data)
print(x.shape, y.shape)
nodes = 100 # This is the width of the network.
out_size = 2 # Number of outputs produced by the network. Same size as features.
model = Sequential()
model.add(LSTM(units=nodes, input_shape=(x.shape[1], x.shape[2])))
model.add(Dense(out_size)) # For the output a Dense (fully connected) layer is used.
model.compile(optimizer='adam', loss='mae')
model.fit(x, y, batch_size=10, epochs=10)
Well, just to finalize this issue I would like to provide one solution I have meanwhile worked on. The class TimeseriesGenerator in tf.keras.... enabled me quite easy to provide the data in the right shape to an LSTM model
from keras.preprocessing.sequence import TimeseriesGenerator
import numpy as np
window_size = 7
batch_size = 8
sampling_rate = 1
train_gen = TimeseriesGenerator(X_train.values, y_train.values,
length=window_size, sampling_rate=sampling_rate,
batch_size=batch_size)
valid_gen = TimeseriesGenerator(X_valid.values, y_valid.values,
length=window_size, sampling_rate=sampling_rate,
batch_size=batch_size)
test_gen = TimeseriesGenerator(X_test.values, y_test.values,
length=window_size, sampling_rate=sampling_rate,
batch_size=batch_size)
There are many other ways on implementing generators e.g. using the more_itertools which provides the function windowed, or making use of tensorflow.Dataset and its function window.
For me the TimeseriesGenerator was sufficient to feed the tests I did.
In case you would like to see an example modeling the DAX based on some stocks I'm sharing a notebook on Github.

Loss function in Keras does not match analogical function

I compared the results, obtained via model.evaluate(...) and the ones via numpy. As you can see, they differ a lot. The kernel has just been restarted. Cannot find where the problem is.
import numpy as np
import keras
from keras.layers import Dense
from keras.models import Sequential
import keras.backend as K
X = np.random.rand(10000)
Y = X + np.random.rand(10000) / 5
X_train, X_valid = X[:8000], X[8000:]
Y_train, Y_valid = Y[:8000], Y[8000:]
model = Sequential([
Dense(1, input_shape=(1,), activation='linear'),
])
model.compile('adam', 'mae')
model.fit(X_train, Y_train, epochs=1, batch_size=2000, validation_data=(X_valid, Y_valid))
print(model.evaluate(X_valid, Y_valid))
>>> 0.15643194556236267
preds = model.predict(X_valid)
np.abs(Y_valid - preds).mean()
>>> 0.34461398701699736
Versions: keras = '2.3.1', tensorflow = '2.1.0'.
It's because the model.predict output shape is not same with Y_valid. If you get the transpose of the predictions it will give you almost same loss.
>>> Y_valid.shape
(2000,)
>>> preds.shape
(2000, 1)
>>> np.abs(Y_valid - np.transpose(preds)).mean()
This is a tricky one, but actually simple to fix:
Your targets Y_valid have shape (2000,), i.e. just an array of 2000 numbers. The network outputs however, have shape (2000, 1). The expression Y_valid - preds then tries to subtract a shape (2000, 1) from a shape (2000,)... The two are not compatible, and need to be broadcast. Standard broadcasting rules will proceed as follows:
1. Align like
( 2000,)
(2000, 1)`
2. add extra dimension in front
(1, 2000,)
(2000, 1)
3. broadcast to make compatible
(2000, 2000)
(2000, 2000)
...and so you are actually subtracting two arrays of size (2000, 2000) from each other. You are basically computing the difference between each prediction and all targets instead of just the corresponding one. Obviously, the mean of this will be much larger.
tl; dr: model.evaluate is correct. The manual computation is incorrect due to funny broadcasting. You can fix it by reshaping the predictions to (2000,) (or the targets to (2000, 1):
preds = model.predict(X_valid)[:, 0]
np.abs(Y_valid - preds).mean()

Feed data into lstm using tflearn python

I know there were already some questions in this area, but I couldn't find the answer to my problem.
I have an LSTM (with tflearn) for a regression problem.
I get 3 types of errors, no matter what kind of modifications I do.
import pandas
import tflearn
import tensorflow as tf
from sklearn.cross_validation import train_test_split
csv = pandas.read_csv('something.csv', sep = ',')
X_train, X_test = train_test_split(csv.loc[:,['x1', 'x2',
'x3','x4','x5','x6',
'x7','x8','x9',
'x10']].as_matrix())
Y_train, Y_test = train_test_split(csv.loc[:,['y']].as_matrix())
#create LSTM
g = tflearn.input_data(shape=[None, 1, 10])
g = tflearn.lstm(g, 512, return_seq = True)
g = tflearn.dropout(g, 0.5)
g = tflearn.lstm(g, 512)
g = tflearn.dropout(g, 0.5)
g = tflearn.fully_connected(g, 1, activation='softmax')
g = tflearn.regression(g, optimizer='adam', loss = 'mean_square',
learning_rate=0.001)
model = tflearn.DNN(g)
model.fit(X_train, Y_train, validation_set = (Y_train, Y_test))
n_examples = Y_train.size
def mean_squared_error(y,y_):
return tf.reduce_sum(tf.pow(y_ - y, 2))/(2 * n_examples)
print()
print("\nTest prediction")
print(model.predict(X_test))
print(Y_test)
Y_pred = model.predict(X_test)
print('MSE Test: %.3f' % ( mean_squared_error(Y_test,Y_pred)) )
At the first run when starting new kernel i get
ValueError: Cannot feed value of shape (100, 10) for Tensor 'InputData/X:0', which has shape '(?, 1, 10)'
Then, at the second time
AssertionError: Input dim should be at least 3.
and it refers to the second LSTM layer. I tried to remove the second LSTM an Dropout layers, but then I get
feed_dict[net_inputs[i]] = x
IndexError: list index out of range
If you read this, have a nice day. I you answer it, thanks a lot!!!!
Ok, I solved it. I post it so maybe it helps somebody:
X_train = X_train.reshape([-1,1,10])
X_test = X_test.reshape([-1,1,10])

Categories

Resources