Input to the Neural Network using an array - python

I am writing a neural network to take the Mel frequency coefficients as inputs and then run the model. My dataset contains 100 samples - each sample is an array of 12 values corresponding to the coefficients. After splitting this data into train and test sets, I have created the X input corresponding to the array and the y input corresponding to the label.
Data array containing the coefficients
Here is a small sample of the elements in my X_train array:
['[107.59366 -14.153783 24.799461 -8.244417 20.95272\n -4.375943 12.77285 -0.92922235 3.9418116 7.3581047\n -0.30066165 5.441765 ]'
'[ 96.49664 2.0689797 21.557552 -32.827045 7.348135 -23.513977\n 7.9406714 -16.218931 10.594619 -21.4381 0.5903044 -10.569035 ]'
'[105.98041 -2.0483367 12.276348 -27.334534 6.8239 -23.019623\n 7.5176797 -21.884727 11.349695 -22.734652 3.0335162 -11.142375 ]'
'[ 7.73094559e+01 1.91073620e+00 6.72225571e+00 -2.74525508e-02\n 6.60858107e+00 5.99264860e-01 1.96265772e-01 -3.94772577e+00\n 7.46383286e+00 5.42239428e+00 1.21432066e-01 2.44894314e+00]']
When I create the neural network, I want to use the 12 coefficients as the input to the network. To do this, I need to use each row of my X_train dataset, which contains these arrays, as the input. However, when I try to pass the indexed array as an input, I get shape errors when trying to fit the model. My model is as follows:
def build_model_graph():
    model = Sequential()
    model.add(Input(shape=(12,)))
    model.add(Dense(12))
    model.add(Activation('relu'))
    model.add(Dense(10))
    model.add(Activation('relu'))
    model.add(Dense(num_labels))
    model.add(Activation('softmax'))
    # Compile the model
    model.compile(loss='categorical_crossentropy', metrics=['accuracy'], optimizer='adam')
    return model
Here, I want to use every row of the X_train array as an input, which would correspond to the shape (12,). When I use something like this:
num_epochs = 50
num_batch_size = 32
model.fit(x_train, y_train, batch_size=num_batch_size, epochs=num_epochs,
          validation_data=(x_test, y_test), verbose=1)
I get an error for the shape which makes sense to me.
For reference, the error is as follows:
ValueError: Exception encountered when calling layer "sequential_20" (type Sequential).
Input 0 of layer "dense_54" is incompatible with the layer: expected min_ndim=2, found ndim=1. Full shape received: (None,)
But I am not exactly sure how I can extract the array of 12 coefficients present at each index of the X_train and then use it in the model input. Indexing the x_train and y_train did not work either. If anyone could point me in a relevant direction, it would be extremely helpful. Thanks!
Edit: My code for the dataframe is as follows:
clapdf = pd.read_csv("clapsdf.csv")
clapdf.drop('Unnamed: 0', inplace=True, axis=1)
clapdf.head()
nonclapdf = pd.read_csv("nonclapsdf.csv")
nonclapdf.drop('Unnamed: 0', inplace=True, axis=1)
sound_df = clapdf.append(nonclapdf)
sound_df.head()
d = sound_data.tolist()
df = pd.DataFrame(data=d)
data = df[0].to_numpy()
print("Before-->", data.shape)
dat = np.array([np.array(d) for d in data])
print('After-->', dat.shape)
Here, the shape stays the same because the values of each of the 80 samples are not stored in a comma-separated format but as a single series-like object.

If your data looks like this:
samples = 2
features = 12
x_train = tf.random.normal((samples, 1, features))
tf.Tensor(
[[[-2.5988803 -0.629626 -0.8306641 -0.78226614 0.88989156
-0.3851106 -0.66053045 1.0571191 -0.59061646 -1.1602987
0.69124466 -0.04354193]]
[[-0.86917496 2.2923143 -0.05498986 -0.09578358 0.85037625
-0.54679644 -1.2213608 -1.3766612 0.35416105 -0.57801914
-0.3699728 0.7884727 ]]], shape=(2, 1, 12), dtype=float32)
You will have to reshape it to (2, 12) in order to fit your model with the input shape (batch_size, 12):
import tensorflow as tf
def build_model_graph():
    model = tf.keras.Sequential()
    model.add(tf.keras.layers.Input(shape=(12,)))
    model.add(tf.keras.layers.Dense(12))
    model.add(tf.keras.layers.Activation('relu'))
    model.add(tf.keras.layers.Dense(10))
    model.add(tf.keras.layers.Activation('relu'))
    model.add(tf.keras.layers.Dense(2))
    model.add(tf.keras.layers.Activation('softmax'))
    # Compile the model
    model.compile(loss='categorical_crossentropy', metrics=['accuracy'], optimizer='adam')
    return model
model = build_model_graph()
samples = 2
features = 12
x_train = tf.random.normal((samples, 1, features))
x_train = tf.reshape(x_train, (samples, features))
y = tf.random.uniform((samples, 1), maxval=2, dtype=tf.int32)
y_train = tf.keras.utils.to_categorical(y, 2)
model.fit(x_train, y_train, batch_size=1, epochs=2)
Also, you usually need to convert your labels to one-hot encoded vectors if you plan to use categorical_crossentropy.
y_train looks like this:
[[0. 1.]
[1. 0.]]
Update 1:
If your data is coming from a dataframe, try something like this:
import numpy as np
import pandas as pd
d = {'features': [[0.18525402, 0.92130125, 0.2296906, 0.75818471, 0.69813222, 0.47147329,
                   0.03560711, 0.06583931, 0.90921289, 0.76002148, 0.50413995, 0.36099004],
                  [0.18525402, 0.92130125, 0.2296906, 0.75818471, 0.69813222, 0.47147329,
                   0.03560711, 0.06583931, 0.90921289, 0.76002148, 0.50413995, 0.36099004]]}
df = pd.DataFrame(data=d)
data = df['features'].to_numpy()
print('Before -->', data.shape)
data = np.array([np.array(d) for d in data])
print('After -->', data.shape)
Before --> (2,)
After --> (2, 12)
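Note that in the sample posted in the question, each element of X_train is itself a string (the whole bracketed array, newlines included), so NumPy only sees a 1-D array of objects. Below is a minimal sketch of how such rows could be parsed back into a numeric (n, 12) array; it assumes every row really is a bracketed, whitespace-separated string of 12 numbers, and parse_row is a hypothetical helper:
import numpy as np

def parse_row(row):
    # row looks like '[107.59366 -14.153783 ... 5.441765 ]'
    return np.array(row.strip('[]').split(), dtype=np.float32)

x_train = np.stack([parse_row(r) for r in data])  # data: 1-D array of strings
print(x_train.shape)  # (n, 12)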

Related

Sequential LSTM Model - The dimensions of y_predicted are different from y_trained

I am running an LSTM model on simple stock market data.
When training, the y_train values are a simple float64 array of size (985,). But upon using lstm.predict(X_test), the predicted values are a float32 array of size (246, 2, 1).
Basically, it is giving me two predictions per X_test input value. Ideally I would expect the output to be an array of shape (246,).
Please help, here is the code:
def lstm_split(data, n_steps):
    X, y = [], []
    for i in range(len(stock_data)-n_steps+1):
        X.append(data[i:i+n_steps, :-1])
        y.append(data[i+n_steps-1, -1])
    return np.array(X), np.array(y)
stock_data_ft = X_ft
X1,y1 = lstm_split(stock_data_ft.values,n_steps=2)
train_split=0.8
split_idx = int(np.ceil(len(X1)*train_split))
date_index = stock_data_ft.index
X_train, X_test = X1[:split_idx] , X1[split_idx:]
y_train,y_test = y1[:split_idx] , y1[split_idx:]
X_train_date, X_test_date = date_index[:split_idx], date_index[split_idx:]
print(X1.shape , X_train.shape, X_test.shape, y_test.shape)
print(X_train)
lstm = Sequential()
lstm.add(LSTM(32,input_shape=(X_train.shape[1],X_train.shape[2]),activation='relu',return_sequences=True))
lstm.add(Dense(1))
lstm.compile(loss='mean_squared_error',optimizer='adam')
lstm.summary()
history = lstm.fit(X_train,y_train,epochs=100,batch_size=4,verbose=2,shuffle=False)
y_pred = lstm.predict(X_test)
I tried to get predicted values from the model.
y_pred = lstm.predict(X_test)
I was expecting an output array of shape (246,) but instead got a float32 array of shape (246, 2, 1).
Some additional clarifications:
X_train.shape[1] is 2 and X_train.shape[2] is 3; these indicate the dimensions of the input features.
Basically, the X values in the training data form an array of dimension (985, 2, 3).
Some samples below:
[[ 1.53055021, 1.52204214, 1.53825887], [ 1.5526797 , 1.56142366, 1.56073994]],
[[ 1.5526797 , 1.56142366, 1.56073994], [ 1.58880785, 1.59418392, 1.6166433 ]]]
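For what it's worth, the (246, 2, 1) shape is exactly what return_sequences=True produces: the LSTM emits one output per time step, so the Dense(1) layer is applied to each of the 2 steps. A minimal sketch of two possible fixes, assuming the question's X_train/X_test and the rest of the pipeline stay unchanged:
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense

# Fix 1: only return the last time step from the LSTM, so predict gives (246, 1)
lstm = Sequential()
lstm.add(LSTM(32, input_shape=(X_train.shape[1], X_train.shape[2]),
              activation='relu', return_sequences=False))
lstm.add(Dense(1))
lstm.compile(loss='mean_squared_error', optimizer='adam')

# Fix 2: keep the original model and take the last step of each prediction
# y_pred = lstm.predict(X_test)[:, -1, 0]   # shape (246,)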

Keras Normalization for a 2d input array

I am new to machine learning and trying to apply it to my problem.
I have a training dataset with 44,000 rows of features, each with shape (6, 25). I want to build a sequential model. I was wondering if there is a way to use the features without flattening them. Currently, I flatten the features to a 1d array and normalize them for training (see the code below). I could not find a way to normalize 2d features.
dataset2d = dataset2d.reshape(dataset2d.shape[0],
                              dataset2d.shape[1]*dataset2d.shape[2])
normalizer = preprocessing.Normalization()
normalizer.adapt(dataset2d)
print(normalizer.mean.numpy())
x_train, x_test, y_train, y_test = train_test_split(dataset2d, flux_val,
                                                    test_size=0.2)
# %% DNN regression multiple parameter
def build_and_compile_model(norm):
    inputs = Input(shape=(x_test.shape[1],))
    x = norm(inputs)
    x = layers.Dense(128, activation="selu")(x)
    x = layers.Dense(64, activation="relu")(x)
    x = layers.Dense(32, activation="relu")(x)
    x = layers.Dense(1, activation="linear")(x)
    model = Model(inputs, x)
    model.compile(loss='mean_squared_error',
                  optimizer=keras.optimizers.Adam(learning_rate=1e-3))
    return model
dnn_model = build_and_compile_model(normalizer)
dnn_model.summary()
# interrupt training when the model is no longer improving
path_checkpoint = "model_checkpoint.h5"
modelckpt_callback = keras.callbacks.ModelCheckpoint(monitor="val_loss",
                                                     filepath=path_checkpoint,
                                                     verbose=1,
                                                     save_weights_only=True,
                                                     save_best_only=True)
es_callback = keras.callbacks.EarlyStopping(monitor="val_loss",
                                            min_delta=0, patience=10)
history = dnn_model.fit(x_train, y_train, validation_split=0.2,
                        epochs=120, callbacks=[es_callback, modelckpt_callback])
I also tried to modify my model input layer to the following, such that I do not need to reshape my input
inputs = Input(shape=(x_test.shape[-1], x_test.shape[-2], ))
and modify the normalization to the following
normalizer = preprocessing.Normalization(axis=1)
normalizer.adapt(dataset2d)
print(normalizer.mean.numpy())
But this does not seem to help. The normalization adapts to a 1d array of length 6, while I want it to adapt to a 2d array of shape 25, 6.
Sorry for the long question. Your help will be much appreciated.
I'm not sure if I understood your issue. The normalizer layer can take an N-D tensor and produces an output with the same shape, for example:
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
import numpy as np
t = tf.constant(np.arange(2*3*4).reshape(2,3,4) , dtype=tf.float32)
tf.print("\n",t)
normalizer_layer = tf.keras.layers.LayerNormalization(axis=1)
output = normalizer_layer(t)
tf.print("\n",output)
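Note that the example above uses LayerNormalization; if the goal is the preprocessing Normalization layer from the question, it can also adapt to multi-dimensional inputs once you tell it which axes carry features. A minimal sketch, assuming features of shape (samples, 6, 25) and that per-position statistics are wanted (depending on the TF version the layer lives at tf.keras.layers.Normalization or tf.keras.layers.experimental.preprocessing.Normalization):
import numpy as np
import tensorflow as tf

dataset2d = np.random.rand(1000, 6, 25).astype("float32")  # placeholder data

# axis=(1, 2) keeps a separate mean/variance for each of the 6*25 positions,
# so the layer adapts to a (6, 25) grid instead of a flat vector of length 6.
normalizer = tf.keras.layers.Normalization(axis=(1, 2))
normalizer.adapt(dataset2d)
print(normalizer.mean.shape)  # (1, 6, 25)

inputs = tf.keras.Input(shape=(6, 25))
x = normalizer(inputs)  # same (6, 25) shape, normalized per position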

LSTM: Input 0 of layer sequential is incompatible with the layer

I know there are several questions about this here, but I haven't found one which fits exactly my problem.
I'm trying to fit an LSTM with data from Pandas DataFrames but am getting confused about the format in which I have to provide them.
I created a small code snippet which shows what I am trying to do:
import pandas as pd, tensorflow as tf, random
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense
targets = pd.DataFrame(index=pd.date_range(start='2019-01-01', periods=300, freq='D'))
targets['A'] = [random.random() for _ in range(len(targets))]
targets['B'] = [random.random() for _ in range(len(targets))]
features = pd.DataFrame(index=targets.index)
for i in range(len(features)):
    features[str(i)] = [random.random() for _ in range(len(features))]
model = Sequential()
model.add(LSTM(units=targets.shape[1], input_shape=features.shape))
model.compile(optimizer='adam', loss='mae')
model.fit(features, targets, batch_size=10, epochs=10)
This results in:
ValueError: Input 0 of layer sequential is incompatible with the layer: expected ndim=3, found ndim=2. Full shape received: [10, 300]
which I expect relates to the dimensions of the features DataFrame provided. I guess that once I fix this, the next error will mention the targets DataFrame.
As far as I understand, the 'units' parameter of my first layer defines the output dimensionality of this model. The inputs have to have a 3D shape, but I don't know how to create them out of the 2D world of the DataFrames.
I hope you can help me understand the reshape mechanism in Python and how to use it in combination with Pandas DataFrames. (I'm quite new to Python and came from R.)
Thanks in advance
Let's look at a few of the popular ways in which LSTMs are used.
Many to Many
Example: You have a sentence (composed of words in sequence). Given this sequence of words, you would like to predict the part of speech (POS) of each word.
So you have n words and you feed one word per timestep to the LSTM. Each LSTM timestep (also called LSTM unrolling) will produce an output. Each word is represented by a set of features, normally word embeddings. So the input to the LSTM is of size batch_size x time_steps x features.
Keras code:
import numpy as np
from tensorflow import keras

inputs = keras.Input(shape=(10, 3))
lstm = keras.layers.LSTM(8, input_shape=(10, 3), return_sequences=True)(inputs)
outputs = keras.layers.TimeDistributed(keras.layers.Dense(5, activation='softmax'))(lstm)
model = keras.Model(inputs=inputs, outputs=outputs)
model.compile(loss='categorical_crossentropy', optimizer='adam')
X = np.random.randn(4, 10, 3)
y = np.random.randint(0, 2, size=(4, 10, 5))
model.fit(X, y, epochs=2)
print(model.predict(X).shape)
Many to One
Example: Again you have a sentence (composed of words in sequence). Given this sequence of words, you would like to predict the sentiment of the sentence: whether it is positive or negative.
Keras code
inputs = keras.Input(shape=(10,3))
lstm = keras.layers.LSTM(8, input_shape = (10, 3), return_sequences = False)(inputs)
outputs =keras.layers.Dense(5, activation='softmax')(lstm)
model = keras.Model(inputs=inputs, outputs=outputs)
model.compile(loss='categorical_crossentropy', optimizer='adam')
X = np.random.randn(4,10,3)
y = np.random.randint(0,2, size=(4,5))
model.fit(X, y, epochs=2)
print (model.predict(X).shape)
Many to multi-headed
Example: You have a sentence (composed of words in sequence). Given this sequence of words, you would like to predict the sentiment of the sentence as well as the author of the sentence.
This is multi-headed model where one head will predict the sentiment and another head will predict the author. Both the heads share the same LSTM backbone.
Keras code
inputs = keras.Input(shape=(10,3))
lstm = keras.layers.LSTM(8, input_shape = (10, 3), return_sequences = False)(inputs)
output_A = keras.layers.Dense(5, activation='softmax')(lstm)
output_B = keras.layers.Dense(5, activation='softmax')(lstm)
model = keras.Model(inputs=inputs, outputs=[output_A, output_B])
model.compile(loss='categorical_crossentropy', optimizer='adam')
X = np.random.randn(4,10,3)
y_A = np.random.randint(0,2, size=(4,5))
y_B = np.random.randint(0,2, size=(4,5))
model.fit(X, [y_A, y_B], epochs=2)
y_hat_A, y_hat_B = model.predict(X)
print (y_hat_A.shape, y_hat_B.shape)
What you are looking for is the many-to-multi-head model, where the predictions for A will be made by one head and another head will make the predictions for B.
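Applied to the setup in the question, where every day has one row of features and two targets A and B, the two-headed variant could look roughly like the sketch below. It assumes the features have already been windowed into shape (samples, time_steps, n_features), for example with the reshaping shown in the next answer; time_steps and n_features are placeholders:
import numpy as np
from tensorflow import keras

time_steps, n_features = 3, 300   # placeholders; the question has 300 feature columns

inputs = keras.Input(shape=(time_steps, n_features))
backbone = keras.layers.LSTM(8)(inputs)            # shared many-to-one backbone
out_A = keras.layers.Dense(1, name='A')(backbone)  # head predicting target A
out_B = keras.layers.Dense(1, name='B')(backbone)  # head predicting target B
model = keras.Model(inputs=inputs, outputs=[out_A, out_B])
model.compile(optimizer='adam', loss='mae')

X = np.random.randn(20, time_steps, n_features)
model.fit(X, [np.random.rand(20, 1), np.random.rand(20, 1)], epochs=1)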
The input data for the LSTM has to be 3D.
If you print the shapes of your DataFrames you get:
targets : (300, 2)
features : (300, 300)
The input data has to be reshaped into (samples, time steps, features). This means that targets and features must have the same shape.
You need to set a number of time steps for your problem, in other words, how many samples will be used to make a prediction.
For example, if you have 300 days and 2 features the time step can be 3. So that three days will be used to make one prediction (you can choose this arbitrarily). Here is the code for reshaping your data (with a few more changes):
import pandas as pd
import numpy as np
import tensorflow as tf
import random
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense
data = pd.DataFrame(index=pd.date_range(start='2019-01-01', periods=300, freq='D'))
data['A'] = [random.random() for _ in range(len(data))]
data['B'] = [random.random() for _ in range(len(data))]
# Choose the time_step size.
time_steps = 3
# Use numpy for the 3D array as it is easier to handle.
data = np.array(data)
def make_x_y(ts, data):
    """
    Parameters
    ts : int
    data : numpy array
    This function creates two arrays, x and y.
    x is the input data and y is the target data.
    """
    x, y = [], []
    offset = 0
    for i in data:
        if offset < len(data)-ts:
            x.append(data[offset:ts+offset])
            y.append(data[ts+offset])
        offset += 1
    return np.array(x), np.array(y)
x, y = make_x_y(time_steps, data)
print(x.shape, y.shape)
nodes = 100 # This is the width of the network.
out_size = 2 # Number of outputs produced by the network. Same size as features.
model = Sequential()
model.add(LSTM(units=nodes, input_shape=(x.shape[1], x.shape[2])))
model.add(Dense(out_size)) # For the output a Dense (fully connected) layer is used.
model.compile(optimizer='adam', loss='mae')
model.fit(x, y, batch_size=10, epochs=10)
Well, just to finalize this issue, I would like to share the solution I have worked on in the meantime. The TimeseriesGenerator class from keras.preprocessing.sequence made it quite easy to provide the data in the right shape to an LSTM model:
from keras.preprocessing.sequence import TimeseriesGenerator
import numpy as np
window_size = 7
batch_size = 8
sampling_rate = 1
train_gen = TimeseriesGenerator(X_train.values, y_train.values,
                                length=window_size, sampling_rate=sampling_rate,
                                batch_size=batch_size)
valid_gen = TimeseriesGenerator(X_valid.values, y_valid.values,
                                length=window_size, sampling_rate=sampling_rate,
                                batch_size=batch_size)
test_gen = TimeseriesGenerator(X_test.values, y_test.values,
                               length=window_size, sampling_rate=sampling_rate,
                               batch_size=batch_size)
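The generators can then be handed straight to training; a minimal sketch, assuming a compiled LSTM model named model (recent Keras versions accept the generators in fit, older versions use fit_generator):
history = model.fit(train_gen,
                    validation_data=valid_gen,
                    epochs=20)
loss = model.evaluate(test_gen)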
There are many other ways of implementing such generators, e.g. using more_itertools, which provides the function windowed, or making use of tf.data.Dataset and its window method.
For me the TimeseriesGenerator was sufficient to feed the tests I did.
In case you would like to see an example modeling the DAX based on some stocks I'm sharing a notebook on Github.

Keras sequential model with multiple inputs, Tensorflow 1.9.0

I am trying to create a neural network that has two inputs of a particular size (here four) each and one output of the same size (so also four). Unfortunately, I always get this error when running my code:
ValueError: Error when checking model input: the list of Numpy arrays that you are passing to your model is not
the size the model expected. Expected to see 2 array(s), but instead got the following list of 1 arrays:
[array([[[-1.07920336, 1.16782929, 1.40131554, -0.30052492],
[-0.50067655, 0.54517916, -0.87033621, -0.22922157]],
[[-0.53766128, -0.03527806, -0.14637072, 2.32319071],
[ 0...
I think the problem lies in the fact that once I pass the data for training, the input shape is either incorrect or I have a datatype issue; hence there is an extra list bracket around the array.
I'm using Tensorflow 1.9.0 (due to project restrictions). I already checked the search function and tried solutions provided here. Following is example code that reproduces my error:
import numpy as np
import tensorflow as tf
from tensorflow import keras
import keras.backend as K
from tensorflow.keras import layers, models

def main():
    ip1 = keras.layers.Input(shape=(4,))
    ip2 = keras.layers.Input(shape=(4,))
    dense = layers.Dense(3, activation='sigmoid', input_dim=4)  # Passing the value in a weighted manner
    merge_layer = layers.Concatenate()([ip1, ip2])  # Concatenating the outputs of the first network
    y = layers.Dense(6, activation='sigmoid')(merge_layer)  # Three fully connected layers
    y = layers.Dense(4, activation='sigmoid')(y)
    model = keras.Model(inputs=[ip1, ip2], outputs=y)
    model.compile(optimizer='adam',
                  loss='mean_squared_error')
    model.summary()
    # dataset shape: 800 samples, 2 inputs for sequential model, 4 input size
    X_train = np.random.randn(800, 2, 4)
    y_train = np.random.randn(800, 4)
    X_test = np.random.randn(200, 2, 4)
    y_test = np.random.randn(200, 4)
    history = model.fit(X_train, y_train, validation_data=(X_test, y_test), epochs=1000, batch_size=32)

if __name__ == '__main__':
    main()
When there are multiple inputs, Keras expects a list of arrays; the length of the list corresponds to the number of inputs your model has.
So basically you need to pass a list of 2 arrays, each with shape (samples, 4):
X_train1 = np.random.randn(800, 4)
X_train2 = np.random.randn(800, 4)
y_train = np.random.randn(800, 4)
X_test1 = np.random.randn(200, 4)
X_test2 = np.random.randn(200, 4)
y_test = np.random.randn(200, 4)
history = model.fit([X_train1, X_train2], y_train,
                    validation_data=([X_test1, X_test2], y_test),
                    epochs=1000, batch_size=32)
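Alternatively, if you prefer to keep X_train in its original (800, 2, 4) shape, you can slice it into the two expected inputs yourself; a small sketch under that assumption:
# axis 1 of X_train holds the two separate inputs, each of shape (800, 4)
history = model.fit([X_train[:, 0, :], X_train[:, 1, :]], y_train,
                    validation_data=([X_test[:, 0, :], X_test[:, 1, :]], y_test),
                    epochs=1000, batch_size=32)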

Number of test cases predicted are less than the actual test data in LSTM

I have been trying to predict the number of orders based on time series data using LSTM in Keras.
I have divided the sample data so that the training set contains 282 records and the test set contains 82 records. I am using a look-back window of 30 to produce the forecast for the test data.
But for some reason the predicted dataset contains only 40 records as opposed to the expected 71 records in the test data. What can be the reason behind it? I feel the look-back window is causing the issue, but how can I rectify it?
It is important for me to keep such a large look-back window.
def create_LSTM(trainX, testX, trainY, testY, look_back):
    model = Sequential()
    model.add(LSTM(6, input_shape=(1, look_back), activation='relu'))
    model.add(Dense(1))
    model.compile(loss='mean_squared_error', optimizer='RMSProp')
    model.fit(trainX, trainY, epochs=300, batch_size=4, verbose=1)
    trainpredict = model.predict(trainX, batch_size=4)
    testpredict = model.predict(testX, batch_size=4)
    testpredict = np.array(testpredict).reshape(len(testpredict), 1)
    print(testpredict)
    print(len(testpredict))
    return trainpredict, testpredict
I am using the following function to create the data for the LSTM, which is causing the actual issue. How can I rectify it?
def create_dataset(dataset, look_back=1):
    dataX, dataY = [], []
    for i in range(len(dataset)-look_back-1):
    #for i in range(len(dataset)-look_back):
        a = dataset[i:(i+look_back), 0]
        dataX.append(a)
        dataY.append(dataset[i + look_back, 0])
    return np.array(dataX), np.array(dataY)
The Problem with create_dataset
When you index a single position along an axis of an ndarray, you lose the rank associated with that axis. This happens because, if you are only interested in that single element, there is no need to retain a dimension of size 1:
import numpy as np

x = np.random.randn(4, 4)
print(x, x.shape)
array([[ 1.37938213, -0.10407424, -0.356567 , -1.5032779 ],
[-0.53166922, 0.98204605, -0.62052479, 0.99265612],
[ 0.23046477, -0.17742399, 0.38283412, 0.24104468],
[-0.78093724, 1.06833765, -1.22112772, -0.78429717]])
(4, 4)
print(x[0:3, 0], x[0:3, 0].shape)
array([ 1.37938213, -0.53166922, 0.23046477])
(3,)
So when you write a = dataset[i:(i + look_back), 0], you are taking a dataset of shape (samples, features) and getting a chunk of shape (look_back,). After adding all a's to dataX, it becomes an ndarray of shape (samples, look_back) = (len(dataset) - look_back - 1, look_back). However, the LSTM is expecting the shape (samples, look_back, features), which in your case is (samples, look_back, 1).
If you change it to a = dataset[i:(i + look_back)], then things will start to work. A better solution, however, is to use TimeseriesGenerator:
from keras.preprocessing.sequence import TimeseriesGenerator
batch_size = 4
look_back = 1
features = 1
d = np.random.randn(364, features)
train = TimeseriesGenerator(d, d,
                            length=look_back,
                            batch_size=batch_size,
                            end_index=282)
test = TimeseriesGenerator(d, d,
                           length=look_back,
                           batch_size=batch_size,
                           start_index=282)
model = Sequential()
model.add(LSTM(6, input_shape=[look_back, features], activation='relu'))
model.add(Dense(1))
model.compile(loss='mse', optimizer='rmsprop')
model.fit_generator(train, epochs=1, verbose=1)
p_train = model.predict_generator(train)
p_test = model.predict_generator(test)
Further Comments on Other Sections
model.add(LSTM(6, input_shape=(1, look_back), activation='relu')) - the input shape should conform to (length, features). In this case, where length == features, things would work out, but you do need to change this to input_shape=(look_back, 1) if you want a larger look_back.
testpredict = np.array(testpredict).reshape(len(testpredict), 1) - this is unnecessary. Model#predict already outputs an ndarray if you have a single output, and its shape is already (samples, output_units) = (len(testX), 1).
LSTM(activation='relu') usually leads to instability when dealing with very long sequences. It's usually a good idea to leave it as the default tanh.
