I know there were already some questions in this area, but I couldn't find the answer to my problem.
I have an LSTM (with tflearn) for a regression problem.
I get 3 types of errors, no matter what kind of modifications I do.
import pandas
import tflearn
import tensorflow as tf
from sklearn.cross_validation import train_test_split
csv = pandas.read_csv('something.csv', sep = ',')
X_train, X_test = train_test_split(csv.loc[:,['x1', 'x2',
'x3','x4','x5','x6',
'x7','x8','x9',
'x10']].as_matrix())
Y_train, Y_test = train_test_split(csv.loc[:,['y']].as_matrix())
#create LSTM
g = tflearn.input_data(shape=[None, 1, 10])
g = tflearn.lstm(g, 512, return_seq = True)
g = tflearn.dropout(g, 0.5)
g = tflearn.lstm(g, 512)
g = tflearn.dropout(g, 0.5)
g = tflearn.fully_connected(g, 1, activation='softmax')
g = tflearn.regression(g, optimizer='adam', loss = 'mean_square',
learning_rate=0.001)
model = tflearn.DNN(g)
model.fit(X_train, Y_train, validation_set = (Y_train, Y_test))
n_examples = Y_train.size
def mean_squared_error(y,y_):
return tf.reduce_sum(tf.pow(y_ - y, 2))/(2 * n_examples)
print()
print("\nTest prediction")
print(model.predict(X_test))
print(Y_test)
Y_pred = model.predict(X_test)
print('MSE Test: %.3f' % ( mean_squared_error(Y_test,Y_pred)) )
At the first run when starting new kernel i get
ValueError: Cannot feed value of shape (100, 10) for Tensor 'InputData/X:0', which has shape '(?, 1, 10)'
Then, at the second time
AssertionError: Input dim should be at least 3.
and it refers to the second LSTM layer. I tried to remove the second LSTM an Dropout layers, but then I get
feed_dict[net_inputs[i]] = x
IndexError: list index out of range
If you read this, have a nice day. I you answer it, thanks a lot!!!!
Ok, I solved it. I post it so maybe it helps somebody:
X_train = X_train.reshape([-1,1,10])
X_test = X_test.reshape([-1,1,10])
Related
I am attempting to do a sandbox project using the wine reviews dataset and want to combine both text data as well as some engineered numeric features into the neural network, but I am receiving a value error.
The three sets of features I have are the description (the actual reviews), scaled price, and scaled number of words (length of description). The y target variable I converted into a dichotomous variable representing good or bad reviews turning it into a classification problem.
Whether or not these are the best features to use is not the point, but I am hoping to try to combine NLP with meta or numeric data. When I run the code with just the description it works fine, but adding the additional variables is causing a value error.
y = df['y']
X = df.drop('y', axis=1)
# split up the data
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33)
X_train.head();
description_train = X_train['description']
description_test = X_test['description']
#subsetting the numeric variables
numeric_train = X_train[['scaled_price','scaled_num_words']].to_numpy()
numeric_test = X_test[['scaled_price','scaled_num_words']].to_numpy()
MAX_VOCAB_SIZE = 60000
tokenizer = Tokenizer(num_words=MAX_VOCAB_SIZE)
tokenizer.fit_on_texts(description_train)
sequences_train = tokenizer.texts_to_sequences(description_train)
sequences_test = tokenizer.texts_to_sequences(description_test)
word2idx = tokenizer.word_index
V = len(word2idx)
print('Found %s unique tokens.' % V)
Found 31598 unique tokens.
nlp_train = pad_sequences(sequences_train)
print('Shape of data train tensor:', nlp_train.shape)
Shape of data train tensor: (91944, 136)
# get sequence length
T = nlp_train.shape[1]
nlp_test = pad_sequences(sequences_test, maxlen=T)
print('Shape of data test tensor:', nlp_test.shape)
Shape of data test tensor: (45286, 136)
data_train = np.concatenate((nlp_train,numeric_train), axis=1)
data_test = np.concatenate((nlp_test,numeric_test), axis=1)
# Choosing embedding dimensionality
D = 20
# Hidden state dimensionality
M = 40
nlp_input = Input(shape=(T,),name= 'nlp_input')
meta_input = Input(shape=(2,), name='meta_input')
emb = Embedding(V + 1, D)(nlp_input)
emb = Bidirectional(LSTM(64, return_sequences=True))(emb)
emb = Dropout(0.40)(emb)
emb = Bidirectional(LSTM(128))(emb)
nlp_out = Dropout(0.40)(emb)
x = tf.concat([nlp_out, meta_input], 1)
x = Dense(64, activation='swish')(x)
x = Dropout(0.40)(x)
x = Dense(1, activation='sigmoid')(x)
model = Model(inputs=[nlp_input, meta_input], outputs=[x])
#next, create a custom optimizer
optimizer1 = RMSprop(learning_rate=0.0001)
# Compile and fit
model.compile(
loss='binary_crossentropy',
optimizer='adam',
metrics=['accuracy']
)
print('Training model...')
r = model.fit(
data_train,
y_train,
epochs=5,
validation_data=(data_test, y_test))
I apologize if that was over-kill but wanted to make sure I didn't leave out any relevant clues or information that would potentially be helpful. The error I get from running the code is
ValueError: Layer model expects 2 input(s), but it received 1 input tensors. Inputs received: [<tf.Tensor 'IteratorGetNext:0' shape=(None, 138) dtype=float32>]
How do I resolve that error?
Thank you for posting all your code. These two lines are the problem:
data_train = np.concatenate((nlp_train,numeric_train), axis=1)
data_test = np.concatenate((nlp_test,numeric_test), axis=1)
A numpy array is interpreted as one input regardless of its shape.
Either use tf.data.Dataset and feed your dataset directly to your model:
train_dataset = tf.data.Dataset.from_tensor_slices((nlp_train, numeric_train))
labels = tf.data.Dataset.from_tensor_slices(y_train)
dataset = tf.data.Dataset.zip((train_dataset, train_dataset))
r = model.fit(dataset, epochs=5)
Or just feed your data directly to model.fit() as a list of inputs:
r = model.fit(
[nlp_train, numeric_train],
y_train,
epochs=5,
validation_data=([nlp_test, numeric_test], y_test))
I have 3 inputs and 3 outputs. I am trying to use KerasRegressor and cross_val_score to get my prediction score.
my code is:
# Function to create model, required for KerasClassifier
def create_model():
# create model
# #Start defining the input tensor:
input_data = layers.Input(shape=(3,))
#create the layers and pass them the input tensor to get the output tensor:
layer = [2,2]
hidden1Out = Dense(units=layer[0], activation='relu')(input_data)
finalOut = Dense(units=layer[1], activation='relu')(hidden1Out)
u_out = Dense(1, activation='linear', name='u')(finalOut)
v_out = Dense(1, activation='linear', name='v')(finalOut)
p_out = Dense(1, activation='linear', name='p')(finalOut)
#define the model's start and end points
model = Model(input_data,outputs = [u_out, v_out, p_out])
model.compile(loss='mean_squared_error', optimizer='adam')
return model
#load data
...
input_var = np.vstack((AOA, x, y)).T
output_var = np.vstack((u,v,p)).T
# evaluate model
estimator = KerasRegressor(build_fn=create_model, epochs=num_epochs, batch_size=batch_size, verbose=0)
kfold = KFold(n_splits=10)
I tried:
results = cross_val_score(estimator, input_var, [output_var[:,0], output_var[:,1], output_var[:,2]], cv=kfold)
and
results = cross_val_score(estimator, input_var, [output_var[:,0:1], output_var[:,1:2], output_var[:,2:3]], cv=kfold)
and
results = cross_val_score(estimator, input_var, output_var, cv=kfold)
I got the error msg like:
Details:
ValueError: Error when checking model target: the list of Numpy arrays that you are passing to your model is not the size the model expected. Expected to see 3 array(s), but instead got the following list of 1 arrays: [array([[ 0.69945297, 0.13296847, 0.06292328],
or
ValueError: Found input variables with inconsistent numbers of samples: [72963, 3]
So how do I solve this problem?
Thanks.
The problem is the input dimension of the layer Input is not 3, but 3*feature_dim. Below is an working example
import numpy as np
import tensorflow as tf
from tensorflow.keras.models import Model
from tensorflow.keras.layers import Input,Dense,Concatenate
from sklearn.model_selection import cross_val_score,KFold
from tensorflow.keras.wrappers.scikit_learn import KerasRegressor
def create_model():
feature_dim = 10
input_data = Input(shape=(3*feature_dim,))
#create the layers and pass them the input tensor to get the output tensor:
layer = [2,2]
hidden1Out = Dense(units=layer[0], activation='relu')(input_data)
finalOut = Dense(units=layer[1], activation='relu')(hidden1Out)
u_out = Dense(1, activation='linear', name='u')(finalOut)
v_out = Dense(1, activation='linear', name='v')(finalOut)
p_out = Dense(1, activation='linear', name='p')(finalOut)
output = Concatenate()([u_out,v_out,p_out])
#define the model's start and end points
model = Model(inputs=input_data,outputs=output)
model.compile(loss='mean_squared_error', optimizer='adam')
return model
x_0 = np.random.rand(100,10)
x_1 = np.random.rand(100,10)
x_2 = np.random.rand(100,10)
input_val = np.hstack([x_0,x_1,x_2])
u = np.random.rand(100,1)
v = np.random.rand(100,1)
p = np.random.rand(100,1)
output_val = np.hstack([u,v,p])
estimator = KerasRegressor(build_fn=create_model,nb_epoch=3,batch_size=8,verbose=False)
kfold = KFold(n_splits=3, random_state=0)
results = cross_val_score(estimator=estimator,X=input_val,y=output_val,cv=kfold)
print("Results: %.2f (%.2f) MSE" % (results.mean(), results.std()))
As you can see, since the input dimension is 10, inside create_model, I specify the feature_dim.
I don't know have your data look like, but I think it how to stack them together.
I have tried to tried the following procedure
input_var = np.random.randint(0,1, size=(100,3))
x = np.sum(np.sin(input_var),axis=1,keepdims=True) # (100,1)
y = np.sum(np.cos(input_var),axis=1,keepdims=True) # (100,1)
z = np.sum(np.sin(input_var)+ np.cos(input_var),axis=1, keepdims=True) # (100,1)
output_var = np.hstack((x,y,z))
# evaluate model
estimator = KerasRegressor(build_fn=create_model, epochs=10, batch_size=8, verbose=0)
kfold = KFold(n_splits=10)
results = cross_val_score(estimator, input_var, output_var, cv=kfold)
The only issue I get is Tensorlfow complaining about not using tensor
I hope this help if not let me know the dimension of your data looks like
I try creating a neural network, having two inputs of a particular size (here four) each and one output of the same size size (so also four). Unfortunately, I always get this error when running my code:
ValueError: Error when checking model input: the list of Numpy arrays that you are passing to your model is not
the size the model expected. Expected to see 2 array(s), but instead got the following list of 1 arrays:
[array([[[-1.07920336, 1.16782929, 1.40131554, -0.30052492],
[-0.50067655, 0.54517916, -0.87033621, -0.22922157]],
[[-0.53766128, -0.03527806, -0.14637072, 2.32319071],
[ 0...
I think, the problem lies in the fact, that once I pass the data for training, the input shape is either incorrect or I have a datatype issue. Hence, there is an extra list bracket around the array.
I'm using Tensorflow 1.9.0 (due to project restrictions). I already checked the search function and tried solutions provided here. Following is an example code for reproducting the error of mine:
import numpy as np
import tensorflow as tf
from tensorflow import keras
import keras.backend as K
from tensorflow.keras import layers, models
def main():
ip1 = keras.layers.Input(shape=(4,))
ip2 = keras.layers.Input(shape=(4,))
dense = layers.Dense(3, activation='sigmoid', input_dim=4) # Passing the value in a weighted manner
merge_layer = layers.Concatenate()([ip1, ip2]) # Concatenating the outputs of the first network
y = layers.Dense(6, activation='sigmoid')(merge_layer) # Three fully connected layers
y = layers.Dense(4, activation='sigmoid')(y)
model = keras.Model(inputs=[ip1, ip2], outputs=y)
model.compile(optimizer='adam',
loss='mean_squared_error')
model.summary()
# dataset shape: 800 samples, 2 inputs for sequential model, 4 input size
X_train = np.random.randn(800, 2, 4)
y_train = np.random.randn(800, 4)
X_test = np.random.randn(200, 2, 4)
y_test = np.random.randn(200, 4)
history = model.fit(X_train, y_train, validation_data=(X_test, y_test), epochs=1000, batch_size=32)
if __name__ == '__main__':
main()
When there is multiple inputs keras expects list of multiple arrays. The size of the list corresponds to number of inputs you have for the model.
So basically you need to pass a list of 2 array each with shape (X,4)
X_train1 = np.random.randn(800, 4)
X_train2=np.random.randn(800,4)
y_train = np.random.randn(800, 4)
X_test1 = np.random.randn(200, 4)
X_test2 = np.random.randn(200, 4)
y_test = np.random.randn(200, 4)
history = model.fit([X_train1,X_train2], y_train, validation_data=([X_test1,X_test2], y_test), epochs=1000, batch_size=32)
I am constructing an LSTM predictor with Keras. My input array is historical price data. I segment the data into window_size blocks, in order to predict prediction length blocks ahead. My data is a list of 4246 floating point numbers. I seperate my data into 4055 arrays each of length 168 in order to predict 24 units ahead.
This gives me an x_train set with dimension (4055,168). I then scale my data and try to fit the data but run into a dimension error.
df = pd.DataFrame(data)
print(f"Len of df: {len(df)}")
min_max_scaler = MinMaxScaler()
H = 24
window_size = 7*H
num_pred_blocks = len(df)-window_size-H+1
x_train = []
y_train = []
for i in range(num_pred_blocks):
x_train_block = df['C'][i:(i + window_size)]
x_train.append(x_train_block)
y_train_block = df['C'][(i + window_size):(i + window_size + H)]
y_train.append(y_train_block)
LEN = int(len(x_train)*window_size)
x_train = min_max_scaler.fit_transform(x_train)
batch_size = 1
def build_model():
model = Sequential()
model.add(LSTM(input_shape=(window_size,batch_size),
return_sequences=True,
units=num_pred_blocks))
model.add(TimeDistributed(Dense(H)))
model.add(Activation("linear"))
model.compile(loss="mse", optimizer="rmsprop")
return model
num_epochs = epochs
model= build_model()
model.fit(x_train, y_train, batch_size = batch_size, epochs = 50)
The error being returned is as such.
ValueError: Error when checking model target: the list of Numpy arrays that you are passing to your model is not the size the model expected. Expected to see 1 array(s), but instead got the following list of 4055 arrays: [array([[0.00630006],
Am I not segmenting correctly? Loading correctly? Should the number of units be different than the number of prediction blocks? I appreciate any help. Thanks.
Edit
The suggestions to convert them to Numpy arrays is correct but MinMixScalar() returns a numpy array. I reshaped the arrays into the proper dimension but now my computer is having CUDA memory error. I consider the problem solved. Thank you.
df = pd.DataFrame(data)
min_max_scaler = MinMaxScaler()
H = prediction_length
window_size = 7*H
num_pred_blocks = len(df)-window_size-H+1
x_train = []
y_train = []
for i in range(num_pred_blocks):
x_train_block = df['C'][i:(i + window_size)].values
x_train.append(x_train_block)
y_train_block = df['C'][(i + window_size):(i + window_size + H)].values
y_train.append(y_train_block)
x_train = min_max_scaler.fit_transform(x_train)
y_train = min_max_scaler.fit_transform(y_train)
x_train = np.reshape(x_train, (len(x_train), 1, window_size))
y_train = np.reshape(y_train, (len(y_train), 1, H))
batch_size = 1
def build_model():
model = Sequential()
model.add(LSTM(batch_input_shape=(batch_size, 1, window_size),
return_sequences=True,
units=100))
model.add(TimeDistributed(Dense(H)))
model.add(Activation("linear"))
model.compile(loss="mse", optimizer="rmsprop")
return model
num_epochs = epochs
model = build_model()
model.fit(x_train, y_train, batch_size = batch_size, epochs = 50)
I don't think you passed the batch size in the model.
input_shape=(window_size,batch_size) is the data dimension. which is correct, but you should use input_shape=(window_size, 1)
If you want to use batch, you have to add another dimension, like this LSTM(n_neurons, batch_input_shape=(n_batch, X.shape[1], X.shape[2])) (Cited from the Keras)
in your case:
def build_model():
model = Sequential()
model.add(LSTM(input_shape=(batch_size, 1, window_size),
return_sequences=True,
units=num_pred_blocks))
model.add(TimeDistributed(Dense(H)))
model.add(Activation("linear"))
model.compile(loss="mse", optimizer="rmsprop")
return model
You also need to use np.shape to change the dimension of the of your data, it should be (batch_dim, data_dim_1, data_dim_2). I use numpy, so numpy.reshape() will work.
First your data should be row-wise, so for each row, you should have a shape of (1, 168), then add the batch dimension, it will be (batch_n, 1, 168).
Hope this help.
That's probably because x_train and y_train were not updated to numpy arrays. Take a closer look at this issue on github.
model = build_model()
x_train, y_train = np.array(x_train), np.array(y_train)
model.fit(x_train, y_train, batch_size = batch_size, epochs = 50)
Consider the following example problem:
# dummy data for a SO question
import numpy as np
import matplotlib.pyplot as plt
plt.style.use('seaborn-whitegrid')
from keras.models import Model
from keras.layers import Input, Conv1D, Dense
from keras.optimizers import Adam, SGD
time = np.array(range(100))
brk = np.array((time>40) & (time < 60)).reshape(100,1)
B = np.array([5, -5]).reshape(1,2)
np.dot(brk, B)
y = np.c_[np.sin(time), np.sin(time)] + np.random.normal(scale = .2, size=(100,2))+ np.dot(brk, B)
plt.clf()
plt.plot(time, y[:,0])
plt.plot(time, y[:,1])
You've got N time series, and they've got one component that follows a common process, and another component that is idiosyncratic to the series itself. Assume for simplicity that you know a priori that the bump is between 40 and 60, and you want to model it simultaneously with the sinusoidal component.
A TCN does a good job on the common component, but it can't get the series-idiosyncratic component:
# time series model
n_filters = 10
filter_width = 3
dilation_rates = [2**i for i in range(7)]
inp = Input(shape=(None, 1))
x = inp
for dilation_rate in dilation_rates:
x = Conv1D(filters=n_filters,
kernel_size=filter_width,
padding='causal',
activation = "relu",
dilation_rate=dilation_rate)(x)
x = Dense(1)(x)
model = Model(inputs = inp, outputs = x)
model.compile(optimizer = Adam(), loss='mean_squared_error')
model.summary()
X_train = np.transpose(np.c_[time, time]).reshape(2,100,1)
y_train = np.transpose(y).reshape(2,100,1)
history = model.fit(X_train, y_train,
batch_size=2,
epochs=1000,
verbose = 0)
yhat = model.predict(X_train)
plt.clf()
plt.plot(time, y[:,0])
plt.plot(time, y[:,1])
plt.plot(time, yhat[0,:,:])
plt.plot(time, yhat[1,:,:])
On the other hand, a basic linear regression with N outputs (here implemented in Keras) is perfect for the idiosyncratic component:
inp1 = Input((1,))
x1 = inp1
x1 = Dense(2)(x1)
model1 = Model(inputs = inp1, outputs = x1)
model1.compile(optimizer = Adam(), loss='mean_squared_error')
model1.summary()
brk_train = brk
y_train = y
history = model1.fit(brk_train, y_train,
batch_size=100,
epochs=6000, verbose = 0)
yhat1 = model1.predict(brk_train)
plt.clf()
plt.plot(time, y[:,0])
plt.plot(time, y[:,1])
plt.plot(time, yhat1[:,0])
plt.plot(time, yhat1[:,1])
I want to use keras to jointly estimate the time series component and the idiosyncratic component. The major problem is that feed-forward networks (which linear regression is a special case of) take shape batch_size x dims while time series networks take dimension batch_size x time_steps x dims.
Because I want to jointly estimate the idiosyncratic part of the model (the linear regression part) together with the time series part, I'm only ever going to batch-sample whole time-series. Which is why I specified batch_size = time_steps for model 1.
But in the static model, what I'm really doing is modeling my data as time_steps x dims.
I have tried to re-cast the feed-forward model as a time-series model, without success. Here's the non-working approach:
inp3 = Input(shape = (None, 1))
x3 = inp3
x3 = Dense(2)(x3)
model3 = Model(inputs = inp3, outputs = x3)
model3.compile(optimizer = Adam(), loss='mean_squared_error')
model3.summary()
brk_train = brk.reshape(1, 100, 1)
y_train = np.transpose(y).reshape(2,100,1)
history = model3.fit(brk_train, y_train,
batch_size=1,
epochs=1000, verbose = 1)
ValueError: Error when checking target: expected dense_40 to have shape (None, 2) but got array with shape (100, 1)
I am trying to fit the same model as model1, but with a different shape, so that it is compatible with the TCN model -- and importantly so that it will have the same batching structure.
The output should ultimately have the shape (2, 100, 1) in this example. Basically I want the model to do the following algorithm:
ingest X of shape (N, time_steps, dims)
Lose the first dimension, because the design matrix is going to be identical for every series, yielding X1 of shape (time_steps, dims)
Forward step: np.dot(X1, W), where W is of dimension (dims, N), yielding X2 of dimension (time_steps, N)
Reshape X2 to (N, time_steps, 1). Then I can add it to the output of the other part of the model.
Backwards step: since this is just a linear model, the gradient of W with respect to the output is just X1
How can I implement this? Do I need a custom layer?
I'm building off of ideas in this paper, in case you're curious about the motivation behind all of this.
EDIT: After posting, I noticed that I used only the time variable, rather than the time series itself. A TCN fit with the lagged series fits the idiosyncratic part of the series just fine (in-sample anyway). But my basic question still stands -- I want to merge the two types of networks.
So, I solved my own problem. The answer is to create dummy interactions (and a thus a really sparse design matrix) and then reshape the data.
###########################
# interaction model
import numpy as np
import matplotlib.pyplot as plt
plt.style.use('seaborn-whitegrid')
from keras.models import Model
from keras.layers import Input, Conv1D, Dense
from keras.optimizers import Adam, SGD
from patsy import dmatrix
def shift5(arr, num, fill_value=np.nan):
result = np.empty_like(arr)
if num > 0:
result[:num] = fill_value
result[num:] = arr[:-num]
elif num < 0:
result[num:] = fill_value
result[:num] = arr[-num:]
else:
result = arr
return result
time = np.array(range(100))
brk = np.array((time>40) & (time < 60)).reshape(100,1)
B = np.array([5, -5]).reshape(1,2)
np.dot(brk, B)
y = np.c_[np.sin(time), np.sin(time)] + np.random.normal(scale = .2, size=(100,2))+ np.dot(brk, B)
plt.clf()
plt.plot(time, y[:,0])
plt.plot(time, y[:,1])
# define interaction model
inp = Input(shape=(None, 2))
x = inp
x = Dense(1)(x)
model = Model(inputs = inp, outputs = x)
model.compile(optimizer = Adam(), loss='mean_squared_error')
model.summary()
from patsy import dmatrix
df = pd.DataFrame(data = {"fips": np.concatenate((np.zeros(100), np.ones(100))),
"brk": np.concatenate((brk.reshape(100), brk.squeeze()))})
df.brk = df.brk.astype(int)
tm = np.asarray(dmatrix("brk:C(fips)-1", data = df))
brkint = np.concatenate(( \
tm[:100,:].reshape(1,100,2),
tm[100:200,:].reshape(1,100,2)
), axis = 0)
y_train = np.transpose(y).reshape(2,100,1)
history = model.fit(brkint, y_train,
batch_size=2,
epochs=1000,
verbose = 1)
yhat = model.predict(brkint)
plt.clf()
plt.plot(time, y[:,0])
plt.plot(time, y[:,1])
plt.plot(time, yhat[0,:,:])
plt.plot(time, yhat[1,:,:])
The output shape is the same as for the TCN, and can simply be added element-wise.