I am trying to run a neural network on text inputs. This is a binary classification. Here is my working code so far:
df = pd.read_csv(pathname, encoding = "ISO-8859-1")
df = df[['content_cleaned', 'meaningful']] #Content cleaned: text, meaningful: label
X = df['content_cleaned']
y = df['meaningful']
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=21)
tokenizer = Tokenizer(num_words=100)
tokenizer.fit_on_texts(X_train)
X_train_encoded = tokenizer.texts_to_sequences(X_train)
X_test_encoded = tokenizer.texts_to_sequences(X_test)
max_len = 100
X_train = pad_sequences(X_train_encoded, maxlen=max_len)
X_test = pad_sequences(X_test_encoded, maxlen=max_len)
batch_size = 100
max_words = 100
input_dim = X_train.shape[1] # Number of features
model = Sequential()
model.add(layers.Dense(10, activation='relu', input_shape=X_train.shape[1:]))
model.add(layers.Dense(1, activation='sigmoid'))
model.compile(loss='categorical_crossentropy',
              optimizer='adam',
              metrics=['accuracy'])
history = model.fit(X_train, X_test,
                    batch_size=batch_size,
                    epochs=5,
                    verbose=1,
                    validation_split=0.1)
My question has two parts. The first is about the input_shape when creating the layers. I am confused about the syntax for declaring it. When running this command:
print(X_train.shape)
I am getting this shape: (3609, 100).
From my understanding, this is telling me that there are 3609 instances. From viewing other examples, my naive assumption was to use the 100, as there are 100 types (I may be understanding this incorrectly) corresponding to the max_words that I initialized. I believe that I may have used the wrong syntax when initializing the input_shape.
The second question is about an error message I get when running all of this (most likely caused by the incorrect input_shape). The error message highlights this line of code:
validation_split=0.1)
The error message is:
ValueError: Error when checking target: expected dense_2 to have shape (None, 1) but got array with shape (1547, 1
Am I going about this problem incorrectly? I am very new to Deep Learning.
The input_shape argument specifies the shape of one training sample. Therefore, you need to set it to X_train.shape[1:] (i.e. ignore samples or batch axis):
model.add(layers.Dense(10, activation='relu', input_shape=X_train.shape[1:]))
Further, pass X_train and y_train to fit (instead of X_train and X_test).
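To make the "ignore the batch axis" point concrete, here is a minimal NumPy-only sketch. It assumes a zero-filled stand-in array with the shape reported in the question, (3609, 100); the real X_train would come from pad_sequences.

```python
import numpy as np

# Stand-in for the padded training matrix from the question:
# 3609 samples, each padded to 100 token ids.
X_train = np.zeros((3609, 100), dtype="int32")

n_samples = X_train.shape[0]     # batch axis, NOT part of input_shape
input_shape = X_train.shape[1:]  # shape of one training sample

print(n_samples)    # 3609
print(input_shape)  # (100,)
```

The resulting tuple is what would be passed as input_shape=input_shape to the first Dense layer.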
You are missing two closing parentheses ) on the line where you define the input of your model. Also make sure that you provide an activation function.
Change your code as below:
model.add(layers.Dense(10, activation='relu', input_shape=(X_train.shape[1],)))
EDIT:
For your last error, just change your input_shape to input_shape=(X_train.shape[1],), i.e. the number of features per sample rather than the number of samples.
I am trying to predict the number of peaks in time series data by using a CNN and keep getting a data shape error. My data looks as follows:
X = list of 520 lists (each is a time series) of various lengths (shortest = 137 elements, longest = 2297 elements)
y = list with 520 elements, each being the number of peaks for the respective time series
Due to the various lengths of the time series, I padded X. The shapes of X_train and X_test, after converting them from numpy arrays to tensors are:
X_train.shape = TensorShape([390, 2297])
X_test.shape = TensorShape([130, 2297])
I am new to Keras and very unsure about the input_shape of the first Conv1D layer. Following this post (Keras/Tensorflow Conv1D expected input shape), I chose (2297, 1) or (520, 1), but neither works. The Keras documentation says that the input shape should be (batch_size, feature_size, channels), where batch_size is omitted, though.
from sklearn.model_selection import train_test_split
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras.layers import *
from tensorflow.keras.optimizers import Adam
#for structure of X and y, see explanation above
X_padded = tf.keras.preprocessing.sequence.pad_sequences(X)
X_train, X_test, y_train, y_test = train_test_split(X_padded, y, test_size=0.25, random_state=33)
X_train = tf.convert_to_tensor(X_train)
X_test = tf.convert_to_tensor(X_test)
y_train = tf.convert_to_tensor(y_train)
y_test = tf.convert_to_tensor(y_test)
model = keras.Sequential()
model.add(Conv1D(filters=16, kernel_size=3, activation = 'relu', strides = 1, padding = 'same', input_shape=(2297, 1)))
model.add(Dropout(0.1))
model.add(Conv1D(filters=32, kernel_size=3, activation = 'relu', strides = 1, padding = 'same'))
model.add(Dropout(0.2))
model.add(Flatten())
model.add(Dense(9, activation='softmax')) # '9' because there are 9 possible peak counts in the data
model.compile(optimizer=Adam(learning_rate = 0.001), loss = 'sparse_categorical_crossentropy', metrics = ['accuracy'])
progress = model.fit(X_train, y_train, epochs = 15, validation_data = (X_test, y_test), verbose=1)
Error:
ValueError: Input 0 of layer sequential is incompatible with the layer: expected ndim=3, found ndim=2. Full shape received: [None, 2297]
What might be the issue here?
I was able to solve it. The correct input shape is given in the answer by user 'rnso' to the question "Convolutional neural network Conv1d input shape".
I shaped my X_train and X_test (being numpy.arrays) as
X_train = X_train.reshape(X_train.shape[0], X_train.shape[1], 1)
X_test = X_test.reshape(X_test.shape[0], X_test.shape[1], 1)
and set the input_shape in the Conv1D layer to input_shape=(ncols, 1), which here is:
input_shape=(2297, 1)
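As a quick check of that reshape, here is a small NumPy sketch. It assumes a zero-filled stand-in with the training shape from the question, (390, 2297), and it has to run on NumPy arrays, i.e. before any tf.convert_to_tensor call.

```python
import numpy as np

# Stand-in for the padded series: 390 series, each padded to 2297 steps.
X_train = np.zeros((390, 2297), dtype="float32")

# Add a trailing channel axis so each sample has shape (steps, channels).
X_train_3d = X_train.reshape(X_train.shape[0], X_train.shape[1], 1)

# Equivalent spelling that does not repeat the sizes.
X_train_alt = X_train[..., np.newaxis]

# Per-sample shape to hand to Conv1D (batch axis dropped).
input_shape = X_train_3d.shape[1:]
print(X_train_3d.shape, input_shape)
```

This gives the array the three dimensions (ndim=3) that the error message asks for.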
I'm trying to do binary classification. Each entry of my data is composed of a list of features (previously generated BERT features saved as tensor.numpy.flatten() in a CSV and loaded as a pandas DataFrame) and a label (0 or 1).
Then I create a naive model (just to test) with a Dense layer of 16 relu units and a Dense layer of 1 sigmoid unit. I compile the model with BinaryCrossentropy loss and the Adam optimizer.
When I run model.fit, I get this error:
ValueError: Dimension 0 in both shapes must be equal, but are 2 and 1. Shapes are [2] and [1]. for '{{node AssignAddVariableOp_8}} = AssignAddVariableOp[dtype=DT_FLOAT](AssignAddVariableOp_8/resource, Sum_7)' with input shapes: [], [1]
I've already tried to reshape y, but got the same error. If I use a 2-unit softmax layer (with BinaryCrossentropy or sparse_categorical_crossentropy), I get another error:
ValueError: logits and labels must have the same shape ((None, 2) vs (None, 1))
I've looked at many solutions here on Stack Overflow, but couldn't solve my problem. Could anyone please help me?
Tensorflow and keras versions: 2.5.0
train = pd.read_csv("../../../bert_features/panglee/panglee_bert_features_train.csv", sep=",")
valid = pd.read_csv("../../../bert_features/panglee/panglee_bert_features_valid.csv", sep=",")
test = pd.read_csv("../../../bert_features/panglee/panglee_bert_features_test.csv", sep=",")
x_train = train['bert_feats'].apply(np.vectorize(tf.constant))
y_train = train['label'].apply(np.vectorize(tf.constant))
x_valid = valid['bert_feats'].apply(np.vectorize(tf.constant))
y_valid = valid['label'].apply(np.vectorize(tf.constant))
x_test = test['bert_feats'].apply(np.vectorize(tf.constant))
y_test = test['label'].apply(np.vectorize(tf.constant))
y_train = np.asarray(y_train).astype('float32').reshape((len(y_train),1))
y_valid = np.asarray(y_valid).astype('float32').reshape((len(y_valid),1))
y_test = np.asarray(y_test).astype('float32').reshape((len(y_test),1))
model = keras.Sequential()
model.add(keras.layers.Dense(16, activation='relu'))
model.add(keras.layers.Dense(1, activation='sigmoid'))
model.compile(optimizer='adam',
              loss=keras.losses.BinaryCrossentropy(),
              metrics=['accuracy', tf.metrics.Precision(), tf.metrics.Recall(), tfa.metrics.F1Score(num_classes=2, average='macro'), tf.metrics.AUC(curve='PR'), tf.metrics.AUC(curve='ROC')])
model.fit(x_train, y_train, epochs=10, validation_data=(x_valid, y_valid))
Answering here for the benefit of the community, since the issue was solved as mentioned in the comment:
Problem solved. I used set_to_categorical() and it worked fine.
I changed the data type, but I could not resolve the error.
I tried one-hot encoding, but that doesn't work either.
I don't know what's wrong :(
seed = 0
np.random.seed(seed)
tf.set_random_seed(seed)
df = pd.read_csv('HW01_dataset_tae.txt', sep=',' ,header=None, names = ["Native", "Instructor", "Course", "Semester", "Class Size", "Evaluation"])
dataset = df.values # dataframe to int64
X = dataset[:,0:5] # attribute
Y_Eva = dataset[:,5] # class
e = LabelEncoder()
e.fit(Y_Eva)
Y = e.transform(Y_Eva)
K = 10
kFold = StratifiedKFold(n_splits=K, shuffle=True, random_state=seed)
accuracy = []
for train_index, test_index in kFold.split(X,Y):
    model = Sequential()
    model.add(Dense(16, input_dim=5, activation='relu'))
    model.add(Dense(10, activation='relu'))
    model.add(Dense(3, activation='softmax'))
    model.compile(loss='mean_squared_error',
                  optimizer='adam',
                  metrics=['accuracy'])
    model.fit(X[train_index], Y[train_index], epochs=100, batch_size=2)
The error is: Error when checking target: expected dense_2 to have shape (3,) but got array with shape (1,)
It is raised at this line: model.fit(X[train_index], Y[train_index], epochs=100, batch_size=2).
What should I do?
I solved the problem.
In this line,
model.fit(X[train_index], Y[train_index], epochs=100, batch_size=2)
each target in Y[train_index] must have three entries, one per class, because the output layer has three units.
The error occurred because each target in Y[train_index] was a single integer label.
So I used one-hot encoding and changed the code like this:
e = LabelEncoder()
e.fit(Y_Eva)
Y = e.transform(Y_Eva)
Y_encoded = np_utils.to_categorical(Y) # changed code
K = 10
kFold = StratifiedKFold(n_splits=K, shuffle=True, random_state=seed)
accuracy = []
for train_index, test_index in kFold.split(X,Y):
    model = Sequential()
    model.add(Dense(32, input_dim=5, activation='relu'))
    model.add(Dense(16, activation='relu'))
    model.add(Dense(3, activation='softmax'))
    model.compile(loss='categorical_crossentropy',
                  optimizer='adam',
                  metrics=['accuracy'])
    model.fit(X[train_index], Y_encoded[train_index], epochs=100, batch_size=2) # changed code
Finally, I was able to run the code.
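For anyone wondering what the encoding step does to the target shape, here is a rough NumPy-only sketch. The label values are made up for illustration; np.unique with return_inverse stands in for LabelEncoder (both map the sorted unique labels to 0..n_classes-1), and indexing np.eye stands in for to_categorical.

```python
import numpy as np

# Hypothetical class labels standing in for the "Evaluation" column.
Y_Eva = np.array([3, 1, 2, 1, 3])

# Map labels to integer ids 0..n_classes-1, in sorted label order.
classes, Y = np.unique(Y_Eva, return_inverse=True)

# One row per sample, one column per class.
Y_encoded = np.eye(len(classes), dtype="float32")[Y]

print(Y.shape)          # (5,)   -> one integer per sample
print(Y_encoded.shape)  # (5, 3) -> matches a 3-unit softmax output
```

Note that the loop above still passes the integer labels Y to kFold.split, which is what StratifiedKFold needs; only the targets given to fit are one-hot.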
TensorFlow has documentation on the Dense layer; if you use input_shape instead of input_dim, you can specify the preferred shape.
https://www.tensorflow.org/api_docs/python/tf/keras/layers/Dense
model = Sequential()
model.add(Dense(16, input_shape=(5,))) # Then your data has to be of shape (batch x 5)
When you then add another Dense layer, you don't actually have to provide the input_shape:
model.add(Dense(10))
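The reason later layers need no input shape can be seen from the shapes alone. Below is a NumPy-only sketch of two stacked dense layers, assuming no activation and random weights; the names W1, b1, W2, b2 are made up for illustration and are not Keras attributes.

```python
import numpy as np

rng = np.random.default_rng(0)

batch = 4
x = rng.normal(size=(batch, 5))  # data of shape (batch, 5)

# First layer: 5 inputs -> 16 units. Only here must the input size be given.
W1, b1 = rng.normal(size=(5, 16)), np.zeros(16)
# Second layer: its input size (16) is simply the first layer's output size.
W2, b2 = rng.normal(size=(16, 10)), np.zeros(10)

h1 = x @ W1 + b1   # shape (batch, 16)
h2 = h1 @ W2 + b2  # shape (batch, 10)

print(h1.shape, h2.shape)
```

Keras can work out the second layer's weight shape the same way, from the output of the layer before it.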
I'm trying to construct a (very simple) Keras model as a baseline for a project. I have a list of 3459 numpy arrays of shape (2, 6, 15) as input, and a list of target values (ints as numpy arrays with shape ()). When I try to train the model I get this error:
"ValueError: Number of samples 2 is less than samples required for specified batch_size 32 and steps 108."
The model so far is extremely simple, but I'm having no luck getting it to train:
input = Input(shape=(2, 6, 15))
x = Dense(64, activation='relu')(input)
x = Dense(64, activation='relu')(x)
output = Dense(1)(x)
model = Model(inputs=input, outputs=output)
model.compile(optimizer='adam', loss='mean_squared_error',
              metrics=['accuracy'])
hist = model.fit(
    X_train,
    y_train,
    batch_size=32,
    epochs=10,
    validation_data=(X_test, y_test),
    steps_per_epoch=(len(X_train) // 32),
    validation_steps=(len(X_test) // 32))
I'm currently loading the data from pickle files, and I suspect that the issue might be the array structure of the individual training cases. When I look at one of the arrays in X_train, it has the structure [[[...]...], [[...]...]], and I suspect the code is mistaking the outer brackets for the batch axis, so it reads a batch size of 2 instead. Just a theory, but I don't know how to check it myself.
The error is indeed due to the way your data is being generated/loaded; there are no errors if you train the model on random tensors with the specified shapes.
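The answer does not say what exactly is wrong with the loading. One possible fix, assuming the data really is a Python list of per-sample arrays as described in the question, is to stack the list into a single array so that the first axis is the sample axis. A NumPy sketch with a small made-up stand-in (8 samples instead of 3459):

```python
import numpy as np

# Hypothetical stand-in for the pickled data: a list of per-sample arrays.
samples = [np.zeros((2, 6, 15), dtype="float32") for _ in range(8)]
targets = [np.array(i) for i in range(8)]  # 0-d int arrays, shape ()

# Stack along a new leading axis: that axis becomes the sample/batch axis.
X_train = np.stack(samples)    # shape (8, 2, 6, 15)
y_train = np.asarray(targets)  # shape (8,)

print(X_train.shape, y_train.shape)
```

Printing X_train.shape before calling fit is a cheap way to test the theory from the question: the first number should be the number of samples, not 2.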
I'm working on a 2-layer RNN (LSTM). I think I have successfully reshaped my train and test sets, but when I try to run the code, it stops with this exception:
Exception: When using TensorFlow, you should define explicitly the
number of timesteps of your sequences. If your first layer is an
Embedding, make sure to pass it an "input_length" argument. Otherwise,
make sure the first layer has an "input_shape" or "batch_input_shape"
argument, including the time axis.
I tried several configurations, but none of them works. I don't know how to fix it.
Here is the code where I create the model and reshape X_train and X_test:
X_train = np.reshape(X_train, (X_train.shape[0], X_train.shape[1], EMB_SIZE))
X_test = np.reshape(X_test, (X_test.shape[0], X_test.shape[1], EMB_SIZE))
print 'Building model...'
model = Sequential()
model.add(LSTM(input_dim=EMB_SIZE, output_dim=100, return_sequences=True, input_shape=(X_train.shape[1], X_train.shape[2], 1)))
model.add(LSTM(input_dim=EMB_SIZE, output_dim=100, return_sequences=False,input_shape=(X_train.shape[1], X_train.shape[2], 1)))
model.add(Dense(2))
model.add(Activation('softmax'))
model.compile(optimizer='adam',
              loss='mse',
              metrics=['accuracy'])
model.fit(X_train,
          Y_train,
          nb_epoch=5,
          batch_size = 128,
          verbose=1,
          validation_split=0.1)
score= model.evaluate(X_test, Y_test, batch_size=128)
print score
Any help is really appreciated!
Thank you in advance <3
The number of units in the last layer defines the output shape of the model.
The output shape must be the same shape as your targets (Y).
Dense(2) -> Output shape = (None, 2)
Dense(1) -> Output shape = (None, 1)
Y_train -> Target shape = (15015,1)
Whoa.... Keras 0.3.3? No wonder everything will be problematic.