I am building a Keras model to classify data into 3000 different classes. My training data consists of a large number of samples, so after one-hot encoding the training outputs the data becomes very large (item_count * 3000 * size of float + input data size).
Is it possible to pass sparse arrays to Keras as the training output? Any suggested solution?
You can use a sparse representation of your ground truths with the sparse_categorical_crossentropy loss function: each label is then a single integer class index instead of a 3000-dimensional one-hot vector.
# assuming get_model() returns your Keras model with an output_shape == [None, 3000]
# assuming get_data() returns training data, with y_train having shape == [num_samples]
x_train, y_train = get_data()
model = get_model()
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])
model.fit(x_train, y_train, epochs=10, batch_size=16)
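If your labels are already one-hot encoded, an illustrative snippet (the y_onehot array below is just a placeholder) to get them back into the 1-D integer form this loss expects:
import numpy as np
# hypothetical one-hot labels of shape (num_samples, 3000)
y_onehot = np.eye(3000, dtype='float32')[np.random.randint(0, 3000, size=8)]
# integer class ids of shape (num_samples,), as expected by sparse_categorical_crossentropy
y_train = np.argmax(y_onehot, axis=-1)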
I have a table with 1799 users and 31 features, arranged in rows and columns respectively. The last column is a two-class condition feature that tells the model which condition each user belongs to. I understood that with an LSTM I need to make my input 3-D, so I used reshape(31, 1) as I don't have time-series data. I also understood that input_shape takes the number of features.
My issue is that I want to predict a new set of users who have the same 30 features and get a classification result telling me which user belongs to which condition, ideally with the probability of each predicted condition. So I tried to use model.predict for this. It gave me a numpy array predict_prob with shape=(200, 31, 1). I am confused: the input data should be [(31x1) x 200] and the output should be the users' conditions, i.e. shape (200,). How come the result is 3-D, and how should I convert it to a dataframe so I can save it as .csv? Thank you in advance.
X = raw_data[feature_names]
P = predict_data_raw[feature_names]
P1 = predict_data_raw[feature_names1]
#Training
y = raw_data['Conditions']
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=22, test_size=0.1)
X_test = np.expand_dims(X_test, axis=2)
# fit and evaluate a model
model = Sequential()
model.add(Reshape((31,1)))
model.add(Bidirectional(LSTM(10, return_sequences=True),input_shape=(31,)))
model.add(Dropout(0.5))
model.add(Dense(8, activation='relu'))
model.add(Dense(1, activation='sigmoid'))
# compile the keras model
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
# fit the keras model on the dataset
LSTM = model.fit(X_train, y_train, epochs=5, batch_size=10)
# evaluate the keras model
_, accuracy = model.evaluate(X_test)
print('Accuracy: %.2f' % (accuracy*100))
predict_prob=model.predict([X_test])
df = pd.DataFrame(predict_prob, columns=["Prediction"])
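(Not part of the original thread, just a clarifying sketch.) The 3-D result comes from return_sequences=True: the Bidirectional LSTM keeps the 31-step axis and the Dense layers are then applied per step, giving (200, 31, 1). With return_sequences=False the model emits one probability per user, which can be written straight to a DataFrame; X_new below is a stand-in for the new users' feature matrix:
import numpy as np
import pandas as pd
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Reshape, Bidirectional, LSTM, Dropout, Dense

model = Sequential([
    Reshape((31, 1), input_shape=(31,)),
    Bidirectional(LSTM(10, return_sequences=False)),  # one vector per user, not per step
    Dropout(0.5),
    Dense(8, activation='relu'),
    Dense(1, activation='sigmoid'),                    # probability of the positive condition
])
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])

X_new = np.random.random((200, 31)).astype('float32')  # placeholder for the 200 new users
predict_prob = model.predict(X_new)                     # shape (200, 1)
pd.DataFrame(predict_prob, columns=['Prediction']).to_csv('predictions.csv', index=False)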
I want to train a keras neural network on the mnist dataset. The problem is that my model already overfits after 1 or 2 epochs. To combat this problem, I wanted to use data augmentation:
First I load the data:
#load mnist dataset
(tr_images, tr_labels), (test_images, test_labels) = mnist.load_data()
#normalize images
tr_images, test_images = preprocess(tr_images, test_images)
#function which returns the amount of train images, test images and classes
amount_train_images, amount_test_images, total_classes = get_data_information(tr_images, tr_labels, test_images, test_labels)
#convert labels into the respective vectors
tr_vector_labels = keras.utils.to_categorical(tr_labels, total_classes)
test_vector_labels = keras.utils.to_categorical(test_labels, total_classes)
I create a model with a "create_model" function:
untrained_model = create_model()
This is the function definition:
def create_model(_learning_rate=0.01, _momentum=0.9, _decay=0.001, _dense_neurons=128, _fully_connected_layers=3, _loss="sparse_categorical_crossentropy", _dropout=0.1):
    # create model
    model = keras.Sequential()
    # input
    model.add(Flatten(input_shape=(28, 28)))
    # add fully connected layers
    for i in range(_fully_connected_layers):
        model.add(Dense(_dense_neurons, activation='relu'))
        model.add(Dropout(_dropout))
    # classifier
    model.add(Dense(total_classes, activation='sigmoid'))
    optimizer = keras.optimizers.SGD(
        learning_rate=_learning_rate,
        momentum=_momentum,
        decay=_decay
    )
    # compile
    model.compile(
        optimizer=optimizer,
        loss=_loss,
        metrics=['accuracy']
    )
    return model
The function returns a compiled but untrained model. I also use this function when I try to optimize the hyperparameters (hence the many parameters).
Then I create an ImageDataGenerator:
generator = tf.keras.preprocessing.image.ImageDataGenerator(
    rotation_range=0.15,
    width_shift_range=0.15,
    height_shift_range=0.15,
    zoom_range=0.15
)
Now I want to train the model with my train_model_with_data_augmentation function:
train_model_with_data_augmentation(
    tr_images=tr_images,
    tr_labels=tr_labels,
    test_images=test_images,
    test_labels=test_labels,
    model=untrained_model,
    generator=generator,
    hyperparameters=hyperparameters
)
However, I don't know how to use this generator with the model I've created, because the only method I've found is the generator's own fit method, and I want to train my model, not the generator.
Here is the graph that I get from the training history: https://ibb.co/sKFnwGr
Can I somehow convert the generator to data that I can use as parameters in the fit method of the model?
If not: How can I train the model I've created with this generator? (or do I have to implement data augmentation in a completely different way?)
Does data augmentation even make sense with the mnist dataset?
What other options are there to prevent overfitting on mnist?
Update:
I tried to use this code:
generator.fit(x_train)
model.fit(generator.flow(x_train, y_train, batch_size=32), steps_per_epoch=len(x_train)/32, epochs=epochs)
However I get this error message:
ValueError: "Input to .fit() should have rank 4. Got array with shape: (60000, 28, 28)"
I believe the input to the fit method should be (image index, height, width, depth), i.e. 4-dimensional, while my x_train array only has 3 dimensions and no depth (channel) dimension. I tried to expand it:
x_train = x_train[..., np.newaxis]
y_train = y_train[..., np.newaxis]
But then I get this error message:
"Error occurred when finalizing GeneratorDataset iterator: Failed precondition: Python interpreter state is not initialized. The process may be terminated."
A working example of using ImageDataGenerator can be found here. The example itself:
(x_train, y_train), (x_test, y_test) = cifar10.load_data()
y_train = np_utils.to_categorical(y_train, num_classes)
y_test = np_utils.to_categorical(y_test, num_classes)
datagen = ImageDataGenerator(
    featurewise_center=True,
    featurewise_std_normalization=True,
    rotation_range=20,
    width_shift_range=0.2,
    height_shift_range=0.2,
    horizontal_flip=True)
# compute quantities required for featurewise normalization
# (std, mean, and principal components if ZCA whitening is applied)
datagen.fit(x_train)
# fits the model on batches with real-time data augmentation:
model.fit(datagen.flow(x_train, y_train, batch_size=32),
          steps_per_epoch=len(x_train) / 32, epochs=epochs)
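For the MNIST case in the question, the same pattern should work once the grayscale channel axis is added. A self-contained sketch under that assumption (it uses its own stand-in model, not the asker's create_model; only the images need the extra axis, the integer labels can stay 1-D):
import numpy as np
import tensorflow as tf
from tensorflow import keras

(x_train, y_train), _ = keras.datasets.mnist.load_data()
x_train = (x_train.astype('float32') / 255.0)[..., np.newaxis]  # (60000, 28, 28, 1): rank 4, as the generator requires

# stand-in model; the important part is that input_shape includes the channel axis
model = keras.Sequential([
    keras.layers.Flatten(input_shape=(28, 28, 1)),
    keras.layers.Dense(128, activation='relu'),
    keras.layers.Dropout(0.1),
    keras.layers.Dense(10, activation='softmax'),
])
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])

datagen = tf.keras.preprocessing.image.ImageDataGenerator(
    rotation_range=15, width_shift_range=0.15,
    height_shift_range=0.15, zoom_range=0.15)

model.fit(datagen.flow(x_train, y_train, batch_size=32),
          steps_per_epoch=len(x_train) // 32,  # integer division, not a float
          epochs=5)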
I'm trying to construct a (very simple) Keras model as a baseline for a project. I have a list of 3459 numpy arrays of shape (2, 6, 15) as input, and a list of target values (ints as numpy arrays with shape ()). When I try to train the model I get this error:
"ValueError: Number of samples 2 is less than samples required for specified batch_size 32 and steps 108."
The model so far is extremely simple, but I'm having no luck getting it to train:
input = Input(shape=(2, 6, 15))
x = Dense(64, activation='relu')(input)
x = Dense(64, activation='relu')(x)
output = Dense(1)(x)
model = Model(inputs=input, outputs=output)
model.compile(optimizer='adam', loss='mean_squared_error',
              metrics=['accuracy'])
hist = model.fit(
    X_train,
    y_train,
    batch_size=32,
    epochs=10,
    validation_data=(X_test, y_test),
    steps_per_epoch=(len(X_train) // 32),
    validation_steps=(len(X_test) // 32))
I'm currently loading the data from pickle files, and I suspect the issue might be the array structure of the individual training cases. Looking at one of the arrays in X_train, it has the structure [[[...]...], [[...]...]], and I suspect the code is mistaking the outer brackets for the batch container, so it reads a batch size of 2 instead. That's just a theory, and I don't know how to check it myself.
The error is indeed due to the way your data is being generated/loaded; there are no errors if you train the model on random tensors with the specified shapes.
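A quick way to check that (a sketch, not the asker's loader): stack the per-sample arrays into one (3459, 2, 6, 15) array, and the fit call runs without the batch-size error. Note the Flatten added in this sketch so the single output unit lines up with the scalar targets; the original snippet keeps the (2, 6, 15) structure through the Dense layers:
import numpy as np
from tensorflow.keras.layers import Input, Flatten, Dense
from tensorflow.keras.models import Model

# random stand-ins with the shapes described in the question;
# with real data this would be np.stack(list_of_arrays) / np.stack(list_of_targets)
X_train = np.random.random((3459, 2, 6, 15)).astype('float32')
y_train = np.random.random((3459,)).astype('float32')

inputs = Input(shape=(2, 6, 15))
x = Flatten()(inputs)                  # added here so Dense(1) yields one value per sample
x = Dense(64, activation='relu')(x)
x = Dense(64, activation='relu')(x)
outputs = Dense(1)(x)

model = Model(inputs=inputs, outputs=outputs)
model.compile(optimizer='adam', loss='mean_squared_error')
model.fit(X_train, y_train, batch_size=32, epochs=2)  # no ValueError once the arrays are stacked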
I'm completely new to machine learning and I wanted to start with a fairly easy project: digit recognition using the MNIST data set. I'm using Keras and TensorFlow, and I started from code I found here. The network is built and trained correctly, and I now want to make a simple prediction. For starters I simply used one of the pictures from the part of the data set meant for testing, and I would like the output to be that number. (In this case the output is supposed to be 7.)
Here's my code:
# Baseline MLP for MNIST dataset
from tensorflow.keras.datasets import mnist
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from keras.utils import np_utils
import numpy as np
# load data
(X_train, y_train), (X_test, y_test) = mnist.load_data()
# flatten 28*28 images to a 784 vector for each image
num_pixels = X_train.shape[1] * X_train.shape[2]
X_train = X_train.reshape((X_train.shape[0], num_pixels)).astype('float32')
X_test = X_test.reshape((X_test.shape[0], num_pixels)).astype('float32')
# normalize inputs from 0-255 to 0-1
X_train = X_train / 255
X_test = X_test / 255
# one hot encode outputs
y_train = np_utils.to_categorical(y_train)
y_test = np_utils.to_categorical(y_test)
num_classes = y_test.shape[1]
# define baseline model
def baseline_model():
    # create model
    model = Sequential()
    model.add(Dense(num_pixels, input_dim=num_pixels, kernel_initializer='normal', activation='relu'))
    model.add(Dense(num_classes, kernel_initializer='normal', activation='softmax'))
    # Compile model
    model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
    return model
# build the model
model = baseline_model()
print("created model")
# Fit the model
model.fit(X_train, y_train, validation_data=(X_test, y_test), epochs=10, batch_size=200, verbose=2)
print("did model.fit")
image_index=0
print("correct result : ", y_test[image_index])
print("shape of the array: ", X_test[0].shape)
print("predicted result : ", model.predict(X_test[image_index]))
Now I get the following error:
ValueError: Error when checking input: expected dense_input to have shape (784,) but got array with shape (1,)
although my array does have the correct shape! As you can see, I print print("shape of the array: ", X_test[0].shape), which returns shape of the array: (784,). 784 is exactly the dimension we want, and still I get that error.
I've spent hours trying to solve this, but no matter what I tried (reshaping the array, for example), it doesn't seem to work. Clearly there is some misunderstanding concerning either Keras' predict function or the array. Can you please help me understand and solve this?
Thank you in advance.
The predict function still expects dimension 0 to be the sample dimension.
When you indexed X_test[0], you removed this dimension, which makes predict see 784 samples of 1 pixel each!
change your code to:
print("predicted result : ", model.predict(X_test[0].reshape(-1,num_pixels)))
Now you should have the result probabilities.
Edit:
And if you just want the maximum probability predicted number:
print("predicted result : ", np.argmax(model.predict(X_test[0].reshape(-1,num_pixels)), axis = 1))
From the official example in Keras docs, the stacked LSTM classifier is trained using categorical_crossentropy as a loss function, as expected. https://keras.io/getting-started/sequential-model-guide/#examples
But the y_train values are seeded using numpy.random.random(), which outputs real numbers, versus the 0/1 values typical of classification targets.
Are the y_train values being promoted to 0/1 values under the hood?
Can you even train this loss function against real values between 0 and 1?
How is accuracy then calculated?
Confusing, no?
from keras.models import Sequential
from keras.layers import LSTM, Dense
import numpy as np
data_dim = 16
timesteps = 8
num_classes = 10
# expected input data shape: (batch_size, timesteps, data_dim)
model = Sequential()
model.add(LSTM(32, return_sequences=True,
               input_shape=(timesteps, data_dim)))  # returns a sequence of vectors of dimension 32
model.add(LSTM(32, return_sequences=True))  # returns a sequence of vectors of dimension 32
model.add(LSTM(32))  # return a single vector of dimension 32
model.add(Dense(10, activation='softmax'))
model.compile(loss='categorical_crossentropy',
              optimizer='rmsprop',
              metrics=['accuracy'])
# Generate dummy training data
x_train = np.random.random((1000, timesteps, data_dim))
y_train = np.random.random((1000, num_classes))
# Generate dummy validation data
x_val = np.random.random((100, timesteps, data_dim))
y_val = np.random.random((100, num_classes))
model.fit(x_train, y_train,
          batch_size=64, epochs=5,
          validation_data=(x_val, y_val))
For this example, y_train and y_val are no longer one-hot encodings but per-class probabilities, so cross-entropy still applies; one-hot encoding can be treated as the special case of a probability vector with all of its mass on one class.
y_train[0]
array([0.30172708, 0.69581121, 0.23264601, 0.87881279, 0.46294832,
0.5876406 , 0.16881395, 0.38856604, 0.00193709, 0.80681196])
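As a small illustration (not from the original answer) of why this works: categorical_crossentropy simply computes -sum(y_true * log(y_pred)) without thresholding y_true, and categorical accuracy compares argmax(y_true) with argmax(y_pred), so both are well defined even for soft targets like the one above:
import numpy as np
import tensorflow as tf

y_true = np.array([[0.3, 0.7]], dtype='float32')  # soft target, not one-hot
y_pred = np.array([[0.4, 0.6]], dtype='float32')

# per-sample loss: -sum(y_true * log(y_pred))
print(tf.keras.losses.categorical_crossentropy(y_true, y_pred).numpy())

# accuracy: does argmax(y_true) match argmax(y_pred)? Here both are index 1, so it prints 1.0
acc = tf.keras.metrics.CategoricalAccuracy()
acc.update_state(y_true, y_pred)
print(acc.result().numpy())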