I followed a tutorial on YouTube and I accidentally didn't add model.add(Dense(6, activation='relu')) in Keras, and I got 36% accuracy. After I added this line, accuracy rose to 86%. Why did this happen?
This is the code:
from sklearn.model_selection import train_test_split
import keras
from keras.models import Sequential
from keras.layers import Dense
import numpy as np
np.random.seed(3)
classifications = 3
dataset = np.loadtxt('wine.csv', delimiter=",")
X = dataset[:,1:14]
Y = dataset[:,0:1]
x_train, x_test, y_train, y_test = train_test_split(X, Y, test_size=0.66, random_state=5)
y_train = keras.utils.to_categorical(y_train-1, classifications)
y_test = keras.utils.to_categorical(y_test-1, classifications)
model = Sequential()
model.add(Dense(10, input_dim=13, activation='relu'))
model.add(Dense(8, activation='relu'))
model.add(Dense(6, activation='relu')) # This is the code I missed
model.add(Dense(6, activation='relu'))
model.add(Dense(4, activation='relu'))
model.add(Dense(2, activation='relu'))
model.add(Dense(classifications, activation='softmax'))
model.compile(loss="categorical_crossentropy", optimizer="adam", metrics=['accuracy'])
model.fit(x_train, y_train, batch_size=15, epochs=2500, validation_data=(x_test, y_test))
The number of layers is a hyperparameter, just like the learning rate and the number of neurons. These play an important role in determining the accuracy. So in your case,
model.add(Dense(6, activation='relu'))
is the layer that played the key role. We can't tell exactly what each individual layer is doing internally; the best we can do is hyperparameter tuning to find the best combination of hyperparameters.
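For example, a minimal manual search over the number of hidden layers could look like this (a sketch reusing the names from the question; build_model and the candidate layer stacks are illustrative, not from the original code):
# try a few depths and keep the one with the best validation accuracy
def build_model(hidden_units):
    model = Sequential()
    model.add(Dense(10, input_dim=13, activation='relu'))
    for units in hidden_units:
        model.add(Dense(units, activation='relu'))
    model.add(Dense(classifications, activation='softmax'))
    model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
    return model

for hidden_units in [(8, 6, 4, 2), (8, 6, 6, 4, 2)]:
    model = build_model(hidden_units)
    history = model.fit(x_train, y_train, batch_size=15, epochs=500,
                        validation_data=(x_test, y_test), verbose=0)
    # the history key may be 'val_accuracy' in newer Keras versions
    print(hidden_units, max(history.history['val_acc']))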
In my opinion, it may be the ratio of your training set to your test set. With test_size=0.66, only 34% of the data is used for training, so the model is likely underfitting; with so little training data, one Dense layer more or less produces a large swing in accuracy. Set test_size=0.2 and check whether the missing layer still changes the accuracy that much.
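Concretely, the suggested change to the split (everything else unchanged):
# use 20% of the data for testing, leaving 80% for training
x_train, x_test, y_train, y_test = train_test_split(X, Y, test_size=0.2, random_state=5)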
I have a dataset on which I train a DNN model.
My dataset contains 398 samples and 330 features; I reduced the features to 39 with ExtraTreesClassifier(). This is my model:
X_train, X_test, y_train, y_test = train_test_split(xfinal, val_y, test_size = 0.2, random_state = 0)
model=Sequential()
model.add(Dense(units=20, kernel_initializer='uniform', activation='relu',input_dim=nb_features))
model.add(Dense(units=20, kernel_initializer='uniform', activation='relu'))
model.add(Dense(units=10, kernel_initializer='uniform', activation='relu'))
model.add(Dense(units=5, kernel_initializer='uniform', activation='relu'))
model.add(Dense(units=1,kernel_initializer='uniform',activation='sigmoid'))
model.compile(optimizer='adam',loss='binary_crossentropy',metrics=['accuracy'])
history = model.fit(X_train,y_train,validation_data=(X_test,y_test),batch_size=32,epochs=250)
I tried Dropout, but my model is still overfitting:
Any solution for my model?
You can add a Dropout layer between Dense layers, like below:
model.add(Dropout(0.2))
Also you can remove one or more hidden layers from your architecture.
One more thing: you can use the EarlyStopping callback to stop training at the right epoch.
Your final model architecture could look like this:
from keras.callbacks import EarlyStopping
from keras.layers import Dropout

callbacks = [EarlyStopping(monitor='val_loss', patience=5)]
model = Sequential()
model.add(Dense(units=20, kernel_initializer='uniform', activation='relu', input_dim=nb_features))
model.add(Dropout(0.2))
model.add(Dense(units=5, kernel_initializer='uniform', activation='relu'))
model.add(Dense(units=1, kernel_initializer='uniform', activation='sigmoid'))
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
history = model.fit(X_train, y_train, validation_data=(X_test, y_test), batch_size=32, epochs=250, callbacks=callbacks)
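Depending on your Keras version, EarlyStopping can also roll the model back to its best epoch (an optional variation; restore_best_weights requires a reasonably recent Keras):
callbacks = [EarlyStopping(monitor='val_loss', patience=5, restore_best_weights=True)]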
I am trying to approximate a function with a Keras model that has only one hidden layer, and whatever I do I can't reach the necessary result.
I'm trying to do it with the following code:
from __future__ import print_function
from keras.models import Sequential
from keras.layers import Dense
from keras.optimizers import Adam
from LABS.ZeroLab import E_Function as dataset5
train_size = 2000
# 2 model and data initializing
(x_train, y_train), (x_test, y_test) = dataset5.load_data(train_size=train_size, show=True)
model = Sequential()
model.add(Dense(50, kernel_initializer='he_uniform', bias_initializer='he_uniform', activation='sigmoid'))
model.add(Dense(1, kernel_initializer='he_uniform', bias_initializer='he_uniform', activation='linear'))
model.compile(optimizer=Adam(), loss='mae', metrics=['mae'])
history = model.fit(x=x_train, y=y_train, batch_size=20, epochs=10000, validation_data=(x_test, y_test), verbose=1)
[Plot: the function loaded from dataset5]
[Plot: comparison of the model's predictions with the test data]
I tried to fit this network with different optimizers and numbers of neurons (from 50 to 300), but the result was the same.
What should I change?
I found the solution! The main issue was the training data: I forgot to shuffle x_train and y_train before fitting.
I then approximated the function successfully with 2 hidden layers, but I still can't approximate it with 1 hidden layer.
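For reference, a minimal way to shuffle the training pairs in unison (a sketch, assuming x_train and y_train are NumPy arrays):
import numpy as np

# draw one random permutation and apply it to both arrays so pairs stay aligned
perm = np.random.permutation(len(x_train))
x_train, y_train = x_train[perm], y_train[perm]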
I am trying text classification using the bag-of-words model. Everything works fine up to testing and evaluating accuracy on the test set, but how can I check the class of a single statement?
I have a data frame with 2 columns: label and body.
cout_vect = CountVectorizer()
final_count = cout_vect.fit_transform(df['body'].values.astype('U'))
from sklearn.model_selection import train_test_split
from keras.models import Sequential
from keras.layers import Dense
from keras.wrappers.scikit_learn import KerasClassifier
from keras.utils import np_utils
X_train, X_test, y_train, y_test = train_test_split(final_count, df['label'], test_size = .3, random_state=25)
model = Sequential()
model.add(Dense(264, input_dim=X_train.shape[1], activation='relu'))
model.add(Dense(128, activation='relu'))
model.add(Dense(64, activation='relu'))
model.add(Dense(32, activation='relu'))
model.add(Dense(16, activation='relu'))
model.add(Dense(8, activation='relu'))
model.add(Dense(3, activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
y_train = np_utils.to_categorical(y_train, num_classes=3)
y_test = np_utils.to_categorical(y_test, num_classes=3)
model.fit(X_train, y_train, epochs=50, batch_size=32)
model.evaluate(x=X_test, y=y_test, batch_size=None, verbose=1, sample_weight=None)
Now I want to predict the class of this statement using my model. How do I do this?
I tried converting my statement to a vector using the count vectorizer, but with the bag-of-words approach it comes out as just an 8-dimensional vector:
x = "Your account balance has been deducted for 4300"
model.predict(x, batch_size=None, verbose=0, steps=None)
You need to do this:
# First transform the sentence to bag-of-words according to the already learnt vocabulary
x = cout_vect.transform([x])
# Then send the feature vector to the predict
print(model.predict(x, batch_size=None, verbose=0, steps=None))
You have not shown how you "tried converting my statement to vector using the count vectorizer", but I'm guessing you did this:
cout_vect.fit_transform([x])
If you call fit() (or fit_transform()), the vectorizer forgets all of its previous training and only remembers the current vocabulary; hence you got a feature vector of size 8, whereas your previous vectors had a much higher dimension.
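To turn the softmax output into a class label, you can take the argmax over the three probabilities (a sketch; mapping the index back to your original label names depends on how the labels were encoded, and if your Keras version rejects the sparse matrix you may need .toarray() first):
import numpy as np

x_vec = cout_vect.transform(["Your account balance has been deducted for 4300"])
probs = model.predict(x_vec)                 # shape (1, 3): softmax probabilities
predicted_class = int(np.argmax(probs, axis=1)[0])
print(predicted_class)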
This is a regression problem. Below is my code
import os
import numpy as np
import pandas as pd
from keras.models import Sequential
from keras.layers import Dense
from keras.wrappers.scikit_learn import KerasRegressor
from sklearn.model_selection import cross_val_score, KFold
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import Pipeline
os.chdir(r'C:\Users\Swapnil\Desktop\RP TD\first\Changes')
## Load the dataset
dataset1 = pd.read_csv("Main Lane Plaza 1.csv")
X_train = dataset1.iloc[:,0:11].values
Y_train = dataset1.iloc[:,11].values
dataset2 = pd.read_csv("Main Lane Plaza 1_070416010117.csv")
X_test = dataset2.iloc[:,0:11].values
Y_test = dataset2.iloc[:,11].values
## Define base model
def base_model():
    model = Sequential()
    model.add(Dense(11, input_dim=11, kernel_initializer='normal', activation='sigmoid'))
    model.add(Dense(7, kernel_initializer='normal', activation='sigmoid'))
    model.add(Dense(1, kernel_initializer='normal'))
    model.compile(loss='mean_squared_error', optimizer='adam')
    return model
seed = 7
np.random.seed(seed)
clf = KerasRegressor(build_fn=base_model, epochs=100, batch_size=5, verbose=0)
clf.fit(X_train, Y_train)
res = clf.predict(X_train)
##Result
clf.score(X_test, Y_test)
Not sure if the score should be negative?
Kindly advise if I am doing something wrong. Thanks in advance.
I am not able to figure it out. Could this be a problem due to feature scaling, since I did the feature scaling in R and saved the CSV files to use in Python?
When you get a negative score for a regression problem, it usually means that the model you chose can't fit your data well.
Your first and second hidden layers both use sigmoid activations, followed by a single-unit output layer.
Change the activations to relu: sigmoid squashes values between 0 and 1, making the numbers very small and causing a vanishing-gradient problem across the two hidden layers.
def base_model():
model = Sequential()
model.add(Dense(11, input_dim=11, kernel_initializer='normal', activation='relu'))
model.add(Dense(7, kernel_initializer='normal', activation='relu'))
model.add(Dense(1, kernel_initializer='normal'))
model.compile(loss='mean_squared_error', optimizer='adam')
return model
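It may also help to scale the features in Python itself rather than relying on the preprocessing done in R; since the code already imports StandardScaler, a minimal sketch:
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)  # fit the scaler on training data only
X_test = scaler.transform(X_test)        # reuse the training statistics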
I am receiving the following error when I fit the network - ValueError: Error when checking target: expected dense_6 to have shape (2,) but got array with shape (22,)
As far as I can tell the shape should be correct given how the dataset is split? Any help is greatly appreciated, thanks!
The dataset can be found here: https://archive.ics.uci.edu/ml/machine-learning-databases/mushroom/agaricus-lepiota.data
from keras.layers import Dense
from keras.models import Sequential
import keras.utils
from sklearn import preprocessing
from sklearn.model_selection import train_test_split
import numpy as np
import pandas as pd
# seed weights
np.random.seed(3)
# import dataset
data = pd.read_csv('agaricus-lepiota.csv', delimiter=',')
# encode labels as integers so they can be one-hot-encoded, which requires an int matrix
le = preprocessing.LabelEncoder()
data = data.apply(le.fit_transform)
# one-hot-encode string data (now type int)
ohe = preprocessing.OneHotEncoder(sparse=False)
data = ohe.fit_transform(data)
X = data[:, 1:23]
Y = data[:, 0:1]
# split into test and train set
x_train, y_train, x_test, y_test = train_test_split(X, Y, test_size=.2, random_state=5)
# create model
model = Sequential()
model.add(Dense(500, input_dim=22, activation='relu'))
model.add(Dense(300, activation='relu'))
model.add(Dense(100, activation='relu'))
model.add(Dense(50, activation='relu'))
model.add(Dense(25, activation='relu'))
model.add(Dense(2, activation='sigmoid'))
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
model.fit(x_train, y_train, validation_data=(x_test, y_test), epochs=1000, batch_size=25)
I found 2 errors in your code.
1)
x_train, y_train, x_test, y_test = train_test_split(X, Y, test_size=.2, random_state=5)
must be
x_train, x_test, y_train, y_test = train_test_split(X, Y, test_size=.2, random_state=5)
Check the documentation of train_test_split to learn more about the function's return order.
2)
You have only one column in y_train, but the last layer in your model outputs two columns. So instead of
model = Sequential()
model.add(Dense(500, input_dim=22, activation='relu'))
model.add(Dense(300, activation='relu'))
model.add(Dense(100, activation='relu'))
model.add(Dense(50, activation='relu'))
model.add(Dense(25, activation='relu'))
model.add(Dense(2, activation='sigmoid'))
use this:
model = Sequential()
model.add(Dense(500, input_dim=22, activation='relu'))
model.add(Dense(300, activation='relu'))
model.add(Dense(100, activation='relu'))
model.add(Dense(50, activation='relu'))
model.add(Dense(25, activation='relu'))
model.add(Dense(1, activation='sigmoid'))
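Alternatively, if you want to keep two output units, you could one-hot encode the labels instead and use a softmax output (a sketch, not part of the original answer):
from keras.utils import to_categorical

# keep Dense(2, activation='softmax') as the last layer
y_train = to_categorical(y_train, num_classes=2)
y_test = to_categorical(y_test, num_classes=2)
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])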