Can I predict a continuous target value in Keras? - python

This is an example of the data that I have.
The length of df is 1778360.
The search term is the query that people type into the search engine.
CR (Conversion Rate) is a continuous number. It starts at 0 and has no upper limit.
   Search term                        CR
0  asos french connection lined mac  100
1  hugo boss polo black               50
2  women's pale grey trousers uk      47
3  military jacket                     8
4  girls adidas red tracksuit top      0
What I want is to predict the CR with the text as the input.
texts = df['Search term']
tags = df['CR']
num_max = 1000
# preprocess
le = LabelEncoder()
tags = le.fit_transform(tags)
token = Tokenizer(num_words=num_max)
token.fit_on_texts(texts)
mat_texts = token.texts_to_matrix(texts, mode='freq')
print(tags[:5])
print(mat_texts[:5])
print(tags.shape, mat_texts.shape)
# split data to train and test
X_train, X_test, y_train, y_test = train_test_split(mat_texts, tags, train_size=0.8, random_state=1)
# create model
model = Sequential()
model.add(Dense(512, input_dim=num_max, kernel_initializer='normal', activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(100, kernel_initializer='normal', activation='softmax'))
# compile model
model.compile(loss='sparse_categorical_crossentropy', optimizer='sgd', metrics=['accuracy'])
# fit the model
model.fit(X_train, y_train, epochs=10, batch_size=32, verbose=0, validation_data=(X_test, y_test))
# evaluate the model
train_scores = model.evaluate(X_train, y_train, verbose=0)
print("Train %s: %.2f%%" % (model.metrics_names[1], train_scores[1]*100))
test_scores = model.evaluate(X_test, y_test, verbose=0)
print("Test %s: %.2f%%" % (model.metrics_names[1], test_scores[1]*100))
I got this as result:
Train acc: 82.53%
Test acc: 82.48%
I'm not sure if the last dense layer and the loss function are correct. This is more like a linear regression, but I couldn't find a suitable Keras model for linear regression.
Can somebody help, please? Thanks.
P.S. I'm very new to deep learning and neural networks.

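For a regression problem, the last dense layer should have a single unit with 'linear' activation (use 'sigmoid' only if the target is scaled to [0, 1], which is not the case for an unbounded CR), and the loss should be 'mean_squared_error'. Feed the raw CR values as the target instead of passing them through LabelEncoder, and report an error metric such as 'mae' in place of 'accuracy'.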
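A minimal sketch of a regression head along those lines, assuming TensorFlow's bundled Keras is installed. The random arrays are placeholders standing in for the bag-of-words matrix and the CR column (they are not the asker's data), and the hidden layer size is copied from the question. The target stays a float, and the output is one linear unit trained with mean squared error.

```python
import numpy as np
from tensorflow import keras

num_max = 1000  # vocabulary size, as in the question

# Placeholder data (an assumption for illustration): 256 rows of
# bag-of-words frequencies and a non-negative continuous target.
rng = np.random.default_rng(0)
X = rng.random((256, num_max)).astype("float32")
y = rng.uniform(0.0, 100.0, size=(256,)).astype("float32")

model = keras.Sequential([
    keras.Input(shape=(num_max,)),
    keras.layers.Dense(512, activation="relu"),
    keras.layers.Dropout(0.5),
    # One unit with linear activation: the output is an unbounded real number.
    keras.layers.Dense(1, activation="linear"),
])

# Regression loss; track mean absolute error as a readable metric.
model.compile(loss="mean_squared_error", optimizer="adam", metrics=["mae"])
model.fit(X, y, epochs=2, batch_size=32, verbose=0)

preds = model.predict(X[:5], verbose=0)
print(preds.shape)  # one predicted CR per input row
```

The reported MAE is in the same units as CR, which makes it easier to judge than the raw MSE loss.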

Related

Keras Sequential Neural Network : Bounce in Train and Test loss on learning curve

I'm working on the Online-News-Popularity dataset. I'm trying to predict the number of shares with a Keras sequential network.
However, a weird bounce in MAE for train and test shows up on my learning curve and I don't know how to interpret it. Do you have any idea what it means? (Link below)
Learning curve of my model based on mae
Here is my code for the model :
x = m_news.loc[:, m_news.columns != "shares"]
y = np.ravel(m_news.loc[:, m_news.columns == "shares"])
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.33, random_state=42) #data is already scaled
model = Sequential()
model.add(Dense(24, activation='relu', input_shape=(55,)))
model.add(Dense(32, activation='relu'))
model.add(Dense(1))
model.compile(loss='mse',
              metrics=['mae', 'mse'],
              optimizer=optimizers.RMSprop(0.001))
history = model.fit(x_train, y_train, validation_data=(x_test, y_test), epochs=15, batch_size=200)
plt.plot(history.history['mae'], label='train')
plt.plot(history.history['val_mae'], label='test')
_=plt.legend()
plt.xlabel("no epochs")
plt.ylabel("value loss")
Thanks for reading and trying to help me :)

Deep learning accuracy changes

Every time I change the dataset, it gives a different accuracy. Sometimes it gives 97%, sometimes 50%, and sometimes 92%. It is a text classification task. Why does this happen? The other 95% comes from 2 datasets that are the same size and give almost the same result.
#Split DatA
X_train, X_test, label_train, label_test = train_test_split(X, Y, test_size=0.2,random_state=42)
#Size of train and test data:
print("Training:", len(X_train), len(label_train))
print("Testing: ", len(X_test), len(label_test))
#Function defined to test the models in the test set
def test_model(model, epoch_stop):
    model.fit(X_test
              , Y_test
              , epochs=epoch_stop
              , batch_size=batch_size
              , verbose=0)
    results = model.evaluate(X_test, Y_test)
    return results
#############3
maxlen = 300
#Bidirectional LSTM model
embedding_dim = 100
dropout = 0.5
opt = 'adam'
####################
#embed_dim = 128 #dimension of the word embedding vector for each word in a sequence
lstm_out = 196 #no of lstm layers
lstm_model = Sequential()
#Adding dropout
#lstm_model.add(LSTM(lstm_out, dropout=0.2, recurrent_dropout=0.2))##############################
lstm_model = Sequential()
lstm_model.add(layers.Embedding(input_dim=num_words,
                                output_dim=embedding_dim,
                                input_length=X_train.shape[1]))
#lstm_model.add(Bidirectional(LSTM(lstm_out, return_sequences=True, dropout=0.2, recurrent_dropout=0.2)))
#lstm_model.add(Bidirectional(LSTM(lstm_out, dropout=0.2, recurrent_dropout=0.2)))
#lstm_model.add(Bidirectional(LSTM(64, return_sequences=True)))
lstm_model.add(Bidirectional(LSTM(64, return_sequences=True)))
lstm_model.add(layers.GlobalMaxPool1D())
#Adding a regularized dense layer
lstm_model.add(layers.Dense(32,kernel_regularizer=regularizers.l2(0.001),activation='relu'))
lstm_model.add(layers.Dropout(0.25))
lstm_model.add(Dense(3,activation='softmax'))
lstm_model.compile(loss = 'categorical_crossentropy', optimizer='adam',metrics = ['accuracy'])
print(lstm_model.summary())
#TRANING
history = lstm_model.fit(X_train, label_train,
                         epochs=4,
                         verbose=True,
                         validation_data=(X_test, label_test),
                         batch_size=64)
loss, accuracy = lstm_model.evaluate(X_train, label_train, verbose=True)
print("Training Accuracy: {:.4f}".format(accuracy))
loss_val, accuracy_val = lstm_model.evaluate(X_test, label_test, verbose=True)
print("Testing Accuracy: {:.4f}".format(accuracy_val))
An ML model bases its predictions on the data it was trained on, so it is only natural that the outcome differs when the training data changes. A different dataset may also perform better with different hyperparameters.

Are these 2 keras deep learning code the same for multiple outputs?

I've a problem involving airfoil velocity and pressure prediction, given the AOA,x,y. I'm using keras with MLP. I have 3 inputs (AOA,x,y) and I have to predict 3 outputs (u,v,p). I initially have a code which outputs the MSE loss as a single value. However, I modified the code so that I have MSE for each output. However, I don't get the avg MSE of the 3 outputs (u_mean_squared_error: 73.63%,v_mean_squared_error: 1.13%,p_mean_squared_error: 2.16%) equal to the earlier single MSE loss (mean_squared_error: 5.81%). Hence, I'm wondering if my new code is wrong. Or whether I'm doing it the right way. Can someone help?
Old code:
# load pima indians dataset
dataset = numpy.loadtxt("S1020_data.csv", delimiter=",")
# split into input and output variables
X = dataset[:,0:3]
Y = dataset[:,3:6]
# split into 67% for train and 33% for test
X_train, X_test, y_train, y_test = train_test_split(X, Y, test_size=0.33, random_state=seed)
# create model
input_data = layers.Input(shape=(3,))
#create the layers and pass them the input tensor to get the output tensor:
hidden1Out = Dense(units=12, activation='relu')(input_data)
hidden2Out = Dense(units=8, activation='relu')(hidden1Out)
finalOut = Dense(units=3, activation='relu')(hidden2Out)
#define the model's start and end points
model = Model(input_data, finalOut)
# Compile model
model.compile(loss='mean_squared_error', optimizer='adam', metrics=['mean_squared_error'])
# Fit the model
model.fit(X_train, y_train, validation_data=(X_test,y_test), epochs=10, batch_size=1000)
# evaluate the model
scores = model.evaluate(X, Y)
print("\n%s: %.2f%%" % (model.metrics_names[1], scores[1]*100))
New code:
# load pima indians dataset
dataset = numpy.loadtxt("S1020_data.csv", delimiter=",")
# split into input and output variables
X = dataset[:,0:3]
Y = dataset[:,3:6]
# split into 67% for train and 33% for test
X_train, X_test, y_train, y_test = train_test_split(X, Y, test_size=0.33, random_state=seed)
# create model
input_data = layers.Input(shape=(3,))
#create the layers and pass them the input tensor to get the output tensor:
hidden1Out = Dense(units=12, activation='relu')(input_data)
hidden2Out = Dense(units=8, activation='relu')(hidden1Out)
u_out = Dense(1, activation='relu', name='u')(hidden2Out)
v_out = Dense(1, activation='relu', name='v')(hidden2Out)
p_out = Dense(1, activation='relu', name='p')(hidden2Out)
#define the model's start and end points
model = Model(input_data,outputs = [u_out, v_out, p_out])
# Compile model
model.compile(loss='mean_squared_error', optimizer='adam', metrics=['mean_squared_error'])
# Fit the model
model.fit(X_train, [y_train[:,0], y_train[:,1], y_train[:,2]], validation_data=(X_test,[y_test[:,0], y_test[:,1], y_test[:,2]]), epochs=10, batch_size=1000)
# evaluate the model
scores = model.evaluate(X, [Y[:,0], Y[:,1], Y[:,2]])
for i in range(7):
    print("\n%s: %.2f%%" % (model.metrics_names[i], scores[i]*100))
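I think the difference comes from the optimization objective.
In your old code, the per-sample objective was the mean of the squared errors over the three outputs:
( (u_true - u_pred)^2 + (v_true - v_pred)^2 + (p_true - p_pred)^2 ) / 3
which is the mean squared error of the [u_pred,v_pred,p_pred] vector with respect to its target.
But in the new one, each output gets its own MSE loss and Keras adds them up, so the objective became:
(u_true - u_pred)^2 + (v_true - v_pred)^2 + (p_true - p_pred)^2
which is three times the previous one for identical predictions. With rescaled gradients, separate output layers, and a different random initialization, the two runs end up as different trained models, so the average of the three per-output MSEs need not match the earlier single MSE.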
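A quick NumPy check of how the per-output numbers relate to the single one, using made-up targets and predictions (the arrays are assumptions for illustration, not the asker's data). It computes the MSE over all three columns at once, as Keras does for a single three-unit output, and the MSE of each column separately, as it does for three named outputs.

```python
import numpy as np

rng = np.random.default_rng(0)
y_true = rng.normal(size=(1000, 3))   # columns play the role of u, v, p
y_pred = rng.normal(size=(1000, 3))   # predictions from one hypothetical model

sq_err = (y_true - y_pred) ** 2

joint_mse = sq_err.mean()             # single 3-unit output: mean over rows and columns
per_output_mse = sq_err.mean(axis=0)  # three 1-unit outputs: one MSE per column

print(joint_mse, per_output_mse)
# For the same predictions, the joint MSE is the average of the per-output MSEs,
# and the summed multi-output loss is three times the joint one.
print(np.isclose(per_output_mse.mean(), joint_mse))
print(np.isclose(per_output_mse.sum(), 3 * joint_mse))
```

If the averaged metrics from two runs disagree, that suggests the two networks learned different weights; the metric bookkeeping itself is consistent.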

Adding prior belief into a neural Network

I am busy with a classification problem with three classes. One of the classes is never predicted. I would like to know if there is any way to inject a prior belief into my neural network, by design or not.
My football prediction model predicts [Draw, Home Win, Away Win]. My classes are pretty balanced (40%, 30%, 30%). The class [Draw], which accounts for 40% of the data, is the one that my NN never predicts. My dataset contains 1900 samples.
I am using a deep NN with 2 to 4 hidden layers.
My code of my best model(based on training/val loss) is as follows:
X_all = df.copy()
train_cols = ['a_line0','a_line1','a_line2','a_line3','a_line4','a_line5',
              'a_line6','a_line7','a_line8','a_line9','a_line10','h_line0',
              'h_line1','h_line2','h_line3','h_line4','h_line5','h_line6',
              'h_line7','h_line8','h_line9','h_line10','odds0','odds1','odds2']
x = X_all[train_cols]
x_v = x.values #returns a numpy array
min_max_scaler = preprocessing.MinMaxScaler()
x_scaled = min_max_scaler.fit_transform(x_v)
x = pd.DataFrame(x_scaled)
y = X_all['result']
ohe = OneHotEncoder(n_values=3,categories='auto')
y = ohe.fit_transform(y.reshape(-1,1))
X_train, X_test, y_train, y_test = train_test_split(x, y, test_size=0.3, random_state=0)
for lr,ep in [(0.001,300)]:
    model = Sequential()
    model.add(Dense(25, input_dim=25, activation='relu'))
    model.add(Dense(36, activation='relu'))
    model.add(Dropout(0.2))
    model.add(Dense(12, activation='relu'))
    model.add(Dense(3, activation='sigmoid'))
    adam = kr.optimizers.Adam(lr=lr, decay=1e-6)
    model.compile(loss='categorical_crossentropy', optimizer=adam, metrics=['accuracy'])
    model.fit(X_train, y_train, epochs=ep, batch_size=10,verbose = 0)
    _, accuracy = model.evaluate(X_test, y_test)
    _, accuracy1 = model.evaluate(X_train, y_train)
    print('Testing Accuracy: %.2f' % (accuracy*100),'Train Accuracy: %.2f' % (accuracy1*100), 'learning rate : ', lr)
I apologise if the code is a bit messy.
My model also overfits by +- 16% (52% vs 68%) on this config of my network.
Since you are in a multi-class single-label setting (i.e. your labels are mutually exclusive), you should not use sigmoid as activation in your final layer; change it to
model.add(Dense(3, activation='softmax'))
Also, dropout should not be used by default; remove it for starters, and only add it if it improves the result.
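To see why softmax fits mutually exclusive classes, here is a small NumPy sketch with made-up scores (the logits are an assumption for illustration). Sigmoid squashes each score independently, so the three outputs do not form a probability distribution, while softmax normalises them to sum to 1, which is the form categorical_crossentropy is designed for.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def softmax(z):
    e = np.exp(z - z.max())  # subtract the max for numerical stability
    return e / e.sum()

# Hypothetical final-layer scores for [Draw, Home Win, Away Win].
logits = np.array([0.2, 1.5, 1.1])

p_sigmoid = sigmoid(logits)
p_softmax = softmax(logits)

print(p_sigmoid, p_sigmoid.sum())  # independent per-class values; the sum is not 1
print(p_softmax, p_softmax.sum())  # a distribution over the three classes
```

Both activations rank the classes the same way here; the difference is that only the softmax outputs compete with each other for probability mass.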

Keras give less accuracy than any classifier

I use Python for multi-class text classification. My data set contains 25000 Arabic tweets divided into 10 classes [sport, politics, ....]
When I use
training = pd.read_csv('E:\cluster data\One_File_nonnormalizenew2norm.txt', sep="*")
training.dropna(inplace=True)
training.columns = ["text", "class1"]
training['class1'] = training.class1.astype('category').cat.codes
training.dropna(inplace=True)
# create our training data from the tweets
text = training['text']
y = (training['class1'])
from sklearn.model_selection import train_test_split
sentences_train, sentences_test, y_train, y_test = train_test_split(text, y, test_size=0.25, random_state=1000)
from sklearn.feature_extraction.text import CountVectorizer
vectorizer = CountVectorizer()
vectorizer.fit(sentences_train)
X_train = vectorizer.transform(sentences_train)
X_test = vectorizer.transform(sentences_test)
X_train
from sklearn.linear_model import LogisticRegression
classifier = LogisticRegression()
classifier.fit(X_train, y_train)
score = classifier.score(X_test, y_test)
print("Accuracy:", score)
Accuracy: 0.9525099601593625
When I use keras:
model = Sequential()
max_words=5000
model.add(Dense(512, input_shape=(input_dim,), activation='softmax'))
model.add(Dropout(0.5))
model.add(Dense(256, activation='softmax'))
model.add(Dropout(0.5))
model.add(Dense(1,activation='softmax'))
model.add(Dense(10))
model.summary()
model.compile(loss='sparse_categorical_crossentropy',
              optimizer='adam',
              metrics=['accuracy'])
model.fit(X_train, y_train, batch_size=150, epochs=5, verbose=1, validation_split=0.3,shuffle=True)
predicted = model.predict(X_test)
predicted = np.argmax(predicted, axis=1)
accuracy_score(y_test, predicted)
0.28127490039840636
Where is the mistake?
Update
I changed the code to:
model = Sequential()
max_words=5000
model.add(Dense(512, input_shape=(input_dim,)))
model.add(Dropout(0.5))
model.add(Dense(256))
model.add(Dropout(0.5))
#model.add(Dense(1,activation='sigmoid'))####
model.add(Dense(10))
model.summary()
model.compile(loss='sparse_categorical_crossentropy',
              optimizer='adam',
              metrics=['accuracy'])
model.fit(X_train, y_train,batch_size=150,epochs=10,verbose=1,validation_split=0.3,shuffle=True)
predicted = model.predict(X_test)
predicted = np.argmax(predicted, axis=1)
accuracy_score(y_test, predicted)
0.7201593625498008
still bad accuracy!!!
Some ideas.
Remove the softmax activations from the hidden layers (as @Matias said) and use a non-linearity such as relu there; softmax belongs only on the final Dense(10) layer.
Remove the model.add(Dense(1,activation='softmax')); it's probably destroying your results.
Train for more than 5 epochs.
You are not using the same tweets for validation in the two approaches.
You should probably report the accuracy on both the training and the testing datasets to be sure what is going on.
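The point about Dense(1, activation='softmax') can be checked by hand: softmax normalises across the units of a layer, so with a single unit it always returns 1, whatever the input. A NumPy sketch with made-up values (an assumption for illustration):

```python
import numpy as np

def softmax(z):
    # Normalise across the last axis, i.e. across the units of the layer.
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

# Three very different pre-activations arriving at a one-unit layer,
# one row per sample.
single_unit = np.array([[-4.0], [0.3], [12.0]])

out = softmax(single_unit)
print(out)  # every row collapses to the same value
```

Every sample therefore reaches the final Dense(10) layer as the same constant, so the network cannot tell the tweets apart.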
