I have a table with 1799 users (rows) and 31 features (columns). The last column is a binary condition feature that tells the model which condition each user belongs to. I understood that to use an LSTM I need to make my input 3-d, so I used Reshape((31, 1)) since I don't have time-series data. I also understood that input_shape takes the number of features.

My issue is that I want to predict on a new set of users who have the same 30 features and get a classification result telling me which user belongs to which condition, ideally with the probability of each predicted condition. So I tried to use model.predict for this. It gave me a numpy array predict_prob with shape (200, 31, 1).

I am confused: the input data structure should be 200 samples of (31 x 1), and the output should be the users' conditions with shape (200,). Why is the result 3-d, and how should I convert it to dataframe format so that I can save it as a .csv? Thank you in advance.
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Reshape, Bidirectional, LSTM, Dropout, Dense

X = raw_data[feature_names]
P = predict_data_raw[feature_names]
P1 = predict_data_raw[feature_names1]

# Training
y = raw_data['Conditions']
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=22, test_size=0.1)
X_test = np.expand_dims(X_test, axis=2)  # (n, 31, 1); the Reshape layer below makes this redundant

# fit and evaluate a model
model = Sequential()
model.add(Reshape((31, 1)))
model.add(Bidirectional(LSTM(10, return_sequences=True), input_shape=(31,)))  # note: input_shape on a non-first layer is ignored
model.add(Dropout(0.5))
model.add(Dense(8, activation='relu'))
model.add(Dense(1, activation='sigmoid'))

# compile the keras model
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])

# fit the keras model on the dataset
history = model.fit(X_train, y_train, epochs=5, batch_size=10)  # renamed from LSTM to avoid shadowing the layer class

# evaluate the keras model
_, accuracy = model.evaluate(X_test, y_test)
print('Accuracy: %.2f' % (accuracy * 100))

predict_prob = model.predict(X_test)
df = pd.DataFrame(predict_prob, columns=["Prediction"])  # fails: predict_prob has shape (200, 31, 1)
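For what it's worth, the 3-d shape is expected with this architecture: return_sequences=True makes the Bidirectional LSTM emit one 20-dim vector per timestep, shape (batch, 31, 20), and the Dense layers are then applied to every timestep, giving (200, 31, 1). A minimal sketch of the change that yields one probability per user (my illustration, not the original code):

model = Sequential()
model.add(Reshape((31, 1), input_shape=(31,)))  # feed 2-d (n, 31) inputs; Reshape adds the channel axis
model.add(Bidirectional(LSTM(10)))  # return_sequences=False: one vector per sample
model.add(Dropout(0.5))
model.add(Dense(8, activation='relu'))
model.add(Dense(1, activation='sigmoid'))  # P(condition = 1); P(condition = 0) is 1 minus this

# predict() now returns shape (n_users, 1); squeeze to 1-d for a DataFrame
predict_prob = model.predict(X_test)
df = pd.DataFrame({'Prediction': predict_prob.squeeze()})
df.to_csv('predictions.csv', index=False)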
I am working on an LSTM project for learning purposes, using time-series data that has 3 columns [current, sma, target], where sma is the simple moving average. I extracted these values from the dataframe like so:
from sklearn.preprocessing import MinMaxScaler
from sklearn.model_selection import train_test_split

data = df[['current', 'sma', 'target']].values

# normalize data to the [0, 1] range
scaler = MinMaxScaler(feature_range=(0,1))
dataset = scaler.fit_transform(data)

# then split inputs from targets
X = dataset[:, :2]
y = dataset[:, 2]

# split into X_train, X_test, y_train, y_test
# (note: train_test_split shuffles by default; for time series you likely want shuffle=False)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)
Everything works fine so far, and I understand it. The uncharted territory for me is converting the X_* and y_* arrays into 3-d arrays to feed the model. I am using a simple model just to make this work; I am not looking for impressive results, this is purely educational.
The model that I will use:
import tensorflow as tf

# timesteps and features must match the 3-d input shape (samples, timesteps, features)
model = tf.keras.Sequential()
model.add(tf.keras.layers.LSTM(128, input_shape=(timesteps, features), return_sequences=True))
model.add(tf.keras.layers.LSTM(64, return_sequences=False))
model.add(tf.keras.layers.Dense(features))
model.compile(loss='mean_squared_error', optimizer='adam')
How to reshape the data to feed it to the model?
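One common way (a sketch of my own, assuming the split preserved row order, e.g. shuffle=False, and that each sample is a window of timesteps consecutive rows predicting the row that follows; make_windows is a hypothetical helper):

import numpy as np

def make_windows(X, y, timesteps):
    # stack overlapping windows of `timesteps` rows into a 3-d array
    Xs, ys = [], []
    for i in range(len(X) - timesteps):
        Xs.append(X[i:i + timesteps])  # shape (timesteps, features)
        ys.append(y[i + timesteps])    # the target right after the window
    return np.array(Xs), np.array(ys)

timesteps = 10
X_train_3d, y_train_w = make_windows(X_train, y_train, timesteps)
X_test_3d, y_test_w = make_windows(X_test, y_test, timesteps)
print(X_train_3d.shape)  # (samples, timesteps, 2) -> features = 2

Note that with a scalar target per window the final layer would be Dense(1) rather than Dense(features).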
I am very new to machine learning and am trying to create a Keras model using data I have collected. It is perfectly uniform and loads in fine.
Here is a sample:
n,d0,d1,d2,d3,d4,d5,d6,d7,d8,output
30,85.1,65.0,32.2,38.2,191.9,72.1,118.2,121.5,110.3,0.0
417,232.8,51.3,39.8,66.0,173.4,246.7,285.4,265.6,217.0,1.0
496,194.2,72.7,214.8,41.6,155.2,195.2,208.3,31.0,15.6,2.0
361,206.1,52.8,63.0,105.1,168.5,156.0,145.7,127.4,70.6,1.0
408,202.5,48.4,47.4,79.1,223.8,236.6,260.3,247.4,206.2,1.0
Here is my Keras code:
import numpy as np
import pandas
from tensorflow import keras
from sklearn.model_selection import train_test_split
data = pandas.read_csv("data.csv")
x = data[[f"d{i}" for i in range(9)]]
y = data[["output"]]
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.1)
model = keras.models.Sequential()
model.add(keras.layers.Dense(12, input_dim=9, activation="relu"))
model.add(keras.layers.Dense(8, activation="relu"))
model.add(keras.layers.Dense(1, activation="softmax"))
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
model.fit(x_train, y_train, epochs=150, batch_size=10)
print(model.predict(np.array([[0, 1, 2, 3, 4, 5, 6, 7, 8]])))
_, acc = model.evaluate(x, y)
print('Accuracy: %.2f' % (acc*100))
I don't see any issues, but I can't get predictions to work. Could someone please help?
I think you need to fix your output and your compile step.
If you're doing regression (i.e. predicting true, continuous values) then you need to change your loss to RMSE or some other continuous loss, and not use softmax classification. I don't think the accuracy metric will work either.
If you're doing classification then you need to change the number of outputs to the number of classes that you have. You also can't use binary cross-entropy, because you have more than two classes, so use CategoricalCrossentropy. Note that, according to the docs, you have to change your current label representation (0, 1, 2, ...) to a one-hot representation. You can easily do this by:
y_one_hot = keras.utils.to_categorical(y)
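Putting that together, a sketch of the classification variant (3 classes in the sample data; layer sizes kept from the question, and y_one_hot would replace y in train_test_split):

y_one_hot = keras.utils.to_categorical(y)  # shape (n, 3)

model = keras.models.Sequential()
model.add(keras.layers.Dense(12, input_dim=9, activation="relu"))
model.add(keras.layers.Dense(8, activation="relu"))
model.add(keras.layers.Dense(3, activation="softmax"))  # one output unit per class
model.compile(loss="categorical_crossentropy", optimizer="adam", metrics=["accuracy"])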
I have a problem involving airfoil velocity and pressure prediction, given AOA, x, and y. I'm using Keras with an MLP. I have 3 inputs (AOA, x, y) and I have to predict 3 outputs (u, v, p). I initially had code that outputs the MSE loss as a single value, and I modified it so that I get an MSE for each output. However, the average of the 3 per-output MSEs (u_mean_squared_error: 73.63%, v_mean_squared_error: 1.13%, p_mean_squared_error: 2.16%) does not equal the earlier single MSE loss (mean_squared_error: 5.81%). Hence, I'm wondering if my new code is wrong, or whether I'm doing it the right way. Can someone help?
Old code:
import numpy
from sklearn.model_selection import train_test_split
from tensorflow.keras import layers
from tensorflow.keras.layers import Dense
from tensorflow.keras.models import Model

seed = 7  # any fixed value, for a reproducible split

# load the airfoil dataset (columns: AOA, x, y, u, v, p)
dataset = numpy.loadtxt("S1020_data.csv", delimiter=",")
# split into input and output variables
X = dataset[:,0:3]
Y = dataset[:,3:6]
# split into 67% for train and 33% for test
X_train, X_test, y_train, y_test = train_test_split(X, Y, test_size=0.33, random_state=seed)
# create model
input_data = layers.Input(shape=(3,))
#create the layers and pass them the input tensor to get the output tensor:
hidden1Out = Dense(units=12, activation='relu')(input_data)
hidden2Out = Dense(units=8, activation='relu')(hidden1Out)
finalOut = Dense(units=3, activation='relu')(hidden2Out)
#define the model's start and end points
model = Model(input_data, finalOut)
# Compile model
model.compile(loss='mean_squared_error', optimizer='adam', metrics=['mean_squared_error'])
# Fit the model
model.fit(X_train, y_train, validation_data=(X_test,y_test), epochs=10, batch_size=1000)
# evaluate the model
scores = model.evaluate(X, Y)
print("\n%s: %.2f%%" % (model.metrics_names[1], scores[1]*100))
New code:
# load the airfoil dataset (as above)
dataset = numpy.loadtxt("S1020_data.csv", delimiter=",")
# split into input and output variables
X = dataset[:,0:3]
Y = dataset[:,3:6]
# split into 67% for train and 33% for test
X_train, X_test, y_train, y_test = train_test_split(X, Y, test_size=0.33, random_state=seed)
# create model
input_data = layers.Input(shape=(3,))
#create the layers and pass them the input tensor to get the output tensor:
hidden1Out = Dense(units=12, activation='relu')(input_data)
hidden2Out = Dense(units=8, activation='relu')(hidden1Out)
u_out = Dense(1, activation='relu', name='u')(hidden2Out)
v_out = Dense(1, activation='relu', name='v')(hidden2Out)
p_out = Dense(1, activation='relu', name='p')(hidden2Out)
#define the model's start and end points
model = Model(input_data,outputs = [u_out, v_out, p_out])
# Compile model
model.compile(loss='mean_squared_error', optimizer='adam', metrics=['mean_squared_error'])
# Fit the model
model.fit(X_train, [y_train[:,0], y_train[:,1], y_train[:,2]], validation_data=(X_test,[y_test[:,0], y_test[:,1], y_test[:,2]]), epochs=10, batch_size=1000)
# evaluate the model
scores = model.evaluate(X, [Y[:,0], Y[:,1], Y[:,2]])
for i in range(7):
    print("\n%s: %.2f%%" % (model.metrics_names[i], scores[i]*100))
I think the difference comes from the optimization objective and how it is reported.
In your old code, the single 3-unit output is scored with one mean squared error, computed per sample as:
((u_true - u_pred)^2 + (v_true - v_pred)^2 + (p_true - p_pred)^2) / 3
i.e. the squared error averaged over the three components of the [u_pred, v_pred, p_pred] vector.
But in the new one, each head gets its own MSE, and the total loss becomes their sum:
(u_true - u_pred)^2 + (v_true - v_pred)^2 + (p_true - p_pred)^2
so Keras now optimizes and reports the sum of the per-output MSEs rather than their average, and each output's error is printed separately. The per-output numbers therefore aren't expected to average to the old single value, especially across separate training runs.
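A quick numeric sketch of that relationship (synthetic numbers, not the airfoil data):

import numpy as np

rng = np.random.default_rng(0)
y_true = rng.normal(size=(1000, 3))
y_pred = rng.normal(size=(1000, 3))

# single 3-unit head: one MSE averaged over every element
joint_mse = np.mean((y_true - y_pred) ** 2)

# three 1-unit heads: one MSE per column; the total loss is their sum
per_output = [np.mean((y_true[:, i] - y_pred[:, i]) ** 2) for i in range(3)]

print(joint_mse)             # equals the mean of the three column MSEs
print(np.mean(per_output))   # same value
print(sum(per_output))       # the multi-output total: three times larger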
I am trying to use GloVe embeddings to train a CNN model based on this article (also an RNN, which has this issue). The dataset is labeled data: text (tweets) with labels (hate, offensive, or neither).
The problem is that the model performs well on the training set but poorly on the validation set.
Here is the model:
kernel_size = 2
filters = 256
pool_size = 2
gru_node = 64

model = Sequential()
model.add(Embedding(len(word_index) + 1,
                    EMBEDDING_DIM,
                    weights=[embedding_matrix],
                    input_length=MAX_SEQUENCE_LENGTH,
                    trainable=True))
model.add(Dropout(0.25))
model.add(Conv1D(filters, kernel_size, activation='relu'))
model.add(MaxPooling1D(pool_size=pool_size))
model.add(Conv1D(filters, kernel_size, activation='softmax'))
model.add(MaxPooling1D(pool_size=pool_size))
model.add(LSTM(gru_node, return_sequences=True, recurrent_dropout=0.2))
model.add(LSTM(gru_node, return_sequences=True, recurrent_dropout=0.2))
model.add(LSTM(gru_node, return_sequences=True, recurrent_dropout=0.2))
model.add(LSTM(gru_node, recurrent_dropout=0.2))
model.add(Dense(1024, activation='relu'))
model.add(Dense(nclasses))
model.add(Activation('softmax'))
model.compile(loss='sparse_categorical_crossentropy',
              optimizer='adam',
              metrics=['accuracy'])
Fitting the model:
X = df.tweet
y = df['classifi'] # classes 0,1,2
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.2, shuffle=False)
X_train_Glove,X_test_Glove, word_index,embeddings_index = loadData_Tokenizer(X_train,X_test)
model_RCNN = Build_Model_RCNN_Text(word_index,embeddings_index, 20)
model_RCNN.fit(X_train_Glove, y_train, validation_data=(X_test_Glove, y_test),
               epochs=15, batch_size=128, verbose=2)
predicted = model_RCNN.predict(X_test_Glove)
predicted = np.argmax(predicted, axis=1)
print(metrics.classification_report(y_test, predicted))
This is what the class distribution looks like (0: hate, 1: offensive, 2: neither): [class distribution plot]

[model summary]

Results: [classification report]

Is this the correct approach, or am I missing something here?
Generally speaking, there are two sides from which you can tackle overfitting:

Improving the data:
- More unique data
- Oversampling (to balance the data)

Limiting the network structure:
- Dropout (you've implemented this)
- Fewer parameters (you might want to benchmark against a much smaller network)
- Regularization (e.g. L1 and L2; see the sketch below)
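As an illustration of the last two points, a slimmed-down variant with L2 weight penalties (a sketch only; the layer sizes and the 1e-4 factor are arbitrary choices, and it reuses the names from your snippet):

from tensorflow.keras.regularizers import l2

model = Sequential()
model.add(Embedding(len(word_index) + 1, EMBEDDING_DIM,
                    weights=[embedding_matrix],
                    input_length=MAX_SEQUENCE_LENGTH,
                    trainable=True))
model.add(Dropout(0.25))
# one small conv block instead of two wide ones
model.add(Conv1D(64, kernel_size, activation='relu',
                 kernel_regularizer=l2(1e-4)))
model.add(MaxPooling1D(pool_size=pool_size))
# a single small recurrent layer instead of four
model.add(LSTM(32, recurrent_dropout=0.2))
model.add(Dense(nclasses, activation='softmax',
                kernel_regularizer=l2(1e-4)))
model.compile(loss='sparse_categorical_crossentropy', optimizer='adam',
              metrics=['accuracy'])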
I'd suggest trying significantly fewer parameters first (because this is quick) and oversampling (because your data seems lopsided).
You can also try hyperparameter tuning: build a number of networks with different parameters, then pick the best one.
Note: if you do hyperparameter tuning, make sure to keep an extra validation set, because you can easily overfit your test set this way.
Side note: sometimes when troubleshooting a NN it is helpful to set the optimizer to basic stochastic gradient descent. It slows training down a bunch but makes the progression much clearer.
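For example (a sketch; the learning rate is just a starting point):

from tensorflow.keras.optimizers import SGD

# plain SGD, no momentum: slower, but the training curve is easier to read
model.compile(loss='sparse_categorical_crossentropy',
              optimizer=SGD(learning_rate=0.01),
              metrics=['accuracy'])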
Good luck!
I am a beginner who is starting to learn to code in Keras with the TensorFlow backend. I am using Python 2.7.
I have a model in Keras, and after training I want to check my weights.
Edited:
import numpy as np
from keras.datasets import mnist
from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, Dropout, Flatten, Dense
from keras.utils import np_utils
from keras.callbacks import EarlyStopping
from keras.optimizers import SGD
from keras import backend as K

# the arrays below are [samples][channels][width][height]
K.set_image_data_format('channels_first')

# fix random seed for reproducibility (split training and validation set)
seed = 7
np.random.seed(seed)

# load data
(X_train, y_train), (X_test, y_test) = mnist.load_data()

# reshape to be [samples][channels][width][height]
X_train = X_train.reshape(X_train.shape[0], 1, 28, 28).astype('float32')
X_test = X_test.reshape(X_test.shape[0], 1, 28, 28).astype('float32')

# normalize inputs from 0-255 to 0-1
X_train = X_train / 255
X_test = X_test / 255

# one hot encode outputs
y_train = np_utils.to_categorical(y_train)
y_test = np_utils.to_categorical(y_test)
num_classes = y_test.shape[1]

def tempsigmoid(x, temp=0.5):
    # sigmoid with a temperature: temp < 1 sharpens the curve
    return K.sigmoid(x / temp)

def baseline_model():
    # create model
    model = Sequential()
    model.add(Conv2D(32, (5, 5), input_shape=(1, 28, 28), activation='relu'))
    model.add(MaxPooling2D(pool_size=(2, 2)))
    model.add(Dropout(0.2))
    model.add(Flatten())
    model.add(Dense(100, activation='relu'))
    model.add(Dense(num_classes, activation=tempsigmoid))
    # Compile model
    model.compile(loss='mae', optimizer=SGD(lr=0.1), metrics=['accuracy'])
    return model

# build the model
model = baseline_model()
earlystopper = EarlyStopping(monitor='val_loss', min_delta=0.1, patience=0, verbose=2, mode='auto')

# Fit the model
model.fit(X_train, y_train, validation_data=(X_test, y_test), epochs=5, batch_size=200, verbose=2, callbacks=[earlystopper])

# Final evaluation of the model
scores = model.evaluate(X_test, y_test, verbose=0)
print("CNN Error: %.2f%%" % (100 - scores[1] * 100))
print(scores)  # [test loss, test accuracy]

weight = model.get_weights()
print(weight)
I get the weights as arrays like in the picture: [weight array screenshot]

How can I save the weight arrays to a CSV file? I tried model.save_weights() and got an output file in h5 format, but when I open it with numpy only a small part of the data is displayed. I am thinking that if I can save it in CSV format I will get the full data. I tried to convert the h5 file to CSV with numpy, as in the picture further down.
# to save weight after output
model.save_weights('Result/w_output.h5')
Tried to display the full array with numpy: [numpy output screenshot]
I succeeded in saving my weight arrays to CSV by using this code:

weight = model.get_weights()
# fmt='%s' writes each layer's array as its string representation, one row per array
np.savetxt('weight.csv', weight, fmt='%s', delimiter=',')
I assume you want every list of numbers in its own row.
If that is the case, you need to reform weight into the following shape: weight = [[data1, data2, data3, ...], [data11, data12, data13, ...], ...] (see functions for reforming a list).
Then you just need to write/append it to a CSV file. You can use the pandas or csv library.
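For example, a minimal sketch with the csv module (my own illustration, assuming model is the trained Keras model from the question; each weight tensor is flattened into one row):

import csv

weights = model.get_weights()  # list of numpy arrays, one per weight tensor

with open('weights.csv', 'w') as f:
    writer = csv.writer(f)
    for layer_weights in weights:
        # flatten each tensor into a single row of numbers
        writer.writerow(layer_weights.flatten().tolist())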