TensorFlow Keras loss is NaN - python

As you can see below, I am trying to create an MLP with TensorFlow/Keras, but unfortunately the loss is always NaN when fitting. Do you have any advice?
As a second error I get the message "'Functional' object has no attribute 'score'" when trying to measure accuracy with model.score, but I think this is triggered by the first problem.
Thanks to all
import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.decomposition import PCA
from mpl_toolkits import mplot3d
from sklearn import datasets
from various import printShapes, printNumpy, print_Model_Accuracy, printLARGE, checkFormat
from sklearn.datasets import make_blobs
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
np.random.seed(1234)
#%matplotlib qt
#%matplotlib inline
plt.rcParams["figure.figsize"] = [4*2, 4*2]
if 0:
    iris = datasets.load_iris()
    X_train, X_test, y_train, y_test = train_test_split(iris.data, iris.target, test_size=0.80, random_state=1234)
if 1:
    X, y = make_blobs(n_features=4, centers=3, n_samples=1000, cluster_std=5.0, random_state=1234)
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.20, random_state=1234)
print ("Target Label Example: y_train[0]:")
print (y_train[0])
print (type(y_train[0]))
printLARGE("MLP classifier TENSORFLOW")
tf.random.set_seed(1234)
Epochs = 10
inputs = keras.Input(shape=(4,), name="digits")
x = layers.Dense(100, activation="tanh", name="dense_1")(inputs)
x = layers.Dense(4, activation="tanh", name="dense_2")(x)
outputs = layers.Dense(3, activation="softmax", name="predictions")(x)
model = keras.Model(inputs=inputs, outputs=outputs)
model.compile(
    optimizer=keras.optimizers.RMSprop(),  # Optimizer
    loss=keras.losses.SparseCategoricalCrossentropy(),  # Loss function to minimize
    metrics=[keras.metrics.SparseCategoricalAccuracy()],  # List of metrics to monitor
)
printShapes(X_train, "X_train", y_train, "y_train")
# TRAINING
model.fit(X_train, y_train, batch_size=64, epochs=Epochs)
printShapes(X_test, "X_test", y_test, "y_test")
# INFERENCE
y_test_predproba = model.predict(X_test)
print(y_test_predproba)
y_test_pred = np.argmax(y_test_predproba, axis = 1)
print(y_test_pred)
print_Model_Accuracy(model, X_test, y_test, y_test_pred)

Using the tanh activation function in the hidden layers does not make sense here; use ReLU instead.
Using one more hidden layer will work better than using more units in the first layer [for your task].
However, more hidden layers make the model more vulnerable to over-fitting; adding Dropout layers mitigates this.
Finally, your model should be,
inputs = keras.Input(shape=(4,), name="digits")
x = layers.Dense(32, activation="relu", name="dense_1")(inputs)
x = layers.Dropout(0.2)(x)
x = layers.Dense(24, activation="relu", name="dense_2")(x)
x = layers.Dropout(0.2)(x)
x = layers.Dense(16, activation="relu", name="dense_3")(x)
outputs = layers.Dense(3, activation="softmax", name="predictions")(x)
model = keras.Model(inputs=inputs, outputs=outputs)
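On the second error: Keras models have no .score() method (that is a scikit-learn API); use evaluate() instead. A minimal sketch of training and evaluating the revised model, reusing the compile settings from your code:
model.compile(
    optimizer=keras.optimizers.RMSprop(),
    loss=keras.losses.SparseCategoricalCrossentropy(),
    metrics=[keras.metrics.SparseCategoricalAccuracy()],
)
model.fit(X_train, y_train, batch_size=64, epochs=10)
# evaluate() returns [loss, metric] in compile order
loss, acc = model.evaluate(X_test, y_test, verbose=0)
print(f"Test accuracy: {acc:.3f}")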

Related

ValueError: 'logits' and 'labels' must have the same shape

I am working on my first neural network, and I'm stuck on one error. Here is the code:
import pandas as pd
from sklearn.model_selection import train_test_split
df = pd.read_csv('iris.csv')
X = pd.get_dummies(df.drop(['variety'], axis=1))
y = df['variety'].apply(lambda x: 0 if x=='Setosa' else (1 if x=='Versicolor' else 2))
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=.2)
print(y_train.head())
from keras.models import Sequential, load_model
from keras.layers import Dense, Flatten
from sklearn.metrics import accuracy_score
model = Sequential()
model.add(Dense(units=8, activation='relu', input_dim=len(X_train.columns)))
model.add(Dense(units=3, activation='sigmoid'))
model.add(Flatten())
model.compile(loss='binary_crossentropy', optimizer='sgd', metrics='accuracy')
model.fit(X_train, y_train, epochs=50, batch_size=1)
I am working off of a TensorFlow tutorial, using https://www.kaggle.com/datasets/arshid/iris-flower-dataset as the dataset to train on. I used the code from the tutorial but changed it to fit my dataset. Still, I get the ValueError. Any help?
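For reference, this shape error typically comes from pairing a 3-unit output layer with binary_crossentropy and integer labels. A sketch of a matching 3-class configuration (softmax output plus sparse categorical cross-entropy, which accepts the integer labels 0/1/2 directly):
model = Sequential()
model.add(Dense(units=8, activation='relu', input_dim=len(X_train.columns)))
# a 3-unit softmax output matches the 3 classes
model.add(Dense(units=3, activation='softmax'))
# sparse categorical cross-entropy takes integer labels, no one-hot needed
model.compile(loss='sparse_categorical_crossentropy', optimizer='sgd', metrics=['accuracy'])
model.fit(X_train, y_train, epochs=50, batch_size=1)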

Why predict works without fitting the model in Keras

Check the following code:
import numpy as np
import keras
from keras.models import Sequential
from keras.layers import Conv1D, MaxPooling1D, Flatten
from sklearn.model_selection import train_test_split
# Data
X = np.random.rand(1000, 100, 1)
y = np.random.randint(0, 2, (1000, 1))
# Splitting into train and test
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)
# Conv1D
model = Sequential()
model.add(Conv1D(32, kernel_size=3, activation='relu', input_shape=(100, 1)))
model.add(MaxPooling1D(pool_size=2))
model.add(Flatten())
# Predict before fitting the model
cnn_features_train = model.predict(X_train)
cnn_features_test = model.predict(X_test)
Why does this run without throwing an error? The weights have not yet been established by the .fit method, so how can it predict anything?
If I try to do the same thing (predict before fitting the model) using sklearn, I get the expected error, for example:
from sklearn.ensemble import RandomForestClassifier
# Data
X = np.random.rand(1000, 100, 1)
y = np.random.randint(0, 2, (1000, 1))
# Splitting into train and test
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)
# Random Forest
rf = RandomForestClassifier()
rf.predict(X_test)
The error:
sklearn.exceptions.NotFittedError: This RandomForestClassifier instance is not fitted yet. Call 'fit' with appropriate arguments before using this estimator.
Keras is different from sklearn here. In Keras, a model's weights are initialized as soon as the model is built, so calling .predict() without .fit() works; this is useful for preparing and debugging the correct tensor shapes.
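You can check this directly; a small sketch (the model above is built as soon as input_shape is known, so its layers already hold randomly initialized weights):
# the freshly built model already holds randomly initialized weights,
# which is what an untrained predict() uses for its forward pass
for w in model.get_weights():
    print(w.shape)  # e.g. (3, 1, 32) and (32,) for the Conv1D kernel and bias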

How to fine tune the last layers of neural network in target model for transfer learning?

I am learning how transfer learning works using this data: https://www.kaggle.com/competitions/santander-customer-satisfaction/data . This is my simple source model code in TensorFlow, which I am saving:
import pandas as pd
pd.set_option('display.max_rows', None)
import numpy as np
from tensorflow import keras
import matplotlib.pyplot as plt
import tensorflow as tf
""" # Read in the csv data using pandas
train = pd.read_csv('Z:\ADwork2\python\PM/train.csv',index_col=0)
test = pd.read_csv('Z:\ADwork2\python\PM/test.csv', index_col=0)
sample = pd.read_csv('Z:\ADwork2\python\PM/sample_submission.csv')
"""
# Read in the csv data using pandas
train = pd.read_csv('train.csv',index_col=0)
test = pd.read_csv('test.csv', index_col=0)
sample = pd.read_csv('sample_submission.csv')
train.dtypes.value_counts()
train.select_dtypes(include=['int64']).nunique()
features_to_drop = train.nunique()
features_to_drop = features_to_drop.loc[features_to_drop.values==1].index
# now drop these columns from both the training and the test datasets
train = train.drop(features_to_drop,axis=1)
test = test.drop(features_to_drop,axis=1)
train.isnull().values.any()
X = train.iloc[:,:-1]
y = train['TARGET']
y.value_counts().to_frame().T
from imblearn.over_sampling import SMOTE
X_resampled, y_resampled = SMOTE().fit_resample(X, y)
y_resampled.value_counts().to_frame().T
from sklearn.model_selection import train_test_split
X_train, X_val, y_train, y_val = train_test_split(X_resampled, y_resampled,
                                                  train_size=0.5,
                                                  test_size=0.2,
                                                  random_state=42,
                                                  shuffle=True)
from sklearn.preprocessing import MinMaxScaler
scaler = MinMaxScaler()
X_train = scaler.fit_transform(X_train)
X_val = scaler.transform(X_val)
test = scaler.transform(test)
model = keras.Sequential(
    [
        keras.layers.Dense(units=9, activation="relu", input_shape=(X_train.shape[-1],)),
        # randomly delete 30% of the input units below
        keras.layers.Dropout(0.3),
        keras.layers.Dense(units=9, activation="relu"),
        # the output layer, with a single neuron
        keras.layers.Dense(units=1, activation="sigmoid"),
    ]
)
# save the initial weights for later
initial_weights = model.get_weights()
model.summary()
#keras.utils.plot_model(model, show_shapes=True)
learning_rate = 0.001
model.compile(optimizer=keras.optimizers.Adam(learning_rate=learning_rate),
              loss="binary_crossentropy",
              metrics=keras.metrics.AUC()
              )
history = model.fit(X_train, y_train,
                    epochs=500,
                    batch_size=1000,
                    validation_data=(X_val, y_val),
                    verbose=0)
from tensorflow.keras.callbacks import EarlyStopping
early_stopping = EarlyStopping(
    min_delta=0.0002,  # minimum amount of change to count as an improvement
    patience=20,       # how many epochs to wait before stopping
    restore_best_weights=True,
)
model.set_weights(initial_weights)
history = model.fit(X_train, y_train,
                    epochs=500,
                    batch_size=1000,
                    validation_data=(X_val, y_val),
                    verbose=0,
                    # add in our early stopping callback
                    callbacks=[early_stopping]
                    )
sample['TARGET'] = model.predict(test)
sample.to_csv('submission.csv',index=False)
#tf.keras.models.save_model()
model.save('modelcentral.h5')
I am saving this model and then loading it into a new Python file for the target model:
from pyexpat import model
import pandas as pd
pd.set_option('display.max_rows', None)
import numpy as np
from tensorflow import keras
import matplotlib.pyplot as plt
import tensorflow as tf
import tryt
# Read in the csv data using pandas
train = pd.read_csv('train.csv',index_col=0)
test = pd.read_csv('test.csv', index_col=0)
sample = pd.read_csv('sample_submission.csv')
train.dtypes.value_counts()
train.select_dtypes(include=['int64']).nunique()
features_to_drop = train.nunique()
features_to_drop = features_to_drop.loc[features_to_drop.values==1].index
# now drop these columns from both the training and the test datasets
train = train.drop(features_to_drop,axis=1)
test = test.drop(features_to_drop,axis=1)
train.isnull().values.any()
X = train.iloc[:,:-1]
y = train['TARGET']
y.value_counts().to_frame().T
from imblearn.over_sampling import SMOTE
X_resampled, y_resampled = SMOTE().fit_resample(X, y)
y_resampled.value_counts().to_frame().T
from sklearn.model_selection import train_test_split
X_train, X_val, y_train, y_val = train_test_split(X_resampled, y_resampled,
                                                  train_size=0.5,
                                                  test_size=0.2,
                                                  random_state=42,
                                                  shuffle=True)
from sklearn.preprocessing import MinMaxScaler
scaler = MinMaxScaler()
X_train = scaler.fit_transform(X_train)
X_val = scaler.transform(X_val)
test = scaler.transform(test)
#tf.keras.models.load_model()
# It can be used to reconstruct the model identically.
model = keras.models.load_model("modelcentral.h5")
model.trainable=False
#layer1.trainable = False
#inputs = keras.Input(shape=(150, 150, 3))
learning_rate = 0.001
model.compile(optimizer=keras.optimizers.Adam(learning_rate=learning_rate),
              loss="binary_crossentropy",
              metrics=keras.metrics.AUC()
              )
history = model.fit(X_train, y_train,
                    epochs=500,
                    batch_size=1000,
                    validation_data=(X_val, y_val),
                    verbose=0)
model.summary()
For now I am just freezing all model layers, but what if I need to fine-tune the last layers? For example, I have binary classification in the source model; what if the target model is multi-class? How can I fine-tune the last layers? I am following this repo https://github.com/rasbt/stat453-deep-learning-ss21/blob/main/L14/5-transfer-learning-vgg16_small.ipynb to learn fine-tuning of the final layers for transfer learning, but that code is in PyTorch and on image data, so I am confused:
model.classifier[1].requires_grad = True
model.classifier[3].requires_grad = True
#For the last layer, because the number of class labels differs compared to ImageNet, we replace the output layer with your own output layer:
model.classifier[6] = torch.nn.Linear(4096, 10)
Please help, and if there is any mistake in the current code, guide me.
Given your source model:
import tensorflow as tf
model = tf.keras.Sequential(
    [
        tf.keras.layers.Dense(units=9, activation="relu", input_shape=(10,)),
        tf.keras.layers.Dropout(0.3),
        tf.keras.layers.Dense(units=9, activation="relu"),
        tf.keras.layers.Dense(units=1, activation="sigmoid"),
    ])
model.save('model.h5')
model.save('model.h5')
You can do something like this to replace your last layer with some other layer:
model = tf.keras.models.load_model("model.h5")
transfer_model = tf.keras.Sequential()
for idx, l in enumerate(model.layers):
    if idx == len(model.layers) - 1:
        transfer_model.add(tf.keras.layers.Dense(units=10, activation="softmax"))  # add output layer with 10 different classes
    else:
        transfer_model.add(l)
print(transfer_model.summary())
You can decide which layers you then want to freeze or make trainable using l.trainable = True / False. You could also do this all without the for loop if you prefer:
model.layers[0].trainable = True
model.layers[2].trainable = True
outputs = tf.keras.layers.Dense(units=10, activation="softmax")(model.layers[-2].output)
transfer_model = tf.keras.Model(inputs=model.input, outputs=outputs)
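To then fine-tune only the new head, you could freeze the copied layers before compiling; a minimal sketch (assuming a 10-class target task with integer labels; X_train_new / y_train_new are placeholders for your target-task data):
# freeze everything except the new softmax head
for layer in transfer_model.layers[:-1]:
    layer.trainable = False
transfer_model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=0.001),
                       loss="sparse_categorical_crossentropy",  # integer class labels
                       metrics=["accuracy"])
# transfer_model.fit(X_train_new, y_train_new, epochs=..., batch_size=...)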

Why the neural network is not learning?

I am training a neural network on a simple dataset. I have tried different combinations of parameters, optimizers, learning rates... but even after 20 epochs the network is still not learning anything.
I wonder where in the following code the problem lies?
from tensorflow.keras.models import Sequential, load_model
from tensorflow.keras.layers import Input, Dense, Flatten
from tensorflow import keras
from livelossplot import PlotLossesKeras
from keras.models import Model
from sklearn.datasets import make_classification
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
import pandas as pd
seed = 42
X, y = make_classification(n_samples=100000, n_features=2, n_redundant=0,
                           n_informative=2, random_state=seed)
print(f"Number of features: {X.shape[1]}")
print(f"Number of samples: {X.shape[0]}")
df = pd.DataFrame(np.concatenate((X,y.reshape(-1,1)), axis=1))
df.set_axis([*df.columns[:-1], 'Class'], axis=1, inplace=True)
df['Class'] = df['Class'].astype('int')
X = df.drop('Class', axis=1)
y = df['Class']
X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.2, random_state=42)
print(f"Train set: {X_train.shape}")
print(f"Validation set: {X_val.shape}")
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train.astype(np.float64))
X_val_scaled = scaler.transform(X_val.astype(np.float64))
inputs = Input(shape=X_train_scaled.shape[1:])
h0 = Dense(5, activation='relu')(inputs)
h1 = Dense(5, activation='relu')(h0)
preds = Dense(1, activation = 'sigmoid')(h1)
model = Model(inputs=inputs, outputs=preds)
opt = keras.optimizers.Adam(lr=0.0001)
model.compile(optimizer=opt, loss='categorical_crossentropy', metrics=['accuracy'])
history = model.fit(X_train_scaled, y_train, batch_size=128, epochs=20, verbose=0,
                    validation_data=(X_val_scaled, y_val),
                    callbacks=[PlotLossesKeras()])
score_train = model.evaluate(X_train_scaled, y_train, verbose=0)
score_test = model.evaluate(X_val_scaled, y_val, verbose=0)
print('Train score:', score_train[0])
print('Train accuracy:', score_train[1])
print('Test score:', score_test[0])
print('Test accuracy:', score_test[1])
The code produces a live loss/accuracy plot (image omitted) in which neither metric improves over the epochs.
You have used the wrong loss function. Change this line
model.compile(optimizer=opt, loss='categorical_crossentropy', metrics=['accuracy'])
to, for example,
model.compile(optimizer=opt, loss='mse', metrics=['accuracy'])
Categorical cross-entropy needs a one-hot encoded y, which means you have to have a 0 or a 1 for every class. MSE is just mean squared error, so it will work here, but you might try some other losses as well.
your y:
[1,0,1]
one-hot encoded y:
[[0,1], [1,0], [0,1]]
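Alternatively, since the network ends in a single sigmoid unit and y holds 0/1 labels, binary cross-entropy matches this setup directly, with no one-hot encoding needed:
# single sigmoid output + 0/1 labels pair naturally with binary cross-entropy
model.compile(optimizer=opt, loss='binary_crossentropy', metrics=['accuracy'])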

Sklearn error cannot reshape array of size 6912 into shape (614,154)

I watched this video on how to build your first neural network but got stuck on this error at 27:00: ValueError: cannot reshape array of size 6912 into shape (614,154)
https://www.youtube.com/watch?v=S2sZNlr-4_4
The code is below:
# This algorithm detects if a person has diabetes or not
# Load libraries
from keras.models import Sequential
from keras.layers import Dense
from sklearn.model_selection import train_test_split
import matplotlib.pyplot as plt
plt.style.use('fivethirtyeight')
import pandas as pd
# load data
from google.colab import files
uploaded = files.upload()
# Store the dataset
df = pd.read_csv('datasets_4511_6897_diabetes.csv')
# convert the data into an array
dataset = df.values
# get all of the rows from the first eight columns of the dataset
X = dataset[:,0:8]
y = dataset[:,8]
# process the data
from sklearn import preprocessing
min_max_scaler = preprocessing.MinMaxScaler()
X_scale = min_max_scaler.fit_transform(X)
# split the data into 80% training and 20% testing
X_train, X_test, y_train, y_test = train_test_split(X_scale, y, test_size = 0.2, random_state = 4)
# build the model
model = Sequential([
    Dense(12, activation='relu', input_shape=(8,)),
    Dense(15, activation='relu'),
    Dense(1, activation='sigmoid')
])
#Compile the model (sgd = stochastic gradient descent)
model.compile(
    optimizer='sgd',
    loss='binary_crossentropy',
    metrics=['accuracy']
)
# train the model
hist = model.fit(X_train, y_train, batch_size = 57, epochs = 1000, validation_split=0.2)
# evaluate the model on the training data set
from sklearn.metrics import classification_report, confusion_matrix, accuracy_score
pred = model.predict(X_train)
pred = [1 if y >= 0.5 else 0 for y in pred]
#df = df.values.reshape(614,154)
print('confusion_matrix :\n', confusion_matrix(y_train, pred))
Do you have an idea on how to deal with this issue?
Thank you in advance
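As a side note (this may be separate from the reshape error itself), the thresholding can be done in one vectorized step; a small sketch:
# threshold the (n, 1) sigmoid outputs and flatten to a 1-D label vector
pred = (model.predict(X_train) >= 0.5).astype(int).ravel()
print('confusion_matrix :\n', confusion_matrix(y_train, pred))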
