I am relatively new to neural networks, and I am trying to use one for unsupervised clustering. My data is in a dataframe with 5 different columns (features), and I want to get 4 classes out of it; see the full model below.
from sklearn import preprocessing as pp
from sklearn.model_selection import train_test_split
from sklearn.model_selection import StratifiedKFold
from sklearn.metrics import log_loss
from sklearn.metrics import precision_recall_curve, average_precision_score
from sklearn.metrics import roc_curve, auc, roc_auc_score
import keras
from keras import backend as K
from keras.models import Sequential, Model
from keras.layers import Activation, Dense, Dropout , Flatten
from keras.layers import BatchNormalization, Input, Lambda
from keras import regularizers
from keras.losses import mse, categorical_crossentropy
model = Sequential()
model.add(Dense(32, activation='relu', input_shape=[5]))
model.add(Flatten())
model.add(Dense(units=32, activation='relu'))
model.add(Dense(units=16, activation='relu'))
model.add(Dense(units=4, activation='relu'))
model.add(Dense(4, activation="softmax"))
model.compile(optimizer='adam', loss="categorical_crossentropy", metrics=['accuracy'])
When I set the model up to produce 4 classes, I get the error message:
ValueError: Shapes (None, 5) and (None, 4) are incompatible
I don't know what I am doing wrong. I have tried a different loss function, with the same error. The error appears when I fit the model on my data:
out_class = model.fit(x=pd_pca_std,
                      y=pd_pca_std,
                      epochs=num_epochs,
                      batch_size=batch_size,
                      shuffle=True,
                      validation_data=(pd_pca_std, pd_pca_std),
                      verbose=1)
The values are:
batch_size = 33
epochs = 20
num_classes = 4
input_shape = (990000, 5)
output_shape = (990000, 4)
I would suggest using 5 classes or something relative to the 5 classes instead. I'll explain.
In neural networks, and machine learning in general, certain matrix operations happen in the background in TensorFlow. Say I create the following:
import numpy as np
x = np.random.random((3, 4))
y = np.random.random((3, 3))
np.dot(x, y)  # multiplying two incompatible matrices fails: the inner dimensions (4 and 3) don't match :(
What's happening here is that the matrices are incompatible for matrix multiplication: their shapes have to line up for the operation to be defined. So I recommend either changing the shapes of the matrices/arrays in question, or experimenting with different shapes to see which combinations succeed...
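For contrast, here is a quick sketch of shapes that do line up (the (4, 2) second matrix is just an arbitrary compatible choice):
import numpy as np
a = np.random.random((3, 4))
b = np.random.random((4, 2))
print(np.dot(a, b).shape)  # (3, 2) -- the inner dimensions agree (4 and 4), so this works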
You could also learn some linear algebra, which gives you the rules for matrix manipulation and arithmetic, but I won't go into that right now. What I will do, however, is leave a link for you to check out on this subject so you know what to do in the future...
Here it is:
https://www.mathlynx.com/online/LinAlg_Matrices_rules
Hopefully this helps...
Have a nice day :)
TL;DR
The number of units in your output layer should match the number of classes you are predicting, and your target array y needs one column per class.
This is the skeleton of how I replicated your problem and got it working:
from sklearn import preprocessing as pp
from sklearn.model_selection import train_test_split
from sklearn.model_selection import StratifiedKFold
from sklearn.metrics import log_loss
from sklearn.metrics import precision_recall_curve, average_precision_score
from sklearn.metrics import roc_curve, auc, roc_auc_score
import numpy as np
import keras
from keras import backend as K
from keras.models import Sequential, Model
from keras.layers import Activation, Dense, Dropout , Flatten
from keras.layers import BatchNormalization, Input, Lambda
from keras import regularizers
from keras.losses import mse, categorical_crossentropy
X = '''input data here as an array''' # I used X = np.zeros((990000, 5))
y = '''output data here as an array'''#I used y = np.ones((990000, 4))
batch_size = 33
num_epochs = 20
num_classes = 4
model = Sequential()
model.add(Dense(32, activation='relu',input_shape=X.shape[1:])) #Input shape = 5
model.add(Flatten())
model.add(Dense(units=32, activation='relu'))
model.add(Dense(units=16, activation='relu'))
model.add(Dense(units=4, activation='relu'))
model.add(Dense(y.shape[1], activation = "softmax")) #Output = y.shape[1] = 4
model.compile(optimizer='adam',loss="categorical_crossentropy",metrics=['accuracy'])
model.summary() #Will show you a summary of the model
model.fit(x=X, y=y,epochs=num_epochs, batch_size=batch_size, shuffle=True,validation_data=(X,y),verbose=1) #You may want to use different variables in your validation.
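One extra note, since your fit call passes pd_pca_std as both x and y: the target array itself must have 4 columns to match the 4-unit softmax layer, not the 5 feature columns. Here is a minimal sketch of building such a target, assuming you can produce an integer label (0-3) per row (for example from a separate clustering step); the labels below are random placeholders, not real data:
import numpy as np
from keras.utils import to_categorical

labels = np.random.randint(0, 4, size=990000)  # placeholder labels, one per sample
y = to_categorical(labels, num_classes=4)      # shape (990000, 4), matches the 4-unit softmax layer
print(y.shape)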
Related
I've been trying to create a model that recognizes different singing techniques. I have gotten good results, but I want to run different tests with different optimizers, layers, etc. However, I can't get reproducible results. Running this model training twice:
num_epochs = 100
batch_size = 128
history = modelo.fit(X_train_f, Y_train, validation_data=(X_test_f,Y_test), epochs=num_epochs, batch_size=batch_size, verbose=2)
I can get 25% accuracy on the first run and then 34% on the second. Then if I change the optimizer from "sgd" to "adam", I get 99%. If I go back to the previous "sgd" optimizer that gave me 34% on the second run, I get 100% or something crazy like that. I don't understand why.
I've tried many things I've read in similar questions. The following lines show how I am trying to make my code reproducible; they are actually the first lines of my whole code:
import numpy as np
import tensorflow as tf
import random as rn
import os
#https://stackoverflow.com/questions/57305909/tensorflow-keras-reproducibility-problem-on-google-colab
os.environ['PYTHONHASHSEED']=str(5)
np.random.seed(5)
rn.seed(12345)
session_conf = tf.compat.v1.ConfigProto(intra_op_parallelism_threads=1,
inter_op_parallelism_threads=1)
tf.compat.v1.set_random_seed(1234)
sess = tf.compat.v1.Session(graph=tf.compat.v1.get_default_graph(), config=session_conf)
tf.compat.v1.keras.backend.set_session(sess)
My question is: what am I doing wrong in the code above, since it is not working (as I mentioned)?
Here's where I create the training sets:
from keras.datasets import mnist
from keras.utils import np_utils
from keras.models import Sequential
from keras.layers.convolutional import Conv1D, MaxPooling1D
from keras.layers.core import Dense, Flatten
from keras.layers import BatchNormalization,Activation
from keras.optimizers import SGD, Adam
from sklearn.model_selection import train_test_split
X_train, X_test, Y_train, Y_test = train_test_split(X,Y, test_size=0.2, random_state=2)
My model:
from tensorflow.keras import layers
from tensorflow.keras import initializers
input_dim = X_train_f.shape[1]
output_dim = Y_train.shape[1]
modelo = Sequential()
modelo.add(Conv1D(filters=6, kernel_initializer=initializers.glorot_uniform(seed=5), kernel_size=5, activation='relu', input_shape=(40, 1))) # 6
modelo.add(MaxPooling1D(pool_size=2))
modelo.add(Conv1D(filters=16, kernel_initializer=initializers.glorot_uniform(seed=5), kernel_size=5, activation='relu')) # 16
modelo.add(MaxPooling1D(pool_size=2))
modelo.add(Flatten())
modelo.add(Dense(120, kernel_initializer=initializers.glorot_uniform(seed=5), activation='relu')) # 120
modelo.add(Dense(84, kernel_initializer=initializers.glorot_uniform(seed=5), activation='relu')) # 84
modelo.add(Dense(nclases, kernel_initializer=initializers.glorot_uniform(seed=5), activation='softmax'))
sgd = SGD(lr=0.1)
#modelo.compile(loss='categorical_crossentropy',
# optimizer='adam',
# metrics=['accuracy'])
modelo.compile(loss='categorical_crossentropy',
optimizer=sgd,
metrics=['accuracy'])
modelo.summary()
modelo.input_shape
This is a normal situation. The Adam optimizer is much more powerful than SGD. Adam implicitly performs coordinate-wise gradient clipping and can therefore, unlike SGD, tackle heavy-tailed noise.
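If you want to check that the gap really comes from the optimizer rather than from the random initialization, one option is a controlled comparison where everything is re-seeded before each run. A rough sketch with dummy data (not your model; tf.keras.utils.set_random_seed requires TF >= 2.7):
import numpy as np
import tensorflow as tf

def build_model():
    return tf.keras.Sequential([
        tf.keras.layers.Dense(64, activation='relu', input_shape=(40,)),
        tf.keras.layers.Dense(4, activation='softmax'),
    ])

X = np.random.random((256, 40)).astype('float32')
Y = tf.keras.utils.to_categorical(np.random.randint(0, 4, 256), 4)

for opt in ['sgd', 'adam']:
    tf.keras.utils.set_random_seed(5)  # seeds Python, NumPy and TF in one call
    m = build_model()                  # both runs now start from identical initial weights
    m.compile(optimizer=opt, loss='categorical_crossentropy', metrics=['accuracy'])
    hist = m.fit(X, Y, epochs=5, verbose=0)
    print(opt, hist.history['accuracy'][-1])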
Here is my complete code. I'm trying to predict protein classes from protein sequences.
from sklearn.preprocessing import LabelBinarizer
# Transform labels to one-hot
lb = LabelBinarizer()
Y = lb.fit_transform(df.classification)
from keras.preprocessing import text, sequence
from keras.preprocessing.text import Tokenizer
from sklearn.model_selection import train_test_split
#maximum length of sequence, everything afterwards is discarded!
max_length = 500
#create and fit tokenizer
tokenizer = Tokenizer(char_level=True)
tokenizer.fit_on_texts(seqs)
X = tokenizer.texts_to_sequences(seqs)
X = sequence.pad_sequences(X, maxlen=max_length)
from __future__ import print_function
import numpy as np
from keras.preprocessing import sequence
from keras.models import Sequential
from keras.layers import Dense, Dropout, Embedding, LSTM, Bidirectional, Conv1D
from keras.layers.convolutional import MaxPooling1D
import tensorflow as tf
from tensorflow.keras import layers
embedding_vecor_length = 128
max_length = 500
model = Sequential()
model.add(Embedding(len(tokenizer.word_index)+1, embedding_vecor_length, input_length=max_length))
model.add(Conv1D(filters=32, kernel_size=3, padding='same', activation='relu'))
model.add(MaxPooling1D(pool_size=2))
model.add(Bidirectional(LSTM(64)))
model.add(Dense(10, activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
print(model.summary())
X_train, X_test, y_train, y_test = train_test_split(X, Y, test_size=.2)
model.fit(X_train, y_train, validation_data=(X_test, y_test), epochs=10, batch_size=512)
This is the accuracy of the model:
train-acc = 0.8485087800799034
test-acc = 0.8203392530062913
and my prediction results are:
[9.65313017e-02 1.33084046e-04 1.73516816e-03 4.62103529e-08
8.45071673e-03 2.42734270e-04 3.54182965e-04 2.88571493e-04
1.99087553e-05 8.92244339e-01]
[8.89207274e-02 1.99566261e-04 1.76228161e-04 2.08527595e-02
1.64435953e-01 2.83987029e-03 1.53038520e-02 7.07270563e-01
5.16798650e-07 2.19354401e-08]
[9.36142087e-01 6.09822795e-02 3.55492946e-09 2.19342492e-05
5.41335670e-04 1.89031591e-04 2.66434945e-04 1.84136129e-03
1.54582867e-05 3.31551647e-10]
Any help in this regard would be appreciated. I'm stuck and don't know how to solve it. Also, I'm kind of new to deep learning.
As you can see, your last layer uses a softmax activation function:
model.add(Dense(10, activation='softmax'))
So when you predict, the values pass through that softmax function in the last layer, which is what gives you those strange-looking float values.
Basically, what the softmax function does here is normalize the values it receives into the range (0, 1) so that all the components add up to 1. You can read more about the softmax function here: https://en.wikipedia.org/wiki/Softmax_function.
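A tiny numeric illustration of that normalization (exponentiate each score, then divide by the sum):
import numpy as np

scores = np.array([2.0, 1.0, 0.1])
probs = np.exp(scores) / np.sum(np.exp(scores))
print(probs)        # roughly [0.659, 0.242, 0.099]
print(probs.sum())  # 1.0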
To find the predicted label id, you just need to find the index of the maximum value in the array; that index is the label id the prediction points to.
You can use NumPy's argmax function to find the index of the maximum value in an array (or along an axis of a multidimensional array). You can refer to the docs here: https://numpy.org/doc/stable/reference/generated/numpy.argmax.html
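For example, a short sketch using the variables from your code above (model, X_test, and the LabelBinarizer lb):
import numpy as np

probs = model.predict(X_test)              # one row of 10 probabilities per sample
pred_ids = np.argmax(probs, axis=1)        # index of the largest probability in each row
pred_labels = lb.inverse_transform(probs)  # map those back to the original class names
print(pred_ids[:3])
print(pred_labels[:3])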
Firstly, I'm quite new to this library. My code is simple: there is an input x and an output y (x*2). It should learn a simple int*2 mapping, but it can't. I think maybe the parameters are wrong, but how can I determine the right parameters?
from tensorflow import keras
import numpy as np
import tensorflow as tf
from keras.models import Sequential
from keras.layers import Dense, Dropout, Activation
import keras
from keras.layers import Input, Dense
import numpy as np
import pandas as pd
xs = np.array([[1,3,4,5] , [9,2,3,4]]).reshape(1,2,4)
ys = np.array([[2,6,8,10] , [18,4,6,8]]).reshape(1,2,4)
# model = tf.keras.Sequential([layers.Dense(units=1, input_shape=[2,4])])
model = Sequential()
model.add(Dense(8,input_shape=[2,4]))
model.add(Activation('relu'))
model.add(Dense(6))
model.add(Activation('relu'))
model.add(Dense(4))
model.add(Activation('softmax'))
model.compile(optimizer='Adadelta', loss='mean_squared_error')
model.fit(xs, ys, epochs=5, batch_size=1)
p = np.array([[1,3,4,5] , [9,2,3,4]]).reshape(1,2,4)
print(model.predict(p))
My second try is harder than *2: I created 100 input and output samples, but performance is still bad. How can I make it MORE accurate?
from tensorflow import keras
import numpy as np
import tensorflow as tf
from keras.models import Sequential
from keras.layers import Dense, Dropout, Activation
import keras
from keras.layers import Input, Dense
import numpy as np
import pandas as pd
xs = np.array([[1,3,4,5,9,2,3,4]]).reshape(1,1,8)
xg=np.ones((100,8)).reshape(100,8)
yg=np.ones((100,8)).reshape(100,8)
for i in range(100):
    xg[i-1] = xs * np.random.randint(500)
    yg[i-1] = xg[i-1] * np.sin(20)
xs=xg
ys=yg
# model = tf.keras.Sequential([layers.Dense(units=1, input_shape=[2,4])])
model = Sequential()
model.add(Dense(8,input_shape=[100,8]))
model.add(Activation('relu'))
model.add(Dense(8))
model.add(Activation('relu'))
model.add(Dense(8))
# opt = keras.optimizers.sgd(learning_rate=0.2)
model.compile(optimizer='adam', loss='mean_squared_error')
model.fit(xs, ys, epochs=8000, batch_size=100)
model.summary()
p = np.array([[1,340,4,512,9,2,3,4]])
print(model.predict(p))
print(np.sin(p))
Your problem comes from 2 places. First, you accidentally used softmax as the output activation function; you can solve that by just commenting it out (this is your root problem). The softmax function is used for classification problems, but your problem is a regression problem, so there is no need for an activation in the last layer. Second (after taking the softmax out), it comes from the low number of epochs; you need to train the model for a larger number of epochs to get better performance.
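Here is a minimal sketch of the first (x*2) experiment with those two fixes applied: no activation on the last layer and many more epochs. Reshaping the data to plain (samples, features) arrays is my own simplification, not part of your original code:
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

xs = np.array([[1, 3, 4, 5], [9, 2, 3, 4]], dtype=float)
ys = xs * 2  # the target is just the input doubled

model = Sequential()
model.add(Dense(8, activation='relu', input_shape=(4,)))
model.add(Dense(6, activation='relu'))
model.add(Dense(4))  # no activation: plain regression output
model.compile(optimizer='adam', loss='mean_squared_error')
model.fit(xs, ys, epochs=5000, verbose=0)

print(model.predict(xs))  # with enough epochs this approaches [[2, 6, 8, 10], [18, 4, 6, 8]]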
I've been working on this CNN. In the test() function it always predicts one particular number (for example, it always outputs 8 even when that's not even close). I've tried training the model more to see if the model was just not good enough. Here is my code:
import numpy as np
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras.layers import Dense, Conv2D, Dropout, MaxPooling2D
from tensorflow.keras.callbacks import TensorBoard
from tensorflow.keras.utils import to_categorical
from matplotlib import pyplot as plt
(Train_Data, Train_Labels), (Test_Data, Test_Labels) = tf.keras.datasets.mnist.load_data()
Train_Data = Train_Data.reshape(60000,28,28,1)
Test_Data = Test_Data.reshape(10000,28,28,1)
Train_Data = Train_Data / 255 - 0.5
Test_Data = Test_Data / 255 - 0.5
def load(name):
    net = keras.models.load_model(name)
    return net

def save(name):
    model.save(name)
    print("""
    ###:::SAVING MODEL:::###
    """)

def makeCNN():
    model = keras.Sequential()
    model.add(Conv2D(32, kernel_size=3, activation='relu'))
    model.add(MaxPooling2D(pool_size=(3,3)))
    model.add(keras.layers.Flatten())
    model.add(Dense(9, activation='relu'))
    model.add(Dense(10, activation='softmax'))
    model.compile(optimzer='adam', loss="mse", metrics=['accuracy'])
    return model

def train(epochs):
    for i in range(epochs):
        print(i+1)
        model.fit(Train_Data, Train_Labels)
    save('CNN.h5')

def test():
    validCorrect = 0
    validTotal = 0
    print(Test_Data.shape)
    for i in range(1000):
        data = Test_Data[i]
        data = data.reshape(1,28,28,1)
        prediction = model.predict(data)
        validTotal += 1
        if np.argmax(prediction) == Test_Labels[i]:
            validCorrect += 1
    print(f"""
    TOTAL:{validTotal}
    ACCURACY:{(validCorrect/validTotal)*100}
    CORRECT:{validCorrect}
    """)
print(f"GUESS:{np.argmax(prediction)}
REALITY{Test_Labels[i]}")
model = makeCNN()
train(80)
test()
Any help is appreciated. Thanks! I'm pretty new to machine learning (especially CNNs).
Firstly, you should use categorical_crossentropy as your loss. It's tempting to use MSE (we're dealing with digits, after all), but since this is a classification task, the model doesn't know about the supposed ordinality of the different digits; it just knows them as "ten different classes of image". For example, is a 7 more similar to a 2 or to an 8? In terms of ordinality it's closer to 8, but the digit looks rather more like a 2, doesn't it?
Also, I'm guessing that your model is likely to under-fit quite severely, because it is not deep enough. You can try adding some more convolutional layers to your network. You could draw inspiration from the example in the Keras documentation (also on the MNIST dataset) here: https://keras.io/examples/mnist_cnn/ where they achieve >99% on this problem with just a couple of extra convolutional layers and some techniques to reduce overfitting, such as dropout.
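A rough sketch combining both suggestions, reusing Train_Data/Test_Data/Train_Labels/Test_Labels from your script (the layer sizes are just illustrative, not the exact Keras example):
from tensorflow import keras
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense
from tensorflow.keras.utils import to_categorical

# One-hot encode the integer labels so categorical_crossentropy can be used.
Train_Labels_oh = to_categorical(Train_Labels, 10)
Test_Labels_oh = to_categorical(Test_Labels, 10)

model = keras.Sequential()
model.add(Conv2D(32, kernel_size=3, activation='relu', input_shape=(28, 28, 1)))
model.add(Conv2D(64, kernel_size=3, activation='relu'))  # extra convolutional layer for more capacity
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Flatten())
model.add(Dense(128, activation='relu'))
model.add(Dense(10, activation='softmax'))
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
model.fit(Train_Data, Train_Labels_oh, epochs=5,
          validation_data=(Test_Data, Test_Labels_oh))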
I am running the test script from the Keras website for a Multilayer Perceptron (MLP) for multi-class softmax classification. Running it in a Jupyter notebook, I get the error "name 'keras' is not defined". This may be a simple Python problem that I am just not seeing, but this code comes straight from Keras, so I expect it to work as is. I have run other neural nets using Keras, so I am pretty sure I have installed everything (I installed Keras using Anaconda). Can anyone help? I have included both the code and the error at the bottom. Thanks!
from keras.models import Sequential
from keras.layers import Dense, Dropout, Activation
from keras.optimizers import SGD
# Generate dummy data
import numpy as np
x_train = np.random.random((1000, 20))
y_train = keras.utils.to_categorical(np.random.randint(10, size=(1000, 1)), num_classes=10)
x_test = np.random.random((100, 20))
y_test = keras.utils.to_categorical(np.random.randint(10, size=(100, 1)), num_classes=10)
model = Sequential()
# Dense(64) is a fully-connected layer with 64 hidden units.
# in the first layer, you must specify the expected input data shape:
# here, 20-dimensional vectors.
model.add(Dense(64, activation='relu', input_dim=20))
model.add(Dropout(0.5))
model.add(Dense(64, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(10, activation='softmax'))
sgd = SGD(lr=0.01, decay=1e-6, momentum=0.9, nesterov=True)
model.compile(loss='categorical_crossentropy',
optimizer=sgd,
metrics=['accuracy'])
model.fit(x_train, y_train,
epochs=20,
batch_size=128)
score = model.evaluate(x_test, y_test, batch_size=128)
This is the error message:
NameError Traceback (most recent call last)
<ipython-input-1-6d8174e3cf2a> in <module>()
6 import numpy as np
7 x_train = np.random.random((1000, 20))
----> 8 y_train = keras.utils.to_categorical(np.random.randint(10, size=(1000, 1)), num_classes=10)
9 x_test = np.random.random((100, 20))
10 y_test = keras.utils.to_categorical(np.random.randint(10, size=(100, 1)), num_classes=10)
NameError: name 'keras' is not defined
from keras.models import Sequential
from keras.layers import Dense, Dropout, Activation
from keras.optimizers import SGD
From the above, you only imported the following submodules of keras:
keras.models
keras.layers
keras.optimizers
But this does not automatically import the outer module keras or other submodules such as keras.utils.
So, you can do any one of the following:
import keras
import keras.utils
from keras import utils as np_utils
but from keras import utils as np_utils is the most widely used.
In particular, import keras is not good practice, because importing a top-level package does not necessarily import its submodules (though it happens to work in Keras).
For example,
import urllib
does not necessarily import urllib.request, because when a package has many large submodules it would be inefficient to import all of them every time.
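A quick way to see this in a fresh interpreter (whether the attribute is already present can vary, because other imports may have loaded the submodule for you):
import urllib
print(hasattr(urllib, "request"))  # often False: the submodule was never imported

import urllib.request              # importing the submodule explicitly
print(hasattr(urllib, "request"))  # now True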
EDIT:
With the introduction of Tensorflow 2, keras submodules such as keras.utils should now be imported as
from tensorflow.keras import utils as np_utils
General way:
from keras.utils import to_categorical
Y_train = to_categorical(y_train, num_classes)
Concrete way:
from keras.utils import to_categorical
print(to_categorical(1, 2))
print(to_categorical(0, 2))
Will output
[0. 1.]
[1. 0.]
Although this is an old question, here is an update on the latest approach for accessing the to_categorical function.
This function is now packaged in np_utils.
The correct way to access it is:
from keras.utils.np_utils import to_categorical
This works for me:
import tensorflow as tf
from keras import utils as np_utils
y_train = tf.keras.utils.to_categorical(y_train, num_classes)
y_test = tf.keras.utils.to_categorical(y_test, num_classes)