Predicting the square root of a number using Machine Learning - python

I am trying to create a program in Python that uses machine learning to predict the square root of a number. Here is what I have done in my program:
created a CSV file with numbers and their squares
extracted the data from the CSV into suitable variables (X stores squares, y stores numbers)
scaled the data using sklearn's StandardScaler
built the ANN with two hidden layers of 6 units each (no activation functions)
compiled the ANN using SGD as the optimizer and mean squared error as the loss function
trained the model; loss was around 0.063
tried predicting, but the result is something else.
My actual code:
import numpy as np
import tensorflow as tf
import pandas as pd
df = pd.read_csv('CSV/SQUARE-ROOT.csv')
X = df.iloc[:, 1].values
X = X.reshape(-1, 1)
y = df.iloc[:, 0].values
y = y.reshape(-1, 1)
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0, test_size=0.2)
from sklearn.preprocessing import StandardScaler
sc = StandardScaler()
X_test_sc = sc.fit_transform(X_test)
X_train_sc = sc.fit_transform(X_train)
sc1 = StandardScaler()
y_test_sc1 = sc1.fit_transform(y_test)
y_train_sc1 = sc1.fit_transform(y_train)
ann = tf.keras.models.Sequential()
ann.add(tf.keras.layers.Dense(units=6))
ann.add(tf.keras.layers.Dense(units=6))
ann.add(tf.keras.layers.Dense(units=1))
ann.compile(optimizer='SGD', loss=tf.keras.losses.MeanSquaredError())
ann.fit(x = X_train_sc, y = y_train_sc1, batch_size=5, epochs = 100)
print(sc.inverse_transform(ann.predict(sc.fit_transform([[144]]))))
OUTPUT: array([[143.99747]], dtype=float32)
Shouldn't the output be 12? Why is it giving me the wrong result?
I am attaching the csv file I used to train my model as well: SQUARE-ROOT.csv

TL;DR: You really need those nonlinearities.
The reason it's not working could be one (or a combination) of several causes, like a bad input data range, flaws in your data, over-/underfitting, etc.
However, in this specific case the model you built literally can't learn the function you're trying to approximate: without nonlinearities it is a purely linear model, and a linear model can't accurately approximate nonlinear functions.
A Dense layer is implemented as follows:
x_res = activ_func(w*x + b)
where x is the layer input, w the weights, b the bias vector and activ_func the activation function (if one is defined).
Your model, then, mathematically becomes (I'm using indices 1 to 3 for the three Dense layers):
pred = w3 * (w2 * (w1 * x + b1) + b2) + b3
     = w3*w2*w1*x + w3*w2*b1 + w3*b2 + b3
As you see, the resulting model is still linear.
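You can check this numerically. Here is a quick sketch (random weights, with shapes matching the 6-unit architecture above) showing that three stacked linear layers collapse into a single affine map:
import numpy as np

rng = np.random.default_rng(0)
w1, b1 = rng.normal(size=(1, 6)), rng.normal(size=6)
w2, b2 = rng.normal(size=(6, 6)), rng.normal(size=6)
w3, b3 = rng.normal(size=(6, 1)), rng.normal(size=1)

x = np.array([[144.0]])
stacked = ((x @ w1 + b1) @ w2 + b2) @ w3 + b3  # three linear Dense layers
w_eq = w1 @ w2 @ w3                            # collapsed weight
b_eq = (b1 @ w2 + b2) @ w3 + b3                # collapsed bias
print(np.allclose(stacked, x @ w_eq + b_eq))   # True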
Add activation functions and your model becomes capable of learning nonlinear functions too. From there, experiment with the hyperparameters and see how your model's performance changes.
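For instance, here is the original architecture with ReLU activations added to the hidden layers (a sketch; other nonlinearities such as tanh would work as well):
ann = tf.keras.models.Sequential([
    tf.keras.layers.Dense(units=6, activation='relu'),
    tf.keras.layers.Dense(units=6, activation='relu'),
    tf.keras.layers.Dense(units=1),  # keep the output linear for regression
])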

The reason your code does not work is that you apply fit_transform to your test set, which is wrong. You can fix it by replacing fit_transform(test) with transform(test). Note that your prediction line has the same problem twice over: it refits the scaler on the single input [[144]] and then inverse-transforms with the X scaler instead of the y scaler (see the corrected last line below). Although I don't think the StandardScaler is necessary, please try this:
import numpy as np
import tensorflow as tf
import pandas as pd
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import train_test_split
N = 10000
X = np.arange(1, N).reshape(-1, 1)
y = np.sqrt(X)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0, test_size=0.2)
sc = StandardScaler()
X_train_sc = sc.fit_transform(X_train)
#X_test_sc = sc.fit_transform(X_test) # wrong!!!
X_test_sc = sc.transform(X_test)
sc1 = StandardScaler()
y_train_sc1 = sc1.fit_transform(y_train)
#y_test_sc1 = sc1.fit_transform(y_test) # wrong!!!
y_test_sc1 = sc1.transform(y_test)
ann = tf.keras.models.Sequential()
ann.add(tf.keras.layers.Dense(units=32, activation='relu')) # with ~10,000 samples, you may need a slightly deeper network
ann.add(tf.keras.layers.Dense(units=32, activation='relu'))
ann.add(tf.keras.layers.Dense(units=32, activation='relu'))
ann.add(tf.keras.layers.Dense(units=1))
ann.compile(optimizer='SGD', loss='MSE')
ann.fit(x=X_train_sc, y=y_train_sc1, batch_size=32, epochs=100, validation_data=(X_test_sc, y_test_sc1))
#print(sc.inverse_transform(ann.predict(sc.fit_transform([[144]])))) # wrong!!!
print(sc1.inverse_transform(ann.predict(sc.transform([[144]]))))
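With the activations in place and transform/inverse_transform used correctly, the prediction for [[144]] should now come out close to 12.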

Related

Fit linear model on all data after doing it just on training data in order to make predictions about future data

In linear regression (but in general with any other model), is there a need to call .fit() on all input data after using it only on training data? I'll try to explain with an example, artificially creating a trivial set of input and output data:
import numpy as np
from random import uniform
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
X = np.arange(0, 500, 5).reshape(-1, 1)
y = np.zeros(100)
for i in range(len(X)):
    y[i] = 3 * X[i] + uniform(-10, 10)
X_train = X[:80]
y_train = y[:80]
X_test = X[80:]
y_test = y[80:]
reg = LinearRegression()
It could be seen as a time series (X contains the time periods).
Now, I train the model using the training sets:
reg.fit(X_train, y_train)
and I make predictions using the testing set:
y_pred = reg.predict(X_test)
Using r2_score(y_test, y_pred) I get about 0.99, so my model is able to fit the data correctly.
Now my question: I want to make a prediction about the future, i.e. about a series of data not contained in X, let's say:
X_future = np.arange(500, 525, 5).reshape(-1, 1)
Should I proceed this way:
y_future = reg.predict(X_future)
or in this:
reg.fit(X, y)
y_future = reg.predict(X_future)
In other words, do I have to refit the model found earlier on the entire (X, y) before making a prediction about the future? Is there any chance that both approaches are correct?
Thanks to anyone who will answer me.
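For reference, a minimal way to compare the two options side by side, reusing the variables defined above:
reg_train_only = LinearRegression().fit(X_train, y_train)  # option 1: train split only
reg_all = LinearRegression().fit(X, y)                     # option 2: refit on everything
print(reg_train_only.predict(X_future))
print(reg_all.predict(X_future))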

Linear Regression Neural Network Tensorflow Keras Python program

I wrote a small "Linear Regression Neural Network Tensorflow Keras Python program". The input dataset is y = mx + c straight-line data. The predicted y values are not correct: they come out as a roughly horizontal line instead of a line with some slope. I ran this program on a Windows laptop with TensorFlow, Keras and Jupyter Notebook. What can I do to fix this program?
Thanks and best regards,
SSJ
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
n2 = 50
count = 20
n4 = n2 + count
p = 100
m = 10
c = 5
x = np.linspace(n2, n4, p)
y = m * x + c
x
y
plt.scatter(x,y)
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
from tensorflow.keras.layers.experimental import preprocessing
x_normalizer = preprocessing.Normalization(input_shape=[1,])
x_normalizer.adapt(x)
x_normalized = x_normalizer(x)
y_normalizer = preprocessing.Normalization(input_shape=[1,])
y_normalizer.adapt(y)
y_normalized = x_normalizer(y)
y_model = tf.keras.Sequential([
    y_normalizer,
    layers.Dense(1)
])
y_model.compile(optimizer='rmsprop', loss='mse', metrics = ['mae'])
y_hist = y_model.fit(x, y, epochs=100, verbose=0, validation_split = 0.2)
hist = pd.DataFrame(y_hist.history)
hist['epoch'] = y_hist.epoch
hist.head()
hist.tail()
xin = [51,53,59,64]
ypred = y_model.predict(xin)
ypred
plt.scatter(x, y)
plt.scatter(xin, ypred, color = 'r')
plt.grid(linestyle = '--')
Use StandardScaler instead of Normalization
Normalizer acts row-wise and StandardScaler column-wise: Normalizer does not remove the mean and scale by the standard deviation, but scales each whole row to unit norm.
Found here: Difference between StandardScaler and Normalizer
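A quick sketch of the row-wise vs column-wise difference, using sklearn:
import numpy as np
from sklearn.preprocessing import Normalizer, StandardScaler

data = np.array([[1.0, 10.0],
                 [2.0, 20.0],
                 [3.0, 30.0]])
print(StandardScaler().fit_transform(data))  # each column: zero mean, unit variance
print(Normalizer().fit_transform(data))      # each row: scaled to unit L2 norm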
This is how you can process the data:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Dense
from sklearn.preprocessing import StandardScaler
x = np.linspace(50, 70, 100).reshape(-1, 1)
y = 10 * x + 5
x_standard_scaler = StandardScaler().fit(x)
y_standard_scaler = StandardScaler().fit(y)
x_scaled = x_standard_scaler.transform(x)
y_scaled = y_standard_scaler.transform(y)
Remember that you need two separate scalers for x and y, so don't use the same object for both. Also, if you want to use a scaler to process new data for testing, save it in a variable. A good practice is not to refit the scaler on test data.
model = Sequential([
    Dense(1, input_dim=1, activation='linear'),
])
model.compile(optimizer='rmsprop', loss='mse')
history = model.fit(x_scaled, y_scaled, epochs=1000, verbose=0, validation_split = 0.2).history
pd.DataFrame(history).plot()
plt.show()
As you can see, the model is converging. It's worth plotting the loss history, which helps to tell whether your model is learning or not.
x_test = np.linspace(20, 100, 10).reshape(-1, 1)
y_test = 10 * x_test + 5
x_test_scaled = x_standard_scaler.transform(x_test)
y_test_scaled = y_standard_scaler.transform(y_test)
If you have test data that you want to use for validation or prediction, remember to apply the standard scaler again, but without fitting. In most cases it should be fitted on the training data only.
y_test_pred_scaled = model.predict(x_test_scaled)
y_test_pred = y_standard_scaler.inverse_transform(y_test_pred_scaled)
plt.scatter(x_test, y_test, s=30, label='true')
plt.scatter(x_test, y_test_pred, s=15, label='pred')
plt.legend()
plt.show()
If you want your prediction rescaled back to its original range, use inverse_transform. Notice that the prediction on x_test, after rescaling, is very close to y_test.

kfold cross validation won't terminate, stuck at cross_val_score

I am trying to run k-fold cross validation, but for some reason it gets stuck and won't terminate at this line:
accuracies = cross_val_score(estimator = classifier, X = X_train, y = y_train, cv = 10, n_jobs = -1)
I can't understand what the problem is or how to fix it.
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
dataset = pd.read_csv('Churn_Modelling.csv')
X = dataset.iloc[:, 3:13].values
y = dataset.iloc[:, 13].values
# Encoding categorical data
from sklearn.preprocessing import LabelEncoder, OneHotEncoder
labelencoder_X_1 = LabelEncoder()
X[:, 1] = labelencoder_X_1.fit_transform(X[:, 1])
labelencoder_X_2 = LabelEncoder()
X[:, 2] = labelencoder_X_2.fit_transform(X[:, 2])
onehotencoder = OneHotEncoder(categorical_features = [1])
X = onehotencoder.fit_transform(X).toarray()
X = X[:,1:]
# Splitting the dataset into the Training set and Test set
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.2, random_state = 0)
# Feature Scaling
from sklearn.preprocessing import StandardScaler
sc = StandardScaler()
X_train = sc.fit_transform(X_train)
X_test = sc.transform(X_test)
import keras
from keras.models import Sequential #Required to initialize the ANN
from keras.layers import Dense #Build layers of ANN
from keras.layers import Dropout
# Evaluating the ANN
import keras
from keras.wrappers.scikit_learn import KerasClassifier
from sklearn.model_selection import cross_val_score
from keras.models import Sequential #Required to initialize the ANN
from keras.layers import Dense #Build layers of ANN
def build_classifier(): # Builds the architecture, i.e. the classifier
    classifier = Sequential()
    classifier.add(Dense(activation = 'relu', input_dim = 11, units = 6, kernel_initializer = 'uniform')) # add layers
    classifier.add(Dense(activation = 'relu', units = 6, kernel_initializer = 'uniform'))
    classifier.add(Dense(activation = 'sigmoid', units = 1, kernel_initializer = 'uniform'))
    classifier.compile(optimizer = 'adam', loss = 'binary_crossentropy', metrics = ['accuracy'])
    return classifier
classifier = KerasClassifier(build_fn = build_classifier, batch_size = 10, nb_epoch = 100)
accuracies = cross_val_score(estimator = classifier, X = X_train, y = y_train, cv = 10, n_jobs = -1)
mean = accuracies.mean()
variance = accuracies.std()
Edit: I'm on Windows 10, using Anaconda with Python 3.6.
Dataset: Drive Link for dataset
It works perfectly when I set n_jobs = 1, but not when n_jobs = -1.
Since you have set n_jobs = -1, all the CPUs are being utilised, as per the documentation mentioned here. However, you must understand that utilising all the CPUs does not necessarily lead to a reduction in execution time, because:
There is an overhead involved with the creation and allocation of resources to new threads.
There might be other bottlenecks, like the data being too large to be broadcast to all threads at the same time, thread pre-emption over RAM (or other resources), how data is pushed into each thread, etc.
Multithreading in Python also has various shortcomings, see here and here.
You can check out a similar issue with GridSearchCV and parallelization here in this answer.
Also, as mentioned by @ncfith, there is no current solution for this problem.
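In the meantime, the practical workaround (as your own edit notes) is to run the folds sequentially:
accuracies = cross_val_score(estimator=classifier, X=X_train, y=y_train,
                             cv=10, n_jobs=1)  # sequential: slower, but avoids the hang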
References
Why do I sometime get a crash/freeze with n_jobs > 1 under OSX or Linux?
Similar issue with numpy on MacOS

ANN: keras/sklearn doesn't scale well

I don't know why the results I have obtained aren't scaled well.
As you can see in the pictures below, there is a problem with the scaling.
There are two issues:
There are no negative values
There is a problem with predicting the maximum values
I have no idea why I have these problems.
Do you have any ideas how I can fix this issue?
I would be very grateful for your help.
CODE:
# Read inputs
X = dataset.iloc[0:20000, [1, 4, 10]].values
# Read output
y = dataset.iloc[0:20000, 5].values
# Splitting the dataset into the Training set and Test set
from sklearn.cross_validation import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.2, random_state = 0)
# Output matrix conversion
y_train = y_train.reshape(-1, 1)
y_test = y_test.reshape(-1, 1)
# Feature Scaling
from sklearn.preprocessing import StandardScaler
sc = StandardScaler()
X_train = sc.fit_transform(X_train)
X_test = sc.transform(X_test)
y_train = sc.fit_transform(y_train)
y_test = sc.transform(y_test)
# Import the Keras libraries and package
from keras.models import Sequential
from keras.layers import Dense
# building model
classifier = Sequential()
classifier.add(Dense(activation="sigmoid", input_dim=3, units=64, kernel_initializer="uniform"))
classifier.add(Dense(activation="sigmoid", units=32, kernel_initializer="uniform"))
classifier.add(Dense(activation="sigmoid", units=16, kernel_initializer="uniform"))
classifier.add(Dense(activation="sigmoid", units=1, kernel_initializer="uniform"))
classifier.compile(optimizer='adam', loss='mean_squared_error', metrics=['accuracy'])
# Fitting the ANN to the training set
results = classifier.fit(X_train, y_train, batch_size=16, epochs=25)
# Predicting the Test set results
y_pred = classifier.predict(X_test)
If your blue curves show your initial output y and your orange ones the output of your model (you have not cared to clarify this...), then there is nothing strange here...
There is problem with maximum values prediction
Looking more carefully at your code, you will realize that you don't actually feed your initial y into your network, but its scaled version, i.e. the result of sc.transform(); hence, your output is also scaled, and you should use the inverse_transform method to get it back to the initial scale:
y_final = sc.inverse_transform(y_pred)
BTW, this happens to work here, but in general it is not a good idea to use the same scaler (sc here) for two different datasets (i.e. your X's and your y's) - you should define two different scalers instead, say sc_X and sc_y.
There are no negative values
That is because the sigmoid function you have used as the activation in your output layer only produces values in (0, 1), so you may want to change it to something that can cover the required value range (linear is a candidate), and possibly also change your other sigmoids to tanh.
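Putting both suggestions together, here is a sketch of the revised setup (assuming X_train, y_train, X_test hold the raw, unscaled data):
sc_X, sc_y = StandardScaler(), StandardScaler()  # separate scalers for inputs and target
X_train_s = sc_X.fit_transform(X_train)
y_train_s = sc_y.fit_transform(y_train)

model = Sequential()
model.add(Dense(64, input_dim=3, activation='tanh'))
model.add(Dense(32, activation='tanh'))
model.add(Dense(16, activation='tanh'))
model.add(Dense(1, activation='linear'))  # unbounded output, so negative values are possible
model.compile(optimizer='adam', loss='mean_squared_error')  # accuracy dropped: not meaningful for regression
model.fit(X_train_s, y_train_s, batch_size=16, epochs=25)

y_pred = sc_y.inverse_transform(model.predict(sc_X.transform(X_test)))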

How to do regression using tensorflow with series output?

I want to build a regression model with 2 output nodes using tensorflow. I found code that builds a regression model, but it has only 1 output node:
https://github.com/tensorflow/tensorflow/blob/master/tensorflow/examples/skflow/boston.py
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
from sklearn import cross_validation
from sklearn import metrics
from sklearn import preprocessing
import tensorflow as tf
from tensorflow.contrib import learn
def main(unused_argv):
    # Load dataset
    boston = learn.datasets.load_dataset('boston')
    x, y = boston.data, boston.target
    # Split dataset into train / test
    x_train, x_test, y_train, y_test = cross_validation.train_test_split(
        x, y, test_size=0.2, random_state=42)
    # Scale data (training set) to 0 mean and unit standard deviation.
    scaler = preprocessing.StandardScaler()
    x_train = scaler.fit_transform(x_train)
    # Build 2 layer fully connected DNN with 10, 10 units respectively.
    feature_columns = learn.infer_real_valued_columns_from_input(x_train)
    regressor = learn.DNNRegressor(
        feature_columns=feature_columns, hidden_units=[10, 10])
    # Fit
    regressor.fit(x_train, y_train, steps=5000, batch_size=1)
    # Predict and score
    y_predicted = list(
        regressor.predict(scaler.transform(x_test), as_iterable=True))
    score = metrics.mean_squared_error(y_predicted, y_test)
    print('MSE: {0:f}'.format(score))

if __name__ == '__main__':
    tf.app.run()
I am new to tensorflow, so I searched for code similar to what I need, but this example has only one output. In my model the input is N*1000 and the output is N*2. Is there effective and efficient code for this kind of regression? Please give me an example.
Actually, I found workable code using DNNRegressor:
import numpy as np
from sklearn.cross_validation import train_test_split
from tensorflow.contrib import learn
import tensorflow as tf
import logging
#logging.getLogger().setLevel(logging.INFO)
#Some fake data
N=200
X=np.array(range(N),dtype=np.float32)/(N/10)
X=X[:,np.newaxis]
#Y=np.sin(X.squeeze())+np.random.normal(0, 0.5, N)
Y = np.zeros([N,2])
Y[:,0] = X.squeeze()
Y[:,1] = X.squeeze()**2
X_train, X_test, Y_train, Y_test = train_test_split(X, Y,
                                                    train_size=0.8,
                                                    test_size=0.2)
reg=learn.DNNRegressor(hidden_units=[10,10])
reg.fit(X_train,Y_train[:,0],steps=500)
But this code works only if the shape of Y_train is N*1; it fails when the shape of Y_train is N*2. However, I want to build a regression model where the input is N*1000 and the output is N*2, and I can't fix it.
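The contrib DNNRegressor interface used above is long deprecated. With a current version of TensorFlow, a multi-output regression is just a Dense output layer with 2 units; here is a minimal sketch on the same kind of fake data:
import numpy as np
import tensorflow as tf

N = 200
X = (np.arange(N, dtype=np.float32) / (N / 10)).reshape(-1, 1)
Y = np.stack([X.squeeze(), X.squeeze() ** 2], axis=1)  # targets of shape N*2

model = tf.keras.Sequential([
    tf.keras.layers.Dense(10, activation='relu'),
    tf.keras.layers.Dense(10, activation='relu'),
    tf.keras.layers.Dense(2),  # 2 output nodes, one per target column
])
model.compile(optimizer='adam', loss='mse')
model.fit(X, Y, epochs=500, verbose=0)
print(model.predict(X[:3]))  # each prediction row has 2 values

The same idea scales to an N*1000 input: the Dense layers infer the input width from the data, so only the final layer's unit count needs to match the output width.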
