TensorFlow: How to correctly pass input values for prediction to the neural network - python

I'm having trouble passing values for prediction to my neural network. Here is the code snippet:
model=keras.Sequential([keras.layers.Dense(units=1, input_shape=[14])])
model.compile(optimizer='sgd', loss='mean_squared_error')
Notice my input_shape=[14].
I get errors when trying to make predictions in each of the following ways:
print(model.predict(40,8,1,2,0,2,6,10,34,40,16,23,67,25))
TypeError: predict() takes from 2 to 9 positional arguments but 15 were given
print(model.predict([40,8,1,2,0,2,6,10,34,40,16,23,67,25]))
ValueError: Error when checking input: expected dense_1_input to have shape (14,) but got array with shape (1,)
print(model.predict([[40,8,1,2,0,2,6,10,34,40,16,23,67,25]]))
ValueError: Error when checking input: expected dense_1_input to have shape (14,) but got array with shape (1,)
print(model.predict[(40,8,1,2,0,2,6,10,34,40,16,23,67,25)])
TypeError: 'method' object is not subscriptable
print(model.predict([40],[8],[1],[2],[0],[2],[6],[10],[34],[40],[16],[23],[67],[25]))
TypeError: predict() takes from 2 to 9 positional arguments but 15 were given
However, it works the following way:
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size= 0.2, shuffle=True)
pred=model.predict(X_test)
Here is a screenshot of X_test printed with print(X_test), and a snippet of my dataset (screenshots not included here).
And here is the entire code:
import glob
import os
from keras.models import Sequential, load_model
import numpy as np
import pandas as pd
from keras.layers import Dense
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder, MinMaxScaler
import matplotlib.pyplot as plt
import keras as k
import tensorflow as tf
from tensorflow import keras
from tensorflow import lite
df = pd.read_csv("kidney4.csv")
df = df.dropna(axis=0)
for column in df.columns:
    if df[column].dtype == np.number:
        continue
    df[column] = LabelEncoder().fit_transform(df[column])
X = df.drop(["classification"], axis=1)
y = df["classification"]
x_scaler = MinMaxScaler()
x_scaler.fit(X)
column_names = X.columns
X[column_names] = x_scaler.transform(X)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, shuffle=True)
model=keras.Sequential([keras.layers.Dense(units=1, input_shape=[14])])
model.compile(optimizer='sgd', loss='mean_squared_error')
model.fit(X_train, y_train, epochs=500)
for model_file in glob.glob("kidney_final_2.model"):
    print("Model file: ", model_file)
    model = load_model(model_file)
pred = model.predict(X_test)
pred = [1 if y >= 0.5 else 0 for y in pred]  # threshold: probabilities below 0.5 become 0, otherwise 1
scores = model.evaluate(X_test, y_test)
print()
print("Original : {0}".format(", ".join([str(x) for x in y_test])))
print()
print("Predicted : {0}".format(", ".join([str(x) for x in pred])))
print()
print("Scores : loss = ", scores[0], " acc = ", scores[1])
print("---------------------------------------------------------")
print()
I would appreciate any help on this. Thank you.

Good question.
The trick with model.predict() in Keras and TensorFlow is that it can only predict on batches.
Therefore, to predict on one data point (in your case an array of 14 elements), you need to simulate the batch axis: a batch of size 1, since you want to predict on one data point.
You can use NumPy to achieve this:
import numpy as np

input_array = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14])
input_array_for_prediction = np.expand_dims(input_array, axis=0)  # shape (1, 14): a batch of one sample
print(model.predict(input_array_for_prediction))
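One caveat worth adding (my note, based on the training code in the question rather than on the original answer): the model was trained on features scaled with MinMaxScaler, so a raw sample should go through the same fitted scaler before prediction. A minimal sketch, assuming the x_scaler fitted earlier is still in scope:
raw_sample = np.array([[40, 8, 1, 2, 0, 2, 6, 10, 34, 40, 16, 23, 67, 25]])  # shape (1, 14), a batch of one
scaled_sample = x_scaler.transform(raw_sample)  # same MinMaxScaler fitted on the training features
print(model.predict(scaled_sample))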

Related

Score Error: ValueError: Expected 2D array, got 1D array instead

Here's my code:
import pandas as pd
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.datasets import fetch_california_housing
california_housing = fetch_california_housing(as_frame=True)
data = california_housing.frame
X = data.drop(columns=['MedHouseVal'])
y = data['MedHouseVal']
model = LinearRegression()
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
model.fit(X_train, y_train)
predictions = model.predict(X_test)
model.score(predictions, y_test)
Here's the error message:
ValueError: Expected 2D array, got 1D array instead: array=[0.71912284
1.76401657 2.70965883 ... 4.46877017 1.18751119 2.00940251]. Reshape your data either using array.reshape(-1, 1) if your data has a single feature or array.reshape(1, -1) if it contains a single sample.
score needs to be called with the test features and labels, not the predictions:
model.fit(X_train, y_train)
model.score(X_test, y_test)
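For context (my addition, based on scikit-learn's documented behavior): LinearRegression.score(X, y) calls predict(X) internally and returns the R² of those predictions against y, which is why it expects features rather than predictions. The same value can be computed manually:
from sklearn.metrics import r2_score

print(model.score(X_test, y_test))              # R^2 computed by the estimator
print(r2_score(y_test, model.predict(X_test)))  # same value, computed manually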

Predicting the square root of a number using Machine Learning

I am trying to create a program in Python that uses machine learning to predict the square root of a number. Here is what I have done in my program:
created a csv file with numbers and their squares
extracted the data from csv into suitable variables (X stores squares, y stores numbers)
scaled the data using sklearn's StandardScaler
built the ANN with two hidden layers each of 6 units (no activation functions)
compiled the ANN using SGD as the optimizer and mean squared error as the loss function
trained the model. Loss was around 0.063
tried predicting but the result is something else.
My actual code:
import numpy as np
import tensorflow as tf
import pandas as pd
df = pd.read_csv('CSV/SQUARE-ROOT.csv')
X = df.iloc[:, 1].values
X = X.reshape(-1, 1)
y = df.iloc[:, 0].values
y = y.reshape(-1, 1)
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0, test_size=0.2)
from sklearn.preprocessing import StandardScaler
sc = StandardScaler()
X_test_sc = sc.fit_transform(X_test)
X_train_sc = sc.fit_transform(X_train)
sc1 = StandardScaler()
y_test_sc1 = sc1.fit_transform(y_test)
y_train_sc1 = sc1.fit_transform(y_train)
ann = tf.keras.models.Sequential()
ann.add(tf.keras.layers.Dense(units=6))
ann.add(tf.keras.layers.Dense(units=6))
ann.add(tf.keras.layers.Dense(units=1))
ann.compile(optimizer='SGD', loss=tf.keras.losses.MeanSquaredError())
ann.fit(x = X_train_sc, y = y_train_sc1, batch_size=5, epochs = 100)
print(sc.inverse_transform(ann.predict(sc.fit_transform([[144]]))))
Output: array([[143.99747]], dtype=float32)
Shouldn't the output be 12? Why is it giving me the wrong result?
I am attaching the csv file I used to train my model as well: SQUARE-ROOT.csv
TL;DR: You really need those nonlinearities.
The reason behind it not working could be one (or a combination) of several causes, like a bad input data range, flaws in your data, over/underfitting, etc.
However, in this specific case the model you built literally can't learn the function you're trying to approximate: having no nonlinearities makes it a purely linear model, which can't accurately approximate nonlinear functions.
A Dense layer is implemented as follows:
x_res = activ_func(w*x + b)
where x is the layer input, w the weights, b the bias vector and activ_func the activation function (if one is defined).
Your model, then, mathematically becomes (I'm using indices 1 to 3 for the three Dense layers):
pred = w3 * (w2 * ( w1 * x + b1 ) + b2 ) + b3
= w3*w2*w1*x + w3*w2*b1 + w3*b2 + b3
As you see, the resulting model is still linear.
Add activation functions and your model becomes capable of learning nonlinear functions too. From there, experiment with the hyperparameters and see how the performance of your model changes.
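As a minimal sketch (my illustration, not the answerer's code), the same stack from the question with ReLU activations added to the hidden layers:
import tensorflow as tf

ann = tf.keras.models.Sequential([
    tf.keras.layers.Dense(units=6, activation='relu'),  # nonlinearity after each hidden layer
    tf.keras.layers.Dense(units=6, activation='relu'),
    tf.keras.layers.Dense(units=1),                     # linear output, appropriate for regression
])
ann.compile(optimizer='SGD', loss=tf.keras.losses.MeanSquaredError())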
The reason your code does not work is that you apply fit_transform to your test set, which is wrong: fitting the scaler again on the test data (or on a single sample like [[144]]) computes new statistics from that data, so the inputs no longer match the distribution the network was trained on. You can fix it by replacing fit_transform(test) with transform(test). Although I don't think StandardScaler is necessary here, please try this:
import numpy as np
import tensorflow as tf
import pandas as pd
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import train_test_split
N = 10000
X = np.arange(1, N).reshape(-1, 1)
y = np.sqrt(X)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0, test_size=0.2)
sc = StandardScaler()
X_train_sc = sc.fit_transform(X_train)
#X_test_sc = sc.fit_transform(X_test) # wrong!!!
X_test_sc = sc.transform(X_test)
sc1 = StandardScaler()
y_train_sc1 = sc1.fit_transform(y_train)
#y_test_sc1 = sc1.fit_transform(y_test) # wrong!!!
y_test_sc1 = sc1.transform(y_test)
ann = tf.keras.models.Sequential()
ann.add(tf.keras.layers.Dense(units=32, activation='relu'))  # with 10,000 samples, a somewhat deeper network may help
ann.add(tf.keras.layers.Dense(units=32, activation='relu'))
ann.add(tf.keras.layers.Dense(units=32, activation='relu'))
ann.add(tf.keras.layers.Dense(units=1))
ann.compile(optimizer='SGD', loss='MSE')
ann.fit(x=X_train_sc, y=y_train_sc1, batch_size=32, epochs=100, validation_data=(X_test_sc, y_test_sc1))
#print(sc.inverse_transform(ann.predict(sc.fit_transform([[144]])))) # wrong!!!
print(sc1.inverse_transform(ann.predict(sc.transform([[144]]))))
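A side note on why the original call returned almost exactly 144 (my explanation, not part of the original answer): sc.fit_transform([[144]]) refits the scaler on that single sample, mapping it to 0, and sc.inverse_transform then adds the refitted mean of 144 back onto the network's near-zero output. Using the already-fitted scalers, as above, avoids both problems.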

ValueError: Error when checking target: expected dense_4 to have shape (1,) but got array with shape (6,)

I am building a prediction model using a chronic kidney disease dataset.
However, the shape of my X_train value doesn't seem to be valid.
I have tried to change it, but got a tuple error.
# import libraries
import glob
from keras.models import Sequential, load_model
import numpy as np
import pandas as pd
from keras.layers import Dense
from sklearn.preprocessing import LabelEncoder, MinMaxScaler
import matplotlib.pyplot as plt
import keras as k
from sklearn.model_selection import train_test_split
# load the data
from google.colab import files
uploaded = files.upload()
df = pd.read_csv('kidney_disease.csv')
#print the first 5 rows of data
df.head(5)
# create a list of column names to keep
columns_to_retain = ['sg', 'al', 'sc', 'hemo', 'pcv', 'wbcc', 'htn', 'classification']
# drop the unnecessary columns
df = df.drop( [col for col in df.columns if not col in columns_to_retain], axis=1)
#drop the rows with na or missing values
df = df.dropna(axis=0)
# transform the non-numeric data in the columns
for column in df.columns:
    if df[column].dtype == np.number:
        continue
    df[column] = LabelEncoder().fit_transform(df[column])
# split the data into independent (X) dataset and dependent (y) dataset
X = df.drop(['classification'], axis=1)
y = df['classification']
# feature scaling
#min-max scaler method scales the dataset in order that all features lies between 0 and 1
X_scaler = MinMaxScaler()
X_scaler.fit(X)
column_names = X.columns
X[column_names] = X_scaler.transform(X)
# split the data into 80% training & 20% testing
X_train, y_train, X_test, y_test = train_test_split(X, y, test_size=0.2, shuffle=True)
# build the model
model = Sequential()
model.add( Dense(256, input_dim= len(X.columns), kernel_initializer=k.initializers.random_normal(seed=13), activation ='relu') )
model.add( Dense(1, activation = 'hard_sigmoid') )
# compiling the model (the loss function measures how well the model does in training
# and tries to improve on it using the optimizer)
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
# train the model
history = model.fit(X_train, y_train, epochs = 2000, batch_size= X_train.shape[0])
#print(X_train[0:1].shape)
Do you have any idea of the root of this problem, and could you explain it to me?
Thank you in advance!
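A note on the likely culprit (my observation; this excerpt contains no accepted answer): train_test_split returns the splits in the order (X_train, X_test, y_train, y_test), so the unpacking above assigns the 6-column test features to y_train, which matches the error about a target of shape (6,). The corrected line would be:
# correct unpacking order for sklearn's train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, shuffle=True)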

InvalidArgumentError: Incompatible shapes with Keras LSTM Net

I want to predict the pressure of a machine. I have 18 input values and the pressure as output, so I have 19 columns and 7657 rows, as the database consists of 7657 time steps, each covering one second.
I have a problem with the following code:
import tensorflow as tf
import pandas as pd
from matplotlib import pyplot
from sklearn.preprocessing import MinMaxScaler
from sklearn import linear_model
from keras.models import Sequential
from keras.layers import Dense #Standard neural network layer
from keras.layers import LSTM
from keras.layers import Activation
from keras.layers import Dropout
df = pd.read_csv('Testdaten_2_Test.csv',delimiter=';')
feature_col_names=['LSDI','LZT1I', ..... ,'LZT5I']
predicted_class_names = ['LMDI']
x = df[feature_col_names].values
y = df[predicted_class_names].values
x_train_size = 6400
x_train, x_test = x[0:x_train_size], x[x_train_size:len(x)]
y_train_size = 6400
y_train, y_test = y[0:y_train_size], y[y_train_size:len(y)]
nb_model = linear_model.LinearRegression()
nb_model.fit(X=x_train, y=y_train)
nb_predict_train = nb_model.predict(x_test)
from sklearn import metrics
def scale(x, y):
    # fit scaler
    x_scaler = MinMaxScaler(feature_range=(-1, 1))
    x_scaler = x_scaler.fit(x)
    x_scaled = x_scaler.transform(x)
    # fit scaler
    y_scaler = MinMaxScaler(feature_range=(-1, 1))
    y_scaler = y_scaler.fit(y)
    y_scaled = y_scaler.transform(y)
    return x_scaler, y_scaler, x_scaled, y_scaled
x_scaler, y_scaler, x_scaled, y_scaled = scale(x, y)
x_train, x_test = x_scaled[0:x_train_size], x_scaled[x_train_size:len(x)]
y_train, y_test = y_scaled[0:y_train_size], y_scaled[y_train_size:len(y)]
x_train=x_train.reshape(x_train_size,1,18)
y_train=y_train.reshape(y_train_size,1,1)
model = Sequential()
model.add(LSTM(10, return_sequences=True,batch_input_shape=(32,1,18)))
model.add(LSTM(10,return_sequences=True))
model.add(LSTM(1,return_sequences=True, activation='linear'))
model.compile(loss='mean_squared_error', optimizer='adam', metrics=['accuracy'])
model.fit(x_train, y_train, epochs=10,batch_size=32)
score = model.evaluate(x_test, y_test,batch_size=32)
predicted = model.predict(x_test)
predicted = y_scaler.inverse_transform(predicted)
predicted = [x if x > 0 else 0 for x in predicted]
correct_values = y_scaler.inverse_transform(y_test)
correct_values = [x if x > 0 else 0 for x in correct_values]
print(nb_predict_train)
I get the error:
ValueError: Error when checking input: expected lstm_1_input to have 3 dimensions, but got array with shape (1257, 18)
after the last line of code.
I also tried to reshape the test data, but then I get a very similar error.
I think I'm missing something very easy or basic, but I can't figure it out at the moment, as I'm just a beginner at coding neural networks.
I need this for my master's thesis, so I would be very thankful if anyone could help me out.
The problem is that your model's batch_input_shape is fixed. Your test set has 1257 samples, which is not divisible by 32. It should be changed as follows:
model.add(LSTM(10, return_sequences=True,batch_input_shape=(None,1,18)))
You should also reshape the test data before calling model.evaluate:
x_test= x_test.reshape(len(x)-x_train_size,1,18)
y_test= y_test.reshape(len(y)-x_train_size,1,1)
score = model.evaluate(x_test, y_test,batch_size=32)
Of course, you also have to reshape predicted and y_test before calling inverse_transform:
predicted = model.predict(x_test)
predicted= predicted.reshape(len(y)-x_train_size,1)
y_test= y_test.reshape(len(y)-x_train_size,1)
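The key design choice here is batch_input_shape=(None, 1, 18): leaving the batch dimension as None lets Keras accept any number of samples at once, so evaluating on 1257 test rows no longer conflicts with the training batch size of 32.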

How to do regression using tensorflow with series output?

I want to build a regression model with 2 output nodes using TensorFlow. I found code that builds a regression model, but it has only 1 output node:
https://github.com/tensorflow/tensorflow/blob/master/tensorflow/examples/skflow/boston.py
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
from sklearn import cross_validation
from sklearn import metrics
from sklearn import preprocessing
import tensorflow as tf
from tensorflow.contrib import learn
def main(unused_argv):
    # Load dataset
    boston = learn.datasets.load_dataset('boston')
    x, y = boston.data, boston.target
    # Split dataset into train / test
    x_train, x_test, y_train, y_test = cross_validation.train_test_split(
        x, y, test_size=0.2, random_state=42)
    # Scale data (training set) to 0 mean and unit standard deviation.
    scaler = preprocessing.StandardScaler()
    x_train = scaler.fit_transform(x_train)
    # Build 2 layer fully connected DNN with 10, 10 units respectively.
    feature_columns = learn.infer_real_valued_columns_from_input(x_train)
    regressor = learn.DNNRegressor(
        feature_columns=feature_columns, hidden_units=[10, 10])
    # Fit
    regressor.fit(x_train, y_train, steps=5000, batch_size=1)
    # Predict and score
    y_predicted = list(
        regressor.predict(scaler.transform(x_test), as_iterable=True))
    score = metrics.mean_squared_error(y_predicted, y_test)
    print('MSE: {0:f}'.format(score))

if __name__ == '__main__':
    tf.app.run()
I am new to TensorFlow, so I searched for code similar to what I need, but this code produces only one output.
In my model, the input is N*1000 and the output is N*2. Is there effective and efficient code for this kind of regression? Please give me an example.
Actually, I found working code using DNNRegressor:
import numpy as np
from sklearn.cross_validation import train_test_split
from tensorflow.contrib import learn
import tensorflow as tf
import logging
#logging.getLogger().setLevel(logging.INFO)
#Some fake data
N=200
X=np.array(range(N),dtype=np.float32)/(N/10)
X=X[:,np.newaxis]
#Y=np.sin(X.squeeze())+np.random.normal(0, 0.5, N)
Y = np.zeros([N,2])
Y[:,0] = X.squeeze()
Y[:,1] = X.squeeze()**2
X_train, X_test, Y_train, Y_test = train_test_split(X, Y, train_size=0.8, test_size=0.2)
reg=learn.DNNRegressor(hidden_units=[10,10])
reg.fit(X_train,Y_train[:,0],steps=500)
But this code works only if the shape of Y_train is N*1; it fails when the shape of Y_train is N*2.
However, I want to build a regression model whose input is N*1000 and whose output is N*2, and I can't fix it.
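For what it's worth (my addition, using the plain Keras API rather than the contrib.learn estimator from the question): a Dense output layer with two units handles multi-output regression directly, since mean squared error averages over both output columns. A minimal sketch with the same fake data:
import numpy as np
import tensorflow as tf

# same fake data as above: X is N*1, Y is N*2
N = 200
X = (np.arange(N, dtype=np.float32) / (N / 10)).reshape(-1, 1)
Y = np.stack([X.squeeze(), X.squeeze() ** 2], axis=1)

model = tf.keras.models.Sequential([
    tf.keras.layers.Dense(10, activation='relu'),
    tf.keras.layers.Dense(10, activation='relu'),
    tf.keras.layers.Dense(2),  # two output nodes, one per target column
])
model.compile(optimizer='adam', loss='mse')
model.fit(X, Y, epochs=100, verbose=0)
print(model.predict(X[:3]))  # each row contains two predicted values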
