TFLearn "cannot feed value of shape." - python

I started with a very basic Deep Belief Network in Node.js, but it wasn't fast enough. Essentially it used an X and a Y, where each is an array of arrays; X is the training data and Y is the expected result.
So I would feed it something like var x=[[1,2,3], [1,3,2]] etc. and y=[[1,0], [1,0]]. Then I would give it some data such as [2,3,1] and it would predict the y.
I'm lost on how to do this in tflearn. I can learn on my own, but I've hit a point where I'm not sure what to even Google.
I can get the examples working if it's just a single array.
Every time I try using an array of arrays I get:
cannot feed value of shape

I was setting the input shape incorrectly for my data set. This helped a lot: http://tflearn.org/tutorials/quickstart.html
# Data loading and preprocessing
# (elided; X must be shaped (n_samples, 4) to match the input layer below)
# Building deep neural network
net = tflearn.input_data(shape=[None, 4])  # None = batch size, 4 = features per sample
net = tflearn.fully_connected(net, 32)
net = tflearn.fully_connected(net, 32)
# a softmax over a single unit always outputs 1.0; use 2 units for a
# two-class one-hot Y such as [[1, 0], [0, 1], ...]
net = tflearn.fully_connected(net, 2, activation='softmax')
net = tflearn.regression(net)
# Training
model = tflearn.DNN(net)
model.fit(X, Y, n_epoch=10, batch_size=16, show_metric=True)
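For the arrays-of-arrays setup from the question (x=[[1,2,3], [1,3,2]], y=[[1,0], [1,0]]), a minimal sketch might look like the following; the layer width and training settings are arbitrary choices of mine:
import numpy as np
import tflearn

# Each sample has 3 features; each label is a 2-class one-hot vector.
X = np.array([[1, 2, 3], [1, 3, 2]], dtype=np.float32)
Y = np.array([[1, 0], [1, 0]], dtype=np.float32)

net = tflearn.input_data(shape=[None, 3])  # None = batch dimension, 3 = features
net = tflearn.fully_connected(net, 32)
net = tflearn.fully_connected(net, 2, activation='softmax')
net = tflearn.regression(net)

model = tflearn.DNN(net)
model.fit(X, Y, n_epoch=10, batch_size=2)
print(model.predict([[2, 3, 1]]))  # one row of class scores per input sample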

Related

Tensorflow taking data size (shape) as dynamic and thus causing errors during node densing

I am training a model whose feature shape is [3751, 4], and I'd like to use the reshape and dense-layer functions built into TensorFlow to make the output labels have the shape [1, 6].
The training and testing sets are very similar; the only difference is that the testing set has fewer batches than the training set.
Right now I have two hidden layers in my model that do something like:
input_layer = tf.reshape(features["x"], [1,-1])
first_hidden_layer = tf.layers.dense(input_layer, 4, activation=tf.nn.relu)
second_hidden_layer = tf.layers.dense(first_hidden_layer, 5, activation=tf.nn.relu)
output_layer = tf.layers.dense(second_hidden_layer, 6, activation=tf.nn.relu)
This network structure is defined in a function (the model_fn) that both the training and evaluation phases use.
Partial training code looks like:
nn = tf.estimator.Estimator(model_fn=model_fn, params=model_params, model_dir='/tmp/nmos_self_define')
train_input_fn = tf.estimator.inputs.numpy_input_fn(
    x={"x": train_features_numpy},
    y=train_labels_numpy,
    batch_size=3751,
    num_epochs=None,
    shuffle=False)
# Train
nn.train(input_fn=train_input_fn, max_steps=5000)
And the testing part looks like:
test_input_fn = tf.estimator.inputs.numpy_input_fn(
    x={"x": test_features_numpy},
    y=test_labels_numpy,
    batch_size=3751,
    num_epochs=1,
    shuffle=False)
ev = nn.evaluate(input_fn=test_input_fn)
print("Loss: %s" % ev["loss"])
print("Root Mean Squared Error: %s" % ev["rmse"])
During training there is no problem: the function reshapes the input data and the dense layers work. During testing, however, the reshape produces a tensor of shape [1, ?] rather than the [1, 15004] seen during training, and the tf.layers.dense calls fail because the actual size of the last dimension is unknown.
The only difference between training and testing, from my perspective, is num_epochs, but that shouldn't affect the input shape, right? I don't understand why TensorFlow can reshape the tensor to concrete dimensions during training while it treats the testing input as dynamic.
Please help, and thanks for taking the time to read my question.
What you are doing is flattening the input of multiple batches into a single feature vector of size 15004. What you most probably want is to reduce your features to a 2-D tensor of shape (Batches, Nr Features), where Batches is dynamic. There are two common ways to do this. The easiest is to use the flatten layer from tf.contrib, like this:
input_layer = tf.contrib.layers.flatten(features["x"])
or you can reshape in such a way that the batch dimension is still dynamic, but then you have to calculate the shape of your input like this:
num_dimensions = features["x"].shape.as_list()[1] * features["x"].shape.as_list()[2] ...
input_layer = tf.reshape(features["x"], [-1, num_dimensions])
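As a small illustration of the difference (TF 1.x, with a hypothetical 3-D feature tensor of shape (batch, 4, 7)), flatten keeps the batch dimension dynamic while giving dense a known feature size:
import tensorflow as tf

# The batch dimension stays dynamic (None) all the way through.
x = tf.placeholder(tf.float32, shape=[None, 4, 7])
flat = tf.contrib.layers.flatten(x)                      # -> shape (?, 28)
dense = tf.layers.dense(flat, 5, activation=tf.nn.relu)  # works: 28 is static
print(flat.get_shape())                                  # (?, 28)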

Neural network: estimating sine wave frequency

With an objective of learning Keras LSTM and RNNs, I thought to create a simple problem to work on: given a sine wave, can we predict its frequency?
I wouldn't expect a simple neural network to be able to predict the frequency, given that the notion of time is important here. However, even with LSTMs I am unable to learn the frequency; the model only learns to output a trivial zero as the estimated frequency (even on training samples).
Here's the code to create the train set.
import numpy as np
import matplotlib.pyplot as plt
def create_sine(frequency):
    return np.sin(frequency * np.linspace(0, 2 * np.pi, 2000))
train_x = np.array([create_sine(x) for x in range(1, 300)])
train_y = list(range(1, 300))
Now, here's a simple neural network for this example.
from keras.models import Model
from keras.layers import Dense, Input, LSTM
input_series = Input(shape=(2000,),name='Input')
dense_1 = Dense(100)(input_series)
pred = Dense(1, activation='relu')(dense_1)
model = Model(input_series, pred)
model.compile('adam','mean_absolute_error')
model.fit(train_x[:100], train_y[:100], epochs=100)
As expected, this NN doesn't learn anything useful. Next, I tried a simple LSTM example.
input_series = Input(shape=(2000,1),name='Input')
lstm = LSTM(100)(input_series)
pred = Dense(1, activation='relu')(lstm)
model = Model(input_series, pred)
model.compile('adam','mean_absolute_error')
model.fit(train_x[:100].reshape(100, 2000, 1), train_y[:100], epochs=100)
However, this LSTM based model also doesn't learn anything useful.
Why doesn't it learn?
You think it's a simple problem to train an RNN on, but your setup isn't easy for the network at all:
As already mentioned, there's a lack of training samples. You throw a lot of data at it (300 * 2000 points), but each actual target (frequency) is seen by the network only once. Even if the network does learn something, there's a high chance it will overfit.
Inconsistent data. Remember that RNNs are good at capturing similar patterns in series data. For instance, in NLP all sentences in the corpus are governed by the same language rules, and more sentences help the RNN to understand these rules better, i.e., more data helps.
In your case, series with different frequencies aren't much alike: compare the sine with frequency=1 and frequency=100. This kind of diversity in the data makes it harder to learn, not easier. It doesn't mean the frequency is impossible for an RNN to learn; it simply means you shouldn't be surprised that a trivial RNN like yours has a hard time.
Data scale. Changing the frequency from 1 to 300 changes the scale of both x and y by two orders of magnitude, which may be problematic for any neural network.
Solution
Since your goal is mostly educational, I solved the second and third items simply by limiting the target frequency to 10, so that scaling and distribution diversity aren't much of an issue (you are welcome to try different values here: you should see that increasing this one parameter to, say, 50 makes the task much more complex).
The first item is solved by giving the RNN 10 examples of each frequency instead of just one. I've also added one more hidden layer to increase network flexibility, plus a simple regularizer (a Dropout layer).
The complete code:
import numpy as np
from keras.models import Model
from keras.layers import Input, Dense, Dropout, LSTM
max_freq = 10
time_steps = 100
def create_sine(frequency, offset):
    return np.sin(frequency * np.linspace(offset, 2 * np.pi + offset, time_steps))
train_y = list(range(1, max_freq)) * 10
train_x = np.array([create_sine(freq, np.random.uniform(0,1)) for freq in train_y])
train_y = np.array(train_y)
input_series = Input(shape=(time_steps, 1), name='Input')
lstm = LSTM(units=100)(input_series)
hidden = Dense(units=100, activation='relu')(lstm)
dropout = Dropout(rate=0.1)(hidden)
output = Dense(units=1, activation='relu')(dropout)
model = Model(input_series, output)
model.compile('adam', 'mean_squared_error')
model.fit(train_x.reshape(-1, time_steps, 1), train_y, epochs=200)
# Trying the network on the same data
test_x = train_x.reshape(-1, time_steps, 1)
test_y = train_y
predicted = model.predict(test_x).reshape([-1])
print()
print((predicted - train_y)[:12])
print(np.mean(np.abs(predicted - train_y)))
The output:
max_freq=10
[-0.05612183 -0.01982236 -0.03744316 -0.02568841 -0.11959982 -0.0770483
0.04643679 0.12057972 -0.00625324 -0.00724655 -0.16919005 -0.04512954]
0.0503574344847
max_freq=20 (everything else is the same)
[ 0.51365542 0.09269333 -0.009691 0.0619092 0.09852839 0.04378462
0.01430321 -0.01953268 0.00722599 0.02558327 -0.04520988 -0.0614748 ]
0.146024380232
max_freq=30 (everything else is the same)
[-0.28205156 -0.28922796 -0.00569081 -0.21314907 0.1068716 0.23497915
0.23975039 0.25955486 0.26333141 0.24235058 0.08320332 -0.03686047]
0.406703719805
Note that results vary between runs, and increasing max_freq increases the chance of divergence. But even when training converges, the performance doesn't improve despite the extra data; instead it degrades, and quickly.
The number of samples per data item is very low, just one per frequency. Add some small noise and use more data, normalize the output data to the [-1, 1] range, and then try again.
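A rough sketch of that last suggestion, assuming train_y is the list of target frequencies; the scaling is linear into [-1, 1] and must be inverted on predictions:
import numpy as np

train_y = np.array(train_y, dtype=np.float32)
y_min, y_max = train_y.min(), train_y.max()

# map the targets linearly into [-1, 1]
train_y_scaled = 2.0 * (train_y - y_min) / (y_max - y_min) - 1.0

# after training on train_y_scaled, undo the scaling on predictions:
# frequency = (pred + 1.0) / 2.0 * (y_max - y_min) + y_min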
As you said, you want to predict the frequency, and you want to use an LSTM. First we generate enough data to train on, then we build the network. Sorry my example is not in Keras; I'm using tflearn.
import numpy as np
import tflearn
from random import shuffle
# parameters
n_input=100
n_train=2000
n_test = 500
# generate data
xs=[]
ys=[]
frequencies = np.linspace(1,50,n_train+n_test)
shuffle(frequencies)
t=np.linspace(0,2*np.pi,n_input)
for freq in frequencies:
    xs.append(np.sin(t * freq))
    ys.append(freq)
xs_train=np.array(xs[:n_train]).reshape(n_train,n_input,1)
xs_test=np.array(xs[n_train:]).reshape(n_test,n_input,1)
ys_train = np.array(ys[:n_train]).reshape(-1,1)
ys_test = np.array(ys[n_train:]).reshape(-1,1)
# LSTM network prediction
net = tflearn.input_data(shape=[None, n_input, 1])
net = tflearn.lstm(net, 10)
net = tflearn.fully_connected(net, 100, activation="relu")
net = tflearn.fully_connected(net, 1)
net = tflearn.regression(net, optimizer='adam', loss='mean_square')
model = tflearn.DNN(net)
model.fit(xs_train, ys_train, n_epoch=100)
print(np.hstack((model.predict(xs_test),ys_test))[:10])
# [[ 13.08494568 12.76470588]
# [ 22.23135376 21.98039216]
# [ 39.0812912 37.58823529]
# [ 15.77548409 15.66666667]
# [ 26.57996941 25.58823529]
# [ 26.57759476 25.11764706]
# [ 16.42217445 15.8627451 ]
# [ 32.55020905 30.80392157]
# [ 44.16622925 43.01960784]
# [ 26.18071365 25.45098039]]
If you have the data in this form, you don't actually need an LSTM; you can easily replace the LSTM part with a plain deep neural network:
# Deep network instead of LSTM
# note: the dense network expects 2-D input, so drop the trailing channel axis
net = tflearn.input_data(shape=[None, n_input])
net = tflearn.fully_connected(net, 100)
net = tflearn.fully_connected(net, 100)
net = tflearn.fully_connected(net, 1)
net = tflearn.regression(net, optimizer='adam', loss='mean_square')
model = tflearn.DNN(net)
model.fit(xs_train.reshape(n_train, n_input), ys_train)
print(np.hstack((model.predict(xs_test.reshape(n_test, n_input)), ys_test))[:10])
Both versions will print the predicted frequency next to the true value. I also created a gist with the program.

Non-linear classification with tensorflow

I am new to machine learning and TensorFlow and want to do a simple 2-dimensional classification on data that cannot be linearly separated.
On the left side, you can see the training data for the model.
The right side shows what the trained model predicts.
For now I am deliberately overfitting my model, so every possible input is fed to the model.
My expected result would be very high accuracy, as the model already 'knows' each answer.
Unfortunately, the deep neural network I am using is only able to separate with a linear boundary, which doesn't fit my data.
This is how I train my Model:
def testDNN(data):
    """
    * data is a list of tuples (x, y, b),
    * where (x, y) is the input vector and b is the expected output
    """
    # Build neural network
    net = tflearn.input_data(shape=[None, 2])
    net = tflearn.fully_connected(net, 100)
    net = tflearn.fully_connected(net, 100)
    net = tflearn.fully_connected(net, 100)
    net = tflearn.fully_connected(net, 2, activation='softmax')
    net = tflearn.regression(net)
    # Define model
    model = tflearn.DNN(net)
    # check if we already have a trained model
    # Start training (apply gradient descent algorithm)
    model.fit(
        [(x, y) for (x, y, b) in data],
        [([1, 0] if b else [0, 1]) for (x, y, b) in data],
        n_epoch=2, show_metric=True)
    return lambda x, y: model.predict([[x, y]])[0][0]
Most of it is taken from the tflearn examples, so I don't understand exactly what every line does.
You need an activation function in your network for non-linearity; the activation function is what lets a neural network fit non-linear functions. tflearn's layers use a linear activation by default; you could change this to 'sigmoid' and see if the results improve.
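As a minimal sketch, here is the same stack with a non-linear activation on each hidden layer ('sigmoid' as suggested above; 'relu' or 'tanh' are common alternatives):
# Build neural network with non-linear hidden activations
net = tflearn.input_data(shape=[None, 2])
net = tflearn.fully_connected(net, 100, activation='sigmoid')
net = tflearn.fully_connected(net, 100, activation='sigmoid')
net = tflearn.fully_connected(net, 100, activation='sigmoid')
net = tflearn.fully_connected(net, 2, activation='softmax')
net = tflearn.regression(net)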

Getting weird values on predicting MNIST dataset

I am using TFLearn with the MNIST data. I trained my neural network to 0.96 accuracy, but now I am not sure how to predict a value.
Here is my code:
# getting the MNIST data onto the machine
import tensorflow as tf
import tflearn
import tflearn.datasets.mnist as mnist

mnist.SOURCE_URL = 'https://web.archive.org/web/20160117040036/http://yann.lecun.com/exdb/mnist/'
trainX, trainY, testX, testY = mnist.load_data(one_hot=True)
# Define the neural network
def build_model():
    # This resets all parameters and variables
    tf.reset_default_graph()
    net = tflearn.input_data([None, 784])
    net = tflearn.fully_connected(net, 100, activation='ReLU')
    net = tflearn.fully_connected(net, 10, activation='softmax')
    net = tflearn.regression(net, optimizer='sgd', learning_rate=0.1, loss='categorical_crossentropy')
    # This model assumes that your network is named "net"
    model = tflearn.DNN(net)
    return model
# Build the model
model = build_model()
model.fit(trainX, trainY, validation_set=0.1, show_metric=True, batch_size=100, n_epoch=8)
# Here is the problem:
# say I want to predict what my neural network replies when I feed it
# a value from my trainX; the value of trainX[2] is a 4
pred = model.predict([trainX[2]])
print(pred)
#What I get is
[[2.6109733880730346e-05, 4.549271125142695e-06, 1.8098366126650944e-05, 0.003199575003236532, 0.20630565285682678, 0.0003870908112730831, 4.902480941382237e-05, 0.006617342587560415, 0.018498118966817856, 0.764894425868988]]
what I want is -> 4
The problem is that I am not sure how to use this predict function and put in the trainX value to get a prediction.
The prediction of TensorFlow gives you a probabilistic output, one score per class. It is sufficient to take the label with the maximum probability from pred to get the prediction of the network.
pred = np.argmax(pred, axis=1)
Which in this case is not 4, but 9.
Here np is the numpy module, imported as import numpy as np, but feel free to replace it with tf.argmax(pred, 1) to use TensorFlow's argmax instead.
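Applied to the scores printed above (copied here, rounded for readability), argmax indeed picks index 9:
import numpy as np

# the scores returned by model.predict for trainX[2]
pred = np.array([[2.6e-05, 4.5e-06, 1.8e-05, 0.0032, 0.2063,
                  0.00039, 4.9e-05, 0.0066, 0.0185, 0.7649]])
print(np.argmax(pred, axis=1))  # [9]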
You are getting a 9, which looks quite similar to a 4.
What model.predict returns are scores, and while the 5th value in the results array (the 5th value corresponds to the digit 4, since counting starts at zero) gets a relatively high score of about 0.21 (the second highest), your model gives the last digit (9) the highest score, about 0.76. It just means your classifier is a bit wrong here, so you should consider using a different one or playing with the hyper-parameters.

How to pass a TensorFlow tensor to a TFLearn model

I am working on a project in TensorFlow that performs operations on already-trained machine learning models. Following the tutorial TFLearn Quickstart, I built a deep neural network that predicts survival from the Titanic Dataset. I would like to use the TFLearn model in the same way that I would use a TensorFlow model.
The TFLearn docs homepage says
Full transparency over Tensorflow. All functions are built over tensors and can be used independently of TFLearn
This makes me think that I should be able to pass tensors as inputs, etc., to the TFLearn model.
# Build neural network
net = tflearn.input_data(shape=[None, 6])
net = tflearn.fully_connected(net, 32)
net = tflearn.fully_connected(net, 32)
net = tflearn.fully_connected(net, 2, activation='softmax')
net = tflearn.regression(net)
# Define model
model = tflearn.DNN(net)
# Start training (apply gradient descent algorithm)
model.fit(data, labels, n_epoch = 10, batch_size = 16, show_metric = False)
test = preprocess([[3, 'Jack Dawson', 'male', 19, 0, 0, 'N/A', 5.0000]], to_ignore)
# Make into a tensor
testTF = [tf.constant(i) for i in test]
# Pass the tensor into the predictor
print(model.predict([testTF]))
At present, when I pass a tensor into the model I am greeted with ValueError: setting an array element with a sequence.
Specifically, how can you pass tensors into a TFLearn model?
Generally, what limits are placed on how I can use tensors on a TFLearn model?
I don't know if you're still looking for an answer to your problem, but I think the issue is on your very last line:
print(model.predict([testTF]))
Try this instead:
print(model.predict(testTF))
I think you nested a list inside another list: testTF is already a list, so wrapping it in another pair of brackets gives predict a nested structure it can't convert to an array (hence "setting an array element with a sequence"). This isn't a TFLearn issue per se. Hope that helps.
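For reference, DNN.predict ultimately wants a plain 2-D array-like of shape (n_samples, n_features); a small sketch assuming test holds a single 6-feature row of plain numbers rather than tf.constant tensors:
import numpy as np

# one preprocessed sample, reshaped to (1, 6): one row, six features
sample = np.array(test, dtype=np.float32).reshape(1, -1)
print(model.predict(sample))  # one row of two class scores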
