Predicting values using TFLearn neural networks - python

I am new to TFLearn and I am trying out a simple neural network to predict the output array value when an input array is given.
Actual input for this code would be either pixel values of a grayscale image or features extracted from a grayscale image. Hence the input is in a 2d array format. The output would be the predicted color for each pixel.
In the example code I have used two random arrays of size 9. I need to train the network to predict the 't_y' array when 't_x' array is given as input.
The code runs, but the prediction is very poor.
The code has been adapted from the TFLearn MNIST example.
This is my code:
from random import randint
import numpy as np
import tflearn
from tflearn.layers.core import input_data, dropout, fully_connected
from tflearn.layers.conv import conv_2d, max_pool_2d
from tflearn.layers.normalization import local_response_normalization
from tflearn.layers.estimator import regression
#input
t_x = [3, 8, 7, 4, 0, 7, 9, 5, 1]
#output
t_y = [9, 5, 1, 4, 7, 9, 7, 3, 6]
x = []
y = []
for i in range(1000):
    x.append(t_x)
    y.append(t_y)
#array of input values
x = np.reshape(x,(-1,3,3,1))
#array of output values
y = np.reshape(y,(-1,9))
network = input_data(shape=[None, 3, 3, 1], name='input')
network = conv_2d(network, 32, 3, activation='relu', regularizer="L2")
network = max_pool_2d(network, 2)
network = local_response_normalization(network)
network = conv_2d(network, 64, 3, activation='relu', regularizer="L2")
network = max_pool_2d(network, 2)
network = local_response_normalization(network)
network = fully_connected(network, 128, activation='tanh')
network = dropout(network, 0.8)
network = fully_connected(network, 256, activation='tanh')
network = dropout(network, 0.8)
network = fully_connected(network, 9, activation='softmax')
network = regression(network, optimizer='adam', learning_rate=0.01,
                     loss='categorical_crossentropy', name='target')
# Training
model = tflearn.DNN(network, tensorboard_verbose=0)
model.fit({'input': x}, {'target': y}, n_epoch=20)
pred = model.predict(np.reshape(t_x,(-1,3,3,1)))
print "Prediction :", pred[0]
I am assuming it has something to do with the parameter values specified in the 'conv_2d' and 'fully_connected' functions.
What values would I have to set to get an accurate prediction ?

Format of output
The last layer of your code (fully_connected(network, 9, activation='softmax')) produces 9 neurons with a softmax function, i.e. normalised so that their outputs sum to 1. This is what you use (as in MNIST) to select one of 9 possible classes: the network will output something like [0.01 0.01 0.01 0.9 0.03 0.01 0.01 0.01 0.01], "predicting" that the fourth class is the correct one, and this is matched against a one-hot target vector (e.g. [0 0 0 1 0 0 0 0 0]).
A softmax output can never equal [9, 5, 1, 4, 7, 9, 7, 3, 6], or even come close to it, since all softmax outputs together add up to 1. Even the layer before that cannot output such values, since tanh only produces values between -1 and 1 and can never reach 9.
If you want to predict 9 numbers in the range 1-9, then you might want to use a plain fully connected output layer (e.g. with a linear or sigmoid activation) instead of softmax, and scale your targets so that the expected output lies in the range 0 to 1. A regression loss such as mean squared error would then fit better than categorical_crossentropy. There's more to it, but this would be a good start.
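To make the point about the output format concrete, here is a minimal pure-Python sketch (it is not TFLearn code). The `softmax` helper and the choice of dividing by 9 to scale the targets are illustrative assumptions; any scaling that maps the targets into the 0 to 1 range would serve the same purpose.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats (illustrative helper)."""
    shift = max(values)
    exps = [math.exp(v - shift) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


t_y = [9, 5, 1, 4, 7, 9, 7, 3, 6]

# Whatever the raw scores are, softmax outputs always sum to 1,
# so no single output can get anywhere near a target such as 9.
probs = softmax([float(v) for v in t_y])
print("sum of softmax outputs:", sum(probs))
print("largest softmax output:", max(probs))

# Assumed scaling: divide by the largest possible value (9) so that
# every target lies in [0, 1]; multiply back after predicting.
scaled = [v / 9.0 for v in t_y]
restored = [round(s * 9.0) for s in scaled]
print("scaled targets:", scaled)
print("restored targets:", restored)
```

The network would then be trained on `scaled` instead of `t_y`, and its predictions multiplied by 9 to get back to the original range.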

Related

Masking in LSTM with variable length input does not work

I'm building an LSTM model with variable-length arrays as input. Many resources recommend padding, i.e. inserting 0s until all input arrays have the same length, and then applying Masking so that the model ignores the 0s.
However, after many training runs, I feel that Masking does not work as expected: the padded 0s in the input still affect the model's predictions.
After concatenating all sequences into one array, my training data looks like below without padding:
X y
[1 2 3] 4
[2 3 4] 5
[3 4 5] 6
...
My python implementation:
import numpy as np
import pandas as pd
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, LSTM, Masking
from tensorflow.keras.preprocessing.sequence import pad_sequences
from tensorflow.keras.preprocessing.sequence import TimeseriesGenerator
""" Raw Training Input """
arr = np.array([
[1, 2, 3, 4, 5, 6],
[5, 6, 7],
[11, 12, 13, 14]
], dtype=object)
timesteps = 3
n_features = 1
maxlen = 6
""" Padding """
padded_arr = pad_sequences(arr, maxlen=maxlen, padding='pre', truncating='pre')
""" Concatenate all sequences into one array """
sequence = np.concatenate(padded_arr)
sequence = sequence.reshape((len(sequence), n_features))
# print(sequence)
""" Training Data Generator """
generator = TimeseriesGenerator(sequence, sequence, length=timesteps, batch_size=1)
""" Check Generator """
for i in range(len(generator)):
    x, y = generator[i]
    print('%s => %s' % (x, y))
""" Build Model """
model = Sequential()
model.add(Masking(mask_value=0.0, input_shape=(timesteps, n_features))) # masking to ignore padded 0
model.add(LSTM(1024, activation='relu', input_shape=(timesteps, n_features), return_sequences=False))
model.add(Dense(1))
model.compile(optimizer='adam', loss='mse')
model.fit(generator, steps_per_epoch=1, epochs=1000, verbose=10)
""" Prediction """
x_input = np.array([2,3,4]).reshape((1, timesteps, n_features))
yhat = model.predict(x_input, verbose=0)
print(yhat) # here I'm expecting 5 because the input is [2, 3, 4]
For the prediction, I input [2,3,4], and most of the time I get values very far from the expected value (= 5).
I'm wondering whether I missed something or whether the model architecture is simply not tuned correctly.
I want to understand why the model is not predicting correctly. Is the Masking the issue or is it something else?
The problem is that the batch size is 1 and there is only one step per epoch, so each weight update is based on a single window and no meaningful gradient can be calculated. Put all the training data into one batch (the 18 padded values with 3 timesteps yield 15 windows), and you should get good results:
""" Training Data Generator """
generator = TimeseriesGenerator(sequence, sequence, length=timesteps,
                                batch_size=15)
Alternatively, you could keep the batch size at 1 but change steps_per_epoch to len(generator). That seems to work with the Adam optimizer, but not with SGD, and it is much slower.
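To see where the batch size of 15 comes from, the windowing can be reproduced without Keras. This is only a sketch: `pad_pre` is a hypothetical helper that mimics what `pad_sequences(..., padding='pre', truncating='pre')` does for this data, and the list comprehension imitates the sliding windows that `TimeseriesGenerator` produces with `length=timesteps`.

```python
def pad_pre(seq, maxlen, value=0):
    """Left-pad (and left-truncate) a sequence to maxlen (hypothetical helper)."""
    seq = list(seq)[-maxlen:]
    return [value] * (maxlen - len(seq)) + seq


raw = [
    [1, 2, 3, 4, 5, 6],
    [5, 6, 7],
    [11, 12, 13, 14],
]
maxlen = 6
timesteps = 3

# Pad every sequence to the same length, then concatenate them,
# as the question's code does.
padded = [pad_pre(s, maxlen) for s in raw]
sequence = [v for row in padded for v in row]

# Each window of `timesteps` values is paired with the value that follows it.
windows = [
    (sequence[i:i + timesteps], sequence[i + timesteps])
    for i in range(len(sequence) - timesteps)
]

print("number of windows:", len(windows))
for x, y in windows:
    print(x, "=>", y)
```

Printing the windows also shows that, because the padded sequences are concatenated, some windows straddle two sequences (for instance, `[4, 5, 6]` is paired with a padded `0`), which may be worth keeping in mind when judging the predictions.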

How to prepare the inputs in Keras implementation of Wavenet for time-series prediction

In the Keras implementation of WaveNet, the input shape is (None, 1). I have a time series (val(t)), and the goal is to predict the next data point given a window of past values (the window size depends on the maximum dilation). The input shape in WaveNet is confusing. I have a few questions about it:
How does Keras figure out the input dimension (None) when a full sequence is given? According to the dilations, we want the input to have a length of 2^8.
If an input series of shape (1M, 1) is given as training X, do we need to generate vectors of 2^8 time steps as input? It seems we can just use the input series as the input of WaveNet (I am not sure why the raw time series input does not give an error).
In general, how can we debug such Keras networks? I tried to apply a layer to numerical data, e.g. Conv1D(16, 1, padding='same', activation='relu')(inputs), but it gives an error.
#
n_filters = 32
filter_width = 2
dilation_rates = [2**i for i in range(7)] * 2
from keras.models import Model
from keras.layers import Input, Conv1D, Dense, Activation, Dropout, Lambda, Multiply, Add, Concatenate
from keras.optimizers import Adam
history_seq = Input(shape=(None, 1))
x = history_seq
skips = []
for dilation_rate in dilation_rates:
    # preprocessing - equivalent to time-distributed dense
    x = Conv1D(16, 1, padding='same', activation='relu')(x)
    # filter
    x_f = Conv1D(filters=n_filters,
                 kernel_size=filter_width,
                 padding='causal',
                 dilation_rate=dilation_rate)(x)
    # gate
    x_g = Conv1D(filters=n_filters,
                 kernel_size=filter_width,
                 padding='causal',
                 dilation_rate=dilation_rate)(x)
    # combine filter and gating branches
    z = Multiply()([Activation('tanh')(x_f),
                    Activation('sigmoid')(x_g)])
    # postprocessing - equivalent to time-distributed dense
    z = Conv1D(16, 1, padding='same', activation='relu')(z)
    # residual connection
    x = Add()([x, z])
    # collect skip connections
    skips.append(z)
# add all skip connection outputs
out = Activation('relu')(Add()(skips))
# final time-distributed dense layers
out = Conv1D(128, 1, padding='same')(out)
out = Activation('relu')(out)
out = Dropout(.2)(out)
out = Conv1D(1, 1, padding='same')(out)
# extract training target at end
def slice(x, seq_length):
    return x[:, -seq_length:, :]
pred_seq_train = Lambda(slice, arguments={'seq_length':1})(out)
model = Model(history_seq, pred_seq_train)
model.compile(Adam(), loss='mean_absolute_error')
You are using extreme values for the dilation rate; they don't make sense. Try to reduce them using, for example, the sequence [1, 2, 4, 8, 16, 32]. The dilation rates are not a constraint on the dimension of the input passed.
Your network works simply by passing this input:
import numpy as np
n_filters = 32
filter_width = 2
dilation_rates = [1, 2, 4, 8, 16, 32]
....
model = Model(history_seq, pred_seq_train)
model.compile(Adam(), loss='mean_absolute_error')
n_sample = 5
time_step = 100
X = np.random.uniform(0,1, (n_sample,time_step,1))
model.predict(X)
Specifying a None dimension in Keras means leaving the model free to receive any size along that dimension. This does not mean you can pass samples of various sizes together: within one array they must always have the same shape. You can build the model each time with a different dimension size:
for time_step in np.random.randint(100,200, 4):
    print('temporal dim:', time_step)
    n_sample = 5
    model = Model(history_seq, pred_seq_train)
    model.compile(Adam(), loss='mean_absolute_error')
    X = np.random.uniform(0,1, (n_sample,time_step,1))
    print(model.predict(X).shape)
I also suggest a premade Keras library that provides a WaveNet implementation: https://github.com/philipperemy/keras-tcn. You can use it as a baseline and also study its code to build a WaveNet.
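As a rough check on how many past time steps the last output can actually see, the receptive field of a stack of dilated causal convolutions can be computed by hand. The `receptive_field` helper below is a hypothetical name, and the formula assumes stride 1 and counts only the dilated layers (the kernel-size-1 convolutions do not widen it).

```python
def receptive_field(kernel_size, dilation_rates):
    """Time steps visible to one output of stacked dilated causal convolutions.

    Assumes stride 1; each layer adds (kernel_size - 1) * dilation steps.
    """
    return 1 + sum((kernel_size - 1) * d for d in dilation_rates)


filter_width = 2

# Dilation rates from the question: 1, 2, ..., 64, repeated twice.
question_rates = [2 ** i for i in range(7)] * 2
# Dilation rates suggested in the answer.
answer_rates = [1, 2, 4, 8, 16, 32]

rf_question = receptive_field(filter_width, question_rates)
rf_answer = receptive_field(filter_width, answer_rates)

print("receptive field with the question's rates:", rf_question)
print("receptive field with the answer's rates:", rf_answer)
```

The receptive field describes how much history can influence a prediction, not a required input length: with padding='causal', Keras pads the start of shorter inputs with zeros, which is why a raw series of another length does not raise an error.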

Sum of neighbors in tensorflow

I have a tensorflow model with my truth data in the shape (N, 32, 32, 5), i.e. 32x32 images with 5 channels.
Inside the loss function I would like to calculate, for each pixel, the sum of the values of the neighboring pixels for each channel, generating a new (N, 32, 32, 5) tensor.
The tf.nn.pool function does something similar but not exactly what I need. I was trying to see if tf.nn.conv2d could get me there but I'm not sure what I'd need to use as the filter parameter in this case.
Is there a specific function for this? Or can I use conv2d somehow?
You can do that with tf.nn.separable_conv2d like this
import tensorflow as tf
input = tf.placeholder(tf.float32, [None, 32, 32, 5])
# Depthwise filter adds the neighborhood of each pixel per channel
depthwise_filter = tf.ones([3, 3, 5, 1], input.dtype)
# Pointwise filter does not do anything
pointwise_filter = tf.eye(5, batch_shape=[1, 1], dtype=input.dtype)
output = tf.nn.separable_conv2d(input, depthwise_filter, pointwise_filter,
                                strides=[1, 1, 1, 1], padding='SAME')
print(output.shape)
# (?, 32, 32, 5)
The following method using tf.nn.conv2d is also equivalent:
import tensorflow as tf
input = tf.placeholder(tf.float32, [None, 32, 32, 5])
# Each filter adds the neighborhood for a different channel
filter = tf.eye(5, batch_shape=[3, 3], dtype=input.dtype)
output = tf.nn.conv2d(input, filter, strides=[1, 1, 1, 1], padding='SAME')
A convolutional layer with a 3x3 filter whose weights are all initialized to 1 will do the job, provided it is applied per channel (as a depthwise convolution) so that the 5 channels are not summed together. Be careful to declare this special filter as a non-trainable variable; otherwise your optimizer will change its contents. Additionally, set padding to "SAME" to get an output of the same size from that convolutional layer. In that case, positions outside the image count as zero-valued neighbors for the pixels at the edges.
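For checking the result of such a convolution on a small example, the same sum can be written out in plain Python for a single channel. This is only a reference sketch: `neighbor_sum` is a hypothetical helper, not a TensorFlow function. It treats positions outside the image as zeros, as "SAME" padding does. Note that an all-ones 3x3 filter includes the pixel itself in the sum; the `include_center` flag shows the variant that leaves it out, which in the convolution would correspond to setting the middle filter weight to 0.

```python
def neighbor_sum(grid, include_center=True):
    """Sum of the 3x3 neighborhood of every cell in a 2D list (one channel).

    Cells outside the grid contribute nothing, like zero padding.
    """
    rows, cols = len(grid), len(grid[0])
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            total = 0
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    if dr == 0 and dc == 0 and not include_center:
                        continue
                    rr, cc = r + dr, c + dc
                    if 0 <= rr < rows and 0 <= cc < cols:
                        total += grid[rr][cc]
            out[r][c] = total
    return out


ones = [[1, 1, 1] for _ in range(3)]
with_center = neighbor_sum(ones)
without_center = neighbor_sum(ones, include_center=False)

for row in with_center:
    print(row)
for row in without_center:
    print(row)
```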

Pytorch LSTM grad only on last output

I'm working with sequences of different lengths, but I only want to compute gradients based on the output computed at the end of each sequence.
The samples are ordered so that they are decreasing in length and they are zero-padded. For 5 1D samples it looks like this (omitting width dimension for visibility):
array([[5, 7, 7, 4, 5, 8, 6, 9, 7, 9],
[6, 4, 2, 2, 6, 5, 4, 2, 2, 0],
[4, 6, 2, 4, 5, 1, 3, 1, 0, 0],
[8, 8, 3, 7, 7, 7, 9, 0, 0, 0],
[3, 2, 7, 5, 7, 0, 0, 0, 0, 0]])
For the LSTM I'm using nn.utils.rnn.pack_padded_sequence with the individual sequence lengths:
x = nn.utils.rnn.pack_padded_sequence(x, [10, 9, 8, 7, 5], batch_first=True)
The initialization of LSTM in the Model constructor:
self.lstm = nn.LSTM(width, n_hidden, 2)
Then I call the LSTM and unpack the values:
x, _ = self.lstm(x)
x = nn.utils.rnn.pad_packed_sequence(x1, batch_first=True)
Then I'm applying a fully connected layer and a softmax
x = x.contiguous()
x = x.view(-1, n_hidden)
x = self.linear(x)
x = x.reshape(batch_size, n_labels, 10) # 10 is the sample height
return F.softmax(x, dim=1)
This gives me an output of shape batch x n_labels x height (5x12x10).
For each sample, I only want to use a single score, the one for the last output, giving a shape of batch x n_labels (5x12). My question is: how can I achieve this?
One idea is to apply tanh on the last hidden layer returned from the model but I'm not quite sure if that would give the same results. Is it possible to efficiently extract the output computed at the end of the sequence eg using the same lengths sequence used for pack_padded_sequence?
As Neaabfi answered, hidden[-1] is correct. To be more specific to your question, as the docs state:
output, (h_n, c_n) = self.lstm(x_pack) # batch_first = True
# h_n is a tensor of shape (num_layers * num_directions, batch, hidden_size)
In your case, you have a stack of 2 LSTM layers with only the forward direction, so:
h_n shape is (num_layers, batch, hidden_size)
You will probably want the hidden state h_n of the last layer, in which case here is what you should do:
output, (h_n, c_n) = self.lstm(x_pack)
h = h_n[-1] # h of shape (batch, hidden_size)
y = self.linear(h)
Here is the code which wraps any recurrent layer (LSTM, RNN or GRU) into DynamicRNN. DynamicRNN can perform recurrent computations on sequences of varied lengths, regardless of the order of the lengths.
You can access the last hidden layer as follows:
output, (hidden, cell) = self.lstm(x_pack)
y = self.linear(hidden[-1])
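For a unidirectional LSTM fed a packed sequence, the last layer's hidden state corresponds to the output at each sequence's last real (non-padded) step. The indexing idea behind that can be sketched in plain Python, with nested lists standing in for tensors. `last_valid_outputs` is a hypothetical helper, and the numbers are the padded sample values from the question used as stand-in outputs, not real LSTM outputs.

```python
def last_valid_outputs(padded_outputs, lengths):
    """Pick, for each sequence, the entry at its last non-padded step."""
    return [seq[length - 1] for seq, length in zip(padded_outputs, lengths)]


# Stand-in "outputs": one value per time step, zero-padded on the right.
padded = [
    [5, 7, 7, 4, 5, 8, 6, 9, 7, 9],
    [6, 4, 2, 2, 6, 5, 4, 2, 2, 0],
    [4, 6, 2, 4, 5, 1, 3, 1, 0, 0],
    [8, 8, 3, 7, 7, 7, 9, 0, 0, 0],
    [3, 2, 7, 5, 7, 0, 0, 0, 0, 0],
]
# The same lengths that were passed to pack_padded_sequence.
lengths = [10, 9, 8, 7, 5]

last = last_valid_outputs(padded, lengths)
print(last)
```

Simply taking the final column of the padded output would instead return padding for every sequence shorter than the longest one, which is why the lengths (or the returned hidden state) are needed.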

tf.reshape is not working in the cases where you are adding an extra dimension

According to the tensorflow website, tf.reshape takes a tensor of a certain shape and maps it to a tensor of another shape. I want to map a tensor of size [600, 64] to a tensor of size [-1, 8, 8, 1] (in which the dimension at the -1 position is 600). This doesn't seem to be working though.
I am running this on tensorflow on python 3.6 and although it reshapes to something like [-1, 8, 8], it doesn't reshape to [-1, 8, 8, 1]
import tensorflow as tf
import numpy as np
from sklearn import datasets
from sklearn.preprocessing import LabelBinarizer
# preprocessing method needed
def flatten(array):
    temp = []
    for j in array:
        temp.extend(j)
    return temp
# preprocess the data
digits = datasets.load_digits()
images = digits.images
images = [flatten(i) for i in images]
labels = digits.target
labels = LabelBinarizer().fit_transform(labels)
# the stats needed
width = 8
height = 8
alpha = 0.1
num_labels = 10
kernel_length = 3
batch_size = 10
channels = 1
# the tensorflow placeholders and reshaping
X = tf.placeholder(tf.float32, shape = [None, width * height * channels])
# AND NOW HERE IS WHERE THE ERROR STARTS
y_true = tf.placeholder(tf.float32, shape = [None, num_labels])
X = tf.reshape(X, [-1, 8, 8, 1])
# the convolutional model
conv1 = tf.layers.conv2d(X, filters = 32, kernel_size = [kernel_length, kernel_length])
conv2 = tf.layers.conv2d(conv1, filters = 64, kernel_size = [2, 2])
flatten = tf.reshape(X, [-1, 1])
dense1 = tf.layers.dense(flatten, units=50, activation = tf.nn.relu)
y_pred = tf.layers.dense(dense1, units=num_labels, activation = tf.nn.softmax)
# the loss and training functions
loss = tf.losses.mean_squared_error(labels=y_true, predictions=y_pred)
train = tf.train.GradientDescentOptimizer(alpha).minimize(loss)
# initializing the variables and the tf.session
init = tf.global_variables_initializer()
sess = tf.Session()
sess.run(init)
# running the session
for i in range(batch_size):
    _, lossVal = sess.run((train, loss), feed_dict = {X:images[:600], y_true: labels[:600]})
    print(lossVal)
I keep on getting this error:
ValueError: Cannot feed value of shape (600, 64) for Tensor 'Reshape:0', which has shape '(?, 8, 8, 1)'
And I feel like that should not be the case since 8 * 8 * 1 does equal 64.
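Because X is reassigned to the result of tf.reshape, the X used as a key in feed_dict is no longer the placeholder but the tensor 'Reshape:0', whose shape is (None, 8, 8, 1). The shape of images[:600] is (600, 64), which does not match it.
Either reshape your data before feeding it, or give the reshaped tensor a different name so that X still refers to the (None, 64) placeholder.
Note that having originally defined the placeholder with shape (None, 64) is inconsequential, as X is overwritten by the reshape a few lines later.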
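One way to reshape the data before feeding it is sketched below in plain Python, to show exactly what the (N, 8, 8, 1) layout means. `to_image` and `shape_of` are hypothetical helpers and the sample rows are made up; with NumPy the same result would normally come from `np.reshape(images[:600], (-1, 8, 8, 1))`.

```python
def to_image(flat_row, height=8, width=8):
    """Turn a flat row of height*width values into a height x width x 1 nested list."""
    if len(flat_row) != height * width:
        raise ValueError("expected %d values, got %d" % (height * width, len(flat_row)))
    return [
        [[flat_row[r * width + c]] for c in range(width)]
        for r in range(height)
    ]


def shape_of(nested):
    """Shape of a regular nested list, e.g. (2, 8, 8, 1)."""
    shape = []
    while isinstance(nested, list):
        shape.append(len(nested))
        nested = nested[0]
    return tuple(shape)


# Made-up stand-in for images[:600]: two flat rows of 64 values each.
flat_images = [list(range(64)), list(range(64, 128))]
reshaped = [to_image(row) for row in flat_images]

print(shape_of(flat_images))
print(shape_of(reshaped))
```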
