How do I iterate over a tensor in a for loop?
I want to do a convolution on each row of my input_tensor, but I can't seem to iterate over a tensor.
I'm currently trying to do it like this:
def row_convolution(input):
    filter_size = 8
    print input.dtype
    print input.get_shape()
    for units in xrange(splits):
        extract = input[units:units+filter_size,:,:]
        for row_of_extract in extract:
            for unit in row_of_extract:
                temp_list.append((Conv1D(filters = 1, kernel_size = 1, activation='relu' , name = 'conv')(unit)))
            print len(temp_list)
            sum_temp_list.append(sum(temp_list))
        sum_sum_temp_list.append(sum(sum_temp_list))
        conv_feature_map.append(sum_sum_temp_list)
    return np.array(conv_feature_map)
It looks like you're trying to define tensorflow operations for each input. This is a common misunderstanding about the framework.
You must first define the operations that you will perform; all operations must be defined up front. Usually it looks something like this:
g = tf.Graph()
with g.as_default():
    # define some placeholders to accept your input
    X = tf.placeholder(tf.float32, shape=[1000,1])
    y = tf.placeholder(tf.float32, shape=[1])
    # add more operations...
    Conv1D(...) # add your convolution operations
    # add the rest of your operations
    optimizer = tf.train.AdamOptimizer(0.00001).minimize(loss)
Now the whole graph has been defined. Consider it fixed: you won't add anything to it again.
Now you'll run data through the fixed graph:
with g.as_default(), tf.Session() as sess:
    X_data, y_data = get_my_data()
    # run this in a loop
    result = sess.run([optimizer,loss], feed_dict={X:X_data, y:y_data})
Note that your data and labels should be fed in as a batch, so the first dimension of your data represents the number of datapoints N (N=1 is perfectly acceptable, of course). You should preprocess the data so it's in that format. For example, a batch of 10 MNIST digits would have shape [10,28,28,1]. That's:
10 data samples
Images are 28 px height
Images are 28 px width
It's a grayscale image, so 1 color channel
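As a rough illustration of that layout, here is a small NumPy sketch. It uses zero-filled stand-in arrays rather than real MNIST images, and the variable names are only for illustration:

```python
import numpy as np

# Stand-in for one grayscale 28x28 image (zeros, not real MNIST data)
image = np.zeros((28, 28, 1), dtype=np.float32)

# N=1: add a leading batch dimension -> shape (1, 28, 28, 1)
single_batch = np.expand_dims(image, axis=0)

# N=10: stack ten samples along a new first axis -> shape (10, 28, 28, 1)
batch = np.stack([image] * 10, axis=0)

print(single_batch.shape, batch.shape)
```

Either array has the batch dimension first, which is the layout a placeholder expecting [N,28,28,1] would be fed with.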
Related
I have a weird use case for a neural network and want to understand if there is a way to accomplish what I'm trying to do.
I am trying to train a neural network that takes in 3 input variables and outputs 96 continuous variables. The output should ideally produce a continuous curve, however the expected y values have a lot of missing data points (>50%) distributed randomly which affects how the model trains. I know which data points are missing and am trying to find a way to ignore these outputs during backpropagation.
For example:
Input = [1,2,3]
Expected Output = [1,2,3,NAN,5,6,7,NAN,...] # NAN is set to 0 for training
Currently this is the method I am trying (tensorflow.keras)
in1 = layers.Input(3)
in2 = layers.Input(96) # Array of Bools, =1 if expected output variable is a number, =0 if nan
hidden1 = layers.Dense(37,activation='relu',use_bias=True)(in1)
hidden2 = layers.Dense(37,activation='relu',use_bias=True)(hidden1)
hidden3 = layers.Dense(37,activation='relu',use_bias=True)(hidden2)
hidden3_in2 = layers.concatenate([hidden3,in2])
out = layers.Dense(96)(hidden3_in2)
model = Model(inputs=[in1,in2], outputs=[out])
The expected output of this should be 0 being calculated where in2 == 0, and a number greater than 0 everywhere else. When using the model to predict data I plug in an array of 1's into in2, indicating that no expected values should equal 0, so a continuous curve should be output. However, many output variables still come out to 0, which is not ideal.
Essentially my question is: is there a good way to mask specific outputs during backprop and/or loss calculation using an array?
Thanks in advance!
All you have to do is to write your custom loss function, where you literally mask out the loss values.
Something along the lines of
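def my_loss_fn(y_true, y_pred):
    # This assumes missing targets are left as NaN in y_true (not replaced by 0)
    # True where the target is a real number, False where it is NaN
    valid = tf.logical_not(tf.math.is_nan(y_true))
    # Replace NaN targets with the prediction so their error is exactly 0
    y_true = tf.where(valid, y_true, y_pred)
    squared_difference = tf.square(y_true - y_pred)
    return tf.reduce_mean(squared_difference, axis=-1) # Note the `axis=-1`
model.compile(optimizer='adam', loss=my_loss_fn)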
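To see what masking does to the numbers, here is a NumPy-only sketch of a masked mean squared error. It is an illustration under assumptions, not the loss Keras would run: the helper name masked_mse is made up, missing targets are assumed to be stored as NaN, and, unlike a plain mean over all outputs, it divides by the number of valid targets in each row.

```python
import numpy as np

def masked_mse(y_true, y_pred):
    """Mean squared error over the non-NaN targets of each row."""
    valid = ~np.isnan(y_true)                      # True where a target exists
    diff = np.where(valid, y_true - y_pred, 0.0)   # zero error at missing targets
    n_valid = np.maximum(valid.sum(axis=-1), 1)    # avoid dividing by zero
    return (diff ** 2).sum(axis=-1) / n_valid

y_true = np.array([[1.0, 2.0, np.nan, 4.0]])
y_pred = np.array([[1.0, 3.0, 100.0, 2.0]])
loss = masked_mse(y_true, y_pred)
print(loss)
```

The prediction of 100.0 at the missing position has no effect on the result: only the three valid targets contribute, giving (0 + 1 + 4) / 3.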
I have some data that is of shape 10000 x 1440 x 8 where 10000 is the number of days, 1440 the number of minutes and 8 is the number of features.
For each day, ie. each submatrix of size 1440 x 8 I wish to train an autoencoder and extract the weights from the second layer, such that my output will be a matrix output = 10000 x 8
I can do this in a loop with
import numpy as np
from keras.layers import Input, Dense
from keras import regularizers, models, optimizers
data = np.random.random(size=(10000,1440,8))
def AE(y, epochs=100, learning_rate=1e-4, regularization=5e-4):
    input = Input(shape=(y.shape[1],))
    encoded = Dense(1, activation='relu',
                    kernel_regularizer=regularizers.l2(regularization))(input)
    decoded = Dense(y.shape[1], activation='relu',
                    kernel_regularizer=regularizers.l2(regularization))(encoded)
    autoencoder = models.Model(input, decoded)
    autoencoder.compile(optimizer=optimizers.Adam(lr=learning_rate), loss='mean_squared_error')
    autoencoder.fit(y, y, epochs=epochs, batch_size=10, shuffle=False)
    (w1,b1,w2,b2)=autoencoder.get_weights()
    return (w1,b1,w2,b2)
lst = []
for i in range(data.shape[0]):
    y = data[i]
    (_, _, w2, _) = AE(y)
    lst.append(w2[0])
output = np.array(lst)
However, this feels very stupid as surely I must be able to just pass the 3D data to the autoencoder and retrieve what I want. However, if I try modify the shape of input to be input = Input(shape=(y.shape[1],y.shape[2]))
I get an error
ValueError: Dimensions must be equal, but are 1440 and 8 for '{{node
mean_squared_error/SquaredDifference}} =
SquaredDifference[T=DT_FLOAT](model_778/dense_1558/Relu,
IteratorGetNext:1)' with input shapes: [?,1440,1440], [?,1440,8].
Any pointers on how to get the shape right?
Simply reshape your data like so and call the function.
data = data.reshape(data.shape[0]*data.shape[1], -1)
(w1, b1, w2, b2) = AE(data)
print(w2.shape)
Your first layer is a Dense layer, which acts on the last axis of its input. Its usual input is two-dimensional: one dimension is the batch size and the other is the feature vector. The way you are using the data, you treat each data point independently, which means you can join the first two axes together and pass the result to the network. Note, however, that you would still need to modify the code so that you are not passing the entire dataset to the network at once: split the data into batches and loop over them. Honestly, that is much the same as what you are doing now, so your loop is not as bad as you think for what you are trying to do.
Also note that you have time series data, so treating each datapoint as independent doesn't really make sense. You need an LSTM layer or something similar to learn a time series encoding.
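To make the effect of that reshape concrete, here is a small NumPy sketch. It uses 5 days instead of 10000 to keep it light; the sizes are otherwise the ones from the question:

```python
import numpy as np

# Smaller stand-in for the (days, minutes, features) array in the question
data = np.random.random(size=(5, 1440, 8))

# Merge the first two axes: every minute of every day becomes one sample
flat = data.reshape(-1, data.shape[-1])

# Row i * 1440 + j of the flat array is minute j of day i
i, j = 3, 100
assert np.array_equal(flat[i * 1440 + j], data[i, j])
print(flat.shape)  # (7200, 8)
```

The mapping between rows and (day, minute) pairs is what lets you reshape results back to one block per day afterwards.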
I am handling a time series dataset with n timesteps, m features and k objects.
As a result my feature vector has a shape of (n,k,m), while my targets have shape (n,m).
I want to predict the targets for every timestep and object, but with the same weights for every object. My loss function looks like this:
average_loss = loss_func(prediction, labels)
sum_loss = loss_func(sum(prediction), sum(labels))
loss = loss_weight * average_loss + (1-loss_weight) * sum_loss
My plan is to make sure not only that I predict every item as well as possible, but also that the sum of all items gets predicted well. loss_weight is a constant.
Currently I am doing this kind of ugly solution:
features = local_batch.squeeze(dim = 0)
labels = torch.unsqueeze(local_labels.squeeze(dim = 0), 1)
prediction = net(features)
I set my batch size to 1 and squeeze it so that the k objects become my batch.
My network looks like this:
class Net(torch.nn.Module):
    def __init__(self, n_feature, n_hidden, n_output):
        super(Net, self).__init__()
        self.hidden = torch.nn.Linear(n_feature, n_hidden) # hidden layer
        self.predict = torch.nn.Linear(n_hidden, n_output) # output layer
    def forward(self, x):
        x = F.relu(self.hidden(x)) # activation function for hidden layer
        x = self.predict(x) # linear output
        return x
How do I make sure I do a reasonable convolution over the object dimension in order to keep the same weights for all objects, without committing to a batch size of 1? Also, how do I achieve the same loss function, where I compute the loss of the prediction sum vs. the target sum for any timestamp?
It's not exactly ugly -- I would do the same but generalize it a bit for batch size >1 using view.
# Using your notation, with local_batch of shape (n, k, m)
n, k, m = local_batch.shape
features = local_batch.view(n*k, m)
prediction = net(features).view(n, k, -1) # -1 becomes n_output
With the prediction in the correct shape (n, k, n_output), implementing your loss function should not be difficult.
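As one possible reading of the loss from the question, here is a NumPy sketch. Everything in it is an assumption for illustration: loss_func is taken to be the mean squared error, predictions and labels are taken to hold one value per timestep and object (shape (n, k)), and loss_weight defaults to 0.5. The same reductions exist in PyTorch (for example `prediction.sum(dim=1)`).

```python
import numpy as np

def mse(a, b):
    """Mean squared error between two arrays of the same shape."""
    return np.mean((a - b) ** 2)

def combined_loss(prediction, labels, loss_weight=0.5):
    # prediction, labels: shape (n, k), one value per timestep and object
    average_loss = mse(prediction, labels)
    # Sum over the object axis so each timestep contributes one total
    sum_loss = mse(prediction.sum(axis=1), labels.sum(axis=1))
    return loss_weight * average_loss + (1 - loss_weight) * sum_loss

prediction = np.array([[1.0, 2.0], [3.0, 4.0]])
labels = np.array([[1.0, 1.0], [3.0, 6.0]])
loss = combined_loss(prediction, labels)
print(loss)
```

Summing over the object axis only (rather than over the whole array) keeps one total per timestep, which matches the stated goal of comparing the prediction sum with the target sum at each timestamp.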
I have a Keras LSTM model that contains multiple outputs.
The model is defined as follows:
outputs=[]
main_input = Input(shape= (seq_length,feature_cnt), name='main_input')
lstm = LSTM(32,return_sequences=True)(main_input)
for _ in range((output_branches)): #output_branches is the number of output branches of the model
    prediction = LSTM(8,return_sequences=False)(lstm)
    out = Dense(1)(prediction)
    outputs.append(out)
model = Model(inputs=main_input, outputs=outputs)
model.compile(optimizer='rmsprop',loss='mse')
I have a problem when reshaping the output data.
The code for reshaping the output data is:
y=y.reshape((len(y),output_branches,1))
I got the following error:
ValueError: Error when checking model target: the list of Numpy arrays
that you are passing to your model is not the size the model expected.
Expected to see 5 array(s), but instead got the following list of 1
arrays: [array([[[0.29670931],
[0.16652206],
[0.25114482],
[0.36952324],
[0.09429612]],
[[0.16652206],
[0.25114482],
[0.36952324],
[0.09429612],...
How can I correctly reshape the output data?
It depends on how y is structured initially. Here I assume that y is a single-valued label for each sequence in batch.
When a model has multiple inputs/outputs, model.fit() expects a corresponding list of inputs/outputs. In the following fully reproducible example, np.split(y, output_branches, axis=-1) does exactly this: it splits the single array of outputs into a list of separate outputs, where each output (in this case) holds one value per sample:
import tensorflow as tf
import numpy as np
tf.enable_eager_execution()
batch_size = 100
seq_length = 10
feature_cnt = 5
output_branches = 3
# Say we've got:
# - 100-element batch
# - of 10-element sequences
# - where each element of a sequence is a vector describing 5 features.
X = np.random.random_sample([batch_size, seq_length, feature_cnt])
# Every sequence of a batch is labelled with `output_branches` labels.
y = np.random.random_sample([batch_size, output_branches])
# Here y.shape == (100, 3)
# Here we split the last axis of y (output_branches) into `output_branches` separate arrays.
y = np.split(y, output_branches, axis=-1)
# Here y is no longer a single NumPy array, but a list of arrays.
# E.g. y[0].shape == (100, 1); y[1].shape == (100, 1) etc...
outputs = []
main_input = tf.keras.layers.Input(shape=(seq_length, feature_cnt), name='main_input')
lstm = tf.keras.layers.LSTM(32, return_sequences=True)(main_input)
for _ in range(output_branches):
    prediction = tf.keras.layers.LSTM(8, return_sequences=False)(lstm)
    out = tf.keras.layers.Dense(1)(prediction)
    outputs.append(out)
model = tf.keras.models.Model(inputs=main_input, outputs=outputs)
model.compile(optimizer='rmsprop', loss='mse')
model.fit(X, y)
You might need to play around with the axes, as you didn't specify exactly what your data looks like.
EDIT:
As author is looking for an answer drawing from official sources, it's mentioned here (not explicitly though, it only mentions what the Dataset should yield, hence - what kind of input structure model.fit() expects):
When calling fit with a Dataset object, it should yield either a tuple of lists like ([title_data, body_data, tags_data], [priority_targets, dept_targets]) or a tuple of dictionaries like ({'title': title_data, 'body': body_data, 'tags': tags_data}, {'priority': priority_targets, 'department': dept_targets}).
Since the number of outputs equals output_branches, your output data must be a list with the same number of arrays.
Basically, if the output data is in the middle dimension as your reshape suggests:
y = [ y[:,i] for i in range(output_branches)]
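The two ways of building that list can be compared in a short NumPy sketch. It assumes a toy target array of shape (batch, output_branches) with 4 samples and 3 branches, which is not the asker's data:

```python
import numpy as np

output_branches = 3
# Toy targets: 4 samples, one column per output branch
y = np.arange(12, dtype=np.float32).reshape(4, output_branches)

# np.split keeps the last axis: three arrays of shape (4, 1)
split_targets = np.split(y, output_branches, axis=-1)

# Column slicing drops it: three arrays of shape (4,)
sliced_targets = [y[:, i] for i in range(output_branches)]

print(len(split_targets), split_targets[0].shape, sliced_targets[0].shape)
```

Both give one array per output branch with the same values; they differ only in whether the trailing axis of length 1 is kept.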
I am trying to build a generative RNN using Tensorflow. I have a preprocessed dataset which is a list of sequence_length x 2048 x 2 numpy arrays. The sequences have different lengths. I have been looking through examples and documentation but I really couldn't understand, for example, what key is, or how I should create the input_sequences dictionary, etc.
So how should one format a list of numpy arrays, each of which represent a sequence of rank n (2 in this case) tensors, in order to be able to use this batch_sequences_with_states method?
Toy Implementations
I tried this and I will be glad to share my findings with you. It is a toy example. I attempted to create an example that works and observe how the output varies. In particular I used a case study of lstm. For you, you can define a conv net. Feel free to add more input and adjust as usual and follow the doc.
https://www.tensorflow.org/versions/r0.11/api_docs/python/contrib.training/splitting_sequence_inputs_into_minibatches_with_state_saving#batch_sequences_with_states
There are other more subtle examples I tried but I keep this simple version to show how the operation can be useful. In particular add more elements to the dictionaries (input sequence and context sequence) and observe the changes.
Two Approaches
Basically I will use two approaches:
tf.contrib.training.batch_sequences_with_states
tf.train.batch( )
I will start with the first one because it is directly helpful; then I will show how to solve a similar problem with train.batch.
I will generate toy numpy arrays and tensors and use them to test the operations:
import tensorflow as tf
batch_size = 32
num_unroll = 20
num_enqueue_threads = 20
lstm_size = 8
cell = tf.contrib.rnn.BasicLSTMCell(num_units=lstm_size)
# State size
state_size = cell.state_size[0]
# Initial states
initial_state_values = tf.zeros((state_size,), dtype=tf.float32)
initial_states = {"lstm_state": initial_state_values}
# Key should be string
#I used x as input sequence and y as input context. So that the
# keys should be 2.
key = ["1","2"]
#Toy data for our sample
x = tf.range(0, 12, name="x")
y = tf.range(12,24,name="y")
# convert to float
#I converted to float so as not to raise a type mismatch error
x=tf.to_float(x)
y=tf.to_float(y)
#the input sequence as dictionary
#This is needed according to the tensorflow doc
sequences = {"x": x }
#Context Input
context = {"batch1": y}
# Train batch with sequence state
batch_new = tf.contrib.training.batch_sequences_with_states(
input_key=key,
input_sequences=sequences,
input_context=context,
initial_states=initial_states,
num_unroll=num_unroll,
batch_size=batch_size,
input_length = None,
pad = True,
num_threads=num_enqueue_threads,
capacity=batch_size * num_enqueue_threads * 2)
# To test what we have got type and observe the output of
# the following
# In short once in ipython notebook
# type batch_new.[press tab] to see all options
batch_new.key
batch_new.sequences
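# Split the input along the time axis: one tensor per unrolled time step
# (assumes the batched "x" sequence has shape [batch_size, num_unroll])
inputs = batch_new.sequences["x"]
inputs_by_time = tf.split(inputs, num_unroll, axis=1)
assert len(inputs_by_time) == num_unroll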
# Get lstm or conv net output
lstm_output, _ = tf.contrib.rnn.static_state_saving_rnn(
cell,
inputs_by_time,
state_saver=batch_new,
state_name=("lstm_state","lstm_state"))
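To make num_unroll and pad=True a little more concrete, here is a NumPy-only sketch of the general idea: pad a sequence to a multiple of num_unroll and cut it into segments of that length. This is a conceptual illustration only, not the library's implementation, and the helper name split_into_segments is hypothetical.

```python
import numpy as np

def split_into_segments(sequence, num_unroll):
    """Zero-pad `sequence` along axis 0 to a multiple of num_unroll, then split it."""
    length = sequence.shape[0]
    padded_length = -(-length // num_unroll) * num_unroll  # round up
    pad_width = [(0, padded_length - length)] + [(0, 0)] * (sequence.ndim - 1)
    padded = np.pad(sequence, pad_width, mode="constant")
    # One row per segment, each segment num_unroll steps long
    return padded.reshape(-1, num_unroll, *sequence.shape[1:])

x = np.arange(12, dtype=np.float32)  # same values as the toy sequence x
segments = split_into_segments(x, num_unroll=20)
print(segments.shape)  # (1, 20)
```

With a toy sequence of 12 steps and num_unroll = 20, the whole sequence fits into a single padded segment; a longer sequence would produce several segments, which is where carrying state from one segment to the next becomes relevant.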
Create Graph and Queue as Usual
The parts with # and * can be further adapted to suit requirement.
# Create the graph, etc.
init_op = tf.global_variables_initializer()
#Create a session for running operations in the Graph.
sess = tf.Session()
# Initialize the variables (like the epoch counter).
sess.run(init_op)
# Start input enqueue threads.
coord = tf.train.Coordinator()
threads = tf.train.start_queue_runners(sess=sess, coord=coord)
# For the part below, uncomment
#*the comments marked with asterisks to do other operations
#*try:
#* while not coord.should_stop():
#*Run training steps or whatever
#*sess.run(train_op) # uncomment to run other ops
#*except tf.errors.OutOfRangeError:
#print('Done training -- epoch limit reached')
#*finally:
# When done, ask the threads to stop.
coord.request_stop()
# Wait for threads to finish.
coord.join(threads)
sess.close()
Second Approach
You can also use train.batch in a very interesting way:
import tensorflow as tf
#[0, 1, 2, 3, 4 ,...]
x = tf.range(0, 11, name="x")
# A queue that outputs 0,1,2,3,..
# slice end is useful for dequeuing
slice_end = 10
# instantiate variable y
y = tf.slice(x, [0], [slice_end], name="y")
# Reshape y
y = tf.reshape(y,[10,1])
y=tf.to_float(y, name='ToFloat')
Important
Note the use of dynamic_pad and enqueue_many. Feel free to play with both options and compare the output!
batched_data = tf.train.batch(
tensors=[y],
batch_size=10,
dynamic_pad=True,
#enqueue_many=True,
name="y_batch"
)
# Despite its name, this value is used as the number of LSTM units below
batch_size = 128
lstm_cell = tf.contrib.rnn.LSTMCell(batch_size,forget_bias=1,state_is_tuple=True)
val, state = tf.nn.dynamic_rnn(lstm_cell, batched_data, dtype=tf.float32)
Conclusion
The aim is to show that by simple examples we can get insight into the
details of the operations. You can adapt it to convolutional net in your case.
Hope this helps!