I'm trying to use stacked autoencoders in Theano (following the deep learning tutorials) to discover context-specific features from two different types of data.
The first type has 13 features and the second one has 60.
n_ins=[13,60],
n_hiddens=[20, 20, 20],
Both have their own independent stack of autoencoders.
I merge the outputs of the topmost layers and feed these into a regression layer for supervised training.
self.logLayer = LogisticRegression(
input=(self.sigmoid_layers[0][-1].output+self.sigmoid_layers[1][-1].output),
n_in=self.n_modes*n_hiddens[-1],
n_out=n_outs
)
Pre-training for each context seems to work correctly; however, I hit a snag during fine-tuning when using the standard training function from the tutorials.
train_fn = theano.function(
inputs=[index],
outputs=self.finetune_cost,
updates=updates,
givens={
self.x: train_set_x[
index * batch_size: (index + 1) * batch_size
],
self.y: train_set_y[
index * batch_size: (index + 1) * batch_size
]
},
name='train'
)
I get the following error:
ValueError: dimension mismatch in args to gemm (5,73)x(13,20)->(5,20)
Apply node that caused the error: GpuDot22(GpuSubtensor{int64:int64:}.0, W)
Inputs types: [CudaNdarrayType(float32, matrix), CudaNdarrayType(float32, matrix)]
Inputs shapes: [(5, 73), (13, 20)]
Inputs strides: [(73, 1), (20, 1)]
Inputs values: ['not shown', 'not shown']
I believe this has something to do with how the Theano graph is evaluated during training. It seems the full training batch (5, 73) is being fed straight into the first context's weight matrix (13, 20), instead of only that context's 13 features.
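For reference, this is roughly the shape bookkeeping I think the merged layer needs. It is a sketch only, assuming the two contexts' features sit side by side in self.x as 13 + 60 columns; the slicing variables are hypothetical and only illustrate the idea.

import theano.tensor as T

# Hypothetical sketch: each stack would be built on its own slice of the shared input,
# and the topmost outputs would be concatenated (not added) before the log layer.
x_context0 = self.x[:, :13]   # first context: 13 features, fed to the first stack
x_context1 = self.x[:, 13:]   # second context: 60 features, fed to the second stack

merged_top = T.concatenate(
    [self.sigmoid_layers[0][-1].output,
     self.sigmoid_layers[1][-1].output],
    axis=1,
)  # shape: (batch_size, 2 * n_hiddens[-1]) = (batch_size, 40)

self.logLayer = LogisticRegression(
    input=merged_top,
    n_in=self.n_modes * n_hiddens[-1],
    n_out=n_outs,
)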
I'm trying to train my model using a TFRecord dataset (800 GB).
The simplified data pipeline looks like this:
files = tf.io.matching_files(tfr_dir + '*_' + single_pattern + '_*')
shards = tf.data.Dataset.from_tensor_slices(files)
# Read the tfrecords
dataset = tf.data.TFRecordDataset(filenames=shards, num_parallel_reads=tf.data.experimental.AUTOTUNE)
# Parse the tfrecords
dataset = dataset.map(parse_tfr_element, num_parallel_calls=tf.data.experimental.AUTOTUNE)
# Apply image augmentation and parameter optimization using tf.py_function with defined Tout
dataset = dataset.map(imgaug)
# Filter out erroneous samples
dataset = dataset.filter(lambda f1, f2, f3, f4, f5, state: state == False)
# Batch and prefetch data (not using shuffle atm)
dataset = dataset.batch(self.config.batch_size, num_parallel_calls=self.AUTOTUNE, drop_remainder=True)
dataset = dataset.prefetch(tf.data.AUTOTUNE)
This gives me the following output (batch_size=8):
<RepeatDataset element_spec=
(TensorSpec(shape=(8, 4, 256, 256, 3), dtype=tf.float32, name=None),
TensorSpec(shape=(8, 4, 19, 2), dtype=tf.float32, name=None),
TensorSpec(shape=(8, 4, 19, 3), dtype=tf.float32, name=None),
TensorSpec(shape=(8, 4, 3, 4), dtype=tf.float32, name=None),
TensorSpec(shape=(8, 4, 4, 3), dtype=tf.float32, name=None),
TensorSpec(shape=(8,), dtype=tf.bool, name=None))>
dataset[0], dataset[3], and dataset[4] are the inputs (x), and dataset[1] and dataset[2] are the ground truth (y) (depending on the model).
This works well with a custom training loop that iterates over the batches of the dataset using for step, data in enumerate(dataset) and picks the model inputs by simple subscripting, e.g. data[0]. However, I can't get it running using .fit(). I tried different approaches to force .fit() to iterate over the dataset (next(iter(dataset)), .from_generator()) but had no luck so far.
So how can I get a multi-input dataset into the fit function? At the moment I'm considering dropping TFRecords altogether, as they have so far just been hard to work with.
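For clarity, this is the kind of restructuring I imagine fit() needs (a sketch only; to_fit_structure is a made-up name, and it assumes a functional model with three inputs and two outputs; the element order follows the spec above):

def to_fit_structure(img, gt_a, gt_b, mat_a, mat_b, state):
    # elements 0, 3, 4 become the inputs; elements 1, 2 become the targets;
    # the bool flag (element 5) is dropped once filtering is done
    return (img, mat_a, mat_b), (gt_a, gt_b)

dataset_for_fit = dataset.map(to_fit_structure,
                              num_parallel_calls=tf.data.AUTOTUNE)

# model.fit(dataset_for_fit, epochs=...)  # assuming a matching multi-input, two-output model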
Thanks for your help and all the best
I want to build a neural network using neupy.
Therefore I constructed the following architecture:
network = layers.join(
layers.Input(10),
layers.Linear(500),
layers.Relu(),
layers.Linear(300),
layers.Relu(),
layers.Linear(10),
layers.Softmax(),
)
My data is shaped as follows:
x_train.shape = (32589,10)
y_train.shape = (32589,1)
When I try to train this network using:
model.train(x_train, y_train)
I get the following error:
ValueError: Input dimension mis-match. (input[0].shape[1] = 10, input[1].shape[1] = 1)
Apply node that caused the error: Elemwise{sub,no_inplace}(SoftmaxWithBias.0, algo:network/var:network-output)
Toposort index: 26
Inputs types: [TensorType(float64, matrix), TensorType(float64, matrix)]
Inputs shapes: [(32589, 10), (32589, 1)]
Inputs strides: [(80, 8), (8, 8)]
Inputs values: ['not shown', 'not shown']
Outputs clients: [[Elemwise{Composite{((i0 * i1) / i2)}}(TensorConstant{(1, 1) of 2.0}, Elemwise{sub,no_inplace}.0, Elemwise{mul,no_inplace}.0), Elemwise{Sqr}[(0, 0)](Elemwise{sub,no_inplace}.0)]]
How do I need to change my network to map this kind of data?
Thank you a lot!
Your architecture has 10 outputs instead of 1. I assume that your y_train is a 0-1 class identifier. If so, then you need to change your structure to this:
network = layers.join(
layers.Input(10),
layers.Linear(500),
layers.Relu(),
layers.Linear(300),
layers.Relu(),
layers.Linear(1), # Single output
layers.Sigmoid(), # Sigmoid works better for 2-class classification
)
You can make it even simpler:
network = layers.join(
layers.Input(10),
layers.Relu(500),
layers.Relu(300),
layers.Sigmoid(1),
)
This works because layers.Linear(10) > layers.Relu() is the same as layers.Relu(10). You can learn more in the official documentation: http://neupy.com/docs/layers/basics.html#mutlilayer-perceptron-mlp
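A minimal usage sketch, assuming neupy's optimizer API (the choice of Momentum and the epoch count are just placeholders):

from neupy import algorithms

# y_train already has shape (32589, 1), which matches the single sigmoid output
model = algorithms.Momentum(network, verbose=True)
model.train(x_train, y_train, epochs=100)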
When I run my code, I get a value error with the following message:
ValueError: Input dimension mis-match. (input[0].shape[1] = 1, input[2].shape[1] = 20)
Apply node that caused the error: Elemwise{Composite{((i0 + i1) - i2)}}[(0, 0)](Dot22.0, InplaceDimShuffle{x,0}.0, InplaceDimShuffle{x,0}.0)
Toposort index: 18
Inputs types: [TensorType(float64, matrix), TensorType(float64, row), TensorType(float64, row)]
Inputs shapes: [(20, 1), (1, 1), (1, 20)]
Inputs strides: [(8, 8), (8, 8), (160, 8)]
Inputs values: ['not shown', array([[ 0.]]), 'not shown']
Outputs clients: [[Elemwise{Composite{((i0 * i1) / i2)}}(TensorConstant{(1, 1) of 2.0}, Elemwise{Composite{((i0 + i1) - i2)}}[(0, 0)].0, Elemwise{mul,no_inplace}.0), Elemwise{Sqr}[(0, 0)](Elemwise{Composite{((i0 + i1) - i2)}}[(0, 0)].0)]]
My training data is a matrix of entries such as:
[ 815.257786 320.447 310.841]
And the batches I'm inputting to my training function have a shape of (BATCH_SIZE, 3) and type TensorType(float64, matrix)
My neural net is very simple:
self.inpt = T.dmatrix('inpt')
self.out = T.dvector('out')
self.network_in = nnet.layers.InputLayer(shape=(BATCH_SIZE, 3), input_var=self.inpt)
self.l0 = nnet.layers.DenseLayer(self.network_in, num_units=40,
nonlinearity=nnet.nonlinearities.rectify,
)
self.network = nnet.layers.DenseLayer(self.l0, num_units=1,
nonlinearity=nnet.nonlinearities.linear
)
My loss function is:
pred = nnet.layers.get_output(self.network)
loss = nnet.objectives.squared_error(pred, self.out)
loss = loss.mean()
I'm a bit confused as to why I'm getting a dimension mismatch. I'm passing in the correct input and label types (as per my symbolic variables), and the shape of my input data corresponds to the expected 'shape' parameter that I'm giving my InputLayer. I believe it's a problem with how I'm specifying the batch size, as when I use a batch size of 1 then my network can train without any problem, and the input[2].shape[1] value from the error message is my batch size. I'm quite new to machine learning, and any help would be greatly appreciated!
Turns out the problem was that my labels had the wrong dimensionality.
My data had shapes:
x_train.shape == (batch_size, 3)
y_train.shape == (batch_size,)
And the symbolic inputs to my net were:
self.inpt = T.dmatrix('inpt')
self.out = T.dvector('out')
I was able to solve my problem by reshaping y_train. I then changed the symbolic output variable to a matrix to account for these changes.
y_train = np.reshape(y_train, y_train.shape + (1,))
# y_train.shape == (batch_size, 1)
self.out = T.dmatrix('out')
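For completeness, the same reshape can be written in a few equivalent ways; all of them produce the (batch_size, 1) shape the matrix variable expects:

import numpy as np

y_train = y_train[:, np.newaxis]            # add a trailing axis
# or: y_train = np.expand_dims(y_train, axis=1)
# or: y_train = y_train.reshape(-1, 1)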
I'm using Keras with the TF backend. Recently, when using the functional API to make "hybrid" models, it has seemed to me that Keras requires me to feed values that it shouldn't need.
As a background, I am trying to implement a conditional GAN in Keras. My implementation has a generator and a discriminator. As an example, the generator accepts (20, 20, 1) inputs and returns (20, 20, 1) outputs. These are stacked by channel to produce a (20, 20, 2) input to the discriminator. The discriminator is supposed to decide whether it is seeing a ground-truth translation of the original (20, 20, 1) image or a translation by the generator. This is represented by 0=fake, 1=real.
By itself, the discriminator is just a CNN for binary classification. Therefore, it can be trained by feeding data points with inputs of shape (20, 20, 2) and outputs in {0, 1}. So if I write something like:
# <disc> is the discriminator
arbitrary_input = np.full(shape=(5, 20, 20, 2), fill_value=0.5)
arbitrary_labels = np.array([1, 1, 0, 0, 1])
disc.fit(arbitrary_input, arbitrary_labels, epochs=5)
training will proceed without errors (obviously this is a useless dataset, though).
However, when I insert the discriminator into the generator-discriminator stack:
# <disc> is the discriminator, <gen> is the generator
input = Input(shape=(20, 20, 1), name='stack_input')
gen_output = gen(input)
pair = Concatenate(axis=FEATURES_AXIS)([input, gen_output])
disc_output = disc(pair)
stack = Model(input, disc_output)
stack.compile(optimizer='adam', loss='binary_crossentropy')
arbitrary_input = np.full(shape=(5, 20, 20, 2), fill_value=0.5)
arbitrary_labels = np.array([1, 1, 0, 0, 1])
disc.fit(arbitrary_input, arbitrary_labels, epochs=5)
suddenly I need to feed an extra placeholder. I get this error message on disc.fit():
InvalidArgumentError (see above for traceback): You must feed a value for placeholder tensor 'stack_input' with dtype float
[[Node: stack_input = Placeholder[dtype=DT_FLOAT, shape=[], _device="/job:localhost/replica:0/task:0/cpu:0"]()]]
As you can see from the name, this is the input to the hybrid/stacked model. I haven't changed the discriminator at all; I have only included it in another model. Therefore disc.fit() should still work, right?
I think there's a workaround: freeze the generator's weights and call fit() on the full stack (sketched below), but I do not understand why the method above doesn't work.
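The workaround I mean would look roughly like this (a sketch only, with the same kind of made-up data as above; the idea is that only the discriminator's weights stay trainable when fitting the stack):

# freeze the generator, recompile the stack, and train the discriminator
# through the full stack instead of calling disc.fit() directly
gen.trainable = False
stack.compile(optimizer='adam', loss='binary_crossentropy')

stack_inputs = np.full(shape=(5, 20, 20, 1), fill_value=0.5)  # (20, 20, 1) stack inputs
stack_labels = np.array([1, 1, 0, 0, 1])
stack.fit(stack_inputs, stack_labels, epochs=5)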
Is it perhaps some issue with scoping?
Edit: The discriminator is really just a simple CNN. It is initialized with disc = pix2pix_discriminator(input_shape=(20, 20, 2), n_filters=(32, 64)). The function in question is:
def pix2pix_discriminator(input_shape, n_filters, kernel_size=4, strides=2, padding='same', alpha=0.2):
    x = Input(shape=input_shape, name='disc_input')

    # first layer
    h = Conv2D(filters=n_filters[0],
               kernel_size=kernel_size,
               strides=strides,
               padding=padding,
               data_format=DATA_FORMAT)(x)
    # no BatchNorm
    h = LeakyReLU(alpha=alpha)(h)

    for i in range(1, len(n_filters)):
        h = Conv2D(filters=n_filters[i],
                   kernel_size=kernel_size,
                   strides=strides,
                   padding=padding,
                   data_format=DATA_FORMAT)(h)
        h = BatchNorm(axis=FEATURES_AXIS)(h)
        h = LeakyReLU(alpha=alpha)(h)

    h_flatten = Flatten()(h)  # required for the upcoming Dense layer
    y_pred = Dense(units=1, activation='sigmoid')(h_flatten)  # binary output

    discriminator = Model(inputs=x, outputs=y_pred)
    discriminator.compile(optimizer='adam',
                          loss='binary_crossentropy',
                          metrics=['accuracy'])
    return discriminator
I'm trying to create a neural network based on Theano/Lasagne that will (essentially) attempt to do a multi-variable regression.
The meat of the code is:
train_value = train_df.values[:, 0]
train_data = train_df.values[:, 1:]
#print "train:", train_data.shape, train_label.shape
#test_data = test_df.values
#print "test:", test_data.shape
train_data = train_data.astype(np.float)
train_value = train_value.astype(np.int32)
fc_1hidden = NeuralNet(
layers = [ # four layers: input, one hidden layer, dropout, and output
('input', layers.InputLayer),
('hidden', layers.DenseLayer),
('dropout', layers.DropoutLayer),
('output', layers.DenseLayer),
],
# layer parameters:
input_shape = (None, 36), # 36 input features per sample
hidden_num_units = 100, # number of units in hidden layer
dropout_p = 0.25, # dropout probability
output_nonlinearity = softmax, # output layer uses softmax function
output_num_units = 10, # 10 labels
# optimization method:
#update = nesterov_momentum,
update = sgd,
update_learning_rate = 0.001,
#update_momentum = 0.9,
eval_size = 0.1,
# batch_iterator_train = BatchIterator(batch_size = 20),
# batch_iterator_test = BatchIterator(batch_size = 20),
max_epochs = 100, # we want to train this many epochs
verbose = 1,
)
fc_1hidden.fit(train_data, train_value)
plot_loss(fc_1hidden)
Here, train_value is just 1 column of (numerical) data that I want to train my NN to predict, and the following 57 columns (train_data) are all the parameters/values (all numbers) which should be weighted appropriately to predict the value in the first column.
However, when I run this script, I get the following error:
Epoch | Train loss | Valid loss | Train / Val | Valid acc | Dur
--------|--------------|--------------|---------------|-------------|-------
Traceback (most recent call last):
File "neuralnetwork.py", line 77, in <module>
fc_1hidden.fit(train_data, train_value)
File "/Users/spadavec/anaconda/lib/python2.7/site-packages/nolearn/lasagne.py", line 150, in fit
self.train_loop(X, y)
File "/Users/spadavec/anaconda/lib/python2.7/site-packages/nolearn/lasagne.py", line 188, in train_loop
batch_train_loss = self.train_iter_(Xb, yb)
File "/Users/spadavec/anaconda/lib/python2.7/site-packages/theano/compile/function_module.py", line 606, in __call__
storage_map=self.fn.storage_map)
File "/Users/spadavec/anaconda/lib/python2.7/site-packages/theano/compile/function_module.py", line 595, in __call__
outputs = self.fn()
ValueError: Shape mismatch: x has 83 cols (and 29 rows) but y has 36 rows (and 100 cols)
Apply node that caused the error: Dot22(x_batch, W)
Inputs types: [TensorType(float64, matrix), TensorType(float64, matrix)]
Inputs shapes: [(29, 83), (36, 100)]
Inputs strides: [(664, 8), (800, 8)]
Inputs values: ['not shown', 'not shown']
I'm not sure where it is getting this shape; none of my data has 83 columns or rows. (Note: I've adapted this script from one that was originally written to look at pictures of faces and guess where the different parts were (eyes, nose, mouth, etc.).)
I have written a much simpler version of this (sans-dropout method) in pybrain, but am trying to migrate to sklearn/lasagne/theano as it opens more doors.
Since you want to do regression, make sure to set the output type correctly:
output_nonlinearity = linear
Are you sure you really want 10 output units? For a single-value regression you would normally use just one. I experienced some weird behavior in Lasagne; I think the API has changed over time and contains some bugs. I succeeded by taking the latest API demo and adapting it to my needs. A rough sketch of the settings I mean is below.
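This is only a sketch of the idea, assuming a single-value target: one linear output unit and nolearn's regression flag, with input_shape taken from your actual feature count (sgd and np are assumed to be imported as in your original script):

from lasagne.nonlinearities import linear

fc_1hidden = NeuralNet(
    layers=[
        ('input', layers.InputLayer),
        ('hidden', layers.DenseLayer),
        ('dropout', layers.DropoutLayer),
        ('output', layers.DenseLayer),
    ],
    input_shape=(None, train_data.shape[1]),  # match the real number of feature columns
    hidden_num_units=100,
    dropout_p=0.25,
    output_nonlinearity=linear,  # linear output for regression
    output_num_units=1,          # a single value to predict
    regression=True,             # nolearn then uses squared error instead of NLL
    update=sgd,
    update_learning_rate=0.001,
    eval_size=0.1,
    max_epochs=100,
    verbose=1,
)

# float32 targets with shape (n_samples, 1) are what the regression mode expects
fc_1hidden.fit(train_data.astype(np.float32),
               train_value.astype(np.float32).reshape(-1, 1))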