Building a neural network that takes a created feature vector


To be more precise: let's say I already have a vector that represents something (a word, an object, an image...) and that I cannot change the way I get it. What I would like to do is create a NN without the embedding and pooling layers, and I am wondering whether TensorFlow supports this kind of approach.
Let's say my vector is 10 features long (10 floats). For each vector I also have a label; let's say there are 3 labels to choose from.
What I am trying to do is this: I would like to feed this sort of vector into a Keras dense layer with ReLU activation and 10 neurons (maybe stacking 2 or 3 of them) and then, as a final layer, use sigmoid activation with 3 output neurons.
Then fit with the labels for 40(?) epochs, and so on...
My main question is: is this possible? I have yet to finish the code and maybe I am asking this a bit too soon, but nevertheless.
Is this how one would approach this, or would you build the model from the embedding layer down and not use the already-made vectors?

Indeed it is possible.
One way to do it is to create a generator function that yields the vectors you want to pass to the network (it can compute your vector representation, whatever it is). Then create a TensorFlow dataset by calling tf.data.Dataset.from_generator.
The model will then probably be just a Sequential of dense layers.
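A minimal sketch of that idea, assuming TensorFlow 2.x with its bundled Keras. The random data, the name vector_generator and the layer sizes are placeholders of mine, not something from the question. The last layer uses softmax instead of the sigmoid mentioned in the question, on the assumption that each vector carries exactly one of the three labels.

```python
import numpy as np
import tensorflow as tf

NUM_FEATURES = 10  # length of each precomputed vector
NUM_CLASSES = 3    # number of labels to choose from

# Placeholder data standing in for the vectors you already have.
rng = np.random.default_rng(0)
vectors = rng.normal(size=(120, NUM_FEATURES)).astype("float32")
labels = rng.integers(0, NUM_CLASSES, size=120).astype("int32")

def vector_generator():
    """Yield (vector, label) pairs one at a time (hypothetical helper)."""
    for vec, lab in zip(vectors, labels):
        yield vec, lab

dataset = tf.data.Dataset.from_generator(
    vector_generator,
    output_signature=(
        tf.TensorSpec(shape=(NUM_FEATURES,), dtype=tf.float32),
        tf.TensorSpec(shape=(), dtype=tf.int32),
    ),
).batch(16)

# No embedding or pooling layer: the vectors go straight into Dense layers.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(NUM_FEATURES,)),
    tf.keras.layers.Dense(10, activation="relu"),
    tf.keras.layers.Dense(10, activation="relu"),
    tf.keras.layers.Dense(NUM_CLASSES, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.fit(dataset, epochs=2, verbose=0)  # use more epochs on real data

probs = model.predict(vectors[:5], verbose=0)  # one row of class probabilities per vector
```

If the vectors already sit in a NumPy array, passing the arrays directly to model.fit is an equally valid choice; the generator mainly helps when they are produced on the fly.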

Related

Autoencoders to encrypt any type of data and dimensionality reduction through autoencoders

We need to implement a new approach that uses a generative deep learning model based on autoencoders to encrypt any type of data. The idea is to use autoencoders to reduce the dimensions of the data. Is this possible, and how?

Is it possible? Yes! Autoencoders can be a solution for mapping information from one representation into another.
Autoencoders are an unsupervised learning technique where the goal is to make the output reproduce the input. So what is the use of autoencoders? They have hidden layers, usually with a smaller number of dimensions, and in that hidden representation the data is obscured. You can still reconstruct the original data using the second part of the ANN, the decoder.
The idea is:
original data > input layer > hidden layer (different number of nodes) > output layer > original data
The input and hidden layer: encoder
The hidden layer and output layer: decoder
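To make the encoder/decoder split concrete, here is a small sketch in plain NumPy. It is a linear autoencoder trained with gradient descent on toy data; the sizes, learning rate and data are assumptions chosen for illustration, and a real model would normally use nonlinear layers in a deep learning framework.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: 200 samples with 8 features that really depend on 3 latent factors.
latent = rng.normal(size=(200, 3))
X = latent @ rng.normal(size=(3, 8))

n_features, n_hidden = X.shape[1], 3
W_enc = 0.1 * rng.normal(size=(n_features, n_hidden))  # input -> hidden (encoder)
W_dec = 0.1 * rng.normal(size=(n_hidden, n_features))  # hidden -> output (decoder)

losses = []
lr = 0.01
for _ in range(1000):
    codes = X @ W_enc        # encoder: compressed representation
    recon = codes @ W_dec    # decoder: attempt to rebuild the input
    err = recon - X
    losses.append(float(np.mean(err ** 2)))
    # Gradients of the mean squared reconstruction error.
    grad_dec = codes.T @ err * (2.0 / err.size)
    grad_enc = X.T @ (err @ W_dec.T) * (2.0 / err.size)
    W_enc -= lr * grad_enc
    W_dec -= lr * grad_dec

codes = X @ W_enc  # 3 numbers per sample instead of 8
```

The codes array is the reduced representation; only someone holding W_dec can map it back, which is the sense in which the hidden layer hides the data.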
Here you can find more information:
https://towardsdatascience.com/applied-deep-learning-part-3-autoencoders-1c083af4d798
One way would be to use a big hidden layer, larger than the data dimension, in order to create an "encrypted" version, but I really don't see the point if you can use classical cryptography.

Keras LSTM: is a time_step equal to 1 like transforming the LSTM into an MLP?

I'm a beginner in the field of deep learning. I'm trying to use Keras for an LSTM in a regression problem. I would like to build an ANN that can exploit the memory cell between one prediction and the next.
In more detail: I have a neural network (Keras) with 2 hidden LSTM layers and 1 output layer, in a regression context.
The batch_size is equal to 7, the timestep is equal to 1, and I have 5749 samples.
I'm only interested in understanding whether using timestep == 1 is the same thing as using an MLP instead of an LSTM. By time_step, I'm referring to the reshape phase for the input of the Sequential model in Keras. The output is a single regression value.
I'm not interested in the previous inputs; I'm only interested in the output of the network as information for the next prediction.
Thank you in advance!

You can say so :)
You're right in thinking that you won't have any recurrence anymore.
But internally, there will still be more operations than in regular Dense layers, because an LSTM has more kernels (one set of weights per gate).
But be careful:
If you use stateful=True, it will still be a recurrent LSTM!
If you use initial states properly, you can still make it recurrent.
If you're interested in creating custom operations with the memory/state of the cells, you could try creating your custom recurrent cell taking the LSTMCell code as a template.
Then you'd use that cell in a RNN(CustomCell, ...) layer.
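The point about recurrence can be checked with a hand-written LSTM step in NumPy. This is a sketch under my own assumptions (gate order, sizes, random weights), not the Keras implementation: with a single time step and a zero initial state the recurrent kernel never contributes, while a carried-over state brings it back into play.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, h_prev, c_prev, W, U, b):
    """One LSTM time step; gates are packed as [input, forget, cell, output]."""
    z = x @ W + h_prev @ U + b
    i, f, g, o = np.split(z, 4, axis=-1)
    c = sigmoid(f) * c_prev + sigmoid(i) * np.tanh(g)
    h = sigmoid(o) * np.tanh(c)
    return h, c

rng = np.random.default_rng(0)
n_in, n_units = 5, 4
W = rng.normal(size=(n_in, 4 * n_units))        # input kernel
U_a = rng.normal(size=(n_units, 4 * n_units))   # one recurrent kernel
U_b = rng.normal(size=(n_units, 4 * n_units))   # a different recurrent kernel
b = np.zeros(4 * n_units)

x = rng.normal(size=(1, n_in))
zeros = np.zeros((1, n_units))

# Single time step from a zero state: the recurrent kernel has no effect.
h_a, _ = lstm_step(x, zeros, zeros, W, U_a, b)
h_b, _ = lstm_step(x, zeros, zeros, W, U_b, b)
same_without_state = np.allclose(h_a, h_b)

# Carrying a nonzero state in (as a stateful layer does between batches)
# makes the output depend on the recurrent kernel again.
h_prev, c_prev = lstm_step(rng.normal(size=(1, n_in)), zeros, zeros, W, U_a, b)
h_c, _ = lstm_step(x, h_prev, c_prev, W, U_a, b)
h_d, _ = lstm_step(x, h_prev, c_prev, W, U_b, b)
same_with_state = np.allclose(h_c, h_d)
```

Even in the stateless case the step still evaluates four gate transformations, which is the extra work compared with a Dense layer.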

How to Add Flattened Layer (or Similar) For Variable Input Size of a Convolutional Neural Network

I am wondering whether it is possible to add something similar to a Flatten layer for images of variable size, and if so how.
Say we have an input layer for our CNN as:
input_shape=(1, None, None)
After performing your typical series of convolution/maxpooling layers, can we create a flattened layer, such that the shape is:
output_shape=(None,...)
If not, would someone be able to explain why not?

You can add GlobalMaxPooling2D or GlobalAveragePooling2D.
These will eliminate the spatial dimensions and keep only the channels dimension. Max takes the maximum value of each channel; Average takes the mean.
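What global pooling does to the shapes can be sketched in NumPy. The arrays below are made-up feature maps in channels-last layout (the question uses channels-first, so the axes would differ there); the point is only that the output length depends on the number of channels, not on height or width.

```python
import numpy as np

def global_max_pool(feature_map):
    """Collapse (height, width, channels) to (channels,) by taking the max."""
    return feature_map.max(axis=(0, 1))

def global_avg_pool(feature_map):
    """Collapse (height, width, channels) to (channels,) by taking the mean."""
    return feature_map.mean(axis=(0, 1))

rng = np.random.default_rng(0)
small = rng.normal(size=(7, 9, 16))    # feature map from a small image
large = rng.normal(size=(31, 20, 16))  # feature map from a larger image

pooled_shapes = [global_max_pool(small).shape, global_avg_pool(large).shape]
print(pooled_shapes)  # [(16,), (16,)]
```

Because both results have 16 elements, a Dense layer placed after the pooling always sees a fixed last dimension.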
I don't really know why you can't use a Flatten layer, but in fact you can't with variable dimensions.
I understand why a Dense layer wouldn't work: it would have a variable number of parameters, which is infeasible for backpropagation, weight updates and the like. (PS: Dense layers act only on the last dimension, so that is the only one that needs to be fixed.)
Examples:
A Dense layer requires the last dimension fixed
A Conv layer can have variable spatial dimensions, but needs fixed channels (otherwise the number of parameters will vary)
A recurrent layer can have variable time steps, but needs fixed features and so on
Also, notice that:
For classification models, you'd need a fixed dimension output, so, how to flatten and still guarantee the correct number of elements in each dimension? It's impossible.
For models with variable output, why would you want to have a fixed dimension in the middle of the model anyway?
If you're going totally custom, you can always use K.reshape() inside a Lambda layer and work with the tensor shapes:
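import keras.backend as K  # Keras 2 style backend API
from keras.layers import Lambda

def myReshape(x):
    # Dynamic shape of the incoming tensor (only known at run time).
    shape = K.shape(x)
    # Keep the batch dimension as a length-1 tensor.
    batchSize = shape[:1]
    # -1 lets reshape infer the flattened size from the remaining elements.
    newShape = K.constant([-1], dtype='int32')
    newShape = K.concatenate([batchSize, newShape])
    return K.reshape(x, newShape)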
The layer: Lambda(myReshape)

I don't think you can, because the compile step uses those dimensions to allocate fixed memory when your model is instantiated for training or prediction. Some dimensions need to be known ahead of time so the matrices can be allocated.
I understand why you want variable-sized image input: the world is not (226, 226, 3). It depends on your specific goals, but for me, scaling up or windowing to a region of interest, using say Single Shot Detection as a preprocessing step, may be helpful. You could just start with Keras's ImageDataGenerator to scale all images to a fixed size, and then see how much of a performance gain you get from conditional input sizing or windowing preprocessing.
@mikkola, I have found Flatten to be very helpful for TimeDistributed models. You can add Flatten after the convolution steps using:
your_model.add(Flatten())

Multilabel classification using LSTM on variable length signal using Keras

I have recently started working on classifying ECG signals into various classes. It is basically a multi-label classification task (4 classes in total). I am new to deep learning, LSTMs and Keras, which is why I am confused about a few things.
I am thinking about giving the normalized original signal as input to the network. Is this a good approach?
I also need to understand the training input shape for an LSTM, as ECG signals are of variable length (9000 to 18000 samples) and classifiers usually need fixed-size input. How can I handle this type of input with an LSTM?
Finally, what should the structure of a deep LSTM network be for such lengthy input, and how many layers should I use?
Thanks for your time.
Regards

I am thinking about giving the normalized original signal as input to the network. Is this a good approach?
Yes, this is a good approach. It is quite standard in deep learning to normalize or rescale the input before feeding it to the model.
This usually helps your model converge faster, as you are now inside a smaller range (e.g. [-1, 1]) instead of the larger un-normalized range of your original input (say [0, 1000]). It can also give better, more stable results, as it helps with problems like vanishing gradients and suits common activation functions and optimizers better.
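As an illustration of the rescaling described here, a minimal NumPy sketch that maps a signal from its own range onto [-1, 1]. The function name and the sample values are made up; per-signal standardization (zero mean, unit variance) would be another reasonable choice.

```python
import numpy as np

def scale_to_unit_range(signal):
    """Min-max scale a 1-D signal to [-1, 1] (hypothetical helper)."""
    signal = np.asarray(signal, dtype=float)
    lo, hi = signal.min(), signal.max()
    if hi == lo:
        # A constant signal carries no range information; map it to zeros.
        return np.zeros_like(signal)
    return 2.0 * (signal - lo) / (hi - lo) - 1.0

raw = np.array([0.0, 250.0, 500.0, 1000.0])
scaled = scale_to_unit_range(raw)
print(scaled)  # [-1.  -0.5  0.   1. ]
```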
I also need to understand the training input shape for an LSTM, as ECG signals are of variable length (9000 to 18000 samples) and classifiers usually need fixed-size input. How can I handle this type of input with an LSTM?
This part is really important. You are correct: within a batch, an LSTM expects every sequence to have the same shape, and the simplest way to guarantee that is to settle on a fixed shape that you know beforehand. The expected shape is described in the Keras docs on Recurrent Layers, where they say:
Input shape
3D tensor with shape (batch_size, timesteps, input_dim).
As we can see, it expects your data to have a number of timesteps as well as a dimension on each of those timesteps (the batch size is the number of sequences passed together). To exemplify, suppose one element of your input data looks like [[1,4],[2,3],[3,2],[4,1]]. Then, using a batch_size of 1, the shape of your data would be (1,4,2), as you have 4 timesteps, each with 2 features.
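The same shapes, written out in NumPy so they can be inspected; the second batch is just the example sequence repeated three times to show what the first axis means.

```python
import numpy as np

# One sequence: 4 timesteps, each with 2 features.
sequence = np.array([[1, 4], [2, 3], [3, 2], [4, 1]], dtype="float32")

# Add a leading batch axis to get (batch_size, timesteps, input_dim).
single_batch = sequence[np.newaxis, ...]
print(single_batch.shape)  # (1, 4, 2)

# Three sequences of the same length stacked into one batch.
triple_batch = np.stack([sequence, sequence, sequence])
print(triple_batch.shape)  # (3, 4, 2)
```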
So, bottom line, you have to make sure that you pre-process your data so it has a fixed shape that you can then pass to your LSTM layers. You will have to work out how by yourself, as you know your data and problem better than we do.
Maybe you can fix the number of samples you take from each signal, discarding some and keeping others so that every signal has the same length (if your signals are between 9k and 18k samples, choosing 9000 could be the logical choice, discarding the extra samples from the longer ones). You could even apply some other conversion that maps inputs of 9000-18000 samples to a fixed size.
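One possible way to do this, sketched in NumPy with random arrays standing in for ECG recordings. The helper name and the choice to keep the first 9000 samples are assumptions; which part of the signal to keep (or whether to resample instead) depends on the data.

```python
import numpy as np

TARGET_LEN = 9000  # shortest signal length mentioned in the question

def to_fixed_length(signal, target_len=TARGET_LEN):
    """Truncate long signals and zero-pad short ones to target_len samples."""
    signal = np.asarray(signal, dtype="float32")[:target_len]
    padded = np.zeros(target_len, dtype="float32")
    padded[: len(signal)] = signal
    return padded

rng = np.random.default_rng(0)
signals = [rng.normal(size=n) for n in (9000, 12500, 18000)]

# Stack into (batch_size, timesteps, input_dim) with one feature per timestep.
batch = np.stack([to_fixed_length(s) for s in signals])[..., np.newaxis]
print(batch.shape)  # (3, 9000, 1)
```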
Finally, what should the structure of a deep LSTM network be for such lengthy input, and how many layers should I use?
This one is really quite broad and doesn't have a unique answer. It would depend on the nature of your problem, and determining those parameters a priori is not so straightforward.
What I recommend you do is to start with a simple model first, and then add layers and blocks (neurons) incrementally until you are satisfied with the results.
Try just one hidden layer first, train and test your model and check your performance. You can then add more blocks and see if your performance improved. You can also add more layers and check for the same until you are satisfied.
This is a good way to create deep learning models, as you will arrive at the results you want while keeping your network as lean as possible, which in turn helps execution time and complexity. Good luck with your coding, hope you find this useful.

Implementing sparse connections in neural network (Theano)

Some use cases for neural networks require that not all neurons are connected between two consecutive layers. For my neural network architecture, I need to have a layer where each neuron only has connections to some prespecified neurons in the previous layer (at somewhat arbitrary places, not with a pattern such as a convolution layer). This is needed in order to model data on a specific graph. I need to implement this "Sparse" layer in Theano, but I'm not used to the Theano way of programming.
It seems that the most efficient way of programming sparse connections in Theano would be to use theano.tensor.nnet.blocksparse.SparseBlockGemv. An alternative would be to do matrix multiplication, where many weights are set to 0 (= no connection), but that would be very inefficient compared to SparseBlockGemv as each neuron is only connected to 2-6 neurons in the previous layer out of ~100000 neurons. Moreover, a weight matrix of 100000x100000 would not fit on my RAM/GPU. Could someone therefore provide an example of how to implement sparse connections using the SparseBlockGemv method or another computationally-efficient method?
A perfect example would be to extend the MLP Theano Tutorial with an extra layer after the hidden layer (and before softmax), where each neuron only has connections to a subset of neurons in the previous layer. However, other examples are also very welcome!
Edit: Note that the layer must be implemented in Theano as it is just a small part of a larger architecture.

The output of a fully-connected layer is given by the dot product of the input and the weights of that layer. In theano or numpy you can use the dot method.
y = x.dot(w)
If you only have connections to some neurons in the previous layer and those connections are predefined you could do something like this:
y = [x[edges[i]].dot(w[i]) for i in neurons]
Where edges[i] contains the indices of the neurons connected to neuron i and w[i] the weights of those connections.
Please note, that theano doesn't know about layers or other high-level details.
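Here is the same idea spelled out in NumPy, as a sketch rather than a Theano implementation. It assumes every output neuron has the same number of incoming connections (fan_in), so that edges and w can be stored as dense (n_out, fan_in) arrays; with 2 to 6 connections per neuron you would need padding or per-neuron lists. The dense matrix at the end exists only to check the result on this small example.

```python
import numpy as np

rng = np.random.default_rng(0)
n_prev, n_out, fan_in = 1000, 6, 4

x = rng.normal(size=n_prev)  # activations of the previous layer

# edges[i] lists the previous-layer neurons feeding output neuron i.
edges = np.stack([rng.choice(n_prev, size=fan_in, replace=False)
                  for _ in range(n_out)])
w = rng.normal(size=(n_out, fan_in))  # one weight per existing connection

# Loop form, mirroring the list comprehension above.
y_loop = np.array([x[edges[i]].dot(w[i]) for i in range(n_out)])

# Vectorised form: gather the connected inputs, then take a weighted sum.
y_vec = (x[edges] * w).sum(axis=1)

# Reference: the equivalent dense weight matrix, mostly zeros.
w_dense = np.zeros((n_out, n_prev))
w_dense[np.arange(n_out)[:, None], edges] = w
y_dense = w_dense @ x
```

Only n_out * fan_in weights are stored in the gather form, which is what makes it workable when the previous layer has around 100000 neurons.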

Apologies for resurrecting an old thread, but this was the simplest guidance I found that was useful in extending the guidance at https://iamtrask.github.io/2015/07/12/basic-python-network/ for partially-connected inputs. However, it took me a while to make sense of basaundi's answer and I think I can improve upon it.
There were a couple of things that I needed to change to make it work. In my case, I am trying to map from N inputs to M neurons in my first hidden layer. My inputs are in an NxF array, where F is the number of features for my inputs, and my synapse values (weights) between inputs and the first layer are in an FxM array. Therefore, the output of Inputs <dot> Weights is an NxM array. My edge matrix is an MxF array that specifies, for each neuron in layer 1 (rows), which of the features of the input data are relevant (columns).
In this setup, at least, it required me to slice my arrays differently than specified above. Also, the list comprehension returns a list of matrices, which must be summed to get the correct NxM (otherwise you get an MxNxM array).
So I have used the following (util.sigmoid is a helper function of my own):
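import numpy
# One NxM partial product per layer-1 neuron, built from that neuron's features.
y = [numpy.dot(x[:, edges[i]], w[edges[i]])
     for i in range(M)]
# Sum the partial products into one NxM array, then apply the activation.
y = util.sigmoid(numpy.sum(y, 0))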
This seems to work for me.
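For comparison, a sketch of a strictly per-neuron variant of this NxF / FxM / MxF setup, with made-up sizes and a boolean edge matrix as assumptions. Note that summing the full NxM products over i lets every output column see the union of the selected features, so this variant picks out column i of w instead, and checks it against a masked dense product.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
N, F, M = 5, 8, 3
x = rng.normal(size=(N, F))   # inputs, NxF
w = rng.normal(size=(F, M))   # weights, FxM

# edges[i, f] is True when layer-1 neuron i may use input feature f.
edges = np.array([[1, 1, 0, 0, 0, 0, 0, 0],
                  [0, 0, 1, 1, 1, 0, 0, 0],
                  [0, 0, 0, 0, 0, 1, 1, 1]], dtype=bool)

# Per-neuron form: column i uses only the features flagged in edges[i].
y_cols = np.stack([x[:, edges[i]] @ w[edges[i], i] for i in range(M)], axis=1)

# Equivalent masked form: zero the disallowed weights, then one dense product.
y_mask = x @ (w * edges.T)

y = sigmoid(y_mask)  # NxM activations

# Sanity check: perturbing feature 7 must not change neuron 0's output.
x_perturbed = x.copy()
x_perturbed[:, 7] += 1.0
y_perturbed = sigmoid(x_perturbed @ (w * edges.T))
```

The masked form is convenient for small F and M; for very wide layers the per-neuron gather avoids building the full FxM mask.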
