I am trying to build a multi-column deep neural network (MDNN) with tflearn and tensorflow. The MDNN is explained in this paper. The part I am struggling with is how I can combine two or more inputs to be fed to tensorflow.
For a single column I have:
network = tflearn.input_data(shape=[None, image_shape, image_shape, 3])
and
model.fit(X_input, y_train, n_epoch=50, shuffle=True,
          validation_set=(X_test_norm, y_test),
          show_metric=True, batch_size=240, run_id='traffic_cnn2')
where X_input is of shape (31367, 32, 32, 3). I am pretty new to numpy, tensorflow and tflearn. The difficulty for now really lies in how to specify multiple inputs to tflearn.
Any help is greatly appreciated.
The MDNN explained in the paper individually trains several models using random (but bounded) distortions on the data. Once all models are trained, they produce predictions using an ensemble classifier by averaging the output of all the models on different versions of the data.
As far as I understand, the columns are not trained jointly but independently. So you must create different models and call fit on each of them. I recommend you start by training a single model and, once you have a training setup that gets good results, replicate it. To generate predictions, you must average the predicted probabilities from the predict function and take the most probable class.
One way to generate data from your inputs is to use data augmentation. However, instead of generating new data, you must replace the original data with the modified versions.
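If it helps, here is a minimal sketch of the ensemble step, assuming models is a list of independently trained tflearn models (the name is illustrative, not from your code) and each predict call returns class probabilities:

import numpy as np

# Hypothetical: models is a list of independently trained tflearn models,
# X_test_norm holds the (possibly distorted) test images.
probs = [np.asarray(m.predict(X_test_norm)) for m in models]

# Average the predicted probabilities across the columns,
# then take the most probable class for each sample.
avg_probs = np.mean(probs, axis=0)
y_pred = np.argmax(avg_probs, axis=1)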
Related
This question is pretty similar to this one and builds on this post on GitHub, in the sense that I am trying to convert an SVM multiclass classification model (e.g., built with sklearn) to a Keras model.
Specifically, I am looking for a way of retrieving probabilities (similar to SVC's probability=True) or a confidence value at the end, so that I can define a threshold and distinguish between trained classes and non-trained ones. That is, if I train my model with 3 or 4 classes but then present a 5th class it wasn't trained on, it will still output some prediction, even if totally wrong. I want to avoid that in some way.
I got the following working reasonably well, but it relies on picking the maximum value at the end (argmax), which I would like to avoid:
import keras
from keras import regularizers
from keras.models import Sequential
from keras.layers import Dense, Activation

model = Sequential()
model.add(Dense(30, input_shape=(30,), activation='relu', kernel_initializer='he_uniform'))
# output classes
model.add(Dense(3, kernel_regularizer=regularizers.l2(0.1)))
# The activation is linear by default, which works; softmax makes the accuracy
# get stuck at 33% when targeting 3 classes, or 25% when targeting 4.
#model.add(Activation('softmax'))
model.compile(loss='categorical_hinge', optimizer=keras.optimizers.Adam(lr=1e-3), metrics=['accuracy'])
Any ideas on how to tackle this untrained-class problem? Something like Platt scaling or temperature scaling would work, as long as I can still save the model as ONNX.
As I suspected, I got softmax to work by scaling the features (input) of the model. No need for stop gradient or anything. I was using really big numbers which, despite training well, were preventing softmax (logistic regression) from working properly. The features can be scaled, for instance, with the following code:
from sklearn.preprocessing import StandardScaler
scaler = StandardScaler()
X_std = scaler.fit_transform(X)
By doing this, the SVM-like Keras model outputs probabilities as originally intended.
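For reference, a minimal sketch of the fixed pipeline, assuming the same 30-feature input and 3 classes as above (X and y stand in for your feature matrix and one-hot labels):

import keras
from keras import regularizers
from keras.models import Sequential
from keras.layers import Dense
from sklearn.preprocessing import StandardScaler

# Standardize the features first; this is what made softmax behave.
scaler = StandardScaler()
X_std = scaler.fit_transform(X)

model = Sequential()
model.add(Dense(30, input_shape=(30,), activation='relu', kernel_initializer='he_uniform'))
# softmax can now be enabled on the SVM-like output layer
model.add(Dense(3, activation='softmax', kernel_regularizer=regularizers.l2(0.1)))
model.compile(loss='categorical_hinge', optimizer=keras.optimizers.Adam(lr=1e-3), metrics=['accuracy'])
model.fit(X_std, y, epochs=50, batch_size=32)  # y: one-hot labels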
Keras gives me a way to use my deep learning models with sklearn (the Keras wrapper for sklearn), but I need the same thing the other way around.
I want to create an ensemble of several already-trained sklearn models by feeding their outputs to the input layer of a deep learning classifier (to be trained).
Can I achieve that?
You should probably explore stacking: http://blog.kaggle.com/2016/12/27/a-kagglers-guide-to-model-stacking-in-practice/
What happens is that when we do cross-validation, we can combine the out-of-fold predictions to regenerate the training data.
For example, if you have 1000 data points and you use 5 folds for evaluation, you will have 5 different validation sets of length 200. Combining all the predictions obtained on these sets essentially gives you a new feature of length 1000.
Similarly, by training more models you can get 3-4 features corresponding to the predictions from 3-4 models.
Finally, you can stack these features with any model of your choice; you can even use a deep neural network, as in the sketch below.
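A minimal sketch of this scheme, assuming two sklearn base models and a small Keras network as the stacker (model choices and hyperparameters are illustrative; X and y stand in for your data and integer class labels):

import numpy as np
from sklearn.model_selection import cross_val_predict
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from keras.models import Sequential
from keras.layers import Dense

# Out-of-fold probability predictions become the new features.
rf = RandomForestClassifier(n_estimators=100)
lr = LogisticRegression()
meta_X = np.hstack([
    cross_val_predict(rf, X, y, cv=5, method='predict_proba'),
    cross_val_predict(lr, X, y, cv=5, method='predict_proba'),
])

# A small deep network as the meta-learner, trained on the stacked features.
n_classes = len(np.unique(y))
stacker = Sequential()
stacker.add(Dense(16, activation='relu', input_shape=(meta_X.shape[1],)))
stacker.add(Dense(n_classes, activation='softmax'))
stacker.compile(loss='sparse_categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
stacker.fit(meta_X, y, epochs=20, batch_size=32)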
Can anyone please help me out?
I am working on my thesis, which is about predicting Parkinson's disease. I want to build an LSTM model that adapts independently of the patient. Currently I have implemented it using TensorFlow with my own loss function.
I am planning to include both labeled and unlabeled training data in every batch. I want to apply my own loss function to both the labeled and unlabeled training data, and I also want to apply a cross-entropy loss only to the labeled training data. Can I do this in tensorflow?
So my question is: can I combine loss functions in a single model, with each loss applied to a different subset of the training data?
From an implementation perspective, the short answer is yes. However, your question could be more specific; maybe what you mean is whether you could do it with tf.estimator?
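For illustration, a minimal sketch in plain TensorFlow of masking the cross-entropy term so it only touches the labeled part of the batch (logits, labels, is_labeled and custom_loss_fn are assumed to exist in your graph; the names are illustrative, not from your code):

import tensorflow as tf

# Per-example cross entropy over the whole batch.
per_example_ce = tf.nn.softmax_cross_entropy_with_logits_v2(labels=labels, logits=logits)

# Zero out the cross-entropy term for the unlabeled examples.
mask = tf.cast(is_labeled, tf.float32)  # True for labeled examples
ce_loss = tf.reduce_sum(per_example_ce * mask) / tf.maximum(tf.reduce_sum(mask), 1.0)

# Your own loss runs on the full batch; the cross entropy only on labeled data.
total_loss = custom_loss_fn(logits) + ce_loss
train_op = tf.train.AdamOptimizer(1e-3).minimize(total_loss)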
I'm building a CNN model using Tensorflow, without any frontend APIs such as Keras. I'm creating a VGG-16 model with pre-trained weights and want to fine-tune the last layers to serve my purpose.
Following the tutorial here, http://cv-tricks.com/tensorflow-tutorial/training-convolutional-neural-network-for-image-classification/
I re-created the training script and modified it as per my requirements. However, training does not progress: the training accuracy is stuck at 50.00% and the validation accuracy repeats the same few numbers in a pattern.
Attached is the screenshot of the same.
I have been stuck on this for days now and can't seem to find the error. Any help is appreciated.
The code is pretty long, so here is the gist file for it.
Your cross entropy is wrong: you are comparing your logits with the softmax of your logits.
This:
cross_entropy = tf.nn.softmax_cross_entropy_with_logits(logits=layer_fc2,
                                                        labels=y_pred)
Should be:
cross_entropy = tf.nn.softmax_cross_entropy_with_logits(logits=layer_fc2,
                                                        labels=y_true)
Some things to note: I would not train on a data point and then evaluate on that same data point; your training accuracy will probably be biased by doing so. Another point to note is that tf.argmax(tf.nn.softmax(logits)) is the same as tf.argmax(logits).
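To see the last point, a tiny illustrative snippet: softmax is monotonic, so it does not change which index is the largest.

import tensorflow as tf

logits = tf.constant([[2.0, 1.0, 0.1]])
pred_from_logits = tf.argmax(logits, axis=1)
pred_from_softmax = tf.argmax(tf.nn.softmax(logits), axis=1)
# Both evaluate to [0] when run; the softmax is redundant for argmax.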
I am trying to solve a time series prediction problem. I tried with ANN and LSTM, played around a lot with the various parameters, but all I could get was 8% better than the persistence prediction.
So I was wondering: since you can save models in Keras, are there any pre-trained models (LSTM, RNN, or any other ANN) for time series prediction? If so, how do I get them? Are there any in Keras?
I mean, it would be super useful if there were a website containing pre-trained models, so that people wouldn't have to spend so much time training them.
Similarly, another question:
Is it possible to do the following?
Suppose I have a dataset now and I use it to train my model. Suppose that in a month I will have access to another dataset (corresponding to the same or similar data, possibly from the future, but not exclusively). Will it be possible to continue training the model then? It is not the same as training in batches; when you train in batches you have all the data at one moment.
Is it possible? And how?
I'll answer your last questions first.
Will it be possible to continue training the model then? It is not the same as training in batches; when you train in batches you have all the data at one moment. Is it possible? And how?
Yes, it is possible. In general, it's called transfer learning. But keep in mind that if the two datasets represent very different populations, the network will soon "forget" what it learned on the first run and optimize for the second one. To do this, you simply start training from a loaded state instead of a random initialization and save the model afterwards. It is also recommended to use a smaller learning rate on the second run in order to adapt the model gradually to the new data.
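A minimal Keras sketch of resuming training from a saved model (the file names and the new dataset X_new, y_new are illustrative):

import keras
from keras.models import load_model

# Start from the saved state instead of a random initialization.
model = load_model('model_run1.h5')

# Recompile with a smaller learning rate to adapt gradually to the new data.
model.compile(loss=model.loss, optimizer=keras.optimizers.Adam(lr=1e-4), metrics=['accuracy'])

model.fit(X_new, y_new, epochs=10, batch_size=32)
model.save('model_run2.h5')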
are there any pre-trained models (LSTM, RNN, or any other ANN) for time series prediction? If so, how do I get them? Are there any in Keras?
I haven't found exactly a pre-trained model, but a quick search gave me several active GitHub projects that you can just run and get a result for yourself: Time Series Prediction with Machine Learning (LSTM, GRU implementation in tensorflow), LSTM Neural Network for Time Series Prediction (keras and tensorflow), Time series predictions with Keras (keras and theano), Neural-Network-with-Financial-Time-Series-Data (keras and tensorflow). See also this post.
Now you can use BERT or related variants and here you can find all the pre-trained models: https://huggingface.co/transformers/pretrained_models.html
It is also possible to pre-train and fine-tune RNNs; you can refer to this paper: TimeNet: Pre-trained deep recurrent neural network for time series classification.