I recently attended a Data Science meetup in my city where there was a talk about combining neural networks with an SVM. Unfortunately, the presenter had to leave right after the presentation, so I wasn't able to ask any questions.
I was wondering how that is possible. He talked about using neural networks for his classification and then using an SVM classifier to improve his accuracy and precision by about 10%.
I am using Keras for neural networks and sklearn for the rest of my ML work.
This is completely possible and actually quite common. You select the output of one layer of the neural network and use it as a feature vector to train an SVM. Generally one normalizes the feature vectors as well.
Features learned by (convolutional) neural networks are powerful enough that they generalize to different kinds of objects and even to completely different images. For examples, see the paper "CNN Features off-the-shelf: an Astounding Baseline for Recognition".
As for the implementation: train a neural network, select one of its layers (usually the one right before the fully connected layers, or the first fully connected one), run the network on your dataset, store all the feature vectors, and then train an SVM on them with a different library (e.g. sklearn).
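The steps above can be sketched as follows. This is only an illustration under stated assumptions: to keep it runnable with sklearn alone, a small `MLPClassifier` stands in for the Keras network, the digits dataset stands in for your data, and `hidden_features` is a hypothetical helper that replays the forward pass up to the chosen hidden layer.

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Stand-in dataset (assumption): 8x8 digit images, pixel values scaled to [0, 1].
X, y = load_digits(return_X_y=True)
X = X / 16.0
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Step 1: train the neural network as usual.
net = MLPClassifier(hidden_layer_sizes=(64, 32), activation="relu",
                    max_iter=500, random_state=0)
net.fit(X_train, y_train)


def hidden_features(model, X, n_layers):
    """Hypothetical helper: activations after the first n_layers hidden (ReLU) layers."""
    a = X
    for W, b in zip(model.coefs_[:n_layers], model.intercepts_[:n_layers]):
        a = np.maximum(a @ W + b, 0.0)
    return a


# Step 2: run the network on the data and store the feature vectors.
F_train = hidden_features(net, X_train, n_layers=2)
F_test = hidden_features(net, X_test, n_layers=2)

# Step 3: normalize the features and train an SVM on them.
svm = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0))
svm.fit(F_train, y_train)

nn_acc = net.score(X_test, y_test)
svm_acc = svm.score(F_test, y_test)
print(f"network accuracy: {nn_acc:.3f}, SVM on network features: {svm_acc:.3f}")
```

With a Keras model the idea is the same: build a second model that outputs the chosen layer, call its `predict` on your data, and pass the resulting array to sklearn. Whether the SVM actually beats the network's own output layer depends on the data, so compare both on a held-out set.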
I would like to ask the following: an LSTM can be modeled as many-to-one.
However, Seq2Seq can also be modeled as many-to-one (M to N, where N is one).
So, what is the difference?
Long short-term memory (LSTM) is an artificial recurrent neural network (RNN) architecture used in the field of deep learning. Unlike standard feedforward neural networks, LSTM has feedback connections. It can not only process single data points (such as images), but also entire sequences of data (such as speech or video). For example, LSTM is applicable to tasks such as unsegmented, connected handwriting recognition, speech recognition and anomaly detection in network traffic or IDSs (intrusion detection systems).
(https://en.wikipedia.org/wiki/Long_short-term_memory)
Seq2seq is a family of machine learning approaches used for language processing. Applications include language translation, image captioning, conversational models and text summarization.
...
Seq2seq turns one sequence into another sequence. It does so by use of
a recurrent neural network (RNN) or more often LSTM or GRU to avoid
the problem of vanishing gradient.
(https://en.wikipedia.org/wiki/Seq2seq)
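From my understanding, Seq2Seq is a model that is optimized for NLP and uses an LSTM or GRU under the hood. In other words, an LSTM is a building block, while Seq2Seq is an encoder-decoder architecture built out of such blocks.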
The dataset I am working on has 7 input features and 4 output classes, and it contains 160 samples. Would a neural network be a good choice here? If so, how should I feed my inputs to the neural network? Since I have 4 output classes, I am going to use softmax in the final layer.
If a neural network makes no sense for such a small dataset, which machine learning algorithms are likely to give good results on this kind of problem?
Thanks 😊
What kind of a dataset do you have? I am assuming a tabular dataset.
You can use a neural network if you must. However, for such a small dataset, a neural network isn't usually advisable. You should rather look into the following classifiers:
Decision Tree
Naive Bayes
Multi-class Logistic Regression
Support Vector Machine
Ensemble models (Random Forest and/or Gradient Boosting)
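A quick way to choose among these classifiers is to cross-validate all of them on the same folds. The sketch below assumes a tabular dataset and uses synthetic data from `make_classification` with the same shape as in the question (160 rows, 7 features, 4 classes); the model settings are defaults picked for illustration, not tuned values.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the real data (assumption): 160 samples, 7 features, 4 classes.
X, y = make_classification(
    n_samples=160, n_features=7, n_informative=5, n_redundant=1,
    n_classes=4, n_clusters_per_class=1, random_state=0,
)

models = {
    "decision tree": DecisionTreeClassifier(random_state=0),
    "naive Bayes": GaussianNB(),
    # Scale-sensitive models get a StandardScaler in front of them.
    "logistic regression": make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000)),
    "SVM": make_pipeline(StandardScaler(), SVC()),
    "random forest": RandomForestClassifier(n_estimators=200, random_state=0),
    "gradient boosting": GradientBoostingClassifier(random_state=0),
}

# Stratified folds keep the class proportions similar in every split,
# which matters with only 160 samples.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

scores = {}
for name, model in models.items():
    scores[name] = cross_val_score(model, X, y, cv=cv, scoring="accuracy").mean()

for name, acc in sorted(scores.items(), key=lambda item: -item[1]):
    print(f"{name:20s} mean CV accuracy = {acc:.3f}")
```

On a dataset this small, the differences between models can be within the noise of the folds, so it is worth looking at the spread of the per-fold scores and not only the mean.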
I used GridSearchCV to determine which hyperparameters of the MLPClassifier can raise the accuracy of my neural network. I figured out that the number of layers and nodes makes a difference, but I'm trying to figure out which other settings can make a difference in accuracy (F1 score, actually). From my experience, it looks like parameters such as "activation", "learning_rate", and "solver" don't really change anything.
I need to research which other hyperparameters can make a difference in the accuracy of the network's predictions.
Does anyone have tips or ideas on which parameters, other than the number of layers and nodes, can make a difference in the accuracy of my neural network's predictions?
It all depends on your dataset. Neural networks are not magical tools that can learn everything, and they require a lot of data compared to traditional machine learning models. In the case of an MLP, adding a lot of layers is rarely a good idea, as it makes the model more complex and slower and can lead to overfitting. The learning rate is an important factor because it controls how the optimizer searches for the best solution. A model makes mistakes and learns from them, and the speed of that learning is controlled by the learning rate. If the learning rate is too small, your model will take a long time to reach the best possible state, but if it is too high, the model might overshoot that state. The choice of activation function again depends on the use case and the data, but for simpler datasets the activation function will not make a huge difference.
In typical deep learning models, a neural network is built up of several layers, which might not always be dense. All the layers in an MLP are dense, i.e. fully connected feed-forward layers. To improve your model, you can try combining dense layers with CNN, RNN, LSTM, GRU, or other layers. Which layer to use depends completely on your dataset. If you are using a very simple dataset for a school project, then experiment with traditional machine learning methods like random forest, as you might get better results.
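The effect of the learning rate can be seen on a toy problem. This is a minimal sketch, assuming plain gradient descent on the one-dimensional function f(w) = (w - 3)^2; it is not how MLPClassifier trains internally, only an illustration of "too small" versus "too high".

```python
def gradient_descent(lr, steps=50, w0=0.0):
    """Minimize f(w) = (w - 3)^2 with plain gradient descent; the minimum is at w = 3."""
    w = w0
    for _ in range(steps):
        grad = 2.0 * (w - 3.0)  # derivative of (w - 3)^2
        w -= lr * grad
    return w


for lr in (0.001, 0.1, 1.1):
    w = gradient_descent(lr)
    print(f"lr={lr}: final w={w:.4g}, distance from optimum={abs(w - 3.0):.4g}")
```

With the smallest rate the parameter has barely moved after 50 steps, with the middle rate it lands very close to 3, and with the largest rate every step overshoots by more than the previous error, so the distance grows instead of shrinking.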
If you want to stick to neural nets, read about other types of layers, dropout, regularization, pooling, etc.
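For the MLPClassifier search in the question, two parameters worth adding to the grid are `alpha` (the L2 regularization strength) and `learning_rate_init` (the initial step size). Note that in sklearn the `learning_rate` parameter is only a schedule and is only used when `solver="sgd"`, which may be why changing it appeared to do nothing with the default solver. The sketch below is an illustration under assumptions: the digits dataset stands in for your data, and the grid values are arbitrary starting points.

```python
from sklearn.datasets import load_digits
from sklearn.model_selection import GridSearchCV
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Stand-in data (assumption): a 500-sample subset of the digits dataset.
X, y = load_digits(return_X_y=True)
X, y = X[:500], y[:500]

# Scaling inside the pipeline means it is refit on every training fold.
pipe = Pipeline([
    ("scale", StandardScaler()),
    ("mlp", MLPClassifier(hidden_layer_sizes=(32,), max_iter=300, random_state=0)),
])

param_grid = {
    "mlp__alpha": [1e-4, 1e-2, 1.0],               # L2 penalty
    "mlp__learning_rate_init": [1e-4, 1e-3, 1e-2],  # step size for adam / sgd
}

# f1_macro averages the per-class F1 scores, matching the metric in the question.
search = GridSearchCV(pipe, param_grid, scoring="f1_macro", cv=3)
search.fit(X, y)

print("best parameters:", search.best_params_)
print(f"best mean CV macro-F1: {search.best_score_:.3f}")
```

Other candidates for the grid are `batch_size` and `early_stopping`. Scaling the inputs often matters as much as any of these, which is why the scaler is part of the pipeline here.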
I'm working on a project that requires recognizing only people in a video or a live stream from a camera. I'm currently using the TensorFlow object recognition API with Python, and I've tried different pre-trained models and frozen inference graphs. I want to recognize only people and maybe cars, so I don't need my neural network to recognize all 90 classes that come with the frozen inference graphs based on MobileNet or RCNN. It seems this slows the process down, and 89 of these 90 classes are not needed in my project. Do I have to train my own model, or is there a way to modify the inference graphs and the existing models? This is probably a noob question for some of you, but bear in mind that I've worked with TensorFlow and machine learning for just one month.
Thanks in advance
Shrinking the last layer to output one or two classes is not likely to yield large speedups, because most of the computation is in the intermediate layers. You could shrink the intermediate layers, but this would result in poorer accuracy.
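Since the output layer is not the bottleneck, one option that needs no retraining is to keep the pre-trained model and simply discard detections of other classes. The sketch below is an assumption-laden illustration: `keep_classes` is a hypothetical helper, the plain lists stand in for the per-image boxes, classes, and scores arrays a detector returns, and the IDs 1 (person) and 3 (car) are assumed to follow the COCO label map, so check them against the label map you actually load.

```python
# Assumed COCO label-map IDs; verify against your own label map file.
PERSON_ID = 1
CAR_ID = 3


def keep_classes(boxes, classes, scores, wanted=(PERSON_ID, CAR_ID), min_score=0.5):
    """Hypothetical helper: keep only detections of the wanted classes above min_score."""
    kept = []
    for box, cls, score in zip(boxes, classes, scores):
        if int(cls) in wanted and score >= min_score:
            kept.append((box, int(cls), score))
    return kept


# Made-up detections for one frame: (ymin, xmin, ymax, xmax) boxes, class IDs, scores.
boxes = [(0.1, 0.1, 0.5, 0.4), (0.2, 0.3, 0.9, 0.8), (0.0, 0.0, 0.2, 0.2), (0.5, 0.5, 0.7, 0.9)]
classes = [1, 18, 3, 1]
scores = [0.92, 0.88, 0.75, 0.30]

filtered = keep_classes(boxes, classes, scores)
for box, cls, score in filtered:
    print(cls, score, box)
```

This does not make inference faster; it only removes the unwanted classes from the results, which is consistent with the point above that the cost lies in the intermediate layers.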
Yes, you have to train your own model. Here is a short overview of some ways to do it.
OPTION 1. If you want to transfer as much knowledge as possible, you can freeze the CNN layers. Then change the number of detected classes by changing the dimension of the classifier (the dense layers). The classifier is the last part of the CNN architecture. Now you only need to retrain the classifier.
OPTION 2. If you want to transfer knowledge only for the first layers of the CNN (for example, by freezing the first 2-3 CNN layers), change the number of detected classes by changing the dimension of the classifier. Now you need to retrain the rest of the CNN layers together with the classifier.
OPTION 3. If you want to retrain the whole CNN together with the classifier, change the number of detected classes by changing the dimension of the classifier. Now you need to retrain the whole CNN with the classifier.
Generally, the TensorFlow Object Detection API is a good start for beginners! You can find more detail about the whole process here, and extra explanation here.
I need to implement a classification application for neuron signals. In the first step, I need to train a denoising autoencoder (DAE) layer for signal cleaning; then I will feed its output to a DBN network for classification. I tried to find support for these model types in TensorFlow, but all I found were two models, CNN and RNN. Does anyone have an idea about a robust implementation of these two models using TensorFlow?