How can I find out the most impactful inputs to use as the predictors for the Recurrent Neural Network (RNN) modeling? I have a CSV file that has 25 columns and all of them are numeric. I want to predict one of the columns using the rest of the columns (24 columns). How can I find out how many of those 24 columns are impactful enough to be used as input using Mutual Information Analysis in python?
Usually Energy Disaggregation is done from a single input (grid consumption) to multiple outputs (appliances in a particular home). If you want to include multiple inputs, try building a multiple-input branch Neural Network and then stack your RNN layers.
You can also have a look at this blog for a better understanding of disaggregation.
If you want to get started with Energy Disaggregation and NILM using Deep Learning, you can have a look at this open source library: https://github.com/plexflo/plexflo. It also includes a Deep Learning model (LSTM) that can do basic Energy Disaggregation.
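On the mutual information part of the question, here is a minimal sketch of how the 24 candidate columns could be ranked against the target. It is a crude estimator that assumes equal-width binning is acceptable for your data; the function names, the bin count and the toy columns are all made up for illustration, and loading the CSV into per-column lists is left out. In practice a library estimator such as scikit-learn's mutual_info_regression is the more usual choice.

```python
import math
from collections import Counter

def bin_values(values, n_bins=10):
    """Map each numeric value to an equal-width bin index in [0, n_bins - 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:                       # constant column: everything in one bin
        return [0] * len(values)
    width = (hi - lo) / n_bins
    return [min(int((v - lo) / width), n_bins - 1) for v in values]

def mutual_information(xs, ys, n_bins=10):
    """Estimate I(X; Y) in nats from binned co-occurrence counts."""
    bx, by = bin_values(xs, n_bins), bin_values(ys, n_bins)
    n = len(xs)
    joint = Counter(zip(bx, by))
    px, py = Counter(bx), Counter(by)
    mi = 0.0
    for (i, j), c in joint.items():
        # p(i, j) * log(p(i, j) / (p(i) * p(j))), written with raw counts
        mi += (c / n) * math.log(c * n / (px[i] * py[j]))
    return mi

def rank_features(columns, target, n_bins=10):
    """Return (column name, MI) pairs, most informative first."""
    scores = {name: mutual_information(col, target, n_bins)
              for name, col in columns.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

# Hypothetical toy data standing in for the CSV columns.
target = [float(i) for i in range(100)]
columns = {
    "copy_of_target": [2.0 * t + 1.0 for t in target],  # fully informative
    "constant": [5.0] * 100,                            # carries no information
}
ranking = rank_features(columns, target)
```

How many columns to keep is then a judgment call, for example everything above a score threshold that you validate by retraining the RNN with and without the low-ranked columns.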
I want to build a neural network that will take the feature map from the last layer of a CNN (VGG or ResNet, for example), concatenate an additional vector (for example, a 1x768 BERT vector), and re-train the last layer on a classification problem.
So the architecture should be like in:
but I want to concatenate an additional vector to each feature vector (I have a sentence to describe each frame).
I have 5 possible labels and 100 frames in the input.
Can someone help me as to how to implement this type of network?
I would recommend looking into the Keras functional API.
Unlike a sequential model (which is usually enough for many introductory problems), the functional API allows you to create any acyclic graph you want. This means that you can have two input branches, one for the CNN (image data) and the other for any NLP you need to do (relating to the descriptive sentence that you mentioned). Then, you can feed in the combined outputs of these two branches into the final layers of your network and produce your result.
Even if you've already created your model using models.Sequential(), it shouldn't be too hard to rewrite it to use the functional API.
For more information and implementation details, look at the official documentation here: https://keras.io/guides/functional_api/
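As a rough sketch of the two-branch idea, assuming TensorFlow's bundled Keras is installed, that the CNN features have already been extracted as one 2048-dimensional vector per frame (the 2048 is an assumption, not something stated in the question), and that you want one of the 5 labels per frame:

```python
from tensorflow import keras
from tensorflow.keras import layers

N_FRAMES = 100        # from the question
CNN_DIM = 2048        # assumed size of the pre-extracted CNN feature vector
TEXT_DIM = 768        # BERT sentence vector size from the question
N_LABELS = 5          # from the question

# Two input branches: per-frame CNN features and per-frame sentence vectors.
frame_features = keras.Input(shape=(N_FRAMES, CNN_DIM), name="cnn_features")
sentence_vectors = keras.Input(shape=(N_FRAMES, TEXT_DIM), name="sentence_vectors")

# Concatenate along the feature axis so each frame gets a 2048 + 768 vector.
merged = layers.Concatenate(axis=-1)([frame_features, sentence_vectors])

# A Dense layer acts on the last axis, so this classifies every frame separately.
predictions = layers.Dense(N_LABELS, activation="softmax")(merged)

model = keras.Model(inputs=[frame_features, sentence_vectors], outputs=predictions)
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
```

If you need a single label for the whole 100-frame clip instead, you would put a pooling or recurrent layer between the concatenation and the final Dense layer.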
Suppose I have a fully-trained TensorFlow network that converts an input A into an output B, without any additional inputs. I want to iterate the network to generate a series of states A,B,C,D,... where each output is fed as an input to the network in order to generate the next state (like a Markov chain).
Is there a way to do this iteration entirely within TensorFlow (i.e., include the iteration within the graph) while recording a snapshot of the output at each timestep? This seems to me like it might involve an RNN, but I am unsure how to build an RNN when I already have a working network for the forward pass.
I'm trying to understand exactly how to implement a basic neural network in Python that will use genetic algorithms for unsupervised learning, and I have run into a small problem that the literature I've been able to pull up hasn't solved.
Let's say I have an input of 2 values that are passed to a 3-neuron hidden layer with all weights/biases applied. After I determine whether a neuron fired, what exactly do I send on? Do I send the output from my sigmoid, or do I send a full stop/start? In other words, is my output into hidden layer 2 going to be binary or non-binary?
Can anyone explain this, along with the reasoning behind why we choose one or the other?
This really depends on your network design, but there is no restriction that inputs have to be binary. In fact, binary inputs are not a case you will face often. For the output layer, the type of output can be determined easily and clearly, e.g. if you have a classifier that decides whether this answer is spam or not, then the output (of a single-neuron output layer) will be binary. If you have a neural network to recognise handwritten digits, then it's probably better to have a 10-neuron output layer, with each neuron giving the probability of the input image being one of the digits 0 to 9.
For other layers (hidden and input), the output can be anything; most of the time it won't be binary.
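To make the two options concrete, here is a small sketch of the 2-input, 3-neuron hidden layer from the question, computed once with a sigmoid (continuous values are passed on) and once with a hard threshold (binary values are passed on). The weights and biases are made-up numbers.

```python
import math

def sigmoid(z):
    """Smooth activation: any value strictly between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))

def step(z):
    """Hard threshold: the neuron either fires (1.0) or it does not (0.0)."""
    return 1.0 if z >= 0.0 else 0.0

def dense(inputs, weights, biases, activation):
    """One layer; weights[j] holds the incoming weights of neuron j."""
    return [activation(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]

# Hypothetical values for illustration only.
inputs = [0.5, -1.0]
hidden_w = [[0.1, 0.4], [-0.3, 0.2], [0.8, -0.5]]
hidden_b = [0.0, 0.1, -0.2]

continuous = dense(inputs, hidden_w, hidden_b, sigmoid)  # what hidden layer 2 usually receives
binary = dense(inputs, hidden_w, hidden_b, step)         # the "full stop/start" alternative
```

The continuous version keeps information about how strongly each neuron responded, which the next layer can use; the binary version throws that away.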
EDIT:
I think I misunderstood your question a bit, and you probably aren't talking about Fuzzy Neural Networks either.
So if you are not considering those (as in most cases), when you say a neuron has fired, you mean its output is 1 (binary high), and 0 otherwise, so yes, it's binary.
Do I send the output from my sigmoid or do I send a full stop/start
The way the sigmoid function is used in neural networks (with weights), it attempts to push the computed output towards a binary result, so basically both options mean the same. There is a difference, but NNs usually try to avoid the region where the sigmoid (or a related neuron) outputs a value that cannot be approximated nicely to 0 or 1. The weights on that neuron's inputs are moved so that the neuron gives a clear 0 or 1.
Also note that, while it is worth knowing sigmoid (and tanh), for practical purposes ReLU, Leaky ReLU, or maxout are better choices.
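For reference, the activations mentioned above can be written in a few lines of plain Python (maxout is left out because it needs several weight sets per neuron). The 0.01 slope for Leaky ReLU is a common default, used here as an assumption.

```python
import math

def sigmoid(z):
    """Squashes any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def tanh(z):
    """Squashes any real number into (-1, 1)."""
    return math.tanh(z)

def relu(z):
    """Passes positive values through unchanged, clips negatives to 0."""
    return max(0.0, z)

def leaky_relu(z, slope=0.01):
    """Like ReLU, but negatives keep a small slope instead of becoming 0."""
    return z if z > 0.0 else slope * z

samples = [-2.0, 0.0, 3.0]
table = {f.__name__: [f(z) for z in samples]
         for f in (sigmoid, tanh, relu, leaky_relu)}
```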
Suggested: http://cs231n.github.io/neural-networks-1/
You may also find the lectures (videos and notes) by Andrew Ng, Andrej Karpathy and others helpful.
Hi, I'm pretty new to Python and to NLP. I need to implement a perceptron classifier. I searched through some websites but didn't find enough information. For now I have a number of documents which I grouped according to category (sports, entertainment etc.). I also have a list of the most used words in these documents along with their frequencies. One website stated that I must have some sort of decision function accepting arguments x and w. x apparently is some sort of vector (I don't know what w is). But I don't know how to use the information I have to build the perceptron algorithm, or how to use it to classify my documents. Have you got any ideas? Thanks :)
What a perceptron looks like
From the outside, a perceptron is a function that takes n arguments (i.e an n-dimensional vector) and produces m outputs (i.e. an m-dimensional vector).
On the inside, a perceptron consists of layers of neurons, such that each neuron in a layer receives input from all neurons of the previous layer and uses that input to calculate a single output. The first layer consists of n neurons and it receives the input. The last layer consists of m neurons and holds the output after the perceptron has finished processing the input.
How the output is calculated from the input
Each connection from a neuron i to a neuron j has a weight w(i,j) (I'll explain later where they come from). The total input of a neuron p of the second layer is the sum of the weighted output of the neurons from the first layer. So
total_input(p) = Σ(output(k) * w(k,p))
where k runs over all neurons of the first layer. The activation of a neuron is calculated from the total input of the neuron by applying an activation function. An often used activation function is the Fermi function, so
activation(p) = 1/(1+exp(-total_input(p))).
The output of a neuron is calculated from the activation of the neuron by applying an output function. An often used output function is the identity f(x) = x (and indeed some authors see the output function as part of the activation function). I will just assume that
output(p) = activation(p)
When the output of all neurons of the second layer has been calculated, use that output to calculate the output of the third layer. Iterate until you reach the output layer.
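The calculation described above can be sketched in plain Python as follows. It assumes the logistic form 1/(1+exp(-x)) for the Fermi function and the identity as output function; the 2-3-1 network and its weights are made-up values.

```python
import math

def fermi(x):
    """Fermi (logistic) function used as the activation function."""
    return 1.0 / (1.0 + math.exp(-x))

def forward(inputs, layers):
    """Propagate `inputs` through the network, layer by layer.

    Each entry of `layers` is a weight matrix w, where w[k][p] is the weight
    of the connection from neuron k of one layer to neuron p of the next.
    """
    output = list(inputs)              # the first layer just holds the input
    for w in layers:
        n_next = len(w[0])
        # total_input(p) = sum over k of output(k) * w(k, p)
        total_input = [sum(output[k] * w[k][p] for k in range(len(output)))
                       for p in range(n_next)]
        # output(p) = activation(p), i.e. the output function is the identity
        output = [fermi(t) for t in total_input]
    return output

# Hypothetical 2-3-1 network.
w_hidden = [[0.5, -0.5, 0.0],
            [0.25, 0.25, 1.0]]
w_out = [[1.0], [-1.0], [0.5]]
result = forward([1.0, 2.0], [w_hidden, w_out])
```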
Where the weights come from
At first the weights are chosen randomly. Then you select some examples (from which you know the desired output). Feed each example to the perceptron and calculate the error, i.e. how far off from the desired output is the actual output. Use that error to update the weights. One of the fastest algorithms for calculating the new weights is Resilient Propagation.
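As an illustration of that loop (random weights, feed an example, measure the error, update), here is a sketch that trains a single neuron on the logical OR function. It uses the plain delta rule, which is simpler than Resilient Propagation and is swapped in only to keep the example short. The extra bias weight, the learning rate and the epoch count are assumptions.

```python
import math
import random

def fermi(x):
    return 1.0 / (1.0 + math.exp(-x))

def predict(w, x):
    """Output of one neuron; the last weight is a bias fed by a constant 1."""
    return fermi(sum(wi * xi for wi, xi in zip(w, list(x) + [1.0])))

def train_neuron(examples, n_inputs, epochs=2000, rate=0.5, seed=0):
    rng = random.Random(seed)
    # At first the weights are chosen randomly.
    w = [rng.uniform(-0.5, 0.5) for _ in range(n_inputs + 1)]
    for _ in range(epochs):
        for x, desired in examples:
            actual = predict(w, x)
            error = desired - actual   # how far off the actual output is
            # Delta rule: scale the error by the slope of the Fermi function.
            grad = error * actual * (1.0 - actual)
            w = [wi + rate * grad * xi
                 for wi, xi in zip(w, list(x) + [1.0])]
    return w

# Examples with known desired output: the logical OR function.
examples = [((0.0, 0.0), 0.0), ((0.0, 1.0), 1.0),
            ((1.0, 0.0), 1.0), ((1.0, 1.0), 1.0)]
weights = train_neuron(examples, n_inputs=2)
```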
How to construct a Perceptron
Some questions you need to address are
What are the relevant characteristics of the documents and how can they be encoded into an n-dimensional vector?
Which examples should be chosen to adjust the weights?
How shall the output be interpreted to classify a document? Example: A single output that yields the most likely class versus a vector that assigns probabilities to each class.
How many hidden layers are needed and how large should they be? I recommend starting with one hidden layer with n neurons.
The first and second points are very critical to the quality of the classifier. The perceptron might classify the examples correctly but fail on new documents. You will probably have to experiment. To determine the quality of the classifier, choose two sets of examples; one for training, one for validation. Unfortunately I cannot give you more detailed hints to answering these questions due to lack of practical experience.
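The training/validation idea could be sketched like this. The 20% validation share, the fixed seed and the toy examples are arbitrary choices for illustration, and classify stands for whatever trained classifier you end up with.

```python
import random

def split_examples(examples, validation_fraction=0.2, seed=0):
    """Shuffle the examples and split them into (training, validation)."""
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)
    n_val = int(len(shuffled) * validation_fraction)
    return shuffled[n_val:], shuffled[:n_val]

def accuracy(classify, examples):
    """Fraction of (input, label) examples that `classify` gets right."""
    if not examples:
        return 0.0
    correct = sum(1 for x, label in examples if classify(x) == label)
    return correct / len(examples)

# Hypothetical toy data: the "document" is a number, the label its parity.
examples = [(i, i % 2) for i in range(10)]
training, validation = split_examples(examples)
score = accuracy(lambda x: x % 2, validation)
```

Only the training set is used to adjust the weights; the validation score tells you how well the classifier handles documents it has not seen.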
I think that trying to solve an NLP problem with a Neural Network when you're not familiar with either might be a step too far. That you're doing it in a new language is the least of your worries.
I'll link you to the slides for the Neural Computation module that is taught at my university. You'll want the slides from session 1 and session 2 in week 2. Right at the bottom of the page is a link to how to implement a neural network in C. With a few modifications you should be able to port it to Python. You should note that it details how to implement a multilayer perceptron. You only need to implement a single-layer perceptron, so ignore anything that talks about hidden layers.
A quick explanation of x and w. Both x and w are vectors. x is the input vector. x contains normalised frequencies for each word you are concerned about. w contains weights for each word you are concerned with. The perceptron works by multiplying the input frequency for each word by its respective weight and summing them up. It passes the result to a function (typically a sigmoid function) that turns the result into a value between 0 and 1. 1 means the perceptron is positive that the inputs are an instance of the class it represents and 0 means it is sure that the inputs really aren't an example of its class.
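In code, x and w might look like this. The vocabulary, the hand-picked weights and the sample sentence are invented for illustration; in a real classifier w would come from training.

```python
import math

def frequency_vector(words, vocabulary):
    """x: the relative frequency of each vocabulary word in the document."""
    total = len(words)
    return [words.count(term) / total if total else 0.0
            for term in vocabulary]

def perceptron_output(x, w):
    """Multiply each frequency by its weight, sum, then squash with a sigmoid."""
    weighted_sum = sum(xi * wi for xi, wi in zip(x, w))
    return 1.0 / (1.0 + math.exp(-weighted_sum))

vocabulary = ["goal", "match", "film", "actor"]   # hypothetical word list
w_sports = [4.0, 3.0, -4.0, -3.0]                 # hypothetical "sports" weights

document = "the match ended with a late goal".split()
x = frequency_vector(document, vocabulary)
score = perceptron_output(x, w_sports)            # closer to 1 means "sports"
```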
With NLP you typically learn about the bag of words model first, before moving on to other, more complex, models. With a neural network, hopefully, it will learn its own model. The problem with this is that the neural network will not give you much of an understanding of NLP, other than that documents can be classified by the words they contain, and that usually the number and type of words in a document contain most of the information you need to classify a document -- context and grammar do not add much extra detail.
Anyway, I hope that gives a better place from which to start your project. If you're still stuck on a particular part then ask again and I'll do my best to help.
You should take a look at this survey paper on text classification by Fabrizio Sebastiani. It tells you all of the best ways to do text classification.
Now, I'm not going to bother you to read the whole thing, but there's one table near the end, where he compares how lots of different people's techniques stack up on lots of different test corpora. Find it, pick the best one (the best perceptron one, if your assignment is specifically to learn how to do this with a perceptron), and read the paper he cites that describes that method in detail.
You now know how to construct a good topical text classifier.
Turning the algorithm that Oswald gave you (and that you posted in your other question) into code is a Small Matter of Programming (TM). And if you encounter unfamiliar terms like TF-IDF while you're working, ask your teacher to help you by explaining those terms.
Multilayer perceptrons (a specific neural network architecture for general classification problems) are now available for Python from the GraphLab folks:
https://dato.com/products/create/docs/generated/graphlab.deeplearning.MultiLayerPerceptrons.html#graphlab.deeplearning.MultiLayerPerceptrons
I had a try at implementing something similar the other day. I wrote some code to tell English-looking text from non-English text. I hadn't done AI or statistics in many years, so it was a bit of a shotgun attempt.
My code is here (don't want to bloat the post): http://cnippit.com/content/perceptron-statistically-recognizing-english
Inputs:
I take a text file and split it up into tri-grams (e.g. "abcdef" => ["abc", "bcd", "cde", "def"]).
I calculate the relative frequencies of each, and feed that as the inputs to the perceptron (so there are 26^3 inputs)
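A sketch of that input step, not the code from the linked post: it assumes text is lower-cased and that everything except the letters a to z is dropped before the tri-grams are formed, which is one way to arrive at exactly 26^3 possible inputs.

```python
import string
from collections import Counter

def trigrams(text):
    """All overlapping three-character windows of `text`."""
    return [text[i:i + 3] for i in range(len(text) - 2)]

def trigram_frequencies(text):
    """Relative frequency of each tri-gram, using only the letters a-z."""
    letters = "".join(ch for ch in text.lower()
                      if ch in string.ascii_lowercase)
    grams = trigrams(letters)
    counts = Counter(grams)
    return {g: c / len(grams) for g, c in counts.items()}

def trigram_index(gram):
    """Position of a tri-gram in a dense input vector of 26**3 entries."""
    a, b, c = (ord(ch) - ord("a") for ch in gram)
    return (a * 26 + b) * 26 + c

def input_vector(text):
    """Dense perceptron input: one relative frequency per possible tri-gram."""
    vector = [0.0] * 26 ** 3
    for gram, freq in trigram_frequencies(text).items():
        vector[trigram_index(gram)] = freq
    return vector

x = input_vector("The quick brown fox")
```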
Despite me not really knowing what I was doing, it seems to work fairly well. The success depends quite heavily on the training data though. I was getting poor results until I trained it on more French/Spanish/German text etc.
It's a very small example though, with lots of "lucky guesses" at values (eg. initial weights, bias, threshold, etc.).
Multiple classes:
If you have multiple classes you want to distinguish between (i.e. not as simple as "is A or NOT-A"), then one approach is to use a perceptron for each class, e.g. one for sport, one for news, etc.
Train the sport-perceptron on data grouped as either sport or NOT-sport. Similar for news or Not-news, etc.
When classifying new data, you pass your input to all perceptrons, and whichever one returns true (or "fires") gives the class the data belongs to.
I used this approach way back in university, where we used a set of perceptrons for recognizing handwritten characters. It's simple and worked pretty effectively (>98% accuracy if I recall correctly).
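A compact sketch of the one-perceptron-per-class idea, using the classic perceptron learning rule with a hard threshold. The two-dimensional toy inputs, the epoch count and the learning rate are made up for illustration.

```python
def weighted_sum(w, bias, x):
    return sum(wi * xi for wi, xi in zip(w, x)) + bias

def train_perceptron(examples, n_inputs, epochs=20, rate=1.0):
    """Classic perceptron rule; examples are (x, is_member), is_member 0 or 1."""
    w, bias = [0.0] * n_inputs, 0.0
    for _ in range(epochs):
        for x, is_member in examples:
            fired = 1 if weighted_sum(w, bias, x) > 0 else 0
            delta = is_member - fired          # 0 when the answer was right
            w = [wi + rate * delta * xi for wi, xi in zip(w, x)]
            bias += rate * delta
    return w, bias

def train_one_vs_rest(labelled, n_inputs):
    """One perceptron per class, each trained on "class" vs "NOT-class"."""
    classes = sorted({label for _, label in labelled})
    return {c: train_perceptron([(x, 1 if label == c else 0)
                                 for x, label in labelled], n_inputs)
            for c in classes}

def classify(models, x):
    """Names of all classes whose perceptron fires for input x."""
    return [c for c, (w, bias) in models.items()
            if weighted_sum(w, bias, x) > 0]

# Hypothetical toy inputs: [sport-word frequency, news-word frequency].
labelled = [([1.0, 0.0], "sport"), ([0.9, 0.1], "sport"),
            ([0.0, 1.0], "news"), ([0.1, 0.9], "news")]
models = train_one_vs_rest(labelled, n_inputs=2)
```

classify returns a list because, with independently trained perceptrons, none or more than one may fire for a given input; a common tie-break is to pick the class with the largest weighted sum.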