PyBrain neuron manipulation - python

Is there a good way to add/remove a neuron and its associated connections into/from a fully connected PyBrain network? Say I start with:
from pybrain.tools.shortcuts import buildNetwork
net = buildNetwork(2,3,1)
How would I go about making it a (2,4,1) or a (2,2,1) network WHILE maintaining all the old weights (and initializing any new ones to be random as is done when initializing the network)? The reason I want to do this is because I am attempting to use an evolutionary learning strategy to determine the best architecture and the 'mutation' step involves adding/removing nodes with some probability. (The input and output modules should always remain the same.)
edit: I found NeuronDecomposableNetwork which should make this easier, but it still seems that I have to keep track of neurons and connections separately.

I assume you're doing something along the lines of the NEAT algorithm?
There are two different answers to your question:
Open-ended evolution of the network topology: in this case, I recommend encapsulating every neuron in its own "layer"/module, and adding/removing them and their connections to/from the network iteratively, a bit like in this tutorial, except that there will be many more (single-neuron) layers. Don't forget to call the sortModules() method after each topological change.
Finding the best topology within a predefined framework (say a maximum of 1000 neurons). In that case it's easier and more efficient to build the full network in the beginning, and just mask some of the connections (e.g. using the MaskedParameters module). Among others, memetic algorithms (used like this) are designed to search such topology spaces.
An alternative, as you say, is manually managing all the weights (by tracking what is where, or using NeuronDecomposableNetwork) but I don't recommend that.
A general comment: for more advanced uses of PyBrain such as yours, the buildNetwork shortcut is too limited, and you will want to use the Network/Module/Connection API directly.
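The masking idea from the second option can be sketched without any library. This is a minimal illustration in plain Python, not PyBrain code: the class MaskedMLP and its methods are hypothetical names, and it only assumes one sigmoid hidden layer with a linear output. All weights for the maximum number of hidden neurons exist from the start, and a mutation merely flips a mask entry, so old weights are never lost.

```python
import math
import random


class MaskedMLP:
    """Tiny one-hidden-layer network whose hidden neurons can be switched on/off.

    Hypothetical helper for illustration only (not part of PyBrain).
    Weights for all `max_hidden` neurons are created up front; a boolean
    mask decides which neurons take part in the forward pass.
    """

    def __init__(self, n_in, max_hidden, n_out, n_active, seed=None):
        rng = random.Random(seed)
        # w_in[h][i]: input i -> hidden h; w_out[o][h]: hidden h -> output o
        self.w_in = [[rng.gauss(0, 1) for _ in range(n_in)] for _ in range(max_hidden)]
        self.w_out = [[rng.gauss(0, 1) for _ in range(max_hidden)] for _ in range(n_out)]
        self.mask = [h < n_active for h in range(max_hidden)]

    def activate(self, x):
        hidden = []
        for h, row in enumerate(self.w_in):
            if self.mask[h]:
                s = sum(w * xi for w, xi in zip(row, x))
                hidden.append(1.0 / (1.0 + math.exp(-s)))
            else:
                hidden.append(0.0)  # a masked neuron contributes nothing
        return [sum(w * h for w, h in zip(row, hidden)) for row in self.w_out]

    def add_neuron(self):
        """Enable the first masked neuron; return its index, or None if full."""
        for h, active in enumerate(self.mask):
            if not active:
                self.mask[h] = True
                return h
        return None

    def remove_neuron(self, h):
        """Mask neuron h; its weights are kept in case it is re-enabled."""
        self.mask[h] = False


net = MaskedMLP(2, 8, 1, n_active=3, seed=0)  # behaves like a (2, 3, 1) network
before = net.activate([0.5, -1.0])
new = net.add_neuron()                         # now behaves like (2, 4, 1)
net.remove_neuron(new)                         # back to (2, 3, 1)
after = net.activate([0.5, -1.0])              # identical to `before`
```

The trade-off is the one named in the answer: the maximum size has to be fixed in advance, but a mutation is a single mask flip rather than a rebuild of the network.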

Related

Shared weights with model parallelism in PyTorch

Our setup involves the initial part of the network (the input interfaces), which runs on separate GPU cards. Each GPU gets its own portion of the data (model parallelism) and processes it separately.
Each input interface is, in turn, itself a complex nn.Module. Every input interface can occupy one or several cards (say, interface_1 runs on GPUs 0 and 1, interface_2 on GPUs 2 and 3, and so on).
We need to keep the weights of these input interfaces identical throughout training. We also need them to run in parallel to save training time, which is already weeks for our scenario.
The best idea we could think of was to initialize the interfaces with the same weights and then average their gradients. As the interfaces are identical, updating the same weights with the same gradients should keep them the same throughout the training process, thus achieving the desired “shared weights” mode.
However, I cannot find any good way to change the values of these weights and their gradients, which are represented as Parameter objects in PyTorch. Apparently, PyTorch does not allow this.
Our current state is: if we copy.deepcopy the ‘parameter.data’ of the “master” interface and assign it to the ‘parameter.data’ of the “slave” interface, the values are indeed changed, but .to(device_id) does not work and keeps them on the “master” device. However, we need them to move to the “slave” device.
Could someone please tell me if it is possible at all or, if not, if there is a better way to implement shared weights along with the parallel execution for our scenario?
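The gradient-averaging idea described in the question can be illustrated with plain numbers. This is only a sketch of the arithmetic, not PyTorch code: Python lists stand in for parameters, averaged_sgd_step is a hypothetical helper, and plain SGD is assumed as the update rule. It shows why replicas that start identical stay identical when every replica applies the same averaged gradient.

```python
def averaged_sgd_step(replicas, grads, lr=0.1):
    """Apply one SGD step to every replica using the mean gradient.

    replicas: list of weight lists (one per device), identical on entry
    grads:    list of gradient lists, one per replica, from its own data
    """
    n = len(replicas)
    mean_grad = [sum(g[i] for g in grads) / n for i in range(len(grads[0]))]
    for weights in replicas:
        for i, g in enumerate(mean_grad):
            weights[i] -= lr * g  # same update everywhere
    return mean_grad


# Two copies of the same "interface", starting from identical weights.
master = [0.5, -0.2, 1.0]
replica = list(master)

# Each copy sees different data, so the per-device gradients differ.
grads = [[0.1, 0.0, -0.3], [0.3, -0.2, 0.1]]
averaged_sgd_step([master, replica], grads)
# master and replica now hold the same updated values
```

Note that this only holds if every replica also uses the same optimizer state and learning rate; otherwise the copies drift apart even with identical gradients.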

Neural network NOT organized in layers with TensorFlow or Keras

I need to implement a neural network which is NOT layer based, meaning that ANY neuron may be connected to any other neuron, and that there's no way to logically organize them in consecutive layers.
What I'm asking for is an example or a reference to proper and clear documentation about how to implement the following:
Originally I had my own implementation in MATLAB. However, I've been using TensorFlow and Keras to test simple models; they let you tune your networks very quickly and the implementations are pretty efficient, so I decided to try out more complex models. I got stuck trying to create this type of network.
HINT: It MAY be OK to create single-neuron layers, as long as you can connect a layer to ANY layer (without caring if it is not adjacent) and to MORE THAN ONE LAYER.
I'm new to TF and Keras, so a simple Python example would be appreciated, although pointing me in the right direction would be OK too.
This is an example network (loops are intentional!):
I don't need to train at the moment, just to evaluate models. However, keep in mind that evaluation of this kind of network is different too. One possible way is to keep propagating the signal until the output stabilizes, but that is just an example.
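The "keep propagating until the output stabilizes" evaluation mentioned above can be sketched in plain Python. This is an illustration under assumptions not stated in the question: a tanh activation, synchronous updates, and a dense weight table in which 0.0 means "no connection". The function name settle and the example weights are hypothetical.

```python
import math


def settle(weights, external, max_steps=1000, tol=1e-9):
    """Iterate a network with arbitrary (possibly cyclic) connections.

    weights[i][j] is the weight of the connection from neuron j to neuron i
    (0.0 means "not connected"); external[i] is a constant input to neuron i.
    Returns (state, steps, converged).
    """
    n = len(weights)
    state = [0.0] * n
    for step in range(1, max_steps + 1):
        new_state = [
            math.tanh(external[i] + sum(weights[i][j] * state[j] for j in range(n)))
            for i in range(n)
        ]
        change = max(abs(a - b) for a, b in zip(new_state, state))
        state = new_state
        if change < tol:
            return state, step, True
    return state, max_steps, False


# Three neurons with a loop: 0 -> 1 -> 2 -> 0, plus a skip connection 0 -> 2.
W = [
    [0.0, 0.0, 0.4],
    [0.5, 0.0, 0.0],
    [0.3, 0.6, 0.0],
]
state, steps, converged = settle(W, external=[1.0, 0.0, 0.0])
```

Stabilization is not guaranteed in general. With larger weights a cyclic network can oscillate, which is why the sketch caps the number of steps and reports whether it converged.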

R-CNN: looking for REPO where FC for classification is retrainable

I'm studying different object detection algorithms for my interest.
My main reference is Andrej Karpathy's slides on object detection (slides here).
I would like to start from some reference implementation, in particular something that lets me directly test some of the networks mentioned on my data (mainly footage from onboard cameras in car and bike races).
I have already used a pretrained network (a repo forked from JunshengFu's, where I slightly adapted YOLO to my use case), but unfortunately the classification accuracy is rather poor, I guess because there were not many training instances of racing cars such as Formula 1 cars.
For this reason I would like to retrain the networks and here is where I'm finding the most issues:
properly training some of the networks requires either hardware (powerful GPUs) or time I don't have so I was wondering whether I could retrain just some part of the network, in particular the classification network and if there is any repo already allowing that.
Thank you in advance
That is called fine-tuning of the network, or transfer learning. Basically you can do that with any network you find (provided the problem domains are similar, of course), and then, depending on the amount of data you have, you either fine-tune the whole network or freeze some layers and train only the last ones. In your case you would probably need to freeze the whole network except the last fully-connected layers, which perform the classification (and which you will actually replace with new ones matching your number of classes). I don't know what library you use, but TensorFlow has an official tutorial on transfer learning. However, it's not very clear, to be honest.
You can find a more user-friendly tutorial by an enthusiast here: tutorial. There is a code repository as well. One correction you need, though: the author fine-tunes the whole network, whereas if you want to freeze some layers you need to get the list of trainable variables, remove those you want to freeze, and pass the resulting list to the optimizer (so it ignores the removed variables), like the following:
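import tensorflow as tf  # TensorFlow 1.x API
slim = tf.contrib.slim
# All trainable variables under the model's scope
all_vars = tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES, scope='InceptionResnetV2')
to_train = all_vars[-6:]  # better to select them by name explicitly, but this will still work
optimizer = tf.train.AdamOptimizer(learning_rate=0.0001)
# total_loss is the loss tensor defined elsewhere in your training script
train_op = slim.learning.create_train_op(total_loss, optimizer, variables_to_train=to_train)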
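Selecting the variables by name rather than by position, as the comment in the snippet recommends, could look roughly like this. It is a sketch: select_by_scope is a hypothetical helper that works on any objects with a .name attribute (TensorFlow variables have one), and the scope names in the example are assumptions about how the model's final layers are named, so check them against your own variable list.

```python
from collections import namedtuple


def select_by_scope(variables, scopes):
    """Return the variables whose `.name` lies under one of the given scopes."""
    prefixes = tuple(scope.rstrip("/") + "/" for scope in scopes)
    return [v for v in variables if v.name.startswith(prefixes)]


# Stand-ins for TensorFlow variables; the names are assumed, not verified.
Var = namedtuple("Var", "name")
all_vars = [
    Var("InceptionResnetV2/Conv2d_1a_3x3/weights:0"),
    Var("InceptionResnetV2/Logits/Logits/weights:0"),
    Var("InceptionResnetV2/Logits/Logits/biases:0"),
]
to_train = select_by_scope(all_vars, ["InceptionResnetV2/Logits"])
```

Compared with slicing the last six entries, this keeps working if the order or number of variables in the collection changes.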
Further, TensorFlow has a so-called model zoo (a collection of trained models you can use for your own purposes and for transfer learning). You can find it here.

Is it possible to merge multiple TensorFlow graphs into one?

I have two models trained with TensorFlow in Python and exported to binary files named export1.meta and export2.meta. Each model produces a single output when fed an input, say output1 and output2.
My question is whether it is possible to merge the two graphs into one big graph that produces output1 and output2 together in a single execution.
Any comment will be helpful. Thanks in advance!
I kicked this around with my local TF expert, and the brief answer is "no"; TF doesn't have a built-in facility for this. However, you could write custom endpoint layers (input and output) with synch operations from Python's process management, so that they'd maintain parallel processing of each input, and concatenate the outputs.
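The "run both and concatenate the outputs" idea can be sketched as follows. This is a simplification, not TensorFlow code: it uses a thread pool instead of the process management mentioned above, and model1 and model2 are hypothetical stand-ins for callables that would each wrap one of the exported models.

```python
from concurrent.futures import ThreadPoolExecutor


def run_together(models, x):
    """Call every model on the same input in parallel and concatenate the outputs."""
    with ThreadPoolExecutor(max_workers=len(models)) as pool:
        futures = [pool.submit(model, x) for model in models]
        outputs = []
        for future in futures:  # submission order, so the output order is stable
            outputs.extend(future.result())
    return outputs


# Hypothetical stand-ins for the two exported models.
def model1(x):
    return [sum(x)]


def model2(x):
    return [max(x)]


combined = run_together([model1, model2], [1.0, 2.0, 3.0])
```

From the caller's side this behaves like one model with two outputs, even though the two graphs remain separate underneath.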
Rationale
I like the way this could be used to get greater accuracy with multiple features, where the features have little or no correlation. For instance, you could train two character recognition models: one to identify the digit, the other to discriminate between left- and right-handed writers.
This would also allow you to examine the internal kernels that evolved for each individual feature, without interdependence with other features: the double-loop of an '8' vs the general slant of right-handed writing.
I also expect that the models for individual features will converge measurably faster than one over-arching training session.
Finally, it's quite possible that the individual models could be used in mix-and-match feature sets. For instance, train another model to differentiate letters, while your previously-trained left/right flagger would still have a pretty good guess at the writer's moiety.

Can a neural network recognize a screen and replicate a finite set of actions?

I learned that neural networks can replicate any function.
Normally the neural network is fed with a set of descriptors to its input neurons and then gives out a certain score at its output neuron. I want my neural network to recognize certain behaviours from a screen. Objects on the screen are already preprocessed and clearly visible, so recognition should not be a problem.
Is it possible to use the neural network to recognize a pixelated picture of the screen and make decisions on that basis? The amount of training data would be huge, of course. Is there a way to teach the ANN by online supervised learning?
Edit:
Because a commenter said the programming problem would be too general:
I would like to implement this in Python first, to see if it works. If anyone could point me to a resource showing how to do this kind of online learning in Python, I would be grateful.
I would suggest
http://www.neuroforge.co.uk/index.php/getting-started-with-python-a-opencv
http://docs.opencv.org/doc/tutorials/ml/table_of_content_ml/table_of_content_ml.html
http://blog.damiles.com/2008/11/the-basic-patter-recognition-and-classification-with-opencv/
https://github.com/bytefish/machinelearning-opencv
OpenCV is basically an image processing library, but it also has some amazing helper classes that you can use for almost any task. Its machine learning module is pretty easy to use, and you can go through the source to see the explanation and background theory for each function.
You could also use a pure python machine learning library like:
http://scikit-learn.org/stable/
But before you feed the data from your screen (I'm assuming that's in pixels?) to your ANN, SVM, or whatever ML algorithm you choose, you need to perform "Feature Extraction" on your data (that is, on the objects on the screen).
Feature extraction can be thought of as representing the same data on the screen with fewer numbers, so that there are fewer inputs to give to the ANN. You need to experiment with different features before you find a combination that works well for your particular scenario. A sample one could look something like this:
[x1,y1,x2,y2...,col]
This is basically a list of edge points that delimit the area your object occupies, a sort of ROI (Region of Interest). Within it you can perform edge detection and color detection, and extract any other relevant characteristics. The important thing is that all your objects, with their shape and color information, are now represented by a number of these lists, one for each object detected.
This is the data that can be provided as input to the neural network. But of course you'll have to define some meaningful output parameters, depending on your specific problem statement, before you can train and test your system.
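Building one such list per object could look roughly like this. It is a minimal sketch in plain Python: object_features is a hypothetical helper, and it assumes the corner points and a single color code have already been obtained from whatever detection step you use.

```python
def object_features(corners, color):
    """Flatten an object's outline and color into one numeric list.

    corners: list of (x, y) points delimiting the object (its region of interest)
    color:   a single number encoding the object's color
    Returns [x1, y1, x2, y2, ..., color].
    """
    features = []
    for x, y in corners:
        features.extend([x, y])
    features.append(color)
    return features


# A rectangular object given by its four corners, with color code 3.
box = [(10, 20), (110, 20), (110, 70), (10, 70)]
vec = object_features(box, color=3)
```

Since a network usually expects a fixed input length, it helps to use the same number of points for every object.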
Hope this helps.
This is not entirely correct.
A 3-layer feedforward MLP can theoretically approximate any CONTINUOUS function (on a bounded domain) to any desired accuracy.
If there are discontinuities, then you need a 4th layer.
Since you are dealing with pixelated screens and such, you would probably need to consider a fourth layer.
Finally, if you are looking at circular shapes, etc., then a radial basis function (RBF) network may be more suitable.
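To see why an RBF network suits circular shapes, here is a minimal sketch of a single Gaussian RBF unit in plain Python. The function name and the width parameter are illustrative choices. The unit's response depends only on the distance from its center, so all points on a circle around the center produce the same activation.

```python
import math


def rbf_unit(x, center, width):
    """Gaussian radial basis unit: 1.0 at the center, falling off with distance."""
    dist_sq = sum((xi - ci) ** 2 for xi, ci in zip(x, center))
    return math.exp(-dist_sq / (2.0 * width ** 2))


center = (0.0, 0.0)
at_center = rbf_unit((0.0, 0.0), center, width=1.0)
on_circle_a = rbf_unit((1.0, 0.0), center, width=1.0)
on_circle_b = rbf_unit((0.0, 1.0), center, width=1.0)  # same radius, same response
```

A sigmoid unit, by contrast, splits the input space along a straight line, so an MLP needs several hidden units to enclose a round region that one RBF unit covers directly.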
