Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
We don’t allow questions seeking recommendations for books, tools, software libraries, and more. You can edit the question so it can be answered with facts and citations.
Closed 5 years ago.
Improve this question
I have recently been looking into using TensorFlow for creating a custom CNN, and have been attempting to use the tutorials for insight on the most straightforward way to design, train, and deploy an image classification network.
The two approaches that have stood out to me are:
TF Layers API: This API seems to provide the most straightforward and intuitive way of defining a network, layer by layer. That said, the way that they train and evaluate the model uses the tf.learn.Estimator class, which seems a bit limiting in that the network is strictly trained using the Estimator's fit() method and validated using the evaluate() method. This tutorial does not even use a tf.Session.
Low-level API: Defining a network seems a bit more tedious. Also, training and deploying is done in a very manual fashion, but it appears to offer more control.
For a TensorFlow novice looking to implement and train relatively basic CNN's who is looking for the ability to tinker with network architecture and basic hyperparameter tuning, what would be the best API to get familiar with?
Also, if there are any useful tutorials or examples using your preferred interface, links would be greatly appreciated.
Keras is a nice frontend for Tensorflow. It sounds like it should fit your needs. Here is an example of someone training a CNN with Keras.
I like what is described on this page: write your stuff "by hand", but give the model a transparent class interface to the outside. TF graph stuff is handled as properties and set up at construction, and then the model can be used without having to know TF. A bit like Keras, but giving you full control (and forcing you to learn the low level). It lacks Keras' composability, though.
Basically, it recommends you do something along the lines of this:
class Model:
#lazy_property
def prediction(self):
...
#lazy_property
def optimize(self):
# actual TF stuff here, e.g.:
cross_entropy = -tf.reduce_sum(self.target, tf.log(self.prediction))
optimizer = tf.train.RMSPropOptimizer(0.03)
return optimizer.minimize(cross_entropy)
#lazy_property
def error(self):
...
If it interests you, I tried to package up that approach with a common base class and decorators here. I tried to stick to the Keras API, and sessions are explicitly handled with withs. The code, however, has not actually been used in training -- I just wrote it after getting fed up with repeating everything in serveral university projects, and wanting to produce something cleaner.
Related
This tutorial describes how to build a TFF computation from keras model.
This tutorial describes how to build a custom TFF computation from scratch, possibly with a custom federated learning algorithm.
What I need is a combination of these: I want to build a custom federated learning algorithm, and I want to use an existing keras model. Q. How can it be done?
The second tutorial requires MODEL_TYPE which is based on MODEL_SPEC, but I don't know how to get it. I can see some variables in model.trainable_variables (where model = tff.learning.from_keras_model(keras_model, ...), but I doubt it's what I need.
Of course, I can implement the model by hand (as in the second tutorial), but I want to avoid it.
I think you have the correct pointers for writing a custom federated computation, as well as converting a Keras model to a tff.learning.Model. So we'll focus on pulling a TFF type signature from an existing tff.learning.Model.
Once you have your hands on such a model, you should be able to use tff.learning.framework.weights_type_from_model to pull out the appropriate TFF type to use for your custom algorithm.
There is an interesting caveat here: how precisely you use a tff.learning.Model in your custom algorithm is pretty much up to you, and this could affect your desired model weights type. This is unlikely to be the case (likely you will simply be assigning values from incoming tensors to the model variables), so I think we should prefer to avoid going deeper into this caveat.
Finally, a few pointers of end-to-end custom algorithm implementations in TFF:
One of the simplest complete examples TFF has is simple_fedavg, which is totally self-contained and contains instructions for running.
The code for a paper on Adaptive Federated Optimization contains a handwritten implementation of learning rate decay on the clients in TFF.
A similar implementation of adaptive learning rate decay (think Keras' functions to decay learning rate on plateaus) is right next door to the code for AFO.
Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
We don’t allow questions seeking recommendations for books, tools, software libraries, and more. You can edit the question so it can be answered with facts and citations.
Closed 2 years ago.
Improve this question
I have a BERT multilanguage model from Google. And I have a lot of text data in my language (Korean). I want BERT to make better vectors for texts in this language. So I want to additionally train BERT on that text corpus I have. Like if I would have w2v model trained on some data and would want to continue training it. Is it possible with BERT?
There are a lot of examples of "fine-tuning" BERT on some specific tasks like even the original one from Google where you can train BERT further on your data. But as far as I understand it (I might be wrong) we do it within our task-specified model (for classification task for example). So... we do it at the same time as training our classifier (??)
What I want is to train BERT further separately and then get fixed vectors for my data. Not to build it into some task-specified model. But just get vector representation for my data (using get_features function) like they do in here. I just need to train the BERT model additionally on more data of the specific language.
Would be endlessly grateful for any suggestions/links on how to train BURT model further (preferably Tensorflow). Thank you.
Package transformers provides code for using and fine-tuning of most currently popular pre-trained Transformers including BERT, XLNet, GPT-2, ... You can easily load the model and continue training.
You can get the multilingual BERT model:
tokenizer = BertTokenizer.from_pretrained('bert-base-multiligual-cased')
model = TFBertForSequenceClassification.from_pretrained('bert-base-multiligual-cased')
The tokenizer is used both for tokenizing the input and for converting the sub-words into embedding ids. Calling the model on the subword indices will give you hidden states of the model.
Unfortunately, the package does not implement the training procedure, i.e., the masked language model and the next sentence prediction. You will need to write it yourself, but the training procedure well described in the paper and the implementation will be straightforward.
I am starting to learn Tensorflow2.0 and one major source of my confusion is when to use the keras-like model.compile vs tf.GradientTape to train a model.
On the Tensorflow2.0 tutorial for MNIST classification they train two similar models. One with model.compile and the other with tf.GradientTape.
Apologies if this is trivial, but when do you use one over the other?
This is really a case-specific thing and it's difficult to give a definite answer here (it might border on "too opinion-based). But in general, I would say
The "classic" Keras interface (using compile, fitetc.) allows for quick and easy building, training & evaluation of standard models. However, it is very high-level/abstract and as such doesn't give you much low-level control. If you are implementing models with non-trivial control flow, this can be hard to accommodate.
GradientTape gives you full low-level control over all aspects of training/running your model, allowing easier debugging as well as more complex architectures etc., but you will need to write more boilerplate code for many things that a compiled model will hide from you (e.g. training loops). Still, if you do research in deep learning you will probably be working on this level most of the time.
Closed. This question is opinion-based. It is not currently accepting answers.
Want to improve this question? Update the question so it can be answered with facts and citations by editing this post.
Closed 4 years ago.
Improve this question
Are these libraries fairly interchangeable?
Looking here, https://stackshare.io/stackups/keras-vs-pytorch-vs-scikit-learn, it seems the major difference is the underlying framework (at least for PyTorch).
Yes, there is a major difference.
SciKit Learn is a general machine learning library, built on top of NumPy. It features a lot of machine learning algorithms such as support vector machines, random forests, as well as a lot of utilities for general pre- and postprocessing of data. It is not a neural network framework.
PyTorch is a deep learning framework, consisting of
A vectorized math library similar to NumPy, but with GPU support and a lot of neural network related operations (such as softmax or various kinds of activations)
Autograd - an algorithm which can automatically calculate gradients of your functions, defined in terms of the basic operations
Gradient-based optimization routines for large scale optimization, dedicated to neural network optimization
Neural-network related utility functions
Keras is a higher-level deep learning framework, which abstracts many details away, making code simpler and more concise than in PyTorch or TensorFlow, at the cost of limited hackability. It abstracts away the computation backend, which can be TensorFlow, Theano or CNTK. It does not support a PyTorch backend, but that's not something unfathomable - you can consider it a simplified and streamlined subset of the above.
In short, if you are going with "classic", non-neural algorithms, neither PyTorch nor Keras will be useful for you. If you're doing deep learning, scikit-learn may still be useful for its utility part; aside from it you will need the actual deep learning framework, where you can choose between Keras and PyTorch but you're unlikely to use both at the same time. This is very subjective, but in my view, if you're working on a novel algorithm, you're more likely to go with PyTorch (or TensorFlow or some other lower-level framework) for flexibility. If you're adapting a known and tested algorithm to a new problem setting, you may want to go with Keras for its greater simplicity and lower entry level.
Well I start learning Tensorflow but I notice there's so much confusion about how to use this thing..
First, some tutorials present models using low level API tf.varibles, scopes...etc, but other tutorials use Keras instead and for example to use tensor board to invoke callbacks.
Second, what's the purpose of having ton of duplicate API, really what's the purpose behind using high level API like Keras when you have low level to build model like Lego blocks?
Finally, what's the true purpose of using eager execution?
You can use these APIs all together. E.g. if you have a regular dense network, but with an special layer you can use higher level API for dense layers (tf.layers and tf.keras) and low level API for your special layer. Furthermore, it is complex graphs are easier to define in low level APIs, e.g. if you want to share variables, etc.
Eager execution helps you for fast debugging, it evaluates tensors directly without a need of invoking a session.
There are different "levels" of APIs (high-level APIs such as keras and estimators, and low level APIs such as Variables, etc) to suit different developer needs.
For the average industry developer, who already knows approximately what ML model you intend to use, keras is a good fit. For example, if you know you want to implement a sequential model with two dense layers with softmax activation, you need only do something like:
model = keras.Sequential([
keras.layers.Dense(128, activation=tf.nn.softmax),
keras.layers.Dense(10, activation=tf.nn.softmax)
])
Using keras is generally simpler as you don't have to think about low-level implementation details such as tf.Variables. For more complete examples, check out the keras tutorials on tensorflow.org.
The low-level API allows users finer control over the models you're developing. These APIs are more commonly used by developers and researchers developing novel ML methods; for example, if you need a specialized layer that does something different from canonical ML methods, you can manually define a layer using low level APIs.
Finally, eager execution is an imperative programming style. It enables faster debugging, and has a gentler learning curve for those new to tensorflow, since it is more "pythonic"/intuitive. Check out the eager guide for more.