Are the two lines below basically the same thing?
tf.keras.layers.experimental.preprocessing.Normalization()
tf.keras.layers.Normalization()
I am trying to normalize (standardize, in this case) the inputs for fitting a neural network model using TensorFlow. After googling, I found the two choices above. They seem to be the same thing, but I'm not sure. If they aren't, could anyone tell me the exact difference?
They are technically the same thing, but you should use this one:
tf.keras.layers.Normalization()
The experimental path below is the older name for the same layer and is not available anymore in recent TensorFlow versions:
tf.keras.layers.experimental.preprocessing.Normalization()
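Whichever import path your TensorFlow version offers, the layer standardizes each feature with a mean and variance estimated from the data you pass to its adapt() method. Here is a minimal NumPy sketch of that computation, assuming per-feature statistics; the helper name standardize and the eps guard are made up for illustration and are not part of Keras:

```python
import numpy as np

def standardize(data, eps=1e-7):
    """Hypothetical helper: per-feature (x - mean) / std, mimicking the layer's output."""
    mean = data.mean(axis=0)          # one mean per feature column
    var = data.var(axis=0)            # one variance per feature column
    # Guard against division by zero for constant features (eps is an assumption).
    return (data - mean) / np.maximum(np.sqrt(var), eps)

features = np.array([[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]])
scaled = standardize(features)
```

After this step every column of scaled has zero mean and unit variance, which is what "standardize" means in the question.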
I am working on comparing different explanation techniques for black-box prediction problems.
I read the paper about a rule-based explanation technique called LORE (https://arxiv.org/abs/1805.10820) and found the official repository (https://github.com/riccotti/LORE). They apply it to a classification problem, but I need to use it on a regression problem with a neural network I have created.
Since there is no documentation and the comments in the code are sparse, I am having difficulty understanding how to adapt the code to my case. Has anyone had the same problem and, if so, how did you solve it?
I have created a prediction model that uses an RNN from the TensorFlow library in Python. Here is the complete code I have written and tried:
Jupyter Notebook of the code
But I have doubts.
1) Is an RNN the right choice for what I am trying to predict?
2) Is there a better algorithm I can try?
3) Can anyone suggest how I can give multiple inputs and get the necessary output using a TensorFlow model? Can anyone guide me, please?
I hope my points are clear. Please tell me if anything else is required.
Having doubts is normal, but you should try to quantify them before asking for advice. If you don't have a clear idea of what you want to improve, it's unlikely you will end up with something better.
1) Is an RNN the right choice for what I am trying to predict?
Yes. An RNN is used appropriately here. If you don't care much about supporting arbitrary-length input sequences, you can also force them to a fixed size and apply convolutions on top (see convolutional neural networks), or even try a simpler DNN.
The more important question to ask yourself is if you have the right inputs and if you have sufficient training data to learn what you hope to learn.
2) Is there a better algorithm I can try?
Probably not. As I said, an RNN seems appropriate for this problem. Do try some hyperparameter tuning to make sure you haven't accidentally picked a sub-optimal configuration.
3) Can anyone suggest how I can give multiple inputs and get the necessary output using a TensorFlow model? Can anyone guide me, please?
The common way to handle variable-length inputs is to set a max length and pad the shorter examples until they reach it. The max length can be a value you pick, or you can set it dynamically to the longest length in the batch. Padding is needed only because the internal operations are done in batches. You can then pick which output to use. Picking the last one is reasonable (the model just has to learn to propagate its state through the padding values). Another reasonable choice is the first output you get after feeding the last meaningful value into the RNN.
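As a rough illustration of the padding idea, here is a NumPy sketch that pads a batch to its longest sequence and then picks, per example, the output at the last meaningful step. The helper names pad_batch and last_meaningful are hypothetical, and the running sum is only a stand-in for real per-step RNN outputs:

```python
import numpy as np

def pad_batch(sequences, pad_value=0.0):
    """Pad 1-D sequences to the longest length in the batch; also return true lengths."""
    lengths = np.array([len(s) for s in sequences])
    max_len = lengths.max()
    batch = np.full((len(sequences), max_len), pad_value, dtype=np.float32)
    for row, seq in enumerate(sequences):
        batch[row, :len(seq)] = seq
    return batch, lengths

def last_meaningful(outputs, lengths):
    """Pick, per example, the output at the last non-padded time step."""
    rows = np.arange(outputs.shape[0])
    return outputs[rows, lengths - 1]

batch, lengths = pad_batch([[1.0, 2.0, 3.0], [4.0], [5.0, 6.0]])
# Stand-in for per-step RNN outputs: a running sum along the time axis.
outputs = np.cumsum(batch, axis=1)
picked = last_meaningful(outputs, lengths)
```

Keeping the true lengths next to the padded batch is what makes the second option possible: without them you can only take the final column.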
Looking at your code, there's one thing I would improve:
Instead of computing a loss on the last value only, I would compute it over all values in the series. This gives your model more training data with very little performance degradation.
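A sketch of what a loss over all values could look like, assuming padded batches with known true lengths and a squared-error loss (the function name masked_mse and the toy numbers are illustrative, not taken from the notebook):

```python
import numpy as np

def masked_mse(predictions, targets, lengths):
    """Mean squared error over every non-padded time step."""
    steps = np.arange(predictions.shape[1])
    mask = steps[None, :] < lengths[:, None]   # True where the step is real data
    squared_error = (predictions - targets) ** 2
    # Padded positions contribute nothing to the sum or to the count.
    return (squared_error * mask).sum() / mask.sum()

predictions = np.array([[1.0, 2.0, 3.0], [1.0, 0.0, 0.0]])
targets = np.array([[1.0, 2.0, 5.0], [3.0, 0.0, 0.0]])
lengths = np.array([3, 1])
loss = masked_mse(predictions, targets, lengths)
```

Every real time step now produces a training signal, while the mask keeps padded steps from diluting the average.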
Maybe this is a stupid question, but I switched from basic TensorFlow recently to tflearn and while I knew little of TensorFlow, I know even less of tflearn as I have just begun to experiment with it. I was able to create a network, train it, and generate a model that achieved a satisfactory metric. I did this all without using a TensorFlow session because a) none of the documentation I was looking at necessarily suggested it and b) I didn't even think to use it.
However, I would like to predict a value for a single input (the model performs regression on images, so I'm trying to get a value for a single image), and now I'm getting an error saying that the convolutional layers need to be initialized (specifically, "FailedPreconditionError: Attempting to use uninitialized value Conv2D/W").
The only thing I've added, though, are two lines:
model = Evaluator(network)
model.predict(feed_dict={input_placeholder: image_data})
I'm asking this as a general question because my actual code is a bit troublesome to just post here because admittedly I've been very sloppy in writing it. I will mention, however, that even if I start a session and initialize all variables before that second line, then run the line in the session, I get the same error.
Succinctly put, does tflearn require a session if I've not used TensorFlow stuff directly anywhere in my code? If so, does the model need to be trained in the session? And if not, what about those two lines would cause such an error?
I'm hoping it isn't necessary for more code to be posted, but if this isn't a general issue and is actually specific to my code then I can try to format it to be understandable here and then edit the post.
I'm trying to write something similar to Google's wide and deep learning after running into difficulties doing multi-class classification (12 classes) with the sklearn API. Following the advice in a couple of posts, I used tf.group(logistic_regression_optimizer, deep_model_optimizer). It seems to work, but I was trying to figure out how to get predictions out of this model. I'm hoping that with the tf.group operator the model is learning to weight the logistic and deep models differently, but I don't know how to get these weights out so I can get the right combination of the two models' predictions. Thanks in advance for any help.
https://groups.google.com/a/tensorflow.org/forum/#!topic/discuss/Cs0R75AGi8A
How to set layer-wise learning rate in Tensorflow?
tf.group() creates a node that forces a list of other nodes to run using control dependencies. It's really just a handy way to package up logic that says "run this set of nodes, and I don't care about their output". In the discussion you point to, it's just a convenient way to create a single train_op from a pair of training operators.
If you're interested in the value of a Tensor (e.g., weights), you should pass it to session.run() explicitly, either in the same call as the training step, or in a separate session.run() invocation. You can pass a list of values to session.run(), for example, your tf.group() expression, as well as a Tensor whose value you would like to compute.
Hope that helps!
I am trying to get reliable features for ImageNet images so that I can do further classification on them. To achieve that, I would like to use TensorFlow with AlexNet for feature extraction. That means I would like to get the values from the last layer of the CNN. Could someone write a piece of Python code that explains how that works?
As jonrsharpe mentioned, that's not really Stack Overflow's MO, but in practice many people do choose to write code to help explain answers (because it's often easier).
So I'm going to assume that this was just miscommunication, and you really intended to ask one of the following two questions:
How does one grab the values of the last layer of AlexNet in TensorFlow?
How does feature extraction from the last layer of a deep convolutional network like AlexNet work?
The answer to the first question is actually very easy. I'll use the cifar10 example code in TensorFlow (which is loosely based on AlexNet) as an example. The forward pass of the network is built in the inference function, which returns a tensor holding the logits (the output of the final softmax_linear layer, before any softmax is applied). To actually get predicted image labels, you just argmax the logits, like this (I've left out some of the setup code, but if you're already running AlexNet, you already have that working):
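logits = cifar10.inference(images)
predictions = tf.argmax(logits, 1)
# Actually run the computation; fetching the tensor directly (not a one-element
# list) makes labels an array rather than a list wrapping an array
labels = session.run(predictions)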
So grabbing just the last layer features is literally just as easy as asking for them. The only wrinkle is that, in this case, cifar10 doesn't natively expose them, so you need to modify the cifar10.inference function to return both:
# old code in cifar10.inference:
# return softmax_linear
# new code in cifar10.inference:
return softmax_linear, local4
And then modify all the calls to cifar10.inference, like the one we just showed:
logits, local4 = cifar10.inference(images)
predictions = tf.argmax(logits, 1)
# Actually run the computation, this time asking for both answers
labels, last_layer = session.run([predictions, local4])
And that's it. last_layer contains the local4 activations (the last hidden layer's features) for all of the inputs you gave the model.
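The same pattern, stripped of TensorFlow so it can be run on its own: a toy NumPy forward pass that returns both the logits and the penultimate activations. The weights, shapes, and the single ReLU layer are assumptions chosen for illustration and have nothing to do with the real cifar10 model:

```python
import numpy as np

rng = np.random.default_rng(0)
w_hidden = rng.normal(size=(8, 4))   # hypothetical weights: 8 inputs -> 4 features
w_out = rng.normal(size=(4, 3))      # hypothetical weights: 4 features -> 3 classes

def inference(images):
    """Toy forward pass that returns the logits and the penultimate activations."""
    hidden = np.maximum(images @ w_hidden, 0.0)  # ReLU layer, playing the role of local4
    logits = hidden @ w_out
    return logits, hidden

images = rng.normal(size=(5, 8))     # a batch of 5 flattened toy "images"
logits, features = inference(images)
predictions = logits.argmax(axis=1)  # predicted class per image
```

Here features is the array you would hand to a downstream classifier, one row per input.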
As for the second question, that's a much deeper question, but I'm guessing that's why you want to work on it. I'd suggest starting by reading up on some of the papers published in this area. I'm not an expert here, but I do like Bolei Zhou's work. For instance, try looking at Figure 2 in "Learning Deep Features for Discriminative Localization". It's a localization paper, but it's using very similar techniques (and several of Bolei's papers use it).