I'm using this TensorFlow example to train a model on some data I downloaded. But I want to be able to input new data (as a list) and see how the net would classify it.
How can I do this?
What you probably want is to simply add, after the loop (i.e. after training), the following line for prediction:
my_predict = sess.run(predict, feed_dict={X: my_data})
where my_data should be N x 4, since 4 is the number of features in the iris dataset and N is the number of examples you want to classify.
Then, my_predict is a vector of size N containing the class of each example you provided.
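For concreteness, a minimal sketch of how that might look, assuming (as in the linked example) that X is the input placeholder and predict is the class-prediction op (e.g. an argmax over the output layer):
import numpy as np
# Two iris examples to classify; shape (N, 4)
my_data = np.array([[5.1, 3.5, 1.4, 0.2],
                    [6.7, 3.0, 5.2, 2.3]])
my_predict = sess.run(predict, feed_dict={X: my_data})
print(my_predict)  # one class index per example, e.g. [0 2]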
I have a dataset with many features, many of them categorical. I want to apply an embedding layer to convert the categorical data to numerical data for use with other models, but I got an error during training.
Now, my training process is:
Apply a label encoder to the categorical features
Split the data into training and testing sets with the train_test_split() function
Drop the numerical columns; send only the categorical features and the target y for model training
And I got this error:
indices[13,0] = 10 is not in [0, 10)
[[node functional_1/embed_6/embedding_lookup (defined at <ipython-input-34-0b6b3ae455d0>:4) ]] [Op:__inference_train_function_3509]
Errors may have originated from an input operation.
Input Source operations connected to node functional_1/embed_6/embedding_lookup:
functional_1/embed_6/embedding_lookup/2395 (defined at /usr/lib/python3.6/contextlib.py:81)
Function call stack:
train_function
After searching, I found someone saying the problem is that the vocabulary_size parameter of the embedding layer is wrong, and that enlarging vocabulary_size can solve it.
But in my case, I need to map the result back to the original labels.
For example, I have a categorical feature ['dog', 'cat', 'fish']. After label encoding, it becomes [0, 1, 2]. An embedding layer for this feature with 3 unique values should output something like
([-0.22748041], [-0.03832678], [-0.16490786]).
Then I can replace 'dog' in the original data with -0.22748041, replace 'cat' with -0.03832678, and so on.
So I can't change vocabulary_size, or the output dimension will be wrong.
I guess the problem in my case is that not all of the categorical values go into the training process. (E.g. only 'dog' and 'fish' appear in the training data; 'cat' appears only in the testing data.) If I set vocabulary_size to 3, it reports the error above. If I experimentally add 'cat' to the training data, it works fine.
My question is: does the embedding layer have to see every unique value during training to support the application I want? If there are many categorical features with many unique values, how can I ensure they all appear in the training data when splitting?
Thanks in advance!
Solution
You need to use out-of-vocabulary (OOV) buckets when creating the lookup table.
OOV buckets allow the lookup of unknown categories encountered during testing.
What does the solution do?
Setting the number of OOV buckets to a sufficiently large value (like 1000) lets you get IDs even for categories that were not present in the vocabulary.
words = tf.constant(vocabulary)
word_ids = tf.range(len(vocabulary), dtype=tf.int64)
# important
vocab_init = tf.lookup.KeyValueTensorInitializer(words, word_ids)
num_oov_buckets = 1000
table = tf.lookup.StaticVocabularyTable(vocab_init, num_oov_buckets) # lookup table mapping categories -> IDs
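For example, looking up an in-vocabulary category returns its ID, while an unseen category lands in one of the OOV buckets. A minimal sketch (the toy vocabulary here is illustrative):
vocabulary = ['dog', 'fish']  # 'cat' deliberately missing
# ... build words, word_ids, vocab_init and table as above ...
ids = table.lookup(tf.constant(['dog', 'cat']))
# 'dog' -> 0 (in vocabulary); 'cat' -> a stable hashed ID in [2, 2 + num_oov_buckets)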
Then you can encode the training set (here I am using the IMDb reviews dataset from TensorFlow Datasets):
def encode_words(X_batch, y_batch):
    """
    Encode the training set converting words to IDs
    using the lookup table just created
    """
    return table.lookup(X_batch), y_batch
train_set = datasets["train"].batch(32).map(preprocess)
train_set = train_set.map(encode_words).prefetch(1)
When creating the model:
vocab_size = 10000  # length of your vocabulary
embedding_size = 128  # tweakable hyperparameter
model = keras.models.Sequential([
    keras.layers.Embedding(vocab_size + num_oov_buckets, embedding_size,
                           input_shape=[None]),
    # usual code follows
])
and fit the data:
model.compile(loss="binary_crossentropy",
              optimizer="adam",
              metrics=["accuracy"])
history = model.fit(train_set, epochs=5)
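To map the learned embeddings back to the original labels (the asker's goal), you can read the weight matrix out of the trained Embedding layer: row i is the learned vector for category ID i. A sketch, assuming the Embedding layer is the first layer of the model and an illustrative label-to-ID mapping from the label encoder:
# shape: (vocab_size + num_oov_buckets, embedding_size)
embedding_matrix = model.layers[0].get_weights()[0]
label_to_id = {'dog': 0, 'cat': 1, 'fish': 2}  # illustrative mapping
dog_vector = embedding_matrix[label_to_id['dog']]  # learned vector for 'dog'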
I have made a CNN using Keras.
Now I want to extract features of my train set from this model. I compiled the model and trained it on the train set first. Then I used predict to extract features of the train set, using the following lines of code:
train_feature = model.predict(X_TRAIN)
print(train_feature.shape) # (692,10)
692 is the total number of train images. Now, what does the 10 represent? I had 10 classes; is that what the 10 stands for here?
This isn't called "extracting features", so you shouldn't assign to this name:
train_feature = model.predict(X_TRAIN) # I suggest train_output or something
The number of columns, i.e. 10, is the number of categories you have, assuming you built your model properly. Each of the 10 categories gets a probability on every forward pass.
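If you actually want intermediate features rather than class probabilities, a common approach is to build a second model that outputs one of the hidden layers. A sketch, assuming a tf.keras model whose penultimate layer holds the representation you want:
from tensorflow import keras
# New model sharing the trained weights, but stopping at the penultimate layer
feature_extractor = keras.Model(inputs=model.input,
                                outputs=model.layers[-2].output)
train_features = feature_extractor.predict(X_TRAIN)  # shape: (692, size of that layer)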
I am attempting to apply data augmentation to a TFRecord dataset after it has been parsed. However, when I check the size of the dataset before and after mapping the augmentation function, the sizes are the same. I know the parse function is working and the datasets are correct as I have already used them to train a model. So I have only included code to map the function and count the examples afterward.
Here is the code I am using:
num_ex = 0
def flip_example(image, label):
    flipped_image = flip(image)
    return flipped_image, label
dataset = tf.data.TFRecordDataset('train.TFRecord').map(parse_function)
for x in dataset:
    num_ex += 1
num_ex = 0
dataset = dataset.map(flip_example)
# Size of dataset
for x in dataset:
    num_ex += 1
In both cases, num_ex = 324, instead of the expected 324 for the non-augmented dataset and 648 for the augmented one. I have also tested the flip function successfully, so the issue seems to be in how the function interacts with the dataset. How do I correctly implement this augmentation?
When you apply data augmentation with the tf.data API, it is done on the fly: every example is transformed as implemented in your method. Augmenting data this way does not change the number of examples in your pipeline.
If you want to use every example n times, simply add dataset = dataset.repeat(count=n). You might also want to use tf.image.random_flip_left_right, otherwise the flip is done the same way on every pass.
In your example, the second time you check num_ex the dataset contains only the flipped images, so it is still 324.
Furthermore, if you have a dataset larger than this one, you might want to look into online data augmentation, where during training the dataset is augmented differently every epoch and you train only on the augmented data, not on the original dataset. This helps the trained model generalise better. (https://www.tensorflow.org/tutorials/images/data_augmentation)
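If the goal really is a dataset holding both the originals and the flipped copies (324 + 324 = 648 elements), one way is to concatenate the two pipelines. A sketch, reusing parse_function and flip_example from the question:
dataset = tf.data.TFRecordDataset('train.TFRecord').map(parse_function)
flipped = dataset.map(flip_example)
# Originals followed by flipped copies: 648 elements in total
augmented = dataset.concatenate(flipped)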
I am getting correct_eval as 0. I used the Boston housing dataset, split it into training and testing sets, and used TensorFlow (not Keras) to train the model. The neural network consists of 2 hidden layers of size 13 each, and the input size is also 13.
import pandas as pd
import numpy as np
data=pd.read_csv("Boston_Housing.csv")
x=data.iloc[:,0:13]
x=np.array(x)
y=data.iloc[:,13]
y=np.array(y)
from sklearn.model_selection import train_test_split
x_train,x_test,y_train,y_test=train_test_split(x,y)
import tensorflow as tf
tf.__version__
input_width=13
num_layers=2
n_hidden_layer1=13
n_hidden_layer2=13
n_op=1
weights={
"w_h1":tf.Variable(tf.random_normal([input_width,n_hidden_layer1])),
"w_h2":tf.Variable(tf.random_normal([n_hidden_layer1,n_hidden_layer2])),
"w_op":tf.Variable(tf.random_normal([n_hidden_layer2,n_op]))
}
biases={
"b_h1":tf.Variable(tf.random_normal([n_hidden_layer1])),
"b_h2":tf.Variable(tf.random_normal([n_hidden_layer2])),
"b_op":tf.Variable(tf.random_normal([n_op]))
}
tf.trainable_variables()
def forwardPropagation(x,weights,biases):
    ip_h1=tf.add(tf.matmul(x,weights['w_h1']),biases['b_h1'])
    op_h1=tf.nn.relu(ip_h1)
    ip_h2=tf.add(tf.matmul(op_h1,weights['w_h2']),biases['b_h2'])
    op_h2=tf.nn.relu(ip_h2)
    ip_op=tf.add(tf.matmul(op_h2,weights['w_op']),biases['b_op'])
    op_op=tf.nn.relu(ip_op)
    return op_op
s=tf.Session()
s.run(tf.global_variables_initializer())
x=tf.placeholder("float",[None,input_width])
y=tf.placeholder("float",[None,n_op])
pred=forwardPropagation(x,weights,biases)
correct_pred=tf.equal(pred,y_train)
pred_eval,correct_eval=s.run([pred,correct_pred],feed_dict={x:x_train,y:y_train})
pred_eval,correct_eval
correct_eval.sum()
correct_eval
correct_eval is 0, which means no prediction is correct. The pred values are mostly 0 or completely random. Kindly help me resolve this.
Take a look at this line of code:
correct_pred=tf.equal(pred,y_train)
You are evaluating the outputs of an untrained regression model using equality. There are a couple of problems with this.
The values in pred are produced by 3 layers that have random weights and biases, so each layer transforms the inputs using completely random transformations. Before you train your model on the dataset, the values in pred will be very different from the values in y_train.
pred and y_train both contain continuous values. It is almost always a bad idea to check two continuous values for absolute equality, because they must be exactly the same for the comparison to be True. Say you have trained your model and the outputs in pred match the values in y_train very closely: unless they match exactly, down to the last significant digit, the comparison will still be False. Therefore, you always get correct_eval = 0.
Most probably, you will want to calculate a metric like the mean squared error (MSE) between pred and y_train. tf.keras.losses.MeanSquaredError is the common way to calculate the MSE in TensorFlow 2.x.
As for this,
pred values are mostly 0 or completely random.
You are passing the outputs from the last layer through a ReLU function, which returns 0 for all negative inputs. Again, since the network's outputs come from random transformations, the outputs are random values with zeros in place of negative values. This is expected, and you will need to train your network for it to give any meaningful outputs.
It also looks like you are using TensorFlow 1.x, in which case you can use tf.losses.mean_squared_error.
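For example, replacing the equality check with an MSE loss and a simple training loop could look like the sketch below, reusing the pred, x, y and session s defined in the question (for regression you would typically also drop the final ReLU so the network can output negative values):
loss = tf.losses.mean_squared_error(labels=y, predictions=pred)
optimizer = tf.train.AdamOptimizer(learning_rate=0.01)
train_op = optimizer.minimize(loss)
s.run(tf.global_variables_initializer())  # re-run: the optimizer created its own variables
for epoch in range(1000):
    _, loss_val = s.run([train_op, loss],
                        feed_dict={x: x_train, y: y_train.reshape(-1, 1)})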
Good luck!
I've taken a quick course in neural networks to better understand them and now I'm trying them out for myself in R. I'm following this documentation of Keras.
The way I understand what is happening:
We are inputting a series of images and transforming these images into numerical matrices based on the arrangement of the pixels and the colors of those pixels. We then build a neural network model to learn the pattern of these arrangements, depending on the classification (0 to 9). We then use the model to predict which class an image belongs to. I'll be honest and admit I'm not entirely sure what y_train and x_train are. I simply see them as one training set and one validation set, so I'm not sure what the difference between x and y is.
My question:
I've followed the steps to the T and the model runs fine and the predictions look like they do in the documentation. Ultimately, the prediction looks like this:
I take this to mean that observation 1 in x_test is predicted to be category 7.
However, looking at x_test it looks like this:
There is a 0 in every column and row, even when I scroll further down. This is where I get confused. I'm also not sure how to view the original images, to see for myself how well the model predicts them. I would eventually like to draw a number myself in Paint or similar and see if the model can predict it, but for that I first need to understand what is going on. I feel I am close; I just need a little nudge!
I think it would help if you read more about the dimensions of the input and output layers.
In your example:
Input layer:
A single training image has two dimensions, 28x28, and is converted to a single vector of dimension 784. This acts as the input layer of the neural network.
So for m training examples, your input layer has dimensions (m, 784). Speaking analogically (to traditional ML systems), you can imagine that each pixel of an image is converted into a feature (x1, x2, ..., x784), and your training set is a dataframe with m rows and 784 columns, which is then fed into the neural network to compute y_hat = f(x1, x2, ..., x784).
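For illustration (in Python/NumPy; the R interface behaves analogously), flattening a batch of 28x28 images into an (m, 784) matrix looks like this; the data here is a random placeholder:
import numpy as np
m = 5                                 # placeholder number of examples
x_train = np.random.rand(m, 28, 28)   # m images of 28x28 pixels
x_flat = x_train.reshape(m, 28 * 28)  # shape (m, 784): one row of 784 features per image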
Output layer:
As the output of our neural network, we want it to predict which number the image shows, from 0 to 9. So for a single training example the output layer has dimension 10, one entry per digit, and for n testing examples the output layer is a matrix of dimension n x 10.
Our y is a vector of length n, something like [1, 7, 8, 2, ...], containing the true value for each testing example. But to match the dimensions of the output layer, the y vector is converted using one-hot encoding: imagine a vector of length 10 that represents the number 7 by putting a 1 at index 7 and zeros everywhere else, i.e. [0,0,0,0,0,0,0,1,0,0].
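For illustration, in the Python Keras API (the R interface mirrors it), one-hot encoding a label vector is a one-liner; a minimal sketch:
from tensorflow import keras
import numpy as np
y = np.array([1, 7, 8, 2])  # true labels for 4 examples
y_onehot = keras.utils.to_categorical(y, num_classes=10)  # shape (4, 10)
# y_onehot[1] is [0. 0. 0. 0. 0. 0. 0. 1. 0. 0.] -- a 1 at index 7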
So in your question, if you wish to see the original image, you should be able to view it before reshaping the training examples, with something like image(mnist$test$x[1, , ]).
Hope this helps!!
y_train contains the labels and x_train the training data, i.e. the images in this example. You need some kind of plotting library to plot the x values. In this example you are probably not expected to input your own drawings; if you want to, you would need to preprocess them in the same way as the MNIST data and then pass them to the model.