I'm working with TensorFlow but I'm pretty new to Python and machine learning. If I have a tensor of an image from my input pipeline, what would be the best way to train on it? In basic terms, how would I handle passing data through? I have a structure I would like to use (I know I can get certain data from certain things, like tensors), but I'm just not sure how to do so.
I'm very new to this, so all help would be greatly appreciated.
def model(image_tensor):
    tf.summary.image('input_image', image_tensor)
    return predictions

def loss(predictions, labels):
    return some_loss

def train(some_loss):
    return train_op
TensorFlow may be a bit complicated for someone new to machine learning and Python. My advice is to go through the excellent notebook tutorials on the TensorFlow site and start to understand the abstraction.
However, before that, I would use Python with NumPy (and sometimes SciPy) to implement basic machine learning methods like stochastic gradient descent, just to ensure that you understand how the algorithms work. Then implement a simple logistic regression.
So why do I ask you to do all that? Because once you get a good handle on how to work with machine learning algorithms and how tedious it can be to find the gradients by hand, you will understand why TensorFlow's abstraction is useful.
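For reference, here is a minimal NumPy sketch of logistic regression trained with plain stochastic gradient descent; the learning rate, the number of epochs and the sigmoid helper are just illustrative choices, and X, y are assumed to be NumPy arrays with 0/1 labels:
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sgd_logistic_regression(X, y, lr=0.1, epochs=100):
    # X: (n_samples, n_features), y: (n_samples,) with values 0/1
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        for i in np.random.permutation(X.shape[0]):
            p = sigmoid(X[i] @ w + b)   # predicted probability for sample i
            grad = p - y[i]             # gradient of the log loss w.r.t. the logit
            w -= lr * grad * X[i]
            b -= lr * grad
    return w, b
Once you have derived that gradient by hand once, the appeal of letting TensorFlow do it for you becomes obvious.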
I'm going to provide you with some simple examples dealing with MNIST.
from sklearn.datasets import load_digits
import tensorflow as tf
import matplotlib.pyplot as plt
import numpy as np

X, y = load_digits(n_class=2, return_X_y=True)
print("y [shape: {}] : {}".format(y.shape, y[:10]))
print("X [shape: {}]".format(X.shape))
What I've essentially done above is load two digits from the dataset (0 and 1) and print the shape of the label vector y and the feature matrix X, along with the first few labels.
If you want to see how the images look, you can call plt.imshow(X[0].reshape([8,8])).
The next step is to start defining our placeholders and variables.
input_x = tf.placeholder(tf.float32, shape=[None, X.shape[1]], name="input_x")
input_y = tf.placeholder(tf.float32, shape=[None], name="labels")
weights = tf.Variable(initial_value=tf.zeros(shape=[X.shape[1], 1]), name="weights")
b = tf.Variable(initial_value=0.0, name="bias")
What we have done here is define two placeholders in TensorFlow and tell it what shape of input to expect. I also gave the placeholders names for debugging purposes.
prediction_y = tf.squeeze(tf.nn.sigmoid(tf.add(tf.matmul(input_x, weights), b)))
loss = tf.losses.log_loss(input_y, prediction_y)
optimizer = tf.train.AdamOptimizer(0.001).minimize(loss)
There you go, that's logistic regression in TensorFlow. What the last block does is apply the sigmoid activation to our inputs, define the loss function, and then define an optimizer that minimizes that loss.
The final step is to run it.
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

s = tf.Session()
s.run(tf.global_variables_initializer())
for i in range(10):
    s.run(optimizer, {input_x: X_train, input_y: y_train})
    loss_i = s.run(loss, {input_x: X_train, input_y: y_train})
    print("loss at iteration {}: {}".format(i, loss_i))
That's essentially how you run your data through TensorFlow. This code may have typos; I don't have Python on this machine, so I'm writing from memory. However, the basic idea is there. Hope this helps.
Edit: Since you also asked about the best way to train on image data, my answer is that there isn't a single "best" way. Building a CNN is a typical approach you may want to experiment with, assuming you have a large number of labelled images. Before CNNs, people also used support vector machines reasonably well for classifying images.
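If you do go the CNN route, here is a rough sketch of what a small convolutional model could look like with the TF1 layers API; the filter counts, kernel sizes and number of classes are arbitrary choices for illustration, not a recommendation:
import tensorflow as tf

def cnn_model(image_tensor, num_classes=10):
    # image_tensor is assumed to have shape [batch, height, width, channels]
    conv1 = tf.layers.conv2d(image_tensor, filters=32, kernel_size=3, activation=tf.nn.relu)
    pool1 = tf.layers.max_pooling2d(conv1, pool_size=2, strides=2)
    conv2 = tf.layers.conv2d(pool1, filters=64, kernel_size=3, activation=tf.nn.relu)
    pool2 = tf.layers.max_pooling2d(conv2, pool_size=2, strides=2)
    flat = tf.layers.flatten(pool2)
    logits = tf.layers.dense(flat, num_classes)
    return logits
The returned logits can then be fed into a softmax cross-entropy loss and an optimizer, exactly as in the logistic regression example above.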
Usually people use scikit-learn to train a model this way:
from sklearn.ensemble import GradientBoostingClassifier as gbc
clf = gbc()
clf.fit(X_train, y_train)
predicted = clf.predict(X_test)
It works fine as long as the user's memory is large enough to accommodate the entire dataset. My dilemma is exactly this: the dataset is too big for my memory. My current workaround is to enlarge my machine's virtual memory, which has already made the system extremely slow, so I have started to wonder whether it is possible to feed the fit() method with samples in batches like this (and the answer is no, please keep reading and stop reminding me that the answer is no):
clf = gbc()
for i in range(X_train.shape[0]):
    clf.fit(X_train[i], y_train[i])
so that I can read the training set from the hard drive only when needed. I read sklearn's manual and it seems to me that it does not support this:
Calling fit() more than once will overwrite what was learned by any previous fit()
So, is this possible?
This does not work in scikit-learn, as explained in the comment section as well as in the documentation. However, you can use river (a Python package for online/streaming machine learning). This package should be well suited to your problem.
Below is an example of training a LinearRegression using river.
from river import datasets
from river import linear_model
from river import metrics
from river import preprocessing

dataset = datasets.TrumpApproval()

model = (
    preprocessing.StandardScaler() |
    linear_model.LinearRegression(intercept_lr=.1)
)

metric = metrics.MAE()

for x, y in dataset:
    y_pred = model.predict_one(x)
    # Update the running metric with the prediction and ground truth value
    metric.update(y, y_pred)
    # Train the model with the new sample
    model.learn_one(x, y)
It is not clear from your question which steps of the machine learning workflow are slow for you. As also noted in the river documentation and in this post on sklearn, there is an option to do a partial fit. You will be restricted in terms of the models you can use for this kind of incremental learning.
So, using your example, let's say we use a stochastic gradient descent classifier:
import numpy as np
from sklearn.linear_model import SGDClassifier
from sklearn.datasets import make_classification

X, y = make_classification(100000)
clf = SGDClassifier(loss='log')
all_classes = list(set(y))

for ix in np.split(np.arange(0, X.shape[0]), 100):
    clf.partial_fit(X[ix, :], y[ix], classes=all_classes)
After reading section 6, "Strategies to scale computationally: bigger data", of the official manual mentioned by @StupidWolf in this post, I am aware that there is more to this question than meets the eye.
The real difficulty lies in the design of many of the models themselves.
Take Random Forest as an example: one of the most important techniques used to improve its performance over a single Decision Tree is bagging, which means the algorithm has to draw random samples from the entire dataset to construct several weak learners as the basis of the Random Forest. This means that feeding the model one sample after another won't work with this design.
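To make the point concrete, the bootstrap step that bagging relies on looks roughly like this (assuming X and y are NumPy arrays), and it needs random access to the whole training set:
import numpy as np

def bootstrap_sample(X, y):
    # Draw n samples with replacement from the full dataset, which
    # requires all of X and y to be addressable at once.
    n = X.shape[0]
    idx = np.random.randint(0, n, size=n)
    return X[idx], y[idx]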
Although it would still be possible for scikit-learn to define an interface that end users implement, so that scikit-learn could pick a random sample by calling this interface and the user's implementation would return the needed data by scanning the dataset on the hard drive, it is far more complicated than I initially thought, and the performance gain may not be significant given that an IO-heavy "full table scan" (in database terms) would be needed frequently.
I'm trying to figure out how to configure a neural network using NeuPy. The problem is that I can't seem to find many options for a GRNN, only the sigma value, as described here:
There is a parameter, y_i, that I want to be able to adjust, but there doesn't seem to be a way to do it in the package. I'm reading through the code, but I'm not a developer, so I have trouble following all the steps; maybe a more experienced set of eyes can find a way to tweak that parameter.
Thanks
From the link that you've provided it looks like y_i is the target variable; in your case it's your training target. In the NeuPy code it's used during prediction: https://github.com/itdxer/neupy/blob/master/neupy/algorithms/rbfn/grnn.py#L140
GRNN uses lazy learning, which means that it doesn't really train; it just re-uses all your training data for each prediction. The self.target_train variable is just a copy that is stored during the training phase. You can update this value before making a prediction:
from neupy import algorithms

grnn = algorithms.GRNN(std=0.1)
grnn.train(x_train, y_train)

# modify_grnn_algorithm stands for whatever transformation you want to apply
grnn.target_train = modify_grnn_algorithm(grnn.target_train)
predicted = grnn.predict(x_test)
Or you can use the GRNN code for prediction instead of the default predict function:
import numpy as np
from neupy import algorithms
from neupy.algorithms.rbfn.utils import pdf_between_data
grnn = algorithms.GRNN(std=0.1)
grnn.train(x_train, y_train)
# In this part of the code you can do any modifications you want
ratios = pdf_between_data(grnn.input_train, x_test, grnn.std)
predicted = (np.dot(grnn.target_train.T, ratios) / ratios.sum(axis=0)).T
Say I have some sklearn training data:
features, labels = assign_dataSets() #assignment operation
Here features is a 2-D array, whereas labels is a 1-D array consisting of the values [0, 1].
Splitting the features by class label:
f1x = [features[i][0] for i in range(0, len(features)) if labels[i]==0]
f2x = [features[i][0] for i in range(0, len(features)) if labels[i]==1]
f1y = [features[i][1] for i in range(0, len(features)) if labels[i]==0]
f2y = [features[i][1] for i in range(0, len(features)) if labels[i]==1]
Now I plot this data:
import matplotlib.pyplot as plt
plt.scatter(f1x,f1y,color='b')
plt.scatter(f2x,f2y,color='y')
plt.show()
Now I want to run the fitting operation with a classifier for example SVC.
from sklearn.svm import SVC
clf = SVC()
clf.fit(features, labels)
Now my question: since support vector machines are really slow to train, is there a way to monitor the decision boundary of the classifier in real time (I mean while the fitting operation is occurring)? I know that I can plot the decision boundary after the fitting has finished, but I want the plotting to happen in real time, perhaps with threading and by running predictions on an array of points generated with linspace. Does the fit function even allow such operations, or do I need to go for some other library?
Just so you know, I am new to machine learning.
scikit-learn has this feature, but from my understanding it's limited to a few classifiers (e.g. GradientBoostingClassifier, MLPClassifier). To turn this feature on, you need to set verbose=True. For example:
clf = GradientBoostingClassifier(verbose=True)
I tried it with SVC and it didn't work as expected (probably for the reason sascha mentioned in the comment section). Here is a different variation of your question on StackOverflow.
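If an approximation is good enough, one workaround along the lines you describe (though not real-time in the strict sense) is to grow a GradientBoostingClassifier in stages with warm_start=True and redraw the boundary on a meshgrid between fits. This is only a sketch: it assumes your features array is a NumPy array with two columns as in your plot, and the grid resolution and number of stages are arbitrary.
import numpy as np
import matplotlib.pyplot as plt
from sklearn.ensemble import GradientBoostingClassifier

clf = GradientBoostingClassifier(n_estimators=10, warm_start=True)
xx, yy = np.meshgrid(np.linspace(features[:, 0].min(), features[:, 0].max(), 200),
                     np.linspace(features[:, 1].min(), features[:, 1].max(), 200))
for stage in range(10):
    clf.fit(features, labels)                # warm_start keeps the trees fitted so far
    zz = clf.predict(np.c_[xx.ravel(), yy.ravel()]).reshape(xx.shape)
    plt.clf()
    plt.contourf(xx, yy, zz, alpha=0.3)
    plt.scatter(f1x, f1y, color='b')
    plt.scatter(f2x, f2y, color='y')
    plt.pause(0.1)                           # force a redraw between stages
    clf.n_estimators += 10                   # add ten more trees on the next fit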
With regards to your second question, if you switch to TensorFlow (another machine learning library), you can use TensorBoard to monitor a few metrics (e.g. error decay) in real time.
However, to the best of my knowledge the SVM implementation is still experimental in v1.5. TensorFlow is really good when working with neural-network-based models.
If you decide to use a DNN for classification using Tensorflow then here is a discussion about implementation on StackOverflow: No easy way to add Tensorboard output to pre-defined estimator functions DnnClassifier?
Useful References:
Tensorflow SVM (only linear support for now - v1.5): https://www.tensorflow.org/api_docs/python/tf/contrib/learn/SVM
Tensorflow Kernel Methods: https://www.tensorflow.org/versions/master/tutorials/kernel_methods
Tensorflow Tensorboard: https://www.tensorflow.org/programmers_guide/summaries_and_tensorboard
Tensorflow DNNClassifier Estimator: https://www.tensorflow.org/api_docs/python/tf/estimator/DNNClassifier
I have been starting with tensorflow and have been following this standard MNIST tutorial.
However, in contrast to the expected 92% accuracy, the accuracy obtained over the training set as well as the test set is not going beyond 67%.
I am familiar with softmax and multinomial regression and have obtained more than 94% using a from-scratch Python implementation as well as using sklearn.linear_model.LogisticRegression.
I tried the same with the CIFAR-10 dataset, and in that case the accuracy was far too low, just about 10%, which is equal to randomly assigning classes. This has made me doubt my installation of TensorFlow, yet I am unsure about this.
Here is my implementation of the TensorFlow MNIST tutorial. I would appreciate it if someone could have a look at my implementation.
You constructed your graph, specified the loss function, and created the optimizer (which is correct). The problem is that you use your optimizer only once:
sess_tf.run(train_step, feed_dict={x: train_images_reshaped[0:1000], y_: train_labels[0:1000]})
So basically you run your gradient descent only once. Clearly you can't converge after only one tiny step in the right direction. You need to do something along these lines:
for _ in range(many_steps):
    X, Y = get_a_new_batch_from(mnist_data)
    sess_tf.run(train_step, feed_dict={x: X, y_: Y})
If you cannot figure out how to modify my pseudo-code, consult the tutorial; from what I remember, they covered this nicely.
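For reference, a filled-in version of that loop could look like the following, assuming you load the data with the tutorial's input_data helper and your y_ placeholder expects one-hot labels:
from tensorflow.examples.tutorials.mnist import input_data

mnist_data = input_data.read_data_sets("MNIST_data/", one_hot=True)
for _ in range(1000):
    # Pull a fresh mini-batch of 100 images and labels at every step
    X, Y = mnist_data.train.next_batch(100)
    sess_tf.run(train_step, feed_dict={x: X, y_: Y})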
W = tf.Variable(tf.zeros([784, 10]))
b = tf.Variable(tf.zeros([10]))
Initializing W with zeros may cause your network to learn nothing better than random guessing, because the gradients will be zero and backpropagation won't actually work.
You'd be better off initializing W with tf.Variable(tf.truncated_normal([784, 10], mean=0.0, stddev=0.01)); see https://www.tensorflow.org/api_docs/python/tf/truncated_normal for more.
Not sure if this is still relevant in June 2018, but the MNIST beginner tutorial no longer matches the example code on Github. If you download and run the example code, it does indeed give you the suggested 92% accuracy.
I noticed two things going wrong when following the tutorial:
1) Accidentally calling softmax twice
The tutorial first tells you to define y as follows:
y = tf.nn.softmax(tf.matmul(x, W) + b)
But later suggests that you define cross-entropy using tf.nn.softmax_cross_entropy_with_logits, which would make it easy to accidentally do the following:
cross_entropy = tf.nn.softmax_cross_entropy_with_logits(labels=y_, logits=y)
This sends your logits (tf.matmul(x, W) + b) through softmax twice, which is what got me stuck at a 67% accuracy.
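The fix is to keep the logits separate, let the cross-entropy op apply softmax internally, and apply tf.nn.softmax only when you actually want probabilities. Roughly:
logits = tf.matmul(x, W) + b
cross_entropy = tf.reduce_mean(
    tf.nn.softmax_cross_entropy_with_logits(labels=y_, logits=logits))
y = tf.nn.softmax(logits)  # only for reading out probabilities, never fed into the loss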
However I noticed that even fixing this still only brought me up to a very unstable 80-90% accuracy, which leads me to the next issue:
2) tf.nn.softmax_cross_entropy_with_logits() is deprecated
They haven't updated the tutorial yet, but the tf.nn.softmax_cross_entropy_with_logits page indicates that this function has been deprecated.
In the example code on Github they've replaced it with tf.losses.sparse_softmax_cross_entropy(labels=y_, logits=y).
However, you can't just swap the function out; the example code also changes the dimensionality on many of the other lines.
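Roughly, the Github version keeps the labels as integer class indices rather than one-hot vectors, along these lines (the placeholder shapes here are my assumption):
y_ = tf.placeholder(tf.int64, [None])  # class indices, not one-hot vectors
logits = tf.matmul(x, W) + b
cross_entropy = tf.losses.sparse_softmax_cross_entropy(labels=y_, logits=logits)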
My suggestion to anyone doing this for the first time would be to download the current working example code from Github and try to match it up to the tutorial concepts without taking the instructions literally. Hopefully they will get around to updating it!
I'm trying to write something similar to Google's wide and deep learning model, after running into difficulties doing multi-class classification (12 classes) with the sklearn API. I've tried to follow the advice in a couple of posts and used tf.group(logistic_regression_optimizer, deep_model_optimizer). It seems to work, but I'm trying to figure out how to get predictions out of this model. I'm hoping that with the tf.group operator the model learns to weight the logistic and deep models differently, but I don't know how to get these weights out so I can get the right combination of the two models' predictions. Thanks in advance for any help.
https://groups.google.com/a/tensorflow.org/forum/#!topic/discuss/Cs0R75AGi8A
How to set layer-wise learning rate in Tensorflow?
tf.group() creates a node that forces a list of other nodes to run using control dependencies. It's really just a handy way to package up logic that says "run this set of nodes, and I don't care about their output". In the discussion you point to, it's just a convenient way to create a single train_op from a pair of training operators.
If you're interested in the value of a Tensor (e.g. the weights), you should pass it to session.run() explicitly, either in the same call as the training step or in a separate session.run() invocation. You can pass a list of values to session.run(), for example your tf.group() expression as well as a Tensor whose value you would like to compute.
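As a minimal sketch (sess, weights and feed are placeholder names for your session, your weight variable and your feed_dict):
train_op = tf.group(logistic_regression_optimizer, deep_model_optimizer)

# Fetch the training op and the weights you care about in the same call;
# the group op itself has no output, so its fetch result is just None.
_, weight_values = sess.run([train_op, weights], feed_dict=feed)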
Hope that helps!