I am trying to have a look at the MNIST data set for machine learning. In Tensorflow the MNIST data set can be imported with
mnist = input_data.read_data_sets("/tmp/data/", one_hot=True)
full_data_x = mnist.train.images
However when I try to visualize an 80x80 array of the data using
test_x, test_y = mnist.test.images, mnist.test.labels
plt.gray()
plt.imshow(1-test_x[80:160,80:160])
it looks really strange like this:
How can I extract an image of the actual hand-written digits, like they are shown in the internet:
I saw the similar questions like this. However I would especially be interested where in the training data array the images are actually hidden. I know that tensor flow module provides a function to display the images.
I think I understand your question now, and it is a bit different than the one I thought was duplicate.
The images are not necessarily hidden. Each index of that list is an image in itself:
num_test_images, num_train_images = len(mnist.test.images), len(mnist.train.images)
size_of_first_test_image, size_of_first_train_image = len(mnist.test.images[0]), len(mnist.train.images[0])
print num_test_images, num_train_images
print size_of_first_test_image, size_of_first_train_image
output:
10000 55000
784 784
You can see that the number of training and testing images is the length of each mnist list. Each image is a flat array of size 784. You will have to reshape it yourself to display it using numpy or something of the sort.
Try this:
first_test_image = np.array(mnist.test.images[0], dtype='float')
reshaped_image = first_image.reshape((28, 28))
I'm trying to use the VOC2012 dataset for training a CNN. For my project, I require B&W data, so I extracted the R components. So far so good. The trouble is that the images are of different sizes, so I can't figure out how to pass it to the model. I compiled my model, and then created my mini-batches of size 32 as below (where X_train and Y_train are the paths to the files).
for x in X_train:
img = plt.imread(x)
img = img.reshape(*(img.shape), 1)
X.append(img)
for y in Y_train:
img = plt.imread(y)
img = img.reshape(*(img.shape), 1)
Y.append(img)
model.train_on_batch(np.array(X), np.array(Y))
However, I suspect that because the images are all of different sizes, the numpy array has a shape (32,) rather than (32, height, width, 1) as I'd expect. How do I take care of this?
According to some sources, one is indeed able to train at least some architectures with varying input sizes. (Quora, Cross Validated)
When it comes to generating an array of arrays varying in size, one might simply use a Python list of NumPy arrays, or an ndarray of type object to collect all the image data. Then in the training process, the Quora answer mentioned that only batch size 1 can be used, or one might clump several images together based on the sizes. Even padding with zeros could be used to make the images evenly sized, but I can't say much about the validity of that approach.
Best of luck in your research!
Example code for illustration:
# Generate 10 "images" with different sizes
images = [np.zeros((i+5, i+10)) for i in range(10)]
images = np.array([np.zeros((i+5, i+10)) for i in range(10)])
# Or an empty array to append to
images = np.array([], dtype=object)
I am currently trying to get into machine learning and neural networks, but my lack of programming skills is kind of hindering me at the moment. I am following an online tutorial in which these lines of code were made to evaluate the created model:
pred_fn = tf.estimator.inputs.pandas_input_fn(x=X_test,batch_size=len(X_test),shuffle=False)
predictions = list(model.predict(input_fn=pred_fn))
predictions[0]
final_preds = []
for pred in predictions:
final_preds.append(pred['class_ids'][0])
final_preds[:10]
from sklearn.metrics import classification_report
print(classification_report(y_test,final_preds))
This works very well for me an tells me the precision I achieved on these 10 inputs I chose from X_test. Unfortunately, I can't really figure out how to be able to predict a particular, single value from X_test or maybe even a manually input value that has the same dimensions as an element of X_test.
X_test is a pandas.core.frame.DataFrame and includes 15 columns and thousands of rows. Therefore, I would find it helpful to maybe predict or evaluate a certain value.
If I missed any essential information, that I should have included, let me know. Thanks in advance!
Why don't you just take sections of the X_test dataframe, or pass in single values as a dataframe with a single row.
Sectioning a dataframe:
temp = X_test[i:i+1]
to test with the ith row, use temp now instead of X_test.
Or create a new dataframe with required data:
temp = pandas.DataFrame(data, columns = X_test.columns)
where data is input as a list (iterable) [[a1,a2,a3...a15]].
again use temp instead of X_test in your code.
I want to make a program to recognize the digit in an image. I follow the tutorial in scikit learn .
I can train and fit the svm classifier like the following.
First, I import the libraries and dataset
from sklearn import datasets, svm, metrics
digits = datasets.load_digits()
n_samples = len(digits.images)
data = digits.images.reshape((n_samples, -1))
Second, I create the SVM model and train it with the dataset.
classifier = svm.SVC(gamma = 0.001)
classifier.fit(data[:n_samples], digits.target[:n_samples])
And then, I try to read my own image and use the function predict() to recognize the digit.
Here is my image:
I reshape the image into (8, 8) and then convert it to a 1D array.
img = misc.imread("w1.jpg")
img = misc.imresize(img, (8, 8))
img = img[:, :, 0]
Finally, when I print out the prediction, it returns [1]
predicted = classifier.predict(img.reshape((1,img.shape[0]*img.shape[1] )))
print predicted
Whatever I user others images, it still returns [1]
When I print out the "default" dataset of number "9", it looks like:
My image number "9" :
You can see the non-zero number is quite large for my image.
I dont know why. I am looking for help to solve my problem. Thanks
My best bet would be that there is a problem with your data types and array shapes.
It looks like you are training on numpy arrays that are of the type np.float64 (or possibly np.float32 on 32 bit systems, I don't remember) and where each image has the shape (64,).
Meanwhile your input image for prediction, after the resizing operation in your code, is of type uint8 and shape (1, 64).
I would first try changing the shape of your input image since dtype conversions often just work as you would expect. So change this line:
predicted = classifier.predict(img.reshape((1,img.shape[0]*img.shape[1] )))
to this:
predicted = classifier.predict(img.reshape(img.shape[0]*img.shape[1]))
If that doesn't fix it, you can always try recasting the data type as well with
img = img.astype(digits.images.dtype).
I hope that helps. Debugging by proxy is a lot harder than actually sitting in front of your computer :)
Edit: According to the SciPy documentation, the training data contains integer values from 0 to 16. The values in your input image should be scaled to fit the same interval. (http://scikit-learn.org/stable/modules/generated/sklearn.datasets.load_digits.html#sklearn.datasets.load_digits)
1) You need to create your own training set - based on data similar to what you will be making predictions. The call to datasets.load_digits() in scikit-learn is loading a preprocessed version of the MNIST Digits dataset, which, for all we know, could have very different images to the ones that you are trying to recognise.
2) You need to set the parameters of your classifier properly. The call to svm.SVC(gamma = 0.001) is just choosing an arbitrary value of the gamma parameter in SVC, which may not be the best option. In addition, you are not configuring the C parameter - which is pretty important for SVMs. I'd bet that this is one of the reasons why your output is 'always 1'.
3) Whatever final settings you choose for your model, you'll need to use a cross-validation scheme to ensure that the algorithm is effectively learning
There's a lot of Machine Learning theory behind this, but, as a good start, I would really recommend to have a look at SVM - scikit-learn for a more in-depth description of how the SVC implementation in sickit-learn works, and GridSearchCV for a simple technique for parameter setting.
It's just a guess but... The Training set from Sk-Learn are black numbers on a white background. And you are trying to predict numbers which are white on a black background...
I think you should either train on your training set, or train on the negative version of your pictures.
I hope this help !
If you look at:
http://scikit-learn.org/stable/modules/generated/sklearn.datasets.load_digits.html#sklearn.datasets.load_digits
you can see that each point in the matrix as a value between 0-16.
You can try to transform the values of the image to between 0-16. I did it and now the prediction works well for the digit 9 but not for 8 and 6. It doesn't give 1 any more.
from sklearn import datasets, svm, metrics
import cv2
import numpy as np
# Load digit database
digits = datasets.load_digits()
n_samples = len(digits.images)
data = digits.images.reshape((n_samples, -1))
# Train SVM classifier
classifier = svm.SVC(gamma = 0.001)
classifier.fit(data[:n_samples], digits.target[:n_samples])
# Read image "9"
img = cv2.imread("w1.jpg")
img = img[:,:,0];
img = cv2.resize(img, (8, 8))
# Normalize the values in the image to 0-16
minValueInImage = np.min(img)
maxValueInImage = np.max(img)
normaliizeImg = np.floor(np.divide((img - minValueInImage).astype(np.float),(maxValueInImage-minValueInImage).astype(np.float))*16)
# Predict
predicted = classifier.predict(normaliizeImg.reshape((1,normaliizeImg.shape[0]*normaliizeImg.shape[1] )))
print predicted
I have solved this problem using below methods:
check the number of attributes, too large or too small.
check the scale of your gray value, I change to [0,16].
check data type, I change it to uint8.
check the number of training data, too small or not.
I hope it helps. ^.^
Hi in addition to #carrdelling respond, i will add that you may use the same training set, if you normalize your images to have the same range of value.
For example you could binaries your data ( 1 if > 0, 0 else ) or you could divide by the maximum intensity in your image to have an arbitrary interval [0;1].
You probably want to extract features relevant to to your data set from the images and train your model on them.
One example I copied from here.
surf = cv2.SURF(400)
kp, des = surf.detectAndCompute(img,None)
But the SURF features may not be the most useful or relevant to your dataset and training task. You should try others too like HOG or others.
Remember this more high level the features you extract the more general/error-tolerant your model will be to unseen images. However, you may be sacrificing accuracy in your known samples and test cases.
I have scraped a lot of ebay titles like this one:
Apple iPhone 5 White 16GB Dual-Core
and I have manually tagged all of them in this way
B M C S NA
where B=Brand (Apple) M=Model (iPhone 5) C=Color (White) S=Size (Size) NA=Not Assigned (Dual Core)
Now I need to train a SVM classifier using the libsvm library in python to learn the sequence patterns that occur in the ebay titles.
I need to extract new value for that attributes (Brand, Model, Color, Size) by considering the problem as a classification one. In this way I can predict new models.
I want to considering this features:
* Position
- from the beginning of the title
- to the end of the listing
* Orthographic features
- current word contains a digit
- current word is capitalized
....
I can't understand how can I give all this info to the library. The official doc lacks a lot of information
My class are Brand, Model, Size, Color, NA
what does the input file of the SVM algo must contain?
how can I create it? could I have an example of that file considering the 4 features that I put as example in my question? Can I also have an example of the code that I must use to elaborate the input file ?
* UPDATE *
I want to represent these features... How can I must do?
Identity of the current word
I think that I can interpret it in this way
0 --> Brand
1 --> Model
2 --> Color
3 --> Size
4 --> NA
If I know that the word is a Brand I will set that variable to 1 (true).
It is ok to do it in the training test (because I have tagged all the words) but how can I do that for the test set? I don't know what is the category of a word (this is why I'm learning it :D).
N-gram substring features of current word (N=4,5,6)
No Idea, what does it means?
Identity of 2 words before the current word.
How can I model this feature?
Considering the legend that I create for the 1st feature I have 5^(5) combination)
00 10 20 30 40
01 11 21 31 41
02 12 22 32 42
03 13 23 33 43
04 14 24 34 44
How can I convert it to a format that the libsvm (or scikit-learn) can understand?
Membership to the 4 dictionaries of attributes
Again how can I do it?
Having 4 dictionaries (for color, size, model and brand) I thing that I must create a bool variable that I will set to true if and only if I have a match of the current word in one of the 4 dictionaries.
Exclusive membership to dictionary of brand names
I think that like in the 4. feature I must use a bool variable. Do you agree?
Here's a step-by-step guide for how to train an SVM using your data and then evaluate using the same dataset. It's also available at http://nbviewer.ipython.org/gist/anonymous/2cf3b993aab10bf26d5f. At the url you can also see the output of the intermediate data and the resulting accuracy (it's an iPython notebook)
Step 0: Install dependencies
You need to install the following libraries:
pandas
scikit-learn
From command line:
pip install pandas
pip install scikit-learn
Step 1: Load the data
We will use pandas to load our data.
pandas is a library for easily loading data. For illustration, we first save
sample data to a csv and then load it.
We will train the SVM with train.csv and get test labels with test.csv
import pandas as pd
train_data_contents = """
class_label,distance_from_beginning,distance_from_end,contains_digit,capitalized
B,1,10,1,0
M,10,1,0,1
C,2,3,0,1
S,23,2,0,0
N,12,0,0,1"""
with open('train.csv', 'w') as output:
output.write(train_data_contents)
train_dataframe = pd.read_csv('train.csv')
Step 2: Process the data
We will convert our dataframe into numpy arrays which is a format that scikit-
learn understands.
We need to convert the labels "B", "M", "C",... to numbers also because svm does
not understand strings.
Then we will train a linear svm with the data
import numpy as np
train_labels = train_dataframe.class_label
labels = list(set(train_labels))
train_labels = np.array([labels.index(x) for x in train_labels])
train_features = train_dataframe.iloc[:,1:]
train_features = np.array(train_features)
print "train labels: "
print train_labels
print
print "train features:"
print train_features
We see here that the length of train_labels (5) exactly matches how many rows
we have in trainfeatures. Each item in train_labels corresponds to a row.
Step 3: Train the SVM
from sklearn import svm
classifier = svm.SVC()
classifier.fit(train_features, train_labels)
Step 4: Evaluate the SVM on some testing data
test_data_contents = """
class_label,distance_from_beginning,distance_from_end,contains_digit,capitalized
B,1,10,1,0
M,10,1,0,1
C,2,3,0,1
S,23,2,0,0
N,12,0,0,1
"""
with open('test.csv', 'w') as output:
output.write(test_data_contents)
test_dataframe = pd.read_csv('test.csv')
test_labels = test_dataframe.class_label
labels = list(set(test_labels))
test_labels = np.array([labels.index(x) for x in test_labels])
test_features = test_dataframe.iloc[:,1:]
test_features = np.array(test_features)
results = classifier.predict(test_features)
num_correct = (results == test_labels).sum()
recall = num_correct / len(test_labels)
print "model accuracy (%): ", recall * 100, "%"
Links & Tips
Example code for how to load LinearSVC: http://scikitlearn.org/stable/modules/svm.html#svm
Long list of scikit-learn examples: http://scikitlearn.org/stable/auto_examples/index.html. I've found these mildly helpful but
often confusing myself.
If you find that the SVM is taking a long time to train, try LinearSVC
instead: http://scikitlearn.org/stable/modules/generated/sklearn.svm.LinearSVC.html
Here's another tutorial on getting familiar with machine learning models: http://scikit-learn.org/stable/tutorial/basic/tutorial.html
You should be able to take this code and replace train.csv with your training data, test.csv with your testing data, and get predictions for your test data, along with accuracy results.
Note that since you're evaluating using the data you trained on the accuracy will be unusually high.
I echo the comment of #MarcoPashkov but will try to elaborate on the LibSVM file format. I find the documentation comprehensive yet hard to find, for the Python lib I recommend the README on GitHub.
An important piece to recognize is that there is a Sparse format where all features which are 0 get removed and a Dense format where features which are 0 are not removed. These two are equivalent examples of each taken from the README.
# Dense data
>>> y, x = [1,-1], [[1,0,1], [-1,0,-1]]
# Sparse data
>>> y, x = [1,-1], [{1:1, 3:1}, {1:-1,3:-1}]
The y variable stores a list of all the categories for the data.
The x variable stores the feature vector.
assert len(y) == len(x), "Both lists should be the same length"
The format found in the Heart Scale Example is a Sparse format where the dictionary key is the feature index and the dictionary value is the feature value while the first value is the category.
The Sparse format is incredibly useful while using a Bag of Words Representation for your feature vector.
As most documents will typically use a very small subset of the words used in the corpus, the resulting matrix will have many feature values that are zeros (typically more than 99% of them).
For instance a collection of 10,000 short text documents (such as emails) will use a vocabulary with a size in the order of 100,000 unique words in total while each document will use 100 to 1000 unique words individually.
For an example using the feature vector you started with, I trained a basic LibSVM 3.20 model. This code isn't meant to be used but may help in showing how to create and test a model.
from collections import namedtuple
# Using namedtuples for descriptive purposes, in actual code a normal tuple would work fine.
Category = namedtuple("Category", ["index", "name"])
Feature = namedtuple("Feature", ["category_index", "distance_from_beginning", "distance_from_end", "contains_digit", "capitalized"])
# Separate up the set of categories, libsvm requires a numerical index so we associate each with an index.
categories = dict()
for index, name in enumerate("B M C S NA".split(' ')):
# LibSVM expects index to start at 1, not 0.
categories[name] = Category(index + 1, name)
categories
Out[0]: {'B': Category(index=1, name='B'),
'C': Category(index=3, name='C'),
'M': Category(index=2, name='M'),
'NA': Category(index=5, name='NA'),
'S': Category(index=4, name='S')}
# Faked set of CSV input for example purposes.
csv_input_lines = """category_index,distance_from_beginning,distance_from_end,contains_digit,capitalized
B,1,10,1,0
M,10,1,0,1
C,2,3,0,1
S,23,2,0,0
NA,12,0,0,1""".split("\n")
# We just ignore the header.
header = csv_input_lines[0]
# A list of Feature namedtuples, this will be trained as lists.
features = list()
for line in csv_input_lines[1:]:
split_values = line.split(',')
# Create a Feature with the values converted to integers.
features.append(Feature(categories[split_values[0]].index, *map(int, split_values[1:])))
features
Out[1]: [Feature(category_index=1, distance_from_beginning=1, distance_from_end=10, contains_digit=1, capitalized=0),
Feature(category_index=2, distance_from_beginning=10, distance_from_end=1, contains_digit=0, capitalized=1),
Feature(category_index=3, distance_from_beginning=2, distance_from_end=3, contains_digit=0, capitalized=1),
Feature(category_index=4, distance_from_beginning=23, distance_from_end=2, contains_digit=0, capitalized=0),
Feature(category_index=5, distance_from_beginning=12, distance_from_end=0, contains_digit=0, capitalized=1)]
# Y is the category index used in training for each Feature. Now it is an array (order important) of all the trained indexes.
y = map(lambda f: f.category_index, features)
# X is the feature vector, for this we convert all the named tuple's values except the category which is at index 0.
x = map(lambda f: list(f)[1:], features)
from svmutil import svm_parameter, svm_problem, svm_train, svm_predict
# Barebones defaults for SVM
param = svm_parameter()
# The (Y,X) parameters should be the train dataset.
prob = svm_problem(y, x)
model=svm_train(prob, param)
# For actual accuracy checking, the (Y,X) parameters should be the test dataset.
p_labels, p_acc, p_vals = svm_predict(y, x, model)
Out[3]: Accuracy = 100% (5/5) (classification)
I hope this example helps, it shouldn't be used for your training. It is meant as an example only because it is inefficient.