I'm writting a python program to classify letters and numbers. I've wrote the classifier and I have the images for my dataset. I really don't have much experience in python or working with images.
My problem is how to create my dataset with the images I have. How to create like an array with the shape of them. Should I just create a numpy array of each image? Or use color histogram?
I will probably convert all images to grayscale.
I've found the link bellow that classifies cats and dogs. It uses two method to extract images features but I don't know if this would apply for my case.
k-nn-classifier-for-image-classification
Could anyone guide me could I extract the features of my images to a vector, for example, so I can write this data in my "dataset.data" file?
I'll use images like the image bellow:
Letter "e"
I've even considered resizing the image to 32x32 and create like a bitmap of 0's and 1's representing the image.
Could anyone guide me could I extract the features of my images to a vector, for example, so I can write this data in my "dataset.data" file?
Thank you.
You would usually want to create a Numpy array to hold all your training data. It is common to arrange it in the following shape:
X_train.shape = (N, img.shape[0], img.shape[1])
where N is the number of images in the set.
This way, if you are using single channel (gray scale), X_train[i,:,:] will hold the values of the i'th image pixels. Note that it's recommended to normalize these values, but this will depend on the model you choose to train.
Here is a quick example of how you can create build such an array:
import numpy as np
import cv2
X = np.zeros((N, IMG_SIZE[0], IMG_SIZE[1]), dtype=np.float32)
y = np.zeros((N))
for idx, img_path in enumerate(images_path):
img = cv2.imread(img_path)
assert ((img.shape[0], img.shape[1]) == IMG_SIZE)
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
X[idx, :, :] = gray
y[idx] = # label of this image
# if you wish to normalize:
X = (X/255.0) - 0.5
There are many tutorials for digit classifiers out there, usually using the MNIST dataset as an example. Here is one example but you should go ahead and google it.
If you want to achieve better results, you would probably want to look into neural networks. Again, many tutorials out there, here is one example using tensorflow.
I think you might be looking for this:
http://www.scipy-lectures.org/advanced/scikit-learn/
Sklearn is a very easy to learn machine learning package, with lots of tutorials.
Hope it helps,
Related
I have an image that contains shapes made out of lines. I would like to use a Python library (e.g. opencv or skimage) to identify the shapes.
So if the input is this:
I'm trying to get to an output like this:
I'm fairly new to CV techniques and have been trying several processing techniques in opencv documentation/tutorials, but haven't found any ideas that can emphasis the angle or clustering of the lines as I imagine I need to solve this problem.
I'm also open to machine learning based approaches to solving this problem but preferably ones that won't require me to generate an especially large dataset.
This isn't a perfect answer, but I've found some success using this Gaussian adaptive thresholding and Harris corner detection.
import cv2
img = cv2.imread("input.jpg")
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
threshold = cv2.adaptiveThreshold(gray,255,cv2.ADAPTIVE_THRESH_GAUSSIAN_C,cv2.THRESH_BINARY,3,0)
img = cv2.cornerHarris(threshold, 35, 1, 0.2)
plt.imshow(img, cmap='binary')
It doesn't solve the problem completely but it manages to separate the shapes in a meaningful way which is a strong start for me.
I hope you're all doing well!
I'm new to Image Manipulation, and so I want to apologize right here for my simple question. I'm currently working on a problem that involves classifying an object called jet into two known categories. This object is made of sub-objects. My idea is to use this sub-objects to transform each jet in a pixel image, and then applying convolutional neural networks to find the patterns.
Here is an example of the pixel images:
jet's constituents pixel distribution
To standardize all the images, I want to find the two most intense pixels and make sure the axis connecting them is in the vertical direction, as well as make sure that the most intense pixel is at the top. It also would be good to impose that one of the sides (left or right) of the image contains the majority of the intensity and to normalize the intensity of the whole image to 1.
My question is: as I'm new to this kind of processing, I don't know if there is a library in Python that can handle these operations. Are you aware of any?
PS: the picture was taken from here:https://arxiv.org/abs/1407.5675
You can look into OpenCV library for Python:
https://docs.opencv.org/master/d6/d00/tutorial_py_root.html.
It supports a lot of image processing functions.
In your case, it probably would be easier to convert the image into a more suitable color space in which one axis stands for color intensity (e.g HSI, HSL, HSV) and trying to find indices of the maximum values along this axis (this should return the pixels with the highest intensity in the image).
Generally, in Python, we use PIL library for basic manipulations with images and OpenCV for advances ones.
But, if understand your task correctly, you can just think of an image as a multidimensional array and use numpy to manipulate it.
For example, if your image is stored in a variable of type numpy.array called img, you can find maximum value along the desired axis just by writing:
img.max(axis=0)
To normalize image you can use:
img /= img.max()
To find which image part is brighter, you can split an img array into desired parts and calculate their mean:
left = img[:, :int(img.shape[1]/2), :]
right = img[:, int(img.shape[1]/2):, :]
left_mean = left.mean()
right_mean = right.mean()
I have a series of 2d images of two types, either a star or a pentagon. My aim is to classify all of these images respectively. I have 30 star images and 30 pentagon images. An example of each image is shown side by side here:
Before I apply the KNN classification algorithm, I need to extract a feature vector from all the images. The feature vectors must all be of the same size however the 2d images all vary in size. I have extracted read in my image and I get back a 2d array with zeros and ones.
image = pl.imread('imagepath.png')
My question is how do I process image in order produce a meaningful feature vector that contains enough information to allow me to do the classification. It has to be a single vector per image which I will use for training and testing.
If you want to use opencv then:
Resize images to a standard size:
import cv2
import numpy as np
src = cv2.imread("/path.jpg")
target_size = (64,64)
dst = cv2.resize(src, target_size)
Convert to a 1D vector:
dst = dst.reshape(target_size.shape[0] * target_size.shape[1])
Before you start coding, you have to decide whuch features are useful for this task:
The easiest way out is trying the approach in #Jordan's answer and converting the entire image to a feature. This could work because the classes are simple patterns, and is interesting if you are using KNN. If this does not work well, the following steps show how you should approach the problem.
The number of black pixels might not help, because the size of the
star and pentagon can vary.
The number of sharp corners is very likely to be useful.
The number of straight line segments might be useful, but this could
be unreliable because the shapes are hand-drawn.
Supposing you want to have a go at using the number of corners as a feature, you can refer to this page to learn how to extract corners.
I am having a hard time understanding what scipy.cluster.vq really does!!
On Wikipedia it says Clustering can be used to divide a digital image into distinct regions for border detection or object recognition.
on other sites and books it says we can use clustering methods for clustering images for finding groups of similar images.
AS i am interested in image processing ,I really need to fully understand what clustering is .
So
Can anyone show me simple examples about using scipy.cluster.vq with images??
The kind of clustering performed by scipy.cluster.vq is definitely of the latter (groups of similar images) variety.
The only clustering algorithm implemented in scipy.cluster.vq is the K-Means algorithm, which typically treats input data as points in n-dimensional euclidean space, and attempts to divide that space so that new, incoming data can be summarized by saying "example x is most like centroid y". Centroids can be thought of as prototypical examples of the input data. Vector quantization leads to concise, or compressed representations because, instead of remembering all 100 pixels of each new image we see, we can remember a single integer which points at the prototypical example that the new image is most like.
If you had many small grayscale images:
>>> import numpy as np
>>> images = np.random.random_sample((100,10,10))
So, we've got 100 10x10 pixel images. Let's assume they already all have similar brightness and contrast. The scipy kmeans implementation expects flat vectors:
>>> images = images.reshape((100,100))
>>> images.shape
(100,100)
Now, let's train the K-Means algorithm so that any new incoming image can be assigned to one of 10 clusters:
>>> from scipy.cluster.vq import kmeans, vq
>>> codebook,distortion = kmeans(images,10)
Finally, let's say we have five new images we'd like to assign to one of the ten clusters:
>>> newimages = np.random.random_samples((5,10,10))
>>> clusters = vq(newimages.reshape((5,100)),codebook)
clusters will contain the integer index of the best matching centroid for each of the five examples.
This is kind of a toy example, and won't yield great results unless the objects of interest in the images you're working with are all centered. Since objects of interest might appear anywhere in larger images, it's typical to learn centroids for smaller image "patches", and then convolve them (compare them at many different locations) with larger images to promote translation-invariance.
The second is what clustering is: group objects that are somewhat similar (and that could be images). Clustering is not a pure imaging technique.
When processing a single image, it can for example be applied to colors. This is a quite good approach for reducing the number of colors in an image. If you cluster by colors and pixel coordinates, you can also use it for image segmentation, as it will group pixels that have a similar color and are close to each other. But this is an application domain of clustering, not pure clustering.
Recently I downloaded some flags from the CIA world factbook. Now I want to "classify them.
Get the colors
Get some shapes (stars, moons etc.)
While browsing I came across the Python Image Library which allows me to extract the colors (i.e. for Austria:
#!/usr/bin/env python
import Image
bild = Image.open("au-lgflag.gif").convert("RGB")
bild.getcolors()
[(44748, (255, 255, 255)), (452, (236, 145, 146)), (653, (191, 147, 149)), ...)]
What I found strange here is that the austrian flag only has two colors in it, but the above output shows more than ten. Do you know why? My idea was to only count the top 5 colors and as I'm not interested in every color I would do some "normalize" the numbers to multiples of 64 (so (236, 145, 146) becomes (192, 128, 128)).
However at the moment I have no idea what is the best way to extract more information (Ist there a star in the image? or else). Could you give me some hints on how to do it?
Thanks in advance
The Python Imaging Library - PIL just does basic image manipulation - opening, some transforms or filters, and saving to other formats.
Pattern recognition, is part of an advanced image processign field and evolving -- it deos use algorithms far different than those present in PIL.
There are some libraries and frameworks you can use in Python for pattern recognition - (recognising stars, and moons, and so) - Although I advance you: if you want this just to classify one0-hundered-and-a-few coutnry flags, you should do it manually, rather than try to dive in pattern recognition.
Your comment on the number of colors tells that you are not used with computer images at all. And pattern recognition is hardcore, even with a python front-end. (You can't expect any current framework to know beforehand what is a "moon" or a "star" for example)
So, for less than 500 images, you can resort to software that allows you to tag images manually and write some code to link the tags to each flag.
As for the colors: Computer rasterized images are formed of pixels. These are Square. At the boundary between different colors, if a pixel is on one color (say white), and its neighbor is a complete different color (like red), this boundary will show up jagged. This is known as "aliasing". To diminish this, computer software mixes colors at hard boundaries, creating intermediate colors - that is why a PNG even with 2 apparent colors can have several colors internally. For .JPG it is even worse, because the rounded decimal numbers for RGB colors we use are not even stored as they are in the image.
Unlike pattern recognizing, you can downsize the number of colours seen by using just the most significant bits of each component. I'd say the two most significant bits would be enough.
The following python function could do that using a color count given by PIL:
def get_main_colors(col_list):
main_colors = set()
for index, color in col_list:
main_colors.add(tuple(component >> 6 for component in color))
return [tuple(component << 6 for component in color) for color in main_colors]
call it with "get_main_colors(bild.get_colors()) " for example.
Here is another question dealing with the pattern recognition part:
python image recognition
First some quick terminology, just in case:
A classifier learns a map of inputs to outputs. You train a classifier by giving it input/output pairs, for example feature vectors like color information and labels like 'czech flag'. In practice, the labels are represented as scalar numbers. In your example, you have a multi-class problem, which simply means that there are more than two possible labels (obviously, since there are more than two country flags). Training a multi-class classifier can a little trickier than the vanilla binary classifier, so you may want to search for terms like "multi-class classifier" or "one-vs-many classifier" to investigate the best approach for you.
On to the problem:
I think your problem might be easily-solved using a simple classifier, like k-nearest neighbors, with color histograms as feature vectors. In particular, I would use HSV feature vectors as opposed to RGB feature vectors. Some great results have been reported in the literature using just this kind of simple classifier system, for example: SVMs for Histogram-Based Image Classification. In that paper, the authors use a particular classifier known as a Support Vector Machine (SVM) and HSV feature vectors. HSV feature vectors also sidestep the issue of image scale and rotation, for example a flag that is 1024x768 vs 640x480, or a flag that is rotated in an image by 45 degrees.
The pseudocode for training the algorithm would look something like this:
# training simple kNN -- just compute feature vectors, collect labels
X = [] # tuple (input example, label)
for training_image in data:
x = get_hsv_vector(training_image)
y = get_label(training_image)
X.append((x,y))
# classification -- pick k closest feature vectors
K = 3 # the 'k' in kNN -- how many similar featvecs to use
d = [] # (distance, label) tuples for scoring
x_test = get_hsv_vector(test_image) # feature vector to be classified
for x_train in X:
d.append((distance(x_test[0], x_train), x_test[1])
# sort distances, d, by closeness and pick top K labels for scoring
d.sort()
output = get_majority_vote([x[1] for x in d[:K]])
The kNN classifier is available in several python packages, with good documentation. It should be pretty easy to convert to HSV colorspace as well. If you don't achieve your desired results, you can try to improve your feature vectors or your classifier.