I have a series of 2D images of two types, either a star or a pentagon. My aim is to classify each of these images correctly. I have 30 star images and 30 pentagon images. An example of each image is shown side by side here:
Before I apply the KNN classification algorithm, I need to extract a feature vector from each image. The feature vectors must all be of the same size; however, the 2D images vary in size. I have read in my image and I get back a 2D array of zeros and ones.
image = pl.imread('imagepath.png')
My question is how do I process the image in order to produce a meaningful feature vector that contains enough information to allow me to do the classification. It has to be a single vector per image which I will use for training and testing.
If you want to use OpenCV, then:
Resize images to a standard size:
import cv2
import numpy as np
src = cv2.imread("/path.jpg")
target_size = (64,64)
dst = cv2.resize(src, target_size)
Convert to a 1D vector:
dst = dst.reshape(target_size[0] * target_size[1])
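Putting the pieces together, here is a minimal sketch of feeding such flattened vectors into a KNN classifier with scikit-learn; train_paths and train_labels are placeholders for your image paths and their classes:
import cv2
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

target_size = (64, 64)

def to_vector(path):
    img = cv2.imread(path, cv2.IMREAD_GRAYSCALE)          # read as a single channel
    img = cv2.resize(img, target_size)                    # standardize the size
    return img.reshape(target_size[0] * target_size[1])  # flatten to a 1D feature vector

X = np.array([to_vector(p) for p in train_paths])         # train_paths / train_labels are placeholders
knn = KNeighborsClassifier(n_neighbors=3)
knn.fit(X, train_labels)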
Before you start coding, you have to decide which features are useful for this task:
The easiest way out is trying the approach in #Jordan's answer and converting the entire image to a feature. This could work because the classes are simple patterns, and is interesting if you are using KNN. If this does not work well, the following steps show how you should approach the problem.
The number of black pixels might not help, because the size of the star and pentagon can vary.
The number of sharp corners is very likely to be useful.
The number of straight line segments might be useful, but this could be unreliable because the shapes are hand-drawn.
Supposing you want to have a go at using the number of corners as a feature, you can refer to this page to learn how to extract corners.
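For instance, a rough sketch of counting corners with OpenCV's goodFeaturesToTrack (the file path and parameter values here are guesses you would need to tune):
import cv2
img = cv2.imread('shape.png', cv2.IMREAD_GRAYSCALE)   # placeholder path to one of your images
corners = cv2.goodFeaturesToTrack(img, maxCorners=20, qualityLevel=0.3, minDistance=10)
num_corners = 0 if corners is None else len(corners)  # roughly 10 for a star, 5 for a pentagon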
I hope you're all doing well!
I'm new to Image Manipulation, and so I want to apologize right here for my simple question. I'm currently working on a problem that involves classifying an object called a jet into two known categories. This object is made of sub-objects. My idea is to use these sub-objects to transform each jet into a pixel image, and then apply convolutional neural networks to find the patterns.
Here is an example of the pixel images:
jet's constituents pixel distribution
To standardize all the images, I want to find the two most intense pixels and make sure the axis connecting them is in the vertical direction, as well as make sure that the most intense pixel is at the top. It also would be good to impose that one of the sides (left or right) of the image contains the majority of the intensity and to normalize the intensity of the whole image to 1.
My question is: as I'm new to this kind of processing, I don't know if there is a library in Python that can handle these operations. Are you aware of any?
PS: the picture was taken from here: https://arxiv.org/abs/1407.5675
You can look into the OpenCV library for Python:
https://docs.opencv.org/master/d6/d00/tutorial_py_root.html.
It supports a lot of image processing functions.
In your case, it would probably be easiest to convert the image into a color space in which one axis represents intensity (e.g. HSI, HSL, HSV) and then find the indices of the maximum values along that axis (this should return the pixels with the highest intensity in the image).
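For example, a minimal sketch along those lines, assuming the image is loaded with OpenCV (the file path is a placeholder):
import cv2
import numpy as np
img = cv2.imread('jet_image.png')                     # loads as BGR
hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)            # the V channel holds the intensity
v = hsv[:, :, 2]
row, col = np.unravel_index(np.argmax(v), v.shape)    # location of the most intense pixel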
Generally, in Python, we use the PIL library for basic manipulations with images and OpenCV for more advanced ones.
But, if I understand your task correctly, you can just think of an image as a multidimensional array and use numpy to manipulate it.
For example, if your image is stored in a variable of type numpy.array called img, you can find the maximum value along the desired axis just by writing:
img.max(axis=0)
To normalize image you can use:
img /= img.max()
To find which image part is brighter, you can split an img array into desired parts and calculate their mean:
left = img[:, :int(img.shape[1]/2), :]    # left half of the image
right = img[:, int(img.shape[1]/2):, :]   # right half of the image
left_mean = left.mean()                   # average intensity of the left half
right_mean = right.mean()                 # average intensity of the right half
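If you also want to implement the alignment step you describe (making the axis through the two most intense pixels vertical), a rough sketch with numpy and scipy could look like this; img is assumed to be a single-channel intensity array, and the sign of the rotation may need flipping depending on your coordinate convention:
import numpy as np
from scipy import ndimage
idx = np.argsort(img.reshape(-1))[-2:]               # indices of the two most intense pixels
y1, x1 = np.unravel_index(idx[1], img.shape)         # brightest pixel
y2, x2 = np.unravel_index(idx[0], img.shape)         # second brightest pixel
angle = np.degrees(np.arctan2(x2 - x1, y2 - y1))     # angle of their connecting axis vs. vertical
rotated = ndimage.rotate(img, angle, reshape=False)  # rotate so that axis becomes vertical
rotated = rotated / rotated.max()                    # normalize the intensity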
I'm writing a Python program to classify letters and numbers. I've written the classifier and I have the images for my dataset. I really don't have much experience with Python or working with images.
My problem is how to create my dataset from the images I have, i.e. how to build an array with the right shape from them. Should I just create a numpy array of each image, or use a color histogram?
I will probably convert all images to grayscale.
I've found the link below that classifies cats and dogs. It uses two methods to extract image features, but I don't know if this would apply to my case.
k-nn-classifier-for-image-classification
Could anyone guide me on how I could extract the features of my images into a vector, for example, so I can write this data to my "dataset.data" file?
I'll use images like the image below:
Letter "e"
I've even considered resizing the images to 32x32 and creating something like a bitmap of 0s and 1s representing each image.
Thank you.
You would usually want to create a Numpy array to hold all your training data. It is common to arrange it in the following shape:
X_train.shape = (N, img.shape[0], img.shape[1])
where N is the number of images in the set.
This way, if you are using single channel (gray scale), X_train[i,:,:] will hold the values of the i'th image pixels. Note that it's recommended to normalize these values, but this will depend on the model you choose to train.
Here is a quick example of how you can build such an array:
import numpy as np
import cv2
# N: number of images, IMG_SIZE: (height, width), images_path: list of image file paths
X = np.zeros((N, IMG_SIZE[0], IMG_SIZE[1]), dtype=np.float32)
y = np.zeros((N,))
for idx, img_path in enumerate(images_path):
    img = cv2.imread(img_path)
    assert (img.shape[0], img.shape[1]) == IMG_SIZE
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    X[idx, :, :] = gray
    y[idx] = 0  # placeholder: set this image's label here
# if you wish to normalize:
X = (X / 255.0) - 0.5
There are many tutorials for digit classifiers out there, usually using the MNIST dataset as an example. Here is one example but you should go ahead and google it.
If you want to achieve better results, you would probably want to look into neural networks. Again, many tutorials out there, here is one example using tensorflow.
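As a rough illustration (not the linked tutorial), here is a minimal Keras model over the X and y arrays built above, assuming IMG_SIZE is (32, 32) and 36 classes (26 letters + 10 digits):
import tensorflow as tf
model = tf.keras.Sequential([
    tf.keras.layers.Flatten(input_shape=(32, 32)),     # flatten each 32x32 image
    tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.Dense(36, activation='softmax'),   # one output per class
])
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])
model.fit(X, y, epochs=10, validation_split=0.1)       # X, y from the loop above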
I think you might be looking for this:
http://www.scipy-lectures.org/advanced/scikit-learn/
Sklearn is a very easy to learn machine learning package, with lots of tutorials.
Hope it helps,
Just as a project, I wanted to see if it would be possible to predict the next pixel in an image given all previous pixels.
For example: let's say I have an image with x pixels. Given the first y pixels, I want to be able to somewhat accurately predict the (y+1)th pixel. How should I go about solving this problem?
You are looking for some kind of generative model. RNNs are commonly used, and there's a great blog post here demonstrating character-by-character text generation.
The same principle can be applied to any ordered sequence. You talk about an image as being a sequence of pixels, but images have an intrinsic 2D structure (3 if you include color) that would be lost if you took the exact same approach as text generation. A couple of ideas:
Use tensorflow's GridLSTMCells
Treat a column of pixels as a single element of the sequence and predict column-by-column (or row by row); a rough sketch of the row-by-row variant is shown after this list
Combine idea 2 with some 1D convolutions along the column/row
Use features from a section of an image as the seed to a generative adversarial network. See this repository for basic implementations.
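To make the second idea concrete, here is a minimal Keras sketch (the image size and model settings are placeholders): the model reads the rows seen so far and predicts the next row of pixel values.
import numpy as np
import tensorflow as tf
H, W = 28, 28                                          # hypothetical image size
model = tf.keras.Sequential([
    tf.keras.layers.LSTM(128, input_shape=(None, W)),  # a variable-length sequence of rows
    tf.keras.layers.Dense(W, activation='sigmoid'),    # predicted next row, values in [0, 1]
])
model.compile(optimizer='adam', loss='mse')
# Training pairs: for an image img of shape (H, W) scaled to [0, 1],
# the input is img[:k] (the first k rows) and the target is img[k] (row k).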
I have the following image, from which I have calculated the dimensions using a few preprocessing steps like grayscaling, noise removal, dilation, getting corners (cornerHarris), and then using the distance formula to get the dimensions. Complete info regarding it can be seen in the Previous Question. I know that the dimensions are not correct because I need a reference object to transform the size of the box. But you can see that the image is of a box, so logically the lengths 42, 54 and 49 should be the same, but they are not, because the way the image is taken makes the background appear smaller, so a smaller distance is returned. Am I doing it wrong? Or are there any better ideas to get those dimensions so that they are equal? Thanks.
I am having a hard time understanding what scipy.cluster.vq really does!
On Wikipedia it says Clustering can be used to divide a digital image into distinct regions for border detection or object recognition.
On other sites and in books, it says we can use clustering methods for grouping similar images.
As I am interested in image processing, I really need to fully understand what clustering is.
So, can anyone show me simple examples of using scipy.cluster.vq with images?
The kind of clustering performed by scipy.cluster.vq is definitely of the latter (groups of similar images) variety.
The only clustering algorithm implemented in scipy.cluster.vq is the K-Means algorithm, which typically treats input data as points in n-dimensional euclidean space, and attempts to divide that space so that new, incoming data can be summarized by saying "example x is most like centroid y". Centroids can be thought of as prototypical examples of the input data. Vector quantization leads to concise, or compressed representations because, instead of remembering all 100 pixels of each new image we see, we can remember a single integer which points at the prototypical example that the new image is most like.
If you had many small grayscale images:
>>> import numpy as np
>>> images = np.random.random_sample((100,10,10))
So, we've got 100 10x10 pixel images. Let's assume they already all have similar brightness and contrast. The scipy kmeans implementation expects flat vectors:
>>> images = images.reshape((100,100))
>>> images.shape
(100, 100)
Now, let's train the K-Means algorithm so that any new incoming image can be assigned to one of 10 clusters:
>>> from scipy.cluster.vq import kmeans, vq
>>> codebook,distortion = kmeans(images,10)
Finally, let's say we have five new images we'd like to assign to one of the ten clusters:
>>> newimages = np.random.random_sample((5,10,10))
>>> clusters, _ = vq(newimages.reshape((5,100)), codebook)
clusters will contain the integer index of the best matching centroid for each of the five examples.
This is kind of a toy example, and won't yield great results unless the objects of interest in the images you're working with are all centered. Since objects of interest might appear anywhere in larger images, it's typical to learn centroids for smaller image "patches", and then convolve them (compare them at many different locations) with larger images to promote translation-invariance.
The second definition is what clustering actually is: grouping objects that are somewhat similar (and those objects could be images). Clustering is not a pure imaging technique.
When processing a single image, it can for example be applied to colors. This is a quite good approach for reducing the number of colors in an image. If you cluster by colors and pixel coordinates, you can also use it for image segmentation, as it will group pixels that have a similar color and are close to each other. But this is an application domain of clustering, not pure clustering.
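For example, here is a small sketch of color quantization with scipy.cluster.vq; random data stands in for a real photo:
import numpy as np
from scipy.cluster.vq import kmeans, vq
img = np.random.random_sample((120, 160, 3))          # stand-in for a real RGB image in [0, 1]
pixels = img.reshape(-1, 3)                           # one row per pixel
codebook, _ = kmeans(pixels, 8)                       # learn 8 representative colors
labels, _ = vq(pixels, codebook)                      # assign each pixel to its nearest color
quantized = codebook[labels].reshape(img.shape)       # rebuild the image with only 8 colors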