I hope you're all doing well!
I'm new to image manipulation, so I want to apologize in advance for my simple question. I'm currently working on a problem that involves classifying an object called a jet into two known categories. This object is made of sub-objects. My idea is to use these sub-objects to transform each jet into a pixel image, and then apply convolutional neural networks to find the patterns.
Here is an example of the pixel images:
[Image: pixel distribution of a jet's constituents]
To standardize all the images, I want to find the two most intense pixels and make sure the axis connecting them is vertical, with the most intense pixel at the top. It would also be good to impose that one side (left or right) of the image contains the majority of the intensity, and to normalize the total intensity of the image to 1.
My question is: as I'm new to this kind of processing, I don't know if there is a library in Python that can handle these operations. Are you aware of any?
PS: the picture was taken from here: https://arxiv.org/abs/1407.5675
You can look into the OpenCV library for Python:
https://docs.opencv.org/master/d6/d00/tutorial_py_root.html.
It supports a lot of image processing functions.
In your case, it would probably be easier to convert the image into a more suitable color space in which one axis stands for intensity (e.g. HSI, HSL, HSV) and then find the indices of the maximum values along that axis (this should return the pixels with the highest intensity in the image).
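For example (a minimal sketch; the file name is a placeholder), you could convert to HSV and take the argmax of the V channel:

import cv2
import numpy as np

img = cv2.imread("jet.png")                      # placeholder file name
hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)
value = hsv[:, :, 2]                             # V channel = intensity
row, col = np.unravel_index(np.argmax(value), value.shape)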
Generally, in Python, we use the PIL library for basic image manipulations and OpenCV for advanced ones.
But, if I understand your task correctly, you can just think of an image as a multidimensional array and use numpy to manipulate it.
For example, if your image is stored in a numpy.array variable called img, you can find the maximum value along the desired axis just by writing:
img.max(axis=0)
To normalize image you can use:
img /= img.max()
To find which part of the image is brighter, you can split the img array into the desired parts and calculate their means:
left = img[:, :img.shape[1] // 2, :]
right = img[:, img.shape[1] // 2:, :]
left_mean = left.mean()
right_mean = right.mean()
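Putting those pieces together, here is a rough sketch of the whole standardization you describe (assuming img is a single-channel (H, W) float array holding one jet image, and using scipy.ndimage.rotate for the rotation):

import numpy as np
from scipy import ndimage

# locations of the two most intense pixels; (r1, c1) is the brightest
top2 = np.argsort(img, axis=None)[-2:]
rows, cols = np.unravel_index(top2, img.shape)
r2, r1 = rows
c2, c1 = cols

# rotate about the center so the axis joining them is vertical,
# with the brightest pixel ending up above the second brightest
angle = np.degrees(np.arctan2(-(c2 - c1), r2 - r1))
img = ndimage.rotate(img, angle, reshape=False, order=1)

# mirror so the brighter half ends up on the left
half = img.shape[1] // 2
if img[:, half:].sum() > img[:, :half].sum():
    img = img[:, ::-1]

# normalize the total intensity to 1 (use img.max() instead if you want the peak at 1)
img /= img.sum()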
I have a 3D MR image as a NIfTI file (.nii.gz). I also have a 'mask' image as a NIfTI file, which is just a bunch of 0s and 1s. The 1s in this mask image represent the region of the 3D MR image I am interested in.
I want to retrieve the intensities of the pixels in the 3D MRI image which exist in the mask (i.e. are 1s in the mask image file). The only intensity feature I have found is sitk.MinimumMaximumImageFilter, which isn't too useful since it uses the entire image (instead of a particular region), and also only gives the minimum and maximum of said image.
I don't think that the GetPixel() function helps me in this case either, since the 'pixel value' that it outputs is different to the intensity which I observe in the ITK-SNAP viewer. Is this correct?
What tool or feature could I use to help in this scenario?
Use itk::BinaryImageToStatisticsLabelMapFilter.
You might want to use itk::Statistics::MaskedImageToHistogramFilter, followed by min = histogram.Quantile(0, 0.0) and max = histogram.Quantile(0, 1.0). You probably need to use more bins than the example uses.
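If you just need the raw intensities under the mask, another option is to pull both volumes into NumPy arrays and index with the mask (a sketch using SimpleITK; the file names are placeholders). SimpleITK's LabelStatisticsImageFilter can also give per-label summary statistics if that is enough for you.

import SimpleITK as sitk

image = sitk.GetArrayFromImage(sitk.ReadImage("mri.nii.gz"))   # placeholder paths
mask = sitk.GetArrayFromImage(sitk.ReadImage("mask.nii.gz"))

voxels = image[mask == 1]          # intensities inside the masked region
print(voxels.min(), voxels.max(), voxels.mean())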
I have an image that is created by ray casting a bunch of vectors onto a mesh with a UV map (in Blender). There are not enough vectors to completely cover the image, so I'd like a way to fill the rest of the image with the closest non-zero color. I've been looking into some techniques with convolutions in numpy etc., but can't really find what I need. Attached is an example of an image I'm working with - a PNG with RGBA.
[Edited to add]
Possibly a better description of what I am trying to do:
for each pixel that doesn't have a cast color (i.e. black), I need to find the closest pixel with a cast color based on distance, not based on how similar the RGB values are.
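One common way to do this is with scipy.ndimage.distance_transform_edt, which can return, for every pixel, the indices of the nearest "background" pixel. A minimal sketch (the file name is a placeholder, and "uncast" is taken to mean fully transparent; swap in a black test if that fits your data better):

import numpy as np
from PIL import Image
from scipy import ndimage

img = np.array(Image.open("baked.png"))            # RGBA, placeholder file name
empty = img[..., 3] == 0                           # uncast = transparent
# alternatively: empty = (img[..., :3] == 0).all(axis=-1) to treat black as uncast

# for every uncast pixel, get the indices of the nearest cast pixel
idx = ndimage.distance_transform_edt(empty, return_distances=False, return_indices=True)
filled = img[tuple(idx)]
Image.fromarray(filled).save("baked_filled.png")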
I am studying image segmentation and have obtained some simulation results.
I wonder how to generate an image like the attached one.
This figure is by the authors of the paper "Learning Deep Features for Discriminative Localization", whose well-known concept is the class activation map (CAM) (Fig. 1 in the paper).
The images in the first row are the input images and the images in the second row are the output images.
To generate the output, another input, a mask image, might be required.
The value of each pixel of the mask image ranges from 0 to 1.
A pixel whose mask value is 1 will be colored red, and one whose value is 0 will be blue.
I tried to find a method to do that, but I have no idea what keyword I should use.
I also tried to work out how the values of the output images are computed, but that was also unclear.
I thought the output could be generated by simple interpolation.
For example, if the mask value is 1, the output could be an average of the original and [255, original, original] (in RGB representation).
Can this be done simply in Python?
The method may indeed be interpolation, but I cannot find the exact values.
Any link or keyword pointing me to the right method would be greatly appreciated.
It looks like the "heatmap" is just being drawn with fixed transparency on top of the photo.
To answer this, I'd:
create a 2d (x,y) activation map; each value is the "activation" level at that pixel
transform this into a 3D (x, y, RGB) image through a color map; each value is turned into an RGB triple, e.g. 0 = blue, 1 = red in your example. I'd also suggest using a different colormap; the one in the example is particularly bad with respect to perceptual uniformity
use PIL to blend the map on top of the original (a rough sketch follows below)
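A rough sketch of those three steps (the file names are placeholders, and the activation map is assumed to already be a 2D float array in [0, 1]):

import numpy as np
from PIL import Image
from matplotlib import cm

photo = Image.open("input.jpg").convert("RGB")        # placeholder file name
activation = np.load("activation.npy")                # placeholder 2D map in [0, 1]

# colormap: each activation value becomes an RGBA color (viridis is perceptually uniform)
rgba = cm.viridis(activation)
heat = Image.fromarray((rgba[..., :3] * 255).astype(np.uint8)).resize(photo.size)

# fixed transparency on top of the photo
overlay = Image.blend(photo, heat, 0.5)
overlay.save("overlay.png")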
hope that helps!
I have a sample of two-dimensional, black-and-white/binary barcodes that have been photographed.
The (colour) photographs typically suffer from all the usual suspects: blurring, distortion, contrast issues/lighting gradients, and erosion.
I am trying to reconstruct the original barcodes, which were once computer-generated pixel arrays of black/white values.
We should be able to exploit the images' spatial-frequency information to infer the dimensions of each pixel. The hope is to use this to better restore the original by convolving the image with a structuring element defined by the data.
Although this is a very broad topic, I therefore have a very specific question:
What is the best way to establish a structuring element from image data in OpenCV/Python, without using prior knowledge of it?
(Assume for now that the underlying pixel scale is to some good approximation spatially invariant)
Note that I am not trying to execute the whole extraction pipeline: this question is simply about inferring an optimal structuring element from the data.
For example, the spatial kernel could be used as input to an unsharp mask, a la Python unsharp mask
References:
(1-D ideas) http://answers.opencv.org/question/174384/how-to-reconstruct-damaged-barcode, http://www.windytan.com/2016/02/barcode-recovery-using-priori.html
(Similar idea) Finding CheckerBoard Points in opencv for any random ChessBoard( pattern size not known)
(Sort of but not really, and answer-less) OpenCV find image frequencies
(Broad) https://en.wikipedia.org/wiki/Chessboard_detection
One way of doing this is:
Compute the Scharr gradient magnitude representations in both the x and y direction.
Subtract the y-gradient from the x-gradient. By performing this subtraction we are left with regions of the image that have high horizontal gradients and low vertical gradients.
Blur and threshold the image to filter out the noise.
Apply a closing kernel to the thresholded image to close the gaps between vertical stripes of the barcode.
Perform a series of dilations and erosions.
Find the largest contour in the image, which is now presumably the barcode.
More details and complete code can be found in this PyImageSearch blog post.
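A condensed sketch of those steps (adapted from that approach; the file name and parameter values are only illustrative and will need tuning):

import cv2
import numpy as np

img = cv2.imread("barcode.jpg")                                    # placeholder file name
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

# Scharr gradients (ksize=-1 selects the Scharr kernel), then x minus y
grad_x = cv2.Sobel(gray, ddepth=cv2.CV_32F, dx=1, dy=0, ksize=-1)
grad_y = cv2.Sobel(gray, ddepth=cv2.CV_32F, dx=0, dy=1, ksize=-1)
gradient = cv2.convertScaleAbs(cv2.subtract(grad_x, grad_y))

# blur and threshold to filter out noise
blurred = cv2.blur(gradient, (9, 9))
_, thresh = cv2.threshold(blurred, 225, 255, cv2.THRESH_BINARY)

# close the gaps between bars, then erode/dilate away small blobs
kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (21, 7))
closed = cv2.morphologyEx(thresh, cv2.MORPH_CLOSE, kernel)
closed = cv2.erode(closed, None, iterations=4)
closed = cv2.dilate(closed, None, iterations=4)

# the largest contour is presumably the barcode (OpenCV 4.x return signature)
contours, _ = cv2.findContours(closed, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
barcode = max(contours, key=cv2.contourArea)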
I have a series of 2d images of two types, either a star or a pentagon. My aim is to classify all of these images respectively. I have 30 star images and 30 pentagon images. An example of each image is shown side by side here:
Before I apply the KNN classification algorithm, I need to extract a feature vector from all the images. The feature vectors must all be of the same size; however, the 2D images all vary in size. I have read in my image and I get back a 2D array of zeros and ones.
image = pl.imread('imagepath.png')
My question is: how do I process image in order to produce a meaningful feature vector that contains enough information to allow me to do the classification? It has to be a single vector per image, which I will use for training and testing.
If you want to use OpenCV, then:
Resize images to a standard size:
import cv2
import numpy as np
src = cv2.imread("/path.jpg", cv2.IMREAD_GRAYSCALE)  # read as grayscale so the array is 2D
target_size = (64,64)
dst = cv2.resize(src, target_size)
Convert to a 1D vector:
dst = dst.reshape(target_size[0] * target_size[1])
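To then feed these vectors to KNN, a minimal sketch with scikit-learn (assuming it is installed; the helper, file names, and labels below are placeholders):

import cv2
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

def to_vector(path, size=(64, 64)):
    # read as grayscale, resize to a standard size, flatten to a 1D vector
    img = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    return cv2.resize(img, size).reshape(size[0] * size[1])

paths = ["star_01.png", "pentagon_01.png"]          # placeholder file names
labels = ["star", "pentagon"]
X = np.vstack([to_vector(p) for p in paths])

knn = KNeighborsClassifier(n_neighbors=1)           # raise n_neighbors once all 60 images are loaded
knn.fit(X, labels)
print(knn.predict(X[:1]))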
Before you start coding, you have to decide which features are useful for this task:
The easiest way out is trying the approach in @Jordan's answer and converting the entire image to a feature vector. This could work because the classes are simple patterns, and it is interesting if you are using KNN. If this does not work well, the following steps show how you should approach the problem.
The number of black pixels might not help, because the size of the star and pentagon can vary.
The number of sharp corners is very likely to be useful.
The number of straight line segments might be useful, but this could be unreliable because the shapes are hand-drawn.
Supposing you want to have a go at using the number of corners as a feature, you can refer to this page to learn how to extract corners.
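For example, a minimal sketch of a corner-count feature with OpenCV (the parameters are only illustrative and may need tuning for hand-drawn shapes):

import cv2

def corner_count(path, max_corners=50):
    # count strong corners in a grayscale drawing of the shape
    gray = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    corners = cv2.goodFeaturesToTrack(gray, maxCorners=max_corners,
                                      qualityLevel=0.3, minDistance=10)
    return 0 if corners is None else len(corners)

# a star should typically yield more corners than a pentagon
features = [[corner_count(p)] for p in ["star_01.png", "pentagon_01.png"]]  # placeholder names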