Segmenting image by histogram - python

I am trying to classify infected and uninfected red blood cells (RBCs), and I am doing some image preprocessing that might help boost accuracy scores. I am using this preprocessing for XGBoost and SVM.
I am asking for help here because my capstone tutor has not responded for quite some time.
Image segmentation example
I give two examples: the first on the left and the second on the right. My goal is to segment the infected places inside the RBC, i.e. the darker spots.
What I have currently done is:
normalize the image
get the per-channel histogram of the normalized color image
if a channel has one peak, pick the last lightest value of the "hill" base. If a channel has two peaks, pick the value in between the "hills".
with the picked value for each channel, segment the image in the range from (1,1,1) to (red value, green value, blue value)
All of the steps above were done manually and they work (shown in the image link I gave).
I want to do this automatically as I have a huge data set.
My Questions:
How do I get the base values where the peak ends automatically?
Also I am using python.
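A minimal sketch of how the manual steps above might be automated with NumPy only. It assumes each channel is a uint8 array, that value 0 is the black background and can be ignored, and that a smoothed histogram has one or two clear hills. The names `smooth`, `find_peaks_1d`, `channel_threshold` and the parameters `side`, `base_frac`, `min_distance` are made up for illustration. Since "last lightest value of the hill base" can be read either way, which base to use for a single hill is left as the `side` parameter.

```python
import numpy as np


def smooth(counts, window=9):
    """Moving average, so that bin-to-bin noise does not create false peaks."""
    return np.convolve(counts, np.ones(window) / window, mode="same")


def find_peaks_1d(hist, min_height, min_distance):
    """Indices of local maxima, keeping only the tallest within min_distance."""
    candidates = [i for i in range(1, len(hist) - 1)
                  if hist[i] >= min_height
                  and hist[i] > hist[i - 1] and hist[i] >= hist[i + 1]]
    peaks = []
    for i in sorted(candidates, key=lambda j: hist[j], reverse=True):
        if all(abs(i - p) >= min_distance for p in peaks):
            peaks.append(i)
    return sorted(peaks)


def channel_threshold(channel, side="dark", base_frac=0.02, min_distance=30):
    """Pick a threshold for one uint8 channel from its histogram."""
    counts = np.bincount(channel.ravel(), minlength=256).astype(float)
    counts[0] = 0  # assumption: value 0 is the black background, not the cell
    hist = smooth(counts)
    peaks = find_peaks_1d(hist, 0.1 * hist.max(), min_distance)
    if not peaks:
        return None
    if len(peaks) >= 2:
        # Two hills: take the lowest bin between the two tallest peaks.
        a, b = sorted(sorted(peaks, key=lambda p: hist[p])[-2:])
        return int(a + np.argmin(hist[a:b + 1]))
    # One hill: walk away from the peak until the curve drops to its base,
    # defined here (an assumption) as base_frac of the peak height.
    peak, step = peaks[0], (-1 if side == "dark" else 1)
    i = peak
    while 0 < i < 255 and hist[i] >= base_frac * hist[peak]:
        i += step
    return i
```

Calling `channel_threshold` once per color channel would give the (red value, green value, blue value) triple for the range segmentation. The smoothing window and `base_frac` will probably need tuning on a handful of images before running the whole data set.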
EDIT:
Sorry, I forgot to add the images I'm working with.
Here is the data set zip folder I am working with: https://data.lhncbc.nlm.nih.gov/public/Malaria/cell_images.zip
Here are separate images just in case:
infected_img
uninfected_img
infected2_img
infected3_img
uninfected2_img

Related

how to locate and extract coordinates/data/sub-components of charts/map image data?

I'm working on creating a tile server from some raster nautical charts (maps) i've paid for access, and i'm trying to post-process the raw image data that these charts are distributed as, prior to geo-referencing them and slicing them up into tiles
i've got two sets of tasks and would greatly appreciate any help or even sample code on how to get these done in an automated way. i'm no stranger to python/jupyter notebooks but have zero experience with this type of data science for image analysis/processing using things like opencv/machine learning (or a better toolkit/library that i'm not yet aware of).
i have some sample images (originals are PNG but too big to upload so i encoded them in high-quality JPEGs to follow along/provide sample data).. here's what i'm trying to get done:
validation of all image data.. the first chart (as well as the last four) demonstrates what properly formatted chart images should look like (i manually added a few colored rectangles to the first, to highlight different parts of the image in the bonus section below)
some images will have missing tile data, as in the 2nd sample image. these are ALWAYS chunks of 256x256 image data, so it should be straightforward to identify black boxes of this exact size..
some images will have corrupt/misplaced tiles, as in the 3rd image (notice that in the center/upper half of the image there is a large colorful semi-circle/arcs; it is slightly duplicated beneath, and if you look along horizontally you can see the image data is shifted, so these tiles have been corrupted somehow)
extraction of information: ultimately, once all image data is verified to be valid (the above steps are ensured), there are a few bits of data i really need pulled out of the image, the most important of which is
the 4 coordinates (upper left, upper right, lower left, lower right) of the internal chart frame. in the first image they are highlighted in a small pink box at each corner (the other images don't have them but they are located in a similar way) - NOTE, because these are geographic coordinates and involve projections, they are NOT always 100% horizontal/vertical of each other.
the critical bit is that SOME images contain more than one "chartlet". i really need to obtain the above 4 coordinates for EACH chartlet (some charts have no chartlets, some have two to several of them, and they are not always simple rectangular shapes). i may be able to provide the number of chartlets as input if that helps..
if possible, what would also help is extracting each chartlet as a separate image (each of these have a single capital letter, A, B, C in a circle that would be good if it appeared in the filename)
as a bonus, if there was a way to also extract the sections sampled in the first sample image (in the lower left corner), this would probably involve recognizing where/if in the image this appears (would probably only appear once per file but not certain) and then extracting based on its coordinates?
mainly the most important is inside a green box and represents a pair of tables (the left table is an example and i believe would always be the same, and the right has a variable amount of columns)
also the table in the orange box would be good to also get the text from as it's related
as would the small overview map in the blue box, can be left as an image
i have been looking at tutorials on opencv and image recognition processes but the content so far has been highly elementary not to mention an overwhelming endless list of algorithms for different operations (which again i don't know which i'd even need), so i'm not sure how it relates to what i'm trying to do.. really i don't even know where to begin to structure the steps needed for undertaking all these tasks or how each should be broken down further to ease the processing.
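As a possible starting point for the missing-tile check described above, here is a small NumPy sketch. It assumes the chart has already been loaded as a 2-D grayscale array, that the 256x256 tiles are aligned to a grid starting at the top-left pixel, and that a missing tile is uniformly black. The function name `find_black_tiles` and the `max_value` parameter are made up for illustration; for JPEG re-encodes, `max_value` would likely need to be a little above 0 to tolerate compression noise.

```python
import numpy as np


def find_black_tiles(gray, tile=256, max_value=0):
    """Return (row, col) pixel origins of grid-aligned tiles that are all black."""
    h, w = gray.shape
    rows, cols = h // tile, w // tile
    if rows == 0 or cols == 0:
        return []
    # Crop to a whole number of tiles and view as (rows, tile, cols, tile).
    blocks = gray[:rows * tile, :cols * tile].reshape(rows, tile, cols, tile)
    # A tile is "black" if even its brightest pixel is at or below max_value.
    block_max = blocks.max(axis=(1, 3))
    return [(int(r) * tile, int(c) * tile)
            for r, c in np.argwhere(block_max <= max_value)]


# Synthetic example: a grey image with one missing tile.
image = np.full((768, 1024), 128, dtype=np.uint8)
image[256:512, 512:768] = 0
print(find_black_tiles(image))
```

If the tile grid does not start at pixel (0, 0), the same idea still applies after cropping the image by the grid offset. Detecting the shifted/duplicated tiles is a harder problem and is not covered by this sketch.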

Comparing fluorescence intensity of finger print residue after 5 contacts

I have an image as follows that shows the residue of fluorescent powder left on a surface after 5 sequential contacts. Is it possible to quantify a difference in the amount of residue between contacts?
I have looked at Fiji/ImageJ and selected each finger print at a time to get the mean grey value but I don't see much difference to be honest. Any help or thoughts would be very much appreciated. Happy to think about python or matlab.
In order for the quantification of the intensities to be useful, I would imagine you would need to assume your image was evenly lit and that any fluorescence you see isn't a result of oversaturation. That being said, in principle, you could contour the given fingerprint, duplicate it to a new image, and then measure the stack histogram after adjusting the threshold such that regions darker than your fingerprint powder are set to black. Perhaps the change in distribution will illustrate the change that occurs, if any.
Edit:
First: merge the RGB channels by adding them to one another using the channel calculator function. Your signal is the result of multiple colors, so splitting it does not make sense to me here.
The steps would then be:
Duplicate your given print to a new window.
Use "Adjust Threshold" to set a threshold of 0-n, where n is the highest intensity that doesn't include your fingerprint.
Run the command "Edit/Selection/Create Selection."
Run the command "Edit/Clear."
Press "ctrl+H" to measure the histogram of the pixels, and then "List" to get the actual values.
Repeat for each print and plot on the same chart.
When you are already obtaining the actual histogram values, and not just the range from the particle analyzer, then I'm not sure there's much else that I can personally suggest.
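Since Python was mentioned as an option, the ImageJ steps above could be approximated roughly as follows. This is only a sketch: it assumes each fingerprint has already been cropped into its own RGB array, and `background_max` plays the role of n in the "Adjust Threshold" step. Instead of clearing the background to black, it simply leaves those pixels out of the histogram. The function name `residue_stats` is made up, and the area and integrated intensity are extra summaries that the steps above do not mention.

```python
import numpy as np


def residue_stats(rgb, background_max):
    """Summarise the above-threshold signal in one cropped fingerprint."""
    # Merge the channels by adding them, as suggested in the answer.
    merged = rgb.astype(np.int64).sum(axis=2)
    # Keep only pixels brighter than the background cut-off.
    signal = merged[merged > background_max]
    counts, _ = np.histogram(signal, bins=64, range=(0, 3 * 255))
    return {
        "area_px": int(signal.size),        # how many pixels carry powder
        "integrated": int(signal.sum()),    # total signal in those pixels
        "mean": float(signal.mean()) if signal.size else 0.0,
        "histogram": counts,                # distribution to plot per print
    }
```

Running this on each of the five prints with the same `background_max` and plotting the five histograms on one chart corresponds to the last step of the list. The same caveats about even lighting and saturation apply.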

Python - find tube-shape structure, fill holes, and generate binary mask for downstream image processing

First time posting here.
I am working on image processing for a bioengineering project, and I am using Python, mostly the Skimage package, to batch process the images we have taken. There are 6 - 8 tubes in each image, where cells flow through. At each time point, we captured normal bright field images, as well as fluorescence images. I am trying to identify the tubes based on the bright field images, separate the tubes from the background and label them. The masked/labeled images will be used for downstream processing for the fluorescence images, where we identify cells and get their shape metrics.
tl;dr: Python; skimage; image processing; separate tube-like structures from background in a bright field image.
I will use one example to show what I have done. I wanted to include all the intermediate images, but I do not have any reputation points to post more than two images. So I will show the first and last images.
I cropped out the scale bar first and obtained a greyscale image, and here is the resulting image: bf_image
I used sobel_h filter to find the horizontal edges.
bf_sobel = sobel_h(bf_cropped)
io.imshow(bf_sobel)
I then tried all thresholds and picked out a threshold algorithm that looked good to me (otsu)
fig, ax = try_all_threshold(bf_sobel, figsize=(25,20), verbose=False)
plt.show()
bf_threshold = threshold_otsu(bf_sobel)
bf_thresholded = bf_sobel > bf_threshold
io.imshow(bf_thresholded)
I then applied the closing function and remove_small_objects.
bf_closed = closing(bf_thresholded)
bf_small_removed = remove_small_objects(bf_closed, 50)
io.imshow(bf_small_removed)
bf_small_removed
This is where I got stuck. I am trying to fill the gaps between the tube edges, and create masks for individual tubes to separate them from the background. Any advice? Thanks!!!
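One direction that might be worth trying for the gap-filling step, sketched with `scipy.ndimage` rather than skimage. It assumes the tubes run roughly horizontally, so that the two edges of a tube sit one above the other, and that the gap inside a tube is smaller than the gap between neighbouring tubes. The function name `tube_masks` and the parameter `half_height` (about half the largest expected gap between a tube's two edges, in pixels) are made up for illustration.

```python
import numpy as np
from scipy import ndimage


def tube_masks(edge_mask, half_height):
    """Bridge each tube's top and bottom edge, then label the filled tubes."""
    # A vertical line element: closing with it fills vertical gaps of up to
    # 2 * half_height pixels, but leaves larger gaps (between tubes) open.
    structure = np.ones((2 * half_height + 1, 1), dtype=bool)
    closed = ndimage.binary_closing(edge_mask, structure=structure)
    # Fill any enclosed holes that the closing did not reach.
    filled = ndimage.binary_fill_holes(closed)
    # Give each connected region (ideally one per tube) its own integer label.
    labels, count = ndimage.label(filled)
    return labels, count


# Synthetic example: two "tubes", each drawn as a pair of horizontal edges.
edges = np.zeros((50, 60), dtype=bool)
edges[[5, 14, 30, 39], :] = True
labels, count = tube_masks(edges, half_height=5)
print(count)
```

With the real data this would be called on `bf_small_removed`, and `labels == 1`, `labels == 2`, and so on would then be the individual tube masks. If the tubes are closer together than they are tall, this will merge them, and a different approach would be needed.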

Object (simple shapes) Detection in Image

I've got the following image.
Other Samples
I want to detect the six square-shaped green portions and the one circular portion above them. I basically want a binary image with these portions marked 1 (white) and everything else 0 (black).
What have I done so far?
I found a range of H, S, and V within which these colors fall which works fine for a single image, but I've got multiple such images, some under different illumination (brightness) conditions and the ranges do not work in those cases. What should I do to make the thresholding as invariant to brightness as possible? Is there a different approach I should take for thresholding?
What you did was manually analyze the values you need for thresholding for a specific image, and then apply that. What you see is that analysis done on one image doesn't necessarily fit other images.
The solution is to do the analysis automatically for each image. This can be achieved by creating a histogram for each of the channels, and if you're working in HSV, I'm guessing that the H channel would be pretty much useless in this case.
Anyway, once you have the histograms, you should analyze the threshold using something like Lloyd-Max, which is basically a K-Means type clustering of intensities. This should give the centroids for the intensity of the white background, and the other colors. Then you choose the threshold based on the cluster standard deviation.
For example, in the image you gave above, the histogram of the S channel looks like:
You can see the large blob near 0 is the white background that has the lowest saturation.
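To make the clustering idea concrete, here is a rough NumPy sketch of a 1-D Lloyd-Max / k-means step on one channel. It assumes two clusters (background and everything else) and places the threshold a few standard deviations above the low cluster, capped at the midpoint between the centroids. That rule, the function names, and the `n_sigma` parameter are just one illustrative choice, not the only way to use the cluster standard deviation.

```python
import numpy as np


def lloyd_max_1d(values, k=2, iterations=50):
    """1-D k-means: returns centroids, per-cluster std, and labels."""
    values = np.asarray(values, dtype=float).ravel()
    # Start with centroids spread evenly over the value range.
    centroids = np.linspace(values.min(), values.max(), k)
    for _ in range(iterations):
        # Assign every value to its nearest centroid.
        labels = np.argmin(np.abs(values[:, None] - centroids), axis=1)
        # Move each centroid to the mean of its members (keep it if empty).
        updated = np.array([values[labels == j].mean() if np.any(labels == j)
                            else centroids[j] for j in range(k)])
        converged = np.allclose(updated, centroids)
        centroids = updated
        if converged:
            break
    stds = np.array([values[labels == j].std() if np.any(labels == j) else 0.0
                     for j in range(k)])
    return centroids, stds, labels


def low_cluster_threshold(values, n_sigma=3.0):
    """Threshold just above the low-intensity cluster (e.g. low saturation)."""
    centroids, stds, _ = lloyd_max_1d(values, k=2)
    midpoint = centroids.mean()
    return float(min(centroids[0] + n_sigma * stds[0], midpoint))
```

For the S channel of an HSV image, `mask = s_channel > low_cluster_threshold(s_channel)` would then mark everything more saturated than the white background, with the threshold recomputed per image.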

Blobs detection using machine learning?

I have a large stack of images showing a bar with some dark blobs, whose position changes with time (see Figure, b). To detect the blobs I am now using an intensity threshold (c in Figure, where all intensity values below the threshold are set to 1) and then I search for blobs in the binary image using the Matlab code below. As you see the binary image is quite noisy, complicating the blobs detection process. Do you have any suggestion on how to improve the shape detection, maybe including some machine learning algorithm? Thanks!
Code:
se = strel('disk',1);
se_1 = strel('disk',3);
pw2 = imclose(IM,se);
pw3 = imopen(pw2,se_1);
pw4 = imfill(pw3, 'holes');
% Consider only the blobs with more than threshold pixels
[L,num] = bwlabel(pw4);
counts = sum(bsxfun(@eq,L(:),1:num));
number_valid_counts = length(find(counts>threshold));
This might help.
Extract texture features of the boundary of the blobs you want to extract. This can be done using Local binary patterns. There are many other texture features, you can get a detailed survey here.
Then use them to train a binary classifier.
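For reference, a basic 3x3 local binary pattern can be sketched in plain NumPy as below (shown in Python rather than Matlab; scikit-image also provides `skimage.feature.local_binary_pattern` for the general version). The function names here are made up, and the neighbour ordering is one arbitrary but fixed choice. The normalised histogram of codes is a common texture feature vector to feed to a classifier.

```python
import numpy as np


def lbp_3x3(gray):
    """8-neighbour LBP code for every interior pixel of a 2-D image."""
    gray = np.asarray(gray, dtype=np.int32)
    h, w = gray.shape
    center = gray[1:-1, 1:-1]
    # Neighbour offsets (dy, dx), clockwise from the top-left corner.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    codes = np.zeros(center.shape, dtype=np.uint8)
    for bit, (dy, dx) in enumerate(offsets):
        neighbour = gray[1 + dy:h - 1 + dy, 1 + dx:w - 1 + dx]
        # Set this bit wherever the neighbour is at least as bright as the centre.
        codes |= (neighbour >= center).astype(np.uint8) * np.uint8(1 << bit)
    return codes


def lbp_histogram(gray):
    """Normalised 256-bin histogram of LBP codes, usable as a feature vector."""
    codes = lbp_3x3(gray)
    return np.bincount(codes.ravel(), minlength=256) / codes.size
```

Computing `lbp_histogram` on small patches around blob boundaries and on background patches would give the two classes for the binary classifier.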
It seems that the blobs appear as pulses along the lower side of the image. I suggest taking some images and slicing out vertical lines of pixels perpendicular to the pulse direction. Each strip is one pixel wide and slightly taller than the pulse, so that it includes some of the light values above and below it. For example, you might take rows 420-490, saving 70 grey values per strip; these form the feature vector. Also take strips from the non-blob areas as class 2. Do this for several strips per image, across several images.
Now you have your training data, and you can use any machine learning algorithm to train a model to distinguish pulses from non-pulses.
In the test step, scan the image, reading 70 pixels vertically each time, and test them against the trained model. Create a new black image; if a strip belongs to the "blob" class, draw a white vertical line starting from a little below the tested pixel, otherwise draw nothing on the output image.
At the end of the scan, check for isolated white lines, which you can delete as false acceptances. If you find a dark line within a group of white lines, convert it to white, treating it as a false rejection.
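The strip extraction and the final clean-up could look roughly like this in NumPy (sketched in Python rather than Matlab). The rows 420 to 490 are the example values from above, the function names are made up, and the classifier in between is left out because any model can be used.

```python
import numpy as np


def column_features(gray, top=420, bottom=490):
    """One feature vector per image column: grey values in rows top..bottom-1."""
    return gray[top:bottom, :].T.astype(float)  # shape (width, bottom - top)


def clean_column_labels(is_blob):
    """Flip any column whose two neighbours both disagree with it."""
    is_blob = np.asarray(is_blob, dtype=bool)
    cleaned = is_blob.copy()
    left, mid, right = is_blob[:-2], is_blob[1:-1], is_blob[2:]
    # Isolated white line between dark ones, or dark line between white ones.
    isolated = (left == right) & (mid != left)
    cleaned[1:-1][isolated] = left[isolated]
    return cleaned


# Per-column predictions (1 = blob) before and after the clean-up.
predicted = [0, 1, 0, 0, 1, 1, 0, 1, 1]
print(clean_column_labels(predicted).astype(int))
```

Each row of `column_features(image)` is one 70-value sample; after classification, the cleaned labels say which columns get a white vertical line in the output image.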
You may use my classifier: https://www.researchgate.net/publication/265168466_Solving_the_Problem_of_the_K_Parameter_in_the_KNN_Classifier_Using_an_Ensemble_Learning_Approach
If you decide to, I will send you code to do it. The distance metric is a problem: because the values vary between 0 and 255, the light values will dominate the distance. To solve this problem you may use the Hassanat distance metric at: https://www.researchgate.net/publication/264995324_Dimensionality_Invariant_Similarity_Measure
because it is invariant to scale in the data: each feature outputs a value between 0 and 1 and no more, so the highest values will not dominate the final distance.
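A small sketch of that distance in NumPy, written from my reading of the definition in the linked paper, so it is worth checking against the paper before relying on it. Each feature contributes 1 - (1 + min) / (1 + max), with both values shifted by |min| when the minimum is negative, which keeps every contribution between 0 and 1. The function name is made up.

```python
import numpy as np


def hassanat_distance(a, b):
    """Sum of per-feature bounded distances between two vectors."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    lo = np.minimum(a, b)
    hi = np.maximum(a, b)
    # Shift by |min| only where the minimum is negative (never for grey values).
    shift = np.where(lo < 0, -lo, 0.0)
    per_feature = 1.0 - (1.0 + lo + shift) / (1.0 + hi + shift)
    return float(per_feature.sum())
```

For two 70-value strips of grey levels, the result lies between 0 and 70, and a single very bright pixel can contribute at most 1 to it.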
Good luck
