I've been trying to implement local ridge orientation estimation for fingerprints in Python. I've used the gradient method, with the Sobel operator to get the gradients I need. However, it turned out that this method has quite a lot of flaws, especially around 90 degrees. I could include the code that I've written so far, but as it does not work as I want, I don't know if it's needed. I've also looked at the line segment method; however, I'm working with latent fingerprints, so it is hard to know whether one should look for the maximum of black or of white in the line segments. I've also tried to implement an algorithm to detect the area of maximum concentration of continuous lines, but I couldn't get this to work. Any suggestions for other algorithms to use?
EDIT:
I'm using a function to apply my function to blocks, but that is hardly relevant
def lro(im_np):
    orientsmoothsigma = 3
    Gxx = cv2.Sobel(im_np,-1,2,0)
    Gxy = cv2.Sobel(im_np,-1,1,1)
    Gyy = cv2.Sobel(im_np,-1,0,2)
    Gxx = scipy.ndimage.filters.gaussian_filter(Gxx, orientsmoothsigma)
    Gxy = numpy.multiply(scipy.ndimage.filters.gaussian_filter(Gxy, orientsmoothsigma), 2.0)
    Gyy = scipy.ndimage.filters.gaussian_filter(Gyy, orientsmoothsigma)
    denom = numpy.sqrt(numpy.add(numpy.power(Gxy,2), (numpy.power(numpy.subtract(Gxx,Gyy),2))))# + eps;
    sin2theta = numpy.divide(Gxy,denom) # Sine and cosine of doubled angles
    cos2theta = numpy.divide(numpy.subtract(Gxx,Gyy),denom)
    sze = math.floor(6*orientsmoothsigma);
    if not sze%2: sze = sze+1
    cos2theta = scipy.ndimage.filters.gaussian_filter(cos2theta, orientsmoothsigma) # Smoothed sine and cosine of
    sin2theta = scipy.ndimage.filters.gaussian_filter(sin2theta, orientsmoothsigma)#filter2(f, sin2theta); # doubled angles
    orientim = math.pi/2. + numpy.divide(numpy.arctan2(sin2theta,cos2theta),2.)
    return orientim
I worked on this a long time ago and wrote a paper on it. As I remember, you should look for both black and white ridges (invert the image and repeat the analysis) to give more results. I do remember some sensitivity at some angles. You probably need something with more extent than a pure Sobel. Try to reach out over as many pixels as practical.
You may want to have a look at the work of Raymond Thai (Fingerprint Image Enhancement and Minutiae Extraction), if you haven't already.
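For reference, here is a minimal sketch of the gradient method as it is usually described: take first derivatives, form the products Gx*Gx, Gx*Gy and Gy*Gy, smooth those, and recover the angle from the doubled-angle form. The code in the question passes derivative orders (2,0), (1,1) and (0,2) to cv2.Sobel, which gives second derivatives rather than these products, and, as far as I know, ddepth=-1 on an 8-bit image clips negative derivative values. The function name ridge_orientation and the two sigma parameters are my own choices, and Gaussian derivative filters from scipy.ndimage stand in for Sobel to get more spatial extent; treat this as a starting point, not a tested fingerprint pipeline.

```python
import numpy as np
from scipy import ndimage as ndi


def ridge_orientation(img, grad_sigma=1.0, block_sigma=3.0):
    """Estimate local ridge orientation in radians, folded into [0, pi).

    Angles are in image coordinates: x along columns, y along rows.
    grad_sigma and block_sigma are illustrative defaults, not tuned values.
    """
    img = np.asarray(img, dtype=np.float64)  # float keeps negative gradients
    # First derivatives via Gaussian derivative filters (axis 1 = x, axis 0 = y).
    gx = ndi.gaussian_filter(img, grad_sigma, order=(0, 1))
    gy = ndi.gaussian_filter(img, grad_sigma, order=(1, 0))
    # Products of first derivatives, averaged over a local neighbourhood.
    gxx = ndi.gaussian_filter(gx * gx, block_sigma)
    gxy = ndi.gaussian_filter(gx * gy, block_sigma)
    gyy = ndi.gaussian_filter(gy * gy, block_sigma)
    # Doubled-angle form: dominant gradient direction, ambiguous modulo pi.
    grad_angle = 0.5 * np.arctan2(2.0 * gxy, gxx - gyy)
    # Ridges run perpendicular to the dominant gradient direction.
    return np.mod(grad_angle + np.pi / 2.0, np.pi)
```

Because orientation is only defined modulo pi, values just below pi and values near 0 describe almost the same direction; compare orientations through sin and cos of the doubled angle rather than by subtracting raw angles.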
I'm starting a research project at my university on enabling the use of AI to calculate a region of the retina. The first part we defined is to segment two important parts of the retina using U-Net. The second is to use the segmentation result to find the important points and perform a calculation.
In the image below, I show the segmentation output for each region, produced with U-Net (the red annotation isn't part of the segmentation). I tried to mark the regions that I want to find in the first and second blocks. Once I have found them, I could calculate the distance between these points after merging the two outputs.
So, my question is: what kind of technique can I use to read the pixels in order to find the coordinates that I marked?
Is OpenCV a library that could help me? It's the first time that I've handled this kind of problem, so thanks for any suggestion or guidance.
Using OpenCV:
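Detecting the margins can be done with the connectedComponentsWithStats() method. Label 0 is the background, so the two blobs are labels 1 and 2. There is no CC_STAT_RIGHT constant, so the right edge is computed as left + width:
import cv2
connectivity = 8
output = cv2.connectedComponentsWithStats(binary_img, connectivity, cv2.CV_32S)
stats = output[2] # stat matrix, one row per label (row 0 is the background)
first_blob_left = stats[1, cv2.CC_STAT_LEFT]
first_blob_right = first_blob_left + stats[1, cv2.CC_STAT_WIDTH] # one past the last column
second_blob_left = stats[2, cv2.CC_STAT_LEFT]
second_blob_right = second_blob_left + stats[2, cv2.CC_STAT_WIDTH] # one past the last column
if first_blob_left < second_blob_left:
    dist = second_blob_left - first_blob_right
else:
    dist = first_blob_left - second_blob_right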
Detecting the deepest point can be done with the same method:
connectivity = 8
output = cv2.connectedComponentsWithStats(binary_img, connectivity, cv2.CV_32S)
stats = output[2] # stat matrix
blob_top = stats[1, cv2.CC_STAT_TOP] # label 1 is the first blob; label 0 is the background
blob_height = stats[1, cv2.CC_STAT_HEIGHT]
deepest_point_y_position = blob_top + blob_height - 1 # last row that belongs to the blob
Note: This code hasn't been tested and may contain some typos. Still, the idea stays the same, and it should work without much effort.
Take a look at
labels = output[1]
"labels" is an array the size of the input image, where each pixel of a blob is labelled with the same value. This should help you to find the coordinates of the margins.
I have a 3D numpy boolean array mask that has been segmented from an MRI brain volume.
Brain voxels = True. Everything else = False.
What I would like to do is to enlarge this mask such that it would encompass the surrounding tissues in the MRI volume, not just the segmented organ, perhaps a 10mm rind of non-brain all around the brain.
I tried using a 2D dilation using the skimage.morphology.dilation with a diamond filter. While this is nice and fast for a single image, I need to repeat this in multiple slices through the volume and in at least 2 planes to come even close to uniformly dilating the 3D mask.
I largely took my code from here: https://scipy-lectures.org/packages/scikit-image/index.html
typical volume shape = 512, 512, 270
# 1st pass in axial plane
(x, y, z) = np.shape(3dMask)
for slice_number in range(z):
    image_slice = 3dMask[:, :, slice_number]
    3dMask[:, :, slice_number] = morphology.binary_dilation(image_slice, morphology.diamond(30))
# repeat in coronal plane...
This works very nicely with the desired effect in each slice, but is very slow for 3D.
I can speed things up by only dilating those slices containing at least one 'True', but that inevitably leaves 100+ slices in each plane. Still slow.
In the hope that the python side looping is slowing everything down, I have looked for a 3D equivalent single function in numpy and skimage but have found nothing that I can recognise as useful.
I toyed with the idea of finding the geometric centre and simply zooming the volume by 5%, but there will necessarily be holes in the mask (the space in-between the 2 halves of the brain) which will no longer match up with the MRI volume and so is of no use...
I assume this means that I am doing it wrong as I am new to both numpy and skimage.
Is there a fast way to do this? Perhaps a 3D alternative to the 2D skimage dilation?
This question actually has a bit of subtlety, which I'll try to unpack.
The first thing to note is that most scikit-image functions actually work totally fine in 3D, including binary_dilation! So you should in an ideal world be able to do:
dilated = morphology.binary_dilation(
mask3d, morphology.ball(radius=30)
)
I say in an ideal world because that crashes on my machine, probably because this longstanding SciPy bug prevents SciPy filters (which scikit-image uses under the hood) from working with large neighbourhood sizes.
For square- and diamond-shaped neighbourhoods, though, you do have a workaround: dilating once with a diamond of radius 30 is actually the same as dilating 30 times with a diamond of radius 1! You can do this manually in a for-loop, or you can use scipy.ndimage.binary_dilation using the iterations keyword argument. (See this issue for some discussion around this.)
from scipy import ndimage as ndi
# make a little 3D diamond:
diamond = ndi.generate_binary_structure(rank=3, connectivity=1)
# dilate 30x with it
dilated = ndi.binary_dilation(mask3d, diamond, iterations=30)
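The equivalence claimed above is easy to check on a small volume. The sketch below is an illustration only: l1_ball is a hypothetical helper I wrote to build a 3D diamond of a given radius with plain NumPy, and the tiny test mask is made up. The radius is kept small so that the single large-neighbourhood dilation stays cheap.

```python
import numpy as np
from scipy import ndimage as ndi


def l1_ball(radius):
    """3D diamond: voxels within L1 (city-block) distance `radius` of the centre."""
    grid = np.indices((2 * radius + 1,) * 3) - radius
    return np.abs(grid).sum(axis=0) <= radius


# Small made-up mask with two seed voxels, far enough from the borders.
mask = np.zeros((21, 21, 21), dtype=bool)
mask[10, 10, 10] = True
mask[8, 12, 9] = True

# Radius-1 diamond (centre voxel plus its 6 face neighbours).
unit = ndi.generate_binary_structure(rank=3, connectivity=1)

iterated = ndi.binary_dilation(mask, unit, iterations=4)  # 4 small steps
single = ndi.binary_dilation(mask, l1_ball(4))            # 1 big step
same = np.array_equal(iterated, single)
```

This identity holds for the diamond (and for cubes) but not for a Euclidean ball, so the iterated version gives a diamond-shaped rind rather than a perfectly round one.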
You can actually get pretty far with this strategy. For example, if your dataset doesn't have the same resolution in x, y, and z, maybe you want to dilate more, say twice as much, along x and y. You can do this in two steps:
dilated1 = ndi.binary_dilation(mask3d, diamond, iterations=15)
flat = np.copy(diamond)
flat[:, :, 0] = 0
flat[:, :, -1] = 0
dilated2 = ndi.binary_dilation(dilated1, flat, iterations=15)  # continue from dilated1, growing along x and y only
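Finally, note that binary dilation is equivalent to a (nonbinary) convolution followed by thresholding above 0. So I found that this also works (the threshold is set to 0.5 rather than exactly 0 because the FFT leaves small round-off values where the exact result would be 0):
from scipy import signal
b = morphology.ball(radius=30)
dilated = signal.fftconvolve(mask3d, b, mode='same') > 0.5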
However, for this image size and on my machine, this was slower than the iterated dilation. But, it's worth keeping in mind because the performance will be different for different datasets.
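Since the question asks for a rind of a given physical thickness (about 10mm), another option worth knowing about is a Euclidean distance transform. This is a different technique from the dilations above: compute, for every voxel, the distance to the nearest brain voxel, then keep everything within the margin. The sketch below is only a sketch; dilate_mm is a hypothetical helper name, and the voxel spacing has to come from your image header (the values in the usage lines are made up).

```python
import numpy as np
from scipy import ndimage as ndi


def dilate_mm(mask, margin_mm, spacing_mm):
    """Grow a boolean mask by `margin_mm` in physical units.

    spacing_mm is the voxel size along each axis, e.g. (0.5, 0.5, 1.0);
    this is an assumption about your data, so read it from the image header.
    """
    mask = np.asarray(mask, dtype=bool)
    # Distance (in mm) from every non-mask voxel to the nearest mask voxel;
    # voxels inside the mask get distance 0.
    dist = ndi.distance_transform_edt(~mask, sampling=spacing_mm)
    return dist <= margin_mm


# Made-up example: one seed voxel, anisotropic voxels twice as long in z.
seed = np.zeros((21, 21, 21), dtype=bool)
seed[10, 10, 10] = True
grown = dilate_mm(seed, margin_mm=4.0, spacing_mm=(1.0, 1.0, 2.0))
```

The result is a round (Euclidean) margin that respects anisotropic voxels, at the cost of a floating-point array the size of the volume, so it is worth timing against the iterated dilation on your own data.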
As a side note, I recommend posting complete, working code in your StackOverflow questions, as explained here. In your case, np.shape(3dMask) is a syntax error since 3dMask is not a valid Python identifier! =)
I hope this helps!
I have an image from an electron micrograph depicting dense and rare layers in a biological system, as shown below.
The layers in question are in the middle of the image, starting just near the label "re" and tapering up to the left. I would like to:
1) count the total number of dark/dense and light/rare layers
2) measure the width of each layer, given that the black scale bar in the bottom right is 1 micron long
I've been trying to do this in Python. If I crop the image beforehand so as to only contain parts of a few layers, such as the 3 dark and 3 light layers shown here:
I am able to count the number of layers using the code:
import numpy as np
import matplotlib.pyplot as plt
from scipy import ndimage
from PIL import Image
tap = Image.open("VDtap.png").convert('L')
tap_a = np.array(tap)
tap_g = ndimage.gaussian_filter(tap_a, 1)
tap_norm = (tap_g - tap_g.min())/(float(tap_g.max()) - tap_g.min())
tap_norm[tap_norm < 0.5] = 0
tap_norm[tap_norm >= 0.5] = 1
result = 255 - (tap_norm * 255).astype(np.uint8)
tap_labeled, count = ndimage.label(result)
plt.imshow(tap_labeled)
plt.show()
However, I'm not sure how to incorporate the scale bar and measure the widths of these layers that I have counted. Even worse, when analyzing the entire image so as to include the scale bar I am having trouble even distinguishing the layers from everything else that is going on in the image.
I would really appreciate any insight in tackling this problem. Thanks in advance.
EDIT 1:
I've made a bit of progress on this problem so far. If I crop the image beforehand so as to contain just a bit of the layers, I've been able to use the following code to get at the thicknesses of each layer.
import numpy as np
import matplotlib.pyplot as plt
from scipy import ndimage
from PIL import Image
from skimage.measure import regionprops
tap = Image.open("VDtap.png").convert('L')
tap_a = np.array(tap)
tap_g = ndimage.gaussian_filter(tap_a, 1)
tap_norm = (tap_g - tap_g.min())/(float(tap_g.max()) - tap_g.min())
tap_norm[tap_norm < 0.5] = 0
tap_norm[tap_norm >= 0.5] = 1
result = 255 - (tap_norm * 255).astype(np.uint8)
tap_labeled, count = ndimage.label(result)
props = regionprops(tap_labeled)
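ds = np.array([])
for i in range(len(props)):
    if i == 0:
        # light layer between the left image edge and the first dark layer
        ds = np.append(ds, props[i].bbox[1] - 0)
    else:
        # light layer between the previous dark layer and this one
        ds = np.append(ds, props[i].bbox[1] - props[i-1].bbox[3])
    # the dark layer itself
    ds = np.append(ds, props[i].bbox[3] - props[i].bbox[1])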
Essentially, I discovered the Python module skimage, which can take a labeled image array and return the four coordinates of a bounding box for each labeled object; the [1] and [3] positions give the x coordinates of the bounding box, so their difference yields the extent of each layer in the x-dimension. Also, the first part of the for loop (the if-else condition) is used to get the light/rare layers that precede each dark/dense layer, since only the dark layers get labeled by ndimage.label.
Unfortunately this is still not ideal. Firstly, I would like to not have to crop the image beforehand, as I intend to repeat this procedure for many such images. I've considered that perhaps the (rough) periodicity of the layers could be highlighted using some sort of filter, but I'm not sure if such a filter exists? Secondly, the code above really only gives me the relative width of each layer - I still haven't figured out a way to incorporate the scale bar so as to get the actual widths.
I don't want to be a party-pooper, but I think your problem is harder than you first thought. I can't post a working code snippet because there are so many parts of your post that require in-depth attention. I have worked in several bio/med labs, and this work is usually done with a human tagging specific image points and a computer calculating distances. That being said, one should probably try to automate =D.
To you, the problem is a simple yet tedious job of getting out a ruler and making a few hundred measurements. Perfect for a computer, right? Well, yes and no. The computer has no idea how to identify any of the bands in the picture and has to be told exactly what it's looking for, and that will be tricky.
Identifying the scale bar
What do you know about the scale bars in all your images? Are they always the same number of pixels vertically and horizontally, and are they always solid black? Is there always just one bar (what about the solid line for the letter r)? My suggestion is to try a wavelet transform. Imagine the 2D analog of the function
(probably helps to draw this function)
f(x) =
0 if |x| > 1,
1 if |x| <1 && |x| > 0.5
-1 if |x| < 0.5
Then when our wavelet f(x, y) is convolved over the image, the output image will have high values only when it finds the black scale bar. Also the length that I set to 1 can also be tuned for wavelets and that will help you find the scale bar too.
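To make the idea concrete, here is a one-dimensional sketch of that box wavelet applied to a single row of intensities. Everything in it is an illustration under assumptions: box_wavelet and find_dark_bar are names I made up, the profile is synthetic, and a real image would need the 2D version plus some care about other dark objects. The wavelet is sampled so that its central -1 region is inner_px samples wide, several widths are tried, and the response is divided by the width so that different widths can be compared.

```python
import numpy as np


def box_wavelet(inner_px):
    """Sample f on [-1, 1]: -1 where |x| < 0.5, +1 where 0.5 < |x| < 1.

    The central -1 region is `inner_px` samples wide (use even values so the
    kernel sums to zero).
    """
    x = (np.arange(2 * inner_px) + 0.5) / inner_px - 1.0
    return np.where(np.abs(x) < 0.5, -1.0, 1.0)


def find_dark_bar(profile, widths):
    """Return (centre_index, width_px) of the best-matching dark bar."""
    best_score, best_centre, best_width = -np.inf, None, None
    for w in widths:
        # The kernel is symmetric, so convolution equals correlation here.
        response = np.convolve(profile, box_wavelet(w), mode="same")
        score = response.max() / w  # normalise so widths are comparable
        if score > best_score:
            best_score, best_centre, best_width = score, int(response.argmax()), w
    return best_centre, best_width


# Synthetic row of pixels: bright background with a 20-pixel dark bar.
profile = np.ones(200)
profile[90:110] = 0.0
centre, bar_px = find_dark_bar(profile, widths=range(10, 42, 2))
microns_per_pixel = 1.0 / bar_px  # the question states the bar is 1 micron long
```

Once the bar length in pixels is known, any width measured in pixels can be multiplied by microns_per_pixel to get a physical width.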
Finding the ridges
I'd solve the above problem first because it seems easier and sets you up for this one. I'd construct another wavelet for this one, but just as a preprocessing step. For this wavelet I'd try a 2D zero-sum box function again, but this time try to match three (or more) boxes next to each other. In addition to the height and width parameters for the box, we need spacing and tilt angle parameters. You probably don't have to get very close to the actual value, just close enough that the rest of the image blackens out.
Measuring the ridges
There are lots and lots of ways to do this, but let's use our previous step for simplicity. Take your 3 box wavelet answer and it should be centered at the middle ridge and report a box "width" that is the average width of those three ridges it has captured. Probably close enough considering how slowly the widths are changing!
Good hunting!
After completing several chapters in computer vision books, I decided to apply those methods to create a primitive bot for a game. I chose Fling, which has almost no dynamics, so all I needed to do was find the balls. Balls may have 5 different colors, and they can face any of 4 directions (depending on the location of their eyes). I cropped each block in the field so that I can just check whether each block contains a ball or not. My problem is that I'm not able to find the balls correctly.
My first attempt was the following. I sum the RGB colors for each ball and get an [R, G, B] array. Then I sum the RGB colors for each block in the field. If a block's array is similar to a ball's [R, G, B] array, I assume that this block contains a ball.
The problem is that it's hard to find a good value for 'similarity'. Even different empty blocks vary significantly in such sums.
Second, I tried the OpenCV module, which has a matchTemplate function. This function matches an image against another source image and, together with the minMaxLoc function, returns a score maxVal and its location maxLoc. If maxVal is close to 1 (with a normalised matching method), then the image is probably in the source image. I made all possible variations of balls (20 overall) and matched them against the entire field. This worked well, but unfortunately it sometimes misses some balls in the field or assigns two different types of balls (say green and yellow) to one ball. I tried to improve the process by matching balls not against the entire field but against each block. This has the advantage that it checks each block and should detect the correct number of balls in the field, whereas matching against the entire field only gives one location for each color of ball: if there are two balls of the same color, matchTemplate loses information about the 2nd ball. Surprisingly, it still has false negatives/positives.
There is probably a much easier way to solve this problem (maybe a library that I don't know yet), but for now I can't find one. Any suggestions are welcome.
The balls seem pretty distinct in terms of colour. The problems you initially described seem to be related to some of the finer, random detail present in the image - especially in the background and in the different shading/poses of the ball.
On this basis, I would say you could simplify the task significantly by applying a set of pre-processing steps to "collapse" the range of colours in the image.
There are any number of more principled ways of achieving accurate colour segmentation (which is what, more formally, you want to achieve) - but taking a more pragmatic view, here are a few quick'n'dirty hacks.
So, for example, we can initially smooth the image to reduce higher frequency components...
Then, convert to a normalised RGB representation...
Before, finally posterizing it with the mean shift filtering step...
Here is the code in Python, using the OpenCV bindings, that does all this in order:
import cv
# get original image
orig = cv.LoadImage('fling.png')
# show original
cv.ShowImage("orig", orig)
# blur a bit to remove higher frequency variation
cv.Smooth(orig,orig,cv.CV_GAUSSIAN,5,5)
# normalise RGB
norm = cv.CreateImage(cv.GetSize(orig), 8, 3)
red = cv.CreateImage(cv.GetSize(orig), 8, 1)
grn = cv.CreateImage(cv.GetSize(orig), 8, 1)
blu = cv.CreateImage(cv.GetSize(orig), 8, 1)
total = cv.CreateImage(cv.GetSize(orig), 8, 1)
cv.Split(orig,red,grn,blu,None)
cv.Add(red,grn,total)
cv.Add(blu,total,total)
cv.Div(red,total,red,255.0)
cv.Div(grn,total,grn,255.0)
cv.Div(blu,total,blu,255.0)
cv.Merge(red,grn,blu,None,norm)
cv.ShowImage("norm", norm)
# posterize simply with mean shift filtering
post = cv.CreateImage(cv.GetSize(orig), 8, 3)
cv.PyrMeanShiftFiltering(norm,post,20,30)
cv.ShowImage("post", post)
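For readers without the legacy cv bindings, the "normalise RGB" step can be sketched in plain NumPy. This is my own illustration rather than a translation of the listing above: normalise_rgb is a made-up name, and the sum R+G+B is computed in floating point because it can reach 765 and so does not fit in 8 bits. Two pixels with the same hue but different brightness should map to the same output colour, which is the point of the step.

```python
import numpy as np


def normalise_rgb(image):
    """Divide each channel by R+G+B so that overall brightness is factored out.

    `image` is an (H, W, 3) array; the result is rescaled to 0..255 as uint8.
    """
    rgb = np.asarray(image, dtype=np.float64)
    total = rgb.sum(axis=2, keepdims=True)
    total[total == 0] = 1.0  # leave pure black pixels black instead of dividing by 0
    return np.round(255.0 * rgb / total).astype(np.uint8)


# Made-up pixels: a bright red-ish pixel, a darker one of the same hue, and black.
pixels = np.array([[[200, 100, 100], [100, 50, 50], [0, 0, 0]]], dtype=np.uint8)
normalised = normalise_rgb(pixels)
```

The smoothing and mean shift steps are not covered here; they would still come from an image processing library.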
Your task is simpler in several respects than the ones the general computer vision algorithms you'll find were designed for: you know exactly what to look for and you know exactly where to look for it. As such I think involving an external library is an unnecessary complication, unless you're already familiar with it and can use it effectively as a tool to solve your own problem. In this post I will only use PIL.
First, split the task into two simpler tasks:
Given a tile, determine whether there's a ball there.
Given a tile where we're pretty sure that there's a ball, identify the colour of the ball.
The second task should be simple and I won't spend time on it here. Basically, sample some pixels where the ball's main colour will be visible and compare the colours you find to the known ball colours.
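As a rough illustration of that second task, the comparison can be as simple as picking the known colour with the smallest squared distance in RGB. The colour table below is entirely hypothetical; you would fill it with values sampled from real screenshots, and classify_colour is just a name chosen for this sketch.

```python
# Hypothetical reference colours; replace them with values sampled from the game.
KNOWN_BALL_COLOURS = {
    "red": (200, 40, 40),
    "green": (60, 180, 60),
    "blue": (50, 80, 200),
    "yellow": (220, 200, 40),
    "purple": (150, 60, 180),
}


def classify_colour(sample, known_colours):
    """Return the name of the known colour closest to `sample` in RGB."""
    def squared_distance(a, b):
        return sum((p - q) ** 2 for p, q in zip(a, b))

    return min(known_colours,
               key=lambda name: squared_distance(sample, known_colours[name]))


# A sampled pixel (made up) that is close to the reference red.
guess = classify_colour((190, 60, 50), KNOWN_BALL_COLOURS)
```

Averaging a handful of sampled pixels before classifying should make this less sensitive to shading and to the eyes of the ball.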
So let's look at the first task.
First off, note that the balls don't extend to the edge of the tiles. Thus you can find a fairly representative sample of the background of a tile, whether or not there's a ball there, by sampling the pixels along the edge of the tile.
A simple way to proceed is to compare every pixel in a tile with this sample of the tile background, and to obtain some sort of measure of whether it's generally similar (no ball) or dissimilar (ball).
The following is one way to do this. The basic approach used here is to calculate the mean and the standard deviation of the background pixels -- separately for the red, green, and blue channels. For every pixel, we then calculate the number of standard deviations we are from the mean in every channel. We take this value for the most dissimilar channel as our measure of dissimilarity.
from PIL import Image
import math
def fetch_pixels(col, row):
    # Crop one 32x32 tile out of the screenshot and return a pixel accessor.
    img = Image.open("image.png").convert("RGB")
    img = img.crop((col*32, row*32, (col+1)*32, (row+1)*32))
    return img.load()
def border_pixels(a):
    # Pixels along the four edges of the tile: a sample of the background.
    rv = [a[x, y] for x in range(32) for y in (0, 31)]
    rv.extend([a[x, y] for x in (0, 31) for y in range(1, 31)])
    return rv
def mean_and_stddev(xs):
    mean = float(sum(xs)) / len(xs)
    dev = math.sqrt(float(sum([(x - mean)**2 for x in xs])) / len(xs))
    return mean, dev
def calculate_deviations(cols=7, rows=8):
    outimg = Image.new("L", (cols*32, rows*32))
    pixels = outimg.load()
    for col in range(cols):
        for row in range(rows):
            rv = calculate_deviations_for(col, row, pixels)
            print(rv)
    outimg.save("image_output.png")
def calculate_deviations_for(col, row, opixels):
    a = fetch_pixels(col, row)
    border = border_pixels(a)
    # Background mean and standard deviation, separately per channel.
    bru, brd = mean_and_stddev([p[0] for p in border])
    bgu, bgd = mean_and_stddev([p[1] for p in border])
    bbu, bbd = mean_and_stddev([p[2] for p in border])
    eps = 1e-6  # guards against a perfectly uniform border (zero deviation)
    rv = []
    for y in range(32):
        for x in range(32):
            r, g, b = a[x, y]
            dr = (bru - r) / max(brd, eps)
            dg = (bgu - g) / max(bgd, eps)
            db = (bbu - b) / max(bbd, eps)
            # Dissimilarity: deviations from the mean in the worst channel.
            t = max(abs(dr), abs(dg), abs(db))
            opixel = 0
            limit, span = 2.5, 8.0
            if t > limit:
                v = min(1.0, (t - limit) / span)
                print(t, v)
                opixel = 127 + int(128 * v)
            opixels[col*32+x, row*32+y] = opixel
            rv.append(t)
    return sum(rv) / float(len(rv))
A visualization of the result is here:
Note that most of the non-ball pixels are pure black. It should now be possible to determine whether a ball is present or not by simply counting the black pixels. (Or more reliably: count the size of the largest single blob of non-black pixels.)
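The "largest single blob" test can be sketched with scipy.ndimage.label, which is an extra dependency beyond PIL. The function names and the min_pixels threshold below are assumptions for illustration; the threshold in particular would have to be found by experiment on real tiles. The input is one 32x32 tile of the greyscale output image, as a 2D array in which background pixels are 0.

```python
import numpy as np
from scipy import ndimage as ndi


def largest_blob_size(tile):
    """Size in pixels of the largest connected group of non-black pixels."""
    labels, count = ndi.label(np.asarray(tile) > 0)  # 4-connected by default
    if count == 0:
        return 0
    sizes = np.bincount(labels.ravel())[1:]  # drop label 0, the background
    return int(sizes.max())


def has_ball(tile, min_pixels=100):
    """Decide "ball" or "no ball"; min_pixels is a hypothetical threshold."""
    return largest_blob_size(tile) >= min_pixels


# Made-up tiles: one with a 12x12 blob plus stray pixels, one with strays only.
ball_tile = np.zeros((32, 32), dtype=np.uint8)
ball_tile[10:22, 10:22] = 200
ball_tile[0, 0] = ball_tile[0, 2] = ball_tile[31, 31] = 255
empty_tile = np.zeros((32, 32), dtype=np.uint8)
empty_tile[0, 0] = empty_tile[5, 7] = 255
```

Counting the largest blob rather than all non-black pixels means a few scattered outlier pixels in an empty tile do not add up to a false "ball".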
Now, this is a very ad-hoc method and I certainly don't make any claim that it's the best method. The "limit" value was determined by experimentation -- essentially, by trial and error. It's included here to illustrate the sort of method I think you should be exploring, and to give you a starting point to tweak from. (If you want a place to start experimenting, you could try to make it give a better result for the top purple ball. Can you think of weaknesses in the approach above that might make it give a result like that? Always keep in mind, however, that you don't need a perfect-looking result, just one that's good enough. The final answer you want is "ball" or "no ball", and you just want to be able to answer that reliably.)
Note that:
You need to make sure you take the screengrab when the balls have finished rolling and are lying still in the center of their tiles. This simplifies the problem immensely.
The game's background affects the problem -- if there are ocean-themed or desert-themed levels coming up, you will need to test and possibly tweak the recognizer to make sure it still reliably works.
Special effects and/or GUI elements that cover the playing field will complicate the problem. (E.g. consider if the game has a 'cloud' or 'smoke' effect that sometimes floats over the playing field.) You may want to tweak the recognizer to be able to return "no result" if it's not sure -- then you can try another screengrab later. You may want to take several screengrabs and average the results.
I have assumed that there are only balls and non-balls. If later levels have other kinds of objects, you will have to experiment more to find out how to best recognize those.
I haven't used the 'reference picture' approach. However, if you have an image containing all the objects in the game and you can exactly align the pixels with your tiles, that's likely going to be the most reliable approach. Instead of comparing the foreground to the sampled background, compare the foreground to a set of known foreground images.
I have two dimensional discrete spatial data. I would like to make an approximation of the spatial boundaries of this data so that I can produce a plot with another dataset on top of it.
Ideally, this would be an ordered set of (x,y) points that matplotlib can plot with the plt.Polygon() patch.
My initial attempt is very inelegant: I place a fine grid over the data, and where data is found in a cell, a square matplotlib patch is created for that cell. The resolution of the boundary thus depends on the sampling frequency of the grid. Here is an example, where the grey regions are the cells containing data and the black regions are where no data exists.
1st attempt http://astro.dur.ac.uk/~dmurphy/data_limits.png
OK, problem solved - why am I still here? Well.... I'd like a more "elegant" solution, or at least one that is faster (ie. I don't want to get on with "real" work, I'd like to have some fun with this!). The best way I can think of is a ray-tracing approach - eg:
1. from xmin to xmax, at y=ymin, check if data boundary crossed in intervals dx
2. y=ymin+dy, do 1
3. do 1-2, but now sample in y
An alternative is defining a centre, and sampling in r-theta space - ie radial spokes in dtheta increments.
Both would produce a set of (x,y) points, but then how do I order/link neighbouring points to create the boundary?
A nearest neighbour approach is not appropriate as, for example (to borrow from Geography), an isthmus (think of Panama connecting N&S America) could then close off and isolate regions. This also might not deal very well with the holes seen in the data, which I would like to represent as a different plt.Polygon.
The solution perhaps comes from solving an area maximisation problem. For a set of points defining the data limits, what is the maximum contiguous area contained within those points? To form the enclosed area, what are the neighbouring points for the nth point? How will the holes be treated in this scheme - is this erring into topology now?
Apologies, much of this is me thinking out loud. I'd be grateful for some hints, suggestions or solutions. I suspect this is an oft-studied problem with many solution techniques, but I'm looking for something simple to code and quick to run... I guess everyone is, really!
~~~~~~~~~~~~~~~~~~~~~~~~~
OK, here's attempt #2 using Mark's idea of convex hulls:
alt text http://astro.dur.ac.uk/~dmurphy/data_limitsv2.png
For this I used qconvex from the qhull package, getting it to return the extreme vertices. For those interested:
cat [data] | qconvex Fx > out
The sampling of the perimeter seems quite low, and although I haven't played much with the settings, I'm not convinced I can improve the fidelity.
I think what you are looking for is the convex hull of the data. That will give a set of points that, if connected, enclose all your points: every point is on or inside the resulting polygon.
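If SciPy is available, a convex hull can be computed without leaving Python. The sketch below is one possible way to do it, with hull_boundary as a made-up helper name and random points standing in for the real data. For 2D input, scipy.spatial.ConvexHull reports the hull vertices in counter-clockwise order, so they can be handed to a polygon patch directly.

```python
import numpy as np
from scipy.spatial import ConvexHull


def hull_boundary(points):
    """Return the convex hull vertices of 2D points as an ordered (N, 2) array."""
    points = np.asarray(points, dtype=float)
    hull = ConvexHull(points)
    # For 2D input, hull.vertices lists the corner indices counter-clockwise.
    return points[hull.vertices]


# Random points standing in for the real dataset.
rng = np.random.default_rng(0)
data = rng.uniform(-1.0, 1.0, size=(500, 2))
boundary = hull_boundary(data)
```

The ordered boundary array can then be passed to plt.Polygon(boundary, closed=True, fill=False). Keep in mind that a convex hull cannot follow concave parts of the outline or represent holes in the data.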
I may have missed something, but what's the motivation for not simply determining the maximum and minimum x and y values? Unless you have an enormous amount of data, you could simply iterate through your points and determine the minimum and maximum values fairly quickly.
This isn't the most efficient example, but if your data set is small this won't be particularly slow:
import random
data = [(random.randint(-100, 100), random.randint(-100, 100)) for i in range(1000)]
x_min = min([point[0] for point in data])
x_max = max([point[0] for point in data])
y_min = min([point[1] for point in data])
y_max = max([point[1] for point in data])