I have the following code which looks for feature points in a binary skeletonized image. I need to find ending points, branch points and intersection points separately and display their coordinates as (x, y, point type). For example, (147, 45, 3), where 3 is the number of adjacent pixels (branch point).
import cv2 as cv
import numpy as np

def extraction(img):
    # Find row and column locations that are non-zero
    (rows, cols) = np.nonzero(img)
    # Initialize empty list of co-ordinates
    skel_coords = []
    # For each non-zero pixel
    for (r, c) in zip(rows, cols):
        # Extract an 8-connected neighbourhood
        (col_neigh, row_neigh) = np.meshgrid(np.array([c - 1, c, c + 1]), np.array([r - 1, r, r + 1]))
        # Cast to int to index into image
        col_neigh = col_neigh.astype('int')
        row_neigh = row_neigh.astype('int')
        # Convert into a single 1D array and check for non-zero locations
        pix_neighbourhood = img[row_neigh, col_neigh].ravel() != 0
        # Classify by the number of non-zero pixels in the 3x3 window (centre pixel included)
        if np.sum(pix_neighbourhood) == 2:
            skel_coords.append((c, r, 1))
        elif np.sum(pix_neighbourhood) == 4:
            skel_coords.append((c, r, 3))
        elif np.sum(pix_neighbourhood) == 5:
            skel_coords.append((c, r, 4))
    return skel_coords

img = cv.imread('abc.png', 0)
coord = extraction(img)
for element in coord:
    print(element)
The code correctly counts the neighboring pixels, but the points it reports are not necessarily branch or crossing points. You can see it in the picture below (the found point is marked in gray):
An enlarged image of a 3x3 pixel matrix (below, two white pixels are in a row):
I need to find points of the following kind for branch points (so that neighboring pixels alternate):
Does anyone have any ideas how to implement this? I would be very grateful for your help!
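One common way to express "neighboring pixels alternate" is the crossing number: walk around the 8 neighbours in circular order and count the 0-to-1 transitions. This is a technique swapped in for illustration, not something taken from the code above. The sketch below assumes a one-pixel-wide skeleton; the function name classify_skeleton_points is hypothetical, and the reported type is the number of transitions (1 for an end point, 3 for a branch point, 4 or more for a crossing).

```python
import numpy as np

def classify_skeleton_points(img):
    """Return (x, y, type) for end, branch and crossing points of a thin skeleton."""
    # Pad with a one-pixel border of zeros so every pixel has 8 neighbours.
    skel = np.pad(np.asarray(img) != 0, 1).astype(np.uint8)
    # Neighbour offsets (d_row, d_col) in circular order around the centre.
    ring = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
            (1, 1), (1, 0), (1, -1), (0, -1)]
    points = []
    for r, c in zip(*np.nonzero(skel)):
        vals = [int(skel[r + dr, c + dc]) for dr, dc in ring]
        # Count 0 -> 1 transitions while walking once around the ring.
        transitions = sum(
            1 for i in range(8) if vals[i] == 0 and vals[(i + 1) % 8] == 1
        )
        if transitions == 1 or transitions >= 3:
            # Undo the padding offset; report as (x, y, type).
            points.append((int(c) - 1, int(r) - 1, transitions))
    return points
```

Ordinary line pixels give exactly two transitions and are skipped, so two white pixels sitting side by side count as a single branch instead of two neighbours.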
I would like to find minimum distance of each voxel to a boundary element in a binary image in which the z voxel size is different from the xy voxel size. This is to say that a single voxel represents a 225x110x110 (zyx) nm volume.
Normally, I would do something with scipy.ndimage.morphology.distance_transform_edt (https://docs.scipy.org/doc/scipy-0.14.0/reference/generated/scipy.ndimage.morphology.distance_transform_edt.html) but this seems to assume isotropic voxel sizes:
dtrans_stack = np.zeros_like(segm_stack) # empty array to add to
### iterate over the t dimension and get distance transform
for t_iter in range(dtrans_stack.shape[0]):
    segm_ = segm_stack[t_iter, ...] # segmented image in single t
    neg_segm = np.ones_like(segm_) - segm_ # negative of the segmented image
    # get a distance transform with isotropic voxel sizes
    dtrans_stack_iso = distance_transform_edt(segm_)
    dtrans_neg_stack_iso = -distance_transform_edt(neg_segm) # make distance in the segmented image negative
    dtrans_stack[t_iter, ...] = dtrans_stack_iso + dtrans_neg_stack_iso
I can do this by brute force using scipy.spatial.distance.cdist (https://docs.scipy.org/doc/scipy/reference/generated/scipy.spatial.distance.cdist.html), but this takes ages and I'd rather avoid it if I can:
vox_multiplier = np.array([z_voxelsize, xy_voxelsize, xy_voxelsize]) # array of voxel sizes
## get a subset of coordinates so I'm not wasting time in empty space
disk_size = 5 # size of disk for binary dilation
mip_tz = np.max(np.max(decon_stack, axis = 1), axis = 0)
thresh_li = threshold_li(mip_tz) # from skimage.filters
mip_mask = mip_tz >= thresh_li
mip_mask = remove_small_objects(mip_mask) # from skimage.morphology
mip_dilated = binary_dilation(mip_mask, disk(disk_size)) # from skimage.morphology
# get the coordinates of the mask
coords = np.argwhere(mip_dilated == 1)
ycoords = coords[:, 0]
xcoords = coords[:, 1]
# get the lower and upper bounds of the xyz coordinates
ylb = np.min(ycoords)
yub = np.max(ycoords)
xlb = np.min(xcoords)
xub = np.max(xcoords)
zlb = 0
zub = zdims -1
# make zero arrays of the proper size
dtrans_stack = np.zeros_like(segm_stack)
dtrans_stack_neg = np.zeros_like(segm_stack) # this will be the distance transform into the low inten area
for t_iter in range(dtrans_stack.shape[0]):
    segm_ = segm_stack[t_iter, ...]
    neg_segm_ = np.ones_like(segm_) - segm_ # negative of the segmented image
    # get the coordinates of segmented image and convert to nm
    segm_coords = np.argwhere(segm_ == 1)
    segm_coords_nm = vox_multiplier * segm_coords
    neg_segm_coords = np.argwhere(neg_segm_ == 1)
    neg_segm_coords_nm = vox_multiplier * neg_segm_coords
    # make empty arrays for the xy and z distance transforms
    dtrans_stack_x = np.zeros_like(segm_)
    dtrans_stack_y = np.zeros_like(segm_)
    dtrans_stack_z = np.zeros_like(segm_)
    dtrans_stack_neg_x = np.zeros_like(segm_)
    dtrans_stack_neg_y = np.zeros_like(segm_)
    dtrans_stack_neg_z = np.zeros_like(segm_)
    # iterate over the zyx and determine the minimum distance in nm from segmented image
    for z_iter in range(zlb, zub):
        for y_iter in range(ylb, yub):
            for x_iter in range(xlb, xub):
                coord_nm = vox_multiplier * np.array([z_iter, y_iter, x_iter]) # change coords from pixel to nm
                coord_nm = coord_nm.reshape(1, 3) # reshape for distance calculation
                dists_segm = distance.cdist(coord_nm, segm_coords_nm) # distance from the segmented image
                dists_neg_segm = distance.cdist(coord_nm, neg_segm_coords_nm) # distance from the negative segmented image
                dtrans_stack[t_iter, z_iter, y_iter, x_iter] = np.min(dists_segm) # add minimum distance to distance transform stack
                dtrans_stack_neg[t_iter, z_iter, y_iter, x_iter] = np.min(dists_neg_segm)
Here is an image of a single z-slice of the segmented image, if that helps to clear things up:
single z-slice of segmented image
Normally, I would do something with scipy.ndimage.morphology.distance_transform_edt but this seems to assume isotropic voxel sizes:
It does no such thing! You are looking for the sampling= parameter. From the latest version of the docs:
Spacing of elements along each dimension. If a sequence, must be of length equal to the input rank; if a single number, this is used for all axes. If not specified, a grid spacing of unity is implied.
The wording "sampling" or "spacing" is probably a bit mysterious if you think of pixels as little squares/cubes, and that is probably why you missed it. In most situations, it is better to think of pixels as point samples on a grid, with fixed spacing between samples. I recommend Alvy Ray Smith's "A Pixel Is Not a Little Square" for a better understanding of this terminology.
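As a minimal sketch of the sampling= parameter applied to the question's case, the function below computes a signed distance in nanometres for one (z, y, x) volume. The function name signed_distance_nm and the default spacing of (225, 110, 110) nm are assumptions taken from the voxel size quoted in the question; the sign convention (positive inside the segmentation, negative outside) follows the question's isotropic code.

```python
import numpy as np
from scipy.ndimage import distance_transform_edt

def signed_distance_nm(segm, spacing=(225.0, 110.0, 110.0)):
    """Signed distance transform of a binary (z, y, x) volume in nm."""
    segm = np.asarray(segm, dtype=bool)
    # Distance from each foreground voxel to the nearest background voxel.
    inside = distance_transform_edt(segm, sampling=spacing)
    # Distance from each background voxel to the nearest foreground voxel.
    outside = distance_transform_edt(~segm, sampling=spacing)
    return inside - outside
```

For a 4D (t, z, y, x) stack, this could be called once per time point, for example signed_distance_nm(segm_stack[t_iter, ...]), which replaces the triple loop over voxels.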
Marmot is a document image dataset (http://www.icst.pku.edu.cn/cpdp/data/marmot_data.htm) in which several elements are labelled, such as document body, image area, table area, table caption and so on. The dataset is used specifically for document image analysis research. All coordinates are given as 16-digit hexadecimal values, described as little endian. Has anyone worked with this dataset, and how can those 16-digit XY coordinates be converted to a human-readable format?
I finally worked it out after some analysis, and I am posting it here in case anyone else needs to investigate this dataset. The authors do state the unit they use, which is what lets you convert the given coordinates into pixel values, but it was difficult to track down because it is not in their manual/guideline; it is mentioned elsewhere, as an annotation.
First you have to decode each 16-character hexadecimal value as an IEEE 754 double. For example, the given coordinates for a label are:
BBox=['4074145c00000005', '4074dd95999999a9', '4080921e74bc6a80', '406fb9999999999a']
Convert using Python 2 (note that the '!d' format string reads the bytes in network, i.e. big-endian, order):
conv_pound = [struct.unpack('!d', str(t).decode('hex'))[0] for t in BBox]
You will get the values in points (spelled "pound" in the variable name), where one point is 1/72 inch. We usually work with coordinates in pixels, and here we take 1 inch to be 96 pixels. So,
conv_pound = [321.2724609375003, 333.8490234375009, 530.2648710937501, 253.8]
Then divide each value by 72 and multiply by 96 to get the corresponding pixel values:
in_pixel = [428.36328, 445.13203, 707.01983, 338.40000]
They count positions from the bottom-left corner of the document image. If you want the origin at the top-left corner (the usual convention), subtract the 2nd and 4th values from the image height. If the image [height, width] is [1123, 793], the above coordinates become, as integers,
label_boundary = [428, 678, 707, 785]
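The steps above can be put together as a short Python 3 sketch. The function name bbox_hex_to_pixels and its parameters are hypothetical; the 96 pixels per inch figure and the image height of 1123 are the assumptions used in this answer, not values read from the dataset.

```python
import struct

def bbox_hex_to_pixels(bbox_hex, img_height, dpi=96.0):
    """Convert a Marmot BBox (four hex strings) to top-left-origin pixels."""
    # Each 16-character string is an IEEE 754 double in big-endian byte order.
    points = [struct.unpack('>d', bytes.fromhex(h))[0] for h in bbox_hex]
    # Points (1/72 inch) to pixels at the assumed resolution.
    pixels = [p / 72.0 * dpi for p in points]
    # The 2nd and 4th values are y coordinates measured from the bottom edge.
    pixels[1] = img_height - pixels[1]
    pixels[3] = img_height - pixels[3]
    return [round(p) for p in pixels]

bbox = ['4074145c00000005', '4074dd95999999a9',
        '4080921e74bc6a80', '406fb9999999999a']
print(bbox_hex_to_pixels(bbox, img_height=1123))
```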
After staring at the XML files for an hour, I've found the last missing piece in the answer by @MMReza:
You don't need to rely on the units of measure (step number 3). The root element "Page" has an attribute called "CropBox". Use that one to scale the coordinates.
I have something along the following lines (the y axis is also inverted here):
px0, py1, px1, py0 = list(map(hex_to_double, page.get("CropBox").split()))
pw = abs(px1 - px0)
ph = abs(py1 - py0)
for table in page.findall(".//Composite[@Label='TableBody']"):
    x0p, y1m, x1p, y0m = list(map(hex_to_double, table.get("BBox").split()))
    x0 = round(imgw*(x0p - px0)/pw)
    x1 = round(imgw*(x1p - px0)/pw)
    y0 = round(imgh*(py1 - y0m)/ph)
    y1 = round(imgh*(py1 - y1m)/ph)
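The snippet above leaves hex_to_double, page, imgw and imgh undefined. A self-contained sketch of one way to fill them in follows; it assumes the hex strings are big-endian doubles (as in the other answers), that page is the parsed "Page" root element, and that imgw and imgh are the image width and height in pixels. The function name table_boxes is hypothetical.

```python
import struct
import xml.etree.ElementTree as ET

def hex_to_double(h):
    """Decode a 16-character hex string as a big-endian IEEE 754 double."""
    return struct.unpack('>d', bytes.fromhex(h))[0]

def table_boxes(page, imgw, imgh):
    """Return (x0, y0, x1, y1) pixel boxes for every TableBody in a Page."""
    px0, py1, px1, py0 = map(hex_to_double, page.get("CropBox").split())
    pw = abs(px1 - px0)
    ph = abs(py1 - py0)
    boxes = []
    for table in page.findall(".//Composite[@Label='TableBody']"):
        x0p, y1m, x1p, y0m = map(hex_to_double, table.get("BBox").split())
        # Scale by the crop box and flip the y axis to a top-left origin.
        x0 = round(imgw * (x0p - px0) / pw)
        x1 = round(imgw * (x1p - px0) / pw)
        y0 = round(imgh * (py1 - y0m) / ph)
        y1 = round(imgh * (py1 - y1m) / ph)
        boxes.append((x0, y0, x1, y1))
    return boxes
```

A page could be loaded with page = ET.parse(path).getroot(), where path is the label file for one document page.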
In case anyone is trying to do this in Python 3 like I did, you only have to change step 2 of the other answer like this :
conv_pound = [struct.unpack('!d', bytes.fromhex(t))[0] for t in BBox]
I wanted to convert the coordinates and also verify that my conversion actually worked. So I made this script, which reads each label file and its image file, extracts the coordinates of (for example) the table body and draws them on the image. It can be used to extract other fields in a similar manner. The comments explain it all.
import glob
import struct
import cv2
import binascii
import re

xml_files = glob.glob("path_to_labeled_files/*.xml")
for i in xml_files:
    # Open the current file and read everything
    cur_file = open(i,"r")
    content = cur_file.read()
    # Find index of all occurrences of only needed portions (eg TableBody this case)
    residxs = [l.start() for l in re.finditer('Label="TableBody"', content)]
    # Read the image
    img = cv2.imread("path_to_images_folder/"+i.split('/')[-1][:-3]+"jpg")
    # Traverse over all occurrences
    for r in residxs[:-1]:
        # List to store output points
        coords = []
        # Start index of an occurrence
        sidx = r
        # Substring from whole file content
        substr = content[sidx:sidx+400]
        # Now find start index and end index of coordinates in this substring
        sidx = substr.find('BBox="')
        eidx = substr.find('" CLIDs')
        # String containing only points
        points = substr[sidx+6:eidx]
        # Make the conversion (also take care of little and big endian in unpack)
        bins = ''
        for j in points.split(' '):
            if(j == ''):
                continue
            coords.append(struct.unpack('>d', binascii.unhexlify(j))[0])
        if len(coords) != 4:
            continue
        # As suggested by MMReza
        for k in range(4):
            coords[k] = (coords[k]/72)*96
        coords[1] = img.shape[0] - coords[1]
        coords[3] = img.shape[0] - coords[3]
        # Print the extracted coordinates
        print(coords)
        # Visualize it on the image
        cv2.rectangle(img, (int(coords[0]),int(coords[1])) , (int(coords[2]),int(coords[3])), (255, 0, 0), 2)
    cv2.imshow("frame",img)
    cv2.waitKey(0)
While developing the A* algorithm for path planning, I am trying to build a list called edges that contains all connections from one pixel to its neighbouring pixels that lie in non-occupied space (where the pixel value is 1).
The image from which I compute these connections is 351x335 pixels.
Pixels P2,4,6,8 are at a distance=1 from the center, while pixels P1,3,5,7 are at an approximate distance=1.4 from the center (Pythagorean theorem); see image:
The edges loop in my code never seems to finish. Is it taking too much computational time because of the loops? The vertices loop ends in a second or so.
Note: I am initializing the lists as two very big arrays and cutting them at the end to not use dynamic allocation.
EDIT: The image (imOut) is the following one:
Link to image used as map
EDIT: The full code is the following:
'''
IMPORTS
'''
import cv2 as cv # Import OpenCV
import numpy as np # Import Numpy
from skimage.color import rgb2gray
import math
from datetime import datetime
import matplotlib.pyplot as plt
from scipy import arange
'''
CODE SETTINGS
'''
# Allowing to print full array without truncation
np.set_printoptions(threshold=np.nan)
'''
MAIN PROGRAM
'''
im = rgb2gray(cv.imread('map1.png'))
imOut = im # Making a copy of the image to output
# plt.imshow(imOut)
# plt.show()
vertices = np.zeros((imOut.shape[0]*imOut.shape[1], 3)) # 1st col to x, 2nd col to y, 3rd col to heuristic (euclidean distance to QGoal)
edges = np.zeros((100*imOut.shape[0]*imOut.shape[1], 3)) # 1st col 1st vertex, 2nd col 2nd vertex, 3rd col edge length
# Initialization of vertices with start pos.
# CREATE VERTICES AND EDGES LISTS FROM THE MAP PROVIDED
'''
Vertices List -> Add all obstacle-free configurations to the vertices list.
Edges List -> Go pixel by pixel in the map and, if they are obstacle-free configurations,
add, out of the 8 neighbouring pixels, the ones that are obstacle-free as feasible edges.
'''
# Vertices list creation
indexVertices = 0
for i in range(0, imOut.shape[0]):
    for j in range(0, imOut.shape[1]):
        if imOut[i,j] == 1: # If it's in free space
            # Compute heuristic to goal node (euclidean distance).
            heuristic = math.sqrt(pow(i-QGoal[0],2)
                                  + pow(j-QGoal[1],2))
            vertices[indexVertices,:] = [i, j, heuristic]
            indexVertices = indexVertices + 1
vertices = vertices[0:indexVertices,:]
# Edges list creation
# I loop over the same vertices array, as it only contains the free pixels.
indexEdges = 0
for i in range(0, vertices.shape[0]):
    for k in range(0, vertices.shape[0]):
        # If it is not the same pixel that we are checking
        if i != k:
            # Check if it is a neighbouring pixel and, if so,
            # add it to the list of edges with its distance (path cost).
            pathCost = math.sqrt(pow(vertices[i,0]
                                     - vertices[k,0], 2)
                                 + pow(vertices[i,1]
                                       - vertices[k,1], 2))
            if pathCost == 1 or round(pathCost,1) == 1.4:
                edges[indexEdges,:] = [i, k, pathCost]
                indexEdges = indexEdges + 1
edges = edges[0:indexEdges,:]
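The edges loop above compares every free pixel with every other free pixel, so its cost grows with the square of the number of vertices; with tens of thousands of free pixels in a 351x335 image that is a very large number of distance computations. Since only the 8 neighbours described above can ever qualify, one alternative is to look at just those 8 offsets per pixel. The following is a sketch under that assumption, not a tested replacement for the full program; the function name build_edges is hypothetical, and vertices are numbered in the same row-by-row order as in the vertices loop.

```python
import math
import numpy as np

def build_edges(free):
    """List (vertex_a, vertex_b, cost) for all 8-connected free pixel pairs."""
    free = np.asarray(free, dtype=bool)
    n_rows, n_cols = free.shape
    # Map each free pixel to its vertex number (row-major order), -1 elsewhere.
    index = np.full(free.shape, -1, dtype=int)
    index[free] = np.arange(np.count_nonzero(free))
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, -1),
               (0, 1), (1, -1), (1, 0), (1, 1)]
    edges = []
    for r, c in zip(*np.nonzero(free)):
        for dr, dc in offsets:
            rr, cc = r + dr, c + dc
            # Keep the neighbour only if it is inside the map and free.
            if 0 <= rr < n_rows and 0 <= cc < n_cols and free[rr, cc]:
                cost = math.hypot(dr, dc)  # 1 for straight, sqrt(2) for diagonal
                edges.append((int(index[r, c]), int(index[rr, cc]), cost))
    return edges
```

With the question's variables this might be called as build_edges(imOut == 1); the work is then at most 8 checks per free pixel.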
I want to get a list of indices (row,col) for all raster cells that fall within or are intersected by a polygon feature. Looking for a solution in python, ideally with gdal/ogr modules.
Other posts have suggested rasterizing the polygon, but I would rather have direct access to the cell indices if possible.
Since you don't provide a working example, it's a bit unclear what your starting point is. I made a dataset with one polygon; if you have a dataset with multiple polygons but only want to target a specific one, you can add SQLStatement or where to the gdal.Rasterize call.
Sample polygon
geojson = """{"type":"FeatureCollection",
"name":"test",
"crs":{"type":"name","properties":{"name":"urn:ogc:def:crs:OGC:1.3:CRS84"}},
"features":[
{"type":"Feature","properties":{},"geometry":{"type":"MultiPolygon","coordinates":[[[[-110.254,44.915],[-114.176,37.644],[-105.729,36.41],[-105.05,43.318],[-110.254,44.915]]]]}}
]}"""
Rasterizing
Rasterizing can be done with gdal.Rasterize. You need to specify the properties of the target grid. If there is no predefined grid these could be extracted from the polygon itself
ds = gdal.Rasterize('/vsimem/tmpfile', geojson, xRes=1, yRes=-1, allTouched=True,
outputBounds=[-120, 30, -100, 50], burnValues=1,
outputType=gdal.GDT_Byte)
mask = ds.ReadAsArray()
ds = None
gdal.Unlink('/vsimem/tmpfile')
Converting to indices
Retrieving the indices from the rasterized polygon can be done with Numpy:
y_ind, x_ind = np.where(mask==1)
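If a list of (row, col) pairs is more convenient than two separate index arrays, the arrays can be zipped together. A small sketch, using a hand-made mask in place of the rasterized one; the function name mask_to_cells is hypothetical.

```python
import numpy as np

def mask_to_cells(mask):
    """Return the (row, col) index of every burned cell in a rasterized mask."""
    rows, cols = np.where(np.asarray(mask) == 1)
    return [(int(r), int(c)) for r, c in zip(rows, cols)]

# Stand-in for the array returned by ds.ReadAsArray()
example_mask = np.array([[0, 1, 0],
                         [1, 1, 0]], dtype=np.uint8)
print(mask_to_cells(example_mask))
```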
Clearly Rutger's solution above is the way to go with this; however, I will leave my solution up. I developed a script that accomplishes what I needed with the following steps:
1. Get the bounding box for each vector feature I want to check
2. Use the bounding box to limit the computational window (determine what portion of the raster could potentially have intersections)
3. Iterate over the cells within this part of the raster and construct a polygon geometry for each cell
4. Use ogr.Geometry.Intersects() to check if the cell intersects with the polygon feature
Note that I have only defined the methods, but I think implementation should be pretty clear -- just call match_cells with the appropriate arguments (ogr.Geometry object and geotransform matrix). Code below:
from osgeo import ogr

# Convert projected coordinates to raster cell indices
def parse_coords(x, y, gt):
    row, col = None, None
    if x is not None:
        col = int((x - gt[0]) // gt[1])
        # If only x coordinate is provided, return column index
        if y is None:
            return col
    if y is not None:
        row = int((y - gt[3]) // gt[5])
        # If only y coordinate is provided, return row index
        if x is None:
            return row
    return (row, col)

# Construct polygon geometry from raster cell
def build_cell(cell, gt):
    row, col = cell
    xres, yres = gt[1], gt[5]
    x_0, y_0 = gt[0], gt[3]
    top = (yres*row) + y_0
    bottom = (yres*(row+1)) + y_0
    left = (xres*col) + x_0
    right = (xres*(col+1)) + x_0
    # Create ring topology
    ring = ogr.Geometry(ogr.wkbLinearRing)
    ring.AddPoint(left, bottom)
    ring.AddPoint(right, bottom)
    ring.AddPoint(right, top)
    ring.AddPoint(left, top)
    ring.AddPoint(left, bottom)
    # Create polygon
    box = ogr.Geometry(ogr.wkbPolygon)
    box.AddGeometry(ring)
    return box

# Iterate over feature geometries & check for intersection
def match_cells(inputGeometry, gt):
    matched_cells = []
    for f, feature in enumerate(inputGeometry):
        geom = feature.GetGeometryRef()
        bbox = geom.GetEnvelope()
        xmin, xmax = [parse_coords(x, None, gt) for x in bbox[:2]]
        ymin, ymax = [parse_coords(None, y, gt) for y in bbox[2:]]
        for cell_row in range(ymax, ymin+1):
            for cell_col in range(xmin, xmax+1):
                cell_box = build_cell((cell_row, cell_col), gt)
                if cell_box.Intersects(geom):
                    matched_cells += [[(cell_row, cell_col)]]
    return matched_cells
If you want to do this manually, you'll need to test each cell for:
Square v Polygon intersection and
Square v Line intersection.
If you treat each square as a 2d point this becomes easier - it's now a Point v Polygon problem. Check in Game Dev forums for collision algorithms.
Good luck!
I'm trying to count the number of pixels in a weather radar image for each dBZ reflectivity level (the colored blocks of green, orange, yellow, red, etc.) so I can "score" the radar image based on the type of echoes.
I'm new to numpy and numpy arrays, but I know it can be very efficient when I'm working with the individual pixels in an image, so I'd like to learn more.
I'm not even sure I'm selecting the pixels correctly, but I think I'm getting close.
I have a sample of using both numpy and basic pixel iteration to count the number of green pixels with an RGBA of (1, 197, 1, 255).
Hopefully I'm close and someone can give me guidance on how to select the pixels using numpy and then count them:
import io
import numpy as np
import PIL.Image
import urllib2
import sys
color_dbz_20 = (2, 253, 2, 255)
color_dbz_25 = (1, 197, 1, 255)
color_dbz_30 = (0, 142, 0, 255)
url = 'http://radar.weather.gov/ridge/RadarImg/N0R/DLH_N0R_0.gif'
image_bytes = io.BytesIO(urllib2.urlopen(url).read())
image = PIL.Image.open(image_bytes)
image = image.convert('RGBA')
total_pixels = image.height * image.width
# Count using numpy
np_pixdata = np.array(image)
# Didn't work, gave me the total size:
# np_counter = np_pixdata[(np_pixdata == color_dbz_20)].size
np_counter = np.count_nonzero(np_pixdata[(np_pixdata == color_dbz_20)])
# Count using pillow
pil_pixdata = image.load()
pil_counter = 0
for y in xrange(image.size[1]):
    for x in xrange(image.size[0]):
        if pil_pixdata[x, y] == color_dbz_20:
            pil_counter += 1
print "Numpy Count: %d" % np_counter
print "Pillow Count: %d" % pil_counter
Output:
Numpy Count: 134573
Pillow Count: 9967
The problem is that the numpy array has size X * Y * 4, so each of its elements is only a single number, yet you compare it with a whole tuple. That's the reason why your:
np_counter = np_pixdata[(np_pixdata == color_dbz_20)].size
didn't exclude any elements.
You got different counts in the end because you counted nonzero elements. Some of the selected array elements are themselves 0 (only in one color channel, but still 0), and those are excluded even though you don't want that!
First you want to compare numpy arrays so better convert the color-tuples too:
color_dbz_20 = np.array([2, 253, 2, 255]), ...
To get the real result for your condition you must use np.all along axis=2:
np.all(np_pixdata == color_dbz_20, axis=2)
This checks if the values along axis 2 (colors) are equal to the ones in your color_dbz_20 and this for each pixel. To get the sum of all the matches:
np.sum(np.all(np_pixdata == color_dbz_20, axis=2)) # Sum of boolean array is integer!
which gives you the number of pixels where the condition is True. True is interpreted as 1 and False as 0, so the sum works; alternatively you could use count_nonzero instead of sum here. This always assumes you created color_dbz_20 as an np.array.
If the image has a different layout and is not width * height * depth, you just need to adjust the axis in np.all to the dimension that holds the colors (the one with length 4).
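To make the approach concrete, here is a small self-contained sketch that runs on a synthetic RGBA array instead of the downloaded radar image. The function name count_color is hypothetical; it assumes the color channels are on the last axis.

```python
import numpy as np

def count_color(pixels, color):
    """Count pixels whose channels all match `color` (channels on the last axis)."""
    pixels = np.asarray(pixels)
    color = np.asarray(color)
    # Compare channel by channel, then require every channel of a pixel to match.
    matches = np.all(pixels == color, axis=-1)
    return int(np.count_nonzero(matches))

color_dbz_20 = np.array([2, 253, 2, 255])

# Synthetic 2 x 3 RGBA image: two exact matches and one near miss (alpha differs).
image = np.zeros((2, 3, 4), dtype=np.uint8)
image[0, 0] = color_dbz_20
image[1, 2] = color_dbz_20
image[0, 1] = [2, 253, 2, 0]

print(count_color(image, color_dbz_20))
```

Using axis=-1 rather than axis=2 is a small design choice so the same function also works if the array has extra leading dimensions.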