Does SciPy have any function for extracting data from image? - python

I have to analyze data given as an image like this:
What I do is:
1. Erase the axes manually.
2. Convert the image to (x,y) coordinates with ImageMagick (collecting the coordinates of the black pixels).
3. Adjust the (x,y) values so that they match the axis values rather than the pixel coordinates, and flip the y direction, since in images the y coordinate increases from top to bottom.
4. Sort the data by x.
5. Load the data in a SciPy script.
I wonder if there is a function to do any of steps 1-4 in the same SciPy script.
Since SciPy has a range of functions for image recognition, I would like to know if there is a function that translates an image into the (x,y) coordinates of the black pixels forming the curve, and so on.

First, load the image with sp.misc.imread (removed in SciPy 1.2; imageio.imread is a common replacement) or PIL so that it resides in a NumPy array that I'll refer to as img below. (If it is a color image, convert it to grayscale with img = np.mean(img, axis=-1).) Then:
img = img[:-k, :] # Erase the axis manually; k is the height of the axis in pixels
y, x = np.where(img == 0) # Find x and y coordinates of all black pixels
y = y.max() - y # Invert the y axis so that y increases upwards
i = np.argsort(x); x, y = x[i], y[i] # Sort data by x
Assumed imports:
import numpy as np
import scipy as sp
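The pixel coordinates found this way still have to be rescaled to the axis values (step 3 in the question). A minimal sketch of that step, assuming the axes have already been cropped away and the image edges correspond to known data ranges. The function name pixels_to_data and its parameters are hypothetical, and a small synthetic image stands in for the real plot:

```python
import numpy as np

def pixels_to_data(img, x_range, y_range, threshold=0):
    """Return curve points in data units from a grayscale plot image.

    Assumes pixels <= threshold belong to the curve and that the image
    edges correspond to x_range and y_range (both assumptions).
    """
    rows, cols = np.nonzero(img <= threshold)
    height, width = img.shape
    # Columns map left-to-right onto x_range.
    x = x_range[0] + cols / (width - 1) * (x_range[1] - x_range[0])
    # Rows grow downwards, so flip them before mapping onto y_range.
    y = y_range[0] + (height - 1 - rows) / (height - 1) * (y_range[1] - y_range[0])
    order = np.argsort(x, kind="stable")
    return x[order], y[order]

# Synthetic 5x5 white image with a black diagonal rising from bottom-left.
img = np.full((5, 5), 255, dtype=np.uint8)
for c in range(5):
    img[4 - c, c] = 0

x, y = pixels_to_data(img, x_range=(0.0, 1.0), y_range=(0.0, 10.0))
```

For a scanned plot the curve is rarely pure black, so a threshold above 0 is probably needed.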


Creating an image mask using polygon points coordinates

I have a grayscale image with size (1920,1080) that I'm trying to create a mask for. I used external software to manually get the points of interest (a polygon). There are now 27 coordinate points representing a polygon in the middle of the image.
I created a mask using the following:
import numpy as np
import matplotlib.pyplot as plt
from skimage.draw import polygon2mask
#image= grayscale with shape (1920,1080)
coordinates = ([1080.15, 400.122], [1011.45, 400.90], .......) #27 points
polygon = np.array(coordinates)
mask = polygon2mask(image.shape, polygon)
result = np.ma.masked_array(image, np.invert(mask))
The problem I'm facing is that the output is in the wrong place; it should be roughly centred because I took the coordinates from the center, but it actually ends up at the edge of the image (bottom):
Also, the size seems to be a bit smaller than expected. I'm not sure what is causing this problem; I must have done something wrong in my code. Kindly help me identify the problem.
You swapped the x and y coordinates: polygon2mask expects coordinates in (y, x), i.e. (row, column), order. Add
coordinates = [[y,x] for [x,y] in coordinates]
after defining coordinates, and you'll probably get what you expected.
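The same swap can be done on the NumPy array directly. A small sketch, assuming the points were exported as (x, y) pairs; the third point is made up for illustration. The shape passed to polygon2mask follows the same (rows, columns) convention, so an image 1920 pixels wide and 1080 pixels high has the NumPy shape (1080, 1920):

```python
import numpy as np

# Hypothetical (x, y) points as exported by an annotation tool.
coordinates_xy = [(1080.15, 400.122), (1011.45, 400.90), (1050.0, 520.0)]

# polygon2mask expects (row, col), i.e. (y, x), so reverse each pair.
polygon_rc = np.asarray(coordinates_xy)[:, ::-1]

# The shape argument is also (rows, cols) = (height, width).
width, height = 1920, 1080
mask_shape = (height, width)
```

polygon_rc and mask_shape (or simply image.shape) are then what polygon2mask expects as its polygon and shape arguments.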

image translation in Pytorch, using affine_grid & grid_sample functions

I want to shift the image by 1 or 2 pixels, so I specified small numbers (1.25, 1.9) in the affine matrix.
But the image is moved very far, by hundreds of pixels:
(My input image is completely filled with yellow pineapples.)
Below is a working example.
import torch
import numpy as np
import matplotlib.pyplot as plt
from torchvision import datasets, transforms
import torch.nn.functional as F
rotation_simple = np.array([[1,0, 1.25],
[ 0,1, 1.9]])
#load image
transform = transforms.Compose([transforms.Resize(255),
dataloader ='/home/Pictures',transform=transform,), shuffle=True)
dtype = torch.FloatTensor
i = 0
while i<3:
    img, labels = next(iter(dataloader))
    img = img#.double() # sometimes this needs converting to double, sometimes not
    rotation_simple = torch.as_tensor(rotation_simple)[None]
    grid = F.affine_grid(rotation_simple, img.size()).type(dtype)
    x = F.grid_sample(img, grid)
    plt.imshow(x[0].permute(1, 2, 0))
I wonder why the function moves the image so far away instead of moving it by just 1 pixel in the x and y directions.
Ps. Setting "align_corners=True" didn't help for this case.
Pps. My pytorch version is 1.4.0+cu100
The units of the grid and of the affine transformation are not pixels but normalized coordinates. From the documentation:
grid specifies the sampling pixel locations normalized by the input spatial dimensions. Therefore, it should have most values in the range of [-1, 1]. For example, values x = -1, y = -1 is the left-top pixel of input, and values x = 1, y = 1 is the right-bottom pixel of input.
Therefore, translating by [1.25, 1.9] shifts the sampling grid by more than half the image width and almost the entire image height. Since the range [-1, 1] spans 2 units across the whole width or height, a translation of t pixels corresponds to 2*t divided by the image size along that axis, e.g. 2*1.25/W for x and 2*1.9/H for y.
See the doc for grid_sample for more information.
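The conversion from pixels to normalized units can be sketched as follows. The helper name pixel_translation_theta is hypothetical, and only NumPy is used here. The handling of align_corners assumes that [-1, 1] spans the full width (W pixels) when it is False and the distance between the centers of the corner pixels (W - 1 pixels) when it is True:

```python
import numpy as np

def pixel_translation_theta(shift_x_px, shift_y_px, width, height, align_corners=False):
    """Build a 2x3 affine matrix whose translation is given in pixels."""
    # Number of pixels covered by the normalized range [-1, 1].
    span_x = width - 1 if align_corners else width
    span_y = height - 1 if align_corners else height
    return np.array([[1.0, 0.0, 2.0 * shift_x_px / span_x],
                     [0.0, 1.0, 2.0 * shift_y_px / span_y]], dtype=np.float32)

# Example: the offsets from the question on a 255x255 image (size assumed).
theta = pixel_translation_theta(1.25, 1.9, width=255, height=255)
# torch.as_tensor(theta)[None] can then be passed to F.affine_grid.
```

Note that the matrix maps output locations to input sampling locations, so a positive translation samples further right and down, which moves the image content up and to the left.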

Get pixel coordinates in raster from polygon boundary in coordinate reference system

I have some GeoTiff files that are relatively large (10980 x 10980 pixels), that all correspond to the same geographic area (and have the same coordinate reference system), and I have a large number of polygons (100,000+) corresponding to land parcels, and I want to extract from each image file the pixels corresponding to each polygon. Currently, the way I'm doing this is using shapely Polygons and the rasterio.mask.mask function, like this:
for filename in image_files:
    with as src:
        for shape in shapes:
            data, _ = rasterio.mask.mask(src, [shape], crop=True)
This is empirically rather slow. If I have the mask indices precomputed, then I just need to read each image's entire data once and then use the pre-computed indices to pull out the relevant pixels for each polygon (I don't need them to be in the correct 2-dimensional configuration, I just need the values), and this is very fast. But I don't know if there's a fast way to get these pixel indices. I know that I could use rasterio's raster_geometry_mask function to get a mask the size of the whole image, and then use numpy to get the indices of the elements inside the polygon, but then it would be needlessly constructing a 10980 x 10980 array for each polygon to make the mask, and that's very very slow.
What I ended up doing is, when I open the first image, then for each polygon,
Use the image transform to convert the polygon to pixel coordinates, and find the rectangular bounding box containing the polygon in integer pixel coordinates.
To figure out which pixels in the bounding box are actually in the polygon, construct shapely Polygons for each pixel and use the .intersects() method (if you wanted to only include pixels that are completely inside the polygon, you could use .contains()). (I wasn't sure if this would be slow, but it turned out not to be.)
Save the list of coordinate pairs for all pixels in each polygon.
Then for every new image you open, you just read the entire image data and index out the parts for each polygon because you already have the pixel indices.
Code looks approximately like this:
import math
import numpy
import pyproj
import rasterio.mask
from shapely.geometry import Polygon
shape_pixels = None
for filename in image_files:
    with as src:
        if shape_pixels is None:
            projector = pyproj.Proj(
            pixelcoord_shapes = [
                Polygon(zip(*(~src.transform * numpy.array(projector(*zip(*shape.boundary.coords))))))
                for shape in shapes
            ]
            pixels_per_shape = []
            for shape in pixelcoord_shapes:
                xmin = max(0, math.floor(shape.bounds[0]))
                ymin = max(0, math.floor(shape.bounds[1]))
                xmax = math.ceil(shape.bounds[2])
                ymax = math.ceil(shape.bounds[3])
                pixel_squares = {}
                for j in range(xmin, xmax+1):
                    for i in range(ymin, ymax+1):
                        pixel_squares[(i, j)] = Polygon.from_bounds(j, i, j+1, i+1)
                pixels_per_shape.append([
                    coords for (coords, pixel) in pixel_squares.items()
                    if shape.intersects(pixel)
                ])
        whole_data =
        for pixels in pixels_per_shape:
            ivals, jvals = zip(*pixels)
            shape_data = whole_data[0, ivals, jvals]
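The idea of precomputing pixel indices from a bounding box can also be sketched with NumPy alone. This is not the shapely-based method above but a swapped-in variant: it tests pixel centers with an even-odd ray-casting rule instead of intersecting pixel squares, so pixels that the polygon only touches at the edge are left out. The function name polygon_pixel_indices is hypothetical, and the vertices are assumed to be in pixel coordinates already (x = column, y = row):

```python
import numpy as np

def polygon_pixel_indices(vertices_xy, height, width):
    """Return (rows, cols) of pixels whose centers lie inside a polygon.

    Only the polygon's bounding box is examined, never the full raster.
    """
    v = np.asarray(vertices_xy, dtype=float)
    xmin = max(0, int(np.floor(v[:, 0].min())))
    ymin = max(0, int(np.floor(v[:, 1].min())))
    xmax = min(width, int(np.ceil(v[:, 0].max())))
    ymax = min(height, int(np.ceil(v[:, 1].max())))
    # Pixel (i, j) covers [j, j+1) x [i, i+1); its center is (j+0.5, i+0.5).
    cols, rows = np.meshgrid(np.arange(xmin, xmax), np.arange(ymin, ymax))
    px, py = cols + 0.5, rows + 0.5
    inside = np.zeros(px.shape, dtype=bool)
    x0, y0 = v[:, 0], v[:, 1]
    x1, y1 = np.roll(x0, -1), np.roll(y0, -1)
    for ax, ay, bx, by in zip(x0, y0, x1, y1):
        if ay == by:
            continue  # horizontal edges never cross a horizontal ray
        crosses = (ay > py) != (by > py)
        x_at_y = ax + (py - ay) * (bx - ax) / (by - ay)
        # Toggle pixels whose rightward ray crosses this edge.
        inside ^= crosses & (px < x_at_y)
    return rows[inside], cols[inside]

# Usage on a toy 5x6 "raster" with a rectangular polygon.
data = np.arange(30).reshape(5, 6)
rows, cols = polygon_pixel_indices([(1, 1), (4, 1), (4, 3), (1, 3)], *data.shape)
values = data[rows, cols]
```

The index arrays can be stored per polygon and reused for every image of the same grid, for example as whole_data[0, rows, cols] for a band-first array.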

affine transformation using nearest neighbor in python

I want to apply an affine transformation and afterwards use nearest neighbor interpolation while keeping the same dimensions for the input and output images. For example, I use the scaling transformation T = [[2,0,0],[0,2,0],[0,0,1]]. Any idea how I can fill the black pixels using nearest neighbor? I tried giving them the minimum of their neighbors' intensities. For example, if a pixel has neighbors [55,22,44,11,22,55,23,231], I give it the minimum intensity: 11. But the result is not clear at all.
import numpy as np
from matplotlib import pyplot as plt
#Importing the original image and init the output image
img = plt.imread('/home/left/Desktop/computerVision/SET1/brain0030slice150_101x101.png',0)
outImg = np.zeros_like(img)
# Dimensions of the input image and output image (the same dimensions)
(width , height) = (img.shape[0], img.shape[1])
# Initialize the transformation matrix
T = np.array([[2,0,0], [0,2,0], [0,0,1]])
# Make an array with input image (x,y) coordinations and add [0 0 ... 1] row
coords = np.indices((width, height), 'uint8').reshape(2, -1)
coords = np.vstack((coords, np.zeros(coords.shape[1], 'uint8')))
output = T @ coords
# Arrays of x and y coordinations of the output image within the image dimensions
x_array, y_array = output[0] ,output[1]
indices = np.where((x_array >= 0) & (x_array < width) & (y_array >= 0) & (y_array < height))
# Final coordinations of the output image
fx, fy = x_array[indices], y_array[indices]
# Final output image after the affine transformation
outImg[fx, fy] = img[fx, fy]
The input image is:
The output image after scaling is:
Well, you could simply use the OpenCV resize function:
import cv2
new_image = cv2.resize(image, new_dim, interpolation=cv2.INTER_AREA)
It will do the resize and fill in the empty pixels in one go (pass interpolation=cv2.INTER_NEAREST if you specifically want nearest neighbour).
More on cv2.resize
If you need to do it manually, you could detect the dark pixels in the resized image and change their value to the mean of the 4 neighbouring pixels (for example; it depends on the algorithm you need).
See: nearest neighbour, bilinear, bicubic, etc.
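Another common way to avoid the holes in a manual implementation is inverse mapping: loop over the output pixels, map each one back into the input with the inverse of T, and take the nearest input pixel. A minimal NumPy sketch of that idea, assuming a 2-D grayscale image and coordinates in (row, column, 1) order; the function name affine_nearest is hypothetical:

```python
import numpy as np

def affine_nearest(img, T):
    """Apply a 3x3 affine matrix T with nearest-neighbour sampling.

    The output has the same shape as the input; output pixels whose
    source falls outside the input stay 0.
    """
    rows, cols = img.shape
    out = np.zeros_like(img)
    # Homogeneous coordinates of every output pixel (note the row of ones).
    r, c = np.indices((rows, cols)).reshape(2, -1)
    out_coords = np.vstack((r, c, np.ones_like(r)))
    # Map output pixels back into the input image.
    src = np.linalg.inv(T) @ out_coords
    src_r = np.rint(src[0]).astype(int)
    src_c = np.rint(src[1]).astype(int)
    valid = (src_r >= 0) & (src_r < rows) & (src_c >= 0) & (src_c < cols)
    out[r[valid], c[valid]] = img[src_r[valid], src_c[valid]]
    return out

img = np.arange(16, dtype=np.uint8).reshape(4, 4)
T = np.array([[2, 0, 0], [0, 2, 0], [0, 0, 1]], dtype=float)
scaled = affine_nearest(img, T)
```

Because every output pixel looks up exactly one input pixel, no output pixel inside the transformed area is left unfilled.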

determining the average colour of a given circular sample of an image?

What I am trying to achieve is similar to photoshop/gimp's eyedropper tool: take a round sample of a given area in an image and return the average colour of that circular sample.
The simplest method I have found is to take a 'regular' square sample, mask it as a circle, then reduce it to 1 pixel, but this is very CPU-demanding (especially when repeated millions of times).
A more mathematically complex method is to take a square area and average only the pixels that fall within a circular area within that sample, but determining what pixel is or isn't within that circle, repeated, is CPU-demanding as well.
Is there a more succinct, less-CPU-demanding means to achieve this?
Here's a little example that doesn't actually draw a circle but gives you the coordinates of the points within a circle, which you can use to index NumPy arrays. (Newer scikit-image releases replace circle(r, c, radius) with disk((r, c), radius).)
#!/usr/bin/env python3
import numpy as np
from import imsave
from skimage.draw import circle
# Make rectangular canvas of mid-grey
w, h = 200, 100
img = np.full((h, w), 128, dtype=np.uint8)
# Get coordinates of points within a central circle
Ycoords, Xcoords = circle(h//2, w//2, 45)
# Make all points in circle=200, i.e. fill circle with 200
img[Ycoords, Xcoords] = 200
# Get mean of points in circle
print(img[Ycoords, Xcoords].mean()) # prints 200.0
# DEBUG: Save image for checking
I'm sure that there's a more succinct way to go about it, but:
import math
import numpy as np
import imageio as ioimg # as scipy's i/o function is now deprecated
from skimage.draw import circle
import matplotlib.pyplot as plt
# base sample dimensions (rest below calculated on this).
# Must be an odd number.
wh = 49
# tmp - this placement will be programmed later
dp = 500
#load work image (from same work directory)
img = ioimg.imread('830.jpg')
# convert to numpy array (dropping the alpha while we're at it)
np_img = np.array(img)[:,:,:3]
# take sample of resulting array
sample = np_img[dp:wh+dp, dp:wh+dp]
# set up numpy circle mask
## this mask will be multiplied against each RGB layer in extracted sample area
# set up basic square array
sample_mask = np.zeros((wh, wh), dtype=np.uint8)
# set up circle centre coords and radius values
xy, r = math.floor(wh/2), math.ceil(wh/2)
# use these values to populate circle area with ones
rr, cc = circle(xy, xy, r)
sample_mask[rr, cc] = 1
# add axis to make array multiplication possible (do I have to do this?)
sample_mask = sample_mask[:, :, np.newaxis]
result = sample * sample_mask
# count number of nonzero values (this will be our divisor for the mean)
nz = np.count_nonzero(sample_mask)
sample_color = []
for c in range(result.shape[2]):
print(sample_color) # will return array like [225, 205, 170]
plt.imshow(result, interpolation='nearest')
Perhaps asking this question here wasn't necessary (it has been a while since I've python-ed, and was hoping that some new library had been developed for this since), but I hope this can be a reference for others who have the same goal.
This operation will be performed for every pixel in the image (sometimes millions of times) for thousands of images (scanned pages), which is why I was worried about performance, but thanks to numpy, this code is pretty quick.
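A more compact variant of the same masking idea, as a sketch rather than a drop-in replacement: a boolean circle mask built with np.ogrid selects the pixels directly, so no multiplication or nonzero count is needed. The function name circular_mean_color is hypothetical, and a uniformly coloured synthetic image stands in for the scan:

```python
import numpy as np

def circular_mean_color(img, center_row, center_col, radius):
    """Mean colour of the pixels within `radius` of (center_row, center_col)."""
    # Clip the bounding square of the circle to the image.
    r0, r1 = max(0, center_row - radius), min(img.shape[0], center_row + radius + 1)
    c0, c1 = max(0, center_col - radius), min(img.shape[1], center_col + radius + 1)
    yy, xx = np.ogrid[r0:r1, c0:c1]
    mask = (yy - center_row) ** 2 + (xx - center_col) ** 2 <= radius ** 2
    # Boolean indexing keeps only the pixels inside the circle,
    # giving an array of shape (n_pixels, channels).
    return img[r0:r1, c0:c1][mask].mean(axis=0)

# Synthetic RGB image filled with a single colour.
img = np.zeros((100, 200, 3), dtype=np.uint8)
img[...] = (225, 205, 170)
color = circular_mean_color(img, 50, 100, 24)
```

If the sample is taken around every pixel with a fixed radius, the mask only needs to be built once; the whole operation then amounts to filtering the image with a normalized disk-shaped kernel.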

