Transform a polynomial fit back into image-space - python

I have an image:
>>> image.shape
(720, 1280)
My image is a binary array of 0s and 255s. I've done some cursory edge detection, and now I want to fit a polynomial through the points.
I want to see these points back on my original image, in image-space.
As far as I can tell, the standard way to do this is to unwrap the x,y image with a reshape, fit on the unwrapped version, then reshape back into the original image.
pts = np.array(image).reshape((-1, 2))
xdata = pts[:,0]
ydata = pts[:,1]
z1 = np.polyfit(xdata, ydata, 1)
z2 = np.polyfit(xdata, ydata, 2) # or quadratic...
f = np.poly1d(z2) # z was undefined; use z2 (or z1 for the linear fit)
Now that I have this function, f, how do I use it to paint my lines in the original image space?
In particular:
What's the right inverse indexing of .reshape() to get back into image space?
This seems a bit cumbersome. Is this reshape/reshape-back dance a common thing in image processing? Is what is described above the standard way to do this, or is there a different approach?
If mapping onto the (720, 1280, 1) array is called image space, what is the reshaped space called? Data space? Linearized space?

You don't need the reshape dance at all. You can combine np.nonzero, np.polyfit and np.polyval instead. It would look like this:
import numpy as np
from matplotlib import pyplot as plt
# in your case, you would read your image
# > cv2.imread(...) # import cv2 before
# but we are going to create an image based on a polynomial
img = np.zeros((400, 400), dtype=np.uint8)
h, w = img.shape
xs = np.arange(150, 250)
ys = (0.01 * xs**2 - 4*xs + 600).astype(int) # np.int was removed from NumPy; plain int works
img[h - ys, xs] = 255
# I could use the values I have, but if you have a binary image,
# you will need to get them, and you could do something like this
ys, xs = np.nonzero(img) # use (255-img) if your image is inverted
ys = h - ys
# compute the coefficients
coefs = np.polyfit(xs, ys, 2)
xx = np.arange(0, w) # already integer-valued, no astype needed
yy = h - np.polyval(coefs, xx)
# filter out the ys that fall outside the image, because we use them as indices
valid = (0 <= yy) & (yy < h)
xx = xx[valid]
yy = yy[valid].astype(int) # convert to int to use as index
# create and display a color image just to viz the result
color_img = np.repeat(img[:, :, np.newaxis], 3, axis=2)
color_img[yy, xx, 0] = 255 # channel 0 is red, because pyplot uses RGB
f, ax = plt.subplots(1, 2)
ax[0].imshow(img, cmap='gray')
ax[0].set_title('Binary')
ax[1].imshow(color_img)
ax[1].set_title('Polynomial')
plt.show()
The results look like this (figure omitted: the binary input titled 'Binary' next to the same image with the fitted curve drawn over it, titled 'Polynomial').
If you print coefs, you will have [ 1.00486819e-02 -4.01966712e+00 6.01540472e+02] which are very close to the [0.01, -4, 600] we chose.
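If you would rather paint the curve straight into the original image instead of building a matplotlib overlay, here is a minimal sketch with OpenCV (my addition, assuming cv2 is available as hinted by the imread comment above, and reusing img, coefs, h and w from the code above):

import cv2
import numpy as np

xx = np.arange(w)
yy = h - np.polyval(coefs, xx)
valid = (0 <= yy) & (yy < h)  # keep only points inside the image
pts = np.column_stack([xx[valid], yy[valid]]).astype(np.int32)
color_img = cv2.cvtColor(img, cv2.COLOR_GRAY2BGR)
cv2.polylines(color_img, [pts.reshape(-1, 1, 2)], isClosed=False,
              color=(0, 0, 255), thickness=2)  # red in BGR order

cv2.polylines draws the whole curve in one call, and the thickness is specified directly in image pixels.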

Related

Python - Interpolating a grid in a given border

For a project I want a grid like this: 5x5. The points should be movable later, but I think I have that covered.
What I want to be able to do now is interpolate, for example, 100x50 points in this grid of 5x5 marker points, and not just linearly: CUBIC along both axes. I can't wrap my head around it. I saw how to lay scipy.interpolate.CubicSpline through, for example, the 5 horizontal markers at the top, but how do I combine it with the vertical warp?
Is there a function to interpolate a grid in a given frame like this?
Use scipy.interpolate.interp2d:
Interpolate over a 2-D grid.
x, y and z are arrays of values used to approximate some function f: z = f(x, y) which returns a scalar value z. This class returns a function whose call method uses spline interpolation to find the value of new points.
So say you have a 5x5 array orig from which you want to generate a 100x50 array res using bicubic interpolation:
# Adapted from https://stackoverflow.com/a/58126099/17595968
from scipy import interpolate as interp
import numpy as np
orig = np.random.randint(0, 100, 25).reshape((5, 5))
W, H = orig.shape
new_W, new_H = (100, 50)
map_range = lambda x: np.linspace(0, 1, x)
f = interp.interp2d(map_range(W), map_range(H), orig, kind="cubic")
res = f(map_range(new_W), map_range(new_H)) # sample inside the fitted [0, 1] domain, not at 0..99
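As an aside: interp2d is deprecated and has been removed in recent SciPy releases. A rough equivalent with RegularGridInterpolator (my sketch, assuming SciPy >= 1.9, where method="cubic" is available) would be:

import numpy as np
from scipy.interpolate import RegularGridInterpolator

orig = np.random.randint(0, 100, (5, 5)).astype(float)
H, W = orig.shape
f = RegularGridInterpolator((np.linspace(0, 1, H), np.linspace(0, 1, W)),
                            orig, method="cubic")
new_H, new_W = 50, 100
# evaluation points inside the same [0, 1] x [0, 1] domain
yy, xx = np.meshgrid(np.linspace(0, 1, new_H), np.linspace(0, 1, new_W),
                     indexing="ij")
res = f(np.stack([yy, xx], axis=-1))  # shape (new_H, new_W)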
Edit:
If what you want is coordinates of 100x50 grid in 5x5 grid you can use numpy.meshgrid:
#From https://stackoverflow.com/a/32208788/17595968
import numpy as np
W, H = 5, 5
new_W, new_H = 100, 50
x_step = W / new_W
y_step = H / new_H
xy = np.mgrid[0:H:y_step, 0:W:x_step].reshape(2, -1).T
xy = xy.reshape(new_H, new_W, 2)
xy
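For orientation (my check, not from the answer), the result holds the (y, x) coordinate of every new grid point inside the old 5x5 frame:

xy.shape    # (50, 100, 2)
xy[0, 0]    # array([0., 0.]): top-left corner of the frame
xy[-1, -1]  # array([4.9 , 4.95]): approaches, but never reaches, (5, 5)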

NumPy FFT producing off-centre output

TL;DR: NumPy's FFT produces non-uniform output where uniform output is expected. I want the output to be a uniform corona.
I am eventually trying to run a Gerchberg-Saxton phase-retrieval algorithm, and I have been trying to make sure that I understand how the FFT works in NumPy. I have used fftshift to create the correct-looking output, but the image does not have uniform intensity afterwards.
My input image is a circle, output should be a coronagraph looking thing from the circle aperture. I am trying to reproduce the results detailed in https://www.osapublishing.org/optica/fulltext.cfm?uri=optica-2-2-147&id=311836#articleSupplMat.
My algorithm that produces the error:
1. Start from the initial image, f
2. Compute FT(f)
3. Multiply by exp(i * phase_mask)
4. Compute IFT(FT(f) * exp(i * phase_mask))
Happy to clear anything up.
import numpy as np
import matplotlib.pyplot as plt
#Create 'pixels' for circle
pixels = 400
edge = np.linspace(-10, 10, pixels)
xv, yv = np.meshgrid(edge, edge)
def circle(x, y, r):
    '''
    x, y : grid coordinates to place the circle on
    r : radius
    Function defines the aperture.
    '''
    x0 = 0
    y0 = 0
    return np.select([(x - x0)**2 + (y - y0)**2 >= r**2,
                      (x - x0)**2 + (y - y0)**2 < r**2],
                     [0, 1.])
#Create input and output images
radius = 4
input_img = circle(xv, yv, radius)
constraint_img = circle(xv, yv, radius) # was xcircle, which is undefined
img = input_img
constraint = 1 - img
max_iter = 10
re,im = np.mgrid[-1:1:400j, -1:1:400j] #Creates grid of values, 400=pixels
mask = 2*np.angle(re + 1j*im) #Gets angle from centre of grid
mask_i = mask
#Initial focal plane field, F. Initial image f.
f = np.sqrt(img)
F = np.fft.fftshift(np.fft.fft2(f)) * np.exp(mask * 1j) #Focal plane field
F_1 = F
am_f = np.abs(F_1) #Initial amplitude
g = np.fft.ifft2(F)
mask = np.angle(F/(F_1+1e-18)) #Final phase mask
recovery = (np.fft.ifft2(F*np.exp(-1j * mask)))
im3 = plt.imshow(np.abs(g)**2, cmap='gray')
plt.title('Recovered image')
plt.tight_layout()
plt.show()
plt.imshow(mask_i)
plt.colorbar()
plt.show()
Your issue is in this bit of code:
pixels = 400
edge = np.linspace(-10, 10, pixels)
as well as this one:
re,im = np.mgrid[-1:1:400j, -1:1:400j]
Because you use fftshift*, you need the origin to be at pixels//2. However, you don't sample the origin at all; it falls between two samples.
* You should really be using ifftshift instead, which moves the origin from pixels//2 to 0. fftshift moves the origin from 0 to pixels//2. For an even number of samples, these two do the same thing though.
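A quick illustration (my example, not from the original post); with an odd number of samples the two shifts stop being interchangeable:

import numpy as np
a = np.arange(5)                      # origin at index 0
np.fft.fftshift(a)                    # array([3, 4, 0, 1, 2]): origin now at 5//2
np.fft.ifftshift(np.fft.fftshift(a))  # array([0, 1, 2, 3, 4]): round trip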
To properly sample the origin, create edge as follows:
edge = np.linspace(-10, 10, pixels, endpoint=False)
We now see that edge[pixels//2] is equal to 0.
For np.mgrid there's no equivalent option. You will have to do this manually by creating one more sample, then deleting the last sample:
re, im = np.mgrid[-1:1:401j, -1:1:401j] # one extra sample per axis
mask = 2*np.angle(re + 1j*im) # gets the angle from the centre of the grid
mask = mask[:-1, :-1] # drop the extra sample again
With these two changes, you will see a symmetric output.

I don't know why the same code works for Julia but not for Mandelbrot

I have the following code that generates a Mandelbrot image. The white space around the image has to be removed.
import numpy as np
import matplotlib.pyplot as plt
from pylab import *
from numpy import NaN
def mandelbrot(C):
    z = 0
    for n in range(1, 10):
        z = z**2 + C
        if abs(z) > 2:
            return n
    return NaN

def plot():
    X = np.arange(-2.0, 1.0, 0.05)
    Y = np.arange(-1.5, 1.5, 0.05)
    pixel = np.zeros((len(Y), len(X)))
    for x_iter, x in enumerate(X):
        for y_iter, y in enumerate(Y):
            pixel[y_iter, x_iter] = mandelbrot(x + 1j * y)
    imshow(pixel, cmap='gray', extent=(X.min(), X.max(), Y.min(), Y.max()))
    return pixel
pixel = mandelbrot(-0.7 + 0.27015j)
plt.axis('off')
plot()
plt.show()
from PIL import Image
min_value = np.nanmin(pixel)
max_value = np.nanmax(pixel)
pixel_int = (255*(pixel-min_value)/(max_value-min_value)).astype(np.uint8)
# sample LUT from matplotlib
lut = (plt.cm.viridis(np.arange(256)) * 255).astype(np.uint8) # CHOOSE COLORMAP HERE viridis, jet, rainbow
pixel_rgb = lut[pixel_int]
# changing NaNs to a chosen color
nan_color = [0,0,0,0] # Transparent NaNs
for i, c in enumerate(nan_color):
    pixel_rgb[:, :, i] = np.where(np.isnan(pixel), c, pixel_rgb[:, :, i])
# apply LUT and display
img = Image.fromarray(pixel_rgb, 'RGBA')
print(pixel)
But it raises IndexError: too many indices for array on this line:
pixel_rgb[:,:,i] = np.where(np.isnan(pixel),c,pixel_rgb[:,:,i])
How can I fix it?
Actually, the same code (same line) for getting rid of the white space around the image worked for Julia instead of Mandelbrot a few weeks ago. The following code, which generates the Julia image, does remove the white space around the image.
import numpy as np
import matplotlib.pyplot as plt
def julia(C):
    X = np.arange(-1.5, 1.5, 0.05)
    Y = np.arange(-1.5, 1.5, 0.05)
    pixel = np.zeros((len(Y), len(X)))
    for x_iter, x in enumerate(X):
        for y_iter, y in enumerate(Y):
            z = x + 1j * y
            intensity = np.nan
            r = np.empty((100, 100))  # Unused at the moment
            for n in range(1, 1024):
                if abs(z) > 2:
                    intensity = n
                    break
                z = z**2 + C
            pixel[y_iter, x_iter] = intensity
            r.fill(intensity)  # Unused at the moment
    # We return the pixel matrix
    return pixel
# Compute Julia set image
pixel = julia(-0.7 + 0.27015j)
# Plotting
print(pixel)
plt.show()
from PIL import Image
min_value = np.nanmin(pixel)
max_value = np.nanmax(pixel)
#want to set all the 255 pixels to removed
pixel_int = (255*(pixel-min_value)/(max_value-min_value)).astype(np.uint8)
# sample LUT from matplotlib,If lut is not None it must be an integer giving the number of entries desired in the lookup table
lut = (plt.cm.viridis(np.arange(256)) * 255).astype(np.uint8) # CHOOSE COLORMAP HERE viridis, jet, rainbow
pixel_rgb = lut[pixel_int]
# changing NaNs to a chosen color
nan_color = [0,0,0,0] # Transparent NaNs
for i, c in enumerate(nan_color):
    pixel_rgb[:, :, i] = np.where(np.isnan(pixel), c, pixel_rgb[:, :, i])
# apply LUT and display
img = Image.fromarray(pixel_rgb, 'RGBA')
img.save('julia.tiff')
Image.open('julia.tiff').show()
print(min_value, max_value)
Now, I just don't know why this code for removing the white space around the image doesn't work for the Mandelbrot. Please help me figure out the problem!
Your direct problem is that in the Julia case, pixel_rgb is a three-dimensional array, whereas in the Mandelbrot case pixel_rgb is a one-dimensional array. You're trying to apply a three-dimensional transform to each of them, and this blows up for the Mandelbrot case because what you're operating on has only a single dimension, not three.
I don't have more time to completely understand and play with your code, but in the Mandelbrot case, the mandelbrot() function only returns a single value, whereas the julia() function returns a 2D array. It is the plot() function that returns a 2D array in the Mandelbrot case. So my quick guess at the change you want to make is to change this:
pixel = mandelbrot(-0.7 + 0.27015j)
plt.axis('off')
plot()
to this:
# pixel = mandelbrot(-0.7 + 0.27015j)
plt.axis('off')
pixel = plot()
This allows the Mandelbrot code to run without crashing. I don't know if it's doing exactly what you want though.
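As a side note (my suggestion, not part of the fix above): once pixel is the 2D array returned by plot(), you can skip the manual LUT/NaN loop entirely by letting a matplotlib colormap handle the invalid values, assuming a matplotlib recent enough to have Colormap.copy():

import numpy as np
import matplotlib.pyplot as plt

cmap = plt.cm.viridis.copy()
cmap.set_bad(color=(0, 0, 0, 0))  # render NaN pixels fully transparent
plt.imsave('mandelbrot.png', np.ma.masked_invalid(pixel), cmap=cmap)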

How to isolate handwritten digit candidates out of an image?

I implemented a neural network that recognizes handwritten digits based on the MNIST dataset. I am using bare python/numpy and now I want to test my own handwritten images against the network. I would however like to automate the cropping and scaling process, so that I can supply an image taken by a smartphone and get a mnist-format numpy array.
So far, I had some success, but I don't really know how to proceed from here.
These are two example images, and below them the respective masked images, which are cropped to half the size of the originals to narrow down the field of search (images omitted):
As you can see, something is happening, but it is not satisfactory. I also would not know how to proceed even if I had segmented and masked the '4' and '7' perfectly. How do I get the precise position, so that I can crop and downscale to 28x28 pixels?
The code that produced those images looks as follows. It basically calculates a spatial histogram along the x and y pixel axes, and then blacks out every row and column that does not contain enough black to be something written.
plot() and hists() are just convenience functions, but they do produce the images you see, so I included them.
import matplotlib.pyplot as plt
from matplotlib.ticker import NullFormatter
import numpy as np
from PIL import Image
def hists(x, y):
    histx, _ = np.histogram(np.arange(len(x)), bins=len(x), weights=x)
    histy, _ = np.histogram(np.arange(len(y)), bins=len(y), weights=y)
    return histx, histy

def plot(ndimg):
    w, h = ndimg.shape
    x = np.mean(ndimg, axis=0)
    x -= np.mean(x)
    y = np.mean(ndimg, axis=1)
    y -= np.mean(y)
    nullfmt = NullFormatter()
    left, width = 0.1, 0.65*h/w if w > h else 0.65
    bottom, height = 0.1, 0.65*w/h if h > w else 0.65
    left_h = left + width + 0.02
    bottom_h = bottom + height + 0.02
    rect_img = [left, bottom, width, height]
    rect_histx = [left, bottom_h, width, 0.2]
    rect_histy = [left_h, bottom, 0.2, height]
    plt.figure(1, figsize=(8, 8))
    axImg = plt.axes(rect_img)
    axHistx = plt.axes(rect_histx)
    axHisty = plt.axes(rect_histy)
    axHistx.xaxis.set_major_formatter(nullfmt)
    axHisty.yaxis.set_major_formatter(nullfmt)
    axImg.imshow(ndimg, cmap=plt.get_cmap('gray'))
    axHistx.hist(np.arange(len(x)), bins=int(0.03*len(x)), weights=x)
    axHisty.hist(np.arange(len(y)), bins=int(0.03*len(y)), weights=y,
                 orientation='horizontal')
    axHistx.set_xlim(axImg.get_xlim())
    axHisty.set_ylim(axImg.get_ylim())
    plt.show()

def mask(ndimg, bw_threshhold=0.6, mask_threshhold=5e-3):
    ndimg = ndimg / np.max(ndimg)
    ndimg = np.where(ndimg < bw_threshhold, 0.0, 1.0)
    #ndimg = np.exp(-np.logaddexp(0, -10*(ndimg-0.6)))
    x = np.mean(ndimg, axis=0)
    #x = x - np.mean(x)
    y = np.mean(ndimg, axis=1)
    #y = y - np.mean(y)
    histx, histy = hists(x, y)
    histx = histx - np.mean(histx)
    histy = histy - np.mean(histy)
    #histx -= (histx.max() + histx.min())/2
    #histy -= (histy.max() + histy.min())/2
    maskx = np.where(histx < mask_threshhold, False, True)
    masky = np.where(histy < mask_threshhold, False, True)
    ndimg[masky, :] = 0.
    ndimg[:, maskx] = 0.
    return ndimg

img = Image.open('C:/Users/maxid/Desktop/Pics/7_1.jpg').convert(mode='L')
w, h = img.size
img = img.crop((0.25*w, 0.25*h, 0.75*w, 0.75*h))
ndimg = np.asarray(img)
plot(ndimg)
ndimg = mask(ndimg)
plot(ndimg)
a, b = hists(np.mean(ndimg, axis=0), np.mean(ndimg, axis=1))
print((a.max() + a.min()) / 2, np.mean(a), np.median(a))
plt.plot(a)
So, I would like to end up with a square image of the handwritten digit, displayed approximately in the middle of the picture. For this it would probably be enough to find the middle of the digit, but I cannot think of an easy and semi-reliable (it does not have to be production grade) way to do this. A rough sketch of that idea follows below.
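A minimal sketch of that centroid idea (my addition, assuming the post-mask() image is a 0/1 float array and that the mask left mostly the digit): take the centre of mass of the remaining bright pixels, crop a square around it with some margin, and downscale with PIL.

import numpy as np
from PIL import Image

def center_crop_digit(ndimg, out_size=28, margin=1.4):
    # coordinates of the remaining 'ink' after mask()
    ys, xs = np.nonzero(ndimg)
    cy, cx = int(ys.mean()), int(xs.mean())  # centroid of the digit
    # half-width of a square window, with some margin around the digit
    half = int(max(ys.max() - ys.min(), xs.max() - xs.min()) * margin / 2)
    crop = ndimg[max(cy - half, 0):cy + half, max(cx - half, 0):cx + half]
    img = Image.fromarray((crop * 255).astype(np.uint8))
    return np.asarray(img.resize((out_size, out_size)))

This is only semi-reliable in exactly your sense: the crop stops being square when the digit sits too close to the border, and stray blobs that survive mask() will pull the centroid off centre.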

efficient calculation of distance to spline curve for all pixels on an image

My problem is that I have a list of 2D parametric splines, and I need a more efficient way of rendering them onto an image grid. Each spline is determined by a series of points, a line radius / thickness (in pixels), and an opacity.
The original implementation I had in mind is similar to the question discussed here, which iterates through every single pixel on the image, finds the minimum distance to the curve, and then marks the pixel if the minimum distance is below the desired radius.
import math
import time

import matplotlib.pyplot as plt
import numpy as np
import scipy.interpolate
from PIL import Image

class GenePainter(object):
    def __init__(self, source):
        self.source = source

    def render(self):
        output = np.zeros(self.source.shape, dtype=np.float32)
        Ny, Nx = output.shape[0], output.shape[1]
        #x = np.array([5, 10, 15, 20, 5, 5])
        #y = np.array([5, 5, 20, 15, 10, 30])
        x = np.array(np.random.random(4) * 128, dtype=np.float32)
        y = np.array(np.random.random(4) * 128, dtype=np.float32)
        sx, sy = spline(x, y, 1000)
        t = time.time()
        for yi in range(Ny):
            for xi in range(Nx):
                d = min_distance(sx, sy, xi, yi)
                if d < 10.:  # radius
                    output[yi, xi, :] = np.array([1, 1, 0, 0.5])
        print(time.time() - t)
        # t = time.time()
        # for _ in range(100):
        #     plt.plot(sx, sy, label='spline', linewidth=10, aa=False,
        #              solid_capstyle="round")
        # print(time.time() - t)
        plt.imshow(output, interpolation='none')
        plt.show()

    def score(self, image):
        return np.linalg.norm(self.source - image, 2)

def spline(x, y, n):
    if x.ndim != 1 or y.ndim != 1 or x.size != y.size:
        raise ValueError('x and y must be 1D arrays of equal size')
    t = np.linspace(0, 1, x.size)
    sx = scipy.interpolate.interp1d(t, x, kind='cubic')
    sy = scipy.interpolate.interp1d(t, y, kind='cubic')
    st = np.linspace(0, 1, n)
    return sx(st), sy(st)

def min_distance(sx, sy, px, py):
    dx = sx - px
    dy = sy - py
    d = dx ** 2 + dy ** 2
    return math.sqrt(np.amin(d))

def read_image(file):
    image_raw = Image.open(file)
    image_raw.load()
    # return np.array(image_raw, dtype=np.float32)
    image_rgb = Image.new('RGB', image_raw.size)
    image_rgb.paste(image_raw, None)
    return np.array(image_rgb, dtype=np.float32)

if __name__ == "__main__":
    # source = read_image('ML129.png')
    source = np.zeros((256, 256, 4), dtype=np.float32)
    p = GenePainter(source)
    p.render()
The problem is that each spline drawing on a 256 x 256 RGBA image takes ~1.5 seconds because of the unoptimized iteration over every pixel, which is too slow for my purposes. I plan to have up to ~250 of these splines on a single image, will process up to ~100 images for a job, and may have up to ~1000 jobs in total, so I'm looking for any optimization that will cut down my computation time.
An alternative that I've looked into is to just draw all the splines onto a PyPlot plot, and then dump the final image to a numpy array that I can use for other calculations, which seems to run a bit faster, ~0.15 seconds to draw 100 splines.
plt.plot(sx, sy, label='spline', linewidth=10, aa=False, solid_capstyle="round")
The problem is that the linewidth parameter seems to correspond to pixels on my screen, rather than the number of pixels on the image (on the 256 x 256 grid), so when I resize the window, the scale of the line changes with the window, but the linewidth stays the same. I would like the curve width to correspond to the pixels on the 256 x 256 grid instead.
I would prefer to solve the issue by finding a way to greatly optimize the first, numerical implementation, rather than the PyPlot drawing. I've also looked into downsampling the image (only computing distances for a subset of pixels rather than for every pixel), but even using only 10% of the pixels, 0.15 seconds per spline is still too slow.
Thank you in advance for any help or advice!
You can use matplotlib to do the drawing. Here is an example:
I create a RendererAgg and an ndarray that shares memory with it, then create a Line2D artist and call its draw() method with the RendererAgg object.
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.backends.backend_agg import RendererAgg
from matplotlib.lines import Line2D

w, h = 256, 256
r = RendererAgg(w, h, 72)
# ndarray view sharing memory with the renderer's RGBA buffer
arr = np.frombuffer(r.buffer_rgba(), np.uint8)
arr.shape = int(r.height), int(r.width), -1

t = np.linspace(0, 2*np.pi, 100)
x = np.sin(2*t) * w*0.45 + w*0.5
y = np.cos(3*t) * h*0.45 + h*0.5
line = Line2D(x, y, linewidth=5, color=(1.0, 0.0, 0.0), alpha=0.3)
line.draw(r)
plt.imsave("test.png", arr)  # was pl.imsave; pl was never imported
Here is the output (figure omitted): a translucent red Lissajous-style curve on a 256x256 canvas. Note that the renderer was created with dpi=72, and matplotlib line widths are in points (1/72 inch), so linewidth=5 here really is 5 pixels on the 256x256 grid, which sidesteps the screen-dependent linewidth problem from the question.
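As an aside (my sketch, not from the answer above): if you would still rather optimize the purely numerical route from the question, a cKDTree built over the sampled spline points answers the nearest-distance query for all pixels at once, removing the per-pixel Python loop:

import numpy as np
from scipy.spatial import cKDTree

h, w = 256, 256
# hypothetical sampled spline, standing in for the output of spline()
sx = np.linspace(20, 230, 1000)
sy = 128 + 80 * np.sin(sx / 25.0)
tree = cKDTree(np.column_stack([sx, sy]))

yy, xx = np.mgrid[0:h, 0:w]
dist, _ = tree.query(np.column_stack([xx.ravel(), yy.ravel()]))
inside = (dist < 10.0).reshape(h, w)  # pixels within the line radius

output = np.zeros((h, w, 4), dtype=np.float32)
output[inside] = (1, 1, 0, 0.5)

On a 256 x 256 grid this runs in milliseconds per spline rather than seconds.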
