I have a simple sum pooling implemented in Keras/TensorFlow by multiplying AveragePooling2D by N*N, so it produces the sum of the elements in an N×N pool; I use 'same' padding so the shape doesn't change:
import numpy as np
import seaborn as sns
import matplotlib.pylab as plt
import tensorflow as tf
from tensorflow.keras.backend import square
#generating the example matrix
def getMatrixByDefinitions(definitions, width, height):
    matrix = np.zeros((width, height))
    for definition in definitions:
        x_cor = definition[1]
        y_cor = definition[0]
        value = definition[2]
        matrix.itemset((x_cor, y_cor), value)
    return matrix

generated = getMatrixByDefinitions(width=32, height=32, definitions=[[7, 16, 1]])

def avg_pool(pool):
    return tf.keras.layers.AveragePooling2D(pool_size=(pool, pool), strides=(1, 1), padding='same')

def summer(pool, tensor):
    return avg_pool(pool)(tensor)*pool*pool

def numpyToTensor(numpy_data):
    numpy_as_array = np.asarray(numpy_data)
    tensor_data = numpy_as_array.reshape(1, numpy_data.shape[1], numpy_data.shape[1], 1)
    return tensor_data

data = numpyToTensor(generated)
pooled_data = summer(11, data)
def printMatrixesToHeatMap(matrixes, title):
    # f = pyplot.figure() # width and height in inches
    matrix_count = len(matrixes)
    width_ratios = [4] * matrix_count + [0.2]

    mergedMatrixes = matrixes[0][0]
    for matrix in matrixes:
        mergedMatrixes = np.concatenate((mergedMatrixes, matrix[0]), axis=0)
    vmin = np.min(mergedMatrixes)
    vmax = np.max(mergedMatrixes)

    fig, axs = plt.subplots(ncols=matrix_count + 1, gridspec_kw=dict(width_ratios=width_ratios))
    fig.set_figheight(20)
    fig.set_figwidth(20 * matrix_count + 5)

    axis_id = 0
    for matrix in matrixes:
        sns.heatmap(matrix[0], annot=True, cbar=False, ax=axs[axis_id], vmin=vmin, vmax=vmax)
        axs[axis_id].set_title(matrix[1])
        axis_id = axis_id + 1
    #fig.colorbar(axs[1].collections[0], cax=axs[matrix_count])
    fig.savefig(title + ".pdf", bbox_inches='tight')

def tensorToNumpy(tensor):
    width = tensor.get_shape()[1]
    height = tensor.get_shape()[2]
    output = tf.reshape(tensor, [width, height])
    #output = output.eval(session=tf.compat.v1.Session())
    output = output.numpy()
    return np.array(output)

printMatrixesToHeatMap([[tensorToNumpy(pooled_data), "Pooled data"]], "name")
After testing it on a very simple 2D array I found out that it does not do what I expect (original and pooled data):
You can see that the single one, sum-pooled via average pooling, ends up with sums greater than the real sum of 1 near the borders (values like 1.1, 1.2, 1.4). In this case max pooling could be used instead, but the real data are more complex and we need the sum. This suggests that the average near the borders is computed only over the original elements, not over the padded data. Or is this a misunderstanding of padding on my side? I need to have ones at the indices where the 1.1, 1.2 and 1.4 appear. Why does this happen, and how can I solve it?
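A minimal 1-D sketch of what I think is happening (if the padded zeros were counted in the average, the border outputs here would be about 0.67 instead of 1.0):

import tensorflow as tf

# a row of four ones, shaped as NHWC
x = tf.ones((1, 1, 4, 1))
avg = tf.keras.layers.AveragePooling2D(pool_size=(1, 3), strides=(1, 1), padding='same')(x)
# the border windows contain one padded zero, yet the average stays 1.0,
# so the divisor seems to be the number of un-padded elements only
print(tf.squeeze(avg).numpy())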
Note that I do not want to manually set the correct sums afterwards, so I am looking for a way to achieve this in the Keras pooling itself.
It seems to be a problem with the "SAME" padding algorithm. Unfortunately, there is no way of specifying an explicit padding to the avg_pool2d op. It is possible to manually pad the input with tf.pad, though. Here is a really naive approach to padding that will work with odd-shaped pooling filters and a stride size of 1:
generated = getMatrixByDefinitions(width=32, height=32, definitions =[[7,16,1]])
gen_nhwc = tf.constant(generated[np.newaxis,:,:,np.newaxis])
pool = 11
paddings = [[0,0],[pool//2,pool//2],[pool//2,pool//2],[0,0]]
gen_pad = tf.pad(gen_nhwc, paddings, "CONSTANT")
res = tf.nn.avg_pool2d(gen_pad, (pool,pool), (1,1),"VALID")*pool*pool
result = np.squeeze(res.numpy())
printMatrixesToHeatMap([[generated, "input"],[result, "output"]], "name")
This results in the following images:
Edit: I created an issue on GitHub regarding the problem.
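If you want this packaged as a layer (the question asked for something usable as Keras pooling itself), the same idea can be wrapped roughly like this — again just a sketch, only for odd pool sizes and a stride of 1:

class SumPooling2D(tf.keras.layers.Layer):
    """Sum pooling via explicit zero padding + 'valid' average pooling,
    so border sums are not inflated. Odd pool sizes and stride 1 only."""
    def __init__(self, pool, **kwargs):
        super().__init__(**kwargs)
        self.pool = pool
        self.avg = tf.keras.layers.AveragePooling2D(pool_size=(pool, pool),
                                                    strides=(1, 1), padding='valid')

    def call(self, inputs):
        half = self.pool // 2
        padded = tf.pad(inputs, [[0, 0], [half, half], [half, half], [0, 0]], "CONSTANT")
        return self.avg(padded) * self.pool * self.pool

# e.g. summed = SumPooling2D(11)(data)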
I'm struggling to create a data generator in PyTorch to extract 2D images from many 3D cubes saved in .dat format.
There is a total of 200 3D cubes, each having a 128*128*128 shape. Now I want to extract 2D images from all of these cubes along the length and breadth.
For example, a is a cube of size 128*128*128.
So I want to extract all 2D images along the length, i.e. [:, i, :], which gives me 128 2D images along the length, and similarly I want to extract along the width, i.e. [:, :, i], which gives me another 128 2D images along the width. So I get a total of 256 2D images from one 3D cube, and I want to repeat this whole process for all 200 cubes, thereby giving me 51200 2D images.
So far I've tried a very basic implementation which works fine but takes approximately 10 minutes to run. I would like help creating a more optimal implementation keeping in mind time and space complexity. Right now my approach has a time complexity of O(n²); can we reduce it further?
I'm providing the current implementation below:
from os.path import join as pjoin
import torch
import numpy as np
import os
from tqdm import tqdm
from torch.utils import data
class DataGenerator(data.Dataset):
    def __init__(self, is_transform=True, augmentations=None):
        self.is_transform = is_transform
        self.augmentations = augmentations
        self.dim = (128, 128, 128)

        seismicSections = []  # Input
        faultSections = []    # Ground Truth

        for fileName in tqdm(os.listdir(pjoin('train', 'seis')), total=len(os.listdir(pjoin('train', 'seis')))):
            # .dat file contains the unrolled cube, so we need to reshape it
            unrolledVolSeismic = np.fromfile(pjoin('train', 'seis', fileName), dtype=np.single)
            # transpose the axes to get the height axis at axis=0, length at axis=1 and width at axis=2
            reshapedVolSeismic = np.transpose(unrolledVolSeismic.reshape(self.dim))

            unrolledVolFault = np.fromfile(pjoin('train', 'fault', fileName), dtype=np.single)
            reshapedVolFault = np.transpose(unrolledVolFault.reshape(self.dim))

            for idx in range(reshapedVolSeismic.shape[2]):
                seismicSections.append(reshapedVolSeismic[:, :, idx])
                faultSections.append(reshapedVolFault[:, :, idx])

            for idx in range(reshapedVolSeismic.shape[1]):
                seismicSections.append(reshapedVolSeismic[:, idx, :])
                faultSections.append(reshapedVolFault[:, idx, :])

        self.seismicSections = seismicSections
        self.faultSections = faultSections

    def __len__(self):
        return len(self.seismicSections)

    def __getitem__(self, index):
        X = self.seismicSections[index]
        Y = self.faultSections[index]
        return X, Y
Please Help!!!
Why not store only the 3D data in memory, and let the __getitem__ method "slice" it on the fly?
class CachedVolumeDataset(Dataset):
    def __init__(self, ...):
        super(...)
        self._volumes_x = ...  # a list of 200 128x128x128 volumes
        self._volumes_y = ...  # a list of 200 128x128x128 volumes

    def __len__(self):
        return len(self._volumes_x) * (128 + 128)

    def __getitem__(self, index):
        # extract volume index from general index:
        vidx = index // (128 + 128)
        # extract slice index
        sidx = index % (128 + 128)
        if sidx < 128:
            # first dim
            x = self._volumes_x[vidx][:, :, sidx]
            y = self._volumes_y[vidx][:, :, sidx]
        else:
            sidx -= 128
            # second dim
            x = self._volumes_x[vidx][:, sidx, :]
            y = self._volumes_y[vidx][:, sidx, :]
        return torch.squeeze(x), torch.squeeze(y)
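To fill the cached volumes you could reuse the loading code from the question, roughly like this (a sketch: the reshape/transpose follows the question's __init__, the DataLoader settings are placeholders):

import os
import numpy as np
import torch
from os.path import join as pjoin
from torch.utils.data import DataLoader

volumes_x, volumes_y = [], []
for fileName in os.listdir(pjoin('train', 'seis')):
    seis = np.fromfile(pjoin('train', 'seis', fileName), dtype=np.single).reshape(128, 128, 128)
    fault = np.fromfile(pjoin('train', 'fault', fileName), dtype=np.single).reshape(128, 128, 128)
    volumes_x.append(torch.from_numpy(np.transpose(seis)))
    volumes_y.append(torch.from_numpy(np.transpose(fault)))

# dataset = CachedVolumeDataset(...)   # with _volumes_x / _volumes_y set to the lists above
# loader = DataLoader(dataset, batch_size=16, shuffle=True)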
I am trying to feed a very large image into Triton server. I need to divide the input image into patches and feed the patches one by one into a tensorflow model. The image has a variable size, so the number of patches N is variable for each call.
I think a Triton ensemble model that calls the following steps would do the job:
A python model (pre-process) to create the patches
The segmentation model
Finally another python model (post-process) to merge the output patches into a big output mask
However, for this I would have to write a config.pbtxt file with a 1:N and an N:1 relation, meaning the ensemble scheduler would need to call the 2nd step multiple times and the 3rd step once with the aggregated output.
Is this possible, or do I need to use some other technique?
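For concreteness, this is roughly how I picture the pre-processing python model with Triton's python backend (the tensor names and the patch extraction are placeholders):

import numpy as np
import triton_python_backend_utils as pb_utils

class TritonPythonModel:
    def execute(self, requests):
        responses = []
        for request in requests:
            # full image of variable size, e.g. shape (H, W, 3)
            image = pb_utils.get_input_tensor_by_name(request, "IMAGE").as_numpy()
            # ... split `image` into N patches here ...
            patches = np.stack([image[:256, :256, :]])  # placeholder: a single dummy patch
            out = pb_utils.Tensor("PATCHES", patches.astype(np.float32))
            responses.append(pb_utils.InferenceResponse(output_tensors=[out]))
        return responses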
Disclaimer
The answer below isn't the actual solution to the above question. I misunderstood the query. But I'm leaving this response in case future readers find it useful.
Input
import cv2
import matplotlib.pyplot as plt
input_img = cv2.imread('/content/2.jpeg')
print(input_img.shape) # (719, 640, 3)
plt.imshow(input_img)
Slice and Stitch
The following functionality is adapted from here; more details and discussion can be found here. Apart from the original code, we bring the necessary functionality together in a single class (ImageSliceRejoin).
# ref: https://github.com/idealo/image-super-resolution
class ImageSliceRejoin:
    def pad_patch(self, image_patch, padding_size, channel_last=True):
        """ Pads image_patch with padding_size edge values. """
        if channel_last:
            return np.pad(
                image_patch,
                ((padding_size, padding_size), (padding_size, padding_size), (0, 0)),
                'edge',
            )
        else:
            return np.pad(
                image_patch,
                ((0, 0), (padding_size, padding_size), (padding_size, padding_size)),
                'edge',
            )

    # function to split the image into patches
    def split_image_into_overlapping_patches(self, image_array, patch_size, padding_size=2):
        """ Splits the image into partially overlapping patches.

        The patches overlap by padding_size pixels.
        Pads the image twice:
            - first to have a size multiple of the patch size,
            - then to have equal padding at the borders.

        Args:
            image_array: numpy array of the input image.
            patch_size: size of the patches from the original image (without padding).
            padding_size: size of the overlapping area.
        """
        xmax, ymax, _ = image_array.shape
        x_remainder = xmax % patch_size
        y_remainder = ymax % patch_size

        # modulo here is to avoid extending of patch_size instead of 0
        x_extend = (patch_size - x_remainder) % patch_size
        y_extend = (patch_size - y_remainder) % patch_size

        # make sure the image is divisible into regular patches
        extended_image = np.pad(image_array, ((0, x_extend), (0, y_extend), (0, 0)), 'edge')

        # add padding around the image to simplify computations
        padded_image = self.pad_patch(extended_image, padding_size, channel_last=True)

        xmax, ymax, _ = padded_image.shape
        patches = []

        x_lefts = range(padding_size, xmax - padding_size, patch_size)
        y_tops = range(padding_size, ymax - padding_size, patch_size)

        for x in x_lefts:
            for y in y_tops:
                x_left = x - padding_size
                y_top = y - padding_size
                x_right = x + patch_size + padding_size
                y_bottom = y + patch_size + padding_size
                patch = padded_image[x_left:x_right, y_top:y_bottom, :]
                patches.append(patch)

        return np.array(patches), padded_image.shape

    # joining the patches
    def stich_together(self, patches, padded_image_shape, target_shape, padding_size=4):
        """ Reconstructs the image from overlapping patches.

        After scaling, shapes and padding should be scaled too.

        Args:
            patches: patches obtained with split_image_into_overlapping_patches
            padded_image_shape: shape of the padded image constructed in split_image_into_overlapping_patches
            target_shape: shape of the final image
            padding_size: size of the overlapping area.
        """
        xmax, ymax, _ = padded_image_shape

        # unpad patches
        patches = patches[:, padding_size:-padding_size, padding_size:-padding_size, :]
        patch_size = patches.shape[1]
        n_patches_per_row = ymax // patch_size
        complete_image = np.zeros((xmax, ymax, 3))

        row = -1
        col = 0
        for i in range(len(patches)):
            if i % n_patches_per_row == 0:
                row += 1
                col = 0
            complete_image[
                row * patch_size: (row + 1) * patch_size, col * patch_size: (col + 1) * patch_size, :
            ] = patches[i]
            col += 1

        return complete_image[0: target_shape[0], 0: target_shape[1], :]
Initiate Slicing
import numpy as np
isr = ImageSliceRejoin()
padding_size = 1
patches, p_shape = isr.split_image_into_overlapping_patches(
input_img,
patch_size=220,
padding_size=padding_size
)
patches.shape, p_shape, input_img.shape
((12, 222, 222, 3), (882, 662, 3), (719, 640, 3))
Verify
n = int(np.ceil(patches.shape[0] / 2))
plt.figure(figsize=(20, 20))
patch_size = patches.shape[1]

for i in range(patches.shape[0]):
    patch = patches[i]
    ax = plt.subplot(n, n, i + 1)
    patch_img = np.reshape(patch, (patch_size, patch_size, 3))
    plt.imshow(patch_img.astype("uint8"))
    plt.axis("off")
Inference
I'm using the Image-Super-Resolution model for demonstration.
# import model
from ISR.models import RDN
model = RDN(weights='psnr-small')
# number of patches that will pass to model for inference:
# here, batch_size < len(patches)
batch_size = 2
for i in range(0, len(patches), batch_size):
    # get some patches
    batch = patches[i: i + batch_size]

    # pass them to the model to get the output patches
    batch = model.model.predict(batch)

    # save the output patches
    if i == 0:
        collect = batch
    else:
        collect = np.append(collect, batch, axis=0)
Now collect holds the model output for each patch.
patches.shape, collect.shape
((12, 222, 222, 3), (12, 444, 444, 3))
Rejoin Patches
scale = 2
padded_size_scaled = tuple(np.multiply(p_shape[0:2], scale)) + (3,)
scaled_image_shape = tuple(np.multiply(input_img.shape[0:2], scale)) + (3,)
sr_img = isr.stich_together(
collect,
padded_image_shape=padded_size_scaled,
target_shape=scaled_image_shape,
padding_size=padding_size * scale,
)
Verify
print(input_img.shape, sr_img.shape)
# (719, 640, 3) (1438, 1280, 3)
fig, ax = plt.subplots(1,2)
fig.set_size_inches(18.5, 10.5)
ax[0].imshow(input_img)
ax[1].imshow(sr_img.astype('uint8'))
I've been trying to create a 2D map of blobs of matter (a Gaussian random field) using a variance I have calculated. This variance is a 2D array. I have tried using numpy.random.normal since it allows a 2D input for the variance, but it doesn't really create a map with the trend I expect from the input parameters. One of the important input constants, lambda_c, should manifest itself as the physical size (diameter) of the blobs. However, when I change lambda_c, the size of the blobs barely changes, if at all. For example, if I set lambda_c = 40 parsecs, the map should contain blobs that are 40 parsecs in diameter. A MWE to produce the map using my variance:
import numpy as np
import random
import matplotlib.pyplot as plt
from matplotlib.pyplot import show, plot
import scipy.integrate as integrate
from scipy.interpolate import RectBivariateSpline
n = 300
c = 3e8
G = 6.67e-11
M_sun = 1.989e30
pc = 3.086e16 # parsec
Dds = 1097.07889283e6*pc
Ds = 1726.62069147e6*pc
Dd = 1259e6*pc
FOV_arcsec_original = 5.
FOV_arcmin = FOV_arcsec_original/60.
pix2rad = ((FOV_arcmin/60.)/float(n))*np.pi/180.
rad2pix = 1./pix2rad
x_pix = np.linspace(-FOV_arcsec_original/2/pix2rad/180.*np.pi/3600.,FOV_arcsec_original/2/pix2rad/180.*np.pi/3600.,n)
y_pix = np.linspace(-FOV_arcsec_original/2/pix2rad/180.*np.pi/3600.,FOV_arcsec_original/2/pix2rad/180.*np.pi/3600.,n)
X_pix,Y_pix = np.meshgrid(x_pix,y_pix)
conc = 10.
M = 1e13*M_sun
r_s = 18*1e3*pc
lambda_c = 40*pc ### The important parameter that doesn't seem to manifest itself in the map when changed
rho_s = M/((4*np.pi*r_s**3)*(np.log(1+conc) - (conc/(1+conc))))
sigma_crit = (c**2*Ds)/(4*np.pi*G*Dd*Dds)
k_s = rho_s*r_s/sigma_crit
theta_s = r_s/Dd
Renorm = (4*G/c**2)*(Dds/(Dd*Ds))
#### Here I just interpolate and zoom into my field of view to get better resolutions
A = np.sqrt(X_pix**2 + Y_pix**2)*pix2rad/theta_s
A_1 = A[100:200,0:100]
n_x = n_y = 100
FOV_arcsec_x = FOV_arcsec_original*(100./300)
FOV_arcmin_x = FOV_arcsec_x/60.
pix2rad_x = ((FOV_arcmin_x/60.)/float(n_x))*np.pi/180.
rad2pix_x = 1./pix2rad_x
FOV_arcsec_y = FOV_arcsec_original*(100./300)
FOV_arcmin_y = FOV_arcsec_y/60.
pix2rad_y = ((FOV_arcmin_y/60.)/float(n_y))*np.pi/180.
rad2pix_y = 1./pix2rad_y
x1 = np.linspace(-FOV_arcsec_x/2/pix2rad_x/180.*np.pi/3600.,FOV_arcsec_x/2/pix2rad_x/180.*np.pi/3600.,n_x)
y1 = np.linspace(-FOV_arcsec_y/2/pix2rad_y/180.*np.pi/3600.,FOV_arcsec_y/2/pix2rad_y/180.*np.pi/3600.,n_y)
X1,Y1 = np.meshgrid(x1,y1)
n_x_2 = 500
n_y_2 = 500
x2 = np.linspace(-FOV_arcsec_x/2/pix2rad_x/180.*np.pi/3600.,FOV_arcsec_x/2/pix2rad_x/180.*np.pi/3600.,n_x_2)
y2 = np.linspace(-FOV_arcsec_y/2/pix2rad_y/180.*np.pi/3600.,FOV_arcsec_y/2/pix2rad_y/180.*np.pi/3600.,n_y_2)
X2,Y2 = np.meshgrid(x2,y2)
interp_spline = RectBivariateSpline(y1,x1,A_1)
A_2 = interp_spline(y2,x2)
A_3 = A_2[50:450,0:400]
n_x_3 = n_y_3 = 400
FOV_arcsec_x = FOV_arcsec_original*(100./300)*400./500.
FOV_arcmin_x = FOV_arcsec_x/60.
pix2rad_x = ((FOV_arcmin_x/60.)/float(n_x_3))*np.pi/180.
rad2pix_x = 1./pix2rad_x
FOV_arcsec_y = FOV_arcsec_original*(100./300)*400./500.
FOV_arcmin_y = FOV_arcsec_y/60.
pix2rad_y = ((FOV_arcmin_y/60.)/float(n_y_3))*np.pi/180.
rad2pix_y = 1./pix2rad_y
x3 = np.linspace(-FOV_arcsec_x/2/pix2rad_x/180.*np.pi/3600.,FOV_arcsec_x/2/pix2rad_x/180.*np.pi/3600.,n_x_3)
y3 = np.linspace(-FOV_arcsec_y/2/pix2rad_y/180.*np.pi/3600.,FOV_arcsec_y/2/pix2rad_y/180.*np.pi/3600.,n_y_3)
X3,Y3 = np.meshgrid(x3,y3)
n_x_4 = 1000
n_y_4 = 1000
x4 = np.linspace(-FOV_arcsec_x/2/pix2rad_x/180.*np.pi/3600.,FOV_arcsec_x/2/pix2rad_x/180.*np.pi/3600.,n_x_4)
y4 = np.linspace(-FOV_arcsec_y/2/pix2rad_y/180.*np.pi/3600.,FOV_arcsec_y/2/pix2rad_y/180.*np.pi/3600.,n_y_4)
X4,Y4 = np.meshgrid(x4,y4)
interp_spline = RectBivariateSpline(y3,x3,A_3)
A_4 = interp_spline(y4,x4)
############### Function to calculate variance
variance = np.zeros((len(A_4),len(A_4)))
def variance_fluctuations(x):
    for i in range(len(x)):
        for j in range(len(x)):
            if x[j][i] < 1.:
                variance[j][i] = (k_s**2)*(lambda_c/r_s)*((np.pi/x[j][i]) - (1./(x[j][i]**2 -1)**3.)*(((6.*x[j][i]**4. - 17.*x[j][i]**2. + 26)/3.)+ (((2.*x[j][i]**6. - 7.*x[j][i]**4. + 8.*x[j][i]**2. - 8)*np.arccosh(1./x[j][i]))/(np.sqrt(1-x[j][i]**2.)))))
            elif x[j][i] > 1.:
                variance[j][i] = (k_s**2)*(lambda_c/r_s)*((np.pi/x[j][i]) - (1./(x[j][i]**2 -1)**3.)*(((6.*x[j][i]**4. - 17.*x[j][i]**2. + 26)/3.)+ (((2.*x[j][i]**6. - 7.*x[j][i]**4. + 8.*x[j][i]**2. - 8)*np.arccos(1./x[j][i]))/(np.sqrt(x[j][i]**2.-1)))))
variance_fluctuations(A_4)
#### Creating the map
mean = 0
delta_kappa = np.random.normal(0,variance,A_4.shape)
xfinal = np.linspace(-FOV_arcsec_x*np.pi/180./3600.*Dd/pc/2,FOV_arcsec_x*np.pi/180./3600.*Dd/pc/2,1000)
yfinal = np.linspace(-FOV_arcsec_x*np.pi/180./3600.*Dd/pc/2,FOV_arcsec_x*np.pi/180./3600.*Dd/pc/2,1000)
Xfinal, Yfinal = np.meshgrid(xfinal,yfinal)
plt.contourf(Xfinal,Yfinal,delta_kappa,100)
plt.show()
The map looks like this, with the density of blobs increasing towards the right. However, the size of the blobs doesn't change, and the map looks virtually the same whether I use lambda_c = 40*pc or lambda_c = 400*pc.
I'm wondering whether the np.random.normal function is really doing what I expect it to do. I feel like the pixel scale of the map and the way samples are drawn have no link to the size of the blobs. Maybe there is a better way to create the map using the variance; I would appreciate any insight.
I expect the map to look something like this, with the blob sizes changing based on the input parameters for my variance:
This is quite a well-visited problem in (surprise, surprise) astronomy and cosmology.
You could use lenstools: https://lenstools.readthedocs.io/en/latest/examples/gaussian_random_field.html
You could also try here:
https://andrewwalker.github.io/statefultransitions/post/gaussian-fields
Not to mention:
https://github.com/bsciolla/gaussian-random-fields
I am not reproducing code here because all credit goes to the above authors. However, they all did just come straight out of a Google search :/
Easiest of all is probably a python module FyeldGenerator, apparently designed for this exact purpose:
https://github.com/cphyc/FyeldGenerator
So (adapted from github example):
pip install FyeldGenerator
from FyeldGenerator import generate_field
from matplotlib import use
use('Agg')
import matplotlib.pyplot as plt
import numpy as np
plt.figure()

# Helper that generates a power-law power spectrum
def Pkgen(n):
    def Pk(k):
        return np.power(k, -n)
    return Pk

# Draw samples from a normal distribution
def distrib(shape):
    a = np.random.normal(loc=0, scale=1, size=shape)
    b = np.random.normal(loc=0, scale=1, size=shape)
    return a + 1j * b

shape = (512, 512)
field = generate_field(distrib, Pkgen(2), shape)

plt.imshow(field, cmap='jet')
plt.savefig('field.png', dpi=400)
plt.close()
This gives:
Looks pretty straightforward to me :)
PS: FoV implied a telescope observation of the gaussian random field :)
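If the goal is to make the blob size follow something like lambda_c, the spectrum passed to generate_field can carry a characteristic scale — this is just a sketch of the idea, not something taken from the FyeldGenerator docs:

import numpy as np
from FyeldGenerator import generate_field

# power-law spectrum with a high-k cutoff; `scale` is in pixels and acts
# roughly like a correlation length, so a larger `scale` should give larger blobs
def Pkgen_with_scale(n, scale):
    def Pk(k):
        return np.power(k, -n) * np.exp(-(k * scale) ** 2)
    return Pk

def distrib(shape):
    a = np.random.normal(loc=0, scale=1, size=shape)
    b = np.random.normal(loc=0, scale=1, size=shape)
    return a + 1j * b

shape = (512, 512)
field_small_blobs = generate_field(distrib, Pkgen_with_scale(2, 2), shape)
field_large_blobs = generate_field(distrib, Pkgen_with_scale(2, 10), shape)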
A completely different and much quicker way may be to simply blur the delta_kappa array with a Gaussian filter. Try adjusting the sigma parameter to alter the blob size.
from scipy.ndimage import gaussian_filter

dk_gf = gaussian_filter(delta_kappa, sigma=20)
Xfinal, Yfinal = np.meshgrid(xfinal, yfinal)
plt.contourf(Xfinal, Yfinal, dk_gf, 100, cmap='jet')
plt.show()
This is the image with sigma=20:
This is the image with sigma=2.5:
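If you want sigma tied to lambda_c rather than picked by eye, one rough mapping (my assumption here: treat the blob diameter as the Gaussian FWHM) would be:

# xfinal is in parsecs, so the pixel size of the final map is:
dx_pc = xfinal[1] - xfinal[0]
# blob diameter ~ FWHM = 2.355 * sigma  =>  sigma in pixels:
sigma_pix = (lambda_c / pc) / 2.355 / dx_pc
dk_gf = gaussian_filter(delta_kappa, sigma=sigma_pix)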
ThunderFlash, try this code to draw the map:
# function to produce blobs:
from scipy.stats import multivariate_normal
def blob(positions, mean=(0, 0), var=1):
    cov = [[var, 0], [0, var]]
    return multivariate_normal(mean, cov).pdf(positions)
"""
now prepare for blobs generation.
note that I use less dense grid to pick blobs centers (regulated by `step`)
this makes blobs more pronounced and saves calculation time.
use this part instead of your code section below comment #### Creating the map
"""
delta_kappa = np.random.normal(0,variance,A_4.shape) # same
step = 10 #
dk2 = delta_kappa[::step,::step] # taking every 10th element
x2, y2 = xfinal[::step],yfinal[::step]
field = np.dstack((Xfinal,Yfinal))
print (field.shape, dk2.shape, x2.shape, y2.shape)
>> (1000, 1000, 2), (100, 100), (100,), (100,)
result = np.zeros(field.shape[:2])
for x in range(len(x2)):
    for y in range(len(y2)):
        res2 = blob(field, mean=(x2[x], y2[y]), var=10000) * dk2[x, y]
        result += res2
# the cycle above took over 20 minutes on Ryzen 2700X. It could be accelerated by vectorization presumably.
plt.contourf(Xfinal,Yfinal,result,100)
plt.show()
You may want to play with the var parameter in blob() to smooth the image, and with step to make it more compressed.
Here is the image that I got using your code (somehow the axes are flipped and the denser areas are at the top):
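About the vectorization remark in the comment above: the blob sum is essentially a convolution of the sub-sampled fluctuations with a Gaussian, so something along these lines should give a very similar map in seconds (a sketch; the normalization and the row/column orientation may differ from the loop version by a constant factor and a transpose):

from scipy.ndimage import gaussian_filter

# put the sub-sampled fluctuations back on the full grid as sparse impulses
sparse = np.zeros_like(delta_kappa)
sparse[::step, ::step] = dk2

dx = xfinal[1] - xfinal[0]              # grid spacing in map units
sigma_pix = np.sqrt(10000.) / dx        # var=10000 -> sigma=100 in map units, converted to pixels

result_fast = gaussian_filter(sparse, sigma=sigma_pix)
plt.contourf(Xfinal, Yfinal, result_fast, 100)
plt.show()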
I found this very helpful blog on the implementation of self-organizing maps using TensorFlow. I tried running the scikit-learn iris data set on it and got the result shown in the image below. To see how the SOM evolves I would like to animate my graph, and this is where I got stuck. I found a basic example for animation:
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.animation as animation
fig2 = plt.figure()
x = np.arange(-9, 10)
y = np.arange(-9, 10).reshape(-1, 1)
base = np.hypot(x, y)
ims = []
for add in np.arange(15):
    ims.append((plt.pcolor(x, y, base + add, norm=plt.Normalize(0, 30)),))

im_ani = animation.ArtistAnimation(fig2, ims, interval=50, repeat_delay=3000, blit=True)
plt.show()
To animate I must edit the train function of som.py because the training for loop is encapsulated there. It looks like this:
def train(self, input_vects):
    """
    Trains the SOM.
    'input_vects' should be an iterable of 1-D NumPy arrays with
    dimensionality as provided during initialization of this SOM.
    Current weightage vectors for all neurons (initially random) are
    taken as starting conditions for training.
    """
    #fig2 = plt.figure()

    #Training iterations
    for iter_no in tqdm(range(self._n_iterations)):
        #Train with each vector one by one
        for input_vect in input_vects:
            self._sess.run(self._training_op,
                           feed_dict={self._vect_input: input_vect,
                                      self._iter_input: iter_no})

    #Store a centroid grid for easy retrieval later on
    centroid_grid = [[] for i in range(self._m)]
    self._weightages = list(self._sess.run(self._weightage_vects))
    self._locations = list(self._sess.run(self._location_vects))
    for i, loc in enumerate(self._locations):
        centroid_grid[loc[0]].append(self._weightages[i])
    #im_ani = animation.ArtistAnimation(fig2, centroid_grid, interval=50, repeat_delay=3000, blit=True)
    self._centroid_grid = centroid_grid

    self._trained = True
    #plt.show()
The comments are my attempt to implement the animation, but it doesn't work, because in the basic example the ims list holds matplotlib artists, whereas in the train function the list is a 4-D NumPy array.
To sum it up: how can I animate my plot? Thanks for your help in advance.
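A rough sketch of the direction I am thinking of — building one imshow artist per iteration from a snapshot of the centroid grid, so that ArtistAnimation gets artists rather than raw arrays (untested, and it assumes the grid can be rendered by imshow as in my usage.py below):

# hypothetical body for train(), replacing the commented-out lines
ims = []
fig2 = plt.figure()
for iter_no in tqdm(range(self._n_iterations)):
    for input_vect in input_vects:
        self._sess.run(self._training_op,
                       feed_dict={self._vect_input: input_vect,
                                  self._iter_input: iter_no})
    # snapshot of the current centroid grid
    grid = [[] for _ in range(self._m)]
    weightages = list(self._sess.run(self._weightage_vects))
    locations = list(self._sess.run(self._location_vects))
    for i, loc in enumerate(locations):
        grid[loc[0]].append(weightages[i])
    frame = np.array(grid)                      # shape (m, n, dim)
    ims.append((plt.imshow(frame, animated=True),))
im_ani = animation.ArtistAnimation(fig2, ims, interval=50, repeat_delay=3000, blit=True)
plt.show()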
Here is my full code:
som.py
import tensorflow as tf
import numpy as np
from tqdm import tqdm
import matplotlib.animation as animation
from matplotlib import pyplot as plt
import time
class SOM(object):
    """
    2-D Self-Organizing Map with Gaussian Neighbourhood function
    and linearly decreasing learning rate.
    """

    #To check if the SOM has been trained
    _trained = False

    def __init__(self, m, n, dim, n_iterations=100, alpha=None, sigma=None):
        """
        Initializes all necessary components of the TensorFlow
        Graph.
        m X n are the dimensions of the SOM. 'n_iterations' should
        be an integer denoting the number of iterations undergone
        while training.
        'dim' is the dimensionality of the training inputs.
        'alpha' is a number denoting the initial time(iteration no)-based
        learning rate. Default value is 0.3
        'sigma' is the initial neighbourhood value, denoting
        the radius of influence of the BMU while training. By default, it's
        taken to be half of max(m, n).
        """

        #Assign required variables first
        self._m = m
        self._n = n
        if alpha is None:
            alpha = 0.3
        else:
            alpha = float(alpha)
        if sigma is None:
            sigma = max(m, n) / 2.0
        else:
            sigma = float(sigma)
        self._n_iterations = abs(int(n_iterations))

        ##INITIALIZE GRAPH
        self._graph = tf.Graph()

        ##POPULATE GRAPH WITH NECESSARY COMPONENTS
        with self._graph.as_default():

            ##VARIABLES AND CONSTANT OPS FOR DATA STORAGE

            #Randomly initialized weightage vectors for all neurons,
            #stored together as a matrix Variable of size [m*n, dim]
            self._weightage_vects = tf.Variable(tf.random_normal(
                [m*n, dim]))

            #Matrix of size [m*n, 2] for SOM grid locations
            #of neurons
            self._location_vects = tf.constant(np.array(
                list(self._neuron_locations(m, n))))

            ##PLACEHOLDERS FOR TRAINING INPUTS
            #We need to assign them as attributes to self, since they
            #will be fed in during training

            #The training vector
            self._vect_input = tf.placeholder("float", [dim])
            #Iteration number
            self._iter_input = tf.placeholder("float")

            ##CONSTRUCT TRAINING OP PIECE BY PIECE
            #Only the final, 'root' training op needs to be assigned as
            #an attribute to self, since all the rest will be executed
            #automatically during training

            #To compute the Best Matching Unit given a vector
            #Basically calculates the Euclidean distance between every
            #neuron's weightage vector and the input, and returns the
            #index of the neuron which gives the least value
            bmu_index = tf.argmin(tf.sqrt(tf.reduce_sum(
                tf.pow(tf.subtract(self._weightage_vects, tf.stack(
                    [self._vect_input for i in range(m*n)])), 2), 1)), 0)

            #This will extract the location of the BMU based on the BMU's
            #index
            slice_input = tf.pad(tf.reshape(bmu_index, [1]),
                                 np.array([[0, 1]]))
            bmu_loc = tf.reshape(tf.slice(self._location_vects, slice_input,
                                          tf.constant(np.array([1, 2]))),
                                 [2])

            #To compute the alpha and sigma values based on iteration
            #number
            learning_rate_op = tf.subtract(1.0, tf.div(self._iter_input,
                                                       self._n_iterations))
            _alpha_op = tf.multiply(alpha, learning_rate_op)
            _sigma_op = tf.multiply(sigma, learning_rate_op)

            #Construct the op that will generate a vector with learning
            #rates for all neurons, based on iteration number and location
            #wrt BMU.
            bmu_distance_squares = tf.reduce_sum(tf.pow(tf.subtract(
                self._location_vects, tf.stack(
                    [bmu_loc for i in range(m*n)])), 2), 1)
            neighbourhood_func = tf.exp(tf.negative(tf.div(tf.cast(
                bmu_distance_squares, "float32"), tf.pow(_sigma_op, 2))))
            learning_rate_op = tf.multiply(_alpha_op, neighbourhood_func)

            #Finally, the op that will use learning_rate_op to update
            #the weightage vectors of all neurons based on a particular
            #input
            learning_rate_multiplier = tf.stack([tf.tile(tf.slice(
                learning_rate_op, np.array([i]), np.array([1])), [dim])
                for i in range(m*n)])
            weightage_delta = tf.multiply(
                learning_rate_multiplier,
                tf.subtract(tf.stack([self._vect_input for i in range(m*n)]),
                            self._weightage_vects))
            new_weightages_op = tf.add(self._weightage_vects,
                                       weightage_delta)
            self._training_op = tf.assign(self._weightage_vects,
                                          new_weightages_op)

            ##INITIALIZE SESSION
            self._sess = tf.Session()

            ##INITIALIZE VARIABLES
            init_op = tf.global_variables_initializer()
            self._sess.run(init_op)

    def _neuron_locations(self, m, n):
        """
        Yields one by one the 2-D locations of the individual neurons
        in the SOM.
        """
        #Nested iterations over both dimensions
        #to generate all 2-D locations in the map
        for i in range(m):
            for j in range(n):
                yield np.array([i, j])

    def train(self, input_vects):
        """
        Trains the SOM.
        'input_vects' should be an iterable of 1-D NumPy arrays with
        dimensionality as provided during initialization of this SOM.
        Current weightage vectors for all neurons (initially random) are
        taken as starting conditions for training.
        """
        #fig2 = plt.figure()

        #Training iterations
        for iter_no in tqdm(range(self._n_iterations)):
            #Train with each vector one by one
            for input_vect in input_vects:
                self._sess.run(self._training_op,
                               feed_dict={self._vect_input: input_vect,
                                          self._iter_input: iter_no})

        #Store a centroid grid for easy retrieval later on
        centroid_grid = [[] for i in range(self._m)]
        self._weightages = list(self._sess.run(self._weightage_vects))
        self._locations = list(self._sess.run(self._location_vects))
        for i, loc in enumerate(self._locations):
            centroid_grid[loc[0]].append(self._weightages[i])
        #im_ani = animation.ArtistAnimation(fig2, centroid_grid, interval=50, repeat_delay=3000, blit=True)
        self._centroid_grid = centroid_grid
        #print(centroid_grid)

        self._trained = True
        #plt.show()

    def get_centroids(self):
        """
        Returns a list of 'm' lists, with each inner list containing
        the 'n' corresponding centroid locations as 1-D NumPy arrays.
        """
        if not self._trained:
            raise ValueError("SOM not trained yet")
        return self._centroid_grid

    def map_vects(self, input_vects):
        """
        Maps each input vector to the relevant neuron in the SOM
        grid.
        'input_vects' should be an iterable of 1-D NumPy arrays with
        dimensionality as provided during initialization of this SOM.
        Returns a list of 1-D NumPy arrays containing (row, column)
        info for each input vector (in the same order), corresponding
        to the mapped neuron.
        """
        if not self._trained:
            raise ValueError("SOM not trained yet")

        to_return = [self._locations[min([i for i in range(len(self._weightages))],
                                         key=lambda x: np.linalg.norm(vect - self._weightages[x]))]
                     for vect in input_vects]

        return to_return
usage.py
from matplotlib import pyplot as plt
import matplotlib.animation as animation
import numpy as np
from som import SOM
from sklearn.datasets import load_iris
data = load_iris()
flower_data = data['data']
normed_flower_data = flower_data / flower_data.max(axis=0)
target_int = data['target']
target_names = data['target_names']
targets = [target_names[i] for i in target_int]
#Train a 20x30 SOM with 400 iterations
som = SOM(25, 25, 4, 100) # My parameters
som.train(normed_flower_data)
#Get output grid
image_grid = som.get_centroids()
#Map colours to their closest neurons
mapped = som.map_vects(normed_flower_data)
#Plot
plt.imshow(image_grid)
plt.title('SOM')
for i, m in enumerate(mapped):
    plt.text(m[1], m[0], targets[i], ha='center', va='center',
             bbox=dict(facecolor='white', alpha=0.5, lw=0))
plt.show()
While testing scipy's zoom function, I found that the results of scaling down an array are similar to the nearest-neighbour algorithm rather than to averaging. This increases noise drastically, and is generally suboptimal for many applications.
Is there an alternative that does not use a nearest-neighbour-like algorithm and will properly average the array when downsizing? While coarsegraining works for integer scaling factors, I would need non-integer scaling factors as well.
Test case: create a random 100*M x 100*M array, for M = 2..20
Downscale the array by the factor of M three ways:
1) by taking the mean in MxM blocks
2) by using scipy's zoom with a scaling factor 1/M
3) by taking the first point within each MxM block
The resulting arrays have the same mean and the same shape, but scipy's array has a variance as high as the nearest-neighbour one. Using a different order for scipy.zoom does not really help.
import scipy.ndimage.interpolation
import numpy as np
import matplotlib.pyplot as plt
mean1, mean2, var1, var2, var3 = [],[],[],[],[]
values = range(1,20) # down-scaling factors
for M in values:
    N = 100  # size of an array
    a = np.random.random((N*M, N*M))  # large array
    b = np.reshape(a, (N, M, N, M))
    b = np.mean(np.mean(b, axis=3), axis=1)
    assert b.shape == (N, N)  # coarsegrained array
    c = scipy.ndimage.interpolation.zoom(a, 1./M, order=3, prefilter=True)
    assert c.shape == b.shape
    d = a[::M, ::M]  # picking one random point within MxM block
    assert b.shape == d.shape
    mean1.append(b.mean())
    mean2.append(c.mean())
    var1.append(b.var())
    var2.append(c.var())
    var3.append(d.var())
plt.plot(values, mean1, label = "Mean coarsegraining")
plt.plot(values, mean2, label = "mean scipy.zoom")
plt.plot(values, var1, label = "Variance coarsegraining")
plt.plot(values, var2, label = "Variance zoom")
plt.plot(values, var3, label = "Variance Neareset neighbor")
plt.xscale("log")
plt.yscale("log")
plt.legend(loc=0)
plt.show()
EDIT: Performance of scipy.ndimage.zoom on a real noisy image is also very poor
The original image is here http://wiz.mit.edu/lena_noisy.png
The code that produced it:
from PIL import Image
import numpy as np
import matplotlib.pyplot as plt
from scipy.ndimage.interpolation import zoom
im = Image.open("/home/magus/Downloads/lena_noisy.png")
im = np.array(im)
plt.subplot(131)
plt.title("Original")
plt.imshow(im, cmap="Greys_r")
plt.subplot(132)
im2 = zoom(im, 1 / 8.)
plt.title("Scipy zoom 8x")
plt.imshow(im2, cmap="Greys_r", interpolation="none")
im.shape = (64, 8, 64, 8)
im3 = np.mean(im, axis=3)
im3 = np.mean(im3, axis=1)
plt.subplot(133)
plt.imshow(im3, cmap="Greys_r", interpolation="none")
plt.title("averaging over 8x8 blocks")
plt.show()
Nobody posted a working answer, so I will post a solution I currently use. Not the most elegant, but works.
import numpy as np
import scipy.ndimage
def zoomArray(inArray, finalShape, sameSum=False,
              zoomFunction=scipy.ndimage.zoom, **zoomKwargs):
    """
    Normally, one can use scipy.ndimage.zoom to do array/image rescaling.
    However, scipy.ndimage.zoom does not coarsegrain images well. It basically
    takes nearest neighbor, rather than averaging all the pixels, when
    coarsegraining arrays. This increases noise. Photoshop doesn't do that, and
    performs some smart interpolation-averaging instead.

    If you were to coarsegrain an array by an integer factor, e.g. 100x100 ->
    25x25, you just need to do block-averaging, that's easy, and it reduces
    noise. But what if you want to coarsegrain 100x100 -> 30x30?

    Then my friend you are in trouble. But this function will help you. This
    function will blow up your 100x100 array to a 120x120 array using
    scipy.ndimage.zoom. Then it will coarsegrain the 120x120 array by
    block-averaging in 4x4 chunks.

    It will do it independently for each dimension, so if you want a 100x100
    array to become a 60x120 array, it will blow up the first and the second
    dimension to 120, and then block-average only the first dimension.

    Parameters
    ----------
    inArray: n-dimensional numpy array (1D also works)
    finalShape: resulting shape of an array
    sameSum: bool, preserve a sum of the array, rather than values.
             by default, values are preserved
    zoomFunction: by default, scipy.ndimage.zoom. You can plug your own.
    zoomKwargs: a dict of options to pass to zoomFunction.
    """
    inArray = np.asarray(inArray, dtype=np.double)
    inShape = inArray.shape
    assert len(inShape) == len(finalShape)
    mults = []  # multipliers for the final coarsegraining
    for i in range(len(inShape)):
        if finalShape[i] < inShape[i]:
            mults.append(int(np.ceil(inShape[i] / finalShape[i])))
        else:
            mults.append(1)
    # shape to which to blow up
    tempShape = tuple([i * j for i, j in zip(finalShape, mults)])

    # stupid zoom doesn't accept the final shape. Carefully crafting the
    # multipliers to make sure that it will work.
    zoomMultipliers = np.array(tempShape) / np.array(inShape) + 0.0000001
    assert zoomMultipliers.min() >= 1

    # applying scipy.ndimage.zoom
    rescaled = zoomFunction(inArray, zoomMultipliers, **zoomKwargs)

    for ind, mult in enumerate(mults):
        if mult != 1:
            sh = list(rescaled.shape)
            assert sh[ind] % mult == 0
            newshape = sh[:ind] + [sh[ind] // mult, mult] + sh[ind + 1:]
            rescaled.shape = newshape
            rescaled = np.mean(rescaled, axis=ind + 1)
    assert rescaled.shape == finalShape

    if sameSum:
        extraSize = np.prod(finalShape) / np.prod(inShape)
        rescaled /= extraSize

    return rescaled


myar = np.arange(16).reshape((4, 4))
rescaled = zoomArray(myar, finalShape=(3, 5))
print(myar)
print(rescaled)
FWIW, I found that order=1 at least preserves the mean a lot better than the default order=3 (as expected, really).
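In case it helps, the order is forwarded straight to scipy.ndimage.zoom through zoomKwargs — a quick usage sketch:

# order=1 (bilinear) is passed on to scipy.ndimage.zoom via **zoomKwargs
a = np.random.random((100, 100))
rescaled_lin = zoomArray(a, finalShape=(30, 30), order=1)
print(a.mean(), rescaled_lin.mean())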