I had a case where I needed to fill some holes (missing data) in an image processing application in TensorFlow. The 'holes' are easy to locate because they are zeros and the good data is not zero. I wanted to fill the holes with random data. This is quite easy to do using Python/NumPy, but doing it in TensorFlow requires some work. I came up with a solution and wanted to see if there is a better or more efficient way to do the same thing. I understand that TensorFlow does not yet support the more advanced NumPy-style indexing, but there is a function tf.gather_nd() that seems promising for this. However, I could not tell from the documentation how to use it for what I wanted to do. I would appreciate answers that improve on what I did, or especially if someone can show me how to do it using tf.gather_nd(). Also, tf.boolean_mask() does not work for what I am trying to do because it does not allow you to use the output as an index.
In Python, what I am trying to do:
a = np.ones((2,2))
a[0,0]=a[0,1] = 0
mask = a == 0
a[mask] = np.random.random_sample(a.shape)[mask]
print('new a = ', a)
What I ended up doing in TensorFlow to achieve the same thing (skipping the array-filling steps):
zeros = tf.zeros(tf.shape(a))
mask = tf.greater(a,zeros)
mask_n = tf.equal(a,zeros)
mask = tf.cast(mask,tf.float32)
mask_n = tf.cast(mask_n,tf.float32)
r = tf.random_uniform(tf.shape(a),minval = 0.0,maxval=1.0,dtype=tf.float32)
r_add = tf.multiply(mask_n,r)
targets = tf.add(tf.multiply(mask,a),r_add)
I think these three lines might do what you want. First, you make a mask. Then, you create the random data. Finally, you fill in the masked values with the random data.
mask = tf.equal(a, 0.0)
r = tf.random_uniform(tf.shape(a), minval = 0.0,maxval=1.0,dtype=tf.float32)
targets = tf.where(mask, r, a)
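As a quick sanity check, here is a minimal sketch of those three lines run eagerly (assuming TF 2.x, where tf.random_uniform has become tf.random.uniform; the toy array a is made up for illustration):
import tensorflow as tf

a = tf.constant([[0.0, 0.0], [1.0, 2.0]])    # two holes, two good values
mask = tf.equal(a, 0.0)
r = tf.random.uniform(tf.shape(a), minval=0.0, maxval=1.0, dtype=tf.float32)
targets = tf.where(mask, r, a)
print(targets)    # zeros replaced by random values, non-zero entries unchanged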
You can use tf.where to achieve the same result:
A = tf.Variable(a)
B = tf.where(tf.equal(A, 0.), tf.random_normal(A.get_shape()), tf.cast(A, tf.float32))
My goal is to compute a derivative of a moving window of a multidimensional dataset along a given dimension, where the dataset is stored as Xarray DataArray or DataSet.
In the simplest case, given a 2D array I would like to compute a moving difference across multiple entries in one dimension, e.g.:
data = np.kron(np.linspace(0,1,10), np.linspace(1,4,6) ).reshape(10,6)
T=3
reducedArray = np.zeros_like(data)
for i in range(data.shape[1]):
    if i < T:
        reducedArray[:,i] = data[:,i] - data[:,0]
    else:
        reducedArray[:,i] = data[:,i] - data[:,i-T]
where the if i < T condition ensures that input and output contain proper values (i.e., no NaNs) and are of identical shape.
Xarray's diff aims to perform a finite-difference approximation of a given derivative order using nearest-neighbours, so it is not suitable here, hence the question:
Is it possible to perform this operation using Xarray functions only?
The rolling weighted average example appears to be something similar, but still too distinct due to the usage of NumPy routines. I've been thinking that something along the lines of the following should work:
xr2DDataArray = xr.DataArray(
    data,
    dims=('x','y'),
    coords={'x':np.linspace(0,1,10), 'y':np.linspace(1,4,6)}
)
r = xr2DDataArray.rolling(x=T,min_periods=2)
r.reduce( redFn )
I am struggling with the definition of redFn here, though.
Caveat: The actual dataset to which the operation is to be applied will have a size of ~10 GiB, so a solution that does not blow up the memory requirements will be highly appreciated!
Update/Solution
Using Xarray rolling
After sleeping on it and a bit more fiddling, the post linked above actually contains a solution. To obtain a finite difference we just have to define the weights to be $\pm 1$ at the ends and $0$ elsewhere:
def fdMovingWindow(data, **kwargs):
    T = kwargs['T']
    del kwargs['T']
    weights = np.zeros(T)
    weights[0] = -1
    weights[-1] = 1
    axis = kwargs['axis']
    if data.shape[axis] == T:
        return np.sum(data * weights, **kwargs)
    else:
        return 0
r.reduce(fdMovingWindow, T=4)
Alternatively, using construct and a dot product:
weights = np.zeros(T)
weights[0] = -1
weights[-1] = 1
xrWeights = xr.DataArray(weights, dims=['window'])
xr2DDataArray.rolling(y=T,min_periods=1).construct('window').dot(xrWeights)
This carries a massive caveat: the procedure essentially creates a list of arrays representing the moving window. This is fine for a modest 2D/3D array, but for a 4D array that takes up ~10 GiB in memory this will lead to an OOM death!
Simplistic - memory efficient
A less memory-intensive way is to copy the array and work in a way similar to NumPy's arrays:
xrDiffArray = xr2DDataArray.copy()
dy = xr2DDataArray.y.values[1] - xr2DDataArray.y.values[0] #equidistant sampling
for src in xr2DDataArray:
    if src.y.values < xr2DDataArray.y.values[0] + T*dy:
        xrDiffArray.loc[dict(y = src.y.values)] = src.values - xr2DDataArray.values[0]
    else:
        xrDiffArray.loc[dict(y = src.y.values)] = src.values - xr2DDataArray.sel(y = src.y.values - dy*T).values
This will produce the intended result without dimensional errors, but it requires a copy of the dataset.
I was hoping to utilise Xarray to prevent a copy and instead just chain operations that are then evaluated if and when values are actually requested.
A suggestion as to how to accomplish this will still be welcomed!
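One further possibility: the same lagged difference can be written with xarray operations only via a shift along y, which stays lazy when the DataArray is backed by dask. This is only a sketch (xrLazyDiff is a made-up name, and I have not benchmarked its memory behaviour on a ~10 GiB dataset):
lagged = xr2DDataArray.shift(y=T)                               # first T entries along y become NaN
lagged = lagged.fillna(xr2DDataArray.isel(y=0, drop=True))      # fall back to the first column
xrLazyDiff = xr2DDataArray - lagged                             # matches the loop-based result above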
I have never used xarray, so maybe I am mistaken, but I think you can get the result you want without loops and conditionals. This is at least twice as fast as your example for NumPy arrays:
data = np.kron(np.linspace(0,1,10), np.linspace(1,4,6)).reshape(10,6)
reducedArray = np.empty_like(data)
reducedArray[:, T:] = data[:, T:] - data[:, :-T]
reducedArray[:, :T] = data[:, :T] - data[:, 0, np.newaxis]
I imagine the improvement will be higher when using DataArrays.
It does not use xarray functions, but neither does it depend on NumPy functions beyond basic slicing. I am confident that translating this to xarray will be straightforward. I know that it works if there are no coords, but once you include them you get an error because of the coords mismatch (the coords of data[:, T:] and of data[:, :-T] are different). Sadly, I can't do better right now.
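One possible workaround for that mismatch is to strip or bypass the y labels on one operand so the subtraction happens positionally; the following is only a sketch in the spirit of the slicing above (reducedXr is a made-up name), not something I have tested on the original dataset:
inner = xr2DDataArray.isel(y=slice(T, None)) - xr2DDataArray.isel(y=slice(None, -T)).values   # plain array, so no alignment
head = xr2DDataArray.isel(y=slice(None, T)) - xr2DDataArray.isel(y=0, drop=True)              # broadcasts over y
reducedXr = xr.concat([head, inner], dim='y')                                                  # same y coords and shape as the input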
I'm facing a problem with vectorizing a function so that it applies efficiently on a numpy array.
My program inputs:
A pos_part 2D array with Nb_particles rows and 3 columns (basically x, y, z coordinates; only z is relevant for the part that bothers me). Nb_particles can be up to several hundred thousand.
A prop_part 1D array with Nb_particles values. This part I have covered; creation is made with some nice numpy functions. I just put here a basic distribution that resembles real values.
A z_distances 1D array, a simple np.arange between z=0 and z=z_max.
Then comes the calculation that takes time, because I can't find a way to do it properly with only numpy operations on arrays. What I want to do is:
For each distance z_i in z_distances, sum all values from prop_part whose corresponding particle coordinate satisfies z_particle < z_i. This would return a 1D array the same length as z_distances.
My ideas so far:
Version 0: a for loop, enumerate and np.where to retrieve the indices of the values that I need to sum. Obviously quite slow.
Version 1: using a mask on a new array (a combination of z coordinates and particle properties), and summing over the masked array. Seems better than v0.
Version 2: another mask and np.vectorize, but I understand it's not efficient since vectorize is basically a for loop. Still seems better than v0.
Version 3: I'm trying to use a mask inside a function that I can apply directly to z_distances, but it's not working so far.
So, here I am. There is maybe something to do with a sort and a cumulative sum, but I don't know how to do this, so any help would be greatly appreciated. Please find below the code to make things clearer.
Thanks in advance.
import numpy as np
import time
import matplotlib.pyplot as plt
# Creation of particles' positions
Nb_part = 150_000
pos_part = 10*np.random.rand(Nb_part,3)
pos_part[:,0] = pos_part[:,1] = 0
#useful property creation
beta = 1/1.5
prop_part = (1/beta)*np.exp(-pos_part[:,2]/beta)
z_distances = np.arange(0,10,0.1)
#my version 0
t0=time.time()
result = np.empty(len(z_distances))
for index_dist, val_dist in enumerate(z_distances):
    positions = np.where(pos_part[:,2]<val_dist)[0]
    result[index_dist] = sum(prop_part[i] for i in positions)
print("v0 :",time.time()-t0)
#A graph to help understand
plt.figure()
plt.plot(z_distances,result, c="red")
plt.ylabel("Sum of particles' useful property for particles with z-pos<d")
plt.xlabel("d")
#version 1 ??
t1=time.time()
combi = np.column_stack((pos_part[:,2],prop_part))
result2 = np.empty(len(z_distances))
for index_dist, val_dist in enumerate(z_distances):
    mask = (combi[:,0]<val_dist)
    result2[index_dist]=sum(combi[:,1][mask])
print("v1 :",time.time()-t1)
plt.plot(z_distances,result2, c="blue")
#version 2
t2=time.time()
def themask(a):
    mask = (combi[:,0]<a)
    return sum(combi[:,1][mask])
thefunc = np.vectorize(themask)
result3 = thefunc(z_distances)
print("v2 :",time.time()-t2)
plt.plot(z_distances,result3, c="green")
### This does not work so far
# version 3
# =============================
# t3=time.time()
# def thesum(a):
#     mask = combi[combi[:,0]<a]
#     return sum(mask[:,1])
# result4 = thesum(z_distances)
# print("v3 :",time.time()-t3)
# =============================
You can get a lot more performance by writing your first version completely in numpy. Replace Python's sum with np.sum. Instead of the for i in positions generator expression, simply pass the positions mask you are creating anyway.
Indeed, the np.where is not necessary and my best version looks like:
#my version 0
t0=time.time()
result = np.empty(len(z_distances))
for index_dist, val_dist in enumerate(z_distances):
    positions = pos_part[:, 2] < val_dist
    result[index_dist] = np.sum(prop_part[positions])
print("v0 :",time.time()-t0)
# out: v0 : 0.06322097778320312
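The sort-and-cumulative-sum idea mentioned in the question can also remove the Python loop over z_distances entirely. A possible sketch with np.searchsorted (variable names other than the question's pos_part, prop_part and z_distances are my own, and I have not timed this against the versions above):
z_order = np.argsort(pos_part[:, 2])                                       # sort particles by z
cumsum = np.concatenate(([0.0], np.cumsum(prop_part[z_order])))             # cumsum[k] = sum of the k smallest-z properties
counts = np.searchsorted(pos_part[z_order, 2], z_distances, side='left')    # number of particles with z < z_i
result_cs = cumsum[counts]                                                  # same values as result above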
You can get a bit faster if z_distances is very long by using numba.
Running calc for the first time usually incurs some compilation overhead, which we can get rid of by first running the function on a small subset of z_distances.
The below code achieves roughly a factor of two speedup over pure numpy on my laptop.
import numba as nb
@nb.njit(parallel=True)
def calc(result, z_distances):
    n = z_distances.shape[0]
    for ii in nb.prange(n):
        pos = pos_part[:, 2] < z_distances[ii]
        result[ii] = np.sum(prop_part[pos])
    return result
result4 = np.zeros_like(result)
# _t = time.time()
# calc(result4, z_distances[:10])
# print(time.time()-_t)
t3 = time.time()
result4 = calc(result4, z_distances)
print("v3 :", time.time()-t3)
plt.plot(z_distances, result4)
Is there a way to do convolution matrix operation using numpy?
The numpy.convolve only operates on 1D arrays, so this is not the solution.
I would rather avoid using scipy, since it appears to be more difficult to install on Windows.
You have scipy's ndimage which allows you to perform N-dimensional convolution with convolve:
from scipy.ndimage import convolve
convolve(data, kernel)
I know that you said that you want to avoid scipy... but I would still advise using it. Scipy is great in so many ways. If you want to install it on Windows, try the Anaconda Distribution, which already comes with scipy installed.
Anaconda is a multiplatform Python distribution that comes with all the essential libraries (including a lot of scientific computing libraries) preinstalled, and tools like pip and conda to install new ones. And no, they don't pay me to advertise it :/ but it makes your multiplatform life much easier.
I would highly recommend using OpenCV for this purpose. However, in principle you can almost directly use the "pseudo-code" in the Wikipedia article on kernel convolution to create your own function...
kl = len(kernel)   ## kernels are usually square with an odd number of rows/columns
ks = (kl-1)//2     ## kernel "radius"
imx = len(matrix)
imy = len(matrix[0])
output = [[0]*imy for _ in range(imx)]   ## write into a copy so already-updated cells don't feed back in
for i in range(imx):
    for j in range(imy):
        acc = 0
        for ki in range(kl):   ## kernel is the matrix to be used
            for kj in range(kl):
                if 0 <= i-ks+ki < imx and 0 <= j-ks+kj < imy:   ## make sure you don't get an out-of-bounds error
                    acc = acc + (matrix[i-ks+ki][j-ks+kj] * kernel[ki][kj])
        output[i][j] = acc
This should in principle do the trick (but I have not yet tested it...)
I hope this is helpful.
I used the example in the Wikipedia article and extrapolated it for every element in the matrix:
def image_convolution(matrix, kernel):
    # assuming kernel is symmetric and odd
    k_size = len(kernel)
    m_height, m_width = matrix.shape
    padded = np.pad(matrix, (k_size-1, k_size-1))
    # iterates through matrix, applies kernel, and sums
    output = []
    for i in range(m_height):
        for j in range(m_width):
            output.append(np.sum(padded[i:k_size+i, j:k_size+j]*kernel))
    output = np.array(output).reshape((m_height, m_width))
    return output
padded[i:k_size+i, j:k_size+j] is a slice of the array the same size as the kernel.
Hope this is clear and helps.
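If a vectorized pure-NumPy version is wanted, a similar result can be obtained with sliding_window_view and einsum. This is only a sketch: it assumes NumPy >= 1.20 and an odd-sized kernel, and like the loop above it does not flip the kernel (so strictly speaking it computes a correlation):
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def conv2d_windows(matrix, kernel):
    kh, kw = kernel.shape
    padded = np.pad(matrix, ((kh // 2, kh // 2), (kw // 2, kw // 2)))
    windows = sliding_window_view(padded, (kh, kw))    # shape (m_height, m_width, kh, kw)
    return np.einsum('ijkl,kl->ij', windows, kernel)   # weighted sum of each window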
An alternative numpy way is to build the result from whole-matrix adds instead of per-cell sums, which reduces the looping.
def zConv(m,K):
    #input assumed to be numpy arrays Kr<=mrow, Kc<=mcol, kernel odd
    #edges wrap Top/Bottom, Left/Right
    #Zero Pad m by kr,kc if no wrap desired
    mc=m*0
    Kr,Kc= K.shape
    kr=Kr//2 #kernel center
    kc=Kc//2
    for dr in range(-kr,kr+1):
        mr=np.roll(m,dr,axis=0)
        for dc in range(-kc,kc+1):
            mrc=np.roll(mr,dc,axis=1)
            mc=mc+K[dr+kr,dc+kc]*mrc
    return mc
If your kernel is not symmetric (adjusted from the other answers):
def image_convolution(matrix, kernel):
    # kernel can be asymmetric but still needs to be odd
    k_height, k_width = kernel.shape
    m_height, m_width = matrix.shape
    k_size = max(k_height, k_width)
    padded = np.pad(matrix, (int(k_size/2), int(k_size/2)))
    if k_size > 1:
        if k_height == 1:
            padded = padded[1:-1,:]
        elif k_width == 1:
            padded = padded[:,1:-1]
    # iterates through matrix, applies kernel, and sums
    output = []
    for i in range(m_height):
        for j in range(m_width):
            between = padded[i:k_height+i, j:k_width+j]*kernel
            output.append(np.sum(between))
    output = np.array(output).reshape((m_height, m_width))
    return output
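For example, calling it with a 1 x 3 kernel (a simple horizontal difference; the values are made up for illustration) still returns an output with the same shape as the input:
img = np.arange(25, dtype=float).reshape(5, 5)
kx = np.array([[-1.0, 0.0, 1.0]])      # asymmetric 1 x 3 kernel
out = image_convolution(img, kx)
print(out.shape)                       # (5, 5)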
I wanted to apply median-absolute-deviation (MAD) based outlier detection using the answer from @Joe Kington as given below:
Pythonic way of detecting outliers in one dimensional observation data
However, something is going wrong with my code, and I could not figure out how to assign the outliers as NaN values in my data:
import numpy as np
data = np.array([55,32,4,5,6,7,8,9,11,0,2,1,3,4,5,6,7,8,25,25,25,25,10,11,12,25,26,27,28],dtype=float)
median = np.median(data, axis=0)
diff = np.sum((data - median)**2, axis=-1)
diff = np.sqrt(diff)
med_abs_deviation = np.median(diff)
modified_z_score = 0.6745 * diff / med_abs_deviation
data_without_outliers = data[modified_z_score < 3.5]
?????
print data_without_outliers
What is the problem with using:
data[modified_z_score > 3.5] = np.nan
Note that this will only work if data is a floating point array (which it should be if you are calculating MAD).
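A quick illustration of that caveat on a toy array of my own: assigning NaN into an integer array raises an error, so convert to float first if needed.
int_data = np.array([1, 2, 3])
# int_data[0] = np.nan               # ValueError: cannot convert float NaN to integer
float_data = int_data.astype(float)
float_data[0] = np.nan               # works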
The problem might be this line:
diff = np.sum((data - median)**2, axis=-1)
Applying np.sum() collapses the result to a scalar.
Remove the top-level sum, and your code will work.
Another way around it is to ensure that data is at least a 2D array. You can use numpy.atleast_2d() for that.
In order to assign NaNs, follow answer from https://stackoverflow.com/a/22804327/4989451
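Putting the two fixes together, here is a sketch of the question's snippet without the top-level sum (for 1-D data the distance is simply the absolute deviation from the median), with the outliers set to NaN at the end:
import numpy as np

data = np.array([55,32,4,5,6,7,8,9,11,0,2,1,3,4,5,6,7,8,25,25,25,25,10,11,12,25,26,27,28], dtype=float)
median = np.median(data)
diff = np.sqrt((data - median)**2)               # element-wise, no summing
med_abs_deviation = np.median(diff)
modified_z_score = 0.6745 * diff / med_abs_deviation
data[modified_z_score > 3.5] = np.nan            # outliers become NaN
print(data)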
Profiling some computational work I'm doing showed me that one bottleneck in my program was a function that basically did this (np is numpy, sp is scipy):
def mix1(signal1, signal2):
    spec1 = np.fft.fft(signal1, axis=1)
    spec2 = np.fft.fft(signal2, axis=1)
    return np.fft.ifft(spec1*spec2, axis=1)
Both signals have shape (C, N) where C is the number of sets of data (usually less than 20) and N is the number of samples in each set (around 5000). The computation for each set (row) is completely independent of any other set.
I figured that this was just a simple convolution, so I tried to replace it with:
def mix2(signal1, signal2):
    outputs = np.empty_like(signal1)
    for idx, row in enumerate(outputs):
        outputs[idx] = sp.signal.convolve(signal1[idx], signal2[idx], mode='same')
    return outputs
...just to see if I got the same results. But I didn't, and my questions are:
Why not?
Is there a better way to compute the equivalent of mix1()?
(I realise that mix2 probably wouldn't have been faster as-is, but it might have been a good starting point for parallelisation.)
Here's the full script I used to quickly check this:
import numpy as np
import scipy as sp
import scipy.signal
N = 4680
C = 6
def mix1(signal1, signal2):
    spec1 = np.fft.fft(signal1, axis=1)
    spec2 = np.fft.fft(signal2, axis=1)
    return np.fft.ifft(spec1*spec2, axis=1)

def mix2(signal1, signal2):
    outputs = np.empty_like(signal1)
    for idx, row in enumerate(outputs):
        outputs[idx] = sp.signal.convolve(signal1[idx], signal2[idx], mode='same')
    return outputs

def test(num, chans):
    sig1 = np.random.randn(chans, num)
    sig2 = np.random.randn(chans, num)
    res1 = mix1(sig1, sig2)
    res2 = mix2(sig1, sig2)
    np.testing.assert_almost_equal(res1, res2)

if __name__ == "__main__":
    np.random.seed(0x1234ABCD)
    test(N, C)
So I tested this out and can now confirm a few things:
1) numpy.convolve is not circular, whereas the FFT code is giving you a circular convolution.
2) FFT does not internally pad to a power of 2. Compare the vastly different speeds of the following operations:
x1 = np.random.uniform(size=2**17-1)
x2 = np.random.uniform(size=2**17)
np.fft.fft(x1)
np.fft.fft(x2)
3) Normalization is not a difference -- if you do a naive circular convolution by adding up a(k)*b(i-k), you will get the result of the FFT code.
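A quick way to check point 3 on a toy row (my own example), comparing a naive circular sum against the FFT result:
a = np.random.randn(8)
b = np.random.randn(8)
circ = np.array([sum(a[k] * b[(i - k) % 8] for k in range(8)) for i in range(8)])
fft_res = np.fft.ifft(np.fft.fft(a) * np.fft.fft(b)).real
np.testing.assert_allclose(circ, fft_res)        # passes: the FFT code is a circular convolution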
The thing is, padding to a power of 2 is going to change the answer. I've heard tales that there are ways to deal with this by cleverly using prime factors of the length (mentioned but not coded in Numerical Recipes), but I've never seen people actually do that.
scipy.signal.fftconvolve does convolve by FFT, and it's Python code. You can study the source code and correct your mix1 function.
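For reference, here is a sketch of what such a correction might look like: zero-pad to the full linear-convolution length before transforming, then take the central part to mimic mode='same' (mix1_linear is my own name, not the actual fftconvolve implementation):
def mix1_linear(signal1, signal2):
    n = signal1.shape[1]
    L = 2 * n - 1                                    # full linear convolution length
    spec1 = np.fft.fft(signal1, n=L, axis=1)
    spec2 = np.fft.fft(signal2, n=L, axis=1)
    full = np.fft.ifft(spec1 * spec2, axis=1)
    start = (n - 1) // 2                             # central slice, as in mode='same'
    return full[:, start:start + n]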
As mentioned before, the scipy.signal.convolve function does not perform a circular convolution. If you want a circular convolution performed in real space (in contrast to using FFTs), I suggest using the scipy.ndimage.convolve function. It has a mode parameter which can be set to 'wrap', making it a circular convolution.
for idx, row in enumerate(outputs):
    outputs[idx] = sp.ndimage.convolve(signal1[idx], signal2[idx], mode='wrap')