How to put an "arbitrary" operation into a sliding window using Theano? - python

I want to define some function on a matrix X. For example mean(pow(X - X0, 2)), where X0 is another matrix (X0 is fixed / constant). To make it more specific, let's assume that both X and X0 are 10 x 10 matrices. The result of the operation is a real number.
Now I have a big matrix (let's say 500 x 500). I want to apply the operation defined above to all 10 x 10 sub-matrices of the "big" matrix. In other words, I want to slide the 10 x 10 window over the "big" matrix. For each location of the window, I should get a real number. So, as a final result, I need to get a real-valued matrix (or 2D tensor) (its shape should be 491 x 491).
What I want to have is close to a convolutional layer but not exactly the same because I want to use a mean squared deviation instead of a linear function represented by a neuron.

This is only a NumPy solution; I hope it suffices.
I assume that your function is made up of an operation on the matrix elements and of a mean, i.e. a scaled sum. Hence, it is sufficient to look at Y as in
Y = np.power(X-X0, 2)
So we only need to deal with determining a windowed mean. Note that in the 1D case, the mean over a window can be computed as a matrix product with an appropriate vector of ones, e.g.
y = np.array([3., 1., 5., 2.])  # some example data
h = np.array([0, 1, 1, 0])  # same dimension as y
m1 = np.dot(h, y) / 2
m2 = (y[1] + y[2]) / 2
print(m1 == m2)  # True
The 2D case is analogous, but with two matrix multiplications, one for the rows and one for the columns. E.g.
m_2 = np.dot(np.dot(h, Y), h) / 2**2
To construct a sliding window, we need to build a matrix of shifted windows, e.g.
H = [[1, 1, 1, 0, 0, ..., 0],
     [0, 1, 1, 1, 0, ..., 0],
     ...
     [0, ..., 0, 0, 1, 1, 1]]
to calculate all the sums
S = np.dot(np.dot(H, Y), H.T)
A full example for an (n, n) matrix with an (m, m) window would be
import numpy as np
n, m = 500, 10
X0 = np.ones((n, n))
X = np.random.rand(n, n)
Y = np.power(X-X0, 2)
h = np.concatenate((np.ones(m), np.zeros(n-m))) # window at position 0
H = np.vstack([np.roll(h, k) for k in range(n+1-m)]) # slide the window
M = np.dot(np.dot(H, Y), H.T) / m**2 # calculate the mean
print(M.shape) # (491, 491)
An alternative but probably slightly less efficient way for building H is
H = sum(np.diag(np.ones(n-k), k)[:n-m+1, :] for k in range(m))
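As a sanity check (my addition, not part of the original answer), the H-matrix construction can be compared against a brute-force loop over all windows on a smaller problem; the names here (M_ref, the reduced n and m) are mine:
import numpy as np

n, m = 50, 4
X0 = np.ones((n, n))
X = np.random.rand(n, n)
Y = np.power(X - X0, 2)
h = np.concatenate((np.ones(m), np.zeros(n - m)))
H = np.vstack([np.roll(h, k) for k in range(n + 1 - m)])
M = np.dot(np.dot(H, Y), H.T) / m**2

# brute force: slide the (m, m) window explicitly
M_ref = np.array([[Y[i:i + m, j:j + m].mean()
                   for j in range(n + 1 - m)]
                  for i in range(n + 1 - m)])
print(np.allclose(M, M_ref))  # True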
Update
Calculating the mean squared deviation is also possible with that approach. For that, we generalize the vector identity |x-x0|^2 = (x-x0).T (x-x0) = x.T x - 2 x0.T x + x0.T x0 (a space denotes a scalar or matrix multiplication and .T a transposed vector) to the matrix case:
We assume W is an (m, n) matrix containing an (m, m) identity block, which extracts the (k0, k1)-th (m, m) sub-matrix via Y = W Z W.T, where Z is the (n, n) matrix containing the data. Calculating the difference
D = Y - X0 = W Z W.T - X0
is straightforward, where X0 and D are (m, m) matrices. The square root of the sum of the squared elements of D is called the Frobenius norm. Based on those identities, we can write the squared sum as
s = sum_{i,j} D_{i,j}^2 = trace(D.T D) = trace((W Z W.T - X0).T (W Z W.T - X0))
  = trace(W Z.T W.T W Z W.T) - 2 trace(X0.T W Z W.T) + trace(X0.T X0)
  =: Y0 + Y1 + Y2
The term Y0 is the windowed sum of the squared elements of Z, i.e. H (Z**2) H.T from the method above; the term Y1 can be interpreted as a weighted mean over Z; and Y2 is a constant which only needs to be determined once.
Thus, a possible implementation would be:
import numpy as np
n, m = 500, 10
x0 = np.ones(m)
Z = np.random.rand(n, n)
Y0 = Z**2
h0 = np.concatenate((np.ones(m), np.zeros(n-m)))
H0 = np.vstack([np.roll(h0, k) for k in range(n+1-m)])
M0 = np.dot(np.dot(H0, Y0), H0.T)
h1 = np.concatenate((-2*x0, np.zeros(n-m)))
H1 = np.vstack([np.roll(h1, k) for k in range(n+1-m)])
M1 = np.dot(np.dot(H1, Z), H0.T)
Y2 = np.dot(x0, x0)**2  # trace(X0.T X0) for X0 = np.outer(x0, x0); here m**2
M = (M0 + M1 + Y2) / m**2  # mean squared deviation per window
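Since the whole computation reduces to matrix products, the same trick also ports to Theano, which the question asked about. Here is a minimal sketch (my addition, not part of the original answer; it assumes the classic Theano API with theano.tensor and theano.function, and, as in the first example above, a full-size X0):
import numpy as np
import theano
import theano.tensor as T

n, m = 500, 10
h = np.concatenate((np.ones(m), np.zeros(n - m)))
H_np = np.vstack([np.roll(h, k) for k in range(n + 1 - m)])

X = T.dmatrix('X')                 # the big (n, n) input matrix
X0 = T.dmatrix('X0')               # the fixed reference matrix, same shape as X
H = theano.shared(H_np, name='H')  # precomputed window-summing matrix
M = T.dot(T.dot(H, (X - X0) ** 2), H.T) / m**2  # (491, 491) windowed means

windowed_mse = theano.function([X, X0], M)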

Related

Optimal way to convolute continuous functions in python

I am trying to numerically compute in Python integrals of the form F(x, t) = ∫∫ g(x - y, t - s) h(y, s) dy ds, i.e. a 2D convolution of g and h.
To that aim, I first define two discrete sets of x and t values, let's say
x_samples = np.linspace(-10, 10, 100)
t_samples = np.linspace(0, 1, 100)
dx = x_samples[1]-x_samples[0]
dt = t_samples[1]-t_samples[0]
I then declare that the function g(x, t) is equal to 0 if t < 0 and discretise the two functions to integrate as
discretG = g(x_samples[None, :], t_samples[:, None])
discretH = h(x_samples[None, :], t_samples[:, None])
I have then tried to run
discretF = signal.fftconvolve(discretG, discretH, mode='full') * dx * dt
Yet, on basic test functions such as
g = lambda x, t: np.exp(-np.abs(x)) + t
h = lambda x, t: np.exp(-np.abs(x)) - t
I don't find agreement between the numerical integration and the convolution computed with SciPy. I would like a fairly fast way of computing these integrals, especially when I only have access to discretised representations of the functions rather than their symbolic form.
According to your code, I assume you want to compute the convolution of two functions g and h that are non-zero only on [a, b] × [m, n].
Of course you can use signal.fftconvolve to compute the convolution. The key is not to forget the transformation between the indices inside discretF and the real coordinates. Here I use interpolation to evaluate the result at an arbitrary (x, t).
import numpy as np
from scipy import signal, interpolate
a = -1
b = 2
m = -10
n = 15
samples_num = 1000
x_eval_index = 200
t_eval_index = 300
x_samples = np.linspace(a, b, samples_num)
t_samples = np.linspace(m, n, samples_num)
dx = x_samples[1]-x_samples[0]
dt = t_samples[1]-t_samples[0]
g = lambda x,t: np.exp(-np.abs(x))+t
h = lambda x,t: np.exp(-np.abs(x))-t
discretG = g(x_samples[None, :], t_samples[:, None])
discretH = h(x_samples[None, :], t_samples[:, None])
discretF = signal.fftconvolve(discretG, discretH, mode='full')
def compute_f(x, t):
    if x < 2*a or x > 2*b or t < 2*m or t > 2*n:
        return 0
    # use interpolation to get data at the new point
    x_samples_for_conv = np.linspace(2*a, 2*b, 2*samples_num-1)
    t_samples_for_conv = np.linspace(2*m, 2*n, 2*samples_num-1)
    f = interpolate.RectBivariateSpline(x_samples_for_conv, t_samples_for_conv, discretF.T)
    return f(x, t)[0, 0] * dx * dt
Note: you can extend my code to compute the convolution on a meshgrid defined by x and t, where x and t are 1D arrays (in the code above, x and t are scalars).
You can use the following code to explore the "agreement" between "the numerical integration" and "the convolution using scipy" (and also to check the correctness of the compute_f function above):
# how the convolution works
# for 1D: f[i] = sigma_j g[j] * h[i-j]
sum = 0
for y_idx, y in enumerate(x_samples):
    for s_idx, s in enumerate(t_samples):
        if x_eval_index - y_idx < 0 or t_eval_index - s_idx < 0:
            continue
        if x_eval_index - y_idx >= len(x_samples) or t_eval_index - s_idx >= len(t_samples):
            continue
        sum += discretG[t_eval_index - s_idx, x_eval_index - y_idx] * discretH[s_idx, y_idx] * dx * dt
print("Do discrete convolution manually, I get: %f" % sum)
print("Do discrete convolution using scipy, I get: %f" % (discretF[t_eval_index, x_eval_index] * dx * dt))
# numerical integral
# compute the x_eval and t_eval coordinates
# take 1D convolution as an example: the function is defined on [a, b], and the sample indices range over [0, samples_num-1]
# after convolution, the result is defined on [2a, 2b], with indices in [0, 2*samples_num-2]
dx_prime = (b-a) / (samples_num-1)
dt_prime = (n-m) / (samples_num-1)
x_eval = 2*a + x_eval_index * dx_prime
t_eval = 2*m + t_eval_index * dt_prime
sum = 0
for y in x_samples:
    for s in t_samples:
        if x_eval - y < a or x_eval - y > b:
            continue
        if t_eval - s < m or t_eval - s > n:
            continue
        if y < a or y >= b:
            continue
        if s < m or s >= n:
            continue
        sum += g(x_eval - y, t_eval - s) * h(y, s) * dx * dt
print("Do numerical integration, I get: %f" % sum)
print("The convolution result of 'compute_f' is: %f" % compute_f(x_eval, t_eval))
Which gives:
Do discrete convolution manually, I get: -154.771369
Do discrete convolution using scipy, I get: -154.771369
Do numerical integration, I get: -154.771369
The convolution result of 'compute_f' is: -154.771369
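The dx * dt scaling rule is easiest to verify in 1D, where an analytic result is available. A small sketch (my addition, not part of the original answer): the discrete convolution of a Gaussian with itself, scaled by the sample spacing, should approach sqrt(pi/2) * exp(-x**2 / 2):
import numpy as np
from scipy import signal

xs = np.linspace(-10, 10, 2001)
dx = xs[1] - xs[0]
f = np.exp(-xs**2)
conv = signal.fftconvolve(f, f, mode='full') * dx  # approximates the continuous integral
xs_full = np.linspace(2 * xs[0], 2 * xs[-1], 2 * len(xs) - 1)
exact = np.sqrt(np.pi / 2) * np.exp(-xs_full**2 / 2)
print(np.max(np.abs(conv - exact)))  # should be small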

rotate a set of 3d coordinates in python

I have a set of 3D coordinates of a molecule and a vector passing through its center of mass. I want to rotate the molecule coordinates and the vector so that the vector aligns with the z-axis.
I used a script from the link below to calculate the rotation matrix from the vector to the z-axis, then applied the same rotation matrix to the other 3D coordinates:
Calculate rotation matrix to align two vectors in 3D space?
But this method makes my molecule "flat" (in magenta the molecule before rotation, in yellow after rotation):
[Images: front view and side view of the molecule]
Does anyone know why this method doesn't work? Is it mathematically correct? Thank you!
The method in this answer of the question you linked to seems correct to me, and produces one rotation matrix (from the infinite set of rotation matrices that will align vec1 to vec2):
import numpy as np

def rotation_matrix_from_vectors(vec1, vec2):
    """ Find the rotation matrix that aligns vec1 to vec2
    :param vec1: A 3d "source" vector
    :param vec2: A 3d "destination" vector
    :return mat: A transform matrix (3x3) which when applied to vec1, aligns it with vec2.
    """
    a, b = (vec1 / np.linalg.norm(vec1)).reshape(3), (vec2 / np.linalg.norm(vec2)).reshape(3)
    v = np.cross(a, b)
    c = np.dot(a, b)
    s = np.linalg.norm(v)
    kmat = np.array([[0, -v[2], v[1]], [v[2], 0, -v[0]], [-v[1], v[0], 0]])
    rotation_matrix = np.eye(3) + kmat + kmat.dot(kmat) * ((1 - c) / (s ** 2))
    return rotation_matrix
That rotation matrix is orthonormal, as it should be.
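A quick numerical check of both claims (my addition; the vectors v1, v2 are arbitrary examples, and the function above is assumed to be in scope):
import numpy as np

v1 = np.array([1.0, 2.0, 3.0])
v2 = np.array([0.0, 0.0, 1.0])
R = rotation_matrix_from_vectors(v1, v2)

print(np.allclose(R @ R.T, np.eye(3)))    # orthonormal: R R^T = I
print(np.isclose(np.linalg.det(R), 1.0))  # proper rotation, det(R) = 1
print(np.allclose(R @ (v1 / np.linalg.norm(v1)), v2))  # aligns the unit vectors
Note that the formula divides by s**2, so it breaks down when vec1 and vec2 are exactly parallel or anti-parallel; that edge case needs separate handling.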
Perhaps (aka, wild guess) what's happening with your data is that the various axes have considerably different variances (perhaps different units?). In this case, you should first normalize your data before rotating. For example, say your original data is an array x with x.shape == (n, 3) and your vector is v with shape (3,):
u, s = x.mean(0), x.std(0)
x2 = (x - u) / s
v2 = (v - u) / s
Now, try to apply your rotation on x2, aligning v2 to [0,0,1].
Here is a toy example for illustration:
import numpy as np
import matplotlib.pyplot as plt
from itertools import combinations

n = 100
x = np.c_[
    np.random.normal(0, 100, n),
    np.random.normal(0, 1, n),
    np.random.normal(4, 3, n),
]
v = np.array([1, 2, 3])
x = np.r_[x, v[None, :]]  # adding v into x so we can visualize it easily
Without normalization
A = rotation_matrix_from_vectors(np.array(v), np.array((0, 0, 1)))
y = x @ A.T
fig, axes = plt.subplots(nrows=2, ncols=2)
for ax, (a, b) in zip(np.ravel(axes), combinations(range(3), 2)):
    ax.plot(y[:, a], y[:, b], '.')
    ax.plot(y[-1, a], y[-1, b], 'ro')
    ax.set_xlabel(a)
    ax.set_ylabel(b)
axes[1][1].set_visible(False)
With prior normalization
u, s = x.mean(0), x.std(0)
x2 = (x - u) / s
v2 = (v - u) / s
A = rotation_matrix_from_vectors(np.array(v2), np.array((0, 0, 1)))
y = x2 @ A.T
fig, axes = plt.subplots(nrows=2, ncols=2)
for ax, (a, b) in zip(np.ravel(axes), combinations(range(3), 2)):
    ax.plot(y[:, a], y[:, b], '.')
    ax.plot(y[-1, a], y[-1, b], 'ro')
    ax.set_xlabel(a)
    ax.set_ylabel(b)
axes[1][1].set_visible(False)

Trying to convert Matlab to python: (https://en.wikipedia.org/wiki/Laplacian_matrix#Example_of_the_operator_on_a_grid)

I'm trying to get my head around the example code on the wikipedia page for Laplacian matricies.
It is written in MATLAB and I only have access to open-source tools.
I think most of it is fairly straightforward, but I'm having a bit of trouble with this one line.
C0V = V'*C0; % Transform the initial condition into the coordinate system
% of the eigenvectors
Now my maths isn't up to scratch to really understand what the comment means. Judging by the MATLAB website, this appears to be a transposed matrix multiplied (an inner product) by another matrix.
Where the left matrix (after transposing) is ( m x p ) and the right is ( p x n ).
The matrix V is produced by a call to eig a few lines above, which, following this answer, I am substituting with scipy.linalg.eig.
The problem is that C0 is clearly defined as an N x N matrix (ndarray), but in my code V is an N**2 x N**2 matrix. I have no way of knowing what shape V is in the original code.
For reference the wikipedia code is below, and below that is my attempt to rewrite it in python (using scipy and numpy), followed by the error attributed to the line described above.
Matlab code from wikipedia
N = 20; % The number of pixels along a dimension of the image
A = zeros(N, N); % The image
Adj = zeros(N * N, N * N); % The adjacency matrix
% Use 8 neighbors, and fill in the adjacency matrix
dx = [- 1, 0, 1, - 1, 1, - 1, 0, 1];
dy = [- 1, - 1, - 1, 0, 0, 1, 1, 1];
for x = 1:N
    for y = 1:N
        index = (x - 1) * N + y;
        for ne = 1:length(dx)
            newx = x + dx(ne);
            newy = y + dy(ne);
            if newx > 0 && newx <= N && newy > 0 && newy <= N
                index2 = (newx - 1) * N + newy;
                Adj(index, index2) = 1;
            end
        end
    end
end
% BELOW IS THE KEY CODE THAT COMPUTES THE SOLUTION TO THE DIFFERENTIAL EQUATION
Deg = diag(sum(Adj, 2)); % Compute the degree matrix
L = Deg - Adj; % Compute the laplacian matrix in terms of the degree and adjacency matrices
[V, D] = eig(L); % Compute the eigenvalues/vectors of the laplacian matrix
D = diag(D);
% Initial condition (place a few large positive values around and
% make everything else zero)
C0 = zeros(N, N);
C0(2:5, 2:5) = 5;
C0(10:15, 10:15) = 10;
C0(2:5, 8:13) = 7;
C0 = C0(:);
C0V = V'*C0; % Transform the initial condition into the coordinate system
% of the eigenvectors
for t = 0:0.05:5
    % Loop through times and decay each initial component
    Phi = C0V .* exp(- D * t); % Exponential decay for each component
    Phi = V * Phi; % Transform from eigenvector coordinate system to original coordinate system
    Phi = reshape(Phi, N, N);
    % Display the results and write to GIF file
    imagesc(Phi);
    caxis([0, 10]);
    title(sprintf('Diffusion t = %3f', t));
    frame = getframe(1);
    im = frame2im(frame);
    [imind, cm] = rgb2ind(im, 256);
    if t == 0
        imwrite(imind, cm, 'out.gif', 'gif', 'Loopcount', inf, 'DelayTime', 0.1);
    else
        imwrite(imind, cm, 'out.gif', 'gif', 'WriteMode', 'append', 'DelayTime', 0.1);
    end
end
My attempted translation
import numpy as np
import matplotlib.pyplot as plt
import scipy.linalg as la
N = 20 # The number of pixels along a dimension of the image
A = np.zeros((N, N)) # The image
Adj = np.zeros((N**2, N**2)) # The adjacency matrix
# Use 8 neighbors, and fill in the adjacency matrix
dx = [- 1, 0, 1, - 1, 1, - 1, 0, 1]
dy = [- 1, - 1, - 1, 0, 0, 1, 1, 1]
for x in range(N):
    for y in range(N):
        index = x * N + y
        for ne in range(len(dx)):
            newx = x + dx[ne]
            newy = y + dy[ne]
            if (newx >= 0 and newx < N
                    and newy >= 0 and newy < N):
                index2 = newx * N + newy
                Adj[index, index2] = 1
# BELOW IS THE KEY CODE THAT COMPUTES THE SOLUTION TO THE DIFFERENTIAL EQUATION
Deg = np.diag(np.sum(Adj, 1)) # Compute the degree matrix
L = Deg - Adj # Compute the laplacian matrix in terms of the degree and adjacency matrices
D, V = la.eig(L) # Compute the eigenvalues/vectors of the laplacian matrix
D = np.diag(D)
# Initial condition (place a few large positive values around and
# make everything else zero)
C0 = np.zeros((N, N))
C0[1:4, 1:4] = 5
C0[9:14, 9:14] = 10
C0[1:5, 7:12] = 7
#C0 = C0(:) #This doesn't seem to do anything?
# matlab:C0V = V'*C0; % Transform the initial condition into the coordinate system
# of the eigenvectors
C0V = V.T * C0 # ???
Error
----------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-10-c3014dc90c9f> in <module>
2 # of the eigenvectors
3
----> 4 C0V = V.T * C0
ValueError: operands could not be broadcast together with shapes (400,400) (20,20)
Edit
It appears @hpaulj has identified my problem. The line I omitted is a ravel operation, i.e. C0 = C0(:) flattens the (20, 20) matrix into a 400-element vector. So I need
C0 = C0.ravel()
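For completeness, here is a sketch (mine, not from the answers) of how the rest of the MATLAB script might translate. Two further points matter: scipy.linalg.eig already returns the eigenvalues as a 1-D array, so the D = np.diag(D) line should be dropped, and since L is symmetric, scipy.linalg.eigh is preferable because it returns real eigenvalues:
D, V = la.eigh(L)  # eigenvalues as a 1-D array, eigenvectors as columns of V
C0 = C0.ravel()    # the equivalent of MATLAB's C0 = C0(:)
C0V = V.T @ C0     # transform the initial condition into the eigenvector basis
for t in np.arange(0, 5.05, 0.05):
    Phi = C0V * np.exp(-D * t)     # exponential decay of each component
    Phi = (V @ Phi).reshape(N, N)  # back to the original coordinate system
    plt.imshow(Phi, vmin=0, vmax=10)
    plt.title('Diffusion t = %.2f' % t)
    plt.pause(0.05)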

On the division and multiplication of matrices in python

Here is the question with details; I think it's clearer now.
Suppose I have a matrix h of size 4 x 4 and a vector x of size 4 x 1. If y is the output of the multiplication of h and x, i.e. y = h * x, then y has size 4 x 1. When I then multiply the pseudo-inverse of every column of h by the vector y, I should be able to recover a vector equivalent to x, which means x = h^{-1} * y. But unfortunately, I can't get that in Python.
For example, let's first do it in MATLAB:
clear all
clc
h = (randn(4,4) + 1j*randn(4,4)); %any matrix of 4 x 4
x = [1 + 1j ; 0; 0 ; 0]; % a vector of 4 x 1
y = h * x ; % y is the output of multiplication
x2 = [];
for ii = 1 : 4
    x1 = pinv(h(:,ii))*y; % multiply every column of h^(-1) with y
    x2 = [x2 x1]; % the output
end
In that case, the output x2 is, as expected, a 1 x 4 vector:
x2 =
1.0000 + 1.0000i 0.7249 + 0.5054i -0.0202 + 0.0104i 0.2429 + 0.0482i
In MATLAB, that's ok.
Now let's do that in python:
import numpy as np
h = np.random.randn(4,4) + 1j*np.random.randn(4,4)
x = [[1+1j],[0+0j],[0+0j],[0+0j]]
y = h.dot(x)
x2 = []
for ii in range(4):
    x1 = np.divide(y, h[:,ii])
    x2.append(x1)
print(x2)
x2 is supposed to be a 1 x 4 vector, as in the output of the MATLAB code above, but in this case I get an x2 of size 4 x 4!
Any help would be appreciated.
There are two issues here:
np.divide() is for element-wise division, you may be looking for np.linalg.pinv() instead.
MATLAB is column major (Fortran-style), while NumPy is row major (C-style); converting a plain list to a NumPy array gives shape (n,), with n the length of the list, and not an object of size (1, n) as MATLAB would.
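The shape difference is easy to see directly (a small illustration of the second point, my addition):
import numpy as np

a = np.array([1 + 1j, 0, 0, 0])          # flat list -> shape (4,)
b = np.array([[1 + 1j], [0], [0], [0]])  # nested list -> shape (4, 1), like a MATLAB column
print(a.shape, b.shape)                  # (4,) (4, 1)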
The Python code equivalent (sort of, I'll do preallocation) to your MATLAB one, would be:
import numpy as np

h = np.random.randn(4, 4) + 1j * np.random.randn(4, 4)
x = np.array([[1 + 1j], [0 + 0j], [0 + 0j], [0 + 0j]])
# y = h.dot(x) <-- NumPy now also supports `@` in place of `np.dot()`
y = h @ x
x2 = np.zeros((1, 4), dtype=complex)  # np.complex is deprecated; use the builtin
for i in range(4):
    x2[0, i] = np.linalg.pinv(h[:, i:i + 1]) @ y
as you can see, the shape of the output is enforced right away.
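As a side note (my addition), the pseudo-inverse of a single column has a closed form, pinv(h_i) = h_i^H / ||h_i||^2, so each loop iteration is just a normalized projection of y onto that column; the loop body above is equivalent to:
x2[0, i] = (h[:, i].conj() @ y) / np.linalg.norm(h[:, i])**2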

how to compute covariance in tensorflow?

I have a problem: I don't know how to compute the covariance of two tensors. I have tried tf.contrib.metrics.streaming_covariance, but it always returns 0. There must be some error.
You could use the definition of the covariance of two random variables X and Y with the expected values x0 and y0:
cov_xx = 1 / (N-1) * Sum_i ((x_i - x0)^2)
cov_yy = 1 / (N-1) * Sum_i ((y_i - y0)^2)
cov_xy = 1 / (N-1) * Sum_i ((x_i - x0) * (y_i - y0))
The crucial point is to estimate x0 and y0 here, since you normally do not know the probability distribution. In many cases, the mean of the x_i or y_i is used as the estimate of x0 or y0, respectively, i.e. each sample is weighted equally.
Then you can compute the elements of the covariance matrix as follows:
import tensorflow as tf
x = tf.constant([1, 4, 2, 5, 6, 24, 15], dtype=tf.float64)
y = tf.constant([8, 5, 4, 6, 2, 1, 1], dtype=tf.float64)
cov_xx = 1 / (tf.shape(x)[0] - 1) * tf.reduce_sum((x - tf.reduce_mean(x))**2)
cov_yy = 1 / (tf.shape(x)[0] - 1) * tf.reduce_sum((y - tf.reduce_mean(y))**2)
cov_xy = 1 / (tf.shape(x)[0] - 1) * tf.reduce_sum((x - tf.reduce_mean(x)) * (y - tf.reduce_mean(y)))
with tf.Session() as sess:
    sess.run([cov_xx, cov_yy, cov_xy])
    print(cov_xx.eval(), cov_yy.eval(), cov_xy.eval())
Of course, if you need the covariance in a matrix form, you can modify the last part as follows:
with tf.Session() as sess:
    sess.run([cov_xx, cov_yy, cov_xy])
    print(cov_xx.eval(), cov_yy.eval(), cov_xy.eval())
    cov = tf.constant([[cov_xx.eval(), cov_xy.eval()],
                       [cov_xy.eval(), cov_yy.eval()]])
    print(cov.eval())
To verify the elements computed the TensorFlow way, you can check with NumPy:
import numpy as np
x = np.array([1,4,2,5,6, 24, 15], dtype=float)
y = np.array([8,5,4,6,2,1,1], dtype=float)
pc = np.cov(x,y)
print(pc)
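As an aside (not part of the original answer), under TensorFlow 2.x the same computation runs eagerly without a Session; a minimal sketch:
import tensorflow as tf

x = tf.constant([1, 4, 2, 5, 6, 24, 15], dtype=tf.float64)
y = tf.constant([8, 5, 4, 6, 2, 1, 1], dtype=tf.float64)
n = tf.cast(tf.size(x), tf.float64)
cov_xy = tf.reduce_sum((x - tf.reduce_mean(x)) * (y - tf.reduce_mean(y))) / (n - 1)
print(float(cov_xy))  # matches np.cov(x, y)[0, 1]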
You can also try TensorFlow Probability for easy calculation of correlations or covariances:
import tensorflow as tf
import tensorflow_probability as tfp

x = tf.random_normal(shape=(100, 2, 3))  # TF1-style; in TF2 use tf.random.normal
y = tf.random_normal(shape=(100, 2, 3))
# cov[i, j] is the sample covariance between x[:, i, j] and y[:, i, j].
cov = tfp.stats.covariance(x, y, sample_axis=0, event_axis=None)
# cov_matrix[i, m, n] is the sample covariance of x[:, i, m] and y[:, i, n]
cov_matrix = tfp.stats.covariance(x, y, sample_axis=0, event_axis=-1)
