I have written the following code to convert a matrix into a stochastic and irreducible matrix, following the paper "Deeper Inside PageRank". The code works well for square matrices but raises an error for rectangular matrices. How can I modify it to convert rectangular matrices into stochastic and irreducible matrices?
My Code:
import numpy as np
P = np.array([[0, 1/2, 1/2, 0, 0, 0], [0, 0, 0, 0, 0, 0], [1/3, 1/3, 0, 0, 1/3, 0], [0, 0, 0, 0, 1/2, 1/2], [0, 0, 0, 1/2, 0, 1/2]])
#P is the original matrix containing 0 rows
col_len = len(P[0])
row_len = len(P)
eT = np.ones(shape=(1, col_len)) # Row vector of ones to replace row of zeros
e = eT.transpose() # it is a column vector e
eT_n = np.array(eT / col_len) # obtained by dividing row vector of ones by order of matrix
Rsum = 0
for i in range(row_len):
    for j in range(col_len):
        Rsum = Rsum + P[i][j]
    if Rsum == 0:
        P[i] = eT_n
    Rsum = 0
P_bar = P.astype(float) #P_bar is stochastic matrix obtained by replacing rows of zeros by eT_n in P
alpha = 0.85
P_dbar = alpha * P_bar + (1 - alpha) * e * (eT_n) #P_dbar is the irreducible matrix
print("The stochastic and irreducible matrix P_dbar is:\n", P_dbar)
Expected output:
A rectangular stochastic and irreducible matrix.
Actual output:
Traceback (most recent call last):
File "C:/Users/admin/PycharmProjects/Recommender/StochasticMatrix_11Aug19_BSK_v3.py", line 13, in <module>
P_dbar = alpha * P_bar + (1 - alpha) * e * (eT_n) #P_dbar is the irreducible matrix
ValueError: operands could not be broadcast together with shapes (5,6) (6,6)
The error comes from adding two arrays of different shapes: alpha * P_bar has shape (5, 6), i.e. 30 elements, while e * eT_n has shape (6, 6), i.e. 36 elements.
You have to make sure the array e * eT_n has the same shape as your input array P.
You are not using the row_len value when you build e. If e has the correct number of rows, your code will run.
# e = eT.transpose() # this will only work when the input array is square
e = np.ones(shape=(row_len, 1)) # this also works with a rectangular P
You can check that the shape is correct:
(e * eT_n).shape == P.shape
You should study the NumPy documentation and tutorials to learn how to use the ndarray data structure. It's very powerful, but also quite different from the native Python data types.
For example, you can replace this verbose and very slow nested Python loop with a single vectorized array operation.
Original code (with fixed indentation):
for i in range(row_len):
    Rsum = 0
    for j in range(col_len):
        Rsum = Rsum + P[i][j]
    if Rsum == 0:
        P[i] = eT_n
Idiomatic numpy code:
P[P.sum(axis=1) == 0] = eT_n
Furthermore, you don't need to create the array eT_n. Since it's just a single value repeated, you can assign the scalar 1/6 directly instead.
# eT = np.ones(shape=(1, col_len))
# eT_n = np.array(eT / col_len)
P[P.sum(axis=1) == 0] = 1 / P.shape[1]
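Putting the pieces above together, here is a minimal sketch of the whole conversion for a rectangular input. The function name make_stochastic_irreducible is made up for illustration, and the sketch assumes every non-zero row of P already sums to 1, as in the question's example. Since e * eT_n is just a matrix filled with 1 / col_len, it is replaced by a scalar here.

```python
import numpy as np

def make_stochastic_irreducible(P, alpha=0.85):
    # Hypothetical helper name; works for square and rectangular P.
    P_bar = np.array(P, dtype=float)              # float copy, input is left untouched
    n_cols = P_bar.shape[1]
    P_bar[P_bar.sum(axis=1) == 0] = 1.0 / n_cols  # replace all-zero rows by uniform rows
    # e * eT_n is a (rows, cols) matrix filled with 1 / n_cols,
    # so adding the scalar (1 - alpha) / n_cols is equivalent.
    return alpha * P_bar + (1 - alpha) / n_cols

P = np.array([[0, 1/2, 1/2, 0, 0, 0],
              [0, 0, 0, 0, 0, 0],
              [1/3, 1/3, 0, 0, 1/3, 0],
              [0, 0, 0, 0, 1/2, 1/2],
              [0, 0, 0, 1/2, 0, 1/2]])
P_dbar = make_stochastic_irreducible(P)
print(P_dbar.shape)        # same shape as P
print(P_dbar.sum(axis=1))  # every row should sum to 1
```

Every entry of the result is strictly positive, which is the property the original code was after.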
I am trying to create a function which can transform a given input sequence to a transition matrix of the requested order. I found an implementation for the first-order Markovian transition matrix.
Now, I want to be able to come up with a solution which can calculate 2nd and 3rd order transition matrices.
Example of the 1st order matrix implementation:
import numpy as np
# sequence with 3 states -> 0, 1, 2
a = [0, 1, 0, 0, 0, 2, 2, 1, 1, 1, 0, 0, 0, 0, 0, 1, 2, 2, 2, 0, 0, 2]
def transition_matrix_first_order(seq):
    M = np.full((3, 3), fill_value = 1/3, dtype= np.float64)
    for (i,j) in zip(seq, seq[1:]):
        M[i, j] += 1
    M = M / M.sum(axis = 1, keepdims = True)
    return M
Which gives me this:
[[0.61111111 0.19444444 0.19444444]
[0.38888889 0.38888889 0.22222222]
[0.22222222 0.22222222 0.55555556]]
A 2nd order matrix should have unique_state_count ** order rows and unique_state_count columns. In the example above, I have 3 unique states, so the matrix will have shape 9x3.
Desired function signature:
cal_tr_matrix(seq, unique_state_count, order)
I think you have a slight misunderstanding about Markov chains and their transition matrices.
First of all, the estimated transition matrix your function produces is unfortunately not correct. Why? Let's refresh.
A discrete Markov chain in discrete time with N different states has a transition matrix P of size N x N, where the (i, j) element is P(X_1=j|X_0=i), i.e. the probability of transition from state i to state j in a single time step.
Now a transition matrix of order n, denoted P^{(n)}, is once again a matrix of size N x N where the (i, j) element is P(X_n=j|X_0=i), i.e. the probability of transition from state i to state j in n time steps.
A wonderful result says: P^{(n)} = P^n, i.e. taking the n-th power of the single-step transition matrix gives you the n-step transition matrix.
With this recap, all that is needed is to estimate P from the given sequence; to estimate P^{(n)}, one can then take the n-th power of the estimated P. So how to estimate the matrix P? If we denote by N_{ij} the number of observed transitions from state i to state j and by N_{i*} the total number of observed transitions out of state i, then P_{ij} = N_{ij} / N_{i*}.
Overall here in Python:
import numpy as np
def transition_matrix(arr, n=1):
    """
    Computes the transition matrix from Markov chain sequence of order `n`.

    :param arr: Discrete Markov chain state sequence in discrete time with states in 0, ..., N - 1
    :param n: Transition order
    """
    # Count the observed one-step transitions i -> j.
    M = np.zeros(shape=(max(arr) + 1, max(arr) + 1))
    for (i, j) in zip(arr, arr[1:]):
        M[i, j] += 1
    # Normalize each row to get the one-step matrix, then take its n-th power.
    T = (M.T / M.sum(axis=1)).T
    return np.linalg.matrix_power(T, n)
>>> transition_matrix(arr=a, n=1)
array([[0.63636364, 0.18181818, 0.18181818],
       [0.4       , 0.4       , 0.2       ],
       [0.2       , 0.2       , 0.6       ]])
>>> transition_matrix(arr=a, n=2)
array([[0.51404959, 0.22479339, 0.26115702],
       [0.45454545, 0.27272727, 0.27272727],
       [0.32727273, 0.23636364, 0.43636364]])
>>> transition_matrix(arr=a, n=3)
array([[0.46927122, 0.23561232, 0.29511645],
       [0.45289256, 0.24628099, 0.30082645],
       [0.39008264, 0.24132231, 0.36859504]])
Interestingly, when you set the order n to a fairly high number, the higher and higher powers of the matrix P seem to converge to some very specific values. The limit is known as the stationary (invariant) distribution of the Markov chain, and it gives a very good indication of how the chain behaves over a long period of time (many transitions). Also:
P = transition_matrix(a, 1)
P111 = transition_matrix(a, 111)
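As a rough check of that convergence, the sketch below repeats the function and the sequence so that it runs on its own, then tests two things: that all rows of the 111th power are (numerically) identical, and that the common row pi satisfies pi P = pi. The variable name pi is introduced here for illustration only.

```python
import numpy as np

a = [0, 1, 0, 0, 0, 2, 2, 1, 1, 1, 0, 0, 0, 0, 0, 1, 2, 2, 2, 0, 0, 2]

def transition_matrix(arr, n=1):
    # Count one-step transitions, normalize rows, take the n-th matrix power.
    M = np.zeros(shape=(max(arr) + 1, max(arr) + 1))
    for (i, j) in zip(arr, arr[1:]):
        M[i, j] += 1
    T = (M.T / M.sum(axis=1)).T
    return np.linalg.matrix_power(T, n)

P = transition_matrix(a, 1)
P111 = transition_matrix(a, 111)

pi = P111[0]                     # candidate stationary distribution
print(np.allclose(P111, pi))     # are all rows of P^111 the same?
print(np.allclose(pi @ P, pi))   # is pi unchanged by one more step?
```

If both checks print True, every starting state leads to the same long-run distribution over states.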
EDIT: Now to the tweaked solution based on your comment. I'd suggest using higher-dimensional arrays for higher orders instead of exploding the number of rows. One way would be like this:
def cal_tr_matrix(arr, order):
    _shape = (max(arr) + 1,) * (order + 1)
    M = np.zeros(_shape)
    for _ind in zip(*[arr[_x:] for _x in range(order + 1)]):
        M[_ind] += 1
    return M
res1 = cal_tr_matrix(a, 1)
res2 = cal_tr_matrix(a, 2)
Now the element res1[i, j] says how many times transition i->j happened, while the element res2[i, j, k] says how many times transition i->j->k happened.
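If you still want the 9x3 layout from the question, the counts can be normalized along the last axis and then reshaped. The following is only a sketch: counts_to_probabilities is a hypothetical helper, and leaving never-observed histories as all-zero rows is an assumption (you may prefer a uniform row or smoothing instead).

```python
import numpy as np

def cal_tr_matrix(arr, order):
    # Count how often each (order + 1)-long run of states occurs.
    shape = (max(arr) + 1,) * (order + 1)
    M = np.zeros(shape)
    for ind in zip(*[arr[x:] for x in range(order + 1)]):
        M[ind] += 1
    return M

def counts_to_probabilities(counts):
    # Hypothetical helper: divide by the total per history (last axis).
    # Histories that never occur keep an all-zero row (an assumption).
    totals = counts.sum(axis=-1, keepdims=True)
    return np.divide(counts, totals, out=np.zeros_like(counts), where=totals > 0)

a = [0, 1, 0, 0, 0, 2, 2, 1, 1, 1, 0, 0, 0, 0, 0, 1, 2, 2, 2, 0, 0, 2]
counts2 = cal_tr_matrix(a, 2)
probs2 = counts_to_probabilities(counts2)

# Flatten the history axes: row i * 3 + j holds P(next = k | previous two = i, j).
probs2_flat = probs2.reshape(-1, probs2.shape[-1])
print(probs2_flat.shape)  # (9, 3)
```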
Let's say I have a 2D array of shape (N, N):
import numpy as np
my_array = np.random.random((N, N))
Now I want to do some computations only on some "cells" of this array, for instance the ones inside the central part of the array. To avoid doing computations on cells I'm not interested in, what I usually do here is create a Boolean mask, in this spirit:
my_mask = np.zeros_like(my_array, bool)
my_mask[40:61,40:61] = True
my_array[my_mask] = some_twisted_computations(my_array[my_mask])
But what if some_twisted_computations() involves values of the neighboring cells if they are inside the mask? Performance-wise, would it be a good idea to create an "adjacency array" with a (len(my_mask), 4) shape, storing the index of 4-connected neighbor cells in the flat my_array[mask] array that I will use in some_twisted_computations()? If yes, what are the efficient options for computing such an adjacency array? Should I switch to a lower-level language or other data structures?
My real-world arrays have shapes around (1000, 1000, 1000); the mask covers only a small subset (~100000) of these values and has a rather complex geometry. I hope my questions make sense...
EDIT: the very dirty and slow solution I've worked out:
wall = mask
i = 0
top_neighbors = []
down_neighbors = []
left_neighbors = []
right_neighbors = []
indices = []
for index, val in np.ndenumerate(wall):
    if not val:
        continue
    indices += [index]
    # store the neighbor's coordinates if it is in the mask, else the cell's own index
    if wall[index[0] + 1, index[1]]:
        down_neighbors += [(index[0] + 1, index[1])]
    else:
        down_neighbors += [i]
    if wall[index[0] - 1, index[1]]:
        top_neighbors += [(index[0] - 1, index[1])]
    else:
        top_neighbors += [i]
    if wall[index[0], index[1] - 1]:
        left_neighbors += [(index[0], index[1] - 1)]
    else:
        left_neighbors += [i]
    if wall[index[0], index[1] + 1]:
        right_neighbors += [(index[0], index[1] + 1)]
    else:
        right_neighbors += [i]
    i += 1
# convert stored coordinates to indices into the flat masked array
top_neighbors = [i if type(i) is int else indices.index(i) for i in top_neighbors]
down_neighbors = [i if type(i) is int else indices.index(i) for i in down_neighbors]
left_neighbors = [i if type(i) is int else indices.index(i) for i in left_neighbors]
right_neighbors = [i if type(i) is int else indices.index(i) for i in right_neighbors]
The best answer will probably depend on the nature of the computations you want to do. For example, if they can be expressed as summations over neighboring pixels, then something like scipy.ndimage.convolve or scipy.signal.fftconvolve can be a really nice solution (np.convolve only handles 1D input).
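As an illustration of the convolution route, here is a small sketch that computes the mean of the 4-connected neighbors that lie inside the mask, using scipy.ndimage.convolve. The square mask, the choice of statistic, and all variable names are assumptions made for the example, not part of the question.

```python
import numpy as np
from scipy.ndimage import convolve

rng = np.random.default_rng(0)
x = rng.random((100, 100))
mask = np.zeros_like(x, dtype=bool)
mask[40:61, 40:61] = True            # example mask: a central square

# 4-connected neighborhood, excluding the center cell
kernel = np.array([[0, 1, 0],
                   [1, 0, 1],
                   [0, 1, 0]], dtype=float)

# Zero out everything outside the mask so it cannot contribute to the sums.
masked_values = np.where(mask, x, 0.0)
neighbor_sum = convolve(masked_values, kernel, mode="constant", cval=0.0)
neighbor_count = convolve(mask.astype(float), kernel, mode="constant", cval=0.0)

# Mean over in-mask neighbors, evaluated only at the masked cells.
# Every cell of this square mask has at least two in-mask neighbors,
# so the division is safe here; an arbitrary mask may need a guard.
result = neighbor_sum[mask] / neighbor_count[mask]
```

Note that this convolves the full array, so for a very sparse mask inside a huge array the index-based approach may be cheaper.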
For your specific question of efficiently generating arrays of neighbor indices, you might try something like this:
x = np.random.rand(100, 100)
mask = x > 0.9
i, j = np.where(mask)
i_neighbors = i[:, np.newaxis] + [0, 0, -1, 1]
j_neighbors = j[:, np.newaxis] + [-1, 1, 0, 0]
# need to do something with the edge cases
# the best choice will depend on your application
# here we'll change out-of-bounds neighbors to the
# central point itself.
i_neighbors = np.clip(i_neighbors, 0, 99)
j_neighbors = np.clip(j_neighbors, 0, 99)
# compute some vectorized result over the neighbors
# as a concrete example, here we'll do a standard deviation
result = x[i_neighbors, j_neighbors].std(axis=1)
The result is an array of values corresponding to the masked region, containing the standard deviation of neighboring values.
Hopefully that approach will work for whatever specific problem you have in mind!
Edit: given the edited question above, here's how my response can be adapted to generate arrays of indices in a vectorized manner:
x = np.random.rand(100, 100)
mask = x > -0.9
i, j = np.where(mask)
i_neighbors = i[:, np.newaxis] + [0, 0, -1, 1]
j_neighbors = j[:, np.newaxis] + [-1, 1, 0, 0]
i_neighbors = np.clip(i_neighbors, 0, 99)
j_neighbors = np.clip(j_neighbors, 0, 99)
indices = np.zeros(x.shape, dtype=int)
indices[mask] = np.arange(len(i))
neighbor_in_mask = mask[i_neighbors, j_neighbors]
neighbors = np.where(neighbor_in_mask,
indices[i_neighbors, j_neighbors],
np.arange(len(i))[:, None])
left_indices, right_indices, top_indices, bottom_indices = neighbors.T
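To show how such an index table might be used on the flat masked values, here is a self-contained sketch on a tiny grid. The function name neighbor_table and the neighbor-mean computation are illustrative choices; as above, a neighbor outside the mask falls back to the cell's own index.

```python
import numpy as np

def neighbor_table(mask):
    # Hypothetical helper: (n_masked, 4) table of indices into x[mask],
    # in the order left, right, top, bottom.
    i, j = np.where(mask)
    i_nb = np.clip(i[:, None] + [0, 0, -1, 1], 0, mask.shape[0] - 1)
    j_nb = np.clip(j[:, None] + [-1, 1, 0, 0], 0, mask.shape[1] - 1)
    indices = np.zeros(mask.shape, dtype=int)
    indices[mask] = np.arange(len(i))      # position of each masked cell in x[mask]
    own = np.arange(len(i))[:, None]       # fallback: the cell itself
    return np.where(mask[i_nb, j_nb], indices[i_nb, j_nb], own)

x = np.arange(25, dtype=float).reshape(5, 5)
mask = np.zeros((5, 5), dtype=bool)
mask[1:4, 1:4] = True

values = x[mask]                 # flat array of the 9 masked values
nb = neighbor_table(mask)
neighbor_mean = values[nb].mean(axis=1)

print(nb[4])             # neighbors of the central cell (2, 2)
print(neighbor_mean[4])  # mean of x[2, 1], x[2, 3], x[1, 2], x[3, 2]
```

Once the table is built, each iteration of the computation only touches the flat values array, which is the point of precomputing the adjacency.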
I want to compute the pairwise squared distance of a batch of features in TensorFlow. I have a simple implementation using + and * operations by tiling the original tensor:
def pairwise_l2_norm2(x, y, scope=None):
    with tf.op_scope([x, y], scope, 'pairwise_l2_norm2'):
        size_x = tf.shape(x)[0]
        size_y = tf.shape(y)[0]
        xx = tf.expand_dims(x, -1)
        xx = tf.tile(xx, tf.pack([1, 1, size_y]))

        yy = tf.expand_dims(y, -1)
        yy = tf.tile(yy, tf.pack([1, 1, size_x]))
        yy = tf.transpose(yy, perm=[2, 1, 0])

        diff = tf.sub(xx, yy)
        square_diff = tf.square(diff)

        square_dist = tf.reduce_sum(square_diff, 1)

        return square_dist
This function takes as input two matrices of size (m, d) and (n, d) and computes the squared distance between each pair of row vectors. The output is a matrix of size (m, n) with elements d_ij = dist(x_i, y_j).
The problem is that I have a large batch and high-dimensional features, so with large m, n and d, replicating the tensor consumes a lot of memory.
I'm looking for another way to implement this without increasing the memory usage, storing only the final distance tensor, somewhat like a double loop over the original tensor.
You can use some linear algebra to turn it into matrix ops. Note that what you need is a matrix D where a[i] is the ith row of your original matrix and
D[i,j] = (a[i]-a[j])(a[i]-a[j])'
You can rewrite that into
D[i,j] = r[i] - 2 a[i]a[j]' + r[j]
Where r[i] is squared norm of ith row of the original matrix.
In a system that supports standard broadcasting rules you can treat r as a column vector and write D as
D = r - 2 A A' + r'
In TensorFlow you could write this as
A = tf.constant([[1, 1], [2, 2], [3, 3]])
r = tf.reduce_sum(A*A, 1)
# turn r into column vector
r = tf.reshape(r, [-1, 1])
D = r - 2*tf.matmul(A, tf.transpose(A)) + tf.transpose(r)
sess = tf.Session()
sess.run(D)
array([[0, 2, 8],
[2, 0, 2],
[8, 2, 0]], dtype=int32)
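The same identity extends to two different matrices x of size (m, d) and y of size (n, d), which is the case in the question. The sketch below uses NumPy rather than TensorFlow purely to illustrate the algebra; the function name pairwise_sq_dists is made up, and clamping at zero is an added guard against small negative values from floating-point round-off.

```python
import numpy as np

def pairwise_sq_dists(x, y):
    # D[i, j] = |x[i]|^2 - 2 * x[i] . y[j] + |y[j]|^2
    x_norms = np.sum(x * x, axis=1)[:, None]   # column vector, shape (m, 1)
    y_norms = np.sum(y * y, axis=1)[None, :]   # row vector, shape (1, n)
    d = x_norms - 2.0 * (x @ y.T) + y_norms    # broadcasts to (m, n)
    return np.maximum(d, 0.0)                  # guard against round-off below zero

rng = np.random.default_rng(0)
x = rng.random((4, 3))
y = rng.random((5, 3))
D = pairwise_sq_dists(x, y)

# Brute-force reference that materializes the (m, n, d) difference tensor.
D_ref = ((x[:, None, :] - y[None, :, :]) ** 2).sum(axis=-1)
print(np.allclose(D, D_ref))
```

The only intermediate larger than the inputs is the (m, n) product x @ y.T, which is the memory saving the question asks for.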
Using squared_difference:
def squared_dist(A):
    expanded_a = tf.expand_dims(A, 1)
    expanded_b = tf.expand_dims(A, 0)
    distances = tf.reduce_sum(tf.squared_difference(expanded_a, expanded_b), 2)
    return distances
One thing I noticed is that this solution using tf.squared_difference gives me out of memory (OOM) for very large vectors, while the approach by @YaroslavBulatov doesn't. So, I think decomposing the operation yields a smaller memory footprint (which I thought squared_difference would handle better under the hood).
Here is a more general solution for two tensors of coordinates A and B:
def squared_dist(A, B):
    assert A.shape.as_list() == B.shape.as_list()

    row_norms_A = tf.reduce_sum(tf.square(A), axis=1)
    row_norms_A = tf.reshape(row_norms_A, [-1, 1])  # Column vector.

    row_norms_B = tf.reduce_sum(tf.square(B), axis=1)
    row_norms_B = tf.reshape(row_norms_B, [1, -1])  # Row vector.

    return row_norms_A - 2 * tf.matmul(A, tf.transpose(B)) + row_norms_B
Note that this is the squared distance. If you want the Euclidean distance instead, apply tf.sqrt to the result. If you do that, don't forget to add a small constant to compensate for floating-point instabilities: dist = tf.sqrt(squared_dist(A, B) + 1e-6).
If you want to compute this another way, you can change the order of the TensorFlow operations:
def compute_euclidean_distance(x, y):
    size_x = x.shape.dims[0]
    size_y = y.shape.dims[0]
    for i in range(size_x):
        # repeat row i of x so it can be compared against every row of y
        tile_one = tf.reshape(tf.tile(x[i], [size_y]), [size_y, -1])
        eu_one = tf.expand_dims(tf.sqrt(tf.reduce_sum(tf.pow(tf.subtract(tile_one, y), 2), axis=1)), axis=0)
        if i == 0:
            d = eu_one
        else:
            d = tf.concat([d, eu_one], axis=0)
    return d
I am having trouble converting some MATLAB code into python. I am trying to build a signal by adding in shifted copies of base signal into a much longer one. The code that works in MATLAB is
function [time, signal] = generateRandomSignal(pulse,data,samples,Tb)
    N = length(data);
    time = linspace(0,N*Tb,samples*N);
    signal = zeros(1,length(time));
    k = 1;
    for n = 1:N
        window = k:k+samples-1;
        signal(window) = signal(window) + data(n)*pulse;
        k = k + samples;
    end
end
In Python, using the variable to slice the larger array wasn't working, so I changed that. I now have what I think should work, but I keep getting errors about inconsistent array sizes, even though the sizes look correct when I inspect them in a debugger.
from numpy import *
def generateRandomSignal(pulse,data,samples,Tb):
    N = data.size;
    time = linspace(0,N*Tb,samples*N);
    signal = zeros((1,time.size));
    k = 0;
    for n in range(0,N):
        signal[k:k+samples] = signal[k:k+samples].copy() + data[n]*pulse[:].copy();
        k = k + samples;
    return time, signal
What is the correct way to do this in Python?
EDIT: Minimal expected input and output
data = [1, -1, 0, 1, 1]
pulse = [1, 1, 1]
samples = 3. #length of pulse
Tb = 0.1
signal = [1, 1, 1, -1, -1, -1, 0, 0, 0, 1, 1, 1, 1, 1, 1]
time = vector of 15 points evenly spaced from 0 to 0.3. (Not the problem)
EDIT2 Error
ValueError: operands could not be broadcast together with shapes (1920,) (1,4410)
That is the actual error produced. (1,4410) is the correct shape for the pulse array, but I have no idea where the 1920 is coming from or what the empty comma means.
Change your definition of signal to signal = zeros(time.size). Unlike Matlab, NumPy's 1D arrays have shape (N,), not (N,1).
I can't see why you should have 0 index in signal:
signal[0,k:k+samples] = signal[0,k:k+samples].copy() + data[n]*pulse[:].copy();
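For reference, here is one way the whole function could look with 1D arrays throughout. This is a sketch, not the only option: the snake_case name is my own, the inputs are flattened with ravel() on the assumption that pulse may arrive with shape (1, samples) as in the error message, and samples is cast to int so it can be used in slices.

```python
import numpy as np

def generate_random_signal(pulse, data, samples, Tb):
    # Flatten inputs so shapes like (1, samples) behave as plain 1D arrays.
    pulse = np.asarray(pulse, dtype=float).ravel()
    data = np.asarray(data, dtype=float).ravel()
    samples = int(samples)              # slice bounds must be integers
    n = data.size
    time = np.linspace(0, n * Tb, samples * n)
    signal = np.zeros(time.size)        # 1D, shape (samples * n,)
    for k, value in enumerate(data):
        start = k * samples
        signal[start:start + samples] += value * pulse
    return time, signal

time, signal = generate_random_signal(pulse=[1, 1, 1], data=[1, -1, 0, 1, 1],
                                      samples=3, Tb=0.1)
print(signal)
```

When the pulse is exactly samples long, as here, np.kron(data, pulse) builds the same signal without a loop.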
What is the easiest/fastest way to take a weighted sum of values in a numpy array?
Example: Solving the heat equation with the Euler method
u=zeros((length_l,length_l))# (x,y)
u[:, 0]=1
def dStep(ALPHA=0.1):
    for position,value in ndenumerate(u):
        D2u= (u[position+(1,0)]-2*value+u[position+(-1, 0)])/(1**2) \
            +(u[position+(0,1)]-2*value+u[position+( 0,-1)])/(1**2)
while True:
D2u should be the second central difference in two dimensions. This would work if I could add indexes like (1,4)+(1,3)=(2,7). Unfortunately, python adds them as (1,4)+(1,3)=(1,4,1,3).
Note that computing D2u is equivalent to taking a dot product with this kernel centered around the current position:
0, 1, 0
1,-4, 1
0, 1, 0
Can this be vectorised as a dot product?
I think you want something like:
import numpy as np
from scipy.ndimage import convolve
length_l = 10
time_l = 10
u = np.zeros((length_l, length_l))# (x,y)
u[:, 0] = 1
u[:, -1] = 1
alpha = .1
weights = np.array([[ 0, 1, 0],
[ 1, -4, 1],
[ 0, 1, 0]])
for i in range(5):
    u += alpha * convolve(u, weights)
You could shorten this a bit by folding alpha and the identity term into the kernel:
weights = alpha * weights
weights[1, 1] = weights[1, 1] + 1
for i in range(5):
    u = convolve(u, weights)
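If you would rather not depend on SciPy, the same stencil can be written with shifted NumPy slices. The sketch below differs from the convolve version in one respect, which is an assumption on my part: the outermost rows and columns are held fixed as boundary values instead of being handled by convolve's default reflecting boundary. The function name euler_step is illustrative.

```python
import numpy as np

def euler_step(u, alpha=0.1):
    # One explicit Euler step of the 2D heat equation on the interior cells;
    # boundary cells are left unchanged (assumed fixed boundary values).
    new = u.copy()
    laplacian = (u[2:, 1:-1] + u[:-2, 1:-1]      # neighbors below and above
                 + u[1:-1, 2:] + u[1:-1, :-2]    # neighbors right and left
                 - 4.0 * u[1:-1, 1:-1])
    new[1:-1, 1:-1] += alpha * laplacian
    return new

length_l = 10
u = np.zeros((length_l, length_l))
u[:, 0] = 1
u[:, -1] = 1
for _ in range(5):
    u = euler_step(u)
```

The slices express the index arithmetic the question wanted, such as (1, 4) + (1, 3) = (2, 7), by shifting whole blocks of the array at once.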