Making two (MxD) tensors into a single (Mx1) - python

The shape of Y[n,:,:] is (200,1), so I need Z[n,:,:]*H[n,:,:] (or something related) to be (200,1) as well. But Z[n,:,:] and H[n,:,:] are both (200,6), so I need an operation that multiplies them and eliminates the dimension of size 6, giving a result of shape (200,1). Any suggestions? The code is below:
n=10
M = 200
D=6
dW = np.sqrt(1/n)*np.random.randn(n,M,D)
H=cap(dW,1/n,np.log(n))#the generation of the Brownian motion increment
X = define_X(1,dW,1,1,1)
H[1]
H.shape
Y = np.zeros((n+1,M,1))
Z = np.zeros_like(X)
Z[n-1,:,:]=np.dot(np.transpose(Y[n,:,:]),H[n-1,:,:])
Y[n-1,:,:]= Y[n,:,:] +f(X[n-1,:,:],Y[n,:,:],Z[n-1,:,:])*(1/10)-Z[n,:,:]*H[n,:,:]
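One possible reading of "multiplies them and gets rid of the 6" is a row-wise inner product: multiply element-wise, then sum over the last axis. The sketch below assumes that is the intended operation. The arrays Z_n and H_n are random stand-ins for Z[n,:,:] and H[n,:,:], not the original data.

```python
import numpy as np

M, D = 200, 6
rng = np.random.default_rng(0)
Z_n = rng.standard_normal((M, D))  # stand-in for Z[n, :, :]
H_n = rng.standard_normal((M, D))  # stand-in for H[n, :, :]

# Element-wise product, then sum over the D axis; keepdims keeps the shape (M, 1)
zh_sum = np.sum(Z_n * H_n, axis=1, keepdims=True)

# The same contraction written with einsum, then a trailing axis is added back
zh_einsum = np.einsum("md,md->m", Z_n, H_n)[:, None]

print(zh_sum.shape, zh_einsum.shape)
```

Either result has shape (200, 1) and can be subtracted from Y[n,:,:] directly. Whether a sum over D is the mathematically right contraction depends on the scheme being implemented.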

Related

Understanding fancy einsum equation

I was reading about attention and came across this equation:
import einops
from fancy_einsum import einsum
import torch
x = torch.rand((200, 10, 768))
y = torch.rand((20, 768, 64))
res = einsum("batch query_pos d_model, n_heads d_model d_head -> batch query_pos n_heads d_head", x, y)
And I am not able to understand the underlying operations that give the result res
I thought it might be matmul and tried this:
import torch
x_ = x.unsqueeze(dim = 2).unsqueeze(dim = 2)
y_ = torch.broadcast_to(y, (1, 1, 20, 768, 64))
res2 = x_ @ y_
res2 = res2.squeeze(dim = -2)
(res == res2).all() # Prints False
But that does not seem to be right.
Any help regarding this is greatly appreciated
Whenever you use einsum, it is best to think about the meaning of the dimensions. In this case we perform a multiplication between the two inputs. The signature passed to einsum shows which dimensions are preserved and which are "summed away". I simplified the signature to single letters here:
res = einsum("b q m, n m h -> b q n h", x, y)
We can read from this that both x and y have three dimensions. Furthermore, both have a dimension called m, and it does not appear in the output, so we can conclude that it gets "summed away". Each entry of the output is therefore given by the following formula. For simplicity I reused the dimension names as indices, so for every b,q,n,h we get
$$\mathrm{res}[b,q,n,h] = \sum_{m} x[b,q,m] \cdot y[n,m,h]$$
To do this with any other function than einsum is usually more cumbersome. So first we need to reorder and unsqueeze the dimensions in a way that they are compatible to be multiplied, so we can do the following (the shapes annotated above):
# shapes: (b,q,m,n,h) = (b,q,m,1,1) * (m,n,h)
product = x[:, :, :, None, None] * y.permute([1,0,2])
Due to the broadcasting rules, the second (y-) term will implicitly get the required leading dummy dimensions.
Then we can "sum away" the dimension m:
res = product.sum(dim=2) # (b,q,n,h)
So you can interpret that as a matrix multiplication if you want, or also just a scalar product, but of course with many "batch"-dimensions.
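To check the equivalence numerically, here is a small sketch. It uses NumPy's np.einsum in place of torch and fancy_einsum (a substitution made only so the example is self-contained), with small made-up shapes. It also includes a batched matmul along the lines of the attempt in the question. Results are compared with np.allclose rather than ==, since different summation orders can produce slightly different floating-point values.

```python
import numpy as np

rng = np.random.default_rng(0)
b, q, m, n, h = 4, 3, 8, 5, 2          # small illustrative sizes
x = rng.random((b, q, m))
y = rng.random((n, m, h))

# Reference: contract over m, keep b, q, n, h
res_einsum = np.einsum("bqm,nmh->bqnh", x, y)

# Broadcasted product of shape (b, q, m, n, h), then sum over the m axis
product = x[:, :, :, None, None] * y.transpose(1, 0, 2)
res_sum = product.sum(axis=2)

# Batched matmul: (b, q, 1, 1, m) @ (1, 1, n, m, h) -> (b, q, n, 1, h)
res_matmul = (x[:, :, None, None, :] @ y[None, None]).squeeze(-2)

print(np.allclose(res_einsum, res_sum), np.allclose(res_einsum, res_matmul))
```

All three arrays have shape (b, q, n, h). An exact == comparison between them is not guaranteed to hold element by element, which may be why the check in the question printed False.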

Creating a matrix for an AR model, but np.arange returns None even though the printed matrix is correct. How do I convert it to an array?

I have a function that generates the matrices x and y for an AR model, so that I can calculate the coefficients with the least-squares method. When I print the result of indexing with np.arange, the x matrix is displayed as it should be, but when I convert that value to an array the result is not correct. How can I correctly generate the array version of the matrix? Thank you!
#example lists
x = [1,2,3,4,5]
y = [2,4,6,8,10,12,14]
def matrix(x, na): #na is the order of the AR model
    X = np.array(x)
    N = len(X)
    p = na
    for n in range(p, N):
        u = X[np.arange((n-1),(n-p-1),-1)]
        matrix = print(u)
        array = np.array(matrix) #not correct
        #need to get the negative versions of u
        #but u isnt an array so I wasn't able multiply by -1
    #matrix_y
    y = X[na:]
    return matrix, array
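A note on the symptom: print always returns None, so matrix = print(u) stores None and np.array(matrix) wraps that None. One way around it is to collect the rows and stack them. The sketch below assumes the goal is the negated matrix of lagged values plus the target vector X[na:]. The function name lag_matrix is made up for illustration.

```python
import numpy as np

def lag_matrix(x, na):
    """Return (-lagged, target) for an AR model of order na (hypothetical helper)."""
    X = np.asarray(x, dtype=float)
    N = len(X)
    # Row for time n holds X[n-1], X[n-2], ..., X[n-na]
    rows = [X[np.arange(n - 1, n - na - 1, -1)] for n in range(na, N)]
    lagged = np.array(rows)   # shape (N - na, na)
    target = X[na:]           # shape (N - na,)
    return -lagged, target

neg_lagged, target = lag_matrix([1, 2, 3, 4, 5], na=2)
print(neg_lagged)
print(target)
```

Each u is already a NumPy array, so multiplying by -1 works on it directly. The problem in the original code is only that the return value of print was stored instead of u.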

How do I convert this Matlab code with meshgrid and arrays to Python code?

I am attempting to write a program which constructs a matrix and performs a singular value decomposition on it. I am evaluating the function ax^2 +bx + 1 on a grid. I then make a uniform meshgrid of a and b. The rows of the matrix correspond to different quadratic coefficients, while each column corresponds to a grid point at which the function is evaluated.
The matlab code is here:
% Collect data
x = linspace(-1,1,100);
[a,b] = meshgrid(0:0.1:1,0:0.1:1);
D=zeros(numel(x),numel(a));
sz = size(D)
% Build “Dose” matrix
for i=1:numel(a)
D(:,i) = a(i)*x.^2+b(i)*x+1;
end
% Do the SVD:
[U,S,V]=svd(D,'econ');
D_reconstructed = U*S*V';
plot(diag(S))
scatter3(a(:),b(:),V(:,1))
This is my attempt at a solution:
import numpy as np
import matplotlib.pyplot as plt
x = np.linspace(-1, 1, 100)
def f(x, a, b):
    return a*x*x + b*x + 1
a, b = np.mgrid[0:1:0.1,0:1:0.1]
#a = b = np.arange(0,1,0.01)
D = np.zeros((x.size, a.size))
for i in range(a.size):
    D[i] = a[i]*x*x +b[i]*x +1
U, S, V = np.linalg.svd(D)
plt.plot(np.diag(S))
fig = plt.figure()
ax = plt.axes(projection="3d")
ax.scatter(a, b, V[0])
but I always get broadcasting errors which I am not sure how to fix.
Firstly, in MATLAB you're assigning to D(:,i), but in python you're assigning to D[i]. The latter is equivalent to D[i, ...] which is in your case D[i, :]. Instead you seem to need D[:, i].
Secondly, in MATLAB using a linear index into a 2d array (namely a and b) will give you flattened views. If you do that with numpy you get slices of an array instead, just as I mentioned with D[i].
You can do away with the loop with broadcasting and getting your desired 2d array by .ravelling (or reshaping) your a and b arrays:
x = np.linspace(-1, 1, 100)[:, None] # inject trailing singleton for broadcasting
a, b = np.mgrid[0:1:0.1, 0:1:0.1]
D = a.ravel() * x**2 + b.ravel() * x + 1
The way this works is that x has shape (100, 1) after we inject a trailing singleton (in MATLAB trailing singletons are implied, in numpy leading ones), and both a.ravel() and b.ravel() have shape (10*10,) which is compatible with (1, 10*10), making broadcasting possible into shape (100, 10*10). You could also replace the calls to ravel with
a, b = np.mgrid[...].reshape(2, -1)
which is a trick I sometimes use, but this is harder to read if you're unfamiliar with the pattern.
Side note: it's better to use example data where dimensions end up being of different size so that you notice if something ends up being transposed.
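Putting the pieces together, a possible end-to-end sketch follows. A few details here go beyond the answer above and should be read as assumptions about what the port is meant to do: MATLAB's 0:0.1:1 includes the endpoint (11 points) whereas np.mgrid[0:1:0.1] does not (10 points), so a complex step 11j is used; np.linalg.svd returns the singular values as a 1-D array and the right singular vectors already transposed, so the third return value is named Vh; and full_matrices=False plays the role of MATLAB's 'econ' option.

```python
import numpy as np

x = np.linspace(-1, 1, 100)[:, None]            # shape (100, 1)
a, b = np.mgrid[0:1:11j, 0:1:11j]               # 11 points per axis, endpoints included
D = a.ravel() * x**2 + b.ravel() * x + 1        # shape (100, 121)

# Economy-size SVD: S is 1-D, Vh holds the right singular vectors as rows
U, S, Vh = np.linalg.svd(D, full_matrices=False)
D_reconstructed = U @ np.diag(S) @ Vh

first_right_vector = Vh[0]                      # counterpart of V(:,1) in MATLAB
print(D.shape, S.shape, Vh.shape)
```

For plotting, S can be passed to plt.plot directly (np.diag(S) would build a square matrix from it), and the 3-D scatter would take a.ravel(), b.ravel() and first_right_vector.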

numpy contour plot with cost function

a = np.array(x)
b = np.array(y)
a_transpose = a.transpose()
a_trans_times_a = np.dot(a_transpose,a)
a_trans_times_b = np.dot(a_transpose,b)
def cost(theta):
    x_times_theta = np.dot(a, theta)
    _y_minus_x_theta = b - x_times_theta
    _y_minus_x_theta_transpose = _y_minus_x_theta.transpose()
    return np.dot(_y_minus_x_theta_transpose, _y_minus_x_theta)
n = 256
p = np.linspace(-100,100, n)
q= np.linspace(-100,100, n)
P, Q = np.meshgrid(p,q)
pl.contourf(P, Q, cost(np.array([P,Q])) ,8, alpha =0.75, cmap = 'jet')
C = pl.contour(P,Q, cost(np.array([P,Q])), 8, colors = 'black', linewidth = 0.5 )
Hi, I'm trying to make a contour plot of a cost function of two parameters that involves matrix multiplication. I've tested the cost function and it works properly in an interactive session. However, evaluating it on a grid built from linspace raises "ValueError: objects are not aligned". I understand now that it has to do with how I structure P,Q. Would the solution involve writing a for loop to explicitly get an array of outputs? How would I write this?
EDIT: a,b are matrices with correct size. The cost function takes a 2-vector and outputs a number.
It's hard to know exactly without having the shapes of a and b at hand, but this error is probably caused by np.array([P,Q]) being a 3-dimensional array. It seems you expect it to be 2-dimensional and for np.dot(a,theta) to perform matrix multiplication.
Presumably you want theta to be the angular coordinate at a particular x and y value. In this case you should do
theta = np.arctan2(Q,P) #this is a 2D array of theta coordinates
costarray = cost(theta)
pl.contourf(P,Q,costarray,8,alpha=0.75,cmap='jet')
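If theta is instead meant to be the 2-vector of parameters, as the EDIT in the question suggests, another option is to evaluate the cost at every grid point by stacking the grid into a (2, n*n) array. The sketch below works under that assumption. The matrices a and b are random placeholders with made-up shapes (50, 2) and (50,), not the data from the question.

```python
import numpy as np

rng = np.random.default_rng(0)
a = rng.normal(size=(50, 2))   # placeholder design matrix
b = rng.normal(size=50)        # placeholder targets

def cost(theta):
    # Squared residual norm for a single 2-vector theta
    r = b - a @ theta
    return r @ r

n = 64
p = np.linspace(-100, 100, n)
q = np.linspace(-100, 100, n)
P, Q = np.meshgrid(p, q)

# One column per grid point: shape (2, n*n)
thetas = np.stack([P.ravel(), Q.ravel()])
residuals = b[:, None] - a @ thetas                    # shape (50, n*n)
cost_grid = (residuals**2).sum(axis=0).reshape(P.shape)
```

cost_grid then has the same shape as P and Q and could be passed to contourf in place of cost(np.array([P,Q])).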

Correspondence between a "ij" meshgrid and a long meshgrid

Consider a matrix Z that contains grid-based results for z = z(a,m,e). Z has shape (len(aGrid), len(mGrid), len(eGrid)). Z[0,1,2] contains z(a=aGrid[0], m=mGrid[1], e=eGrid[2]). However, we may have removed some elements of the state space from the object (for example, and for simplicity, all (a,m,e) with a > 3). Say that the size of the valid state space is x.
A code snippet was suggested to me for transforming this object into an object Z2 of shape (x, 3). Every row in Z2 corresponds to an element i from Z: (aGrid[a[i]], mGrid[m[i]], eGrid[e[i]]).
# first create Z, a mesh grid based matrix that has some invalid states (we set them to NaN)
aGrid = np.arange(0, 10, dtype=float)
mGrid = np.arange(100, 110, dtype=float)
eGrid = np.arange(1000, 1200, dtype=float)
A,M,E = np.meshgrid(aGrid, mGrid, eGrid, indexing='ij')
Z = A
Z[Z > 3] = np.nan #remove some states from being "allowed"
# now, translate them from shape (len(aGrid), len(mGrid), len(eGrid)) to
grids = [A,M,E]
grid_bc = np.broadcast_arrays(*grids)
Z2 = np.column_stack([g.ravel() for g in grid_bc])
Z2[np.isnan(Z.ravel())] = np.nan
Z3 = Z2[~np.isnan(Z2).any(axis=1)]
Through some computation, I then get a matrix V4 that has the shape of Z3 but contains 4 columns.
I am given
Z2 (as above)
Z3 (as above)
V4 which is a matrix shape (Z3.shape[0], Z3.shape[1]+1): it has an additional column appended
(if necessary, I still have access to the grid A,M,E)
and I need to recreate
V, which is the matrix that contains the values (of the last column) of V4, but is transformed back to the shape of Z.
That is, if there is a row in V4 that reads (aGrid[0], mGrid[1], eGrid[2], v1), then the value of V at V[0,1,2] = v1, and so on for all rows in V4.
Efficiency is key.
Given your original problem conditions, recreated as follows, modified such that Z is a copy of A:
aGrid = np.arange(0, 10, dtype=float)
mGrid = np.arange(100, 110, dtype=float)
eGrid = np.arange(1000, 1200, dtype=float)
A,M,E = np.meshgrid(aGrid, mGrid, eGrid, indexing='ij')
Z = A.copy()
Z[Z > 3] = np.nan
grids = [A,M,E]
grid_bc = np.broadcast_arrays(*grids)
Z2 = np.column_stack([g.ravel() for g in grid_bc])
Z2[np.isnan(Z.ravel())] = np.nan
Z3 = Z2[~np.isnan(Z2).any(axis=1)]
A function can be defined as follows to recreate a dense N-D matrix from a sparse 2D matrix of shape (number of data points) x (number of dimensions + 1). The first argument of the function is the aforementioned 2D matrix; the remaining (optional) arguments are the grid values for each dimension:
import numpy as np
def map_array_to_index(uniq_arr):
    # vectorized lookup from a coordinate value to its index along the axis
    return np.vectorize(dict(map(reversed, enumerate(uniq_arr))).__getitem__)
def recreate(arr, *coord_arrays):
    # if the grids are not supplied, fall back to the unique values per column
    if len(coord_arrays) != arr.shape[1] - 1:
        coord_arrays = [np.unique(col) for col in arr.T[0:-1]]
    lookups = [map_array_to_index(c) for c in coord_arrays]
    new_array = np.full([len(c) for c in coord_arrays], np.nan)
    new_array[tuple(l(c) for c, l in zip(arr.T[0:-1], lookups))] = arr[:, -1]
    new_grids = np.meshgrid(*coord_arrays, indexing='ij')
    return new_array, new_grids
Given a 2D matrix V4, defined above with values derived from Z,
V4 = np.column_stack([g.ravel() for g in grid_bc] + [Z.ravel()])
it is possible to recreate Z as follows:
V4_orig_form, V4_grids = recreate(V4, aGrid, mGrid, eGrid)
All non-NaN values correctly test for equality:
np.all(Z[~np.isnan(Z)] == V4_orig_form[~np.isnan(V4_orig_form)])
The function also works without aGrid, mGrid, eGrid passed in, but in this case it will not include any coordinate that is not present in the corresponding column of the input array.
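Since efficiency was mentioned, one alternative to the dictionary lookup is np.searchsorted, which maps coordinate values to grid indices without a Python-level call per element. This is a different technique from the function above, sketched under two assumptions: every grid is sorted in ascending order, and every coordinate in the input matches a grid value exactly. The name scatter_to_grid and the values in the last column of V4 are made up for illustration.

```python
import numpy as np

aGrid = np.arange(0, 10, dtype=float)
mGrid = np.arange(100, 110, dtype=float)
eGrid = np.arange(1000, 1200, dtype=float)
A, M, E = np.meshgrid(aGrid, mGrid, eGrid, indexing="ij")

# Keep only valid states (a <= 3); the last column holds illustrative values
valid = A <= 3
V4 = np.column_stack([A[valid], M[valid], E[valid], (A + M + E)[valid]])

def scatter_to_grid(rows, *grids):
    """Place rows[:, -1] into a dense array indexed by the coordinate columns."""
    # searchsorted returns the position of each coordinate in its sorted grid
    idx = tuple(np.searchsorted(g, col) for g, col in zip(grids, rows.T[:-1]))
    out = np.full(tuple(len(g) for g in grids), np.nan)
    out[idx] = rows[:, -1]
    return out

V = scatter_to_grid(V4, aGrid, mGrid, eGrid)
print(V.shape)
```

Positions that have no row in V4 stay NaN. If coordinates could be off by rounding error, the exact-match assumption breaks and the indices would need to be rounded or validated first.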
So Z has the same shape as A, M and E, and Z2 has shape (Z.size, len(grids)) = (10x10x200, 3) in this case (if you do not filter out the NaN elements).
This is how you recreate your grids from the values of Z2:
grids = Z2.T
A,M,E = [g.reshape(A.shape) for g in grids]
Z = A # or whatever other calculation you need here
The only thing you need is the shape to which you want to go back. NaN will propagate to the final array.
