How can I implement the Rosenbrock function in D dimensions using PyTorch?
Create the variables, where D is the number of dimensions and N is the number of elements.
x = (xmax - xmin)*torch.rand(N,D).type(dtype) + xmin
Function: $f(x) = \sum_{i=1}^{D-1} \left[ 100\,(x_{i+1} - x_i^2)^2 + (1 - x_i)^2 \right]$
Using straight Python I'd do something like this:
fit = 0
for i in range(D-1):
    term1 = x[i + 1] - x[i]**2
    term2 = 1 - x[i]
    fit = fit + 100 * term1**2 + term2**2
My attempt using PyTorch:
def Rosenbrock(x):
    return torch.sum(100*(x - x**2)**2 + (x-1)**2)
I just do not know how to do the x[i+1] without using a for loop.
How can I deal with it?
Thank you!
Numpy has the function roll which I think can be really helpful.
Unfortunately, I am not aware of any function similar to numpy.roll for pytorch.
In my attempt, x is a NumPy array of shape D x N. First we use roll to shift the items along the first dimension (axis=0) one position to the left, so that x_1[i] holds x[i+1]. Then, since the sum runs over only D-1 elements, we drop the last row by slicing the PyTorch tensor with [:-1, :]. The summation is then very similar to the code you posted, with x replaced by x_1 in the right place.
import numpy as np
import torch

def Rosenbrock(x):
    x_1 = torch.from_numpy(np.roll(x, -1, axis=0)).float()[:-1, :]
    x = torch.from_numpy(x).float()[:-1, :]
    return torch.sum(100 * (x_1 - x ** 2) ** 2 + (x - 1) ** 2, 0)
Similarly, by using roll you could remove your for loop in the numpy version.
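As a possible alternative that stays entirely in PyTorch, the shifted copy can also be obtained by slicing rather than rolling. The sketch below assumes x has shape (N, D) as in the question (one point per row), and the lowercase name rosenbrock is only illustrative. Recent PyTorch releases also provide torch.roll, but slicing avoids having to drop the wrapped-around element afterwards.

```python
import torch

def rosenbrock(x):
    """Rosenbrock value for each row of x, where x has shape (N, D)."""
    x_next = x[:, 1:]    # x[i + 1] for i = 0 .. D-2
    x_curr = x[:, :-1]   # x[i]     for i = 0 .. D-2
    return torch.sum(100 * (x_next - x_curr ** 2) ** 2 + (1 - x_curr) ** 2, dim=1)

# The global minimum is at x = (1, ..., 1), where the function is zero.
at_minimum = rosenbrock(torch.ones(3, 5))
print(at_minimum)
```

Each row is evaluated independently, so the result has shape (N,).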
Related
I have written code that calculates the dissolution of fluids. The problem is that the code performs very poorly. I have read that I can optimize it with numpy, but I am stuck on how to rewrite the following code using numpy and the roll function. Basically, I have a matrix whose indices cannot exceed 1024, so I use % to wrap each index around. But this takes a long time.
I tried using numpy's roll to rotate the matrix, so that I don't have to calculate the modulo. But I don't know how to take the values of the neighbors.
def evolve(grid, dt, D=1.0):
    xmax, ymax = grid_shape
    new_grid = [[0.0,] * ymax for x in range(xmax)]
    for i in range(xmax):
        for j in range(ymax):
            grid_xx = grid[(i+1)%xmax][j] + grid[(i-1)%xmax][j] - 2.0 * grid[i][j]
            grid_yy = grid[i][(j+1)%ymax] + grid[i][(j-1)%ymax] - 2.0 * grid[i][j]
            new_grid[i][j] = grid[i][j] + D * (grid_xx + grid_yy) * dt
    return new_grid
You have to rewrite the evolve function from (almost) zero using numpy.
Here are the guidelines:
First, grid must be a 2D numpy array, not a list of lists.
Your teacher suggested the roll function: look at its docs and try to understand how it works. roll solves the problem of finding neighbouring entries in the matrix by shifting (or rolling) the matrix along one of its axes. You can then create shifted versions of grid in the four directions and use them instead of searching for neighbours.
Once you have the shifted grids, you'll see that you will not need the for loops to calculate each cell of new_grid: you can use vectorized calculation, which is faster.
So the code will look like this:
def evolve(grid, dt, D=1.0):
    if not isinstance(grid, np.ndarray): #ensuring that is a numpy array.
        grid = np.array(grid)
    u_grid = np.roll(grid, 1, axis=0)
    d_grid = np.roll(grid, -1, axis=0)
    r_grid = np.roll(grid, 1, axis=1)
    l_grid = np.roll(grid, -1, axis=1)
    new_grid = grid + D * (u_grid + d_grid + r_grid + l_grid - 4.0*grid) * dt
    return new_grid
With a 1024 x 1024 matrix, each numpy evolve takes (on my machine) ~0.15 seconds to return the new_grid. Your evolve with the for loops takes ~3.85 seconds.
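To check that the rolled version really computes the same thing as the modulo-indexed loops, the two can be compared on a small random grid. This is only a sketch: the names evolve_loops and evolve_roll are illustrative, and the loop version reads the grid size from its argument instead of a global grid_shape.

```python
import numpy as np

def evolve_loops(grid, dt, D=1.0):
    # Pure-Python reference with periodic boundaries via the % operator.
    xmax, ymax = len(grid), len(grid[0])
    new_grid = [[0.0] * ymax for _ in range(xmax)]
    for i in range(xmax):
        for j in range(ymax):
            grid_xx = grid[(i + 1) % xmax][j] + grid[(i - 1) % xmax][j] - 2.0 * grid[i][j]
            grid_yy = grid[i][(j + 1) % ymax] + grid[i][(j - 1) % ymax] - 2.0 * grid[i][j]
            new_grid[i][j] = grid[i][j] + D * (grid_xx + grid_yy) * dt
    return new_grid

def evolve_roll(grid, dt, D=1.0):
    # Vectorized version: np.roll wraps around, which replaces the modulo.
    grid = np.asarray(grid, dtype=float)
    neighbours = (np.roll(grid, 1, axis=0) + np.roll(grid, -1, axis=0)
                  + np.roll(grid, 1, axis=1) + np.roll(grid, -1, axis=1))
    return grid + D * (neighbours - 4.0 * grid) * dt

rng = np.random.default_rng(0)
small = rng.random((8, 6))
same = np.allclose(evolve_loops(small.tolist(), 0.1), evolve_roll(small, 0.1))
print(same)
```

A non-square grid is used on purpose, so that mixing up the two axes would show up as a mismatch.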
a,b=np.ogrid[0:n:1,0:n:1]
A=np.exp(1j*(np.pi/3)*np.abs(a-b))
a,b=np.diag_indices_from(A)
A[a,b]=1-1j/np.sqrt(3)
That is my basis. It produces a grid which acts as an n*n matrix.
My issue is I need to replace a column in the grid, say for example where b=17.
I need for this column to be:
A=np.exp(1j*(np.pi/3)*np.abs(a-17+geo_mean(x)))
except for where a=b where it needs to stay as:
A[a,b]=1-1j/np.sqrt(3)
geo_mean(x) is just a geometric average of 50 values determined from a pseudo random number generator, defined in my code as:
x=[random.uniform(0,0.5) for p in range(0,50)]
def geo_mean(iterable):
    a = np.array(iterable)
    return a.prod()**(1.0/len(a))
So how do I go about replacing a column so that it includes geo_mean in the exponent formula, without changing the diagonal value?
Let's start by saying that diag_indices_from() is kind of useless here since we already know that diagonal elements are those that have equal indices i and j and run up to value n. Therefore, let's simplify the code a little bit at the beginning:
a, b = np.ogrid[0:n:1, 0:n:1]
A = np.exp(1j * (np.pi / 3) * np.abs(a - b))
diag = np.arange(n)
A[diag, diag] = 1 - 1j / np.sqrt(3)
Now, let's say you would like to set the column k values, except for the diagonal element, to
np.exp(1j * (np.pi/3) * np.abs(a - 17 + geo_mean(x)))
(I guess a in the above formula is row index).
This can be done using integer indices, especially since they are almost computed already: we have diag and just need to remove from it the index of the diagonal element that must stay unchanged:
r = np.delete(diag, k)
Then
x = np.random.uniform(0, 0.5, (r.size, 50))
A[r, k] = np.exp(1j * (np.pi/3) * np.abs(r - k + geo_mean(x)))
However, for the above to work, you need to rewrite your geo_mean() function in such a way that it works with 2D input arrays (I will also add some checks and conversions to make it backward compatible):
def geo_mean(x):
    x = np.asarray(x)
    dim = len(x.shape)
    x = np.atleast_2d(x)
    v = np.prod(x, axis=1) ** (1.0 / x.shape[1])
    return v[0] if dim == 1 else v
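Putting the pieces together, a minimal end-to-end sketch might look like the following. The values n = 20 and k = 17 are assumptions chosen for illustration, and, as in the snippet above, a fresh set of 50 random values is drawn for every row of the column.

```python
import numpy as np

def geo_mean(x):
    """Geometric mean of a 1D input, or of each row of a 2D input."""
    x = np.asarray(x)
    dim = x.ndim
    x = np.atleast_2d(x)
    v = np.prod(x, axis=1) ** (1.0 / x.shape[1])
    return v[0] if dim == 1 else v

n, k = 20, 17  # assumed matrix size and column to replace

a, b = np.ogrid[0:n, 0:n]
A = np.exp(1j * (np.pi / 3) * np.abs(a - b))
diag = np.arange(n)
A[diag, diag] = 1 - 1j / np.sqrt(3)
before = A.copy()  # kept only to compare against afterwards

r = np.delete(diag, k)  # every row index except the diagonal one
rng = np.random.default_rng(0)
x = rng.uniform(0, 0.5, (r.size, 50))
A[r, k] = np.exp(1j * (np.pi / 3) * np.abs(r - k + geo_mean(x)))

print(A[k, k])  # the diagonal entry of column k is untouched
```

Only the off-diagonal entries of column k are written, so the diagonal and all other columns keep their original values.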
I need some help with this problem.
The midpoint rule for approximating an integral can be expressed as:
h * (sum over i = 1, ..., n of f(a - (0.5 * h) + i*h))
where h = (b - a)/n
Write a function midpointint(f,a,b,n) to compute the midpoint rule using the numpy sum function.
Make sure your range is from 1 to n inclusive. You could use a range and convert it to an array.
For midpointint(np.sin,0,np.pi,10) the function should return 2.0082.
Here is what I have so far
import numpy as np
def midpointint(f,a,b,n):
    h = (b - a) / (float(n))
    for i in np.array(range(1,n+1)):
        value = h * np.sum((f(a - (0.5*h) + (i*h))))
    return value
print(midpointint(np.sin,0,np.pi,10))
My code is not printing out the correct output.
The issue with the posted code is that value is overwritten on every iteration instead of being accumulated: initialize value to zero before the loop and use value += ... inside it.
You can vectorize by using a range array for the iterator, like so -
I = np.arange(1,n+1)
out = (h*np.sin(a - (0.5*h) + (I*h))).sum()
Sample run -
In [78]: I = np.arange(1,n+1)
In [79]: (h*np.sin(a - (0.5*h) + (I*h))).sum()
Out[79]: 2.0082484079079745
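Wrapped back into the function signature from the question, the vectorized approach could look like this sketch, which calls the passed-in f rather than hard-coding np.sin:

```python
import numpy as np

def midpointint(f, a, b, n):
    """Midpoint-rule approximation of the integral of f over [a, b] with n panels."""
    h = (b - a) / float(n)
    i = np.arange(1, n + 1)            # 1 to n inclusive
    midpoints = a - 0.5 * h + i * h    # centre of each panel
    return h * np.sum(f(midpoints))

print(midpointint(np.sin, 0, np.pi, 10))  # approximately 2.0082
```

Because f receives the whole array of midpoints at once, it must accept NumPy arrays, as np.sin does.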
Say I have 10 coins from the same mint, I flip them each 50 times, now I want to estimate bias of the mint as well as the individual bias of all the coins.
The way I want to do this is like this:
# Generate a list of 10 arrays with 50 flips in each
test = [bernoulli.rvs(0.5, size=50) for x in range(10)]
with pm.Model() as test_model:
    k = pm.Gamma('k', 0.01, 0.01) + 2
    w = pm.Beta('w', 1, 1)
    thetas = pm.Beta('thetas', w * (k - 2) + 1, (1 - w) * (k - 2) + 1, shape = len(test))
    y = pm.Bernoulli('y', thetas, observed=test)
But this doesn't work, because now it seems like pymc expects 50 coins with 10 flips. I can hack around this issue in this instance. But, I'm both a beginner at python and pymc(3) so I want to learn why it behaves like this and what a proper simulation of this situation should look like.
If you are new to Python, maybe you are not familiar with the concept of broadcasting, which is used when working with NumPy arrays and is also useful for defining PyMC3 models. Broadcasting lets us do arithmetic on arrays of different sizes under certain circumstances.
For your particular example the problem is that according to the broadcasting rules the shape of the data vector and the shape of the thetas vector are not compatible. The easiest solution for your problem is to transpose the data vector (make rows columns and columns rows).
Notice also that with SciPy you can create your mock data without a list comprehension; you just need to pass the proper shape.
test = bernoulli.rvs(0.5, size=(50, 10))
with pm.Model() as test_model:
    k = pm.Gamma('k', 0.01, 0.01) + 2
    w = pm.Beta('w', 1, 1)
    thetas = pm.Beta('thetas', w * (k - 2) + 1, (1 - w) * (k - 2) + 1, shape = test.shape[1])
    y = pm.Bernoulli('y', thetas, observed=test)
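The shape incompatibility can be reproduced with plain NumPy, without any PyMC3 involved. In the sketch below a plain array stands in for thetas (an assumption made purely for illustration): a vector with one entry per coin lines up with the last axis of the data, so the coins have to be the columns.

```python
import numpy as np

thetas = np.full(10, 0.5)           # one value per coin, shape (10,)
flips_by_coin = np.zeros((10, 50))  # 10 coins x 50 flips (original layout)
flips_by_flip = flips_by_coin.T     # 50 flips x 10 coins (transposed layout)

# Broadcasting aligns trailing axes: 50 vs 10 does not match.
try:
    flips_by_coin * thetas
    compatible = True
except ValueError:
    compatible = False

# After transposing, the trailing axis has length 10 and matches thetas.
result_shape = (flips_by_flip * thetas).shape

print(compatible, result_shape)
```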
My code is pretty simple, but when I try to multiply a 3x2 and 2x1 matrix, I get the following error (which, to me, makes no sense):
ValueError: operands could not be broadcast together with shapes (3,2) (2,1)
In this program, the first thing I do is randomly generate two points in the domain [-1,1] x [-1,1], and define a line by these points, using the variables slope and y_int. Then, I create N random x values of the form {x_0, x_1, x_2} where x_0 is always 1, and x_1,x_2 are randomly generated numbers in the range [-1,1]. These N values comprise the x_matrix in the code.
The y_matrix is the classification of each of the values x_1,...,x_N. If x_1 is to the right of the random line specified by slope and y_int, then the value of y_1 is +1, and is otherwise -1.
Now, once x_matrix and y_matrix have been specified, I just want to multiply the pseudo-inverse of x_matrix (pinv_x in the code) by y_matrix. This is where the error comes in. I'm at my wit's end and I cannot think of anything that could be wrong.
Any help is greatly appreciated. The code is below:
from numpy import *
import random
N = 2
# Determine target function f(x)
x_1 = [random.uniform(-1,1),random.uniform(-1,1)]
x_2 = [random.uniform(-1,1),random.uniform(-1,1)]
slope = (x_1[1] - x_2[1]) / (x_1[0] - x_2[0])
y_int = x_1[1] - (slope * x_1[0])
# Construct training data.
x_matrix = array([1, random.uniform(-1,1), random.uniform(-1,1)])
x_on_line = (x_matrix[1] / slope) - (y_int / slope)
if x_matrix[1] >= x_on_line:
    y_matrix = array([1])
else:
    y_matrix = array([-1])
for i in range(N-1):
    x_val = array([1, random.uniform(-1,1), random.uniform(-1,1)])
    x_matrix = vstack((x_matrix, x_val))
    x_on_line = (x_val[1] / slope) - (y_int / slope)
    if x_val[1] >= x_on_line:
        y_matrix = vstack((y_matrix, array([1])))
    else:
        y_matrix = vstack((y_matrix, array([-1])))
pinv_x = linalg.pinv(x_matrix)
print y_matrix
print pinv_x
w = pinv_x*y_matrix
You are using arrays, not matrices. To get matrix multiplication from arrays, you need to use the dot() function, not *. The * operator is element-wise multiplication when the data is in an array, so the shapes must match or at least be compatible under broadcasting, which (3,2) and (2,1) are not.
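A small sketch of the difference, using randomly generated stand-ins with the same shapes as in the question (the actual values are assumptions and do not matter here):

```python
import numpy as np

rng = np.random.default_rng(0)
# Two training points of the form [1, x_1, x_2], giving shape (2, 3).
x_matrix = np.column_stack((np.ones(2), rng.uniform(-1, 1, (2, 2))))
y_matrix = np.array([[1], [-1]])       # shape (2, 1)
pinv_x = np.linalg.pinv(x_matrix)      # shape (3, 2)

# Element-wise multiplication fails for shapes (3, 2) and (2, 1).
try:
    pinv_x * y_matrix
    elementwise_ok = True
except ValueError:
    elementwise_ok = False

w = np.dot(pinv_x, y_matrix)           # matrix product, shape (3, 1)
w_alt = pinv_x @ y_matrix              # equivalent operator form (Python 3.5+)

print(elementwise_ok, w.shape)
```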