Getting input dimensions in pymc3 correct - python

Say I have 10 coins from the same mint and I flip each of them 50 times. Now I want to estimate the bias of the mint as well as the individual bias of each coin.
The way I want to do this is like this:
import pymc3 as pm
from scipy.stats import bernoulli
# Generate a list of 10 arrays with 50 flips in each
test = [bernoulli.rvs(0.5, size=50) for x in range(10)]
with pm.Model() as test_model:
    k = pm.Gamma('k', 0.01, 0.01) + 2
    w = pm.Beta('w', 1, 1)
    thetas = pm.Beta('thetas', w * (k - 2) + 1, (1 - w) * (k - 2) + 1, shape=len(test))
    y = pm.Bernoulli('y', thetas, observed=test)
But this doesn't work, because now it seems like pymc expects 50 coins with 10 flips. I can hack around this issue in this instance. But, I'm both a beginner at python and pymc(3) so I want to learn why it behaves like this and what a proper simulation of this situation should look like.

If you are new to Python, you may not be familiar with the concept of broadcasting, which is used when working with NumPy arrays and is also useful for defining PyMC3 models. Broadcasting lets us do arithmetic on arrays of different sizes under certain circumstances.
For your particular example, the problem is that, according to the broadcasting rules, the shape of the data array and the shape of the thetas vector are not compatible. The easiest solution is to transpose the data array (make rows columns and columns rows).
Notice also that with SciPy you can create your mock data without a list comprehension; you just need to pass the proper shape.
test = bernoulli.rvs(0.5, size=(50, 10))
with pm.Model() as test_model:
    k = pm.Gamma('k', 0.01, 0.01) + 2
    w = pm.Beta('w', 1, 1)
    thetas = pm.Beta('thetas', w * (k - 2) + 1, (1 - w) * (k - 2) + 1, shape=test.shape[1])
    y = pm.Bernoulli('y', thetas, observed=test)
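To see why the transpose matters, here is a small NumPy-only sketch of the broadcasting rule. No PyMC3 model is built, and the array names are illustrative rather than taken from the question. Shapes are compared starting from the trailing dimension, so a vector of 10 thetas only lines up with data whose last axis has length 10.

```python
import numpy as np

thetas = np.linspace(0.1, 0.9, 10)     # one value per coin, shape (10,)
flips_by_coin = np.zeros((10, 50))     # 10 coins x 50 flips (layout in the question)
flips_by_trial = flips_by_coin.T       # 50 flips x 10 coins (transposed layout)

# Trailing dimensions are compared first: 10 matches 10, so this broadcasts.
print(np.broadcast(thetas, flips_by_trial).shape)   # (50, 10)

# Here the trailing dimensions are 10 and 50, so NumPy refuses.
try:
    np.broadcast(thetas, flips_by_coin)
    compatible = True
except ValueError:
    compatible = False
print(compatible)                      # False
```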

Related

Is there a DP solution for my subset average problem?

I have a combinatorics problem that I can't solve.
Given a set of vectors and a target vector, return a scalar for each vector, so that the average of the scaled vectors in the set is closest to the target.
Edit: Weights w_i are in range [0, 1]. This is a constrained optimisation problem:
minimise d(avg(w_i * x_i), target)
subject to sum(w_i) - 1 = 0
If I had to name this problem, it would be unbounded subset average.
I have looked at the unbounded knapsack and similar problems, but a dynamic programming implementation seems to be impossible due to the interdependence of the numbers.
I also implemented a genetic algorithm that is able to approximate the weights moderately well, but it takes too long, and I was initially hoping to solve the problem using dynamic programming.
Is there any hope?
Visualization
In a 2D space the solution to the problem can be represented like this
Problem class identification
As recognized by others, this is an optimization problem. You have linear constraints and a convex objective function, so it can be cast as a quadratic program (read the Least squares section).
Casting to standard form
The average of w[i] * x[i] is sum(w[i] * x[i]) / N_vectors. If you arrange w[i] as the elements of a (1 x N_vectors) matrix and each vector x[i] as the i-th row of an (N_vectors x DIM) matrix, it becomes w @ X / N_vectors (with @ being the matrix product operator).
To cast the problem to standard form you would have to construct a matrix A so that each row of A*w <= b expresses -w[i] <= 0, and the equality sum(w) = 1 becomes sum(w) <= 1 and -sum(w) <= -1. But there are amazing tools to automate this part.
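As a rough illustration of what that manual expansion would look like, here is a NumPy sketch. The names build_constraints, A and b are hypothetical, and the inequalities are written in the non-strict form A @ w <= b that solvers usually expect.

```python
import numpy as np

def build_constraints(n_vectors):
    """Stack -w[i] <= 0, sum(w) <= 1 and -sum(w) <= -1 into A @ w <= b."""
    A = np.vstack([
        -np.eye(n_vectors),           # -w[i] <= 0 for every i
        np.ones((1, n_vectors)),      # sum(w) <= 1
        -np.ones((1, n_vectors)),     # -sum(w) <= -1
    ])
    b = np.concatenate([np.zeros(n_vectors), [1.0, -1.0]])
    return A, b

A, b = build_constraints(4)
w_uniform = np.full(4, 0.25)          # non-negative weights that sum to 1
feasible = bool(np.all(A @ w_uniform <= b + 1e-12))
print(feasible)                       # True
```

The two opposite inequalities on sum(w) together pin the sum to exactly 1, which is how an equality is expressed when only inequalities are allowed.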
Implementation
This can be readily implemented using cvxpy, and you don't have to care about expanding all the constraints.
The following function solves the problem and, if the vectors have dimension 2, plots the result.
import cvxpy
import numpy as np
import matplotlib.pyplot as plt
def place_there(X, target):
    # some linear algebra arrangements
    target = target.reshape((1, -1))
    ncols = target.shape[1]
    X = np.array(X).reshape((-1, ncols))
    N_vectors = X.shape[0]
    # variable of the problem
    w = cvxpy.Variable((1, X.shape[0]))
    # minimize the norm of w @ X / N_vectors - target (@ is the matrix product)
    P = cvxpy.Problem(cvxpy.Minimize(cvxpy.norm((w @ X) / N_vectors - target)), [w >= 0, cvxpy.sum(w) == 1])
    # here it is solved
    print('Distance from target is: ', P.solve())
    # show the solution in a nice plot
    # w.value is the w that gave the optimal solution
    Y = w.value.transpose() * X / N_vectors
    path = np.zeros((X.shape[0] + 1, 2))
    path[1:, :] = np.cumsum(Y, axis=0)
    randColors = np.random.rand(3 * X.shape[0], 3).reshape((-1, 3)) * 0.7
    plt.quiver(path[:-1, 0], path[:-1, 1], Y[:, 0], Y[:, 1], color=randColors, angles='xy', scale_units='xy', scale=1)
    plt.plot(target[:, 0], target[:, 1], 'or')
And you can run it like this
target = np.array([[1.234, 0.456]])
plt.figure(figsize=(12, 4))
for i in [1, 2, 3]:
    X = np.random.randn(20) * 100
    plt.subplot(1, 3, i)
    place_there(X, target)
    plt.xlim([-3, 3])
    plt.ylim([-3, 3])
    plt.grid()
plt.show()

How can I access the neighboring elements of the matrix using numpy?

I have written code that calculates the dissolution of fluids. The problem is that the code performs very poorly, so I have been looking at how to optimize it with numpy, but I am stuck on how to rewrite the following code using numpy and the roll function. Basically, I have a matrix whose indices cannot go beyond 1024, so I use % to wrap each index around. But this takes a long time.
I tried using numpy's roll to rotate the matrix so that I don't have to calculate the modulo, but I don't know how to get the values of the neighbors.
def evolve(grid, dt, D=1.0):
    xmax, ymax = grid_shape
    new_grid = [[0.0,] * ymax for x in range(xmax)]
    for i in range(xmax):
        for j in range(ymax):
            grid_xx = grid[(i+1)%xmax][j] + grid[(i-1)%xmax][j] - 2.0 * grid[i][j]
            grid_yy = grid[i][(j+1)%ymax] + grid[i][(j-1)%ymax] - 2.0 * grid[i][j]
            new_grid[i][j] = grid[i][j] + D * (grid_xx + grid_yy) * dt
    return new_grid
You have to rewrite the evolve function (almost) from scratch using numpy.
Here are the guidelines:
First, grid must be a 2D numpy array, not a list of lists.
Your teacher suggested the roll function: look at its docs and try to understand how it works. roll solves the problem of finding neighbouring entries in the matrix by shifting (or rolling) the matrix along one of its axes. You can then create shifted versions of grid in the four directions and use them instead of searching for neighbours.
Once you have the shifted grids, you will see that you do not need the for loops to calculate each cell of new_grid: you can use vectorized calculation, which is faster.
So the code will look like this:
import numpy as np
def evolve(grid, dt, D=1.0):
    if not isinstance(grid, np.ndarray):  # ensure that grid is a numpy array
        grid = np.array(grid)
    u_grid = np.roll(grid, 1, axis=0)
    d_grid = np.roll(grid, -1, axis=0)
    r_grid = np.roll(grid, 1, axis=1)
    l_grid = np.roll(grid, -1, axis=1)
    new_grid = grid + D * (u_grid + d_grid + r_grid + l_grid - 4.0*grid) * dt
    return new_grid
With a 1024 x 1024 matrix, each numpy evolve takes (on my machine) ~0.15 seconds to return the new_grid. Your evolve with the for loops takes ~3.85 seconds.
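One way to convince yourself that the rolled grids really are the wrapped neighbours is to compare both approaches on a small random grid. The sketch below is only a check: the names evolve_loops and evolve_roll are illustrative, and the loop version takes the grid size from grid.shape instead of a global.

```python
import numpy as np

def evolve_loops(grid, dt, D=1.0):
    """Reference version: explicit loops with modulo wrap-around."""
    xmax, ymax = grid.shape
    new_grid = np.zeros_like(grid)
    for i in range(xmax):
        for j in range(ymax):
            grid_xx = grid[(i + 1) % xmax, j] + grid[(i - 1) % xmax, j] - 2.0 * grid[i, j]
            grid_yy = grid[i, (j + 1) % ymax] + grid[i, (j - 1) % ymax] - 2.0 * grid[i, j]
            new_grid[i, j] = grid[i, j] + D * (grid_xx + grid_yy) * dt
    return new_grid

def evolve_roll(grid, dt, D=1.0):
    """Vectorized version: np.roll supplies the wrapped neighbours."""
    neighbours = (np.roll(grid, 1, axis=0) + np.roll(grid, -1, axis=0)
                  + np.roll(grid, 1, axis=1) + np.roll(grid, -1, axis=1))
    return grid + D * (neighbours - 4.0 * grid) * dt

rng = np.random.default_rng(0)
grid = rng.random((8, 8))
same = np.allclose(evolve_loops(grid, 0.1), evolve_roll(grid, 0.1))
print(same)   # True
```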

Efficiently sample from arbitrary multivariate function

I would like to sample from an arbitrary function in Python.
In Fast arbitrary distribution random sampling it was stated that one could use inverse transform sampling, and in Pythonic way to select list elements with different probability it was mentioned that one should use the inverse cumulative distribution function. As far as I understand, those methods only work in the univariate case. My function is multivariate, though, and too complex for any of the suggestions in https://stackoverflow.com/a/48676209/4533188 to apply.
Preliminaries: My function is based on Rosenbrock's banana function, whose value we can get with
import scipy.optimize
scipy.optimize.rosen([1.1,1.2])
(here [1.1,1.2] is the input vector) from scipy, see https://docs.scipy.org/doc/scipy-0.15.1/reference/generated/scipy.optimize.rosen.html.
Here is what I came up with: I make a grid over my area of interest and calculate for each point the function value. Then I sort the resulting data frame by the value and make a cumulative sum. This way we get "slots" which have different sizes - points which have large function values have larger slots than points with small function values. Now we generate random values and look into which slot the random value falls into. The row of the data frame is our final sample.
Here is the code:
import numpy as np
import pandas as pd
import scipy.optimize
from itertools import product
from dfply import *
nb_of_samples = 50
nb_of_grid_points = 30
rosen_data = pd.DataFrame(np.array([item for item in product(*[np.linspace(fm[0], fm[1], nb_of_grid_points) for fm in zip([-2,-2], [2,2])])]), columns=['x','y'])
rosen_data['z'] = [np.exp(-scipy.optimize.rosen(row)**2/500) for index, row in rosen_data.iterrows()]
rosen_data = rosen_data >> \
    arrange(X.z) >> \
    mutate(z_upperbound=cumsum(X.z)) >> \
    mutate(z_upperbound=X.z_upperbound/np.max(X.z_upperbound))
value = np.random.sample(1)[0]
def get_rosen_sample(value):
    return (rosen_data >> mask(X.z_upperbound >= value) >> select(X.x, X.y)).iloc[0,]
values = pd.DataFrame([get_rosen_sample(s) for s in np.random.sample(nb_of_samples)])
This works well, but I don't think it is very efficient. What would be a more efficient solution to my problem?
I read that Markov chain Monte Carlo might help, but here I am in over my head for now on how to do this in Python.
I was in a similar situation, so I implemented a rudimentary version of Metropolis-Hastings (which is an MCMC method) to sample from a bivariate distribution. An example follows.
Say we want to sample from the following density:
def density1(z):
    z = np.reshape(z, [z.shape[0], 2])
    z1, z2 = z[:, 0], z[:, 1]
    norm = np.sqrt(z1 ** 2 + z2 ** 2)
    exp1 = np.exp(-0.5 * ((z1 - 2) / 0.8) ** 2)
    exp2 = np.exp(-0.5 * ((z1 + 2) / 0.8) ** 2)
    u = 0.5 * ((norm - 4) / 0.4) ** 2 - np.log(exp1 + exp2)
    return np.exp(-u)
which looks like this
The following function implements MH with multivariate normal as the proposal
def metropolis_hastings(target_density, size=500000):
    burnin_size = 10000
    size += burnin_size
    x0 = np.array([[0, 0]])
    xt = x0
    samples = []
    for i in range(size):
        xt_candidate = np.array([np.random.multivariate_normal(xt[0], np.eye(2))])
        accept_prob = (target_density(xt_candidate))/(target_density(xt))
        if np.random.uniform(0, 1) < accept_prob:
            xt = xt_candidate
        samples.append(xt)
    samples = np.array(samples[burnin_size:])
    samples = np.reshape(samples, [samples.shape[0], 2])
    return samples
Run MH and plot samples
samples = metropolis_hastings(density1)
plt.hexbin(samples[:,0], samples[:,1], cmap='rainbow')
plt.gca().set_aspect('equal', adjustable='box')
plt.xlim([-3, 3])
plt.ylim([-3, 3])
plt.show()
Check out this repo of mine for details.
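If the grid-based approach from the question is good enough, it can also be made much cheaper without MCMC. The sketch below is one possible vectorized variant and not the code from either post: the function name sample_from_grid and its parameters are hypothetical, and the 2-D Rosenbrock formula is written out directly instead of calling scipy.optimize.rosen. It builds the cumulative sum once and looks up all random values in one call to np.searchsorted.

```python
import numpy as np

def sample_from_grid(n_samples, n_grid=30, low=-2.0, high=2.0, seed=0):
    """Draw grid points with probability proportional to the unnormalized density."""
    rng = np.random.default_rng(seed)
    axis = np.linspace(low, high, n_grid)
    x, y = np.meshgrid(axis, axis, indexing="ij")
    x, y = x.ravel(), y.ravel()
    rosen = 100.0 * (y - x ** 2) ** 2 + (1.0 - x) ** 2   # 2-D Rosenbrock function
    z = np.exp(-rosen ** 2 / 500.0)                      # same weighting as in the question
    cdf = np.cumsum(z)
    cdf /= cdf[-1]
    # Inverse-CDF lookup for all uniform draws at once; sorting by z is not required.
    idx = np.searchsorted(cdf, rng.random(n_samples))
    return np.column_stack([x[idx], y[idx]])

samples = sample_from_grid(50)
print(samples.shape)   # (50, 2)
```

Like the original, this only ever returns grid points, so the resolution of the samples is limited by n_grid.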

Rosenbrock function with D dimension using PyTorch

How can I implement the Rosenbrock function with D dimension, using PyTorch?
Create the variables, where D is the number of dimensions and N is the number of elements.
x = (xmax - xmin)*torch.rand(N,D).type(dtype) + xmin
Function :
Using straight Python I'd do something like this:
fit = 0
for i in range(D-1):
    term1 = x[i + 1] - x[i]**2
    term2 = 1 - x[i]
    fit = fit + 100 * term1**2 + term2**2
My attempt using Pytorch:
def Rosenbrock(x):
    return torch.sum(100*(x - x**2)**2 + (x-1)**2)
I just do not know how to do the x[i+1] without using a for loop.
How can I deal with it?
Thank you!
Numpy has the function roll which I think can be really helpful.
Unfortunately, I am not aware of any function similar to numpy.roll for pytorch.
In my attempt, x is a numpy array of shape DxN. First we use roll to move the items along the first dimension (axis=0) one position to the left. This way, every time we access x_1[i] it is as if we accessed x[i+1]. Then, since the function takes only D-1 elements for the sum, we remove the last row by slicing the pytorch tensor with [:-1, :]. The summation is then very similar to the code you posted, just with x_1 in place of x where needed.
def Rosenbrock(x):
    x_1 = torch.from_numpy(np.roll(x, -1, axis=0)).float()[:-1, :]
    x = torch.from_numpy(x).float()[:-1, :]
    return torch.sum(100 * (x_1 - x ** 2) ** 2 + (x - 1) ** 2, 0)
Similarly, by using roll you could remove your for loop in the numpy version.
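A possible NumPy-only version of that idea is sketched below. The function name rosenbrock and the example array are illustrative, and x is assumed to have shape D x N as in the answer above.

```python
import numpy as np

def rosenbrock(x):
    """Rosenbrock value of each column of a D x N array, without a Python loop."""
    x_next = np.roll(x, -1, axis=0)[:-1, :]   # row i holds x[i + 1]
    x_curr = x[:-1, :]                        # rows 0 .. D-2, i.e. x[i]
    return np.sum(100.0 * (x_next - x_curr ** 2) ** 2 + (1.0 - x_curr) ** 2, axis=0)

x = np.array([[1.0, 1.1],
              [1.0, 1.2]])   # D = 2 dimensions, N = 2 points (one per column)
values = rosenbrock(x)
print(values)               # the first point is the global minimum, so its value is 0
```

Plain slicing, x[1:, :], gives the same shifted rows as the roll-then-trim step.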

Python returning error when attempting to multiply two numpy matrices of appropriate dimension

My code is pretty simple, but when I try to multiply a 3x2 and a 2x1 matrix, I get the following error (which, to me, makes no sense):
ValueError: operands could not be broadcast together with shapes (3,2) (2,1)
In this program, the first thing I do is randomly generate two points in the domain [-1,1] x [-1,1], and define a line by these points, using the variables slope and y_int. Then, I create N random x values of the form {x_0, x_1, x_2} where x_0 is always 1, and x_1,x_2 are randomly generated numbers in the range [-1,1]. These N values comprise the x_matrix in the code.
The y_matrix is the classification of each of the values x_1,...,x_N. If x_1 is to the right of the random line specified by slope and y_int, then the value of y_1 is +1, and is otherwise -1.
Now, once x_matrix and y_matrix have been specified, I just want to multiply the pseudo-inverse of x_matrix (pinv_x in the code) by y_matrix. This is where the error comes in. I'm at my wit's end and I cannot think of anything that could be wrong.
Any help is greatly appreciated. The code is below:
from numpy import *
import random
N = 2
# Determine target function f(x)
x_1 = [random.uniform(-1,1),random.uniform(-1,1)]
x_2 = [random.uniform(-1,1),random.uniform(-1,1)]
slope = (x_1[1] - x_2[1]) / (x_1[0] - x_2[0])
y_int = x_1[1] - (slope * x_1[0])
# Construct training data.
x_matrix = array([1, random.uniform(-1,1), random.uniform(-1,1)])
x_on_line = (x_matrix[1] / slope) - (y_int / slope)
if x_matrix[1] >= x_on_line:
    y_matrix = array([1])
else:
    y_matrix = array([-1])
for i in range(N-1):
    x_val = array([1, random.uniform(-1,1), random.uniform(-1,1)])
    x_matrix = vstack((x_matrix, x_val))
    x_on_line = (x_val[1] / slope) - (y_int / slope)
    if x_val[1] >= x_on_line:
        y_matrix = vstack((y_matrix, array([1])))
    else:
        y_matrix = vstack((y_matrix, array([-1])))
pinv_x = linalg.pinv(x_matrix)
print(y_matrix)
print(pinv_x)
w = pinv_x*y_matrix  # this line raises the ValueError
You are using arrays, not matrices. To get matrix multiplication from arrays, you need to use the dot() function (or, in Python 3.5+, the @ operator), not *. See this page. The * operator is element-wise multiplication when the data is in an array, so the shapes must match or be broadcastable to a common shape.
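A minimal sketch of the difference, using stand-in values with the same shapes as in the question (the numbers themselves are made up):

```python
import numpy as np

pinv_x = np.arange(6.0).reshape(3, 2)    # stand-in for the 3x2 pseudo-inverse
y_matrix = np.array([[1.0], [-1.0]])     # 2x1 column of labels

w = np.dot(pinv_x, y_matrix)             # matrix product, shape (3, 1)
print(w.shape)                           # (3, 1)
print(np.array_equal(w, pinv_x @ y_matrix))   # True: @ computes the same product

try:
    pinv_x * y_matrix                    # element-wise; (3,2) and (2,1) do not broadcast
    raised = False
except ValueError:
    raised = True
print(raised)                            # True
```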
