Theano 2D histogram - python

Theano complains that x0 is never used when I try to use it to compute a 2D histogram.
fth is the Theano implementation that doesn't work; fnp is the NumPy implementation that works as expected.
import theano
import theano.tensor as T
import numpy as np

def fth(a, b, c):
    x0 = T.lvector('x0')
    x1 = T.lvector('x1')
    z = T.lvector('z')
    x1 *= x0.size
    x0 += x1
    T.subtensor.AdvancedIncSubtensor1(z, x0)
    f = theano.function([x0, x1, z], z)
    return f(a, b, c, on_unused_input='ignore')

def fnp(a, b, c):
    np.add.at(c, a+b*a.size, 1)
    return c
a = np.array([0, 1, 2, 3], dtype=np.int64)
b = np.array([0, 1, 0, 1], dtype=np.int64)
c = np.zeros(16, dtype=np.int64)
print(fnp(a, b, c))
print(fth(a, b, c))
Error:
theano.compile.function_module.UnusedInputError: theano.function
was asked to create a function computing outputs given certain
inputs, but the provided input variable at index 0 is not part of
the computational graph needed to compute the outputs:
Elemwise{add,no_inplace}.0.
To make this error into a warning, you can pass the parameter
on_unused_input='warn' to theano.function. To disable it
completely, use on_unused_input='ignore'.
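For reference, the error is consistent with the graph above: the in-place operators rebind x0 and x1 to new symbolic nodes, the result of AdvancedIncSubtensor1 is never assigned, and the requested output z is just the input z, so the graph never touches x0 or x1. A minimal sketch of one way to wire a graph where every input feeds the output, assuming T.inc_subtensor plays the role np.add.at plays in fnp:

import theano
import theano.tensor as T

x0 = T.lvector('x0')
x1 = T.lvector('x1')
z = T.lvector('z')

# Build the flat bin index symbolically; inc_subtensor returns a NEW
# variable that depends on all three inputs, so nothing goes unused.
idx = x0 + x1 * x0.size
f = theano.function([x0, x1, z], T.inc_subtensor(z[idx], 1))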

Related

"IndexError: too many indices for array" when trying to use scipy.optimize.minimize

I want to use Scipy's minimize function to find the optimal values that minimize the error function. I used scipy.optimize.minimize, which requires me to specify the upper and lower bounds and any constraints to be passed to the minimization function. I wanted to add an inequality constraint such that A*x < b, so here is my code:
from scipy.optimize import minimize, LinearConstraint
import numpy as np

def error_func(theta):
    return theta[0] - theta[1]

theta0 = [100, 0]
A = np.array([[1, 0], [0, 1]])
b = np.array([[100], [0]])
bnds = ((0, 100), (0, 0))
constraint = LinearConstraint(A, lb=-np.inf, ub=b)
theta = minimize(error_func, theta0, method='trust-constr', constraints=constraint, bounds=bnds, options={'maxiter': 500})
But, when I run the code, I receive the following error on the optimization function line:
/usr/local/lib/python3.7/dist-packages/scipy/optimize/_constraints.py in __init__(self, constraint, x0, sparse_jacobian, finite_diff_bounds)
    259         mask = keep_feasible & (lb != ub)
    260         f0 = fun.f
--> 261         if np.any(f0[mask] < lb[mask]) or np.any(f0[mask] > ub[mask]):
    262             raise ValueError("`x0` is infeasible with respect to some "
    263                              "inequality constraint with `keep_feasible` "
IndexError: too many indices for array: array is 1-dimensional, but 2 were indexed
So can anyone please explain why I receive this error? What am I doing wrong here?
I just figured out the solution. The constraint I added finds a solution such that A*x <= b, so A*x gets compared against b. The output of that comparison has shape (2,2). This is numpy broadcasting at work: A.dot(x) actually has shape (2,), not (2,1), and comparing a (2,) array with the (2,1) array b broadcasts both operands to (2,2). Long story short, the minimization function expects the constraint comparison to return a value with the same shape as the initial theta, so I needed to change my constraint function so that it returns the same shape as the initial theta. Here is the correct code:
from scipy.optimize import minimize, NonlinearConstraint
import numpy as np

def error_func(theta):
    return theta[0] - theta[1]

theta0 = [100, 0]
A = np.array([[1, 0], [0, 1]])
b = [100, 0]
bnds = ((0, 100), (0, 0))
func = lambda x: A.dot(x).tolist()
constraint = NonlinearConstraint(func, -np.inf, b)
theta = minimize(error_func, theta0, method='trust-constr', constraints=constraint, bounds=bnds, options={'maxiter': 500})
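To see where the unexpected (2,2) shape came from, the broadcasting can be checked directly; a standalone demonstration (not part of the fix):

import numpy as np

A = np.array([[1, 0], [0, 1]])
x = np.array([100, 0])
b_col = np.array([[100], [0]])  # shape (2, 1), as in the original code

# A (2,) array compared against a (2, 1) array broadcasts to (2, 2)
print((A.dot(x) <= b_col).shape)               # (2, 2)
print((A.dot(x) <= np.array([100, 0])).shape)  # (2,), the expected shape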

Scipy curve fit (optimization) - vectorizing a conditional to identify threshold using a custom function

I'm trying to use scipy's curve_fit to capture the value of the a0 parameter. As of now, it is not changing (it always comes out as 1):
X = [[1, 2, 3], [4, 5, 6]]

def func(X, a0, c):
    x1 = X[0]; x2 = X[1]
    a = x1*x2
    result = np.where(a(a<a0), -c*(a + np.sqrt(x2)), -c*x1)
    return result

Popt, Cov = scipy.curve_fit(func, X, y)
a0, c = Popt
Predicted = func(X, a0, c)  # a0 and c are constants
I get the value for c, which is a scalar, without any problem, but I can't explain why a0 (also a scalar) is always 1, and I am not sure how to fix it. I did see elsewhere on SO that np.where can be used the way I have used it here, but apparently not inside a function passed to curve_fit. Maybe I need to use a different method of optimization, and I'd like some pointers to do this using scipy methods.
Edit: I tried the construct suggested by Brad, but that's not it.
Updated!
This should work. Note that the a variable is a vector of length 3 in this example, because it is computed by the element-wise multiplication of the first and second rows of X, which is a 2x3 matrix. Therefore a0 can be either a scalar or a vector of length 3, and c can likewise be a scalar or a vector of length 3.
import numpy as np

X = np.array([[1, 2, 3], [4, 5, 6]])
a0 = np.array([8, 25, 400])
# a0 = 2
# The code works whether c is a scalar or an array, since it can be
# broadcast against the array a below.
# c = 3  # Uncomment this for a scalar
c = np.array([8, 12, 2000])  # Element-wise

def func(X, a0, c):
    x = X[0]
    y = X[1]
    a = x * y
    print(a.shape)
    result = np.where(a < a0, c * (a + np.sqrt(y)), c * x)
    return result

func(X, a0, c)
This is the minimum amount of code that works. Notice I removed the y>0 condition and defined a to be the same size as c. Now you get the correct insertions, because the first parameter of np.where is the same size as the second and third parameters. Before, (x<a) & (y>0) always evaluated to a single True or False, which is a scalar in this context. If a were an N-dimensional array, you would have received a ValueError because the operands could not be broadcast together.
import numpy as np

c = np.array([[22, 34], [33, 480]])

def func(X, a):
    x = X[0]; y = X[1]
    return np.where(c[(x<a)], -c*(a + np.sqrt(y)), -c*x)

X = [25, 600]
a = np.array([[2, 14], [33, 22]])
func(X, a)
This also works if c is a constant and a is the array you want manipulated:
import numpy as np

c = 2

def func(X, a):
    x = X[0]; y = X[1]
    return np.where(a[(x<a)], -c*(a + np.sqrt(y)), -c*x)

X = [25, 600]
a = np.array([[2, 14], [33, 22]])
func(X, a)
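And to see the ValueError mentioned above, here is a minimal demonstration (an illustration, not from the original answer) of a condition whose shape cannot be broadcast against the two branches:

import numpy as np

cond = np.array([True, False])  # shape (2,)
try:
    np.where(cond, np.arange(3), -np.arange(3))  # branches have shape (3,)
except ValueError as e:
    print(e)  # operands could not be broadcast together ...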

Evaluating a numpy array

I have a function that returns a numpy array as follows:
import numpy as np
from scipy.misc import derivative
import sympy as sp

x, y = sp.symbols('x y')
f = x**2 + y**2

def grad(f):
    exp = sp.expand(f)
    dfdx = sp.diff(exp, x)
    dfdy = sp.diff(exp, y)
    global grad
    Df = np.array([dfdx, dfdy])
    return Df
I'm using the variable Df in another function and doing some computations with it.
As you may have guessed, the results come out in terms of the symbols x and y. However, I need the results to be evaluated each time at the initial values I choose for x and y, rather than staying symbolic.
I was wondering if there is something like sympy's .subs() that works on a numpy array rather than on a sympy expression?
Sympy and numpy are two separate worlds that aren't easy to bring together.
With sympy's lambdify, sympy expressions can be made to work on numpy arguments. When arrays are used as arguments, they all need to be 1D and of the same size. The function np_grad_1 below shows how this works out of the box: it returns an array with two subarrays.
To get your desired functionality, a wrapper can take a 2D numpy input and convert the result back to a 2D numpy array:
import sympy as sp
import numpy as np

x, y = sp.symbols('x y')
f = x ** 2 + y ** 2

def grad(f, x, y):
    exp = sp.expand(f)
    dfdx = sp.diff(exp, x)
    dfdy = sp.diff(exp, y)
    return [dfdx, dfdy]

np_grad_1 = sp.lambdify([x, y], grad(f, x, y))
np_grad_2 = lambda points: np.array(np_grad_1(points[:, 0], points[:, 1])).T

points = np.random.uniform(-1, 1, (5, 2))
np_grad_1(points[:, 0], points[:, 1])  # returns an array with 2 subarrays
np_grad_2(points)  # returns an Nx2 array
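As a quick sanity check: the gradient of x**2 + y**2 is (2x, 2y), so evaluating the wrapper at a known point should give the expected numbers:

pt = np.array([[1.0, 2.0]])
print(np_grad_2(pt))  # [[2. 4.]], i.e. (2*1, 2*2)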

Optimizing a function where one of the parameters is an array

I want to optimize a function by varying the parameters where two of the parameters are actually arrays. I've tried to do
...
# initial parameters
params0 = np.array([p1, p2, ... , p_array1, p_array2])
p_min = minimize(myfunc, params0, args)
...
where the pj's are scalars and p_array1 and p_array2 are arrays of the same length, but this gave me an error saying
ValueError: setting an array element with a sequence.
I've also tried passing p_array1 and p_array2 as scalars into myfunc and then creating predetermined arrays from those two inside myfunc (e.g. setting p_array1 = p_array1*np.arange(6), and similarly for p_array2). That eliminates the error, but I don't want the arrays to be predetermined; instead, I want minimize to figure out what they should be.
Is there any way that I can utilize one of Scipy's optimization functions without getting this error while still keeping p_array1 and p_array2 as arrays and not scalars?
EDIT
Sorry for being very broad but here is my code:
NOTE: 'myfunc' here is actually norm_residual.
import pandas as pd
import numpy as np
from scipy.integrate import odeint  # needed for the odeint call below

def f(yvec, t, a, b, c, d, M, theta):
    # the system of ODEs to be solved
    x, y = yvec
    dydt = [a*x - b*y**2 + 1, -c*x - d*x*y + np.sum(M * np.cos(theta*t))]
    return dydt

ni = 3  # the number of periodic forcing functions to add to the DE system
M = 0.56*np.random.rand(ni)  # the initial amplitudes of the forcing functions
theta = np.pi/6*np.arange(ni)  # the initial coefficients of the forcing functions

# initialize the parameters
params0 = [0.75, 0.23, 1.0, 0.2, M, theta]

# grabbing the data to be used later
data = pd.read_csv('data.csv')
y_data = data['Y']
N = y_data.shape[0]  # 20
t = np.linspace(0, N, N)  # array of t values to integrate over
yvec0 = [0.3, 0.34]  # initial conditions for x and y respectively

def norm_residual(params, *args):
    """
    Computes the L^2 norm of the residual of y and the data (y as defined above).
    Input:  params = array of parameters (scalars or arrays) for the DE system
            args = other arguments to pass into the function f or to use
                   to compute the residual.
    Output: err = L^2 error of the solution vector (scalar).
    """
    data, yvec0, t = args
    a, b, c, d, M, theta = params
    sol = odeint(f, yvec0, t, args=(a, b, c, d, M, theta))
    x = sol[:, 0]; y = sol[:, 1]
    res = data - y
    err = np.linalg.norm(res, 2)
    return err

from scipy.optimize import minimize
p_min = minimize(norm_residual, params0, args=(y_data, yvec0, t))
print(p_min)
And the traceback:
Traceback (most recent call last):
  File "model_ex_1.py", line 62, in <module>
    p_min = minimize(norm_residual, params0, args=(y_anom, yvec0, t))
  File "/usr/lib/python2.7/dist-packages/scipy/optimize/_minimize.py", line 354, in minimize
    x0 = np.asarray(x0)
  File "/usr/lib/python2.7/dist-packages/numpy/core/numeric.py", line 482, in asarray
    return array(a, dtype, copy=False, order=order)
ValueError: setting an array element with a sequence.
You cannot put a list in a numpy array if the other elements are scalars.
>>> import numpy as np
>>> foo_array = np.array([1,2,3,[5,6,7]])
Traceback (most recent call last):
  File "<pyshell#1>", line 1, in <module>
    foo_array = np.array([1,2,3,[5,6,7]])
ValueError: setting an array element with a sequence.
It would be helpful if you posted myfunc, but you can do this:

def foo():
    return [p0, p1, p2, ..., pn]

params0 = numpy.array([foo(), p_array1, p_array2])
p_min = minimize(myfunc, params0, args)
OR, from Multiple variables in SciPy's optimize.minimize:

import scipy.optimize as optimize

def f(params):
    # print(params)  # <-- you'll see that params is a NumPy array
    a, b, c = params  # <-- for readability you may wish to assign names to the component variables
    return a**2 + b**2 + c**2

initial_guess = [1, 1, 1]
result = optimize.minimize(f, initial_guess)
if result.success:
    fitted_params = result.x
    print(fitted_params)
else:
    raise ValueError(result.message)
I figured it out! The solution that I found to work was to change
params0 = [0.75, 0.23, 1.0, 0.2, M, theta]
in line 6 to
params0 = np.array([ 0.75, 0.23, 1.0, 0.2, *M, *theta], dtype=np.float64)
and in my function definition of my system of ODEs to be solved, instead of having
def f(yvec, t, a, b, c, d, M, theta):
x, y = yvec
dydt = [ a*x - b*y**2 + 1, -c*x - d*x*y + np.sum(M * np.cos(theta*t)) ]
return dydt
I now have
def f(yvec, t, myparams):
x, y = yvec
a, b, c, d = myparams[:4]
ni = (myparams[4:].shape[0])//2 # halved b/c M and theta are of the same shape
M = myparams[4:ni+4]
theta = myparams[ni+4:]
dydt = [ a*x - b*y**2 + 1, -c*x - d*x*y + np.sum(M * np.cos(theta*t)) ]
return dydt
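One detail the post doesn't show: with the new signature, norm_residual has to forward the whole parameter vector as a single extra argument. A sketch, assuming the rest of the function is unchanged:

def norm_residual(params, *args):
    data, yvec0, t = args
    # pass the full flat parameter array to f in a one-element tuple
    sol = odeint(f, yvec0, t, args=(params,))
    res = data - sol[:, 1]
    return np.linalg.norm(res, 2)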
NOTE: I had to add "dtype=np.float64" for 'params0' because without it I was getting the error
AttributeError: 'numpy.float64' object has no attribute 'cos'
and it appears that np.cos does not know how to handle object-dtype arrays. The workaround can be found here.
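For illustration (a reconstruction, not from the original post), the error appears whenever np.cos receives an object-dtype array, since numpy then asks each element for a .cos() method:

obj = np.array([0.1, 0.2], dtype=object)
np.cos(obj)  # AttributeError: 'float' object has no attribute 'cos'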
Thanks everyone for the suggestions!

An error in matplotlib related to numpy.roots

I was trying to plot the modulus of the sum of the roots of a quadratic, and it returns an error, illustrated as follows:
import numpy as np
import matplotlib.pyplot as plt

def rooting(a, b, c):
    y = [a, b, c]
    z = np.roots(y)
    return np.absolute(z[0]+z[1])

x = np.linspace(1, 10, 10)
plt.plot(x, rooting(x, 2, 3))
and the error was:
File "C:\Users\user\Anaconda3\lib\site-packages\numpy\core\fromnumeric.py", line 1570, in nonzero
res = nonzero()
SystemError: <built-in method nonzero of numpy.ndarray object at 0x000001422B9BFD00> returned a result with an error set
Can someone tell me what's going on?
The problem arises because you are passing an array as the coefficient a and concatenating it with the scalars b and c. np.roots expects each coefficient to be a scalar, so a must be passed one value at a time. Here is my solution, based on the above:
def rooting(a, b, c):
    y = [a, b, c]
    z = np.roots(y)
    return np.absolute(z[0]+z[1])

x = np.linspace(1, 10, 10)
y = [rooting(xi, 2, 3) for xi in x]
plt.plot(x, y)
plt.show()
Using the quadratic formula, we know the roots are (-b ± √(b**2-4ac))/(2a), so the roots sum to -b/a and the modulus of the sum is |b/a|.
With this simplification, we can compute the result in a vectorized way (no list comprehension, looping, or multiple calls of rooting necessary):
import numpy as np
import matplotlib.pyplot as plt

def rooting(a, b, c):
    # The roots are (-b ± √(b**2-4ac))/(2a),
    # so the modulus of the sum of the roots is |b/a|.
    return np.abs(b/a)

x = np.linspace(1, 10, 10)  # start at 1: a = 0 would divide by zero
plt.plot(x, rooting(x, 2, 3))
plt.show()
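As a quick sanity check, the simplification agrees with np.roots for a sample coefficient set:

z = np.roots([4.0, 2, 3])           # roots of 4x**2 + 2x + 3
print(np.abs(z.sum()), abs(2/4.0))  # both print 0.5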
