I learned that we need to use tf.OPERATIONS to define the computation graph, but I found that sometimes using + or = works just fine, without calling tf.add or tf.assign (see here).
My question is: which operations are allowed in a TensorFlow loss function definition without using "tf.OPERATIONS"? In other words, besides + and =, what else works? Can we use, for example, * or ^2 on variables?
PS: I just do not understand why x*x is OK but x^2 is not...
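For example, here is roughly what I have been trying (a toy snippet; the variable names are made up):

import tensorflow as tf

x = tf.Variable(3.0)

y_add = x + x    # works without tf.add
y_mul = x * x    # works without tf.multiply
# y_sq = x ^ 2   # fails, even though x*x is fine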
I am looking for help with the implementation of a logit model with statsmodels for binary variables.
Here is my code:
(I am using the feature selection methods MinimumRedundancyMaximumRelevance and RecursiveFeatureElimination, available in Python.)
for i_mrmr in range(4, 20):
    for i_rfe in range(3, i_mrmr):
        regressors_step1 = ...  # I am selecting the MRMR features
        regressors_step2 = ...  # I am selecting features from the previous list with the RFE method
        for method in ['newton', 'nm', 'bfgs', 'lbfgs', 'powell', 'cg', 'ncg']:
            logit_model = Logit(y, X.loc[:, regressors_step2])
            try:
                result = logit_model.fit(method=method, cov_type='HC1')
                print(result.summary())
            except:
                result = "error"
I am using Logit from statsmodels.discrete.discrete_model.
The y variable, the target, is binary.
All the explanatory variables, X, are binary too.
The logit model is "functioning" for the different optimization methods. That is to say, I end up with a summary to print. Nonetheless, various warnings are printed, such as: "Maximum Likelihood optimization failed to converge."
The optimization methods offered by statsmodels are the ones from scipy:
‘newton’ for Newton-Raphson, ‘nm’ for Nelder-Mead
‘bfgs’ for Broyden-Fletcher-Goldfarb-Shanno (BFGS)
‘lbfgs’ for limited-memory BFGS with optional box constraints
‘powell’ for modified Powell’s method
‘cg’ for conjugate gradient
‘ncg’ for Newton-conjugate gradient
We can find these methods in scipy.optimize.
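For reference, convergence can also be checked on the fitted result itself (a rough snippet; maxiter=500 is just an illustrative value):

# refit with a higher iteration limit and check convergence explicitly
result = logit_model.fit(method='bfgs', maxiter=500, cov_type='HC1')
print(result.mle_retvals['converged'])  # True if the optimizer reported convergence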
Here are my questions:
I did not find anywhere any argument against using these optimization methods with a binary set of variables. But, because of the warnings, I am asking myself whether it is correct to do so. And then, what is the best method, the one that is most appropriate in this case?
Here: Scipy minimize: how to restrict x only to 0 and 1? it is implicitly said that a model of the Python MIP (Mixed-Integer Linear Programming) kind could be better in the binary-variables case. In the documentation of the Python MIP package it appears that, to implement this kind of model, I should explicitly give a function to minimize or maximize and also express the constraints... (see: https://docs.python-mip.com/en/latest/quickstart.html#creating-models)
Therefore I am wondering whether I need to define a logit function as the objective function. What constraints should I express? Is there an easier way to do this?
I was reading this question and trying to do the same, but I want the function to have a single parameter, say x, where that parameter is an array of "values" to be filled in by an optimization solver. For instance:
def f(x):
    return x[0]**2 + 3*x[1]
That function corresponds to f(x) = x^2 + 3y, meaning x is an array of variables. Each of those variables may or may not appear in the current function, because together they are all the variables of the whole optimization problem, so they can also appear in the constraints. I would therefore like to find that function's partial derivatives with respect to all the variables. So, in this case, I will need 2 callable functions that I can use to form a new array, the Jacobian of the function. Is there a way to do that? How?
Disclaimer: I am the author of pyneqsys.
If you are open to using a library, pyneqsys does exactly this. If not, you can look at the source of pyneqsys/symbolic.py which (approximately) does this to calculate the jacobian:
f = sympy.Matrix(self.nf, 1, self.exprs)
x = sympy.Matrix(self.nx, 1, self.x)
J = f.jacobian(x)
You then need to use sympy.lambdify to obtain a callable with the expected syntax of your particular solver.
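For example, a small self-contained sketch of the same idea, using the expression from the question (the rest of the names are made up):

import sympy as sp

x0, x1 = sp.symbols('x0 x1')
exprs = [x0**2 + 3*x1]               # f(x) = x^2 + 3y from the question

f = sp.Matrix(len(exprs), 1, exprs)
x = sp.Matrix(2, 1, [x0, x1])
J = f.jacobian(x)                    # Matrix([[2*x0, 3]])

# lambdify turns the symbolic Jacobian into a numpy-callable function
jac = sp.lambdify((x0, x1), J, modules='numpy')
print(jac(1.0, 2.0))                 # [[2. 3.]]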
I am using scipy's minimize function, and I'd like one of the parameters to search only over values with two decimals.
from scipy.optimize import minimize

def cost(parameters, input, target):
    from sklearn.metrics import mean_squared_error
    output = self.model(parameters=parameters, input=input)
    cost = mean_squared_error(target.flatten(), output.flatten())
    return cost

parameters = [1, 1]  # initial parameters
res = minimize(fun=cost, x0=parameters, args=(input, target))
model_parameters = res.x
Here self.model is a function that performs some matrix manipulation based on the parameters. input and target are two matrices. The function works the way I want it to, except that I would like parameters[1] to have a constraint. Ideally I'd just like to give it a numpy array, like np.arange(0, 10, 0.01). Is this possible?
In general this is very hard to do, as smoothness is one of the core assumptions of those optimizers.
Problems where some variables are discrete and some are not are hard, and are usually tackled either by mixed-integer optimization (which works well for MI linear programming and reasonably well for MI convex programming, although there are fewer good solvers) or by global optimization (usually derivative-free).
Depending on the details of your task, I recommend decomposing the problem:
an outer loop that fixes the variable to each value in np.arange(0, 10, 0.01)
an inner loop that optimizes the remaining variables while this variable is held fixed
return the model with the best objective (with status = success)
This results in N inner optimizations, where N is the size of the state space of the variable being fixed.
Depending on your task/data, it might be a good idea to traverse the fixing space monotonically (e.g. using np.arange) and use the solution of iteration i as the initial point for problem i+1 (potentially fewer iterations needed if the guess is good). But this is probably not relevant here; see the next part.
If you really have only 2 parameters, as indicated, this decomposition leads to an inner problem with only 1 variable. Then don't use minimize; use minimize_scalar (it is faster and more robust, and does not need an initial point).
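A rough sketch of this decomposition, assuming the cost function and data from the question (input is renamed input_ here to avoid shadowing the builtin):

import numpy as np
from scipy.optimize import minimize_scalar

best = None
for p1 in np.arange(0, 10, 0.01):                # outer loop: fix parameters[1]
    # inner 1-D problem over parameters[0] with parameters[1] held fixed
    inner = minimize_scalar(lambda p0: cost([p0, p1], input_, target))
    if inner.success and (best is None or inner.fun < best[0]):
        best = (inner.fun, [inner.x, p1])        # keep the best (cost, parameters)

best_cost, best_parameters = best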
I am new to programming, especially programming with TensorFlow, and I'm making toy problems to learn how to use it.
In this case I want to build a function like softmax, where the denominator is not the sum over all classes, but a sum over some sampled classes.
In Python, using numpy, it would look like this:
import numpy as np
from random import randint

def my_softmax(X, W, num_of_samples):
    K = 4
    S = np.zeros(np.dot(X, np.transpose(W)).shape)
    for line in range(X.shape[0]):
        XW = np.dot(X[line], np.transpose(W))
        m = np.max(XW)
        samples_sum = 0
        for s in range(num_of_samples):
            r = randint(0, K-1)
            samples_sum += np.exp(XW[r] - m)
        S[line] = np.exp(XW - m) / samples_sum
    return S
How could this be implemented in TensorFlow?
More generally, is there a way to create new "custom" functions like that?
You can wrap Python/numpy functions as TensorFlow operators. See tf.py_func:
https://www.tensorflow.org/versions/r0.9/api_docs/python/script_ops.html
However, it is better not to use it in a production setting, as performance will be (significantly) impacted. For most np.* functions you will find corresponding tf.* functions that you can use. Try to represent all your computation in terms of matrix/vector operations instead of for loops.
Also see
https://www.tensorflow.org/versions/r0.11/api_docs/python/constant_op.html
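A minimal sketch of the wrapping (the placeholder shapes are made up; my_softmax is the numpy function from the question):

import numpy as np
import tensorflow as tf

X_ph = tf.placeholder(tf.float64, shape=[None, 10])
W_ph = tf.placeholder(tf.float64, shape=[4, 10])

# tf.py_func wraps the plain numpy function as an op in the graph
S = tf.py_func(lambda x, w: my_softmax(x, w, num_of_samples=2),
               [X_ph, W_ph], tf.float64)

with tf.Session() as sess:
    out = sess.run(S, feed_dict={X_ph: np.random.rand(5, 10),
                                 W_ph: np.random.rand(4, 10)})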
The following example is given just for the purpose of precisely defining the query. Consider a recursive equation x[k+1] = a*x[k], where a is some constant. Now, is there an easier way, or an existing method within sympy/numpy, that does the following (i.e., gives an expression over a horizon for a given recursive equation)?
from sympy import Symbol

def get_expr(init, num):
    a = Symbol('a')
    expr = init
    for i in range(num):
        expr = a*expr
    return expr

x0 = Symbol('x0')
get_expr(x0, 3)
Horizon above is 3.
I was going to suggest using SymPy's rsolve to try to find a closed-form solution to your equation, but it seems that, at least for this specific one, there is a bug that prevents it from working. See http://code.google.com/p/sympy/issues/detail?id=2943. Maybe if you really need this for a more complicated expression you could try that. For this one, the closed-form solution is just a**n*x0.
Aside from that, SymPy doesn't have any functions that would do this evaluation directly, but it does have some things that can help. There are some memoization decorators in sympy.utilities.memoization that are made for internal use, but they should work just fine for external uses. They can make your evaluation more efficient by caching the results of previous evaluations. You'll need to write get_expr recursively for it to work effectively. Or you could just write your own cacher. It's not that complicated.
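For example, a quick sketch of such a cacher using functools.lru_cache instead of the SymPy-internal decorators (the recurrence is the one from the question):

from functools import lru_cache
from sympy import Symbol

a = Symbol('a')
x0 = Symbol('x0')

@lru_cache(maxsize=None)
def get_expr(num):
    # x[k+1] = a*x[k], unrolled recursively; results are cached so that
    # evaluating horizon n reuses the work done for horizons < n
    if num == 0:
        return x0
    return a * get_expr(num - 1)

print(get_expr(3))  # a**3*x0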