Fitting with Transcendental equation - python

I have an ordinary data set: current density as a function of voltage, J(V). My goal is to fit these data with a model. The problem is that my model consists of transcendental equations, so I cannot write an explicit function for J and use lmfit, for instance. My model looks like this:
Please have a look at the image
Any ideas of how I could do it?
If I solve the system with fsolve or similar, I would have to provide the parameters, so I don't know what to do.
I also tried to solve the system with SciPy, but it did not work.

I'm not sure you will find a clean, easy way to do this -- please let us know if you do.
Since your functions are basically exponentials, you may find that taking a few iterations within the model function produces a stable, self-consistent result. That is, if values are "well behaved" so that the Voltage-drop perturbations (Vdn - V) are fairly small, taking a few loops to get to near-self-consistency might be sufficient.
Then again, since they are exponentials, for large positive values of V they are likely to diverge quickly.

I think scipy.optimize.curve_fit has what you're looking for; I found this tutorial helpful for my case.
You could perhaps do something like this:
from scipy.optimize import curve_fit

def CurrentDensityFromVoltage(V, RS1, RS12, r, J01, J02):
    VD1 = <expression>
    VD2 = <expression>
    J1 = <expression>
    J2 = <expression>
    J3 = <expression>
    return J1+J2+J3

# coefficients to get CurrentDensity as a function of Voltage
param, _ = curve_fit(CurrentDensityFromVoltage, Voltage, CurrentDensity)
# current density 'cd' for any voltage 'v'
cd = CurrentDensityFromVoltage(v, *param)
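One way around the "I would have to provide the parameters" problem is to solve the implicit equation numerically inside the model function, using whatever trial parameters the fitter passes in on each call. The sketch below assumes a hypothetical single-diode equation with series resistance, J = J0*(exp((V - J*Rs)/VT) - 1), as a stand-in for the model in the question (which is only given as an image). The names diode_current_density, log10_J0 and Rs, the value of VT, and the synthetic data are all illustrative assumptions, not taken from the original.

```python
import numpy as np
from scipy.optimize import brentq, curve_fit

VT = 0.02585  # thermal voltage kT/q in volts near 300 K (assumed value)

def diode_current_density(V, log10_J0, Rs):
    """Solve J = J0*(exp((V - J*Rs)/VT) - 1) for J at every voltage in V.

    Hypothetical stand-in model; requires Rs >= 0.
    """
    J0 = 10.0 ** log10_J0
    V = np.atleast_1d(np.asarray(V, dtype=float))
    J = np.empty_like(V)
    for i, v in enumerate(V):
        def residual(j):
            return J0 * np.expm1((v - j * Rs) / VT) - j
        # residual is strictly decreasing in j: positive at j = -J0 and
        # non-positive at max(ideal-diode current, 0), so the root is bracketed.
        upper = max(J0 * np.expm1(v / VT), 0.0)
        J[i] = brentq(residual, -J0, upper, xtol=1e-14, maxiter=200)
    return J

# Synthetic, noise-free "measurement" generated from known parameters.
voltage = np.linspace(0.0, 0.7, 36)
true_params = (-9.0, 2.0)  # log10(J0), Rs
measured = diode_current_density(voltage, *true_params)

# curve_fit calls the model with trial parameters; the implicit equation is
# re-solved on every call, so no closed-form J(V) is needed.
fit_params, _ = curve_fit(
    diode_current_density, voltage, measured,
    p0=(-8.0, 1.0), bounds=([-12.0, 0.0], [-6.0, 10.0]),
)
```

Fitting log10(J0) rather than J0 keeps the two parameters on comparable scales; the bounds keep Rs non-negative so that the root bracket stays valid. The same pattern should carry over to lmfit, since it also only needs a callable model.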


DBSCAN with custom metric

I have the following given:
a dataset in the range of thousands
a way of computing the similarity, but I cannot plot the datapoints themselves in Euclidean space
I know that DBSCAN should support a custom distance metric, but I don't know how to use it.
Say I have a function
def similarity(x, y):
    return similarity ...
and I have a list of data that can be passed pairwise into that function. How do I specify this when using the DBSCAN implementation of scikit-learn?
Ideally, I want to get a list of the clusters, but I can't figure out how to get started in the first place.
There is a lot of terminology that still confuses me:
How do I pass a feature array, and what is it? How do I fit this implementation to my needs? How will I be able to get my "sublists" from this algorithm?
A "feature array" is simply an array of the features of a datapoint in your dataset.
metric is the parameter you're looking for. It can be a string (the name of a builtin metric) or a callable. Your similarity function is a callable. This isn't well described in the documentation, but a metric has to do just that: take two datapoints as parameters and return a number. Note that DBSCAN treats that number as a distance (smaller means closer), so a similarity score has to be converted into a distance first.
import sklearn.cluster

def similarity(x, y):
    return ...

reduced_dataset = sklearn.cluster.DBSCAN(metric=similarity).fit(dataset)
In case someone is searching for the same thing for strings with a custom metric:
import numpy as np
from sklearn.cluster import DBSCAN

def metric(x, y):
    # x and y are one-element rows holding indices into string_seqs
    return yourDistFunc(string_seqs[int(x[0])], string_seqs[int(y[0])])

def clusterPockets():
    global string_seqs
    string_seqs = load_data()  # ["foo", "bar", ...]
    dat = np.arange(len(string_seqs)).reshape(-1, 1)
    clustered_dataset = DBSCAN(metric=metric).fit(X=dat, y=dat)
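To get from a pairwise similarity function to the "sublists" asked about in the question, one option is to build the full distance matrix yourself and pass it with metric="precomputed"; the cluster of each item can then be read from labels_, where -1 marks noise. This is a minimal sketch: the Jaccard-style similarity function, the example strings, and the eps and min_samples values are illustrative assumptions.

```python
import numpy as np
from sklearn.cluster import DBSCAN

def similarity(a, b):
    """Hypothetical similarity in [0, 1]: Jaccard index of the character sets."""
    sa, sb = set(a), set(b)
    return len(sa & sb) / len(sa | sb)

items = ["apple", "appel", "papel", "zebra", "braze", "kiwi"]

# DBSCAN needs distances, so convert similarity s to distance 1 - s.
n = len(items)
dist = np.zeros((n, n))
for i in range(n):
    for j in range(i + 1, n):
        dist[i, j] = dist[j, i] = 1.0 - similarity(items[i], items[j])

labels = DBSCAN(eps=0.3, min_samples=2, metric="precomputed").fit(dist).labels_

# Group the original items by cluster label; label -1 collects the noise points.
clusters = {}
for item, label in zip(items, labels):
    clusters.setdefault(int(label), []).append(item)
```

For a few thousand points the full matrix is still manageable in memory; for much larger datasets a callable metric avoids storing it, at the cost of slower pairwise evaluation.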

Alternatives for loss functions in python CNTK

I have created a sequential model in CNTK and pass this model into a loss function like the following:
ce = cross_entropy_with_softmax(model, labels)
As mentioned here, and since I have a multilabel classifier, I want to use a proper loss function. The problem is that I cannot find any proper documentation on these loss functions in Python. Is there any suggestion or sample code for this requirement?
I should note that I found these alternatives (logistic and weighted logistic) in the BrainScript language, but not in Python.
"my data has more than one label (three label) and each label has more than two values (30 different values)"
Do I understand right, you have 3 network outputs and associated labels, and each one is a 1-in-30 classifier? Then it seems you can just add three cross_entropy_with_softmax() values. Is that what you want?
E.g. if the model function returns a triple (ending in something like return combine([z1, z2, z3])), then your criterion function that you pass to Trainer could look like this (if you don't use Python 3, the syntax is a little different):
from cntk.layers.typing import Tensor, SparseTensor

def my_criterion(input : Tensor[input_dim], labels1 : SparseTensor[30],
                 labels2 : SparseTensor[30], labels3 : SparseTensor[30]):
    z1, z2, z3 = my_model(input).outputs
    loss = cross_entropy_with_softmax(z1, labels1) + \
           cross_entropy_with_softmax(z2, labels2) + \
           cross_entropy_with_softmax(z3, labels3)
    return loss

learner = ...
trainer = Trainer(None, my_criterion, learner)

# in MB loop:
input_mb, L1_mb, L2_mb, L3_mb = my_next_minibatch()
trainer.train_minibatch(my_criterion.argument_map(input_mb, L1_mb, L2_mb, L3_mb))
Update (based on comments below): If you are using a sequential model then you are probably interested in taking a sum over all positions in the sequence of the loss at each position. cross_entropy_with_softmax is appropriate for the per-position loss and CNTK will automatically compute the sum of the loss values over all positions in the sequence.
Note that the terminology multilabel is non-standard here as it is typically referring to problems with multiple binary labels. The wiki page you link to refers to that case which is different from what you are doing.
Original answer (valid for the actual multilabel case): You will want to use binary_cross_entropy or weighted_binary_cross_entropy. (We decided to rename Logistic when porting this to Python.) At the time of this writing these operations only support {0,1} labels. If your labels are in (0,1), then you will need to define your loss like this:
import cntk as C
# binary cross-entropy is the negative log-likelihood, hence the leading minus
my_bce = -(label*C.log(model) + (1-label)*C.log(1-model))
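As a framework-independent check of that formula, binary cross-entropy is the negative log-likelihood -(y*log(p) + (1-y)*log(1-p)), and it remains well defined for soft labels y in (0,1). The NumPy sketch below is only an illustration of the arithmetic, not CNTK code; the function name binary_cross_entropy and the eps clipping are assumptions added here.

```python
import numpy as np

def binary_cross_entropy(p, y, eps=1e-12):
    """Element-wise binary cross-entropy for predictions p and labels y in [0, 1]."""
    p = np.clip(np.asarray(p, dtype=float), eps, 1.0 - eps)  # avoid log(0)
    y = np.asarray(y, dtype=float)
    return -(y * np.log(p) + (1.0 - y) * np.log(1.0 - p))

# With a soft label y = 0.3 the loss is smallest when the prediction equals 0.3.
loss_at_label = binary_cross_entropy(0.3, 0.3)
loss_elsewhere = binary_cross_entropy(0.5, 0.3)
```

The leading minus sign matters: without it, a minimizer would drive the predictions away from the labels.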
Currently, most operators are in the cntk.ops package and documented here. The only exception being the sequence related operators, which reside in cntk.ops.sequence.
We have plans to restructure the operator space (without breaking backwards compatibility) to increase discoverability.
For your particular case, cross_entropy_with_softmax seems to be a reasonable choice, and you can find its documentation with examples here. Please also check out this Jupyter Notebook for a complete example.

basic example with fmin_bfgs from scipy.optimize (python) does not work

I am learning the optimization functions in scipy. I want to use the BFGS algorithm where the gradient of a function can be provided. As a basic example I want to minimize the following function: f(x) = x^T A x , where x is a vector.
When I implement this in python (see implementation below), I get the following error:
message: 'Desired error not necessarily achieved due to precision loss.'
Tracing the error back to the source led me to the function of scipy that performs a line search to determine the step length, but I have no clue why it fails on such a simple example.
My code to implement this is as follows:
# coding: utf-8
from scipy import optimize
import numpy as np

# Matrix to be used in the function definitions
A = np.array([[1.,2.],[2.,3.]])

# Objective function to be minimized: f = x^T A x
def f(x):
    return np.dot(x.T, np.dot(A, x))

# gradient of the objective function, df = 2*A*x
def fp(x):
    return 2*np.dot(A, x)

# Initial value of x
x0 = np.array([1.,2.])

# Try with BFGS
xopt = optimize.minimize(f,x0,method='bfgs',jac=fp,options={'disp':1})
The problem here is that you are looking for a minimum, but the value of your target function f(x) is not bounded in the negative direction.
At first sight, your problem looks like a basic example of a convex target function, but if you take a closer look, you will realize it is not.
For convexity, A has to be positive (semi-)definite. This condition is violated in your case: det(A) = 1*3 - 2*2 = -1 < 0, so A has one negative eigenvalue and is indefinite.
If you instead pick A = np.array([[2.,2.],[2.,3.]]), everything will be fine again.
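The definiteness argument can be checked numerically before running the optimizer. The following sketch compares the eigenvalues of the two matrices and then runs BFGS on the positive definite one; the helper name quadratic and the variable names are illustrative assumptions.

```python
import numpy as np
from scipy import optimize

def quadratic(A):
    """Return f(x) = x^T A x and its gradient 2 A x for a symmetric matrix A."""
    def f(x):
        return x @ A @ x
    def grad(x):
        return 2 * A @ x
    return f, grad

A_indefinite = np.array([[1., 2.], [2., 3.]])  # matrix from the question
A_definite = np.array([[2., 2.], [2., 3.]])    # matrix suggested in the answer

# eigvalsh is meant for symmetric matrices and returns real eigenvalues.
eig_indefinite = np.linalg.eigvalsh(A_indefinite)  # one negative eigenvalue
eig_definite = np.linalg.eigvalsh(A_definite)      # all eigenvalues positive

f, grad = quadratic(A_definite)
result = optimize.minimize(f, np.array([1., 2.]), method="BFGS", jac=grad)
```

With the indefinite matrix, f decreases without bound along the eigenvector of the negative eigenvalue, which is why the line search in the original example cannot settle on a step.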

Mathematical Equations - Rendering and Evaluation with Python and QT (and sympy?)

I am developing a GUI application (in the civil engineering context) with Python 3 and Qt and want to display an equation in three different ways:
(1) symbolic: sigma=N/A
(2) with values: sigma=200kN/20cm²
(3) as a result: sigma=10kN/cm²
The layout of the equation and the order of symbols have to be the same for both (1) and (2), but I only want to enter the equation once in my source code. I searched a lot; this is the best I could get:
from sympy import Symbol, latex

class myfancy_equation():
    def __init__(self):
        self.a = Symbol('a')
        self.b = Symbol('b',commutative=False)
        self.x = Symbol('x')
        self.expr = (self.b * self.a)/self.x

    def mlatex(self):
        return latex(self.expr)

    def mevaluate(self,a_in,b_in,x_in):
        unev = self.expr.subs({self.a:a_in,self.b:b_in,self.x:x_in})
        symb_names = dict()
        symb_names[self.a] = str(a_in)
        symb_names[self.b] = str(b_in)
        symb_names[self.x] = str(x_in)
        # unev_latex = latex(self.expr.subs({self.a:a_in,self.b:b_in,self.x:x_in}))
        unev_latex = latex(self.expr,symbol_names=symb_names)
        ev = self.expr.subs({self.a:a_in,self.b:b_in,self.x:x_in}).evalf()
        return unev,unev_latex,ev

mfe = myfancy_equation()
lat = mfe.mlatex()
un,unl,ev = mfe.mevaluate(5,7,3)
print("Original, unevaluated, evaluated:")
print( lat,"=",unl,'=',ev)
I have read that SymPy was not primarily developed for displaying equations, but the result is hardly readable (and unpredictable) for more complex equations. I tried playing around with the "commutative" parameter, but I always end up with a mixed equation like this:
Am I missing the point, or is it just impossible with SymPy?
By the way, I encountered different behaviour of the commutative parameter when using Python 2.
commutative=False will only mark that one symbol as non-commutative. SymPy will put the commuting part (in this case, everything else) first, and follow it by the non-commuting symbols in the correct order.
But you shouldn't use that anyhow. It will not give you what you want (e.g., you'll get a*b**-1 instead of a/b if a and b are noncommutative, since a*b**-1 and b**-1*a are different).
I would recommend just getting the latex for the individual parts that you want, and piecing them together in the order you want using string formatting.
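A minimal sketch of that string-formatting approach for a simple quotient such as sigma = N/A: the fraction template is written once, and the symbolic and with-values forms are produced by filling it with different pieces, so the layout is identical in both. The helper name render_quotient and the example values are assumptions, and units are left out for brevity.

```python
from sympy import Symbol, latex

def render_quotient(num, den, values):
    """Return LaTeX for num/den in symbolic form, with values, and evaluated.

    `values` maps each symbol to the number to display and substitute.
    """
    template = r"\frac{%s}{%s}"  # one layout shared by both displayed forms
    symbolic = template % (latex(num), latex(den))
    with_values = template % (values[num], values[den])
    result = latex((num / den).subs(values))
    return symbolic, with_values, result

N = Symbol("N")
A = Symbol("A")
sym, val, res = render_quotient(N, A, {N: 200, A: 20})
line = r"\sigma = %s = %s = %s" % (sym, val, res)
```

SymPy is only used to render the individual pieces and to compute the result, so its internal term ordering never affects the displayed layout.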
Alternatively, you can write your own printer that orders things the way you want. If you are interested in taking that route, see the SymPy printing documentation, and also read the source code for the existing printer, since you'll probably want to just take what is already there and modify it a little. This method is good if you want to be more general and the basic string concatenation gets too messy. If the example you showed is as complicated as it will get, it may be overkill, but if you need to support potentially arbitrarily complicated expressions, it may be better to do it this way.
If you decide to take that second route and need help writing a custom printer, feel free to ask here or on the SymPy mailing list.

Generic Mixture Models in pymc

I have a distribution with multiple humps. I would like to try fitting several different types of distributions to each hump: Gaussian, exponential, Weibull, etc. However, as it stands, it seems that I have to manually define a stochastic class for each combination. What I would like to do is something like
@stochastic(model_a, model_b, observed=True)
def mixture(value=observed_time_series, model_a_parameters, model_b_parameters, p):
    def logp(value, model_a_parameters, model_b_parameters):
        return p*model_a.logp(value, *model_a_parameters) + (1-p)*model_b.logp(value, *model_b_parameters)
    def random(model_a_parameters, model_b_parameters, ratio):
        if(random() < ratio):
            return model_a.random()
        return model_b.random()
Is delegation like this possible? Is there a standard way to do this? The main thing that would stop something like the above is that I can't think of any way to group sets of variables together.
You are on the right track. Your stochastic decorator can be simplified to:
def mixture(...):
Also, you only need to define random if you need to sample from the likelihood.
Another approach for modeling mixtures is to use a latent variable model, where individual observations have indicators corresponding to which distribution they are derived from. These indicators can be modeled with a Categorical distribution, for example. This can then have a Dirichlet prior, etc.
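Whatever framework evaluates it, the log-likelihood of a two-component mixture is the log of the weighted sum of the component densities, log(p*f_a(x) + (1-p)*f_b(x)), rather than a weighted sum of the component log-densities. Here is a minimal sketch with SciPy frozen distributions standing in for model_a and model_b; the function name mixture_logp, the chosen distributions, and the example data are assumptions, and this is not PyMC code.

```python
import numpy as np
from scipy import stats

def mixture_logp(x, p, dist_a, dist_b):
    """Log-likelihood of data x under the mixture p*dist_a + (1-p)*dist_b."""
    x = np.asarray(x, dtype=float)
    log_a = np.log(p) + dist_a.logpdf(x)
    log_b = np.log1p(-p) + dist_b.logpdf(x)
    # logaddexp computes log(exp(log_a) + exp(log_b)) without underflow.
    return np.logaddexp(log_a, log_b).sum()

# Any pair of frozen distributions can be plugged in, e.g. Gaussian + Weibull.
gaussian = stats.norm(loc=0.0, scale=1.0)
weibull = stats.weibull_min(c=1.5, scale=10.0)
data = np.array([-0.5, 0.3, 2.0, 8.0, 12.0])
total_logp = mixture_logp(data, 0.4, gaussian, weibull)
```

Because the components are passed in as objects, the same function covers every combination of hump shapes, which is the kind of delegation the question asks about.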

