Python: Find the equation of my line, Logit

I have some data that, when plotted, looks similar to the logit function. However, I need the equation of this line, and I am struggling to find a package/function in Python to help me with this! Below is a screenshot of the graph. I have a load of X,Y data and I need the full equation of this line.
Every time I try to research this I am just pointed towards logistic regression, but I'm convinced there must be a function into which I can input my data and comfortably receive an equation/coefficients for an equation in the output.
Sorry, since I'm asking about the whereabouts of a package, I can't really provide my workings so far; there are no workings thus far, unfortunately.
Thanks all
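A minimal sketch of one possible approach, assuming the data really follow a logit-shaped relation y = a*logit(x) + b: scipy.optimize.curve_fit returns the coefficients directly. The model and synthetic data below are illustrative assumptions, not the asker's actual data.

```python
import numpy as np
from scipy.optimize import curve_fit

def logit_model(x, a, b):
    # logit(x) = log(x / (1 - x)); x must lie strictly between 0 and 1
    return a * np.log(x / (1.0 - x)) + b

# illustrative synthetic data in place of the asker's X,Y arrays
rng = np.random.default_rng(0)
x = np.linspace(0.05, 0.95, 50)
y = 2.0 * np.log(x / (1.0 - x)) + 1.0 + rng.normal(0, 0.1, x.size)

(a, b), pcov = curve_fit(logit_model, x, y)
# the fitted equation of the line is then: y = a*logit(x) + b
```

If the data instead look like the logistic (sigmoid) curve rather than its inverse, the same call works with a sigmoid model function.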

Related

Computing log of integral in terms of log of integrand

This question may be half computational math, half programming.
I'm trying to estimate log[\int_0^\infty\int_0^\infty f(x,y)dxdy] [actually thousands of such integrals] in Python. The function f(x,y) involves some very large/very small numbers that are bound to cause overflow/underflow errors; so I'd really prefer to work with log[f(x,y)] instead of f(x,y).
Thus my question is two parts:
1) Is there a way to estimate log[\int_0^\infty\int_0^\infty f(x,y)dxdy] using the log of the function instead of the function itself?
2) Is there an implementation of this in Python?
Thanks
I would be surprised if the math and/or numpy libraries, or perhaps some more specific third-party libraries, could not handle a problem like this. Here are some of their log functions:
math.log(x[, base]), math.log1p(x), math.log2(x), math.log10(x) (https://docs.python.org/3.3/library/math.html)
numpy.log, numpy.log10, numpy.log2, numpy.log1p, numpy.logaddexp, numpy.logaddexp2 (https://numpy.org/doc/stable/reference/routines.math.html#exponents-and-logarithms)
Generally, just google "logarithm python library" and try to identify similar Stack Overflow problems; that will let you find the right libraries and functions to try out. Once you do, you can follow this guide so that someone can help you get from input to expected output: How to make good reproducible pandas examples
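For the numerical-stability part specifically, a hedged sketch: discretize a truncated domain and combine the log-integrand values with scipy.special.logsumexp, which implements log(sum(exp(...))) without ever forming the large/small values directly. The integrand and truncation point below are illustrative choices.

```python
import numpy as np
from scipy.special import logsumexp

# Illustrative integrand: f(x, y) = exp(-x - y), whose double integral
# over [0, inf)^2 is exactly 1, so the log-integral is 0. The domain is
# truncated at 30, where exp(-30) is negligible.
x = np.linspace(0.0, 30.0, 2000)
y = np.linspace(0.0, 30.0, 2000)
dx = x[1] - x[0]
dy = y[1] - y[0]
X, Y = np.meshgrid(x, y)

log_f = -X - Y  # work only with log f(x, y); f itself is never formed

# log( sum_ij f_ij * dx * dy ) computed stably in log space
log_integral = logsumexp(log_f) + np.log(dx) + np.log(dy)
```

The same idea extends to a proper quadrature rule by adding the log of each weight before the logsumexp.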

How to cluster points based on the function they belong in Python?

Sorry if the title is ambiguous; let me explain the problem. By the way, I'm really new to data science, so sorry if I make a statement that doesn't make sense.
I recently came across a problem related to clustering. Coordinates were given for a lot of points, and the task was to cluster them. But it is not the usual distance-based clustering: the points belong to functions, and they need to be clustered accordingly.
This is not what my data looks like, but the problem is the same:
Please take a look here. In the given link, the provided problem is what I am looking for, but it is in R, not Python. When I searched for "functional clustering" in Python, I couldn't find anything. Please direct me in the correct path, if you know how to do it.
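A hedged sketch of the simplest case, assuming the candidate functions are already known: assign each point to whichever curve it lies closest to in the y-direction. All names and data below are illustrative; when the functions themselves are unknown, something like mixture-of-regressions or the R functional-clustering approach from the link would be needed.

```python
import numpy as np

# two hypothetical candidate functions the points may belong to
candidates = [lambda x: x**2, lambda x: 2 * x + 1]

# illustrative synthetic data: points drawn from one curve or the other
rng = np.random.default_rng(0)
x = rng.uniform(0, 3, 100)
true_labels = rng.integers(0, 2, 100)
y = np.where(true_labels == 0, x**2, 2 * x + 1) + rng.normal(0, 0.05, 100)

# residual of each point against each candidate curve, shape (2, 100)
residuals = np.stack([np.abs(y - f(x)) for f in candidates])
labels = residuals.argmin(axis=0)  # cluster = closest function
```

Points near an intersection of the curves are inherently ambiguous under this scheme, which is where probabilistic (mixture) methods do better.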

Which mathematical method is used by odeint?

I'm working with scipy.integrate.odeint and want to understand it better. For this I have two slightly related questions:
Which mathematical method is it using? Runge-Kutta? Adams-Bashforth? I found this site, but it seems to be for C++; as far as I know, the Python function uses that implementation as well. It states that it switches automatically between implicit and explicit solvers; does anybody know how it does this?
To understand/reuse the information, I would like to know at which time points it evaluates the function and how exactly it computes the solution of the ODE, but full_output does not seem to help, and I wasn't able to find out how. To be more precise, an example with Runge-Kutta-Fehlberg: I want the different time points at which it evaluated f, and the weights by which it multiplied them.
Additional information (what for this Info is needed):
I want to reuse this information to use automatic differentiation. So I would call odeint as a black box, find out all the relevant steps it made and reuse this info to calculate the differential dx(T_end)/dx0.
If you know of any other method to solve my problem, please go ahead. Also, if another ODE solver might be more appropriate to do this, let me know.
PS: I'm new, so would it be better to split this question into two, i.e. separate 1. and 2.?
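On the first question: odeint wraps the LSODA solver, which switches automatically between the explicit Adams method (non-stiff) and the implicit BDF method (stiff) based on detected stiffness. A sketch on a toy problem of what full_output=True does expose, including step sizes and the method used at each output point:

```python
import numpy as np
from scipy.integrate import odeint

def f(y, t):
    # simple non-stiff test problem dy/dt = -y, exact solution exp(-t)
    return -y

t = np.linspace(0, 5, 50)
sol, info = odeint(f, 1.0, t, full_output=True)

step_sizes = info['hu']    # step size successfully used for each output point
methods = info['mused']    # method used: 1 = Adams (non-stiff), 2 = BDF (stiff)
```

Note these are diagnostics per *output* point, not the full internal evaluation history with weights; for that level of detail (e.g. for automatic differentiation), re-implementing the stepper or using a differentiable ODE library is likely necessary.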

Fitting multiple Lorentzians to data using scipy in Python3

Okay so I appreciate this will require a bit of patience but bear with me. I'm analysing some Raman spectra and have written the basis of a program to use Scipy curve_fit to fit multiple lorentzians to the peaks on my data. The trick is I have so much data that I want the program to automatically identify initial guesses for the lorentzians, rather than manually doing it. On the whole, the program gives it a good go (and might be of use to others with a similar use case with simpler data), but I don't know Scipy well enough to optimise the curve_fit enough to make it work on many different examples.
Code repo here: https://github.com/btjones-me/raman_spectroscopy
An example of it working well can be seen in fig 1.
Part of the problem is my peak finding algorithm, which sometimes struggles to find the appropriate starting positions for each lorentzian. You can see this in fig 2.
The next problem is that, for some reason, curve_fit occasionally catastrophically diverges (my best guess is due to rounding errors). You can see this behaviour in fig 3.
Finally while I usually make good guesses with the height and x position of each lorentzian, I haven't found a good way of predicting the width, or FWHM of the curves. Predicting this might help curve_fit.
If anybody can help with either of these problems in any way I would appreciate it greatly. I'm open to any other methods or suggestions including additional third party libraries, so long as they improve upon the current implementation. Many thanks to anybody who attempts this one!
Here it is working exactly as I intend:
Below you can see the peak finding method has failed to identify all the peaks. There are many peak finding algorithms, but the one I use is Scipy's 'find_peaks_cwt()' (it's not usually this bad, this is an extreme case).
Here it's just totally way off. This happens fairly often and I can't really understand why, nor stop it from happening. Possibly it occurs when I tell it to find more/less peaks than are available in the spectra, but just a guess.
I've done this in Python 3.5.2. PS I know I won't be winning any medals for code layout, but as always comments on code style and better practices are welcomed.
I stumbled across this because I'm attempting to solve the exact same problem; here is my solution. For each peak, I only fit my Lorentzian in the region of the domain within plus or minus half the distance to the next closest peak. Here is the function that breaks up the domain:
import numpy as np

def get_local_indices(peak_indices):
    # returns the array indices of the points closest to each peak
    # (within half the distance to the next peak on either side)
    chunks = []
    for i in range(len(peak_indices)):
        peak_index = peak_indices[i]
        if 0 < i < len(peak_indices) - 1:
            distance = min(abs(peak_index - peak_indices[i + 1]),
                           abs(peak_indices[i - 1] - peak_index))
        elif i == 0:
            distance = abs(peak_index - peak_indices[i + 1])
        else:
            distance = abs(peak_indices[i - 1] - peak_index)
        min_index = peak_index - int(distance / 2)
        max_index = peak_index + int(distance / 2)
        chunks.append(np.arange(min_index, max_index))
    return chunks
And here is a picture of the resulting graph, the dashed lines indicate which areas the lorentzians are fit in:
Lorentzian Fitting Plot
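On the FWHM-guess problem from the question, a hedged sketch: scipy.signal.peak_widths measures each peak's width at half its prominence, which can seed curve_fit's initial width parameter. The single-peak synthetic data and the function names below are illustrative, not taken from the linked repo.

```python
import numpy as np
from scipy.optimize import curve_fit
from scipy.signal import find_peaks, peak_widths

def lorentzian(x, x0, gamma, amp):
    # gamma is the half-width at half-maximum, so FWHM = 2 * gamma
    return amp * gamma**2 / ((x - x0)**2 + gamma**2)

# illustrative single-peak spectrum with a little noise
rng = np.random.default_rng(0)
x = np.linspace(-10, 10, 2000)
y = lorentzian(x, 2.0, 0.8, 5.0) + rng.normal(0, 0.01, x.size)

peaks, _ = find_peaks(y, height=1.0, distance=500)
widths, *_ = peak_widths(y, peaks, rel_height=0.5)  # widths in samples
dx = x[1] - x[0]

# initial guesses: peak position, half the measured FWHM, peak height
p0 = [x[peaks[0]], widths[0] * dx / 2.0, y[peaks[0]]]
popt, _ = curve_fit(lorentzian, x, y, p0=p0)
```

With overlapping peaks the measured widths are only rough, but even rough width guesses tend to keep curve_fit from the catastrophic divergence described above.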

solving ODEs on networks with PyDSTool

After using scipy.integrate for a while, I am at the point where I need more functionality, like bifurcation analysis or parameter estimation. This is why I'm interested in using PyDSTool, but from the documentation I can't figure out how to work with ModelSpec, or whether it is actually what will lead me to the solution.
Here is a toy example of what I am trying to do: I have a network with two nodes, both having the same (SIR) dynamic, described by two ODEs, but different initial conditions. The equations are coupled between nodes via the Epsilon (see formula below).
Formulas as a picture for easier reading (the 'n' and 'm' are indices, not exponents) ~>
http://image.noelshack.com/fichiers/2014/28/1404918182-odes.png
(could not use the upload on stack, sadly)
In the two node case my code (using PyDSTool) looks like this:
#multiple SIR metapopulations
#parameter and initial condition definition; a dict is a must
import PyDSTool as pdt

params = {'alpha': 0.7, 'beta': 0.1, 'epsilon1': 0.5, 'epsilon2': 0.5}
ini = {'s1': 0.99, 's2': 1, 'i1': 0.01, 'i2': 0.00}

#the for-macro generates formulas for s1,s2 and i1,i2;
#sum works similar but sums over the expressions in it
DSargs = pdt.args(name='SIRtest_multi',
                  ics=ini,
                  pars=params,
                  tdata=[0, 20],
                  varspecs={'s[o]': 'for(o,1,2,-alpha*s[o]*sum(k,1,2,epsilon[k]*i[k]))',
                            'i[l]': 'for(l,1,2,alpha*s[l]*sum(m,1,2,epsilon[m]*i[m]))'})

#generator
DS = pdt.Generator.Vode_ODEsystem(DSargs)
#computation; a trajectory object is generated
trj = DS.compute('test')
#extraction of the points for plotting
pts = trj.sample()
#plotting; pylab is imported along with PyDSTool as plt
pdt.plt.plot(pts['t'], pts['s1'], label='s1')
pdt.plt.plot(pts['t'], pts['i1'], label='i1')
pdt.plt.plot(pts['t'], pts['s2'], label='s2')
pdt.plt.plot(pts['t'], pts['i2'], label='i2')
pdt.plt.legend()
pdt.plt.xlabel('t')
pdt.plt.show()
But in my original problem there are more than 1000 nodes with 5 ODEs each, every node is coupled to a different number of other nodes, and the epsilon values are not equal for all nodes. So tinkering with this syntax has not led me anywhere near the solution yet.
What I am actually thinking of is a way to construct separate sub-models/solvers(?) for every node, each having its own parameters (the epsilons, since they differ between nodes), and then link them to each other. And this is the point where I do not know whether that is possible in PyDSTool, and whether it is the way to handle this kind of problem.
I looked through the examples and the docs of PyDSTool but could not figure out how to do it, so help is very much appreciated! If the way I'm trying to do things is unorthodox or plain stupid, you are welcome to suggest how to do it more efficiently. (Which is actually the more efficient/faster/better way to solve problems like this: subdividing it into many small (still not decoupled) models/solvers, or one model containing all the ODEs at once?)
(I'm neither a mathematician nor a programmer, but willing to learn, so please be patient!)
The solution is definitely not to build separate simulation models. That won't work because so many variables will be continuously coupled between the sub-models. You absolutely must have all the ODEs in one place together.
It sounds like the solution you need is to use the ModelSpec object constructs. These let you hierarchically build the sub-model definitions out of symbolic pieces. They can have their own "epsilon" parameters, etc. You declare all the pieces when you're finished and let PyDSTool make the final strings containing the ODE definitions for you. I suggest you look at the tutorial example at:
http://www.ni.gsu.edu/~rclewley/PyDSTool/Tutorial/Tutorial_compneuro.html
and the provided examples: ModelSpec_test.py, MultiCompartments.py. But, remember that you still have to have a source for the parameters and coupling data (i.e., a big matrix or dictionary loaded from a file) to be able to automate the process of building the model, otherwise you'd still be writing it all out by hand.
You have to build some classes for the components that you want to have. You might also create a factory function (compare 'makeSoma' in the neuralcomp.py toolbox) that will take all your sub-components and create an ODE based on summing something up from each of the declared components. At the end, you can refer to the parameters by their position in the hierarchy. One might be 's1.epsilon' while another might be 'i4.epsilon'.
Unfortunately, to build models like this efficiently you will have to learn to do some more complex programming! So start by understanding all the steps in the tutorial. You can contact me directly through the SourceForge support discussions or by email once you've gotten started and have specific questions.
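As a lighter-weight complement to the ModelSpec approach, the varspecs/params dicts from the two-node example can also be generated with ordinary Python string-building loops, which accommodates per-node epsilons and irregular coupling. The `neighbors` adjacency mapping below is a hypothetical illustration of the coupling data the answer says must come from a file or matrix.

```python
# hypothetical coupling data: node -> list of coupled nodes, and
# per-node coupling strengths (in practice, loaded from a file)
neighbors = {1: [1, 2], 2: [1, 2]}
epsilon = {1: 0.5, 2: 0.5}

params = {'alpha': 0.7, 'beta': 0.1}
params.update({'epsilon%d' % n: e for n, e in epsilon.items()})

# build one s/i equation pair per node, summing only over its neighbors
varspecs = {}
for n, nbrs in neighbors.items():
    coupling = ' + '.join('epsilon%d*i%d' % (m, m) for m in nbrs)
    varspecs['s%d' % n] = '-alpha*s%d*(%s)' % (n, coupling)
    varspecs['i%d' % n] = 'alpha*s%d*(%s)' % (n, coupling)
```

The resulting dicts can then be passed to pdt.args exactly as in the two-node example, keeping all 1000+ nodes in a single Generator as recommended above.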
